CN115169577A

CN115169577A - Model training method and data processing method

Info

Publication number: CN115169577A
Application number: CN202210744148.0A
Authority: CN
Inventors: 谢晨伟; 吴建民; 赵黎明; 郑赟; 赵德丽
Original assignee: Alibaba China Co Ltd
Current assignee: Alibaba China Co Ltd
Priority date: 2022-06-28
Filing date: 2022-06-28
Publication date: 2022-10-11

Abstract

The application discloses a model training method and a data processing method. The model training method comprises the following steps: obtaining training samples, the training samples comprising: different types of first training data and second training data; processing the first training data by using a first processing model to obtain a first global feature and a first data block feature, wherein the first global feature is used for representing the semantic feature of the first training data, and the first data block feature is used for representing the feature of a data block in the first training data; processing the second training data by using a second processing model to obtain a second global feature and a second data block feature; and adjusting parameters of the first processing model and the second processing model based on the global similarity of the first global feature and the second global feature and the data block similarity of the first data block feature and the second data block feature. The method and the device solve the technical problem that the accuracy of processing the multi-modal data through the model is poor in the related technology.

Description

Model training method and data processing method

Technical Field

The present application relates to the field of data processing and, in particular, to a model training method and a data processing method.

Background

Information on the Internet consists mainly of video, images, and text, whereas user search requests are mainly text. To support information retrieval, a neural network model can be used to produce a unified feature representation of images and text. However, the feature vectors extracted by such a model cannot accurately represent the detailed information in the images and text, so the model's processing accuracy is poor.

No effective solution to the above problem has yet been proposed.

Summary of the Invention

Embodiments of the present application provide a model training method and a data processing method, so as to at least solve the technical problem in the related art that models process multimodal data with poor accuracy.

According to one aspect of the embodiments of the present application, a model training method is provided, including: acquiring training samples, wherein the training samples include first training data and second training data of different types; processing the first training data by using a first processing model to obtain a first global feature and a first data block feature, wherein the first global feature represents the semantic feature of the first training data, and the first data block feature represents the features of the data blocks in the first training data; processing the second training data by using a second processing model to obtain a second global feature and a second data block feature; and adjusting the parameters of the first processing model and the second processing model based on the global similarity between the first global feature and the second global feature and on the data block similarity between the first data block feature and the second data block feature, wherein the first processing model and the second processing model are machine learning models.

According to another aspect of the embodiments of the present application, a data processing method is also provided, including: acquiring first target data and a plurality of second target data of different types; processing the first target data by using a first processing model to obtain a first feature vector corresponding to the first target data; processing each of the plurality of second target data by using a second processing model to obtain second feature vectors corresponding to the plurality of second target data; and matching the first feature vector against the second feature vectors to determine push data among the plurality of second target data, wherein the push data represents the second target data that matches the first target data. The parameters of the first processing model and the second processing model are adjusted based on the global similarity between a first global feature and a second global feature and on the data block similarity between a first data block feature and a second data block feature. The first global feature and the first data block feature are obtained by processing first training data with the first processing model, and the second global feature and the second data block feature are obtained by processing second training data with the second processing model. The first training data and the second training data are different types of data, the first global feature represents the semantic feature of the first training data, the first data block feature represents the features of the data blocks in the first training data, and the first processing model and the second processing model are machine learning models.
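The matching step of this data processing method can be illustrated with a minimal Python sketch. It assumes that the feature vectors have already been produced by the two processing models and that "matching" means ranking candidates by cosine similarity; the function names `cosine_similarity` and `select_push_data`, the `top_k` parameter, and the toy vectors are hypothetical and do not come from the application.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def select_push_data(first_vector, second_vectors, top_k=1):
    """Return the indices of the top_k second vectors most similar to first_vector.

    Assumption: the best-matching candidates are treated as the push data.
    """
    scored = [
        (cosine_similarity(first_vector, candidate), index)
        for index, candidate in enumerate(second_vectors)
    ]
    # Sort by descending similarity; ties keep the lower index first.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [index for _, index in scored[:top_k]]


# Toy example: one query vector (e.g. from text) and three candidates (e.g. from images).
query = [1.0, 0.0]
candidates = [[0.0, 1.0], [1.0, 0.1], [-1.0, 0.0]]
best = select_push_data(query, candidates, top_k=2)
```

In a real system the candidate vectors would typically be computed once and indexed, so that only the query has to be encoded at search time; that design choice is likewise an assumption of this sketch.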

According to another aspect of the embodiments of the present application, a data processing method is also provided, including: in response to an input instruction acting on an operation interface of a client, displaying first target data on the operation interface; and in response to a push instruction acting on the operation interface, displaying on the operation interface push data among a plurality of second target data, wherein the push data represents the second target data that matches the first target data, and the first target data and the plurality of second target data are of different types. The push data is obtained by matching a first feature vector corresponding to the first target data against second feature vectors corresponding to the plurality of second target data; the first feature vector is obtained by processing the first target data with a first processing model, and the second feature vectors are obtained by processing the second target data with a second processing model. The parameters of the first processing model and the second processing model are adjusted based on the global similarity between a first global feature and a second global feature and on the data block similarity between a first data block feature and a second data block feature. The first global feature and the first data block feature are obtained by processing first training data with the first processing model, and the second global feature and the second data block feature are obtained by processing second training data with the second processing model. The first training data and the second training data are different types of data, the first global feature represents the semantic feature of the first training data, the first data block feature represents the features of the data blocks in the first training data, and the first processing model and the second processing model are machine learning models.

According to another aspect of the embodiments of the present application, a data processing method is also provided, including: displaying first target data on a presentation screen of a virtual reality (VR) device or an augmented reality (AR) device; acquiring a plurality of second target data, wherein the first target data and the plurality of second target data are of different types; processing the first target data by using a first processing model to obtain a first feature vector corresponding to the first target data; processing each of the plurality of second target data by using a second processing model to obtain second feature vectors corresponding to the plurality of second target data; matching the first feature vector against the second feature vectors to determine push data among the plurality of second target data, wherein the push data represents the second target data that matches the first target data; and driving the VR device or AR device to display the push data. The parameters of the first processing model and the second processing model are adjusted based on the global similarity between a first global feature and a second global feature and on the data block similarity between a first data block feature and a second data block feature. The first global feature and the first data block feature are obtained by processing first training data with the first processing model, and the second global feature and the second data block feature are obtained by processing second training data with the second processing model. The first training data and the second training data are different types of data, the first global feature represents the semantic feature of the first training data, the first data block feature represents the features of the data blocks in the first training data, and the first processing model and the second processing model are machine learning models.

According to another aspect of the embodiments of the present application, a model training method is also provided, including: a server obtaining, by calling a first interface, a model training request sent by a client, wherein the first interface includes a first parameter, the parameter value of the first parameter is the model training request, and the model training request is used to train a first processing model and a second processing model; the server acquiring training samples based on the model training request, wherein the training samples include first training data and second training data of different types; the server processing the first training data by using the first processing model to obtain a first global feature and a first data block feature, wherein the first global feature represents the semantic feature of the first training data, and the first data block feature represents the features of the data blocks in the first training data; the server processing the second training data by using the second processing model to obtain a second global feature and a second data block feature; the server adjusting the parameters of the first processing model and the second processing model based on the global similarity between the first global feature and the second global feature and on the data block similarity between the first data block feature and the second data block feature; and the server outputting the first processing model and the second processing model to the client by calling a second interface, wherein the second interface includes a second parameter, the parameter values of the second parameter are the first processing model and the second processing model, and the first processing model and the second processing model are machine learning models.

According to another aspect of the embodiments of the present application, a data processing method is also provided, including: a server obtaining, by calling a first interface, first target data sent by a client, wherein the first interface includes a first parameter, and the parameter value of the first parameter is the first target data; the server acquiring a plurality of second target data, wherein the first target data and the plurality of second target data are of different types; the server processing the first target data by using a first processing model to obtain a first feature vector corresponding to the first target data; the server processing each of the plurality of second target data by using a second processing model to obtain second feature vectors corresponding to the plurality of second target data; the server matching the first feature vector against the second feature vectors to determine push data among the plurality of second target data, wherein the push data represents the second target data that matches the first target data; and the server outputting the push data to the client by calling a second interface, wherein the second interface includes a second parameter, and the parameter value of the second parameter is the push data. The parameters of the first processing model and the second processing model are adjusted based on the global similarity between a first global feature and a second global feature and on the data block similarity between a first data block feature and a second data block feature. The first global feature and the first data block feature are obtained by processing first training data with the first processing model, and the second global feature and the second data block feature are obtained by processing second training data with the second processing model. The first training data and the second training data are different types of data, the first global feature represents the semantic feature of the first training data, the first data block feature represents the features of the data blocks in the first training data, and the first processing model and the second processing model are machine learning models.

According to another aspect of the embodiments of the present application, a computer-readable storage medium is also provided. The computer-readable storage medium includes a stored program, wherein, when the program runs, the device on which the computer-readable storage medium is located is controlled to perform the method of any one of the above embodiments.

According to another aspect of the embodiments of the present application, a computer terminal is also provided, including: a processor; and a memory connected to the processor and configured to provide the processor with instructions for the following processing steps: acquiring training samples, wherein the training samples include first training data and second training data of different types; processing the first training data by using a first processing model to obtain a first global feature and a first data block feature, wherein the first global feature represents the semantic feature of the first training data, and the first data block feature represents the features of the data blocks in the first training data; processing the second training data by using a second processing model to obtain a second global feature and a second data block feature; and adjusting the parameters of the first processing model and the second processing model based on the global similarity between the first global feature and the second global feature and on the data block similarity between the first data block feature and the second data block feature, wherein the first processing model and the second processing model are machine learning models.

In the embodiments of the present application, after first training data and second training data of different types are acquired, the first training data may be processed by a first processing model to obtain a first global feature and a first data block feature, and the second training data may be processed by a second processing model to obtain a second global feature and a second data block feature. Finally, the parameters of the first processing model and the second processing model are adjusted based on the global similarity between the first global feature and the second global feature and on the data block similarity between the first data block feature and the second data block feature, thereby training the feature extraction models. Notably, because the model parameters are adjusted based on both the global similarity and the data block similarity, the similarity between images and text is expressed more accurately, which improves the training effect and in turn the accuracy with which the models process multimodal data. This solves the technical problem in the related art that models process multimodal data with poor accuracy. In an image-text search scenario, the first processing model and the second processing model can extract more accurate image features and text features, so that the retrieved information is more accurate and better matches the user's search needs, thereby improving the accuracy of image-text search.
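One way to picture how two similarities can jointly drive parameter adjustment is the following minimal Python sketch. It is an illustration under stated assumptions, not the method of the application: it assumes a contrastive (InfoNCE-style) loss is applied to a batch-level matrix of global similarities and to a matrix of data block similarities, that matched pairs sit on the diagonal, and that the two losses are simply added. The names `contrastive_loss`, `training_objective`, `block_weight`, and the temperature value 0.07 are hypothetical.

```python
import math


def contrastive_loss(sim_matrix, temperature=0.07):
    """Mean InfoNCE-style loss over rows; entry [i][i] is the matched (positive) pair."""
    total = 0.0
    for i, row in enumerate(sim_matrix):
        logits = [s / temperature for s in row]
        peak = max(logits)  # subtract the max for numerical stability
        log_sum = peak + math.log(sum(math.exp(x - peak) for x in logits))
        total += log_sum - logits[i]
    return total / len(sim_matrix)


def training_objective(global_sims, block_sims, block_weight=1.0):
    """Assumed combination: global loss plus weighted data-block loss."""
    return contrastive_loss(global_sims) + block_weight * contrastive_loss(block_sims)


# Toy batch of two image-text pairs.
aligned = [[0.9, 0.1], [0.2, 0.8]]     # matched pairs score highest
misaligned = [[0.1, 0.9], [0.8, 0.2]]  # mismatched pairs score highest

loss_good = training_objective(aligned, aligned)
loss_bad_blocks = training_objective(aligned, misaligned)
```

In this sketch, a batch whose global similarities look correct but whose data block similarities are wrong still receives a higher loss, which is the kind of extra training signal the paragraph above attributes to using both similarities. An optimizer would then update both models to reduce this value.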

Brief Description of the Drawings

The drawings described herein are provided for further understanding of the present application and constitute a part of it. The illustrative embodiments of the present application and their descriptions are used to explain the present application and do not constitute an improper limitation of it. In the drawings:

FIG. 1 is a block diagram of the hardware structure of a computer terminal (or mobile device) for implementing a model training method according to an embodiment of the present application;

FIG. 2 is a structural block diagram of a computing environment for a model training method according to an embodiment of the present application;

FIG. 3 is a flowchart of a model training method according to an embodiment of the present application;

FIG. 4 is a flowchart of an optional model training method according to an embodiment of the present application;

FIG. 5 is a flowchart of a first data processing method according to an embodiment of the present application;

FIG. 6 is a flowchart of a second data processing method according to an embodiment of the present application;

FIG. 7 is a flowchart of a third data processing method according to an embodiment of the present application;

FIG. 8 is a flowchart of another model training method according to an embodiment of the present application;

FIG. 9 is a flowchart of a fourth data processing method according to an embodiment of the present application;

FIG. 10 is a schematic diagram of a model training apparatus according to an embodiment of the present application;

FIG. 11 is a schematic diagram of a first data processing apparatus according to an embodiment of the present application;

FIG. 12 is a schematic diagram of a second data processing apparatus according to an embodiment of the present application;

FIG. 13 is a schematic diagram of a third data processing apparatus according to an embodiment of the present application;

FIG. 14 is a schematic diagram of another model training apparatus according to an embodiment of the present application;

FIG. 15 is a schematic diagram of a fourth data processing apparatus according to an embodiment of the present application;

FIG. 16 is a structural block diagram of a computer terminal according to an embodiment of the present application.

Detailed Description

To enable those skilled in the art to better understand the solutions of the present application, the technical solutions in the embodiments of the present application are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by those of ordinary skill in the art based on the embodiments in the present application without creative effort shall fall within the scope of protection of the present application.

It should be noted that the terms "first", "second", and the like in the description, the claims, and the above drawings of the present application are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence. It should be understood that data so used are interchangeable under appropriate circumstances, so that the embodiments of the present application described herein can be implemented in orders other than those illustrated or described herein. Furthermore, the terms "comprising" and "having", and any variations thereof, are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device that comprises a series of steps or units is not necessarily limited to the steps or units expressly listed, but may include other steps or units that are not expressly listed or that are inherent to the process, method, product, or device.

First, some of the nouns and terms that appear in the description of the embodiments of the present application are explained as follows:

Multimodal feature learning: using machine learning techniques to train a feature encoding model that maps image or text data input by a user to a feature vector. The model can be used for tasks such as image-text search.

Transformer model: a neural network model that can extract features from an input image or text. Before an image is input, it must be split into multiple image blocks (for example, 16*16 small blocks that do not overlap one another); each image block is an image token. Similarly, before text is input, it must be split into multiple text tokens; each text token can be understood approximately as a word.
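The tokenization described above can be sketched in a few lines of Python. This is a minimal illustration, not the application's implementation: it reads the example as a grid of equally sized non-overlapping blocks, represents an image as a list of pixel rows, and splits text on whitespace as a rough stand-in for a real tokenizer. The function names `split_into_image_tokens` and `split_into_text_tokens` and the `grid` parameter are hypothetical.

```python
def split_into_image_tokens(image, grid):
    """Split a 2-D image (list of pixel rows) into grid x grid non-overlapping blocks.

    Blocks are returned in row-major order; each block is one image token.
    """
    height, width = len(image), len(image[0])
    if height % grid or width % grid:
        raise ValueError("image height and width must be divisible by grid")
    block_h, block_w = height // grid, width // grid
    blocks = []
    for gy in range(grid):
        for gx in range(grid):
            rows = image[gy * block_h:(gy + 1) * block_h]
            blocks.append([row[gx * block_w:(gx + 1) * block_w] for row in rows])
    return blocks


def split_into_text_tokens(text):
    """Whitespace split, assuming one token is roughly one word."""
    return text.lower().split()


# Toy 4x4 "image" whose pixel values are 0..15, split into a 2x2 grid of blocks.
toy_image = [[r * 4 + c for c in range(4)] for r in range(4)]
image_tokens = split_into_image_tokens(toy_image, grid=2)
text_tokens = split_into_text_tokens("A dog runs")
```

With `grid=16`, the same function would yield 256 image tokens for an image whose sides are divisible by 16. A real Transformer would additionally flatten and linearly project each block into a vector; that step is omitted here.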

InfoNCE (Info Noise-Contrastive Estimation) loss function: its core idea is to make the similarity between positive sample pairs greater than the similarity between negative sample pairs.
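For reference, the InfoNCE loss is commonly written in the following form, given here as general background rather than as a formula quoted from the application. The symbols are notation introduced for this illustration: q is a query feature, k^{+} is its positive sample, k_1, ..., k_N are all candidates in the batch (the positive included), s(·,·) is a similarity function such as cosine similarity, and τ is a temperature hyperparameter.

$$
\mathcal{L}_{\text{InfoNCE}} = -\log \frac{\exp\left(s(q, k^{+})/\tau\right)}{\sum_{j=1}^{N} \exp\left(s(q, k_{j})/\tau\right)}
$$

Minimizing this loss increases s(q, k^{+}) relative to the similarities between q and the negative samples, which matches the core idea stated above.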

In current multimodal feature learning, the CLIP (Contrastive Language-Image Pre-training) model can be used. After image or text data is input, a neural network such as a CNN or a Transformer represents the image or text data as a single global feature vector. During training, feature vectors are first extracted for all images and texts, and the similarity between the feature vectors of a positive image-text pair is then required to be greater than the similarity between the feature vectors of a negative image-text pair.

However, this model compresses the image or text data into a single global feature vector by averaging or other means, which discards much of the detail in the image or text. As a result, positive samples cannot be distinguished from hard negative samples during training, which introduces noise into the training process and degrades the model.

To solve the above problems, the present application provides an implementation that establishes a many-to-many correspondence between image tokens and text tokens and thereby expresses the similarity between image and text data more accurately. This improves the effect of feature training and the accuracy of model processing, so that more accurate retrieval results can be presented to users in image-text search tasks.
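The difference between a pooled global comparison and a token-level comparison can be shown with a small Python sketch. The aggregation rule used here (for each token, take its best match on the other side, then average in both directions) is an assumption chosen for illustration; this passage of the application does not specify how the token correspondence is scored. The names `global_similarity`, `block_similarity`, and the toy token vectors are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def mean_pool(tokens):
    """Average token vectors into one global vector."""
    return [sum(column) / len(tokens) for column in zip(*tokens)]


def global_similarity(image_tokens, text_tokens):
    """Compare the two inputs only through their averaged vectors."""
    return cosine(mean_pool(image_tokens), mean_pool(text_tokens))


def block_similarity(image_tokens, text_tokens):
    """Assumed rule: best match per token, averaged over both directions."""
    matrix = [[cosine(i, t) for t in text_tokens] for i in image_tokens]
    image_to_text = sum(max(row) for row in matrix) / len(matrix)
    columns = list(zip(*matrix))
    text_to_image = sum(max(col) for col in columns) / len(columns)
    return 0.5 * (image_to_text + text_to_image)


image = [[1.0, 0.0], [0.0, 1.0]]        # two distinct image tokens
matching_text = [[1.0, 0.0], [0.0, 1.0]]  # text tokens that align one-to-one
blurry_text = [[1.0, 1.0], [1.0, 1.0]]    # text tokens that match nothing exactly
```

Both texts have the same pooled direction as the image, so `global_similarity` cannot tell them apart, whereas `block_similarity` is higher for `matching_text` than for `blurry_text`. This is a toy instance of the loss of detail that averaging causes.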

Embodiment 1

According to an embodiment of the present application, a model training method is provided. It should be noted that the steps shown in the flowcharts of the drawings may be executed in a computer system, for example as a set of computer-executable instructions, and that, although a logical order is shown in the flowcharts, in some cases the steps shown or described may be performed in a different order.

The method embodiments provided in the embodiments of the present application may be executed in a mobile terminal, a computer terminal, or a similar computing device. FIG. 1 shows a block diagram of the hardware structure of a computer terminal (or mobile device) for implementing the model training method. As shown in FIG. 1, the computer terminal 10 (or mobile device 10) may include one or more processors 102 (shown in the figure as 102a, 102b, ..., 102n; the processor 102 may include, but is not limited to, a processing device such as a microprocessor (MCU) or a programmable logic device (FPGA)), a memory 104 for storing data, and a transmission device 106 for communication functions. In addition, it may include a display, an input/output interface (I/O interface), a universal serial bus (USB) port (which may be included as one of the ports of the bus), a network interface, a power supply, and/or a camera. Those of ordinary skill in the art will understand that the structure shown in FIG. 1 is only illustrative and does not limit the structure of the above electronic device. For example, the computer terminal 10 may include more or fewer components than shown in FIG. 1, or have a configuration different from that shown in FIG. 1.

It should be noted that the one or more processors 102 and/or other data processing circuits described above may generally be referred to herein as "data processing circuits". A data processing circuit may be embodied in whole or in part as software, hardware, firmware, or any combination thereof. Furthermore, the data processing circuit may be a single stand-alone processing module, or may be incorporated in whole or in part into any of the other elements of the computer terminal 10 (or mobile device). The data processing circuit serves as a kind of processor control (for example, selecting a variable-resistance termination path connected to the interface).

The memory 104 may be used to store software programs and modules of application software, such as the program instructions/data storage corresponding to the model training method in the embodiments of the present application. By running the software programs and modules stored in the memory 104, the processor 102 executes various functional applications and data processing, that is, implements the above model training method. The memory 104 may include high-speed random access memory and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 104 may further include memory located remotely from the processor 102, and such remote memory may be connected to the computer terminal 10 through a network. Examples of such networks include, but are not limited to, the Internet, an intranet, a local area network, a mobile communication network, and combinations thereof.

The transmission device 106 is used to receive or send data via a network. A specific example of such a network is a wireless network provided by the communication provider of the computer terminal 10. In one example, the transmission device 106 includes a network adapter (Network Interface Controller, NIC), which can be connected to other network devices through a base station so as to communicate with the Internet. In another example, the transmission device 106 may be a radio frequency (RF) module, which is used to communicate with the Internet wirelessly.

The display may be, for example, a touch-screen liquid crystal display (LCD) that enables a user to interact with the user interface of the computer terminal 10 (or mobile device).

The hardware block diagram shown in FIG. 1 can serve not only as an exemplary block diagram of the above computer terminal (or mobile device), but also as an exemplary block diagram of the above server. In an optional embodiment, FIG. 2 shows, in block-diagram form, an embodiment that uses the computer terminal (or mobile device) of FIG. 1 as a computing node in a computing environment 201. FIG. 2 is a structural block diagram of a computing environment for a model training method according to an embodiment of the present application. As shown in FIG. 2, the computing environment 201 includes a plurality of computing nodes (such as servers), shown in the figure as 210-1, 210-2, ..., running on a distributed network. Each computing node contains local processing and memory resources, and an end user 202 can remotely run applications or store data in the computing environment 201. Applications may be provided as a plurality of services 220-1, 220-2, 220-3 and 220-4 in the computing environment 201, representing services "A", "D", "E" and "H", respectively.

The end user 202 may provision and access services through a web browser or other software application on a client. In some embodiments, provisioning and/or requests from the end user 202 may be delivered to an ingress gateway 230. The ingress gateway 230 may include a corresponding proxy to handle provisioning and/or requests directed at the services 220 (one or more services provided in the computing environment 201).

The services 220 are provided or deployed according to the various virtualization technologies supported by the computing environment 201. In some embodiments, the services 220 may be provided according to virtual machine (VM) based virtualization, container-based virtualization, and/or similar approaches. VM-based virtualization emulates a real computer by initializing a virtual machine, so that programs and applications execute without directly touching any actual hardware resources. Whereas a virtual machine virtualizes a machine, container-based virtualization launches containers that virtualize the operating system (OS) as a whole, so that multiple workloads can run on a single OS instance.

In one embodiment based on container virtualization, several containers of a service 220 may be assembled into a POD (for example, a Kubernetes POD). For example, as shown in FIG. 2, the service 220-2 may be equipped with one or more PODs 240-1, 240-2, ..., 240-N (collectively, PODs 240). Each POD 240 may include a proxy 245 and one or more containers 242-1, 242-2, ..., 242-M (collectively, containers 242). The one or more containers 242 in a POD 240 handle requests related to one or more corresponding functions of the service, while the proxy 245 generally controls network functions related to the service, such as routing and load balancing. The other services 220 may also be equipped with PODs similar to the PODs 240.

During operation, executing a user request from the end user 202 may require invoking one or more services 220 in the computing environment 201, and executing one or more functions of one service 220 may require invoking one or more functions of another service 220. As shown in FIG. 2, service "A" 220-1 receives the user request of the end user 202 from the ingress gateway 230; service "A" 220-1 may call service "D" 220-2, and service "D" 220-2 may request service "E" 220-3 to perform one or more functions.

The above computing environment may be a cloud computing environment, in which the allocation of resources is managed by the cloud service provider, so that functions can be developed without having to consider implementing, tuning, or scaling servers. Such a computing environment allows developers to execute code in response to events without building or maintaining complex infrastructure. Instead of scaling a single hardware device to handle the potential load, a service can be split into a set of functions that scale automatically and independently.

Under the above operating environment, the present application provides the model training method shown in FIG. 3. It should be noted that the model training method of this embodiment may be executed by the computer terminal of the embodiment shown in FIG. 1. FIG. 3 is a flowchart of a model training method according to an embodiment of the present application. As shown in FIG. 3, the method may include the following steps:

Step S302: acquire training samples, where the training samples include first training data and second training data of different types.

The training samples in the above step may be multimodal data; that is, a training sample may contain two different types of data, namely the first training data and the second training data. The specific data types may be determined according to the application scenario. In the embodiments of the present application, the image-text search scenario is taken as an example, in which the first training data may be image data and the second training data may be text data, but the application is not limited thereto.

In addition, to improve the model, the training samples may be divided into positive samples and negative samples. The two different types of data contained in a positive sample match each other, while those contained in a negative sample do not. Taking the image-text search scenario as an example, for a positive sample the text data may be "chiffon dress" and the image data may be an image of a chiffon dress; for a negative sample the text data may be "chiffon dress" and the image data may be an image of "celebrity A".
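The split into positive and negative samples can be illustrated with a short Python sketch. It assumes, purely for illustration, that matched image/text items share the same index and that every mismatched combination is used as a negative sample; the source does not say how negative samples are collected, and the function name `build_pairs` and the file names are hypothetical.

```python
from itertools import product


def build_pairs(images, texts):
    """Pair every image with every text.

    Assumption (not from the source): images[i] matches texts[i], so
    same-index pairs are positive samples and all other pairs are negatives.
    """
    if len(images) != len(texts):
        raise ValueError("images and texts must be aligned one-to-one")
    positives, negatives = [], []
    for (i, image), (j, text) in product(enumerate(images), enumerate(texts)):
        (positives if i == j else negatives).append((image, text))
    return positives, negatives


# Hypothetical toy data mirroring the "chiffon dress" / "celebrity A" example.
images = ["chiffon_dress.jpg", "celebrity_a.jpg"]
texts = ["chiffon dress", "celebrity A"]
positives, negatives = build_pairs(images, texts)
```

With this toy data, the pair ("celebrity_a.jpg", "chiffon dress") ends up among the negatives, which corresponds to the negative sample described in the text.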

In an optional embodiment, since different users often train models for different purposes, the multimodal data provided by the user, or the multimodal data stored on the user's server, may be used as training samples to ensure that the trained model meets the user's needs. In another optional embodiment, where the number of training samples provided by the user is limited, a large amount of multimodal data may be obtained from the Internet as training samples, based on the user's needs, to further improve the accuracy of model processing.

Step S304: process the first training data with a first processing model to obtain a first global feature and first data block features, where the first global feature characterizes the semantic feature of the first training data, and the first data block features characterize the features of the data blocks in the first training data.

The first processing model in the above step may be a machine learning model capable of extracting features from the first training data. The model may be trained with data of the same type as the first training data, and may adopt any model structure from the related art that can perform feature extraction. In the embodiments of the present application, a Transformer model is taken as an example, but the application is not limited thereto. In the image-text search scenario, when the first training data is image data, the first processing model may be an image encoder; when the first training data is text data, the first processing model may be a text encoder.

In an optional embodiment, when features need to be extracted from the first training data, the first training data is first split into multiple data blocks (patch tokens), and a learnable embedding vector (the class token) is then added to form the input sequence that is finally fed to the first processing model, where the class token can represent the overall semantics of the first training data. After the input sequence is fed into the first processing model for feature extraction, the feature vector corresponding to the class token (the first global feature described above) and the feature vector corresponding to each patch token (the first data block features described above) are obtained.
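The construction of the input sequence can be sketched in plain Python. This is a minimal illustration under several assumptions that the source does not state: the image is a 2-D grid of numbers, patches are non-overlapping and flattened without any learned projection, and the class token is placed at position 0 (a common Transformer convention). The function names are hypothetical, and the encoder itself is omitted, so the sequence is passed through unchanged as a stand-in for the encoder output.

```python
def split_into_patches(image, patch_h, patch_w):
    """Split a 2-D grid (list of rows) into flattened, non-overlapping patches."""
    rows, cols = len(image), len(image[0])
    if rows % patch_h or cols % patch_w:
        raise ValueError("image size must be divisible by the patch size")
    patches = []
    for top in range(0, rows, patch_h):
        for left in range(0, cols, patch_w):
            patches.append([image[r][c]
                            for r in range(top, top + patch_h)
                            for c in range(left, left + patch_w)])
    return patches


def build_input_sequence(image, patch_h, patch_w, class_token):
    """Put the class token in front of the patch tokens (assumed position 0)."""
    if len(class_token) != patch_h * patch_w:
        raise ValueError("class token must have the same length as a patch token")
    return [list(class_token)] + split_into_patches(image, patch_h, patch_w)


def split_encoder_output(encoded):
    """Position 0 is read as the global feature, the rest as data block features."""
    return encoded[0], encoded[1:]


# A 4x6 toy image cut into 2x2 patches gives 6 patch tokens, as in FIG. 4.
image = [[r * 6 + c for c in range(6)] for r in range(4)]
sequence = build_input_sequence(image, 2, 2, class_token=[0.0] * 4)
# Stand-in for the encoder: the sequence is used as-is.
global_feature, block_features = split_encoder_output(sequence)
```

In a trained model the class token would be a learned parameter and each token would be transformed by the encoder; the sketch only shows how one global feature and several data block features arise from a single input sequence.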

Step S306: process the second training data with a second processing model to obtain a second global feature and second data block features.

The second processing model in the above step may be a machine learning model capable of extracting features from the second training data. Since the first training data and the second training data are two different types of data, the first processing model and the second processing model need to be trained separately for the first training data and the second training data, so as to ensure the accuracy of feature extraction. In the image-text search scenario, when the second training data is text data, the second processing model may be a text encoder; when the second training data is image data, the second processing model may be an image encoder.

Since the first processing model and the second processing model serve the same processing purpose, the structure and the processing flow of the second processing model are the same as those of the first processing model, and are not repeated here.

Step S308: adjust the parameters of the first processing model and the second processing model based on the global similarity between the first global feature and the second global feature, and the data block similarity between the first data block features and the second data block features.

The global similarity in the above step may be a one-to-one similarity relationship between the first global feature and the second global feature, and the data block similarity may be a many-to-many similarity relationship between the first data block features and the second data block features.
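The difference between the one-to-one and the many-to-many relationship can be made concrete with a small sketch. Cosine similarity is used here only as an assumed similarity measure, since the source does not name one, and the function names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def global_similarity(first_global, second_global):
    """One-to-one: a single score between the two global features."""
    return cosine(first_global, second_global)


def block_similarity_matrix(first_blocks, second_blocks):
    """Many-to-many: one score for every (first block, second block) pair."""
    return [[cosine(p, q) for q in second_blocks] for p in first_blocks]


# Toy features: 2 blocks on one side, 3 on the other.
matrix = block_similarity_matrix([[1.0, 0.0], [0.0, 1.0]],
                                 [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
score = global_similarity([1.0, 0.0], [1.0, 0.0])
```

Here `score` is a single number, whereas `matrix` has one row per first data block and one column per second data block.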

In an optional embodiment, the training objective of the first processing model and the second processing model may be that the similarity of positive samples is greater than the similarity of negative samples. A loss function may therefore be constructed based on the global similarity and the data block similarity, and the parameters of the first processing model and the second processing model are then adjusted accordingly, ensuring that the two models reach the same model accuracy.

It should be noted that the specific parameter adjustment process may be implemented using methods provided in the related art, which is not specifically limited in the present application.

A preferred embodiment of the present application is described in detail below with reference to FIG. 4, taking the image-text search scenario as an example. As shown in FIG. 4, the training sample contains an image and a text: the image depicts a pizza, and the content of the text is "香肠披萨盘子" ("sausage pizza plate"). Features of the image may be extracted by an image encoder, and features of the text by a text encoder. Before the image encoder processes the image, the image is first split into 6 image blocks, that is, 6 image tokens; the 6 image tokens and a class token are then fed together into the image encoder to obtain the image global feature and 6 image token features. Before the text encoder processes the text, the text is first split into 6 characters, that is, 6 text tokens; the 6 text tokens and another class token are then fed together into the text encoder to obtain the text global feature and 6 text token features. The image global feature and the text global feature can be used to construct a global feature alignment loss, and the 6 image token features and the 6 text token features can be processed by a token feature alignment module to construct a token feature alignment loss. The parameters of the image encoder and the text encoder are updated using the global feature alignment loss and the token feature alignment loss, thereby training the models. By establishing a one-to-one correspondence between the image global feature and the text global feature, as well as a many-to-many correspondence between image tokens and text tokens, the similarity between images and texts is expressed more accurately, which improves the effect of feature training.

With the solution provided by the above embodiments of the present application, after the first training data and the second training data of different types are acquired, the first training data can be processed with the first processing model to obtain the first global feature and the first data block features, and the second training data can be processed with the second processing model to obtain the second global feature and the second data block features. Finally, the parameters of the first processing model and the second processing model are adjusted based on the global similarity between the first global feature and the second global feature and the data block similarity between the first data block features and the second data block features, thereby training the feature extraction models. Because the model parameters are adjusted based on both the global similarity and the data block similarity, the similarity between images and texts is expressed more accurately, the training effect is improved, and the accuracy with which the models process multimodal data is further improved, which solves the technical problem in the related art that the accuracy of processing multimodal data through a model is poor. In the image-text search scenario, the first processing model and the second processing model can extract more accurate image features and text features, so that the retrieved information is more accurate and better matches the user's search needs, thereby improving the accuracy of image-text search.

In the above embodiments of the present application, adjusting the parameters of the first processing model and the second processing model based on the global similarity between the first global feature and the second global feature and the data block similarity between the first data block features and the second data block features includes: constructing a first loss function based on the global similarity; constructing a second loss function based on the data block similarity; obtaining a weighted sum of the first loss function and the second loss function to obtain a target loss function; and adjusting the parameters of the first processing model and the second processing model based on the target loss function.

The first loss function and the second loss function may be set according to the actual training objective. For example, in the image-text search scenario, the training objective is that the similarity between the feature vectors of a positive sample is greater than the similarity between the feature vectors of a negative sample. The first loss function and the second loss function may therefore be InfoNCE loss functions, but are not limited thereto.

In an optional embodiment, the training samples may include positive samples and negative samples. For example, as shown in FIG. 4, the text "sausage pizza plate" and the image represent the same item, so the image and text shown in FIG. 4 form a positive sample. Accordingly, the global similarity includes a first global similarity of the positive samples and a second global similarity of the negative samples, and the data block similarity includes a first data block similarity of the positive samples and a second data block similarity of the negative samples. The first loss function and the second loss function are then constructed separately according to the InfoNCE loss formula, and the target loss function is obtained as the weighted sum of the values of the two loss functions. When the target loss function is greater than a threshold, the parameters of the first processing model and the second processing model are adjusted and training continues on the training samples; once the target loss function is smaller than the threshold, the adjustment of the parameters of the first processing model and the second processing model stops and the training process is determined to be complete.
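The weighted sum and the threshold-based stopping rule can be sketched as follows. The weights, the threshold, the `max_steps` safeguard, and the callback `step_fn` (standing in for one parameter update that reports the two loss values) are all assumptions introduced for illustration; the source gives no concrete values or interfaces.

```python
def target_loss(global_loss, block_loss, w_global=1.0, w_block=1.0):
    """Weighted sum of the first (global) and second (data block) loss values."""
    return w_global * global_loss + w_block * block_loss


def train_until_below(step_fn, threshold, w_global=1.0, w_block=1.0, max_steps=1000):
    """Keep updating while the target loss is not below the threshold.

    step_fn is a hypothetical callback that performs one parameter update and
    returns (global_loss, block_loss). max_steps is an added safeguard.
    """
    history = []
    for _ in range(max_steps):
        global_loss, block_loss = step_fn()
        loss = target_loss(global_loss, block_loss, w_global, w_block)
        history.append(loss)
        if loss < threshold:
            break  # target loss fell below the threshold: training ends
    return history


def make_toy_step(start=1.0, decay=0.5):
    """Toy stand-in for a training step: both losses halve on every call."""
    state = {"loss": start}

    def step():
        current = state["loss"]
        state["loss"] *= decay
        return current, current

    return step


history = train_until_below(make_toy_step(), threshold=0.3)
```

In the toy run, the recorded target loss halves at every step, and the loop ends at the first value below the threshold.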

In the above embodiments of the present application, constructing the first loss function based on the global similarity includes: obtaining, from the global similarity, a first global similarity corresponding to a first sample and a second global similarity corresponding to a second sample, where the two different types of data contained in the first sample match and the two different types of data contained in the second sample do not match; and constructing the first loss function based on the first global similarity and the second global similarity.

In an optional embodiment, the first loss function $L_t$ can be constructed by a formula of the following InfoNCE form:

$$L_t = -\log \frac{\exp(s_p/\tau)}{\exp(s_p/\tau) + \exp(s_n^{1}/\tau) + \exp(s_n^{2}/\tau)}$$

where $s_p$ denotes the first global similarity, $s_n^{1}$ and $s_n^{2}$ denote the second global similarities, and $\tau$ denotes a hyperparameter. It should be noted that the above formula gives an implementation that constructs the first loss function from two second global similarities, but the application is not limited thereto; the number of second global similarities to be used can be determined according to actual needs.
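The InfoNCE-style first loss function described above, with one first (positive) global similarity, any number of second (negative) global similarities, and the hyperparameter τ, can be sketched in Python as follows. The default value of `tau` is an assumption chosen for illustration, and the log-sum-exp rearrangement is a standard numerical-stability technique rather than something the source specifies.

```python
import math


def info_nce(pos_sim, neg_sims, tau=0.07):
    """InfoNCE loss: -log( exp(s_p/tau) / (exp(s_p/tau) + sum_k exp(s_n_k/tau)) ).

    pos_sim  : similarity of the matched (positive) pair
    neg_sims : similarities of the mismatched (negative) pairs
    tau      : hyperparameter; the default 0.07 is an illustrative assumption
    """
    logits = [pos_sim / tau] + [s / tau for s in neg_sims]
    # log-sum-exp with the maximum subtracted to avoid overflow
    peak = max(logits)
    log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_denominator - logits[0]


# Two negatives, as in the formula discussed in the text.
well_separated = info_nce(0.9, [0.1, 0.2])
badly_separated = info_nce(0.1, [0.9, 0.2])
```

The loss is small when the positive similarity clearly exceeds the negative ones and large otherwise, which matches the stated training objective.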

In the above embodiments of the present application, constructing the second loss function based on the data block similarity includes: obtaining, from the data block similarity, a first data block similarity corresponding to the first sample and a second data block similarity corresponding to the second sample; and constructing the second loss function based on the first data block similarity and the second data block similarity.

It should be noted that the second loss function and the first loss function are both InfoNCE loss functions, so the second loss function is constructed in the same way as the first loss function, and the details are not repeated here.

In the above embodiments of the present application, the method further includes: performing feature alignment on the first data block features and the second data block features with a feature alignment module to obtain the data block similarity.

The feature alignment module in the above step may be any model provided in the related art that can align token features. In the embodiments of the present application, the feature alignment module may be implemented with a single cross-attention layer.
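A single cross-attention layer used for token alignment might look like the following sketch. It is heavily simplified and rests on assumptions the source does not confirm: there are no learned query/key/value projections, attention is single-head scaled dot-product, and the data block similarity is taken as the mean cosine similarity between each query token and its attended counterpart. All function names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys_values):
    """For each query token, return the attention-weighted mix of the other side's tokens."""
    dim = len(queries[0])
    scale = math.sqrt(dim)
    attended = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys_values]
        weights = softmax(scores)
        attended.append([sum(w * kv[i] for w, kv in zip(weights, keys_values))
                         for i in range(dim)])
    return attended


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def block_similarity(first_tokens, second_tokens):
    """Assumed scoring rule: mean cosine between each token and its aligned counterpart."""
    aligned = cross_attention(first_tokens, second_tokens)
    sims = [cosine(q, a) for q, a in zip(first_tokens, aligned)]
    return sum(sims) / len(sims)


image_tokens = [[1.0, 0.0], [0.0, 1.0]]
matched = block_similarity(image_tokens, [[1.0, 0.0], [0.0, 1.0]])
mismatched = block_similarity(image_tokens, [[-1.0, 0.0], [0.0, -1.0]])
```

Because every token on one side attends to every token on the other, the resulting score reflects the many-to-many relationship between data blocks rather than a fixed position-by-position pairing.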

In an optional embodiment, because multimodal data is split into a large number of tokens in practical applications and token feature alignment has to process all token features, a feature alignment module provided in the related art may be used to align the token features and thereby determine the data block similarity, which improves model training efficiency and saves labor costs.

In the above embodiments of the present application, the method further includes: acquiring first target data; processing the first target data with the first processing model to obtain a first feature vector corresponding to the first target data; processing a plurality of second target data separately with the second processing model to obtain second feature vectors corresponding to the plurality of second target data; and matching the first feature vector against the second feature vectors to determine push data among the plurality of second target data, where the push data represents the second target data that matches the first target data.

In an optional embodiment, to perform multimodal data retrieval and data push, the user may input the first target data for which retrieval or push is required, and the first processing model trained by the above method is used to extract features from the first target data to obtain the first feature vector. In addition, for the plurality of second target data stored in a database (that is, the data that may be pushed to the user), the second processing model trained by the above method is used to extract features from each second target data to obtain a plurality of second feature vectors, each corresponding to one second target data. Then, by matching the first feature vector against each second feature vector, the second feature vector that matches successfully can be determined, and the second target data corresponding to that second feature vector is used as the retrieval result or push data finally delivered to the user.
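The matching step can be sketched as a nearest-neighbour lookup over the second feature vectors. Ranking by cosine similarity and returning the top `top_k` items are assumptions made for illustration; the source only states that the feature vectors are matched. The feature values and file names below are invented toy data.

```python
import math


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def retrieve(first_vector, second_vectors, top_k=1):
    """Rank second target data by similarity to the first feature vector.

    second_vectors maps each second target data item to its feature vector.
    Returns the top_k (item, score) pairs, best match first.
    """
    scored = [(item, cosine(first_vector, vec)) for item, vec in second_vectors.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical pre-computed features: a text query against two stored images.
query_feature = [1.0, 0.0]
image_features = {"chiffon_dress.jpg": [0.9, 0.1], "celebrity_a.jpg": [0.0, 1.0]}
push_data = retrieve(query_feature, image_features, top_k=1)
```

Since the second feature vectors do not depend on the query, they could be computed once and stored, leaving only the first feature vector to be computed per request.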

It should be noted that, for simplicity of description, the foregoing method embodiments are all expressed as a series of combined actions. Those skilled in the art should understand, however, that the present application is not limited by the described order of actions, because according to the present application certain steps may be performed in other orders or concurrently. Those skilled in the art should also understand that the embodiments described in the specification are all preferred embodiments, and the actions and modules involved are not necessarily required by the present application.

From the description of the above implementations, those skilled in the art can clearly understand that the method according to the above embodiments can be implemented by software together with a necessary general-purpose hardware platform, and can of course also be implemented by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solution of the present application, in essence or in the part that contributes over the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disc) and includes several instructions for causing a terminal device (which may be a mobile phone, a computer, a server, a network device, or the like) to execute the methods described in the embodiments of the present application.

Embodiment 2

According to an embodiment of the present application, a data processing method is also provided. It should be noted that the steps shown in the flowcharts of the accompanying drawings may be executed in a computer system, for example as a set of computer-executable instructions, and that although a logical order is shown in the flowcharts, in some cases the steps shown or described may be performed in an order different from the one given here.

FIG. 5 is a flowchart of a first data processing method according to an embodiment of the present application. As shown in FIG. 5, the method may include the following steps:

Step S502: acquire first target data and a plurality of second target data of different types.

The first target data and the plurality of second target data in the above step may be multimodal data, that is, they belong to two different types. For example, in the image-text search scenario, the first target data may be text data and the plurality of second target data may be image data; alternatively, the first target data may be image data and the plurality of second target data may be text data.

In an optional embodiment, the first target data may be provided by a user. Taking text data as an example, the first target data may be text manually entered by the user; taking image data as an example, the first target data may be an image captured by the user, or an image selected by the user from previously captured images or from images on the Internet. The plurality of second target data may be data stored in a database or data on the Internet.

Step S504: process the first target data with the first processing model to obtain a first feature vector corresponding to the first target data.

The first feature vector in the above step may be a vector obtained by compressing the input sequence fed to the first processing model, where the input sequence may include multiple data blocks obtained by splitting the first target data, together with a class token that can represent the overall semantics of the first target data. The first processing model may be a model capable of extracting features from the first target data. The model may be trained with data of the same type as the first target data, and may adopt any model structure from the related art that can perform feature extraction. In the embodiments of the present application, a Transformer model is taken as an example, but the application is not limited thereto. In the image-text search scenario, when the first target data is image data, the first processing model may be an image encoder; when the first target data is text data, the first processing model may be a text encoder.

在一种可选的实施例中，首先对第一目标数据进行切分，得到多个数据块patchtoken，然后增加一个可学习的嵌入向量class token，得到最终输入至第一处理模型的输入序列，其中，class token可以表示第一目标数据的整体的语义。在将输入序列输入第一处理模型进行特征提取之后，可以得到压缩后的向量，即上述的第一特征向量。In an optional embodiment, the first target data is first segmented to obtain multiple data blocks patchtoken, and then a learnable embedding vector class token is added to obtain an input sequence that is finally input to the first processing model, The class token may represent the overall semantics of the first target data. After the input sequence is input into the first processing model for feature extraction, a compressed vector, that is, the above-mentioned first feature vector can be obtained.
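The construction of the input sequence can be illustrated with a minimal sketch. It only shows the bookkeeping (splitting into blocks and adding a class token), not a real encoder. The function names, the choice to prepend the class token, the dropping of a trailing remainder, and the use of a class token with the same length as a block are all assumptions made for illustration; the text above does not specify them.

```python
from typing import List, Sequence


def split_into_blocks(data: Sequence[float], block_size: int) -> List[List[float]]:
    """Cut a flat sequence into consecutive, non-overlapping data blocks."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    # Assumption: a trailing remainder shorter than block_size is dropped.
    usable = len(data) - len(data) % block_size
    return [list(data[i:i + block_size]) for i in range(0, usable, block_size)]


def build_input_sequence(
    data: Sequence[float], block_size: int, class_token: Sequence[float]
) -> List[List[float]]:
    """Return [class token, block 1, block 2, ...] as the model input sequence."""
    if len(class_token) != block_size:
        raise ValueError("class_token must have the same length as a block")
    # Assumption: the class token is placed at the front of the sequence.
    return [list(class_token)] + split_into_blocks(data, block_size)
```

In a trained model the class token would be a learnable parameter and each block would first be projected to an embedding; both are omitted here.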

It should be noted that the first processing model may be deployed on the client or on a cloud server. To reduce the resources consumed on the client, the embodiments of the present application take deployment on the cloud server as an example.

Step S506, processing each of the plurality of second target data with the second processing model to obtain second feature vectors corresponding to the plurality of second target data.

Because the first target data and the second target data are of two different types, the first processing model and the second processing model need to be trained separately for the first target data and the second target data, so as to ensure the accuracy of feature extraction. In an image-text search scenario, when the second target data is text data, the second processing model may be a text encoder; when the second target data is image data, the second processing model may be an image encoder.

Step S508, matching the first feature vector against the second feature vectors to determine push data among the plurality of second target data, where the push data represents the second target data that matches the first target data.

In an optional embodiment, the similarity between the first feature vector and each second feature vector may be computed, thereby matching the first feature vector against each second feature vector. The second target data with the highest similarity (that is, the smallest distance to the first feature vector) is then determined to be the second target data that matches the first target data, and is therefore taken as the push data finally pushed to the user.
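The matching step can be sketched as follows. The choice of cosine similarity and the function names are assumptions; the text only requires that some similarity be computed between the first feature vector and each second feature vector. The sketch assumes a score where larger means closer, so the best match is the candidate with the largest score; with a distance measure, the smallest value would be selected instead.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_push_index(
    first_vector: Sequence[float], second_vectors: List[Sequence[float]]
) -> int:
    """Return the index of the second feature vector closest to the first one."""
    if not second_vectors:
        raise ValueError("at least one second feature vector is required")
    scores = [cosine_similarity(first_vector, v) for v in second_vectors]
    # Larger cosine similarity means a closer match.
    return max(range(len(scores)), key=scores.__getitem__)
```

The returned index identifies which of the second target data would be used as the push data.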

The parameters of the first processing model and the second processing model are adjusted based on the global similarity between a first global feature and a second global feature and the data block similarity between a first data block feature and a second data block feature. The first global feature and the first data block feature are obtained by processing first training data with the first processing model; the second global feature and the second data block feature are obtained by processing second training data with the second processing model. The first training data and the second training data are of different types. The first global feature represents the semantic feature of the first training data, and the first data block feature represents the features of the data blocks in the first training data. The first processing model and the second processing model are machine learning models.

It should be noted that, for how the first processing model and the second processing model are obtained, reference may be made to the model training method of Embodiment 1; the specific training process is not repeated here.

In the above embodiment of the present application, after the push data is determined, the method further includes: outputting the push data; receiving feedback data, where the feedback data represents the push data after modification; and adjusting the parameters of the first processing model and the second processing model based on the feedback data.

In an optional embodiment, when both the first processing model and the second processing model are deployed on the cloud server, the push data, once determined from the plurality of second target data, may be returned to the user for confirmation. If the push data meets the user's needs, the user may return confirmation information. If the push data does not meet the user's needs, the user may modify the push data to produce feedback data, and the parameters of the first processing model and the second processing model are adjusted using the feedback data, thereby improving the performance of the server.

It should be noted that the preferred implementations involved in the above embodiment of the present application are the same as the solutions, application scenarios, and implementation process provided in Embodiment 1, but are not limited to the solutions provided in Embodiment 1.

Embodiment 3

FIG. 6 is a flowchart of a second data processing method according to an embodiment of the present application. As shown in FIG. 6, the method may include the following steps:

Step S602, displaying first target data on an operation interface of a client in response to an input instruction acting on the operation interface.

The operation interface in the above step may be an interface that the client provides to the user. By operating on this interface, the user can provide the first target data, which may then be displayed in a first display area of the operation interface. The input instruction may be an instruction by which the user enters text in a text box, an instruction generated when the user clicks a "shoot" button, or an instruction generated when the user drags the first target data into an input area, but is not limited to these.

Step S604, displaying, on the operation interface, push data among a plurality of second target data in response to a push instruction acting on the operation interface of the client, where the push data represents the second target data that matches the first target data; the first target data and the plurality of second target data are of different types; the push data is obtained by matching a first feature vector corresponding to the first target data against second feature vectors corresponding to the plurality of second target data; the first feature vector is obtained by processing the first target data with a first processing model; and the second feature vector is obtained by processing the second target data with a second processing model.

The push instruction in the above step may be an instruction generated when the user clicks a "push" button. In an image-text search scenario, the push instruction may be an image-text search instruction, but is not limited to this.

It should be noted that the preferred implementations involved in the above embodiment of the present application are the same as the solutions, application scenarios, and implementation process provided in Embodiment 2, but are not limited to the solutions provided in Embodiment 2.

Embodiment 4

According to an embodiment of the present application, a data processing method applicable to virtual reality scenarios, such as virtual reality (VR) devices and augmented reality (AR) devices, is also provided. It should be noted that the steps shown in the flowcharts of the accompanying drawings may be executed in a computer system, for example as a set of computer-executable instructions, and that, although a logical order is shown in the flowcharts, in some cases the steps shown or described may be performed in a different order.

FIG. 7 is a flowchart of a third data processing method according to an embodiment of the present application. As shown in FIG. 7, the method may include the following steps:

Step S702, displaying first target data on the presentation screen of a virtual reality (VR) device or an augmented reality (AR) device.

Step S704, acquiring a plurality of second target data, where the first target data and the plurality of second target data are of different types.

Step S706, processing the first target data with a first processing model to obtain a first feature vector corresponding to the first target data.

Step S708, processing each of the plurality of second target data with a second processing model to obtain second feature vectors corresponding to the plurality of second target data.

Step S710, matching the first feature vector against the second feature vectors to determine push data among the plurality of second target data, where the push data represents the second target data that matches the first target data.

Step S712, driving the VR device or AR device to display the push data.

Optionally, in this embodiment, the above data processing method may be applied in a hardware environment consisting of a server and a virtual reality device. The push data is displayed on the presentation screen of the virtual reality (VR) device or augmented reality (AR) device. The server may be a server of a media file operator. The network connecting the server and the device includes, but is not limited to, a wide area network, a metropolitan area network, or a local area network. The virtual reality device includes, but is not limited to, a virtual reality helmet, virtual reality glasses, or an all-in-one virtual reality machine.

Optionally, the virtual reality device includes a memory, a processor, and a transmission device. The memory stores an application program that can be used to perform the following: displaying first target data on the presentation screen of the virtual reality (VR) device or augmented reality (AR) device; acquiring a plurality of second target data, where the first target data and the plurality of second target data are of different types; processing the first target data with a first processing model to obtain a first feature vector corresponding to the first target data; processing each of the plurality of second target data with a second processing model to obtain second feature vectors corresponding to the plurality of second target data; matching the first feature vector against the second feature vectors to determine push data among the plurality of second target data, where the push data represents the second target data that matches the first target data; and driving the VR device or AR device to display the push data. The parameters of the first processing model and the second processing model are adjusted based on the global similarity between a first global feature and a second global feature and the data block similarity between a first data block feature and a second data block feature. The first global feature and the first data block feature are obtained by processing first training data with the first processing model; the second global feature and the second data block feature are obtained by processing second training data with the second processing model. The first training data and the second training data are of different types. The first global feature represents the semantic feature of the first training data, and the first data block feature represents the features of the data blocks in the first training data. The first processing model and the second processing model are machine learning models.

It should be noted that the data processing method of this embodiment, as applied in a VR device or AR device, may include the method of the embodiment shown in FIG. 3, so as to drive the VR device or AR device to display the push data.

Optionally, the processor of this embodiment may invoke, through the transmission device, the application program stored in the memory to perform the above steps. The transmission device may receive media files sent by the server over the network, and may also be used for data transmission between the processor and the memory.

Optionally, the virtual reality device includes a head-mounted display (HMD) with eye tracking. The screen in the HMD displays the presented video picture; the eye-tracking module in the HMD captures the real-time movement path of the user's eyes; a tracking system tracks the user's position and motion in real three-dimensional space; and a computation unit obtains the user's real-time position and motion information from the tracking system and computes the three-dimensional coordinates of the user's head in the virtual three-dimensional space, the user's view orientation in the virtual three-dimensional space, and so on.

In the embodiments of the present application, the virtual reality device may be connected to a terminal, and the terminal is connected to the server through a network. The virtual reality device includes, but is not limited to, a virtual reality helmet, virtual reality glasses, or an all-in-one virtual reality machine; the terminal includes, but is not limited to, a PC, a mobile phone, or a tablet computer; the server may be a server of a media file operator; and the network includes, but is not limited to, a wide area network, a metropolitan area network, or a local area network.

Embodiment 5

FIG. 8 is a flowchart of another model training method according to an embodiment of the present application. As shown in FIG. 8, the method may include the following steps:

Step S802, the server obtains a model training request sent by a client by calling a first interface, where the first interface includes a first parameter, the value of the first parameter is the model training request, and the model training request is used to train a first processing model and a second processing model.

Step S804, the server obtains training samples based on the model training request, where the training samples include first training data and second training data of different types.

Step S806, the server processes the first training data with the first processing model to obtain a first global feature and a first data block feature, where the first global feature represents the semantic feature of the first training data and the first data block feature represents the features of the data blocks in the first training data.

Step S808, the server processes the second training data with the second processing model to obtain a second global feature and a second data block feature.

Step S810, the server adjusts the parameters of the first processing model and the second processing model based on the global similarity between the first global feature and the second global feature and the data block similarity between the first data block feature and the second data block feature.

Step S812, the server outputs the first processing model and the second processing model to the client by calling a second interface, where the second interface includes a second parameter, the value of the second parameter is the first processing model and the second processing model, and the first processing model and the second processing model are machine learning models.

Embodiment 6

FIG. 9 is a flowchart of a fourth data processing method according to an embodiment of the present application. As shown in FIG. 9, the method may include the following steps:

Step S902, the server obtains first target data sent by a client by calling a first interface, where the first interface includes a first parameter and the value of the first parameter is the first target data.

Step S904, the server acquires a plurality of second target data, where the first target data and the plurality of second target data are of different types.

Step S906, the server processes the first target data with a first processing model to obtain a first feature vector corresponding to the first target data.

Step S908, the server processes each of the plurality of second target data with a second processing model to obtain second feature vectors corresponding to the plurality of second target data.

Step S910, the server matches the first feature vector against the second feature vectors to determine push data among the plurality of second target data, where the push data represents the second target data that matches the first target data.

Step S912, the server outputs the push data to the client by calling a second interface, where the second interface includes a second parameter and the value of the second parameter is the push data.
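Steps S902 to S912 can be sketched as a single server-side handler. Everything named here is hypothetical: the dictionary keys `first_parameter` and `second_parameter`, the handler name, and the use of a plain dot product as the matching score are assumptions for illustration, and the two processing models are stood in for by arbitrary callables that return a feature vector.

```python
from typing import Any, Callable, Dict, List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot product, used here as a simple similarity score (an assumption)."""
    return sum(x * y for x, y in zip(a, b))


def handle_push_request(
    request: Dict[str, Any],
    second_targets: List[Any],
    first_model: Callable[[Any], Vector],
    second_model: Callable[[Any], Vector],
) -> Dict[str, Any]:
    """Hypothetical handler mirroring steps S902 to S912."""
    if not second_targets:
        raise ValueError("no second target data available")
    first_target = request["first_parameter"]                 # S902: read the first parameter
    # S904: the candidate second target data is passed in as second_targets.
    first_vector = first_model(first_target)                  # S906: first feature vector
    second_vectors = [second_model(t) for t in second_targets]  # S908: second feature vectors
    scores = [dot(first_vector, v) for v in second_vectors]   # S910: match the vectors
    best = max(range(len(scores)), key=scores.__getitem__)
    return {"second_parameter": second_targets[best]}         # S912: push data as second parameter
```

In practice the second feature vectors would likely be computed once and cached rather than recomputed per request; the sketch recomputes them only to keep the correspondence with the steps visible.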

Embodiment 7

According to an embodiment of the present application, a model processing apparatus for implementing the above model training method is also provided. As shown in FIG. 10, the apparatus 1000 includes an acquisition module 1002, a first processing module 1004, a second processing module 1006, and an adjustment module 1008.

The acquisition module 1002 is configured to acquire training samples, where the training samples include first training data and second training data of different types. The first processing module 1004 is configured to process the first training data with a first processing model to obtain a first global feature and a first data block feature, where the first global feature represents the semantic feature of the first training data and the first data block feature represents the features of the data blocks in the first training data. The second processing module 1006 is configured to process the second training data with a second processing model to obtain a second global feature and a second data block feature. The adjustment module 1008 is configured to adjust the parameters of the first processing model and the second processing model based on the global similarity between the first global feature and the second global feature and the data block similarity between the first data block feature and the second data block feature. The first processing model and the second processing model are machine learning models.

It should be noted here that the acquisition module 1002, the first processing module 1004, the second processing module 1006, and the adjustment module 1008 correspond to steps S302 to S308 in Embodiment 1. The four modules share the examples and application scenarios of their corresponding steps, but are not limited to the content disclosed in Embodiment 1. It should also be noted that the above modules, as part of the apparatus, may run in the computer terminal provided in Embodiment 1.

In the above embodiment of the present application, the adjustment module includes a first construction unit, a second construction unit, a weighting unit, and an adjustment unit.

The first construction unit is configured to construct a first loss function based on the global similarity; the second construction unit is configured to construct a second loss function based on the data block similarity; the weighting unit is configured to compute the weighted sum of the first loss function and the second loss function to obtain a target loss function; and the adjustment unit is configured to adjust the parameters of the first processing model and the second processing model based on the target loss function.
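The weighted sum that yields the target loss function can be written compactly as below. The symbols are introduced here for illustration only: the first (global) loss is written as L_global, the second (data block) loss as L_block, and the two weights as lambda_1 and lambda_2. The non-negativity of the weights is an assumption; the text does not state their values or constraints.

```latex
\mathcal{L}_{\text{target}}
  = \lambda_{1}\,\mathcal{L}_{\text{global}}
  + \lambda_{2}\,\mathcal{L}_{\text{block}},
\qquad \lambda_{1},\ \lambda_{2} \ge 0
```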

In the above embodiment of the present application, the first construction unit is further configured to: obtain, from the global similarity, a first global similarity corresponding to a first sample and a second global similarity corresponding to a second sample, where the two pieces of data of different types contained in the first sample match each other and the two pieces of data of different types contained in the second sample do not match; and construct the first loss function based on the first global similarity and the second global similarity.

In the above embodiment of the present application, the second construction unit is further configured to: obtain, from the data block similarity, a first data block similarity corresponding to the first sample and a second data block similarity corresponding to the second sample; and construct the second loss function based on the first data block similarity and the second data block similarity.
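One way to build a loss from the similarities of matching and non-matching samples is a symmetric contrastive (InfoNCE-style) loss. This is a sketch under stated assumptions, not the formulation confirmed by the text: it assumes a square similarity matrix in which entry (i, i) is the similarity of a matching pair (a first sample) and every entry (i, j) with i different from j is the similarity of a non-matching pair (a second sample). The function name and the temperature value are also assumptions.

```python
import math
from typing import List


def contrastive_loss(sim: List[List[float]], temperature: float = 0.07) -> float:
    """Symmetric InfoNCE-style loss over a square similarity matrix.

    sim[i][i] is the similarity of a matching pair; sim[i][j] with i != j
    is the similarity of a non-matching pair.
    """
    n = len(sim)
    if n == 0 or any(len(row) != n for row in sim):
        raise ValueError("sim must be a non-empty square matrix")

    def directional_loss(rows: List[List[float]]) -> float:
        total = 0.0
        for i, row in enumerate(rows):
            logits = [s / temperature for s in row]
            peak = max(logits)  # subtract the maximum for numerical stability
            log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
            total += log_denominator - logits[i]  # -log softmax of the matching pair
        return total / n

    transposed = [list(column) for column in zip(*sim)]
    # Average the first-to-second and second-to-first directions.
    return 0.5 * (directional_loss(sim) + directional_loss(transposed))
```

The same function could be applied once to a matrix of global similarities and once to a matrix of data block similarities, giving the two losses that the weighting unit then combines.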

In the above embodiment of the present application, the apparatus further includes a feature alignment module.

The feature alignment module is configured to perform feature alignment on the first data block feature and the second data block feature to obtain the data block similarity.
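How the alignment is carried out is not described in this passage. Purely as an illustration, the sketch below uses a token-wise maximum matching that is common in fine-grained cross-modal models: each data block on one side is paired with its most similar block on the other side, and the resulting scores are averaged in both directions. This is a swapped-in technique, and the function names are hypothetical; it should not be read as the alignment module's actual design.

```python
import math
from typing import List, Sequence


def _cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def aligned_block_similarity(
    first_blocks: List[Sequence[float]], second_blocks: List[Sequence[float]]
) -> float:
    """Align blocks by best match in each direction and average the scores."""
    if not first_blocks or not second_blocks:
        raise ValueError("both block lists must be non-empty")

    def one_way(source: List[Sequence[float]], target: List[Sequence[float]]) -> float:
        # For every source block, keep only its best-matching target block.
        return sum(max(_cosine(s, t) for t in target) for s in source) / len(source)

    return 0.5 * (one_way(first_blocks, second_blocks) + one_way(second_blocks, first_blocks))
```

Because each block picks its own best match, the two sides may contain different numbers of blocks, for example image patches on one side and text tokens on the other.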

Embodiment 8

According to an embodiment of the present application, a data processing apparatus for implementing the above data processing method is also provided. As shown in FIG. 11, the apparatus 1100 includes an acquisition module 1102, a first processing module 1104, a second processing module 1106, and a matching module 1108.

The acquisition module 1102 is configured to acquire first target data and a plurality of second target data of different types. The first processing module 1104 is configured to process the first target data with a first processing model to obtain a first feature vector corresponding to the first target data. The second processing module 1106 is configured to process each of the plurality of second target data with a second processing model to obtain second feature vectors corresponding to the plurality of second target data. The matching module 1108 is configured to match the first feature vector against the second feature vectors to determine push data among the plurality of second target data, where the push data represents the second target data that matches the first target data. The parameters of the first processing model and the second processing model are adjusted based on the global similarity between a first global feature and a second global feature and the data block similarity between a first data block feature and a second data block feature. The first global feature and the first data block feature are obtained by processing first training data with the first processing model; the second global feature and the second data block feature are obtained by processing second training data with the second processing model. The first training data and the second training data are of different types. The first global feature represents the semantic feature of the first training data, and the first data block feature represents the features of the data blocks in the first training data. The first processing model and the second processing model are machine learning models.

It should be noted here that the acquisition module 1102, the first processing module 1104, the second processing module 1106, and the matching module 1108 correspond to steps S502 to S508 in Embodiment 2. The four modules share the examples and application scenarios of their corresponding steps, but are not limited to the content disclosed in Embodiment 2. It should also be noted that the above modules, as part of the apparatus, may run in the computer terminal provided in Embodiment 1.

实施例9Example 9

根据本申请实施例，还提供了一种用于实施上述数据处理方法的数据处理装置，部署于客户端中，如图12所示，该装置1200包括：第一显示模块1202和第二显示模块1204。According to an embodiment of the present application, a data processing apparatus for implementing the above data processing method is also provided, which is deployed in a client. As shown in FIG. 12 , the apparatus 1200 includes: a first display module 1202 and a second display module 1204.

其中，第一显示模块1202用于响应作用于客户端操作界面上的输入指令，在操作界面上显示第一目标数据；第二显示模块1204用于响应作用于操作界面上的推送指令，在操作界面上显示多个第二目标数据中的推送数据，其中，推送数据用于表征与第一目标数据相匹配的第二目标数据，第一目标数据和多个第二目标数据的类型不同，推送数据通过将第一目标数据对应的第一特征向量和多个第二目标数据对应的第二特征向量进行匹配得到，第一特征向量通过第一处理模型对第一目标数据进行处理得到，第二特征向量通过第二处理模型对第二目标数据进行处理得到，第一处理模型和第二处理模型的参数基于第一全局特征和第二全局特征的全局相似度、第一数据块特征和第二数据块特征的数据块相似度进行调整，第一全局特征和第一数据块特征通过第一处理模型对第一训练数据进行处理得到，第二全局特征和第二数据块特征通过第二处理模型对第二训练数据进行处理得到，第一训练数据和第二训练数据是不同类型的数据，第一全局特征用于表征第一训练数据的语义特征，第一数据块特征用于表征第一训练数据中的数据块的特征，第一处理模型和第二处理模型为机器学习模型。Among them, the first display module 1202 is used to display the first target data on the operation interface in response to the input command acting on the operation interface of the client; the second display module 1204 is used to respond to the push instruction acting on the operation interface, in the operation The interface displays the push data in the plurality of second target data, wherein the push data is used to represent the second target data that matches the first target data, and the first target data and the plurality of second target data are of different types. The data is obtained by matching the first feature vector corresponding to the first target data with the second feature vectors corresponding to a plurality of second target data, the first feature vector is obtained by processing the first target data by the first processing model, and the second feature vector is obtained by processing the first target data through the first processing model. 
The feature vector is obtained by processing the second target data through the second processing model, and the parameters of the first processing model and the second processing model are based on the global similarity of the first global feature and the second global feature, the first data block feature and the second The data block similarity of the data block feature is adjusted, the first global feature and the first data block feature are obtained by processing the first training data by the first processing model, and the second global feature and the second data block feature are obtained by the second processing model. The second training data is processed to obtain, the first training data and the second training data are different types of data, the first global feature is used to represent the semantic feature of the first training data, and the first data block feature is used to represent the first training data. The features of the data blocks in the data, the first processing model and the second processing model are machine learning models.

It should be noted that the first display module 1202 and the second display module 1204 correspond to steps S602 to S604 in Embodiment 3. The two modules share the examples and application scenarios of the corresponding steps, but are not limited to the content disclosed in Embodiment 3. It should also be noted that the above modules, as part of the apparatus, may run in the computer terminal provided in Embodiment 1.

Embodiment 10

According to an embodiment of the present application, a data processing apparatus for implementing the above data processing method is further provided. As shown in FIG. 13, the apparatus 1300 includes: a first display module 1302, an acquisition module 1304, a first processing module 1306, a second processing module 1308, a matching module 1310, and a second display module 1312.

The first display module 1302 is configured to display first target data on a presentation screen of a virtual reality (VR) device or an augmented reality (AR) device. The acquisition module 1304 is configured to acquire a plurality of second target data, where the first target data and the plurality of second target data are of different types. The first processing module 1306 is configured to process the first target data with a first processing model to obtain a first feature vector corresponding to the first target data. The second processing module 1308 is configured to process each of the plurality of second target data with a second processing model to obtain second feature vectors corresponding to the plurality of second target data. The matching module 1310 is configured to match the first feature vector against the second feature vectors to determine push data among the plurality of second target data, where the push data represents the second target data that matches the first target data. The second display module 1312 is configured to drive the VR device or the AR device to display the push data. The parameters of the first processing model and the second processing model are adjusted based on the global similarity between a first global feature and a second global feature and on the data block similarity between a first data block feature and a second data block feature. The first global feature and the first data block feature are obtained by processing first training data with the first processing model, and the second global feature and the second data block feature are obtained by processing second training data with the second processing model. The first training data and the second training data are of different types, the first global feature represents the semantic feature of the first training data, the first data block feature represents the features of the data blocks in the first training data, and the first processing model and the second processing model are machine learning models.

It should be noted that the first display module 1302, the acquisition module 1304, the first processing module 1306, the second processing module 1308, the matching module 1310, and the second display module 1312 correspond to steps S702 to S712 in Embodiment 4. The six modules share the examples and application scenarios of the corresponding steps, but are not limited to the content disclosed in Embodiment 4. It should also be noted that the above modules, as part of the apparatus, may run in the computer terminal provided in Embodiment 1.

Embodiment 11

According to an embodiment of the present application, a model training apparatus for implementing the above model training method is further provided, and the apparatus is deployed on a server. As shown in FIG. 14, the apparatus 1400 includes: a first calling module 1402, an obtaining module 1404, a first processing module 1406, a second processing module 1408, an adjustment module 1410, and a second calling module 1412.

The first calling module 1402 is configured to obtain, by calling a first interface, a model training request sent by a client, where the first interface includes a first parameter, the parameter value of the first parameter is the model training request, and the model training request is used to train a first processing model and a second processing model. The obtaining module 1404 is configured to obtain training samples based on the model training request, where the training samples include first training data and second training data of different types. The first processing module 1406 is configured to process the first training data with the first processing model to obtain a first global feature and a first data block feature, where the first global feature represents the semantic feature of the first training data and the first data block feature represents the features of the data blocks in the first training data. The second processing module 1408 is configured to process the second training data with the second processing model to obtain a second global feature and a second data block feature. The adjustment module 1410 is configured to adjust the parameters of the first processing model and the second processing model based on the global similarity between the first global feature and the second global feature and on the data block similarity between the first data block feature and the second data block feature. The second calling module 1412 is configured to output the first processing model and the second processing model to the client by calling a second interface, where the second interface includes a second parameter and the parameter value of the second parameter is the first processing model and the second processing model. The parameters of the first processing model and the second processing model are adjusted based on the global similarity between the first global feature and the second global feature and on the data block similarity between the first data block feature and the second data block feature. The first global feature and the first data block feature are obtained by processing the first training data with the first processing model, and the second global feature and the second data block feature are obtained by processing the second training data with the second processing model. The first training data and the second training data are of different types, the first global feature represents the semantic feature of the first training data, the first data block feature represents the features of the data blocks in the first training data, and the first processing model and the second processing model are machine learning models.

It should be noted that the first calling module 1402, the obtaining module 1404, the first processing module 1406, the second processing module 1408, the adjustment module 1410, and the second calling module 1412 correspond to steps S802 to S812 in Embodiment 5. The six modules share the examples and application scenarios of the corresponding steps, but are not limited to the content disclosed in Embodiment 5. It should also be noted that the above modules, as part of the apparatus, may run in the computer terminal provided in Embodiment 1.

Embodiment 12

According to an embodiment of the present application, a data processing apparatus for implementing the above data processing method is further provided, and the apparatus is deployed on a server. As shown in FIG. 15, the apparatus 1500 includes: a first calling module 1502, an obtaining module 1504, a first processing module 1506, a second processing module 1508, a matching module 1510, and a second calling module 1512.

The first calling module 1502 is configured to obtain, by calling a first interface, first target data sent by a client, where the first interface includes a first parameter and the parameter value of the first parameter is the first target data. The obtaining module 1504 is configured to obtain a plurality of second target data, where the first target data and the plurality of second target data are of different types. The first processing module 1506 is configured to process the first target data with a first processing model to obtain a first feature vector corresponding to the first target data. The second processing module 1508 is configured to process each of the plurality of second target data with a second processing model to obtain second feature vectors corresponding to the plurality of second target data. The matching module 1510 is configured to match the first feature vector against the second feature vectors to determine push data among the plurality of second target data, where the push data represents the second target data that matches the first target data. The second calling module 1512 is configured to output the push data to the client by calling a second interface, where the second interface includes a second parameter and the parameter value of the second parameter is the push data.

It should be noted that the first calling module 1502, the obtaining module 1504, the first processing module 1506, the second processing module 1508, the matching module 1510, and the second calling module 1512 correspond to steps S902 to S912 in Embodiment 6. The six modules share the examples and application scenarios of the corresponding steps, but are not limited to the content disclosed in Embodiment 6. It should also be noted that the above modules, as part of the apparatus, may run in the computer terminal provided in Embodiment 1.

Embodiment 13

Embodiments of the present application may provide a computer terminal, which may be any computer terminal in a computer terminal cluster. Optionally, in this embodiment, the computer terminal may also be replaced by a terminal device such as a mobile terminal.

Optionally, in this embodiment, the computer terminal may be located in at least one network device among a plurality of network devices of a computer network.

In this embodiment, the computer terminal may execute program code for the following steps of the model training method: obtaining training samples, where the training samples include first training data and second training data of different types; processing the first training data with a first processing model to obtain a first global feature and a first data block feature, where the first global feature represents the semantic feature of the first training data and the first data block feature represents the features of the data blocks in the first training data; processing the second training data with a second processing model to obtain a second global feature and a second data block feature; and adjusting the parameters of the first processing model and the second processing model based on the global similarity between the first global feature and the second global feature and on the data block similarity between the first data block feature and the second data block feature, where the first processing model and the second processing model are machine learning models.

Optionally, FIG. 16 is a structural block diagram of a computer terminal according to an embodiment of the present application. As shown in FIG. 16, the computer terminal A may include one or more processors 1602 (only one is shown in the figure) and a memory 1604.

The memory may be used to store software programs and modules, such as the program instructions/modules corresponding to the model training method and apparatus and the data processing method and apparatus in the embodiments of the present application. By running the software programs and modules stored in the memory, the processor performs various functional applications and data processing, that is, implements the above model training method and data processing method. The memory may include high-speed random access memory and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory may further include memory located remotely from the processor, and such remote memory may be connected to the terminal A through a network. Examples of such networks include, but are not limited to, the Internet, an intranet, a local area network, a mobile communication network, and combinations thereof.

The processor may call, through a transmission device, the information and application programs stored in the memory to perform the following steps: obtaining training samples, where the training samples include first training data and second training data of different types; processing the first training data with a first processing model to obtain a first global feature and a first data block feature, where the first global feature represents the semantic feature of the first training data and the first data block feature represents the features of the data blocks in the first training data; processing the second training data with a second processing model to obtain a second global feature and a second data block feature; and adjusting the parameters of the first processing model and the second processing model based on the global similarity between the first global feature and the second global feature and on the data block similarity between the first data block feature and the second data block feature, where the first processing model and the second processing model are machine learning models.

Optionally, the processor may further execute program code for the following steps: constructing a first loss function based on the global similarity; constructing a second loss function based on the data block similarity; obtaining a weighted sum of the first loss function and the second loss function as a target loss function; and adjusting the parameters of the first processing model and the second processing model based on the target loss function.
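
As a non-limiting illustration, the weighted sum that forms the target loss function may be sketched as follows. The function name `target_loss` and the weights `w_global` and `w_block` are assumptions introduced here for illustration; the application does not specify the weight values.

```python
def target_loss(global_loss: float, block_loss: float,
                w_global: float = 1.0, w_block: float = 1.0) -> float:
    """Combine the two losses into one target loss by a weighted sum.

    global_loss: value of the first loss function (built from the global similarity).
    block_loss:  value of the second loss function (built from the data block similarity).
    w_global, w_block: hypothetical weights; their values are not given in the application.
    """
    return w_global * global_loss + w_block * block_loss
```

In a gradient-based framework, the same expression would be applied to differentiable loss values so that one backward pass updates both processing models.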

Optionally, the processor may further execute program code for the following steps: obtaining, from the global similarity, a first global similarity corresponding to a first sample and a second global similarity corresponding to a second sample, where the two data of different types contained in the first sample match each other and the two data of different types contained in the second sample do not match; and constructing the first loss function based on the first global similarity and the second global similarity.
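
As a non-limiting illustration, one common way to build a loss from the similarities of matched and unmatched samples is a softmax-based contrastive loss. The sketch below assumes this form; the text above does not state the exact formula, and the `temperature` value and the square similarity-matrix layout (matched pairs on the diagonal) are assumptions made for illustration.

```python
import math
from typing import List


def contrastive_loss(sim: List[List[float]], temperature: float = 0.07) -> float:
    """Softmax contrastive loss over a square matrix of global similarities.

    sim[i][j] is the global similarity between first-type item i and
    second-type item j. Diagonal entries belong to matched pairs (first
    samples); off-diagonal entries belong to unmatched pairs (second samples).
    """
    n = len(sim)
    total = 0.0
    for i in range(n):
        logits = [s / temperature for s in sim[i]]
        m = max(logits)  # subtract the maximum for numerical stability
        log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
        total += log_denominator - logits[i]  # negative log-probability of the match
    return total / n
```

The loss falls as matched pairs score higher than unmatched pairs. This sketch covers only the first-to-second direction; a symmetric variant would average it with the same loss computed on the transposed matrix.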

Optionally, the processor may further execute program code for the following steps: obtaining, from the data block similarity, a first data block similarity corresponding to the first sample and a second data block similarity corresponding to the second sample; and constructing the second loss function based on the first data block similarity and the second data block similarity.

Optionally, the processor may further execute program code for the following step: performing feature alignment on the first data block feature and the second data block feature with a feature alignment module to obtain the data block similarity.
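
As a non-limiting illustration, a data block similarity can be computed by aligning each data block of the first data with its best-matching data block of the second data. The sketch below assumes a simple max-then-mean alignment over cosine similarities; the internal structure of the feature alignment module is not described in this passage, so the functions `cosine` and `block_similarity` are hypothetical stand-ins.

```python
import math
from typing import Sequence

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity of two vectors; returns 0.0 if either vector is all zeros."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def block_similarity(first_blocks: Sequence[Vector],
                     second_blocks: Sequence[Vector]) -> float:
    """Align each first-type block with its most similar second-type block,
    then average the aligned similarities (assumed alignment rule)."""
    best = [max(cosine(p, t) for t in second_blocks) for p in first_blocks]
    return sum(best) / len(best)
```

For image-text data, `first_blocks` could hold image patch features and `second_blocks` could hold text token features, so that each patch is scored against the token it matches best.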

The processor may call, through a transmission device, the information and application programs stored in the memory to perform the following steps: obtaining first target data and a plurality of second target data of different types; processing the first target data with a first processing model to obtain a first feature vector corresponding to the first target data; processing each of the plurality of second target data with a second processing model to obtain second feature vectors corresponding to the plurality of second target data; and matching the first feature vector against the second feature vectors to determine push data among the plurality of second target data, where the push data represents the second target data that matches the first target data. The parameters of the first processing model and the second processing model are adjusted based on the global similarity between a first global feature and a second global feature and on the data block similarity between a first data block feature and a second data block feature. The first global feature and the first data block feature are obtained by processing first training data with the first processing model, and the second global feature and the second data block feature are obtained by processing second training data with the second processing model. The first training data and the second training data are of different types, the first global feature represents the semantic feature of the first training data, the first data block feature represents the features of the data blocks in the first training data, and the first processing model and the second processing model are machine learning models.
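
As a non-limiting illustration, the matching step can be sketched as ranking the second feature vectors by their similarity to the first feature vector and keeping the best candidates as push data. Cosine similarity and the `top_k` parameter are assumptions made for illustration; the text above only states that the vectors are matched.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity of two vectors; returns 0.0 if either vector is all zeros."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def match(first_vector: Vector, second_vectors: Sequence[Vector],
          top_k: int = 1) -> List[int]:
    """Return the indices of the top_k second feature vectors that are most
    similar to the first feature vector, best match first."""
    scores = [cosine(first_vector, v) for v in second_vectors]
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:top_k]
```

The returned indices identify which of the second target data would be selected as push data.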

The processor may call, through a transmission device, the information and application programs stored in the memory to perform the following steps: displaying first target data on an operation interface of a client in response to an input instruction acting on the operation interface; and displaying, in response to a push instruction acting on the operation interface, push data from among a plurality of second target data on the operation interface. The push data represents the second target data that matches the first target data, and the first target data and the plurality of second target data are of different types. The push data is obtained by matching a first feature vector corresponding to the first target data against second feature vectors corresponding to the plurality of second target data; the first feature vector is obtained by processing the first target data with a first processing model, and the second feature vectors are obtained by processing the second target data with a second processing model. The parameters of the first processing model and the second processing model are adjusted based on the global similarity between a first global feature and a second global feature and on the data block similarity between a first data block feature and a second data block feature. The first global feature and the first data block feature are obtained by processing first training data with the first processing model, and the second global feature and the second data block feature are obtained by processing second training data with the second processing model. The first training data and the second training data are of different types, the first global feature represents the semantic feature of the first training data, the first data block feature represents the features of the data blocks in the first training data, and the first processing model and the second processing model are machine learning models.

The processor may call, through a transmission device, the information and application programs stored in the memory to perform the following steps: displaying first target data on a presentation screen of a virtual reality (VR) device or an augmented reality (AR) device; acquiring a plurality of second target data, where the first target data and the plurality of second target data are of different types; processing the first target data with a first processing model to obtain a first feature vector corresponding to the first target data; processing each of the plurality of second target data with a second processing model to obtain second feature vectors corresponding to the plurality of second target data; matching the first feature vector against the second feature vectors to determine push data among the plurality of second target data, where the push data represents the second target data that matches the first target data; and driving the VR device or the AR device to display the push data. The parameters of the first processing model and the second processing model are adjusted based on the global similarity between a first global feature and a second global feature and on the data block similarity between a first data block feature and a second data block feature. The first global feature and the first data block feature are obtained by processing first training data with the first processing model, and the second global feature and the second data block feature are obtained by processing second training data with the second processing model. The first training data and the second training data are of different types, the first global feature represents the semantic feature of the first training data, the first data block feature represents the features of the data blocks in the first training data, and the first processing model and the second processing model are machine learning models.

The processor may call, through a transmission device, the information and application programs stored in the memory to perform the following steps: obtaining, by calling a first interface, a model training request sent by a client, where the first interface includes a first parameter, the parameter value of the first parameter is the model training request, and the model training request is used to train a first processing model and a second processing model; obtaining training samples based on the model training request, where the training samples include first training data and second training data of different types; processing the first training data with the first processing model to obtain a first global feature and a first data block feature, where the first global feature represents the semantic feature of the first training data and the first data block feature represents the features of the data blocks in the first training data; processing the second training data with the second processing model to obtain a second global feature and a second data block feature; adjusting the parameters of the first processing model and the second processing model based on the global similarity between the first global feature and the second global feature and on the data block similarity between the first data block feature and the second data block feature; and outputting the first processing model and the second processing model to the client by calling a second interface, where the second interface includes a second parameter and the parameter value of the second parameter is the first processing model and the second processing model. The parameters of the first processing model and the second processing model are adjusted based on the global similarity between the first global feature and the second global feature and on the data block similarity between the first data block feature and the second data block feature. The first global feature and the first data block feature are obtained by processing the first training data with the first processing model, and the second global feature and the second data block feature are obtained by processing the second training data with the second processing model. The first training data and the second training data are of different types, the first global feature represents the semantic feature of the first training data, the first data block feature represents the features of the data blocks in the first training data, and the first processing model and the second processing model are machine learning models.

The processor may call, through a transmission device, the information and application programs stored in the memory to perform the following steps: obtaining, by calling a first interface, first target data sent by a client, where the first interface includes a first parameter and the parameter value of the first parameter is the first target data; obtaining a plurality of second target data, where the first target data and the plurality of second target data are of different types; processing the first target data with a first processing model to obtain a first feature vector corresponding to the first target data; processing each of the plurality of second target data with a second processing model to obtain second feature vectors corresponding to the plurality of second target data; matching the first feature vector against the second feature vectors to determine push data among the plurality of second target data, where the push data represents the second target data that matches the first target data; and outputting the push data to the client by calling a second interface, where the second interface includes a second parameter and the parameter value of the second parameter is the push data. The parameters of the first processing model and the second processing model are adjusted based on the global similarity between a first global feature and a second global feature and on the data block similarity between a first data block feature and a second data block feature. The first global feature and the first data block feature are obtained by processing first training data with the first processing model, and the second global feature and the second data block feature are obtained by processing second training data with the second processing model. The first training data and the second training data are of different types, the first global feature represents the semantic feature of the first training data, the first data block feature represents the features of the data blocks in the first training data, and the first processing model and the second processing model are machine learning models.
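
As a non-limiting illustration, the two interfaces of the server-side flow can be sketched as a request type carrying the first parameter and a response type carrying the second parameter. All names below (`FirstInterfaceRequest`, `SecondInterfaceResponse`, `handle_request`, `encode_first`, `encode_second`) are hypothetical and introduced only for illustration; the encoders stand in for the trained first and second processing models, and the dot product is an assumed matching score.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

Vector = Sequence[float]


@dataclass
class FirstInterfaceRequest:
    first_target_data: str  # value of the first parameter, sent by the client


@dataclass
class SecondInterfaceResponse:
    push_data: List[str]  # value of the second parameter, returned to the client


def dot(a: Vector, b: Vector) -> float:
    """Dot product used here as the matching score (an assumption)."""
    return sum(x * y for x, y in zip(a, b))


def handle_request(request: FirstInterfaceRequest,
                   second_target_data: Sequence[str],
                   encode_first: Callable[[str], Vector],
                   encode_second: Callable[[str], Vector],
                   top_k: int = 1) -> SecondInterfaceResponse:
    """Encode the first target data, score every second target data item
    against it, and return the top_k best matches as push data."""
    first_vector = encode_first(request.first_target_data)
    scored = [(dot(first_vector, encode_second(item)), item)
              for item in second_target_data]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return SecondInterfaceResponse(push_data=[item for _, item in scored[:top_k]])
```

In a deployment, the encoders would wrap the trained models and the request and response would be serialized over whatever transport the server exposes; the sketch leaves those details out.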

According to the embodiments of the present application, a model training solution is provided. The first training data is processed by the first processing model to obtain a first global feature and a first data block feature, and the second training data is processed by the second processing model to obtain a second global feature and a second data block feature. The parameters of the first processing model and the second processing model are then adjusted based on the global similarity between the first global feature and the second global feature and the data block similarity between the first data block feature and the second data block feature, thereby training the feature extraction models. This expresses the similarity between images and texts more accurately, improves the training effect, and further improves the accuracy with which the models process multi-modal data, thereby solving the technical problem in the related art that models process multi-modal data with poor accuracy. In an image-text search scenario, the first processing model and the second processing model can extract more accurate image features and text features, so that the retrieved information is more accurate and better matches the user's search needs, which improves the accuracy of image-text search.

Those of ordinary skill in the art can understand that the structure shown in FIG. 16 is for illustration only. The computer terminal may also be a terminal device such as a smartphone (for example, an Android phone or an iOS phone), a tablet computer, a palmtop computer, a Mobile Internet Device (MID), or a PAD. FIG. 16 does not limit the structure of the above electronic device. For example, the computer terminal A may include more or fewer components than those shown in FIG. 16 (for example, a network interface or a display device), or have a configuration different from that shown in FIG. 16.

Those of ordinary skill in the art can understand that all or part of the steps in the methods of the above embodiments may be completed by a program instructing hardware related to the terminal device. The program may be stored in a computer-readable storage medium, and the storage medium may include a flash drive, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disc, or the like.

Example 14

Embodiments of the present application also provide a computer-readable storage medium. Optionally, in this embodiment, the computer-readable storage medium may be used to store the program code executed by the model training method and the data processing method provided in the above embodiments.

Optionally, in this embodiment, the computer-readable storage medium may be located in any computer terminal of a computer terminal cluster in a computer network, or in any mobile terminal of a mobile terminal group.

Optionally, in this embodiment, the computer-readable storage medium is configured to store program code for performing the following steps: obtaining training samples, the training samples including first training data and second training data of different types; processing the first training data by using a first processing model to obtain a first global feature and a first data block feature, wherein the first global feature represents the semantic feature of the first training data and the first data block feature represents the features of the data blocks in the first training data; processing the second training data by using a second processing model to obtain a second global feature and a second data block feature; and adjusting the parameters of the first processing model and the second processing model based on the global similarity between the first global feature and the second global feature and the data block similarity between the first data block feature and the second data block feature, wherein the first processing model and the second processing model are machine learning models.
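As a non-limiting illustration of the global similarity used in the steps above, the following sketch computes a cosine similarity between every pair of first and second global features in a batch. The application does not fix a particular similarity measure; cosine similarity, the function names, and the toy vectors are assumptions made only for this example.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def global_similarity_matrix(first_globals, second_globals):
    """Cosine similarity between every first/second global feature pair.

    Entry [i][j] compares first training data i with second training data j.
    When the two lists are paired by index, the diagonal holds matched pairs.
    """
    a = [l2_normalize(v) for v in first_globals]
    b = [l2_normalize(v) for v in second_globals]
    return [[sum(x * y for x, y in zip(u, v)) for v in b] for u in a]

# Toy global features (assumed values): two images and their two captions.
image_globals = [[1.0, 0.0], [0.0, 2.0]]
text_globals = [[3.0, 0.0], [0.0, 1.0]]
sim = global_similarity_matrix(image_globals, text_globals)
```

In this toy batch the diagonal entries correspond to matched image-text pairs and the off-diagonal entries to unmatched pairs.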

Optionally, the computer-readable storage medium is further configured to store program code for performing the following steps: constructing a first loss function based on the global similarity; constructing a second loss function based on the data block similarity; obtaining a weighted sum of the first loss function and the second loss function to obtain a target loss function; and adjusting the parameters based on the target loss function.
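The weighted sum described above can be sketched as follows. The weight names and their default values are hypothetical; the application states only that the target loss function is a weighted sum of the first and second loss functions.

```python
def target_loss(first_loss, second_loss, global_weight=1.0, block_weight=1.0):
    """Weighted sum of the global-similarity loss and the data-block-similarity loss.

    global_weight and block_weight are assumed hyperparameters, not values
    taken from the application.
    """
    return global_weight * first_loss + block_weight * second_loss

# Example with assumed loss values and weights.
total = target_loss(0.8, 0.4, global_weight=1.0, block_weight=0.5)
```

Raising `block_weight` relative to `global_weight` would make the fine-grained data block term dominate the parameter update, and lowering it would favor the global semantic term.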

Optionally, the computer-readable storage medium is further configured to store program code for performing the following steps: obtaining, from the global similarity, a first global similarity corresponding to a first sample and a second global similarity corresponding to a second sample, wherein the two pieces of data of different types contained in the first sample match each other and the two pieces of data of different types contained in the second sample do not match; and constructing the first loss function based on the first global similarity and the second global similarity.
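One possible way to build a loss from matched and unmatched global similarities is a symmetric contrastive (InfoNCE-style) loss over a batch similarity matrix. The application does not name a specific loss form, so the choice of InfoNCE, the temperature value, and the function name below are assumptions for illustration only.

```python
import math

def contrastive_loss(sim, temperature=0.07):
    """Symmetric InfoNCE-style loss over a square similarity matrix.

    sim[i][i] is the similarity of a matched pair (a "first sample");
    sim[i][j] with i != j is that of an unmatched pair (a "second sample").
    """
    n = len(sim)

    def direction_loss(rows):
        total = 0.0
        for i, row in enumerate(rows):
            logits = [s / temperature for s in row]
            peak = max(logits)  # subtract the max for numerical stability
            log_sum = peak + math.log(sum(math.exp(l - peak) for l in logits))
            total += log_sum - logits[i]  # negative log-softmax of the match
        return total / n

    columns = [list(col) for col in zip(*sim)]
    # Average the first-to-second and second-to-first directions.
    return 0.5 * (direction_loss(sim) + direction_loss(columns))
```

The loss decreases as matched pairs score higher than unmatched pairs in the same row and column, which is the behavior the step above asks of the first loss function.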

Optionally, the computer-readable storage medium is further configured to store program code for performing the following steps: obtaining, from the data block similarity, a first data block similarity corresponding to the first sample and a second data block similarity corresponding to the second sample; and constructing the second loss function based on the first data block similarity and the second data block similarity.
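As a hedged sketch of a second loss function built from matched and unmatched data block similarities, the example below uses a margin-based hinge loss. The hinge form, the margin value, and the function name are assumptions; the application leaves the exact form of the second loss function open.

```python
def block_margin_loss(matched_sims, unmatched_sims, margin=0.2):
    """Hinge loss that pushes each matched data block similarity above
    each unmatched one by at least `margin` (an assumed hyperparameter)."""
    terms = [max(0.0, margin - pos + neg)
             for pos in matched_sims
             for neg in unmatched_sims]
    return sum(terms) / len(terms) if terms else 0.0
```

With this form, a matched pair that already exceeds every unmatched pair by the margin contributes no loss, so training effort concentrates on pairs whose block-level similarities are still confusable.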

Optionally, the computer-readable storage medium is further configured to store program code for performing the following step: performing feature alignment on the first data block feature and the second data block feature by using a feature alignment module to obtain the data block similarity.
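The internal structure of the feature alignment module is not detailed in this passage. One simple alignment scheme, shown here purely as an assumption, matches each second data block (for example, a word) to its most similar first data block (for example, an image patch) and averages the resulting scores.

```python
import math

def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(u, v))
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(y * y for y in v))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0

def block_similarity(first_blocks, second_blocks):
    """Align each second block to its best-matching first block, then average.

    Both arguments are non-empty lists of feature vectors of equal dimension.
    """
    best = [max(cosine(p, t) for p in first_blocks) for t in second_blocks]
    return sum(best) / len(best)

# Toy block features (assumed values): two image patches, one word.
patches = [[1.0, 0.0], [0.0, 1.0]]
words = [[0.0, 2.0]]
score = block_similarity(patches, words)
```

Because each word is scored against its best patch, a caption can obtain a high data block similarity even when only some regions of the image are relevant to it.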

Optionally, in this embodiment, the computer-readable storage medium is configured to store program code for performing the following steps: displaying first target data on an operation interface of a client in response to an input instruction acting on the operation interface; and displaying, on the operation interface, push data among a plurality of second target data in response to a push instruction acting on the operation interface, wherein the push data represents the second target data that matches the first target data; the first target data and the plurality of second target data are of different types; the push data is obtained by matching a first feature vector corresponding to the first target data against second feature vectors corresponding to the plurality of second target data; the first feature vector is obtained by processing the first target data with a first processing model; the second feature vectors are obtained by processing the second target data with a second processing model; the parameters of the first processing model and the second processing model are adjusted based on the global similarity between a first global feature and a second global feature and the data block similarity between a first data block feature and a second data block feature; the first global feature and the first data block feature are obtained by processing first training data with the first processing model; the second global feature and the second data block feature are obtained by processing second training data with the second processing model; the first training data and the second training data are data of different types; the first global feature represents the semantic feature of the first training data; the first data block feature represents the features of the data blocks in the first training data; and the first processing model and the second processing model are machine learning models.

Optionally, in this embodiment, the computer-readable storage medium is configured to store program code for performing the following steps: displaying first target data on a presentation screen of a virtual reality (VR) device or an augmented reality (AR) device; obtaining a plurality of second target data, wherein the first target data and the plurality of second target data are of different types; processing the first target data by using a first processing model to obtain a first feature vector corresponding to the first target data; processing the plurality of second target data respectively by using a second processing model to obtain second feature vectors corresponding to the plurality of second target data; matching the first feature vector against the second feature vectors to determine push data among the plurality of second target data, wherein the push data represents the second target data that matches the first target data; and driving the VR device or the AR device to display the push data. The parameters of the first processing model and the second processing model are adjusted based on the global similarity between a first global feature and a second global feature and the data block similarity between a first data block feature and a second data block feature; the first global feature and the first data block feature are obtained by processing first training data with the first processing model; the second global feature and the second data block feature are obtained by processing second training data with the second processing model; the first training data and the second training data are data of different types; the first global feature represents the semantic feature of the first training data; the first data block feature represents the features of the data blocks in the first training data; and the first processing model and the second processing model are machine learning models.

Optionally, in this embodiment, the computer-readable storage medium is configured to store program code for performing the following steps: obtaining, by calling a first interface, a model training request sent by a client, wherein the first interface includes a first parameter whose parameter value is the model training request, and the model training request is used to train a first processing model and a second processing model; obtaining training samples based on the model training request, wherein the training samples include first training data and second training data of different types; processing the first training data by using the first processing model to obtain a first global feature and a first data block feature, wherein the first global feature represents the semantic feature of the first training data and the first data block feature represents the features of the data blocks in the first training data; processing the second training data by using the second processing model to obtain a second global feature and a second data block feature; adjusting the parameters of the first processing model and the second processing model based on the global similarity between the first global feature and the second global feature and the data block similarity between the first data block feature and the second data block feature; and outputting the first processing model and the second processing model to the client by calling a second interface, wherein the second interface includes a second parameter whose parameter value is the first processing model and the second processing model, and the first processing model and the second processing model are machine learning models.
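The request and response flow through the first and second interfaces can be sketched as below. The class names, field names, and the `train_fn` callback are hypothetical stand-ins chosen for this example; the application specifies only that the first parameter carries the model training request and the second parameter carries the two trained models.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Tuple

Sample = Tuple[Any, Any]  # (first training data, second training data)

@dataclass
class ModelTrainingRequest:
    """Hypothetical value of the first parameter of the first interface."""
    samples: List[Sample]

@dataclass
class ModelTrainingResult:
    """Hypothetical value of the second parameter of the second interface."""
    first_model: Any
    second_model: Any

def handle_training_request(
    request: ModelTrainingRequest,
    train_fn: Callable[[List[Sample]], Tuple[Any, Any]],
) -> ModelTrainingResult:
    """Server side: take the request, train both models, package the result."""
    first_model, second_model = train_fn(request.samples)
    return ModelTrainingResult(first_model, second_model)

# Stub trainer used only to exercise the flow; a real one would run the
# global and data block similarity training described above.
def stub_train(samples):
    return ("image_encoder", "text_encoder")

result = handle_training_request(
    ModelTrainingRequest(samples=[("img_0", "caption_0")]), stub_train
)
```

Passing the training routine in as a callback keeps the interface layer independent of how the two processing models are actually trained.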

Optionally, in this embodiment, the computer-readable storage medium is configured to store program code for performing the following steps: obtaining, by calling a first interface, first target data sent by a client, wherein the first interface includes a first parameter whose parameter value is the first target data; obtaining a plurality of second target data, wherein the first target data and the plurality of second target data are of different types; processing the first target data by using a first processing model to obtain a first feature vector corresponding to the first target data; processing the plurality of second target data respectively by using a second processing model to obtain second feature vectors corresponding to the plurality of second target data; matching the first feature vector against the second feature vectors to determine push data among the plurality of second target data, wherein the push data represents the second target data that matches the first target data; and outputting the push data to the client by calling a second interface, wherein the second interface includes a second parameter whose parameter value is the push data. The parameters of the first processing model and the second processing model are adjusted based on the global similarity between a first global feature and a second global feature and the data block similarity between a first data block feature and a second data block feature; the first global feature and the first data block feature are obtained by processing first training data with the first processing model; the second global feature and the second data block feature are obtained by processing second training data with the second processing model; the first training data and the second training data are data of different types; the first global feature represents the semantic feature of the first training data; the first data block feature represents the features of the data blocks in the first training data; and the first processing model and the second processing model are machine learning models.
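The matching step, in which the first feature vector is compared with the second feature vectors to select push data, can be illustrated with a simple ranking routine. Cosine similarity, the `top_k` cutoff, and the function name are assumptions made for this sketch; the application requires only that the feature vectors be matched.

```python
import math

def rank_candidates(query_vec, candidate_vecs, top_k=3):
    """Return (index, score) pairs for the top_k candidates, best first."""
    def cosine(u, v):
        dot = sum(x * y for x, y in zip(u, v))
        nu = math.sqrt(sum(x * x for x in u))
        nv = math.sqrt(sum(y * y for y in v))
        return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0

    scored = [(i, cosine(query_vec, c)) for i, c in enumerate(candidate_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

# Toy vectors (assumed values): one text query against three image vectors.
query = [1.0, 0.0]
candidates = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
push = rank_candidates(query, candidates, top_k=2)
```

The indices in `push` identify which of the second target data would be returned as push data in this toy setting.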

The serial numbers of the above embodiments of the present application are for description only and do not indicate the relative merits of the embodiments.

In the above embodiments of the present application, the description of each embodiment has its own emphasis. For parts not described in detail in one embodiment, reference may be made to the related descriptions of the other embodiments.

In the several embodiments provided in this application, it should be understood that the disclosed technical content may be implemented in other ways. The apparatus embodiments described above are merely illustrative. For example, the division into units is only a division by logical function, and other divisions are possible in actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, units, or modules, and may be electrical or take other forms.

The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.

If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, or an optical disc.

The above are only preferred embodiments of the present application. It should be noted that those of ordinary skill in the art may make several improvements and refinements without departing from the principles of the present application, and such improvements and refinements shall also be regarded as falling within the protection scope of the present application.

Claims

1. A method of model training, comprising:

obtaining training samples, wherein the training samples comprise: different types of first training data and second training data;

processing the first training data by using a first processing model to obtain a first global feature and a first data block feature, wherein the first global feature is used for representing semantic features of the first training data, and the first data block feature is used for representing features of data blocks in the first training data;

processing the second training data by using a second processing model to obtain a second global feature and a second data block feature;

adjusting parameters of the first processing model and the second processing model based on the global similarity of the first global feature and the second global feature and the data block similarity of the first data block feature and the second data block feature, wherein the first processing model and the second processing model are machine learning models.

2. The method of claim 1, wherein adjusting parameters of the first processing model and the second processing model based on the global similarity of the first global feature and the second global feature and the data block similarity of the first data block feature and the second data block feature comprises:

constructing a first loss function based on the global similarity;

constructing a second loss function based on the data block similarity;

obtaining a weighted sum of the first loss function and the second loss function to obtain a target loss function;

adjusting parameters of the first processing model and the second processing model based on the target loss function.

3. The method of claim 2, wherein constructing a first loss function based on the global similarity comprises:

acquiring a first global similarity corresponding to a first sample and a second global similarity corresponding to a second sample from the global similarities, wherein the two different types of data contained in the first sample are matched, and the two different types of data contained in the second sample are not matched;

constructing the first loss function based on the first global similarity and the second global similarity.

4. The method of claim 2, wherein constructing a second loss function based on the data block similarity comprises:

acquiring a first data block similarity corresponding to a first sample and a second data block similarity corresponding to a second sample from the data block similarities;

and constructing the second loss function based on the first data block similarity and the second data block similarity.

5. The method of claim 1, further comprising:

performing feature alignment on the first data block feature and the second data block feature by using a feature alignment module to obtain the data block similarity.

6. The method according to any one of claims 1 to 5, wherein the first training data is image data and the second training data is text data.

7. A data processing method, comprising:

acquiring first target data and a plurality of second target data, wherein the first target data and the plurality of second target data are of different types;

processing the first target data by using a first processing model to obtain a first feature vector corresponding to the first target data;

respectively processing the plurality of second target data by using a second processing model to obtain second feature vectors corresponding to the plurality of second target data;

matching the first feature vector and the second feature vector, and determining push data in the plurality of second target data, wherein the push data is used for representing the second target data matched with the first target data;

wherein parameters of the first processing model and the second processing model are adjusted based on the global similarity of a first global feature and a second global feature and the data block similarity of a first data block feature and a second data block feature, the first global feature and the first data block feature are obtained by processing first training data through the first processing model, the second global feature and the second data block feature are obtained by processing second training data through the second processing model, the first training data and the second training data are different types of data, the first global feature is used for representing the semantic feature of the first training data, the first data block feature is used for representing the feature of a data block in the first training data, and the first processing model and the second processing model are machine learning models.

8. A data processing method, comprising:

responding to an input instruction acted on an operation interface of a client, and displaying first target data on the operation interface;

displaying, on the operation interface, push data in a plurality of second target data in response to a push instruction acting on the operation interface, wherein the push data is used for representing the second target data matched with the first target data, the types of the first target data and the plurality of second target data are different, the push data is obtained by matching a first feature vector corresponding to the first target data with second feature vectors corresponding to the plurality of second target data, the first feature vector is obtained by processing the first target data through a first processing model, the second feature vectors are obtained by processing the second target data through a second processing model, parameters of the first processing model and the second processing model are adjusted based on the global similarity of a first global feature and a second global feature and the data block similarity of a first data block feature and a second data block feature, the first global feature and the first data block feature are obtained by processing first training data through the first processing model, the second global feature and the second data block feature are obtained by processing second training data through the second processing model, the first training data and the second training data are different types of data, the first global feature is used for representing the semantic feature of the first training data, the first data block feature is used for representing the feature of a data block in the first training data, and the first processing model and the second processing model are machine learning models.

9. A method of model training, comprising:

a server obtains a model training request sent by a client by calling a first interface, wherein the first interface comprises a first parameter, the parameter value of the first parameter is the model training request, and the model training request is used for training a first processing model and a second processing model;

the server obtains training samples based on the model training request, wherein the training samples comprise: different types of first training data and second training data;

the server processes the first training data by using the first processing model to obtain a first global feature and a first data block feature, wherein the first global feature is used for representing a semantic feature of the first training data, and the first data block feature is used for representing a feature of a data block in the first training data;

the server processes the second training data by using the second processing model to obtain a second global feature and a second data block feature;

the server adjusts parameters of the first processing model and the second processing model based on the global similarity of the first global feature and the second global feature and the data block similarity of the first data block feature and the second data block feature;

the server outputs the first processing model and the second processing model to the client by calling a second interface, wherein the second interface comprises a second parameter, a parameter value of the second parameter is the first processing model and the second processing model, and the first processing model and the second processing model are machine learning models.

10. A data processing method, comprising:

a server obtains first target data sent by a client by calling a first interface, wherein the first interface comprises a first parameter, and a parameter value of the first parameter is the first target data;

the server acquires a plurality of second target data, wherein the types of the first target data and the second target data are different;

the server processes the first target data by using a first processing model to obtain a first feature vector corresponding to the first target data;

the server respectively processes the second target data by using a second processing model to obtain second feature vectors corresponding to the second target data;

the server matches the first feature vector with the second feature vector, and determines push data in the second target data, wherein the push data is used for representing the second target data matched with the first target data;

the server outputs the push data to the client by calling a second interface, wherein the second interface comprises a second parameter, and a parameter value of the second parameter is the push data;

11. A computer-readable storage medium, comprising a stored program, wherein the program, when executed, controls an apparatus in which the computer-readable storage medium is located to perform the method of any one of claims 1 to 10.

12. A computer terminal, comprising:

a processor;

a memory coupled to the processor and configured to provide the processor with instructions for performing the following steps: obtaining training samples, wherein the training samples comprise: different types of first training data and second training data; processing the first training data by using a first processing model to obtain a first global feature and a first data block feature, wherein the first global feature is used for representing semantic features of the first training data, and the first data block feature is used for representing features of data blocks in the first training data; processing the second training data by using a second processing model to obtain a second global feature and a second data block feature; adjusting parameters of the first processing model and the second processing model based on the global similarity of the first global feature and the second global feature and the data block similarity of the first data block feature and the second data block feature, wherein the first processing model and the second processing model are machine learning models.