WO2018161880A1 - Media search keyword pushing method, device and data storage medium - Google Patents
- Publication number
- WO2018161880A1 (PCT/CN2018/078084)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- media
- user
- application
- keyword
- information
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9535—Search customisation based on user profiles and personalisation
Definitions
- the present application relates to the field of Internet technologies, and in particular, to a media search term pushing method and apparatus.
- The embodiments of the present invention provide a media search term pushing method and device that can recommend media search terms to a user based on the user's Internet behavior data, which can effectively improve the efficiency with which the user obtains information through a media application.
- an embodiment of the present invention provides a media search term pushing method, which is applied to a media search word pushing device, and the method includes:
- acquiring user behavior data of an associated user of the user using a second media application, where the user behavior data includes at least one piece of media information corresponding to the user behavior of the associated user using the second media application;
- an embodiment of the present invention further provides a media search term pushing device, where the device includes a processor and a memory, where the memory stores instructions executable by the processor, when the instruction is executed,
- the processor is used to:
- acquire user behavior data of an associated user of the user using a second media application, where the user behavior data includes at least one piece of media information corresponding to the user behavior of the associated user using the second media application;
- an embodiment of the present invention further provides a non-volatile computer storage medium in which a computer program for executing the above method is stored.
- FIG. 1 is a schematic structural diagram of an implementation scenario of a media search word pushing method according to an embodiment of the present invention
- FIG. 2 is a schematic flowchart of an implementation process of a media search word pushing method according to an embodiment of the present invention
- FIG. 3 is a schematic structural diagram of an implementation scenario of a media search term pushing method according to another embodiment of the present invention.
- FIG. 4 is a schematic flowchart of an implementation process of a media search word pushing method according to another embodiment of the present invention.
- FIG. 5 is a schematic flowchart of extracting media keywords in an embodiment of the present invention.
- FIG. 6 is a schematic structural diagram of a media search word pushing apparatus in an embodiment of the present invention.
- FIG. 7 is a schematic structural diagram of a keyword extraction module according to an embodiment of the present invention.
- FIG. 8 is a schematic structural diagram of a search word pushing module according to an embodiment of the present invention.
- FIG. 9 is a schematic structural diagram of a hardware component of a media search word pushing apparatus according to an embodiment of the present invention.
- Unless otherwise specified, the media search term pushing method in the embodiments of the present invention is implemented by a media search term pushing device, which may be an Internet client for obtaining media information from the Internet, for example, a network music application, a network news application, a network video application, or a browser application.
- The first media application and the second media application in the embodiments of the present invention may be Internet clients with different functions. For example, if the first media application is a network music application, the second media application may be a network news application, a network video application, or a browser application. If the first media application is a network video application, the second media application may be a network music application, a network news application, or a browser application, and so on.
- The first media application and the second media application in the embodiments of the present invention may be Internet applications with different functions used by the user on the same user terminal, or Internet applications with different functions used by the user on different user terminals. Different implementation scenarios are described below.
- FIG. 1 is a schematic structural diagram of an implementation scenario of a media search term pushing method in an embodiment of the present invention.
- In this scenario, the media search term pushing device can be implemented in the background server 1102 of the first media application.
- As shown in FIG. 2, the flow of the media search term pushing method in this embodiment may include the following steps:
- the media search word pushing device acquires user identification information of a current user of the first media application.
- After being started, the first media application in the user terminal 1101 may send the user identification information of the current user to the media search term pushing device on the background server. Either the first media application actively reports the information, or the media search term pushing device actively pulls it from the first media application. The user identification information may be a user login account, a bound mobile phone number, an email account, and the like.
- the media search term pushing device acquires, according to the user identification information, the user behavior data of the associated user of the user using the second media application from the background server 1103 of the second media application, where the user behavior data includes at least one piece of media information corresponding to the user behavior of the associated user using the second media application.
- The background server 1103 of the second media application may share the user behavior data of users using the second media application with the background server 1102 of the first media application, so that the media search term pushing device can acquire, according to the user identification information of the current user, the user behavior data of the associated user of the user using the second media application.
- Alternatively, the media search term pushing device requests the background server 1103 of the second media application to provide the user behavior data of the associated user of the user according to the user identification information of the current user.
- The media search term pushing device only needs to provide the user identification information of the current user, for example, the openID, and the background server 1103 of the second media application can return the user behavior data of the associated user of the current user to the media search term pushing device.
- The current user and the associated user of the current user mentioned in the embodiments of the present invention may be, respectively, the identity of the same physical user in the first media application and that user's identity in the second media application, each of which may be represented by a user account.
- The user accounts used by the current user and by the associated user may be the same or different, but the association between the two user identities needs to be established in advance in the background server. For example, Xiao Ming's user login account in the first media application is ABC2005, and his user login account in the second media application is BCD2005. Xiao Ming can request that an association with the user login account ABC2005 of the first media application be established when he creates the BCD2005 account in the second media application, or he can submit the request to associate the two user login accounts later while using the second media application.
- After receiving the request, the background server 1103 of the second media application sends an association confirmation inquiry message to the background server 1102 of the first media application, and establishes the association between the two user accounts after receiving the association confirmation message sent by the first media application logged in with the user login account ABC2005. The manner in which Xiao Ming requests the background server 1102 of the first media application to establish the association between the two user login accounts is the same.
- When the media search term pushing device requests the background server 1103 of the second media application to provide the user behavior data of the associated user of the user, authorization by the user account of the associated user in the second media application may be required.
- After the user, through the second media application, authorizes the first media application, the background server 1103 of the second media application sends an authorization token to the first media application. The media search term pushing device can send the token obtained from the first media application to the background server 1103 of the second media application, and the background server 1103 of the second media application returns the user behavior data of the associated user to the media search term pushing device according to the token.
- The authorization token can be given an expiration date, and the authorization process does not need to be repeated during the validity period.
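- The token validity rule described above can be illustrated with a minimal Python sketch. The names `AuthToken`, `TokenCache`, and the `authorize` callable are hypothetical and not part of the application; the sketch only shows that authorization is repeated once the token has expired, not how the token is actually issued.

```python
import time
from dataclasses import dataclass


@dataclass
class AuthToken:
    value: str
    expires_at: float  # Unix timestamp at which the token stops being valid


class TokenCache:
    """Keeps one token per user so authorization is not repeated while it is valid."""

    def __init__(self, authorize):
        # `authorize` is a hypothetical callable: user_id -> AuthToken
        self._authorize = authorize
        self._tokens = {}

    def get(self, user_id, now=None):
        now = time.time() if now is None else now
        token = self._tokens.get(user_id)
        # Re-run the authorization only when no token exists or it has expired.
        if token is None or token.expires_at <= now:
            token = self._authorize(user_id)
            self._tokens[user_id] = token
        return token
```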
- The user behavior data may include browsing, playing, collecting, sharing, downloading, or evaluation behaviors of the associated user using the second media application. Each behavior is directed at a certain piece of media information; that is, the user behavior data includes the media information corresponding to each user behavior.
- The media search term pushing device in the embodiments of the present invention can use the media information corresponding to the user behaviors in the second media application to analyze the user's preferences or types of interest, so as to recommend corresponding media search terms to the user in the first media application in a targeted manner.
- The user behavior data may include all historical user behavior records of the associated user using the second media application, or only the user behavior records of the associated user within a recent period of time (e.g., the last month or the last week).
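- As an illustration only, the behavior records and the "recent period" restriction could be modeled as follows. The `BehaviorRecord` fields, the behavior labels, and the 30-day default window are assumptions made for this sketch and are not specified by the application.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class BehaviorRecord:
    behavior: str        # assumed labels, e.g. "browse", "play", "collect", "share"
    media_text: str      # media information the behavior was directed at
    timestamp: datetime  # when the behavior happened


def recent_media(records, now, window=timedelta(days=30)):
    """Return the media information of behaviors that fall inside the recent window."""
    cutoff = now - window
    return [r.media_text for r in records if r.timestamp >= cutoff]
```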
- the media search term pushing device extracts at least one media keyword from the word segments included in the at least one piece of media information, according to the segmentation frequency statistics of those word segments.
- That is, the media search term pushing device extracts media keywords from the obtained media information corresponding to the user behavior in the second media application. This can be further divided into the following steps:
- The media search term pushing device performs text segmentation on the obtained media information. For example, a segmentation method such as full-mode word segmentation or search-mode segmentation may be used to obtain the word segments included in the plurality of pieces of media information.
- The media information content can be pre-processed, for example by garbled-character filtering, punctuation filtering, traditional-to-simplified Chinese conversion, word segmentation, stop word filtering, and the like.
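- A minimal sketch of such pre-processing is shown below. It assumes whitespace-separated text and a tiny illustrative stop word list; Chinese text would need a real word segmenter and a traditional-to-simplified conversion step, both of which are outside this sketch.

```python
import re

# Illustrative stop word list (an assumption, not taken from the application).
STOP_WORDS = {"the", "a", "of", "is", "and"}


def preprocess(text, stop_words=STOP_WORDS):
    """Filter garbled characters, punctuation and stop words, then return tokens."""
    text = text.replace("\ufffd", " ")    # drop Unicode replacement characters
    text = re.sub(r"[^\w\s]", " ", text)  # punctuation filtering
    tokens = text.lower().split()         # naive whitespace segmentation
    return [t for t in tokens if t not in stop_words]
```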
- The media search term pushing device may first perform correlation screening on the obtained media information. Specifically, according to a preset set of associated word segments of the first media application, it determines at least one piece of associated media information among the at least one piece of media information, where each piece of associated media information includes at least one associated word segment of the first media application. Media information that does not include any associated word segment is thereby excluded, which can effectively reduce the amount of subsequent analysis and calculation.
- The preset set of associated word segments of the first media application may be a vocabulary set of the domain in which the first media application operates. Taking a network music application as the first media application as an example, the preset set of associated word segments may include a set of song names, a set of artist names, a set of album names, a set of song genre names, and the like.
- The associated word segments of the first media application may be matched against only part of each piece of media information. For example, only the title and abstract of each piece of media information are checked for associated word segments of the first media application, and the other parts of the media information are not examined, which can greatly reduce the amount of information processed in the correlation screening.
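- The correlation screening can be sketched as a set-membership check. The dictionary layout of a media item (`title`, `abstract`, `body` token lists) is an assumption made for illustration; only the title and abstract are inspected, as described above.

```python
def is_associated(media, associated_terms):
    """True if the title or abstract contains at least one associated word segment."""
    tokens = set(media.get("title", [])) | set(media.get("abstract", []))
    return not tokens.isdisjoint(associated_terms)


def screen(media_items, associated_terms):
    """Keep only media information that mentions the first application's vocabulary."""
    return [m for m in media_items if is_associated(m, associated_terms)]
```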
- The segmentation frequency statistics of each word segment may include a term frequency, a document frequency, a document count, or an inverse document frequency, which respectively characterize how frequently, how many times, or how meaningfully the word segment appears in the obtained media information (for example, common function words appear often but should not be considered keywords).
- At least one media keyword may be extracted from the word segments included in the acquired media information by using the TF-IDF (Term Frequency-Inverse Document Frequency) algorithm or the TextRank ranking algorithm.
- The term frequency TF may be the number of occurrences of a given word segment in a piece of media information divided by the total number of word segments in that piece of media information, that is:
$$\mathrm{tf}_{i,j} = \frac{n_{i,j}}{\sum_k n_{k,j}}$$
- where $n_{i,j}$ is the number of occurrences of the word $t_i$ in document $d_j$, and the denominator is the total number of occurrences of all word segments in document $d_j$.
- The inverse document frequency IDF may be obtained by dividing the total number of pieces of media information by the number of pieces of media information that contain the word segment, and then taking the logarithm of the quotient, that is:
$$\mathrm{idf}_i = \log \frac{|D|}{|\{j : t_i \in d_j\}|}$$
- where $|D|$ is the total number of pieces of media information, and $|\{j : t_i \in d_j\}|$ is the number of pieces of media information containing the word $t_i$ (i.e., the number of documents with $n_{i,j} \neq 0$).
- TF-IDF is used to assess how important a word is to a document in a document set or corpus: $\mathrm{tfidf}_{i,j} = \mathrm{tf}_{i,j} \times \mathrm{idf}_i$. A high term frequency within a particular document, combined with a low document frequency of the word across the entire document set, produces a high TF-IDF weight. Therefore, filtering out words with low TF-IDF removes common words and retains important ones.
- A predetermined number (for example, 3, 5, or 10) of word segments having the highest TF-IDF among the word segments of each piece of media information may be determined as media keywords.
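- The TF-IDF weighting and top-N selection described above can be sketched in Python as follows. This is a plain, unsmoothed version written for illustration; the function names and the alphabetical tie-breaking rule are assumptions, and a word that appears in every document receives a score of zero.

```python
import math
from collections import Counter


def tf_idf(documents):
    """documents: list of token lists. Returns one {term: score} dict per document."""
    n_docs = len(documents)
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))  # count each term once per document
    scores = []
    for tokens in documents:
        counts = Counter(tokens)
        total = len(tokens)
        scores.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


def top_keywords(score_map, n=3):
    """Return the n highest-scoring terms; ties are broken alphabetically."""
    ranked = sorted(score_map.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:n]]
```

- With three short documents that all contain the word "new", that word scores zero in each of them and never ranks above a word that is missing from at least one document.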
- the importance of the word segmentation appearing in a certain media information can be sorted by the TextRank algorithm, and the most important preset number of word segments can be determined as the media keyword.
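- For comparison, a simplified TextRank-style ranking is sketched below. It builds an undirected co-occurrence graph over a sliding window and runs a fixed number of PageRank-like updates. The window size, damping factor, and iteration count are assumed defaults, and the sketch omits the part-of-speech filtering that full TextRank implementations usually apply.

```python
from collections import defaultdict


def textrank(tokens, window=2, damping=0.85, iterations=30):
    """Rank the word segments of one piece of media information by graph centrality."""
    neighbors = defaultdict(set)
    for i, word in enumerate(tokens):
        # Link each word to the words that follow it within the window.
        for other in tokens[i + 1 : i + 1 + window]:
            if other != word:
                neighbors[word].add(other)
                neighbors[other].add(word)
    rank = {w: 1.0 for w in neighbors}
    for _ in range(iterations):
        # Synchronous update: every new score is computed from the previous ranks.
        rank = {
            w: (1 - damping)
            + damping * sum(rank[v] / len(neighbors[v]) for v in neighbors[w])
            for w in neighbors
        }
    return rank
```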
- Alternatively, the word segments with the highest weight value or ranking, obtained by using the TF-IDF algorithm or the TextRank ranking algorithm, are first extracted as weight keywords.
- The media search term pushing device may further perform correlation screening on the obtained weight keywords. Specifically, according to the preset set of associated word segments of the first media application, it determines at least one media keyword among the at least one weight keyword, where each media keyword is an associated word segment in that set. Weight keywords that are not associated word segments are thereby excluded as unrelated, which further narrows the result to search terms the user is likely to use in the first media application.
- the media search word pushing device pushes the media search term to the first media application according to the at least one media keyword.
- The media search term pushing device sends all or part of the determined media keywords to the first media application as media search terms, and the first media application displays them in the search bar so that the user can enter a search term quickly. Because these media search terms are media keywords that the user pays attention to in another media application, they are also likely to be used by the user as search terms in the first media application, which can effectively improve the efficiency with which the user obtains information through the media application.
- The media search term pushing device may also acquire search behavior statistics of a plurality of users for the at least one media keyword in the first media application, and then determine the media search term among the at least one media keyword according to both the segmentation frequency statistics of the at least one media keyword in the at least one piece of media information and the search behavior statistics of the at least one media keyword in the first media application, so as to push the determined media search term to the first media application.
- The segmentation frequency statistics of a media keyword in the at least one piece of media information indicate the user's degree of interest in that keyword, and the search behavior statistics of the keyword in the first media application indicate its search popularity there. A recommended score for each media keyword can be calculated by combining the two, and the media keywords with the highest recommended scores are pushed to the first media application as media search terms.
- In the recommended score, the weight score is, for example, the TF-IDF value; qv(i) is the number of times the i-th media keyword was searched in the first media application over a period of time; and qv_max is the maximum of all qv values, used for normalization so that the recommended score does not become too large.
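- The exact formula for the recommended score is not reproduced in this text, so the sketch below assumes one plausible combination: the weight score multiplied by the normalized search count qv(i) / qv_max. The function name `recommend` and the multiplicative form are assumptions for illustration only.

```python
def recommend(weights, search_counts, top_n=3):
    """Rank media keywords by an ASSUMED score: weight(i) * qv(i) / qv_max.

    weights:       {keyword: weight score, e.g. a TF-IDF value}
    search_counts: {keyword: qv(i), searches in the first media application}
    """
    qv_max = max(search_counts.values(), default=0)
    scores = {}
    for keyword, weight in weights.items():
        qv = search_counts.get(keyword, 0)
        # Normalizing by qv_max keeps the popularity factor within [0, 1].
        scores[keyword] = weight * (qv / qv_max) if qv_max else 0.0
    return sorted(scores, key=lambda k: (-scores[k], k))[:top_n]
```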
- The media search term pushing device in the embodiments of the present invention analyzes the user behavior data of the associated user in the second media application, extracts media search terms from the media information corresponding to the user behavior, and sends them to the first media application. Because these media search terms are media keywords that the user pays attention to in another media application, they are also likely to be used by the user as search terms in the first media application, which effectively improves the efficiency with which the user obtains information through the media application.
- FIG. 3 is a schematic structural diagram of an implementation scenario of a media search term pushing method in another embodiment of the present invention.
- The media search term pushing device 1201, the first media application 1202, and the second media application 1203 all run on the same user terminal.
- As shown in FIG. 4, the flow of the media search term pushing method in this embodiment may include the following steps:
- the media search word pushing device acquires user identification information of a current user of the first media application.
- The media search term pushing device in this embodiment may acquire, through the second media application in the same user terminal, the user behavior data of the associated user of the user using the second media application.
- The user behavior data of the associated user using the second media application may be saved in a locally specified directory of the second media application, or may be recorded in the background server of the second media application, obtained by the second media application from the background server, and then submitted to the media search term pushing device.
- The current user mentioned in the embodiments of the present invention and the associated user of the current user may be, respectively, the identity of the same physical user in the first media application and that user's identity in the second media application, each of which may be represented by a user account. The user accounts used by the current user and by the associated user may be the same or different.
- The association between the two user identities may be established in advance in the background server of either media application. For example, Xiao Ming's user login account in the first media application is ABC2005, and his user login account in the second media application is BCD2005. Xiao Ming can request that an association with the user login account ABC2005 of the first media application be established when he creates the BCD2005 account in the second media application, or he can submit the request to associate the two user login accounts later while using the second media application.
- After receiving the request, the background server of the second media application sends an association confirmation inquiry message to the background server of the first media application, and establishes the association between the two user accounts after receiving the association confirmation message sent by the first media application logged in with the user login account ABC2005. Xiao Ming can request the background server of the first media application to establish the association between the two user login accounts in the same manner.
- Alternatively, one of the first media application and the second media application may trigger the start of the other, or both may be started by the same third-party application.
- If the user triggers the start of the second media application while using the first media application, or triggers the start of the first media application while using the second media application, the current user account of the first media application is evidently associated with the current user account of the second media application.
- If the user triggers the start of both the first media application and the second media application while using a third application (for example, an instant messaging application or an SNS application), the current user account of the first media application and the current user account of the second media application are both associated with the user account of the third application, so the current user account of the first media application is evidently also associated with the current user account of the second media application.
- The media search term pushing device may send the user identification information of the current user of the first media application to the second media application. The second media application looks up the associated user corresponding to the user identification information and sends the user behavior data of that associated user to the media search term pushing device.
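- The lookup performed by the second media application can be sketched as two dictionary lookups. The class `SecondMediaApp` and its in-memory mappings are hypothetical stand-ins for whatever storage the second media application actually uses; the account names reuse the ABC2005 and BCD2005 example from the text.

```python
class SecondMediaApp:
    """Hypothetical stand-in for the second media application."""

    def __init__(self, associations, behavior_by_user):
        self._associations = associations          # first-app account -> second-app account
        self._behavior_by_user = behavior_by_user  # second-app account -> media information

    def behavior_for(self, first_app_user_id):
        """Return the associated user's behavior data, or an empty list if none."""
        associated = self._associations.get(first_app_user_id)
        if associated is None:
            return []  # no association was established in advance
        return self._behavior_by_user.get(associated, [])
```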
- Alternatively, the media search term pushing device may obtain the information of the associated user from the first media application according to the user identification information of the current user of the first media application, and then request the second media application to provide the user behavior data of the associated user.
- If the media search term pushing device does not run on the same user terminal as the first media application and the second media application (for example, the first media application and the second media application run on the same user terminal while the media search term pushing device is implemented in the background server of the first media application), the media search term pushing device can also request, through inter-process communication between the first media application and the second media application, that the second media application provide the user behavior data of the associated user of the current user using the second media application.
- S403 in this embodiment may further include:
- S4031 Determine, according to the preset set of associated word segments of the first media application, at least one piece of associated media information among the at least one piece of media information, where each piece of associated media information includes at least one associated word segment of the first media application.
- The preset set of associated word segments of the first media application may be a vocabulary set of the domain in which the first media application operates. Taking a network music application as the first media application as an example, the preset set of associated word segments may include a set of song names, a set of artist names, a set of album names, a set of song genre names, and the like.
- S4032 Extract at least one weight keyword from the participle included in the at least one associated media information according to the segmentation frequency statistics of the participle included in the at least one associated media information.
- For how the weight keywords are extracted, refer to S104 in the foregoing embodiment; details are not repeated in this embodiment.
- S4033 Determine, according to the preset set of associated word segments of the first media application, at least one media keyword among the at least one weight keyword, where each media keyword is an associated word segment in the set of associated word segments of the first media application.
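- Steps S4031 to S4033 can be chained as in the sketch below. For brevity it uses raw occurrence counts as a stand-in for the TF-IDF or TextRank weights, which is a simplifying assumption; the function names simply mirror the step numbers and are not from the application.

```python
from collections import Counter


def s4031_screen(media_items, associated_terms):
    """Keep media information containing at least one associated word segment."""
    return [tokens for tokens in media_items if associated_terms & set(tokens)]


def s4032_weight_keywords(media_items, n=3):
    """Pick the n most frequent word segments (raw counts stand in for TF-IDF)."""
    counts = Counter(t for tokens in media_items for t in tokens)
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:n]]


def s4033_media_keywords(weight_keywords, associated_terms):
    """Keep only weight keywords that are associated word segments."""
    return [k for k in weight_keywords if k in associated_terms]
```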
- The media search term pushing device may obtain, from the background server of the first media application, search behavior statistics of a plurality of users for the at least one media keyword in the first media application over a period of time.
- The media search term is then determined among the at least one media keyword according to both the segmentation frequency statistics of the at least one media keyword in the at least one piece of media information and the search behavior statistics of the at least one media keyword in the first media application.
- The segmentation frequency statistics of a media keyword indicate the user's degree of interest in that keyword, and the search behavior statistics of the keyword in the first media application indicate its search popularity there. A recommended score for each media keyword can be calculated by combining the two, and the media keywords with the highest recommended scores are pushed to the first media application as media search terms.
- In the recommended score, the weight score is, for example, the TF-IDF value; qv(i) is the number of times the i-th media keyword was searched in the first media application over a period of time; and qv_max is the maximum of all qv values, used for normalization so that the recommended score does not become too large.
- The media search term pushing device sends the media search terms to the first media application, and the first media application displays them in the search bar so that the user can quickly enter a search term.
- These media search terms are media keywords that the user paid more attention to in another media application, and are therefore more likely to be used by the user as search terms in the first media application, which effectively improves the efficiency with which the user obtains information through the media application.
- The media search term pushing method of the present application can be extended to further implementation scenarios and architectures.
- For example, the first media application and the second media application may run on different user terminals, and the first media application or the media search term pushing device sends a request to the second media application for the user behavior data of the associated user using the second media application, in order to determine the media search term.
- Embodiments obtained by such extension without creative effort also fall within the technical solution claimed in the present application.
- FIG. 6 is a schematic structural diagram of a media search term pushing device according to an embodiment of the present invention.
- The media search term pushing device in the embodiment of the present invention may be implemented in the same user terminal as the first media application, or may be implemented separately from it, for example on the background server of the first media application.
- the media search term pushing device in the embodiment of the present invention may include at least:
- the user identifier obtaining module 610 is configured to obtain user identifier information of a current user of the first media application.
- the user identification information may be a user login account or a bound mobile phone number, an email account, and the like.
- If the media search term pushing device is implemented on the background server of the first media application, the first media application in the user terminal may send the current user's user identification information to the media search term pushing device after being started; the information may be actively reported by the first media application, or actively pulled from the first media application by the user identifier obtaining module 610.
- The behavior data obtaining module 620 is configured to acquire, according to the user identification information, user behavior data of an associated user of the user using a second media application, where the user behavior data includes at least one piece of media information corresponding to user behavior of the associated user using the second media application.
- The background server of the second media application may share the user behavior data of users of the second media application with the background server of the first media application, so that the media search term pushing device can obtain the user behavior data of the associated user using the second media application according to the user identification information of the current user.
- Alternatively, the media search term pushing device requests the background server of the second media application to provide the user behavior data of the associated user according to the user identification information of the current user. For example, the data may be obtained from the background server of the second media application through a third-party program interface provided by that server, or through a cooperation platform established by the two parties, such as an instant messaging service open platform or an SNS open platform.
- In this implementation, the media search term pushing device only needs to provide the user identification information of the current user, for example an openID, and the background server of the second media application can return the user behavior data of the associated user of the current user to the media search term pushing device.
- The media search term pushing device may request the user behavior data of the associated user directly from the second media application, or may have the first media application send an inter-process request to the second media application to obtain the user behavior data of the associated user.
- The current user mentioned in the embodiments of the present invention and the associated user of the current user may be the user identities of the same physical user in the first media application and in the second media application, respectively, each of which may be represented by a user account.
- The user accounts used by the current user and by the associated user of the current user may be the same or different, but the association between the two user identities needs to be established in the background server in advance.
- For example, Xiao Ming's user login account for the first media application is ABC2005, and his user login account for the second media application is BCD2005.
- Xiao Ming may request an association with the user login account ABC2005 of the first media application when creating the BCD2005 account in the second media application, or may later, while using the second media application, submit a request to associate the two user login accounts. After receiving the request, the background server of the second media application sends an association confirmation inquiry message to the background server of the first media application, and the association between the two user accounts is established after an association confirmation message is received from the first media application in which ABC2005 is logged in. The case in which Xiao Ming requests the background server of the first media application to establish the association between the two user login accounts works in the same way and is not described in detail in the embodiments of the present invention.
- When the media search term pushing device requests the background server of the second media application to provide the user behavior data of the associated user of the user, authorization by the user account of the associated user in the second media application may be required.
- After the authorization passes, the background server of the second media application sends an authorization token to the first media application. The media search term pushing device can send the token obtained by the first media application to the background server of the second media application, and the background server of the second media application returns the user behavior data of the associated user to the media search term pushing device according to the token.
- The authorization token may be given a validity period, and the authorization process does not need to be repeated during the validity period.
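- The token exchange described above can be sketched as follows. This is an illustrative in-memory model only: the class `SecondAppServer`, its method names, and the one-hour default validity period are hypothetical and do not correspond to any interface named in the application.

```python
import secrets
import time
from typing import Dict, List, Optional, Tuple


class SecondAppServer:
    """Hypothetical stand-in for the background server of the second media application."""

    def __init__(self, behavior_data: Dict[str, List[str]], ttl_seconds: float = 3600.0):
        self._behavior_data = behavior_data               # account -> media information
        self._ttl = ttl_seconds                           # validity period of a token
        self._tokens: Dict[str, Tuple[str, float]] = {}   # token -> (account, expiry)

    def authorize(self, account: str, now: Optional[float] = None) -> str:
        """Called once the associated user has approved access; issues a token."""
        now = time.time() if now is None else now
        token = secrets.token_hex(16)
        self._tokens[token] = (account, now + self._ttl)
        return token

    def fetch_behavior(self, token: str, now: Optional[float] = None) -> List[str]:
        """Return the behavior data for the account bound to a still-valid token."""
        now = time.time() if now is None else now
        entry = self._tokens.get(token)
        if entry is None or now >= entry[1]:
            raise PermissionError("token missing or expired; authorization must be repeated")
        return self._behavior_data.get(entry[0], [])
```

Within the validity period the pushing device can call `fetch_behavior` repeatedly with the same token; once the period has passed, the call fails and the authorization step has to be run again.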
- The user behavior data may include browsing, playing, collecting, sharing, downloading, or rating behavior of the associated user using the second media application. Each behavior is directed at a certain piece of media information, that is, the media information in the user behavior data.
- By obtaining the media information corresponding to the user behavior in the second media application, the media search term pushing device in the embodiment of the present invention can analyze the user's preferences or types of interest, so as to recommend corresponding media search terms to the user in the first media application in a targeted manner.
- The user behavior data may include all historical user behavior records of the associated user using the second media application, or only the user behavior records of the associated user within a recent period of time (for example, the last month or the last week).
- The keyword extraction module 630 is configured to extract at least one media keyword from the word segments included in the at least one piece of media information, according to the word segmentation frequency statistics of those word segments.
- The extraction of media keywords from the obtained media information corresponding to the user behavior in the second media application may be further divided into the following stages:
- The media search term pushing device performs text segmentation on the obtained media information; for example, a text segmentation method such as full-mode word segmentation or search-mode word segmentation may be used to obtain the word segments included in the plurality of pieces of media information.
- the media information content can be pre-processed, such as garbled filtering, punctuation filtering, Chinese character conversion, word segmentation, stop word filtering, and the like.
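- A minimal sketch of the preprocessing stage is shown below. It assumes space-delimited text, so a whitespace split stands in for a real word segmenter (Chinese text would need a dedicated segmenter), and it uses a tiny illustrative stop-word list. The function name `preprocess` and the `STOP_WORDS` contents are assumptions.

```python
import unicodedata
from typing import List

STOP_WORDS = {"the", "a", "of", "is", "and"}  # illustrative stop-word list (assumption)


def preprocess(text: str) -> List[str]:
    """Clean one piece of media information and split it into word segments."""
    cleaned = []
    for ch in text:
        category = unicodedata.category(ch)
        # Replace garbled (U+FFFD), control and punctuation characters by spaces.
        if ch == "\ufffd" or category.startswith("C") or category.startswith("P"):
            cleaned.append(" ")
        else:
            cleaned.append(ch)
    # Whitespace split is a placeholder for a real word segmentation step.
    tokens = "".join(cleaned).lower().split()
    # Stop-word filtering removes frequent words that carry little meaning.
    return [token for token in tokens if token not in STOP_WORDS]
```

Replacing filtered characters by spaces rather than deleting them avoids accidentally joining two neighboring words.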
- The word segmentation frequency statistics of each word segment may include a term frequency, a document frequency, a document count, or an inverse document frequency, which respectively characterize how frequently, how many times, or how meaningfully the word segment occurs in the obtained media information (for example, common function words occur very often but should not be regarded as keywords).
- At least one media keyword may be extracted from the word segments included in the acquired media information by using the TF-IDF (Term Frequency-Inverse Document Frequency) algorithm or the TextRank ranking algorithm.
- The term frequency TF may be the number of occurrences of a given word segment in a certain piece of media information divided by the total number of word segments in that piece of media information, that is: $\mathrm{tf}_{i,j} = \dfrac{n_{i,j}}{\sum_k n_{k,j}}$
- where $n_{i,j}$ is the number of occurrences of the word $t_i$ in document $d_j$,
- and the denominator $\sum_k n_{k,j}$ is the total number of occurrences of all word segments in document $d_j$.
- The inverse document frequency IDF may be obtained by dividing the total number of pieces of media information by the number of pieces of media information containing the word segment, and then taking the logarithm of the quotient, that is: $\mathrm{idf}_i = \log \dfrac{|D|}{|\{j : t_i \in d_j\}|}$
- where $|D|$ is the total number of pieces of media information,
- and $|\{j : t_i \in d_j\}|$ is the number of pieces of media information containing the word $t_i$ (that is, the number of pieces of media information with $n_{i,j} \neq 0$). The IDF is used to assess how important a word is to a document within a document set or corpus.
- $\mathrm{tfidf}_{i,j} = \mathrm{tf}_{i,j} \times \mathrm{idf}_i$. A high term frequency within a particular document, together with a low document frequency of the word across the whole document set, yields a high TF-IDF weight. Filtering out words with a low TF-IDF therefore removes common words and retains important ones.
- A predetermined number (for example, 3, 5, or 10) of word segments having the highest TF-IDF among the word segments of each piece of media information may be determined as media keywords.
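- As an illustration of the TF-IDF selection, the sketch below computes the term frequency within each segmented document, the inverse document frequency across all documents, and keeps the `top_k` word segments per document. The function name `tf_idf_keywords` and the alphabetical tie-breaking are assumptions made for this example.

```python
import math
from collections import Counter
from typing import Dict, List


def tf_idf_keywords(docs: List[List[str]], top_k: int = 3) -> List[List[str]]:
    """For each segmented document, return its top_k word segments by TF-IDF."""
    n_docs = len(docs)
    # Document frequency: number of documents containing each word segment.
    df = Counter(term for doc in docs for term in set(doc))
    result = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        scores: Dict[str, float] = {
            term: (n / total) * math.log(n_docs / df[term])
            for term, n in counts.items()
        }
        # Highest TF-IDF first; ties are broken alphabetically.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        result.append(ranked[:top_k])
    return result
```

A word segment that appears in every document gets an IDF of zero and therefore drops to the bottom of each ranking, which is the filtering effect described above.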
- the importance of the word segmentation appearing in a certain media information can be sorted by the TextRank algorithm, and the most important preset number of word segments can be determined as the media keyword.
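- The TextRank alternative can be sketched in a similar way. The version below builds an undirected co-occurrence graph over adjacent word segments and runs a PageRank-style iteration on it. The window size, damping factor, iteration count, and the function name `textrank_keywords` are assumptions chosen for illustration and are not specified by the application.

```python
from collections import defaultdict
from typing import Dict, List, Set


def textrank_keywords(tokens: List[str], top_k: int = 3, window: int = 2,
                      damping: float = 0.85, iterations: int = 50) -> List[str]:
    """Rank word segments by a PageRank-style score on a co-occurrence graph."""
    neighbors: Dict[str, Set[str]] = defaultdict(set)
    for i, word in enumerate(tokens):
        # Link each word to the words that follow it inside the window.
        for other in tokens[i + 1:i + window]:
            if other != word:
                neighbors[word].add(other)
                neighbors[other].add(word)
    for word in tokens:
        neighbors.setdefault(word, set())  # keep isolated words in the graph
    scores = {word: 1.0 for word in neighbors}
    for _ in range(iterations):
        # Each word receives an equal share of the score of every neighbor.
        scores = {
            word: (1 - damping) + damping * sum(
                scores[n] / len(neighbors[n]) for n in neighbors[word]
            )
            for word in neighbors
        }
    return sorted(scores, key=lambda w: (-scores[w], w))[:top_k]
```

Unlike TF-IDF, this ranking needs only the single piece of media information being analyzed, since importance comes from how a word is connected to other words rather than from corpus-wide counts.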
- As shown in FIG. 7, the keyword extraction module 630 may further include:
- The associated information filtering unit 631 is configured to determine, according to the preset associated word segment set of the first media application, at least one piece of associated media information among the at least one piece of media information, where the associated media information includes at least one associated word segment of the first media application.
- The associated information filtering unit 631 may first perform relevance screening on the obtained media information. Specifically, it may determine, according to the preset associated word segment set of the first media application, at least one piece of associated media information among the at least one piece of media information, where the associated media information includes at least one associated word segment of the first media application. Media information that contains no associated word segment is thereby excluded as unrelated, which effectively reduces the amount of subsequent analysis and calculation.
- The preset associated word segment set of the first media application may be a vocabulary set for the domain of the first media application. Taking an online music application as an example of the first media application, the preset associated word segment set may include a set of song names, a set of artist names, a set of album names, a set of song genre names, and the like.
- Matching against the associated word segments of the first media application may be performed on only part of each piece of media information. For example, only the title, abstract, and keyword sections of each piece of media information are checked for associated word segments of the first media application, and the other parts are not examined, which greatly reduces the amount of processing required for relevance screening.
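- The relevance screening of media information can be sketched as a set intersection restricted to selected fields. The representation of a piece of media information as a dictionary mapping field names to lists of word segments, the field names, and the function name `filter_associated_media` are hypothetical.

```python
from typing import Dict, Iterable, List, Set


def filter_associated_media(
    media_items: List[Dict[str, List[str]]],
    associated_words: Set[str],
    fields: Iterable[str] = ("title", "abstract"),
) -> List[Dict[str, List[str]]]:
    """Keep only items whose selected fields contain an associated word segment."""
    fields = tuple(fields)
    kept = []
    for item in media_items:
        # Only the selected fields are inspected; the body is never scanned.
        segments = {seg for field in fields for seg in item.get(field, [])}
        if segments & associated_words:
            kept.append(item)
    return kept
```

Restricting `fields` to short sections such as the title keeps the cost of screening roughly independent of the length of each article body.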
- The keyword extracting unit 632 is configured to extract at least one weight keyword from the word segments included in the at least one piece of media information, according to the word segmentation frequency statistics of those word segments.
- The associated word segment filtering unit 633 is configured to determine, according to the preset associated word segment set of the first media application, at least one media keyword among the at least one weight keyword, where the media keyword is an associated word segment in the associated word segment set of the first media application.
- After the weight keywords have been obtained by the TF-IDF algorithm or the TextRank ranking algorithm, the associated word segment filtering unit 633 may further perform relevance screening on them. Specifically, it may determine, according to the preset associated word segment set of the first media application, at least one media keyword among the at least one weight keyword, where the media keyword is an associated word segment in the associated word segment set of the first media application. Weight keywords that are not associated word segments are thereby excluded as unrelated, which further focuses the result on search terms that the user is likely to use in the first media application.
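- The second screening step reduces to a membership test against the associated word segment set. The short sketch below preserves the ranking order of the weight keywords; the function name `filter_media_keywords` is an assumption.

```python
from typing import List, Set


def filter_media_keywords(weight_keywords: List[str], associated_words: Set[str]) -> List[str]:
    """Keep the weight keywords that belong to the associated word segment set."""
    # The input order (for example, descending TF-IDF) is preserved.
    return [keyword for keyword in weight_keywords if keyword in associated_words]
```

For a music application whose associated set contains artist and genre names, a weight keyword such as a city name would be dropped even if it scored highly.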
- In other embodiments, only one of the associated information filtering unit 631 and the associated word segment filtering unit 633 may be present.
- the search word pushing module 640 is configured to push the media search term to the first media application according to the at least one media keyword.
- The search term pushing module 640 sends the media search terms to the first media application, and the first media application displays them in the search bar so that the user can quickly enter a search term.
- These media search terms are media keywords that the user paid more attention to in another media application, and are therefore more likely to be used by the user as search terms in the first media application, which effectively improves the efficiency with which the user obtains information through the media application.
- As shown in FIG. 8, the search term pushing module 640 may further include:
- the search data obtaining unit 641 is configured to acquire search behavior statistics data of the plurality of users using the at least one media keyword in the first media application.
- The search term determining unit 642 is configured to determine a media search term among the at least one media keyword according to the word segmentation frequency statistics of the at least one media keyword in the at least one piece of media information and the search behavior statistics of the at least one media keyword in the first media application.
- From the word segmentation frequency statistics of a media keyword in the media information, the user's degree of interest in that media keyword can be analyzed; from the search behavior statistics of the media keyword in the first media application, the search popularity of the media keyword in the first media application can be obtained. Combining the two aspects, a recommendation score can be calculated for each media keyword, and the several media keywords with the highest recommendation scores are pushed to the first media application as media search terms.
- Here the weight score is, for example, the TF-IDF value; qv(i) is the number of times the i-th media keyword was searched in the first media application during a period of time; and qv_max is the maximum of all qv values, used for normalization so that the recommendation score does not become too large.
- the search word pushing unit 643 is configured to push the determined media search word to the first media application.
- The above media search term pushing device may be an electronic device such as a PC, or a portable electronic device such as a PAD, a tablet computer, or a laptop computer, and is not limited to those described here; it may also be constituted by a server cluster, as an electronic device in which the unit functions are merged into one entity or are provided separately.
- The media search term pushing device includes at least a database for storing data and a processor for data processing, and may include a built-in storage medium or an independently provided storage medium.
- The processor for data processing may be implemented, when performing processing, as a microprocessor, a central processing unit (CPU), a digital signal processor (DSP), or a field-programmable gate array (FPGA).
- The storage medium contains operation instructions, which may be computer-executable code; the operation instructions implement the steps of the media search term pushing method of the embodiments of the present invention described above, such as those shown in FIG. 2 or FIGS. 4-5.
- the apparatus includes a processor 901, a storage medium 902, and at least one external communication interface 903; the processor 901, the storage medium 902, and the communication interface 903 are all connected by a bus 904.
- The processor 901 in the media search term pushing device can call the operation instructions in the storage medium 902 to execute the following process:
- acquiring user behavior data of an associated user of the user using a second media application, where the user behavior data includes at least one piece of media information corresponding to user behavior of the associated user using the second media application;
- the disclosed apparatus and method may be implemented in other manners.
- the device embodiments described above are merely illustrative.
- The division into units is only a division by logical function.
- In actual implementation there may be other divisions; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed.
- The coupling, direct coupling, or communication connection between the components shown or discussed may be indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical, mechanical, or of other forms.
- the units described above as separate components may or may not be physically separated, and the components displayed as the unit may or may not be physical units, that is, may be located in one place or distributed to multiple network units; Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
- The functional units in the embodiments of the present invention may all be integrated into one processing unit, or each unit may serve separately as one unit, or two or more units may be integrated into one unit.
- The integrated unit may be implemented in the form of hardware, or in the form of hardware plus software functional units.
- The foregoing program may be stored in a computer-readable storage medium, and when executed, the program performs the steps of the foregoing method embodiments.
- The foregoing storage medium includes any medium that can store program code, such as a mobile storage device, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
- Alternatively, the above integrated unit of the present application may be stored in a computer-readable storage medium if it is implemented in the form of a software functional module and sold or used as a stand-alone product.
- On this understanding, the technical solution of the embodiments of the present invention may in essence be embodied in the form of a software product, which is stored in a storage medium and includes a plurality of instructions
- that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the methods described in the embodiments of the present invention.
- The foregoing storage medium includes various media that can store program code, such as a mobile storage device, a ROM, a RAM, a magnetic disk, or an optical disk.
Abstract
The invention relates to a media search keyword pushing method comprising: acquiring user identifier information of a current user of a first media application (S401); acquiring, according to the user identifier information, user behavior data of an associated user of the user using a second media application, the user behavior data comprising at least one piece of media information corresponding to user behavior of the associated user using the second media application (S402); extracting at least one media keyword from the word segments in the at least one piece of media information, according to word segmentation frequency statistics of those word segments (S403); and pushing the determined media search keyword to the first media application (S406).
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710135931.6 | 2017-03-08 | ||
CN201710135931.6A CN108304422B (zh) | 2017-03-08 | 2017-03-08 | Media search term pushing method and apparatus |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2018161880A1 true WO2018161880A1 (fr) | 2018-09-13 |
Family
ID=62872018
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2018/078084 WO2018161880A1 (fr) | Media search keyword pushing method, device, and data storage medium | 2017-03-08 | 2018-03-06 |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN108304422B (fr) |
WO (1) | WO2018161880A1 (fr) |
- 2017-03-08: CN CN201710135931.6A, patent CN108304422B (zh), status: Active
- 2018-03-06: WO PCT/CN2018/078084, patent WO2018161880A1 (fr), status: Application Filing
Also Published As
Publication number | Publication date |
---|---|
CN108304422B (zh) | 2021-12-17 |
CN108304422A (zh) | 2018-07-20 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 18764798 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 18764798 Country of ref document: EP Kind code of ref document: A1 |