CN103974097B

CN103974097B - Personalized user-generated video prefetching method and system based on popularity and social networks

Info

Publication number: CN103974097B
Application number: CN201410219254.2A
Authority: CN
Inventors: 叶保留; 徐轩绚; 陆桑璐
Original assignee: ZHENJIANG Institute OF HIGH-NEW TECHNOLOGY NANJING UNIVERSITY
Current assignee: Nanjing University Zhenjiang Life And Health Industry Research Institute
Priority date: 2014-05-22
Filing date: 2014-05-22
Publication date: 2017-03-01
Anticipated expiration: 2034-05-22
Also published as: CN103974097A

Abstract

The invention discloses a personalized user-generated video prefetching method and system based on popularity and social networks. The method first collects video popularity information, the social relationships between users, the association relationships between videos, and users' historical behavior information. Based on the popularity information, user-generated videos are divided into popular videos and long-tail videos. A popular-video prefetch list is generated by ranking videos by user preference; a graph model is built to measure the relevance between users and videos and to generate a long-tail video prefetch list for each user. The two lists are combined by weighted linear fusion to build a personalized hybrid prefetch model. Provided that the video the user is currently watching plays smoothly, the one or more top-ranked videos are prefetched for the user based on the personalized hybrid prefetch model. The system improves the hit rate and accuracy of user-generated video prefetching, the quality of user-generated video services, and the user's viewing experience.

Description

Personalized user-generated video prefetching method and system based on popularity and social networks

Technical field

The invention relates to a personalized user-generated video prefetching method and system based on popularity and social networks, and belongs to the technical field of video prefetching.

Background

With the arrival of the Web 2.0 era, users are no longer merely recipients of information but have also become its publishers. Video sites built on user-generated content, represented by Youku and Tudou, are highly popular with users. However, multiple studies show that the viewing experience is unsatisfactory: playback delay accounts for a relatively high proportion of video duration. Existing ways to reduce playback delay include improving server hardware and software performance, increasing network bandwidth, using network proxies, and using content delivery networks, but each has its drawbacks. By comparison, video prefetching can substantially reduce the delay perceived by the user at a small overhead.

Video services based on user-generated content differ from traditional video services in both user behavior and video content: the number of videos is very large, videos are short, each video carries little content data, new content is produced extremely quickly, the popularity distribution is different, and a social network is present. Traditional video prefetching algorithms are therefore not well suited to user-generated video services.

The popularity distribution of user-generated videos is extremely unbalanced. Some videos are highly popular; when choosing to watch them, users usually consider both a video's popularity and whether it matches their own interests. Other videos are not very popular but still reach interested users through various propagation channels, chiefly the social relationships between users and the association relationships between videos.

It is therefore necessary to provide a video prefetching method and system suited to user-generated videos that considers videos of different popularity together and makes full use of users' social relationships, video association relationships, video popularity information, and users' historical behavior data, so as to effectively improve the hit rate and accuracy of video prefetching, improve the quality of user-generated video services, and improve the user's viewing experience.

Summary of the invention

The technical problem to be solved by the invention is to provide a personalized user-generated video prefetching method and system based on popularity and social networks. The method and system improve the hit rate and accuracy of video prefetching, improve the quality of user-generated video services, and improve the user's viewing experience.

To achieve the above object, according to one aspect of the invention, a user-generated video prefetching method based on popularity and social networks is provided, comprising the following steps:

1) Information acquisition: acquire users' historical behavior information on videos, the social relationships between users, the association relationships between videos, and the popularity information of the user-generated videos;

2) Video classification: based on the popularity information of the user-generated videos, divide the videos into two classes, popular videos and long-tail videos;

3) Popular-video prefetch list generation: based on the popularity information of the user-generated videos and users' historical behavior information, calculate each user's preference for the popular videos, sort by preference in descending order, and generate a popular-video prefetch list for the user;

4) Graph model construction: build a graph model from the acquired historical behavior information, the social relationships between users, and the association relationships between videos, in which every node or edge has a node type or relationship type consistent with its actual physical type, and every edge has a weight consistent with the actual strength of the relationship;

5) Long-tail video prefetch list generation: on the graph model, measure the relevance between users and videos with a relevance measure based on shortest-path weights, sort by relevance in descending order, and generate a long-tail video prefetch list for the user;

6) Hybrid prefetch model generation: based on the acquired historical behavior information of the user, fuse the popular-video prefetch list and the long-tail video prefetch list by weighted linear fusion to build a personalized hybrid prefetch model;

7) Prefetch video list generation: provided that the video the user is currently watching plays smoothly, prefetch the one or more top-ranked recommended videos for the user based on the personalized hybrid prefetch model.

In the above method, the information acquisition step comprises: if social relationships between users or association relationships between videos exist in the application system, acquiring them; if not, nothing needs to be acquired. Users' historical behavior information on videos comprises viewing, sharing, uploading, and favoriting videos.

In the above method, a preprocessing step on users' historical behavior data precedes the popular-video prefetch list generation step. Specifically, only information expressing a user's positive attitude toward a video is retained; in a system without ratings, this comprises the user's viewing, sharing, favoriting, and uploading records.

In the above method, the video classification step is as follows. First, video popularity is measured by view count, and videos are sorted by popularity from high to low. Second, the videos that rank in the top 10% by popularity and whose cumulative views account for 80% of total views are the popular videos; if the cumulative views of the top 10% fall short of 80% of total views, the popular set is extended down the popularity ranking until the cumulative views reach 80% of total views. Finally, all videos other than the popular videos are long-tail videos.

In the above method, the preference in the popular-video prefetch list generation step is calculated as follows. First, video popularity is calculated from the view count. Second, the user's interest factor for the video is calculated from the user's historical behavior information. Finally, popularity and interest factor are combined to give the user's preference for the video.

In the above method, the graph model construction step is as follows. First, user entities and video entities are mapped to the corresponding nodes in the graph, with the entity type used as the node type. Second, if a social relationship exists between two users, an edge is added between the corresponding nodes, and its weight represents the strength of the social relationship; if an association relationship exists between two videos, an edge is added between the corresponding nodes, and its weight represents the strength of the association. Finally, if a user has positive behavior information on a video, an edge is added between the user and the video, and its weight represents the strength of the positive behavior.

In the above method, the relevance in the long-tail video prefetch list generation step is calculated as follows. First, the shortest paths between the two nodes are computed; if no path exists between them, the relevance is 0. Second, the weight of each individual shortest path is computed. Finally, the sum of the weights of the shortest paths is taken as the relevance between the nodes.

In the above method, the hybrid prefetch model generation step is as follows: using the user's historical behavior information, calculate the proportions of the user's behaviors on popular videos and on long-tail videos in the user's total number of behaviors, take the results as the weights of the popular-video prefetch list and the long-tail video prefetch list respectively, and fuse the two lists by weighted linear fusion.

In addition, the invention provides a personalized user-generated video prefetching system based on popularity and social attributes, comprising: an information acquisition module, for acquiring users' historical behavior information on videos, the social relationships between users, the association relationships between videos, and the popularity information of the videos themselves; a preprocessing module, for retaining positive information; a video classification module, for dividing videos into popular videos and long-tail videos based on the popularity information of user-generated videos; a popular-video prefetch list generation module, for calculating users' preference for popular videos based on video popularity and users' historical behavior information and generating the popular-video prefetch list; a graph model construction module, for building a graph model with node types and edge weights from users' historical behavior information on videos, the social relationships between users, and the association relationships between videos; a long-tail video prefetch list generation module, for calculating the relevance between users and long-tail videos on the graph model using the relevance measure based on shortest-path weights and generating the long-tail video prefetch list; a hybrid prefetch model generation module, for calculating the popular-video prefetch weight and the long-tail video prefetch weight from users' historical behavior information and fusing the two prefetch lists by weighted linear fusion to generate the personalized hybrid prefetch model; and a video prefetching module, for prefetching the one or more top-ranked videos for the user based on the personalized hybrid model, provided that the video the user is currently watching plays smoothly.

The invention proposes a personalized user-generated video prefetching method and system based on popularity and social networks. For the first time, videos are divided into popular videos and long-tail videos according to the popularity distribution of user-generated videos. Taking into account the different ways users come to watch these two classes of video, the invention makes full use of users' social relationships, video association relationships, video popularity information, and users' historical behavior data to mine users' preference for videos they have not yet watched, generates a prefetch list for each of the two classes, and fuses the two lists by weighted linear fusion in a hybrid prefetch model. It thereby effectively improves the hit rate and accuracy of video prefetching, improves the quality of user-generated video services, and improves the user's viewing experience.

Description of drawings

Fig. 1 is a flow chart of the steps of the personalized user-generated video prefetching method based on popularity and social networks according to the invention;

Fig. 2 is a schematic diagram of the scenario of an embodiment of the invention;

Fig. 3 is a schematic diagram of the graph model of an embodiment of the invention;

Fig. 4 is a schematic diagram of video prefetching in an embodiment of the invention;

Fig. 5 is a schematic structural diagram of the personalized user-generated video prefetching system based on popularity and social networks according to the invention.

Detailed description

The invention provides a personalized user-generated video prefetching method and system based on popularity and social networks, described in detail below with reference to the drawings.

Referring to Fig. 1, which is a flow chart of the steps of the personalized user-generated video prefetching method based on popularity and social attributes according to the invention, the method comprises the following steps:

Information acquisition step 101: acquire users' historical behavior information on videos, the social relationships between users, the association relationships between videos, and the popularity information of the videos themselves;

Video classification step 102: based on the popularity information of the user-generated videos, divide the user-generated videos into two classes, popular videos and long-tail videos;

Popular-video prefetch list generation step 103: based on video popularity and users' historical behavior information on videos, calculate the user's preference for the popular videos, sort by preference in descending order, and generate a popular-video prefetch list for the user;

Graph model construction step 104: build a graph model from users' historical behavior information on videos, the social relationships between users, and the association relationships between videos, in which every node or edge has a node type or relationship type consistent with its actual physical type, and every edge has a weight consistent with the actual strength of the relationship;

Long-tail video prefetch list generation step 105: on the graph model, measure the relevance between users and videos with a relevance measure based on shortest-path weights, sort by relevance in descending order, and generate a long-tail video prefetch list for the user;

Hybrid prefetch model generation step 106: based on the acquired historical behavior information of the user, fuse the popular-video prefetch list and the long-tail video prefetch list by weighted linear fusion to build a personalized hybrid prefetch model;

Video prefetching step 107: provided that the video the user is currently watching plays smoothly, prefetch the one or more top-ranked recommended videos for the user based on the personalized hybrid prefetch model.

In practice, a preferred approach is to add a preprocessing step before the popular-video prefetch list generation step. This step processes users' historical behavior information and retains only information expressing a positive attitude toward a video; in a system without ratings, this comprises users' viewing, sharing, favoriting, and uploading records.

A specific embodiment of the method is described in detail below using the example of a user-generated video service platform. Referring to Fig. 2, which depicts the scenario of this embodiment: on the platform, each video has its own content information; users can establish social relationships with one another; users can watch, favorite, upload, and share videos; and association relationships between videos exist, added by users when they upload videos.

From users' behavior on videos, a user-video relationship matrix R can be built. If user i has watched, favorited, uploaded, or shared video j, R_ij is 1; otherwise R_ij is 0. R is an m × n matrix, where m is the number of users in the system and n is the number of videos in the system.

Videos are classified according to the popularity distribution. First, video popularity is measured by view count, as in formula (1), where P_k is the popularity of video k and M_k is the number of times video k has been viewed. Second, videos are sorted by popularity from high to low; the videos that rank in the top 10% by popularity and whose cumulative views account for 80% of total views are the popular videos. If the cumulative views of the top 10% fall short of 80% of total views, the popular set is extended down the popularity ranking until the cumulative views reach 80% of total views. Finally, all videos other than the popular videos are long-tail videos. The resulting two sets are referred to as the popular video set and the long-tail video set.

P_k = log(M_k + 1)    (1)
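The classification rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `classify_videos`, its parameters, and the choice to round the top 10% up to at least one video are hypothetical and do not come from the patent text.

```python
import math

def classify_videos(view_counts, top_fraction=0.10, view_share=0.80):
    """Split videos into (popular, long_tail) lists of video ids.

    view_counts maps video id -> number of views M_k.
    """
    # Formula (1): P_k = log(M_k + 1).
    popularity = {k: math.log(m + 1) for k, m in view_counts.items()}
    ranked = sorted(popularity, key=popularity.get, reverse=True)

    total_views = sum(view_counts.values())
    # Assumption: the top 10% is rounded up and contains at least one video.
    cutoff = max(1, math.ceil(top_fraction * len(ranked))) if ranked else 0
    covered = sum(view_counts[k] for k in ranked[:cutoff])

    # Extend down the ranking until the popular set covers 80% of all views.
    while cutoff < len(ranked) and covered < view_share * total_views:
        covered += view_counts[ranked[cutoff]]
        cutoff += 1

    return ranked[:cutoff], ranked[cutoff:]
```

Because the logarithm is monotonic, ranking by P_k gives the same order as ranking by raw view count; the log only matters later, when popularity is multiplied by the interest factor.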

The method for calculating a user's preference for a popular video, used when generating the popular-video prefetch list, is as follows. First, the popularity P_k of video k is calculated with formula (1). Second, the interest factor I_i(k) of user i in video k is calculated with formula (2), which is defined over the set of videos on which user i has shown positive behavior and the set of the N videos most similar to video k; Sim(k,j) denotes the similarity between video k and video j.

The similarity is computed as a cosine similarity, as in formula (3), where N(k) denotes the set of users who like video k, and is then normalized by its maximum value, as in formula (4).

Finally, the preference S_i(k) of user i for popular video k is calculated with formula (5).

S_i(k) = P_k × I_i(k)    (5)
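Formulas (2) to (4) are not reproduced in this text, so the following Python sketch fills them in with the usual item-based collaborative filtering forms. These are assumptions rather than the patent's own definitions: similarity is taken as |N(k) ∩ N(j)| / sqrt(|N(k)| · |N(j)|), each video's similarities are divided by that video's largest similarity, and the interest factor is the sum of similarities over the liked videos among the N nearest neighbors. The function name and parameters are hypothetical.

```python
import math
from collections import defaultdict

def popular_preferences(user, liked, view_counts, popular, top_n=2):
    """Rank popular videos the user has not acted on by S_i(k), descending.

    liked maps user id -> set of video ids with positive behavior.
    """
    # N(k): users with positive behavior on video k.
    fans = defaultdict(set)
    for u, videos in liked.items():
        for v in videos:
            fans[v].add(u)
    mine = liked.get(user, set())

    def sim_row(k):
        fans_k = fans.get(k, set())
        row = {}
        for j, fans_j in fans.items():
            common = len(fans_k & fans_j)
            if j != k and common:
                # Assumed formula (3): cosine similarity on user sets.
                row[j] = common / math.sqrt(len(fans_k) * len(fans_j))
        # Assumed formula (4): divide by the largest similarity of video k.
        top = max(row.values(), default=0.0)
        return {j: s / top for j, s in row.items()} if top else {}

    scores = {}
    for k in popular:
        if k in mine:
            continue
        row = sim_row(k)
        nearest = sorted(row, key=row.get, reverse=True)[:top_n]
        # Assumed formula (2): sum over liked videos among the N nearest.
        interest = sum(row[j] for j in nearest if j in mine)
        popularity = math.log(view_counts[k] + 1)  # formula (1)
        scores[k] = popularity * interest          # formula (5)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

A popular video with no liked neighbor gets an interest factor of 0 and therefore a preference of 0 regardless of its popularity, which is a direct consequence of the multiplicative form of formula (5).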

Next, a graph model is built from the user-video relationship matrix R, the social relationships between users, and the association relationships between videos. The graph model is an undirected graph with weighted edges, G(U, V, E, R, W; Φ), where: U is the user set, containing all user nodes; V is the video set, containing all video nodes; E is the edge set, containing all relationships between nodes in the social network and all positive user behaviors; R is the set of relationship types, containing all relationship types (in this definition R denotes the type set rather than the relationship matrix above); Φ: E → R is the mapping function of relationship types; and W is the set of edge weights in G, each measuring the strength of a relationship. Every edge e in E belongs to one relationship type, Φ(e) ∈ R; e(u,v) ∈ E with u, v ∈ U ∪ V means that a relationship of some type exists between node u and node v; and ω(e) is the weight of edge e. Here |R| > 1; when |R| = 1, G degenerates into a user-item bipartite graph.

Following this definition, the graph is constructed as follows:

1) Map user entities and video entities to the corresponding nodes in G, using the entity type as the node type;

2) If a social relationship exists between user u_i and user u_j, add edge e(u_i, u_j); its weight ω(e) represents the strength of the social relationship and is calculated with formula (6), where F_i is the friend set of user i and F_j is the friend set of user j;

3) If an association relationship exists between video v_i and video v_j, add edge e(v_i, v_j); its weight ω(e) represents the strength of the association and is calculated with formula (7), which is defined over the related-video set of video i and the related-video set of video j. If video v_j is also a related video of video v_i, then because the graph is undirected the edge e(v_j, v_i) already exists, so no new edge is added; the weight of the existing edge is simply doubled;

4) If user u_i has positive behavior on video v_j, add edge e(u_i, v_j) with weight ω(e) = 1/n, representing the strength of the positive behavior, where n is the total number of videos on which user u_i has shown positive behavior.
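The four construction steps can be sketched as follows in Python. Formulas (6) and (7) are not reproduced in this text, so the sketch does not compute social or association strengths itself: it takes them as precomputed inputs. The function name `build_graph`, the nested-dict representation, and the tagged node tuples are all hypothetical choices made for illustration.

```python
from collections import defaultdict

def build_graph(liked, social_strength, related_strength):
    """Build the weighted undirected graph as graph[a][b] = weight.

    liked            : user id -> set of video ids with positive behavior
    social_strength  : (user, user) -> weight, standing in for formula (6)
    related_strength : (video, related video) -> weight, standing in for formula (7)
    Nodes are tagged tuples such as ("user", "u1"), which keeps the node type.
    """
    graph = defaultdict(dict)

    def set_edge(a, b, weight):
        # Undirected graph: store the weight in both directions.
        graph[a][b] = weight
        graph[b][a] = weight

    for (ui, uj), weight in social_strength.items():
        set_edge(("user", ui), ("user", uj), weight)

    for (vi, vj), weight in related_strength.items():
        a, b = ("video", vi), ("video", vj)
        if b in graph[a]:
            # The reverse listing already created this edge: double its weight.
            set_edge(a, b, 2 * graph[a][b])
        else:
            set_edge(a, b, weight)

    for user, videos in liked.items():
        for video in videos:
            # Step 4: weight 1/n, n = number of videos the user acted on.
            set_edge(("user", user), ("video", video), 1 / len(videos))
    return graph
```

In this representation the relationship type Φ(e) is implied by the tags of the two endpoints (user-user, video-video, or user-video), so it is not stored separately.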

Referring to Fig. 3, the relevance between user u_i and long-tail video v_j, used when generating the long-tail video prefetch list, is calculated on the graph model as follows:

1) In G, determine whether a path connects node u_i and node v_j. If not, the relevance is 0. If so, compute all shortest paths between u_i and v_j, and define Γ(u_i, v_j) as the set of all shortest paths between node u_i and node v_j in G. Although G has weighted edges, edge weights are ignored when computing shortest paths; only the number of edges on a path counts;

2) For each shortest path ρ ∈ Γ(u_i, v_j), compute its weight ω(ρ). Let the nodes on ρ be {n_1, ..., n_k}, with n_1 = u_i and n_k = v_j; ω(ρ) is then calculated with formula (8), where ω(n_i, n_{i+1}) is the weight of edge e(n_i, n_{i+1}) and out(n_i) is the degree of node n_i;

3) Sum the weights ω(ρ) of all shortest paths ρ ∈ Γ(u_i, v_j) to obtain the relevance sim(u_i, v_j) between node u_i and node v_j, as in formula (9).

sim(u_i, v_j) = Σ_{ρ ∈ Γ(u_i, v_j)} ω(ρ)    (9)
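The relevance calculation can be sketched in Python as a breadth-first search followed by a layer-by-layer accumulation. Formula (8) is not reproduced in this text; the sketch assumes that a path's weight is the product, over its hops, of the edge weight divided by the degree of the node being left. That product form is an assumption suggested by the symbols the text defines, not a confirmed reading, and the function name and graph representation are hypothetical.

```python
from collections import deque

def relevance(graph, source, target):
    """Sum of path weights over all fewest-hop paths from source to target.

    graph is a nested dict: graph[a][b] = weight of undirected edge (a, b).
    """
    if source not in graph or target not in graph:
        return 0.0

    # Breadth-first search by hop count; edge weights are ignored here.
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in graph[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    if target not in dist:
        return 0.0  # no connecting path: relevance is 0

    # mass[n] = summed weight of all shortest partial paths source -> n.
    mass = {source: 1.0}
    for node in sorted(dist, key=dist.get):
        for nxt, weight in graph[node].items():
            if dist[nxt] == dist[node] + 1:
                # Assumed per-hop factor: edge weight / degree of the node left.
                step = weight / len(graph[node])
                mass[nxt] = mass.get(nxt, 0.0) + mass[node] * step
    return mass[target]
```

Accumulating per layer avoids enumerating every shortest path explicitly, which matters because the number of shortest paths between two nodes can grow exponentially with path length; for a product-form path weight the result equals the sum in formula (9).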

The steps for building the personalized hybrid prefetch model are described next. Because the model is personalized, the characteristics of each individual user must be taken into account. Taking user u_i as an example, the user's historical behavior information on videos is used to calculate the weight ω_P(u_i) of the popular-video prefetch list and the weight ω_L(u_i) of the long-tail video prefetch list; the two lists are then fused by linear weighted fusion to generate the personalized hybrid prefetch model, as in formula (10).

The weight ω_P(u_i) of the popular-video prefetch list is calculated as in formula (11), and the weight ω_L(u_i) of the long-tail video prefetch list as in formula (12).
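A minimal Python sketch of the fusion step follows. The weights use the rule stated earlier in the document: the share of the user's behaviors that fall on popular videos and on long-tail videos. Formulas (10) to (12) are not reproduced in this text, so two details are assumptions: each list's scores are divided by that list's maximum before fusion so that preference and relevance are on a comparable scale, and a user with no history gets equal weights. The function name and parameters are hypothetical.

```python
def hybrid_prefetch_list(user_history, popular_set, popular_scores,
                         long_tail_scores, top_k=3):
    """Fuse two scored lists into one ranking of video ids.

    user_history     : videos the user has positive behavior on
    popular_set      : ids of all popular videos
    popular_scores   : candidate video id -> preference S_i(k)
    long_tail_scores : candidate video id -> relevance sim(u_i, v_j)
    """
    total = len(user_history)
    if total:
        # Share of the user's behaviors that fall on popular videos.
        w_popular = sum(1 for v in user_history if v in popular_set) / total
    else:
        w_popular = 0.5  # assumption: no history, weight both lists equally
    # Every video is either popular or long-tail, so the shares sum to 1.
    w_long_tail = 1.0 - w_popular

    def normalized(scores):
        # Assumption: scale each list by its own maximum before fusing.
        top = max(scores.values(), default=0.0)
        if top <= 0:
            return dict.fromkeys(scores, 0.0)
        return {v: s / top for v, s in scores.items()}

    fused = {v: w_popular * s for v, s in normalized(popular_scores).items()}
    for v, s in normalized(long_tail_scores).items():
        fused[v] = fused.get(v, 0.0) + w_long_tail * s
    return sorted(fused, key=fused.get, reverse=True)[:top_k]
```

With these weights, a user who mostly watches popular videos sees popular candidates ranked first, and a user who mostly follows friends or related-video links sees long-tail candidates ranked first.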

Referring to Fig. 4, the video prefetching steps are as follows:

1) The video cache is divided into two parts, a playback cache and a prefetch cache. The playback cache holds data blocks of the video currently being watched; the prefetch cache holds the opening data blocks of prefetched videos;

2) The client receives the personalized hybrid prefetch model from the server and uses it during subsequent playback to predict which videos the user is likely to watch next;

3) The data scheduling strategy has two phases. In each scheduling cycle, download bandwidth is first used to download urgent data blocks into the playback cache, so that the video the user is currently watching plays smoothly; urgent data blocks are those whose playback time, in the video's playback order, is close to the current playback time. If download bandwidth remains after the urgent data blocks have been downloaded, the hybrid prefetch model is used to predict the videos the user is likely to watch next, and the remaining bandwidth is used to prefetch the opening content of those videos into the prefetch cache.
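The two-phase scheduling strategy can be sketched as a single planning function in Python. The sketch simplifies in ways the text does not specify: bandwidth is counted in whole data blocks per cycle, and the opening content of each prefetched video is treated as one block. The function name and parameters are hypothetical.

```python
def schedule_cycle(bandwidth, urgent_chunks, prefetch_candidates, prefetch_cache):
    """Plan the downloads for one scheduling cycle.

    bandwidth           : number of blocks that can be downloaded this cycle
    urgent_chunks       : block ids of the current video, nearest deadline first
    prefetch_candidates : video ids ranked by the hybrid prefetch model
    prefetch_cache      : set of video ids whose opening block is already cached
    Returns (blocks for the playback cache, videos whose opening to prefetch).
    """
    # Phase 1: urgent blocks of the video being watched always come first.
    to_playback = urgent_chunks[:bandwidth]
    spare = bandwidth - len(to_playback)

    # Phase 2: only leftover bandwidth is spent on prefetching.
    to_prefetch = []
    for video in prefetch_candidates:
        if spare <= 0:
            break
        if video not in prefetch_cache:
            to_prefetch.append(video)
            spare -= 1
    return to_playback, to_prefetch
```

For example, with a budget of four blocks, two urgent blocks, and a ranked candidate list whose first entry is already cached, the plan downloads both urgent blocks and then prefetches the next two candidates. When the urgent blocks alone exhaust the budget, nothing is prefetched, which is how current playback keeps priority.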

Finally, the personalized user-generated video prefetching system based on popularity and social networks is described with reference to Fig. 5. As shown in Fig. 5, the system comprises: an information acquisition module 501, for acquiring users' historical behavior information on videos, the social relationships between users, the association relationships between videos, and the popularity information of the videos themselves; a preprocessing module 502, for retaining positive information; a video classification module 503, for dividing videos into popular videos and long-tail videos based on the popularity distribution of user-generated videos; a popular-video prefetch list generation module 504, for calculating users' preference for popular videos based on video popularity and users' historical behavior information and generating the popular-video prefetch list; a graph model construction module 505, for building a graph model with node types and edge weights from users' historical behavior information on videos, the social relationships between users, and the association relationships between videos; a long-tail video prefetch list generation module 506, for calculating the relevance between users and long-tail videos based on the social relationships between users, the association relationships between videos, and users' historical behavior information, and generating the long-tail video prefetch list; a hybrid prefetch model generation module 507, for calculating the popular-video prefetch weight and the long-tail video prefetch weight from users' historical behavior information and fusing the two prefetch lists by weighted linear fusion to generate the personalized hybrid prefetch model; and a video prefetching module 508, for prefetching the one or more top-ranked videos for the user based on the personalized hybrid model, provided that the video the user is currently watching plays smoothly.

Thus, the personalized user-generated video prefetching system based on popularity and social network according to the embodiment of the present invention is realized.

It should be noted that the system embodiment works on the same principle as the method embodiment, and the two may be read with reference to each other; the details are not repeated here.

Although the embodiments of the present invention have been described with reference to the accompanying drawings, they should not be construed as limiting the present invention. It should be noted that those skilled in the art may make modifications and variations without departing from the principle and scope of the present invention, and such modifications and variations shall also be regarded as falling within the scope of protection of the present invention.

Claims

1. A personalized user-generated video prefetching method based on popularity and social network, characterized in that it comprises the following steps:

1) Information acquisition: acquiring users' historical behavior information on videos, the social relationships between users, the association relationships between videos, and the popularity information of user-generated videos;

2) Video classification: based on the acquired popularity information of user-generated videos, dividing the videos into two classes, popular videos and long-tail videos;

3) Generating the popular video prefetch list: based on the acquired popularity information of user-generated videos and the user historical behavior information, calculating the user's preference for popular videos and, sorting by preference in descending order, generating the popular video prefetch list for the user;

4) Graph model construction: building a graph model based on the acquired user historical behavior information, the social relationships between users, and the association relationships between videos, wherein each node and each edge has a node type or relationship type consistent with its actual physical type, and each edge has an edge weight consistent with the actual strength of the relationship;

5) Generating the long-tail video prefetch list: based on the graph model, measuring the correlation between the user and videos with a correlation measure based on shortest-path weights and, sorting by correlation in descending order, generating the long-tail video prefetch list for the user;

6) Building the personalized hybrid prefetching model: based on the acquired historical behavior information of the user, performing weighted linear fusion of the popular video prefetch list and the long-tail video prefetch list to build the personalized hybrid prefetching model;

7) Video prefetching: while ensuring that the video the user is currently watching plays smoothly, prefetching the one or more top-ranked videos for the user based on the personalized hybrid prefetching model.

2. The personalized user-generated video prefetching method according to claim 1, characterized in that, in the information acquisition of step 1), the social relationships between users and the association relationships between videos are acquired if they exist in the application system; if they do not exist, they need not be acquired.

3. The personalized user-generated video prefetching method according to claim 1 or 2, characterized in that, in step 3), before the popular video prefetch list is generated, the method further comprises a step of preprocessing the user historical behavior information, specifically: retaining only the user's positive information on videos; in a system without ratings, this comprises the user's viewing, sharing, favoriting, and uploading of videos.

4. The personalized user-generated video prefetching method according to claim 3, characterized in that the detailed process of the video classification of step 2) is:

2.1) According to users' behavior on videos, building a user-video relationship matrix R: if user i has watched, favorited, uploaded, or shared video j, R_ij is 1; otherwise R_ij is 0. R is an m × n matrix, where m is the number of users in the system and n is the number of videos in the system;

2.2) Classifying videos according to the popularity distribution. First, video popularity is measured by the number of views, as in formula (1),

P_k = log(M_k + 1)    (1)

where P_k denotes the popularity of video k and M_k denotes the number of times video k has been watched;

Second, the videos are sorted by popularity from high to low. The top 10% of videos in the popularity ranking, whose cumulative view count accounts for 80% of the total view count, are the popular videos; if the cumulative view count of the top 10% of videos is less than 80% of the total view count, the popular videos are extended downward along the popularity ranking until the cumulative view count reaches 80% of the total view count. Finally, all popular videos are excluded and the remaining videos are the long-tail videos; the two groups form the popular video set and the long-tail video set respectively.
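The classification in steps 2.1) and 2.2) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the function name `classify_videos`, its parameter names, and the choice of the natural logarithm for formula (1) are hypothetical, and view counts are assumed to be available as a plain dictionary.

```python
import math

def classify_videos(view_counts, top_fraction=0.10, view_share=0.80):
    """Split videos into a popular set and a long-tail set.

    view_counts maps a video id to its view count M_k (hypothetical input
    format). Returns (popularity, popular_set, long_tail_set).
    """
    # Formula (1): P_k = log(M_k + 1). The log base is an assumption.
    popularity = {k: math.log(m + 1) for k, m in view_counts.items()}
    ranked = sorted(popularity, key=popularity.get, reverse=True)
    total_views = sum(view_counts.values())

    # Start from the top 10% of the popularity ranking (at least one video).
    cutoff = max(1, math.ceil(top_fraction * len(ranked)))
    cumulative = sum(view_counts[k] for k in ranked[:cutoff])

    # Extend down the ranking until 80% of all views are covered.
    while cumulative < view_share * total_views and cutoff < len(ranked):
        cumulative += view_counts[ranked[cutoff]]
        cutoff += 1

    return popularity, set(ranked[:cutoff]), set(ranked[cutoff:])
```

With four videos viewed 50, 35, 10, and 5 times, the top 10% is a single video covering only half of the views, so the popular set is extended by one more video.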

5. The personalized user-generated video prefetching method according to claim 4, characterized in that, in generating the popular video prefetch list of step 3), the detailed process of the preference calculation is:

3.1) According to the popularity P_k of video k, calculating the interest factor I_i(k) of user i in video k, as in formula (2),

where the two sets involved are the set of videos on which user i has had positive behavior and the set of the N videos most similar to video k, and Sim(k, j) denotes the similarity between video k and video j;

3.2) The similarity is calculated using cosine similarity, as in formula (3), where N(k) denotes the set of users who like video k, and is normalized by its maximum value, as in formula (4):

\mathrm{Sim}(k, j) = \frac{|N(k) \cap N(j)|}{\sqrt{|N(k)| \times |N(j)|}} \quad (3)

\mathrm{Sim}'(k, j) = \frac{\mathrm{Sim}(k, j)}{\max_{j} \mathrm{Sim}(k, j)} \quad (4)

Finally, formula (5) is used to calculate the preference S_i(k) of user i for popular video k:

S_i(k) = P_k × I_i(k)    (5).
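The similarity, normalization, and preference formulas (3) to (5) can be sketched as follows. Formula (2) is not reproduced in this text, so the `interest_factor` function below is an assumption: it sums the normalized similarity over the user's positively rated videos that are also among the N videos most similar to k. All function and parameter names are hypothetical.

```python
import math

def cosine_sim(fans_k, fans_j):
    """Formula (3): |N(k) ∩ N(j)| / sqrt(|N(k)| * |N(j)|) on sets of users."""
    if not fans_k or not fans_j:
        return 0.0
    return len(fans_k & fans_j) / math.sqrt(len(fans_k) * len(fans_j))

def normalized_sims(k, fans):
    """Formula (4): Sim(k, j) divided by its maximum over j."""
    sims = {j: cosine_sim(fans[k], fans[j]) for j in fans if j != k}
    top = max(sims.values(), default=0.0)
    return {j: (s / top if top > 0 else 0.0) for j, s in sims.items()}

def interest_factor(k, liked, fans, n_similar):
    """ASSUMED form of formula (2), which the text does not reproduce:
    sum of normalized similarity over the videos the user liked that are
    also among the n_similar videos most similar to k."""
    sims = normalized_sims(k, fans)
    nearest = sorted(sims, key=sims.get, reverse=True)[:n_similar]
    return sum(sims[j] for j in nearest if j in liked)

def preference(k, popularity_k, liked, fans, n_similar=10):
    """Formula (5): S_i(k) = P_k * I_i(k)."""
    return popularity_k * interest_factor(k, liked, fans, n_similar)
```

Here `fans` maps each video to the set of users who like it, and `liked` is the set of videos on which the user has had positive behavior. Sorting the popular videos by `preference` in descending order yields the popular video prefetch list.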

6. The personalized user-generated video prefetching method according to claim 5, characterized in that the detailed process of the graph model construction of step 4) is:

First, user entities and video entities are mapped to corresponding nodes in the graph, with the type of each entity taken as the type of its node; second, if a social relationship exists between users, an edge is added between the corresponding nodes, with the edge weight representing the strength of the social relationship, and if an association relationship exists between videos, an edge is added between the corresponding nodes, with the edge weight representing the strength of the association relationship; finally, if a user has positive behavior information on a video, an edge is added between the user and that video, with the edge weight representing the strength of the positive behavior.

7. The personalized user-generated video prefetching method according to claim 6, characterized in that the graph model is an undirected graph with weighted edges G(U, V, E, R, W; Φ), where U denotes the user set, containing all user nodes; V denotes the video set, containing all video nodes; E denotes the set of edges in the graph, containing all relationships between nodes in the social network and all positive user behaviors; R denotes the set of relationship types, containing all relationship types; Φ: E → R denotes the mapping function of relationship types; W denotes the weights of the edges in G, which measure the strength of a relationship; every edge e in E belongs to one relationship type, Φ(e) ∈ R; e(u, v) ∈ E, with u, v ∈ U ∪ V, denotes that a relationship of some type exists between node u and node v; and w(e) denotes the weight of edge e; where |R| > 1, and when |R| = 1, G degenerates to a user-item bipartite graph.
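The graph model of claims 6 and 7 can be sketched as a small adjacency structure. This is a minimal sketch, assuming the relationships arrive as dictionaries keyed by node pairs; the class `RelationGraph`, the function `build_graph`, and the relationship labels `"social"`, `"related"`, and `"behavior"` are hypothetical names chosen for illustration.

```python
from collections import defaultdict

class RelationGraph:
    """Undirected, edge-weighted graph with typed nodes and typed edges."""

    def __init__(self):
        self.node_type = {}           # node -> "user" or "video"
        self.adj = defaultdict(dict)  # node -> {neighbor: (relation, weight)}

    def add_node(self, node, kind):
        self.node_type[node] = kind

    def add_edge(self, u, v, relation, weight):
        # The graph is undirected, so the edge is stored in both directions.
        self.adj[u][v] = (relation, weight)
        self.adj[v][u] = (relation, weight)

    def relation_types(self):
        """The set R of relationship types present in the graph."""
        return {rel for nbrs in self.adj.values() for rel, _ in nbrs.values()}

def build_graph(users, videos, social, related, behavior):
    """Build G from user-user, video-video, and user-video relationships.

    social, related, and behavior map a node pair to an edge weight
    (hypothetical input format).
    """
    g = RelationGraph()
    for u in users:
        g.add_node(u, "user")
    for v in videos:
        g.add_node(v, "video")
    for (a, b), w in social.items():
        g.add_edge(a, b, "social", w)
    for (a, b), w in related.items():
        g.add_edge(a, b, "related", w)
    for (u, v), w in behavior.items():
        g.add_edge(u, v, "behavior", w)
    return g
```

When only the `behavior` edges are supplied, `relation_types()` returns a single type, which corresponds to the degenerate bipartite case mentioned in claim 7.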

8. The personalized user-generated video prefetching method according to claim 7, characterized in that, in generating the long-tail video prefetch list of step 5), the detailed process of the correlation calculation is:

First, the shortest paths between the nodes are computed; if no path exists between the nodes, the correlation is 0; second, the weight of each individual shortest path is computed; finally, the sum of the shortest-path weights is taken as the correlation between the nodes.
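One way to read claim 8 is sketched below. The claim does not say how a shortest path is defined or how a single path's weight is derived from its edge weights, so two assumptions are made here and marked in the code: a shortest path is one with the fewest hops, and a path's weight is the product of its edge weights. The function name `correlation` and the adjacency format are hypothetical.

```python
from collections import deque

def correlation(adj, source, target):
    """Sum of the weights of all shortest paths from source to target.

    adj maps a node to {neighbor: edge_weight}.
    ASSUMPTION: "shortest" means fewest hops (breadth-first search).
    ASSUMPTION: a path's weight is the product of its edge weights.
    Returns 0.0 when no path exists, as claim 8 requires.
    """
    dist = {source: 0}
    score = {source: 1.0}  # accumulated weight of all shortest paths so far
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v, w in adj.get(u, {}).items():
            if v not in dist:
                dist[v] = dist[u] + 1
                score[v] = 0.0
                queue.append(v)
            # Only edges that lie on a shortest path contribute.
            if dist[v] == dist[u] + 1:
                score[v] += score[u] * w
    return score.get(target, 0.0)
```

Because breadth-first search finishes each level before starting the next, `score[u]` is final by the time `u` is dequeued. Ranking the long-tail videos by this value in descending order gives the long-tail video prefetch list.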

9. The personalized user-generated video prefetching method according to claim 8, characterized in that the detailed process of generating the hybrid prefetching model of step 6) is:

Based on user historical behavior information, the ratios of the user's number of behaviors on popular videos and on long-tail videos to the user's total number of behaviors are calculated respectively; the results are used as the weights of the popular video prefetch list and the long-tail video prefetch list, and the popular video prefetch list and the long-tail video prefetch list are fused by weighted linear fusion.
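The weighting and fusion of claim 9 can be sketched as follows. This is an illustration under assumptions rather than the patented procedure: the scores in the two lists are assumed to be already on a comparable scale (for example, each normalized to [0, 1]), a user with no history is given equal weights, and the names `fusion_weights` and `fuse` are hypothetical.

```python
def fusion_weights(user_videos, popular):
    """Share of the user's behaviors that fall on popular vs. long-tail videos.

    user_videos lists one video id per positive behavior; popular is the
    popular video set. Returns (w_popular, w_long_tail).
    """
    total = len(user_videos)
    if total == 0:
        return 0.5, 0.5  # ASSUMPTION: no history, so weight both lists equally
    w_popular = sum(1 for v in user_videos if v in popular) / total
    return w_popular, 1.0 - w_popular

def fuse(popular_list, long_tail_list, w_popular, w_long_tail):
    """Weighted linear fusion of two [(video, score)] lists.

    ASSUMPTION: scores in both lists are on a comparable scale.
    Returns one list sorted by fused score in descending order.
    """
    merged = [(v, w_popular * s) for v, s in popular_list]
    merged += [(v, w_long_tail * s) for v, s in long_tail_list]
    return sorted(merged, key=lambda pair: pair[1], reverse=True)
```

A user whose history is three quarters popular videos gets weights 0.75 and 0.25, so a long-tail video needs a much higher raw score than a popular one to rank first.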

10. The personalized user-generated video prefetching method according to claim 9, characterized in that the detailed process of the video prefetching of step 7) is:

1) The video cache is divided into two parts, a playback cache and a prefetch cache; the playback cache caches the data blocks of the video currently being watched, and the prefetch cache caches the head data blocks of prefetched videos;

2) The personalized hybrid prefetching model is received from the server and is then used during playback to predict the video content the user may watch in the future;

3) The data scheduling strategy has two stages. In each scheduling cycle, the download bandwidth is first used to download urgent data blocks into the playback cache, so that the video the user is currently watching plays smoothly; urgent data blocks are the blocks whose playback time, in the video's playback order, is close to the current playback time. If download bandwidth remains after the urgent data blocks have been downloaded, the videos the user may watch in the future are predicted from the hybrid prefetching model, and the remaining download bandwidth is used to prefetch the head content of those videos into the prefetch cache.
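The two-stage scheduling of step 3) can be sketched as one function call per scheduling cycle. This is a minimal sketch under assumptions: bandwidth is measured in whole data blocks per cycle, the head of a video is a fixed number of blocks, and the names `schedule_cycle`, `head_blocks`, and `prefetch_cache` are hypothetical.

```python
def schedule_cycle(bandwidth_blocks, urgent_blocks, ranked_videos,
                   prefetch_cache, head_blocks=2):
    """Plan the downloads for one scheduling cycle.

    bandwidth_blocks: blocks that can be downloaded this cycle (assumed unit).
    urgent_blocks: blocks of the current video close to the playback time.
    ranked_videos: video ids ordered by the hybrid prefetching model.
    prefetch_cache: dict video id -> head blocks already cached (updated
    in place).
    Returns (blocks for the playback cache, [(video, blocks prefetched)]).
    """
    # Stage 1: urgent blocks of the current video always go first.
    playback = urgent_blocks[:bandwidth_blocks]
    remaining = bandwidth_blocks - len(playback)

    # Stage 2: spend leftover bandwidth on heads of top-ranked videos.
    prefetched = []
    for video in ranked_videos:
        if remaining <= 0:
            break
        have = prefetch_cache.get(video, 0)
        need = min(head_blocks - have, remaining)
        if need > 0:
            prefetch_cache[video] = have + need
            prefetched.append((video, need))
            remaining -= need
    return playback, prefetched
```

When the urgent blocks consume the whole budget, nothing is prefetched in that cycle, which keeps playback of the current video ahead of any speculative download.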

11. A personalized user-generated video prefetching system based on video popularity and social network, characterized in that the system comprises:

an information acquisition module, for acquiring users' historical behavior information on videos, the social relationships between users, the association relationships between videos, and the popularity information of the videos themselves;

a video classification module, for dividing videos into two classes, popular videos and long-tail videos, based on the popularity information of user-generated videos;

a preprocessing module, for filtering the user historical behavior information and retaining positive information;

a popular video prefetch list generation module, for calculating the user's preference for popular videos based on video popularity and user historical behavior information, and generating the popular video prefetch list;

a graph model construction module, for building a graph model with node types and edge weights based on users' historical behavior information on videos, the social relationships between users, and the association relationships between videos;

a long-tail video prefetch list generation module, for calculating the correlation between the user and long-tail videos based on the graph model, using a correlation measure based on shortest-path weights, and generating the long-tail video prefetch list;

a hybrid prefetching model generation module, for calculating the popular video prefetch list weight and the long-tail video prefetch list weight based on user historical behavior information, and fusing the popular video prefetch list and the long-tail video prefetch list by weighted linear fusion to generate the personalized hybrid prefetching model;

a video prefetching module, for prefetching the one or more top-ranked videos for the user based on the personalized hybrid model, while ensuring that the video the user is currently watching plays smoothly.