WO2018137104A1 - User behavior analysis method and system based on big data mining - Google Patents

Info

Publication number
WO2018137104A1
Authority
WO
WIPO (PCT)
Prior art keywords
user behavior
data
behavior data
page
user
Prior art date
Application number
PCT/CN2017/072375
Other languages
French (fr)
Chinese (zh)
Inventor
熊益冲
Original Assignee
深圳企管加企业服务有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳企管加企业服务有限公司
Priority to PCT/CN2017/072375
Publication of WO2018137104A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor

Definitions

  • the present invention relates to the field of Internet technologies, and in particular to a user behavior analysis method and system based on big data mining.
  • user behavior analysis refers to compiling statistics on and analyzing the real-time and historical user behavior information generated while users access network services (including visiting and browsing web pages, performing interactive operations, using apps, and so on).
  • user behavior information includes, but is not limited to: the number of visits to a network service, visit frequency, dwell time, active operation time, keywords entered by the user, links clicked by the user, and user interactions (such as following, unfollowing, rating, bookmarking, adding to the shopping cart, removing from the shopping cart, placing an order, canceling an order, paying, and requesting a refund).
  • the most effective approach is to record all of the user behavior information produced by every user action, and to compile statistics on and analyze all of that information.
  • Big data technology is an information processing technology that takes all of the data resources of a system as its object and discovers the correlations exhibited among the data. It is already widely used for Internet process optimization, targeted message and advertisement pushing, and the personalization and improvement of user services, and has become powerful back-end support for network services. Analyzing and using all user behavior information on a big data platform suits the characteristics of that information (large scale, complex and diverse data formats, and demanding computation speed) and can meet the practical needs of all types of network services.
  • the embodiment of the invention provides an Internet-of-Things-based monitoring method and system for non-smart air conditioners, which can monitor various parameters of a non-smart air conditioner and adjust it according to those parameters; this makes it possible to check whether the air conditioner's temperature and humidity settings are reasonable, or to switch the air conditioner on and off automatically by time period, avoiding unnecessary waste of cooling.
  • the first aspect of the embodiments of the present invention discloses a user behavior analysis method based on big data mining, including:
  • a user behavior data ontology model is established and stored in the database.
  • the method further includes
  • the user behavior data ontology model is analyzed to find out the user's latest interest data.
  • the user behavior data includes the subject of the behavior, the time of occurrence, the page on which it occurred, scrolling the page up or down, moving or clicking the mouse, page dwell time, bookmarking, printing, saving, the number of visits to the same page, copy-and-paste text operations, the current user's search criteria, and the titles corresponding to the search keywords.
  • the pre-processing includes: removing incomplete data and deleting duplicate data, pictures, and page animations; print, bookmark, save, and download operations performed on a page are, once captured, converted into the corresponding data format and saved in the database;
  • the data aggregation includes: filtering and integrating user behavior information that is correct but invalid, using a rule-based user behavior aggregation algorithm.
  • the establishing a user behavior data ontology model specifically includes:
  • the OWL-DL description language is used to build the user behavior data ontology model, and the ontology model is decomposed.
  • the database uses an open source non-relational distributed database.
  • the second aspect of the embodiment of the present invention discloses a user behavior analysis system based on big data mining, including:
  • an acquisition unit for collecting user behavior data
  • a pre-processing unit for performing pre-processing and aggregation on user behavior data using a parallel computing model
  • the modeling unit is configured to establish a user behavior data ontology model according to the aggregated user behavior data, and store the data in the database.
  • system further includes:
  • the analysis unit is configured to analyze the user behavior data ontology model to find out the latest interest data of the user.
  • the modeling unit is specifically configured to: establish an ontology model of the user behavior data by using an OWL-DL description language, and decompose the ontology model, where the database adopts an open source non-relational distributed database.
  • the user behavior data is collected; the user behavior data is preprocessed and aggregated by using a parallel computing model; and the user behavior data ontology model is established according to the aggregated user behavior data, and stored in the database.
  • implementing the embodiments of the present invention combines the powerful processing capability and large-scale data storage capability of cloud computing technology with ontologies and their analysis and with knowledge discovery methods, analyzes massive user behavior data in real time, and captures user interests promptly, thereby achieving effective and precise pushing to users.
  • FIG. 1 is a schematic flowchart of a user behavior analysis method based on big data mining according to a first embodiment of the present invention
  • FIG. 3 is a schematic structural diagram of a terminal device according to a third embodiment of the present invention.
  • the embodiment of the invention provides a user behavior analysis method based on big data mining, which combines the powerful processing capability and large-scale data storage capability of cloud computing technology with ontologies and their analysis and with knowledge discovery methods, analyzes massive user behavior data in real time, and captures user interests promptly, thereby achieving effective and precise pushing to users.
  • FIG. 1 is a schematic flowchart of a user behavior analysis method based on big data mining disclosed in the first embodiment of the present invention. The method shown in FIG. 1 may include the following steps:
  • the user behavior data includes the subject of the behavior, the time of occurrence, the page on which it occurred, scrolling the page up or down, moving or clicking the mouse, page dwell time, bookmarking, printing, saving, the number of visits to the same page, copy-and-paste text operations, the current user's search criteria, and the titles corresponding to the search keywords.
  • the pre-processing includes: removing incomplete data and deleting duplicate data, pictures, and page animations; print, bookmark, save, and download operations performed on a page are, once captured, converted into the corresponding data format and saved in the database.
  • the data aggregation includes: filtering and integrating user behavior information that is correct but invalid, using a rule-based user behavior aggregation algorithm.
  • the user behavior data ontology model is established by using the OWL-DL description language, and the ontology model is decomposed, and the database adopts an open source non-relational distributed database.
  • the aggregated user behavior data is added to the user behavior data ontology model, and the user behavior data ontology model data stored in the database is analyzed to find the user's latest interest data.
  • the user behavior data is collected; the user behavior data is preprocessed and aggregated by using a parallel operation model; and the user behavior data ontology model is established according to the aggregated user behavior data, and stored in the database.
  • implementing the embodiments of the present invention combines the powerful processing capability and large-scale data storage capability of cloud computing technology with ontologies and their analysis and with knowledge discovery methods, analyzes massive user behavior data in real time, and captures user interests promptly, thereby achieving effective and precise pushing to users.
  • the system embodiment of the present invention is used to carry out the method of the first method embodiment of the present invention.
  • for ease of description, only the parts related to the embodiment of the present invention are shown; for specific technical details that are not disclosed, please refer to Embodiments 1 to 2 of the present invention.
  • FIG. 2 is a structural diagram of a user behavior analysis system based on big data mining disclosed in a second embodiment of the present invention. As shown in Figure 2, the system can include:
  • the collecting unit 201 is configured to collect user behavior data.
  • the user behavior data includes the subject of the behavior, the time of occurrence, the page on which it occurred, scrolling the page up or down, moving or clicking the mouse, page dwell time, bookmarking, printing, saving, the number of visits to the same page, copy-and-paste text operations, the current user's search criteria, and the titles corresponding to the search keywords.
  • the pre-processing unit 202 is configured to perform pre-processing and aggregation on the user behavior data by using a parallel computing model.
  • the pre-processing includes: removing incomplete data and deleting duplicate data, pictures, and page animations; print, bookmark, save, and download operations performed on a page are, once captured, converted into the corresponding data format and saved in the database;
  • the data aggregation includes: filtering and integrating user behavior information that is correct but invalid, using a rule-based user behavior aggregation algorithm.
  • the modeling unit 203 is configured to establish a user behavior data ontology model according to the aggregated user behavior data, and store the data in the database.
  • the user behavior data ontology model is established by using the OWL-DL description language, and the ontology model is decomposed, and the database adopts an open source non-relational distributed database.
  • the analyzing unit 204 is configured to analyze the user behavior data ontology model to find the latest interest data of the user.
  • the aggregated user behavior data is added to the user behavior data ontology model, and the user behavior data ontology model data stored in the database is analyzed to find the user's latest interest data.
  • the acquisition unit collects user behavior data; the pre-processing unit preprocesses and aggregates the user behavior data using a parallel computing model; the modeling unit establishes a user behavior data ontology model according to the aggregated user behavior data and stores it in the database.
  • FIG. 3 is a schematic structural diagram of a terminal device according to an embodiment of the present invention. As shown in FIG. 3, for the convenience of description, only the parts related to the embodiments of the present invention are shown. For the specific technical details not disclosed, please refer to the method part of the embodiment of the present invention.
  • the terminal may include a processor 301, a memory 302, and a collector 303, which are connected by a communication bus 304.
  • the method flow of each step may be implemented based on the structure of the terminal device.
  • Both the application layer and the operating system kernel can be considered as part of the abstraction structure of the processor 301.
  • the processor 301 performs the following operations by calling program code stored in the memory 302:
  • a user behavior data ontology model is established and stored in the database.
  • the user behavior data includes the subject of the behavior, the time of occurrence, the page on which it occurred, scrolling the page up or down, moving or clicking the mouse, page dwell time, bookmarking, printing, saving, the number of visits to the same page, copy-and-paste text operations, the current user's search criteria, and the titles corresponding to the search keywords.
  • the pre-processing includes: removing incomplete data and deleting duplicate data, pictures, and page animations; print, bookmark, save, and download operations performed on a page are, once captured, converted into the corresponding data format and saved in the database;
  • the data aggregation includes: filtering and integrating user behavior information that is correct but invalid, using a rule-based user behavior aggregation algorithm.
  • the collector 303 collects user behavior data; the processor 301 preprocesses and aggregates the user behavior data using a parallel computing model, establishes a user behavior data ontology model according to the aggregated user behavior data, and stores it in the database. It can be seen that implementing the embodiments of the present invention combines the powerful processing capability and large-scale data storage capability of cloud computing technology with ontologies and their analysis and with knowledge discovery methods, analyzes massive user behavior data in real time, and captures user interests promptly, thereby achieving effective and precise pushing to users.
  • processor 301 is further configured to perform the following operations by calling program code stored in the memory 302:
  • the user behavior data ontology model is analyzed to find out the user's latest interest data.
  • the embodiment of the present invention further provides a computer storage medium, wherein the computer storage medium can store a program, and the program includes some or all of the steps of the monitoring method described in any one of the foregoing method embodiments.
  • the steps of the method of the embodiment of the present invention may be adjusted, merged, or deleted according to actual needs.
  • the unit of the terminal in the embodiment of the present invention may be integrated, further divided or deleted according to actual needs.
  • the disclosed system may be implemented in other manners. The system embodiment described above is merely illustrative: the division into units is only a division by logical function, and other divisions are possible in an actual implementation. For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, device or unit, and may be electrical or otherwise.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
  • each functional unit in various embodiments of the present invention may be integrated in one processing unit. It is also possible that each unit physically exists alone, or two or more units may be integrated in one unit.
  • the above integrated unit can be implemented in the form of hardware or in the form of a software functional unit.
  • each unit included is only divided according to functional logic, but is not limited to the above division, as long as the corresponding functions can be realized; the specific names of the functional units are only for the purpose of distinguishing them from one another and are not intended to limit the scope of the present invention.
  • ROM Read-Only Memory
  • RAM Random Access Memory
  • PROM Programmable Read-Only Memory
  • EPROM Erasable Programmable Read-Only Memory
  • OTPROM One-time Programmable Read-Only Memory
  • EEPROM Electrically Erasable Programmable Read-Only Memory
  • CD-ROM Compact Disc Read-Only Memory

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Disclosed in embodiments of the present invention are a user behavior analysis method and system based on big data mining. The method comprises: collecting user behavior data; preprocessing and aggregating the user behavior data by using a parallel computing model; and establishing a user behavior data ontology model according to the aggregated user behavior data, and storing the user behavior data ontology model in a database. Accordingly, by executing the embodiments of the present invention, the powerful processing capability and large-scale data storage capability of cloud computing technology are combined with an ontology and its analysis and with a knowledge discovery method. In this way, massive user behavior data is analyzed in real time and the interests of a user are found promptly, thereby implementing effective and precise pushing to the user.

Description

User behavior analysis method and system based on big data mining
Technical field
The present invention relates to the field of Internet technologies, and in particular to a user behavior analysis method and system based on big data mining.
Background
In the field of Internet applications, user behavior analysis refers to compiling statistics on and analyzing the real-time and historical user behavior information generated throughout the process of users accessing network services (including visiting and browsing web pages, performing interactive operations, using apps, and so on). The process of accessing network services contains a large amount of valuable information. It is estimated that during a single online shopping session a user on average looks at 3-4 products, visits 5-7 websites, and browses more than 40 pages. User behavior information includes, but is not limited to: the number of visits to a network service, visit frequency, dwell time, active operation time, keywords entered by the user, links clicked by the user, and user interactions (such as following, unfollowing, rating, bookmarking, adding to the shopping cart, removing from the shopping cart, placing an order, canceling an order, paying, and requesting a refund). By studying user behavior information, one can discover the patterns users exhibit when accessing network services and obtain a scientific, accurate, and objective basis for improving user experience, pushing information efficiently, and promoting targeted marketing. For research on and application of user behavior, the most effective approach is to record all of the user behavior information produced by every user action, and to compile statistics on and analyze all of it.
Big data technology is an information processing technology that takes all of the data resources of a system as its object and discovers the correlations exhibited among the data. It is already widely used for Internet process optimization, targeted message and advertisement pushing, and the personalization and improvement of user services, and has become powerful back-end support for network services. Analyzing and using all user behavior information on a big data platform suits the characteristics of that information (large scale, complex and diverse data formats, and demanding computation speed) and can meet the practical needs of all types of network services.
Much research on user behavior analysis has been done in China and abroad, but some problems remain. First, most work concentrates on mining web logs, yet these logs are insufficient to describe, in a timely way, the context in which users visit a website. Second, large websites generally have a huge number of online users who generate an enormous volume of real-time behavior and context information, so the system needs greater storage capacity and computation speed in order to feed analysis results back to users in time. At present, most user behavior analysis systems use relational database technology and traditional data processing methods, which cannot adequately support efficient analysis of massive data.
Summary of the invention
The embodiments of the present invention provide an Internet-of-Things-based monitoring method and system for non-smart air conditioners, which can monitor various parameters of a non-smart air conditioner and adjust it according to those parameters; this makes it possible to check whether the air conditioner's temperature and humidity settings are reasonable, or to switch the air conditioner on and off automatically by time period, avoiding unnecessary waste of cooling.
A first aspect of the embodiments of the present invention discloses a user behavior analysis method based on big data mining, including:
collecting user behavior data;
preprocessing and aggregating the user behavior data using a parallel computing model;
establishing a user behavior data ontology model according to the aggregated user behavior data, and storing it in a database.
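The three steps of the first aspect can be sketched end to end as follows. This is only an illustration under stated assumptions: the source does not name a parallel framework, so a thread pool with a map step and a reduce step stands in for the "parallel computing model", a plain dictionary stands in for the database, and every function name and sample event is hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def collect():
    """Stand-in for the collection step: returns hard-coded sample events."""
    return [
        {"user": "u1", "page": "/shoes", "action": "click"},
        {"user": "u1", "page": "/shoes", "action": "bookmark"},
        {"user": "u2", "page": "/hats", "action": "click"},
        {"user": "u1", "page": None, "action": "click"},  # incomplete record
    ]


def preprocess(chunk):
    """Map step: drop incomplete records, count (user, page) pairs in one chunk."""
    return Counter((e["user"], e["page"]) for e in chunk if all(e.values()))


def aggregate(partials):
    """Reduce step: merge the per-chunk counters into one."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


def build_model(counts):
    """Turn aggregated counts into simple (subject, predicate, object, weight) statements."""
    return [(user, "visited", page, n) for (user, page), n in sorted(counts.items())]


def run_pipeline(chunk_size=2):
    events = collect()
    chunks = [events[i:i + chunk_size] for i in range(0, len(events), chunk_size)]
    with ThreadPoolExecutor() as pool:  # chunks are preprocessed in parallel
        partials = list(pool.map(preprocess, chunks))
    database = {"model": build_model(aggregate(partials))}  # dict stands in for the database
    return database


if __name__ == "__main__":
    print(run_pipeline()["model"])
```

Splitting the work into independent chunks before merging is what lets the preprocessing scale out; a production system would swap the thread pool for whatever distributed framework is actually in use.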
As an optional implementation, the method further includes:
analyzing the user behavior data ontology model to find the user's latest interest data.
As an optional implementation, the user behavior data includes the subject of the behavior, the time of occurrence, the page on which it occurred, scrolling the page up or down, moving or clicking the mouse, page dwell time, bookmarking, printing, saving, the number of visits to the same page, copy-and-paste text operations, the current user's search criteria, and the titles corresponding to the search keywords.
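The source does not say how the "latest interest" is derived from the model. One plausible reading, shown here purely as an assumption, is to weight each recorded behavior by how recent it is and rank topics by the resulting score. The exponential half-life weighting, the `topic` field, and the function name are all hypothetical.

```python
from collections import defaultdict


def latest_interests(events, now, half_life=3600.0, top_n=2):
    """Rank topics by recency-weighted activity (assumed scoring, not from the source).

    Each event contributes 0.5 ** (age / half_life), so an event one
    half-life old counts half as much as one happening right now.
    """
    scores = defaultdict(float)
    for event in events:
        age = now - event["time"]
        scores[event["topic"]] += 0.5 ** (age / half_life)
    # Highest score first; ties broken alphabetically for a stable order.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [topic for topic, _ in ranked[:top_n]]


if __name__ == "__main__":
    sample = [
        {"topic": "shoes", "time": 0},
        {"topic": "shoes", "time": 0},
        {"topic": "shoes", "time": 0},
        {"topic": "bags", "time": 3600},
        {"topic": "hats", "time": 7200},
    ]
    print(latest_interests(sample, now=7200))
```

With this weighting a single fresh visit can outrank several old ones, which matches the emphasis on the user's most recent interests rather than lifetime totals.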
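The fields listed above could be carried in a single record type. The sketch below is one possible layout, not the schema used by the source: the class name, field names, and the set of action labels are assumptions chosen to mirror the listed items.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional

# Hypothetical labels for the operations named in the text.
ALLOWED_ACTIONS = {
    "scroll", "mouse_move", "click", "bookmark",
    "print", "save", "copy_paste", "search",
}


@dataclass
class BehaviorRecord:
    user_id: str                  # subject of the behavior
    occurred_at: datetime         # time of occurrence
    page: str                     # page on which the behavior occurred
    action: str                   # one of ALLOWED_ACTIONS
    dwell_seconds: float = 0.0    # page dwell time
    visit_count: int = 1          # number of visits to the same page
    search_query: Optional[str] = None                       # current search criteria
    result_titles: List[str] = field(default_factory=list)   # titles matching the keywords

    def __post_init__(self):
        # Reject operations that the collection step does not know about.
        if self.action not in ALLOWED_ACTIONS:
            raise ValueError(f"unknown action: {self.action!r}")


if __name__ == "__main__":
    record = BehaviorRecord("u1", datetime(2017, 1, 24, 9, 30), "/shoes", "click")
    print(record)
```

Validating the action at construction time keeps malformed events out of the later preprocessing and aggregation stages.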
As an optional implementation, the preprocessing includes: removing incomplete data and deleting duplicate data, pictures, and page animations; print, bookmark, save, and download operations performed on a page are, once captured, converted into the corresponding data format and saved in the database;
the data aggregation includes: filtering and integrating user behavior information that is correct but invalid, using a rule-based user behavior aggregation algorithm.
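A minimal sketch of the two stages just described, assuming dictionary-shaped events. The source gives no concrete rules, so the two rules below (a minimum dwell time and ignoring bare mouse movement) are hypothetical examples of "correct but invalid" records, and the media suffix list is likewise an assumption.

```python
REQUIRED = ("user", "time", "page", "action")
MEDIA_SUFFIXES = (".png", ".jpg", ".gif", ".swf")  # assumed picture/animation markers


def preprocess(events):
    """Remove incomplete records, pictures/animations, and exact duplicates."""
    seen, cleaned = set(), []
    for e in events:
        if any(e.get(k) in (None, "") for k in REQUIRED):
            continue  # incomplete data
        if e["page"].lower().endswith(MEDIA_SUFFIXES):
            continue  # picture or page animation
        key = tuple(e[k] for k in REQUIRED)
        if key in seen:
            continue  # duplicate data
        seen.add(key)
        cleaned.append(e)
    return cleaned


# Hypothetical rules: a record passes aggregation only if every rule accepts it.
RULES = [
    lambda e: e.get("dwell", 0.0) >= 1.0,   # sub-second visits carry no signal
    lambda e: e["action"] != "mouse_move",  # bare mouse movement is ignored
]


def aggregate(events):
    """Filter well-formed but uninformative records, then sum dwell per (user, page)."""
    totals = {}
    for e in events:
        if all(rule(e) for rule in RULES):
            key = (e["user"], e["page"])
            totals[key] = totals.get(key, 0.0) + e.get("dwell", 0.0)
    return totals


if __name__ == "__main__":
    raw = [
        {"user": "u1", "time": 1, "page": "/shoes", "action": "click", "dwell": 12.0},
        {"user": "u1", "time": 1, "page": "/shoes", "action": "click", "dwell": 12.0},
        {"user": "u1", "time": 2, "page": "/banner.gif", "action": "click", "dwell": 3.0},
        {"user": "u1", "time": 3, "page": "", "action": "click", "dwell": 5.0},
        {"user": "u1", "time": 4, "page": "/shoes", "action": "click", "dwell": 0.2},
        {"user": "u2", "time": 5, "page": "/hats", "action": "save", "dwell": 8.0},
    ]
    print(aggregate(preprocess(raw)))
```

Keeping the rules in a list means new filters can be added without touching the aggregation loop, which is the usual appeal of a rule-based design.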
As an optional implementation, establishing the user behavior data ontology model specifically includes:
establishing the user behavior data ontology model using the OWL-DL description language and decomposing the ontology model, where the database is an open source non-relational distributed database.
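To make the OWL-DL step concrete, the sketch below writes a tiny ontology as triples, serializes it to Turtle, and then "decomposes" it into one row per subject, the shape a column-oriented NoSQL store would typically hold. The class and property names, the `http://example.org/` namespace, and this reading of "decompose" are assumptions; the source names neither the vocabulary nor the database. No ontology library is used, only string formatting.

```python
PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "owl": "http://www.w3.org/2002/07/owl#",
    "ub": "http://example.org/user-behavior#",  # hypothetical namespace
}

# A minimal ontology: three classes and two object properties linking them.
TRIPLES = [
    ("ub:User", "rdf:type", "owl:Class"),
    ("ub:Page", "rdf:type", "owl:Class"),
    ("ub:Behavior", "rdf:type", "owl:Class"),
    ("ub:performedBy", "rdf:type", "owl:ObjectProperty"),
    ("ub:performedBy", "rdfs:domain", "ub:Behavior"),
    ("ub:performedBy", "rdfs:range", "ub:User"),
    ("ub:occurredOn", "rdf:type", "owl:ObjectProperty"),
    ("ub:occurredOn", "rdfs:domain", "ub:Behavior"),
    ("ub:occurredOn", "rdfs:range", "ub:Page"),
]


def to_turtle(triples):
    """Serialize the triples as Turtle, one statement per line."""
    lines = [f"@prefix {name}: <{iri}> ." for name, iri in PREFIXES.items()]
    lines.append("")
    lines.extend(f"{s} {p} {o} ." for s, p, o in triples)
    return "\n".join(lines)


def decompose(triples):
    """Group triples by subject: one row per subject, one column per predicate."""
    rows = {}
    for s, p, o in triples:
        rows.setdefault(s, {}).setdefault(p, []).append(o)
    return rows


if __name__ == "__main__":
    print(to_turtle(TRIPLES))
    print(decompose(TRIPLES)["ub:performedBy"])
```

Storing one row per subject keeps everything known about a class or property in a single lookup, which suits key-based access in a distributed store.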
A second aspect of the embodiments of the present invention discloses a user behavior analysis system based on big data mining, including:
a collection unit, configured to collect user behavior data;
a preprocessing unit, configured to preprocess and aggregate the user behavior data using a parallel computing model;
a modeling unit, configured to establish a user behavior data ontology model according to the aggregated user behavior data and store it in a database.
As an optional implementation, the system further includes:
an analysis unit, configured to analyze the user behavior data ontology model to find the user's latest interest data.
As an optional implementation, the user behavior data includes the subject of the behavior, the time of occurrence, the page on which it occurred, scrolling the page up or down, moving or clicking the mouse, page dwell time, bookmarking, printing, saving, the number of visits to the same page, copy-and-paste text operations, the current user's search criteria, and the titles corresponding to the search keywords.
As an optional implementation, the preprocessing includes: removing incomplete data and deleting duplicate data, pictures, and page animations; print, bookmark, save, and download operations performed on a page are, once captured, converted into the corresponding data format and saved in the database;
the data aggregation includes: filtering and integrating user behavior information that is correct but invalid, using a rule-based user behavior aggregation algorithm.
As an optional implementation, the modeling unit is specifically configured to: establish the user behavior data ontology model using the OWL-DL description language and decompose the ontology model, where the database is an open source non-relational distributed database.
It can be seen from the above technical solutions that the embodiments of the present invention have the following advantages:
In the embodiments of the present invention, user behavior data is collected; the user behavior data is preprocessed and aggregated using a parallel computing model; and a user behavior data ontology model is established according to the aggregated user behavior data and stored in a database. It can be seen that implementing the embodiments of the present invention combines the powerful processing capability and large-scale data storage capability of cloud computing technology with ontologies and their analysis and with knowledge discovery methods, analyzes massive user behavior data in real time, and captures user interests promptly, thereby achieving effective and precise pushing to users.
Brief description of the drawings
In order to explain the technical solutions in the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention; a person of ordinary skill in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic flowchart of a user behavior analysis method based on big data mining disclosed in a first embodiment of the present invention;
[Corrected under Rule 26, 21.03.2017]
FIG. 2 is a schematic structural diagram of a user behavior analysis system based on big data mining disclosed in a second embodiment of the present invention;
FIG. 3 is a schematic structural diagram of a terminal device disclosed in a third embodiment of the present invention.
Detailed description
In order to make the objectives, technical solutions, and advantages of the present invention clearer, the present invention is described in further detail below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
The terms "first", "second", and the like in the specification, the claims, and the above drawings are used to distinguish different objects, not to describe a specific order. In addition, the terms "include" and "have", and any variations of them, are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device that comprises a series of steps or units is not limited to the listed steps or units, but optionally also includes steps or units that are not listed, or optionally also includes other steps or units inherent to the process, method, product, or device.
Reference herein to an "embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the present invention. The appearance of this phrase in various places in the specification does not necessarily always refer to the same embodiment, nor to separate or alternative embodiments that are mutually exclusive of other embodiments. Those skilled in the art understand, explicitly and implicitly, that the embodiments described herein can be combined with other embodiments.
本发明实施例提供了一种基于大数据挖掘的用户行为分析方法,将云计算技术的强大处理能力和大规模数据存储能力、本体及其分析、知识发现方法相结合,实时分析海量用户行为数据,及时获取用户兴趣,从而实现有效与精准的用户推送。The embodiment of the invention provides a user behavior analysis method based on big data mining, which combines the powerful processing capability of cloud computing technology with large-scale data storage capability, ontology and its analysis, and knowledge discovery method to analyze massive user behavior data in real time. Get user interest in a timely manner to achieve effective and accurate user push.
Referring to FIG. 1, FIG. 1 is a schematic flowchart of a user behavior analysis method based on big data mining according to the first embodiment of the present invention. The method shown in FIG. 1 may include the following steps:
101. Collect user behavior data.
In this embodiment of the present invention, the user behavior data includes: the subject of the behavior (the user), the time of occurrence, the page on which the behavior occurred, scrolling the page up and down, moving or clicking the mouse, page dwell time, bookmarking, printing, saving, the number of visits to the same page, copy-and-paste operations on text, the current user's search conditions, and the titles corresponding to the search keywords.
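The fields listed for step 101 can be pictured as a single event record. The following is a minimal sketch only: the class name `BehaviorEvent`, every field name, and the `ACTIONS` vocabulary are assumptions made for illustration, since the source names the kinds of data collected but not a schema.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

# Hypothetical action vocabulary derived from the behaviors listed in step 101;
# the identifiers themselves are assumptions, not part of the source.
ACTIONS = {"scroll", "mouse_move", "click", "bookmark", "print", "save",
           "copy_paste", "search"}


@dataclass(frozen=True)
class BehaviorEvent:
    user_id: str                          # subject of the behavior
    occurred_at: datetime                 # time of occurrence
    page_url: str                         # page on which the behavior occurred
    action: str                           # one of ACTIONS
    dwell_seconds: float = 0.0            # page dwell time
    visit_count: int = 1                  # number of visits to the same page
    search_query: Optional[str] = None    # current user's search conditions
    result_title: Optional[str] = None    # title matching the search keyword

    def __post_init__(self) -> None:
        # Reject actions outside the assumed vocabulary at collection time.
        if self.action not in ACTIONS:
            raise ValueError(f"unknown action: {self.action!r}")


event = BehaviorEvent(
    user_id="u1",
    occurred_at=datetime(2017, 1, 24, 9, 30),
    page_url="/articles/42",
    action="bookmark",
    dwell_seconds=75.0,
)
```

Keeping the record immutable (`frozen=True`) is a design choice of this sketch: a collected event is a fact about the past, so later stages derive new data from it rather than editing it.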
102. Preprocess and aggregate the user behavior data using a parallel computing model.
In this embodiment of the present invention, the preprocessing includes: removing incomplete data, and deleting duplicate data, pictures, and page animations; print, bookmark, save, and download operations performed on a page are, once captured, converted into the corresponding data format and saved in the database.
The data aggregation includes: filtering and consolidating user behavior information that is correct but invalid, using a rule-based user behavior aggregation algorithm.
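Step 102 can be sketched as a map step that cleans partitions independently, followed by a reduce step that applies aggregation rules. The source does not name the parallel computing model or the rules, so everything specific here is an assumption: the dict-based event layout, the `view` action, the `MIN_DWELL_SECONDS` bounce rule, and the use of a thread pool as a stand-in for a real distributed framework (threads in CPython illustrate the structure, not a CPU speedup).

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

REQUIRED = ("user_id", "page_url", "action", "timestamp")
MIN_DWELL_SECONDS = 2.0  # assumed rule: shorter page views count as invalid


def clean_partition(events):
    """Map step: drop incomplete records and exact duplicates in one partition."""
    seen, kept = set(), []
    for e in events:
        if any(e.get(k) in (None, "") for k in REQUIRED):
            continue                      # incomplete record
        key = tuple(e[k] for k in REQUIRED)
        if key in seen:
            continue                      # duplicate record
        seen.add(key)
        kept.append(e)
    return kept


def aggregate(events):
    """Reduce step: filter well-formed but uninformative events, then merge
    the remaining events per (user, page)."""
    merged = defaultdict(lambda: {"visits": 0, "dwell": 0.0, "actions": set()})
    for e in events:
        if e["action"] == "view" and e.get("dwell", 0.0) < MIN_DWELL_SECONDS:
            continue                      # correct but invalid: a bounce
        slot = merged[(e["user_id"], e["page_url"])]
        if e["action"] == "view":
            slot["visits"] += 1
        slot["dwell"] += e.get("dwell", 0.0)
        slot["actions"].add(e["action"])
    return dict(merged)


def preprocess_and_aggregate(events, partitions=4):
    # Partition by user so that duplicates of a record share a partition.
    chunks = [[] for _ in range(partitions)]
    for e in events:
        chunks[hash(e.get("user_id")) % partitions].append(e)
    with ThreadPoolExecutor(max_workers=partitions) as pool:
        cleaned = [e for part in pool.map(clean_partition, chunks) for e in part]
    return aggregate(cleaned)
```

Partitioning by user rather than round-robin is what lets each partition deduplicate on its own without a second global pass.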
103. Build a user behavior data ontology model from the aggregated user behavior data, and store it in a database.
In this embodiment of the present invention, the user behavior data ontology model is built using the OWL-DL description language, and the ontology model is decomposed; the database is an open-source, non-relational distributed database.
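One possible reading of step 103 is sketched below. The source does not say which classes and properties the ontology contains or how it is "decomposed", so the namespace `NS`, the `User`/`Page`/`Visit` classes, the property names, and the one-row-per-subject layout are all assumptions. The triples use OWL vocabulary as plain strings for illustration; this is not an OWL serialization, and a real implementation would more likely use an OWL toolkit.

```python
from collections import defaultdict

NS = "http://example.org/behavior#"  # hypothetical namespace

# Schema-level statements (classes and properties), written as plain triples.
SCHEMA = [
    (NS + "User", "rdf:type", "owl:Class"),
    (NS + "Page", "rdf:type", "owl:Class"),
    (NS + "Visit", "rdf:type", "owl:Class"),
    (NS + "byUser", "rdf:type", "owl:ObjectProperty"),
    (NS + "onPage", "rdf:type", "owl:ObjectProperty"),
    (NS + "dwellSeconds", "rdf:type", "owl:DatatypeProperty"),
    (NS + "visitCount", "rdf:type", "owl:DatatypeProperty"),
]


def individuals(aggregated):
    """Instance-level triples for aggregated {(user, page): stats} records."""
    triples = []
    for (user, page), stats in sorted(aggregated.items()):
        u = NS + "user/" + user
        p = NS + "page/" + page.strip("/")
        v = NS + "visit/" + user + "/" + page.strip("/")
        triples += [
            (u, "rdf:type", NS + "User"),
            (p, "rdf:type", NS + "Page"),
            (v, "rdf:type", NS + "Visit"),
            (v, NS + "byUser", u),
            (v, NS + "onPage", p),
            (v, NS + "dwellSeconds", stats["dwell"]),
            (v, NS + "visitCount", stats["visits"]),
        ]
    return triples


def decompose(triples):
    """Split the model into one row per subject: {row_key: {column: [values]}}.

    This mirrors how a wide-column NoSQL store keys data by row and column.
    """
    rows = defaultdict(lambda: defaultdict(list))
    for s, p, o in triples:
        rows[s][p].append(o)
    return {s: dict(cols) for s, cols in rows.items()}


aggregated = {("u1", "/a"): {"visits": 2, "dwell": 30.0}}
rows = decompose(SCHEMA + individuals(aggregated))
```

With this layout, everything known about one visit, user, or page can be fetched with a single row lookup, which is the access pattern non-relational distributed stores handle well.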
104. Analyze the user behavior data ontology model to find the user's latest interest data.
In this embodiment of the present invention, the aggregated user behavior data is added to the user behavior data ontology model, and the ontology model data stored in the database is analyzed to find the user's latest interest data.
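The source does not describe how the "latest interest" is derived from the model in step 104. One simple way to favor recent behavior is recency-weighted scoring, sketched below as a swapped-in technique rather than the patent's method. The half-life, the action weights, the notion of a per-event `topic`, and the function name `latest_interests` are all assumptions.

```python
from collections import defaultdict

HALF_LIFE_DAYS = 7.0  # assumed: an interaction's weight halves every week
ACTION_WEIGHT = {"view": 1.0, "bookmark": 3.0, "print": 3.0, "save": 3.0}  # assumed


def latest_interests(events, now, top_n=3):
    """Rank topics for one user by recency-weighted interaction strength.

    `events` is an iterable of (topic, action, day) tuples, where `day` and
    `now` are timestamps measured in days.
    """
    scores = defaultdict(float)
    for topic, action, day in events:
        age = max(now - day, 0.0)
        decay = 0.5 ** (age / HALF_LIFE_DAYS)   # exponential time decay
        scores[topic] += ACTION_WEIGHT.get(action, 1.0) * decay
    # Highest score first; ties broken alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [topic for topic, _ in ranked[:top_n]]


history = [("cameras", "view", 0.0)] * 5 + [("laptops", "bookmark", 27.0)]
top = latest_interests(history, now=28.0)
```

In this example, five four-week-old page views of `cameras` are outweighed by a single bookmark of `laptops` made the day before, which is the behavior a "latest interest" measure is meant to capture.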
In the method described in FIG. 1, user behavior data is collected; the user behavior data is preprocessed and aggregated using a parallel computing model; and a user behavior data ontology model is built from the aggregated user behavior data and stored in a database. Thus, by implementing this embodiment of the present invention, the processing power and large-scale data storage capability of cloud computing are combined with ontologies, ontology analysis, and knowledge discovery methods to analyze massive user behavior data in real time and capture user interests promptly, thereby enabling effective and accurate pushing of content to users.
The following is a system embodiment of the present invention. The system embodiment is used to carry out the method implemented in method Embodiment 1 of the present invention. For ease of description, only the parts related to this embodiment are shown; for specific technical details that are not disclosed, refer to Embodiments 1 to 2 of the present invention.
Referring to FIG. 2, FIG. 2 is a structural diagram of a user behavior analysis system based on big data mining according to the second embodiment of the present invention. As shown in FIG. 2, the system may include:
A collection unit 201, configured to collect user behavior data.
The user behavior data includes: the subject of the behavior (the user), the time of occurrence, the page on which the behavior occurred, scrolling the page up and down, moving or clicking the mouse, page dwell time, bookmarking, printing, saving, the number of visits to the same page, copy-and-paste operations on text, the current user's search conditions, and the titles corresponding to the search keywords.
A preprocessing unit 202, configured to preprocess and aggregate the user behavior data using a parallel computing model.
The preprocessing includes: removing incomplete data, and deleting duplicate data, pictures, and page animations; print, bookmark, save, and download operations performed on a page are, once captured, converted into the corresponding data format and saved in the database.
The data aggregation includes: filtering and consolidating user behavior information that is correct but invalid, using a rule-based user behavior aggregation algorithm.
A modeling unit 203, configured to build a user behavior data ontology model from the aggregated user behavior data and store it in a database.
In this embodiment of the present invention, the user behavior data ontology model is built using the OWL-DL description language, and the ontology model is decomposed; the database is an open-source, non-relational distributed database.
An analysis unit 204, configured to analyze the user behavior data ontology model to find the user's latest interest data.
In this embodiment of the present invention, the aggregated user behavior data is added to the user behavior data ontology model, and the ontology model data stored in the database is analyzed to find the user's latest interest data.
In the system described in FIG. 2, the collection unit collects user behavior data; the preprocessing unit preprocesses and aggregates the user behavior data using a parallel computing model; and the modeling unit builds a user behavior data ontology model from the aggregated user behavior data and stores it in a database. Thus, by implementing this embodiment of the present invention, the processing power and large-scale data storage capability of cloud computing are combined with ontologies, ontology analysis, and knowledge discovery methods to analyze massive user behavior data in real time and capture user interests promptly, thereby enabling effective and accurate pushing of content to users.
Referring to FIG. 3, FIG. 3 is a schematic structural diagram of a terminal device according to an embodiment of the present invention. As shown in FIG. 3, for ease of description, only the parts related to this embodiment are shown; for specific technical details that are not disclosed, refer to the method part of the embodiments of the present invention. The terminal may include a processor 301, a memory 302, and a collector 303; the processor 301, the memory 302, and the collector 303 are connected through a communication bus 304.
In the foregoing embodiments, the method flow of each step may be implemented based on the structure of this terminal device. Both the application layer and the operating system kernel may be regarded as components of the abstracted structure of the processor 301.
In this embodiment of the present invention, the processor 301 calls program code stored in the memory 302 to perform the following operations:
collecting user behavior data;
preprocessing and aggregating the user behavior data using a parallel computing model;
building a user behavior data ontology model from the aggregated user behavior data, and storing it in a database.
The user behavior data includes: the subject of the behavior (the user), the time of occurrence, the page on which the behavior occurred, scrolling the page up and down, moving or clicking the mouse, page dwell time, bookmarking, printing, saving, the number of visits to the same page, copy-and-paste operations on text, the current user's search conditions, and the titles corresponding to the search keywords.
The preprocessing includes: removing incomplete data, and deleting duplicate data, pictures, and page animations; print, bookmark, save, and download operations performed on a page are, once captured, converted into the corresponding data format and saved in the database.
The data aggregation includes: filtering and consolidating user behavior information that is correct but invalid, using a rule-based user behavior aggregation algorithm.
In the terminal device described in FIG. 3, the collector 303 collects user behavior data; the processor 301 preprocesses and aggregates the user behavior data using a parallel computing model, builds a user behavior data ontology model from the aggregated user behavior data, and stores it in a database. Thus, by implementing this embodiment of the present invention, the processing power and large-scale data storage capability of cloud computing are combined with ontologies, ontology analysis, and knowledge discovery methods to analyze massive user behavior data in real time and capture user interests promptly, thereby enabling effective and accurate pushing of content to users.
As an optional implementation, the processor 301 further calls the program code stored in the memory 302 to perform the following operation:
analyzing the user behavior data ontology model to find the user's latest interest data.
An embodiment of the present invention further provides a computer storage medium. The computer storage medium may store a program that, when executed, performs some or all of the steps of any of the user behavior analysis methods described in the above method embodiments.
It should be noted that, for brevity, each of the foregoing method embodiments is described as a series of combined actions. Those skilled in the art should understand, however, that the present invention is not limited by the described order of actions, because according to the present invention some steps may be performed in another order or simultaneously. Those skilled in the art should also understand that the embodiments described in the specification are all preferred embodiments, and the actions and units involved are not necessarily required by the present invention.
The order of the steps of the method in the embodiments of the present invention may be adjusted, and steps may be merged or removed, according to actual needs. The units of the terminal in the embodiments of the present invention may be combined, further divided, or removed according to actual needs.
In the above embodiments, the description of each embodiment has its own emphasis. For parts not described in detail in one embodiment, refer to the related descriptions of the other embodiments.
In the several embodiments provided in this application, it should be understood that the disclosed system may be implemented in other ways. For example, the system embodiments described above are merely illustrative: the division into units is only a division by logical function, and other divisions are possible in an actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical or take other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
It is worth noting that, in the above user behavior analysis system and terminal device embodiments, the units included are divided only according to functional logic; the division is not limited to the one described above, as long as the corresponding functions can be realized. In addition, the specific names of the functional units are only for distinguishing them from one another and are not intended to limit the scope of protection of the present invention.
In addition, a person of ordinary skill in the art can understand that all or some of the steps of the methods in the above embodiments may be completed by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium, including read-only memory (ROM), random access memory (RAM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), one-time programmable read-only memory (OTPROM), electrically erasable programmable read-only memory (EEPROM), compact disc read-only memory (CD-ROM) or other optical disc storage, magnetic disk storage, magnetic tape storage, or any other computer-readable medium that can be used to carry or store data.
The above are only preferred specific embodiments of the present invention, but the scope of protection of the present invention is not limited to them. Any change or replacement that a person skilled in the art could readily conceive within the technical scope disclosed by the embodiments of the present invention shall fall within the scope of protection of the present invention. Therefore, the scope of protection of the present invention shall be determined by the scope of protection of the claims.

Claims (10)

  1. A user behavior analysis method based on big data mining, characterized by comprising:
    collecting user behavior data;
    preprocessing and aggregating the user behavior data using a parallel computing model;
    building a user behavior data ontology model from the aggregated user behavior data, and storing it in a database.
  2. The method according to claim 1, characterized in that the method further comprises:
    analyzing the user behavior data ontology model to find the user's latest interest data.
  3. The method according to claim 1, characterized in that
    the user behavior data includes the subject of the behavior, the time of occurrence, the page on which the behavior occurred, scrolling the page up and down, moving or clicking the mouse, page dwell time, bookmarking, printing, saving, the number of visits to the same page, copy-and-paste operations on text, the current user's search conditions, and the titles corresponding to the search keywords.
  4. The method according to claim 1, characterized in that
    the preprocessing includes: removing incomplete data, and deleting duplicate data, pictures, and page animations; print, bookmark, save, and download operations performed on a page are, once captured, converted into the corresponding data format and saved in the database;
    the data aggregation includes: filtering and consolidating user behavior information that is correct but invalid, using a rule-based user behavior aggregation algorithm.
  5. The method according to claim 4, characterized in that building the user behavior data ontology model specifically comprises:
    building the user behavior data ontology model using the OWL-DL description language, and decomposing the ontology model, wherein the database is an open-source, non-relational distributed database.
  6. A user behavior analysis system based on big data mining, characterized by comprising:
    a collection unit, configured to collect user behavior data;
    a preprocessing unit, configured to preprocess and aggregate the user behavior data using a parallel computing model;
    a modeling unit, configured to build a user behavior data ontology model from the aggregated user behavior data and store it in a database.
  7. The system according to claim 6, characterized in that the system further comprises:
    an analysis unit, configured to analyze the user behavior data ontology model to find the user's latest interest data.
  8. The system according to claim 7, characterized in that the user behavior data includes the subject of the behavior, the time of occurrence, the page on which the behavior occurred, scrolling the page up and down, moving or clicking the mouse, page dwell time, bookmarking, printing, saving, the number of visits to the same page, copy-and-paste operations on text, the current user's search conditions, and the titles corresponding to the search keywords.
  9. The system according to claim 7, characterized in that
    the preprocessing includes: removing incomplete data, and deleting duplicate data, pictures, and page animations; print, bookmark, save, and download operations performed on a page are, once captured, converted into the corresponding data format and saved in the database;
    the data aggregation includes: filtering and consolidating user behavior information that is correct but invalid, using a rule-based user behavior aggregation algorithm.
  10. The system according to claim 6, characterized in that
    the modeling unit is specifically configured to: build the user behavior data ontology model using the OWL-DL description language, and decompose the ontology model, wherein the database is an open-source, non-relational distributed database.
PCT/CN2017/072375 2017-01-24 2017-01-24 User behavior analysis method and system based on big data mining WO2018137104A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2017/072375 WO2018137104A1 (en) 2017-01-24 2017-01-24 User behavior analysis method and system based on big data mining

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2017/072375 WO2018137104A1 (en) 2017-01-24 2017-01-24 User behavior analysis method and system based on big data mining

Publications (1)

Publication Number Publication Date
WO2018137104A1 true WO2018137104A1 (en) 2018-08-02

Family

ID=62977811

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/072375 WO2018137104A1 (en) 2017-01-24 2017-01-24 User behavior analysis method and system based on big data mining

Country Status (1)

Country Link
WO (1) WO2018137104A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111460046A (en) * 2020-03-06 2020-07-28 合肥海策科技信息服务有限公司 Scientific and technological information clustering method based on big data

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103793465A (en) * 2013-12-20 2014-05-14 武汉理工大学 Cloud computing based real-time mass user behavior analyzing method and system
CN104462213A (en) * 2014-12-05 2015-03-25 成都逸动无限网络科技有限公司 User behavior analysis method and system based on big data
CN105447186A (en) * 2015-12-16 2016-03-30 汉鼎信息科技股份有限公司 Big data platform based user behavior analysis system
US20160092774A1 (en) * 2014-09-29 2016-03-31 Pivotal Software, Inc. Determining and localizing anomalous network behavior

Similar Documents

Publication Publication Date Title
US11989707B1 (en) Assigning raw data size of source data to storage consumption of an account
Bordin et al. Dspbench: A suite of benchmark applications for distributed data stream processing systems
CN101334792B (en) A personalized service recommendation system and method
WO2020037917A1 (en) User behavior data recommendation method, server and computer readable medium
CN109684538A (en) A kind of recommended method and recommender system based on individual subscriber feature
CN112115363A (en) Recommendation method, computing device and storage medium
CN116739676A (en) Intelligent advertisement marketing system based on big data
CN106296305A (en) Electric business website real-time recommendation System and method under big data environment
CN103064842B (en) Information subscribing treating apparatus and information subscribing disposal route
CN104216889B (en) Data dissemination analyzing and predicting method and system based on cloud service
Demirbaga HTwitt: a hadoop-based platform for analysis and visualization of streaming Twitter data
CN102063454A (en) Method and equipment combining search and application
WO2018049908A1 (en) Web page generation method and device
CN108255963A (en) A control method and device for Internet-based news information retrieval
CN118608184B (en) User portrait updating method, device, equipment and storage medium
Liu et al. KAT: Knowledge-aware attentive recommendation model integrating two-terminal neighbor features
CN106777367A (en) A kind of user behavior analysis method and system excavated based on big data
Yu et al. A novel framework to alleviate the sparsity problem in context-aware recommender systems
WO2018137104A1 (en) User behavior analysis method and system based on big data mining
Wadhera et al. A systematic Review of Big data tools and application for developments
CN106919653B (en) Log filtering method based on user behavior
CN116506498A (en) Cloud computing-based data accurate pushing method
CN114297234A (en) Method and device for identifying key behavior data
Jung Discovering social bursts by using link analytics on large-scale social networks
CN114519608A (en) Business opportunity extraction method, device, medium and electronic equipment

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17894365

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 17894365

Country of ref document: EP

Kind code of ref document: A1
