CN103368957B

CN103368957B - Method and system for processing webpage access behavior, client, and server

Info

Publication number: CN103368957B
Application number: CN201310279888.2A
Authority: CN
Inventors: 肖鹏; 郑劲松; 刘起; 符云
Original assignee: Beijing Qihoo Technology Co Ltd; Qizhi Software Beijing Co Ltd
Current assignee: Beijing Qizhi Business Consulting Co ltd; Beijing Qihoo Technology Co Ltd; 360 Digital Security Technology Group Co Ltd
Priority date: 2013-07-04
Filing date: 2013-07-04
Publication date: 2017-03-15
Anticipated expiration: 2033-07-04
Also published as: CN103368957A

Abstract

The invention discloses a method, a system, a client, and a server for processing webpage access behavior. The method includes: after an access request for the i-th level page is detected, obtaining a refer chain containing the page ID of the i-th level page, the refer chain comprising the page IDs and URLs of the initial page through the i-th level page; sending all URLs contained in the refer chain to a server, so that the server queries whether those URLs belong to the blacklist and/or whitelist database it maintains and then matches the query results against preset rules to obtain a matching result; and receiving the matching result returned by the server and processing the access to the i-th level page according to that result. Compared with the prior art, the invention achieves higher detection efficiency because the refer chain provides more URLs and wider coverage, and it can protect client web browsing more effectively.

Description

Method and system for processing webpage access behavior, client, server

Technical Field

The invention relates to the field of computer network technology, and in particular to a method and system for processing webpage access behavior, a client, and a server.

Background

Malicious websites, such as phishing, fraud, and counterfeit websites, typically imitate the URL or page content of a genuine website in order to pose as a banking or e-commerce site, or exploit vulnerabilities in a genuine website's server programs to insert dangerous code into some of its pages. Either way, the goal is to trick users into giving up private information such as bank or credit card account numbers and passwords. Malicious webpages contain many sensitive features. For example, financial-fraud pages imitate the text and images of the official website, or insert fake ticketing, fake prize, fake online banking, or fake shopping information into genuine pages. Most of these features appear in the page as text strings.

Current methods of identifying malicious webpages rely mainly on manual review: reviewers collect simple text features of malicious websites, and browser plug-ins use those features to judge page content and filter out the reported attack sites. However, malicious websites now have ever shorter lifespans, new malicious pages appear constantly, and the volume of pages to review is too large. The features of malicious websites also change faster, so extracting information by traditional manual review is inefficient.

The main existing safeguard against malicious websites works as follows: when a user visits a website, the client sends the website's URL to the server, which looks it up in its blacklist and whitelist databases. The blacklist database lists the URLs of websites that have been reviewed and confirmed as malicious; the whitelist database lists the URLs of websites that have been reviewed and confirmed as safe. After the lookup, the server reports back to the client whether the website is malicious.

This existing approach can only check a single URL. Because the URLs of malicious websites change constantly, and the server-side blacklist and whitelist databases are updated far more slowly than malicious websites change, checking a single URL cannot detect malicious websites effectively and therefore cannot protect the client's web browsing quickly and effectively in real time.

Summary of the Invention

In view of the above problems, the present invention provides a system, a client, a server, and a corresponding method for processing webpage access behavior that overcome, or at least partially solve, those problems.

According to one aspect of the present invention, a method for processing webpage access behavior is provided, for checking the i-th level page opened through the i-th level link from the initial page, where i≥2. The method includes:

after an access request for the i-th level page is detected, obtaining a refer chain containing the page ID of the i-th level page, the refer chain comprising the page IDs and URLs of the initial page through the i-th level page;

sending all URLs contained in the refer chain to a server, so that the server queries whether those URLs belong to the blacklist and/or whitelist database maintained by the server, and then matches the query results against preset rules to obtain a matching result;

receiving the matching result returned by the server, and processing the access to the i-th level page according to the matching result.

According to another aspect of the present invention, a client is provided, for checking the i-th level page opened through the i-th level link from the initial page, where i≥2. The client includes:

a monitoring module, adapted to obtain, after an access request for the i-th level page is detected, a refer chain containing the page ID of the i-th level page, the refer chain comprising the page IDs and URLs of the initial page through the i-th level page;

a query interface, adapted to send all URLs contained in the refer chain to a server, so that the server queries whether those URLs belong to the blacklist and/or whitelist database maintained by the server and then matches the query results against preset rules to obtain a matching result; and to receive the matching result returned by the server;

a protection module, adapted to process the access to the i-th level page according to the matching result.

According to another aspect of the present invention, a server is provided, for checking the i-th level page opened through the i-th level link from the initial page, where i≥2. The server includes:

a blacklist and/or whitelist database, adapted to store the URLs that belong to the blacklist and/or whitelist;

a query interface, adapted to receive all URLs contained in the refer chain sent by the client, query whether those URLs belong to the blacklist and/or whitelist database, match the query results against preset rules to obtain a matching result, and return the matching result to the client.

According to another aspect of the present invention, a system for processing webpage access behavior is provided, comprising the client and the server described above.

With the method, system, client, and server provided by the present invention, whenever the client detects an access request for a new page reached through the successive links from the initial page, it obtains the refer chain corresponding to that new page and reports all URLs in the chain to the server. The server derives a matching result from those URLs, and the client processes the access to the new page according to that result. Compared with the prior art, which checks only the URL of the new page, the refer chain provides more URLs and wider coverage, so detection is more efficient and the client's web browsing is protected more effectively.

The above is only an overview of the technical solution of the present invention. So that the technical means of the invention can be understood more clearly and implemented according to the description, and so that the above and other objects, features, and advantages of the invention are more readily apparent, specific embodiments of the invention are set out below.

Brief Description of the Drawings

Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings serve only to illustrate the preferred embodiments and are not to be considered as limiting the invention. Throughout the drawings, the same reference symbols denote the same parts. In the drawings:

Fig. 1 shows a flowchart of a method for processing webpage access behavior according to an embodiment of the present invention;

Fig. 2 shows a flowchart of a method for creating a refer chain according to an embodiment of the present invention;

Fig. 3 shows a structural block diagram of a client according to an embodiment of the present invention;

Fig. 4 shows a structural block diagram of a server according to an embodiment of the present invention;

Fig. 5 shows a structural block diagram of a system for processing webpage access behavior according to an embodiment of the present invention.

Detailed Description

Exemplary embodiments of the present disclosure are described in more detail below with reference to the accompanying drawings. Although the drawings show exemplary embodiments of the present disclosure, it should be understood that the disclosure may be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the disclosure can be understood more thoroughly and its scope conveyed fully to those skilled in the art.

The prior art checks a single URL, which cannot detect malicious websites effectively and therefore cannot protect the client's web browsing quickly and effectively in real time. To address this technical problem, the present invention provides a scheme that uses a refer chain to process webpage access behavior. For the page the user is currently visiting, the refer information is the URL of that page's parent page, that is, the URL of the previous-level page that linked to the current page. The invention builds a refer chain from the URLs of the several levels of pages that lead to the current page, and uses that chain to process webpage access behavior.

Fig. 1 shows a flowchart of a method 100 for processing webpage access behavior according to an embodiment of the present invention. In this embodiment, the current page is called the i-th level page, i≥2, and it is the page opened through the i-th level link from the initial page. Typically, after the user opens the browser, the browser either loads a default initial page or the user triggers an access request for an initial page by typing in the address bar. The user reaches the level-2 page by clicking a link on the initial page (or through some other form of linking), reaches the level-3 page from the level-2 page in the same way, and so on, until the (i-1)-th level page links to the i-th level page.

For example, the user opens the browser and enters www.so.com in the address bar; this page is the initial page (its URL is denoted A below). The user then enters "话费充值" (phone credit top-up) in the search box and clicks the search button, and the browser jumps to http://www.so.com/s?ie=utf-8&src=360sou_home&q=%E8%AF%9D%E8%B4%B9%E5%85%85%E5%80%BC; this is the level-2 page (its URL is denoted B below). The level-2 page offers many links; the user clicks one of them, and the browser jumps to the corresponding page http://chongzhi.360.cn/mobile/, which is the level-3 page (its URL is denoted C below). On the level-3 page the user clicks the "网游点卡" (online game point card) link, and the browser jumps to http://chongzhi.360.cn/GameCard/index, which is the level-4 page (its URL is denoted D below).

As shown in Fig. 1, method 100 starts at step S101, in which the client's browser monitors for an access request for the i-th level page. This access request is triggered when the user clicks a link on the (i-1)-th level page, or by some other form of linking. In the example above, when the user clicks the "网游点卡" (online game point card) link on the level-3 page, the browser detects an access request for the level-4 page, http://chongzhi.360.cn/GameCard/index.

After detecting the access request for the i-th level page, the browser loads that page and, while loading it, obtains the refer chain containing the page ID of the i-th level page; this is step S102. The refer chain contains the page IDs and URLs of the initial page through the i-th level page. The page ID of each page is a unique ID that the browser generates for the page while loading it, and in the refer chain it serves as the index for that page's URL. Using the page ID of the i-th level page, the browser looks up the refer chain that contains the URL of the i-th level page and has the i-th level page as its last page. For example, in the refer chain A(ID1)->B(ID2)->C(ID3)->D(ID4), A, B, C, and D are the URLs of the pages at each level, and ID1, ID2, ID3, and ID4 are their page IDs. When the browser loads page D, it finds this refer chain by page D's page ID, ID4.

In the example above, the following refer chain is obtained while the level-4 page is loading:

A(ID1)->B(ID2)->C(ID3)->D(ID4)
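The lookup in step S102 can be sketched as follows. This is a minimal illustration in Python, not the patent's implementation: the `Node` type and the `find_chain` and `chain_urls` helpers are hypothetical names, and storing the chains as plain lists is an assumption made for brevity.

```python
from typing import List, NamedTuple, Optional


class Node(NamedTuple):
    page_id: str  # unique ID the browser generated while loading the page
    url: str      # URL of the page at this level


def find_chain(chains: List[List[Node]], page_id: str) -> Optional[List[Node]]:
    """Return the chain whose LAST node carries page_id, or None if there is none."""
    for chain in chains:
        if chain and chain[-1].page_id == page_id:
            return chain
    return None


def chain_urls(chain: List[Node]) -> List[str]:
    """Extract only the URLs, in order from the initial page to the current page."""
    return [node.url for node in chain]
```

With the chain A(ID1)->B(ID2)->C(ID3)->D(ID4) stored, `find_chain(chains, "ID4")` returns that chain, while a lookup by ID3 returns nothing unless a separate chain ending at C is also stored.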

After step S102, method 100 proceeds to step S103, in which the client sends all URLs contained in the refer chain to the server. The client may report only the URLs of the pages in the refer chain; it does not need to report their page IDs. For the refer chain A(ID1)->B(ID2)->C(ID3)->D(ID4), the client sends A->B->C->D to the server. Optionally, depending on the cloud query protocol agreed with the server, the method may encrypt all URLs in the refer chain and send them to the server as ciphertext. The URLs may be encrypted with either a reversible or an irreversible encryption method. For example, a feature value may be computed for each URL in the refer chain and used as the ciphertext. Optionally, the feature value may be a hash computed with MD5 (Message Digest Algorithm 5), a SHA1 (Secure Hash Algorithm) code, a CRC (Cyclic Redundancy Check) code, or another feature code that can uniquely identify the original information. Note that before the ciphertext of a URL is uploaded to the cloud security server, URL strings that may contain a user password must first be screened out; such URLs are not uploaded, so as to keep user information secure.
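As a rough sketch of the irreversible variant of step S103, the snippet below computes an MD5 feature value per URL and skips URLs that may carry a password. The patent does not say how such URLs are recognized, so the heuristic here (userinfo in the URL, or a query parameter named in the hypothetical `SENSITIVE_KEYS` set) is purely an assumption, as are the function names.

```python
import hashlib
from typing import List
from urllib.parse import parse_qsl, urlsplit

# Hypothetical list of query parameter names treated as password-bearing.
SENSITIVE_KEYS = {"password", "passwd", "pwd"}


def may_contain_password(url: str) -> bool:
    """Heuristic check (an assumption): userinfo password or a sensitive query key."""
    parts = urlsplit(url)
    if parts.password is not None:
        return True
    query = parse_qsl(parts.query, keep_blank_values=True)
    return any(key.lower() in SENSITIVE_KEYS for key, _ in query)


def feature_values(urls: List[str]) -> List[str]:
    """Return the MD5 hex digest of every URL that passes the password screen."""
    return [
        hashlib.md5(url.encode("utf-8")).hexdigest()
        for url in urls
        if not may_contain_password(url)
    ]
```

Dropping a URL shortens the reported chain, so a real client would likely need to tell the server which position was omitted; that detail is outside what the text specifies.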

After step S103, method 100 proceeds to step S104, in which the server queries whether the URLs contained in the refer chain belong to the blacklist and/or whitelist database maintained by the server, and obtains the query results. If the client encrypted the URLs with a reversible method, the server first decrypts the received ciphertext to recover all URLs in the refer chain. In that case the blacklist and/or whitelist database stores URLs, and once the server has the URLs it queries the database to determine whether each URL is on the blacklist or on the whitelist. If the client encrypted the URLs with an irreversible method, the blacklist and/or whitelist database correspondingly stores the feature values of the URLs, and once the server has the feature values of all URLs in the refer chain it queries the database with them to determine whether each URL is on the blacklist or on the whitelist.
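The query in step S104 amounts to a membership test per chain entry. The sketch below assumes in-memory sets and three status labels (`BLACK`, `WHITE`, `UNKNOWN`) that are illustrative names, not terms from the patent. The keys can be plain URLs or feature values, as long as the lists store the same kind of key. Checking the blacklist first is also an assumption.

```python
from typing import Iterable, List, Set

BLACK, WHITE, UNKNOWN = "black", "white", "unknown"


def query_lists(keys: Iterable[str], blacklist: Set[str], whitelist: Set[str]) -> List[str]:
    """Classify each key (URL or feature value) in chain order."""
    results = []
    for key in keys:
        if key in blacklist:
            results.append(BLACK)    # reviewed and confirmed malicious
        elif key in whitelist:
            results.append(WHITE)    # reviewed and confirmed safe
        else:
            results.append(UNKNOWN)  # on neither list
    return results
```

For example, `query_lists(["A", "B", "X"], {"X"}, {"A"})` yields one status per URL, in the same order as the refer chain.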

After step S104, method 100 proceeds to step S105, in which the server matches the query results against preset rules to obtain a matching result. The preset rules are set according to actual requirements and specify the situations that call for a risk warning. Two preset rules are described below as examples:

Rule 1: a search engine leads to a malicious, dangerous, or unknown page

If the query results show that the URL of the i-th level page belongs to the blacklist database (the i-th level page is a malicious or dangerous page), or that the URL of the i-th level page does not belong to the whitelist database (the i-th level page is an unknown page), and any page from the initial page through the (i-1)-th level page is determined to be a search page (the i-th level page was reached through a search engine), then the query results match Rule 1 and the matching result is a risk warning.

Optionally, the server also stores a list of search page URLs. In this step, the server checks whether the URL of any page from the initial page through the (i-1)-th level page belongs to the search page URL list; if so, that page is determined to be a search page. Note that other methods may also be used to identify search pages; the invention is not limited to this one.

Rule 2: a malicious, dangerous, or unknown page leads to a payment page

If the query results show that the URL of any page from the initial page through the (i-1)-th level page belongs to the blacklist database (that page is a malicious or dangerous page), or that the URL of any such page does not belong to the whitelist database (that page is an unknown page), and the i-th level page is determined to be a payment page, then the query results match Rule 2 and the matching result is a risk warning.

Optionally, the server also stores a list of payment page URLs. In this step, the server checks whether the URL of the i-th level page belongs to the payment page URL list; if so, the i-th level page is determined to be a payment page. Note that other methods may also be used to identify payment pages; the invention is not limited to this one.

Rule 1 and Rule 2 above are only two specific examples, and the invention is not limited to them. The server may preset any number of rules for matching the query results, according to actual requirements.
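The two example rules can be expressed compactly over the per-URL query results. The following is a sketch under stated assumptions: statuses are the strings "black", "white", and "unknown"; search and payment pages are recognized by exact membership in URL sets (a simplification of the optional URL lists, since real search URLs carry query strings); and the names `rule_one`, `rule_two`, and `match` are illustrative.

```python
from typing import List, Set

RISK_WARNING, NO_RISK = "risk_warning", "no_risk"


def is_risky(status: str) -> bool:
    """Blacklisted (malicious/dangerous) or not whitelisted (unknown)."""
    return status in ("black", "unknown")


def rule_one(urls: List[str], statuses: List[str], search_urls: Set[str]) -> bool:
    """The last page is risky and some earlier page in the chain is a search page."""
    return is_risky(statuses[-1]) and any(u in search_urls for u in urls[:-1])


def rule_two(urls: List[str], statuses: List[str], payment_urls: Set[str]) -> bool:
    """The last page is a payment page and some earlier page in the chain is risky."""
    return urls[-1] in payment_urls and any(is_risky(s) for s in statuses[:-1])


def match(urls: List[str], statuses: List[str],
          search_urls: Set[str], payment_urls: Set[str]) -> str:
    """Return a risk warning if either example rule matches the query results."""
    if rule_one(urls, statuses, search_urls) or rule_two(urls, statuses, payment_urls):
        return RISK_WARNING
    return NO_RISK
```

Both rules depend on earlier entries in the chain, which a check of the current URL alone could not see.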

After step S105, method 100 proceeds to step S106, in which the client receives the matching result returned by the server.

Method 100 then proceeds to step S107, in which the client's browser processes the access to the i-th level page according to the matching result. If the received matching result is a risk warning, the browser alerts the user to the risk. Optionally, the browser may offer the user a choice between blocking the current page and continuing to visit it; if the user chooses to block the page, the browser blocks the access. The risk alert may take the form of marking the problematic page in the user interface, or of showing a floating tooltip when the mouse moves over the page. If the access is to be blocked, the problematic page may simply be hidden or covered.
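The client-side decision in step S107 can be sketched as a small function. The result string, the returned action labels, and the `ask_user_to_block` callback are all hypothetical; in a real browser the callback would show the warning and return the user's choice.

```python
from typing import Callable

RISK_WARNING = "risk_warning"


def handle_match_result(result: str, ask_user_to_block: Callable[[], bool]) -> str:
    """Decide what to do with the i-th level page given the server's matching result."""
    if result != RISK_WARNING:
        return "load"               # no rule matched: load the page normally
    if ask_user_to_block():
        return "block"              # user chose to block: hide or cover the page
    return "load_with_warning"      # user chose to continue despite the warning
```

For instance, `handle_match_result("risk_warning", lambda: True)` selects the blocking path.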

With the method for processing webpage access behavior provided by this embodiment, whenever an access request is detected for a new page reached through the successive links from the initial page, the refer chain corresponding to that new page is obtained and all URLs in the chain are reported to the server. The server derives a matching result from those URLs, and the client processes the access to the new page according to that result. Compared with the prior art, which checks only the URL of the new page, the refer chain provides more URLs and wider coverage, so detection is more efficient and the client's web browsing is protected more effectively.

Further, building on the above embodiment, a process of creating the refer chain takes place before step S102. Existing browsers provide an interface for obtaining the refer information of a URL, namely the get_refer interface. However, the refer information obtained through get_refer contains only the URL of the page visited immediately before the current page, that is, the URL of the previous-level page that linked to the current page. In addition, a relatively long time passes between a page being opened and the get_refer interface becoming usable, so waiting for the interface before running the check would take too long. To obtain, in real time, a refer chain made up of the URLs of the pages at every level, the invention provides a method for creating the refer chain: whenever a new page is opened through the successive links from the initial page, the process responsible for maintaining refer chains obtains the page ID and URL of the new page, together with the page ID or URL of the new page's parent page, looks up the corresponding refer chain by the parent page's page ID or URL, and creates the corresponding node of the refer chain.

Fig. 2 shows a flowchart of a method 200 for creating a refer chain according to an embodiment of the present invention. As shown in Fig. 2, method 200 starts at the level-1 node creation step S201. In step S201, after an access request for the initial page is detected, the page ID of the initial page is generated, the URL of the initial page is obtained, the level-1 node of the refer chain is created, and the page ID and URL of the initial page are written into the refer chain as the information of the level-1 node. The browser's default page, or a page whose access the user triggers by typing in the address bar, is treated as an initial page, and a new refer chain is created for it. Specifically, after detecting the access request for the initial page, the browser loads that page. While loading it, the browser generates a unique ID as the page ID of the initial page and obtains the page's URL. The URL of the initial page can be obtained through a designated response event interface, for example one implemented through a standard plug-in mechanism.

In the IE (Internet Explorer) browser, the Browser Helper Object (BHO) plug-in mechanism can be used to obtain the URL that IE is currently loading by responding to the "BeforeNavigate2" event. In the Firefox browser, the designated response event interface provided by the Firefox extension mechanism is used to obtain the URL that Firefox is currently loading. In the Google Chrome browser, the Netscape Plugin Application Programming Interface (NPAPI) plug-in mechanism is used to obtain the URL that Chrome is currently loading. After the page ID (for example ID1) and URL (for example A) of the initial page are obtained, ID1 and A are used as the information of the level-1 node, and the refer chain is created as A(ID1), where ID1 is the index information.
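Step S201 can be illustrated with a short sketch. How the browser actually generates its unique page IDs is not specified, so the counter-based `new_page_id` below is an assumption, and `create_refer_chain` is a hypothetical name. The URL is taken as a parameter because obtaining it depends on the browser-specific event interface.

```python
import itertools
from typing import List, Tuple

# Process-wide counter standing in for the browser's unique ID generator (assumption).
_page_counter = itertools.count(1)


def new_page_id() -> str:
    """Return an ID that is unique within this process."""
    return f"ID{next(_page_counter)}"


def create_refer_chain(initial_url: str) -> List[Tuple[str, str]]:
    """Create a new chain whose only node is the initial page: [(page_id, url)]."""
    return [(new_page_id(), initial_url)]
```

Calling `create_refer_chain("http://www.so.com/")` produces a one-node chain corresponding to A(ID1) in the text.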

Note that in practice users' computing environments differ, for example in operating system and browser type, so the steps described above may be carried out by different kinds of executing entity. One example is a browser with recognition and marking functions, which may be Internet Explorer (IE), the browser bundled with the Windows operating system, or another third-party browser. A third-party browser usually means non-IE browser software running on Windows; such browsers often offer rich, distinctive features and personalized extensions that give users many convenient applications. The same plug-in mechanism can run in many types of browser, for example IE, Firefox, Google Chrome, Safari, Opera, QQ Browser, Maxthon, Sogou Browser, or Cheetah Browser.

After the first-level node creation step S201, method 200 enters a loop that creates the i-th level nodes. Starting from i=2, method 200 proceeds to step S202, in which, after an access request for the i-th level page is detected, a page ID of the i-th level page is generated, and the URL of the i-th level page and the page ID or URL of the (i-1)-th level page are obtained; the i-th level page is a page-level jump page of the (i-1)-th level page. Herein, a link from the (i-1)-th level page to the i-th level page that is triggered by the user clicking a link on the (i-1)-th level page, or by other user behavior, is called a page-level jump. After the browser detects the access request for an i-th level page reached by a page-level jump, it loads the i-th level page. While loading the i-th level page, the browser generates a unique ID as the page ID of that page and obtains its URL. The URL of the i-th level page can be obtained through a designated response event interface, for example one implemented via a standard plug-in mechanism. For details, see the earlier description of how the URL of the initial page is obtained.

To find the corresponding refer chain and continue creating nodes on it, step S202 also needs to obtain the page ID or URL of the (i-1)-th level page. The present invention provides two different ways of obtaining the information of the (i-1)-th level page, depending on how the browser opens the new page. One way (Method 1 below) applies when the i-th level page is opened in a new window or a new tab; the other (Method 2 below) applies when the i-th level page is opened in the current window or current tab.

Method 1:

First, after the access request for the i-th level page is detected, the interface object pointer of the i-th level page is obtained, and, via that pointer, the page ID of the (i-1)-th level page obtained while loading the (i-1)-th level page is written into the interface object of the i-th level page. Then, while the i-th level page is loading, the page ID of the (i-1)-th level page is obtained by reading the information provided by the interface object of the i-th level page.

Method 1 applies when the i-th level page is opened in a new window or a new tab. Taking the IE browser as an example: by analyzing how IE opens a new window or new tab, the handler function that IE's internal module calls to create the new window or tab can be identified. That function is hooked, and its return value is used to obtain the interface object pointer, such as the IWEBBROWSER2 pointer, of the new window or tab (the one that will load the i-th level page). Because the browser has not yet begun loading the i-th level page at this point, the page ID it records for the current page is still that of the (i-1)-th level page, obtained while that page was loading. The browser can therefore use the interface object pointer to write the page ID of the (i-1)-th level page into the IWEBBROWSER2 object. Once loading of the i-th level page has begun, the page ID of the (i-1)-th level page can be obtained by reading the information provided by the IWEBBROWSER2 object of the i-th level page.
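The hand-off in Method 1 can be modeled roughly as below. This is a plain Python simulation under stated assumptions, not IE code: `TabObject` stands in for the interface object of the new window or tab, and `Tracker`, `on_new_tab_created`, `on_page_loading`, and the `prev_page_id` key are hypothetical names chosen for illustration.

```python
class TabObject:
    """Stand-in for the interface object of a new window or tab."""

    def __init__(self):
        self.properties = {}


class Tracker:
    def __init__(self):
        # Page ID recorded while the current, (i-1)-th level page was loading.
        self.current_page_id = None

    def on_new_tab_created(self, tab):
        # Runs from the hooked creation function, before the new page starts
        # loading, so current_page_id still belongs to the previous page.
        tab.properties["prev_page_id"] = self.current_page_id

    def on_page_loading(self, tab, new_page_id):
        # While the i-th level page loads, read back the stored previous ID.
        prev_page_id = tab.properties.get("prev_page_id")
        self.current_page_id = new_page_id
        return prev_page_id


tracker = Tracker()
tracker.current_page_id = "ID1"          # page A is currently displayed
new_tab = TabObject()
tracker.on_new_tab_created(new_tab)      # user opens a link in a new tab
prev = tracker.on_page_loading(new_tab, "ID2")
print(prev)  # ID1
```

Storing the previous page's ID on the new tab's own object ties the parent to that specific tab, which is why this route does not depend on the order in which tabs finish loading.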

Method 2:

After the access request for the i-th level page is detected and before the i-th level page is loaded, the URL of the (i-1)-th level page is obtained through the get_locationURL interface provided by the browser.

Method 2 applies when the i-th level page is opened in the current window or current tab. In this case, because no new window or tab is opened, the page ID of the (i-1)-th level page cannot be obtained in a manner similar to Method 1. However, after the access request for the i-th level page is detected but before the "BeforeNavigate2" event of the i-th level page, the get_locationURL interface still returns the URL of the (i-1)-th level page, so that interface can be used to obtain the URL of the (i-1)-th level page.

However, after the URL of the (i-1)-th level page has been obtained through the get_locationURL interface provided by the browser, it must also be determined whether the opening of the i-th level page was triggered by input in the browser address bar; specifically, this can be determined from click and input actions in the address bar. If so, the URL of the (i-1)-th level page obtained through get_locationURL is cleared, the i-th level page is treated as an initial page, and step S201 is executed. If not, step S203 is executed.
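The branch described above can be written as a small decision function. This is only a sketch: the function name `resolve_previous_url` and its two parameters are hypothetical, and how address-bar input is actually detected is left outside the example.

```python
def resolve_previous_url(location_url, typed_in_address_bar):
    """Return (next_step, prev_url) for a navigation in the current tab.

    location_url: URL reported for the page that is still displayed,
        i.e. the (i-1)-th level page.
    typed_in_address_bar: True if the navigation came from address-bar input.
    """
    if typed_in_address_bar:
        # The new page is unrelated to the previous one: discard the URL
        # and start a new refer chain (step S201).
        return "S201", None
    # Otherwise continue the existing chain (step S203).
    return "S203", location_url


print(resolve_previous_url("A", False))  # ('S203', 'A')
print(resolve_previous_url("A", True))   # ('S201', None)
```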

Method 1 and Method 2 address different situations. If the i-th level page is opened in a new window or new tab, step S202 obtains the page ID of the (i-1)-th level page by Method 1; if the i-th level page is opened in the current window or current tab, step S202 obtains the URL of the (i-1)-th level page by Method 2. If step S202 obtains the page ID of the (i-1)-th level page, the corresponding refer chain is subsequently looked up by that page ID; if step S202 obtains the URL of the (i-1)-th level page, the corresponding refer chain is subsequently looked up by that URL.

After step S202, method 200 proceeds to step S203, in which the refer chain containing the page ID or URL of the (i-1)-th level page is looked up, the i-th level node of that refer chain is created, and the page ID and URL of the i-th level page are used as the information of the i-th level node.

Specifically, if the page ID of the (i-1)-th level page was obtained in step S202 by Method 1, the refer chain containing that page ID is looked up directly. For example, if step S202 yields a second-level page with page ID ID2 and URL B, and a first-level page (the initial page) with page ID ID1, then this step looks up the refer chain that contains ID1 and whose last-level node has ID1 as its index information, namely A(ID1). The second-level node of that refer chain is created with ID2 and B as its information, giving the refer chain A(ID1)->B(ID2). If the URL of the (i-1)-th level page was obtained in step S202 by Method 2, the refer chain containing that URL must be looked up. Because the process that maintains refer chains may hold several chains containing the same URL, this lookup may return multiple refer chains containing the URL of the (i-1)-th level page. However, in the situation to which Method 2 applies, where the i-th level page is opened in the current window or current tab, page jumps occur in a well-defined order, so the most recently updated refer chain can be selected as the chain on which to create the i-th level node.
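The two lookup paths of step S203 can be sketched as follows. This is an illustrative model under assumptions, not the embodiment's code: `ReferChain`, `find_chain`, `append_node`, and `render` are hypothetical names, and the integer `updated` field is an assumed logical clock used to decide which chain was updated most recently.

```python
from dataclasses import dataclass, field


@dataclass
class ReferChain:
    nodes: list = field(default_factory=list)  # (url, page_id) pairs
    updated: int = 0                           # logical time of last update


def find_chain(chains, prev_id=None, prev_url=None):
    if prev_id is not None:
        # Lookup by page ID: the chain whose last node carries that ID.
        for chain in chains:
            if chain.nodes and chain.nodes[-1][1] == prev_id:
                return chain
        return None
    # Lookup by URL: several chains may match, take the most recent one.
    candidates = [c for c in chains
                  if any(url == prev_url for url, _ in c.nodes)]
    return max(candidates, key=lambda c: c.updated, default=None)


def append_node(chains, url, page_id, tick, prev_id=None, prev_url=None):
    chain = find_chain(chains, prev_id=prev_id, prev_url=prev_url)
    if chain is None:
        raise LookupError("no refer chain matches the previous page")
    chain.nodes.append((url, page_id))
    chain.updated = tick
    return chain


def render(chain):
    return "->".join(f"{url}({page_id})" for url, page_id in chain.nodes)


older = ReferChain([("A", "ID1")], updated=1)
newer = ReferChain([("A", "ID7")], updated=5)
chains = [older, newer]

append_node(chains, "B", "ID8", tick=6, prev_url="A")   # URL lookup
append_node(chains, "B", "ID2", tick=7, prev_id="ID1")  # page ID lookup
print(render(newer))  # A(ID7)->B(ID8)
print(render(older))  # A(ID1)->B(ID2)
```

In the example both chains contain URL A, so the URL lookup has to fall back on recency, while the page ID lookup identifies the intended chain directly.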

Optionally, in Method 1 of step S202, the URL of the (i-1)-th level page alone may instead be written into the interface object of the i-th level page, and the URL of the (i-1)-th level page is then obtained by reading the information provided by that interface object. Next, in step S203, the refer chain containing the URL of the (i-1)-th level page is looked up, and if multiple refer chains are found, the most recently updated one is selected as the chain on which to create the i-th level node. However, in the situation to which Method 1 applies, where the i-th level page is opened in a new window or new tab, the order of page jumps is less reliable, so looking up the refer chain by page ID is more accurate than looking it up by URL.

Steps S202 and S203 are executed in a loop, thereby creating the complete refer chain. For the above example, the refer chain created is: A(ID1)->B(ID2)->C(ID3)->D(ID4).

When creating the refer chain, a special case must also be considered: when certain pages are accessed, the page jumps automatically several times, for example through 3xx redirects. Herein, such a jump is called an inter-page jump. In the IE browser, the BHO mechanism provides three events when a single page is accessed: BeforeNavigate2, NavigateComplete2, and DocumentComplete2. Normally the URLs associated with the three events are identical, but if several 302 redirects occur, the sequence is as follows: (BeforeNavigate2)url0->(302)url1->(302)url2->(NavigateComplete2)url2->(DocumentComplete2)url2. Continuing the above example, when page C is accessed, page C may jump automatically several times, to C1 and then C2. Therefore, when inter-page jumps occur, the method described above may fail to capture the URLs of all the jump pages.

In view of this special case, the embodiment of the present invention further includes, after step S203, a step S204 of creating at least one i-th level child node. This step is executed when an inter-page jump occurs on the i-th level page, and the at least one i-th level child node corresponds to at least one inter-page jump page of the i-th level page. In step S204, the function called during redirect handling is hooked, and the URL of at least one inter-page jump page of the i-th level page is obtained from the input parameters of that function; then the refer chain containing the page ID of the i-th level page is looked up, at least one i-th level child node of that refer chain is created, and the page ID of the i-th level page together with the URL of the at least one inter-page jump page is used as the information of the at least one i-th level child node. Specifically, when a jump such as a 3xx redirect occurs, the browser performs redirect handling, during which it calls the "Urlmon!CINet::OnRedirect" function. The input parameters of this function record the URL of the inter-page jump page, so hooking the function yields the URL of at least one inter-page jump page of the i-th level page. The URL obtained in this way is used as the information of the i-th level child node, and the index ID of the inter-page jump page is the same as the page ID of the i-th level page. For the above example, the refer chain created is: A(ID1)->B(ID2)->C(ID3)->C1(ID3)->C2(ID3)->D(ID4).
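The child-node step can be sketched with the example chain from the text. This is a simplified model under assumptions: the chain is a plain list of `(url, page_id)` pairs, the function names `add_redirect_child` and `render` are hypothetical, and the redirect URLs are passed in directly rather than captured from a hooked function.

```python
def add_redirect_child(chain, page_id, redirect_url):
    """Insert a child node that shares the page ID of the redirecting page.

    The child is placed after the last node that already carries page_id,
    so successive redirects keep their order.
    """
    positions = [i for i, (_, pid) in enumerate(chain) if pid == page_id]
    if not positions:
        raise KeyError(page_id)
    chain.insert(positions[-1] + 1, (redirect_url, page_id))


def render(chain):
    return "->".join(f"{url}({page_id})" for url, page_id in chain)


chain = [("A", "ID1"), ("B", "ID2"), ("C", "ID3")]
add_redirect_child(chain, "ID3", "C1")  # first redirect of page C
add_redirect_child(chain, "ID3", "C2")  # second redirect of page C
chain.append(("D", "ID4"))              # next page-level jump
print(render(chain))
# A(ID1)->B(ID2)->C(ID3)->C1(ID3)->C2(ID3)->D(ID4)
```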

According to the method for processing web page access behavior provided by the embodiments of the present invention, whenever an access request for a new page reached through the links at each level from the initial page is detected, the refer chain corresponding to the new page is obtained and all URLs contained in the refer chain are reported to the server. The server derives a matching result from these URLs, and the client processes the access to the new page according to that result. Compared with the prior art, which performs detection using only the URL of the new page, the refer chain provides more URLs and broader coverage, so detection is more efficient and the security of web browsing on the client is protected more effectively. Furthermore, the embodiments of the present invention also provide a method for creating the refer chain, by which a refer chain composed of the URLs of the pages at each level can be obtained in real time. The client can thus promptly send all URLs contained in the refer chain to the server, the server promptly obtains comprehensive URL information, and on that basis it can promptly return the matching result to the client, thereby protecting the client's web browsing quickly and in real time.

Fig. 3 shows a structural block diagram of a client according to an embodiment of the present invention. The client is used to check the i-th level page opened through the i-th level link from the initial page, where i≥2. As shown in Fig. 3, the client includes a monitoring module 31, a query interface 32, and a protection module 33; optionally, the client may further include an encryption module 34.

The monitoring module 31 is adapted to obtain, after an access request for the i-th level page is detected, the refer chain containing the page ID of the i-th level page. The access request for the i-th level page is triggered by the user clicking a link on the (i-1)-th level page or by another form of linking. After the monitoring module 31 detects the access request for the i-th level page, the browser loads the i-th level page, and while it loads, the monitoring module 31 obtains the refer chain containing the page ID of the i-th level page. The refer chain contains the page IDs and URLs of the initial page through the i-th level page. The page ID of each page is a unique ID generated for it by the browser while the page loads, and in the refer chain it serves as the index value of the page's URL. Using the page ID of the i-th level page, the browser looks up the refer chain that contains the URL of the i-th level page and in which the i-th level page is the last-level page.

The query interface 32 is adapted to send all URLs contained in the refer chain to the server, so that the server can query whether those URLs belong to the blacklist and/or whitelist database it stores and then match the query result against preset rules to obtain a matching result; and to receive the matching result returned by the server. Optionally, in accordance with the cloud query protocol agreed with the server, the encryption module 34 encrypts all URLs contained in the refer chain into ciphertext (for the encryption method, see the method embodiment) and passes it to the query interface 32, which sends the ciphertext to the server. The query interface 32 may report to the server only the ciphertext of the URLs of the pages at each level in the refer chain, without reporting their page IDs.

The protection module 33 is adapted to process the access to the i-th level page according to the matching result. If the matching result is risk warning information, the protection module 33 warns the user of the risk. Optionally, the protection module 33 may offer the user the options of blocking the current page or continuing to access it; if the user chooses to block the current page, the protection module 33 blocks access to it.

Furthermore, the client may also include a refer chain creation module 35. The refer chain creation module 35 includes a first node creation unit 36 and a second node creation unit 37.

The first node creation unit 36 is adapted to, after an access request for the initial page is detected, generate the page ID of the initial page, obtain the URL of the initial page, create the first-level node of the refer chain, and write the page ID and URL of the initial page into the refer chain as the information of the first-level node. Furthermore, the first node creation unit 36 includes an initial-page page ID generation unit 361, an initial-page URL acquisition unit 362, and a first node creation subunit 363. The initial-page page ID generation unit 361 is adapted to generate the page ID of the initial page after the access request for the initial page is detected. The initial-page URL acquisition unit 362 is adapted to obtain, while the initial page is loading, the URL of the initial page currently being loaded through a designated response event interface, for example one implemented via a standard plug-in mechanism. In the IE browser, the Browser Helper Object (BHO) plug-in mechanism can be used to obtain the URL that IE is currently loading by responding to the "BeforeNavigate2" event. In the Firefox browser, the designated response event interface provided by the Firefox extension mechanism is used to obtain the URL that Firefox is currently loading. In the Google Chrome browser, the NPAPI plug-in mechanism is used to obtain the URL that Chrome is currently loading. The first node creation subunit 363 is adapted to create the first-level node of the refer chain and write the page ID and URL of the initial page into the refer chain as the information of the first-level node.

The second node creation unit 37, for i≥2, is adapted to, after an access request for the i-th level page is detected, generate the page ID of the i-th level page and obtain the URL of the i-th level page and the page ID or URL of the (i-1)-th level page, the i-th level page being a page-level jump page of the (i-1)-th level page; and to look up the refer chain containing the page ID or URL of the (i-1)-th level page, create the i-th level node of that refer chain, and use the page ID and URL of the i-th level page as the information of the i-th level node. The second node creation unit 37 thus creates the nodes at each level of the refer chain. Furthermore, the second node creation unit 37 includes an i-th level page ID generation unit 371, an i-th level page URL acquisition unit 372, an (i-1)-th level page ID or URL acquisition unit 373, and a second node creation subunit 374. The i-th level page ID generation unit 371 is adapted to generate the page ID of the i-th level page after the access request for the i-th level page is detected. The i-th level page URL acquisition unit 372 is adapted to obtain, while the i-th level page is loading, the URL of the i-th level page currently being loaded through a designated response event interface; for details, see the description of obtaining the URL of the initial page. The (i-1)-th level page ID or URL acquisition unit 373 is adapted to obtain the page ID or URL of the (i-1)-th level page after the access request for the i-th level page is detected. The second node creation subunit 374 is adapted to look up the refer chain containing the page ID or URL of the (i-1)-th level page, create the i-th level node of that refer chain, and use the page ID and URL of the i-th level page as the information of the i-th level node.

Optionally, the second node creation unit 37 of the client further includes a capturing unit 375 and a writing unit 376. The capturing unit 375 is adapted to obtain the interface object pointer of the i-th level page after the access request for the i-th level page is detected. The writing unit 376 is adapted to write, via the interface object pointer, the page ID of the (i-1)-th level page obtained while loading the (i-1)-th level page into the interface object of the i-th level page. This implementation applies when the i-th level page is opened in a new window or new tab. Taking the IE browser as an example, the capturing unit 375 is further adapted to, after the access request for the i-th level page is detected, hook the function that the browser calls to create a new window or new tab and use its return value to obtain the interface object pointer of the i-th level page, such as the IWEBBROWSER2 pointer. Because the browser has not yet begun loading the i-th level page at this point, the page ID it records for the current page is still that of the (i-1)-th level page, obtained while that page was loading, so the writing unit 376 can use the interface object pointer to write the page ID of the (i-1)-th level page into the IWEBBROWSER2 object. The (i-1)-th level page ID or URL acquisition unit 373 is specifically adapted to obtain, while the i-th level page is loading, the page ID of the (i-1)-th level page by reading the information provided by the interface object of the i-th level page. Optionally, the writing unit 376 is adapted to write, via the interface object pointer, the URL of the (i-1)-th level page obtained while loading the (i-1)-th level page into the interface object of the i-th level page.

Optionally, the (i-1)-th level page ID or URL acquisition unit 373 is further adapted to obtain the URL of the (i-1)-th level page through the get_locationURL interface provided by the browser, after the access request for the i-th level page is detected and before the i-th level page is loaded. The second node creation unit 37 further includes a judging unit 377 and a clearing unit 378. The judging unit 377 is adapted to determine whether the opening of the i-th level page was triggered by input in the browser address bar; specifically, this can be determined from click and input actions in the address bar. The clearing unit 378 is adapted to, when the judging unit 377 determines that it was, clear the URL of the (i-1)-th level page obtained by the acquisition unit 373 and trigger the first node creation unit 36 to process the i-th level page as an initial page. When the judging unit 377 determines that it was not, the judging unit 377 triggers the second node creation subunit 374 to create the i-th level node of the refer chain.

If the (i-1)-th level page ID or URL acquisition unit 373 obtains the page ID of the (i-1)-th level page, the second node creation subunit 374 directly looks up the refer chain containing that page ID, creates the i-th level node of that refer chain, and uses the page ID and URL of the i-th level page as the information of the i-th level node. If the acquisition unit 373 obtains the URL of the (i-1)-th level page, the second node creation subunit 374 looks up the refer chain containing that URL, selects the most recently updated chain if multiple refer chains are found, creates the i-th level node of that refer chain, and uses the page ID and URL of the i-th level page as the information of the i-th level node.

To handle pages that jump automatically several times while the refer chain is being created, the refer chain creation module 35 further includes a second child node creation unit 38, adapted to hook the function called during redirect handling and obtain the URL of at least one inter-page jump page of the i-th level page from the input parameters of that function; and to look up the refer chain containing the page ID of the i-th level page, create at least one i-th level child node of that refer chain, and use the page ID of the i-th level page and the URL of the at least one inter-page jump page of the i-th level page as the information of the at least one i-th level child node.

Fig. 4 shows a structural block diagram of a server according to an embodiment of the present invention. The server is used to check the i-th level page opened through the i-th level link from the initial page, where i≥2. As shown in Fig. 4, the server includes a blacklist and/or whitelist database 41 and a query interface 42, where:

The blacklist and/or whitelist database 41 is adapted to store URLs belonging to the blacklist and/or whitelist. The server collects identified safe web pages and dangerous or malicious web pages in advance, storing the URLs of safe pages in the whitelist database and the URLs of dangerous or malicious pages in the blacklist database. Optionally, the blacklist and/or whitelist database 41 may instead store feature values of the URLs.

Preferably, the blacklist and/or whitelist database 41 in the embodiments of the present invention may include, but is not limited to, a phishing URL library, an advertising fraud URL library, or any other type of malicious URL library.

The query interface 42 is adapted to receive all URLs contained in the refer chain sent by the client, query whether each of these URLs belongs to the blacklist and/or whitelist database, match the query result against preset rules to obtain a matching result, and return the matching result to the client. If the client has encrypted the URLs contained in the refer chain with a reversible encryption method, the query interface 42 includes a module that decrypts the received ciphertext to recover all URLs contained in the refer chain.

The preset rules are set according to actual needs and specify the situations in which a risk warning is required. Two preset rules are described below as examples:

For rule one, the query interface 42 is further adapted as follows: if the query result shows that the URL of the i-th level page belongs to the blacklist database (the i-th level page is a malicious or dangerous page) or does not belong to the whitelist database (the i-th level page is an unknown page), and any page from the initial page to the (i-1)-th level page is judged to be a search page (the i-th level page was reached by way of a search engine), then the query result matches rule one and the matching result is risk warning information.

For rule two, the query interface 42 is further adapted as follows: if the query result shows that the URL of any page from the initial page to the (i-1)-th level page belongs to the blacklist database (that page is a malicious or dangerous page) or does not belong to the whitelist database (that page is an unknown page), and the i-th level page is judged to be a payment page, then the query result matches rule two and the matching result is risk warning information.

Further, the server may also include: a search page URL database 43, adapted to store a list of search page URLs; and a payment page URL database 44, adapted to store a list of payment page URLs. The query interface 42 determines that any page from the initial page to the (i-1)-th level page is a search page by judging that the URL of that page belongs to the preset search page URL list, and determines that the i-th level page is a payment page by judging that the URL of the i-th level page belongs to the preset payment page URL list.
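The two example rules can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `UrlDatabases`, `match_refer_chain`, `RISK` and `OK` are hypothetical, the databases are modelled as in-memory sets with exact URL matching (the text also allows characteristic values of URLs), and treating "either rule matches" as the overall result is an assumption, since the source presents the rules as separate examples.

```python
from dataclasses import dataclass, field

RISK = "risk_warning"   # hypothetical label for "risk warning information"
OK = "no_warning"       # hypothetical label for "no rule matched"


@dataclass
class UrlDatabases:
    """In-memory stand-ins for databases 41, 43 and 44 (assumption)."""
    blacklist: set = field(default_factory=set)
    whitelist: set = field(default_factory=set)
    search_pages: set = field(default_factory=set)
    payment_pages: set = field(default_factory=set)

    def is_suspect(self, url):
        # Blacklisted (malicious/dangerous) or not whitelisted (unknown).
        return url in self.blacklist or url not in self.whitelist


def match_refer_chain(urls, db):
    """urls is ordered from the initial page to the i-th level page (i >= 2)."""
    if len(urls) < 2:
        raise ValueError("a refer chain needs at least two levels")
    *earlier, current = urls
    # Rule one: suspect target page reached by way of a search page.
    rule_one = db.is_suspect(current) and any(u in db.search_pages for u in earlier)
    # Rule two: payment page reached by way of a suspect page.
    rule_two = any(db.is_suspect(u) for u in earlier) and current in db.payment_pages
    return RISK if (rule_one or rule_two) else OK
```

Keeping the search page and payment page lists separate from the blacklist and whitelist mirrors the split between databases 41, 43 and 44 in the text.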

Fig. 5 shows a structural block diagram of a system for processing web page access behavior according to an embodiment of the present invention. As shown in Fig. 5, the system includes a client 30 and a server 40. For the specific structures and functions of the client 30 and the server 40, refer to the description of the embodiments above; they are not repeated here.

With the client, the server, and the system for processing web page access behavior provided by the embodiments of the present invention, whenever the client detects an access request for a new page reached through the links at any level of the initial page, it obtains the refer chain corresponding to the new page and reports all URLs contained in the refer chain to the server. The server derives a matching result from these URLs, and the client processes the access to the new page according to that result. Compared with the prior art, which uses only the URL of the new page for detection, the refer chain provides more URLs and wider coverage, so detection efficiency is higher and the client's web browsing is protected more effectively. Further, the client of the embodiments can itself create the refer chain, which lets it obtain in real time the refer chain composed of the URLs of the pages at every level. The client can therefore send all URLs contained in the refer chain to the server promptly, the server receives comprehensive URL information in time, and it can return the matching result to the client promptly, so that the client's web browsing is protected quickly and in real time.

In the method provided according to an embodiment of the present invention, after the step of obtaining the page ID and URL of the (i-1)-th level page through the get_locationURL interface provided by the browser, the method further includes:

Judging whether the opening of the i-th level page was triggered by input in the browser address bar;

If the judgment result is yes, clearing the URL of the (i-1)-th level page obtained through the get_locationURL interface provided by the browser, and processing the i-th level page as an initial page;

If the judgment result is no, executing the step of creating the i-th level node of the refer chain.
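The address-bar check above can be illustrated with a small Python sketch. It is not the patented implementation: the class `ReferChains`, its method names, and the choice to store each node with a pointer to its parent node are all assumptions made for illustration, and the previous page's URL is passed in as a plain argument rather than read through a browser interface such as get_locationURL.

```python
from itertools import count


class ReferChains:
    """Hypothetical store of refer chain nodes, linked by parent page ID."""

    def __init__(self):
        self._next_id = count(1)
        self._nodes = {}  # page_id -> (url, parent_page_id or None)

    def _latest_page_with_url(self, url):
        # Look up the most recently created node that has this URL.
        for page_id in sorted(self._nodes, reverse=True):
            if self._nodes[page_id][0] == url:
                return page_id
        return None

    def on_page_open(self, url, previous_url=None, typed_in_address_bar=False):
        if typed_in_address_bar:
            # Address-bar input: discard the previous URL so the page
            # is treated as the initial page of a new refer chain.
            previous_url = None
        parent_id = self._latest_page_with_url(previous_url) if previous_url else None
        page_id = next(self._next_id)
        self._nodes[page_id] = (url, parent_id)
        return page_id

    def chain_urls(self, page_id):
        # Walk parent pointers back to the initial page, then reverse.
        urls = []
        while page_id is not None:
            url, page_id = self._nodes[page_id]
            urls.append(url)
        return urls[::-1]
```

In this sketch, a page opened by clicking a link extends the chain of the page it came from, while a page opened from the address bar starts a chain of length one even if a previous URL is available.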

In the method provided according to an embodiment of the present invention, after the i-th level node creation step, the method further includes a step of creating at least one i-th level child node, the at least one i-th level child node corresponding to at least one inter-page jump page of the i-th level page: capturing the function called during redirect processing, and obtaining the URL of at least one inter-page jump page of the i-th level page from the input parameters of that function; and looking up the refer chain containing the page ID of the i-th level page, creating at least one i-th level child node of that refer chain, and storing the page ID of the i-th level page and the URL of the at least one inter-page jump page as the information of the at least one i-th level child node.

The client provided according to an embodiment of the present invention is used to detect the i-th level page opened through the i-th level link of the initial page, i≥2; the client includes:

In the client according to an embodiment of the present invention, if the matching result received by the query interface is risk warning information, the protection module is further adapted to: prompt the user with the risk according to the risk warning information, and intercept access to the i-th level page according to the user's choice.
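As a rough sketch of the protection module's behavior, the function below prompts the user only when the matching result is a risk warning and blocks the page if the user declines to continue. The function name, the result label `"risk_warning"`, the return values, and the `ask_user` callback are hypothetical; the source does not specify how the prompt is shown or how the result is encoded.

```python
def handle_matching_result(result, ask_user):
    """Return "allow" or "block" for the pending page access.

    ask_user is a hypothetical callback that shows a prompt and
    returns True if the user chooses to continue to the page.
    """
    if result != "risk_warning":
        # No rule matched, so the access proceeds without a prompt.
        return "allow"
    keep_going = ask_user("This page may be unsafe. Continue anyway?")
    return "allow" if keep_going else "block"
```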

The client according to an embodiment of the present invention further includes: an encryption module, adapted to encrypt all URLs contained in the refer chain into ciphertext and pass the ciphertext to the query interface, which sends it to the server.

The client according to an embodiment of the present invention further includes: a refer chain creation module;

The refer chain creation module includes:

A first node creation unit, adapted to, after an access request for the initial page is detected, generate the page ID of the initial page, obtain the URL of the initial page, create the level 1 node of the refer chain, and write the page ID and URL of the initial page into the refer chain as the information of the level 1 node;

A second node creation unit, i≥2, adapted to, after an access request for the i-th level page is detected, generate the page ID of the i-th level page and obtain the URL of the i-th level page and the page ID or URL of the (i-1)-th level page, the i-th level page being a page-level jump page of the (i-1)-th level page; and to look up the refer chain containing the page ID or URL of the (i-1)-th level page, create the i-th level node of that refer chain, and store the page ID and URL of the i-th level page as the information of the i-th level node;

The second node creation unit is adapted to create the nodes at all levels of the refer chain.

In the client according to an embodiment of the present invention,

The first node creation unit includes:

A page ID generation unit for the initial page, adapted to generate the page ID of the initial page after an access request for the initial page is detected;

A URL acquisition unit for the initial page, adapted to obtain the URL of the currently loaded initial page through a designated response event interface while the initial page is loading;

A first node creation subunit, adapted to create the level 1 node of the refer chain and write the page ID and URL of the initial page into the refer chain as the information of the level 1 node;

The second node creation unit includes:

A page ID generation unit for the i-th level page, adapted to generate the page ID of the i-th level page after an access request for the i-th level page is detected;

A URL acquisition unit for the i-th level page, adapted to obtain the URL of the currently loaded i-th level page through a designated response event interface while the i-th level page is loading;

A page ID or URL acquisition unit for the (i-1)-th level page, adapted to obtain the page ID or URL of the (i-1)-th level page after an access request for the i-th level page is detected;

A second node creation subunit, adapted to look up the refer chain containing the page ID or URL of the (i-1)-th level page, create the i-th level node of that refer chain, and store the page ID and URL of the i-th level page as the information of the i-th level node.

In the client according to an embodiment of the present invention, the second node creation unit further includes: a capturing unit, adapted to obtain the interface object pointer of the i-th level page after an access request for the i-th level page is detected; and a writing unit, adapted to use the interface object pointer to write, into the interface object of the i-th level page, the page ID of the (i-1)-th level page that was obtained while loading the (i-1)-th level page;

The page ID or URL acquisition unit for the (i-1)-th level page is specifically adapted to: while the i-th level page is loading, obtain the page ID of the (i-1)-th level page by reading the information provided by the interface object of the i-th level page.

In the client according to an embodiment of the present invention, the capturing unit is further adapted to: after an access request for the i-th level page is detected, capture the function that the browser calls to create a new window or new tab, and use the return value of that function to obtain the interface object pointer of the i-th level page.

In the client according to an embodiment of the present invention, the page ID or URL acquisition unit for the (i-1)-th level page is further adapted to: after an access request for the i-th level page is detected and before the i-th level page is loaded, obtain the URL of the (i-1)-th level page through the get_locationURL interface provided by the browser.

In the client according to an embodiment of the present invention, the second node creation unit further includes:

A judging unit, adapted to judge whether the opening of the i-th level page was triggered by input in the browser address bar;

A clearing unit, adapted to, when the judgment result of the judging unit is yes, clear the URL of the (i-1)-th level page obtained by the page ID or URL acquisition unit for the (i-1)-th level page, and trigger the first node creation unit to process the i-th level page as an initial page;

When the judgment result of the judging unit is no, the judging unit triggers the second node creation subunit to create the i-th level node of the refer chain.

In the client according to an embodiment of the present invention, the refer chain creation module further includes: a second child node creation unit, adapted to capture the function called during redirect processing and obtain the URL of at least one inter-page jump page of the i-th level page from the input parameters of that function; and to look up the refer chain containing the page ID of the i-th level page, create at least one i-th level child node of that refer chain, and store the page ID of the i-th level page and the URL of the at least one inter-page jump page as the information of the at least one i-th level child node.
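One way to picture the child nodes is as a list of redirect URLs attached to the node of the i-th level page. The sketch below assumes a simple list-based chain and hypothetical names (`ChainNode`, `record_redirect`, `all_urls`); capturing the browser's redirect function is outside its scope, so the redirect URL is passed in directly. The order in which page URLs and redirect URLs are reported is also an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class ChainNode:
    page_id: int
    url: str
    level: int
    # Child nodes: URLs of inter-page jump pages seen for this page.
    redirects: list = field(default_factory=list)


def record_redirect(chain, page_id, redirect_url):
    """Attach a redirect URL as a child node of the page with page_id."""
    for node in chain:
        if node.page_id == page_id:
            node.redirects.append(redirect_url)
            return node
    raise KeyError(f"no node with page ID {page_id} in this refer chain")


def all_urls(chain):
    """Collect every URL in the chain, including redirect child nodes."""
    urls = []
    for node in chain:
        urls.append(node.url)
        urls.extend(node.redirects)
    return urls
```

Because the redirect URLs are stored under the page ID of the i-th level page, they are included when the client reports all URLs of the refer chain to the server.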

The server according to an embodiment of the present invention is used to detect the i-th level page opened through the i-th level link of the initial page, i≥2; the server includes:

In the server according to an embodiment of the present invention, the query interface is further adapted to:

If the query result shows that the URL of the i-th level page belongs to the blacklist database or does not belong to the whitelist database, and any page from the initial page to the (i-1)-th level page is judged to be a search page, obtain risk warning information as the matching result;

Or, if the query result shows that the URL of any page from the initial page to the (i-1)-th level page belongs to the blacklist database or does not belong to the whitelist database, and the i-th level page is judged to be a payment page, obtain risk warning information as the matching result.

The server according to an embodiment of the present invention further includes:

A search page URL database, adapted to store a list of search page URLs;

A payment page URL database, adapted to store a list of payment page URLs;

The query interface determines that any page from the initial page to the (i-1)-th level page is a search page by judging that the URL of that page belongs to the preset search page URL list, and determines that the i-th level page is a payment page by judging that the URL of the i-th level page belongs to the preset payment page URL list.

A system for processing web page access behavior according to an embodiment of the present invention includes the client and the server described above.

The algorithms and displays provided herein are not inherently related to any particular computer, virtual system, or other device. Various general-purpose systems may also be used with the teachings herein. The structure required to construct such systems is apparent from the description above. Furthermore, the present invention is not directed at any particular programming language. It should be understood that the invention described herein can be implemented in various programming languages, and that the description of a specific language above is given to disclose the best mode of the invention.

Numerous specific details are set forth in the description provided herein. It is understood, however, that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures, and techniques have not been shown in detail so as not to obscure the understanding of this description.

Similarly, it should be understood that, in the above description of exemplary embodiments of the invention, various features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof in order to streamline the disclosure and aid the understanding of one or more of the various inventive aspects. This method of disclosure, however, is not to be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in fewer than all features of a single embodiment disclosed above. The claims following the detailed description are therefore expressly incorporated into the detailed description, with each claim standing on its own as a separate embodiment of the invention.

Those skilled in the art will understand that the modules in the device of an embodiment can be adaptively changed and placed in one or more devices different from that embodiment. The modules, units, or components of the embodiments may be combined into one module, unit, or component, and may also be divided into multiple sub-modules, sub-units, or sub-components. Except where at least some of such features and/or processes or units are mutually exclusive, all features disclosed in this specification (including the accompanying claims, abstract, and drawings), and all processes or units of any method or device so disclosed, may be combined in any combination. Unless expressly stated otherwise, each feature disclosed in this specification (including the accompanying claims, abstract, and drawings) may be replaced by an alternative feature serving the same, an equivalent, or a similar purpose.

Furthermore, those skilled in the art will understand that although some embodiments described herein include certain features included in other embodiments and not others, combinations of features of different embodiments are meant to be within the scope of the invention and to form different embodiments. For example, in the following claims, any of the claimed embodiments can be used in any combination.

The component embodiments of the present invention may be implemented in hardware, in software modules running on one or more processors, or in a combination of the two. Those skilled in the art should understand that a microprocessor or a digital signal processor (DSP) may be used in practice to implement some or all of the functions of some or all of the components of the client, the server, and the system for processing web page access behavior according to the embodiments of the present invention. The invention may also be implemented as a device or apparatus program (for example, a computer program or a computer program product) for performing part or all of the methods described herein. Such a program implementing the invention may be stored on a computer-readable medium or may take the form of one or more signals. Such a signal may be downloaded from an Internet website, provided on a carrier signal, or provided in any other form.

It should be noted that the above embodiments illustrate rather than limit the invention, and that those skilled in the art can design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention can be implemented by means of hardware comprising several distinct elements and by means of a suitably programmed computer. In a unit claim enumerating several means, several of these means can be embodied by one and the same item of hardware. The use of the words first, second, third, and so on does not indicate any order; these words are to be interpreted as names.

Claims

1. A method for processing web page access behavior, used to detect the i-th level page opened through the i-th level link of an initial page, wherein the access request for the i-th level page is triggered by clicking a link on the (i-1)-th level page or by another linking method, i≥2; the method includes:

A refer chain creation step, in which, whenever the i-th level page is opened through the i-th level link of the initial page, the process responsible for maintaining the refer chain obtains the page ID and URL of the i-th level page and the page ID or URL of the (i-1)-th level page, looks up the corresponding refer chain according to the page ID or URL of the (i-1)-th level page, and creates the corresponding node of the refer chain;

After an access request for the i-th level page is detected, obtaining the refer chain containing the page ID of the i-th level page, the refer chain including the page IDs and URLs of the initial page through the i-th level page;

Sending all URLs contained in the refer chain to the server, so that the server queries whether each URL contained in the refer chain belongs to the blacklist and/or whitelist database stored by the server and then matches the query result against preset rules to obtain a matching result;

Receiving the matching result returned by the server, and processing the access behavior for the i-th level page according to the matching result.

2. The method according to claim 1, wherein matching the query result against the preset rules to obtain the matching result further comprises:

If the query result indicates that the URL of the i-level page belongs to the blacklist database or does not belong to the whitelist database, and it is determined that any page from the initial page to the i-1 level page is a search page, the matching result is a risk warning message;

Or, if the query result shows that the URL of any page from the initial page to the i-1th level page belongs to the blacklist database or does not belong to the whitelist database, and it is judged that the i-th level page is a payment page, the matching result is a risk warning information.

3. The method according to claim 2, wherein determining that any page from the initial page to the (i-1)-th level page is a search page specifically comprises: judging whether the URL of any page from the initial page to the (i-1)-th level page belongs to a preset search page URL list, and if so, determining that the page is a search page;

The determining that the i-th level page is a payment page specifically includes: determining whether the URL of the i-th level page belongs to a preset payment page URL list, and if so, determining that the i-th level page is a payment page.

4. The method according to claim 2 or 3, wherein, if the matching result is risk warning information, processing the access behavior for the i-th level page according to the matching result specifically comprises: prompting the user with the risk according to the risk warning information, and intercepting access to the i-th level page according to the user's choice.

5. The method according to claim 1, wherein said sending all URLs included in the refer chain to the server specifically comprises: encrypting all URLs included in the refer chain into ciphertext and sending them to the server.

6. The method according to claim 1 or 2 or 3 or 5, the step of creating a refer chain further comprises:

Level 1 node creation step: after an access request for the initial page is detected, generating the page ID of the initial page, obtaining the URL of the initial page, creating the level 1 node of the refer chain, and writing the page ID and URL of the initial page into the refer chain as the information of the level 1 node;

The i-th level node creation step, i≥2: after an access request for the i-th level page is detected, generating the page ID of the i-th level page and obtaining the URL of the i-th level page and the page ID or URL of the (i-1)-th level page, the i-th level page being a page-level jump page of the (i-1)-th level page; and looking up the refer chain containing the page ID or URL of the (i-1)-th level page, creating the i-th level node of that refer chain, and storing the page ID and URL of the i-th level page as the information of the i-th level node;

Creating the nodes at all levels of the refer chain through the i-th level node creation step.

7. The method according to claim 6, wherein obtaining the URL of the initial page after an access request for the initial page is detected specifically comprises: while the initial page is loading, obtaining the URL of the currently loaded initial page through a designated response event interface;

After monitoring the access request of the i-level page, obtaining the URL of the i-level page specifically includes: during the process of loading the i-level page, obtaining the URL of the currently loaded i-level page through a designated response event interface.

8. The method according to claim 7, said acquiring the page ID of the i-1th level page further comprises:

After an access request for the i-th level page is detected, obtaining the interface object pointer of the i-th level page and, according to the interface object pointer, writing into the interface object of the i-th level page the page ID of the (i-1)-th level page that was obtained while loading the (i-1)-th level page;

In the process of loading the i-th level page, the page ID of the i-1 level page is obtained by reading the information provided by the interface object of the i-th level page.

9. The method according to claim 8, wherein the step of obtaining the interface object pointer of the i-th level page comprises: capturing the function that the browser calls to create a new window or new tab, and using the return value of that function to obtain the interface object pointer of the i-th level page.

10. The method according to claim 7, said obtaining the URL of the i-1th level page further comprising:

After monitoring the access request of the i-level page and before loading the i-level page, obtain the URL of the i-1 level page through the get_locationURL interface provided by the browser.

11. The method according to claim 10, after the step of obtaining the page ID and URL of the i-1th level page through the get_locationURL interface provided by the browser, further comprising:

Determine whether the opening of the i-level page is triggered by the input behavior of the browser address bar;

If the judgment result is yes, clear the URL of the i-1th level page obtained through the get_locationURL interface provided by the browser, and process the i-th level page as the initial page;

If the judgment result is no, execute the step of creating the i-th level node of the refer chain.

12. The method according to claim 6, further comprising, after the i-th level node creation step, a step of creating at least one i-th level child node, the at least one i-th level child node corresponding to at least one inter-page jump page of the i-th level page: capturing the function called during redirect processing, and obtaining the URL of at least one inter-page jump page of the i-th level page from the input parameters of that function; and looking up the refer chain containing the page ID of the i-th level page, creating at least one i-th level child node of that refer chain, and storing the page ID of the i-th level page and the URL of the at least one inter-page jump page as the information of the at least one i-th level child node.

13. A client, used to detect the i-th level page opened by the i-th level link of the initial page, the access request of the i-th level page is triggered by clicking on a link or other links on the i-1 level page, i≥2; the client includes:

The refer chain creation module is adapted so that, whenever the i-th level page is opened through the i-th level link of the initial page, the process responsible for maintaining the refer chain obtains the page ID and URL of the i-th level page and the page ID or URL of the i-1th level page, queries the corresponding refer chain according to the page ID or URL of the i-1th level page, and creates the corresponding node of the refer chain;

The monitoring module is adapted to obtain a refer chain including the page ID of the i-level page after monitoring the access request of the i-level page, and the refer chain includes page IDs and URLs from the initial page to the i-level page;

A query interface, adapted to send all the URLs included in the refer chain to the server, so that the server can query whether all the URLs included in the refer chain belong to the blacklist and/or whitelist database saved by the server, and then match the query result against a preset rule to obtain a matching result; and to receive the matching result returned by the server;

The protection module is adapted to process the access behavior of the i-th level page according to the matching result.

14. The client according to claim 13, wherein if the matching result received by the query interface is risk warning information, the protection module is further adapted to: remind the user of the risk according to the risk warning information, and intercept the access behavior of the i-th level page according to the user's choice.
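The interaction between the query interface and the protection module in claims 13 and 14 might look like the following sketch. The result strings `"risk"` and `"safe"`, the callable parameters and the warning text are hypothetical; the claims do not define a wire format or a user interface.

```python
from typing import Callable, Iterable


def handle_access(chain_urls: Iterable[str],
                  query_server: Callable[[list], str],
                  ask_user_to_proceed: Callable[[str], bool]) -> bool:
    """Return True if the i-th level page may load, False if it is intercepted."""
    # Query interface: send every URL in the refer chain, receive the result.
    result = query_server(list(chain_urls))
    if result == "risk":
        # Protection module: warn the user and follow the user's choice.
        return ask_user_to_proceed("This page may be unsafe. Continue?")
    return True
```

Passing the server call and the prompt as functions keeps the sketch independent of any particular transport or dialog implementation.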

15. The client according to claim 13, further comprising: an encryption module, adapted to encrypt all URLs included in the refer chain into ciphertext and send the ciphertext to the query interface, wherein the query interface sends the ciphertext to the server.

16. The client according to claim 13 or 14 or 15, the refer chain creation module further comprising:

The first node creation unit is adapted to generate the page ID of the initial page after monitoring the access request of the initial page, obtain the URL of the initial page, create a first-level node of the refer chain, and write the page ID and URL of the initial page into the refer chain as the information of the first-level node;

The second node creation unit, for i≥2, is adapted to generate the page ID of the i-th level page after monitoring the access request of the i-th level page, and obtain the URL of the i-th level page and the page ID or URL of the i-1th level page, the i-th level page being the page-level jump page of the i-1th level page; and to query the refer chain containing the page ID or URL of the i-1th level page, create the i-th level node of the refer chain, and use the page ID and URL of the i-th level page as the information of the i-th level node;

The second node creation unit is adapted to create nodes at all levels of the refer chain.

17. The client of claim 16,

The first node creation unit includes:

The page ID generating unit of the initial page is adapted to generate the page ID of the initial page after monitoring the access request of the initial page;

The URL acquisition unit of the initial page is adapted to obtain the URL of the currently loaded initial page by specifying a response event interface during the process of loading the initial page;

The first node creation subunit is adapted to create a first-level node of the refer chain, and writes the page ID and URL of the initial page into the refer chain as information of the first-level node;

The second node creation unit includes:

The page ID generating unit of the i-level page is adapted to generate the page ID of the i-level page after monitoring the access request of the i-level page;

The URL obtaining unit of the i-level page is adapted to obtain the URL of the currently loaded i-level page through a specified response event interface during the loading of the i-level page;

The page ID or URL acquisition unit of the i-1th level page is adapted to obtain the page ID or URL of the i-1th level page after monitoring the access request of the i-th level page;

The second node creation subunit is adapted to query the refer chain containing the page ID or URL of the i-1th level page, create the i-th level node of the refer chain, and use the page ID and URL of the i-th level page as the information of the i-th level node.

18. The client according to claim 17, the second node creation unit further comprising: a capturing unit, adapted to acquire the interface object pointer of the i-th level page after monitoring the access request of the i-th level page; and a writing unit, adapted to write the page ID of the i-1th level page, acquired during the process of loading the i-1th level page, to the interface object of the i-th level page according to the interface object pointer;

The page ID or URL acquiring unit of the i-1th level page is specifically adapted to: in the process of loading the i-th level page, obtain the page ID of the i-1th level page by reading the information provided by the interface object of the i-th level page.

19. The client according to claim 18, wherein the capturing unit is further adapted to: after monitoring the access request of the i-th level page, capture the function called by the browser to create a new window or a new tab page, and use the return value of the function to obtain the interface object pointer of the i-th level page.

20. The client according to claim 17, wherein the page ID or URL obtaining unit of the i-1th level page is further adapted to: after monitoring the access request of the i-th level page and before loading the i-th level page, obtain the URL of the i-1th level page through the get_locationURL interface provided by the browser.

21. The client according to claim 20, the second node creation unit further comprising:

The judging unit is suitable for judging whether the i-th level page is triggered by the input behavior of the browser address bar;

The emptying unit is adapted to, when the judgment result of the judging unit is yes, clear the page ID or the URL of the i-1th level page acquired by the page ID or URL acquisition unit of the i-1th level page, and trigger the first node creation unit to process the i-th level page as the initial page;

If the judging result of the judging unit is negative, the judging unit triggers the second node creation subunit to create the i-th level node of the refer chain.

22. The client according to claim 16, the refer chain creation module further comprising: a second child node creation unit, adapted to capture a function called during redirection processing and obtain the URL of at least one inter-page jump page of the i-th level page from the input parameters of that function; and to query the refer chain containing the page ID of the i-th level page, create at least one i-th level child node of the refer chain, and use the page ID of the i-th level page and the URL of the at least one inter-page jump page of the i-th level page as the information of the at least one i-th level child node.

23. A server, used to detect the i-th level page opened by the i-th level link of the initial page, the access request of the i-th level page is triggered by clicking on a link or other links on the i-1th level page, i≥2; said server includes:

a blacklist and/or whitelist database adapted to save URLs belonging to the blacklist and/or whitelist;

The query interface is adapted to receive all URLs contained in the refer chain sent by the client, query whether all the URLs contained in the refer chain belong to the blacklist and/or whitelist database, and then compare the query results with the preset rules Perform matching to obtain a matching result, and return the matching result to the client;

Wherein, the refer chain is created in the following manner: whenever the i-th level page is opened through the i-th level link of the initial page, the process responsible for maintaining the refer chain obtains the page ID and URL of the i-th level page and the page ID or URL of the i-1th level page, queries the corresponding refer chain according to the page ID or URL of the i-1th level page, and creates the corresponding node of the refer chain.
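The server-side query interface of claim 23 can be sketched as follows. The preset rule used here (any blacklisted URL yields `"risk"`, an all-whitelisted chain yields `"safe"`, anything else yields `"unknown"`) is purely an assumed example; the claim leaves the rule open, and the result strings and function name are hypothetical.

```python
from typing import Iterable, Set


def query_refer_chain(chain_urls: Iterable[str],
                      blacklist: Set[str],
                      whitelist: Set[str]) -> str:
    """Look up every URL of the refer chain, then apply an assumed rule."""
    # Step 1: query result for each URL in the chain.
    hits = [(url, url in blacklist, url in whitelist) for url in chain_urls]
    # Step 2: match the query results against the (assumed) preset rule.
    if any(is_black for _, is_black, _ in hits):
        return "risk"
    if hits and all(is_white for _, _, is_white in hits):
        return "safe"
    return "unknown"
```

Because the whole chain is checked, a blacklisted page anywhere on the path to the i-th level page affects the result even when the i-th level URL itself is not yet listed.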

24. The server of claim 23, the query interface being further adapted to:

25. The server of claim 24, further comprising:

A search page URL database, adapted to store a list of search page URLs;

A payment page URL database, suitable for storing a list of payment page URLs;

The query interface determines that any page from the initial page to the i-1th level page is a search page by judging that the URL of that page belongs to the preset search page URL list; and determines that the i-th level page is a payment page by judging that the URL of the i-th level page belongs to the preset payment page URL list.
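The two determinations in claim 25 could be combined as in the sketch below. Matching a URL against a list by prefix is an assumption (the claim only says the URL "belongs to" the list), and the function name and example URLs are hypothetical. The sketch reports only whether the pattern is present; what the server does with that fact is not covered here.

```python
from typing import Iterable, Sequence


def search_then_payment(chain_urls: Sequence[str],
                        search_prefixes: Iterable[str],
                        payment_prefixes: Iterable[str]) -> bool:
    """True if an earlier page is a search page and the last page is a payment page."""
    if len(chain_urls) < 2:
        return False  # needs an i-th level page with i >= 2
    *earlier, current = chain_urls
    search = tuple(search_prefixes)
    payment = tuple(payment_prefixes)
    # Any page from the initial page to the (i-1)th level page is a search page.
    came_from_search = any(url.startswith(search) for url in earlier)
    # The i-th level page is a payment page.
    is_payment = current.startswith(payment)
    return came_from_search and is_payment
```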

26. A system for processing webpage access behaviors, comprising the client according to any one of claims 13-22 and the server according to any one of claims 23-25.