CN105893537B - Method and device for determining geographic information point

Info

Publication number: CN105893537B
Application number: CN201610196304.9A
Authority: CN
Inventors: 程允胜; 吴海山; 汪天一; 许梦雯
Original assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Current assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Priority date: 2016-03-31
Filing date: 2016-03-31
Publication date: 2019-06-07
Anticipated expiration: 2036-03-31
Also published as: CN110008414B; CN110008414A; CN105893537A

Abstract

The present application discloses a method and device for determining geographic information points. A specific implementation of the method includes: obtaining the user's location information, wherein the location information includes the user's location coordinates; using the user's location coordinates as the input value of a pre-trained Bayesian prediction model, and obtaining the probability value of the user being at each geographic information point in at least one geographic information point based on the Bayesian prediction model, wherein the Bayesian prediction model is trained using the basic information of the geographic information point as sample data, wherein the basic information includes the location coordinates of the geographic information point and the historical location information of historically visited users; and determining the geographic information point corresponding to the maximum probability value as the geographic information point where the user is located. This implementation achieves the determination of the geographic information point where the user is located.

Description

Method and device for determining geographic information point

Technical Field

The present application relates to the field of computer technology, specifically to the field of Internet technology, and more particularly to a method and apparatus for determining geographic information points.

Background

A geographic information point (POI, Point Of Interest), also called an "information point" or "point of interest", is a place that carries some meaning, such as a restaurant, a school, or a parking lot. Positioning in the prior art, and user positioning in particular, focuses on the absolute position of the user.

However, the prior art does not mine or compute over data about the geographic information point where a user is located, and therefore cannot determine which geographic information point the user is at.

Summary of the Invention

The purpose of this application is to propose an improved method and apparatus for determining geographic information points, so as to solve the technical problems mentioned in the Background section above.

In a first aspect, the present application provides a method for determining a geographic information point, the method including: acquiring positioning information of a user, wherein the positioning information includes user positioning coordinates; using the user positioning coordinates as an input value of a pre-trained Bayesian prediction model, and obtaining, according to the Bayesian prediction model, a probability value of the user being at each geographic information point of at least one geographic information point, wherein the Bayesian prediction model is trained using basic information of geographic information points as sample data, and the basic information includes the positioning coordinates of the geographic information points and the historical positioning information of historical visiting users; and determining the geographic information point corresponding to the maximum probability value as the geographic information point where the user is located.

In some embodiments, the parameters of the Bayesian prediction model include a historical visit probability of each geographic information point of the at least one geographic information point, wherein the historical visit probability is obtained from the positioning coordinates of the geographic information point and the historical positioning information. Obtaining the historical visit probability from the positioning coordinates of the geographic information point and the historical positioning information includes: selecting at least one geographic information point according to a preset rule and establishing a geographic information point set; obtaining the number of historical visits to each geographic information point in the set from the positioning coordinates of each geographic information point in the set and the positioning information of each historical visiting user of the set; calculating the sum of the numbers of historical visits to the geographic information points in the set, and taking the sum as the total number of historical visits to the set; and obtaining the historical visit probability of each geographic information point in the set from the total number of historical visits and the number of historical visits to that geographic information point.

In some embodiments, obtaining the number of historical visits to a geographic information point from the positioning coordinates of the geographic information point and the positioning information of its historical visiting users includes: selecting historical users within a preset range of the geographic information point; acquiring the historical positioning information and historical search records of the historical user corresponding to the historical positioning coordinates, wherein the historical positioning information includes the historical positioning coordinates and the historical positioning time at which those coordinates were collected; if the historical search records include identification information of the geographic information point, calculating the time interval between the historical positioning time and the time point at which the geographic information point was searched for; and, in response to the time interval being less than a predetermined threshold, determining the historical user to be a historical visiting user of the geographic information point.

In some embodiments, the historical positioning information of a historical visiting user includes historical positioning coordinates and the historical positioning time at which the historical visiting user was located at those coordinates; and the parameters of the Bayesian prediction model include a time probability distribution of the geographic information point, wherein the time probability distribution is obtained from the historical positioning times of the historical visiting users of the geographic information point.

In some embodiments, the positioning information of the user further includes a user positioning time at which the user was located at the user positioning coordinates; and using the user positioning coordinates as an input value of the pre-trained Bayesian prediction model and obtaining, according to the Bayesian prediction model, the probability value of the user being at each geographic information point of the at least one geographic information point includes: using the user positioning time and the user positioning coordinates as input values of the pre-trained Bayesian prediction model, and obtaining, according to the Bayesian prediction model, the probability value of the user being at each geographic information point of the at least one geographic information point.

In some embodiments, the parameters of the Bayesian prediction model include a positioning probability of a geographic information point, wherein the positioning probability is obtained from the distance between the geographic information point and a cluster center, and the cluster center is obtained by clustering at least one geographic information point.

In some embodiments, obtaining the cluster center by clustering at least one geographic information point includes obtaining the cluster centers of the at least one geographic information point by K-means clustering, wherein: at least one geographic information point is selected and a clustering geographic information point set is established; the number of clusters is determined from the total number of visits to the clustering geographic information point set; that number of positioning coordinates are selected as initial cluster centers; and the number of clusters, the coordinates of the initial cluster centers, and the positioning coordinates of the geographic information points in the clustering geographic information point set are set as the input values of the K-means algorithm, yielding that number of cluster centers.
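The K-means step described above can be illustrated with a minimal sketch. It assumes that coordinates are treated as planar points with Euclidean distance and that the number of clusters is simply the number of initial centers supplied; how that number is derived from the total visit count is not specified here and is left to the caller. The function name `kmeans` and the sample coordinates are hypothetical.

```python
import math

def kmeans(points, initial_centers, max_iter=100):
    """Plain Lloyd's K-means; k is len(initial_centers) (assumption)."""
    centers = [tuple(c) for c in initial_centers]
    for _ in range(max_iter):
        # Assign every point to its nearest current center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)),
                          key=lambda i: math.dist(p, centers[i]))
            clusters[nearest].append(p)
        # Move each center to the mean of its members; keep empty clusters fixed.
        new_centers = [
            tuple(sum(axis) / len(members) for axis in zip(*members))
            if members else centers[i]
            for i, members in enumerate(clusters)
        ]
        if new_centers == centers:  # converged
            break
        centers = new_centers
    return centers

# Hypothetical POI coordinates forming two well-separated groups.
poi_coords = [(0.0, 0.0), (0.0, 2.0), (10.0, 10.0), (10.0, 12.0)]
centers = kmeans(poi_coords, initial_centers=[poi_coords[0], poi_coords[2]])
```

For real longitude and latitude data spread over a large area, a great-circle distance would likely be a better choice than the planar distance used in this sketch.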

In some embodiments, the positioning information of the user further includes a user positioning time at which the user was located at the user positioning coordinates; and acquiring the positioning information of the user includes: selecting the user positioning coordinates whose user positioning time falls within a preset time period, and establishing an original user positioning coordinate set; removing abnormal points from the original user positioning coordinate set to obtain a user positioning coordinate set, wherein an abnormal point is a coordinate point whose movement distance within a second preset time period is greater than a preset distance threshold; aggregating at least one user positioning coordinate in the user positioning coordinate set into one trajectory center coordinate by a trajectory clustering algorithm; and taking the average of the time points corresponding to at least one user positioning time in the user positioning coordinate set as the trajectory center time. Further, using the trajectory center coordinate and the trajectory center time as input values of the pre-trained Bayesian prediction model and obtaining, according to the Bayesian prediction model, the probability value of the user being at each geographic information point of the at least one geographic information point includes: using the trajectory center coordinate as an input value of the pre-trained Bayesian prediction model, and obtaining, according to the Bayesian prediction model, the probability value of the user being at each geographic information point of the at least one geographic information point.
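The preprocessing described above can be sketched roughly as follows. This is an illustration under several assumptions, not the claimed implementation: the fixes are assumed to be already restricted to the preset time period and sorted by time, an abnormal point is detected by comparing each fix with the last retained fix, and a plain arithmetic mean stands in for the unspecified trajectory clustering algorithm. All names (`haversine_m`, `trajectory_center`) and sample values are hypothetical.

```python
import math

def haversine_m(a, b):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371000.0 * math.asin(math.sqrt(h))

def trajectory_center(fixes, window_s, max_jump_m):
    """fixes: time-sorted list of (timestamp_s, lat, lon).

    Drops a fix that lies more than max_jump_m from the last kept fix
    while being at most window_s seconds later, then averages the rest.
    """
    if not fixes:
        raise ValueError("no positioning fixes supplied")
    kept = []
    for t, lat, lon in fixes:
        if kept:
            t_prev, lat_prev, lon_prev = kept[-1]
            jumped = haversine_m((lat_prev, lon_prev), (lat, lon)) > max_jump_m
            if t - t_prev <= window_s and jumped:
                continue  # abnormal point: implausible jump within the window
        kept.append((t, lat, lon))
    n = len(kept)
    # Mean time, latitude and longitude of the retained fixes.
    return tuple(sum(f[i] for f in kept) / n for i in range(3))

# Hypothetical fixes; the third one jumps tens of kilometers in one minute.
fixes = [
    (0, 39.9000, 116.4000),
    (60, 39.9001, 116.4001),
    (120, 40.5000, 117.0000),
    (180, 39.9002, 116.4002),
]
center_time, center_lat, center_lon = trajectory_center(
    fixes, window_s=300, max_jump_m=500)
```

The sketch trusts the first fix; a production version would need a more robust rule for the case where the first point itself is the abnormal one.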

In some embodiments, after determining the geographic information point corresponding to the maximum probability value as the geographic information point where the user is located, the method further includes: adding, to the positioning information of the user, a historical visiting user mark for the geographic information point corresponding to the maximum probability value; adding the positioning information of the user carrying the historical visiting user mark to the sample data set of the Bayesian prediction model; and training a new Bayesian prediction model using the sample data in the sample data set.

In a second aspect, the present application provides an apparatus for determining a geographic information point, the apparatus including: an acquisition module configured to acquire positioning information of a user, wherein the positioning information includes user positioning coordinates; a calculation module configured to use the user positioning coordinates as an input value of a pre-trained Bayesian prediction model and to obtain, according to the Bayesian prediction model, a probability value of the user being at each geographic information point of at least one geographic information point, wherein the Bayesian prediction model is trained using basic information of geographic information points as sample data, and the basic information includes the positioning coordinates of the geographic information points and the historical positioning information of historical visiting users; and a determination module configured to determine the geographic information point corresponding to the maximum probability value as the geographic information point where the user is located.

In some embodiments, the positioning information of the user further includes a user positioning time at which the user was located at the user positioning coordinates; and the calculation module is further configured to: use the user positioning time and the user positioning coordinates as input values of the pre-trained Bayesian prediction model, and obtain, according to the Bayesian prediction model, the probability value of the user being at each geographic information point of the at least one geographic information point.

In some embodiments, the positioning information of the user further includes a user positioning time at which the user was located at the user positioning coordinates; and the acquisition module is further configured to: select the user positioning coordinates whose user positioning time falls within a preset time period, and establish an original user positioning coordinate set; remove abnormal points from the original user positioning coordinate set to obtain a user positioning coordinate set, wherein an abnormal point is a coordinate point whose movement distance within a second preset time period is greater than a preset distance threshold; aggregate at least one user positioning coordinate in the user positioning coordinate set into one trajectory center coordinate by a trajectory clustering algorithm; and take the average of the time points corresponding to at least one user positioning time in the user positioning coordinate set as the trajectory center time. Further, using the trajectory center coordinate and the trajectory center time as input values of the pre-trained Bayesian prediction model and obtaining, according to the Bayesian prediction model, the probability value of the user being at each geographic information point of the at least one geographic information point includes: using the trajectory center coordinate as an input value of the pre-trained Bayesian prediction model, and obtaining, according to the Bayesian prediction model, the probability value of the user being at each geographic information point of the at least one geographic information point.

In some embodiments, the apparatus further includes an update module configured to: add, to the positioning information of the user, a historical visiting user mark for the geographic information point corresponding to the maximum probability value; add the positioning information of the user carrying the historical visiting user mark to the sample data set of the Bayesian prediction model; and train a new Bayesian prediction model using the sample data in the sample data set.

The method and apparatus for determining a geographic information point provided by the present application acquire positioning information of a user, wherein the positioning information includes user positioning coordinates; use the user positioning coordinates as an input value of a pre-trained Bayesian prediction model and obtain, according to the Bayesian prediction model, a probability value of the user being at each geographic information point of at least one geographic information point, wherein the Bayesian prediction model is trained using basic information of geographic information points as sample data, and the basic information includes the positioning coordinates of the geographic information points and the historical positioning information of historical visiting users; and determine the geographic information point corresponding to the maximum probability value as the geographic information point where the user is located. In this way, the geographic information point where the user is located is determined.

Description of the Drawings

Other features, objects, and advantages of the present application will become more apparent from the following detailed description of non-limiting embodiments, read with reference to the accompanying drawings:

FIG. 1 is a diagram of an exemplary system architecture to which the present application may be applied;

FIG. 2 is a flowchart of one embodiment of the method for determining a geographic information point according to the present application;

FIG. 3 is a flowchart of another embodiment of the method for determining a geographic information point according to the present application;

FIG. 4 shows the time probability distribution of a geographic information point in the method for determining a geographic information point according to the present application;

FIG. 5 is a schematic structural diagram of one embodiment of the apparatus for determining a geographic information point according to the present application;

FIG. 6 is a schematic structural diagram of a computer system suitable for implementing a terminal device or server of an embodiment of the present application.

Detailed Description

The present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the relevant invention and do not limit it. It should also be noted that, for ease of description, the drawings show only the parts related to the invention.

It should be noted that, where no conflict arises, the embodiments of the present application and the features of those embodiments may be combined with one another. The present application is described in detail below with reference to the drawings and in conjunction with the embodiments.

FIG. 1 shows an exemplary system architecture 100 to which embodiments of the method or apparatus for determining a geographic information point of the present application may be applied.

As shown in FIG. 1, the system architecture 100 may include terminal devices 101, 102, and 103, a network 104, and a server 105. The network 104 is the medium that provides communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired links, wireless communication links, or fiber-optic cables.

A user may use the terminal devices 101, 102, 103 to interact with the server 105 over the network 104, for example to receive or send messages. Various communication client applications may be installed on the terminal devices 101, 102, 103, such as map applications, shopping applications, search applications, instant messaging tools, email clients, and social platform software.

The terminal devices 101, 102, 103 may be various electronic devices that have a display screen and support web browsing, including but not limited to smartphones, tablet computers, e-book readers, MP3 (Moving Picture Experts Group Audio Layer III) players, MP4 (Moving Picture Experts Group Audio Layer IV) players, laptop computers, and desktop computers.

The server 105 may be a server that provides various services, for example a positioning service server that supports the positioning services of the terminal devices 101, 102, 103. The positioning service server may analyze and otherwise process received data such as positioning data.

It should be noted that the method for determining a geographic information point provided by the embodiments of the present application is generally executed by the server 105; accordingly, the apparatus for determining a geographic information point is generally provided in the server 105.

It should be understood that the numbers of terminal devices, networks, and servers in FIG. 1 are merely illustrative. Any number of terminal devices, networks, and servers may be provided as the implementation requires.

With continued reference to FIG. 2, a flow 200 of one embodiment of the method for determining a geographic information point according to the present application is shown. The method includes the following steps:

Step 201: acquire the positioning information of the user.

In this embodiment, the electronic device on which the method for determining a geographic information point runs (for example, the server shown in FIG. 1) may acquire the user's positioning information via the mobile terminal used by the user. It should be pointed out that this acquisition may be implemented in many ways, including but not limited to GPS (Global Positioning System) positioning, positioning based on the base stations of a mobile operator's network, AGPS (Assisted GPS) positioning, WiFi-based positioning, and other mobile terminal positioning methods now known or developed in the future.

In this embodiment, the positioning information of the user includes user positioning coordinates. Here, the user positioning coordinates may be longitude and latitude coordinates.

Step 202: use the user positioning coordinates as the input value of the pre-trained Bayesian prediction model, and obtain, according to the Bayesian prediction model, the probability value of the user being at each geographic information point of at least one geographic information point.

In this embodiment, based on the user positioning coordinates obtained in step 201, the electronic device (for example, the server shown in FIG. 1) may first use the positioning coordinates as the input value of the pre-trained Bayesian prediction model, and then use the model to obtain the probability that the user is at a given geographic information point. By applying the model one or more times, the probability that the user is at each geographic information point of the at least one geographic information point is obtained. As an example, the user positioning coordinates may be used as the input value of the Bayesian prediction model to obtain a probability a that the user is at place A, and the model may then be applied again for the same user to obtain a probability b that the user is at place B.

In this embodiment, the Bayesian prediction model is trained using basic information of geographic information points as sample data, wherein the basic information includes the positioning coordinates of the geographic information points and the historical positioning information of historical visiting users. Here, the Bayesian prediction model is a prediction model built on the Bayes formula. As an example, the Bayes formula applied in this embodiment can be expressed as:

P(U|poi) = A * B

Here, poi denotes a given geographic information point, U denotes the positioning information of the user, P(U|poi) denotes the probability that the user is at that geographic information point, and A and B are both parameters of the Bayesian prediction model. The symbol * indicates that an operation relates parameter A and parameter B; this operation includes, but is not limited to, a product or a sum.
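A minimal sketch of how such a formula might be evaluated is given below. It assumes the operation between A and B is a product, that each parameter is available as a per-POI value, and that the POI with the largest score is then selected as stated in the summary. The function names, the dictionary layout, and all numeric values are hypothetical.

```python
import operator

def score_pois(param_a, param_b, combine=operator.mul):
    """Combine two per-POI parameters into a score P(U|poi) = A * B.

    `combine` defaults to a product; operator.add could be passed
    instead for the additive variant mentioned in the text.
    """
    return {poi: combine(param_a[poi], param_b[poi]) for poi in param_a}

def most_likely_poi(scores):
    """Return the POI whose score is largest."""
    return max(scores, key=scores.get)

# Assumed example values for two model parameters of three POIs.
visit_prob = {"a": 0.2, "b": 0.3, "c": 0.5}     # parameter A
location_prob = {"a": 0.7, "b": 0.2, "c": 0.1}  # parameter B

scores = score_pois(visit_prob, location_prob)
best = most_likely_poi(scores)
```

With these assumed numbers, POI "a" wins even though it has the lowest value of parameter A, because its value of parameter B dominates the product.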

In some optional implementations of this embodiment, the parameters of the Bayesian prediction model include a historical visit probability of each geographic information point of the at least one geographic information point, wherein the historical visit probability is obtained from the positioning coordinates of the geographic information point and the historical positioning information. The historical visit probability may be obtained from the positioning coordinates of the geographic information point and the historical positioning information through the following steps: selecting at least one geographic information point according to a preset rule and establishing a geographic information point set; obtaining the number of historical visits to each geographic information point in the set from the positioning coordinates of each geographic information point in the set and the positioning information of each historical visiting user of the set; calculating the sum of the numbers of historical visits to the geographic information points in the set, and taking the sum as the total number of historical visits to the set; and obtaining the historical visit probability of each geographic information point in the set from the total number of historical visits and the number of historical visits to that geographic information point.

As an example, three geographic information points that are close to one another are selected to establish a geographic information point set, and are named geographic information point a, geographic information point b, and geographic information point c. The total number of visits to these three geographic information points in one day is 100, of which 20 are visits to geographic information point a, 30 are visits to geographic information point b, and 50 are visits to geographic information point c. The historical visit probability of geographic information point a is then 20/100, or 20 percent; that of geographic information point b is 30/100, or 30 percent; and that of geographic information point c is 50/100, or 50 percent.
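The arithmetic of this example can be written as a short sketch. The function name `visit_probabilities` and the dictionary representation of the POI set are assumptions made for illustration only.

```python
def visit_probabilities(visit_counts):
    """Turn per-POI historical visit counts into historical visit probabilities."""
    total = sum(visit_counts.values())  # total historical visits of the set
    if total == 0:
        raise ValueError("the POI set has no recorded visits")
    return {poi: count / total for poi, count in visit_counts.items()}

# Visit counts from the example: 20, 30 and 50 visits out of 100.
probs = visit_probabilities({"a": 20, "b": 30, "c": 50})
```

Here `probs` maps a, b, and c to 0.2, 0.3, and 0.5 respectively, matching the percentages in the example.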

Optionally, the number of historical visits to a geographic information point can be obtained from the positioning coordinates of the geographic information point and the positioning information of its historical visiting users through the following steps: select historical users within a preset range of the geographic information point; obtain the historical positioning information and historical search records of the selected historical users, where the historical positioning information includes historical positioning coordinates and the historical positioning time at which those coordinates were collected; if the historical search records include identification information of the geographic information point, calculate the time interval between the historical positioning time and the time at which the geographic information point was searched for; and, in response to the time interval being less than a predetermined threshold, determine the historical user to be a historical visiting user of the geographic information point.

As an example, historical users who were within a 100-meter radius of geographic information point A between 10:00 and 11:00 in the morning are selected; for instance, user Zhang Yi was within this range at 10:30. Zhang Yi's search records are then obtained. If the search records include identification information of geographic information point A, such as the name of geographic information point A or a name similar to it, the time at which that identification information was searched for is obtained, for example 10:15. If the interval between the search time of 10:15 and the historical positioning time of 10:30 is less than the predetermined threshold, the historical user is determined to be a historical visiting user of the geographic information point. It can be understood that the predetermined threshold may be half an hour, one day, or one month.
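The search-record check described above might look roughly like the following Python sketch. The function name `is_historical_visitor`, the record layout (query text plus search time), and the use of substring matching to detect the point's identification information are all assumptions made for illustration; the text only requires that the search record contain the point's name or a similar name.

```python
from datetime import datetime, timedelta
from typing import Iterable, Tuple


def is_historical_visitor(
    positioning_time: datetime,
    search_records: Iterable[Tuple[str, datetime]],
    poi_identifiers: Iterable[str],
    threshold: timedelta,
) -> bool:
    """Return True if the user searched for the point close enough in time
    to being positioned within the point's preset range."""
    identifiers = list(poi_identifiers)
    for query, search_time in search_records:
        # Assumption: a query "includes identification information" when it
        # contains one of the known names of the point as a substring.
        if any(identifier in query for identifier in identifiers):
            if abs(positioning_time - search_time) < threshold:
                return True
    return False


# The example from the text: positioned at 10:30, searched at 10:15.
# The date and the name "Point A" are placeholders.
positioned_at = datetime(2016, 3, 31, 10, 30)
records = [("Point A opening hours", datetime(2016, 3, 31, 10, 15))]
visited = is_historical_visitor(
    positioned_at, records, ["Point A"], timedelta(minutes=30)
)
```

With a 30-minute threshold the 15-minute gap qualifies the user as a historical visiting user; with a 10-minute threshold it would not.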

Optionally, the number of historical visits to a geographic information point can also be obtained from the positioning coordinates of the geographic information point and the positioning information of its historical visiting users in the following manner: obtain data on the WiFi networks that historical users have connected to; if a WiFi network has been determined to belong to a certain geographic information point, a historical user who connected to it can be determined to be a historical visiting user of that geographic information point.
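As a minimal sketch of the WiFi-based approach, assuming a connection log of (user, WiFi identifier) pairs and a lookup table from WiFi identifier to geographic information point, both of which are hypothetical data structures not specified in the text:

```python
from typing import Dict, Iterable, Set, Tuple


def visitors_from_wifi(
    connection_log: Iterable[Tuple[str, str]],
    wifi_to_poi: Dict[str, str],
) -> Dict[str, Set[str]]:
    """Group users by the geographic information point whose WiFi they joined."""
    visitors: Dict[str, Set[str]] = {}
    for user_id, wifi_id in connection_log:
        poi = wifi_to_poi.get(wifi_id)
        if poi is not None:  # WiFi networks of unknown ownership are ignored
            visitors.setdefault(poi, set()).add(user_id)
    return visitors


log = [("u1", "w1"), ("u2", "w1"), ("u3", "w9")]
wifi_visitors = visitors_from_wifi(log, {"w1": "poi_a"})
```

Here `u1` and `u2` are recorded as historical visiting users of `poi_a`, while `u3` is skipped because network `w9` is not attributed to any point.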

Optionally, at least one geographic information point is selected according to a preset rule and a geographic information point set is established. The preset rule may be to select several geographic information points within a predetermined range centered on the user. Alternatively, the geographic information points of an area may be divided in advance into several geographic information point sets according to how densely the points are distributed. For example, on a university campus, three geographic information point sets may be established, centered on the main teaching building, the gymnasium, and the library respectively, where the set centered on the gymnasium may include the equipment room and the playground.

In some optional implementations of this embodiment, the parameters of the Bayesian prediction model include a positioning probability of the geographic information point, where the positioning probability is obtained from the distance between the geographic information point and a cluster center, and the cluster center is obtained by clustering at least one geographic information point. For example, the reciprocal of the distance between the geographic information point and the cluster center is used as the positioning probability of the geographic information point. The reciprocal of that distance may of course also be multiplied by a coefficient and the result used as the positioning probability.

Optionally, the cluster center can serve as a bridge between the user and the geographic information points. A user-cluster center probability is calculated first; for example, the reciprocal of the distance between the user and the cluster center can be used as the user-cluster center probability. The product of the user-cluster center probability and the positioning probability is then taken as the probability that the user is at the geographic information point. As an example, user A is near cluster center A and cluster center B; the distance between user A and cluster center A is 10 meters, and the distance between user A and cluster center B is 20 meters. The clustered geographic information point set of cluster center A contains geographic information point a and geographic information point b, where the distance between point a and cluster center A is 1 meter and the distance between point b and cluster center A is 2 meters. The clustered geographic information point set of cluster center B contains geographic information point c, where the distance between point c and cluster center B is 4 meters. If the reciprocal of the distance between a geographic information point and its cluster center is used as the positioning probability, the positioning probability of point a is 1/1, that of point b is 1/2, and that of point c is 1/4; the user-cluster center probability between the user and cluster center A is 1/10, and that between the user and cluster center B is 1/20. Finally, the probability that the user is at point a is (1/1)*(1/10), the probability that the user is at point b is (1/2)*(1/10), and the probability that the user is at point c is (1/4)*(1/20). Here, "/" denotes division and "*" denotes an operation, preferably multiplication.
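The worked example can be reproduced with a short Python sketch. It uses the figures from the text (points at 1 m and 2 m from cluster center A, a point at 4 m from cluster center B, and a user 10 m from A and 20 m from B) and labels the points a, b, and c as in the final products. The function name and data layout are assumptions; the sketch also assumes all distances are strictly positive, and it does not normalize the scores, so they do not sum to 1.

```python
from typing import Dict, Tuple


def user_poi_probabilities(
    user_to_center_m: Dict[str, float],
    poi_to_center_m: Dict[str, Tuple[str, float]],
) -> Dict[str, float]:
    """Multiply the positioning probability (1 / point-to-center distance)
    by the user-cluster center probability (1 / user-to-center distance)."""
    scores: Dict[str, float] = {}
    for poi, (center, poi_distance) in poi_to_center_m.items():
        positioning_probability = 1.0 / poi_distance
        user_center_probability = 1.0 / user_to_center_m[center]
        scores[poi] = positioning_probability * user_center_probability
    return scores


scores = user_poi_probabilities(
    {"A": 10.0, "B": 20.0},
    {"a": ("A", 1.0), "b": ("A", 2.0), "c": ("B", 4.0)},
)
# Point a scores (1/1)*(1/10), point b (1/2)*(1/10), point c (1/4)*(1/20).
```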

Optionally, to obtain cluster centers from at least one geographic information point, regions may be divided at random, or a clustering algorithm may be used, including but not limited to the k-means clustering algorithm, hierarchical clustering algorithms, the SOM clustering algorithm, and the FCM clustering algorithm. It should be understood that the calculation processes of these clustering algorithms are well known to those skilled in the art and are not described in detail here.

Optionally, the cluster centers of at least one geographic information point can be obtained by K-means clustering. Optionally, the specific process is as follows: select at least one geographic information point and establish a clustered geographic information point set; determine the number of clusters according to the total number of visits to the clustered geographic information point set; select as many positioning coordinates as the number of clusters to serve as the initial cluster centers; and set the number of clusters, the coordinates of the initial cluster centers, and the positioning coordinates of the geographic information points in the clustered set as the input values of the K-means algorithm, obtaining as many cluster centers as the number of clusters. It should be understood that the K-means algorithm is a clustering algorithm well known to those skilled in the art and is not described in detail here.

Optionally, the number of clusters can be determined according to the number of geographic information points in the clustered geographic information point set, or according to whether the distribution of geographic information points in the set shows obviously dense sub-regions.

Optionally, as many positioning coordinates as the number of clusters may be selected at random as the initial cluster centers, or the positioning coordinates of geographic information points whose historical visit probability is greater than a predetermined threshold may be used as the initial cluster centers.
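For readers unfamiliar with K-means, the following is a minimal pure-Python sketch of the standard algorithm (Lloyd's iteration) taking the inputs listed above: the point coordinates and the initial cluster centers, whose count fixes the number of clusters. It treats coordinates as planar (x, y) values, which is a simplifying assumption; real latitude/longitude data would normally be projected first. The function name and the iteration cap are also assumptions.

```python
from math import dist
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def k_means(
    points: Sequence[Point],
    initial_centers: Sequence[Point],
    max_iterations: int = 100,
) -> List[Point]:
    """Return one cluster center per initial center."""
    centers = list(initial_centers)
    for _ in range(max_iterations):
        # Assignment step: attach every point to its nearest center.
        clusters: List[List[Point]] = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: dist(p, centers[i]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its members.
        new_centers: List[Point] = []
        for center, members in zip(centers, clusters):
            if members:
                mean_x = sum(x for x, _ in members) / len(members)
                mean_y = sum(y for _, y in members) / len(members)
                new_centers.append((mean_x, mean_y))
            else:
                new_centers.append(center)  # keep a center that lost all members
        if new_centers == centers:
            break  # converged
        centers = new_centers
    return centers


centers = k_means([(0, 0), (0, 2), (10, 0), (10, 2)], [(0, 0), (10, 0)])
print(centers)  # [(0.0, 1.0), (10.0, 1.0)]
```

The initial centers could equally be the coordinates of points whose historical visit probability exceeds a threshold, as the text suggests; the algorithm itself does not change.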

Step 203: determine the geographic information point corresponding to the maximum probability value as the geographic information point where the user is located.

In this embodiment, based on the probability values obtained in step 202 for the user being at each of the at least one geographic information point, the largest of these probability values is selected as the maximum probability value, and the geographic information point corresponding to the maximum probability value is determined as the geographic information point where the user is located.
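Selecting the point with the maximum probability value amounts to an argmax over the candidate points. A minimal sketch, with a hypothetical function name and illustrative probability values:

```python
from typing import Dict


def most_probable_poi(probabilities: Dict[str, float]) -> str:
    """Return the geographic information point with the largest probability."""
    if not probabilities:
        raise ValueError("no candidate geographic information points")
    # On ties, max() returns the first candidate encountered.
    return max(probabilities, key=probabilities.get)


best = most_probable_poi({"a": 0.1, "b": 0.05, "c": 0.0125})
print(best)  # a
```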

The method provided by the above embodiment of the present application determines the geographic information point where the user is located by using the historical visit probability and the positioning probability of the geographic information points.

With further reference to FIG. 3, a flow 300 of another embodiment of the method for determining a geographic information point is shown. The flow 300 of the method for determining a geographic information point includes the following steps:

Step 301: obtain the user positioning coordinates and the user positioning time of the user.

In this embodiment, the electronic device on which the method for determining a geographic information point runs (for example, the server shown in FIG. 1) can obtain the user's positioning information from the mobile terminal used by the user.

In this embodiment, the user's positioning information includes user positioning coordinates and the user positioning time at which the user was located at those coordinates. The user's positioning information can be obtained through the following steps: select the user positioning coordinates whose user positioning time falls within a preset time period and establish an original user positioning coordinate set; remove the abnormal points from the original user positioning coordinate set to obtain a user positioning coordinate set, where an abnormal point is a coordinate point whose distance moved within a second preset time period is greater than a preset distance threshold; aggregate the at least one user positioning coordinate in the user positioning coordinate set into a single trajectory center coordinate through a trajectory clustering algorithm; take the average of the time points corresponding to the at least one user positioning time in the user positioning coordinate set as the trajectory center time; and use the trajectory center coordinate and the trajectory center time as the input values of the pre-trained Bayesian prediction model, obtaining from the Bayesian prediction model the probability value of the user being at each of the at least one geographic information point.
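The preprocessing steps above can be sketched as follows. This is a rough illustration under several assumptions: fixes are planar (x, y) coordinates in meters with timestamps in seconds; an abnormal point is detected by comparing each fix with the previously kept fix; and the trajectory clustering algorithm, which the text does not specify, is replaced here by a plain arithmetic mean of the kept coordinates. All names and parameters are hypothetical.

```python
from math import dist
from typing import List, Sequence, Tuple

Fix = Tuple[float, float, float]  # (x in meters, y in meters, timestamp in seconds)


def trajectory_center(
    fixes: Sequence[Fix],
    window_start: float,
    window_end: float,
    max_jump_m: float,
    jump_window_s: float,
) -> Fix:
    """Return (center x, center y, center time) for the fixes in the window."""
    # Step 1: keep only fixes whose time lies in the preset time period.
    in_window = sorted(
        (f for f in fixes if window_start <= f[2] <= window_end),
        key=lambda f: f[2],
    )
    # Step 2: drop abnormal points, i.e. fixes that moved farther than
    # max_jump_m from the previously kept fix within jump_window_s.
    kept: List[Fix] = []
    for fix in in_window:
        if kept:
            prev = kept[-1]
            moved = dist(fix[:2], prev[:2])
            elapsed = fix[2] - prev[2]
            if elapsed <= jump_window_s and moved > max_jump_m:
                continue
        kept.append(fix)
    if not kept:
        raise ValueError("no usable fixes in the time window")
    # Steps 3 and 4: stand-in for trajectory clustering (mean coordinate)
    # and the average of the positioning times.
    n = len(kept)
    return (
        sum(f[0] for f in kept) / n,
        sum(f[1] for f in kept) / n,
        sum(f[2] for f in kept) / n,
    )


fixes = [(0, 0, 0), (2, 0, 10), (500, 0, 20), (4, 0, 30), (1, 1, 1000)]
center = trajectory_center(fixes, 0, 60, max_jump_m=100, jump_window_s=30)
```

In this usage the fix at x = 500 is discarded as abnormal and the fix at t = 1000 falls outside the window, so the center is computed from the remaining three fixes. One limitation of comparing against the previous kept fix is that an abnormal first fix would not be detected.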

Step 302: use the user positioning coordinates and the user positioning time as the input values of the pre-trained Bayesian prediction model, and obtain from the Bayesian prediction model the probability value of the user being at each of the at least one geographic information point.

In this embodiment, based on the user positioning coordinates obtained in step 301, the electronic device (for example, the server shown in FIG. 1) may first use the positioning coordinates as the input value of the pre-trained Bayesian prediction model, and then use the Bayesian prediction model to obtain the probability that the user is at a given geographic information point; by applying the Bayesian prediction model one or more times, the probability of the user being at each of the at least one geographic information point is obtained.

P(U|poi) = A*B

In this embodiment, the parameters of the Bayesian prediction model include a historical visit probability for each of the at least one geographic information point, where the historical visit probability is obtained from the positioning coordinates of the geographic information point and the historical positioning information.

In this embodiment, the parameters of the Bayesian prediction model include a positioning probability of the geographic information point, where the positioning probability is obtained from the distance between the geographic information point and a cluster center, and the cluster center is obtained by clustering at least one geographic information point.

In this embodiment, the parameters of the Bayesian prediction model include a time probability distribution of the geographic information point, where the time probability distribution is obtained from the historical positioning times of the historical visiting users of the geographic information point. The historical positioning time is the time point at which a historical visiting user was at the historical positioning coordinates; the historical positioning coordinates and the historical positioning time belong to the historical positioning information of the historical visiting user and are obtained by collecting the historical positioning information of historical users. As an example, refer to FIG. 4, which shows the time probability distribution of a geographic information point over one week. Geographic information point A has a total of 100 historical visiting users over the seven days of a week: 10 on each day from Monday to Friday, 20 on Saturday, and 30 on Sunday, assuming here that each historical visiting user visits geographic information point A once. A time probability distribution with a period of one week can then be established.

Optionally, the period of the time probability distribution may be the twenty-four hours of a day, the seven days of a week, the days of a month, the twelve months of a year, or the four quarters of a year. Combinations of these periods are of course also possible. As an example, a combination of the twenty-four hours of a day and the seven days of a week can be established: to calculate the probability that a user visits geographic information point A at 19:00 on a Friday, the time probability corresponding to Friday in the weekly distribution of geographic information point A and the time probability corresponding to 19:00 in its daily distribution can both be used as parameter values.
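A weekly time probability distribution of the kind described (10 visiting users on each weekday, 20 on Saturday, 30 on Sunday) can be built by normalizing the per-slot counts, as in the sketch below. The text does not state how the weekly and daily probabilities are combined; multiplying them, which treats day of week and hour of day as independent, is an assumption made here purely for illustration. Function names are hypothetical.

```python
from typing import Dict, Hashable


def time_probability_distribution(
    visits_per_slot: Dict[Hashable, int],
) -> Dict[Hashable, float]:
    """Normalize visit counts per time slot into a probability distribution."""
    total = sum(visits_per_slot.values())
    if total == 0:
        raise ValueError("no historical visits recorded")
    return {slot: count / total for slot, count in visits_per_slot.items()}


def combined_time_probability(weekday_probability: float, hour_probability: float) -> float:
    # Assumption: the two periods are combined by multiplication,
    # i.e. day of week and hour of day are treated as independent.
    return weekday_probability * hour_probability


weekly = time_probability_distribution(
    {"Mon": 10, "Tue": 10, "Wed": 10, "Thu": 10, "Fri": 10, "Sat": 20, "Sun": 30}
)
# weekly["Fri"] is 10/100, weekly["Sat"] is 20/100, weekly["Sun"] is 30/100.
```

A daily distribution over the twenty-four hours could be built with the same function from hourly counts, and the Friday 19:00 probability estimated with `combined_time_probability(weekly["Fri"], daily[19])`, where `daily` is such a hypothetical hourly distribution.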

In some optional implementations of this embodiment, the user's positioning information further includes the user positioning time at which the user was located at the user positioning coordinates. The user positioning time and the user positioning coordinates can be used as the input values of the pre-trained Bayesian prediction model, and the probability value of the user being at each of the at least one geographic information point is obtained from the Bayesian prediction model.

Step 303: determine the geographic information point corresponding to the maximum probability value as the geographic information point where the user is located.

In this embodiment, based on the probability values obtained in step 302 for the user being at each of the at least one geographic information point, the largest of these probability values is selected as the maximum probability value, and the geographic information point corresponding to the maximum probability value is determined as the geographic information point where the user is located.

Step 304: generate a new Bayesian prediction model based on the user's positioning information.

In this embodiment, following step 303, the user's positioning information is marked as belonging to a historical visiting user of the geographic information point corresponding to the maximum probability value. The marked positioning information is added to the sample data set of the Bayesian prediction model, and a new Bayesian prediction model is generated by training on the sample data in that set.
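The update loop in step 304 can be illustrated with a deliberately simplified sketch in which "training" only recomputes the historical visit probabilities from the labeled samples. The actual model described in the text has further parameters (positioning probability, time probability distribution) that are not recomputed here; the sample layout and function names are assumptions.

```python
from collections import Counter
from typing import Dict, List, Tuple

Sample = Tuple[Tuple[float, float], str]  # (positioning coordinates, visited point)


def retrain_visit_probabilities(samples: List[Sample]) -> Dict[str, float]:
    """Recompute each point's historical visit probability from the samples."""
    counts = Counter(label for _, label in samples)
    total = sum(counts.values())
    return {poi: n / total for poi, n in counts.items()}


def update_model(
    samples: List[Sample],
    user_coordinates: Tuple[float, float],
    predicted_poi: str,
) -> Tuple[List[Sample], Dict[str, float]]:
    """Mark the user's fix with the predicted point, add it to the sample
    set, and rebuild the visit probabilities from the enlarged set."""
    updated = samples + [(user_coordinates, predicted_poi)]
    return updated, retrain_visit_probabilities(updated)


samples = [((0.0, 0.0), "a"), ((1.0, 1.0), "b"), ((2.0, 2.0), "b")]
new_samples, new_probs = update_model(samples, (3.0, 3.0), "a")
```

After the update the sample set holds two visits to each point, so both visit probabilities become 0.5.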

As can be seen from FIG. 3, compared with the embodiment corresponding to FIG. 2, the flow 300 of the method for determining a geographic information point in this embodiment highlights the step of introducing the user positioning time and makes use of the time probability distribution of the geographic information points, so that the geographic information point where the user is located can be determined more accurately.

With further reference to FIG. 5, as an implementation of the methods shown in the above figures, the present application provides an embodiment of an apparatus for determining a geographic information point. The apparatus embodiment corresponds to the method embodiment shown in FIG. 2, and the apparatus can be applied in various electronic devices.

As shown in FIG. 5, the apparatus 500 for determining a geographic information point in this embodiment includes an acquisition module 501, a calculation module 502, and a determination module 503. The acquisition module is configured to obtain the user's positioning information, where the positioning information includes user positioning coordinates. The calculation module is configured to use the user positioning coordinates as the input value of a pre-trained Bayesian prediction model and to obtain from the Bayesian prediction model the probability value of the user being at each of at least one geographic information point, where the Bayesian prediction model is trained using the basic information of geographic information points as sample data, and the basic information includes the positioning coordinates of the geographic information points and the historical positioning information of historical visiting users. The determination module is configured to determine the geographic information point corresponding to the maximum probability value as the geographic information point where the user is located.

In this embodiment, the acquisition module 501 of the apparatus 500 for determining a geographic information point can obtain the user's positioning information from the mobile terminal used by the user. It should be noted that obtaining the user's positioning information from the mobile terminal used by the user can be implemented in a variety of ways.

In this embodiment, based on the user's positioning information obtained by the acquisition module 501, the calculation module 502 can take the user positioning coordinates obtained by the acquisition module 501. The electronic device (for example, the server shown in FIG. 1) may first use the positioning coordinates as the input value of the pre-trained Bayesian prediction model, and then use the Bayesian prediction model to obtain the probability that the user is at a given geographic information point; by applying the Bayesian prediction model one or more times, the probability of the user being at each of the at least one geographic information point is obtained.

In this embodiment, based on the probability values obtained by the calculation module 502 for the user being at each of the at least one geographic information point, the determination module 503 selects the largest of these probability values as the maximum probability value and determines the geographic information point corresponding to the maximum probability value as the geographic information point where the user is located.

In some optional implementations of this embodiment, the apparatus 500 for determining a geographic information point further includes an update module 504, configured to mark the user's positioning information as belonging to a historical visiting user of the geographic information point corresponding to the maximum probability value, add the marked positioning information to the sample data set of the Bayesian prediction model, and generate a new Bayesian prediction model by training on the sample data in that set.

Those skilled in the art will understand that the apparatus 500 for determining a geographic information point also includes other well-known structures, such as a processor and a memory; so as not to unnecessarily obscure the embodiments of the present disclosure, these well-known structures are not shown in FIG. 5.

Reference is now made to FIG. 6, which shows a schematic structural diagram of a computer system 600 suitable for implementing the server of the embodiments of the present application.

As shown in FIG. 6, the computer system 600 includes a central processing unit (CPU) 601, which can perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 602 or a program loaded from a storage section 608 into a random access memory (RAM) 603. The RAM 603 also stores the various programs and data required for the operation of the system 600. The CPU 601, the ROM 602, and the RAM 603 are connected to one another through a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.

The following components are connected to the I/O interface 605: an input section 606 including a keyboard, a mouse, and the like; an output section 607 including a cathode ray tube (CRT) or liquid crystal display (LCD), a speaker, and the like; a storage section 608 including a hard disk and the like; and a communication section 609 including a network interface card such as a LAN card or a modem. The communication section 609 performs communication processing via a network such as the Internet. A drive 610 is also connected to the I/O interface 605 as needed. A removable medium 611, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 610 as needed, so that a computer program read from it can be installed into the storage section 608 as needed.

In particular, according to embodiments of the present disclosure, the processes described above with reference to the flowcharts can be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program tangibly embodied on a machine-readable medium, the computer program containing program code for performing the methods shown in the flowcharts. In such an embodiment, the computer program can be downloaded and installed from a network via the communication section 609 and/or installed from the removable medium 611.

The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present application. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of code that contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the figures. For example, two blocks shown in succession may in fact be executed substantially in parallel, or may sometimes be executed in the reverse order, depending on the functionality involved. It should also be noted that each block of the block diagrams and/or flowcharts, and combinations of such blocks, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.

The modules described in the embodiments of the present application may be implemented in software or in hardware. The described modules may also be provided in a processor; for example, a processor may be described as including an acquisition module, a calculation module, and a determination module. In some cases the names of these modules do not limit the modules themselves; for example, the acquisition module may also be described as "a module for obtaining the user's positioning information".

In another aspect, the present application further provides a non-volatile computer storage medium. The non-volatile computer storage medium may be the one included in the apparatus described in the above embodiments, or it may exist on its own without being installed in a terminal. The non-volatile computer storage medium stores one or more programs which, when executed by a device, cause the device to: obtain the user's positioning information, wherein the positioning information includes user positioning coordinates; use the user positioning coordinates as the input value of a pre-trained Bayesian prediction model, and obtain, according to the Bayesian prediction model, the probability value of the user being at each geographic information point in at least one geographic information point, wherein the Bayesian prediction model is trained using the basic information of the geographic information points as sample data, and the basic information includes the positioning coordinates of the geographic information points and the historical positioning information of historical visiting users; and determine the geographic information point corresponding to the maximum probability value as the geographic information point where the user is located.
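The three steps performed by the stored program (take coordinates, score every geographic information point, pick the maximum) can be illustrated with a minimal sketch. The text above does not specify the form of the likelihood, so the isotropic Gaussian used here, the `sigma` spread parameter, and all names (`predict_poi`, `pois`, `prior`) are assumptions chosen only for illustration.

```python
import math

def predict_poi(user_xy, pois):
    """Return (best_poi_id, posterior) for one user coordinate.

    pois maps poi_id -> {"xy": (x, y), "prior": float, "sigma": float}.
    "prior" stands in for a historical visit probability; "sigma" is an
    assumed spread of visitor positions around the point (hypothetical).
    """
    scores = {}
    for poi_id, p in pois.items():
        dx = user_xy[0] - p["xy"][0]
        dy = user_xy[1] - p["xy"][1]
        var = p["sigma"] ** 2
        # Assumed isotropic Gaussian likelihood of the coordinate given the point.
        likelihood = math.exp(-(dx * dx + dy * dy) / (2 * var)) / (2 * math.pi * var)
        scores[poi_id] = p["prior"] * likelihood
    total = sum(scores.values())
    if total == 0:
        raise ValueError("all scores underflowed; user is far from every point")
    # Bayes' rule: normalise prior * likelihood over all candidate points.
    posterior = {k: v / total for k, v in scores.items()}
    best = max(posterior, key=posterior.get)
    return best, posterior

pois = {
    "cafe": {"xy": (0.0, 0.0), "prior": 0.5, "sigma": 10.0},
    "gym": {"xy": (100.0, 0.0), "prior": 0.5, "sigma": 10.0},
}
best, posterior = predict_poi((5.0, 0.0), pois)
```

With equal priors the nearer point wins; a sufficiently larger prior on the farther point could reverse the result, which is the effect the historical visit data is meant to capture.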

The above description is only of the preferred embodiments of the present application and an explanation of the technical principles applied. Those skilled in the art should understand that the scope of the invention involved in this application is not limited to technical solutions formed by the specific combination of the above technical features; it should also cover other technical solutions formed by any combination of the above technical features or their equivalents without departing from the inventive concept. One example is a technical solution formed by replacing the above features with technical features having similar functions, including but not limited to those disclosed in this application.

Claims

1. A method for determining a geographic information point, wherein the method comprises:

Obtaining the user's positioning information, wherein the positioning information includes user positioning coordinates;

Using the user positioning coordinates as the input value of a pre-trained Bayesian prediction model, and obtaining, according to the Bayesian prediction model, the probability value of the user being at each geographic information point in at least one geographic information point, wherein the Bayesian prediction model is trained using the basic information of the geographic information points as sample data, and the basic information includes the positioning coordinates of the geographic information points and the historical positioning information of historical visiting users;

The geographic information point corresponding to the maximum probability value is determined as the geographic information point where the user is located; wherein

Historical visiting users are determined in the following ways:

Select historical users within the preset range of geographic information points;

Obtain the historical positioning information and historical search records of the historical users corresponding to the historical positioning coordinates, wherein the historical positioning information includes the historical positioning coordinates and the historical positioning time when the historical positioning coordinates are collected;

If the historical search record includes the identification information of the geographic information point, calculate the time interval between the historical positioning time and the time point at which the geographic information point was searched for;

In response to the time interval being less than a predetermined threshold, the historical user is determined as a historical visiting user of the geographic information point.
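The rule in claim 1 for deciding who counts as a historical visiting user can be sketched as follows. This is a minimal illustration, assuming the positioning times passed in already belong to fixes inside the preset range of the geographic information point; the function name, the `(poi_id, time)` shape of the search log, and the one-hour threshold are hypothetical.

```python
from datetime import datetime, timedelta

def is_historical_visitor(fix_times, search_log, poi_id, max_gap):
    """Return True if any positioning time lies within max_gap of a search
    for poi_id.

    fix_times: positioning times of fixes already known to fall within the
    preset range of the point (assumed to be filtered upstream).
    search_log: list of (poi_id, search_time) tuples (hypothetical layout).
    """
    search_times = [t for pid, t in search_log if pid == poi_id]
    return any(abs(f - s) < max_gap for f in fix_times for s in search_times)

fix_times = [datetime(2016, 3, 1, 12, 30)]
search_log = [
    ("cafe", datetime(2016, 3, 1, 12, 10)),
    ("gym", datetime(2016, 3, 1, 8, 0)),
]
cafe_visitor = is_historical_visitor(fix_times, search_log, "cafe", timedelta(hours=1))
gym_visitor = is_historical_visitor(fix_times, search_log, "gym", timedelta(hours=1))
```

The search for the cafe happened 20 minutes before the fix, so the user is labelled a visitor of the cafe; the gym search is more than four hours away and is not.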

2. The method according to claim 1, wherein the Bayesian prediction model parameters comprise a historical visiting probability of each geographic information point in the at least one geographic information point, wherein the historical visiting probability is obtained according to the positioning coordinates of the geographic information point and the historical positioning information, wherein:

Obtaining the historical visit probability according to the positioning coordinates of the geographic information point and the historical positioning information includes:

Select at least one geographic information point according to a preset rule, and establish a set of geographic information points;

According to the positioning coordinates of each geographic information point in the geographic information point set and the positioning information of each historical visiting user in the geographic information point set, obtain the number of historical visits of each geographic information point in the geographic information point set;

Calculate the sum of the numbers of historical visits of the geographic information points in the geographic information point set, and use the sum as the total number of historical visits of the geographic information point set;

The historical visiting probability of each geographic information point in the geographic information point set is obtained according to the total number of historical visits and the historical visiting times of the geographic information points in the geographic information point set.
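The computation in claim 2 reduces to dividing each point's visit count by the total for the set. A minimal sketch, with a hypothetical function name and made-up counts:

```python
def visit_probabilities(visit_counts):
    """Map each geographic information point to count / total count.

    visit_counts: dict of poi_id -> number of historical visits.
    """
    total = sum(visit_counts.values())
    if total == 0:
        raise ValueError("the set has no recorded historical visits")
    return {poi: n / total for poi, n in visit_counts.items()}

# Illustrative counts only; not taken from the source.
probs = visit_probabilities({"cafe": 3, "gym": 1})
```

These values can serve as the prior term of a Bayesian model, since they sum to one over the chosen set.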

3. The method according to claim 1, wherein the historical positioning information of the historical visiting user comprises historical positioning coordinates and the historical positioning time when the historical visiting user is located at the historical positioning coordinates; and,

The Bayesian prediction model parameters include:

The time probability distribution of the geographic information point, wherein the time probability distribution is obtained according to the historical positioning time of the historical visiting users of the geographic information point.
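One way to estimate the time probability distribution of claim 3 is a histogram over the hour of day of past visits. The claim does not say how the distribution is represented, so the 24 hourly bins and the additive smoothing below are assumptions, and the function name is hypothetical.

```python
from collections import Counter
from datetime import datetime

def hourly_distribution(visit_times, smoothing=1.0):
    """Return a 24-element list: estimated probability of a visit per hour.

    Additive smoothing (an assumption, not from the source) keeps hours
    with no observed visits from receiving probability zero.
    """
    counts = Counter(t.hour for t in visit_times)
    total = len(visit_times) + 24 * smoothing
    if total == 0:
        raise ValueError("no visits and no smoothing")
    return [(counts.get(h, 0) + smoothing) / total for h in range(24)]

visit_times = [
    datetime(2016, 3, 1, 12, 0),
    datetime(2016, 3, 2, 12, 30),
    datetime(2016, 3, 3, 18, 0),
]
dist = hourly_distribution(visit_times)
```

At prediction time, the entry for the hour of the user positioning time could be multiplied into the score of each point alongside the coordinate-based term.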

4. The method according to claim 3, wherein the positioning information of the user further comprises a user positioning time when the user is located at the user positioning coordinates; and,

Using the user positioning coordinates as the input value of the pre-trained Bayesian prediction model, and obtaining, according to the Bayesian prediction model, the probability value of the user being at each geographic information point in the at least one geographic information point, comprises:

Using the user positioning time and the user positioning coordinates as input values of the pre-trained Bayesian prediction model, and obtaining, according to the Bayesian prediction model, the probability value of the user being at each geographic information point in the at least one geographic information point.

5. The method according to claim 1, wherein the Bayesian prediction model parameters comprise a location probability of a geographic information point, wherein the location probability is obtained according to the distance between the geographic information point and a cluster center, and the cluster center is obtained by clustering at least one geographic information point.

6. The method according to claim 5, wherein the cluster center is obtained by clustering at least one geographic information point, comprising:

The cluster center of at least one geographic information point is obtained by K-means algorithm clustering, where:

Select at least one geographic information point, and establish a clustered geographic information point set;

Determine the number of clusters according to the total number of visits of the clustered geographic information point set;

Select as many positioning coordinates as the number of clusters to serve as the initial cluster centers;

Use the number of clusters, the coordinates corresponding to the initial cluster centers, and the positioning coordinates of the geographic information points in the clustered geographic information point set as the input values of the K-means algorithm, and obtain that number of cluster centers.
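The clustering of claim 6 can be sketched with a plain implementation of Lloyd's K-means iteration. The claim states that the number of clusters is derived from the total number of visits but does not give the rule here, so `k` is simply passed in; taking the first `k` points as initial centers is also an assumption.

```python
def kmeans(points, k, max_iters=50):
    """Cluster 2-D points and return the k cluster centers.

    points: list of (x, y) positioning coordinates.
    k: number of clusters (how it follows from visit totals is not shown).
    """
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    centers = list(points[:k])  # assumed initialisation: first k coordinates
    for _ in range(max_iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            # Assign each point to its nearest center (squared distance).
            nearest = min(
                range(k),
                key=lambda i: (p[0] - centers[i][0]) ** 2 + (p[1] - centers[i][1]) ** 2,
            )
            clusters[nearest].append(p)
        new_centers = []
        for i, members in enumerate(clusters):
            if members:
                cx = sum(m[0] for m in members) / len(members)
                cy = sum(m[1] for m in members) / len(members)
                new_centers.append((cx, cy))
            else:
                new_centers.append(centers[i])  # keep the center of an empty cluster
        if new_centers == centers:
            break  # assignments are stable
        centers = new_centers
    return centers

points = [(0.0, 0.0), (10.0, 10.0), (0.0, 1.0), (10.0, 11.0)]
centers = kmeans(points, 2)
```

The distance from a geographic information point to its nearest returned center is the quantity that claim 5 uses to derive the location probability.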

7. The method according to any one of claims 1-6, wherein the user's positioning information further comprises a user positioning time when the user is located at the user positioning coordinates; and,

The obtaining of the user's positioning information includes:

Select the user positioning coordinates whose user positioning time falls within the preset time period, and establish the original user positioning coordinate set;

Eliminating abnormal points in the original user positioning coordinate set to obtain a user positioning coordinate set, wherein the abnormal point refers to a coordinate point whose moving distance in the second preset time period is greater than a preset distance threshold;

Aggregating at least one user positioning coordinate in the set of user positioning coordinates into one track center coordinate through a trajectory clustering algorithm;

Taking the average time point of the corresponding time points of at least one user positioning time in the user positioning coordinate set as the track center time; and,

Using the track center coordinates and the track center time as input values of the pre-trained Bayesian prediction model, and obtaining, according to the Bayesian prediction model, the probability value of the user being at each geographic information point in the at least one geographic information point, comprises:

The coordinates of the track center are used as input values of a pre-trained Bayesian prediction model, and a probability value of the user being at each geographic information point in the at least one geographic information point is obtained according to the Bayesian prediction model.
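The preprocessing in claim 7 (time-window selection, removal of abnormal points, aggregation into one track center) can be sketched as below. The claim names a trajectory clustering algorithm without specifying it; this sketch swaps in a plain centroid as the simplest stand-in. The tuple layout `(t_seconds, x, y)` and all parameter names are hypothetical.

```python
import math

def track_center(fixes, t_start, t_end, max_jump, jump_window):
    """Return ((cx, cy), center_time) for one user's fixes, or None.

    fixes: list of (t_seconds, x, y), sorted by time.
    A fix is treated as abnormal when it lies more than max_jump away from
    the previously kept fix and was recorded within jump_window seconds of it.
    """
    # Step 1: keep only fixes inside the preset time period.
    window = [f for f in fixes if t_start <= f[0] <= t_end]
    # Step 2: drop abnormal points (implausibly fast movement).
    kept = []
    for f in window:
        if kept:
            t0, x0, y0 = kept[-1]
            moved = math.hypot(f[1] - x0, f[2] - y0)
            if f[0] - t0 <= jump_window and moved > max_jump:
                continue
        kept.append(f)
    if not kept:
        return None
    # Step 3: aggregate to a single center (centroid, an assumed stand-in
    # for the trajectory clustering algorithm) and the mean time point.
    n = len(kept)
    center_xy = (sum(f[1] for f in kept) / n, sum(f[2] for f in kept) / n)
    center_time = sum(f[0] for f in kept) / n
    return center_xy, center_time

fixes = [(0, 0.0, 0.0), (10, 2.0, 0.0), (20, 500.0, 0.0), (30, 4.0, 0.0)]
result = track_center(fixes, 0, 60, max_jump=100.0, jump_window=15)
```

The fix at `t=20` jumps almost 500 units in 10 seconds and is discarded, so the center is computed from the remaining three fixes.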

8. The method according to claim 7, wherein after determining the geographic information point corresponding to the maximum probability value as the geographic information point where the user is located, the method further comprises:

adding the historical visiting user mark of the geographic information point corresponding to the maximum probability value to the positioning information of the user;

adding the positioning information of the user marked with the historical visiting user to the sample data set of the Bayesian prediction model;

Using the sample data in the sample data set to train and generate a new Bayesian prediction model.
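The feedback loop of claim 8 (label the new fix, add it to the sample set, retrain) can be sketched as follows. For brevity, "retraining" here only recomputes the visit probabilities from the labelled samples; a fuller model would also refresh its time and location terms. The dictionary keys and function name are hypothetical.

```python
from collections import Counter

def update_and_retrain(samples, user_fix, predicted_poi):
    """Label user_fix with the predicted point, append it to samples,
    and return visit probabilities recomputed from all samples.

    samples: list of dicts, each with a "visited_poi" key (assumed layout).
    """
    labelled = dict(user_fix, visited_poi=predicted_poi)
    samples.append(labelled)
    counts = Counter(s["visited_poi"] for s in samples)
    total = sum(counts.values())
    return {poi: n / total for poi, n in counts.items()}

samples = [{"xy": (0.0, 0.0), "visited_poi": "cafe"}]
new_priors = update_and_retrain(samples, {"xy": (1.0, 1.0)}, "gym")
```

Because the label comes from the model's own prediction, errors can reinforce themselves over time; a confidence threshold before adding a sample would be one possible safeguard, though the claim does not mention one.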

9. A device for determining geographic information points, wherein the device comprises:

an acquisition module, configured to acquire user positioning information, wherein the positioning information includes user positioning coordinates;

a computing module, configured to use the user positioning coordinates as an input value of a pre-trained Bayesian prediction model, and obtain, according to the Bayesian prediction model, the probability value of the user being at each geographic information point in at least one geographic information point, wherein the Bayesian prediction model is trained using the basic information of the geographic information points as sample data, and the basic information includes the positioning coordinates of the geographic information points and the historical positioning information of historical visiting users;

a determining module, configured to determine the geographic information point corresponding to the maximum probability value as the geographic information point where the user is located; wherein

Historical visiting users are determined in the following ways:

Obtain the historical positioning information and historical search records of the historical users corresponding to the historical positioning coordinates, wherein the historical positioning information includes the historical positioning coordinates and the historical positioning time when the historical positioning coordinates were collected;

10. The apparatus according to claim 9, wherein the Bayesian prediction model parameter comprises a historical visiting probability of each geographic information point in the at least one geographic information point, wherein the historical visiting probability is based on The positioning coordinates of the geographic information point and the historical positioning information are obtained, wherein:

11. The device according to claim 9, wherein the historical positioning information of the historical visiting user comprises historical positioning coordinates and a historical positioning time when the historical visiting user is located at the historical positioning coordinates; and,

The Bayesian prediction model parameters include:

12. The apparatus according to claim 11, wherein the user's positioning information further comprises a user positioning time when the user is located at the user positioning coordinates; and,

The computing module is further used for:

13. The apparatus according to claim 9, wherein the Bayesian prediction model parameters comprise a location probability of a geographic information point, wherein the location probability is obtained according to the distance between the geographic information point and a cluster center, and the cluster center is obtained by clustering at least one geographic information point.

14. The apparatus according to claim 13, wherein the cluster center is obtained by clustering at least one geographic information point, comprising:

15. The device according to any one of claims 9-14, wherein the user's positioning information further comprises a user positioning time when the user is located at the user positioning coordinates; and,

The obtaining module is further used for:

16. The apparatus of claim 15, wherein the apparatus further comprises an update module configured to: