CN112380302A - Thermodynamic diagram generation method and device based on track data, electronic equipment and storage medium - Google Patents

Publication number
CN112380302A
Authority
CN
China
Prior art keywords
data
map data
map
tiles
clustering
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011148718.7A
Other languages
Chinese (zh)
Other versions
CN112380302B (en
Inventor
张健钦
张昊
郭小刚
卢剑
陆浩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Civil Engineering and Architecture
Original Assignee
Beijing University of Civil Engineering and Architecture
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Civil Engineering and Architecture
Priority to CN202011148718.7A
Publication of CN112380302A
Application granted
Publication of CN112380302B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/29: Geographical information databases
    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10: File systems; File servers
    • G06F16/17: Details of further file system functions
    • G06F16/172: Caching, prefetching or hoarding of files
    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10: File systems; File servers
    • G06F16/18: File system types
    • G06F16/182: Distributed file systems
    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22: Indexing; Data structures therefor; Storage structures
    • G06F16/2282: Tablespace storage structures; Management thereof
    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28: Databases characterised by their database models, e.g. relational or object models
    • G06F16/284: Relational databases
    • G06F16/285: Clustering or classification

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Remote Sensing (AREA)
  • Software Systems (AREA)
  • Processing Or Creating Images (AREA)

Abstract

The embodiments of the present invention disclose a method, an apparatus, an electronic device, and a storage medium for generating a heat map (thermodynamic diagram) based on trajectory data. The method includes: acquiring trajectory data and map data; storing the trajectory data in its original format in the Hadoop platform distributed file system; clustering the trajectory data to obtain cluster data; storing the map data and the cluster data in an HBase distributed database; obtaining, from the HBase distributed database, the map data and cluster data corresponding to the heat map to be generated; and generating the heat map from the obtained map data and cluster data. The method and apparatus retain the positional characteristics of the trajectory data while improving the efficiency of heat map visualization, shortening map generation time, reducing the stuttering caused by user interaction, and improving the user experience.

Figure 202011148718

Description

Thermodynamic diagram generation method and device based on track data, electronic equipment and storage medium
Technical Field
Embodiments of the invention relate to the field of computer technology, and in particular to a thermodynamic diagram (heat map) generation method and device based on track data, an electronic device, and a storage medium.
Background
In recent years, with the continuous development of satellite positioning technology, location-based service (LBS) technology, and the internet, position data is collected in many ways, and the volume of trajectory big data has grown explosively. Conventional databases cannot cope with the management of such data or with the required expansion of storage capacity. The big data era brings problems such as changing data structures, complex storage structures, and information fragmentation, and technology for storing and managing trajectory big data is one of the key research directions in the GIS field. Massive track data has great research value and contains a large amount of geographical and spatial information. Rendering the track data as a thermodynamic diagram displays its spatial position characteristics comprehensively, so that researchers can conveniently mine the spatial information of the current area and analyze vehicle movement characteristics.
At present, thermodynamic diagram visualization of track data has three main shortcomings: first, the data volume is large, so generating the visualization takes a long time and interactivity is poor; second, the thermodynamic diagram adapts poorly, so that when the zoom level is switched the positional characteristics of the track data shown in the diagram are heavily distorted; and third, the same color gradient is used at every zoom level, so dense data areas exhibit a "hot core" phenomenon. Existing approaches to processing track data optimize only storage and query performance and therefore cannot meet the technical requirements of large-scale data visualization. Existing optimizations of big data visualization improve mapping efficiency mainly by reducing the overall data volume; however, this still does not fully overcome the shortcomings of thermodynamic diagram visualization of track data.
Disclosure of Invention
It is an object of embodiments of the present invention to address at least the above problems and/or disadvantages and to provide at least the advantages described hereinafter.
The embodiment of the invention provides a thermodynamic diagram generation method and device based on track data, an electronic device, and a storage medium, which can improve the efficiency of thermodynamic diagram visualization.
In a first aspect, a thermodynamic diagram generation method based on trajectory data is provided, including:
acquiring track data and map data;
storing the track data in a Hadoop platform distributed file system in an original format;
clustering the track data to obtain clustered data;
storing the map data and the cluster data in an HBase distributed database;
obtaining map data and clustering data corresponding to the thermodynamic diagram to be generated from the HBase distributed database;
and generating a thermodynamic diagram according to the acquired map data and the cluster data.
Optionally, the storing the trajectory data in the Hadoop platform distributed file system in the original format includes:
dividing the track data into a plurality of time slices, wherein each time slice comprises all track data in a preset time range;
in the Hadoop platform distributed file system, the track data contained in the same time slice is stored in a concentrated mode in an original format, and the time slices are stored adjacently according to a time sequence.
Optionally, the map data has a plurality of zoom levels;
the clustering the track data to obtain clustered data includes:
determining a plurality of groups of clustering parameters according to the plurality of zoom levels;
clustering is carried out on the track data contained in each time slice according to the multiple groups of clustering parameters to obtain multiple groups of clustering data corresponding to the multiple zooming levels for each time slice;
the acquiring of the map data and the cluster data corresponding to the thermodynamic diagram to be generated from the HBase distributed database comprises the following steps:
determining the zoom level of the map data corresponding to the thermodynamic diagram to be generated according to the zoom level of the thermodynamic diagram to be generated;
determining a time slice to which clustering data corresponding to the thermodynamic diagram to be generated belong according to the time range of the thermodynamic diagram to be generated;
and acquiring the map data under the corresponding zoom level and the cluster data under the corresponding zoom level under the corresponding time slice from the HBase distributed database.
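The retrieval step above can be sketched as follows. This is only an illustrative sketch: the in-memory `store` dictionary stands in for the HBase tables, and the names `slices_for_range` and `fetch`, the hourly slice length, and the `(zoom_level, slice_start)` key are assumptions that do not come from the patent.

```python
from datetime import datetime, timedelta


def slices_for_range(start, end, slice_length=timedelta(hours=1)):
    """Return the start times of all time slices that overlap [start, end).

    Assumes slices are aligned to midnight of the day containing `start`.
    """
    day_start = start.replace(hour=0, minute=0, second=0, microsecond=0)
    first = day_start + ((start - day_start) // slice_length) * slice_length
    result = []
    t = first
    while t < end:
        result.append(t)
        t += slice_length
    return result


def fetch(store, zoom_level, start, end):
    """Look up map tiles and cluster data for one zoom level and time range.

    `store` is a hypothetical stand-in for the HBase tables:
      store["map"][zoom_level]                    -> map tiles at that level
      store["clusters"][(zoom_level, slice_start)] -> cluster data for a slice
    """
    tiles = store["map"][zoom_level]
    clusters = [
        store["clusters"][(zoom_level, s)]
        for s in slices_for_range(start, end)
        if (zoom_level, s) in store["clusters"]
    ]
    return tiles, clusters
```

Because the cluster data is precomputed per zoom level and per time slice, a request only has to resolve two keys and read the matching entries; no clustering runs at request time.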
Optionally, the map data has a plurality of zoom levels;
the clustering the track data to obtain clustered data includes:
determining a plurality of groups of clustering parameters according to the plurality of zoom levels;
clustering the track data according to the multiple groups of clustering parameters to obtain multiple groups of clustering data corresponding to the multiple zooming levels;
the acquiring of the map data and the cluster data corresponding to the thermodynamic diagram to be generated from the HBase distributed database comprises the following steps:
determining the zoom level of the map data corresponding to the thermodynamic diagram to be generated according to the zoom level of the thermodynamic diagram to be generated;
and obtaining the map data and the cluster data under the corresponding zoom level from the HBase distributed database.
Optionally, each set of clustering parameters includes a scan radius;
determining a plurality of groups of clustering parameters according to the plurality of zoom levels comprises:
determining a scanning radius corresponding to each zooming level according to the zooming levels; wherein the scan radius corresponding to each zoom level decreases as the respective zoom level decreases.
Optionally, each set of clustering parameters includes a minimum contained point number;
the determining multiple groups of clustering parameters according to the multiple zoom levels further comprises:
and determining the minimum contained points corresponding to each zooming level according to the zooming levels, wherein the minimum contained points corresponding to each zooming level are reduced along with the reduction of the corresponding zooming level.
Optionally, each set of cluster data includes the center coordinates and influence values of a plurality of clusters and the coordinates and influence values of a plurality of noise points.
Optionally, the clustering is implemented based on the DBSCAN algorithm.
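As a rough illustration of how one set of cluster data might be produced, the sketch below runs a plain DBSCAN over 2-D points with a given scan radius (`eps`) and minimum contained point number (`min_pts`), then reduces the result to cluster centers and noise points. The text does not define how the center or the influence value is computed, so taking the mean coordinate as the center and the member count as the influence value is an assumption made here, as are all function names.

```python
import math

NOISE = -1


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or NOISE."""
    labels = [None] * len(points)

    def neighbours(i):
        # Brute-force range query; the point itself counts as a neighbour.
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE
            continue
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = neighbours(j)
            if len(nbrs) >= min_pts:  # j is a core point, keep expanding
                queue.extend(nbrs)
        cluster += 1
    return labels


def summarise(points, labels):
    """Reduce labelled points to (x, y, influence) entries.

    Assumption: center = mean coordinate, influence = number of members;
    each noise point keeps its own coordinate with influence 1.
    """
    groups = {}
    for p, lab in zip(points, labels):
        groups.setdefault(lab, []).append(p)
    clusters = []
    for lab, members in sorted(groups.items()):
        if lab == NOISE:
            continue
        cx = sum(p[0] for p in members) / len(members)
        cy = sum(p[1] for p in members) / len(members)
        clusters.append((cx, cy, len(members)))
    noise = [(p[0], p[1], 1) for p in groups.get(NOISE, [])]
    return clusters, noise
```

Running this once per zoom level, each time with that level's `eps` and `min_pts`, would yield the multiple sets of cluster data described above. The brute-force neighbour search is quadratic and is only meant to keep the sketch short.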
Optionally, the storing the cluster data in an HBase distributed database includes:
and respectively constructing each clustering data table aiming at each group of clustering data of each time slice corresponding to each zoom level.
Optionally, the map data has a plurality of zoom levels;
the storing the map data in an HBase distributed database comprises:
and constructing each map data table aiming at the map data at each zoom level, and storing 4 tiles which are contained in the map data at each zoom level and are adjacent to each other in a display state into the same row in the corresponding map data table.
Optionally, the constructing each map data table for the map data at each zoom level, and storing 4 tiles, which are included in the map data at each zoom level and are adjacent to each other in the display state, in the same row in the corresponding map data table includes:
calculating the total order m of the map data at each zoom level according to the number n of tiles contained in each row of the map data at each zoom level, wherein
m = ⌊n/2⌋ (n/2 rounded down);
when n - 2m = 1, dividing the map data at each zoom level into m × m square sub-grids and n edge sub-grids, wherein each square sub-grid is composed of 4 tiles, each of the 2m edge sub-grids adjacent to the square sub-grids is composed of 2 tiles, and the 1 edge sub-grid not adjacent to the square sub-grids is composed of 1 tile;
filling the m × m square sub-grids based on a Z-shaped filling curve, filling the 2m edge sub-grids based on a linear filling curve, connecting the filling curves of the m × m square sub-grids and the 2m edge sub-grids into a whole, and extending the connected filling curve to the 1 edge sub-grid that is not adjacent to the square sub-grids;
encoding the n × n tiles according to their filling order;
and constructing each map data table for the map data at each zoom level, and sequentially storing the n × n tiles in the corresponding map data table based on their codes, wherein the 4 tiles belonging to the same square sub-grid are stored in the same row of the corresponding map data table, and the tiles belonging to the same edge sub-grid are stored in the same row of the corresponding map data table.
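The partition described above can be checked with a small sketch. It assumes the edge sub-grids lie along the last column and the last row of the grid; the text does not say which sides they occupy, and the function name `partition_tiles` is hypothetical. For odd n the tile counts add up as 4m² + 2·2m + 1 = (2m + 1)² = n².

```python
def partition_tiles(n):
    """Split an n x n tile grid into sub-grids of (row, col) tiles.

    Returns (squares, edges, corner):
      squares: m*m sub-grids of 4 tiles each
      edges:   2m sub-grids of 2 tiles each (only when n is odd)
      corner:  1 sub-grid of 1 tile (only when n is odd)
    Assumption: edge sub-grids occupy the last column and the last row.
    """
    m = n // 2
    squares = [
        [(2 * i + di, 2 * j + dj) for di in (0, 1) for dj in (0, 1)]
        for i in range(m)
        for j in range(m)
    ]
    edges, corner = [], []
    if n - 2 * m == 1:
        last = n - 1
        # Vertical pairs down the last column, next to the square sub-grids.
        edges += [[(2 * i, last), (2 * i + 1, last)] for i in range(m)]
        # Horizontal pairs along the last row, next to the square sub-grids.
        edges += [[(last, 2 * j), (last, 2 * j + 1)] for j in range(m)]
        # The single tile that touches no square sub-grid along an edge.
        corner = [[(last, last)]]
    return squares, edges, corner
```

When n = 2m the same function returns only square sub-grids, which matches the even case.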
Optionally, the constructing each map data table for the map data at each zoom level, and storing 4 tiles, which are included in the map data at each zoom level and are adjacent to each other in the display state, in the same row in the corresponding map data table includes:
calculating the total order m of the map data at each zoom level according to the number n of tiles contained in each row of the map data at each zoom level, wherein
m = ⌊n/2⌋ (n/2 rounded down);
when n = 2m, dividing the map data at each zoom level into m × m square sub-grids, wherein each square sub-grid is composed of 4 tiles;
filling the m × m square sub-grids based on a Z-shaped filling curve;
encoding the n × n tiles according to their filling order;
and constructing each map data table for the map data at each zoom level, and sequentially storing the n × n tiles in the corresponding map data table based on their codes, wherein the 4 tiles belonging to the same square sub-grid are stored in the same row of the corresponding map data table.
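One possible reading of the even case is sketched below: square sub-grids are visited in Z-order (Morton order), the 4 tiles inside each sub-grid are also visited in a Z pattern, and each sub-grid maps to one table row. The exact direction of the Z curve is shown only in the figures, so the bit layout used here (column in the even bit positions, row in the odd ones) is an assumption, as are the names `morton` and `encode_even_grid`.

```python
def morton(row, col, bits=16):
    """Interleave the bits of row and col (col in the even bit positions)."""
    code = 0
    for b in range(bits):
        code |= ((col >> b) & 1) << (2 * b)
        code |= ((row >> b) & 1) << (2 * b + 1)
    return code


def encode_even_grid(n):
    """Return {(row, col): (table_row, tile_code)} for an n x n grid, n = 2m."""
    if n % 2:
        raise ValueError("this sketch only covers the case n = 2m")
    m = n // 2
    # Visit the m x m square sub-grids in Z-order of their (i, j) index.
    blocks = sorted(
        ((i, j) for i in range(m) for j in range(m)),
        key=lambda ij: morton(*ij),
    )
    layout = {}
    for table_row, (i, j) in enumerate(blocks):
        # Inside a sub-grid: top-left, top-right, bottom-left, bottom-right.
        for inner, (di, dj) in enumerate(((0, 0), (0, 1), (1, 0), (1, 1))):
            layout[(2 * i + di, 2 * j + dj)] = (table_row, 4 * table_row + inner)
    return layout
```

With this layout the 4 tiles that are adjacent on screen share one table row, so a viewport that needs a 2 × 2 block can be served by a single row read.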
In a second aspect, a thermodynamic diagram generation apparatus based on trajectory data is provided, including:
the first acquisition module is used for acquiring track data and map data;
the first storage module is used for storing the track data in a Hadoop platform distributed file system in an original format;
the clustering module is used for clustering the track data to obtain clustering data;
the second storage module is used for storing the map data and the cluster data in an HBase distributed database;
the second acquisition module is used for acquiring map data and cluster data corresponding to the thermodynamic diagram to be generated from the HBase distributed database;
and the generating module is used for generating the thermodynamic diagram according to the acquired map data and the cluster data.
Optionally, the first storage module is specifically configured to:
dividing the track data into a plurality of time slices, wherein each time slice comprises all track data in a preset time range;
in the Hadoop platform distributed file system, the track data contained in the same time slice is stored in a concentrated mode in an original format, and the time slices are stored adjacently according to a time sequence.
Optionally, the map data has a plurality of zoom levels;
the clustering module comprises:
a first determining unit, configured to determine multiple groups of clustering parameters according to the multiple zoom levels;
the clustering unit is used for clustering the track data contained in each time slice according to the multiple groups of clustering parameters to obtain multiple groups of clustering data corresponding to the multiple zoom levels for each time slice;
the second obtaining module includes:
a second determining unit, configured to determine, according to a zoom level of the thermodynamic diagram to be generated, a zoom level of map data corresponding to the thermodynamic diagram to be generated;
a third determining unit, configured to determine, according to the time range of the thermodynamic diagram to be generated, a time slice to which cluster data corresponding to the thermodynamic diagram to be generated belongs;
and the acquisition unit is used for acquiring the map data at the corresponding zoom level and the cluster data at the corresponding zoom level in the corresponding time slice from the HBase distributed database.
Optionally, the map data has a plurality of zoom levels;
the clustering module comprises:
a first determining unit, configured to determine multiple groups of clustering parameters according to the multiple zoom levels;
the clustering unit is used for clustering the track data according to the multiple groups of clustering parameters to obtain multiple groups of clustering data corresponding to the multiple zooming levels;
the second obtaining module includes:
a second determining unit, configured to determine, according to a zoom level of the thermodynamic diagram to be generated, a zoom level of map data corresponding to the thermodynamic diagram to be generated;
and the acquisition unit is used for acquiring the map data and the cluster data under the corresponding zoom level from the HBase distributed database.
Optionally, each set of clustering parameters includes a scan radius;
the first determining unit is specifically configured to:
determining a scanning radius corresponding to each zooming level according to the zooming levels; wherein the scan radius corresponding to each zoom level decreases as the respective zoom level decreases.
Optionally, each set of clustering parameters includes a minimum contained point number;
the first determining unit is specifically configured to:
and determining the minimum contained points corresponding to each zooming level according to the zooming levels, wherein the minimum contained points corresponding to each zooming level are reduced along with the reduction of the corresponding zooming level.
Optionally, each set of cluster data includes the center coordinates and influence values of a plurality of clusters and the coordinates and influence values of a plurality of noise points.
Optionally, the clustering is implemented based on the DBSCAN algorithm.
Optionally, the second storage module includes:
and the first construction unit is used for constructing each clustering data table aiming at each group of clustering data corresponding to each zooming level of each time slice.
Optionally, the map data has a plurality of zoom levels;
the second storage module includes:
and the second construction unit is used for constructing each map data table aiming at the map data at each zoom level and storing 4 tiles which are adjacent to each other in the display state and are contained in the map data at each zoom level in the same row of the corresponding map data table.
Optionally, the second building unit is specifically configured to:
calculating the total order m of the map data at each zoom level according to the number n of tiles contained in each row of the map data at each zoom level, wherein
m = ⌊n/2⌋ (n/2 rounded down);
when n - 2m = 1, dividing the map data at each zoom level into m × m square sub-grids and n edge sub-grids, wherein each square sub-grid is composed of 4 tiles, each of the 2m edge sub-grids adjacent to the square sub-grids is composed of 2 tiles, and the 1 edge sub-grid not adjacent to the square sub-grids is composed of 1 tile;
filling the m × m square sub-grids based on a Z-shaped filling curve, filling the 2m edge sub-grids based on a linear filling curve, and connecting the m × m square sub-grids and the n edge sub-grids by using connecting lines;
encoding the n × n tiles according to their filling order;
and constructing each map data table for the map data at each zoom level, and sequentially storing the n × n tiles in the corresponding map data table based on their codes, wherein the 4 tiles belonging to the same square sub-grid are stored in the same row of the corresponding map data table, and the tiles belonging to the same edge sub-grid are stored in the same row of the corresponding map data table.
Optionally, the second building unit is specifically configured to:
calculating the total order m of the map data at each zoom level according to the number n of tiles contained in each row of the map data at each zoom level, wherein
m = ⌊n/2⌋ (n/2 rounded down);
when n = 2m, dividing the map data at each zoom level into m × m square sub-grids, wherein each square sub-grid is composed of 4 tiles;
filling the m × m square sub-grids based on a Z-shaped filling curve;
encoding the n × n tiles according to their filling order;
and constructing each map data table for the map data at each zoom level, and sequentially storing the n × n tiles in the corresponding map data table based on their codes, wherein the 4 tiles belonging to the same square sub-grid are stored in the same row of the corresponding map data table.
In a third aspect, an electronic device is provided, including: at least one processor, and a memory communicatively coupled to the at least one processor, wherein the memory stores instructions executable by the at least one processor to cause the at least one processor to perform the method described above.
In a fourth aspect, a storage medium is provided, on which a computer program is stored which, when executed by a processor, implements the method described above.
The embodiment of the invention at least comprises the following beneficial effects:
according to the thermodynamic diagram generation method and device based on track data provided by the embodiment of the invention, track data and map data are first acquired; the track data is stored in the Hadoop platform distributed file system in its original format; the track data is clustered to obtain cluster data; the map data and the cluster data are stored in an HBase distributed database; the map data and cluster data corresponding to the thermodynamic diagram to be generated are obtained from the HBase distributed database; and the thermodynamic diagram is generated from the acquired map data and cluster data. The method and device improve the efficiency of thermodynamic diagram visualization while retaining the positional characteristics of the track data, shorten diagram generation time, alleviate the stuttering caused by user interaction, and improve the user experience.
Additional advantages, objects, and features of embodiments of the invention will be set forth in part in the description which follows and in part will become apparent to those having ordinary skill in the art upon examination of the following or may be learned from practice of embodiments of the invention.
Drawings
FIG. 1 is a flow chart of a method for generating a thermodynamic diagram based on trajectory data according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a track data storage mode according to an embodiment of the present invention;
FIG. 3 is a diagram illustrating map data in a display state according to an embodiment of the present invention;
FIG. 4(a) is a schematic diagram of the encoding method of map data when n is 2 according to an embodiment of the present invention;
FIG. 4(b) is a schematic diagram of the encoding method of map data when n is 4 according to an embodiment of the present invention;
FIG. 4(c) is a schematic diagram of the encoding flow of map data when n is 3 according to an embodiment of the present invention;
FIG. 4(d) is a schematic diagram of the encoding method of map data when n is 3 according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of a storage frame for track data and map data according to another embodiment of the present invention;
FIG. 6 is a flowchart of loading map data according to another embodiment of the present invention;
FIG. 7 is a comparison graph of map data loading durations according to another embodiment of the present invention;
FIG. 8 is a flowchart of a thermodynamic diagram generation method based on trajectory data according to another embodiment of the present invention;
FIG. 9(a) is a thermodynamic diagram generated using map data at zoom level 11 and raw trajectory data without clustering, according to yet another embodiment of the present invention;
FIG. 9(b) is a thermodynamic diagram generated using map data at zoom level 11 and cluster data according to yet another embodiment of the present invention;
FIG. 9(c) is a thermodynamic diagram generated using map data at zoom level 12 and raw trajectory data without clustering provided by yet another embodiment of the present invention;
FIG. 9(d) is a thermodynamic diagram generated using map data at zoom level 12 and cluster data according to yet another embodiment of the present invention;
FIG. 9(e) is a thermodynamic diagram generated using map data at zoom level 13 and raw trajectory data without clustering provided by yet another embodiment of the present invention;
FIG. 9(f) is a thermodynamic diagram generated using clustered data and map data at zoom level 13 according to yet another embodiment of the present invention;
FIG. 9(g) is a thermodynamic diagram generated using map data at zoom level 14 and raw trajectory data without clustering provided by yet another embodiment of the present invention;
FIG. 9(h) is a thermodynamic diagram generated using map data at zoom level 14 and cluster data according to yet another embodiment of the present invention;
FIG. 10 is a comparison graph of the generation duration of a thermodynamic diagram provided by yet another embodiment of the present invention;
FIG. 11 is a schematic structural diagram of a thermodynamic diagram generation apparatus based on trajectory data according to an embodiment of the present invention;
FIG. 12 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
The embodiments of the present invention will be described in further detail with reference to the accompanying drawings so that those skilled in the art can implement the embodiments of the invention with reference to the description.
At present, relational databases such as Oracle and PostgreSQL serve as data warehouses for track data. They are mainly used to statically store and represent the state of track data in a specific period, so they cannot store and manage information in real time. Specifically, a conventional database storage scheme can reach the upper limit of its processing load when the volume of input and output data is large, and it cannot support fast storage and querying of massive data; moreover, a conventional database handles a single data type and performs poorly at capacity expansion and data backup when facing large amounts of data. The open-source Hadoop cloud storage framework offers high scalability, high fault tolerance, low cost, and strong computing power, and can provide technical support for storing real-time massive track data. HBase is a NoSQL database built on Hadoop that includes core functions such as the HDFS heartbeat mechanism and data backup. In terms of storage, HBase supports various data structures, can handle PB-level data, and scales well, so it can be used to store massive data.
Fig. 1 is a flowchart of a method for generating a thermodynamic diagram based on trajectory data, which is executed by a system with processing capability, a server, or a thermodynamic diagram generating device based on trajectory data according to an embodiment of the present invention. As shown in fig. 1, the method includes:
Step 110: acquire track data and map data.
The trajectory data is a sampling sequence with position and time information, and contains the space-time dynamics of the object to be researched. Based on the analysis of the trajectory data, a spatiotemporal distribution characteristic of the object under study may be obtained.
Step 120: store the track data in the Hadoop platform distributed file system in its original format.
The original format of the track data is a txt file. When the track data is stored in HDFS, it is stored directly in txt format, without any processing of its format. Storing the data this way improves the efficiency of storing and managing massive track data, and in turn the efficiency of generating the thermodynamic diagram.
In some embodiments, storing the track data in the original format in the Hadoop platform distributed file system comprises: dividing the track data into a plurality of time slices, wherein each time slice comprises all track data in a preset time range; and, in the Hadoop platform distributed file system, storing the track data contained in the same time slice together in the original format, with the time slices stored adjacently in chronological order.
The HBase distributed database stores data in tables. A table consists of rows and columns, and the columns are divided into several column families. Like other NoSQL databases, HBase uses the Row Key as the primary key for retrieving records, and data is stored in the lexicographic order of the Row Key. Each column in a table belongs to a column family, and the smallest storage unit is the cell; the data in a cell is untyped and stored as bytes. The original data therefore needs to be preprocessed before it is stored.
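Because rows are ordered by the byte-wise lexicographic order of the Row Key, the way a key is encoded determines which rows end up adjacent. The following sketch illustrates this with plain Python byte strings rather than an HBase client; the key layout `zoom_tilecode` and the helper `make_row_key` are hypothetical and are not taken from the patent.

```python
def make_row_key(zoom_level, tile_code, width=8):
    """Build a byte-string row key whose lexicographic order matches numeric order.

    Zero-padding the numeric parts keeps, for example, code 2 before code 10.
    """
    return f"{zoom_level:02d}_{tile_code:0{width}d}".encode("utf-8")


codes = [2, 10, 1]

# Without padding, byte-wise ordering puts "10" before "2".
unpadded = sorted(str(c).encode("utf-8") for c in codes)

# With padding, byte-wise ordering agrees with numeric ordering.
padded = sorted(make_row_key(12, c) for c in codes)

print(unpadded)
print(padded)
```

Preprocessing of this kind is one reason values must be converted to a well-defined byte representation before they are written.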
The method comprises the following steps of preprocessing original track data, wherein the process comprises the following steps: firstly storing the cluster data into an HDFS (Hadoop distributed file system) according to the original format, and then storing the cluster data into a warehouse according to rows by using HBase. The original track data storage mode is a storage mode based on time dimension, namely a storage mode with time attribute priority. By adopting the method, the spatial point clustering can be conveniently carried out, namely, the clustering analysis can be conveniently carried out on the track data, and the mining analysis of the track data based on time and space is facilitated. Other storage methods, such as a storage method based on vehicle trajectories, a storage method based on spatial distribution, and the like, cannot guarantee effective support of query conditions required for such analysis, because trajectory data in the same time period is not continuously stored on a storage device, which may cause a large number of IO to be generated, thereby reducing data access efficiency.
The embodiment of the invention stores the original trajectory data along the time dimension. Specifically, all trajectory data is sorted by time and divided into a plurality of time slices, each containing all trajectory data within a preset time range. The trajectory data belonging to the same time slice is stored together, and the time slices are arranged in chronological order, so that the trajectory data is stored contiguously in the storage space. For example, if the trajectory data of one day is divided into one time slice per hour, the day yields 24 time slices: 0:00 to 1:00, 1:00 to 2:00, ..., 23:00 to 24:00. The trajectory data in each time slice is stored together, the data for 0:00 to 1:00 is stored adjacent to the data for 1:00 to 2:00, the data for 1:00 to 2:00 adjacent to the data for 2:00 to 3:00, and so on, so that all 24 time slices are stored adjacently in chronological order.
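The time-slice grouping described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `slice_by_hour`, the `(timestamp, payload)` record shape, and the sample records are assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime


def slice_by_hour(records, slice_hours=1):
    """Group (timestamp, payload) records into fixed-width time slices.

    Returns a list of ((date, slice_index), records) pairs ordered by time,
    with the records inside each slice also ordered by time.
    """
    slices = defaultdict(list)
    for ts, payload in records:
        # slice_index is 0 for 0:00-1:00, 1 for 1:00-2:00, and so on.
        slices[(ts.date(), ts.hour // slice_hours)].append((ts, payload))
    return [
        (key, sorted(slices[key], key=lambda rec: rec[0]))
        for key in sorted(slices)
    ]


# Hypothetical sample records from two taxis, deliberately out of order.
records = [
    (datetime(2020, 10, 1, 1, 30), "taxi-B"),
    (datetime(2020, 10, 1, 0, 15), "taxi-A"),
    (datetime(2020, 10, 1, 1, 5), "taxi-A"),
]
ordered = slice_by_hour(records)
```

Each slice mixes records from different individuals, since only the time attribute decides where a record is placed.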
Fig. 2 is a schematic diagram of a trajectory data storage mode according to an embodiment of the present invention. In the HDFS, a data table (hereinafter referred to as a trajectory data table) is constructed for each time slice. In the trajectory data table, the column family may include the following: trajectory data ID, latitude LAT, longitude LON, date DATE, and time TIME. The record format of a piece of trajectory data can be: ID1, LAT1, LON1, DATE1, TIME1. Each row in the table stores one piece of trajectory data, and all trajectory data in the time slice is arranged in the table in chronological order. The trajectory data ID indicates the individual to which the trajectory data belongs; for example, for taxi trajectory data, the ID indicates which taxi the record comes from. Thus, within the same trajectory data table, the trajectory data of one time slice may come from different individuals of the study object, that is, different taxis. In other words, when trajectory data is stored in the HDFS, it is organized only by its time attribute, regardless of which individual generated it.
Step 130: cluster the trajectory data to obtain cluster data.
Cluster analysis is a method for selectively extracting information from raw data according to set clustering parameters and conditions, and is commonly used to classify and simplify data.
In this step, clustering the trajectory data preserves its position characteristics while reducing the data volume, which improves the efficiency of generating the thermodynamic diagram. In addition, cluster analysis mitigates the thermonuclear phenomenon in data-dense areas and so improves the visualization effect of the thermodynamic diagram. In the embodiment of the invention, the thermodynamic diagram is not generated directly from the original trajectory data stored in the HDFS. Instead, the trajectory data is clustered, the cluster data is stored in the HBase distributed database, and the required cluster data is read directly from the HBase distributed database when the thermodynamic diagram is generated. This further improves the generation efficiency of the thermodynamic diagram.
In some embodiments, the map data has a plurality of zoom levels; clustering the track data to obtain clustered data, including: determining a plurality of groups of clustering parameters according to a plurality of zoom levels; and clustering the track data according to the multiple groups of clustering parameters to obtain multiple groups of clustering data corresponding to multiple zooming levels.
Existing thermodynamic diagrams adapt poorly to zooming: when the zoom level is switched, the position characteristics of the trajectory data shown by the thermodynamic diagram are heavily distorted, and the color gradient is the same at every zoom level, so data-dense areas exhibit the thermonuclear phenomenon. The embodiment of the invention therefore sets different clustering parameters for the different zoom levels of the map data to obtain clustering results matched to each zoom level. The cluster data corresponding to the zoom level of the thermodynamic diagram to be generated is then used to generate the actual thermodynamic diagram, which mitigates the thermonuclear phenomenon in data-dense areas, displays position features in more detail, and improves the visualization effect.
Further, clustering the trajectory data to obtain clustered data, including: determining a plurality of groups of clustering parameters according to a plurality of zoom levels; and clustering the track data contained in each time slice according to the multiple groups of clustering parameters to obtain multiple groups of clustering data corresponding to multiple zoom levels for each time slice.
In some examples, each set of clustering parameters includes a scan radius; determining a plurality of groups of clustering parameters according to a plurality of zoom levels, comprising: determining a scanning radius corresponding to each zooming level according to the zooming levels; wherein the scan radius corresponding to each zoom level decreases as the respective zoom level decreases.
Each set of clustering parameters includes a scan radius. That is, when performing cluster analysis on trajectory data included in a certain time slice, trajectory data included in each cluster formed must be distributed within the range of the scan radius.
As the zoom level of the map data decreases, the number of tiles included in the map data decreases, and the spatial range of the real geographic space corresponding to a unit area in the map data increases, thereby causing the distribution of the trajectory data corresponding to the unit area in the map data to be denser. Therefore, when the zoom level is reduced, the scanning radius is reduced, so that the number of points contained in each cluster is reduced, the density of the track data in each cluster is reduced, the thermonuclear phenomenon of a local area is improved, and the position characteristics of the track data can be more accurately reflected by each cluster.
Specifically, the scanning radius corresponding to each zoom level may be determined according to the spatial range that a single pixel point of the map data at that zoom level actually covers in the real geographic space. In the map data of different zoom levels, the sizes of the individual pixel points are different. The lower the zoom level is, the smaller the size of a single pixel point is and the smaller the spatial range it actually covers in the real geographic space; conversely, the larger the size of the single pixel point is, the larger the spatial range it actually covers. For example, in certain map data at a lower zoom level, a single pixel point covers only 300 m in the real geographic space, while in map data at a higher zoom level a single pixel point covers 1000 m. The spatial range covered by a single pixel point at each zoom level can be used directly as the scanning radius for that zoom level. Alternatively, this spatial range can be adjusted to a certain extent as required before being set as the scanning radius for that zoom level. The embodiment of the present invention is not particularly limited in this respect.
In some examples, the sets of clustering parameters include a minimum contained point number; determining a plurality of groups of clustering parameters according to a plurality of zoom levels, further comprising: and determining the minimum contained points corresponding to each zooming level according to a plurality of zooming levels, wherein the minimum contained points corresponding to each zooming level are reduced along with the reduction of the corresponding zooming level.
Each set of clustering parameters includes a minimum contained point number. That is, when cluster analysis is performed on the trajectory data in a time slice, each cluster formed must contain at least the minimum contained point number of trajectory points. It should be understood that when both the scanning radius and the minimum contained point number are used as clustering parameters, both limits must be satisfied simultaneously during clustering.
As the zoom level of the map data decreases, the number of tiles included in the map data decreases, and the spatial range in the real geographic space corresponding to a unit area in the map data increases, thereby causing the distribution of the trajectory data corresponding to the unit area in the map data to be denser. Therefore, when the zoom level is reduced, the minimum number of points included is reduced, which is also beneficial to reducing the number of points included in each cluster, and reducing the density of the track data in each cluster, thereby improving the thermonuclear phenomenon of the local area, and enabling each cluster to reflect the position characteristics of the track data more accurately. The minimum number of points included corresponding to each zoom level may be set according to needs, which is not specifically limited in the embodiment of the present invention.
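One way to hold a set of clustering parameters per zoom level is sketched below. The names `ClusterParams` and `build_params`, the zoom levels 10 and 11, and the minimum point counts are hypothetical; the radii reuse the 300 m and 1000 m figures from the example above. The check simply enforces the rule stated in the text that both parameters decrease as the zoom level decreases.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterParams:
    scan_radius_m: float  # scanning radius, in metres
    min_points: int       # minimum contained point number


def build_params(radius_by_zoom, min_points_by_zoom):
    """Build one ClusterParams per zoom level and validate monotonicity."""
    zooms = sorted(radius_by_zoom)
    params = {
        z: ClusterParams(radius_by_zoom[z], min_points_by_zoom[z]) for z in zooms
    }
    # A lower zoom level must not have a larger radius or minimum point count.
    for low, high in zip(zooms, zooms[1:]):
        if (params[low].scan_radius_m > params[high].scan_radius_m
                or params[low].min_points > params[high].min_points):
            raise ValueError(f"parameters must not grow from zoom {high} to {low}")
    return params


# Hypothetical values for two zoom levels.
params = build_params({10: 300.0, 11: 1000.0}, {10: 3, 11: 5})
```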
For any time slice, clustering the track data contained in the time slice based on the multiple groups of clustering parameters corresponding to the multiple zoom levels to obtain multiple groups of clustering data. In some embodiments, each set of cluster data includes a center coordinate and an influence value of a plurality of cluster clusters and a coordinate and an influence value of a plurality of noise points. Here, a noise point may be understood as a discrete point, i.e. individual trajectory data that is not included in any cluster. The noise points may also reflect the position distribution of the trajectory data, and therefore the noise points are taken into account when drawing the thermodynamic diagram.
In some examples, clustering of the trajectory data is implemented with the DBSCAN algorithm. Common clustering algorithms include the DBSCAN algorithm, the K-means algorithm, and others. Compared with other clustering algorithms, DBSCAN has the following advantages: first, it places few requirements on the shape of the data set; second, it can find abnormal points in the data; third, the number of clusters does not need to be set in advance. Because DBSCAN is suitable for dense data sets of any shape, the embodiment of the present invention selects it to cluster the trajectory data.
Specifically, for all trajectory data, the following process is adopted to realize clustering analysis:
(1) firstly, preprocessing all track data and eliminating abnormal points.
(2) And determining multiple groups of clustering parameters according to the multiple zooming levels, wherein the clustering parameters comprise scanning radius and minimum contained points.
(3) Cluster the trajectory data contained in each time slice based on the DBSCAN algorithm. For any time slice, multiple sets of cluster data are obtained, one for each zoom level. It is assumed that the cluster data corresponding to any one zoom level includes n clusters and m noise points.
(4) For any one cluster, the coordinates (x, y) of the center point of the cluster and the value of influence count are calculated using the trajectory data contained in the cluster (see formula (1)). Since the noise points are all single coordinate points, the influence thereof can be directly assigned to 1.
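$$x = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad y = \frac{1}{n}\sum_{i=1}^{n} y_i, \qquad count = n \tag{1}$$
where n is the number of trajectory points in a cluster, and x_i and y_i are the longitude and latitude of the i-th trajectory point in the cluster.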
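Steps (3) and (4) can be illustrated with a small, self-contained DBSCAN sketch. It is not the embodiment's code: it uses plain Euclidean distance on the coordinates (real longitude/latitude data would need a projection or geodesic distance), it takes the influence value of a cluster to be its point count, which is an assumption consistent with noise points being assigned 1, and the names `dbscan` and `summarise` are hypothetical.

```python
import math


def dbscan(points, eps, min_pts):
    """Label each point with a cluster index (0, 1, ...) or -1 for noise."""
    labels = [None] * len(points)  # None means "not visited yet"

    def neighbours(i):
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = neighbours(j)
            if len(nbrs) >= min_pts:  # j is a core point, keep expanding
                queue.extend(nbrs)
    return labels


def summarise(points, labels):
    """Return (x, y, count) per cluster, then (x, y, 1) per noise point."""
    groups = {}
    for p, lab in zip(points, labels):
        if lab != -1:
            groups.setdefault(lab, []).append(p)
    out = []
    for lab in sorted(groups):
        pts = groups[lab]
        n = len(pts)
        out.append((sum(p[0] for p in pts) / n, sum(p[1] for p in pts) / n, n))
    out += [(p[0], p[1], 1) for p, lab in zip(points, labels) if lab == -1]
    return out


points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
labels = dbscan(points, eps=1.5, min_pts=3)
summary = summarise(points, labels)
```

With these toy points the four corner points form one cluster centred at (0.5, 0.5) with influence 4, and (10, 10) remains a noise point with influence 1.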
Step 140: store the map data and the cluster data in the HBase distributed database.
In some embodiments, storing the cluster data in the HBase distributed database includes: constructing a separate cluster data table for each set of cluster data, that is, for each combination of time slice and zoom level. When a thermodynamic diagram needs to be generated, its time range and zoom level are determined, and the cluster data table holding the cluster data of the corresponding time slice and zoom level is queried directly from the HBase distributed database to obtain the corresponding cluster data. In this way, the embodiment of the invention improves the efficiency of retrieving the relevant cluster data from the HBase distributed database and thereby the efficiency of generating the thermodynamic diagram.
The storage pattern of the cluster data is shown in Table 1. The table mainly stores the center coordinates and influence values of the clusters, and the coordinates and influence values of the noise points. The Row Key is a sequentially assigned integer. The column family comprises 4 columns: LAT, LNG, COUNT, and procedure. The first three columns store the latitude, the longitude, and the influence value respectively, and the procedure column serves as a supplementary column for other explanatory or auxiliary information.
Table 1. Cluster data storage schema
Row Key (sequential integer) | LAT | LNG | COUNT | procedure
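The row layout described for Table 1 can be sketched with plain dictionaries standing in for an HBase table. The table naming scheme in `cluster_table_name` and the helper `to_rows` are hypothetical; the only points taken from the text are one table per time slice and zoom level, sequential integer Row Keys, the four column names, and the fact that cell values are stored as bytes.

```python
def cluster_table_name(slice_id, zoom):
    """Hypothetical naming scheme: one table per (time slice, zoom level)."""
    return f"cluster_{slice_id}_z{zoom}"


def to_rows(cluster_data, note=""):
    """Map (lat, lng, count) tuples to {row_key: {column: bytes}}."""
    rows = {}
    for row_key, (lat, lng, count) in enumerate(cluster_data, start=1):
        rows[row_key] = {
            "LAT": str(lat).encode(),
            "LNG": str(lng).encode(),
            "COUNT": str(count).encode(),
            "procedure": note.encode(),  # supplementary information column
        }
    return rows


# Hypothetical cluster centre (influence 4) and noise point (influence 1).
rows = to_rows([(39.9, 116.4, 4), (39.95, 116.3, 1)])
name = cluster_table_name("2020100100", 12)
```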
The map data is generally raster data having a plurality of zoom levels, the map data at each zoom level is composed of a plurality of tiles, and the number of tiles is gradually increased as the zoom level is increased. In order to store and query the map data of each zoom level, the tiles in the map data of each zoom level need to be encoded according to a certain rule. In some embodiments, the tiles may be encoded in an order in which the tiles are naturally arranged in the display state of the map data. Fig. 3 is a schematic diagram of map data provided by an embodiment of the present invention in a display state. As shown in fig. 3, the map data is composed of 12 tiles coded 1-12, the tiles being coded sequentially from top to bottom and from left to right. According to the above coding, 12 tiles included in the map data are sequentially stored in the storage space, that is, the physical storage locations of the tiles having adjacent codes are adjacent to each other, and the physical storage locations of the tiles having non-adjacent codes are not adjacent. However, this encoding method affects the efficiency of reading the map data. As shown in fig. 3, the tiles in the screen display area (the area defined by the dashed box) are coded as 7, 8, 11 and 12, and the 4 tiles are adjacent in the screen display area but spaced at intervals in the physical storage location, in which case the time for querying and reading the data is increased, thereby affecting the efficiency of the thermodynamic diagram generation.
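The effect described for Fig. 3 can be reproduced with a short sketch. It assumes the 12 tiles form 3 rows of 4 tiles, a layout inferred from the codes 7, 8, 11, and 12 forming a square block; the function name `row_major_code` is hypothetical.

```python
def row_major_code(row, col, n_cols):
    """Code of a tile when tiles are numbered left to right, then top to bottom."""
    return row * n_cols + col + 1


n_cols = 4
# Screen display area: the 2 x 2 block in the lower-right corner of the grid.
viewport = [(row, col) for row in (1, 2) for col in (2, 3)]
codes = [row_major_code(row, col, n_cols) for row, col in viewport]

# Gaps between consecutive codes: a gap larger than 1 means the two tiles
# are not stored next to each other.
gaps = [b - a for a, b in zip(codes, codes[1:])]
```

Here `codes` is `[7, 8, 11, 12]` and `gaps` is `[1, 3, 1]`: the block is contiguous on screen but split into two separate runs in storage.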
To reduce the query time, the physical storage locations of tiles that are adjacent in the display state need to be as close as possible, which reduces the data reading time and improves efficiency. The map data adopted by the embodiment of the invention is map tile data based on the quadtree model. Since the HBase distributed database adopts a column-oriented storage mode, 4 tiles adjacent to each other in the display state are stored in the same row. Specifically, storing the map data in the HBase distributed database includes: constructing a map data table for the map data at each zoom level, and storing each group of 4 tiles that are adjacent to each other in the display state in the same row of the corresponding map data table. Here, "adjacent to each other in the display state" means that the 4 tiles abut one another, that is, they form a 2 × 2 square area. Four tiles arranged in a single row or a single column of the map data are adjacent only pairwise, and some pairs among them are separated by other tiles, so they do not count as "adjacent to each other in the display state".
In order to realize the ordered storage and the fast query of the map data, and to enable 4 tiles adjacent to each other in the display state to be stored in the same row in the corresponding map data table, the tiles included in the map data need to be encoded. In some examples, the process of encoding the tiles contained in the map data for each zoom level is as follows:
(1) Calculate the total order m of the map data at each zoom level from the number n of tiles contained in each row of the map data at that zoom level, where
$$m = \left\lceil \frac{n-1}{2} \right\rceil$$
and the brackets ⌈ ⌉ indicate rounding up, so that m = 1 when n is 2 or 3 and m = 2 when n is 4.
(2) Determine the relation between the number of tiles n and the total order m. When n = 2m, the map data at each zoom level is divided into m × m square sub-grids, each made up of 4 tiles.
(3) Fill the m × m square sub-grids based on the Z-shaped filling curve. Each square sub-grid may first be filled with a Z-shaped filling curve, and the filling curves of the sub-grids are then joined by connecting lines so that all sub-grids are filled.
(4) Encode the n × n tiles according to their filling order.
(5) Construct a map data table for the map data at each zoom level, and store the n × n tiles in the corresponding map data table in order of their codes, with the 4 tiles belonging to the same square sub-grid stored in the same row.
Fig. 4(a) is a schematic diagram of the encoding of the map data when n = 2 according to an embodiment of the present invention; Fig. 4(b) is a schematic diagram of the encoding of the map data when n = 4. As shown in Fig. 4(a), when n = 2, m = 1: the map data is filled by a 1st-order Z-shaped filling curve, and the tiles are encoded according to the filling order. As shown in Fig. 4(b), when n = 4, m = 2: the map data is filled by a 2nd-order Z-shaped filling curve, and the tiles are encoded according to the filling order.
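A Z-shaped (Morton) ordering for the even case can be sketched as below. This is an illustration under assumptions, not a reproduction of Fig. 4: sub-grids are visited in Morton order of their (column, row) position, tiles inside a sub-grid are visited top-left, top-right, bottom-left, bottom-right, and the names `interleave`, `encode_even_grid`, and `row_key` are hypothetical.

```python
def interleave(x, y, bits=16):
    """Morton code: bits of x go to even positions, bits of y to odd ones."""
    code = 0
    for b in range(bits):
        code |= ((x >> b) & 1) << (2 * b)
        code |= ((y >> b) & 1) << (2 * b + 1)
    return code


def encode_even_grid(n):
    """Return {(row, col): code} for an n x n tile grid with n even; codes start at 1."""
    if n % 2:
        raise ValueError("n must be even")
    tiles = [(r, c) for r in range(n) for c in range(n)]
    # Primary key: Morton code of the 2 x 2 sub-grid containing the tile.
    # Secondary key: Z position of the tile inside that sub-grid.
    tiles.sort(key=lambda t: (interleave(t[1] // 2, t[0] // 2),
                              (t[0] % 2) * 2 + t[1] % 2))
    return {tile: i + 1 for i, tile in enumerate(tiles)}


def row_key(code):
    """Each table row holds the 4 consecutive codes of one square sub-grid."""
    return (code - 1) // 4 + 1


grid2 = encode_even_grid(2)
grid4 = encode_even_grid(4)
# The lower-right 2 x 2 block of the 4 x 4 grid lands in a single table row.
block_rows = {row_key(grid4[(r, c)]) for r in (2, 3) for c in (2, 3)}
```

Because every 2 × 2 sub-grid receives 4 consecutive codes, a screen area aligned with a sub-grid can be served from one row instead of from scattered storage locations.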
Because the number of tiles in the screen display area may not meet the number required by the Z-shaped fill curve due to the limitation of the screen display area, the present embodiment provides an encoding method in the case that the number of tiles does not support the Z-shaped fill curve encoding. In some examples, the process of encoding the tiles contained in the map data for each zoom level is as follows:
(1) Calculate the total order m of the map data at each zoom level from the number n of tiles contained in each row of the map data at that zoom level, where
$$m = \left\lceil \frac{n-1}{2} \right\rceil$$
and the brackets ⌈ ⌉ indicate rounding up, so that m = 1 when n is 3.
(2) Determine the relation between the number of tiles n and the total order m. When n − 2m = 1, the map data at each zoom level is divided into m × m square sub-grids and n edge sub-grids, where each square sub-grid is composed of 4 tiles, each of the 2m edge sub-grids adjacent to a square sub-grid is composed of 2 tiles, and the 1 edge sub-grid not adjacent to any square sub-grid is composed of 1 tile.
(3) Fill the m × m square sub-grids based on the Z-shaped filling curve and the 2m edge sub-grids based on a linear filling curve, connect the filling curves of the square sub-grids and the 2m edge sub-grids into a whole, and extend the connected curve to the 1 edge sub-grid that is not adjacent to a square sub-grid.
(4) Encode the n × n tiles according to their filling order.
(5) Construct a map data table for the map data at each zoom level, and store the n × n tiles in the corresponding map data table in order of their codes, with the 4 tiles belonging to the same square sub-grid stored in the same row and the tiles belonging to the same edge sub-grid stored in the same row.
Fig. 4(c) is a schematic diagram of the encoding flow of the map data when n = 3 according to an embodiment of the present invention. When n = 3, m = 1; that is, 1 sub-grid of the map data can be filled by a 1st-order Z-shaped filling curve, and the other sub-grids need to be filled in other ways. As shown in Fig. 4(c), the map data is first divided into 1 square sub-grid and 3 edge sub-grids, where the 2 edge sub-grids adjacent to the square sub-grid (the sub-grids numbered 2 and 3) are each composed of 2 tiles, and the edge sub-grid not adjacent to the square sub-grid (the sub-grid numbered 4) is composed of 1 tile. The square sub-grid is then filled based on the Z-shaped filling curve, and the sub-grids numbered 2 and 3 are filled based on linear curves. Next, the filling curves of the square sub-grid and the 2 edge sub-grids are connected into a whole and extended to the sub-grid numbered 4, so that all sub-grids are filled. Finally, the tiles are encoded according to the filling order. Fig. 4(d) is a schematic diagram of the resulting encoding of the map data when n = 3, showing the codes of the 9 tiles contained in the map data.
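The partition for odd n can be sketched as follows. Several choices here are assumptions, since the text fixes them only through the figures: the square sub-grids occupy the top-left 2m × 2m area, the right-hand edge sub-grids are visited before the bottom ones, and the single corner tile comes last. The names `morton` and `encode_odd_grid` are hypothetical, and m is computed as n // 2, which gives m = 1 for n = 3 as in the text.

```python
def morton(x, y, bits=16):
    """Morton code: bits of x at even positions, bits of y at odd positions."""
    return sum((((x >> b) & 1) << (2 * b)) | (((y >> b) & 1) << (2 * b + 1))
               for b in range(bits))


def encode_odd_grid(n):
    """Return (codes, groups) for an n x n tile grid with n = 2m + 1.

    codes maps (row, col) to a code starting at 1; groups lists the tiles of
    each sub-grid in filling order, so groups[k] is the row with Row Key k + 1.
    """
    if n % 2 == 0:
        raise ValueError("n must be odd")
    m = n // 2
    subgrids = sorted(((sr, sc) for sr in range(m) for sc in range(m)),
                      key=lambda s: morton(s[1], s[0]))
    # Square sub-grids: 4 tiles each, visited in a Z shape.
    groups = [[(2 * sr + dr, 2 * sc + dc) for dr in (0, 1) for dc in (0, 1)]
              for sr, sc in subgrids]
    # Edge sub-grids on the right: 2 vertically stacked tiles each (assumed first).
    groups += [[(2 * i, 2 * m), (2 * i + 1, 2 * m)] for i in range(m)]
    # Edge sub-grids at the bottom: 2 side-by-side tiles each.
    groups += [[(2 * m, 2 * i), (2 * m, 2 * i + 1)] for i in range(m)]
    # The single corner tile that touches no square sub-grid.
    groups.append([(2 * m, 2 * m)])
    codes, code = {}, 0
    for group in groups:
        for tile in group:
            code += 1
            codes[tile] = code
    return codes, groups


codes3, groups3 = encode_odd_grid(3)
codes5, groups5 = encode_odd_grid(5)
```

For n = 3 this yields rows of 4, 2, 2, and 1 tiles with codes 1 to 9, matching the sub-grid sizes described for Fig. 4(c); the exact position of each code may differ from Fig. 4(d).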
The storage mode of the map data provided by the embodiment of the invention is shown in Table 2. In Table 2, the primary key (Row Key) corresponds to the number of the sub-grid, which is determined by the filling order of the sub-grids. The column family comprises at least four columns for storing the tiles belonging to the same sub-grid, and the storage order of the tiles is identical to their coding order. Columns are named with tile numbers, so a particular tile can be queried by its XY number. Notes or other information, if any, are stored in the notes column of each table. The map data of Fig. 4(c) and 4(d) includes 4 sub-grids numbered 1, 2, 3, and 4. The Row Keys are determined by the numbers of the 4 sub-grids, and the tiles of each sub-grid are stored in the corresponding row: for sub-grid 1, the 4 tiles coded 1, 2, 3, and 4 are stored in order in one row; sub-grid 2 contains only the 2 tiles coded 5 and 6, which are stored in order in the next row; sub-grid 3 likewise holds the 2 tiles coded 7 and 8 in the following row; and sub-grid 4 contains only the 1 tile coded 9, which is stored in a row of its own.
Table 2. Map data storage schema
Row Key (sub-grid number) | tile columns, named by tile number | notes
It should be understood that, since the number of tiles in the map data varies with the zoom level, the map data at each zoom level needs to be encoded separately and then stored in its own map data table according to that encoding. When the thermodynamic diagram is generated, the map data table for the corresponding zoom level is queried from the HBase distributed database to obtain the map data at that zoom level.
Step 150: acquire the map data and the cluster data corresponding to the thermodynamic diagram to be generated from the HBase distributed database.
In some embodiments, obtaining map data and cluster data corresponding to the thermodynamic diagram to be generated from the HBase distributed database includes: determining the zoom level of the map data corresponding to the thermodynamic diagram to be generated according to the zoom level of the thermodynamic diagram to be generated; and obtaining the map data and the cluster data under the corresponding zoom level from the HBase distributed database. Based on the method, the obtained clustering data is matched with the zoom level of the map data, so that the problem of large deformation of the position characteristics of the track data of the thermodynamic diagram under different zoom levels can be solved, and the thermonuclear phenomenon of the data dense area is optimized.
In some embodiments, obtaining map data and cluster data corresponding to the thermodynamic diagram to be generated from the HBase distributed database includes: determining the zoom level of the map data corresponding to the thermodynamic diagram to be generated according to the zoom level of the thermodynamic diagram to be generated; determining a time slice to which clustering data corresponding to the thermodynamic diagram to be generated belong according to the time range of the thermodynamic diagram to be generated; and obtaining the map data under the corresponding zoom level and the cluster data under the corresponding zoom level under the corresponding time slice from the HBase distributed database.
When the trajectory data included in each time slice is clustered, clustered data for the trajectory data included in each time slice can be obtained. When the thermodynamic diagram is drawn, besides the zoom level of the thermodynamic diagram, the time range of the thermodynamic diagram needs to be determined, and the time slice to which the clustering data belongs is determined based on the time range of the thermodynamic diagram.
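Selecting the cluster data for a requested time range and zoom level might look like the sketch below. It assumes 1-hour time slices aligned to the hour and a hypothetical table naming scheme (`cluster_<yyyymmddhh>_z<zoom>`); neither the function names nor the naming scheme come from the source.

```python
from datetime import datetime, timedelta


def slices_for_range(start, end, slice_hours=1):
    """Return the start time of every time slice overlapping [start, end)."""
    if end <= start:
        raise ValueError("end must be later than start")
    cur = start.replace(minute=0, second=0, microsecond=0)
    cur -= timedelta(hours=cur.hour % slice_hours)  # align to a slice boundary
    out = []
    while cur < end:
        out.append(cur)
        cur += timedelta(hours=slice_hours)
    return out


def cluster_table(slice_start, zoom):
    """Hypothetical name of the table holding one slice at one zoom level."""
    return f"cluster_{slice_start:%Y%m%d%H}_z{zoom}"


wanted = slices_for_range(datetime(2020, 10, 1, 0, 30), datetime(2020, 10, 1, 2, 0))
tables = [cluster_table(s, 12) for s in wanted]
```

A request covering 0:30 to 2:00 at zoom level 12 touches the two slices starting at 0:00 and 1:00, so only those two tables need to be read.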
Step 160: generate the thermodynamic diagram from the acquired map data and cluster data.
In some embodiments, the cluster data includes a center coordinate and an influence value of the cluster and a coordinate and an influence value of the noise point. According to the influence values of the cluster and the noise points, the gray values of the areas covered by the cluster and the noise points can be calculated, and then the thermodynamic diagram can be generated on the map data by combining the central coordinates of the cluster and the noise points. Here, the gray value is determined based on the center coordinates and the influence value of the cluster and the coordinates and the influence value of the noise point, and the process of generating the thermodynamic diagram by combining the map data is realized by a conventional method in the field of thermodynamic diagram generation.
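The text leaves the gray value calculation to conventional methods. One common approach, shown here only as an assumed example, lets each point spread its influence value over nearby pixels with a linear falloff and then scales the accumulated values to the range 0 to 255. The function name `gray_grid`, the kernel, and the pixel coordinates are illustrative; the points are assumed to be already projected to pixel coordinates.

```python
import math


def gray_grid(points, width, height, radius):
    """Accumulate influence values on a pixel grid and scale them to 0-255.

    points holds (px, py, count) tuples in pixel coordinates; the result is
    indexed as grid[y][x].
    """
    acc = [[0.0] * width for _ in range(height)]
    for px, py, count in points:
        for y in range(max(0, int(py - radius)), min(height, int(py + radius) + 1)):
            for x in range(max(0, int(px - radius)), min(width, int(px + radius) + 1)):
                d = math.hypot(x - px, y - py)
                if d < radius:
                    # Linear falloff: full influence at the centre, zero at the rim.
                    acc[y][x] += count * (1 - d / radius)
    peak = max(max(row) for row in acc)
    if peak == 0:
        return [[0] * width for _ in range(height)]
    return [[round(255 * v / peak) for v in row] for row in acc]


# A cluster centre with influence 4 and a noise point with influence 1.
grid = gray_grid([(2, 2, 4), (7, 7, 1)], width=10, height=10, radius=3)
```

The gray values can then be mapped to a color gradient and drawn over the map tiles.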
In summary, the embodiment of the present invention provides a thermodynamic diagram generation method based on trajectory data, which includes: acquiring trajectory data and map data; storing the trajectory data in the Hadoop Distributed File System in its original format; clustering the trajectory data to obtain cluster data; storing the map data and the cluster data in an HBase distributed database; obtaining the map data and cluster data corresponding to the thermodynamic diagram to be generated from the HBase distributed database; and generating the thermodynamic diagram from the acquired map data and cluster data. This preserves the position characteristics of the trajectory data while improving the efficiency of thermodynamic diagram visualization, shortening the rendering time, reducing the stuttering that occurs during user interaction, and improving the user experience.
A specific implementation scenario is provided below to further illustrate the method for generating a thermodynamic diagram based on trajectory data according to an embodiment of the present invention.
Fig. 5 is a schematic diagram of a storage framework for trajectory data and map data according to an embodiment of the present invention. As shown in Fig. 5, the storage framework built on the HBase distributed database consists of 5 parts, from bottom to top: 1) a Hadoop storage framework built on a PC cluster; 2) an HBase cloud data storage layer that depends on the HDFS; 3) a data operation layer for querying data; 4) a Web service layer for receiving requests and calling data; 5) a presentation layer based on a Web browser. The HBase cloud data storage layer is the HBase distributed database, which stores the map data and the cluster data; the HDFS is built on the Hadoop storage framework and stores the trajectory data. The data operation layer implements operations on the trajectory data, the map data, and the cluster data. It can include a map data processing module and a trajectory data processing module, where the map data processing module includes a map code conversion module and a map data interface, and the trajectory data processing module includes an original trajectory data interface and a cluster data interface.
Specifically, the embodiment of the invention builds a Hadoop cluster of 5 computers, each node with 8 GB of memory, a 1 TB hard disk, and an i7 CPU. The software configuration is: Hadoop 2.7.6, HBase distributed database 2.1.9, ZooKeeper coordination service 3.4.14, Tomcat 7.0.90 as the web server, and Java 1.8.0.
For thermodynamic diagram visualization, the embodiment of the invention adopts WebGIS visualization technology based on the B/S architecture; WebGIS can be understood as a GIS (Geographic Information System) based on the Web environment.
Fig. 6 is a flowchart of loading map data according to an embodiment of the present invention. The loading process is explained with reference to Fig. 6 and the storage framework shown in Fig. 5. First, the web browser determines the numbers of the required tiles from the screen display area, that is, it determines the query condition, and sends a request to the web map server. The web map server uses the map code conversion module to convert the tile numbers into the code needed to query the HBase cloud data storage layer, namely the Row Key. It then interacts with the HBase cloud data storage layer through the map data interface, queries the corresponding map data table by Row Key, and locates the cell holding each tile by its tile number. The queried map data is then returned to the web map server. In this way, the map data needed for the thermodynamic diagram to be generated is obtained.
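The conversion from tile number to Row Key and cell can be sketched with a dictionary standing in for the map data table of one zoom level. The sketch assumes a 4 × 4 grid whose side is a power of two, so that the Row Key is simply the Morton code of the 2 × 2 sub-grid plus 1, and it assumes column names of the form `x_y`; all names here are hypothetical and no real HBase client is involved.

```python
def morton(x, y, bits=16):
    """Morton code: bits of x at even positions, bits of y at odd positions."""
    return sum((((x >> b) & 1) << (2 * b)) | (((y >> b) & 1) << (2 * b + 1))
               for b in range(bits))


def to_row_key_and_column(x, y):
    """Map tile (x = column, y = row) to (Row Key, column name)."""
    return morton(x // 2, y // 2) + 1, f"{x}_{y}"


# In-memory stand-in for the map data table of one zoom level (4 x 4 tiles).
table = {}
for y in range(4):
    for x in range(4):
        key, col = to_row_key_and_column(x, y)
        table.setdefault(key, {})[col] = f"tile({x},{y})".encode()


def load_viewport(tiles):
    """Fetch the requested tiles and report which table rows were touched."""
    rows_read, result = set(), {}
    for x, y in tiles:
        key, col = to_row_key_and_column(x, y)
        rows_read.add(key)
        result[(x, y)] = table[key][col]
    return result, rows_read


tiles, rows_read = load_viewport([(2, 2), (3, 2), (2, 3), (3, 3)])
```

The four tiles of the lower-right screen area are all found in the row with Row Key 4, so a single row read serves the whole request.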
The map data used in the embodiment of the invention is map data of the Beijing area at 18 zoom levels. Map data of one zoom level is selected for a high-load query test to examine the loading efficiency of the map data. Fig. 7 is a comparison diagram of map data loading durations according to an embodiment of the present invention. The two curves in fig. 7 are the loading time curves for two encoding manners. The first encodes the map data so that 4 tiles adjacent to each other in the display state are stored in the same row of the corresponding map data table (e.g., the encoding manners illustrated in figs. 4(a) to 4(d)). The second encodes the tiles in the order in which they are naturally arranged in the display state of the map data (e.g., the encoding manner illustrated in fig. 3). Because the encoding manner determines how the map data is stored, the comparison in fig. 7 is in effect a comparison of loading efficiency between the two storage manners.
As can be seen from fig. 7, when the number of requests is small, the loading times of the two encoding manners differ little, but the difference becomes increasingly obvious as the number of requests grows. With the encoding that stores 4 mutually adjacent tiles in the same row of the corresponding map data table, 100 loads take less than 2000 ms in total, that is, on average about 50 complete loading processes can be completed per second, which is enough to handle the high-concurrency scenarios that arise in visual interaction. After this encoding, the overall average loading time, measured from sending the request to the HBase distributed database until the data is returned to the Web browser, therefore shows a shortening trend, and the real-time loading rate is relatively stable.
The following provides yet another specific implementation scenario to further illustrate the method for generating a thermodynamic diagram based on trajectory data according to an embodiment of the present invention.
The embodiment of the invention builds a Hadoop cluster of 5 computers, in which each node has 8 GB of memory, a 1 TB hard disk and an i7 CPU. The software configuration is: Hadoop 2.7.6, the HBase distributed database 2.1.9, the ZooKeeper coordination service 3.4.14, Tomcat 7.0.90 as the web server, and Java 1.8.0.
For thermodynamic diagram visualization, the embodiment of the invention uses WebGIS visualization technology based on the B/S (browser/server) architecture. WebGIS can be understood as a GIS (Geographic Information System) running in a Web environment.
The map data used in the embodiment of the invention is map data of the Beijing area at 18 zoom levels. The track data is 24 hours of taxi track data from Beijing, about 1440 records in total. Track data of a randomly selected time slice is used for visualization.
Fig. 8 is a flowchart of a thermodynamic diagram generation method based on trajectory data according to an embodiment of the present invention. The generation process of the thermodynamic diagram is described with reference to the storage framework for trajectory data and map data shown in fig. 5, together with fig. 6 and fig. 8. First, the map data is loaded according to the loading process shown in fig. 6. This process is the same as the map data loading process in the previous implementation scenario and is not repeated here. The loaded map data has been encoded so that 4 tiles adjacent to each other in the display state are stored in the same row of the corresponding map data table, and has been stored in the HBase distributed database according to that encoding. Thereafter, the thermodynamic diagram is drawn from the acquired cluster data and map data according to the visualization process shown in fig. 8. Specifically, the taxi track data is stored in the HDFS and then clustered using the DBSCAN algorithm. All clusters and noise points obtained by clustering are traversed one by one to calculate the center coordinate and influence value of each cluster and the influence value of each noise point, and the cluster data is stored in the database once the calculation is finished. Each time the clustering parameters are changed, a new round of clustering is performed on the trajectory data, and the newly obtained clusters and noise points are traversed again to obtain new cluster data. In this way, cluster data suited to each of the zoom levels of the map data can be obtained.
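The clustering and traversal step can be illustrated with a minimal pure-Python sketch. It assumes planar Euclidean distance between points and, since the source does not define how the influence value is computed, it assumes that a cluster's influence value is its number of member points and that a noise point's influence value is 1. The names `dbscan` and `summarise` are illustrative only.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]
NOISE = -1
UNVISITED = -2


def dbscan(points: List[Point], eps: float, min_pts: int) -> List[int]:
    """Label each point with a cluster id (0, 1, ...) or NOISE."""
    labels = [UNVISITED] * len(points)

    def neighbours(i: int) -> List[int]:
        # All points within the scan radius eps of point i (including i).
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = 0
    for i in range(len(points)):
        if labels[i] != UNVISITED:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later become a border point
            continue
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins, but is not expanded
                continue
            if labels[j] != UNVISITED:
                continue
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_pts:  # core point: keep expanding
                queue.extend(reachable)
        cluster += 1
    return labels


def summarise(points: List[Point], labels: List[int]) -> List[dict]:
    """Traverse clusters and noise points and build cluster data records.

    Assumption: influence = number of points represented by the record.
    """
    groups: Dict[int, List[Point]] = {}
    records: List[dict] = []
    for p, lab in zip(points, labels):
        if lab == NOISE:
            records.append({"x": p[0], "y": p[1], "influence": 1, "noise": True})
        else:
            groups.setdefault(lab, []).append(p)
    for lab in sorted(groups):
        members = groups[lab]
        cx = sum(p[0] for p in members) / len(members)
        cy = sum(p[1] for p in members) / len(members)
        records.append({"x": cx, "y": cy, "influence": len(members), "noise": False})
    return records
```

Running `dbscan` once per set of clustering parameters and passing each result to `summarise` would give one set of records per zoom level, which corresponds to the repeated clustering rounds described above.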
After the cluster data has been stored, the cluster data can be loaded as required for thermodynamic diagram generation: the Web browser sends a data request, and the Web map server queries the corresponding data from the HBase distributed database according to the query condition and returns it to the Web browser. The Web browser then calculates gray values from the influence values of the cluster data and draws the thermodynamic diagram. For comparison with the generation process that uses cluster data, the embodiment of the invention also provides a thermodynamic diagram generation process based on the original track data. In that case, the original trajectory data is acquired directly from the HDFS, and the thermodynamic diagram is drawn from the original trajectory data and the loaded map data.
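The source does not specify how the gray value is derived from the influence value. As one possible reading, the sketch below assumes a linear normalization against the largest influence value in the current request, mapped to the range 0 to 255; the function name `gray_values` is hypothetical.

```python
from typing import List


def gray_values(influences: List[float]) -> List[int]:
    """Map influence values to 8-bit gray values (0..255).

    Assumption: linear scaling relative to the largest influence value;
    the patent text does not state the actual formula.
    """
    peak = max(influences, default=0)
    if peak <= 0:
        return [0 for _ in influences]
    return [round(255 * v / peak) for v in influences]
```

A renderer could then map each gray value to a color ramp when drawing the thermodynamic diagram; that step is outside this sketch.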
Figs. 9(a) to 9(h) are thermodynamic diagrams provided by embodiments of the present invention. Fig. 9(a) is generated using map data at zoom level 11 and raw trajectory data without clustering; fig. 9(b) is generated using map data at zoom level 11 and cluster data; fig. 9(c) is generated using map data at zoom level 12 and raw trajectory data without clustering; fig. 9(d) is generated using map data at zoom level 12 and cluster data; fig. 9(e) is generated using map data at zoom level 13 and raw trajectory data without clustering; fig. 9(f) is generated using map data at zoom level 13 and cluster data; fig. 9(g) is generated using map data at zoom level 14 and raw trajectory data without clustering; fig. 9(h) is generated using map data at zoom level 14 and cluster data. In figs. 9(a) to 9(h), the "zoom:" label indicates the zoom level of the thermodynamic diagram, and the "time" label indicates the time taken to generate it.
Here, for the generation process that uses cluster data, the generation time is the time taken to complete the following steps: the Web browser sends a data request; the Web map server queries the corresponding data from the HBase distributed database according to the query condition and returns it to the Web browser; and the Web browser calculates gray values from the influence values of the cluster data and draws the thermodynamic diagram. For the generation process that directly uses raw track data without clustering, the generation time is the time taken to complete the following steps: the Web browser sends a data request; the Web map server queries the corresponding data in the HDFS according to the query condition and returns it to the Web browser; and the Web browser draws the thermodynamic diagram from the acquired original track data.
Thermodynamic diagrams with the same zoom level are compared in pairs: figs. 9(a) and 9(b), figs. 9(c) and 9(d), figs. 9(e) and 9(f), and figs. 9(g) and 9(h) form 4 groups. The comparison within each group shows that the thermodynamic diagram generated from raw trajectory data without clustering has a more severe hot-core phenomenon, greater distortion of position features, a longer generation time and a poorer visualization effect. Correspondingly, in the thermodynamic diagram generated from cluster data, the hot-core phenomenon in data-dense regions is mitigated, position features are displayed in finer detail, and the overall visualization effect is improved. In particular, when the thermodynamic diagram is generated from cluster data, different clustering parameters are designed for different zoom levels, so that the resulting cluster data is adapted to the zoom level of the map data, which further helps to mitigate the hot-core phenomenon in data-dense regions. Fig. 10 is a comparison graph of thermodynamic diagram generation durations provided by the embodiment of the invention. As can be seen from fig. 10, when the zoom level is low, such as levels 11 and 12, there is a certain difference between the time taken to generate the thermodynamic diagram from raw trajectory data without clustering and the time taken to generate it from cluster data; at higher zoom levels, the visualization loading time with cluster data is significantly reduced.
In summary, the thermodynamic diagram generation method based on trajectory data provided by the embodiment of the invention improves visualization efficiency, shortens drawing time, reduces the stuttering caused by user interaction, and improves the user interaction experience, while preserving the position features of the data. The method can also manage and store massive track data efficiently and achieve a better drawing effect.
In addition, the method builds on the high reliability, high scalability, high efficiency and high fault tolerance of the Hadoop framework and designs a track big-data storage scheme based on the HBase platform, which gives the method good generality in spatial data storage, visualization and extension. Processing the thermodynamic diagram data improves the generation efficiency of the thermodynamic diagram and provides key technical support for track data mining and analysis based on time attributes.
Fig. 11 shows a schematic structural diagram of a thermodynamic diagram generation apparatus based on trajectory data according to an embodiment of the present invention. As shown in fig. 11, the trajectory-data-based thermodynamic diagram generation apparatus 1100 includes: a first obtaining module 1110, configured to obtain track data and map data; the first storage module 1120 is used for storing the track data in a Hadoop platform distributed file system in an original format; a clustering module 1130, configured to cluster the trajectory data to obtain clustered data; a second storage module 1140, configured to store the map data and the cluster data in an HBase distributed database; a second obtaining module 1150, configured to obtain map data and cluster data corresponding to the thermodynamic diagram to be generated from the HBase distributed database; a generating module 1160, configured to generate a thermodynamic diagram according to the acquired map data and cluster data.
In some embodiments, the first storage module is specifically configured to: divide the track data into a plurality of time slices, wherein each time slice contains all track data within a preset time range; and, in the Hadoop platform distributed file system, store the track data contained in the same time slice together in the original format, with the time slices stored adjacently in chronological order.
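The time slicing performed by the first storage module might look like the sketch below. The record layout (a Unix timestamp in seconds followed by longitude and latitude) and the function name `slice_by_time` are assumptions made for illustration; the source says only that the track data keeps its original format.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Assumed record layout: (unix timestamp in seconds, longitude, latitude).
Record = Tuple[int, float, float]


def slice_by_time(records: List[Record], slice_seconds: int) -> Dict[int, List[Record]]:
    """Group records into fixed-length time slices.

    The returned dict is ordered by slice index, so writing its entries out
    in iteration order places consecutive slices next to each other.
    """
    slices: Dict[int, List[Record]] = defaultdict(list)
    for rec in records:
        slices[rec[0] // slice_seconds].append(rec)
    return {index: slices[index] for index in sorted(slices)}
```

For example, with `slice_seconds=3600` each slice holds one hour of records, and a 24-hour data set yields at most 24 slices.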
In some embodiments, the map data has a plurality of zoom levels; the clustering module comprises: a first determining unit, configured to determine multiple groups of clustering parameters according to the multiple zoom levels; the clustering unit is used for clustering the track data contained in each time slice according to the multiple groups of clustering parameters to obtain multiple groups of clustering data corresponding to the multiple zoom levels for each time slice; the second obtaining module includes: a second determining unit, configured to determine, according to a zoom level of the thermodynamic diagram to be generated, a zoom level of map data corresponding to the thermodynamic diagram to be generated; a third determining unit, configured to determine, according to the time range of the thermodynamic diagram to be generated, a time slice to which cluster data corresponding to the thermodynamic diagram to be generated belongs; and the acquisition unit is used for acquiring the map data at the corresponding zoom level and the cluster data at the corresponding zoom level in the corresponding time slice from the HBase distributed database.
In some embodiments, the map data has a plurality of zoom levels; the clustering module comprises: a first determining unit, configured to determine multiple groups of clustering parameters according to the multiple zoom levels; the clustering unit is used for clustering the track data according to the multiple groups of clustering parameters to obtain multiple groups of clustering data corresponding to the multiple zooming levels; the second obtaining module includes: a second determining unit, configured to determine, according to a zoom level of the thermodynamic diagram to be generated, a zoom level of map data corresponding to the thermodynamic diagram to be generated; and the acquisition unit is used for acquiring the map data and the cluster data under the corresponding zoom level from the HBase distributed database.
In some embodiments, the sets of clustering parameters include a scan radius; the first determining unit is specifically configured to: determining a scanning radius corresponding to each zooming level according to the zooming levels; wherein the scan radius corresponding to each zoom level decreases as the respective zoom level decreases.
In some embodiments, the sets of clustering parameters include a minimum number of contained points; the first determining unit is specifically configured to: determine the minimum number of contained points corresponding to each zoom level according to the plurality of zoom levels, wherein the minimum number of contained points corresponding to each zoom level decreases as the corresponding zoom level decreases.
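The relationship between zoom level and clustering parameters stated in the two embodiments above can be expressed as a lookup table plus a consistency check. The numeric values below are invented purely for illustration, since the source gives none; only the direction of the relationship (both parameters decrease as the zoom level decreases) comes from the text.

```python
from typing import Dict, Tuple

# Hypothetical values for illustration only; the source gives no numbers.
# zoom level -> (scan radius, minimum number of contained points)
PARAMS_BY_ZOOM: Dict[int, Tuple[float, int]] = {
    11: (0.002, 4),
    12: (0.004, 6),
    13: (0.006, 8),
    14: (0.008, 10),
}


def decreases_with_zoom(params: Dict[int, Tuple[float, int]]) -> bool:
    """Check that both parameters decrease whenever the zoom level decreases."""
    ordered = [params[zoom] for zoom in sorted(params)]
    return all(
        low[0] < high[0] and low[1] < high[1]
        for low, high in zip(ordered, ordered[1:])
    )


def params_for(zoom: int) -> Tuple[float, int]:
    """Return the (scan radius, minimum points) pair for one zoom level."""
    return PARAMS_BY_ZOOM[zoom]
```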
In some embodiments, the sets of cluster data include center coordinates and influence values of a plurality of cluster clusters and coordinates and influence values of a plurality of noise points.
In some embodiments, the clustering is implemented based on the DBSCAN algorithm.
In some embodiments, the second storage module comprises: a first construction unit, configured to construct a cluster data table for each group of cluster data corresponding to each zoom level of each time slice.
In some embodiments, the map data has a plurality of zoom levels; the second storage module comprises: a second construction unit, configured to construct a map data table for the map data at each zoom level, and to store 4 tiles that are contained in the map data at each zoom level and are adjacent to each other in the display state in the same row of the corresponding map data table.
In some embodiments, the second building unit is specifically configured to: calculating the total order m of the map data at each zoom level according to the number n of tiles contained in each row of the map data at each zoom level, wherein,
Figure BDA0002740527720000261
when n - 2m = 1, dividing the map data at each zoom level into m × m square sub-grids and n edge sub-grids, wherein each square sub-grid is composed of 4 tiles, each of the 2m edge sub-grids adjacent to the square sub-grids is composed of 2 tiles, and the 1 edge sub-grid not adjacent to the square sub-grids is composed of 1 tile; filling the m × m square sub-grids based on a Z-shaped filling curve, filling the 2m edge sub-grids based on a linear filling curve, and connecting the m × m square sub-grids and the n edge sub-grids by using connecting lines; encoding the n tiles according to their filling order; and constructing each map data table for the map data at each zoom level, and sequentially storing the n tiles in the corresponding map data table based on the codes of the n tiles, wherein the 4 tiles belonging to the same square sub-grid are stored in the same row of the corresponding map data table, and the tiles belonging to the same edge sub-grid are stored in the same row of the corresponding map data table.
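As a consistency check on this partition, the tile and sub-grid counts can be added up. The check assumes that the map data at the zoom level forms an n × n grid of tiles; the source states only that each row contains n tiles, so the square shape is an assumption.

```latex
\begin{aligned}
n &= 2m + 1,\\
\underbrace{4\,m^{2}}_{\text{square sub-grids}}
  + \underbrace{2 \cdot 2m}_{\text{2-tile edge sub-grids}}
  + \underbrace{1}_{\text{1-tile edge sub-grid}}
  &= (2m+1)^{2} = n^{2},\\
\underbrace{2m + 1}_{\text{edge sub-grids}} &= n .
\end{aligned}
```

For example, with n = 5 and m = 2 there are 4 square sub-grids (16 tiles), 4 two-tile edge sub-grids (8 tiles) and 1 single-tile edge sub-grid, which together cover all 25 tiles, and the number of edge sub-grids is 5 = n.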
In some embodiments, the second building unit is specifically configured to: calculating the total order m of the map data at each zoom level according to the number n of tiles contained in each row of the map data at each zoom level, wherein,
Figure BDA0002740527720000271
when n = 2m, dividing the map data at each zoom level into m × m square sub-grids, wherein each square sub-grid is composed of 4 tiles; filling the m × m square sub-grids based on a Z-shaped filling curve; encoding the n tiles according to their filling order; and constructing each map data table for the map data at each zoom level, and sequentially storing the n tiles in the corresponding map data table based on the codes of the n tiles, wherein the 4 tiles belonging to the same square sub-grid are stored in the same row of the corresponding map data table.
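For the even case, one way to realize a Z-shaped filling order is the standard Z-order (Morton) bit interleaving, sketched below. The source does not say whether the curve advances along x or y first, nor how tiles are ordered inside a sub-grid, so both choices here are assumptions, and `interleave` and `encode_tile` are illustrative names. Note that Morton indices are contiguous only when m is a power of two; for other values of m they still preserve the Z order but leave gaps.

```python
from typing import Tuple


def interleave(i: int, j: int) -> int:
    """Z-order (Morton) index of the sub-grid at column i, row j.

    Bits of i go to even positions and bits of j to odd positions.
    """
    code = 0
    bit = 0
    while (i >> bit) or (j >> bit):
        code |= ((i >> bit) & 1) << (2 * bit)
        code |= ((j >> bit) & 1) << (2 * bit + 1)
        bit += 1
    return code


def encode_tile(x: int, y: int) -> Tuple[int, int]:
    """Return (row index, column index) in the map data table for tile (x, y).

    The row is the Z-order index of the 2x2 square sub-grid containing the
    tile; the column is the tile's Z-order position inside that sub-grid.
    """
    row = interleave(x // 2, y // 2)
    column = (y % 2) * 2 + (x % 2)
    return row, column
```

With this sketch, tiles (0, 0), (1, 0), (0, 1) and (1, 1) all map to row 0, so one row read returns the four mutually adjacent tiles, and `row * 4 + column` gives a single sequential tile code.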
Fig. 12 shows an electronic device of an embodiment of the invention. As shown in fig. 12, the electronic apparatus 1200 includes: at least one processor 1210, and a memory 1220 in communication with the at least one processor 1210, wherein the memory stores instructions executable by the at least one processor to cause the at least one processor to perform the method.
Specifically, the memory 1220 and the processor 1210 are connected via the bus 1230 and may be a general-purpose memory and processor, which are not specifically limited here. When the processor 1210 executes the computer program stored in the memory 1220, the operations and functions described in the embodiments of the present invention in conjunction with figs. 1 to 10 can be performed.
An embodiment of the present invention further provides a storage medium, on which a computer program is stored, which, when executed by a processor, implements the method. For specific implementation, reference may be made to the method embodiment, which is not described herein again.
While embodiments of the present invention have been disclosed above, it is not limited to the applications listed in the description and the embodiments. It is fully applicable to a variety of fields in which embodiments of the present invention are suitable. Additional modifications will readily occur to those skilled in the art. Therefore, the embodiments of the invention are not to be limited to the specific details and illustrations shown and described herein, without departing from the general concept defined by the claims and their equivalents.

Claims (15)

1. A method for generating a heat map based on trajectory data, characterized by comprising: obtaining trajectory data and map data; storing the trajectory data in its original format in a Hadoop platform distributed file system; clustering the trajectory data to obtain cluster data; storing the map data and the cluster data in an HBase distributed database; obtaining, from the HBase distributed database, map data and cluster data corresponding to a heat map to be generated; and generating the heat map according to the obtained map data and cluster data.
2. The method for generating a heat map based on trajectory data according to claim 1, wherein storing the trajectory data in its original format in the Hadoop platform distributed file system comprises: dividing the trajectory data into a plurality of time slices, wherein each time slice contains all trajectory data within a preset time period; and, in the Hadoop platform distributed file system, storing the trajectory data contained in the same time slice together in the original format, with the plurality of time slices stored adjacently in chronological order.
3. The method for generating a heat map based on trajectory data according to claim 2, wherein the map data has a plurality of zoom levels; clustering the trajectory data to obtain cluster data comprises: determining multiple sets of clustering parameters according to the plurality of zoom levels; and clustering the trajectory data contained in each time slice according to the multiple sets of clustering parameters, to obtain, for each time slice, multiple sets of cluster data corresponding to the plurality of zoom levels; and obtaining, from the HBase distributed database, the map data and cluster data corresponding to the heat map to be generated comprises: determining, according to the zoom level of the heat map to be generated, the zoom level of the map data corresponding to the heat map to be generated; determining, according to the time range of the heat map to be generated, the time slice to which the cluster data corresponding to the heat map to be generated belongs; and obtaining, from the HBase distributed database, the map data at the corresponding zoom level and the cluster data at the corresponding zoom level in the corresponding time slice.
4. The method for generating a heat map based on trajectory data according to claim 1, wherein the map data has a plurality of zoom levels; clustering the trajectory data to obtain cluster data comprises: determining multiple sets of clustering parameters according to the plurality of zoom levels; and clustering the trajectory data according to the multiple sets of clustering parameters, to obtain multiple sets of cluster data corresponding to the plurality of zoom levels; and obtaining, from the HBase distributed database, the map data and cluster data corresponding to the heat map to be generated comprises: determining, according to the zoom level of the heat map to be generated, the zoom level of the map data corresponding to the heat map to be generated; and obtaining, from the HBase distributed database, the map data and the cluster data at the corresponding zoom level.
5. The method for generating a heat map based on trajectory data according to claim 3 or 4, wherein each set of clustering parameters includes a scan radius; and determining multiple sets of clustering parameters according to the plurality of zoom levels comprises: determining, according to the plurality of zoom levels, the scan radius corresponding to each zoom level, wherein the scan radius corresponding to each zoom level decreases as the corresponding zoom level decreases.
6. The method for generating a heat map based on trajectory data according to claim 3 or 4, wherein each set of clustering parameters includes a minimum number of contained points; and determining multiple sets of clustering parameters according to the plurality of zoom levels further comprises: determining, according to the plurality of zoom levels, the minimum number of contained points corresponding to each zoom level, wherein the minimum number of contained points corresponding to each zoom level decreases as the corresponding zoom level decreases.
7. The method for generating a heat map based on trajectory data according to claim 3 or 4, wherein each set of cluster data includes the center coordinates and influence values of a plurality of clusters and the coordinates and influence values of a plurality of noise points.
8. The method for generating a heat map based on trajectory data according to claim 3 or 4, wherein the clustering is implemented based on the DBSCAN algorithm.
9. The method for generating a heat map based on trajectory data according to claim 3, wherein storing the cluster data in the HBase distributed database comprises: constructing a cluster data table for each set of cluster data corresponding to each zoom level of each time slice.
10. The method for generating a heat map based on trajectory data according to claim 1, wherein the map data has a plurality of zoom levels; and storing the map data in the HBase distributed database comprises: constructing a map data table for the map data at each zoom level, and storing 4 tiles that are contained in the map data at each zoom level and are adjacent to each other in the display state in the same row of the corresponding map data table.
11. The method for generating a heat map based on trajectory data according to claim 10, wherein constructing a map data table for the map data at each zoom level, and storing 4 tiles that are contained in the map data at each zoom level and are adjacent to each other in the display state in the same row of the corresponding map data table, comprises: calculating the total order m of the map data at each zoom level according to the number n of tiles contained in each row of the map data at each zoom level, wherein,
Figure FDA0002740527710000031
when n - 2m = 1, dividing the map data at each zoom level into m*m square sub-grids and n edge sub-grids, wherein each square sub-grid is composed of 4 tiles, each of the 2m edge sub-grids adjacent to the square sub-grids is composed of 2 tiles, and the 1 edge sub-grid not adjacent to the square sub-grids is composed of 1 tile; filling the m*m square sub-grids based on a Z-shaped filling curve, filling the 2m edge sub-grids based on a linear filling curve, connecting the filling curves of the m*m square sub-grids and the 2m edge sub-grids into a whole, and extending the connected filling curve to the 1 edge sub-grid not adjacent to the square sub-grids; encoding the n tiles according to the filling order of the n tiles; and constructing a map data table for the map data at each zoom level, and sequentially storing the n tiles in the corresponding map data table based on the codes of the n tiles, wherein the 4 tiles belonging to the same square sub-grid are stored in the same row of the corresponding map data table, and the tiles belonging to the same edge sub-grid are stored in the same row of the corresponding map data table.
12. The method for generating a heat map based on trajectory data according to claim 10, wherein constructing a map data table for the map data at each zoom level, and storing 4 tiles that are contained in the map data at each zoom level and are adjacent to each other in the display state in the same row of the corresponding map data table, comprises: calculating the total order m of the map data at each zoom level according to the number n of tiles contained in each row of the map data at each zoom level, wherein,
Figure FDA0002740527710000041
when n = 2m, dividing the map data at each zoom level into m*m square sub-grids, wherein each square sub-grid is composed of 4 tiles; filling the m*m square sub-grids based on a Z-shaped filling curve; encoding the n tiles according to the filling order of the n tiles; and constructing a map data table for the map data at each zoom level, and sequentially storing the n tiles in the corresponding map data table based on the codes of the n tiles, wherein the 4 tiles belonging to the same square sub-grid are stored in the same row of the corresponding map data table.
13. A device for generating a heat map based on trajectory data, characterized by comprising: a first obtaining module, configured to obtain trajectory data and map data; a first storage module, configured to store the trajectory data in its original format in a Hadoop platform distributed file system; a clustering module, configured to cluster the trajectory data to obtain cluster data; a second storage module, configured to store the map data and the cluster data in an HBase distributed database; a second obtaining module, configured to obtain, from the HBase distributed database, map data and cluster data corresponding to a heat map to be generated; and a generating module, configured to generate the heat map according to the obtained map data and cluster data.
14. An electronic device, characterized by comprising: at least one processor, and a memory communicatively connected to the at least one processor, wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to cause the at least one processor to perform the method of any one of claims 1-12.
15. A storage medium on which a computer program is stored, characterized in that, when the program is executed by a processor, the method of any one of claims 1-12 is implemented.
CN202011148718.7A 2020-10-23 2020-10-23 Thermal map generation method, device, electronic device and storage medium based on trajectory data Active CN112380302B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011148718.7A CN112380302B (en) 2020-10-23 2020-10-23 Thermal map generation method, device, electronic device and storage medium based on trajectory data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011148718.7A CN112380302B (en) 2020-10-23 2020-10-23 Thermal map generation method, device, electronic device and storage medium based on trajectory data

Publications (2)

Publication Number Publication Date
CN112380302A true CN112380302A (en) 2021-02-19
CN112380302B CN112380302B (en) 2023-07-21

Family

ID=74580911

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011148718.7A Active CN112380302B (en) 2020-10-23 2020-10-23 Thermal map generation method, device, electronic device and storage medium based on trajectory data

Country Status (1)

Country Link
CN (1) CN112380302B (en)

Patent Citations (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100324867A1 (en) * 2009-06-19 2010-12-23 Microsoft Corporation Data-driven visualization transformation
CN102612678A (en) * 2009-08-14 2012-07-25 特洛吉斯有限公司 Real-time maps with data clustering, expansion and overlay display
CN102819530A (en) * 2011-06-10 2012-12-12 中兴通讯股份有限公司 Method and device for displaying electronic map
CN106471488A (en) * 2013-09-05 2017-03-01 脸谱公司 Tiling technology for location-based server control
CN104464280A (en) * 2014-09-05 2015-03-25 广州市香港科大霍英东研究院 Vehicle advance expenditure prediction method and system
CN105095481A (en) * 2015-08-13 2015-11-25 浙江工业大学 Large-scale taxi OD data visual analysis method
CN105631027A (en) * 2015-12-30 2016-06-01 中国农业大学 Data visualization analysis method and system for enterprise business intelligence
CN106991558A (en) * 2017-04-13 2017-07-28 广东南方海岸科技服务有限公司 The automatic generation method and system of main channel between a kind of harbour port
CN106971001A (en) * 2017-04-17 2017-07-21 北京工商大学 A kind of visual analysis system and method for cellular base station location data
CN109405840A (en) * 2017-08-18 2019-03-01 中兴通讯股份有限公司 Map data updating method, server and computer readable storage medium
CN108415975A (en) * 2018-02-08 2018-08-17 淮阴工学院 Taxi hot spot recognition methods based on BDCH-DBSCAN
CN110248365A (en) * 2018-03-07 2019-09-17 中南大学 A kind of pseudo-base station note Spatial-temporal pattern visual analysis method
CN108959466A (en) * 2018-06-20 2018-12-07 淮阴工学院 Taxi hot spot method for visualizing and system based on BCS-DBSCAN
CN110851741A (en) * 2019-11-09 2020-02-28 郑州天迈科技股份有限公司 Taxi passenger carrying hot spot identification recommendation algorithm
CN111061806A (en) * 2019-11-21 2020-04-24 中国航空无线电电子研究所 Storage method and networked access method for distributed massive geographic tiles

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
THANH-CHUNG DAO et al.: "Heatmap rendering from large-scale distributed datasets using cloud computing" *
张昊 (ZHANG Hao) et al.: "轨迹大数据云存储与热力图生成方法" (Trajectory big data cloud storage and heat map generation method) *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112905729A (en) * 2021-03-05 2021-06-04 亿海蓝(北京)数据技术股份公司 Thermodynamic diagram generation method and device for track data, electronic equipment and storage medium
CN112905729B (en) * 2021-03-05 2024-01-30 亿海蓝(北京)数据技术股份公司 Thermodynamic diagram generation method and device for track data, electronic equipment and storage medium
CN113934935A (en) * 2021-10-20 2022-01-14 平安国际智慧城市科技股份有限公司 Interactive court map generation method, device, equipment and readable storage medium
CN113934935B (en) * 2021-10-20 2024-07-02 平安国际智慧城市科技股份有限公司 Interactive court map generation method, device, equipment and readable storage medium
CN114119840A (en) * 2022-01-24 2022-03-01 清研捷运(天津)智能科技有限公司 Thermal flow diagram generation method for mass track data
CN114119840B (en) * 2022-01-24 2022-04-08 清研捷运(天津)智能科技有限公司 Thermal flow diagram generation method for mass track data
CN114443914A (en) * 2022-04-11 2022-05-06 湖南视觉伟业智能科技有限公司 Data storage, index and query method and system of meta-space server
CN114443914B (en) * 2022-04-11 2022-07-12 湖南视觉伟业智能科技有限公司 Data indexing and querying method and system of meta-space server

Also Published As

Publication number Publication date
CN112380302B (en) 2023-07-21

Similar Documents

Publication Publication Date Title
CN112380302B (en) Thermal map generation method, device, electronic device and storage medium based on trajectory data
US11874855B2 (en) Parallel data access method and system for massive remote-sensing images
CN111291016B (en) Hierarchical hybrid storage and indexing method for massive remote sensing image data
CN113946700B (en) Space-time index construction method and device, computer equipment and storage medium
CN110599490B (en) A kind of remote sensing image data storage method and system
Eldawy et al. The era of big spatial data
CN103281376B (en) The automatic buffer memory construction method of magnanimity sequential remote sensing image under a kind of cloud environment
CN103425772B (en) A kind of mass data inquiry method with multidimensional information
CN104199986A (en) Vector data space indexing method base on hbase and geohash
CN103995861A (en) Distributed data device, method and system based on spatial correlation
CN108804602A (en) A kind of distributed spatial data storage computational methods based on SPARK
CN113570275A (en) Water resource real-time monitoring system based on BIM and digital elevation model
CN110555448B (en) Method and system for subdividing dispatch area
CN111104457A (en) Massive space-time data management method based on distributed database
Guo et al. A spatially adaptive decomposition approach for parallel vector data visualization of polylines and polygons
Li et al. ConcaveCubes: Supporting Cluster‐based Geographical Visualization in Large Data Scale
JP2023543004A (en) Merge update method, device, and medium for R-tree index based on Hilbert curve
CN116775641A (en) A load-balanced space-time aware big data storage query method and system based on storage object separation mechanism
CN114048204A (en) Beidou grid space indexing method and device based on database inverted index
CN116860905A (en) Space unit coding generation method of city information model
US20170177663A1 (en) Methods and systems for estimating the number of points in two-dimensional data
CN115687517A (en) Method and device for storing spatio-temporal data, database engine and storage medium
CN114116925B (en) A method and device for querying spatiotemporal data
CN110765130B (en) Ripley's K function-based spatio-temporal POI data point pattern analysis method in distributed environment
CN118427226A (en) Time sequence database data partition distribution method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant