CN117194434B - Learning-based trajectory indexing and query method based on homogeneous region partitioning on directed road networks - Google Patents
Learning-based trajectory indexing and query method based on homogeneous region partitioning on directed road networksInfo
- Publication number
- CN117194434B CN117194434B CN202311235731.XA CN202311235731A CN117194434B CN 117194434 B CN117194434 B CN 117194434B CN 202311235731 A CN202311235731 A CN 202311235731A CN 117194434 B CN117194434 B CN 117194434B
- Authority
- CN
- China
- Prior art keywords
- directed
- road
- homogeneous region
- class
- road network
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Links
Landscapes
- Traffic Control Systems (AREA)
Abstract
The invention provides a learning type track index and query method based on homogeneous region division on a directional road, which is characterized in that a subsequent array is respectively arranged for each directional road section, the traffic ratio is recorded, the directional road section is searched from the center of the road network in a depth-first mode, the unique serial number of the directional road section is determined, the directional road section is firstly divided into a plurality of classes according to the length and then divided into a plurality of first-order homogeneous region classes according to the number of lanes, all the directional road sections in each first-order homogeneous region class are divided into a plurality of second-order homogeneous region classes, the serial number of a track point in each second-order homogeneous region class is determined, a piecewise linear model of a three-dimensional space is obtained for each second-order homogeneous region class in a training mode to serve as a track point storage position predictor, track query is performed, an original query result set is returned, the method can ensure that track data in each second-order homogeneous region class are uniformly distributed, query performance and prediction accuracy can be improved, and index query efficiency can be improved.
Description
Technical Field
The invention relates to a learning type track index and query method based on homogeneous region division on a directional road network, and belongs to the technical field of road network track data index.
Background
Under the large data environment that the data is explosive-type and large-scale aggregated, and is high-dimensional and complex, the traditional index structure has the problems of high space cost, low query efficiency, high access cost and the like when processing the large data. The learning type index effectively improves the index performance and reduces the access space overhead by modeling and learning the characteristics of the bottom data distribution, the query load and the like.
The multidimensional learning type index is divided into three types, namely an index based on the Leeberg measure, an index based on a space filling curve and an index based on data division. LI PENGFEI et al, in the paper LISA A Learned Index Structure for SPATIAL DATA, propose LISA. The index firstly divides a multidimensional data space into grids, the data are ordered according to grid units, then a function is constructed by adopting a method based on the Leeberg measure to reduce the dimension of the multidimensional data into one-dimensional values, then a monotonic slicing prediction function is used for dividing the one-dimensional data space into a plurality of slices, and each one-dimensional value is mapped into a slice ID. And finally, training a local model to distribute each fragment to a corresponding disk page, wherein the input of the model is a one-dimensional value contained in the fragment, and the output is a page address corresponding to the one-dimensional value.
Based on the index of the space-filling curve, qi Jianzhong et al in the paper "EFFECTIVELY LEARNING SPATIAL INDICES" set forth an index of RSMI, which proposes a recursive strategy that partitions a large dataset, acquires the curve value for each partition using the space-filling curve, and trains a multi-layer perceptron to learn the index of each partition. Then, a rank space-based ordering technique is introduced into each partition to establish ordering of point data, the points are grouped into blocks, a multi-layer perceptron is trained to learn distribution of two-dimensional numerical values, wherein model input is point coordinates, and output is block IDs corresponding to the coordinates.
Indexing based on data partitioning, davitkova et al in The paper "The ML-Index: A multidimensional, learned Index for point, range, AND NEAREST-neighbor queries" propose ML indexing that uses a method similar to The IDISTANCE method for dimension reduction. The index structure mainly comprises a first part and a second part, wherein the first part is used for calculating Euclidean distance between each reference point and data, dividing the data which is close to the reference point into partitions corresponding to the reference points, then calculating one-dimensional key values of each data according to partition offset and the distance between each data and the reference point of the partition, and the second part is used for learning the distribution of the one-dimensional key values by adopting an RMI model, wherein all models are linear models in a two-dimensional space.
Although all three methods convert multidimensional data into one-dimensional data through a dimension reduction mode and then train a model to learn the distribution of the one-dimensional data, when the methods encounter oblique data, the model prediction accuracy is often inaccurate. In order to better learn the distribution characteristics of multidimensional inclined data, the data space is divided according to the data distribution, the inclined data is divided into a plurality of uniformly distributed areas, and an error-controllable lightweight model is trained to learn the distribution relation of the multidimensional data, so that the query efficiency of the index is improved, and the storage size of the index is reduced.
The above-mentioned problems are those that should be considered and addressed in learning track indexing and querying processes based on homogeneous region partitioning on a directional road network.
Disclosure of Invention
The invention aims to provide a learning track index and query method based on homogeneous region division on a directional road network, which solves the problems of low query efficiency, and to-be-improved query performance and prediction accuracy in the prior art.
The technical scheme of the invention is as follows:
A learning track index and inquiry method based on homogeneous region division on directional road network includes such steps as,
S1, setting a subsequent array for each directed road section, scanning all track data on the road network, and recording traffic ratio values in the subsequent array;
S2, dividing the directed road sections into a plurality of classes according to the length, dividing each class into a plurality of first-order homogeneous region classes according to the number of lanes, dividing all the directed road sections in each first-order homogeneous region class into a plurality of second-order homogeneous region classes according to the length and the number of lanes of the precursor directed road sections, and determining the sequence of the second-order homogeneous region classes;
S3, training to obtain a piecewise linear model of the three-dimensional space as a track point storage position predictor, namely learning type track index, inputting track points during training, and outputting storage position serial numbers in the second-order homogeneous region class of the track points;
s4, acquiring a road network range and a time range of the original query, performing track query, decomposing the query into a plurality of sub-queries according to the second-order homogeneous region class, calling a corresponding piecewise linear model to predict, acquiring a sub-query result set, screening, combining all the sub-query result sets, and returning to the original query result set.
Further, in step S1, the road network is a directed graph formed by directed road segments and intersections, each directed road segment is represented by three components including a directed road segment number, a length and a lane number, the track data includes a track name, a track point sequence and a track passing directed road segment, wherein the track points of the track point sequence are represented by two components including a road network position and a time, the road network position is a floating point number, the integer part is the directed road segment number where the point is located, and the decimal part is the relative position of the point on the directed road segment.
Further, in step S1, the traffic ratio is a ratio of the number of trajectories passing through the current and subsequent directed road segments simultaneously to the total number of trajectories passing through the current directed road segment.
Further, in step S1, the depth-first search is started from the road network center directed road segment, and the unique directed road segment number is determined according to the order of the depth-first search, specifically, the depth-first search is started from the road network center directed road segment, and when each time of depth-down search, the subsequent directed road segment with a large traffic ratio is preferentially selected as the next access target, until all the directed road segments are depth-first searched, the directed road segment number starts from 0, and the accessed order of the directed road segments is the directed road segment number.
Further, the directional road section of the road network center is obtained by adding the coordinates of the midpoints of all the directional road sections of the directional road network and calculating an average coordinate, wherein the average coordinate is regarded as the road network center, and the directional road section with the shortest distance from the center of the road network among the directional road sections is regarded as the directional road section of the road network center.
Further, in step S2, the directional road segments are first divided into a plurality of classes according to the lengths, each class is then divided into a plurality of first-order homogeneous region classes according to the number of lanes, all the directional road segments in each first-order homogeneous region class are divided into a plurality of second-order homogeneous region classes according to the lengths and the number of lanes of the precursor directional road segments, specifically,
S21, counting the total length Len of all the directed road sections, setting N as the classification number of the directed road sections, and averaging the total length of the directed road sections contained in each class to be Len/N;
S22, sorting the directed road sections according to the ascending sequence of the length, accumulating the lengths of the directed road sections, calculating the previously accumulated directed road sections as a class if the accumulated length is greater than or equal to Len/N, then re-accumulating the directed road sections from the next directed road section, and the like, and finally dividing N classes;
s23, in each directed road section class divided in the step S22, the directed road sections are divided into 2 classes according to the number of lanes, the number of lanes is less than or equal to the set number and is divided into one class, and the number of lanes is greater than or equal to the set number and is divided into another class, so that all the directed road sections are divided into 2X N first-order homogeneous region classes;
S24, counting the sum of the flows of the precursor directional road sections of all the directional road sections in each first-order homogeneous region class, recording as flow sum, setting M as the number of second-order homogeneous directional road sections, and averaging the total flows of the precursor directional road sections contained in each class as flow sum/M;
S25, sorting the directed road sections according to the ascending order of the precursor directed road section flow, accumulating the precursor directed road section flow, counting the previously accumulated directed road sections as a class if the accumulated precursor directed road section flow is more than or equal to flow sum/M, and then re-accumulating from the next directed road section;
S26, repeating the steps S24-S25, and dividing each first-order homogeneous region class into M second-order homogeneous region classes, so that all the directed road sections in the road network are divided into 2 XN X M second-order homogeneous region classes.
Further, in step S24, the precursor directional road section flow is the sum of the directional road section flows of all the precursor directional road sections of the current directional road section, the precursor directional road section is a connected directional road section directly driven into the current directional road section, and the directional road section flow is the product of the length of the directional road section and the number of lanes.
Further, in step S3, for each second-order homogeneous region class, a piecewise linear model of a three-dimensional space is trained to be used as a trajectory point storage location predictor, i.e., a learning type trajectory index, specifically,
S31, training a piecewise linear model of a three-dimensional space in each second-order homogeneous region, inputting track points (x, y), wherein x represents road network positions, y represents time, and outputting a sequence number in a region where z represents the track points;
s32, setting a maximum error threshold value theta, connecting the ordered track points, namely the first point and the last point into a three-dimensional straight line, and solving a three-dimensional straight line equation by using a least square method:
Wherein P 0(x0,y0,z0) is any point on the three-dimensional straight line, Is a direction vector of a three-dimensional straight line. The above equation is converted to a system of equations:
Then traversing all intermediate points, substituting the above equation set to obtain z 1 and z 2, wherein z 1 and z 2 are the z coordinate values of the projection points of the three-dimensional straight line on the xz plane and the yz plane respectively, and calculating the predicted value Calculating prediction errorRecording the maximum prediction error and the maximum error point;
S33, if the maximum prediction error is larger than theta, segmenting at the maximum error point, adding the error point into the half sections with few points of the left half section and the right half section, repeating the steps above in the left half section and the right half section until the maximum prediction error is smaller than theta, and finally obtaining the piecewise linear model.
Further, in step S4, the road network range and the time range of the original query are obtained, the track query is performed, the query is decomposed into a plurality of sub-queries according to the second-order homogeneous region class, the corresponding piecewise linear model is invoked for prediction, the sub-query result set is obtained, after screening, all the sub-query result sets are combined, the original query result set is returned, specifically,
S41, acquiring a road network range [ pos S,posE ] and a time range [ t S,tE ] of an original query, wherein pos S is a starting road network position, pos E is a stopping road network position, t S is a starting time, and t E is a stopping time;
S42, decomposing a road network range [ pos S,posE ] in an original query into a plurality of sub-road network ranges [ pos s,pose ] according to a second-order homogeneous region class, wherein pos s is a starting road network position in the second-order homogeneous region class, pos e is a stopping road network position in the second-order homogeneous region class, each sub-road network range [ pos s,pose ] and a time range [ t S,tE ] form a sub-query q (pos s,tS,pose,tE), and each sub-query q (pos s,tS,pose,tE) corresponds to one second-order homogeneous region class;
S43, for each sub-query q (pos s,tS,pose,tE), calling a corresponding piecewise linear model, and respectively inputting (pos s,tS) and (pos e,tE) to obtain a predicted value AndThen combining the maximum error threshold value theta of the piecewise linear model in the corresponding second-order similar region class, and obtaining the rangeAcquiring a candidate track point set, screening track points in a sub-query range, and acquiring a sub-query result set;
s44, repeating the step S43, obtaining all sub-query result sets, combining all sub-query result sets, and returning to the original query result set.
The beneficial effects of the invention are as follows:
1. compared with the prior art, the learning type track index and query method based on homogeneous region division on the directional road network divides the road network space according to the similarity of road network traffic, can ensure the track data in each second-order homogeneous region class to be uniformly distributed, reduces model segmentation, can improve query performance and prediction precision, and can improve index query efficiency.
2. According to the learning type track index and query method based on homogeneous region division on the directional road network, depth-first search is carried out based on connectivity mining labels, unique serial numbers of the directional road sections are determined, the traveling behaviors of vehicles are considered, and the consistency of the traveling sequence of the vehicles and the road sequencing is ensured.
3. According to the invention, a piecewise linear model of a three-dimensional space is trained for each second-order homogeneous region class, so that the index size is greatly reduced, the storage cost is reduced, the prediction error of the model is controlled, the number of blocks accessed by a disk during track inquiry is further reduced, and the index inquiry performance is improved.
4. The learning track index and query method based on homogeneous region division on the directional road network not only ensures consistency of directional road section sequencing and vehicle traveling sequence, but also divides a plurality of homogeneous regions uniformly distributed, greatly improves query efficiency of indexes on inclined data, ensures prediction errors of piecewise linear models, and improves query performance of indexes.
Drawings
FIG. 1 is a flow diagram of a learning track index and query method based on homogeneous region division on a directional road network according to an embodiment of the present invention;
FIG. 2 is an explanatory diagram of a road network of a specific example of a learning type track index and query method based on homogeneous region division on the directional road network of an embodiment;
Fig. 3 is an explanatory diagram of directional road segment information in a road network based on a specific example of a learning type track index and query method of homogeneous region division on the directional road network according to an embodiment;
fig. 4 is an explanatory diagram of track data in a road network based on a specific example of a learning-type track index and query method of homogeneous region division on a directional road network according to an embodiment.
Detailed Description
Preferred embodiments of the present invention will be described in detail below with reference to the accompanying drawings.
Examples
A learning track index and query method based on homogeneous region division on a directional road network, as shown in figure 1, comprises the following steps,
S1, setting a subsequent array for each directed road section, scanning all track data on the road network, recording traffic ratio in the subsequent array, starting depth-first search from the directed road section in the center of the road network after scanning, and determining a unique directed road section number according to the sequence of the depth-first search.
In step S1, the road network is a directed graph formed by directed road segments and intersections, each directed road segment is represented by three components of a directed road segment number, a length and a lane number, the track points are represented by two components of a road network position and a time, the road network position is a floating point number, the integer part is the directed road segment number where the point is located, and the decimal part is the relative position of the point on the directed road segment.
In step S1, the traffic ratio is a ratio of the number of trajectories passing through the current and subsequent directed road segments simultaneously to the total number of trajectories passing through the current directed road segment.
In step S1, the depth-first search is started from the road network center directed road segment, and the unique directed road segment number is determined according to the sequence of the depth-first search, specifically, the depth-first search is started from the road network center directed road segment, when each time the road network center directed road segment is searched in depth, the subsequent directed road segment with a large traffic ratio is preferentially selected as the next access target, until all the directed road segments are depth-first searched, the directed road segment number is started from 0, and the accessed sequence of the directed road segments is the directed road segment number. The directional road section of the road network center is obtained by adding the midpoint coordinates of all the directional road sections of the directional road network and calculating an average coordinate, wherein the average coordinate is regarded as the road network center, and the directional road section with the shortest distance from the midpoint of the directional road section to the road network center is regarded as the directional road section of the road network center.
S2, dividing the directed road sections into a plurality of classes according to the length, dividing each class into a plurality of first-order homogeneous region classes according to the number of lanes, dividing all the directed road sections in each first-order homogeneous region class into a plurality of second-order homogeneous region classes according to the length and the number of lanes of the precursor directed road sections, determining the sequence of the second-order homogeneous region classes, and then, in each second-order homogeneous region class, sorting the track points according to time, sorting according to the road network position and determining the sequence numbers of the track points in the second-order homogeneous region classes.
In step S2, the directional road segments are first divided into a plurality of classes according to the length, each class is then divided into a plurality of first-order homogeneous region classes according to the number of lanes, all the directional road segments in each first-order homogeneous region class are divided into a plurality of second-order homogeneous region classes according to the length and the number of lanes of the precursor directional road segments, specifically,
S21, counting the total length Len of all the directed road sections, setting N as the classification number of the directed road sections, and averaging the total length of the directed road sections contained in each class to be Len/N;
S22, sorting the directed road sections according to the ascending sequence of the length, accumulating the lengths of the directed road sections, calculating the previously accumulated directed road sections as a class if the accumulated length is greater than or equal to Len/N, then re-accumulating the directed road sections from the next directed road section, and the like, and finally dividing N classes;
S23, in each directed road section class divided in the step S22, the class is divided into 2 classes according to the number of lanes, the number of lanes is less than or equal to 2 and the class is divided into one class, and the number of lanes is more than or equal to 3 and the class is divided into the other class, so that all the directed road sections are divided into 2X N first-order homogeneous region classes.
S24, counting the sum of the flows of the precursor directional road sections of all the directional road sections in each first-order homogeneous region class, recording as flow sum, setting M as the number of second-order homogeneous directional road sections, and averaging the total flows of the precursor directional road sections contained in each class as flow sum/M;
in step S24, the precursor directional road section flow is the sum of the directional road section flows of all the precursor directional road sections of the current directional road section, the precursor directional road section is a connected directional road section directly driven into the current directional road section, and the directional road section flow is the product of the length of the directional road section and the number of lanes.
S25, sorting the directed road sections according to the ascending order of the precursor directed road section flow, accumulating the precursor directed road section flow, counting the previously accumulated directed road sections as a class if the accumulated precursor directed road section flow is more than or equal to flow sum/M, and then re-accumulating from the next directed road section;
S26, repeating the steps S24-S25, and dividing each first-order homogeneous region class into M second-order homogeneous region classes, so that all the directed road sections in the road network are divided into 2 XN X M second-order homogeneous region classes.
S3, training to obtain a piecewise linear model of a three-dimensional space for each second-order homogeneous region class as a track point storage position predictor, namely learning type track index, inputting track points during training, and outputting storage position serial numbers in the second-order homogeneous region class of the track points.
In step S3, for each second-order homogeneous region class, a piecewise linear model of a three-dimensional space is trained to be used as a trajectory point storage position predictor, i.e. a learning type trajectory index, specifically,
S31, training a piecewise linear model of a three-dimensional space in each second-order homogeneous region, inputting track points (x, y), wherein x represents road network positions, y represents time, and outputting a sequence number in a region where z represents the track points;
s32, setting a maximum error threshold value theta, connecting the ordered track points, namely the first point and the last point into a three-dimensional straight line, and solving a three-dimensional straight line equation by using a least square method:
Wherein P 0(x0,y0,z0) is any point on the three-dimensional straight line, Is a direction vector of a three-dimensional straight line. The above equation is converted to a system of equations:
Then traversing all intermediate points, substituting the above equation set to obtain z 1 and z 2, wherein z 1 and z 2 are the z coordinate values of the projection points of the three-dimensional straight line on the xz plane and the yz plane respectively, and calculating the predicted value Calculating prediction errorRecording the maximum prediction error and the maximum error point;
S33, if the maximum prediction error is larger than theta, segmenting at the maximum error point, adding the error point into the half sections with few points of the left half section and the right half section, repeating the steps above in the left half section and the right half section until the maximum prediction error is smaller than theta, and finally obtaining the piecewise linear model.
S4, acquiring a road network range and a time range of the original query, performing track query, decomposing the query into a plurality of sub-queries according to the second-order homogeneous region class, calling a corresponding piecewise linear model to predict, acquiring a sub-query result set, screening, combining all the sub-query result sets, and returning to the original query result set.
S41, acquiring a road network range [ pos S,posE ] and a time range [ t S,tE ] of an original query, wherein pos S is a starting road network position, pos E is a stopping road network position, t S is a starting time, and t E is a stopping time;
S42, decomposing a road network range [ pos S,posE ] in an original query into a plurality of sub-road network ranges [ pos s,pose ] according to a second-order homogeneous region class, wherein pos s is a starting road network position in the second-order homogeneous region class, pos e is a stopping road network position in the second-order homogeneous region class, each sub-road network range [ pos s,pose ] and a time range [ t S,tE ] form a sub-query q (pos s,tS,pose,tE), and each sub-query q (pos s,tS,pose,tE) corresponds to one second-order homogeneous region class;
S43, for each sub-query q (pos s,tS,pose,tE), calling a corresponding piecewise linear model, and respectively inputting (pos s,tS) and (pos e,tE) to obtain a predicted value AndThen combining the maximum prediction error theta of the piecewise linear model in the corresponding second-order similar region class, and obtaining the rangeAcquiring a candidate track point set, screening track points in a sub-query range, and acquiring a sub-query result set;
s44, repeating the step S43, obtaining all sub-query result sets, combining all sub-query result sets, and returning to the original query result set.
The learning type track index and query method based on homogeneous region division on the directional road network divides the road network space according to the similarity of road network traffic, can ensure the track data in each second-order homogeneous region class to be uniformly distributed, reduces model segmentation, can improve query performance and prediction precision, can improve index query efficiency, and solves the problem of query performance deterioration of track indexes on inclined data.
The learning type track index and query method based on homogeneous region division on the directed road network is characterized in that depth-first search is conducted based on excavation connectivity to determine the sequence of directed road sections, first-order homogeneous region classes are divided according to the length of the directed road sections and the number of lanes, second-order homogeneous region classes are further divided according to the precursor directed road sections of the directed road sections, the sequence of track points is determined in each second-order homogeneous region class, a piecewise linear model of a three-dimensional space is trained for each second-order homogeneous region class, mapping relation between the track points and storage sequence numbers is fitted, and learning type track indexes are constructed to support track query.
According to the learning type track index and query method based on homogeneous region division on the directional road network, depth-first search is carried out based on connectivity mining labels, unique serial numbers of the directional road sections are determined, the traveling behaviors of vehicles are considered, and the consistency of the traveling sequence of the vehicles and the road sequencing is ensured.
According to the invention, a piecewise linear model of a three-dimensional space is trained for each second-order homogeneous region class, so that the index size is greatly reduced, the storage cost is reduced, the prediction error of the model is controlled, the number of blocks accessed by a disk during track inquiry is further reduced, and the index inquiry performance is improved.
The learning track index and query method based on homogeneous region division on the directional road network not only ensures consistency of directional road section sequencing and vehicle traveling sequence, but also divides a plurality of homogeneous regions uniformly distributed, greatly improves query efficiency of indexes on inclined data, ensures prediction errors of piecewise linear models, and improves query performance of indexes.
Specific examples of such a learning type track index and query method based on homogeneous region division on the directional road network of the embodiment are as follows:
As shown in fig. 2, in the road network, there are 9 directional road segments, respectively, seg1, seg2, seg3, seg4, seg5, seg6, seg7, seg8, and seg9, the directional road segment information is specifically shown in fig. 3, there are 7 tracks tra1, tra2, tra3, tra4, tra5, tra6, and tra7, and the track data is specifically shown in fig. 4. The track data comprises a track name, a track point sequence and a track passing directional road section, wherein the track point of the track point sequence consists of two components of a road network position and time, and the track data comprises the following specific components:
p1=(seg1.24,0),p2=(seg1.43,0),p3=(seg1.78,3),p4=(seg1.96,0),p5=(seg4.12,6),p6=(seg4.85,0),p7=(seg2.33,3),p8=(seg3.32,3),p9=(seg6.67,3),p10=(seg5.31,9),p11=(seg5.90,0),p12=(seg8.35,0),p13=(seg8.68,3),p14=(seg7.43,3),p15=(seg9.23,12),p16=(seg9.35,0),p17=(seg9.65,15).
Step S1, setting a subsequent array for each directed road section, scanning all track data on the road network, and recording traffic ratio in the subsequent array, wherein each element in the subsequent array is (the name of the directed road section, the traffic ratio), and the subsequent arrays of seg1, seg4 and seg5 are shown in Table 1:
table 1 subsequent arrays of directed segments seg1, seg4, and seg5
Directed road segment number | Subsequent arrays |
seg1 | [(seg2,0.33),(seg3,0.33),(seg4,0.33)] |
seg4 | [(seg5,0.5),(seg6,0.5)] |
seg5 | [(seg7,0.5),(seg8,0),(seg9,0.5)] |
The subsequent arrays of the remaining directed road segments are empty.
After the scanning is finished, starting depth-first search from the road network center directed road section, determining a unique directed road section number according to the sequence of the depth-first search, wherein the depth-first search algorithm comprises the following steps:
And adding the midpoint coordinates of all the directed road sections, and calculating an average coordinate, wherein the average coordinate is regarded as a road network center, the directed road section with the shortest distance from the road network center among the directed road sections is regarded as the road network center directed road section, the directed road section is searched for depth first, the subsequent directed road section with the large traffic ratio is preferentially selected as the next access target when the directed road section is searched for depth first every time, the accessed sequence of the directed road section is the directed road section number, and the directed road section number is started from 0.
Assuming that seg1 is a road network center directed road segment, depth-first searching is started from the road network center directed road segment seg1, when depth-down exploring is performed, for a node seg1, a subsequent directed road segment seg2 with a large traffic ratio is preferably selected as a next access target, then depth-down exploring is started from seg2, no subsequent directed road segment is returned to seg1, then seg3 is the same, finally depth-down exploring is started from seg4, a subsequent directed road segment seg5 with a large traffic ratio of seg4 is preferably selected as a next access target, the accessed sequence of the directed road segments is a directed road segment number until all the directed road segments are depth-first searched, the specific sequence of the directed road segment numbers is [ seg1, seg2, seg3, seg4, seg5, seg7, seg9, seg8, seg6], and the directed road segments are as follows:
seg1.id=0;seg2.id=1;seg3.id=2;seg4.id=3;seg5.id=4;seg7.id=5;
seg9.id=6;seg8.id=7;seg6.id=8。
and renumbering each directed road section according to the directed road section sequence number, changing the integral part of the corresponding track point road network position, and finally obtaining track point data as follows:
p1=(0.24,0),p2=(0.43,0),p3=(0.78,3),p4=(0.96,0),p5=(3.12,6),p6=(3.85,0),p7=(1.33,3),p8=(2.32,3),p9=(8.67,3),p10=(4.31,9),p11=(4.90,0),p12=(7.35,0),p13=(7.68,3),p14=(5.43,3),p15=(6.23,12),p16=(6.35,0),p17=(6.65,15);
Step S2, dividing the directed road sections into a plurality of classes according to the length, dividing each class into a plurality of first-order homogeneous region classes according to the number of lanes, dividing all the directed road sections in each first-order homogeneous region class into a plurality of second-order homogeneous region classes according to the length and the number of lanes of the precursor directed road sections, determining the sequence of the second-order homogeneous region classes, and then, in each second-order homogeneous region class, sorting track points according to time, sorting according to the road network positions and determining the sequence numbers of the track points in the second-order homogeneous region classes.
Step 21, counting the total length len=1000 of all the directed road sections, setting n=2 as the classification number of the directed road sections, and averaging the total length Len/n=500 of the directed road sections contained in each class;
step S22, sorting the directed road segments according to ascending order of length, then accumulating the lengths of the directed road segments, if the accumulated length is greater than or equal to Len/n=500, calculating the previously accumulated directed road segments as a class, then re-accumulating from the next directed road segment, and the like, and finally dividing n=2 classes, wherein the specific classification is as follows:
[seg4,seg5,seg2,seg3,seg6,seg7],[seg1,seg8,seg9];
Step S23, in each directed road section class divided in the step S22, the class is divided into 2 classes according to the number of lanes, the number of lanes is less than or equal to 2 and is divided into one class, and the number of lanes is more than or equal to 3 and is divided into the other class. Thus, all the directed road segments can be divided into 2×n=4 first order homogeneous region classes, which are specifically classified as follows:
[seg2,seg3,seg6,seg7],[seg4,seg5],[seg8],[seg1,seg9];
Step S24, in the first-order homogeneous zone class [ seg2, seg3, seg6, seg7], the sum of the flows of the precursor directed road segments of all the directed road segments is counted,
Setting m=2 as the class number of the second-order homogeneous directional road sections, wherein the average total flow of the precursor directional road sections contained in each class is flow sum/m=750;
Step S25, sorting the directed road sections according to the ascending order of the flow of the precursor directed road sections, then accumulating the flow of the precursor directed road sections, calculating the previously accumulated directed road sections as a class if the flow of the accumulated precursor directed road sections is more than or equal to flow sum/m=750, then re-accumulating the directed road sections from the next directed road section, and dividing m=2 second-order homogeneous region classes [ seg6, seg7, seg2], [ seg3];
step S26, repeating the steps S24-S25, and dividing all the directed road sections in the road network into 2×n×m=8 second order homogeneous region classes, where the second order homogeneous region classes are as follows:
C7=[seg1,seg9],C8=[]。C1=[seg6,seg7,seg2],C2=[seg3],C3=[seg5],C4=[seg4],C5=[seg8],C6=[],
for example, the second-order homogeneous region class C7 has the track points p1, p2, p3, p4, p15, p16 and p17 falling into C7, which are ordered according to time, then ordered according to road network position, and the track points of C7 are ordered as follows (p1,p2,p4,p16,p3,p15,p17):p1=(0.24,0,0);p2=(0.43,0,1);p4=(0.96,0,2);p16=(6.35,0,3);p3=(0.78,3,4);p15=(6.23,12,5);p17=(6.65,15,6).
And S3, each second-order homogeneous region class is used for learning a piecewise linear model of a three-dimensional space as a track point storage position predictor, namely learning a track index, inputting track points during training, outputting storage position serial numbers in the second-order homogeneous region class of the track points, and calculating the maximum prediction error of the piecewise linear model.
S31, training a piecewise linear model in each second-order homogeneous region, inputting (x, y) representation (road network position, time), and outputting a region sequence number of z representation track points;
Step S32, setting a maximum error threshold value theta=1.6, connecting the ordered track points, the first point and the last point into a three-dimensional straight line, and obtaining a three-dimensional straight line equation by using a least square method
Wherein P 0 (0.19,3.43,0) is any point on a three-dimensional straight line,Is a direction vector of a three-dimensional straight line. The above equation is converted to a system of equations:
Then traversing all intermediate points, substituting the above equation set to obtain z 1 and z 2, wherein z 1 and z 2 are the z coordinate values of the projection points of the three-dimensional straight line on the xz plane and the yz plane respectively, and calculating the predicted value Calculating prediction errorAnd recording the maximum prediction error err max ≡1.98 and the maximum error point p15= (6.23,12);
Step S33, because the maximum prediction error err max is approximately equal to 1.98>1.6, segmentation is carried out at the maximum error point p15, the error point is added into the right half section with few points, the steps are repeated on the left half section (p 1, p2, p4, p16, p 3) until the maximum prediction error is smaller than θ, segmentation is stopped, and the right half section (p 15, p 17) is not needed to be segmented because only two points are needed, and finally a piecewise linear model is obtained:
repeating the steps for each second order homogeneous region class, and constructing a plurality of models.
And S4, acquiring a road network range and a time range of the original query, performing track query, decomposing the query into a plurality of sub-queries according to the second-order homogeneous region class, calling a corresponding piecewise linear model to predict, acquiring a sub-query result set, screening, combining all the sub-query result sets, and returning to the original query result set.
Step S41, obtaining the road network range [ pos S,posE ] = [0,3] and the time range [ t S,tE ] = [0,3] of the original query;
Step S42, decomposing a road network range [ pos S,posE ] = [0,3 ] in the original query into a plurality of sub-road network ranges [ pos s,pose ] according to a second-order homogeneous region class, wherein pos s is a starting road network position in the second-order homogeneous region class, pos e is a stopping road network position in the second-order homogeneous region class, and each sub-road network range [ pos s,pose ] and a time range [ t S,tE ] form sub-query q (pos s,tS,pose,tE);
For example, the second order homogeneous region class c7= [ seg1, seg9] with corresponding sub-network ranges of [0, 1) and [6,7 ], then the sub-network range belonging to C7 in the road network range [0, 3) of the original query is [ pos s,pose ] = [0, 1), and similarly, the other two sub-network ranges [1, 2) and [2, 3) are decomposed, and then the original query can be decomposed into three sub-queries q 1(0,0,1,3)、q2 (1,0,2,3) and q 1 (2,0,3,3), each sub-query q (pos s,tS,pose,tE) corresponding to one second order homogeneous region class.
Step S43, for each sub-query q (pos s,tS,pose,tE), calling the corresponding piecewise linear model, and respectively inputting (pos s,tS) and (pos e,tE) to obtain the predicted valueAndThen combining the maximum error threshold value theta of the piecewise linear model in the corresponding second-order similar region class, and obtaining the rangeAnd obtaining a candidate track point set, screening track points in the sub-query range, and obtaining a sub-query result set.
For example, sub-query q 1 (0,0,1,3), call the corresponding piecewise linear model, and input (0, 0) and (1, 3), respectively, to obtain the predicted valueAndThen combining the maximum error threshold value theta=1.6 of the piecewise linear model in the corresponding second-order homogeneous region class, and in the rangeThat is, a candidate set of track points { (0.24,0), (0.43,0), (0.96,0), (6.35,0), (0.78,3), (6.23,12) } is acquired in the position of [0,5], and track points within the sub-query range (0,0,1,3) are screened to acquire a sub-query result set { (0.24,0), (0.43,0), (0.96,0), (0.78,3) };
Step S44, repeating step S43, obtaining all sub-query result sets, combining all sub-query result sets, and returning the original query result sets { (0.24,0), (0.43,0), (0.78,3), (0.96,0), (1.33,3), (2.32,3)) }.
The foregoing description is only illustrative of the invention and is not to be construed as limiting the invention. Various modifications and variations of the present invention will be apparent to those skilled in the art. Any modification, equivalent replacement, improvement, or the like, which is within the spirit and principles of the present invention, should be included in the scope of the claims of the present invention.
Claims (7)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311235731.XA CN117194434B (en) | 2023-09-22 | 2023-09-22 | Learning-based trajectory indexing and query method based on homogeneous region partitioning on directed road networks |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311235731.XA CN117194434B (en) | 2023-09-22 | 2023-09-22 | Learning-based trajectory indexing and query method based on homogeneous region partitioning on directed road networks |
Publications (2)
Publication Number | Publication Date |
---|---|
CN117194434A CN117194434A (en) | 2023-12-08 |
CN117194434B true CN117194434B (en) | 2025-08-08 |
Family
ID=88983340
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311235731.XA Active CN117194434B (en) | 2023-09-22 | 2023-09-22 | Learning-based trajectory indexing and query method based on homogeneous region partitioning on directed road networks |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117194434B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN119474147A (en) * | 2024-10-12 | 2025-02-18 | 中国人民解放军96911部队 | Model-based rapid retrieval method and system for road network and place name element information |
CN120144643A (en) * | 2025-05-15 | 2025-06-13 | 中国地质大学(武汉) | Trajectory query method and device based on adaptive space segmentation |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110095127A (en) * | 2019-04-08 | 2019-08-06 | 西北大学 | A kind of hidden Markov model map-matching method based on segmentation |
CN113051359A (en) * | 2021-03-30 | 2021-06-29 | 大连理工大学 | Large-scale track data similarity query method based on multi-level index structure |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113903173B (en) * | 2021-10-18 | 2022-06-03 | 苏州工业园区测绘地理信息有限公司 | Vehicle track feature extraction method based on directed graph structure and LSTM |
CN114462609B (en) * | 2021-12-07 | 2024-12-20 | 深圳市数字城市工程研究中心 | A floating vehicle data trajectory restoration method based on hidden Markov model |
CN115149961B (en) * | 2022-06-30 | 2024-06-21 | 南京邮电大学 | Method for compressing line network track data based on speed deviation and road semantics |
CN116450892A (en) * | 2023-03-29 | 2023-07-18 | 深圳大学 | Road network spatial index construction method based on learning index |
-
2023
- 2023-09-22 CN CN202311235731.XA patent/CN117194434B/en active Active
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110095127A (en) * | 2019-04-08 | 2019-08-06 | 西北大学 | A kind of hidden Markov model map-matching method based on segmentation |
CN113051359A (en) * | 2021-03-30 | 2021-06-29 | 大连理工大学 | Large-scale track data similarity query method based on multi-level index structure |
Also Published As
Publication number | Publication date |
---|---|
CN117194434A (en) | 2023-12-08 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN117194434B (en) | Learning-based trajectory indexing and query method based on homogeneous region partitioning on directed road networks | |
CN112181991B (en) | A Grid Remapping Method for Earth Simulation System Based on Rapid Construction of KD Tree | |
CN111539454B (en) | Vehicle track clustering method and system based on meta-learning | |
CN103514243B (en) | Temporal-spatial data management system and Temporal-spatial data management method | |
Kharrat et al. | Clustering algorithm for network constraint trajectories | |
CN110543539B (en) | Method for inquiring track similarity of moving objects in distributed road network environment | |
CN103631910A (en) | Distributed database multi-column composite query system and method | |
CN113344019A (en) | K-means algorithm for improving decision value selection initial clustering center | |
CN112037539B (en) | A method and system for recommending a signal control scheme for saturated urban traffic networks | |
CN110826785A (en) | High-risk road section identification method based on k-medoids clustering and Poisson inverse Gaussian | |
CN105138607B (en) | A kind of KNN querying methods based on combination grain distributed memory grid index | |
CN113733295B (en) | Path optimization method for 3D printing of concrete | |
CN109308303A (en) | An online aggregation method for multi-table join based on Markov chain | |
CN108416381B (en) | A Multi-Density Clustering Method for 3D Point Sets | |
CN116718205A (en) | A path planning method based on the large-scale sparse Chinese postman problem | |
CN114742593A (en) | A method and system for optimal location selection of a logistics storage center | |
CN118013086B (en) | K represents G-Skyline query method | |
Mu et al. | Recommend taxi pick-up hotspots based on density-based clustering | |
CN111813800B (en) | Streaming data real-time approximate calculation method based on deep reinforcement learning | |
CN110334191A (en) | A Set Similarity Query Algorithm Based on Length Partition | |
CN116522381A (en) | Differential privacy-based non-equilibrium position data publishing method | |
CN1300730C (en) | Backward coarse collecting attribute reducing method using directed search | |
CN113934944B (en) | A spatial structure matching method for spatial text data | |
CN116028827B (en) | Trajectory completion method based on vehicle GPS trajectory data clustering | |
Yeh | Spot: Distance based join indices for spatial data |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |