US7974814B2 - Multiple sensor fusion engine - Google Patents
Multiple sensor fusion engine
- Publication number
- US7974814B2 (application US11/818,651)
- Authority
- US
- United States
- Prior art keywords
- sensor
- mapping
- probability
- space
- fusion engine
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Classifications
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01S—RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
- G01S13/00—Systems using the reflection or reradiation of radio waves, e.g. radar systems; Analogous systems using reflection or reradiation of waves whose nature or wavelength is irrelevant or unspecified
- G01S13/66—Radar-tracking systems; Analogous systems
- G01S13/72—Radar-tracking systems; Analogous systems for two-dimensional tracking, e.g. combination of angle and range tracking, track-while-scan radar
- G01S13/723—Radar-tracking systems; Analogous systems for two-dimensional tracking, e.g. combination of angle and range tracking, track-while-scan radar by using numerical data
- G01S13/726—Multiple target tracking
-
- F—MECHANICAL ENGINEERING; LIGHTING; HEATING; WEAPONS; BLASTING
- F41—WEAPONS
- F41G—WEAPON SIGHTS; AIMING
- F41G3/00—Aiming or laying means
- F41G3/02—Aiming or laying means using an independent line of sight
-
- F—MECHANICAL ENGINEERING; LIGHTING; HEATING; WEAPONS; BLASTING
- F41—WEAPONS
- F41G—WEAPON SIGHTS; AIMING
- F41G7/00—Direction control systems for self-propelled missiles
- F41G7/20—Direction control systems for self-propelled missiles based on continuous observation of target position
- F41G7/22—Homing guidance systems
- F41G7/2206—Homing guidance systems using a remote control station
-
- F—MECHANICAL ENGINEERING; LIGHTING; HEATING; WEAPONS; BLASTING
- F41—WEAPONS
- F41G—WEAPON SIGHTS; AIMING
- F41G7/00—Direction control systems for self-propelled missiles
- F41G7/20—Direction control systems for self-propelled missiles based on continuous observation of target position
- F41G7/30—Command link guidance systems
-
- F—MECHANICAL ENGINEERING; LIGHTING; HEATING; WEAPONS; BLASTING
- F41—WEAPONS
- F41H—ARMOUR; ARMOURED TURRETS; ARMOURED OR ARMED VEHICLES; MEANS OF ATTACK OR DEFENCE, e.g. CAMOUFLAGE, IN GENERAL
- F41H11/00—Defence installations; Defence devices
- F41H11/02—Anti-aircraft or anti-guided missile or anti-torpedo defence installations or systems
Definitions
- This subject invention relates to sensors including but not limited to radar systems used in missile defense applications.
- a Forward Based X-Band Radar subsystem may be used in a missile defense application to detect, track, and discriminate threats.
- Other radar subsystems such as a Sea-Based X-Band Radar may also be used in the missile defense application.
- kinematic data e.g., the number of detected objects, their location, speed, and trajectory
- classification data with assigned probabilities.
- the classification data may discriminate, as between numerous detected objects, whether they are lethal or not, friend or foe, and their type (e.g., re-entry vehicle, decoy, and the like), among other classification criteria.
- Each radar subsystem includes a unique database and software which outputs class probability vectors which are a function of the measurements obtained from the radar itself and the database of that radar subsystem.
- the respective measurements may be different for a given target.
- the respective radar databases may also be configured differently.
- the software based analysis carried out by each subsystem to analyze their measurements as a function of their respective databases may be different. This is especially true for radar subsystems supplied by different vendors and/or at different times or as between different versions in a product development effort.
- one radar subsystem may provide, to the battle management center, a sensor class probability vector P 1 as a function of its measurements Y 1 and its database D 1 and a different radar subsystem may provide a different sensor class probability vector P 2 as a function of its measurements Y 2 and its database D 2 .
- Multiplying and normalizing the probability vectors from each radar subsystem does not work because correlations are ignored. Other possible solutions either fail to yield accurate results or are difficult to implement.
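For contrast, the naive rule dismissed here, element-wise multiplication of the probability vectors followed by normalization, can be sketched in a few lines; it is only valid when the sensors' measurements are conditionally independent given the class, which is exactly the correlation structure it ignores (a sketch, not part of the patent):

```python
import numpy as np

def naive_probability_fusion(p1, p2):
    """Element-wise product of two class probability vectors, renormalized.

    This is the simple rule the text argues against: it is only correct
    when the sensors' measurements are conditionally independent given
    the class, so any correlation between the sensors is ignored.
    """
    fused = np.asarray(p1, dtype=float) * np.asarray(p2, dtype=float)
    return fused / fused.sum()

# Two sensors reporting over the same three classes:
p1 = np.array([0.6, 0.3, 0.1])
p2 = np.array([0.5, 0.4, 0.1])
fused = naive_probability_fusion(p1, p2)  # favors class 0 even more strongly
```

When the sensors' errors are correlated, this product rule double-counts shared evidence, which is why the patent pursues the mapping approach instead.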
- the subject invention results from the realization that the outputs of multiple sensors can be fused using a preconfigured base database and a mapping approach in which an ideal classifier function is approximated. The method and system of this invention heuristically choose a way to parameterize the classifier function, which is then optimized according to a cost function. Because the typically computation-intensive optimization is performed off-line, the mapping itself can occur in real time. The result is a feasible, real-time system and method that uses a minimal amount of computational effort and yields good results.
- the subject invention features a method of fusing the outputs of multiple sensors.
- the preferred method includes inputting to a fusion engine the output of each sensor including a sensor class probability vector based on each sensor's classification database and estimating a base class probability vector based on the sensor class probabilities output by the sensors and a preconfigured mapping function.
- the sensor class probability vector is converted to log-space.
- the typical mapping function includes anchor points and estimating may include finding the differences between the log-space probability vectors and the anchor points.
- the typical mapping function also includes mapping matrices and estimating may include mapping each difference using the mapping matrices.
- the mapping function may further include reference points and estimating may then include adding reference points to the mapped differences.
- Estimating typically includes weighting and summing the mapped differences, converting the weighted and summed mapped differences back to probability space to produce the probability vector, and normalizing the probability vector.
- the subject invention also features a system for fusing the output of multiple sensors.
- the preferred system comprises multiple sensors each including a classification database and each outputting a sensor class probability vector based on its classification database.
- a fusion engine is responsive to the output of each sensor.
- a base database includes base classes and the fusion engine is configured to estimate a base class probability vector based on the sensor outputs and the base database classes.
- the typical fusion engine is configured to convert the sensor class probability vectors to log-space.
- a preferred mapping function includes anchor points and the fusion engine is configured to find the differences between the log-space probability vectors and the anchor points.
- the mapping function may include mapping matrices and the fusion engine is then configured to map each difference using the mapping matrices. If the mapping function includes reference points, the fusion engine can be configured to add reference points to the mapped differences.
- the preferred fusion engine is configured to weight and sum the mapped differences, to convert the weighted and summed mapped differences back to probability space to produce the probability vector, and to normalize the probability vector.
- FIG. 1 is a highly schematic view showing an example of an application of a method of fusing the outputs of multiple sensors in accordance with the subject invention
- FIG. 2 is a block diagram showing an example of a fusion node in accordance with the subject invention
- FIG. 3 is a plot of entropy over the simplex in three-dimensional probability space
- FIG. 4 is a plot of entropy over the simplex in log-space in accordance with the subject invention.
- FIG. 5 is a flow chart depicting the primary steps associated with the programming of the fusion node of FIG. 2 ;
- FIG. 6 is a graph showing the base and sensor class distributions for an example when the method of the subject invention was implemented
- FIGS. 7A-7C are plots showing the normalized surprisals for 200 sample points for an example of the implementation of the subject invention showing, respectively, probability fusion, mapping, and feature fusion;
- FIG. 8 is a graph showing the base and sensor class distributions for a second example of the implementation of the method of the subject invention.
- FIGS. 9A-9C are graphs showing the normalized surprisals of 200 sample points for the second example referred to with respect to FIG. 8 for mapping (LS), mapping, and feature fusion, respectively;
- FIG. 10 is a graph showing the base and sensor class distributions for a third example of the implementation of the method of the subject invention.
- FIGS. 11A-11C are plots of the normalized surprisals of 200 sample points for the third example of FIG. 10 showing mapping (LS), mapping, and feature fusion, respectively;
- FIG. 12 is a flow chart describing how mapping may be accomplished in accordance with the subject invention.
- FIG. 1 shows multiple sensor subsystems in a ballistic missile defense application in accordance with an example of the subject invention.
- the multiple sensors include Forward Based X-Band Radar subsystem 10 , Sea-Based X-Band Radar subsystem 12 , Aegis sensor 14 , early warning radar 16 , early warning satellite 18 , and the like all of which assist in tracking and classifying threat 20 and re-entry vehicles 22 and decoys 24 deployed by missile 20 .
- the probability vectors (sensor class probabilities) output by each sensor subsystem are transmitted to battle management center 30 which deploys, sometimes automatically, kill vehicle 32 to intercept any targets likely to be threats.
- the different sensors may output different sensor class probabilities to battle management center 30 .
- sensor 10 , FIG. 2 (e.g., a Forward Based X-Band Radar subsystem), outputs sensor class probability vector P 1 which is a function of its measurements (Y 1 ) and its database (D 1 ).
- sensor 12 (e.g., a Sea-Based X-Band Radar subsystem) outputs sensor class probability vector P 2 which is a function of its measurements (Y 2 ) and its database (D 2 ).
- Each sensor's measurements Y are typically a function of the features detected, the sensor parameters, and sensor noise.
- the respective measurements may differ for a given target.
- the respective databases D 1 and D 2 may also be configured differently.
- the software based analysis of each subsystem's measurements as a function of their respective databases may be different.
- fusion node 50 (which may be implemented in a computer located at battle management center 30 , FIG. 1 ) includes software which receives as an input the classification results (P 1 , P 2 ) output by each sensor based on each sensor's database (D 1 , D 2 ).
- Software within fusion node 50 is configured to estimate a classification result q̂ as a function of P 1 , P 2 , and D 0 , a preconfigured base database populated with base classes as described below.
- One or more sensors 10 , 12 take measurements of a target's attributes.
- the sensors can measure different sets of attributes, have different noise variances, and have different measurement parameters (e.g., aspect angle). Rather than passing the measurements to fusion node 50 , they each run their own classifier, and pass the resulting probability vector p.
- the sensors have their own databases D 1 and D 2 , which may be different from each other.
- Fusion node 50 is configured to estimate the probabilities of classes in its own database D 0 which is not necessarily common to any of the sensor databases.
- the classes of database D 0 are herein called base classes.
- Bayes' law optimally classifies from a set (vector) of features as follows:
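The equation itself is missing from this excerpt; the standard Bayes classifier, consistent with the likelihoods f(y|c) discussed next, would be:

```latex
q_c(y) \;=\; P(c \mid y) \;=\; \frac{f(y \mid c)\,P(c)}{\sum_{c'} f(y \mid c')\,P(c')}
```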
- the difficulty is that there is no simple explicit form for the likelihoods. Even if the conditional distributions for the features are explicitly defined, the likelihoods won't be.
- the distributions can have any form: Gaussian, Gaussian-sum, uniform, piecewise-linear, and the like.
- the classifiers can be treated as black boxes. It can help to think of them as arbitrary functions that return probability vectors, rather than as classifiers. Taking this view makes it apparent that the multi-sensor fusion problem can be treated in the same way as the single-sensor database-mismatch problem.
- the p vectors are stacked and then processed as usual, because they are nothing but data on which we want to condition our estimates of class probabilities.
- one possible method, inversion, involves inverting the p vector to find a conditional probability density function for the measurements/features. This method is not practical to implement: given a p vector, if the sensor classifier function were inverted, the result would be one or more regions of measurement space.
- given the resulting density f(y|p), the ideal feature classifier can be integrated over it to obtain the conditional class probabilities: q̂(p) = ∫ q(y) f(y|p) dy
- This method would be difficult to implement, and would require a large amount of computation in real time.
- The preferred approach, called mapping, turns out to be feasible and yields good results. Under certain conditions it can achieve optimal results.
- mapping an approximation is made of the ideal classifier function q(p).
- the method involves heuristically choosing a way to parameterize this function, and then optimizing it according to a cost function.
- the computation intensive optimization is preferably performed offline, and the real time application of the mapping uses a small amount of computation. Additionally, this method doesn't suffer the curse of dimensionality as badly as the histogram method because it exploits smoothness of distributions in measurement space. This is loosely analogous to approximating a time series by a truncated Fourier series.
- the mapping approach is described in more detail below.
- One preferred cost function for optimizing an estimator of conditional probabilities is the expected value, taken over the data, of the cross-entropy between the optimal estimator and the existing estimator.
- the optimal estimator q(p)
- the cost has the important property that it is minimized by the optimal estimator.
- the cost function is defined in a form that is convenient for implementation, and can be expressed using the information theoretic terms mentioned above:
- c The true class. Class definitions are consistent with the base database. When c appears as a subscript of a vector, it denotes taking the c th element of that vector.
- p The probability vector from a sensor. These classes are the sensor classes. For the case where there is more than one sensor, p is a larger, stacked vector of the sensors' probability vectors.
- Equation 12 shows that the cost is equal to the expected entropy of the optimal estimator plus the expectation of the Kullback-Leibler divergence of our estimator with respect to the optimal estimator. This follows from Equation 11 because cross entropy can always be expressed as: entropy plus KL-divergence. Entropy and KL-divergence are both non-negative quantities, and KL-divergence takes a minimum value of zero when the two probability vectors are equal.
- the first term in Equation 12 is non-negative, and independent of our estimator—this term is the lower bound of the cost.
- the second term is also non-negative, and takes the value of zero when the estimator is equal to the optimal estimator. Therefore, the cost is minimized by the optimal estimator.
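Written out, the identity used above (with q the optimal estimator and q̂ ours) is:

```latex
\underbrace{-\sum_c q_c \log \hat{q}_c}_{\text{cross entropy}}
\;=\;
\underbrace{-\sum_c q_c \log q_c}_{H(q)}
\;+\;
\underbrace{\sum_c q_c \log \frac{q_c}{\hat{q}_c}}_{D_{\mathrm{KL}}(q\,\|\,\hat{q})}
```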
- Equation 6 is interpreted as the expected surprisal of the true class, given the estimated probability vector.
- the surprisal of an event is defined as the negative logarithm of the probability of the event. Therefore, an occurrence of an event that is thought to have a low probability would have a high surprisal. Intuitively, it can indeed be thought of as how surprised one would be to observe an event. So, the preferred estimator provides class probabilities such that on average, if we were to observe the true class afterwards, we would be the least surprised.
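The surprisal definition can be illustrated directly (natural logarithm used here; the examples later normalize the base):

```python
import numpy as np

def surprisal(p_event):
    # Surprisal of an event: the negative logarithm of its probability.
    return -np.log(p_event)

# A low-probability event is highly surprising; a near-certain one is not.
rare = surprisal(0.01)
expected = surprisal(0.99)
```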
- the information-theoretic cost provides the best criterion for optimizing our estimator, but it is also useful to use a simpler cost.
- the other cost is a least squares cost, which produces a relatively quick initial solution in closed form.
- the set of all possible n-dimensional probability vectors forms a simplex in n-dimensional probability space.
- the simplex is the region of points whose elements are non-negative and sum to one.
- the mapped probability vector should also satisfy these constraints, which can be done either by solving a constrained optimization problem or by projecting the mapped result q̃ onto the simplex. The latter method is preferred because solving a constrained problem is more difficult, and the average performance tends to suffer as a result of the constraints. If all elements of q̃ are non-negative, then normalizing it (dividing by its sum) will project the point onto the simplex.
- the tilde (˜) signifies that a probability vector hasn't been normalized, i.e., doesn't lie on the simplex.
- the general form of the mapping function, and the post-mapping normalization are described as:
- the log function transforms the interval (0, ⁇ ) to ( ⁇ , ⁇ ), so there are no longer “bad” regions to avoid.
- the mapping is now optimized with no constraints because any mapped point can be projected onto the simplex. Another reason for mapping in log-space is that distance is more intuitive in log-space. It captures the notion that for probabilities, the difference between 0 and 0.1 is more significant than the difference between 0.6 and 0.7.
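The log-space conversion and the projection back onto the simplex described above can be sketched as follows (the zero-guard eps is an implementation assumption, not from the patent):

```python
import numpy as np

def to_log_space(p, eps=1e-300):
    # log maps (0, 1] into (-inf, 0], removing the "bad" regions;
    # eps guards against exact zeros, where log would be -inf.
    return np.log(np.maximum(np.asarray(p, dtype=float), eps))

def project_to_simplex(q_tilde):
    # For a vector with non-negative elements, dividing by the sum
    # projects the point onto the simplex (elements sum to one).
    q_tilde = np.asarray(q_tilde, dtype=float)
    return q_tilde / q_tilde.sum()

p = np.array([0.2, 0.3, 0.5])
s = to_log_space(p)                # work unconstrained in log-space
q = project_to_simplex(np.exp(s))  # back onto the simplex
```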
- y The vector of features.
- p The probability vector from a sensor. These classes are the sensor classes. For the case where there is more than one sensor, p is a larger, stacked vector of the sensors' probability vectors. This is a function of the features y, which were measured by the sensors.
- q(y) The ideal feature classifier. This operates on the features, and returns a probability vector of the base classes.
- q̃(p) The un-normalized mapping function. This operates on the sensors' probability vectors, and returns the un-normalized estimated probability vector of the base classes.
- f(·) A continuous probability density function. (A discrete pdf can be expressed as a continuous pdf by using delta functions.)
- E f(·) [·] The expected value of [·] over f(·).
- p(y) The probability vector p viewed as a deterministic function of the features y.
- q̂(p) is parameterized in a way that allows a solution for the parameters using least squares.
- the function for estimating base class probabilities, q̂(p), is a nonlinear mixing of jointly optimized linear mappings in log-space.
- One element of the function is the use of “anchor points” to localize mappings to different regions of probability space.
- the localization involves a nonlinear function that weights the multiple linear mappings.
- the estimator function carried out by fusion node 50 , FIG. 2 , and as shown in FIG. 5 typically has nine steps:
- step 60 the probability vectors are stacked. If two or more sensors report probability vectors, they are made into one large vector by stacking them:
- This stack is then converted to log-space, step 62 .
- the weights are computed, step 64 , by first finding the squared distance of the input s p from each of the anchor points { s pi }. Then, to compute the i th weight, the product is taken of the squared distances to each anchor point except the i th .
- the equations for the vector of weights are:
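The weight equations themselves are not reproduced in this excerpt; a sketch consistent with the description of step 64 (normalization of the weights is an assumption) is:

```python
import numpy as np

def anchor_weights(s_p, anchors):
    """Weights localizing the linear mappings (sketch of step 64).

    s_p     : stacked sensor probability vector in log-space
    anchors : list of anchor points {s_pi}, also in log-space
    The i-th weight is the product of the squared distances from s_p to
    every anchor point EXCEPT the i-th, so inputs near anchor i weight
    mapping i most heavily.  Normalizing the weights is an assumption.
    """
    d2 = np.array([np.sum((s_p - a) ** 2) for a in anchors])
    w = np.array([np.prod(np.delete(d2, i)) for i in range(len(anchors))])
    return w / w.sum()
```

An input exactly at anchor i receives all the weight, since every other product contains the zero distance to anchor i as a factor.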
- step 72 the results of the individual mappings are combined by taking a weighted sum:
- step 76 normalizes the resulting probability vector by dividing each element by the sum of the elements:
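Putting the steps together (stack, convert to log-space, compute weights, map the differences, add reference points, sum, convert back, normalize), an end-to-end sketch follows; the shapes of the mapping matrices and reference points, and the weight normalization, are assumptions:

```python
import numpy as np

def fuse(p_list, anchors, maps, refs):
    """End-to-end sketch of the estimator of FIG. 5 (not the patent's code).

    p_list  : probability vectors from the sensors (all elements > 0)
    anchors : anchor points {s_pi} in stacked log-space
    maps    : one mapping matrix per anchor (base classes x stacked length)
    refs    : one reference point per anchor, in log-space
    """
    s_p = np.log(np.concatenate(p_list))          # steps 60, 62: stack + log
    d2 = np.array([np.sum((s_p - a) ** 2) for a in anchors])
    w = np.array([np.prod(np.delete(d2, i)) for i in range(len(anchors))])
    w = w / w.sum()                               # step 64: weights
    # map each difference, add the reference point, take the weighted sum
    s_q = sum(w[i] * (maps[i] @ (s_p - anchors[i]) + refs[i])
              for i in range(len(anchors)))
    q_tilde = np.exp(s_q)                         # back to probability space
    return q_tilde / q_tilde.sum()                # step 76: normalize
```

With zero mapping matrices and an input sitting exactly on an anchor point, the output reduces to that anchor's reference point mapped back to probability space, which makes the role of the reference points easy to see.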
- the preconfigured data involved in the mapping function includes anchor points 80 , mapping matrices 82 , and reference points 84 .
- the mapping matrices and reference points are a result of the optimization. How to determine the anchor points will be described later.
- n w is the length of the weight vector (and the number of anchor points).
- the matrix N specifies the mapping matrices 82 and the reference points 84 .
- the normalized mapped result in log-space is:
- the mapping matrix N is optimized according to the information theoretic cost. This cost is a nonlinear function which might have multiple local minima, so numerical optimization is used. In the optimization, better results are obtained if there is a good initial guess. For the initial guess, a mapping that minimizes the least squares cost function is preferred.
- the process of determining the mappings is described in FIG. 12 . It typically includes six steps: build the base database D 0 , step 100 , run simulations, step 102 , determine anchor points, step 104 , determine least squares reference points, step 106 , compute initial mapping, step 108 , and compute final mapping, step 110 .
- step 100 involves choosing scenarios, threat types, and trajectories, and populating the database with them.
- the database should encompass all the variations that are expected to be encountered, such as object shapes, materials, deployment times, and trajectory types. Classes are defined, such as: RV, tank, decoy, debris, etc. Each object in the database is marked with its true class. For instance, one trajectory in the database may have 10 objects, consisting of 1 RV, 1 tank, 3 decoys, and 5 debris.
- step 102 radar simulations are run on all trajectories and objects (or a representative set) from the database, which simulate the radars measuring features of the objects, and classifying them using the sensor databases.
- Each sensor classifies the objects it measures, according to its own database, which may or may not match the base database.
- the result of this step is a set of training points.
- Each training point describes one object, and contains the object's true base class, the probability vector that came from each sensor, and the feature vector that each sensor used.
- step 104 the use of multiple anchor points (mappings) amounts to something similar to what is known as the “kernel trick”.
- the kernel trick is a way to make a nonlinear estimator without having to solve nonlinear equations. It is done by first nonlinearly transforming the input data into a higher dimensional space, and then solving for a linear estimator that operates on the transformed input data.
- the anchor points are found by running each sensor's classifier (using that sensor's database) on the feature vector that comes from the mean of each base class distribution, stacking them, and then converting to log-space.
- s̄ pi = log ( p ( ȳ i ) )  (38)
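A sketch of this anchor-point construction; `sensor_classifiers` (one callable per sensor, feature vector in, probability vector out) and `class_means` (the mean feature vector of each base class distribution) are hypothetical stand-ins:

```python
import numpy as np

def compute_anchor_points(class_means, sensor_classifiers):
    """Run each sensor's classifier on each base class's mean feature
    vector, stack the resulting probability vectors, and convert the
    stack to log-space (a sketch of Eq. 38)."""
    anchors = []
    for y_bar in class_means:
        stacked = np.concatenate([clf(y_bar) for clf in sensor_classifiers])
        anchors.append(np.log(stacked))
    return anchors
```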
- a least squares reference point will be computed for each training point. These will be used to define the least squares cost function when computing the initial mapping.
- the partial derivatives of the least squares cost are:
- N = E f(y) [ s q* u T ] pinv ( E f(y) [ uu T ] )  (48)
- the expected values are taken using Monte Carlo integration over the set of training points. Then N is computed via the previous equation. N defines the mappings and reference points. This is the initial mapping solution.
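Under the reading that Eq. 48 is the least squares normal-equation solution N = E[s_q* u^T] pinv(E[u u^T]), with the expectations replaced by Monte Carlo averages over the training points, a sketch is (array shapes are assumptions):

```python
import numpy as np

def initial_mapping(S_qstar, U):
    """Least squares initial solution for the parameter matrix N (sketch).

    S_qstar : (n_points, n_base) least squares reference targets (log-space)
    U       : (n_points, n_u) inputs of weighted differences and weights
    """
    n = len(U)
    E_su = (S_qstar.T @ U) / n           # Monte Carlo E[s_q* u^T]
    E_uu = (U.T @ U) / n                 # Monte Carlo E[u u^T]
    return E_su @ np.linalg.pinv(E_uu)   # N = E[s_q* u^T] pinv(E[u u^T])
```

When the targets really are a linear function of the inputs and U has full column rank, this recovers the generating matrix exactly, which is what makes it a good initial guess for the nonlinear optimization.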
- the “fminunc” function from MATLAB's optimization toolbox is used to minimize the cost function.
- Functions are provided to the fminunc function, which compute the cost and its gradient and Hessian.
- the total cost and its derivatives are computed. Having analytic expressions for the partial derivatives (as opposed to using finite differencing), leads to faster optimization and better results.
- the equations for the single point cost, gradient, and Hessian are as follows:
- ε The single point cost of the q̂(·) function.
- q̂ The value of the q̂(·) function at one training point. This is the normalized probability vector of base classes.
- a c The c th column of the identity matrix. Alternatively, it could be defined as the c th column of some arbitrary matrix, whose elements can be chosen to represent costs of various types of misclassification.
- N The matrix containing all mappings and reference points, described in equation 31.
- u The input vector of weighted differences and weights, as described in equation 32.
- N(:) The vectorized parameter matrix. It is formed by stacking the columns of N, in order. The (:) notation is borrowed from MATLAB.
- ∂ε/∂N The partial derivatives of the single point cost with respect to the parameter matrix N. This is a matrix that is the same size as N.
- ∂ε/∂N(:) The partial derivatives of the single point cost with respect to the vectorized N. This is the same size as vectorized N.
- ∂²ε/∂N(:)∂N(:) The Hessian (2nd partial derivatives) of the single point cost function with respect to vectorized N.
- ⊗ The Kronecker product operator.
- diag(·) The function which forms a diagonal matrix from a vector (as implemented in MATLAB).
- the following three examples show the performance of this system.
- the results are based on simulated data.
- the surprisals for 200 sample points were plotted and the mean calculated, which is the cost.
- the mean surprisal approaches the expected cross entropy between the optimal estimator and our estimator.
- the surprisals are normalized, meaning that the logarithms are taken with the base equal to the number of base classes.
- the sample points are random draws from the joint distribution of class and measurements, and are drawn separately from the training points.
- a uniform distribution always results in a surprisal of 1. While it is possible for the cost to be any positive number, 1 is the upper bound on the cost for a useful algorithm. One can always meet the performance of this bound by admitting total ignorance (a uniform distribution). If the cost is greater than 1, then we could do better by just ignoring the data, and declaring a uniform distribution. We also have a lower bound, which we get by evaluating the performance of feature fusion. Feature fusion is optimal, but it uses the features, so in general it has more information than our algorithm. As a result, the cost from feature fusion is a lower bound that, depending on the scenario, may not be possible to reach.
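The normalized surprisal used in these plots (logarithm taken with base equal to the number of base classes) can be sketched as follows; a uniform distribution scores exactly 1, the upper bound for a useful algorithm:

```python
import numpy as np

def normalized_surprisal(q_hat, true_class):
    # Logarithm base = number of base classes, so the surprisal of the
    # true class under a uniform distribution is exactly 1.
    n = len(q_hat)
    return -np.log(q_hat[true_class]) / np.log(n)

uniform = np.ones(5) / 5
score = normalized_surprisal(uniform, 2)  # -> 1.0
```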
- the first example is designed to show how much we can improve performance, relative to probability fusion.
- the setup is as follows: 2 features, 2 base classes, and 2 sensors wherein sensor 1 measures feature 1 , and sensor 2 measures feature 2 . There is no database mismatch and the sensors have the correct marginals. 10,000 training points were used. See FIG. 6 .
- the two base classes have equal covariances, but their means differ in both dimensions.
- the covariances are oriented at 45°, and have aspect ratios of 10:1, so the two features are very correlated. Because of the large correlation, and relatively small difference in the means, both sets of marginals have a lot of overlap. Because of the overlap, the results from simple probability fusion are very conservative, meaning that the estimated class probabilities are nearly uniformly distributed. If we had the features, we would get much better performance, because having both features makes it easier to see the separation of the classes. In this case however, it is possible to recover the features, because the probability vectors from the sensors are monotonic functions, which makes them invertible. Our algorithm, while not explicitly inverting to get the measurements, does essentially just as well as feature fusion in this case. If we were to increase the number of training points, the performance would be even closer to that of feature fusion.
- the second example is designed to show a more complicated case, with multiple sensors and database mismatch.
- the setup is as follows: 2 features, 5 base classes, and 2 sensors wherein sensor 1 measures feature 1 , sensor 2 measures feature 1 and feature 2 .
- the sensors have badly mismatched distributions. 10,000 training points were used. See FIG. 8 .
- the five base class distributions vary in spread and correlation.
- Sensor 1 measures feature 1 , and has three classes.
- Sensor 2 measures both features, but has only two highly overlapping classes.
- we'll compare against the mapping derived from the least squares cost. In this case, the final mapping performs far better than declaring uniform probabilities (ignorance), substantially better than the least squares cost mapping, and very close to feature fusion. This shows that we can fuse probability vectors from multiple sensors in the presence of database mismatch, and recover most of the performance, relative to feature fusion.
- The third example is the same as the second case, except that now we only have sensor 2.
- The setup is as follows: 2 features, 5 base classes, and 1 sensor (which measures both features).
- The sensor has badly mismatched distributions. 10,000 training points were used. See FIG. 10.
- The sensor measures both features, and has the same distributions as sensor 2 in the previous example.
- This example is designed to test the algorithm when there is a great deal of missing information. It is clearly impossible to do as well as feature fusion, because the two sensor classes will not give much insight into feature 1, since they differ significantly only along feature 2.
- A good algorithm will admit uncertainty when there is insufficient information. The worst thing an estimator can do is to be overconfident, favoring some classes too strongly while ruling out other classes. Overconfidence is much worse than ignorance: it leads to performance that is worse than ignoring the data entirely and declaring uniform probabilities.
- An overconfident estimator causes leakage, the declaration of a lethal target as non-lethal. Our estimator is able to avoid overconfidence while discriminating aggressively, as this case shows. The results are substantially better than declaring ignorance. We do not do as well as feature fusion, but as noted above, that is impossible in this example.
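The claim that overconfidence is worse than ignorance can be made concrete with a cross-entropy (log-loss) comparison, the cost family used throughout this description. The numbers below are hypothetical, chosen only to illustrate the asymmetry:

```python
import math

def log_loss(q_hat, true_class):
    """Cross-entropy cost of a declared probability vector q_hat
    when the true class turns out to be `true_class`."""
    return -math.log(q_hat[true_class])

uniform = [0.25, 0.25, 0.25, 0.25]          # ignorance
overconfident = [0.97, 0.01, 0.01, 0.01]    # strongly favors class 0

# If the true class is one the estimator effectively ruled out, the
# overconfident declaration pays far more than ignorance does.
print(log_loss(uniform, 2))        # ≈ 1.39
print(log_loss(overconfident, 2))  # ≈ 4.61
```

The uniform declaration pays a bounded, predictable cost regardless of the true class; the overconfident one pays little when it happens to be right and a large penalty when it is wrong, which is exactly the leakage risk described above.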
- The result is a feasible and useful method of, and system for, estimating conditional class probabilities from arbitrary data, which performs well in the presence of missing information. It is able to discriminate aggressively without being overconfident. It has the desirable quality of seeking the optimal, or true, conditional class probabilities without assuming a decision rule for how the probabilities will be used. This is ideal for operating within a distributed-control, multi-sensor network.
- The method we have developed provides a good discrimination/fusion solution for the existing data interface problem.
Description
p = g(y),  Y = {y : g(y) = p}  (3)
f(y|p) = f(y | y ∈ Y)  (4)
Symbol | Meaning
---|---
c | The true class. Class definitions are consistent with the base database. When c appears as a subscript of a vector, it denotes taking the cth element of that vector.
p | The probability vector from a sensor. These classes are the sensor classes. For the case where there is more than one sensor, p is a larger, stacked vector of the sensors' probability vectors.
q(p) | The true conditional probability vector of the base classes. That is: qi(p) = Pr(c = i given p). This is the optimal estimator, which in general is unrealizable.
q̂(p) | The estimated conditional probability vector of the base classes. This function is what we need to define.
Ψ(q̂(·)) | The cost of the selected q̂(·) function. The argument is omitted because the cost is averaged over all p, and therefore the cost is not a function of p.
f(·) | A continuous probability density function. (A discrete pdf can be expressed as a continuous pdf by using delta functions.)
E_f(·)[·] | The expected value of [·] over f(·).
H_x(q(p), q̂(p)) | The cross entropy of q̂(p) with respect to the true probability vector q(p).
H(q(p)) | The entropy of q(p).
D_KL(q(p)∥q̂(p)) | The Kullback-Leibler divergence of q̂(p) with respect to q(p).
S(x) ≡ −log(x), and  (13)
S⁻¹(x) = exp(−x).  (14)
s_p = S(p), and  (15)
p = S⁻¹(s_p).  (16)
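The log-space transform of equations (13)-(16) amounts to a simple elementwise conversion; the following Python transcription shows the round trip. Note that for S(x) = −log(x) the inverse must be exp(−x), so that S⁻¹(S(p)) = p:

```python
import math

def S(p):
    """Probability space -> log-space: S(p) = -log(p), elementwise."""
    return [-math.log(x) for x in p]

def S_inv(s_p):
    """Log-space -> probability space: exp(-s), elementwise."""
    return [math.exp(-x) for x in s_p]

p = [0.7, 0.2, 0.1]
# The round trip recovers the original probability vector.
assert all(abs(a - b) < 1e-12 for a, b in zip(S_inv(S(p)), p))
```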
Symbol | Meaning
---|---
y | The vector of features.
p | The probability vector from a sensor. These classes are the sensor classes. For the case where there is more than one sensor, p is a larger, stacked vector of the sensors' probability vectors. This is a function of the features y, which were measured by the sensors.
q(y) | The ideal feature classifier. This operates on the features, and returns a probability vector of the base classes.
q̃(p) | The un-normalized mapping function. This operates on the sensors' probability vectors, and returns the un-normalized estimated probability vector of the base classes.
S(·) | The conversion from probability space to log-space, defined by: S(p) = −log(p).
Ψ_LS(q̂(·)) | The least-squares cost of the selected q̂(·) function. The argument is omitted because the cost is averaged over all p, and therefore the cost is not a function of p.
f(·) | A continuous probability density function. (A discrete pdf can be expressed as a continuous pdf by using delta functions.)
E_f(·)[·] | The expected value of [·] over f(·).
s_p = −log(p)  (23)
Δs_pi = s_p − s̄_pi
Δs_qi = M_i Δs_pi  (27)
ŝ̃_qi = s̄_qi + Δs_qi
q̃ = exp(−ŝ̃_q)  (30)
N = [M_1 M_2 . . . M_n s̄_q1 s̄_q2 . . . s̄_qn]  (31)
u = [w_1(s_p − s̄_p1)ᵀ w_2(s_p − s̄_p2)ᵀ . . . w_n(s_p − s̄_pn)ᵀ w_1 w_2 . . . w_n]ᵀ  (33)
ŝ̃_q = Nu.  (34)
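The mapping can be sketched directly from these equations: convert the stacked sensor probability vector to log-space, apply each local affine map M_i about its reference pair, blend the results with the weights w_i, convert back with exp(−·), and normalize. The following Python sketch is our reading of this chain, not a verbatim implementation; the shapes, the weighting scheme (weights assumed to sum to 1), and the final normalization are stated assumptions.

```python
import math

def estimate_base_probs(p, refs, maps, weights):
    """Sketch of the log-space mapping: s_p = -log(p); for each local
    map i, form d = s_p - s_p_ref_i, map it through M_i about the
    base-class reference s_q_ref_i, blend with weight w_i; then
    exponentiate and normalize.  `refs` is a list of
    (s_p_ref_i, s_q_ref_i) pairs; `maps` the M_i matrices."""
    s_p = [-math.log(x) for x in p]
    nq = len(refs[0][1])                  # number of base classes
    s_q = [0.0] * nq
    for (sp_ref, sq_ref), M, w in zip(refs, maps, weights):
        d = [a - b for a, b in zip(s_p, sp_ref)]
        for i in range(nq):
            s_q[i] += w * (sq_ref[i] + sum(M[i][j] * d[j] for j in range(len(d))))
    q = [math.exp(-x) for x in s_q]       # back to probability space
    total = sum(q)                        # final normalization step
    return [x / total for x in q]
```

With a single reference point whose log-space location matches the input exactly, the output reproduces the reference base-class probabilities, which is a useful sanity check on the chain of transforms.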
where:
Symbol | Meaning
---|---
Ψ_LS(N) | The single point least-squares cost of the q̂(·) function determined by N.
N | The matrix containing all mappings and reference points, described in equation 31.
Δs | The difference between the mapped result in log-space (pre-normalization) and the corresponding least squares reference point.
s_q* | The least squares reference point. Each training point has one of these associated with it.
ŝ̃_q | The mapped result in log-space before normalization.
A | The matrix that projects a vector of length n_q (the number of base classes) onto the zero-sum plane. A = I − (1/n_q)11ᵀ
E_f(y)[·] | The expected value of [·] over f(y).
f(y) | The pdf of the feature vector.
u | The input vector of weighted differences and weights, as described above.
N = pinv(A) A E_f(y)[s_q* uᵀ] pinv(E_f(y)[u uᵀ])  (48)
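Equation (48) is the familiar normal-equations form: the optimal linear parameter matrix is a cross-moment matrix times the (pseudo)inverse of the input autocorrelation, with both expectations estimated by sample averages over training data. The following stripped-down Python sketch illustrates that structure only; it drops the projection matrix A, uses a scalar output and a 2-dimensional u, and replaces pinv with a plain 2×2 inverse. All of those simplifications are ours.

```python
def fit_linear_map(samples):
    """Least-squares fit of N in s ≈ N u from (u, s) training pairs,
    via sample-average moments: N = E[s u^T] (E[u u^T])^{-1}.
    Simplified to scalar s and 2-dimensional u."""
    n = len(samples)
    # Sample-average cross-moment E[s u^T] (1x2) and autocorrelation E[u u^T] (2x2).
    c = [sum(s * u[j] for u, s in samples) / n for j in range(2)]
    R = [[sum(u[i] * u[j] for u, _ in samples) / n for j in range(2)] for i in range(2)]
    det = R[0][0] * R[1][1] - R[0][1] * R[1][0]
    R_inv = [[R[1][1] / det, -R[0][1] / det],
             [-R[1][0] / det, R[0][0] / det]]
    # Row vector N = c R^{-1}.
    return [c[0] * R_inv[0][0] + c[1] * R_inv[1][0],
            c[0] * R_inv[0][1] + c[1] * R_inv[1][1]]

# Data generated exactly by s = 2*u0 + 3*u1: the fit recovers [2, 3].
data = [((1.0, 0.0), 2.0), ((0.0, 1.0), 3.0), ((1.0, 1.0), 5.0), ((2.0, 1.0), 7.0)]
print(fit_linear_map(data))  # ≈ [2.0, 3.0]
```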
where:
Symbol | Meaning
---|---
ψ | The single point cost of the q̂(·) function.
q̂ | The value of the q̂(·) function at one training point. This is the normalized probability vector of base classes.
a_c | The cth column of the identity matrix. Alternatively, it could be defined as the cth column of some arbitrary matrix, whose elements can be chosen to represent costs of various types of misclassification.
N | The matrix containing all mappings and reference points, described in equation 31.
u | The input vector of weighted differences and weights, as described above.
N(:) | The vectorized parameter matrix. It is formed by stacking the columns of N, in order. The (:) notation is borrowed from MATLAB.
∂ψ/∂N | The partial derivatives of the single point cost with respect to the parameter matrix N. This is a matrix that is the same size as N.
∂ψ/∂N(:) | The partial derivatives of the single point cost with respect to the vectorized N. This is the same size as the vectorized N.
∂²ψ/∂N(:)∂N(:)ᵀ | The Hessian (2nd partial derivatives) of the single point cost function with respect to the vectorized N.
⊗ | The Kronecker product operator.
diag(·) | The function which forms a diagonal matrix from a vector (as implemented in MATLAB).
Claims (16)
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US11/818,651 US7974814B2 (en) | 2007-06-15 | 2007-06-15 | Multiple sensor fusion engine |
PCT/US2008/006070 WO2009023052A2 (en) | 2007-06-15 | 2008-05-13 | Multiple sensor fusion engine |
Publications (2)
Publication Number | Publication Date |
---|---|
US20100070238A1 US20100070238A1 (en) | 2010-03-18 |
US7974814B2 true US7974814B2 (en) | 2011-07-05 |
Family
ID=40351334
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US11/818,651 Expired - Fee Related US7974814B2 (en) | 2007-06-15 | 2007-06-15 | Multiple sensor fusion engine |
Country Status (2)
Country | Link |
---|---|
US (1) | US7974814B2 (en) |
WO (1) | WO2009023052A2 (en) |
Families Citing this family (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8260052B1 (en) | 2010-11-30 | 2012-09-04 | Raytheon Company | Object identification via data fusion |
US8468111B1 (en) | 2010-11-30 | 2013-06-18 | Raytheon Company | Determining confidence of object identification |
US8595177B1 (en) | 2011-03-08 | 2013-11-26 | Raytheon Company | Risk management for object identification |
US9389681B2 (en) * | 2011-12-19 | 2016-07-12 | Microsoft Technology Licensing, Llc | Sensor fusion interface for multiple sensor input |
US10168420B1 (en) * | 2014-07-15 | 2019-01-01 | Herbert U. Fluhler | Nonlinear interferometric imaging sensor |
DE102015011058A1 (en) * | 2015-08-27 | 2017-03-02 | Rheinmetall Waffe Munition Gmbh | Threat prevention system |
CN108230421A (en) * | 2017-09-19 | 2018-06-29 | 北京市商汤科技开发有限公司 | A kind of road drawing generating method, device, electronic equipment and computer storage media |
DE102017011592A1 (en) * | 2017-12-14 | 2019-06-19 | Diehl Defence Gmbh & Co. Kg | Method for controlling a drone defense system |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5963653A (en) | 1997-06-19 | 1999-10-05 | Raytheon Company | Hierarchical information fusion object recognition system and method |
US5999893A (en) | 1997-05-02 | 1999-12-07 | The United States Of America As Represented By The Secretary Of The Navy | Classification system and method using combined information testing |
US7065465B2 (en) | 2002-03-26 | 2006-06-20 | Lockheed Martin Corporation | Method and system for multi-sensor data fusion |
US7151466B2 (en) | 2004-08-20 | 2006-12-19 | Gabelmann Jeffrey M | Data-fusion receiver |
US20070076917A1 (en) | 2003-03-21 | 2007-04-05 | Lockheed Martin Corporation | Target detection improvements using temporal integrations and spatial fusion |
US20080071800A1 (en) * | 2006-09-14 | 2008-03-20 | Anindya Neogi | System and Method for Representing and Using Tagged Data in a Management System |
Non-Patent Citations (8)
Title |
---|
Akaike, Hirotugu, "A New Look at the Statistical Model Identification" IEEE Transactions on Automatic Control, Dec. 1974; pp. 716-723. |
Akaike, Hirotugu, "Use of Statistical Models for Time Series Analysis" IEEE Proceedings of ICASSP, 1986; pp. 3147-3155. |
Burnham, Kenneth and David Anderson, "Multimodel Inference: Understanding AIC and BIC in Model Selection," Colorado Cooperative Fish and Wildlife Research Unit (USGS-BRD), May 1994; (56 pages total). |
Jaynes, Edwin, "The Well-Posed Problem" Foundations of Physics, 3, 1973; pp. 477-493 (11 pages total), Jun. 1, 1973. |
Mika, Rätsch, Weston, Schölkopf, and Müller, "Fisher Discriminant Analysis with Kernels," IEEE, 1999; pp. 41-48. |
Principe, Xu, Zhao and Fisher, "Learning from Examples with Information Theoretic Criteria," Computational NeuroEngineering Laboratory, University of Florida, Gainesville, FL; (20 pages), Aug. 20, 2000. |
Shore, John and Johnson, Rodney W., "Properties of Cross-Entropy Minimization" IEEE Transactions on Information Theory, vol. IT-27, No. 4, Jul. 1981; pp. 472-482. |
Written Opinion of the International Searching Authority for PCT Application No. PCT/US2008/006070 mailed Jan. 23, 2009 (four (4) pages). |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10175349B1 (en) | 2014-11-20 | 2019-01-08 | The United States Of America As Represented By The Secretary Of The Air Force | Dual RF and optical mode omnidirectional sensor |
US10371784B2 (en) | 2016-06-03 | 2019-08-06 | Raytheon Company | System and method for multi-sensor multi-target 3D fusion using an unbiased measurement space |
US10527705B2 (en) | 2016-06-03 | 2020-01-07 | Raytheon Company | System and method for multi-sensor multi-target 3D fusion using an unbiased measurement space |
Also Published As
Publication number | Publication date |
---|---|
US20100070238A1 (en) | 2010-03-18 |
WO2009023052A3 (en) | 2009-03-26 |
WO2009023052A2 (en) | 2009-02-19 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US7974814B2 (en) | Multiple sensor fusion engine | |
US8872693B1 (en) | Radar signature database validation for automatic target recognition | |
Pathiraja et al. | Multiclass confidence and localization calibration for object detection | |
Zhu et al. | An extended target tracking method with random finite set observations | |
Camps-Valls et al. | Nonlinear system identification with composite relevance vector machines | |
Dudgeon | ATR performance modeling and estimation | |
El-Fallah et al. | Unified Bayesian situation assessment sensor management | |
Bhattacharyya et al. | Evidence theoretic classification of ballistic missiles | |
Lei | Robust detection of radiation threat | |
Malmström et al. | Fusion framework and multimodality for the Laplacian approximation of Bayesian neural networks | |
Munro et al. | Neural network learning of low-probability events | |
Kaufman et al. | Score-based SAR ATR performance model with operating condition dependencies | |
Akhtar | A neural network framework for binary classification of radar detections | |
Maurer et al. | Sensor fusion architectures for ballistic missile defense | |
McCullough et al. | Intelligent fusion processing in BMD applications | |
Sherstjuk et al. | Modeling Hybrid Attacks and Operations to Assess the Threats in Early Warning Systems | |
Jung et al. | Evidence-theoretic reentry target classification using radar: A fuzzy logic approach | |
Copsey | Automatic target recognition using both measurements from identity sensors and motion information from tracking sensors | |
Blasch et al. | Feature-Aided JBPDAF group tracking and classification using an IFFN sensor | |
Kim et al. | Interactive clutter measurement density estimator for multitarget data association | |
Xiong et al. | Airborne Multi-function Radar Air-to-air Working Pattern Recognition Based on Bayes Inference and SVM | |
Farrell et al. | Modeling Kill Chains Probabilistically | |
Copsey et al. | Bayesian approach to recognising relocatable targets | |
Yang et al. | Pose-angular tracking of maneuvering targets with high range resolution (HRR) radar | |
Kim et al. | Efficient track-before-detect for maritime radar via correlation filtering |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AS | Assignment |
Owner name: RAYTHEON COMPANY, MASSACHUSETTS Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:PHELPS, ETHAN;REEL/FRAME:019490/0536 Effective date: 20070612 |
|
AS | Assignment |
Owner name: RAYTHEON COMPANY, MASSACHUSETTS Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:LANDAU, HERBERT;REEL/FRAME:019796/0269 Effective date: 20070830 |
|
STCF | Information on status: patent grant |
Free format text: PATENTED CASE |
|
FPAY | Fee payment |
Year of fee payment: 4 |
|
MAFP | Maintenance fee payment |
Free format text: PAYMENT OF MAINTENANCE FEE, 8TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1552); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY Year of fee payment: 8 |
|
FEPP | Fee payment procedure |
Free format text: MAINTENANCE FEE REMINDER MAILED (ORIGINAL EVENT CODE: REM.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
|
LAPS | Lapse for failure to pay maintenance fees |
Free format text: PATENT EXPIRED FOR FAILURE TO PAY MAINTENANCE FEES (ORIGINAL EVENT CODE: EXP.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
|
STCH | Information on status: patent discontinuation |
Free format text: PATENT EXPIRED DUE TO NONPAYMENT OF MAINTENANCE FEES UNDER 37 CFR 1.362 |
|
FP | Lapsed due to failure to pay maintenance fee |
Effective date: 20230705 |