COMPENSATION OF SONAR IMAGE DATA PRIMARILY FOR SEABED CLASSIFICATION
Field of the Invention
In one aspect, this invention relates to the enhancement of sonar image data, especially seabed echo data, stored electronically in digital format, and more particularly to a method for processing such data for the purpose of classifying seabed sediment types based on the amplitude of backscattered echoes, so as to generate stored electronic digital data representing such classification.
In another aspect, this invention relates to the enhancement of sonar image data as it is applied in the context of sonar sounding and the subsequent collection, processing, storage and display of that enhanced data.
Background of the Invention
The type of sediment on a selected portion of sonar-scanned seabed may be estimated by observing the characteristics of acoustic backscatter from sonar pulses generated by multibeam and sidescan echo sounders over a selected track above the seabed portion of interest.
Typical of the echo sounding equipment currently in use is that supplied by the Norwegian company Kongsberg Simrad. For example, the EM 1002 echo sounder records echoes over a 3.3° wide arc (measured fore to aft) and over a 150° sector (measured athwartships) at 95 kHz. For shallow water operations, a pulse of duration 0.2 ms is used and the backscattered intensity is sampled at a suitable selected frequency in the range, say, about 5-15 kHz.
Using currently available technology, the sounder is mounted on the hull of a vessel or on a towfish, travelling with a forward motion of up to 10 knots. The swath swept out by the
sounder has a typical useful size of approximately 5x the water depth. This wide swath is a considerable advantage for multibeam sounders as the resulting seabed maps provide greater detail and ensure that there are no uncharted shoals. Further, the desired seabed maps can be produced more quickly, thus reducing ship survey time.
Typically, multibeam echo sounders are connected to positioning equipment, heading and motion sensing instruments, as well as sound velocity sensors, in order to track the sounder's path and orientation over the seabed. The raw data collected are stored in digital form aboard ship, and receive subsequent processing on shore for "cleaning" (the elimination of unreasonable values) and enhancement.
The amplitude of sonar echoes backscattered from the seabed varies with the type of material present on the seabed as well as other factors such as the distance travelled by the pulse to the seabed (the "range") and the angle at which the pulse is incident at the seabed (the "angle of incidence"). By measuring the amplitude of backscattered echoes, it is possible to assess the composition of the seabed. The angle of incidence at the point from which a given detected sonar signal has been reflected is the angle made at the seabed by the arriving pulse with a vector normal to the surface of the seabed.
The attenuation of the amplitude signal due to the range of the sound waves through seawater is caused both by the spherical spreading of the wave front as it expands out from its source and by absorption of sound energy in water. The absorption of sound in water depends on the temperature as well as the salinity. A technique, known as time-varying gain ("TVG"), is widely used to compensate for the attenuation caused by different ranges through seawater. This does no more than make a rough adjustment by multiplying the amplitude of the received signal by a factor depending on the range. TVG does not always correct properly for both spreading loss and absorption of sound by water. Further, TVG does not adjust for the angle of incidence at all.
In "Areal Seabed Classification using Backscatter Angular Response at 95 kHz" [J.E. Hughes Clarke, B. W. Danforth and P. Valentine published in "High Frequency Acoustics in Shallow Water", NATO SACLANTCEN, Lerici, Italy, July 1997], the authors describe a proposal for seabed classification based on the shape of the angular response curve ("AR"). The AR curve plots the amplitude of the backscatter signal against the angle of incidence on the seabed. The shape of the AR curve is considered in three domains, the length and slope of which are suggested as being determinative of the nature of the seabed. This research offers some insight into the classification problem but does not offer a means or methodology
for determining seabed classification for selected sonar-scanned portions of the seabed, nor any proposal for generating processed data that would adequately represent seabed classification by sediment type or otherwise.
In "Seabed Classification of Multibeam Sonar Images" [J. M. Preston, A.C. Christney, S.F. Bloomer and I.L. Beaudet published in the proceedings of the IEEE Ocean's 2001 Conference, Honolulu, November 2001] the authors, who include the inventors named in this present patent application, outline the context in which the subject invention is to be described. They point out that a division of the seabed echo data and corresponding seabed sediment data into acoustic classes is useful because substrate characteristics for any given sediment class are relatively constant throughout such class and are distinct from those of other classes. This makes it possible to considerably reduce the amount of real sampling of the seabed that needs to be done in order to convert the acoustic classification into a classification by sediment type. The technique proposed by the authors relies on computing measures ("features") that are used, alone or in combination, to infer an appropriate acoustic classification from the recorded data by conventional statistical techniques. Further detail of the techniques employed is provided below. As the processing performed makes use of the entire survey data set, the classification is necessarily done after the survey is complete. In general, this technique can be applied to backscatter data from any multibeam system, provided it is operated consistently during the survey.
Preston et al. in the foregoing paper suggest the computation of more than 130 features that can be used in a principal components analysis to determine the most effective combination of features to act as the predictor of sediment type. Examples suggested are mean, standard deviation, higher-order moments, histogram, quantile, power spectra and fractal dimension. As a final step, cluster analysis is suggested to optimally assign classifications to collected data.
In "HIPS: Hydrographic Information Processing System" [M. Gourley and D. Dodd, White Paper #21 , CARIS, Fredericton, New Brunswick, 1998] the authors describe a Hydrographic Data Cleaning System ("HDCS") that provides a set of interactive software tools for detecting and cleaning up raw swath data. A number of these tools rely on having a human being available to identify and correct (or remove) bad data through personal inspection of the data.
Among the published United States patent literature, there are two patents that have some relevance to the normalization technique of the invention. US patent 5,493,619
describes a normalization technique for eliminating false detections in a mine detection process. US patent 6,052,485 describes a method for automatically identifying clutter in a sonar image. Neither of these issued patents attempts to deal with the enhancement of sonar data for sediment classification.
There are a number of United States patents which disclose inventions of apparatus or methods related to multibeam sounders. US patents US 4,024,490, US 5,177,710, US 5,579,010 and US 5,663,930 are representative and provide further details of the equipment and methods used in echo sounding.
The assessment of the characteristics of the seabed from data supplied by a multibeam echo sounder is generally described in "Seabed Classification of Multibeam Sonar Images" (Preston et al., supra). The assessment is generally performed as follows:
(1) A vessel carrying an echo sounder travels over a portion of the seabed of interest.
(2) At regular intervals along the vessel's path, a pulse of sound energy (a "ping") is transmitted from the echo sounder towards the seabed. A detector mounted on the survey vessel records energy from the ping as it is scattered back from the seabed.
Each ping from the sounder propagates out in a wavefront that advances as a hemisphere from the vessel beneath the surface of the water. The advancing wavefront continues spreading until a portion of the wavefront strikes the seabed at a point directly below the vessel. At that point, a portion of the original pulse of energy in the ping insonifies a small area which scatters some of the sound back towards the surface. As the wavefront continues to advance it energizes a circular (or possibly elliptical or annular) area of the seabed which also scatters back energy as an echo towards the surface.
Although an idealized isotropic echo sounder would transmit a pulse that spreads out over the surface of a hemisphere beneath the water, the sounder's detector receives only sound scattered back from a much narrower region, most easily thought of as having the shape of a fan. In practice, isotropic conditions are not realized; the transmitting transducer seldom has a geometry identical to the receiving transducer, and even in a given transducer, the solid angle of transmission or reception is seldom perfectly conical. Usually, the detector can detect sound energy received at an angle of approximately 150° transverse to the motion of the vessel, but at an angle of only
a few degrees (typically 1-3°) along the course of the vessel. The detector in a multibeam sounder records sufficient data to provide the amplitude of sound received at several thousand points transverse to the motion of the vessel. (The "points" for which amplitude image data are obtained are determined by the log rate of the digital data as it is sampled, determined and stored. The higher the log rate, the greater the number of points.) However, the more usual form in which the data are provided is as a time series of amplitude values for each of approximately 150 beams each of which is approximately 1 ° wide measured in a plane transverse to the motion of the vessel.
The time series of amplitude values recorded for each beam is made up as follows. The first part of the time series is background noise. This corresponds to the time taken for the wavefront to advance from the sounder to that portion of the seabed within a particular beam (the "footprint") and to return to the sounder. Then follows an echo of the ping. The starting point of the echo is at a time corresponding to the arrival of the pulse at the detector after it has been scattered back from the point closest to the detector and within the beam's footprint. The echo of the pulse continues for a period of time corresponding to the wavefront's passage over the beam's footprint after which time the echo dies out and the signal recorded returns to background noise.
For some purposes, such as bathymetry, the amplitude and range provide adequate information for mapping purposes. For example, a beam of 1 ° width measured in a plane transverse to the motion of the vessel in water of depth 200 metres has a footprint of approximately 3.5 metres directly beneath the vessel. However, for purposes where greater resolution is required, such as in sediment classification, it is necessary to use the information recorded during the detection of the echo itself to provide amplitude values at a greater resolution.
The echo recorded for each beam begins with sound that has been scattered back from the point in the beam's footprint that is closest to the sounder and then continues as the wavefront advances over the footprint. The amplitude recorded during any small period of time during the echo corresponds to a small distance across the footprint measured transverse to the direction of the vessel. It is thus possible to split apart the signal received for the entire echo into a number of consecutive sub-signals each of which represents the amplitude reflected back at an intermediate angle within the full angle of the beam.
The process of subdividing the echo conventionally takes into account the angle each beam makes with the horizontal (the "angle of depression", or more usually, "depression angle".). Beams near the nadir (that is close to vertical) have very short echoes due to their small footprints and can produce amplitude data only for a few points across that beam's footprint. On the other hand, a beam detected at a significant angle (perhaps 45°) has a much larger footprint and can be subdivided to produce amplitudes for many more points.
The resulting stream of data produced from one ping contains values of range, depression angle and amplitude as digital values for each of typically several thousand points across the swath created by the sounder. (As mentioned earlier, the number of points depends upon the log rate at which the digital data become available.) These data are stored on a computer-readable medium.
(3) The vessel moves over the survey area accumulating data from repeated pings. In addition to the values of range, depression angle and amplitude, suitable data are recorded describing the position of the vessel with respect to the Earth and describing the orientation of the vessel as it rolls and pitches during its travel. These data, in digital form, are recorded on a computer-readable medium.
(4) The data recorded on the computer-readable recording medium are processed by a computer program stored on a general purpose microcomputer after the survey is complete. The computer program performs the following steps:
(a) The data collected from the survey are converted to a standard format by a "load" program.
As the data from different equipment suppliers are available in different formats on the computer-readable media, a selection is made, using the equipment supplier and model, of a load program appropriate to perform the conversion. The resulting dataset includes records each having values for the amplitude, range and depression angle as measured from a vertical through the echo sounder in a plane transverse to the motion of the vessel. These values are recorded for each point measured across the swath of the sounder and for each ping transmitted during the survey.
Note that although this discussion has proceeded on the premise that certain of the data-processing steps mentioned are performed on a general-purpose digital computer that accepts as inputs data and instructions on suitable computer-readable media, some of the data-processing steps could alternatively be performed by a hard-wired calculator or other suitable data- processing mechanism.
(b) The recorded data may have flaws. A typical problem is that the beginning of the echo is obscured by noise created by reflections from objects such as fish or bubbles in the water.
Frequently, the suppliers of sonar equipment include circuitry (or software) to reject values assessed to be unreasonable e.g., signal values created by spurious echoes. Further there is a usually a quality control step applied to the data after they have been loaded. This data validation and quality control subprogram, using objective criteria well understood by practitioners, identifies data (associated with certain identified points) likely to be in error and marks them as "bad" so that they can be ignored in subsequent processing. Such subprograms are per se known in the technology; see for example that used in the QTC Multiview product supplied by Quester Tangent Inc. of Sidney, British Columbia, Canada and that used in the Hydrographic Information Processing System, ("HIPS"), supplied by Caris Ltd., Fredericton, New Brunswick, Canada.
(c) The amplitude image data may be refined by compensation techniques. A technique to compensate amplitude values is known in the prior art and is named time-varying gain ("TVG"). The TVG technique adjusts each amplitude value based solely on its range from the sounder. Each amplitude value is increased by a factor which increases with the range. TVG does not correct precisely for either spreading loss or the absorption of sound by water. The resulting records following compensation are stored on a computer-readable medium for further processing.
(d) The classification of the seabed according to sediments present there relies on calculating a set of features (the "feature vector") for rectangular patches of selected portions of the seabed; preferably neighbouring rectangular patches of the seabed that together form a blanket or grid for the scanned seabed area. Typically the feature vectors include such items as correlation, shade,
prominence, contrast, energy, entropy and homogeneity - all terms defined in the literature (for example, "Textual Features for Image Classification", R. M. Haralick, K. Shanmugam and I. Dinstein, IEEE Transactions on Systems, Man and Cybernetics, Volume SMC-3, (1973) pp 610-621). Features of this sort are comparative and a given feature value at any point is calculated from a number of values obtained at that point and surrounding points. Thus, each rectangular patch must be of a sufficient size so that there are sufficient points within it to make the calculation of the features statistically significant and representative of the character of the rectangular patch.
Following the calculation of the location of the rectangular patches and the feature vector for each rectangular patch, the further operation of the computer program uses a standard principal components analysis to identify at least some specified number of the most dominant components (linear combinations) from the features calculated for the rectangular patches. Typically a selection of the three most dominant components suffices for subsequent useful analysis.
(e) The concluding step of the program's operation is to perform a cluster analysis using the selected dominant components, chosen for their ability to explain the greatest amount of the variability in the data. The cluster analysis identifies a small number (usually 5 to 10) of centroids around which the records are clustered.
(6) The final result of the computer program's operation are records stored in a dataset on a computer-readable medium. The records are provided in an ordinary tabular format (".DAT") within an ASCII text file, readily usable by a variety of other off-the- shelf software applications. The results are usually mapped on paper or microfiche or viewed interactively on a display monitor.
The maps, mosaics or other displays produced show the original survey area as a quilt of rectangular patches each of which is usually coloured to show a different sediment classification. The display produced can be superimposed on a variety of mosaics, maps or other displays. Typically, the user of such data will choose to place the sediment classification mosaic over a map showing the coastline, and may also superimpose or otherwise include maps of information such as fish populations, marine habitat or dredging activity.
Summary of the Invention
In one aspect, this invention relates to the refinement or enhancement of sonar image data, especially seabed echo data, stored electronically in digital format, and more particularly to a method for processing such data for the purpose of classifying seabed sediment types based on the amplitude of backscattered echoes, so as to generate stored electronic digital data representing such classification.
In another aspect, this invention relates to the enhancement of sonar image data as it is applied in the context of sonar sounding and the subsequent collection, processing, storage and display of that enhanced data.
The foregoing aspects of the invention each have associated method, system and apparatus concepts and implementations, and the resulting stored data products and products derived therefrom (such as mosaics, maps, etc.) represent yet further implementations of the invention.
The digital data to which the technique of the present invention applies, in a seabed sonar survey context, typicially includes stored image values representing: the amplitude of analog sonar echoes detected as a result of backscattering from the seabed; the range (or the equivalent time interval) between transmission and detection of the backscattered sonar pulse; and the angle of depression of the sonar pulse transmission beam (measured from the horizontal and transverse to the direction of motion of the survey vessel).
The inventors have recognized that in order to be able to satisfactorily compare amplitude values of received signals that result from correspondingly differing lengths of paths traversed by the transmitted pulse and its echo from different footprints, it is necessary to remove the distorting effects of both range and angle of incidence at the seabed from the data. To overcome the inadequacies of the prior TVG technique, the inventors use a novel compensation technique to normalize amplitude values. In a preferred implementation of the invention, the entire set of data records from step (a) of the sequence summarized above is arranged in groups according to the values of each record's range and angle of incidence at the seabed. These groups are referred to as "populations". The values for amplitude within each population are stored and remain intact for later comparison.
In the preferred method of the invention, a compensation table is created and used, each cell of which compensation table is associated with a unique combination of range and angle-of-incidence data values. In other words, the compensation table has two dimensions one of which corresponds to range and the other to angle of incidence. Both range data and angle-of-incidence data are divided into selected partitions, each cell corresponding to a unique combination of range and angle-of-incidence partitions. So an arbitrarily selected cell in row u and column v of the compensation table corresponds to the uth partition of range values and the vth partition of angle-of-incidence values. The partitions are selected so that statistically significant compilations of range and angle-of-incidence data can be found within each partition. Each cell contains data representing the number of amplitude image values associated with range and angle-of-incidence data lying within the respective partitions. Using the compensation table, the amplitude image data values are compensated and fresh amplitude image data values are generated for the calculation of the feature vectors (step 5(d) above). These fresh amplitude values are obtained that are adusted to zero mean and unit variance. Note that not all cells will necessarily have statistically significant populations, but this fact will not invalidate the utility of the technique.
To elaborate, to implement the objective of dividing the range data and angle-of- incidence data into partitions such that the populations in each cell are statistically significant, suitable selections of each type of data must be made. Beginning with range data, a selection is made of a range size parameter for the range so that the entire set of range values can be partitioned into equal partitions each of a selected size such that the product of size times range size parameter is greater than or equal to the maximum range for the collected amplitude image data. .A preferred selection of the size parameter is made so that when the range values are classified into the partitions, each partition holds a sufficient number of range values so that subsequent features calculated from the populations contained in the partitions are statistically significant.
Similarly, a selection is made of an angle-of-incidence size parameter for the angle of incidence so that the entire set of angle-of-incidence values can be partitioned into equal partitions each of a selected size such that the product of size times angle-of-incidence size parameter is greater than or equal to 90°. And again each partition holds a sufficient number of values so that subsequent features calculated from the populations contained in the partitions are statistically significant.
In the preferred embodiment, each cell contains amplitude image summary data comprising three numeric data summary values, each of which is initially set to zero. Then
three computed values are inserted into each cell of the table, viz (i) a count of the number of amplitude image data values assigned to the cell; (ii) the sum of amplitude image data values associated with the amplitude image values assigned to the cell; and (iii) the sum of the squares of those amplitude image data values.
Once the compensation table is created in accordance with the foregoing set of directives, the data in the table may be used to adjust the amplitude image data and generate fresh refined amplitude image data for storage and subsequent use.
The data as enhanced by the novel compensation technique according to the present invention are useful for a number of purposes. For example, maps or mosaics can be made from these data without further transformation of the amplitude values.
Although the present invention is particularly useful for and is described in the context of sediment classification, the compensation technique according to the invention has wider applicability. For example, the invention can also enhance images of the seabed for the improved identification of shipwrecks or aircraft debris. In principle, the invention is available for use in connection with processing amplitude image data from any particular surface under sonar survey, the term "surface" being taken in a broad sense to include any exposed stratum or object capable of providing echo amplitude image data from the reflection or backscattering of a transmitted sonar pulse or other suitable sonar signal.
The invention is adaptable for use with multibeam echo sounders and detectors available from a number of suppliers. The detailed operation of the apparatus required to practise the inventive methodology varies from supplier to supplier. The particular description of an echo sounder and detector below is provided by way of example so as to provide a general overview.
Other features and advantages of the invention will become apparent from the detailed description and the claims that follow.
Brief description of the drawings
Figure 1 is a schematic drawing of a vessel projecting a sonar pulse onto a representative portion of the seabed. The figure includes an enlarged area showing the angle
of incidence of a selected portion of the fan. The methodology illustrated for the sonar projection and detection is previously known.
Figure 2 is a schematic flowchart of the typical steps in the processing of raw sonar data according to the invention to produce output digital data representing sediment classification.
Figure 3 shows a plot of the results of a typical cluster analysis to classify sediments.
Figure 4 shows an exploded view of layers, each layer showing the location of one type of sediment plotted against latitude and longitude.
Figure 5 shows a plot of the layers of Figure 4 superimposed to provide a visual representation of an exemplary sediment classification.
Detailed description
Figure 1 shows schematically a cross-section athwartships of a vessel 104, travelling in a direction into the plane of Figure 1 and emitting sound waves, typically as a repeated series of pings from a sounder (not specifically illustrated) located in the vicinity of the keel of the vessel 104. Each ping is made up of a short pulse of sound waves that propagate spherically out towards the seabed 108. The surrounding water level is shown as 102.
A detector (not shown), mounted on the keel of the vessel 104 and located in the vicinity of the sounder, receives sound energy scattered back from the pulse after it has struck the seabed 108. The detector is designed only to receive energy arriving from a solid angle having a fan shape indicated generally in the plane transverse to the motion of the vessel as 106. The fan 106 extends across an arc (typically greater than 150°) transverse to the motion of the vessel and is oriented downwards towards the seabed 108. The fan 106 has a uniform angular size in the direction of the motion of the vessel 104, typically 1-3°.
The detector measures the amplitude of the sound energy received simultaneously at a number of uniformly spaced angles within the fan 106, each of which is referred to as a beam. Typically there are more than one hundred beams. The paths traversed by sound waves from the sounder to the seabed and back to a sensor are indicated generally as beams
111 within the fan 106. For clarity, only a few of the beams are illustrated; 110 and 112 are representative.
Each beam 111 corresponds to a known depression angle relative to the horizontal. Assuming that the vessel 104 is moving in the direction into the plane of Figure 1 , the beam recorded with the greatest angle to starboard is shown as 110; the beam with the greatest angle to port is shown as 112.
The sounder transmits a ping lasting a suitable selected time interval, say about 0.2 to about 15 ms, and the detector (including complementary sampling and data-processing equipment) records the amplitude of backscattered data received for each beam at a sampling frequency from 300 Hz to 25 kHz. This produces a time series for each beam of digitized amplitude values, sampled from the analog amplitude signal.
To avoid interference between successive sounder pings, it is not possible to transmit a second pulse until the echo from a first pulse has returned back to the vessel 104. In water of a depth requiring a sound wave to traverse a range of 250 metres, the round-trip time is approximately one third of a second. Typically, soundings are made (that is, pings are propagated and their echoes detected) at a frequency of twice per second.
A portion 116 of the upper view of Figure 1 is shown as an enlarged detail view 118 in the lower part of the drawing to illustrate the definition of the angle of incidence φ. The seabed 108 is generally not flat and its topography can conveniently be described by a set of normal vectors each orthogonal to a small element of the seabed surface 108. The angle of incidence φ at any point of the seabed 108 is the angle in three dimensions between the incident beam 112 and the normal vector 126 at that point.
The echo sounder and other shipboard instrumentation, all conventional, provide data for several variables pertaining not only to the projected and reflected sonar signal but also to the motion of the vessel. These variables include the course, velocity, roll, pitch and heave of the vessel. This information is used to relate the survey data to a geographic position on the surface of the Earth and to adjust all angles measured relative to the axes of the vessel for the vessel's movement along its course, with due correction for wind, current and wave buffeting, relative to the Earth's surface.
The detector's analog signals are sampled and recorded as digital values. The potential quantity of data generated can be very large. For example, if sonar signals are sampled at
a frequency of 20 kHz, each beam 111 during a 1/4 second round-trip produces a time series of amplitude values containing 5,000 data points. For the entire fan 106, if data are recorded for 150 beams, the amount of data produced is approximately 750,000 data points in 1/4 second, assuming 4 bytes per data point. That amounts to a data generation rate of approximately 12 MB/sec. If the vessel is travelling at 4 metres/second then the digitized data produced for a 10 km track comprises approximately 750 million data points occupying approximately 3 GB.
In order to reduce the volume of data to be stored and processed, data that are unnecessary are removed. In particular, software is used to pick out the beginning and end of the echo and remove the sample values outside the echo. The leading values so removed are replaced with a count of the samples removed or, equivalents, the elapsed time represented by those samples.
Figure 2 shows a flowchart of the typical processing steps taken in the preparation of sediment classification data according to the invention from the raw data 252 supplied by a multibeam echo sounder. This flowchart displays the sequence of steps used in the preferred embodiment. At suitable points in the processing data is stored in files of a particular format. The names of these formats (.SONAR, .NAV, .RECT, .CTABLE, .FFV and .DAT) refer to particular formats used within the QTC Multiview product supplied by Quester Tangent Inc of Sidney, British Columbia. The particular details of these formats are not of significance in the description of the invention.
For convenience, the individual substeps of Figure 2 are grouped by function; these groups have been arbitrarily designated as functional steps (1) to (11) in the following description. The method to be described in steps (1) to (11) below includes the methodology of the invention in the context of other typical and well-known data processing steps. Steps (1) to (3), (4)(a), (4)(b), and (6) to (11) are well-known published methods in the sonar reflection data processing art and rely on standard techniques. Steps (4)(c) and (5) are unique and descriptive of the preferred embodiment of the present invention.
(1) With reference to Figure 2, the steps involving the propagation of sonar pings 244, the detection of sonar echoes 246, the sampling to convert analog signals to digital form 248 and the storage of the resulting digital data 250 have been described above with reference to Figure 1.
Raw data 252 initially obtained in digital form from an echo sounder detector are stored on a suitable storage medium, such as a hard disk, and are converted (substep 254) from one of a variety of formats established by the manufacturers of echo sounder hardware to a standard format. The steps in the process of Figure 2 beginning with substep 254 are typically performed ashore on the raw data 252 after the survey is complete. In this example, the raw survey data 252 are converted to a .SONAR file format and the additional data supplied about the vessel's position and instantaneous orientation are converted into a .NAV file format. The results are stored (substep 256) as files on a hard disk (not illustrated) or similar data storage device.
The .SONAR data includes, inter alia, the travel time from the vessel to the seabed for each detected beam, and the corresponding fore-aft and athwartship beam angles. These allow the depth to be calculated at the point where each beam is incident on the seabed. The .NAV file contains ship position and orientation data which can be combined with data from the .SONAR file to give the geographic position of each beam footprint.
As is described in the Summary of Invention section above, the time series of amplitude values recorded for the echo of each beam contains information about the scattering that has occurred at a sequence of points across that beam's footprint. This corresponds to the passage of the insonified region across the beam's footprint as the wavefront expands spherically out from the sounder.
The echo recorded for each beam begins with sound that has been scattered back from a point on the perimeter of the beam's footprint which is closest to the detector and then continues as the wavefront advances over the footprint. The amplitude recorded during any arbitrarily small period of time during the echo corresponds to a similarly small distance across the footprint measured transverse to the direction of the vessel. The amplitude values for the time series of the entire beam, which have been sampled at a high rate, are then split into consecutive sub-signals each of which is one of the amplitude values. Each resulting amplitude value corresponds to sound scattered back at a point intermediate within the full footprint of the beam. The exact details of the splitting of the beam's time series varies from sounder to sounder and from manufacturer to manufacturer and the details are beyond the scope of this description.
The process of splitting the beam's time series takes into account the depression angle. Beams near the nadir (that is close to vertical) have very short echoes due to their small footprints and can only produce amplitude values at a few points across that beam's footprint. On the other hand, a beam recorded at a significant angle (perhaps
45°) has a much larger footprint and will produce amplitude values at many more points.
The resulting data produced from one ping contain values of range, depression angle and amplitude as digital values at each of the points along the swath produced as a result of the subdivision of the time series associated with each beam. Typically, this amounts to data values at each of several thousand points across the swath created by the sounder. These data are stored on a computer-readable medium.
(2) On any particular survey line, some of the data points received can be expected to be of no utility. They may be misleading reflections from an intervening object, or spurious values caused by the measuring equipment. In the Quality Control substep 258, the range to each point is computed and values that do not appear to be reliable are marked to be ignored in the subsequent processing. There are a variety of standard techniques available to identify data that do not meet appropriate quality criteria.
The foregoing substep does not delete any data. The .SONAR file obtained by substep 256 is augmented with a "mask" that indicates which sample values are to be included and is rewritten (substep 260) as a file to the hard disk.
(3) The description of sediment types requires that the surface of the seabed be broken down conceptually and for purposes of measurement into a quilt of rectangular patches each of which will finally be assigned a sediment classification. This substep 264 draws on data from the .NAV file, known echo sounder properties and suitable parameters indicating the desired rectangle size, to determine rectangular patches in which groups of .SONAR records appear. These data are stored (substep 268) as a file on the hard drive in .RECT format.
The choice of the size of the rectangular patches is a parameter determined to satisfy objective criteria determined at each survey site. In the preferred embodiment, the following criteria have been found to be suitable:
(a) The rectangular patches should be small enough to provide a spatial resolution sufficient for the purposes of the survey;
(b) The rectangular patches should not be in numbers so great that they make the subsequent processing of the data unreasonably time-consuming;
(c) The shape of each rectangular patch should be approximately square as measured on the seabed. As the recorded resolution transverse to the motion of the vessel is usually considerably finer than that along the track of the vessel, the rectangular patches are suitably chosen to include many more points transverse to the vessel's motion than along the track of the vessel.
(d) The rectangular patches should be large enough to include a sufficient number of points to provide a statistically significant sample.
In the preferred embodiment, a suitable size for a rectangular patch is 16 pings along the track of the vessel by 128 points transverse to the motion of the vessel, containing data for 2048 points in total.
(4) To compensate for the effects of range and angle-of-incidence on the recorded amplitude values, a first pass through the survey dataset is used to construct a set of intermediate statistics that are stored in a compensation table. In a subsequent pass through the survey dataset (step (5) below) the recorded amplitude values are adjusted based on the intermediate statistics stored in the compensation table.
The process 262 of building the compensation table takes place in three sub-steps (not specifically illustrated) as follows:
(a) Knowing the depression angle (θ in Figure 1) and the range (calculated from the time elapsed between the transmission of the ping and the arrival of its echo and the speed of sound in sea water for the ambient conditions) at each point detected along the swath, the depth from the surface to that point can be calculated by trigonometry.
(b) By observing how the depth changes across the survey area, it is possible, with one of several well-known techniques, to calculate the normal vectors at points on the seabed along the swath. A review of suitable methods for this
calculation is presented in "A comparison of local surface geometry estimation methods" [Alan M. Mclvor and Robert J. Valkenburg, Machine Vision and Applications (1997) 10: 17-26]. Then, knowing in three dimensions both the depression angle from each point along the swath (the angle θ in Figure 1) and the direction of the normal vector at the point of incidence, it is straightforward to calculate the angle of incidence (the angle φ in Figure 1) as the angle between the two.
(c) If the survey has recorded data for n pings and the detector provides data at m points across the swath for each ping then values corresponding to the j'th point within the i'th ping for the amplitude of the signal, Ampl y, the range to the point of incidence on the seabed, Range », and the angle of incidence φ at the seabed, Angle Vy are available from computer storage.
The compensation table is built as a table in computer memory as follows:
(i) the maximum value of the Range Vi values is found, MaxRange;
(ii) a selection is made of a size parameter SR for the range so that the values for range can be partitioned into NR equal partitions each of size SR such that (NR x SR) is greater than or equal to MaxRange.
A preferred selection for SR is made so that when the range values Range;: are classified into the NR partitions, each partition holds a sufficient number of values so that subsequent features calculated from the populations contained in those partitions are statistically significant.
In practice, a choice of SR of between 1/2 metre (0.3 ms, in shallow water) to 5 metres (3 ms, in deep water) is generally found to be suitable.
(iii) a selection is made of a size parameter SA for the angle of incidence φ (measured in degrees) so that the values for the angle of incidence Anglβjj can be partitioned into NA equal partitions each of size SA such that (NA * SA) is greater than or equal to 90°.
A preferred selection for SA is made so that when the angle of incidence values Angle v. are classified into the NA partitions, each partition holds a sufficient number of values so that subsequent features calculated from the populations contained in those partitions are statistically significant. In practice, a choice of SA in the range about
0.5° to about 1.5° is generally found to be suitable.
(iv) A compensation table Comp is constructed with (NR x NA) cells arranged conceptually as NR rows and NA columns. The cell in row u and column v, Compuv, corresponds to the uth partition of range values and the vth partition of angle of incidence values. Each cell entry holds three numeric values, each of which is initially set to zero: N uv is a count of the number of amplitude values assigned to the Comp uv cell; Suv holds the corresponding sum of amplitude values; and SSQUV holds the corresponding sum of squares of the amplitude values.
The cells in the compensation table are filled by reviewing the entire survey dataset. Any data which has been marked as bad in the quality control step is ignored. For the j'th point within the i'th ping, the values of Range « and Angle » are examined and the corresponding partitions of range and angle of incidence in which the Range v. and Angle v. values fall (as specified by the indices u and v respectively) are identified. The three numeric values associated with the Comp uv cell are updated as follows:
N ™' uv = N ' uv + 1 '
S'uv = S uv + AmPl ij
SSQ'UV = SSQ uv + (Ampl „ x Ampl „)
(The three statements above are examples of the use of software programming conventions to describe the updating of items by an assignment statement. For example, the statement N'uv = Nuv + 1 should be read to mean "take the present value stored for N uv, add 1 to it, and store the result again for N uv". This convention is used elsewhere below.)
The results of building the compensation table are stored (substep 266) in a file on the hard drive in a .CTABLE format.
(5) Once the compensation table has been created, then using the stored amplitude values associated with the cells of the table, the amplitude data are compensated (substep 270) so as to obtain fresh adjusted amplitude values, by the following data processing steps:
First the mean amplitude and mean square amplitude are calculated for each cell in the compensation table:
M uv = S uv / N uv (the mean amplitude)
MSQ uv = SSQ uv / N uv (the mean square amplitude)
Then, the amplitude values are adjusted in a second pass through the entire survey dataset. As was described above for the construction of the compensation table, the values of Range jj and Angle V} for the j'th point within the i'th ping are examined and the corresponding partitions of range and angle of incidence in which the Range y and Angle » values fall (as specified by the indices u and v respectively) are identified. The corresponding amplitude, Ampl y, is modified as follows:
Ampl'jj = (Ampl Vj - M uv) / sqrt ( MSQ uv - ( M uv x M uv ) )
It is important to note that this transformation maps the observed amplitudes into a population with zero mean and unit variance. These values are chosen for convenience in the preferred embodiment, but the technique can equally well be applied to the data to produce populations each having the same value for the mean and variance, but for which the mean is not zero and the variance is not unity.
(6) The compensated amplitude values, which are respectively associated with the rectangular patches determined in step 3, are used to calculate feature vectors, each typically having about 100 or more selected features, that are stored on the hard drive in a file with an .FFV format 276. These feature vectors are used in subsequent principal components analysis and cluster analysis described in more detail below.
The features selected for this analysis are well-known in the art and are discussed in the following publications:
"Textual Features for Image Classification", R. M. Haralick, K. Shanmugam and I. Dinstein, IEEE Transactions on Systems, Man and Cybernetics, Volume SMC-3, (1973) pp 610-621.
"Seabed classification from sonar data: report for 1993." Milvang, O., K.W. Bjerde, R.B. Huseby and A.S. Solberg, 1994 Norsk Regnesentral/Norwegian Computing Centre, Oslo, Norway.
"Pattern Recognition for SAR Thematic Mapping: Co-ocurrence Matrices" Huber R., 1999 http://www.cosy.sbg.ac.at/~reini/diss/node90.html
"The use of image processing techniques for the automated detection of blue-green algae" Thiel S.U. 1994 http://www.cs.cf.ac.Uk/user/S.U. Thiel/thesis/thesis.html
Pace, N.G. and Gao, H., "Swathe Seabed Classification", IEEE Journal of Oceanic Engineering, Vol 13, 83-90, 1988.
"Estimating the fractal dimension from digitized images" Kraft R, Munich University of technology http://www.edv.agrar.tu-muenchen.de/ane/algorithms/algorithms.html
For example, the table below includes many of the commonly included features which are used for the principal component analysis. (The value x^ used below refers to the amplitude value appearing in column j of row i within a selected rectangular patch. Each rectangular patch is assumed to have m rows and n columns.)
It should be emphasized that the processing that takes place in subsequent steps makes use of the feature vector calculated from each rectangular patch and not the amplitude values per se.
(7) A principal components analysis 274 derives from the feature vectors a set of components each of which is a set of weighting factors, one factor for each feature appearing in the feature vector. These components are calculated so that the first component explains to the maximum extent possible the variability in the feature values. The second component is then calculated so as to explain, again to the maximum extent possible, the variability still remaining in the feature values. This process continues searching for components until a significant portion of the variability has been explained. A significant portion selected to be about 90% is commonly found to be acceptable.
The principal component analysis 274 is entirely conventional and follows the accepted methodology described in any standard statistical reference (for example: "Multivariate statistical methods: a primer" B.F. Manly, Chapman and Hall, 1994)
(8) The final computational step is a conventional cluster analysis 272.
To reduce the dimensionality of the task of searching for clusters, only the first three of the components calculated in the principal components analysis are used. These are usually referred to as Q1, Q2 and Q3 and typically explain more than 90% of the variability in the features. They can be conveniently displayed as a graph in three dimensions.
The cluster analysis determines a relatively small number (usually between 5 and 10) of clusters which are identified by the coordinates of their centroids as measured in the space spanned by {Q1, Q2, Q3}. Each sample, as represented by its Q1 , Q2 and Q3 values is assigned by the technique to a cluster using a Bayesian metric. A confidence value can be derived from the record to the closest cluster and to other clusters and by comparison to the cluster covariance in that direction.
Cluster analysis is a well-known technique and more detail of the theoretical basis for the computations described herein may be found in "Numerical Ecology", P. Legendre and L. Legendre, second English edition, Elsevier Science BV, 1998.
Figure 3 shows the typical results of the cluster analysis as a set of points arranged in three dimensions. The three coordinate axes (304, 306 and 308) of the graph of Figure 4 correspond to the values of the three most significant components calculated in the principal component analysis step (Q1 , Q2 and Q3). The feature vector associated with each rectangular patch produces one point in the graph, calculated by standard means from the principal component weights and the set of features retained for each rectangular patch. Figure 4 shows eight clusters of such points, labelled as 310 to 324.
(9) By way of example, Figure 4 shows a computer screen display 402 of the results of a sediment classification from a multibeam survey. The display is of the type produced by the QTC Multiview product supplied by Quester Tangent Corporation, Sidney, British Columbia, Canada. Each cluster identified in the cluster analysis is displayed as a separate layer 404 to 412. Each layer shows the location of points on the seabed, related to axes for latitude 414 and longitude 416, that fall into the same cluster. On a computer screen, the points in each layer have a distinct colour as an aid to viewing (displayed as grey scales in Figure 5). Of special note, in the uppermost layer 404 (corresponding to cobbles and boulders) are two approximately straight lines highlighted within ellipses 418 and 420. These correspond to the known locations of sewage outfalls along the coast of Parksville, British Columbia, Canada.
Figure 5 shows the five layers 404 to 412 of Figure 4 superimposed. This display clearly shows the tracks of the survey vessel which appear as a band structure 504 throughout the image. The two sewage outfalls are shown as 418 and 420.
The vessel's course is established during the survey by latitude (for example 414) and longitude (for example 416) coordinates obtained from a Global Positioning System.
(10) The data displayed in Figures 4 and 5 show portions of the seabed classified according to their acoustic properties. However, in order to make the seabed classification more useful as a description of conditions on the seabed, each cluster must be related to the actual conditions on the seabed. This is done by selecting suitable points on a map of the seabed representative of each cluster and returning with a survey vessel to either view actual conditions on the seabed (with a camera) or to bring sediments to the surface (with a "grab").
(11) The final result of the computer program's operation is a compilation of refined data stored in a dataset on a computer-readable medium. The data records are conveniently provided in an ordinary tabular format (".DAT") within an ASCII text file, readily usable by a variety of other off-the-shelf software applications. The QTC Multiview product is an example of such software.
Note that while the invention has been described with particular reference to a system and methodology associated with seabed sonar surveying, and that the description has been premised on a selected multibeam sonar system, those skilled in the technology will recognize that variants and modifications can readily be made to enable the principles of the invention to be applied to other surfaces and objects to be surveyed, including adaptation to other types of sonar system. For example, some sonar manufacturers provide the backscatter image in a continuous stream for each ping, not as segments within selected beams. The method described in the present specification can be readily adapted to process this type of data, with some changes in the details of the implementation of the method that will be apparent to those skilled in the technology.