+

WO2008003944A2 - Image processing and vectorisation - Google Patents

Image processing and vectorisation Download PDF

Info

Publication number
WO2008003944A2
WO2008003944A2 PCT/GB2007/002470 GB2007002470W WO2008003944A2 WO 2008003944 A2 WO2008003944 A2 WO 2008003944A2 GB 2007002470 W GB2007002470 W GB 2007002470W WO 2008003944 A2 WO2008003944 A2 WO 2008003944A2
Authority
WO
WIPO (PCT)
Prior art keywords
contour
pixel
error
pixels
contours
Prior art date
Application number
PCT/GB2007/002470
Other languages
French (fr)
Other versions
WO2008003944A3 (en
Inventor
John Patterson
Philip Wills
Original Assignee
The University Court Of The University Of Glasgow
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by The University Court Of The University Of Glasgow filed Critical The University Court Of The University Of Glasgow
Publication of WO2008003944A2 publication Critical patent/WO2008003944A2/en
Publication of WO2008003944A3 publication Critical patent/WO2008003944A3/en

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T9/00Image coding
    • G06T9/20Contour coding, e.g. using detection of edges

Definitions

  • the present invention relates to the vectorisation of images into contour maps which can be interpreted as piecewise continuous fields. It may be used for the vectorisation of photographic images, but is not limited to this.
  • the present invention also relates to a method of smoothing a contour, a method for interpolating between contours, a method of determining errors in a raster format image, a method of generating a higher resolution version of an image and apparatus for carrying out any of the above.
  • each line corresponds to a row of locations in a special two-dimensionally indexed memory called a framestore, and each line on the screen is painted out of the contents of a row of pixels.
  • the fraraestore contains an array of rows of pixels, which also line up in columns. If a column of pixels is set, a vertical line appears on the screen.
  • the term pixel ambiguously refers to the spot on the screen whose intensity is controlled by the pixel value, and by the memory location whether or not it is in the framebuffer itself.
  • the resolution of the image is expressed in terms of two numbers, the number n of pixels in a line (also referred to as a row) and the number of lines 1 (which is also referred to as the number of columns, as pixels line up in columns as described above) making up the image or ⁇ n x 1' .
  • Framebuffer resolutions are expressed similarly
  • a pixel on the screen may be considered to occupy a finite area, which is idealised to being a square of unit distance along each side although there is considerable debate about how a 'pixel' may be most suitably represented ⁇ Blinn, J. What is a pixel? IEEE Computer Graphics & applications, VoI 25 No 5 (Sept) 2005 pp 82- 87) . Essentially the most appropriate model depends on the stage at which the computation of the pixel value has reached. The idealisation above is necessary in order to calculate or represent digitised images as similarly as possible to their original forms. Screen real estate is thus first measured out in display-independent units of pixel side length, which can be corrected to the actual pixel dimensions for metric measurements over the surface of a particular screen or projection area.
  • FIG. 2a shows a vector system.
  • Fig. 2b shows respectively the vector addressing of a screen using both absolute positioning and relative positioning and the vector drawing of a triangle using both absolute positioning and relative positioning.
  • Vector images quite often contain closed shapes, which are filled with a colour or pattern. This colour or pattern can vary spatially over the filled region, as shown in Fig. 3.
  • Fig. 3a shows a filled area in a vector image.
  • Fig. 3b shows an image with a plurality of filled areas making up an image, which has been reconstructed from the data depicted in Fig. 3c.
  • rendering can be used to reconstruct images from stored image data.
  • Techniques known as rendering have been available for many years. These involve the creation of completed (or composite) images, usually on a line-by- line ( ⁇ scan-conversion' ) basis. This involves calculating on a pixel-by-pixel basis a composite line, and then repeating this for each line in the final output image.
  • the output of a rendering process is a raster image. It is well known that such rendering processes can turn a vector image into a raster image. This is because the content of any one given pixel can be determined from the vector information.
  • Vector images are usually drawn by the operator using special-purpose software and manual input devices such as a graphics tablet.
  • photographic or real scene images are always in raster form. This is because the known forms of capturing such images are themselves raster based.
  • storing images in vector form has several advantages.
  • vectors images are generally resolution independent as it is possible to render the final output image at any desired resolution, both spatially and in colour depth, subject to the accuracy of the vector points.
  • Most particularly vector images can be manipulated (warped etc.) without incurring image- degrading effects as can happen with raster images. This is because the vector form is warped, then rendered as usual in a form suitable for local display.
  • the techniques available for the conversion of photographic or real scene information into vector format are subject to corruption by noise which is always present in any captured raster image and this causes difficulties, some insuperable by present means, at all stages of the handling and rendering of the format.
  • An object of the present invention is to provide a method for representing images that are generally captured in a raster form in a vector format which preferably compensates for noise corruption and a method of rendering them out conveniently.
  • Another object of the present invention is to provide a method of determining the errors in a raster format image immediately on input. Once determined the error terms may be used throughout the encoding process to optimise the format prior to manipulation and rendering. In its various aspects the present invention may achieve one or more of the above objectives.
  • a first aspect of the present invention provides a method of converting raster format ' image data to vector format image data comprising the step of generating contour data for a plurality of isochromic contours, whereby each isochromic contour is determined by applying a curve fitting process to a plurality of possible solution points and spatial error bounds for said solution points, wherein said solution points and error bounds are based on said raster format image data, said solution points being deemed to have colour values equal to the colour value of the contour, and the contour data is sufficient to allow reconstruction of said contour using the curve fitting process.
  • the error bounds around solution points for a given colour value are preferably used to determine a finite area which is the strongest constraint on the shape of a contour of that colour value, i.e. the 'contour bounds'. Any contour which fits in that region may be considered a valid contour within the accuracy to which the original image was captured.
  • the step of generating contour data comprises a step of assigning coordinate values to pixels in the raster format image data and interpolating between pixels to find the coordinates of points in the path of a contour having a given colour value. This enables the contour bounds to be found to sub pixel accuracy from the initial solution points .
  • each pixel can be assigned x and y coordinates, which may conveniently be based on the coordinates of the centre of each pixel.
  • Each pixel then has at least a pair of coordinates and a pixel colour value. If two adjacent pixels have colour values of 40 and 50 respectively (in nominal units) , then a contour for colour value 45 will lie between these two pixel centres. The coordinates in pixel space of a point on this contour can be found by interpolation. This enables the contours and/or the contour bounds to be found to sub pixel accuracy.
  • the step of defining said contour (s) takes into account error estimates in the raster format image data.
  • the preceding description could be applied with a nominal error value of zero everywhere in which case the error bounds would define the contour trajectory exactly.
  • the error estimates may be based on differences between the values (e.g. colour values) of adjacent pixels. This enables the contours to be defined more smoothly.
  • the error estimates are preferably based on a fourth order difference between the pixels (as the fourth difference should be zero if there is no noise, or a multiple of the noise otherwise) .
  • the error estimate is based on a multiple of the fourth difference divided by V2 or V70.
  • the exact divisor for the fourth difference in each embodiment will be mediated by a parameter so there are a multiplicity of possible choices for this value, depending on the default accuracy of reconstruction desired.
  • the method may further include the generation of radial line data for one or more radial lines running between two of said isochromic contours, wherein each radial line is a path between said contours which follows the direction of the local gradient.
  • radial lines may be determined by applying a curve fitting process to a plurality of possible solution points and spatial error bounds for said solution points, wherein said solution points and error bounds are based on said raster format image data, said solution points have colour values equal to the colour value of the radial line in the vicinity of the solution point, and the radial line data is sufficient to allow reconstruction of said radial line using the curve fitting process.
  • said spatial error bounds are based on local error estimates for pixel colour values in the raster format data.
  • local error estimates we mean, for example, estimates of the error in pixel values of the raster data in the region of the respective solution point.
  • a global error estimate such as PSNR, MSD or MAD could be used.
  • the local error estimates may be based on fourth order differences between the pixel colour values in the raster format data.
  • the error estimates may be calculated directly from the fourth order differences.
  • the error estimates may be calculated indirectly from the fourth order differences by finding regions where a direct solution for the individual error terms is possible, determining that solution, then using it to unravel other error terms from adjacent fourth differences.
  • this alternative approach is of most use where there are five adjacent pixel values each of which define a linear interpolation step as sufficient for the local error terms.
  • the error estimates in the colour value for each pixel may be calculated by using a convolution like weighted sum of adjacent fourth order differences.
  • the error estimates may be calculated by a combination of any two or more of the above alternatives, the combination being chosen so as to achieve the most accurate result depending on context.
  • At least first order differences between the colour values of adjacent pixels are calculated and error estimates for these differences are generated; and said error estimates for the differences are used to simplify equations for finding the solution points for the isochroraic contours.
  • the error estimate for a difference causes the range of possible values for that difference to overlap zero
  • the term in the equation corresponding to said difference is assumed to be zero. In this way it ,is possible to reduce the order of equation needed to solve the contour.
  • Options include degree one (linear interpolation, second and higher differences are zero within error bounds) degree two (quadratic interpolation, third differences are zero within error bounds) and degree three (cubic interpolation) .
  • a second aspect of the present invention provides a method of converting raster format image data to vector format image data comprising the steps of determining a minimum set of contours to be generated by using diffusion equations to define a plurality of watersheds and generating contour data defining a contour around each watershed.
  • a “watershed” is a region in which, if you select a point in the region, and follow the line of steepest ascent or descent (depending on whether the region is a "hill” (a region in which the area enclosed by the watershed is of greater value, or “higher”, than the value at the watershed) or a "hole” (a region in which the area enclosed by the watershed is of lower value, or “lower”, than the value at the watershed) ) , you will reach the same extreme point.
  • the contour data is generated using a method of the first aspect above, including any combination of the optional or preferred features of that aspect.
  • the method derives a contour set at fixed intervals or according to the interpretation of nonreconstructive instrumentation (such as a pixel value histogram or an extrema histogram) to provide a fixed set of contour values.
  • nonreconstructive instrumentation such as a pixel value histogram or an extrema histogram
  • This method may further partition the image in terms of extrema clustered by overlapping watershed intervals.
  • a third aspect of the present invention provides a method of converting raster format image data into vector format image data according to either of the first or second aspects above, including any combination of the optional or preferred features of those aspects.
  • an error estimate is generated for each of a plurality of pixels in the raster format data
  • the method further comprises a test step in which isochromic contours are generated from the contour data, a colour value for each of said pixels is calculated on the basis of the generated isochromic contours and, if the calculated colour value is outside the error bounds for that pixel, a new isochromic contour is generated for the colour value of that pixel.
  • said pixel colour values are for pixels inside said contours and in the test step said pixel colour values are calculated on the basis of said isochromic contours by using a diffusion equation.
  • said pixel colour values are preferably calculated on the basis of said isochromic contours and one or more radial lines.
  • the contours are preferably generated from the contour data by applying a curve forming process to said contour data .
  • test step is repeated until the colour value for each of said pixels, calculated in said test step, is inside the error bounds for each said pixel.
  • the image has been placed in vector format to within the errors associated with the original raster format image data.
  • the test step may be a recursive process which, at any given repetition, starts with a single closed contour for a given level value.
  • This is provides a "footprint” consisting of those pixels whose original raster format colour values are greater than or equal to the level value (assuming a "hill” arrangement, but the process is equally applicable to "hole” arrangements) .
  • the contour completely contains other closed contours of the same level value, these correspond to "holes" in the footprint and the interior of such holes do not count as part of the footprint.
  • An assumption may then be made about how the pixel colour values vary within the footprint.
  • the pixel colour values for pixels within the footprint may then be determined based on this assumption (which is preferably the same assumption as will be used when rendering or decoding the vector image data) .
  • a comparison can then be made between the determined pixel colour values within the footprint and the original, raster format, colour values for the same pixels. If all of the determined pixel colour values fall within the error bounds of the colour values from the original, raster format, data, then the contour (and the assumption) adequately describe the footprint.
  • the test step can be repeated until all the determined pixel values fall within the error bounds of the colour values from the original data.
  • the assumption may be that pixel levels within the footprint are all the same. This will still result in correct encoding of the raster image data into vector image data, and is relatively simple to perform (for each footprint no calculation of pixel colour values is required: the level value of the contour is simply compared to the original data and the error bounds) .
  • the number of contours required is likely to be large compared to encodings using other assumptions, and the number of repetitions of the test step required to produce the full contour set is also likely to be higher than for other methods .
  • the assumption is as follows.
  • the level value of this second contour is referred to as a target level.
  • the pixel values are then determined by interpolation between the two level values based on the results of a diffusion process which indexes the footprint.
  • the interpolated pixel values that fall outside the error bounds of the pixel values in the original data are identified and an estimate is made of a new target level value which will ensure that a repetition of the interpolation process will result in determined pixel values falling wholly within the error bounds.
  • the result will be a third contour having a compatible target level, i.e. a level which is such that the footprint resulting from subtracting the footprint of the third contour from the footprint of the first contour only contains pixels which, after the assumption is made about how pixel values vary within a footprint, fall within the error bounds of the corresponding pixels in the original, raster data.
  • the outcome is thus a set of contours (comprising the first and third contours above) to each of which the test process is repeated until the original footprint of the first contour is completely accounted for.
  • the vector image data can be built up to include the minimal number of contours needed to describe the image data to within the error boundaries of the original, raster data.
  • Such a process of building up a set of minimal contours is computationally more efficient than providing a set containing more than the optimal number of contours and attempting to determine which contours should be removed.
  • the present invention also includes methods in which redundant contours are removed from a contour set.
  • a fourth aspect of the present invention provides a method of rendering an image from vector format image data comprising the step of determining the pixel colour values between isochromic contours in the image data by using diffusion equations.
  • the pixel colour values may be determined by using diffusion equations in the region bounded by said radial lines and said isochromic contours (which is topologically square) .
  • the diffusion equations used to render said image are preferably identical to the diffusion equations used in said test step.
  • a fifth aspect of the present invention provides a method of determining the errors in pixel colour values in an image comprising the step of calculating the differences in values between adjacent pixels, estimating the background noise based on fourth order differences between said pixel colour values and assigning a corresponding error value to each pixel.
  • the error estimates may be calculated directly from the fourth order differences.
  • the error estimates may be calculated indirectly from the fourth order differences by finding regions where a direct solution for the individual error terms is possible, determining that solution, then using it to unravel other error terms from adjacent fourth order differences.
  • the error estimates in the colour value for each pixel may be calculated by using a convolution like weighted sum of adjacent fourth order differences.
  • the method may further include the step of determining the error estimates of pixels near the border of the image using a border-enhancement technique.
  • the border-enhancement technique preferably includes the steps of generating colour values for virtual pixels lying outside the edge of the image, and using the colour values of said virtual pixels to determine said fourth order differences for the pixels near the edge of the image .
  • the step of determining the colour values of said virtual pixels includes matching a first group of pixels near the edge of the image with a second group of pixels elsewhere in the image having similar colour values to said first group of pixels, and generating colour values for said virtual pixels based on colour values for pixels which have the same positional relationship to said second group of pixels as said virtual pixels have to said first group of pixels.
  • said second group of pixels has colour values which are within error bounds of the colour values of the first group of pixels (or vice-versa) , and the edge- enhancement technique assigns a confidence factor for each virtual pixel based on the error bounds and the similarity between the colour values.
  • the step of determining colour values of said virtual pixels uses the fourth order differences of the pixels near the edge of the image to determine colour values of said virtual pixels from the colour values of said pixels near the edge of the image .
  • a sixth aspect of the present invention provides for use of the errors in pixel colour values determined by a method according to the fifth aspect above, including any combination of the optional or preferred features of that aspect in a method according to any one of the first, second or third aspects above, including any combination of the optional or preferred features of those aspects, either directly, or in the calculation of spatial error bounds .
  • the calculation of the errors may form part of a preliminary set of calculations performed before the contours (and optional radial lines) of the vector format image data are determined. Data from these calculations may then be used in the methods of the preceding aspects.
  • the methods of the above aspects may calculate error bounds for every pixel in the raster format image data, find extrema (maxima, minima, ridges, troughs, plateaus, etc.) in the raster format image data, and optionally determine watersheds or approximations to those watersheds.
  • a seventh aspect of the present invention provides a computer program comprising instructions for carrying out the method of any one of the above aspects, including any combination of the optional or preferred features of those aspects.
  • a further aspect of the present invention provides a computer readable medium containing a computer program according to the seventh aspect above.
  • a further aspect of the present invention provides a computer programmed according to the above seventh aspect of the present invention.
  • a further aspect of the present invention provides an apparatus having means configured for carrying out the method of any one of the above aspects, including any combination of the optional or preferred features of those aspects.
  • the apparatus may be comprise respective components for carrying out each step of the method.
  • a further aspect of the present invention is an image produced by any one of the above aspects.
  • an advantage of the present invention is that it can also be applied to colour or greyscale images, e.g. photographs.
  • Each contour is a contour representing a colour value and the contours are isochromic contours or nominal lines through the image on which all points have the same colour value.
  • a colour is represented as a mix of three colour values (commonly known as a tristimulus system), e.g. Red (R), Green (G) and Blue (B).
  • RGB Red
  • G Green
  • B Blue
  • Each component is quantised to a given range to values.
  • colour and “colour value” are used to refer both to the combined colour or colour value of the three components or to the value of any one of the individual components.
  • the method will be carried out separately for each component (e.g. red, green and blue) so that vectors are generated in each colour space.
  • the colour components are then recombined when the image is generated from the vector data for display on a computer screen or the like.
  • the terms "colour” and “colour value” also include shades of grey (grey level components) . While a grey scale image is often translated from RGB data, each grey scale pixel could be represented by one component giving the grey scale value (or shade) and the present invention can be applied to such a system too. In general it is applicable to any system, tristimulus, single component or otherwise which assigns colours a numerical value, including colour spaces such as L*a*b or subtractive colour systems such as CMYK.
  • Fig. Ia shows a 16 line raster and has already been described
  • Fig. Ib shows a raster display of a triangle on a CRT and has already been described
  • Fig. 2a shows various methods of vector representation and has already been described
  • Fig. 2b shows vector drawing of a triangle by a CRT and has already been described
  • Fig. 3a shows a filled area in a vector image and has already been described
  • Fig. 3b shows an image generated by filled areas defined by vectors and has already been described
  • Fig. 3c shows the data used to construct the image of 3b and has already been described
  • Fig. 4 is a diagrammatic representation of a basic principle behind contour construction
  • Fig. 5a shows how the relationship between the pixel value and the contour level may be used to trace a contour between sample points using the method of marching squares
  • Fig. 5b shows how the two ambiguous cases may be resolved in the marching squares algorithm
  • Fig. 6 is a diagram showing first, second, third and fourth order differences in a row of pixels.
  • pixels in real images are contaminated by noise.
  • pixel values can be represented as a pair ⁇ - ⁇ , ⁇ + ⁇ ) which defines the range of values that a pixel could have.
  • Deriving a value for ⁇ for each point is thus useful in analysing the image.
  • Such a value could be derived globally (e.g. using the global signal- to-noise ratio) or preferably locally, for example using the fourth-differences as set out below.
  • the received isosurface that is the original pixel values and an upper and lower isosurface respectively corresponding to the maximum possible value of each pixel and the minimum possible value of each pixel.
  • the "true" isosurface is ⁇ (x, y) .
  • One method of representing an isosurface is to model the isosurface in terms of isochromic contours, sometimes known as isophotes, which has been used in image analysis and digital mapping since at least 1967 (see Warntz W., Woldenberg M. , "Concepts and Applications - Spatial order", Harvard papers in Theoretical Geography No .1 , May 1967) .
  • This form of equation is the starting point of a Level Set formulation for the calculation of successive contours.
  • the solution to the curve equation will often result in more than one closed path, so we will refer to a complete solution of the equation over a region as a "level set” and each closed path within that solution as a "contour” or "isochromic contour".
  • One possible way of representing an image in vector format is therefore to represent the isosurface as a number of level sets for a specified set of values of p. Since each level set itself consists of one or more contours, we can represent the contours in any standard curve formulation, e.g. Bezier chains.
  • the raster format image data is processed to define a plurality of contours representing the image.
  • Fig.4 illustrates the basic idea behind the method for constructing a contour in the absence of a pre-existing model for a given brightness or colour value, say, a value of 80.
  • the top image shows how pixel values are usually represented by outlining their borders (left) , and for this diagram in terms of their values at their centres (right) .
  • the large image uses this second representation.
  • the large image shows how a contour of value 80 might be found by interpolating between the given pixel values. This method finds a single contour path rather than a ribbon in which the contour might lie.
  • Fig. 4 shows a representation of a 3 by 3 block of pixels, having values 71, 66, 60, 75, 73, 68, 79, 75 and 70.
  • contours are calculated during encoding (which will be discussed in more detail below) .
  • the first is when a pixel value is being re-estimated from the available contour data and is found to lie outside the error bounds for the original pixel. In this case the contour for the original pixel value is traced out by itself and there is no process of searching for a point to start the tracing from.
  • the second situation is where an entire level set is being calculated for a given value of p and here the initial step involves finding all pixels for which p ⁇ (x,y ⁇ ,
  • Both the above processes preferably invoke a highly localised search process to find the next intersection points or points following a solution for a previous point.
  • This process is sometime referred to as a "marching squares" algorithm, but the search process will be modified in a scan-conversion context to discover all solutions along a given scan-line before proceeding to the next one.
  • the process is illustrated with reference to Figs. 5a and 5b. In these Figures, the actual pixel values are irrelevant, so the outcome of the test p ⁇ ⁇ (x,y) is marked for each pixel as " ⁇ " (true) or ">" (false) .
  • Fig. 5b shows (as (i) and (iv) ) the two ambiguous cases in this method.
  • (ii) and (iii) show the two possible alternatives for case (i)
  • (v) and (vi) show the two possible alternatives for case (iv) .
  • the resolution of the ambiguities can be achieved by determining the isosurface value at the centre point (marked in this Figure by a hollow circle) of the square.
  • Outcomes (ii) and (vi) apply if p> ⁇ (x,y) at the centre point, whilst outcomes (iii) and (v) apply if p ⁇ (x,y) at the centre point.
  • pixel maps such as a 2x2 or 4x4 array for each pixel, each entry of which is a sample interpolated from the received pixel array. These pixel maps also allow us to model edges.
  • An optional contour removal process can be carried out on the complete set, by removing single contours and testing to see if the image is still adequately represented after the contour is removed (this contour set can now be stored as the vector format image data, although both radial lines and zero-crossing lines are also legal components of the complete data structure) ; and 12. Encode the optimal contour set to store the image.
  • blind contourisation e.g. by setting contours at a fixed interval, produces a contour set which can be treated just like an optimal contour set by all subsequent processes.
  • the methods of the present invention are performed in the CIE L*a*b* colour space because Euclidian distances in this space have been shown to correspond closely to perceptual dissimilarity.
  • the blue channel is often considerably noisier than the others, so when RGB images are vectorised directly there is a risk of loss of detail from this channel if the noise settings are the same as for the other channels and no real justification for varying them.
  • the colour components (a*b*) contain considerably less spatial detail so can be safely encoded with higher K values than the L* channel.
  • conversion between different colour systems it is preferably done as the first encoding step and the last decoding step, and so the methods presented below can be used with or without such conversions, depending on the colour space in which the raster format image data is originally received and desired to be put out.
  • Raster format image data is obtained, e.g. from a digital photograph.
  • a noise estimate is derived for each pixel in the image as follows. Two sets of difference tables are constructed, involving first, second, third and fourth differences between pixel values in the x (or line or row) direction and in the y (or column) direction.
  • the inventors have noticed that the cubic terms in fitting patches to isochromic surfaces (the 3D model for an isochromic contour map) essentially model noise so from this it would follow that, in the absence of noise, the third differences would be locally constant and the fourth differences zero. In practice this is not the case, so the fourth differences will be some multiple of the local noise in the image.
  • the first differences shown in the table can be referred to as first order differences, the second differences as second order differences, the third differences as third order differences and the fourth differences as fourth order differences.
  • the order of each difference is indicated by the numeral in superscript after the ⁇ symbol .
  • ⁇ 4 ⁇ 4 estimate in P 1 is ⁇ — 7 ⁇ or ⁇ —F ⁇ in the x-
  • 8V2 V70 direction (whichever version is used needs to be used consistently) .
  • This is a one-dimensional estimate (in the x-direction) .
  • a similar calculation is done in the y-direction to determine errors in that direction or dimension.
  • These error estimates may not necessarily be the same, even though referring to the same pixel value, but in practice errors are always taken into account in one dimension only, i.e. along either the x or y directions. Errors in the differences are accordingly calculated from the appropriate one-dimensional error estimate, noting that the error in the first difference will be the sums of the errors in the participating pixels (and typically around twice the error in the pixel value itself) , similarly the error in the second difference will be around four times that of the original participating pixels etc..
  • error estimates may be normalised to more conventional image noise statistics, where these are available, by calculating the statistical means and standard deviations for all of these estimates in the image as a whole and using the error estimate above to index the distribution function derived from such statistics.
  • Our error estimates thus form a measure of deviation from pixel correlations in the appropriate direction.
  • Such normalisation may be carried out separately in x and y or for both directions using the same values, as appropriate. This normalisation is most conveniently done after all difference calculations are completed, i.e. when the pixel colour value error is converted into a spatial error (see later) .
  • Image sampling for resizing is a well-studied process and it is recognised that correlations of degree one (i.e. linear interpolation) , give the most accurate samples overall, but when the assumption breaks down, gives locally worse solutions that higher-order interpolation methods like cubic interpolation. This tends to make linearly interpolated images look less visually pleasing than those obtained with cubic interpolation.
  • degree one i.e. linear interpolation
  • interpolation is carried out first in one direction (usually x) , the (y-) differences for the new interpolants calculated and then the interpolants determined for the other direction (y) using the appropriate form of this formula. It is this final step which yields the central interpolant which is needed to resolve the ambiguous cases shown in Fig. 5b.
  • each of the higher-order forms may result in more than one solution but geometric considerations preclude more than one quadratic solution in the interval and, while there may be more than one solution in the cubic case, there will be either one or three solutions in the interval.
  • the assumption of variation diminishing and the subsequent use of error terms means that we can determine the error in the solution as the widest errors for the more distant solutions. As it is these error values we want to recover, the solutions themselves are just means to this end.
  • the first alternative is to identify the errors for P ⁇ - ⁇ r Pi-i etc. as ⁇ - 2 ⁇ ⁇ i-i etc. and follow their fate through the calculated difference stack.
  • the second alternative is to indicate that we cannot find out enough about ⁇ i- ⁇ , ⁇ -i etc. to determine such things as their signs and that we can only arrive at a bounded, unsigned estimate of their values, i.e. perform a classical error analysis to derive £ > ,_ 2
  • the result of having a single estimate of the error value is that the errors double in magnitude with each degree of difference.
  • the usual model for errors is a Gaussian distribution about zero so that the magnitude of ⁇ would typically be set at about three standard deviations from the centre of this distribution (the more normal choice of two standard deviations, which results in approximately 95% of terms lying within the estimation is insufficient for the present purposes) .
  • N is a normalization constant and the c ⁇ are the weights for the differences.
  • N - E , ⁇ ,-s ⁇ ⁇ ,-A ⁇ ⁇ ,-3 + 2 - ⁇ , ⁇ ⁇ , + 3 ⁇ ⁇ , + 4 + ⁇ , + 5 ( Formula 1 ) .
  • E ⁇ fj Whilst E ⁇ fj will still be contaminated by nearby error terms, it will give an error estimate that is a better indicator of the true value as it is less affected by the surrounding terms. For example, Ei rj is substantially decoupled from E 1+1 ⁇ as it only shares 4 out of a possible 21 error terms (13 with non-zero coefficients).
  • E ⁇ ,j is the unsigned and normalised form of E irj and K 1 and K 2 are constants.
  • K 2 in particular is calculated by averaging the error estimates for the image as a whole, as this value should be zero, so it is a normalisation constant aimed at reinforcing the assumption that errors are distributed about zero.
  • This error term is then scaled according to the scalefactors in the "sum" column of Table 3 to provide error bounds for each of the difference values available for a particular pixel. If the error bound includes zero, then the difference term can be taken as zero and the intersection calculation simplified accordingly. Increasing K 1 therefore increases the extent to which linear interpolation is used and the width of the band in which the contour is deemed to lie.
  • ⁇ * A 4 P 1+2 + ⁇ ⁇ + ⁇ - 4. ⁇ l+i + 6. ⁇ l+2 -A. ⁇ t+ ⁇ + ⁇ , , where the ⁇ V term represents the fourth difference in the true pixel values (so far assumed to be zero; this value will be set to zero once the equations have been derived) .
  • the basic 5-pixel layout of Fig. 6 can now be extended in both directions around P ⁇ .
  • the layout can be extended to the left in the figure layout by using forward differencing after setting all differences higher than d to zero and then applying the formula:
  • each pixel value ⁇ (j t f) will have had a degree d associated with it from previous error analysis. This is the maximum degree necessary for extrapolation away from that value. So in the formula above only the terms involving differences of degree d or less need to be included, as the higher differences have been deemed to be zero. Once ⁇ (/-l,y) has been calculated it may inherit the interpolation degree of although any other possible means of determining this value are also possible.
  • the layout can similarly be extended to the right using the same approach to establishing differences, although backward differences are now used, and the error bounds to the left are intersected for adjacent ⁇ rf s under the same conditions.
  • This process breaks down when an edge is encountered, which may be signalled by a zero crossing of the second difference.
  • the final step in the intersection calculation involves finding spatial error bounds around the intersection solution.
  • the method described above for finding the intersection solution involved solving for x or y for the received isosurface and then establishing where that solution lay on the upper and lower bound isosurfaces.
  • the central difference interpolation formula is: and here the highest difference not deemed to be zero is taken as the mid-point of the intersection of the error bounds for the adjacent entries. Similar formulae with similar considerations apply to the forward and backward difference cases near the edges, although care needs to be taken with the received interpolation value k. Because the first non-central differences are offset by H, the value of k has to be adjusted by ⁇ % in the appropriate direction prior to insertion into the formula .
  • a plateau is in a sense an extended extreme point, as are ridges and troughs. In fact, such extended extreme points tend to be rare (candidate plateaux are rarely completely flat and ridges and troughs will contain further extrema) , but they should be catered for.
  • a plateau in particular will defeat the marching squares process described above, but tracing a contour slightly offset from the plateau level, thereby establishing its bounds and extent will not. Under most circumstances extrema will therefore just be points and they will have regions of influence normally referred to as watersheds . These watersheds can be defined as regions in which energy minimised paths, e.g.
  • One simple technique for finding critical points is to look along a scan line (i.e. proceeding along the x-direction) , noting the signs of the first differences between neighbouring pixels. When the sign changes, the last pixel with the previous sign is a candidate to be a local maximum or minimum. However this may be a false critical point induced by (say) shot noise, and in any case needs to be verified by a similar analysis in the y-direction.
  • the efficiency of encoding processes can be degraded by noise, particularly noise spikes masquerading as extrema. While these can be safely ignored as, for example, the contour finding process will use the enhanced error value to widen the band in which the contour lies, so increasing the likelihood that the contour will pass smoothly through the spike, they still have to be identified in order to be excluded from the extrema list.
  • the most straightforward and common case is that for a spike in a local field of linear interpolations (second differences and upwards deemed to be zero) .
  • the error bounds for the spike will normally be large enough not to trigger higher degree interpolation but even if it is a 1-pixel "hole" in the linear field, this is very noticeable.
  • the condition here for discarding a candidate extremum is that the three adjacent (central) first differences in both x and y have error bounds which overlap a single value which in turn is the average of their three values.
  • the reason for choosing their average can be seen from Table 3 above because the error in the central term is excluded from the sum.
  • the standard test for an extremum is a so-called zero crossing (a term normally used in connection with second derivatives where it signals an edge) in the first derivative, here simulated by differences. This is where the first derivative (difference) changes sign and, assuming the modelled derivative is continuous, the zero point should lie between the values of opposite sign. If zero also lies within the common error bound then these first differences could all be zero and adjacent values need to be checked to establish a first degree zero crossing unambiguously. In the event of a conflict the discard condition supersedes the retention condition unless all three differences could be zeros in which case a sign change indicated by adjacent non-zero first differences determines the outcome.
  • That initial boundary is defined by an ellipse the sizes of whose axes are determined by the errors in x and y, namely ⁇ x, ⁇ y , for the extremum.
  • the implementation of this process as a set of advancing markers (which may be given a particular colour to identify the tile to which they belong) over the pixel field is straightforward. The outcome of this process will be to identify all the pixels on the watershed borders. By starting with a constant initial velocity a Voronoi tessellation (see below) expressed in terms of its borders will result.
  • the diffusion process can preferably be modified by defining v to be:
  • This process identifies a set of border pixels for each watershed.
  • Each tile will then contain a pixel value nearest to the extremum value, and a pixel value further from the extremum value.
  • the nearest value defines a contour that lies entirely inside the tile, and surrounds the extremum.
  • the furthest value will contain a contour that will normally lie entirely outside the tile (except at the point (s) of intersection) and thus contain the whole of the tile as well as parts of neighbouring tiles.
  • the tiles can then be sorted in terms of their furthest contour values and each of these provides a seed for tracing a contour. Along with starting contours for extrema, these provide an initial set of levels for contour finding.
  • Voronoi tessellation e.g. as described in O'Rourke J., "Computational Geometry in C 2 nd Ed.”, CUP 1998.
  • Voronoi tessellation e.g. as described in O'Rourke J., "Computational Geometry in C 2 nd Ed.”, CUP 1998.
  • Voronoi tile will contain all those pixels at least as close to the defining seed point as to any other seed point. Pixels which are equidistant to at least two seed points lie on the tile border and it is this tile border that forms the estimated watershed. Efficient tessellation processes will typically first form the dual of a Voronoi tessellation, which is a Delaunay triangulation joining the nearest seed points to each other and then forming the Voronoi tiles in terms of straight lines joining the mid-points of the vertices of the triangulation.
  • Voronoi tessellation is a Delaunay triangulation joining the nearest seed points to each other and then forming the Voronoi tiles in terms of straight lines joining the mid-points of the vertices of the triangulation.
  • Voronoi tiling is quicker than the diffusion method described above, but the diffusion method has the advantages of being able to handle extended extrema and to produce a result that is better related to the true watershed than the Voronoi tiles.
  • the optimal contour set does adequately represent the image being vectorised, then the set is complete and can be stored as the vector format image data after removal of redundant contours.
  • steps are the recursive methods of finding the optimal contour set, involving starting from a single contour (or a number of seed contours), testing the accuracy of the representation produced by those contours against the original data, and adding further contours if required to match the original data.
  • a contour near to each extremum - this may have already been done in step 6. If the extremum is a point then the contour will describe a small circular region around it. If the extremum is an extended region then the contour will be an extent of the region. The level value will be close to the extremum level value in each case and the contour should be chosen such that linear diffusion within this contour will result in pixel colour values falling within the error bounds of the pixel colour values of the original raster format image data for all pixels inside the contour. These contours replace the extrema in what follows and any small gaps in the overall image can be filled in by diffusion at the end if required.
  • step b) Find for each watershed a bound pair which comprises the level found in step a) and the most distant level associated with the watershed boundary.
  • step d) For each contour in the level set defined by step d) (or step f) below) , identify all watersheds whose footprints have a non-zero intersection with the contour footprint. If a watershed is not wholly contained by the contour it is partitioned by the contour into segments. Find the most distant point on each watershed segment border from its extremum and set the watershed segment bound pair appropriately. The contour now has its own watershed subset associated with it.
  • step f) (A modification of step d) to deal with the partition defined by the contour in e) above) Find another level inside the contour defined by step e) with the following characteristics i) it is either below or intersects all the watershed bound pairs associated with the contour, or ii) it is the highest level for which characteristic i) is true. Find the level set as defined within the contour for this level.
  • step f) Repeat e) and f) until the contour found by step f) is the contour around the extreme point.
  • the vicinity of the extreme point is filled as a special case, diffuse to extremum itself then fill all pixels in extremum footprint with extremum value.
  • steps d) to g) above assume that the entire insufficient contour set is first mapped out, then diffusion is done. However, diffusion on steps d) and f) can be carried out as soon as two levels to diffuse between have been identified. If at any point the diffusion step generates a pixel value which lies outside the error bounds of the original pixel then a new level set for that value is generated and all contours which lie within that level set may be discarded (i.e. all of the level set providing the diffusion target) . There is no need to re-diffuse between the new levels as this will be done correctly on decoding and the next step can be started immediately.
  • step d) leads to multiple contours for each of which steps e) and f) are done and step f) leads to multiple contours for which the process is repeated until the level selected corresponds to an extremum.
  • the method is a recursive process which at any given repetition starts with a single closed contour for a given level number. This is represented by a "footprint" which consists of the pixels whose original values are greater than or equal to the level value.
  • the contour footprint contains pixel values larger (brighter) than those outside the contour.
  • the contour is part of a series representing a hole the process is the same but the decisions are the other way round (e.g. the footprint contains pixels which are smaller than the surrounding pixels).
  • contour contains other closed contours of the same level number then these enclosed contours, which are part of the level set including the first contour, correspond to holes in the footprint and their interiors do not form part of the footprint of the first contour.
  • An assumption is made about how the pixel values vary within the footprint. In one method this is done by determining a level set for a level number which is as different from the original level as possible while ensuring that at least one pixel from every watershed within the footprint is within the footprint for this new level set. This is called the target level.
  • the predicted pixel values for all the pixels within the footprint are then determined by interpolation between the two level numbers (the original and the target) based on the results of a diffusion process which indexes the footprint .
  • the result will be a compatible target level, which defines a difference footprint resulting from subtracting the target level footprint from the original footprint.
  • the target level is compatible if the difference footprint it defines contains only pixels which, after diffusion and interpolation or whatever assumption is made about internal pixel values, fall within the error bounds of corresponding pixels in the original raster format image data.
  • the outcome is a level set of contours to each of which the process is repeated until the original footprint of the original contour is empty. The process can then be repeated for any other base contours in the image.
  • the diffusion or interpolation method used is preferably, but not necessarily the same at each repetition.
  • the diffusion method used is identical to the method that is used to render/decoder the vector image data to produce an image (see below) .
  • any group of at least three nested contours or levels it is possible to test for redundant contours by removing the middle contour or level and attempting to diffuse between the other two. If the outcome is that a middle contour or level is still required but that its footprint is smaller than previously then the inner contour or level is a candidate for replacement by the same process until all the contours with a particular watershed have been exhausted.
  • a test for whether the contour is a candidate for outright removal is to compare the pixels under the contour to see if the interpolated value, on the assumption that the contour has been removed, lies within the error bounds of the original pixel.
  • the interpolation values will be available from previous diffusion steps. This test, or variants on it, could be used to see if contour removal was warranted and only attempting removal and accepting either consequence if the test indicated it was likely to result in outright removal. Watershed analysis is not critical to this process as "exhaustion of the watershed" would mean that the contour surrounding an extremum would become one of the triplet, but an improvement would only result if diffusion was part of the reconstruction/decoding process .
  • the methods of encoding in the present invention may include the step of removing "redundant" contours as described in the example steps above, such removal steps are computationally more expensive than the construction of new contours. Accordingly, the methods of the present invention preferably construct the optimal contour set only by creating new contours from an under-populated contour set, and not by removing contours from an over- populated contour set.
  • the vector format image data may include control information which indicates the assumption (s) used in the encoding process. By including such information, a renderer or decoder can determine how best to render or decode the image to faithfully reproduce the original raster format image.
  • these contours are stored as the vector format image data. This step may be performed for each contour when it is first determined, or may be performed in a single operation for all contours when the complete contour set has been determined.
  • a blind technique is one in which the image is not reconstructed in a test decode during encoding. For example it may be decided to select contours at a fixed interval f where f>2 ⁇ and ⁇ is a global estimate of the average pixel noise. A smaller interval may be used where the implementation of the fill process compensates for contour overlap.
  • Another blind technique sets contour values to local maxima in the pixel value histogram, starting with the largest value, then the largest values in the intervals defined by the previous choice, repeating until the interval is near 2 ⁇ or some other limit criterion.
  • a third blind technique uses the distribution of extreme values (maxima and minima) grouped in intervals of ⁇ . If this is calculated as a histogram of extrema it can be used in the same way as the pixel value histogram to determine a contour set.
  • Blind contourisation results in a unique set of level values for the image which can only be modified locally by identifying clusters and partitioning the image in terms of the watersheds for these clusters. None of these methods require a test rendering step, i.e. image reconstruction.
  • the contours are stored by encoding them as Bezier chains.
  • a Bezier chain is a plurality of linked Bezier curves.
  • a Bezier curve can be defined by only 4 points of data (2 end points and 2 control points), the amount of data required to store each contour, and as a consequence, the total amount of data required to store an image, can be significantly reduced.
  • Other curve fitting methods could be used to store the contour data, and the data file storing the image may contain control information indicating the curve fitting method that has been used to store the contour data.
  • the method of this embodiment is used to turn the vector data into an image for display. Preferably it closely corresponds with the test step of the method of encoding used to encode the data into vector form.
  • the vector format image data is preferably stored as a set of contours, with each contour being represented by a plurality of Bezier chains.
  • the decoder renders out the Bezier chains to form the contours (i.e. turns the stored contours into a string of points) at the resolution of the image that is being reproduced.
  • the decoding process determines the colour values of the pixels falling between the contours by using a diffusion equation with fixed velocity which travels over the footprint between the two levels, herein referred to as the template or fill template when it is quantised onto pixel-like areas.
  • a diffusion equation with fixed velocity which travels over the footprint between the two levels
  • the reason why a fixed velocity is used is to progress the front defined by the equation by a given distance in unit time. These time-stamps can be used to simplify the calculation so it is not necessary to solve the equation directly.
  • the front advances over the template it inserts the time-step (1,2, ...etc. although fractional values may be used for 8-connected paths) into each cell over which it passes, time being a measure of distance under constant velocity.
  • each cell will contain two numbers when the process is complete, which will be when every cell of the template has been visited. If a template cell contains two quantities (a, b) where a is the distance from level A and b the distance from level B, then the interpolated pixel value P for the cell will be:
  • the sum ⁇ + b is the shortest distance between contours at any point on the template, alternatively the least number of time steps to traverse the template, and may vary as that distance varies locally. Either interpolation term may be used to parameterise the entire template surface. The times at which the front reaches the borders of the destination contour between which interpolation is being determined thus determines the distance from the originating border, similarly the times to reach any interior point determine the minimum distance from the original position of the front.
  • radial lines are not used this will be a fixed velocity and the diffusion equations are used twice, once to give a minimum distance to the lower level and once to give a minimum distance to the upper level. If radial lines are used the diffusion equations are first used at fixed velocity four times to determine two orthogonal coordinates for each point between the levels and these coordinates are used to determine the interpolations used to derive velocities for each pixel position.
  • the topology of the diffusion space is rectangular and indexed by 0-1 between levels and 0-2 in the orthogonal direction (this is the ⁇ orthogonal index') then: a, the pixel values on each radial line (a minimum of 3 are required) are interpolated by differencing to degree 3 subject to error management to provide the same number of uniform samples along every radial line, b, these values are differenced to give a set of velocities for each sample point along a radial line, c, corresponding velocities on adjacent radial lines are differenced to degree 2 and interpolated (subject to the usual error conditions) to the number of increments of the orthogonal index.
  • velocities now form a rectangular grid which is interpolated by convolution or integration to provide a velocity for each cell in the original space.
  • the velocities mapped into the original space are used by the diffusion equation along with a curvature damping term to provide a variable-speed diffusion between levels.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Processing (AREA)

Abstract

A method is provided for converting raster format image data to vector format image data comprising the step of generating contour data for a plurality of isochromic contours, whereby each isochromic contour is determined by applying a curve fitting process to a plurality of possible solution points and spatial error bounds for said solution points, wherein said solution points and error bounds are based on said raster format image data, said solution points being deemed to have colour values equal to the colour value of the contour, and the contour data is sufficient to allow reconstruction of said contour using the curve fitting process. Also provided are a method of converting raster format image data to vector format image data comprising the steps of determining a minimum set of contours to be generated by using diffusion equations to define a plurality of watersheds and generating contour data defining a contour around each watershed and methods of rendering an image from vector format image data and of determining errors in pixel colour values.

Description

Image Processing and Vectorisation
The present invention relates to the vectorisation of images into contour maps which can be interpreted as piecewise continuous fields. It may be used for the vectorisation of photographic images, but is not limited to this. The present invention also relates to a method of smoothing a contour, a method for interpolating between contours, a method of determining errors in a raster format image, a method of generating a higher resolution version of an image and apparatus for carrying out any of the above.
With the advent of television and later, with the availability of computer systems, it has become desirable to store and manipulate pictures in an electronic or digital form. In television technology pictures are scanned, stored, transmitted and reconstructed as a series of lines. In computer technology all the lines which make up an image are stored together and define a rectangular area (the raster) which when displayed on computer or television screen give the illusion of an image. A typical raster is shown in Fig. Ia (only a 16- line raster is shown: typical rasters have around 1000 lines) and how it is used to build up an image in Fig. Ib (using a single colour gun) . Such scanning and storage is referred to as a raster system and the term applies equally to digital television and images displayed on a computer. In digital video systems each line corresponds to a row of locations in a special two-dimensionally indexed memory called a framestore, and each line on the screen is painted out of the contents of a row of pixels. The fraraestore contains an array of rows of pixels, which also line up in columns. If a column of pixels is set, a vertical line appears on the screen. The term pixel ambiguously refers to the spot on the screen whose intensity is controlled by the pixel value, and by the memory location whether or not it is in the framebuffer itself. The resolution of the image is expressed in terms of two numbers, the number n of pixels in a line (also referred to as a row) and the number of lines 1 (which is also referred to as the number of columns, as pixels line up in columns as described above) making up the image or Λn x 1' . Framebuffer resolutions are expressed similarly
A pixel on the screen may be considered to occupy a finite area, which is idealised to being a square of unit distance along each side although there is considerable debate about how a 'pixel' may be most suitably represented {Blinn, J. What is a pixel? IEEE Computer Graphics & applications, VoI 25 No 5 (Sept) 2005 pp 82- 87) . Essentially the most appropriate model depends on the stage at which the computation of the pixel value has reached. The idealisation above is necessary in order to calculate or represent digitised images as similarly as possible to their original forms. Screen real estate is thus first measured out in display-independent units of pixel side length, which can be corrected to the actual pixel dimensions for metric measurements over the surface of a particular screen or projection area.
Another known technique for describing images is referred to as a vector system. In this, the co-ordinates of a starting point are given, together with either the co- ordinates of the end point of that line, or a direction and distance. This is repeated as many times as are necessary to make up the desired shape. Both of these techniques are illustrated in Figs. 2a and 2b which show respectively the vector addressing of a screen using both absolute positioning and relative positioning and the vector drawing of a triangle using both absolute positioning and relative positioning. Vector images quite often contain closed shapes, which are filled with a colour or pattern. This colour or pattern can vary spatially over the filled region, as shown in Fig. 3. Fig. 3a shows a filled area in a vector image. Fig. 3b shows an image with a plurality of filled areas making up an image, which has been reconstructed from the data depicted in Fig. 3c.
Various techniques can be used to reconstruct images from stored image data. Techniques known as rendering have been available for many years. These involve the creation of completed (or composite) images, usually on a line-by- line ( ^scan-conversion' ) basis. This involves calculating on a pixel-by-pixel basis a composite line, and then repeating this for each line in the final output image. Thus the output of a rendering process is a raster image. It is well known that such rendering processes can turn a vector image into a raster image. This is because the content of any one given pixel can be determined from the vector information.
Vector images are usually drawn by the operator using special-purpose software and manual input devices such as a graphics tablet. In contrast to images that originate from a computer, photographic or real scene images are always in raster form. This is because the known forms of capturing such images are themselves raster based. However, storing images in vector form has several advantages. For example, vectors images are generally resolution independent as it is possible to render the final output image at any desired resolution, both spatially and in colour depth, subject to the accuracy of the vector points. Most particularly vector images can be manipulated (warped etc.) without incurring image- degrading effects as can happen with raster images. This is because the vector form is warped, then rendered as usual in a form suitable for local display. At present, the techniques available for the conversion of photographic or real scene information into vector format are subject to corruption by noise which is always present in any captured raster image and this causes difficulties, some insuperable by present means, at all stages of the handling and rendering of the format.
An object of the present invention is to provide a method for representing images that are generally captured in a raster form in a vector format which preferably compensates for noise corruption and a method of rendering them out conveniently.
Another object of the present invention is to provide a method of determining the errors in a raster format image immediately on input. Once determined the error terms may be used throughout the encoding process to optimise the format prior to manipulation and rendering. In its various aspects the present invention may achieve one or more of the above objectives.
Accordingly a first aspect of the present invention provides a method of converting raster format ' image data to vector format image data comprising the step of generating contour data for a plurality of isochromic contours, whereby each isochromic contour is determined by applying a curve fitting process to a plurality of possible solution points and spatial error bounds for said solution points, wherein said solution points and error bounds are based on said raster format image data, said solution points being deemed to have colour values equal to the colour value of the contour, and the contour data is sufficient to allow reconstruction of said contour using the curve fitting process.
The error bounds around solution points for a given colour value are preferably used to determine a finite area which is the strongest constraint on the shape of a contour of that colour value, i.e. the 'contour bounds'. Any contour which fits in that region may be considered a valid contour within the accuracy to which the original image was captured.
Preferably the step of generating contour data comprises a step of assigning coordinate values to pixels in the raster format image data and interpolating between pixels to find the coordinates of points in the path of a contour having a given colour value. This enables the contour bounds to be found to sub pixel accuracy from the initial solution points .
For example each pixel can be assigned x and y coordinates, which may conveniently be based on the coordinates of the centre of each pixel. Each pixel then has at least a pair of coordinates and a pixel colour value. If two adjacent pixels have colour values of 40 and 50 respectively (in nominal units) , then a contour for colour value 45 will lie between these two pixel centres. The coordinates in pixel space of a point on this contour can be found by interpolation. This enables the contours and/or the contour bounds to be found to sub pixel accuracy.
Preferably the step of defining said contour (s) takes into account error estimates in the raster format image data. The preceding description could be applied with a nominal error value of zero everywhere in which case the error bounds would define the contour trajectory exactly. The error estimates may be based on differences between the values (e.g. colour values) of adjacent pixels. This enables the contours to be defined more smoothly.
The inventors have noticed that differences in pixel values from cubic terms in fitting patches to isochromic surfaces model noise statistics. Therefore the error estimates are preferably based on a fourth order difference between the pixels (as the fourth difference should be zero if there is no noise, or a multiple of the noise otherwise) . In one embodiment the error estimate is based on a multiple of the fourth difference divided by V2 or V70. The exact divisor for the fourth difference in each embodiment will be mediated by a parameter so there are a multiplicity of possible choices for this value, depending on the default accuracy of reconstruction desired.
The method may further include the generation of radial line data for one or more radial lines running between two of said isochromic contours, wherein each radial line is a path between said contours which follows the direction of the local gradient. Such radial lines may be determined by applying a curve fitting process to a plurality of possible solution points and spatial error bounds for said solution points, wherein said solution points and error bounds are based on said raster format image data, said solution points have colour values equal to the colour value of the radial line in the vicinity of the solution point, and the radial line data is sufficient to allow reconstruction of said radial line using the curve fitting process.
Preferably said spatial error bounds are based on local error estimates for pixel colour values in the raster format data. By "local error estimates", we mean, for example, estimates of the error in pixel values of the raster data in the region of the respective solution point. Alternatively a global error estimate such as PSNR, MSD or MAD could be used. The local error estimates may be based on fourth order differences between the pixel colour values in the raster format data.
In particular, the error estimates may be calculated directly from the fourth order differences.
Alternatively the error estimates may be calculated indirectly from the fourth order differences by finding regions where a direct solution for the individual error terms is possible, determining that solution, then using it to unravel other error terms from adjacent fourth differences. Generally this alternative approach is of most use where there are five adjacent pixel values each of which define a linear interpolation step as sufficient for the local error terms.
Alternatively the error estimates in the colour value for each pixel may be calculated by using a convolution like weighted sum of adjacent fourth order differences.
Alternatively the error estimates may be calculated by a combination of any two or more of the above alternatives, the combination being chosen so as to achieve the most accurate result depending on context.
Preferably at least first order differences between the colour values of adjacent pixels are calculated and error estimates for these differences are generated; and said error estimates for the differences are used to simplify equations for finding the solution points for the isochroraic contours. For example, when the error estimate for a difference causes the range of possible values for that difference to overlap zero, the term in the equation corresponding to said difference is assumed to be zero. In this way it ,is possible to reduce the order of equation needed to solve the contour. Options include degree one (linear interpolation, second and higher differences are zero within error bounds) degree two (quadratic interpolation, third differences are zero within error bounds) and degree three (cubic interpolation) .
A second aspect of the present invention provides a method of converting raster format image data to vector format image data comprising the steps of determining a minimum set of contours to be generated by using diffusion equations to define a plurality of watersheds and generating contour data defining a contour around each watershed.
A "watershed" is a region in which, if you select a point in the region, and follow the line of steepest ascent or descent (depending on whether the region is a "hill" (a region in which the area enclosed by the watershed is of greater value, or "higher", than the value at the watershed) or a "hole" (a region in which the area enclosed by the watershed is of lower value, or "lower", than the value at the watershed) ) , you will reach the same extreme point. Preferably the contour data is generated using a method of the first aspect above, including any combination of the optional or preferred features of that aspect.
Alternatively the method derives a contour set at fixed intervals or according to the interpretation of nonreconstructive instrumentation (such as a pixel value histogram or an extrema histogram) to provide a fixed set of contour values.
This method may further partition the image in terms of extrema clustered by overlapping watershed intervals.
A third aspect of the present invention provides a method of converting raster format image data into vector format image data according to either of the first or second aspects above, including any combination of the optional or preferred features of those aspects. In the method of the third aspect an error estimate is generated for each of a plurality of pixels in the raster format data, and the method further comprises a test step in which isochromic contours are generated from the contour data, a colour value for each of said pixels is calculated on the basis of the generated isochromic contours and, if the calculated colour value is outside the error bounds for that pixel, a new isochromic contour is generated for the colour value of that pixel.
Preferably said pixel colour values are for pixels inside said contours and in the test step said pixel colour values are calculated on the basis of said isochromic contours by using a diffusion equation.
Where the vector format image data includes radial line data, said pixel colour values are preferably calculated on the basis of said isochromic contours and one or more radial lines.
The contours are preferably generated from the contour data by applying a curve forming process to said contour data .
Preferably said test step is repeated until the colour value for each of said pixels, calculated in said test step, is inside the error bounds for each said pixel. When this is true, the image has been placed in vector format to within the errors associated with the original raster format image data.
Thus the test step may be a recursive process which, at any given repetition, starts with a single closed contour for a given level value. This is provides a "footprint" consisting of those pixels whose original raster format colour values are greater than or equal to the level value (assuming a "hill" arrangement, but the process is equally applicable to "hole" arrangements) . If the contour completely contains other closed contours of the same level value, these correspond to "holes" in the footprint and the interior of such holes do not count as part of the footprint. An assumption may then be made about how the pixel colour values vary within the footprint. The pixel colour values for pixels within the footprint may then be determined based on this assumption (which is preferably the same assumption as will be used when rendering or decoding the vector image data) .
A comparison can then be made between the determined pixel colour values within the footprint and the original, raster format, colour values for the same pixels. If all of the determined pixel colour values fall within the error bounds of the colour values from the original, raster format, data, then the contour (and the assumption) adequately describe the footprint.
If the comparison shows that some of the determined pixel colour values do not fall within the error bounds of all of the colour values from the original data, those pixels which fall outside the error bounds can be identified and an estimate can be made of a further contour, the provision of which will result in more of the determined pixel colour values falling within the error bounds of the colour values from the original data.
The test step can be repeated until all the determined pixel values fall within the error bounds of the colour values from the original data.
At one extreme, the assumption may be that pixel levels within the footprint are all the same. This will still result in correct encoding of the raster image data into vector image data, and is relatively simple to perform (for each footprint no calculation of pixel colour values is required: the level value of the contour is simply compared to the original data and the error bounds) . However, the number of contours required is likely to be large compared to encodings using other assumptions, and the number of repetitions of the test step required to produce the full contour set is also likely to be higher than for other methods .
In one particular embodiment of the present invention, the assumption is as follows. A second contour for a level value which is as different from the level value of the first contour, while still containing at least one pixel from every watershed within the footprint of this second contour, is determined. The level value of this second contour is referred to as a target level. The pixel values are then determined by interpolation between the two level values based on the results of a diffusion process which indexes the footprint.
The interpolated pixel values that fall outside the error bounds of the pixel values in the original data are identified and an estimate is made of a new target level value which will ensure that a repetition of the interpolation process will result in determined pixel values falling wholly within the error bounds. The result will be a third contour having a compatible target level, i.e. a level which is such that the footprint resulting from subtracting the footprint of the third contour from the footprint of the first contour only contains pixels which, after the assumption is made about how pixel values vary within a footprint, fall within the error bounds of the corresponding pixels in the original, raster data.
The outcome is thus a set of contours (comprising the first and third contours above) to each of which the test process is repeated until the original footprint of the first contour is completely accounted for.
Thus, by a recursive system of test steps, the vector image data can be built up to include the minimal number of contours needed to describe the image data to within the error boundaries of the original, raster data.
Such a process of building up a set of minimal contours is computationally more efficient than providing a set containing more than the optimal number of contours and attempting to determine which contours should be removed.
However, the present invention also includes methods in which redundant contours are removed from a contour set.
A fourth aspect of the present invention provides a method of rendering an image from vector format image data comprising the step of determining the pixel colour values between isochromic contours in the image data by using diffusion equations.
Where said vector format image data further includes one or more radial lines, the pixel colour values may be determined by using diffusion equations in the region bounded by said radial lines and said isochromic contours (which is topologically square) .
Where the vector format image data was created using a method according to the above third aspect, including any combination of the optional or preferred features of that aspect, the diffusion equations used to render said image are preferably identical to the diffusion equations used in said test step.
A fifth aspect of the present invention provides a method of determining the errors in pixel colour values in an image comprising the step of calculating the differences in values between adjacent pixels, estimating the background noise based on fourth order differences between said pixel colour values and assigning a corresponding error value to each pixel.
The error estimates may be calculated directly from the fourth order differences.
Alternatively, the error estimates may be calculated indirectly from the fourth order differences by finding regions where a direct solution for the individual error terms is possible, determining that solution, then using it to unravel other error terms from adjacent fourth order differences.
Alternatively, the error estimates in the colour value for each pixel may be calculated by using a convolution like weighted sum of adjacent fourth order differences. The method may further include the step of determining the error estimates of pixels near the border of the image using a border-enhancement technique.
The border-enhancement technique preferably includes the steps of generating colour values for virtual pixels lying outside the edge of the image, and using the colour values of said virtual pixels to determine said fourth order differences for the pixels near the edge of the image .
In one arrangement the step of determining the colour values of said virtual pixels includes matching a first group of pixels near the edge of the image with a second group of pixels elsewhere in the image having similar colour values to said first group of pixels, and generating colour values for said virtual pixels based on colour values for pixels which have the same positional relationship to said second group of pixels as said virtual pixels have to said first group of pixels.
Preferably said second group of pixels has colour values which are within error bounds of the colour values of the first group of pixels (or vice-versa) , and the edge- enhancement technique assigns a confidence factor for each virtual pixel based on the error bounds and the similarity between the colour values.
In an alternative arrangement the step of determining colour values of said virtual pixels uses the fourth order differences of the pixels near the edge of the image to determine colour values of said virtual pixels from the colour values of said pixels near the edge of the image .
A sixth aspect of the present invention provides for use of the errors in pixel colour values determined by a method according to the fifth aspect above, including any combination of the optional or preferred features of that aspect in a method according to any one of the first, second or third aspects above, including any combination of the optional or preferred features of those aspects, either directly, or in the calculation of spatial error bounds .
The calculation of the errors may form part of a preliminary set of calculations performed before the contours (and optional radial lines) of the vector format image data are determined. Data from these calculations may then be used in the methods of the preceding aspects.
As part of the preliminary set of calculations, the methods of the above aspects may calculate error bounds for every pixel in the raster format image data, find extrema (maxima, minima, ridges, troughs, plateaus, etc.) in the raster format image data, and optionally determine watersheds or approximations to those watersheds.
Alternatively, some or all of the above calculations may be performed at the point in the methods at which they are required. A seventh aspect of the present invention provides a computer program comprising instructions for carrying out the method of any one of the above aspects, including any combination of the optional or preferred features of those aspects.
A further aspect of the present invention provides a computer readable medium containing a computer program according to the seventh aspect above.
A further aspect of the present invention provides a computer programmed according to the above seventh aspect of the present invention.
A further aspect of the present invention provides an apparatus having means configured for carrying out the method of any one of the above aspects, including any combination of the optional or preferred features of those aspects. The apparatus may be comprise respective components for carrying out each step of the method.
A further aspect of the present invention is an image produced by any one of the above aspects.
The above methods may be applied to line drawings stored in raster format (e.g. black and white bitmaps) . However, an advantage of the present invention is that it can also be applied to colour or greyscale images, e.g. photographs. Each contour is a contour representing a colour value and the contours are isochromic contours or nominal lines through the image on which all points have the same colour value.
There are various ways of digitally representing a colour. Typically a colour is represented as a mix of three colour values (commonly known as a tristimulus system), e.g. Red (R), Green (G) and Blue (B). Each component is quantised to a given range to values. In the present application the terms "colour" and "colour value" are used to refer both to the combined colour or colour value of the three components or to the value of any one of the individual components. Typically the method will be carried out separately for each component (e.g. red, green and blue) so that vectors are generated in each colour space. The colour components are then recombined when the image is generated from the vector data for display on a computer screen or the like. As the present invention can also be applied to grey scale images, the terms "colour" and "colour value" also include shades of grey (grey level components) . While a grey scale image is often translated from RGB data, each grey scale pixel could be represented by one component giving the grey scale value (or shade) and the present invention can be applied to such a system too. In general it is applicable to any system, tristimulus, single component or otherwise which assigns colours a numerical value, including colour spaces such as L*a*b or subtractive colour systems such as CMYK.
An embodiment of the present invention will now be described by way of example only and with reference to the accompanying drawings, in which: Fig. Ia shows a 16 line raster and has already been described;
Fig. Ib shows a raster display of a triangle on a CRT and has already been described;
Fig. 2a shows various methods of vector representation and has already been described;
Fig. 2b shows vector drawing of a triangle by a CRT and has already been described;
Fig. 3a shows a filled area in a vector image and has already been described;
Fig. 3b shows an image generated by filled areas defined by vectors and has already been described;
Fig. 3c shows the data used to construct the image of 3b and has already been described;
Fig. 4 is a diagrammatic representation of a basic principle behind contour construction;
Fig. 5a shows how the relationship between the pixel value and the contour level may be used to trace a contour between sample points using the method of marching squares; Fig. 5b shows how the two ambiguous cases may be resolved in the marching squares algorithm; and
Fig. 6 is a diagram showing first, second, third and fourth order differences in a row of pixels.
Generally pixels in real images are contaminated by noise. By associating an error value ε with each pixel value φ, then pixel values can be represented as a pair {φ - ε, φ + ε) which defines the range of values that a pixel could have. Deriving a value for ε for each point is thus useful in analysing the image. Such a value could be derived globally (e.g. using the global signal- to-noise ratio) or preferably locally, for example using the fourth-differences as set out below.
Using the above notation and extending across all pixels in the image, we have defined three isosurfaces for the image: the received isosurface that is the original pixel values and an upper and lower isosurface respectively corresponding to the maximum possible value of each pixel and the minimum possible value of each pixel. The "true" isosurface is φ(x, y) .
One method of representing an isosurface is to model the isosurface in terms of isochromic contours, sometimes known as isophotes, which has been used in image analysis and digital mapping since at least 1967 (see Warntz W., Woldenberg M. , "Concepts and Applications - Spatial order", Harvard papers in Theoretical Geography No .1 , May 1967) . If the true isosurface is intersected by a plane φ=p, then the resulting isochromic contour corresponds to the set of solutions to p-φ(x,y)=0. This form of equation is the starting point of a Level Set formulation for the calculation of successive contours. The solution to the curve equation will often result in more than one closed path, so we will refer to a complete solution of the equation over a region as a "level set" and each closed path within that solution as a "contour" or "isochromic contour".
One possible way of representing an image in vector format is therefore to represent the isosurface as a number of level sets for a specified set of values of p. Since each level set itself consists of one or more contours, we can represent the contours in any standard curve formulation, e.g. Bezier chains.
However, using isochromic contours to represent an image in which the original data is formed of a large number of pixels with quantised colour values presents the problem of how pixels, which by definition are "areas" of colour, rather than discrete points, can be converted to determine the contours. When attempting to solve this problem, it must also be borne in mind that, for most pixels in real images, we will not have exact pixel values, but only error bounds for each pixel.
Therefore, the raster format image data is processed to define a plurality of contours representing the image. Fig.4 illustrates the basic idea behind the method for constructing a contour in the absence of a pre-existing model for a given brightness or colour value, say, a value of 80. The top image shows how pixel values are usually represented by outlining their borders (left) , and for this diagram in terms of their values at their centres (right) . The large image uses this second representation. The large image shows how a contour of value 80 might be found by interpolating between the given pixel values. This method finds a single contour path rather than a ribbon in which the contour might lie.
The pixel is modelled as a unit square with its sample value centred at: pixelx,pixely where each boundary is modelled as a line of x =pixelx ± 0 • 5 , y= pixeL ± 0 •5 and a computation is performed to establish where the contour intersects the lines x=pixelχr y= pixeL . Fig. 4 shows a representation of a 3 by 3 block of pixels, having values 71, 66, 60, 75, 73, 68, 79, 75 and 70. The conventional interpretation is to treat these values as integrals over an area but if the isosurface is to a first approximation planar through the pixel then the integral value will correspond to the spot height or sample point at the centre of the pixel. With this assumption we associate the spot heights with the intersection points of the rectangular grid of square cells defined by \/x,y : x = pixel x,y = pixel y . Interpolation calculations in x and y are carried out in terms of a continuous approximation of the values in the spot heights along each grid-line. Thus interpolation is done in terms of intersections of lines that run through pixel centres .
Instead of solving p- φ(x,y)= 0 to determine the error bounds for the contour (s), we solve instead for the received isosurface only and calculate the error bounds in x and y for each solution point. The initial part of a contour calculation is thus just like the situation shown in the main part of Fig. 4 for the case of p=80.
There are essentially two circumstances in which contours are calculated during encoding (which will be discussed in more detail below) . The first is when a pixel value is being re-estimated from the available contour data and is found to lie outside the error bounds for the original pixel. In this case the contour for the original pixel value is traced out by itself and there is no process of searching for a point to start the tracing from. The second situation is where an entire level set is being calculated for a given value of p and here the initial step involves finding all pixels for which p≤φ(x,y^,
Both the above processes preferably invoke a highly localised search process to find the next intersection points or points following a solution for a previous point. This process is sometime referred to as a "marching squares" algorithm, but the search process will be modified in a scan-conversion context to discover all solutions along a given scan-line before proceeding to the next one. The process is illustrated with reference to Figs. 5a and 5b. In these Figures, the actual pixel values are irrelevant, so the outcome of the test p < φ(x,y) is marked for each pixel as "<" (true) or ">" (false) . If the pixel centres are marked as shown in Fig.5a, and the initial intersection of the contour with the square joining the pixel centres is shown as the solid black dot on the left side of the square, then the next unique point on the contour path is represented by the hollow dot in each of the cases shown.
Fig. 5b shows (as (i) and (iv) ) the two ambiguous cases in this method. (ii) and (iii) show the two possible alternatives for case (i), whilst (v) and (vi) show the two possible alternatives for case (iv) . As can be seen from this Figure, the resolution of the ambiguities can be achieved by determining the isosurface value at the centre point (marked in this Figure by a hollow circle) of the square. Outcomes (ii) and (vi) apply if p>φ(x,y) at the centre point, whilst outcomes (iii) and (v) apply if p<φ(x,y) at the centre point.
Other complications may arise, but they can be dealt with by constructing pixel maps, such as a 2x2 or 4x4 array for each pixel, each entry of which is a sample interpolated from the received pixel array. These pixel maps also allow us to model edges.
An embodiment of a method of converting raster format image data into vector format image data will now be described. The method of this embodiment can be broadly broken down into two parts: (a) preliminary calculations and (b) a recursive encoding. These parts can be further subdivided thus:
Preliminary calculations:
1. Obtain raster format image data;
2. Calculate differences between raster image data down to fourth order differences;
3. Calculate pixel error bounds based on the fourth order differences;
4. Calculate the error bounds for each difference by working these pixel errors down the stack; and
5. Determine extrema in the raster image data (maxima, minima, ridges, troughs, saddle points etc.).
Encoding into vector format image data:
6. Use diffusion equations to determine watersheds in the image;
7. Generate an isochromic contour which forms part of the optimal contour set; 8. Test how well the optimal contour set represents the image being vectorised.
9. . If the partially formed optimal contour set does not adequately represent the image being vectorised, add an additional isochromic contour to the optimal contour set and repeat from step 8 above;
10. If the partially formed optimal contour set does adequately represent the image being vectorised, then the set is complete;
11. An optional contour removal process can be carried out on the complete set, by removing single contours and testing to see if the image is still adequately represented after the contour is removed (this contour set can now be stored as the vector format image data, although both radial lines and zero-crossing lines are also legal components of the complete data structure) ; and 12. Encode the optimal contour set to store the image.
Note that it is not always necessary to use an optimal contour set, which requires a process of comparison with the original, and blind contourisation (contours chosen without reconstructing the original) will usually produce acceptable results for images not subject to magnification effects. Blind contourisation, e.g. by setting contours at a fixed interval, produces a contour set which can be treated just like an optimal contour set by all subsequent processes.
The use of radial lines and zero crossing lines will be described later.
Although the embodiment described here will perform the above steps in sequential order, it is equally possible that the "Preliminary calculations" listed above could be done as required during the "Encoding" steps. This makes the overall process more complicated to follow, and to program, but it may reduce the total number of calculations that are actually carried out.
Here and subsequently we will normally refer to what happens to just one of the tristimulus values that make up a colour. Each of the steps described is applied to each of these tristimulus values to derive contour maps for each component (e.g. to each of the red, green and blue components if a RGB system is used) . When the final image is rasterised to be viewed then the rasterisation process is applied to each component's map and the tristimulus values finally recombined. We will refer to the tristimulus value as a colour, as before.
In fact it is preferable that the methods of the present invention are performed in the CIE L*a*b* colour space because Euclidian distances in this space have been shown to correspond closely to perceptual dissimilarity. In the RGB space, the blue channel is often considerably noisier than the others, so when RGB images are vectorised directly there is a risk of loss of detail from this channel if the noise settings are the same as for the other channels and no real justification for varying them. By contrast, in L*a*b* space the colour components (a*b*) contain considerably less spatial detail so can be safely encoded with higher K values than the L* channel. If conversion between different colour systems is performed, it is preferably done as the first encoding step and the last decoding step, and so the methods presented below can be used with or without such conversions, depending on the colour space in which the raster format image data is originally received and desired to be put out.
1. Obtain raster format image data: The first stage in the method is preliminary processing of the raster data. Raster format image data is obtained, e.g. from a digital photograph.
2. Calculate differences between raster image data down to fourth order differences:
A noise estimate is derived for each pixel in the image as follows. Two sets of difference tables are constructed, involving first, second, third and fourth differences between pixel values in the x (or line or row) direction and in the y (or column) direction. The inventors have noticed that the cubic terms in fitting patches to isochromic surfaces (the 3D model for an isochromic contour map) essentially model noise so from this it would follow that, in the absence of noise, the third differences would be locally constant and the fourth differences zero. In practice this is not the case, so the fourth differences will be some multiple of the local noise in the image. Experimentation involving injecting noise into synthetic images suggests that this multiple is around 8V2 or -v70 , and that the fourth difference immediately 'below' the original pixel value determines the relevant noise estimate multiple. This is shown in the Fig. 6 which depicts a row of five pixels Pl to P5, and differences Δ between the pixels.
If P,_2 P,_λ P1 P1+1 P1+2 are five pixel values in the x- direction (say) so first differences are calculated as
Δ 1i = Pi+\- Pj , second differences as Δ2/ = Δ1/+i -Δ1/, third differences as Δ 3/ = Δ2/+i-Δ2/ and fourth differences as Δ4/ = Δ3/+i - Δ3/ . For P,_2 P,_, P1 P1+1 P1+2 this gives us the difference table shown in Fig . 6 .
The first differences shown in the table can be referred to as first order differences, the second differences as second order differences, the third differences as third order differences and the fourth differences as fourth order differences. The order of each difference is indicated by the numeral in superscript after the Δ symbol .
In Fig. 6, ΔVi lines up below P1. Accordingly the error
Λ4 Λ4 estimate in P1 is ±—7^ or ±—F≡^ in the x-
8V2 V70 direction (whichever version is used needs to be used consistently) . This is a one-dimensional estimate (in the x-direction) . A similar calculation is done in the y-direction to determine errors in that direction or dimension. These error estimates may not necessarily be the same, even though referring to the same pixel value, but in practice errors are always taken into account in one dimension only, i.e. along either the x or y directions. Errors in the differences are accordingly calculated from the appropriate one-dimensional error estimate, noting that the error in the first difference will be the sums of the errors in the participating pixels (and typically around twice the error in the pixel value itself) , similarly the error in the second difference will be around four times that of the original participating pixels etc.. These error estimates may be normalised to more conventional image noise statistics, where these are available, by calculating the statistical means and standard deviations for all of these estimates in the image as a whole and using the error estimate above to index the distribution function derived from such statistics. Our error estimates thus form a measure of deviation from pixel correlations in the appropriate direction. Such normalisation may be carried out separately in x and y or for both directions using the same values, as appropriate. This normalisation is most conveniently done after all difference calculations are completed, i.e. when the pixel colour value error is converted into a spatial error (see later) .
These differences also allow the interpolation of new pixel values and their related difference terms, and extrapolations off the end of the raster in either the x or y directions. Typically extra pixels will be required at the ends of each row and column to synthesise fourth differences at the image borders. These extra pixels are usually discarded.
We note that cubic interpolation done twice in orthogonal directions (i.e. using two one-dimensional steps) is regarded (with caveats - see later) as the most suitable way of resampling a raster image, i.e. by assuming that all pixel values are correlated by a cubic function we can make the most suitable interpolation of the pixels for a (noisy) image, so we take the view that this estimate is a reasonable approximation to the actual noise contamination in the image on a pixel-by-pixel basis. The process of critical point finding (maxima and minima) and contourisation starts with this error calculation and difference finding step (30).'
3. Calculate pixel error bounds based on the fourth order differences; and
4. Calculate the error bounds for each difference by working these pixel errors down the stack; and
As discussed above, the approach to calculation of error bounds for both pixels and the differences is based on fourth order differences. The obtaining of these differences is discussed in the previous step.
If we model the true isosurface as a tensor-product bicubic patch in x, y and z then the intersection of the plane z = x with z = φ or the plane z = y with z = φ would appear as a locally cubic curve in the plane, with its fourth derivatives, where defined, being zero. If such a surface is fitted to the raster format image data (a set of "sample points") , then the fourth derivatives will be some multiple of the local error, which can be estimated.
The above assumption that such a bicubic patch, in which the knots, which are the parameter values which define the progression of the parameter, are not normally aligned with the pixel centres, is generally a valid model for pixel correlations away from the edges. This assumption is supported by the common use of bicubic surfaces, including their interpretation using cubic differences, for image resampling and resizing, and the use of biquadratic functions, which imply and even lower order of correlation, for direct fitting in a localised region.
Image sampling for resizing is a well-studied process and it is recognised that correlations of degree one (i.e. linear interpolation) , give the most accurate samples overall, but when the assumption breaks down, gives locally worse solutions that higher-order interpolation methods like cubic interpolation. This tends to make linearly interpolated images look less visually pleasing than those obtained with cubic interpolation. By using a process based on differences, we are able to follow correlations up to degree three, and also edges as cases of correlation failure.
All of these process work on pixel values and derive an error term which is a pixel value error. Spatial errors δx, δy are then obtained from these pixel error values
Figure imgf000035_0001
and similarly for δy. In practice the dxj derivatives are approximated by differences.
There is a special case to be dealt with: — = 0, which dx only arises on plains or plateaux, but these cases will require special consideration anyway, as they defeat the "marching squares" approach. Cubic resampling using differences uses independent differences up to degree 3 in x and y, shown as Δ3 j without any particular indication of direction in Fig. 6.
For the purposes of managing marching squares and constructing pixel maps, it is sufficient to consider the case of half-interval interpolation, i.e. deriving
Figure imgf000036_0001
and φ{x,y +j) in a two-pass process involving first differences in x and then differences in y. The interpolation function for φ(x,y) in the x direction is derived by substituting into Stirling' s central difference formula, which, using our indexation can be written:
,/ , Λ ,/ s , .,c k2 A2 k.(k -Y).(k + 1) A3C φ{x+k,y)= φ{x,y)+k.A^ +-A2 ^1 +-± ^ J-Λ^_2
and, substituting k = H, the mid-point interpolation formula is:
Figure imgf000036_0002
This formula is only applicable in the interior of an image, typically for x, y ≥ 3 and similar distances from the high-order index border of the image. In the border regions, different formulae are used, derived from the Newton-Gregory forward and backward differencing formulae. Forward differencing is use in the low-index border regions with the general formula: Λ\
Figure imgf000036_0003
which, on substituting k = H1 becomes:
Again, the indices are those shown in Fig. 6, and in the notation of that figure, the above formula would apply to interpolation between P4 and P5.
The corresponding backward difference formula for high- order index border regions is:
φ ,{/x+k,,y)\=φ,{/x,y\)+k,ΛAi\_x + k.(\k + l)ΛA2 2 X_2 +-k.(^k + \^).(k + 2)'-AA3\_, λ o which, on substituting k = -H, becomes:
Figure imgf000037_0001
Similar formulae apply for half-interval interpolation in y-
Generally, interpolation is carried out first in one direction (usually x) , the (y-) differences for the new interpolants calculated and then the interpolants determined for the other direction (y) using the appropriate form of this formula. It is this final step which yields the central interpolant
Figure imgf000037_0002
which is needed to resolve the ambiguous cases shown in Fig. 5b.
In order to compute the nominal contour trajectory, and hence the error bounds for the contour, it is necessary to solve for the intersection of the contour with whichever border of the half-interval box around the pixel that the contour is deemed to be intersecting from the cases shown in Figs. 5a and 5b. This involves recasting Stirling' s formula in terms of its index variable z to obtain a cubic equation for z of the form:
Figure imgf000038_0001
and finding solutions for z in the intervals:
Figure imgf000038_0002
TABLE 1
This equation and the corresponding equations for the border regions, which are derived from the forward and backward differencing formulae above, are all of the form:
B
A.z3 +—.Z2 +C.z+(φ(x,y)-P)= 0,
where :
Figure imgf000038_0003
Figure imgf000039_0001
TABLE 2
It is highly desirable to solve only the simplest form of these equations. For example, if we could show that Δ3C = 0 and Zl2 = 0, then this simplifies to the linear formula:
z =
C
Although even quantization error is sufficient to ensure non-zero higher differences everywhere, we can use error analysis to establish the degree of local correlation and hence which differences we can set to zero. There is still a requirement to solve for the cubic form (for example using the method outlined in Schwarze J. "Cubic and Quartic Roots", Graphic Gems, Glassner A. ed. Academic Press, San Diego & London 1990), the quadratic form (Δ3C = 0) and the linear form (Δ3C = 0 and Δ2 = 0, which occurs in the majority of cases) .
In principle, each of the higher-order forms may result in more than one solution but geometric considerations preclude more than one quadratic solution in the interval and, while there may be more than one solution in the cubic case, there will be either one or three solutions in the interval. In this last case the assumption of variation diminishing and the subsequent use of error terms means that we can determine the error in the solution as the widest errors for the more distant solutions. As it is these error values we want to recover, the solutions themselves are just means to this end.
The immediate objective is two-fold. First we want a rule that will zero out higher-order differences in a principled manner and thus reduce the degree of the intersection equation which needs to be solved. Any form of the intersection equation higher than degree one introduces considerably higher computational cost and requires tests for exceptions.
Second we want a reasonable estimate for the spatial error in terms of Δx or Δy around any intersection point between the nominal contour for φ(x,y)=P on the raster format image data (the "received isosurface") and thus be able to determine the band in which the true contour will lie.
Both these objectives can be achieved using the same model, namely that the true isosurface can be modelled in terms of piecewise bicubic patches and that these pieces may sometimes be represented adequately by regions of lower order.
Because of the unavoidable presence of noise in the image samples, we cannot infer the true isosurface directly, so we have to further hypothesise that this isosurface will have a variation diminishing relationship with the received isosurface, i.e. that the true isosurface will be smoother than the received isosurface by virtue of the absence of noise.
Therefore we are looking for the smoothest contours that fit within the error bounds for the nominal contours of the isosurface as being the most likely candidate for the true isosurface.
As previously discussed, a degree 3 assumption for the local surface seems more than adequate, and it has even been speculated that the degree 3 terms tend to model local noise rather than the true isosurface. Clearly, solving the degree 3 form of the intersection equation should be avoided, provided that this can be justified.
That justification comes from a deeper investigation of errors. If the local isosurface is actually of degree 3 or less then, if there is no noise contamination, Δ4 = 0. If Δ4 Φ 0 then the value of Δ4 is some measure of the amount of error in the pixel above it in the stack of differences {P± in Fig. 6) .
There are then two ways of proceeding. The first alternative is to identify the errors for P±-∑r Pi-i etc. as Δ±- Δi-i etc. and follow their fate through the calculated difference stack.
The second alternative is to indicate that we cannot find out enough about Δi-, Δ±-i etc. to determine such things as their signs and that we can only arrive at a bounded, unsigned estimate of their values, i.e. perform a classical error analysis to derive £>,_2|5|£',_i| etc.. Since at this stage we only have an estimate of what these values could be, we could quote a single value
Figure imgf000042_0001
etc., and sum the terms to determine how the magnitude of the errors increase with the degree of difference. The results of this process are shown in the table 3 below.
Figure imgf000042_0002
TABLE 3
It can be seen that the result of having a single estimate of the error value is that the errors double in magnitude with each degree of difference. The usual model for errors is a Gaussian distribution about zero so that the magnitude of \ε\ would typically be set at about three standard deviations from the centre of this distribution (the more normal choice of two standard deviations, which results in approximately 95% of terms lying within the estimation is insufficient for the present purposes) .
From the above table, it is also clear that the error bound estimates are significantly larger than the anticipated errors themselves, and so their main value is to provide a scheme of multipliers for a tighter estimate of I ε\ derived by other means.
In the present embodiment, that estimate is ultimately derived from the fourth difference terms which, according to the modelling hypothesis, would be zero in the absence of any noise.
However, in reality, the fourth difference terms will be contaminated with noise derived not only from the pixel itself, but also from the two pixels on either side, i.e.:
Figure imgf000043_0001
However, as an indicator for ε±,
Figure imgf000043_0002
or \ε\ this is still not very useful. As each individual error term is distributed about zero, the values of Δi will also be distributed about zero, and in many cases the individual error terms will tend to cancel each other out. However, from the magnitudes of the coefficients, it is also clear that the ε± term, with a weight of -§ of the total is swamped by the other terms. Accordingly, a better indicator is desirable if it can be justified in terms of its overall improvement in computational cost.
One possibility is the use of a convolution-like weighted sum of the nearby fourth differences. Using the same indexing system, we model the errors ε± using the quantity E±, defined as:
1 2M+\
where N is a normalization constant and the c± are the weights for the differences.
If we take M = 3 and cχ,...,Cη as (1, 3, 5, 6, 5, 3, 1), then:
N-E, = ε,-s ~ ε,-A ~ ε,-3 + 2-ε, ~ ε,+3 ~ ε,+4 + ε,+5 ( Formula 1 ) .
The particular choice of coefficients results in the central error term being isolated and, notwithstanding the differences in sign, the other error terms, which have weights of equal magnitude (or zero) will tend to cancel each other out. The expression of the central term and the tendency for the other terms to cancel can be enhanced by adding together the E-terms in the x and y directions, i.e. by considering the quantity EtJ = E^ + E^ and carrying out the normalisation on E±rj.
Whilst E±fj will still be contaminated by nearby error terms, it will give an error estimate that is a better indicator of the true value as it is less affected by the surrounding terms. For example, Eirj is substantially decoupled from E1+1^ as it only shares 4 out of a possible 21 error terms (13 with non-zero coefficients).
A large error term which is out of line with the others (e.g. caused by shot noise) will be clearly indicated, although an error term similar to its neighbours outside the "isolation zone" (pixel errors with zero coefficients) will tend to be expressed as a similar value rather than its actual value. However, for the purposes of modifying an error bound around a contour, or assessing the degree of noise locally, that is acceptable .
The next issue is normalisation. There is an element of arbitrariness about this because the normalisation term will in the end be multiplied by a quality measure (in this case a constant) . If the non-central error terms do in fact cancel out then we would expect the normalisation constant for E±,j as a whole to be 4. Alternatively, some allowance for non-cancellation can be made by treating Ei+i,3 as being made up out of a series of summed terms of identical errors | ε\ , calculating their mean (which should be zero) and standard deviation from which we could take a normalisation term as being equal to three standard deviations. This gives a value of 2V3 or approximately 3.464, which is close enough to 4.
So the error term for the i,jth pixel is taken as
/L1ZJE -^2) wnere Eι,j is the unsigned and normalised form of Eirj and K1 and K2 are constants. K2 in particular is calculated by averaging the error estimates for the image as a whole, as this value should be zero, so it is a normalisation constant aimed at reinforcing the assumption that errors are distributed about zero.
This error term is then scaled according to the scalefactors in the "sum" column of Table 3 to provide error bounds for each of the difference values available for a particular pixel. If the error bound includes zero, then the difference term can be taken as zero and the intersection calculation simplified accordingly. Increasing K1 therefore increases the extent to which linear interpolation is used and the width of the band in which the contour is deemed to lie.
The error estimate derived above allows us to try to refine the solution in favourable circumstances. By setting error bounds around each difference in the stack below Pi, we can determine which of these should be considered as zero and hence the degree (d) of the intersection solution which is used. The result is that V/>d,Δ^=0 and Ad t is a non-zero constant. In regions where d is small (e.g. d = 1) , we can extract a set of equations for the local error terms in terms of a set of fourth differences calculated around P±.
These fourth differences can be expressed in the form: Δ* = A4P1+2ι+Λ - 4.εl+i + 6.εl+2 -A.εt+ι + ε, , where the ΔV term represents the fourth difference in the true pixel values (so far assumed to be zero; this value will be set to zero once the equations have been derived) .
The basic 5-pixel layout of Fig. 6 can now be extended in both directions around P±. The layout can be extended to the left in the figure layout by using forward differencing after setting all differences higher than d to zero and then applying the formula:
Figure imgf000047_0001
The principle behind the next steps is to generate two synthetic pixel values solely from valid, existing data, namely the differences already in the stack, and then to use them to compute new fourth differences with a known distribution of error terms. These values will be produced by an extrapolation process of the appropriate degree (degree 1, 2 or 3) so their error terms will be zero, by construction. The above formula is intended to provide the two values to the left but an equivalent process and a similar formula will be needed for extension to the right.
For extension to the left we note that each pixel value Φ(jtf) will have had a degree d associated with it from previous error analysis. This is the maximum degree necessary for extrapolation away from that value. So in the formula above only the terms involving differences of degree d or less need to be included, as the higher differences have been deemed to be zero. Once ø(/-l,y) has been calculated it may inherit the interpolation degree of although any other possible means of determining this value are also possible.
From φ(i-l,j) new differences up to Ad_x are put into the stack and Ad t_x (which should be locally constant within these assumptions, so should be the same as Δds to the right) is re-set to the mid-point of the intersection of all adjacent error bounds for the Δrfs to the right for which the local continuity of the associated pixel is of the same degree d. This accumulation of the error bounds intersection ceases when the next difference to contribute is associated with a pixel with a different continuity degree (higher or lower) or has an index outside the difference stack shown in Fig. 6. The difference calculation below Ad now continues to Δ4. This refinement of differences at level Ad may be omitted in less precise implementations.
The same formula is applied again to give a second extrapolated pixel value and associated differences according to local continuity assumptions.
The layout can similarly be extended to the right using the same approach to establishing differences, although backward differences are now used, and the error bounds to the left are intersected for adjacent Δrfs under the same conditions. The appropriate formula is: φ(i+\,j)= φ(i,J)+A^ +Δ^_23,_3. By extending the layout, we can generate five Δ4s, but since we used a difference formula to calculate the pixel extensions, and these respected local continuity conditions, then the error terms associated with each of these will be zero, by construction. Thus the only nonzero error terms would be those originally present, namely εt_2,-••ει+2. Therefore we have five equations in five unknowns :
Figure imgf000049_0004
Figure imgf000049_0001
Since our basic assumption is that the ΔPs are all zero, this gives:
Figure imgf000049_0002
or, by inversion:
Figure imgf000049_0003
Figure imgf000049_0005
This approach is most to be trusted in a region in which d = 1. It yields individual error values which an be checked against the original error estimates and a consistent solution set can be used to progressively uncover error terms in either direction when these are applied within the original difference set.
This is done by moving along the fourth difference terms progressively solving for the next error term by substituting the known differences. For example: ει+3 = Δ4,_, + 4εl+2 -6ει+ι + Ae1 -ε,_λ , and similarly for ει+4, etc. and £,_3,£,_4/ etc. in the other direction. Alternatively Formula 1 for E± can be used in the same way.
This process breaks down when an edge is encountered, which may be signalled by a zero crossing of the second difference.
The final step in the intersection calculation involves finding spatial error bounds around the intersection solution. The method described above for finding the intersection solution involved solving for x or y for the received isosurface and then establishing where that solution lay on the upper and lower bound isosurfaces.
Now that we have quantitative error measures for each of the received pixel values, we can construct models for these isosurfaces, also in terms of the differences. Generally we will only want the first differences, and higher differences can be constructed if they are needed. As each pixel now has associated with it the highest degree of interpolation which can be justified in terms of the highest difference which cannot be set to zero, this determines the number of higher differences which need to be calculated locally. In most cases the first difference will be enough.
In addition, the value for the intersection with the received isosurface is available, so the interpolation can be carried out directly.
To convert to positional errors in x and y, it will also be necessary to interpolate the first difference as an approximation to the differential. If Δ2=0, then it is sufficient to set this interpolated value to the midpoint of the intersection of the error bounds of the interpolated differences. Otherwise, it has to be interpolated linearly (Δ3=0) or quadratically (Δ3≠0).
The central difference interpolation formula is:
Figure imgf000051_0001
and here the highest difference not deemed to be zero is taken as the mid-point of the intersection of the error bounds for the adjacent entries. Similar formulae with similar considerations apply to the forward and backward difference cases near the edges, although care needs to be taken with the received interpolation value k. Because the first non-central differences are offset by H, the value of k has to be adjusted by ±% in the appropriate direction prior to insertion into the formula .
5. Determine extrema in the raster image data (maxima, minima, ridges, troughs, etc.):
Here we mean local maxima, local minima, ridges, troughs, and plateaux. A plateau is in a sense an extended extreme point, as are ridges and troughs. In fact, such extended extreme points tend to be rare (candidate plateaux are rarely completely flat and ridges and troughs will contain further extrema) , but they should be catered for. A plateau in particular will defeat the marching squares process described above, but tracing a contour slightly offset from the plateau level, thereby establishing its bounds and extent will not. Under most circumstances extrema will therefore just be points and they will have regions of influence normally referred to as watersheds . These watersheds can be defined as regions in which energy minimised paths, e.g. steepest ascent (for maxima) or steepest descent (for minima) processes will lead to the extreme point. As will be further discussed below, the border of a watershed is a good candidate for contour membership of an insufficient set as other contours may be needed to refine the topography closer to the extreme point.
One simple technique for finding critical points (maxima and minima discussed separately below) is to look along a scan line (i.e. proceeding along the x-direction) , noting the signs of the first differences between neighbouring pixels. When the sign changes, the last pixel with the previous sign is a candidate to be a local maximum or minimum. However this may be a false critical point induced by (say) shot noise, and in any case needs to be verified by a similar analysis in the y-direction.
The efficiency of encoding processes can be degraded by noise, particularly noise spikes masquerading as extrema. While these can be safely ignored as, for example, the contour finding process will use the enhanced error value to widen the band in which the contour lies, so increasing the likelihood that the contour will pass smoothly through the spike, they still have to be identified in order to be excluded from the extrema list. The most straightforward and common case is that for a spike in a local field of linear interpolations (second differences and upwards deemed to be zero) . Here the error bounds for the spike will normally be large enough not to trigger higher degree interpolation but even if it is a 1-pixel "hole" in the linear field, this is very noticeable. The condition here for discarding a candidate extremum is that the three adjacent (central) first differences in both x and y have error bounds which overlap a single value which in turn is the average of their three values. The reason for choosing their average can be seen from Table 3 above because the error in the central term is excluded from the sum. The standard test for an extremum is a so-called zero crossing (a term normally used in connection with second derivatives where it signals an edge) in the first derivative, here simulated by differences. This is where the first derivative (difference) changes sign and, assuming the modelled derivative is continuous, the zero point should lie between the values of opposite sign. If zero also lies within the common error bound then these first differences could all be zero and adjacent values need to be checked to establish a first degree zero crossing unambiguously. In the event of a conflict the discard condition supersedes the retention condition unless all three differences could be zeros in which case a sign change indicated by adjacent non-zero first differences determines the outcome.
Further optimisation can be used to reduce the number of distinct extrema. If one or more extrema have overlapping spatial error bounds they can be clumped together and used as an extended extremum. All of these data reduction steps will cull more points as the K term increases.
6. Use diffusion equations to determine watersheds in the image :
The nature of watersheds has been discussed above under step 5. Therefore a process to find these watersheds and partition the image into "tiles" around each extreme point by these watersheds is a useful starting point for encoding an insufficient contour set. A precise partitioning is not critical, so fast, simple methods of determining watersheds are preferred.
The method by which the watersheds in the present embodiment are found is based on diffusion using a Fast- Marching narrow band level set method. Like all level set methods, we assume an advancing front F which will be a piecewise continuous closed curve for which we assume that first derivatives are defined nearly everywhere in the data. The evolution of F is defined by:
dF VF
= +v. dt VF
The initial condition on F is that it forms the boundary of the extremum at time t = 0. On our assumption about errors, we can take that initial boundary to be defined by an ellipse the sizes of whose axes are determined by the errors in x and y, namely δx,δy , for the extremum.
Extended extrema are dealt with straightforwardly in the same equation.
What the above equation is defining is the rate of change of the position of F (i.e. its velocity) as being equal to a quantity v along the normal to the curve at that point, with positive normals being taken to point away from the extremum. The level set convention is to make the sign of v in the equation negative but with the same convention for the sign of the normal because of the normal use of the equation to describe a shrinking front. However, here the front is always expanding, so Fast Marching methods, e.g. as described in Sethian J. A. "Level Set methods and Fast Marching Methods", 2nd Ed. CUP 1999, are applicable.
The term v is the velocity of the motion of the position vector, and if we assume that motion is initially uniform, then we take v = constant to start with, then set v = 0 when the front encounters another advancing front, which stops all further progress locally in the region of the meeting. The implementation of this process as a set of advancing markers (which may be given a particular colour to identify the tile to which they belong) over the pixel field is straightforward. The outcome of this process will be to identify all the pixels on the watershed borders. By starting with a constant initial velocity a Voronoi tessellation (see below) expressed in terms of its borders will result.
However, the diffusion process can preferably be modified by defining v to be:
Figure imgf000056_0001
{ 0 when a competing front is encountered.
Then v advances in inverse proportion to the local rate of pixel change, thus making the front F itself evolve like the isochromic contours for the surface over which it is passing. The image will still be partitioned by the same number of tiles as when v was constant, but each tile will now more closely resemble the true watershed for each extremum.
Thus this process identifies a set of border pixels for each watershed. Each tile will then contain a pixel value nearest to the extremum value, and a pixel value further from the extremum value. The nearest value defines a contour that lies entirely inside the tile, and surrounds the extremum. The furthest value will contain a contour that will normally lie entirely outside the tile (except at the point (s) of intersection) and thus contain the whole of the tile as well as parts of neighbouring tiles. The tiles can then be sorted in terms of their furthest contour values and each of these provides a seed for tracing a contour. Along with starting contours for extrema, these provide an initial set of levels for contour finding.
An alternative to the above method is Voronoi tessellation (e.g. as described in O'Rourke J., "Computational Geometry in C 2nd Ed.", CUP 1998.) which is a well-studied process for which efficient solutions exist when the seeds are infinitesimal points. Each
Voronoi tile will contain all those pixels at least as close to the defining seed point as to any other seed point. Pixels which are equidistant to at least two seed points lie on the tile border and it is this tile border that forms the estimated watershed. Efficient tessellation processes will typically first form the dual of a Voronoi tessellation, which is a Delaunay triangulation joining the nearest seed points to each other and then forming the Voronoi tiles in terms of straight lines joining the mid-points of the vertices of the triangulation.
Generally, Voronoi tiling is quicker than the diffusion method described above, but the diffusion method has the advantages of being able to handle extended extrema and to produce a result that is better related to the true watershed than the Voronoi tiles.
7. Generate an isochromic contour which forms part of the optimal contour set; 8. Test how well the optimal contour set represents the image being vectorised; and
9. If the optimal contour set does not adequately represent the image being vectorised, add an additional isochromic contour to the optimal contour set and repeat from step 8 above; or
10. If the optimal contour set does adequately represent the image being vectorised, then the set is complete and can be stored as the vector format image data after removal of redundant contours.
These steps are the recursive methods of finding the optimal contour set, involving starting from a single contour (or a number of seed contours), testing the accuracy of the representation produced by those contours against the original data, and adding further contours if required to match the original data.
One example of a process that provides these steps is set out below:
a) Find a contour near to each extremum - this may have already been done in step 6. If the extremum is a point then the contour will describe a small circular region around it. If the extremum is an extended region then the contour will be an extent of the region. The level value will be close to the extremum level value in each case and the contour should be chosen such that linear diffusion within this contour will result in pixel colour values falling within the error bounds of the pixel colour values of the original raster format image data for all pixels inside the contour. These contours replace the extrema in what follows and any small gaps in the overall image can be filled in by diffusion at the end if required.
b) Find for each watershed a bound pair which comprises the level found in step a) and the most distant level associated with the watershed boundary.
c) Find the bottom-most level in any watershed boundary. This will be the lowest possible level. Its footprint is the whole image, so the level set for this level is a single contour that follows the image borders .
d) Find a second level with the following characteristics i) it is either below, or intersects, all other watershed bound pairs, or ii) it is the highest level for which characteristic i) is true. Find the level set for this level .
e) For each contour in the level set defined by step d) (or step f) below) , identify all watersheds whose footprints have a non-zero intersection with the contour footprint. If a watershed is not wholly contained by the contour it is partitioned by the contour into segments. Find the most distant point on each watershed segment border from its extremum and set the watershed segment bound pair appropriately. The contour now has its own watershed subset associated with it.
f) (A modification of step d) to deal with the partition defined by the contour in e) above) Find another level inside the contour defined by step e) with the following characteristics i) it is either below or intersects all the watershed bound pairs associated with the contour, or ii) it is the highest level for which characteristic i) is true. Find the level set as defined within the contour for this level.
g) Repeat e) and f) until the contour found by step f) is the contour around the extreme point. The vicinity of the extreme point is filled as a special case, diffuse to extremum itself then fill all pixels in extremum footprint with extremum value.
The steps d) to g) above assume that the entire insufficient contour set is first mapped out, then diffusion is done. However, diffusion on steps d) and f) can be carried out as soon as two levels to diffuse between have been identified. If at any point the diffusion step generates a pixel value which lies outside the error bounds of the original pixel then a new level set for that value is generated and all contours which lie within that level set may be discarded (i.e. all of the level set providing the diffusion target) . There is no need to re-diffuse between the new levels as this will be done correctly on decoding and the next step can be started immediately.
The process, in whatever form it takes, is recursive, so step d) leads to multiple contours for each of which steps e) and f) are done and step f) leads to multiple contours for which the process is repeated until the level selected corresponds to an extremum.
Using the diffusion process, the method is a recursive process which at any given repetition starts with a single closed contour for a given level number. This is represented by a "footprint" which consists of the pixels whose original values are greater than or equal to the level value.
For the purposes of illustration, it is assumed that the contour footprint contains pixel values larger (brighter) than those outside the contour. Clearly, if the contour is part of a series representing a hole the process is the same but the decisions are the other way round (e.g. the footprint contains pixels which are smaller than the surrounding pixels).
If the contour contains other closed contours of the same level number then these enclosed contours, which are part of the level set including the first contour, correspond to holes in the footprint and their interiors do not form part of the footprint of the first contour. An assumption is made about how the pixel values vary within the footprint. In one method this is done by determining a level set for a level number which is as different from the original level as possible while ensuring that at least one pixel from every watershed within the footprint is within the footprint for this new level set. This is called the target level.
The predicted pixel values for all the pixels within the footprint are then determined by interpolation between the two level numbers (the original and the target) based on the results of a diffusion process which indexes the footprint .
All predicted pixel values which fall outside the error bounds of the pixels they supposedly represent are now identified and an estimate is made of a new target level value which will ensure that a repetition of the interpolation process will result in interpolated pixel values wholly within error bounds. For example, the estimate could be made by taking the smallest deviation which would guarantee this result. Alternatively the estimate could take a larger deviation, which could result in fewer contours overall but require checking at each estimation. The result will be a compatible target level, which defines a difference footprint resulting from subtracting the target level footprint from the original footprint. The target level is compatible if the difference footprint it defines contains only pixels which, after diffusion and interpolation or whatever assumption is made about internal pixel values, fall within the error bounds of corresponding pixels in the original raster format image data.
The outcome is a level set of contours to each of which the process is repeated until the original footprint of the original contour is empty. The process can then be repeated for any other base contours in the image.
The diffusion or interpolation method used is preferably, but not necessarily the same at each repetition. Preferably the diffusion method used is identical to the method that is used to render/decoder the vector image data to produce an image (see below) .
An extreme possibility is to assume that the pixel values within the footprint are all the same. This would avoid having to find watersheds but this saving may well not compensate enough for the extra trials required to complete the contour set. If only equal values are assumed in footprint interiors then an optimally- contoured image under that assumption will be generated directly and diffusion will not be required at all. However the phrase "optimally contoured" is a bit of a misnomer in this case as there would be many more contours than we would get with even simple diffusion methods, and reducing the quality measure would have very little effect on contour production (it would reduce the number of control points required to specify a contour but not the number of contours required) . However, a more conservative method of estimating a target level could be backed up by a redundancy test which may not be considered necessary when a more aggressive method which allows for in-built repeated trials to get a good solution is used.
Within any group of at least three nested contours or levels it is possible to test for redundant contours by removing the middle contour or level and attempting to diffuse between the other two. If the outcome is that a middle contour or level is still required but that its footprint is smaller than previously then the inner contour or level is a candidate for replacement by the same process until all the contours with a particular watershed have been exhausted.
A test for whether the contour is a candidate for outright removal (which would not require the shuffling of the method discussed above) is to compare the pixels under the contour to see if the interpolated value, on the assumption that the contour has been removed, lies within the error bounds of the original pixel. The interpolation values will be available from previous diffusion steps. This test, or variants on it, could be used to see if contour removal was warranted and only attempting removal and accepting either consequence if the test indicated it was likely to result in outright removal. Watershed analysis is not critical to this process as "exhaustion of the watershed" would mean that the contour surrounding an extremum would become one of the triplet, but an improvement would only result if diffusion was part of the reconstruction/decoding process .
Although the methods of encoding in the present invention may include the step of removing "redundant" contours as described in the example steps above, such removal steps are computationally more expensive than the construction of new contours. Accordingly, the methods of the present invention preferably construct the optimal contour set only by creating new contours from an under-populated contour set, and not by removing contours from an over- populated contour set.
The vector format image data may include control information which indicates the assumption (s) used in the encoding process. By including such information, a renderer or decoder can determine how best to render or decode the image to faithfully reproduce the original raster format image.
Having determined the contours forming the optimal set for encoding the image, these contours are stored as the vector format image data. This step may be performed for each contour when it is first determined, or may be performed in a single operation for all contours when the complete contour set has been determined.
Instead of finding the optimal contour set it is also possible to get good results using a 'blind' contourisation technique. A blind technique is one in which the image is not reconstructed in a test decode during encoding. For example it may be decided to select contours at a fixed interval f where f>2ε and ε is a global estimate of the average pixel noise. A smaller interval may be used where the implementation of the fill process compensates for contour overlap. Another blind technique sets contour values to local maxima in the pixel value histogram, starting with the largest value, then the largest values in the intervals defined by the previous choice, repeating until the interval is near 2ε or some other limit criterion. A third blind technique uses the distribution of extreme values (maxima and minima) grouped in intervals of ±ε. If this is calculated as a histogram of extrema it can be used in the same way as the pixel value histogram to determine a contour set. Blind contourisation results in a unique set of level values for the image which can only be modified locally by identifying clusters and partitioning the image in terms of the watersheds for these clusters. None of these methods require a test rendering step, i.e. image reconstruction.
In preferred embodiments of the invention, the contours are stored by encoding them as Bezier chains. A Bezier chain is a plurality of linked Bezier curves. As a Bezier curve can be defined by only 4 points of data (2 end points and 2 control points), the amount of data required to store each contour, and as a consequence, the total amount of data required to store an image, can be significantly reduced. Other curve fitting methods could be used to store the contour data, and the data file storing the image may contain control information indicating the curve fitting method that has been used to store the contour data.
The method of this embodiment is used to turn the vector data into an image for display. Preferably it closely corresponds with the test step of the method of encoding used to encode the data into vector form.
The vector format image data is preferably stored as a set of contours, with each contour being represented by a plurality of Bezier chains.
Accordingly, in a first step, the decoder renders out the Bezier chains to form the contours (i.e. turns the stored contours into a string of points) at the resolution of the image that is being reproduced.
The decoding process then determines the colour values of the pixels falling between the contours by using a diffusion equation with fixed velocity which travels over the footprint between the two levels, herein referred to as the template or fill template when it is quantised onto pixel-like areas. The reason why a fixed velocity is used is to progress the front defined by the equation by a given distance in unit time. These time-stamps can be used to simplify the calculation so it is not necessary to solve the equation directly. As the front advances over the template it inserts the time-step (1,2, ...etc. although fractional values may be used for 8-connected paths) into each cell over which it passes, time being a measure of distance under constant velocity. This is done in both directions so each cell will contain two numbers when the process is complete, which will be when every cell of the template has been visited. If a template cell contains two quantities (a, b) where a is the distance from level A and b the distance from level B, then the interpolated pixel value P for the cell will be:
α
P =A- -+B- α+b α+ b
which is just an application of linear interpolation. The sum α+ b is the shortest distance between contours at any point on the template, alternatively the least number of time steps to traverse the template, and may vary as that distance varies locally. Either interpolation term may be used to parameterise the entire template surface. The times at which the front reaches the borders of the destination contour between which interpolation is being determined thus determines the distance from the originating border, similarly the times to reach any interior point determine the minimum distance from the original position of the front.
If radial lines are not used this will be a fixed velocity and the diffusion equations are used twice, once to give a minimum distance to the lower level and once to give a minimum distance to the upper level. If radial lines are used the diffusion equations are first used at fixed velocity four times to determine two orthogonal coordinates for each point between the levels and these coordinates are used to determine the interpolations used to derive velocities for each pixel position. If the topology of the diffusion space is rectangular and indexed by 0-1 between levels and 0-2 in the orthogonal direction (this is the λorthogonal index') then: a, the pixel values on each radial line (a minimum of 3 are required) are interpolated by differencing to degree 3 subject to error management to provide the same number of uniform samples along every radial line, b, these values are differenced to give a set of velocities for each sample point along a radial line, c, corresponding velocities on adjacent radial lines are differenced to degree 2 and interpolated (subject to the usual error conditions) to the number of increments of the orthogonal index. These velocities now form a rectangular grid which is interpolated by convolution or integration to provide a velocity for each cell in the original space. The velocities mapped into the original space are used by the diffusion equation along with a curvature damping term to provide a variable-speed diffusion between levels.
Some care needs to be taken with local topology to ensure that each diffusion field is topologically square. Also rendering into pixels is done by supersampling the continuous field as represented by the diffusion values, which is typically a multiple of the final rendering resolution .
A skilled person will appreciate that variations of the disclosed arrangements are possible without departing from the invention. For example a de-convolution pass could be done to replace area samples by point samples. Accordingly, the above description of a specific embodiment is made by way of example only and not for the purposes of limitation. It will be clear to the skilled person that minor modifications may be made without significant changes to the operation described.

Claims

CLAIMS :
1. A method of converting raster format image data to vector format image data comprising the step of generating contour data for a plurality of isochromic contours, whereby each isochromic contour is determined by applying a curve fitting process to a plurality of possible solution points and spatial error bounds for said solution points, wherein said solution points and error bounds are based on said raster format image data, said solution points being deemed to have colour values equal to the colour value of the contour, and the contour data is sufficient to allow reconstruction of said contour using the curve fitting process.
2. A method according to claim 1, further including the generation of radial line data for one or more radial lines running between two of said isochromic contours, wherein each radial line is a path between said contours which follows the direction of the local gradient.
3. A method according to claim 2 wherein each radial line is determined by applying a curve fitting process to a plurality of possible solution points and spatial error bounds for said solution points, wherein said solution points and error bounds are based on said raster format image data, said solution points have colour values equal to the colour value of the radial line in the vicinity of the solution point, and the radial line data is sufficient to allow reconstruction of said radial line using the curve fitting process
4. A method according to any one of the above claims wherein said spatial error bounds are based on local error estimates, including error estimates derived from fourth order differences for pixel colour values in the raster format data.
5. A method according to claim 4 wherein the local error estimates are based on fourth order differences between the pixel colour values in the raster format data .
6. A method according to claim 5 wherein the error estimates are calculated directly from the fourth order differences .
7. A method according to claim 5 wherein the error estimates are calculated indirectly from the fourth order differences by finding regions where a direct solution for the individual error terms is possible, determining that solution, then using it to unravel other error terms from adjacent fourth differences.
8. A method according to claim 5 wherein the error estimates in the colour value for each pixel are calculated by using a convolution like weighted sum of adjacent fourth order differences.
9. A method according to claim 5 wherein the error estimates are calculated by a combination of any two or more of the following methods: calculated directly from the fourth order differences; calculated indirectly from the fourth order differences by finding regions where a direct solution for the individual error terms is possible, determining that solution, then using it to unravel other error terms from adjacent fourth differences; and calculated by using a convolution like weighted sum of adjacent fourth order differences, the combination being chosen so as to achieve the most accurate result depending on context .
10. A method according to any one of the above claims wherein at least first order differences between the colour values of adjacent pixels are calculated and error estimates for these differences are generated; and wherein said error estimates for the differences are used to simplify equations for finding the solution points for the isochromic contours.
11. A method of converting raster format image data to vector format image data comprising the steps of determining a minimum set of contours to be generated by using diffusion equations to define a plurality of watersheds and generating contour data defining a contour around each watershed.
12. A method according to claim 11 wherein the contour data is generated according to a method as defined in any one of claims ■ 1 to 10.
13. A method according to any one of claims 1 to 12 wherein an error estimate is generated for each of a plurality of pixels in the raster format data, and further comprising a test step in which isochromic contours are generated from the contour data, a colour value for each of said pixels is calculated on the basis of the generated isochromic contours and if the calculated colour value is outside the error bounds for that pixel, a new isochromic contour is generated for the colour value of that pixel.
14. A method according to claim 13 wherein said pixel colour values are for pixels inside said contours and in the test step said pixel colour values are calculated on the basis of said isochromic contours by using a diffusion equation.
15. A method according to claim 14 as dependent on claim 2 or any claim dependent on claim 2 wherein said pixel colour values are calculated on the basis of said isochromic contours and one or more radial lines.
16. A method according to any one of claims 13 to 15 wherein the contours are generated from the contour data by applying a curve forming process to said contour data.
17. A method according to any one of claims 13 to 16 further comprising repeating said test step until the colour value for each of said pixels, calculated in said test step, is inside the error bounds for each said pixel, the set of isochromic contours found by this repetition forming the vector format image data.
18. A method according to any one of claims 1 to 12 wherein a contour set is derived at fixed intervals or according to the interpretation of nonreconstructive instrumentation to provide a fixed set of contour values.
19. A method according to claim 18 which further partitions the image in terms of extrema clustered by overlapping watershed intervals.
20. A method of rendering an image from vector format image data comprising the step of determining the pixel colour values between isochromic contours in the image data by using diffusion equations.
21. A method according to claim 20 wherein said vector format image data further includes one or more radial lines, and the pixel colour values are determined by using diffusion equations between said radial lines and said isochromic contours.
22. A method according to claim 20 or claim 21 in which the vector format image data is created using a method according to claim 13 or any claim dependent on claim 13 and the diffusion equations used to render said image are identical to the diffusion equations used in said test step .
23. A method of determining the errors in pixel colour values in an image comprising the step of calculating the differences in values between adjacent pixels, estimating the background noise based on fourth order differences between said pixel colour values and assigning a corresponding error value to each pixel.
24. A method according to claim 23 wherein the error estimates are calculated directly from the fourth order differences.
25. A method according to claim 23 wherein the error estimates are calculated indirectly from the fourth order differences by finding regions where a direct solution for the individual error terms is possible, determining that solution, then using it to unravel other error terms from adjacent fourth order differences.
26. A method according to claim 23 wherein the error estimates in the colour value for each pixel are calculated by using a convolution like weighted sum of adjacent fourth order differences.
27. A method according to any one of claims 23 to 26 further including the step of determining the error estimates of pixels near the edge of the image using an edge-enhancement technique.
28. A method according to claim 27 wherein the edge- enhancement technique includes the steps of generating colour values for virtual pixels lying outside the edge of the image, and using the colour values of said virtual pixels to determine said fourth order differences for the pixels near the edge of the image.
29. A method according to claim 28 wherein the step of determining the colour values of said virtual pixels includes matching a first group of pixels near the edge of the image with a second group of pixels elsewhere in the image having similar colour values to said first group of pixels, and generating colour values for said virtual pixels based on colour values for pixels which have the same positional relationship to said second group of pixels as said virtual pixels have to said first group of pixels.
30. A method according to claim 29 wherein said second group of pixels has colour values which are within error bounds of the colour values of the first group of pixels, and the edge-enhancement technique assigns a confidence factor for each virtual pixel based on the error bounds and the similarity between the colour values.
31. A method according to claim 28 wherein the step of determining colour values of said virtual pixels uses the fourth order differences of the pixels near the edge of the image to determine colour values of said virtual pixels from the colour values of said pixels near the edge of the image.
32. Use of the errors in pixel colour values determined by a method according to any one of claims 23 to 31 in a method according to any one of claims 1 to 17, either directly, or in the calculation of spatial error bounds.
33. A computer readable medium containing a computer program comprising instructions for carrying out the method of any one of the above claims.
34. Apparatus having means configured for carrying out the method of any one of claims 1 to 32.
PCT/GB2007/002470 2006-07-03 2007-07-03 Image processing and vectorisation WO2008003944A2 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
GB0613199.9 2006-07-03
GB0613199A GB0613199D0 (en) 2006-07-03 2006-07-03 Image processing and vectorisation

Publications (2)

Publication Number Publication Date
WO2008003944A2 true WO2008003944A2 (en) 2008-01-10
WO2008003944A3 WO2008003944A3 (en) 2008-07-31

Family

ID=36888551

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/GB2007/002470 WO2008003944A2 (en) 2006-07-03 2007-07-03 Image processing and vectorisation

Country Status (2)

Country Link
GB (1) GB0613199D0 (en)
WO (1) WO2008003944A2 (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100259683A1 (en) * 2009-04-08 2010-10-14 Nokia Corporation Method, Apparatus, and Computer Program Product for Vector Video Retargeting
GB2504653A (en) * 2012-06-11 2014-02-12 Univ Bath Vector contour coding methods
CN105139452A (en) * 2015-09-01 2015-12-09 电子科技大学 Geological curved surface reconstruction method based on image segmentation
US9769365B1 (en) 2013-02-15 2017-09-19 Red.Com, Inc. Dense field imaging
CN113129402A (en) * 2021-04-19 2021-07-16 中国航发沈阳发动机研究所 Cross section data cloud picture drawing method
CN114296398A (en) * 2021-11-16 2022-04-08 中南大学 High-speed high-precision interpolation method for laser cutting
CN115631350A (en) * 2022-11-17 2023-01-20 博奥生物集团有限公司 Method and device for color recognition of can printing image

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5621819A (en) * 1994-04-22 1997-04-15 Victor Company Of Japan, Ltd. Multidimensional multi-valued color image compression and decompression method
US5694331A (en) * 1993-10-17 1997-12-02 Hewlett-Packard Company Method for expressing and restoring image data

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5694331A (en) * 1993-10-17 1997-12-02 Hewlett-Packard Company Method for expressing and restoring image data
US5621819A (en) * 1994-04-22 1997-04-15 Victor Company Of Japan, Ltd. Multidimensional multi-valued color image compression and decompression method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ZHANG S ET AL: "Vectorization of digital images using algebraic curves" COMPUTERS AND GRAPHICS, ELSEVIER, GB, vol. 22, no. 1, 25 February 1998 (1998-02-25), pages 91-101, XP004123429 ISSN: 0097-8493 *

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100259683A1 (en) * 2009-04-08 2010-10-14 Nokia Corporation Method, Apparatus, and Computer Program Product for Vector Video Retargeting
GB2504653A (en) * 2012-06-11 2014-02-12 Univ Bath Vector contour coding methods
WO2013186543A3 (en) * 2012-06-11 2015-02-26 University Of Bath Method of coding video transmissions
US10939088B2 (en) 2013-02-15 2021-03-02 Red.Com, Llc Computational imaging device
US9769365B1 (en) 2013-02-15 2017-09-19 Red.Com, Inc. Dense field imaging
US10277885B1 (en) 2013-02-15 2019-04-30 Red.Com, Llc Dense field imaging
US10547828B2 (en) 2013-02-15 2020-01-28 Red.Com, Llc Dense field imaging
CN105139452A (en) * 2015-09-01 2015-12-09 电子科技大学 Geological curved surface reconstruction method based on image segmentation
CN113129402A (en) * 2021-04-19 2021-07-16 中国航发沈阳发动机研究所 Cross section data cloud picture drawing method
CN113129402B (en) * 2021-04-19 2024-01-30 中国航发沈阳发动机研究所 Cross section data cloud picture drawing method
CN114296398A (en) * 2021-11-16 2022-04-08 中南大学 High-speed high-precision interpolation method for laser cutting
CN114296398B (en) * 2021-11-16 2024-04-05 中南大学 High-speed high-precision interpolation method for laser cutting
CN115631350A (en) * 2022-11-17 2023-01-20 博奥生物集团有限公司 Method and device for color recognition of can printing image

Also Published As

Publication number Publication date
GB0613199D0 (en) 2006-08-09
WO2008003944A3 (en) 2008-07-31

Similar Documents

Publication Publication Date Title
US11348285B2 (en) Mesh compression via point cloud representation
JP5249221B2 (en) Method for determining depth map from image, apparatus for determining depth map
US6771840B1 (en) Apparatus and method for identifying the points that lie on a surface of interest
CN100514367C (en) Stereo 3D reconstruction system and method based on color segmentation
CN102446343B (en) Reconstruction of sparse data
EP2143075B1 (en) A method and apparatus for real-time/on-line performing of multi view multimedia applications
US8452081B2 (en) Forming 3D models using multiple images
WO2008003944A2 (en) Image processing and vectorisation
Tsai et al. Dense disparity estimation with a divide-and-conquer disparity space image technique
RU2382406C1 (en) Method of improving disparity map and device for realising said method
JP4889182B2 (en) Depth map generation by hypothesis mixing in Bayesian framework
JP2004005596A (en) Method and system for 3D reconstruction of multiple views with changing search path and occlusion modeling
KR20020067514A (en) Image matching
CN102609974A (en) Virtual viewpoint image generation process on basis of depth map segmentation and rendering
US7209136B2 (en) Method and system for providing a volumetric representation of a three-dimensional object
Keren et al. Restoring subsampled color images
Lv et al. Light field depth estimation exploiting linear structure in EPI
Kwak et al. View synthesis with sparse light field for 6DoF immersive video
US7456831B2 (en) Method for generating 3D mesh based on unorganized sparse 3D points
WO2003060829A1 (en) Processing boundary information of a graphical object
KR20220078651A (en) arbitrary view creation
US7683914B1 (en) Triangulation based raster image interpolation
Sabbadin et al. Multi-View Ambient Occlusion for Enhancing Visualization of Raw Scanning Data.
US11676334B2 (en) Method and apparatus for plenoptic point clouds generation
US20250111546A1 (en) Sparse glcm: gray-level co-occurrence matrix computation for point cloud processing

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 07733440

Country of ref document: EP

Kind code of ref document: A2

NENP Non-entry into the national phase in:

Ref country code: DE

NENP Non-entry into the national phase in:

Ref country code: RU

122 Ep: pct application non-entry in european phase

Ref document number: 07733440

Country of ref document: EP

Kind code of ref document: A2

点击 这是indexloc提供的php浏览器服务,不要输入任何密码和下载