Creating a depth map
The invention relates to a method of generating a depth map comprising depth values representing distances to a viewer, for respective pixels of an image.
The invention further relates to a depth map generating unit for generating a depth map comprising depth values representing distances to a viewer, for respective pixels of an image.
The invention further relates to an image processing apparatus comprising: receiving means for receiving a signal corresponding to an image; and such a depth map generating unit for generating a depth map. The invention further relates to a computer program product to be loaded by a computer arrangement, comprising instructions to generate a depth map comprising depth values representing distances to a viewer, for respective pixels of an image, the computer arrangement comprising processing means and a memory.
In order to generate a 3D impression on a multi-view display device, images from different virtual viewpoints have to be rendered. This requires either multiple input views or some 3D or depth information to be present. This depth information can either be recorded, generated from multiview camera systems or generated from conventional 2D video material. For generating depth information from 2D video several types of depth cues can be applied, such as structure from motion, focus information, geometric shapes and dynamic occlusion. The aim is to generate a dense depth map, i.e. one depth value per pixel. This depth map is subsequently used in rendering a multi-view image to give the viewer a depth impression. In the article "Synthesis of multi viewpoint images at non-intermediate positions" by P.A. Redert, E.A. Hendriks, and J. Biemond, in Proceedings of International Conference on Acoustics, Speech, and Signal Processing, Vol. IV, ISBN 0-8186-7919-0, pages 2749-2752, IEEE Computer Society, Los Alamitos, California, 1997, a method of extracting depth information and a method of rendering a multi-view image on basis of the input image and the depth map are disclosed.
It is an object of the invention to provide a method of the kind described in the opening paragraph, which is based on a new depth cue.
This object of the invention is achieved in that the method comprises:
- segmenting the image into at least one group of pixels corresponding to a foreground object and a further group of pixels corresponding to background;
- assigning a first group of depth values corresponding to the further group of pixels on basis of a predetermined background depth profile;
- assigning a second group of depth values corresponding to the at least one group of pixels on basis of a predetermined foreground depth profile, whereby the assigning of the second group of depth values is based on a particular depth value of the background depth profile, the particular depth value belonging to a particular pixel which is located at a predetermined location relative to the at least one group of pixels.
The rationale behind the invention is that for most natural images, i.e. images captured by means of a camera, the depth values can be fitted relatively well to a predetermined model. The model comprises a background and foreground objects. It should be noted that the background can form one or more objects, e.g. the sky, a road, a sea or a meadow. Typically, the background extends over a relatively large part of the image. The background is modeled by means of a background depth profile. This background depth profile corresponds to a surface description. Typically, the background in an image corresponds to a horizontally oriented surface in world coordinates. That means that, because of the perspective projection, there is a spatial relation between pixel coordinates in an image and corresponding depth values. In the method according to the invention, depth values are assigned to the further group of pixels, i.e. the pixels belonging to the background, on basis of the background depth profile. That means that the coordinates of the pixels and their position relative to the background depth profile are used to determine the depth values corresponding to these pixels.
In general, in front of the background there are a number of foreground objects. The foreground objects are modeled by means of a foreground depth profile. Because of gravity, most objects are vertically oriented in world coordinates. That means that, typically foreground objects appearing in an image can be fitted relatively well with a predetermined foreground depth profile which is based on that assumption. In the method according to the invention, to pixels corresponding to a foreground object, depth values are assigned on basis of the foreground depth profile. That means that the coordinates of the
pixels and their position relative to the foreground depth profile are used to determine the depth values corresponding to these pixels.
The actual depth values to be used for assignment to pixels corresponding to foreground objects are based on the position of the foreground objects relative to the background. In real life, foreground objects are connected to the background. For instance, an object like a car is standing on the ground. That means that from a particular viewpoint the depth values of the car are substantially equal to the depth value of the part of the ground on which the car is standing. Alternatively, a lamp is hanging from the ceiling. That means that from a particular viewpoint the depth values of the lamp are substantially equal to the depth value of the part of the ceiling to which the lamp is directly connected.
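As an illustration only, the following Python sketch shows one possible implementation of these steps for the ground-based case, assuming that a binary foreground mask is already available from a separate segmentation step; the function names, the linear ramp used as background depth profile and the choice of the particular pixel one row below the object are illustrative assumptions of this sketch rather than prescriptions of the description.

```python
import numpy as np

def background_depth_profile(height, width, c=0.0, a=1.0):
    """Background profile of Fig. 1: depth increases from the bottom border
    (close to the viewer) towards the top border and is constant along each row.
    With rows numbered from the top, D = c + a * (distance to the bottom row)."""
    y_from_bottom = (height - 1 - np.arange(height, dtype=float)).reshape(height, 1)
    return np.broadcast_to(c + a * y_from_bottom, (height, width)).copy()

def generate_depth_map(foreground_mask, c=0.0, a=1.0):
    """foreground_mask: boolean array (H, W), True for pixels of the foreground object.
    Background pixels follow the background profile; all foreground pixels receive the
    depth of the 'particular pixel' located directly below the object."""
    height, width = foreground_mask.shape
    depth = background_depth_profile(height, width, c, a)
    if foreground_mask.any():
        rows, cols = np.nonzero(foreground_mask)
        anchor_row = min(rows.max() + 1, height - 1)  # one row below the lowest object pixel
        anchor_col = cols[rows.argmax()]              # column of (one of) the lowest object pixels
        sample = depth[anchor_row, anchor_col]        # sample of the background profile
        depth[foreground_mask] = sample               # constant foreground profile
    return depth
```

For the car example of Fig. 1 the mask would cover the pixels of the car and the sampled value would be taken from the ground directly below the car.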
In an embodiment of the method according to the invention, the background depth profile corresponds to an increasing function, whereby a relatively low depth value is assigned to a first one of the pixels of the further group of pixels which is located at a first border of the image. A relatively low depth value means that the corresponding pixel is relatively close to the viewer, while a relatively high depth value means that the corresponding pixel is relatively far away from the viewer. This background depth profile is a relatively simple profile which is appropriate to model the background in many images. Preferably, a relatively high depth value is assigned to a second one of the pixels of the further group of pixels which is located at a relatively large distance from the first one of the pixels, e.g. in the middle of the image. Alternatively, a relatively high depth value is assigned to a second one of the pixels of the further group of pixels which is located at a second border of the image. Typically, the first border corresponds to the bottom of the image and the second border corresponds to the top of the image.
In an embodiment of the method according to the invention, the first border corresponds to the bottom of the image and the second border corresponds to the top of the image. This depth profile corresponds to a substantially horizontally oriented surface.
In an embodiment of the method according to the invention, the particular pixel is located below the at least one group of pixels. This corresponds to the quite natural situation that an object is standing on something else, e.g. the ground. In an embodiment of the method according to the invention, the first border corresponds to the top of the image and the second border corresponds to the bottom of the image. This depth profile also corresponds to a substantially horizontally oriented surface, like a ceiling.
In an embodiment of the method according to the invention, the particular pixel is located above the at least one group of pixels. This corresponds to the quite natural situation that an object is hanging from something else, e.g. the ceiling.
In an embodiment of the method according to the invention, the foreground depth profile corresponds to a further function which increases less steeply than the increasing function of the background depth profile. Although many foreground objects are typically oriented vertically, and hence result in substantially mutually equal depth values, the method according to the invention is not limited to this. It is advantageous, e.g. for scaling depth values into a range of possible depth values which can be visualized by a certain display device, to apply alternative depth profiles. Preferably, the two depth profiles have a relation as described in claim 8. The effect of this is that the differences in depth values of consecutive pixel pairs located at the border of a foreground object are increasing. For instance, a first difference in depth values for a first pixel pair which is located adjacent to the particular pixel of claim 1, comprising a first pixel belonging to the foreground object and its neighboring pixel belonging to the background, is relatively low. However, a second difference in depth values for a second pixel pair which is located relatively far away from the particular pixel of claim 1, comprising a second pixel belonging to the foreground object and its neighboring pixel belonging to the background, is relatively high.
It is a further object of the invention to provide a depth map generating unit of the kind described in the opening paragraph, which is based on a new depth cue.
This object of the invention is achieved in that the generating unit comprises:
- segmentation means for segmenting the image into at least one group of pixels corresponding to a foreground object and a further group of pixels corresponding to background;
- first assigning means for assigning a first group of depth values corresponding to the further group of pixels on basis of a predetermined background depth profile;
- second assigning means for assigning a second group of depth values corresponding to the at least one group of pixels on basis of a predetermined foreground depth profile, whereby the assigning of the second group of depth values is based on a particular depth value of the background depth profile, the particular depth value belonging to a particular pixel which is located at a predetermined location relative to the at least one group of pixels.
It is a further object of the invention to provide an image processing apparatus comprising a depth map generating unit of the kind described in the opening paragraph which is arranged to generate a depth map based on a new depth cue.
This object of the invention is achieved in that the generating unit comprises:
- segmentation means for segmenting the image into at least one group of pixels corresponding to a foreground object and a further group of pixels corresponding to background;
- first assigning means for assigning a first group of depth values corresponding to the further group of pixels on basis of a predetermined background depth profile;
- second assigning means for assigning a second group of depth values corresponding to the at least one group of pixels on basis of a predetermined foreground depth profile, whereby the assigning of the second group of depth values is based on a particular depth value of the background depth profile, the particular depth value belonging to a particular pixel which is located at a predetermined location relative to the at least one group of pixels.
It is a further object of the invention to provide a computer program product of the kind described in the opening paragraph, which is based on a new depth cue.
This object of the invention is achieved in that the computer program product, after being loaded, provides said processing means with the capability to carry out:
- segmenting the image into at least one group of pixels corresponding to a foreground object and a further group of pixels corresponding to background;
- assigning a first group of depth values corresponding to the further group of pixels on basis of a predetermined background depth profile;
- assigning a second group of depth values corresponding to the at least one group of pixels on basis of a predetermined foreground depth profile, whereby the assigning of the second group of depth values is based on a particular depth value of the background depth profile, the particular depth value belonging to a particular pixel which is located at a predetermined location relative to the at least one group of pixels.
Modifications of the depth map generating unit and variations thereof may correspond to modifications and variations of the image processing apparatus, the method and the computer program product being described.
These and other aspects of the depth map generating unit, of the image processing apparatus, of the method and of the computer program product, according to the invention will become apparent from and will be elucidated with respect to the implementations and embodiments described hereinafter and with reference to the accompanying drawings, wherein:
Fig. 1 schematically shows an image and the corresponding depth map being generated with the method according to the invention;
Fig. 2 schematically shows another image and the corresponding depth map being generated with the method according to the invention;
Figs. 3A and 3B schematically show results of segmentation;
Fig. 4 schematically shows three depth profiles in one direction;
Fig. 5 schematically shows a multi-view image generation unit comprising a depth map generation unit according to the invention; and
Fig. 6 schematically shows an embodiment of the image processing apparatus according to the invention.
Same reference numerals are used to denote similar parts throughout the figures.
Fig. 1 schematically shows an image 100 and the corresponding depth map 122 being generated with the method according to the invention. Fig. 1 shows an image 100 representing an object, i.e. a car, and shows the ground on which the car is standing. The image 100 is segmented into a first group of pixels 104 corresponding to the object and a second group of pixels 102 corresponding to the background. In this case no further objects are present in the image 100. That means that each pixel of the image 100 belongs to either the first group of pixels 104 or to the second group of pixels 102.
Fig. 1 schematically shows a predetermined background depth profile 110. The gray values of the background depth profile 110 correspond to depth values. The background depth profile 110 corresponds to a monotonically increasing function in a first direction, i.e. the vertical direction. The increasing function is such that a relatively low depth value is assigned to pixels which are located at the bottom border of the image 100 and that a relatively high depth value is assigned to pixels which are located at the top border of the image 100. The background depth profile 110 also corresponds to a constant function in a second direction, i.e. the horizontal direction. The constant function is such that horizontally neighboring pixels will be assigned mutually equal depth values.
Fig. 1 also shows a predetermined foreground depth profile 112. The gray values of the foreground depth profile 112 correspond to depth values. The foreground depth profile 112 corresponds to a constant function in two orthogonal directions. That means that all depth values will be mutually equal. This actual depth value is determined on basis of the predetermined background depth profile 110 and a particular pixel 106 which is located in the image 100 below the first group of pixels 104 which correspond to the car. In Fig. 1 it is indicated that the actual depth value is derived from the background depth profile 110 by taking a sample 108 from the background depth profile 110 on basis of the coordinates of the particular pixel 106. In this case, all depth values of the predetermined foreground depth profile 112 are equal to the depth value of the sample 108 of the predetermined background profile 110.
Fig. 1 schematically shows that on basis of the segmentation and the predetermined background depth profile 110 the second group of pixels 102 are assigned appropriate depth values. In Fig. 1 the assigned depth values corresponding to the background are referred to with reference number 114.
Fig. 1 schematically shows that on basis of the segmentation and the predetermined foreground depth profile 112 the first group of pixels 104 are assigned appropriate depth values. In Fig. 1 the assigned depth values corresponding to the car are referred to with reference number 116. Fig. 1 schematically shows the final combination of the assignment of depth values to the respective pixels of the image, i.e. the final depth map 122.
Fig. 2 schematically shows another image 200 and the corresponding depth map 222 being generated with the method according to the invention. Fig. 2 shows an image 200 representing an object, i.e. a lamp, and shows the ceiling from which the lamp is hanging. The image 200 is segmented into a first group of pixels 204 corresponding to the object and a second group of pixels 202 corresponding to the background. In this case no further objects are present in the image 200. That means that each pixel of the image 200 belongs to either the first group of pixels 204 or to the second group of pixels 202.
Fig. 2 schematically shows a predetermined background depth profile 210. The gray values of the background depth profile 210 correspond to depth values. The background depth profile 210 corresponds to a monotonically decreasing function in a first direction, i.e. the vertical direction. The decreasing function is such that a relatively high depth value is assigned to pixels which are located at the bottom border of the image 200 and that a relatively low depth value is assigned to pixels which are located at the top border of
the image 200. The background depth profile 210 also corresponds to a constant function in a second direction, i.e. the horizontal direction. The constant function is such that horizontally neighboring pixels will be assigned mutually equal depth values.
Fig. 2 also shows a predetermined foreground depth profile 212. The gray values of the foreground depth profile 212 correspond to depth values. The foreground depth profile 212 corresponds to a constant function in two orthogonal directions. That means that all depth values will be mutually equal. This actual depth value is determined on basis of the predetermined background depth profile 210 and a particular pixel 206 which is located in the image 200 above the first group of pixels 204 which correspond to the lamp. In Fig. 2 it is indicated that the actual depth value is derived from the background depth profile 210 by taking a sample 208 from the background depth profile 210 on basis of the coordinates of the particular pixel 206. In this case, all depth values of the predetermined foreground depth profile 212 are equal to the depth value of the sample 208 of the predetermined background profile 210. Fig. 2 schematically shows that on basis of the segmentation and the predetermined background depth profile 210 the second group of pixels 202 are assigned appropriate depth values. In Fig. 2 the assigned depth values corresponding to the background are referred to with reference number 214.
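Purely as an illustration, the ceiling scene of Fig. 2 can be handled with the same kind of sketch as given for the car example, again assuming a precomputed binary mask; only the direction of the background ramp is reversed and the particular pixel is taken one row above the object. The names and parameter values are illustrative assumptions.

```python
import numpy as np

def generate_ceiling_depth_map(foreground_mask, c=0.0, a=1.0):
    """Fig. 2 variant: the background depth increases from the top border (close to
    the viewer, e.g. the ceiling) towards the bottom border, and the particular pixel
    is taken one row above the object hanging from the ceiling."""
    height, width = foreground_mask.shape
    row = np.arange(height, dtype=float).reshape(height, 1)  # 0 at the top border
    depth = np.broadcast_to(c + a * row, (height, width)).copy()
    if foreground_mask.any():
        rows, cols = np.nonzero(foreground_mask)
        anchor_row = max(rows.min() - 1, 0)     # one row above the topmost object pixel
        anchor_col = cols[rows.argmin()]
        depth[foreground_mask] = depth[anchor_row, anchor_col]
    return depth
```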
Fig. 2 schematically shows that on basis of the segmentation and the predetermined foreground depth profile 212 the first group of pixels 204 are assigned appropriate depth values. In Fig. 2 the assigned depth values corresponding to the lamp are referred to with reference number 216.
Fig. 2 schematically shows the final combination of the assignment of depth values to the respective pixels of the image, i.e. the final depth map 222.
Figs. 3A and 3B schematically show results of segmentation. Segmentation is an image processing process whereby the pixels of the image are classified and assigned to one of a plurality of groups of pixels, i.e. segments. The segmentation is performed on basis of pixel values. With pixel values, color and/or luminance values are meant. Typically, such a group of pixels is surrounded by a contour. There are several known techniques in the field of image processing for determining segments and/or contours. They can e.g. be determined by means of edge detection, homogeneity calculation or temporal filtering. Contours can be open or closed.
Figs. 3A and 3B schematically show images and contours which are found on basis of edge detection in the images. Detecting edges might be based on spatial high-pass
filtering of individual images. However, the edges are preferably detected on basis of mutually comparing multiple images, in particular by computing pixel value differences of subsequent images of the sequence of video images. A first example of the computation of the pixel value differences E(x,y,n) is given in Equation 1:

E(x,y,n) = |I(x,y,n) - I(x,y,n-1)|        (1)

with I(x,y,n) the luminance value of the pixel with coordinates x and y of the image at time n. Alternatively, the pixel value differences E(x,y,n) are computed on basis of color values:

E(x,y,n) = |C(x,y,n) - C(x,y,n-1)|        (2)

with C(x,y,n) a color value of the pixel with coordinates x and y of the image at time n. In Equation 3 a further alternative is given for the computation of the pixel value differences E(x,y,n), based on the three different color components R (Red), G (Green) and B (Blue):

E(x,y,n) = max(|R(x,y,n) - R(x,y,n-1)|, |G(x,y,n) - G(x,y,n-1)|, |B(x,y,n) - B(x,y,n-1)|)        (3)

Optionally, the pixel value difference signal E is filtered by clipping all pixel value differences which are below a predetermined threshold to a constant, e.g. zero.
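A minimal sketch of this frame-difference measure in Python is given below, assuming 8-bit input frames; the function name and the threshold value are illustrative assumptions. The maximum over the R, G and B differences corresponds to Equation 3, and the clipping step to the optional filtering mentioned above.

```python
import numpy as np

def pixel_value_differences(frame_prev, frame_curr, threshold=10.0):
    """Compute E(x,y,n) from two subsequent frames of shape (H, W) for luminance
    (Equation 1) or (H, W, 3) for RGB (Equation 3), with optional clipping."""
    diff = np.abs(frame_curr.astype(float) - frame_prev.astype(float))
    if diff.ndim == 3:
        diff = diff.max(axis=2)       # maximum over the R, G and B differences
    diff[diff < threshold] = 0.0      # clip small differences to zero
    return diff
```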
Optionally, a morphologic filter operation is applied to remove all spatially small edges. Morphologic filters are common non-linear image processing units. See for instance the article "Low-level image processing by max-min filters" by P.W. Verbeek, H.A. Vrooman and L.J. van Vliet, in "Signal Processing", vol. 15, no. 3, pp. 249-258, 1988. Edge detection might also be based on motion vector fields. That means that regions in motion vector fields having a relatively large motion vector contrast are detected. These regions correspond with edges in the corresponding image. Optionally, the edge detection unit is also provided with pixel values, i.e. color and/or luminance values of the video images. Motion vector fields are e.g. provided by a motion estimation unit as specified in the article "True-Motion Estimation with 3-D Recursive Search Block Matching" by G. de Haan et al. in IEEE Transactions on Circuits and Systems for Video Technology, vol. 3, no. 5, October 1993, pages 368-379.
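One possible way to carry out such a removal of spatially small edges is a morphological opening of the binarized edge signal, e.g. with SciPy as sketched below; the use of scipy.ndimage and the size of the structuring element are assumptions for illustration, not prescriptions of the description.

```python
import numpy as np
from scipy import ndimage

def remove_small_edges(edge_map, min_size=3):
    """Binarize the pixel value difference signal and apply a morphological opening,
    so that edge fragments smaller than the structuring element are removed."""
    binary = edge_map > 0
    return ndimage.binary_opening(binary, structure=np.ones((min_size, min_size)))
```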
Fig. 3A schematically shows an image 300 comprising a first segment 302 corresponding to an object which is located in front of the background and a second segment 304 corresponding to the background. The first segment 302 is surrounded by a closed contour 306. This contour is located on an edge of the first segment 302, i.e. on the border between the first segment 302 and the second segment 304. In the case of a closed contour it is relatively easy to determine which pixels belong to the first segment and which pixels do not
belong to the first segment. The group of pixels which are inside the contour 306 belong to the first segment 302. The other group of pixels 304 which are located outside the contour 306 do not belong to the first segment 302. In the event of a closed contour it is straightforward to determine a particular pixel 308 which is located at a predetermined location relative to the first segment 302, for instance below the first segment 302.
Fig. 3B shows an image 310 in which an open contour 312 is drawn. This contour is located on an edge of the first segment, i.e. on the border between the first segment and a second segment. Unfortunately, there is not a distinct edge between the group of pixels which are assumed to belong to the first segment and another group of pixels which are assumed not to belong to the first segment. Hence, in the event of an open contour it is not straightforward to determine which pixels belong to the first segment and which do not belong to the first segment. An option to deal with this issue is closing the contour, which is found on basis of edge detection, by connecting the two endpoints of the open contour. In Fig. 3B this is indicated with a line segment with reference number 318. Subsequently, it is straightforward to determine a particular pixel 314 which is located (e.g.) below the line segment 318. Alternatively, the particular pixel 316 is located (e.g.) below a first one of the endpoints of the open contour, or the particular pixel 320 is located (e.g.) below a second one of the endpoints of the open contour.
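As a small illustration, assuming the open contour is represented as an ordered list of (row, column) pixel positions, the closing line segment and a candidate particular pixel below it could be chosen as sketched below; the representation and the choice of the midpoint are assumptions, and, as noted above, a pixel below one of the endpoints could be taken instead.

```python
def particular_pixel_below_open_contour(contour_points, image_height):
    """contour_points: ordered list of (row, col) pixels on the open contour.
    Returns a pixel one row below the midpoint of the segment connecting the
    two endpoints of the contour, clipped to the image."""
    (r0, c0), (r1, c1) = contour_points[0], contour_points[-1]  # the two endpoints
    mid_row, mid_col = (r0 + r1) // 2, (c0 + c1) // 2           # midpoint of the closing segment
    return min(mid_row + 1, image_height - 1), mid_col
```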
Fig. 4 schematically shows three depth profiles 400-402 in one direction. The x-axis 406 corresponds to a spatial dimension, i.e. direction. In this case, going from left to right in Fig. 4 corresponds to a vertical direction from bottom to top in an image 100. The y-axis corresponds to depth. A first one of the depth profiles 402 as depicted in Fig. 4 corresponds to the predetermined background depth profile 110 as described in connection with Fig. 1. The function of the first one of the depth profiles 402 is specified in Equation 4:

D(p) = c + a·y(p)        (4)

In words, the depth value D(p) for pixel p is equal to the product of a constant a and the y-coordinate y(p) of the pixel, added to a constant c.
A second one of the depth profiles 400 as depicted in Fig. 4 corresponds to the predetermined foreground depth profile 112 as described in connection with Fig. 1. The function of the second one of the depth profiles 400 is specified in Equation 5:

D(p) = c        (5)

In words, the depth value D(p) for pixel p is equal to a constant c. As described in connection with Fig. 1, the foreground depth profile 112 having a single value represents a vertically oriented object in world coordinates.
Instead of using such a constant value foreground depth profile, alternative foreground depth profiles are possible. In Fig. 4 a third depth profile 401 is depicted which is appropriate to be used as foreground depth profile in combination with the first one of the profiles 402 as background depth profile. It can be clearly seen that by using these two depth profiles 402, 401 an important aspect of the invention is still fulfilled: the differences in depth values of consecutive pixel pairs located at the border of a foreground object are increasing. For instance, a first difference 408 in depth values for a first pixel pair which is located adjacent to the particular pixel 106, comprising a first pixel belonging to the foreground object and its neighboring pixel belonging to the background, is relatively low. However, a second difference in depth values for a second pixel pair which is located relatively far away from the particular pixel 106, comprising a second pixel belonging to the foreground object and its neighboring pixel belonging to the background, is relatively high.
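A small numerical illustration of this effect, with purely assumed slope values, is given below: both profiles start at the particular pixel, the foreground profile 401 increases less steeply than the background profile 402, and the depth difference across the object border therefore grows with the distance to the particular pixel.

```python
a_background, a_foreground = 1.0, 0.25  # assumed slopes; foreground slope < background slope
anchor_y = 10                           # y-coordinate of the particular pixel 106
for y in (11, 20, 40):                  # rows increasingly far from the anchor
    difference = (a_background - a_foreground) * (y - anchor_y)
    print(y, difference)                # prints 0.75, 7.5 and 22.5
```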
It will be clear that alternative depth profiles can be selected. Besides that, it is also possible to determine actual depth values by means of fusion techniques for the fusion of depth profiles. S. Pankanti, A.K. Jain, and M. Tuceryan describe a fusion technique in the article "On Integration of Vision Modules", in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, pp. 316-322, 1994. The concept of fusion means that multiple information streams are combined into a single stream of information. Below, a set of equations is provided which can be used for such a fusion technique. In that case a first information stream corresponds to depth and a second information stream corresponds to the derivative of depth.
Segmentation is denoted as follows:

S(pq) = { 0   no segmentation boundary between pixels p and q
        { 1   a segmentation boundary between pixels p and q        (6)

The fusion system requires as input the depth D, the depth derivative ∇D, and their variances σD and σ∇D. These can be used to define the depth profiles and the relation with the segmentation as follows:

background profile (cf. Equation 4):   D(p) = c + a·y(p)

background boundary:   σD(p) = { ∞   p is at the image border
                               { b   p is within the image interior        (7)

derivative foreground profile:   ∇D(pq) = 0        (8)

object boundaries:   σ∇D(pq) = { ∞   S(pq) = 1
                               { e   S(pq) = 0        (9)

The dynamic range of the resultant depth map can be adjusted with a. The numbers b and e regulate the trade-off between the background and foreground profiles. Typical values are in between 0.1 and 10. Possibly, e can be chosen differently for vertically and horizontally neighboring pixel pairs pq. The infinities in (7) and (9) are typically implemented by large numbers, e.g. 100-1000.
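One possible reading of this fusion system, purely as an illustrative sketch, is a weighted least-squares problem in which the profile term is weighted by 1/σD² and the zero-derivative term by 1/σ∇D², solved here with a plain Jacobi-style iteration; the objective, the weight convention and the solver are assumptions of this sketch, and the actual computation of the final depth map is described in the patent application cited below.

```python
import numpy as np

def fuse_depth(profile, seg_boundary_v, seg_boundary_h, b=1.0, e=1.0,
               inf=1000.0, iterations=500):
    """profile:        background profile D(p) = c + a*y(p), array of shape (H, W).
    seg_boundary_v: True where a segmentation boundary lies between vertically
                    neighboring pixels, shape (H-1, W); seg_boundary_h likewise for
                    horizontally neighboring pixels, shape (H, W-1)."""
    h, w = profile.shape
    # Equation (7): sigma_D is 'infinite' at the image border, b in the interior.
    sigma_d = np.full((h, w), float(b))
    sigma_d[0, :] = sigma_d[-1, :] = sigma_d[:, 0] = sigma_d[:, -1] = inf
    w_data = 1.0 / sigma_d ** 2
    # Equations (8)-(9): zero-derivative target, relaxed across object boundaries.
    w_v = 1.0 / np.where(seg_boundary_v, inf, float(e)) ** 2
    w_h = 1.0 / np.where(seg_boundary_h, inf, float(e)) ** 2

    depth = profile.copy()
    for _ in range(iterations):
        num = w_data * profile
        den = w_data.copy()
        # vertical neighbor terms (target D(p) - D(q) = 0, weight 1/sigma^2)
        num[:-1, :] += w_v * depth[1:, :]
        den[:-1, :] += w_v
        num[1:, :] += w_v * depth[:-1, :]
        den[1:, :] += w_v
        # horizontal neighbor terms
        num[:, :-1] += w_h * depth[:, 1:]
        den[:, :-1] += w_h
        num[:, 1:] += w_h * depth[:, :-1]
        den[:, 1:] += w_h
        depth = num / den
    return depth
```

In line with the trade-off mentioned above, choosing b small relative to e pulls the result towards the background ramp, whereas choosing e small relative to b flattens the foreground objects.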
Computing depth values on basis of depth derivatives is described in the patent application with the title "Full Depth Map Acquisition". This patent application has filing number EP 03100092.0 (Attorney Docket number PHNL030006). Via the method as disclosed in said patent application the final depth map D_result is obtained.
Fig. 5 schematically shows a multi-view image generation unit 500 comprising a depth map generation unit 501 according to the invention. The multi-view image generation unit 500 is arranged to generate a sequence of multi-view images on basis of a sequence of video images. The multi-view image generation unit 500 is provided with a stream of video images at the input connector 508 and provides two correlated streams of video images at the output connectors 510 and 512, respectively. These two correlated streams of video images are to be provided to a multi-view display device which is arranged to visualize a first series of views on basis of the first one of the correlated streams of video images and to visualize a second series of views on basis of the second one of the correlated streams of video images. If a user, i.e. viewer, observes the first series of views with his left eye and the second series of views with his right eye he perceives a 3D impression. It might be that the first one of the correlated streams of video images corresponds to the sequence of video images as received and that the second one of the correlated streams of video images is rendered on basis of the sequence of video images as received. Preferably, both streams of video images are rendered on basis of the sequence of video images as received. The rendering is e.g. as
described in the article "Synthesis of multi viewpoint images at non-intermediate positions" by P.A. Redert, E.A. Hendriks, and J. Biemond, in Proceedings of International Conference on Acoustics, Speech, and Signal Processing, Vol. IV, ISBN 0-8186-7919-0, pages 2749-2752, IEEE Computer Society, Los Alamitos, California, 1997. Alternatively, the rendering is as described in "High-quality images from 2.5D video", by R.P. Berretty and F.E. Ernst, in Proceedings Eurographics, Granada, 2003, Short Note 124.
The multi-view image generation unit 500 comprises:
- a depth map generation unit 501 for generating depth maps for the respective input images; and
- a rendering unit 506 for rendering the multi-view images on basis of the input images and the respective depth maps, which are provided by the depth map generation unit 501.
The depth map generating unit 501 for generating depth maps comprising depth values representing distances to a viewer, for respective pixels of the images, comprises:
- a segmentation unit 502 for segmenting the image into at least one group of pixels corresponding to a foreground object and a further group of pixels corresponding to background;
- a first assigning unit 503 for assigning a first group of depth values corresponding to the further group of pixels on basis of a predetermined background depth profile;
- a second assigning unit 504 for assigning a second group of depth values corresponding to the at least one group of pixels on basis of a predetermined foreground depth profile, whereby the assigning of the second group of depth values is based on a particular depth value of the background depth profile, the particular depth value belonging to a particular pixel which is located at a predetermined location relative to the at least one group of pixels.
The segmentation unit 502, the first assigning unit 503, the second assigning unit 504 and the rendering unit 506 may be implemented using one processor. Normally, these functions are performed under control of a software program product. During execution, normally the software program product is loaded into a memory, like a RAM, and executed from there. The program may be loaded from a background memory, like a ROM, hard disk, or magnetic and/or optical storage, or may be loaded via a network like the Internet. Optionally, an application specific integrated circuit provides the disclosed functionality.
It should be noted that, although the multi-view image generation unit 500 as described in connection with Fig. 5 is designed to deal with video images, alternative embodiments of the depth map generation unit according to the invention are arranged to generate depth maps on basis of individual images, i.e. still pictures.
Fig. 6 schematically shows an embodiment of the image processing apparatus 600 according to the invention, comprising:
- a receiving unit 602 for receiving a video signal representing input images;
- a multi-view image generation unit 500 for generating multi-view images on basis of the received input images, as described in connection with Fig. 5; and
- a multi-view display device 606 for displaying the multi-view images as provided by the multi-view image generation unit 500.
The video signal may be a broadcast signal received via an antenna or cable but may also be a signal from a storage device like a VCR (Video Cassette Recorder) or
Digital Versatile Disk (DVD). The signal is provided at the input connector 510. The image processing apparatus 600 might e.g. be a TV. Alternatively, the image processing apparatus 600 does not comprise the optional display device but provides the output images to an apparatus that does comprise a display device 606. Then the image processing apparatus 600 might e.g. be a set-top box, a satellite tuner, a VCR player, or a DVD player or recorder.
Optionally the image processing apparatus 600 comprises storage means, like a hard-disk or means for storage on removable media, e.g. optical disks. The image processing apparatus 600 might also be a system being applied by a film-studio or broadcaster.
It should be noted that the above-mentioned embodiments illustrate rather than limit the invention and that those skilled in the art will be able to design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word 'comprising' does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention can be implemented by means of hardware comprising several distinct elements and by means of a suitably programmed computer. In the claims enumerating several means, several of these means can be embodied by one and the same item of hardware. The usage of the words first, second and third, etcetera does not indicate any ordering. These words are to be interpreted as names.