
WO1992001998A1 - Method and apparatus for automatic text separation using automatic electronic filtering of multiple drop-out colors for optical character recognition of preprinted forms - Google Patents

Publication number
WO1992001998A1
Authority
WO
WIPO (PCT)
Prior art keywords
color
pixel
grey
scale
processing
Prior art date
Application number
PCT/US1991/005040
Other languages
French (fr)
Inventor
Peter Rudak
Original Assignee
Eastman Kodak Company
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Eastman Kodak Company
Publication of WO1992001998A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N1/00 Scanning, transmission or reproduction of documents or the like, e.g. facsimile transmission; Details thereof
    • H04N1/46 Colour picture communication systems
    • H04N1/56 Processing of colour picture signals
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/10 Image acquisition
    • G06V10/12 Details of acquisition arrangements; Constructional details thereof
    • G06V10/14 Optical characteristics of the device performing the acquisition or on the illumination arrangements
    • G06V10/143 Sensing or illuminating at different wavelengths

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Character Input (AREA)
  • Battery Mounting, Suspending (AREA)

Abstract

Method and apparatus for the separation of text from previously printed material based on the assumption that text used to fill out business forms can be categorized as 'carbon based' and that such text would pass as black regardless of any color filters employed. Text can be separated from any pre-printed color information by applying an 'all-color filter'.

Description

METHOD AND APPARATUS FOR AUTOMATIC TEXT SEPARATION USING AUTOMATIC ELECTRONIC FILTERING OF MULTIPLE DROP-OUT COLORS FOR OPTICAL CHARACTER RECOGNITION OF PREPRINTED FORMS

Technical Field of the Invention
The invention relates to the automatic selection and detection of a drop-out color using a color electronic scanner and, more particularly, allows the Optical Character Recognition (OCR) system to adjust the filtering parameters automatically based on the form itself, rather than matching the form to the optical filter.

Background of the Invention
Optical Character Recognition (OCR) is a useful technique for processing business forms. Machine reading systems can replace several data-entry operators and reduce the expense of data capture.
In general, the first step of the OCR process is electronic scanning of the document and converting all of the information to a digital bit-map. Once the image is captured in an electronic format, the information to be read is separated from the background information: boxes and guide text must be ignored and the filled-out text should be read. Once this separation is accomplished, the electronic image of the text is processed by the OCR algorithm, where the characters of interest are converted to ASCII data.

Almost all OCR systems processing business forms employ the technique of a "drop-out color". By printing documents in a predetermined color (usually a pastel color) and employing an optical filter of the same color in the electronic scanner, the filled-out text on the document can be separated from the printed form. The color filter causes the scanner to ignore information printed in that color (to the electronic scanner, the form color appears as being equivalent to the white background of the paper). However, since the filled-out text typically is typed or printed in black (or another dark color), this information is captured by the scanner as black. Hence, the pre-printed form is converted to a white background and the filled-out text can be processed readily by an OCR algorithm.

Use of the optical filter works well in this application, but it limits the customer to a very specific color on the form (one that precisely matches the characteristics of the optical filter installed in the scanner). Additional drop-out colors can be included in the scanner by adding additional optical filters. Accordingly, the processing of a particular form would require selecting the proper optical filter and mechanically inserting it prior to processing the form.

However, slight variations in the printing process can produce variability in the actual color of the printed form, thereby reducing the "drop-out" effect. Such changes can cause noise to be added (the scanner sees the pre-printed form information as black instead of white), which may result in the OCR algorithm producing erroneous results. Changing optical filters to accommodate these slight variations in printing is not practical, since this would require a large inventory of filters, each with slightly different characteristics. Therefore, the only practical way to control this problem is to tightly control the printing process to ensure a uniform drop-out color. As a result, OCR form-reading systems presently in use are generally "closed loop", which means the forms processing firm (such as an insurance carrier) must maintain control over the printing of the forms, because forms created by outside establishments may not read properly due to color variations.
With the present invention, the scanner would separate all images into the three primary colors: red, green and blue. A black and white rendition of the image can be produced simply by adding the three color components. By independently processing the red, green and blue signals, it is possible to segregate color information from the common black and white information, so that the apparatus filters all colors, leaving only the high contrast text for OCR reading.

Disclosure of the Invention

In the present invention, three digital channels are multiplied by appropriate coefficients to ensure uniform color and amplitude response among all pixels. Once the three signals have been corrected for uniformity, they are processed as independent video signals to create three binary representations of the image. By combining these signals in such a way as to preserve only the information common to all three channels, an "all color" filter is created which can separate "black" text from any color pre-printed information. In effect, the three outputs represent document images using all possible combinations of drop-out colors in color space.

Brief Description of the Drawings
Figure 1 illustrates the configuration of a solid-state charge-coupled device that can be used for color scanning;
Figure 2 illustrates a block diagram of the circuit used for electronic color filtering in accordance with the invention; and
Figures 3A-B illustrate a flow chart that is used in conjunction with white calibration.

Modes of Carrying Out the Invention
Figure 1 illustrates the type of electronic scanner used to generate a programmable drop-out color. This scanner would separate all images into the three primary colors: red, green, and blue. A black and white rendition of the image (as a typical electronic scanner would produce today) can be produced simply by adding the three color components. The electronic scanner intended for use in the present apparatus is based on a "contact type" CCD (Charge Coupled Device) 10 currently available as Model TCD126C, made by Toshiba. The CCD is actually several CCD arrays on a single substrate and has a horizontal resolution of 1200 pixels/inch and spans 12 inches. Because most OCR algorithms can read accurately with scan resolutions of 200 to 400 pixels/inch, the added resolution can be used for color detection. Such detection is accomplished by masking adjacent pixels with appropriate red, green and blue optical filters, with the spectral content of these filters being based on the spectral characteristics of the CCD device itself. As shown in Fig. 1, three adjacent cells 12, 14, and 16 form a single "super-pixel" 18, with cells 12, 14, and 16 being masked by red, green and blue optical filters 20, 22, and 24 respectively. If each pixel corresponds to 1/1200 of an inch, the effective resolution of the CCD device would be 400 pixels/inch. The output of this scanner consists of three channels: red 26, green 28, and blue 30 video signals.

Figure 2 illustrates a block diagram for use in automatic text separation for OCR reading as well as full image capture. The color scanner 10 outputs three video signals per pixel, Red 26, Green 28, and Blue 30, in a segmented fashion for each scan line. The R, G, B signals are converted to a grey-scale digital representation by respective A/D converters 32, 34 and 36. Each pixel's Red, Green, and Blue component is then fed to multipliers 38, 40 and 42 respectively.
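The super-pixel arrangement of Fig. 1 can be illustrated with a minimal sketch. It assumes the cells arrive as one interleaved sequence in R, G, B order (the text states only that adjacent cells are masked by red, green and blue filters); the function names and sample values are hypothetical.

```python
# Illustrative sketch only: names, cell ordering and sample values are assumptions.
CELLS_PER_INCH = 1200        # native cell pitch quoted in the description
CELLS_PER_SUPER_PIXEL = 3    # one red, one green and one blue cell


def effective_resolution(cells_per_inch=CELLS_PER_INCH):
    """Super-pixels per inch when three adjacent cells form one pixel."""
    return cells_per_inch // CELLS_PER_SUPER_PIXEL


def split_scan_line(cells):
    """Split an interleaved R,G,B,R,G,B,... cell sequence into three channels."""
    # Ignore any trailing cells that do not form a complete super-pixel.
    usable = len(cells) - len(cells) % CELLS_PER_SUPER_PIXEL
    red = cells[0:usable:3]
    green = cells[1:usable:3]
    blue = cells[2:usable:3]
    return red, green, blue


# Three super-pixels: light, dark, and one with a strong red component.
line = [250, 248, 251, 40, 42, 39, 240, 90, 85]
red, green, blue = split_scan_line(line)
print(effective_resolution(), red, green, blue)
```

Each returned list corresponds to one of the three video channels (26, 28, 30) that feed the A/D converters.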
The Microprocessor and RAM Storage Subsystem 52 monitors each pixel within a scan line to ensure proper correlation between pixel video data and calibration coefficients 46, 48, and 50, which are sent to the corresponding multipliers 38, 40, and 42 for their respective color channels. The output of these multipliers is in the form of a segmented bit stream of Calibrated Red 56, Green 57, and Blue 58 pixels which can be used as a grey-scale color image. This calibrated color information is also fed to summing junction 59, where the three color components for each pixel are added to form a grey-scale black and white image as its output. In addition, the calibrated Red 56, Green 57, and Blue 58 video data is fed back to Microprocessor and RAM Storage Subsystem 52 for diagnostic purposes. The calibrated Red 56, Green 57, and Blue 58 video data is also processed by respective Threshold Circuits 41, 43 and 45, which create 1 bit/pixel video data for Red, Green and Blue. Threshold Circuits 41, 43 and 45 may be in the form of a simple comparator or be as elaborate as an M x N convolution filter with adaptive thresholding. The output of each threshold circuit 41, 43 and 45 is binary, where a "1" corresponds to a "dark" pixel and a "0" corresponds to a "light" pixel.
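As a rough sketch of the summing junction and of the simplest form of threshold circuit (a plain comparator), the following assumes 8-bit calibrated samples in which a larger value means a lighter pixel. The reference level and function names are hypothetical and are not taken from the specification.

```python
# Sketch under stated assumptions: 8-bit samples, larger value = lighter pixel.
DARK_THRESHOLD = 128  # hypothetical comparator reference level


def sum_channels(red, green, blue):
    """Summing junction: add the three colour components of each pixel."""
    # With 8-bit inputs the sum spans 0..765, so it needs a wider word.
    return [r + g + b for r, g, b in zip(red, green, blue)]


def threshold_channel(samples, threshold=DARK_THRESHOLD):
    """Simple comparator: 1 marks a dark pixel, 0 marks a light pixel."""
    return [1 if s < threshold else 0 for s in samples]


red = [250, 40, 240]
green = [248, 42, 90]
blue = [251, 39, 85]

grey = sum_channels(red, green, blue)
red_bits = threshold_channel(red)
green_bits = threshold_channel(green)
blue_bits = threshold_channel(blue)
print(grey, red_bits, green_bits, blue_bits)
```

An adaptive M x N filter would replace the fixed `threshold` with a value derived from neighbouring pixels; that variant is not shown here.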
All three of these binary signals (corresponding to Red, Green and Blue values for each pixel) are sent to an AND gate 63. If any of the three color components of a given pixel is "light" or "0", the output of the AND gate will be a "0", corresponding to "white". In the event all three color components of a pixel are "1" or "dark", the output of the AND gate will be a "1", corresponding to "black".
The output of AND gate 63 can be considered a "text" output for typical forms employing a pastel drop-out color. Color background information is filtered out and only typed text information is passed on to an OCR algorithm. For example, a form printed with non-carbon red ink and filled out with a typewriter using a carbon-based ribbon could easily be processed using this invention. Unlike a conventional scanning system, which would produce an image of all printed material (pre-printed red and typed black), the present invention would produce an image only of the typed text, ignoring the pre-printed red.
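The behaviour of AND gate 63 can be sketched in a few lines. The three example pixels (blank paper, red form ink, typed text) and their binary values are illustrative assumptions about how such inks might threshold, not measured data.

```python
def and_gate(red_bits, green_bits, blue_bits):
    """Return 1 only where all three binary channels report a dark pixel."""
    return [r & g & b for r, g, b in zip(red_bits, green_bits, blue_bits)]


# One bit per pixel and channel: 1 = dark, 0 = light.
# Pixel order: blank paper, red form ink, carbon-based typed text.
red_bits = [0, 0, 1]    # red ink reflects red light, so it reads as light here
green_bits = [0, 1, 1]
blue_bits = [0, 1, 1]

text_bits = and_gate(red_bits, green_bits, blue_bits)
print(text_bits)
```

Only the last pixel, which is dark in every channel, survives as "text"; the red ink is dropped because one channel sees it as light.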
By closely matching an optical red filter with the color of ink used for the pre-printed information and using the filter with a conventional scanning system, similar results can be achieved. However, such a red optical filter could not drop out a green ink. Advantageously, the present invention can filter out any non-carbon ink, thereby providing greater flexibility. The user could use inter-mixed documents of different colors without worrying about changing filters.
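The contrast between a fixed optical filter and the all-color approach can be sketched as follows. This is a simplified model: the red optical filter is approximated as viewing only the red component, and the reflectance triples and threshold are invented for illustration.

```python
THRESHOLD = 128  # hypothetical comparator level; below it a sample counts as dark

# Illustrative (R, G, B) samples, not measured data.
SAMPLES = {
    "white paper": (250, 250, 250),
    "red form ink": (230, 60, 50),
    "green form ink": (50, 200, 60),
    "carbon text": (30, 30, 30),
}


def is_dark(value):
    """Binary decision for one colour component: 1 = dark, 0 = light."""
    return 1 if value < THRESHOLD else 0


def red_filter_output(rgb):
    """Model a fixed red optical filter as viewing only the red component."""
    return is_dark(rgb[0])


def all_color_output(rgb):
    """All three binary channels must agree that the pixel is dark."""
    return is_dark(rgb[0]) & is_dark(rgb[1]) & is_dark(rgb[2])


for name, rgb in SAMPLES.items():
    print(f"{name:15s} red filter={red_filter_output(rgb)} "
          f"all-color={all_color_output(rgb)}")
```

Under these assumptions the red filter drops the red ink but still reports the green ink as dark, whereas the all-color combination keeps only the carbon text.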
Accordingly, any drop-out colors used on a particular form can be automatically determined and suppressed by making some assumptions about the spectral content of the filled-out text. Most text used to fill out business forms can be categorized as "carbon based". This category includes most typewriter ribbons, pens and pencils. Such text would pass as black regardless of any color filter employed, and the text can be separated from any pre-printed color information by applying an "all-color filter".

White Calibration
White calibration can be used to optimize scanner performance by compensating for any spectral anomalies or sensitivity variations on a pixel by pixel basis. The white calibration method discussed here is the preferred method for assuring uniform response from the scanner, since the compensation can be done just prior to running, thereby compensating for differences due to age or wear. Feeding a white (blank) sheet of paper through the color scanner exercises all three color signals simultaneously. Because a white sheet of paper has a known and predictable spectral curve, the color gain coefficients can be programmed in such a manner as to allow the scanner to mimic this ideal response.

Figures 3A and B show a flow chart for implementing white calibration. Step 80 requires microprocessor and RAM storage subsystem 52 (Fig. 2) to set all of the red, green, and blue gain coefficients to a value of 1, and then in step 82 to set all of the pixel accumulators (located in memory within microprocessor and RAM storage subsystem 52) to 0. In step 84, an operator feeds a white piece of paper through the color scanner in order to calibrate the response. In step 86 the beginning of the page is detected and the calibration process begins. Color scanner 10 outputs a sequential three color data stream (R, G, B) as it scans each horizontal line of the white document. This information is digitized by A/D converters 32, 34 and 36, one for each color channel. The digitized signals are sent to multipliers 38, 40 and 42 respectively. Because microprocessor 52 had previously set all gains to a value of 1, the output of each multiplier is equivalent to the R, G, B values of each pixel. Microprocessor 52 captures this sequential line of grey scale color information in step 88 within its own memory (RAM) and then adds each pixel's red, green, and blue values to the appropriate accumulator in accordance with step 90.
The microprocessor maintains separate accumulators for R, G, and B values for each pixel (total number of accumulators = 3 x number of horizontal pixels). This accumulation process continues until the end of the page is detected in step 92. The total number of lines processed is maintained by a line counter in step 94. Once the scanning of the page has been completed, microprocessor 52 calculates the average red, green, and blue values for each pixel in step 96 by dividing each accumulator value by the line count (number of lines captured). This information corresponds to the average color response for each horizontal pixel.
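The accumulation and averaging steps (82 through 96) can be sketched as below. The data layout, with each scan line as a list of (r, g, b) tuples, and the function name are assumptions made for illustration; the specification describes a hardware and microprocessor implementation.

```python
def average_white_response(lines):
    """Average each horizontal pixel's R, G, B response over all scan lines.

    `lines` is an iterable of scan lines; each line is a list of (r, g, b)
    tuples, one per horizontal pixel (an assumed layout).
    """
    accumulators = None
    line_count = 0
    for line in lines:
        if accumulators is None:
            # Step 82: three zeroed accumulators per horizontal pixel.
            accumulators = [[0, 0, 0] for _ in line]
        # Step 90: add each pixel's components to its accumulators.
        for acc, (r, g, b) in zip(accumulators, line):
            acc[0] += r
            acc[1] += g
            acc[2] += b
        # Step 94: keep count of the lines processed.
        line_count += 1
    if line_count == 0:
        raise ValueError("no scan lines captured")
    # Step 96: divide each accumulator by the line count.
    return [tuple(total / line_count for total in acc) for acc in accumulators]


white_page = [
    [(200, 100, 50), (10, 20, 30)],
    [(220, 120, 70), (30, 40, 50)],
]
print(average_white_response(white_page))
```

For a line of N horizontal pixels the sketch holds 3 x N running totals, matching the accumulator count given in the text.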
Once this color response is known, red, green, and blue coefficients can be calculated for each pixel in step 98. This is done in order to "normalize" the response, which guarantees that each pixel responds in a similar fashion given a similar input. The gain coefficients are calculated by dividing the ideal or optimum R, G, B response by the average R, G, B response of each pixel. The optimum response is based on the ideal R, G, B values for a "white" input. Once the gain coefficients are calculated (3 per pixel), they are stored in accordance with step 100 in a dual-ported memory (not shown, but part of microprocessor and RAM storage subsystem 52), thereby completing the white calibration process.

Once calibrated, the apparatus (Fig. 2) is capable of compensating for any color or gain anomalies by multiplying each pixel's red, green, and blue video value by an image compensating coefficient. During operation, color scanner 10 outputs red, green, and blue signals for each horizontal pixel sequentially, and each color signal is digitized by A/D converters 32, 34 and 36. The digital grey scale color information for each pixel is then sent to multiplier circuits 38, 40 and 42 respectively. Microprocessor and RAM storage subsystem 52 recalls the unique R, G, B gain coefficients for each pixel in the horizontal scan and simultaneously presents these coefficients to the 3 multipliers, thereby multiplying each pixel's red, green, and blue values by their corresponding gain coefficient. The outputs of these multipliers represent the normalized red, green, and blue values for each pixel. By running calibration, storing the unique color gain coefficients for each pixel, and subsequently using the gain coefficients to normalize the R, G, B response for each pixel, the output of color scanner 10 is balanced for a correct and uniform spectral response.
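The gain calculation of step 98 and the run-time multiplication can be sketched as follows. The ideal white value of 255 per channel assumes 8-bit data, and the fallback gain of 1.0 for a cell with zero average response is an added safeguard; neither is stated in the specification, and the names are hypothetical.

```python
IDEAL_WHITE = (255.0, 255.0, 255.0)  # assumed ideal response to white (8-bit data)


def gain_coefficients(average_response, ideal=IDEAL_WHITE):
    """Step 98: gain = ideal response / average response, per pixel and channel."""
    gains = []
    for avg in average_response:
        # A zero average (dead cell) has no finite gain; use 1.0 (assumption).
        gains.append(tuple(i / a if a > 0 else 1.0 for i, a in zip(ideal, avg)))
    return gains


def normalize_line(line, gains):
    """Run time: multiply each pixel's R, G, B values by its stored gains."""
    return [
        tuple(value * gain for value, gain in zip(pixel, pixel_gains))
        for pixel, pixel_gains in zip(line, gains)
    ]


# One horizontal pixel whose channels respond unevenly to white paper.
averages = [(255.0, 127.5, 510.0)]
gains = gain_coefficients(averages)
print(gains)
print(normalize_line([(100, 50, 200)], gains))
```

A channel that responds weakly to white receives a gain above 1 and an over-responsive channel a gain below 1, so equal inputs yield equal normalized outputs.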
While the invention has been described in conjunction with specific embodiments thereof, it is evident that many alternatives, modifications, and variations will be apparent to those skilled in the art in light of the foregoing description. Accordingly, it is intended to embrace all such alternatives, modifications and variations as fall within the spirit and broad scope of the appended claims.

Advantages and Industrial Applicability
The present invention is useful for processing business forms in conjunction with optical character recognition systems as a way of separating text information on forms by automatic filtering of color information. This scanner system would separate all images into three primary colors: red, green and blue. A black and white rendition of the image can be produced simply by adding the three color components. The invention is advantageous in eliminating the drop-out color variability problem associated with mechanical filter insertion. This variability can be caused by the color of the ink used on the forms varying from one printing batch to another, such that the mechanical filter is ineffective in removing the printed text on a form printed with out-of-tolerance ink. Additionally, the present invention allows one to intermix documents of different colors within a batch, as well as to process single documents having various drop-out colors (a form with red and blue preprinted information, for example). It would be impossible to accomplish this using mechanical filter insertion as practiced in the prior art.

Claims

WHAT IS CLAIMED IS:
1. An apparatus for processing a three component color signal generated by a color scanner after said signal components have been converted to a grey-scale digital format by an analog to digital converter associated with each color component on a pixel by pixel basis, said apparatus characterized by: processing means for each of the color video signals to create three binary, 1 bit per pixel video signals, each corresponding to the red, green and blue components of a pixel; and means for combining said color video signals so as to preserve black information only when all of the binary color signals identify the pixel as being black.
2. An apparatus as set forth in Claim 1 wherein the combining means takes the form of an AND gate.
3. An apparatus as set forth in Claim 2 wherein the output of said combining means is a 1 bit per pixel segmented scan line that is suitable for OCR reading.
4. An apparatus as set forth in Claim 1 that further includes means for compensating each pixel within a scan line for amplitude and color response as represented by three color components.
5. An apparatus for processing a color form that was filled out using a carbon based ink, said apparatus characterized by: a color scanner having a plurality of grey-scale color outputs generating scan line having segmented pixels with red, green and blue components; means for converting each output to a grey-scale digital format on a pixel by pixel basis; memory means for storing at least one scan line of color grey-scale information; means for compensating each pixel within a scan line for amplitude and color response as represented by the three color components; means for processing each of the color video signals to create three binary 1 bit per pixel video signals, each corresponding to the red, green and blue components of a pixel; and means for combining said binary color information such as to preserve black information, associated with carbon based inks, when all of the said binary color outputs indicate the pixel as being black.
6. An apparatus as set forth in Claim 5 wherein the compensating means takes the form of a digital multiplier.
7. A method of processing a color form that was filled out using a carbon based ink, said method characterized by the steps of: scanning the color form and producing at least two grey-scale color outputs having segmented pixels for each color component; converting each output to a grey-scale digital format on a pixel by pixel basis; storing at least one scan line of each color grey-scale component; compensating for each pixel within a scan line for amplitude and color responses as represented by each color component; processing each of the color video signals to create three binary 1 bit per pixel video signals, each corresponding to the color components for each pixel; and comparing said binary color information so as to preserve black information, associated with carbon based inks, when all of the color components indicate the pixel as being black.
8. A method of processing a color form as set forth in Claim 7 wherein the combining step means AND'ing all of said outputs.
9. A method of processing a color form as set forth in Claim 7 wherein the combined output is suitable for OCR reading.
10. A method of processing a color form as set forth in Claim 7 wherein the sum of the grey-scale color outputs produce a grey-scale black/white image.
11. A method of processing a color form as set forth in Claim 7 wherein the grey-scale color output signals produce a grey-scale color image.
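The processing chain recited in the claims (per-component compensation, conversion of each component to a 1 bit per pixel signal, and AND'ing of the three binary signals) can be sketched in software as follows. This is an illustrative approximation, not the claimed apparatus. The names `compensate`, `binarize` and `separate_text`, the fixed threshold of 128, the gain values and the convention that 1 means black are all assumptions; the claims do not specify how the binary signals are derived.

```python
def compensate(value, gain):
    """Scale one 8-bit grey level by a gain (a stand-in for the digital multiplier)."""
    return min(255, int(round(value * gain)))


def binarize(value, threshold=128):
    """Return 1 (black) if the grey level is darker than the threshold, else 0."""
    return 1 if value < threshold else 0


def separate_text(scan_line, gains=(1.0, 1.0, 1.0), threshold=128):
    """Keep a pixel as black only if all three color components say black.

    scan_line: list of (r, g, b) tuples with components assumed to be 0..255.
    Returns a list of bits, one per pixel (1 = black, 0 = dropped out).
    """
    out = []
    for pixel in scan_line:
        bits = [binarize(compensate(c, g), threshold)
                for c, g in zip(pixel, gains)]
        out.append(bits[0] & bits[1] & bits[2])  # AND of the three binary signals
    return out


# Black ink, red preprinted ink, blue preprinted ink, white paper.
line = [(20, 20, 20), (200, 30, 30), (30, 30, 200), (250, 250, 250)]
print(separate_text(line))  # only the first pixel survives as black
```

Because a colored ink is bright in at least one component, its bit is 0 in that component and the AND removes it, whatever the drop-out color is. Black ink is dark in all three components and is preserved.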
PCT/US1991/005040 1990-07-24 1991-07-18 Method and apparatus for automatic text separation using automatic electronic filtering of multiple drop-out colors for optical character recognition of preprinted forms WO1992001998A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US55729490A 1990-07-24 1990-07-24
US557,294 1990-07-24

Publications (1)

Publication Number Publication Date
WO1992001998A1 (en) 1992-02-06

Family

ID=24224826

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US1991/005040 WO1992001998A1 (en) 1990-07-24 1991-07-18 Method and apparatus for automatic text separation using automatic electronic filtering of multiple drop-out colors for optical character recognition of preprinted forms

Country Status (3)

Country Link
EP (1) EP0491923A1 (en)
JP (1) JPH05501778A (en)
WO (1) WO1992001998A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7936488B2 (en) 2008-02-15 2011-05-03 Mitsubishi Electric Corporation Image reading apparatus
WO2013009530A1 (en) * 2011-07-08 2013-01-17 Qualcomm Incorporated Parallel processing method and apparatus for determining text information from an image

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5287442B2 (en) * 2009-04-07 2013-09-11 三菱電機株式会社 Image reading device

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0375090A2 (en) * 1988-12-21 1990-06-27 Recognition International Inc. Document processing system

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0375090A2 (en) * 1988-12-21 1990-06-27 Recognition International Inc. Document processing system

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
PATENT ABSTRACTS OF JAPAN vol. 14, no. 437 (E-098)19 September 1990 & JP,A,2 170 674 ( MINOLTA CAMERA CO LTD ) 2 July 1990 see abstract *
PATENT ABSTRACTS OF JAPAN vol. 6, no. 245 (P-159)3 December 1982 & JP,A,57 143 683 ( TOKYO SHIBAURA DENKI KK ) 4 September 1982 see abstract *
PATENT ABSTRACTS OF JAPAN vol. 9, no. 9 (P-327)(1732) 16 January 1985 & JP,A,59 158 481 ( NIPPON DENKI KK ) 7 September 1984 see abstract *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7936488B2 (en) 2008-02-15 2011-05-03 Mitsubishi Electric Corporation Image reading apparatus
WO2013009530A1 (en) * 2011-07-08 2013-01-17 Qualcomm Incorporated Parallel processing method and apparatus for determining text information from an image
US9202127B2 (en) 2011-07-08 2015-12-01 Qualcomm Incorporated Parallel processing method and apparatus for determining text information from an image

Also Published As

Publication number Publication date
JPH05501778A (en) 1993-04-02
EP0491923A1 (en) 1992-07-01

Similar Documents

Publication Publication Date Title
US5014328A (en) Automatic detection and selection of a drop-out color used in conjunction with optical character recognition of preprinted forms
US5014329A (en) Automatic detection and selection of a drop-out color using zone calibration in conjunction with optical character recognition of preprinted forms
EP0070161B1 (en) Adaptive thresholder and method
DE69325527T2 (en) Image processing apparatus and method
US4414581A (en) Image signal processing method and apparatus therefor
EP0317268A2 (en) Image recording apparatus
DE3629195C2 (en)
US7580569B2 (en) Method and system for generating contone encoded binary print data streams
US4825296A (en) Method of and apparatus for copying originals in which an image to be printed is evaluated by observing a corresponding low-resolution video image
US7436994B2 (en) System of using neural network to distinguish text and picture in images and method thereof
US4533942A (en) Method and apparatus for reproducing an image which has a coarser resolution than utilized in scanning of the image
DE69631812T2 (en) System and method for a highly addressable printing system
US6775031B1 (en) Apparatus and method for processing images, image reading and image forming apparatuses equipped with the apparatus, and storage medium carrying programmed-data for processing images
EP0732842A2 (en) Image processing apparatus capable of properly determining density of a background portion
US6718059B1 (en) Block selection-based image processing
US5892596A (en) Image processing apparatus capable of reforming marker editing
DE19744501A1 (en) Gray scale value compensation arrangement for image sensor
WO1992001998A1 (en) Method and apparatus for automatic text separation using automatic electronic filtering of multiple drop-out colors for optical character recognition of preprinted forms
US3902009A (en) Multi aperture scanning and printing for facsimile line skipping
US20060165292A1 (en) Noise resistant edge detection
US6693731B1 (en) Image processing apparatus and method
US6108456A (en) Image processing system
US5245446A (en) Image processing system
JP2861089B2 (en) Image addition device
EP3565232A1 (en) Output image generating method of an image reading device, and an image reading device

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): JP

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): AT BE CH DE DK ES FR GB GR IT LU NL SE

WWE Wipo information: entry into national phase

Ref document number: 1991913232

Country of ref document: EP

WWP Wipo information: published in national office

Ref document number: 1991913232

Country of ref document: EP

WWW Wipo information: withdrawn in national office

Ref document number: 1991913232

Country of ref document: EP
