GB2561556A - Apparatus for providing information about a food product - Google Patents

Apparatus for providing information about a food product

Info

Publication number
GB2561556A
GB2561556A, GB1705975.9A, GB201705975A
Authority
GB
United Kingdom
Prior art keywords
data
food
predefined
audio
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
GB1705975.9A
Other versions
GB201705975D0 (en)
Inventor
Kan Jing
He Zeyi
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual filed Critical Individual
Priority to GB1705975.9A priority Critical patent/GB2561556A/en
Publication of GB201705975D0 publication Critical patent/GB201705975D0/en
Publication of GB2561556A publication Critical patent/GB2561556A/en
Withdrawn legal-status Critical Current

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H20/00 ICT specially adapted for therapies or health-improving plans, e.g. for handling prescriptions, for steering therapy or for monitoring patient compliance
    • G16H20/60 ICT specially adapted for therapies or health-improving plans, e.g. for handling prescriptions, for steering therapy or for monitoring patient compliance relating to nutrition control, e.g. diets
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/24323 Tree-organised classifiers
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/60 Type of objects
    • G06V20/68 Food, e.g. fruit or vegetables
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • G10L25/60 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination for measuring the quality of voice signals

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Medical Informatics (AREA)
  • Data Mining & Analysis (AREA)
  • Primary Health Care (AREA)
  • General Health & Medical Sciences (AREA)
  • Epidemiology (AREA)
  • Multimedia (AREA)
  • Nutrition Science (AREA)
  • Public Health (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Medical Treatment And Welfare Office Work (AREA)

Abstract

The apparatus receives selected data of the food product and provides information about the food product based on the analysis of the selected data (107). The selected data is configured to include the first food data and omit the second food data if a first predefined data criterion is met, and to include the second food data and omit the first food data if a second predefined data criterion is met, wherein the first food data is obtained from image data of the food product (103) and the second food data is obtained from audio data and/or text data of the food product (104). Mass, nutrient levels or calories can be analysed, and health, medical, disease or dietary recommendation information provided, using a user profile. Image or audio data can be analysed to determine whether it meets a quality threshold, so that decisions can be made about which information to use to determine, for example, the calories in a cup of juice. Calibration using finger length might also allow estimation of food content in a smartphone-captured image. Alternatively, user audio input may be used to determine standard sizes such as a cup.

Description

(54) Title of the Invention: Apparatus for providing information about a food product Abstract Title: Apparatus for Providing Information About a Food Product (57) The apparatus receives selected data of the food product and provides information about the food product based on the analysis of the selected data (107). The selected data is configured to include the first food data and omit the second food data if a first predefined data criterion is met, and to include the second food data and omit the first food data if a second predefined data criterion is met, wherein the first food data is obtained from image data of the food product (103) and the second food data is obtained from audio data and/or text data of the food product (104). Mass, nutrient levels or calories can be analysed, and health, medical, disease or dietary recommendation information provided, using a user profile. Image or audio data can be analysed to determine whether it meets a quality threshold, so that decisions can be made about which information to use to determine, for example, the calories in a cup of juice. Calibration using finger length might also allow estimation of food content in a smartphone-captured image. Alternatively, user audio input may be used to determine standard sizes such as a cup.
[Figures 1a and 1b: smart phone flowchart with steps input user profile (100), input finger measurement (101), capture food image and audio description (102), individual image/audio-based food volume recognition (103), individual image/audio-based food content recognition (104), mass calculation (105), nutrient/calorie take-in calculation (106), and healthy report & disease dietary recommendation (107). Figures 2a to 2d: input of the length of a finger segment via the touchscreen.]
APPARATUS FOR PROVIDING INFORMATION ABOUT A FOOD PRODUCT
This invention relates to an apparatus for providing information about a food product, a computer-implemented method of providing information about a food product, and to a computer program comprising computer code configured to perform the computer-implemented method.
It is known to use computer-aided food identification and quality estimation systems, which may include machine learning and computer vision capabilities, to keep track of food consumption and monitor dietary habits in order to help maintain a healthy dietary lifestyle.
According to a first aspect of the invention, there is provided an apparatus for providing information about a food product, the apparatus comprising a processor and memory including computer program code, the memory and computer program code configured to, with the processor, enable the apparatus at least to:
receive selected data of the food product; analyse the selected data; and provide information about the food product based on the analysis of the selected data, wherein the selected data is configured to include the first food data and omit the second food data if a first predefined data criterion is met, and wherein the selected data is configured to include the second food data and omit the first food data if a second predefined data criterion is met, wherein the first food data is obtained from an image data of the food product, and the second food data is obtained from an audio data and/or a text data of the food product.
A food product may comprise a single food item or multiple food items. The multiple food items may include discrete food items and/or a mixture of food items. A food data includes qualitative and/or quantitative data about one or more properties of a food product, such as food type, food content, and food volume. In the invention, the first and second food data may either have the same qualitative and/or quantitative data about one or more properties of the food product, or have different qualitative and/or quantitative data about one or more properties of the food product.
Conventionally it is known to implement an image-based method by using a mobile phone to capture an image of food, identify food types from the image through image processing and food segmentation, perform food classification based on machine learning, estimate food portion amounts from the image, measure calories based on food portion amounts and nutritional tables, and finally estimate the quantitative food and nutrition intake.
On the other hand, the accuracy and efficiency of the conventional image-based method are adversely affected by the use of images with low quality due to, for example, blurriness or poor lighting. The conventional image-based method also has its limitations when it comes to obtaining data from liquid or clear foods and from food that is not presented on a flat plate. For example, it can be difficult to accurately identify food types and estimate food portion amounts from an image of a liquid or clear food, and from an image of food that is presented in a container (such as a bowl or cup) instead of on a flat plate.
The apparatus of the invention overcomes the problems faced by the conventional image-based method through use of the selected data that is based on food data respectively obtained from the image data and the audio and/or text data of the food product. More specifically, the first and second food data provide multiple independent data channels for obtaining the selected data, which in turn improves the accuracy of the provision of the information about the food product.
In addition, the use of audio-based and/or text-based food data overcomes the limitations faced when it comes to obtaining data about liquid or clear foods and about food that is not presented on a flat plate, and thereby increases the accuracy of the provision of the information about the food product.
Furthermore, the configuration of the apparatus of the invention allows the selected data to be selected from the first and second food data in accordance with predefined data criteria, which can be designed to be based on a decision tree with multiple decision points in order to improve the accuracy of the selected data to be analysed (e.g. through aggregation of accuracy based on multiple decision points) and to optimise time and computational resources used by the apparatus of the invention to provide the information about the food product (e.g. by using a decision point to assess whether it is worth using time and computational resources on a processing step before carrying out said processing step).
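By way of a non-limiting illustration, the following Python sketch shows one possible shape of such a decision tree. The FoodData container, the treatment of "data present and usable" as the predefined criterion, and the confidence-based tie-break are all assumptions made for this example; the invention does not prescribe them.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class FoodData:                  # hypothetical container for one food data channel
    volume_ml: Optional[float]   # food volume data, if obtained
    content: Optional[str]       # food content data, if obtained
    confidence: float            # data accuracy information (confidence rate)

def select_data(first: Optional[FoodData],
                second: Optional[FoodData]) -> Optional[FoodData]:
    """Decide which food data to include in the selected data."""
    if first is not None and second is None:
        return first             # first predefined data criterion met
    if second is not None and first is None:
        return second            # second predefined data criterion met
    if first is not None and second is not None:
        # Both channels usable: prefer the higher confidence rate.
        return first if first.confidence >= second.confidence else second
    return None                  # neither channel usable: request new input
```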
It will be appreciated that the invention is not limited to the first and second food data. In addition, the selected data may be configured to additionally include or omit a further food data (e.g. a third food data, a fourth food data, a fifth food data, and so on) if a further predefined data criterion is met, wherein the further food data is obtained from data of the food product that is identifiable (e.g. readable, decipherable) by the apparatus of the invention.
In an optional first example, if the first food data is obtained from image data and the second food data is obtained from audio data, the further food data may be obtained from, for example, text data of the food product or any other form of data of the food product that can be identified by the apparatus of the invention.
In an optional second example, if the first food data is obtained from image data and the second food data is obtained from text data, the further food data may be obtained from, for example, audio data of the food product or any other form of data of the food product that can be identified by the apparatus of the invention.
In an optional third example, if the first food data is obtained from image data and the second food data is obtained from audio and text data, the further food data may be obtained from, for example, any other form of data of the food product that can be identified by the apparatus of the invention.
The features and advantages of the invention in relation to the use of the first and second food data apply mutatis mutandis to the use of the further data in the invention.
Optionally the invention may be expanded to include the possible inclusion of a combination of the food data (e.g. the first and second food data) in the selected data in order to further improve accuracy and efficiency. More specifically, in embodiments of the invention, the selected data may be configured to include a combination of the food data (e.g. the first and second food data) if a third predefined data criterion is met.
The configuration of the apparatus of the invention therefore results in an improved means of helping users monitor their food consumption and dietary habits.
The analysis of the selected data may include, but is not limited to, one or more of food mass determination, food nutrient determination, and food calorie determination. Such determination can be performed by, for example, calculation and/or referencing food databases.
The information about the food product may include, but is not limited to, one or more of food mass, food nutrient, food calories, health profile, medical information, disease information, and dietary recommendation.
The memory and computer program code may be configured to, with the processor, enable the apparatus at least to:
receive a user profile, and analyse the selected data and the user profile; and provide information about the food product based on the analysis of the selected data and the user profile.
By analysing the user profile in combination with the selected data, the apparatus of the invention is able to provide personalised information about the food product. The user profile may include, but is not limited to, name, height, age, weight, and medical condition or history.
Each predefined data criterion is a benchmark by which a quantitative or qualitative property of a given data is measured. Each predefined data criterion may include a data quality criterion and/or a data accuracy criterion. Data quality of a given data is a measurement of the degree of suitability of the given data to serve its purpose. Data accuracy of a given data is a measurement of the degree of closeness of a quantitative or qualitative value in the given data to the actual quantitative or qualitative value.
Optionally each of the first and second food data may include food volume data and/or food content data.
The food volume data can be used to obtain more detailed information about the food product, such as calories, nutrients and other dietary information. For example, the food volume data can be used to determine the mass of the food product (which may include determining the masses of the different food items in the food product), which in turn can be used to determine the calories of the food product. Also, for example, the food volume data can also be used to determine the different amounts of nutrients in the food product, such as fibres, vitamins, minerals and so on.
The food content data can also be used to obtain more detailed information about the food product, such as calories, nutrients and other dietary information. For example, the food content data can be used to provide qualitative information about the content of the food product, such as the food type(s) of the food product.
The memory and computer program code may be configured to, with the processor, enable the apparatus at least to:
receive the image data of the food product;
receive the audio and/or text data of the food product;
obtain the first food data from the image data;
obtain the second food data from the audio and/or text data;
determine which one of the predefined data criteria is met; and determine the selected data based on which one of the predefined data criteria is met.
As mentioned above, the first and second food data provide multiple independent data channels for obtaining the selected data, which in turn improves the accuracy of the provision of the information about the food product.
In addition the determination of which one of the predefined data criteria is met can be based on a decision tree with multiple decision points in order to improve the accuracy of the selected data to be analysed and to optimise time and computational resources used by the apparatus of the invention to provide the information about the food product.
Various examples of data criteria and decision points are given as follows.
An image data of poor quality could result in an increase in time and computational resources used to obtain the first food data from the image data, and/or could result in the first food data being inaccurate. The memory and computer program code may be configured to, with the processor, enable the apparatus at least to:
determine whether the image data meets a predefined image quality threshold; and obtain the first food data from the image data if the image data is determined to meet the predefined image quality threshold.
By using the predefined image quality threshold as a condition for obtaining the first food data from the image data, the accuracy and efficiency of the apparatus of the invention are improved. This is because the determination of whether the image data meets the predefined image quality threshold indirectly provides an approximation of the time and computational resources required to obtain the first food data from the image data and/or the accuracy of the first food data, thus permitting the apparatus to make a decision to not obtain the first food data from the image data if the image data is determined to not meet the predefined image quality threshold.
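As a concrete illustration, a predefined image quality threshold could be implemented with simple blur and brightness heuristics. The variance-of-Laplacian measure and the two threshold constants below are assumptions for the sketch (the description only requires that some quality measure be applied), and the sketch relies on the OpenCV library.

```python
import cv2

BLUR_THRESHOLD = 100.0  # hypothetical tuning value for sharpness
DARK_THRESHOLD = 40.0   # hypothetical floor for mean brightness

def image_meets_quality_threshold(path: str) -> bool:
    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    if img is None:
        return False
    sharpness = cv2.Laplacian(img, cv2.CV_64F).var()  # low variance suggests blur
    brightness = img.mean()                           # low mean suggests poor lighting
    return sharpness >= BLUR_THRESHOLD and brightness >= DARK_THRESHOLD
```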
The memory and computer program code may be configured to, with the processor, enable the apparatus at least to:
obtain a first food volume data from the image data if the image data is determined to meet the predefined image quality threshold, wherein the first food data includes the first food volume data.
It will be understood that the action of obtaining a first food volume data from the image data includes:
• directly obtaining the first food volume data from the image data; and • indirectly obtaining the first food volume data from the image data, with one or more intermediate processing steps, such as calculation and estimation.
The memory and computer program code may be configured to, with the processor, enable the apparatus at least to:
obtain a first food content data from the image data if the image data is determined to meet the predefined image quality threshold, wherein the first food data includes the first food content data.
It will be understood that the action of obtaining a first food content data from the image data includes:
• directly obtaining the first food content data from the image data; and • indirectly obtaining the first food content data from the image data, with one or more intermediate processing steps, such as calculation and estimation.
By using the predefined image quality threshold as a condition for obtaining the first food volume data and/or the first food content data, the quality of the image data available for subsequent processing can be ensured. The assessment of the quality of the image data helps the apparatus to make a decision on whether to use time and computational resources to obtain the first food volume data and/or the first food content data from the image data. If the quality of the image data is high enough, the apparatus can make a decision to obtain the first food volume data and/or the first food content data from the image data. If the quality of the image data is too low, the apparatus can make a decision to not obtain the first food volume data and/or the first food content data from the image data in order to save on said time and computational resources.
An audio and/or text data of poor quality could result in an increase in time and computational resources used to obtain the second food data from the audio and/or text data, and/or could result in the second food data being inaccurate. The memory and computer program code may be configured to, with the processor, enable the apparatus at least to:
determine whether the audio and/or text data meets a predefined audio quality threshold and/or a predefined text quality threshold; and obtain the second food data from the audio and/or text data if the audio and/or text data is determined to meet the predefined audio quality threshold and/or the predefined text quality threshold.
By using the predefined audio and/or text quality threshold(s) as a condition for obtaining the second food data from the audio and/or text data, the accuracy and efficiency of the apparatus of the invention are improved. This is because the determination of whether the audio and/or text data meets the predefined audio quality threshold and/or the predefined text quality threshold indirectly provides an approximation of the time and computational resources required to obtain the second food data from the audio and/or text data and/or the accuracy of the second food data, thus permitting the apparatus to make a decision to not obtain the second food data from the audio and/or text data if the audio and/or text data is determined to not meet the predefined audio quality threshold and/or the predefined text quality threshold.
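One way such an audio quality threshold might be realised is by estimating the signal-to-noise ratio of the recording, as sketched below. The frame-energy percentile estimate and the 10 dB cut-off are illustrative assumptions, not values taken from the description.

```python
import numpy as np

SNR_THRESHOLD_DB = 10.0  # hypothetical cut-off

def audio_meets_quality_threshold(samples: np.ndarray, frame: int = 1024) -> bool:
    frames = [samples[i:i + frame] for i in range(0, len(samples) - frame, frame)]
    if not frames:
        return False
    energies = np.array([np.mean(f.astype(np.float64) ** 2) for f in frames])
    noise = np.percentile(energies, 10)   # quietest frames approximate the noise floor
    signal = np.percentile(energies, 90)  # loudest frames approximate the speech
    if noise <= 0.0:
        return True                       # effectively silent background
    return 10.0 * np.log10(signal / noise) >= SNR_THRESHOLD_DB
```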
The memory and computer program code may be configured to, with the processor, enable the apparatus at least to:
obtain a second food volume data from the audio and/or text data if the audio and/or text data is determined to meet the predefined audio quality threshold and/or the predefined text quality threshold, wherein the second food data includes the second food volume data.
It will be understood that the action of obtaining a second food volume data from the audio and/or text data includes:
• directly obtaining the second food volume data from the audio and/or text data; and • indirectly obtaining the second food volume data from the audio and/or text data, with one or more intermediate processing steps, such as calculation and estimation.
Optionally the memory and computer program code may be configured to, with the processor, enable the apparatus at least to prioritise the first food volume data over the second food volume data for inclusion in the selected data or prioritise the second food volume data over the first food volume data for inclusion in the selected data. This is because the first food volume data may be more accurate than the second food volume data, or the second food volume data may be more accurate than the first food volume data. For example, it may be that the image data, in comparison to the audio and/or text data, is more likely to provide a more accurate food volume data, which in turn would improve the accuracy of the outcome of the analysis of the selected data.
On the other hand, if the image data had been determined to not meet the predefined image quality threshold, the first food volume data is not obtained, and the apparatus of the invention adopts the food volume information of the second food volume data as a backup selection.
The memory and computer program code may be configured to, with the processor, enable the apparatus at least to:
obtain a second food content data from the audio and/or text data if the audio and/or text data is determined to meet the predefined audio quality threshold and/or the predefined text quality threshold, wherein the second food data includes the second food content data.
It will be understood that the action of obtaining a second food content data from the audio and/or text data includes:
• directly obtaining the second food content data from the audio and/or text data; and • indirectly obtaining the second food content data from the audio and/or text data, with one or more intermediate processing steps, such as calculation and estimation.
By using the predefined audio and/or text quality threshold(s) as a condition for obtaining the second food volume data and/or the second food content data, the quality of the audio and/or text data available for subsequent processing can be ensured. The assessment of the quality of the audio and/or text data helps the apparatus to make a decision on whether to use time and computational resources to obtain the second food volume data and/or the second food content data from the audio and/or text data. If the quality of the audio and/or text is high enough, the apparatus can make a decision to obtain the second food volume data and/or the second food content data from the audio and/or text data. If the quality of the audio and/or text data is too low, the apparatus can make a decision to not obtain the second food volume data and/or the second food content data from the audio and/or text data in order to save on said time and computational resources.
In embodiments of the invention relating to the first and second food content data, if both of the image data and the audio and/or text data are determined to not meet the respective predefined quality thresholds, the apparatus can make a decision to not obtain both of the first and second food content data from the image data and the audio and/or text data respectively in order to save on time and computational resources. At this stage the apparatus may require a new image data of the food product, and a new audio and/or text data of the food product.
In embodiments of the invention relating to the first and second food content data, if the image data is determined to meet the predefined image quality threshold but the audio and/or text data is determined to not meet the predefined audio and/or text quality threshold(s), the apparatus can make a decision to obtain the first food content data from the image data but not obtain the second food content data from the audio and/or text data in order to save on time and computational resources. The first food content data can then be included in the selected data for subsequent analysis.
In embodiments of the invention relating to the first and second food content data, if the audio and/or text data is determined to meet the predefined audio and/or text quality threshold(s) but the image data is determined to not meet the predefined image quality threshold, the apparatus can make a decision to obtain the second food content data from the audio and/or text data but not obtain the first food content data from the image data in order to save on time and computational resources. The second food content data can then be included in the selected data for subsequent analysis.
In embodiments of the invention relating to the first and second food content data, if both of the image data and the audio and/or text data are determined to meet the respective predefined quality thresholds, the apparatus can make a decision to obtain both of the first and second food content data from the image data and the audio and/or text data respectively and then proceed to determine whether the first and second food content data meet a predefined identity threshold as follows.
The memory and computer program code may be configured to, with the processor, enable the apparatus at least to:
determine whether the first and second food content data meet a predefined identity threshold;
include any of the first and second food content data in the selected data if the first and second food content data are determined to meet the predefined identity threshold; and compare the first and second food content data to select one of the first and second food content data to be included in the selected data if the first and second food content data are determined to not meet the predefined identity threshold.
The predefined identity threshold is a benchmark by which the identity between the first and second food content data is measured. If the first and second food content data are determined to meet the predefined identity threshold, the first and second food content data correspond to identical or substantially identical food content, which means that the analysis of the selected data would result in more or less the same outcome regardless of whether the selected data is based on the first food content data or the second food content data. On the other hand, if the first and second food content data are determined to not meet the predefined identity threshold, the first and second food content data correspond to different food content, which means that an analysis of the selected data based on the first food content data would result in a different outcome when compared to an outcome of an analysis of the selected data based on the second food content data.
This configuration of the apparatus of the invention therefore reduces wastage of the time and computational resources used by the apparatus of the invention.
Optionally each of the first and second food content data may include data accuracy information, and the memory and computer program code are configured to, with the processor, enable the apparatus at least to:
compare the data accuracy information of the first and second food content data if the first and second food content data are determined to not meet the predefined identity threshold; and include the one of the first and second food content data with the higher data accuracy in the selected data.
In such embodiments the data accuracy information may include a confidence rate. The confidence rate may be, but is not limited to, one or more of a statistical confidence rate, a confidence rate determined through artificial intelligence, and a confidence rate determined through machine learning analysis.
The inclusion of data accuracy information in the first and second food content data provides a reliable way of choosing which of the first and second food content data to include in the selected food data if the first and second food content data are determined to not meet the predefined identity threshold. This has the effect of enhancing the accuracy of the apparatus of the invention by ensuring that the analysis of the selected data is based on the food content data that is more likely to produce the more accurate outcome.
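A minimal sketch of this identity check and confidence-based selection follows. Plain string similarity stands in for the identity measure, and the 0.8 threshold is a hypothetical value; the description leaves both open.

```python
from difflib import SequenceMatcher

IDENTITY_THRESHOLD = 0.8  # hypothetical value

def choose_content(first_label: str, first_conf: float,
                   second_label: str, second_conf: float) -> str:
    similarity = SequenceMatcher(None, first_label.lower(),
                                 second_label.lower()).ratio()
    if similarity >= IDENTITY_THRESHOLD:
        # Substantially identical food content: either data gives the same outcome.
        return first_label
    # Different food content: keep the data with the higher confidence rate.
    return first_label if first_conf >= second_conf else second_label

choose_content("orange juice", 0.9, "apple juice", 0.7)  # -> "orange juice"
```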
The memory and computer program code may be configured to, with the processor, enable the apparatus at least to:
receive length data of a body part of a user; and use the length data as a calibration to obtain the first food data from the image data, wherein the image data is derived from an image of the food product with the body part.
Preferably the memory and computer program code may be configured to, with the processor, enable the apparatus at least to use the length data as a calibration to obtain the first food volume data from the image data.
The use of a body part as the calibration to obtain the first food data from the image data not only improves the accuracy of the obtained first food data, but also improves the user-friendliness of the apparatus of the invention by removing the need for additional reference/measurement tools, such as an object with a fiducial marker or a food container of known dimensions.
The body part may be, but is not limited to, a finger, a finger segment, a thumb, or a thumb segment. This is particularly useful when a handheld device (such as a mobile phone) is used to capture the image of the food product, since it is convenient to include the finger/finger segment/thumb/thumb segment in the same image. Furthermore the finger/finger segment/thumb/thumb segment can be readily used in combination with a touchscreen to input the length data of the finger/finger segment/thumb/thumb segment into the apparatus of the invention.
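The calibration itself reduces to a scale factor, as in the sketch below. It assumes the finger segment lies at roughly the same distance from the camera as the food product; the numeric values are purely illustrative.

```python
def pixels_to_mm_scale(finger_length_mm: float, finger_length_px: float) -> float:
    """Millimetres represented by one pixel at the finger's depth plane."""
    return finger_length_mm / finger_length_px

# Example: a 25 mm finger-tip segment spanning 180 px in the captured image.
scale = pixels_to_mm_scale(finger_length_mm=25.0, finger_length_px=180.0)
food_width_mm = 1700 * scale  # a food region measured as 1700 px wide
```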
In embodiments of the invention the apparatus may be or may include one or more of an electronic device, a portable electronic device, a portable telecommunications device, a mobile phone, a personal digital assistant, a tablet, a phablet, a desktop computer, a laptop computer, a server, a cloud computing network, a smartphone, a smartwatch, smart eyewear, and a module for one or more of the same.
In further embodiments of the invention the apparatus may include an image capture device for obtaining the image data of the food product. The image capture device may be configured to enable the capture of a still image of the food product and/or enable the recording of a video image of the food product.
In still further embodiments of the invention, the apparatus may include an audio capture device for obtaining the audio data of the food product, and/or a text capture device for obtaining the text data of the food product. The audio data may be converted from captured audio through speech recognition. The text capture device may take any form that permits a user to input text, e.g. a keyboard, a touchscreen, etc., and/or the text capture device may be configured to read text, for example, by computer vision and/or optical character recognition. The read text may be from, for example, a food packaging label.
According to a second aspect of the invention, there is provided a computer-implemented method of providing information about a food product, the method comprising the steps of:
receiving a selected data of the food product; analysing the selected data; and providing information about the food product based on the analysis of the selected data, wherein the selected data is configured to include the first food data and omit the second food data if a first predefined data criterion is met, and wherein the selected data is configured to include the second food data and omit the first food data if a second predefined data criterion is met, wherein the first food data is obtained from an image data of the food product, and the second food data is obtained from an audio data and/or a text data of the food product.
According to a third aspect of the invention, there is provided a computer program comprising computer code configured to perform the method of the second aspect of the invention.
The features and advantages of the apparatus of the first aspect of the invention and its embodiments apply mutatis mutandis to the method of the second aspect of the invention and the computer program of the third aspect of the invention.
It will be appreciated that the use of the terms “first” and “second”, and the like, in this patent specification is merely intended to help distinguish between similar features (e.g. the first and second food data), and is not intended to indicate the relative importance of one feature over another feature, unless otherwise specified.
A preferred embodiment of the invention will now be described, by way of a non-limiting example, with reference to the accompanying drawings in which:
Figures 1a and 1b show a method of providing information about a food product according to an embodiment of the invention; and
Figures 2a to 2d illustrate the use of a touchscreen to input length data of a finger segment into a smartphone.
A method of providing information about a food product according to an embodiment of the invention is shown in the flowcharts of Figures 1a and 1b. The flowchart of Figure 1a illustrates the general concept of the method of providing information about the food product, while the flowchart of Figure 1b describes in further detail the method of providing information about the food product.
The method of Figures 1a and 1b is implemented through use of a smartphone connected to a cloud computing network via the internet. The smartphone comprises a processor and memory. The cloud computing network comprises a plurality of remote servers, each of which comprises a processor and memory.
The processor is configured to process instructions to carry out various computing operations (e.g. generate, delete, read, write and/or otherwise process data). The processor is also configured to receive data from and transmit data to the memory. The memory is configured to store data for immediate or later use. The memory is configured to store computer program code and/or applications which may be used to instruct/enable the processor to perform the various computing functions. The smartphone is configured to receive data from and transmit data to the cloud computing network via the internet.
The smartphone further includes a touchscreen, a camera, and a microphone. The touchscreen of the smartphone allows a user to generate input data by interacting with the touchscreen. The touchscreen doubles as a display screen for output data. The processor of the smartphone is configured to receive input data from and output data to the touchscreen. The camera is configured to enable the capture of a still image and to enable the recording of a video image. The microphone is configured to enable the capture of audio, which is then converted to audio data by the processor through speech recognition methods. The speech recognition method may be configured to recognise audio from a single language or from different languages. If the speech recognition method is capable of recognising audio from different languages, the speech recognition method may be configured to prompt a user to choose which of the different languages to use.
Initially a user profile is inputted into the smartphone (step 100). This can be done by, for example, using a virtual keyboard on the touchscreen and/or choosing from a range of preset answers displayed on the touchscreen. The user profile may include, but is not limited to, name, height, age, weight, and medical condition or history. It is preferred that the user profile is inputted into the smartphone at the start of the method of Figures 1a and 1b, but it will be appreciated that the user profile may instead be inputted into the smartphone at a different time during the implementation of the method of Figures 1a and 1b.
In the embodiment shown, the input of the user profile into the smartphone is performed as a one-off process which means that there is no need to input the user profile again for future implementations of the method of Figures 1a and 1b. It will however be appreciated that a new or updated user profile can be inputted after the start of the implementation of the method of Figures 1a and 1b, and also a new or updated user profile can be inputted in future implementations of the method of Figures 1a and 1b.
This is followed by the user inputting a finger measurement into the smartphone (step 101). Figures 2a to 2d illustrate the sequence of steps for inputting the length data 204 of a finger segment 200 into the smartphone via the touchscreen.
Firstly, the tip of the index finger is chosen as the finger segment 200, where the tip of the index finger corresponds to the distal phalanx bone (Figure 2a). It will be appreciated that any of the other segments of the index finger, the segments of the other fingers, and the segments of the thumb can be chosen instead.
Next, the touchscreen prompts the user to align an edge of the chosen finger segment 200 with a dashed line 202 displayed on the touchscreen (Figure 2b). In this case the edge of the tip of the index finger corresponds to the connection between the distal and middle phalanges.
When the user aligns the edge of the finger segment 200 with the dashed line 202 and presses the touchscreen (Figure 2c), the smartphone then records the length data 204 of the chosen finger segment 200 (Figure 2d). The recording of the length data 204 of the chosen finger segment 200 can be implemented through use of, for example, capacitive and vision-based multi-touch screens which allow sensing of the contact’s shape and size respectively (for example, see (i) Han, J. (2005), Low-cost multi-touch sensing through frustrated total internal reflection, and (ii) Sebastian Boring et al. (2012), The Fat Thumb: Using the Thumb’s Contact Size for Single-Handed Mobile Interaction. MobileHCI '12 Proceedings of the 14th international conference on Human-computer interaction with mobile devices and services companion). The recorded length data 204 is stored in the memory for later use.
In the embodiment shown, the input of the finger measurement into the smartphone is performed as a one-off process which means that there is no need to input the finger measurement again in future implementations of the method of Figures 1a and 1b. It will however be appreciated that a new or updated finger measurement can be inputted after the start of the implementation of the method of Figures 1a and 1b, and also a new or updated finger measurement can be inputted in future implementations of the method of Figures 1a and 1b.
The camera is used to obtain a digital image (or multiple digital images) of a food product, which becomes image data stored in the memory for later use (step 102). The microphone is used to record an audio description of the food product, which becomes audio data stored in the memory for later use (step 102). When recording the audio description, the user can describe the food product by using food quantifiers to indicate the food type and amount, e.g. a bowl of rice, a cup of vegetable soup.
After the image and audio data are obtained, the method of Figures 1a and 1b carries out, in parallel, a food volume recognition step (step 103) and a food content recognition step (step 104). For each of the food volume and food content recognition steps, a decision tree with multiple decision points is employed in order to optimise the method of Figures 1a and 1b.
The decision tree for the food volume recognition step is described as follows.
The image data is examined using edge detection to determine whether it meets a predefined image quality threshold. If the quality of the image data is too low (e.g. due to blurriness or poor lighting) to permit accurate detection of the shape and edges of the food product, the method of Figures 1a and 1b makes the decision to not obtain a first food volume data from the image data. If the quality of the image data is high enough to permit accurate detection of the shape and edges of the food product, the method of Figures 1a and 1b proceeds to obtain a first food volume data from the image data by estimating the dimensions of the food product based on the image data and the length data 204 of the finger segment 200.
The length data 204 can be used as a calibration to obtain the first food volume data from the image data by ensuring that the chosen finger segment 200 is in the same digital image as the food product. The first food volume data may include a confidence rate, which provides a measure of the accuracy of the first food volume data with respect to the actual volume of the food product. The confidence rate may be, but is not limited to, one or more of a statistical confidence rate, a confidence rate determined through artificial intelligence, and a confidence rate determined through machine learning analysis.
The audio data is examined using audio noise detection to evaluate its signal-to-noise ratio in order to determine whether it meets a predefined audio quality threshold. If the quality of the audio data is too low (e.g. due to background noise being too high or due to the audio description having a low volume compared to the background noise) to permit an accurate identification of the amount of the food product, the method of Figures 1a and 1b makes the decision to not obtain a second food volume data from the audio data. If the quality of the audio data is high enough to permit an accurate identification of the amount of the food product, the method of Figures 1a and 1b proceeds to obtain a second food volume data from the audio data by converting the audio data into text using speech recognition, for example, semantic encoding.
The food quantifiers are then extracted from the text, and a database is searched in order to compare the extracted food quantifiers with predefined food quantifiers, where the predefined food quantifiers are associated with standard (e.g. average) food volumes. For example, when the word “cup” is identified from a user inputted audio description “cup of orange juice”, a database is used to search for the word “cup” and provide the average volume estimation, e.g. 250 ml, as the second food volume data. Whilst it is recognised that estimating the food volume in this manner can result in inaccuracies, this will nevertheless provide a reasonable food volume estimate that would allow the continued implementation of the method of Figures 1a and 1b, especially when the first food volume data is unavailable or very inaccurate.
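A sketch of this quantifier lookup is given below. The table of standard volumes is an illustrative stand-in for the database; only the 250 ml "cup" value comes from the example above.

```python
from typing import Optional

STANDARD_VOLUMES_ML = {
    "cup": 250.0,    # average volume from the example above
    "bowl": 400.0,   # remaining entries are hypothetical
    "glass": 200.0,
    "spoon": 15.0,
}

def volume_from_description(text: str) -> Optional[float]:
    """Return the standard volume for the first recognised food quantifier."""
    for word in text.lower().split():
        if word in STANDARD_VOLUMES_ML:
            return STANDARD_VOLUMES_ML[word]
    return None

volume_from_description("cup of orange juice")  # -> 250.0
```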
The first food volume data is prioritised over the second food volume data for inclusion in the selected data. This is because the image data is more likely to provide a more accurate food volume data obtained by estimating the dimensions of the food product based on the image data and the length data 204 of the finger segment 200, while the audio data is more likely to provide a less accurate food volume data due to the use of predefined food quantifiers associated with standard (e.g. average) food volumes.
Thus, if the image data meets the predefined image quality threshold, the first food volume data is obtained, and the food volume information of the first food volume data is used for subsequent analysis. On the other hand, if the image data had been determined to not meet the predefined image quality threshold, the first food volume data is not obtained, and the food volume information of the second food volume data is adopted as a backup selection.
The decision tree for the food content recognition step is described as follows.
The image data is examined using edge detection to determine whether it meets a predefined image quality threshold. If the quality of the image data is too low (e.g. due to blurriness or poor lighting) to permit accurate detection of the shape and edges of the food product, the method of Figures 1a and 1b makes the decision to not obtain a first food content data from the image data. If the quality of the image data is high enough to permit accurate detection of the shape and edges of the food product, the method of Figures 1a and 1b proceeds to obtain a first food content data from the image data through image segmentation and classification by way of machine learning in order to recognise the content of the food product with reference to a food content database.
In order to help identify the food content, the image segmentation and classification process may include, but is not limited to, one or more of the following features (a simplified sketch follows this list):
• identification of food items through statistical pattern recognition techniques;
• localization and multi-class recognition performance using image segmentation, such as active contours (see M. Kass et al. (1988), Snakes: Active Contour Models. International Journal of Computer Vision), normalized cuts (see J. Shi and J. Malik (2000), Normalized cuts and image segmentation. IEEE Trans. Pattern Analysis and Machine Intelligence), and local variation (see P. Felzenszwalb and D. Huttenlocher (1998), Image segmentation using local variation. Computer Vision and Pattern Recognition);
• extraction of colour and texture features from segmented food regions (see (i) Ye He et al. (2014), Analysis of Food Images: Features and Classification. Image Processing (ICIP); and (ii) Andrew Rabinovich et al. (2007), Does Image Segmentation Improve Object Categorization? Technical Report, UCSD CSE Dept);
• classification of image segments into particular food labels using the features extracted from the image segments;
• extraction of colour, texture and local region features for classification, such as Gabor filters (see (i) Ye He et al. (2014), Analysis of Food Images: Features and Classification. Image Processing (ICIP); and (ii) Parisa Pouladzadeh et al. (2014), Measuring Calorie and Nutrition from Food Image. IEEE Transactions on Instrumentation and Measurement);
• classification of segmented regions with extracted features based on the application of k-nearest neighbours (KNN) or Support Vector Machine (SVM) to classify colour and texture features (see Fengqing Zhu et al. (2011), Segmentation Assisted Food Classification for Dietary Assessment. Proc SPIE Int Soc Opt Eng).
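As a heavily simplified sketch of the colour-feature-plus-KNN route named in the last bullet, the snippet below extracts a normalised colour histogram from an already segmented food region and trains a k-nearest-neighbours classifier. It relies on OpenCV and scikit-learn, omits texture features and the segmentation step, and all parameter values are illustrative.

```python
import cv2
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def colour_histogram(region_bgr: np.ndarray, bins: int = 8) -> np.ndarray:
    """Normalised 3D colour histogram of a segmented food region."""
    hist = cv2.calcHist([region_bgr], [0, 1, 2], None,
                        [bins, bins, bins], [0, 256, 0, 256, 0, 256])
    return cv2.normalize(hist, hist).flatten()

def train_classifier(train_regions, train_labels) -> KNeighborsClassifier:
    """Fit a KNN classifier on colour features of labelled food regions."""
    features = np.array([colour_histogram(r) for r in train_regions])
    clf = KNeighborsClassifier(n_neighbors=3)
    clf.fit(features, train_labels)
    return clf

# Usage: predicted = train_classifier(regions, labels).predict(
#            [colour_histogram(new_region)])
```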
The audio data is examined using audio noise detection to evaluate its signal-to-noise ratio in order to determine whether it meets a predefined audio quality threshold. If the quality of the audio data is too low (e.g. due to background noise being too high or due to the audio description having a low volume compared to the background noise) to permit an accurate identification of the content of the food product, the method of Figures 1a and 1b makes the decision to not obtain a second food content data from the audio data. If the quality of the audio data is high enough to permit an accurate identification of the content of the food product, the method of Figures 1a and 1b proceeds to obtain a second food content data from the audio data by converting the audio data into text using speech recognition, for example, semantic encoding.
The food quantifiers are then extracted from the text, and a database is searched in order to compare the extracted food quantifiers with predefined food quantifiers, where the predefined food quantifiers are associated with types of food. For example, when the word “orange juice” is identified from a user inputted audio description “cup of orange juice”, a database is used to search for the word “orange juice” and provide the exact or closest result as the second food content data.
Each of the first and second food content data includes a confidence rate, which provides a measure of the accuracy of each food content data with respect to the actual content of the food product. The confidence rate may be, but is not limited to, one or more of a statistical confidence rate, a confidence rate determined through artificial intelligence, and a confidence rate determined through machine learning analysis.
If both of the image and audio data are determined to not meet the predefined image quality threshold and the predefined audio quality threshold respectively, the food content recognition step will be terminated, and the user will be prompted to obtain a new digital image and record a new audio description of the food product.
If the image data is determined to meet the predefined image quality threshold but the audio data is determined to not meet the predefined audio quality threshold, the first food content data is obtained from the image data but the second food content data is not obtained from the audio data. The first food content data is then chosen for subsequent analysis, which is described later in this specification.
If the audio data is determined to meet the predefined audio quality threshold but the image data is determined to not meet the predefined image quality threshold, the second food content data is obtained from the audio data but the first food content data is not obtained from the image data. The second food content data is then chosen for subsequent analysis, which is described later in this specification.
If both of the image data and the audio data are determined to meet the respective predefined image and audio quality thresholds, both of the first and second food content data are obtained from the image data and the audio data respectively. The first and second food content data are then compared to determine whether the first and second food content data meet a predefined identity threshold. The first and second food content data meeting a predefined identity threshold means that the first and second food content data correspond to identical or substantially identical food content.
If the first and second food content data are determined to meet the predefined identity threshold, any of the first and second food content data is chosen for subsequent analysis, since the similarity between the two food content data means that the subsequent analysis will yield more or less the same outcome regardless of whether the first food content data or the second food content data is used. If the first and second food content data are determined to not meet the predefined identity threshold, the confidence rates of the first and second food content data are compared and the food content data with the higher confidence rate is chosen for subsequent analysis, since the analysis of the food content data with the higher confidence rate is more likely to produce a more accurate outcome than the analysis of the food content data with the lower confidence rate.
The food volume data and the food content data chosen for subsequent analysis is referred to herein as the selected data.
After the selected data is obtained, the chosen food volume data and food content data are used to calculate the mass of the food product (step 105) in combination with food density values, which can be obtained based on food type and density information from, for example, a public food/nutritional database (such as the Health Canada database) or from offline research.
The mass of the food product (or the mass of each food item of the food product) can be calculated using the following general equation:
Food Mass = Food Density × Food Volume
After the mass of the food product is calculated, the amount of calories and the amount and type of nutrients in the food product can be derived using calories and nutritional tables as references (step 106).
The amount of calories of the food product (or the amount of calories of each food item of the food product) can be calculated using the following general equation:
Calories = (Calories from nutritional table × Food Mass) / (Food Mass from nutritional table)
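A worked instance of the two equations, with illustrative numbers only (250 ml of juice, a density of about 1.04 g/ml, and a nutritional-table entry of 45 kcal per 100 g; none of these values come from the description):

```python
food_volume_ml = 250.0
food_density_g_per_ml = 1.04
food_mass_g = food_density_g_per_ml * food_volume_ml               # 260 g

table_calories_kcal = 45.0   # calories listed in the nutritional table...
table_mass_g = 100.0         # ...for this reference mass
calories_kcal = table_calories_kcal * food_mass_g / table_mass_g   # 117 kcal
```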
The selected data is also analysed in combination with the user profile in order to provide personalised information about the food product. Such analysis can be carried out by using, for example, big data analytics with the help of the cloud computing network. The personalised information can be used to provide a user with a health profile, medical information, disease information, and dietary recommendations in relation to the food product (step 107).
For example, the determined calories and nutrients of the food product can be analysed in combination with the user profile in order to provide dietary recommendations and/or dietary warnings, which can be obtained from a database with preset dietary recommendations and/or dietary warnings. In a particular example, if the food product is recognised as being a banana, the calories and nutrients of the banana will be determined, and the user profile will be analysed together with the determined nutrients of the banana to determine whether the user is permitted to consume the banana. If the user profile includes a chronic kidney disease (CKD) condition, a warning will be provided to the user to advise against consuming the banana, since bananas contain potassium, which a patient with CKD should avoid.
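A minimal sketch of how such preset dietary warnings might be keyed on (condition, nutrient) pairs is given below; the rule table, condition names, and nutrient names are illustrative assumptions rather than the contents of any actual database:

```python
# Illustrative rule lookup for step 107: preset warnings keyed on
# (condition, nutrient) pairs. The table below is an assumed example.

DIETARY_WARNINGS = {
    ("CKD", "potassium"): "Advise against consumption: high-potassium food "
                          "is unsuitable for chronic kidney disease.",
    ("diabetes", "sugar"): "Advise caution: high-sugar food.",
}

def warnings_for(user_conditions: list[str], food_nutrients: list[str]) -> list[str]:
    """Return every preset warning triggered by the user's conditions."""
    return [DIETARY_WARNINGS[(c, n)]
            for c in user_conditions
            for n in food_nutrients
            if (c, n) in DIETARY_WARNINGS]

# A banana's recognised nutrients include potassium, so a CKD profile triggers a warning.
print(warnings_for(["CKD"], ["potassium", "sugar", "fibre"]))
```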
The food mass, calories, and nutrients and the personalised information can be displayed on the touchscreen.
The method of Figures 1a and 1b advantageously makes use of image data and audio data to provide the first and second food data respectively, giving multiple independent data channels from which the selected data can be drawn. The multiple decision points in the decision trees of the method of Figures 1a and 1b not only aggregate accuracy across those channels, improving the overall accuracy of the selected data, but also conserve time and computational resources: each decision point assesses whether the next processing step is worth its cost before that step is carried out.
It is envisaged that, in other embodiments of the invention, the method of Figures 1a and 1b may be implemented by one or more of an electronic device, a portable electronic device, a portable telecommunications device, a mobile phone, a personal digital assistant, a tablet, a phablet, a desktop computer, a laptop computer, a server, a cloud computing network, a smartphone, a smartwatch, smart eyewear, and a module for one or more of the same.
It is envisaged that, in other embodiments of the invention, the camera may be replaced by a different type of image capture device, and the microphone may be replaced by a different type of audio capture device.
It is envisaged that, in still other embodiments of the invention, text data of the food product may be used in combination with or to replace the audio data as a source for the second food volume and content data. A text description of the food product may be inputted into the smartphone by using a virtual keyboard on the touchscreen and/or choosing from a range of preset answers displayed on the touchscreen. Alternatively or additionally, the smartphone may be configured to read text, for example, by computer vision and/or optical character recognition. The above-described processing of the audio data applies mutatis mutandis to the processing of the text data. For example, the predefined text quality threshold for assessing the text data may be based on the spelling and/or grammar of the text data, and food quantifiers are extracted from the text data if the text data is determined to meet the predefined text quality threshold.
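As an illustration of a spelling-based predefined text quality threshold, the following sketch accepts a text description only if a sufficient proportion of its words are recognised; the word list and the 80% pass mark are assumptions made for the example:

```python
# Illustrative spelling-based text quality check. The dictionary and the
# pass ratio are assumed values, not part of the specification.

KNOWN_WORDS = {"a", "one", "two", "banana", "bananas", "apple", "bowl",
               "of", "rice", "cup", "coffee", "with", "and", "large", "small"}

def meets_text_quality_threshold(text: str, pass_ratio: float = 0.8) -> bool:
    """Pass if at least pass_ratio of the words are correctly spelled."""
    words = text.lower().split()
    if not words:
        return False
    recognised = sum(w.strip(".,!?") in KNOWN_WORDS for w in words)
    return recognised / len(words) >= pass_ratio

# Food quantifiers would only be extracted when the description passes the check.
print(meets_text_quality_threshold("two bananas and a cup of coffee"))  # True
print(meets_text_quality_threshold("xqz blorf 123"))                    # False
```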
It will be appreciated that references to a memory or a processor may encompass a plurality of memories or processors.
It will be understood that the term “threshold” can refer to a single value or a range of values.

Claims (24)

1. An apparatus for providing information about a food product, the apparatus comprising a processor and memory including computer program code, the memory and computer program code configured to, with the processor, enable the apparatus at least to:
receive selected data of the food product; analyse the selected data; and provide information about the food product based on the analysis of the selected data, wherein the selected data is configured to include a first food data and omit a second food data if a first predefined data criterion is met, and wherein the selected data is configured to include the second food data and omit the first food data if a second predefined data criterion is met, wherein the first food data is obtained from an image data of the food product, and the second food data is obtained from an audio data and/or a text data of the food product.
2. An apparatus according to Claim 1 wherein the selected data is configured to additionally include or omit a further food data if a further predefined data criterion is met, wherein the further food data is obtained from data of the food product that is identifiable by the apparatus.
3. An apparatus according to Claim 1 or Claim 2 wherein the selected data is configured to include a combination of the food data if a third predefined data criterion is met.
4. An apparatus according to any one of the preceding claims wherein the analysis of the selected data includes one or more of food mass determination, food nutrient determination, and food calorie determination.
5. An apparatus according to any one of the preceding claims wherein the information about the food product includes one or more of food mass, food nutrient, food calories, health profile, medical information, disease information, and dietary recommendation.
6. An apparatus according to any one of the preceding claims wherein the memory and computer program code are configured to, with the processor, enable the apparatus at least to:
receive a user profile, and analyse the selected data and the user profile; and provide information about the food product based on the analysis of the selected data and the user profile.
7. An apparatus according to any one of the preceding claims wherein each predefined data criterion includes a data quality criterion and/or a data accuracy criterion.
8. An apparatus according to any one of the preceding claims wherein each of the first and second food data includes food volume data and/or food content data.
9. An apparatus according to any one of the preceding claims wherein the memory and computer program code are configured to, with the processor, enable the apparatus at least to:
receive the image data of the food product;
receive the audio and/or text data of the food product;
obtain the first food data from the image data;
obtain the second food data from the audio and/or text data;
determine which one of the predefined data criteria is met; and determine the selected data based on which one of the predefined data criteria is met.
10. An apparatus according to Claim 9 wherein the memory and computer program code are configured to, with the processor, enable the apparatus at least to:
determine whether the image data meets a predefined image quality threshold; and obtain the first food data from the image data if the image data is determined to meet the predefined image quality threshold.
11. An apparatus according to Claim 10 wherein the memory and computer program code are configured to, with the processor, enable the apparatus at least to:
obtain a first food volume data from the image data if the image data is determined to meet the predefined image quality threshold, wherein the first food data includes the first food volume data.
12. An apparatus according to Claim 10 or Claim 11 wherein the memory and computer program code are configured to, with the processor, enable the apparatus at least to:
obtain a first food content data from the image data if the image data is determined to meet the predefined image quality threshold, wherein the first food data includes the first food content data.
13. An apparatus according to any one of Claims 9 to 12 wherein the memory and computer program code are configured to, with the processor, enable the apparatus at least to:
determine whether the audio and/or text data meets a predefined audio quality threshold and/or a predefined text quality threshold; and obtain the second food data from the audio and/or text data if the audio and/or text data is determined to meet the predefined audio quality threshold and/or the predefined text quality threshold.
14. An apparatus according to Claim 13 wherein the memory and computer program code are configured to, with the processor, enable the apparatus at least to:
obtain a second food volume data from the audio and/or text data if the audio and/or text data is determined to meet the predefined audio quality threshold and/or the predefined text quality threshold, wherein the second food data includes the second food volume data.
15. An apparatus according to Claim 14 when dependent from Claim 11, wherein the memory and computer program code are configured to, with the processor, enable the apparatus at least to:
prioritise the first food volume data over the second food volume data for inclusion in the selected data, or prioritise the second food volume data over the first food volume data for inclusion in the selected data.
16. An apparatus according to any one of Claims 13 to 15 wherein the memory and computer program code are configured to, with the processor, enable the apparatus at least to:
obtain a second food content data from the audio and/or text data if the audio and/or text data is determined to meet the predefined audio quality threshold and/or the predefined text quality threshold, wherein the second food data includes the second food content data.
17. An apparatus according to Claims 12 and 16 wherein the memory and computer program code are configured to, with the processor, enable the apparatus at least to:
determine whether the first and second food content data meet a predefined identity threshold;
include any of the first and second food content data in the selected data if the first and second food content data are determined to meet a predefined identity threshold; and compare the first and second food content data to select one of the first and second food content data to be included in the selected data if the first and second food content data are determined to not meet the predefined identity threshold.
18. An apparatus according to Claim 17 wherein each of the first and second food content data includes data accuracy information, and the memory and computer program code are configured to, with the processor, enable the apparatus at least to:
compare the data accuracy information of the first and second food content data if the first and second food content data are determined to not meet the predefined identity threshold; and include the one of the first and second food content data with the higher data accuracy in the selected data.
19. An apparatus according to Claim 18 wherein the data accuracy information includes a confidence rate.
20. An apparatus according to any one of Claims 9 to 19 wherein the memory and computer program code are configured to, with the processor, enable the apparatus at least to:
receive length data of a body part of a user; and use the length data as a calibration to obtain the first food data from the image data, wherein the image data is derived from an image of the food product with the body part.
21. An apparatus according to Claim 20 when dependent on Claim 11, wherein the memory and computer program code are configured to, with the processor, enable the apparatus at least to use the length data as a calibration to obtain the first food volume data from the image data.
22. An apparatus according to Claim 20 or Claim 21 wherein the body part is a finger, a finger segment, a thumb, or a thumb segment.
23. An apparatus according to any one of the preceding claims wherein the apparatus is or includes one or more of an electronic device, a portable electronic device, a portable telecommunications device, a mobile phone, a personal digital assistant, a tablet, a phablet, a desktop computer, a laptop computer, a server, a cloud computing network, a smartphone, a smartwatch, smart eyewear, and a module for one or more of the same.
24. An apparatus according to any one of the preceding claims wherein the apparatus includes an image capture device for obtaining the image data of the food product, an audio capture device for obtaining the audio data of the food product, and/or a text capture device for obtaining the text data of the food product.
25. A computer-implemented method of providing information about a food product, the method comprising the steps of:
receiving a selected data of the food product; analysing the selected data; and providing information about the food product based on the analysis of the selected data, wherein the selected data is configured to include the first food data and omit the second food data if a first predefined data criterion is met, and wherein the selected data is configured to include the second food data and omit the first food data if a second predefined data criterion is met, wherein the first food data is obtained from an image data of the food product, and the second food data is obtained from an audio data and/or a text data of the food product.
26. A computer program comprising computer code configured to perform the method of Claim 25.
Amendments to the claims have been filed as follows:

CLAIMS
1. An apparatus for providing information about a food product, the apparatus comprising a processor and memory including computer program code, the memory and computer program code configured to, with the processor, enable the apparatus at least to:
receive selected data of the food product; analyse the selected data; and provide information about the food product based on the analysis of the selected data, wherein the selected data is configured to include a first food data and omit a second food data if a first predefined data criterion is met, and wherein the selected data is configured to include the second food data and omit the first food data if a second predefined data criterion is met, wherein the first food data is obtained from an image data of the food product, and the second food data is obtained from an audio data and/or a text data of the food product, wherein each predefined data criterion includes a data quality criterion and/or a data accuracy criterion, wherein each of the first and second food data includes food volume data and/or food content data.
2. An apparatus according to Claim 1 wherein the selected data is configured to additionally include or omit a further food data if a further predefined data criterion is met, wherein the further food data is obtained from data of the food product that is identifiable by the apparatus.
3. An apparatus according to Claim 1 or Claim 2 wherein the selected data is configured to include a combination of the food data if a third predefined data criterion is met.
4. An apparatus according to any one of the preceding claims wherein the analysis of the selected data includes one or more of food mass determination, food nutrient determination, and food calorie determination.
5. An apparatus according to any one of the preceding claims wherein the information about the food product includes one or more of food mass, food nutrient, food calories, health profile, medical information, disease information, and dietary recommendation.
6. An apparatus according to any one of the preceding claims wherein the memory and computer program code are configured to, with the processor, enable the apparatus at least to:
receive a user profile, and analyse the selected data and the user profile; and provide information about the food product based on the analysis of the selected data and the user profile.
7. An apparatus according to any one of the preceding claims wherein the memory and computer program code are configured to, with the processor, enable the apparatus at least to:
receive the image data of the food product; receive the audio and/or text data of the food product; obtain the first food data from the image data; obtain the second food data from the audio and/or text data; determine which one of the predefined data criteria is met; and determine the selected data based on which one of the predefined data criteria is met.
8. An apparatus according to Claim 7 wherein the memory and computer program code are configured to, with the processor, enable the apparatus at least to:
determine whether the image data meets a predefined image quality threshold; and obtain the first food data from the image data if the image data is determined to meet the predefined image quality threshold.
9. An apparatus according to Claim 8 wherein the memory and computer program code are configured to, with the processor, enable the apparatus at least to:
obtain a first food volume data from the image data if the image data is determined to meet the predefined image quality threshold, wherein the first food data includes the first food volume data.
10. An apparatus according to Claim 8 or Claim 9 wherein the memory and computer program code are configured to, with the processor, enable the apparatus at least to:
obtain a first food content data from the image data if the image data is determined to meet the predefined image quality threshold, wherein the first food data includes the first food content data.
11. An apparatus according to any one of Claims 7 to 10 wherein the memory and computer program code are configured to, with the processor, enable the apparatus at least to:
determine whether the audio and/or text data meets a predefined audio quality threshold and/or a predefined text quality threshold; and obtain the second food data from the audio and/or text data if the audio and/or text data is determined to meet the predefined audio quality threshold and/or the predefined text quality threshold.
12. An apparatus according to Claim 11 wherein the memory and computer program code are configured to, with the processor, enable the apparatus at least to:
obtain a second food volume data from the audio and/or text data if the audio and/or text data is determined to meet the predefined audio quality threshold and/or the predefined text quality threshold, wherein the second food data includes the second food volume data.
13. An apparatus according to Claim 12 when dependent from Claim 9, wherein the memory and computer program code are configured to, with the processor, enable the apparatus at least to:
prioritise the first food volume data over the second food volume data for inclusion in the selected data, or prioritise the second food volume data over the first food volume data for inclusion in the selected data.
14. An apparatus according to any one of Claims 11 to 13 wherein the memory and computer program code are configured to, with the processor, enable the apparatus at least to:
obtain a second food content data from the audio and/or text data if the audio and/or text data is determined to meet the predefined audio quality threshold and/or the predefined text quality threshold, wherein the second food data includes the second food content data.
15. An apparatus according to Claims 10 and 14 wherein the memory and computer program code are configured to, with the processor, enable the apparatus at least to:
determine whether the first and second food content data meet a predefined identity threshold; include any of the first and second food content data in the selected data if the first and second food content data are determined to meet a predefined identity threshold; and compare the first and second food content data to select one of the first and second food content data to be included in the selected data if the first and second food content data are determined to not meet the predefined identity threshold.
16. An apparatus according to Claim 15 wherein each of the first and second food content data includes data accuracy information, and the memory and computer program code are configured to, with the processor, enable the apparatus at least to:
compare the data accuracy information of the first and second food content data if the first and second food content data are determined to not meet the predefined identity threshold; and include the one of the first and second food content data with the higher data accuracy in the selected data.
17. An apparatus according to Claim 16 wherein the data accuracy information includes a confidence rate.
18. An apparatus according to any one of Claims 7 to 17 wherein the memory and computer program code are configured to, with the processor, enable the apparatus at least to:
receive length data of a body part of a user; and use the length data as a calibration to obtain the first food data from the image data, wherein the image data is derived from an image of the food product with the body part.
19. An apparatus according to Claim 18 when dependent on Claim 9, wherein the memory and computer program code are configured to, with the processor, enable the apparatus at least to use the length data as a calibration to obtain the first food volume data from the image data.
20. An apparatus according to Claim 18 or Claim 19 wherein the body part is a finger, a finger segment, a thumb, or a thumb segment.
21. An apparatus according to any one of the preceding claims wherein the apparatus is or includes one or more of an electronic device, a portable electronic device, a portable telecommunications device, a mobile phone, a personal digital assistant, a tablet, a phablet, a desktop computer, a laptop computer, a server, a cloud computing network, a smartphone, a smartwatch, smart eyewear, and a module for one or more of the same.
22. An apparatus according to any one of the preceding claims wherein the apparatus includes an image capture device for obtaining the image data of the food product, an audio capture device for obtaining the audio data of the food product, and/or a text capture device for obtaining the text data of the food product.
23. A computer-implemented method of providing information about a food product, the method comprising the steps of:
receiving a selected data of the food product; analysing the selected data; and providing information about the food product based on the analysis of the selected data, wherein the selected data is configured to include the first food data and omit the second food data if a first predefined data criterion is met, and wherein the selected data is configured to include the second food data and omit the first food data if a second predefined data criterion is met, wherein the first food data is obtained from an image data of the food product, and the second food data is obtained from an audio data and/or a text data of the food product, wherein each predefined data criterion includes a data quality criterion and/or a data accuracy criterion, wherein each of the first and second food data includes food volume data and/or food content data.
24. A computer program comprising computer code configured to perform the method of Claim 23.
Intellectual Property Office
Application No: GB1705975.9 Examiner: Robert Shorthouse
GB1705975.9A 2017-04-13 2017-04-13 Apparatus for providing information about a food product Withdrawn GB2561556A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
GB1705975.9A GB2561556A (en) 2017-04-13 2017-04-13 Apparatus for providing information about a food product

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
GB1705975.9A GB2561556A (en) 2017-04-13 2017-04-13 Apparatus for providing information about a food product

Publications (2)

Publication Number Publication Date
GB201705975D0 GB201705975D0 (en) 2017-05-31
GB2561556A true GB2561556A (en) 2018-10-24

Family

ID=58744517

Family Applications (1)

Application Number Title Priority Date Filing Date
GB1705975.9A Withdrawn GB2561556A (en) 2017-04-13 2017-04-13 Apparatus for providing information about a food product

Country Status (1)

Country Link
GB (1) GB2561556A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US12198795B2 (en) 2020-10-15 2025-01-14 Beijing Boe Technology Development Co., Ltd. Calculation method of food volume and food calories, electronic apparatus, electronic device and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050113649A1 (en) * 2003-07-28 2005-05-26 Bergantino Paul V. Method and apparatus for managing a user's health
US20100111383A1 (en) * 2008-09-05 2010-05-06 Purdue Research Foundation Dietary Assessment System and Method
US20100173269A1 (en) * 2009-01-07 2010-07-08 Manika Puri Food recognition using visual analysis and speech recognition
WO2011143513A1 (en) * 2010-05-12 2011-11-17 Zipongo System and method for automated personalized and community-specific eating and activity planning, linked to tracking with automated multimodal item identification and size estimation
CN105468912A (en) * 2015-11-26 2016-04-06 四川长虹电器股份有限公司 Diet health management system based on image and speech recognition


Also Published As

Publication number Publication date
GB201705975D0 (en) 2017-05-31

Similar Documents

Publication Publication Date Title
Yunus et al. A framework to estimate the nutritional value of food in real time using deep learning techniques
Zhu et al. Multiple hypotheses image segmentation and classification with application to dietary assessment
CN113642562B (en) Data interpretation method, device, equipment and storage medium based on image recognition
US8737702B2 (en) Systems and methods for automated extraction of measurement information in medical videos
CN111599438B (en) A real-time dietary health monitoring method for diabetic patients based on multimodal data
CN113855038B (en) Electrocardiosignal critical value prediction method and device based on multi-model integration
Pouladzadeh et al. You are what you eat: So measure what you eat!
Sultana et al. A study on food value estimation from images: Taxonomies, datasets, and techniques
Minija et al. Food image classification using sphere shaped—Support vector machine
Kaur et al. Combining weakly and webly supervised learning for classifying food images
US20230024468A1 (en) Systems and methods to process electronic images to determine histopathology quality
CN119046493A (en) Image retrieval method based on fusion of multi-scale features and spatial attention mechanism
Saad et al. Diet engine: A real-time food nutrition assistant system for personalized dietary guidance
Cofre et al. Validity and accuracy of artificial intelligence-based dietary intake assessment methods: a systematic review
Bahadur et al. Using deep learning techniques, a framework for estimating the nutritional value of food in real time
GB2561556A (en) Apparatus for providing information about a food product
Minija et al. Neural network classifier and multiple hypothesis image segmentation for dietary assessment using calorie calculator
Knez et al. Food object recognition using a mobile device: State of the art
Hakguder et al. Smart Diet Management through Food Image and Cooking Recipe Analysis
Burkapalli et al. An efficient food image classification by inception-V3 based CNNs
CN116469505A (en) Data processing method, device, computer equipment and readable storage medium
Pouladzadeh A cloud-assisted mobile food recognition system
Huang et al. Use of smartphones to estimate carbohydrates in foods for diabetes management
CN114068028A (en) Medical consultation data processing method and device, readable storage medium and electronic device
Bouslimi et al. New approach for automatic medical image annotation using the bag-of-words model

Legal Events

Date Code Title Description
WAP Application withdrawn, taken to be withdrawn or refused ** after publication under section 16(1)