A simple Streamlit app based on Open Food Facts data
The data displayed in this app is based on the Open Food Facts database, which is a collaborative project that collects and shares information about food products from around the world. The data is available under the Open Database License (ODbL).
- uv
- Python 3.12 or higher (can be installed via uv python install 3.12)
- make (highly recommended)
Note
On Windows, you can install GNU Make as mentioned here.
On Linux and macOS, Make should be installed by default.
Using make
make install
Using uv
uv sync --frozen --no-dev
Important
Before running the app, export the PYTHONPATH environment variable so that it includes the src directory:
export PYTHONPATH="./src"
If you are familiar with direnv, you can use it to set the PYTHONPATH variable automatically when you enter the project directory, via the .envrc file.
A uv feature to add a directory path to PYTHONPATH automatically is currently under discussion here.
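To confirm that the export took effect, a small check along these lines may help. It is only a sketch using the standard library: the helper name `src_on_pythonpath` is hypothetical and not part of this project, and it assumes the check is run from the project root so that `./src` resolves to the right directory.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def src_on_pythonpath(
    src_dir: str = "./src",
    env: Optional[Mapping[str, str]] = None,
) -> bool:
    """Return True if src_dir is listed in the PYTHONPATH variable.

    Hypothetical helper for illustration; not part of this project.
    """
    environ = os.environ if env is None else env
    raw = environ.get("PYTHONPATH", "")
    target = Path(src_dir).resolve()
    # PYTHONPATH entries are separated by os.pathsep
    # (":" on Linux and macOS, ";" on Windows); skip empty entries.
    entries = [entry for entry in raw.split(os.pathsep) if entry]
    # Compare resolved paths so "./src" and "src" are treated the same.
    return any(Path(entry).resolve() == target for entry in entries)
```

Calling `src_on_pythonpath()` from the project root reads the real environment; passing a dictionary as `env` makes it possible to try the check without touching the shell.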
There are multiple ways to run this app:
# using the provided make command
make app
# using uv
uv run streamlit run app.py
# using streamlit within your virtual environment
streamlit run app.py
- First, download the parquet file from the OFF database, available here.
- From there, you can start exploring using duckdb.
Assuming you are using the CLI version of duckdb, you can run the following commands to start exploring the data:
# enter duckdb CLI
duckdb
# create a persistent database named food.duckdb
.open food.duckdb
-- Print all available columns in the parquet file
DESCRIBE read_parquet('food.parquet');
-- create a table from the parquet file
CREATE TABLE IF NOT EXISTS off_french_food_analysis AS (
    SELECT g['unnest']['text'] AS product_name,
           brands AS brand,
           quantity,
           nutriscore_grade AS nutriscore,
           code AS barcode,
           additives_n,
           allergens_tags,
           categories,
           ingredients,
           manufacturing_places,
           owner,
           stores
    FROM read_parquet('food.parquet') AS f,
         UNNEST(f.generic_name) AS g
    WHERE lang = 'fr' -- only keep French products
      AND obsolete IS FALSE
      AND completeness > 0.8
      AND g['unnest']['lang'] = 'fr' -- only keep products with a French description available
      AND nutriscore_grade NOT IN ('not-applicable', 'unknown')
      AND len(code) = 13 -- only keep products with a 13-digit code
    ORDER BY nutriscore_grade DESC
);
-- export the table to a parquet file
COPY (SELECT * FROM off_french_food_analysis)
TO 'off_french_food_analysis.parquet' (FORMAT PARQUET);
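The filter conditions of the query above can also be sketched in plain Python, which may help when reasoning about which products are kept. This is an illustration under stated assumptions, not code from this project: the names `has_french_generic_name` and `keep_product` are hypothetical, a row is assumed to be a dict keyed by the column names used in the query, and `generic_name` is assumed to be a list of entries with `lang` and `text` keys, as the query's use of `g['unnest']['lang']` suggests.

```python
from typing import Any, Mapping

# Grades the query excludes via NOT IN (...).
EXCLUDED_GRADES = {"not-applicable", "unknown"}


def has_french_generic_name(row: Mapping[str, Any]) -> bool:
    """True if at least one generic_name entry is tagged lang == 'fr'."""
    entries = row.get("generic_name") or []
    return any(entry.get("lang") == "fr" for entry in entries)


def keep_product(row: Mapping[str, Any]) -> bool:
    """Mirror the WHERE clause of the query (hypothetical helper).

    Missing values are rejected, matching how SQL drops rows whose
    comparison evaluates to NULL.
    """
    grade = row.get("nutriscore_grade")
    completeness = row.get("completeness")
    code = row.get("code")
    return (
        row.get("lang") == "fr"
        and row.get("obsolete") is False
        and completeness is not None
        and completeness > 0.8
        and has_french_generic_name(row)
        and grade is not None
        and grade not in EXCLUDED_GRADES
        and code is not None
        and len(code) == 13
    )
```

One difference from the SQL is that the query emits one output row per matching `generic_name` entry, whereas this predicate only answers whether a product would appear at all.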
This project is licensed under the GNU General Public License v3.0.
Feel free to open issues or pull requests. Contributions are welcome!