CAprogs/off_explorer

About

A simple Streamlit app based on Open Food Facts data.

The data displayed in this app is based on the Open Food Facts database, which is a collaborative project that collects and shares information about food products from around the world. The data is available under the Open Database License (ODbL).

Prerequisites

  • uv
  • Python 3.12 or higher (can be installed via uv python install 3.12)
  • make (highly recommended)

Note

On Windows, you can install GNU Make as mentioned here.
On Linux and macOS, Make should be installed by default.

Installation

Using make

make install

Using uv

uv sync --frozen --no-dev

Running the app

Important

Before running the app, export the PYTHONPATH environment variable so that it includes the src directory:

export PYTHONPATH="./src"

If you are familiar with direnv, you can use it to automatically set the PYTHONPATH variable when you enter the project directory via the .envrc file.
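As a rough illustration of the direnv approach, a .envrc along these lines would have the same effect as the manual export. This is a sketch only: the repository's actual .envrc may differ, and the use of `$PWD` to get an absolute path is an assumption.

```shell
# .envrc (illustrative sketch; the repository's actual file may differ)
# direnv evaluates this file each time you enter the project directory.
# Using $PWD (an assumption) yields an absolute path instead of "./src".
export PYTHONPATH="$PWD/src"
```

With direnv, a new or modified .envrc has to be approved once with `direnv allow` before it is loaded.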

A feature to automatically add a directory path to PYTHONPATH via uv is currently being discussed here.

There are multiple ways to run this app:

# using the provided make command
make app

# using uv
uv run streamlit run app.py

# using streamlit within your virtual environment
streamlit run app.py

Data extraction & analysis

Reproducing the extracted data from this project

  • First download the parquet file from the OFF database available here.

  • From there you can start exploring using duckdb

Assuming you are using the CLI version of duckdb, you can run the following commands to start exploring the data:

# enter duckdb CLI
duckdb

-- inside the duckdb CLI: open (or create) a persistent database named food.duckdb
.open food.duckdb

-- print all available columns in the parquet file
DESCRIBE read_parquet('food.parquet');

-- create a table from the parquet file
CREATE TABLE IF NOT EXISTS off_french_food_analysis AS (
    SELECT g['unnest']['text'] AS product_name,
            brands AS brand,
            quantity,
            nutriscore_grade AS nutriscore,
            code AS barcode,
            additives_n,
            allergens_tags,
            categories,
            ingredients,
            manufacturing_places,
            owner,
            stores
        FROM read_parquet('food.parquet') AS f,
            UNNEST(f.generic_name) AS g
            WHERE lang = 'fr' -- only keep french products
            AND obsolete IS FALSE
            AND completeness > 0.8
            AND g['unnest']['lang'] = 'fr' -- only keep products with french description available
            AND nutriscore_grade NOT IN ('not-applicable','unknown')
            AND len(code) = 13 -- only keep products with a 13-digit code
            ORDER BY nutriscore_grade DESC
        );

-- export the table to a parquet file
COPY (SELECT * FROM off_french_food_analysis)
TO 'off_french_food_analysis.parquet' (FORMAT PARQUET);
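
To confirm the export worked without reopening an interactive session, something like the following could be used. It is a minimal sketch, assuming the duckdb CLI is on your PATH and that the COPY step above has already been run. `check_export` is a hypothetical helper name, not part of this project.

```shell
# check_export is a hypothetical helper (not part of this project).
# It prints the number of products per Nutri-Score grade found in the
# exported parquet file, using the duckdb CLI's -c option to run one query.
check_export() {
    file="${1:-off_french_food_analysis.parquet}"
    if [ ! -f "$file" ]; then
        echo "missing: $file" >&2
        return 1
    fi
    duckdb -c "SELECT nutriscore, count(*) AS products FROM read_parquet('$file') GROUP BY nutriscore ORDER BY nutriscore;"
}
```

Call `check_export` with no argument to inspect the default output file, or pass another path as the first argument.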

License

This project is licensed under the GNU General Public License v3.0.

Contributing

Feel free to open issues or pull requests. Contributions are welcome!
