Infera is a DuckDB extension that lets you run machine learning (ML) models directly in SQL queries to perform inference on data stored in DuckDB tables. It is written in Rust and uses Tract as its inference backend. Infera supports loading and running models in the ONNX format. Check out the ONNX Model Zoo repository on Hugging Face for a large collection of ready-to-use models that work with Infera.
In a conventional data science workflow, ML models usually cannot be applied to data while it sits in a database. Users must first move the data out of the database (for example, export it to a CSV file), load it into a Python or R environment, run the model there, and then import the results back into the database. This process is slow and inefficient. Infera solves this problem by letting users run ML models directly in SQL queries inside the database, which simplifies the workflow, speeds it up, and eliminates the need to move data around.
- Adds ML inference as a first-class citizen in SQL queries.
- Supports loading and using local as well as remote models.
- Supports using ML models in ONNX format with a simple and flexible API.
- Supports performing inference on table columns or raw tensor data.
- Supports both single-value and multi-value model outputs.
- Supports autoloading all models from a specified directory.
- Thread-safe, fast, and memory-efficient.
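As a hedged sketch of the column-inference feature above: the snippet below reuses the quickstart's linear model, but the table and column names are made up, and the exact column-inference API may differ from what is shown — see the docs directory for the authoritative reference.

```sql
-- Hypothetical example: apply the quickstart's linear model to table columns.
-- Assumes infera_predict accepts column references the same way it accepts
-- scalar literals; the table `measurements` and its columns are invented.
create table measurements (x1 float, x2 float, x3 float);
insert into measurements values (1.0, 2.0, 3.0), (0.5, 0.5, 0.5);

select infera_load_model('linear_model',
    'https://github.com/CogitatorTech/infera/raw/refs/heads/main/test/models/linear.onnx');

-- Run the model row-by-row over the three feature columns
select infera_predict('linear_model', x1, x2, x3) as prediction
from measurements;

select infera_unload_model('linear_model');
```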
See ROADMAP.md for the list of implemented and planned features.
Important
Infera is in early development, so bugs and breaking changes are expected. Please use the issues page to report bugs or request features.
- Clone the repository and build the Infera extension from source:
git clone --recursive https://github.com/CogitatorTech/infera.git
cd infera
make release
- Start the DuckDB shell (with Infera statically linked into it):
./build/release/duckdb
- Run the following SQL commands in the shell to try Infera out:
-- Normally, we need to load the extension first,
-- but the `duckdb` binary that we built in the previous step
-- already has Infera statically linked to it.
-- So, we don't need to load the extension explicitly.
-- 1. Load a simple linear model from a remote URL
select infera_load_model('linear_model',
'https://github.com/CogitatorTech/infera/raw/refs/heads/main/test/models/linear.onnx');
-- 2. Run a prediction using a very simple linear model
-- Model: y = 2*x1 - 1*x2 + 0.5*x3 + 0.25
select infera_predict('linear_model', 1.0, 2.0, 3.0);
-- Expected output: 1.75
-- 3. Unload the model when we're done with it
select infera_unload_model('linear_model');
-- 4. Check the Infera version
select infera_get_version();
Note
After building from source, the extension binary will be at `build/release/extension/infera/infera.duckdb_extension`.
You can load it in the DuckDB shell with `load 'build/release/extension/infera/infera.duckdb_extension';`.
Note that the extension binary will only work with the DuckDB version that it was built against.
At the moment, Infera is not available as a DuckDB community extension.
Nevertheless, you can still use Infera by building it from source yourself, or by downloading a pre-built binary for your platform from the releases page.
Please check this page for more details on how to use extensions in DuckDB.
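For a pre-built or locally built binary, loading looks roughly like the sketch below. Note that DuckDB refuses to load unsigned third-party extensions unless it is started with unsigned extensions allowed (for example, `duckdb -unsigned` in the CLI); the path shown is the build-from-source location and should be adjusted for a downloaded release binary.

```sql
-- Sketch: loading an Infera binary into a stock DuckDB shell.
-- The shell must be started with unsigned extensions allowed,
-- e.g. `duckdb -unsigned`, for a third-party binary to load.
load 'build/release/extension/infera/infera.duckdb_extension';

-- Verify the extension is loaded
select infera_get_version();
```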
Check out the docs directory for the API documentation, how to build Infera from source, and more.
Check out the examples directory for SQL scripts that show how to use Infera.
See CONTRIBUTING.md for details on how to make a contribution.
Infera is available under either of the following licenses:
- MIT License (LICENSE-MIT)
- Apache License, Version 2.0 (LICENSE-APACHE)
- The logo is from here with some modifications.