Distil-expenses

We trained SLM assistants for personal expenses summaries - two Llama 3.2 models (1B and 3B parameters) that you can run locally via Ollama!

1. Installation

First, install Ollama, following the instructions on their website.

Then set up the virtual environment:

python -m venv .venv
. .venv/bin/activate
pip install huggingface_hub pandas openai

Available models hosted on huggingface:

Finally, download the models from huggingface and build them locally:

hf download distil-labs/Distil-expenses-Llama-3.2-3B-Instruct --local-dir distil-model

cd distil-model
ollama create expense_llama3.2 -f Modelfile

2. Run the assistant

Next, we load the model and the expenses csv file. By default we load the downloaded Llama3.2 3B model and transactions.csv, but you can also provide different paths.

python finance_tool_demo.py

# optionally, if you change the model name or file name
python finance_tool_demo.py --model <model_name> --file <file_name>

The assistant can answer queries about expenses over all categories or limited to 1 category.

Assistant features:

expense sums (optional min/max limits)
expense counts (optional min/max limits)
monthly average
compare two periods
exit - exit gracefully (or just hit ctrl + c)

3. Examples

Sum:

What was my total spending on dining in January 2024?

ANSWER:  From 2024-01-01 to 2024-01-31 you spent 24.5 total on dining.
--------------------------------------------------
Give me my total expenses from 5th February to 11th March 2024

ANSWER:  From 2024-02-05 to 2024-03-11 you spent 348.28 total.
--------------------------------------------------

Count:

How many times did I go shopping over $100 in 2024?

ANSWER:  From 2024-01-01 to 2024-12-31 you spent 8 times over 100 on shopping.
--------------------------------------------------
Count all my shopping under $100 in the first half of 2024

ANSWER:  From 2024-01-01 to 2024-06-30 you spent 6 times under 100 on shopping.
--------------------------------------------------

Compare:

Compare shopping spending in March 2024 and in May 2024

ANSWER:  You spent from 2024-03-01 to 2024-03-31 LESS than from 2024-05-01 to 2024-05-31 by 164.05.
--------------------------------------------------
Did I spend more in Q1 2024 or Q2 2024?

ANSWER:  You spent from 2024-01-01 to 2024-03-31 LESS than from 2024-04-01 to 2024-06-30 by 392.36.
--------------------------------------------------

Averages:

What's my average spending on entertainment until end May in 2024?

ANSWER:  On average you spent monthly 14.79 (73.97 / 5) from 2024-01-01 to 2024-05-31 on entertainment.
--------------------------------------------------

3. Use your own data

If you want to use your own expenses documents, you have to use the same format as transactions.csv:

date,provider_name,amount,category
2024-01-05,Whole Foods,-145.32,shopping
2024-01-10,Netflix,-15.99,entertainment
2024-01-18,Shell Gas Station,-52.40,transportation
...

Mandatory columns are date, amount and category - any other columns are ignored. The date has to be in the YYYY-MM-DD format, expenses should be negative while income should be positive. You can use any categories (more common categories are more suitable).

Next, pass the path of your file to the script, for example:

python finance_tool_demo.py --file ~/Documents/expenses.csv

5. Fine-tuning setup

The tuned models were trained using knowledge distillation, leveraging the teacher model GPT-OSS 120B. We used 24 train examples and complemented them with 2500 synthetic examples.

We compare the teacher model and both student models on 25 held-out test examples:

Model	Correct (25)	Tool call accuracy
GPT-OSS	23	0.92
Llama3.2 3B (tuned)	22	0.88
Llama3.2 1B (tuned)	23	0.92
Llama3.2 3B (base)	6	0.24
Llama3.2 1B (base)	0	0.00

The training config file and train/test data splits are available under data/. UPDATE: +1 correctly classified example due to missed mislabeling (GPT-OSS, Llama 3.2 1B 0.88 -> 0.92, 3B 0.82 -> 0.88)

FAQ

Q: Why don't we just use Llama3.X yB for this??

We focus on small models (< 8B parameters), and these make errors when used out of the box (see 5.)

Q: The model does not work as expected

A: The tool calling on our platform is in active development! Follow us on LinkedIn for updates, or join our community. You can also try to rephrase your query.

Q: I want to use tool calling for my use-case

A: Visit our website and reach out to us, we offer custom solutions.

Q: Do you support multi-turn or chained queries?

A: Not yet (see previous questions).

Name		Name	Last commit message	Last commit date
Latest commit History 13 Commits
data		data
.gitignore		.gitignore
README.md		README.md
finance_tool_demo.py		finance_tool_demo.py
llogo.png		llogo.png
transactions.csv		transactions.csv

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Distil-expenses

1. Installation

2. Run the assistant

3. Examples

3. Use your own data

5. Fine-tuning setup

FAQ

About

Uh oh!

Releases

Packages

Languages

distil-labs/Distil-expenses

Folders and files

Latest commit

History

Repository files navigation

Distil-expenses

1. Installation

2. Run the assistant

3. Examples

3. Use your own data

5. Fine-tuning setup

FAQ

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages