New Feature: Ollama support for reasoning models #7
Conversation
Added Ollama support for DeepSeek models.
- Added an example of how to use DeepSeek models with Ollama in `ReagClient`
- Added `ollama` to `tool.poetry.dependencies`
Review comment on `python/src/reag/client.py` (outdated):

```python
        )
    except json.JSONDecodeError:
        print("Error: Could not parse response:", message_content)
    content, reasoning, is_irrelevant = self._extract_think_content(message_content)
```
@g-hano why is this added here? Shouldn't it be in the try block?
Just updated the code. Please let me know if you'd like to see any other improvements.
moved `self._extract_think_content()` inside try block
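For context, the control flow being discussed can be sketched roughly as follows (a minimal illustration, not the actual ReAG code: the function name `parse_response` and the JSON shape are assumptions):

```python
import json


def parse_response(message_content: str):
    """Illustrative sketch of the fix: keep all parsing, including the
    think-tag extraction step, inside the try block so a malformed
    response is handled in one place. Not the actual ReAG implementation."""
    try:
        data = json.loads(message_content)  # may raise json.JSONDecodeError
        text = data.get("response", "")
        # In the PR, self._extract_think_content(...) is now called here,
        # inside the try block, as the reviewer requested.
        return text
    except json.JSONDecodeError:
        print("Error: Could not parse response:", message_content)
        return None
```

With this shape, `parse_response('{"response": "ok"}')` returns `"ok"`, while invalid JSON falls through to the error branch and returns `None`.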
@g-hano thanks for your PR. I've used your code to run DeepSeek-R1 with Ollama. It outputs the Reasoning, but not the Content. Does it work fine on your end? If I run it with o3-mini it outputs the Content correctly. BTW, I had to make some changes in pyproject.toml to make it work with my system-wide Ollama v0.5.7.
Ok, I realized that the output of `ollama/deepseek-r1*` models doesn't exactly follow the structure expected by the `_extract_think_content` function. By making the edit below, I was able to make it work:

```python
def _extract_think_content(self, text: str) -> tuple[str, str, bool]:
    """
    Extract reasoning from the <think> block and use everything after the
    </think> tag as content.
    """
    # Extract reasoning from the <think>...</think> block.
    reasoning_match = re.search(r'<think>(.*?)</think>', text, flags=re.DOTALL)
    reasoning = reasoning_match.group(1).strip() if reasoning_match else ""

    # Find the position of the closing </think> tag.
    end_think_tag = "</think>"
    end_index = text.find(end_think_tag)

    # If found, content is everything after the </think> tag; otherwise, use the whole text.
    if end_index != -1:
        content = text[end_index + len(end_think_tag):].strip()
    else:
        content = text.strip()

    # You can decide what to do with is_irrelevant.
    # Here we simply set it to False, meaning the content is relevant.
    is_irrelevant = False
    return content, reasoning, is_irrelevant
```
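The parsing above can be exercised standalone. This sketch lifts the logic into a module-level function (`extract_think_content` is a copy for illustration, not part of the ReAG API) and runs it on a made-up DeepSeek-R1-style response:

```python
import re


def extract_think_content(text: str) -> tuple[str, str, bool]:
    """Split a DeepSeek-R1-style response into (content, reasoning, is_irrelevant)."""
    # Reasoning is whatever sits inside the <think>...</think> block.
    reasoning_match = re.search(r'<think>(.*?)</think>', text, flags=re.DOTALL)
    reasoning = reasoning_match.group(1).strip() if reasoning_match else ""

    # Content is everything after the closing tag; fall back to the whole text.
    end_tag = "</think>"
    end_index = text.find(end_tag)
    content = text[end_index + len(end_tag):].strip() if end_index != -1 else text.strip()

    # As in the comment above, is_irrelevant is simply fixed to False here.
    return content, reasoning, False


raw = "<think>The user asks about Paris.</think>Paris is the capital of France."
content, reasoning, is_irrelevant = extract_think_content(raw)
print(content)    # Paris is the capital of France.
print(reasoning)  # The user asks about Paris.
```

Responses without a `<think>` block degrade gracefully: the whole text becomes the content and the reasoning is empty.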
But with your code the method always returns false for
@g-hano thank you for your contribution
This is released in |
Overview
This PR adds Ollama support and enhances response parsing capabilities for different model outputs. Fixes #6
It adds Ollama support to ReAG by enhancing the `ReagClient` class to support local model inference and by improving response parsing.
Changes Made
- Enhanced `ReagClient` initialization: added an `api_base` parameter for Ollama support
- Added response parsing improvements: an `_extract_think_content` method to handle DeepSeek models' markdown responses

Environment
Implementation Details
The implementation adds support for:
Example Usage