
Conversation


@psinha40898 psinha40898 commented Jul 22, 2025

TL;DR

2.5 Flash Lite was not working with the Gemini CLI because, unlike the other models in the 2.5 series, flash-lite requires thinking to be explicitly turned on; otherwise it is treated as a non-thinking model.

Dive Deeper

The GenAI Gemini API states that 2.5 Pro always has thinking enabled, and 2.5 Flash has thinking enabled by default.

2.5 Flash Lite has thinking disabled by default.
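As a minimal sketch of what "explicitly turned on" means at the request level (the `thinkingConfig` field names follow the public Gemini API; the `buildRequest` helper is illustrative, and `-1` requests a dynamic thinking budget per the docs):

```typescript
// Builds a generateContent request that explicitly opts in to thinking.
// For flash-lite this opt-in is required; for 2.5 Pro/Flash it is the default.
function buildRequest(model: string, prompt: string) {
  return {
    model,
    contents: prompt,
    config: {
      thinkingConfig: {
        includeThoughts: true,
        // -1 asks the API to choose a dynamic budget; 0 would disable thinking.
        thinkingBudget: -1,
      },
    },
  };
}

const req = buildRequest('gemini-2.5-flash-lite', 'just reply with pong');
console.log(JSON.stringify(req.config.thinkingConfig));
```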

  • Gemini API: Thinking documentation
  • Vertex AI: Thinking budget documentation

Therefore, if Flash Lite is to be treated as a thinking model, and includeThoughts is an essential parameter in a minimalist generateContentConfig, then Flash Lite should be handled as a special case.

Alternatively,

  1. Flash Lite can be treated as a non-thinking model.
  2. 2.5 Pro and 2.5 Flash can receive thinkingBudget parameters, even though they do not require them to enable thinking.
  3. Flash Lite can be left with thinking disabled.

Reviewer Test Plan

Check the following commands with and without the changes:

Non-interactive prompting for flash-lite:

gemini -m gemini-2.5-flash-lite -p "just reply with pong"

Running the CLI with flash-lite:

gemini --model gemini-2.5-flash-lite

Testing Matrix

|          | 🍏 | 🪟 | 🐧 |
| -------- | --- | --- | --- |
| npm run  |     |     |     |
| npx      |     |     |     |
| Docker   |     |     |     |
| Podman   | -   | -   |     |
| Seatbelt |     | -   | -   |

Linked issues / bugs

Fixes #4548

@psinha40898 psinha40898 requested a review from a team as a code owner July 22, 2025 12:09

@gemini-code-assist gemini-code-assist bot left a comment


Summary of Changes

Hello @psinha40898, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a functional issue with the gemini-2.5-flash-lite model within the Gemini CLI. The core problem was that flash-lite requires explicit enablement of its 'thinking' capability, unlike other gemini-2.5 models which have it on by default. My changes introduce logic to correctly identify and configure models based on their specific API defaults for 'thinking', ensuring flash-lite operates as intended.

Highlights

  • Model Configuration Logic: Introduced a new helper function isThinkingDefault to accurately determine if a given Gemini model has 'thinking' enabled by default according to API specifications. This function specifically identifies gemini-2.5-flash-lite as a model where thinking is not enabled by default.
  • Flash Lite Thinking Fix: Modified the GeminiClient's content generation configuration to correctly apply thinkingConfig parameters. This change ensures that for models like gemini-2.5-flash-lite, where thinking is supported but not enabled by default, the thinkingBudget: -1 parameter is explicitly added to enable thinking functionality, resolving the issue with its behavior in the CLI.
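The logic described in the highlights can be sketched roughly as follows (the name `isThinkingDefault` mirrors the PR summary, but the surrounding details are assumptions, not the repo's exact code):

```typescript
// Per the Gemini API docs: 2.5 Pro and 2.5 Flash think by default,
// while 2.5 Flash-Lite has thinking disabled by default.
function isThinkingDefault(model: string): boolean {
  return !model.startsWith('gemini-2.5-flash-lite');
}

// Builds the thinkingConfig portion of generateContentConfig.
function getThinkingConfig(model: string) {
  const thinkingConfig: { includeThoughts: boolean; thinkingBudget?: number } = {
    includeThoughts: true,
  };
  if (!isThinkingDefault(model)) {
    // flash-lite: thinking must be requested explicitly; -1 = dynamic budget.
    thinkingConfig.thinkingBudget = -1;
  }
  return thinkingConfig;
}
```

This keeps the opt-in confined to models whose API default is "thinking off", rather than sending an unnecessary budget for 2.5 Pro and Flash.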


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request correctly adjusts the content generation configuration to enable 'thinking' for the gemini-2.5-flash-lite model, which has it disabled by default. The logic appears sound. However, my review identifies two important areas for improvement: a critical lack of unit tests for the new logic, and a high-severity issue regarding the use of a magic number, which impacts code maintainability. Addressing these points will significantly improve the quality and robustness of the changes.


teeler commented Jul 24, 2025

Thanks for the fix, LGTM modulo the comment about making -1 a constant.

You'll need to wait for another approver, but this seems fine to me.
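The suggested constant might look something like this (the name `DEFAULT_THINKING_MODE` is illustrative, not necessarily what the PR landed):

```typescript
// -1 asks the API to pick a dynamic thinking budget at request time;
// 0 would disable thinking, and a positive value caps the thinking tokens.
const DEFAULT_THINKING_MODE = -1;

console.log(DEFAULT_THINKING_MODE);
```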


psinha40898 commented Jul 24, 2025

> Thanks for the fix, LGTM modulo the comment about making -1 a constant.
>
> You'll need to wait for another approver, but this seems fine to me.

It is now corrected. Thank you for reviewing this!

@gemini-cli gemini-cli bot added kind/enhancement priority/p1 Important and should be addressed in the near term. labels Aug 22, 2025
@galz10 galz10 self-requested a review August 22, 2025 20:43
@psinha40898 psinha40898 requested a review from galz10 August 22, 2025 22:31
@galz10 galz10 enabled auto-merge August 29, 2025 16:08
@galz10 galz10 added this pull request to the merge queue Aug 29, 2025
Merged via the queue into google-gemini:main with commit f2bddfe Aug 29, 2025
18 checks passed
thacio added a commit to thacio/auditaria that referenced this pull request Aug 29, 2025
davideast pushed a commit to davideast/gemini-cli that referenced this pull request Sep 2, 2025
Co-authored-by: Gal Zahavi <38544478+galz10@users.noreply.github.com>

ei-grad commented Oct 23, 2025

This introduced some unneeded code. The issue was fixed by the merge of #3033. Very sad that nobody tried to reproduce the problem, neither here nor there. Absolutely non-Google level of engineering.


psinha40898 commented Oct 23, 2025

> This introduced some unneeded code. The issue was fixed by merge of #3033. Very sad that nobody tried to reproduce the problem. Not here, not there. Absolutely non-google level of engineering.
This PR was opened against main, and your PR was merged separately, weeks after this one was opened, because this is a big repo with duplicate issues. Sorry about that; I also didn't see your PR, which was opened first.

I still think it is better to have the code reflect the intent of the API than to hardcode the thinkingBudget at -1, but I can't make that call for sure.

Thank you for checking.



Development

Successfully merging this pull request may close these issues.

flash-lite isn't supported in non-interactive mode

5 participants