Hi! Can I ask for some details about the 'RL w. Text-only CoT' setting in Table 5 of your paper?
In this setting, did you ask the model to also output the box coordinates, but without feeding the cropped image back into the LLM? Or did the model only output its thinking process, as in traditional CoT?
Looking forward to your reply. Thanks!