Improve ChainedLLM to handle when to move to next llm in chain, add retries to Gemini, remove retries from VLMTableStructureExtractor #1354
Conversation
If you're OK with the approach taken here, I can add some tests.
Check the async thing. Otherwise looks fine.
        return d

    @staticmethod
    def extract_table_block(html_str: str):
I don't see a concrete problem with this, but another option might be to use BeautifulSoup, which we should already have as a dependency. It theoretically handles "broken" HTML, and might theoretically be able to recover from other kinds of issues. Not a big priority.
Switched to using BeautifulSoup.
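For reference, a minimal sketch of what the BeautifulSoup version could look like (the actual code in the PR may differ; the html.parser choice and the empty-string fallback are assumptions):

from bs4 import BeautifulSoup

def extract_table_block(html_str: str) -> str:
    # html.parser is lenient, so "broken" HTML from the model still parses.
    soup = BeautifulSoup(html_str, "html.parser")
    # Keep only the first <table> element; fall back to an empty string
    # if the model produced no table at all (assumed behavior).
    table = soup.find("table")
    return str(table) if table is not None else ""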
    def __init__(
        self,
        chain: list[LLM],
        response_checker: Optional[Callable[[str], bool]] = None,
Cool, let's start with this. I'd love to get to the point where we can have a library of "validators" that can be used with LLM calls.
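As an illustration of the shape such validators could take (hypothetical names; only the Callable[[str], bool] signature comes from the diff above):

from bs4 import BeautifulSoup

def looks_like_html_table(response: str) -> bool:
    # A validator is just a predicate over the raw LLM response: return
    # True to accept it, False to make ChainedLLM try the next model.
    return BeautifulSoup(response, "html.parser").find("table") is not None

# Hypothetical usage:
# chained = ChainedLLM(chain=[cheap_llm, stronger_llm],
#                      response_checker=looks_like_html_table)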
lib/sycamore/sycamore/llms/gemini.py (outdated diff)
        timeout=120.0,
    )
    async def generate_content_async(self, model, contents, config):
        return self._client.models.generate_content(model=model, contents=contents, config=config)
Doesn't this need to be self._client.aio.models.generate_content, or need an await?
Ah yes! I'll add 'await'.
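For clarity, the corrected method would presumably look something like this (a sketch assuming the google-genai client's async surface, client.aio.models.generate_content, not necessarily the merged code):

    async def generate_content_async(self, model, contents, config):
        # Use the async client and await the coroutine instead of calling
        # the synchronous API from inside an async method.
        return await self._client.aio.models.generate_content(model=model, contents=contents, config=config)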
    if res.startswith("```html"):
        res = res[7:].rstrip("`")
    res = res.strip()
One thought is that you could have the HTML checker handle this as well, since it's pretty common for LLMs to use the backtick thing.
I don't know if we want the response checker to modify the response. Also, it's only used when the LLM is a ChainedLLM, so we still need this code for non-chained LLMs.
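One way to reconcile the two points, sketched here as a hypothetical checker that tolerates fenced output without mutating what gets returned to callers:

from bs4 import BeautifulSoup

def html_table_checker(response: str) -> bool:
    # Normalize a local copy for validation only; the original response
    # is untouched, so the fence-stripping code above is still needed
    # for non-chained LLMs.
    candidate = response.strip()
    if candidate.startswith("```html"):
        candidate = candidate[7:].rstrip("`").strip()
    return BeautifulSoup(candidate, "html.parser").find("table") is not None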