Fix fast notebook tests #1316
Conversation
Previously the _infer_prompt method would call asyncio.run to start up an event loop in the case that an LLM was in async mode. This causes issues in environments like Jupyter notebooks that already have a running event loop. To avoid conflicts, we modify this to start a new thread and run the new event loop there.
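The pattern described above can be sketched in isolation. This is a minimal, self-contained illustration, not the actual sycamore code: `infer` and `run_in_new_loop` are hypothetical names standing in for the async LLM call and the wrapper in `base_llm.py`.

```python
import asyncio
import threading


# Hypothetical stand-in for the async LLM call; illustrative only.
async def infer(prompt: str) -> str:
    await asyncio.sleep(0.01)  # simulate async I/O
    return f"response to {prompt}"


def run_in_new_loop(coro):
    """Run a coroutine on a fresh event loop in a background thread.

    Works even when the calling thread (e.g. a Jupyter cell) already has a
    running event loop, where asyncio.run() would raise RuntimeError.
    """
    new_loop = asyncio.new_event_loop()
    t = threading.Thread(target=new_loop.run_forever, daemon=True)
    t.start()
    try:
        # Submit the coroutine to the background loop and block for the result.
        fut = asyncio.run_coroutine_threadsafe(coro, new_loop)
        return fut.result()
    finally:
        # Tear the loop down cleanly from this thread.
        new_loop.call_soon_threadsafe(new_loop.stop)
        t.join()
        new_loop.close()


print(run_in_new_loop(infer("hello")))  # response to hello
```

Because the coroutine runs on its own loop in its own thread, the caller never touches whatever loop Jupyter already has running.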
I was getting context-length errors, and GPT_4O_MINI is cheaper anyway.
Pull Request Overview
This PR fixes issues with fast notebook tests by changing the LLM model and updating the approach for running asynchronous event loops in environments with an already running loop.
- Change from GPT_3_5_TURBO_INSTRUCT to GPT_4O_MINI in the notebook script
- Update Python version in the notebook metadata
- Replace asyncio.run with a new-thread based approach for running an async loop in base_llm.py
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| notebooks/default-prep-script.ipynb | Updated LLM model and Python version to support faster notebook tests |
| lib/sycamore/sycamore/transforms/base_llm.py | Modified asynchronous execution to run a new event loop in a separate thread |
```python
fut = asyncio.run_coroutine_threadsafe(_infer_prompts_async([p for _, p in nonempty], llm), new_loop)
responses = fut.result()
```
Copilot AI · May 26, 2025
Consider adding a timeout to fut.result() (e.g., fut.result(timeout=some_seconds)) to prevent potential indefinite blocking if the asynchronous tasks encounter issues.
Suggested change:

```python
try:
    responses = fut.result(timeout=30)  # Replace 30 with an appropriate timeout value
except concurrent.futures.TimeoutError:
    new_loop.call_soon_threadsafe(new_loop.stop)
    t.join()
    new_loop.close()
    raise TimeoutError("The asynchronous task timed out.")
```
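The timeout behavior Copilot suggests can be demonstrated in isolation. The sketch below is illustrative, not the PR's code: the `slow` coroutine and the 0.1-second timeout are made up to force `fut.result` to give up rather than block.

```python
import asyncio
import concurrent.futures
import threading


async def slow():
    # Deliberately sleeps far longer than the timeout below.
    await asyncio.sleep(10)


loop = asyncio.new_event_loop()
t = threading.Thread(target=loop.run_forever, daemon=True)
t.start()

fut = asyncio.run_coroutine_threadsafe(slow(), loop)
timed_out = False
try:
    # Give up after 0.1 s instead of blocking for the full 10 s.
    fut.result(timeout=0.1)
except concurrent.futures.TimeoutError:
    timed_out = True
finally:
    fut.cancel()  # abandon the still-pending coroutine
    loop.call_soon_threadsafe(loop.stop)
    t.join()
    loop.close()

print("timed out:", timed_out)  # timed out: True
```

Note that the timeout only stops the caller from waiting; the coroutine keeps running on the background loop until it is cancelled or the loop is shut down.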
Oh man, thanks for fixing this. It's been bothering me that I couldn't get something like it to work.
I found the asyncio docs a little thin, so I asked gemini "Write a short python program to demonstrate the behavior of the asyncio.run_coroutine_threadsafe function." and it worked surprisingly well!
This PR combines two commits in an attempt to get the fast notebook tests to pass.
I also switched from GPT_3_5_TURBO_INSTRUCT to GPT_4O_MINI to avoid context-length errors.