这是indexloc提供的服务,不要输入任何密码
Skip to content

Conversation

@johnpyp
Copy link

@johnpyp johnpyp commented Nov 18, 2025

Description

Fixes #9421

This PR refactors watch mode task restarts to be "fine-grained", such that if a package changes, only active tasks which rely on that output get killed and restarted.

(Similar to #10846, but taking a slightly different approach which I think is cleaner).

Testing Instructions

Manual testing confirms the issue is fixed with this change. There aren't any existing automated tests for watch and the like, given that it's a long running process I assume.

@vercel
Copy link
Contributor

vercel bot commented Nov 18, 2025

@johnpyp is attempting to deploy a commit to the Vercel Team on Vercel.

A member of the Team first needs to authorize it.

@johnpyp johnpyp changed the title fix: interruptible, persistent tasks kill a targeted subset of the graph fix: fine grained interruptible task restarts in watch mode Nov 18, 2025
@johnpyp johnpyp changed the title fix: fine grained interruptible task restarts in watch mode fix: add fine grained interruptible task restarts in watch mode Nov 18, 2025
@johnpyp johnpyp force-pushed the fix/interruptible-task-termination branch 2 times, most recently from 0475181 to c091360 Compare November 18, 2025 22:19
@johnpyp johnpyp force-pushed the fix/interruptible-task-termination branch from c091360 to de049c1 Compare November 18, 2025 23:40
@johnpyp johnpyp changed the title fix: add fine grained interruptible task restarts in watch mode fix: Add fine grained interruptible task restarts in watch mode Nov 18, 2025
@johnpyp johnpyp marked this pull request as ready for review November 18, 2025 23:43
@johnpyp johnpyp requested a review from a team as a code owner November 18, 2025 23:43
@johnpyp johnpyp requested a review from tknickman November 18, 2025 23:43
Comment on lines +198 to +202
.filter(|c| predicate(c))
.cloned()
.collect();

lock.children.retain(|c| !predicate(c));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Between the evaluation in the .filter() and the .retain() a child's state might change. We should do something more like:

  let (matching, remaining): (Vec<_>, Vec<_>) = lock.children
      .drain(..)
      .partition(|c| predicate(c));
  lock.children = remaining;

// process manager is closing. If it is closing, it means we are shutting down
// the entire run. If it is not closing, it means we are restarting specific
// tasks.
if self.manager.is_closing() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is potentially dangerous based on race conditions and dependent task behavior. What tests can we write to validate that we're not creating bugs here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Example Race:

Thread 1 (exec.rs):        Thread 2 (manager):
child exits
is_closing() → false
                           close() called
                           is_closing = true
return Restarted ❌

That should've been Shutdown.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we introduce some atomics, this concern would get addressed.

pub struct ChildCommandChannel(mpsc::Sender<ChildCommand>);
pub struct ChildCommandChannel {
sender: mpsc::Sender<ChildCommand>,
task_id: Option<TaskId<'static>>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a little rough, because now the child process has to be aware of task_ids . These channels shouldn't be aware of "business logic" ideally, to preserve encapsulation.

I'm also worried about the lifetime implications. The TaskId is going to live the entire program, but line 433 has a .clone() that can end up referencing tasks that are non-static. That could result in panics.

Another small, small worry I have, but a worry nonethless, is the memory overhead associated with this strategy. TaskIds aren't dropped until the Child is dropped, so its possible to see some memory in high concurrency scenarios. (I haven't taken the time to validate this, but based on my other comments, I'd love to see an architectural change where this is no longer a concern).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm fair enough. Is there a better identifier or way of getting the "set of processes mapped to tasks" that would allow us to not have the process code care about tasks at all?

That was the primary thing I wasn't confident in going into this change without being too familiar with the codebase.

Copy link
Contributor

@anthonyshew anthonyshew left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, thank you for this PR. I've put some comments into the review, but, as a top-level review comment, I'd love to see an approach to solve this such that we can be minimally invasive to the codebase outside of watch.rs. In truth, turbo watch deserves a large refactor, and so I'd love to keep the blast radius of PRs for fixes on the existing code to be small.

I completely understand that what I'm asking may border on impossible. This code is quite tricky, and has no test coverage, but I'd love to keep exploring what's possible here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Interruptible tasks are terminated without being restarted

2 participants