+
Skip to content

Conversation

arendjr
Copy link
Contributor

@arendjr arendjr commented Jul 11, 2025

Summary

This moves the logic for the scanner's can_handle() implementation into the workspace method is_ignored_by_scanner(), from where it can be reused by the Watcher. But because this logic depends on the ScanKind, we now keep track of the ScanKind inside the Watcher too, which should align the behaviours much better.

I'm also somewhat hopeful this can be a step towards solving #6838, although that will require an even tighter integration between the watcher and the rest of the scanner.

This may have a tiny overlap with #6839, so we should wait for that one to be merged first.

Test Plan

Everything should stay green, especially the tests introduced in #6839. Once #6839 is merged, I will also create a few variations to ensure differences in ScanKind are respected.

Docs

N/A

@arendjr arendjr requested review from a team July 11, 2025 11:53
Copy link

changeset-bot bot commented Jul 11, 2025

⚠️ No Changeset found

Latest commit: b877a89

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

@github-actions github-actions bot added the A-Project Area: project label Jul 11, 2025
@arendjr arendjr force-pushed the make-watcher-aware-of-scan-kind branch from c5aed0a to f50fe78 Compare July 11, 2025 19:55
@github-actions github-actions bot added the A-LSP Area: language server protocol label Jul 11, 2025
Copy link
Member

@ematipico ematipico left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @arendjr. I left an observation. Let me know what you think

Comment on lines +92 to +93
/// Tracks the folders we are watching, including the scan kind per folder.
watched_folders: FxHashMap<PathBuf, ScanKind>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we can ask for this information from the Workspace, instead of copying this information for all folders. The thing is, there's one ScanKind per project, and that one instance is computed from the configuration. I've noticed that we also have a method that takes the greater ScanKind if we've different ones, but this isn't always true.

We already have the logic that computes ScanKind inside the find_project_root, we just need to expose it and we should be able to give it to the watcher on request.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a very slight duplication between the workspace and the watcher indeed, but I do think this is where it belongs. Note we don't copy this information for all folders, only for the watched folder, so in a typical setup we still store it only once per project.

I think however, that in order to address #6838, we may want to implement more granular watching and track it right here. But that's none of the workspace's business.

So the division as I see it is: the workspace determines what to watch, and the watcher is responsible for the actual watching.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but I do think this is where it belongs.

Can you explain why? IMHO, it's worth documenting. At the moment, I disagree, and if there's a specific reason to duplicate the information, let's at least explain it in a comment so we don't lose the insight.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain why you disagree? I just spent three paragraphs explaining my reasoning, so I'm unsure which part you're disagreeing with.

Copy link
Member

@ematipico ematipico Jul 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain why you disagree? I just spent three paragraphs explaining my reasoning, so I'm unsure which part you're disagreeing with.

When the configuration changes, and the ScanKind is downgraded from Project to KnownFiles, the LSP picks the change via the LSP file watcher, and the project is scanned again using the KnownFiles. Now the watcher has its copy of ScanKind, and we do run the heuristic, it keeps using Project, based on what's written here: https://github.com/arendjr/biome/blob/f50fe78ca28acd434be858e55766604fdacae72e/crates/biome_service/src/workspace_watcher.rs#L241-L243

There are some inconsistencies, and that's why I think there should be one single source of truth.


Now, I don't have enough knowledge of the watcher, so I don't understand technical details such as "we may want to implement more granular watching and track it right here".

So, I still have my doubts about the logic, but unfortunately, I don't have enough technical knowledge to understand everything. So don't feel blocked by my comment. However, if the new logic addresses my concern, feel free to resolve my comment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a fair concern, but I think the solution is simple: if the ScanKind changes, the workspace should tell the watcher about it.

Now the watcher has its copy of ScanKind, and we do run the heuristic, it keeps using Project, based on what's written here: https://github.com/arendjr/biome/blob/f50fe78ca28acd434be858e55766604fdacae72e/crates/biome_service/src/workspace_watcher.rs#L241-L243

I think this may be slightly misinterpreted. The doc was meant to indicate that if a single watcher notification is received from the OS that contains paths that belong to different projects with different scan kind configurations, only then do we assume to use a single scan kind for both. It's a highly unlikely possibility, but we play it safe in such a case.


Btw, I think we overlooked something with #6839, which is that when we call update_service_data(), we should've already checked whether the file was ignored or not. So it seems we're trying to put the is_ignored_by_scanner() check too deep, which is also giving an issue with this PR, since the ScanKind is no longer available at that point. I will resolve that here as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have moved the calls to is_ignored_by_scanner() to the watcher itself. This should not have any other effect except to avoid redundant calls done by the scanner itself, and all the tests are still green. I added a test case too. But just in case, maybe you want to re-test on your end as well.

Copy link
Member

@ematipico ematipico left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

@arendjr arendjr force-pushed the make-watcher-aware-of-scan-kind branch from f50fe78 to a07fe0a Compare July 15, 2025 16:57
@arendjr arendjr merged commit 533956a into biomejs:main Jul 15, 2025
11 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A-LSP Area: language server protocol A-Project Area: project

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants

点击 这是indexloc提供的php浏览器服务,不要输入任何密码和下载