-
-
Notifications
You must be signed in to change notification settings - Fork 721
feat(core): scanner v2 #6989
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(core): scanner v2 #6989
Conversation
🦋 Changeset detectedLatest commit: a0bd572 The changes in this PR will be included in the next version bump. This PR includes changesets to release 14 packages
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
CodSpeed Performance ReportMerging #6989 will not alter performanceComparing Summary
|
Can we run the ecosystem CI before merging and make sure we didn't introduce any regressions? |
Nice! Have you performed some benchmarks/tests on big project to see the improvements? |
imports.insert(path.to_path_buf(), visitor.collect_info()); | ||
|
||
let module_info = visitor.collect_info(); | ||
for import_path in module_info.all_import_paths() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is just amazing and mind-blowing!
where | ||
T: WorkspaceScannerBridge, | ||
{ | ||
#[inline] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you test the implication of all these #[inline]
whenever you can? i.e. check if there's a difference when removing them. Thank you
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just tested this PR against a codebase, and it doesn't work anymore. Even without project rules, the CLI doesn't finish.
Without project rules, it's stuck. With debug logs, it only prints
2025-07-24T11:56:01.887489Z INFO main crates/biome_cli/src/commands/mod.rs: Configuration file loaded: Some("**/biome.jsonc"), diagnostics detected 1
2025-07-24T11:56:01.888581Z DEBUG main insert_project: crates/biome_service/src/projects.rs: Insert workspace folder **
With a project rule enabled, it starts logging few index_file:open_file_internal
, but then it stops, it doesn't log anything anymore, the memory of the process starts increasing slowly.
Blocking for now
Yeah, sorry, the tests aren’t all green yet either, and it hang to cancel the runs yesterday because something was hanging indeed. Definitely needs some fixes still before it’s done. @Conaclos Also still waiting for the tests to get green, then I’ll do some comparisons. |
Fixed the failures in I think we can merge this PR to |
Thanks, nice work! I’ll have a closer look tomorrow, but just to temper expectations: I don’t think this PR will help much with the reports of leaks, because it does nothing except possibly reduce the amount of |
Oh, I'm afraid the CI status was misleading. Tests are actually still failing here. I'm on it! |
I've done some manual testing indeed, and it appears the LSP behaviour is good. At least I've not gotten any CPU spikes in repositories where previously I could reproduce it, and I've tried running builds while the LSP is open too.
Yes, I agree it's a fix. The reason I'm hesitant though is that we can anticipate that some users may require config changes, which makes it technically a breaking change. I don't think it's such a major change however, because it is still a fix. So if we're really strict in one direction, we would require releasing a v3, but if we don't care at all, it can go in a patch release. I think neither extreme is really helpful though, I think it deserves a small shoutout for users, so a minor it is :) We never strictly followed semver anyway... |
Bloody Windows 😅 @Conaclos I tried two open-source projects ( |
Important Review skippedAuto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the ✨ Finishing Touches🧪 Generate unit tests
🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
Documentation and Community
|
# Conflicts: # crates/biome_lsp/src/server.rs # crates/biome_service/src/workspace/scanner.tests.rs # crates/biome_service/src/workspace/server.rs # crates/biome_service/src/workspace/server.tests.rs # crates/biome_service/src/workspace/watcher.tests.rs # crates/biome_service/tests/spec_tests.rs # packages/@biomejs/backend-jsonrpc/src/workspace.ts
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome work!
Just tested on my private large repository. No issues so far, and got faster ~1s. v2.1.3
|
Thanks for confirming, @siketyan ! PR with website docs is ready too: biomejs/website#2853 |
Co-authored-by: Emanuele Stoppa <my.burning@gmail.com> Co-authored-by: Naoki Ikeguchi <me@s6n.jp>
fs.insert( | ||
file_path.into(), | ||
r#"{ | ||
"files": { "includes": ["test.js", "!**/folder"] } | ||
} | ||
"# | ||
.as_bytes(), | ||
); | ||
|
||
let test = Utf8Path::new("test.js"); | ||
fs.insert(test.into(), UNFORMATTED.as_bytes()); | ||
|
||
let test2 = Utf8Path::new("folder/test2.js"); | ||
fs.insert(test2.into(), UNFORMATTED.as_bytes()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this test doesn’t actually test what you want it to, see #7268 (comment)
so this whole feature was shipped completely broken and is breaking user code as we speak.
Summary
Fixed #6965: Implemented smarter scanner for project rules.
Previously, if project rules were enabled Biome's scanner would scan all dependencies regardless of whether they were used by/reachable from source files or not. While this worked for a first version, it was far from optimal.
The new scanner first scans everything listed under the
files.includes
setting, and then descends into the dependencies that were discovered there, including transitive dependencies. This has three main advantages:vcs.useIgnoreFile
is enabled,.gitignore
gets respected as well. Assuming you have folders such asbuild/
ordist/
configured there, those will be automatically ignored by the scanner.In addition, the following refactorings have been made:
workspace
module. The watcher has become a proper submodule of the scanner.WorkspaceServer
(and the watcher bridge trait on top of that), so there's no/hardly any actual overhead, but it provides a much cleaner separation of responsibilities. As an added benefit, the traits allow for much easier unit testing of the scanner and watcher in isolation, which I've done for the watcher already.is_opened_by_scanner
has been renamed tois_indexed
. The scanner and watcher also have their terminology changed so they no longer open/close a file, but they index/unload it. This is to highlight the differences between the scanner behaviour when they load a file and the behaviour when a client opens a file.Test Plan
Tests added and expanded. It will take me a bit longer to get everything green already, but I thought it was ready for review at least.
Docs
biomejs/website#2853