+
Skip to content

Conversation

l0ngvh
Copy link
Contributor

@l0ngvh l0ngvh commented Aug 10, 2025

Summary

This PR fixes multiple bugs with the Yaml lexer:

  • Even though the tokens were emitted in the right order, their text index were actually wrong. This caused the trivia to be attached to the wrong tokens in wrong order.
  • Reorder how FLOW_END tokens are emitted. Now they are only emitted after all trailing trivia of the child flow token have been emitted. This was causing problem with rowan since tree sink isn't able to properly handle zero-width token with trailing trivia. For example, this yaml document will now produce this stream of tokens:
"abc"  # abc

FLOW_START:0
DOUBLE_QUOTED_LITERAL:5
WHITESPACE:1
COMMENT: 5
FLOW_END:0
NEWLINE:1

P/S: I really should stop pretending that YAML is easy. It's certainly not 🥲

Test Plan

Added new tests for various error cases

Docs

N/A

Summary by CodeRabbit

  • Refactor

    • Simplified YAML lexer token and scope management for improved clarity and maintainability.
    • Adjusted handling of whitespace and newlines in tokenization.
    • Updated parsing logic for block map entries to improve error handling.
  • Bug Fixes

    • Improved lexer and parser handling of edge cases involving mappings, sequences, and plain scalars.
  • Tests

    • Added multiple new YAML test cases to cover error scenarios and complex structures.
    • Updated existing tests to reflect revised tokenization of whitespace and newlines.
    • Enhanced coverage for sequences, mappings, and handling of trailing trivia.

Copy link

changeset-bot bot commented Aug 10, 2025

⚠️ No Changeset found

Latest commit: 16c2eaf

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

Copy link
Contributor

coderabbitai bot commented Aug 10, 2025

Walkthrough

The YAML lexer and parser underwent a significant refactor to simplify token and scope management. Wrapper structs for token scoping were removed, replaced by direct token handling with explicit caching for scope-closing tokens. Several new helper methods were introduced, and method signatures were updated accordingly. Numerous new YAML test cases were added and existing ones updated to reflect these lexer changes.

Changes

Cohort / File(s) Change Summary
Lexer Refactor
crates/biome_yaml_parser/src/lexer/mod.rs
Refactored YAML lexer to remove wrapper structs for scoped tokens, introduced direct token handling, added explicit caching for scope-closing tokens, updated method signatures, added helper methods for whitespace/newline/trivia, and clarified scope management logic.
Lexer Test Adjustments
crates/biome_yaml_parser/src/lexer/tests/block.rs
Updated test assertions in lex_compact_mapping_in_sequence and lex_nested_sequence for new expected counts of NEWLINE and WHITESPACE tokens, reflecting changes in lexer tokenization.
Parser Update
crates/biome_yaml_parser/src/parser/block.rs
Replaced direct token consumption (bump(COLON)) with expect(T![:]) for parsing implicit block map entries, and added a TODO for improved error messages.
New/Updated Block Error Tests
crates/biome_yaml_parser/tests/yaml_test_suite/err/block/*
Added new test cases: mapping_followed_plain.yaml, mapping_key_contains_multiple_values.yaml, mapping_missing_colon.yaml, plain_followed_mapping.yaml, plain_separated_by_comments.yaml, sequence_follow_by_map.yaml to check various block mapping and sequence error scenarios.
New Flow Error Test
crates/biome_yaml_parser/tests/yaml_test_suite/err/flow/plain_start_with_indicator.yaml
Added test for flow sequences starting with indicators, testing parser behavior for edge cases in flow-style YAML.
Updated Block Sequence Test
crates/biome_yaml_parser/tests/yaml_test_suite/ok/block/sequence/nested_block_sequence.yaml
Modified nested block sequence structure to introduce additional empty/nested sequences and alter ordering.
New Block Sequence Trivia Test
crates/biome_yaml_parser/tests/yaml_test_suite/ok/block/sequence/sequence_with_trailing_trivia.yaml
Added a test for nested sequences with trailing whitespace and comments to verify handling of trivia in block sequences.
New Flow Trivia Test
crates/biome_yaml_parser/tests/yaml_test_suite/ok/flow/flow_with_trailing_trivia.yaml
Added a test for flow sequences with trailing comments and whitespace.

Sequence Diagram(s)

sequenceDiagram
    participant Source as YAML Source
    participant Lexer as YamlLexer
    participant Tokens as LinkedList<LexToken>
    participant Parser as Parser

    Source->>Lexer: Provide YAML input
    loop While not EOF
        Lexer->>Lexer: Consume next token (direct, no wrappers)
        alt Scope starts or ends
            Lexer->>Lexer: Cache or emit scope-closing tokens
        end
        Lexer->>Tokens: Append LexToken(s)
    end
    Tokens->>Parser: Provide token stream
    Parser->>Parser: Parse tokens (using expect for COLON, etc.)
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~40 minutes

Suggested reviewers

  • dyc3

Poem

A rabbit hopped through YAML fields,
Refactoring scopes with nimble yields.
Tokens now flow, wrappers no more,
Trivia and whitespace neatly in store.
With new tests to guide each leap,
This lexer's garden is tidy and deep!
🐇✨

Note

🔌 MCP (Model Context Protocol) integration is now available in Early Access!

Pro users can now connect to remote MCP servers under the Integrations page to get reviews and chat conversations that understand additional development context.

✨ Finishing Touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Explain this complex logic.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai explain this code block.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and explain its main purpose.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai generate unit tests to generate unit tests for this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@github-actions github-actions bot added the A-Parser Area: parser label Aug 10, 2025
@l0ngvh l0ngvh force-pushed the fix-tokens-out-of-order branch from 16c2eaf to c9fa99c Compare August 10, 2025 01:38
Copy link

codspeed-hq bot commented Aug 10, 2025

CodSpeed Performance Report

Merging #7168 will not alter performance

Comparing vohoanglong0107:fix-tokens-out-of-order (c9fa99c) with main (79a0755)

Summary

✅ 128 untouched benchmarks

@l0ngvh l0ngvh marked this pull request as ready for review August 10, 2025 07:36
@l0ngvh l0ngvh requested review from a team August 10, 2025 07:36
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (4)
crates/biome_yaml_parser/tests/yaml_test_suite/err/block/sequence_follow_by_map.yaml (1)

1-3: Good negative test; minor naming nit.

Scenario is valid for an error test (sequence item followed by a mapping at same indent). Consider renaming the file to "sequence_followed_by_map.yaml" for grammatical clarity.

crates/biome_yaml_parser/src/parser/block.rs (1)

168-169: Consider implementing the TODO for improved error handling.

The switch from bump to expect improves error handling by generating diagnostics when the colon is missing. However, the generic error message from expect may not be user-friendly for YAML-specific syntax errors.

Would you like me to help implement a custom error message that provides more context about the expected mapping syntax? For example: "Expected ':' after mapping key in block map entry".

crates/biome_yaml_parser/src/lexer/mod.rs (2)

135-176: Complex but necessary logic for mapping disambiguation.

The refactored method correctly handles the various cases for potential mapping starts, including cached token management. The logic properly distinguishes between block maps and flow values.

Consider monitoring the performance impact of the multiple LinkedList operations in this hot path, especially for large YAML files. If performance becomes an issue, you might consider using a more efficient data structure or pre-allocating list capacity.


797-798: Potential confusion with column tracking and tabs.

The comment states "number of bytes since the last newline" but the field is named column. This could be misleading when tabs are involved, as tabs typically don't correspond to single columns in text display.

Consider clarifying whether this tracks byte offsets or visual columns, especially given that YAML allows both spaces and tabs as whitespace.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 79a0755 and c9fa99c.

⛔ Files ignored due to path filters (19)
  • crates/biome_yaml_parser/tests/yaml_test_suite/err/block/mapping_followed_plain.yaml.snap is excluded by !**/*.snap
  • crates/biome_yaml_parser/tests/yaml_test_suite/err/block/mapping_key_contains_multiple_values.yaml.snap is excluded by !**/*.snap
  • crates/biome_yaml_parser/tests/yaml_test_suite/err/block/mapping_missing_colon.yaml.snap is excluded by !**/*.snap
  • crates/biome_yaml_parser/tests/yaml_test_suite/err/block/multi_line_flow_collection_broken_out_of_parent_block.yaml.snap is excluded by !**/*.snap
  • crates/biome_yaml_parser/tests/yaml_test_suite/err/block/multi_line_plain_token_broken_out_parent_flow.yaml.snap is excluded by !**/*.snap
  • crates/biome_yaml_parser/tests/yaml_test_suite/err/block/plain_followed_mapping.yaml.snap is excluded by !**/*.snap
  • crates/biome_yaml_parser/tests/yaml_test_suite/err/block/plain_separated_by_comments.yaml.snap is excluded by !**/*.snap
  • crates/biome_yaml_parser/tests/yaml_test_suite/err/block/sequence_follow_by_map.yaml.snap is excluded by !**/*.snap
  • crates/biome_yaml_parser/tests/yaml_test_suite/err/flow/plain_start_with_indicator.yaml.snap is excluded by !**/*.snap
  • crates/biome_yaml_parser/tests/yaml_test_suite/ok/block/mapping/explicit_block_mapping_with_compact_mapping.yaml.snap is excluded by !**/*.snap
  • crates/biome_yaml_parser/tests/yaml_test_suite/ok/block/mapping/explicit_block_mapping_with_empty_value_and_mapping_as_key.yaml.snap is excluded by !**/*.snap
  • crates/biome_yaml_parser/tests/yaml_test_suite/ok/block/mapping/nested_implicit_block_mapping.yaml.snap is excluded by !**/*.snap
  • crates/biome_yaml_parser/tests/yaml_test_suite/ok/block/mapping/simple_block_mapping_separated_by_trivia.yaml.snap is excluded by !**/*.snap
  • crates/biome_yaml_parser/tests/yaml_test_suite/ok/block/sequence/map_nested_in_sequence.yaml.snap is excluded by !**/*.snap
  • crates/biome_yaml_parser/tests/yaml_test_suite/ok/block/sequence/nested_block_sequence.yaml.snap is excluded by !**/*.snap
  • crates/biome_yaml_parser/tests/yaml_test_suite/ok/block/sequence/sequence_nested_in_map.yaml.snap is excluded by !**/*.snap
  • crates/biome_yaml_parser/tests/yaml_test_suite/ok/block/sequence/sequence_with_trailing_trivia.yaml.snap is excluded by !**/*.snap
  • crates/biome_yaml_parser/tests/yaml_test_suite/ok/block/sequence/simple_block_sequence.yaml.snap is excluded by !**/*.snap
  • crates/biome_yaml_parser/tests/yaml_test_suite/ok/flow/flow_with_trailing_trivia.yaml.snap is excluded by !**/*.snap
📒 Files selected for processing (13)
  • crates/biome_yaml_parser/src/lexer/mod.rs (13 hunks)
  • crates/biome_yaml_parser/src/lexer/tests/block.rs (2 hunks)
  • crates/biome_yaml_parser/src/parser/block.rs (1 hunks)
  • crates/biome_yaml_parser/tests/yaml_test_suite/err/block/mapping_followed_plain.yaml (1 hunks)
  • crates/biome_yaml_parser/tests/yaml_test_suite/err/block/mapping_key_contains_multiple_values.yaml (1 hunks)
  • crates/biome_yaml_parser/tests/yaml_test_suite/err/block/mapping_missing_colon.yaml (1 hunks)
  • crates/biome_yaml_parser/tests/yaml_test_suite/err/block/plain_followed_mapping.yaml (1 hunks)
  • crates/biome_yaml_parser/tests/yaml_test_suite/err/block/plain_separated_by_comments.yaml (1 hunks)
  • crates/biome_yaml_parser/tests/yaml_test_suite/err/block/sequence_follow_by_map.yaml (1 hunks)
  • crates/biome_yaml_parser/tests/yaml_test_suite/err/flow/plain_start_with_indicator.yaml (1 hunks)
  • crates/biome_yaml_parser/tests/yaml_test_suite/ok/block/sequence/nested_block_sequence.yaml (1 hunks)
  • crates/biome_yaml_parser/tests/yaml_test_suite/ok/block/sequence/sequence_with_trailing_trivia.yaml (1 hunks)
  • crates/biome_yaml_parser/tests/yaml_test_suite/ok/flow/flow_with_trailing_trivia.yaml (1 hunks)
🧰 Additional context used
📓 Path-based instructions (4)
crates/biome_*/**/*

📄 CodeRabbit Inference Engine (CLAUDE.md)

Core crates must be located in /crates/biome_*/

Files:

  • crates/biome_yaml_parser/tests/yaml_test_suite/err/block/sequence_follow_by_map.yaml
  • crates/biome_yaml_parser/tests/yaml_test_suite/err/block/mapping_missing_colon.yaml
  • crates/biome_yaml_parser/tests/yaml_test_suite/err/block/mapping_followed_plain.yaml
  • crates/biome_yaml_parser/tests/yaml_test_suite/err/flow/plain_start_with_indicator.yaml
  • crates/biome_yaml_parser/tests/yaml_test_suite/ok/block/sequence/nested_block_sequence.yaml
  • crates/biome_yaml_parser/tests/yaml_test_suite/ok/block/sequence/sequence_with_trailing_trivia.yaml
  • crates/biome_yaml_parser/tests/yaml_test_suite/err/block/plain_followed_mapping.yaml
  • crates/biome_yaml_parser/tests/yaml_test_suite/err/block/plain_separated_by_comments.yaml
  • crates/biome_yaml_parser/tests/yaml_test_suite/ok/flow/flow_with_trailing_trivia.yaml
  • crates/biome_yaml_parser/src/parser/block.rs
  • crates/biome_yaml_parser/src/lexer/tests/block.rs
  • crates/biome_yaml_parser/tests/yaml_test_suite/err/block/mapping_key_contains_multiple_values.yaml
  • crates/biome_yaml_parser/src/lexer/mod.rs
**/*.{rs,toml}

📄 CodeRabbit Inference Engine (CLAUDE.md)

Format code (Rust + TOML) using just format

Format Rust and TOML files using just f (alias for just format)

Files:

  • crates/biome_yaml_parser/src/parser/block.rs
  • crates/biome_yaml_parser/src/lexer/tests/block.rs
  • crates/biome_yaml_parser/src/lexer/mod.rs
**/*.rs

📄 CodeRabbit Inference Engine (CONTRIBUTING.md)

**/*.rs: Run the linter for the whole project using just l (alias for just lint)
Use Rust's doctest feature for testing code in documentation comments
For documentation of rules, assists, and their options, use inline rustdoc documentation

Files:

  • crates/biome_yaml_parser/src/parser/block.rs
  • crates/biome_yaml_parser/src/lexer/tests/block.rs
  • crates/biome_yaml_parser/src/lexer/mod.rs
**/*.{rs,snap}

📄 CodeRabbit Inference Engine (CONTRIBUTING.md)

Use the insta crate for snapshot testing in Rust and manage snapshots with cargo insta commands

Files:

  • crates/biome_yaml_parser/src/parser/block.rs
  • crates/biome_yaml_parser/src/lexer/tests/block.rs
  • crates/biome_yaml_parser/src/lexer/mod.rs
🧠 Learnings (2)
📚 Learning: 2025-08-09T23:07:50.448Z
Learnt from: CR
PR: biomejs/biome#0
File: crates/biome_parser/CONTRIBUTING.md:0-0
Timestamp: 2025-08-09T23:07:50.448Z
Learning: Applies to crates/biome_parser/**/token_source.rs : Wrap your lexer with a `BufferedLexer` and implement `TokenSourceWithBufferedLexer` for your token source.

Applied to files:

  • crates/biome_yaml_parser/src/lexer/mod.rs
📚 Learning: 2025-08-09T23:07:50.448Z
Learnt from: CR
PR: biomejs/biome#0
File: crates/biome_parser/CONTRIBUTING.md:0-0
Timestamp: 2025-08-09T23:07:50.448Z
Learning: Applies to crates/biome_parser/lexer/mod.rs : Implement the `Lexer` trait from the `biome_parser` crate for your lexer struct.

Applied to files:

  • crates/biome_yaml_parser/src/lexer/mod.rs
🪛 YAMLlint (1.37.1)
crates/biome_yaml_parser/tests/yaml_test_suite/err/block/sequence_follow_by_map.yaml

[error] 3-3: syntax error: expected , but found '?'

(syntax)

crates/biome_yaml_parser/tests/yaml_test_suite/err/block/mapping_followed_plain.yaml

[error] 3-3: syntax error: expected , but found ''

(syntax)

crates/biome_yaml_parser/tests/yaml_test_suite/err/flow/plain_start_with_indicator.yaml

[error] 1-1: syntax error: found character '>' that cannot start any token

(syntax)

crates/biome_yaml_parser/tests/yaml_test_suite/ok/block/sequence/sequence_with_trailing_trivia.yaml

[error] 1-1: trailing spaces

(trailing-spaces)

crates/biome_yaml_parser/tests/yaml_test_suite/err/block/plain_separated_by_comments.yaml

[error] 3-3: syntax error: expected , but found ''

(syntax)

crates/biome_yaml_parser/tests/yaml_test_suite/ok/flow/flow_with_trailing_trivia.yaml

[error] 1-1: trailing spaces

(trailing-spaces)


[error] 3-3: trailing spaces

(trailing-spaces)


[error] 4-4: trailing spaces

(trailing-spaces)

crates/biome_yaml_parser/tests/yaml_test_suite/err/block/mapping_key_contains_multiple_values.yaml

[error] 2-2: syntax error: expected , but found ''

(syntax)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Bench (biome_package)
🔇 Additional comments (17)
crates/biome_yaml_parser/tests/yaml_test_suite/err/block/mapping_missing_colon.yaml (1)

1-2: LGTM for error coverage.

This clearly exercises a missing-colon mapping entry after a valid pair. Fits the err suite well.

crates/biome_yaml_parser/tests/yaml_test_suite/err/flow/plain_start_with_indicator.yaml (1)

1-1: Ensure diagnostic matches lexer intent for '>' and '|' in flow context.

This should produce an error since '>' and '|' are block scalar indicators and invalid as plain scalars in flow. Verify the expected diagnostic (message/code and span) in the snapshot to avoid brittleness.

crates/biome_yaml_parser/tests/yaml_test_suite/err/block/plain_followed_mapping.yaml (1)

1-2: Clarify expected error in assertions.

A stray plain scalar after a mapping key/value at the same indent should error. Ensure the test asserts the error location at Line 2 and the intended diagnostic category (e.g., unexpected plain scalar in block mapping).

crates/biome_yaml_parser/tests/yaml_test_suite/err/block/mapping_key_contains_multiple_values.yaml (1)

1-2: Solid err case for composite key tokenization.

Combining multiple scalars into a key should fail; this is a good regression test for key token boundaries after the lexer refactor.

crates/biome_yaml_parser/tests/yaml_test_suite/err/block/mapping_followed_plain.yaml (1)

1-3: Good negative test case for mapping followed by a plain scalar line

Captures the invalid pattern where a plain scalar line is followed by a mapping entry at the same indent under a:. This aligns with the intended lexer/parser error handling.

crates/biome_yaml_parser/tests/yaml_test_suite/err/block/plain_separated_by_comments.yaml (1)

1-3: Solid coverage for comment-separated plain scalar edge case

Intentionally invalid structure; matches the lexer/parser error scenario and the YAMLlint diagnostic.

crates/biome_yaml_parser/tests/yaml_test_suite/ok/block/sequence/nested_block_sequence.yaml (1)

1-2: Expanded nested sequence shapes look correct

Adds both the “empty item with nested sequence” form and compact - - 40. Good coverage for block sequence nesting and indentation handling.

Also applies to: 4-4

crates/biome_yaml_parser/tests/yaml_test_suite/ok/flow/flow_with_trailing_trivia.yaml (1)

1-4: Intentional trailing whitespace to assert trivia handling

These trailing spaces and inline comments are necessary to validate the new lexer behavior around trailing trivia and closing tokens. Keep as-is; don’t auto-trim.

crates/biome_yaml_parser/src/lexer/tests/block.rs (2)

88-89: Updated expectations for indentation tokenization look correct

Replacing earlier extra NEWLINEs with explicit WHITESPACE:2 after each NEWLINE:1 aligns with the new lexer semantics. This should stabilize assertions around block-indentation.

Also applies to: 96-97


128-129: Indentation after line break is now explicit (WHITESPACE:2)

This mirrors the prior change and ensures the nested - 20 is preceded by indent trivia tokens. Consistent with the refactor.

crates/biome_yaml_parser/src/parser/block.rs (1)

177-177: Good consistency with error handling improvement.

The change from p.bump(T![:]) to p.expect(T![:]) here maintains consistency with the similar change at line 169, ensuring uniform error handling across both flow YAML and flow JSON node parsing paths.

crates/biome_yaml_parser/tests/yaml_test_suite/ok/block/sequence/sequence_with_trailing_trivia.yaml (1)

1-1: Clarify if trailing spaces are intentional for testing.

The static analysis tool detected trailing spaces on this line. Given that this test file is specifically for "sequence_with_trailing_trivia", these trailing spaces might be intentional to test the lexer's handling of whitespace trivia.

If the trailing spaces are intentional for testing trivia handling, consider adding a comment to document this intent. Otherwise, they should be removed to comply with YAML best practices.

crates/biome_yaml_parser/src/lexer/mod.rs (5)

28-29: Well-designed solution for token ordering issues.

The addition of cached_scope_closing_tokens elegantly solves the problem of zero-width tokens with trailing trivia mentioned in the PR objectives. This caching mechanism ensures FLOW_END tokens are emitted after all trailing trivia, maintaining proper token order.


69-81: Clean refactoring of token consumption logic.

The removal of wrapper structs and direct handling of LinkedList<LexToken> simplifies the code significantly. The pattern matching is more readable and the token list appending is efficient.


86-104: Improved sequence entry handling with explicit scope management.

The refactored method correctly manages sequence scope creation and token ordering. The use of is_none_or for checking scope conditions is idiomatic, and the explicit push_front/push_back operations ensure proper token sequence.


427-441: Well-structured flow scope evaluation.

The new evaluate_flow_scope method cleanly separates the logic for determining flow collection boundaries and properly manages the caching of scope closing tokens.


650-675: Proper newline handling with coordinate tracking.

The new consume_newline method correctly handles both Unix (\n) and Windows (\r\n) line endings and properly updates the text coordinates.

Comment on lines +1 to +3
a:
plain # separated by
aa: bb
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Verification agent

🧩 Analysis chain

Ensure linters don’t break this err-case (YAMLlint warning is expected here)

YAMLlint reports “expected , but found ''” on Line 3 — that’s intentional. Verify your CI excludes tests/yaml_test_suite/err/ from YAMLlint or configures a per-path ignore so these fixtures don’t fail the pipeline.


🏁 Script executed:

#!/bin/bash
# Check for yamllint configs and potential pre-commit hooks that could affect tests
fd -a ".yamllint" -u || true
fd -a ".pre-commit-config.yaml" -u || true
rg -n "yamllint|trailing-spaces|err/block" -S || true

# If you use yamllint, ensure its config excludes crates/biome_yaml_parser/tests/yaml_test_suite/err/**

Length of output: 497


Configure YAMLlint to ignore err/ fixtures
It looks like there isn’t a YAMLlint config in the repo to exclude your negative-test fixtures, so the “expected <block end>” error in
crates/biome_yaml_parser/tests/yaml_test_suite/err/** will break CI. Please add or update your YAMLlint config (e.g. /.yamllint) to exclude that path.

• No .yamllint or relevant pre-commit config was detected in the repo.
• Add something like this to the top-level .yamllint:

extends: default

ignore:
  - "crates/biome_yaml_parser/tests/yaml_test_suite/err/**"

• Ensure your CI YAMLlint step picks up this config so that err/ fixtures don’t fail the pipeline.

🧰 Tools
🪛 YAMLlint (1.37.1)

[error] 3-3: syntax error: expected , but found ''

(syntax)

🤖 Prompt for AI Agents
In
crates/biome_yaml_parser/tests/yaml_test_suite/err/block/mapping_followed_plain.yaml
lines 1 to 3, the YAMLlint configuration is missing and causes CI failures on
negative-test fixtures. To fix this, create or update a top-level .yamllint file
in the repo root to include an ignore rule for the err/ test fixtures by adding
an ignore section with the path
"crates/biome_yaml_parser/tests/yaml_test_suite/err/**". Also ensure the CI
pipeline YAMLlint step uses this config file so these error fixtures are
excluded from linting and do not break the build.

Comment on lines +1 to +4
- 20
- "20" # abc
- [50]

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Verification agent

🧩 Analysis chain

Protect trailing spaces from formatters/linters

If you run YAML formatters/linters, ensure this path is excluded or rules like “trailing-spaces” are disabled for this file. Otherwise the critical spaces may be stripped and tests will lose coverage.


🏁 Script executed:

#!/bin/bash
# Look for editor/formatter settings that might trim trailing whitespace
fd -a ".editorconfig" -u || true
rg -n "trim_trailing_whitespace" -S || true
rg -n "yamllint|trailing-spaces|flow_with_trailing_trivia.yaml" -S || true

# If needed, add a repo-level exception for:
# crates/biome_yaml_parser/tests/yaml_test_suite/ok/flow/flow_with_trailing_trivia.yaml

Length of output: 2589


Preserve trailing whitespace in flow_with_trailing_trivia.yaml

Add an override to your .editorconfig so that editorconfig-aware formatters won’t strip the critical trailing spaces in this test file:

# Prevent trimming trailing spaces in the YAML trivia test
[crates/biome_yaml_parser/tests/yaml_test_suite/ok/flow/flow_with_trailing_trivia.yaml]
trim_trailing_whitespace = false

This ensures that any formatter or linter respecting .editorconfig leaves the trailing spaces intact, preserving test coverage.

🧰 Tools
🪛 YAMLlint (1.37.1)

[error] 1-1: trailing spaces

(trailing-spaces)


[error] 3-3: trailing spaces

(trailing-spaces)


[error] 4-4: trailing spaces

(trailing-spaces)

🤖 Prompt for AI Agents
In
crates/biome_yaml_parser/tests/yaml_test_suite/ok/flow/flow_with_trailing_trivia.yaml
lines 1 to 4, trailing whitespace is critical for the test but may be stripped
by formatters. To fix this, add an override section in the project's
.editorconfig file specifically for this YAML file, setting
trim_trailing_whitespace to false. This will prevent editorconfig-aware tools
from removing the trailing spaces and preserve the test's intended formatting.

@l0ngvh l0ngvh merged commit 6b388d9 into biomejs:main Aug 11, 2025
46 of 47 checks passed
@l0ngvh l0ngvh deleted the fix-tokens-out-of-order branch August 11, 2025 00:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A-Parser Area: parser

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants

点击 这是indexloc提供的php浏览器服务,不要输入任何密码和下载