这是indexloc提供的服务,不要输入任何密码
Skip to content

Conversation

@wemeetagain
Copy link
Member

@wemeetagain wemeetagain commented Oct 23, 2025

Motivation

Description

  • review of validateColumnsByRangeResponse
    • simplify the duplicate/order checking loop
      • use a map to deduplicate the various cases
    • minor updates to the index checking loop
      • reuse map from ^, use non-nullish array

@wemeetagain wemeetagain requested a review from a team as a code owner October 23, 2025 13:34
@gemini-code-assist
Copy link
Contributor

Summary of Changes

Hello @wemeetagain, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request focuses on refining the validateColumnsByRangeResponse function to enhance its robustness and efficiency. The changes primarily involve simplifying the logic for detecting duplicate and out-of-order data column sidecars, leading to a more maintainable and performant validation process for incoming data.

Highlights

  • Data Structure Refinement: The seenColumns map has been refactored from Map<Slot, fulu.DataColumnSidecars> to Map<Slot, Map<number, fulu.DataColumnSidecar>>. This change significantly improves the efficiency of checking for duplicate column indices within the same slot.
  • Simplified Validation Logic: The core loop for checking duplicate columns and out-of-order slots/indices has been streamlined. Complex conditional blocks were replaced with more direct checks using the new map structure, reducing code complexity and improving readability.
  • Optimized Index Handling: The logic for tracking and validating column indices has been enhanced. A new currentIndex variable helps maintain order, and map-based lookups (seenSlotColumns.has()) are now used for faster duplicate and presence checks.
  • Improved Data Retrieval and Checks: Retrieval of columnSidecars for validation now correctly converts the nested map values into an array. Additionally, checks for empty columnSidecars arrays and missing indices have been optimized for clarity and performance.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request does a great job of simplifying and optimizing the logic in validateColumnsByRangeResponse. Using a Map for seenColumns to deduplicate and look up columns is much cleaner and more performant than the previous array-based approach. The subsequent optimizations to use columnSidecarsMap.has(i) for checking missing indices is also a significant improvement.

However, I've found a logic issue in the new implementation for checking out-of-order column indices. The current check can incorrectly flag columns as out of order when transitioning between slots. I've left a specific comment with a suggested fix for this.

Copy link
Contributor

@nazarhussain nazarhussain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apart from that one point highlighted by Gemini #8560 (comment), LGTM.

@wemeetagain
Copy link
Member Author

/gemini review

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request simplifies the validation logic in validateColumnsByRangeResponse, which is a great improvement for readability and maintainability. The use of a Map to check for duplicates is much cleaner than the previous approach.

I've found a critical issue in the new logic for checking out-of-order column indices that could lead to incorrect warnings. I've provided a detailed comment and a code suggestion to fix this. I also found a minor redundancy that can be cleaned up. Overall, this is a good refactoring, and with the suggested fixes, it will be much more robust.

@codecov
Copy link

codecov bot commented Oct 23, 2025

Codecov Report

❌ Patch coverage is 0% with 42 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (mkeil/fix-by-range-validation@abcc0d4). Learn more about missing BASE report.

Additional details and impacted files
@@                       Coverage Diff                        @@
##             mkeil/fix-by-range-validation    #8560   +/-   ##
================================================================
  Coverage                                 ?   52.10%           
================================================================
  Files                                    ?      852           
  Lines                                    ?    65426           
  Branches                                 ?     4796           
================================================================
  Hits                                     ?    34090           
  Misses                                   ?    31267           
  Partials                                 ?       69           
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@wemeetagain
Copy link
Member Author

/gemini review

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request provides a great simplification and performance improvement to the validateColumnsByRangeResponse function. By leveraging a Map to track seen columns, the logic for handling duplicates and out-of-order items is now much cleaner and more efficient. The changes are well-implemented and significantly enhance code readability. I have one suggestion to re-introduce sorting for column sidecars to improve the robustness of the function, but overall this is an excellent refactoring.

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Copy link
Member

@matthewkeil matthewkeil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be ok. Still should throw for all out of order conditions but might throw to many or not enough warnings. But it simplifies the logic a lot so seems like a good tradeoff. Removes some of the duplication in the checks. Nice and DRY. Will get this unit tested asap to make sure it catches all cases

Comment on lines +665 to +669
if (currentSlot !== slot) {
// a new slot has started, reset index
currentIndex = -1;
} else {
currentIndex = columnSidecar.index;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pretty sure there is a bug in this logic. If the columns are delivered out of order then the indexing will be off because its not slot-wise aware. I think it will end up throwing a whole slew (or possible not enough) errors/warnings after the out of order is found. I guess though big picture we will not really care "how many out of order" there are and will just downscore on the first one so really only the first matters

@wemeetagain wemeetagain merged commit 6d0776e into mkeil/fix-by-range-validation Oct 23, 2025
22 of 23 checks passed
@wemeetagain wemeetagain deleted the cayman/review-validateColumnsByRangeResponse branch October 23, 2025 19:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants