+
Skip to content

Conversation

mattsu2020
Copy link
Contributor

Add automatic buffer-size heuristics (ported from commit a0e77d9). We now size external-sort chunks based on input file sizes and available memory, clamping to 512 KiB–128 MiB so we avoid both tiny buffers and risky overcommit on constrained systems.

Respect user-provided --buffer-size. Only automatically computed sizes are raised to the safety minimum; explicit values are left untouched, which keeps external sorting and --compress-program working even when users choose small buffers.

Performance Comparison (baseline vs. current)
Measurements come from hyperfine --warmup 3 --runs 10; values are means in milliseconds (lower is better).

Scenario Baseline Current Delta Speedup
ASCII 500k 17.04 14.59 -2.45 1.17×
Numeric 500k 36.89 37.29 +0.40 0.99×
ASCII 4M 112.47 107.52 -4.95 1.05×
ASCII 4M (-S 32M) 207.40 100.48 -106.92 2.06×
ASCII 16M 854.18 414.45 -439.73 2.06×
ASCII 16M (-S 512M) 838.05 426.85 -411.20 1.96×

Copy link

github-actions bot commented Oct 6, 2025

GNU testsuite comparison:

Skipping an intermittent issue tests/timeout/timeout (passes in this run but fails in the 'main' branch)

Copy link

github-actions bot commented Oct 7, 2025

GNU testsuite comparison:

Skip an intermittent issue tests/misc/tee (fails in this run but passes in the 'main' branch)

Copy link

GNU testsuite comparison:

Skip an intermittent issue tests/misc/tee (fails in this run but passes in the 'main' branch)
Skip an intermittent issue tests/tail/overlay-headers (fails in this run but passes in the 'main' branch)
Skipping an intermittent issue tests/timeout/timeout (passes in this run but fails in the 'main' branch)

Copy link

GNU testsuite comparison:

Skip an intermittent issue tests/tail/overlay-headers (fails in this run but passes in the 'main' branch)

@sylvestre
Copy link
Contributor

you have to refresh the fuzz/Cargo.lock file

Copy link

GNU testsuite comparison:

Skip an intermittent issue tests/tail/overlay-headers (fails in this run but passes in the 'main' branch)
Skipping an intermittent issue tests/misc/tee (passes in this run but fails in the 'main' branch)

@sylvestre
Copy link
Contributor

please follow this documentation for the performance work:
https://github.com/uutils/coreutils/blob/main/docs/src/performance.md

We would like to see hyperfine results and codspeed benchmark :)
thanks

.filter(|s| matches!(s.settings.mode, SortMode::GeneralNumeric))
.count();

self.precomputed.fast_lexicographic = self.mode == SortMode::Default
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it needs comments, and moved into a function

rlim_max: 0,
};
match unsafe { getrlimit(RLIMIT_NOFILE, &raw mut limit) } {
match unsafe { libc::getrlimit(RLIMIT_NOFILE, &mut limit) } {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why this change?

let file_hint = file_size_hint(files);
let mem_hint = available_memory_hint();

match (file_hint, mem_hint) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the cases need comments here
it is a bit cryptic

quarter.max(max)
}

#[cfg(target_os = "linux")]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i was already not convinced in #8802
but
there is probably a better way than parsing /proc/meminfo

esp in the sort.rs code

}
}

fn ascii_case_insensitive_cmp(a: &[u8], b: &[u8]) -> Ordering {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is that necessary ?

path: Option<PathBuf>,
}

fn handler_state() -> Arc<Mutex<HandlerRegistration>> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why ?

.clone()
}

fn ensure_signal_handler_installed(state: Arc<Mutex<HandlerRegistration>>) -> UResult<()> {
Copy link
Contributor

@sylvestre sylvestre Oct 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same, it needs some explanations

Copy link

codspeed-hq bot commented Oct 11, 2025

CodSpeed Performance Report

Merging #8833 will degrade performances by 2.69%

Comparing mattsu2020:sort_performan-ce (74111c1) with main (36ca7bc)

Summary

⚡ 16 improvements
❌ 2 regressions
✅ 88 untouched
⏩ 73 skipped1

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Benchmarks breakdown

Benchmark BASE HEAD Change
sort_accented_data[500000] 819.3 ms 352.6 ms ×2.3
sort_ascii_only[500000] 798 ms 343.7 ms ×2.3
sort_case_insensitive[500000] 417.2 ms 272.1 ms +53.3%
sort_case_sensitive[500000] 327.1 ms 163.2 ms ×2
sort_key_field[500000] 688.8 ms 707.9 ms -2.69%
sort_mixed_data[500000] 730 ms 317.8 ms ×2.3
sort_numeric[500000] 1.2 s 1.2 s -2.61%
sort_reverse_locale[500000] 820.7 ms 351.7 ms ×2.3
sort_unique_locale[500000] 1,142.3 ms 473 ms ×2.4
sort_ascii_c_locale 27.9 ms 21.3 ms +30.79%
sort_ascii_utf8_locale 55.8 ms 42.7 ms +30.7%
sort_german_c_locale 94.5 ms 38 ms ×2.5
sort_german_locale 94.5 ms 38 ms ×2.5
sort_mixed_c_locale 94.1 ms 37.9 ms ×2.5
sort_mixed_utf8_locale 94.1 ms 37.9 ms ×2.5
sort_random_strings 55.3 ms 32.4 ms +70.75%
sort_reverse_mixed 93.2 ms 37.3 ms ×2.5
sort_unique_mixed 87.6 ms 38.2 ms ×2.3

Footnotes

  1. 73 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

@sylvestre
Copy link
Contributor

impressive wins :)

Replaced fixed 1GB buffer size with dynamic calculation based on input file sizes and available memory. Added heuristics to clamp buffer size within 512 KiB to 128 MiB range, preventing overcommitment on constrained systems while optimizing for typical workloads. Updated dependencies and imports to use libc directly for better portability.
- Add sysconf to codespell ignore list to prevent false positives
- Remove "NOTE:" from buffer heuristics comment for cleaner style
Copy link

GNU testsuite comparison:

Skip an intermittent issue tests/tail/overlay-headers (fails in this run but passes in the 'main' branch)
Skipping an intermittent issue tests/timeout/timeout (passes in this run but fails in the 'main' branch)

Co-authored-by: Sylvestre Ledru <sylvestre@debian.org>
Copy link

GNU testsuite comparison:

Skipping an intermittent issue tests/timeout/timeout (passes in this run but fails in the 'main' branch)

Extract logic for determining fast lexicographic and ASCII case-insensitive paths into new methods `can_use_fast_lexicographic()` and `can_use_fast_ascii_insensitive()` for improved code readability and maintainability.
…ion and latest snapshot loading

- Added comment explaining the static guard for one-time handler installation per process
- Added comment clarifying the need to load current lock/path snapshot for temp dir cleanup
@sylvestre
Copy link
Contributor

i am planning to merge #8746
because:

Improve memory detection on Linux by using libc::sysinfo instead of parsing /proc/meminfo. This provides more direct access to system memory info, calculates available memory as freeram + bufferram, and removes the need for the parse_meminfo_value helper function for better efficiency and accuracy.
Changed the cfg attribute from `target_family = "unix"` to `target_family = "linux"` to more precisely target Linux systems, aligning with the Linux-specific libc usage in the code.
…code

This change updates the conditional compilation attribute from `target_family` to `target_os` to accurately target Linux OS-specific code, ensuring proper compilation behavior.
This change modifies the conditional compilation attribute for libc imports in the sort utility to include Android alongside Linux, enabling compatibility with Android platforms that share similar system interfaces.
Removed unnecessary import for _SC_PAGESIZE and _SC_PHYS_PAGES, and prefixed libc constants with 'libc::' in sysconf calls to ensure fully qualified references and avoid potential naming conflicts.
… to dictionary

Updated the jargon wordlist in .vscode/cspell.dictionaries/jargon.wordlist.txt to include new technical terms, preventing false positives in spell checking for code-related documentation and comments.
Copy link

GNU testsuite comparison:

Skip an intermittent issue tests/misc/tee (fails in this run but passes in the 'main' branch)
Skip an intermittent issue tests/tail/overlay-headers (fails in this run but passes in the 'main' branch)
Skipping an intermittent issue tests/timeout/timeout (passes in this run but fails in the 'main' branch)

Modified conditional compilation in physical_memory_bytes() to skip Unix-specific sysconf calls on Redox OS, ensuring compatibility by returning None for unsupported platforms.
Copy link

GNU testsuite comparison:

Skip an intermittent issue tests/tail/overlay-headers (fails in this run but passes in the 'main' branch)
Skipping an intermittent issue tests/timeout/timeout (passes in this run but fails in the 'main' branch)

Copy link

GNU testsuite comparison:

Skipping an intermittent issue tests/misc/tee (passes in this run but fails in the 'main' branch)
Skipping an intermittent issue tests/tail/overlay-headers (passes in this run but fails in the 'main' branch)
Skipping an intermittent issue tests/timeout/timeout (passes in this run but fails in the 'main' branch)

Copy link

GNU testsuite comparison:

Skipping an intermittent issue tests/timeout/timeout (passes in this run but fails in the 'main' branch)

@mattsu2020 mattsu2020 requested a review from sylvestre October 13, 2025 01:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants

点击 这是indexloc提供的php浏览器服务,不要输入任何密码和下载