-
-
Notifications
You must be signed in to change notification settings - Fork 308
Fix shift overflow in inflate and send_code. #1859
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
WalkthroughThe pull request modifies the data type of the Changes
Possibly related PRs
Suggested Labels
Suggested Reviewers
📜 Recent review detailsConfiguration used: CodeRabbit UI 📒 Files selected for processing (4)
🚧 Files skipped from review as they are similar to previous changes (3)
⏰ Context from checks skipped due to timeout of 90000ms (81)
🔇 Additional comments (7)
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## develop #1859 +/- ##
===========================================
- Coverage 80.38% 79.81% -0.58%
===========================================
Files 139 139
Lines 11176 11172 -4
Branches 2867 2864 -3
===========================================
- Hits 8984 8917 -67
- Misses 1214 1268 +54
- Partials 978 987 +9 ☔ View full report in Codecov by Sentry. |
aa81e2f to
6cc1a76
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
inflate.h (1)
124-124: LGTM! The type change is necessary for preventing shift overflow.The change from
uint32_ttouint64_tfor theholdmember is a breaking change that affects the ABI. This requires a version bump in the library.Consider adding a note in the changelog about this breaking change to help users upgrade safely.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
infback.c(2 hunks)inflate.c(6 hunks)inflate.h(1 hunks)trees_emit.h(1 hunks)
🧰 Additional context used
🪛 GitHub Check: codecov/patch
inflate.c
[warning] 294-294: inflate.c#L294
Added line #L294 was not covered by tests
[warning] 482-482: inflate.c#L482
Added line #L482 was not covered by tests
[warning] 580-580: inflate.c#L580
Added line #L580 was not covered by tests
[warning] 707-707: inflate.c#L707
Added line #L707 was not covered by tests
🔇 Additional comments (7)
trees_emit.h (1)
56-58: LGTM! The overflow protection is correctly implemented.The added conditional check prevents potential overflow by ensuring
bi_bufis only updated when there's enough space in the buffer.Let's verify that all callers handle the case when the shift operation is skipped:
✅ Verification successful
The overflow protection is safe across all callers
The verification shows that all uses of
send_bitsproperly handle the buffer state:
- Used with small, controlled bit lengths (2-7 bits)
- Buffer state (bi_buf, bi_valid) is consistently tracked
- Buffer fullness is managed through proper flushing mechanisms
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Search for direct uses of send_bits macro rg -A 5 "send_bits\s*\(" --type cLength of output: 3143
infback.c (2)
105-105: LGTM! The type change in PULLBYTE macro is correct.The change to use
uint64_tfor the shift operation prevents overflow and is consistent with the type change ininflate_state.
157-157: LGTM! The hold variable declaration is updated correctly.The change to use
uint64_tfor theholdvariable is consistent with the type change ininflate_state.inflate.c (4)
294-294: LGTM! The hold calculation in inflatePrime is updated correctly.The change to use
uint64_tfor the shift operation prevents overflow.Note: Static analysis indicates this line is not covered by tests.
🧰 Tools
🪛 GitHub Check: codecov/patch
[warning] 294-294: inflate.c#L294
Added line #L294 was not covered by tests
482-482: LGTM! The hold variable declaration is updated correctly.The change to use
uint64_tfor theholdvariable is consistent with the type change ininflate_state.Note: Static analysis indicates this line is not covered by tests.
🧰 Tools
🪛 GitHub Check: codecov/patch
[warning] 482-482: inflate.c#L482
Added line #L482 was not covered by tests
580-580: LGTM! The type cast for head->time assignment is correct.The change ensures proper type handling when assigning to head->time.
Note: Static analysis indicates this line is not covered by tests.
🧰 Tools
🪛 GitHub Check: codecov/patch
[warning] 580-580: inflate.c#L580
Added line #L580 was not covered by tests
707-707: LGTM! The type casts for ZSWAP32 are correct.The changes ensure proper type handling when using ZSWAP32.
Note: Static analysis indicates line 707 is not covered by tests.
Let's verify the test coverage for these changes:
Also applies to: 1131-1131
🧰 Tools
🪛 GitHub Check: codecov/patch
[warning] 707-707: inflate.c#L707
Added line #L707 was not covered by tests
58d1c4b to
49c8e86
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
inflate.c (1)
294-294: Overall approach to fixing bit buffer overflow looks good.The changes consistently use a 64-bit buffer for bit operations, which prevents overflow during bit accumulation and shifting. However, there are several narrowing conversions to 32-bit integers that need verification:
- Time value assignment
- Dictionary ID comparison
- Check value comparison
Please ensure that these values cannot exceed 32 bits in any valid use case.
Consider adding explicit range checks before these narrowing conversions to fail fast if the values exceed 32 bits, rather than silently truncating them.
Also applies to: 390-390, 580-580, 707-707, 1131-1131
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
infback.c(2 hunks)inflate.c(6 hunks)inflate.h(2 hunks)trees_emit.h(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
- trees_emit.h
- inflate.h
- infback.c
⏰ Context from checks skipped due to timeout of 90000ms (170)
- GitHub Check: macOS GCC UBSAN
- GitHub Check: macOS Clang ASAN (ARM64)
- GitHub Check: macOS Clang ASAN
- GitHub Check: Windows GCC Compat No Opt
- GitHub Check: Windows GCC Native Instructions (AVX)
- GitHub Check: Windows GCC
- GitHub Check: Windows ClangCl Win64 Native Instructions (AVX)
- GitHub Check: Windows ClangCl Win64
- GitHub Check: Windows ClangCl Win32
- GitHub Check: Windows MSVC 2019 v140 Win32
- GitHub Check: EL9 Clang S390X DFLTCC MSAN
- GitHub Check: Ubuntu GCC SPARC64
- GitHub Check: Ubuntu GCC PPC64LE
- GitHub Check: Ubuntu GCC AARCH64 Compat No Opt UBSAN
- GitHub Check: Ubuntu GCC AARCH64 No NEON UBSAN
- GitHub Check: Ubuntu GCC AARCH64 No ACLE UBSAN
- GitHub Check: Ubuntu GCC AARCH64 ASAN
- GitHub Check: macOS GCC UBSAN
- GitHub Check: macOS Clang ASAN (ARM64)
- GitHub Check: macOS Clang ASAN
- GitHub Check: Windows GCC Compat No Opt
- GitHub Check: Windows GCC Native Instructions (AVX)
- GitHub Check: Windows GCC
- GitHub Check: Windows ClangCl Win64 Native Instructions (AVX)
- GitHub Check: Windows ClangCl Win64
- GitHub Check: Windows ClangCl Win32
- GitHub Check: Windows MSVC 2019 v140 Win32
- GitHub Check: EL9 Clang S390X DFLTCC MSAN
- GitHub Check: Ubuntu GCC SPARC64
- GitHub Check: Ubuntu GCC PPC64LE
- GitHub Check: Ubuntu GCC AARCH64 Compat No Opt UBSAN
- GitHub Check: Ubuntu GCC AARCH64 No NEON UBSAN
- GitHub Check: Ubuntu GCC AARCH64 No ACLE UBSAN
- GitHub Check: Ubuntu GCC AARCH64 ASAN
- GitHub Check: macOS GCC UBSAN
- GitHub Check: macOS Clang ASAN (ARM64)
- GitHub Check: macOS Clang ASAN
- GitHub Check: Windows GCC Compat No Opt
- GitHub Check: Windows GCC Native Instructions (AVX)
- GitHub Check: Windows GCC
- GitHub Check: Windows ClangCl Win64 Native Instructions (AVX)
- GitHub Check: Windows ClangCl Win64
- GitHub Check: Windows ClangCl Win32
- GitHub Check: Windows MSVC 2019 v140 Win32
- GitHub Check: EL9 Clang S390X DFLTCC MSAN
- GitHub Check: Ubuntu GCC SPARC64
- GitHub Check: Ubuntu GCC PPC64LE
- GitHub Check: Ubuntu GCC AARCH64 Compat No Opt UBSAN
- GitHub Check: Ubuntu GCC AARCH64 No NEON UBSAN
- GitHub Check: Ubuntu GCC AARCH64 No ACLE UBSAN
- GitHub Check: Ubuntu GCC AARCH64 ASAN
- GitHub Check: macOS GCC UBSAN
- GitHub Check: macOS Clang ASAN (ARM64)
- GitHub Check: macOS Clang ASAN
- GitHub Check: Windows GCC Compat No Opt
- GitHub Check: Windows GCC Native Instructions (AVX)
- GitHub Check: Windows GCC
- GitHub Check: Windows ClangCl Win64 Native Instructions (AVX)
- GitHub Check: Windows ClangCl Win64
- GitHub Check: Windows ClangCl Win32
- GitHub Check: Windows MSVC 2019 v140 Win32
- GitHub Check: EL9 Clang S390X DFLTCC MSAN
- GitHub Check: Ubuntu GCC SPARC64
- GitHub Check: Ubuntu GCC PPC64LE
- GitHub Check: Ubuntu GCC AARCH64 Compat No Opt UBSAN
- GitHub Check: Ubuntu GCC AARCH64 No NEON UBSAN
- GitHub Check: Ubuntu GCC AARCH64 No ACLE UBSAN
- GitHub Check: Ubuntu GCC AARCH64 ASAN
- GitHub Check: macOS GCC UBSAN
- GitHub Check: macOS Clang ASAN (ARM64)
- GitHub Check: macOS Clang ASAN
- GitHub Check: Windows GCC Compat No Opt
- GitHub Check: Windows GCC Native Instructions (AVX)
- GitHub Check: Windows GCC
- GitHub Check: Windows ClangCl Win64 Native Instructions (AVX)
- GitHub Check: Windows ClangCl Win64
- GitHub Check: Windows ClangCl Win32
- GitHub Check: Windows MSVC 2019 v140 Win32
- GitHub Check: EL9 Clang S390X DFLTCC MSAN
- GitHub Check: Ubuntu GCC SPARC64
- GitHub Check: Ubuntu GCC PPC64LE
- GitHub Check: Ubuntu GCC AARCH64 Compat No Opt UBSAN
- GitHub Check: Ubuntu GCC AARCH64 No NEON UBSAN
- GitHub Check: Ubuntu GCC AARCH64 No ACLE UBSAN
- GitHub Check: Ubuntu GCC AARCH64 ASAN
- GitHub Check: macOS GCC UBSAN
- GitHub Check: macOS Clang ASAN (ARM64)
- GitHub Check: macOS Clang ASAN
- GitHub Check: Windows GCC Compat No Opt
- GitHub Check: Windows GCC Native Instructions (AVX)
- GitHub Check: Windows GCC
- GitHub Check: Windows ClangCl Win64 Native Instructions (AVX)
- GitHub Check: Windows ClangCl Win64
- GitHub Check: Windows ClangCl Win32
- GitHub Check: Windows MSVC 2019 v140 Win32
- GitHub Check: EL9 Clang S390X DFLTCC MSAN
- GitHub Check: Ubuntu GCC SPARC64
- GitHub Check: Ubuntu GCC PPC64LE
- GitHub Check: Ubuntu GCC AARCH64 Compat No Opt UBSAN
- GitHub Check: Ubuntu GCC AARCH64 No NEON UBSAN
- GitHub Check: Ubuntu GCC AARCH64 No ACLE UBSAN
- GitHub Check: Ubuntu GCC AARCH64 ASAN
- GitHub Check: macOS GCC UBSAN
- GitHub Check: macOS Clang ASAN (ARM64)
- GitHub Check: macOS Clang ASAN
- GitHub Check: Windows GCC Compat No Opt
- GitHub Check: Windows GCC Native Instructions (AVX)
- GitHub Check: Windows GCC
- GitHub Check: Windows ClangCl Win64 Native Instructions (AVX)
- GitHub Check: Windows ClangCl Win64
- GitHub Check: Windows ClangCl Win32
- GitHub Check: Windows MSVC 2019 v140 Win32
- GitHub Check: EL9 Clang S390X DFLTCC MSAN
- GitHub Check: Ubuntu GCC SPARC64
- GitHub Check: Ubuntu GCC PPC64LE
- GitHub Check: Ubuntu GCC AARCH64 Compat No Opt UBSAN
- GitHub Check: Ubuntu GCC AARCH64 No NEON UBSAN
- GitHub Check: Ubuntu GCC AARCH64 No ACLE UBSAN
- GitHub Check: Ubuntu GCC AARCH64 ASAN
- GitHub Check: macOS GCC UBSAN
- GitHub Check: macOS Clang ASAN (ARM64)
- GitHub Check: macOS Clang ASAN
- GitHub Check: Windows GCC Compat No Opt
- GitHub Check: Windows GCC Native Instructions (AVX)
- GitHub Check: Windows GCC
- GitHub Check: Windows ClangCl Win64 Native Instructions (AVX)
- GitHub Check: Windows ClangCl Win64
- GitHub Check: Windows ClangCl Win32
- GitHub Check: Windows MSVC 2019 v140 Win32
- GitHub Check: EL9 Clang S390X DFLTCC MSAN
- GitHub Check: Ubuntu GCC SPARC64
- GitHub Check: Ubuntu GCC PPC64LE
- GitHub Check: Ubuntu GCC AARCH64 Compat No Opt UBSAN
- GitHub Check: Ubuntu GCC AARCH64 No NEON UBSAN
- GitHub Check: Ubuntu GCC AARCH64 No ACLE UBSAN
- GitHub Check: Ubuntu GCC AARCH64 ASAN
- GitHub Check: macOS GCC UBSAN
- GitHub Check: macOS Clang ASAN (ARM64)
- GitHub Check: macOS Clang ASAN
- GitHub Check: Windows GCC Compat No Opt
- GitHub Check: Windows GCC Native Instructions (AVX)
- GitHub Check: Windows GCC
- GitHub Check: Windows ClangCl Win64 Native Instructions (AVX)
- GitHub Check: Windows ClangCl Win64
- GitHub Check: Windows ClangCl Win32
- GitHub Check: Windows MSVC 2019 v140 Win32
- GitHub Check: EL9 Clang S390X DFLTCC MSAN
- GitHub Check: Ubuntu GCC SPARC64
- GitHub Check: Ubuntu GCC PPC64LE
- GitHub Check: Ubuntu GCC AARCH64 Compat No Opt UBSAN
- GitHub Check: Ubuntu GCC AARCH64 No NEON UBSAN
- GitHub Check: Ubuntu GCC AARCH64 No ACLE UBSAN
- GitHub Check: Ubuntu GCC AARCH64 ASAN
- GitHub Check: macOS GCC UBSAN
- GitHub Check: macOS Clang ASAN (ARM64)
- GitHub Check: macOS Clang ASAN
- GitHub Check: Windows GCC Compat No Opt
- GitHub Check: Windows GCC Native Instructions (AVX)
- GitHub Check: Windows GCC
- GitHub Check: Windows ClangCl Win64 Native Instructions (AVX)
- GitHub Check: Windows ClangCl Win64
- GitHub Check: Windows ClangCl Win32
- GitHub Check: Windows MSVC 2019 v140 Win32
- GitHub Check: EL9 Clang S390X DFLTCC MSAN
- GitHub Check: Ubuntu GCC SPARC64
- GitHub Check: Ubuntu GCC PPC64LE
- GitHub Check: Ubuntu GCC AARCH64 Compat No Opt UBSAN
- GitHub Check: Ubuntu GCC AARCH64 No NEON UBSAN
- GitHub Check: Ubuntu GCC AARCH64 No ACLE UBSAN
- GitHub Check: Ubuntu GCC AARCH64 ASAN
🔇 Additional comments (4)
inflate.c (4)
294-294: LGTM! Prevents bit shift overflow.The cast to
uint64_tbefore the left shift operation prevents potential overflow whenstate->bitsapproaches 32.
390-390: LGTM! Safe bit accumulation.The cast to
uint64_tensures safe bit accumulation in the hold buffer, preventing overflow when shifting bytes into higher positions.
707-707: Verify the safety of dictionary ID and check value conversions.The casts from
uint64_ttounsignedin both the dictionary ID and check value comparisons could potentially truncate data. Let's verify that these values are guaranteed to be 32-bit.Also applies to: 1131-1131
✅ Verification successful
The conversions from uint64_t to unsigned are safe
The narrowing conversions are safe because:
- Dictionary IDs and check values are Adler-32 checksums, which by definition produce 32-bit values (RFC1950)
- The ZSWAP32 macro operates on 32-bit values, and the input values cannot exceed this width
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Check if there are any bounds checks or documentation about the expected ranges # of dictionary IDs and check values. # Search for documentation or bounds checks on dictionary IDs rg -A 5 -B 5 'dictid|dict.*id' . # Search for documentation or bounds checks on check values rg -A 5 -B 5 'check.*=.*hold' .Length of output: 21072
580-580: Verify the safety of narrowing conversion.The cast from
uint64_ttounsignedcould potentially truncate data. Let's verify that the time value cannot exceed the range ofunsigned.✅ Verification successful
The narrowing conversion is safe
The conversion from
uint64_ttounsignedis safe becauseNEEDBITS(32)explicitly ensures that only 32 bits are loaded intoholdbefore the assignment, which fits within the range ofunsigned.🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Check if there are any bounds checks on the time value before this assignment # or any documentation about the expected range of the time value. # Search for any bounds checks or documentation about time value rg -A 5 -B 5 'time.*=.*hold' . rg -A 5 -B 5 'gz_header.*time' .Length of output: 548
Script:
#!/bin/bash # Check NEEDBITS macro definition and hold variable type rg -B 5 -A 2 'NEEDBITS.*\(' inflate.c rg -B 2 -A 2 'hold.*=' inflate.c | grep -v 'time.*='Length of output: 8591
|
Not sure why Github thought I wanted to remove a reviewer, that was not my intention. I do want to see reviews before I merge this though, next stable release is waiting for this currently. |
|
Should probably performance test the send_bits. To me it seems like a lot of extra if statements just to hide warnings. |
|
I don't see any problems in PR now. |
phprus
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should wait for @nmoinvaz performance tests, but I don't think these are false positive warnings.
I think this PR needs to be merged.
I'm not quite sure if compilers can optimize out some of the if statements especially when there is more than one macro invocation in series. The current version is in my opinion a lot more readable than the previous one. As far as I managed to check, running without static analyzer never hits the added code and neither of the two length variables is ever zero with valid stream. |
|
I had some gremlins causing benchmarks to be highly random (+-5% run to run) recently, finally figured it out, turns out I should not disable deeper C-states on the idling hyperthread while the other one is running the benchmark. I am thinking this change in behavior might have been caused by a kernel or cpu microcode update. 🤷♂️ DevelopPR #1859 eca360a1ce671e29c2a808e915367ae5573fb8e7The PR adds 2176 bytes of code.
PR + added UNLIKELY() hints to all three if checks.Unlikely hints shrink the code by 320 bytes. IOW, performance takes a noticeable hit with these changes, at least in my tests on an i9-9900K (x86-64 w/AVX2). |
I thought about using an assert but those don't generally trap on release code, so no easy way to verify if it is false positive. I wanted to see that the code pass all tests on CI first. |
I'm open for suggestions. |
|
Referring to: Idk if this will work: #pragma clang diagnostic push
#pragma clang diagnostic ignored "-Wbitwise-shift"
bi_buf |= val << bi_valid;
#pragma clang diagnostic popThere is no real case where |
|
We could also attempt something like this: bi_buf |= val << (bi_valid & BIT_BUF_SIZE);Which I think is the same as |
That doesn't work... You can use "bitwise and" only with numbers where either lower or higher bits are all ones. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
inflate.c (1)
294-294: Architecture-level suggestion for bit buffer handling.All the changes consistently upgrade the bit buffer from 32-bit to 64-bit types. Consider adding a type definition for the bit buffer to make future changes easier.
+/* Define bit buffer type for consistency */ +typedef uint64_t bit_buf_t; + /* ... */ -uint64_t hold; /* bit buffer */ +bit_buf_t hold; /* bit buffer */Also applies to: 390-390, 482-482, 580-580, 707-707, 1131-1131
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
infback.c(2 hunks)inflate.c(6 hunks)inflate.h(2 hunks)trees_emit.h(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
- infback.c
- trees_emit.h
- inflate.h
⏰ Context from checks skipped due to timeout of 90000ms (57)
- GitHub Check: macOS Clang Native Instructions (ARM64)
- GitHub Check: Windows MSVC 2019 v140 Win64
- GitHub Check: Windows MSVC 2019 v140 Win32
- GitHub Check: EL9 Clang S390X DFLTCC MSAN
- GitHub Check: Ubuntu GCC AARCH64 ASAN
- GitHub Check: macOS Clang Native Instructions (ARM64)
- GitHub Check: Windows MSVC 2019 v140 Win64
- GitHub Check: Windows MSVC 2019 v140 Win32
- GitHub Check: EL9 Clang S390X DFLTCC MSAN
- GitHub Check: Ubuntu GCC AARCH64 ASAN
- GitHub Check: macOS Clang Native Instructions (ARM64)
- GitHub Check: Windows MSVC 2019 v140 Win64
- GitHub Check: Windows MSVC 2019 v140 Win32
- GitHub Check: EL9 Clang S390X DFLTCC MSAN
- GitHub Check: Ubuntu GCC AARCH64 ASAN
- GitHub Check: macOS Clang Native Instructions (ARM64)
- GitHub Check: Windows MSVC 2019 v140 Win64
- GitHub Check: Windows MSVC 2019 v140 Win32
- GitHub Check: EL9 Clang S390X DFLTCC MSAN
- GitHub Check: Ubuntu GCC AARCH64 Compat No Opt UBSAN
- GitHub Check: Ubuntu GCC AARCH64 ASAN
- GitHub Check: macOS Clang Native Instructions (ARM64)
- GitHub Check: Windows MSVC 2019 v140 Win64
- GitHub Check: Windows MSVC 2019 v140 Win32
- GitHub Check: EL9 Clang S390X DFLTCC MSAN
- GitHub Check: Ubuntu GCC AARCH64 Compat No Opt UBSAN
- GitHub Check: Ubuntu GCC AARCH64 ASAN
- GitHub Check: macOS Clang Native Instructions (ARM64)
- GitHub Check: Windows MSVC 2019 v140 Win64
- GitHub Check: Windows MSVC 2019 v140 Win32
- GitHub Check: EL9 Clang S390X DFLTCC MSAN
- GitHub Check: Ubuntu GCC AARCH64 Compat No Opt UBSAN
- GitHub Check: Ubuntu GCC AARCH64 ASAN
- GitHub Check: macOS Clang Native Instructions (ARM64)
- GitHub Check: Windows MSVC 2019 v140 Win64
- GitHub Check: Windows MSVC 2019 v140 Win32
- GitHub Check: EL9 Clang S390X DFLTCC MSAN
- GitHub Check: Ubuntu GCC AARCH64 Compat No Opt UBSAN
- GitHub Check: Ubuntu GCC AARCH64 ASAN
- GitHub Check: macOS Clang Native Instructions (ARM64)
- GitHub Check: Windows MSVC 2019 v140 Win64
- GitHub Check: Windows MSVC 2019 v140 Win32
- GitHub Check: EL9 Clang S390X DFLTCC MSAN
- GitHub Check: Ubuntu GCC AARCH64 Compat No Opt UBSAN
- GitHub Check: Ubuntu GCC AARCH64 ASAN
- GitHub Check: macOS Clang Native Instructions (ARM64)
- GitHub Check: Windows MSVC 2019 v140 Win64
- GitHub Check: Windows MSVC 2019 v140 Win32
- GitHub Check: EL9 Clang S390X DFLTCC MSAN
- GitHub Check: Ubuntu GCC AARCH64 Compat No Opt UBSAN
- GitHub Check: Ubuntu GCC AARCH64 ASAN
- GitHub Check: macOS Clang Native Instructions (ARM64)
- GitHub Check: Windows MSVC 2019 v140 Win64
- GitHub Check: Windows MSVC 2019 v140 Win32
- GitHub Check: EL9 Clang S390X DFLTCC MSAN
- GitHub Check: Ubuntu GCC AARCH64 Compat No Opt UBSAN
- GitHub Check: Ubuntu GCC AARCH64 ASAN
🔇 Additional comments (6)
inflate.c (6)
482-482: Consistent type change for hold variable.The change from implicit to
uint64_ttype for theholdvariable in theinflate()function aligns with the PR's goal of standardizing the bit buffer size.
707-707: Verify ZSWAP32 operation with unsigned cast.The cast to
unsignedbefore ZSWAP32 operation needs verification to ensure it doesn't affect the byte-swapping operation.✅ Verification successful
🏁 Script executed:
#!/bin/bash # Description: Check the implementation of ZSWAP32 macro rg -A 2 'ZSWAP32\('Length of output: 2555
Cast to Unsigned is Appropriate for the ZSWAP32 Operation
After reviewing the definitions of the ZSWAP32 macro (which include implementations using built-in byte swap routines and manual bit-shifting) and its usage across the codebase, the explicit cast to unsigned in
ZSWAP32((unsigned)hold)is correct. The cast ensures that the input is treated as a 32-bit unsigned value, which is exactly what ZSWAP32 is designed to operate on, preserving the intended bit pattern for accurate byte swapping.
390-390: Verify the PULLBYTE macro usage.The change to
uint64_tin the PULLBYTE macro is consistent with the PR's objective to standardize theholdvariable size.✅ Verification successful
🏁 Script executed:
#!/bin/bash # Description: Check for any other instances of PULLBYTE macro to ensure consistent usage rg -A 2 'PULLBYTE\(\)'Length of output: 1809
PULLBYTE macro usage appears consistent with the PR’s objective.
- The macro definition in inflate.c uses a cast to
uint64_t, and similar usage is found across the codebase.- No conflicting usage or deviations were identified in the preliminary search.
1131-1131: Verify data check comparison.The cast to
unsignedin the data check comparison is consistent with other changes, but we should verify it doesn't affect the checksum validation.✅ Verification successful
🏁 Script executed:
#!/bin/bash # Description: Check other instances of checksum validation rg -A 2 'state->check'Length of output: 3428
Checksum validation comparison verified.
- The usage of
ZSWAP32((unsigned)hold)in the checksum comparison is consistent with similar conversions used in other sections (e.g., checksum updates in bothinflate.candarch/s390/dfltcc_inflate.c).- The cast to
unsigneddoes not affect the correctness of the checksum validation, as it maintains consistency with how the checksum value is computed and compared throughout the codebase.
294-294: Verify the bit shift operation.The change from implicit to explicit 64-bit type for
holdis part of the fix for shift overflow. However, we should verify that the shift operation is safe.✅ Verification successful
🏁 Script executed:
#!/bin/bash # Description: Check if bits can exceed 32 in inflatePrime ast-grep --pattern $'if \(bits > $_\) { $$$ }'Length of output: 49
I'll gather more context around the bit shift operation in
inflate.cto verify that the usage ofstate->bitsis appropriately safeguarded. Let's check the surrounding code where this operation occurs and search for any bounds checks onstate->bits.
🏁 Script executed:
#!/bin/bash # 1. Look up the context of the shift operation in inflate.c to see how 'state->bits' is handled. rg "hold \+= \(uint64_t\)value << state->bits" -n -C 5 inflate.c # 2. Search for any conditional checks on 'state->bits' in the file. rg "if\s*\(.*state->bits\s*[<>]=" -n -C 5 inflate.cLength of output: 426
Shift operation in inflate.c (line 294) is safe as implemented.
The condition at line 291 ensures that the sum of bits to be shifted (state->bits plus bits) never exceeds 32, keeping the shift amount well within safe bounds for a 64-bit value. This verification confirms that the explicit cast and the bounded shift ensure no overflow occurs during the operation.
580-580: Verify time value truncation.The cast to
unsignedwhen assigning tostate->head->timecould potentially truncate values. We should verify if this is intentional and safe.✅ Verification successful
🏁 Script executed:
#!/bin/bash # Description: Check the type of state->head->time to understand potential truncation ast-grep --pattern $'struct $_head { $$$ unsigned time; $$$ }'Length of output: 69
I'm not seeing enough context from the initial AST search. Let's run a few additional shell scripts to locate both the assignment context for
state->head->timeand howholdis defined ininflate.cso we can better assess whether the cast tounsignedtruncates values intentionally.
🏁 Script executed:
#!/bin/bash # 1. Search for the assignment of state->head->time in inflate.c to see surrounding context. rg -n "state->head->time" inflate.c -C 5 # 2. Search for the declaration or usage of 'hold' to understand its type. rg -n "hold" inflate.c -C 10 # 3. Attempt to locate the definition of the structure containing 'time'. # This searches for lines defining a member 'time' in any struct in inflate.c. rg -n "time[[:space:]]*;" inflate.c -C 5Length of output: 13028
Attention: The 32-bit cast is intentional and safe.
The header’s time field is specified as a 32‐bit value (consistent with the GZIP format), and the code explicitly reads 32 bits (via the NEEDBITS(32) macro) into the 64-bit accumulator before casting. In this context, truncating to an unsigned 32-bit value is by design and does not lead to data loss.
You can't use pragmas within preprocessor macros. |
|
Retested with updated PR For reference, Develop: This PR: It is now 128 bytes smaller than Develop. Confirmed these benchmarks with another run of before/after compiled with increased function alignment, showing the same trends as the above (0.58% slower on average). |
|
I benchmarked this alternative: --- a/trees_emit.h
+++ b/trees_emit.h
@@ -45,10 +45,10 @@ extern Z_INTERNAL const int base_dist[D_CODES];
uint32_t total_bits = bi_valid + len;\
send_bits_trace(s, val, len);\
sent_bits_add(s, len);\
- if (total_bits < BIT_BUF_SIZE) {\
+ if (total_bits < BIT_BUF_SIZE && bi_valid < BIT_BUF_SIZE) {\
bi_buf |= val << bi_valid;\
bi_valid = total_bits;\
- } else if (bi_valid == BIT_BUF_SIZE) {\
+ } else if (bi_valid >= BIT_BUF_SIZE) {\
put_uint64(s, bi_buf);\
bi_buf = val;\
bi_valid = len;\Wouldn't this also fix the problem? It keeps the block ordering more optimal. Surprisingly this results in even smaller code, -320bytes, the compiler must see some optimization opportunity that I don't. |
|
That does look a bit better. I think the logic is probably ok. Is it necessary to change |
That's exactly the code what I thought first, but it's kinda less readable than the current "alternative"... If it performs better, I can change... I didn't like the extra test at top as every extra check usually means penalty somewhere and rely on optimizer to mask the penalty. |
Static analyser does always require that and we really try to not mess too much with it to avoid the risk of replacing bad code with another bad code... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
inflate.c (1)
291-292: Consider adding overflow checks for bits parameter.The function checks if
bits > 16or if the total bits exceed 32, but with the change touint64_t, we might want to add additional validation:-if (bits > 16 || state->bits + (unsigned int)bits > 32) +if (bits > 16 || state->bits + (unsigned int)bits > 32 || + ((uint64_t)1 << bits) > UINT64_MAX / (1ULL << state->bits))
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
infback.c(2 hunks)inflate.c(6 hunks)inflate.h(2 hunks)trees_emit.h(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
- trees_emit.h
- inflate.h
- infback.c
⏰ Context from checks skipped due to timeout of 90000ms (113)
- GitHub Check: Windows ClangCl Win32
- GitHub Check: Windows MSVC 2019 v140 Win32
- GitHub Check: EL9 Clang S390X DFLTCC MSAN
- GitHub Check: macOS GCC
- GitHub Check: macOS GCC Symbol Prefix & Compat
- GitHub Check: Ubuntu GCC AARCH64 ASAN
- GitHub Check: macOS GCC Symbol Prefix
- GitHub Check: Ubuntu Emscripten WASM32
- GitHub Check: Ubuntu GCC AARCH64 Compat No Opt
- GitHub Check: Ubuntu GCC AARCH64 No NEON
- GitHub Check: Ubuntu GCC AARCH64 No ACLE
- GitHub Check: Ubuntu GCC AARCH64
- GitHub Check: Windows ClangCl Win32
- GitHub Check: Windows MSVC 2019 v140 Win32
- GitHub Check: EL9 Clang S390X DFLTCC MSAN
- GitHub Check: macOS GCC
- GitHub Check: macOS GCC Symbol Prefix & Compat
- GitHub Check: Ubuntu GCC AARCH64 ASAN
- GitHub Check: macOS GCC Symbol Prefix
- GitHub Check: Ubuntu Emscripten WASM32
- GitHub Check: Ubuntu GCC AARCH64 Compat No Opt
- GitHub Check: Ubuntu GCC AARCH64 No NEON
- GitHub Check: Ubuntu GCC AARCH64 No ACLE
- GitHub Check: Ubuntu GCC AARCH64
- GitHub Check: Windows ClangCl Win32
- GitHub Check: Windows MSVC 2019 v140 Win32
- GitHub Check: EL9 Clang S390X DFLTCC MSAN
- GitHub Check: macOS GCC
- GitHub Check: macOS GCC Symbol Prefix & Compat
- GitHub Check: Ubuntu GCC AARCH64 ASAN
- GitHub Check: macOS GCC Symbol Prefix
- GitHub Check: Ubuntu Emscripten WASM32
- GitHub Check: Ubuntu GCC AARCH64 Compat No Opt
- GitHub Check: Ubuntu GCC AARCH64 No NEON
- GitHub Check: Ubuntu GCC AARCH64 No ACLE
- GitHub Check: Ubuntu GCC AARCH64
- GitHub Check: Windows ClangCl Win32
- GitHub Check: Windows MSVC 2019 v140 Win32
- GitHub Check: EL9 Clang S390X DFLTCC MSAN
- GitHub Check: macOS GCC
- GitHub Check: macOS GCC Symbol Prefix & Compat
- GitHub Check: Ubuntu GCC AARCH64 ASAN
- GitHub Check: macOS GCC Symbol Prefix
- GitHub Check: Ubuntu Emscripten WASM32
- GitHub Check: Ubuntu GCC AARCH64 Compat No Opt
- GitHub Check: Ubuntu GCC AARCH64 No NEON
- GitHub Check: Ubuntu GCC AARCH64 No ACLE
- GitHub Check: Ubuntu GCC AARCH64
- GitHub Check: Windows ClangCl Win32
- GitHub Check: Windows MSVC 2019 v140 Win32
- GitHub Check: EL9 Clang S390X DFLTCC MSAN
- GitHub Check: macOS GCC
- GitHub Check: macOS GCC Symbol Prefix & Compat
- GitHub Check: Ubuntu GCC AARCH64 ASAN
- GitHub Check: macOS GCC Symbol Prefix
- GitHub Check: Ubuntu Emscripten WASM32
- GitHub Check: Ubuntu GCC AARCH64 Compat No Opt
- GitHub Check: Ubuntu GCC AARCH64 No NEON
- GitHub Check: Ubuntu GCC AARCH64 No ACLE
- GitHub Check: Ubuntu GCC AARCH64
- GitHub Check: Windows ClangCl Win32
- GitHub Check: Windows MSVC 2019 v140 Win32
- GitHub Check: EL9 Clang S390X DFLTCC MSAN
- GitHub Check: macOS GCC
- GitHub Check: macOS GCC Symbol Prefix & Compat
- GitHub Check: Ubuntu GCC AARCH64 ASAN
- GitHub Check: macOS GCC Symbol Prefix
- GitHub Check: Ubuntu Emscripten WASM32
- GitHub Check: Ubuntu GCC AARCH64 Compat No Opt
- GitHub Check: Ubuntu GCC AARCH64 No NEON
- GitHub Check: Ubuntu GCC AARCH64 No ACLE
- GitHub Check: Ubuntu GCC AARCH64
- GitHub Check: Windows ClangCl Win32
- GitHub Check: Windows MSVC 2019 v140 Win32
- GitHub Check: EL9 Clang S390X DFLTCC MSAN
- GitHub Check: macOS GCC Symbol Prefix & Compat
- GitHub Check: Ubuntu GCC AARCH64 ASAN
- GitHub Check: macOS GCC Symbol Prefix
- GitHub Check: Ubuntu Emscripten WASM32
- GitHub Check: Ubuntu GCC AARCH64 Compat No Opt
- GitHub Check: Ubuntu GCC AARCH64 No NEON
- GitHub Check: Ubuntu GCC AARCH64 No ACLE
- GitHub Check: Ubuntu GCC AARCH64
- GitHub Check: Windows ClangCl Win32
- GitHub Check: Windows MSVC 2019 v140 Win32
- GitHub Check: EL9 Clang S390X DFLTCC MSAN
- GitHub Check: macOS GCC Symbol Prefix & Compat
- GitHub Check: Ubuntu GCC AARCH64 ASAN
- GitHub Check: macOS GCC Symbol Prefix
- GitHub Check: Ubuntu Emscripten WASM32
- GitHub Check: Ubuntu GCC AARCH64 No NEON
- GitHub Check: Ubuntu GCC AARCH64 No ACLE
- GitHub Check: Ubuntu GCC AARCH64
- GitHub Check: Windows ClangCl Win32
- GitHub Check: Windows MSVC 2019 v140 Win32
- GitHub Check: EL9 Clang S390X DFLTCC MSAN
- GitHub Check: macOS GCC Symbol Prefix & Compat
- GitHub Check: Ubuntu GCC AARCH64 ASAN
- GitHub Check: macOS GCC Symbol Prefix
- GitHub Check: Ubuntu Emscripten WASM32
- GitHub Check: Ubuntu GCC AARCH64 No NEON
- GitHub Check: Ubuntu GCC AARCH64 No ACLE
- GitHub Check: Ubuntu GCC AARCH64
- GitHub Check: Windows ClangCl Win32
- GitHub Check: Windows MSVC 2019 v140 Win32
- GitHub Check: EL9 Clang S390X DFLTCC MSAN
- GitHub Check: macOS GCC Symbol Prefix & Compat
- GitHub Check: Ubuntu GCC AARCH64 ASAN
- GitHub Check: macOS GCC Symbol Prefix
- GitHub Check: Ubuntu Emscripten WASM32
- GitHub Check: Ubuntu GCC AARCH64 No NEON
- GitHub Check: Ubuntu GCC AARCH64 No ACLE
- GitHub Check: Ubuntu GCC AARCH64
🔇 Additional comments (4)
inflate.c (4)
482-482: LGTM! The change to uint64_t aligns with the PR objectives.The change from
unsignedtouint64_tfor theholdvariable ensures consistent 64-bit storage across all versions of the inflate function, preventing potential overflow issues.
386-392: LGTM! The PULLBYTE macro has been updated correctly.The modification to use
uint64_tin the bit accumulator ensures proper handling of larger values and maintains consistency with theholdvariable type change.
294-294: LGTM! The shift operation in inflatePrime is now safer.The change to use
uint64_tfor the shift operation prevents potential overflow when shifting by large bit values.
580-580: Verify the type casting implications.The code adds explicit casts to
unsignedin several places:
state->head->time = (unsigned)(hold)strm->adler = state->check = ZSWAP32((unsigned)hold)ZSWAP32((unsigned)hold)) != state->checkWhile these casts appear correct, we should verify that:
- No data loss occurs when casting from
uint64_ttounsigned- The
ZSWAP32macro correctly handles the unsigned valueAlso applies to: 707-707, 1131-1131
✅ Verification successful
🏁 Script executed:
#!/bin/bash # Search for ZSWAP32 macro definition and usage rg -A 5 "ZSWAP32" --type cLength of output: 4821
CAST IMPLICATIONS in inflate.c
After reviewing the output for macro definitions and usage, it appears that the explicit casts to unsigned (typically a 32‑bit value) are consistently applied to the variable hold before passing it to ZSWAP32. In our codebase, hold is used in a controlled context where only 32 meaningful bits are expected, and ZSWAP32 is defined to operate on a 32‑bit value. This confirms that:
• No unexpected data loss occurs when converting a uint64_t to an unsigned, given that only the lower 32 bits are used.
• The ZSWAP32 macro is implemented to correctly handle the unsigned value across different platforms.Thus, the changes in explicit casts appear safe and correct.
|
Interestingly, testing the code in godbolt (although as a function, not a macro, so less chances for the compiler to optimize) seems to indicate that changing position on the two if checks would allow it to only have 2 extra code blocks instead of 3. https://godbolt.org/z/91e5WT1sP --- a/trees_emit.h
+++ b/trees_emit.h
@@ -45,10 +45,10 @@ extern Z_INTERNAL const int base_dist[D_CODES];
uint32_t total_bits = bi_valid + len;\
send_bits_trace(s, val, len);\
sent_bits_add(s, len);\
- if (total_bits < BIT_BUF_SIZE) {\
+ if (bi_valid < BIT_BUF_SIZE && total_bits < BIT_BUF_SIZE) {\
bi_buf |= val << bi_valid;\
bi_valid = total_bits;\
- } else if (bi_valid == BIT_BUF_SIZE) {\
+ } else if (bi_valid >= BIT_BUF_SIZE) {\
put_uint64(s, bi_buf);\
bi_buf = val;\
bi_valid = len;\ |
This is how I "parse" the logic.
|
I guess the extra jump is more expensive than the extra test. |
Modern compilers and processors are pretty good at guessing if jump is taken most of the time and reorder the code to make it as fast as possible. Like I said earlier, I'm open for suggestions from anyone, but I'm still worried if @nmoinvaz has trouble decoding the logic, then how can we expect that new or less experienced contributors can decode the logic and avoid breaking it later on... Honestly thinking a lot of the people who have contributed regularly are more experienced coders than I am. |
Also there should not be any need for checking That would make the diff only This also reduces the code size further. |
Static analyser doesn't work that way... |
Right, I had completely changed focus from static analyzer to optimizing the code while retaining correctness. |
I have Clang 18 and that doesn't yet support range hints... Next big update will be April 2026 and then maybe the default becomes either Clang 19 or Clang 20 depending on if Clang 20 is stable enough. Visual C++ has supported range hints for quite a while already. I haven't tested gcc 13 and later as those are broken for me. |
Dead2
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code looks good. Unless there is further discussion/changes, I'll merge this once comments have been added.
Size of
holdwas inconsistent between different inflate versions. Make all use 64-bit variable.In
send_bitsthere was no guard against shifting by variable bit width, which will overflow.Summary by CodeRabbit
Release Notes
Performance Improvements
Bug Fixes
Technical Updates
inflate_statestructure by modifying thepaddingarray size.