Description
debug-info-uint-overflow-issue.md
Issue Type: bug / critical
Summary:
Provider experiences integer underflow when calculating available resources with active leases, resulting in wrapped uint64 values and inability to accept any new bids despite having sufficient capacity.
Environment:
- Provider Version: 0.8.4 (also tested 0.8.2, 0.8.3 - bug exists in all)
- Inventory Operator Version: 0.8.4
- Helm Chart Version: provider-12.1.4, akash-inventory-operator-12.1.4
- Kubernetes: K3s v1.33.4
- OS: Ubuntu 24.04.3 LTS
- Hardware: Intel i9-10850K (10 cores), 32GB RAM, NVIDIA RTX 3090
- Provider Address: akash1vd5ek0addwxy5tkxg3j09chrsnq8p0fkjgqwwr
Bug Description:
When a provider has one or more active leases, the available resource calculation underflows and reports values just below UINT64_MAX (a negative result wrapped modulo 2^64) instead of the correct remaining capacity.
Evidence:
Current State:
{
  "allocatable": {
    "cpu": 10000,
    "gpu": 1,
    "memory": 20948713472,
    "storage_ephemeral": 66653508761
  },
  "available": {
    "cpu": 18446744073709531616,
    "gpu": 1,
    "memory": 18446744061600858112,
    "storage_ephemeral": 18446743990039205017
  },
  "active_leases": [
    {
      "cpu": 4000,
      "gpu": 0,
      "memory": 4294967296,
      "storage_ephemeral": 21474836480
    }
  ]
}
The Math:
- Total CPU: 10,000 millicores (10 cores)
- Active lease CPU: 4,000 millicores (4 cores)
- Expected available: 6,000 millicores (6 cores)
- Actual available: 18,446,744,073,709,531,616
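Analysis: 18,446,744,073,709,531,616 = 2^64 - 20,000 (2^64 = 18,446,744,073,709,551,616). Read as a signed 64-bit value this is -20,000 millicores. If available is computed as allocatable minus allocated, roughly 30,000 millicores were subtracted from the 10,000 allocatable, which is more than the 4,000 millicores of the single lease listed above.
This is a classic unsigned integer underflow: the subtraction result is negative, so it wraps around to a near-maximum uint64 value.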
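As a check on the arithmetic, the wrapped values can be decoded by reinterpreting them as signed 64-bit integers. The sketch below is plain Python run on the numbers quoted in this report; the helper name `as_signed64` is made up for illustration, and the "implied subtracted" figure assumes `available` is computed as `allocatable - allocated` in uint64.

```python
UINT64_MOD = 2**64

def as_signed64(value: int) -> int:
    """Reinterpret a uint64 value as a two's-complement int64."""
    return value - UINT64_MOD if value >= 2**63 else value

# Values copied from the status output in this report.
allocatable = {"cpu": 10000, "memory": 20948713472, "storage_ephemeral": 66653508761}
available = {
    "cpu": 18446744073709531616,
    "memory": 18446744061600858112,
    "storage_ephemeral": 18446743990039205017,
}

for name, wrapped in available.items():
    deficit = as_signed64(wrapped)            # negative: how far below zero the result went
    subtracted = allocatable[name] - deficit  # total that must have been subtracted (assumption above)
    print(f"{name}: signed={deficit}, implied subtracted={subtracted}")
```

For CPU this prints a signed value of -20000 and an implied 30000 millicores subtracted. For ephemeral storage the implied total is 150323855360 bytes, which is exactly 140 GiB, while the single lease requests 20 GiB.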
Kubernetes Shows Correct Resources:
Allocated resources:
Resource  Requests      Limits
cpu       5915m (59%)   8520m (85%)
memory    6950Mi (34%)  15018Mi (75%)
Kubernetes correctly shows 59% CPU allocated with plenty of room for more workloads.
Impact:
- Critical: Provider cannot accept ANY new bids while active leases exist
- Provider logs show "insufficient capacity for reservation" for all orders
- Provider earning potential limited to first lease only (~$0.24/day)
- Makes small-to-medium providers economically unviable
Reproduction Steps:
- Deploy Akash provider (any version 0.8.2-0.8.4)
- Successfully win and deploy one lease
- Query provider status: curl -k https://provider:8443/status
- Observe wrapped uint64 values in the available resources
- Attempt to bid on new orders; all will fail with "insufficient capacity"
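The "observe wrapped values" step can be automated. The following is a minimal sketch that scans a parsed inventory structure for implausibly large `available` entries. It works on the `nodes` layout seen in the "cluster resources dump" log line of this report; the full layout of the `/status` response is not shown here and may nest this data differently, and the function name `find_wrapped` is hypothetical.

```python
import json

WRAP_THRESHOLD = 2**63  # no real node has this many millicores or bytes

def find_wrapped(nodes):
    """Return (node name, resource, value) for every suspicious 'available' entry."""
    hits = []
    for node in nodes:
        for resource, value in node.get("available", {}).items():
            if value >= WRAP_THRESHOLD:
                hits.append((node.get("name", "?"), resource, value))
    return hits

# Sample taken from the "cluster resources dump" log line in this report.
sample = json.loads("""
{"nodes":[{"name":"akash.insomnyak.com",
 "allocatable":{"cpu":10000,"gpu":1,"memory":20948713472,"storage_ephemeral":66653508761},
 "available":{"cpu":18446744073709531616,"gpu":1,"memory":18446744061600858112,"storage_ephemeral":18446743990039205017}}]}
""")

for name, resource, value in find_wrapped(sample["nodes"]):
    print(f"{name}: {resource} looks wrapped ({value})")
```

On the sample, `cpu`, `memory`, and `storage_ephemeral` are flagged, while `gpu` is not.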
Provider Logs:
D[2025-10-10|02:56:02.287] cluster resources dump={
"nodes":[{
"name":"akash.insomnyak.com",
"allocatable":{"cpu":10000,"gpu":1,"memory":20948713472,"storage_ephemeral":66653508761},
"available":{"cpu":18446744073709531616,"gpu":1,"memory":18446744061600858112,"storage_ephemeral":18446743990039205017}
}]
}
I[2025-10-10|02:41:25.243] insufficient capacity for reservation
E[2025-10-10|02:41:25.243] reserving resources err="insufficient capacity"
Expected Behavior:
With 10 cores total and 4 cores allocated:
{
  "available": {
    "cpu": 6000,
    "memory": 16653746176,
    "storage_ephemeral": 45178672281
  }
}
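The expected figures follow from subtracting the active lease from the allocatable totals. A short sketch of that calculation, using the numbers from the "Current State" output (the function name `expected_available` is illustrative and is not taken from the provider code):

```python
def expected_available(allocatable, leases):
    """Subtract the summed lease requests from allocatable, per resource."""
    result = {}
    for resource, total in allocatable.items():
        used = sum(lease.get(resource, 0) for lease in leases)
        result[resource] = total - used  # Python ints are arbitrary precision, so no wrap
    return result

allocatable = {
    "cpu": 10000,
    "gpu": 1,
    "memory": 20948713472,
    "storage_ephemeral": 66653508761,
}
active_leases = [
    {"cpu": 4000, "gpu": 0, "memory": 4294967296, "storage_ephemeral": 21474836480},
]

print(expected_available(allocatable, active_leases))
```

With the single lease from this report the result is 6000 millicores of CPU, 16653746176 bytes of memory, and 45178672281 bytes of ephemeral storage.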
Attempted Workarounds:
- ✗ Downgraded to 0.8.3 - bug persists
- ✗ Downgraded to 0.8.2 - bug persists
- ✗ Set AKASH_OVERCOMMIT_PCT_CPU: "200" - no effect
- ✗ Restarted inventory operator - no effect
- ✗ Ensured inventory operator and provider versions match - no effect
Additional Context:
This appears to be a bug in the provider's resource accounting logic when subtracting active lease resources from allocatable capacity. The operation likely uses unsigned integers without proper bounds checking.
The bug makes it impossible for providers to efficiently serve multiple concurrent leases, severely limiting the network's capacity utilization.
Requested Action:
Please investigate the resource calculation code path in the provider binary, specifically where available resources are computed by subtracting active allocations from total capacity. The subtraction needs signed integer arithmetic or an explicit underflow check.
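To illustrate the kind of guard being requested, here is a small Python model of uint64 subtraction with and without protection. It is only a sketch of the arithmetic: the provider's actual code is not shown in this report, and the names `wrapping_sub`, `saturating_sub`, and `checked_sub` are hypothetical.

```python
UINT64_MAX = 2**64 - 1

def wrapping_sub(a: int, b: int) -> int:
    """Model of unchecked uint64 subtraction: wraps modulo 2**64."""
    return (a - b) & UINT64_MAX

def saturating_sub(a: int, b: int) -> int:
    """Clamp at zero instead of wrapping."""
    return a - b if a >= b else 0

def checked_sub(a: int, b: int):
    """Return (result, ok); ok is False when b exceeds a."""
    if b > a:
        return 0, False
    return a - b, True

# 10,000 allocatable millicores minus 30,000 reproduces the reported CPU value.
print(wrapping_sub(10000, 30000))
print(saturating_sub(10000, 30000))
print(checked_sub(10000, 30000))
print(checked_sub(10000, 4000))
```

Clamping keeps the reported value sane but hides the fact that more was subtracted than exists. The checked variant surfaces that condition so it can be logged, which may be more useful for tracking down why the subtracted total is too large in the first place.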