Lightning-fast XBRL parser that's 50-150x faster than traditional parsers, built for speed and accuracy when processing SEC EDGAR filings.
Key Performance Metrics:
- Roughly 50-150x faster than Arelle in the benchmarks below (47x to 150x, depending on input size)
- 140,000+ facts/second throughput
- < 50MB memory for 100K facts
- Linear scaling with file size
crabrl is built on Rust's zero-cost abstractions and modern parsing techniques. While established parsers like Arelle provide comprehensive XBRL specification support and extensive validation capabilities, crabrl focuses on high-performance parsing for scenarios where speed is critical.
Optimization | Impact | Technology |
---|---|---|
Zero-copy parsing | -90% memory allocs | quick-xml with string slicing |
No garbage collection | Predictable latency | Rust's ownership model |
Faster hashmaps | 2x lookup speed | ahash instead of default hasher |
Compact strings | -50% memory for small strings | compact_str |
Parallelization | 4-8x on multicore | rayon work-stealing |
Memory mapping | Zero-copy file I/O | memmap2 |
Better allocator | -25% allocation time | mimalloc |
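To make the "zero-copy parsing" row concrete, here is a minimal sketch of the general idea using only the standard library: values are returned as slices borrowed from the input buffer rather than as newly allocated strings. `attr_value` is a hypothetical helper written for illustration; it is not part of crabrl or quick-xml, and it ignores escaping, single-quoted values, and namespaces.

```rust
/// Hypothetical helper (not crabrl or quick-xml API): returns the value of
/// `name="..."` inside `tag` as a slice borrowed from `tag`.
fn attr_value<'a>(tag: &'a str, name: &str) -> Option<&'a str> {
    for (pos, _) in tag.match_indices(name) {
        // Only accept matches that start a new attribute (preceded by whitespace).
        let starts_attribute = tag[..pos].ends_with(|c: char| c.is_ascii_whitespace());
        let after = &tag[pos + name.len()..];
        if starts_attribute {
            if let Some(value_start) = after.strip_prefix("=\"") {
                // The returned slice still points into the original buffer.
                let end = value_start.find('"')?;
                return Some(&value_start[..end]);
            }
        }
    }
    None
}

fn main() {
    let tag = r#"<us-gaap:Revenues contextRef="FY2023" unitRef="USD" decimals="-6">"#;
    let context = attr_value(tag, "contextRef").unwrap();
    let unit = attr_value(tag, "unitRef").unwrap();
    // Neither lookup allocated a String; both values borrow from `tag`.
    println!("context={context} unit={unit}");
}
```

The lifetime `'a` ties each returned slice to the input, so the compiler rejects any use of a value after the buffer is dropped.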
Benchmark results: 100,000 XBRL facts parsed in 57 ms (crabrl) vs 2,672 ms (Arelle) on identical hardware.
Feature | Description | Status |
---|---|---|
XBRL 2.1 Instance | Parse facts, contexts, units from .xml files | ✅ Stable |
SEC Validation | EDGAR-specific rules and checks | ✅ Stable |
Calculation Linkbase | Validate arithmetic relationships | ✅ Stable |
Presentation Linkbase | Extract display hierarchy | 🚧 Beta |
Label Linkbase | Human-readable concept names | 🚧 Beta |
Definition Linkbase | Dimensional relationships | 📋 Planned |
Formula Linkbase | Business rules validation | 📋 Planned |
Inline XBRL (iXBRL) | HTML-embedded XBRL | 📋 Planned |
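As an illustration of what "validate arithmetic relationships" means for a calculation linkbase, the sketch below checks that a reported total equals the weighted sum of its child items. The `Contribution` type and `calculation_difference` function are hypothetical names for this example, not crabrl's API. The sketch also assumes whole-number values and integer weights, and it skips the `decimals`-based rounding a full validator would apply.

```rust
/// One child of a calculation relationship (hypothetical type for illustration).
struct Contribution {
    concept: &'static str,
    value: i64,  // reported value in whole currency units
    weight: i64, // typically +1 or -1
}

/// Difference between the reported total and the weighted sum of its children.
/// A result of 0 means the relationship holds exactly.
fn calculation_difference(reported_total: i64, children: &[Contribution]) -> i64 {
    let computed: i64 = children.iter().map(|c| c.weight * c.value).sum();
    reported_total - computed
}

fn main() {
    // Illustrative numbers only, not taken from a real filing.
    let children = [
        Contribution { concept: "Revenues", value: 1_000, weight: 1 },
        Contribution { concept: "CostOfRevenue", value: 600, weight: -1 },
    ];
    for c in &children {
        println!("{:>2} x {} ({})", c.weight, c.value, c.concept);
    }
    let diff = calculation_difference(400, &children);
    println!("difference from reported GrossProfit: {diff}");
}
```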
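# Install the CLI from crates.io
cargo install crabrl
# Or build from source
git clone https://github.com/stefanoamorelli/crabrl
cd crabrl
cargo build --release --features cli
# Or add crabrl as a library dependency in Cargo.toml
[dependencies]
crabrl = "0.1.0"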
# Parse and display summary
crabrl parse filing.xml
# Parse with statistics (timing and throughput)
crabrl parse filing.xml --stats
# Validate with generic rules
crabrl validate filing.xml
# Validate with SEC EDGAR rules
crabrl validate filing.xml --profile sec-edgar
# Validate with strict mode (warnings as errors)
crabrl validate filing.xml --strict
# Benchmark performance
crabrl bench filing.xml --iterations 100
use crabrl::Parser;
// Parse XBRL document
let parser = Parser::new();
let doc = parser.parse_file("filing.xml")?;
// Access parsed data
println!("Facts: {}", doc.facts.len());
println!("Contexts: {}", doc.contexts.len());
println!("Units: {}", doc.units.len());
// From file path
let doc = parser.parse_file("filing.xml")?;
// From bytes
let xml_bytes = std::fs::read("filing.xml")?;
let doc = parser.parse_bytes(&xml_bytes)?;
use crabrl::{Parser, Validator};
let parser = Parser::new();
let doc = parser.parse_file("filing.xml")?;
// Generic validation
let validator = Validator::new();
let result = validator.validate(&doc)?;
if result.is_valid {
    println!("Document is valid!");
} else {
    for error in &result.errors {
        eprintln!("Error: {}", error);
    }
}
// SEC EDGAR validation (stricter rules)
let sec_validator = Validator::sec_edgar();
let sec_result = sec_validator.validate(&doc)?;
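The CLI's `--strict` flag is described above as treating warnings as errors. The sketch below shows one way such a policy could be expressed. `Severity`, `Finding`, and `is_valid` are hypothetical names chosen for this example; how crabrl models validation findings internally is not stated here, so treat this as an assumption-laden illustration rather than the library's behavior.

```rust
/// Hypothetical severity levels for a validation finding.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Severity {
    Warning,
    Error,
}

/// Hypothetical validation finding.
struct Finding {
    severity: Severity,
    message: String,
}

/// A document passes if it has no errors and, in strict mode, no warnings either.
fn is_valid(findings: &[Finding], strict: bool) -> bool {
    !findings.iter().any(|f| {
        f.severity == Severity::Error || (strict && f.severity == Severity::Warning)
    })
}

fn main() {
    let findings = vec![Finding {
        severity: Severity::Warning,
        message: String::from("unused context"),
    }];
    for f in &findings {
        println!("{:?}: {}", f.severity, f.message);
    }
    println!("default mode valid: {}", is_valid(&findings, false));
    println!("strict mode valid:  {}", is_valid(&findings, true));
}
```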
Performance comparison with Arelle v2.17.4 (Python-based XBRL processor with full specification support):
File Size | Facts | crabrl | Arelle | Ratio |
---|---|---|---|---|
Tiny | 10 | 1.1 ms | 164 ms | 150x |
Small | 100 | 1.4 ms | 168 ms | 119x |
Medium | 1K | 1.7 ms | 184 ms | 108x |
Large | 10K | 6.1 ms | 351 ms | 58x |
Huge | 100K | 57 ms | 2,672 ms | 47x |
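The Ratio column is the Arelle time divided by the crabrl time. The short sketch below recomputes it from the rounded timings in the table. Because those timings are rounded, a recomputed ratio can differ by a unit from the published column, which was presumably derived from unrounded measurements.

```rust
/// Speedup of a candidate over a baseline, given both wall-clock times in milliseconds.
fn speedup(baseline_ms: f64, candidate_ms: f64) -> f64 {
    baseline_ms / candidate_ms
}

fn main() {
    // (label, crabrl ms, Arelle ms) copied from the table above.
    let rows = [
        ("Tiny", 1.1, 164.0),
        ("Small", 1.4, 168.0),
        ("Medium", 1.7, 184.0),
        ("Large", 6.1, 351.0),
        ("Huge", 57.0, 2672.0),
    ];
    for &(label, crabrl_ms, arelle_ms) in rows.iter() {
        println!("{label:<6} {:.1}x", speedup(arelle_ms, crabrl_ms));
    }
}
```

The ratio shrinks as files grow, which suggests that fixed startup cost dominates Arelle's time on small inputs.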
Company | Filing Type | File Size | Facts | Parse Time | Throughput |
---|---|---|---|---|---|
Apple | 10-K 2023 | 1.4 MB | 1,075 | 2.1 ms | 516K facts/sec |
Microsoft | 10-Q 2023 | 2.8 MB | 2,341 | 4.3 ms | 544K facts/sec |
Tesla | 10-K 2023 | 3.1 MB | 3,122 | 5.8 ms | 538K facts/sec |
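Throughput in the table above is the fact count divided by the parse time. A small sketch of that arithmetic follows; `facts_per_second` is a helper named for this example. Since the parse times are rounded to 0.1 ms, recomputed values may differ slightly from the published Throughput column.

```rust
use std::time::Duration;

/// Facts parsed per second for a given fact count and parse time.
fn facts_per_second(facts: u64, parse_time: Duration) -> f64 {
    facts as f64 / parse_time.as_secs_f64()
}

fn main() {
    // (company, facts, parse time in microseconds) copied from the table above.
    let rows = [
        ("Apple", 1_075u64, 2_100u64),
        ("Microsoft", 2_341, 4_300),
        ("Tesla", 3_122, 5_800),
    ];
    for &(company, facts, micros) in rows.iter() {
        let rate = facts_per_second(facts, Duration::from_micros(micros));
        println!("{company:<10} {:.0}K facts/sec", rate / 1_000.0);
    }
}
```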
# Quick benchmark with Criterion
cargo bench
# Compare against Arelle
cd benchmarks && python compare_performance.py
# Test on real SEC filings
python scripts/download_fixtures.py # Download Apple, MSFT, Tesla, etc.
cargo run --release --bin crabrl -- bench fixtures/apple/aapl-20230930_htm.xml
- XBRL International - Official XBRL specifications
- XBRL 2.1 Specification - Core standard we implement
- SEC EDGAR - Search real company filings
- EDGAR Filer Manual - SEC filing requirements
Crate | Purpose | Why We Chose It |
---|---|---|
quick-xml | XML parsing | Zero-copy, fastest XML parser in Rust |
ahash | HashMap hashing | 2x faster than default hasher |
compact_str | String storage | Small string optimization |
rayon | Parallelization | Work-stealing for automatic load balancing |
mimalloc | Memory allocator | Microsoft's high-performance allocator |
criterion | Benchmarking | Statistical benchmarking with graphs |
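To illustrate the parallelization idea without pulling in rayon, the sketch below splits a batch of in-memory documents into fixed chunks and processes each chunk on a scoped standard-library thread. This is a deliberately simpler technique than rayon's work-stealing: fixed chunks do not rebalance when one chunk takes longer than the others. Counting `contextRef=` occurrences is only a rough stand-in for real fact parsing, and both function names are hypothetical.

```rust
use std::thread;

/// Rough stand-in for parsing: counts `contextRef=` occurrences in one document.
fn count_fact_like_tags(xml: &str) -> usize {
    xml.matches("contextRef=").count()
}

/// Splits `docs` into at most `workers` contiguous chunks and counts each
/// chunk on its own scoped thread, then sums the per-chunk results.
fn count_in_parallel(docs: &[&str], workers: usize) -> usize {
    let workers = workers.max(1);
    // Ceiling division, with a floor of 1 so `chunks` never receives 0.
    let chunk_len = ((docs.len() + workers - 1) / workers).max(1);
    thread::scope(|s| {
        let handles: Vec<_> = docs
            .chunks(chunk_len)
            .map(|chunk| {
                s.spawn(move || chunk.iter().map(|d| count_fact_like_tags(d)).sum::<usize>())
            })
            .collect();
        handles
            .into_iter()
            .map(|h| h.join().expect("worker thread panicked"))
            .sum::<usize>()
    })
}

fn main() {
    let docs = [
        r#"<a contextRef="c1"/><b contextRef="c2"/>"#,
        "<c/>",
        r#"<d contextRef="c1"/>"#,
    ];
    println!("fact-like tags: {}", count_in_parallel(&docs, 2));
}
```

Scoped threads let the workers borrow `docs` directly, so no document is cloned to cross the thread boundary.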
- Arelle - Complete XBRL processor with validation, formulas, and rendering (Python)
- python-xbrl - Lightweight Python parser
- xbrl-parser - JavaScript/Node.js
- XBRL4j - Java implementation
This open-source project is licensed under the GNU Affero General Public License v3.0 (AGPL-3.0). This means:
- You can use, modify, and distribute this software
- If you modify and distribute it, you must release your changes under AGPL-3.0
- If you run a modified version on a server, you must provide the source code to users
- See the LICENSE file for full details
For commercial licensing options or other licensing inquiries, please contact stefano@amorelli.tech.
© 2025 Stefano Amorelli – Released under the GNU Affero General Public License v3.0. Enjoy! 🎉