Rust XBRL parser that's 50-150x faster than traditional parsers. Built for speed and accuracy when processing SEC EDGAR filings.

stefanoamorelli/crabrl

crabrl 🦀

Lightning-fast XBRL parser that's 50-150x faster than traditional parsers, built for speed and accuracy when processing SEC EDGAR filings.

Performance

Speed Comparison

Key Performance Metrics:

  • 50-150x faster than traditional XBRL parsers
  • 140,000+ facts/second throughput
  • < 50MB memory for 100K facts
  • Linear scaling with file size

Technical Architecture

crabrl is built on Rust's zero-cost abstractions and modern parsing techniques. While established parsers like Arelle provide comprehensive XBRL specification support and extensive validation capabilities, crabrl focuses on high-performance parsing for scenarios where speed is critical.

Implementation Details

| Optimization | Impact | Technology |
|---|---|---|
| Zero-copy parsing | -90% memory allocs | quick-xml with string slicing |
| No garbage collection | Predictable latency | Rust's ownership model |
| Faster hashmaps | 2x lookup speed | ahash instead of default hasher |
| Compact strings | -50% memory for small strings | compact_str |
| Parallelization | 4-8x on multicore | rayon work-stealing |
| Memory mapping | Zero-copy file I/O | memmap2 |
| Better allocator | -25% allocation time | mimalloc |

Benchmark results: 100,000 XBRL facts parsed in 57 ms (crabrl) vs 2,672 ms (Arelle) on identical hardware.
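To make the "zero-copy parsing" row concrete, here is a minimal sketch of the string-slicing idea using only the standard library. It is not crabrl's code and does not use quick-xml; `element_text` is a hypothetical helper that ignores attributes, namespaces, and nesting. The point is only that the returned text borrows from the input buffer instead of being copied into a new `String`.

```rust
/// Returns the text between `<tag>` and `</tag>` as a slice of `xml`.
/// The returned text is not copied: it borrows from the input buffer.
/// Hypothetical helper for illustration only; it ignores attributes,
/// namespaces, and nesting, which a real XML parser must handle.
fn element_text<'a>(xml: &'a str, tag: &str) -> Option<&'a str> {
    // The two search patterns are small temporary allocations.
    let open = format!("<{}>", tag);
    let close = format!("</{}>", tag);
    let start = xml.find(&open)? + open.len();
    let end = start + xml[start..].find(&close)?;
    Some(&xml[start..end])
}

fn main() {
    let xml = "<fact><value>1075</value></fact>";
    let value = element_text(xml, "value").expect("value element present");
    // The slice points into `xml` itself rather than into a copy.
    let offset = value.as_ptr() as usize - xml.as_ptr() as usize;
    println!("value = {value}, byte offset in input = {offset}");
}
```

The lifetime `'a` ties the result to the input, so the compiler rejects any use of the slice after the buffer is dropped.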

XBRL Support Status

| Feature | Description | Status |
|---|---|---|
| XBRL 2.1 Instance | Parse facts, contexts, units from .xml files | ✅ Stable |
| SEC Validation | EDGAR-specific rules and checks | ✅ Stable |
| Calculation Linkbase | Validate arithmetic relationships | ✅ Stable |
| Presentation Linkbase | Extract display hierarchy | 🚧 Beta |
| Label Linkbase | Human-readable concept names | 🚧 Beta |
| Definition Linkbase | Dimensional relationships | 📋 Planned |
| Formula Linkbase | Business rules validation | 📋 Planned |
| Inline XBRL (iXBRL) | HTML-embedded XBRL | 📋 Planned |
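As a rough illustration of what "validate arithmetic relationships" means for a calculation linkbase, the sketch below checks that a total equals the weighted sum of its contributing items. The `Contribution` type, the `calculation_holds` function, and the fixed tolerance are assumptions made for this example, not crabrl's API. A conforming processor also has to consider each fact's reported precision or decimals, its context, and its unit.

```rust
/// One contribution to a calculation relationship: a child value and its
/// weight (typically +1.0 or -1.0).
struct Contribution {
    weight: f64,
    value: f64,
}

/// Checks that `total` equals the weighted sum of `children` within
/// `tolerance`. Simplified: real processors also account for each fact's
/// reported precision or decimals, contexts, and units.
fn calculation_holds(total: f64, children: &[Contribution], tolerance: f64) -> bool {
    let sum: f64 = children.iter().map(|c| c.weight * c.value).sum();
    (total - sum).abs() <= tolerance
}

fn main() {
    // Illustrative numbers only: gross profit = revenue - cost of revenue.
    let children = [
        Contribution { weight: 1.0, value: 1_000.0 },
        Contribution { weight: -1.0, value: 400.0 },
    ];
    println!("600 consistent: {}", calculation_holds(600.0, &children, 0.5));
    println!("650 consistent: {}", calculation_holds(650.0, &children, 0.5));
}
```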

Installation

From crates.io

cargo install crabrl

From Source

git clone https://github.com/stefanoamorelli/crabrl
cd crabrl
cargo build --release --features cli

As Library Dependency

[dependencies]
crabrl = "0.1.0"

Usage

CLI

# Parse and display summary
crabrl parse filing.xml

# Parse with statistics (timing and throughput)
crabrl parse filing.xml --stats

# Validate with generic rules
crabrl validate filing.xml

# Validate with SEC EDGAR rules
crabrl validate filing.xml --profile sec-edgar

# Validate with strict mode (warnings as errors)
crabrl validate filing.xml --strict

# Benchmark performance
crabrl bench filing.xml --iterations 100

Library

Basic Usage

use crabrl::Parser;

// Parse XBRL document
let parser = Parser::new();
let doc = parser.parse_file("filing.xml")?;

// Access parsed data
println!("Facts: {}", doc.facts.len());
println!("Contexts: {}", doc.contexts.len());
println!("Units: {}", doc.units.len());

Parse from Different Sources

// From file path
let doc = parser.parse_file("filing.xml")?;

// From bytes
let xml_bytes = std::fs::read("filing.xml")?;
let doc = parser.parse_bytes(&xml_bytes)?;

Validation

use crabrl::{Parser, Validator};

let parser = Parser::new();
let doc = parser.parse_file("filing.xml")?;

// Generic validation
let validator = Validator::new();
let result = validator.validate(&doc)?;

if result.is_valid {
    println!("Document is valid!");
} else {
    for error in &result.errors {
        eprintln!("Error: {}", error);
    }
}

// SEC EDGAR validation (stricter rules)
let sec_validator = Validator::sec_edgar();
let sec_result = sec_validator.validate(&doc)?;

Performance Measurements

Performance comparison with Arelle v2.17.4 (Python-based XBRL processor with full specification support):

Synthetic Dataset Benchmarks

| File Size | Facts | crabrl | Arelle | Ratio |
|---|---|---|---|---|
| Tiny | 10 | 1.1 ms | 164 ms | 150x |
| Small | 100 | 1.4 ms | 168 ms | 119x |
| Medium | 1K | 1.7 ms | 184 ms | 108x |
| Large | 10K | 6.1 ms | 351 ms | 58x |
| Huge | 100K | 57 ms | 2,672 ms | 47x |
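The Ratio column is the Arelle time divided by the crabrl time. The short sketch below recomputes it from the table; `speedup` is a name chosen for this example. Because the published times are rounded, a recomputed ratio can differ from the listed one by about one.

```rust
/// Speedup of a fast parser over a baseline: baseline time / fast time.
fn speedup(baseline_ms: f64, fast_ms: f64) -> f64 {
    baseline_ms / fast_ms
}

fn main() {
    // (label, crabrl ms, Arelle ms) copied from the table above.
    let rows = [
        ("Tiny", 1.1, 164.0),
        ("Small", 1.4, 168.0),
        ("Medium", 1.7, 184.0),
        ("Large", 6.1, 351.0),
        ("Huge", 57.0, 2672.0),
    ];
    for (label, crabrl_ms, arelle_ms) in rows {
        println!("{label}: {:.0}x", speedup(arelle_ms, crabrl_ms));
    }
}
```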

SEC Filing Parse Times

| Company | Filing Type | File Size | Facts | Parse Time | Throughput |
|---|---|---|---|---|---|
| Apple | 10-K 2023 | 1.4 MB | 1,075 | 2.1 ms | 516K facts/sec |
| Microsoft | 10-Q 2023 | 2.8 MB | 2,341 | 4.3 ms | 544K facts/sec |
| Tesla | 10-K 2023 | 3.1 MB | 3,122 | 5.8 ms | 538K facts/sec |
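Throughput here is the fact count divided by the parse time in seconds. A small sketch of that arithmetic follows, with `facts_per_second` as an illustrative name. Inputs come from the rounded values in the table, so results may differ slightly from the listed figures.

```rust
/// Facts parsed per second, given a fact count and a parse time in
/// milliseconds.
fn facts_per_second(facts: u64, parse_ms: f64) -> f64 {
    facts as f64 / (parse_ms / 1000.0)
}

fn main() {
    // (company, facts, parse time in ms) copied from the table above.
    let rows = [("Apple", 1_075, 2.1), ("Microsoft", 2_341, 4.3), ("Tesla", 3_122, 5.8)];
    for (company, facts, parse_ms) in rows {
        let per_sec = facts_per_second(facts, parse_ms);
        println!("{company}: {:.0}K facts/sec", per_sec / 1000.0);
    }
}
```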

Run Your Own Benchmarks

# Quick benchmark with Criterion
cargo bench

# Compare against Arelle
cd benchmarks && python compare_performance.py

# Test on real SEC filings
python scripts/download_fixtures.py  # Download Apple, MSFT, Tesla, etc.
cargo run --release --bin crabrl -- bench fixtures/apple/aapl-20230930_htm.xml
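For a quick measurement inside your own code, a hand-rolled timing loop can approximate what an iterations-based benchmark reports. The sketch below is an assumption-laden illustration, not how `crabrl bench` or Criterion is implemented: `mean_time` is a hypothetical helper, and the summing closure is a stand-in for a real parse call.

```rust
use std::time::{Duration, Instant};

/// Runs `work` `iterations` times and returns the mean wall-clock time.
/// A hand-rolled loop like this lacks Criterion's warm-up and outlier
/// analysis; it is only meant to show the idea.
fn mean_time<F: FnMut()>(iterations: u32, mut work: F) -> Duration {
    assert!(iterations > 0, "need at least one iteration");
    let start = Instant::now();
    for _ in 0..iterations {
        work();
    }
    start.elapsed() / iterations
}

fn main() {
    // Stand-in workload; replace with a call into the parser under test.
    let data: Vec<u64> = (0..100_000).collect();
    let mean = mean_time(100, || {
        let total: u64 = data.iter().sum();
        // Keep the optimizer from discarding the unused result.
        std::hint::black_box(total);
    });
    println!("mean per iteration: {:?}", mean);
}
```

For published numbers, prefer `cargo bench`, since Criterion handles warm-up and statistical noise that this loop ignores.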

Resources & Links

XBRL Standards

Dependencies We Use

| Crate | Purpose | Why We Chose It |
|---|---|---|
| quick-xml | XML parsing | Zero-copy, fastest XML parser in Rust |
| ahash | HashMap hashing | 2x faster than default hasher |
| compact_str | String storage | Small string optimization |
| rayon | Parallelization | Work-stealing for automatic load balancing |
| mimalloc | Memory allocator | Microsoft's high-performance allocator |
| criterion | Benchmarking | Statistical benchmarking with graphs |

Alternative XBRL Parsers

  • Arelle - Complete XBRL processor with validation, formulas, and rendering (Python)
  • python-xbrl - Lightweight Python parser
  • xbrl-parser - JavaScript/Node.js
  • XBRL4j - Java implementation

License ⚖️

This open-source project is licensed under the GNU Affero General Public License v3.0 (AGPL-3.0). This means:

  • You can use, modify, and distribute this software
  • If you modify and distribute it, you must release your changes under AGPL-3.0
  • If you run a modified version on a server, you must provide the source code to users
  • See the LICENSE file for full details

For commercial licensing options or other licensing inquiries, please contact stefano@amorelli.tech.

© 2025 Stefano Amorelli – Released under the GNU Affero General Public License v3.0. Enjoy! 🎉
