Victor-C-Zhang/blosc2-bench

blosc2-bench

Data and environmental setup to run benchmarks against blosc2

One-time setup

!!! Note This section is included for posterity only. The processed files are already provided in the repo.

First, install GDAL to parse the input files.

sudo dnf install gdal-devel

Grab the input files from the respective agencies. Run the processing scripts to produce the datasets.

create_rea6 TOT_PRECIP.2D.201512.grb
create_era5 data.grib

The REA6_precip dataset is taken from "Breaking Down Memory Walls". The code they used is here. Instead of evaluating over only the small 20 KiB section of one sample, we evaluate over the entire dataset of 744 samples from the COSMO-REA6 precipitation dataset.

The other datasets are taken from the bytedelta analysis. The code used there no longer works, and the analysis does not publish the English names of its datasets, so we can only make a best guess at which datasets were pulled. We use the following variables from the ERA5 reanalysis (file prefixes in parentheses):

  • 10 metre u wind component (ERA5_wind)
  • Mean sea level pressure (ERA5_pressure)
  • Total precipitation (ERA5_precip)
  • Downward UV radiation at the surface (ERA5_flux)
  • Snow density (ERA5_snow)

Some stats on the corpora

| File prefix | Size (bytes) | X    | Y   |
|-------------|--------------|------|-----|
| REA6_precip | 5590016      | 848  | 824 |
| ERA5_*      | 8305920      | 1440 | 721 |
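
As a quick sanity check on the table above, the listed sizes equal X × Y × 8 for both corpora. The sketch below assumes each file holds one X-by-Y grid of 8-byte values (for example float64); that element width is inferred from the arithmetic, not stated in the repo, and the helper name `sample_bytes` is hypothetical.

```python
# Assumption: each sample is an X-by-Y grid of 8-byte values (e.g. float64).
# The element width is inferred from the table, not documented in the repo.
BYTES_PER_VALUE = 8

# (X, Y, size listed in the table) for each file prefix.
corpora = {
    "REA6_precip": (848, 824, 5590016),
    "ERA5_*": (1440, 721, 8305920),
}


def sample_bytes(x, y, bytes_per_value=BYTES_PER_VALUE):
    """Return the size in bytes of one x-by-y grid of fixed-width values."""
    return x * y * bytes_per_value


for name, (x, y, listed) in corpora.items():
    computed = sample_bytes(x, y)
    print(f"{name}: computed={computed} listed={listed} match={computed == listed}")
```

If a different element width were in use, the computed and listed sizes would not match, so the check also serves as a cheap guard when regenerating the datasets.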

How to run benchmarks

First, create a Python virtual environment. Then install blosc2-btune:

pip install blosc2-btune

Run the benchmarks by providing the dataset directory to run against. Example:

./roundtrip /path/to/REA6_precip/

This will write a CSV file named stats.csv to the dataset directory containing the test files.
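
To take a first look at the output, a minimal sketch like the following could be used. It makes no assumption about which columns stats.csv contains; it only assumes that the first row is a header, which is not confirmed by the repo. The function name `summarize_stats` is hypothetical.

```python
import csv
from pathlib import Path


def summarize_stats(dataset_dir):
    """Return (header, row_count) for the stats.csv inside dataset_dir.

    Assumption: the first row of stats.csv is a header row.
    """
    stats_path = Path(dataset_dir) / "stats.csv"
    with stats_path.open(newline="") as f:
        reader = csv.reader(f)
        header = next(reader, [])          # empty list if the file is empty
        row_count = sum(1 for _ in reader)  # remaining rows are data rows
    return header, row_count
```

For example, `summarize_stats("/path/to/REA6_precip/")` would return the column names and the number of data rows written by the benchmark run.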
