Lossless Preprocessing of Floating Point Data to Enhance Compression

Taurone, Francesco; Lucani, Daniel E.; Fehér, Marcell; Zhang, Qi

doi:10.1007/978-3-031-38318-2_45

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 741)

Included in the following conference series:

International Symposium on Distributed Computing and Artificial Intelligence


Abstract

Data compression algorithms typically rely on identifying repeated sequences of symbols in the original data to provide a compact representation of the same information, while maintaining the ability to recover the original data from the compressed sequence. Applying a data transformation before compression has the potential to enhance compression, and the overall process remains lossless as long as the transformation is invertible. Floating point data presents unique challenges for designing invertible transformations with high compression potential. This paper identifies key conditions on basic operations on floating point data that guarantee lossless transformations. We then present four methods that use these observations to deliver lossless compression of real datasets, improving compression rates by up to 40%.
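The invertibility requirement in the abstract can be illustrated with a minimal sketch. This is not one of the paper's four methods; it only shows, under stated assumptions, how a candidate transformation might be checked for a bit-exact round trip before it is used ahead of a general-purpose compressor such as zlib. The helper names (`pack`, `unpack`, `is_lossless`, `compress`, `decompress`) and the one-byte flag format are hypothetical and chosen for illustration.

```python
import struct
import zlib


def pack(values):
    """Serialize floats as little-endian IEEE 754 binary64."""
    return struct.pack(f"<{len(values)}d", *values)


def unpack(raw):
    """Inverse of pack: bytes back to a list of Python floats."""
    return list(struct.unpack(f"<{len(raw) // 8}d", raw))


def is_lossless(values, forward, inverse):
    """True if inverse(forward(x)) reproduces every x bit for bit."""
    restored = [inverse(forward(x)) for x in values]
    return pack(restored) == pack(values)


def compress(values, forward, inverse):
    """Apply `forward` only when the round trip is exact.

    The first byte is a hypothetical flag: 1 = transformed, 0 = raw.
    """
    if is_lossless(values, forward, inverse):
        return b"\x01" + zlib.compress(pack([forward(x) for x in values]))
    return b"\x00" + zlib.compress(pack(values))


def decompress(blob, inverse):
    """Undo compress(); `inverse` is used only if the flag byte is 1."""
    values = unpack(zlib.decompress(blob[1:]))
    return [inverse(x) for x in values] if blob[0] == 1 else values


data = [0.1, 2.5, -3.75, 1e-3]

# Scaling by a power of two only changes the exponent, so it is exactly
# invertible for these values (no overflow or underflow occurs).
print(is_lossless(data, lambda x: x * 2.0, lambda x: x / 2.0))

# Adding a large constant rounds away low-order significand bits,
# so subtracting it again does not restore the input.
print(is_lossless(data, lambda x: x + 1e16, lambda x: x - 1e16))
```

The check compares serialized bytes rather than using `==` on floats, so that a round trip counts as lossless only when every bit of every value is restored. Whether a lossless transformation actually reduces the compressed size depends on the data and has to be measured separately.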

This work was supported by the IoTalentum Project within the Framework of Marie Skłodowska-Curie Actions Innovative Training Networks (ITN)-European Training Networks (ETN), which is funded by the European Union Horizon 2020 Research and Innovation Program under Grant 953442.



Author information

Authors and Affiliations

DIGIT, Department of Electrical and Computer Engineering, Aarhus University, Aarhus, Denmark
Francesco Taurone, Daniel E. Lucani, Marcell Fehér & Qi Zhang


Corresponding author

Correspondence to Francesco Taurone.

Editor information

Editors and Affiliations

High Performance Computing Center, King Abdulaziz University, Jeddah, Saudi Arabia
Rashid Mehmood
University of Minho, Braga, Portugal
Victor Alves
ISEP/GECAD, Instituto Superior de Engenharia do Porto, Porto, Portugal
Isabel Praça
Kielce University of Technology, Kielce, Poland
Jarosław Wikarek
BISITE, University of Salamanca, Salamanca, Spain
Javier Parra-Domínguez
Institute of Mathematics and Informatics, Bulgarian Academy of Sciences, Sofia, Bulgaria
Roussanka Loukanova
Universidad de Valladolid, Valladolid, Spain
Ignacio de Miguel
INESC-TEC and GECAD/ISEP, Universidade de Trás-os-Montes e Alto Douro, Vila Real, Portugal
Tiago Pinto
Universidade de Trás-os-Montes e Alto Douro, Vila Real, Portugal
Ricardo Nunes
University of Calabria, Arcavacata, Rende CS, Cosenza, Italy
Michela Ricca


Cite this paper

Taurone, F., Lucani, D.E., Fehér, M., Zhang, Q. (2023). Lossless Preprocessing of Floating Point Data to Enhance Compression. In: Mehmood, R., et al. Distributed Computing and Artificial Intelligence, Special Sessions I, 20th International Conference. DCAI 2023. Lecture Notes in Networks and Systems, vol 741. Springer, Cham. https://doi.org/10.1007/978-3-031-38318-2_45


DOI: https://doi.org/10.1007/978-3-031-38318-2_45
Published: 26 July 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-38317-5
Online ISBN: 978-3-031-38318-2
eBook Packages: Intelligent Technologies and Robotics (R0)
