WO2013033895A1

WO2013033895A1 - Data compressing and decompressing method, program, storage medium, and electronic product

Info

Publication number: WO2013033895A1
Application number: PCT/CN2011/079417
Authority: WO
Inventors: 崔军
Original assignee: 速压公司
Priority date: 2011-09-07
Filing date: 2011-09-07
Publication date: 2013-03-14

Abstract

A data compressing and decompressing method, a program, a storage medium, and an electronic product. The data compressing method in the present invention comprises the steps of: analyzing an initial character string of data so as to select a transform algorithm applicable to the initial character string, the transform algorithm being an algorithm that can extend a run length of a character string; applying the selected algorithm to the initial character string so as to obtain a new character string with a greater run length; adding a character for recording the transform algorithm to the new character string; and obtaining a run sequence of the new character string by performing run-length encoding on the new character string, so as to obtain compressed data.

Description

Data compression and decompression method, program, storage medium and electronic product

The present invention relates to a method of compressing and decompressing data, and to related programs, storage media and electronic products, and more particularly to lossless recompression of compressed data.

Background

With the rapid development of computers and the Internet, the volume of stored data and of online data transmission has increased dramatically, and video traffic will account for 90% of network traffic. Network congestion is becoming more and more serious. The current solutions to network transmission problems rely on increasing bandwidth and speeding up transmission, and they require huge investment and a long time to implement. A lossless compression technology that compresses data, especially already-compressed data, at high speed would have a major impact on the IT field: it would greatly enhance data storage and network transmission capacity under existing conditions, reduce investment, save time, and improve the quality of data transmission.

Summary of the invention

The present invention provides a method for compressing data (including compressed files of various formats), comprising the steps of: analyzing an initial character sequence of the data to select a transform algorithm suitable for the initial character sequence, the transform algorithm being an algorithm that can lengthen the run length of the character sequence through some transform; applying the selected transform algorithm to the initial character sequence to obtain a new character sequence having a longer run length; adding to the new character sequence a character that records the transform algorithm; and obtaining a run sequence of the new character sequence by run-length encoding the new character sequence, thereby obtaining compressed data.

According to a preferred method of the present invention, the initial character sequence is a binary number, and the transform algorithm includes one of the following algorithms or a combination of several of them: inverting the initial character sequence at regular bit positions, which may mean inverting the even-numbered bits of the initial character sequence or inverting the bits at every two-bit interval; exchanging adjacent n-bit characters of the initial character sequence as a whole, where n is an integer greater than or equal to 2; and replacing fixed character combinations in the initial character sequence according to a rule.

According to a preferred method of the present invention, the transform algorithm is either a combination of several different algorithms that successively apply several different transforms to the same character sequence, or a combination of several different algorithms applied to the character sequences of different fields within the initial character sequence.

According to a preferred method of the present invention, analyzing the initial character sequence of the data comprises: exhaustively applying a plurality of transform algorithms to the initial character sequence and comparing the compression ratios obtainable with the respective transform algorithms, to obtain the transform algorithm that optimizes the data compression ratio; or searching the initial character sequence for the specific character type that a particular transform algorithm targets and comparing the search results for the respective transform algorithms, to determine an applicable transform algorithm. Analyzing the initial character sequence of the data includes performing segmented analysis of the initial character sequence.

According to another aspect of the present invention, a data decompression method is provided, comprising: for compressed data obtained by the method described above, obtaining the new character sequence from the run sequence of the compressed data by the inverse operation of run-length encoding; obtaining the recorded transform algorithm; and applying the inverse operation of the transform algorithm to the new character sequence, thereby obtaining the initial character sequence of the data.

The invention also provides a computer program comprising instructions adapted to cause a data processing apparatus to perform the above described compression method and/or data decompression method.

The present invention also provides a storage medium including a computer program, wherein the program causes the compression method according to the present invention to be performed on initial data when the initial data is stored to the storage medium, thereby obtaining compressed data of the initial data; and, when the initial data is copied from the storage medium to the outside, causes the decompression method according to the present invention to be performed on the compressed data, thereby decompressing the compressed data into the initial data. The compression is lossless and obeys the data transmission protocol.

The present invention also provides an electronic product comprising the storage medium according to the above.

The invention achieves lossless recompression of data (including compressed data), greatly reduces data storage cost, and accelerates network transmission of data. It is of great value especially in the many applications of streaming media networks, such as the transmission of video, audio, images, and files.

Fig. 1 is a flow chart of a data compression method in accordance with the present invention.

Detailed description

Specific embodiments of the present invention are described below with reference to the accompanying drawings. It will be understood by those skilled in the art that the specific embodiments of the present invention are not intended to limit the scope of the invention.

In step 1 of Fig. 1, data analysis is first performed on the initial data to be compressed.

Data used in a computer system is binary, so for ease of explanation the initial data is taken here to be a binary character sequence. The method of the present application is particularly effective for further compressing already-compressed data; however, it is equally applicable to uncompressed data.

As an example, suppose the data sequence to be compressed by the method of the present invention is:

010101

This data is itself a compressed data sequence. A plurality of transform algorithms can be applied to the data sequence, each being an algorithm that can lengthen the run length of the character sequence through some transform.

For example, the first transform algorithm inverts the even-numbered bits of the data sequence. The new sequence obtained after inverting the even-numbered bits is:

000000

Long runs clearly appear in the new data sequence, so it can be recompressed.
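The even-bit inversion can be sketched in Python as follows. This is a minimal illustration, not the patent's implementation: the function name `invert_even_bits` is hypothetical, and bits are assumed to be numbered from 1 so that the "even-numbered bits" are the 2nd, 4th, 6th, and so on.

```python
def invert_even_bits(bits: str) -> str:
    """Invert the even-numbered bits (2nd, 4th, ...) of a binary string."""
    flip = {"0": "1", "1": "0"}
    # Assumption: positions are counted from 1, so the even-numbered
    # positions correspond to the odd 0-based indices.
    return "".join(flip[b] if i % 2 == 1 else b for i, b in enumerate(bits))


print(invert_even_bits("010101"))  # 000000
```

Applying the same function a second time restores the input, so under this reading the transform is its own inverse.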

The second transform algorithm inverts the data sequence at every two-bit interval. The new data sequence obtained by applying this transform to the initial data sequence is:

000111

Some long runs also appear in the transformed data sequence, so it can be recompressed.
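One reading of this transform that reproduces the example is to invert one bit, skip two, invert the next, and so on, starting from the second bit. The sketch below follows that reading. The function name `invert_at_interval` and its `gap` and `start` parameters are hypothetical, and the starting position is an assumption inferred from the example values rather than stated in the text.

```python
def invert_at_interval(bits: str, gap: int = 2, start: int = 1) -> str:
    """Invert the bit at 0-based index `start`, then every bit `gap` bits later."""
    flip = {"0": "1", "1": "0"}
    step = gap + 1  # `gap` untouched bits lie between two inverted bits
    return "".join(
        flip[b] if i >= start and (i - start) % step == 0 else b
        for i, b in enumerate(bits)
    )


print(invert_at_interval("010101"))  # 000111
```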

The third transform algorithm exchanges adjacent n-bit characters of the data sequence as a whole, where n is an integer greater than or equal to 2. For example, exchanging adjacent three-bit characters of the initial data sequence gives the new data sequence:

010101

Similarly, long runs appear in the transformed data sequence, so it can be recompressed.
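One possible reading of "exchanging adjacent n-bit characters" is to split the sequence into n-bit groups and swap each pair of neighboring groups. The sketch below implements that reading only. The function name `swap_adjacent_groups`, the handling of a trailing incomplete group, and the sample input are all assumptions and are not taken from the text.

```python
def swap_adjacent_groups(bits: str, n: int = 3) -> str:
    """Swap each pair of neighboring n-bit groups in a binary string."""
    if n < 2:
        raise ValueError("n must be an integer greater than or equal to 2")
    groups = [bits[i:i + n] for i in range(0, len(bits), n)]
    for i in range(0, len(groups) - 1, 2):
        # Assumption: a trailing group shorter than n bits is left in place,
        # which keeps the transform reversible by applying it again.
        if len(groups[i + 1]) == n:
            groups[i], groups[i + 1] = groups[i + 1], groups[i]
    return "".join(groups)


# Hypothetical input chosen only to show the mechanics.
print(swap_adjacent_groups("001100", n=3))  # 100001
```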

The fourth transform algorithm replaces fixed character combinations in the data sequence, for example replacing "10" in the initial data with 1 and replacing "11" with 0. The new data sequence obtained by applying this transform to the initial data sequence is:

1000111100001001111000101101010001000110111

Many long runs also appear in the transformed data sequence, so it can be recompressed.

The transform algorithms above are merely exemplary, and those skilled in the art can design further transform algorithms based on this disclosure. A transform algorithm can be more complicated and can take the form of a calculation formula. It can be a combination of several different algorithms that successively apply several different transforms to the same character sequence, or a combination of several different algorithms applied to the character sequences of different fields within the initial character sequence, as long as it ultimately lengthens the run length of the data.

From the examples above, those skilled in the art can also see that the transform algorithms differ in how well they suit the same data. For example, the initial data above has a longer run length after the first transform than after the second, which makes it easier to compress. The initial data to be compressed therefore needs to be analyzed in order to select the transform algorithm best suited to it. When analyzing the initial data, a plurality of transform algorithms may be applied to it exhaustively and the compression ratios obtainable with the respective transform algorithms compared, to obtain the transform algorithm that optimizes the data compression ratio. Alternatively, the initial character sequence may be searched for the specific character type that a particular transform algorithm targets, and the search results for the respective transform algorithms compared to determine the applicable transform algorithm. Other methods of analyzing data to obtain a matching transform algorithm will also occur to those skilled in the art, and all are included within the scope of the present invention.
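The exhaustive selection described above might look like the following sketch. It is illustrative only: the candidate table `CANDIDATES`, the numeric identifiers, and the function names are hypothetical, and the number of runs after the transform is used as a simple stand-in for the compression ratio, which the text does not define precisely.

```python
from itertools import groupby

FLIP = {"0": "1", "1": "0"}


def run_lengths(bits: str) -> list:
    """Lengths of the maximal runs of identical bits, in order."""
    return [len(list(group)) for _, group in groupby(bits)]


def invert_even_bits(bits: str) -> str:
    return "".join(FLIP[b] if i % 2 == 1 else b for i, b in enumerate(bits))


def invert_at_interval(bits: str) -> str:
    # Assumed reading: invert the 2nd bit, then every third bit after it.
    return "".join(FLIP[b] if i % 3 == 1 else b for i, b in enumerate(bits))


# Hypothetical numbering of the candidate transforms; 0 means "no transform".
CANDIDATES = {0: lambda s: s, 1: invert_even_bits, 2: invert_at_interval}


def select_transform(bits: str) -> int:
    """Try every candidate and return the id that yields the fewest runs."""
    return min(CANDIDATES, key=lambda k: len(run_lengths(CANDIDATES[k](bits))))


print(select_transform("010101"))  # 1
```

For the sample sequence, the untransformed data has six runs, the second transform leaves two, and the first leaves one, so the first transform is selected.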

In step 2 of Figure 1, the selected transform algorithm is applied to the initial data to transform the data. Here, as described above, the selected transform algorithm may be a single algorithm or a combination of multiple algorithms, which is determined according to the condition of the specific initial data.

In step 3 of Fig. 1, the transform algorithm is recorded in the transformed data sequence. For example, the transform algorithms can be numbered, and a marker carrying the number can be added to the transformed data sequence to record which transform algorithm was used.
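A numbered marker could be recorded as in the sketch below. The text does not say where the marker is placed or how wide it is, so the fixed-width binary prefix, the constant `TAG_BITS`, and the function names `add_tag` and `split_tag` are all assumptions made for illustration.

```python
TAG_BITS = 2  # assumption: up to four numbered transform algorithms


def add_tag(transformed: str, transform_id: int) -> str:
    """Prefix the transformed bit string with the algorithm number."""
    if not 0 <= transform_id < 2 ** TAG_BITS:
        raise ValueError("transform_id does not fit in the tag")
    return format(transform_id, f"0{TAG_BITS}b") + transformed


def split_tag(tagged: str):
    """Recover (algorithm number, transformed bit string) from a tagged string."""
    return int(tagged[:TAG_BITS], 2), tagged[TAG_BITS:]


tagged = add_tag("000000", 1)
print(tagged)             # 01000000
print(split_tag(tagged))  # (1, '000000')
```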

In step 4 of Figure 1, the transformed data is run-length encoded to compress the data. For example, the data sequence after the first transformation can be represented as the run sequence:

2 2 11 2 2 1 3 11 2 3 5 4 1 2 2 4 1 1 5 6

In step 5, the above run sequence is represented as a binary number, which is the compressed data.
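Run-length encoding of a binary string can be sketched as follows. Because the runs in a binary sequence alternate between 0 and 1, it is enough to keep the first bit together with the list of run lengths. That representation, and the names `rle_encode` and `rle_decode`, are choices made for this sketch and are not specified in the text, which also leaves open how the run sequence is written out in binary in step 5.

```python
from itertools import groupby


def rle_encode(bits: str):
    """Return (first bit, list of run lengths) for a binary string."""
    if not bits:
        return "", []
    return bits[0], [len(list(group)) for _, group in groupby(bits)]


def rle_decode(first_bit: str, runs) -> str:
    """Rebuild the binary string from its first bit and run lengths."""
    parts, bit = [], first_bit
    for length in runs:
        parts.append(bit * length)
        bit = "1" if bit == "0" else "0"  # runs alternate between 0 and 1
    return "".join(parts)


print(rle_encode("000111"))   # ('0', [3, 3])
print(rle_decode("0", [3, 3]))  # 000111
```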

The decompression method according to the present invention is the inverse of the above compression method. For compressed data obtained by the above compression method, it comprises: obtaining the transformed data sequence from the run sequence of the compressed data by the inverse operation of run-length encoding; obtaining the recorded transform algorithm; and applying the inverse operation of the transform algorithm to the transformed data sequence, thereby obtaining the initial data sequence of the data.
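The round trip can be illustrated with the small end-to-end sketch below, which chains a transform, a marker, and run-length encoding, and then undoes them in reverse order. It rests on several assumptions not stated in the text: a one-bit marker placed in front of the transformed data, a table `TRANSFORMS` that pairs each hypothetical algorithm number with its forward and inverse function, and a compressed form consisting of the first bit plus the run lengths.

```python
from itertools import groupby

FLIP = {"0": "1", "1": "0"}
TAG_BITS = 1  # assumption: only two numbered algorithms in this sketch


def invert_even_bits(bits: str) -> str:
    return "".join(FLIP[b] if i % 2 == 1 else b for i, b in enumerate(bits))


# Hypothetical table: algorithm number -> (forward transform, inverse transform).
TRANSFORMS = {
    0: (lambda s: s, lambda s: s),
    1: (invert_even_bits, invert_even_bits),  # self-inverse
}


def compress(bits: str, transform_id: int):
    forward, _ = TRANSFORMS[transform_id]
    tagged = format(transform_id, f"0{TAG_BITS}b") + forward(bits)
    return tagged[0], [len(list(group)) for _, group in groupby(tagged)]


def decompress(first_bit: str, runs) -> str:
    # Inverse of run-length encoding: expand the alternating runs.
    parts, bit = [], first_bit
    for length in runs:
        parts.append(bit * length)
        bit = FLIP[bit]
    tagged = "".join(parts)
    # Read the recorded algorithm number, then apply the inverse transform.
    _, inverse = TRANSFORMS[int(tagged[:TAG_BITS], 2)]
    return inverse(tagged[TAG_BITS:])


packed = compress("010101", 1)
print(packed)               # ('1', [1, 6])
print(decompress(*packed))  # 010101
```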

Since the compression and decompression method of the present invention obtains the effect of compression by transforming the data without any deletion or corruption of the initial data, the compression method of the present invention is a lossless compression method.

Both the compression method and the decompression method according to the present invention can be implemented in the form of a computer program. The present invention may also be embodied as a storage medium including a computer program, wherein the program operates as follows: when initial data is stored to the storage medium, the compression method according to the present invention is performed on the initial data, thereby obtaining compressed data of the initial data; when the initial data is copied from the storage medium to the outside, the decompression method according to the present invention is performed on the compressed data, thereby decompressing the compressed data into the initial data. The storage medium described herein can be a flash memory, an optical disk, or another storage device known to those skilled in the art.

The present invention can also be embodied as various electronic products that include the storage medium according to the present invention, so that data copied to the product or downloaded from a network is stored in a smaller amount of space and is restored to the original data on decompression without any damage to the original data. Examples of such electronic products are smartphones, MP4 players, and other electronic devices known to those skilled in the art.

By the compression method of the invention, further lossless compression of already-compressed data can be realized, a compression ratio of at least 50% can be achieved, and compression can be performed at high speed, thereby greatly reducing data storage cost and greatly accelerating network transmission of data. This can significantly improve applications such as streaming media transmission and real-time playback.

The compression method of the present invention is applicable to various file formats of computers.

Claims

1. A data compression method, comprising the following steps:

analyzing an initial character sequence of the data to select a transform algorithm suitable for the initial character sequence, the transform algorithm being an algorithm that can lengthen the run length of the character sequence through some transform; applying the selected transform algorithm to the initial character sequence, thereby obtaining a new character sequence having a longer run length; adding to the new character sequence a character for recording the transform algorithm; and obtaining a run sequence of the new character sequence by run-length encoding the new character sequence, thereby obtaining compressed data.

2. The method according to claim 1, wherein the initial character sequence is a binary number, and the transform algorithm comprises one of the following algorithms or a combination of several of them:

inverting the initial character sequence at regular bit positions;

exchanging adjacent n-bit characters of the initial character sequence as a whole, n being an integer greater than or equal to 2;

replacing fixed character combinations in the initial character sequence according to a rule.

3. The method according to claim 2, wherein the initial character sequence is inverted at regular bit positions.

4. The method according to claim 1 or 2, wherein the transform algorithm is a combination of several different algorithms that successively apply several different transforms to the same character sequence, or a combination of several different algorithms applied to the character sequences of different fields within the initial character sequence.

5. The method according to claim 1 or 2, wherein analyzing the initial character sequence of the data comprises exhaustively applying a plurality of transform algorithms to the initial character sequence and comparing the compression ratios obtainable with the respective transform algorithms, to obtain the transform algorithm that optimizes the data compression ratio.

6. The method according to claim 1 or 2, wherein analyzing the initial character sequence of the data comprises searching the initial character sequence for the specific character type that a particular transform algorithm targets, and comparing the search results for the respective transform algorithms, thereby determining the applicable transform algorithm.

7. A data decompression method, comprising: for compressed data obtained by the method according to any one of claims 1-6, obtaining the new character sequence from the run sequence of the compressed data by the inverse operation of run-length encoding; obtaining the recorded transform algorithm; and applying the inverse operation of the transform algorithm to the new character sequence, thereby obtaining the initial character sequence of the data.

8. A computer program, the program comprising instructions adapted to cause a data processing apparatus to perform the data compression method according to any one of claims 1-6 and/or the data decompression method according to claim 7.

9. A storage medium comprising a computer program, wherein the program causes the data compression method according to any one of claims 1-6 to be performed on initial data when the initial data is stored to the storage medium, thereby obtaining compressed data of the initial data; and, when the initial data is copied from the storage medium to the outside, causes the data decompression method according to claim 7 to be performed on the compressed data, thereby decompressing the compressed data into the initial data.

10. An electronic product comprising the storage medium according to claim 9.