
WO2013033895A1 - Data compressing and decompressing method, program, storage medium, and electronic product - Google Patents

Publication number
WO2013033895A1
WO2013033895A1 PCT/CN2011/079417 CN2011079417W WO2013033895A1 WO 2013033895 A1 WO2013033895 A1 WO 2013033895A1 CN 2011079417 W CN2011079417 W CN 2011079417W WO 2013033895 A1 WO2013033895 A1 WO 2013033895A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
initial
sequence
character sequence
algorithm
Prior art date
Application number
PCT/CN2011/079417
Other languages
French (fr)
Chinese (zh)
Inventor
崔军
Original Assignee
速压公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 速压公司
Priority to PCT/CN2011/079417 priority Critical patent/WO2013033895A1/en
Publication of WO2013033895A1 publication Critical patent/WO2013033895A1/en

Classifications

    • H ELECTRICITY
    • H03 ELECTRONIC CIRCUITRY
    • H03M CODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M7/00 Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
    • H03M7/30 Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
    • H03M7/3084 Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction using adaptive string matching, e.g. the Lempel-Ziv method
    • H03M7/40 Conversion to or from variable length codes, e.g. Shannon-Fano code, Huffman code, Morse code


Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

A data compressing and decompressing method, a program, a storage medium, and an electronic product. The data compressing method in the present invention comprises the steps of: analyzing an initial character string of data so as to select a transform algorithm applicable to the initial character string, the transform algorithm being an algorithm that can extend a run length of a character string; applying the selected algorithm to the initial character string so as to obtain a new character string with a greater run length; adding a character for recording the transform algorithm to the new character string; and obtaining a run sequence of the new character string by performing run-length encoding on the new character string, so as to obtain compressed data.

Description

Data compression and decompression method, program, storage medium, and electronic product
Technical Field
The present invention relates to methods for compressing and decompressing data, and to related programs, storage media, and electronic products. In particular, it relates to lossless recompression of already compressed data.
Background Art
With the rapid development of computers and the Internet, the volume of stored data and of data transmitted online has grown sharply. Video traffic is expected to account for 90% of network traffic, and network congestion is becoming increasingly serious. Increasing bandwidth and raising transmission speed are currently the main approaches to the network transmission problem, but they require huge investment and a long time to implement. A lossless compression technique that compresses data, and especially already compressed data, at high speed and with a good compression ratio would have a major impact on the IT field: under existing conditions it would greatly enhance data storage and network transmission capacity, reduce investment, save time, and improve the quality of data transmission.
Summary of the Invention
The present invention provides a method for compressing data (including compressed files of various formats), comprising the following steps: analyzing the initial character sequence of the data to select a transform algorithm suited to that sequence, the transform algorithm being an algorithm that, through a certain transformation, lengthens the run lengths of a character sequence; applying the selected transform algorithm to the initial character sequence to obtain a new character sequence with longer run lengths; adding to the new character sequence a character that records the transform algorithm; and run-length encoding the new character sequence to obtain its run sequence, thereby obtaining compressed data.
According to a preferred method of the present invention, the initial character sequence is a binary number, and the transform algorithm comprises one of the following algorithms or a combination of several of them: inverting the characters at regular digit positions of the initial character sequence, for example inverting its even-numbered bits or inverting the bits at every two-bit interval; swapping adjacent n-bit groups of characters of the initial character sequence as whole units, where n is an integer greater than or equal to 2; and replacing fixed character combinations in the initial character sequence according to an agreed convention.
According to a preferred method of the present invention, the transform algorithm is a combination of several different algorithms that successively apply several different transformations to the same character sequence, or a combination of several different algorithms applied to the character sequences of different fields within the initial character sequence.
According to a preferred method of the present invention, analyzing the initial character sequence of the data comprises: exhaustively applying multiple transform algorithms to the initial character sequence and comparing the compression ratios each can achieve, so as to obtain the transform algorithm that gives the best compression ratio; or searching the initial character sequence for specific character patterns and comparing the search results for the respective transform algorithms, so as to determine an applicable transform algorithm. Analyzing the initial character sequence of the data includes analyzing it segment by segment.
According to another aspect of the present invention, a data decompression method is provided, comprising: for compressed data obtained by the method described above, obtaining the new character sequence from the run sequence of the compressed data by the inverse operation of run-length encoding; retrieving the recorded transform algorithm; and applying the inverse operation of the transform algorithm to the new character sequence, thereby obtaining the initial character sequence of the data.
The present invention also provides a computer program comprising instructions adapted to cause a data processing apparatus to perform the above compression method and/or data decompression method.
The present invention also provides a storage medium containing a computer program, wherein the program causes the following: when initial data is stored to the storage medium, the compression method according to the present invention is performed on the initial data to obtain its compressed data; when the initial data is copied from the storage medium to the outside, the decompression method according to the present invention is performed on the compressed data, decompressing it into the initial data. The compression is lossless and complies with data transmission protocols.
The present invention also provides an electronic product comprising the storage medium described above.
The present invention achieves lossless recompression of data (including compressed data), greatly reduces data storage cost, and accelerates network data transmission. It is of great value in many streaming-media network transmission applications, such as the transmission of video, audio, images, and files.
Brief Description of the Drawings
FIG. 1 is a flowchart of the data compression method according to the present invention.
Detailed Description
Specific embodiments of the present invention are described below with reference to the accompanying drawing. Those skilled in the art will understand that the specific embodiments below serve only to better illustrate the present invention, not to limit its scope.
In step 1 of FIG. 1, the initial data to be compressed is first analyzed.
Data used in computer systems is binary, so for ease of explanation the initial data is taken here to be a binary character sequence. The present application has a strong and distinctive capability for further compressing already compressed data; however, its method is equally applicable to uncompressed data.
Here, for example, the data sequence to be compressed by the method of the present invention is:
010101
This data is a compressed data sequence. A variety of transform algorithms can be applied to it, a transform algorithm being an algorithm that, through a certain transformation, lengthens the run lengths of a character sequence.
For example, the first transform algorithm inverts the even-numbered bits of the data sequence. After the even-numbered bits of the above sequence are inverted, the new sequence is:
000000
Clearly, multiple long runs appear in the new data sequence, so it can be recompressed.
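The first transform can be sketched in a few lines of Python. This is a minimal illustration, assuming bits are held in a string of "0" and "1" characters and that "even-numbered" counts positions from 1; the function name `invert_even_bits` is hypothetical and not taken from the patent.

```python
def invert_even_bits(bits: str) -> str:
    """Flip every even-numbered bit, counting positions from 1 (assumed convention)."""
    flip = {"0": "1", "1": "0"}
    return "".join(
        flip[b] if pos % 2 == 0 else b
        for pos, b in enumerate(bits, start=1)
    )


print(invert_even_bits("010101"))  # 000000
```

Applying the function twice restores the input, so under these assumptions the transform is its own inverse.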
The second transform algorithm inverts the bits of the data sequence at every two-bit interval. The new data sequence obtained by applying this transformation to the above initial data sequence is:
000111
Some long runs also appear in the transformed sequence, so it too can be recompressed.
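The text does not spell out exactly which positions "every two-bit interval" selects. The sketch below assumes one reading: flip the 2nd bit, skip two bits, flip the 5th, and so on. That reading is chosen only because it reproduces the 000111 result shown above; the function name and its `first` and `stride` parameters are hypothetical.

```python
def invert_with_stride(bits: str, first: int = 2, stride: int = 3) -> str:
    """Flip bits at positions first, first + stride, ... (1-based, assumed reading)."""
    flip = {"0": "1", "1": "0"}
    return "".join(
        flip[b] if pos >= first and (pos - first) % stride == 0 else b
        for pos, b in enumerate(bits, start=1)
    )


print(invert_with_stride("010101"))  # 000111
```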
The third transform algorithm swaps adjacent n-bit groups of characters in the data sequence, where n is an integer greater than or equal to 2. For example, the new data sequence obtained by swapping adjacent three-bit groups of the above initial data sequence is:
010101
Likewise, multiple long runs appear in the transformed sequence, so it can be recompressed.
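A group swap of this kind might look as follows. This is a sketch under stated assumptions: groups are paired left to right, and a trailing stretch too short to form a full pair is left unchanged (the text does not say how such a remainder is handled). The function name and the sample input are hypothetical rather than taken from the text.

```python
def swap_adjacent_groups(bits: str, n: int = 3) -> str:
    """Swap each pair of adjacent n-bit groups; an incomplete tail is kept as-is."""
    if n < 2:
        raise ValueError("n must be an integer >= 2")
    step = 2 * n
    out = []
    for i in range(0, len(bits), step):
        chunk = bits[i:i + step]
        if len(chunk) == step:
            out.append(chunk[n:] + chunk[:n])  # second group first, then first group
        else:
            out.append(chunk)  # assumption: leave the incomplete pair untouched
    return "".join(out)


print(swap_adjacent_groups("001100", 3))  # 100001
```

In this sample the runs change from 2, 2, 2 to 1, 4, 1, which illustrates how a swap can merge short runs into a longer one.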
The fourth transform algorithm replaces fixed character combinations in the data sequence according to an agreed convention, for example replacing "10" in the initial data with 1 and "11" with 0. The new data sequence obtained by applying this transformation to the above initial data sequence is:
1000111100001001111000101101010001000110111
Multiple long runs likewise appear in the transformed sequence, so it can be recompressed.
The transform algorithms above are only examples. Based on this description, those skilled in the art can design further transform algorithms. A transform algorithm may be more complex: it may take the form of a calculation formula, it may be a combination of several different algorithms that successively apply several different transformations to the same character sequence, or it may be a combination of several different algorithms applied to the character sequences of different fields within the initial character sequence, provided that it ultimately lengthens the run lengths of the data.
From the examples above, those skilled in the art can also see that these transform algorithms differ in how well they suit a given piece of data. For example, after the first transformation the above initial data clearly has longer run lengths than after the second, which is more favourable to compression. The initial data to be compressed therefore needs to be analyzed in order to select the transform algorithm best suited to it. When analyzing the initial data, multiple transform algorithms may be applied to it exhaustively and the compression ratios each can achieve compared, so as to obtain the transform algorithm that gives the best compression ratio. Alternatively, the initial character sequence may be searched for the specific character patterns that a particular transform algorithm targets, and the search results for the respective transform algorithms compared, so as to determine an applicable transform algorithm. From this, those skilled in the art can conceive of other methods of analyzing data to find a matching transform algorithm, all of which fall within the scope of the present invention.
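The exhaustive variant of this analysis can be sketched as below. The sketch assumes that "fewer runs after the transform" is an acceptable stand-in for "better compression ratio", which the text does not state. The candidate table, its numeric identifiers, and all function names are hypothetical.

```python
from itertools import groupby


def run_lengths(bits: str) -> list[int]:
    """Lengths of the maximal runs of identical characters."""
    return [len(list(group)) for _, group in groupby(bits)]


def invert_even_bits(bits: str) -> str:
    flip = {"0": "1", "1": "0"}
    return "".join(flip[b] if p % 2 == 0 else b for p, b in enumerate(bits, start=1))


def invert_every_third_bit(bits: str) -> str:
    # Assumed reading of "every two-bit interval": flip positions 2, 5, 8, ...
    flip = {"0": "1", "1": "0"}
    return "".join(flip[b] if p % 3 == 2 else b for p, b in enumerate(bits, start=1))


# Hypothetical numbering of candidate transforms; 0 means "no transform".
CANDIDATES = {0: lambda s: s, 1: invert_even_bits, 2: invert_every_third_bit}


def select_transform(bits: str) -> int:
    """Try every candidate and return the id that leaves the fewest runs."""
    return min(CANDIDATES, key=lambda k: len(run_lengths(CANDIDATES[k](bits))))


print(select_transform("010101"))  # 1
```

For the sample sequence the untransformed data has six runs, the first transform leaves one, and the second leaves two, so the first transform is selected, in line with the comparison made in the text.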
In step 2 of FIG. 1, the selected transform algorithm is applied to the initial data to transform it. As noted above, the selected transform algorithm may be a single algorithm or a combination of several algorithms, depending on the particular initial data.
In step 3 of FIG. 1, the transform algorithm is recorded in the transformed data sequence. For example, the various transform algorithms can be numbered, and a marker carrying the number added to the transformed data sequence, thereby recording the transform algorithm.
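One simple way to record the transform number is a fixed-width binary prefix. The placement of the marker at the front and its 2-bit width are assumptions made for this sketch; the text only says that a numbered marker is added to the transformed sequence. The names `TAG_BITS`, `add_tag`, and `split_tag` are hypothetical.

```python
TAG_BITS = 2  # assumed marker width; allows transform ids 0..3


def add_tag(transform_id: int, transformed: str) -> str:
    """Prefix the transformed bit string with the transform id in binary."""
    if not 0 <= transform_id < 2 ** TAG_BITS:
        raise ValueError("transform id does not fit in the marker")
    return format(transform_id, f"0{TAG_BITS}b") + transformed


def split_tag(tagged: str) -> tuple[int, str]:
    """Recover the transform id and the transformed bit string."""
    return int(tagged[:TAG_BITS], 2), tagged[TAG_BITS:]


tagged = add_tag(1, "000000")
print(tagged)             # 01000000
print(split_tag(tagged))  # (1, '000000')
```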
In step 4 of FIG. 1, the transformed data is run-length encoded to compress it. For example, the data sequence after the first transformation can be represented as the run sequence:
2 2 11 2 2 1 3 11 2 3 5 4 1 2 2 4 1 1 5 6
In step 5, this run sequence expressed in binary is the compressed data.
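Steps 4 and 5 can be illustrated with a small run-length encoder. The sketch assumes that the value of the first bit is stored alongside the run lengths (runs of a binary sequence alternate, so one starting bit suffices) and that each run length is written as a fixed-width binary field. Neither detail is specified in the text, and the function names and the `width` parameter are hypothetical.

```python
from itertools import groupby


def rle_encode(bits: str) -> tuple[str, list[int]]:
    """Return the first bit and the lengths of the alternating runs."""
    if not bits:
        return "", []
    return bits[0], [len(list(group)) for _, group in groupby(bits)]


def pack_runs(first_bit: str, runs: list[int], width: int = 4) -> str:
    """Write the first bit, then each run length as a width-bit binary field."""
    if any(r >= 2 ** width for r in runs):
        raise ValueError("run length does not fit in the chosen field width")
    return first_bit + "".join(format(r, f"0{width}b") for r in runs)


first, runs = rle_encode("000111")
print(first, runs)             # 0 [3, 3]
print(pack_runs(first, runs))  # 000110011
```

With a toy input this short, the packed form is longer than the input; a gain appears only when runs are long relative to the field width, which is why the preceding transform step aims to lengthen them.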
The decompression method according to the present invention is the inverse of the above compression method. It comprises: for compressed data obtained by the above compression method, obtaining the transformed data sequence from the run sequence of the compressed data by the inverse operation of run-length encoding; retrieving the recorded transform algorithm; and applying the inverse operation of the transform algorithm to the transformed data sequence, thereby obtaining the initial data sequence of the data.
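The decompression steps can be sketched end to end as follows. The sketch carries its own assumptions: the compressed form is a starting bit plus a list of run lengths, the transform number is a 2-bit prefix of the decoded sequence, and transform 1 is the even-bit inversion, which is its own inverse. All names and the numbering are hypothetical.

```python
TAG_BITS = 2  # assumed width of the transform marker


def rle_decode(first_bit: str, runs: list[int]) -> str:
    """Expand alternating runs, starting with first_bit."""
    out, bit = [], first_bit
    for length in runs:
        out.append(bit * length)
        bit = "1" if bit == "0" else "0"
    return "".join(out)


def invert_even_bits(bits: str) -> str:
    flip = {"0": "1", "1": "0"}
    return "".join(flip[b] if p % 2 == 0 else b for p, b in enumerate(bits, start=1))


# Hypothetical table mapping each transform id to its inverse operation.
INVERSES = {0: lambda s: s, 1: invert_even_bits}


def decompress(first_bit: str, runs: list[int]) -> str:
    tagged = rle_decode(first_bit, runs)          # undo run-length encoding
    transform_id = int(tagged[:TAG_BITS], 2)      # read the recorded transform
    return INVERSES[transform_id](tagged[TAG_BITS:])  # undo the transform


print(decompress("0", [1, 1, 6]))  # 010101
```

Here the runs 1, 1, 6 decode to 01000000: the prefix 01 names transform 1, and inverting the even bits of 000000 returns the sample sequence 010101.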
Because the compression and decompression methods of the present invention achieve compression by transforming the data, without deleting or damaging any of the initial data, the compression method of the present invention is a lossless compression method.
Both the compression method and the decompression method according to the present invention can be implemented as a computer program. The present invention can also be implemented as a storage medium containing a computer program, wherein the program does the following: when initial data is stored to the storage medium, it performs the compression method according to the present invention on the initial data to obtain its compressed data; when the initial data is copied from the storage medium to the outside, it performs the decompression method according to the present invention on the compressed data, decompressing it into the initial data. The storage medium described here may be flash memory, an optical disc, or another storage device known to those skilled in the art.
The present invention can also be implemented as various electronic products that include the storage medium according to the present invention, so that data copied or downloaded from a network can be stored in a small data volume and restored to the original data on decompression without any damage to the original data. Examples of such electronic products are smartphones, MP4 players, and other electronic devices known to those skilled in the art.
The compression method of the present invention enables further lossless compression of already compressed data, can achieve a compression ratio of at least 50%, and can compress at high speed. It thereby greatly reduces data storage cost and greatly accelerates network data transmission, which can significantly improve applications such as streaming-media transmission and real-time playback.
The compression method of the present invention is applicable to the various file formats used on computers.

Claims

1. A data compression method, comprising the following steps:
analyzing the initial character sequence of the data to select a transform algorithm suited to the initial character sequence, the transform algorithm being an algorithm that, through a certain transformation, lengthens the run lengths of a character sequence; applying the selected transform algorithm to the initial character sequence to obtain a new character sequence with long run lengths; adding to the new character sequence a character that records the transform algorithm; and run-length encoding the new character sequence to obtain the run sequence of the new character sequence, thereby obtaining compressed data.
2. The method according to claim 1, wherein the initial character sequence is a binary number, and the transform algorithm comprises one of the following algorithms or a combination of several of them:
inverting the characters of the initial character sequence at regular digit positions;
swapping adjacent n-bit groups of characters of the initial character sequence as whole units, n being an integer greater than or equal to 2; and
performing agreed substitutions for fixed character combinations in the initial character sequence.
3. The method according to claim 2, wherein inverting the characters of the initial character sequence at regular digit positions comprises inverting the bits at [...].
4. The method according to claim 1 or 2, wherein the transform algorithm is a combination of several different algorithms that successively apply several different transformations to the same character sequence, or a combination of several different algorithms applied to the character sequences of different fields of the initial character sequence.
5. The method according to claim 1 or 2, wherein analyzing the initial character sequence of the data comprises exhaustively applying multiple transform algorithms to the initial character sequence and comparing the compression ratios obtainable with each transform algorithm, so as to obtain the transform algorithm that gives the best data compression ratio.
6. The method according to claim 1 or 2, wherein analyzing the initial character sequence of the data comprises searching the initial character sequence for the specific character patterns targeted by each particular transform algorithm and comparing the search results for the respective transform algorithms, thereby determining the applicable transform algorithm.
7. A data decompression method, comprising: for compressed data obtained by the method according to any one of claims 1-6, obtaining the new character sequence from the run sequence of the compressed data through the inverse operation of run-length encoding, retrieving the recorded transform algorithm, and applying the inverse operation of the transform algorithm to the new character sequence, thereby obtaining the initial character sequence of the data.
8. A computer program, the program comprising instructions adapted to cause a data processing apparatus to perform the data compression method according to any one of claims 1-6 and/or the data decompression method according to claim 7.
9. A storage medium containing a computer program, wherein the program causes the following: when initial data is stored to the storage medium, the data compression method according to any one of claims 1-6 is performed on the initial data to obtain compressed data of the initial data; when the initial data is copied from the storage medium to the outside, the data decompression method according to claim 7 is performed on the compressed data, so that the compressed data is decompressed into the initial data.
10. An electronic product comprising the storage medium according to claim 9.
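
The claimed steps can be sketched roughly as follows in Python, assuming the initial character sequence is a string of '0' and '1' characters. The function names, the two sample transforms, the numeric transform ID, and the use of the run count as a proxy for compression ratio are all illustrative assumptions, not details taken from the claims.

```python
from itertools import groupby


def identity(bits: str) -> str:
    """No transform; kept so that data with long runs already is left alone."""
    return bits


def invert_every(bits: str, step: int = 2) -> str:
    """Invert the bit at every `step`-th position (a 'regular position' transform)."""
    flip = {"0": "1", "1": "0"}
    return "".join(flip[b] if (i + 1) % step == 0 else b for i, b in enumerate(bits))


def swap_adjacent(bits: str, n: int = 2) -> str:
    """Swap each pair of adjacent n-bit groups; a trailing partial pair is unchanged."""
    out = []
    for i in range(0, len(bits), 2 * n):
        chunk = bits[i:i + 2 * n]
        out.append(chunk[n:] + chunk[:n] if len(chunk) == 2 * n else chunk)
    return "".join(out)


# Assumption: every transform listed here is its own inverse, so the same
# function is used for both compression and decompression.
TRANSFORMS = [identity, invert_every, swap_adjacent]


def rle_encode(bits: str) -> list:
    """Run-length encode a bit string into (bit, run_length) pairs."""
    return [(b, len(list(g))) for b, g in groupby(bits)]


def rle_decode(runs: list) -> str:
    """Inverse of rle_encode."""
    return "".join(b * n for b, n in runs)


def compress(bits: str) -> tuple:
    """Try every transform, keep the one giving the fewest runs, record its ID."""
    best = min(range(len(TRANSFORMS)),
               key=lambda k: len(rle_encode(TRANSFORMS[k](bits))))
    return best, rle_encode(TRANSFORMS[best](bits))


def decompress(packed: tuple) -> str:
    """Undo the run-length encoding, then undo the recorded transform."""
    transform_id, runs = packed
    return TRANSFORMS[transform_id](rle_decode(runs))


packed = compress("01" * 8)
original = decompress(packed)
```

For the input `"01" * 8`, inverting every second bit yields sixteen zeros, so the sketch selects transform 1 and the run sequence collapses to a single run; `decompress` then reverses both steps and returns the original bit string.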
PCT/CN2011/079417 2011-09-07 2011-09-07 Data compressing and decompressing method, program, storage medium, and electronic product WO2013033895A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2011/079417 WO2013033895A1 (en) 2011-09-07 2011-09-07 Data compressing and decompressing method, program, storage medium, and electronic product

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2011/079417 WO2013033895A1 (en) 2011-09-07 2011-09-07 Data compressing and decompressing method, program, storage medium, and electronic product

Publications (1)

Publication Number Publication Date
WO2013033895A1 (en) 2013-03-14

Family

ID=47831439

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2011/079417 WO2013033895A1 (en) 2011-09-07 2011-09-07 Data compressing and decompressing method, program, storage medium, and electronic product

Country Status (1)

Country Link
WO (1) WO2013033895A1 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101039374A (en) * 2006-03-14 2007-09-19 联想(北京)有限公司 Image lossless compression and image decompressing method
CN101198056A (en) * 2006-12-05 2008-06-11 华为技术有限公司 Variable length encoding method and device
US20100117875A1 (en) * 2008-11-10 2010-05-13 Apple Inc. System and method for compressing a stream of integer-valued data

Similar Documents

Publication Publication Date Title
US20210400278A1 (en) Codebook generation for cloud-based video applications
US20200153942A1 (en) Method and system for transmitting a data file over a data network
US8275897B2 (en) System and methods for accelerated data storage and retrieval
CN104504307B (en) Audio frequency and video copy detection method and device based on copy cell
US11575947B2 (en) Residual entropy compression for cloud-based video applications
CN112188198B (en) Image data compression and decompression method and system
US10366698B2 (en) Variable length coding of indices and bit scheduling in a pyramid vector quantizer
CN109983535B (en) Transform-based audio codec and method with sub-band energy smoothing
CN113271467B (en) Ultra-high-definition video layered coding and decoding method supporting efficient editing
WO2013033895A1 (en) Data compressing and decompressing method, program, storage medium, and electronic product
KR101632689B1 (en) The method for recovery of multimedia piece file
EP3461009A1 (en) High density archival
Mohamed Wireless communication systems: Compression and decompression algorithms
Grzes Voice Long Distance Transmission Using Audio Codec for Low-Performance Microcontrollers and LoRa Communication for Use in IoT
JP2002135128A (en) Data-compression method, data compression/expansion method, data-compression device, and data compression/ expansion device
Compression et al. Data Compression
KR100975063B1 (en) Apparatus for decoding variable length coded bitstream and method thereof, and recording medium having recorded thereon a program for implementing the same
JP4008457B2 (en) Data compression system and data compression program

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 11872052

Country of ref document: EP

Kind code of ref document: A1

DPE1 Request for preliminary examination filed after expiration of 19th month from priority date (PCT application filed from 20040101)
NENP Non-entry into the national phase

Ref country code: DE

122 EP: PCT application non-entry in European phase

Ref document number: 11872052

Country of ref document: EP

Kind code of ref document: A1
