Compression Algorithms

Compression algorithms reduce the number of bits needed to store or transmit data. Examples include Lempel-Ziv-Welch compression and run-length encoding.

Lossless vs. lossy compression: Understanding the differences between these two types of compression algorithms and their implications for data quality and file size.
Huffman coding: One of the most commonly used lossless compression algorithms. It assigns each symbol in the input data a variable-length, prefix-free code, with shorter codes for more frequent symbols.
Lempel-Ziv-Welch (LZW) compression: Another popular lossless compression algorithm. It builds a dictionary adaptively while reading the input and replaces variable-length sequences of symbols with dictionary codes.
Arithmetic coding: A more advanced lossless compression algorithm that encodes an entire message as a single fraction in the interval [0, 1), narrowing the interval at each step according to the probabilities of the symbols in the input data.
Run-length encoding (RLE): A simple lossless compression algorithm that counts consecutive occurrences of identical symbols and replaces each run with the symbol and its count.
Burrows-Wheeler transform (BWT): A reversible transformation of the input data that tends to place symbols occurring in similar contexts next to each other, which makes the result easier to compress using other algorithms.
Predictive coding: A compression technique that uses a statistical model to predict the next symbol in the input data and encodes the difference between the prediction and the actual symbol; it is lossless when that difference is stored exactly.
Transform coding: A technique, used mainly in lossy compression, that applies a mathematical transform (such as a Fourier-related or wavelet transform) to concentrate the signal's energy in a few coefficients; the remaining coefficients are then quantized coarsely or discarded.
Entropy coding: A method of encoding symbols in the input data using their probabilities of occurrence, typically used as a final step in many compression algorithms to further reduce the file size.
Compressed sensing: A signal-processing technique that exploits the sparsity of many real-world signals to reconstruct them from far fewer measurements than conventional sampling would require.
Dictionary-based compression: A type of lossless compression that replaces repeated phrases or patterns with references to a dictionary, which may be pre-defined or built adaptively from the data itself (as in the Lempel-Ziv family).
Delta encoding: A lossless compression technique that stores the difference between adjacent symbols in the input data rather than the symbols themselves, which reduces the data size when the differences are small.
Variable-length coding: A general technique used in many compression algorithms to assign shorter codes to more frequently occurring symbols or patterns in the input data.
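As a minimal sketch of variable-length coding, the Python below builds a Huffman code from symbol counts and round-trips a string through it. The function names (huffman_codes, encode, decode) are hypothetical, chosen for illustration, and the output is a string of "0"/"1" characters rather than packed bits, which a real encoder would produce.

```python
import heapq
from collections import Counter


def huffman_codes(text):
    """Return a dict mapping each symbol in text to a prefix-free bit string."""
    freq = Counter(text)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct symbol still needs a one-bit code.
        return {next(iter(freq)): "0"}
    # Heap entries are (weight, tie_breaker, {symbol: code_so_far}).
    # The unique tie_breaker keeps Python from ever comparing the dicts.
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        # Merge the two lightest subtrees; prefix their codes with 0 and 1.
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


def encode(text, codes):
    """Concatenate the code of every symbol in text."""
    return "".join(codes[s] for s in text)


def decode(bits, codes):
    """Invert encode(); works because no code is a prefix of another."""
    inverse = {c: s for s, c in codes.items()}
    out, buf = [], ""
    for b in bits:
        buf += b
        if buf in inverse:
            out.append(inverse[buf])
            buf = ""
    return "".join(out)


if __name__ == "__main__":
    message = "abracadabra"
    codes = huffman_codes(message)
    bits = encode(message, codes)
    print(codes)
    print(len(bits), "bits instead of", 8 * len(message))
    assert decode(bits, codes) == message
```

The most frequent symbol ("a" in the sample message) ends up with the shortest code, which is where the saving comes from. The exact bit patterns depend on how ties between equal weights are broken, so only the code lengths should be relied on.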
Huffman coding: A lossless data compression algorithm that assigns shorter codes to more frequently occurring symbols in a message.
Lempel-Ziv-Welch (LZW) compression: A lossless data compression algorithm that builds a dictionary of the phrases it encounters in the input and replaces repeated occurrences with dictionary codes.
Run-length encoding (RLE): A lossless data compression algorithm that replaces sequences of repeated data with a single data value and a count.
Burrows-Wheeler Transform (BWT): A data transform that reorders characters in a string to group similar characters together, allowing for more efficient compression.
Arithmetic coding: A lossless data compression algorithm that encodes entire messages into a single number.
Dictionary-based compression: A form of compression that replaces frequently occurring words or phrases with shorter codes that refer to a dictionary, which may be pre-built or constructed from the data during compression.
Delta encoding: A form of compression that stores the difference between successive versions of data to reduce file size.
Transform-based compression: A form of compression that applies a mathematical transformation to the original data to reduce redundancy.
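Entropy encoding: A family of lossless coding methods, such as Huffman and arithmetic coding, that give more probable symbols shorter codes; the entropy of the data (a measure of its unpredictability) sets a lower bound on the average code length.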
Lossy compression: A compression method that achieves high compression ratios by discarding some information from the original data, often used for multimedia like images, videos, and audio.
Lossless compression: A compression method that retains all information from the original data, often used for text and other data that needs to be preserved exactly.
Predictive coding: A compression technique that reduces redundancy by predicting each value from past values and encoding only the prediction error; it is lossless when the error is stored exactly.
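The simplest instance of predictive coding uses the previous value as the prediction, which is the same thing as delta encoding adjacent values. The Python sketch below assumes integer samples and an initial prediction of 0; the names delta_encode and delta_decode are hypothetical.

```python
def delta_encode(values):
    """Replace each value with its difference from the previous value."""
    residuals = []
    prediction = 0  # assumed initial prediction before any value is seen
    for v in values:
        residuals.append(v - prediction)
        prediction = v  # the next prediction is simply the current value
    return residuals


def delta_decode(residuals):
    """Rebuild the original values by adding each residual to the prediction."""
    values = []
    prediction = 0
    for r in residuals:
        v = prediction + r
        values.append(v)
        prediction = v
    return values


if __name__ == "__main__":
    readings = [1000, 1002, 1003, 1003, 1007]
    residuals = delta_encode(readings)
    print(residuals)  # [1000, 2, 1, 0, 4]
    assert delta_decode(residuals) == readings
```

This step does not shrink the data by itself. It turns slowly changing values into small residuals, which an entropy coder can then represent with short codes.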
"In information theory, data compression, source coding, or bit-rate reduction is the process of encoding information using fewer bits than the original representation."
"Any particular compression is either lossy or lossless."
"Lossless compression reduces bits by identifying and eliminating statistical redundancy. No information is lost in lossless compression. Lossy compression reduces bits by removing unnecessary or less important information."
"Typically, a device that performs data compression is referred to as an encoder."
"Compression is useful because it reduces the resources required to store and transmit data."
"The process of reducing the size of a data file is often referred to as data compression."
"Source coding should not be confused with channel coding, for error detection and correction or line coding, the means for mapping data onto a signal."
"Data compression is subject to a space-time complexity trade-off."
"[A device] that performs the reversal of the process (decompression) [is referred to] as a decoder."
"The design of data compression schemes involves trade-offs among various factors..."
"Computational resources are consumed in the compression and decompression processes."
"For instance, a compression scheme for video may require expensive hardware for the video to be decompressed fast enough to be viewed as it is being decompressed..."
"In the context of data transmission, it is called source coding: encoding is done at the source of the data before it is stored or transmitted."
"The option to decompress the video in full before watching it may be inconvenient or require additional storage."
"The degree of compression, the amount of distortion introduced (when using lossy data compression), and the computational resources required to compress and decompress the data."
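The trade-off between degree of compression and computational resources can be observed with Python's standard zlib module, whose compress function accepts a level from 0 to 9. The sketch below is only an illustration: compare_levels is a hypothetical helper, the sample data is made up, and the timings will vary by machine and input.

```python
import time
import zlib


def compare_levels(data, levels=(1, 6, 9)):
    """Compress data at each zlib level; return (level, size, seconds) tuples."""
    results = []
    for level in levels:
        start = time.perf_counter()
        packed = zlib.compress(data, level)
        elapsed = time.perf_counter() - start
        # Lossless: decompressing must reproduce the input exactly.
        assert zlib.decompress(packed) == data
        results.append((level, len(packed), elapsed))
    return results


if __name__ == "__main__":
    # Hypothetical, highly repetitive sample input.
    sample = b"the quick brown fox jumps over the lazy dog\n" * 20000
    print("original size:", len(sample), "bytes")
    for level, size, seconds in compare_levels(sample):
        print(f"level {level}: {size} bytes in {seconds:.4f} s")
```

Higher levels generally spend more time searching for matches in exchange for smaller output, although the size of the effect depends heavily on the data.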