Data compression algorithms shrink digital files to cut storage and transmission costs. There are two main types: lossless techniques reconstruct the original data exactly, while lossy techniques allow small changes in exchange for much higher compression ratios. Common techniques include run-length encoding, Huffman coding, and Lempel-Ziv-Welch (LZW) compression. JPEG is a widely used industry standard that applies lossy compression well suited to images, while MPEG does the same for video.
Lecture 10 - Data Compression
Data Compression
By Fareed Ahmed Jokhio
Data Compression
• Data transmission and storage cost money.
• The more information being dealt with, the more it costs.
• In spite of this, most digital data are not stored in the most compact form.
• Rather, they are stored in whatever way makes them easiest to use, such as ASCII text from word processors, binary code that can be executed on a computer, or individual samples from a data acquisition system.

• Typically, these easy-to-use encoding methods require data files about twice as large as actually needed to represent the information.
• Data compression is the general term for the various algorithms and programs developed to address this problem.
• A compression program converts data from an easy-to-use format to one optimized for compactness.
• Likewise, a decompression program returns the information to its original form.

• We examine five techniques for data compression in this lecture.
• The first three are simple encoding techniques: run-length, Huffman, and delta encoding.
• The last two are elaborate procedures that have established themselves as industry standards: LZW and JPEG.

Data Compression Strategies
• The table below shows two different ways that data compression algorithms can be categorized.
• In (a), the methods have been classified as either lossless or lossy.

• A lossless technique means that the restored data file is identical to the original.
• This is absolutely necessary for many types of data, for example executable code, word processing files, and tabulated numbers.
• You cannot afford to misplace even a single bit of this type of information.

• In comparison, data files that represent images and other acquired signals do not have to be kept in perfect condition for storage or transmission.
• All real-world measurements inherently contain a certain amount of noise.
• If the changes made to these signals resemble a small amount of additional noise, no harm is done.
• Compression techniques that allow this type of degradation are called lossy.

• This distinction is important because lossy techniques are much more effective at compression than lossless methods.
• The higher the compression ratio, the more noise is added to the data.

• Images transmitted over the World Wide Web are an excellent example of why data compression is important.
• Suppose we need to download a digitized color photograph over a computer's 33.6 kbps modem.
• If the image is not compressed (a TIFF file, for example), it will contain about 600 kbytes of data.

• If it has been compressed using a lossless technique (such as the one used in the GIF format), it will be about one-half this size, or 300 kbytes.
• If lossy compression has been used (a JPEG file), it will be about 50 kbytes.
• The point is, at 33.6 kbps (about 4.2 kbytes per second) the download times for these three equivalent files are 142 seconds, 71 seconds, and 12 seconds, respectively.

• That's a big difference! JPEG is the best choice for digitized photographs, while GIF is used with drawn images, such as company logos that have large areas of a single color.

• Our second way of classifying data compression methods is shown in the table below.

• Most data compression programs operate by taking a group of data from the original file, compressing it in some way, and then writing the compressed group to the output file.
• For instance, one of the techniques in this table is CS&Q, short for coarser sampling and/or quantization.

• Suppose we are compressing a digitized waveform, such as an audio signal that has been digitized to 12 bits.
• We might read two adjacent samples from the original file (24 bits), discard one of the samples completely, discard the least significant 4 bits from the other sample, and then write the remaining 8 bits to the output file.
• With 24 bits in and 8 bits out, we have implemented a 3:1 compression ratio using a lossy algorithm.

• While this is rather crude in itself, it is very effective when used with a technique called transform compression.
• As we will discuss later, this is the basis of JPEG.

• The table below shows CS&Q to be a fixed-input, fixed-output scheme.

• That is, a fixed number of bits are read from the input file and a smaller fixed number of bits are written to the output file.
• Other compression methods allow a variable number of bits to be read or written.

• As you go through the description of each of these compression methods, refer back to this table to understand how it fits into this classification scheme.
• Why are JPEG and MPEG not listed in this table?
• These are composite algorithms that combine many of the other techniques.
• They are too sophisticated to be classified into these simple categories.
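
The CS&Q example described earlier (two 12-bit samples in, 8 bits out) can be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the lecture does not say which sample of each pair is discarded, so the sketch keeps the first; the function names csq_compress and csq_expand are hypothetical; and the reconstruction step (shifting back and repeating each value) is one simple choice, not something the lecture specifies.

```python
def csq_compress(samples):
    """Lossy 3:1 packing: each pair of 12-bit samples becomes one byte."""
    if len(samples) % 2:
        raise ValueError("expected an even number of samples")
    if any(not 0 <= s < 4096 for s in samples):
        raise ValueError("samples must fit in 12 bits")
    out = bytearray()
    for i in range(0, len(samples), 2):
        kept = samples[i]      # assumption: keep the first sample, drop the second
        out.append(kept >> 4)  # discard the 4 least significant bits
    return bytes(out)


def csq_expand(data):
    """Approximate reconstruction (an assumption, not from the lecture):
    restore the 12-bit scale and repeat each value for the dropped sample."""
    restored = []
    for byte in data:
        value = byte << 4
        restored.extend([value, value])
    return restored


samples = [0x123, 0x456, 0xABC, 0xDEF]
packed = csq_compress(samples)
bits_in = 12 * len(samples)   # 24 bits per pair of samples
bits_out = 8 * len(packed)    # 8 bits written per pair
print(f"{bits_in} bits in, {bits_out} bits out")
print(csq_expand(packed))
```

The expanded values differ from the originals, which is exactly what makes the scheme lossy: the dropped sample and the discarded low-order bits cannot be recovered.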