Researchers Submit Patent Application, "Flexible Quantization", for Approval

September 9, 2014



By a News Reporter-Staff News Editor at Information Technology Newsweekly -- From Washington, D.C., VerticalNews journalists report that a patent application by the inventors Tu, Chengjie (Sammamish, WA); Srinivasan, Sridhar (Redmond, WA), filed on April 28, 2014, was made available online on August 28, 2014.

The patent's assignee is Microsoft Corporation.

News editors obtained the following quote from the background information supplied by the inventors: "Transform coding is a compression technique used in many audio, image and video compression systems. Uncompressed digital image and video is typically represented or captured as samples of picture elements or colors at locations in an image or video frame arranged in a two-dimensional (2D) grid. This is referred to as a spatial-domain representation of the image or video. For example, a typical format for images consists of a stream of 24-bit color picture element samples arranged as a grid. Each sample is a number representing color components at a pixel location in the grid within a color space, such as RGB or YIQ, among others. Different image and video systems may use different color, spatial and time resolutions of sampling. Similarly, digital audio is typically represented as a time-sampled audio signal stream. For example, a typical audio format consists of a stream of 16-bit amplitude samples of an audio signal taken at regular time intervals.

"Uncompressed digital audio, image and video signals can consume considerable storage and transmission capacity. Transform coding reduces the size of digital audio, images and video by transforming the spatial-domain representation of the signal into a frequency-domain (or other like transform domain) representation, and then reducing resolution of certain generally less perceptible frequency components of the transform-domain representation. This generally produces much less perceptible degradation of the digital signal compared to reducing color or spatial resolution of images or video in the spatial domain, or of audio in the time domain.

"More specifically, a typical block transform-based codec 100 shown in FIG. 1 divides the uncompressed digital image's pixels into fixed-size two-dimensional blocks (X_1, ..., X_n), each block possibly overlapping with other blocks. A linear transform 120-121 that does spatial-frequency analysis is applied to each block, which converts the spaced samples within the block to a set of frequency (or transform) coefficients generally representing the strength of the digital signal in corresponding frequency bands over the block interval. For compression, the transform coefficients may be selectively quantized 130 (i.e., reduced in resolution, such as by dropping least significant bits of the coefficient values or otherwise mapping values in a higher resolution number set to a lower resolution), and also entropy or variable-length coded 130 into a compressed data stream. At decoding, the transform coefficients are inversely transformed 170-171 to nearly reconstruct the original color/spatial sampled image/video signal (reconstructed blocks X̂_1, ..., X̂_n).

"The block transform 120-121 can be defined as a mathematical operation on a vector x of size N. Most often, the operation is a linear multiplication, producing the transform domain output y=Mx, M being the transform matrix. When the input data is arbitrarily long, it is segmented into N sized vectors and a block transform is applied to each segment. For the purpose of data compression, reversible block transforms are chosen. In other words, the matrix M is invertible. In multiple dimensions (e.g., for image and video), block transforms are typically implemented as separable operations. The matrix multiplication is applied separably along each dimension of the data (i.e., both rows and columns).
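As an illustration that is not part of the application, the separable block transform y = Mx described above can be sketched in a few lines of Python. The 2x2 matrix M below (a small Hadamard-style matrix), the helper names and the sample block are all assumptions chosen to keep the arithmetic easy to follow; real codecs use larger transforms such as the DCT.

```python
# Minimal sketch of a separable, invertible block transform (illustrative only).

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def transpose(a):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(row) for row in zip(*a)]

# Assumed transform matrix M and its inverse (M is invertible, as the text requires).
M = [[1, 1], [1, -1]]
M_INV = [[0.5, 0.5], [0.5, -0.5]]

def forward_2d(block):
    """Apply M along columns, then along rows: Y = M X M^T."""
    return matmul(matmul(M, block), transpose(M))

def inverse_2d(coeffs):
    """Undo the forward transform: X = M^-1 Y (M^-1)^T."""
    return matmul(matmul(M_INV, coeffs), transpose(M_INV))

block = [[10, 12], [11, 13]]      # hypothetical 2x2 block of samples
coeffs = forward_2d(block)        # transform-domain coefficients
restored = inverse_2d(coeffs)     # reconstruction without quantization
```

With no quantization between the two steps, `restored` matches `block`, which is the reversibility property the text relies on.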

"For compression, the transform coefficients (components of vector y) may be selectively quantized (i.e., reduced in resolution, such as by dropping least significant bits of the coefficient values or otherwise mapping values in a higher resolution number set to a lower resolution), and also entropy or variable-length coded into a compressed data stream.

"At decoding, the inverses of these operations (dequantization/entropy decoding 160 and inverse block transform 170-171) are applied on the decoder 150 side, as shown in FIG. 1. While reconstructing the data, the inverse matrix M^-1 (inverse transform 170-171) is applied as a multiplier to the transform domain data. When applied to the transform domain data, the inverse transform nearly reconstructs the original time-domain or spatial-domain digital media.

"In many block transform-based coding applications, the transform is desirably reversible to support both lossy and lossless compression depending on the quantization factor. With no quantization (generally represented as a quantization factor of 1) for example, a codec utilizing a reversible transform can exactly reproduce the input data at decoding. However, the requirement of reversibility in these applications constrains the choice of transforms upon which the codec can be designed.

"Many image and video compression systems, such as MPEG and Windows Media, among others, utilize transforms based on the Discrete Cosine Transform (DCT). The DCT is known to have favorable energy compaction properties that result in near-optimal data compression. In these compression systems, the inverse DCT (IDCT) is employed in the reconstruction loops in both the encoder and the decoder of the compression system for reconstructing individual image blocks.

"According to one possible definition, quantization is a term used for an approximating non-reversible mapping function commonly used for lossy compression, in which there is a specified set of possible output values, and each member of the set of possible output values has an associated set of input values that result in the selection of that particular output value. A variety of quantization techniques have been developed, including scalar or vector, uniform or non-uniform, with or without dead zone, and adaptive or non-adaptive quantization.

"The quantization operation is essentially a biased division by a quantization parameter QP which is performed at the encoder. The inverse quantization operation is a multiplication by QP performed at the decoder. These processes together introduce a loss in the original transform coefficient data, which shows up as compression errors or artifacts in the decoded image. In a simplistic codec, a certain fixed value of QP can be applied to all transform coefficients in a frame. While this may be an acceptable solution in some cases, it has several deficiencies:

"The human visual system is not equally sensitive to all frequencies, or to all spatial locations within a frame, or to all luminance and chrominance channels. Using different QP values for different coefficients may provide a visually superior encoding even with the same or smaller number of compressed bits. Likewise, other error metrics can be suitably optimized as well.

"Rate control or the ability of an encoder to produce a compressed file of a desired size is not easy to perform with a single QP across the entire frame.

"It is therefore desirable to allow the encoder to vary QP across the image in an arbitrary manner. However, this means that the actual value of QP used for each data partition should be signaled in the bitstream. This leads to an enormous overhead just to carry the QP signaling information, making it unsuitable in practice. What is desired is a flexible yet bit-economic means of signaling QP, particularly for commonly encountered scenarios.
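As an illustration that is not part of the application, the division-and-multiplication view of quantization described above can be sketched as follows. The function names, the rounding bias of 0.5, the sample coefficients and the QP values are all assumptions made for the example; the application does not specify them in this excerpt.

```python
# Minimal sketch of scalar quantization as a biased division by QP (illustrative only).

def quantize(coeff, qp, bias=0.5):
    """Encoder side: divide the magnitude by qp, add a bias, truncate, restore the sign.

    A bias of 0.5 rounds to the nearest level; a smaller bias widens the dead zone
    around zero. Both choices are assumptions for this sketch.
    """
    sign = -1 if coeff < 0 else 1
    return sign * int(abs(coeff) / qp + bias)

def dequantize(level, qp):
    """Decoder side: multiply the quantized level by qp."""
    return level * qp

coeffs = [46, -4, -2, 0]   # hypothetical transform coefficients

# A quantization factor of 1 leaves integer coefficients unchanged (lossless case).
lossless = [dequantize(quantize(c, 1), 1) for c in coeffs]

# One fixed QP for every coefficient (the "simplistic codec" case).
uniform = [dequantize(quantize(c, 8), 8) for c in coeffs]

# A different QP per coefficient, e.g. finer for the first (lowest-frequency) one.
per_coeff_qp = [2, 8, 8, 16]
varied = [dequantize(quantize(c, q), q) for c, q in zip(coeffs, per_coeff_qp)]
```

In this example the fixed QP of 8 changes the first coefficient from 46 to 48, while the per-coefficient QP list preserves it exactly. The cost is that the decoder must somehow learn which QP applies to which coefficient, which is the signaling overhead the passage describes.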

"In summary, quantization is the primary mechanism for most image and video codecs to control compressed image quality and compression ratio. Quantization methods supported by most popular codecs provide few features or little flexibility, or incur significant overhead of additional bits. Often, an image or a video frame is quantized uniformly, or with limited ability to vary quantization over spatial locations. This lack of flexibility hurts compression quality, and prevents accurate rate control on the fly. On the other hand, some codecs provide nearly unrestricted freedom in supporting quantization methods. Encoding to signal use of different quantizers takes additional bits in the encoded media, and could itself adversely affect compression efficiency. Further, the process of building a conformant decoder requires a large number of test passes generated by all possible combinations of the quantizer methods, which can be onerous."

As a supplement to the background information on this patent application, VerticalNews correspondents also obtained the inventors' summary information for this patent application: "The following Detailed Description presents variations of a flexible quantization technique that provides the ability to vary quantization along various dimensions of the encoded digital media data. For example, one representative implementation of the flexible quantization technique can vary quantization over three dimensions--over (i) spatial locations, (ii) frequency sub bands, and (iii) color channels. The Detailed Description further presents ways to efficiently signal the flexible quantization in the encoded digital media data. The benefit of this quantization approach is that the overhead incurred by quantization related side information is minimized for the primary usage scenarios, while allowing maximum flexibility if desired by the encoder.
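As an illustration that is not part of the application, the idea of varying QP over spatial location, frequency sub band and color channel while keeping side information small can be sketched with a default value plus a few overrides. Everything below is hypothetical: the names `FRAME_QP`, `OVERRIDES` and `select_qp`, the band and channel labels, the lookup order and the tile count are assumptions, and this is not the signaling syntax defined in FIGS. 4-14 of the application.

```python
# Hypothetical sketch: choose a QP per (tile, band, channel) from a default plus overrides.

FRAME_QP = 8  # assumed frame-level default, used when no override applies

# Sparse overrides; None in a key position means "any value" for that dimension.
OVERRIDES = {
    (None, "DC", None): 4,        # finer quantization for every DC sub band
    (None, "HP", "chroma"): 16,   # coarser quantization for high-pass chroma
    (3, None, None): 12,          # one tile with its own QP for all bands and channels
}

def select_qp(tile, band, channel):
    """Return the QP for one data partition, trying specific keys before general ones."""
    candidates = (
        (tile, band, channel),
        (tile, None, None),
        (None, band, channel),
        (None, band, None),
    )
    for key in candidates:
        if key in OVERRIDES:
            return OVERRIDES[key]
    return FRAME_QP

# Values that would need to be carried as side information in this sketch,
# compared with one QP per partition for 16 tiles, 3 bands and 2 channels.
signaled_values = 1 + len(OVERRIDES)
exhaustive_values = 16 * 3 * 2
```

In this sketch the common case (a single frame-level QP) costs one value, and each additional degree of variation costs one more entry, rather than one value for every tile, band and channel combination.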

"This Summary is provided to introduce a selection of concepts in a simplified form that is further described below in the Detailed Description. This summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used as an aid in determining the scope of the claimed subject matter. Additional features and advantages of the invention will be made apparent from the following detailed description of embodiments that proceeds with reference to the accompanying drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

"FIG. 1 is a block diagram of a conventional block transform-based codec in the prior art.

"FIG. 2 is a flow diagram of a representative encoder incorporating the block pattern coding.

"FIG. 3 is a flow diagram of a representative decoder incorporating the block pattern coding.

"FIG. 4 is a table containing a pseudo-code definition for signaling of a DC quantizer according to a flexible quantization technique.

"FIG. 5 is a table containing a pseudo-code definition for signaling of a low-pass quantizer according to the flexible quantization technique.

"FIG. 6 is a table containing a pseudo-code definition for signaling of a high-pass quantizer according to the flexible quantization technique.

"FIG. 7 is a table containing a pseudo-code definition for signaling of quantizers at a frame layer according to the flexible quantization technique.

"FIG. 8 is a table containing a pseudo-code definition for signaling of quantizers at a tile layer in spatial mode according to the flexible quantization technique.

"FIG. 9 is a table containing a pseudo-code definition for signaling of quantizers of a DC sub-band at the tile layer in frequency mode according to the flexible quantization technique.

"FIG. 10 is a table containing a pseudo-code definition for signaling of quantizers of a low-pass sub-band at the tile layer in frequency mode according to the flexible quantization technique.

"FIG. 11 is a table containing a pseudo-code definition for signaling of quantizers of a high-pass sub-band at the tile layer in frequency mode according to the flexible quantization technique.

"FIG. 12 is a table containing a pseudo-code definition for signaling of quantizers at a macroblock layer in spatial mode according to the flexible quantization technique.

"FIG. 13 is a table containing a pseudo-code definition for signaling of low-pass quantizers at the macroblock layer in frequency mode according to the flexible quantization technique.

"FIG. 14 is a table containing a pseudo-code definition for signaling of high-pass quantizers at the macroblock layer in frequency mode according to the flexible quantization technique.

"FIG. 15 is a block diagram of a suitable computing environment for implementing a media encoder/decoder with flexible quantization."

For additional information on this patent application, see: Tu, Chengjie; Srinivasan, Sridhar. Flexible Quantization. Filed April 28, 2014 and posted August 28, 2014. Patent URL: http://appft.uspto.gov/netacgi/nph-Parser?Sect1=PTO2&Sect2=HITOFF&u=%2Fnetahtml%2FPTO%2Fsearch-adv.html&r=4057&p=82&f=G&l=50&d=PG01&S1=20140821.PD.&OS=PD/20140821&RS=PD/20140821

Keywords for this news article include: Information Technology, Information and Data Compression, Microsoft Corporation.

Our reports deliver fact-based news of research and discoveries from around the world. Copyright 2014, NewsRx LLC


Source: Information Technology Newsweekly

