CN101617539A - Computational complexity and precision control in transform-based digital media codecs - Google Patents

Info

Publication number
CN101617539A
Authority
CN
China
Prior art keywords
digital media
precision
transform
arithmetic
decoder
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN200880005630A
Other languages
Chinese (zh)
Other versions
CN101617539B (en)
Inventor
S. Srinivasan
C. Tu
S. Regunathan
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Microsoft Technology Licensing LLC
Original Assignee
Microsoft Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Microsoft Corp
Publication of CN101617539A
Application granted
Publication of CN101617539B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/60 Network structure or processes for video distribution between server and client or between remote clients; Control signalling between clients, server and network components; Transmission of management data between server and client, e.g. sending from server to client commands for recording incoming content stream; Communication details between server and client
    • H04N21/65 Transmission of management data between client and server
    • H04N21/654 Transmission by server directed to the client
    • H04N21/6547 Transmission by server directed to the client comprising parameters, e.g. for client setup
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/70 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/12 Selection from among a plurality of transforms or standards, e.g. selection between discrete cosine transform [DCT] and sub-band transform or selection between H.263 and H.264
    • H04N19/122 Selection of transform size, e.g. 8x8 or 2x4x8 DCT; Selection of sub-band transforms of varying structure or type
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/156 Availability of hardware or computational resources, e.g. encoding based on power-saving criteria
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/42 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/60 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/46 Embedding additional information in the video signal during the compression process

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computing Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Discrete Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

A digital media encoder/decoder includes signaling of various modes relating to computational complexity and precision at decoding. The encoder may send a syntax element indicating the arithmetic precision (e.g., 16-bit or 32-bit operations) of the transform operations performed at decoding. The encoder may also signal whether scaling is applied at the output of the decoder, which permits a wider dynamic range of intermediate data at decoding, but adds computational complexity due to the scaling operations.

Description

Computational Complexity and Precision Control in Transform-Based Digital Media Codecs

Background

Block Transform-Based Coding

Transform coding is a compression technique used in many digital media (e.g., audio, image and video) compression systems. Uncompressed digital image and video is typically represented or captured as samples of picture elements or colors at locations in an image or video frame arranged in a two-dimensional (2D) grid. This is referred to as a spatial-domain representation of the image or video. For example, a typical format for images consists of a stream of 24-bit color picture element samples arranged as a grid. Each sample is a number representing color components at a pixel location in the grid within a color space, such as RGB or YIQ. Various image and video systems may use various different color, spatial and time resolutions of sampling. Similarly, digital audio is typically represented as a time-sampled audio signal stream. For example, a typical audio format consists of a stream of 16-bit amplitude samples of an audio signal taken at regular time intervals.

Uncompressed digital audio, image and video signals can consume considerable storage and transmission capacity. Transform coding reduces the size of digital audio, images and video by transforming the spatial-domain representation of the signal into a frequency-domain (or other like transform-domain) representation, and then reducing the resolution of certain generally less perceptible frequency components of the transform-domain representation. This generally produces much less perceptible degradation of the digital signal than reducing the color or spatial resolution of images or video in the spatial domain, or of audio in the time domain.

More specifically, a typical block transform-based encoder/decoder system 100 (also called a "codec"), shown in FIG. 1, divides the pixels of the uncompressed digital image into fixed-size two-dimensional blocks (X1, ... Xn), each block possibly overlapping with other blocks. A linear transform 120-121 that performs spatial-frequency analysis is applied to each block, which converts the spaced samples within the block to a set of frequency (or transform) coefficients generally representing the strength of the digital signal in corresponding frequency bands over the block interval. For compression, the transform coefficients may be selectively quantized 130 (i.e., reduced in resolution, such as by dropping the least significant bits of the coefficient values or otherwise mapping values in a higher-resolution number set to a lower resolution), and also entropy or variable-length coded 130 into a compressed data stream. At decoding, the transform coefficients are inverse transformed 170-171 to nearly reconstruct the original color/spatial sampled image/video signal (reconstructed blocks X̂1, ... X̂n).

The block transform 120-121 can be defined as a mathematical operation on a vector x of size N. Most often, the operation is a linear multiplication, producing the transform-domain output y = Mx, where M is the transform matrix. When the input data is arbitrarily long, it is segmented into vectors of size N and a block transform is applied to each segment. For the purpose of data compression, reversible block transforms are chosen. In other words, the matrix M is invertible. In multiple dimensions (e.g., for image and video), block transforms are typically implemented as separable operations. The matrix multiplication is applied separably along each dimension of the data (i.e., both rows and columns).
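As a minimal sketch of the separable operation described above, the following Python applies an invertible matrix M along the columns and then the rows of a block, and then inverts the result. The 2×2 matrix and the helper names (`matmul`, `forward_2d`, `inverse_2d`) are illustrative assumptions, not the transforms used by the codec in this document.

```python
# Illustrative only: a tiny invertible transform matrix (2x2 Hadamard-like).
# M is an assumption of this sketch, not the codec's transform.
M = [[1, 1],
     [1, -1]]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]


def forward_2d(block):
    """Separable 2D transform: Y = M * X * M^T (columns, then rows)."""
    return matmul(matmul(M, block), transpose(M))


def inverse_2d(coeffs):
    """Inverse of forward_2d. For this M, M * M = 2I, so M^-1 = M / 2
    and the 2D inverse divides by 4 overall (exact for valid inputs)."""
    y = matmul(matmul(M, coeffs), transpose(M))
    return [[v // 4 for v in row] for row in y]


block = [[10, 12],
         [14, 20]]
coeffs = forward_2d(block)
restored = inverse_2d(coeffs)
```

With no quantization between the two calls, `restored` equals `block`, which is the invertibility property the paragraph relies on.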

For compression, the transform coefficients (components of the vector y) may be selectively quantized (i.e., reduced in resolution, such as by dropping the least significant bits of the coefficient values or otherwise mapping values in a higher-resolution number set to a lower resolution), and also entropy or variable-length coded into a compressed data stream.

At decoding in the decoder 150, the inverse of these operations (dequantization/entropy decoding 160 and inverse block transform 170-171) is applied on the decoder 150 side, as shown in FIG. 1. While reconstructing the data, the inverse matrix M⁻¹ (inverse transform 170-171) is applied as a multiplier to the transform-domain data. When applied to the transform-domain data, the inverse transform nearly reconstructs the original time-domain or spatial-domain digital media.

In many block transform-based coding applications, the transform is desirably reversible so as to support both lossy and lossless compression, depending on the quantization factor. With no quantization (generally represented as a quantization factor of 1), for example, a codec utilizing a reversible transform can exactly reproduce the input data at decoding. However, the requirement of reversibility in these applications constrains the choice of transforms upon which the codec can be designed.

Many image and video compression systems, such as MPEG and Windows Media, utilize transforms based on the discrete cosine transform (DCT). The DCT is known to have favorable energy compaction properties that result in near-optimal data compression. In these compression systems, the inverse DCT (IDCT) is employed in the reconstruction loops in both the encoder and the decoder of the compression system to reconstruct individual image blocks.

Quantization

Quantization is the primary mechanism by which most image and video codecs control compressed image quality and compression ratio. According to one possible definition, quantization is a term used for an approximating, non-reversible mapping function commonly used for lossy compression, in which there is a specified set of possible output values, and each member of the set of possible output values has an associated set of input values that result in the selection of that particular output value. A variety of quantization techniques have been developed, including scalar or vector, uniform or non-uniform, with or without dead zone, and adaptive or non-adaptive quantization.

The quantization operation is essentially a biased division by a quantization parameter QP, performed at the encoder. The inverse quantization (or multiplication) operation is a multiplication by QP, performed at the decoder. Together, these processes introduce a loss in the original transform coefficient data, which shows up as compression errors or artifacts in the decoded image.
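The biased division and the matching multiplication can be sketched as follows. This is an illustration under stated assumptions: the function names, the sign handling, and the choice of bias (half of QP) are hypothetical and are not specified by the text.

```python
def quantize(coeff, qp, bias):
    """Encoder side: biased division by QP, applied to the magnitude."""
    sign = -1 if coeff < 0 else 1
    return sign * ((abs(coeff) + bias) // qp)


def dequantize(level, qp):
    """Decoder side: multiplication by QP."""
    return level * qp


qp = 8
bias = qp // 2  # assumed rounding bias; a real codec may choose differently
for c in (37, -37, 3, 100):
    level = quantize(c, qp, bias)
    # The difference between c and the reconstructed value is the
    # quantization loss described in the text.
    print(c, "->", level, "->", dequantize(level, qp))
```

With `qp = 1` and `bias = 0` the round trip is exact, which corresponds to the lossless case mentioned earlier (quantization factor 1).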

Summary

The following Detailed Description presents tools and techniques to control the computational complexity and precision of decoding with a digital media codec. In one aspect of the techniques, the encoder signals which of a scaled or an unscaled precision mode is to be used at the decoder. In the scaled precision mode, the input image is pre-multiplied (e.g., by 8) at the encoder, and the output at the decoder is correspondingly scaled by a rounded division. In the unscaled precision mode, no such scaling operations are applied. In the unscaled precision mode, the encoder and decoder can operate with a smaller dynamic range for the transform coefficients, and thus have lower computational complexity.
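The scaled precision mode can be illustrated with a short sketch. The factor of 8 is the example given in the text; the function names and the rounding offset used in the division are assumptions of this sketch.

```python
SCALE_SHIFT = 3  # multiply/divide by 8, the example factor given in the text


def encoder_prescale(sample):
    """Scaled mode, encoder side: pre-multiply the input sample by 8."""
    return sample << SCALE_SHIFT


def decoder_output_scale(value):
    """Scaled mode, decoder side: rounded division by 8.
    The +4 rounding offset is an assumption of this sketch."""
    return (value + (1 << (SCALE_SHIFT - 1))) >> SCALE_SHIFT


def bits_needed(max_magnitude):
    """Bits needed for an unsigned magnitude (excluding any sign bit)."""
    return max_magnitude.bit_length()


# An 8-bit sample grows by 3 bits once pre-scaled, which is the wider
# dynamic range (and extra cost) that the unscaled mode avoids.
unscaled_bits = bits_needed(255)
scaled_bits = bits_needed(encoder_prescale(255))
```

In this sketch, small errors introduced in the scaled domain (here, offsets from -4 to +3) are absorbed by the rounded division at the decoder output.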

In another aspect of the techniques, the codec may also signal to the decoder the precision required for performing the transform operations. In one implementation, an element of the bitstream syntax signals whether to employ lower-precision arithmetic operations for the transform at the decoder.

This Summary is provided to introduce a selection of concepts in a simplified form that is further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used as an aid in determining the scope of the claimed subject matter. Additional features and advantages of the invention will be made apparent from the following detailed description of embodiments that proceeds with reference to the accompanying drawings.

Brief Description of the Drawings

FIG. 1 is a block diagram of a conventional block transform-based codec in the prior art.

FIG. 2 is a flow diagram of a representative encoder incorporating the block pattern coding.

FIG. 3 is a flow diagram of a representative decoder incorporating the block pattern coding.

FIG. 4 is a diagram of the inverse lapped transform, including a core transform and a post-filter (overlap) operation, in one implementation of the representative encoder/decoder of FIGS. 2 and 3.

FIG. 5 is a diagram identifying the input data points for the transform operations.

FIG. 6 is a block diagram of a suitable computing environment for implementing the media encoder/decoder of FIGS. 2 and 3.

Detailed Description

The following description relates to techniques to control the precision and computational complexity of a transform-based digital media codec. It describes an example implementation of the technique in the context of a digital media compression system or codec. The digital media system codes digital media data in a compressed form for transmission or storage, and decodes the data for playback or other processing. For purposes of illustration, this exemplary compression system incorporating the computational complexity and precision control is an image or video compression system. Alternatively, the technique can also be incorporated into compression systems or codecs for other digital media data. The computational complexity and precision control technique does not require that the digital media compression system encode the compressed digital media data in a particular coding format.

1.1. Encoder/Decoder

FIGS. 2 and 3 are generalized diagrams of the processes employed in a representative 2-dimensional (2D) data encoder 200 and decoder 300. The diagrams present a generalized or simplified illustration of a compression system incorporating a 2D data encoder and decoder that implement compression using the computational complexity and precision control techniques. In alternative compression systems using the control techniques, additional or fewer processes than those illustrated in this representative encoder and decoder can be used for the 2D data compression. For example, some encoders/decoders may also include color conversion, color formats, scalable coding, lossless coding, macroblock modes, etc. The compression system (encoder and decoder) can provide lossless and/or lossy compression of the 2D data, depending on the quantization, which may be based on a quantization parameter varying from lossless to lossy.

The 2D data encoder 200 produces a compressed bitstream 220 that is a more compact representation (for typical input) of the 2D data 210 presented as input to the encoder. For example, the 2D data input can be an image, a frame of a video sequence, or other data having two dimensions. The 2D data encoder divides a frame of the input data into blocks (illustrated generally in FIG. 2 as partitioning 230), which in the illustrated implementation are non-overlapping 4×4 pixel blocks that form a regular pattern across the plane of the frame. These blocks are grouped in clusters, called macroblocks, which are 16×16 pixels in size in this representative encoder. In turn, the macroblocks are grouped into regular structures called tiles. The tiles also form a regular pattern over the image, such that tiles in a horizontal row are of uniform height and aligned, and tiles in a vertical column are of uniform width and aligned. In the representative encoder, the tiles can be any arbitrary size that is a multiple of 16 in the horizontal and/or vertical direction. Alternative encoder implementations can divide the image into blocks, macroblocks, tiles, or other units of other sizes and structures.
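The frame → macroblock → block partitioning described above can be sketched in a few lines. The function names are hypothetical, and the use of ceiling division (i.e., padding partial macroblocks at the frame edge) is an assumption; the text does not say how frame sizes that are not multiples of 16 are handled.

```python
BLOCK = 4        # 4x4 pixel blocks
MACROBLOCK = 16  # 16x16 pixel macroblocks


def macroblock_grid(width, height):
    """Number of macroblock columns and rows covering the frame.
    Ceiling division (padding partial macroblocks) is an assumption."""
    cols = -(-width // MACROBLOCK)
    rows = -(-height // MACROBLOCK)
    return cols, rows


def blocks_in_macroblock(mb_x, mb_y):
    """Top-left pixel coordinates of the 16 4x4 blocks in one macroblock."""
    x0, y0 = mb_x * MACROBLOCK, mb_y * MACROBLOCK
    return [(x0 + bx, y0 + by)
            for by in range(0, MACROBLOCK, BLOCK)
            for bx in range(0, MACROBLOCK, BLOCK)]


def is_valid_tile_size(tile_w, tile_h):
    """Tile dimensions are multiples of 16 in each direction."""
    return tile_w > 0 and tile_h > 0 and tile_w % 16 == 0 and tile_h % 16 == 0
```

For example, a 640×480 frame is covered by a 40×30 grid of macroblocks, each of which holds 16 blocks.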

A "forward overlap" operator 240 is applied to each edge between blocks, after which each 4×4 block is transformed using a block transform 250. This block transform 250 can be the reversible, scale-free 2D transform described by Srinivasan in U.S. Patent Application No. 11/015,707, entitled "Reversible Transform For Lossy And Lossless 2-D Data Compression," filed December 17, 2004. The overlap operator 240 can be the reversible overlap operator described by Tu et al. in U.S. Patent Application No. 11/015,148, entitled "Reversible Overlap Operator for Efficient Lossless Data Compression," filed December 17, 2004; and by Tu et al. in U.S. Patent Application No. 11/035,991, entitled "Reversible 2-Dimensional Pre-/Post-Filter for Lapped Biorthogonal Transform," filed January 14, 2005. Alternatively, the discrete cosine transform or other block transforms and overlap operators can be used. Subsequent to the transform, the DC coefficient 260 of each 4×4 transform block is subjected to a similar processing chain (tiling, forward overlap, followed by a 4×4 block transform). The resulting DC transform coefficients and AC transform coefficients are quantized 270, entropy coded 280 and packetized 290.

The decoder performs the reverse process. On the decoder side, the transform coefficient bits are extracted 310 from their respective packets, from which the coefficients are themselves decoded 320 and dequantized 330. The DC coefficients 340 are regenerated by applying an inverse transform, and the plane of DC coefficients is "inverse overlapped" using a suitable smoothing operator applied across the DC block edges. Subsequently, the entire data is regenerated by applying the 4×4 inverse transform 350 to the DC coefficients and to the AC coefficients 342 decoded from the bitstream. Finally, the block edges in the resulting image planes are inverse overlap filtered 360. This produces a reconstructed 2D data output.

In an exemplary implementation, the encoder 200 (FIG. 2) compresses an input image into the compressed bitstream 220 (e.g., a file), and the decoder 300 (FIG. 3) reconstructs the original input or an approximation thereof, depending on whether lossless or lossy coding is employed. The encoding process involves the application of a forward lapped transform (LT) discussed below, which is implemented with reversible 2-dimensional pre-/post-filtering also described more fully below. The decoding process involves the application of the inverse lapped transform (ILT) using the reversible 2-dimensional pre-/post-filtering.

The illustrated LT and ILT are inverses of each other in an exact sense, and therefore can be collectively referred to as a reversible lapped transform. As a reversible transform, the LT/ILT pair can be used for lossless image compression.

The input data 210 compressed by the illustrated encoder 200/decoder 300 can be images of various color formats (e.g., RGB/YUV 4:4:4, YUV 4:2:2 or YUV 4:2:0 color image formats). Typically, the input image always has a luminance (Y) component. If it is an RGB/YUV 4:4:4, YUV 4:2:2 or YUV 4:2:0 image, the image also has chrominance components, such as a U component and a V component. These separate color planes or components of the image can have different spatial resolutions. In the case of an input image in the YUV 4:2:0 color format, for example, the U and V components have half the width and height of the Y component.

As discussed above, the encoder 200 tiles the input image or picture into macroblocks. In an exemplary implementation, the encoder 200 tiles the input image into 16×16 pixel areas (called "macroblocks") in the Y channel (which may be 16×16, 16×8 or 8×8 areas in the U and V channels, depending on the color format). Each macroblock color plane is tiled into 4×4 pixel regions or blocks. Therefore, for this exemplary encoder implementation, a macroblock is composed for the various color formats in the following manner:

1. For a grayscale image, each macroblock contains 16 4×4 luminance (Y) blocks.

2. For a YUV 4:2:0 format color image, each macroblock contains 16 4×4 Y blocks, and four 4×4 blocks for each of the chrominance (U and V) channels.

3. For a YUV 4:2:2 format color image, each macroblock contains 16 4×4 Y blocks, and eight 4×4 blocks for each of the chrominance (U and V) channels.

4. For an RGB or YUV 4:4:4 color image, each macroblock contains 16 blocks for each of the Y, U and V channels.
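The block counts in the list above follow from the chroma subsampling of each format, as the sketch below shows. The format labels and function name are hypothetical; the subsampling factors are the conventional ones for these formats.

```python
# (horizontal, vertical) chroma subsampling factors for each color format.
# The format keys are hypothetical labels chosen for this sketch.
CHROMA_SUBSAMPLING = {
    "YUV420": (2, 2),
    "YUV422": (2, 1),
    "YUV444": (1, 1),  # RGB is treated the same way
}


def blocks_per_macroblock(color_format):
    """Count of 4x4 blocks per channel in one 16x16 macroblock."""
    counts = {"Y": (16 // 4) * (16 // 4)}
    if color_format == "GRAY":
        return counts
    h, v = CHROMA_SUBSAMPLING[color_format]
    # Chroma area of the macroblock, measured in 4x4 blocks.
    chroma = (16 // h // 4) * (16 // v // 4)
    counts["U"] = chroma
    counts["V"] = chroma
    return counts
```

Calling `blocks_per_macroblock("YUV420")` gives 16 luminance blocks and 4 blocks for each chrominance channel, matching item 2 of the list.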

Accordingly, after the transform, a macroblock in this representative encoder 200/decoder 300 has three frequency sub-bands: a DC sub-band (DC macroblock), a lowpass sub-band (lowpass macroblock), and a highpass sub-band (highpass macroblock). In the representative system, the lowpass and/or highpass sub-bands are optional in the bitstream; these sub-bands may be dropped entirely.

Further, the compressed data can be packed into the bitstream in one of two orderings: spatial order and frequency order. For the spatial order, the different sub-bands of the same macroblock within a tile are ordered together, and the resulting bitstream of each tile is written into one packet. For the frequency order, the same sub-band from different macroblocks within a tile is grouped together, and thus the bitstream of a tile is written into three packets: a DC tile packet, a lowpass tile packet, and a highpass tile packet. In addition, there may be other data layers.
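The two orderings can be sketched as follows. This is only an illustration of the grouping; the dictionary keys, the function name, and the representation of a packet as a list of payloads are assumptions, and the sketch ignores entropy coding and any other data layers.

```python
def pack_tile(macroblocks, order):
    """Arrange one tile's sub-band data into packets.

    `macroblocks` is a list of dicts with "dc", "lp" and "hp" payloads.
    Returns a list of packets, each a list of payloads.
    """
    if order == "spatial":
        # One packet per tile; sub-bands of each macroblock stay together.
        packet = []
        for mb in macroblocks:
            packet.extend([mb["dc"], mb["lp"], mb["hp"]])
        return [packet]
    if order == "frequency":
        # Three packets per tile: DC, lowpass, highpass.
        return [[mb[band] for mb in macroblocks] for band in ("dc", "lp", "hp")]
    raise ValueError("order must be 'spatial' or 'frequency'")


tile = [{"dc": "DC0", "lp": "LP0", "hp": "HP0"},
        {"dc": "DC1", "lp": "LP1", "hp": "HP1"}]
spatial_packets = pack_tile(tile, "spatial")
frequency_packets = pack_tile(tile, "frequency")
```

In the frequency ordering, a decoder that only needs a low-resolution preview could, in principle, read the DC packet alone, which is consistent with the lowpass and highpass sub-bands being optional.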

Thus, for the representative system, an image is organized in the following "dimensions":

Spatial dimension: Frame → Tile → Macroblock;

Frequency dimension: DC | Lowpass | Highpass; and

Channel dimension: Luminance | Chrominance_0 | Chrominance_1 ... (e.g., as Y | U | V).

The arrows above denote a hierarchy, whereas the vertical bars denote a partitioning.

Although the representative system organizes the compressed digital media data in spatial, frequency and channel dimensions, the flexible quantization approach described here can be applied in alternative encoder/decoder systems that organize their data along fewer, additional or other dimensions. For example, the flexible quantization approach can be applied to coding using a larger number of frequency bands, other formats of color channels (e.g., YIQ, RGB, etc.), or additional image channels (e.g., for stereo vision or other multiple-camera arrays).

2. Inverse Core and Lapped Transform

Overview

In one implementation of the encoder 200/decoder 300, the inverse transform on the decoder side takes the form of a two-stage lapped transform. The steps are as follows:

· An inverse core transform (ICT) is applied to each 4×4 block corresponding to the reconstructed DC and lowpass coefficients, arranged in a planar array known as the DC plane.

· A post-filter operation is optionally applied to 4×4 areas evenly straddling blocks in the DC plane. Further, a post filter is applied to the boundary 2×4 and 4×2 areas, and the four corner areas are left unchanged.

· The resulting array contains the DC coefficients of the 4×4 blocks corresponding to the first-stage transform. The DC coefficients are (figuratively) copied into a larger array, and the reconstructed highpass coefficients are populated into the remaining positions.

· An ICT is applied to each 4×4 block.

· A post-filter operation is optionally applied to 4×4 areas evenly straddling blocks in the DC plane. Further, a post filter is applied to the boundary 2×4 and 4×2 areas, and the four corner areas are left unchanged.

This process is shown in FIG. 4.

The application of the post filters is governed by the OVERLAP_INFO syntax element in the compressed bitstream 220. OVERLAP_INFO may take on three values:

· If OVERLAP_INFO = 0, no post filtering is performed.

· If OVERLAP_INFO = 1, only the outer post filtering is performed.

· If OVERLAP_INFO = 2, both the inner and the outer post filtering are performed.
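The three OVERLAP_INFO cases above amount to a small decision table, sketched below. The function name and the tuple return value are hypothetical; only the three values and their meanings come from the text.

```python
def post_filter_stages(overlap_info):
    """Map OVERLAP_INFO to (inner_enabled, outer_enabled)."""
    if overlap_info == 0:
        return (False, False)  # no post filtering
    if overlap_info == 1:
        return (False, True)   # outer post filtering only
    if overlap_info == 2:
        return (True, True)    # inner and outer post filtering
    raise ValueError("OVERLAP_INFO must be 0, 1 or 2")
```

A decoder loop could consult these two flags before each optional post-filter step of the two-stage inverse transform.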

Inverse Core Transform

The core transform (CT) is inspired by the conventionally known 4×4 discrete cosine transform (DCT), yet it is fundamentally different. The first key difference is that the DCT is linear whereas the CT is nonlinear. The second key difference is that, because it is defined on real numbers, the DCT is not a lossless operation in the integer-to-integer space. The CT is defined on integers and is lossless in this space. The third key difference is that the 2D DCT is a separable operation. The CT is non-separable by design.

The entire inverse transform process can be written as a cascade of three elementary 2×2 transform operations, which are:

· 2×2 Hadamard transform: T_h

· Inverse 1D rotate: InvT_odd

· Inverse 2D rotate: InvT_odd_odd

These transforms are implemented as non-separable operations and are described first, followed by a description of the entire ICT.

2D 2×2 Hadamard Transform T_h

The encoder/decoder implements the 2D 2×2 Hadamard transform T_h as shown in the following pseudo-code table. R is a rounding factor that can take only the value 0 or 1. T_h is involutory (i.e., applying T_h twice to a data vector [a b c d] recovers the original values of [a b c d], provided R is unchanged between the two applications). The inverse of T_h is T_h itself.

Figure G2008800056300D00081
Figure G2008800056300D00081

Figure G2008800056300D00091
Figure G2008800056300D00091
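
The pseudocode table itself is only available as an image here, so the following is not a transcription of it. It is a minimal sketch of one lifting-based 2×2 Hadamard-like structure that has the properties the text states (integer-to-integer, a rounding factor R of 0 or 1, and involutory for a fixed R). The function name `t_h` and the exact order of the lifting steps are assumptions.

```python
def t_h(a: int, b: int, c: int, d: int, r: int = 0):
    """Assumed lifting form of a 2x2 Hadamard-like transform on [a b c d].

    r is the rounding factor (0 or 1). Python's >> on negative integers is an
    arithmetic shift, i.e. floor division by 2.
    """
    a += d
    b -= c
    t1 = (a - b + r) >> 1
    t2 = c
    c = t1 - d
    d = t1 - t2
    a -= d
    b += c
    return a, b, c, d


if __name__ == "__main__":
    original = (1, 2, 3, 4)
    once = t_h(*original, 0)
    twice = t_h(*once, 0)
    # Applying the transform twice with the same r restores the input.
    print(original, once, twice)
```

Because every step is an integer addition, subtraction or shift, no information is lost, which is what makes the second application undo the first.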

Inverse 1D rotation InvT_odd

The lossless inverse of T_odd is defined by the pseudocode in the table below.

Figure G2008800056300D00092

Inverse 2D rotation InvT_odd_odd

The inverse 2D rotation InvT_odd_odd is defined by the pseudocode in the table below.

Figure G2008800056300D00093

Figure G2008800056300D00101

ICT operation

The correspondence between the 2×2 data and the pseudocode listed above is shown in FIG. 5. Here, color coding with four gray levels is introduced to indicate the four data points, to facilitate the transform description in the next section.

The 2D 4×4-point ICT is built from T_h, inverse T_odd and inverse T_odd_odd. Note that the inverse of T_h is T_h itself. The ICT consists of two stages, which are shown in the following pseudocode. Each stage consists of four 2×2 transforms that can be performed in any order, or concurrently, within that stage.

If the input data block is

    a b c d
    e f g h
    i j k l
    m n o p

then 4×4_IPCT_1stStage() and 4×4_IPCT_2ndStage() are defined as follows:

Figure G2008800056300D00103

Figure G2008800056300D00111

The function 2×2_ICT is the same as T_h.

Post-filtering overview

Four operators define the post-filters used in the inverse lapped transform. They are:

· 4×4 post-filter

· 4-point post-filter

· 2×2 post-filter

· 2-point post-filter

The post-filters use T_h, InvT_odd_odd, invScale and invRotate. invRotate and invScale are defined in the following tables, respectively.

Figure G2008800056300D00112

Figure G2008800056300D00113

4×4 post-filter

Primarily, the 4×4 post-filter is applied to all block junctions (regions evenly straddling 4 blocks) in all color planes when OVERLAP_INFO is 1 or 2. In addition, the 4×4 filter is applied to all block junctions in the DC plane of all planes when OVERLAP_INFO is 2, but only in the DC plane of the luma plane when OVERLAP_INFO is 2 and the color format is YUV 4:2:0 or YUV 4:2:2.

If the input data is

    a b c d
    e f g h
    i j k l
    m n o p

then the 4×4 post-filter 4×4PostFilter(a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p) is defined in the following table:

Figure G2008800056300D00122

Figure G2008800056300D00131

4-point post-filter

A linear 4-point filter is applied to the edge-straddling 2×4 and 4×2 regions on the boundary of the image. If the input data is [a b c d], the 4-point post-filter 4PostFilter(a,b,c,d) is defined in the following table.

Figure G2008800056300D00132

2×2 post-filter

The 2×2 post-filter is applied to regions straddling blocks in the DC plane of the chroma channels of YUV 4:2:0 and YUV 4:2:2 data. If the input data is

    a b
    c d

then the 2×2 post-filter 2×2PostFilter(a,b,c,d) is defined in the following table:

Figure G2008800056300D00134

Figure G2008800056300D00141

2-point post-filter

The 2-point post-filter is applied to boundary 2×1 and 1×2 samples straddling blocks. The 2-point post-filter 2PostFilter(a,b) is defined in the following table:

Figure G2008800056300D00142

The precision required to perform the transform operations of the lapped transform described above can be signaled in the header of the compressed data structure. In this example implementation, LONG_WORD_FLAG and NO_SCALED_FLAG are syntax elements transmitted in the compressed bitstream (e.g., in the image header) to signal the precision and computational complexity to be applied by the decoder.

3. Precision and word length

The example encoder/decoder performs integer operations. Furthermore, it supports lossless encoding and decoding. The primary machine precision required by the example encoder/decoder is therefore integer.

However, the integer operations defined in the example encoder/decoder lead to rounding errors in lossy coding. These errors are small by design, but they cause drops in the rate-distortion curve. To improve coding performance by reducing rounding errors, the example encoder/decoder defines a secondary machine precision. In this mode, the input is pre-multiplied by 8 (i.e., left-shifted by 3 bits) and the final output is divided by 8 with rounding (i.e., right-shifted by 3 bits). These operations are carried out at the front end of the encoder and at the back end of the decoder, and are largely invisible to the rest of the process. Further, the quantization levels are scaled accordingly, so that a stream created with the primary machine precision and decoded with the secondary machine precision (or vice versa) produces an acceptable image.

The secondary machine precision cannot be used when lossless compression is required. The machine precision used in creating a compressed file is explicitly marked in the header.

The secondary machine precision is equivalent to using scaled arithmetic in the codec, and this mode is therefore referred to as scaled. The primary machine precision is referred to as unscaled.

The example encoder/decoder is designed to provide good encoding and decoding speed. A design goal is that, for an 8-bit input, the data values at both the encoder and the decoder do not exceed 16-bit signed values. (However, intermediate operations within a transform stage may exceed this figure.) This holds for both machine precision modes.

When the secondary machine precision is selected, the range expansion of the intermediate values is 8 bits. Because the primary machine precision avoids the pre-multiplication by 8, its range expansion is 8-3=5 bits.

The first example encoder/decoder uses two different word lengths for intermediate values: 16 and 32 bits.

Second example bitstream syntax and semantics

The second example bitstream syntax and semantics is hierarchical and comprises the following layers: image, tile, macroblock and block.

Image (IMAGE)

IMAGE(){                                      Num bits    Descriptor
    IMAGE_HEADER                              variable    struct
    bAlphaPlane = FALSE
    IMAGE_PLANE_HEADER                        variable    struct
    if(ALPHACHANNEL_FLAG){
        bAlphaPlane = TRUE
        IMAGE_PLANE_HEADER                    variable    struct
    }
    INDEX_TABLE                               variable    struct
    TILE                                      variable    struct
}

Image header (IMAGE_HEADER)

IMAGE_HEADER(){                               Num bits    Descriptor
    GDISIGNATURE                              64          uimsbf
    RESERVED1                                 4           uimsbf
    RESERVED2                                 4           uimsbf
    TILING_FLAG                               1           bool
    FREQUENCYMODE_BITSTREAM_FLAG              1           uimsbf
    IMAGE_ORIENTATION                         3           uimsbf
    INDEXTABLE_PRESENT_FLAG                   1           uimsbf
    OVERLAP_INFO                              2           uimsbf
    SHORT_HEADER_FLAG                         1           bool
    LONG_WORD_FLAG                            1           bool
    WINDOWING_FLAG                            1           bool
    TRIM_FLEXBITS_FLAG                        1           bool
    RESERVED3                                 3           uimsbf
    ALPHACHANNEL_FLAG                         1           bool
    SOURCE_CLR_FMT                            4           uimsbf
    SOURCE_BITDEPTH                           4           uimsbf
    if(SHORT_HEADER_FLAG){
        WIDTH_MINUS1                          16          uimsbf
        HEIGHT_MINUS1                         16          uimsbf
    }
    else{
        WIDTH_MINUS1                          32          uimsbf
        HEIGHT_MINUS1                         32          uimsbf
    }
    if(TILING_FLAG){
        NUM_VERT_TILES_MINUS1                 12          uimsbf
        NUM_HORIZ_TILES_MINUS1                12          uimsbf
    }
    for(n=0;n<NUM_VERT_TILES_MINUS1;n++){
        if(SHORT_HEADER_FLAG)
            WIDTH_IN_MB_OF_TILE_MINUS1[n]     8           uimsbf
        else
            WIDTH_IN_MB_OF_TILE_MINUS1[n]     16          uimsbf
    }
    for(n=0;n<NUM_HORIZ_TILES_MINUS1;n++){
        if(SHORT_HEADER_FLAG)
            HEIGHT_IN_MB_OF_TILE_MINUS1[n]    8           uimsbf
        else
            HEIGHT_IN_MB_OF_TILE_MINUS1[n]    16          uimsbf
    }
    if(WINDOWING_FLAG){
        NUM_TOP_EXTRAPIXELS                   6           uimsbf
        NUM_LEFT_EXTRAPIXELS                  6           uimsbf
        NUM_BOTTOM_EXTRAPIXELS                6           uimsbf
        NUM_RIGHT_EXTRAPIXELS                 6           uimsbf
    }
}
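
As an illustration of how a decoder might read the fixed-width leading fields of IMAGE_HEADER, the sketch below pulls them MSB-first from a byte string ("uimsbf" is read here in its usual sense of unsigned integer, most significant bit first). The names `BitReader` and `parse_image_header_prefix` are hypothetical, the sample bytes in the usage example are made-up placeholder data, and parsing stops after WIDTH_MINUS1 and HEIGHT_MINUS1; tiling and windowing fields are not handled.

```python
class BitReader:
    """Reads unsigned integers MSB-first from a bytes object."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0  # current position, counted in bits

    def read(self, nbits: int) -> int:
        value = 0
        for _ in range(nbits):
            byte = self.data[self.pos >> 3]
            bit = (byte >> (7 - (self.pos & 7))) & 1
            value = (value << 1) | bit
            self.pos += 1
        return value


# (name, width in bits) for the fields that precede the image dimensions.
FIXED_FIELDS = [
    ("GDISIGNATURE", 64), ("RESERVED1", 4), ("RESERVED2", 4),
    ("TILING_FLAG", 1), ("FREQUENCYMODE_BITSTREAM_FLAG", 1),
    ("IMAGE_ORIENTATION", 3), ("INDEXTABLE_PRESENT_FLAG", 1),
    ("OVERLAP_INFO", 2), ("SHORT_HEADER_FLAG", 1), ("LONG_WORD_FLAG", 1),
    ("WINDOWING_FLAG", 1), ("TRIM_FLEXBITS_FLAG", 1), ("RESERVED3", 3),
    ("ALPHACHANNEL_FLAG", 1), ("SOURCE_CLR_FMT", 4), ("SOURCE_BITDEPTH", 4),
]


def parse_image_header_prefix(data: bytes) -> dict:
    """Parse IMAGE_HEADER up to and including the image dimensions."""
    reader = BitReader(data)
    header = {name: reader.read(bits) for name, bits in FIXED_FIELDS}
    # Per the table: 16-bit dimensions with the short header, else 32-bit.
    size_bits = 16 if header["SHORT_HEADER_FLAG"] else 32
    header["WIDTH_MINUS1"] = reader.read(size_bits)
    header["HEIGHT_MINUS1"] = reader.read(size_bits)
    return header


if __name__ == "__main__":
    # Placeholder bytes: 8-byte signature, 4 bytes of flags, 2 x 16-bit sizes.
    sample = b"EXAMPLE\x00" + bytes([0x10, 0x02, 0x80, 0x11, 0x00, 0xFF, 0x00, 0x7F])
    hdr = parse_image_header_prefix(sample)
    print(hdr["OVERLAP_INFO"], hdr["LONG_WORD_FLAG"], hdr["WIDTH_MINUS1"] + 1)
```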

IMAGE_PLANE_HEADER(){                         Num bits    Descriptor
    CLR_FMT                                   3           uimsbf
    NO_SCALED_FLAG                            1           bool
    BANDS_PRESENT                             4           uimsbf
    if(CLR_FMT==YUV444){
        CHROMA_CENTERING                      4           uimsbf
        COLOR_INTERPRETATION                  4           uimsbf
    }
    else if(CLR_FMT==NCHANNEL){
        NUM_CHANNELS_MINUS1                   4           uimsbf
        COLOR_INTERPRETATION                  4           uimsbf
    }
    if(SOURCE_CLR_FMT==BAYER){
        BAYER_PATTERN                         2           uimsbf
        CHROMA_CENTERING_BAYER                2           uimsbf
        COLOR_INTERPRETATION                  4           uimsbf
    }
    if(SOURCE_BITDEPTH ∈ {BD16,BD16S,BD32,BD32S}){
        SHIFT_BITS                            8           uimsbf
    }
    if(SOURCE_BITDEPTH==BD32F){
        LEN_MANTISSA                          8           uimsbf
        EXP_BIAS                              8           uimsbf
    }
    DC_FRAME_UNIFORM                          1           bool
    if(DC_FRAME_UNIFORM){
        DC_QP()                               variable    struct
    }
    if(BANDS_PRESENT!=SB_DC_ONLY){
        USE_DC_QP                             1           bool
        if(USE_DC_QP==FALSE){
            LP_FRAME_UNIFORM                  1           bool
            if(LP_FRAME_UNIFORM){
                NUM_LP_QPS=1
                LP_QP()                       variable    struct
            }
        }
        if(BANDS_PRESENT!=SB_NO_HIGHPASS){
            USE_LP_QP                         1           bool
            if(USE_LP_QP==FALSE){
                HP_FRAME_UNIFORM              1           bool
                if(HP_FRAME_UNIFORM){
                    NUM_HP_QPS=1
                    HP_QP()                   variable    struct
                }
            }
        }
    }
    FLUSH_BYTE                                variable
}

Some selected bitstream elements from the second example bitstream syntax and semantics are defined as follows.

Long word flag (LONG_WORD_FLAG) (1 bit)

LONG_WORD_FLAG is a 1-bit syntax element that specifies whether 16-bit integers may be used for the transform computations. In this second example bitstream syntax, if LONG_WORD_FLAG == 0 (FALSE), 16-bit integers and arrays may be used for the outer stages of the transform computations (intermediate operations within the transforms, such as (3*a+1)>>1, are performed with higher accuracy). If LONG_WORD_FLAG == TRUE, 32-bit integers and arrays shall be used for the transform computations.

Note: 32-bit arithmetic may be used to decode an image regardless of the value of LONG_WORD_FLAG. This syntax element can be used by the decoder to choose the most efficient word length for its implementation.

No scaled arithmetic flag (NO_SCALED_FLAG) (1 bit)

NO_SCALED_FLAG is a 1-bit syntax element that specifies whether the transforms use scaling. If NO_SCALED_FLAG == 1, scaling shall not be performed. If NO_SCALED_FLAG == 0, scaling shall be performed; in that case, scaling is carried out by appropriately rounding down the output of the final stage (color conversion) by 3 bits.

Note: If lossless coding is desired, NO_SCALED_FLAG shall be set to TRUE, even if lossless coding is used only for a subregion of the image. Lossy coding may use either mode.

Note: The rate-distortion performance of lossy coding is superior when scaling is used (i.e., NO_SCALED_FLAG == FALSE), especially at low QP.

4. Signaling and use of the long word flag

One example image format of the representative encoder/decoder supports a wide variety of pixel formats, including high dynamic range and wide color gamut formats. Supported data types include signed integer, unsigned integer, fixed-point float and floating-point float. Supported bit depths include 8, 16, 24 and 32 bits per color channel. The example image format allows lossless compression of images that use up to 24 bits per color channel, and lossy compression of images that use up to 32 bits per color channel.

At the same time, the example image format is designed to provide high image quality and compression efficiency, and to allow low-complexity encoding and decoding implementations.

To support low-complexity implementations, the transforms in the example image format are designed to minimize expansion of the dynamic range. The two-stage transform increases the dynamic range by only 5 bits. Therefore, if the image bit depth is 8 bits per color channel, 16-bit arithmetic may suffice to perform all transform operations at the decoder. For other bit depths, the transform operations may require higher-precision arithmetic.

The computational complexity of decoding a particular bitstream can be reduced if the precision required to perform the transform operations is known at the decoder. This information can be signaled to the decoder using a syntax element (e.g., a 1-bit flag in the image header). The described signaling technique and syntax element can reduce the computational complexity of decoding the bitstream.

In one example implementation, the 1-bit syntax element LONG_WORD_FLAG is used. For example, if LONG_WORD_FLAG == FALSE, 16-bit integers and arrays may be used for the outer stages of the transform computations, and if LONG_WORD_FLAG == TRUE, 32-bit integers and arrays shall be used for the transform computations.

In one implementation of the representative encoder/decoder, in-place transform operations can be performed on 16-bit wide words, but intermediate operations within the transform (such as computing the product 3*a for the "lifting" step given by b += (3*a+1)>>1) are performed with higher accuracy (e.g., 18 bits or more). In this example, however, the intermediate transform values a and b themselves can be stored in 16-bit integers.

32-bit arithmetic may be used to decode an image regardless of the value of the LONG_WORD_FLAG element. The LONG_WORD_FLAG element can be used by the encoder/decoder to choose the most efficient word length for its implementation. For example, an encoder may choose to set the LONG_WORD_FLAG element to FALSE if it can verify that the 16-bit and 32-bit precision transform steps produce the same output values.
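
The encoder-side check mentioned above can be sketched on a single lifting step. This is a simplified illustration, assuming that "16-bit storage" means two's-complement wraparound to a signed 16-bit value; the helper names `wrap16`, `lift` and `choose_long_word_flag` are hypothetical, and a real encoder would run the comparison over its entire transform rather than over one step.

```python
def wrap16(x: int) -> int:
    """Reduce x to the signed 16-bit value that 16-bit storage would hold."""
    return ((x + 0x8000) & 0xFFFF) - 0x8000


def lift(a: int, b: int) -> int:
    """The lifting step b += (3*a + 1) >> 1, computed without overflow."""
    return b + ((3 * a + 1) >> 1)


def choose_long_word_flag(pairs) -> bool:
    """Return True (LONG_WORD_FLAG set) if any result would not fit in 16 bits.

    pairs is an iterable of (a, b) values standing in for transform data.
    """
    for a, b in pairs:
        exact = lift(a, b)            # wide-precision reference result
        if wrap16(exact) != exact:    # 16-bit storage would change the value
            return True
    return False


if __name__ == "__main__":
    print(choose_long_word_flag([(100, 200)]))      # small values
    print(choose_long_word_flag([(20000, 30000)]))  # values near the 16-bit limit
```

The product 3*a is deliberately formed in wide precision and only the stored result is checked, mirroring the statement that intermediate operations use higher accuracy while a and b are kept in 16-bit words.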

5. Signaling and use of NO_SCALED_FLAG

One example image format of the representative encoder/decoder supports a wide variety of pixel formats, including high dynamic range and wide color gamut formats. At the same time, the design of the representative encoder/decoder optimizes image quality and compression efficiency, and allows low-complexity encoding and decoding implementations.

As described above, the representative encoder/decoder uses a two-stage hierarchical block-based transform in which all transform steps are integer operations. The small rounding errors present in these integer operations lead to a loss of compression efficiency during lossy compression. To counter this problem, one implementation of the representative encoder/decoder defines two different precision modes for decoder operations: a scaled mode and an unscaled mode.

In the scaled precision mode, the input image is pre-multiplied by 8 (i.e., left-shifted by 3 bits) at the encoder, and the final output at the decoder is divided by 8 with rounding (i.e., right-shifted by 3 bits). Operating in the scaled precision mode minimizes the rounding errors and yields improved rate-distortion performance.

In the unscaled precision mode, there is no such scaling. An encoder or decoder operating in unscaled precision mode has to handle a smaller dynamic range of transform coefficients, and therefore has lower computational complexity. However, there is a small penalty in compression efficiency for operating in this mode. Lossless coding (with no quantization, i.e., with the quantization parameter QP set to 1) can only use the unscaled precision mode, for guaranteed reversibility.

The precision mode used by the encoder in creating a compressed file is signaled explicitly with NO_SCALED_FLAG in the image header of the compressed bitstream 220 (FIG. 2). It is recommended that the decoder 300 use the same precision mode for its operations.

NO_SCALED_FLAG is a 1-bit syntax element in the image header that specifies the precision mode as follows:

If NO_SCALED_FLAG == TRUE, the unscaled mode shall be used for decoder operations.

If NO_SCALED_FLAG == FALSE, scaling shall be used. In this case, the scaled mode is applied by appropriately rounding down the output of the final stage (color conversion) by 3 bits.
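
The two precision modes differ only at the very front of the encoder and the very back of the decoder, which can be sketched as follows. The function names `encoder_prescale` and `decoder_postscale` are hypothetical, and the rounding offset of 4 (round half up before the 3-bit right shift) is an assumption; the text says only that the output is divided by 8 with rounding.

```python
SCALE_SHIFT = 3  # scaled mode multiplies by 8 and later divides by 8


def encoder_prescale(sample: int, no_scaled_flag: bool) -> int:
    """Front end of the encoder: left-shift by 3 bits only in scaled mode."""
    return sample if no_scaled_flag else sample << SCALE_SHIFT


def decoder_postscale(value: int, no_scaled_flag: bool) -> int:
    """Back end of the decoder: add a rounding offset, then right-shift by 3.

    The offset 1 << (SCALE_SHIFT - 1) == 4 is an assumed rounding rule.
    """
    if no_scaled_flag:
        return value
    return (value + (1 << (SCALE_SHIFT - 1))) >> SCALE_SHIFT


if __name__ == "__main__":
    for flag in (True, False):
        restored = decoder_postscale(encoder_prescale(200, flag), flag)
        print("NO_SCALED_FLAG =", flag, "->", restored)
```

With nothing between the two calls the round trip is exact in either mode; in a real codec the transforms and quantization sit in between, and the three extra fractional bits are what absorb part of their rounding error.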

The rate-distortion performance of lossy coding is superior when the scaled mode is used (i.e., NO_SCALED_FLAG == FALSE), especially at low QP. However, the computational complexity is lower when the unscaled mode is used, for two reasons:

· The smaller dynamic range expansion in the unscaled mode means that shorter words can be used for the transform computations, especially in conjunction with LONG_WORD_FLAG. In a VLSI implementation, the reduced dynamic range expansion means that the gate logic implementing the more significant bits can be powered down.

· The scaled mode requires an add operation and a right shift by 3 bits (implementing a rounded division by 8) on the decoder side. On the encoder side, it requires a left shift by 3 bits. Overall, this is slightly more demanding computationally than the unscaled mode.

Furthermore, the unscaled mode allows more significant bits to be compressed than the scaled mode. For example, with 32-bit arithmetic, the unscaled mode permits lossless compression (and decompression) of up to 27 significant bits per sample. By contrast, the scaled mode allows only 24 bits under the same conditions, because the scaling process introduces three additional bits of dynamic range.
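
The 27-bit and 24-bit figures follow from simple bit budgeting: the two-stage transform expands the dynamic range by 5 bits, and scaling adds 3 more. The sketch below only restates that arithmetic; the function names and the constant names are hypothetical.

```python
TRANSFORM_EXPANSION_BITS = 5  # dynamic range growth of the two-stage transform
SCALING_BITS = 3              # extra growth from the pre-multiplication by 8


def range_expansion(scaled: bool) -> int:
    """Total dynamic range expansion, in bits, for the chosen precision mode."""
    return TRANSFORM_EXPANSION_BITS + (SCALING_BITS if scaled else 0)


def max_sample_bits(word_bits: int, scaled: bool) -> int:
    """Largest input bit depth whose transform data still fits in word_bits."""
    return word_bits - range_expansion(scaled)


if __name__ == "__main__":
    for scaled in (False, True):
        mode = "scaled" if scaled else "unscaled"
        print(mode, max_sample_bits(16, scaled), max_sample_bits(32, scaled))
```

The same budget explains why an 8-bit input stays within 16-bit words in both modes: 8 + 5 = 13 bits unscaled and 8 + 8 = 16 bits scaled.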

For both precision modes, with an 8-bit input, the data values at the decoder do not exceed 16 signed bits. (However, intermediate operations within a transform stage may exceed this figure.)

Note: If lossless coding is required (QP=1), the encoder sets NO_SCALED_FLAG to TRUE, even if only a subregion of the image requires lossless coding.

The encoder may use either mode for lossy compression. It is recommended that the decoder use the precision mode signaled by NO_SCALED_FLAG for its operations. However, the quantization levels are scaled so that a stream created with the scaled precision mode and decoded with the unscaled precision mode (or vice versa) produces an acceptable image in most cases.

6. Scaled arithmetic for increased accuracy

In one implementation of the representative encoder/decoder, the transforms (including the color conversion) are integer transforms implemented through a series of lifting steps. In these lifting steps, truncation errors hurt transform performance. For lossy compression, to minimize the damage from truncation errors and thereby maximize transform performance, the input data to the transform needs to be left-shifted by several bits. However, another highly desirable property is that, if the input image is 8 bits, the output of every transform should stay within 16 bits, so the number of left-shift bits cannot be large. The representative encoder/decoder implements a scaled arithmetic technique to achieve both goals. The scaled arithmetic technique maximizes transform performance by minimizing the damage from truncation errors, while still limiting the output of each transform step to 16 bits when the input image is 8 bits. This makes a simple 16-bit implementation possible.

The transforms used in the representative encoder/decoder are integer transforms implemented by lifting steps. Most lifting steps involve a right shift, which introduces truncation error. A transform generally involves many lifting steps, and the accumulated truncation error visibly hurts transform performance.

One way to reduce the damage from truncation errors is to left-shift the input data before the transform in the encoder, and to right-shift by the same number of bits after the transform (combined with the quantization) at the decoder. As described above, the representative encoder/decoder has a two-stage transform structure: optional first-stage overlap + first-stage CT + optional second-stage overlap + second-stage CT. Experiments show that a left shift by 3 bits is necessary to minimize the truncation errors. So, in the lossy case, the input data may be left-shifted by 3 bits before the color conversion, i.e., multiplied or scaled up by a factor of 8 (e.g., for the scaled mode described above).

However, the color conversion and the transforms expand the data. If the input data is left-shifted by 3 bits, then for 8-bit input data the output of the second-stage 4×4 DCT has a dynamic range of 17 bits (the output of every other transform is still within 16 bits). This is highly undesirable because it prevents a 16-bit implementation, which is a highly desired feature. To get around this, the input data is right-shifted by 1 bit before the second-stage 4×4 CT, so that its output is also within 16 bits. Because the second-stage 4×4 CT applies to only 1/16 of the data (the DC transform coefficients of the first-stage DCT), and that data has already been scaled up by the first-stage transform, the damage from truncation errors here is minimal.

So, in the lossy case for 8-bit images, on the encoder side the input is left-shifted by 3 bits before the color conversion and right-shifted by 1 bit before the second-stage 4×4 CT. On the decoder side, the data is left-shifted by 1 bit before the first-stage 4×4 IDCT and right-shifted by 3 bits after the color conversion.

7.计算环境7. Computing environment

上述用于数字媒体编解码器中的计算复杂度和精度信令的处理技术可以在各种数字媒体编码和/或解码系统的任一种上实现,包括计算机(各种形状因数,包括服务器、台式机、膝上型计算机、手持式计算机等);数字媒体记录器和播放器;图像和视频捕捉设备(诸如照相机、扫描仪等);通信设备(诸如电话、移动电话、会议设备等);显示、打印或其它呈现设备;以及其它示例等等。数字媒体编解码器中的计算复杂度和精度信令技术可用硬件电路、控制数字媒体处理硬件的固件、以及在计算机或在诸如图6中所示的其它计算环境中执行的通信软件来实现。The processing techniques described above for computational complexity and precision signaling in digital media codecs can be implemented on any of a variety of digital media encoding and/or decoding systems, including computers (various form factors, including servers, desktop computers, laptop computers, handheld computers, etc.); digital media recorders and players; image and video capture equipment (such as cameras, scanners, etc.); communication equipment (such as telephones, mobile phones, conferencing equipment, etc.); display, printing, or other presentation device; and other examples, etc. Computational complexity and precision signaling techniques in digital media codecs may be implemented with hardware circuitry, firmware controlling digital media processing hardware, and communications software executing on a computer or in other computing environments such as shown in FIG. 6 .

FIG. 6 illustrates a generalized example of a suitable computing environment (600) in which the described embodiments may be implemented. The computing environment (600) is not intended to suggest any limitation as to the scope of use or functionality of the invention, as the invention may be implemented in diverse general-purpose or special-purpose computing environments.

With reference to FIG. 6, the computing environment (600) includes at least one processing unit (610) and memory (620). In FIG. 6, this most basic configuration (630) is enclosed within a dashed line. The processing unit (610) executes computer-executable instructions and may be a real or a virtual processor. In a multi-processing system, multiple processing units execute computer-executable instructions to increase processing power. The memory (620) may be volatile memory (e.g., registers, cache, RAM), non-volatile memory (e.g., ROM, EEPROM, flash memory, etc.), or some combination of the two. The memory (620) stores software (680) implementing the described digital media encoding/decoding with computational complexity and precision signaling techniques.

A computing environment may have additional features. For example, the computing environment (600) includes storage (640), one or more input devices (650), one or more output devices (660), and one or more communication connections (670). An interconnection mechanism (not shown), such as a bus, controller, or network, interconnects the components of the computing environment (600). Typically, operating system software (not shown) provides an operating environment for other software executing in the computing environment (600) and coordinates the activities of its components.

The storage (640) may be removable or non-removable, and includes magnetic disks, magnetic tapes or cassettes, CD-ROMs, CD-RWs, DVDs, or any other medium that can be used to store information and that can be accessed within the computing environment (600). The storage (640) stores instructions for the software (680) implementing the described digital media encoding/decoding with computational complexity and precision signaling techniques.

The input device(s) (650) may be a touch input device such as a keyboard, mouse, pen, or trackball, a voice input device, a scanning device, or another device that provides input to the computing environment (600). For audio, the input device(s) (650) may be a sound card or similar device that accepts audio input in analog or digital form from a microphone or microphone array, or a CD-ROM reader that provides audio samples to the computing environment. The output device(s) (660) may be a display, printer, CD writer, or another device that provides output from the computing environment (600).

The communication connection(s) (670) enable communication over a communication medium to another computing entity. The communication medium conveys information such as computer-executable instructions, compressed audio or video information, or other data in a modulated data signal. A modulated data signal is a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. By way of example, and not limitation, communication media include wired or wireless techniques implemented with an electrical, optical, RF, infrared, acoustic, or other carrier.

The digital media encoding/decoding with flexible quantization techniques described herein can be described in the general context of computer-readable media. Computer-readable media are any available media that can be accessed within a computing environment. By way of example, and not limitation, with the computing environment (600), computer-readable media include memory (620), storage (640), communication media, and combinations of any of the above.

The digital media encoding/decoding with computational complexity and precision signaling techniques described herein can be described in the general context of computer-executable instructions, such as those included in program modules, being executed in a computing environment on a target real or virtual processor. Generally, program modules include routines, programs, libraries, objects, classes, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The functionality of the program modules may be combined or split between program modules as desired in various embodiments. Computer-executable instructions for program modules may be executed within a local or distributed computing environment.

For the sake of presentation, the detailed description uses terms such as "determine," "generate," "adjust," and "apply" to describe computer operations in a computing environment. These terms are high-level abstractions for operations performed by a computer and should not be confused with acts performed by a human being. The actual computer operations corresponding to these terms vary depending on the implementation.

In view of the many possible embodiments to which the principles of the invention may be applied, all such embodiments that come within the scope and spirit of the appended claims and their equivalents are claimed as the invention.

Claims (18)

1. A digital media decoding method, comprising:
receiving a compressed digital media bitstream at a digital media decoder;
parsing a syntax element from the bitstream, the syntax element signaling the arithmetic precision to be used for transform calculations during processing of the digital media data; and
outputting a reconstructed image.

2. The digital media decoding method of claim 1, wherein the syntax element signals use of one of a high arithmetic precision or a low arithmetic precision.

3. The digital media decoding method of claim 2, wherein the high arithmetic precision is 32-bit number processing and the low arithmetic precision is 16-bit number processing.

4. The digital media decoding method of claim 2, further comprising:
decoding blocks of transform coefficients from the compressed digital media bitstream;
in a case that the syntax element signals use of the high arithmetic precision, applying an inverse transform to the transform coefficients using high arithmetic precision processing; and
in a case that the syntax element signals use of the low arithmetic precision, applying an inverse transform to the transform coefficients using low arithmetic precision processing.

5. The digital media decoding method of claim 4, wherein the high arithmetic precision is 32-bit number processing and the low arithmetic precision is 16-bit number processing.

6. The digital media decoding method of claim 2, further comprising:
decoding blocks of transform coefficients from the compressed digital media bitstream; and
applying an inverse transform to the transform coefficients using high arithmetic precision processing, regardless of the arithmetic precision signaled via the syntax element.

7. A digital media encoding method, comprising:
receiving digital media data at a digital media encoder;
making a decision whether to use lower precision arithmetic for transform calculations during processing of the digital media data;
representing the decision whether to use lower precision arithmetic for transform calculations with a syntax element in an encoded bitstream, wherein the syntax element is operable to communicate the decision to a digital media decoder; and
outputting the encoded bitstream.

8. The digital media encoding method of claim 7, wherein making the decision comprises:
verifying whether the lower precision arithmetic for transform calculations produces the same decoder output as using higher precision arithmetic for transform calculations; and
deciding, based on the verifying, whether to use the lower precision arithmetic.

9. The digital media encoding method of claim 7, wherein the lower precision arithmetic is 16-bit arithmetic precision.

10. The digital media encoding method of claim 7, further comprising:
making a decision whether to apply scaling of the input digital media data prior to transform coding; and
representing the decision whether to apply the scaling with a syntax element in the encoded bitstream.

11. The digital media encoding method of claim 10, wherein making the decision whether to apply scaling comprises deciding not to apply scaling of the input digital media data when the digital media data is encoded losslessly.

12. A digital media decoding method, comprising:
receiving a compressed digital media bitstream at a digital media decoder;
parsing a syntax element from the bitstream, the syntax element signaling a precision mode selection for transform calculations during processing of the digital media data;
in a case that a first precision mode using scaling is signaled, scaling the output of the decoder;
in a case that a second precision mode without scaling is signaled, omitting application of the scaling to the output; and
outputting a reconstructed image.

13. The digital media decoding method of claim 12, wherein scaling the output of the decoder comprises performing a rounding division of the output by a number.

14. The digital media decoding method of claim 12, wherein the rounding division of the output is a rounding division by the number 8.

15. The digital media decoding method of claim 12, further comprising:
parsing a second syntax element from the bitstream, the second syntax element signaling whether to use lower arithmetic precision for transform calculations during processing of the digital media data;
decoding blocks of transform coefficients from the compressed digital media bitstream; and
in a case of the second precision mode without scaling in which use of the lower arithmetic precision is signaled, performing inverse transform processing of the transform coefficients using the lower arithmetic precision.

16. The digital media decoding method of claim 15, wherein the lower arithmetic precision is 16-bit arithmetic precision.

17. The digital media decoding method of claim 12, wherein the digital media data is encoded using a two-stage transform structure having a first-stage transform followed by a second-stage transform of the DC coefficients of the first-stage transform, the digital media decoding method further comprising:
decoding digital media data from the compressed digital media bitstream;
applying an inverse second-stage transform to the digital media data;
applying an inverse first-stage transform to the digital media data; and
performing color conversion of the digital media data;
wherein, in a case that the first precision mode using scaling is signaled, the scaling of the output of the decoder comprises:
left-shifting the digital media data by a single bit before input to the inverse first-stage transform; and
right-shifting the digital media data by 3 bits after the color conversion.

18. The digital media decoding method of claim 12, wherein the compressed digital media bitstream is encoded according to a syntax schema defining a separate primary image plane and alpha image plane of an image, the syntax element signaling the precision mode selection is signaled per image plane, whereby the precision modes of the primary image plane and the alpha image plane are signaled independently, and the decoding method comprises performing the action of parsing the syntax element signaling the precision mode selection for each image plane and, in a case that the first precision mode using scaling is signaled for a respective image plane, scaling the output of the decoder for that image plane.
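The precision-selection logic recited in the claims can be illustrated with a toy example. The sketch below is not the claimed codec: `wrap16`, `toy_inverse_transform`, `encoder_may_signal_low_precision`, and `decoder_word_size` are hypothetical names, and a two-point sum/difference butterfly stands in for the real inverse transform. It shows an encoder checking that 16-bit arithmetic yields the same decoder output as wider arithmetic before signaling low precision, and a decoder that either honors the signal or always uses high precision.

```python
def wrap16(value: int) -> int:
    """Emulate 16-bit two's-complement wraparound."""
    return ((value + 0x8000) & 0xFFFF) - 0x8000


def toy_inverse_transform(coeffs, low_precision: bool):
    """Toy stand-in for an inverse transform: sum/difference of coefficient pairs.

    Assumes an even number of coefficients. With low_precision=True every
    intermediate result is wrapped to 16 bits, as a 16-bit implementation would.
    """
    wrap = wrap16 if low_precision else (lambda v: v)
    out = []
    for i in range(0, len(coeffs) - 1, 2):
        a, b = coeffs[i], coeffs[i + 1]
        out.extend((wrap(a + b), wrap(a - b)))
    return out


def encoder_may_signal_low_precision(coeffs) -> bool:
    """Encoder-side check: low precision is safe only if outputs are identical."""
    return toy_inverse_transform(coeffs, True) == toy_inverse_transform(coeffs, False)


def decoder_word_size(low_precision_signaled: bool, always_use_high: bool = False) -> int:
    """Decoder-side choice of arithmetic width in bits.

    A decoder may ignore the signal and always use the high precision.
    """
    if always_use_high or not low_precision_signaled:
        return 32
    return 16


if __name__ == "__main__":
    print(encoder_may_signal_low_precision([100, -50, 7, 3]))
    print(encoder_may_signal_low_precision([30000, 10000]))
    print(decoder_word_size(True), decoder_word_size(True, always_use_high=True))
```

The second coefficient pair overflows a 16-bit word when summed, so the check fails and the encoder in this sketch would not signal low precision for that data.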
CN2008800056300A 2007-02-21 2008-02-20 Computational complexity and precision control in transform-based digital media coder Active CN101617539B (en)

Applications Claiming Priority (5)

Application Number Priority Date Filing Date Title
US89103107P 2007-02-21 2007-02-21
US60/891,031 2007-02-21
US11/772,076 US8942289B2 (en) 2007-02-21 2007-06-29 Computational complexity and precision control in transform-based digital media codec
US11/772,076 2007-06-29
PCT/US2008/054473 WO2008103766A2 (en) 2007-02-21 2008-02-20 Computational complexity and precision control in transform-based digital media codec

Publications (2)

Publication Number Publication Date
CN101617539A (en) 2009-12-30
CN101617539B (en) 2013-02-13

Family

ID=41556839

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2008800056300A Active CN101617539B (en) 2007-02-21 2008-02-20 Computational complexity and precision control in transform-based digital media coder

Country Status (10)

Country Link
US (1) US8942289B2 (en)
EP (1) EP2123045B1 (en)
JP (2) JP5457199B2 (en)
KR (2) KR101550166B1 (en)
CN (1) CN101617539B (en)
BR (1) BRPI0807465B1 (en)
IL (1) IL199994A (en)
RU (1) RU2518417C2 (en)
TW (1) TWI471013B (en)
WO (1) WO2008103766A2 (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104885468A (en) * 2012-12-28 2015-09-02 佳能株式会社 Provision of precision information in an image encoding device, image encoding method and program, image decoding device, and image decoding method and program
CN104995918A (en) * 2013-01-08 2015-10-21 微软公司 Method for converting the format of a frame into a chroma subsampling format
CN106664448A (en) * 2014-07-11 2017-05-10 Lg 电子株式会社 Method and device for transmitting and receiving broadcast signals
US10368144B2 (en) 2014-07-29 2019-07-30 Lg Electronics Inc. Method and device for transmitting and receiving broadcast signal
WO2019148977A1 (en) * 2018-02-01 2019-08-08 Mediatek Inc. Methods and apparatuses of video encoding or decoding with adaptive quantization of video data
US10582269B2 (en) 2014-07-11 2020-03-03 Lg Electronics Inc. Method and device for transmitting and receiving broadcast signal

Families Citing this family (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6882685B2 (en) * 2001-09-18 2005-04-19 Microsoft Corporation Block transform and quantization for image and video coding
US7949054B2 (en) * 2006-06-01 2011-05-24 Microsoft Corporation Flexible data organization for images
US8942289B2 (en) * 2007-02-21 2015-01-27 Microsoft Corporation Computational complexity and precision control in transform-based digital media codec
US8275209B2 (en) * 2008-10-10 2012-09-25 Microsoft Corporation Reduced DC gain mismatch and DC leakage in overlap transform processing
JP5340091B2 (en) 2008-12-19 2013-11-13 キヤノン株式会社 Image coding apparatus and control method thereof
US8676849B2 (en) * 2009-03-12 2014-03-18 Microsoft Corporation Storing lossless transforms of data
KR20110135786A (en) * 2010-06-11 2011-12-19 삼성전자주식회사 3D video encoding / decoding apparatus and method using depth transition data
EP2993905B1 (en) 2011-02-21 2017-05-24 Dolby Laboratories Licensing Corporation Floating point video coding
US8781238B2 (en) 2011-09-08 2014-07-15 Dolby Laboratories Licensing Corporation Efficient decoding and post-processing of high dynamic range images
US11184623B2 (en) * 2011-09-26 2021-11-23 Texas Instruments Incorporated Method and system for lossless coding mode in video coding
KR20130040132A (en) * 2011-10-13 2013-04-23 한국전자통신연구원 A method of transporting media data independent of media codec through heterogeneous ip network
GB2521349A (en) * 2013-12-05 2015-06-24 Sony Corp Data encoding and decoding
JP6220722B2 (en) * 2014-04-17 2017-10-25 アンリツ株式会社 Radio wave half mirror for millimeter wave band and method for flattening its transmission coefficient
KR20190052128A (en) * 2016-10-04 2019-05-15 김기백 Image data encoding / decoding method and apparatus
JP6324590B2 (en) * 2017-05-25 2018-05-16 キヤノン株式会社 Image encoding device, image encoding method and program, image decoding device, image decoding method and program
JP6915483B2 (en) * 2017-09-27 2021-08-04 富士フイルムビジネスイノベーション株式会社 Image processing equipment, image processing systems and programs
EP3471271A1 (en) * 2017-10-16 2019-04-17 Acoustical Beauty Improved convolutions of digital signals using a bit requirement optimization of a target digital signal
JP2018142969A (en) * 2018-04-11 2018-09-13 キヤノン株式会社 Image encoding device, image encoding method and program, image decoding device, image decoding method and program

Family Cites Families (112)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS63219066A (en) 1987-03-06 1988-09-12 Matsushita Electric Ind Co Ltd Orthogonal transform device
US4922537A (en) 1987-06-02 1990-05-01 Frederiksen & Shu Laboratories, Inc. Method and apparatus employing audio frequency offset extraction and floating-point conversion for digitally encoding and decoding high-fidelity audio signals
US5357594A (en) 1989-01-27 1994-10-18 Dolby Laboratories Licensing Corporation Encoding and decoding using specially designed pairs of analysis and synthesis windows
US5379351A (en) 1992-02-19 1995-01-03 Integrated Information Technology, Inc. Video compression/decompression processing and processors
US5319724A (en) 1990-04-19 1994-06-07 Ricoh Corporation Apparatus and method for compressing still images
JP2945487B2 (en) 1990-12-26 1999-09-06 株式会社日立製作所 Matrix multiplier
JPH04282988A (en) 1991-03-12 1992-10-08 Sony Corp Picture data converter
US5168375A (en) 1991-09-18 1992-12-01 Polaroid Corporation Image reconstruction by use of discrete cosine and related transforms
KR0150955B1 (en) 1992-05-27 1998-10-15 강진구 Compressive and extensive method of an image for bit-fixing and the apparatus thereof
US5394349A (en) 1992-07-10 1995-02-28 Xing Technology Corporation Fast inverse discrete transform using subwords for decompression of information
JPH0645948A (en) 1992-07-27 1994-02-18 Victor Co Of Japan Ltd Orthogonal converter and reverse orthogonal converter
JPH0645949A (en) 1992-07-27 1994-02-18 Victor Co Of Japan Ltd Orthogonal converter and reverse orthogonal converter
JPH0654307A (en) 1992-07-29 1994-02-25 Casio Comput Co Ltd Data compressing device
JPH06105296A (en) 1992-09-18 1994-04-15 Sony Corp Variable length coding and decoding method
JP3348310B2 (en) 1992-09-28 2002-11-20 ソニー株式会社 Moving picture coding method and moving picture coding apparatus
JP3069455B2 (en) 1992-12-22 2000-07-24 富士写真フイルム株式会社 Quantization and dequantization circuits in image data compression and decompression equipment
US5995539A (en) 1993-03-17 1999-11-30 Miller; William J. Method and apparatus for signal transmission and reception
JP3697717B2 (en) 1993-09-24 2005-09-21 ソニー株式会社 Two-dimensional discrete cosine transform device and two-dimensional inverse discrete cosine transform device
US5587708A (en) * 1994-01-19 1996-12-24 Industrial Technology Research Institute Division technique unified quantizer-dequantizer
JP3046224B2 (en) 1994-07-26 2000-05-29 三星電子株式会社 Constant bit rate coding method and apparatus and tracking method for fast search using the same
EP0714212A3 (en) 1994-11-21 1999-03-31 SICAN, GESELLSCHAFT FÜR SILIZIUM-ANWENDUNGEN UND CAD/CAT NIEDERSACHSEN mbH Video decoder using concurrent processing and resource sharing
US6002801A (en) 1995-04-18 1999-12-14 Advanced Micro Devices, Inc. Method and apparatus for improved video decompression by selection of IDCT method based on image characteristics
US5864637A (en) 1995-04-18 1999-01-26 Advanced Micro Devices, Inc. Method and apparatus for improved video decompression by selective reduction of spatial resolution
JP2778622B2 (en) 1995-06-06 1998-07-23 日本電気株式会社 Two-dimensional DCT circuit
JP2914226B2 (en) 1995-06-16 1999-06-28 日本電気株式会社 Transformation encoding of digital signal enabling reversible transformation
JP3274593B2 (en) 1995-09-27 2002-04-15 日本電気株式会社 Conversion device and reversible conversion device capable of reversible conversion
US5995670A (en) 1995-10-05 1999-11-30 Microsoft Corporation Simplified chain encoding
EP0878097B1 (en) * 1996-01-08 2003-03-26 International Business Machines Corporation File server for multimedia file distribution
US6957350B1 (en) 1996-01-30 2005-10-18 Dolby Laboratories Licensing Corporation Encrypted and watermarked temporal and resolution layering in advanced television
JP3168922B2 (en) 1996-08-27 2001-05-21 日本ビクター株式会社 Digital image information recording and playback device
JPH1091614A (en) 1996-09-13 1998-04-10 Hitachi Ltd IDCT integer conversion method
JPH10107644A (en) * 1996-09-26 1998-04-24 Sony Corp Quantizing device and method, and coding device and method
SG54383A1 (en) 1996-10-31 1998-11-16 Sgs Thomson Microelectronics A Method and apparatus for decoding multi-channel audio data
US5883823A (en) 1997-01-15 1999-03-16 Sun Microsystems, Inc. System and method of a fast inverse discrete cosine transform and video compression/decompression systems employing the same
US5974184A (en) 1997-03-07 1999-10-26 General Instrument Corporation Intra-macroblock DC and AC coefficient prediction for interlaced digital video
US6351570B1 (en) 1997-04-01 2002-02-26 Matsushita Electric Industrial Co., Ltd. Image coding and decoding apparatus, method of image coding and decoding, and recording medium for recording program for image coding and decoding
US6058215A (en) 1997-04-30 2000-05-02 Ricoh Company, Ltd. Reversible DCT for lossless-lossy compression
US6134270A (en) 1997-06-13 2000-10-17 Sun Microsystems, Inc. Scaled forward and inverse discrete cosine transform and video compression/decompression systems employing the same
US6057855A (en) 1997-07-02 2000-05-02 Hewlett-Packard Company Method and apparatus for providing polygon pixel sub-sample information using incremental means
JPH11122624A (en) 1997-10-16 1999-04-30 Matsushita Electric Ind Co Ltd Method and apparatus for reducing video decoder throughput
US6006179A (en) 1997-10-28 1999-12-21 America Online, Inc. Audio codec using adaptive sparse vector quantization with subband vector classification
US6137916A (en) 1997-11-17 2000-10-24 Sony Electronics, Inc. Method and system for improved digital video data processing using 8-point discrete cosine transforms
WO1999029112A1 (en) 1997-12-01 1999-06-10 Matsushita Electric Industrial Co., Ltd. Image processor, image data processor and variable length encoder/decoder
WO1999033276A1 (en) 1997-12-19 1999-07-01 Infineon Technologies Ag Device for multiplying with constant factors and use of said device for video compression (mpeg)
RU2201654C2 (en) 1997-12-23 2003-03-27 Томсон Лайсенсинг С.А. Low-noise coding and decoding method
JP3953183B2 (en) 1998-03-27 2007-08-08 パナソニック コミュニケーションズ株式会社 Image communication method and image communication apparatus
US6115689A (en) 1998-05-27 2000-09-05 Microsoft Corporation Scalable audio coder and decoder
US6029126A (en) 1998-06-30 2000-02-22 Microsoft Corporation Scalable audio coder and decoder
US6073153A (en) 1998-06-03 2000-06-06 Microsoft Corporation Fast system and method for computing modulated lapped transforms
US6154762A (en) 1998-06-03 2000-11-28 Microsoft Corporation Fast system and method for computing modulated lapped transforms
US6301304B1 (en) 1998-06-17 2001-10-09 Lsi Logic Corporation Architecture and method for inverse quantization of discrete cosine transform coefficients in MPEG decoders
GB9819648D0 (en) 1998-09-10 1998-11-04 Nds Ltd Determining visually noticeable differences between two images
US6353808B1 (en) 1998-10-22 2002-03-05 Sony Corporation Apparatus and method for encoding a signal as well as apparatus and method for decoding a signal
EP1125235B1 (en) 1998-10-26 2003-04-23 STMicroelectronics Asia Pacific Pte Ltd. Multi-precision technique for digital audio encoder
US7194138B1 (en) 1998-11-04 2007-03-20 International Business Machines Corporation Reduced-error processing of transformed digital data
US6421464B1 (en) 1998-12-16 2002-07-16 Fastvdo Llc Fast lapped image transforms using lifting steps
US6363117B1 (en) 1998-12-31 2002-03-26 Sony Corporation Video compression using fast block motion estimation
US6473534B1 (en) 1999-01-06 2002-10-29 Hewlett-Packard Company Multiplier-free implementation of DCT used in image and video processing and compression
US6496795B1 (en) 1999-05-05 2002-12-17 Microsoft Corporation Modulated complex lapped transform for integrated signal enhancement and coding
US6487574B1 (en) 1999-02-26 2002-11-26 Microsoft Corp. System and method for producing modulated complex lapped transforms
US6370502B1 (en) 1999-05-27 2002-04-09 America Online, Inc. Method and system for reduction of quantization-induced block-discontinuities and general purpose audio codec
US6574651B1 (en) 1999-10-01 2003-06-03 Hitachi, Ltd. Method and apparatus for arithmetic operation on vectored data
US6507614B1 (en) 1999-10-19 2003-01-14 Sony Corporation Efficient de-quantization in a digital video decoding process using a dynamic quantization matrix for parallel computations
US7028063B1 (en) 1999-10-26 2006-04-11 Velocity Communication, Inc. Method and apparatus for a DFT/IDFT engine supporting multiple X-DSL protocols
AU2063401A (en) 1999-12-06 2001-06-12 Hrl Laboratories, Llc Variable precision wavelets
EP1172008A1 (en) 2000-01-12 2002-01-16 Koninklijke Philips Electronics N.V. Image data compression
JP3593944B2 (en) 2000-03-08 2004-11-24 日本電気株式会社 Image data processing apparatus and motion compensation processing method used therefor
JP4560694B2 (en) 2000-04-05 2010-10-13 ソニー株式会社 Encoding apparatus and method thereof
US6606725B1 (en) 2000-04-25 2003-08-12 Mitsubishi Electric Research Laboratories, Inc. MAP decoding for turbo codes by parallel matrix processing
SE522261C2 (en) * 2000-05-10 2004-01-27 Global Ip Sound Ab Encoding and decoding of a digital signal
FR2818053B1 (en) * 2000-12-07 2003-01-10 Thomson Multimedia Sa ENCODING METHOD AND DEVICE FOR DISPLAYING A ZOOM OF AN MPEG2 CODED IMAGE
US8374237B2 (en) 2001-03-02 2013-02-12 Dolby Laboratories Licensing Corporation High precision encoding and decoding of video images
JP4063508B2 (en) 2001-07-04 2008-03-19 日本電気株式会社 Bit rate conversion device and bit rate conversion method
US20030112873A1 (en) 2001-07-11 2003-06-19 Demos Gary A. Motion estimation for video compression systems
US7123655B2 (en) 2001-08-09 2006-10-17 Sharp Laboratories Of America, Inc. Method for reduced bit-depth quantization
US7185037B2 (en) 2001-08-23 2007-02-27 Texas Instruments Incorporated Video block transform
KR100433709B1 (en) 2001-08-31 2004-05-31 (주)씨앤에스 테크놀로지 Discrete cosine transform method of distributed arithmetic
US6882685B2 (en) 2001-09-18 2005-04-19 Microsoft Corporation Block transform and quantization for image and video coding
US7295609B2 (en) 2001-11-30 2007-11-13 Sony Corporation Method and apparatus for coding image information, method and apparatus for decoding image information, method and apparatus for coding and decoding image information, and system of coding and transmitting image information
EP2262269B1 (en) * 2001-12-17 2018-01-24 Microsoft Technology Licensing, LLC Skip macroblock coding
EP1469682A4 (en) * 2002-01-24 2010-01-27 Hitachi Ltd CODING AND DECODING OF ANIMATED IMAGE SIGNAL AND APPARATUS THEREFOR
US7379498B2 (en) * 2002-03-11 2008-05-27 Broadcom Corporation Reconstructing a compressed still image by transformation to a compressed moving picture image
CN1225904C (en) 2002-04-12 2005-11-02 精工爱普生株式会社 Method and apparatus for storage of effective compression domain video processing and compensation of fast reverse motion
US7242713B2 (en) 2002-05-02 2007-07-10 Microsoft Corporation 2-D transforms for image and video coding
US6944224B2 (en) 2002-08-14 2005-09-13 Intervideo, Inc. Systems and methods for selecting a macroblock mode in a video encoder
US7197525B2 (en) 2002-11-26 2007-03-27 Analog Devices, Inc. Method and system for fixed point fast fourier transform with improved SNR
US7075530B2 (en) 2003-02-27 2006-07-11 International Business Machines Corporation Fast lighting processors
US7330866B2 (en) 2003-07-01 2008-02-12 Nvidia Corporation System for frequency-domain scaling for discrete cosine transform
JP4617644B2 (en) 2003-07-18 2011-01-26 ソニー株式会社 Encoding apparatus and method
US7502415B2 (en) * 2003-07-18 2009-03-10 Microsoft Corporation Range reduction
US7499495B2 (en) * 2003-07-18 2009-03-03 Microsoft Corporation Extended range motion vectors
US7609763B2 (en) 2003-07-18 2009-10-27 Microsoft Corporation Advanced bi-directional predictive coding of video frames
US8218624B2 (en) * 2003-07-18 2012-07-10 Microsoft Corporation Fractional quantization step sizes for high bit rates
US7688895B2 (en) 2003-07-22 2010-03-30 Lsi Corporation Method and/or circuit for binary arithmetic decoding decisions before termination
US20050036548A1 (en) * 2003-08-12 2005-02-17 Yong He Method and apparatus for selection of bit budget adjustment in dual pass encoding
US8014450B2 (en) 2003-09-07 2011-09-06 Microsoft Corporation Flexible range reduction
KR100965881B1 (en) 2003-10-10 2010-06-24 Samsung Electronics Co., Ltd. Video data encoding system and decoding system
WO2005076614A1 (en) 2004-01-30 2005-08-18 Matsushita Electric Industrial Co., Ltd. Moving picture coding method and moving picture decoding method
US20050213835A1 (en) 2004-03-18 2005-09-29 Huazhong University Of Science & Technology And Samsung Electronics Co., Ltd. Integer transform matrix selection method in video coding and related integer transform method
US20050259729A1 (en) 2004-05-21 2005-11-24 Shijun Sun Video coding with quality scalability
JP4241517B2 (en) 2004-06-15 2009-03-18 Canon Inc. Image encoding apparatus and image decoding apparatus
JP4074868B2 (en) 2004-12-22 2008-04-16 Toshiba Corporation Image coding control method and apparatus
EP1869894A1 (en) 2005-04-14 2007-12-26 Thomson Licensing Method and apparatus for slice adaptive motion vector coding for spatial scalable video encoding and decoding
CA2610276C (en) 2005-07-22 2013-01-29 Mitsubishi Electric Corporation Image encoder and image decoder, image encoding method and image decoding method, image encoding program and image decoding program, and computer readable recording medium recorded with image encoding program and computer readable recording medium recorded with image decoding program
CN100539437C (en) 2005-07-29 2009-09-09 上海杰得微电子有限公司 A kind of implementation method of audio codec
US8548265B2 (en) 2006-01-05 2013-10-01 Fastvdo, Llc Fast multiplierless integer invertible transforms
US20070271321A1 (en) 2006-01-11 2007-11-22 Qualcomm, Inc. Transforms with reduce complexity and/or improve precision by means of common factors
US8942289B2 (en) * 2007-02-21 2015-01-27 Microsoft Corporation Computational complexity and precision control in transform-based digital media codec
WO2008118146A1 (en) * 2007-03-23 2008-10-02 Thomson Licensing Modifying a coded bitstream
KR101518999B1 (en) * 2007-06-14 2015-05-12 Thomson Licensing Modifying a coded bitstream
US20120014431A1 (en) * 2010-07-14 2012-01-19 Jie Zhao Methods and Systems for Parallel Video Encoding and Parallel Video Decoding
US9313514B2 (en) * 2010-10-01 2016-04-12 Sharp Kabushiki Kaisha Methods and systems for entropy coder initialization

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104885468A (en) * 2012-12-28 2015-09-02 Canon Inc. Provision of precision information in an image encoding device, image encoding method and program, image decoding device, and image decoding method and program
CN104885468B (en) * 2012-12-28 2019-11-01 Canon Inc. Picture coding device and method and picture decoding apparatus and method
US10743017B2 (en) 2012-12-28 2020-08-11 Canon Kabushiki Kaisha Encoding including transform and quantization, or decoding including inverse-transform and inverse-quantization
CN104995918A (en) * 2013-01-08 2015-10-21 Microsoft Corporation Method for converting the format of a frame into a chroma subsampling format
CN104995918B (en) * 2013-01-08 2019-05-17 Microsoft Technology Licensing, LLC Method for converting the format of a frame to a chroma subsampling format
CN106664448A (en) * 2014-07-11 2017-05-10 LG Electronics Inc. Method and device for transmitting and receiving broadcast signals
US10419718B2 (en) 2014-07-11 2019-09-17 Lg Electronics Inc. Method and device for transmitting and receiving broadcast signal
US10582269B2 (en) 2014-07-11 2020-03-03 Lg Electronics Inc. Method and device for transmitting and receiving broadcast signal
US10368144B2 (en) 2014-07-29 2019-07-30 Lg Electronics Inc. Method and device for transmitting and receiving broadcast signal
WO2019148977A1 (en) * 2018-02-01 2019-08-08 Mediatek Inc. Methods and apparatuses of video encoding or decoding with adaptive quantization of video data
US11019338B2 (en) 2018-02-01 2021-05-25 Mediatek Inc. Methods and apparatuses of video encoding or decoding with adaptive quantization of video data

Also Published As

Publication number Publication date
RU2009131599A (en) 2011-02-27
TWI471013B (en) 2015-01-21
BRPI0807465A8 (en) 2017-01-17
IL199994A (en) 2015-11-30
HK1140341A1 (en) 2010-10-08
CN101617539B (en) 2013-02-13
TW200843515A (en) 2008-11-01
KR20090115726A (en) 2009-11-05
KR20150003400A (en) 2015-01-08
BRPI0807465B1 (en) 2020-05-26
RU2518417C2 (en) 2014-06-10
EP2123045A4 (en) 2013-03-13
JP5457199B2 (en) 2014-04-02
US20080198935A1 (en) 2008-08-21
KR101507183B1 (en) 2015-03-30
IL199994A0 (en) 2010-04-15
JP2010519858A (en) 2010-06-03
KR101550166B1 (en) 2015-09-03
BRPI0807465A2 (en) 2014-06-03
WO2008103766A3 (en) 2008-11-27
EP2123045B1 (en) 2018-10-17
JP2014078952A (en) 2014-05-01
WO2008103766A2 (en) 2008-08-28
EP2123045A2 (en) 2009-11-25
US8942289B2 (en) 2015-01-27

Similar Documents

Publication Publication Date Title
US8942289B2 (en) Computational complexity and precision control in transform-based digital media codec
US11843775B2 (en) Flexible quantization
US8515194B2 (en) Signaling and uses of windowing information for images
CN101617540A (en) Signaling and use of chroma sample positioning information
CA2735973C (en) Reduced dc gain mismatch and dc leakage in overlap transform processing
HK1140341B (en) Computational complexity and precision control in transform-based digital media codec
HK1179084B (en) Flexible quantization

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 1140341

Country of ref document: HK

C14 Grant of patent or utility model
GR01 Patent grant
REG Reference to a national code

Ref country code: HK

Ref legal event code: GR

Ref document number: 1140341

Country of ref document: HK

ASS Succession or assignment of patent right

Owner name: MICROSOFT TECHNOLOGY LICENSING LLC

Free format text: FORMER OWNER: MICROSOFT CORP.

Effective date: 20150514

C41 Transfer of patent application or patent right or utility model
TR01 Transfer of patent right

Effective date of registration: 20150514

Address after: Washington State

Patentee after: Microsoft Technology Licensing, LLC

Address before: Washington State

Patentee before: Microsoft Corp.