
Patent Application Titled "Moving Image Encoding Method, Moving Image Encoding Apparatus, and Computer-Readable Medium" Published Online

July 3, 2014



By a News Reporter-Staff News Editor at Computer Weekly News -- According to news reporting originating from Washington, D.C., by VerticalNews journalists, a patent application by the inventor Moriyoshi, Tatsuji (Tokyo, JP), filed on May 1, 2012, was made available online on June 19, 2014.

The assignee for this patent application is NEC Corporation.

Reporters obtained the following quote from the background information supplied by the inventors: "In recent years, a moving image encoding technique has been widely used. The moving image encoding technique has been used for a wide range of applications such as digital broadcasting, video content distribution via an optical disk, and video distribution via the Internet and the like. As techniques for generating encoded data by encoding a moving image signal at a low bit rate, a high compression ratio, and a high image quality and for decoding the encoded moving image, H.261 and H.263, which are standardized by the ITU (International Telecommunication Union), MPEG-1, MPEG-2, and MPEG-4, which are ISO (International Organization for Standardization) standards, VC-1, which is a SMPTE (Society of Motion Picture and Television Engineers) standard, and the like are used as the international standards.

"In addition, H.264/MPEG-4 AVC (hereinafter referred to as 'H.264') which has been recently standardized by the ITU and ISO is known (Non Patent Literature 1). It is known that H.264 further improves the compression efficiency and image quality, as compared with moving image encoding techniques of related art.

"To meet the demand for improving the video quality and reducing the transmission rate, the encoding techniques are complicated and the load of encoding processing increases. Accordingly, for example, real-time encoding of a full-high-vision (1920.times.1080 pixels) video in an H.264 system, which is the latest international standard system, cannot be achieved using only typical CPU software, and some accelerator is also used in practice. As a desirable accelerator in a platform of a PC (personal computer), a GPGPU (General Purpose Computing on Graphics Processing Unit) is known. In the GPGPU, a GPU (Graphics Processing Unit), which has been used for three-dimensional graphics processing, is used for other purposes as well. The GPGPU can perform processing matching the characteristics of the GPU, which exhibits an extremely high performance in large-scale vector operation, at a speed several to several tens of times faster than a CPU.

"FIG. 8 shows a typical example of a moving image encoding apparatus of an H.264 system. As shown in the figure, a moving image encoding apparatus 100 includes a motion estimation unit 101, a motion compensation unit 102, an intra prediction mode determination unit 103, an intra prediction unit 104, a selection unit 105, an integer transform unit 106, a quantization unit 107, an inverse quantization unit 108, an inverse discrete integer transform unit 109, a variable-length coding unit 110, a deblocking filter unit 111, a frame buffer 112, a subtraction unit 113, and an addition unit 114. The moving image encoding apparatus 100 sequentially encodes each input image (hereinafter referred to as 'input image') to obtain a bit stream and outputs the obtained bit stream. In the moving image encoding apparatus 100, processing of all functional blocks is executed by a CPU.

"In order to improve the compression efficiency and image quality, the H.264 system also employs the intra prediction (in-screen prediction) technique that performs a prediction using information on neighboring pixels within an image, and the technique of a deblocking filtering for reducing encoding noise caused in an image obtained as a result of encoding. The frame buffer 112 stores image data of previously encoded frames. Encoding processing is performed on the input image in the unit of a block of 16.times.16 pixels. The block is called a macroblock (MB).

"The motion estimation (ME: Motion Estimation) unit 101 detects a change in the position of the corresponding image block between an input image and an encoded image stored in the frame buffer 112, and outputs motion vector information corresponding to the position change. The motion compensation (MC: Motion Compensation) unit 102 performs motion compensation processing using the encoded image stored in the frame buffer 112 and the motion vector information supplied from the motion estimation unit 101, and outputs a motion compensation prediction image.

"The intra prediction mode determination unit 103 selects an appropriate intra prediction mode based on the input image and image information on the encoded macroblock within the input image, and outputs information (intra prediction mode information) indicating the selected mode. The intra prediction (IP: Intra Prediction) unit 104 performs intra prediction processing using the image information on the encoded macroblock within the input image and the intra prediction mode information supplied from the intra prediction mode determination unit 103, and outputs an intra prediction image.

"The selection unit 105 selects an appropriate one of either the motion compensation prediction image, which is supplied from the motion compensation unit 102, or the intra prediction image, which is supplied from the intra prediction unit 104, and outputs the selected image as a predicted image. A mode for selecting the motion compensation prediction image is called an inter mode, and a mode for selecting the intra prediction image is called an intra mode.

"The subtraction unit 113 subtracts the predicted image, which is output from the selection unit 105, from the input image, and outputs a prediction error image. The integer transform (DIT: Discrete Integer Transform) unit 106 performs orthogonal transform processing similar to that performed by DCT (Discrete Cosine Transform) on the prediction error image to obtain an orthogonal transform coefficient sequence, and outputs the obtained orthogonal transform coefficient sequence.

"The quantization (Q: Quantize) unit 107 quantizes the orthogonal transform sequence from the integer transform unit 106, and outputs the quantized orthogonal transform coefficient sequence.

"The variable-length coding (VLC: Variable-Length Coding) unit 110 encodes the quantized orthogonal transform coefficient sequence, which is supplied from the quantization unit 107, according to a predetermined rule, and outputs a bit stream of encoding results. This bit stream is an output bit stream of the encoding apparatus of the H.264 system.

"The orthogonal transform coefficient sequence quantized by the quantization unit 107 is also output to the inverse quantization (IQ: Inverse Quantization) unit 108, and is subjected to inverse quantization processing by the inverse quantization unit 108 and then subjected to inverse discrete integer transform processing by the inverse discrete integer transform (IDIT: Inverse Discrete Integer Transform) unit 109. Then, the orthogonal transform coefficient sequence is added to the predicted image, which is output from the selection unit 105, by the addition unit 114, and is further subjected to deblocking filtering processing by the deblocking filter unit 111. Data obtained by the deblocking filter unit 111 is a local decoded image to be stored in the frame buffer 112 and used for encoding of the subsequent frame.

"The intra prediction mode determination unit 103 and the selection unit 105 employ various selection methods. However, in general, the intra prediction mode determination unit 103 and the selection unit 105 select one having a higher encoding efficiency.

"The contents of the above-mentioned processing of the functional blocks of the moving image encoding apparatus 100 are also disclosed in Non Patent Literature 2, for example, so a detailed description thereof is omitted.

"In each functional block of the moving image encoding apparatus 100 shown in FIG. 8, in general, the throughputs for the motion estimation performed by the motion estimation unit 101 and the intra prediction mode determination performed by the intra prediction mode determination unit 103 are especially high. Accordingly, the processing of the motion estimation and intra prediction mode determination is off-loaded to an accelerator, such as a GPU, thereby achieving speeding-up of the processing. The case where the motion estimation and intra prediction mode determination are off-loaded to the GPU will be described with reference to FIG. 9.

"FIG. 9 shows an example of the configuration of the moving image encoding apparatus in which the respective processings of the motion estimation unit 101, the intra prediction mode determination unit 103, and the motion compensation unit 102, which uses the results of the motion estimation performed by the motion estimation unit 101, as shown in FIG. 8, are off-loaded to the GPU and the subsequent respective processings are executed by the CPU. To facilitate comparison with FIG. 8, functional blocks having the same function are denoted by the same reference numerals in FIGS. 9 and 8.

"As shown in FIG. 9, in a moving image encoding apparatus 200, the GPU executes the respective processings of the motion estimation unit 101, an intra prediction mode determination unit 203, and the motion compensation unit 102. Other respective processings are executed by the CPU.

"In the moving image encoding apparatus 200, the intra prediction mode determination unit 203 that performs an intra prediction mode determination is different from the intra prediction mode determination unit 103 of the moving image encoding apparatus 100. The reason for this will be described below.

"Since it generally takes a long time to perform data communication between a CPU and a GPU, the processing results can be collectively transferred from the CPU to the GPU by a certain amount, for example, by an amount corresponding to one screen. Specifically, the GPU performs motion estimation, motion compensation, and intra prediction mode determination processing for one screen, and collectively transfers the processing results for one screen to the CPU. The CPU performs the subsequent processing for the one screen. In this case, the intra prediction mode determination unit 203 cannot use the image information on the encoded macroblocks within the same image. Accordingly, unlike the intra prediction mode determination unit 103 of the moving image encoding apparatus 100, the intra prediction mode determination unit 203 operates to select an appropriate intra prediction mode by using only the information on the input image.

"In the moving image encoding apparatus 200, the intra prediction mode determination processing is performed by the GPU, but the intra prediction processing using the result is executed by the CPU as in the moving image encoding apparatus 100. This is because the result of DIT-Q-IQ-IDIT processing on an image block adjacent to the image block (having a size of 16.times.16 pixels, 8.times.8 pixels, or 4.times.4 pixels) which is being processed is required for intra prediction.

"In many cases, patterns that are spatially analogous to each other are continuously formed in normal images. For this reason, in the H.264 intra prediction, the image data of the block adjacent to the block to be processed is duplicated to predict the image of the block to be processed, thereby obtaining a high prediction effect. To deal with various types of patterns, prediction modes in nine directions as shown in FIG. 10 are used in the case of 4.times.4 blocks, for example. The intra prediction mode determination is processing for determining a mode indicating an optimum prediction result from among the nine modes. The prediction results of the nine modes are evaluated in each block of an image and the optimum mode is selected, which results in an increase in throughput. In this case, images located outside the screen cannot be used as a prediction source, so the operation of intra prediction is changed at an end of the screen and at a boundary of division when the screen is divided. At an upper end of the screen, for example, the modes 0, 3, 4, 5, 6, and 7, in which the upper-side image is used, cannot be selected. When the mode 2 is selected, a special operation is carried out.

"As disclosed in Patent Literatures 1 and 2, for example, H.264 enables division of a screen into small regions, each of which is called a slice, thereby encoding each slice separately. In this case, images located outside each slice cannot be used as a prediction source, so the operation of intra prediction is also changed at the boundary between slices in the same manner as described above.

"Processing for dividing a screen into slices to be encoded will be described in detail with reference to a moving image encoding apparatus 300 of related art shown in FIG. 11. For ease of understanding, in FIG. 11, the processing configuration is limited to that when the intra mode, i.e., the intra prediction image, is selected. The subtraction unit 113, the integer transform unit 106, the quantization unit 107, the inverse quantization unit 108, the inverse discrete integer transform unit 109, the addition unit 114, and the variable-length coding unit 110, which are provided in the moving image encoding apparatus 100 shown in FIG. 8, are integrated into a block encoding unit 303. A slice division structure control unit 301 is added to explain the operation using a slice.

"As shown in FIG. 11, an intra prediction mode determination unit 310 includes an optimum mode determination unit 320. The optimum mode determination unit 320 determines an appropriate intra prediction mode by using information on an input image and information on the slice division structure of the screen supplied from the slice division structure control unit 301, and outputs intra prediction mode information.

"An intra prediction unit 302 performs intra prediction processing using the intra prediction mode information supplied from the intra prediction mode determination unit 310, information on the slice division structure of the screen supplied from the slice division structure control unit 301, and information on an encoded image supplied from an encoded image storage unit 304, and outputs an intra prediction image.

"The block encoding unit 303 performs a series of encoding processing, such as DIT-Q-IQ-IDIT, by using the input image and the intra prediction image supplied from the intra prediction unit 302, and outputs a bit stream and an encoded image. The encoded image storage unit 304 stores the encoded image supplied from the block encoding unit 303.

"The slice division structure control unit 301 determines a slice division position by using the bit stream output from the block encoding unit 303, and outputs information on the slice division structure of the screen.

"The encoding by slice division is effective for reducing the effect of transmission line errors when video communication is performed using a transmission line in which an error occurs. A variable-length code is used as a bit stream in H.264 and the like. Accordingly, if a bit error occurs due to a transmission line error or the like, bit streams following the position where the bit error occurs cannot be normally decoded, so that the effect of the bit error propagates through the subsequent region of the screen. FIG. 12 is a diagram for explaining the range affected by the error.

"The left side of FIG. 12 shows the range affected by the error when slice division is not performed. As shown in the figure, in this case, the effect of the error covers the whole region below the position where the error occurs. On the other hand, as shown on the right side of FIG. 12, when slice division is performed, the effect of the error is limited to the inside of the slice in which the error occurs.

"In the case of using an IP (Internet Protocol) network as a transmission line, slice division is generally performed so that the data size of one slice falls within Path MTU (which is a maximum data size that can be transmitted by one packet). This is because if the slice is present across a plurality of packets, the error rate due to a packet loss increases. Therefore, in the case of encoding a moving image, dynamic slice division processing is performed in which the data size of a bit stream obtained as a result of encoding is monitored and when the size of data included in one slice exceeds a predetermined value, the slice is further divided. The slice division structure control unit 301 of the moving image encoding apparatus 300 performs this dynamic slice division processing."

In addition to obtaining background information on this patent application, VerticalNews editors also obtained the inventor's summary information for this patent application: "Technical Problem

"Now, the case where processing with a large load is off-loaded to the GPU as in the moving image encoding apparatus 200 shown in FIG. 9 and slice division is performed as in the moving image encoding apparatus 300 shown in FIG. 11 will be considered. In this case, in the moving image encoding apparatus 300 shown in FIG. 11, the processing of the intra prediction mode determination unit 310 is performed by the GPU and the processing of other functional blocks including the slice division structure control unit 301 is performed by the CPU as shown in FIG. 13.

"As described above, the GPU processes one screen and collectively transmits the processing results for the one screen to the CPU. In this case, even when the CPU controls the slice division, the processing for the one screen has already been completed in the intra prediction mode determination performed by the GPU. This poses a problem that the results of the slice division cannot be fed back for the intra prediction mode determination.

"Assuming a certain slice division structure, for example, the GPU collectively determines the optimum intra prediction mode in the structure for the one screen and sends the results to the CPU. The CPU performs intra prediction based on the intra prediction mode information from the GPU and performs block encoding processing. Further, the CPU performs variable-length coding on the processing result and outputs a bit stream. For the dynamic slice division, the CPU monitors the data size of the output bit stream, and performs slice division when the data size exceeds the predetermined value.

"As described above, the operation of intra prediction is changed at the boundary between slices. Accordingly, when the slice division structure is changed by the dynamic slice division, it is originally necessary to perform the intra prediction mode determination processing again based on a new slice division structure. However, since the processing for the one screen has already been completed, it is difficult to perform the intra prediction mode determination again. In theory, it is possible to perform the intra prediction mode determination again, but there is a problem that the efficiency deteriorates due to an increase in cost of the communication between the CPU and the GPU and an increase in the number of operations as a result of performing the processing again, for example. On the other hand, when the results of the intra prediction mode determination performed based on a slice division structure different from that determined by the CPU are used as they are, there is a problem that data (bit stream) which is non-compliant with the standards can be generated, or the image quality significantly deteriorates, for example.

"The present invention has been made in view of the above-mentioned circumstances, and provides a moving image encoding technique that can avoid deterioration in image quality and efficiency even when a slice division structure is dynamically changed.

"Solution to Problem

"An exemplary aspect of the present invention is a moving image encoding apparatus that encodes each picture by dividing the picture into a plurality of slices. The moving image encoding apparatus includes a slice division structure control unit, a pre-stage processing unit, and a post-stage processing unit.

"The slice division structure control unit determines a slice division structure indicating a division position for dividing a picture into a plurality of slices.

"The pre-stage processing unit performs pre-stage processing which is processing for a pre-stage portion for encoding the picture. The pre-stage processing can yield different processing results for different slice division structures.

"The post-stage processing unit performs post-stage processing, which is processing for a post-stage portion for encoding, based on the processing result of the pre-stage processing unit, and obtains an encoding result.

"The pre-stage processing unit performs the pre-stage processing for each of a plurality of slice division structures that can be determined by the slice division structure control unit, and obtains processing results for each of the plurality of slice division structures.

"The post-stage processing unit selects, from among the processing results for each of the plurality of slice division structures obtained by the pre-stage processing unit, a processing result matching a slice division structure determined by the slice division structure control unit, and performs the post-stage processing based on the selected processing result.

"Note that implementations of the apparatus according to the above-mentioned aspect in the form of a system and a method, and a program for causing a computer to execute processing as the apparatus, and the like may also be effective as aspects of the present invention.

"Advantageous Effects of Invention

"The technique according to the present invention can avoid deterioration in image quality and efficiency even when a slice division structure is dynamically changed during moving image encoding.

"BRIEF DESCRIPTION OF DRAWINGS

"FIG. 1 is a diagram for explaining the technical principle according to the present invention;

"FIG. 2 is a diagram for explaining a case where pre-stage processing in a moving image encoding apparatus shown in FIG. 1 is off-loaded to a GPU;

"FIG. 3 is a diagram showing a moving image encoding apparatus according to a first exemplary embodiment of the present invention;

"FIG. 4 is a diagram showing an example of the relationship between a slice structure and the number of selectable intra prediction modes;

"FIG. 5 is a diagram showing an example of processing results that are output to a post-stage processing unit by an intra prediction mode determination unit in the moving image encoding apparatus shown in FIG. 3;

"FIG. 6 is a diagram showing a moving image encoding apparatus according to a second exemplary embodiment of the present invention;

"FIG. 7 is a timing diagram showing an example of the timing relationship between intra prediction mode determination processing and block encoding processing in the moving image encoding apparatus shown in FIG. 6;

"FIG. 8 is a diagram showing an example of a typical moving image encoding apparatus of an H.264 system;

"FIG. 9 is a diagram for explaining a case where a part of the processing of the moving image encoding apparatus shown in FIG. 8 is off-loaded to a CPU;

"FIG. 10 is a diagram showing an example of intra prediction modes;

"FIG. 11 is a schematic diagram showing an example of a moving image encoding apparatus that performs slice division;

"FIG. 12 is a diagram for explaining a difference in error influence range between a case where slice division is performed and a case where slice division is not performed; and

"FIG. 13 is a diagram for explaining a problem of a related art technique."

For more information, see this patent application: Moriyoshi, Tatsuji. Moving Image Encoding Method, Moving Image Encoding Apparatus, and Computer-Readable Medium. Filed May 1, 2012 and posted June 19, 2014. Patent URL: http://appft.uspto.gov/netacgi/nph-Parser?Sect1=PTO2&Sect2=HITOFF&u=%2Fnetahtml%2FPTO%2Fsearch-adv.html&r=4069&p=82&f=G&l=50&d=PG01&S1=20140612.PD.&OS=PD/20140612&RS=PD/20140612

Keywords for this news article include: NEC Corporation.

Our reports deliver fact-based news of research and discoveries from around the world. Copyright 2014, NewsRx LLC





Source: Computer Weekly News

