# H.264/AVC Video Codec Technology for High-Image-Quality and Low-Power Applications

• Kiyonori Morioka

To make use of the high coding performance of the international standard H.264/AVC to achieve high image quality, a massive amount of computation and accompanying reference to many pieces of image data are required. These processes involve a trade-off with reducing power consumption. Fujitsu has developed H.264/AVC codec technology capable of achieving industry-leading levels of high-image-quality and low-power-consumption performance based on its proprietary image analysis technology, and applied it to many LSIs for imaging devices. This paper outlines the technology to reduce power consumption with the focus on data transfer with external memory, which occurs when there is internal computation processing. First, it describes a lossless image compression technique optimized for video codec processing. Then, it gives an explanation about technology called "lossless compression prefetch memory," which is applied to read ahead and retain the data likely to be used repeatedly, thereby significantly reducing the amount of data transfer. The technology presented in this paper, and the H.264/AVC codec technology, which achieves both high image quality and lower power consumption, can be applied to devices for which there is a high demand for power consumption reduction.

## 1. Introduction

Recently, small mobile devices such as digital cameras and smartphones have come to be widely equipped with functions for handling high-resolution moving images of high-definition television (HDTV) quality. In addition, with the higher-speed communications infrastructure and larger-capacity recording media, video data can now be exchanged between a variety of devices and services. This has made HDTV-quality videos familiar to general consumers.

While video data are generally compressed, their information volumes are overwhelmingly larger than those of other types of digital information such as still images, sound and text. To make efficient use of the recording media capacity and communication bandwidths, performance that allows coding with a high compression rate while minimizing the degradation of image quality due to compression is important. With mobile devices, power-saving performance that lowers battery consumption and LSI heat generation is desired.

Fujitsu developed a video codec (compression/decompression) LSI compliant with H.264/AVC High profile, Level 4.0<sup>1)</sup> in 2006.

Subsequently, in 2008, we developed a high-performance video codec LSI (MB86H56) with lower power consumption and improved compression performance. It supports coding of 1920 × 1080 pixels at 60 frames/second (progressive), which requires twice the processing of conventional LSIs.

In 2009, we developed a transcoder LSI (MB86H57) for TVs with a recording feature and hard disk recorders that recompresses MPEG-2 compressed videos to the H.264/AVC standard to realize long-time recording while maintaining high image quality.

In 2011, we developed Milbeaut (MB91696AM), a high-performance imaging processor for digital cameras, and a transcoder LSI (MB86M01) that realizes the transrating function, which recompresses H.264/AVC compressed videos to the H.264/AVC standard, and is provided with an additional function of optimizing videos for mobile devices such as smartphones.

The video codec technologies integrated in these LSIs have been repeatedly improved with a consistent focus on high-image-quality performance and power-saving performance.

Of the technologies that support high image quality and low power consumption, this paper describes those for effectively reducing the amount of data transfer with external memory (hereafter "amount of external memory transfer").

## 2. Approaches to higher image quality and lower power consumption

To bring out the high compression performance of H.264/AVC, the optimum combination must be identified out of an enormous number of combinations of coding tools. To build such compression technology into an LSI, effectively finding the optimum combination while suppressing the amounts of computation and external memory transfer poses a challenge.

Fujitsu has worked on prefetch memory technology to reduce external memory transfer and technology to optimize signal variations at the circuit element level. Its R&D has been centered on proprietary high-image-quality algorithms to analyze the features of images and realize high-efficiency compression with a small amount of computation. In that way, Fujitsu has constantly realized industry-leading levels of high-image-quality and power-saving performance.<sup>2), 3)</sup>

Along with the dissemination of the H.264/AVC technology, integration of H.264/AVC-compliant video codec technology for various applications has been called for. Systems integrating this large number of functions need to share external memory with various types of signal processing functions other than video codec. In addition, the increase of functions causes an increased amount of external memory transfer in the entire system and issues such as a reduced battery life and greater amount of heat generated. These have further increased the demand for lower power consumption.

The following sections present Fujitsu's H.264/AVC-compliant video codec. First, its architecture is outlined. Then, the video coding technology and the amount of external memory transfer are described, followed by an explanation of the image frame lossless compression technology and lossless compression prefetch memory technology that can help achieve low-power devices by significantly reducing the amount of data transferred.

## 3. Overview of video codec

**Figure 1** shows a block diagram of a video codec core compliant with H.264/AVC. A video codec core is composed of a preprocessing block, basic codec block, entropy block, data transfer block and CPU. The preprocessing, basic codec and entropy blocks may have a different clock frequency or a multicore configuration depending on the performance requirements of the system desired.

For example, a manufacturing process of LSI characterized by low speed and low power consumption can be used to provide a two-core configuration with a suppressed clock frequency when low power consumption is required and, when the LSI chip size is given a higher priority, a one-core configuration with a high clock frequency can be adopted. In this way, a flexible choice is available according to the system requirements.

The following gives an overview of the functions.

1) Preprocessing block

Analyzes the features of an image and performs coarse motion estimation processing, spatial prediction processing and image feature analysis processing.

2) Basic codec block

Performs fine motion estimation, spatial prediction, frequency domain transformation and quantization.

3) Entropy block

Performs variable-length coding/decoding processing such as context-based adaptive binary arithmetic coding (CABAC).

Figure 1 Core block diagram of H.264/AVC video codec.

4) Data transfer block

This block controls data transfer between various computation blocks and external memory. It can integrate compression and decompression blocks that make use of the lossless compression technology and prefetch memory, which will be explained in the following sections, according to the system requirements.

5) CPU

The CPU is responsible for syntax analysis, image quality control and prefetch memory control. Image quality control uses the temporal and spatial feature information analyzed by the preprocessing block as the basis of control for efficient coding including change of computation processes and prioritized allocation of an amount of code to regions that allow easy detection of subjective image quality degradation.

## 4. Reduction of amount of external memory transfer

Image frames of a video that are temporally close to each other include many similar portions. Video coding schemes of the MPEG family, such as H.264/AVC, use this temporal correlation between images to compress the amount of information as a basic principle. To achieve high image quality, it is important that the search for similar portions, which is called "motion estimation," be performed with high precision.
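The principle of motion estimation can be illustrated with a minimal full-search block matching sketch. It uses the sum of absolute differences (SAD) as the matching cost, which is a common textbook choice and not necessarily the cost or search strategy used in the LSIs described here. The function names, the list-of-rows image layout and the parameters are assumptions made for illustration.

```python
def sad(cur, ref, cx, cy, rx, ry, size):
    """Sum of absolute differences between two size x size blocks."""
    total = 0
    for dy in range(size):
        for dx in range(size):
            total += abs(cur[cy + dy][cx + dx] - ref[ry + dy][rx + dx])
    return total


def full_search(cur, ref, cx, cy, size, search_range):
    """Return (best_mv, best_cost) for the block whose top-left corner is (cx, cy).

    Every candidate displacement within +/-search_range is tested, so the
    amount of reference data touched grows with the square of the range.
    """
    height, width = len(ref), len(ref[0])
    best_mv = (0, 0)
    best_cost = sad(cur, ref, cx, cy, cx, cy, size)
    for my in range(-search_range, search_range + 1):
        for mx in range(-search_range, search_range + 1):
            rx, ry = cx + mx, cy + my
            # Skip candidates that fall outside the reference image.
            if rx < 0 or ry < 0 or rx + size > width or ry + size > height:
                continue
            cost = sad(cur, ref, cx, cy, rx, ry, size)
            if cost < best_cost:
                best_mv, best_cost = (mx, my), cost
    return best_mv, best_cost
```

A practical encoder would typically replace the exhaustive loop with a coarse-to-fine search, in line with the coarse and fine motion estimation stages of the preprocessing and basic codec blocks.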

In the motion estimation processing, searching as large a region as possible of a temporally adjacent image frame called a "reference image" increases the possibility of identifying similar regions accordingly. However, a larger region to search means an increase in the amount of read of the reference image stored in the external memory, which strains the external memory bandwidth.
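To give a sense of scale, the following sketch computes how much reference data a search window covers. It assumes 16 × 16 macroblocks, a square search range, 1 byte per luma pixel and a naive design in which every macroblock refetches its whole window from external memory. These are illustrative assumptions, not figures from the LSIs described here.

```python
MB = 16  # macroblock edge in luma pixels (H.264/AVC)


def window_pixels(search_range):
    """Luma pixels in the reference window of one macroblock for a +/-search_range search."""
    edge = MB + 2 * search_range
    return edge * edge


def naive_read_bytes_per_frame(width, height, search_range, bytes_per_pixel=1):
    """Reference bytes read per frame if every macroblock refetches its full window."""
    mbs_x = -(-width // MB)   # ceiling division: partial macroblocks count as whole
    mbs_y = -(-height // MB)
    return mbs_x * mbs_y * window_pixels(search_range) * bytes_per_pixel
```

Under these assumptions, a ±32-pixel search gives an 80 × 80 window, which is 25 times the 256 pixels of the macroblock itself. Most of that window overlaps the windows of neighboring macroblocks, and this overlap is the redundancy that on-chip retention of reference data can exploit.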

In addition to this, video codec-related processing involves a large amount of image frame transfer due to storing of reference images and video input/output, which accounts for a dominant portion of the amount of external memory transfer of the entire system.

For such image transfer with the external memory, we have developed two technologies. One reduces the amount of transfer by losslessly compressing image data. The other, lossless compression prefetch memory technology, makes use of the lossless compression technology to further reduce the amount of reference image transfer. The following sections describe these technologies.

## 5. Lossless compression technology

In video codec processing, especially for reference images relating to motion estimation that involves a large amount of data transfer, random access performance is required for efficient handling of image data for a rectangular region of any image position and size. In addition, there are many processes with strong causality in which, for example, the result of the immediately preceding computation determines the position and size of the image data to be obtained from the external memory and high-speed response to data transfer requests is also required. Furthermore, video coding standards such as H.264/AVC strictly specify the precision of computation and data degradation caused by compression and decompression is not acceptable.

Accordingly, we have developed lossless compression technology optimized for video codec processing that satisfies these conditions.

To develop this lossless compression technology, we conducted a simulation using various evaluation videos to determine the compression block size to serve as the unit of random access. We adopted a system in which a mode that provides the maximum compression is selected out of five compression algorithms including non-compression for each compression block.

**Figure 2** shows the results of a simulation evaluating the compression performance of the lossless compression algorithms. As the results show, we have reduced the amount of data transfer to roughly 50% or less of the pre-compression amount on average while satisfying the requirements, including the random access necessary for codec processing.

## 6. Lossless compression prefetch memory

As shown in **Figure 3**, prefetch memory holds data for a strip-shaped region that is as wide as the frame memory and vertically short. The strip moves to cover the motion estimation reference range as macroblock processing progresses.

Figure 2 Evaluation of lossless compression algorithm.

Figure 3 Image prefetch range and memory management.

The data length of the compressed data stored in the prefetch memory changes according to the compression rate. For this reason, a configuration is adopted in which the correspondence between the coordinate space and data storage address is separately managed.
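Separate management of coordinates and storage addresses can be sketched as a lookup table from block coordinates to an offset and length in a packed buffer. The class and method names are hypothetical, and this sketch only appends data. An actual prefetch memory would also have to reclaim the space of blocks that leave the strip, which is omitted here.

```python
class CompressedStore:
    """Variable-length compressed blocks addressed by block coordinates."""

    def __init__(self):
        self.data = bytearray()  # packed compressed payloads, back to back
        self.table = {}          # (block_x, block_y) -> (offset, length)

    def put(self, block_x, block_y, payload):
        """Append a compressed block and record where it was stored."""
        self.table[(block_x, block_y)] = (len(self.data), len(payload))
        self.data += payload

    def get(self, block_x, block_y):
        """Return the compressed payload, or None if the block is not held."""
        entry = self.table.get((block_x, block_y))
        if entry is None:
            return None
        offset, length = entry
        return bytes(self.data[offset:offset + length])
```

The table lookup doubles as the presence check: a coordinate with no entry is a block that is not currently held.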

In response to a reference image acquisition request in motion estimation processing, the prefetch memory control block checks whether the data exists in the prefetch memory to determine whether a hit or miss has occurred. For a hit, it reads the data in the prefetch memory, decompresses it and transfers it to the motion estimation processing block. For a miss, compressed data is read from the external memory, decompressed and transferred to the motion estimation processing block. At the same time, misses in the upper and lower directions are counted to control the read-ahead amount of the prefetch memory and the timing for offsetting it in the upper or lower direction. This control process allows the prefetch range to follow adaptively even if a global motion such as vertical panning occurs, so that a high hit rate can be maintained.
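The hit/miss decision and the direction-aware miss counting can be sketched as below. The policy shown (shift the strip by one block row once misses in one direction outnumber the other by a threshold) is an assumed, simplified rule for illustration. The paper does not specify the actual control law, and all names and parameters are hypothetical.

```python
class PrefetchStrip:
    """Tracks which block rows of the reference image the prefetch memory holds."""

    def __init__(self, rows, miss_threshold=4):
        self.rows = rows                      # strip height in block rows
        self.offset = 0                       # vertical bias relative to the current row
        self.miss_threshold = miss_threshold  # imbalance that triggers a shift
        self.up_misses = 0
        self.down_misses = 0

    def window(self, mb_row):
        """Return the [top, bottom) block rows held while processing mb_row."""
        top = mb_row - self.rows // 2 + self.offset
        return top, top + self.rows

    def request(self, mb_row, ref_row):
        """Return True on a hit; on a miss, count its direction and maybe shift."""
        top, bottom = self.window(mb_row)
        if top <= ref_row < bottom:
            return True
        if ref_row < top:
            self.up_misses += 1
        else:
            self.down_misses += 1
        # Shift the strip toward the direction in which misses dominate.
        if self.up_misses - self.down_misses >= self.miss_threshold:
            self.offset -= 1
            self.up_misses = self.down_misses = 0
        elif self.down_misses - self.up_misses >= self.miss_threshold:
            self.offset += 1
            self.up_misses = self.down_misses = 0
        return False
```

With a downward pan, requests repeatedly fall below the strip, the downward count grows, and the strip is biased downward until the requests hit again.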

The results of simulation for the miss rate with lossless compression prefetch memory applied are shown in **Figure 4**. For the prefetch memory, a size has been adopted that provides sufficient effect when two reference images compressed at an average rate are referenced at the same time. In addition, the simulation has been configured to prefetch two reference images for P-pictures, which are predictive coded pictures, and four reference images for B-pictures, which are bidirectionally predictive coded pictures.

Figure 4 Evaluation of lossless compression prefetch memory.

As shown in Figure 4, the average miss rate for B-pictures was at most approximately 2%. For P-pictures, the results for which are not shown in the figure, a larger range than for B-pictures can be prefetched, and the miss rate is less than 0.1% on average. In this way, the amount of additional external memory transfer caused by misses has been significantly reduced.

We found that applying lossless compression technology to prefetch memory has two effects: the memory transfer needed to fill the prefetch memory is reduced in proportion to the compression rate, and a large prefetch range can be covered with a small amount of memory. Together, these lead to a significant reduction in the amount of external memory transfer.
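The combined effect can be approximated with a simple back-of-the-envelope model. The model is an assumption made for illustration and does not come from the paper: each reference frame is streamed into the prefetch memory once in compressed form, and only the missed fraction of the demanded data is fetched again, also in compressed form.

```python
def estimated_reference_traffic(frame_bytes, num_refs, compression_ratio,
                                demand_bytes, miss_rate):
    """Rough per-picture reference traffic in bytes with compressed prefetch memory.

    Assumed model, not taken from the paper:
      fill traffic: each reference frame is streamed once, in compressed form
      miss traffic: the fraction of demanded bytes that miss, also compressed
    """
    fill = num_refs * frame_bytes * compression_ratio
    miss = demand_bytes * miss_rate * compression_ratio
    return fill + miss
```

For example, plugging in a compression ratio of 0.5 and a miss rate of 0.02, in line with the roughly 50% and 2% figures reported above, shows the fill term dominating unless the demanded data is far larger than the reference frames themselves.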



**Kiyonori Morioka** Fujitsu Ltd.

Mr. Morioka is currently engaged in research on image processing.

## 7. Conclusion

Of the technologies applied to video codec LSIs compliant with H.264/AVC, this paper has outlined the technology that achieves lower power consumption by reducing the amount of external memory transfer while pursuing high image quality.

In imaging technologies, there is a trend toward higher resolutions such as 4K2K (4096 × 2160 or 3840 × 2160 pixels) and 8K4K (7680 × 4320 pixels) to give users a more realistic viewing experience. In addition, standardization is in progress for a next-generation video compression technology called High Efficiency Video Coding (HEVC), which aims at twice the compression efficiency of H.264/AVC.

In the future, we intend to work energetically on these new technologies as we seek imaging technology that provides larger-than-life pictures. We will also move ahead with technological development that makes such technology not something special but something that anybody can use widely.

## References

1) ISO/IEC 14496-10: Information technology–Coding of audio-visual objects–Part 10: Advanced Video Coding. 2009.
2) A. Nakagawa: Fujitsu's Approach to H.264/AVC and Application Trends. *FUJITSU Sci. Tech. J.*, Vol. 44, No. 3, pp. 343–350 (2008).
3) H. Nakayama et al.: H.264/AVC HDTV Video Codec LSI. *FUJITSU Sci. Tech. J.*, Vol. 44, No. 3, pp. 351–358 (2008).