Kawasaki, Japan, March 05, 2020
Fujitsu Laboratories Ltd. has developed a technology for compressing ultra-high-definition, high-volume video data to the minimum size needed for AI video recognition applications. This technology can compress video data to just one tenth the size of data prepared using conventional compression technology intended for visual confirmation by humans.
In recent years, demand for AI analysis of video data has risen sharply across many business areas. The spread of 5th-generation (5G) mobile communications systems (1), in particular, is expected to drive an explosive increase in the volume of ultra-high-definition video captured by cameras, as well as in the number of images captured on the street and on production lines.
In developing this new compression technology, Fujitsu focused on an important difference in how AI and humans recognize images: when recognizing people, animals, or objects in video data, AI and humans tend to emphasize different areas of the image as important for judgment. Fujitsu has developed a technology that automatically analyzes the areas that AI values and compresses the data to the minimum size at which AI can still recognize it. This makes it possible to analyze large amounts of video data without compromising recognition accuracy, while significantly reducing operating and data transmission costs. The technology is also expected to enable more advanced analysis that combines multiple sets of video data stored in the cloud with sensor data and performance data such as sales data.
Background and Challenges
In recent years, technology for analyzing images using AI has been developing rapidly and is expected to be one of the driving forces for digital transformation in many companies in a variety of industries. With the advent of sophisticated 5G mobile services in 2020, demand for AI analysis is expected to increase even further, accompanied by the increasing use of ultra-high-definition 4K and 8K cameras and large amounts of video data for applications including behavioral analysis in the manufacturing and retail industries.
Despite this, the processing demands of the deep learning techniques used for image analysis present considerable challenges. One effective way to secure the necessary computing power is to process the data in conjunction with the cloud. Because video data is often very large, however, this requires high-compression technology that can transmit all video data to the cloud without compromising quality and without overburdening network bandwidth.
About the Newly Developed Technology
Compressing video reduces image quality to a degree that depends on the compression rate, and if the areas that AI focuses on are compressed excessively, recognition accuracy decreases. Fujitsu has developed a video compression technology that, for each frame of video data, automatically analyzes the areas of an object that AI uses as the basis for its judgment, and compresses (2) each area at the minimum image quality required for recognition (Figure 1). By applying this technology, the size of video data can be significantly reduced compared with conventional compression technologies while maintaining recognition accuracy.
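As an illustration only, the idea of giving each area of a frame its own image quality can be sketched as a per-block compression map. The grid size, the rectangle format, the level scale (higher means stronger compression), and the function name `build_level_map` are all assumptions made for this example; the release does not describe Fujitsu's actual data structures or codec settings.

```python
from typing import List, Tuple

# (left, top, right, bottom) in block units; right and bottom are exclusive.
Rect = Tuple[int, int, int, int]


def build_level_map(
    cols: int,
    rows: int,
    areas: List[Tuple[Rect, int]],
    background_level: int,
) -> List[List[int]]:
    """Assign a compression level to every block of one frame.

    Hypothetical scale: a higher level means stronger compression.
    Blocks outside every area get `background_level`. Where areas
    overlap, the gentlest (lowest) level wins so that no area is
    compressed beyond what it tolerates.
    """
    grid = [[background_level] * cols for _ in range(rows)]
    for (left, top, right, bottom), level in areas:
        # Clip each rectangle to the frame before filling it in.
        for y in range(max(top, 0), min(bottom, rows)):
            for x in range(max(left, 0), min(right, cols)):
                grid[y][x] = min(grid[y][x], level)
    return grid


# Two made-up areas on a 6 x 4 block grid.
level_map = build_level_map(
    cols=6,
    rows=4,
    areas=[((1, 1, 3, 3), 30), ((2, 2, 5, 4), 36)],
    background_level=45,
)
for row in level_map:
    print(row)
```

In this sketch the block where the two areas overlap keeps level 30, the gentler of the two, and everything outside both areas is compressed at the background level.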
Technology to automatically estimate the compression ratio without affecting AI recognition accuracy
The effect of image quality degradation specific to compression on recognition accuracy is analyzed for each area. The compression ratio that does not affect recognition accuracy is automatically estimated based on the AI recognition results (Figure 2).
The importance of each area's features to the AI recognition process is determined by varying the compression ratio, and hence the image quality, of the entire image and aggregating the effects on the recognition results. For each area, the compression ratio immediately before recognition accuracy deteriorates rapidly is taken as the compression ratio that does not affect recognition accuracy.
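The selection rule described above, taking the compression ratio just before accuracy falls off sharply, can be sketched as follows. This is a minimal illustration rather than Fujitsu's implementation: it assumes that a recognition score has already been measured for each area at each compression level, and the level scale, the `max_drop` threshold, the area names, and the sample scores are all hypothetical.

```python
from typing import Dict, Sequence


def last_safe_level(
    levels: Sequence[int],
    accuracy: Sequence[float],
    max_drop: float = 0.05,
) -> int:
    """Return the strongest level reached before accuracy falls sharply.

    levels   : compression levels, ordered from weakest to strongest
    accuracy : recognition score measured at each level for one area
    max_drop : a loss between adjacent levels larger than this is
               treated as "rapid deterioration" (assumed threshold)
    """
    if not levels or len(levels) != len(accuracy):
        raise ValueError("levels and accuracy must be non-empty and equal in length")
    safe = levels[0]
    for i in range(1, len(levels)):
        if accuracy[i - 1] - accuracy[i] > max_drop:
            break  # accuracy collapses at levels[i]; keep the previous level
        safe = levels[i]
    return safe


def estimate_area_levels(
    levels: Sequence[int],
    curves: Dict[str, Sequence[float]],
    max_drop: float = 0.05,
) -> Dict[str, int]:
    """Apply last_safe_level to every area's accuracy curve."""
    return {area: last_safe_level(levels, acc, max_drop) for area, acc in curves.items()}


# Made-up measurements: the area showing a worker degrades early,
# while the background barely affects the recognition result.
levels = [20, 25, 30, 35, 40, 45]
curves = {
    "worker": [0.95, 0.94, 0.93, 0.80, 0.40, 0.10],
    "background": [0.95, 0.95, 0.95, 0.94, 0.94, 0.93],
}
print(estimate_area_levels(levels, curves))
```

With these sample curves the worker area stops at level 30, the last level before its score drops sharply, while the background can be compressed all the way to level 45.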
The technology also feeds the AI recognition results for successive frames back into the compression process, raising the compression to the maximum level at which AI can still recognize the content. In this way, it achieves high compression while maintaining AI recognition accuracy.
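One simple way to picture this feedback is a loop that raises the compression level from frame to frame while recognition keeps succeeding, and backs off when it fails. The sketch below is an assumption about how such a loop might look, not the published method: `recognize_ok` stands in for a real recognizer, and the level scale, step size, and back-off rule are invented for the example. A real pipeline would presumably also re-encode a frame whose recognition failed; the sketch only tracks the level chosen for each frame.

```python
from typing import Callable, Iterable, List, TypeVar

Frame = TypeVar("Frame")


def feedback_levels(
    frames: Iterable[Frame],
    recognize_ok: Callable[[Frame, int], bool],
    start_level: int,
    max_level: int,
    step: int = 1,
) -> List[int]:
    """Return the compression level used for each successive frame.

    While recognition succeeds, the level for the next frame is raised.
    When it fails, the level steps back and the ceiling is lowered so
    the loop settles just below the point of failure.
    """
    level = start_level
    ceiling = max_level
    used: List[int] = []
    for frame in frames:
        used.append(level)
        if recognize_ok(frame, level):
            level = min(level + step, ceiling)
        else:
            ceiling = max(level - step, start_level)
            level = ceiling
    return used


# Stand-in recognizer: pretend recognition works up to level 33.
def toy_recognizer(frame: int, level: int) -> bool:
    return level <= 33


history = feedback_levels(range(6), toy_recognizer, start_level=30, max_level=40, step=2)
print(history)
```

In this toy run the level climbs from 30 to 34, fails there, and then settles at 32 for the remaining frames.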
The newly developed technology was applied to 4K camera footage of multiple workers performing packing work in a factory. It was confirmed that the data size could be reduced to one tenth of that produced by conventional compression technology without any deterioration in recognition accuracy. The technology is expected to be used in applications that do not require strict real-time performance, as well as for advanced analysis that combines multiple sets of video data stored in the cloud with sensor data and performance data such as sales data.
Fujitsu Laboratories is evaluating this technology in a variety of cases, and is carrying out additional research and development to further refine compression performance. Fujitsu expects to commercialize this technology by the end of fiscal 2020, and introduce it into a variety of applications for different industries, including its Fujitsu Manufacturing Industry Solution COLMINA service platform.