March 05, 2018
Fujitsu Research & Development Center Co., Ltd. (Note 1) has developed an AI-based automatic configuration technology for large-scale traffic video analysis solutions. By automatically segmenting a scene image into meaningful regions such as main lane, road median, or sky, and calculating the relevant parameters, the proposed technology reduces the labor cost of deploying a large-scale video analysis system, which requires multiple video analysis functions to be initialized for every camera. In addition, a video analysis system that employs the proposed technology can support PTZ (Pan-Tilt-Zoom) cameras, whereas a conventional video analysis system is disrupted by routine operations such as turning the camera or zooming the lens.
【 Background 】
As the number of private vehicles has increased rapidly in recent years, traffic problems such as congestion, accidents, and air pollution have become more severe. To monitor and mitigate these problems, surveillance cameras are deployed on a large scale, and intelligent video analysis systems are employed to detect and report traffic events automatically and in real time across a very large number of video sources, in place of human operators. Such a system usually supports detection of multiple kinds of traffic events, such as illegal parking, vehicle flow counting, and traffic congestion, which greatly eases the work of human regulators.
Generally, such a system requires users to configure each function before analysis starts, including drawing detection regions and setting the relevant parameters. For example, the user must specify the road region and the legal driving direction for the traffic statistics function and the illegal driving detection function; otherwise the analysis functions may not work accurately.
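To make the manual configuration concrete, the following is a minimal sketch of what the settings for one camera might look like. It is an illustration only: the class names, field names, and the example polygon are hypothetical and are not taken from the Fujitsu system.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Point = Tuple[int, int]  # pixel coordinates (x, y) in the camera image


@dataclass
class FunctionConfig:
    """Hypothetical settings for one analysis function on one camera."""
    function_name: str                # e.g. "traffic_statistics"
    detection_region: List[Point]     # polygon that a user would draw by hand
    legal_direction_deg: float = 0.0  # assumed encoding of the legal driving direction


@dataclass
class CameraConfig:
    """Hypothetical container for all function settings of one camera."""
    camera_id: str
    functions: List[FunctionConfig] = field(default_factory=list)

    def is_ready(self) -> bool:
        """Ready only if at least one function exists and every region is a polygon."""
        return bool(self.functions) and all(
            len(f.detection_region) >= 3 for f in self.functions
        )


# Example values (made up): one road polygon shared by two functions.
road = [(100, 700), (600, 300), (700, 300), (1200, 700)]
camera = CameraConfig(
    camera_id="cam-001",
    functions=[
        FunctionConfig("traffic_statistics", road, legal_direction_deg=90.0),
        FunctionConfig("illegal_driving", road, legal_direction_deg=90.0),
    ],
)
```

Every field in such a structure is something a person would otherwise have to draw or type for each camera and each function, which is the work that automatic configuration aims to remove.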
【 Topics 】
The unavoidable manual configuration described above (also called “system initialization”) has the following problems:
- Completing the configuration becomes a heavy task for the user or vendor when an intelligent video analysis system processes a large number of surveillance cameras. For example, a system with 1,000 cameras that supports 10+ analysis functions takes more than 160 hours in total to configure (assuming one person spends 10 minutes configuring one camera).
- Once the user changes the position of a camera or modifies its view angle even slightly, the user has to re-initialize the necessary detection regions and parameters for that camera.
- The problems above make it difficult to support PTZ cameras, because such a camera may change its view angle or zoom at any time, which invalidates the previous configuration.
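The configuration effort quoted in the first item can be checked with a short calculation. The sketch below uses the figures from the text (1,000 cameras, 10 minutes per camera); the 8-hour working day used to convert hours into days is an added assumption.

```python
def total_config_hours(num_cameras: int, minutes_per_camera: float) -> float:
    """Total manual configuration time in hours for a single pass over all cameras."""
    return num_cameras * minutes_per_camera / 60.0


hours = total_config_hours(num_cameras=1000, minutes_per_camera=10)

# Assumption (not from the source): one person works 8 hours per day.
working_days = hours / 8.0
```

This gives roughly 167 hours, or about 21 working days for one person, and the cost is incurred again whenever cameras are moved or re-aimed.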
Therefore, an automatic configuration technology that performs this initialization in place of manual configuration, and re-initializes whenever the camera status changes, would be highly useful. Figure 1 illustrates the problem a conventional video analysis system faces with a PTZ camera, and how the proposed automatic configuration technology solves it.
Figure 1 Comparison between a conventional system and a system with the proposed technology when applied to a PTZ camera
【 Technology 】
Figure 2 illustrates the framework of a video analysis system with the automatic configuration technology. Compared with a conventional video analysis system, the main difference is the introduction of an automatic configuration module, which defines the specific detection regions and the necessary computation parameters in place of human users.
Figure 2 Framework of video analysis system with automatic configuration
Key features of the technology are as follows:
1. Deep-learning-based adaptive scene parsing technology
This technology applies a different inference model to parse the scene image depending on recognized basic scene information, such as a normal sunny scene or a rainy/dark scene. This helps segment areas more precisely under different lighting conditions. In addition, the technology assigns different precision priorities to different parsed regions. For example, it focuses on accurate segmentation of the lane boundaries between the road surface and the medians, while the parsed result for distant regions matters less. The purpose of this strategy is to generate accurate and valid detection regions for traffic event analysis. (Figure 3)
Figure 3 Example of adaptive scene parsing technology
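The two ideas in this feature, choosing a model by scene type and weighting regions by importance, can be sketched as follows. This is not the actual Fujitsu implementation. The brightness-based scene recognizer, the threshold, the priority weights, and all function names are assumptions made for illustration, and the inference models are passed in as plain callables.

```python
from typing import Callable, Dict, List

Image = List[List[int]]     # grayscale pixels, 0..255
LabelMap = List[List[str]]  # one region label per pixel


def classify_scene(image: Image, dark_threshold: float = 80.0) -> str:
    """Stand-in scene recognizer (assumption): uses mean brightness only."""
    pixels = [p for row in image for p in row]
    mean = sum(pixels) / len(pixels)
    return "dark" if mean < dark_threshold else "sunny"


def parse_adaptively(
    image: Image, models: Dict[str, Callable[[Image], LabelMap]]
) -> LabelMap:
    """Run the inference model that matches the recognized scene type."""
    return models[classify_scene(image)](image)


# Hypothetical priorities: lane and median matter most, distant sky least.
PRIORITY = {"lane": 3.0, "median": 3.0, "sky": 0.5}


def weighted_accuracy(pred: LabelMap, truth: LabelMap) -> float:
    """Pixel accuracy in which each pixel counts by its true region's priority."""
    total = 0.0
    correct = 0.0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            weight = PRIORITY.get(t, 1.0)  # unlisted regions get weight 1.0
            total += weight
            if p == t:
                correct += weight
    return correct / total
```

With a metric of this kind, an error on a lane or median pixel costs more than an error on a sky pixel, which reflects the stated preference for accurate lane boundaries over distant regions.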
2. A partial-processing and result-fusion technique to balance precision and speed
Scene parsing usually resizes the input image to a lower resolution to reduce both training and inference time. The drawback of down-sampling is that image details are lost, which harms parsing precision. A partial-processing and result-fusion technique is therefore employed: it processes parts of the original scene image one after another and fuses the parsing results of the parts into the final result (Figure 4). With this technique, it is possible in practice to meet the requirement of finishing automatic configuration for a PTZ camera within 10 seconds.
Figure 4 Example of partial-processing and result-fusion technology
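One way to picture partial processing and result fusion is tile-based parsing at full resolution. The sketch below is a simplified illustration under stated assumptions: the source does not say how the parts are chosen or how results are fused, so the overlapping tiles, the majority-vote fusion, and the toy `threshold_parser` (which stands in for a real segmentation model) are all hypothetical.

```python
from collections import Counter
from typing import Callable, List

Image = List[List[int]]     # grayscale pixels, 0..255
LabelMap = List[List[str]]  # one region label per pixel


def tile_starts(size: int, tile: int, stride: int) -> List[int]:
    """Start offsets of tiles of length `tile` that together cover [0, size)."""
    if stride > tile:
        raise ValueError("stride must not exceed tile, or pixels would be skipped")
    if size <= tile:
        return [0]
    starts = list(range(0, size - tile, stride))
    starts.append(size - tile)  # last tile sits flush with the image border
    return starts


def parse_by_parts(
    image: Image,
    parse_tile: Callable[[Image], LabelMap],
    tile: int = 4,
    stride: int = 2,
) -> LabelMap:
    """Parse each tile at original resolution, then fuse by per-pixel majority vote."""
    height, width = len(image), len(image[0])
    votes = [[Counter() for _ in range(width)] for _ in range(height)]
    for top in tile_starts(height, tile, stride):
        for left in tile_starts(width, tile, stride):
            patch = [row[left:left + tile] for row in image[top:top + tile]]
            labels = parse_tile(patch)
            for dy, label_row in enumerate(labels):
                for dx, label in enumerate(label_row):
                    votes[top + dy][left + dx][label] += 1
    return [[cell.most_common(1)[0][0] for cell in row] for row in votes]


def threshold_parser(patch: Image) -> LabelMap:
    """Toy stand-in for a segmentation model: bright pixels are sky, dark are road."""
    return [["sky" if p >= 128 else "road" for p in row] for row in patch]
```

Because each tile is parsed without down-sampling, fine details such as lane boundaries are preserved, at the cost of running the model several times per image. The tile size and stride trade precision against speed.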
In addition to the key features described above, other techniques support the automatic configuration application, including a highly efficient camera status detection technique that controls when automatic configuration is activated, and generation of key parameters from the parsed result and environmental information.
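The source does not describe how camera status detection works. As a rough illustration of the idea, the sketch below flags a view change when the current frame differs from a stored reference frame for several consecutive frames. The frame-difference measure, the threshold, the `patience` parameter, and the class name are all assumptions, not the method used in the product.

```python
from typing import List

Image = List[List[int]]  # grayscale pixels, 0..255


def mean_abs_diff(a: Image, b: Image) -> float:
    """Mean absolute pixel difference between two frames of equal size."""
    total = 0
    count = 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            total += abs(pa - pb)
            count += 1
    return total / count


class ViewChangeDetector:
    """Hypothetical trigger for re-running automatic configuration."""

    def __init__(self, reference: Image, threshold: float = 30.0, patience: int = 3):
        self.reference = reference  # frame captured at the last configuration
        self.threshold = threshold  # difference that counts as a changed view
        self.patience = patience    # consecutive changed frames required
        self._streak = 0

    def update(self, frame: Image) -> bool:
        """Return True when the view has differed for `patience` frames in a row."""
        if mean_abs_diff(self.reference, frame) > self.threshold:
            self._streak += 1
        else:
            self._streak = 0
        if self._streak >= self.patience:
            self.reference = frame  # new baseline after reconfiguration
            self._streak = 0
            return True
        return False
```

Requiring several consecutive changed frames keeps a passing truck or a brief lighting flicker from triggering a full reconfiguration, while a sustained pan or zoom still does.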
【Result】
In tests with images from traffic surveillance cameras, the proposed technology achieves nearly 90% parsing accuracy for general traffic scenes, in which the lane area occupies the major portion of the surveillance image. For one camera that launches 10+ analysis functions, it currently takes only 30 seconds to generate all necessary configuration data with a GTX-1080Ti GPU and no manual operation.
【Future Plans】
FRDC is working to increase both the precision and the performance of this technology, and is conducting field trials in Chinese cities. Next, by combining this technology with the Fujitsu Traffic Image Analysis service, Fujitsu Limited’s image analysis technology for intelligent traffic, Fujitsu will be able to significantly improve the scalability of its intelligent traffic image analysis service by reducing the labor cost of large-scale video surveillance projects and by introducing technical support for PTZ cameras. A Traffic Image Analysis service that integrates the proposed technology is expected to be released in the first half of the next fiscal year.
【Note 1】