Solving the Initial Challenges of AI Video Analysis
As one of the key machine learning technologies, deep learning has been credited with unlocking many important new developments, paving the way for the third AI boom. Coupled with dramatic advances in computing performance, it has accelerated a host of remarkable new developments, including the advent of image recognition technology. With myriad implementations in various fields, the range of applications is expanding all the time as it becomes possible to widen the scope of feature extraction and accurate identification from images. But there are still significant challenges involved in applying image recognition technology to real-world business applications.
The Fujitsu Laboratories team behind “Actlyzer” started out with the aim of solving some of these technical issues. AI recognition accuracy relies on collecting a large amount of data and correctly labeling these data in advance. This in turn involves a major overhead in terms of time and effort, and can be cost-prohibitive for many applications – particularly in the case of video data, where collecting the large amount of correctly labeled data required can be difficult.
Sho Iwasaki of the Digital Innovation Core Unit explains Fujitsu Laboratories’ initial R&D approach to creating an AI model that would accurately recognize video-based human behavior:
"Collecting the volume of correctly labeled data required and evaluating the trained model for this type of application accounts for some 95% of the total project time and cost, which in itself presented some major early difficulties. We also found early on that the resultant models were very limited in terms of versatility and difficult to scale horizontally. In order to apply them to different environments and applications, we had to tailor the learning data specifically to each one in order to train a new model, increasing the time overhead and associated costs considerably.”
These were the driving forces behind Fujitsu Laboratories’ initiative to develop a new technology that could enable the recognition and analysis of human behavior without the need for large volumes of learning data. The research and development that led to Actlyzer commenced in 2018, and was completed just a year later.
Accurate behavior analysis from basic action combinations
How does Actlyzer accomplish accurate behavioral recognition?
Collecting images as learning data for the creation of AI recognition models is not in itself very difficult, particularly for simple actions such as "running", "walking", and "being stationary". Because these actions appear throughout a wide range of video data, labels can be added whenever a target action for learning appears. However, it is considerably more difficult when it comes to complex behavior recognition, where several actions are combined, such as “walking backwards and forwards while looking around”.
Actlyzer solves this problem with a novel approach. The first step involved creating an AI model capable of accurately recognizing each of the around 100 basic human actions that constitute complex behaviors. We then built on this by combining these basic actions with other information, such as the order, the location, and the target of each action. Overall, the development of Actlyzer spanned a wide cross-section of researcher skillsets from multiple disciplines, reflecting the complexities involved in creating this important new technology.
One of the team members was Sho Iwasaki, who had been with Fujitsu Laboratories for three years. He focused on researching the elements of basic human actions, such as posture and different movement types. As a student, Sho focused on Virtual Reality (VR) and Augmented Reality (AR), gaining an in-depth knowledge of motion capture and related technologies involving human motion digitization. His expertise in this domain made him the ideal candidate for the Actlyzer development project.
Sho explains: "Working with our colleagues from the Fujitsu Research and Development Center (FRDC) in China, we collected up to a million video and still images that depicted various angles and the movement patterns of different human body parts. We also used motion-capture data to augment the AI model learning process. This is what helped us to create Actlyzer's basic operational performance.”
Another team member was Yuka Sugimura, a researcher with experience in time-series data analysis and knowledge processing. She was responsible for developing the rules behind recognizing complex actions, using combinations and sequences of basic actions as the basis.
Yuka explains: "As one example, we developed a technology that can recognize suspicious behavior. It combines basic human actions such as 'being in front of the door', 'sitting', 'looking at the keyhole', and 'putting one's hand in the keyhole' into a time-series pattern and creates a specific rule. This allows you to adjust the recognition accuracy simply by changing selected parameters, such as adding another action condition or specifying a duration for a specific action. As a result, it is now much easier to apply Actlyzer to many different business applications, irrespective of the industry sector.”
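The time-series rule Yuka describes can be sketched as an ordered pattern matched against a stream of recognized basic actions. The following is a minimal illustrative sketch, not Actlyzer's actual implementation: the `Action` records, labels, and the `matches_rule` helper are assumptions for demonstration.

```python
from dataclasses import dataclass

# Hypothetical event stream: what a basic-action recognizer might emit.
@dataclass
class Action:
    label: str
    start: float  # seconds
    end: float

def matches_rule(events, pattern, min_durations=None):
    """Check whether `pattern` (an ordered list of basic-action labels)
    occurs as a time-ordered subsequence of `events`. Optional per-label
    minimum durations tighten the rule, e.g. requiring that 'looking at
    the keyhole' lasts at least a few seconds."""
    min_durations = min_durations or {}
    i = 0  # index of the next pattern element to match
    last_end = float("-inf")
    for ev in sorted(events, key=lambda e: e.start):
        if i == len(pattern):
            break
        wanted = pattern[i]
        long_enough = (ev.end - ev.start) >= min_durations.get(wanted, 0.0)
        if ev.label == wanted and ev.start >= last_end and long_enough:
            last_end = ev.end
            i += 1
    return i == len(pattern)

# The suspicious-behavior rule from the text, as an ordered pattern:
rule = ["in front of the door", "sitting", "looking at the keyhole",
        "putting hand in the keyhole"]

stream = [
    Action("in front of the door", 0.0, 10.0),
    Action("sitting", 10.0, 14.0),
    Action("looking at the keyhole", 14.0, 19.0),
    Action("putting hand in the keyhole", 19.0, 22.0),
]

print(matches_rule(stream, rule, {"looking at the keyhole": 3.0}))  # True
```

Tuning recognition accuracy then amounts to editing the pattern or the duration parameters, rather than retraining a model.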
Actlyzer in Action: effective operation across multiple applications
Before Actlyzer’s official launch in November 2019, it was put to the test in a wide cross-section of different field trials, with our experienced solution development researchers teaming up with the Fujitsu business divisions.
Chisato Ishikawa focused on developing solutions specifically designed for manufacturing sites, explaining: "We wanted to demonstrate how Actlyzer could be applied to improve work quality and efficiency at production sites in manufacturing. This involved showing how real-time images from cameras installed across a site can be used to measure working hours, check work procedures, and analyze workers' behavior.”
Another important benefit is accident prevention, as a result of Actlyzer detecting dangerous postures and behaviors, such as lifting heavy objects without bending one’s knees. Chisato elaborates: “By customizing Actlyzer's basic behavior rules, you avoid the problems associated with 'Blackbox AI' and promote transparent, explainable AI performance.”
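A dangerous-posture check like the one described, a lift performed without bending the knees, can be expressed as simple geometry on pose keypoints. This is an illustrative sketch under assumed inputs (2D hip/knee/ankle coordinates and a hypothetical angle threshold), not Actlyzer's actual rule.

```python
import math

def angle(a, b, c):
    """Angle at joint b (degrees) formed by points a-b-c in image coordinates."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    return math.degrees(math.acos(dot / (math.hypot(*v1) * math.hypot(*v2))))

def unsafe_lift(hip, knee, ankle, knee_angle_threshold=150.0):
    """Flag a lift performed with nearly straight legs: a knee angle close
    to 180 degrees suggests bending at the waist instead of the knees.
    The threshold is an assumed value for illustration."""
    return angle(hip, knee, ankle) > knee_angle_threshold

# Straight-legged posture: hip, knee, and ankle almost collinear.
print(unsafe_lift((100, 50), (102, 120), (104, 190)))   # True
# Squatting posture: knee sharply bent.
print(unsafe_lift((100, 100), (160, 120), (110, 160)))  # False
```

Because the rule is an explicit, inspectable threshold on a named quantity, a site manager can see exactly why a posture was flagged, which is the transparency benefit Chisato highlights.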
Team member Megumi Chikano’s specialization lies in crime prevention. Despite the number of surveillance cameras installed in cities, they are not always used effectively for crime prevention. In order to detect suspicious behavior in real time and prevent crime, you need an AI model that can effectively recognize suspicious behavior. The problem, however, was the virtual impossibility of collecting the large amount of suspicious behavior-related video data needed to create the AI model.
Megumi explains: "Based on typical patterns of suspicious individuals, such as wandering or looking around, we were able to combine basic actions to develop a rule for suspicious behavior, enabling its accurate recognition even without learning data. Importantly, Actlyzer can go one step further to prevent crime. By combining object recognition with human behavior, it is possible to recognize more complex behaviors, such as operating a mobile phone or making a call in front of a bank ATM.”
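Combining object recognition with human behavior, as in the ATM example above, can be sketched as a joint condition over per-frame detections. The frame format, labels, and persistence threshold below are illustrative assumptions, not Actlyzer's actual interface.

```python
# Hypothetical per-frame observations, as separate object- and
# action-recognizers might produce them.
def suspicious_at_atm(frames, min_consecutive=30):
    """Flag a person operating a phone while in front of an ATM for at
    least `min_consecutive` consecutive frames (roughly 1 s at 30 fps),
    so that a momentary glance at a phone is not flagged."""
    run = 0
    for frame in frames:
        if "atm" in frame["objects"] and "operating phone" in frame["actions"]:
            run += 1
            if run >= min_consecutive:
                return True
        else:
            run = 0  # the joint condition was broken; restart the count
    return False

# 40 consecutive frames where both conditions hold:
frames = [{"objects": {"atm"}, "actions": {"operating phone"}}] * 40
print(suspicious_at_atm(frames))  # True
```

The same pattern, object context plus action plus persistence, extends to other rules, such as loitering near a door, without collecting any new training data.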
Another future application for Actlyzer involves analyzing customers’ purchasing behavior at retail stores and checking staff responses.
Actlyzer – Advancing AI Performance
We have only just started to explore Actlyzer’s potential, and are working on a range of new applications. These include studying its practical application within "Fujitsu Human Centric AI Zinrai", which systemizes Fujitsu's AI technology, as a support service for the utilization of Fujitsu's AI. We are also working on a joint collaboration to enhance the AI image analysis solution "Fujitsu Technical Computing Solution GREENAGES Citywide Surveillance".
Megumi explains: "By applying pre-trained AI models, Actlyzer can reduce both the cost and the lead time of AI deployment significantly. Another major development is that, in order to make the most of surveillance camera images, we have succeeded in improving efficiency significantly across multiple image processing functions. The net result is that it is much easier to implement Actlyzer and develop new business applications. We are also looking at how to overcome camera limitations that impact image quality at some sites, using peripheral technologies such as image sharpening to improve recognition accuracy.”
Yuka expands: "It’s important to improve the user interface and usability, making rule creation and parameter changes really intuitive. We are also working on a variety of other issues, including the expansion of industry-specific rules for diverse business areas and how best to recognize the behavior of multiple people at once."
AI is increasingly touching all of our lives, and at Fujitsu Laboratories, our goal is to create AI solutions that can easily be used by anyone. Actlyzer will play an important role as we investigate new applications in the future.
(All titles, numerical values, names and trademarks etc described in this article are accurate at the time of publishing.)