Addressing the problem of stream data processing in conventional systems
Amid the rapid progress of IoT technologies, companies across many industries increasingly need to analyze and process, in real time, the large volumes of data sent from sensor devices so that they can quickly put it to use in services. In the automobile industry in particular, as the wave of “CASE” (Connected, Autonomous, Shared & Services, Electric) sweeps through, the shift toward connected cars that make real-time use of the huge amounts of data sent from vehicles is accelerating. For example, automakers are designing and developing platforms that analyze vehicle speed and location data in real time to visualize road conditions. If a road is congested or an accident has occurred, drivers heading toward that road receive a warning message prompting them to change their route. In this way, considerable effort is going into systems that support and improve safe driving.
However, conventional data processing methods have a problem. Senior Manager Miwa Ueki, who leads R&D on cyber-physical systems (CPS), including IoT systems, in the Cyber Physical System Project of the Super Middleware Unit at Fujitsu Laboratories, explains:
“With connected cars, we need a parallel distributed system for stream data processing in order to detect the status of each vehicle and road at high speed from the large amount of data that moving vehicles send second by second. With a conventional system, however, we have to stop the target service every time we add or change even a small part of it. To provide information to self-driving cars, whose operation cannot be suspended even for a moment, we needed a new technology that makes nonstop operation possible.”
Implementing a highly flexible data processing platform by combining various insights
To solve this problem, members of the Cyber Physical System Project began developing a stream data processing architecture that makes it possible to add or change service content without stopping ongoing data processing. Ms. Ueki adds that the theme was first taken up by a group engaged in IoT system development at Fujitsu Laboratories, which worked out the concept. However, the group struggled to find an optimal solution and could not make the progress it had intended.
At that point, a group working on software development, also at Fujitsu Laboratories, introduced us to a technology that helped us make a breakthrough. Mr. Kota Itakura of the Cyber Physical System Project explains:
“We were told about Apache Flink, an open source distributed stream data processing engine. We set up a cross-disciplinary team of researchers with different specialties and started developing this unique technology on top of that engine. By bringing together diverse in-house insights, we developed ‘Dracena (Dynamically-Reconfigurable Asynchronous Consistent EveNt-processing Architecture)’.”
Dracena is a data processing platform that makes it possible to add or change data processing programs dynamically, as plug-ins, while large-volume real-time data processing and the target service continue to run. It is no longer necessary to stop the target system to add a new function, or to fall back on batch processing for data analysis, as before. This makes stream data processing highly extensible and gives developers the flexibility to add new services. Ms. Ueki adds, “In particular, when we try to build a service using sensor data, we have to take an agile development approach, repeating trial and error by changing processes or parameters and checking the results. Dracena is a very effective platform for utilizing real-time data because it supports both agile development and nonstop operation.”
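The idea of plugging in a processing program while events keep flowing can be illustrated with a small Python sketch. This is not Dracena's or Apache Flink's API: the class `HotPluggablePipeline`, its methods, and the event fields `vehicle_id` and `speed_kmh` are all hypothetical names chosen for illustration, and a single-process toy says nothing about how a distributed platform achieves the same effect.

```python
import threading
from typing import Any, Callable, Dict

Event = Dict[str, Any]
Processor = Callable[[Event], None]


class HotPluggablePipeline:
    """Toy event pipeline whose processors can be added while events flow.

    Hypothetical illustration only; not how Dracena is implemented.
    """

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._processors: Dict[str, Processor] = {}

    def plug_in(self, name: str, processor: Processor) -> None:
        # Build a new dict and swap the reference, so an event that is
        # already being processed keeps the processor set it started with.
        with self._lock:
            updated = dict(self._processors)
            updated[name] = processor
            self._processors = updated

    def process(self, event: Event) -> None:
        # Take a snapshot of the current processors, then run each one.
        snapshot = self._processors
        for processor in snapshot.values():
            processor(event)


def make_overspeed_detector(alerts: list, limit_kmh: float) -> Processor:
    """Return a processor that records vehicles exceeding limit_kmh."""
    def detect(event: Event) -> None:
        if event["speed_kmh"] > limit_kmh:
            alerts.append(event["vehicle_id"])
    return detect
```

In this sketch, a caller could register a processor that records speeds, process some events, then call `plug_in` with `make_overspeed_detector(alerts, 100.0)` and continue processing without recreating the pipeline. Only events that arrive after the call are seen by the new processor.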
Moreover, the wide variety of IoT data in the real world has so far tended to be collected separately for each use or system, which made it difficult to link such data across different services at high speed. Dracena, by contrast, takes special measures to retain the state of the data and of its processing. This makes it possible to handle data in stream processing on a per-object basis, such as a “person, thing, or event,” and to share all processing results, which improves its usability across various services.
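As a rough illustration of handling data on a per-object basis and sharing results between services, the following Python sketch keeps one state record per object and lets several services read and write it. The names `ObjectStateStore`, `location_service`, `speed_service`, and `congested_roads`, the event fields, and the 20 km/h threshold are assumptions made for this example; the article does not describe Dracena's actual state handling at this level.

```python
from collections import defaultdict
from typing import Any, Dict, List


class ObjectStateStore:
    """Keeps one state record per real-world object (hypothetical sketch)."""

    def __init__(self) -> None:
        self._state: Dict[str, Dict[str, Any]] = defaultdict(dict)

    def update(self, object_id: str, **fields: Any) -> None:
        # Merge new fields into the record for this object.
        self._state[object_id].update(fields)

    def all_states(self) -> List[Dict[str, Any]]:
        # Return copies so readers cannot modify the stored records.
        return [dict(state) for state in self._state.values()]


def location_service(store: ObjectStateStore, event: Dict[str, Any]) -> None:
    """Service 1: records which road each vehicle is on."""
    store.update(event["vehicle_id"], road=event["road"])


def speed_service(store: ObjectStateStore, event: Dict[str, Any]) -> None:
    """Service 2: records the latest speed of each vehicle."""
    store.update(event["vehicle_id"], speed_kmh=event["speed_kmh"])


def congested_roads(store: ObjectStateStore,
                    threshold_kmh: float = 20.0) -> List[str]:
    """Service 3: combines the results of services 1 and 2 per vehicle."""
    speeds_by_road: Dict[str, List[float]] = defaultdict(list)
    for state in store.all_states():
        if "road" in state and "speed_kmh" in state:
            speeds_by_road[state["road"]].append(state["speed_kmh"])
    return sorted(
        road for road, speeds in speeds_by_road.items()
        if sum(speeds) / len(speeds) < threshold_kmh
    )
```

The design point is that the third service does not need its own copy of the raw data: it reads the per-vehicle records that the other two services have already produced.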
Using Dracena, Fujitsu Laboratories ran a simulation of a system that receives a huge amount of data, such as car speeds and locations, from 1 million vehicles (objects) per second, and checked system performance for the case where a new service or function (such as detecting a vehicle that has stopped suddenly) is added. The results confirmed that even when a new data processing program is added while data processing is under way, the delay in service operation can be kept within 5 ms on average.
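A function such as detecting a vehicle that has stopped suddenly could, under simple assumptions, compare each new speed reading with the previous reading from the same vehicle. The Python sketch below is only an illustration: the class name `SuddenStopDetector`, its parameters, and the threshold values are hypothetical and are not taken from the simulation described above.

```python
from typing import Dict, Tuple


class SuddenStopDetector:
    """Flags a vehicle whose speed drops from fast to near zero between
    two consecutive readings (hypothetical sketch, assumed thresholds)."""

    def __init__(self,
                 min_prior_speed_kmh: float = 40.0,
                 stop_speed_kmh: float = 5.0,
                 max_gap_s: float = 2.0) -> None:
        self.min_prior_speed_kmh = min_prior_speed_kmh
        self.stop_speed_kmh = stop_speed_kmh
        self.max_gap_s = max_gap_s
        # vehicle_id -> (timestamp in seconds, speed in km/h)
        self._last: Dict[str, Tuple[float, float]] = {}

    def observe(self, vehicle_id: str, timestamp_s: float,
                speed_kmh: float) -> bool:
        """Record a reading and return True if it looks like a sudden stop."""
        previous = self._last.get(vehicle_id)
        self._last[vehicle_id] = (timestamp_s, speed_kmh)
        if previous is None:
            return False  # first reading for this vehicle
        prev_time_s, prev_speed_kmh = previous
        return (
            timestamp_s - prev_time_s <= self.max_gap_s
            and prev_speed_kmh >= self.min_prior_speed_kmh
            and speed_kmh <= self.stop_speed_kmh
        )
```

Because the detector only keeps the last reading per vehicle, its memory use grows with the number of vehicles rather than with the number of events, which matters when readings arrive every second.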
Continuing R&D with a focus on “speed” and “user-friendliness”
Dracena is being developed by engineers with a high level of expertise. Mr. Itakura leads the development and implementation of its core system.
He describes the team's current initiatives and future plans: “Dracena is basically a platform for real-time stream data processing, so the application programs that run on it are also very important for making use of the processing results. For that reason, we are also putting emphasis on developing a framework that makes application development easy, as well as APIs for connecting services, with an eye to making Dracena user-friendly for both application developers and end users.”
Mr. J. Michaelis is improving the system from a different angle. He is an engineer who previously worked on database and network development.
“I am fixing and modifying the source code written by Itakura-san to improve the system's speed and performance. As a stream data processing platform, Dracena must never stop operating. So I run tests from various angles to find bugs and fix them.”
Mr. Takafumi Onishi, another member of the Cyber Physical System Project, runs repeated tests together with Mr. Michaelis to enhance the system. He describes his role: “I check the performance of the data processing systems connected to Dracena and tune them by changing configuration parameters, so that they can process much larger amounts of data at high speed.”
As mentioned above, a distinctive feature of this project is that many experts with varied backgrounds in fields such as software and IoT collaborate on the development of Dracena. Mr. Onishi says, “Although this is only my second year at Fujitsu Labs, I am enjoying my work as a member of this cutting-edge technology project alongside many researchers with deep expertise.”
Starting with the mobility field and aiming to expand into other application areas
Because it was difficult for conventional stream data processing systems to perform parallel processing, to link data from different systems, or to add or change services, they ended up as vertically integrated, siloed systems whose analysis results could be used only for a specific service. Dracena, by contrast, enables an agile and flexible development approach: first implement a simple system for data analysis, then add new services one after another.
Development of Dracena first targeted the mobility field, which requires real-time processing of large amounts of data, nonstop system operation, and frequent addition or change of services. It is now incorporated as a key technology into a Fujitsu product called “Future Mobility Accelerator.” We have already started a joint project with Autonomic, a U.S. mobility service company and a subsidiary of Ford Motor Company. We are also advancing paid PoC (Proof of Concept) projects with automobile makers in and outside Japan.
Connected cars have great potential to create new services, such as detecting signs of drunk driving from steering data, predicting crosswind strength at tunnel exits in combination with map data, or finding illegally parked vehicles in image data.
Although Dracena currently targets the mobility field, it can be applied to other fields as well. Going forward, the use of IoT will expand rapidly into various fields, and processing and utilizing the huge amounts of data collected will pose more challenges than it does now. To avoid that situation, we will need a system that collects high-quality data appropriately, and this will be a turning point for new forms of data utilization.
Ms. Ueki concluded by saying, “In the future, we plan to use Dracena for all kinds of real-time services that address real-world problems, such as supporting economical use of home appliances, watching over elderly people who need care, and providing route guidance at events or during disasters.”