Features

Hadoop

FUJITSU Software Interstage Big Data Parallel Processing Server integrates the open source software Apache Hadoop with Fujitsu proprietary technologies. Apache Hadoop enables fast batch processing of Big Data by distributing the data across servers, anywhere from a few to thousands, for parallel processing.
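As a concrete illustration of the MapReduce model, below is a minimal word-count job written against the standard Apache Hadoop Java API. This is generic Hadoop code, not Interstage-specific; the input and output paths are supplied on the command line.

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Mapper: emits (word, 1) for every word in each input line.
  public static class TokenizerMapper
      extends Mapper<Object, Text, Text, IntWritable> {
    private final static IntWritable one = new IntWritable(1);
    private final Text word = new Text();

    @Override
    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, one);
      }
    }
  }

  // Reducer: sums the counts emitted for each word across all mappers.
  public static class IntSumReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {
    private final IntWritable result = new IntWritable();

    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(new Configuration(), "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

The framework splits the input among the slave servers, runs the mapper on each split in parallel, and aggregates the per-word counts in the reducers.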

Fujitsu has verified Hadoop together with various related products to confirm secure and reliable Big Data processing.

High Performance

Conventional Apache Hadoop requires application data to be transferred into its dedicated file system, HDFS (Hadoop Distributed File System), before MapReduce processing can begin.

Similarly, result data must be retrieved from HDFS before applications can use the data processed by Hadoop. This bidirectional data transfer can add significant overhead to total data processing time.
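The round trip looks roughly like the following sketch, which uses Apache Hadoop's standard FileSystem Java API; the file paths are hypothetical and stand in for whatever data an application produces and consumes.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsRoundTrip {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf); // connects to the cluster's default file system (HDFS)

    // 1. Copy input data from the application's file system into HDFS.
    fs.copyFromLocalFile(new Path("/data/input/sales.csv"),
                         new Path("/user/hadoop/input/sales.csv"));

    // ... run the MapReduce job here ...

    // 2. Copy the job's results back out of HDFS so other applications can use them.
    fs.copyToLocalFile(new Path("/user/hadoop/output/part-r-00000"),
                       new Path("/data/output/result.txt"));

    fs.close();
  }
}

Both copy steps move the full data set over the network, which is the overhead the Fujitsu Distributed File System is designed to avoid.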

The Fujitsu proprietary Distributed File System, which is part of Interstage Big Data Parallel Processing Server, gives Hadoop direct access to data on the storage system during processing.

Its built-in memory cache feature allocates cache memory to slave servers effectively, contributing to faster data processing.

This allows Apache Hadoop and applications to share data, eliminating the overhead of transferring it to and from HDFS.

High Availability

Interstage Big Data Parallel Processing Server clusters the Hadoop master server using Fujitsu technologies with a strong track record in mission-critical systems, such as PRIMECLUSTER. This eliminates conventional Apache Hadoop's single point of failure, where a master-server failure could cause data loss and halt data processing.

High Operability

The Smart Setup feature of Interstage Big Data Parallel Processing Server makes installation easy.

Once system parameters such as IP addresses have been prepared, most installation and setup operations run automatically with minimal manual intervention. Smart Setup also simplifies scale-out: additional Hadoop slave servers are brought online by deploying a master image to them, rather than installing and configuring each slave server from scratch.

The Fujitsu Distributed File System allows disks to be added dynamically to a running system, so the storage system can be scaled up as the volume of data to be processed grows.

It also lets applications access data through the standard Linux file access interface (POSIX compliant), with no need to tailor applications to HDFS. The result is simpler operations for Big Data processing.
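Because the file system is POSIX compliant, a plain Java program can read the shared data with standard file I/O, as in the minimal sketch below; the mount point /mnt/bdpps is a hypothetical example, since any POSIX path works.

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

public class PosixAccess {
  public static void main(String[] args) throws IOException {
    // Hypothetical mount point of the shared Distributed File System.
    Path input = Paths.get("/mnt/bdpps/data/sales.csv");

    // Plain java.nio file I/O; no Hadoop FileSystem API or HDFS URI is needed.
    try (Stream<String> lines = Files.lines(input)) {
      lines.limit(10).forEach(System.out::println);
    }
  }
}

The same data remains directly accessible to Hadoop jobs, so existing applications and tools can keep their standard file-based workflows.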

Main features of Interstage Big Data Parallel Processing Server V1.0

Category           Features
Hadoop             Hadoop (common/MapReduce/HDFS); Hive; Pig; HBase
High Availability  Master server clustering; external SAN storage system with Fujitsu proprietary Distributed File System
High Performance   Sharing data with existing systems; fast file system; fast file access with memory cache; high-performance Java VM
High Operability   Smart Setup (installation, scale-out); use of existing data tools (backup, etc.); management tool (CLI); dynamic disk addition
