# Single-Chip Baseband Signal Processor for Software-Defined Radio

Seiichi Nishijima, Miyoshi Saito, Iwao Sugiyama

(Manuscript received September 30, 2005)

Reconfigurable processor technology offers a way to couple significant hardware performance improvements with real-time software signal processing. This technology enables software-defined radio (SDR) to be realized on a system-on-a-chip (SoC) platform. In this paper, we describe an SDR SoC LSI that is suitable for use in programmable wireless communication systems. The LSI has two advanced features. First, its hybrid architecture consists of reconfigurable signal processors and accelerator circuits. These accelerators are parametric circuits essential for baseband processing. Second, the reconfigurable processing elements have a cluster structure that improves the mapping efficiency and minimizes the processing time. We also describe a prototype SDR system that uses this LSI to perform software-defined IEEE802.11a and 11b communications.

# 1. Introduction

For mobile communication systems beyond the third generation, it will be necessary to realize a smart wireless terminal that can operate under several communication systems. Software-defined radio (SDR) is a progressive technology for realizing this goal. An SDR system features multiple standards, multiple bands, seamless mode/band transitions, and software programmability.<sup>1)</sup> A variety of SDR prototypes have been demonstrated that use FPGAs, DSPs, and general-purpose reconfigurable processors.<sup>2)-4)</sup> We have developed a single-chip solution for the baseband signal processing of an SDR system. This SDR SoC LSI (SDR LSI, hereafter) integrates several accelerator modules that can be applied to different wireless standards together with highly effective and powerful reconfigurable processing cores.<sup>5)</sup> This single-chip solution will make it possible to produce small radio terminals that have better flexibility and programmability than hardware solutions. In this paper, we present the architecture and features of this LSI and a prototype system for future smart SDR systems.

# 2. Features of SDR LSI

# 2.1 Hybrid architecture

Although the flexibility of reconfigurable processors inevitably produces redundancy, we reduced the area and power consumption of the LSI by using the accelerator modules.

The LSI has a hybrid architecture containing reconfigurable processors<sup>5)</sup> and accelerator modules. Common functions of the major wireless standards, for example, fast Fourier transformation, forward error correction, and finite impulse response (FIR) filtering, are integrated into the accelerators using a parametric structure.

**Figure 1** shows the block diagram of the LSI. The LSI consists of three reconfigurable signal processors (RSPs), three accelerators, and five additional hardware circuits. These blocks are connected using crossbar type data networks and controlled by a central processing unit (CPU).

Figure 1 Block diagram of SDR LSI.

By changing the network configuration, programs of the reconfigurable processors, and accelerator parameters, the LSI can be reconfigured to operate with many wireless communication systems.

# 2.2 RSPs

The RSPs provide the core features of powerful processing and high flexibility. In this work, we used a coarse-grained RSP designed by Fujitsu. This RSP has the advantages of a short latency and an area-effective mapping architecture.<sup>5)</sup>

**Figure 2** shows the structure of the RSP. To reduce the operation latency, the RSP is designed to minimize the physical data transfer delay between processor elements (PEs) by dividing a large PE array into small reconfigurable logic cores called clusters. The RSP also includes macro elements that contain a divider, square-root calculator, and arctangent tables. These macro elements are also effective for reducing the latency.

The RSP cluster has a sequencer for configuring the PEs and the networks between PEs. This sequencer reduces the number of PEs that are needed for signal processing. **Figures 3 (a)**, **3 (b)**, and **3 (c)** show an example of cyclic reconfiguration using this sequencer. Figure 3 (a) shows a data processing flow with four inputs and outputs and eight processing steps that uses 30 PEs. Figures 3 (b) and 3 (c) show the data processing flow for cyclic reconfiguration, in which the sequencer alternates operation between the odd and even sequences. As shown in these figures, cyclic reconfiguration enables signal processing to be performed using half of the PE resources required for ordinary data processing. Therefore, this architecture makes the mapping more area-effective.




Figure 2 Structure of reconfigurable signal processor (RSP). PE: processing element.

Figure 3 Example of cyclic reconfiguration using sequencer.
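The cyclic reconfiguration described above can be illustrated with a small behavioural model. This is only a sketch under simplifying assumptions that are not taken from the paper: each processing step is modelled as a single PE (the figure uses 30 PEs for eight steps), and each physical PE in the cyclic mapping is assumed to hold exactly two configurations that the sequencer selects in turn. The function names `run_spatial` and `run_cyclic` are hypothetical.

```python
from typing import Callable, List, Tuple

Op = Callable[[int], int]


def run_spatial(steps: List[Op], x: int) -> Tuple[int, int]:
    """Ordinary mapping: every step has its own PE. Returns (result, PEs used)."""
    for step in steps:
        x = step(x)
    return x, len(steps)


def run_cyclic(steps: List[Op], x: int) -> Tuple[int, int]:
    """Cyclic mapping (assumed model): PE k holds steps[2k] and steps[2k + 1]."""
    if len(steps) % 2 != 0:
        raise ValueError("this sketch expects an even number of steps")
    pes = [(steps[i], steps[i + 1]) for i in range(0, len(steps), 2)]
    for odd_cfg, even_cfg in pes:
        x = odd_cfg(x)   # sequencer selects the odd sequence
        x = even_cfg(x)  # sequencer switches the same PE to the even sequence
    return x, len(pes)


# Eight illustrative processing steps (arbitrary arithmetic, for demonstration).
steps = [lambda v, k=k: v * 2 + k for k in range(8)]

spatial_result, spatial_pes = run_spatial(steps, 1)
cyclic_result, cyclic_pes = run_cyclic(steps, 1)
print(spatial_result == cyclic_result, spatial_pes, cyclic_pes)
```

Both mappings compute the same result, while the cyclic one uses half as many PEs in this model. The cost of time-sharing each PE between two configurations is not modelled here.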

# 2.3 Programmable state machine

The major wireless communication systems require a state transition unit to control their communication state transitions. To realize a software-defined state flow controller, we implemented a programmable, scalable state machine in the SDR LSI.

**Figure 4** shows the structure of the programmable state machine (PSM). The PSM consists of 16 state memories, 27 input events, and 27 output events. The conditions and flow of state transitions are defined and programmed in the state memories for each wireless communication system. **Figure 5** shows an example mapping of the state flow of the IEEE802.11a and IEEE802.11b standards. With this mapping, a radio system is realized using 9 states and 15 events.

The PSM has an extensible structure so the LSI can control the state transitions of multi-chip systems. This feature is described in Section 2.5.
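A table-driven state machine of the kind described in this section can be sketched as follows. The limits of 16 states and 27 input/output events come from the text. Everything else is an assumption made for illustration: the class name `ProgrammableStateMachine`, the encoding of a transition as a (state, input event) to (next state, output event) entry, and the example receive flow, whose state and event names are hypothetical and do not reproduce Figure 5.

```python
from typing import Dict, Optional, Tuple

MAX_STATES = 16   # state memories S00..S15, per the text
NUM_EVENTS = 27   # number of input events and of output events, per the text


class ProgrammableStateMachine:
    """Table-driven state machine whose transitions are loaded at run time."""

    def __init__(self, initial_state: int = 0) -> None:
        self.state = initial_state
        # (current state, input event) -> (next state, output event)
        self.table: Dict[Tuple[int, int], Tuple[int, int]] = {}

    def program(self, state: int, in_event: int,
                next_state: int, out_event: int) -> None:
        if not (0 <= state < MAX_STATES and 0 <= next_state < MAX_STATES):
            raise ValueError("state index out of range")
        if not (0 <= in_event < NUM_EVENTS and 0 <= out_event < NUM_EVENTS):
            raise ValueError("event index out of range")
        self.table[(state, in_event)] = (next_state, out_event)

    def step(self, in_event: int) -> Optional[int]:
        """Apply one input event; return the output event, or None if ignored."""
        entry = self.table.get((self.state, in_event))
        if entry is None:
            return None
        self.state, out_event = entry
        return out_event


# Hypothetical receive flow (names are illustrative, not from the paper).
IDLE, SYNC, DECODE = 0, 1, 2
EV_PREAMBLE, EV_HEADER_OK, EV_FRAME_END = 0, 1, 2
OUT_START_SYNC, OUT_START_DECODE, OUT_NOTIFY_MAC = 0, 1, 2

psm = ProgrammableStateMachine(IDLE)
psm.program(IDLE, EV_PREAMBLE, SYNC, OUT_START_SYNC)
psm.program(SYNC, EV_HEADER_OK, DECODE, OUT_START_DECODE)
psm.program(DECODE, EV_FRAME_END, IDLE, OUT_NOTIFY_MAC)
```

Switching to another wireless standard then amounts to loading a different transition table, which is the property the text attributes to the state memories.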

# 2.4 Accelerators

The common processing functions are performed using four types of accelerators: a fast Fourier transform (FFT), FIR, Viterbi decoder, and 32-stage programmable flip-flop-array module. These accelerator modules can perform the parametric processing needed to cover the modern mobile-wireless communication systems.

**Table 1** summarizes the functions of the accelerators.

Figure 4 Programmable state machine (PSM). Diagram labels: controller, state memories (16 states, S00 to S15) holding the condition and flow of state transitions, input events, and output events.

FUJITSU Sci. Tech. J., 42,2,(April 2006)

The FFT module executes $2^n$-point FFT and inverse FFT, where *n* is 6 to 13. This covers the 64 points of IEEE802.11a and the 2048 and 8192 points of IEEE802.16x (WiMAX) and future digital broadcasting standards.
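As a quick check of the parameter range above, the supported transform lengths can be enumerated. This is a minimal sketch; the helper names `supported_fft_sizes` and `is_supported` are hypothetical and do not correspond to any interface of the LSI.

```python
FFT_EXPONENTS = range(6, 14)  # n = 6 .. 13, as stated in the text


def supported_fft_sizes():
    """Return every 2**n transform length in the stated range."""
    return [2 ** n for n in FFT_EXPONENTS]


def is_supported(points: int) -> bool:
    return points in supported_fft_sizes()


print(supported_fft_sizes())
# [64, 128, 256, 512, 1024, 2048, 4096, 8192]
```

The 64-, 2048-, and 8192-point cases named in the text all fall inside this list.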

The Viterbi module decodes signals that are encoded with any set of the three generator-polynomial types (G0, G1, G2) shown in Table 1. The constraint length can be set to 7 or 9, and the coding rate to 1/2 or 1/3. These parameter ranges are suitable for the IEEE802.11a, 11b, W-CDMA, and WiMAX standards.

The programmable flip-flop-array module operates as a scrambler/descrambler, CRC circuit, or convolution encoder, depending on the array combination setting. The FIR module covers up to 32 filter taps.
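To make the generator-polynomial parameters concrete, the following sketch models a convolutional encoder in software. It is not the circuit of the flip-flop-array module. The bit-ordering convention (the most significant bit of each generator taps the current input bit) is an assumption, and the generator pair 133/171 (octal) with constraint length 7 is chosen here only as a familiar rate-1/2 example of the kind used in IEEE 802.11a. The function names are hypothetical.

```python
from typing import Iterable, List, Sequence


def parity(x: int) -> int:
    """Return 1 if x has an odd number of set bits, else 0."""
    return bin(x).count("1") & 1


def conv_encode(bits: Iterable[int], generators: Sequence[int],
                constraint_length: int) -> List[int]:
    """Encode bits with one output per generator for every input bit."""
    reg = 0  # shift register; the newest bit sits in the most significant position
    out: List[int] = []
    for b in bits:
        reg = ((b & 1) << (constraint_length - 1)) | (reg >> 1)
        out.extend(parity(reg & g) for g in generators)
    return out


# Impulse response for K = 7, rate 1/2, G0 = 133 and G1 = 171 (octal).
impulse = [1, 0, 0, 0, 0, 0, 0]
coded = conv_encode(impulse, (0o133, 0o171), 7)
print(coded)
# [1, 1, 0, 1, 1, 1, 1, 1, 0, 0, 1, 0, 1, 1]
```

With a single 1 followed by zeros, the two interleaved output streams reproduce the bits of G0 and G1, which is a convenient sanity check for any chosen polynomial setting. A rate-1/3 code is obtained by passing three generators.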

# 2.5 Multi-chip expansion and optimized crossbar networks

Although the LSI can be used as a single-chip solution for the baseband processing of SDR systems, we added a feature for expanding its I/O terminals from the crossbar data networks. This feature enables multi-chip processing so the system can be adapted to future wireless communication methods that require more processing power. The I/Os can be expanded to two pairs of three 16-bit data channels with a maximum transfer rate of 4.8 Gb/s. This high transfer rate will be sufficient for most wireless communication systems.

Figure 5 Example mapping of state flow of IEEE802.11a and IEEE802.11b standards.

Table 1 Functions of accelerators.

| Function | Sub-function | Parameters |
|----------|--------------|------------|
| FFT/IFFT | | 2<sup>n</sup> points, where <i>n</i> is 6 to 13 |
| Viterbi decoder | | Polynomial generators: G0 = 1 to 777 (octal), G1 = 1 to 777 (octal), G2 = 1 to 777 (octal)<br>Constraint lengths: 7 and 9<br>Coding rates: 1/2 and 1/3 |
| Programmable flip-flop-array | Scrambler/descrambler | Polynomial generators: X<sup>15</sup>+X<sup>14</sup>+1, X<sup>7</sup>+X<sup>6</sup>+1, X<sup>7</sup>+X<sup>4</sup>+1 |
| | CRC | 8- to 32-bit CRC |
| | Convolution encoder | Polynomial generators: G0 = 1 to 777 (octal), G1 = 1 to 777 (octal), G2 = 1 to 777 (octal)<br>Constraint lengths: 7 and 9<br>Coding rates: 1/2 and 1/3 |
| FIR filter | | Number of filter taps: up to 32 |

The signal networks between each module in the LSI must have a wide bandwidth. The data network in the LSI consists of four 16-bit crossbar channels with a maximum transfer rate of 6.4 Gb/s. Because the full-channel crossbar occupies a large area of the LSI, we divided it into three blocks based on an optimization calculation (**Figure 6**).
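The two transfer rates quoted in this section are consistent with a simple product of channel count, word width, and clock rate. The sketch below assumes that each channel moves one 16-bit word per clock cycle and that the networks run at 100 MHz, the maximum clock given in Section 3 for the accelerators and other signal processing units. Neither assumption is stated explicitly for the networks in the text, and the function name `link_rate_gbps` is hypothetical.

```python
def link_rate_gbps(channels: int, bits_per_channel: int, clock_mhz: float) -> float:
    """Aggregate rate in Gb/s, assuming one word per channel per clock cycle."""
    return channels * bits_per_channel * clock_mhz / 1000.0


ASSUMED_CLOCK_MHZ = 100  # assumption: networks run at the 100 MHz accelerator clock

crossbar_rate = link_rate_gbps(4, 16, ASSUMED_CLOCK_MHZ)   # four 16-bit channels
expansion_rate = link_rate_gbps(3, 16, ASSUMED_CLOCK_MHZ)  # three 16-bit channels

print(crossbar_rate, expansion_rate)
```

Under these assumptions the crossbar comes to 6.4 Gb/s and the expansion I/O to 4.8 Gb/s, matching the figures in the text.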

# 3. Specifications and evaluation board

The SDR LSI integrates 774 PEs, which operate at a maximum clock speed of 160 MHz and deliver a peak performance of 103 GOPS. The control CPU operates at 66 MHz, while the accelerators and other signal processing units operate at a maximum of 100 MHz. The PEs occupy 75% of the core area, while the other processing circuits, including the SRAM, occupy the remaining area. A photograph of the chip is shown in **Figure 7**. The chip is mounted on a 1156-pin flip chip ball grid array (FCBGA) package. Other specifications of the LSI are summarized in **Table 2**.

Figure 6 Calculated crossbar area versus number of crossbar blocks.

We constructed an evaluation board for this LSI (**Figures 8** and **9**). The board contains two SDR LSIs and three FPGAs. One of the FPGAs is used to interconnect the two SDR LSIs, and the other two perform media access control (MAC) and board control.

Figure 7 SDR LSI chip. RSP: reconfigurable signal processor.

Figure 8 Evaluation board. Labels: analog interface (Tx: transmitter, Rx: receiver) and external interface.

Table 2 SDR chip specifications.

| Item | Specification |
|------|---------------|
| CPU | ARM946 |
| Internal memory | 370 KB |
| External memory | Flash (16 MB), SDRAM (256 MB) |
| I/O for control | GPIO, UART, IRQ interface, control bus |
| Power supply | 1.2 V (I/O: 2.5 V) |
| Clock speed | ARM: 66 MHz<br>Accelerator circuits: Up to 100 MHz<br>Reconfigurable signal processors: Up to 160 MHz |
| Bit width of crossbar data networks | 16-bit × 4 channels |
| Bit width of expansion I/O | 16-bit × 3 channels |
| Package | 1156-pin FCBGA |
| Performance of RSPs | Up to 103 GOPS |
| Number of processing elements | 774 |
| Accelerators | FFT/IFFT<br>Viterbi decoder<br>Scrambler/descrambler<br>CRC<br>Convolution encoder<br>FIR filter |

Table 3 Specifications of evaluation board.

| Item | Specification |
|------|---------------|
| Number of SDR LSIs | 2 |
| Peak performance | 103 GOPS × 2 |
| Number of FPGAs | 3 |
| External interfaces | 16-bit parallel I/O, RS232C |
| Power supply | 24 V |
| Board size | Width: 35 cm, Length: 25 cm |
| Download time per wireless system | 20 ms |
| Number of downloadable wireless systems | Up to 7 |
| Reconfiguration time | 5 ms |

Up to seven configuration programs of the SDR can be stored in the flash memory on the board or downloaded from the external controller. The download time for configuration is 20 ms for each wireless system. The other specifications are summarized in **Table 3**.

We have developed example configurations for the IEEE 802.11a and IEEE 802.11b standards to demonstrate SDR operation on the board. In these cases, the resource usage was evaluated to fit within one SDR LSI. Using these example configurations on the board, we confirmed that these systems worked well with maximum throughputs of 43.0 Mb/s. The time needed for reconfiguration between these two standards was about 5 ms.

# 4. Conclusion

We have developed a baseband signal processing LSI for SDR systems. The LSI consists of reconfigurable signal processors, parametric accelerators, a programmable state machine, and supplemental hardware circuits. Because the LSI is highly flexible and can perform intensive processing with low latency, it can be used as a single-chip baseband solution for SDR. An additional expansion feature enables the LSI to control multiple chips for future wireless communication methods that require more processing power.

We used this LSI on an evaluation board to demonstrate an SDR system. We also developed configurations for two wireless communication standards for this evaluation board.

## Acknowledgements

We thank all the experts of the SDR LSI development team for their professional support and discussions. This work was supported by the National Institute of Information and Communications Technology of Japan.

### References

- 1) J. Mitola: The Software Radio Architecture. *IEEE Communications Magazine*, **33**, 5, p.26-38 (1995).
- 2) A. Blaickner, S. Albl, and W. Scherr: Configurable Computing Architectures for Wireless and Software Defined Radio: A FPGA Prototyping Experience using High Level Design-Tool-Chains. Proc. of IEEE Int. Symp. on System-on-Chip 2004, 2004, p.111-116.
- 3) T. Shono, H. Shiba, Y. Shirato, K. Uehara, K. Araki, and M. Umehira: Performance of IEEE 802.11 Wireless LAN Implemented on Software Defined Radio with Hybrid Programmable Architecture. Proc. of IEEE Int. Conf. on Communications 2003, vol.3, 2003, p.2035-2040.
- 4) Y. Sakai, N. Ujiie, N. Odate, S. Nishijima, K. Yoda, and M. Saito: An Evaluation Board for Software Defined Radio. Proc. of the 2005 Int. Tech. Conf. on Circuits/System, Computers and Communication (ITC-CSCC), vol.2, 2005, p.805-806.
- 5) M. Saito, H. Fujisawa, N. Ujiie, and H. Yoshizawa: Cluster Architecture for Reconfigurable Signal Processing Engine for Wireless Communication. Proc. of FPL2005, 2005, p.353-359.



#### Seiichi Nishijima, Fujitsu Ltd.

Mr. Nishijima received the B.S. and M.S. degrees in Electronics Engineering from Hiroshima University, Higashi-Hiroshima, Japan in 1999 and 2001, respectively. He joined Fujitsu Ltd., Kawasaki, Japan in 2001, where he has been researching and developing LSI for software defined radio (SDR). He is a member of the Institute of Electronics, Information and Communication Engineers (IEICE) of Japan.



#### Iwao Sugiyama, Fujitsu Ltd.

Mr. Sugiyama received the B.S. degree in Physics from Tokyo University of Science, Tokyo, Japan in 1984. He joined Fujitsu Ltd., Kawasaki, Japan in 1984. Since February 2004, he has been researching and developing a next-generation platform for mobile wireless devices and systems.



#### Miyoshi Saito, Fujitsu Ltd.

Mr. Saito received the B.S. and M.S. degrees in Physics from Tokyo Institute of Technology, Tokyo, Japan in 1987 and 1989, respectively. He joined Fujitsu Ltd., Kawasaki, Japan in 1989, where he has been researching and developing quantum electron devices, high-speed DRAMs, embedded processors, wireless baseband LSIs, and reconfigurable devices. He is a member of the Institute of Electrical and Electronics Engineers (IEEE) and the Association for Computing Machinery (ACM).