Processing Challenges in Shrinking HPEC Systems into Small Platforms

Stephen Pearce & Richard Jaenicke
Mercury Computer Systems, Inc.

High Performance Embedded Computing (HPEC) Conference
September 28, 2004

Target Applications

- COMINT/ESM
- Software Radio
- Radar
- ELINT/ESM/RWR
- EO/IR Imagery

... and other HPEC challenges, such as automatic target recognition (ATR), to reduce sensor communication bandwidth and latency needs

Target Platform Types

- UAVs
- Helicopters
- Man-pack/Briefcase
- Small Vehicle
  - e.g., Humvee
- Manned aircraft
  - e.g., ARC-210 radio
- Airborne Pods
- Airborne Pods

Platforms pictured: Predator, SH60, Prophet, JSF, F-16, F-18 (POD), RAPTOR, Gripen, Litening Pod

<table>
<thead>
<tr>
<th>UAV</th>
<th>Global Hawk</th>
<th>Predator B</th>
<th>Heron A</th>
<th>Hunter</th>
<th>Eagle Eye</th>
<th>Fire-Scout</th>
<th>Sentry</th>
<th>Dragon Warrior</th>
<th>Dragon Eye</th>
</tr>
</thead>
<tbody>
<tr>
<td><strong>Length (ft)</strong></td>
<td>44.4</td>
<td>36</td>
<td>26</td>
<td>22</td>
<td>17</td>
<td>23</td>
<td>8.4</td>
<td>10</td>
<td>3</td>
</tr>
<tr>
<td><strong>Wingspan (ft)</strong></td>
<td>116</td>
<td>66</td>
<td>54</td>
<td>29</td>
<td>17</td>
<td>20</td>
<td>12.8</td>
<td>9</td>
<td>3.8</td>
</tr>
<tr>
<td><strong>Height (ft)</strong></td>
<td>14</td>
<td>9.5</td>
<td>5.9</td>
<td>5.6</td>
<td>5.5</td>
<td>9.5</td>
<td>4</td>
<td>5</td>
<td>1</td>
</tr>
<tr>
<td><strong>Payload Weight (lbs)</strong></td>
<td>1000</td>
<td>800</td>
<td>550</td>
<td>250</td>
<td>200</td>
<td>200</td>
<td>75</td>
<td>35</td>
<td>5</td>
</tr>
<tr>
<td><strong>Max Altitude (ft)</strong></td>
<td>65k</td>
<td>50k</td>
<td>25k</td>
<td>15k</td>
<td>20k</td>
<td>20k</td>
<td>15k</td>
<td>4k</td>
<td>1.2k</td>
</tr>
<tr>
<td><strong>Sensors</strong></td>
<td>EO/IR SAR ISAR SIGINT MTS</td>
<td>EO/IR SAR ISAR SIGINT MTS</td>
<td>EO/IR SAR ISAR SIGINT MTS</td>
<td>EO/IR SAR ISAR SIGINT MTS</td>
<td>EO/IR SAR ISAR SIGINT MTS</td>
<td>EO/IR SAR ISAR SIGINT MTS</td>
<td>EO/IR</td>
<td>EO/IR</td>
<td>EO/IR</td>
</tr>
<tr>
<td><strong>Endurance (hrs)</strong></td>
<td>36</td>
<td>36</td>
<td>36</td>
<td>10</td>
<td>5</td>
<td>4</td>
<td>3</td>
<td>3</td>
<td>1</td>
</tr>
<tr>
<td><strong>Max Airspeed (kts)</strong></td>
<td>320</td>
<td>220</td>
<td>120</td>
<td>100</td>
<td>220</td>
<td>120</td>
<td>100</td>
<td>70</td>
<td>35</td>
</tr>
</tbody>
</table>

- UAV height is very small, which tends to lead to system designs smaller than 6U, arrayed along the base of the fuselage or wings.
- Payload weight is small, so weight-constrained solutions are demanded.
- UAVs tend to fly fairly high. Because an unmanned platform provides no life-support environment at this altitude, conduction cooling becomes mandatory.
- All traditional HPEC applications are represented on all the platforms.

Historically, designers have relied on Moore’s Law: one could wait, and technology improvements would enable significant miniaturization. However, we observed that increases in absolute performance are accompanied by increases in power and, as a consequence, in weight and volume.

The number of transistors available is increasing, but power consumption is increasing at almost the same rate. The added infrastructure needed to handle power distribution and heat extraction incurs a penalty in size and weight. Alternative approaches are needed.

One approach: leverage field-programmable gate arrays (FPGAs) as programmable processors.

For some signal/image processing functions, FPGAs have been shown to provide a 10-20 fold performance boost over a PowerPC G4 processor. However, some tasks, e.g. filter weight computation and back-end processing, still perform better on a PowerPC. In trying to maximize processing power in the smallest space, the trick is not only finding the optimum balance between FPGAs and PowerPCs, but also choosing exactly which model of each chip to use.
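
The balance between FPGAs and PowerPCs can be illustrated with an Amdahl's-law style estimate. This is a minimal sketch, not a method taken from the slides: the function name `overall_speedup` and the workload fractions are assumptions chosen for illustration, and only the 10-20 fold boost comes from the text above.

```python
def overall_speedup(fpga_fraction, fpga_boost):
    """Amdahl-style speedup when only part of the workload moves to an FPGA.

    fpga_fraction: share of PowerPC run time spent in FPGA-friendly
                   kernels (hypothetical, chosen by the reader).
    fpga_boost:    speedup of those kernels on the FPGA (10-20x per the text).
    """
    if not 0.0 <= fpga_fraction <= 1.0:
        raise ValueError("fpga_fraction must be between 0 and 1")
    if fpga_boost <= 0:
        raise ValueError("fpga_boost must be positive")
    remaining = 1.0 - fpga_fraction            # work that stays on the PowerPC
    accelerated = fpga_fraction / fpga_boost   # work moved to the FPGA
    return 1.0 / (remaining + accelerated)


if __name__ == "__main__":
    # Illustrative fractions only; real applications must be profiled.
    for fraction in (0.5, 0.9, 0.99):
        low = overall_speedup(fraction, 10.0)
        high = overall_speedup(fraction, 20.0)
        print(f"{fraction:.0%} of work on FPGA: {low:.1f}x to {high:.1f}x overall")
```

Under this simple model, the part of the workload left on the PowerPC quickly limits the overall gain, which is one way to see why the mix of devices matters as much as the raw FPGA boost.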

FPGA Selection

- The popular comparison...
- These are the resources most often receiving attention when people look at Xilinx parts
- But what really matters: for embedded signal/image processing applications, the more critical elements tend to be the number of multiplier blocks and the block RAM size
- This leads to a different component selection, favoring the Pro range

Scaling the Processing

Current PPC-only Solutions (e.g. 6U VME chassis)

- 4x 500 MHz class PPC = 16 GFLOPS per slot
  - 6 slot = 96 GFLOPS
  - 12 slot = 192 GFLOPS
  - 20 slot = 320 GFLOPS

Future PPC-only Solutions

- 4x 1.5 GHz class PPC = 48 GFLOPS per slot
  - 6 slot = 288 GFLOPS
  - 12 slot = 576 GFLOPS
  - 20 slot = 960 GFLOPS

Future Heterogeneous Solutions

- 2x 1 GHz class PPC per board or 2 FPGA per board
  - 2 slot = 96-216 GFLOPS
  - 4 slot = 112-616 GFLOPS
  - 8 slot = 224-1232 GFLOPS
- Future FPGA + PPC exploitation on 3U better than existing 6U
- Future FPGA + PPC exploitation on VME

Assumptions

- FPGA = Equivalent 40-100 GFLOPS
- 500 MHz PPC = 4 GFLOPS
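
The PPC-only slot figures above follow from the stated assumption of 4 GFLOPS per 500 MHz PPC. The sketch below reproduces them, assuming peak GFLOPS scales linearly with clock rate (an assumption that matches the 48 GFLOPS per slot figure for 1.5 GHz parts). The function names are hypothetical. The 2-slot heterogeneous range is reproduced under one possible reading, one board with two 1 GHz PPCs plus one board with two FPGAs; this reading is an interpretation, and it is not claimed to explain the 4-slot and 8-slot ranges.

```python
PPC_500MHZ_GFLOPS = 4.0            # slide assumption: 500 MHz PPC = 4 GFLOPS
FPGA_GFLOPS_RANGE = (40.0, 100.0)  # slide assumption: FPGA = 40-100 GFLOPS equivalent


def ppc_gflops(clock_ghz):
    """Peak GFLOPS of one PPC, assuming linear scaling with clock rate."""
    return PPC_500MHZ_GFLOPS * clock_ghz / 0.5


def chassis_gflops(clock_ghz, slots, ppc_per_slot=4):
    """Peak GFLOPS of a PPC-only chassis with identical boards in every slot."""
    return ppc_gflops(clock_ghz) * ppc_per_slot * slots


def two_slot_heterogeneous():
    """One board of 2x 1 GHz PPC plus one board of 2 FPGAs (interpretation)."""
    ppc_board = 2 * ppc_gflops(1.0)
    fpga_low, fpga_high = (2 * g for g in FPGA_GFLOPS_RANGE)
    return ppc_board + fpga_low, ppc_board + fpga_high


if __name__ == "__main__":
    for clock in (0.5, 1.5):
        for slots in (6, 12, 20):
            total = chassis_gflops(clock, slots)
            print(f"{clock} GHz PPC x 4, {slots} slots: {total:.0f} GFLOPS")
    low, high = two_slot_heterogeneous()
    print(f"2-slot heterogeneous: {low:.0f}-{high:.0f} GFLOPS")
```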

Slot limitations on space-constrained systems also favor integrating the analog-to-digital conversion and general I/O with the processing. This is especially important for multi-channel systems.

Sensor I/O can be part of the baseboard design, e.g. a tuner/ADC, or be a mezzanine card attached to the processors.

Example ARC-210 Form

- Fitting 6 x 3U cPCI slots leaves total remaining space of
  - Width 1” (20%)
  - Height 1.7” (>30%)
  - Length 6.3” (>35%)

MCP3 FCN + DRTi Analogue

- dimensions to scale

RF

- 1 channel at 70 MSPS, 14-bit input from 3 GHz operating band
- 1 channel at 70 MSPS, 14-bit output to 3 GHz operating band, +20 dBm

Digital = ~80-240 GFLOPS

- 4 x 1 GHz PPCs = ~40 GFLOPS
- 4 x Virtex II P40 FPGAs = ~40-200 GFLOPS equivalent

Small SAR

Block diagram components: Image Formation, Memory PMC, ADC PMC, Digital Receiver, Quadrature Exciter, Radar Control, Guidance & Control, RF Up/Down-converter, User Display, Power Supply, RF Control/Status/Power

- Weight < 10 lb
- Cost < $60k
- Power Consumption < 150 W
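
As a rough cross-check, the Small SAR weight bound can be compared against the UAV payload capacities listed in the earlier table. This is only a sketch: it treats the 10 lb bound as the full installed weight and ignores volume, power and cooling, and the function names are hypothetical.

```python
# Payload capacities (lbs) copied from the UAV comparison table.
UAV_PAYLOAD_LB = {
    "Global Hawk": 1000,
    "Predator B": 800,
    "Heron A": 550,
    "Hunter": 250,
    "Eagle Eye": 200,
    "Fire-Scout": 200,
    "Sentry": 75,
    "Dragon Warrior": 35,
    "Dragon Eye": 5,
}

SMALL_SAR_MAX_WEIGHT_LB = 10  # upper bound quoted for the Small SAR


def platforms_that_can_carry(weight_lb, payloads=UAV_PAYLOAD_LB):
    """Names of platforms whose payload capacity covers the given weight."""
    return [name for name, capacity in payloads.items() if capacity >= weight_lb]


def payload_share(weight_lb, capacity_lb):
    """Fraction of a platform's payload capacity used by the given weight."""
    return weight_lb / capacity_lb


if __name__ == "__main__":
    for name in platforms_that_can_carry(SMALL_SAR_MAX_WEIGHT_LB):
        share = payload_share(SMALL_SAR_MAX_WEIGHT_LB, UAV_PAYLOAD_LB[name])
        print(f"{name}: up to {share:.1%} of payload capacity")
```

On these numbers, only the smallest platform in the table (Dragon Eye, 5 lbs) falls below the 10 lb bound.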

Beamformer/DF

- COMINT
- ESM
- ELINT
  - If down-conversion added

Block diagram components: User Display, Ethernet/VGA, REF GEN, System Host, Digital Tuner

3U Design for Signal Processing

- PowerPC 7447, 1 GHz
- 250 MB/s off-board via cPCI
- MCOE 6.2.x support
- Wind River VxWorks + Tools

- FPGA Virtex II Pro
- 4x Direct high speed ‘digital IF’ interfaces
- PMC site for digital receiver or modem etc.
- FDK 2.0.x support

MCP3 FCN: Flexible 3U Signal Processing

- Combined PowerPC & FPGA
  - Flexibility of RISC processing code
  - Density and bandwidth handling strengths of FPGAs
- Deployable
  - Ruggedized & conduction-cooled
- Multiple I/Os direct to FPGA
  - 4x high-speed bus via J2
  - Dual-channel analogue input digital receiver PMC option

Analog I/O receiver

- 2x 80 MSPS 14-bit ADC
- Factory configurable
  - IF up to 100 MHz

PMC general features

- Direct interface to FPGA
- Stepped attenuators
- RF screening
- Clocks (int./ext.)
- Power managed
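
The value of the direct FPGA interface can be seen from a simple data-rate estimate. The sketch below computes the raw output of the dual-channel ADC and compares it with the 250 MB/s off-board cPCI figure quoted for the 3U design. It assumes continuous sampling with no decimation and considers both tightly packed 14-bit samples and samples padded to 16-bit words; the function name is hypothetical.

```python
CPCI_MBYTES_PER_S = 250.0  # off-board cPCI figure quoted for the 3U design


def adc_rate_mbytes_per_s(channels, msps, bits_per_sample):
    """Raw ADC output in MB/s (1 MB = 1e6 bytes), with no decimation."""
    return channels * msps * bits_per_sample / 8.0


if __name__ == "__main__":
    packed = adc_rate_mbytes_per_s(2, 80, 14)  # tightly packed 14-bit samples
    padded = adc_rate_mbytes_per_s(2, 80, 16)  # samples stored in 16-bit words
    for label, rate in (("packed 14-bit", packed), ("padded 16-bit", padded)):
        verdict = "exceeds" if rate > CPCI_MBYTES_PER_S else "fits within"
        print(f"{label}: {rate:.0f} MB/s, {verdict} {CPCI_MBYTES_PER_S:.0f} MB/s cPCI")
```

Under these assumptions the raw sample stream is larger than the cPCI figure in both cases, which is consistent with feeding the ADC output straight into the FPGA and reducing the data rate there before it crosses the backplane.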