Qualcomm’s new Hexagon 680 DSP: Fast, efficient, shipping with Snapdragon 820

The annual engineering and technical conference known as Hot Chips kicked off yesterday, and Qualcomm was out in front to detail its new DSP, the Hexagon 680. Digital Signal Processors (DSPs) aren’t something we’ve discussed much at ExtremeTech, but Qualcomm is putting a major marketing push behind its DSP technology for the first time. How does the chip work, what makes it an integral part of Snapdragon 820, and how does it advance heterogeneous computing?

DSPs are specialized processors dedicated to digital signal processing. Like GPUs, DSPs are designed to exploit parallelism. Like CPUs, they often make use of SIMD (single instruction, multiple data) and VLIW processing to boost throughput and total performance per watt. Also like GPUs, DSPs are designed to perform a very specific subset of tasks. CPUs can handle these tasks (and sometimes do), but DSPs offer better performance than general-purpose processors, and more flexibility than a traditional ASIC. This relationship is captured in the slide below:

Qualcomm’s Hexagon 680 DSP

Qualcomm’s Hexagon 680 is designed to accelerate certain workloads at performance efficiencies well above anything a modern CPU can offer. The Hexagon 680 is a VLIW (Very Long Instruction Word) processor, meaning it’s designed to extract maximum parallelism per clock cycle and to spread workloads across a wide set of execution units.
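
To make the VLIW idea concrete, here is a minimal Python sketch of how a compiler might pack independent operations into instruction packets. The packet width of four comes from the 4-way VLIW figure Qualcomm quotes for each scalar thread; the operation names, the dependency format, and the greedy in-order packing rule are illustrative assumptions, not Qualcomm's actual compiler or instruction set.

```python
# Illustrative only: a toy greedy packer, not Qualcomm's compiler or ISA.
PACKET_WIDTH = 4  # 4-way VLIW per scalar thread, per Qualcomm's figure


def pack_vliw(ops, width=PACKET_WIDTH):
    """Pack ops into packets in program order.

    `ops` is a list of (name, dependencies) pairs. An op may join the
    current packet only if the packet has a free slot and none of its
    dependencies sit in that same packet.
    """
    packets = []
    current = []
    for name, deps in ops:
        depends_on_current = any(d in current for d in deps)
        if len(current) == width or depends_on_current:
            # Close the packet and start a new one for this op.
            packets.append(current)
            current = []
        current.append(name)
    if current:
        packets.append(current)
    return packets


# Hypothetical instruction stream: four independent loads, then dependent math.
program = [
    ("load_a", []),
    ("load_b", []),
    ("load_c", []),
    ("load_d", []),
    ("add_ab", ["load_a", "load_b"]),
    ("add_cd", ["load_c", "load_d"]),
    ("mul", ["add_ab", "add_cd"]),
]

for cycle, packet in enumerate(pack_vliw(program)):
    print(cycle, packet)
```

In this toy stream, seven operations fit into three packets instead of issuing one at a time. Real code rarely packs this neatly, which is why VLIW performance depends so heavily on the compiler and on the workload.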

Threading model

The 680 DSP offers four parallel scalar threads, each with 4-way VLIW support and a shared L1/L2. Each of these scalar threads is clocked at 500MHz, for an aggregate of 2GHz-equivalent processing. On the vector side of the equation, the 680 has 32 1024-bit vector registers. Each instruction can address up to four of these per cycle, for a maximum of 4096 bits per cycle per instruction. It also includes support for Qualcomm’s new Hexagon Vector eXtensions, or HVX. The HVX units can be controlled by any two of the scalar threads.
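
The headline figures in this paragraph are simple products, and a few lines of Python make the arithmetic explicit. The variable names are ours; the inputs are the numbers quoted above. Note that the 2GHz figure is an aggregate across four threads, not a single-thread clock speed.

```python
# Inputs are the figures quoted in the article; names are illustrative.
SCALAR_THREADS = 4
THREAD_CLOCK_MHZ = 500
VECTOR_REGISTER_BITS = 1024
REGISTERS_PER_INSTRUCTION = 4
VECTOR_REGISTER_COUNT = 32

aggregate_clock_mhz = SCALAR_THREADS * THREAD_CLOCK_MHZ
bits_per_instruction = VECTOR_REGISTER_BITS * REGISTERS_PER_INSTRUCTION
register_file_bytes = VECTOR_REGISTER_COUNT * VECTOR_REGISTER_BITS // 8

print(aggregate_clock_mhz)   # 2000 MHz, the "2GHz-equivalent" aggregate
print(bits_per_instruction)  # 4096 bits addressed per instruction per cycle
print(register_file_bytes)   # 4096 bytes for one set of 32 vector registers
```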

Here’s what this means in aggregate: The Hexagon 680 is designed to allow for extensive threading and to share data across the L1 and L2 caches. There’s no penalty to using the HVX units and the scalar units simultaneously, provided that the workload is designed for it. The vector processors don’t have access to L1, but treat L2 as their first level of memory. L1 and L2 are kept coherent and data can be streamed into L2 from DDR memory at up to 1.2Gpixels/s. This supports some of the advanced capabilities of the Hexagon 680 (we’ll talk about these below).
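
For a rough sense of scale, the 1.2Gpixels/s streaming figure can be turned into an upper bound on frame rate. The sketch below is back-of-the-envelope arithmetic only: the frame sizes are our own choices rather than anything from Qualcomm's slides, and it assumes every pixel is streamed into L2 exactly once, ignoring multi-pass algorithms and any other memory traffic.

```python
STREAM_RATE_PIXELS_PER_S = 1.2e9  # Qualcomm's quoted DDR-to-L2 streaming rate

# Hypothetical frame sizes chosen for illustration; not from Qualcomm's slides.
FRAME_SIZES = {
    "1080p": (1920, 1080),
    "4K UHD": (3840, 2160),
}


def max_frames_per_second(width, height, rate=STREAM_RATE_PIXELS_PER_S):
    """Upper bound on frames/s if every pixel is streamed exactly once."""
    return rate / (width * height)


for label, (w, h) in FRAME_SIZES.items():
    print(f"{label}: about {max_frames_per_second(w, h):.0f} frames/s")
```

Under those assumptions the ceiling works out to roughly 579 frames per second at 1080p and 145 at 4K UHD, which suggests the streaming path leaves headroom for algorithms that touch each frame several times.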

According to Qualcomm, the performance advantages of these new features are enormous. While this data is provided by the company and should be taken with a grain of salt, there’s nothing outlandish here. These kinds of accelerations are typical when moving to a high-end dedicated chip as opposed to executing code on a general-purpose CPU.

DSP benchmarks

Qualcomm believes that the programming model for the Hexagon 680 is similar enough to CPU models to allow programmers to use the hardware effectively, but with significant overall improvements.

DSP vs. CPU

Power consumption should also be much lower, thanks to the simpler nature of the VLIW model and the use of L2 for vector processing rather than both the L1 and L2. The company also notes that by designing its DSP to run at low frequencies, it can cut leakage current and reduce overall power consumption.

Applications and heterogeneous computing

The best application processor on Earth isn’t worth much without applications to run on it, but the Hexagon 680 DSP delivers on this front as well. Qualcomm claims that the new chip is fully heterogeneous, meaning it can share data between CPU, GPU, and the DSP. Qualcomm is also a founding member of the HSA Foundation, the heterogeneous computing consortium spearheaded by AMD, and while it isn’t calling its heterogeneous compute model by that name, we expect the two to be similar on a conceptual level. The DSP inside the Snapdragon 820 can be used to render AR or VR, tapped for better video playback and encoding, or used by the camera for extensive improvements in low-light photography. Alternatively, HVX can be used to enhance detail in standard photos, as shown below.

Enhance. Enhance. Enhance.

Qualcomm has stated that the Hexagon 680 can perform low-light enhancement 3x faster than a Krait-based SoC, while using 1/10 as much power. Programmers will be able to use the DSP and write applications to run on it, which could give the Snapdragon 820 platform a substantial leg up over the competition. DSPs have shipped on SoCs for a long time, but few companies have spent as much time talking up their solutions as part of a heterogeneous compute platform as Qualcomm has.
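
The two claims compound if "power" here means average draw while the task runs, since energy is power multiplied by time. The short Python sketch below works that through; the function and variable names are ours, and the interpretation of Qualcomm's 1/10 figure is an assumption. If the company actually meant energy per task, the saving is simply 10x.

```python
SPEEDUP = 3.0      # DSP finishes the task 3x faster (Qualcomm's claim)
POWER_RATIO = 0.1  # DSP draws 1/10 the power (Qualcomm's claim)


def relative_energy(speedup, power_ratio):
    """Energy per task relative to the CPU, using energy = power * time.

    Assumes power_ratio is average power draw while the task runs.
    """
    time_ratio = 1.0 / speedup
    return power_ratio * time_ratio


ratio = relative_energy(SPEEDUP, POWER_RATIO)
print(f"DSP energy per task: {ratio:.3f}x the CPU's (about 1/{1 / ratio:.0f})")
```

Read that way, the DSP would spend about one-thirtieth of the energy the CPU does on the same job, which matters more for battery life than either number on its own.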

In the past, a component like the DSP would be invisible, buried under interest in the CPU and GPU. Qualcomm’s decision to talk about the chip is a sign of the times. As visual processing, augmented reality, and virtual reality take the stage, more and more consumers expect advanced capabilities from their smartphones. For lower-tech users, that means high quality photos and video, while gamers and enthusiasts want cutting-edge performance and better battery life. The Hexagon 680 DSP is meant to speak to all these needs, with power efficiency that will beat even the upcoming Kryo CPU, flexibility and heterogeneous compute capability to whet the appetites of programmers and application developers, and performance that appeals to enthusiasts, gamers, and the general public.

After these disclosures, the Kryo is the last piece of the puzzle still to drop into place. Hopefully we’ll have details on the CPU core sooner rather than later.
