FPGAs, Deep Learning, Software Defined Networks and the Cloud: A Love Story Part 1

Digging into FPGAs and how they are being utilized in the cloud.

FPGAs have been around since the 1980s but have recently seen a resurgence. In the past year, most of the major cloud providers, including Amazon, Microsoft, Aliyun, Baidu and Huawei, have announced FPGA offerings. Why the sudden interest in FPGAs? Let’s dig into some of the key benefits driving adoption.

Have you just gained a decent understanding of how GPUs and CPUs differ in efficiency when running machine learning workloads? Or have you read the IHS Markit survey where 100% of carriers said they intend to invest in and deploy SDNs, and are you trying to wrap your head around how that investment simplifies network automation and service provisioning? Well, in typical tech industry fashion, I’m going to add new technologies and concepts to your plate before you’ve been able to properly digest the old ones. In this two-part article you’ll get a quick introduction to FPGAs, how cloud providers are utilizing them with SDNs to increase network efficiency and how FPGAs are increasingly being used to accelerate machine learning workloads in the cloud. Let’s start by defining FPGAs and contrasting them with their primary silicon alternatives.

Figure 1: Flexibility ←→ Efficiency Scale for CPUs, GPUs, FPGAs and ASICs

The simplest way to understand FPGAs is by comparing and contrasting them with other computing alternatives on a scale of flexibility and efficiency. When it comes to flexibility, Central Processing Units (CPUs) are extremely flexible and designed to handle a wide variety of computing scenarios, so you’ll find them in a wide variety of devices like desktops, laptops, tablets, smartphones, TVs and drones. Graphics Processing Units (GPUs), as the name implies, are optimal for rendering images, animations and video for a display. With GPUs you lose some of the flexibility of CPUs but gain efficiency and performance when processing graphics-intensive workloads. Field Programmable Gate Arrays (FPGAs) have a matrix of configurable logic blocks connected via programmable interconnects, which allows you to program FPGAs for specific applications or functionality. Last, but not least, are ASICs. An Application-Specific Integrated Circuit (ASIC) is a circuit customized for a particular use rather than intended for general-purpose use. For example, a chip designed specifically for high-frequency trading in financial markets is an ASIC. It will work better than general-purpose chips for high-frequency trading but will not do well at anything else, hence the indication that the circuit is application specific. The ability to reprogram the logic in an FPGA is a primary differentiator when comparing FPGAs to ASICs, CPUs and GPUs, which keep the same logic they were manufactured with.
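To make the idea of reprogrammable logic a bit more concrete, here is a toy sketch in Python. It assumes a simplified model in which a configurable logic block is just a small lookup table (a truth table) that can be reloaded. The class and function names are made up for illustration, and real FPGAs are programmed with hardware description languages and vendor toolchains, not with code like this.

```python
from itertools import product


class LookupTable:
    """Toy 2-input logic block: it behaves as whatever truth table is loaded."""

    def __init__(self, truth_table):
        self.program(truth_table)

    def program(self, truth_table):
        # "Reprogramming" only swaps the stored table; the block itself is unchanged.
        self.truth_table = dict(truth_table)

    def evaluate(self, a, b):
        return self.truth_table[(a, b)]


def table_for(func):
    """Build the truth table of any 2-input boolean function (hypothetical helper)."""
    return {(a, b): func(a, b) for a, b in product((0, 1), repeat=2)}


inputs = list(product((0, 1), repeat=2))  # (0,0), (0,1), (1,0), (1,1)

block = LookupTable(table_for(lambda a, b: a & b))  # configured as AND
and_outputs = [block.evaluate(a, b) for a, b in inputs]

block.program(table_for(lambda a, b: a ^ b))  # same block, now XOR
xor_outputs = [block.evaluate(a, b) for a, b in inputs]

print(and_outputs)  # [0, 0, 0, 1]
print(xor_outputs)  # [0, 1, 1, 0]
```

The same block computes two different functions depending on what was loaded into it, which is the property that separates an FPGA from a chip whose logic is fixed at manufacturing time.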

Before we move on, I should make a few points:

  1. Relative Bests. The best chip here is relative to the applications or workloads you are running and your technical ability. For example, if you are using the deep learning library TensorFlow to run machine learning workloads on FPGAs, you’ll have the ability to optimize to your heart’s content. If you don’t have in-house knowledge of programming FPGAs, then using an ASIC like Google Cloud’s Tensor Processing Unit, which is built specifically for machine learning and tailored to TensorFlow, can make more sense.
  2. Overlap. There can also be overlap between CPUs, GPUs, FPGAs and ASICs when it comes to processing. As discussed earlier, ASICs and FPGAs might offer similar processing ability and differ only in one’s ability to reprogram the logic. Another example is CPUs that are capable of handling light graphics processing normally handled by a GPU.
  3. Tag-Team Computing. As each type of chip has its own value proposition, there are also efforts to combine chips for a more seamless integration. Intel’s upcoming Programmable Acceleration Cards, which include an FPGA connected to a CPU through an UltraPath Interconnect, are a good example. Baidu is also taking this approach with the announcement of their 256-core cloud/AI “XPU”, which combines elements of FPGA, GPU and CPU architectures. Even in Amazon Web Services’ F1 instance type you’ll find an FPGA paired with CPUs, with the bulk of the application running on the CPU and the FPGA being used to accelerate specific computations. Surprisingly, even this GPU-CPU combo from competitors AMD and Intel exists, which makes me look a bit closer at pigs to see if they are growing wings and about to start flying.

Now that we have a decent understanding of FPGAs, let’s move on to why there has been a recent increase in adoption by cloud providers and what some of their customers are doing with them.

As stated earlier, FPGAs have been around for a while, so why the sudden interest? The answers are found in the advantages of adoption. Cloud service providers that adopt FPGAs are realizing three key benefits:

  1. Speed. This comes from a simple truth: purpose-built hardware is almost always faster at its task than general-purpose hardware. As I’ll discuss later, you will see some tremendous gains in speed across machine learning, networking and other areas, allowing for better performance with fewer resources, which helps to justify FPGA adoption in the data center.
  2. Efficiency & Scale. Both are equally relevant here as the efficiency of purpose-built hardware is directly tied to the ability to scale out the cloud. As an analogy, picture a train that seats 500 people and makes a round trip from Los Angeles to San Francisco in 12 hours. That train can only make two round trips a day, servicing 2,000 people (500 going and then 500 coming back per trip, two round trips per 24-hour day). If you could increase the efficiency of the train so that it makes the round trip in 4 hours, then the train can make 6 round trips a day, servicing 6,000 people without having to scale out train cars or tracks. With the exact same transit system you can now service 3x as many customers. Cloud service providers are using the same efficiency-plus-scale formula to their benefit. If their customers complete the same workloads in less time, that frees up hardware, giving cloud providers more capacity to service customers’ needs without having to increase their data center footprint.
  3. Cost. I know this is a benefit that mysteriously shows up on almost every list of benefits regardless of the technology, but in the case of FPGAs it is actually true. By improving speed, efficiency and scale, cloud providers decrease their cost per user transaction and pass those savings on to their customers. As cloud providers continue to slash prices in a race to $0, they increasingly have thin profit margins and rely on scale to hit revenue targets. The improved efficiency allows them to take on more customers without adding servers, which increases their annual revenue. Cloud providers also save money on energy costs, as FPGAs typically do more processing per watt than CPUs or GPUs. A recent demo of image processing on AlexNet showed FPGAs processing 32 images per watt of power, with GPUs processing only 14.2 images per watt and CPUs even fewer.
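The arithmetic behind the train analogy and the images-per-watt comparison can be checked with a few lines of Python. This is only a back-of-the-envelope sketch using the numbers quoted above; the function and variable names are my own.

```python
SEATS = 500
HOURS_PER_DAY = 24


def passengers_per_day(round_trip_hours):
    """Passengers served per day, assuming a full train in each direction."""
    round_trips = HOURS_PER_DAY // round_trip_hours
    return round_trips * SEATS * 2  # 500 going plus 500 coming back


slow = passengers_per_day(12)  # the original 12-hour round trip
fast = passengers_per_day(4)   # the improved 4-hour round trip
scale_factor = fast / slow

print(slow, fast, scale_factor)  # 2000 6000 3.0

# Energy efficiency figures from the AlexNet demo quoted above.
fpga_images_per_watt = 32
gpu_images_per_watt = 14.2
efficiency_ratio = fpga_images_per_watt / gpu_images_per_watt

print(round(efficiency_ratio, 2))  # 2.25
```

In other words, the same train serves 3x the passengers, and in that demo the FPGA did roughly 2.25x the work of the GPU for each watt consumed.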

And you don’t have to just take my word on this. AWS General Manager Deepak Singh echoes these points in his post on Amazon’s FPGA Rationale and Expected Market Trajectory, as do other cloud providers who have similarly adopted FPGAs. Let’s now focus on two specific use cases of FPGAs to illustrate how everyone from cloud service providers to their customers can benefit from FPGA adoption.

An increasingly popular use case for FPGAs is SDNs. For those not familiar, Software-Defined Networking (SDN) is an umbrella term encompassing several kinds of networking technologies that virtualize a network so that network admins can handle administrative tasks programmatically from a centralized control console without having to touch physical networking devices like switches and routers. SDNs are common in the cloud and allow customers to create virtual private networks with their own address ranges, load balance traffic, control access to networks through firewalls and perform other networking-related tasks.

So where do FPGAs come into play? They allow cloud providers to further expand on the capabilities of their SDNs. For example, during Microsoft’s 2017 Build conference, Mark Russinovich, CTO of Azure, demonstrated a bandwidth test on virtual machines (VMs) with and without FPGAs. Using standard hardware and testing communications between VMs over a two-second interval, the VMs hit 4 Gbps on average on a 40 Gbps physical network adapter. Using accelerated networking backed by FPGAs, he was able to achieve 25 Gbps, roughly a 6x increase. Another test later in the talk showed latency being reduced from 145 microseconds to 30 microseconds, roughly a 5x decrease in the amount of time it takes network packets to reach their destination. For cloud providers, gaining this level of networking efficiency in communications between VMs is paramount when you consider that 80% of all networking traffic in data centers is horizontal and never leaves the data center.
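For anyone who wants to see where the “6x” and “5x” come from, here is a quick sketch that works through the figures quoted from the talk. The variable names are my own, and the adapter utilization is simply derived from the same numbers rather than something stated in the demo.

```python
LINK_GBPS = 40            # capacity of the physical network adapter

baseline_gbps = 4         # VM-to-VM throughput on standard hardware
accelerated_gbps = 25     # throughput with FPGA-backed accelerated networking

baseline_latency_us = 145
accelerated_latency_us = 30

throughput_gain = accelerated_gbps / baseline_gbps
latency_gain = baseline_latency_us / accelerated_latency_us

# Share of the 40 Gbps adapter actually used in each case.
baseline_utilization = baseline_gbps / LINK_GBPS
accelerated_utilization = accelerated_gbps / LINK_GBPS

print(throughput_gain)                  # 6.25
print(round(latency_gain, 1))           # 4.8
print(f"{baseline_utilization:.1%}")    # 10.0%
print(f"{accelerated_utilization:.1%}") # 62.5%
```

Put another way, the VMs went from using about a tenth of the adapter’s capacity to using well over half of it.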

Figure 2: Microsoft’s SmartNIC with FPGA

To reduce latency and increase bandwidth, Microsoft is using an FPGA in what they call a SmartNIC, shown in Figure 2. Their SmartNIC sends the rules down to the server and compiles them in a way the FPGA understands, which allows the FPGA to process network traffic faster than a general-purpose CPU that is not optimized for data flows. If that didn’t make sense to you, take a look at Figure 3.

Figure 3: Typical Virtual Network and Switch (LEFT). Virtual network and Switch with FPGA (RIGHT)

On the left you can see a typical cloud architecture, where a virtual switch sits inside the server doing all the processing. The new FPGA architecture on the right shows the SmartNIC communicating directly with the VMs and the physical switch with 0% CPU overhead for processing network traffic, reducing latency of the virtual switch as the SmartNIC now only consists of the logic inside the FPGA. Microsoft’s use of FPGAs and benchmarks showing improved networking efficiency matches the growing consensus that

FPGA based network function virtualization can provide the flexibility of general purpose processors, like a CPU, while at the same time providing the necessary throughput that CPUs cannot sustain.

The wrap up, wait…what? But you promised me Artificial Intelligence and that is what I came for! I only got SDNs, and no true love story is complete without the other half. Yes, you are 100% correct. But as it is 2017 and not 1907 and this is a thing, I’ve broken the article into two parts. The first half has introduced FPGAs, the benefits that are attracting cloud service providers and how FPGAs are improving the efficiency of software defined networks, while the second part will cover artificial intelligence, explaining the roles an FPGA can play in increasing efficiency on deep learning workloads. For part 2 follow the link below.


Interested in learning more about Jamal Robinson or want to work together? Reach out to him on Twitter or through LinkedIn.

Enterprise technologist with experience across cloud, artificial intelligence, machine learning, big-data and other cool technologies.