
Cloud Acceleration for RTL, C/C++ and OpenCL

The SDAccel™ development environment enables up to 25X better performance per Watt for data center application acceleration with FPGAs.

SDAccel, a member of the SDx™ family, offers a compiler, a debugger and a profiler. It supports standard OpenCL APIs to abstract the hardware platform, and it compiles code into hardware kernels that run on the FPGA acceleration board.


SDAccel™ is a complete development environment for OpenCL™ applications targeting Xilinx® FPGA-based accelerator boards. It enables concurrent programming of the in-system processor and the FPGA device without the need for hardware design experience, because the whole application can be coded in a C-based language.

The application is captured as a host program written in OpenCL and a set of computation kernels expressed in OpenCL, C, or C++. Kernels can also be written in RTL (VHDL or Verilog).
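The host/kernel split described above can be sketched in plain C++. This is an illustration only, not SDAccel or OpenCL code: the names `vadd_work_item` and `vadd_host` are hypothetical, and the loop stands in for what the OpenCL runtime would do when it launches a kernel on the device.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// "Kernel" side: the body executed once per work-item. In OpenCL C this
// would be a __kernel function that obtains its index from get_global_id(0);
// here the index is passed in so the sketch runs as ordinary C++.
void vadd_work_item(std::size_t gid, const int* a, const int* b, int* c) {
    c[gid] = a[gid] + b[gid];
}

// "Host" side: owns the buffers and launches one work-item per element.
// A real host program would create device buffers, enqueue the kernel and
// read the results back through the OpenCL runtime instead of looping.
std::vector<int> vadd_host(const std::vector<int>& a, const std::vector<int>& b) {
    assert(a.size() == b.size());  // both inputs must have the same length
    std::vector<int> c(a.size());
    for (std::size_t gid = 0; gid < a.size(); ++gid) {
        vadd_work_item(gid, a.data(), b.data(), c.data());
    }
    return c;
}
```

Calling `vadd_host({1, 2, 3}, {10, 20, 30})` returns a vector holding 11, 22 and 33. The point of the split is that only the per-element body needs to be expressed as a kernel; everything else stays in the host program.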

Xilinx has partnered with leading cloud service providers who specialize in heterogeneous accelerator clouds for big data and machine learning to create next-generation applications that leverage the computational density of FPGAs from C, C++ and OpenCL.


The offering from Nimbix will dramatically lower the barrier to leveraging the high performance, energy efficient power of FPGAs to accelerate high end computational workflows across all industries.  Developers can now run these tools in the cloud and then test and deploy on the latest Xilinx-accelerated hardware with no upfront investment or equipment purchases.

To get started with application acceleration on the cloud, visit http://www.nimbix.net/xilinx

Xilinx Application Acceleration on the Nimbix Cloud 

| Category | Examples | Features/Description | Performance Benefits |
|---|---|---|---|
| Getting Started | Hello | The hello world example is a simple design which tests the correct installation of the FPGA acceleration boards. The example uses the printf function call inside of the kernel code to report on the values provided from the host to the kernel. | |
| Getting Started | Host_global_bandwidth | Host to global memory bandwidth test. | |
| Getting Started | Kernel_global_bandwidth | Bandwidth test of global to local memory. | |
| Getting Started | Sum_scan | Example of parallel prefix sum. | |
| Getting Started | Vadd | Simple example of vector addition. | |
| Getting Started | Vdotprod | Simple example of vector dot-product. | |
| Getting Started | Vmul_vadd | This example shows how data stored in global memory can be shared between kernels in different binary containers. | |
| Acceleration | bfgminer | Bitcoin mining application implemented on SDAccel platforms. | 80 Megahashes/second |
| Acceleration | nearest_neighbor_linear_search | This is an optimized implementation of a nearest neighbor linear search algorithm. | 256 measurements/cycle; 37.5 Gigameasurements/sec |
| Acceleration | smithwaterman | This is an optimized implementation of the Smith-Waterman algorithm. The main characteristics of this application are (1) computing MaxScore and (2) a systolic array implementation. | |
| Security | aes_decrypt | Implementation of AES-128 ECB encryption in software, followed by decryption written in OpenCL and targeting execution on an SDAccel supported FPGA acceleration card. | |
| Security | rsa | This is an implementation of an RSA decryption algorithm. | 1,024-bit cipher text length; 272,340 bytes/sec |
| Security | sha1 | This is an optimized implementation of the SHA1 secure hash algorithm targeting execution on an SDAccel supported FPGA acceleration card. | |
| Security | tiny_encryption | Implementation example of the Tiny Encryption Algorithm (TEA), which is a block cipher. | |
| Vision | Affine | Affine transformation is a linear mapping method that preserves points, straight lines, and planes. | 21.5 fps |
| Vision | Convolve | The convolve example is a performant design which showcases convolutional image filtering. The example processes the image 8 pixels at a time. | 1,000 fps |
| Vision | Edge_detection | Implementation of a Sobel filter for edge detection. | |
| Vision | Histogram_codec | This is an optimized implementation of a 12-bit histogram equalizer targeting execution on an SDAccel supported FPGA acceleration card. | 333 fps |
| Vision | Huffman_codec | This is an implementation of a Huffman encoding/decoding algorithm targeting execution on an SDAccel supported FPGA acceleration card. | |
| Vision | Median_filter | This is an optimized implementation of a median filter used to remove noise in images. | 22,222 fps |
| Vision | Watermarking | This is an optimized implementation of a watermarking application that adds watermarks to images. | 6,134 fps |
| Contributed Examples | ArrayFire – Fast Corner | Demo of FAST feature detection developed by ArrayFire. | |
| Contributed Examples | Polito – K-Nearest Neighbor | K-Nearest Neighbor algorithm derived from the Rodinia Benchmark suite. This project is aimed at using SDAccel to implement the k-Nearest Neighbor algorithm on a Xilinx FPGA. | Real-time throughput at 1.23 ms |
| Contributed Examples | Polito – Black-Scholes Monte Carlo | This project implements a Monte Carlo simulation of the Black-Scholes financial model for both European and Asian options. It contains an OpenCL C++ kernel, to be mapped to FPGA via SDAccel. It provides much better energy-per-operation than a GPU implementation, at a comparable performance level. | 0.315 ns; 7.69 sims/joule |
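As a rough illustration of what a parallel prefix sum such as the Sum_scan example computes, the following is a minimal C++ sketch of the Hillis-Steele scan. The page does not say which scan algorithm Sum_scan uses, so this choice is an assumption, and the function name `inclusive_scan_sketch` is hypothetical. The inner loop is written sequentially, but its iterations are independent of each other within a pass, which is what makes the scheme suitable for parallel hardware.

```cpp
#include <cstddef>
#include <vector>

// Inclusive prefix sum, Hillis-Steele style. In each pass, element i adds the
// value `stride` positions to its left. Two buffers are used so that every
// element in a pass reads only results from the previous pass.
std::vector<int> inclusive_scan_sketch(const std::vector<int>& in) {
    std::vector<int> cur = in;
    std::vector<int> next(in.size());
    for (std::size_t stride = 1; stride < in.size(); stride *= 2) {
        // On parallel hardware each i would be handled by its own work-item.
        for (std::size_t i = 0; i < in.size(); ++i) {
            next[i] = (i >= stride) ? cur[i] + cur[i - stride] : cur[i];
        }
        cur.swap(next);
    }
    return cur;
}
```

For the input 3, 1, 4, 1, 5 the function returns 3, 4, 8, 9, 14. The number of passes grows with the logarithm of the input length, at the cost of more total additions than a sequential running sum.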

Recommended SDAccel-Enabled On-Premises Platforms

| Board Name & Description | Devices Supported | Software Development Tools and Runtime | Vendor |
|---|---|---|---|
| VCU1525 Acceleration Development Kit: Ideally suited for data center application developers wanting to leverage the advanced capabilities of Virtex® UltraScale™ FPGAs. The kit enables easy application programming with OpenCL™, C, C++ and RTL through the Xilinx SDAccel™ Development Environment complete with frameworks, libraries, drivers and development tools. | Virtex UltraScale+ | SDAccel and DSA 5.1 | Xilinx |
| KCU1500 Acceleration Development Kit: Excellent starting point for hyperscale application developers. This kit is SDAccel ready and supports easy application programming with OpenCL, C, C++ and RTL through SDAccel. | Kintex UltraScale | SDAccel and DSA 5.0 | Xilinx |


SDAccel QuickTake Video Tutorials

Fundamental Concepts of Application Host
The OpenCL standard for heterogeneous computing defines a programming model for transferring data between host processors and acceleration devices. This video provides an introduction to the minimum set of OpenCL APIs required for data transfer and control of accelerators on a device such as the FPGA.
N-Dimensional Kernel Range
One of the key concepts in OpenCL is the division of the application problem into a multi-dimensional problem space. Each block of the problem space, referred to as the N-Dimensional Kernel Range, executes the same computation in parallel across all accelerators available in a device. This video introduces the N-Dimensional kernel range concept and its application to solving computation problems on parallel computing systems.
OpenCL Application Structure
The OpenCL standard for heterogeneous computing defines a basic programming model for all compute devices implementing the OpenCL standard. This video introduces the host code and kernel elements of an OpenCL application. The mapping of these elements to systems containing FPGA accelerator co-processing cards is explained.
OpenCL Memory Architecture
OpenCL defines a memory architecture and abstraction model that is common to all computing devices implementing the standard. This means that a programmer only has to learn one memory model, which simplifies application coding. This video provides an overview of the OpenCL memory model and how it is implemented in an FPGA acceleration device.
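The kernel range and memory model concepts covered in these videos can be sketched together in plain C++. This is an emulation under stated assumptions, not OpenCL code: the function name `group_sums` is hypothetical, the loops stand in for work-items that would run in parallel, and the index arithmetic mirrors what `get_global_id(0)` and `get_local_id(0)` return for a 1-D range with no offset.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Emulates a 1-D kernel range split into work-groups of `local_size`
// work-items. Each work-group sums its own slice of the input.
//   global memory  : `global_mem` and `out`, visible to every work-item
//   local memory   : `tile`, shared only inside one work-group
//   private memory : `acc`, owned by a single work-item
std::vector<int> group_sums(const std::vector<int>& global_mem, std::size_t local_size) {
    assert(local_size > 0 && global_mem.size() % local_size == 0);
    const std::size_t num_groups = global_mem.size() / local_size;
    std::vector<int> out(num_groups);
    for (std::size_t group = 0; group < num_groups; ++group) {
        std::vector<int> tile(local_size);
        // Phase 1: every work-item copies one element from global to local.
        for (std::size_t lid = 0; lid < local_size; ++lid) {
            const std::size_t gid = group * local_size + lid;  // global index
            tile[lid] = global_mem[gid];
        }
        // A real kernel would need a work-group barrier at this point.
        // Phase 2: one work-item reduces the tile and writes the result.
        int acc = 0;
        for (std::size_t lid = 0; lid < local_size; ++lid) {
            acc += tile[lid];
        }
        out[group] = acc;
    }
    return out;
}
```

With the eight values 1 through 8 and a `local_size` of 4, the function returns 10 and 26, one partial sum per work-group. Staging data in local memory before reducing it is a common pattern because local memory is typically much closer to the compute units than global memory.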

Design Services

| Design Services Alliance Members | Markets |
|---|---|
| Cluster Technology Limited: Cluster Tech specializes in the provision of advanced computing technology solutions and utilizes High Performance Computing, Cloud, Business Intelligence and Financial Engineering to improve operational efficiency. | High Performance Computing, Cloud, Business Intelligence and Financial Engineering |
| Irish Centre for High-End Computing (ICHEC): ICHEC offers services to help clients enable, optimize and deploy OpenCL-based software solutions on high performance low energy Xilinx FPGAs. With a dynamic team of engineers with domain, systems and software expertise, ICHEC offers design services in finance, energy, life sciences, and analytics. | Finance, Energy, Life Sciences, Analytics |
| Instigate Design: Instigate Design specializes in system level design of electronic systems, EDA specific software design and parallel programming. Design services range from software design and quality assurance to comprehensive application engineering with an emphasis on audio/video coding and communication. | High Performance Computing |
| ArrayFire: ArrayFire is an industry leader in high performance computing software development and coding services. | Defense/Aerospace, Consumer, Industrial Scientific Medical |