SDAccel Development Environment

Delivering a GPU- and CPU-like Programming Experience for Data Center Workload Acceleration

The SDAccel™ development environment for OpenCL™, C, and C++ enables up to 25X better performance/watt for data center application acceleration using FPGAs. SDAccel, a member of the SDx™ family, combines the industry's first architecturally optimizing compiler, which supports any combination of OpenCL, C, and C++ kernels, with libraries, development boards, and the first complete CPU/GPU-like development and run-time experience for FPGAs.

First Architecturally Optimizing Compiler for OpenCL, C, and C++

  • Architecturally optimizing compiler delivers up to 25X better performance/watt compared to CPU/GPU
  • Delivers 3X the performance and resource efficiency of other FPGA solutions
  • Lets new or existing OpenCL, C, and C++ code be used to create high-performance accelerators

First Complete CPU/GPU-Like Development Experience on FPGAs

  • First complete software development environment targeting FPGAs
  • Optimize applications on FPGA platforms with little to no FPGA experience
  • Easily migrate applications to FPGAs while maintaining and reusing OpenCL, C and C++ code

First Complete CPU/GPU-Like Run-time Experience on FPGAs

  • Supports large applications with multiple programs and CPU/GPU-like on-demand loadable compute units
  • Maintains system functionality during program transitions and keeps critical system interfaces and functions live during application execution
  • Allows FPGA accelerators to be shared across multiple applications using on-the-fly compute unit reconfiguration

SDAccel™ is a development environment for OpenCL™ applications targeting Xilinx® FPGA-based accelerator cards. This environment enables concurrent programming of the in-system processor and the FPGA fabric without the need for RTL design experience. The application is captured as a host program written in C/C++ and a set of computation kernels expressed in C, C++, or the OpenCL C language.

Xilinx has partnered with Nimbix Inc., a leading provider of heterogeneous accelerator clouds for big data and machine learning, to create the next generation of applications that leverage the computational density of an FPGA from C/C++ and OpenCL.

The offering from Nimbix dramatically lowers the barrier to using the high-performance, energy-efficient power of FPGAs to accelerate high-end computational workflows across all industries. Developers can now run these tools in the cloud and then test and deploy on the latest Xilinx-accelerated hardware with no upfront investment or equipment purchases.

To get started with application acceleration on the cloud, visit http://www.nimbix.net/xilinx

Xilinx Application Acceleration on the Nimbix Cloud 

| Category | Example | Features/Description | Performance Benefits |
|---|---|---|---|
| Getting Started | Hello | The hello world example is a simple design that tests the correct installation of the FPGA acceleration boards. It uses the printf function call inside the kernel code to report the values passed from the host to the kernel. | |
| | Host_global_bandwidth | Host to global memory bandwidth test. | |
| | Kernel_global_bandwidth | Bandwidth test of global to local memory. | |
| | Sum_scan | Example of parallel prefix sum. | |
| | Vadd | Simple example of vector addition. | |
| | Vdotprod | Simple example of vector dot product. | |
| | Vmul_vadd | Shows how data stored in global memory can be shared between kernels in different binary containers. | |
| Acceleration | bfgminer | Bitcoin mining application implemented on SDAccel platforms. | 80 megahashes/sec |
| | nearest_neighbor_linear_search | Optimized implementation of a nearest neighbor linear search algorithm. | 256 measurements/cycle; 37.5 gigameasurements/sec |
| | smithwaterman | Optimized implementation of the Smith-Waterman algorithm. The main characteristics of this application are (1) computing MaxScore and (2) a systolic array implementation. | |
| Security | aes_decrypt | Implementation of an AES-128 ECB encryption in software, followed by decryption written in OpenCL and targeting execution on an SDAccel supported FPGA acceleration card. | |
| | rsa | Implementation of an RSA decryption algorithm. | 1,024-bit cipher text length; 272,340 bytes/sec |
| | sha1 | Optimized implementation of the SHA1 secure hash algorithm targeting execution on an SDAccel | |
| | tiny_encryption | Implementation example of the Tiny Encryption Algorithm (TEA), which is a block cipher. | |
| Vision | Affine | Affine transformation is a linear mapping method that preserves points, straight lines, and planes. | 21.5 fps |
| | Convolve | The convolve example is a performant design that showcases convolutional image filtering. The example processes the image 8 pixels at a time. | 1,000 fps |
| | Edge_detection | Implementation of a Sobel filter for edge detection. | |
| | Histogram_codec | Optimized implementation of a 12-bit histogram equalizer targeting execution on an SDAccel supported FPGA acceleration card. | 333 fps |
| | Huffman_codec | Implementation of a Huffman encoding/decoding algorithm targeting execution on an SDAccel supported FPGA acceleration card. | |
| | Median_filter | Optimized implementation of a median filter used to remove noise in images. | 22,222 fps |
| | Watermarking | Optimized implementation of a watermarking application that adds watermarks to images. | 6,134 fps |
| Contributed Examples | ArrayFire – Fast Corner | Demo of FAST feature detection developed by ArrayFire. | |
| | Polito – K-Nearest Neighbor | K-Nearest Neighbor algorithm derived from the Rodinia benchmark suite. This project uses SDAccel to implement the k-Nearest Neighbor algorithm on a Xilinx FPGA. | Real-time throughput at 1.23 ms |
| | Polito – Black-Scholes Monte Carlo | Implements a Monte Carlo simulation of the Black-Scholes financial model for both European and Asian options. It contains an OpenCL C++ kernel to be mapped to the FPGA via SDAccel. It provides much better energy per operation than a GPU implementation at a comparable performance level. | 0.315 ns; 7.69 sims/joule |
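
The Sum_scan entry in the table above refers to a parallel prefix sum. As a rough illustration of the idea only (this is not the code of the SDAccel example, which is not reproduced on this page), the sketch below models a Hillis-Steele inclusive scan in plain Python. Each pass of the outer loop stands for one parallel step in which all elements could be updated at the same time; the function name `inclusive_scan` is hypothetical.

```python
def inclusive_scan(values):
    """Hillis-Steele inclusive prefix sum, modeled sequentially."""
    result = list(values)
    n = len(result)
    stride = 1
    while stride < n:
        # In a parallel implementation every index i is updated at once,
        # so each step must read from a snapshot of the previous step.
        previous = list(result)
        for i in range(stride, n):
            result[i] = previous[i] + previous[i - stride]
        stride *= 2
    return result


if __name__ == "__main__":
    print(inclusive_scan([3, 1, 7, 0, 4, 1, 6, 3]))
```

The scan finishes in about log2(n) steps, which is why this formulation suits hardware that can update many elements per step; a work-efficient variant would trade more steps for fewer additions.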

Built-in Platforms

| Board Name | Description | Devices Supported | Vendor |
|---|---|---|---|
| Xilinx® Kintex® UltraScale™ FPGA Acceleration Development Kit | The Kintex® UltraScale™ FPGA Acceleration Development Kit is an excellent starting point for hyperscale application developers. | Kintex UltraScale | Xilinx |
| ADM-PCIE-KU3 | The ADM-PCIE-KU3 is a high-performance reconfigurable half-length, low-profile x16 PCIe form factor board based on the Xilinx Kintex UltraScale range of platform FPGAs. | Kintex UltraScale | Alpha Data |
| ADM-PCIE-7V3 | The ADM-PCIE-7V3 is a high-performance reconfigurable half-length, low-profile x8 PCIe form factor board based on the Xilinx Virtex-7 range of platform FPGAs. | Virtex-7 | Alpha Data |

Platforms (Externally Provided)

| Board Name | Description | Devices Supported | Vendor |
|---|---|---|---|
| SB-850 | The SB-850 is a full-height, GPU-length PCI Express board featuring up to eight HMC devices and a single high-performance Xilinx UltraScale FPGA. | Kintex UltraScale | Micron Pico Computing |
| M-505-K325T | The business card-sized M-505-K325T is a powerful computing element composed of FPGA logic (with loading system), a local memory sub-system, and a fully switched PCIe x8 communication structure. | Kintex-7 | Micron Pico Computing |
| PEA-C8K0-060 | The PEA-C8K0-060 is a high-performance reconfigurable half-length, low-profile, single x8 PCI Express (PCIe) 3.0 form factor board based on Xilinx Kintex UltraScale FPGAs. It is ideal for demanding applications including high-performance computing, data processing, data center, and system modeling. | Kintex | COTS |
| PEA-C8K0-040 | The PEA-C8K0-040 is a high-performance reconfigurable half-length, low-profile, single x8 PCI Express (PCIe) 3.0 form factor board based on Xilinx Kintex UltraScale FPGAs. It is ideal for demanding applications including high-performance computing, data processing, data center, and system modeling. | Kintex | COTS |
| Semptian NSA-120 Accelerator Card | The Semptian NSA-120 provides a new Xilinx FPGA-based heterogeneous computing platform for big data analysis, cloud computing, and network application acceleration. It can be used for big data analysis, image recognition/processing, video encoding/decoding, data compression/decompression, data encryption/decryption, voice recognition, neural networks, machine learning, network security, and similar workloads. | Kintex | Semptian |

Key Documents

SDAccel QuickTake Video Tutorials

Fundamental Concepts of Application Host
The OpenCL standard for heterogeneous computing defines a programming model for transferring data between host processors and acceleration devices. This video provides an introduction to the minimum set of OpenCL APIs required for data transfer and control of accelerators on a device such as the FPGA.
N-Dimensional Kernel Range
One of the key concepts in OpenCL is the division of the application problem into a multi-dimensional problem space. This problem space is referred to as the N-Dimensional Kernel Range, and each block of it executes the same computation in parallel across all accelerators available in a device. This video introduces the N-Dimensional kernel range concept and its application to solving computation problems on parallel computing systems.
OpenCL Application Structure
The OpenCL standard for heterogeneous computing defines a basic programming model for all compute devices implementing the OpenCL standard. This video introduces the host code and kernel elements of an OpenCL application. The mapping of these elements to systems containing FPGA accelerator co-processing cards is explained.
OpenCL Memory Architecture
OpenCL defines a memory architecture and abstraction model that is common to all computing devices implementing the standard. This means that a programmer only has to learn one memory model, which simplifies application coding. This video provides an overview of the OpenCL memory model and how it is implemented in an FPGA acceleration device.
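
To make the N-Dimensional Kernel Range idea from the tutorial descriptions above more concrete, here is a small plain-Python model. It is a sketch under stated assumptions, not OpenCL host code: `run_ndrange` and `scale_image` are hypothetical names, and the loop runs work-items one after another where a real device would run them in parallel. The sketch enumerates a global range, derives the work-group and local IDs for each work-item, and calls the same kernel function once per work-item.

```python
from itertools import product


def run_ndrange(global_size, local_size, kernel):
    """Call kernel(global_id, group_id, local_id) once per work-item.

    global_size and local_size are tuples of equal length. As in OpenCL 1.x,
    each global dimension is assumed to be a multiple of the local dimension.
    """
    if len(global_size) != len(local_size):
        raise ValueError("dimension mismatch")
    if any(g % l for g, l in zip(global_size, local_size)):
        raise ValueError("global size must be a multiple of local size")
    for global_id in product(*(range(g) for g in global_size)):
        # Work-group index and position inside the work-group, per dimension.
        group_id = tuple(i // l for i, l in zip(global_id, local_size))
        local_id = tuple(i % l for i, l in zip(global_id, local_size))
        kernel(global_id, group_id, local_id)


def scale_image(image, factor):
    """Example use: every work-item scales one pixel of a 2-D image."""
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]

    def kernel(global_id, group_id, local_id):
        r, c = global_id
        out[r][c] = image[r][c] * factor

    # One work-group per image row (an arbitrary choice for illustration).
    run_ndrange((rows, cols), (1, cols), kernel)
    return out
```

Because each work-item touches only its own output element, the order in which the loop visits them does not affect the result, which is the property that lets a device execute them concurrently.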

Design Services

| Design Services Alliance Member | Description | Markets |
|---|---|---|
| Cluster Technology Limited | Cluster Tech specializes in the provision of advanced computing technology solutions and uses High Performance Computing, Cloud, Business Intelligence, and Financial Engineering to improve operational efficiency. | High Performance Computing, Cloud, Business Intelligence, Financial Engineering |
| Impulse Accelerated Technologies | Impulse Accelerated offers design services in which engineers work with design teams to optimize designs for FPGAs. Impulse excels at timely completion of complex designs that operate efficiently in real-world environments, working with target FPGAs and boards to ensure well-tested, fully integrated, and fully documented solutions. | Audio, Video & Broadcast; Automotive & Transport; Computing & Data Processing; Consumer; Defense/Aerospace; Industrial Scientific Medical |
| Irish Centre for High-End Computing (ICHEC) | ICHEC offers services to help clients enable, optimize, and deploy OpenCL-based software solutions on high-performance, low-energy Xilinx FPGAs. With a dynamic team of engineers with domain, systems, and software expertise, ICHEC offers design services in finance, energy, life sciences, and analytics. | Finance, Energy, Life Sciences, Analytics |
| Instigate Design | Instigate Design specializes in system-level design of electronic systems, EDA-specific software design, and parallel programming. Design services range from software design and quality assurance to comprehensive application engineering with an emphasis on audio/video coding and communication. | High Performance Computing |
| MulticoreWare | MulticoreWare develops and licenses a wide range of computer vision and video processing libraries, while also providing design services to Xilinx customers. | Audio, Video & Broadcast; Automotive & Transport |
| ArrayFire | ArrayFire is an industry leader in high-performance computing software development and coding services. | Defense/Aerospace; Consumer; Industrial Scientific Medical |