Build on Trainium

A $110M investment program to accelerate AI research and education with AWS Trainium

What is Build on Trainium?

Build on Trainium is a $110M investment program focused on AI research and university education to support the next generation of innovation and development on AWS Trainium. AWS Trainium is an AI systolic array chip uniquely designed for advancing state-of-the-art AI ideas and applications. Build on Trainium funds novel AI research on Trainium, investing in leading academic teams to build innovations in critical areas including new model architectures, ML libraries, optimizations, large-scale distributed systems, and more. This multi-year initiative lays the foundation for the future of AI by inspiring the academic community to leverage, invest in, and contribute to the open-source community around Trainium. Combined with the Neuron software development kit (SDK) and the recently launched Neuron Kernel Interface (NKI), these benefits let Trainium customers innovate at scale in the cloud.

AWS Trainium research cluster

We have created a dedicated Trainium research cluster with up to 40,000 Trainium chips, available through Amazon EC2 Trn1 instances connected on a single non-blocking, petabit-scale network using Amazon EC2 UltraClusters. Research teams and students can access these chips through self-managed reservations using Amazon EC2 Capacity Blocks for ML.

Amazon Research Awards

We are conducting multiple rounds of Amazon Research Awards (ARA) calls for proposals (CFPs) open to the broad research community, with selected proposals receiving AWS Trainium credits and access to the Trainium research cluster. Build on Trainium welcomes research proposals that leverage popular open-source ML libraries and frameworks and contribute back to open source to enhance resources for the ML developer community.

Neuron Kernel Interface

Neuron Kernel Interface (NKI) is a new programming interface for the AWS AI chips Trainium and Inferentia. NKI provides direct access to the hardware primitives and instructions available on these chips, enabling researchers to build and tune compute kernels for optimal performance. It is a Python-based programming environment that adopts commonly used Triton-like syntax and tile-level semantics. Researchers can use NKI to enhance deep learning models with new functionality, optimizations, and scientific innovations. Please visit the NKI documentation page to learn more.
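
To give a rough sense of what "tile-level semantics" means, here is a minimal plain-Python sketch. It is not NKI code and uses no NKI APIs. It adds two matrices one tile of rows at a time, mirroring the load, compute, store pattern that tile-based kernel languages expose. The function name `tiled_add` and the tile height of 128 rows are illustrative assumptions; the NKI documentation is the place to look for the real interface and hardware tile limits.

```python
from typing import List

Matrix = List[List[float]]

# Illustrative tile height; an assumption for this sketch, not a value from the NKI docs.
TILE_ROWS = 128


def tiled_add(a: Matrix, b: Matrix, tile_rows: int = TILE_ROWS) -> Matrix:
    """Add two equally shaped matrices one row-tile at a time."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("inputs must have the same shape")
    out: Matrix = []
    for start in range(0, len(a), tile_rows):
        # "Load" one tile of each input: a block of up to tile_rows rows.
        a_tile = a[start:start + tile_rows]
        b_tile = b[start:start + tile_rows]
        # Compute on that tile only, then "store" the result tile.
        out_tile = [[x + y for x, y in zip(ra, rb)]
                    for ra, rb in zip(a_tile, b_tile)]
        out.extend(out_tile)
    return out


if __name__ == "__main__":
    a = [[float(r * 4 + c) for c in range(4)] for r in range(300)]
    b = [[1.0] * 4 for _ in range(300)]
    result = tiled_add(a, b)
    print(len(result), result[-1])
```

In a real kernel, each tile would be moved into on-chip memory before the computation and written back afterward; the loop over row blocks above stands in for that data movement.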

Benefits

Get access to dedicated AWS Trainium research clusters and use world-class AI hardware and scalable cloud infrastructure to power your most ambitious research projects.
Build innovative, optimized compute kernels that outperform existing architectures and techniques, pushing the boundaries of generative AI research and open-source innovation. Tune the most critical or differentiated parts of your models with highly optimized kernels.
Get started easily with the Neuron SDK, which integrates seamlessly with PyTorch and JAX. Neuron Kernel Interface's Python-based programming environment adopts commonly used Triton-like syntax to help you ramp up quickly.
Collaborate with AWS experts and the broader research community to amplify the real-world impact of your work.

Participating Universities

Here is how leading universities are benefiting from the Build on Trainium program.

  • University of California, Berkeley

    Trainium is beyond programmable—not only can you run a program, you get low-level access to tune features of the hardware itself. The knobs of flexibility built into the architecture at every step make it a dream platform from a research perspective. AWS is really enabling unexpected innovation. I walk across the lab and every project needs compute cluster resources for something different. The Build on Trainium resources will be immensely useful—from day-to-day work, to the deep research we do in the lab.

    Christopher Fletcher, Associate Professor of Computer Science, University of California, Berkeley
  • Carnegie Mellon University

    AWS’s new Build on Trainium initiative enables our faculty and students large-scale access to modern accelerators, like AWS Trainium, with an open programming model, and allows us to greatly expand our research on tensor program compilation, ML parallelization, and language model serving and tuning.

    Todd C. Mowry, Professor of Computer Science, Carnegie Mellon University