Video — Transformer training shootout: AWS Trainium vs. NVIDIA A10G

In this video, I compare the cost-performance of AWS Trainium, a new custom chip designed by AWS, with NVIDIA A10G GPUs.

I first launch a trn1.32xlarge instance (16 Trainium chips) and a g5.48xlarge instance (8 A10G GPUs). Then, I run a natural language processing job on each, fine-tuning the BERT Large model on the full Yelp review dataset. I use the BF16 data format with the maximum sequence length supported by the model (512). The results? The Trainium job is 5x faster. As the trn1 instance is only 30% more expensive, this is a huge improvement in cost-performance!
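To make the cost-performance claim concrete, here is a minimal sketch of the arithmetic. It uses only the two ratios quoted above (5x faster, 30% more expensive) and assumes that you pay per hour of instance time, so the cost of a job is the hourly price multiplied by the job duration. The function names are mine, purely for illustration, and no actual instance prices are used.

```python
def relative_cost_per_job(speedup: float, price_ratio: float) -> float:
    """Cost of the job on the faster instance, relative to the baseline.

    speedup: how many times faster the job runs (5.0 in the video).
    price_ratio: hourly price relative to the baseline (1.3 for 30% more).
    Assumes cost = hourly price x job duration.
    """
    return price_ratio / speedup


def cost_performance_gain(speedup: float, price_ratio: float) -> float:
    """Work done per dollar, relative to the baseline."""
    return speedup / price_ratio


# Ratios quoted in the post: trn1.32xlarge vs. g5.48xlarge
speedup = 5.0
price_ratio = 1.3

rel_cost = relative_cost_per_job(speedup, price_ratio)
gain = cost_performance_gain(speedup, price_ratio)

print(f"Relative cost per job: {rel_cost:.2f}")        # 0.26
print(f"Savings per job: {(1 - rel_cost) * 100:.0f}%")  # 74%
print(f"Cost-performance gain: {gain:.2f}x")            # 3.85x
```

In other words, under these assumptions the same fine-tuning job costs roughly a quarter of what it costs on the g5 instance, or about 3.8x more training per dollar.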



Chief Evangelist, Hugging Face (
