As a team that builds data-powered software for ecommerce-focussed companies, we are always looking for new development practices to incorporate into our workflow.

One such advancement over the years has been the widespread adoption of GPUs for efficient deep learning.

Given that we use Amazon Web Services (AWS) widely for our infrastructure needs, it was natural to start with the GPU EC2 instances that AWS has been promoting in recent years.

Many posts explain how to set up a GPU instance for deep learning, but the information is scattered across them, and reading through the details every time became a pain. So here is yet another post on how to set up an EC2 instance with GPU support for deep learning.

Instance specifics

This is a very opinionated setup guide, specific to AWS g2 instances (with NVIDIA cards), running Ubuntu 14.04 LTS.

The script installs CUDA Toolkit 7.5 and cuDNN 5.1, the latest stable releases at the time of writing.
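Because the guide is tied to one Ubuntu release, it can help to confirm the release before running anything. The following is a minimal sketch, not part of the original script; `check_release` is a hypothetical helper name, and the check assumes `lsb_release` is available (it ships with stock Ubuntu images).

```shell
#!/bin/bash
# Pre-flight sketch: compare the running Ubuntu release with the one this guide targets.
# check_release is a hypothetical helper, not part of the original setup script.

check_release() {
    # $1: expected release (e.g. 14.04), $2: release actually found
    if [ "$1" = "$2" ]; then
        echo "ok: Ubuntu $2"
    else
        echo "warning: expected Ubuntu $1, found ${2:-unknown}"
        return 1
    fi
}

# lsb_release -rs prints only the release number; fall back to empty if it is missing.
actual="$(lsb_release -rs 2>/dev/null || true)"
check_release "14.04" "$actual" || true
```

A warning here does not mean the setup will fail, only that the package names and repository URLs used below may not match your system.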

TL;DR — run this

#!/bin/bash

# working directory
mkdir -p ~/packages && cd ~/packages

# System Requirements
sudo apt-get update
sudo apt-get upgrade -y
sudo apt-get install -y build-essential git unzip \
    pkg-config cmake libopenblas-dev \
    linux-headers-generic linux-image-extra-virtual

# Cuda
wget http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1404/x86_64/cuda-repo-ubuntu1404_7.5-18_amd64.deb
sudo dpkg -i cuda-repo-ubuntu1404_7.5-18_amd64.deb
sudo apt-get update
sudo apt-get install -y cuda
echo 'export CUDA_HOME=/usr/local/cuda-7.5' >> ~/.bashrc
echo 'export CUDA_ROOT=/usr/local/cuda-7.5' >> ~/.bashrc
echo 'export PATH=$CUDA_ROOT/bin:$PATH' >> ~/.bashrc
echo 'export LD_LIBRARY_PATH=$CUDA_ROOT/lib64:$CUDA_ROOT/extras/CUPTI/lib64:$LD_LIBRARY_PATH' >> ~/.bashrc
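# Ubuntu's default ~/.bashrc returns early in non-interactive shells,
# so set CUDA_ROOT directly for the remaining steps of this script
export CUDA_ROOT=/usr/local/cuda-7.5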

# cuDNN
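# the download needs a logged-in NVIDIA developer session; if wget fails,
# download the archive in a browser and copy it into ~/packages
wget -O cudnn-7.5-linux-x64-v5.1.tgz https://developer.nvidia.com/compute/machine-learning/cudnn/secure/v5.1/prod/7.5/cudnn-7.5-linux-x64-v5.1-tgz
tar -zxf cudnn-7.5-linux-x64-v5.1.tgz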
sudo cp -R cuda/lib64/* $CUDA_ROOT/lib64/
sudo cp cuda/include/cudnn.h $CUDA_ROOT/include/

# reboot required!
sudo reboot

## that's it!

Note: the cuDNN download requires an NVIDIA developer account. Sign up for a free account and download the cuDNN v5.1 Library for Linux file from the NVIDIA developer site.
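Once the instance is back up after the reboot, a quick check of the driver, toolkit, and cuDNN header can catch a broken install before any framework is involved. This is a sketch under assumptions: `cudnn_version` is a hypothetical helper that reads the `CUDNN_MAJOR`, `CUDNN_MINOR`, and `CUDNN_PATCHLEVEL` defines from `cudnn.h`, and the default path assumes the `/usr/local/cuda-7.5` location used in the script above.

```shell
#!/bin/bash
# Post-reboot sanity check (a sketch; cudnn_version is a hypothetical helper).

# Print the cuDNN version encoded in a cudnn.h header as MAJOR.MINOR.PATCH.
cudnn_version() {
    awk '
        $1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
        $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
        $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
        END { if (major != "") print major "." minor "." patch }
    ' "$1"
}

# Assumed install location, matching the setup script.
CUDA_ROOT="${CUDA_ROOT:-/usr/local/cuda-7.5}"

# Driver: nvidia-smi lists the GPU if the NVIDIA kernel module loaded.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi || echo "nvidia-smi failed"
else
    echo "nvidia-smi not found"
fi

# Toolkit: nvcc reports the CUDA release it belongs to.
if [ -x "$CUDA_ROOT/bin/nvcc" ]; then
    "$CUDA_ROOT/bin/nvcc" --version || echo "nvcc failed"
else
    echo "nvcc not found under $CUDA_ROOT"
fi

# cuDNN: read the version from the header copied during setup.
if [ -f "$CUDA_ROOT/include/cudnn.h" ]; then
    echo "cuDNN $(cudnn_version "$CUDA_ROOT/include/cudnn.h")"
else
    echo "cudnn.h not found under $CUDA_ROOT/include"
fi
```

If any of the three checks reports a missing file or command, it is usually quicker to rerun the matching section of the setup script than to debug a framework import error later.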

Verify with TensorFlow

If you want to verify that your setup is working correctly, here is another short (also opinionated) script to test TensorFlow installed via Miniconda.

# miniconda setup
$ wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh
$ bash Miniconda3-latest-Linux-x86_64.sh
$ source ~/.bashrc

# create environment with tensorflow (the wheel below is built for Python 3.5)
$ conda create --name tensorflow-test python=3.5
$ source activate tensorflow-test
$ pip install https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.10.0-cp35-cp35m-linux_x86_64.whl

# test with python
$ python
>>> import tensorflow as tf
>>> hello = tf.constant('Hello, TensorFlow!')
>>> sess = tf.Session()
>>> print(sess.run(hello))
# this should print a few log messages like
# "opening CUDA libraries" and
# "Creating TensorFlow device (/gpu:0)"
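The interactive check above relies on reading the log output by eye. A rough way to turn it into a pass/fail test is to scan that output for the device-creation message. This is only a sketch: `gpu_detected` and `run_gpu_check` are hypothetical helper names, the pattern is a loose, case-insensitive match on the message quoted above, and `run_gpu_check` assumes the `tensorflow-test` environment is already active.

```shell
#!/bin/bash
# Sketch: scripted version of the manual TensorFlow check.
# gpu_detected and run_gpu_check are hypothetical helpers.

# Succeeds if the log text on stdin mentions creating a TensorFlow GPU device.
gpu_detected() {
    grep -qiE 'creating tensorflow device.*gpu'
}

# Runs a trivial session and inspects its log output (TensorFlow logs to stderr).
# Call this only inside the environment where TensorFlow is installed.
run_gpu_check() {
    if python -c 'import tensorflow as tf; tf.Session().run(tf.constant(0))' 2>&1 | gpu_detected; then
        echo "GPU device created"
    else
        echo "no GPU device message found"
        return 1
    fi
}
```

After `source activate tensorflow-test`, calling `run_gpu_check` prints one of the two messages and returns a matching exit status, which makes it usable from a provisioning script.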

Update: To use your newly configured GPU instance on EC2, refer to our post about maintaining persistence across spot instances to optimize for both cost and continuity.