As a team that builds data-powered software for ecommerce-focused companies, we are always on the lookout for new development practices to incorporate into our workflow.
One such advancement over the years has been the widespread adoption of GPUs for efficient deep learning.
Given our widespread use of Amazon Web Services (AWS) for infrastructure, it was natural to start with the GPU EC2 instances that AWS has been pushing in recent years.
Although there are many posts on how to set up a GPU instance for deep learning, the information is scattered so widely across them that reading through the details every time became a pain. So, with that said, here is yet another post on how to set up an EC2 instance with GPU support for deep learning.
This is a very opinionated setup guide, specific to AWS g2 instances (with NVIDIA cards), running Ubuntu 14.04 LTS.
The script installs Cuda Toolkit 7.5 and cuDNN 5.1 — the latest stable releases at the time of writing this post.
TL;DR — run this
```bash
#!/bin/bash

# working directory
mkdir -p ~/packages && cd ~/packages

# system requirements
sudo apt-get update
sudo apt-get upgrade
sudo apt-get install -y build-essential git unzip \
    pkg-config cmake libopenblas-dev \
    linux-headers-generic linux-image-extra-virtual

# Cuda
wget http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1404/x86_64/cuda-repo-ubuntu1404_7.5-18_amd64.deb
sudo dpkg -i cuda-repo-ubuntu1404_7.5-18_amd64.deb
sudo apt-get update
sudo apt-get install -y cuda

echo 'export CUDA_HOME=/usr/local/cuda-7.5' >> ~/.bashrc
echo 'export CUDA_ROOT=/usr/local/cuda-7.5' >> ~/.bashrc
echo 'export PATH=$CUDA_ROOT/bin:$PATH' >> ~/.bashrc
echo 'export LD_LIBRARY_PATH=$CUDA_ROOT/lib64:$CUDA_ROOT/extras/CUPTI/lib64:$LD_LIBRARY_PATH' >> ~/.bashrc
source ~/.bashrc

# cuDNN (the download requires an NVIDIA developer account -- see the note below)
wget https://developer.nvidia.com/compute/machine-learning/cudnn/secure/v5.1/prod/7.5/cudnn-7.5-linux-x64-v5.1-tgz -O cudnn-7.5-linux-x64-v5.1.tgz
tar -zxf cudnn-7.5-linux-x64-v5.1.tgz
sudo cp -R cuda/lib64/* $CUDA_ROOT/lib64/
sudo cp cuda/include/cudnn.h $CUDA_ROOT/include/

# reboot required!
sudo reboot

## that's it!
```
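After the reboot, `nvidia-smi` and `nvcc --version` are the usual sanity checks. If you want to confirm that the libraries actually landed where the `LD_LIBRARY_PATH` exports above point, here is a small sketch of such a check in Python (stdlib only; the helper name `find_cuda_libs` is my own, not part of any toolkit):

```python
import glob
import os


def find_cuda_libs(dirs=None):
    """Scan candidate directories for CUDA runtime / cuDNN libraries.

    `dirs` defaults to the entries of LD_LIBRARY_PATH, which the
    install script above appends to ~/.bashrc.
    """
    if dirs is None:
        dirs = os.environ.get("LD_LIBRARY_PATH", "").split(os.pathsep)
    found = []
    for d in dirs:
        if not d:
            continue
        # libcudart* is the CUDA runtime; libcudnn* is cuDNN
        for pattern in ("libcudart*", "libcudnn*"):
            found.extend(glob.glob(os.path.join(d, pattern)))
    return sorted(found)


if __name__ == "__main__":
    libs = find_cuda_libs()
    if libs:
        print("CUDA libraries found:")
        for lib in libs:
            print(" ", lib)
    else:
        print("no CUDA libraries on LD_LIBRARY_PATH -- check ~/.bashrc")
```

An empty result usually means the `source ~/.bashrc` step did not take effect in the current shell.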
note: the cuDNN download requires an NVIDIA developer account. Sign up for a free account and download the cuDNN v5.1 Library for Linux file here.
Verify with TensorFlow
```shell
# miniconda setup
$ wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh
$ bash Miniconda3-latest-Linux-x86_64.sh
$ source ~/.bashrc

# create environment with tensorflow
$ conda create --name tensorflow-test python=3
$ source activate tensorflow-test
$ pip install https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.10.0-cp35-cp35m-linux_x86_64.whl

# test with python
$ python
>>> import tensorflow as tf
>>> hello = tf.constant('Hello, TensorFlow!')
>>> sess = tf.Session()
>>> print(sess.run(hello))
# this should print a few messages like
# "opening CUDA libraries" and
# "creating TensorFlow device (/gpu:0)"
```
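If you'd rather check programmatically than eyeball the startup messages, something like the sketch below can report whether TensorFlow sees a GPU. Note that `device_lib` is an internal module in these early TensorFlow releases, so treat this as a best-effort probe rather than a stable API; the helper name `gpu_available` is my own:

```python
def gpu_available():
    """Best-effort check for a GPU device visible to TensorFlow.

    Uses tensorflow.python.client.device_lib, an internal module in
    early TensorFlow releases such as 0.10. Returns False when
    TensorFlow is not installed at all.
    """
    try:
        from tensorflow.python.client import device_lib
    except ImportError:
        return False
    return any(d.device_type == "GPU"
               for d in device_lib.list_local_devices())


if __name__ == "__main__":
    print("GPU visible to TensorFlow:", gpu_available())
```

A `False` here on the GPU instance usually points at the Cuda/cuDNN libraries not being on `LD_LIBRARY_PATH` when Python was launched.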
Update: To use your newly configured GPU instance on EC2, refer to our post about maintaining persistence across spot instances to optimize for both cost and continuity.