TensorFlow with GPU

Python

Virtual Environment

To use TensorFlow with GPU support, first create a virtual environment. A virtual environment can be reused, so you typically only have to do this once.

# start an interactive session
srun --pty bash -l

# load the python module
module load python/booth/3.6/3.6.3

# create a new virtual environment
virtualenv --system-site-packages -p python3 ~/venv/tensorflow-gpu

# activate the virtual environment
source ~/venv/tensorflow-gpu/bin/activate

# upgrade pip (inside the venv)
pip install --upgrade pip

# ensure tensorflow is up to date (inside the venv)
pip install --upgrade tensorflow

# install tensorflow-gpu (inside the venv)
pip install tensorflow-gpu

# when finished, leave the venv
deactivate

# log out of the compute node
exit
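Before installing packages, it can help to confirm that the active interpreter really is the one inside the virtual environment. The following is a minimal sketch, not part of the original instructions; the function name in_virtual_environment is hypothetical. It checks sys.real_prefix, which older virtualenv releases set, and otherwise compares sys.prefix with sys.base_prefix.

```python
import sys


def in_virtual_environment():
    """Return True when the running interpreter belongs to a virtual environment."""
    # Older virtualenv releases set sys.real_prefix.
    if hasattr(sys, "real_prefix"):
        return True
    # venv and newer virtualenv releases make sys.base_prefix differ from sys.prefix.
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("inside a virtual environment:", in_virtual_environment())
```

Run it after the activate step; the interpreter path printed should point into ~/venv/tensorflow-gpu.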

To verify that TensorFlow works and uses the GPU, run a short test script on a GPU node. Create the example file below, then execute it from an interactive session on a GPU node.

example.py
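import tensorflow as tf

# note: tf.Session and tf.ConfigProto belong to the TensorFlow 1.x API

# pin the matrix multiplication to the first GPU
with tf.device('/gpu:0'):
    a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
    b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
    c = tf.matmul(a, b)

# log_device_placement=True logs the device each operation runs on
with tf.Session(config=tf.ConfigProto(log_device_placement=True)) as sess:
    print(sess.run(c))

# request a node with one GPU in interactive mode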
srun --partition=gpu --gres=gpu:1 --pty bash -l

# load the python module
module load python/booth/3.6/3.6.3

# activate the virtual environment created in the previous step
source ~/venv/tensorflow-gpu/bin/activate

# run the example script
python example.py

# leave the venv
deactivate

# log out of the node
exit
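The example above relies on tf.Session, which is only available at the top level in TensorFlow 1.x. If pip installed a 2.x release, a version-independent check may be more convenient. The sketch below is an assumption-laden alternative, not the documented procedure: describe_tensorflow_gpus is a hypothetical helper that uses tf.config.list_physical_devices where it exists (TensorFlow 2.1 and later) and falls back to device_lib.list_local_devices from tensorflow.python.client otherwise. It returns (None, []) when TensorFlow is not installed.

```python
import importlib.util


def describe_tensorflow_gpus():
    """Return (tensorflow_version, gpu_device_names).

    Returns (None, []) when TensorFlow is not importable.
    """
    if importlib.util.find_spec("tensorflow") is None:
        return None, []

    import tensorflow as tf

    # Public API in TensorFlow 2.1+; may be missing in older releases.
    list_devices = getattr(getattr(tf, "config", None), "list_physical_devices", None)
    if list_devices is not None:
        names = [device.name for device in list_devices("GPU")]
    else:
        # Fallback for older releases.
        from tensorflow.python.client import device_lib
        names = [
            device.name
            for device in device_lib.list_local_devices()
            if device.device_type == "GPU"
        ]
    return tf.__version__, names


if __name__ == "__main__":
    version, gpus = describe_tensorflow_gpus()
    print("TensorFlow version:", version)
    print("GPUs visible to TensorFlow:", gpus)
```

An empty GPU list on a GPU node suggests that the installed package or the loaded modules do not match the node's CUDA setup.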

Finally, to submit the example file as a batch job, create a submit script and submit it using the sbatch command.

submit.sh
#!/bin/bash

#---------------------------------------------------------------------------------
# Accounting information

#SBATCH --account=basic

#---------------------------------------------------------------------------------
# Resources requested

#SBATCH --partition=gpu
#SBATCH --gres=gpu:1

#---------------------------------------------------------------------------------
# Job specific name (helps organize and track progress of jobs)

#SBATCH --job-name="tf-gpu_example"

#---------------------------------------------------------------------------------
# Commands to execute

# load the python module
module load python/booth/3.6/3.6.3

# activate the virtual environment
source ~/venv/tensorflow-gpu/bin/activate

# run the example script
srun python3 example.py

Submit the job by typing: sbatch submit.sh
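To confirm from inside a batch job that a GPU was actually allocated, a script can inspect its environment before starting any TensorFlow work. The sketch below assumes that Slurm on this cluster exports CUDA_VISIBLE_DEVICES for jobs submitted with --gres=gpu:1, which is common but depends on the site configuration; allocated_gpu_ids is a hypothetical helper name.

```python
import os


def allocated_gpu_ids(env=None):
    """Return the GPU identifiers listed in CUDA_VISIBLE_DEVICES, as strings."""
    env = os.environ if env is None else env
    raw = env.get("CUDA_VISIBLE_DEVICES", "")
    # Split the comma-separated list and drop empty entries.
    return [item.strip() for item in raw.split(",") if item.strip()]


if __name__ == "__main__":
    print("Slurm job id:", os.environ.get("SLURM_JOB_ID", "(not in a Slurm job)"))
    print("visible GPUs:", allocated_gpu_ids() or "(none)")
```

Calling this at the top of example.py, or running it as a separate step in the submit script, makes a missing allocation visible in the job output.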