.. include::

Additional Resources
--------------------

GPU Jobs
########

Mercury currently has, as of November 2016, 8 logical Nvidia K80 GPUs in 2
servers. Each logical GPU has 2496 CUDA cores and 12GB of dedicated high-speed
memory, which can dramatically accelerate massively parallel code blocks. These
GPUs are available via the "gpu" partition, which has a 2-day wallclock limit.

GPUs are currently accessible to any application; however, only Matlab and R
are documented here. Further application documentation will be added as time
allows. If you need to use a GPU with something that is not documented here,
please contact Research.Support@chicagobooth.edu.

Jupyter
#######

Jupyter is an open-source web-based user interface for Python. For a general
overview, including installation instructions, see the `online documentation`_.

.. _online documentation: https://jupyterlab.readthedocs.io/en/stable/index.html

.. figure:: _static/jupyterlab.png
   :scale: 55%

   Jupyter provides a user-friendly interface for Python.

Interactive Mode
$$$$$$$$$$$$$$$$

1. Connect to the Booth VPN
2. Log in to a compute node in interactive mode

   :code:`srun --account= --pty bash --login`

3. Load Python if necessary

   :code:`module load python/booth/3.8`

4. Prepare the environment

   :code:`unset XDG_RUNTIME_DIR`

5. Start the desired Jupyter backend

   :code:`jupyter notebook --no-browser --ip=$(hostname -i)`

   or

   :code:`jupyter lab --no-browser --ip=$(hostname -i)`

   .. note::
      It is important to start Jupyter with ``--no-browser`` so that the
      graphical interface is not launched over the network; running the
      browser remotely results in slow, laggy performance.

6. Copy the URL shown in the terminal and paste it into a local web browser
   (e.g. Firefox, Chrome). The URL should look something like this:

   .. code-block:: console

      http://10.xx.xxx.xxx:8888/?token=9c9b7fb3885a5b6896c959c8a945773b8860c6e2e0bad629

When finished, shut down Jupyter:

7. On the web interface, navigate to File |rarr| Save Notebook
8. On the web interface, navigate to File |rarr| Close and Shutdown
9. In the terminal, shut down the notebook server by pressing :code:`Ctrl-C`
10. Exit your interactive session by typing :code:`exit`

Batch Mode
$$$$$$$$$$

Jupyter notebooks can also be run on Mercury in batch mode, without a
graphical interface. For example, for the notebook `test_run.ipynb`:

.. code-block:: console

   $ jupyter nbconvert --to notebook --execute test_run.ipynb

Once the batch job has completed, view the results by logging into
https://jupyter.chicagobooth.edu with your Booth ID. The completed notebook
with output cells will be named `test_run.nbconvert.ipynb`.

.. Working With Large Datasets
.. ###########################

RStudio Servers
###############

RStudio servers are available for light computational prototyping. These
servers are physically separate from Mercury, but they do provide access to
the files in your home directory. The servers are available at
:code:`rstudio-research.chicagobooth.edu`.

.. note::
   Do not fill your home directory to its maximum capacity, otherwise you
   will be locked out of RStudio. You can check its available space in the R
   console with the command ``system('df -h ~')``

Because the RStudio servers are hosted on lightweight virtual machines, they
should not be used for heavy computations. However, it is possible to offload
heavier computations from :code:`rstudio-research.chicagobooth.edu` to Mercury
compute nodes via `batchtools`_. The code below shows an example of submitting
jobs to Mercury from the RStudio server.

.. _`batchtools`: https://cran.r-project.org/web/packages/batchtools/index.html

.. code-block:: R
   :linenos:

   library(batchtools)

   myFct <- function(x) {
     print(paste0("x: ", x))
     print(paste0("SLURM_JOB_ID: ", Sys.getenv("SLURM_JOB_ID")))
     print(paste0("SLURM_JOB_USER: ", Sys.getenv("SLURM_JOB_USER")))
     print(paste0("PID: ", Sys.getpid()))
   }

   # create a registry (folder) at the location given by file.dir
   reg <- makeRegistry(file.dir="myregdir")

   # map my function over arguments; a list works as well, e.g.
   # Njobs <- list('a', 'b', 'c', 'd')
   Njobs <- 1:4
   ids <- batchMap(fun=myFct, x=Njobs, reg=reg)

   # define sbatch submit preferences for Mercury and submit jobs
   submit_prefs <- list(ncpus=1, time="0-00:15:00", mem="4G",
                        account='basic', partition='standard')
   done <- submitJobs(ids, reg=reg, resources=submit_prefs)

   # Wait until jobs are completed
   waitForJobs()

.. note::
   The ``resources`` argument to ``submitJobs`` uses a different naming
   convention from the usual Slurm parameters due to R naming conventions.
   Note that `ncpus` corresponds to the Slurm parameter `cpus-per-task`.

Remote Mathematica Kernel on Mercury
####################################

The university has a Mathematica `site-license`_ that is available to all
faculty, students, and staff. It is possible to run an interactive Mathematica
notebook on a local machine while offloading the heavy computations to
Mercury. Doing so requires being on wired ethernet at Harper or having a
public-facing, routable IP address.

.. _`site-license`: https://uchicago.service-now.com/it?id=kb_article&sys_id=ec0c24c7db5117c842a35bc0cf96191d

.. note::
   This method requires being on a wired ethernet connection at Harper Center

1. Open a new or existing Mathematica notebook on your local computer (e.g.
   laptop or desktop)
2. Navigate to Evaluation |rarr| Kernel Configuration Options |rarr| Add
3. Under Advanced Options – Arguments to MLOpen:

   .. code-block:: console

      -LinkMode Listen -LinkProtocol TCPIP -LinkOptions MLDontInteract

4. Under Advanced Options – Launch command:

   .. code-block:: console

      `java` -jar "`wolframssh`" @mercury.chicagobooth.edu srun --mem=16G /apps/mathematica11/Executables/wolfram -wstp -LinkMode Connect -LinkProtocol TCPIP -LinkName "`linkname`" -LinkHost `ipaddress`

5. Click OK
6. Start the remote kernel by navigating to Evaluation |rarr| Start Kernel
   |rarr| mercury (or whatever you named it)
7. When you are done, please quit the kernel so that the compute node is
   cleaned up, by navigating to Evaluation |rarr| Quit Kernel |rarr| mercury

.. MPI
.. ###

Cron-like jobs
##############

Cron jobs persist until they are canceled or encounter an error. Mercury has a
dedicated partition, cron, for running Cron-like jobs. Please email
Research.Support@chicagobooth.edu to request permission to submit Cron-like
jobs. These jobs are subject to scheduling limits and will be monitored.

Here is an example of an sbatch script that runs a Cron job:

.. code-block:: bash
   :caption: cron.sbatch
   :linenos:

   #!/bin/bash

   #SBATCH --time=00:05:00
   #SBATCH --output=cron.log
   #SBATCH --open-mode=append
   #SBATCH --account=systems
   #SBATCH --partition=cron

   # Specify a valid Cron string for the schedule. This specifies that
   # the Cron job run once per day at 5:15a.
   SCHEDULE='15 5 * * *'

   # Here is an example of a simple command that prints the host name and
   # the date and time.
   echo "Hello on $(hostname) at $(date)."

   # This schedules the next run.
   sbatch --quiet --begin=$(next-cron-time "$SCHEDULE") cron.sbatch

After executing a simple command (printing the host name, date, and time), the
script schedules its next run with another call to ``sbatch`` using the
``--begin`` option.

Booth AWS Environment
#####################

Users wanting to take advantage of AWS command line tools will need
appropriate security credentials. Booth's AWS environment allows users to
obtain temporary security credentials that are valid for one hour. To
facilitate generating these credentials, we recommend using the `aws-adfs`
open source tool.
`aws-adfs` is available on our Mercury cluster and can also be installed via
pip if needed. The basic syntax is below:

.. code-block:: bash

   # Log into the ADFS host
   # username: gsb.uchicago.edu\
   # password:
   aws-adfs login --adfs-host=bushadfs01.chicagobooth.edu --role-arn arn:aws:iam:::role/

A full session on Mercury would look like this:

.. code-block:: bash

   # connect to the Mercury computing cluster
   [localhost] $ ssh mercury.chicagobooth.edu

   # request an interactive session on a compute node
   [mfe01] $ srun --account= --pty bash --login

   # load the python and awscli modules
   [mcn01] $ module load python/booth/3.8
   [mcn01] $ module load awscli/2.10/2.10.3

   # Log into the ADFS host
   # username: gsb.uchicago.edu\
   # password:
   [mcn01] $ aws-adfs login --adfs-host=bushadfs01.chicagobooth.edu --role-arn arn:aws:iam:::role/

   # Temporary security credentials should now allow you access to AWS resources
   [mcn01] $ aws s3 ls s3://

After obtaining the temporary security credentials, it is possible to access
files on AWS S3 using Python's boto3 library. The following code demonstrates
a few useful commands.

.. code-block:: python

   import boto3

   s3client = boto3.client('s3')
   s3resource = boto3.resource('s3')

   # Retrieve the list of existing buckets in the AWS account
   response = s3client.list_buckets()

   # Display the bucket names
   print('Existing buckets:')
   for bucket in response['Buckets']:
       print(f'  {bucket["Name"]}')

   # Specify a bucket
   bucketname = "test-protected-raw-data"

   # Display the bucket contents
   for bucket_object in s3resource.Bucket(bucketname).objects.all():
       print(bucket_object)

   # Specify a file
   file_to_read = "elev/junk22.csv"

   # Create a file object using the bucket and object key
   fileobj = s3client.get_object(
       Bucket=bucketname,
       Key=file_to_read
       )

   # Print the file line by line
   for i in fileobj['Body'].iter_lines():
       print(i.decode('utf-8'))

   # Alternatively, use pandas for tabular data
   import pandas as pd
   import io

   fileobj = s3client.get_object(
       Bucket=bucketname,
       Key=file_to_read
       )

   df = pd.read_csv(io.BytesIO(fileobj['Body'].read()))
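The ``io.BytesIO`` pattern above can be tried locally without any AWS
credentials. The sketch below uses made-up CSV bytes (the column names and
values are illustrative, not from a real bucket) standing in for the result of
``fileobj['Body'].read()``, to show that pandas reads in-memory bytes just
like a file:

.. code-block:: python

   import io
   import pandas as pd

   # Stand-in for the bytes that fileobj['Body'].read() would return;
   # no AWS access is needed for this local check.
   csv_bytes = b"elev,site\n120,a\n340,b\n"

   # read_csv accepts any file-like object, including an in-memory buffer
   df = pd.read_csv(io.BytesIO(csv_bytes))

   print(df.shape)          # (2, 2)
   print(list(df.columns))  # ['elev', 'site']

Testing the parsing step this way, before requesting credentials and a compute
node, can save a round trip when debugging a malformed CSV.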