Introduction to Boston University’s Shared Computing Cluster (SCC) Aaron D. Fuegi [email protected] Research Computing Services Information Services & Technology Boston University.
11/7/2015

Outline
- What is the Shared Computing Cluster (SCC)?
- Getting an Account on the SCC
- Connecting to the SCC
- Using the SCC (Hands-On)
- Questions?

What Is The SCC?
- A Linux cluster with over 6200 processors and 236 GPUs.
- Currently over 2 Petabytes of disk.
- Located in Holyoke, MA at the Massachusetts Green High Performance Computing Center (MGHPCC), a collaboration between 5 major universities and the Commonwealth of Massachusetts.
- Went into production in June 2013 for Research Computing.
- Significantly expanded in December 2014.
http://www.bu.edu/tech/support/research/computing-resources/scc/

Why Holyoke? – MGHPCC Benefits
- Green, environmentally friendly design.
- Low cost, clean and renewable energy source.
- Space on-site for building expansion (years 10-20).
- Opportunities for shared facilities and services.
- Opportunities for collaboration with other institutions.
- BU “Far West” – Two 10 Gigabit/second Ethernet connections from BU to the MGHPCC.
http://www.bu.edu/tech/support/research/rcs/mghpcc/

MGHPCC - Photo

Service Models – Shared and Buy-In
- Many of the elements of the SCC are paid for by BU and university-wide grants and are free to the entire BU Research Computing community.
- Other elements (about 60% of the processors currently) are purchased by individual faculty or research groups through the Buy-In program, with priority access for the purchaser.
http://www.bu.edu/tech/support/research/computing-resources/service-models/

SCC Architecture
[Diagram labels: File Storage, Public Network, SCC1, SCC2, GEO, SCC4, VPN Only, Login Nodes, Private Network, Compute Nodes]

Storage
- Research projects are automatically granted 50GB of backed-up space (/project/PROJNAME) and 50GB of not-backed-up space (/projectnb/PROJNAME).
- These numbers can be increased for free to 200GB/800GB.
- Project groups can either purchase or “rent” additional storage.
- All users have a Home Directory with a 10GB quota.
http://www.bu.edu/tech/support/research/computing-resources/file-storage/

Storage Space (in GBs)
[Chart: storage space for Home Directory, Projectnb, Project, and Stash on a scale of 0 to 1000 GB; legend: Default Size, Maximum (free) Size, Expansion ($$)]

Storage – What files should go where?
- Home Directory – Personal files, custom scripts.
- /project – Source code, files you can’t replace.
- /projectnb – Output files, downloaded data sets. Large quantities of data that you could recreate in the incredibly unlikely event of a disastrous data loss.
- /stash – Manual backup of vital /projectnb data.

Storage - Restricted (dbGaP) Data
Some projects, mostly those on the BU Medical Campus, require dbGaP security measures:
- /restricted/project/PROJNAME – backed-up space for dbGaP data
- /restricted/projectnb/PROJNAME – not-backed-up space for dbGaP data
- Only accessible through scc4.bu.edu and compute nodes

Storage – Scratch Space
- Each node (login or compute) has a directory called /scratch stored on a local hard drive.
- This can be used by batch jobs to quickly write temporary files.
- If you wish to keep these files, you should copy them to your own space when the job completes.
- Scratch files are kept for 30 days, with no guarantees.
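As a rough sketch of the scratch-space workflow described above, a batch job might write its temporary files under /scratch and copy back only the results worth keeping. The per-job directory name, the RESULTS_DIR variable, and the stand-in computation are illustrative assumptions, not SCC conventions. The fallback to a temporary directory is there only so the sketch also runs on machines without a writable /scratch.

```shell
#!/bin/bash
# Sketch only: use node-local scratch for temporary files, then keep the results.

# Prefer /scratch (as on SCC nodes); otherwise fall back to a temp directory (assumption).
if [ -d /scratch ] && [ -w /scratch ]; then
    scratch_root=/scratch
else
    scratch_root=$(mktemp -d)
fi

# Hypothetical per-job working directory, made unique with the shell's process ID.
workdir="$scratch_root/${USER:-user}_job_$$"
mkdir -p "$workdir"

# Stand-in for the real computation: write an intermediate file, then a small result.
seq 1 1000 > "$workdir/intermediate.txt"
wc -l < "$workdir/intermediate.txt" > "$workdir/result.txt"

# Copy what you want to keep to your own space. RESULTS_DIR is a hypothetical
# variable; on the SCC this might point somewhere under /projectnb/PROJNAME.
results_dir=${RESULTS_DIR:-$PWD/results}
mkdir -p "$results_dir"
cp "$workdir/result.txt" "$results_dir/"

# Clean up rather than relying on the 30-day retention.
rm -rf "$workdir"
```

Copying results out before the job ends matters because /scratch is local to the node the job ran on and carries no retention guarantee.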
http://www.bu.edu/tech/support/research/system-usage/running-jobs/resources-jobs/local_scratch/

Snapshots – Recovering lost files
- Available for Home Directories, all Project Disk Space, and STASH.
- Backups made daily at midnight.

    [adftest2@scc1 ~]$ cd .snapshots
    [adftest2@scc1 ~]$ ls
    140613/ 140624/
    [adftest2@scc1 ~]$ cd 140613
    [adftest2@scc1 ~]$ ls -l
    -rw-r--r-- 1 adftest2 scv 71 May 29 19:41 myfile
    [adftest2@scc1 ~]$ cp myfile ../../

http://www.bu.edu/tech/support/research/computing-resources/file-storage/#Snapshots

Software (Tutorial this semester)
- Programming Languages: C, FORTRAN, Python, CUDA, Perl
- Math, Data Analysis, and Plotting: MATLAB, Mathematica, IDL, Maple
- Statistics: R, RStudio, SAS, Stata
- Visualization: VTK, ParaView, VMD, Maya
- Domain Specific Packages: Bioinformatics, Engineering, Geographic Information Systems (GIS)
- Parallel: MPI, MATLAB PCT, OpenMP, OpenACC
http://rcs.bu.edu/software/

GPU Computing
- Fast computation using GPUs (graphics processing units). 100x speedups possible for some codes.
- 236 GPUs available.
- Programming: C++ and FORTRAN - CUDA, OpenACC
- Software Packages: MATLAB PCT, R
- If interested, take one or more of our GPU tutorials.
http://www.bu.edu/tech/support/research/software-and-programming/programming/multiprocessor/gpu-computing/

Getting an Account on the SCC
- Using tutorial accounts today. These should not be used after today.
- All users of the SCC must be on a Research Project headed up by a full-time BU Faculty member.
- Exception: 3 month trial accounts for students/tutorial attendees. Email [email protected] if interested.
http://www.bu.edu/tech/support/research/accountmanagement/

Alternative: Linux Virtual Lab
Available to any BU community member who needs access to a Linux system.
Send email to [email protected] to get access.
Advantages:
- Permanent account
- Full access to SCC software via scc-lite.bu.edu
Disadvantages:
- No batch system access
- Limited disk space
http://www.bu.edu/tech/services/support/desktop/computerlabs/unix/

Connecting to the SCC
- Windows - MobaXterm
  http://www.bu.edu/tech/support/research/system-usage/getting-started/connect-ssh/#windows
- Macintosh – Built-in Terminal application
  http://www.bu.edu/tech/support/research/system-usage/getting-started/connect-ssh/#apple
- Linux – Terminal application
  http://www.bu.edu/tech/support/research/system-usage/getting-started/connect-ssh/#linux

Connecting - Details
Software you need:
- SSH Client – To log in to the SCC machines, such as scc1.bu.edu
- X Forwarding – Display graphics for those programs with a GUI interface (such as MATLAB) or that otherwise display images.
- File Transfer – Transferring files between the SCC and your local machine.
- VNC – Advanced users only. Faster graphics:
  http://www.bu.edu/tech/support/research/system-usage/getting-started/remote-desktop-vnc/

Questions so Far
- Questions on the Shared Computing Cluster so far?
- Remainder of the tutorial will be hands-on getting a feel for using Linux and the SCC.
- If you are already familiar with Linux, this section may be slow for you.

Using the SCC (Hands-On)
- Linux “Command Line” Environment – No menus or graphics unless in specific software packages.
- Login Nodes - Interactive use, code development.
  General: scc1.bu.edu, scc2.bu.edu
  Earth & Environment Dept. Users: geo.bu.edu
  BUMC and Restricted Data Users: scc4.bu.edu
- Compute Nodes – Run “Batch Jobs” on, both single and multi-processor. Names like scc-bc5.bu.edu

Using the SCC - Basics
This tutorial is going to cover the very basics of Linux on the SCC.
Please consider taking a fuller Linux tutorial from us or online if you end up using the SCC significantly.
We have some material on our web site for new users of Linux and the SCC at:
http://www.bu.edu/tech/support/research/system-usage/getting-started/commands/

Using the SCC – ssh
From your ssh/terminal application on your tutorial workstation, your laptop, or a machine at home:

    ssh -l adftest2 scc1.bu.edu

- “ssh” is the command you are issuing.
- “-l adftest2” is a “command line option” to specify your login name on the SCC.
- “scc1.bu.edu” is a “parameter” of the command.
Make sure to hit the “Enter” key after every command.

Using the SCC - Logging In
Windows/MobaXterm

    local_prompt% ssh [email protected]

Mac

    local_prompt% ssh -Y [email protected]

Linux

    local_prompt% ssh -X [email protected]

SFTP File Transfer to/from the SCC
Graphical Applications
- Windows – MobaXterm (Free), WinSCP (Free)
- Mac – FileZilla (Free), Fetch (BU site license)
Command Line Applications
- rsync
- scp
http://www.bu.edu/tech/support/research/system-usage/getting-started/get-started-file-transfer/

File Transfer Issues – dos2unix
- Windows, Mac, and Linux systems mark the “end of line” in text files differently. To solve this issue, there is a utility called dos2unix.
- This is not an issue with binary files.
- Transfer the text file “example.txt” from Windows to Linux.
- Rewrite “example.txt” as a Linux style file:

    [adftest2@scc1 ~]$ dos2unix example.txt

http://linuxcommand.org/man_pages/dos2unix1.html

Using the SCC – the “prompt”
You should now see something like:

    [adftest2@scc1 ~]$

This is what is called the “prompt” and indicates the system (the bash “shell” in particular) is ready to accept commands from you.
- “adftest2” is your login name.
- “scc1” is the machine you are on.
- “~” is the directory you are in – in Linux “~” is a shorthand for a person’s home directory.

Using the SCC - X-Forwarding (Graphics)
Run the command xclock to see if graphics are working for you.

    [adftest2@scc1 ~]$ xclock

A window similar to the image on the right should come up. Click the X in the upper right to close this window.
http://www.bu.edu/tech/support/research/system-usage/getting-started/x-forwarding/

Using the SCC – pwd
Show the current “full path”, the directory you are in with its parent and all levels of grandparents up to the root directory (/). Items you type will be shown in bold:

    [adftest2@scc1 ~]$ pwd
    /usr2/collab/adftest2

Here the command pwd “returns” (prints to your screen) the result “/usr2/collab/adftest2”.

Using the SCC – man
The man (short for “manual”) command is used to look up information about a Linux command.

    [adftest2@scc1 ~]$ man pwd
    PWD(1)                    User Commands                    PWD(1)
    NAME
           pwd - print name of current/working…
    SYNOPSIS
           pwd [OPTION]...
    …

Using the SCC – man cont.
- For some commands, such as if you run man cd, you will get a general manual page for the bash shell and not such a particular page as for pwd.
- You can page through the manual page for a command a screenful at a time using the “spacebar”, a line at a time using the “Enter” key, and quit out of the page by typing q.

Using the SCC – mkdir
Create a new directory:

    [adftest2@scc1 ~]$ mkdir newdir

Creates a new directory (folder) to store files in within your home directory.

Using the SCC – ls
List the contents of a directory:

    [adftest2@scc1 ~]$ ls
    newdir

Or with a command line option, asking for more details:

    [adftest2@scc1 ~]$ ls -l
    total 0
    drwxr-xr-x 3 adftest2 adftest 512 Oct 28 16:03 newdir

Using the SCC – File Permissions
From the previous slide:

    drwxr-xr-x 3 adftest2 adftest 512 Oct 28 16:03 newdir

- “drwxr-xr-x” gives the “permissions” for this directory (or file).
- The “d” indicates this is a directory.
- There are then three sets of three characters for “user” (u), “group” (g), and “other” (o) access levels.
- “r” indicates a file/directory is readable, “w” writable, and “x” executable. A “-” indicates no such permission.

Using the SCC - chmod
Change the permissions on the directory “newdir” so that members of your group can write to it:

    [adftest2@scc1 ~]$ chmod g+w newdir

and note the difference:

    [adftest2@scc1 ~]$ ls -l
    total 0
    drwxrwxr-x 3 adftest2 adftest 512 Oct 28 16:03 newdir

Using the SCC - cd
Change directory to “newdir”:

    [adftest2@scc1 ~]$ cd newdir

You can also move to other directories by giving a “full path” (a path starting with the / character) such as:

    [adftest2@scc1 newdir]$ cd /usr/local/bin/

Type just cd anytime to go back to your home directory.

Using the SCC – cp (Start C Example)
We will now begin a sequence of commands to compile and run a very simple C code. We start by copying the code from our “examples” directory into the current directory, which can be abbreviated by the . (period) character:

    [adftest2@scc1 newdir]$ cp /project/scv/examples/c/examples/ex01-helloworld/helloWorld.c .
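The permission and copy steps above can be tried safely in a throwaway directory. The sketch below assumes nothing about the SCC: the sandbox directory and the source.txt file are hypothetical stand-ins, and the first chmod only forces the starting mode to match the slides regardless of your umask.

```shell
#!/bin/bash
# Sketch only: watch the permission string change after chmod g+w, then copy a file to "."

sandbox=$(mktemp -d)      # throwaway directory (assumption), so your home directory is untouched
cd "$sandbox"

mkdir newdir
chmod u=rwx,g=rx,o=rx newdir              # start from drwxr-xr-x, as on the slides

before=$(ls -ld newdir | cut -c1-10)      # first 10 characters are the type and permission bits
chmod g+w newdir                          # let group members write to the directory
after=$(ls -ld newdir | cut -c1-10)
echo "$before -> $after"                  # prints: drwxr-xr-x -> drwxrwxr-x

# Copy a file into the current directory using the "." abbreviation.
printf 'example\n' > "$sandbox/source.txt"   # hypothetical file standing in for helloWorld.c
cd newdir
cp ../source.txt .
ls -l
```

Only the group set (the middle three characters) changes, which is what the g+w argument asks for.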

Using the SCC - more
Look at the contents of the C source code file we just copied using the more command:

    [adftest2@scc1 newdir]$ more helloWorld.c
    #include <stdio.h>

    int main(int argc, char *argv[])
    {
        /* print message */
        printf("Hello, World!\n");
        return (0);
    }

Using the SCC - gcc
Compile the source code file we just copied into the binary file hello using the GNU C compiler gcc:

    [adftest2@scc1 newdir]$ gcc -o hello helloWorld.c

The “-o hello” option causes the output file to be named “hello”. Without this, it would be named “a.out” regardless of the name of your source code file.

Using the SCC – File Execution
Note that the compiled file is automatically made “executable”:

    [adftest2@scc1 newdir]$ ls -l hello
    -rwxr-xr-x 1 adftest2 adftest 6430 Oct 28 15:49 hello

Now we run the command from the current directory:

    [adftest2@scc1 newdir]$ hello
    Hello, World!

Using the SCC – qsub and qstat
Use the Open Grid Scheduler (OGS) command qsub to submit our compiled program to the batch system:

    [adftest2@scc1 newdir]$ qsub -b y hello
    Your job 1041461 ("hello") has been submitted

If you are quick, you can monitor this job using qstat:

    [adftest2@scc1 newdir]$ qstat -u adftest2
    job-ID  prior   name  user     state submit/start at     queue …
    ------------------------------------------------------------------------ …
    1041461 0.00000 hello adftest2 qw    09/02/2014 11:44:28 …

Using the SCC – qsub output
The job should run soon and produce an output file:

    [adftest2@scc1 newdir]$ cat hello.o1041461
    Hello, World!

There will also be an error file which should be empty:

    [adftest2@scc1 newdir]$ cat hello.e1041461

Using the SCC – qsub Details
Submit non-interactive batch jobs using qsub:

    qsub [options] command [arguments]

Setting default qsub options using a .sge_request file:
http://www.bu.edu/tech/support/research/system-usage/running-jobs/advanced-batch/#sge_request
http://www.bu.edu/tech/support/research/system-usage/running-jobs/submitting-jobs/

Using the SCC – qsub options

Using the SCC – qsub options cont.

Interactive Batch Jobs
Used for doing interactive work, such as in MATLAB, that takes more than 15 minutes of CPU time.

    [adftest2@scc1 newdir]$ qsh
    Your job 5274760 ("INTERACTIVE") has been submitted
    waiting for interactive job to be scheduled .....
    Your interactive job 5274760 has been successfully scheduled.
A new window comes up after a little while:

    [adftest2@scc-pi4 newdir]$ matlab

http://www.bu.edu/tech/support/research/system-usage/running-jobs/interactive-jobs/

Using the SCC - .bashrc file
- You have a .bashrc (.cshrc for tcsh users) in your home directory.
- Commands in this file are automatically executed every time you log in. Do not put commands like “echo” in this file.
- Modify this file to change your default system behaviors or automatically run certain commands when you log in.
- Be careful modifying this file or you could make it impossible for yourself to log in to the system; contact us if that happens.
http://www.bu.edu/tech/support/research/system-usage/usingscc/environment/

Using the SCC – gedit GUI Editor
Try launching a graphical application, such as gedit:

    [adftest2@scc1 newdir]$ gedit ~/.bashrc &

Assuming you have X Forwarding set up, this should bring up the simple editor gedit in a separate window so that you can edit the file. Other editors such as emacs and vi are also available.

Using the SCC – Editing your .bashrc
Add the following line in your .bashrc file:

    alias dir='ls -al'

Save the new .bashrc file and run in your shell window:

    [adftest2@scc1 newdir]$ source ~/.bashrc

Test out the new command:

    [adftest2@scc1 newdir]$ dir hello
    -rwxr-xr-x 1 adftest2 adftest 6494 Jan 22 11:22 hello

Using the SCC – Modules (1)
Modules – Used to load applications not automatically loaded by the system, including alternative versions of applications.
See if I have access to the envi program:

    [adftest2@scc1 newdir]$ which envi
    /usr/bin/which: no envi in (/usr/local/apps/pgi13.5/bin:/usr/java/default/jre/bin:/usr/java/default/bin:/usr/lib64/qt3.3/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/usr2/collab/adftest2/bin)

http://www.bu.edu/tech/support/research/software-and-programming/software-and-applications/modules/

Using the SCC – Modules cont. (2)
See what other modules are available to me:

    [adftest2@scc1 newdir]$ module avail
    ------------------------- /usr/local/Modules/versions -------------------------
    3.2.10
    …
    envi/4.8 envi/5.0 envi/5.0_sp3
    …

Using the SCC – Modules cont. (3)
Load the module I need from the earlier long list:

    [adftest2@scc1 newdir]$ module load envi/5.0

See if I now have access to the envi program:

    [adftest2@scc1 newdir]$ which envi
    alias envi='/project/earth/packages/exelis5.0/envi50/bin/envi'
    /project/earth/packages/exelis5.0/envi50/bin/envi

Using the SCC - grep
grep is a useful command for searching for a text string in a file, such as:

    [adftest2@scc1 newdir]$ grep -i hello *
    Binary file hello matches
    hello.o1041461:Hello, World!
    helloWorld.c:    printf("Hello, World!\n");

* is a special “wildcard” character that matches all filenames. You can also limit it by using, for example, *.c

Using the SCC - Pipes
You can also create a series of commands with the output of one being the input of the next through a series of “pipes” such as:

    [adftest2@scc1 newdir]$ cat helloWorld.c | grep print
    /* print message */
    printf("Hello, World!\n");

You can also redirect the output of a command to a file using > myfilename

Using the SCC – R Example (1)
Similar example to the C example earlier but using R. Copy the needed files from the examples directory:

    [adftest2@scc1 newdir]$ cp /project/scv/examples/r/examples/ex01-helloworld/* .

Run this code in a variety of ways.

Using the SCC – R Example cont. (2)
Run R:

    [adftest2@scc1 newdir]$ R

Within R, run the source code file we just copied:

    > source("helloWorld.R")
    [1] "Hello, World!"
    [1] "R version 2.15.3 (2013-03-01)"

Using the SCC – R Example cont. (3)
Quit R:

    > q()
    Save workspace image? [y/n/c]: n

Run the source code directly from the shell:

    [adftest2@scc1 newdir]$ R CMD BATCH helloWorld.R

Using the SCC – R Example cont. (4)
Look at the generated output file:

    [adftest2@scc1 newdir]$ more helloWorld.Rout
    …
    > # This is a simple R code
    >
    > print("Hello, World!")
    [1] "Hello, World!"
    >
    > # Print current R version
    > print(version$version.string)
    [1] "R version 2.15.3 (2013-03-01)"
    …

Using the SCC – R Example cont. (5)
Submit the prewritten script file which calls this R code to the batch system using qsub:

    [adftest2@scc1 newdir]$ qsub -P tutorial Rjob
    Your job 9235558 ("helloWorld") has been submitted

Check on the status of this job:

    [adftest2@scc1 newdir]$ qstat -u adftest2
    job-ID  prior   name       user     state submit/start at     queue slots ja-task-ID
    -----------------------------------------------------------------------------------
    9235558 0.00000 helloWorld adftest2 qw    08/28/2014 16:43:18       1

Using the SCC – R Example cont. (6)
View the output file when the job is complete:

    [adftest2@scc1 newdir]$ cat helloWorld.o9235558
    > # This is a simple R code
    >
    > print("Hello, World!")
    [1] "Hello, World!"
    >
    >
    > # Print current R version
    > print(version$version.string)
    [1] "R version 3.1.1 (2014-07-10)"
    >
    > A <- rnorm(100,0,1)
    >
    > pdf(file="mypdf.pdf")
    > hist(A)
    > dev.off()
    null device
    …

Additional Web Resources
- Research Computing Support Pages
  http://www.bu.edu/tech/support/research/
- Technical Summary of SCC Resources
  http://www.bu.edu/tech/support/research/computing-resources/tech-summary/
- SCC Updates – Latest SCC News
  http://www.bu.edu/tech/support/research/whatshappening/updates/
- Code Examples for Popular Software Packages
  http://scv.bu.edu/examples/

Questions
All SCC questions welcome. Those I can’t answer now I will make every effort to answer for you later.
Email Addresses:
- Aaron Fuegi – [email protected]
- General help using the SCC – [email protected]

Tutorial Survey
Please open a web browser and go to:
http://scv.bu.edu/survey/tutorial_evaluation.html
to fill out the tutorial survey. Thanks for coming.
Tutorial slides are available on the web from:
http://www.bu.edu/tech/support/research/trainingconsulting/live-tutorials/
My Contact Information: [email protected] or (617) 353-8255