We're actually in the process of updating our documentation, since the recent system upgrades to both Gordon and Trestles have changed the way hybrid jobs need to be submitted. In addition, Lester's question touches on another type of job, bundling a number of non-MPI jobs into a single job submission, which our documentation does not cover. For the sake of posterity, I'll clarify both aspects here: running hybrid MPI/OpenMP jobs at SDSC, and bundling non-MPI jobs together.
Hybrid Jobs

We used to advise users to simply adjust their ppn request to reduce the number of MPI tasks run on each node, but that no longer works well due to some changes we made when Gordon and Trestles were upgraded. To simplify the task of running these MPI+OpenMP jobs, we've implemented the ibrun command, which is similar in spirit to TACC's command of the same name.
The point of ibrun is to provide a simple, consistent way for users to launch their MPI jobs without having to worry about all of the implementation-specific tweaks that may be necessary to get the best performance. To run a job that uses both MPI and OpenMP, you would launch your application (called ./my_hybrid_app here) with ibrun like this:
#PBS -l nodes=4:ppn=16:native
export OMP_NUM_THREADS=16
ibrun -npernode 1 ./my_hybrid_app
This would launch one MPI process per node (-npernode 1), each with 16 OpenMP threads (OMP_NUM_THREADS=16). Since your job script requested 4 nodes, you would run a total of 4 MPI processes, each with 16 threads, and use a total of 64 cores.
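Putting this together, a complete submit script for the example above might look like the following sketch. The queue name, walltime, and job name here are placeholders (assumptions, not taken from our User Guides); adjust them for your allocation:

```shell
#!/bin/bash
#PBS -q normal                  # placeholder queue name; use your site's queue
#PBS -l nodes=4:ppn=16:native
#PBS -l walltime=01:00:00      # placeholder walltime
#PBS -N hybrid_test            # placeholder job name

# run from the directory where qsub was invoked
cd $PBS_O_WORKDIR

# one MPI process per node, 16 OpenMP threads each (4 x 16 = 64 cores total)
export OMP_NUM_THREADS=16
ibrun -npernode 1 ./my_hybrid_app
```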
Trestles has four 8-core processors per node, so you may want to run one MPI process per processor, each with 8 OpenMP threads. The relevant parts of your submit script would then be:
#PBS -l nodes=2:ppn=32
export OMP_NUM_THREADS=8
ibrun -npernode 4 ./my_hybrid_app
ibrun is smart enough to determine how many nodes your job is using from your #PBS -l nodes=X:ppn=... request. However, you should always request the maximum number of cores per node (ppn=16 on Gordon, ppn=32 on Trestles) and use -npernode to control how many MPI processes actually run on each node.
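For example, on Gordon you could run two MPI processes per node (8 OpenMP threads each) while still requesting all 16 cores per node. This sketch just follows the same pattern as the examples above:

```shell
# request every core on each node, even though only 2 MPI processes run per node
#PBS -l nodes=2:ppn=16:native

# 2 processes/node x 8 threads/process = 16 cores/node
export OMP_NUM_THREADS=8
ibrun -npernode 2 ./my_hybrid_app
```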
If your hybrid application uses pthreads instead of OpenMP, you can tell ibrun how many threads per process (tpp) you want to use:
#PBS -l nodes=2:ppn=32
ibrun -npernode 4 -tpp 8 ./my_hybrid_app
For hybrid jobs, you should always specify either OMP_NUM_THREADS or -tpp. If you specify neither, ibrun assumes one thread per process, and the performance tweaks it applies as a result may not be optimal.
Job Bundling

Since Gordon does not allow users to share compute nodes, people with serial applications that use only one core are at a disadvantage: they get charged for all 16 cores per node regardless of whether their job uses only one of them.
As a workaround, we provide a "job bundling" script that allows you to specify multiple "tasks" in a tasks file, submit a single job as an MPI job, and have those tasks automatically distributed across all of the CPU cores on all of your job's nodes and executed concurrently.
This job bundler can be found in our GitHub user scripts repository. In brief, you create a "tasks" file (literally called 'tasks' if you'd like) and enter a single invocation of your serial application (e.g., "./my_application < my_input1 > my_output1") per line. You then submit the associated submit script, which launches the bundler.py program; bundler.py interfaces with the MPI environment and distributes all of the commands in your tasks file across all of the nodes it receives.
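For illustration, a tasks file for a serial application might look like this (the application and input/output file names are placeholders):

```shell
./my_application < my_input1 > my_output1
./my_application < my_input2 > my_output2
./my_application < my_input3 > my_output3
./my_application < my_input4 > my_output4
```

Since Gordon has 16 cores per node, a job requesting a single node could run up to 16 such tasks concurrently.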
Job Bundling with Multithreaded Applications

Lester's question combines the above two ways of running jobs: his individual tasks are multithreaded, but bundler.py uses MPI. Thus, running bundled multithreaded applications requires using both ibrun and bundler.py, since the net effect works like a hybrid MPI+OpenMP job.
The .qsub files we initially provided for bundler.py used mpirun_rsh and not ibrun, which meant these .qsub files wouldn't work for launching bundled multithreaded tasks. We've just added some more examples (in the bundler/OpenMP directory) that have the correct ibrun usage and illustrate how to bundle multithreaded tasks. Sorry for not providing these scripts earlier!
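To give a flavor of what those scripts do, a submit script for bundled multithreaded tasks follows the same hybrid pattern as above. The sketch below assumes four-way threaded tasks on a single Gordon node; the exact bundler.py invocation and resource requests are illustrative, so defer to the scripts in the bundler/OpenMP directory for the authoritative versions:

```shell
#PBS -l nodes=1:ppn=16:native
#PBS -l walltime=00:30:00      # placeholder walltime

cd $PBS_O_WORKDIR

# four threads per task, so run four bundler processes per node (4 x 4 = 16 cores)
export OMP_NUM_THREADS=4
ibrun -npernode 4 ./bundler.py tasks   # invocation shown here is illustrative
```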
ibrun can be used for serial (non-multithreaded) task bundling as well, but we kept mpirun_rsh in the original example scripts to be consistent with our User Guides. As I alluded to above, we have new versions of the Gordon and Trestles User Guides in the works that discuss ibrun as I have done here, and they will be consistent with the sample scripts I put in the bundler/OpenMP directory.