Does GROMACS use the Phi coprocessor on Stampede?

Hi,

I just tried using GROMACS 4.5.5 on Stampede for the first time (using the supplied module). I ran it like this:

ibrun mdrun_mpi -notunepme -deffnm md3 -nosum -dlb yes -npme -1 -cpt 60 -maxh 0.1 -cpi md3.cpt -nsteps 5000000000

Judging from the speed (ns/day), I get approximately what I would expect from the 16 cores alone, so I suspect that the MICs are not being used. I also looked at the .log output from GROMACS (see below), which likewise does not list the MICs. Does anybody know whether they are being used, or how I can check that? (One way to check is sketched after the timing output below.)

Thank you,
Chris.

 R E A L   C Y C L E   A N D   T I M E   A C C O U N T I N G

 Computing:          Nodes     Number     G-Cycles    Seconds     %
-----------------------------------------------------------------------
 Domain decomp.         12       2281      304.668      112.8     2.0
 DD comm. load          12       2280        1.257        0.5     0.0
 DD comm. bounds        12       2282        2.724        1.0     0.0
 Send X to PME          12      22801       31.611       11.7     0.2
 Comm. coord.           12      22801       75.950       28.1     0.5
 Neighbor search        12       2281     1199.547      444.3     7.8
 Force                  12      22801     6957.146     2576.8    45.1
 Wait + Comm. F         12      22801      807.014      298.9     5.2
 PME mesh                4      22801     1159.146      429.3     7.5
 Wait + Comm. X/F        4                2693.618      997.7    17.5
 Wait + Recv. PME F     12      22801       18.672        6.9     0.1
 Write traj.            12          2        0.321        0.1     0.0
 Update                 12      22801      320.949      118.9     2.1
 Constraints            12      45602     1532.245      567.5     9.9
 Comm. energies         12       2282        4.756        1.8     0.0
 Rest                   12                 301.601      111.7     2.0
-----------------------------------------------------------------------
 Total                  16               15411.226     5708.0   100.0
-----------------------------------------------------------------------
-----------------------------------------------------------------------
 PME redist. X/F         4      45602      113.086       41.9     0.7
 PME spread/gather       4      45602      640.632      237.3     4.2
 PME 3D-FFT              4      45602      306.614      113.6     2.0
 PME solve               4      22801       98.323       36.4     0.6
-----------------------------------------------------------------------

Parallel run - timing based on wallclock.

               NODE (s)   Real (s)      (%)
       Time:    356.752    356.752    100.0
                       5:56
               (Mnbf/s)   (GFlops)   (ns/day)  (hour/ns)
Performance:    714.533     44.446     11.044      2.173
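
For reference, a minimal way to check whether the coprocessor is doing any work while the job runs (a sketch only; it assumes you can open a shell on the compute node while the job is active and that the card is reachable under the hostname mic0; those details are assumptions, not taken from this thread):

# From the compute node running the job, list the processes on the coprocessor;
# an idle MIC will show no mdrun-related processes.
ssh mic0 ps

# If the binary was built with Intel's offload support, asking the offload runtime
# to report activity before launching prints a line per offload section;
# no such output suggests nothing is being offloaded to the card.
export OFFLOAD_REPORT=2
ibrun mdrun_mpi -deffnm md3 -maxh 0.1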