Hi, this looks like a bug in mvapich2/1.9a2. When the BLACS routines 'dgebs2d' and 'dgebr2d' are called with TOP=' ', all processes block or deadlock (I am not sure which) and never return from 'dgebs2d' or 'dgebr2d'. The sample code is as follows:
c bcast.f

      program test
c------------------------------------------------------------
      include 'mpif.h'
      integer nprocs,me
      integer nprow,npcol,ictxt,myrow,mycol
      integer n
      parameter(n=1000)
      double precision A(n,n)

      call blacs_pinfo(me,nprocs)
      nprow=int(sqrt(float(nprocs)))
      npcol=int(nprocs/nprow)
      if(nprow*npcol.gt.nprocs)npcol=npcol-1
      print *,'nprow=',nprow,' npcol=',npcol

      call blacs_get(-1, 0, ictxt)
      call blacs_gridinit(ictxt,'R',nprow,npcol)
      call blacs_gridinfo(ictxt,nprow,npcol,myrow,mycol)
      print *,'myrow=',myrow,' mycol=',mycol

      if(myrow.eq.0.and.mycol.eq.0)then
        call dgebs2d(ictxt,'All',' ',n,n,A,n)
      else
        call dgebr2d(ictxt,'All',' ',n,n,A,n,0,0)
      endif
      print *,'myrow=',myrow,' mycol=',mycol, 'pass'

      if(myrow.ge.0)call blacs_gridexit(ictxt)
      call blacs_exit(0)
      end
The code is compiled and run as follows:
/opt/apps/intel13/mvapich2/1.9/bin/mpif77 -O2 -traceback -o bcast.x bcast.f -L/opt/apps/intel/13/composer_xe_2013.2.146/mkl/lib/intel64 -lmkl_scalapack_lp64 -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -lmkl_blacs_intelmpi_lp64 -liomp5 -lpthread -lm
ibrun -n 4 -o 0 ./bcast.x
If TOP is set to something like 'i-ring' or '1-tree', the code works fine. It also works if Intel MPI is used instead of mvapich2.
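For comparison, here is a minimal sketch of the variant with an explicit topology. It assumes the same compile line and 4-process run as the failing case. The program name, the `top` parameter, the initialization of A on the root, and the guard for processes outside the grid are illustrative additions and are not part of the original bcast.f.

```fortran
c bcast_ring.f: same broadcast as bcast.f, but with an explicit TOP
      program bring
      integer nprocs,me
      integer nprow,npcol,ictxt,myrow,mycol
      integer n,i,j
      parameter(n=1000)
      double precision A(n,n)
c     hypothetical parameter: the only change from the failing case
      character*6 top
      parameter(top='i-ring')

      call blacs_pinfo(me,nprocs)
      nprow=int(sqrt(float(nprocs)))
      npcol=nprocs/nprow

      call blacs_get(-1,0,ictxt)
      call blacs_gridinit(ictxt,'R',nprow,npcol)
      call blacs_gridinfo(ictxt,nprow,npcol,myrow,mycol)

c     processes left outside the grid have myrow=-1 and skip the test
      if(myrow.ge.0)then
        if(myrow.eq.0.and.mycol.eq.0)then
c         fill A on the root so receivers can check what arrived
          do 20 j=1,n
            do 10 i=1,n
              A(i,j)=1.0d0
 10         continue
 20       continue
          call dgebs2d(ictxt,'All',top,n,n,A,n)
        else
          call dgebr2d(ictxt,'All',top,n,n,A,n,0,0)
        endif
        print *,'myrow=',myrow,' mycol=',mycol,' A(n,n)=',A(n,n)
        call blacs_gridexit(ictxt)
      endif
      call blacs_exit(0)
      end
```

Swapping `top` back to ' ' in this sketch should bring back the hang described above, which keeps the comparison down to a single changed argument.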
Is there an environment variable or setting that could fix this issue? Or is this a bug that needs to be fixed?
Thanks,
Jing