- 00_imb to compile Intel MPI benchmark.
- task.sh in cs32cl256.bw_node/. It contains three kinds of measurement settings on the PingPong benchmark.
- msg.txt).
- OMPI_MCA_btl_tofu_eager_limit as the minimum value.
- task.sh (modify --gname option).
- BINDIR variable in task.sh before the execution in cs32cl256.bw_node/. You need to write the location where your IMB binary (e.g., IMB-MPI1) is installed there.

## To run as a batch job
$ cd cs32cl256.bw_node
$ pjsub task.sh
## Or, to run in an interactive job
$ cd cs32cl256.bw_node
$ bash task.sh
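
Before submitting, it can help to confirm that BINDIR really points at a directory holding the IMB binary. The following is a minimal sketch, not part of the original task.sh: the function name check_bindir is hypothetical, and the binary name IMB-MPI1 is taken from the example above.

```shell
# Hypothetical pre-flight check (not part of the original task.sh).
# Returns 0 if "$1"/IMB-MPI1 is a regular executable file, 1 otherwise.
check_bindir() {
    bindir="$1"
    if [ -f "${bindir}/IMB-MPI1" ] && [ -x "${bindir}/IMB-MPI1" ]; then
        echo "OK: ${bindir}/IMB-MPI1"
        return 0
    fi
    echo "IMB-MPI1 not found or not executable in '${bindir}'" >&2
    return 1
}
```

For example, `check_bindir "$BINDIR" && pjsub task.sh` would submit the job only when the binary is in place, assuming BINDIR is set in the current shell to the same value as in task.sh.
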

- The settings in task.sh are node=2, by-node, and PPN (processes per node)=1 (i.e., #PJM -L "node=2", #PJM --mpi "rank-map-bynode", and #PJM --mpi "max-proc-per-node=1"). In this case, how are the two processes distributed over the nodes?
- out-***).
- err-***).
- *.stat). How are the nodes and ranks allocated?
- cs32cl256.bw_cmg. This, in turn, measures the bandwidth between CMGs (i.e., intra-node communication performance).
- simple-p2p. This directory contains two simple implementations of PingPong, in std/ and sync/. The former is equivalent to PingPong in IMB and is composed of MPI_Send (i.e., standard communication mode) and MPI_Recv. The latter instead uses MPI_Ssend (i.e., synchronous communication mode) and MPI_Recv. Thus, the latter may be expected to always use the rendezvous protocol, independent of the message length. Compare the results between std/ and sync/. Also, confirm that the results in std/ are similar to the measurement results of IMB's PingPong.
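
When comparing std/ and sync/, it may be convenient to turn individual measurements into comparable numbers. The sketch below assumes you have read off a message size in bytes and a one-way time in microseconds by hand; the output format of simple-p2p is not described here, so the helper names bandwidth_mbs and time_ratio and their inputs are hypothetical.

```shell
# Convert a message size (bytes) and a one-way time (microseconds)
# into bandwidth in MB/s, where 1 MB = 10^6 bytes: bytes/usec == MB/s.
bandwidth_mbs() {
    awk -v bytes="$1" -v usec="$2" 'BEGIN {
        if (usec <= 0) { exit 1 }
        printf "%.2f\n", bytes / usec
    }'
}

# Ratio of the sync/ time to the std/ time for the same message size.
# Arguments: std time, then sync time (same unit for both).
time_ratio() {
    awk -v std="$1" -v sync="$2" 'BEGIN {
        if (std <= 0) { exit 1 }
        printf "%.2f\n", sync / std
    }'
}
```

For instance, `bandwidth_mbs 1000000 200` prints 5000.00, and `time_ratio 2.0 3.0` prints 1.50. A ratio well above 1 for small messages would be consistent with MPI_Ssend paying a handshake cost that MPI_Send avoids for short messages.
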