KMR
kmrdp.cpp File Reference

MPI-DP Implementation with KMR.

#include <mpi.h>
#include <stdlib.h>
#include <string>
#include <vector>
#include <iostream>
#include <fstream>
#include <algorithm>
#include <time.h>
#include <fcntl.h>
#include <sys/time.h>
#include <sys/stat.h>
#include <assert.h>
#include "kmr.h"


Classes

class  MPIDP
 A Tool to Run Tasks under MPI.

struct  RankLog
 Per-Rank Worker Log (for MPIDP).

struct  TaskLog
 Per-Task Log (for MPIDP).

struct  TaskRecord
 Task Log (for MPIDP).
 

Macros

#define KMRDP   1
 
#define MPIDP_LASTUPDATED   "2011/12/12"
 
#define MPIDP_VERSION   "1.0.3"
 
#define xassert(X)
 Asserts and aborts, but it cannot be disabled.
 

Functions

int application (int argc, char *argv[])
 Application Code Entry.

int collect_results (const struct kmr_kv_box kv[], const long n, const KMR_KVS *kvs, KMR_KVS *kvo, void *dp_)
 Collects result status in ir[] on rank#0.

static int count_args (char *args, size_t len)

string erase_spaces (const string &s0, const size_t len)
 Deletes white-spaces in the string appearing within LEN characters from the start (counting after removing spaces).

int gather_names (const struct kmr_kv_box kv[], const long n, const KMR_KVS *kvs, KMR_KVS *kvo, void *dp_)
 Collects host names on all nodes.

int list_tasks_rank0 (const struct kmr_kv_box kv, const KMR_KVS *kvs, KMR_KVS *kvo, void *dp_, const long i)
 Puts tasks in the KVS.

int main (int argc, char *argv[])
 Initializes MPI and then starts the application.

string replace_pattern (const string &s, const string &key, const string &value)
 Replaces the KEY by the VALUE in the source string S.

static int safe_atoi (const char *s)
 Parses an integer string.

static void scan_args (char **wargv, int wargc, char *args, size_t len)

int setup_rank0 (const struct kmr_kv_box kv, const KMR_KVS *kvs, KMR_KVS *kvo, void *dp_, const long i)
 Reads the command-line options and the jobs-list table, and then puts the parameters into the KVS.

int start_task (struct kmr_kv_box kv, const KMR_KVS *kvs, KMR_KVS *kvo, void *dp_, const long i)
 Starts an application using argv passed in value data.

int write_report (const struct kmr_kv_box kv, const KMR_KVS *kvs, KMR_KVS *kvo, void *dp_, const long i)
 Writes jobs and workers report in log.
 

Variables

int(* application_fin )(int argc, char *argv[])
 
int(* application_init )(int argc, char *argv[])
 
static const char * dummy_string = "MPIDP"
 
MPIDP * kmr_dp = 0
 A pointer to an MPIDP object for debugging.

KMR * kmr_dp_mr = 0
 A pointer to an embedded KMR context for debugging.
 

Detailed Description

MPI-DP Implementation with KMR.

MPI-DP is a tool developed by Akiyama Lab., Tokyo Institute of Technology (Titech), that provides an environment for running nearly independent (data-intensive, genome-search) tasks under MPI with master-worker job scheduling. This file is a rewrite of MPI-DP version 1.0.3. MEMO: RETRY is ignored.

Overview of KMR-DP (MPI-DP Compatibility)

Copyright (C) 2012-2018 RIKEN R-CCS

(KMR Version: v1.10 (20201116))

KMR-DP is a (hopefully better) replacement for MPI-DP, a tool for Ghost/MP developed by Akiyama Lab., Tokyo Institute of Technology (Titech).

KMR-DP is intended to run master-worker tasks with minimal modification to the application. KMR-DP reads tasks in the job-list table and distributes them to the ranks. Arguments to the tasks are prepared by substituting the variables in the template with the entries in the job-list table.

Application Changes

An application must be changed so that its entry point is named "application" instead of "main". The "main" is provided by MPI-DP, which sets up the MPI environment and then calls "application". The "application" entry has C linkage.

int main(int argc, char *argv[]) {......}
==>
int application(int argc, char *argv[]) {......}
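As a minimal sketch of this change, assuming the application is compiled as C++, the renamed entry point could be declared with C linkage as shown here. The body is a placeholder for illustration, not code from any real application.

```cpp
#include <cstdio>

// Formerly "int main(int argc, char *argv[])".
// The "main" is supplied by the tool, which calls this entry for a task,
// so the entry needs C linkage when the application is compiled as C++.
extern "C" int application(int argc, char *argv[]) {
    // Placeholder body: print the arguments prepared for this task.
    for (int i = 0; i < argc; i++) {
        std::printf("argv[%d]=%s\n", i, argv[i]);
    }
    // The return value is what the log reports as the task's return value.
    return 0;
}
```

In a plain C application the `extern "C"` wrapper is unnecessary, since C functions already have C linkage.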

Runtime Options

Options:

-tb name : (REQUIRED) File name of jobs-list
-ot n : Position of output file name in the job-data list
-pg name : Application name
-rt _ : (not used; it is for fault tolerance)
-wl _ : (not used; it is for fault tolerance)
-lg name : Log-file name (default as "mpidp.log")

Jobs-List Table

A jobs-list table consists of a title (TITLE), an argument list (PARAM), and tab-separated lines of job-data lists.

TITLE=XXXXXXXXXXXXXXXXX : (OPTIONAL) Title Line (any string)
PARAM=arg1 arg2 arg3 ... : (REQUIRED) Argument List to Main Procedure (argc/argv)
job1-data1 \t job1-data2 \t job1-data3 ... : (REQUIRED) Job-data List (one line corresponds to one job)
job2-data1 \t job2-data2 \t job2-data3 ...
job3-data1 \t job3-data2 \t job3-data3 ...
...

The title line (TITLE) indicates a title, and can be any (short) string.

The argument list (PARAM) forms a skeleton of the arguments to the main procedure; it is appended after the command-line arguments. It may include variable references of the form "$i", which are replaced by the corresponding fields of the job-data list. A job-data list is a tab-separated line of words: the first word replaces "$1", the second replaces "$2", and so on.

Each line of job-data lists makes up a task. Fields in a job-data list will replace the variables in the argument list, where the fields are separated by tabs.
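The substitution described above can be sketched as follows. This is an illustration only, not the code in kmrdp.cpp: the names `split_fields` and `expand_param` are hypothetical, and leaving an out-of-range reference such as "$9" unchanged is an assumption of the sketch.

```cpp
#include <cctype>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Splits one job-data line into its tab-separated fields.
std::vector<std::string> split_fields(const std::string &line) {
    std::vector<std::string> fields;
    std::string field;
    std::istringstream in(line);
    while (std::getline(in, field, '\t')) {
        fields.push_back(field);
    }
    return fields;
}

// Expands "$1", "$2", ... in the PARAM template with the matching fields.
// A reference with no matching field is copied as written (an assumption).
std::string expand_param(const std::string &param,
                         const std::vector<std::string> &fields) {
    std::string out;
    std::size_t i = 0;
    while (i < param.size()) {
        bool is_ref = param[i] == '$' && i + 1 < param.size() &&
                      std::isdigit(static_cast<unsigned char>(param[i + 1]));
        if (!is_ref) {
            out += param[i];
            ++i;
            continue;
        }
        // Read the whole decimal index so that "$10" is not taken as "$1".
        std::size_t j = i + 1;
        std::size_t index = 0;
        while (j < param.size() &&
               std::isdigit(static_cast<unsigned char>(param[j]))) {
            index = index * 10 + static_cast<std::size_t>(param[j] - '0');
            ++j;
        }
        if (index >= 1 && index <= fields.size()) {
            out += fields[index - 1];
        } else {
            out.append(param, i, j - i);
        }
        i = j;
    }
    return out;
}
```

Scanning the template once, rather than replacing "$1", "$2", ... one after another, avoids re-expanding text that came from a field.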

Jobs-List Example

An example jobs-list file "testdp.table" may look like the following:

TITLE=Example Run
PARAM=aln -i $1 -d $2 -o $3 -a 2 -w 0 -q d -t p -L 2097152 -S 8 -v 1 -b 1 -s 1 -T 30
xaa	genes.db	out.0
xab	genes.db	out.1
xac	genes.db	out.2
xad	genes.db	out.3

"$1" is substituted by "xaa", "$2" by "genes.db", and "$3" by "out.0" for the first job, and similarly for each of the following lines.
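A reader for this file layout can be sketched as follows. It is a simplified illustration rather than the parser in kmrdp.cpp: the `JobsList` struct and `parse_jobs_list` function are hypothetical names, and skipping empty lines is an assumption of the sketch.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Holds the three parts of a jobs-list table.
struct JobsList {
    std::string title;              // text after "TITLE=" (optional)
    std::string param;              // text after "PARAM=" (required)
    std::vector<std::string> jobs;  // one tab-separated line per job
};

// Parses the text of a jobs-list table, line by line.
JobsList parse_jobs_list(const std::string &text) {
    JobsList table;
    std::istringstream in(text);
    std::string line;
    while (std::getline(in, line)) {
        if (line.empty()) {
            continue;  // assumption: blank lines carry no job
        }
        if (line.compare(0, 6, "TITLE=") == 0) {
            table.title = line.substr(6);
        } else if (line.compare(0, 6, "PARAM=") == 0) {
            table.param = line.substr(6);
        } else {
            table.jobs.push_back(line);
        }
    }
    return table;
}
```

Each entry of `jobs` corresponds to one task; its fields are still joined by tabs and would be split before substitution into the PARAM template.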

Example Run

A run can be started with:

mpirun -np n a.out -tb testdp.table

Installation

KMR-DP is included in KMR.

Simplest Usage

sed -e 's/main/application/' < app.c > appnew.c
mpic++ appnew.c libkmr.a
mpiexec a.out -tb testdp.table

Logging Output

The log file ("mpidp.log" by default) records run information. Its "JOB table" and "Worker table" sections report the results.

The "JOB table" lists the results, one line per task. "WID" is identical to the task number (in the KMR-DP case, with no retry).

The "Worker table" lists the results, one line per rank. The fields of each line are: 1st, the rank; 2nd, the number of tasks; and then two columns for each task result. Each task result consists of: 1st, a task number; and 2nd, a return value.
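The field layout of a "Worker table" line can be illustrated with a small reader. This sketch assumes the fields are whitespace-separated decimal integers, which the description above does not state; the `WorkerLine` struct and `parse_worker_line` function are hypothetical names.

```cpp
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// One "Worker table" line: a rank and its task results.
struct WorkerLine {
    int rank = 0;
    // Each entry is (task number, return value).
    std::vector<std::pair<int, int> > results;
};

// Parses one line; returns false if the line is malformed.
// Assumption: fields are whitespace-separated integers.
bool parse_worker_line(const std::string &line, WorkerLine &out) {
    std::istringstream in(line);
    int ntasks = 0;
    if (!(in >> out.rank >> ntasks) || ntasks < 0) {
        return false;
    }
    out.results.clear();
    for (int k = 0; k < ntasks; k++) {
        int task = 0;
        int status = 0;
        if (!(in >> task >> status)) {
            return false;  // fewer columns than the task count promises
        }
        out.results.push_back(std::make_pair(task, status));
    }
    return true;
}
```

Under that assumption, a line such as "3 2 0 0 4 1" would describe rank 3 running two tasks: task 0 returning 0 and task 4 returning 1.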

Definition in file kmrdp.cpp.

Macro Definition Documentation

◆ xassert

#define xassert(X)
Value:
((X) ? (void)(0) \
: (fprintf(stderr, \
"%s:%d: Assertion '%s' failed.\n", \
__FILE__, __LINE__, #X), \
(void)MPI_Abort(MPI_COMM_WORLD, 1)))

Asserts and aborts, but it cannot be disabled.

The two message styles are for Linux and Solaris. ("__func__" is avoided because the C++ standard did not define it before C++11, although most compilers provide it as an extension.)

Definition at line 51 of file kmrdp.cpp.

Function Documentation

◆ application()

int application (int argc, char *argv[])

Application Code Entry.

The application must rename its "main" to "application". Note that it has C linkage (that is, extern "C"). The "main" exists in MPI-DP, which sets up MPI, reads the configuration table, and then calls the entry of the application.

Definition at line 38 of file testdp.c.

◆ erase_spaces()

string erase_spaces (const string &s0, const size_t len)

Deletes white-spaces in the string appearing within LEN characters from the start (counting after removing spaces).

Definition at line 400 of file kmrdp.cpp.

◆ replace_pattern()

string replace_pattern (const string &s, const string &key, const string &value)

Replaces the KEY by the VALUE in the source string S.

Definition at line 381 of file kmrdp.cpp.
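A key-by-value replacement of this kind can be sketched as below. The sketch replaces every occurrence of the key, which is an assumption; the description above does not say whether replace_pattern replaces one occurrence or all of them. The name `replace_all_sketch` is hypothetical.

```cpp
#include <cstddef>
#include <string>

// Returns a copy of s with every occurrence of key replaced by value.
// Assumption: all occurrences are replaced, scanning left to right.
std::string replace_all_sketch(const std::string &s, const std::string &key,
                               const std::string &value) {
    if (key.empty()) {
        return s;  // nothing to search for
    }
    std::string out;
    std::size_t pos = 0;
    for (;;) {
        std::size_t hit = s.find(key, pos);
        if (hit == std::string::npos) {
            out.append(s, pos, std::string::npos);  // copy the tail
            break;
        }
        out.append(s, pos, hit - pos);  // copy text before the match
        out += value;
        pos = hit + key.size();  // continue after the match
    }
    return out;
}
```

Building a new string and resuming the search after each match means the inserted value is never rescanned, so a value that itself contains the key cannot cause repeated replacement.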

◆ gather_names()

int gather_names (const struct kmr_kv_box kv[], const long n, const KMR_KVS *kvs, KMR_KVS *kvo, void *dp_)

Collects host names on all nodes.

It is a reduce-function. In effect, it runs only on rank#0.

Definition at line 418 of file kmrdp.cpp.

◆ setup_rank0()

int setup_rank0 (const struct kmr_kv_box kv, const KMR_KVS *kvs, KMR_KVS *kvo, void *dp_, const long i)

Reads the command-line options and the jobs-list table, and then puts the parameters into the KVS.

It is a map-function, and runs only on rank#0.

Definition at line 439 of file kmrdp.cpp.

◆ list_tasks_rank0()

int list_tasks_rank0 (const struct kmr_kv_box kv, const KMR_KVS *kvs, KMR_KVS *kvo, void *dp_, const long i)

Puts tasks in the KVS.

It is a map-function, and runs only on rank#0.

Definition at line 651 of file kmrdp.cpp.

◆ start_task()

int start_task (struct kmr_kv_box kv, const KMR_KVS *kvs, KMR_KVS *kvo, void *dp_, const long i)

Starts an application using argv passed in value data.

It is a map-function.

Definition at line 752 of file kmrdp.cpp.

◆ collect_results()

int collect_results (const struct kmr_kv_box kv[], const long n, const KMR_KVS *kvs, KMR_KVS *kvo, void *dp_)

Collects result status in ir[] on rank#0.

It is a reduce-function. In effect, it runs only on rank#0.

Definition at line 864 of file kmrdp.cpp.

◆ write_report()

int write_report (const struct kmr_kv_box kv, const KMR_KVS *kvs, KMR_KVS *kvo, void *dp_, const long i)

Writes jobs and workers report in log.

It is a map-function, and runs only on rank#0.

Definition at line 905 of file kmrdp.cpp.

◆ main()

int main (int argc, char *argv[])

Initializes MPI and then starts the application.

Definition at line 214 of file kmrdp.cpp.

◆ safe_atoi()

static int safe_atoi (const char *s)

Parses an integer string.

It is a safe "atoi".

Definition at line 370 of file kmrdp.cpp.
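What makes an integer parse "safe" compared with plain atoi can be sketched with strtol. This is not the body of safe_atoi, whose error handling is not described here; `parse_int_sketch` is a hypothetical name, and reporting failure through a bool return is an assumption of the sketch.

```cpp
#include <cerrno>
#include <climits>
#include <cstdlib>

// Parses a decimal integer string, rejecting what atoi silently accepts:
// empty input, trailing garbage, and values outside the range of int.
// Returns true and stores the value in *result on success.
bool parse_int_sketch(const char *s, int *result) {
    if (s == nullptr || *s == '\0') {
        return false;
    }
    errno = 0;
    char *end = nullptr;
    long v = std::strtol(s, &end, 10);
    if (end == s || *end != '\0') {
        return false;  // no digits, or characters left after the number
    }
    if (errno == ERANGE || v < INT_MIN || v > INT_MAX) {
        return false;  // out of range for long or for int
    }
    *result = static_cast<int>(v);
    return true;
}
```

With this shape, "42" parses to 42, while "12x" and the empty string are rejected instead of yielding a partial or zero value.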

Variable Documentation

◆ kmr_dp

MPIDP* kmr_dp = 0

A pointer to an MPIDP object for debugging.

Definition at line 205 of file kmrdp.cpp.

◆ kmr_dp_mr

KMR * kmr_dp_mr = 0

A pointer to an embedded KMR context for debugging.

Definition at line 209 of file kmrdp.cpp.