Basic Usage
DynRM provides a few high-level tools for basic usage. For further customization, refer to the ‘Advanced Usage’ chapter.
Basic DynRM runs
DynRM offers a basic command to run a dedicated DynRM instance for executing a single job or job mix:
dynrm_run
--topology Path to the topology used in this run
--submission Path to submission file (job batch file or job mix file) executed by this run
--system System class name to be used in this run
--policy Scheduling policy to be used in this run
--policy_params Parameters for the scheduling policy
--output Path to directory where output files should be written to
--verbosity Verbosity level to be used in this run (0 - 11)
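As a rough illustration, the options above can be assembled into a command line from Python. The sketch below only builds and prints the argument list. All paths and the system and policy class names are hypothetical placeholders, and it assumes each option takes exactly one value, which this chapter does not confirm.

```python
import shlex


def build_dynrm_run_command(topology, submission, system, policy, output, verbosity=0):
    """Assemble a dynrm_run argument list from the options listed above."""
    return [
        "dynrm_run",
        "--topology", topology,
        "--submission", submission,
        "--system", system,
        "--policy", policy,
        "--output", output,
        "--verbosity", str(verbosity),
    ]


cmd = build_dynrm_run_command(
    topology="topology.yaml",    # hypothetical path
    submission="my_job.batch",   # hypothetical path
    system="MySystem",           # hypothetical system class name
    policy="MyPolicy",           # hypothetical policy class name
    output="output_dir",         # hypothetical directory
    verbosity=3,
)
print(shlex.join(cmd))
```

On a machine where dynrm_run is installed, the list could be passed to subprocess.run(cmd, check=True).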
Topologies
DynRM requires information about the topology to be managed. The default topology creation module supports YAML files with the following syntax:
topology:
  nodes:
    [NODE_NAME]:
      num_cores: [NUM_CORES]
    ...
Example of a topology consisting of two nodes, each with 8 cores:
topology:
  nodes:
    n1:
      num_cores: 8
    n2:
      num_cores: 8
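For larger systems, a topology file of this shape can be generated instead of written by hand. The following is a minimal sketch, assuming node entries are nested under topology and nodes with two-space indentation. The helper name make_topology_yaml and the node names are hypothetical.

```python
def make_topology_yaml(cores_per_node):
    """Render a topology description for a {node_name: num_cores} mapping."""
    lines = ["topology:", "  nodes:"]
    for name, num_cores in cores_per_node.items():
        lines.append(f"    {name}:")
        lines.append(f"      num_cores: {num_cores}")
    return "\n".join(lines) + "\n"


# Four hypothetical nodes n1..n4 with 8 cores each
topology_yaml = make_topology_yaml({f"n{i}": 8 for i in range(1, 5)})
print(topology_yaml)
```

The returned string can be written to a file and passed to the --topology option.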
Job Submissions
DynRM requires the submission of a “Task Graph”, consisting of jobs and job dependencies.
To specify resource (re-)assignments, Output Space Generators are used.
The YAML submission module expects a Python script with the .batch file extension.
tasks:
  - name : [Task Name]                  # e.g. "task1"
    executable : [Name of executable]   # Absolute or relative path, or binary name
    arguments :                         # Arguments to be passed to the executable
      - arg1                            # First <String> argument
      - arg2                            # Second <String> argument
      - ...
    launch_generator:                   # Information about job launch (see Output Space Generators)
      model: [PSetModel class]          # [OPTIONAL] PSetModel class to use for output PSet, e.g. "AmdahlPsetModel"
      model_params:                     # [OPTIONAL] Parameters of the PSetModel
        key1 : value1                   # [OPTIONAL] default = t_s : 1
        key2 : value2                   # [OPTIONAL] default = t_p : 200
      mapping : "[NUM_PROCS]:node"      # [OPTIONAL] Mapping of output PSets (number of processes per node)
      num_procs : [NUM_PROCS]           # [OPTIONAL] Fixed number of processes to start
generators:                             # [OPTIONAL] Define Output Space Generators to be referenced by the applications
  - key : [generator_key]               # Key to be used to reference this Output Space Generator
    function : [generator_function]     # See Output Space Generators, e.g. "output_space_generator_replace"
    model: [PSetModel class]            # [OPTIONAL] PSetModel class to use for output PSet, e.g. "AmdahlPsetModel"
    model_params:                       # [OPTIONAL] Parameters of the PSetModel
      key1 : [value1]                   # [OPTIONAL] default = t_s : 1
      key2 : [value2]                   # [OPTIONAL] default = t_p : 200
    mapping : "[NUM_PROCS]:node"        # [OPTIONAL] Mapping of output PSets (number of processes per node)
    num_procs_add : [NUM_PROCS]         # [OPTIONAL] Fixed number of processes to add
    num_procs_sub : [NUM_PROCS]         # [OPTIONAL] Fixed number of processes to remove
    max_procs : [NUM_PROCS]             # [OPTIONAL] Maximum number of processes after reconfiguration
    min_procs : [NUM_PROCS]             # [OPTIONAL] Minimum number of processes after reconfiguration
    power2 : [true/false]               # [OPTIONAL] Allow only power-of-2 numbers of processes
    factor : [FACTOR]                   # [OPTIONAL] Fixed factor between the number of processes of input and output
runtime : [SECONDS]                     # Estimated runtime of task graph (not considered by all policies)
num_nodes : [NUM_NODES]                 # Fixed number of nodes to allocate (only for static scheduling)
Example of a job starting with 8 processes on 1 node. At runtime, processes can be added or removed, up to a maximum of 64 processes, and only power-of-2 process counts are allowed. That is, the valid configurations are 8, 16, 32, and 64 processes running on 1, 2, 4, and 8 nodes, respectively:
tasks:
  - name : "Example Task"
    executable : /path/to/my_executable
    arguments :
      - "a_positional_arg"
      - "--key"
      - "value"
    launch_generator:
      model: "AmdahlPsetModel"
      model_params:
        t_s : 1
        t_p : 200
      mapping : "8:node"
      num_procs : 8
generators:
  - key : "power2_reconf"
    function : "output_space_generator_replace"
    model: "AmdahlPsetModel"
    model_params:
      t_s : 1
      t_p : 200
    mapping : "8:node"
    max_procs : 64
    power2 : true
runtime : 3600
num_nodes : 1
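The configurations named in this example can be checked with a few lines of arithmetic. The sketch below is not DynRM's Output Space Generator implementation. It simply enumerates the power-of-2 process counts between a lower and an upper bound and derives the node count from the "8:node" mapping. The lower bound of 8 is taken from the example text, and the function names are hypothetical.

```python
import math


def is_power_of_two(n):
    """Return True if n is a positive power of two."""
    return n > 0 and n & (n - 1) == 0


def valid_configurations(min_procs, max_procs, procs_per_node):
    """List (num_procs, num_nodes) pairs for all power-of-2 process counts in range."""
    return [
        (procs, math.ceil(procs / procs_per_node))
        for procs in range(min_procs, max_procs + 1)
        if is_power_of_two(procs)
    ]


print(valid_configurations(min_procs=8, max_procs=64, procs_per_node=8))
# [(8, 1), (16, 2), (32, 4), (64, 8)]
```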
Job Reconfiguration
Following the DPP design principles, job reconfigurations are expressed as Process Set Operations (PSet Operations). The Dynamic Open-MPI and Dynamic OpenPMIx libraries provide corresponding functions to specify PSet Operations.
These interfaces support the following arguments:
input_psets: The list of input PSets of the operation
psetop_type: The type of the PSet operation, e.g. ADD, SUB, GROW, SHRINK, REPLACE, SPLIT, UNION, DIFFERENCE, …
In addition, further information for DynRM can be provided, e.g. via an MPI_Info object:
model: The PSetOp model. The default models are of the form Default[Operation]Model, e.g. DefaultReplaceModel
generator_key: <String> Key referencing an Output Space Generator specified in the job submission script, e.g. “power2_reconf”
output_space_generator: <String> specifying the parameter binding for an Output Space Generator function (using partial from the functools module). See Advanced Usage.
output_space_generator_json: A <json string> containing parameters for Output Space Generators:
{
  "function": "[generator_function]",   // Output Space Generator function (e.g., "output_space_generator_replace")
  "model": "[PSetModel class]",         // OPTIONAL: PSetModel class used for output (e.g., "AmdahlPsetModel")
  "model_params": {                     // OPTIONAL: Parameters for the PSetModel
    "key1": "[value1]",                 // OPTIONAL: default = t_s : 1
    "key2": "[value2]"                  // OPTIONAL: default = t_p : 200
  },
  "mapping": "[NUM_PROCS]:node",        // OPTIONAL: Mapping of output PSets (processes per node)
  "num_procs_add": "[NUM_PROCS]",       // OPTIONAL: Fixed number of processes to add
  "num_procs_sub": "[NUM_PROCS]",       // OPTIONAL: Fixed number of processes to remove
  "max_procs": "[NUM_PROCS]",           // OPTIONAL: Maximum number of processes after reconfiguration
  "min_procs": "[NUM_PROCS]",           // OPTIONAL: Minimum number of processes after reconfiguration
  "power2": "[true/false]",             // OPTIONAL: Restrict to powers of 2
  "factor": "[FACTOR]"                  // OPTIONAL: Fixed factor between input and output process counts
}
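Because the JSON string is usually assembled in application code, a small helper can guard against misspelled keys before the string is handed to DynRM. The following is a sketch under assumptions: the helper make_generator_json is hypothetical, the set of accepted keys is copied from the template above, and whether DynRM expects numeric values or strings for the numeric fields is not specified here.

```python
import json

# Keys taken from the template above
ALLOWED_KEYS = {
    "function", "model", "model_params", "mapping", "num_procs_add",
    "num_procs_sub", "max_procs", "min_procs", "power2", "factor",
}


def make_generator_json(function, **options):
    """Serialize Output Space Generator parameters, rejecting unknown keys."""
    unknown = set(options) - ALLOWED_KEYS
    if unknown:
        raise ValueError(f"unknown generator option(s): {sorted(unknown)}")
    return json.dumps({"function": function, **options})


generator_json = make_generator_json(
    "output_space_generator_replace",
    model="AmdahlPsetModel",
    model_params={"t_s": 1, "t_p": 200},
    mapping="8:node",
    max_procs=64,
    power2=True,  # serialized as the JSON literal true
)
print(generator_json)
```

Note that json.dumps converts the Python value True into the JSON literal true, so no manual quoting is needed.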
Example using MPI4py:
import json
from mpi4py import MPI  # assumes the dynamic MPI4py build that provides PSet operations

# `session` is assumed to be an already initialized MPI session object
input_psets = ['mpi://WORLD']
op = MPI.PSETOP_REPLACE
generator_json = json.dumps(
    {
        "function": "output_space_generator_replace",
        "model": "AmdahlPsetModel",
        "model_params": {
            "t_s": 1,
            "t_p": 2
        },
        "mapping": "8:node",
        "max_procs": 64,
        "power2": True,
    }
)
info = MPI.Info.Create()
info.Set('model', 'DefaultReplaceModel')
info.Set('output_space_generator_json', generator_json)
req = session.Dyn_v2a_psetop_nb(op, input_psets, info)
# The PSetOp is forwarded to DynRM. Use req.Test() to check for reconfiguration.
Job Mix Submissions
Job mix files have a “.mix” file extension and are structured as CSV files.
CSV structure
| arrival_time_s | submission_path | parameters |
|---|---|---|
| arrival time (in seconds after job mix submission) | path to submission file (.batch file) | Optional dict of parameters |
Supported parameters
terminate_soon: True/False - When True, after submission of this job the resource manager will terminate when all jobs have finished.
Example:
1,/path/to/executable1,
10,/path/to/executable10,
11,/path/to/executable11,
12,/path/to/executable12,
13,/path/to/executable13, {"terminate_soon": True}
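To make the three columns concrete, the sketch below parses lines of this shape into an arrival time, a submission path, and a parameter dict. It is not DynRM's own parser. The function parse_mix_line is hypothetical, and it assumes the optional third column is a Python dict literal, which the capitalized True in the example suggests but does not confirm.

```python
import ast


def parse_mix_line(line):
    """Split one job mix line into (arrival_time_s, submission_path, parameters)."""
    # Split on the first two commas only, so commas inside the dict are preserved
    fields = [field.strip() for field in line.split(",", 2)]
    fields += [""] * (3 - len(fields))  # tolerate a missing parameters column
    arrival_time_s, submission_path, parameters = fields
    # Assumption: parameters, if present, are written as a Python dict literal
    parsed_parameters = ast.literal_eval(parameters) if parameters else {}
    return float(arrival_time_s), submission_path, parsed_parameters


print(parse_mix_line('13,/path/to/executable13, {"terminate_soon": True}'))
print(parse_mix_line("1,/path/to/executable1,"))
```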