Note: Most of the documentation and other details, as well as the example experiment file, come directly from
https://github.com/Concerto-D/concerto-decentralized.git. See that repository for details.
# How to start the experiment
**Note:** To run the experiment locally or remotely you need a **g5k account**.
### Set up g5k credentials and access
*Set g5k credentials*
- If the execution is local: create the file ```~/.python-grid5000.yaml``` with the following content:
```
username: <g5k_username>
password: "<g5k_password>"
```
- If on g5k: authentication should work out of the box, so the content should just be:
```
verify_ssl: /etc/ssl/certs/ca-certificates.crt
```
More information on python-grid5000 is available here: https://msimonin.gitlabpages.inria.fr/python-grid5000/#installation-and-examples
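For the local case, the credentials file can also be generated from a script. The following is a minimal sketch using only the Python standard library; the function names `render_g5k_credentials` and `write_g5k_credentials` are hypothetical helpers, not part of python-grid5000, and the sketch only assumes the two-key layout shown above.

```python
import os
from pathlib import Path


def render_g5k_credentials(username: str, password: str) -> str:
    """Return the two-line YAML body expected in ~/.python-grid5000.yaml."""
    # Escape backslashes and double quotes so the password stays a valid
    # double-quoted YAML scalar.
    escaped = password.replace("\\", "\\\\").replace('"', '\\"')
    return f'username: {username}\npassword: "{escaped}"\n'


def write_g5k_credentials(path: Path, username: str, password: str) -> None:
    """Write the credentials file (hypothetical helper)."""
    # The 0o600 mode is applied only when the file is created by this call;
    # an already existing file keeps its current permissions.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write(render_g5k_credentials(username, password))
```

A call such as `write_g5k_credentials(Path.home() / ".python-grid5000.yaml", "<g5k_username>", "<g5k_password>")` would then produce the file described above.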
**For local execution:**
*Set g5k access*
- The **grid5000 ssh private key** is needed to access g5k. You must also add or modify some rules in the
**ssh config file**, because the evaluation code uses this file to configure its access to g5k:
- Create or modify the file ```~/.ssh/config``` and add the following rules:
```
Host g5k
HostName access.grid5000.fr
User <g5k_username>
IdentityFile "<g5k_private_key>"
ForwardAgent no
# The following entries are added for local execution so that only one ssh connection to g5k is used. By default, Enoslib
# creates as many ssh connections as the number of nodes it reserves, which makes g5k refuse some of the connections
ControlMaster auto
ControlPath /dev/shm/ssh-g5k-master
Host *.grid5000.fr
User <g5k_username>
ProxyCommand ssh g5k -W "$(basename %h .g5k):%p"
IdentityFile "<g5k_private_key>"
ForwardAgent no
```
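To make the `ProxyCommand` rule easier to read, here is a small Python sketch of what the shell expression `$(basename %h .g5k)` evaluates to. The function `proxy_target`, the example host names and port 22 are assumptions used for illustration only; the sketch handles plain host names without slashes.

```python
def proxy_target(host: str, suffix: str = ".g5k") -> str:
    """Mimic `basename <host> <suffix>` for plain host names (no slashes)."""
    # basename removes the suffix only when the name ends with it and is
    # not equal to it; otherwise the name is returned unchanged.
    if host != suffix and host.endswith(suffix):
        return host[: -len(suffix)]
    return host


# Hypothetical host names, shown with the command ssh would run as a proxy.
for host in ("grisou-1.nancy.grid5000.fr", "grisou-1.nancy.g5k"):
    print(f"{host} -> ssh g5k -W {proxy_target(host)}:22")
```

With the `Host *.grid5000.fr` pattern used above, matching host names do not end in `.g5k`, so they are forwarded unchanged through the `g5k` access host.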
*Configure the experiment parameters*
The file ```expe_parameters_example.yaml``` contains an example configuration of the experiment parameters. It is passed
as a parameter to the python script that starts the experiment. **For each experiment** to run,
this file has to be **adapted** before being passed to the script.
Here is the file with an explanation of each parameter:
```yaml
global_parameters:
  expe_name: "openstack-deployment"  # The experiment name, uniquely identifies the experiment
  environment: "remote"  # Where the reconfiguration programs are launched:
                         #   local: on the host machine (each program is one process)
                         #   remote: each program is assigned to a g5k node
  version_concerto_d: "asynchronous"  # Which version to launch
                                      #   synchronous: without router
                                      #   asynchronous: with router
  use_case_name: "parallel_deps"  # parallel_deps is the only available use case to launch
  all_expes_dir: "/home/yperiquoi/concerto-d-projects"  # Controller directory (on the machine that runs the controller)
  all_executions_dir: "/root/concerto-d-projects"  # Nodes directory (on the machines that simulate nodes)
  fetch_experiment_results: "True"  # Whether the experiment results should be fetched or not
  local_expe_res_dir: "concerto-d/prod/raspberry-5_deps-50-duration"  # Where to find the local directory for the experiment scenario
  send_mail_after_all_expes: "False"  # Whether the user should be notified or not

###
# Mail notification parameters
email_parameters:
  smtp_server: ""
  smtp_port: 587
  username: ""
  password: ""

###
# Infrastructure reservation on g5k. Used only if environment == remote, else ignored
reservation_parameters:
  job_name_concerto: "concerto-d"  # Name of the g5k reservation
  walltime: "01:00:00"  # Duration of the reservation, format HH:MM:SS
  reservation: ""  # e.g.: "2022-09-08 19:00:00", schedules the reservation; leave blank ("") for immediate runs
  nb_server_clients: 0  # Not used
  nb_servers: 1  # Number of server reconfiguration programs; exactly 1 is the only supported value
  nb_dependencies: 1  # Number of dependency reconfiguration programs
  nb_zenoh_routers: 1  # Number of routers running the pub/sub service (Zenoh). 0 if synchronous, 1 or more if asynchronous (tested only with 0 and 1)
  cluster: "grisou"  # Which g5k cluster to reserve nodes on
  destroy_reservation: "False"  # Destroy the reservation immediately after all experiments are done or failed

# Synthetic use case parameters. Will be "swept" by ParamSweeper (https://mimbert.gitlabpages.inria.fr/execo/execo_engine.html?highlight=paramsweeper#execo_engine.sweep.sweep):
# - <uptimes>: schedules representing the uptimes and sleeping periods of the server and its dependencies
# - <transitions_times>: transition times of the reconfiguration actions for each component
# - <waiting_rate>: for each reconf program, when the reconf is blocked, percentage of mandatory time
#   to be up before going back to sleep (leave it at 1 for MASCOTS)
# - <id>: uniquely identifies a combination of the previous parameters (used to repeat the same experiment multiple times)
```
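The constraints stated in the comments above can be checked before launching a run. The sketch below is not part of the evaluation code: `check_parameters` is a hypothetical helper, and it assumes that the `global_parameters` and `reservation_parameters` sections have already been loaded into two Python dictionaries.

```python
import re

# HH:MM:SS, as described for the walltime parameter.
WALLTIME_RE = re.compile(r"^\d+:[0-5]\d:[0-5]\d$")


def check_parameters(global_params: dict, reservation_params: dict) -> list:
    """Return a list of human-readable problems (empty if none were found)."""
    problems = []
    environment = global_params.get("environment")
    if environment not in ("local", "remote"):
        problems.append("environment must be 'local' or 'remote'")
    version = global_params.get("version_concerto_d")
    if version not in ("synchronous", "asynchronous"):
        problems.append("version_concerto_d must be 'synchronous' or 'asynchronous'")
    if environment != "remote":
        # Reservation parameters are ignored unless environment == remote.
        return problems
    if not WALLTIME_RE.match(str(reservation_params.get("walltime", ""))):
        problems.append("walltime must use the HH:MM:SS format")
    if reservation_params.get("nb_servers") != 1:
        problems.append("nb_servers must be exactly 1")
    routers = reservation_params.get("nb_zenoh_routers")
    if version == "synchronous" and routers != 0:
        problems.append("nb_zenoh_routers must be 0 for the synchronous version")
    if version == "asynchronous" and (not isinstance(routers, int) or routers < 1):
        problems.append("nb_zenoh_routers must be 1 or more for the asynchronous version")
    return problems
```

An empty list means that none of the documented constraints is violated; it does not guarantee that the reservation itself will succeed.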
There are two dirs created for the execution: **local dir** and **remote dir**.
The **remote dir** is ```<all_executions_dir>/execution-<expe_name>-<datetime_expe_execution>/``` and is always on g5k.
It mainly contains the log files of the assemblies for **debugging purposes**.
The **local dir** is under the folder ```<all_expes_dir>/global-<expe_name>-dir/``` and can be either on g5k or on your computer,
depending on whether you executed the script on g5k or locally. It contains:
- The execution dirs for each experiment: ```execution-<expe_name>-<datetime_expe_execution>```, each of which in turn contains:
  - The timestamp of each step of the reconfiguration in ```log_files_assemblies/```. These
  files serve to compute the global result at the end.
  - The global result of the experiment, computed in the file: ```results_<concerto_d_version>_T<transition_time_id>_perc-<min_overlap>-<max_overlap>_expe_<waiting_rate>.json```
- The log of the controller's execution across all the experiments, in ```experiment_logs/experiment_logs_<datetime_controller_execution>.txt```
- The state of the ParamSweeper in ```sweeps/```. The sweeper is part of the ```execo``` python library
and keeps track of the current state of the execution of experiments. In our case, it marks experiments
as *todo* if they still have to be run, *done* if they finished correctly, *in_progress* if they are running (or if the whole process crashed), and
*skipped* if an exception occurred during the execution.
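
The directory layout described above can be summarised with a short Python sketch. The helper names are hypothetical, and the timestamp string in the usage example is an assumed format, since the exact format of `<datetime_expe_execution>` is not specified here.

```python
from pathlib import PurePosixPath


def remote_execution_dir(all_executions_dir: str, expe_name: str, started_at: str) -> PurePosixPath:
    """<all_executions_dir>/execution-<expe_name>-<datetime_expe_execution>"""
    return PurePosixPath(all_executions_dir) / f"execution-{expe_name}-{started_at}"


def local_global_dir(all_expes_dir: str, expe_name: str) -> PurePosixPath:
    """<all_expes_dir>/global-<expe_name>-dir"""
    return PurePosixPath(all_expes_dir) / f"global-{expe_name}-dir"


def local_execution_dir(all_expes_dir: str, expe_name: str, started_at: str) -> PurePosixPath:
    """Execution dir of one experiment, nested inside the local global dir."""
    return local_global_dir(all_expes_dir, expe_name) / f"execution-{expe_name}-{started_at}"


# Usage with the values from the example configuration and an assumed timestamp.
started_at = "2022-09-08_19-00-00"  # hypothetical format
print(remote_execution_dir("/root/concerto-d-projects", "openstack-deployment", started_at))
print(local_execution_dir("/home/yperiquoi/concerto-d-projects", "openstack-deployment", started_at) / "log_files_assemblies")
print(local_global_dir("/home/yperiquoi/concerto-d-projects", "openstack-deployment") / "sweeps")
```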