Note: Most of the documentation and other details come from https://github.com/Concerto-D/concerto-decentralized.git and directly from the example experiment file.
How to start the experiment
Note: To run the experiment locally or remotely you need a g5k account.
Setup g5k credentials and access
Set g5k credentials
- If the execution is local: create the file
~/.python-grid5000.yaml
with the following content:
username: <g5k_username>
password: "<g5k_password>"
- If on g5k: the authentication should work out of the box, so the content should just be:
verify_ssl: /etc/ssl/certs/ca-certificates.crt
More information on python-grid5000 here: https://msimonin.gitlabpages.inria.fr/python-grid5000/#installation-and-examples
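For local execution, the credentials file can also be generated from a script. The following is a minimal sketch and not part of the project: the helper name `write_g5k_credentials` is hypothetical, and restricting the file to mode 0600 is a suggestion made here because the file holds a plaintext password.

```python
import json
import os
from pathlib import Path


def write_g5k_credentials(
    username: str,
    password: str,
    path: Path = Path.home() / ".python-grid5000.yaml",
) -> Path:
    """Write the python-grid5000 credentials file used for local execution.

    Hypothetical helper: it only reproduces the two-line file shown above.
    """
    # json.dumps yields a double-quoted string with escapes, which is also a
    # valid YAML double-quoted scalar, so quotes in the password stay safe.
    content = f"username: {username}\npassword: {json.dumps(password)}\n"
    path.write_text(content, encoding="utf-8")
    # The file contains a plaintext password: readable by the owner only.
    os.chmod(path, 0o600)
    return path
```

Calling `write_g5k_credentials("<g5k_username>", "<g5k_password>")` writes to `~/.python-grid5000.yaml` by default; pass another `path` to try it out without touching an existing file.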
For local execution: Set g5k access
- The grid5000 ssh private key is needed to access g5k. The evaluation code also reads the ssh config file to configure its access to g5k, so some rules have to be added or modified there:
- Create or modify the file
~/.ssh/config
and add the following rules:
Host g5k
HostName access.grid5000.fr
User <g5k_username>
IdentityFile "<g5k_private_key>"
ForwardAgent no
# The following entries make local execution use a single ssh connection to g5k. By default, Enoslib
# creates as many ssh connections as the number of nodes it reserves, which makes g5k refuse some of them
ControlMaster auto
ControlPath /dev/shm/ssh-g5k-master
Host *.grid5000.fr
User <g5k_username>
ProxyCommand ssh g5k -W "$(basename %h .g5k):%p"
IdentityFile "<g5k_private_key>"
ForwardAgent no
Configure the experiment parameters
The file expe_parameters_example.yaml
contains the parameters of the experiments and is passed to the
Python script that starts the experiment. It is an example configuration: for each experiment to run,
it has to be adapted before being passed as a parameter to the script.
Here is the file, with an explanation of each parameter:
global_parameters:
  expe_name: "openstack-deployment"  # Experiment name; uniquely identifies the experiment
  environment: "remote"  # Where the reconfiguration programs are launched:
                         #   local: on the host machine (each program is one process)
                         #   remote: each program is assigned to a g5k node
  version_concerto_d: "asynchronous"  # Which version to launch:
                                      #   synchronous: without router
                                      #   asynchronous: with router
  use_case_name: "parallel_deps"  # parallel_deps is the only available use case
  all_expes_dir: "/home/yperiquoi/concerto-d-projects"  # Controller directory (on the machine that runs the controller)
  all_executions_dir: "/root/concerto-d-projects"  # Nodes directory (on the machines that simulate the nodes)
  fetch_experiment_results: "True"  # Whether the experiment results should be fetched
  local_expe_res_dir: "concerto-d/prod/raspberry-5_deps-50-duration"  # Where to find the local directory for the experiment scenario
  send_mail_after_all_expes: "False"  # Whether the user should be notified by mail
###
# Mail notification parameters
email_parameters:
  smtp_server: ""
  smtp_port: 587
  username: ""
  password: ""
###
# Infrastructure reservation on g5k. Used only if environment == remote, else ignored
reservation_parameters:
  job_name_concerto: "concerto-d"  # Name of the g5k reservation
  walltime: "01:00:00"  # Duration of the reservation, format HH:MM:SS
  reservation: ""  # e.g. "2022-09-08 19:00:00" to schedule the reservation; leave blank ("") for immediate runs
  nb_server_clients: 0  # Not used
  nb_servers: 1  # Number of server reconfiguration programs; only exactly 1 is supported
  nb_dependencies: 1  # Number of dependency reconfiguration programs
  nb_zenoh_routers: 1  # Number of routers running the pub/sub service (Zenoh). 0 if synchronous, 1 or more if asynchronous (only tested with 0 and 1)
  cluster: "grisou"  # g5k cluster on which to reserve nodes
  destroy_reservation: "False"  # Destroy the reservation immediately after all experiments are done or failed
# Synthetic use case parameters. Will be "swept" by ParamSweeper (https://mimbert.gitlabpages.inria.fr/execo/execo_engine.html?highlight=paramsweeper#execo_engine.sweep.sweep):
# - <uptimes>: schedules representing the uptimes and sleeping periods of the server and its dependencies
# - <transitions_times>: transition times of the reconfiguration actions for each component
# - <waiting_rate>: for each reconf program, when the reconf is blocked, percentage of mandatory time
#   to be up before going back to sleep (leave it at 1 for MASCOTS)
# - <id>: uniquely identifies a combination of the previous parameters (used to repeat the same experiment multiple times)
sweeper_parameters:
  uptimes: ["mascots_uptimes-60-50-5-ud0_od0_15_25_perc.json", "mascots_uptimes-60-50-5-ud1_od0_15_25_perc.json", "mascots_uptimes-60-50-5-ud2_od0_15_25_perc.json"]
  transitions_times: ["transitions_times-1-30-deps12-0.json", "transitions_times-1-30-deps12-1.json"]
  waiting_rate: [1]
  id: [1]
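To estimate how many experiments a given sweeper_parameters section produces, the sweep can be pictured as a Cartesian product of the listed values. The sketch below reproduces that idea with the standard library only. It illustrates the principle and is not the code used by execo's ParamSweeper; the function name `enumerate_combinations` is hypothetical.

```python
from itertools import product


def enumerate_combinations(sweeper_parameters: dict) -> list:
    """Return one dict per combination of the swept parameter values."""
    names = list(sweeper_parameters)
    value_lists = [sweeper_parameters[name] for name in names]
    # product() yields every tuple that picks one value from each list.
    return [dict(zip(names, values)) for values in product(*value_lists)]


# Values copied from the example parameter file above.
sweeper_parameters = {
    "uptimes": [
        "mascots_uptimes-60-50-5-ud0_od0_15_25_perc.json",
        "mascots_uptimes-60-50-5-ud1_od0_15_25_perc.json",
        "mascots_uptimes-60-50-5-ud2_od0_15_25_perc.json",
    ],
    "transitions_times": [
        "transitions_times-1-30-deps12-0.json",
        "transitions_times-1-30-deps12-1.json",
    ],
    "waiting_rate": [1],
    "id": [1],
}

combinations = enumerate_combinations(sweeper_parameters)
print(len(combinations))  # 3 uptimes x 2 transitions_times x 1 x 1 = 6
```

Under this reading, setting `id` to `[1, 2]` doubles the number of combinations, which matches the role the comments give to `id`: repeating the same experiment multiple times.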
Installation of the project
Install apt deps
sudo apt update
sudo apt install python3-pip virtualenv
Set up Python project:
virtualenv venv
source venv/bin/activate
pip install -r requirements.txt
Start an experiment
To start the experiment, run the following command with the previously tuned configuration file:
python3 experiment/execution_experiment.py expe_parameters_example.yaml
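Since a remote run reserves g5k nodes, it can be worth checking the parameters against the constraints stated in the configuration comments before launching. The sketch below is a hypothetical pre-flight check and not part of the project: `validate_parameters` is a name chosen here, it assumes that `global_parameters` and `reservation_parameters` are top-level sections of the file, and it only covers the constraints documented in this README. Loading the file is left out; a YAML loader such as PyYAML's `yaml.safe_load` (an assumption about your environment) would produce the dict it expects.

```python
def validate_parameters(params: dict) -> list:
    """Return a list of problems found; an empty list means none was found."""
    problems = []
    global_params = params.get("global_parameters", {})
    reservation = params.get("reservation_parameters", {})

    environment = global_params.get("environment")
    if environment not in ("local", "remote"):
        problems.append(f"environment must be 'local' or 'remote', got {environment!r}")

    version = global_params.get("version_concerto_d")
    if version not in ("synchronous", "asynchronous"):
        problems.append(
            f"version_concerto_d must be 'synchronous' or 'asynchronous', got {version!r}"
        )

    if global_params.get("use_case_name") != "parallel_deps":
        problems.append("use_case_name must be 'parallel_deps' (only available use case)")

    # Reservation parameters are documented as ignored unless environment is remote.
    if environment == "remote":
        if reservation.get("nb_servers") != 1:
            problems.append("nb_servers must be exactly 1")
        routers = reservation.get("nb_zenoh_routers")
        if version == "synchronous" and routers != 0:
            problems.append("nb_zenoh_routers must be 0 for the synchronous version")
        if version == "asynchronous" and (not isinstance(routers, int) or routers < 1):
            problems.append("nb_zenoh_routers must be 1 or more for the asynchronous version")

    return problems


example = {
    "global_parameters": {
        "environment": "remote",
        "version_concerto_d": "asynchronous",
        "use_case_name": "parallel_deps",
    },
    "reservation_parameters": {"nb_servers": 1, "nb_zenoh_routers": 1},
}
print(validate_parameters(example))  # prints []
```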
Gathering results
Two dirs are created for the execution: a local dir and a remote dir.
The remote dir is <all_executions_dir>/execution-<expe_name>-<datetime_expe_execution>/
and is always on g5k.
It mainly contains the log files of the assemblies, for debugging purposes.
The local dir is <all_expes_dir>/global-<expe_name>-dir/
and is either on g5k or on your computer,
depending on whether you executed the script on g5k or locally. It contains:
- The execution dirs for each experiment, execution-<expe_name>-<datetime_expe_execution>, each of which in turn contains:
  - The timestamp of each step of the reconfiguration, in log_files_assemblies/. These files serve to compute the global result at the end.
  - The global result of the experiment, computed in the file results_<concerto_d_version>_T<transition_time_id>_perc-<min_overlap>-<max_overlap>_expe_<waiting_rate>.json
- The log of the controller's execution for all the experiments, in experiment_logs/experiment_logs_<datetime_controller_execution>.txt
- The state of the ParamSweeper, in sweeps/. The sweeper is part of the execo Python library and keeps track of the current state of the execution of the experiments. In our case, it marks each experiment as todo if it still has to be run, done if it finished correctly, in_progress if it is in progress (or if the whole process crashed), and skipped if an exception occurred during the execution.
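Given the layout of the local dir, the global result files of all executions can be gathered with a few lines of Python. This is a sketch under assumptions: `collect_results` is a hypothetical helper, it searches each execution dir recursively because the exact depth of the result files is not assumed, and it expects nothing of the file content beyond it being valid JSON.

```python
import json
from pathlib import Path


def collect_results(all_expes_dir: str, expe_name: str) -> dict:
    """Map each result file path (as a string) to its parsed JSON content."""
    global_dir = Path(all_expes_dir) / f"global-{expe_name}-dir"
    results = {}
    # Execution dirs are named execution-<expe_name>-<datetime_expe_execution>.
    for execution_dir in sorted(global_dir.glob(f"execution-{expe_name}-*")):
        # rglob: no assumption on how deep the result files are stored.
        for result_file in sorted(execution_dir.rglob("results_*.json")):
            with result_file.open(encoding="utf-8") as f:
                results[str(result_file)] = json.load(f)
    return results
```

For the example configuration this would be called as `collect_results("<all_expes_dir>", "openstack-deployment")`, with `<all_expes_dir>` replaced by the controller directory set in the parameter file.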