See https://github.com/Concerto-D/concerto-decentralized.git for details. Note: most of this documentation, as well as the example experiment file, comes directly from https://github.com/Concerto-D/concerto-decentralized.git
# How to start the experiment
**Note:** To run the experiment locally or remotely you need a **g5k account**.
## Set up g5k credentials and access
*Set g5k credentials*
- If the execution is local: create the file ```~/.python-grid5000.yaml``` with the following content:
```
username: <g5k_username>
password: "<g5k_password>"
```
- If on g5k: the authentication should work out of the box, so the content should just be:
```
verify_ssl: /etc/ssl/certs/ca-certificates.crt
```
More information on python-grid5000 is available here: https://msimonin.gitlabpages.inria.fr/python-grid5000/#installation-and-examples
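As an optional sanity check, the credentials file can be tested with a few lines of Python. This is only a sketch based on the python-grid5000 documentation linked above (the ```Grid5000.from_yaml_file``` helper and the ```sites``` endpoint come from that documentation, not from this repository):
```python
# Optional sanity check for ~/.python-grid5000.yaml (sketch based on the
# python-grid5000 documentation, not part of this repository).
from pathlib import Path

from grid5000 import Grid5000

# Load the credentials file created above.
conf_file = Path.home() / ".python-grid5000.yaml"
gk = Grid5000.from_yaml_file(conf_file)

# Listing the g5k sites only succeeds if the authentication works.
print([site.uid for site in gk.sites.list()])
```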
**For local execution:**
*Set g5k access*
- The **grid5000 ssh private key** is needed to access g5k. Some rules then have to be added to or modified in the **ssh config
file**, as the evaluation code uses it to configure its access to g5k:
- Create or modify the file ```~/.ssh/config``` and add the following rules:
```
Host g5k
    HostName access.grid5000.fr
    User <g5k_username>
    IdentityFile "<g5k_private_key>"
    ForwardAgent no
    # The following entries are added for local execution so that only one ssh connection to g5k is used. By default,
    # Enoslib creates as many ssh connections as the number of nodes it reserves, which makes g5k refuse some of the connections.
    ControlMaster auto
    ControlPath /dev/shm/ssh-g5k-master

Host *.grid5000.fr
    User <g5k_username>
    ProxyCommand ssh g5k -W "$(basename %h .g5k):%p"
    IdentityFile "<g5k_private_key>"
    ForwardAgent no
```
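Once these rules are in place, the access can be quickly checked by opening a connection with ```ssh g5k``` from the local machine: it should reach ```access.grid5000.fr``` with the configured key, without any extra options.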
*Configure the experiment parameters*
The file ```expe_parameters_example.yaml``` contains the parameters of the experiments. It is fed to the
python script that starts the experiment and contains an example of a configuration. **For each experiment** to run,
it has to be **adapted** before being passed as a parameter to the script.
Here is the file with an explanation of each parameter:
```yaml
global_parameters:
  expe_name: "openstack-deployment"   # The experiment name, uniquely identifies the experiment
  environment: "remote"               # Where the reconfiguration programs are launched:
                                      # local: on the host machine (each program is one process)
                                      # remote: each program is assigned to a g5k node
  version_concerto_d: "asynchronous"  # Which version to launch
                                      # synchronous: without router
                                      # asynchronous: with router
  use_case_name: "parallel_deps"      # parallel_deps is the only available use case to launch
  all_expes_dir: "/home/yperiquoi/concerto-d-projects"  # Controller directory (on the machine that runs the controller)
  all_executions_dir: "/root/concerto-d-projects"       # Nodes directory (on the machines that simulate the nodes)
  fetch_experiment_results: "True"    # Whether the experiment results should be fetched or not
  local_expe_res_dir: "concerto-d/prod/raspberry-5_deps-50-duration"  # Where to find the local directory for the experiment scenario
  send_mail_after_all_expes: "False"  # Whether the user should be notified by mail or not

###
# Mail notification parameters
email_parameters:
  smtp_server: ""
  smtp_port: 587
  username: ""
  password: ""

###
# Infrastructure reservation on g5k. Used only if environment == remote, else ignored
reservation_parameters:
  job_name_concerto: "concerto-d"  # Name of the g5k reservation
  walltime: "01:00:00"             # Duration of the reservation, format HH:MM:SS
  reservation: ""                  # e.g.: "2022-09-08 19:00:00", schedules the reservation; leave blank ("") for immediate runs
  nb_server_clients: 0             # Not used
  nb_servers: 1                    # Number of server reconfiguration programs; only exactly 1 is available
  nb_dependencies: 1               # Number of dependency reconfiguration programs
  nb_zenoh_routers: 1              # Number of routers running the pub/sub service (Zenoh). 0 if synchronous, 1 or more if asynchronous (has been tested only with 0 and 1)
  cluster: "grisou"                # Which g5k cluster to reserve nodes on
  destroy_reservation: "False"     # Destroy the reservation immediately after all experiments are done or failed

# Synthetic use case parameters. Will be "swept" by ParamSweeper (https://mimbert.gitlabpages.inria.fr/execo/execo_engine.html?highlight=paramsweeper#execo_engine.sweep.sweep),
# i.e. every combination of the values below is run as one experiment (see the sketch after this block):
# - <uptimes>: schedules representing the uptimes and sleeping periods of the server and its dependencies
# - <transitions_times>: transition times of the reconfiguration actions for each component
# - <waiting_rate>: for each reconf program, when the reconf is blocked, the percentage of the mandatory time
#                   to stay up before going back to sleep (leave it at 1 for MASCOTS)
# - <id>: uniquely identifies a combination of the previous parameters (used to repeat the same experiment multiple times)
sweeper_parameters:
  uptimes: ["mascots_uptimes-60-50-5-ud0_od0_15_25_perc.json", "mascots_uptimes-60-50-5-ud1_od0_15_25_perc.json", "mascots_uptimes-60-50-5-ud2_od0_15_25_perc.json"]
  transitions_times: ["transitions_times-1-30-deps12-0.json", "transitions_times-1-30-deps12-1.json"]
  waiting_rate: [1]
  id: [1]
```
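To make the sweep explicit, here is a small illustration (not part of the experiment code) of how the ```sweeper_parameters``` above are expanded into individual experiments. It uses the ```sweep``` function of the ```execo_engine``` library referenced in the comments; with 3 uptimes files, 2 transitions_times files, 1 waiting_rate and 1 id, 3 * 2 * 1 * 1 = 6 combinations are generated:
```python
# Illustration of how sweeper_parameters are expanded into experiments
# (cartesian product of the parameter values), using execo_engine.sweep.
from execo_engine import sweep

sweeper_parameters = {
    "uptimes": [
        "mascots_uptimes-60-50-5-ud0_od0_15_25_perc.json",
        "mascots_uptimes-60-50-5-ud1_od0_15_25_perc.json",
        "mascots_uptimes-60-50-5-ud2_od0_15_25_perc.json",
    ],
    "transitions_times": [
        "transitions_times-1-30-deps12-0.json",
        "transitions_times-1-30-deps12-1.json",
    ],
    "waiting_rate": [1],
    "id": [1],
}

# Each combination corresponds to one experiment run by the controller.
combinations = sweep(sweeper_parameters)
print(len(combinations))  # 3 * 2 * 1 * 1 = 6
for combination in combinations:
    print(combination)
```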
## Installation of the project
*Install apt deps*
- ```sudo apt update```
- ```sudo apt install python3-pip virtualenv```
*Set up Python project:*
- ```virtualenv venv```
- ```source venv/bin/activate```
- ```pip install -r requirements.txt```
## Start an experiment
```shell
python3 experiment/execution_experiment.py expe_parameters_example.yaml
```
### Gathering results
There are two dirs created for the execution: **local dir** and **remote dir**.
The **remote dir** is ```<all_executions_dir>/execution-<expe_name>-<datetime_expe_execution>/``` and is always on g5k.
It contains mainly the log files of the assemblies for **debugging purposes**.
The **local dir** is under the folder ```<all_expes_dir>/global-<expe_name>-dir/``` and can be either on g5k or on your computer,
depending on whether you executed the script on g5k or locally. It contains:
- The execution dirs for each experiment: ```execution-<expe_name>-<datetime_expe_execution>```, which in turn contain:
  - The timestamps of each step of the reconfiguration, in ```log_files_assemblies/```. These files serve to compute the global result at the end.
  - The global result of the experiment, computed in the file ```results_<concerto_d_version>_T<transition_time_id>_perc-<min_overlap>-<max_overlap>_expe_<waiting_rate>.json```
- The logs of the execution of the controller for all the experiments, in ```experiment_logs/experiment_logs_<datetime_controller_execution>.txt```
- The state of the ParamSweeper in ```sweeps/``` (see the sketch below). The sweeper is part of the ```execo``` python library and keeps track of the current state of the execution of the experiments. In our case, it marks an experiment as *todo* if it still has to be done, *done* if it finished correctly, *in_progress* if it is in progress (or if the whole process crashed), and *skipped* if an exception occurred during the execution.
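As a convenience, the persisted sweeper state can be inspected directly with ```execo```. The snippet below is only a sketch: the ```ParamSweeper``` constructor and the ```get_*``` accessors are assumed from the execo_engine documentation, and ```sweeps``` refers to the directory mentioned above.
```python
# Sketch: inspect the persisted ParamSweeper state stored in the sweeps/
# directory of the local dir (accessors assumed from the execo_engine docs).
from execo_engine import ParamSweeper

sweeper = ParamSweeper("sweeps")  # path to the sweeps/ directory of the local dir

# Each accessor returns the parameter combinations currently in that state.
print("done:       ", len(sweeper.get_done()))
print("skipped:    ", len(sweeper.get_skipped()))
print("in progress:", len(sweeper.get_inprogress()))
print("remaining:  ", len(sweeper.get_remaining()))
```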