
Note: Most of the documentation and other details come from https://github.com/Concerto-D/concerto-decentralized.git and directly from the example experiment file.

How to start the experiment

Note: To run the experiment locally or remotely you need a g5k account.

Setup g5k credentials and access

Set g5k credentials

  • If the execution is local: create the file ~/.python-grid5000.yaml with the following content:
username: <g5k_username>
password: "<g5k_password>"
  • If on g5k: the authentication should work out of the box, so the content should just be:
verify_ssl: /etc/ssl/certs/ca-certificates.crt

More information on python-grid5000 here: https://msimonin.gitlabpages.inria.fr/python-grid5000/#installation-and-examples
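As a minimal sketch of the local-execution case above, the credentials file could also be generated from Python. The helper names `render_credentials` and `write_credentials` are hypothetical and not part of python-grid5000. The file is written with owner-only permissions because it holds a plain-text password.

```python
import json
import os
from pathlib import Path


def render_credentials(username: str, password: str) -> str:
    """Return the two-line content expected in ~/.python-grid5000.yaml."""
    # json.dumps produces a double-quoted string and escapes embedded quotes
    # or backslashes; such a string is also a valid YAML double-quoted scalar.
    return f"username: {username}\npassword: {json.dumps(password)}\n"


def write_credentials(username: str, password: str, path: Path) -> None:
    """Write the credentials file and restrict it to its owner."""
    path.write_text(render_credentials(username, password))
    os.chmod(path, 0o600)
```

A call such as `write_credentials("<g5k_username>", "<g5k_password>", Path.home() / ".python-grid5000.yaml")` would create the file described above.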

For local execution: Set g5k access

  • The grid5000 ssh private key is needed to access g5k. The evaluation code also reads the ssh config file to configure its access to g5k, so some rules have to be added or modified in that file.
  • Create or modify the file ~/.ssh/config and add the following rules:
Host g5k
  HostName access.grid5000.fr
  User <g5k_username>
  IdentityFile "<g5k_private_key>"
  ForwardAgent no
  
  # The following entries are added for local execution so that only one ssh connection to g5k is used. By default,
  # Enoslib creates as many ssh connections as the number of nodes it reserves, which makes g5k refuse some of them
  ControlMaster auto
  ControlPath /dev/shm/ssh-g5k-master

 Host *.grid5000.fr
  User <g5k_username>
  ProxyCommand ssh g5k -W "$(basename %h .g5k):%p"
  IdentityFile "<g5k_private_key>"
  ForwardAgent no
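Since the evaluation code relies on these entries, it may help to check that both `Host` blocks are present before starting. The sketch below is a simplified check, not a full ssh config parser: it only looks at `Host` lines written with whitespace (not the `Host=...` form), and the function names are hypothetical.

```python
def declared_hosts(ssh_config_text: str) -> list[str]:
    """Return every pattern found on a `Host` line of an ssh config."""
    hosts: list[str] = []
    for raw_line in ssh_config_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        # ssh keywords are case-insensitive; one Host line may list several patterns.
        if len(parts) == 2 and parts[0].lower() == "host":
            hosts.extend(parts[1].split())
    return hosts


def missing_g5k_hosts(ssh_config_text: str) -> list[str]:
    """Return the Host patterns from the rules above that are not declared."""
    required = ["g5k", "*.grid5000.fr"]
    present = declared_hosts(ssh_config_text)
    return [host for host in required if host not in present]
```

Passing the content of ~/.ssh/config to `missing_g5k_hosts` returns an empty list when both entries are declared.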

Configure the experiment parameters

The file expe_parameters_example.yaml contains the parameters of the experiments and is fed to the Python script that starts the experiment. It is only an example configuration: for each experiment to run, it has to be adapted before being passed as a parameter to the script. Here is the file with an explanation of each parameter:

global_parameters:
  expe_name: "openstack-deployment"      # The experiment name, uniquely identifies the experiment
  environment: "remote"                  # Where the reconfiguration programs are launched:
                                            # local: on the host machine (each program is one process)
                                            # remote: each program is assigned to a g5k node
  version_concerto_d: "asynchronous"     # Which version to launch
                                            # synchronous: without router
                                            # asynchronous: with router
  use_case_name: "parallel_deps"         # parallel_deps is the only available use_case to launch
  all_expes_dir: "/home/yperiquoi/concerto-d-projects"     # Controller directory (on the machine that runs the controller)
  all_executions_dir: "/root/concerto-d-projects"    # Nodes directory (on the machines that simulate the nodes)

  fetch_experiment_results: "True" # Whether the experiment results should be fetched
  local_expe_res_dir: "concerto-d/prod/raspberry-5_deps-50-duration" # Where to find the local directory for the experiment scenario
  send_mail_after_all_expes: "False" # Whether the user should be notified by mail
  ###

# Mail notification parameters
email_parameters:
  smtp_server: ""
  smtp_port: 587
  username: ""
  password: ""
###

# Infrastructure reservation on g5k. Used only if environment == remote, else ignored
reservation_parameters:
  job_name_concerto: "concerto-d"    # Name of the g5k reservation
  walltime: "01:00:00"               # Duration of the reservation format HH:MM:SS
  reservation: ""                    # e.g.: "2022-09-08 19:00:00", schedule the reservation, leave blank ("") for immediate runs
  nb_server_clients: 0               # Not used
  nb_servers: 1                      # Nb of server reconf programs, exactly 1 is supported
  nb_dependencies: 1                 # Nb of dependency reconf programs
  nb_zenoh_routers: 1                # Nb of routers running pub/sub service (Zenoh). 0 if synchronous, 1 or more if asynchronous (has been tested only with 0 and 1)
  cluster: "grisou"                  # Which g5k cluster to reserve nodes
  destroy_reservation: "False"       # Destroy reservation immediately after all experiments are done or failed


# Synthetic use case parameters. Will be "swept" by ParamSweeper (https://mimbert.gitlabpages.inria.fr/execo/execo_engine.html?highlight=paramsweeper#execo_engine.sweep.sweep):
#   - <uptimes>: schedules representing the uptimes and sleeping periods of the server and its dependencies
#   - <transitions_times>: transition times of the reconfiguration actions for each component
#   - <waiting_rate>: for each reconf program, when the reconf is blocked, percentage of mandatory time
#                     to be up before going back to sleep (leave it at 1 for MASCOTS)
#   - <id>: uniquely identifies a combination of the previous parameters (used to repeat the same experiment multiple times)
sweeper_parameters:
  uptimes: ["mascots_uptimes-60-50-5-ud0_od0_15_25_perc.json", "mascots_uptimes-60-50-5-ud1_od0_15_25_perc.json", "mascots_uptimes-60-50-5-ud2_od0_15_25_perc.json"]
  transitions_times: ["transitions_times-1-30-deps12-0.json", "transitions_times-1-30-deps12-1.json"]
  waiting_rate: [1]
  id: [1]
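To illustrate how many experiments the `sweeper_parameters` above produce, here is a minimal sketch that assumes the sweep is the Cartesian product of the listed values. It uses `itertools.product` as a stand-in and is not the execo implementation; the `expand` helper is hypothetical.

```python
from itertools import product

sweeper_parameters = {
    "uptimes": [
        "mascots_uptimes-60-50-5-ud0_od0_15_25_perc.json",
        "mascots_uptimes-60-50-5-ud1_od0_15_25_perc.json",
        "mascots_uptimes-60-50-5-ud2_od0_15_25_perc.json",
    ],
    "transitions_times": [
        "transitions_times-1-30-deps12-0.json",
        "transitions_times-1-30-deps12-1.json",
    ],
    "waiting_rate": [1],
    "id": [1],
}


def expand(parameters: dict[str, list]) -> list[dict]:
    """Return one dict per combination of the listed values."""
    names = list(parameters)
    value_lists = [parameters[name] for name in names]
    return [dict(zip(names, values)) for values in product(*value_lists)]


combinations = expand(sweeper_parameters)
print(len(combinations))  # 3 uptimes x 2 transitions_times x 1 x 1 = 6
```

Under that assumption, adding a second value to `id` (for example `[1, 2]`) doubles the number of combinations, which is how the same experiment can be repeated.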

Installation of the project

Install apt deps

  • sudo apt update
  • sudo apt install python3-pip virtualenv

Set up Python project:

  • virtualenv venv
  • source venv/bin/activate
  • pip install -r requirements.txt

Start an experiment

To start the experiment, run the script with the previously tuned configuration file:

python3 experiment/execution_experiment.py expe_parameters_example.yaml
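Before launching the command, the configuration can be sanity-checked against the constraints stated in the parameter comments. The sketch below is a hypothetical pre-flight check, not part of the project. It works on an already parsed dictionary, for instance the result of `yaml.safe_load` (assuming PyYAML is installed).

```python
def check_parameters(params: dict) -> list[str]:
    """Return a list of problems found in a parsed parameters file."""
    problems: list[str] = []
    global_params = params.get("global_parameters", {})
    reservation = params.get("reservation_parameters", {})

    environment = global_params.get("environment")
    if environment not in ("local", "remote"):
        problems.append("environment must be 'local' or 'remote'")

    version = global_params.get("version_concerto_d")
    if version not in ("synchronous", "asynchronous"):
        problems.append("version_concerto_d must be 'synchronous' or 'asynchronous'")

    if global_params.get("use_case_name") != "parallel_deps":
        problems.append("use_case_name must be 'parallel_deps'")

    # The reservation parameters are only used when environment == remote.
    if environment == "remote":
        if reservation.get("nb_servers") != 1:
            problems.append("nb_servers must be exactly 1")
        routers = reservation.get("nb_zenoh_routers")
        if version == "synchronous" and routers != 0:
            problems.append("nb_zenoh_routers must be 0 when synchronous")
        if version == "asynchronous" and not (isinstance(routers, int) and routers >= 1):
            problems.append("nb_zenoh_routers must be 1 or more when asynchronous")

    return problems
```

An empty list means none of these checks failed. It does not guarantee that the reservation itself will succeed on g5k.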

Gathering results

Two dirs are created for the execution: a local dir and a remote dir.

The remote dir is <all_executions_dir>/execution-<expe_name>-<datetime_expe_execution>/ and is always on g5k. It contains mainly the log files of the assemblies for debugging purposes.

The local dir is the folder <all_expes_dir>/global-<expe_name>-dir/. It is either on g5k or on your computer, depending on whether you executed the script on g5k or locally. It contains:

  • The execution dirs for each experiment: execution-<expe_name>-<datetime_expe_execution> which in turn contains:
    • The timestamp of each step of the reconfiguration in log_files_assemblies/. These files serve to compute the global result at the end.
    • The global result of the experiment, computed in the file: results_<concerto_d_version>_T<transition_time_id>_perc-<min_overlap>-<max_overlap>_expe_<waiting_rate>.json
  • The log of the controller execution for all the experiments, in experiment_logs/experiment_logs_<datetime_controller_execution>.txt
  • The state of the ParamSweeper in sweeps/. The sweeper is part of the execo python library and keeps track of the current state of the execution of experiments. In our case, it marks experiments as todo if they still have to be done, done if they finished correctly, in_progress if they are in progress (or if the whole process crashed) and skipped if an exception occurred during the execution.
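The layout described above can be walked to gather every global result in one place. This is a minimal sketch, assuming the results_*.json files sit directly inside each execution dir, as the list suggests. The `collect_results` helper is hypothetical and the structure of the JSON content is not assumed.

```python
import json
from pathlib import Path


def collect_results(all_expes_dir: str, expe_name: str) -> dict[str, object]:
    """Map each results_*.json path (relative to the local dir) to its parsed content."""
    local_dir = Path(all_expes_dir) / f"global-{expe_name}-dir"
    pattern = f"execution-{expe_name}-*/results_*.json"
    results: dict[str, object] = {}
    # sorted() gives a stable order across runs.
    for result_file in sorted(local_dir.glob(pattern)):
        with result_file.open() as handle:
            results[str(result_file.relative_to(local_dir))] = json.load(handle)
    return results
```

With the example configuration, a call would look like `collect_results("/home/yperiquoi/concerto-d-projects", "openstack-deployment")`.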