Step 1: Create a new Trial

Download cogment-cli

Download the latest version of cogment-cli

Bootstrap the project

Run the following to create a bootstrap project:

$ cogment init rps

   Number of actor types: 1
   Actor 1 name: player
   Number of AI "player": 1
   Number of Human "player": 1
   generating python settings: rps/cog_settings.py
   Your new project is now ready
   Don't forget to run "cogment-cli generate" whenever you change the cogment.yaml or a .proto file

$ cd rps

Rock-Paper-Scissors has only one type of Actor, which we named “player”. There will be two instances of the “player” actor: one played by the AI and one by a human user.

You now have all the elements of a (blank) Cogment project. Next, let's implement the logic.

Define our data structures

Cogment uses Protocol Buffers to define and serialize messages, and gRPC for communication across services.

data.proto

syntax = "proto3";

package bootstrap;

enum Decision {
    NONE = 0;
    ROCK = 1;
    PAPER = 2;
    SCISSOR = 3;
}

message Observation {
    int32 p1_score = 1;
    int32 p2_score = 2;
}

message PlayerAction {
    Decision decision = 1;
}

In RPS, the Action Space of our Actor Class is a discrete choice between three alternatives (“Rock”, “Paper”, “Scissors”), plus a “None” decision that is used when we initialize the game. We use an enum called Decision to represent those alternatives, and the action space has a field of that enum type.

The Observation is a point-in-time snapshot of the state of the environment. In the present RPS case, it’s the score of each player.

The PlayerAction is the action taken by any Actor of the player class (Agent or User). If the types of possible actions were different, a separate AgentActor and UserActor would have been created, each with its own actor class.
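To make the action space concrete, here is a minimal Python sketch of how a round could be scored from two Decision values. It is an illustration only: the Decision IntEnum below is a hand-written stand-in for the enum in data.proto (not the generated protobuf code), and round_winner is a hypothetical helper that is not part of the bootstrap project or the Cogment SDK.

```python
from enum import IntEnum


class Decision(IntEnum):
    """Hand-written stand-in mirroring the Decision enum in data.proto."""
    NONE = 0
    ROCK = 1
    PAPER = 2
    SCISSOR = 3


# Each decision mapped to the decision it defeats.
BEATS = {
    Decision.ROCK: Decision.SCISSOR,
    Decision.PAPER: Decision.ROCK,
    Decision.SCISSOR: Decision.PAPER,
}


def round_winner(p1, p2):
    """Return 1 or 2 for the winning player, or 0 for a draw.

    Hypothetical helper: a round where either player is still at NONE
    (the initial value) is treated as a draw.
    """
    if Decision.NONE in (p1, p2) or p1 == p2:
        return 0
    return 1 if BEATS[p1] == p2 else 2
```

A result of 1 or 2 would increment p1_score or p2_score in the Observation; a result of 0 leaves both scores unchanged.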

Compile the .proto file into Python using the following command:

cogment generate --python_dir=.

The Environment

A default environment is provided by the bootstrap project code. It inherits from the cogment::Environment class and defines three methods: start, update, and end.
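The role of the three methods can be sketched in plain Python. The class below is a hypothetical stand-in that does not inherit from the Cogment SDK; its argument names and return values are assumptions chosen for illustration, and the generated file may look different. It tracks the two scores from the Observation message.

```python
class SketchEnvironment:
    """Standalone illustration; NOT the generated Cogment environment."""

    def start(self):
        # Called once when a trial begins: reset state, return the first observation.
        self.scores = {"p1_score": 0, "p2_score": 0}
        return dict(self.scores)

    def update(self, winner):
        # Called on each step. `winner` is an assumed argument:
        # 1 or 2 for the winning player, 0 for a draw.
        if winner == 1:
            self.scores["p1_score"] += 1
        elif winner == 2:
            self.scores["p2_score"] += 1
        return dict(self.scores)

    def end(self):
        # Called once when the trial ends: return the final scores.
        return dict(self.scores)
```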

Start the Env service

$ docker-compose up env
Starting rps_env_1 ... done
Attaching to rps_env_1
env_1           | Versions:
env_1           |   aok_sdk: 0.1.4
env_1           |   grpc: 1.23.0
env_1           |   env: 1.0.0
env_1           | Environment service started
env_1           | cogment.Environment service listening on port 9000

Then test it in another terminal:

$ docker-compose run grpc-cli call env:9000 Version ""
connecting to env:9000
versions {
  name: "aok_sdk"
  version: "0.1.4"
}
versions {
  name: "grpc"
  version: "1.23.0"
}
versions {
  name: "env"
  version: "1.0.0"
}

Rpc succeeded with OK status

The Environment is up and running.

The Agent

The bootstrap project provides one agent file per actor type defined during the init process in the “Bootstrap the project” section above. Only one actor type, player, was defined, so only a player.py file was generated.

An agent must implement three methods: decide, reward, and end.
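As a rough picture of those three methods, here is a plain Python sketch of an agent that plays at random. It is a hypothetical stand-in, not the generated player.py: the class does not use the Cogment SDK, and the argument names, the numeric reward, and the return values are all assumptions made for illustration.

```python
import random

# Values mirror the Decision enum in data.proto (NONE = 0 is excluded).
ROCK, PAPER, SCISSOR = 1, 2, 3


class SketchPlayer:
    """Standalone illustration; NOT the generated player.py agent."""

    def __init__(self, seed=None):
        self.rng = random.Random(seed)
        self.total_reward = 0.0

    def decide(self, observation):
        # Ignore the observation and pick one of the three moves at random.
        return self.rng.choice([ROCK, PAPER, SCISSOR])

    def reward(self, value):
        # Accumulate feedback; `value` is an assumed numeric reward.
        self.total_reward += value

    def end(self):
        # Called when the trial ends; report the accumulated reward.
        return self.total_reward
```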

VERSIONS is a special variable that lists the versions reported when the Version procedure is called. The framework takes care of adding the versions of the SDK and gRPC.

Start the agent service

$ docker-compose up player
Attaching to rps_player_1
player_1        | Versions:
player_1        |   aok_sdk: 0.1.4
player_1        |   grpc: 1.23.0
player_1        |   player: 1.0.0
player_1        | Agent Service started
player_1        | cogment.Agent service listening on port 9000


Test it in another terminal:

```text
$ docker-compose run grpc-cli call player:9000 Version ""
connecting to agent:9000
versions {
  name: "aok_sdk"
  version: "0.1.6"
}
versions {
  name: "grpc"
  version: "1.23.0"
}
versions {
  name: "player"
  version: "1.0.0"
}
Rpc succeeded with OK status
```

As you've probably noticed, both the environment and agent are running on port 9000. There is no conflict thanks to the under-the-hood use of docker and docker-compose.

Start the trial service

This service is implemented by running a component called the orchestrator, which is the entry point for the system and the interface between the client and the backend. This component is provided as a docker image and is automatically added by the bootstrap process.

The orchestrator needs to know where the Env and the Agent are running. Cogment uses a distributed infrastructure where components can live on different servers. This is an important concern for Human / AI Interaction training, since one may have a human user base spread out in many different locations.

$ docker-compose up orchestrator
rps_player_1 is up-to-date
Creating rps_env_1 ... done
Creating rps_orchestrator_1 ... done
Attaching to rps_orchestrator_1
orchestrator_1  | [2019-08-26 15:01:51.990] [info] AoM Orchestrator v. 0.1.1
orchestrator_1  | [2019-08-26 15:01:51.990] [trace] creating stubs
orchestrator_1  | [2019-08-26 15:01:51.990] [info] Connecting to env service at env:9000
orchestrator_1  | [2019-08-26 15:01:51.990] [trace] starting prometheus
orchestrator_1  | [2019-08-26 15:01:51.991] [trace] building server
orchestrator_1  | [2019-08-26 15:01:51.992] [info] Server listening for trials on 0.0.0.0:9000

In a new terminal, call the orchestrator to start a new Trial:

$ docker-compose run grpc-cli call orchestrator:9000 Start ""
D0719 13:50:27.220812582       9 env_linux.cc:71]            Warning: insecure environment read function 'getenv' used
connecting to orchestrator:9000
player_id: 1
trial_id: "da66514c-fa04-46ba-8627-b18f1a71afb7"
env_state {
  time_stamp {
    seconds: 1563544227
    nanos: 244466000
  }
}

The command above sends a start-trial request and receives a successful response containing a trial_id, which also appears in the terminal where the orchestrator is listening for trials:

```text
orchestrator_1  | [2019-09-09 17:30:48.628] [info] creating trial: da47d64e-1a62-49ca-aebd-5ecff9e64a5e
orchestrator_1  | [2019-09-09 17:30:48.632] [info] populating trial...
```

This concludes Step 1 of the Tutorial: you have bootstrapped a Cogment project, defined your protobufs, started the environment and agent services, and launched a trial through a debug command.

Let’s move on to actually implementing our components.