RLPark - Code Snippet Using Scheduling for Running Jobs in Parallel

Code Snippet: Using Scheduling for Running Jobs in Parallel

RLPark includes a few classes for running experiments on clusters or on computers with multiple cores. The main idea is to start a server that holds a list of jobs to run. The generic Java client rltoys-client.jar is then run on the nodes of the cluster. The client connects to the server, requests a list of jobs and, if necessary, downloads the code required to run them. As soon as the client is done with a job, it sends the result to the server. When the client's list of jobs is empty, it requests a new list of jobs from the server.
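The exchange described above can be illustrated with a minimal, self-contained sketch. It is not RLPark code: the names JobRoundTripSketch, SumJob, send, receive and clientProcess are hypothetical, and byte arrays stand in for the socket connection. It only assumes what the text states, namely that a batch of jobs travels to the client, is run there, and comes back carrying its results.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: none of these names come from RLPark.
class JobRoundTripSketch {
  // A job that can be shipped as bytes and that remembers its result after run().
  static class SumJob implements Runnable, Serializable {
    private static final long serialVersionUID = 1L;
    final int n;
    long result = -1; // filled in by run()

    SumJob(int n) {
      this.n = n;
    }

    @Override
    public void run() {
      long sum = 0;
      for (int i = 1; i <= n; i++)
        sum += i;
      result = sum;
    }
  }

  // Serialize an object to bytes, standing in for "send over the socket".
  static byte[] send(Serializable object) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
      out.writeObject(object);
    }
    return bytes.toByteArray();
  }

  // Deserialize bytes back into an object, standing in for "receive from the socket".
  static Object receive(byte[] data) throws IOException, ClassNotFoundException {
    try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
      return in.readObject();
    }
  }

  // Client side: receive a batch, run each job, and send the finished batch back.
  static byte[] clientProcess(byte[] batch) throws IOException, ClassNotFoundException {
    @SuppressWarnings("unchecked")
    ArrayList<SumJob> jobs = (ArrayList<SumJob>) receive(batch);
    for (SumJob job : jobs)
      job.run();
    return send(jobs);
  }

  public static void main(String[] args) throws Exception {
    // Server side: build a batch, hand it to the client, read back the results.
    ArrayList<SumJob> pending = new ArrayList<>();
    pending.add(new SumJob(10));
    pending.add(new SumJob(100));
    byte[] done = clientProcess(send(pending));
    @SuppressWarnings("unchecked")
    List<SumJob> finished = (List<SumJob>) receive(done);
    for (SumJob job : finished)
      System.out.println("n=" + job.n + " result=" + job.result);
  }
}
```

Because the client works on deserialized copies, the server only sees results through the jobs that are sent back, which is why the result is stored as a field of the job itself.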

RLPark also has a framework for evaluating control learning algorithms in both on-policy and off-policy settings.

The following code snippets provide examples of running the client/server in different configurations:

Job.java is an example of a runnable job. A job is defined in Java by implementing the Runnable interface. To be sent over the network, a job also needs to implement the Serializable interface. As shown in this example, a job can store its results after running (e.g. a measure of performance), so that when the client sends the finished job back to the server, the server can store the result in a file.
LocalScheduling.java creates a local scheduler to run a list of jobs. The method createJobList() creates the list of all the jobs to run. The method createJobDoneListener() creates the listener that is called every time a job is done. This listener is responsible for storing the result of the job in a file. The method main() creates a local scheduler, creates the list of jobs, adds them to the scheduler, and then asks the scheduler to run everything. Jobs are dispatched on all the cores available on the machine.
ServeurWithLocalScheduling.java creates a server with an embedded local scheduler. Using the methods from the class above, it creates a list of jobs and a listener and submits them to the scheduler. The server also opens a socket to which clients can connect to request jobs to run.
Assuming the server is packaged as server.jar, it can be started with the command:
java -jar server.jar
The client can be started by running:
java -jar rltoys-client.jar -t<max computation time in minutes> -c<number of cores to use> <hostname:port>
For instance:
java -jar rltoys-client.jar -t420 -c4 localhost:4000
runs the client for 420 minutes of computation, using 4 cores and connecting to localhost on port 4000. The options -t and -c are optional but useful on clusters. In this example, jobs are dispatched on 2 cores of the machine running the server and on the clients connected to the server.
ServeurWaitingClients.java is the same as above, except that it does not use the cores of the machine running the server to run jobs. Jobs therefore run only on clients. This is useful when many clients connect to the server.
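As a rough illustration of the local scheduling pattern described for LocalScheduling.java, the sketch below uses a standard java.util.concurrent thread pool in place of RLPark's scheduler. It is not the RLPark API: the class LocalSchedulingSketch, the job SquareJob and the method runAll are hypothetical, and the listener here collects results in memory where the real one would write them to a file. Only the method name createJobList mirrors the one mentioned in the text.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical stand-in built on java.util.concurrent; not RLPark's scheduler API.
class LocalSchedulingSketch {
  // A job is Runnable (so it can be executed) and Serializable (so it could be sent over a network).
  static class SquareJob implements Runnable, Serializable {
    private static final long serialVersionUID = 1L;
    final int input;
    int result; // stored by run() so the listener can read it afterwards

    SquareJob(int input) {
      this.input = input;
    }

    @Override
    public void run() {
      result = input * input;
    }
  }

  // Creates the list of all the jobs to run, with inputs 0 .. nbJobs-1.
  static List<SquareJob> createJobList(int nbJobs) {
    List<SquareJob> jobs = new ArrayList<>();
    for (int i = 0; i < nbJobs; i++)
      jobs.add(new SquareJob(i));
    return jobs;
  }

  // Runs every job on a pool with one thread per available core and calls
  // the listener once for each finished job.
  static void runAll(List<SquareJob> jobs, Consumer<SquareJob> jobDoneListener)
      throws InterruptedException {
    int nbCores = Runtime.getRuntime().availableProcessors();
    ExecutorService pool = Executors.newFixedThreadPool(nbCores);
    final Object listenerLock = new Object();
    for (SquareJob job : jobs)
      pool.execute(() -> {
        job.run();
        // Listener calls are serialized so the listener need not be thread-safe.
        synchronized (listenerLock) {
          jobDoneListener.accept(job);
        }
      });
    pool.shutdown();
    if (!pool.awaitTermination(1, TimeUnit.HOURS))
      throw new IllegalStateException("jobs did not finish in time");
  }

  public static void main(String[] args) throws InterruptedException {
    List<String> lines = new ArrayList<>();
    runAll(createJobList(8), job -> lines.add(job.input + " -> " + job.result));
    // Completion order depends on thread scheduling, so the line order may vary.
    for (String line : lines)
      System.out.println(line);
  }
}
```

Serializing the listener calls behind a lock keeps the listener simple, at the cost of making result storage sequential; this is usually acceptable when jobs take much longer to run than their results take to save.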

A jar of the client is available within RLPark: rltoys-client.jar

Parameter sweeps for reinforcement learning

See rlpark.plugin.rltoys.junit.experiments.reinforcementlearning.OnPolicySweepTest for on-policy evaluations.
See rlpark.plugin.rltoys.junit.experiments.reinforcementlearning.OffPolicyContinuousEvaluationSweepTest for off-policy evaluations.

Dependencies

zephyr.plugin.core.api, rlpark.plugin.rltoys

Documentation