Code Snippet: Off-Policy Actor-Critic (Off-PAC) in a 2D puddle world


Description: Off-Policy Actor-Critic (Off-PAC) learns a target policy off-policy from data generated by a behavior policy that selects actions uniformly at random. The state representation uses tile coding with a murmur2 hashing function, and the target policy is a Gibbs distribution over discrete actions.
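
To make the relationship between the two policies concrete, here is a minimal, self-contained sketch in plain Java. It is not the RLPark implementation: the class name GibbsPolicySketch, the methods gibbs and importanceRatio, and the feature values are all hypothetical and chosen only for illustration. It assumes the Gibbs target policy is defined as pi(a|s) proportional to exp(u . phi(s,a)) for a weight vector u and state-action features phi, and that the behavior policy b is uniform over the discrete actions, so the importance-sampling ratio is pi(a|s) / b(a|s) = pi(a|s) * numberOfActions.

```java
/**
 * Illustrative sketch only (hypothetical names, not the RLPark API):
 * a Gibbs (softmax) target policy over discrete actions and the
 * importance-sampling ratio against a uniform behavior policy.
 */
public class GibbsPolicySketch {

    /**
     * Returns pi(a|s) for every action a, where
     * pi(a|s) is proportional to exp(u . phi[a]).
     * phi[a] is the (assumed) feature vector of action a in the current state.
     */
    public static double[] gibbs(double[] u, double[][] phi) {
        double[] probs = new double[phi.length];
        double max = Double.NEGATIVE_INFINITY;
        for (int a = 0; a < phi.length; a++) {
            double preference = 0.0;
            for (int i = 0; i < u.length; i++) {
                preference += u[i] * phi[a][i];
            }
            probs[a] = preference;
            max = Math.max(max, preference);
        }
        // Subtract the largest preference before exponentiating to avoid overflow.
        double sum = 0.0;
        for (int a = 0; a < probs.length; a++) {
            probs[a] = Math.exp(probs[a] - max);
            sum += probs[a];
        }
        for (int a = 0; a < probs.length; a++) {
            probs[a] /= sum;
        }
        return probs;
    }

    /**
     * Importance-sampling ratio pi(a|s) / b(a|s) when the behavior
     * policy b picks each action with probability 1 / numberOfActions.
     */
    public static double importanceRatio(double[] pi, int action) {
        double behaviorProbability = 1.0 / pi.length;
        return pi[action] / behaviorProbability;
    }

    public static void main(String[] args) {
        // Hypothetical actor weights and per-action features.
        double[] u = {1.0, -1.0};
        double[][] phi = {{1.0, 0.0}, {0.0, 1.0}, {1.0, 1.0}};
        double[] pi = gibbs(u, phi);
        for (int a = 0; a < pi.length; a++) {
            System.out.println("action " + a
                    + " pi=" + pi[a]
                    + " rho=" + importanceRatio(pi, a));
        }
    }
}
```

Under these assumptions, an action that the target policy prefers more than the uniform behavior policy gets a ratio above 1, and a less preferred action gets a ratio below 1. With all weights at zero the two policies coincide and every ratio equals 1.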


Source code:

Reference:

Off-policy Actor-Critic. T. Degris, M. White, R. S. Sutton (2012). In Proceedings of the 29th International Conference on Machine Learning.

Running this demo:

  • From the command line:
    1. Download rlpark.jar
    2. Run the following command line:
      java -cp rlpark.jar rlpark.example.demos.learning.OffPACPuddleWorld
  • In Zephyr standalone application:
    1. Download Zephyr standalone application
    2. Install RLPark plug-ins in Zephyr
    3. Go to: Demos->Off-PAC in a Puddle World
  • In Eclipse, as a Java Application:
    1. Create a new Java Project or use an existing project
    2. Include rlpark.jar in the project classpath
    3. Run a Java Application target using rlpark.example.demos.learning.OffPACPuddleWorld as a main class
  • In Eclipse, as an Eclipse Application:
    1. Install Zephyr plug-ins and RLPark plug-ins in Eclipse
      or
      download RLPark source code and import RLPark projects (including the demo project rlpark.example.demos) into the workspace
    2. Set up an Eclipse Application target following the tutorial Using Zephyr plug-ins
    3. In the Eclipse Application target configuration:
      1. In the menu, go to: Run->Run Configurations...
      2. Select the Eclipse Application target
      3. In the Plug-ins tab, select the plug-ins rlpark.example.demos and rlpark.plugin.rltoysview to enable RLPark views
    4. Start Zephyr by running the Eclipse Application target
    5. In the Zephyr menu, go to: Demos->Off-PAC in a Puddle World
      or in the Arguments tab, add rlpark.example.demos.learning.OffPACPuddleWorld to the Program Arguments text field

Dependencies

zephyr.plugin.core.api, rlpark.plugin.rltoys

Documentation