Code Snippet: Off-Policy Actor-Critic (Off-PAC) in a 2D puddle world
Description: Off-Policy Actor-Critic (Off-PAC) learns off-policy from a behavior policy that selects actions uniformly at random. The representation uses tile coding with a murmur2 hashing function, and the target policy is a Gibbs distribution over discrete actions.
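The two policies named in the description can be illustrated with a minimal sketch. It is not taken from RLPark: the class name GibbsPolicySketch, its methods, and the four action preferences are hypothetical and chosen only for illustration. It computes a Gibbs (softmax) distribution from action preferences and the importance-sampling ratio pi(a|s) / b(a|s) that an off-policy learner would use when the behavior policy b is uniform.

```java
import java.util.Arrays;

// Illustrative sketch only; this class is NOT part of RLPark.
public class GibbsPolicySketch {

    /** Gibbs (softmax) distribution over discrete actions, given one preference per action. */
    public static double[] gibbs(double[] preferences) {
        // Subtract the largest preference before exponentiating for numerical stability.
        double max = Arrays.stream(preferences).max().getAsDouble();
        double[] probs = new double[preferences.length];
        double sum = 0.0;
        for (int a = 0; a < preferences.length; a++) {
            probs[a] = Math.exp(preferences[a] - max);
            sum += probs[a];
        }
        for (int a = 0; a < probs.length; a++) {
            probs[a] /= sum;
        }
        return probs;
    }

    /** Importance-sampling ratio pi(a|s) / b(a|s), assuming a uniform behavior policy b. */
    public static double importanceRatio(double[] targetProbs, int action) {
        double behaviorProb = 1.0 / targetProbs.length;
        return targetProbs[action] / behaviorProb;
    }

    public static void main(String[] args) {
        // Hypothetical action preferences; in an actor-critic method they would
        // come from the actor's weights applied to the tile-coded features.
        double[] preferences = {0.0, 0.0, 1.0, -1.0};
        double[] pi = gibbs(preferences);
        System.out.println("Target policy: " + Arrays.toString(pi));
        System.out.println("Ratio for action 2: " + importanceRatio(pi, 2));
    }
}
```

When all preferences are equal, the Gibbs distribution coincides with the uniform behavior policy and every ratio equals 1. As the actor's preferences diverge, actions the target policy favors receive ratios above 1 and the others receive ratios below 1.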
Source code:
- Doxygen: OffPACPuddleWorld.java
- GitHub: OffPACPuddleWorld.java
Reference:
Off-Policy Actor-Critic. T. Degris, M. White, R. S. Sutton (2012). In Proceedings of the 29th International Conference on Machine Learning.
Running this demo:
- From the command line:
- Download rlpark.jar
- Run the following command line:
java -cp rlpark.jar rlpark.example.demos.learning.OffPACPuddleWorld
- In Zephyr standalone application:
- Download Zephyr standalone application
- Install RLPark plug-ins in Zephyr
- Go to:
- In Eclipse, as a Java Application:
- Create a new or use an existing project
- Include rlpark.jar in the project classpath
- Run a Java Application target using rlpark.example.demos.learning.OffPACPuddleWorld as the main class
- In Eclipse, as an Eclipse Application:
- Install Zephyr plug-ins and RLPark plug-ins in Eclipse, or download RLPark source code and import RLPark projects (including the demo project rlpark.example.demos) into the workspace
- Set up an Eclipse Application target following the tutorial Using Zephyr plug-ins
- In the Eclipse Application target configuration:
- In the menu, go to:
- Select the Eclipse Application target
- In the tab, select the plug-ins rlpark.example.demos and rlpark.plugin.rltoysview to enable RLPark views
- Start Zephyr by running the Eclipse Application target
- In the Zephyr menu, go to: or, in the tab, add rlpark.example.demos.learning.OffPACPuddleWorld to the text field
Dependencies
zephyr.plugin.core.api, rlpark.plugin.rltoys