16-311 HW4 2019

Description

In this homework, you are tasked with designing a neural network to solve the self-balancing robot (SBR) task. You will first train your network to mimic the behavior of a PID controller using imitation learning. Once imitation learning gives decent performance, you will use policy improvement (the REINFORCE reinforcement learning algorithm) to train your network further. This homework is intended to give you an opportunity to apply the ML/RL algorithms discussed in lecture and to gain experience with neural network engineering and programming in Python.
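For readers unfamiliar with PID control, the following is a minimal sketch of a discrete-time PID controller, the kind of expert your network will imitate. The class name, gains, and time step are illustrative assumptions; they are not the controller or the values used in the starter code.

```python
class PIDController:
    """Illustrative discrete-time PID controller (hypothetical, not from the starter code)."""

    def __init__(self, kp, ki, kd, dt):
        self.kp = kp          # proportional gain
        self.ki = ki          # integral gain
        self.kd = kd          # derivative gain
        self.dt = dt          # time between updates, in seconds
        self.integral = 0.0   # running integral of the error
        self.prev_error = None

    def update(self, error):
        """Return the control output for the current error."""
        # Accumulate the integral term with a simple rectangle rule.
        self.integral += error * self.dt
        # No derivative is available on the first call, so use zero.
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative
```

In an imitation learning setup, outputs like these would serve as the target actions that the network learns to reproduce from the robot's state.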

Any questions regarding the homework should be directed to the homework TA or posted to Piazza. As questions arise, clarifications will be added to this handout and the Lab 2 Clarifications Piazza post.

Note: While there are no restrictions on what functions you use in Python/PyTorch, you are not required to consult anything outside this document to complete the assignment.

You can view the table of contents by mousing over the lines to the left of this document.

Setup

You should install the software below on your personal computer. Using Andrew Linux is not recommended, since the homework has graphical components that may cause issues over ssh. Linux is recommended (the homework was created in Linux), but Windows has also been tested, and Mac should work similarly to Linux. If you run into issues, please post to Piazza, especially if you are a Windows or Mac user.

Python Installation

For this homework, you will need to have Python installed, as well as two packages.
You can download Python at the link below: MAKE SURE TO INSTALL PYTHON VERSION 3.5+ (3.7 recommended)
Python 2 will not necessarily work and could give cryptic bugs/errors, so please use Python 3.5 or later. (If you are on a Mac, there may already be a version of Python installed, but it is probably an older version and is therefore not usable.)

When installing, make sure you add Python to the PATH if prompted.
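If you are unsure which interpreter your terminal is picking up, a short script like the one below can confirm it. This is only a sketch: the helper name version_ok is made up for illustration, and the minimum of 3.5 is taken from the requirement stated above.

```python
import sys

# Minimum version stated in this handout.
MIN_VERSION = (3, 5)


def version_ok(version_info=sys.version_info, minimum=MIN_VERSION):
    """Return True if the (major, minor) version meets the minimum."""
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    found = ".".join(str(part) for part in sys.version_info[:3])
    if version_ok():
        print("Python " + found + " is new enough.")
    else:
        print("Python " + found + " is too old; install 3.5 or later.")
```

Run it with the same command you plan to use for the homework (for example python3.7), since several Python versions can be installed side by side.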

Package installation

We recommend that you use pip to install the necessary packages. More info can be found here: https://packaging.python.org/tutorials/installing-packages/. If you installed Python from the link above, you likely already have pip installed.

If you do not have pip installed, this site provides instructions for Linux users: https://packaging.python.org/guides/installing-using-linux-tools/ and this article provides instructions for Windows users (including how to add Python to your PATH): https://github.com/BurntSushi/nfldb/wiki/Python-&-pip-Windows-installation.

To install the packages, open a terminal window. 

If you are using Mac or Linux, type the following command (replace 3.7 in python3.7 with your Python version):
python3.7 -m pip install gym torch torchvision

If you are on Windows, run the following commands (for Python 3.7):
pip3 install https://download.pytorch.org/whl/cu90/torch-1.0.1-cp37-cp37m-win_amd64.whl
pip3 install gym torchvision

For additional info, go to this website: https://pytorch.org/

These commands install the two main packages, PyTorch and Gym. PyTorch is a machine learning library used to create neural networks. Gym is a library that provides environments for creating and testing reinforcement learning algorithms.
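To check that the packages are visible to your interpreter without importing them, something like the following sketch can help. It uses only the standard library; the function name missing_packages is hypothetical, and this is not the contents of test_install.py.

```python
import importlib.util

# Package names taken from the pip commands above.
REQUIRED = ("gym", "torch", "torchvision")


def missing_packages(names=REQUIRED):
    """Return the names that the current interpreter cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages: " + ", ".join(missing))
    else:
        print("All required packages were found.")
```

If a package is reported missing even though pip said it installed, pip and your python command are probably pointing at different Python installations.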

Download Homework/Starter Code

You can download the starter code from the link below:
The starter code contains 5 files. You will edit 3 of them; you will not edit pole.py or test_install.py. Run python test_install.py to check that your packages installed correctly. You can check that the self-balancing robot (SBR) environment works by running pole.py in the same manner.

At this point, everything should be set up. Note that you will not be able to run any of the starter code yet.

File Documentation

pole.py

This file contains the code that runs the SBR simulator. The simulator is accessed via the environment. The file contains one class, called CartPole. To use the simulator in your code, import it into another file with the line from pole import CartPole. For a short example of how to use the simulator, scroll down to the bottom of pole.py.

pole.py has 2 methods that you will need to use.