GF-01 · SCHOOL · 2025

Q-Learner

REINFORCEMENT LEARNING · PYTHON
FIG. 01 · REAL SCREENSHOT PENDING. THE PATTERN STANDS IN.

A tabular Q-learning agent dropped into a 2D gridworld with nothing but a reward signal and an ε-greedy disposition. It wanders, it bumps into walls, and — after enough episodes — it stops embarrassing itself.
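
For readers who want the mechanics, here is a minimal sketch of tabular Q-learning with an ε-greedy policy. It is not the project's source. The grid size, start and goal cells, reward values, hyperparameters, and the names `step`, `train_q_table`, and `greedy_path` are all assumptions made for illustration.

```python
import numpy as np

# Assumed layout: start at the top-left cell, goal at the bottom-right cell.
MOVES = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def step(pos, action, size):
    """Apply a move; walking into a wall leaves the agent where it was."""
    row = min(max(pos[0] + MOVES[action][0], 0), size - 1)
    col = min(max(pos[1] + MOVES[action][1], 0), size - 1)
    return (row, col)


def train_q_table(size=4, episodes=2000, alpha=0.5, gamma=0.95,
                  epsilon=0.1, max_steps=200, seed=0):
    """Learn a Q-table of shape (size, size, 4) from reward alone."""
    rng = np.random.default_rng(seed)
    goal = (size - 1, size - 1)
    q = np.zeros((size, size, len(MOVES)))
    for _ in range(episodes):
        pos = (0, 0)
        for _ in range(max_steps):
            # ε-greedy: explore with probability epsilon, otherwise exploit.
            if rng.random() < epsilon:
                action = int(rng.integers(len(MOVES)))
            else:
                action = int(np.argmax(q[pos]))
            nxt = step(pos, action, size)
            done = nxt == goal
            # Assumed rewards: +1 at the goal, a small cost per step otherwise.
            reward = 1.0 if done else -0.01
            # Q-learning target: bootstrap from the best next action
            # unless the episode has ended.
            target = reward if done else reward + gamma * np.max(q[nxt])
            q[pos][action] += alpha * (target - q[pos][action])
            pos = nxt
            if done:
                break
    return q


def greedy_path(q, max_steps=50):
    """Follow the highest-valued action from the start cell."""
    size = q.shape[0]
    goal = (size - 1, size - 1)
    pos, path = (0, 0), [(0, 0)]
    while pos != goal and len(path) <= max_steps:
        pos = step(pos, int(np.argmax(q[pos])), size)
        path.append(pos)
    return path
```

Calling `greedy_path(train_q_table())` returns the cells the trained agent visits, and `train_q_table().max(axis=2)` gives the per-cell values that a heatmap would plot.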

The interesting part was reward shaping: small tweaks to the signal changed the personality of the policy more than any hyperparameter. Convergence plots and a value heatmap document the journey from random walk to competence.
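
The page does not say which shaping scheme was used, so the sketch below swaps in one common option, potential-based shaping, purely as an illustration. The Manhattan-distance potential, the discount value, and the names `manhattan_potential` and `shaped_reward` are assumptions.

```python
def manhattan_potential(state, goal):
    """Assumed potential: closer to the goal means a higher (less negative) value."""
    return -(abs(state[0] - goal[0]) + abs(state[1] - goal[1]))


def shaped_reward(reward, state, next_state, goal, gamma=0.95):
    """Add the potential-based bonus gamma * phi(next_state) - phi(state)."""
    bonus = gamma * manhattan_potential(next_state, goal) - manhattan_potential(state, goal)
    return reward + bonus


# A step toward the goal earns a larger shaped reward than a step away from it.
goal = (3, 3)
toward = shaped_reward(-0.01, (1, 1), (1, 2), goal)
away = shaped_reward(-0.01, (1, 1), (0, 1), goal)
```

Shaping of this form is known to leave the optimal policy unchanged, while other hand-tuned bonuses can shift it, which is one way a small tweak to the signal ends up changing how the policy behaves.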

INCLUDING
  • PYTHON + NUMPY
  • ε-GREEDY POLICY
  • REWARD SHAPING
  • VALUE HEATMAP
SOURCE — AVAILABLE UPON REQUEST.
COURSEWORK FALLS UNDER THE HONOR CODE.
© 2026 GRAY FORRESTER