I have implemented two agent architectures, and a third that combines them, using the Unity editor and the C# language.
- Reward Machine Agent for Patrolling (https://github.com/GavinRens/Reward-Machine-Agent—Patrolling) and Reward Machine Agent for Treasure Hunting (https://github.com/GavinRens/Reward-Machine-Agent—Treasure-Hunting). The algorithm in these frameworks is based on my work with reward machines. Instead of rewarding an agent for a given action in a given state, a reward machine lets one specify rewards for sequences of observations, where every observation is mapped from an action-state pair. For instance, if you want your agent to kick the ball twice in a row, you give it a reward only after observing that it has kicked the ball twice in a row. A regular reward function would have to give the same reward for the first and the second kick. I implemented a Monte Carlo Tree Search (MCTS) planner that plans over the given reward machine. In the patrolling environment, the observation mapping function is not completely deterministic, whereas the treasure-hunting environment is fully observable.
- EatPrayDanceSleep_HMB_Agent (https://github.com/GavinRens/BDI-Agent—EatPrayDanceSleep). The algorithm in this framework is based on my work with the Belief-Desire-Intention (BDI) architecture. The agent has a set of goals and periodically selects a subset of them to pursue for a while. The currently selected goals are called intentions. In this framework, an agent can pursue some or all of its intentions simultaneously. The agent designer can specify which goals can and cannot be pursued simultaneously, and can set the 'importance' of every goal. The designer can also define the general rewards the agent receives (apart from those for goals) and the cost of each action. Taken together, these specifications produce emergent behavior in which the agent keeps selecting different intentions to pursue. I implemented a Monte Carlo Tree Search (MCTS) planner that plans over the current set of intentions, weighted by their importance.
- Hybrid-Agent—Work-n-Home (https://github.com/GavinRens/Hybrid-Agent—Work-n-Home) is an agent architecture that combines the BDI and reward-machine architectures. It is controlled by two MCTS planners, one adapted to each architecture.
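
To make the reward-machine idea from the first item concrete, here is a minimal sketch of the "kick the ball twice in a row" example. It is written in Python for brevity and is not the C# code from the repositories. The names `RewardMachine`, `observe`, `run_episode`, the node labels, and the `near_ball` state flag are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


def observe(action, state):
    """Hypothetical observation function: maps an action-state pair to an observation."""
    if action == "kick" and state.get("near_ball", False):
        return "kick"
    return "other"


@dataclass
class RewardMachine:
    """Finite-state machine whose transitions carry rewards (illustrative only)."""
    start: str
    transitions: dict  # (node, observation) -> (next_node, reward)
    node: str = field(init=False)

    def __post_init__(self):
        self.node = self.start

    def step(self, observation):
        # Assumption: any unlisted (node, observation) pair resets the machine
        # to its start node and yields no reward.
        next_node, reward = self.transitions.get(
            (self.node, observation), (self.start, 0.0)
        )
        self.node = next_node
        return reward


# Reward is given only on the second consecutive kick.
KICK_TWICE = {
    ("idle", "kick"): ("kicked_once", 0.0),
    ("kicked_once", "kick"): ("idle", 1.0),
}


def run_episode(actions, state):
    """Feed a sequence of actions through the machine and collect the rewards."""
    machine = RewardMachine(start="idle", transitions=KICK_TWICE)
    return [machine.step(observe(action, state)) for action in actions]
```

Calling `run_episode(["kick", "move", "kick", "kick"], {"near_ball": True})` rewards only the final kick, because the intervening `move` resets the machine. A state-action reward function could not tell the third kick from the fourth.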
Details of these architectures can be found in the README files of the respective repositories on my GitHub site: https://github.com/GavinRens.
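
As a rough illustration of the intention-selection step in the BDI agent, the sketch below picks the set of mutually compatible goals with the highest total importance. This is a simplifying assumption, not the rule used in the repository: the actual framework also takes general rewards and action costs into account, and is written in C#. The goal names are borrowed from the project title, while the importance values and incompatibility pairs are invented for the example.

```python
from itertools import combinations

# Hypothetical designer-specified importance of each goal.
IMPORTANCE = {"eat": 9, "pray": 4, "dance": 6, "sleep": 8}

# Hypothetical pairs of goals that cannot be pursued simultaneously.
INCOMPATIBLE = {
    frozenset({"sleep", "dance"}),
    frozenset({"sleep", "eat"}),
    frozenset({"pray", "dance"}),
}


def compatible(subset, incompatible):
    """True if no pair of goals in the subset is marked as incompatible."""
    return all(
        frozenset(pair) not in incompatible for pair in combinations(subset, 2)
    )


def select_intentions(importance, incompatible):
    """Brute-force search for the compatible goal subset with maximal importance."""
    goals = sorted(importance)
    best, best_score = (), 0
    for size in range(1, len(goals) + 1):
        for subset in combinations(goals, size):
            if not compatible(subset, incompatible):
                continue
            score = sum(importance[goal] for goal in subset)
            if score > best_score:
                best, best_score = subset, score
    return set(best)
```

With the values above, `select_intentions(IMPORTANCE, INCOMPATIBLE)` returns the intentions `eat` and `dance`. Brute-force enumeration is only practical for a handful of goals, which is why it is used here purely as an illustration.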