A system of two agents under centralized control that move together with the player to corner evaders.

The planning model combines path-finding with graph planning once the agents get close to the target. A learning model is used to learn the best order in which to finish the sub-tasks. The system also takes data from real players, from which I am trying to extract a transition model that lets the robots decide whether to “cooperate” or “defect” with the player.
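
One simple way to extract a transition model from recorded play is to count observed transitions and normalize them into probabilities. The sketch below is only an illustration under assumptions that are not part of the project description: the state labels (`far`, `near`, `cornered`), the log format of `(state, action, next_state)` tuples, and the function name `estimate_transition_model` are all hypothetical, and the actual system may use a different representation or estimator.

```python
from collections import Counter, defaultdict


def estimate_transition_model(log):
    """Estimate P(next_state | state, action) from logged transitions.

    `log` is an iterable of (state, action, next_state) tuples.
    This is a plain maximum-likelihood (frequency count) estimate.
    """
    counts = defaultdict(Counter)
    for state, action, next_state in log:
        counts[(state, action)][next_state] += 1

    model = {}
    for key, next_counts in counts.items():
        total = sum(next_counts.values())
        model[key] = {s: n / total for s, n in next_counts.items()}
    return model


# Hypothetical log of player-driven transitions (made-up example data).
log = [
    ("far", "cooperate", "near"),
    ("far", "cooperate", "near"),
    ("far", "cooperate", "far"),
    ("far", "defect", "far"),
    ("near", "cooperate", "cornered"),
]

model = estimate_transition_model(log)
```

With a model of this shape, a robot could compare `model[(state, "cooperate")]` against `model[(state, "defect")]` to judge which choice is more likely to lead toward a cornered evader, although how that comparison is scored is left open here.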