
Learning Physically-Instantiated Game Play Through Visual Observation

Andrei Barbu, Siddharth Narayanaswamy, and Jeffrey Mark Siskind


School of Electrical and Computer Engineering
Introduction

- goal is to emulate a 2-year-old child

Task

- an integrated robotics system for learning to play board games
- learn a full set of rules; learn to play, not to play well
- learn the initial board, the legal-move generator, and the outcome predicate
- two robots play a board game, while a third watches and takes over
- fully automatic, with no human intervention
- no communication between the robots

Results

- reliable, robust operation was achieved over 62 games and approximately 2000 pick-up and put-down operations, with fewer than 20 interventions
- robotic manipulation is based on dead reckoning, owing to the non-linear relationship between servo control and angular position
- error detection by interleaving manipulation with visual reconstruction of board states
- learned Hexapawn rule set (interpreted in the sketch after the listing):

initial_board([[x,x,x],[none,none,none],[o,o,o]], player_x).
legal_move(A,B,C) :- row(D), col(E), owns(A,F), empty(G),
    forward(A,H,D), at(H,E,B,F,I), at(H,E,C,G,J),
    at(D,E,B,G,K), at(D,E,C,F,L), frame_obj(I,K,J,L,B,C).
legal_move(A,B,C) :- row(D), col(E), opponent(A,F),
    owns(A,G), empty(H), forward(A,I,D), owns(F,J),
    sideways(E,K), at(D,K,C,G,L), at(I,E,B,G,M),
    at(I,E,C,H,N), at(D,K,B,J,O), frame_obj(L,N,O,M,C,B).
outcome(A,B,C) :- row(D), opponent(A,E), forward(E,D,F),
    forward(E,F,G), owns_outcome(E,C), owns_piece(C,H),
    at(G,I,B,H,J).
outcome(A,B,C) :- opponent(A,D), has_no_move(A,B),
    owns_outcome(D,C).
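
These clauses can be read as follows: the first legal_move clause describes a one-square forward move into an empty square; the second a forward diagonal capture of an opponent's piece; the first outcome clause a win by queening (a piece reaching the far row); and the second a win when the player to move has no legal move. As a purely illustrative sketch (not part of the poster), and assuming the background predicates referenced above (at/5, frame_obj/6, and so on) are defined, the learned rules could be queried from standard Prolog as follows; possible_moves/3 and game_over/3 are hypothetical helper names:

% Illustrative only.  Argument order follows a reading of the learned
% clauses: legal_move(Player, BoardBefore, BoardAfter) and
% outcome(Player, Board, Outcome).

% All boards reachable in one legal move by Player from Board.
possible_moves(Player, Board, Successors) :-
    findall(After, legal_move(Player, Board, After), Successors).

% Succeeds when the game has ended; Outcome names the winner.
game_over(Player, Board, Outcome) :-
    outcome(Player, Board, Outcome).

For example, possible_moves(player_x, [[x,x,x],[none,none,none],[o,o,o]], Moves) would enumerate player_x's opening moves.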

Why games, and why learning?


- games are an idealized version of the real world
- most AI cannot deal with real-world complexity
- children learn from observation
- not only when rules are unavailable, cf. Social Learning Theory
- have you read the rules for the board games you've played?


Experimental setup
[Figure: two robots, the protagonist and the antagonist, play the game; a third robot, the wannabe, watches and then takes over as a player.]

Games
- off-the-shelf game hardware, but judiciously chosen to simplify robotic manipulation
- depressions in the board provide for easy piece placement
- large, easy-to-grab pieces
- Tic-Tac-Toe with standard rules learned
- Hexapawn: three pawns on opposing sides; win by queening, capture, or forcing an inability to move
- learned 5 variants of Hexapawn: regular, forward diagonal moves, forward and backward diagonal moves, vertical backward moves, vertical backward and sideways moves


Robots
- custom robots with a 4-DOF arm, two fingers, and two eyes
- eyes on a 1-DOF pendulum arm that rotates around the game
- each eye can pan/tilt independently
- mounted in a custom housing
- parts primarily from Lynxmotion, enhanced with custom parts to provide greater support for the arm and eyes and increased efficacy of operation
Computer Vision
- reconstruct the game state from visual information
- must detect the board itself; this is a calibration step done once at startup, where 9 ellipses arranged in a grid are found
- the OpenCV ellipse finder is used with multiple thresholds and voting in order to detect Xs, Os, and empty board positions (see the sketch below)
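
To make the hand-off from vision to the rule representation concrete, here is a purely illustrative sketch (the poster's detector is built on OpenCV; this is not its implementation) of two steps written in Prolog: majority voting over the labels produced at different thresholds for one square, and assembling nine per-square labels into the board term used by the learned rules. The names vote_label/2, run_counts/2, count_run/5, and reconstruct_board/2 are hypothetical:

% Illustrative only: majority vote over the labels (x, o, or none)
% obtained at different thresholds for a single board square.
vote_label(Labels, Label) :-
    msort(Labels, Sorted),                 % sort, keeping duplicates
    run_counts(Sorted, Counts),            % e.g. [none-1, x-4]
    sort(2, @>=, Counts, [Label-_ | _]).   % most frequent first (sort/4 as in SWI-Prolog)

% Count consecutive runs in a sorted list: [o,x,x] -> [o-1, x-2].
run_counts([], []).
run_counts([X|Xs], [X-N|Rest]) :-
    count_run(X, Xs, 1, N, Tail),
    run_counts(Tail, Rest).

count_run(X, [X|Xs], N0, N, Tail) :- !, N1 is N0 + 1, count_run(X, Xs, N1, N, Tail).
count_run(_, Tail, N, N, Tail).

% Nine per-square labels in row-major order become the board term,
% e.g. [[x,x,x],[none,none,none],[o,o,o]].
reconstruct_board([A,B,C, D,E,F, G,H,I], [[A,B,C],[D,E,F],[G,H,I]]).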

- learned similar rule sets for all 6 games
Lincoln Logs & language



- assembly task using assembly toys, e.g. Lincoln Logs
- novel computer-vision system to recognize block assemblies from grammars by extracting features, fitting them to known grammars, and searching for implied necessary or possible blocks
- novel language component to describe assemblies in terms of walls and windows, and reconstruct the same structure out of different assembly toys
- more advanced robotic system, with custom grippers, farther reach, tactile sensors, and a palm-mounted camera
- more robust robotic manipulation using visual servoing

Rules
- Progol, an inductive-logic-programming (ILP) system, is used to learn the initial board, the legal-move generator, and the outcome predicate
- rules are represented as logical formulae, i.e. Horn clauses
- learned rules are then directly executed with Prolog
- the system is given background knowledge about the world, such as: a board exists, pieces can be on the board, players own pieces, the concepts of linearity, forwards, and sideways, and the frame axiom (a sketch of what such facts might look like follows this list)
- the background knowledge is of the type any child would have
- search the space of possible initial boards, legal-move generators, and outcome predicates for 3×3 games, given the evidence of n games, and find the most compressed rule set which best explains the observed games
- learn the initial board first, then the legal-move generator, and finally the outcome predicate
- use the previously acquired knowledge to learn the next item
- can learn a full game description in a modest number of games, typically 3–6
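
For concreteness, here is a minimal, purely illustrative sketch of what such background facts might look like for a 3×3 board, written against the piece and player names appearing in the learned rule set above; the poster does not list the actual definitions, and at/5, frame_obj/6, owns_outcome/2, owns_piece/2, and has_no_move/2 are omitted here:

% Illustrative only: plausible background facts for a 3x3 board.
row(r1). row(r2). row(r3).
col(c1). col(c2). col(c3).

opponent(player_x, player_o).
opponent(player_o, player_x).

owns(player_x, x).
owns(player_o, o).
empty(none).

% forward(Player, FromRow, ToRow): one row toward the opposing side,
% consistent with player_x starting on the first row of the learned
% initial board.
forward(player_x, r1, r2). forward(player_x, r2, r3).
forward(player_o, r3, r2). forward(player_o, r2, r1).

% sideways(Col, AdjacentCol): horizontally adjacent columns.
sideways(c1, c2). sideways(c2, c1).
sideways(c2, c3). sideways(c3, c2).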


Future work
- complete the Lincoln Log task and move on to other assembly toys
- expand upon the current game learning and scale up to games of higher complexity, e.g. checkers
- learn the mapping from world state to game state
- learn Lincoln Log and other assembly-toy grammars
- integrate more sensors, e.g. a laser pointer and an ultrasonic range finder
- stochastic ILP for fault tolerance
- a custom ILP system with better heuristics for learning games
