The next time you catch your robot watching sitcoms, don’t assume it’s slacking off. It may be hard at work. TV shows and video clips can help artificially intelligent systems learn about and anticipate human interactions, according to researchers at MIT’s Computer Science and Artificial Intelligence Laboratory. They created an algorithm that analyzes video, then uses what it learns to predict how humans will behave.
Six hundred hours of clips from shows like The Office and The Big Bang Theory let the AI learn to identify high-fives, handshakes, hugs, and kisses. Then it learned what the moments leading up to those interactions looked like.
After the AI devoured the videos to train itself, the researchers fed it a single frame from a video it had not seen and tasked it with predicting what would happen next. It guessed correctly about 43 per cent of the time.
Humans nail the answer 71 per cent of the time, but the researchers still think the AI did a great job, given its rudimentary education. “Even a toddler has much more life experience than this,” says Carl Vondrick, the project’s lead author. “I’m interested to see how much the algorithms improve if we train it on years of videos.”
The AI doesn’t understand what’s happening in the scene in the same way a human does. It analyzes the composition and movement of pixels to identify patterns. “It drew its own conclusions in terms of correlations between the visuals and the eventual action,” says Vondrick.
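To get a feel for what "finding correlations between pixels and eventual actions" means, here is a deliberately toy sketch, not the MIT system: it predicts the upcoming action for a new frame by checking which labeled training frame its pixel pattern most resembles. The frame data, labels, and the nearest-correlation approach are all illustrative assumptions; the real system used deep neural networks trained on far richer features.

```python
import numpy as np

# Toy illustration only: frames are flattened pixel arrays, and each
# training frame is labeled with the action that followed it in a clip.
ACTIONS = ["high-five", "handshake", "hug", "kiss"]

def predict_action(frame, train_frames, train_labels):
    """Return the label of the training frame most correlated with `frame`."""
    f = frame.ravel().astype(float)
    f -= f.mean()  # remove brightness offset so only the pattern matters
    best_label, best_score = None, -np.inf
    for tf, label in zip(train_frames, train_labels):
        t = tf.ravel().astype(float)
        t -= t.mean()
        # normalized correlation between the two pixel patterns
        score = np.dot(f, t) / (np.linalg.norm(f) * np.linalg.norm(t) + 1e-9)
        if score > best_score:
            best_score, best_label = score, label
    return best_label
```

A frame that closely resembles one of the "hug" training frames would be assigned the "hug" label, with no notion of people, limbs, or intent anywhere in the code, which is roughly the sense in which such a system "draws its own conclusions" from pixels alone.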
Vondrick spent two years on the project. He says the efficient, self-reliant training could come in handy for more important things than guessing sitcom hugs.
For example, an improved version of the system could have a future in hospitals and other places where it could prevent injuries. Smart cameras, he says, could analyze video feeds and alert emergency responders if something catastrophic were about to happen. Embed these systems in robots, and they could even intervene in such situations themselves.