Monday, October 9, 2017

#CornerCameras: An AI for your blind-spot

Compatible with smartphone cameras, MIT CSAIL system for seeing around corners
could help with self-driving cars and search-and-rescue

Earlier this year, researchers at Heriot-Watt University and the University of Edinburgh recognized that there is a way to tease out information about an object even from apparently random scattered light. Their method, published in Nature Photonics, relies on laser range-finding technology, which measures the distance to an object based on the time it takes a pulse of light to travel to the object, scatter, and travel back to a detector.

And now further research has shown significant forward progress. Light lets us see the things that surround us, but what if we really could also use it to see things hidden around corners?

This may sound like science fiction, but that’s the idea behind a new algorithm out of MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) - and its discovery has implications for everything from emergency response to self-driving cars.

The CSAIL team’s imaging system, which can work with video from smartphone cameras, uses information about light reflections to detect objects or people in a hidden scene and measure their speed and trajectory - all in real-time. (It doesn't see any identifying details about individuals - just the fact that they are moving objects.)

Researchers say that the ability to see around obstructions would be useful for many tasks, from firefighters finding people in burning buildings to drivers detecting pedestrians in their blind spots.

To explain how it works, imagine that you’re walking down an L-shaped hallway and have a wall between you and some objects around the corner. Those objects reflect a small amount of light on the ground in your line of sight, creating a fuzzy shadow that is referred to as the “penumbra.”

Using video of the penumbra, the system - which the team dubbed “CornerCameras” - can stitch together a series of one-dimensional images that reveal information about the objects around the corner.

“Even though those objects aren’t actually visible to the camera, we can look at how their movements affect the penumbra to determine where they are and where they’re going,” says PhD graduate Katherine Bouman, who was lead author on a new paper about the system. “In this way, we show that walls and other obstructions with edges can be exploited as naturally-occurring ‘cameras’ that reveal the hidden scenes beyond them.”

Bouman co-wrote the paper with MIT professors Bill Freeman, Antonio Torralba, Greg Wornell and Fredo Durand, master’s student Vickie Ye and PhD student Adam Yedidia. She will present the work later this month at the International Conference on Computer Vision (ICCV) in Venice.

How it works

Most approaches for seeing around obstacles involve special lasers. Specifically, researchers shine lasers on specific points that are visible to both the observable and hidden scene, and then measure how long it takes for the light to return.

However, these so-called “time-of-flight cameras” are expensive and can easily get thrown off by ambient light, especially outdoors.
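The time-of-flight idea described above comes down to simple arithmetic: the pulse travels out and back, so the one-way distance is half the round-trip time multiplied by the speed of light. The following is a minimal sketch of that relationship only; the function name and the 20-nanosecond example are illustrative assumptions, not part of any system mentioned in the article.

```python
# Speed of light in vacuum, in metres per second (exact by definition).
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def distance_from_round_trip(round_trip_seconds: float) -> float:
    """Return the one-way distance for a light pulse that goes out and back.

    Hypothetical helper for illustration; real time-of-flight hardware also
    has to deal with scattering, noise, and detector timing resolution.
    """
    if round_trip_seconds < 0:
        raise ValueError("round-trip time cannot be negative")
    # Halve the product because the pulse covers the distance twice.
    return SPEED_OF_LIGHT_M_PER_S * round_trip_seconds / 2.0


# A 20-nanosecond round trip corresponds to a distance of about 3 metres.
print(f"{distance_from_round_trip(20e-9):.3f} m")
```

Because light covers roughly 30 centimetres per nanosecond, such systems need very precise timing, which is one reason the hardware tends to be expensive.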

In contrast, the CSAIL team’s technique doesn’t require actively projecting light into the space, and works in a wider range of indoor and outdoor environments and with off-the-shelf consumer cameras.

From viewing video of the penumbra, CornerCameras generates one-dimensional images of the hidden scene. A single image isn’t particularly useful, since it contains a fair amount of “noisy” data. But by observing the scene over several seconds and stitching together dozens of distinct images, the system can distinguish distinct objects in motion and determine their speed and trajectory.

“The notion to even try to achieve this is innovative in and of itself, but getting it to work in practice shows both creativity and adeptness,” says professor Marc Christensen, who serves as Dean of the Lyle School of Engineering at Southern Methodist University and was not involved in the research. “This work is a significant step in the broader attempt to develop revolutionary imaging capabilities that are not limited to line-of-sight observation.”

The team was surprised to find that CornerCameras worked in a range of challenging situations, including weather conditions like rain.

“Given that the rain was literally changing the color of the ground, I figured that there was no way we’d be able to see subtle differences in light on the order of a tenth of a percent,” says Bouman. “But because the system integrates so much information across dozens of images, the effect of the raindrops averages out, and so you can see the movement of the objects even in the middle of all that activity.”
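The averaging effect Bouman describes can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not the team's algorithm: it models a single brightness measurement with Gaussian noise ten times larger than a 0.1 percent signal, and shows how the estimate tends to sharpen as more samples (pixels and frames) are averaged. The function names, noise level, and sample counts are all hypothetical.

```python
import random
import statistics


def mean_brightness(true_level, noise_std, n_samples, rng):
    """Average n_samples noisy readings of a patch with the given true level."""
    return statistics.fmean(
        rng.gauss(true_level, noise_std) for _ in range(n_samples)
    )


def estimate_contrast(n_samples, rng, background=1.0, contrast=0.001,
                      noise_std=0.01):
    """Estimate the fractional brightness change caused by a hidden object.

    Assumed model: the object raises the ground brightness by `contrast`
    (0.1 percent here), while each reading carries 1 percent Gaussian noise.
    """
    without_object = mean_brightness(background, noise_std, n_samples, rng)
    with_object = mean_brightness(background * (1.0 + contrast), noise_std,
                                  n_samples, rng)
    return with_object / without_object - 1.0


rng = random.Random(0)  # fixed seed so repeated runs give the same numbers
for n in (10, 1_000, 100_000):
    # With few samples the estimate is dominated by noise; with many it
    # tends toward the true contrast of 0.001.
    print(f"{n:>7} samples: estimated contrast {estimate_contrast(n, rng):+.5f}")
```

The noise in an average shrinks roughly with the square root of the number of samples, which is why a signal far below the noise of any single reading can still be measured.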

The system still has some limitations. For obvious reasons, it doesn’t work if there’s no light in the scene, and can have issues if there’s low light in the hidden scene itself. It also can get tripped up if light conditions change, like if the scene is outdoors and clouds are constantly moving across the sun. With smartphone-quality cameras the signal also gets weaker as you get farther away from the corner.

The researchers plan to address some of these challenges in future papers, and will also try to get it to work while in motion. The team will soon be testing it on a wheelchair, with the goal of eventually adapting it for cars and other vehicles.

“If a little kid darts into the street, a driver might not be able to react in time,” says Bouman. “While we’re not there yet, a technology like this could one day be used to give drivers a few seconds of warning time and help in a lot of life-or-death situations.”

The conclusion of the paper states:
We show how to turn corners into cameras, exploiting a common, but overlooked, visual signal. The vertical edge of a corner’s wall selectively blocks light to let the ground nearby display an angular integral of light from around the corner. The resulting penumbras from people and objects are invisible to the eye – typical contrasts are 0.1% above background – but are easy to measure using consumer-grade cameras. We produce 1-D videos of activity around the corner, measured indoors, outdoors, in both sunlight and shade, from brick, tile, wood, and asphalt floors. The resulting 1-D videos reveal the number of people moving around the corner, their angular sizes and speeds, and a temporal summary of activity. Open doorways, with two vertical edges, offer stereo views inside a room, viewable even away from the doorway. Since nearly every corner now offers a 1-D view around the corner, this opens potential applications for automotive pedestrian safety, search and rescue, and public safety. This ever-present, but previously unnoticed, 0.1% signal may invite other novel camera measurement methods.
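The phrase “angular integral” in the conclusion suggests why a 1-D image can be recovered at all. In an idealized, noise-free picture, a ground point at a given angle from the wall receives light from every hidden direction up to that angle, so the penumbra is a running sum of the hidden scene, and differencing it across angle gives the scene back. The sketch below illustrates only that idealization; the bin values and function names are made up, and the paper's actual estimation method is not described in this article.

```python
from itertools import accumulate


def simulate_penumbra(hidden_scene):
    """Idealized penumbra: brightness at angle bin i is the sum of hidden
    light from bins 0..i (an angular integral in discrete form)."""
    return list(accumulate(hidden_scene))


def recover_scene(penumbra):
    """Undo the running sum by differencing neighbouring angle bins."""
    recovered = []
    previous = 0.0
    for value in penumbra:
        recovered.append(value - previous)
        previous = value
    return recovered


# Hypothetical hidden scene: light arriving from six angular bins.
hidden_scene = [0.0, 0.0, 3.0, 1.0, 0.0, 2.0]
penumbra = simulate_penumbra(hidden_scene)

print(penumbra)                 # [0.0, 0.0, 3.0, 4.0, 4.0, 6.0]
print(recover_scene(penumbra))  # [0.0, 0.0, 3.0, 1.0, 0.0, 2.0]
```

Differencing amplifies noise, which is consistent with the article's point that a single image is not very useful and that many frames have to be combined.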

This work was supported in part by the DARPA REVEAL Program, the National Science Foundation, Shell Research and a National Defense Science & Engineering Graduate (NDSEG) fellowship.

Materials provided by MIT CSAIL

Monday, October 2, 2017

Teleoperating robots with virtual reality

MIT CSAIL's VR system could make it easier for factory workers to telecommute

Certain industries have not traditionally had the luxury of telecommuting. For example, many manufacturing jobs require a physical presence to operate machinery.

But what if such jobs could be done remotely? This week researchers from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) presented a virtual-reality (VR) system that lets you teleoperate a robot using an Oculus Rift headset.

Credit: Jason Dorfman, MIT CSAIL

The system embeds the user in a VR control room with multiple sensor displays, making it feel like they’re inside the robot’s head. By using hand controllers, users can match their movements to the robot’s to complete various tasks.

“A system like this could eventually help humans supervise robots from a distance,” says CSAIL postdoctoral associate Jeffrey Lipton, who was lead author on a related paper about the system. “By teleoperating robots from home, blue-collar workers would be able to telecommute and benefit from the IT revolution just as white-collar workers do now.”

The researchers even imagine that such a system could help employ increasing numbers of jobless video-gamers by “game-ifying” manufacturing positions.

The team used the Baxter humanoid robot from Rethink Robotics, but said that it can work on other robot platforms and is also compatible with the HTC Vive headset.

Lipton co-wrote the paper with CSAIL director Daniela Rus and researcher Aidan Fay. They presented the paper this week at the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) in Vancouver.

How it works

There have traditionally been two main approaches to using VR for teleoperation.

In a “direct” model, the user's vision is directly coupled to the robot's state. With these systems, a delayed signal could lead to nausea and headaches, and the user’s viewpoint is limited to one perspective.

In a “cyber-physical” model, the user is separate from the robot. The user interacts with a virtual copy of the robot and the environment. This requires much more data, and specialized spaces.

The CSAIL team’s system is halfway between these two methods. It solves the delay problem, since the user is constantly receiving visual feedback from the virtual world. It also solves the cyber-physical issue of being distinct from the robot: once a user puts on the headset and logs into the system, they’ll feel as if they’re inside Baxter’s head.

The system mimics the “homunculus model of mind” - the idea that there’s a small human inside our brains controlling our actions, viewing the images we see and understanding them for us. While it’s a peculiar idea for humans, for robots it fits: “inside” the robot is a human in a control room, seeing through its eyes and controlling its actions.

Using Oculus’ controllers, users can interact with controls that appear in the virtual space to open and close the hand grippers to pick up, move, and retrieve items. A user can plan movements based on the distance between the arm’s location marker and their hand while looking at the live display of the arm.

To make these movements possible, the human’s space is mapped into the virtual space, and the virtual space is then mapped into the robot space to provide a sense of co-location.
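The two-stage mapping described above (human space to virtual space, then virtual space to robot space) can be pictured as a composition of coordinate transforms. The sketch below is an assumption-laden illustration, not the team's implementation: it uses a uniform scale plus an offset for each stage, and every number and name in it is hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple

Vec3 = Tuple[float, float, float]


@dataclass(frozen=True)
class ScaleOffset:
    """Hypothetical transform: p -> scale * p + offset (no rotation)."""
    scale: float
    offset: Vec3

    def apply(self, point: Vec3) -> Vec3:
        return tuple(self.scale * c + o for c, o in zip(point, self.offset))


def compose(first: ScaleOffset, second: ScaleOffset) -> ScaleOffset:
    """Return the single transform equal to applying first, then second."""
    # second(first(p)) = s2 * (s1 * p + o1) + o2
    return ScaleOffset(
        scale=second.scale * first.scale,
        offset=tuple(second.scale * o1 + o2
                     for o1, o2 in zip(first.offset, second.offset)),
    )


# Assumed example values: the user's hand position in metres, a shift into
# the virtual control room, and a scaled mapping into the robot's workspace.
human_to_virtual = ScaleOffset(scale=1.0, offset=(0.0, 0.0, -1.0))
virtual_to_robot = ScaleOffset(scale=0.5, offset=(0.2, 0.0, 0.9))
human_to_robot = compose(human_to_virtual, virtual_to_robot)

hand = (0.4, 0.2, 1.0)
step_by_step = virtual_to_robot.apply(human_to_virtual.apply(hand))
direct = human_to_robot.apply(hand)
print(step_by_step)
print(direct)
```

Both routes give the same robot-space point, which is the property that lets a hand motion in the user's room correspond to a consistent motion of the robot arm. A real system would likely need rotations as well, for example via 4x4 homogeneous matrices.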

The system is also more flexible compared to previous systems that require many resources. Other systems might extract 2-D information from each camera, build out a full 3-D model of the environment, and then process and redisplay the data.

In contrast, the CSAIL team’s approach bypasses all of that by simply taking the 2-D images that are displayed to each eye. (The human brain does the rest by automatically inferring the 3-D information.)

To test the system, the team first teleoperated Baxter to do simple tasks like picking up screws or stapling wires. They then had the test users teleoperate the robot to pick up and stack blocks.

Users successfully completed the tasks at a much higher rate compared to the “direct” model. Unsurprisingly, users with gaming experience had much more ease with the system.

Tested against state-of-the-art systems, CSAIL’s system was better at grasping objects 95 percent of the time and 57 percent faster at doing tasks. The team also showed that the system could pilot the robot from hundreds of miles away, testing it on a hotel’s wireless network in Washington, DC to control Baxter at MIT.

"This contribution represents a major milestone in the effort to connect the user with the robot's space in an intuitive, natural, and effective manner," says Oussama Khatib, a computer science professor at Stanford University who was not involved in the paper.

The team eventually wants to focus on making the system more scalable, with many users and different types of robots that can be compatible with current automation technologies.

The project was funded in part by the Boeing Company and the National Science Foundation.

Materials provided by MIT CSAIL