Thursday, June 21, 2018

Controlling robots with brainwaves and hand gestures


System enables people to correct robot mistakes on multi-choice problems


Getting robots to do things isn’t easy: usually scientists have to either explicitly program them or get them to understand how humans communicate via language.

But what if we could control robots more intuitively, using just hand gestures and brainwaves?

A new system spearheaded by researchers from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) aims to do exactly that, allowing users to instantly correct robot mistakes with nothing more than brain signals and the flick of a finger.

Building off the team’s past work focused on simple binary-choice activities, the new work expands the scope to multiple-choice tasks, opening up new possibilities for how human workers could manage teams of robots.

By monitoring brain activity, the system can detect in real time if a person notices an error as a robot does a task. Using an interface that measures muscle activity, the person can then make hand gestures to scroll through and select the correct option for the robot to execute.
The system allows a human supervisor to correct mistakes using gestures and brainwaves. Credit: Joseph DelPreto, MIT CSAIL
The team demonstrated the system on a task in which a robot moves a power drill to one of three possible targets on the body of a mock plane. Importantly, they showed that the system works on people it’s never seen before, meaning that organizations could deploy it in real-world settings without needing to train it on users.

“This work combining EEG and EMG feedback enables natural human-robot interactions for a broader set of applications than we've been able to do before using only EEG feedback,” says CSAIL director Daniela Rus, who supervised the work. “By including muscle feedback, we can use gestures to command the robot spatially, with much more nuance and specificity.”

PhD candidate Joseph DelPreto was lead author on a paper about the project alongside Rus, former CSAIL postdoctoral associate Andres F. Salazar-Gomez, former CSAIL research scientist Stephanie Gil, research scholar Ramin M. Hasani, and Boston University professor Frank H. Guenther. The paper will be presented at the Robotics: Science and Systems (RSS) conference taking place in Pittsburgh next week.

Intuitive human-robot interaction

In most previous work, systems could generally only recognize brain signals when people trained themselves to “think” in very specific but arbitrary ways and when the system was trained on such signals. For instance, a human operator might have to look at different light displays that correspond to different robot tasks during a training session.

Not surprisingly, such approaches are difficult for people to handle reliably, especially if they work in fields like construction or navigation that already require intense concentration.

Meanwhile, Rus’ team harnessed the power of brain signals called “error-related potentials” (ErrPs), which researchers have found to naturally occur when people notice mistakes. If there’s an ErrP, the system stops so the user can correct it; if not, it carries on.
“What’s great about this approach is that there’s no need to train users to think in a prescribed way,” says DelPreto. “The machine adapts to you, and not the other way around.”
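
The stop-and-correct behavior described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the team's implementation: the function name supervise, the "left"/"right" gesture labels, and the wrap-around scrolling are all hypothetical, and the ErrP flag and gesture list are assumed to come from separate EEG and EMG classifiers that are not shown.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Trial:
    robot_choice: int   # target the robot initially picked
    final_choice: int   # target after any human correction
    corrected: bool     # True if the human intervened


def supervise(robot_choice: int,
              errp_detected: bool,
              gestures: Sequence[str],
              n_targets: int = 3) -> Trial:
    """Apply the stop-and-correct logic to one robot decision.

    gestures is a hypothetical sequence of "left"/"right" labels from an
    EMG classifier; each one scrolls the selection by one target.
    """
    if not errp_detected:
        # No error-related potential: the robot carries on unchanged.
        return Trial(robot_choice, robot_choice, False)

    choice = robot_choice
    for gesture in gestures:
        if gesture == "left":
            step = -1
        elif gesture == "right":
            step = 1
        else:
            raise ValueError(f"unknown gesture: {gesture!r}")
        # Assumption: scrolling wraps around the list of targets.
        choice = (choice + step) % n_targets
    return Trial(robot_choice, choice, True)
```

For example, supervise(0, True, ["right", "right"]) ends on target 2, while supervise(1, False, []) leaves the robot's choice untouched.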

For the project the team used “Baxter”, a humanoid robot from Rethink Robotics. With human supervision, the robot went from choosing the correct target 70 percent of the time to more than 97 percent of the time.

To create the system the team harnessed the power of electroencephalography (EEG) for brain activity and electromyography (EMG) for muscle activity, putting a series of electrodes on the users’ scalp and forearm.

Both metrics have some individual shortcomings: EEG signals are not always reliably detectable, while EMG signals can sometimes be difficult to map to motions that are any more specific than “move left or right.” Merging the two, however, allows for more robust bio-sensing and makes it possible for the system to work on new users without training.
“By looking at both muscle and brain signals, we can start to pick up on a person's natural gestures along with their snap decisions about whether something is going wrong,” says DelPreto. “This helps make communicating with a robot more like communicating with another person.”

The team says that they could imagine the system one day being useful for the elderly, or workers with language disorders or limited mobility.
“We’d like to move away from a world where people have to adapt to the constraints of machines,” says Rus. “Approaches like this show that it’s very much possible to develop robotic systems that are a more natural and intuitive extension of us.”

Materials provided by MIT CSAIL, 32 Vassar Street, Cambridge, MA 02139, USA 

Sunday, October 29, 2017

Artificial Intelligence and the Future of Industry

AI and Industry 4.0

The entire world is feeling the impact of established and emerging artificial intelligence techniques and tools. This is transforming society and all areas of business, including healthcare and biomedicine, retail and finance, transportation and auto, and other verticals. Marketing and communications is certainly being transformed through artificial intelligence techniques. Manufacturing is at the beginning of a major upheaval as automation and machine learning rewrite the rules of work. We are seeing applications in construction and additive manufacturing, as well as self-driving vehicles and industrial robotics. Robotic systems, the Internet of Things, cloud computing and cognitive computing collectively make up what is termed "Industry 4.0." The first three stages were mechanization, mass production and basic automation.

The digital transformation of manufacturing and the supply chain means that data from factories is directly analyzed using AI technologies. EmTech Digital (short for emerging technology), produced by the Massachusetts Institute of Technology's Technology Review magazine, is an annual conference that examines the latest research on artificial-intelligence techniques, including deep learning, predictive modeling, reasoning and planning, and speech and pattern recognition. The 2016 event was especially interesting. To learn more about future events, go to:
https://events.technologyreview.com/

EmTech Digital 2016 in San Francisco 

Artificial intelligence is already impacting every industry, powering search, social media, and smartphones and tracking personal health and finances. What’s ahead promises to be the greatest computing breakthrough of all time, yet it’s difficult to discern facts from hype. That is exactly what EmTech Digital tries to accomplish.

At the 2016 event a roundtable discussion on the State of AI was held with a panel of experts including:
  • Peter Norvig of Google
  • Andrew Ng, formerly of Baidu
  • Oren Etzioni of the Allen Institute
The panel was moderated by MIT Technology Review editor in chief Jason Pontin.


State-of-the-Art AI: Building Tomorrow’s Intelligent Systems

Peter Norvig, Director of Research for Google, talks about developing state-of-the-art AI solutions for building tomorrow's intelligent systems.


Deep Learning in Practice: Speech Recognition and Beyond

Andrew Ng, formerly Chief Scientist at Baidu, founded and led the Google Brain project in 2011, which built the largest deep-learning neural network systems at the time. Here he discusses deploying deep learning solutions in practice with conversational AI and beyond.


AI for the Common Good

Oren Etzioni, CEO of the Allen Institute for AI, shares his vision for deploying AI technologies for the common good.




Videos courtesy of MIT Technology Review

Monday, October 9, 2017

#CornerCameras: An AI for your blind-spot

Compatible with smartphone cameras, MIT CSAIL system for seeing around corners
could help with self-driving cars and search-and-rescue


Earlier this year, researchers at Heriot-Watt University and the University of Edinburgh recognized that there is a way to tease out information about a hidden object even from apparently random scattered light. Their method, published in Nature Photonics, relies on laser range-finding technology, which measures the distance to an object based on the time it takes a pulse of light to travel to the object, scatter, and travel back to a detector.
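
The basic range-finding arithmetic is simple: the pulse covers the distance twice, so the one-way distance is the speed of light times the round-trip time, divided by two. The Python sketch below shows only this direct line-of-sight case. The function name is hypothetical, and the published around-the-corner method has to handle scattered light paths, which this does not model.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0  # exact, by definition of the metre


def distance_from_round_trip(round_trip_s: float) -> float:
    """Return the one-way distance in metres for a round-trip time in seconds."""
    if round_trip_s < 0:
        raise ValueError("round-trip time cannot be negative")
    # The pulse travels out and back, so halve the total path length.
    return SPEED_OF_LIGHT_M_PER_S * round_trip_s / 2.0
```

A round trip of 20 nanoseconds, for instance, corresponds to a distance of roughly 3 meters, which gives a sense of the timing precision such detectors need.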

And now further research has shown significant forward progress. Light lets us see the things that surround us, but what if we really could also use it to see things hidden around corners?

This may sound like science fiction, but that’s the idea behind a new algorithm out of MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) - and its discovery has implications for everything from emergency response to self-driving cars.



The CSAIL team’s imaging system, which can work with video from smartphone cameras, uses information about light reflections to detect objects or people in a hidden scene and measure their speed and trajectory - all in real-time. (It doesn't see any identifying details about individuals - just the fact that they are moving objects.)

Researchers say that the ability to see around obstructions would be useful for many tasks, from firefighters finding people in burning buildings to drivers detecting pedestrians in their blind spots.

To explain how it works, imagine that you’re walking down an L-shaped hallway and have a wall between you and some objects around the corner. Those objects reflect a small amount of light on the ground in your line of sight, creating a fuzzy shadow that is referred to as the “penumbra.”

Using video of the penumbra, the system - which the team dubbed “CornerCameras” - can stitch together a series of one-dimensional images that reveal information about the objects around the corner.

“Even though those objects aren’t actually visible to the camera, we can look at how their movements affect the penumbra to determine where they are and where they’re going,” says PhD graduate Katherine Bouman, who was lead author on a new paper about the system. “In this way, we show that walls and other obstructions with edges can be exploited as naturally-occurring ‘cameras’ that reveal the hidden scenes beyond them.”

Bouman co-wrote the paper with MIT professors Bill Freeman, Antonio Torralba, Greg Wornell and Fredo Durand, master’s student Vickie Ye and PhD student Adam Yedidia. She will present the work later this month at the International Conference on Computer Vision (ICCV) in Venice.

How it works

Most approaches for seeing around obstacles involve special lasers. Specifically, researchers shine lasers on specific points that are visible to both the observable and hidden scene, and then measure how long it takes for the light to return.

However, these so-called “time-of-flight cameras” are expensive and can easily get thrown off by ambient light, especially outdoors.

In contrast, the CSAIL team’s technique doesn’t require actively projecting light into the space, and works in a wider range of indoor and outdoor environments and with off-the-shelf consumer cameras.


From viewing video of the penumbra, CornerCameras generates one-dimensional images of the hidden scene. A single image isn’t particularly useful, since it contains a fair amount of “noisy” data. But by observing the scene over several seconds and stitching together dozens of distinct images, the system can distinguish distinct objects in motion and determine their speed and trajectory.
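
To see why combining many noisy frames helps, consider a generic illustration of averaging. It is not the estimator used in the paper, and the function names, signal level, and noise level are all assumptions. Each simulated frame is a 1-D signal buried in random noise, and averaging the frames pixel by pixel shrinks the noise while the signal stays put.

```python
import random


def average_frames(frames):
    """Average a list of equal-length 1-D frames pixel by pixel."""
    n = len(frames)
    return [sum(column) / n for column in zip(*frames)]


def simulate_frames(n_frames, signal, noise_std, seed=0):
    """Return n_frames noisy copies of signal (Gaussian noise, fixed seed)."""
    rng = random.Random(seed)
    return [[s + rng.gauss(0.0, noise_std) for s in signal]
            for _ in range(n_frames)]


def rms_error(estimate, truth):
    """Root-mean-square difference between an estimate and the true signal."""
    squared = sum((e - t) ** 2 for e, t in zip(estimate, truth))
    return (squared / len(truth)) ** 0.5
```

With a faint step signal such as [0.0] * 20 + [0.001] * 20 and a noise level ten times larger, the average of 64 simulated frames sits much closer to the true signal than any single frame does.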

“The notion to even try to achieve this is innovative in and of itself, but getting it to work in practice shows both creativity and adeptness,” says professor Marc Christensen, who serves as Dean of the Lyle School of Engineering at Southern Methodist University and was not involved in the research. “This work is a significant step in the broader attempt to develop revolutionary imaging capabilities that are not limited to line-of-sight observation.”

The team was surprised to find that CornerCameras worked in a range of challenging situations, including weather conditions like rain.

“Given that the rain was literally changing the color of the ground, I figured that there was no way we’d be able to see subtle differences in light on the order of a tenth of a percent,” says Bouman. “But because the system integrates so much information across dozens of images, the effect of the raindrops averages out, and so you can see the movement of the objects even in the middle of all that activity.”

The system still has some limitations. For obvious reasons, it doesn’t work if there’s no light in the scene, and can have issues if there’s low light in the hidden scene itself. It also can get tripped up if light conditions change, like if the scene is outdoors and clouds are constantly moving across the sun. With smartphone-quality cameras the signal also gets weaker as you get farther away from the corner.

The researchers plan to address some of these challenges in future papers, and will also try to get it to work while in motion. The team will soon be testing it on a wheelchair, with the goal of eventually adapting it for cars and other vehicles.

“If a little kid darts into the street, a driver might not be able to react in time,” says Bouman. “While we’re not there yet, a technology like this could one day be used to give drivers a few seconds of warning time and help in a lot of life-or-death situations.”

The conclusion of the paper states:
We show how to turn corners into cameras, exploiting a common, but overlooked, visual signal. The vertical edge of a corner’s wall selectively blocks light to let the ground nearby display an angular integral of light from around the corner. The resulting penumbras from people and objects are invisible to the eye – typical contrasts are 0.1% above background – but are easy to measure using consumer-grade cameras. We produce 1-D videos of activity around the corner, measured indoors, outdoors, in both sunlight and shade, from brick, tile, wood, and asphalt floors. The resulting 1-D videos reveal the number of people moving around the corner, their angular sizes and speeds, and a temporal summary of activity. Open doorways, with two vertical edges, offer stereo views inside a room, viewable even away from the doorway. Since nearly every corner now offers a 1-D view around the corner, this opens potential applications for automotive pedestrian safety, search and rescue, and public safety. This ever-present, but previously unnoticed, 0.1% signal may invite other novel camera measurement methods.
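
The "angular integral" mentioned in the conclusion can be illustrated with a toy model. This is a noise-free, discretized sketch under stated assumptions, not the paper's estimation procedure, and the function names are hypothetical. If the hidden scene is divided into angular bins, a patch of ground at a given angle around the corner receives light from every bin up to that angle, so the penumbra is a running sum of the scene. Taking differences between neighboring angles recovers the 1-D image.

```python
from itertools import accumulate


def penumbra_from_scene(scene, ambient=0.0):
    """Forward model: each ground angle sees a running sum of hidden-scene light."""
    return [ambient + total for total in accumulate(scene)]


def scene_from_penumbra(penumbra, ambient=0.0):
    """Inverse model: first differences of the penumbra profile along angle."""
    recovered = []
    previous = ambient
    for value in penumbra:
        recovered.append(value - previous)
        previous = value
    return recovered
```

In this idealized setting the round trip is exact. Real measurements are noisy, and differencing amplifies noise, so a practical system would need the kind of smoothing and multi-frame integration the researchers describe.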

This work was supported in part by the DARPA REVEAL Program, the National Science Foundation, Shell Research and a National Defense Science & Engineering Graduate (NDSEG) fellowship.

Materials provided by MIT CSAIL

Monday, October 2, 2017

Teleoperating robots with virtual reality

MIT CSAIL's VR system could make it easier for factory workers to telecommute

Certain industries have not traditionally had the luxury of telecommuting. For example, many manufacturing jobs require a physical presence to operate machinery.

But what if such jobs could be done remotely? This week researchers from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) presented a virtual-reality (VR) system that lets you teleoperate a robot using an Oculus Rift headset.

Credit: Jason Dorfman, MIT CSAIL
The system embeds the user in a VR control room with multiple sensor displays, making it feel like they’re inside the robot’s head. By using hand controllers, users can match their movements to the robot’s to complete various tasks.

“A system like this could eventually help humans supervise robots from a distance,” says CSAIL postdoctoral associate Jeffrey Lipton, who was lead author on a related paper about the system. “By teleoperating robots from home, blue-collar workers would be able to telecommute and benefit from the IT revolution just as white-collar workers do now.”

The researchers even imagine that such a system could help employ increasing numbers of jobless video-gamers by “game-ifying” manufacturing positions.

The team used the Baxter humanoid robot from Rethink Robotics, but said that it can work on other robot platforms and is also compatible with the HTC Vive headset.

Lipton co-wrote the paper with CSAIL director Daniela Rus and researcher Aidan Fay. They presented the paper this week at the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) in Vancouver.



How it works

There have traditionally been two main approaches to using VR for teleoperation.

In a “direct” model, the user's vision is directly coupled to the robot's state. With these systems, a delayed signal could lead to nausea and headaches, and the user’s viewpoint is limited to one perspective.

In a “cyber-physical” model, the user is separate from the robot. The user interacts with a virtual copy of the robot and the environment. This requires much more data, and specialized spaces.

The CSAIL team’s system is halfway between these two methods. It solves the delay problem, since the user is constantly receiving visual feedback from the virtual world. It also solves the cyber-physical issue of being distinct from the robot: once a user puts on the headset and logs into the system, they’ll feel as if they’re inside Baxter’s head.

The system mimics the “homunculus model of mind” - the idea that there’s a small human inside our brains controlling our actions, viewing the images we see and understanding them for us. While it’s a peculiar idea for humans, for robots it fits: “inside” the robot is a human in a control room, seeing through its eyes and controlling its actions.

Using Oculus’ controllers, users can interact with controls that appear in the virtual space to open and close the hand grippers to pick up, move, and retrieve items. A user can plan movements based on the distance between the arm’s location marker and their hand while looking at the live display of the arm.

To make these movements possible, the human’s space is mapped into the virtual space, and the virtual space is then mapped into the robot space to provide a sense of co-location.
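
The chain of mappings described above (human space to virtual space, then virtual space to robot space) can be sketched as a composition of coordinate transforms. The example below assumes each mapping is a uniform scale plus an offset, which is a simplification chosen for illustration. The class name, method names, and numbers are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass
from typing import Tuple

Vec3 = Tuple[float, float, float]


@dataclass(frozen=True)
class ScaleOffset:
    """A uniform scale followed by a translation: p -> scale * p + offset."""
    scale: float
    offset: Vec3

    def apply(self, point: Vec3) -> Vec3:
        """Map a point from the source space into the target space."""
        return tuple(self.scale * c + o for c, o in zip(point, self.offset))

    def then(self, other: "ScaleOffset") -> "ScaleOffset":
        """Return the single transform equal to applying self, then other."""
        return ScaleOffset(
            other.scale * self.scale,
            tuple(other.scale * o + oo
                  for o, oo in zip(self.offset, other.offset)),
        )
```

With made-up mappings such as human_to_virtual = ScaleOffset(1.0, (0.0, 0.0, -1.5)) and virtual_to_robot = ScaleOffset(0.5, (0.2, 0.0, 0.9)), the combined transform human_to_virtual.then(virtual_to_robot) sends a hand position straight into robot coordinates, giving the same result as applying the two steps one after the other.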

The system is also more flexible compared to previous systems that require many resources. Other systems might extract 2-D information from each camera, build out a full 3-D model of the environment, and then process and redisplay the data.

In contrast, the CSAIL team’s approach bypasses all of that by simply taking the 2-D images that are displayed to each eye. (The human brain does the rest by automatically inferring the 3-D information.)

To test the system, the team first teleoperated Baxter to do simple tasks like picking up screws or stapling wires. They then had the test users teleoperate the robot to pick up and stack blocks.

Users successfully completed the tasks at a much higher rate compared to the “direct” model. Unsurprisingly, users with gaming experience found the system much easier to use.

Tested against state-of-the-art systems, CSAIL’s system was better at grasping objects 95 percent of the time and 57 percent faster at doing tasks. The team also showed that the system could pilot the robot from hundreds of miles away, testing it on a hotel’s wireless network in Washington, DC to control Baxter at MIT.

"This contribution represents a major milestone in the effort to connect the user with the robot's space in an intuitive, natural, and effective manner," says Oussama Khatib, a computer science professor at Stanford University who was not involved in the paper.

The team eventually wants to focus on making the system more scalable, with many users and different types of robots that can be compatible with current automation technologies.

The project was funded in part by the Boeing Company and the National Science Foundation.
Materials provided by MIT-CSAIL