Did MIT researchers just invent the “Shazam for food” that failed on Silicon Valley?
In a recent episode, the fictitious tech bros of HBO’s Silicon Valley were all abuzz about a “Shazam for food”—an app that could readily identify any meal in the world, just based on a photograph. (Fans of the show know that the product, SeeFood, didn’t exactly work as advertised.) Now, a real-life research team at MIT has done Erlich Bachman and company one better: Feed their AI some #foodporn, and it will spit back a complete recipe.
In a paper presented this month at the Computer Vision and Pattern Recognition conference in Honolulu, Hawaii, the team—which also included scientists from the Polytechnic University of Catalonia and the Qatar Computing Research Institute—described scraping over 1 million recipes and 800,000 images from websites like All Recipes and Food.com, and creating a vast dataset on which to train its artificial intelligence system. Over time, the AI—dubbed “Pic2Recipe”—learned to identify meals visually and suggest recipes based on 16,000 component ingredients, from “avocado” and “ground beef” to “olive oil” and “taco seasoning.”
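The core trick behind a system like this is retrieval in a shared embedding space: images and recipes are mapped to vectors, and the recipe whose vector sits closest to the photo's vector wins. Here's a minimal sketch of that idea with toy, hand-written vectors — the names and numbers are invented for illustration, and the real Pic2Recipe uses trained neural encoders rather than anything this simple:

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 = identical direction)."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def retrieve_recipes(image_embedding, recipe_embeddings, top_k=5):
    """Rank candidate recipes by similarity to an image embedding."""
    scores = [(name, cosine_similarity(image_embedding, vec))
              for name, vec in recipe_embeddings.items()]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:top_k]

# Toy shared-space embeddings; in the real system these would come from
# neural encoders trained on the scraped recipe/image dataset.
recipes = {
    "cheeseburger": np.array([0.9, 0.1, 0.0]),
    "smoothie":     np.array([0.1, 0.9, 0.2]),
    "sushi":        np.array([0.0, 0.2, 0.9]),
}
photo = np.array([0.8, 0.2, 0.1])  # stand-in for an image encoder's output

top = retrieve_recipes(photo, recipes, top_k=1)
```

This also hints at why "fine-grained" foods like smoothies trip the system up: visually similar dishes land close together in the embedding space, so nearest-neighbor retrieval can't tell them apart.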
How well does it work? The researchers tried to find out by pitting their platform against real human beings, hired through Amazon’s side gig marketplace, Mechanical Turk. Participants had to have a 97 percent approval rating, and have completed more than 500 previous jobs; they were asked to link photos of food to lists of existing recipes, a task that becomes harder as ingredients get more specific. Ultimately, according to the study, Pic2Recipe performed about as well as human beings, though it struggled with items that featured homogenized or “fine-grained” features, like smoothies and sushi.
Here’s a question: Why would anyone want to do such a thing?
“We hope that our contributions will support the creation of automated tools for food and recipe understanding,” the authors write, “and open doors for many less explored aspects of learning such as compositional creativity and predicting visual outcomes of action sequences.”
Hmmm. In a writeup for MIT News, co-author Nicholas Hynes explained further:
“This could potentially help people figure out what’s in their food when they don’t have explicit nutritional information,” he says. “For example, if you know what ingredients went into a dish but not the amount, you can take a photo, enter the ingredients, and run the model to find a similar recipe with known quantities, and then use that information to approximate your own meal.”
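Hynes's nutrition idea boils down to simple arithmetic once a matching recipe is found: borrow the matched recipe's known quantities, scale to a serving, and sum. A toy illustration, with made-up ingredient names and calorie densities (not actual nutritional data):

```python
# Hypothetical calorie densities (kcal per gram) for the example.
CALORIES_PER_GRAM = {"ground beef": 2.5, "cheddar": 4.0, "bun": 2.8}

def estimate_calories(recipe_grams, servings):
    """Sum calories for a matched recipe, scaled down to one serving."""
    total = sum(CALORIES_PER_GRAM[ingredient] * grams
                for ingredient, grams in recipe_grams.items())
    return total / servings

# Quantities borrowed from a hypothetical matched recipe (whole batch).
burger = {"ground beef": 450.0, "cheddar": 100.0, "bun": 240.0}
per_serving = estimate_calories(burger, servings=4)
```

The accuracy of the approximation obviously lives or dies on how similar the matched recipe really is to what's on the plate — which is exactly the retrieval problem the paper is tackling.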
Since the researchers have made a demo version of Pic2Recipe available online, I thought I’d take a look myself. First, I wanted to see if it could help me recreate these deviled eggs lovingly prepared for me by my mother.
“No matches,” it said. It also didn’t recognize a pretty unambiguous picture of some basil.
I fared a little better when I tried this stock photo of a hamburger: That rang up a list of five recipes, none of which quite nailed it. The closest was the “Inside-Out Cheeseburger with Bacon.” Though the picture I submitted doesn’t appear to have bacon, the recipe would probably do the trick. Other options—including “Steak Sandwich” and “VELVEETA Spicy Bacon and Spinach Sliders”—weren’t as close (though: Velveeta Sliders…).
As of now, it doesn’t seem like Pic2Recipe is going to disrupt Instagram foodie culture anytime soon. But if they’re looking for a visionary CEO to bring their prototype to the masses, I hear Erlich Bachman’s looking for work.