How the UBC creators of AutoStitch revolutionized computer vision and sparked a spinoff tsunami

When David Lowe first wrote a computer algorithm to identify objects in images, he never envisioned it popping up on nearly half a million iPhones and in supermarket checkout lines.

But since its research publication in 1999, the Scale-Invariant Feature Transform (SIFT) algorithm, developed by Lowe, has been licensed by more than 20 companies, making it the most widely adopted invention in University of British Columbia history.

“My interest at the time was to solve a long-standing research problem in the field of computer vision,” recalls Lowe, a professor in UBC’s Department of Computer Science. “If we take a picture of an object, how does the computer recognize the same object in a different image, where the size, orientation or brightness may have changed?”

SIFT quickly caught the attention of Sony, which was making its foray into high-tech toys with AIBO the robot dog. SIFT gave AIBO sight, enabling it to recognize shapes on cue cards and perform corresponding tasks.

Lowe and his then-graduate student Matthew Brown later created AutoStitch, the world’s first fully automatic image-stitcher, which assembles panoramic images from multiple shots. Cloudburst Research was formed to build an iPhone and iPad app based on AutoStitch, and has recently released an Android version.

Since then, the algorithm has found its way into supermarket anti-theft systems, which match images of products in shopping carts to those on store shelves, and most recently into tools that assist visually impaired users. One licensee integrated it into an electronic magnifier that brings images from lecture theatre screens onto laptops. Another company uses SIFT to identify paper currency and confirm its denomination, and is now developing a tool that scans a kitchen to create an inventory of its contents while helping users distinguish between similarly shaped items such as cans of soup.

“It’s very gratifying to see my work out there improving people’s daily lives in ways I had not anticipated,” says Lowe.

With computing power growing at an explosive pace, Lowe sees great potential for computer vision technology to make an even bigger impact in the foreseeable future.

“Google says that there’ll be self-driving cars on the market within five years,” says Lowe. “And even cars that are not self-driving will have cameras that monitor the surroundings and warn the driver – or put on the brakes – if it’s about to hit something.

“In these potentially life-and-death situations, how do we ensure technology is as helpful and reliable as it can be?”

One advantage computers have over humans, says Lowe, is that they are never distracted, and cameras are better at recognizing road signs. “But there are still a lot of things they can’t do 100 per cent accurately, and it involves both artificial intelligence and data collection.”

For example, pedestrians come in all shapes and sizes and can be obstructed in countless ways – behind a tree, holding large shopping bags, pushing a baby stroller. They may also be in motion when nearing a vehicle – chasing after a ball, riding a skateboard, or pedalling a bicycle. All of these possibilities must be part of the matching and computing process before a camera-assisted safety system can become fully reliable.

Lowe is up for the challenge. He has been working to scale up recognition to handle large numbers of images, work that may have wide-ranging applications. “Imagine yourself in the middle of a city and with one photo, your phone can tell you where you are and superimpose precisely aligned information about that location – essentially a computer vision approach for more accurate GPS – and I think it’s within reach.”

Chris Balma
c 604-202-5047