Have you ever wondered what enables a barcode scanner to ‘see’ the stripes on a UPC label, or how Apple’s Face ID can tell whether it’s you or someone else? Well, when Alan Turing posed the poignant question ‘Can machines think?’ in his paper “Computing Machinery and Intelligence”, the world gasped in disbelief.
Today, machines are changing the very fabric of what we do, see, or create.
Welcome to the world of computer vision – a process that lets machines such as phones or computers see their surroundings. When a machine processes raw visual input, it leverages computer vision to understand what it’s looking at.
Think about it – you see a flower for the first time. To learn its name, you snap a photo and post it on social media. Wouldn’t life be simpler if machines could just look at the flower and identify the correct species and its name? Well, this concept was first introduced in the 1970s, but the technology to implement it didn’t exist yet. Only in recent years has computer vision grown by leaps and bounds, and the entire world is a testament to it.
A Quick Look at How Machines See -- A. Colors are represented as numbers, such as HEX values. Machines are programmed to read an image as a grid of pixels, each with its own color value.
- B. The image is segmented into groups, such as background and foreground regions.
- C. After segmentation, machines find edges and corners in the image to extract detailed features.
- D. Texture is one of the primary cues used to categorize an object and name it precisely.
- E. Guess! Yes, machines, like us, then make a guess by matching what they see against their database.
- F. And finally, machines check whether they were right. If yes, Bingo! If not, they add a new entry to their database, supplied by algorithmic instructions (or users).
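The steps above can be sketched in a few lines of code. This is only a toy illustration, not how real vision systems work: the tiny hand-built "image", the edge-count feature, and the `database` dictionary are all invented for the example. Still, it walks through the same loop – pixels in, segmentation, edge finding, a guess against a database, and a new entry when the guess fails.

```python
# A toy sketch of steps A–F, assuming a tiny grayscale "image" stored as a
# nested list of pixel values (0 = black, 255 = white). The feature, the
# threshold, and the "database" are all invented for illustration.

IMAGE = [
    [255, 255, 255, 255, 255],
    [255,   0,   0,   0, 255],
    [255,   0, 255,   0, 255],
    [255,   0,   0,   0, 255],
    [255, 255, 255, 255, 255],
]

def segment(image, threshold=128):
    """Step B: split pixels into foreground (dark) and background (light)."""
    return [[pixel < threshold for pixel in row] for row in image]

def count_edges(mask):
    """Step C: count transitions between foreground and background,
    a crude stand-in for real edge and corner detection."""
    edges = 0
    for row in mask:                       # horizontal transitions
        for a, b in zip(row, row[1:]):
            edges += a != b
    for col in zip(*mask):                 # vertical transitions
        for a, b in zip(col, col[1:]):
            edges += a != b
    return edges

# Steps E/F: a hypothetical "database" mapping a feature value to a label.
database = {16: "hollow square"}

def classify(image):
    feature = count_edges(segment(image))
    guess = database.get(feature)          # Step E: make a guess
    if guess is None:                      # Step F: the guess failed,
        guess = f"unknown-{feature}"       # so store a new entry
        database[feature] = guess
    return guess

print(classify(IMAGE))  # → hollow square
```

Running it classifies the ring of dark pixels as a "hollow square"; feed it an image whose feature isn't in the database and it records a new entry instead, mirroring step F.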
The first significant breakthrough in computer vision came in 2012 at the University of Toronto. The technology was barely mentioned in the news before 2017, when its popularity grew by 500%. AngelList, a US-based firm that bridges the gap between investors and startups, lists 529 organizations under the ‘Computer Vision’ label. It is unarguably a booming industry, with many sectors already embracing it with open arms.
Google Maps uses image data to identify street names, roads, office buildings, and restaurants. Facebook makes the most of computer vision by identifying people in photos. Lens, an app introduced by Snapchat, detects objects in an image and recommends where to buy them. The eBay app lets users search for items using their camera.
Back in 2014, Tesla launched a driver-assistance system, Autopilot, with features like self-parking and lane centering. Computer vision entered retail when Amazon Go opened its stores: with no cashiers or checkout stations, they are driven by deep learning, computer vision, and sensor fusion.
AI, as of now, behaves like a small child with computer vision as its sense of sight – it has no inherent understanding of the real world. AI needs training, just as kids do: it learns to recognize something only after being shown it many times. For example, the alarm clock you saw this morning might have been partially hidden from view, yet you still knew it was an alarm clock – a feat machines must be trained to match.
Going by Zuckerberg’s words, it will take machines some time to get there:
“If we could build computers that could understand what's in an image and could tell a blind person who otherwise couldn't see that image, that would be pretty amazing as well. This is all within our reach and I hope we can deliver it in the next 10 years.”
Machines are on their way to revolutionizing almost everything around us. Are we ready yet?