Advancing Computer Vision: Ryerson Researchers Go Deeper with “Deep Learning”
Driverless cars. Smart medical imaging. Camera-equipped industrial robots. Facial recognition software. All of these rely on computer vision, which, as the name implies, involves having computers “see”. More specifically, it involves developing computer systems that can gain a high-level understanding from digital images or videos. In other words, researchers want to approximate and automate the vast and complex capabilities of human vision.
Adam Harley, who recently completed his master’s degree in Computer Science at Ryerson, says state-of-the-art approaches to computer vision are focusing on “deep learning”, where researchers investigate a slew of high-level tasks, including object recognition, image annotation, and scene understanding.
The potential applications for computer vision are truly mind-boggling. Yet object recognition remains a continuing challenge. As Harley explains, “‘Deep learning’ vision machines sometimes struggle in dealing with the context surrounding everyday objects. For example, if the machine is looking directly at a cat, it will likely understand the object is a cat, but if it shifts its gaze toward the tail of the cat, it may lose confidence in ‘cat’ and begin describing ‘floor’ or whatever is behind the cat.”
Harley’s master’s thesis, supervised by Kosta Derpanis, sought to advance the vision capabilities of deep learning machines. Harley says: “My thesis gives these machines an internal attention mechanism, improving their ability to deal with distracting or irrelevant information.”
He elaborates on his research: “There are two main parts to my work. The first idea is to teach the machine to relate pixels to one another, so that, for example, it knows that the pixels on the cat's tail are related to the pixels on the cat's body. The second idea is to use those pixel relationships as the basis for an attention mechanism. This way, wherever the machine looks, it has a rough idea of which pixels belong together, and it can use that information to quickly make foreground-background decisions. This helps the machine understand its visual field with better spatial accuracy.”
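The two ideas Harley describes can be illustrated with a toy sketch. The code below is not taken from the thesis. It assumes each pixel already has an embedding vector (hand-written here, learned in practice), treats a small distance between two embeddings as a sign that the pixels belong together, and turns those distances into soft attention weights over a local window. The function names, the L1 distance, the exponential weighting, and the `sharpness` parameter are all illustrative assumptions.

```python
import numpy as np

def attention_weights(embeddings, center, radius=1, sharpness=1.0):
    """Soft mask over the window around `center` (hypothetical helper).

    Pixels whose embeddings are close to the center pixel's embedding
    get weights near 1 before normalization; distant ones get weights near 0.
    """
    h, w, _ = embeddings.shape
    cy, cx = center
    # Clip the window to the image borders.
    y0, y1 = max(cy - radius, 0), min(cy + radius + 1, h)
    x0, x1 = max(cx - radius, 0), min(cx + radius + 1, w)
    window = embeddings[y0:y1, x0:x1]
    # Assumed similarity measure: L1 distance in embedding space.
    dist = np.abs(window - embeddings[cy, cx]).sum(axis=-1)
    weights = np.exp(-sharpness * dist)
    return weights / weights.sum(), (slice(y0, y1), slice(x0, x1))

def attend(features, embeddings, center, radius=1, sharpness=1.0):
    """Average the features in the window, weighted by the soft mask."""
    weights, window = attention_weights(embeddings, center, radius, sharpness)
    return (weights[..., None] * features[window]).sum(axis=(0, 1))

# Toy 4x4 scene: the left two columns are "cat", the right two are "floor".
embeddings = np.zeros((4, 4, 2))
embeddings[:, 2:] = 5.0            # floor pixels sit far from cat pixels
features = np.zeros((4, 4, 2))     # channel 0 = cat evidence, 1 = floor evidence
features[:, :2, 0] = 1.0
features[:, 2:, 1] = 1.0

center = (1, 1)                    # a cat pixel right next to the floor
plain = features[0:3, 0:3].mean(axis=(0, 1))    # unweighted 3x3 average
attended = attend(features, embeddings, center)
print("plain average:   ", plain)
print("with attention:  ", attended)
```

In this toy scene the unweighted average blends cat and floor evidence, because a third of the window lies on the floor, while the attention-weighted average stays almost entirely on “cat”. That is the foreground-background effect the quote describes, reduced to a few lines.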
Thanks to a Mitacs-Globalink research award, Harley was able to conduct a large part of his research at a computer vision lab at Inria, a French national research institute. “I was supervised in France by Professor Iasonas Kokkinos,” he says. Last May, Harley travelled to the International Conference on Learning Representations in San Juan, Puerto Rico, and presented his findings in a paper titled “Learning Dense Convolutional Embeddings for Semantic Segmentation.”
Now resettled in Pittsburgh, PA, Harley is completing his PhD at Carnegie Mellon University's (CMU) Robotics Institute on a full scholarship. He speaks of his time at Ryerson with enthusiasm. “My supervisor, Professor Kosta Derpanis, devoted an incredible amount of energy to my success. Even while I was an undergraduate, Kosta encouraged me to take on courses and projects that would build up foundational knowledge in computer vision and machine learning. His continued trust and support helped me navigate through many difficult aspects of the thesis.”
In addition to the Mitacs award, Harley’s research was funded by a Queen Elizabeth II Graduate Scholarship in Science and Technology.