Artificial Intelligence: Computer Vision

A.      Nature of Computer Vision

1.  What is computer vision?

Computer Vision (CV) is the science of teaching a computer to identify physical objects in its surroundings (Fig. 1). Its task is to capture an image, understand it, reconstruct it internally and produce a meaningful, concise description. As a scientific and engineering field, Computer Vision [1] [2] strives to apply its theories, models and techniques to the construction of practical systems. Its ultimate aim is to imitate and improve on human visual perception. To this end, it draws on several fields, such as image processing (imaging), AI (pattern recognition), mathematics (statistics, optimization, geometry), solid-state physics (image sensors, optics), neurobiology (biological vision) and signal processing. This article is about Computer Vision [3], not Machine Vision (MV), which has a significantly different goal.


Fig. 1: MERTZ, an active vision humanoid robot head for learning in a social context, at MIT.

Although MV and CV share some vocabulary, concepts and techniques, they have fundamentally different approaches and priorities (see Table 1). On the one hand, Computer Vision captures 2D images of objects in a scene and applies elaborate algorithms to recreate an approximate 3D representation of that scene. On the other hand, Machine Vision is interested only in 2D images of objects, whose salient features are extracted for discrimination purposes (identifying, recognizing, grading, sorting, counting). MV methods use hard-coded (embedded) software containing information about the scene.


                            | Computer Vision            | Machine Vision
Hardware/software           | Computers/software         | Dedicated industrial hardware
Problem solving methods     | Algorithms                 | In situ programming
Input data                  | File                       | Mechanical part
Output data                 | Signal for human being     | Signal to control equipment
User interface              | Simple graphical interface | Elaborate interface is critical
Knowledge of human vision   | Strong influence           | Fair influence
Quality criteria            | Computational performance  | Easy, cost effective, reliable
Financial support           | Secondary                  | Critical

Table 1: Comparing Computer Vision and Machine Vision.
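
To make the idea of recovering 3D information from 2D images more concrete, here is a minimal sketch of one classical approach, stereo triangulation. It is only an illustration, not a method named in this article: it assumes a rectified pair of images from two identical pinhole cameras, and the function names, parameters and numeric values are all hypothetical.

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Depth Z = f * B / d for a rectified stereo pair (assumed setup).

    focal_px     -- focal length expressed in pixels
    baseline_m   -- distance between the two camera centres, in metres
    disparity_px -- horizontal shift of the same point between both images
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


def back_project(u, v, depth_m, focal_px, cx, cy):
    """Turn pixel (u, v) with known depth into camera-frame (X, Y, Z).

    (cx, cy) is the principal point of the assumed pinhole camera.
    """
    x = (u - cx) * depth_m / focal_px
    y = (v - cy) * depth_m / focal_px
    return (x, y, depth_m)


# Illustrative values only: an 800 px focal length, a 0.5 m baseline
# and a 40 px disparity.
z = depth_from_disparity(800.0, 0.5, 40.0)
point = back_project(420, 240, z, 800.0, 320, 240)
print(z, point)
```

Repeating this for every matched pixel pair yields a cloud of 3D points, which is one simple form of the approximate 3D reconstruction described above.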

2.  A short history of computer vision

Larry Roberts’s 1963 Ph.D. dissertation at MIT, “Machine Perception of Three-Dimensional Solids”, was a landmark contribution in that it laid the foundations of the field of computer vision. In his thesis, Roberts proposed the idea of extracting 3D geometrical information from related 2D views of blocks (polyhedra).

As it evolved [6], research in CV needed to tackle real-world problems in which edge detection and segmentation are focal points. David Marr proposed his bottom-up approach to scene understanding at MIT in 1972. It was a major milestone and arguably the most influential contribution to CV.
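
For readers unfamiliar with edge detection, the sketch below shows one common textbook formulation, the Sobel gradient operator, applied to a tiny grey-level image stored as a list of lists. It is offered only as an illustration of the term; the article does not say which edge detectors were used historically, and the function names and threshold are hypothetical.

```python
def sobel_magnitude(img):
    """Gradient magnitude at each interior pixel of a 2D list of grey levels.

    Border pixels are left at 0.0 because the 3x3 window does not fit there.
    """
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Horizontal change: right column minus left column.
            gx = (img[y - 1][x + 1] + 2 * img[y][x + 1] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y][x - 1] - img[y + 1][x - 1])
            # Vertical change: bottom row minus top row.
            gy = (img[y + 1][x - 1] + 2 * img[y + 1][x] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y - 1][x] - img[y - 1][x + 1])
            out[y][x] = (gx * gx + gy * gy) ** 0.5
    return out


def edge_mask(img, threshold):
    """Mark pixels whose gradient magnitude reaches the chosen threshold."""
    return [[1 if m >= threshold else 0 for m in row]
            for row in sobel_magnitude(img)]


# A dark region (0) next to a bright region (100): one vertical edge.
image = [[0, 0, 0, 100, 100, 100] for _ in range(4)]
for row in edge_mask(image, threshold=200):
    print(row)
```

The pixels flagged by the mask lie along the boundary between the dark and bright regions; segmentation methods can then group pixels on either side of such boundaries.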

Here is a synopsis of fifty years of computer vision (1963-2013):

  • 1960s: Image processing and pattern recognition appear in AI.
  • 1970s: Horn, Koenderink and Longuet-Higgins make milestone contributions to image processing.
  • 1980s: Mathematics, probability and control theory are applied to Vision.
  • 1990s: Vision is augmented by computer graphics and statistical learning.
  • 2000s: Advances in visual recognition and major practical applications have a significant impact on Vision.

B.      R&D challenges in Computer Vision

Today, a number of factors prevent CV [7] from reaching its full potential. First, its interdisciplinary nature (AI, computer science, mathematics, physics, biology) and unexpected growth have made it subject to dispersion and instability. Second, CV is often confused with machine vision and lacks name recognition and an image as a field in its own right. As a result, many research initiatives are felt to be underestimated. Last, there seems to be a disconnect between academic research and industry development. With so little ground for cooperation, their needs, achievements and perspectives are not mutually understood.

In spite of steady advances in Computer Vision, R&D results have yet to match the visual capabilities of a young child. With significant progress under way in Europe, Asia and America [9], there are still grounds for optimism about the future. The 2013 Robotics Roadmap report [10] to the U.S. Congress identified Robotics and Computer Vision as important future drivers of industry.

C.      Applications of Computer Vision

We benefit from several applications of Computer Vision in our everyday lives. These include movies, surveillance, face recognition and biometrics, road monitoring, autonomous driving, space exploration, remote sensing, agriculture and transportation. The following areas [5] are also well-known, active applications of CV.

1. Medical computer vision or medical image processing:

Vision is mainly used in medicine to help with pathology, surgery and diagnosis.

2. Research:

There are several active CV research initiatives in robotics, unmanned aerial vehicles (UAVs), autonomous vehicles (Mars exploration), drones and submersibles. One of the main applications of vision in mobile robotics is the challenging task of vehicle localization.

Computer Vision can help robots with:

  • Localization
  • Obstacle avoidance
  • Mapping (determining navigable terrain)
  • Object recognition (people and objects)
  • Learning to interact with objects

3. Machine Vision:

Although Machine Vision (MV), an engineering discipline, and Computer Vision (CV), a scientific discipline, sometimes overlap, they have different methods and goals. Machine Vision is concerned with using automated image analysis for inspection and robot guidance in industry. MV borrows and implements CV techniques.

4. Process control (e.g. industrial robot activity):

In manufacturing, robots equipped with vision systems are commonly used to control the quality of manufactured goods. In agriculture, for example, CV techniques are being used to classify rice according to grain size during production.
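
As a rough illustration of size-based grading, the sketch below measures a grain in a binary image mask and assigns it a size class. Everything here is an assumption made for the example: the grain is taken to be already segmented and roughly aligned with the image axes, the pixel-to-millimetre scale is known, and the two thresholds are placeholder values rather than figures from this article.

```python
# Placeholder class boundaries in millimetres (assumed for this example).
SHORT_MAX_MM = 5.5
MEDIUM_MAX_MM = 6.6


def grain_length_px(mask):
    """Longest side, in pixels, of the bounding box around foreground pixels."""
    coords = [(y, x)
              for y, row in enumerate(mask)
              for x, value in enumerate(row) if value]
    if not coords:
        raise ValueError("mask contains no grain pixels")
    ys = [y for y, _ in coords]
    xs = [x for _, x in coords]
    return max(max(ys) - min(ys) + 1, max(xs) - min(xs) + 1)


def classify_grain(mask, mm_per_pixel):
    """Return 'short', 'medium' or 'long' from the measured grain length."""
    length_mm = grain_length_px(mask) * mm_per_pixel
    if length_mm <= SHORT_MAX_MM:
        return "short"
    if length_mm <= MEDIUM_MAX_MM:
        return "medium"
    return "long"


# A single grain, 7 pixels long, lying horizontally in a tiny mask.
grain = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0, 0, 0, 0],
]
print(classify_grain(grain, mm_per_pixel=1.0))
```

A production system would also have to separate touching grains and handle arbitrary orientations, for instance by measuring along each grain's principal axis instead of the axis-aligned bounding box used here.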

5. Event detection [8] (e.g. human/crowd surveillance or wildfire detection):

Crowd behavior is known to be complex and abstract, and crowd footage often contains occlusions, changes in illumination and abnormal patterns. To help analyze these anomalies, many researchers use computer vision techniques in video surveillance. Computer vision is also used in forest fire detection to improve on the human-controlled fire detection rate. It helps render a 3D model of the fire from flat images captured as the fire evolves over time.

6. Topographical modeling (e.g. landscape image reconstruction):

The recognition, classification and analysis of landscape elements such as buildings, roads, rivers, plantations, and railways require special skills that Computer Vision can provide. It uses shape recognition techniques to classify features and build reliable topographic data.

7. Automatic inspection in manufacturing applications:

Computer vision techniques (procedures and algorithms) have been implemented in manufacturing environments for automatic optical inspection of complex thin-film metal patterns. These techniques can, for example, detect critical electrical defects.

8. Military applications: 
CV is heavily used in combat zones for monitoring and identifying enemy activities (see Fig. 2).

Fig. 2: SRI’s TerraSight® video processing and exploitation suite.

D.      References

[1] Linda Shapiro and G. Stockman (2001). Computer Vision. Prentice Hall. ISBN 0-13-030796-3.

[2] Tim Morris (2004). Computer Vision and Image Processing. Palgrave Macmillan. ISBN 0-333-99451-5.

[3] David A. Forsyth and Jean Ponce (2003). Computer Vision, A Modern Approach. Prentice Hall. ISBN 0-13-085198-1.

[4] Turek, Fred (June 2011). Machine Vision Fundamentals, How to Make Robots See. NASA Tech Briefs magazine 35 (6), p. 60–62.

[5] Gérard Medioni and Sing Bing Kang (2004). Emerging Topics in Computer Vision. Prentice Hall. ISBN 0-13-101366-1.

[6] R. Fisher et al. (2005). Dictionary of Computer Vision and Image Processing. John Wiley. ISBN 0-470-01526-8.

[7] Azad, Pedram, T. Gockel, R. Dillmann (2008). Computer Vision – Principles and Practice. Elektor International Media BV. ISBN 0-905705-71-8.

[8] Çelik, Turgay (June 2008). Computer vision based fire detection in color images. Proceedings of the IEEE Conference on Soft Computing in Industrial Applications, 2008 (SMCia ’08), p. 258–263.

[9] David Lowe (current) The computer vision industry, URL =

[10] GaTech, CMU, Robotics Tech. Consortium (March 2013). A Roadmap for U.S. Robotics, From Internet to Robotics.