Review of Kurzweil’s “How to Create a Mind”

Kurzweil has a solid reputation as an inventor of technically advanced products with very practical uses. He is also a famed futurist and a shrewd businessman who has without a doubt learned how to capitalize on, popularize, and monetize his own and others’ ideas and visions: some brilliant, some not so much, according to skeptics.

As the New Yorker recognized, Kurzweil’s critics have not always been kind. PZ Myers, a renowned biologist, once described him as a genius… and one of the greatest hucksters of our time. Doug Hofstadter, the Pulitzer Prize-winning author of “Gödel, Escher, Bach,” said reading one of Kurzweil’s books was like mixing together good food with dog excrement: ultimately you can’t tell the good from the bad.

The astute reader will be aware of the commercialization and hyperbole but not be dissuaded by them. Rather, I suggest you read to enjoy the broad strokes and general principles behind the ideas presented, and use them as a catalyst to explore the various elements he put together in an attempt to explain one of many possible approaches to achieving human-like artificial intelligence. That particular goal is only one of several possible paths to self-directed thinking, and perhaps consciousness and sentience, in a machine. See our Introduction to Artificial Intelligence for a brief overview of the various AI perspectives.

May Kurzweil’s collection of ideas inspire your imagination.


Kurzweil subscribes to the theory that artificial intelligence machines will soon equal the power of human thought, with all of its complexities and richness, and perhaps even outstrip it.

This rather broadly held theory is lent credence by two major turning points. In 1997, Garry Kasparov was beaten at chess by IBM’s Deep Blue, and in 2011, Watson, an artificial intelligence machine also built by IBM, beat Brad Rutter and Ken Jennings in a series of Jeopardy! matches. Kurzweil uses these two events to support the argument that the neural networks responsible for higher-level, hierarchical thinking (located in the Neocortex) actually follow simple principles that can be replicated, and that some of the more advanced AI systems, such as Siri (the iPhone’s voice recognition software) and the aforementioned Watson, already use this pattern recognition scheme in their installed “brain”.

Kurzweil explains that this pattern recognition scheme is naturally hierarchical: lower-level patterns that pick up minute inputs from the surroundings combine to trigger higher-level patterns representing more abstract categories, which must be taught. Information also moves both upwards and downwards, causing feedback between higher- and lower-order patterns. He calls this the Pattern Recognition Theory of Mind (PRTM). The theory resembles the design of our best AI machines, and Kurzweil contends that with a little tweaking it will make it possible to design computers that match human thought, with such features as identity, consciousness, and free will, by 2029. Such machines would eventually outstrip human capabilities, since they lack our biological limitations, as will be explained later. This advance, though, will also allow us to use technology to update our own neurochemistry in a merger Kurzweil calls the “singularity”.

It should be pointed out to the reader of this review that the Singularity has since taken on several definitions. As originally conceived, it simply meant the point at which machine intelligence surpasses human intelligence: machines have concepts and thoughts beyond our comprehension and develop even faster and smarter machines, further separating us from the new masterminds of the universe. See more on our treatment of that in the Human Extinction: Risks to Humanity section.


The ability to reason, analyze and prioritize enables mammals to think abstractly and predictively, so that we can process, manipulate and store information and use it to adapt to or change our surroundings based on what we have learned about them. This intelligence comes from the Neocortex, which evolution added to previously existing sections of the brain.


The Neocortex gives mammals like humans the ability to think hierarchically: to understand individual parts of larger groups, groups that themselves belong to much bigger groups, and so on. This helps us survive and thrive in two ways: it gives us a detailed and precise likeness of our surroundings, and it allows us to understand and adjust to those surroundings as our thoughts climb the levels of the hierarchy, becoming more abstract and complex. The lack of a Neocortex, some scientists believe, contributed to the extinction of the dinosaurs. The Neocortex differs in size and development among mammals; in humans it accounts for about 80% of the weight of the brain.

Neuroscientist Henry Markram of Switzerland deduced that the Neocortex can be reduced to a single thought process, hierarchical thinking, because of its uniform structure, which he observed in a study in which he scanned mammalian neocortices in search of neural assemblies. He indicated that the Neocortex appeared to be constructed of Lego-like collections of several dozen neurons in layers, connected to similarly structured super-assemblies, which connect to yet a higher layer of neuronal collections, and so on until the highest level represents the entire brain. He is now a Director at the Blue Brain Project, which is intent on recreating the complexities of the human brain, beginning with a trial on rats.


The Pattern Recognition Theory of Mind (PRTM)

The author, borrowing from others before him, says that each layer of neural assemblies acts as a pattern recognizer that finds hierarchically organized information in the surroundings, whether auditory, linguistic or of any other kind. The neural assemblies themselves are innate and pre-organized, but each level must be taught the specific information it recognizes. Human higher-level thinking uses some 300 million recognizers and writes all information into different levels of neural assemblies in our brains. For example, on a human face the mouth and nose are recorded in different neural assemblies from the entire face, so that even if some facial parts are absent, a face can still be recognized as long as enough of its parts are present to trigger a recognizer and send the information to the next level up.


Before a pattern recognizer at one hierarchical level fully triggers a recognizer at the next level up, it primes it. The primed higher-level recognizer then sends signals back down to the recognizers below it, preparing them to fire. For instance, if a person’s eye is detected, the recognizers for the face are primed, and these in turn signal the recognizers for the other parts of the face to expect their features. The author considers this predictive.

Pattern recognizers communicate with positive or negative signals that encourage or inhibit firing, depending on how likely a given pattern is to be present. These signals can come from both lower and higher conceptual levels.
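The recognizer behavior described above can be sketched in a few lines of Python. This is a toy illustration under our own assumptions, not Kurzweil’s implementation: the `Recognizer` class, the weights, the thresholds, and the size of the priming effect are all invented for the example, and only the positive (priming) signal is modeled.

```python
from dataclasses import dataclass


@dataclass
class Recognizer:
    """Toy pattern recognizer (hypothetical, for illustration only)."""
    name: str
    weights: dict            # input name -> importance weight (made-up values)
    threshold: float         # fraction of total weight needed to fire
    primed: bool = False     # True when a higher level expects this pattern

    def score(self, active):
        """Fraction of the pattern's weighted parts found in `active`."""
        total = sum(self.weights.values())
        seen = sum(w for part, w in self.weights.items() if part in active)
        return seen / total

    def fires(self, active):
        # A priming signal from above lowers the threshold (assumed amount: 0.2).
        effective = self.threshold - (0.2 if self.primed else 0.0)
        return self.score(active) >= effective


face = Recognizer("face", {"eye": 1.0, "nose": 1.0, "mouth": 1.0, "ear": 0.5}, 0.6)
mouth = Recognizer("mouth", {"upper_lip": 1.0, "lower_lip": 1.0}, 0.6)

# Bottom-up: enough parts are present, so the face fires despite a missing ear.
face_seen = face.fires({"eye", "nose", "mouth"})

# Top-down: half a mouth is not enough on its own...
mouth_unprimed = mouth.fires({"lower_lip"})
# ...but once the face recognizer primes it, the same evidence suffices.
mouth.primed = True
mouth_primed = mouth.fires({"lower_lip"})
```

A negative (inhibitory) signal could be modeled the same way, by raising the threshold instead of lowering it.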

Every new sensory scenario, or change in one, is detected by the brain and saved to a new pattern recognizer. Some patterns, like the different expressions of a relative, are saved many times over, while rarely used ones, like a face not seen for ages, are eventually replaced to save storage space. This replacement causes memory to fade slowly, to the extent that a face seen before is no longer remembered. Pattern recognizers have a redundancy factor of about 100 to 1, depending on importance (compare a relative’s face with one seen for the first time).

These examples do not even touch the high levels of abstraction that we reach routinely. According to the author we might not, for example, remember the reason for laughing yet remember that we did laugh. We must also note that these signals are sent at very high speeds, and that pattern recognizers fire across many faculties at any given time.

The reach and presence of the Pattern Recognition Scheme

Different mental capabilities of the Neocortex are spread across multiple parts of the brain, and other regions of the Neocortex can take over tasks normally assigned to a region that is damaged or missing from birth: brain cells in various locations can be “taught”, or rather can learn, to be multifunctional if necessary for survival. This is known as neural plasticity and has even been found in people with congenital defects.


Introducing Speech Recognition to Artificial Intelligence

As Kurzweil shows, advanced artificial intelligence machines and software programs already use the Neocortex processes described above.

When the author and other computer scientists first moved into the uncharted territory of artificial intelligence, they sought to solve problems using predefined intelligent solutions, programming problem types and their solutions into a computer to be applied to problems as they arose. Speech-to-text conversion was first tackled this way in the 1980s: digital patterns were recorded, and the program tried to match human voice input against them. But since enunciation and pronunciation differ between people of different nationalities or races, or even within one person as they age, this method quickly became impracticable; too many variations would be needed in the “answer” databank. Kurzweil then tried another technique, known as vector quantization, to reduce human speech to 1,024 representative points.
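For readers unfamiliar with vector quantization, here is a minimal sketch of the idea: each incoming feature vector is replaced by the index of the nearest entry in a fixed codebook. The four-entry, two-dimensional codebook and the sample frames below are invented for readability; the system described above used 1,024 points, and real speech features have many more dimensions.

```python
import math


def quantize(vector, codebook):
    """Return the index of the codebook entry nearest to `vector`."""
    return min(range(len(codebook)), key=lambda i: math.dist(vector, codebook[i]))


# Hypothetical codebook: four representative points in a 2-D feature space.
codebook = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]

# Hypothetical speech frames, already converted to feature vectors.
frames = [(0.1, 0.2), (0.9, 0.8), (0.2, 0.95)]

# Each frame is reduced to a single small integer.
codes = [quantize(frame, codebook) for frame in frames]
print(codes)  # [0, 3, 1]
```

The payoff is that a later stage only has to deal with a short sequence of integers rather than raw audio.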

He then modeled what goes on in a person’s brain while they speak and simulated it, so that the computer could identify new units of speech as well as variations in enunciation and pronunciation. For this he used a highly mathematical technique known as the Hidden Markov Model, which could “infer a hierarchy of states with connections and probabilities.”
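To give a feel for what a Hidden Markov Model does, the sketch below uses the standard Viterbi algorithm to find the most likely sequence of hidden states (here, two made-up phonemes) behind a sequence of observed codes. It is a flat, two-state HMM with invented probabilities, not the hierarchical model Kurzweil built, and the state names and numbers are assumptions for illustration only.

```python
def viterbi(obs, states, start, trans, emit):
    """Return (probability, path) of the most likely hidden-state sequence."""
    # best[s] = (probability of the best path ending in state s, that path)
    best = {s: (start[s] * emit[s][obs[0]], [s]) for s in states}
    for o in obs[1:]:
        new_best = {}
        for s in states:
            # Pick the predecessor state that makes arriving in s most likely.
            prob, path = max(
                ((best[p][0] * trans[p][s], best[p][1]) for p in states),
                key=lambda item: item[0],
            )
            new_best[s] = (prob * emit[s][o], path + [s])
        best = new_best
    return max(best.values(), key=lambda item: item[0])


# Hypothetical model: two phonemes emitting quantized sound codes 0, 1, 2.
states = ["ah", "ee"]
start = {"ah": 0.6, "ee": 0.4}
trans = {"ah": {"ah": 0.7, "ee": 0.3}, "ee": {"ah": 0.4, "ee": 0.6}}
emit = {"ah": {0: 0.5, 1: 0.4, 2: 0.1}, "ee": {0: 0.1, 1: 0.3, 2: 0.6}}

prob, path = viterbi([0, 1, 2], states, start, trans, emit)
print(path)  # ['ah', 'ah', 'ee']
```

The states are "hidden" because the program never observes the phonemes directly, only the sound codes they tend to produce.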

With this done, he needed to set the many unknown parameters of the model and its organizational hierarchy. To do so he borrowed from biological evolution, cross-breeding multiple “solution organisms” (genetic codes made up of many parameters) and allowing random mutations in their parameter values. Many rounds of cross-breeding were run, and in each the best resulting designs were set aside and used to set the parameters of the Hierarchical Hidden Markov Model (HHMM). The HHMM was trained with speech samples from people of different nationalities and races, with distinct accents, to learn “the likelihood that specific patterns of sound are found in each phoneme, how the phonemes influence one another, and the likely orders of phonemes.” In the end, the HHMM learned rules that were quite different from, and subtler than, the earlier hand-coded rules, and, more importantly, much more useful. In short, Kurzweil and his team combined HHMMs, to simulate the cortical organization that accompanies human learning, with a genetic algorithm, to simulate the biological evolution that gave rise to a particular cortical design. Both are self-organizing procedures. This approach became the cornerstone of subsequent speech recognition work and research, and is being used in other areas of AI such as speech synthesis and the understanding of natural language.
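The cross-breeding procedure described above is what is now called a genetic algorithm. The sketch below shows the general shape of one under our own assumptions: the function name `evolve`, the population size, the mutation rate, and above all the fitness function are hypothetical. In the real system fitness would be something like recognition accuracy; here it is simply closeness to a made-up target parameter vector.

```python
import random


def evolve(fitness, n_params, pop_size=30, generations=40, mutation_rate=0.1, seed=0):
    """Evolve parameter lists toward higher fitness; returns (best, history)."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-1, 1) for _ in range(n_params)] for _ in range(pop_size)]
    history = []  # best fitness seen at the start of each generation
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        history.append(fitness(pop[0]))
        survivors = pop[: pop_size // 2]  # the fitter half is kept unchanged
        children = []
        while len(survivors) + len(children) < pop_size:
            mum, dad = rng.sample(survivors, 2)
            # Crossover: each parameter is inherited from one parent or the other.
            child = [rng.choice(pair) for pair in zip(mum, dad)]
            # Mutation: occasionally nudge a parameter by a small random amount.
            child = [g + rng.gauss(0, 0.1) if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        pop = survivors + children
    best = max(pop, key=fitness)
    history.append(fitness(best))
    return best, history


# Stand-in fitness: negative squared distance to a hypothetical ideal setting.
target = [0.3, -0.5, 0.8]


def closeness(params):
    return -sum((a - b) ** 2 for a, b in zip(params, target))


best, history = evolve(closeness, n_params=3)
```

Because the fitter half of each generation survives unchanged, the best score recorded in `history` can never get worse from one generation to the next.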

The need for both self-organizing and pre-programmed systems

While self-organizing systems are generally more advanced than pre-programmed ones, Kurzweil says artificial intelligence machines incorporate both, especially because pre-programmed systems are much faster when handling familiar information and provide a good basis for the lower conceptual levels of the hierarchy. These two advantages enable the self-organizing system to learn much more quickly than it would on its own, and to be ready for practical use much sooner. Combining both makes for the most effective AI machine. After the self-organizing system has fully learned, the pre-programmed system is expected to be phased out.

Watson: The Most Advanced Machine in AI

According to Kurzweil, Watson is an AI machine that uses an “expert manager” called UIMA (Unstructured Information Management Architecture) to choose the correct subsystems for different situations and then, with “intelligence”, combine the outcomes (answers) of these systems. This method allows a subsystem to contribute to a resolution even though it may not deliver an actual answer to a given problem. The multi-processing also helps to gauge and build Watson’s confidence in its answers, expressed as a probability percentage. These percentages were on display during the Jeopardy! matches. Kurzweil says the human brain also uses this method when statistical inference is used to resolve multiple hypotheses.
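The general idea of combining several subsystems’ answers into one answer with a confidence percentage can be sketched as follows. This is not UIMA’s API or Watson’s actual scoring method; the module names, weights, and the simple weighted-sum rule are all assumptions made for illustration.

```python
from collections import defaultdict


def combine(candidates, module_weights):
    """Merge (module, answer, confidence) votes into (best_answer, confidence)."""
    scores = defaultdict(float)
    for module, answer, confidence in candidates:
        # Each module's vote counts in proportion to how much we trust it.
        scores[answer] += module_weights[module] * confidence
    total = sum(scores.values())
    if total == 0:
        return None, 0.0
    best_answer = max(scores, key=scores.get)
    return best_answer, scores[best_answer] / total


# Hypothetical subsystems and how much each is trusted.
module_weights = {"keyword_search": 1.0, "type_checker": 1.5, "date_matcher": 1.0}

# Hypothetical outputs for one question.
candidates = [
    ("keyword_search", "Lyon", 0.4),
    ("type_checker", "Paris", 0.7),
    ("date_matcher", "Paris", 0.6),
]

answer, confidence = combine(candidates, module_weights)
print(answer, f"{confidence:.0%}")  # Paris 80%
```

A system playing a quiz game could then decline to answer whenever the combined confidence falls below some chosen threshold.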

According to the author, Watson was designed around the complexities and richness of the Neocortex, although admittedly it is still some way from passing as an actual human. For example, it could not ace the famed Turing test, because it was never designed to pass it or to engage in intelligent conversation; rather, it was designed to compete at Jeopardy! and answer brief and not-so-complex questions. Kurzweil, though, believes that with a little tweaking Watson will perform those tasks, considering that many AI advances occurred before the complexities of the Neocortex were well researched.

Simulating the Human Brain

Multiple attempts, with varying degrees of success, have been made to accurately simulate the human brain, ably assisted by technologies including the scanning technology used to uncover the grid-like patterns of the Neocortex’s connections. There are a number of such technologies, including the latest MRI techniques, which are noninvasive.

Human Connectome

The National Institutes of Health, through its Human Connectome Project, has chosen to use this technology and expects to build a complete 3-D map of the human brain, with all its connections, by 2014.

The Blue Brain Project

The Blue Brain Project, on the other hand, aims to model and simulate the human brain, including the entire Neocortex as well as old-brain regions such as the cerebellum, amygdala, and hippocampus, by recording measurements of the ion channels, neurotransmitters, and enzymes that generate and regulate every neuron’s electrochemical activity. The project will use a patch-clamp robot, another scanning technology, in an automated system able to scan neural tissue at one micrometer of accuracy while avoiding the destruction of delicate membranes. In 2005, participants simulated one neuron, and in 2011 a neural mesocircuit of 100 neocortical columns. They target 10,000 neurons and a rat brain by 2014. Their current goal is a fully simulated human brain by 2023.

Educating the simulated brain

According to Kurzweil, the simulated brain cannot achieve human-level thinking unless it has the necessary content, and he describes multiple potential methods to fulfill this requirement. The most likely, he surmises, is one that simplifies molecular models by creating functional equivalents at different levels of detail, ranging from his own functional algorithmic method to simulations that are closer to full molecular simulations. His book goes into greater detail, but he guesstimates that it could speed the learning process 1,000-fold or more.

Technological acceleration

Kurzweil explains that his Law of Accelerating Returns (LOAR) is doubted by many because they don’t understand the difference between linear and exponential progressions: if forty linear steps equal 40 years, the same 40 steps on an exponential scale, doubling each time, would equal a whopping trillion years. Based on the historical evidence of exponential advancement, he predicts more complex advances are coming, merging biological and technical evolution techniques. He confidently speculates on the possibility of a machine having human consciousness, identity and free will, asserting that any sufficiently complex physical system will inevitably develop it. He cites man’s best friend, the canine, as an example of a non-human consciousness.
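The linear-versus-exponential comparison is easy to check for yourself. The snippet below assumes, as Kurzweil’s example does, that "exponential" means doubling at every step; the variable names are ours.

```python
steps = 40

# Linear: each step adds one unit, so 40 steps reach 40.
linear = steps * 1

# Exponential: each step doubles the previous total, so 40 steps reach 2**40.
exponential = 2 ** steps

print(linear)       # 40
print(exponential)  # 1099511627776, a little over one trillion
```

The gap keeps widening: ten more steps add only 10 to the linear total but multiply the exponential one by another factor of 1,024.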

Consciousness, Free Will and Identity?

He also argues, concerning free will, that there’s a likelihood that we humans don’t actually have it but just feel that we do, or alternatively that, like consciousness, it is an emergent property that arises at high levels of complexity. If these are true, then it is likely that a machine of human-level thinking would also have the same, or feel (have the perception) that it does. Kurzweil holds that identity is born of our sense of free will and experience. He extrapolates that a self-aware machine would naturally possess the same belief.

Beyond Human Intelligence

Kurzweil is also a proponent of the more advanced applications of AI. Synthetically producing a Neocortex and replacing our own biological one would enable the functioning of more than 300 million processors. A billion, perhaps? He points out that digital neurons can be made to link up wirelessly, a big advantage over human ones, which must be linked physically.

He also considers the possibility of adding bug-cleaning features to our brains, to remove or reduce problems such as conflicting lines of thought and inconsistent, colliding ideas. A module for detailed thinking could be designed to continually run background scans for inconsistencies among all existing ideas or patterns and update their compatibility with each other. Inconsistent ideas would then be reviewed or eliminated. With this and other such implants, we would alleviate the risk of AI machines ever outstripping us in intelligence. We could then take advantage of the singularity by incorporating the exponential advances into our own biology. By doing so we could dispel some fears of losing our identity, since the continuity of our body cells would change no more than it does now, when nature replenishes them for us.


It’s only fair to say we are in a race with ever-advancing technology. His far-future vision is the spread of our non-biological intelligence to the four corners of the universe, imposing our deliberate will directly upon its fate. If we are able to break the speed-of-light barrier, we could have a universal omnipresence within a few centuries. It is our destiny.

Certainly on that last conclusion this reviewer and this site agree.  Science fiction writers and far futurists have been coming to that conclusion for years as well. See our own 2003 essay on the distant future. It is in fact the only logical conclusion to an assumed eternal existence in the known universe (although we disagree with the assumed ubiquitous non-biological entity).

In any case, let us all hope the boundaries of reality continue to expand the unknown at least as fast as our ability to consume and understand it, lest we be caught in the forever loop of The End is Just the Beginning.

Further Reading

Future Human Extinction Risks

We debated whether or not to include a “Doomsday” section on the site. We are not alarmist. In the end, however, we recognize that to ignore the possibility of humanity’s demise is irresponsible for a website dedicated to its future.

Education and awareness can be a powerful force when used by citizens to effect change and action in individuals, businesses, and governments. Toward this goal we have established some basic content outlining and assessing some popularized potential threats to our continued well being as a species, with links so that you can research and learn about the topic more thoroughly.

Risks to Humanity: What is a Doomsday Scenario?

In the context of human evolution, doomsday is any event, condition, or process that results in civilization being unable to progress. In its most extreme form it literally means the extinction of the human race (or derivatives therefrom). There are several schools of thought on the nature of doomsday. In some of these, the phenomenon is projected to be a natural catastrophic event such as a meteor impact or a supervolcano eruption, or a man-made catastrophe like nuclear holocaust, runaway nanotechnology or a wayward AI. In others the “end” is perceived to be a bit more gradual, such as the destruction of our terrestrial ecology. Some theorize that the end to progressive evolution will occur even more slowly in the form of dysgenics, a reversal of the evolutionary process. Finally, there are individuals and organizations who perceive the realization of post-humanity to be the greatest threat of all. Below is a very brief introduction to these ideas with links to facilitate your further investigation.

Natural Disaster


“Natural disasters” cover a lot of territory: earthquakes, volcanic activity, hurricanes. These are harmful to a subset of humans but are not species-threatening. Catastrophic for all would be a major meteor or comet impact, bringing mass extinctions, runaway infernos, erratic climate fluctuations, and a devastating effect on human civilization. We place the risk as low because the likelihood of an impact prior to our adequate preparation is low. The privatization of space and the decreasing cost of launch will soon combine to make a positive response to such a threat possible.

Ecological Destruction


We rate as high the risk that our planet cannot sustain current levels of population growth and pollution. While there are a few environmentalists who propose we are in fine shape, the risk that we are not is too great for the issue to be ignored. We view overpopulation, dirty energy, and the “disposable” culture as major contributors to poor ecological health. We are not tree-huggers, but we advocate learning to live in balance with our environment before we spread “locust” behavior to other worlds.



Was penicillin one of mankind’s greatest discoveries, or will it ultimately destroy civilization? In the past 20 years, common bacteria including Staphylococcus aureus, Serratia marcescens and Enterococcus have developed resistance to various antibiotics such as vancomycin, as well as to whole classes of antibiotics, such as the aminoglycosides and cephalosporins. Antibiotic-resistant organisms have become an important cause of healthcare-associated (nosocomial) infections (HAI). Overuse of antibiotics, together with their extensive use in agriculture and farming, where they are absorbed by consumers in doses that make bacteria stronger rather than kill them, makes this a very important long-term risk to watch.



Developing and deploying technology comes with inherent risk, as it is about pushing the boundaries of the known. Within the scope of the technologies we watch on this site, Gray Goo (runaway, swarming nanobots) is a far-future worry, whereas Artificial Intelligence may pose a threat in the next 10 years. The behavior of an intelligence greater than our own is exactly that: unpredictable. I laughed uncomfortably when I first saw the site AI Will Kill our Grandchildren, years ago. Should we assume a benign and benevolent creation, or a cunning, baffling, and powerful intelligence that may, despite directives, out-think its makers in knowing what’s best across a reality that we can’t even fathom?

“Terrorist” Threats


Terrorists cause and intend to cause death, social disruption, and emotional pain to humanity. Whether it be rogue nations or a band of outlaws, their destructive powers are only limited by the weapons at their disposal. As technology makes greater destructive power easier to acquire, it may inevitably fall into hands bent on the destruction of civilization. Conventional weapons (including aircraft) and chemical compounds only cause local damage. Likely methods of mass destruction on a societal scale are biological agents and nuclear bombs. Nuclear detonations would only cause the destruction of civilization if accomplished in sufficient number. A well engineered and strategically released virus could spread like wildfire throughout the world. We rate this risk as medium because the technology exists or soon will, and there are people who want to use it.

•  Assessing the Threat of Mass-Casualty Bioterrorism
•  Biologic Terrorism Responding to the Threat



Dysgenics refers to the process of individuals with “poor” intelligence or characteristics producing disproportionately more progeny due to societal nullification of the natural selection process. The theoretical result is a steady genetic deterioration in human populations. One theory states that we will de-evolve genetically (losing intelligence and the ability to utilize technology) to maintain equilibrium with the environment. This, of course, is very unlikely, if for no other reason than the time scale required. Before this ever becomes a species-threatening issue we will either have allowed the destruction of civilization via some other method or have long since transcended the genetic bonds of biological recombination.

•  Dysgenics: Genetic Deterioration in Modern Populations
•  Dysgenics or Eugenics


Anti-Technology Political Activism


Where we do see risk is in the potential loss of political will to pursue human advancement technologies, brought about by:

1. an increasing population more interested in satisfying current needs at the expense of our future.
2. increased anti-technology political activism based on religious, moral, and ethical grounds.

These could impede technological progress long enough for one of the other risk factors to have a severe detrimental impact on the advancement of our civilization.


Post Humanity as the Ultimate Threat


What is this risk? It is that through changes in our physical being we change our spiritual nature, or our very “human-ness”. There are those who claim that altering anything about ourselves places us in jeopardy of becoming posthuman, or, worded more strongly, non-human. We are in the process of identifying Internet and other resources that present the case clearly.

As proponents of human advancement through technology, we rate this risk as low. And we do so for a number of reasons, some of which include:

  1. Given our vision of diversity for future human evolution, we believe that regardless of the many ways that technology will be applied to alter members and segments of our species, there will be a significant contingent who remain “natural”.
  2. Our ability to manipulate the genome is a natural act. It is at the core of what makes us human: the ability to develop and utilize technology to understand and control nature. Different segments of the population, special interest groups, and individuals will make unique contributions to the soon-to-be fluid constitution of our DNA. “Natural” selection, similar to today’s world, will be determined by social values: “good” modifications will be incorporated into the larger gene pool through both natural reproductive means and deliberate replication; modifications perceived as “bad” will not.

Naturally, equal protection under the law will play an integral part in allowing a genetically diversified culture to function effectively, as will equal access to the technology.

Disruptive Technologies: Driverless Cars

Most of us are familiar with the concept of autopilot, the self-maneuvering technology that provides relief to pilots by maintaining an aircraft on a preset course. A similar system in road vehicles enables cruise control. Most modern commercial merchant vessels and tankers are also heavily automated and can transport goods around the world with smaller, less specialized crews. These are all examples of ways in which transport has already benefitted from some degree of automation, and it is inevitable that we find ourselves entering an era of autonomous road vehicles.

Autonomous vehicles include self-driving cars and trucks that operate with little or no intervention from humans. While fully autonomous vehicles are still in the experimental stage, partly autonomous driving features are already being introduced in production vehicles. These features include automated braking and self-parking systems.

Figure: Google-Lexus driverless car

Numerous advances in technology have made autonomous vehicles a reality, and the development of machine vision has been particularly crucial. It involves the use of multiple sensors to capture images of the environment, which are then processed to extract relevant details, such as road signs and obstacles. 3D cameras enable accurate spatial measurements, while pattern recognition software allows identification of characters like numbers and symbols. Additionally, light detection and ranging (LIDAR) and advanced GPS technologies are being used to allow a vehicle to identify its location and navigate smoothly along road networks.

A self-driving vehicle is also equipped with artificial intelligence software that integrates data from sensors and machine vision to analyze the next best move. The decisions of the software are further modified based on traffic rules, such as red lights or speed limits. Actuators then receive driving instructions from the control engineering software, and the vehicle accelerates, brakes, or changes direction as needed.
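The sense, decide, and act sequence described above can be pictured as a small control step. The sketch below is purely illustrative: the `Perception` fields, the 10-meter safety distance, the cruise speed, and the function names are all assumptions of ours, and a real vehicle’s software is vastly more involved.

```python
from dataclasses import dataclass


@dataclass
class Perception:
    """What the (hypothetical) sensor and vision stage reports."""
    obstacle_distance_m: float
    speed_limit_kmh: float
    light_is_red: bool


def plan_target_speed(p, cruise_kmh=60.0):
    """Decide a target speed, then constrain it by traffic rules."""
    target = min(cruise_kmh, p.speed_limit_kmh)   # obey the speed limit
    if p.light_is_red or p.obstacle_distance_m < 10.0:
        target = 0.0                              # stop for lights and obstacles
    return target


def actuator_command(current_kmh, target_kmh):
    """Translate the decision into an instruction for the actuators."""
    if target_kmh > current_kmh:
        return "accelerate"
    if target_kmh < current_kmh:
        return "brake"
    return "hold"


# One pass through the loop: clear road, 40 km/h limit, car doing 30 km/h.
seen = Perception(obstacle_distance_m=50.0, speed_limit_kmh=40.0, light_is_red=False)
target = plan_target_speed(seen)
command = actuator_command(30.0, target)
print(target, command)  # 40.0 accelerate
```

In a real vehicle this step would repeat many times per second, with fresh sensor data on every pass.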

The introduction of automobiles revolutionized the world over the last century, and autonomous vehicles are expected to have just as tremendous an impact. One of the major advantages of autonomous vehicles will be their ability to prevent collisions. Elimination of drivers is expected to greatly reduce the injuries, deaths, and damages that result from driving accidents caused by human error. According to one estimate, self-driving technology can result in a 20 percent overall reduction in road accidents. By 2025, this could potentially save 140,000 lives annually [1].

Vehicular automation also has promising implications for fuel economy. The technology behind autonomous vehicles ensures supremely precise maneuvering that allows cars in a lane to safely drive within a narrow distance of each other. Such closely spaced vehicles experience lower aerodynamic drag, with a consequent reduction in fuel consumption [2]. Moreover, autonomous vehicles have acceleration and braking systems that are efficiently operated by the computer, further reducing fuel consumption. Automobiles are a notorious source of pollution; with the improved fuel economy of self-driven cars, it is estimated that CO2 emissions could be reduced by up to 100 million tons annually [1].

The trucking industry could also benefit from automated vehicles. Self-driving truck convoys would theoretically be able to make long haul trips without having to stop for the needs of a human driver. An autonomous trucking system has already been successfully tested by a Japanese research organization. The system consists of several radar-technology equipped trucks that are led by a single driver from the front. Similarly, the mining giant Rio Tinto has used partially autonomous trucks that stay on a predefined route and are able to unload cargo without personnel.

Fully autonomous cars are still in the testing stage and have been developed by several major automakers. Audi has produced a self-parking car that can also start and stop itself in heavy traffic [3]. Cadillac has built cars with advanced cruise-control systems that provide steering assistance. Mercedes-Benz is introducing the 2014 S-class that comes with multiple advanced autonomous driving features. The car can maintain speed and distance from other vehicles, and also has a lane-keeping system [4].

If the full potential of autonomous vehicles is to be realized in the future, governments will have to be decisive about supporting the technology. Automakers are already testing fully autonomous cars that are likely to become commercially available within a few years. Their ultimate appearance on the roads, however, will be dependent upon government regulations. New laws will have to be established to determine legal responsibilities, and roads may require investments to optimize them for self-driving vehicles.

Finally, as with any computerized machine, autonomous vehicles would be potential targets for hackers. Criminals gaining access to the automated navigation systems could inflict devastating harm, making it crucial for strong cyber security systems to be established before allowing self-driven transportation on the road.


[1] McKinsey Global Institute, Disruptive technologies: Advances that will transform life, business, and the global economy, May 2013.

[2] Kevin Bullis, How vehicle automation will cut fuel consumption, MIT Technology Review, October 24, 2011.

[3] Angus MacKenzie, The future is here: Piloted driving in Audi’s autonomous A6, MotorTrend, January 2013.

[4] Andrew English, New car tech: 2014 Mercedes-Benz S-class, Road & Track, November 2012.

Leaders in Artificial Intelligence – Google

A. Presentation of Google

Google Inc. is an American corporation [1] founded in 1998 by Larry Page and Sergey Brin. It is headquartered in Mountain View, CA, with more than 70 offices in the USA and 40 other countries around the world (e.g., Australia, Brazil, Canada, China, France, Germany, India, Ireland, Israel, Japan, Kenya, Russia and the United Kingdom).


Fig. 1: Google headquarters in Mountain View, CA.
Fig. 2: Research at Google (Video)

Google’s main mission is to collect data from companies and private computer servers, organize it and make it accessible to everyone through its search engine, widely recognized as the world’s largest. This mission requires a large amount of resources and sustained research (Fig. 2), development and innovation in computer science, artificial intelligence and other scientific fields. In its approach to R&D, known as the “Hybrid Research Model”, the company blurs the line between research and development activities and seeks the right balance between them at all levels. That is to say, research teams stay involved in engineering activities just as their engineering counterparts bring a research dimension to their own work. Google has a strong commitment to academic research, which it supports through grants, scholarships, faculty research awards, faculty training, curriculum development and outreach programs.

B. Research and Development at Google

R&D and innovation at Google span several areas of computer science and are driven by real-world data and experience. The goal is to create practical applications and bring a significant improvement in quality of service to Google’s millions of customers. In particular, Google’s contributions to the advancement of Artificial Intelligence are best known through advances in speech recognition, language translation, machine learning, market algorithms and computer vision. Of the more than $3 billion invested in R&D, a large share is allocated to AI. The best way to describe ongoing research at Google is through its most popular publications, applications and innovations, and the people who lead them. The following table gives a simple synopsis of research at Google in Artificial Intelligence theory and applications.

AI Field | Research activity
Machine Learning | Machine Perception
Machine Learning | Machine Translation
Data Mining | Multimedia data processing
Data Mining | AI-enabled visual search
Natural Language Understanding | Sentence parts prediction
Natural Language Processing | Speech recognition and processing
Natural Language Processing | Google Now voice recognition on Android [6]
Computer Vision | Media annotation

Table 1: Research activities in AI at Google

B-1.  AI applications and innovations at Google

By applying Machine Learning techniques to speech understanding, machine translation, and visual processing, Google researchers gather large volumes of evidence about unstable relationships within evolving interests. They then apply multiple learning algorithms to generalize from that evidence to new interests.

As its mission states, Google’s intention is to organize all types of media (image, video, sound) and make them accessible to everyone. To this end, it exposes computers to different kinds of media and has them perceive these media and build explanations from their perceptions. This process is called Machine Perception and is at the core of Google’s data-driven approach to problem solving.

Using computer vision technology, Google is very active in annotating media, measuring semantic similarity, synthesizing complex objects and browsing large collections of multimedia objects. Google also uses data mining techniques to process the multimedia contained in YouTube videos, Android, Google image search, StreetView and Google Earth. It translates raw text and audio within e-mail messages, books and Android using statistical translation techniques that improve over time and are independent of the natural language of the content.

Research in Natural Language Processing (NLP) at Google goes beyond the traditional boundaries of language-dependent, limited-domain, syntactic/semantic analysis to reach out to the vast amount of data on the Web in multiple human languages. On the syntactic as well as the semantic level, researchers at Google develop algorithms that predict the role each word should be assigned in a sentence and the relationships that bind the words together. In addition, NLP research is oriented towards multilingual, linear-time parsing algorithms that are able to handle large shifts in vocabulary.
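To make the idea of predicting a label for each word in a sentence more concrete, here is a minimal sketch of a most-frequent-tag baseline for part-of-speech tagging. It is not Google's method: the tiny training corpus, the tag names and the function names are all hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict

# Tiny hand-labelled corpus (hypothetical data, for illustration only).
TRAINING = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
    [("a", "DET"), ("dog", "NOUN"), ("runs", "VERB")],
]


def train(sentences):
    """Count how often each (lower-cased) word carries each tag."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        for word, tag_name in sentence:
            counts[word.lower()][tag_name] += 1
    return counts


def tag(words, counts, default="NOUN"):
    """Give each word its most frequent training tag; unseen words get the default."""
    tagged = []
    for word in words:
        key = word.lower()
        if key in counts:
            best_tag = counts[key].most_common(1)[0][0]
        else:
            best_tag = default
        tagged.append((word, best_tag))
    return tagged


model = train(TRAINING)
print(tag(["The", "cat", "runs"], model))
```

A baseline like this ignores context entirely; practical taggers also model the relationships between neighbouring words, which is the harder part the paragraph above alludes to.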


Fig. 3: Google Instant – Predicting part-of-speech tags with NLP techniques in Google search

In speech technology, Google is involved on two fronts: 1) making natural language a normal communication medium between humans and machines (computers, phones); 2) making any multimedia object (text, video, sound) searchable and accessible on the Web.

B-3.  Leading figures in AI research at Google

Ray Kurzweil [3] is an author, a famous inventor and a futurist who joined Google in December 2012 as a Director of Engineering. He has published several books on health, AI, transhumanism, the technological singularity, and futurism (e.g., The Age of Spiritual Machines, The Singularity Is Near). His work at Google focuses on “new technology development” as well as machine learning and language processing. Kurzweil’s ambition is to analyze the enormous amount of information collected through Google tools and deliver it through an intelligent personal assistant. He predicts that this assistant would listen to your phone conversations and read your e-mail in the background, then anticipate your needs and serve them before you even ask. Another of Kurzweil’s goals is to design, at Google, technology that really understands the meaning of any human language.

Peter Norvig [4] started at Google in 2001 as Director of Search Quality, responsible for the core web search algorithms until 2005. Then, as Director of Research, he oversaw the machine translation team and organized the efforts of the speech understanding groups. One of his particular interests is a system that can help humans find answers to questions that aren’t clearly defined. He is a Fellow of the American Association for Artificial Intelligence and the Association for Computing Machinery. Previously, he was the head of the Computational Sciences Division at NASA Ames Research Center, where he received the NASA Exceptional Achievement Award in 2001. He has authored over fifty publications, mainly in Artificial Intelligence.


Sebastian Thrun [5] is a research professor at Stanford University, co-founder of Udacity and a fellow at Google. He initiated the secretive Google X lab [7], which harbours dozens of projects such as the self-driving car [9] [10], speech recognition, object extraction from video, and Google Glass [8], an augmented-reality head-mounted device.




Fernando Pereira [11] is a Research Director at Google. His main research interests are in machine-learnable models of language and biological sequences. He is a Fellow of the American Association for Artificial Intelligence and holds several patents in AI. Pereira has over 100 research publications on computational linguistics, machine learning, bioinformatics, speech recognition, and logic programming.

C. Selected Google contacts

Google Inc. (headquarters)
1600 Amphitheatre Parkway
Mountain View, CA 94043
Phone: +1 650-253-0000

Google Shanghai
60F, Shanghai World Financial Center
100 Century Avenue
Shanghai 200120, China
Phone: +86-21-6133-7666

Google Paris
38 avenue de l’Opéra
75002 Paris
France
Phone: +33 (0)1 42 68 53 00

Google Moscow
7 Balchug st.
Moscow 115035
Russian Federation
Phone: +7 495-644-1400

Table 2: Google research – Some contacts around the world

D. Further readings

[1] Google. About Google Inc. Retrieved April 16, 2013.

[2] Google. Research at Google. Retrieved April 16, 2013.

[3] Invent Now, Inc. Inventor Profile: Ray Kurzweil. Retrieved April 16, 2013.

[4] Peter Norvig. Retrieved April 16, 2013.

[5] Sebastian Thrun. Home page at Stanford. Retrieved April 16, 2013.

[6] Google. Google Now presentation.

[7] Gaudin, Sharon (2011). Top-secret Google X lab rethinks the future. Computerworld. Retrieved April 16, 2013.

[8] Albanesius, Chloe (4, 2012). Google ‘Project Glass’ Replaces the Smartphone With Glasses. PC Magazine.

[9] Markoff, John (2010). Google Cars Drive Themselves, in Traffic. The New York Times. Retrieved April 16, 2013.

[10] Slosson, Mary (2012). Google gets first self-driven car license in Nevada. Reuters. Retrieved April 16, 2013.

[11] Fernando Pereira. Retrieved April 16, 2013.

Artificial Intelligence: Computer Vision

A.      Nature of Computer Vision

A. 1.  What is computer vision?

Computer Vision (CV) is the science of teaching a computer how to identify a physical object in its surroundings (Fig. 1). Its task is to capture an image, understand it, reconstruct it internally and create a meaningful and concise description. As a scientific and engineering field, Computer Vision [1] [2] strives to apply its theories, models and techniques to the construction of practical systems. Its ultimate aim is to imitate and improve on human visual perception. To this end, it draws on several fields such as image processing (imaging), AI (pattern recognition), mathematics (statistics, optimization, geometry), solid-state physics (image sensors, optics), neurobiology (biological vision) and signal processing. This article is about Computer Vision [3] and not Machine Vision (MV), which has a significantly different goal.


Fig. 1: MERTZ, an active vision head of a humanoid robot for learning in a social context at MIT.

Although MV and CV share some vocabulary, concepts and techniques, they have fundamentally different approaches and priorities (see Table 1). On the one hand, Computer Vision needs to capture 2D images of objects in a scene and apply elaborate algorithms to recreate an approximate 3D image of that scene. On the other hand, Machine Vision is interested only in 2D images of objects, whose salient features are extracted for discrimination purposes (identifying, recognizing, grading, sorting, counting). MV methods use hard-coded (embedded) software containing information about the scene.


Criterion | Computer Vision | Machine Vision
Hardware/software | Computers/software | Dedicated industrial hardware
Problem solving methods | Algorithms | In situ programming
Input data | File | Mechanical part
Output data | Signal for human being | Signal to control equipment
User interface | Simple graphical interface | Elaborate interface is critical
Knowledge of human vision | Strong influence | Fair influence
Quality criteria | Computational performance | Easy, cost effective, reliable
Financial support | Secondary | Critical

Table 1: Comparing Computer Vision and Machine Vision.

A. 2.  A short history of computer vision

Larry Roberts’s Ph.D. dissertation at MIT in 1963, “Machine Perception of Three-Dimensional Solids”, was a landmark contribution in that it laid the foundations of the field of computer vision. In his thesis, Roberts proposed the idea of extracting 3D geometrical information from related 2D views of blocks (polyhedra).

As it evolved [6], research in CV needed to tackle real-world problems in which edge detection and segmentation are focal points. David Marr proposed his bottom-up approach to scene understanding at MIT in 1972. It was a major milestone and is often regarded as the most influential contribution in CV.
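Edge detection, mentioned above as a focal point of the field, can be illustrated with a short sketch. The example below applies the Sobel operator, a classic gradient-based edge detector chosen here for simplicity (it is not Marr's method), to a small synthetic image. The function name and the test image are illustrative assumptions.

```python
def sobel_magnitude(image):
    """Return gradient magnitudes for a 2D grayscale image (border pixels stay 0.0)."""
    kx = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # responds to horizontal intensity change
    ky = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # responds to vertical intensity change
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = sum(kx[i][j] * image[r + i - 1][c + j - 1]
                     for i in range(3) for j in range(3))
            gy = sum(ky[i][j] * image[r + i - 1][c + j - 1]
                     for i in range(3) for j in range(3))
            out[r][c] = (gx * gx + gy * gy) ** 0.5
    return out


# Synthetic 5x6 image: dark left half (0), bright right half (10).
image = [[0, 0, 0, 10, 10, 10] for _ in range(5)]
edges = sobel_magnitude(image)
print(edges[2])  # large values only where dark meets bright
```

The response is zero in the flat regions and peaks in the two columns that straddle the dark-to-bright boundary, which is the behaviour an edge detector is expected to show.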

Here is a synopsis of fifty years of computer vision 1963-2013:

  • 1960s: Image processing and pattern recognition appear in AI.
  • 1970s: Horn, Koenderink and Longuet-Higgins make milestone contributions to image processing.
  • 1980s: Mathematics, probability and control theory are applied to Vision.
  • 1990s: Vision is augmented by computer graphics and statistical learning.
  • 2000s: Advances in visual recognition and major practical applications have a significant impact on Vision.

B.      R&D challenges in Computer Vision

Today a number of factors prevent CV [7] from reaching its full potential. First, its interdisciplinary nature (AI, computer science, mathematics, physics, biology) and unexpected growth have made it prone to dispersion and instability. Second, CV, often confused with machine vision, lacks name recognition and an image as a field in its own right; as a result, many researchers feel that their work is undervalued. Last, there seems to be a disconnect between academic research and industry development: with so little ground for cooperation, their needs, achievements and perspectives are not mutually understood.

In spite of steady advances in Computer Vision, R&D results have yet to match the visual capabilities of a young child. Still, with significant progress under way in Europe, Asia and America [9], there is ground for optimism about the future. The 2013 Robotics Roadmap report [10] to the U.S. Congress identified Robotics and Computer Vision as important future drivers of industry.

C.      Applications of Computer Vision

There are several applications of Computer Vision that we enjoy in our everyday lives. Some of these are movies, surveillance, face recognition & biometrics, road monitoring, autonomous driving, space exploration, remote sensing, agriculture and transportation. The following areas [5] are also well known active applications of CV.

1. Medical computer vision or medical image processing

Vision is mainly used in medicine to help with pathology, surgery and diagnosis.

2. Research:

There are several active CV research initiatives in the context of robotics, unmanned aerial vehicles (UAVs), autonomous vehicles (e.g., Mars exploration), drones and submersibles. One of the main applications of vision in mobile robotics is the challenging task of vehicle localization.

Computer Vision can help robots with:

  • Localization
  • Obstacle avoidance
  • Mapping (determining navigable terrain)
  • Object recognition (people and objects)
  • Learning to interact with objects

3. Machine Vision:

Although Machine Vision (MV), an engineering discipline, and Computer Vision (CV), a scientific discipline, sometimes overlap, they have different methods and goals. Machine Vision is concerned with using automated image analysis to support inspection and robot guidance in industry. CV techniques are borrowed and implemented in MV.

4. Process control (e.g., industrial robot activity):

In manufacturing, robots equipped with vision systems are commonly used to control the quality of manufactured goods. In agriculture, for example, CV techniques are used to classify rice by grain size during production.
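As a rough sketch of the grain-size classification step, the snippet below assumes an earlier vision stage has already measured each grain's length in pixels. The pixel-to-millimetre scale, the size thresholds and the function name are hypothetical values chosen for illustration, not figures from any cited system.

```python
def classify_grain(length_px, mm_per_px=0.05, short_max_mm=5.5, long_min_mm=6.6):
    """Map a measured grain length (in pixels) to a size class.

    All default values are illustrative assumptions: mm_per_px is the camera
    calibration scale, and the two thresholds separate short/medium/long grains.
    """
    length_mm = length_px * mm_per_px
    if length_mm < short_max_mm:
        return "short"
    if length_mm < long_min_mm:
        return "medium"
    return "long"


# Lengths (in pixels) as they might come out of a segmentation step.
measured = [100, 120, 140]
print([classify_grain(px) for px in measured])
```

In a production line the class label would typically drive a sorting actuator rather than be printed, in line with the "signal to control equipment" output listed for Machine Vision in Table 1.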

5. Event detection [8] (e.g., human/crowd surveillance or wildfire detection):

Crowd behavior is known to be complex and abstract, and footage of it often involves occlusions, changes in illumination and abnormal patterns. To help analyze these anomalies, many researchers use computer vision techniques in video surveillance. Computer vision is also used in forest fire detection to improve on the human-controlled fire detection rate: it helps render a 3D model from flat images of the fire captured as it evolves over time.
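For the fire detection case, one simple starting point is a per-pixel color test. The sketch below uses a common heuristic (flame pixels tend to be bright red with red > green > blue); it is not the method of [8], and the threshold, function names and sample pixels are hypothetical.

```python
def is_fire_pixel(r, g, b, red_min=180):
    """Heuristic flame test: strong red channel and red > green > blue."""
    return r > red_min and r > g > b


def fire_fraction(pixels, red_min=180):
    """Fraction of RGB pixels in a frame that pass the flame-color test."""
    if not pixels:
        return 0.0
    hits = sum(1 for (r, g, b) in pixels if is_fire_pixel(r, g, b, red_min))
    return hits / len(pixels)


# Four sample RGB pixels: two flame-like, one blue sky, one gray smoke.
frame = [(230, 120, 30), (250, 200, 60), (40, 90, 200), (120, 120, 120)]
print(fire_fraction(frame))
```

A color rule alone produces false alarms on sunsets or red objects, so a real detector would usually combine it with motion or temporal cues before raising an alert.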

6. Topographical modeling (e.g., landscape image reconstruction):

The recognition, classification and analysis of landscape elements such as buildings, roads, rivers, plantations, and railways require special skills that Computer Vision can provide. It uses shape recognition techniques to classify features and build reliable topographic data.

7. Automatic inspection in manufacturing applications:

Computer vision techniques (procedures and algorithms) have been implemented in a manufacturing environment for heavy automatic optical inspection of complex thin film metal patterns. These techniques can for example detect critical electrical defects.

8. Military applications: 
CV is heavily used in combat zones for monitoring and identifying enemy activities (see Fig. 2).


Fig. 2: SRI’s TerraSight® video processing and exploitation suite.

D.      References

[1] Linda Shapiro and G. Stockman (2001). Computer Vision. Prentice Hall. ISBN 0-13-030796-3.

[2] Tim Morris (2004). Computer Vision and Image Processing. Palgrave Macmillan. ISBN 0-333-99451-5.

[3] David A. Forsyth and Jean Ponce (2003). Computer Vision, A Modern Approach. Prentice Hall. ISBN 0-13-085198-1.

[4] Turek, Fred (June 2011). Machine Vision Fundamentals, How to Make Robots See. NASA Tech Briefs magazine 35 (6),  p. 60–62.

[5] Gérard Medioni and Sing Bing Kang (2004). Emerging Topics in Computer Vision. Prentice Hall. ISBN 0-13-101366-1.

[6] R. Fisher et al. (2005). Dictionary of Computer Vision and Image Processing. John Wiley. ISBN 0-470-01526-8.

[7] Azad, Pedram, T. Gockel, R. Dillmann (2008). Computer Vision – Principles and Practice. Elektor International Media BV. ISBN 0-905705-71-8.

[8] Çelik, Turgay (June 2008). Computer vision based fire detection in color images. Proceedings of the IEEE Conference on Soft Computing in Industrial Applications, 2008 (SMCia ’08), p. 258–263.

[9] David Lowe (current). The computer vision industry.

[10] GaTech, CMU, Robotics Tech. Consortium (March 2013). A Roadmap for U.S. Robotics, From Internet to Robotics.