Core Concept Neuroscience and Psychology Published: June 29, 2023

How Do We Recognize the Same Object From Different Angles?

Abstract

Visual constancy is the crucial mental ability by which we recognize an object as the same even when it may look different. If we did not have visual constancy, we would not recognize the objects around us if we saw them from a different angle, in different lighting, or from a distance. Because of visual constancy, we know an animal is an elephant whether we see it from the front or from the side. Do you think a computer could do the same? Many types of smart computers, such as smartphones and self-driving cars, also use a version of visual constancy to function in the real world. Many cybersecurity tools, like CAPTCHA, take advantage of the fact that humans are much better than robots are at tasks requiring visual constancy. Visual constancy is a learned skill, and understanding it better helps us create smarter machines, including cell phones.

What Is Visual Constancy?

Look at Figure 1A. Which two images are of the same apple, viewed from two different angles, and which one is a different apple? Your brain can identify which is which even though the shape, lighting, and position of the apple are slightly different from one image to the next [1]. This is due to a brain ability called visual constancy. We use visual constancy to recognize objects from various viewpoints and angles (Figures 1B, C). Visual constancy also applies to changes in the size, lighting, and distance of an object.

Figure 1 - Examples of visual constancy.
(A) Which apple is different? If you guessed the middle apple, you are correct! (B) People can recognize that the same person is shown in these three images, even though the images are from different points of view. (C) When an object is far from the eye (top), it makes a tiny image in the eye. When the same object is closer (bottom), it makes a larger image in the eye [image credits: (A) Amazing Design: https://www.thingiverse.com/thing:347623; (B) Charles Van gestel: https://www.myminifactory.com/object/3d-print-albert-einstein-bust-142808; (C) Brittney Truong].

Types of Constancy

Visual constancy can be divided into various types depending on the visual cues and the differences involved, such as size, brightness, and shape. Size constancy involves recognizing an object as the same even when it appears to change size because of its distance from you. For instance, a dog walking down the street toward you (Figure 2A) looks like it is growing in size. But, thanks to size constancy, your brain knows the dog is just getting closer. Brightness constancy, also known as illumination constancy, is when you know that an object has not changed even though the lighting on it has. For example, when telling ghost stories, the storyteller may shine a flashlight on his face from underneath (Figure 2B), but you can still recognize him. Shape constancy is when you understand that the object is the same even if it appears to change shape, for instance, when it is seen from a different point of view (Figure 2C).
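The dog example can be worked out with a little geometry. The short Python sketch below is only an illustration under simple assumptions: the dog is 0.5 meters tall, the eye is treated as a single point, and the function names visual_angle_deg and real_height_m are made up for this example. It shows that the dog covers a much bigger angle at your eye when it is close, yet the dog's real height, worked out from that angle and the distance, stays the same.

```python
import math

def visual_angle_deg(height_m, distance_m):
    """Angle (in degrees) that an object of a given height covers at the eye."""
    return math.degrees(2 * math.atan(height_m / (2 * distance_m)))

def real_height_m(angle_deg, distance_m):
    """Work backward from the visual angle and the distance to the height."""
    return 2 * distance_m * math.tan(math.radians(angle_deg) / 2)

DOG_HEIGHT_M = 0.5  # assumed height, for illustration only

for distance_m in (20.0, 5.0, 2.0):  # the dog walks toward you
    angle = visual_angle_deg(DOG_HEIGHT_M, distance_m)
    height = real_height_m(angle, distance_m)
    print(f"{distance_m:5.1f} m away: angle {angle:5.2f} degrees, "
          f"worked-out height {height:.2f} m")
```

This is not a claim about how the brain actually does the job. It only shows that knowing both the size of the image and the distance is enough to keep the answer about the dog's real size constant.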

Figure 2 - Types of perceptual constancy.

Visual constancy has many names depending on the field of science and who is studying it. In computer science, researchers call it object invariance. Other scientists call it object constancy or perceptual constancy [2]. Perceptual constancy applies to all our senses (including vision, hearing, and touch), whereas visual constancy applies only to the sense of vision. Perceptual constancy helps us recognize that objects continue to exist even when they cannot be seen, heard, or felt [3].

Without Constancy, We Could Not Function

Perceptual constancy is important for daily living. This is because even when the objects around us stay the same, the information our brains receive about them keeps changing. If we did not have perceptual constancy, every aspect of our lives would be affected: we would fail to recognize our loved ones if we saw them from another angle, or we would not recognize them if they had a cold and sounded different. We would still be able to see, feel, and listen to objects or sounds, but we would not be able to understand them. The visual world around us would have no meaning because we could not recognize a chair or an apple if these objects moved within the room or if the lighting in the room changed. Therefore, without perceptual constancy, we would not be able to function in the real world at all!

Visual Constancy Is a Learned Skill

If you are reading this, you have visual constancy. But it is not a skill we are born with; it must be learned. How do we know we are not born with fully developed visual constancy? We know in part from people who are born blind and gain sight later in life through surgery. In his book An Anthropologist on Mars, the famous brain doctor Oliver Sacks tells the story of many such people. After gaining sight, these people had functioning eyes but could not understand what they were seeing. They had learned to read, recognize objects, and navigate using their other senses. Their other senses, such as hearing and touch, had perceptual constancy, but these people lacked visual constancy. Once their vision was restored, they had to learn visual constancy because their brains had not developed this skill. They could not recognize things that moved or were seen in different conditions because they had always relied on touch instead of sight.

Studies on healthy babies have also helped show that perceptual constancy is learned rather than inborn. When you play peekaboo with a newborn infant, they think you disappear when you hide behind your hands. The famous Swiss psychologist Jean Piaget (1896–1980) first showed that by the time babies are about 2 years old, they have a fully developed concept of object permanence, that is, the understanding that objects continue to exist even when they disappear from view. Subsequent research has shown that babies start developing object permanence about 3–5 months after birth.

We do not yet fully understand how the brain learns or implements visual constancy. We know that a brain region near each temple, called the fusiform gyrus, helps us recognize faces from many different angles, but we also know that the fusiform gyrus does this by working with regions in many other parts of the brain.

Computer scientists have some ideas about how the brain can learn visual constancy. One such idea is that the brain observes how the image of a given object in the eye changes over time. Imagine that each time you blink, you are taking a picture. Your brain stores these images and matches what you are currently seeing to what you saw just a moment ago. Your brain can make many such comparisons to understand how objects are changing in your field of view.
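The snapshot idea is related to what reference [1] calls temporal association. Here is a minimal Python sketch of it, not a model of the real brain. Each toy "snapshot" is a row of 8 pixels with a 3-pixel-wide object in it, and the names snapshot, overlap, and link_over_time are made up for this example. The sketch gives two snapshots the same object ID whenever they follow each other in time and look alike.

```python
def snapshot(position, width=8):
    """A toy 1-D picture: a 3-pixel-wide object starting at `position`."""
    return [1 if position <= i < position + 3 else 0 for i in range(width)]

def overlap(a, b):
    """Count the pixels that are lit in both snapshots."""
    return sum(x & y for x, y in zip(a, b))

def link_over_time(frames, min_overlap=2):
    """Give back-to-back snapshots the same object ID when they look alike."""
    if not frames:
        return []
    ids = [0]
    for previous, current in zip(frames, frames[1:]):
        looks_alike = overlap(previous, current) >= min_overlap
        # Keep the same ID if the picture barely changed; otherwise start a new one.
        ids.append(ids[-1] if looks_alike else ids[-1] + 1)
    return ids

# The object slides one pixel to the right between snapshots.
frames = [snapshot(p) for p in range(6)]
ids = link_over_time(frames)

print("shared pixels, first vs. last snapshot:", overlap(frames[0], frames[-1]))
print("object IDs over time:", ids)
```

The first and last snapshots share no pixels at all, so comparing them directly would suggest two different objects. Linking each snapshot to the one just before it lets the sketch treat all six as views of the same object.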

Smart Computers Need Constancy Too

Just as humans need constancy to operate in the real world, smart computers—such as smartphones or self-driving cars—also need constancy. Before they are trained to be smart, smart computers are like people who have regained their sight: they have functioning visual systems but cannot make sense of what they see. Computer scientists must teach these computers how to recognize the sameness of objects in the real world.

To train these computers to recognize an object, computer scientists give the computers many example pictures or videos of the object. For instance, to help a self-driving car recognize a stop sign, scientists provide the car’s computer with a large number of pictures of stop signs from many angles or distances, and under various lighting conditions. Thus, the car learns what a stop sign looks like in as many real-world conditions as possible. That is, the car acquires visual constancy for a stop sign. This process of machine learning is repeated to teach the car about the many other objects it is likely to encounter, such as other road signs, traffic lights, pedestrians, or other vehicles (Figures 3A, B). Learning the constancy of various objects also helps the computer tell objects apart from each other, such as a stop sign vs. a traffic light.
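The training idea can be tried out on a very small scale. The Python sketch below is a toy, not how a real self-driving car is built: the "pictures" are 6x6 grids, the two objects are a plus sign and a bar, and the program simply picks the stored example that looks most similar (a nearest-neighbor rule, chosen here for simplicity). All names, such as render, recognize, and accuracy, are made up for this example. One program is shown a single view of each object, and the other is shown many positions and brightness levels.

```python
GRID = 6  # each toy picture is a 6x6 grid of brightness values

# Two tiny 3x3 objects drawn with 1s: a plus sign and a vertical bar.
SHAPES = {
    "plus": ["010", "111", "010"],
    "bar": ["010", "010", "010"],
}

def render(name, dx, dy, brightness):
    """Draw a shape at offset (dx, dy) with the given brightness."""
    image = [[0.0] * GRID for _ in range(GRID)]
    for r, row in enumerate(SHAPES[name]):
        for c, ch in enumerate(row):
            if ch == "1":
                image[dy + r][dx + c] = brightness
    return image

def distance(a, b):
    """Sum of squared pixel differences between two pictures."""
    return sum((pa - pb) ** 2
               for row_a, row_b in zip(a, b)
               for pa, pb in zip(row_a, row_b))

def recognize(image, examples):
    """Return the label of the stored example that looks most similar."""
    return min(examples, key=lambda ex: distance(image, ex[1]))[0]

def make_examples(offsets, brightness_levels):
    """Build (label, picture) training pairs for every shape and condition."""
    return [(name, render(name, dx, dy, b))
            for name in SHAPES
            for dx, dy in offsets
            for b in brightness_levels]

def accuracy(examples, tests):
    """Fraction of test pictures that are labeled correctly."""
    correct = sum(recognize(image, examples) == name for name, image in tests)
    return correct / len(tests)

all_offsets = [(dx, dy) for dx in range(GRID - 2) for dy in range(GRID - 2)]
one_view = make_examples([(0, 0)], [1.0])                # a single example each
many_views = make_examples(all_offsets, [0.4, 0.7, 1.0])  # many conditions

# Test pictures use a brightness (0.5) that neither program saw in training.
tests = [(name, render(name, dx, dy, 0.5))
         for name in SHAPES
         for dx, dy in [(3, 2), (1, 3)]]

print("trained on one view:  ", accuracy(one_view, tests))
print("trained on many views:", accuracy(many_views, tests))
```

In this toy setup, the program trained on many views labels all four test pictures correctly, while the program trained on a single view gets only half of them right. Real systems use far more pictures and far more powerful learning methods, but the reason for showing them many conditions is the same.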

Figure 3 - Examples of machine learning.

Like our example of people taking snapshots in their minds, computers look for the important features of an object to identify it. The more points of view a computer has, the better it can identify objects. The same is true for humans. If we saw a statue, we would have an easier time recognizing it if we could walk around it and observe all its features. This seems obvious, but it was not until computer scientists began developing smart computers that brain scientists made this connection.

Using Perceptual Constancy to Outsmart Robots

Criminals can sometimes use smart computers to overwhelm a website. Computer scientists take advantage of perceptual constancy to block such misuse. If you have ever tried to access a website but ran into a barrier asking you to prove you are not a robot, you have encountered this defense. Such barriers are called CAPTCHA (Figure 3C). CAPTCHA stands for “completely automated public Turing test to tell computers and humans apart.” CAPTCHA works because every time you prove you are human, either by identifying the pictures it asks you to select or by typing out blurred, jumbled-up letters and numbers, you are demonstrating that you have better perceptual constancy than a computer!

Human responses to CAPTCHA tests are in turn used to train smart computers to better recognize images and text. As smart computers get better at recognizing images and text, the CAPTCHA tests must get harder, so that only humans can pass the tests. So, the next time CAPTCHA gives you street signs or roads to identify, you may be helping to improve self-driving cars. How cool is that?!

What Does It All Mean?

Scientists study the brain to learn more about how humans function and behave. Understanding how we use visual constancy helps us explain the ways that our brains make sense of the world around us. Without visual constancy, we would be unable to recognize and interact with the world, because we would have to relearn every face and object each time we encountered it. Further, understanding visual constancy could help to make computers smarter and more helpful, because they could better process what they see when making decisions. We can use what we know about the brain to make more intuitive computers that can work for and alongside humans. Computers that can learn and operate without constant human supervision can take over tasks that people do today, like driving, reading images, and monitoring.

The brain and computers have similarities. When scientists attempt to create better computers, they ask important questions about how our own brains work. Learning more about how our brains interact with the world teaches us about ourselves and how we can engineer systems that are almost as good as us at certain tasks!

Glossary

Visual Constancy: The mental ability to recognize a visual object as the same even when it looks different. Visual constancy is a type of perceptual constancy (see below).

Visual Cues: Specific features that people use to visually identify objects.

Perceptual Constancy: The mental ability to recognize an object using one or more senses as the same even when it may appear different.

Object Permanence: The learned mental faculty by which we recognize an object does not go away when it disappears from view.

Fusiform Gyrus: A brain region, located near the temple on either side of the head, that specializes in the perception of faces.

Smart Computers: Technology that can use experience to learn and improve performance.

Machine Learning: The science of creating and using smart computers. It is part of the broader field called artificial intelligence, and the two terms are sometimes used interchangeably.

Acknowledgments

BT, DM, AK, and BC were supported by an educational grant to JH from the Undergraduate Research Apprenticeship Program of the U.S. Army Educational Outreach Program (AEOP). Research in JH’s laboratory has been supported by the Army Research Office (ARO) grant #W911NF-15-1-0311 to JH.

Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.


References

[1] Backus, B., Bulthoff, H., Huebner, G., Langer, M., and Wallis, G. 2009. Learning illumination and orientation invariant representations of objects through temporal association. J. Vis. 9:6. doi: 10.1167/9.7.6

[2] Vuilleumier, P., Henson, R., Driver, J., and Dolan, R. 2002. Multiple levels of visual object constancy revealed by event-related fMRI of repetition priming. Nat. Neurosci. 5:nn839. doi: 10.1038/nn839

[3] Hegdé, J. 2018. Neural mechanisms of high-level vision. Compr. Physiol. 8:903–53. doi: 10.1002/cphy.c160035