2009/11/13

Notes on "How the Mind Works", Ch. 4: The Mind's Eye

In this chapter, Steven Pinker explores how vision turns retinal depictions into mental descriptions.

The definition of vision: a process that produces, from images of the external world, a description that is useful to the viewer and not cluttered with irrelevant information.
"Description" here means mental symbols and the mental propositions that capture the spatial relations among objects.

The correspondence problem in stereo vision: matching each mark in one eye's image with its counterpart in the other eye's image.

To solve this matching problem, the mind relies on three built-in assumptions about the world we evolved in, plus constraint satisfaction:
  1. every mark in the world is anchored to one position on one surface at one time.
  2. a dot in one eye should be matched with no more than one dot in the other.
  3. matter is cohesive and smooth.
Constraint-satisfaction technique: make many tentative guesses about which marks match, and let the guesses hash it out among themselves until a global solution emerges.
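
A toy sketch of that "hashing out" on a single row of dots may help. It is not the brain's algorithm or anything from the book: the function name `match_dots`, the `max_disparity` and `radius` parameters, and the scoring rule are all assumptions made for illustration. Every nearby left/right pair starts as a tentative match; matches at the same disparity support each other (assumption 3), and a match is dropped when a rival claiming the same dot has more support (assumption 2). Assumption 1 is not modelled.

```python
def match_dots(left, right, max_disparity=3, radius=2):
    """Return the surviving (left_pos, right_pos) matches between two 1-D dot rows.

    Toy relaxation scheme, assumed for illustration only.
    """
    # Tentative guesses: pair every left dot with every nearby right dot.
    active = {(l, r) for l in left for r in right if abs(r - l) <= max_disparity}

    def support(match, pool):
        l, r = match
        # Smoothness: nearby guesses with the same disparity back each other up.
        return sum(
            1
            for (l2, r2) in pool
            if (l2, r2) != match and r2 - l2 == r - l and abs(l2 - l) <= radius
        )

    while True:
        scores = {m: support(m, active) for m in active}
        survivors = set()
        for m in active:
            # Uniqueness: rivals are guesses that claim the same left or right dot.
            rivals = [n for n in active if n != m and (n[0] == m[0] or n[1] == m[1])]
            if all(scores[m] >= scores[n] for n in rivals):
                survivors.add(m)
        if survivors == active:  # nothing changed: a stable solution has emerged
            return sorted(active)
        active = survivors


print(match_dots([2, 4, 6], [3, 5, 7]))  # [(2, 3), (4, 5), (6, 7)]
```

With dots at 2, 4, 6 in one eye and 3, 5, 7 in the other, eight tentative pairings shrink to the three that share one disparity. Exact ties are left unresolved in this sketch, so ambiguous inputs can return more than one match per dot.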

Stereo vision is a mixture of nature and nurture:
  • different forms of stereoblindness suggest that it is genetically determined
  • stereo vision is not present at birth, and it can be permanently damaged in children or young animals if one of the eyes is temporarily deprived of input or disrupted by experience.
The visual system makes us see the most probable state of the world given the retinal images, using probability theory, specifically Bayes' theorem: assigning a probability to a hypothesis based on some evidence. That is, the odds favoring one hypothesis over another can be calculated from just two numbers for each hypothesis:
  1. prior probability: how confident are you in the hypothesis before you even look at the evidence? (That is, how likely the hypothesis is on its own.)
  2. likelihood: if the hypothesis were true, what is the probability that the evidence as you are seeing it now would have appeared?
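
As a quick worked example of the odds calculation, here is a minimal sketch. The function names and the numbers are made up for illustration and do not come from the book; `Fraction` is used only to keep the arithmetic exact.

```python
from fractions import Fraction


def posterior_odds(prior_a, likelihood_a, prior_b, likelihood_b):
    """Odds favoring hypothesis A over hypothesis B, given the evidence.

    Each hypothesis contributes prior * likelihood; the odds are the ratio.
    """
    return (prior_a * likelihood_a) / (prior_b * likelihood_b)


def odds_to_probability(odds):
    """Probability of A when A and B are the only two hypotheses."""
    return odds / (1 + odds)


# Hypothetical numbers: both hypotheses equally plausible beforehand,
# but the evidence is far more expected under A than under B.
odds = posterior_odds(
    prior_a=Fraction(1, 2), likelihood_a=Fraction(9, 10),
    prior_b=Fraction(1, 2), likelihood_b=Fraction(1, 100),
)
print(odds)                       # 90
print(odds_to_probability(odds))  # 90/91
```

With equal priors the odds are just the ratio of the likelihoods; a strong prior for B would pull the odds back toward B.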
How we see the world around us:
  1. We see only what is in front of our eyes; the world beyond the perimeter of the visual field and behind the head is known only in a vague, almost intellectual way.
  2. We see surfaces, not volumes.
  3. We see in perspective.
  4. We don't immediately see "objects".
  5. We see in two and a half dimensions. Depth is whimsically downgraded to half a dimension because it does not define the medium in which visual information is held (unlike the left-right and high-low dimensions)
To access visual information properly, reference frames are indispensable:
  • retina's frame: allows us to judge location by compensating for movements of the head and body.
  • world-aligned reference frame: allows us to judge the genuine angles and extents of the matter outside our skin.
  • the inner ear's frame: allows us to judge the direction of gravity.
Motion sickness is triggered by a mismatch between the signals from the retina's frame and the inner ear's frame.

Several theories used in shape recognition:
  1. geon theory (by Irv Biederman):
    • A geon is a simple geometric part of an object (like the protons and electrons that make up atoms).
    • Geons can be assembled into objects with a few attachment relations such as "above", "beside", and "parallel". (These relations are defined in a frame of reference centered on the object, not on the visual field.)
    • Geons are combinatorial, like grammar: an object is not the sum of its geons but depends on their spatial arrangement. The counts: 24 types of geon, 15 different sizes and builds, 81 ways to join them.
    • Left hemisphere: recognizes and imagines shapes defined by arrangements of geons.
    • Right hemisphere: measures whole shapes, such as which is taller or shorter, nearer or farther.
  2. multiple-view theory: people create a separate memory file for every orientation in which an object commonly appears.
  3. mental-rotation theory: people rotate shapes in their minds. When a shape appears at a new, unfamiliar orientation, the farther it has to be rotated to align with the nearest familiar view, the more time people take.
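
To get a feel for how fast the geon counts in item 1 multiply, here is a small sketch. The counting rule is an assumption of mine: each geon picks one type and one size/build, and each additional geon is attached in one of the 81 ways. The constant and function names are illustrative.

```python
GEON_TYPES = 24
SIZES_AND_BUILDS = 15
WAYS_TO_JOIN = 81


def count_objects(num_geons):
    """Number of distinct objects buildable from num_geons geons.

    Assumed rule: (types * sizes) choices per geon, and one join choice
    for each geon after the first.
    """
    parts = (GEON_TYPES * SIZES_AND_BUILDS) ** num_geons
    joins = WAYS_TO_JOIN ** (num_geons - 1)
    return parts * joins


print(count_objects(2))  # 10497600
print(count_objects(3))  # 306110016000
```

Under this rule, two geons already give about 10 million objects and three give about 306 billion, which is the sense in which a small vocabulary of parts behaves like a grammar.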
When to use which theory for shape recognition:
  • when a shape's sides are not too different, geon theory is used.
  • when the shape is more complicated, multiple-view theory is used.
  • When the shape appears at an unfamiliar orientation, mental-rotation theory is used.
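
The mental-rotation claim (more rotation to the nearest familiar view, more time) can be sketched as below. The linear relation and the `base_ms` and `ms_per_degree` values are placeholders I made up, not measured figures; only the "nearest familiar view" idea comes from the notes above.

```python
def rotation_needed(angle, familiar_views):
    """Smallest rotation, in degrees, from `angle` to the nearest familiar view."""
    def gap(a, b):
        d = abs(a - b) % 360
        return min(d, 360 - d)  # rotate whichever way is shorter

    return min(gap(angle, view) for view in familiar_views)


def predicted_time_ms(angle, familiar_views, base_ms=500, ms_per_degree=3):
    """Hypothetical recognition time: grows with the rotation needed.

    base_ms and ms_per_degree are illustrative placeholders.
    """
    return base_ms + ms_per_degree * rotation_needed(angle, familiar_views)


views = [0, 90]  # orientations assumed to be stored in memory
print(rotation_needed(350, views))    # 10
print(rotation_needed(200, views))    # 110
print(predicted_time_ms(350, views))  # 530
```

A shape seen at 350 degrees is only 10 degrees from the stored upright view, so it is predicted to be recognized faster than one at 200 degrees, which is 110 degrees from its nearest stored view.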
What mental images are for:
  • A mental image is a pattern in the 2½-D sketch that is loaded from long-term memory rather than from the eyes.
  • Mental imagery is the engine that drives our thinking about objects in space.
  • Mental imagery helps creative people to "see" the solution to a problem.
  • Images drive the emotions as well as the intellect.
Thinking in images engages the visual parts of the brain.
  • the Perky effect: holding a mental image interferes with seeing faint and fine visual details.
  • Mental images of lines can affect perception just as real lines do: they make it easier to judge alignment and can even induce visual illusions.
Thinking in images has some limitations:
  • Images are fragmentary: people cannot reconstruct an image of an entire visual scene.
  • images are slaves to the organization of memory.
  • images cannot serve as our concepts, nor can they serve as the meanings of words in the mental dictionary.
