Some stability properties of Stereo Vision

The images presented on this page demonstrate some of the surprising stabilities of stereo vision, and also some of its instabilities, which allow certain aspects of processing order in the visual system to be inferred. We begin with two practice images to accustom you to viewing stereo images in the 'mirror view' stereo pairs presented on this page.

A general explanation of these images and how to view them is found at The Multimedia library stereo-viewing tutorial. All the images on this page are set up to be viewed using the method described there. The random-dot stereograms presented below can be seen (though with some difficulty) in a cheap drugstore hand mirror, but are more easily seen using a good front-surface mirror. Such mirrors are readily available around many laboratories, and you can buy one for a few dollars more at many optical supply warehouses, such as MWK industries ($5.00) or Edmund Scientific (more).

The first image presented is a very simple practice image for getting used to stereo viewing. It consists of two pairs of three nested squares, with the inner squares on the right shifted several pixels right relative to the mirrored outer square to give them some stereo disparity. When you view these squares in the binocularly fused manner explained in the tutorial, the inner squares will appear to have moved in front of the surrounding square. Do not proceed further until you can comfortably fuse this first stereo pair in the manner explained in the tutorial, since the remaining images are a bit more difficult to see (though quite easy once you have gotten the knack by practicing for a few minutes using the first image).

The next practice image is a standard Julesz random-dot stereogram, created by taking two random-dot images which are mirror images of each other, and moving a subrectangle of the left-hand image a few pixels to the right to give it stereo disparity. Looking at these monocularly, one sees nothing but the random texture which they contain. But if you fuse these images binocularly using a mirror in the manner explained, the stereo disparity of the shifted subrectangle can be seen plainly as a flat rectangle that appears to stand forward of the similarly textured random-dot background. Julesz' famous observation is that this demonstrates that stereo vision sometimes operates, not after the formation of visual gestalt fragments such as edges and corners, but at a more primitive level.
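The construction just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used to produce the images on this page: the function name make_julesz_pair, the image size, the rectangle position, the 3-pixel shift, and the choice to refill the uncovered strip with fresh random dots are all hypothetical.

```python
import random

def make_julesz_pair(width=64, height=64, rect=(16, 16, 32, 32), shift=3, seed=0):
    """Return (left, right) half-images as lists of rows of 0/1 dots.

    rect is (x0, y0, w, h) in left-image coordinates; all defaults are
    illustrative assumptions.
    """
    rng = random.Random(seed)
    right = [[rng.randint(0, 1) for _ in range(width)] for _ in range(height)]
    # The left half starts as the mirror image of the right half (mirror-view layout).
    left = [row[::-1] for row in right]
    x0, y0, w, h = rect
    for y in range(y0, y0 + h):
        patch = left[y][x0:x0 + w]
        # Move the subrectangle `shift` pixels to the right to give it disparity...
        left[y][x0 + shift:x0 + shift + w] = patch
        # ...and refill the strip it uncovered with fresh random dots (an assumption).
        left[y][x0:x0 + shift] = [rng.randint(0, 1) for _ in range(shift)]
    return left, right

left, right = make_julesz_pair()
```

Outside the rectangle the two halves remain exact mirror images; only the shifted patch carries disparity, so neither half-image viewed alone contains any visible rectangle.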

Intensity independence of random-dot stereograms

Our next image pair shows the same random-dot stereogram as before, but with the left-hand image darkened by blending it with a 60% opaque black overlay, and the right-hand image correspondingly lightened. The raised stereo square is still easily visible, showing that the neural stereo detection mechanism responds not to absolute intensities but to some signal emphasizing intensity differences. Below this we show pairs formed in the same way, but using 70% and 80% blends. A stereo square is still seen at the 70% blend level (though it may take a few seconds to become visible), but disappears at the 80% level.

60% blend

70% blend

80% blend
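The effect of these blends on the two half-images can be sketched as follows. We assume here that 'correspondingly lightened' means blending with a white overlay of the same opacity, and that intensities run from 0.0 (black) to 1.0 (white); the function names are hypothetical.

```python
def blend_with_overlay(image, overlay_level, opacity):
    """Alpha-blend every pixel (0.0 = black, 1.0 = white) with a uniform overlay."""
    return [[(1.0 - opacity) * p + opacity * overlay_level for p in row]
            for row in image]

def contrast(image):
    """Difference between the brightest and darkest pixel."""
    flat = [p for row in image for p in row]
    return max(flat) - min(flat)

# A tiny black-and-white test pattern standing in for a random-dot half-image.
pattern = [[0.0, 1.0], [1.0, 0.0]]

for opacity in (0.6, 0.7, 0.8):
    darkened = blend_with_overlay(pattern, 0.0, opacity)   # left half: black overlay
    lightened = blend_with_overlay(pattern, 1.0, opacity)  # right half: white overlay (assumed)
    print(opacity, contrast(darkened), contrast(lightened))
```

Under these assumptions each half keeps a contrast of 1 minus the opacity, and from a 50% blend upward the intensity ranges of the two halves no longer overlap at all, so any matching must rest on intensity differences within each half rather than on absolute levels.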

Our next image continues the theme of intensity independence in stereo perception by presenting a random-dot stereo pair in which black and white dots are seen on the right, while the matching dots on the left are given random grey levels, with dots that would otherwise be white (resp. black) being given random light (resp. dark) greys. The raised stereo rectangle is still seen clearly.
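A minimal sketch of this grey-level randomization follows. The grey ranges (0.0 to 0.4 for dark, 0.6 to 1.0 for light) and the function name are assumptions chosen for illustration; the page does not state the ranges actually used.

```python
import random

def randomize_greys(binary_image, seed=0,
                    dark_range=(0.0, 0.4), light_range=(0.6, 1.0)):
    """Replace each 0/1 dot by a random grey that keeps its dark/light character.

    The two ranges are illustrative assumptions, not values from the original images.
    """
    rng = random.Random(seed)
    return [[rng.uniform(*(light_range if dot else dark_range)) for dot in row]
            for row in binary_image]

left_binary = [[0, 1, 1, 0], [1, 0, 0, 1]]
left_grey = randomize_greys(left_binary)
```

Every dot changes its exact intensity, but none crosses from the dark side to the light side or back, which is the property the image is meant to isolate.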

Surprisingly, stereo perception of thickly textured images (such as random-dot stereograms) is independent of the finest details of the texture viewed. The next image demonstrates this by blending the right-hand half of the stereo pair with itself after a two-pixel right shift. This leaves the stereo rectangle clearly visible even though the details of the right-hand image are greatly changed.
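The self-blend can be sketched as below. The equal 50/50 weighting and the wrap-around treatment of the image border are assumptions made to keep the example short, and the function name is hypothetical.

```python
def blend_with_shifted_self(image, shift=2, weight=0.5):
    """Blend each row with a copy of itself moved `shift` pixels to the right.

    Pixels pushed off the right edge wrap around to the left (an assumption);
    `shift` must be a positive integer smaller than the row length.
    """
    out = []
    for row in image:
        shifted = row[-shift:] + row[:-shift]  # shifted[i] == row[i - shift]
        out.append([weight * a + (1.0 - weight) * b for a, b in zip(row, shifted)])
    return out

row = [[0, 1, 0, 0, 1, 1]]
blended = blend_with_shifted_self(row)
```

The result contains intermediate greys wherever a dot overlaps a dot of the opposite color, so the fine texture of the right half no longer agrees pixel for pixel with the left half.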

Early separation of 'dark dots' from 'light dots'

The next three images all contain the same random-dot stereo pair, in which the originally black dots have been given a dark (resp. light) grey color on the left (resp. right). They differ only in the grey level of the background seen between the dots. In the first (resp. last) image the dots on both sides are darker (resp. lighter) than the background and the raised stereo rectangle is seen clearly. In the second image, in which the background has a grey level intermediate between that of the right-hand and the left-hand dots, the dots on the left are darker than the background, and so are perceived as 'dark spots', but those on the right are lighter, and so are perceived as 'light spots'. Since the stereo vision system resists combination of dark with light spots, no stereo perception emerges in this case.

This and many related perceptual phenomena suggest that 'light spots' are separated from 'dark spots' very early in the visual processing stream and that spots of the different characters are handled separately by many subsequent processing steps.

Light background

Intermediate background

Dark background
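The logic of these three cases reduces to the sign of each dot's contrast against its background, which can be written out as a small sketch. The particular grey values (0.3 for the left dots, 0.7 for the right dots, and 0.9, 0.5, 0.1 for the three backgrounds) are hypothetical stand-ins, as are the function names.

```python
def dot_polarity(dot_grey, background_grey):
    """Classify a dot as a 'dark' or 'light' spot relative to its background."""
    if dot_grey < background_grey:
        return "dark"
    if dot_grey > background_grey:
        return "light"
    return "none"

def polarities_match(left_dot, right_dot, background):
    """True when both eyes see the dot with the same dark/light character."""
    return dot_polarity(left_dot, background) == dot_polarity(right_dot, background)

LEFT_DOT, RIGHT_DOT = 0.3, 0.7  # assumed grey levels (0.0 = black, 1.0 = white)
for name, background in (("light", 0.9), ("intermediate", 0.5), ("dark", 0.1)):
    print(name, polarities_match(LEFT_DOT, RIGHT_DOT, background))
```

Only the intermediate background gives the two eyes dots of opposite character, which is the one case in which the stereo rectangle is not seen.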

More evidence of detail-independence in perception of random-dot stereograms

Our next four images provide additional evidence of the detail-independence of random-dot stereogram perception. The first two images show a Julesz pair whose left halves have been overlaid with a fine vertical grid of black and white lines respectively. The raised stereo rectangle remains easily visible. In the third image the left half has been overlaid with a black grid and the right with a matching white grid; the raised stereo rectangle remains visible. In the final image the right-hand grid has been shifted one pixel to the right, so that the part of the random-dot pattern remaining visible on the left no longer matches that visible on the right. This disrupts perception of the stereo rectangle, which however is still seen more weakly as a rectangle of incoherent 3-D texture distinguishable from its flat stereo surround.
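Why a one-pixel shift of the grid matters can be seen from a small sketch. We assume here a grid of one-pixel lines on every second column, and we ignore the mirror reversal between the two halves for simplicity; both the period and the function names are hypothetical.

```python
def overlay_grid(image, line_value, period=2, offset=0):
    """Paint vertical grid lines of `line_value` on columns where (x - offset) % period == 0."""
    return [[line_value if (x - offset) % period == 0 else p
             for x, p in enumerate(row)]
            for row in image]

def visible_columns(width, period=2, offset=0):
    """Columns of the underlying dot pattern that the grid leaves uncovered."""
    return {x for x in range(width) if (x - offset) % period != 0}

width = 8
matching = visible_columns(width) & visible_columns(width)           # same grid on both halves
shifted = visible_columns(width) & visible_columns(width, offset=1)  # right grid moved one pixel
print(sorted(matching), sorted(shifted))
```

With matching grids the same columns of dots survive in both halves, whatever the color of the lines. With the shifted grid, under the assumed period of two, no column of dots is visible to both eyes, so the surviving dots carry no exact pixel-level match.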

We give five more images which continue this theme. The first image erases a randomly selected 50% of the black pixels of the left half of our Julesz pair to white. The second is similar, but erases to black. The third image blends the left half with an independent random-dot pattern. The fourth is much the same as the first, but in color. The raised stereo rectangle is easily visible in all four cases. But in the fifth image below, we use a random pattern to invert bits on the left. Since this interchanges black and white, the stereo pattern disappears.
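The difference between random erasure and random inversion can be sketched as follows, taking 0 for black and 1 for white. The function names are hypothetical, and selecting pixels independently with probability 0.5 is an assumption about how the '50%' was chosen.

```python
import random

def erase_random(image, target, fraction=0.5, seed=0):
    """Set a random fraction of the pixels that differ from `target` to `target`."""
    rng = random.Random(seed)
    return [[target if p != target and rng.random() < fraction else p for p in row]
            for row in image]

def invert_random(image, fraction=0.5, seed=0):
    """Flip a random fraction of all pixels (black becomes white and vice versa)."""
    rng = random.Random(seed)
    return [[1 - p if rng.random() < fraction else p for p in row]
            for row in image]

def match_fraction(a, b):
    """Fraction of pixel positions at which two equally sized images agree."""
    pairs = [(x, y) for ra, rb in zip(a, b) for x, y in zip(ra, rb)]
    return sum(x == y for x, y in pairs) / len(pairs)

half = [[0, 1, 0, 1], [1, 0, 0, 1]]
erased_to_white = erase_random(half, target=1)
erased_to_black = erase_random(half, target=0)
inverted = invert_random(half)
```

Erasure only moves pixels in one direction, so every pixel that keeps its original color still matches its partner in the other half. Inversion at a 50% rate leaves each pixel equally likely to agree or disagree with the original, so on average the two patterns are uncorrelated.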

Since the overlay used to erase in the first three of the five preceding images is combined in a completely symmetrical way with the image under it, the resulting half-images must be as prone to generate a stereo perception when combined binocularly with the masking image as with the image it masks. Our next three images show this to be true by juxtaposing these three left hand half images with the reflected mask. It will be seen that the mask contains a raised stereo circle, distinguishing it from the image being masked, which contains a raised rectangle.

Blending of raised and lowered random-dot stereograms

If a random-dot stereo pair containing a raised stereo rectangle is reversed right-to-left, the stereo perception will also reverse and the raised rectangle will be seen as a recessed 'hole'. Two such images, one containing a raised and the other a recessed rectangle, can be blended by blending their intensities pixel-by-pixel. Depending on the proportions of the blend, one sees either a raised rectangle, or a recessed rectangle, or an intermediate configuration in which either a raised or a recessed rectangle is seen, perhaps transparently, in front of or behind a random-dot background which may also exhibit transparency in the area occupied by the rectangle. The next four stereo pairs show these effects. The first is a 29% lowered, 71% raised blend in which a raised, partly transparent rectangle is visible with a flat textured background behind it. The last shows an opposite extreme; it is a 60% lowered, 40% raised blend in which a recessed rectangle is visible behind a flat, partly transparent background. The next-to-last case (the 50/50 blend) is like the last, but the foreground seems substantially less transparent and the recessed surface less distinct. In the remaining case (the 40% lowered, 60% raised blend), no distinct stereo perception emerges.

The perceptions generated by the first and last two of these images, and in particular the fact that several depth levels seem to be seen more or less simultaneously in their central regions, suggest the presence in the visual system of at least two populations of cells which react preferentially to distinct stereo disparities. The fact that no raised or lowered surface is perceived in the intermediate case suggests the presence of mutual inhibition between these cell populations.

Raised/lowered blend with 71% raised image

Raised/lowered blend with 60% raised image

Raised/lowered blend with 50% raised image

Raised/lowered blend with 40% raised image
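The pixel-by-pixel blend used for these four pairs can be sketched as a weighted average of intensities. The function names and the tiny 2x2 half-images are hypothetical; the sketch assumes the raised and lowered pairs have already been built and have the same dimensions.

```python
def blend_images(raised, lowered, raised_fraction):
    """Weighted pixel-by-pixel average of two equally sized intensity images."""
    return [[raised_fraction * r + (1.0 - raised_fraction) * l
             for r, l in zip(raised_row, lowered_row)]
            for raised_row, lowered_row in zip(raised, lowered)]

def blend_stereo_pairs(raised_pair, lowered_pair, raised_fraction):
    """Blend left with left and right with right; each pair is (left, right)."""
    return tuple(blend_images(r, l, raised_fraction)
                 for r, l in zip(raised_pair, lowered_pair))

# Tiny stand-in pairs; real inputs would be full random-dot half-images.
raised_pair = ([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
lowered_pair = ([[0.0, 0.0], [1.0, 1.0]], [[1.0, 1.0], [0.0, 0.0]])

for raised_fraction in (0.71, 0.60, 0.50, 0.40):
    left, right = blend_stereo_pairs(raised_pair, lowered_pair, raised_fraction)
```

Each blended half-image then contains both dot patterns at once, with relative contrasts set by the blend proportions, so both disparities are simultaneously available to the stereo system.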

Size and direction-dependence of the stereo fusion field

Pixels of uniform dark/light character grouped coherently enough to form perceptible edges will stimulate the direction-sensitive cells of the visual cortex, and some phenomena of perception suggest that the edges generated in this way tend to extend themselves in their tangent direction. The illusory rectangle seen in the well-known 'Kanizsa' figure shown below suggests this idea. The variant stereo version seen beneath it, which seems to show a Kanizsa rectangle moved forward in 3-D, suggests that (if they exist) these extensions are used as inputs by the visual system's stereo perception mechanisms.

Here is another Kanizsa figure whose illusory 3-D edges are even more striking.

The four figures which follow collect additional evidence for the statement that oriented edges generate (invisible) visual stereo inputs which extend preferentially in the edge direction. The first two of these match dashed vertical edges to other dashed edges which are displaced too far vertically from them for stereo fusion to result. Nevertheless, clear stereo perceptions result: an advanced edge in the first case and a receding edge in the second. The third image reverses the left-hand dashes from black to white, and is seen to disrupt the stereo effect substantially or fully.

The next three images continue this same theme. The first of them merely repeats our prior forward-standing dashed figure to facilitate comparison with the figure which follows, which is the same except that the dashes have been turned 90 degrees to make them horizontal. It will be seen that this weakens the 3-D effect; this can be confirmed by comparing the last two figures, which are not as clearly 'forward' and recessed as our two earlier figures in which the dashes are vertical.

These observations conform to our suggestion that oriented edges generate (invisible) visual stereo inputs which extend preferentially in the edge direction, and suggest further that stereo processing remains largely light/dark segregated even when invisible edge extensions are being used as inputs.

The transition to symbolically represented image elements

Although much anatomical and neurophysiological evidence shows that the earliest stages of the visual system are retinotopically organized and so presumably handle information in some modified image format (for example, as a family of multiple streaming images in each of which some significant image property is enhanced, such as image edges, edge directions, or corners), there must come stages at which these formats give way to more symbolic representations adapted for connection to the linguistic and motor processes of the mind. The manner in which this is accomplished, and even the manner in which visual inputs are represented during and after this transition, remains unknown. How can distributed neurons be used to indicate that parts of an image constitute significant perceptual 'gestalt fragments' that need to be grouped together and represented internally in some symbolic form more condensed than the image itself?

Our next two images give particular evidence that some such grouping process is at work. They are simply binocularly matched pairs of horizontal lines. Viewed binocularly, the first, static, image is seen as a pair of lines tilted toward the viewer at their nearby ends. In the second, animated image the break between the two segments on the right moves periodically left and then back to the right, causing the perceived tilt of the two segments to reverse (the changed perception takes somewhat less than a second to stabilize). This raises the question of how the sensation of tilt is transmitted to visual neurons dealing with pixels internal to one of the segments seen, which respond differently depending on the position of a remotely located image element, namely the break between the two line segments.

Our next two images pose this same problem, but in a two-dimensional setting. Both show a random-dot stereo pair with a raised annulus. In the first (resp. second) image a pair of filled (resp. empty) light grey rectangles is added to create stereo image elements standing in front of the annulus. In the first image this is seen as a raised grey translucent rectangle, in the second as an outlined rectangle. The first image poses the question of how the sensation of being inside something raised is transmitted to the visual neurons which deal with pixels internal to the rectangle, when all the local information available at these pixels indicates that they are in the plane of the annulus or the background plane. The second image, in which a perfectly transparent rectangle filling the visible rectangular outline is seen (though less distinctly than in the previous case), raises the same problem. (The strength of the raised rectangular gestalts formed on the basis of evidence available only at the rectangle edges is indicated by the occasional tendency of the random-dot texture to rise into the plane of the raised rectangle.)