3D perception I
Perceived slant of rectangular grids viewed on slanted screens
A fundamental question in vision is how 3D perception is inferred from 2D images. Many studies have shown that monocular and binocular sources of information (cues) contribute to perceived depth and 3D shape. To test the contributions of individual cues, several laboratories have measured perceived slants induced by single and combined cues, i.e. disparity and perspective-related cues. The consensus of these studies is that observers combine cues in a statistically optimal fashion. However, optimal cue combination does not explain 3D perception in pictures and 2D movies: under optimal combination, disparity would strongly reduce perceived depth during binocular viewing. To investigate the effect of a screen on slant perception, I measured slants of rectangular grids that were viewed binocularly on a screen placed on a turntable. Slant of the computed grids and slant of the screen were varied independently. Subjects indicated perceived slant by orienting a physical rectangular surface placed on another turntable. The judgments show that the slant of the screen does not affect the perceived slant of the grids. The conclusion is that slant perception of a grid on a screen is based on perspective-related cues. Disparity and monocular cues related to the screen are fully suppressed.
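As an illustration of what "statistically optimal" cue combination usually means, the sketch below implements the standard reliability-weighted average, in which each cue is weighted by its inverse variance. The function name, the slant values and the noise levels are hypothetical and are not taken from the study; the example only shows why a disparity cue signalling a frontal screen would be expected to pull the combined slant estimate toward the screen.

```python
def combine_cues(estimates, sigmas):
    """Reliability-weighted average of single-cue estimates.

    Each cue is weighted by its inverse variance (1 / sigma**2), the usual
    formalisation of optimal cue combination for independent Gaussian noise.
    """
    weights = [1.0 / s ** 2 for s in sigmas]
    total = sum(weights)
    combined = sum(w * e for w, e in zip(weights, estimates)) / total
    combined_sigma = (1.0 / total) ** 0.5  # never larger than the best single cue
    return combined, combined_sigma


# Hypothetical values: perspective signals a 40 deg grid slant,
# disparity signals the 0 deg slant of a frontal screen, equal noise.
slant, sigma = combine_cues([40.0, 0.0], [5.0, 5.0])
```

With equally reliable cues the predicted slant lies halfway between the two single-cue values, which is the kind of reduction the reported judgments do not show.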
Binocular fusion, suppression and diplopia: effects of disparity, contrast polarity and contrast imbalance
Stuart Wallis and Mark Georgeson
With different images to each eye, one may experience fusion, suppression of one eye's view, or diplopia. To understand better the underlying binocular processes, we studied perception of binocular edges as a function of binocular disparity. Single, Gaussian-blurred, horizontal edges (blur B=8 minarc) were shown to each eye at various vertical disparities (0 to 8B), with the same or opposite contrast polarity. Observers could indicate the position and polarity of a single perceived edge, or report 2 edges. Diplopia increased with disparity, but when contrasts were unequal the lower-contrast edge was often not seen, particularly at disparities 3 to 5B. We developed a simple descriptive model to interpret the behavioural responses as arising from (a) the probability of fusion (assumed to fall with increasing disparity), (b) the probability of suppression occurring when fusion fails, and (c) the role of positional noise and criterion in judgements of edge position. From this modelling, we conclude that fusion extends to disparities of about 2.5B for all observers, but is absent for opposite polarities. Probability of suppression also declined monotonically with increasing disparity, increased with contrast imbalance, and tended to be lower for opposite polarities, but was highly variable across observers.
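The first two components of the descriptive model, (a) and (b), can be read as a simple probability tree. The sketch below is one possible rendering under stated assumptions: the logistic form of the fusion function, its steepness, and the function names are hypothetical, and the midpoint of 2.5B is only borrowed from the reported fusion range for illustration. Component (c), positional noise and criterion, is left out.

```python
import math


def p_fusion(disparity_in_blurs, midpoint=2.5, steepness=4.0):
    """Hypothetical falling logistic: probability of fusion vs disparity (in units of B)."""
    return 1.0 / (1.0 + math.exp(steepness * (disparity_in_blurs - midpoint)))


def response_probabilities(disparity_in_blurs, p_suppress_given_no_fusion):
    """Split trials into fused, suppressed (one edge seen) and diplopic (two edges seen)."""
    f = p_fusion(disparity_in_blurs)
    s = p_suppress_given_no_fusion
    return {
        "fused": f,
        "suppressed": (1.0 - f) * s,        # fusion fails, one eye's edge is suppressed
        "diplopic": (1.0 - f) * (1.0 - s),  # fusion fails, both edges are seen
    }


probs = response_probabilities(4.0, p_suppress_given_no_fusion=0.6)
```

In this reading a single perceived edge can arise either from fusion or from suppression, which is why position and polarity reports are needed to tell the two apart.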
Nonlinear binocular summation and interocular suppression implement binocular fusion: a unification of two models
Mark Georgeson and Stuart Wallis
A striking feature of binocular vision is that different images in the two eyes can be 'fused' in perception, yet little is known about how fusion is achieved. We studied fusion and diplopia for Gaussian-blurred, horizontal edges with vertical disparity (silencing stereo vision). For a wide range of blurs B, the range of fusion is about 2.5B. If fusion linearly summed or averaged the monocular signals, we should expect fused edges to look increasingly blurred as disparity increased. In a blur-matching task, we found that this was true when the two edges were physically added (monocular control), but for dichoptic edges perceived blur was nearly invariant with disparity. We show that such fusion, preserving blur, occurs if luminance gradients are computed for each eye, and then the two Gaussian gradient profiles are combined as a contrast-weighted geometric mean. Finally, we show that this model for fusion is almost exactly equivalent to our earlier two-stage model derived from experiments on binocular and dichoptic contrast discrimination (Meese, Georgeson & Baker, Journal of Vision 2006). The binocular interactions proposed there can now be seen to implement the contrast-weighted geometric mean, and thus to achieve blur-preserving binocular fusion, followed by signal compression.
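Why a geometric mean preserves blur can be checked numerically. The sketch below combines two displaced Gaussian gradient profiles either as a weighted geometric mean or as a weighted arithmetic mean, and measures the spread of the result. The use of the profile's standard deviation as a proxy for perceived blur, the parameter names, and the fixed weight standing in for contrast weighting are all assumptions made for illustration, not the authors' implementation.

```python
import math


def gaussian(x, centre, sigma):
    return math.exp(-0.5 * ((x - centre) / sigma) ** 2)


def profile_sd(values, xs):
    """Standard deviation of a non-negative profile treated as a distribution."""
    total = sum(values)
    mean = sum(v * x for v, x in zip(values, xs)) / total
    var = sum(v * (x - mean) ** 2 for v, x in zip(values, xs)) / total
    return math.sqrt(var)


def combined_sds(disparity, sigma=1.0, w_left=0.5):
    """Return (geometric-mean SD, arithmetic-mean SD) for two displaced Gaussians."""
    xs = [i * 0.01 for i in range(-2000, 2001)]
    left = [gaussian(x, -disparity / 2.0, sigma) for x in xs]
    right = [gaussian(x, +disparity / 2.0, sigma) for x in xs]
    # Weighted geometric mean: L**w * R**(1 - w); w_left stands in for contrast weighting.
    geometric = [l ** w_left * r ** (1.0 - w_left) for l, r in zip(left, right)]
    # Linear summation/averaging, as in the monocular (physically added) control.
    arithmetic = [w_left * l + (1.0 - w_left) * r for l, r in zip(left, right)]
    return profile_sd(geometric, xs), profile_sd(arithmetic, xs)


geo_sd, ari_sd = combined_sds(disparity=2.0)
```

Algebraically, the product of two equal-width Gaussians raised to weights summing to one is again a Gaussian of the same width, centred at the weighted mean position and reduced in amplitude, whereas the arithmetic mean broadens as the disparity grows.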
Depth constancy and frontal-plane scaling in the absence of vertical disparities
Binocular disparities vary inversely with the square of viewing distance and therefore scaling is needed in order to achieve depth constancy. Scaling is also needed to correct for the differing patterns of horizontal disparities created by frontal-plane surfaces at different distances. Both vergence and differential perspective (vertical disparities) have been shown to provide scaling information (Rogers and Bradshaw, Perception 24, 1995) but the extent of constancy is much higher for frontal-plane judgments. This has been attributed to the fact that a distance estimate is not required because there is frontal-plane information in the vertical and horizontal disparity fields. This idea was tested using stimuli that provided no vertical disparity information. Observers made frontal-plane judgments using a narrow (2˚ high x 70˚ wide) textured strip while simultaneously adjusting the amplitude of a narrow (2˚ wide x 60˚ high) strip of horizontally-oriented, triangular corrugations until the ridge angles appeared to be 90˚. Vergence was varied between 0˚ (∞) and 13˚ (29cm) in different trials. Scaling was more complete (70-80%) in observers’ frontal-plane judgments compared with the depth task which produced only ~30% of the required scaling. Cue-conflicts cannot account for the results since the two tasks were done under the same stimulus conditions.
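The geometry behind the first sentence and the vergence values can be sketched as follows. The interocular distance of 6.5 cm and the function names are assumptions (the abstract does not state them), and the disparity formula is the usual small-angle approximation for symmetric fixation.

```python
import math


def vergence_deg(distance_cm, iod_cm=6.5):
    """Vergence angle (degrees) for symmetric fixation at a given distance."""
    return math.degrees(2.0 * math.atan(iod_cm / (2.0 * distance_cm)))


def disparity_rad(depth_cm, distance_cm, iod_cm=6.5):
    """Small-angle approximation: disparity ~ I * depth / distance**2 (radians)."""
    return iod_cm * depth_cm / distance_cm ** 2


near = vergence_deg(29.0)  # close to the 13 deg condition under the assumed 6.5 cm
ratio = disparity_rad(1.0, 50.0) / disparity_rad(1.0, 100.0)  # same depth, half the distance
```

Because the same depth interval produces four times the disparity at half the distance, a given disparity has to be rescaled by an estimate of distance squared before it can be interpreted as depth.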
Phantom Contours in da Vinci Stereopsis
Barbara Gillam and Susan Wardle
It is now known that monocular regions in binocular arrays are informative about depth in a number of ways. For example, a monocular region attached to a binocular surface can appear occluded by the surface if on its temporal side and camouflaged against it if on its nasal side. Depth magnitude varies with the width of the attachment. We showed using a binocular probe that in the camouflage case, seeing the attachment in depth requires that it have the same luminance and colour as the background surface. When a nasal attachment did not satisfy camouflage conditions, a phantom occluder was seen on its nasal side, “accounting for” its monocular status. The magnitude of the depth seen in the phantom was as precise and accurate as regular stereopsis. No depth was seen in the surface itself. This outcome implies double matching (of the edge of the binocular region in one eye with both the edge of the binocular region and the edge of the monocular region in the other eye), a novel form of Panum’s limiting case. These findings are not compatible with several theories of Panum’s limiting case and show that accurate and precise stereopsis does not require a contour to carry depth.
Broad spatial tunings of the object aftereffect: Evidence for global statistical representations of 3D shape and material
We recently showed that adaptation to a 3D object with a particular shape (e.g., bumpy) and material (e.g., glossy) alters the appearance of a subsequently presented object (e.g., making it look smooth and matte; Motoyoshi, 2012, Vision Sciences Society). Notably, this object aftereffect is also induced by adaptation to random noise with a spatial frequency similar to that of the adapting object, indicating that simple image statistics influence the perception of 3D shape and material. To test whether this results from local feature coding at early levels, we examined the spatial tuning of the aftereffect. Following adaptation for 40 sec (4-sec top-up) to a pair of band-pass noise patterns with different amplitudes, a pair of realistic spherical objects with different bumpiness was presented for 250 msec at various locations. Observers judged which object appeared bumpier, and the PSE was estimated. We found robust aftereffects (~30% of those at the adapting location) even for objects presented 12 deg away from the adapting noises (~3.5 deg in radius). Similar results were obtained for glossiness. These findings point to high-level visual mechanisms that represent 3D shape and material as summary image statistics within a very large receptive field.
Apparent motion in depth: first attempts
The new wave of stereoscopic movies has stimulated interest in the old wave which started it. The combination of simulated motion and depth required three prior stages of invention: apparent motion, stereoscopy and photography. The origins for all these can be found in the decade after 1825, mostly in London but also in Paris. Instruments were devised which simulated motion and depth: sequences of slightly different still images could appear to move and paired pictures (with small horizontal disparities and presented to different eyes) were seen in depth. In 1831, Faraday’s experiments on persisting images provided the impetus for Plateau’s phenakistiscope and Stampfer’s stroboscopic disc (both in 1833). Daguerre and Talbot described their techniques for capturing light on metal or paper in 1839. Wheatstone invented the stereoscope in 1832, directed Talbot to take paired photographs for use with it in 1840, and suggested how sequences of stereoscopic photographs could be combined in the phenakistiscope (in a letter to Plateau in 1849). This last was attempted by Claudet in London and Duboscq in Paris in the early 1850s with the fantascopic stereoscope and bioscope, but their efforts were not rewarded. Motion was easier to simulate than motion in depth.
The role of stereopsis in figural grouping versus segmentation
Lesley Deas and Laurie M. Wilcox
The disparity required to discriminate the relative depth of a pair of isolated vertical lines is minute but increases dramatically when these lines are connected to form a closed figure (McKee, 1983, Vision Research, 23, 191-198). Here we propose that the loss of sensitivity in the closed configuration is due to within-object depth averaging. We tested this proposal by measuring discrimination thresholds for neighbouring vertical lines in a set of four equally spaced lines that produced three adjacent test pairs (left, central, right). We created two closed rectangles by connecting the outer pairs of lines, and measured thresholds for the same line pairs. Note that now in the central pair condition the lines formed sides of separate rectangles. As expected, thresholds were lower for isolated lines than for their closed counterparts. Importantly, thresholds were also lower in the central condition, when the line pairs belonged to distinct objects. Our results suggest that sensitivity to binocular disparity depends critically on figural grouping. Specifically we hypothesize that the high-resolution disparity signal helps segregate one object from another, and that this resolution is sacrificed (possibly via disparity averaging) to reinforce object cohesiveness.
Parallactic movement beats binocularity in the presence of external visual noise
Nicole Voges, Michael Bach and Guntram Kommerell
Binocular vision provides a considerable advantage over monocular vision in the presence of disturbances along the line of sight [Otto et al, 2010, Graefe’s Arch Clin Ophthalmol, 248, 535–541]. A typical example is a driver who tries to identify objects through a windshield dotted with snowflakes. During driving, any bumpiness of the road will cause a vertical parallactic up-and-down movement of particles on the windshield with respect to the visual object. We simulated this dynamic situation and found: (1) The benefit of binocular over monocular vision largely vanishes. (2) This strong loss of binocular benefit is partly due to a 'ceiling effect'. An additional experiment that avoided 'ceiling effects', however, showed that the effect of moving vs. stationary noise was still markedly larger than the effect of binocular vs. monocular viewing.