While working with Richard, I had begun trying to build a quantitative model of the stereo and motion systems which could explain our psychophysical results. After his death, I continued this work, and eventually wrote it up as the paper above. Again, I'm very grateful to Bruce, Simon and Andrew Glennerster for reading drafts of this and giving me feedback.

An interesting property of anti-correlated stimuli is that, under almost all reasonable assumptions about how disparity might be encoded, different spatial-frequency/orientation channels return different estimates of the stimulus disparity. The problem in the model was finding a good way to combine the answers from the different channels. In the end, I decided that the best way was to convert each channel's output into a common language: a Bayesian probability estimate of disparity. Suppose you have a binocular neuron tuned to disparity D, and you know the input from the left eye. Then you can calculate the expected response of the neuron under the assumption that the stimulus disparity really is D, because in that case the input reaching the neuron's right-eye receptive field should be the same as that reaching its left-eye field, apart from small differences due to noise. If the actual response of the neuron is very different from this prediction, then it is unlikely that D really is the disparity of the stimulus.
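The idea can be sketched in a few lines of code. This is only an illustration, not the model from the paper: the "neuron" here is a simple correlation detector, the noise is assumed Gaussian, and the disparity shift is implemented as a circular shift for convenience. The names (`neuron_response`, `log_likelihood`, etc.) and all parameter values are hypothetical.

```python
import numpy as np

def expected_response(left, D):
    """Predicted response if the stimulus really has disparity D: then the
    right-eye input matching the neuron's receptive field equals the shifted
    left-eye input, so the correlation is just the left image's energy."""
    shifted = np.roll(left, D)           # circular shift, for simplicity
    return float(np.dot(shifted, shifted))

def neuron_response(left, right, D):
    """Response of a correlation-type neuron tuned to disparity D."""
    return float(np.dot(np.roll(left, D), right))

def log_likelihood(left, right, D, noise_sd=0.5):
    """How plausible is disparity D, given how far the neuron's actual
    response falls from its predicted response? (Gaussian noise assumed.)"""
    r = neuron_response(left, right, D)
    mu = expected_response(left, D)
    return -0.5 * ((r - mu) / noise_sd) ** 2

# Toy stimulus: the right-eye image is the left-eye image shifted by 3
# samples, plus a little noise.
rng = np.random.default_rng(0)
left = rng.standard_normal(64)
true_disparity = 3
right = np.roll(left, true_disparity) + 0.05 * rng.standard_normal(64)

# The most likely disparity is the one whose predicted response best
# matches the actual response.
disparities = range(-8, 9)
best = max(disparities, key=lambda D: log_likelihood(left, right, D))
```

Exponentiating and normalising these log-likelihoods (times a prior) would give the probability distribution over disparity that each channel contributes.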

I was expecting to have to build different ways of combining the channels' outputs for the two systems, motion and stereo, in order to explain the differences in perception. But, to my surprise, I found that I could use the same mathematical structure for both. The differences in performance could be captured quite well simply by assuming that the motion system is subject to a higher noise level than the stereo system, and that the stereo system's preference for small disparities is stronger than the motion system's preference for low speeds.
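A minimal sketch of that shared structure, under the simplifying assumption that each channel's likelihood and the prior are Gaussian: the channels' likelihoods are multiplied together with a zero-centred prior, and the only things distinguishing the "stereo" and "motion" read-outs are the noise level and the prior width. All the numbers below are illustrative, not fitted values from the paper.

```python
import numpy as np

def posterior(channel_means, values, noise_sd, prior_sd):
    """Combine per-channel Gaussian likelihoods with a prior that
    favours small disparities (or low speeds)."""
    log_p = -0.5 * (values / prior_sd) ** 2              # zero-centred prior
    for m in channel_means:
        log_p += -0.5 * ((values - m) / noise_sd) ** 2   # one channel's vote
    log_p -= log_p.max()                                 # avoid underflow
    p = np.exp(log_p)
    return p / p.sum()

values = np.linspace(-5, 5, 1001)
channels = [2.0, 2.2, 1.8]   # channels disagree slightly about the stimulus

# Stereo-like read-out: low noise, strong preference for small disparities.
stereo = posterior(channels, values, noise_sd=0.5, prior_sd=1.0)
# Motion-like read-out: higher noise, weaker preference for low speeds.
motion = posterior(channels, values, noise_sd=2.0, prior_sd=3.0)

stereo_est = values[np.argmax(stereo)]
motion_est = values[np.argmax(motion)]
```

Both estimates are pulled toward zero by the prior, but by different amounts, so the same machinery yields different percepts for the two systems.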
