François d’Aguilon – ahead of his time?

I’ve just downloaded d’Aguilon’s 1613 “Six Books of Optics” from the Internet Archive (Opticorum Libri Sex by Aguilonius). It’s such a joy to have these classic texts freely available now. The downside is that it tempts me to go down historical rabbit-holes rather than writing the lecture I should be writing on the evolution of stereopsis… but it’s fascinating so I can’t mind too much.

I’m struck by how ahead of his time this seventeenth-century Jesuit is. My mental summary of the history of stereopsis had been something like: “People like Descartes and Kepler believed that vergence was a powerful distance cue, and Leonardo in his Treatise on Painting pointed out how interocular differences in occluded regions give a compelling sense of depth. However, the role of retinal disparity was not appreciated until Wheatstone’s 1838 presentation to the Royal Society. This showed for the first time that retinal disparity can give a depth percept with no change in vergence. More recently, vision scientists have pointed out how much more sensitive humans are to the relative disparity between objects or surfaces than to the absolute disparity of an isolated object.”

I don’t know of an English translation, which probably wouldn’t be available freely on the Internet anyway. The Latin is rather beyond me (I did the A-level 30 years ago and that was classical rather than seventeenth-century scientific Latin). But as far as I can tell, Aguilonius argues that vergence is not a distance cue, and he lays out the importance of relative disparity quite clearly. I am surprised as I didn’t think those views were held so early.

Aguilonius’ Third Book of Optics is “de communium objectorum cognitione”, which I think might be translated as “on the perception of general objects”. As far as I can gather, he thinks there are 9 properties of objects: distantia (distance), quantitas (size – “big or small, thick or thin, long or wide”), figura (shape – “straight or curved, convex or concave, acute or obtuse”), locus (3D location – “above/below, left/right, forward/back”), situs (stance – “sitting, standing, order, arrangement”), continuitas, discretio (these seem to be lumped together into what we might call numerosity), motus, quietus (motion and rest). He says distance is the most important because the perception of the others depends on it (Distantiam praemittimus, ex cuius perceptione ceterorum cognitio dependet).



He then discusses cues to distance. He mentions known size (“who would deny that distance is inferred from the known magnitude of a thing, especially from that common notion, that a person judges those objects to be far away which appear small, when in reality they are large?”).

I don’t follow his “preliminary notes”, but I think he is emphasising the difficulty of what we would call metric depth: it is difficult to perceive not that an object is distant, but how far away it is.

The importance of binocular vision for depth perception

Proposition 1: “A single eye cannot define distance by itself”. He points out that all objects along a given line of sight project to the same point in the eye. He goes into the difficulty of depth perception with one eye covered, giving my favourite example of how hard it is to thread a needle. (“Hinc etiam filum per foramen acus transversum immittere, altero occluso oculo incerti negotii est”).

He mentions a game “which we learnt from boys, but judge worthy of a philosopher”: “The game was thus: one of the boys would hold a stick upright in his hand, while another would strive to touch it with his outstretched index finger while looking with only one eye; and whenever the stick was moved, he would almost always miss the goal.” (Lusus hic erat: puerorum alter bacillum erectum tenebat manu, hunc alter protenso in transversum indice tangere nitebatur uno tantum cernens oculo, ac quoties id moliebatur, toties paene a meta aberrabat) Aguilonius argues that they make these mistakes because they can’t judge distance correctly. Rubens has illustrated this beautifully in the chapter heading:



Ono et al (Perception 42(1):45-49) argue that this illustration “is possibly depicting the cosmic observer performing Porta’s sighting test; having pointed to the stick, held by a putto, the observer closes the left eye to determine whether the finger is still aligned with the stick.” But I think given the above, it’s clear that Rubens is illustrating the game described, played by a philosopher since Aguilonius suggests it’s worthy of them.

Having made the case that you can’t judge distance monocularly, Aguilonius immediately, in his second proposition, gives us some exceptions, e.g. pointing out that due to the ground plane, height in the visual field enables us to infer distance with one eye:



Vergence

But it’s his third proposition that interests me most. As far as I can tell, he’s dissing the idea that vergence is a distance cue.

I’ll translate this as best I can:

“Proposition III. Theorem. Not correctly do some assert that distance can be known from the angles of the conjoined axes.

“There are those who want distance to be inferred from the magnitude of the angle made by the concurrent axes, so that the more distant objects are, the smaller the angle of the axes at the point where it intersects them; and the closer, the larger. Which they try to show in this way. Let there be two eyes A and B, and let CD be a general line. It doesn’t matter whether CD is orthogonal to the line connecting the centres of sight [AB], or oblique. Now let there be a visible object on the general line, now at E, now at D. With optic axes being drawn from E and D to each appearance, DA and DB and ditto EA and EB, the angle AEB will be greater than the angle ADB by the 21st theorem of the first book of Euclid: if indeed E falls within the triangle ADB.
Wherefore E will be shown to be closer, since it appears with a larger angle, and D further, since lesser. This demonstration seems solid to them: but really it is confounded most strongly by two arguments. The first is this: Sight does not perceive the magnitude of the angle which the axes make in their concourse: for it is outside the eye; nor is it imbued by any visible qualities by which it could be seen. For things which are outside the eye cannot be known except by their sensed properties; but things in the eye can be felt by the very same sense, such as the movement of the eyes, and their position, the operation of vision itself, and other things of this kind: not therefore by the work of that very angle can vision know the distance of things.”
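The geometric claim under dispute here (that a nearer point subtends a larger angle at the two eyes) is easy to check numerically. This is a minimal sketch in Python, assuming a flat 2D layout, an interocular distance of 6.5 cm, and made-up positions for E and D on the midline. The function name and all coordinates are my own illustrative choices, not anything from Aguilonius.

```python
import math

def vergence_angle(point, eye_a=(-3.25, 0.0), eye_b=(3.25, 0.0)):
    """Angle in degrees between the two lines of sight that meet at `point`."""
    ax, ay = eye_a[0] - point[0], eye_a[1] - point[1]
    bx, by = eye_b[0] - point[0], eye_b[1] - point[1]
    cross = ax * by - ay * bx
    dot = ax * bx + ay * by
    return math.degrees(abs(math.atan2(cross, dot)))

# Hypothetical positions (cm): E lies inside triangle ADB, nearer than D.
E = (0.0, 30.0)
D = (0.0, 60.0)

angle_AEB = vergence_angle(E)
angle_ADB = vergence_angle(D)
print(f"angle AEB = {angle_AEB:.2f} deg, angle ADB = {angle_ADB:.2f} deg")
```

The nearer point E gives the larger angle, as Euclid I.21 requires. Aguilonius does not dispute this geometry. His objection is that the angle itself is not something the eye can sense.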




His other killer argument is that points on the horopter have the same vergence angle but are at different distances, as shown here:



NB the Wikipedia entry for horopter states “The term horopter was introduced by Franciscus Aguilonius in the second of his six books in optics in 1613. In 1818, Gerhard Vieth argued from Euclidean geometry that the horopter must be a circle passing through the fixation-point and the nodal point of the two eyes.” But from the above diagram it seems pretty clear Aguilonius had already arrived at Vieth’s conclusion 200 years earlier.
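The argument that points on this circle share a vergence angle while lying at different distances can be sketched numerically. The Python below assumes a 2D layout, eyes 6.5 cm apart and a fixation point 50 cm straight ahead. These values and all the names are illustrative assumptions of mine.

```python
import math

def vergence_angle(point, eye_a, eye_b):
    """Angle in degrees between the two lines of sight that meet at `point`."""
    ax, ay = eye_a[0] - point[0], eye_a[1] - point[1]
    bx, by = eye_b[0] - point[0], eye_b[1] - point[1]
    return math.degrees(abs(math.atan2(ax * by - ay * bx, ax * bx + ay * by)))

half_iod = 3.25    # half the interocular distance in cm (assumed)
fix_dist = 50.0    # fixation distance in cm (assumed)
A, B = (-half_iod, 0.0), (half_iod, 0.0)

# Circle through both eyes and the fixation point (0, fix_dist).
centre_y = (fix_dist ** 2 - half_iod ** 2) / (2 * fix_dist)
radius = fix_dist - centre_y

results = []
for theta_deg in (0, 30, 60):
    theta = math.radians(theta_deg)
    p = (radius * math.sin(theta), centre_y + radius * math.cos(theta))
    distance = math.hypot(p[0], p[1])   # distance from the midpoint between the eyes
    results.append((theta_deg, vergence_angle(p, A, B), distance))

for theta_deg, angle, distance in results:
    print(f"theta = {theta_deg:2d} deg: vergence = {angle:.3f} deg, distance = {distance:.1f} cm")
```

By the inscribed angle theorem the vergence angle comes out the same for every point on the arc, while the distance from the observer shrinks as the point moves away from straight ahead.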

I may be misunderstanding but I read the above as a pretty clear statement that vergence is not available from purely ocular information [he doesn’t know about modern photogrammetry and is considering a 2D system] and he seems to be rejecting the idea that extra-ocular information would be used to interpret visual information. This goes counter to what I thought I knew: that vergence was considered a strong distance cue at this time.

Relative disparity

Proposition IV is “Distance is recognised from the length of the optic axes”. This does not sound very promising but I think it’s actually a statement about relative disparity, which struck me as remarkably ahead of its time. Aguilonius confused me by talking about “the length of the optic axes” – how could you obtain that from visual information? But he’s pointing out you can infer that from the other eye. He refers us back to Proposition XXIV of the Second Book: “The length of one optic axis is perceived by the other eye from the size of the angle made by its own axes with the interocular axis”.



“Let there be two centres of projection, A and B, and let the axis of eye A be the line AC, whose magnitude I say can be recognised from the viewpoint of B by the size of the angle ABC made by its own optic axis BC and the line AB connecting the centres of projection”.

Now as an argument for metric depth, this is a bit flawed. Aguilonius is trying to get metric depth from absolute retinal disparity, but he hasn’t appreciated that the criticism he made in Proposition III of Book III also applies here: the position of the interocular axis AB is “outside the eye” and cannot be deduced from the visual information he is discussing. He later argues (correctly in my view) that the angle ADB in this diagram isn’t directly available from retinal information, but he assumes that the angle DBA is – and I disagree.
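The triangulation behind Proposition XXIV can be sketched with the law of sines. This Python sketch assumes the baseline AB and both base angles are known, which is exactly the assumption questioned above. The function name and the numbers are hypothetical.

```python
import math

def axis_length_ac(baseline_ab, angle_cab_deg, angle_abc_deg):
    """Length of AC in triangle ABC, from baseline AB and the angles at A and B."""
    angle_acb_deg = 180.0 - angle_cab_deg - angle_abc_deg
    return (baseline_ab * math.sin(math.radians(angle_abc_deg))
            / math.sin(math.radians(angle_acb_deg)))

baseline = 6.5      # interocular distance in cm (assumed)
angle_at_a = 90.0   # eye A looks straight ahead along AC (assumed)

# A larger angle ABC at eye B corresponds to a longer axis AC.
near = axis_length_ac(baseline, angle_at_a, 80.0)
far = axis_length_ac(baseline, angle_at_a, 85.0)
print(f"ABC = 80 deg gives AC = {near:.1f} cm; ABC = 85 deg gives AC = {far:.1f} cm")
```

The calculation only works if the angle ABC is measured against the line AB, so the direction of the interocular axis has to be known from somewhere.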

But I think in Book III, Proposition IV, he uses this line of argument much more convincingly.



“It sometimes occurs that a visible object on the same axis BD now becomes closer, as in C, now more distant, as in D. When this happens, in the same way as the eye B sees both axes AC and AD as equal, since each subtends the same angle ABC; in the same way eye B perceives points C and D as equally distant, or to put it better, it does not discern their unequal distances, since it judges equal to be equally distant: for as was shown in Proposition I of this book, eye B alone cannot perceive the distance of objects on a straight line drawn from the eye. But eye A, bringing its own action also to bear in common, resolves the question. For eye A, grasping that line BC is shorter than BD, defines the object C as being nearer to B than point D. Thus, from the lengths of the optic axes, two eyes together, by their joint power, lead to the perception of distance, which neither could achieve by its solitary action.”

Notice that here he doesn’t make a case for metric depth: he’s just arguing that binocular vision enables us to see that C is closer than D. I think this is a fairly clear statement about depth perception from relative disparity, explicitly not dependent on vergence.
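The ordinal version of the argument can also be sketched numerically: two points on one line of sight from eye B share a direction for B but not for A. The Python below assumes a 2D layout with made-up coordinates, and the helper names are my own.

```python
import math

A = (-3.25, 0.0)   # eye A (cm, assumed)
B = (3.25, 0.0)    # eye B (cm, assumed)

def direction_deg(eye, point):
    """Visual direction of `point` from `eye`, in degrees from the interocular axis."""
    return math.degrees(math.atan2(point[1] - eye[1], point[0] - eye[0]))

def point_along(eye, direction, distance):
    """Point at `distance` from `eye` along a line of sight at `direction` degrees."""
    rad = math.radians(direction)
    return (eye[0] + distance * math.cos(rad), eye[1] + distance * math.sin(rad))

# C and D lie on the same line of sight from B, at 30 cm and 60 cm.
C = point_along(B, 80.0, 30.0)
D = point_along(B, 80.0, 60.0)

from_b = (direction_deg(B, C), direction_deg(B, D))
from_a = (direction_deg(A, C), direction_deg(A, D))
print(f"from B: C at {from_b[0]:.2f} deg, D at {from_b[1]:.2f} deg")
print(f"from A: C at {from_a[0]:.2f} deg, D at {from_a[1]:.2f} deg")
```

Eye B alone cannot tell C from D, but for eye A the two directions differ, and the sign of the difference says which point is nearer without any need for metric distance.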

Perhaps I’m being too generous since he seems to be considering the relative disparity between the same object at different times (“nunc propinquius fit, nunc remotius“) rather than the relative disparity between two simultaneously visible objects. Nevertheless I was surprised to see this in 1613, more than 200 years before Wheatstone’s creation of stereograms. What do other vision scientists think?
 

 

 

 

 

 

 

 

 

Direction selectivity requires nonlinearity

Thanks to Damon Clark at Yale and Jacob A. Zavatone-Veth at Harvard for pointing out the following to me.

I had always thought that you could get a direction-selective neuron with a linear filter that is spatiotemporally inseparable, so that it is “tilted” in spacetime, with the gradient defining a speed and direction. I always thought you would get a bigger response for a stimulus moving at the speed and in the direction matching the filter, than for one moving in the opposite direction. I know models, like the motion energy model, then tend to place a nonlinearity after the linear filtering, but I didn’t think this was necessary for direction-tuning when the filter is already tilted in this way.

Well… yes it is (although it does slightly depend on what you mean by direction-selective). The video below shows a Gaussian-blob stimulus passing over a receptive field, first left to right and then right to left.

The leftmost panel shows the tilted spatiotemporal linear filter representing the receptive field. How to read this function: It responds weakly to the present stimulus (tau=0), responding most to stimulation at x=-50. It responds more strongly to stimuli as we go back into the past. It responds most strongly to stimuli that were presented at a time tau=40 units ago, located at x=-11. As we go further back into the past, its response decays away. It responds weakly to stimuli presented 70 time-units ago, most strongly to those that were at x=10 units 70 time-units ago. This panel doesn’t change, as the filter’s shape is a permanent feature (it’s time-invariant).

The middle panel shows the stimulus. The bottom row (tau=0) shows where the stimulus is now; the rows above show where it was at times progressively further into the past. At any given moment of time, the stimulus is a Gaussian function of position. The contour lines show the filter for comparison.

The right-hand panel shows the response of the linear filter, which is the inner product of the filter and the stimulus at every moment in time.
The red curve shows the response when the stimulus moves rightward; the blue curve shows the response when the stimulus moves leftward.

Here is the response as a function of time for both directions of motion. Notice that although the response to the leftward stimulus peaks at a much higher value, the total response is the same for both directions of motion. I had never realised this was the case until Damon pointed it out to me and found it hard to believe at first — although as the video makes clear, it’s just because in both cases the stimulus is sweeping out the same volume under the filter.

So can you describe this linear filter as direction-selective? It certainly gives a different response to the same stimulus moving rightward vs leftward, so I’d argue to that extent you can describe it as such. But since the total response is the same for both directions, it’s hard to argue it has a preferred direction. And it’s certainly true that to use it in any meaningful way, you’d want to apply a nonlinearity, whether squaring or a threshold or whatever. For example, if you wanted to use this “leftward filter” to drive a robot to turn its head to follow a leftward moving object, you’d be in trouble if you just turned the head leftward by an angle corresponding to the output of this filter. Sure the robot would turn its head left by so many degrees as an object passed left in front of it, but it would also turn its head left by the exact same angle if an object passed rightward! So in that sense, this filter is not direction-selective, and a nonlinearity is required to make it so.
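The equal-total result, and the way a nonlinearity breaks it, can be checked with a few lines of plain Python. This is a minimal sketch and not the code behind the video: it assumes a coarser grid than the Matlab version, a time window widened so the stimulus passes completely across the filter in both directions, and squaring as the example nonlinearity. All function and variable names are my own.

```python
import math

def tilted_filter(x, tau):
    """Space-time inseparable filter: the preferred position drifts with tau."""
    return math.exp(-((x - (tau - 50)) / 20) ** 2) * math.exp(-((tau - 40) / 20) ** 2)

def stimulus(x, t, x0, speed, sigma=10.0):
    """Gaussian blob whose centre is at x0 + speed * t."""
    return math.exp(-((x - x0 - speed * t) ** 2) / (2 * sigma ** 2))

def responses(x0, speed):
    """Linear response at each time: inner product of filter and stimulus history."""
    xs = range(-100, 101, 5)
    taus = range(0, 100, 4)
    times = range(-100, 401, 2)
    weights = [(x, tau, tilted_filter(x, tau)) for x in xs for tau in taus]
    return [sum(w * stimulus(x, t - tau, x0, speed) for x, tau, w in weights)
            for t in times]

right = responses(x0=-100, speed=+1)
left = responses(x0=+100, speed=-1)

total_right, total_left = sum(right), sum(left)
peak_right, peak_left = max(right), max(left)
energy_right = sum(r * r for r in right)   # total after a squaring nonlinearity
energy_left = sum(r * r for r in left)

print(f"linear totals:  rightward {total_right:.3f}, leftward {total_left:.3f}")
print(f"peaks:          rightward {peak_right:.3f}, leftward {peak_left:.3f}")
print(f"squared totals: rightward {energy_right:.3f}, leftward {energy_left:.3f}")
```

The linear totals agree, while the peaks differ. Once the response is squared before summing, the totals differ too, with the leftward direction winning. That is the sense in which the nonlinearity is what makes the filter direction-selective.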

Many thanks Damon and Jacob for taking the time to explain this to me!


Fitzgerald & Clark (2015). Nonlinear circuits for naturalistic visual motion estimation. eLife 2015;4:e09123.


The (slightly crummy) Matlab code I wrote to generate this video is below:

function JDirTest
% Shows that a tilted (space-time inseparable) linear filter gives the same
% time-integrated response to rightward and leftward motion.
close all;

% This makes the tilted RF
xp = -100:100;                 % spatial positions
X = exp(-(xp./20).^2);         % Gaussian spatial profile
figure(1000)
plot(xp,X);
nt = 100;                      % number of time samples
tau = 1:nt;                    % time before present
tG = exp(-((tau-40)./20).^2);  % just shifted so as to be causal
RFim = zeros(nt,length(xp));
for j = 1:nt
    % Shift the spatial profile along x by an amount that depends on tau
    RFim(j,:) = circshift(X,j-round(nt/2),2) .* tG(j);
end
figure
imagesc(xp,tau,RFim);
xlabel('position x')
ylabel('time \tau (seconds before present)')
set(gca,'ydir','normal')
title('Linear spatiotemporal filter f(x,\tau)')

% Now run a nice simulation
figure('Position',[20 374 1495 420])
time = 0:300;
for jdirection = 1:2
    if jdirection==1
        x0 = -100;
        speed = +1;
        colspeed = 'r';
        label = 'rightward';
    else
        subplot(1,3,3)
        hold on
        x0 = +100;
        speed = -1;
        colspeed = 'b';
        label = 'leftward';
    end

    subplot(1,3,1)
    imagesc(xp,tau,RFim);
    set(gca,'ydir','normal')
    xlabel('position x')
    ylabel('relative time \tau (seconds before present)')

    response = zeros(size(time));
    for jt = 1:length(time)
        timenow = time(jt);
        subplot(1,3,1)
        title(sprintf('Current time is t=%3.0f',timenow))
        subplot(1,3,2)
        s = stimulus(xp,tau,timenow,x0,speed);
        hold off
        imagesc(xp,tau,s);
        hold on
        contour(xp,tau,RFim);
        set(gca,'ydir','normal')
        xlabel('position x')
        ylabel('relative time \tau (seconds before present)')
        title('Stimulus s(x,t-\tau)')
        drawnow

        % Do inner product of current stimulus with filter:
        response(jt) = sum(sum(s.*RFim));
        % Plot it
        if jt>1
            subplot(1,3,3)
            h(jdirection) = plot(time(1:jt),response(1:jt),'-','Color',colspeed);
            xlim([0 max(time)])
            ylim([0 1000])
            lab{jdirection} = sprintf('%s, total = %3.0f',label,trapz(time(1:jt),response(1:jt)));
            legend(h,lab)
            title('Response')
            xlabel('time (s)')
        end
    end
end % do other direction
end

% Stimulus is a Gaussian blob whose centre moves as x = speed*t + x0
function s = stimulus(xp,tau,time,x0,speed)
% returns s(x,t-tau)
[xp2,tau2] = meshgrid(xp,tau);
s = exp(-(xp2 - x0 - speed*(time - tau2)).^2/2/10^2);
end

Da Vinci Stereopsis

I have just been asked for “a succinct explanation of da Vinci stereopsis”. I googled in the hope of finding one, but couldn’t, so thought I’d put one up here.

Leonardo da Vinci didn’t quite realise that stereoscopic depth perception was a thing, but he did explain in his “Treatise on Painting” that a given object occludes different parts of the background when viewed from the left eye as compared to the right eye. “Da Vinci Stereopsis” now refers to depth perception based on the occlusion geometry in the two eyes. The term was introduced by Nakayama and Shimojo in a 1990 paper.

Consider the left-hand figure below. Both eyes see a large black rectangular object, but the right eye also sees a black bar to its right. Most observers, seeing these images, experience a weak sense that the bar is further away than the rectangle. This is because of the geometry shown in the figure. The left eye doesn’t see the bar because it’s hidden from view (“occluded”) by the nearer black rectangle.
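The occlusion geometry in this first case can be sketched as a simple line-of-sight test. This is an illustrative Python sketch, assuming a top-down 2D scene with eyes 6.5 cm apart, a rectangle 10 cm wide at 50 cm, and a thin bar at 60 cm. All of these numbers and names are made up for the example.

```python
def crossing_x(eye_x, point, plane_depth):
    """x-position where the line of sight from an eye at (eye_x, 0) to `point` crosses `plane_depth`."""
    px, pz = point
    return eye_x + (px - eye_x) * plane_depth / pz

def is_visible(eye_x, point, occluder):
    """True if the occluder does not block this eye's view of the point."""
    left_edge, right_edge, depth = occluder
    if point[1] <= depth:
        return True   # the point is in front of the occluder plane
    x = crossing_x(eye_x, point, depth)
    return not (left_edge <= x <= right_edge)

LEFT_EYE_X, RIGHT_EYE_X = -3.25, 3.25
rectangle = (-5.0, 5.0, 50.0)   # left edge, right edge, depth (cm)
bar = (6.0, 60.0)               # x, depth (cm): behind and just right of the rectangle

seen_by_left = is_visible(LEFT_EYE_X, bar, rectangle)
seen_by_right = is_visible(RIGHT_EYE_X, bar, rectangle)
print(f"left eye sees bar: {seen_by_left}, right eye sees bar: {seen_by_right}")
```

With these numbers the bar is hidden from the left eye but visible to the right eye, which is the pattern that signals "further away" for a bar to the right of the rectangle.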


Conversely, in the right-hand figure, the bar is only visible in the left eye, again to the right of the rectangle. Now, most people will report a weak sense that the bar is closer than the rectangle. This is because these retinal images could be accounted for by the scene shown at the top of the figure: the bar is technically seen by both eyes, but in the right eye it appears on top of the rectangle. Both objects are black and so the bar is invisible in the right eye.

Many vision scientists think that da Vinci stereopsis is a separate form of stereo vision that is not based on disparity (the separation between the images of the same object as seen in left and right eye). The argument is that because the bar is only visible in one eye’s image, a disparity cannot be defined.

Vision Sciences Society meeting 2017

Sid, Maydel, Chris and I had an excellent time at the VSS meeting last month. Thanks to Ignacio for this snap of my talk on “When invisible noise obscures the signal: the consequences of nonlinearity in motion detection.” Sid gave a great talk on his work “Modeling response variability in disparity-selective cells.”

3D glasses with glasses

I was giving a talk recently about my work on viewer experience with stereoscopic 3D television, and an audience member asked a good question, which was: Was there any relationship between people complaining of adverse effects and whether they routinely wore prescription spectacles? Such people are wearing two pairs of glasses to view S3D, which might be more uncomfortable, but equally they are already used to wearing glasses so might be less bothered than your average person who is wearing glasses only to view 3D.

We didn’t put anything about that in the papers, but I dug out the data and had a look. I haven’t done the stats, but it seems pretty clear there’s no effect of glasses. First, here is Fig 7 from Read & Bohr 2014:


And here is a version split up by whether or not participants usually wore glasses (in each pair of bars, the left-hand bar is for people who wore contacts or no correction, and the right-hand bar is for people who wore glasses).
In the graph, it looks as if there’s a striking difference in the “fake 3D passive” case, but really that’s to do with the small number of participants – we have 1/17 people without glasses reporting adverse effects, compared to 3/15 in the people with glasses. So if just one person changed their answer, it would look much less impressive. Since the effect isn’t seen in the other groups, I think it’s probably just a blip.

Averaging over all participants who wore 3D glasses (i.e. excluding only those in the true 2D group), the numbers are as follows:
n total = 311
n reporting adverse effects = 64 (21%)
n who habitually wear glasses = 117 (38%)
of whom n reporting adverse effects = 21 (18%)
n who do not habitually wear glasses = 194 (62%)
of whom n reporting adverse effects = 43 (22%)
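For anyone who wants to go beyond eyeballing these numbers, here is a quick sketch of a two-proportion z-test in Python using only the counts listed above. It is not an analysis from the paper. It assumes participants are independent and that the normal approximation is adequate, and the variable names are my own.

```python
import math

n_glasses, adverse_glasses = 117, 21
n_no_glasses, adverse_no_glasses = 194, 43

p_glasses = adverse_glasses / n_glasses
p_no_glasses = adverse_no_glasses / n_no_glasses

# Pooled proportion under the null hypothesis of no difference between groups.
pooled = (adverse_glasses + adverse_no_glasses) / (n_glasses + n_no_glasses)
standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n_glasses + 1 / n_no_glasses))

z = (p_glasses - p_no_glasses) / standard_error
p_value = math.erfc(abs(z) / math.sqrt(2))   # two-sided

print(f"glasses: {100 * p_glasses:.0f}%, no glasses: {100 * p_no_glasses:.0f}%")
print(f"z = {z:.2f}, two-sided p = {p_value:.2f}")
```

The difference between the groups is well within what chance would produce under these assumptions, which fits the impression that wearing glasses makes no difference.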

User experience while viewing stereoscopic 3D television

Read JCA, Bohr I (2014). User experience while viewing stereoscopic 3D television. Ergonomics 57(8): 1140-53. PubMed ID: 24874550


Journal press release "Good news for couch potatoes".