ASTEROID stereotest v1.0: lower stereo thresholds using smaller, denser and faster dots by Read JCA, Wong ZY, Yek X, Wong YX, Bachtoula O, Llamas-Cornejo I, Serrano-Pedraza I, ReadWongYekWongBachtoulaLlamasCornejoSerranoPedraza2020_compressed.pdf (0.6 MiB) - Purpose: In 2019, we described ASTEROID, a new stereotest run on a 3D tablet
computer which involves a four-alternative disparity detection task on a dynamic
random-dot stereogram. Stereo thresholds measured with ASTEROID were well
correlated with, but systematically higher than (by a factor of around 1.5), thresholds
measured with previous laboratory stereotests or the Randot Preschool clinical
stereotest. We speculated that this might be due to the relatively large, sparse
dots used in ASTEROID v0.9. Here, we introduce and test the stereo thresholds
and test-repeatability of the new ASTEROID v1.0, which uses precomputed
images to allow stereograms made up of much smaller, denser dots.
Methods: Stereo thresholds and test/retest repeatability were tested and compared
between the old and new versions of ASTEROID (n = 75) and the Randot Circles
(n = 31) stereotest, in healthy young adults.
Results: Thresholds on ASTEROID v1.0 are lower (better) than on ASTEROID
v0.9 by a factor of 1.4, and do not differ significantly from thresholds on the Randot
Circles. Thresholds were roughly log-normally distributed with a mean of
1.54 log10 arcsec (35 arcsec) on ASTEROID v1.0 compared to 1.70 log10 arcsec
(50 arcsec) on ASTEROID v0.9. The standard deviation between observers was
the same for both versions, 0.32 log10 arcsec, corresponding to a factor of 2 above
and below the mean. There was no difference between the versions in their test/
retest repeatability, with 95% coefficient of repeatability = 0.46 log10 arcsec (a
factor of 2.9 or 1.5 octaves) and a Pearson correlation of 0.8 (comparable to other
clinical stereotests).
Conclusion: The poorer stereo thresholds previously reported with ASTEROID
v0.9 appear to have been due to the relatively large, coarse dots and low density
used, rather than to some other aspect of the technology. Employing the small
dots and high density used in ASTEROID v1.0, thresholds and test/retest repeatability
are similar to other clinical stereotests.
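The abstract above quotes thresholds both in log10 arcsec and as plain arcsec or multiplicative factors. As a reading aid, the following minimal Python sketch reproduces those conversions. The function names are illustrative and do not come from the paper.

```python
import math


def log10_arcsec_to_arcsec(log_threshold: float) -> float:
    """Convert a threshold in log10 arcsec to arcsec."""
    return 10.0 ** log_threshold


def log10_to_factor(log_value: float) -> float:
    """Convert a difference or spread in log10 units to a multiplicative factor."""
    return 10.0 ** log_value


def log10_to_octaves(log_value: float) -> float:
    """Convert a difference in log10 units to octaves (factors of 2)."""
    return log_value / math.log10(2.0)


# Values quoted in the abstract (log10 arcsec).
mean_v10 = log10_arcsec_to_arcsec(1.54)   # mean threshold, ASTEROID v1.0
mean_v09 = log10_arcsec_to_arcsec(1.70)   # mean threshold, ASTEROID v0.9
version_ratio = mean_v09 / mean_v10       # how much lower v1.0 thresholds are
sd_factor = log10_to_factor(0.32)         # between-observer SD as a factor
cor_factor = log10_to_factor(0.46)        # 95% coefficient of repeatability
cor_octaves = log10_to_octaves(0.46)

print(f"v1.0 mean: {mean_v10:.1f} arcsec, v0.9 mean: {mean_v09:.1f} arcsec")
print(f"ratio v0.9/v1.0: {version_ratio:.2f}")
print(f"SD factor: {sd_factor:.2f}")
print(f"repeatability: factor {cor_factor:.2f}, {cor_octaves:.2f} octaves")
```

The results round to the figures given in the abstract: about 35 and 50 arcsec, a ratio of about 1.4, an SD factor of about 2, and a repeatability factor of about 2.9 (1.5 octaves).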
Stereotest Comparison: Efficacy, Reliability, and Variability of a New Glasses-Free Stereotest
Stereotest Comparison: Efficacy, Reliability, and Variability of a New Glasses-Free Stereotest by McCaslin AG, Vancleef K, Hubert L, Read JCA, Port NL, McCaslinVancleefHubertReadPort2020_compressed.pdf (0.6 MiB) - Purpose: To test the validity of the ASTEROID stereotest as a clinical test of depth perception by comparing it to clinical and research standard tests.
Methods: Thirty-nine subjects completed four stereotests twice: the ASTEROID test on an autostereo 3D tablet, a research standard on a VPixx PROPixx 3D projector, Randot Circles, and Randot Preschool. Within 14 days, subjects completed each test for a third time.
Results: ASTEROID stereo thresholds correlated well with research standard thresholds (r = 0.87, P < 0.001), although ASTEROID underestimated standard threshold (mean difference = 11 arcsec). ASTEROID results correlated less strongly with Randot Circles (r = 0.54, P < 0.001) and Randot Preschool (r = 0.64, P < 0.001), due to the greater measurement range of ASTEROID (1–1000 arcsec) compared to Randot Circles or Randot Preschool. Stereo threshold variability was low for all three clinical stereotests (Bland–Altman 95% limits of agreement between test and retest: ASTEROID, ±0.37; Randot Circles, ±0.24; Randot Preschool, ±0.23). ASTEROID captured the largest range of stereo in a normal population with test–retest reliability comparable to research standards (immediate r = 0.86 for ASTEROID vs. 0.90 for PROPixx; follow-up r = 0.68 for ASTEROID vs. 0.88 for PROPixx).
Conclusions: Compared to clinical and research standards for assessing depth perception, ASTEROID is highly accurate, has good test–retest reliability, and measures a wider range of stereo threshold.
Translational Relevance: The ASTEROID stereotest is a better clinical tool for determining baseline stereopsis and tracking changes during treatment for amblyopia and strabismus compared to current clinical tests.
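For readers unfamiliar with the Bland–Altman limits quoted in the Results, here is a minimal Python sketch of the usual calculation: the mean test/retest difference plus or minus 1.96 standard deviations of the differences. The data are hypothetical and assumed to be in log10 arcsec. This is not the authors' analysis code.

```python
import statistics


def bland_altman_limits(test, retest):
    """Return (bias, lower limit, upper limit) of the 95% limits of agreement."""
    diffs = [b - a for a, b in zip(test, retest)]
    bias = statistics.mean(diffs)             # mean test/retest difference
    half_width = 1.96 * statistics.stdev(diffs)  # 1.96 x sample SD of differences
    return bias, bias - half_width, bias + half_width


# Hypothetical thresholds (log10 arcsec) for five observers, for illustration only.
test_log = [1.5, 1.8, 1.2, 2.0, 1.6]
retest_log = [1.6, 1.7, 1.4, 1.9, 1.6]

bias, lower, upper = bland_altman_limits(test_log, retest_log)
print(f"bias = {bias:.3f}, 95% limits of agreement = [{lower:.3f}, {upper:.3f}]")
```

A smaller half-width means the test and retest agree more closely, which is the sense in which the abstract compares ±0.37, ±0.24 and ±0.23.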
Characterizing the Randot Preschool stereotest: Testability, norms, reliability, specificity and sensitivity in children aged 2-11 years
Characterizing the Randot Preschool stereotest: Testability, norms, reliability, specificity and sensitivity in children aged 2-11 years by Read JCA, Rafiq S, Hugill J, Casanova T, Black C, O'Neill A, Puyat V, Haggerty H, Smart K, Powell C, Taylor K, Clarke MP, Vancleef K, ReadEARandotPreschool2019e.pdf (1.8 MiB) - Purpose
To comprehensively assess the Randot Preschool stereo test in young children, including testability, normative values, test/retest reliability and sensitivity and specificity for detecting binocular vision disorders.
Methods
We tested 1005 children aged 2–11 years with the Randot Preschool stereo test, plus a cover/uncover test to detect heterotropia. Monocular visual acuity was assessed in both eyes using Keeler Crowded LogMAR visual acuity test for children aged 4 and over.
Results
Testability was very high: 65% in two-year-olds, 92% in three-year-olds and ~100% in older children. Normative values: In 389 children aged 2–5 with apparently normal vision, 6% of children scored nil (stereoblind). In those who obtained a threshold, the mean log threshold was 2.06 log10 arcsec, corresponding to 114 arcsec, and the median threshold was 100 arcsec. Most older children score 40 arcsec, the best available score. We found a small sex difference, with girls scoring slightly but significantly better. Test/retest reliability: ~99% for obtaining any score vs nil. Agreement between stereo thresholds is poor in children aged 2–5; 95% limit of agreement = 0.7 log10 arcsec, i.e. a five-fold change in stereo threshold may occur without any change in vision. In children over 5, the test essentially acts only as a binary classifier since almost all non-stereoblind children score 40 arcsec. Specificity (true negative rate): >95%. Sensitivity (true positive rate): poor, <50%, i.e. around half of children with a demonstrable binocular vision abnormality score well on the Randot Preschool.
Conclusions
The Randot Preschool is extremely accessible for even very young children, and is very reliable at classifying children into those who have any stereo vision vs those who are stereoblind. However, its ability to quantify stereo vision is limited by poor repeatability in children aged 5 and under, and a very limited range of scores relevant to children aged over 5.
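The specificity and sensitivity figures in the Results follow the standard definitions. The short Python sketch below shows the arithmetic. The counts are hypothetical and chosen only to illustrate a test with high specificity and poor sensitivity; they are not the study's data.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """True positive rate: fraction of affected children the test flags."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """True negative rate: fraction of unaffected children the test passes."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts, for illustration only.
sens = sensitivity(true_pos=45, false_neg=55)   # 45 of 100 affected children flagged
spec = specificity(true_neg=870, false_pos=30)  # 870 of 900 unaffected children pass

print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```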
Two choices good, four choices better: For measuring stereoacuity in children, a four-alternative forced-choice paradigm is more efficient than two
Two choices good, four choices better: For measuring stereoacuity in children, a four-alternative forced-choice paradigm is more efficient than two by Vancleef K, Read JCA, Herbert W, Goodship N, Woodhouse M, Serrano-Pedraza I, VancleefReadHerbertGoodshipWoodhouseSerranoPedraza2018.pdf (4.5 MiB) - Purpose
Measuring accurate thresholds in children can be challenging. A typical psychophysical
experiment is usually too long to keep children engaged. However, a reduction in the number of trials decreases the precision of the threshold estimate. We evaluated the efficiency
of forced-choice paradigms with 2 or 4 alternatives (2-AFC, 4-AFC) in a disparity detection
experiment. 4-AFC paradigms are statistically more efficient, but also more cognitively
demanding, which might offset their theoretical advantage in young children.
Methods
We ran simulations evaluating bias and precision of threshold estimates of 2-AFC and 4-AFC
paradigms. In addition, we measured disparity thresholds in 43 children (aged 6 to 17
years) with a 4-AFC paradigm and in 49 children (aged 4 to 17 years) with a 2-AFC paradigm, both using an adaptive weighted one-up one-down staircase.
Results
Simulations indicated a similar bias and precision for a 2-AFC paradigm with double the number of trials as a 4-AFC paradigm. On average, estimated threshold of the simulated data was equal to the model threshold, indicating no bias. The precision was improved with an increasing number of trials. Likewise, our data showed a similar bias and precision for a 2-AFC paradigm with 60 trials as for a 4-AFC paradigm with 30 trials. Trials in the 4-AFC paradigm took slightly longer as participants scanned more alternatives. However, the 4-AFC task still ended up faster for a given precision.
Conclusion
Bias and precision were similar in a 4-AFC task compared to a 2-AFC task with double the number of trials. However, a 4-AFC paradigm was more time efficient and is therefore recommended.
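To make the simulation idea concrete, the following Python sketch runs a weighted one-up one-down staircase for a simulated observer in a 2-AFC and a 4-AFC task. It is an illustrative toy, not the authors' simulation. The logistic psychometric function, its slope, the step sizes, the starting level and the estimator (mean stimulus level over the second half of the trials) are all assumptions made here.

```python
import math
import random
import statistics


def target_correct(n_alternatives: int) -> float:
    """Proportion correct halfway between chance and perfect performance."""
    guess = 1.0 / n_alternatives
    return (1.0 + guess) / 2.0


def p_correct(level, threshold, n_alternatives, slope=0.2):
    """Assumed logistic psychometric function of log disparity, with guessing."""
    guess = 1.0 / n_alternatives
    detect = 1.0 / (1.0 + math.exp(-(level - threshold) / slope))
    return guess + (1.0 - guess) * detect


def step_up_for(n_alternatives: int, step_down: float) -> float:
    """Weighted up-down rule: step_up / step_down = p / (1 - p) at the target p."""
    p = target_correct(n_alternatives)
    return step_down * p / (1.0 - p)


def run_staircase(n_alternatives, n_trials, threshold, rng,
                  start_offset=0.6, step_down=0.1):
    """Run one staircase; estimate threshold as mean level over the last half."""
    step_up = step_up_for(n_alternatives, step_down)
    level = threshold + start_offset          # start at an easy level
    levels = []
    for _ in range(n_trials):
        levels.append(level)
        if rng.random() < p_correct(level, threshold, n_alternatives):
            level -= step_down                # correct: make it harder
        else:
            level += step_up                  # incorrect: make it easier
    return statistics.mean(levels[n_trials // 2:])


def simulate(n_alternatives, n_trials, n_runs=500, threshold=1.7, seed=1):
    """Return (bias, precision) of the estimates over many simulated runs."""
    rng = random.Random(seed)
    estimates = [run_staircase(n_alternatives, n_trials, threshold, rng)
                 for _ in range(n_runs)]
    return statistics.mean(estimates) - threshold, statistics.stdev(estimates)


bias_4afc, sd_4afc = simulate(n_alternatives=4, n_trials=30)
bias_2afc, sd_2afc = simulate(n_alternatives=2, n_trials=60)
print(f"4-AFC, 30 trials: bias {bias_4afc:+.3f}, SD {sd_4afc:.3f} (log10 units)")
print(f"2-AFC, 60 trials: bias {bias_2afc:+.3f}, SD {sd_2afc:.3f} (log10 units)")
```

The weighting makes the staircase settle where the expected upward and downward movements balance, which is the assumed target performance level (75% correct for 2-AFC, 62.5% for 4-AFC). The printed bias and SD depend on the assumed parameters and should not be read as reproducing the paper's numbers.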
ASTEROID: A New Clinical Stereotest on an Autostereo 3D Tablet
ASTEROID: A New Clinical Stereotest on an Autostereo 3D Tablet by Vancleef K, Serrano-Pedraza I, Sharp C, Slack G, Black C, Casanova T, Hugill J, Rafiq S, Burridge J, Puyat V, Ewane Enongue J, Gale H, Akotei H, Collier Z, Haggerty H, Smart K, Powell C, Taylor K, Clarke MP, Morgan G, Read JCA, VancleefEA_ASTEROIDMethods.pdf (1.9 MiB) - Purpose: To describe a new stereotest in the form of a game on an autostereoscopic
tablet computer designed to be suitable for use in the eye clinic and present data on
its reliability and the distribution of stereo thresholds in adults.
Methods: Test stimuli were four dynamic random-dot stereograms, one of which
contained a disparate target. Feedback was given after each trial presentation. A
Bayesian adaptive staircase adjusted target disparity. Threshold was estimated from the
mean of the posterior distribution after 20 responses. Viewing distance was monitored
via a forehead sticker viewed by the tablet’s front camera, and screen parallax was
adjusted dynamically so as to achieve the desired retinal disparity.
Results: The tablet must be viewed at a distance of greater than ~35 cm to produce a
good depth percept. Log thresholds were roughly normally distributed with a mean
of 1.75 log10 arcsec = 56 arcsec and SD of 0.34 log10 arcsec = a factor of 2.2. The
standard deviation agrees with previous studies, but ASTEROID thresholds are
approximately 1.5 times higher than a similar stereotest on stereoscopic 3D TV or on
Randot Preschool stereotests. Pearson correlation between successive tests in same
observer was 0.80. Bland-Altman 95% limits of reliability were ±0.64 log10 arcsec = a
factor of 4.3, corresponding to an SD of 0.32 log10 arcsec on individual threshold
estimates. This is similar to other stereotests and close to the statistical limit for 20
responses.
Conclusions: ASTEROID is reliable, easy, and portable and thus well-suited for clinical
stereoacuity measurements.
Translational Relevance: New 3D digital technology means that research-quality
psychophysical measurement of stereoacuity is now feasible in the clinic.
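The Methods note that screen parallax is adjusted with viewing distance to hold retinal disparity constant. A minimal Python sketch of that relationship follows. It assumes the common small-angle approximation (parallax ≈ viewing distance × disparity in radians) for a target near the midline. It is not the ASTEROID implementation, and the function name is hypothetical.

```python
import math

# Number of arcseconds in one radian.
ARCSEC_PER_RADIAN = 3600.0 * 180.0 / math.pi


def parallax_cm(disparity_arcsec: float, viewing_distance_cm: float) -> float:
    """Approximate on-screen parallax (cm) giving the requested retinal disparity.

    Small-angle approximation: parallax = distance * disparity (in radians).
    """
    disparity_rad = disparity_arcsec / ARCSEC_PER_RADIAN
    return viewing_distance_cm * disparity_rad


# The same 56 arcsec disparity needs more parallax at a larger viewing distance.
near = parallax_cm(56.0, 35.0)
far = parallax_cm(56.0, 50.0)
print(f"35 cm: {near * 1e4:.0f} micrometres, 50 cm: {far * 1e4:.0f} micrometres")
```

Under this approximation the required parallax scales linearly with distance, which is why an app that tracks viewing distance can rescale the stimulus on each trial.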
Analysis of Soft Data for Mass Provision of Stereoacuity Testing Through a Serious Game for Health
Analysis of Soft Data for Mass Provision of Stereoacuity Testing Through a Serious Game for Health by Ushaw G, Sharp S, Hugill J, Rafiq S, Black C, Casanova T, Vancleef K, Read JCA, Morgan G, UshawSharpHugillRafiqBlackCasanovaVancleefReadMorgan2017.pdf (1.7 MiB) - Mass provision of healthcare through a digital medium can be greatly enhanced by the use of serious games. The accessibility and engagement provided by a serious game to the subject can significantly increase participation. The commercial games industry employs numerous techniques to analyse soft data collected from early users of an application to evolve the application itself and improve the experience of playing it. A game for mass stereoacuity testing of young children is used as a case study in this paper, to illustrate how soft feedback can be used to improve the effectiveness of a clinical trial. The key to the approach is identified as rapid incremental evolution of the application and trial protocol in a manner which increases the amount and usefulness of soft data collected, and reacts to issues identified in the soft data in a timely fashion. It is hoped that the approach can be adopted for a wide range of digital applications for mass health provision.
ASTEROID: Accurate STEReoacuity measurement in the eye clinic
ASTEROID: Accurate STEReoacuity measurement in the eye clinic by Read JCA, Vancleef K, Serrano-Pedraza I, Morgan G, Sharp C, Clarke MP, ReadetalASTEROIDECVP2015.png (0.2 MiB)
Overestimation of stereo thresholds by the TNO stereotest is not due to global stereopsis.
Overestimation of stereo thresholds by the TNO stereotest is not due to global stereopsis. by Vancleef K, Read JCA, Herbert W, Goodship N, Woodhouse M, Serrano-Pedraza I, VancleefReadHerbertGoodshipWoodhouseSerranoPedraza2017_2.pdf (18 KiB) - Purpose
It has been repeatedly shown that the TNO stereotest overestimates stereo threshold compared to other clinical stereotests. In the current study, we test whether this overestimation can be attributed to a distinction between ‘global’ (or ‘cyclopean’) and ‘local’ (feature or contour-based) stereopsis.
Methods
We compared stereo thresholds of a global (TNO) and a local clinical stereotest (Randot Circles). In addition, a global and a local psychophysical stereotest were added to the design. One hundred and forty-nine children between 4 and 16 years old were included in the study.
Results
Stereo threshold estimates with TNO were a factor of two higher than with any of the other stereotests. No significant differences were found between the other tests. Bland-Altman analyses also indicated low agreement between TNO and the other stereotests, especially for higher stereo threshold estimates. Simulations indicated that the TNO test protocol and test disparities can account for part of this effect.
Discussion
The results indicate that the global – local distinction is an unlikely explanation for the overestimated thresholds of TNO. Test protocol and disparities are one contributing factor. Potential additional factors include the nature of the task (TNO requires depth discrimination rather than detection) and the use of anaglyph red/green 3D glasses rather than polarizing filters, which may reduce binocular fusion.
Avoiding monocular artifacts in clinical stereotests presented on column-interleaved digital stereoscopic displays
Avoiding monocular artifacts in clinical stereotests presented on column-interleaved digital stereoscopic displays by Serrano-Pedraza I, Vancleef K, Read JCA, SerranoPedrazaVancleefRead.pdf (1.5 MiB) - New forms of stereoscopic 3-D technology offer vision
scientists new opportunities for research, but also
come with distinct problems. Here we consider
autostereo displays where the two eyes’ images are
spatially interleaved in alternating columns of pixels
and no glasses or special optics are required. Column-interleaved
displays produce an excellent stereoscopic
effect, but subtle changes in the angle of view can
increase cross talk or even interchange the left and
right eyes’ images. This creates several challenges to
the presentation of cyclopean stereograms (containing
structure which is only detectable by binocular vision).
We discuss the potential artifacts, including one that is
unique to column-interleaved displays, whereby scene
elements such as dots in a random-dot stereogram
appear wider or narrower depending on the sign of
their disparity. We derive an algorithm for creating
stimuli which are free from this artifact. We show that
this and other artifacts can be avoided by (a) using a
task which is robust to disparity-sign inversion—for
example, a disparity-detection rather than
discrimination task—(b) using our proposed algorithm
to ensure that parallax is applied symmetrically on the
column-interleaved display, and (c) using a dynamic
stimulus to avoid monocular artifacts from motion
parallax. In order to test our recommendations, we
performed two experiments using a stereoacuity task
implemented with a parallax-barrier tablet. Our
results confirm that these recommendations eliminate
the artifacts. We believe that these recommendations
will be useful to vision scientists interested in running
stereo psychophysics experiments using parallax-barrier
and other column-interleaved digital displays.
Viewing 3D TV over two months produces no discernible effects on balance, coordination or eyesight.
Viewing 3D TV over two months produces no discernible effects on balance, coordination or eyesight. by Read JCA, Godfrey A, Bohr I, Simonotto J, Galna B, Smulders TV, ReadGodfreyBohrSimonottoGalnaSmulders2016.pdf (2.2 MiB) - With the rise in stereoscopic 3D media, there has been concern that viewing stereoscopic 3D (S3D) content could have long-term adverse effects, but little data are available. In the first study to address this, 28 households who did not currently own a 3D TV were given a new TV set, either S3D or 2D. The 116 members of these households all underwent tests of balance, coordination and eyesight, both before they received their new TV set, and after they had owned it for 2 months. We did not detect any changes which appeared to be associated with viewing 3D TV. We conclude that viewing 3D TV does not produce detectable effects on balance, coordination or eyesight over the timescale studied. Practitioner Summary: Concern has been expressed over possible long-term effects of stereoscopic 3D (S3D). We looked for any changes in vision, balance and coordination associated with normal home S3D TV viewing in the 2 months after first acquiring a 3D TV. We find no evidence of any changes over this timescale.