AUTHOR=Martinez-Garcia Marina , Bertalmío Marcelo , Malo Jesús 

TITLE=In Praise of Artifice Reloaded: Caution With Natural Image Databases in Modeling Vision

JOURNAL=Frontiers in Neuroscience

VOLUME=13

YEAR=2019

URL=https://www.frontiersin.org/journals/neuroscience/articles/10.3389/fnins.2019.00008

DOI=10.3389/fnins.2019.00008

ISSN=1662-453X

ABSTRACT=<p>Subjective image quality databases are a major source of raw data on how the visual system works in <italic>naturalistic environments</italic>. These databases describe the sensitivity of many observers to a wide range of distortions of differing nature and intensity seen on top of a variety of natural images. Data of this kind seems to open a number of possibilities for the vision scientist to check models in realistic scenarios. However, while these natural databases are great benchmarks for models developed in some other way (e.g., by using the well-controlled <italic>artificial stimuli</italic> of traditional psychophysics), they should be used with care when fitting vision models. Given the high dimensionality of the image space, it is very likely that some basic phenomena are under-represented in the database. Therefore, a model fitted on these large-scale natural databases will not reproduce these under-represented basic phenomena, which could otherwise be easily illustrated with well-selected artificial stimuli. In this work we study a specific example of the above statement. A standard cortical model using wavelets and divisive normalization, tuned to reproduce subjective opinion on a large image quality dataset, fails to reproduce basic cross-masking. Here we outline a solution to this problem by using artificial stimuli and by proposing a modification that makes the model easier to tune. Then, we show that the modified model is still competitive on the large-scale database. Our simulations with these artificial stimuli show that, when using steerable wavelets, the conventional unit-norm Gaussian kernels in divisive normalization should be multiplied by high-pass filters to reproduce basic trends in masking. Basic visual phenomena may be misrepresented in large natural image datasets, but this can be solved with model-interpretable stimuli. This is an additional argument <italic>in praise of artifice</italic> in line with Rust and Movshon (<xref ref-type="bibr" rid="B55">2005</xref>).</p>