
Why is Real-World Visual Object Recognition Hard?

Author

Listed:
  • Nicolas Pinto
  • David D Cox
  • James J DiCarlo

Abstract

Progress in understanding the brain mechanisms underlying vision requires the construction of computational models that not only emulate the brain's anatomy and physiology, but ultimately match its performance on visual tasks. In recent years, “natural” images have become popular in the study of vision and have been used to show apparently impressive progress in building such models. Here, we challenge the use of uncontrolled “natural” images in guiding that progress. In particular, we show that a simple V1-like model (a neuroscientist's “null” model, which should perform poorly at real-world visual object recognition tasks) outperforms state-of-the-art object recognition systems (biologically inspired and otherwise) on a standard, ostensibly natural image recognition test. As a counterpoint, we designed a “simpler” recognition test to better span the real-world variation in object pose, position, and scale, and we show that this test correctly exposes the inadequacy of the V1-like model. Taken together, these results demonstrate that tests based on uncontrolled natural images can be seriously misleading, potentially guiding progress in the wrong direction. Instead, we reexamine what it means for images to be natural and argue for a renewed focus on the core problem of object recognition: real-world image variation.

Author Summary: The ease with which we recognize visual objects belies the computational difficulty of this feat. At the core of this challenge is image variation: any given object can cast an infinite number of different images onto the retina, depending on the object's position, size, orientation, pose, lighting, etc. Recent computational models have sought to match humans' remarkable visual abilities, and, using large databases of “natural” images, have shown apparently impressive progress. Here we show that caution is warranted. In particular, we found that a very simple neuroscience “toy” model, capable only of extracting trivial regularities from a set of images, is able to outperform most state-of-the-art object recognition systems on a standard “natural” test of object recognition. At the same time, we found that this same toy model is easily defeated by a simple recognition test that we generated to better span the range of image variation observed in the real world. Together these results suggest that current “natural” tests are inadequate for judging success or driving forward progress. In addition to tempering claims of success in the machine vision literature, these results point the way forward and call for renewed focus on image variation as a central challenge in object recognition.
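
To make the point about image variation concrete, the following is a minimal sketch, not the authors' model or test. The abstract does not specify how the V1-like model is built, so the sketch assumes a small bank of oriented Gabor filters, a common textbook stand-in for V1 simple cells. The function names (`gabor_kernel`, `v1_like_features`), all parameter values, and the synthetic bar image are hypothetical and chosen only for illustration. It shows how a change in object position alone can sharply reduce the similarity between two feature vectors of the same object.

```python
import numpy as np


def gabor_kernel(size, wavelength, theta, sigma):
    """Return a cosine-phase, zero-mean, unit-norm Gabor patch (size must be odd)."""
    half = size // 2
    ys, xs = np.mgrid[-half:half + 1, -half:half + 1]
    # Rotate coordinates so the sinusoidal carrier varies along direction theta.
    x_rot = xs * np.cos(theta) + ys * np.sin(theta)
    envelope = np.exp(-(xs ** 2 + ys ** 2) / (2.0 * sigma ** 2))
    kernel = envelope * np.cos(2.0 * np.pi * x_rot / wavelength)
    kernel -= kernel.mean()
    return kernel / np.linalg.norm(kernel)


def v1_like_features(image, n_orientations=4, size=9, wavelength=4.0, sigma=2.0):
    """Filter an image with an oriented Gabor bank, rectify, and flatten.

    Assumption: this is an illustrative stand-in, not the published model.
    """
    image = np.asarray(image, dtype=float)
    shape = image.shape
    spectrum = np.fft.rfft2(image)
    maps = []
    for k in range(n_orientations):
        theta = np.pi * k / n_orientations
        kernel = gabor_kernel(size, wavelength, theta, sigma)
        # Circular convolution via the FFT; the kernel is zero-padded to the image size.
        response = np.fft.irfft2(spectrum * np.fft.rfft2(kernel, s=shape), s=shape)
        maps.append(np.abs(response))  # full-wave rectification
    return np.concatenate([m.ravel() for m in maps])


# Hypothetical "object": a vertical bar on a blank 32x32 background.
image = np.zeros((32, 32))
image[8:24, 8:10] = 1.0

# The same object moved 8 pixels to the right (position variation only).
shifted = np.roll(image, 8, axis=1)

f0 = v1_like_features(image)
f1 = v1_like_features(shifted)
cosine = float(f0 @ f1 / (np.linalg.norm(f0) * np.linalg.norm(f1)))
print(f"cosine similarity after an 8-pixel shift: {cosine:.3f}")
```

Because the filtering here is circular, the shifted image produces the same responses at different locations, so the two feature vectors have equal norm yet point in different directions. A classifier working directly on such features would have to learn position tolerance from examples, which a test set with little position variation never demands.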

Suggested Citation

  • Nicolas Pinto & David D Cox & James J DiCarlo, 2008. "Why is Real-World Visual Object Recognition Hard?," PLOS Computational Biology, Public Library of Science, vol. 4(1), pages 1-6, January.
  • Handle: RePEc:plo:pcbi00:0040027
    DOI: 10.1371/journal.pcbi.0040027

    Download full text from publisher

    File URL: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.0040027
    Download Restriction: no

    File URL: https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.0040027&type=printable
    Download Restriction: no

    File URL: https://libkey.io/10.1371/journal.pcbi.0040027?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    Citations

    Cited by:

    1. Yuri Vankov & Aleksey Rumyantsev & Shamil Ziganshin & Tatyana Politova & Rinat Minyazev & Ayrat Zagretdinov, 2020. "Assessment of the Condition of Pipelines Using Convolutional Neural Networks," Energies, MDPI, vol. 13(3), pages 1-12, February.
    2. Pavel Škrabánek & Alexandra Zahradníková jr., 2019. "Automatic assessment of the cardiomyocyte development stages from confocal microscopy images using deep convolutional networks," PLOS ONE, Public Library of Science, vol. 14(5), pages 1-18, May.
    3. Sebastian Bach & Alexander Binder & Grégoire Montavon & Frederick Klauschen & Klaus-Robert Müller & Wojciech Samek, 2015. "On Pixel-Wise Explanations for Non-Linear Classifier Decisions by Layer-Wise Relevance Propagation," PLOS ONE, Public Library of Science, vol. 10(7), pages 1-46, July.
    4. Dileep George & Jeff Hawkins, 2009. "Towards a Mathematical Theory of Cortical Micro-circuits," PLOS Computational Biology, Public Library of Science, vol. 5(10), pages 1-26, October.
    5. Pedro Malaca & Luis F. Rocha & D. Gomes & João Silva & Germano Veiga, 2019. "Online inspection system based on machine learning techniques: real case study of fabric textures classification for the automotive industry," Journal of Intelligent Manufacturing, Springer, vol. 30(1), pages 351-361, January.
    6. Hailay Hagos Entahabu & Amare Sewnet Minale & Emiru Birhane, 2023. "Modeling and Predicting Land Use/Land Cover Change Using the Land Change Modeler in the Suluh River Basin, Northern Highlands of Ethiopia," Sustainability, MDPI, vol. 15(10), pages 1-15, May.
    7. Xiaofu He & Zhiyong Yang & Joe Z Tsien, 2011. "A Hierarchical Probabilistic Model for Rapid Object Categorization in Natural Scenes," PLOS ONE, Public Library of Science, vol. 6(5), pages 1-15, May.
    8. Qianli Yang & Edgar Walker & R. James Cotton & Andreas S. Tolias & Xaq Pitkow, 2021. "Revealing nonlinear neural decoding by analyzing choices," Nature Communications, Nature, vol. 12(1), pages 1-13, December.
