
A New Approach to Model Pitch Perception Using Sparse Coding

Author

Listed:
  • Oded Barzelay
  • Miriam Furst
  • Omri Barak

Abstract

Our acoustical environment abounds with repetitive sounds, some of which are related to pitch perception. It is still unknown how the auditory system, in processing these sounds, relates a physical stimulus to its percept. Since, in mammals, all auditory stimuli are conveyed into the nervous system through the auditory nerve (AN) fibers, a model should explain the perception of pitch as a function of this particular input. However, pitch perception is invariant to certain features of the physical stimulus. For example, a missing-fundamental stimulus with resolved or unresolved harmonics, or low- and high-amplitude stimuli with the same spectral content, all give rise to the same pitch percept. In contrast, the AN representations of these different stimuli are not invariant to these effects. In fact, due to the saturation and non-linearity of both the cochlear and the inner hair cell responses, these differences are enhanced by the AN fibers. It is therefore difficult to explain how the pitch percept arises from the activity of the AN fibers. We introduce a novel approach for extracting pitch cues from the AN population activity for a given arbitrary stimulus. The method is based on a technique known as sparse coding (SC), which represents pitch cues by a few spatiotemporal atoms (templates) chosen from a large set of possible ones (a dictionary). The amount of activity of each atom is represented by a non-zero coefficient, analogous to an active neuron. Such a technique has been successfully applied to other modalities, particularly vision. The model is composed of a cochlear model, an SC processing unit, and a harmonic sieve. We show that the model copes with different pitch phenomena: extracting resolved and non-resolved harmonics, missing fundamental pitches, stimuli with both high and low amplitudes, iterated rippled noises, and recorded musical instruments.

Author Summary: By means of a sound's pitch, we can easily distinguish between low and high musical notes, regardless of whether they originate from a guitar, a piano or a vocalist. The relation between different sounds that yield the same percept is what makes pitch an interesting subject of research. Today, despite extensive research, the mechanism behind this physical-to-perceptual transformation is still unclear. The large dynamic range of the cochlea, combined with its nonlinear nature, makes modeling and understanding this process a challenging task. Given the large amount of physiological and psychological data, a general explanation consistent with many of these phenomena would be a major step in elucidating the nature of pitch perception. In this paper, we recast the problem in the general framework of sparse coding of sensory stimuli. This framework, initially developed for the visual modality, posits that the goal of the neural representation is to represent the flow of sensory information in a concise and parsimonious way. We show that applying this principle to the problem of pitch perception can explain many perceptual phenomena.
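
To make the sparse-coding idea in the abstract concrete, the following is a minimal sketch and not the authors' model. It assumes a toy dictionary of spectral harmonic templates on a 50 Hz grid, whereas the paper uses spatiotemporal atoms derived from simulated AN population activity, and it uses greedy matching pursuit as a stand-in sparse solver. The names harmonic_atom and matching_pursuit, the frequency grid, and the candidate pitches are all hypothetical choices made for illustration.

```python
import math

def harmonic_atom(f0, grid):
    """Unit-norm template with equal weight at every harmonic of f0 on the grid."""
    raw = [1.0 if f % f0 == 0 else 0.0 for f in grid]
    norm = math.sqrt(sum(v * v for v in raw))
    return [v / norm for v in raw]

def matching_pursuit(signal, dictionary, n_atoms=1):
    """Greedy sparse approximation: repeatedly pick the atom that best
    matches the residual and subtract its contribution."""
    residual = list(signal)
    coeffs = {}
    for _ in range(n_atoms):
        scores = {key: sum(a * r for a, r in zip(atom, residual))
                  for key, atom in dictionary.items()}
        best = max(scores, key=lambda key: abs(scores[key]))
        coeffs[best] = coeffs.get(best, 0.0) + scores[best]
        residual = [r - scores[best] * a
                    for r, a in zip(residual, dictionary[best])]
    return coeffs, residual

# Toy frequency axis (Hz) and candidate pitches; both are assumptions.
grid = list(range(50, 1201, 50))
candidates = [100, 150, 200, 250, 300, 400, 500, 600, 800, 1000]
dictionary = {f0: harmonic_atom(f0, grid) for f0 in candidates}

# Missing-fundamental stimulus: harmonics 3, 4 and 5 of 200 Hz, no energy at 200 Hz.
stimulus = [1.0 if f in (600, 800, 1000) else 0.0 for f in grid]

coeffs, residual = matching_pursuit(stimulus, dictionary, n_atoms=1)
pitch = max(coeffs, key=lambda key: abs(coeffs[key]))
print(pitch)  # 200
```

In this toy setting a single non-zero coefficient, on the 200 Hz atom, accounts for the stimulus even though the stimulus has no energy at 200 Hz. Normalizing each atom acts as a crude penalty on subharmonic candidates such as 100 Hz, whose templates contain many components that the stimulus lacks.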

Suggested Citation

  • Oded Barzelay & Miriam Furst & Omri Barak, 2017. "A New Approach to Model Pitch Perception Using Sparse Coding," PLOS Computational Biology, Public Library of Science, vol. 13(1), pages 1-36, January.
  • Handle: RePEc:plo:pcbi00:1005338
    DOI: 10.1371/journal.pcbi.1005338

    Download full text from publisher

    File URL: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1005338
    Download Restriction: no

    File URL: https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1005338&type=printable
    Download Restriction: no

    File URL: https://libkey.io/10.1371/journal.pcbi.1005338?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    References listed on IDEAS

    1. Evan C. Smith & Michael S. Lewicki, 2006. "Efficient auditory coding," Nature, Nature, vol. 439(7079), pages 978-982, February.
    2. Daniel Bendor & Xiaoqin Wang, 2005. "The neuronal representation of pitch in primate auditory cortex," Nature, Nature, vol. 436(7054), pages 1161-1165, August.

    Citations

    Cited by:

    1. Mark R. Saddler & Ray Gonzalez & Josh H. McDermott, 2021. "Deep neural network models reveal interplay of peripheral coding and stimulus statistics in pitch perception," Nature Communications, Nature, vol. 12(1), pages 1-25, December.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Noga Mosheiff & Haggai Agmon & Avraham Moriel & Yoram Burak, 2017. "An efficient coding theory for a dynamic trajectory predicts non-uniform allocation of entorhinal grid cells to modules," PLOS Computational Biology, Public Library of Science, vol. 13(6), pages 1-19, June.
    2. Mark R. Saddler & Ray Gonzalez & Josh H. McDermott, 2021. "Deep neural network models reveal interplay of peripheral coding and stimulus statistics in pitch perception," Nature Communications, Nature, vol. 12(1), pages 1-25, December.
    3. Sam V Norman-Haignere & Josh H McDermott, 2018. "Neural responses to natural and model-matched stimuli reveal distinct computations in primary and nonprimary auditory cortex," PLOS Biology, Public Library of Science, vol. 16(12), pages 1-46, December.
    4. Falk Lieder & Klaas E Stephan & Jean Daunizeau & Marta I Garrido & Karl J Friston, 2013. "A Neurocomputational Model of the Mismatch Negativity," PLOS Computational Biology, Public Library of Science, vol. 9(11), pages 1-14, November.
    5. Philip J Monahan & Kevin de Souza & William J Idsardi, 2008. "Neuromagnetic Evidence for Early Auditory Restoration of Fundamental Pitch," PLOS ONE, Public Library of Science, vol. 3(8), pages 1-6, August.
    6. Gwangsu Kim & Dong-Kyum Kim & Hawoong Jeong, 2024. "Spontaneous emergence of rudimentary music detectors in deep neural networks," Nature Communications, Nature, vol. 15(1), pages 1-11, December.
    7. Jonathan J Hunt & Peter Dayan & Geoffrey J Goodhill, 2013. "Sparse Coding Can Predict Primary Visual Cortex Receptive Field Changes Induced by Abnormal Visual Input," PLOS Computational Biology, Public Library of Science, vol. 9(5), pages 1-17, May.
    8. Jacob N Oppenheim & Pavel Isakov & Marcelo O Magnasco, 2013. "Degraded Time-Frequency Acuity to Time-Reversed Notes," PLOS ONE, Public Library of Science, vol. 8(6), pages 1-6, June.
    9. Lubomir Kostal & Petr Lansky & Jean-Pierre Rospars, 2008. "Efficient Olfactory Coding in the Pheromone Receptor Neuron of a Moth," PLOS Computational Biology, Public Library of Science, vol. 4(4), pages 1-11, April.
    10. Jonathan Schaffner & Sherry Dongqi Bao & Philippe N. Tobler & Todd A. Hare & Rafael Polania, 2023. "Sensory perception relies on fitness-maximizing codes," Nature Human Behaviour, Nature, vol. 7(7), pages 1135-1151, July.
    11. R Channing Moore & Tyler Lee & Frédéric E Theunissen, 2013. "Noise-invariant Neurons in the Avian Auditory Cortex: Hearing the Song in Noise," PLOS Computational Biology, Public Library of Science, vol. 9(3), pages 1-14, March.
    12. Gonzalo H Otazu & Christian Leibold, 2011. "A Corticothalamic Circuit Model for Sound Identification in Complex Scenes," PLOS ONE, Public Library of Science, vol. 6(9), pages 1-15, September.
    13. Daniel Bendor, 2015. "The Role of Inhibition in a Computational Model of an Auditory Cortical Neuron during the Encoding of Temporal Information," PLOS Computational Biology, Public Library of Science, vol. 11(4), pages 1-25, April.
    14. Lingyun Zhao & Li Zhaoping, 2011. "Understanding Auditory Spectro-Temporal Receptive Fields and Their Changes with Input Statistics by Efficient Coding Principles," PLOS Computational Biology, Public Library of Science, vol. 7(8), pages 1-16, August.
    15. Christophe Micheyl & Paul R Schrater & Andrew J Oxenham, 2013. "Auditory Frequency and Intensity Discrimination Explained Using a Cortical Population Rate Code," PLOS Computational Biology, Public Library of Science, vol. 9(11), pages 1-7, November.
    16. Tomas Barta & Lubomir Kostal, 2019. "The effect of inhibition on rate code efficiency indicators," PLOS Computational Biology, Public Library of Science, vol. 15(12), pages 1-21, December.
    17. Patrick C M Wong & Bharath Chandrasekaran & Jing Zheng, 2012. "The Derived Allele of ASPM Is Associated with Lexical Tone Perception," PLOS ONE, Public Library of Science, vol. 7(4), pages 1-8, April.
    18. Philippe Albouy & Samuel A. Mehr & Roxane S. Hoyer & Jérémie Ginzburg & Yi Du & Robert J. Zatorre, 2024. "Spectro-temporal acoustical markers differentiate speech from song across cultures," Nature Communications, Nature, vol. 15(1), pages 1-13, December.
    19. Joseph D. Zak & Gautam Reddy & Vaibhav Konanur & Venkatesh N. Murthy, 2024. "Distinct information conveyed to the olfactory bulb by feedforward input from the nose and feedback from the cortex," Nature Communications, Nature, vol. 15(1), pages 1-16, December.
    20. Weiping Yang & Jingjing Yang & Yulin Gao & Xiaoyu Tang & Yanna Ren & Satoshi Takahashi & Jinglong Wu, 2015. "Effects of Sound Frequency on Audiovisual Integration: An Event-Related Potential Study," PLOS ONE, Public Library of Science, vol. 10(9), pages 1-15, September.
