
Seriation in Paleontological Data Using Markov Chain Monte Carlo Methods

Author

Listed:
  • Kai Puolamäki
  • Mikael Fortelius
  • Heikki Mannila

Abstract

Given a collection of fossil sites with data about the taxa that occur in each site, the task in biochronology is to find good estimates for the ages or ordering of sites. We describe a full probabilistic model for fossil data. The parameters of the model are natural: the ordering of the sites, the origination and extinction times for each taxon, and the probabilities of different types of errors. We show that the posterior distributions of these parameters can be estimated reliably by using Markov chain Monte Carlo techniques. The posterior distributions of the model parameters can be used to answer many different questions about the data, including seriation (finding the best ordering of the sites) and outlier detection. We demonstrate the usefulness of the model and estimation method on synthetic data and on real data on large late Cenozoic mammals. As an example, for the sites with a large number of occurrences of common genera, our methods give orderings whose correlation with geochronologic ages is 0.95.

Synopsis: Seriation, the task of temporal ordering of fossil occurrences by numerical methods, and correlation, the task of determining temporal equivalence, are fundamental problems in paleontology. With the increasing use of large databases of fossil occurrences in paleontological research, the need is increasing for seriation methods that can be used on data with limited or disparate age information. This paper describes a simple probabilistic model of site ordering and taxon occurrences. As there can be several parameter settings that have about equally good fit with the data, the authors use the Bayesian approach and Markov chain Monte Carlo methods to obtain a sample of parameter values describing the data. As an example, the method is applied to a dataset on Cenozoic mammals. The orderings produced by the method agree well with the orderings of the sites with known geochronologic ages.
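The kind of model the abstract outlines can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a presence/absence matrix, a lifespan for each taxon expressed as a range of positions in the site ordering, uniform priors, and fixed error probabilities (the paper treats the error probabilities as parameters to be estimated). The names `C`, `D`, `log_lik`, `propose`, `sample`, `mean_positions` and the `TOY` data are all hypothetical and chosen only for illustration.

```python
import math
import random

# Assumed, fixed error rates (illustrative values, not taken from the paper).
C = 0.05  # probability of recording a taxon outside its lifespan (false positive)
D = 0.20  # probability of missing a taxon inside its lifespan (false negative)

def log_lik(data, order, spans):
    """Log-likelihood of a 0/1 matrix data[site][taxon] given a site ordering
    and, per taxon, an inclusive (first, last) range of ordering positions."""
    ll = 0.0
    for pos, site in enumerate(order):
        for taxon, (first, last) in enumerate(spans):
            p_present = 1.0 - D if first <= pos <= last else C
            ll += math.log(p_present if data[site][taxon] else 1.0 - p_present)
    return ll

def propose(order, spans, rng):
    """Symmetric proposal: swap two sites, or move one lifespan endpoint by 1."""
    order, spans = list(order), list(spans)
    n = len(order)
    if rng.random() < 0.5:
        i, j = rng.randrange(n), rng.randrange(n)
        order[i], order[j] = order[j], order[i]
    else:
        t = rng.randrange(len(spans))
        first, last = spans[t]
        step = rng.choice((-1, 1))
        if rng.random() < 0.5:
            first += step
        else:
            last += step
        if 0 <= first <= last < n:  # invalid moves leave the state unchanged
            spans[t] = (first, last)
    return order, spans

def sample(data, n_iter=5000, thin=10, seed=0):
    """Metropolis sampler; returns thinned site orderings after burn-in."""
    rng = random.Random(seed)
    n_sites, n_taxa = len(data), len(data[0])
    order = list(range(n_sites))
    rng.shuffle(order)
    spans = [(0, n_sites - 1)] * n_taxa
    ll = log_lik(data, order, spans)
    samples = []
    for it in range(n_iter):
        new_order, new_spans = propose(order, spans, rng)
        new_ll = log_lik(data, new_order, new_spans)
        # Uniform priors and a symmetric proposal: accept with min(1, ratio).
        if rng.random() < math.exp(min(0.0, new_ll - ll)):
            order, spans, ll = new_order, new_spans, new_ll
        if it >= n_iter // 2 and it % thin == 0:
            samples.append(list(order))
    return samples

def mean_positions(samples):
    """Posterior mean position of each site across the sampled orderings."""
    totals = [0.0] * len(samples[0])
    for order in samples:
        for pos, site in enumerate(order):
            totals[site] += pos
    return [t / len(samples) for t in totals]

# Toy data: 6 sites (rows) by 4 taxa (columns), already in a banded order.
TOY = [
    [1, 0, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 1, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 0, 1],
]

if __name__ == "__main__":
    print(mean_positions(sample(TOY)))
```

In a sketch like this, an ordering and its reverse fit the data equally well, so sampled orderings are meaningful only up to direction. Sites whose sampled positions vary widely are the ones the data constrain least.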

Suggested Citation

  • Kai Puolamäki & Mikael Fortelius & Heikki Mannila, 2006. "Seriation in Paleontological Data Using Markov Chain Monte Carlo Methods," PLOS Computational Biology, Public Library of Science, vol. 2(2), pages 1-9, February.
  • Handle: RePEc:plo:pcbi00:0020006
    DOI: 10.1371/journal.pcbi.0020006

    Download full text from publisher

    File URL: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.0020006
    Download Restriction: no

    File URL: https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.0020006&type=printable
    Download Restriction: no


    References listed on IDEAS

    1. Quirin Schiermeier, 2003. "Setting the record straight," Nature, Nature, vol. 424(6948), pages 482-483, July.
    2. Halekoh, U. & Vach, W., 2004. "A Bayesian approach to seriation problems in archaeology," Computational Statistics & Data Analysis, Elsevier, vol. 45(3), pages 651-673, April.

    Citations

    Cited by:

    1. Olena Morozova & Vyacheslav Morozov & Brad G Hoffman & Cheryl D Helgason & Marco A Marra, 2008. "A Seriation Approach for Visualization-Driven Discovery of Co-Expression Patterns in Serial Analysis of Gene Expression (SAGE) Data," PLOS ONE, Public Library of Science, vol. 3(9), pages 1-11, September.
    2. Javier Alcaraz & Eva M. García-Nové & Mercedes Landete & Juan F. Monge, 2020. "The linear ordering problem with clusters: a new partial ranking," TOP: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 28(3), pages 646-671, October.
    3. Arcagni, Alberto & Avellone, Alessandro & Fattore, Marco, 2022. "Complexity reduction and approximation of multidomain systems of partially ordered data," Computational Statistics & Data Analysis, Elsevier, vol. 173(C).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Javier Alcaraz & Eva M. García-Nové & Mercedes Landete & Juan F. Monge, 2020. "The linear ordering problem with clusters: a new partial ranking," TOP: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 28(3), pages 646-671, October.
