Keywords and Co-Occurrence Patterns in the Voynich Manuscript: An Information-Theoretic Analysis

Author

  • Marcelo A Montemurro
  • Damián H Zanette

Abstract

The Voynich manuscript has so far remained a mystery for linguists and cryptologists. While the text, written on medieval parchment in an unknown script system, shows basic statistical patterns that resemble those of real languages, some of its features have suggested to some researchers that the manuscript was a forgery intended as a hoax. Here we analyse the long-range structure of the manuscript using methods from information theory. We show that the Voynich manuscript presents a complex organization in the distribution of words that is compatible with that found in real language sequences. We are also able to extract some of the most significant semantic word-networks in the text. These results, together with some previously known statistical features of the Voynich manuscript, support the presence of a genuine message inside the book.

Suggested Citation

  • Marcelo A Montemurro & Damián H Zanette, 2013. "Keywords and Co-Occurrence Patterns in the Voynich Manuscript: An Information-Theoretic Analysis," PLOS ONE, Public Library of Science, vol. 8(6), pages 1-9, June.
  • Handle: RePEc:plo:pone00:0066344
    DOI: 10.1371/journal.pone.0066344

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0066344
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0066344&type=printable
    Download Restriction: no

    File URL: https://libkey.io/10.1371/journal.pone.0066344?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    References listed on IDEAS

    1. Marcelo A. Montemurro & Damián H. Zanette, 2010. "Towards The Quantification Of The Semantic Information Encoded In Written Language," Advances in Complex Systems (ACS), World Scientific Publishing Co. Pte. Ltd., vol. 13(02), pages 135-153.
    2. J. P. Herrera & P. A. Pury, 2008. "Statistical keyword detection in literary corpora," The European Physical Journal B: Condensed Matter and Complex Systems, Springer;EDP Sciences, vol. 63(1), pages 135-146, May.
    3. Stephen P. Harter, 1975. "A probabilistic approach to automatic keyword indexing. Part I. On the Distribution of Specialty Words in a Technical Literature," Journal of the American Society for Information Science, Association for Information Science & Technology, vol. 26(4), pages 197-206, July.
    4. Ramon Ferrer-i-Cancho & Brita Elvevåg, 2010. "Random Texts Do Not Exhibit the Real Zipf's Law-Like Rank Distribution," PLOS ONE, Public Library of Science, vol. 5(3), pages 1-10, March.

    Citations

    Cited by:

    1. Vladimír Matlach & Barbora Anna Janečková & Daniel Dostál, 2022. "The Voynich manuscript: Symbol roles revisited," PLOS ONE, Public Library of Science, vol. 17(1), pages 1-26, January.
    2. Corrêa, Edilson A. & Marinho, Vanessa Q. & Amancio, Diego R., 2020. "Semantic flow in language networks discriminates texts by genre and publication date," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 557(C).
    3. Nazim Choudhury & Shahadat Uddin, 2016. "Time-aware link prediction to explore network effects on temporal knowledge evolution," Scientometrics, Springer;Akadémiai Kiadó, vol. 108(2), pages 745-776, August.
    4. Choudhury, Nazim & Faisal, Fahim & Khushi, Matloob, 2020. "Mining Temporal Evolution of Knowledge Graphs and Genealogical Features for Literature-based Discovery Prediction," Journal of Informetrics, Elsevier, vol. 14(3).
    5. Espitia, Diego & Larralde, Hernán, 2020. "Universal and non-universal text statistics: Clustering coefficient for language identification," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 553(C).
    6. Quispe, Laura V.C. & Tohalino, Jorge A.V. & Amancio, Diego R., 2021. "Using virtual edges to improve the discriminability of co-occurrence text networks," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 562(C).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Carretero-Campos, C. & Bernaola-Galván, P. & Coronado, A.V. & Carpena, P., 2013. "Improving statistical keyword detection in short texts: Entropic and clustering approaches," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 392(6), pages 1481-1492.
    2. Jamaati, Maryam & Mehri, Ali, 2018. "Text mining by Tsallis entropy," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 490(C), pages 1368-1376.
    3. Mehri, Ali & Agahi, Hamzeh & Mehri-Dehnavi, Hossein, 2019. "A novel word ranking method based on distorted entropy," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 521(C), pages 484-492.
    4. Diego R Amancio, 2015. "Probing the Topological Properties of Complex Networks Modeling Short Written Texts," PLOS ONE, Public Library of Science, vol. 10(2), pages 1-17, February.
    5. Usó-Doménech, J.L. & Nescolarde-Selva, J.A. & Lloret-Climent, M. & Gash, H., 2016. "Semantics of language for ecosystems modelling: A model case," Ecological Modelling, Elsevier, vol. 328(C), pages 85-94.
    6. Jorge A. V. Tohalino & Thiago C. Silva & Diego R. Amancio, 2024. "Using word embedding to detect keywords in texts modeled as complex networks," Scientometrics, Springer;Akadémiai Kiadó, vol. 129(7), pages 3599-3623, July.
    7. Chen, Yanguang, 2012. "Zipf’s law, 1/f noise, and fractal hierarchy," Chaos, Solitons & Fractals, Elsevier, vol. 45(1), pages 63-73.
    8. Cárdenas, Juan Pablo & González, Iván & Vidal, Gerardo & Fuentes, Miguel Angel, 2016. "Does network complexity help organize Babel’s library?," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 447(C), pages 188-198.
    9. Espitia, Diego & Larralde, Hernán, 2020. "Universal and non-universal text statistics: Clustering coefficient for language identification," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 553(C).
    10. Shuo Xu & Ling Li & Xin An & Liyuan Hao & Guancan Yang, 2021. "An approach for detecting the commonality and specialty between scientific publications and patents," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(9), pages 7445-7475, September.
    11. Bian, Tian & Hu, Jiantao & Deng, Yong, 2017. "Identifying influential nodes in complex networks based on AHP," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 479(C), pages 422-436.
