
Reflections on kernelizing and computing unrooted agreement forests

Author

Listed:
  • Rim Wersch

    (Maastricht University)

  • Steven Kelk

    (Maastricht University)

  • Simone Linz

    (University of Auckland)

  • Georgios Stamoulis

    (Maastricht University)

Abstract

Phylogenetic trees are leaf-labelled trees used to model the evolution of species. Here we explore the practical impact of kernelization (i.e. data reduction) on the NP-hard problem of computing the TBR distance between two unrooted binary phylogenetic trees. This problem is better-known in the literature as the maximum agreement forest problem, where the goal is to partition the two trees into a minimum number of common, non-overlapping subtrees. We have implemented two well-known reduction rules, the subtree and chain reduction, and five more recent, theoretically stronger reduction rules, and compare the reduction achieved with and without the stronger rules. We find that the new rules yield smaller reduced instances and thus have clear practical added value. In many cases they also cause the TBR distance to decrease in a controlled fashion, which can further facilitate solving the problem in practice. Next, we compare the achieved reduction to the known worst-case theoretical bounds of $15k-9$ and $11k-9$ respectively, on the number of leaves of the two reduced trees, where $k$ is the TBR distance, observing in both cases a far larger reduction in practice. As a by-product of our experimental framework we obtain a number of new insights into the actual computation of TBR distance. We find, for example, that very strong lower bounds on TBR distance can be obtained efficiently by randomly sampling certain carefully constructed partitions of the leaf labels, and identify instances which seem particularly challenging to solve exactly. The reduction rules have been implemented within our new solver Tubro which combines kernelization with an Integer Linear Programming (ILP) approach. Tubro also incorporates a number of additional features, such as a cluster reduction and a practical upper-bounding heuristic, and it can leverage combinatorial insights emerging from the proofs of correctness of the reduction rules to simplify the ILP.
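
To give a feel for the simplest of the rules named in the abstract, the subtree reduction can be sketched as repeatedly collapsing a cherry (two leaves with a common neighbour) that occurs in both trees into a single new leaf. The following is a minimal illustration and not Tubro's implementation. The adjacency-dictionary representation, the convention that leaves are strings and internal nodes are integers, and all function names are assumptions made for this example.

```python
def build_tree(edges):
    """Build an adjacency dict from an edge list (hypothetical helper).
    Convention assumed here: leaves are str labels, internal nodes are ints."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def leaves(adj):
    """Return the set of leaf labels of a tree."""
    return {n for n in adj if isinstance(n, str)}


def cherries(adj):
    """Return all cherries as frozensets {a, b}: leaf pairs whose common
    neighbour is an internal node with exactly two leaf neighbours."""
    by_parent = {}
    for node, nbrs in adj.items():
        if isinstance(node, str):
            (parent,) = nbrs  # a leaf has exactly one neighbour
            by_parent.setdefault(parent, []).append(node)
    return {frozenset(ls) for p, ls in by_parent.items()
            if isinstance(p, int) and len(ls) == 2}


def collapse(adj, pair, new_label):
    """Replace the cherry `pair` and its internal node by one new leaf."""
    a, b = pair
    (p,) = adj[a]               # internal node shared by a and b
    (q,) = adj.pop(p) - {a, b}  # the third neighbour of p
    del adj[a], adj[b]
    adj[q].discard(p)
    adj[q].add(new_label)
    adj[new_label] = {q}


def subtree_reduce(tree1, tree2):
    """Collapse common cherries until none remain; inputs are not modified."""
    t1 = {u: set(vs) for u, vs in tree1.items()}
    t2 = {u: set(vs) for u, vs in tree2.items()}
    while True:
        common = cherries(t1) & cherries(t2)
        if not common:
            return t1, t2
        pair = min(common, key=sorted)  # deterministic choice
        new_label = "(" + ",".join(sorted(pair)) + ")"
        collapse(t1, pair, new_label)
        collapse(t2, pair, new_label)


# Two unrooted binary trees on leaves a..e that share only the cherry {a, b}.
T1 = build_tree([("a", 1), ("b", 1), (1, 2), ("c", 2), (2, 3), ("d", 3), ("e", 3)])
T2 = build_tree([("a", 1), ("b", 1), (1, 2), ("d", 2), (2, 3), ("c", 3), ("e", 3)])
R1, R2 = subtree_reduce(T1, T2)
print(sorted(leaves(R1)))  # ['(a,b)', 'c', 'd', 'e']
```

In this toy instance the common cherry {a, b} is merged into one leaf, so both trees shrink from five leaves to four. Collapsing one common cherry at a time has the same end result as replacing each maximal common pendant subtree by a single leaf, because every such subtree with at least two leaves contains a cherry present in both trees.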

Suggested Citation

  • Rim Wersch & Steven Kelk & Simone Linz & Georgios Stamoulis, 2022. "Reflections on kernelizing and computing unrooted agreement forests," Annals of Operations Research, Springer, vol. 309(1), pages 425-451, February.
  • Handle: RePEc:spr:annopr:v:309:y:2022:i:1:d:10.1007_s10479-021-04352-1
    DOI: 10.1007/s10479-021-04352-1

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s10479-021-04352-1
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Marjan Marzban & Qian-Ping Gu & Xiaohua Jia, 2016. "New analysis and computational study for the planar connected dominating set problem," Journal of Combinatorial Optimization, Springer, vol. 32(1), pages 198-225, July.
    2. Davidov, Sreten & Pantoš, Miloš, 2017. "Planning of electric vehicle infrastructure based on charging reliability and quality of service," Energy, Elsevier, vol. 118(C), pages 1156-1167.
    3. Filipe Rodrigues & Agostinho Agra & Lars Magnus Hvattum & Cristina Requejo, 2021. "Weighted proximity search," Journal of Heuristics, Springer, vol. 27(3), pages 459-496, June.
    4. Lan, Guanghui & DePuy, Gail W. & Whitehouse, Gary E., 2007. "An effective and simple heuristic for the set covering problem," European Journal of Operational Research, Elsevier, vol. 176(3), pages 1387-1403, February.
    5. Song, Zhe & Kusiak, Andrew, 2010. "Mining Pareto-optimal modules for delayed product differentiation," European Journal of Operational Research, Elsevier, vol. 201(1), pages 123-128, February.
    6. Seona Lee & Sang-Ho Lee & HyungJune Lee, 2020. "Timely directional data delivery to multiple destinations through relay population control in vehicular ad hoc network," International Journal of Distributed Sensor Networks, , vol. 16(5), pages 15501477209, May.
    7. Zhuang, Yanling & Zhou, Yun & Yuan, Yufei & Hu, Xiangpei & Hassini, Elkafi, 2022. "Order picking optimization with rack-moving mobile robots and multiple workstations," European Journal of Operational Research, Elsevier, vol. 300(2), pages 527-544.
    8. Menghong Li & Yingli Ran & Zhao Zhang, 2022. "A primal-dual algorithm for the minimum power partial cover problem," Journal of Combinatorial Optimization, Springer, vol. 44(3), pages 1913-1923, October.
    9. Wang, Yiyuan & Pan, Shiwei & Al-Shihabi, Sameh & Zhou, Junping & Yang, Nan & Yin, Minghao, 2021. "An improved configuration checking-based algorithm for the unicost set covering problem," European Journal of Operational Research, Elsevier, vol. 294(2), pages 476-491.
    10. C Guéret & N Jussien & O Lhomme & C Pavageau & C Prins, 2003. "Loading aircraft for military operations," Journal of the Operational Research Society, Palgrave Macmillan;The OR Society, vol. 54(5), pages 458-465, May.
    11. Keisuke Murakami, 2018. "Iterative Column Generation Algorithm for Generalized Multi-Vehicle Covering Tour Problem," Asia-Pacific Journal of Operational Research (APJOR), World Scientific Publishing Co. Pte. Ltd., vol. 35(04), pages 1-22, August.
    12. R. L. Francis & T. J. Lowe & Arie Tamir, 2000. "Aggregation Error Bounds for a Class of Location Models," Operations Research, INFORMS, vol. 48(2), pages 294-307, April.
    13. Dongyue Liang & Zhao Zhang & Xianliang Liu & Wei Wang & Yaolin Jiang, 2016. "Approximation algorithms for minimum weight partial connected set cover problem," Journal of Combinatorial Optimization, Springer, vol. 31(2), pages 696-712, February.
    14. Abdullah Alshehri & Mahmoud Owais & Jayadev Gyani & Mishal H. Aljarbou & Saleh Alsulamy, 2023. "Residual Neural Networks for Origin–Destination Trip Matrix Estimation from Traffic Sensor Information," Sustainability, MDPI, vol. 15(13), pages 1-21, June.
    15. June Sung Park & Jinyoung Jang & Eunjung Lee, 0. "Theoretical and empirical studies on essence-based adaptive software engineering," Information Technology and Management, Springer, vol. 0, pages 1-13.
    16. Wedelin, Dag, 1995. "The design of a 0-1 integer optimizer and its application in the Carmen system," European Journal of Operational Research, Elsevier, vol. 87(3), pages 722-730, December.
    17. Sun, Yi-Fan & Sun, Zheng-Yang, 2019. "Target observation of complex networks," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 517(C), pages 233-245.
    18. Victor Reyes & Ignacio Araya, 2021. "A GRASP-based scheme for the set covering problem," Operational Research, Springer, vol. 21(4), pages 2391-2408, December.
    19. Taoqing Zhou & Zhipeng Lü & Yang Wang & Junwen Ding & Bo Peng, 2016. "Multi-start iterated tabu search for the minimum weight vertex cover problem," Journal of Combinatorial Optimization, Springer, vol. 32(2), pages 368-384, August.
    20. Giovanni Felici & Sokol Ndreca & Aldo Procacci & Benedetto Scoppola, 2016. "A-priori upper bounds for the set covering problem," Annals of Operations Research, Springer, vol. 238(1), pages 229-241, March.
