ARPIP: Ancestral sequence Reconstruction with insertions and deletions under the Poisson Indel Process.

Jowkar, Gholamhossein; Pečerska, Jūlija; Maiolo, Massimo; Gil, Manuel; Anisimova, Maria (2022). ARPIP: Ancestral sequence Reconstruction with insertions and deletions under the Poisson Indel Process. (In Press). Systematic Biology. Oxford University Press. doi:10.1093/sysbio/syac050

syac050.pdf - Published Version
Available under License Creative Commons: Attribution-Noncommercial (CC-BY-NC).

Modern phylogenetic methods allow inference of ancestral molecular sequences given an alignment and a phylogeny relating present-day sequences. This provides insight into the evolutionary history of molecules, helping to understand gene function and to study biological processes such as adaptation and convergent evolution across a variety of applications. Here we propose a dynamic programming algorithm for fast joint likelihood-based reconstruction of ancestral sequences under the Poisson Indel Process (PIP). Unlike previous approaches, our method, named ARPIP, enables reconstruction with insertions and deletions based on an explicit indel model. Consequently, inferred indel events have an explicit biological interpretation. Likelihood computation is achieved in linear time with respect to the number of sequences. Our method consists of two steps. First, we find the most likely indel points and prune the phylogeny to reflect the insertion and deletion events per site. Second, we infer the ancestral states on the pruned subtree in a manner similar to FastML. We applied ARPIP to simulated datasets and to real data from the Betacoronavirus genus. ARPIP reconstructs both the indel events and substitutions with a high degree of accuracy. Our method fares well when compared to established state-of-the-art methods such as FastML and PAML. Moreover, the method can be extended to explore both optimal and suboptimal reconstructions and to include rate heterogeneity through time, among other extensions. We believe it will expand the range of novel applications of ancestral sequence reconstruction.
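
The second step described in the abstract, joint reconstruction of ancestral states on a tree, can be illustrated with a small dynamic programming sketch. This is not the ARPIP implementation: it assumes the tree has already been pruned for one site, it uses the JC69 substitution model as a stand-in for the PIP model, and the tuple-based tree layout and the names `jc69` and `joint_reconstruct` are hypothetical choices made for this example only. The indel-point search of the first step is not shown.

```python
import math

STATES = "ACGT"


def jc69(i, j, t):
    """JC69 transition probability from state i to state j over branch length t.
    Assumption: JC69 replaces the PIP-based model used in the paper."""
    e = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * e if i == j else 0.25 - 0.25 * e


def joint_reconstruct(tree, leaf_states):
    """Joint maximum-likelihood states for one alignment column.

    tree: (name, branch_length, children); leaves have an empty child list.
    leaf_states: {leaf_name: observed character}.
    Returns {node_name: state} for every node of the tree.
    """
    # best[name][parent_state] = (best subtree score, state chosen for this node)
    best = {}

    def up(node):
        # Bottom-up pass: for each possible parent state, store the best
        # score of the subtree below and the state that achieves it.
        name, t, children = node
        for child in children:
            up(child)
        candidates = [leaf_states[name]] if not children else STATES
        table = {}
        for i in STATES:
            scored = []
            for j in candidates:
                score = jc69(i, j, t)
                for child in children:
                    score *= best[child[0]][j][0]
                scored.append((score, j))
            table[i] = max(scored)
        best[name] = table

    root_name, _, root_children = tree
    for child in root_children:
        up(child)

    # The root has no parent, so weight each state by its stationary
    # frequency (0.25 under JC69).
    root_scores = []
    for j in STATES:
        score = 0.25
        for child in root_children:
            score *= best[child[0]][j][0]
        root_scores.append((score, j))
    root_state = max(root_scores)[1]

    # Top-down pass: read off the stored choices given each parent's state.
    result = {root_name: root_state}

    def down(node, parent_state):
        name, _, children = node
        state = best[name][parent_state][1]
        result[name] = state
        for child in children:
            down(child, state)

    for child in root_children:
        down(child, root_state)
    return result


# Hypothetical three-leaf example: ((a:0.1, b:0.1)n1:0.05, c:0.5)root
example_tree = ("root", 0.0, [
    ("n1", 0.05, [("a", 0.1, []), ("b", 0.1, [])]),
    ("c", 0.5, []),
])
print(joint_reconstruct(example_tree, {"a": "A", "b": "A", "c": "G"}))
```

With these branch lengths, both `root` and `n1` are reconstructed as `A`, since the two closely related leaves outweigh the distant leaf `c`. The bottom-up pass visits every node once and does a constant amount of work per node for a fixed alphabet, so the cost of this sketch grows linearly with the number of sequences.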

Item Type:

Journal Article (Original Article)

Division/Institute:

04 Faculty of Medicine > Service Sector > Institute of Pathology

UniBE Contributor:

Maiolo, Massimo Vincenzo

Subjects:

500 Science > 570 Life sciences; biology
600 Technology > 610 Medicine & health

ISSN:

1063-5157

Publisher:

Oxford University Press

Language:

English

Submitter:

Pubmed Import

Date Deposited:

25 Jul 2022 11:45

Last Modified:

27 Jul 2022 04:45

Publisher DOI:

10.1093/sysbio/syac050

PubMed ID:

35866991

Uncontrolled Keywords:

Poisson indel process; SARS-CoV; ancestral sequences; dynamic programming; evolutionary stochastic process; indel; joint ancestral sequence reconstruction; maximum likelihood; phylogeny

BORIS DOI:

10.48350/171496

URI:

https://boris.unibe.ch/id/eprint/171496
