Variation graph applications

dc.contributor.advisor Weigel, Detlef (Prof. Dr.)
dc.contributor.author Vorbrugg, Sebastian
dc.date.accessioned 2026-03-23T10:57:01Z
dc.date.available 2026-03-23T10:57:01Z
dc.date.issued 2026-03-23
dc.identifier.uri http://hdl.handle.net/10900/177445
dc.identifier.uri http://nbn-resolving.org/urn:nbn:de:bsz:21-dspace-1774456 de_DE
dc.identifier.uri http://dx.doi.org/10.15496/publikation-118769
dc.description.abstract Variation graphs provide a powerful solution to overcome the limitations of linear reference genomes, especially in representing the diversity and structural complexity within species. As genome sequencing becomes more accessible and datasets grow in both quality and scope, it is increasingly clear that traditional reference-based analyses fall short in capturing large-scale variation, population structure, and genomic complexity. However, the practical interpretation and use of genome graphs remain an open challenge. Both graph construction and downstream analysis require new tools that can operate at scale, preserve biological interpretability, and offer meaningful metrics to describe the underlying structure. In this thesis, I present a set of tools developed to address key challenges in variation graph analysis. The core contribution is gretl, a fast and flexible framework for computing graph- and path-based statistics. It enables systematic comparisons across parameter settings and graph construction methods, and has been used to analyze graphs built from multiple species, including a yeast dataset and the 1001 Genomes Arabidopsis pangenome. The framework reveals how parameters such as segment length and alignment thresholds strongly affect graph structure and interpretability. I also introduce gfa2bin, a graph-to-GWAS bridge that supports association testing directly from graph node coverage. This method demonstrates the potential of graph-based GWAS to detect both known and novel signals of trait associations. In addition, I develop a novel variation detection approach based on bifurcation events between paths, offering a complementary alternative to standard bubble detection algorithms. Together, these tools enable direct statistical exploration and biological analysis of genome graphs at both global and sample-specific levels.
Applied to the Arabidopsis dataset, they reveal population structure, patterns of pangenome expansion, and the role of private and structural variation across diverse accessions. While challenges remain in variant extraction, graph augmentation, and performance scaling, this work demonstrates that genome graphs can be used not only to store variation, but also to interpret and analyze it in meaningful ways. The tools and methods presented here are a step toward more flexible, interpretable, and biologically aware graph-based genomics. en
dc.language.iso en de_DE
dc.publisher Universität Tübingen de_DE
dc.rights ubt-podno de_DE
dc.rights.uri http://tobias-lib.uni-tuebingen.de/doku/lic_ohne_pod.php?la=de de_DE
dc.rights.uri http://tobias-lib.uni-tuebingen.de/doku/lic_ohne_pod.php?la=en de_DE
dc.subject.ddc 004 de_DE
dc.title Variation graph applications en
dc.type PhDThesis de_DE
dcterms.dateAccepted 2025-10-28
utue.publikation.fachbereich Informatik de_DE
utue.publikation.fakultaet 7 Mathematisch-Naturwissenschaftliche Fakultät de_DE
utue.publikation.noppn yes de_DE