A feature-based neural model of sound change informed by global lexicostatistical data

dc.contributor.author Rubehn, Arne
dc.date.accessioned 2024-04-09T07:19:14Z
dc.date.available 2024-04-09T07:19:14Z
dc.date.issued 2022-12-28
dc.identifier.uri http://hdl.handle.net/10900/152716
dc.identifier.uri http://nbn-resolving.de/urn:nbn:de:bsz:21-dspace-1527169 de_DE
dc.identifier.uri http://dx.doi.org/10.15496/publikation-94055
dc.description.abstract Historical linguists have successfully reconstructed numerous unattested ancestral languages for over a century, mainly by applying the comparative method, a powerful procedure for recovering extinct languages and understanding how they developed into their modern daughter languages. With the exponential rise of computational power, scholars have been trying to develop computational solutions for tasks in historical linguistics for roughly two decades. The success of these methods, however, is limited: some individual tasks are solved satisfactorily, while others still lack good solutions. Part of the reason is that historical linguists often rely on their intuition and general linguistic knowledge when reconstructing ancestral languages, a component that computational models naturally lack. This thesis presents a neural model that aims to bridge that gap by providing typological information about the likelihood of sound changes. The model was trained on sound changes observed in Maximum Parsimony reconstructions on a large-scale global lexical dataset and is therefore able to assess whether a queried sound change is common or uncommon on a global scale. Since it operates on phonological features, it is able to process any given sound change between two arbitrary IPA symbols. It was trained as a binary classifier in a noise-contrastive estimation setting, where the observed sound changes contributed positive training data, which was weighed against randomly generated negative training data. Applying a weighted version of Maximum Parsimony, in which the weights were derived from the model, produced better reconstructions for Proto-Austronesian and Proto-Oceanic than unweighted Maximum Parsimony. This shows that the model was able to learn common sound changes, including the direction in which they tend to happen. While it requires further systematic testing, the model shows the potential to enhance tasks in computational historical linguistics by simulating implicit linguistic knowledge as a component of the comparative method. en
dc.language.iso en de_DE
dc.publisher Universität Tübingen de_DE
dc.rights ubt-podok de_DE
dc.rights.uri http://tobias-lib.uni-tuebingen.de/doku/lic_mit_pod.php?la=de de_DE
dc.rights.uri http://tobias-lib.uni-tuebingen.de/doku/lic_mit_pod.php?la=en en
dc.subject.classification Maschinelles Lernen (machine learning), Historische Sprachwissenschaft (historical linguistics), Computerlinguistik (computational linguistics), Lautwandel (sound change) de_DE
dc.subject.ddc 400 de_DE
dc.subject.other computational linguistics en
dc.subject.other historical linguistics en
dc.subject.other sound change en
dc.subject.other machine learning en
dc.title A feature-based neural model of sound change informed by global lexicostatistical data en
dc.type MasterThesis de_DE
utue.publikation.fachbereich Allgemeine u. vergleichende Sprachwissenschaft de_DE
utue.publikation.fakultaet 5 Philosophische Fakultät de_DE
utue.publikation.noppn yes de_DE
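
The abstract's contrast between weighted and unweighted Maximum Parsimony can be illustrated with a minimal sketch of the Sankoff dynamic program for a single sound correspondence. This is not the thesis's implementation: the sound inventory, the tree, and every cost value below are invented for illustration, and `change_cost` merely stands in for weights that the thesis derives from its trained model.

```python
from math import inf

# Hypothetical inventory and asymmetric costs (all values invented).
# COST[(parent, child)] is the cost of the change parent > child.
SOUNDS = ["p", "f", "h"]
COST = {
    ("p", "f"): 0.5, ("f", "p"): 2.0,
    ("f", "h"): 0.5, ("h", "f"): 2.0,
    ("p", "h"): 1.5, ("h", "p"): 3.0,
}

def change_cost(parent, child):
    """Weighted, direction-sensitive cost (stand-in for model-derived weights)."""
    return 0.0 if parent == child else COST[(parent, child)]

def unit_cost(parent, child):
    """Unweighted parsimony: every change costs 1, regardless of direction."""
    return 0.0 if parent == child else 1.0

def sankoff(tree, cost):
    """Map each sound to the minimal subtree cost if the subtree root has that sound.

    A tree is either a string (an attested sound at a leaf) or a tuple of subtrees.
    """
    if isinstance(tree, str):
        return {s: (0.0 if s == tree else inf) for s in SOUNDS}
    totals = {s: 0.0 for s in SOUNDS}
    for child in tree:
        child_table = sankoff(child, cost)
        for s in SOUNDS:
            totals[s] += min(cost(s, t) + child_table[t] for t in SOUNDS)
    return totals

def best_root(tree, cost):
    """Return the cheapest root sound together with the full cost table."""
    table = sankoff(tree, cost)
    return min(table, key=table.get), table

# Four hypothetical daughter languages showing p, f, f and h.
tree = (("p", "f"), ("f", "h"))
weighted_root, weighted_table = best_root(tree, change_cost)
unweighted_root, unweighted_table = best_root(tree, unit_cost)
```

With these invented costs, the unweighted run prefers the majority sound `f` at the root, while the weighted run prefers `p`, because `p > f` and `f > h` are cheap and their reverses are expensive. That is the kind of directional information the abstract says the model contributes.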
