Lunch at 12:30pm, talk at 1pm, in 148 Fitzpatrick
Title: De Renovato Latino: A Survey of NLP Tools for Latin
Abstract: As we continue to hone our NLP approaches, heuristics, and models in the modern world, modern languages are not the only ones that benefit. Rather, within the field of digital humanities, ancient languages also reap the spoils. This is particularly the case with Latin, a language preserving a variety of cultures and linguistic developments for roughly two millennia. In this talk, I will describe a selection of significant Latin NLP resources, including the following: a set of five Latin dependency treebanks designed under the Universal Dependencies (UD) framework; a large language model for Latin, Latin BERT, trained upon the sum of ancient and modern Latin corpora; and the LiLa, or “Linking Latin,” project, which aims to conjoin a plethora of disparate Latin resources under one framework in order to allow for deeper analyses of the language. Both theoretical intuitions and practical applications will be discussed. I will conclude by mentioning other available computational resources for the study of Latin in the hope that such knowledge may inspire future interdisciplinary research.
Stephen Bothwell is a second-year PhD student in the CSE department. His research is on automatic analysis of the writings of St. Augustine.