Natural language processing (NLP) aims to enable computers to use human languages so that people can, for example, interact with computers naturally, communicate with people with whom they share no common language, or access speech or text data at scales not otherwise possible. The NLP group at Notre Dame is interested in all aspects of NLP, with a focus on machine translation and connections with formal language theory.

The NLP group co-sponsors NL+, the Natural Language Processing Lunch Seminar.

Current Members

Former Members

Projects

Neural networks for machine translation. Models and algorithms for translation and language modeling using neural networks.
Expressivity of neural sequence models. Relating neural sequence models to automata, grammars, circuits, and logics.
Natural language (variety) processing. Collaboration with Antonis Anastasopoulos (GMU) and Yulia Tsvetkov (UW). Sponsored by NSF.
Language documentation with an AI helper. Collaboration with Antonis Anastasopoulos and Geraldine Walther (GMU). Sponsored by NSF.
Differentiable, probabilistic programming with recursive structured models. Collaboration with Chung-chieh Shan (IU). Sponsored by NSF.
NLP on medieval texts. Analysis of Latin texts and language modeling for OCR of Latin manuscripts. Collaborations with Walter Scheirer and Hildegund Müller. Sponsored by Notre Dame FRSP.

Recent Publications

Chihiro Taguchi, Yusuke Sakai, Parisa Haghani, and David Chiang. Universal automatic phonetic transcription into the International Phonetic Alphabet. In Proc. INTERSPEECH. 2023. To appear.
Alexandra Butoi, Ryan Cotterell, and David Chiang. Convergence and diversity in the control hierarchy. In Proc. ACL. 2023. To appear.
David Chiang, Peter Cholak, and Anand Pillay. Tighter bounds on the expressivity of transformer encoders. In Proc. ICML. 2023. To appear.
Aarohi Srivastava and David Chiang. Fine-tuning BERT with character-level noise for zero-shot transfer to dialects and closely-related languages. In Proc. Workshop on NLP for Similar Languages, Varieties and Dialects. 2023.
Patrick Soga and David Chiang. Bridging graph position encodings for transformers with weighted graph-walking automata. Transactions on Machine Learning Research, 2023.
Brian DuSell and David Chiang. The surprising computational power of nondeterministic stack RNNs. In Proc. ICLR. 2023.
David Chiang, Colin McDonald, and Chung-chieh Shan. Exact recursive probabilistic programming. PACMPL, 2023. doi:10.1145/3586050.
Chihiro Taguchi and David Chiang. Introducing morphology in Universal Dependencies Japanese. In Proc. Workshop on Universal Dependencies, 65–72. 2023.
David Chiang, Alexander M. Rush, and Boaz Barak. Named tensor notation. Transactions on Machine Learning Research, 2023.
Chihiro Taguchi. Mermaid constructions in Lexical Functional Grammar. In Proc. LFG, 365–384. 2022.
Darcey Riley and David Chiang. A continuum of generation tasks for investigating length bias and degenerate repetition. In Proc. BlackboxNLP. 2022.
Alexandra Butoi, Brian DuSell, Tim Vieira, Ryan Cotterell, and David Chiang. Algorithms for weighted pushdown automata. In Proc. EMNLP. 2022.
Aarohi Srivastava, Abhinav Rastogi, Abhishek Rao, Abu Awal Md Shoeb, Abubakar Abid, Adam Fisch, Adam R. Brown, Adam Santoro, Aditya Gupta, Adrià Garriga-Alonso, and others. Beyond the Imitation Game: quantifying and extrapolating the capabilities of language models. 2022. arXiv:2206.04615.
David Chiang and Peter Cholak. Overcoming a theoretical limitation of self-attention. In Proc. ACL. 2022.
Brian DuSell and David Chiang. Learning hierarchical structures with differentiable nondeterministic stacks. In Proc. ICLR. 2022.
David Chiang and Darcey Riley. Factor graph grammars. In Proc. NeurIPS, 6648–6658. 2020.
Justin DeBenedetto and David Chiang. Representing unordered data using complex-weighted multiset automata. In Hal Daumé III and Aarti Singh, editors, Proc. ICML, volume 119 of Proceedings of Machine Learning Research, 2412–2420. 2020.
Kenton Murray and David Chiang. Correcting length bias in neural machine translation. In Proc. WMT, 212–223. 2018.

All papers
