Resources

Below you’ll find a curated collection of tools, datasets, tutorials, and reference materials related to linguistics, document preparation, data analysis, and academic life more broadly.


“To think is to forget differences, to generalize, to abstract” (Borges)

Main tools

These are the main tools I’ve created:

  • phonokit (Typst package for phonology). Easily create a wide range of phonological representations: SPE, OT, MaxEnt, feature geometry, prosody, vowel trapezoids, vowel dispersion, consonant tables, and more.
  • synkit (Typst package for syntax/semantics). Syntax trees with native support for semantic annotation, glosses, and numbered examples. Intuitive functions and smart labels.
  • Fonology (R package). Extract phonological variables from written data. Grapheme-to-phoneme conversion for English, French, Italian, Portuguese, and Spanish.

General

Raw text in Talian

Below you’ll find different scripts I’ve developed using R, as well as some information on the tools I normally use (My workflow). These scripts are here to complement the R tutorials I have designed over the years.

Data

The Portuguese Stress Lexicon is a project I developed during my PhD. It’s a comprehensive lexicon of Portuguese non-verbs coded for a number of phonological variables. Because it follows a tidy data format, the lexicon can be easily analyzed.

The Talian corpus is an ongoing project with Natália B. Guzzo. Talian is an understudied language spoken in southern Brazil. Our goal is to make coded data accessible so that other researchers can use the corpus in their own projects.

Research methods in linguistics

Below you will find How to plot vowels in R, a brief tutorial on plotting vowels with ggplot2, which I update from time to time. You will also find tutorials on web scraping and on syllabification with regular expressions, as well as Automating Praat experiments, a Praat script I developed to combine questionnaire responses, data files, and Praat experiment files into CSV files that are ready for analysis. I have used this script several times, and it has saved me hours of work.

If you’re looking for a place to learn R more generally, my top recommendation is Wickham’s R for Data Science. For statistics in general, it is McElreath’s Statistical Rethinking (see below, under Useful YouTube channels).

Quarto dashboards for teaching

The goal of these apps is to make abstract concepts more user-friendly and intuitive. I normally use them in research methods courses.

Document preparation

When it comes to document preparation, my main tool was \(\LaTeX\) for about 15 years. Then I switched to Typst. As a result, the two entries below don’t really reflect what I currently use. I still think learning \(\LaTeX\) is important, since many journals don’t support Typst (although many don’t support tex files either…). But it’s difficult to go back to \(\LaTeX\) after using Typst. You will find some blog posts about this topic here.

  • LaTeX and phonology. There aren’t many packages for phonologists in \(\LaTeX\) beyond tipa and a couple of others. This tutorial mostly shows how to do phonology with TikZ. As you will see, this is not ideal, which is why I created phonokit.
  • Formatting your documents

Portuguese language

My native language is (Brazilian) Portuguese. These are the two oldest grammars of Portuguese.

Grad school

There’s a lot to know about graduate school and the academic job market before you decide to begin your journey. There are many useful articles online, and I would strongly recommend the book below.


Useful YouTube channels

I subscribe to too many YouTube channels, so I always have several recommendations (perhaps too many?). The list below is organized by topics that interest me.

Statistics

  • Statistical Rethinking (playlist with Richard McElreath’s lectures). This is, in my opinion, the best stats course you can take if you already have some basics. Likewise, his book (same name) is my top recommendation when it comes to statistics and how to really think about the topic.

Language & Linguistics

General


Copyright © Guilherme Duarte Garcia