Resources
Below you will find links to the Portuguese Stress Lexicon, as well as scripts and tutorials on a range of topics involving data analysis and phonology.
“Pensar es olvidar diferencias, es generalizar, abstraer” [“To think is to forget differences, to generalize, to abstract”] (Borges)
General
Below you will find different scripts I’ve developed using R, as well as some information on the tools I normally use (My workflow). These scripts complement the R tutorials I have designed over the years.
Data
The Portuguese Stress Lexicon is a project I developed during my PhD. It’s a comprehensive lexicon of Portuguese non-verbs coded for a number of phonological variables. Due to its tidy data format, the lexicon can be easily analyzed. The Talian corpus project is an ongoing project with Natália B. Guzzo. Talian is an understudied language spoken in southern Brazil. Our goal is to make coded data accessible so that other researchers can use the corpus in their own projects.
- Portuguese Stress Lexicon
- Talian corpus project (in progress)
Research methods in linguistics
Below you will find How to plot vowels in R, a brief tutorial on plotting vowels with ggplot2. The tutorial is updated every now and then. You will also find tutorials on web scraping and syllabification with Regex, as well as Automating Praat experiments, a Praat script I have developed to combine questionnaire responses, data files, and Praat experiment files. The goal is to generate csv files that are ready for analysis. I have used this script several times, and it has saved me hours of work.
- Web scraping with R
- Syllabification with Regex
- autoPraat: Automating Praat experiments
- Data analysis using R
- Ordinal models in R (HLS 2022), with Scott Perry, University of Alberta
If you’re looking for a place to learn R more generally, my top recommendation is Wickham’s R for Data Science. My top recommendation for statistics in general is McElreath’s Statistical Rethinking (see below, under Useful YouTube channels).
The goal of these apps is to make abstract concepts more user-friendly and intuitive. I normally use them in research methods courses. Phonology-specific apps can be replaced by my R package Fonology.
Document preparation
When it comes to document preparation, my main tip for graduate students is to learn how to use LaTeX and BibTeX as soon as possible. I recognize, however, that not everyone wants to, and not everyone needs to. I’ve been using LaTeX for virtually every document I have produced since 2013, I think. That’s why I decided to add LaTeX for phonologists to my website.
Portuguese language
My native language is (Brazilian) Portuguese. These are the two oldest grammars of Portuguese.
- First grammar of Portuguese (1536)
- Fernão Doliueira; from Biblioteca Nacional de Portugal
- Second grammar of Portuguese (1540)
- João de Barros; from Biblioteca Nacional de Portugal
Grad school
There’s a lot to know about graduate school and the academic job market before you decide to begin your journey. There are many useful articles online, and I would strongly recommend the book below.
- Read this book by Jason Brennan
- Some general tips for grad students
Here’s a list of great tools, websites, books, and projects developed by great people. The topics range from general to specific, but the main theme is obviously linguistics.
- Logical fallacies and cognitive biases
- Regression Modeling for linguistic data: e-book by Sonderegger (McGill)
- Improving your statistical inferences: e-book by Daniël Lakens
- R for Data Science: comprehensive book on R by Wickham and colleagues
- Detexify: draw the LaTeX symbol you’re looking for
- The International Phonetic Alphabet
- Interactive IPA chart
- A more detailed interactive IPA chart
- Seeing speech (IPA chart)
- IPAify by K. Ryan (conversion to narrow IPA)
- Pink Trombone: speech synthesis
- Language map by number of speakers
- Language families (Europe)
- tidygutenbergr (works from Project Gutenberg)
- All Things Linguistic
- Language Log
- Omniglot: The online encyclopedia of writing systems and languages
- Native Land Digital
- World Atlas of Linguistic Structures
- Grambank
- CMU Dictionary
- Buckeye Corpus
- Atlas sonore des langues régionales de France
- Phoible
- Fonds de données linguistiques du Québec
- Friends Don’t Let Friends Make Bad Graphs
I subscribe to too many YouTube channels, so I always have several recommendations (perhaps too many?). The list below is divided by topics that interest me.
Statistics
- Statistical rethinking (playlist with Richard McElreath’s lectures). This is, in my opinion, the best stats course you can take if you already have some basics. Likewise, his book (same name) is my top recommendation when it comes to statistics (and how to really think about the topic).
Language & Linguistics
General
- McGill Office for Science and Society
- Simon Clark’s channel on doing a PhD
- VSauce (asking questions and exploring answers)
- 3Blue1Brown (if you like math and visuals)
- Physics Girl (great physics videos)
- Great art explained (amazing videos)
- Half as interesting (interesting things you normally don’t know about)
- Wendover (similar to Half as interesting, but longer/more detailed)
- ASAP Science (short videos about interesting topics)
- MKBHD (tech reviews)
- Overleaf/ShareLaTeX (if you’d like to leave Word and use LaTeX instead)
- The Nerdwriter (essays on various topics)
Copyright © 2024 Guilherme Duarte Garcia