Portuguese Stress Lexicon

The Portuguese Stress Lexicon (PSL; Garcia, 2014) contains the non-verbs of Portuguese, excluding monosyllables. The lexicon is largely based on the word list of the Houaiss Dictionary (Houaiss et al., 2001), which is the most comprehensive dictionary of Portuguese. PSL contains 154,610 entries and 62 columns, which provide a comprehensive set of variables, including pronunciation, syllabification, stress position, syllabic constituents, intervals, CV profiles and weight profiles.

Paper

The lexicon was developed as part of Garcia (2017) (see here), which examined weight and stress in Portuguese. Because stress in verbs is not phonologically conditioned in the language, only non-verbs are included in PSL.

Updates
  • Mid vowels are now accurately represented in words where no diacritic is present in the orthography.
  • Rhotics have been adjusted (to R) when preceded by {m,n,s,l} in coda position.
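
The rhotic adjustment can be pictured as a simple rewrite rule. The snippet below is a minimal sketch, not the script used to build PSL. It assumes that lowercase r is the symbol being replaced and that syllable boundaries are written with -; the function name and the example transcriptions are made up for illustration.

```python
import re

# A coda m, n, s or l, then a syllable boundary, an optional stress mark, and r.
# (Assumption: lowercase r is the rhotic that gets adjusted to R.)
_CODA_THEN_RHOTIC = re.compile(r"(?<=[mnsl])-('?)r")


def adjust_rhotic(pro: str) -> str:
    """Rewrite r as R when the preceding syllable ends in m, n, s or l."""
    return _CODA_THEN_RHOTIC.sub(r"-\1R", pro)


# Hypothetical inputs, written in the PSL transcription style.
print(adjust_rhotic("'on-ra"))   # rhotic follows a coda n
print(adjust_rhotic("'pEr-na"))  # r is itself the coda here, so nothing changes
```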

Sample

word     pro         penSyl  weightProfile  stressLoc
abelha   a-'be-La    be      LLL            penult
átomo    'a-to-mo    to      LLL            antepenult
cavalo   ka-'va-lo   va      LLL            penult
dínamo   'di-na-mo   na      LLL            antepenult
perna    'pEr-na     pEr     HL             penult
viscoso  vis-'ko-zo  ko      HLL            penult
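
In the rows above, stressLoc and penSyl can be read off the pro column. The following sketch shows one way to do that in Python. It is an illustration under assumptions, not part of PSL: the function names are hypothetical, the label "final" does not appear in the sample and is assumed here, and penSyl is taken to be the penultimate syllable because that matches all six rows.

```python
def stress_location(pro: str) -> str:
    """Return the position of the stressed syllable, counted from the right edge."""
    syllables = pro.split("-")
    # The primary stress mark (') is written at the start of the stressed syllable.
    stressed = [i for i, syl in enumerate(syllables) if syl.startswith("'")]
    if len(stressed) != 1:
        raise ValueError(f"expected exactly one stress mark in {pro!r}")
    from_right = len(syllables) - stressed[0]  # 1 means the final syllable
    # "penult" and "antepenult" occur in the sample; "final" is an assumed label.
    labels = {1: "final", 2: "penult", 3: "antepenult"}
    return labels.get(from_right, f"{from_right} syllables from the right edge")


def penult_syllable(pro: str) -> str:
    """Return the penultimate syllable without its stress mark."""
    syllables = [syl.lstrip("'") for syl in pro.split("-")]
    return syllables[-2]


for pro in ["a-'be-La", "'a-to-mo", "'pEr-na"]:
    print(pro, penult_syllable(pro), stress_location(pro))
```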


 Download

To have access to PSL (RData format), click here.

Transcription
  • S is a voiceless post-alveolar fricative (IPA: /ʃ/), as in sh in shape
  • Z is a voiced post-alveolar fricative (IPA: /ʒ/)
  • R is a voiceless velar fricative (IPA: /x/)
  • L is a voiced palatal lateral approximant (IPA: /ʎ/)
  • N is a voiced palatal nasal (IPA: /ɲ/)
  • ~ indicates nasality in cases where no nasal consonant follows (e.g., anã = a-'na~)
  • j, w are glides, assumed to be in nuclear position
  • ' marks primary stress
  • - marks a syllable boundary

All other phonemes are straightforward.
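
The key above can be turned into a small conversion table. The sketch below maps only the symbols documented in this section and leaves every other character unchanged, since their values are not spelled out here. Rendering ~ as a combining tilde on the preceding vowel, ' as the IPA stress mark and - as a period are display choices assumed for this example (the lexicon itself treats ~ as a segment), and the names PSL_TO_IPA and to_ipa are hypothetical.

```python
import unicodedata

# Only the symbols listed in the Transcription section are mapped.
PSL_TO_IPA = {
    "S": "ʃ",       # voiceless post-alveolar fricative
    "Z": "ʒ",       # voiced post-alveolar fricative
    "R": "x",       # velar fricative
    "L": "ʎ",       # palatal lateral approximant
    "N": "ɲ",       # palatal nasal
    "~": "\u0303",  # assumption: shown as a combining tilde on the preceding vowel
    "'": "ˈ",       # primary stress
    "-": ".",       # syllable boundary
}


def to_ipa(pro: str) -> str:
    """Convert a PSL transcription into an IPA-style string."""
    converted = "".join(PSL_TO_IPA.get(ch, ch) for ch in pro)
    # NFC composes a vowel plus combining tilde into a single character where possible.
    return unicodedata.normalize("NFC", converted)


print(to_ipa("a-'be-La"))
print(to_ipa("a-'na~"))
```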

A note on nasality

The symbol used to represent nasalization is the tilde (~). It is treated as a coda consonant in the lexicon, except in ão sequences (see below). For example, the word anã (‘dwarf’, fem.), transcribed as a-'na~, has a word-final coda, namely ~. Where nasalization is the result of assimilation, ~ is not used: the words cama and canta (‘bed’, ‘sings’) are transcribed as 'ka-ma and 'kan-ta, respectively, because nasality is predictable there. Finally, in words with ão, the tilde is assumed to be in the nucleus: the word coração (‘heart’) is transcribed as ko-ra-'sa~w. The assumption that nasal vowels in Portuguese are VN sequences is based on Câmara Jr. (1970), and is not definitive. Rather, it should be taken as a (neutral) baseline for the general characterization of the Portuguese lexicon.
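
The distinction described in this note can be checked mechanically. The sketch below is only an illustration: it writes syllable boundaries with the hyphen listed under Transcription, the function names are hypothetical, and it recognizes the nucleus case solely by a ~ followed by w. Other nasal diphthongs are not described in this note, so the sketch falls back to "coda" for them, which may not match the lexicon.

```python
from typing import List, Optional


def tilde_role(syllable: str) -> Optional[str]:
    """Return "coda", "nucleus", or None depending on where ~ sits in a syllable."""
    syl = syllable.lstrip("'")
    if "~" not in syl:
        return None  # no tilde: oral vowel or predictable (assimilated) nasality
    if "~w" in syl:
        return "nucleus"  # ão sequences: the tilde belongs to the nucleus
    return "coda"  # default: the tilde counts as a coda consonant


def tilde_roles(pro: str) -> List[Optional[str]]:
    """Apply tilde_role to every syllable of a transcription."""
    return [tilde_role(syl) for syl in pro.split("-")]


print(tilde_roles("a-'na~"))       # anã
print(tilde_roles("'kan-ta"))      # canta
print(tilde_roles("ko-ra-'sa~w"))  # coração
```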

A note on glides

The lexicon assumes that glides are in the nucleus. While this is probably not correct for Portuguese, it is a more neutral choice. You may instead wish to assume that glides are in the onset (rising diphthongs) or in the coda (falling diphthongs).
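
If you prefer the alternative analysis, glides can be reassigned when splitting a syllable into its constituents. The sketch below is a rough illustration under several assumptions: the vowel set is a guess (E occurs in the sample, O is assumed), nasal syllables containing ~ are deliberately not handled, the function name is hypothetical, and the example syllables are constructed for illustration rather than taken from PSL.

```python
from typing import Tuple

VOWELS = set("aeiouEO")  # assumed vowel symbols; adjust to the lexicon's inventory
GLIDES = set("jw")


def split_syllable(syllable: str, glides_in_nucleus: bool = True) -> Tuple[str, str, str]:
    """Split a syllable into (onset, nucleus, coda).

    With glides_in_nucleus=True, glides adjacent to the vowel stay in the
    nucleus, as in the lexicon. With False, a prevocalic glide goes to the
    onset and a postvocalic glide goes to the coda.
    """
    syl = syllable.lstrip("'")
    if "~" in syl:
        raise ValueError("nasal syllables are not handled in this sketch")
    vowel_positions = [i for i, ch in enumerate(syl) if ch in VOWELS]
    if not vowel_positions:
        raise ValueError(f"no vowel found in {syllable!r}")
    start, end = vowel_positions[0], vowel_positions[-1] + 1
    if glides_in_nucleus:
        # Extend the nucleus over any glides next to the vowel.
        while start > 0 and syl[start - 1] in GLIDES:
            start -= 1
        while end < len(syl) and syl[end] in GLIDES:
            end += 1
    return syl[:start], syl[start:end], syl[end:]


# Constructed examples: a falling diphthong (pej) and a rising one (gwa).
print(split_syllable("'pej"), split_syllable("'pej", glides_in_nucleus=False))
print(split_syllable("gwa"), split_syllable("gwa", glides_in_nucleus=False))
```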


Copyright © 2024 Guilherme Duarte Garcia

References

Câmara Jr., J. M. (1970). Estrutura da língua portuguesa. Editora Vozes.
Garcia, G. D. (2014). Portuguese Stress Lexicon.
Garcia, G. D. (2017). Weight gradience and stress in Portuguese. Phonology, 34(1), 41–79. https://doi.org/10.1017/S0952675717000033
Houaiss, A., Villar, M., & Mello Franco, F. M. de. (2001). Dicionário eletrônico Houaiss da língua portuguesa. Objetiva.