The function below, biGram(), calculates the bigram probability of any given word relative to a given corpus. The output is the sum of the log probabilities of the word's segmental bigrams. The function takes two arguments: a word (x) and a corpus, i.e., a list of words. Some lines in the function are based on the Portuguese Stress Lexicon. The function requires the tidyverse package.
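
To make the computation concrete, here is a minimal sketch of the general idea in base R. It is not biGram() itself: the names get_bigrams() and bigram_logprob() are hypothetical, and so are the "^" and "$" word-boundary symbols. The sketch assumes that each bigram's probability is estimated as the count of the bigram divided by the count of all bigrams that start with the same segment, with no smoothing. The actual function may estimate probabilities differently.

```r
# Hypothetical helper (not the author's biGram()): split a word into
# segmental bigrams, padding with "^" and "$" as word-boundary symbols.
get_bigrams <- function(word) {
  segs <- c("^", strsplit(word, "")[[1]], "$")
  paste0(head(segs, -1), tail(segs, -1))
}

# Sum of log probabilities over the bigrams of word x.
# Assumption: P(bigram) = count(bigram) / count(bigrams sharing its first segment).
bigram_logprob <- function(x, corpus) {
  corpus_bigrams <- unlist(lapply(corpus, get_bigrams))
  corpus_firsts <- substr(corpus_bigrams, 1, 1)

  target <- get_bigrams(x)
  # How often each target bigram occurs in the corpus
  counts <- vapply(target, function(b) sum(corpus_bigrams == b), numeric(1))
  # How often each target bigram's first segment starts any corpus bigram
  totals <- vapply(target, function(b) sum(corpus_firsts == substr(b, 1, 1)),
                   numeric(1))

  # Unseen bigrams get probability 0, so the sum is -Inf (no smoothing)
  probs <- ifelse(counts == 0, 0, counts / totals)
  sum(log(probs))
}

toy_corpus <- c("pa", "pa", "ta")
bigram_logprob("pa", toy_corpus)
```

In the toy corpus, two of the three words begin with "p", and every other bigram in "pa" is fully predictable from its first segment, so the result is log(2/3). A word containing a bigram that never occurs in the corpus scores -Inf, which is one reason real applications often add smoothing.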

Tip

To work with n-grams in general, I strongly recommend the excellent ngram package.