Do CAPES-7 programs publish in predatory journals?
Working note for the “Para quem publicamos?” study. Question raised: since much humanities output is in non-indexed journals, what share of those publications is in predatory journals? Analysis run 2026-05-28.
Short answer: Almost none. Across all CAPES-7 article output (2017–2020), 0.24% of articles are in predatory venues. For Linguistics/Letras the figure is 0.026% — a single article in four years. Predatory publishing is not a meaningful phenomenon in this elite sample, and is effectively absent from Letras.
Why this matters
It forecloses a lazy misreading of the main finding. Low international indexing in Linguistics/Letras is not because the field publishes in predatory junk. The non-indexed output is legitimate-but-local (Brazilian journals, in Portuguese), not predatory. The two things are completely different and this note quantifies the difference.
Data and universe
- Source: CAPES ARTPE records (journal articles) for CAPES-7 programs, 2017–2020 quadrennium.
- Universe: 42,235 program-article records; 3,878 in Linguistics/Letras (focal_field=1), of which 3,405 are non-indexed in SJR and 473 indexed.
- Journal name + ISSN were parsed from the CAPES DS_ISSN field (format (ISSN) JOURNAL NAME); publisher from NM_EDITORA (present for only ~16% of rows); SJR publisher/quartile from the matched articles.csv.
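The parsing step can be sketched as follows. This is a minimal illustration, not the study's actual code: it assumes the ISSN is always hyphenated inside the parentheses, and the function name `parse_ds_issn` is hypothetical.

```python
import re

# Assumed layout of the DS_ISSN field, per the note: "(ISSN) JOURNAL NAME".
# Assumption: the ISSN is written with a hyphen, e.g. "(2455-2186)".
DS_ISSN_PATTERN = re.compile(r"^\s*\((\d{4}-\d{3}[\dXx])\)\s*(.+?)\s*$")


def parse_ds_issn(raw):
    """Split a DS_ISSN value into (issn, journal_name).

    Returns (None, None) when the value is missing or does not follow
    the assumed layout, so unparsed rows can be counted separately.
    """
    match = DS_ISSN_PATTERN.match(raw or "")
    if match is None:
        return None, None
    issn, name = match.groups()
    # ISSN check characters may be a lowercase "x"; store them uppercase.
    return issn.upper(), name


print(parse_ds_issn("(2455-2186) INTERNATIONAL JOURNAL OF ENGLISH RESEARCH"))
```

Returning `(None, None)` rather than raising keeps malformed rows visible in the output instead of silently dropping them.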
Method
Reference lists: Beall’s List (archived, beallslist.net) — 1,344 publishers + 1,515 standalone journals, normalized (accent-stripped, lowercased, punctuation removed).
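A minimal sketch of the normalization described above, assuming punctuation is replaced by a space and runs of whitespace are collapsed (the note only says punctuation is removed). The function name `normalize_name` is hypothetical.

```python
import re
import unicodedata


def normalize_name(text):
    """Accent-strip, lowercase, and remove punctuation from a journal or publisher name."""
    # Decompose accented characters, then drop the combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    lowered = no_accents.lower()
    # Assumption: punctuation becomes a space (so "A&B" does not fuse into "ab").
    # \w keeps letters, digits, and underscores.
    no_punct = re.sub(r"[^\w\s]", " ", lowered)
    # Collapse repeated whitespace left behind by the substitution.
    return " ".join(no_punct.split())


print(normalize_name("Biomedical Journal of Scientific & Technical Research"))
```

Applying the same function to both the CAPES names and the Beall entries is what makes exact string comparison between the two usable.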
An article is flagged predatory if either the journal name matches a Beall standalone journal or the publisher (NM_EDITORA or the SJR publisher field) matches a Beall publisher — subject to two precision rules:
- Definition of record (conservative): predatory = on Beall’s list AND not SJR-indexed. An SJR-indexed journal is, by SJR’s curation, circulating internationally, so labelling it predatory is exactly the indefensible move. This rule dissolves the contested-megapublisher problem (see below).
- Collision suppression / contested exclusion: contested-but-indexed megapublishers (MDPI, Frontiers, Hindawi, Dove, Libertas, Bentham) are excluded, and generic journal-name collisions with legitimate journals (e.g. ACS’s Journal of Natural Products) are removed by the not-indexed rule.
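The flagging rule and its two precision rules can be sketched as a single predicate. This is one reading of the rules, under stated assumptions: inputs are already normalized, matching is exact set membership, the contested-publisher keys are illustrative short forms, and contested publishers are excluded outright. The name `is_predatory` is hypothetical.

```python
# Illustrative keys for the contested megapublishers named in the note.
# Assumption: real publisher strings would need mapping to these short forms.
CONTESTED_PUBLISHERS = {"mdpi", "frontiers", "hindawi", "dove", "libertas", "bentham"}


def is_predatory(journal, publishers, sjr_indexed, beall_journals, beall_publishers):
    """Conservative flag: on Beall's list AND not SJR-indexed.

    journal          -- normalized journal name
    publishers       -- normalized publisher candidates (NM_EDITORA and/or
                        the SJR publisher field); None or "" when missing
    sjr_indexed      -- True if the journal is indexed in SJR
    beall_journals   -- set of normalized Beall standalone journal names
    beall_publishers -- set of normalized Beall publisher names
    """
    # Rule 1: an SJR-indexed journal is never labelled predatory.
    if sjr_indexed:
        return False
    known = {p for p in publishers if p}
    # Rule 2: contested megapublishers are excluded (one reading of the note).
    if known & CONTESTED_PUBLISHERS:
        return False
    # Either a journal-name match or a publisher match is enough.
    return journal in beall_journals or bool(known & beall_publishers)
```

Because rule 1 runs first, a generic name collision with an indexed journal (such as Journal of Natural Products) never reaches the Beall lookup.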
Why the conservative definition is necessary
A naïve “count everything on Beall’s” yields 735 articles (1.7%) — but that count is dominated by SJR-indexed journals from contested megapublishers (MDPI’s Molecules, Sensors, Entropy, Nanomaterials…, Bentham, Oncotarget). Those are indexed and internationally circulating; treating them as predatory would be wrong and would be the first thing a referee attacks. Restricting to non-indexed Beall venues removes that noise and leaves a clean, unambiguous set.
Results
| Group | Articles | Naïve (all Beall) | Predatory (conservative) | Predatory share, conservative (%) |
|---|---|---|---|---|
| All CAPES-7 | 42,235 | 735 | 100 | 0.237 |
| Linguistics/Letras | 3,878 | 2 | 1 | 0.026 |
| Other fields | 38,357 | 733 | 99 | 0.258 |
| Letras — non-indexed only | 3,405 | — | 1 | 0.029 |
- 56 distinct predatory journals, 100 articles total.
- The single Linguistics/Letras hit: International Journal of English Research (ISSN 2455-2186), one article, LETRAS/UFRGS, 2018.
- Mechanism-driven cross-check for Letras: predatory venues are English-only, so any predatory Letras publication must appear in an English-titled journal. The pool of English-titled, non-indexed Letras candidates (138 journals / 219 articles) was inspected; every journal except the one above is legitimate (Revista da ANPOLL, Letras de Hoje, Alfa, WORD, Studies in Romanticism, JoSS…). The estimate of 1 is therefore near-exhaustive, not a lower bound clipped by tooling.
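The percentages in the results table follow directly from the counts, and the group totals can be checked against each other. The short sketch below recomputes them; the counts are copied from the table, while the helper name `share_pct` is hypothetical.

```python
# Counts copied from the results table:
# (predatory under the conservative rule, total articles).
COUNTS = {
    "All CAPES-7": (100, 42235),
    "Linguistics/Letras": (1, 3878),
    "Other fields": (99, 38357),
    "Letras, non-indexed only": (1, 3405),
}


def share_pct(hits, total, digits=3):
    """Share of hits in total, as a percentage rounded to `digits` decimals."""
    return round(100 * hits / total, digits)


SHARES = {group: share_pct(hits, total) for group, (hits, total) in COUNTS.items()}
NAIVE_SHARE = share_pct(735, 42235, digits=1)  # naïve "everything on Beall's" count

# Internal consistency of the table: the subgroups add up to the totals.
assert 3878 + 38357 == 42235  # Letras + other fields = all CAPES-7
assert 3405 + 473 == 3878     # non-indexed + indexed Letras
assert 1 + 99 == 100          # conservative hits
assert 2 + 733 == 735         # naïve hits

for group, pct in SHARES.items():
    print(f"{group}: {pct}%")
print(f"Naïve share: {NAIVE_SHARE}%")
```

These reproduce the table's 0.237, 0.026, 0.258, and 0.029, and the 1.7% naïve figure.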
Representative predatory journals (non-indexed, on Beall’s)
| Articles | Journal (ISSN) | Field |
|---|---|---|
| 15 | International Journal of Development Research (2230-9926) | other |
| 11 | International Journal for Innovation Education and Research (2411-2933 / 2411-3123) | other |
| 4 | International Journal of Science and Research (2454-2008) | other |
| 4 | Journal of Novel Physiotherapies — OMICS (2165-7025) | other |
| 4 | European Journal of Chemistry (2153-2249) | other |
| 4 | Biomedical Journal of Scientific & Technical Research (2574-1241) | other |
| 2 | Creative Education — SCIRP (2151-4755) | other |
| 1 | American Journal of Applied Chemistry — SciencePG (2330-8753) | other |
| 1 | European Journal of Scientific Research (1450-216X) | other |
| 1 | European Academic Research (2286-4822) | other |
| … | (OMICS dental/oncology/obesity titles, etc.) | other |
| 2 | International Journal of English Research (2455-2186) | Letras (1) + other (1) |
The genuine predatory hits cluster in applied/STEM, medicine/dentistry, and business — not the humanities.
Interpretation: the language-barrier mechanism
The board analysis (predatory_authors.md) found Brazilian linguists — including the sitting ABRALIN president — listed on predatory editorial boards. Yet they almost never publish in predatory venues. The two diverge for a simple structural reason:
- Board membership is low-friction vanity: you receive a flattering email, you have little exposure to foreign venue quality, you accept. No language barrier.
- Publishing is high-friction: predatory journals operate in English, never Portuguese. A Portuguese-default author has to clear the same language hurdle as for any international venue — so the predatory “shortcut” offers no advantage over a legitimate local journal.
Result: governance complicity ≠ output. Brazilian linguists appear on predatory mastheads far more than they publish in predatory journals.
Limitations
- Beall’s is frozen at 2017 and is name/publisher-keyed; no comprehensive predatory-ISSN list was obtainable (SciencePG/OMICS directories are JS-rendered and could not be harvested statically).
- NM_EDITORA is blank for ~84% of rows; publisher-level predators were partly recovered via the SJR publisher field and via journal-name matching, but some publisher-level cases with no name match and no publisher field will be missed.
- The not-indexed rule slightly undercounts genuinely-predatory journals that achieved brief SJR indexing.
- Net effect: the other-fields figure is a mild lower bound; the Letras figure is robust (near-exhaustive given the language-barrier filter).
Reproducibility
Matching runs on journal name, publisher, and SJR publisher against the normalized Beall lists. Records are joined across lattes/data/processed/articles.csv and lattes/data/raw/.../*_artpe.csv on (cd_programa_ies, year, production_id). Beall snapshots: beallslist.net publishers + standalone-journals pages.
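A rough sketch of the join on the composite key, written with plain dictionaries so it runs without extra dependencies. Only the three key columns come from the note; the `sjr_publisher` column name, the function name `attach_sjr`, and the choice of a left join from the ARTPE records are assumptions for illustration.

```python
JOIN_KEY = ("cd_programa_ies", "year", "production_id")


def attach_sjr(artpe_rows, article_rows, key=JOIN_KEY):
    """Left-join ARTPE records to SJR-matched article rows on the composite key.

    Both inputs are lists of dicts (e.g. from csv.DictReader). ARTPE records
    with no match keep sjr_publisher=None.
    """
    # Index the SJR-matched rows by their composite key for O(1) lookup.
    index = {tuple(row[k] for k in key): row for row in article_rows}
    merged = []
    for row in artpe_rows:
        match = index.get(tuple(row[k] for k in key))
        # "sjr_publisher" is a hypothetical column name.
        publisher = match.get("sjr_publisher") if match else None
        merged.append({**row, "sjr_publisher": publisher})
    return merged
```

A left join keeps every ARTPE record in the output, which matters here because the non-indexed records are exactly the ones the conservative rule inspects.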
Note (to be added):
Copyright © Guilherme Duarte Garcia