Findings Summary

Date generated: 2026-05-24

Scope

This project currently analyzes CAPES-7 journal article output for 2017-2020. The focal field is LINGUÍSTICA E LITERATURA, compared with eleven other journal-heavy CAPES evaluation areas.

The current processed dataset contains:

  • 63 CAPES-7 programs.
  • 39,049 journal article rows.
  • 12 discipline groups.
  • Publication language from CAPES DS_IDIOMA.
  • SJR quartile matching via ISSN.

The quartile source currently used is the accessible UKZN mirror of the 2024 Scimago/SJR quartile file. This is a proxy: it assigns 2024 quartiles to articles published in 2017-2020, because Scimago’s official historical CSV endpoint was blocked from this environment. For publication, replace this proxy with historical SJR files for 2017-2020 if possible.
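
The ISSN matching step can be sketched as follows. This is a minimal illustration, assuming that ISSNs may appear with or without hyphens and that a single quartile-file row can list several ISSNs separated by commas. The function names, the row layout, and the sample ISSNs are hypothetical and are not taken from the project's scripts.

```python
import re


def normalize_issn(raw):
    """Return an 8-character ISSN key (no hyphen, upper-case X), or None."""
    if raw is None:
        return None
    key = re.sub(r"[^0-9Xx]", "", str(raw)).upper()
    return key if len(key) == 8 else None


def build_quartile_lookup(sjr_rows):
    """Map every ISSN listed for a journal to that journal's quartile.

    sjr_rows: iterable of (issn_field, quartile) pairs, where issn_field
    may hold several comma-separated ISSNs (an assumed layout).
    """
    lookup = {}
    for issn_field, quartile in sjr_rows:
        for part in str(issn_field).split(","):
            key = normalize_issn(part)
            if key is not None:
                lookup[key] = quartile
    return lookup


def match_quartile(article_issn, lookup):
    """Return the quartile for an article's ISSN, or None if unmatched."""
    return lookup.get(normalize_issn(article_issn))


# Hypothetical rows, for illustration only.
sjr_rows = [("12345678, 8765432X", "Q1"), ("11112222", "Q3")]
lookup = build_quartile_lookup(sjr_rows)
print(match_quartile("1234-5678", lookup))
print(match_quartile("0000-0000", lookup))
```

Under this sketch an unmatched article returns None rather than being dropped, which keeps it in the all-articles denominator used in the rest of this summary.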

Areas Included

The figures use every area currently present in the processed article table:

Area | CAPES-7 programs | Article rows
Administração Pública e de Empresas, Ciências Contábeis e Turismo | 3 | 2,021
Astronomia / Física | 11 | 10,261
Ciência da Computação | 7 | 2,766
Ciência Política e Relações Internacionais | 2 | 754
Ciências Biológicas I | 5 | 3,698
Economia | 4 | 564
História | 2 | 771
Linguística e Literatura | 6 | 3,878
Matemática / Probabilidade e Estatística | 7 | 2,199
Psicologia | 3 | 1,105
Química | 10 | 9,487
Sociologia | 3 | 1,545

Political Science is therefore already included, under CAPES’s area label CIÊNCIA POLÍTICA E RELAÇÕES INTERNACIONAIS.
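
As a quick consistency check, the per-area counts can be summed and compared with the totals stated under Scope. The sketch below simply hard-codes the table values; it does not read the processed dataset.

```python
# (CAPES-7 programs, article rows) per area, copied from the table above.
areas = {
    "Administração Pública e de Empresas, Ciências Contábeis e Turismo": (3, 2021),
    "Astronomia / Física": (11, 10261),
    "Ciência da Computação": (7, 2766),
    "Ciência Política e Relações Internacionais": (2, 754),
    "Ciências Biológicas I": (5, 3698),
    "Economia": (4, 564),
    "História": (2, 771),
    "Linguística e Literatura": (6, 3878),
    "Matemática / Probabilidade e Estatística": (7, 2199),
    "Psicologia": (3, 1105),
    "Química": (10, 9487),
    "Sociologia": (3, 1545),
}

total_programs = sum(programs for programs, _ in areas.values())
total_rows = sum(rows for _, rows in areas.values())

print(len(areas), total_programs, total_rows)  # 12 63 39049
```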

Main Result

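Linguistics/Letras is an extreme outlier on three related outcomes: SJR indexing, English-language publication, and quartile placement (reported below as both Q1 and Q1/Q2):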

Outcome | Linguistics/Letras | Comparison fields | Gap
Matched to SJR | 12.2% | 73.4% | -61.2 pp
Published in English | 8.3% | 82.2% | -73.9 pp
Q1, all articles denominator | 1.9% | 43.1% | -41.2 pp
Q1/Q2, all articles denominator | 6.0% | 62.3% | -56.3 pp

The indexing result should not be treated as a mere missing-data problem. In journal-based fields, SJR coverage is itself a meaningful signal of whether a journal belongs to the main international indexing ecosystem. Being indexed does not guarantee quality, but being absent from SJR is rarely consistent with being a top international journal in the field.

Discipline-Level Summary

Area | Articles | SJR match | English | Q1, all articles | Q1/Q2, all articles | Q1/Q2 among SJR-indexed
Linguística e Literatura | 3,878 | 12.2% | 8.3% | 1.9% | 6.0% | 49.3%
História | 771 | 22.7% | 6.0% | 4.2% | 15.6% | 68.6%
Sociologia | 1,545 | 20.6% | 10.6% | 3.8% | 15.7% | 76.4%
Ciência Política e Relações Internacionais | 754 | 29.0% | 17.9% | 8.4% | 22.3% | 76.7%
Administração Pública e de Empresas, Ciências Contábeis e Turismo | 2,021 | 41.8% | 45.9% | 17.0% | 25.1% | 60.1%
Psicologia | 1,105 | 56.7% | 50.0% | 17.2% | 29.7% | 52.3%
Economia | 564 | 56.2% | 72.5% | 31.6% | 41.5% | 73.8%
Ciência da Computação | 2,766 | 67.3% | 91.6% | 39.1% | 58.1% | 86.4%
Ciências Biológicas I | 3,698 | 91.8% | 92.6% | 58.1% | 81.0% | 88.2%
Química | 9,487 | 84.1% | 93.4% | 42.3% | 68.3% | 81.3%
Astronomia / Física | 10,261 | 80.3% | 95.0% | 56.4% | 73.2% | 91.1%
Matemática / Probabilidade e Estatística | 2,199 | 84.6% | 96.4% | 57.2% | 78.6% | 92.9%

The indexed-only Q1/Q2 column is useful, but it answers a different question: conditional on a journal being indexed in SJR, where does it rank? The all-article denominator is the stronger field-level measure because it preserves the indexing gap rather than discarding it.
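
The two denominators are linked by a simple identity: the all-articles Q1/Q2 share equals the SJR match rate multiplied by the Q1/Q2 share among indexed articles. The sketch below checks that identity on three rows of the table, using the rounded percentages as published; small rounding differences are expected.

```python
def q1q2_all_articles(sjr_match_rate, q1q2_among_indexed):
    """Share of all articles in Q1/Q2 = P(indexed) * P(Q1/Q2 | indexed)."""
    return sjr_match_rate * q1q2_among_indexed


# (SJR match, Q1/Q2 among indexed, reported Q1/Q2 over all articles),
# copied from the discipline-level table as proportions.
rows = {
    "Linguística e Literatura": (0.122, 0.493, 0.060),
    "História": (0.227, 0.686, 0.156),
    "Ciências Biológicas I": (0.918, 0.882, 0.810),
}

for area, (match, conditional, reported) in rows.items():
    implied = q1q2_all_articles(match, conditional)
    print(f"{area}: implied {implied:.1%}, reported {reported:.1%}")
```

For Linguística e Literatura, a conditional rate of 49.3% looks moderate on its own, but multiplied by a 12.2% match rate it yields only about 6% of all articles in Q1/Q2.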

Regression-Style Checks

Simple logit models with a year control give the same qualitative result:

  • The odds of SJR indexing for Linguistics/Letras articles are about 0.050 times those of the comparison fields.
  • The odds of being in English are about 0.020 times those of the comparison fields.
  • The odds of Q1/Q2 placement are about 0.039 times those of the comparison fields.

These are descriptive checks, not causal models. They show that the focal-field gap is not an artifact of small year-to-year changes from 2017 to 2020.
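
For orientation, unadjusted odds ratios can be computed directly from the rounded percentages in the Main Result table. This is only a back-of-the-envelope sketch: it omits the year control used in the logit models, so agreement with the reported figures is approximate rather than exact.

```python
def odds(p):
    """Convert a proportion to odds."""
    return p / (1.0 - p)


def odds_ratio(p_focal, p_comparison):
    """Odds of the outcome in the focal field relative to the comparison fields."""
    return odds(p_focal) / odds(p_comparison)


# (Linguistics/Letras, comparison fields), from the Main Result table.
outcomes = {
    "SJR indexing": (0.122, 0.734),
    "English": (0.083, 0.822),
    "Q1/Q2, all articles": (0.060, 0.623),
}

for name, (focal, comparison) in outcomes.items():
    print(f"{name}: unadjusted OR = {odds_ratio(focal, comparison):.3f}")
```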

Language as a Mechanism

The data support the idea that language is part of the mechanism linking field norms to indexing and quartile placement.

Field group | Language | Articles | SJR indexed | Q1/Q2, all articles
Linguistics/Letras | English | 322 | 36.0% | 27.3%
Linguistics/Letras | Not English | 3,556 | 10.0% | 4.1%
Comparison fields | English | 28,925 | 81.9% | 71.4%
Comparison fields | Not English | 6,246 | 34.4% | 20.4%

Within Linguistics/Letras, English articles are much more likely to be indexed and much more likely to be Q1/Q2. This is consistent with the proposed pathway:

field norms / audience / language orientation
  -> Portuguese and Brazilian/local journal publication
  -> lower SJR indexing
  -> lower observed Q1/Q2 placement

Adding English to the logit models reduces, but does not eliminate, the Linguistics/Letras penalty:

  • For SJR indexing, the focal-field odds ratio moves from 0.050 to 0.188 after adding English. English itself has an odds ratio of 8.39.
  • For Q1/Q2 placement, the focal-field odds ratio moves from 0.039 to 0.160 after adding English. English itself has an odds ratio of 9.74.

This is a descriptive mediation pattern: English publication explains part of the gap, but not all of it.
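
One way to see why the penalty shrinks but persists is to compute the focal-field odds ratio separately within each language group, using the rounded rates from the language table. This is a sketch under simplifying assumptions (no year control, rounded inputs); the adjusted odds ratios from the logit models would be expected to fall near or between the two within-language values.

```python
def odds_ratio(p_focal, p_comparison):
    """Odds in the focal field divided by odds in the comparison fields."""
    return (p_focal / (1 - p_focal)) / (p_comparison / (1 - p_comparison))


# (Linguistics/Letras rate, comparison-field rate) by language,
# copied from the language table as proportions.
sjr_indexed = {"English": (0.360, 0.819), "Not English": (0.100, 0.344)}
q1q2_all = {"English": (0.273, 0.714), "Not English": (0.041, 0.204)}

for label, table in (("SJR indexing", sjr_indexed), ("Q1/Q2", q1q2_all)):
    for language, (focal, comparison) in table.items():
        ratio = odds_ratio(focal, comparison)
        print(f"{label}, {language}: within-language OR = {ratio:.3f}")
```

Both within-language ratios remain far below 1 for each outcome, which is the pattern described above: holding language fixed narrows the gap without closing it.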

Brazilian Journal Proxy

For articles matched to SJR, the 2024 SJR file includes journal country. This allows a partial proxy for Brazilian versus non-Brazilian journals, but only among indexed journals.

Field group | SJR country group | Indexed articles | English | Q1/Q2 among indexed
Linguistics/Letras | Brazilian SJR journal | 307 | 13.4% | 39.1%
Linguistics/Letras | Non-Brazilian SJR journal | 166 | 45.2% | 68.1%
Comparison fields | Brazilian SJR journal | 2,133 | 49.4% | 29.2%
Comparison fields | Non-Brazilian SJR journal | 23,697 | 95.5% | 89.9%

The pattern is consistent with the hypothesis that Brazilian/local journal publication is associated with lower international quartile placement. However, this proxy cannot classify the large number of non-indexed articles by journal country. A stronger version would require an external ISSN-to-country registry or a curated Brazilian-journal list.
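
The reach of this proxy can be quantified from the counts already reported. The sketch below derives, for each field group, the share of indexed articles that appear in Brazilian SJR journals and the share of all articles the proxy can classify at all. The comparison-field total of 35,171 is derived here as 39,049 minus 3,878; the dictionary keys are illustrative names only.

```python
# Indexed-article counts from the table above; all-article totals from
# the Scope and language sections (35,171 = 39,049 - 3,878).
counts = {
    "Linguistics/Letras": {"brazilian": 307, "non_brazilian": 166, "all_articles": 3878},
    "Comparison fields": {"brazilian": 2133, "non_brazilian": 23697, "all_articles": 35171},
}


def summarize(c):
    """Return indexed count, Brazilian share of indexed, and proxy coverage."""
    n_indexed = c["brazilian"] + c["non_brazilian"]
    return {
        "indexed": n_indexed,
        "brazilian_share_of_indexed": c["brazilian"] / n_indexed,
        "proxy_coverage": n_indexed / c["all_articles"],
    }


for group, c in counts.items():
    s = summarize(c)
    print(
        f"{group}: {s['indexed']} indexed, "
        f"{s['brazilian_share_of_indexed']:.1%} in Brazilian SJR journals, "
        f"proxy covers {s['proxy_coverage']:.1%} of all articles"
    )
```

On these counts, roughly two thirds of indexed Linguistics/Letras articles are in Brazilian SJR journals, against under a tenth for the comparison fields, while the proxy classifies only about 12% of Linguistics/Letras output.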

Working Hypotheses

The current evidence supports these working hypotheses:

  1. Linguistics/Letras CAPES-7 output is structurally less integrated into SJR’s international journal-indexing system than the comparison fields.
  2. Low English-language publication is one major mechanism behind low indexing.
  3. Brazilian/local journal publication likely mediates part of the relationship between language and SJR indexing.
  4. The Q1/Q2 gap should be measured primarily with all articles in the denominator, because non-indexing is substantively informative rather than a random missingness process.
  5. Conditional Q1/Q2 among indexed journals is useful as a secondary diagnostic: it asks whether indexed Linguistics/Letras journals are also lower-ranked after excluding non-indexed output.
  6. The results support the claim that Linguistics/Letras publishes much less in international indexed journals. They do not by themselves prove that the underlying research is worse; they show that journal placement, language, and indexing ecology differ sharply.

Figures

Generated figures are in figures/:

  • pct_english_by_discipline.png
  • sjr_indexing_rate_by_discipline.png
  • sjr_quartile_distribution_by_discipline.png
  • q1q2_among_indexed_by_discipline.png
  • english_vs_q1q2_by_discipline.png
  • english_share_over_time.png
  • sjr_indexing_by_language_and_field.png
  • q1q2_by_language_and_field.png

Current Caveats

  • The SJR file is a 2024 proxy, not a historical 2017-2020 panel.
  • CAPES’s accessible datastore copy of the 2013-2016 article-detail table is incomplete, so processed analyses currently use 2017-2020 only.
  • Per-faculty denominators are based only on faculty observed as authors in the article data, because the CAPES docente datastore returned empty filtered files for the selected program codes.
  • Linguistics/Letras in CAPES combines linguistics and literature programs. The present summary treats the CAPES area as the focal unit; a later refinement could split linguistics-oriented and literature-oriented programs.

Adding Areas

To add another CAPES evaluation area, add its normalized CAPES area name to STUDY_AREAS in scripts/derive_capes7_programs.py, then rerun:

python3 scripts/derive_capes7_programs.py
python3 scripts/fetch_capes7_datastore.py
python3 scripts/build_database.py
python3 scripts/export_summaries.py
Rscript analysis.R

Political Science does not need to be added because it is already present as CIÊNCIA POLÍTICA E RELAÇÕES INTERNACIONAIS.
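
The rerun sequence could also be wrapped in a small driver that stops at the first failing step. This is a hypothetical convenience script, not part of the repository; it assumes it is run from the project root and that python3 and Rscript are on the PATH.

```python
import subprocess

# Commands copied from the rerun list above, in the same order.
PIPELINE = [
    ["python3", "scripts/derive_capes7_programs.py"],
    ["python3", "scripts/fetch_capes7_datastore.py"],
    ["python3", "scripts/build_database.py"],
    ["python3", "scripts/export_summaries.py"],
    ["Rscript", "analysis.R"],
]


def run_pipeline(steps=PIPELINE, runner=subprocess.run):
    """Run each step in order; check=True raises on the first failure."""
    completed = []
    for command in steps:
        print("Running:", " ".join(command))
        runner(command, check=True)
        completed.append(command)
    return completed


# To execute the real pipeline, call run_pipeline() from the project root.
```

Because later steps depend on the outputs of earlier ones, stopping on the first error avoids rebuilding summaries and figures from a stale database.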

Copyright © Guilherme Duarte Garcia