The Students Writing in Their Third Language Were the Ones the Algorithm Flagged

Johan Steyn
May 28

AI detection tools are biased against non-native English speakers at rates that would be considered institutional misconduct in any other context. South African universities cannot afford to pretend otherwise.


Article link: https://open.substack.com/pub/johanosteyn/p/the-students-writing-in-their-third?r=73gqa&utm_campaign=post&utm_medium=web&showWelcomeOnShare=true

Follow me on LinkedIn: https://www.linkedin.com/in/johanosteyn/

In July 2023, a group of Stanford computer scientists published a finding that should have ended the use of AI detection tools in multilingual educational environments immediately. They tested seven widely used AI detectors on 91 essays written by human students who were non-native English speakers. On average, the detectors flagged 61.22% of those authentic, human-written essays as AI-generated. All seven detectors unanimously misclassified 19% of the essays, and 97% were flagged by at least one. The students who wrote every word of those essays were being identified as cheats. The tools kept running for another three years.

On 22 May 2026, the University of the Free State announced it would discontinue Turnitin’s AI detection function from July 2026, following the University of Cape Town, which took the same step in October 2025. Together, the two decisions point to a growing South African institutional consensus that detection-based academic integrity enforcement has failed its students.

The UFS decision matters because it is no longer one isolated institution making a principled choice. It is a pattern, and the Daily Maverick specifically called on other South African universities to follow UCT’s lead as far back as July 2025. The UFS decision confirms that the argument was right and that the movement is growing. The decision is welcome. The question it demands is not why UFS made it. It is why it took this long, and what every other South African institution is still waiting for.

CONTEXT AND BACKGROUND

The mechanics of the bias are not complicated. AI detection tools assess writing using a metric called perplexity, a measure of how predictable each word is to a language model. In practice it broadly correlates with linguistic sophistication: the variety of vocabulary, the complexity of syntax, the diversity of sentence structure. Writing that scores low on perplexity appears predictable and simple to the algorithm, and predictable, simple writing is what these tools are trained to associate with machine-generated text. The problem is that predictable, simple writing is also what human students produce when they are writing in a language that is not their first. A student composing an essay in English as her third language will naturally use a more limited vocabulary and simpler sentence structures than a native speaker. The algorithm cannot distinguish between those two things. It does not try to.
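
For readers who want to see the mechanism rather than take it on trust, here is a minimal sketch in Python. It is a toy, not how any commercial detector is built: real tools rely on large neural language models, whereas this one uses a tiny word-frequency model. The function names, the reference text, and the threshold are all hypothetical, chosen only to show how a low perplexity score can turn plain human writing into a flag.

```python
import math
from collections import Counter


def train_unigram(reference_tokens):
    """Build a toy word-frequency model with add-one smoothing (illustrative only)."""
    counts = Counter(reference_tokens)
    total = sum(counts.values())
    vocab_size = len(counts) + 1  # one extra slot for any unseen word

    def prob(token):
        # Unseen words get a small but non-zero probability.
        return (counts[token] + 1) / (total + vocab_size)

    return prob


def perplexity(tokens, prob):
    """Exponentiated average negative log-probability per word."""
    avg_neg_log = -sum(math.log(prob(t)) for t in tokens) / len(tokens)
    return math.exp(avg_neg_log)


# Hypothetical cut-off: anything more predictable than this gets flagged.
HYPOTHETICAL_THRESHOLD = 10.0


def looks_machine_written(tokens, prob, threshold=HYPOTHETICAL_THRESHOLD):
    """Flag text as 'AI-like' purely because it is predictable."""
    return perplexity(tokens, prob) < threshold


# Made-up reference text standing in for the model's training data.
reference = "the cat sat on the mat the dog sat on the rug".split()
prob = train_unigram(reference)

# Two human-written sentences: one plain, one with rarer vocabulary.
plain = "the cat sat on the mat".split()
ornate = "a feline reclined upon the tapestry".split()

plain_flagged = looks_machine_written(plain, prob)
ornate_flagged = looks_machine_written(ornate, prob)
```

Both sentences here are human-written, yet only the plain one falls below the cut-off. Nothing in the score knows who wrote the text. It knows only how predictable the words were.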

The consequences of this failure are not abstract. Student discipline for AI misconduct increased 33% globally between 2022 and 2026, even as detection capacity expanded and false positive rates remained stubbornly high. At institutions using Turnitin’s AI detection function, international students, the population most likely to be writing in a second or third language, faced false positive rates two to six times higher than native English speakers did.

In the United States, academic misconduct findings carry consequences that extend well beyond grades: international students found in violation of academic integrity policies face the threat of visa cancellation and deportation. A false positive is not an inconvenience. It is a life-altering event.

Major institutions began acting on the evidence. UCLA, UC San Diego, Cal State LA, Vanderbilt, Cornell, Pittsburgh, and Iowa all deactivated AI detectors between 2024 and 2025, citing accuracy concerns and equity risks. Harvard, Oxford, and the University of Michigan moved from categorical bans to disclosure frameworks, treating AI use as an attribution and transparency question rather than a binary permitted or prohibited one. The University of the Free State’s decision follows this global trajectory, but it does so with a clarity of purpose that deserves attention. UFS Deputy Vice-Chancellor Prof Anthea Rhoda stated it directly: “Rather than relying primarily on technologies whose outcomes remain contested within global higher education contexts, we are reaffirming the importance of academic judgement, transparent assessment practices, and the responsible use of generative AI”.

INSIGHT AND ANALYSIS

South Africa’s higher education context makes the equity argument more acute, not less. A significant proportion of South African university students are studying in English as a second, third, or fourth language, having been schooled primarily in an African language. Many come from under-resourced secondary schools where English proficiency was not systematically developed. They arrive at university already navigating a profound linguistic disadvantage, and they are now being assessed by tools whose design parameters treat that disadvantage as evidence of dishonesty. That is not a technical imperfection. It is a structural injustice embedded in the architecture of the tool itself.

Prof Francois Strydom of the UFS Centre for Teaching and Learning named the necessary reframe: “The conversation around AI in higher education cannot only be about detection. It must also focus on how we design meaningful learning experiences and assessments that encourage critical engagement, creativity, reflection, and responsible knowledge production in an AI-enabled society.” That is the right argument, and it has implications far beyond academic integrity. Assessment design that rewards genuine thinking, original engagement, and the development of voice is an assessment design that AI cannot shortcut — not because AI cannot produce plausible-sounding output, but because the assessment is designed to surface things that plausible-sounding output does not contain.

I have previously written about the question of who bears the cost when an organisation makes a strategic choice to adopt AI, and the growing legal and governance consensus that those costs cannot simply be transferred to the individuals who had no say in the decision. The detection debate sits inside that same argument, applied to education. When institutions respond to AI in education primarily through surveillance and enforcement rather than through a genuine rethinking of what learning is for, they do not protect the integrity of education. They protect its appearance. The students who find ways around the detector graduate with credentials that certify nothing. The students who are falsely accused by it lose everything. Neither outcome serves the institution's actual purpose, and the cost of the institutional failure, as in the workplace, falls on the people least able to absorb it.

IMPLICATIONS

The detection arms race has been expensive in ways that extend beyond its documented injustices. US institutions were spending between $2,768 and $110,400 per year on AI detection tools, depending on their size and contract terms. Many of them subsequently deactivated the tools, having purchased something that did not work as advertised and that exposed them to significant liability for false accusations. South African institutions operating on constrained budgets cannot afford to waste resources on tools that compound existing inequities. The money spent on detection is money not spent on the assessment redesign, oral examination infrastructure, and academic development capacity that actually produce the outcomes the institution exists to deliver.

The shift to authentic assessment is not a soft option. Oral defences, reflective journals, iterative drafts, and in-class writing components are more demanding of academic staff time and more difficult to scale than a Turnitin submission. They require investment in staff development, assessment design expertise, and the institutional culture that supports genuine engagement between academics and students rather than administrative processing of submissions. The UFS has committed to expanding support structures for staff and students during the transition, including assessment redesign support and AI-related learning modules through its Digital Skills and Competency Framework. That commitment needs to be resourced, not simply announced.

For other South African universities still operating AI detection tools, the question is not whether the UFS decision is correct. The evidence that it is correct has been available since 2023. The question is what institutional risk calculation is keeping those tools in place, whose interests that calculation serves, and whether the students being falsely accused under it have any meaningful avenue for redress. Academic integrity frameworks that apply disproportionate suspicion to the most linguistically vulnerable students in the institution are not protecting integrity. They are encoding a particular form of structural disadvantage into the machinery of assessment itself.

CLOSING TAKEAWAY

The UFS decision is the right one, and it is overdue. Dropping an AI detection tool that was failing its students is not a surrender to AI or a retreat from academic integrity. It is an acknowledgement that integrity cannot be built on a foundation of algorithmic bias, and that the question education needs to be asking is not whether a student used AI to produce their submission, but whether the submission demonstrates that learning actually happened.

South African students writing in their third language deserve to be assessed on what they know and what they can do, not on whether their English prose patterns match the statistical expectations of a tool trained on native speaker writing. They always deserved that. The algorithm simply made the failure to provide it impossible to ignore.

Johan Steyn is a prominent AI thought leader, speaker, and author with a deep understanding of artificial intelligence’s impact on business and society. He is passionate about ethical AI development and its role in shaping a better future. Find out more about Johan’s work at https://www.aiforbusiness.net

