A doctoral researcher at a Max Planck Institute in Munich finishes a draft chapter at midnight. She used ChatGPT to polish the language and Claude to refactor a Python script. Tomorrow she will submit a peer review for a Wiley journal. Next week she will help her supervisor finalize a DFG proposal. Three tasks, one researcher, and at least five different rulebooks—each pointing in a slightly different direction.
Welcome to the AI compliance jungle of German academic life in 2026.
A Thin Canopy of Consensus
Spend an afternoon clicking through the AI policies of the Deutsche Forschungsgemeinschaft (DFG), the Max Planck Society, the Helmholtz Association, the Hochschulrektorenkonferenz (HRK), and a dozen universities, and you will come away convinced that German science has lost its bearings. The documents use different definitions, carry different binding force, and occasionally set directly contradictory rules.
There is a consensus, but it is thinner than it first appears. Across the DFG, the EU AI Act, major publishers, and leading research-integrity bodies, five principles recur with striking regularity—though not always with the same legal force, wording, or operational detail.
First: no AI authorship. Major publishers and leading editorial-ethics bodies draw the same line: authorship requires accountability. A large language model cannot approve a final manuscript, answer for errors, or take responsibility for corrections. It therefore cannot be listed as an author.
Second: humans remain accountable. Whoever uses AI remains responsible for accuracy, integrity, originality, and compliance. “The model said so” is not a defense.
Third: relevant AI use must be disclosed. The shared premise is transparency: if AI materially shapes the scientific content, readers, reviewers, or funders should be told.
Fourth: confidential and sensitive material stays out of public AI tools. Manuscripts under review, unpublished proposals, and personal or sensitive data do not belong in public chatbots. This is the clearest operational line in the system.
Fifth: AI literacy is an institutional responsibility. From the EU AI Act to the HRK, the Kultusministerkonferenz (KMK), and university guidance, the direction is clear: institutions must help researchers, reviewers, teachers, and students use AI responsibly.
Notice, however, that this consensus stops short where it matters most: at the practical level. The dominant academic use cases are everyday writing and research support: drafting an introduction, polishing a paragraph, translating a paper, summarizing literature, fixing grammar, refactoring code. And it is exactly here, in the most common tasks, that the jungle is densest. What counts as relevant use, what must be declared, where, in what format, and what follows from omission is left to dozens of overlapping rulebooks. There is no 80- or 90-percent safe zone for the everyday researcher. There is a thin outer frame, and inside it, a jungle.
What is even more striking is how rarely the guidelines sound like encouragement. In the sources reviewed for this article, only a small minority of institutions explicitly invite researchers to use AI as a tool for improving research quality. Far more common is a defensive-regulatory approach: AI is permitted or tolerated, but primarily constrained through duties of transparency, responsibility, data protection, and confidentiality.
The DFG Cuts a Cleaner Path — but the Old Markers Stand
Consider the DFG, Germany’s most powerful research funder. In September 2023, its presidium issued a stark verdict on AI in peer review. The exact wording is worth pausing on: “in the preparation of reviews, the use of generative models is impermissible with regard to the confidentiality of the review process. Documents made available for review are confidential and may in particular not be used as input for generative models.”
Read carefully, that 2023 sentence is not really an objection to AI. It is an objection to an upload. The reasoning is entirely about where the manuscript goes—into a public LLM whose providers may store, train on, or otherwise lose control of confidential text. The technology is mentioned only because it was the most plausible vehicle for the data leak. Strip the confidentiality concern away, and there is no DFG argument left against AI in review work.
In 2023, that distinction was easy to miss: public ChatGPT was the dominant face of generative AI, and “AI in peer review” effectively meant “uploading a confidential manuscript to OpenAI.” The ban was reasonable shorthand, just aimed at the wrong target.
The DFG has now corrected itself, and elegantly. Form 4.04, in force since April 16, 2026, replaces the flat ban with a conditional permission built around four principles—confidentiality, transparency, quality assurance, and responsibility—and operationalizes confidentiality the way it should have been operationalized from the start: by regulating the data path rather than the technology. AI is permitted in reviewing only on systems of three types: locally hosted, hosted by a trusted institution, or cloud-based with a contractual data security guarantee. Reviewers must give active consent in the elan portal and disclose use when the review is submitted.
This is the right move. The new rule keeps the original concern intact—confidential documents must not leak into uncontrolled systems—while abandoning the false premise that the technology itself is the threat. A reviewer running a locally hosted open-source model on her own institute’s GPU cluster, or using an institutional chat interface, does not create the same confidentiality risk if properly controlled; the manuscript never leaves the trusted environment. Treating that case identically to a paste into a public chatbot was always a category error. The 2026 framework names the actual variable—data egress—and lets the AI question follow from it.
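The data-path logic described above can be written down as a short checklist. What follows is a minimal sketch, assuming only what this article reports about Form 4.04: three permitted system types, consent in the elan portal, and disclosure on submission. The names Hosting, ReviewAiPlan, and open_issues are hypothetical, and the sketch is an illustration of the reasoning, not the DFG's wording or a legal test.

```python
from dataclasses import dataclass
from enum import Enum


class Hosting(Enum):
    # The three permitted system types as summarized in this article,
    # plus the case the 2023 ban was really aimed at.
    LOCAL = "locally hosted"
    TRUSTED_INSTITUTION = "hosted by a trusted institution"
    CLOUD_WITH_GUARANTEE = "cloud-based with a contractual data security guarantee"
    PUBLIC_CHATBOT = "public chatbot without such a guarantee"


# Assumption: any system outside these three types counts as uncontrolled.
PERMITTED_HOSTING = frozenset(
    {Hosting.LOCAL, Hosting.TRUSTED_INSTITUTION, Hosting.CLOUD_WITH_GUARANTEE}
)


@dataclass(frozen=True)
class ReviewAiPlan:
    """Hypothetical description of how a reviewer intends to use AI."""
    hosting: Hosting
    consent_given: bool        # active consent in the portal
    disclosure_planned: bool   # AI use disclosed when the review is submitted


def open_issues(plan: ReviewAiPlan) -> list[str]:
    """Return the conditions the plan does not yet meet (empty list = none found)."""
    issues = []
    if plan.hosting not in PERMITTED_HOSTING:
        issues.append(f"data path not permitted: {plan.hosting.value}")
    if not plan.consent_given:
        issues.append("active consent is missing")
    if not plan.disclosure_planned:
        issues.append("AI use must be disclosed on submission")
    return issues
```

The first check asks where the manuscript goes, not which model is used. That ordering is the point of the 2026 framework as described here: a locally hosted model and a public chatbot differ in the hosting field alone, and that difference decides the outcome.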
Helmholtz’s AI recommendations show how quickly institutional guidance can become dated. Issued in September 2024, they advised against AI support in evaluations and reviews and referred to the DFG’s then-current position. Since the DFG changed its review guidance in April 2026, Helmholtz has not publicly updated its recommendations. The result is an update gap: Helmholtz’s guidance remains more restrictive than the current DFG framework while still pointing back to the DFG.
Where the Paths Fork: Publishers Disagree on Copy Editing
Now imagine you ran your draft through ChatGPT for nothing more than language polishing. Must you say so?
Publisher rules split sharply on this point. Springer Nature says AI-assisted copy editing need not be declared; De Gruyter Brill takes a similar view for simple proofreading and copy editing. Wiley is stricter: AI use in developing any part of a manuscript must be disclosed, and ACS and the ICMJE recommendations point in the same direction. The same narrow act of AI-assisted polishing may therefore be declaration-free at a Springer Nature title but declaration-relevant at a Wiley title.
A second contradiction lurks in peer review itself. The DFG now permits supportive AI use under its four principles. Wiley allows AI to improve reviewer feedback but bans manuscript uploads. Springer Nature requires disclosure of any AI involvement in evaluation. Horizon Europe permits AI only for ancillary tasks. The European Research Council is the most restrictive of all: AI may not summarize proposals, assess merit, or draft reviews, and proposal content may never be uploaded to external systems. The same reviewer workflow can therefore be permissible in one process, disclosure-required in a second, and outright misconduct in a third.
Five Microclimates: The University Landscape
If the publisher world is patchy, the German university world is genuinely wild. For this article, AI rules and guidance from forty-four German universities were reviewed along two axes: who sets the rule, and what the rule says. Some universities regulate centrally; others leave the matter to faculties, departments, or examiners. In practice, the landscape falls into five categories: outright bans, default bans with permission exceptions, default permission with examiner exceptions, permission with mandatory declaration, and gaps where no public central AI rule could be verified.
On the first trail — authority — the picture is split. Some universities have moved toward central rules or formal university-wide guidance — Augsburg, Düsseldorf, TU Braunschweig, Cologne, Mainz, and Leipzig among them. Others govern AI use through faculties, departments, or examiners: LMU Munich has no single university-wide BA/MA standard, at TUM many rules are organized at School or chair level, and at FU Berlin central guidance coexists with subject-specific rules. Above them, state legislatures have mostly stayed out of the operational details. Even where AI appears in higher-education law, as in Rheinland-Pfalz, it is framed as a general institutional task rather than a concrete examination rule.
The second trail is substance: what the rule actually says. The extreme case is the hard line: outright bans on specific forms of generative text production. These are the exception, not the rule, and are found mostly at faculty or institute level — for example, at KIT’s physics faculty, Bonn’s philosophy faculty, and the FU Berlin Institute for Communication Studies. In these settings, unauthorized AI use can trigger the ordinary machinery of examination misconduct.
A less absolute version keeps the default ban but adds an escape hatch. Hochschule Trier is the cleanest statutory case: AI applications that automatically generate content are generally impermissible aids unless expressly allowed. At TUM and RWTH Aachen, the same logic appears in less codified form: AI use in written work depends on explicit permission from the relevant examiner, task, or rule. Silence does not mean permission.
A third variant starts from permission, but lets examiners narrow the field. HU Berlin is the cleanest example: its central recommendations say that AI may generally be used, while allowing its use to be restricted or barred in individual courses or exams. Heidelberg points in a similar direction, but more cautiously: its university-wide guidance treats AI as usable when handled responsibly and transparently, while leaving concrete thesis rules to faculties and examination regulations. Here, silence leans toward permission — but not toward unlimited freedom.
The most common bargain is permission with mandatory declaration, but even that does not mean general permission. Universities and faculties typically allow particular uses (language polishing, translation, coding support, literature structuring, or formatting help) only insofar as they do not replace the student’s own academic work. FAU, FU Berlin, Leipzig, KIT, LMU subfields, and several medical or natural-science faculties follow versions of this approach. Here, the machine is not the problem; undeclared or excessive use is.
And then there are the regulatory gaps: the Charité at the BA/MA level, the University of the Bundeswehr Munich, and the University of Rostock among them. That does not mean AI is free. It means no publicly verifiable central AI rule was found for this context, so the answer falls back to older machinery: permitted aids, declarations of independent work, general misconduct rules, and whatever the responsible examiner or faculty has put in writing.
The volatility is real. Saarland University’s economics unit banned generative text AI for seminar papers and theses in June 2024, then reversed that prohibition in July 2025 and moved to chair-level discretion. But volatility does not only mean rules changing over time: within a single graduate program — for example, an International Max Planck Research School, where doctoral candidates may be enrolled at Tübingen, Stuttgart, or any of several other partner universities — the rules can differ depending on which university grants the degree. A neighbor in the next office, working on the same project under the same supervisor at the same research institute, can be operating under genuinely different obligations.
Where lawmakers and universities leave gaps, courts have begun to fill them. The Munich Administrative Court held in November 2023 (M 3 E 23.4371) that AI use in an admissions essay could be treated as the preparation of an examination paper by a third person; the Kassel Administrative Court reinforced the principle in February 2026, treating substantial undeclared AI use in student work as deception even without a specific AI prohibition. Silence is not a safe harbor.
A Rule with No Trail to Follow
Complete prompt documentation, with every input and every model version logged and disclosed, sounds like the gold standard for AI transparency. In practice, it is both illusory and unworkable. The Leibniz Association’s recommendation of November 29, 2024 comes closest to this model, asking for disclosure of AI use including software, version, and full prompt documentation. Some university guidance, including at FAU, also moves toward documentation of AI use depending on the assessment format. FU Berlin and the World Association of Medical Editors point in a similar direction, even if not every rule has the same binding force or operational detail.
The first problem is that the record does not deliver what it promises. A prompt log shows what was typed, but unless it also captures the output, model version, date, settings, and system context, it does not let another reader reconstruct the interaction. Even then, AI outputs are not reliably reproducible: the same prompt can yield different texts. OWID’s FAQ captures the policy problem from the other side: transparency is expected, but its exact form remains unevenly defined. The transparency the rule promises is therefore thinner than it looks.
The second problem is volume. Even the visible, human-typed prompts add up quickly. The disclosure at the end of this article documents nineteen deep-research reports and several cross-comparison dialogues; a conservative estimate of the user-facing prompts behind this single short article is 100 to 150. A doctoral thesis with a serious AI-assisted workflow over three or four years could plausibly run into the thousands. And that is only the visible layer. Modern deep-research tools expand a single user prompt into dozens or hundreds of internal sub-queries; reasoning models may generate intermediate self-instructions; and agentic systems can search, click, retrieve, compare, and execute tasks without each step appearing as a user-facing prompt. At that point, even the term “prompt” becomes unstable: does it mean only the human instruction, or also the system’s hidden reformulations, planning steps, tool calls, and search queries? The real interaction record is therefore partly inaccessible by design. Complete prompt documentation under those conditions is not principled rigor; it is paperwork the technology has already made impossible.
Finding a Path Through
For a working researcher, the practical compass is more sobering than the early discourse suggested. The five-point consensus is real, but it is a perimeter, not a path. Inside that perimeter, every concrete writing decision — polish or rewrite, declare or do not declare, methods or acknowledgments, footnote or none — depends on the specific rulebook in force for that task, journal, funder, university, or institute. And that rulebook may have changed since the last time you checked.
Three habits reduce the risk. Treat the strictest applicable disclosure threshold as your working default; it is usually easier to explain a careful declaration than to defend an omission. Check the most recent version of the rule that actually governs each task — not last year’s PDF, not your supervisor’s recollection, not the policy of a sister institution. And keep a running log of meaningful AI interactions as they happen, so a disclosure paragraph can be assembled in minutes rather than reconstructed from memory under deadline pressure.
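The third habit needs very little tooling. Below is a minimal sketch of such a running log, assuming one JSON line per meaningful interaction and a log file kept next to the project. The function names, the field names, and the wording of the generated sentence are all hypothetical; the rulebook that governs the task decides what a real disclosure must contain.

```python
import json
from datetime import date
from pathlib import Path


def log_ai_use(log_path, tool, model, task, purpose, day=None):
    """Append one AI interaction to a JSON Lines file and return the entry."""
    entry = {
        "date": (day or date.today()).isoformat(),
        "tool": tool,        # e.g. the chat product used
        "model": model,      # model version, as far as it is known
        "task": task,        # the piece of work the interaction belongs to
        "purpose": purpose,  # what the AI was used for
    }
    with Path(log_path).open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(entry, ensure_ascii=False) + "\n")
    return entry


def disclosure_for(log_path, task):
    """Assemble a one-sentence draft disclosure for a single task from the log."""
    path = Path(log_path)
    if not path.exists():
        return f"No AI use was logged for {task}."
    lines = path.read_text(encoding="utf-8").splitlines()
    entries = [json.loads(line) for line in lines if line.strip()]
    relevant = [entry for entry in entries if entry["task"] == task]
    if not relevant:
        return f"No AI use was logged for {task}."
    # Collapse repeated uses of the same tool for the same purpose.
    uses = sorted(
        {f'{e["tool"]} ({e["model"]}) for {e["purpose"]}' for e in relevant}
    )
    return f"AI use for {task}: " + "; ".join(uses) + "."
```

A call such as log_ai_use("ai_log.jsonl", "ChatGPT", "version not recorded", "thesis chapter 3", "language polishing") takes seconds at the moment of use. The log deliberately records purposes rather than full prompts: it supports a disclosure paragraph, not a reconstruction of every interaction.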
The jungle is real, and for everyday academic writing and review work it is exactly as dense as it looks. But it has paths, and the people who navigate it best are the ones who stop expecting a single map.
Disclosure on AI use and source base: This article is based on a staged AI-assisted research design. Fifteen structured deep-research reports were produced across five thematic blocks: national frameworks; research organizations/MPG; universities; publication, peer review, and research funding; and the legal landscape. Each block was researched independently on ChatGPT (GPT-5.5), Perplexity (using Claude Sonnet 4.6 Thinking), and Claude (Sonnet 4.6 Adaptive). Four additional scoping runs on ChatGPT and Perplexity addressed two narrower questions: rules on AI use generally and rules on LLM-based text generation. Altogether, the nineteen reports drew on more than 200 primary documents, including laws, EU regulations, court rulings, university statutes, publisher policies, ethics-body recommendations, and funder guidelines. Their findings were compared in two independent AI-assisted cross-checks, one with Claude (Opus 4.7 Thinking) and one with ChatGPT (GPT-5.5); divergences and gaps were rechecked against additional sources. Claude then produced the first article draft from the consolidated research base. The draft was reviewed, fact-checked, shortened, and revised through iterative human–AI interaction with Claude (Opus 4.7 Adaptive) and ChatGPT (GPT-5.5). Responsibility for content, framing, and any errors rests with the author.
