Research Notes - Desirable Difficulty in AI-Assisted Learning and Research
Date: 2026-03-10
Search queries used:
- “desirable difficulty learning Robert Bjork cognitive science”
- “desirable difficulty AI-assisted learning research 2024 2025”
- “productive struggle AI tutoring systems cognitive load”
- “desirable difficulty vs undesirable difficulty AI tools over-reliance metacognition”
- “Manu Kapur productive failure AI learning design scaffolding 2024 2025”
- “spacing effect retrieval practice interleaving AI tools research 2025 learning retention”
- “desirable difficulty artificial intelligence research assistance knowledge generation 2025”
Executive Summary
“Desirable difficulty” is a term coined by cognitive psychologist Robert Bjork (UCLA) describing learning conditions that feel harder in the short term but produce superior long-term retention and transfer. Core techniques include spaced practice, interleaving, retrieval practice, and generation tasks. The growing use of AI tools in education and research creates a fundamental tension: AI’s capacity to remove effort — summarising, drafting, answering on demand — systematically eliminates the struggle that makes learning durable. Contemporary research (2024–2026) documents measurable harms: reductions of 22–35% in test scores and overall learning outcomes, a 17.3% decline in critical thinking scores, and neurophysiological evidence of “cognitive debt” accumulating when AI replaces encoding. The central design challenge is distinguishing desirable difficulty (effort that encodes understanding) from undesirable difficulty (friction without learning value), and building AI tools that preserve the former while reducing the latter.
Key Sources
Desirable Difficulties Perspective on Learning — Robert Bjork (UCLA)
- URL: https://bjorklab.psych.ucla.edu/wp-content/uploads/sites/13/2016/07/RBjork_inpress.pdf
- Type: Encyclopedia entry / foundational academic paper
- Key points:
- Manipulations that speed acquisition during instruction can fail to support long-term retention
- Five core desirable difficulties: spacing, interleaving, variability, intermittent feedback, retrieval/generation over presentation
- The word “desirable” is critical: difficulty is only desirable if the learner is equipped to overcome it
- Human memory is reconstructive, not a playback device — retrieval itself is a learning event that alters future accessibility
- Learning and performance are not the same — high current performance can mask low actual learning
- Tenet alignment: Strongly aligns with Symbiotic Intelligence (expanding human capability vs. replacing it) and Human Intent First (orienting systems around human development goals)
- Quote: “Manipulations that speed the rate of acquisition during instruction can fail to support long-term retention and transfer, whereas other manipulations that appear to introduce difficulties and slow the rate of acquisition can enhance post-instruction recall and transfer.”
The Impact of AI on Note Taking in Higher Education — Genio/Glean (Literature Review)
- URL: https://genio.co/resources/research-and-insights/the-impact-of-ai-on-note-taking-in-higher-education
- Type: Literature review / practitioner synthesis
- Key points:
- AI “frictionless outputs” deny learners the encoding opportunities required for mastery
- Three levels of AI usage: low-level (transcription), mid-level (scaffolded support), full task offloading — only low/mid preserve learning
- Full offloading: 35% test score reduction (Chen et al., 2025), 24% reduction (Kreijkes et al., 2026), 22% overall learning outcome reduction (Rohilla, 2025)
- “Fluency illusion”: students mistake smooth AI output for their own understanding
- “Digital amnesia”: brain forms pointers to where information is stored, not the knowledge itself
- High AI dependence correlates with 17.3% reduction in critical thinking scores (Rohilla, 2025)
- 83% of students had difficulty quoting their own AI-assisted written work immediately after writing (Kosmyna et al., 2025)
- Introduces concept of “friction architects” — educators who intentionally design productive struggle while removing mechanical friction
- Tenet alignment: Directly aligns with Symbiotic Intelligence tenet; conflicts with naive “AI as equalizer” narratives
- Quote: “Effortless learning is a deceptive shortcut that leads to poor retention.”
Productive Struggle: How AI Is Changing Learning — Bellwether
- URL: https://bellwether.org/publications/productive-struggle/
- Type: Research publication
- Key points:
- AI-facilitated ease may unlock curiosity and extend time-on-task in some contexts
- Risk: students arrive at right answers without the cognitive processes required for genuine learning
- AI assistance architecture matters — systems that answer immediately vs. those that scaffold productive attempts differ fundamentally
- Tenet alignment: Nuanced — partially aligns with Always Scalable (matching effort to outcomes) and Symbiotic Intelligence
Your Brain on ChatGPT — Kosmyna et al. (MIT Media Lab, 2025)
- URL: https://arxiv.org/abs/2506.08872 (found via the Genio literature review)
- Type: Empirical study
- Key points:
- Frictionless AI bypasses internal encoding entirely — neurophysiological evidence
- Concept of “pedagogical debt”: cognitive shortcuts now create compounding deficits later
- 83% of students could not accurately quote their own AI-assisted essays immediately after writing
- Tenet alignment: Strong evidence base for Symbiotic Intelligence concern about AI replacing rather than extending understanding
Beware of Metacognitive Laziness — Fan et al. (2024)
- URL: https://doi.org/10.1111/bjet.13544 (British Journal of Educational Technology)
- Type: Peer-reviewed study
- Key points:
- Generative AI erodes self-regulatory processes: monitoring, evaluating, synthesising
- Students delegating note-taking receive higher-performing essays but no increase in knowledge transfer or retention
- Metacognitive laziness: offloading the processes that build mastery, not just the mechanical tasks
- Tenet alignment: Aligns with Human Intent First — AI that serves short-term task completion undermines long-term human intent
AI Cognitive Offloading and Implications for Education — UTS (2026)
- URL: https://www.uts.edu.au/.../ai-cognitive-offloading-and-implications-for-education.pdf
- Type: Academic policy paper
- Key points:
- “Generation desirable difficulty” — requiring students to generate answers before seeing them leads to significantly better long-term retention
- AI risks eliminating generation tasks by providing answers on demand
- Tenet alignment: Aligns with Symbiotic Intelligence
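The “generate before seeing” pattern described above can be sketched as a simple gating policy: the system refuses to reveal an answer until the learner has produced an attempt. This is an illustrative sketch only — the class, method names, and refusal message are assumptions, not taken from the UTS paper or any cited system.

```python
# Illustrative "generation-first" tutoring gate: withhold the reference
# answer until the learner has submitted at least one generation attempt.
# All names and policies here are assumptions for illustration.
from dataclasses import dataclass, field


@dataclass
class GenerationGate:
    question: str
    reference_answer: str
    attempts: list[str] = field(default_factory=list)

    def submit_attempt(self, attempt: str) -> str:
        """Record the learner's generated answer, then reveal the reference."""
        self.attempts.append(attempt)
        # Only after a genuine generation attempt does the system reveal
        # the answer, so retrieval/generation happens before presentation.
        return self.reference_answer

    def ask_for_answer(self) -> str:
        """Refuse direct answers until the learner has generated one."""
        if not self.attempts:
            return "Try writing your own answer first - even a guess counts."
        return self.reference_answer


gate = GenerationGate("What is the spacing effect?",
                      "Distributing practice over time improves retention.")
print(gate.ask_for_answer())   # no attempts yet: prompts for a try instead
gate.submit_attempt("Spreading study sessions out helps memory.")
print(gate.ask_for_answer())   # now reveals the reference answer
```

The design point is that refusal is the default and disclosure is earned by an attempt — inverting the frictionless answer-on-demand pattern the critical research documents.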
Major Positions
Position 1: Desirable Difficulty as Essential Mechanism (Bjork; Cognitive Science Consensus)
- Proponents: Robert Bjork, Elizabeth Bjork (UCLA Bjork Memory & Forgetting Lab); John Dunlosky; Mark McDaniel
- Core claim: Conditions that slow apparent learning and feel harder actually produce deeper encoding, stronger retrieval routes, and better transfer to novel contexts
- Key arguments:
- Memory is reconstructive — retrieval processes trained during learning must match demands of later application
- “Transfer-appropriate processing”: difficulty works when it exercises the same retrieval/application processes needed post-instruction
- Generation, spacing, interleaving, and retrieval practice all work through this mechanism
- Performance during acquisition is a poor proxy for learning
- Relation to site tenets: Foundational — establishes that “symbiotic” AI must preserve these mechanisms, not eliminate them. Supports Human Intent First: what learners want (answers) often differs from what they need (encoding)
Position 2: AI Eliminates Desirable Difficulty (Critical Education Research)
- Proponents: Rohilla (2025); Chen et al. (2025); Kosmyna et al. (2025); Fan et al. (2024); Krsmanović & Deek (2025)
- Core claim: Current AI tools, particularly generative AI used for note-taking, essay writing, and question-answering, systematically remove the productive friction that makes learning stick
- Key arguments:
- Cognitive offloading to AI produces “digital amnesia” — biological pointers without knowledge
- “Fluency illusion” — smooth AI output is mistaken for understanding
- Quantified harms: 17–35% declines in retention, critical thinking, task initiation skills
- Students entering AI-free test environments experience “inertia” — cannot function without AI scaffold
- Relation to site tenets: Challenges naive Symbiotic Intelligence framing — symbiosis requires both partners contributing meaningfully; AI absorbing the human’s generative role is not symbiosis
Position 3: AI Can Implement Desirable Difficulty (Optimistic Design Research)
- Proponents: Thompson & Hughes (2023); Intelligent tutoring system researchers; Manu Kapur (productive failure design)
- Core claim: AI systems can be designed to implement desirable difficulties — adaptive spacing, interleaving, retrieval quizzing, scaffolded productive failure
- Key arguments:
- Adaptive systems can implement spacing and interleaving policies; schedule retrieval practice
- Manu Kapur’s “productive failure” model: structured exploration before instruction produces deeper learning — AI can scaffold this without bypassing it
- Intelligent tutoring systems that resist giving direct answers and instead prompt students to attempt generation first
- The problem is not AI per se, but AI configured for frictionless output rather than learning support
- Relation to site tenets: Strongly aligns with Symbiotic Intelligence — AI as the mechanism that ensures productive struggle at the right level of difficulty for each learner; also aligns with Always Scalable (right-sizing effort to context)
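One concrete form of “AI that implements desirable difficulty” is an adaptive spacing policy. The sketch below uses a simplified expanding-interval rule loosely in the spirit of SM-2-style schedulers; the parameter values and names are illustrative assumptions, not tuned figures from any cited system.

```python
# Minimal expanding-interval spacing scheduler, loosely in the spirit of
# SM-2-style algorithms. Parameters are illustrative assumptions.
from dataclasses import dataclass


@dataclass
class SpacedItem:
    interval_days: float = 1.0   # days until the next retrieval attempt
    ease: float = 2.5            # interval growth factor on success

    def review(self, recalled: bool) -> float:
        """Update the schedule after a retrieval attempt; return new interval."""
        if recalled:
            # Successful effortful retrieval: expand the interval so the
            # next attempt lands when recall is difficult but possible.
            self.interval_days *= self.ease
        else:
            # Failed retrieval: reset the interval and ease off slightly,
            # keeping difficulty desirable rather than demoralizing.
            self.interval_days = 1.0
            self.ease = max(1.3, self.ease - 0.2)
        return self.interval_days


item = SpacedItem()
item.review(True)    # interval grows: 1.0 -> 2.5 days
item.review(True)    # 2.5 -> 6.25 days
item.review(False)   # lapse: back to 1.0 day, ease reduced
```

The scheduler deliberately pushes each retrieval toward the edge of forgetting — the point Bjork identifies as where retrieval acts as a learning event — rather than minimizing effort per review.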
Position 4: Desirable Difficulty Is Context-Dependent (Nuanced View)
- Proponents: Bjork (implied in original framework); Skulmowski (2023); nuanced education researchers
- Core claim: Not all difficulty is desirable. Whether a difficulty is desirable depends on whether the learner is equipped to overcome it — otherwise it becomes an undesirable difficulty that demoralizes without encoding
- Key arguments:
- Undesirable difficulties are those the learner cannot overcome with available resources — they demoralize without producing learning
- For novices, some scaffolding is necessary before difficulties are productive
- AI can play a legitimate role in reducing extraneous load (mechanical capture, formatting, finding sources) while preserving germane load (synthesis, evaluation, generation of meaning)
- The “sweet spot” concept: sufficient challenge to activate encoding, not so much as to overwhelm
- Relation to site tenets: Aligns with Pluralism of Perspectives — different learners, contexts, and domains require different friction levels. Supports nuanced Always Scalable framing
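The “sweet spot” idea can be made operational as a feedback controller that keeps a learner’s recent success rate inside a target band — scaffolding down when difficulty becomes undesirable, adding challenge when it becomes too easy to drive encoding. The band boundaries, window size, and step size below are illustrative assumptions, not empirical values.

```python
# Sketch of a "sweet spot" controller: nudge an abstract difficulty level
# up or down to hold the recent success rate inside a target band.
# Band, window, and step sizes are illustrative assumptions.
from collections import deque


class DifficultyController:
    def __init__(self, low=0.6, high=0.85, window=10):
        self.low, self.high = low, high        # target success-rate band
        self.results = deque(maxlen=window)    # rolling attempt outcomes
        self.level = 1.0                       # abstract difficulty level

    def record(self, success: bool) -> float:
        """Log an attempt outcome and return the adjusted difficulty level."""
        self.results.append(success)
        rate = sum(self.results) / len(self.results)
        if rate > self.high:
            # Too easy: add challenge so encoding stays effortful.
            self.level += 0.1
        elif rate < self.low:
            # Too hard: scaffold down before difficulty demoralizes.
            self.level = max(0.1, self.level - 0.1)
        return self.level


ctrl = DifficultyController()
for _ in range(5):
    ctrl.record(True)    # a success streak pushes difficulty upward
```

This is the inverse of a frictionless assistant: instead of minimizing struggle, the controller regulates it, which is what makes the difficulty “desirable” per learner and per context.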
Key Debates
Debate 1: Is AI fundamentally incompatible with desirable difficulty?
- Sides: Critical researchers (AI eliminates productive struggle by design) vs. optimistic designers (AI can scaffold productive struggle if intentionally configured)
- Core disagreement: Whether the dominant mode of AI use (answer on demand) is an inherent feature or a design choice that can be corrected
- Current state: Ongoing — empirical evidence documents harms from current tools; design-oriented researchers argue for AI tutoring systems built on productive failure principles
Debate 2: Who is responsible for preserving difficulty — tool designers or educators/users?
- Sides: “Friction architect” educators (users must intentionally reintroduce difficulty) vs. systemic designers (tools should embed difficulty by default, not require it to be manually restored)
- Core disagreement: Whether the default should be frictionless-with-optional-friction or friction-preserving-with-optional-shortcuts
- Current state: Contested — Genio argues for institutional/educator responsibility; design researchers push for built-in scaffolding
Debate 3: Does the “desirable difficulty” framework apply to professional/expert knowledge work?
- Sides: Education researchers (framework developed in formal learning contexts) vs. knowledge work researchers (similar mechanisms govern expert skill maintenance and creative insight)
- Core disagreement: Whether professionals using AI for research, synthesis, and design face analogous risks of “skill atrophy” and “cognitive debt”
- Current state: Nascent — skill atrophy research (Krsmanović & Deek, 2025) and cognitive debt literature (Kosmyna et al., 2025) suggest yes, but direct research on creative professionals is limited
Historical Timeline
| Year | Event/Publication | Significance |
|---|---|---|
| 1992 | Schmidt & Bjork — “New conceptualizations of practice” | Established desirable difficulties framework in motor learning |
| 1999 | Bjork — “Assessing our own competence: Heuristics and illusions” | Extended to metacognitive illusions — we misjudge our learning |
| 2008 | Manu Kapur — “Productive Failure” first published | Showed structured failure before instruction outperforms direct instruction |
| 2011 | Bjork & Bjork — “Making things hard on yourself, but in a good way” | Canonical accessible synthesis of desirable difficulties for educators |
| ~2016 | Bjork — Encyclopedia entry formalizing “desirable difficulties perspective” | Cross-referenced with memory, forgetting, spacing, and transfer literature |
| 2021 | Grinschgl et al. — cognitive offloading consequences | Showed offloading boosts performance but diminishes memory |
| 2023 | Skulmowski — “Cognitive externalization in digital learning” | Applied cognitive load framework to digital/AI tool use |
| 2024 | Fan et al. — “Beware of metacognitive laziness” | Documented metacognitive erosion from generative AI use |
| 2024 | Manu Kapur — “Productive Failure” book (Wiley) | Brought productive failure framework to mainstream audience |
| 2025 | Rohilla — AI tool usage and cognitive ability | Quantified critical thinking decline (17.3%) |
| 2025 | Chen et al. — AI note-taking and cognitive engagement | Showed inverse relationship between AI assistance level and cognitive engagement |
| 2025 | Kosmyna et al. — “Your Brain on ChatGPT” | Neurophysiological evidence of cognitive debt from AI essay assistance |
| 2026 | Kreijkes et al. — LLM use and reading comprehension | RCT in secondary schools: AI note-taking reduced comprehension by 24% |
Potential Article Angles
Based on this research, an article could:
- “The Friction Paradox: Why AI Tools That Feel Most Helpful May Harm Your Thinking” — Explains the desirable difficulty framework and maps how current AI workflows eliminate each mechanism (spacing → instant recall; retrieval → instant answer; generation → instant draft). Aligned with Symbiotic Intelligence tenet: what feels like augmentation may be replacement.
- “Friction Architect: A Design Principle for AI Tools in Knowledge Work” — Proposes a design framework for AI systems that remove extraneous load (mechanical, organizational friction) while preserving germane load (synthesis, generation, evaluation). Connects to Always Scalable and Human Intent First tenets — the right level of difficulty for the right phase of work.
- “Cognitive Debt and the Knowledge Worker” — Extends the pedagogical debt concept to professional creative and strategic work. What competencies are knowledge workers gradually losing to AI offloading, and what practices restore them? Connects to Symbiotic Intelligence and existing site research on AI intensification and the oversight tax.
When writing the article, follow obsidian/project/writing-style.md for:
- Named-anchor summary technique for forward references
- Background vs. novelty decisions (what to include/omit)
- Tenet alignment requirements
- LLM optimization (front-load important information)
Gaps in Research
- Professional knowledge work context: Most empirical evidence comes from formal education (students). Direct research on designers, strategists, and researchers experiencing skill atrophy from AI use is thin.
- Longitudinal data: Most studies measure short-term retention effects; the long-term trajectory of “pedagogical debt” across professional careers is unstudied.
- Distinguishing tool types: Most research treats “AI” as monolithic. Studies comparing prompt-and-answer LLMs vs. AI-scaffolded retrieval systems vs. Socratic AI tutors would sharpen the picture.
- Recovery protocols: Little research on how practitioners can detect and reverse cognitive debt once accumulated.
- Cross-cultural variation: 94% of AI learning tool studies are in Western industrialized countries (Luo et al., 2025) — unclear whether desirable difficulty mechanisms are culturally universal in application.
Citations
Bjork, R. A. (in press). Desirable difficulties perspective on learning. In A. S. Benjamin (Ed.), Successful remembering and successful forgetting: A Festschrift in honor of Robert A. Bjork. Psychology Press. https://bjorklab.psych.ucla.edu/wp-content/uploads/sites/13/2016/07/RBjork_inpress.pdf
Bjork, E. L., & Bjork, R. A. (2011). Making things hard on yourself, but in a good way: Creating desirable difficulties to enhance learning. In M. A. Gernsbacher et al. (Eds.), Psychology and the real world (pp. 56–64). Worth Publishers.
Chen, X., Ruan, K., Ju, K. P., Yap, N., & Wang, X. (2025). More AI assistance reduces cognitive engagement: Examining the AI assistance dilemma in AI-supported note-taking. Proceedings of the ACM on Human-Computer Interaction, 9(CSCW2), Article 451. https://doi.org/10.1145/3757632
Fan, Y., Tang, L., Le, H., Tan, S., Rong, J., Rakovic, M., Tsai, Y.-S., & Gašević, D. (2024). Beware of metacognitive laziness: Effects of generative artificial intelligence on learning motivation, processes, and performance. British Journal of Educational Technology, 56(2), 489–530. https://doi.org/10.1111/bjet.13544
Genio. (2025). The impact of AI on note taking in higher education [Literature review]. https://genio.co/resources/research-and-insights/the-impact-of-ai-on-note-taking-in-higher-education
Grinschgl, S., Papenmeier, F., & Meyerhoff, H. S. (2021). Consequences of cognitive offloading: Boosting performance but diminishing memory. Quarterly Journal of Experimental Psychology, 74(9), 1477–1496. https://doi.org/10.1177/17470218211008060
Kapur, M. (2008). Productive failure. Cognition and Instruction, 26(3), 379–424.
Kapur, M. (2024). Productive failure: Unlocking deeper learning through the science of failing. Wiley.
Kosmyna, N., Hauptmann, E., Yuan, Y. T., Situ, J., Liao, X., Beresnitzky, A. V., Braunstein, I., & Maes, P. (2025). Your brain on ChatGPT: Accumulation of cognitive debt when using an AI assistant for essay writing task. arXiv preprint arXiv:2506.08872.
Kreijkes, P., Kewenig, V., Kuvalja, M., Lee, M., Hofman, J. M., Vitello, S., Sellen, A., Rintel, S., Goldstein, D. G., Rothschild, D., Tankelevitch, L., & Oates, T. (2026). Effects of LLM use and note-taking on reading comprehension and memory: A randomised experiment in secondary schools. Computers & Education, 243, 105514. https://doi.org/10.1016/j.compedu.2025.105514
Krsmanović, M., & Deek, F. (2025). The impact of AI on cognitive skills in higher education. Journal of Higher Education Theory and Practice, 25(1), 52–63.
Rohilla, A. (2025). Impact of excessive AI tool usage on the cognitive abilities of undergraduate students: A mixed method study. Advance Social Science Archive Journal, 4(1), 2131–2143. https://doi.org/10.55966/assaj.2025.4.1.0115
Schmidt, R. A., & Bjork, R. A. (1992). New conceptualizations of practice: Common principles in three paradigms suggest new concepts for training. Psychological Science, 3, 207–217.
Skulmowski, A. (2023). Cognitive externalization in digital learning: A cognitive load perspective. Educational Psychology Review, 35, 101. https://doi.org/10.1007/s10648-023-09818-1