
Kathryn Conrad / https://betterimagesofai.org / https://creativecommons.org/licenses/by/4.0/
The explosion of generative AI in education over the past two years has forced universities to confront a fundamental question: what are we, as educators, actually assessing? This is not a question about how to catch students using ChatGPT or whether we should ban AI tools. Rather, it is a question about the entire paradigm through which we evaluate learning.
The uncomfortable reality is that students are already using generative AI widely in their assignments. According to recent data from a Nature article on assessment in the AI era [1], 92% of UK undergraduates are now using AI in some form. Yet universities continue to rely on traditional exams and essays to evaluate learning, along with other final deliverables such as code, data analyses, or presentation slides. These formats are not inherently the problem. The problem is that traditional higher education assessment is designed to evaluate the product of learning rather than to understand the learning process. With product-based assessment, we cannot know how much of a final deliverable reflects the student’s actual thinking and critical engagement when AI has participated in its creation.
Focusing solely on the product of learning stems from an outdated conceptualisation of higher education that operates within a transactional model. Students focus on producing a “grade-worthy” product rather than on the learning that happens along the way. Generative AI has made this weakness more visible and more urgent. It exposes what was already broken: treating assignments as discrete, one-time events rather than iterative learning experiences.
Before we can redesign assessment, we need to be clear about what we want students to learn. This is a more complex question than most institutions acknowledge. Learning outcomes, according to the European Qualifications Framework [2], typically fall into three categories:
- Knowledge: notions, concepts, and foundational understanding.
- Technical skills: programming, data analysis, or design.
- Transversal competencies: collaboration, communication, reflection, and metacognitive awareness.
Yet our assessment systems, particularly exams and traditional essays, capture only fragments of this spectrum. They cannot assess whether a student has learned to work effectively with others, whether they can reflect on their own learning process, or whether they have developed the research mindset needed in their field.
Generative AI systems, meanwhile, can now produce writing that surpasses the quality of most student work. They can generate code, create visualisations, and synthesise information. This capability renders traditional written assessment increasingly meaningless as a signal of student learning. We need assessment designs that do not depend on the final product alone.
One solution lies in shifting our focus to the learning process itself. This means asking students to document and share intermediate steps, to articulate their decision-making, or to reflect on challenges and iterations. This is what process-folios are designed to capture: transparent records of a student’s learning journey, not just the final destination.
As Berkeley College scholar Jason Gulya advocates [3], this approach requires us to move away from “one-and-done” assignments toward what he calls “process clusters”. Students complete multiple related assignments that build toward mastery of a particular skill or competency. Rather than receiving a grade on each piece, they receive feedback: a simple “Complete” or “Try Again”. A Complete means the student has met the learning objective. A Try Again means another attempt is needed. This framing changes the relationship between student, teacher, and assessment. It becomes formative rather than summative, iterative rather than final.
Importantly, process-based assessment is inherently transparent about AI use. When a student documents their process, they reveal where AI was used, how it was integrated, and what their own contributions were. This transparency is not a workaround to detect cheating; rather, it is a direct window into learning.
Documenting process works best when paired with a competency-based assessment. Rather than assigning grades to individual assignments, educators may group assignments by the specific competencies they address. For instance, in a design or development course, students might be assessed on their growth in areas like user research, prototyping, collaboration, and communication.
The authors of the Nature article also propose other alternatives: (1) conversation-based assessments that adapt to students’ understanding in real time, (2) continuous assessment replacing high-stakes exams, and (3) transparent evaluation frameworks that acknowledge AI involvement. While conversation-based assessment can be tricky for distance-learning institutions like the German University of Digital Science, continuous assessment, supported by AI-driven feedback systems, can provide a more meaningful evaluation than a single exam ever could. When educators observe student learning across multiple touchpoints over weeks and months, they develop a sophisticated understanding of growth, conceptual development, and skill acquisition. This is already standard practice in medical education, where clinical supervisors assess students continuously through observation and reflection. There is no reason other disciplines cannot adopt similar approaches.
Another powerful technique is to have students co-create the rubrics by which their work will be assessed. This is counterintuitive to traditional practice, but it serves several purposes. When students help define “what success looks like” for a particular project, they clarify their own understanding of learning objectives. They become active participants in setting standards rather than passive recipients of judgment. They develop metacognitive awareness of quality and what it takes to achieve it. Peer assessment becomes a natural extension of this approach: students evaluate each other’s work using the shared criteria they have established.
This design also mitigates a core concern about AI in assessment. When students create their own rubrics and participate in peer evaluation, they shift from focusing on impressing a teacher or producing the perfect product to genuinely grappling with quality standards. They are less tempted to lean entirely on AI if they are invested in the criteria for success.
A concrete tool to support this shift is the AI transparency statement. For each assignment or project, students describe their process and explicitly indicate where and how they used generative AI. Some institutions are experimenting with structured scales (such as the AI Assessment Scale [4, 5]) that allow students to rate their AI use at each stage of their work: planning, drafting, revising, etc. This is not about policing or punishing students. It is about creating a culture in which process thinking is explicitly valued, and AI use is integrated thoughtfully rather than hidden.
One legitimate concern about process-based and continuous assessment, especially at the German UDS, is scalability. Instructor-created rubrics are time-intensive and do not scale well. With large cohorts, the grading workload becomes unsustainable without significant institutional support or technology infrastructure.
Yet emerging learning analytics and AI-supported systems can help. Conversation-based assessments powered by AI can sustain dialogue with students, ask adaptive follow-up questions, and provide immediate personalised feedback. Learning platforms can track student progress across competencies, aggregate data, and surface patterns that might otherwise be invisible. The key is to use AI not to automate judgment, but to augment it, creating a hybrid system in which human insight guides and interprets automated analysis.
My recent presentation at the professors’ retreat at the Schwielowsee in November 2025 outlined a framework for scalable assessment that integrates many of these elements, inspired by Dr Gulya:
- Students complete project-based learning experiences in which they design their own project, apply design thinking, establish success criteria, implement a prototype or assignment, and engage in self-assessment and peer assessment.
- Process is documented on collaborative platforms like YellowDig, where students share intermediate steps and receive peer feedback.
- Process folios accumulate over time, creating a transparent record of growth.
- Rather than grades on individual assignments, students receive Complete or Try Again feedback aligned to learning objectives.
- Competencies are mapped across multiple assignments; once a student achieves 90% completion across the assignments addressing a particular competency, they are considered to have demonstrated mastery of it.
- Student-created rubrics and peer assessment further reinforce engagement with quality standards.
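The mastery rule in the list above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation used in the framework: it reads "90% completion" as the share of assignments mapped to a competency that are marked Complete, it counts assignments without feedback as not yet complete, and the function name, data layout, and assignment identifiers are all hypothetical.

```python
from collections import defaultdict

# Assumption: "90% completion" means at least 90% of the assignments
# mapped to a competency have received "Complete" feedback.
MASTERY_PERCENT = 90


def competency_mastery(assignment_competencies, results, percent=MASTERY_PERCENT):
    """Summarise Complete / Try Again feedback per competency.

    assignment_competencies: dict of assignment id -> list of competencies
    results: dict of assignment id -> "Complete" or "Try Again"
    Returns a dict of competency -> (completed, total, mastered).
    Assignments with no result yet count as not complete (an assumption).
    """
    totals = defaultdict(int)
    completed = defaultdict(int)
    for assignment, competencies in assignment_competencies.items():
        done = results.get(assignment) == "Complete"
        for competency in competencies:
            totals[competency] += 1
            if done:
                completed[competency] += 1
    # Integer comparison avoids floating-point rounding at the threshold.
    return {
        c: (completed[c], totals[c], completed[c] * 100 >= percent * totals[c])
        for c in totals
    }


if __name__ == "__main__":
    # Hypothetical course: ten prototyping tasks, three collaboration tasks.
    mapping = {f"proto-{i}": ["prototyping"] for i in range(10)}
    mapping.update({f"team-{i}": ["collaboration"] for i in range(3)})
    feedback = {f"proto-{i}": "Complete" for i in range(9)}
    feedback["proto-9"] = "Try Again"
    feedback.update({"team-0": "Complete", "team-1": "Complete"})
    for name, summary in competency_mastery(mapping, feedback).items():
        print(name, summary)
```

In this hypothetical course, nine of ten prototyping assignments are Complete, so the competency counts as mastered, while two of three collaboration assignments fall short of the threshold. A real deployment would still need a decision on how resubmissions after a Try Again update the record.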
This approach is deliberately designed to be transparent about AI use, to value process over product, to distribute assessment across the term rather than concentrate it in final exams, and to encourage continuous iteration and growth.
The challenge of generative AI in education is not fundamentally a problem of academic integrity or detection. It is an opportunity to recognise that our assessment systems were already misaligned with what we claim to value about learning.
The path forward involves a multi-pronged approach: revamped grading policies that de-emphasise points and grades, process-based documentation that creates transparency, competency-based frameworks that clarify learning goals, and continuous assessment that replaces high-stakes exams with ongoing feedback and opportunity for growth.
This is not a quick fix. It requires rethinking course design, instructor workflows, and institutional policies. But it is necessary. As AI capabilities continue to advance, the old paradigm of assessing student products will become increasingly ineffective and increasingly divorced from the actual learning that matters.
References
[1] Kovanović, V., Barthakur, A., Joksimović, S., & Siemens, G. (2025). Why universities need to radically rethink exams in the age of AI. Nature, 648(8092), 35-37. https://doi.org/10.1038/d41586-025-03915-7
[2] European Qualifications Framework (EQF). https://europass.europa.eu/en/european-qualifications-framework-eqf
[3] Gulya, J. (2025, April 8). Bringing a process mindset to higher ed. Higher Education Digest. https://www.highereducationdigest.com/bringing-a-process-mindset-to-higher-ed/
[4] Perkins, M., Roe, J., & Furze, L. (2025). Reimagining the Artificial Intelligence Assessment Scale: A refined framework for educational assessment. Journal of University Teaching and Learning Practice, 22(7). https://doi.org/10.53761/rrm4y757
