How I Learned to Stop Worrying and Love the Large Language Model (for college writing courses)
My reflections on guiding students to personally appreciate the strengths and limits of generative AI in writing assignments
This short writeup details my experience designing and running a pedagogical assignment to, as the subtitle suggests, help students develop a personal appreciation for the strengths and limits of using generative AI in their writing.
Background: I’m a university professor who teaches a writing-intensive class targeted at senior-level undergraduates. At the start of the fall semester, which ran Aug–Dec 2023, there was lots of talk about Large Language Models (LLMs) and how they would impact the teaching of writing[1]. I knew I had to address this issue head-on[2], so I designed the following assignment, which is the keystone assignment in my class (worth about half of the class grade).
In previous semesters, I had students produce a research paper (“Term Paper A”). They received personalized feedback and revised their writing to produce a final paper (“Term Paper B”). (Revising from feedback is itself an important step in helping students improve their writing, though it is not the focus of this post.)
For this semester, I added a new assignment, which was to have an AI “write” a Term Paper on the same topic (“Term Paper AI”), and then critique the AI writing. Let’s discuss the logistics first, and then I’ll describe the reasoning and reflections.
Designing the assignments
Term Paper A
This was a 10-12 page APA-formatted research paper on a topic of the student’s choosing, relevant to the class and requiring some outside research. NO AI ALLOWED.
Term Paper AI
I added a Term Paper AI, due about 2 weeks after their first paper, that requires the use of generative AI. Students use generative AI to write a paper on the SAME TOPIC that they wrote their term paper on. They are then to critique and reflect on the strengths and limitations of the generated text.
Students mark their text in the default color, and the AI-generated text in blue so it's clear which is which. The format was flexible: They could (i) append their reflections before or after the AI Term Paper, or (ii) interleave their reflections (e.g., chunk of AI text, chunk of their reflections).
Students were told that they would be graded on the quality of their analysis (e.g., commenting on how compelling (or not) the AI text was). The instructions were also clear that, just like any research paper, they (i) still needed to provide an introduction/thesis (e.g., what did I find, what good and bad points did I notice), (ii) needed to make logical/reasoned arguments, and (iii) still needed to write clearly and convincingly. I also stressed that they would not be graded on the quality of the AI text. (In fact, if the AI-generated text is bad, that’s great material for students to comment on!) Prompt instructions (as well as a grading rubric) are available upon request!
Term Paper B
Now, having seen how the topic was approached both by the students themselves (Term Paper A) and by an AI (Term Paper AI), I allow students to use as much generative AI as they wish in their final revision (just marked in blue; no penalties whatsoever).
I tell students that by this point, if I have done my job correctly, they should be mature enough to know the pros and cons of using generative AI. Thus, for the final assignment, they face no restrictions (just as they will after they graduate), and I expect them to justify their use of AI to themselves (no need to justify it to me). I think this last step respects students’ autonomy and their growth over the course.
Rationale
This was a metacognitive exercise that guided students to think about their own writing by critically considering how another writer (in this case, an AI) approached the same topic, self-reflecting in the process.
Students first did the hard work of learning and writing about a topic. This initial step is necessary to give students some “expertise” upon which to make the critique.
It’s important to scaffold the critique process to guide students’ learning. For instance, in the instruction prompt, I gave example questions that students could ask themselves while reading (e.g., “Was the argument compelling? Why or why not?”). I also stressed that students still had to produce an “introduction” / thesis statement that summarized their reflections, rather than only produce stream-of-consciousness reactions.
Logistically, it’s also important to provide a step-by-step guide on how to use these LLMs. My instructions did not go into the weeds of how to write a good prompt; just which LLMs are free, how to create an account, and that one may have to use multiple prompts to get enough responses for an essay.
The existence of the third part of the assignment (“revising into Term Paper B”) strengthens the reflective exercise. There is some benefit to doing the critique with a future “revised Term Paper” goal in mind: for instance, students might notice that the AI made a point better than they did, and so reflect on their own writing. Ultimately, though, if others reproduce this assignment in other classes and cannot fit in this last part, I do not think it is crucial; it depends on the learning objectives.
I should also mention how much care I took to motivate this assignment for my students, parts of which I describe above. I think it’s very important to be open with students about why the assignment is designed this way, and why it is an important learning process that will equip them for the LLM-powered world we are entering.
Results / Reflections
Most of my students provided very mature critiques in their Term Paper AIs. Most recognized the shortcomings of the AI precisely because they had written a paper on the same topic and could compare its output against their own knowledge. For instance, they were better able to identify made-up references (e.g., “This is a published author in this field, and I know because I had cited them, but this particular reference is made up”). They also noticed that the AI tends to discuss points superficially, without much elaboration; to a novice encountering those points for the first time, they might have passed for “good writing.”
It is crucial, from a design perspective, to first have students write a paper on their own after doing their own research.
Without this background, many students would have found the AI-generated text to be compelling.
Careful students also picked up on linguistic characteristics: the AI tends to be wordy, throw around big words, use overly formal constructions, and fall into repetitive sentence structures.
Here is a summary of what students picked up on (with a few select quotes written by students):
Advantages:
Very good at generating points. Most students were also impressed that the AI came up with the same points they had (given that they had done their research), and sometimes even generated points they had not thought of.
Efficiency
“To be frank, it felt frustrating to see a machine output content that was better than my own in a matter of seconds”
Common drawbacks:
AI-generated language tends to be repetitive.
In one instance, the same “concluding statement” at the end of a section that connected that point back to the importance of the main thesis appeared, word-for-word, at the end of two consecutive sections.
Sometimes the AI uses superlative or overly formal language.
Does not often substantiate claims, or elaborate.
“Hops” from one idea to another.
Sometimes the AI’s logic does not flow across paragraphs/sections, resulting in poor transitions.
“Rather than an argument that builds on itself, this was more of a comprehensive list of facts that relate to the topic”
Poor at using real-world examples.
“I was captivated by its nearly excessive information on certain topics, but also annoyed because I was not sure why ChatGPT could not connect those ideas to real events”
Miscellaneous:
Students also expressed the value of having their own “voice” expressed in the text. This is not quantifiable, and I suspect it matters a lot more to some students than to others.
“I also think that AI is a bit more eloquent than I am, but I do not think I would use AI to re-word my work because I would rather my authentic expression be read”
One student shared in class: “Even though I probably spent 30-40 hours on my own paper, while the AI took a few minutes, I benefited from struggling to put my ideas down into words.”
“Seeing a side-by-side comparison of ChatGPT’s output and my own work in my term paper was the greatest learning experience I could have asked for.”
Finally, I made sure to leave some time to discuss this exercise in class (after students turned it in, while the experience was still fresh), asking them to share their experiences. I felt students also benefited from hearing other students’ opinions and seeing their own observations echoed by others. This became a wide-ranging discussion in which some students compared LLMs in writing to how the calculator changed math education. (One student countered that “we are not wired to do math, but we are wired to produce language to develop our ideas and communicate with others: we shouldn’t replace that with an AI”.) It yielded fascinating ideas, but they are well beyond the scope of this post.
Conclusion
Students all expressed positive feedback about this exercise. Many had not used, or thought of using, generative AI before, yet most had heard that other students (or even close friends) were using it. Having gone through this exercise, students reported being more aware of the limitations of generative AI from an “expert” perspective (having done a bit of research on their paper topic). They expressed more appreciation for why people might consider using generative AI, and more confidence in their ability to make their own choices in the future. Some students also endorsed the benefits of working through ideas themselves, which develops critical-thinking and perspective-taking/communication skills. Overall, I felt the exercise was a success.
For you!
If you are an instructor and want to use this exercise in your course, please feel free to adapt it to your context! I am happy to share my assignment instructions, as well as my grading rubric (please reach out to me).
If you have ideas or feedback, or end up using this in your class, please let me know too!
To end, I do think that LLMs will fundamentally change education. There’s no putting the genie back in the bottle. And perhaps there are still certain educational contexts where it might be desirable to outright ban the use of LLMs. But I think that for the majority of contexts, educators have to adapt their paradigms and their approach to education. (I briefly mentioned the calculator-math education analogy earlier; I do not see it as exactly analogous, but it is perhaps too early to tell). I sincerely hope that we as a society will still prioritize learning to write well—not just for the mechanics of stringing together coherent and effective sentences, but also for how writing itself sharpens thoughts and ideas.
[1] Brief summary of the issues, which are too complex to get into in this space and are also not the main point of this post: LLMs can produce very fluent writing, and students may be tempted to use AI to write their assignments. This defeats the pedagogical purpose of “writing” assignments. One solution is to “ban” the use of LLMs (which opens another can of worms in detection/enforcement). Another possible solution, which I propose here, is to adapt the way we structure writing assignments.
[2] Hence the Stanley Kubrick-inspired title. (Though I hope our LLM-powered future won’t be as dark as the nuclear apocalypse depicted in the original film.)