Students Who Think Out Loud With AI Get 2x Better Grades

We ran a study on 10,000 quiz questions where students used StudyFetch's AI tutor while taking practice quizzes. Each chat message was scored on how responsibly the student engaged: were they using the AI as a thought partner, or just trying to get the answer?
Students who engaged at the highest level were 2.3x more likely to answer correctly on their first attempt. They were 2x more likely to master the question. Their average best grade was 21 points higher.
The students who didn't engage, who typed things like "what do I pick" or "just tell me the answer," scored the worst on every metric. The AI tutor doesn't give direct answers, so those students were redirected back toward thinking through the problem. Most of them still didn't engage.
The Gap Is Biggest When Guessing Won't Save You
On multiple choice questions, the difference between engaged and disengaged students was small. Guessing gives everyone a baseline. But on free response questions, where students have to construct their own answer, the gap exploded. Engaged students answered correctly at 9.6x the rate of disengaged students (31.9% vs 3.3%).
When you can't guess your way to an answer, how you interact with the AI tutor determines whether you learn the material.
The Finding That Surprised Us
We asked students to rate their own confidence before submitting each answer. Even when students reported the same confidence level, those who had engaged with the AI tutor still answered correctly at higher rates:
Confidence Level  | Disengaged Students | Engaged Students
"I don't know"    | 8.3% correct        | 17.0% correct
Low confidence    | 10.6% correct       | 29.1% correct
Medium confidence | 25.7% correct       | 48.0% correct
A student who worked through the problem with the AI and said "I don't know" got the answer right twice as often as a student who skipped the thinking and also said "I don't know."
At low confidence, engaged students were nearly 3x more likely to get the right answer. These students had absorbed something from the interaction that they couldn't feel. The understanding showed up in their performance even when it didn't show up in their self-assessment.
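For readers who want to check the "twice as often" and "nearly 3x" figures, here is a minimal sketch that recomputes the ratios from the percentages in the table above. The variable and function names are illustrative, and the inputs are the rounded published percentages, so the results are approximate.

```python
# Percent correct by self-reported confidence, copied from the table above.
# Each tuple is (disengaged, engaged). These are rounded published values.
rates = {
    "I don't know": (8.3, 17.0),
    "Low confidence": (10.6, 29.1),
    "Medium confidence": (25.7, 48.0),
}


def engagement_ratio(disengaged: float, engaged: float) -> float:
    """How many times more often engaged students answered correctly."""
    return engaged / disengaged


# Ratio per confidence level, rounded to two decimals for display.
ratios = {
    level: round(engagement_ratio(disengaged, engaged), 2)
    for level, (disengaged, engaged) in rates.items()
}

for level, ratio in ratios.items():
    print(f"{level}: {ratio}x")
```

With these inputs the ratios come out to about 2.05x, 2.75x, and 1.87x, which matches the wording above for the "I don't know" and low-confidence rows.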
What "Engaged" Actually Means
StudyFetch's AI literacy system scores every chat message on a Responsibility scale from 1 to 4:
Score 4: Shows their own work, asks conceptual questions, uses the AI as a tutor. Example: "I think the answer involves cellular respiration because ATP is produced there, but I'm not sure how the proton gradient connects. Can you help me with that part?"
Score 3: Wants to understand the material, asks for explanations. Example: "Can you explain how mitosis is different from meiosis?"
Score 2: Asks for help with minimal effort. Example: "help me with this"
Score 1: Doesn't engage with the material. Example: "what's the answer"
The distinction matters because the AI tutor responds the same way regardless of score. It never gives direct answers. But students who bring their own thinking to the conversation leave with more than students who don't.
Just Asking the AI Helps
Separate from how well students engage, we also measured whether using the AI chat helps at all. From the full dataset of 536,362 quiz questions where students opened the chat:
Students who got a question wrong and then chatted with the AI before retrying were more likely to answer correctly on their next attempt than students who retried without chatting. The effect is consistent across question types and education levels.
Students who opened the chat before their first answer attempt got the question right at a substantially higher rate than students who answered cold, without chatting at all. The baseline first-attempt correct rate for students who didn't chat first was 22.1%.
The AI tutor helps even when the student isn't engaging deeply. But the gap between "used the tutor" and "used the tutor well" is where the real learning difference appears. The 2.3x and 2x numbers above show what happens when students bring their own thinking into the conversation rather than just reading what the AI says.
This Is What Purpose-Built Educational AI Looks Like
Generic AI tools give students whatever they ask for. StudyFetch's tutor is built differently. It explains concepts. It gives hints. It asks guiding questions. It does not hand over answers. And the quiz system requires mastery: students can't move on until they demonstrate real understanding.
When you pair a tutor that refuses to shortcut with a quiz system that demands comprehension, students learn. The ones who engage with the process, who show their work and think through problems, are 2x more likely to master a question than the ones who don't. And even the students who engage minimally still benefit from having a tutor that guides them toward understanding rather than handing them answers.
That's the case for purpose-built AI in education. Not AI that does the work for students. AI that makes students do better work.
Privacy: No student message text, quiz questions, quiz answers, or student responses are included in any published dataset. All user identifiers have been salted and hashed. The published data contains only anonymized IDs, numerical scores, timestamps, and metadata. StudyFetch does not use student data to train AI models. This analysis was conducted on structured score data only.
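As an illustration of what "salted and hashed" identifiers generally means, here is a minimal sketch. It assumes SHA-256 and a single secret salt kept out of the published data; the function name and the sample ID are hypothetical, and StudyFetch's actual method is not described in this article.

```python
import hashlib
import secrets


def anonymize_id(user_id: str, salt: bytes) -> str:
    """Return a hex digest of the salt followed by the user ID.

    Assumption: SHA-256 over salt + ID. The same salt must be reused so
    that one user always maps to the same anonymized ID.
    """
    return hashlib.sha256(salt + user_id.encode("utf-8")).hexdigest()


# The salt is random and stays private; it is never published with the data.
salt = secrets.token_bytes(16)

anon_a = anonymize_id("student-123", salt)
anon_b = anonymize_id("student-123", salt)
anon_other_salt = anonymize_id("student-123", secrets.token_bytes(16))

print(anon_a == anon_b)           # same ID and salt give the same digest
print(anon_a == anon_other_salt)  # a different salt gives a different digest
```

Keeping the salt secret is what prevents someone from hashing a list of known IDs and matching them against the published values.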


