The old debugging loop

The traditional debugging loop is slow by design: reproduce the issue, form a hypothesis, add instrumentation, run the code, observe the output, revise the hypothesis. For complex bugs — the ones that span multiple systems, manifest only under specific conditions, or involve race conditions — this loop can run for hours.

AI-assisted debugging shortens the loop. It does not eliminate any step; it collapses the hypothesis-formation step from minutes to seconds.

The core technique: full context dump

The biggest mistake engineers make when using AI for debugging is underspecifying the problem. They paste an error message and ask "what's wrong?"

The error message alone is almost never enough. A good debugging session with Claude starts with a full context dump:

The error or unexpected behavior, verbatim
The relevant code — not a snippet, the full function and its immediate dependencies
What the code is supposed to do
What it actually does
What you have already tried
The environment: language version, framework version, relevant configuration

This takes three minutes to assemble. It saves thirty minutes of back-and-forth. In my experience, Claude's first response to a well-specified debugging request is usually the correct hypothesis.
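The checklist above can be turned into a small helper that refuses to build a prompt until the required pieces are present. This is a minimal sketch, not a prescribed format: the `DebugContext` class, its field names, the section labels, and the sample bug are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class DebugContext:
    """Hypothetical container for the six checklist items."""
    error: str       # the error or unexpected behavior, verbatim
    code: str        # the full function and its immediate dependencies
    expected: str    # what the code is supposed to do
    actual: str      # what it actually does
    tried: list[str] = field(default_factory=list)             # what you have already tried
    environment: dict[str, str] = field(default_factory=dict)  # versions, relevant config

    def missing(self) -> list[str]:
        """Return the names of required sections that are still empty."""
        required = {
            "error": self.error,
            "code": self.code,
            "expected": self.expected,
            "actual": self.actual,
        }
        return [name for name, value in required.items() if not value.strip()]

    def to_prompt(self) -> str:
        """Render every section under a plain-text label, in checklist order."""
        tried = "\n".join(f"- {item}" for item in self.tried) or "- nothing yet"
        env = "\n".join(f"- {k}: {v}" for k, v in self.environment.items()) or "- not specified"
        sections = [
            ("ERROR (verbatim)", self.error),
            ("RELEVANT CODE", self.code),
            ("EXPECTED BEHAVIOR", self.expected),
            ("ACTUAL BEHAVIOR", self.actual),
            ("ALREADY TRIED", tried),
            ("ENVIRONMENT", env),
        ]
        return "\n\n".join(f"{title}:\n{body.strip()}" for title, body in sections)


# Made-up example bug, for illustration only.
ctx = DebugContext(
    error="KeyError: 'user_id'",
    code="def load(session):\n    return session['user_id']",
    expected="Returns the logged-in user's id",
    actual="Raises KeyError on the first request after login",
    tried=["Confirmed the login handler runs"],
    environment={"python": "3.12"},
)
assert not ctx.missing(), f"fill in before asking: {ctx.missing()}"
prompt = ctx.to_prompt()
```

The `missing()` check is the point of the sketch: it makes the discipline mechanical, so an error-message-only request fails before it is sent.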

Feeding logs to Gemini

Log-based debugging is a category where Gemini's long context window is especially valuable.

A service that has been misbehaving for an hour may have produced megabytes of logs. The relevant signal is in there somewhere. Traditional debugging means reading through logs manually or constructing grep patterns to find the event you are looking for.

With Gemini, the workflow is different: paste the full log, describe the symptom and the expected behavior, and ask Gemini to identify the sequence of events that led to the failure.

The output is usually a precise narrative: "At 14:23:07, the connection pool reached saturation. The next incoming request at 14:23:08 queued but the timeout was set to 0, which caused an immediate rejection rather than a wait. The downstream service received an unexpected error..." grep can find the matching lines, but it cannot connect them into a causal sequence.

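For large log files, the ones that exceed the context windows of most models, a long context window is the deciding capability, and it is the main reason to reach for Gemini here. Context limits differ between models and change often, so check the current limit of whichever model you use. A multi-megabyte log can exceed even a very large window and may need trimming first.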
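When a log is too large to paste whole, one option is to trim it to a time window around the incident before sending it. The sketch below is a rough illustration under several assumptions: every log entry starts with a fixed-width timestamp in the format shown, four characters per token is only a rule of thumb and not a real tokenizer, and the function names and default window sizes are hypothetical.

```python
from datetime import datetime, timedelta

CHARS_PER_TOKEN = 4  # rough rule of thumb, not a real tokenizer


def estimate_tokens(text: str) -> int:
    """Very coarse token estimate based on character count."""
    return len(text) // CHARS_PER_TOKEN + 1


def window_around(lines, incident, before, after, ts_format="%Y-%m-%d %H:%M:%S"):
    """Keep lines whose leading timestamp is within [incident - before, incident + after].

    Lines without a parseable timestamp (stack trace continuations, for
    example) inherit the decision made for the previous timestamped line.
    """
    start, end = incident - before, incident + after
    width = len(incident.strftime(ts_format))  # assumes a fixed-width format
    keep = False
    kept = []
    for line in lines:
        try:
            ts = datetime.strptime(line[:width], ts_format)
            keep = start <= ts <= end
        except ValueError:
            pass  # continuation line: reuse the previous decision
        if keep:
            kept.append(line)
    return kept


def fit_log(lines, incident, budget_tokens,
            before=timedelta(minutes=10), after=timedelta(minutes=2)):
    """Halve the look-back window until the excerpt fits the token budget.

    Stops shrinking at roughly one second of look-back, so the result can
    still exceed the budget if the incident itself is very noisy.
    """
    excerpt = window_around(lines, incident, before, after)
    while (estimate_tokens("\n".join(excerpt)) > budget_tokens
           and before > timedelta(seconds=1)):
        before = before / 2
        excerpt = window_around(lines, incident, before, after)
    return excerpt
```

Trimming by time rather than by keyword keeps the lines that grep would have missed, which is the whole reason for handing the log to a model in the first place.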

The hypothesis stress-test

Once you have a hypothesis — from your own reasoning, from Claude, or from log analysis — the next step is stress-testing it before you spend time implementing a fix.

I do this explicitly with Claude: "My hypothesis is X. Walk me through the ways this hypothesis could be wrong."

This is a high-value use of the model. Claude is good at finding holes in reasoning, identifying assumptions that are not stated explicitly, and suggesting alternative explanations for the same symptoms. It does not get attached to a hypothesis the way engineers sometimes do after spending an hour developing it.

The stress-test takes five minutes. It has caught wrong hypotheses — ones that would have sent me down a 90-minute fix path — too many times to count.
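The stress-test request can be kept as a reusable template so it gets asked the same way every time. The sketch below is one possible wording, not a canonical one: the function name is hypothetical, and the numbered questions are my expansion of "find holes, surface unstated assumptions, suggest alternatives" rather than a fixed recipe.

```python
def stress_test_prompt(hypothesis: str, symptoms: list[str]) -> str:
    """Build a prompt that asks the model to attack a hypothesis, not confirm it."""
    if not hypothesis.strip():
        raise ValueError("state the hypothesis before stress-testing it")
    observed = "\n".join(f"- {s}" for s in symptoms) or "- (none listed)"
    return (
        f"My hypothesis is: {hypothesis.strip()}\n\n"
        f"Observed symptoms:\n{observed}\n\n"
        "Walk me through the ways this hypothesis could be wrong. In particular:\n"
        "1. Which assumptions am I making without stating them?\n"
        "2. Which symptoms does the hypothesis fail to explain?\n"
        "3. What alternative explanations fit the same symptoms?\n"
        "4. What is the cheapest check that would rule the hypothesis out?"
    )


# Example values borrowed from the connection pool narrative earlier in the article.
prompt = stress_test_prompt(
    "The pool timeout of 0 causes immediate rejection once the pool saturates",
    ["Requests are rejected instead of queued", "Failures begin at 14:23:07"],
)
```

Listing the symptoms separately from the hypothesis matters: it gives the model something to check the hypothesis against instead of just restating it.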

Rubber duck debugging, upgraded

Rubber duck debugging is the practice of explaining a problem out loud to an inanimate object. The act of articulation often produces the insight without the duck needing to respond.

AI rubber duck debugging adds the response. You explain the problem; the model asks clarifying questions or points out the thing you just said that contradicts something you said earlier.

For bugs that feel confusing but that you cannot quite articulate, this conversational mode is more valuable than the dump-and-analyze approach. Start talking through the problem. Claude's questions will surface what you have not said.

"The bug I was stuck on for two hours revealed itself in the second sentence of my explanation when Claude asked: 'You said the cache is invalidated on write — are you certain that write is completing before the read that follows it?'"

When to reach for each tool

Claude: hypothesis generation, code-level analysis, stress-testing your reasoning, explaining what a piece of unfamiliar code does in the context of your bug.

Gemini: log analysis at scale, tracing event sequences across large volumes of output, correlating symptoms across multiple log sources.

Neither: bugs in systems you do not understand well enough to describe. These require understanding the system first. AI debugging without understanding produces plausible-sounding wrong answers. Learn the system, then bring the model in.

The ROI is asymmetric

Debugging is the task where the quality difference between AI-assisted and unassisted work is highest for most engineers. A 90-minute debugging session becomes 15 minutes not because every step is faster, but because the hypothesis-formation step — the expensive one — is compressed dramatically.

The technique is not complicated. Dump full context. Feed logs at scale. Stress-test hypotheses. Talk through problems conversationally.

The investment is the discipline to assemble the full context before asking, rather than sending the error message and hoping the model guesses the rest.

How to debug with AI faster than you can think