Lesson 15 of 27 · beginner · 9 min read

Before this: App, IDE, agent — or all three?

Code & “sentence” completion

Key takeaways

  • Ghost text: a grey, inline suggestion that finishes your current line or function, which you accept or reject.
  • Fill-in-the-middle: the model reads code both before and after your cursor to predict what belongs in between.
  • Low-risk, not no-risk: small visible suggestions are easy to reject, but they can still encode plausible-looking wrong logic.

This is lesson 15 of the path, and the first of four lessons on the modes of AI-assisted coding — the different ways you actually put a model to work. We start with the lightest-touch mode: inline code completion. By the end you’ll understand what that grey “ghost text” in your editor is, the trick that lets it finish your code so accurately, where it earns its keep, and the quiet ways it can lead you astray.

What inline completion looks like

You’re typing in your editor. You write the start of a line and, before you finish, faint grey text appears ahead of your cursor proposing the rest. Press Tab and it becomes real code; keep typing and it disappears. That grey text is an inline completion, and most people call it ghost text because it’s there-but-not-there until you commit to it.

This is the descendant of plain old autocomplete — the dropdown that suggested a method name once you typed a dot. The difference is scale and source. Classic autocomplete listed names it found by scanning your code. Modern completion is generated by a language model (the kind we met back in What is AI for coding?) that predicts the content, not just a name: a whole expression, the rest of a line, the body of a function, the next case in a switch.

A useful analogy is the predictive text on your phone keyboard. You type “see you” and it offers “tomorrow.” It isn’t reading your mind; it has learned that those words tend to follow each other, and it’s offering the statistically likely continuation. Code completion is the same idea aimed at source code: given everything around your cursor, here is the most likely next stretch of code.

How it works: fill-in-the-middle

Earlier in this path, in How a model decides what to write, we saw that a language model predicts the next token from the tokens before it. Plain left-to-right prediction is fine when you’re writing at the very end of a file, but most editing happens in the middle of existing code — there’s code above your cursor and code below it, and both matter.

The technique that handles this is called fill-in-the-middle (sometimes abbreviated FIM). During training, completion-tuned models are shown examples where a chunk has been cut out of the middle of a file and the surrounding parts are rearranged so the model learns to predict the missing chunk from both sides. At completion time, the editor sends the model the code before your cursor (the “prefix”) and the code after your cursor (the “suffix”), and the model predicts what belongs in between.
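
The prefix/suffix hand-off can be sketched in a few lines of Go. This is a minimal illustration, not how any particular editor or model does it: the three marker strings are placeholder assumptions (real completion models each define their own special tokens), and splitAtCursor and buildFIMPrompt are hypothetical helper names.

```go
package main

import (
	"fmt"
	"strings"
)

// Placeholder sentinel markers. ASSUMPTION: these strings are illustrative;
// each real completion model defines its own special tokens.
const (
	prefixMark = "<PRE>"
	suffixMark = "<SUF>"
	middleMark = "<MID>"
)

// splitAtCursor divides the buffer into the text before and after the
// cursor. cursor is a byte offset, clamped to the buffer bounds.
func splitAtCursor(buf string, cursor int) (prefix, suffix string) {
	if cursor < 0 {
		cursor = 0
	}
	if cursor > len(buf) {
		cursor = len(buf)
	}
	return buf[:cursor], buf[cursor:]
}

// buildFIMPrompt lays out prefix and suffix so that whatever the model
// generates after the final marker is the missing middle.
func buildFIMPrompt(prefix, suffix string) string {
	var b strings.Builder
	b.WriteString(prefixMark)
	b.WriteString(prefix)
	b.WriteString(suffixMark)
	b.WriteString(suffix)
	b.WriteString(middleMark)
	return b.String()
}

func main() {
	buf := "for _, b := range data {\n\t\n}\nreturn crc\n"
	cursor := strings.Index(buf, "\t") + 1 // just after the indent
	prefix, suffix := splitAtCursor(buf, cursor)
	fmt.Printf("%q\n", buildFIMPrompt(prefix, suffix))
}
```

The point of the layout is that the model still only predicts "what comes next", but what comes next is now conditioned on both sides of the gap.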

That two-sided view is why completions feel so context-aware. If a function below already returns a specific type, the suggestion above it will tend to match. If you’ve opened a for loop and there’s a closing brace waiting two lines down, the model fills the body, not another loop. Editors typically also feed in extra signal, such as the file’s imports, nearby open files, and your recent edits, so the suggestion fits this codebase rather than generic code from the internet.

Here’s the shape of it in a Go file from a project like GopherTrunk. Suppose the signature, the seed, the loop header, and the return are already in place, and the cursor sits where the comment is:

// computeCRC returns the CRC for a P25 data unit.
func computeCRC(data []byte) uint16 {
    var crc uint16 = 0xFFFF
    for _, b := range data {
        // cursor here — ghost text proposes the CRC update
    }
    return crc
}

Because the model can see both the 0xFFFF seed above and the return crc below, it can propose a plausible body — the XOR-and-shift loop a CRC needs — that lines up with what you’ve already committed to. You read it, and either accept or keep typing.
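
For illustration, here is one body the ghost text might plausibly propose. It is an assumption, not the project’s actual code: it uses polynomial 0x1021 with the 0xFFFF seed (the variant commonly listed as CRC-16/CCITT-FALSE), which is not confirmed to be the polynomial a P25 data unit calls for. That uncertainty is exactly the kind of thing you would need to check against the protocol spec before accepting.

```go
package main

import "fmt"

// computeCRC returns a CRC-16 over data using polynomial 0x1021 with
// seed 0xFFFF (commonly listed as CRC-16/CCITT-FALSE).
// ASSUMPTION: the polynomial is illustrative; it is not confirmed to be
// the one a P25 data unit uses.
func computeCRC(data []byte) uint16 {
	var crc uint16 = 0xFFFF
	for _, b := range data {
		crc ^= uint16(b) << 8 // fold the next byte into the high bits
		for i := 0; i < 8; i++ {
			if crc&0x8000 != 0 {
				// top bit set: shift it out and apply the polynomial
				crc = (crc << 1) ^ 0x1021
			} else {
				crc <<= 1
			}
		}
	}
	return crc
}

func main() {
	// "123456789" is the conventional check input for CRC variants;
	// for this variant the published check value is 0x29B1.
	fmt.Printf("0x%04X\n", computeCRC([]byte("123456789")))
}
```

Comparing against a published check value like this is a quick way to confirm which CRC variant a suggestion actually implements.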

Where completion shines

Inline completion is at its best when the next code is predictable — where you already know what you want and would only be slowed down by typing it out. That covers a lot of day-to-day work:

Situation             | Why completion helps
Boilerplate           | Constructors, getters, error-wrapping, logging lines: verbose but obvious code the model can finish from one cue.
Repetitive patterns   | Once you write one case, struct field, or assignment, the model infers the next several from the established shape.
The obvious next line | After if err != nil {, a Go developer almost always writes a return; the model offers it instantly.
Test cases            | Given one row of a table-driven test, completion proposes more rows with varied inputs and expected outputs.
Data structures       | Filling in struct literals or maps where the field names and rough values are already implied by context.

The thread running through these is repetition and convention. Completion thrives on code that follows from a pattern you’ve established. It is far less reliable for the genuinely novel — the tricky algorithm you haven’t worked out yet, or a design decision the surrounding code can’t hint at. For those, you’ll reach for chat instead.
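
The table-driven case is easy to picture. In the sketch below, clamp and clampCase are hypothetical names invented for the example, and the comments mark which row is typed by hand and which rows are the kind completion tends to propose from it. In a real project this would normally live in a _test.go file and use the testing package; it is written as a plain program here so it runs on its own.

```go
package main

import "fmt"

// clamp limits v to the closed range [lo, hi]. Hypothetical helper.
func clamp(v, lo, hi int) int {
	if v < lo {
		return lo
	}
	if v > hi {
		return hi
	}
	return v
}

// clampCase is one row of the table.
type clampCase struct {
	name      string
	v, lo, hi int
	want      int
}

var clampCases = []clampCase{
	{"inside range", 5, 0, 10, 5}, // the row you typed by hand
	// Rows of the kind completion tends to propose from the first one:
	{"below range", -3, 0, 10, 0},
	{"above range", 42, 0, 10, 10},
	{"at lower bound", 0, 0, 10, 0},
	{"at upper bound", 10, 0, 10, 10},
}

// failures returns the names of rows whose result differs from want.
func failures(cases []clampCase) []string {
	var failed []string
	for _, c := range cases {
		if got := clamp(c.v, c.lo, c.hi); got != c.want {
			failed = append(failed, c.name)
		}
	}
	return failed
}

func main() {
	fmt.Println("failed rows:", failures(clampCases))
}
```

Proposed rows still need a read: the expected values are predictions too, and a wrong want column produces a test that passes or fails for the wrong reason.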

Why it’s relatively low-risk

Of all the modes we’ll cover, inline completion is the gentlest, and it’s worth being clear about why, because the reasons map directly onto what makes a tool safe to lean on:

  • Suggestions are small. A completion is usually a line or a few lines, not a sprawling change across many files. The amount of code you have to evaluate at once is tiny.
  • They’re visible. The proposed code sits right in front of you, in context, before it’s committed. You’re not running anything; you’re reading a preview.
  • You accept each one explicitly. Nothing happens until you press Tab. Every suggestion passes through a deliberate human decision. Reject it and it’s as if it never existed.

This is the same human-in-the-loop principle that makes chat-assisted coding safe, just at the smallest possible grain: one suggestion, one decision. Because the unit is so small, your review cost per suggestion is low, and a mistake is cheap to catch.

The drawbacks worth knowing

“Low-risk” is not “no-risk,” and the failure modes are subtle precisely because the tool feels harmless.

It can autocomplete wrong logic that looks right. This is the big one. A completion is fluent and well-formatted — it looks like correct code because the model is very good at producing code-shaped text. But fluent is not the same as correct. The model can offer an off-by-one boundary, the wrong comparison operator, a CRC polynomial that’s close but not the one your protocol uses, or a plausible-looking constant that’s simply wrong. The smoothness that makes completion pleasant is exactly what can lull you into accepting a bug. Read suggestions for meaning, not just shape — and lean on tests to catch what your eyes miss.
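
A small sketch of what “looks right, is wrong” can mean in practice. Both function names are hypothetical, and the scenario assumes the intent is a half-open window [start, end), with start included and end excluded. The first version is the sort of fluent suggestion that is easy to accept; only a boundary value tells the two apart.

```go
package main

import "fmt"

// inWindowSuggested is the kind of body a completion might offer for
// "is i inside the half-open window [start, end)?". It reads fine, but
// <= wrongly admits end itself.
func inWindowSuggested(i, start, end int) bool {
	return i >= start && i <= end
}

// inWindow is the intended half-open check: start included, end excluded.
func inWindow(i, start, end int) bool {
	return i >= start && i < end
}

func main() {
	// The boundary value is the only input where the versions disagree.
	fmt.Println(inWindowSuggested(10, 0, 10)) // true: the off-by-one bug
	fmt.Println(inWindow(10, 0, 10))          // false: the intended result
}
```

For every value strictly inside the window the two functions agree, which is why a casual glance or a single happy-path test would not catch the difference.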

It can anchor your thinking. The moment a suggestion appears, it nudges you toward its approach. Maybe you were about to write a cleaner solution, but the grey text proposed a serviceable-but-clunkier one and you took it because it was already there. This anchoring is quiet — you rarely notice it happening — and over a whole codebase it can flatten your code toward whatever the model finds most probable rather than what’s best for your problem. When a suggestion appears for a decision that actually matters, it’s worth pausing to ask whether you’d have written it that way unprompted.

Neither drawback is a reason to turn completion off. They’re a reason to stay the author: the model proposes the next line, but you’re still the one deciding it’s the right line.

Recap

  • Ghost text — inline completion shows a grey suggestion that finishes your current line or function, becoming real code only when you accept it.
  • Predictive-text analogy — like your phone offering the next word, completion offers the statistically likely next stretch of code from its context.
  • Fill-in-the-middle — the model reads both the code before your cursor and the code after it, which is why suggestions fit the surrounding code.
  • Best for the predictable — boilerplate, repetitive patterns, the obvious next line, test cases, and data structures are where it shines.
  • Low-risk by design — suggestions are small, visible, and explicitly accepted one at a time, so mistakes are cheap to reject.
  • Subtle failure modes — it can autocomplete wrong-but-plausible logic and anchor your thinking toward its suggestion, so read for meaning before accepting.

Next up: when the next line isn’t obvious and you need to describe what you want, you move from completion to conversation — see Chat-assisted coding.

Frequently asked questions

What is the grey text that appears as I type code?

That is an inline completion, often called ghost text. An AI model has read the code around your cursor and is proposing how to finish the current line, expression, or function. It is only a suggestion: it does nothing until you accept it, usually by pressing Tab.

How is code completion different from chat?

Completion is inline and continuous — it predicts the next few tokens as you type, with no conversation. Chat-assisted coding is a back-and-forth where you ask in words and review a larger answer. Completion is for the obvious next line; chat is for anything you need to describe or discuss.

Can I trust an inline suggestion?

Trust it the way you’d trust autocomplete on your phone: glance, then decide. Suggestions are small and visible, and you accept each one deliberately, so a wrong one is easy to reject. The real trap is a suggestion that looks right but encodes subtly wrong logic — so read it before pressing Tab, especially for anything non-trivial.