Why screenshots alone don't make good documentation

Open any random Confluence page from 2023. The one with five screenshots and three sentences. Each screenshot shows a UI element with a red arrow. The sentences read like “click the Approve button,” “then go to the dashboard,” “configure as needed.” It looked fine the day someone wrote it. Today the buttons have moved, the page in the screenshots has been redesigned twice, and nobody remembers which environment those screenshots were taken in.

That page isn't bad documentation because the screenshots are bad. The screenshots are probably fine. It's bad because the screenshots are doing all the work, and a screenshot without context is a riddle. The reader has to guess what URL was open, what state the app was in, what account they should be logged in as, and what comes next. Every guess is an opportunity to be wrong.

This post is about why that happens, and what a screenshot has to be paired with before it earns the word “documentation.” Disclosure: I work on UIHike, and the answer happens to be the shape of the tool we build. I've tried to argue the case from first principles and link out where someone else made the point earlier.

What a screenshot is, technically

A screenshot is a flat raster of pixels captured at one moment in time. It carries the visual state of the screen and nothing else. It doesn't carry the URL. It doesn't carry which element was clicked. It doesn't know what the user was trying to accomplish. It doesn't store the form values, the keyboard shortcut, the right-click menu, or whether you were in staging or production. All of that lives in your head, and if it doesn't end up next to the screenshot in writing, it's gone the moment you close the tab.

The technical writing community has been making this point for years. The short list of reasons not to rely on screenshots always lands on the same beats: they go stale fast, they fail accessibility checks, they don't survive translation, they're a pain to keep consistent, and the file sizes balloon. Most of those are real. None of them is the deepest problem.

The deepest problem is that a screenshot is a snapshot of the surface, and a procedure is a sequence of decisions made underneath. A screenshot of a button doesn't tell you what the button does, when to press it, what state to be in before you press it, or what to do if it's greyed out. The button is the easy part to capture. The hard part is everything around it, and a raster image captures none of that.

The four things missing from a bare screenshot

When a procedure goes wrong six months after publication, it's usually one of these.

1. The URL

A screenshot of the “User settings” page is useless if the reader can't get to that page. The URL bar may or may not be visible in the screenshot, depending on how the author cropped. Either way the URL isn't copy-pasteable, isn't indexed by search, and disappears the moment the page is restructured.

A walkthrough that records the URL alongside the screenshot solves this in one move. The reader clicks the URL and lands on the same page. No guessing whether “Settings” means the gear icon in the top-right or the link in the sidebar.

2. The clicked element

“Click Approve” is a sentence the author can type. It's also a sentence the author can forget to type. And in a long doc with ten screenshots, the repetition of typing “then click X, then click Y” is exactly the kind of thing humans skip.

The element the author clicked has structured properties the screenshot can't express on its own: the visible text on the button, its associated label, its CSS selector, its tag. Capturing those alongside the screenshot lets the description write itself, and lets a future reader find the element on a redesigned page even if the pixels around it have moved.
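As a rough illustration of "the description writes itself", here is a minimal Python sketch. The ClickedElement class, its field names, and the describe_click helper are all hypothetical, invented for this example; they are not UIHike's actual code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ClickedElement:
    """Hypothetical record of what was clicked; field names are illustrative."""
    tag: str                      # e.g. "button", "a", "input"
    selector: str                 # CSS selector captured at click time
    text: str = ""                # visible text on the element
    label: Optional[str] = None   # associated label text, if any


# Map a few common tags to the noun a reader would expect to see.
_NOUNS = {"button": "button", "a": "link", "input": "field", "select": "dropdown"}


def describe_click(el: ClickedElement) -> str:
    """Build a one-line step description from the captured properties."""
    noun = _NOUNS.get(el.tag.lower(), "element")
    # Prefer what the reader can see: visible text first, then the label.
    name = el.text.strip() or (el.label or "").strip()
    if name:
        return f'Click the "{name}" {noun}'
    # Nothing human-readable was captured, so fall back to the selector.
    return f"Click the {noun} matching {el.selector}"
```

Given a button with the visible text "Approve", this produces the "Click Approve" sentence the author would otherwise have to remember to type. The selector is only the last resort, because a selector means little to a human reader.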

3. The narrative between steps

Five screenshots in a row, with no prose between them, is a tell. It means the author shipped pixels and called it a doc. The reader has to reverse-engineer what the author's plan was — what was the goal of each step, why this order, what would happen if you skipped step three. That work is supposed to live in the doc, not in the reader's head.

The minimum prose-per-step is a sentence: what the user is doing and why. “Open the project to set its billing address before adding contractors” is eleven words and saves the reader from inferring why billing comes first. A walkthrough tool that auto-fills the title from the page's h1 already does half of this for you.
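Auto-filling a title from the h1 is simple enough to sketch. The version below uses only Python's standard html.parser module; the title_from_h1 function is a hypothetical helper written for this post, not a description of how any particular tool does it.

```python
from html.parser import HTMLParser
from typing import Optional


class _FirstH1(HTMLParser):
    """Collects the text of the first <h1> in a document."""

    def __init__(self) -> None:
        super().__init__()
        self._inside = False   # currently between <h1> and </h1>
        self._done = False     # first <h1> already closed
        self._parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "h1" and not self._done:
            self._inside = True

    def handle_endtag(self, tag):
        if tag == "h1" and self._inside:
            self._inside = False
            self._done = True

    def handle_data(self, data):
        # Text inside nested tags such as <em> still arrives here.
        if self._inside:
            self._parts.append(data)

    @property
    def text(self) -> Optional[str]:
        # Collapse runs of whitespace left over from the markup.
        joined = " ".join("".join(self._parts).split())
        return joined or None


def title_from_h1(html: str) -> Optional[str]:
    """Return the first h1's text, or None if the page has no usable h1."""
    parser = _FirstH1()
    parser.feed(html)
    parser.close()
    return parser.text
```

Returning None when there is no h1 matters: it is the signal that the author has to write the title by hand for that step.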

4. The boundary between current and stale

This is the silent killer. A page reviewed last week and a page written three years ago look identical in Confluence. Same header, same fonts, same confident left-justified text. Nothing in the visual signal tells the reader that the screenshots they're looking at were taken before the dashboard redesign.

Documentation rot is not a one-time event. It's a continuous process and the wiki shape pretends otherwise. Treating freshness as metadata — last recorded, last verified, last reviewed — turns “is this still right?” into a question the system can answer instead of a question the reader has to ask.
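To make "a question the system can answer" concrete, here is a small sketch of freshness as metadata. The Freshness class, the is_stale helper, and the 90-day default are assumptions chosen for illustration, not a recommended threshold.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Freshness:
    """Hypothetical freshness metadata for one walkthrough."""
    last_recorded: date
    last_verified: Optional[date] = None   # someone re-ran the steps
    last_reviewed: Optional[date] = None   # someone re-read the prose


def days_since_confirmed(f: Freshness, today: date) -> int:
    """Days since the steps were last recorded or verified against the live UI."""
    # A prose review does not prove the UI still matches, so it is not counted.
    latest = max(d for d in (f.last_recorded, f.last_verified) if d is not None)
    return (today - latest).days


def is_stale(f: Freshness, today: date, max_age_days: int = 90) -> bool:
    """True when nobody has confirmed the steps within max_age_days."""
    return days_since_confirmed(f, today) > max_age_days
```

The design choice worth arguing about is whether a review counts as confirmation. In this sketch it does not, on the theory that reading a page is not the same as following it.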

The cognitive cost of a screenshot-only doc

Reading a procedure documented as a wall of screenshots is more work than it should be. The reader is doing several jobs at once: parsing the screenshot, mapping elements in the screenshot to elements on their actual screen, inferring the order of operations, and remembering where they are in the sequence.

Five consecutive screenshots with no text between them overwhelm readers; they can't tell which step they're on or what each image represents. A close-up screenshot of a single button, with no surrounding interface, makes the reader hunt for that button on the live page. Inconsistent annotation styles across screenshots add cognitive friction the reader pays for every time they switch between two adjacent images.

None of that overhead is doing useful work. It's the cost of a doc that pushed the structuring problem onto the reader instead of solving it at write time.

What a walkthrough adds

The shape that solves the riddle is a walkthrough: an ordered sequence of steps, each containing a screenshot and the structured context that the screenshot alone is missing.

In UIHike, a recorded step contains the screenshot plus the URL, the page title, the CSS selector of the clicked element, the visible text on that element, the associated label if any, the value typed (passwords masked), and a description field. That isn't a marketing list — that's the RecordedStep data model. The step also tracks when it was captured, so a reader can see whether the page they're looking at is from this quarter or from before the last UI redesign.
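For readers who think in types, the field list above could be sketched roughly like this in Python. This is an illustration only: the field names, the make_step helper, and the mask string are my assumptions for the example, and the real RecordedStep model may be shaped differently.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional

MASK = "********"  # placeholder stored instead of a real password


@dataclass
class RecordedStep:
    """Illustrative shape of one step; names are assumptions, not the real schema."""
    screenshot_path: str
    url: str
    page_title: str
    selector: str                          # CSS selector of the clicked element
    element_text: str = ""                 # visible text on that element
    element_label: Optional[str] = None    # associated label, if any
    typed_value: Optional[str] = None      # value typed, already masked if needed
    description: str = ""
    captured_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


def make_step(*, input_type: str = "text",
              typed_value: Optional[str] = None, **fields) -> RecordedStep:
    """Build a step, masking the typed value when it came from a password input."""
    if typed_value is not None and input_type.lower() == "password":
        typed_value = MASK
    return RecordedStep(typed_value=typed_value, **fields)
```

Masking at construction time, rather than at display time, means the real password never lands in the stored step at all.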

The practical effect: the same screenshot, in a walkthrough, is no longer a riddle. The reader knows which page, which element, which input value, and what comes next. If the page has been redesigned, the clicked element's visible text or label often still matches even when the pixels don't — there's a string the reader can search for, not just a coordinate.
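The fallback from pixels to strings can be sketched as a lookup. In this hypothetical example a live page is represented as a plain list of dicts with selector, text, and label keys, standing in for a real DOM; find_on_live_page is an invented helper, not a UIHike API.

```python
from typing import Iterable, Optional


def _norm(value: Optional[str], case_sensitive: bool) -> str:
    value = (value or "").strip()
    return value if case_sensitive else value.casefold()


def find_on_live_page(step: dict, live_elements: Iterable[dict]) -> Optional[dict]:
    """Return the live element that best matches a recorded step, or None.

    Tries the recorded CSS selector first (exact match), then the visible
    text, then the label (both compared case-insensitively).
    """
    candidates = list(live_elements)
    for key, case_sensitive in (("selector", True), ("text", False), ("label", False)):
        wanted = _norm(step.get(key), case_sensitive)
        if not wanted:
            continue  # nothing was recorded for this property
        for el in candidates:
            if _norm(el.get(key), case_sensitive) == wanted:
                return el
    return None
```

After a redesign the selector is usually the first thing to break, which is why the text and label are tried next: a button that moved from the toolbar to a sidebar tends to keep its wording.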

What about screen recordings?

The first instinct, when someone says “your screenshots aren't enough,” is to record a video. Loom-style screen recordings have the same context that a walkthrough captures — the URL, the click, the value typed — embedded in the timeline.

Video solves one problem (it has the context) and creates two new ones. The reader has to watch the video linearly to find what they need; you can't skim a video the way you skim a doc. And updating a video for a small UI change means re-recording, not editing. Industry surveys keep finding that a majority of users abandon video tutorials when they can't scan for the specific written step they need.

A walkthrough is the middle term. It carries the context that a video carries, in the form of a doc that you can skim, link to a single step, copy a URL out of, or update without re-recording.

The decay rate of a screenshot-only doc

Documentation goes stale silently. There's no test suite that goes red when your runbook references a renamed service, no compiler error when the screenshot shows a button that's been moved. The decay is invisible until someone follows the doc and gets stuck.

For pure-screenshot docs, the decay rate is roughly the rate at which the underlying UI changes. Every redesign breaks every screenshot. Every renamed button breaks every “click X” sentence. The half-life of a screenshot-only procedure is shorter than most teams realize, and the team usually finds out the hard way: someone follows the doc, hits a screenshot that doesn't match what they see, and either Slacks the original author or guesses.

Walkthroughs decay too. They are not magical. The difference is in the failure mode. When a walkthrough's screenshot doesn't match a live page, the clicked-element text and the URL still tell the reader where they are and what to look for. When a screenshot-only doc's image doesn't match, there's nothing else to fall back on.

The case where a bare screenshot is fine

Not every screenshot needs to be a step in a walkthrough. Three reasonable uses for a one-off image:

The bug repro in a Jira ticket. One image, one comment, lifespan of a few days. The dev fixes the bug and the screenshot becomes archival. It doesn't need URL or element metadata because nobody will follow it as a procedure.
The visual answer in chat. Someone asks “where's the export button?”, you screenshot the page with a red circle, you paste, you move on. The image is disposable.
The marketing asset. A product-page screenshot, polished, shipped to a landing page. Has its own context (the surrounding copy) and its own update cadence (when marketing says so).

Every one of those is a single image with a single audience and a short shelf life. The trouble starts when a procedure — something the team will follow repeatedly, in different contexts, six months from now — gets documented in the same shape.

How to upgrade an existing screenshot doc

You don't have to throw the doc away. The quickest way to upgrade a procedure that today is just screenshots and bullet points is to rerun it once, capturing properly. Three passes:

Pass 1: rerecord, end to end. Open the procedure. Hit record in a walkthrough tool. Do the work as if you were doing it for the first time. The capture pipeline writes the screenshot, the URL, the clicked element, and the value for every step. Stop recording. Now you have a parallel version of the doc with all the context the original was missing.

Pass 2: edit titles and descriptions. Auto-fill is good but not perfect. Walk through each step and fix the title where it's wrong. Add a one-sentence description on the steps where the screenshot alone isn't self-evident. Skip the steps where it is.

Pass 3: redact and publish. Drag a redaction box over anything sensitive — customer names, internal URLs, API keys. The original PNG is preserved as a layer; the redacted version is what gets exported. Publish, share the link, retire the original page or replace its body with the link.
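The idea behind non-destructive redaction, keeping the original and applying the boxes only at export, can be sketched in a few lines. This toy version models an image as rows of RGB tuples instead of a real PNG, and RedactableImage is a hypothetical class written for this post, not UIHike's implementation.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Pixel = Tuple[int, int, int]
Box = Tuple[int, int, int, int]  # left, top, right, bottom (right/bottom exclusive)
BLACK: Pixel = (0, 0, 0)


@dataclass
class RedactableImage:
    original: List[List[Pixel]]                      # rows of pixels, never modified
    boxes: List[Box] = field(default_factory=list)   # redactions stored separately

    def redact(self, box: Box) -> None:
        """Remember a box to black out; the original pixels stay untouched."""
        self.boxes.append(box)

    def export(self) -> List[List[Pixel]]:
        """Return a copy of the image with every redaction box filled in."""
        out = [list(row) for row in self.original]
        height = len(out)
        width = len(out[0]) if out else 0
        for left, top, right, bottom in self.boxes:
            # Clamp to the image bounds so an oversized box cannot raise.
            for y in range(max(top, 0), min(bottom, height)):
                for x in range(max(left, 0), min(right, width)):
                    out[y][x] = BLACK
        return out
```

Because the boxes live beside the pixels instead of inside them, a misplaced redaction can be moved or removed later. The flip side is that the original must never be the thing that gets shared.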

Total time on a fifteen-step procedure: usually under thirty minutes. The first time. The next time the procedure changes, it's the same thirty minutes, not a screenshot-by-screenshot cropping session in Snagit.

The principle

A screenshot is a fact. A walkthrough is a fact in context. Documentation that survives is the kind that records the context at capture time, instead of hoping the reader can reconstruct it six months later from pixels.

The cost of capturing context at write time is a few extra fields per step. The cost of not capturing it is the time every reader spends guessing, every author spends rewriting, and every team spends triaging the “the doc was wrong” tickets.

Try UIHike on the screenshot-heavy doc that ages worst on your team — the onboarding guide, the support runbook, the audit walkthrough. Re-record it once. Compare the two versions in six months and see which one the new hire actually follows without asking a question.

— The UIHike team