Framework Builder

Decompose a complex question into testable parts — then test them yourself, well.

← Second Thought

Some questions are too big to settle in an evening. The work tonight isn't to answer one — it's to make it concrete: break it into hypotheses you could actually test, so you end the evening knowing what the question requires rather than pretending you've resolved it. Designing and running the tests comes later, over days. A short discipline below shows how to do both parts well.

The process — tonight, then later
  1. Decompose (tonight). Run the prompt below to turn your question into considerations, hypotheses, and a note on what kind of evidence bears on each. This is the evening's work, and finishing it is a real result.
  2. Edit (tonight). Knock out hypotheses that miss, refine considerations, add what's missing. This is the work that makes the framework yours.
  3. Design your tests (later). Using the discipline below, turn each hypothesis into one or more discrete, decisive tests. This is where judgment lives, and it's worth doing carefully rather than quickly.
  4. Run them, and optionally ask the AI to help (later). Gather the evidence over days, with your team and your calendar. For a specific test, you can hand it back and ask the AI to attempt authoritative retrieval — the third prompt does this honestly.

Step 1 — Decompose the Question

Considerations and hypotheses, plus a pointer to the kind of evidence each needs. No tests yet.

Click anywhere in the block, then Ctrl+C to copy. Paste into a fresh AI conversation. With Copilot, consider selecting Claude Opus or GPT from the model dropdown for stronger structured output.

You are helping me build a conceptual framework for a complex decision. Your job is to decompose the question — not to answer it, and not to design or run tests. Stop at the hypothesis level, plus a brief note on what kind of evidence would bear on each hypothesis. I will design the actual tests myself.

Before producing the framework, briefly think through what is actually at stake in this question and what the genuinely independent considerations would be — but do not include this thinking in your output.

Generate a framework with the following structure.

CONSIDERATIONS. Three to five second-level questions that, if answered, would let me resolve the main question. Considerations should be genuinely independent — not three different ways of asking the same thing — and together should cover what actually matters. Each should be of roughly comparable weight. Do not phrase them in ways that suggest any particular answer.

HYPOTHESES. An appropriate number per consideration, between two and six. The number must vary across considerations based on each consideration's scope; a uniform count signals you have not thought about scope differences. State each as a positive assertion, not a question — a claim that could be supported or refuted by evidence. The hypotheses under a consideration should, taken together, span the realistic possibilities.

EVIDENCE TYPE. For each hypothesis, name the one or two kinds of evidence that would most bear on it, drawn only from the list below. Name the type and, in a short phrase, what specifically would be examined. Do NOT design a test, do NOT name specific sources, do NOT estimate effort, and do NOT attempt to gather anything. Only point at the kind of evidence.

The evidence types:
- Records & documents: data, filings, reports, or correspondence that already exist.
- Analysis & comparison: comparing figures against trends, benchmarks, expectations, or each other.
- Inquiry: asking people who would know — experts, participants, those with direct experience.
- Observation: watching a process or behavior directly.
- Confirmation: independent verification from a third party.
- Recalculation or reperformance: independently redoing a computation or process to check a claimed result.

Format the output as plain text. Use ALL CAPS for labels and normal case for content. Use indentation and line breaks for hierarchy, not markdown symbols. Preserve the structure exactly as shown. Structure each item like this:

CONSIDERATION 1. [text]

  HYPOTHESIS 1A. [positive assertion]
    Evidence: [type] — [what specifically would be examined]

  HYPOTHESIS 1B. [positive assertion]
    Evidence: [type] — [what specifically would be examined]

CONSIDERATION 2. [text]

  HYPOTHESIS 2A. [positive assertion]
    Evidence: [type] — [what specifically would be examined]

(...and so on)

Do not preface with a summary, do not conclude with recommendations, do not design tests, and do not answer the main question. Produce only the framework through the evidence-type notes.

My main question:

[paste your question here]

Read the framework critically and edit it. The point of Step 1 is a structure you shape, not output you accept.

Step 2 — Design Your Tests

This is yours to do. Turning the kind of evidence into a real test is judgment.

A hypothesis is a claim. A test is a specific, finishable check that would move you toward believing or disbelieving it. Most of the value in this whole exercise lives here — in turning a claim into a test you could actually run, against evidence you could actually reach, with a result that would actually change your mind. Four principles separate a real test from a vague intention.

Discrete

A test is one finishable check, not a standing effort. "Compare California verdict trends to ours" is a research project. "Whether California's commercial-casualty verdict severity grew faster than the countrywide average last year" is a test. If a test would have to be broken down to start it, it's really a sub-question — promote it to a consideration or split it.

Decisive

State in advance what result would support the hypothesis and what would refute it. If no possible finding would change your view, you don't have a test — you have a gesture. The discipline of naming the disconfirming result is what protects you from gathering only the evidence that flatters your prior.

Authoritative

Evidence has to come from a source you would actually trust and that directly applies to your question. Inference assembled from sources whose reliability you can't judge is not evidence, however confident it sounds. Decide what would count as authoritative before you go looking, so you aren't tempted to lower the bar to fit what you find.

Honest about effort

Ask whether the answer already exists and just needs retrieving, or whether it has to be built — assembled, analyzed, or generated. Building is real work even when the raw inputs are your own (pulling your team together to analyze internal data is a project, not a lookup). Know which you're signing up for before you commit to the test.

The evidence types the AI pointed at in Step 1 come from a long tradition in auditing of how an assertion actually gets tested. Each suggests a different way to construct your test:

Records & documents
Examine data, filings, reports, or correspondence that already exist. Strongest when the source is independent and authoritative. Ask: does this document actually say what I need, or am I stretching it?
Analysis & comparison
Compare figures against trends, benchmarks, expectations, or each other. Powerful for spotting whether something is out of line — but only as good as the comparability of what you're putting side by side.
Inquiry
Ask people who would know. Fast and often rich, but the weakest form of evidence on its own — people misremember, simplify, and have stakes. Best for surfacing leads and judgment, then corroborated by something harder.
Observation
Watch a process or behavior directly. Tells you what actually happens rather than what people say happens — but only captures the moment you observed, which may not be representative.
Confirmation
Get independent verification from a third party with no reason to shade the answer. Among the strongest forms of evidence, because it doesn't rely on the party with an interest in the outcome.
Recalculation / reperformance
Independently redo a computation or process to see whether you get the claimed result. Decisive when a hypothesis turns on whether a number or a method actually holds up.
The one that matters most

Decide what counts as authoritative before you look

The single most common way good analysis goes wrong is lowering the evidence bar to fit what turned out to be available. Name your standard for trustworthy, directly-applicable evidence up front — then hold tests that can't meet it as open, not as weakly answered. An honest "we don't yet have authoritative evidence on this" is worth more than a confident answer built on sources you couldn't vouch for.

Step 3 — Optionally, Ask the AI to Find Evidence

For one specific test you've designed. The AI tries, and reports honestly what it could and couldn't reach.

Once you've designed a test, you can hand it back to the AI and ask it to attempt authoritative retrieval. This is the only point where the AI gathers evidence — and only for a single, specific test you've defined, never for the framework as a whole. Whether the AI can actually reach the evidence is settled by it trying, not by it guessing in advance. If it hits a paywall, a login, or a source it can't open, it tells you, and you retrieve it yourself.

Click anywhere in the block, then Ctrl+C to copy. Fill in the hypothesis and the test you designed.

I have designed a specific test for a hypothesis, and I want you to attempt to find authoritative evidence for it. Do not broaden the test, do not substitute a different one, and do not attempt anything beyond what I describe.

Rules:

Use only authoritative, directly applicable sources — government data, regulatory filings, peer-reviewed work, recognized industry bodies, or reputable named publications. If the only thing you can offer rests on inference from sources you cannot vouch for, do not offer it. Say you could not find authoritative evidence.

If you find authoritative evidence, report what you found, cite the specific source by name or URL, and state whether it supports, refutes, or is mixed on the hypothesis.

If you cannot retrieve the evidence — because it sits behind a paywall, requires a login, is in a format you cannot open, lives in a database you cannot query, or does not exist in public form — say so plainly and specifically, and tell me what I would need to do to retrieve it myself.

Do not fill gaps with plausible-sounding substitutes. A clear "I could not find this authoritatively" is more useful to me than a confident guess.

My hypothesis:

[paste the hypothesis]

The test I want you to attempt:

[paste the test you designed]

The AI surfaces what's findable. You decide what's true.