Framework · TopicSpace Research

Context Before Chunks

Retrieval answers what is similar. Field models help answer what is happening.

Most LLM context systems retrieve relevant documents. Dynamic reasoning also requires a model of the current field.

April 2026 · topicspace.ai

Retrieval-augmented generation works well for a specific task: find the documents most relevant to this query, then generate an answer. It works less well when the task is to understand what is actually happening in a complex, fast-moving domain. A dynamic field — markets, geopolitics, competitive technology — doesn't just contain relevant documents. It has structure: narratives strengthening or collapsing, claims being confirmed or contradicted, actors in relation to each other. The field has state.

This piece describes a different framing. What if context assembly started not with retrieval, but with a model of the field? And what would that model actually contain?

01

Why chunk-first context breaks down

Semantic retrieval surfaces topically similar chunks. This works for recall. It works less well for situational awareness — four structural limitations recur in practice:

· Locality bias: Each chunk is judged individually. A claim from three months ago scores the same as one from last week, unless temporal metadata is explicitly re-surfaced.
· Weak contradiction structure: Contradicting claims can both retrieve highly. Nothing encodes which are live, which are resolved, or which constitute the actual disagreement.
· No salience prior: Retrieval treats relevance as flat. There is no encoding of what the field currently considers important versus what merely mentions a term.
· Reconstructed temporal structure: Sequence, velocity, and phase must be re-inferred from individual chunks on every query — an inference that is expensive and frequently wrong.
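A minimal sketch of the first two limitations, using invented data and a hypothetical decay prior (the names, half-life, and penalty are illustrative, not part of any real system). Two chunks with identical similarity scores rank identically under flat retrieval, even though one is stale and has since been contradicted:

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class Chunk:
    text: str
    similarity: float          # cosine similarity to the query
    published: datetime
    contradicted: bool = False

now = datetime(2026, 4, 1)
chunks = [
    Chunk("Actor X momentum building", 0.91, now - timedelta(days=90), contradicted=True),
    Chunk("Actor X momentum stalling", 0.91, now - timedelta(days=7)),
]

# Chunk-first ranking: both retrieve equally; the contradiction is invisible.
flat = sorted(chunks, key=lambda c: c.similarity, reverse=True)
assert flat[0].similarity == flat[1].similarity

# A field-aware prior down-weights stale or contradicted claims before ranking.
def field_score(c: Chunk) -> float:
    age_days = (now - c.published).days
    recency = 0.5 ** (age_days / 30)       # assumed 30-day half-life
    penalty = 0.5 if c.contradicted else 1.0
    return c.similarity * recency * penalty

ranked = sorted(chunks, key=field_score, reverse=True)
print(ranked[0].text)   # → Actor X momentum stalling
```

The point is not the specific decay function — it is that without some persistent encoding of recency and contradiction, the ranking has nowhere to get this information from.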
Retrieval finds evidence; field models represent the situation.

Chunk-first retrieval
· Query: Retrieve semantically similar chunks
· Relevance: Inferred after retrieval
· Situation: All content treated as equally situated
· Conflict: Contradictions surface only incidentally
· Salience: No prior — every chunk scores flat
· Time: Sequence reconstructed per query

Field-aware context
· Query: Infer field state first
· Relevance: Retrieval conditioned on current state
· Situation: Reasoning inside situational structure
· Conflict: Contradiction is a first-class property
· Salience: Encoded as pressure and dislocation
· Time: Sequence persists as a named primitive

The problem isn't that retrieval is wrong. It's that retrieval answers “what is similar?” not “what is happening?” For dynamic domains, those are different questions.

02

What a field model represents

A field model is a structured representation of what is currently happening in a domain — not just what happened. It captures the live state of competing narratives, which actors are driving them, how claims are being validated or contradicted, and where the boundaries of any current state are.

Rather than a single structure, this is better understood as a family of primitives — each representing a different type of field property:

Structural
· Named narrative threads
· Cross-actor clusters
· Coherence + phase state
Temporal
· Event sequence
· Phase transitions
· Velocity / acceleration
Relational
· Actor propagation chains
· Leadership / receiver roles
· Amplification patterns
Validation
· Narrative dislocation score
· Historical benchmark outcomes
· State classification
Trigger / Constraint
· Pressure thresholds
· Break conditions
· Setup expiry signals

Together, these primitives describe a domain that has shape — not just content. The field has leaders and followers, acceleration and resistance, coherent structures and fragile ones. The specific families a system needs will vary by domain; what matters is that the model exists before retrieval.
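One way to make the primitive families concrete is as typed records, one per family, composed into a single field-model object. All names and fields below are illustrative — a sketch of the shape, not TopicSpace's actual schema:

```python
from dataclasses import dataclass

@dataclass
class StructuralPrimitives:
    threads: list[str]               # named narrative lineages
    clusters: list[set[str]]         # cross-actor clusters
    phase: str                       # e.g. "forming", "coherent", "fragmenting"

@dataclass
class TemporalPrimitives:
    events: list[tuple[str, str]]    # (timestamp, event) sequence
    velocity: float                  # narrative pressure change per unit time

@dataclass
class RelationalPrimitives:
    propagation: list[tuple[str, str]]   # (leader, receiver) chains
    amplifiers: list[str]                # actors amplifying the narrative

@dataclass
class ValidationPrimitives:
    dislocation: float               # narrative dislocation score
    state: str                       # e.g. "story not being paid"
    benchmark: dict[str, float]      # historical outcomes keyed by state

@dataclass
class FieldModel:
    structural: StructuralPrimitives
    temporal: TemporalPrimitives
    relational: RelationalPrimitives
    validation: ValidationPrimitives
```

The design choice worth noting: each family is a separate type, so a downstream consumer can depend on one family without implicitly receiving the others.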

03

What a field model enables

Prioritization
Field pressure and dislocation rank what deserves attention before a retrieval query is formed.
Contradiction detection
Conflicting claims are properties of field structure, not incidental to individual retrieved documents.
Setup typing
States like 'unpaid narrative' or 'crowded repricing' become typed inputs to reasoning — not labels reconstructed each time.
Claim evaluation
A claim about momentum can be tested against pressure velocity, dislocation, and historical benchmark outcomes.
Validation paths
Break conditions define when a state is invalidated — giving the system an observable signal, not just a stale classification.
Retrieval conditioning
Field state gates what evidence is relevant, reducing noise at the retrieval step rather than filtering it out downstream.

Each capability follows from having a representation of field state — not from better retrieval. They are transformations of the reasoning process.

There is a second point, less obvious: a field model is valuable not only because it introduces higher-order primitives, but because it separates different interpretive functions. Dynamic reasoning gets harder when historical tendency, current interpretation, recent change, and the conditions for resolution are all rendered by the same surface. A system that conflates them can appear informative while making each individual claim difficult to evaluate. Keeping them distinct is not a presentation preference — it is a representational constraint. Each layer answers a different question; together, they constitute a situation rather than a summary.

· Historical benchmark: How has this state tended to resolve?
· Setup classification: What kind of situation is this right now?
· Transition signal: What just changed in the setup?
· Evidence / drivers: What is producing the current state?
· Event trail: What happened recently?
· Validation path: What confirms or breaks the current state?

The specific layers a system needs will vary by domain. What matters is that they are defined separately and held separately — so that a reasoning layer can use them individually, not only in combination.
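The separation constraint above can be sketched as a context object whose layers are distinct, individually addressable fields — a reasoning step can request one layer without receiving a blended summary. The field names and example values are hypothetical:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SituationContext:
    historical_benchmark: str   # how has this state tended to resolve?
    setup_classification: str   # what kind of situation is this now?
    transition_signal: str      # what just changed?
    evidence_drivers: str       # what is producing the current state?
    event_trail: str            # what happened recently?
    validation_path: str        # what confirms or breaks the state?

    def layer(self, name: str) -> str:
        """Fetch a single layer by name — never a merged rendering."""
        return getattr(self, name)

ctx = SituationContext(
    historical_benchmark="this state has tended to resolve constructively",
    setup_classification="unpaid narrative",
    transition_signal="narrative pressure accelerated this week",
    evidence_drivers="earnings coverage plus partner announcements",
    event_trail="three confirming events in the last ten days",
    validation_path="breaks if price fails to follow within the setup window",
)
print(ctx.layer("setup_classification"))   # → unpaid narrative
```

`frozen=True` is a deliberate choice here: the layers are held separately and immutably, so no consumer can quietly collapse them back into one surface.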

04

Retrieval policy as one downstream use

Retrieval still matters. But in a field-model architecture, it becomes a downstream policy conditioned on what the field model has already determined. Instead of “retrieve what is relevant to this query,” the policy becomes: “retrieve what is relevant to this query, given that the field is in state X, under narrative Y, with actor Z showing elevated pressure.”

Evidence Layer: events · prices · filings · coverage
Field Model: structures · states · pressure · benchmarks
Retrieval Policy: conditioned on field state
Reasoning Layer: synthesis · claims · outputs

The field model sits between evidence ingestion and retrieval — not after it.

The field model doesn't replace retrieval. It gives retrieval a better question to answer.
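A sketch of retrieval as a downstream policy, assuming a hypothetical `vector_search` retriever and invented setup names and thresholds — the structure, not the specifics, is the point: field state gates filters and query expansion before any similarity search runs.

```python
def conditioned_retrieval(query, field_state, vector_search):
    """Retrieve evidence for `query`, given the field model's current state.

    `field_state` is a dict produced upstream by the field model;
    `vector_search(expansions, filters=...)` stands in for any retriever.
    """
    filters = {}
    expansions = [query]

    # Gate evidence by setup type: an unpaid-narrative setup wants recent
    # confirming or contradicting evidence, not the full topical corpus.
    if field_state["setup"] == "unpaid narrative":
        filters["max_age_days"] = 30
        expansions.append(f"{field_state['narrative']} price confirmation")
    elif field_state["setup"] == "crowded repricing":
        filters["max_age_days"] = 14
        expansions.append(f"{field_state['narrative']} positioning unwind")

    # Elevated-pressure actors are prioritized at retrieval, not post-hoc.
    if field_state.get("pressure", 0.0) > 0.7:
        filters["actors"] = field_state["actors"]

    return vector_search(expansions, filters=filters)
```

Compare this to the chunk-first version, which would call `vector_search([query])` with no filters: the policy above asks the same retriever a sharper question, which is exactly the division of labor the architecture proposes.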

05

Worked example: the AI ecosystem

TopicSpace tracks narratives across ~36 actors in the AI and adjacent technology ecosystem. The system evolved iteratively — each stage adding field structure that changed what could be reasoned about.

1. Actor-first: Individual narrative pressure. Coverage volume, signal density, trend direction per actor.
2. Cluster-first: Cross-actor co-movement. Storm detection groups actors sharing a narrative event surge.
3. Structure-first: Named narrative lineages with phase tracking. Cross-cluster coherence and persistence.
4. State benchmarking: Historical outcomes per state. What has 'story not being paid' resolved to? Excess returns by state.
5. Setup classification: Typed inference from state + benchmark + dislocation. Unpaid narrative. Crowded repricing. Fragile squeeze.

The progression wasn't planned in advance. Each stage was added to answer a question the previous stage couldn't — which is a reasonable description of how field models get built in practice.

But there was a second lesson embedded in the process: adding more layers wasn't sufficient on its own. Each layer had to be given a distinct interpretive function. Early representations tended to let multiple surfaces handle the same kind of meaning — historical tendency, current situation, recent change, and forward path could all appear in the same context block, restating each other. The improvement came from enforcing separation: what has this state historically meant; what kind of situation is this now; what just changed; what would resolve it. When these are kept distinct, each one becomes independently evaluable — and the reasoning that builds on them becomes more precise.

Fig. 1
Narrative Leaderboard

Actor states, dislocation scores, and setup rankings across the tracked field.

Fig. 2
Actor Detail

Benchmarked state, setup type, narrative pressure, and validation path for a single actor.

Fig. 3
Field Structures

Named cross-actor narrative threads — higher-order field objects with coherence and phase tracking.

06

Why pricing helped

One useful discovery in building TopicSpace: adding price data made the narrative model better. Not because price is the target — but because price is an independent validation signal with no retrieval bias. In this domain, price served as an external confirmation surface; other fields may use different layers — survey data, policy text, network activity.

When narrative and price diverge, one of them is wrong about the current state. That divergence — the Narrative Dislocation Score (NDS) — became a first-class field property. It surfaced states the system hadn't been able to label before, and those states turned out to carry different implications for retrieval, synthesis, and claim evaluation.

Price confirming narrative
Story and price aligned. The field is legible and internally consistent.
Story not being paid
Strong narrative, price lagging. Positive dislocation — historically, this state has tended to resolve constructively.
Price ahead of story
Price has run past narrative support. Extended without structural backing. Follow-through has historically been weak.
Rejection of negative narrative
Price is not confirming the bearish story. Sector benchmark context determines whether the rejection holds.
Early confirmation
Price starting to follow narrative. Direction is forming but not yet clean — needs follow-through.

“Story not being paid” and “price ahead of story” are structurally different situations — and they require different reasoning, not just different documents. The divergence is information; encoding it explicitly, rather than letting it average out in retrieval, is what makes it useful.
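A toy version of the dislocation scoring and the state taxonomy above, assuming narrative pressure and price momentum are each normalized to a common scale. The formula and threshold are illustrative stand-ins, not the system's actual NDS definition:

```python
def dislocation(narrative_pressure: float, price_momentum: float) -> float:
    """Positive: story ahead of price. Negative: price ahead of story.
    Inputs are assumed normalized to a comparable scale."""
    return narrative_pressure - price_momentum

def classify(narrative_pressure: float, price_momentum: float,
             threshold: float = 0.3) -> str:
    """Map a dislocation score to a named state (toy thresholding)."""
    nds = dislocation(narrative_pressure, price_momentum)
    if abs(nds) <= threshold:
        return "price confirming narrative"
    if nds > threshold:
        return "story not being paid"     # strong narrative, price lagging
    return "price ahead of story"         # price extended past narrative support

print(classify(0.9, 0.2))   # → story not being paid
print(classify(0.3, 0.8))   # → price ahead of story
```

Even in this reduced form, the two divergence directions map to different named states rather than to a single "mismatch" score — which is what lets downstream reasoning treat them as structurally different situations.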

07

Limits and open problems

Field models have their own failure modes:

· False coherence: A field model can produce confident-looking structure from noisy inputs. The system appears to know what's happening when the underlying signals are weak or contradictory.
· Ontology fragility: The primitive families have to be defined by someone. That design encodes assumptions about what counts as a narrative, a cluster, a phase. Wrong assumptions produce confidently wrong models.
· Weak joins: Connecting events to narratives, actors to states, and claims to validation paths requires entity resolution that is hard to get right at scale. Errors compound.
· Duplication: Multiple overlapping narratives can represent the same event differently. Without deduplication, the model perceives more coherence than actually exists.
· Sparse coverage: Field models degrade on thin actors or events. The primitive structure requires rendering a state even when the data is insufficient to support one.
· Evaluation difficulty: How do you know whether the field model is correct? The absence of ground truth for narrative states makes calibration genuinely hard — standard held-out validation doesn't translate cleanly.
08

Closing

Retrieval-augmented generation is a memory architecture — it gives a model access to relevant information it wouldn't otherwise have. Field models are a situational awareness architecture — they give a model a representation of what is currently happening, not just what has been written about. Both are useful. They solve different problems.

A related point: the value of a field model is not only in what it adds, but in what it separates. Historical base rates, current interpretation, recent change, and the conditions that would resolve a state are different kinds of meaning. Systems that conflate them produce context that is dense but difficult to evaluate. Systems that keep them distinct give a reasoning layer something it can actually work with — each piece answering a different question, together constituting a situation rather than a summary.

Better retrieval improves what you can recall. A field model may improve what you can reason about. For dynamic domains, the second problem is at least as hard as the first.

TopicSpace tracks narrative structure in the AI and adjacent technology ecosystem. This piece reflects ongoing research — not a completed system.