expectation-field replay
When I first started building topicspace, the idea was not to generate more questions. It was to maintain continuity of understanding in a changing field. The expectation-field replay is the first version of that idea that actually feels like the original concept.
A lot of systems either reset their view every day, lock onto one thesis and quietly drift, or accumulate notes without showing how the underlying understanding changed. What I wanted instead was a system that could make its expectations explicit, carry them forward through time, and show when they weakened, split, or were retired.
The setup is simple. At T0, the system forms an initial expectation from the field as it exists that day. Then it walks forward day by day, updating that expectation deterministically against the evolving field. If the structural support holds, the expectation stays alive. If support weakens, it enters a weakening state. If the field clearly splits, narrower child expectations can spawn. If support breaks badly enough for long enough, the expectation is retired. The important point is that the system is no longer just producing daily prose. It is maintaining an explicit expectation object with a trajectory.
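The lifecycle described above can be sketched as a small state machine. Everything here (class names, thresholds, the retire rule) is an illustrative assumption, not topicspace's actual implementation:

```python
from dataclasses import dataclass
from enum import Enum

class Status(Enum):
    ACTIVE = "active"
    WEAKENING = "weakening"
    RETIRED = "retired"

@dataclass
class Expectation:
    handle: str          # stable id, e.g. "E-001"
    claim: str           # the forward claim formed at T0
    status: Status = Status.ACTIVE
    weak_days: int = 0   # consecutive days below the support threshold

def step(exp, support, split_detected, weaken_at=0.5, retire_after=10):
    """One daily, deterministic update against the field.

    `support` is the day's structural support in [0, 1]; the thresholds
    are placeholders, not the real ones. Returns any spawned children."""
    if exp.status is Status.RETIRED:
        return []                       # retired expectations stay retired
    if support < weaken_at:
        exp.weak_days += 1
        exp.status = (Status.RETIRED if exp.weak_days >= retire_after
                      else Status.WEAKENING)
    else:
        exp.weak_days = 0
        exp.status = Status.ACTIVE
    children = []
    if split_detected and exp.status is not Status.RETIRED:
        # A clear split in the field spawns a narrower child expectation.
        children.append(Expectation(exp.handle + ".child", "narrower claim"))
    return children
```

The point of the sketch is only that the object carries state forward: weakening accumulates, recovery resets it, and retirement is a consequence of sustained broken support rather than a one-day event.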
The two expectations being tracked
The replay runs against tracked expectations that topicspace already maintains. Each expectation has a stable handle, a cohort of actors it covers, and at any point in time a forward claim the system has formed about what the field implies next. The two families I ran this experiment on are the ones that have produced the most structural movement over the last six months.
E-001 · AI hardware continues to outperform software.
15 actors, mostly chip/server/infrastructure: NVDA, AMD, SMCI, DELL, INTC, ARM, AVGO, TSM, ASML, MU, MRVL, VRT, ANET, VST, CEG. The expectation tracks whether the hardware/software bifurcation measured in the R-001 case study is holding, widening, or reversing.
E-001 · T0 expectation · 2025-11-19
Prediction: the hardware/software gap will persist, with hardware showing early bullish signs and software facing bearish pressures.
Grounded in: bullish signals from ASML, AVGO, CEG, VRT, and VST; software under pressure with CRM showing downside confirmation; mixed overall outlook with a slight bullish tilt.
E-002 · The loud divergence cluster does not resolve downward.
A cohort of names where narrative pressure has been positive but price has been weak — the “story not being paid” cluster: ANET, VST, NFLX, TTD, MSFT, META, PLTR, CEG, plus two recent additions (ZETA, MELI) not yet in the backtest history. The expectation tracks whether this divergence eventually resolves with price moving down to meet the narrative, or whether the narrative deflates instead.
E-002 · T0 expectation · 2025-11-19
Prediction: the divergence will not resolve downward in the near term; the narrative-price gap persists rather than snapping shut.
Grounded in: a neutral to slightly bullish field — most cohort members in a neutral state and two showing bullish trends, with no clear downward shift in the narrative.
These are the two objects whose evolution the replay tracks. Each starts as one expectation. Over the next 61 trading days each gets re-checked daily against the structural field, can spawn narrower child expectations when the field splits, and can weaken or be retired when the structure no longer supports it.
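That daily walk can be written as a minimal replay driver. The evaluator hooks here are stand-ins I've assumed for illustration: `support_fn(exp, day)` returns structural support in [0, 1], and `split_fn(exp, day)` returns any narrower children the field implies that day. Neither is a real topicspace interface, and the thresholds are invented:

```python
def replay(parent, days, support_fn, split_fn, weaken_at=0.5, retire_after=10):
    """Walk forward day by day, re-checking every live expectation.

    Returns a flat event log of (day, expectation id, state)."""
    live = {parent["id"]: parent}
    events = []
    for day in days:
        for exp in list(live.values()):     # snapshot: children start next day
            if support_fn(exp, day) < weaken_at:
                exp["weak_days"] = exp.get("weak_days", 0) + 1
                state = ("retired" if exp["weak_days"] >= retire_after
                         else "weakened")
            else:
                exp["weak_days"] = 0
                state = "stable"
            if state != "retired":
                for child in split_fn(exp, day):
                    live[child["id"]] = child
                    state = "split"
            events.append((day, exp["id"], state))
            if state == "retired":
                del live[exp["id"]]         # retired expectations leave the walk
    return events
```

The event log is the trajectory: one row per live expectation per trading day, which is exactly the shape the lifecycle summaries below are computed from.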
What a single trajectory looks like
I ran this replay first on one family — the hardware/software gap (E-001) — over 61 trading days from November 19, 2025 to February 18, 2026. The result was not a noise cloud. It was one initial expectation, six child expectations, and a readable lifecycle.
Here is what the parent expectation lived through, one cell per trading day:
Stable days made up 44% of the window, weakened 33%, strengthened 17%, split 4%, and retired 3%. That is much closer to evolving understanding than to repeated rephrasing. The parent stayed alive through the full window, weakened in January, and partially recovered. Five of the six children were retired; one survived in weakening status at the end.
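Those shares fall straight out of a day-per-cell trajectory. The counts below are illustrative, chosen only to roughly approximate the reported mix over a 61-day window, not taken from the actual run:

```python
from collections import Counter

# One state per trading day (illustrative counts, not the actual run).
days = (["stable"] * 27 + ["weakened"] * 20 + ["strengthened"] * 10
        + ["split"] * 2 + ["retired"] * 2)          # 61 trading days

# Percentage share of each state across the window.
shares = {state: round(100 * n / len(days))
          for state, n in Counter(days).items()}
print(shares)
```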
Broad expectations are durable; narrow ones are mostly noise
The expectation graph for E-001 makes the difference between the parent and its children visible. The parent runs the full window. Most children spawn, struggle, and fade.
Read down the list and a pattern surfaces: nearly every split was a slightly different attempt to name “the subset that is more bullish than the rest of hardware.” The system kept proposing narrower sub-cohorts of the same underlying claim — energy-and-infrastructure leadership, custom-silicon leadership, memory-and-server leadership — and the field refused to sustain any of those distinctions for long. The parent (hardware vs software) was real. The children were over-refinements.
That same pattern showed up across both families. In a forward-correctness overlay, the initial T0 expectations of both major families held up much better than the narrower derived children. E-001’s parent survived 60 days with weakening status but stayed alive. E-002’s parent survived 60 days as active and never weakened materially. The broadest framings the system formed from the full field were the most stable objects in the whole replay.
The narrower children fared much worse. Across E-001 and E-002, nine children were born and seven were retired within the replay windows. That matches the earlier candidate backtests: the system is good at seeing structural differences, but many of the narrower splits it can name do not remain durable for long. The replay makes that visible in a cleaner way than the question graph ever did. The parent holds the thread; the children are attempts at refinement, and most of those refinements fail.
The two families behave differently
Comparing the two parent expectations side by side shows that the hardware/software family is structurally noisier than the divergence-cluster family. The parent for E-002 spent more days stable, fewer days weakened, and survived the full window without ever materially weakening.
This is the kind of structural difference the replay can show that the daily briefing cannot. The two families look superficially similar on any given day, but their trajectories are markedly different. E-001 is an expectation whose underlying field legitimately changes a lot — it splits often, weakens often, recovers often. E-002 is more cohesive.
Weakening is an early-warning signal, not a label
A third finding is that weakening is not just a label; it behaves like an early-warning signal. For expectations that were eventually retired, the weakened event preceded retirement by a measurable margin: about 19.8 trading days on average in E-001 and 13.5 in E-002 — roughly two to three trading weeks, and the pattern held across both families and both starting dates.
That does not make weakening a perfect predictor. The observed lead-time distribution ranged from a few days to several weeks. But it does suggest the system is doing more than labelling a retirement after the fact. It is picking up deterioration before the expectation fully breaks.
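The lead-time measurement itself is simple once you have an event log. The events and day indices below are made up for illustration; only the shape of the computation is the point:

```python
# Illustrative event log: (expectation id, event, trading-day index).
events = [
    ("E-001.c1", "weakened", 12), ("E-001.c1", "retired", 31),
    ("E-001.c2", "weakened", 20), ("E-001.c2", "retired", 41),
]

first_weakened = {}
lead_times = []
for exp_id, kind, day in events:
    if kind == "weakened":
        # Keep only the FIRST weakened event per expectation.
        first_weakened.setdefault(exp_id, day)
    elif kind == "retired" and exp_id in first_weakened:
        lead_times.append(day - first_weakened[exp_id])

avg_lead = sum(lead_times) / len(lead_times)   # in trading days
```

Measuring from the first weakened event is a deliberate choice: it asks how early the earliest warning arrived, not how long the final decline took.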
The pattern held at a second starting date
The most important question after the first run was whether these results were artifacts of one T0. I reran the replay starting February 10, 2026 for both families. The broad patterns held.
| run | born | durable | weakened | retired | incomplete | % durable @ 30D |
|---|---|---|---|---|---|---|
| E-001 · T0 Nov 19 | 7 | 1 | 1 | 0 | 5 | 50% |
| E-001 · T0 Feb 10 | 9 | 2 | 1 | 0 | 6 | 67% |
| E-002 · T0 Nov 19 | 4 | 1 | 0 | 0 | 3 | 100% |
| E-002 · T0 Feb 10 | 3 | 0 | 0 | 0 | 2 | 0% |
E-001 consistently produced more expectations than E-002 across both T0s, which suggests the hardware/software family is structurally noisier while the divergence-cluster family is more cohesive. All four parent expectations across the two families and two T0s survived their replay windows alive. The children remained much noisier than the parents.
The architecture-level result appears reproducible: the system can maintain broad expectation continuity over time, but its narrower descendants are far less reliable.
What this is and isn't
That reproducibility is the part that feels important to me. The original idea was never “can we generate a lot of clever follow-on questions?” It was “can we make the system’s evolving expectations observable?” The expectation-field replay is the first artifact that actually answers that question.
It lets me ask: what did topicspace believe about the hardware/software gap on each day in this window, when did that belief weaken, when did it split, which narrower interpretations survived, and which were retired? I could not ask that cleanly in the question-graph version, because the graph was centered on questions and candidates rather than on the expectation itself. Re-centering expectations as first-class objects was the necessary correction.
What this does not prove is that the expectations were externally right. The current evaluator tests structural persistence, not truth. It can tell me that an expectation stayed coherent, that weakening preceded retirement, and that broad frames survived where narrow ones did not. It does not yet tell me whether surviving expectations tracked reality well enough to be useful beyond internal coherence. That remains the next layer. The current system is better at maintaining an evolving belief trace than at proving the external correctness of that belief trace.
Still, that is enough to say something meaningful. topicspace now appears capable of maintaining continuity of understanding across changing conditions in a way that is explicit, inspectable, and falsifiable. The strongest objects are broad expectations formed at T0. They can weaken, recover, and spawn narrower children without being silently overwritten. The weaker objects are the children themselves, many of which do not last. That may not be a flaw. It may be the honest shape of belief in a noisy field: durable broad frames surrounded by many short-lived attempts at refinement.
The expectation-field replay suggests that topicspace is better at maintaining broad, evolving expectations over time than at producing many durable narrow refinements — and that is much closer to the original idea than the question graph ever was.
Epilogue · where these expectations stand today
Nearly six months after the original T0, both initial expectations have now broken. More importantly, they broke in different enough ways that the difference is itself a finding. The Feb. 10 T0 ran forward into late April and early May; this is what the two cohorts look like as of 2026-05-12.
E-001 broke compositionally
The original framing was broad: hardware would outperform software. By the end of the Feb. 10 replay window, the expectation’s supporting share had fallen from 1.00 to 0.07. Only CEG still backed the original frame.
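If supporting share means the fraction of the cohort whose daily state still backs the original frame — that is my reading, since the post does not define it formally — then 0.07 is consistent with exactly one survivor out of fifteen:

```python
# The E-001 cohort from the case study.
cohort = ["NVDA", "AMD", "SMCI", "DELL", "INTC", "ARM", "AVGO", "TSM",
          "ASML", "MU", "MRVL", "VRT", "ANET", "VST", "CEG"]

# By the end of the Feb. 10 window, only CEG still backed the frame.
still_backing = {"CEG"}

supporting_share = round(len(still_backing) / len(cohort), 2)   # 1/15
```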
What happened is more interesting than simple failure. Hardware did not lose as a bloc. It split internally.
The names that anchored the bullish T0 framing — the AI-infrastructure complex of ASML, AVGO, ANET, VST, CEG, and MRVL — are now mostly in DIVERGENCE or REPRICING: positive narrative, negative price. Meanwhile the names that were neutral or weak at T0 — MU, INTC, AMD, DELL, and SMCI — are the ones actually leading on price.
So E-001 was directionally right but compositionally wrong. Hardware did outperform software on average. But the subset that led was not the one the original expectation implied. The actually-profitable slice turned out to be memory, CPU, and server names where the AI narrative attached late and weakly, while the most narratively confident AI-infrastructure names became the ones the market was least willing to pay for.
E-002 broke directionally
E-002 failed more cleanly. The original framing was that the divergence cluster would persist rather than resolve downward. Instead, the supporting share fell from 1.00 to 0.38, and every member of the cohort is now negative on relative return. The narrative-price gap closed by price falling to meet the narrative, not by the narrative being rewarded.
Software is now the cohort producing the clearest sustained negative-confirming pattern in the system. CRM has moved into NEG_CONFIRMATION. MSFT, META, NFLX, PLTR, and TTD all remain in DIVERGENCE with falling price. Whether that reflects structural weakness or an overshoot in sentiment is a separate question. What matters here is that the market has behaved for months as if software is the weaker side of the field. Persistence is not correctness. But it is not nothing either.
How they broke matters more than that they broke
For the experiment, this is the cleanest possible outcome: not that the system was right, but that it stayed coherent long enough to be wrong in a specific, decomposable way.
We can now say how each expectation failed:
- E-001 broke compositionally — hardware outperformed in aggregate, but the wrong subset led
- E-002 broke directionally — the divergence resolved downward, exactly the path the original framing treated as unlikely
That is the value of an honest expectation system. It does not just tell you that a view failed. It tells you which part failed, when it failed, and what replaced it. Knowing how an expectation broke is more useful than knowing only that it broke.