On a phone, the grid is still usable, but it reads best if you start with the guide below and then scroll the matrix horizontally.

Grid FAQ

What are we simulating?

We imagine an AI operator is training some machine learning model on different slices of data. There is a small set of data objects to choose between, named A, B, C, D, and so on. The operator will train a model on some slice of training data (e.g., A, B, and C) and then evaluate on a set of data (e.g., just A and B).

How do I read one cell?

Rows show different training scenarios. Columns are evaluation scenarios. One cell means: train on the row world, then evaluate on the column slice.
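As a rough illustration of how rows and columns line up, the sketch below enumerates every subset of a three-item universe in the order the grid appears to use (by size, then alphabetically) and addresses one cell as a (train, eval) pair. The helper names `subsets_in_grid_order` and `label` are hypothetical, and the ordering rule is inferred from the displayed labels rather than taken from the page's code.

```python
from itertools import combinations

def subsets_in_grid_order(universe):
    """All subsets of `universe`, ordered by size and then alphabetically."""
    items = sorted(universe)
    return [frozenset(combo)
            for size in range(len(items) + 1)
            for combo in combinations(items, size)]

def label(subset):
    """Render a subset as joined letters, using '∅' for the empty set."""
    return "".join(sorted(subset)) or "∅"

worlds = subsets_in_grid_order("ABC")
labels = [label(s) for s in worlds]

# One cell is a (train row, eval column) pair, addressed by index.
train_index, eval_index = 4, 1
cell = (labels[train_index], labels[eval_index])
print(labels)
print(cell)
```

Here index 4 picks the train world AB and index 1 picks the eval slice A, so the cell reads "train on AB, then evaluate on A".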

How do I use this page?

Pick a question family and a cell score, then use the method-specific controls (click cells in the grid or the buttons below it) to explore the data-counterfactual measurements in that family.

What questions can I ask here?

For now: direct cell reading, leave-one-out, group leave-one-out, Shapley-style values, scaling, toy privacy, toy unlearning, and toy poisoning.
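To make two of these question families concrete, here is a minimal sketch of leave-one-out and Shapley-style values computed over a toy cell score. It uses the textbook definitions (score with the item minus score without it, and average marginal contribution over all orderings) with a Jaccard-overlap score standing in for model performance. The function names are hypothetical, and the page's own implementation may weight or normalize things differently.

```python
from itertools import permutations

def jaccard(train, evaluation):
    """Toy cell score: overlap of the train and eval sets."""
    train, evaluation = set(train), set(evaluation)
    union = train | evaluation
    if not union:
        return 1.0  # assumed convention for two empty sets
    return len(train & evaluation) / len(union)

def leave_one_out(item, train, evaluation, score=jaccard):
    """Score with `item` in the train set minus score without it."""
    train = set(train)
    return score(train, evaluation) - score(train - {item}, evaluation)

def shapley_values(universe, evaluation, score=jaccard):
    """Average marginal contribution of each item over all orderings."""
    items = sorted(universe)
    totals = {item: 0.0 for item in items}
    orderings = list(permutations(items))
    for ordering in orderings:
        coalition = set()
        for item in ordering:
            before = score(coalition, evaluation)
            coalition.add(item)
            totals[item] += score(coalition, evaluation) - before
    return {item: total / len(orderings) for item, total in totals.items()}

loo_a = leave_one_out("A", "ABC", "ABC")
values = shapley_values("ABC", "ABC")
print(round(loo_a, 2), {k: round(v, 2) for k, v in values.items()})
```

With the eval slice fixed to ABC, dropping A from the train world ABC moves the toy score from 1.00 to 0.67, and the three Shapley-style values sum to the difference between the full and empty train worlds.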

1. Choose what you're exploring

Pick the question lens and the cell score, then choose a train row and eval slice below.

Question family
Explore
Lens
What does this question ask?
Read one train/eval cell directly.
Cell score
Jaccard overlap
Toy proxy
What does this cell score mean?
Each cell is a toy proxy for "retrain on A and evaluate on A." Jaccard summarizes how much the train and eval sets structurally line up, which can loosely track performance when overlap helps. It still ignores labels, features, model choice, and optimization, so treat it as a heuristic rather than true accuracy.
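The Jaccard score described here can be sketched in a few lines: the size of the intersection of the train and eval sets divided by the size of their union. The function name and the string-based set notation are illustrative assumptions. Returning 1.0 when both sets are empty is also an assumption, chosen because it matches the value the grid shows for the empty train row against the empty eval column.

```python
def jaccard(train, evaluation):
    """Jaccard overlap |train ∩ eval| / |train ∪ eval| between two sets."""
    train, evaluation = set(train), set(evaluation)
    union = train | evaluation
    if not union:
        # Both sets empty: treated as a perfect match (assumed convention).
        return 1.0
    return len(train & evaluation) / len(union)

# A few cells, written as (train, eval) pairs of item letters.
examples = [("A", "A"), ("A", "AB"), ("AB", "AC"), ("AB", "ABC"), ("", "A")]
scores = {pair: round(jaccard(*pair), 2) for pair in examples}
for (train, evaluation), value in scores.items():
    print(f"train={train or '∅'} eval={evaluation or '∅'} score={value:.2f}")
```

Because the score only compares set membership, swapping the train and eval sets gives the same value, which is why the grid is symmetric.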
Display settings: raw values on, all eval cols
Raw values are visible, so each cell shows its numeric score as well as its color. All evaluation subsets are shown.
Walk me through an example for this question family: 1 scene for Explore (1 preset)
Scenes for Explore preload a useful train/eval pair and the mode-specific controls.
2. Counterfactual grid

Choose a train row and eval slice.

Grid controls
Rows: train. Cols: eval.
3
Train \ Eval  0 ∅    1 A    2 B    3 C    4 AB   5 AC   6 BC   7 ABC
0 ∅           1.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00
1 A           0.00   1.00   0.00   0.00   0.50   0.50   0.00   0.33
2 B           0.00   0.00   1.00   0.00   0.50   0.00   0.50   0.33
3 C           0.00   0.00   0.00   1.00   0.00   0.50   0.50   0.33
4 AB          0.00   0.50   0.50   0.00   1.00   0.33   0.33   0.67
5 AC          0.00   0.50   0.00   0.50   0.33   1.00   0.33   0.67
6 BC          0.00   0.00   0.50   0.50   0.33   0.33   1.00   0.67
7 ABC         0.00   0.33   0.33   0.33   0.67   0.67   0.67   1.00
3. Selection workspace

Read, compare, and adjust one anchored train/eval pair.

Current reading

Read one train/eval cell

Cell score: 1.0000

What does it mean to train on A and evaluate on A?

Selected state
Explore, Train A, Eval A, Jaccard
Training on A and evaluating on A gives 1.000 on the current toy score.
Cell score
1.000
f(A, A)
Train world
A
The selected row determines which world the toy model trains on.
Eval slice
A
The selected column determines which slice gets evaluated.
Mode controls

Explore

Read one train/eval cell directly.

How to use this mode
Click any cell to set the train/eval pair. Explore mode keeps the question local instead of aggregating across many worlds.
Train / eval jump
Jump straight to the anchored train row and evaluation slice you want to read.
Grid markers

Mark the cells you want to talk about.

Click any cell to read it directly as one train/eval world pair.
Comparison marker
Meaningful here: yes. In Explore mode a comparison marker is a true second cell for side-by-side reading.
Any second cell makes sense because you are contrasting two train/eval world pairs directly.
No comparison cell marked yet.
Full JSON: current explorer state
{
  "conceptMode": "explore",
  "tutorial": null,
  "universe": [
    "A",
    "B",
    "C"
  ],
  "metric": "jaccard",
  "realData": null,
  "covertypeDomains": null,
  "palette": "Clear daylight",
  "focus": "A",
  "focusSet": [
    "A"
  ],
  "baselineTrain": {
    "index": 1,
    "set": [
      "A"
    ]
  },
  "evalColumn": {
    "index": 1,
    "set": [
      "A"
    ]
  },
  "edits": {
    "poison": false
  },
  "betaShape": {
    "alpha": 2,
    "beta": 2
  },
  "dp": {
    "epsilon": 1
  },
  "unlearning": {
    "tolerance": 0.15
  },
  "scalingK": 2,
  "showNumbers": true,
  "showSingletonEvalCols": false,
  "gridView": "real",
  "rows": [
    "∅",
    "A",
    "B",
    "C",
    "AB",
    "AC",
    "BC",
    "ABC"
  ]
}
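As a hedged illustration of how an exported state like this might be consumed, the sketch below parses a trimmed copy of the JSON and recomputes the selected cell. The meaning of each field (`baselineTrain` as the selected train row, `evalColumn` as the selected eval column, `rows` as the row labels) is inferred from the names and values above, not from any documented schema.

```python
import json

# Trimmed copy of the exported state; only the fields used here are kept.
state_json = """
{
  "metric": "jaccard",
  "baselineTrain": {"index": 1, "set": ["A"]},
  "evalColumn": {"index": 1, "set": ["A"]},
  "rows": ["∅", "A", "B", "C", "AB", "AC", "BC", "ABC"]
}
"""

def jaccard(train, evaluation):
    """Jaccard overlap; two empty sets are treated as a perfect match."""
    union = train | evaluation
    return 1.0 if not union else len(train & evaluation) / len(union)

state = json.loads(state_json)
train = set(state["baselineTrain"]["set"])
evaluation = set(state["evalColumn"]["set"])

# Look up the row label by index (assumed to index into "rows").
row_label = state["rows"][state["baselineTrain"]["index"]]
score = jaccard(train, evaluation)
print(f"train row {row_label}: {state['metric']} = {score:.4f}")
```

For the state shown, this reproduces the cell score of 1.0000 reported in the selection workspace.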