On a phone, the grid is still usable, but it reads best if you start with the guide below and then scroll the matrix horizontally.

1 active cell; 63 inactive; 0 computations ready.

Mode: Explore
Score: Jaccard
Anchor: A / A
Active set: 1
Inactive: 63
Options: Jaccard; raw values on, all eval cols
Normalized overlap between train and eval worlds.

Build the active set.

Click any cell to add or remove it. Click a row or column header to toggle a whole row or column.

Data counterfactual grid

Rows are training worlds; columns are evaluation worlds.

Grid controls
Rows: train; Cols: eval; Count: 3

Smart explorer

Active cells are translated literally, then checked against every computation the grid knows how to make.

Current anchor
Train A / Eval A / Score 1.0000
f(A, A) = 1.0000
Train on A; evaluate on A. This cell is the score for that exact train/eval world.
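
As a rough illustration of the Jaccard score behind f(A, A), here is a minimal sketch that assumes each world is simply a set of item labels. The function name `jaccard` and the convention for two empty worlds are assumptions made for this example, not details taken from the page.

```python
def jaccard(train: frozenset, eval_: frozenset) -> float:
    """Normalized overlap |train & eval| / |train | eval| between two worlds."""
    union = train | eval_
    if not union:
        # Assumption: two empty worlds are treated as identical.
        return 1.0
    return len(train & eval_) / len(union)


# Train on A, evaluate on A: the worlds coincide, so the overlap is complete.
print(f"f(A, A) = {jaccard(frozenset('A'), frozenset('A')):.4f}")  # f(A, A) = 1.0000
```

Under the same assumptions, training on A and B while evaluating on A alone would score 0.5, because one of the two items in the union is shared.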

What this active set can compute

Leave-one-out (1 missing): Compare one selected train/eval world to the train world with one focus item removed. Missing 1 of 2 required cells.
Shapley value (7 missing): Average a focus item's marginal contribution across every partial training world. Missing 7 of 8 required cells.
Banzhaf value (7 missing): Use the Shapley pair universe, but weight every coalition equally. Missing 7 of 8 required cells.
Beta Shapley (7 missing): Use the same pair universe with beta-binomial coalition-size weights. Missing 7 of 8 required cells.
Scaling law (3 missing): Average all training worlds of the same size against the active eval world. Missing 3 of 3 required cells.
Eval scaling (3 missing): Average all eval worlds of the same size against the active training world. Missing 3 of 3 required cells.
Diagonal scaling (3 missing): Average worlds where the training and evaluation subsets grow together. Missing 3 of 3 required cells.
Budgeted subset scan (3 missing): Find the highest-scoring training world of a fixed size against the active eval world. Missing 3 of 3 required cells.
Unlearning audit (1 missing): Compare the selected world to the exact retrain world without the focus item. Missing 1 of 2 required cells.
Eval value (not available): A is already present in eval A.
Group leave-one-out (not available): Choose at least two focus items for a coalition.
Pair interaction (not available): Choose two focus items for an interaction term.
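
The leave-one-out, Shapley, and Banzhaf entries above all difference pairs of cells: a training world with the focus item and the same world without it. The sketch below shows one way to compute them, assuming a Jaccard score over sets of item labels and a three-item universe. All function names here are hypothetical and the page's own implementation may differ. Beta Shapley is left out.

```python
from itertools import combinations
from math import factorial


def jaccard(train, eval_):
    # Assumed score: set overlap; two empty worlds count as identical.
    union = train | eval_
    return len(train & eval_) / len(union) if union else 1.0


def pairs(items, focus):
    """Yield (S, S plus focus) for every coalition S that excludes the focus item."""
    rest = [x for x in items if x != focus]
    for k in range(len(rest) + 1):
        for combo in combinations(rest, k):
            s = frozenset(combo)
            yield s, s | {focus}


def leave_one_out(train, eval_, focus, score=jaccard):
    """Score of the selected world minus the score with the focus item removed."""
    return score(train, eval_) - score(train - {focus}, eval_)


def shapley_weight(k, n):
    # Classic Shapley weight for a coalition of size k out of n items.
    return factorial(k) * factorial(n - k - 1) / factorial(n)


def banzhaf_weight(k, n):
    # Banzhaf weights every coalition equally.
    return 1 / 2 ** (n - 1)


def semivalue(items, eval_, focus, weight, score=jaccard):
    """Weighted sum of the focus item's marginal contributions over all pairs."""
    return sum(
        weight(len(s), len(items)) * (score(with_focus, eval_) - score(s, eval_))
        for s, with_focus in pairs(items, focus)
    )


items = ["A", "B", "C"]
eval_world = frozenset("A")
print(leave_one_out(frozenset("A"), eval_world, "A"))
print(semivalue(items, eval_world, "A", shapley_weight))
print(semivalue(items, eval_world, "A", banzhaf_weight))
print(2 * len(list(pairs(items, "A"))))  # 8 cells: 4 coalitions, with and without A
```

With three items, the focus item has four coalitions to join, so the pair universe contains eight cells, which is consistent with the "of 8 required cells" counts shown above.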
Inspect state: selection and query JSON
{
  "appMode": "explore",
  "queryConcept": "loo",
  "metric": "jaccard",
  "count": 3,
  "train": "A",
  "eval": "A",
  "focusSet": [
    "A"
  ],
  "selectedCells": [
    "1:1"
  ],
  "activePlan": {
    "id": "loo",
    "status": "partial",
    "value": 1,
    "requiredCells": [
      {
        "rowIndex": 1,
        "colIndex": 1
      },
      {
        "rowIndex": 0,
        "colIndex": 1
      }
    ]
  }
}
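
A state object like the one above can be checked for missing evidence cells with a few lines of code. This sketch assumes that each `selectedCells` entry encodes `rowIndex:colIndex`, which the page does not state explicitly. The helper name `missing_cells` is hypothetical.

```python
import json

# A trimmed copy of the state shown above, keeping only the fields used here.
state = json.loads("""
{
  "selectedCells": ["1:1"],
  "activePlan": {
    "id": "loo",
    "status": "partial",
    "requiredCells": [
      {"rowIndex": 1, "colIndex": 1},
      {"rowIndex": 0, "colIndex": 1}
    ]
  }
}
""")


def missing_cells(state):
    """Return required cells that are not yet selected, as (row, col) tuples."""
    # Assumption: a selected cell key is "rowIndex:colIndex".
    selected = {
        tuple(int(part) for part in key.split(":"))
        for key in state["selectedCells"]
    }
    required = [
        (cell["rowIndex"], cell["colIndex"])
        for cell in state["activePlan"]["requiredCells"]
    ]
    return [cell for cell in required if cell not in selected]


print(missing_cells(state))  # [(0, 1)]
```

One required cell is still unselected, which agrees with the "partial" status and the leave-one-out entry reporting 1 of 2 required cells missing.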
Grid FAQ
6 answers

What are we simulating?

We imagine an AI operator training a machine learning model on different slices of data. There is a small set of data objects to choose from, named A, B, C, D, and so on. The operator trains a model on one slice (e.g., A, B, and C) and then evaluates it on another (e.g., just A and B). Any given object can be treated as training data, evaluation data, both, reserved for a secure holdout, or unavailable.

How do I read the grid?

Rows show different training scenarios. Columns are evaluation scenarios. One cell means: train on the row world, then evaluate on the column world.
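
To make the row and column layout concrete, here is a small sketch that enumerates every world for three items and pairs them into cells. It assumes that worlds are all subsets of the items, including the empty one, and that they are ordered by size and then alphabetically. The ordering and the helper names are assumptions for illustration.

```python
from itertools import combinations


def worlds(items):
    """All subsets of items, smallest first (the ordering is an assumption)."""
    return [
        frozenset(combo)
        for k in range(len(items) + 1)
        for combo in combinations(items, k)
    ]


def label(world):
    """Readable name for a world, e.g. 'AB'."""
    return "".join(sorted(world)) or "(empty)"


items = ["A", "B", "C"]
rows = worlds(items)  # training worlds
cols = worlds(items)  # evaluation worlds
grid = [(row, col) for row in rows for col in cols]

print([label(w) for w in rows])
print(len(rows), len(grid))  # 8 64
```

Three items give 8 worlds per axis and 64 cells in total, which lines up with the page's count of 1 active and 63 inactive cells.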

How do I use this page?

Use Explore when you want to click around and build a set of evidence cells yourself. Use Compute when you want to ask for one quantity and see the exact cells that define it.

What questions can I ask here?

For now: direct cell reading, leave-one-out, evaluation value, group leave-one-out, pair interaction, Shapley value, Banzhaf value, Beta Shapley, row-size scaling, eval-size scaling, diagonal scaling, budgeted subset scans, and a simple unlearning reference comparison.
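
Two of the quantities listed above, row-size scaling and the budgeted subset scan, reduce to simple aggregations over one grid column. The sketch below assumes a Jaccard score over sets of item labels and a three-item universe. The function names are hypothetical and the page may compute these differently.

```python
from itertools import combinations
from statistics import mean


def jaccard(train, eval_):
    # Assumed score: set overlap; two empty worlds count as identical.
    union = train | eval_
    return len(train & eval_) / len(union) if union else 1.0


def worlds_of_size(items, k):
    """Every training world containing exactly k items."""
    return [frozenset(combo) for combo in combinations(items, k)]


def row_size_scaling(items, eval_, score=jaccard):
    """Mean score per training-world size against a single eval world."""
    return {
        k: mean(score(world, eval_) for world in worlds_of_size(items, k))
        for k in range(1, len(items) + 1)
    }


def budgeted_scan(items, eval_, budget, score=jaccard):
    """Highest-scoring training world of a fixed size; ties go to the first found."""
    return max(worlds_of_size(items, budget), key=lambda world: score(world, eval_))


items = ["A", "B", "C"]
eval_world = frozenset("A")

curve = row_size_scaling(items, eval_world)
best = budgeted_scan(items, eval_world, budget=2)
print(curve)
print(sorted(best), jaccard(best, eval_world))
```

With three items there are three training worlds of size one and three of size two, so each single-size average draws on three cells.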

What else could fit this grid?

Several nearby methods still fit the same subset-world picture. Good next candidates include strike curves from the full dataset down to smaller retained worlds, acquisition curves from a seed set upward, regret against the full-data row, composition-stratified scaling, Owen-style group values, sampled or truncated Shapley approximations, and simple datamodel-style response surfaces fit over the grid.

Some of these would only need new formulas over cells we already display. Others would need new controls: group partitions for Owen values, budgets for coreset scans, sampling rules for Monte Carlo Shapley, or a small regression view for datamodels.
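
As one example of what a sampling rule could look like, the sketch below estimates a Shapley value by drawing random orderings of the items and averaging the focus item's marginal contribution. This is the standard permutation-sampling estimator, offered as an assumption about how such a control might work rather than as a feature of the page. The Jaccard score and all names are likewise illustrative.

```python
import random


def jaccard(train, eval_):
    # Assumed score: set overlap; two empty worlds count as identical.
    union = train | eval_
    return len(train & eval_) / len(union) if union else 1.0


def sampled_shapley(items, eval_, focus, n_samples=2000, seed=0, score=jaccard):
    """Monte Carlo Shapley estimate from random item orderings."""
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    total = 0.0
    for _ in range(n_samples):
        order = list(items)
        rng.shuffle(order)
        # The coalition is everything that precedes the focus item in this ordering.
        before = frozenset(order[: order.index(focus)])
        total += score(before | {focus}, eval_) - score(before, eval_)
    return total / n_samples


estimate = sampled_shapley(["A", "B", "C"], frozenset("A"), "A")
print(round(estimate, 3))
```

For three items the exact value is cheap to compute, so sampling only pays off on larger grids where enumerating every coalition is impractical.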

What does not fit cleanly yet?

Some important methods live nearby but are not fully represented by this fixed subset grid. True influence functions need gradients, Hessians, and an optimizer. Differential privacy is a guarantee over release distributions and all adjacent worlds, not just a few visible cells. Poisoning needs a threat model, feasible perturbations, and an adversarial objective. Dataset distillation and condensation search over synthetic data, not only observed subsets. Active learning and experimental design need a candidate pool, label model, and acquisition policy. Curriculum learning, forgetting events, and data cartography depend on training order and trajectories. Membership inference needs an attacker and observation model.

So the grid can be a diagnostic baseline for those families, but it should not pretend to certify or replace the full method.