Glossary
Key terms used throughout the site. Definitions connect to the grid metaphor where relevant.
-
Data counterfactual
A "what if" question about training data. What would happen to model performance if we trained on different data? The grid visualization shows all possible counterfactuals: each row is a possible training set, each column is an evaluation point.
-
Influence function
A technique for estimating how much a single training example affects a model's predictions, without retraining. Approximates what would happen if you removed or upweighted that point.
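To make the estimate concrete, here is a minimal sketch for a one-dimensional least-squares model, where the classic influence formula (gradient at the test point, times inverse Hessian, times gradient at the training point) reduces to scalar arithmetic. All names are illustrative, and sign conventions vary across papers:

```python
def fit(xs, ys):
    """Closed-form least squares for the model y = w * x."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

def influence(xs, ys, i, x_test, y_test):
    """Estimate how removing training point i would change loss at (x_test, y_test)."""
    w = fit(xs, ys)
    n = len(xs)
    hessian = 2.0 * sum(x * x for x in xs) / n         # d^2/dw^2 of mean squared error
    grad_train = 2.0 * (w * xs[i] - ys[i]) * xs[i]     # loss gradient at train point i
    grad_test = 2.0 * (w * x_test - y_test) * x_test   # loss gradient at the test point
    return -grad_test * grad_train / (hessian * n)     # predicted change, no retraining
```

For example, `influence([1.0, 2.0], [1.0, 1.0], 1, 1.0, 1.0)` predicts the effect of dropping the second training point on the loss at (1.0, 1.0), without refitting.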
-
Data Shapley
A method for assigning value to each training point based on its average marginal contribution across all possible subsets. Borrowed from cooperative game theory. Computationally expensive but theoretically principled.
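The definition can be computed exactly for tiny datasets by averaging marginal contributions over every permutation; practical implementations sample permutations instead. A sketch with an illustrative toy utility (number of distinct labels covered):

```python
import itertools

def shapley_values(points, utility):
    """Exact Data Shapley by enumerating all permutations (feasible only for tiny n)."""
    n = len(points)
    values = [0.0] * n
    perms = list(itertools.permutations(range(n)))
    for perm in perms:
        prefix = []
        prev = utility(prefix)
        for idx in perm:
            prefix = prefix + [points[idx]]
            curr = utility(prefix)
            values[idx] += curr - prev   # marginal contribution of this point
            prev = curr
    return [v / len(perms) for v in values]

points = ["cat", "dog", "cat"]
util = lambda subset: len(set(subset))   # toy utility: distinct labels covered
print(shapley_values(points, util))      # → [0.5, 1.0, 0.5]
```

Note the two interchangeable "cat" points split their shared contribution equally, while the unique "dog" point gets full credit: exactly the symmetry and efficiency properties inherited from game theory.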
-
Leave-one-out
The simplest data counterfactual: compare performance with and without a single data point. In the grid, this means comparing two rows that differ by exactly one point.
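A minimal sketch of the row comparison, where the "model" is just a constant predictor (the training-set mean) and "performance" is squared error on one evaluation point. Function names are illustrative:

```python
def train_mean(train):
    """Stand-in for training: fit a constant predictor (the mean)."""
    return sum(train) / len(train)

def squared_error(pred, target):
    return (pred - target) ** 2

def loo_effect(train, i, target):
    """Compare two grid rows that differ by exactly one point: with vs. without train[i]."""
    full = squared_error(train_mean(train), target)
    without = squared_error(train_mean(train[:i] + train[i + 1:]), target)
    return full - without   # positive: removing point i improves this evaluation

print(loo_effect([1.0, 2.0, 9.0], 2, 2.0))  # → 3.75
```

Here the outlier 9.0 pulls the mean away from the target, so removing it helps, which the positive effect reflects.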
-
Coreset
A small subset of training data that approximates training on the full dataset. The goal is to find a row of the grid corresponding to a much smaller training set that lands in roughly the same performance region.
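One common selection heuristic is greedy k-center: repeatedly add the point farthest from the current subset, so the coreset covers the data's spread. A sketch for one-dimensional points (the distance function and starting point are arbitrary choices):

```python
def k_center_greedy(points, k):
    """Greedy k-center coreset selection: start from point 0, then repeatedly
    add the point farthest from everything chosen so far."""
    chosen = [0]
    while len(chosen) < k:
        def dist_to_chosen(i):
            return min(abs(points[i] - points[j]) for j in chosen)
        chosen.append(max(range(len(points)), key=dist_to_chosen))
    return sorted(chosen)

print(k_center_greedy([0.0, 0.1, 0.2, 5.0, 5.1, 10.0], k=3))  # → [0, 3, 5]
```

The three selected indices pick one representative from each cluster, skipping the near-duplicates.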
-
Data poisoning
Deliberately corrupting training data to cause targeted model failures. Expands the grid dramatically—every possible perturbation creates new rows with potentially different outcomes.
-
Backdoor attack
A type of data poisoning where a trigger pattern (e.g., a small patch in an image) causes the model to misclassify inputs containing that trigger, while behaving normally otherwise.
-
Data strike
Coordinated withholding of data by creators to reduce model performance and exert leverage over AI operators. A strategic move to a less favorable row in the grid.
-
Data leverage
The power that data creators have over AI systems by virtue of controlling training data. Leverage depends on how much performance drops when data is withheld or degraded.
-
Scaling law
An empirical relationship describing how model performance changes with data size (or parameters, or compute). In grid terms, a regression over average performance across rows grouped by size.
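Such laws are typically fit as a power law, error ≈ a · n^(−b), via linear regression in log-log space. A sketch on synthetic points that lie exactly on a known curve:

```python
import math

def fit_power_law(sizes, errors):
    """Fit error ≈ a * n^(-b) by ordinary least squares in log-log space."""
    xs = [math.log(n) for n in sizes]
    ys = [math.log(e) for e in errors]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) \
            / sum((x - mx) ** 2 for x in xs)
    return math.exp(my - slope * mx), -slope   # (a, b)

sizes = [100, 400, 1600]
errors = [4 * n ** -0.5 for n in sizes]   # synthetic: exactly error = 4 * n^-0.5
a, b = fit_power_law(sizes, errors)
print(round(a, 3), round(b, 3))  # → 4.0 0.5
```

Real measurements scatter around the line, so the fit is a summary of average behavior across rows of a given size, not a guarantee for any particular row.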
-
Differential privacy
A mathematical framework for limiting how much any single data point can affect model outputs. Constrains how far the model can move in the grid when one point changes.
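The standard building block is the Laplace mechanism: clip each value to a known range, then add noise scaled to the query's sensitivity divided by the privacy budget epsilon. A sketch for releasing a mean (parameter names are illustrative):

```python
import math
import random

def dp_mean(values, lower, upper, epsilon):
    """Laplace-mechanism sketch: release a clipped mean with noise scaled to
    its sensitivity (upper - lower) / n, bounding any one point's effect."""
    n = len(values)
    clipped = [min(max(v, lower), upper) for v in values]
    sensitivity = (upper - lower) / n
    scale = sensitivity / epsilon
    # Sample Laplace(0, scale) by inverse-CDF transform of a uniform draw.
    u = random.random() - 0.5
    noise = -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))
    return sum(clipped) / n + noise
```

Smaller epsilon means more noise and stronger privacy; in grid terms, the released number varies less between rows that differ in one point than the noise itself.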
-
Memorization
When a model stores training data verbatim rather than learning general patterns. Can enable extraction attacks where adversaries recover private training examples from model outputs.
-
Machine unlearning
Efficiently updating a model as if a data point had never been in the training set—moving from one row to another without full retraining. Motivated by privacy regulations like GDPR.
-
Active learning
Choosing which data points to label next, given a limited labeling budget. Navigation within the grid: deciding which rows become available for training.
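A common acquisition strategy is uncertainty sampling: label the points the current model is least sure about. A sketch for a binary classifier, where uncertainty is distance of the predicted probability from 0.5 (names are illustrative):

```python
def uncertainty_sampling(probs, budget):
    """Pick the `budget` unlabeled points whose predicted probability is
    closest to 0.5, i.e. where a binary classifier is least confident."""
    ranked = sorted(range(len(probs)), key=lambda i: abs(probs[i] - 0.5))
    return sorted(ranked[:budget])

print(uncertainty_sampling([0.95, 0.52, 0.10, 0.45], budget=2))  # → [1, 3]
```

The confident predictions (0.95, 0.10) are skipped; the labeling budget goes to the ambiguous ones.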
-
Data augmentation
Creating synthetic variations of training data (rotations, crops, noise, etc.). Effectively generates new rows in the grid. Mixup, Cutout, and CutMix are common techniques.
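As one concrete example, mixup forms a convex combination of two examples and their one-hot labels, with the mixing weight drawn from a Beta distribution. A sketch using feature vectors as plain lists:

```python
import random

def mixup(x1, y1, x2, y2, alpha=0.2):
    """Mixup augmentation: blend two examples and their one-hot labels
    with a weight lam ~ Beta(alpha, alpha)."""
    lam = random.betavariate(alpha, alpha)
    x = [lam * a + (1 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1 - lam) * b for a, b in zip(y1, y2)]
    return x, y
```

Each call produces a new synthetic example—a new row in the grid—whose soft label reflects the mixture, e.g. `mixup([0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0])`.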
-
Curriculum learning
Training on examples in a meaningful order, typically from easy to hard. Changes how you traverse the grid over time, not just which row you end up on.
-
Membership inference
Determining whether a specific example was in a model's training set. Exploits the fact that models behave differently on data they've seen before.
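The simplest attack is a loss threshold: guess "member" when the model's loss on an example is unusually low. A sketch with illustrative toy numbers (real attacks calibrate the threshold on shadow models):

```python
def membership_guesses(losses, threshold):
    """Loss-threshold membership inference: predict 'member' when the model's
    loss on the example is below the threshold."""
    return [loss < threshold for loss in losses]

# Toy losses: members (seen in training) tend to have lower loss.
member_losses = [0.05, 0.10, 0.20]
nonmember_losses = [0.90, 1.30, 0.60]
guesses = membership_guesses(member_losses + nonmember_losses, threshold=0.5)
print(guesses)  # → [True, True, True, False, False, False]
```

The attack succeeds exactly to the extent that the model overfits—if member and non-member losses overlap heavily, the threshold cannot separate them.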
-
Training set
The data used to train a model. In the grid, each row represents a possible training set. Changing the training set moves you to a different row.
-
Evaluation set
Data used to measure model performance. In the grid, each column represents a possible evaluation point or set. The cell value is performance when trained on that row, evaluated on that column.