What Is Supervised Learning?

### Concept

What Is Supervised Learning?

### One-sentence definition

Supervised learning trains a model by showing it many examples of inputs paired with correct labels, so it can learn a mapping that generalises to inputs it has never seen before.
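
As a minimal sketch of that definition, the snippet below "trains" a 1-nearest-neighbour classifier on a handful of labelled examples and then asks it about an input it has not seen. The feature names, the numbers, and the `fit`/`predict` helpers are all made up for illustration; none of them come from the audits or from unfair.py.

```python
from math import dist

# Hypothetical labelled examples: (test_score, years_experience) -> label.
# These numbers are invented for illustration only.
train_inputs = [(35, 1), (40, 2), (45, 1), (80, 5), (85, 6), (90, 4)]
train_labels = ["not hired", "not hired", "not hired", "hired", "hired", "hired"]


def fit(inputs, labels):
    """'Training' a 1-nearest-neighbour model just stores the labelled pairs."""
    return list(zip(inputs, labels))


def predict(model, x):
    """Return the label of the stored example closest to x."""
    _, nearest_label = min(model, key=lambda pair: dist(pair[0], x))
    return nearest_label


model = fit(train_inputs, train_labels)

# An input that never appeared in the training examples.
unseen = (78, 5)
prediction = predict(model, unseen)
print(unseen, "->", prediction)
```

Nearest-neighbour is used here only because it fits in a few lines; the point is the shape of the process (labelled pairs in, a reusable input-to-label mapping out), not the particular algorithm.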

### What real data or case will you use to illustrate it?

The AI Fair Recruitment audit is the primary anchor. It is the clearest example of supervised learning causing a concrete, measurable harm: the model learns from historical hiring decisions (inputs: gender, age, test scores → output: hired/not hired) and reproduces a 4.51% gender gap. We will walk through the train/test split, model fitting, and prediction steps using unfair.py directly. The label bias explainer already in this repo (label-bias.md) covers why the labels are tainted; this explainer covers how the learning mechanism works and why it faithfully reproduces whatever pattern is in the labels, good or bad. The German Credit Lending audit will be referenced as a second example: a different supervised task (credit approval) with a different label type.
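
The split, fit, and predict steps described above can be sketched on synthetic data. This is not the code in unfair.py and does not use the audit dataset: the biased labelling rule, the score range, and the 1-nearest-neighbour model are all assumptions chosen so the example stays small and deterministic. Age is left out for brevity.

```python
from math import dist


def historical_label(score, is_male):
    """Hypothetical biased rule: men hired above 50, women only above 70."""
    return score > (50 if is_male else 70)


# Synthetic records: ((score, is_male), hired). NOT the audit data.
records = [((score, is_male), historical_label(score, is_male))
           for is_male in (0, 1) for score in range(100)]

# Step 1: train/test split. Every 4th score is held out, so the split
# is deterministic and needs no random seed.
train = [r for r in records if r[0][0] % 4 != 0]
test = [r for r in records if r[0][0] % 4 == 0]


# Step 2: "fit" a 1-nearest-neighbour model, which memorises the training set.
def predict(train_set, features):
    nearest = min(train_set, key=lambda r: dist(r[0], features))
    return nearest[1]


# Step 3: predict on the held-out records.
predictions = [predict(train, features) for features, _ in test]

accuracy = sum(p == y for p, (_, y) in zip(predictions, test)) / len(test)


def hire_rate(is_male):
    """Share of held-out candidates of one gender the model would hire."""
    group = [p for p, (f, _) in zip(predictions, test) if f[1] == is_male]
    return sum(group) / len(group)


gap = hire_rate(1) - hire_rate(0)
print(f"accuracy={accuracy:.2f}  predicted hire-rate gap (men - women)={gap:.2f}")
```

On this synthetic data the model matches every held-out label and still hires men at a higher rate than women, because the gap is in the labels it was asked to imitate. The size of the gap here is an artefact of the invented rule and says nothing about the 4.51% figure from the audit.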

### What are the limitations or trade-offs of this concept?

- The label is always a proxy for the true target. "Hired" is not the same as "best candidate." The choice of what to label as ground truth is a value judgment, not a technical one (see label-bias.md).
- A high overall accuracy score can mask unequal error rates across subgroups, which is why accuracy alone is insufficient and the fairness metrics covered elsewhere in this repo are necessary.
- Distribution shift: a model trained on one population's hiring patterns is likely to degrade when applied to a different organisation or time period (see the Model Drift explainer being proposed alongside this one).
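
The point about accuracy hiding unequal error rates can be made concrete with a small, hand-built set of predictions. The groups "A" and "B", the counts, and the helper names are hypothetical and chosen only to make the arithmetic easy to follow; the repo's fairness metrics may be defined differently.

```python
# Hypothetical (true_label, predicted_label, group) rows; counts are invented.
rows = (
    [(1, 1, "A")] * 40 + [(0, 0, "A")] * 40      # group A: every prediction correct
    + [(1, 1, "B")] * 5 + [(1, 0, "B")] * 5      # group B: half the positives missed
    + [(0, 0, "B")] * 10
)


def accuracy(rows):
    """Fraction of rows where the prediction equals the true label."""
    return sum(y == p for y, p, _ in rows) / len(rows)


def false_negative_rate(rows):
    """Among truly positive rows, the fraction predicted negative."""
    positives = [p for y, p, _ in rows if y == 1]
    return sum(p == 0 for p in positives) / len(positives)


overall = accuracy(rows)
fnr_by_group = {
    g: false_negative_rate([r for r in rows if r[2] == g]) for g in ("A", "B")
}

print(f"overall accuracy: {overall:.2f}")
for g, fnr in fnr_by_group.items():
    print(f"group {g} false-negative rate: {fnr:.2f}")
```

Overall accuracy here is 0.95, yet every error falls on group B, where half of the qualified candidates are wrongly rejected. A single aggregate score would not reveal that.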

### Before you start

- [x] I've checked the explainers/ folder — this concept isn't already covered
- [x] I have real data or a documented external case to illustrate the concept (not a toy example)

What Is Supervised Learning? #34
