What Is Supervised Learning? #34

@yakew7

Concept

What Is Supervised Learning?

One-sentence definition

Supervised learning trains a model by showing it many examples of inputs paired with correct labels, so it can learn a mapping that generalises to inputs it has never seen before.

What real data or case will you use to illustrate it?

The AI Fair Recruitment audit is the primary anchor. It is the clearest example of supervised learning causing a concrete, measurable harm: the model learns from historical hiring decisions (inputs: gender, age, test scores; output: hired/not hired) and reproduces a 4.51% gender gap. We will walk through the train/test split, model fitting, and prediction steps using unfair.py directly. The label bias explainer already in this repo (label-bias.md) covers why the labels are tainted; this explainer covers how the learning mechanism works and why it faithfully reproduces whatever pattern is in the labels, good or bad. The German Credit Lending audit will be referenced as a second example: a different supervised task (credit approval) with a different label type.
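
As a rough illustration of the three steps named above (train/test split, model fitting, prediction), here is a minimal sketch in plain Python. It is not the code in unfair.py and does not use the audit data: the applicants are synthetic, and the 0/1 gender encoding, the 0.2 "bonus" baked into the historical labels, the 80/20 split, and the hand-written logistic regression are all assumptions made for illustration. The point is only that a model fitted to biased labels carries the bias into its predictions on held-out applicants.

```python
import math
import random

random.seed(0)  # fixed seed so the sketch is repeatable

def make_applicant():
    """Return (features, label) for one synthetic applicant (made-up data)."""
    gender = random.choice([0, 1])   # hypothetical 0/1 encoding
    score = random.random()          # test score in [0, 1)
    noise = random.gauss(0, 0.05)
    # Biased historical decision: group 1 gets a 0.2 bonus at the same score.
    hired = 1 if score + 0.2 * gender + noise > 0.6 else 0
    # Centre both features so plain gradient descent behaves well.
    return [gender - 0.5, score - 0.5], hired

data = [make_applicant() for _ in range(3000)]

# 1. Train/test split: hold out 20% that the model never sees while fitting.
split = int(0.8 * len(data))
train, test = data[:split], data[split:]

def predict_proba(w, b, x):
    """Logistic model: estimated probability that the label is 1."""
    z = w[0] * x[0] + w[1] * x[1] + b
    return 1.0 / (1.0 + math.exp(-z))

# 2. Model fitting: batch gradient descent on the log loss.
def fit(rows, epochs=200, lr=1.0):
    w, b = [0.0, 0.0], 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w, grad_b = [0.0, 0.0], 0.0
        for x, y in rows:
            err = predict_proba(w, b, x) - y
            grad_w[0] += err * x[0]
            grad_w[1] += err * x[1]
            grad_b += err
        w = [w[0] - lr * grad_w[0] / n, w[1] - lr * grad_w[1] / n]
        b -= lr * grad_b / n
    return w, b

w, b = fit(train)

# 3. Prediction on applicants the model has never seen.
preds = [1 if predict_proba(w, b, x) > 0.5 else 0 for x, _ in test]
accuracy = sum(p == y for p, (_, y) in zip(preds, test)) / len(test)

def hire_rate(group):
    """Share of held-out applicants in `group` that the model predicts as hired."""
    picked = [p for p, (x, _) in zip(preds, test) if x[0] == group - 0.5]
    return sum(picked) / len(picked)

gap = hire_rate(1) - hire_rate(0)
print(f"test accuracy: {accuracy:.3f}")
print(f"learned gender weight: {w[0]:.3f}")
print(f"predicted hire-rate gap (group 1 - group 0): {gap:.3f}")
```

The model is never told to favour either group. It assigns a positive weight to gender only because that is what best fits the historical labels, which is the mechanism this explainer sets out to show.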

What are the limitations or trade-offs of this concept?

  • The label is always a proxy for the true target: "hired" is not the same as "best candidate." Choosing what counts as ground truth is a value judgment, not a technical one (see label-bias.md).
  • A high overall accuracy score can mask unequal error rates across subgroups, which is why accuracy alone is insufficient and the fairness metrics covered elsewhere in this repo are necessary.
  • Distribution shift: a model trained on one population's hiring patterns can degrade when applied to a different organisation or time period (see the Model Drift explainer being proposed alongside this one).
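
To make the accuracy point concrete, the sketch below computes overall accuracy next to per-group accuracy and false negative rate. The counts are invented for illustration and are not results from either audit; the group names "A" and "B" are placeholders.

```python
from collections import defaultdict

# Hypothetical counts, invented for illustration (not audit results):
# (group, true_label, predicted_label, number_of_records)
counts = [
    ("A", 1, 1, 38), ("A", 1, 0, 2), ("A", 0, 0, 38), ("A", 0, 1, 2),
    ("B", 1, 1, 4),  ("B", 1, 0, 6), ("B", 0, 0, 9),  ("B", 0, 1, 1),
]

total = sum(n for _, _, _, n in counts)
correct = sum(n for _, y, p, n in counts if y == p)
overall_accuracy = correct / total

# Accumulate per-group totals: records, correct predictions,
# true positives in the data, and false negatives.
per_group = defaultdict(lambda: {"n": 0, "correct": 0, "pos": 0, "fn": 0})
for g, y, p, n in counts:
    stats = per_group[g]
    stats["n"] += n
    if y == p:
        stats["correct"] += n
    if y == 1:
        stats["pos"] += n
        if p == 0:
            stats["fn"] += n

group_accuracy = {g: s["correct"] / s["n"] for g, s in per_group.items()}
false_negative_rate = {g: s["fn"] / s["pos"] for g, s in per_group.items()}

print(f"overall accuracy: {overall_accuracy:.2f}")   # 0.89
for g in sorted(per_group):
    print(f"group {g}: accuracy {group_accuracy[g]:.2f}, "
          f"false negative rate {false_negative_rate[g]:.2f}")
# group A: accuracy 0.95, false negative rate 0.05
# group B: accuracy 0.65, false negative rate 0.60
```

With these made-up numbers the headline figure of 0.89 looks healthy, yet qualified members of group B are wrongly rejected twelve times as often as those in group A. Because group B is small, its errors barely move the overall score.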

Before you start

  • I've checked the explainers/ folder — this concept isn't already covered
  • I have real data or a documented external case to illustrate the concept (not a toy example)
