Topic 7B - Supervised Learning and Neural Networks

One neuron cannot do much. One hundred billion of them, connected correctly, can recognize a face.

Learning Objectives

By the end of this topic, you should be able to:

Learning Activities

To help you meet the learning objectives, we have prepared three readings.

Readings

These readings intentionally build on each other, so please complete them in order.

Checking for Understanding

Review the Learning Objectives at the top of this page. The questions below will help you check your understanding before moving on to Topic 7C.

The Perceptron

  1. A perceptron has three inputs with weights 2, −1, and 3, and a threshold of 4. Calculate its output for each of the following input patterns:
    • Inputs: 1, 1, 1
    • Inputs: 1, 0, 1
    • Inputs: 0, 1, 1
    • Inputs: 1, 1, 0
  2. One input to a perceptron has a weight of −5. What does that weight tell you about the relationship between that input and the perceptron's output? Give a real-world example of an input that should have a strong negative weight in a classification system.
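If you want to check your arithmetic for question 1, a short script can do it. This sketch assumes the common convention that a perceptron outputs 1 when the weighted sum of its inputs meets or exceeds the threshold; some texts use a strict inequality, which changes the borderline case where the sum exactly equals the threshold.

```python
# Perceptron from question 1: weights 2, -1, 3; threshold 4.
# Assumption: output is 1 when the weighted sum >= threshold
# (a strict > would flip the borderline pattern whose sum is exactly 4).

def perceptron(inputs, weights=(2, -1, 3), threshold=4):
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= threshold else 0

patterns = [(1, 1, 1), (1, 0, 1), (0, 1, 1), (1, 1, 0)]
for p in patterns:
    print(p, "->", perceptron(p))
```

Work the four patterns by hand first, then compare; pay particular attention to the pattern whose weighted sum lands exactly on the threshold.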

Networks and Training

  1. A neural network is trained to recognize handwritten digits (0-9). During training, it is shown a handwritten 3 and predicts 8. Describe in your own words what happens next. What changes inside the network? What does not change?
  2. Why does a neural network need hidden layers? What limitation of a single-layer network makes hidden layers necessary?
  3. ALVINN had a surprisingly simple architecture: 960 inputs, 4 hidden neurons, 30 output neurons. It could nonetheless drive a van at highway speed. What does this tell us about the relationship between network size and network capability?
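For question 1, it may help to see the smallest possible version of "what changes" during training: the parameters are nudged, while the structure of the model stays fixed. The sketch below shows one update step for a single linear neuron with a squared-error loss; the starting values and learning rate are illustrative choices, not the exact rule from the readings.

```python
# One training step for a single linear neuron, y = w*x + b.
# The weight and bias change; the architecture does not.
# Starting values, learning rate, and loss are illustrative assumptions.

w, b = 0.5, 0.0          # current parameters
x, target = 2.0, 3.0     # one training example
lr = 0.1                 # learning rate (illustrative)

prediction = w * x + b               # too low: 1.0 vs. target 3.0
error = prediction - target
# Gradient of squared error 0.5 * (prediction - target)**2:
w -= lr * error * x                  # weight moves up
b -= lr * error                      # bias moves up

print(w, b)
```

After the step, the neuron's prediction for the same input is closer to the target; repeating this over many examples is, in miniature, what training does.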

Deep Learning and Tradeoffs

  1. A school district is choosing between a decision tree system and a deep neural network for identifying students who may benefit from additional support. The neural network is more accurate. The decision tree is more interpretable. What questions should the district ask before deciding which system to use? What factors make interpretability especially important in this context?
  2. In Topic 6C, we discussed the overfitting problem for decision trees. Neural networks can overfit too. Given what you know about how neural networks learn, explain in your own words why a network with too many parameters might memorize its training data rather than learn from it.

It is completely fine to revisit the readings as you work through these questions.

Extend Your Learning

These optional topics go beyond the core learning goals but are rich avenues for deeper understanding.

  • Convolutional neural networks (CNNs)
    • The architecture that powers most modern image recognition. CNNs apply learned filters across an image to detect features like edges, textures, and shapes before passing that information to deeper layers.
  • Recurrent neural networks (RNNs) and LSTMs
    • Networks with feedback loops that allow them to process sequential data — text, speech, time series — by maintaining a form of memory across the sequence. The precursor to transformer-based language models.
  • Backpropagation
    • The mathematical algorithm that makes training deep networks practical. Backpropagation efficiently computes how much each weight contributed to the prediction error, so that every weight in the network can be updated after a single backward pass.
  • The ImageNet moment (2012)
    • The competition result that launched the modern deep learning era: a deep convolutional network dramatically outperformed all previous methods on a large-scale image recognition task, changing the field overnight.
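To make the backpropagation bullet concrete, here is a minimal sketch for a network with one hidden neuron and one output neuron, each with a single weight. The chain rule, applied from the output backward, yields the gradient for both weights in one sweep. The starting values, learning rate, and squared-error loss are illustrative assumptions, not the exact setup from the readings.

```python
import math

# Tiny network: x -> (w1) -> sigmoid -> h -> (w2) -> y
# Backpropagation = the chain rule applied from the output backward,
# producing the gradient for every weight in one backward sweep.

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

x, target = 1.0, 0.0
w1, w2 = 0.6, -0.4       # illustrative starting weights

# Forward pass
h = sigmoid(w1 * x)
y = w2 * h
loss = 0.5 * (y - target) ** 2

# Backward pass (chain rule)
dloss_dy = y - target
dloss_dw2 = dloss_dy * h                 # gradient for the output weight
dloss_dh = dloss_dy * w2
dloss_dw1 = dloss_dh * h * (1 - h) * x   # gradient for the hidden weight

# One gradient-descent step on both weights
lr = 0.5
w1 -= lr * dloss_dw1
w2 -= lr * dloss_dw2
```

Running the forward pass again with the updated weights gives a smaller loss; scaling this bookkeeping to millions of weights is what made deep learning practical.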