Topic 7B - Supervised Learning and Neural Networks

One neuron cannot do much. One hundred billion of them, connected correctly, can recognize a face.

Learning Objectives

By the end of this topic, you should be able to:

Learning Activities

To help you meet the learning objectives, we have prepared three readings.

Readings

These readings intentionally build on each other, so please complete them in order.

Checking for Understanding

Review the Learning Objectives at the top of this page. The questions below will help you check your understanding before moving on to Topic 7C.

The Perceptron

  1. A perceptron has three inputs with weights 2, −1, and 3, and a threshold of 4. Calculate its output for each of the following input patterns:
    • Inputs: 1, 1, 1
    • Inputs: 1, 0, 1
    • Inputs: 0, 1, 1
    • Inputs: 1, 1, 0
  2. One input to a perceptron has a weight of −5. What does that weight tell you about the relationship between that input and the perceptron's output? Give a real-world example of an input that should have a strong negative weight in a classification system.
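If you want to check your arithmetic for question 1, a short script can do it. This sketch assumes the common convention that a perceptron outputs 1 when the weighted sum of its inputs meets or exceeds the threshold; some texts use a strict inequality, which changes the borderline case where the sum exactly equals the threshold.

```python
# Perceptron from question 1: weights 2, -1, 3; threshold 4.
# Assumption: output is 1 when the weighted sum >= threshold
# (a strict > would flip the borderline pattern whose sum is exactly 4).

def perceptron(inputs, weights=(2, -1, 3), threshold=4):
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= threshold else 0

patterns = [(1, 1, 1), (1, 0, 1), (0, 1, 1), (1, 1, 0)]
for p in patterns:
    print(p, "->", perceptron(p))
```

Work the four patterns by hand first, then compare; pay particular attention to the pattern whose weighted sum lands exactly on the threshold.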

Networks and Training

  1. A neural network is trained to recognize handwritten digits (0-9). During training, it is shown a handwritten 3 and predicts 8. Describe in your own words what happens next. What changes inside the network? What does not change?
  2. Why does a neural network need hidden layers? What limitation of a single-layer network makes hidden layers necessary?
  3. ALVINN had a surprisingly simple architecture: 960 inputs, 4 hidden neurons, 30 output neurons. It could nonetheless drive a van at highway speed. What does this tell us about the relationship between network size and network capability?
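For question 1, it may help to see the smallest possible version of "what changes" during training: the parameters are nudged, while the structure of the model stays fixed. The sketch below shows one update step for a single linear neuron with a squared-error loss; the starting values and learning rate are illustrative choices, not the exact rule from the readings.

```python
# One training step for a single linear neuron, y = w*x + b.
# The weight and bias change; the architecture does not.
# Starting values, learning rate, and loss are illustrative assumptions.

w, b = 0.5, 0.0          # current parameters
x, target = 2.0, 3.0     # one training example
lr = 0.1                 # learning rate (illustrative)

prediction = w * x + b               # too low: 1.0 vs. target 3.0
error = prediction - target
# Gradient of squared error 0.5 * (prediction - target)**2:
w -= lr * error * x                  # weight moves up
b -= lr * error                      # bias moves up

print(w, b)
```

After the step, the neuron's prediction for the same input is closer to the target; repeating this over many examples is, in miniature, what training does.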

Deep Learning and Tradeoffs

  1. A school district is choosing between a decision tree system and a deep neural network for identifying students who may benefit from additional support. The neural network is more accurate. The decision tree is more interpretable. What questions should the district ask before deciding which system to use? What factors make interpretability especially important in this context?
  2. In Topic 6C, we discussed the overfitting problem for decision trees. Neural networks can overfit too. Given what you know about how neural networks learn, explain in your own words why a network with too many parameters might memorize its training data rather than learn from it.

It is completely fine to revisit the readings as you work through these questions.

Extend Your Learning

These optional topics go beyond the core learning goals but are rich avenues for deeper understanding.

  • Convolutional neural networks (CNNs)
    • The architecture that powers most modern image recognition. CNNs apply learned filters across an image to detect features like edges, textures, and shapes before passing that information to deeper layers.
  • Recurrent neural networks (RNNs) and LSTMs
    • Networks with feedback loops that allow them to process sequential data — text, speech, time series — by maintaining a form of memory across the sequence. The precursor to transformer-based language models.
  • Backpropagation
    • The mathematical algorithm that makes training deep networks practical. Backpropagation efficiently computes how much each weight contributed to the prediction error, so that every weight in the network can be updated after a single backward pass.
  • The ImageNet moment (2012)
    • The competition result that launched the modern deep learning era: a deep convolutional network dramatically outperformed all previous methods on a large-scale image recognition task, changing the field overnight.
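To make the backpropagation bullet concrete, here is a minimal sketch for a network with one hidden neuron and one output neuron, each with a single weight. The chain rule, applied from the output backward, yields the gradient for both weights in one sweep. The starting values, learning rate, and squared-error loss are illustrative assumptions, not the exact setup from the readings.

```python
import math

# Tiny network: x -> (w1) -> sigmoid -> h -> (w2) -> y
# Backpropagation = the chain rule applied from the output backward,
# producing the gradient for every weight in one backward sweep.

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

x, target = 1.0, 0.0
w1, w2 = 0.6, -0.4       # illustrative starting weights

# Forward pass
h = sigmoid(w1 * x)
y = w2 * h
loss = 0.5 * (y - target) ** 2

# Backward pass (chain rule)
dloss_dy = y - target
dloss_dw2 = dloss_dy * h                 # gradient for the output weight
dloss_dh = dloss_dy * w2
dloss_dw1 = dloss_dh * h * (1 - h) * x   # gradient for the hidden weight

# One gradient-descent step on both weights
lr = 0.5
w1 -= lr * dloss_dw1
w2 -= lr * dloss_dw2
```

Running the forward pass again with the updated weights gives a smaller loss; scaling this bookkeeping to millions of weights is what made deep learning practical.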