Session 6
Higher-Order Functions
Warm-Up Exercise: Acronyms
We don't do a lot of string processing in this course, but Racket string processing works in a way that you will find familiar — with a Racket-y prefix twist, of course. For example, two useful primitive string functions are:
- (string char0 char1 ...), a constructor that creates a string out of a bunch of characters
- (string-ref str k), an accessor that returns the kth character in the 0-based string str
For example:
> (string #\E #\u #\g #\e #\n #\e)
"Eugene"
> (string-ref "Eugene" 2)
#\g
> (string-ref (string #\E #\u #\g #\e #\n #\e) 3)
#\e
Now for the exercise:
acronym
Write a function named acronym that takes one
argument, a list of strings, and returns a string consisting
of the first character of each string in the list.
For example:
> (acronym '("National" "Basketball" "Association"))
"NBA"
> (acronym '("The" "Artist" "Formerly" "Known" "As" "Prince"))
"TAFKAP"
After Session 5, you have all the tools you need to solve this problem without a loop and without recursion. You can use the ideas and functions we learned last time to do the job.
Think about the data you are given and the steps you will need to take on the way to the solution... You are given a list of strings. What step can you take that would get you closer to a solution?
Solving the Problem
Using map
How can map help us here?
We are given a list of strings. What we really want is a list containing the first character of each of those strings.
Last time we learned that, instead of writing a loop, we can
map a function over a list. map
applies a function to every item in a list and returns a list
of the results. That is exactly what we want to do: convert
a list of strings into a list of first characters.
To do that, though, we need a function that returns the first
character of a string. string-ref doesn't quite
do that, but we can use string-ref to write a
function that does.
The problem is that string-ref requires that we
tell it the position of the character to retrieve. In this
case, we always want the first character, in position 0.
Let's write a one-argument helper function:
(define (first-char str) (string-ref str 0))
With map and a first-char function,
we can produce a list containing the first character of each
of the strings:
> (map first-char '("National" "Basketball" "Association"))
'(#\N #\B #\A)
Using apply
Now we have to put these characters together to make a string.
string does that. However, string
takes any number of individual characters as arguments, not a
list of characters as a single argument. That's where
apply comes to the rescue:
> (apply string
         (map first-char
              '("National" "Basketball" "Association")))
"NBA"
That's the body of the function we need, with the list of strings to process taken as an argument:
(define (acronym list-of-strings)
  (apply string
         (map first-char
              list-of-strings)))
This exercise shows how we can use our two new functions,
map and apply, to combine
other functions in a way that solves a problem. It
also illustrates how functional programmers think about
problems and what programs look like when we get done.
Possible Improvements to acronym
I can think of two possible extensions to our solution that might be helpful.
First, it might be convenient if we could call
acronym without "listing" its arguments,
like this:
> (acronym "National" "Basketball" "Association")
"NBA"
> (acronym "The" "Artist" "Formerly" "Known" "As" "Prince")
"TAFKAP"
We will learn something new about lambda today
that makes this possible. The new idea will expand our
understanding of how parameters are specified.
Second, acronym would be even cooler if it
omitted the little words whose initials we usually don't
want to appear in our acronyms, like this:
> (acronym '("University" "of" "Northern" "Iowa"))
"UNI"
> (acronym '("University" "of" "California" "at" "Los" "Angeles"))
"UCLA"
If your Racket-fu is strong,
make it so
[optional video]. You will need a new Racket primitive
— a function similar to map — to
help you! We will see that new function in class next time.
If you would like a sneak peek, check out
racket-fu.rkt
in today's code.
On the Programming Style
Finally: notice the style in which we wrote
acronym. It is often easier to grow
a large program gradually from smaller parts than it is to
design and write a complete solution up front. This is a
good practice in most languages and most styles, but I think
you'll find it especially helpful when programming in a
functional style.
Growing a program gradually in this way is interactive. The REPL gives us feedback at each step. When we are confident that one expression works, we use it to build another expression.
A large, even complicated function can start as a single function call that we gradually refine into something that solves the entire problem. Looking only at the final result hides the process, a cycle of:
- Try something.
- Hit control-up to get the expression back.
- Make one change and try again.
Trust this process, and it will reward you with more reliable success.
A language that doesn't affect the
way you think about programming
is not worth knowing.
—
Alan Perlis,
in
Epigrams on Programming
Recap: Functions as Values
Last time, we learned about lambda, the special
form that creates functions. lambda takes two
arguments. The first is a list of (unevaluated) parameter
names, and the second is an unevaluated expression that defines
the operation to be performed. lambda expressions
can be used in literal form or as named objects.
> ((lambda (n) (+ n 1)) 143)
144
; that lambda is equivalent to Racket's primitive function add1
> (add1 27)
28
Creating and naming functions isn't new to you; you do it as a matter of course in other programming languages. What is new about Racket functions?
- Function names are variable bindings just like any other variable binding. We can use function names in expressions just like we use any other variable name.
- Function is a first-class data type. A function is a value just like any other data value.
When we say that function is a first-class type, we mean that we can use functions in all the same ways we use numbers, strings, booleans, or any other data type. We can:
- write them literally, using lambda
- name them
- evaluate them
- pass them as arguments to a function
- return them as the value of a function
The last two capabilities on that list are different from how you are used to programming. We say that a function is higher-order if it takes a function as an argument or returns a function as its value. Being able to pass a function into or out of another function opens a door to new possibilities for us.
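To make the first four capabilities concrete, here is a minimal sketch. The names add-one and twice are hypothetical, chosen only for this illustration. The fifth capability, returning a function, is the subject of the section "Returning a Function as a Value".

```racket
#lang racket

; Write a function literally, using lambda, and evaluate it
; by applying it to an argument.
((lambda (n) (+ n 1)) 143)            ; evaluates to 144

; Name a function, just like any other value.
(define add-one (lambda (n) (+ n 1)))

; Pass a function as an argument to another function.
; twice is higher-order: it takes a function f and applies it two times.
(define (twice f x)
  (f (f x)))

(twice add-one 5)                     ; evaluates to 7
```

Notice that add-one appears in the call to twice without parentheses around it. We are passing the function itself, not calling it.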
Last time, we learned about our first higher-order functions, apply and map.
Each takes a function as an argument and uses it to
compute a value. We used apply and
map at the start of this session to implement our
acronym function.
But we left some unfinished business...
Returning a Function as a Value
What about a function that returns a function as its value? There is no new syntax to learn.
We can write a function as a value using lambda.
If the body of a function is a lambda expression,
then that is the value it returns.
Why might we want to write such a function? In order to create functions to use as part of a larger solution.
For example, if I need to increment every number in a list by
one, I can map Racket's add1 function:
> (map add1 '(1 4 9 16 25))
'(2 5 10 17 26)
What if I need to add some other value to every number in the list, sometimes 6 or 10 or 12?
> (map add6 '(1 4 9 16 25))
add6: undefined;
cannot reference an identifier before its definition
Alas, there is no add6 function built into Racket.
No
first through tenth luck
here...
We could define add6:
(define (add6 x) (+ x 6))
That works fine, but is unnecessarily limiting. Later, we have to define functions for 10 or 12.
Instead of writing a function to add six, we can write a function that makes special-purpose "add n" functions for us:
(define (add n)       ; add is a function that takes one argument
  ;----------           and returns
  (lambda (m)         ; a one-argument function
    (+ m n))          ; that adds the two numbers
  ;----------
  )
Now I can add 6 to every number in the list with:
> (map (add 6) '(1 4 9 16 25))
'(7 10 15 22 31)
or 10:
> (map (add 10) '(1 4 9 16 25))
'(11 14 19 26 35)
or 12:
> (map (add 12) '(1 4 9 16 25))
'(13 16 21 28 37)
If we really want a function named add6, we can
now create it with a one-liner:
(define add6 (add 6))
A language that has higher-order functions gives you a new tool for customizing your code.
Languages that do not support higher-order functions limit your ability to write programs.
The add function is a handy tool for
creating the functions I need without writing them from
scratch. Your reading assignment for
next time includes a short discussion of the idea at play in
functions like add, called currying. It
is a handy tool for creating a whole class of functions that
return functions as their values.
But as you know now from your reading, the idea of functions that return functions as their values has practical benefits in other settings, too.
An Example from Your Reading: Self-Verifying Numbers
In your reading for today, you learned about a scenario in which higher-order functions of both kinds play a useful role.
make-validator is a "function factory". It takes
two arguments, a digit-manipulating function f
and a modulus m, and returns as its value
a new function that validates numbers using the
standard formula. This function enables us to generate
validation functions for an entire family of self-verifying
numbers.
Notice how, in that code, we treat a function in exactly the same way as we treat a more "ordinary" value, an integer:
- We have two functions that are identical except for the integer modulus they use. So we factor the modulus out of the two functions and make it an argument to a common function.
- We have two functions that are identical except for the function they use to process digits. So we factor the function out of the two functions and make it an argument to a common function.
We can then use the common framework of the two functions to create a validation function from a modulus and a digit function.
We would have done the same thing with the integer in a Java, Python, or C program. Because Racket treats functions as first-class values, we can do this with the digit function as well.
Python and Java allow us to do this, too, though you may
not have seen it yet. In those languages, it usually requires
some syntactic gymnastics. In Racket, though, this is a
natural part of programming that doesn't even require new
syntax. We simply write a function that returns a
lambda expression as its value.
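The code from the reading is not reproduced in these notes, so the following is only a sketch of what a function factory like make-validator might look like. It assumes, purely for illustration, that f takes a single digit and that a number is valid when the sum of the processed digits is evenly divisible by m. The reading's formula may differ. The helper number->digits and the example validator are hypothetical names.

```racket
#lang racket

; Hypothetical helper: convert a non-negative integer to a list of its digits.
(define (number->digits n)
  (map (lambda (ch) (- (char->integer ch) (char->integer #\0)))
       (string->list (number->string n))))

; Assumed rule, for illustration only: n is valid if the sum of
; (f digit) over all of its digits is evenly divisible by m.
(define (make-validator f m)
  (lambda (n)
    (zero? (remainder (apply + (map f (number->digits n)))
                      m))))

; A validator built by the factory: is the digit sum divisible by 9?
(define digit-sum-divisible-by-9?
  (make-validator (lambda (d) d) 9))

(digit-sum-divisible-by-9? 1233)    ; 1+2+3+3 = 9, so #t
(digit-sum-divisible-by-9? 1234)    ; 1+2+3+4 = 10, so #f
```

Both the digit function and the modulus are ordinary arguments here. Changing either one gives a different validator without touching the shared framework.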
The self-verifying number example is neat because it lets us see how higher-order functions matter in an application beyond the scope of this course.
The process we went through in building that code should be important to you as a programmer. We wrote a couple of functions, recognized similarities or duplication in the code, and factored the common code into its own function. This is a common way of building large programs, another way of growing them from small examples over time.
A language that has higher-order functions gives you another tool for factoring your code.
Languages that do not support higher-order functions limit your ability to write programs.
Study this example some more, and keep your eyes open for opportunities to do the same later.
Interlude: MapReduce
On first exposure, you might imagine that you'll never use
functions such as map and apply
after you finish this course, but you might be wrong...
In order to do distributed computing on large data sets across
clusters of computers, programmers at Google developed a
technique called
MapReduce.
The "map" in MapReduce is essentially the same map
we learned about last session. The "reduce" is a general name
for the idea of combining a set of partial results into a single
final answer. apply some-function
is a reducer!
Our solution to the opening exercise is a simple example of MapReduce:
(apply string
       (map first-char
            list-of-strings))
It processes a list of strings to create a list of characters and then reduces that list into a single string. After Quiz 1, we will begin to learn techniques for writing other kinds of mappers and reducers.
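Here is another small instance of the same map-then-reduce shape, offered as a sketch rather than as part of the session's code. The function name total-length is hypothetical. It maps string-length over the list and then reduces the resulting numbers with apply and +.

```racket
#lang racket

(define (total-length list-of-strings)
  (apply +                          ; reduce: combine the lengths into one sum
         (map string-length         ; map: compute one length per string
              list-of-strings)))

(total-length '("National" "Basketball" "Association"))   ; 8 + 10 + 11 = 29
```

Only the mapped function and the reducing function changed. The structure of the solution is the same as in acronym.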
MapReduce is now available as open-source software in packages such as Hadoop, which many people use to process large data sets.
Let's now work through another problem to see how functional programmers use first-class functions to think and write code. Then we will close our discussion of functions for now by considering another feature of functions that will be helpful to us as we study languages and write interpreters, variable arity.
Thinking Functionally
Many years ago, I ran across a programming challenge called Advent of Code. It poses a problem each day from December 1 and asks programmers to solve it with a program. In December 2019, the challenge for Day 1 boiled down to this:
You are given a file listing the masses of a number of modules, one per line. Compute the total amount of fuel needed to send all of the modules into space. The fuel required to launch one module is based on its mass: divide the mass of the module by three, round down, and subtract two.
We haven't read data from files yet, but Racket provides
several useful functions. One is the primitive function
file->lines, which will seem familiar to some
Python programmers:
> (file->lines "modules.txt")
'("12" "14" "1969" "100756")
We would like to write a function that takes a filename as an argument and returns the total fuel needed to send all of the modules into space:
> (total-fuel "modules.txt")
34241
How would a functional programmer approach this problem? In much the same way as any other programmer:
- read the file into a list of strings
- convert the list into a list of integers
- for each module in the list, compute the fuel needed
- add up the individual fuel amounts to compute the total
The big difference is that functional programmers are always asking themselves:
What functions can help me the most here?
... keeping in mind higher-order functions such as
apply and map.
At each step, we use a function to help us, writing any function we don't already have available to us...
file->lines
string->number ........... and map
module->fuel ............. and map
+ ........................ and apply
module->fuel does not exist yet, but it is
described in the problem statement. So we can write it!
(define (module->fuel m) (- (quotient m 3) 2))
See the full solution in the session's zip file.
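The four steps might be assembled as in the sketch below. This is not necessarily the solution in the zip file. The helper lines->total-fuel is a hypothetical name, introduced so that the pipeline can be tried at the REPL without a file, and the code assumes that the file holds one integer mass per line with no blank lines.

```racket
#lang racket

; Fuel for one module: divide its mass by three, round down, subtract two.
(define (module->fuel m)
  (- (quotient m 3) 2))

; Hypothetical helper: the same pipeline, starting from lines already read.
(define (lines->total-fuel lines)
  (apply +                                 ; add up the fuel amounts
         (map module->fuel                 ; compute the fuel for each mass
              (map string->number lines)))) ; convert each line to an integer

; Read the file into a list of strings, then run the pipeline.
(define (total-fuel filename)
  (lines->total-fuel (file->lines filename)))

(lines->total-fuel '("12" "14" "1969" "100756"))   ; 2 + 2 + 654 + 33583 = 34241
```

Each function call consumes the value produced by the call nested inside it, so the code reads from the inside out in the same order as the list of steps.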
This style is not as foreign as it might seem! The basic process is a sequence of steps, on each of which we pass our data to a function and pass the result to another function. We program like this all the time in other languages, only with a sequence of statements and variables that tie the statements together.
strings = (file->lines filename)
numbers = (map string->number strings)
fuels = (map module->fuel numbers)
total = (apply + fuels)
Those of you who use Linux and love the command line do this sort of programming all the time, too, and in a way that is almost Racket-y:
> cat session06.rkt | grep lambda | wc -l
5
In Racket, we might write:
(wc '-l (grep 'lambda (cat "session06.rkt")))
The difference is more a matter of syntax than style!
Variable Arity Functions
Arity refers to the number of arguments that a function
accepts. For example, sqrt is a unary operation:
it accepts exactly one argument. We say that sqrt
has an arity of 1. In most languages, addition and
subtraction are binary operations and have an arity of 2.
For generality's sake, you will sometimes see these written
as 1-ary and 2-ary, respectively.
All the functions that we have written thus far have taken a
fixed number of arguments. But we have also used several
standard Racket functions that can take any number of arguments,
such as +, list, and
string. We say that such functions are, or have,
variable arity.
There are very few features of Racket's primitives that we cannot implement directly when writing our own functions in Racket. So you should not be surprised to learn that we can write variable arity functions in Racket, too!
To create a variable arity function, we will make one change
to the syntax of lambda. We do not know how many
arguments will be passed, so we cannot enumerate the parameter
list and give each parameter its own name. Instead, we give
a single name, without (), to name a list
that contains all of the arguments that are passed to the
function. For example:
(lambda name-of-list body of function)
Consider
the average function
from last session. We wanted to compute the grade-point average
for each student in a list of students. I defined the average
function we needed in
the session's code file:
(define average
  (lambda numbers
    (/ (apply + numbers)
       (length numbers))))
There are no parentheses around the numbers.
We don't know how many arguments the caller will send, so we
tell lambda to take all the arguments, however
many there are, and put them in a list named
numbers. The body of the function can then act
on the list, whether it contains 2, 100, or even 0 items.
apply and length can work on lists
of any length, although this particular average divides by zero when it is called with no arguments.
In this case, we can write the body of the average
function using existing Racket functions, including the
variable-arity + function and the higher-order
apply function.
In other cases, we will have to write a recursive function
that processes the list of values one by one. Three sessions
from now, we will begin to discuss techniques for processing
lists and other data types recursively. After you learn those
techniques, implementing a variable-arity function like
+ will seem straightforward.
How do we write variable arity functions in the shorthand method for defining functions? This syntax also uses our knowledge of lists and pairs:
(define (average . numbers)
  (/ (apply + numbers)
     (length numbers)))
Can you see what the dot is doing for us in that expression?
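One way to see what the dot does, as a sketch: any parameters before the dot are bound to the leading arguments, and the name after the dot is bound to a list of whatever arguments remain. The function average-of-at-least-one and its parameter names n and more are hypothetical. It uses the dot to require at least one argument, so the division never sees an empty list.

```racket
#lang racket

; Zero or more arguments: all of them are collected in the list numbers.
(define (average . numbers)
  (/ (apply + numbers)
     (length numbers)))

; One required argument, then zero or more: more holds the remaining ones.
(define (average-of-at-least-one n . more)
  (/ (apply + n more)               ; apply accepts leading arguments before the list
     (+ 1 (length more))))

(average 90 80 100)                 ; numbers is '(90 80 100), result is 90
(average-of-at-least-one 90)        ; n is 90, more is '(), result is 90
```

Calling (average-of-at-least-one) with no arguments is an arity error, which is arguably a clearer failure than a division by zero.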
Quick Tidbit: variable arity in Python and C
Did you know that you can create variable arity functions in Python and C, too? Take a look at a simple Python example and a simple C++ example of functions that accept any number of arguments. Python's syntax is relatively simple. In C, we typically also have to include a parameter for the size of the list.
Now that we know how to create variable arity functions, how
would we
improve our acronym function
so that it doesn't take a list of strings?
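One possible answer, as a minimal sketch: use the dot syntax so that Racket collects the strings into a list for us. This version redefines acronym from the opening exercise, and it assumes that every argument is a non-empty string.

```racket
#lang racket

(define (first-char str)
  (string-ref str 0))

; The dot collects every argument into the list named strings.
(define (acronym . strings)
  (apply string
         (map first-char strings)))

(acronym "National" "Basketball" "Association")                ; "NBA"
(acronym "The" "Artist" "Formerly" "Known" "As" "Prince")      ; "TAFKAP"
```

A caller who already has a list of strings can still use this version by writing (apply acronym list-of-strings).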
Wrap Up
- Reading
  - Read this mini-lecture on curried functions. It contains two important ideas that we will return to many times this semester.
  - Then read this short review of Racket's substitution model. It will help you to better understand how Racket expressions are evaluated.
  - As always, study the notes and code from this session. Pay special attention to the section on thinking functionally.
- Homework
  - Homework 3 is available now and due before next session.
- Quiz
  Quiz 1 comes at the end of Session 8, one week from today. It will cover what we have learned about Racket and functional programming style thus far, including the assigned readings. This includes:
  - Racket's built-in data types and functions (primitives)
  - Racket expressions and data structures (means of combination)
  - Racket definitions and functions (means of abstraction)
  We have paid special attention to Racket functions and how they differ from functions in other languages.