TLP - Programming Maintenance
Week 4 - Analyzing Super Bowl Commercials
Background
This past Sunday was a pretty big deal to a lot of people in the US: Super Bowl Sunday. Even if you aren't a big football fan, it is quite possible that you were looking forward to the big day because, for many people, it's a chance to have a big party and watch the game, the halftime show, and the commercials!
The journalists at FiveThirtyEight spent time analyzing the kinds of ads run by the 10 brands that most frequently advertised at the Super Bowl between 2000 and 2020. Those companies were:
- Bud Light
- Budweiser
- Coca-Cola
- Doritos
- E-Trade
- Hyundai
- KIA
- The NFL
- Pepsi
- Toyota
In particular, the journalists were interested in the specific tone/content of Super Bowl ads. For each ad they recorded attributes of the commercial such as:
- does it show the product quickly
- is it patriotic
- does it feature one or more celebrities
- is it funny
- does it contain acts/elements of danger
- does it contain an animal
- does it use sex
That data is in this csv file:
Suppose that we want to know which pair of features is most often used together. Is it funny animals? Is it sexy patriots? Is it celebrities in danger?
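To make "used together" concrete, here is a minimal sketch in Python on made-up data (not the real file). It assumes each ad has already been turned into a dictionary of True/False values; the names `ads` and `pair_counts` and the three labels are hypothetical, chosen only for illustration. A pair counts as "together" when both features are True for the same ad.

```python
from collections import Counter
from itertools import combinations

# Hypothetical rows: each dict maps a feature label to True/False for one ad.
ads = [
    {"funny": True, "animals": True, "danger": False},
    {"funny": True, "animals": True, "danger": True},
    {"funny": False, "animals": False, "danger": True},
]

pair_counts = Counter()
for ad in ads:
    # Keep only the labels that are True for this ad.
    present = [label for label, value in ad.items() if value]
    # Every 2-label combination of those features appeared together once.
    for pair in combinations(present, 2):
        pair_counts[pair] += 1

best_pair, best_count = pair_counts.most_common(1)[0]
print(best_pair, best_count)
```

This is one possible approach, not the required one. Nested loops over column indexes would work just as well.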
Task
- Write a function called analysis()
- Input
- Takes in one string, assumed to be the name of a csv file
- Output
- Returns a string stating which two columns appeared together most frequently
- For example:
- Scoring
- You will earn 1 point if you can do this for the superbowl.csv file only.
- Here is another variation of that file
- You will earn 2 points if you can do this for any file I give you that uses different header labels
- It is very possible to earn 1 point by "hard coding" the labels. But that isn't what I want (which is why I will only give half credit). I want you to figure out how to extract the labels by writing the appropriate code.
- But we still have to put a few constraints on this for now. So let's make a few assumptions.
- We will assume that the first four columns (0-3) are data about the commercials and that columns 4 through 10 are attributes
- We will assume that the header row contains the label for the columns (which we need for 4-10)
- Here are two variations to test this outcome
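As a rough illustration of the two assumptions above, here is a small Python sketch that pulls the labels for columns 4 through 10 out of the header row instead of hard coding them. The miniature file contents, the labels `f1` through `f7`, and the idea that attribute cells hold the text "True" or "False" are all assumptions made for this example; check the real file before relying on them.

```python
import csv
import io

# Hypothetical miniature file; the real labels and values may differ.
sample = io.StringIO(
    "year,brand,url_a,url_b,f1,f2,f3,f4,f5,f6,f7\n"
    "2020,Acme,x,y,True,False,True,False,False,False,False\n"
)

reader = csv.reader(sample)
header = next(reader)   # the first row holds the column labels
labels = header[4:11]   # columns 4 through 10 are the attributes

for row in reader:
    # Assumption: attribute cells hold the text "True" or "False".
    flags = [cell.strip().lower() == "true" for cell in row[4:11]]
    # Pair each label with its flag and keep the labels that are True.
    present = [label for label, flag in zip(labels, flags) if flag]
    print(present)
```

With a real file you would replace `sample` with something like `open(filename, newline="")`. Because the labels come from the header row, the same code keeps working when the header labels change.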
Hints
- This has some similarities to the Bob Ross activity from last semester.