Customer Analytics Week 1
This version: April 2026 | License: CC BY 4.0 | We use JavaScript to track readership.
We welcome reuse with attribution. Please share widely.

Dog==Draymond || Dray || Click Clack || Major Jealous
Cat==Luka || Liquid || Hunter || Captain Kitty

This ancient Mesopotamian clay tablet is estimated to be nearly 4,000 years old. Data visualization is older than English and older than zero. Are those blanks zeros or missing values?

This attempted translation shows what appears to be accounting for construction labor. Columns B and E look like duplicates, and A appears to sum B and F. Are those blanks zeros or missing data?

This visualization shows absolute increase in cancer risk and offers information for different audiences, with a pleasing presentation despite a grave topic.
Does the graph show how much a person’s cancer risk changes when they change their drinking behavior (i.e., a “causal effect”)?

The world population curve has a sigmoid shape, also known as an S-curve, in which growth first accelerates and later slows. A popular science book, The Population Bomb by Ehrlich (1968), forecast mass starvation on the premise that population was increasing exponentially while food production was increasing linearly. Why was Ehrlich’s forecast so far from correct?
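To make the S-curve point concrete, here is a minimal sketch in R comparing a logistic (sigmoid) curve with an exponential curve that starts at the same value and growth rate. The parameters `K`, `r`, and `t0` are illustrative assumptions, not estimates from population data.

```r
# Illustrative parameters only (assumptions, not fitted to real data).
K  <- 10    # upper limit that the S-curve approaches
r  <- 0.5   # growth rate
t0 <- 10    # inflection point: growth is fastest around here
t  <- 0:20

# Logistic (sigmoid) curve.
logistic <- K / (1 + exp(-r * (t - t0)))

# Exponential curve with the same starting value and the same growth rate.
exponential <- logistic[1] * exp(r * t)

# Early on, the two curves are hard to tell apart...
early_ratio <- exponential[4] / logistic[4]

# ...but extrapolating the exponential curve overshoots badly later on.
late_ratio <- exponential[21] / logistic[21]

# Period-over-period growth of the logistic curve rises, peaks, then falls.
growth <- diff(logistic)
peak_period <- which.max(growth)

print(c(early_ratio = early_ratio, late_ratio = late_ratio))
```

A forecaster who sees only the early periods cannot easily distinguish the two curves, which is one way an exponential extrapolation can go badly wrong.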


Google searches for DraftKings and FanDuel during an NFL game, 9-10 P.M. EST. Commercial minutes are shaded. What do you see?

This map plots one year of San Diego Police Department reports. When are data zero vs. missing? Provenance is key. What kind of decisions could this inform?
A key principle in Business Analytics is GIGO: Garbage In, Garbage Out. Don’t analyze noise.
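The zero-versus-missing question matters because the two cases produce different summaries. A minimal sketch in R, using made-up counts (the `reports_*` vectors are hypothetical and are not the San Diego data):

```r
# Hypothetical weekly report counts for one neighborhood.
# NA means "no data received"; 0 means "data received, nothing reported".
reports_missing <- c(4, NA, 6, NA, 5)
reports_zero    <- c(4, 0, 6, 0, 5)

# Treating blanks as missing: average over the weeks that were observed.
mean_if_missing <- mean(reports_missing, na.rm = TRUE)

# Treating blanks as zeros: average over all weeks.
mean_if_zero <- mean(reports_zero)

# Without na.rm = TRUE, R propagates the missing value rather than guessing.
mean_default <- mean(reports_missing)

print(c(missing = mean_if_missing, zero = mean_if_zero, default = mean_default))
```

Which treatment is right depends on how the data were collected, which is why provenance comes first.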




“Graphics journalists urge that each chart should make exactly one point – and it should be obvious how to read it. Often charts say (1) number goes up/down recently, (2) number goes up/down when some event occurred, (3) one set of lines diverges from another set of lines (or, one line is an outlier compared to the rest), (4) the distribution is bimodal.” –Jeremy Merrill, WaPo. “Scientific charts are often the opposite of this – they have four variables, six symbols and are explained two pages away.”

The same dataset can be represented many different ways, each emphasizing different comparisons. Here is a very simple dataset; what comparison does each graphic communicate? How would you summarize each graphic in a sentence?
Misleading visualizations can indicate bad faith. You want to avoid mistaken interpretations of your work, and identify misleading impressions created by others.


This is why we shouldn’t just use regression to summarize data. What does each picture tell us?
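The figure is not reproduced in these notes. Assuming it resembles Anscombe's quartet, the point can be sketched in R with the built-in `anscombe` data frame: four small datasets whose fitted regression lines are nearly identical even though their scatterplots look very different.

```r
# Assumption: the slide resembles Anscombe's quartet, which ships with base R
# as the data frame `anscombe` (columns x1..x4 and y1..y4).
fits <- lapply(1:4, function(i) {
  lm(anscombe[[paste0("y", i)]] ~ anscombe[[paste0("x", i)]])
})

# Collect the intercept and slope from each fitted line.
coefs <- t(sapply(fits, coef))
colnames(coefs) <- c("intercept", "slope")
print(round(coefs, 2))

# Draw the four scatterplots with their fitted lines (interactive sessions only).
if (interactive()) {
  op <- par(mfrow = c(2, 2))
  for (i in 1:4) {
    plot(anscombe[[paste0("x", i)]], anscombe[[paste0("y", i)]],
         xlab = paste0("x", i), ylab = paste0("y", i))
    abline(fits[[i]])
  }
  par(op)
}
```

The coefficient table alone would suggest four interchangeable datasets; only the plots reveal how different they are.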
Related terms: consumer, client.

Related terms: expert systems, business intelligence, data science, AI. Terms change frequently.

Which owners or employees in the business can afford to ignore customers?
Marketing has a marketing problem. Most people confuse marketing with ads, sales, or bullshit (“persuasive speech without regard for the truth”). This is because real marketing decisions are made at the top of the org, and short-term incentives often lead marketers into unethical choices.
Strategic consistency reduces customer risk. Other EDLP (everyday low pricing, vs. High-Low) businesses include Costco, Trader Joe’s, and Walmart. There is a common tension between short-term management and long-term strategy.
Firing customers is typically indirect, such as withdrawing preferred products or declining to encourage further purchasing. Why is firing customers usually controversial?
Consumer Panel Data include longitudinal data beginning in 2004. These data track a panel of 40,000–60,000 US households and their purchases of fast-moving consumer goods from a wide range of retail outlets across all US markets.
Retail Scanner Data consist of weekly pricing, volume, and in-store marketing info generated by point-of-sale systems from 90+ participating retail chains across all US markets. Data begin in 2006.
The retail industry has typically been an early adopter of customer analytics frameworks. Consumer panel and retail scanner data are foundational in packaged goods categories.


On July 9, 2020, the CEO of Goya praised President Trump during a White House meeting, generating calls for a consumer boycott.

Despite calls for a boycott, total sales rose, mostly because Republican areas started buying Goya. Without customer data, we would be shooting in the dark.

Descriptive (what happened), diagnostic (why), predictive (what will happen), and prescriptive (what should we do). Which type is hardest?

This model is also known as the purchase journey. It is very likely the most popular empirical framework ever used in customer analytics. How does it facilitate decisionmaking?

These bubbles represent analytics techniques used to address customer funnel-related goals. In a survey of 800 e-commerce professionals, companies using 9+ data-driven methods were most satisfied with their conversion rates. The reported optimal number of A/B tests was 3-5 per month. Why do you think pop-ups are so common?
These are questions you can ask in job interviews.


Analytics works best when leadership creates an environment that enables analytics investments and rewards disciplined decisionmaking, with retrospective decision evaluations and continuous-learning feedback loops. How have you seen analytics used in practice?
Common language facilitates communication
In what markets does the customer often differ from the consumer? This is known as “choosing for others.”
In what markets do Uber and Zoom compete?

You need to know these well if you interview for marketing roles. Generations of marketing professionals were educated and continue to think this way. Still relevant, but less central, thanks to customer data & analytics abundance. MGT 103 complements this course well by covering these topics in depth, but without the same deep focus on customer data.


The y-axes show the length of task, measured in human-expert completion time (in hours), that a frontier agent can complete with 50% reliability. Cox (2006) said “How [the] translation from subject-matter problem to statistical model is done is often the most critical part of an analysis.” What is model mis-specification? How does this figure into the public conversation about future AI capabilities? Will AI become all-powerful or should you study and build skills?
/effort max to
My use/non-use cases reflect my role as an academic researcher and educator. Your optimal use/non-use cases are likely to differ. Where has LLM use helped in your education? Has it ever limited your education? How can it help in MGT 100?

Please write down your intentions for this class on Canvas. How will you measure your effort? Please don’t say grades alone; grades are outcomes, not inputs. Measurement is central in analytics; you cannot manage what you do not measure.
Bad habit: Write the whole script, run it, see where it breaks. Good habit: test each chunk before writing the next one.
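One way to practice the good habit is to end each chunk with a quick check before writing the next. A minimal sketch in R, where the `sales` data frame is a hypothetical stand-in for a real dataset:

```r
# Chunk 1: build (or load) the data, then check it before moving on.
# This small data frame is hypothetical, used only for illustration.
sales <- data.frame(
  week   = c(-1, -1, 0, 0),
  region = c("A", "B", "A", "B"),
  units  = c(90, 159, 140, 176)
)
stopifnot(nrow(sales) == 4, !anyNA(sales$units))

# Chunk 2: add one derived column, then check it.
sales$post <- sales$week >= 0
stopifnot(is.logical(sales$post), sum(sales$post) == 2)

# Chunk 3: summarize, then check the shape of the result.
avg <- aggregate(units ~ region + post, data = sales, FUN = mean)
stopifnot(nrow(avg) == 4)
print(avg)
```

If a check fails, the error points to the chunk that was just written rather than to some unknown place in a long script.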


| Week (relative to endorsement) | Region | Goya sales |
|---|---|---|
| -4 | Right-leaning | 87 |
| -3 | Right-leaning | 85 |
| -2 | Right-leaning | 87 |
| -1 | Right-leaning | 90 |
| 0 | Right-leaning | 140 |
| 1 | Right-leaning | 133 |
| 2 | Right-leaning | 110 |
| 3 | Right-leaning | 95 |
| -4 | Left-leaning | 158 |
| -3 | Left-leaning | 159 |
| -2 | Left-leaning | 158 |
| -1 | Left-leaning | 159 |
| 0 | Left-leaning | 176 |
| 1 | Left-leaning | 170 |
| 2 | Left-leaning | 162 |
| 3 | Left-leaning | 158 |
Visualize Goya’s weekly average sales in right- and left-leaning regions pre/post a major endorsement.
Data: goyadata <- readRDS(url("https://raw.githubusercontent.com/kennethcwilbur/mgt100/main/data/goya_sales.rds"))
Your visualization should make a clear point and be very easy to understand. Submit your R script and visualization on Canvas.
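As a possible starting point, the sketch below rebuilds the table above as a data frame and compares average sales before and after the endorsement. The column names (`week`, `region`, `sales`) are illustrative assumptions; the columns in `goya_sales.rds` may be named differently. Counting week 0 as "post" is also an assumption.

```r
# Values copied from the table above. Column names are illustrative;
# the columns in goya_sales.rds may be named differently.
goya <- data.frame(
  week   = rep(-4:3, times = 2),
  region = rep(c("Right-leaning", "Left-leaning"), each = 8),
  sales  = c(87, 85, 87, 90, 140, 133, 110, 95,
             158, 159, 158, 159, 176, 170, 162, 158)
)
stopifnot(nrow(goya) == 16)

# Label each week as before or after the endorsement.
# Assumption: week 0 counts as "post".
goya$period <- ifelse(goya$week < 0, "pre", "post")

# Average sales by region and period.
avg <- aggregate(sales ~ region + period, data = goya, FUN = mean)
print(avg)

# Percent change from pre to post within each region.
pre  <- avg[avg$period == "pre", ]
post <- avg[avg$period == "post", ]
pct_change <- 100 * (post$sales / pre$sales[match(post$region, pre$region)] - 1)
names(pct_change) <- as.character(post$region)
print(round(pct_change, 1))

# Draw one line per region, with a marker at the endorsement
# (interactive sessions only).
if (interactive()) {
  right <- goya[goya$region == "Right-leaning", ]
  left  <- goya[goya$region == "Left-leaning", ]
  plot(right$week, right$sales, type = "b", ylim = range(goya$sales),
       xlab = "Week (relative to endorsement)", ylab = "Goya sales",
       main = "Goya sales rose most in right-leaning regions")
  lines(left$week, left$sales, type = "b", lty = 2)
  abline(v = -0.5, lty = 3)
  legend("topleft", legend = c("Right-leaning", "Left-leaning"), lty = c(1, 2))
}
```

The chart title states the single point the graphic is meant to make; other designs (for example, indexing each region to its pre-period average) would emphasize different comparisons.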


