MGT 100 Week 5
This version: April 2026 | License: CC BY 4.0 | We use JavaScript to track readership.
We welcome reuse with attribution. Please share widely.
Hint: Both segments were about the same size, and both had very similar distributions of customer demographics

What does this heterogeneity distribution imply for relevant strategic choices?

A phone shopper first snaps a photo and examines the result. How might the photo quality influence their perception of the phone’s less evaluable features?
These figures show the importance of understanding digital user needs, of not assuming that usage indicates satisfaction, and of using experiments to identify optimal default choices.
Modeling heterogeneity can alleviate the first 3 and enable better predictions, but not price endogeneity.
However, predictive performance will worsen if we overfit the model. Have you heard of cross-validation?

The analyst can either impose heterogeneity assumptions or try to learn the heterogeneity structure from data (very demanding). The tradeoff depends on how empirically reasonable the modeling assumptions are versus how thick or sparse the data is.
\[s_{jt}=\int \frac{e^{x_{jt}\delta w_{it}- w_{it}\gamma p_{jt} }}{\sum_{k=1}^{J}e^{x_{kt}\delta w_{it}- w_{it}\gamma p_{kt} }} dF(w_{it}) \approx \frac{1}{N_t}\sum_i \frac{e^{x_{jt}\delta w_{it}- w_{it}\gamma p_{jt}}}{\sum_{k=1}^{J}e^{x_{kt}\delta w_{it}- w_{it}\gamma p_{kt}}}\]
We often restrict elements of \(\delta\) and \(\gamma\) to zero when we think the corresponding interaction is unimportant, to avoid overparameterizing the model. What goes into \(w_{it}\)? What if \(\dim(x)\) and/or \(\dim(w)\) is large?
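The share approximation above averages individual logit choice probabilities over the \(N_t\) sampled customers. A minimal sketch of that calculation follows, using only the Python standard library. All numbers and the function names `individual_choice_probs` and `market_shares` are hypothetical, chosen for illustration, and the time subscript is dropped. As in the formula, the denominator sums over the \(J\) products only.

```python
import math

def individual_choice_probs(x, p, w, delta, gamma):
    """Logit choice probabilities for one customer with demographics w.

    x: list of J attribute vectors (each of length K)
    p: list of J prices
    w: demographic vector (length D)
    delta: K x D matrix as a list of rows; gamma: length-D vector
    """
    # Customer-specific price sensitivity: w * gamma
    price_sens = sum(w_d * g_d for w_d, g_d in zip(w, gamma))
    utils = []
    for x_j, p_j in zip(x, p):
        # Customer-specific taste for product j: x_j * delta * w
        taste = sum(
            x_j[k] * sum(delta[k][d] * w[d] for d in range(len(w)))
            for k in range(len(x_j))
        )
        utils.append(taste - price_sens * p_j)
    m = max(utils)  # subtract the max before exponentiating, for stability
    expu = [math.exp(u - m) for u in utils]
    total = sum(expu)
    return [e / total for e in expu]

def market_shares(x, p, w_sample, delta, gamma):
    """Average the individual probabilities over the sampled customers."""
    n = len(w_sample)
    probs = [individual_choice_probs(x, p, w, delta, gamma) for w in w_sample]
    return [sum(pr[j] for pr in probs) / n for j in range(len(p))]

# Hypothetical inputs: 3 products, 2 attributes, 2 demographic variables
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
p = [1.0, 1.5, 2.0]
delta = [[0.5, 0.2], [0.1, 0.8]]
gamma = [0.6, 0.3]
w_sample = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]  # first entry acts as a constant

shares = market_shares(x, p, w_sample, delta, gamma)
print(shares)
```

Setting an entry of `delta` or `gamma` to zero switches off that interaction, which is the restriction described above.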
Alternatively, we can estimate segment memberships within the model rather than supplying them via k-means. Pros: we don’t have to pre-specify segment memberships. Cons: the estimates are noisy, so we need a lot of data to do this well.
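For reference, the "supplying them via k-means" route can be sketched as follows: cluster customers on observed variables first, then pass the resulting labels to the demand model as fixed segment memberships. This is a minimal illustration of the standard k-means (Lloyd's) iteration with hypothetical data and a hypothetical `kmeans` helper, not the course's own code.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def assign(points, centers):
    """Label each point with the index of its nearest center."""
    return [
        min(range(len(centers)), key=lambda c: sq_dist(pt, centers[c]))
        for pt in points
    ]

def kmeans(points, centers, n_iter=50):
    """Alternate assignment and center updates until the centers stop moving."""
    for _ in range(n_iter):
        labels = assign(points, centers)
        new_centers = []
        for c in range(len(centers)):
            members = [pt for pt, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(centers[c])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return assign(points, centers), centers

# Hypothetical customer data: two variables per customer
customers = [[1.0, 1.0], [1.2, 0.8], [0.9, 1.1],
             [5.0, 5.0], [5.2, 4.9], [4.8, 5.1]]
initial_centers = [customers[0], customers[3]]  # fixed start, for reproducibility

segments, centers = kmeans(customers, initial_centers)
print(segments)
```

The labels in `segments` are treated as known in the demand model, which is why this route needs less data than estimating memberships jointly.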
Use modeling purposes and constraints as model selection criteria. Purposes can include prediction, explanation and decision-making. Constraints can include privacy and ethics. What criteria would you prioritize?

Should you model weight=f(height) or height=f(weight)? Notice the many-to-one correspondence and the discrete nature of height measurements.
Imagine you prepared for a quiz by perfectly memorizing the textbook but without understanding the material. Could you accurately apply the concepts in unseen settings?

Choosing a model that maximizes a single criterion, such as R-squared, can lead to bad decisions.
Adding predictors never worsens in-sample fit but risks overfitting. Regularization methods such as Ridge and Lasso penalize model complexity on the theory that simpler models predict better.
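One common textbook way to write the Ridge and Lasso objectives is shown below. The notation (\(y_i\), \(x_i\), \(\beta\), \(n\)) is assumed here for illustration, and scaling conventions differ across references.
\[\hat{\beta}^{\text{ridge}}=\arg\min_{\beta}\sum_{i=1}^{n}(y_i-x_i\beta)^2+\lambda\sum_{k}\beta_k^2 \qquad \hat{\beta}^{\text{lasso}}=\arg\min_{\beta}\sum_{i=1}^{n}(y_i-x_i\beta)^2+\lambda\sum_{k}|\beta_k|\]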
The tuning parameter \(\lambda\) controls penalty strength: higher \(\lambda\) → more shrinkage, fewer effective parameters. Cross-validation is typically used to choose \(\lambda\).
If you have 100 product attributes but suspect only 10 matter, Lasso selects a subset by setting some weights exactly to zero, while Ridge keeps all 100 with smaller weights. Penalties act as proxies for overfitting; out-of-sample error itself is not directly optimized.
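The 100-attribute contrast can be seen in a simplified special case. Assuming the predictors are orthonormal and the objectives are the sum of squared errors plus \(\lambda\sum_k\beta_k^2\) (Ridge) or \(\lambda\sum_k|\beta_k|\) (Lasso), each coefficient is a simple function of its least-squares estimate: Ridge divides it by \(1+\lambda\), and Lasso soft-thresholds it at \(\lambda/2\). The estimates below are made up for illustration, and real data rarely has orthonormal predictors.

```python
import math

def ridge_shrink(b, lam):
    """Ridge under an orthonormal design: proportional shrinkage, never exactly zero."""
    return b / (1.0 + lam)

def lasso_shrink(b, lam):
    """Lasso under an orthonormal design: soft-thresholding at lam / 2."""
    t = lam / 2.0
    if b > t:
        return b - t
    if b < -t:
        return b + t
    return 0.0

# Hypothetical least-squares estimates: 10 attributes with a true effect of 2.0,
# 90 with a true effect of 0.0, each with a small deterministic "noise" term
ols = [(2.0 if k < 10 else 0.0) + 0.3 * math.sin(k) for k in range(100)]

lam = 1.0
ridge = [ridge_shrink(b, lam) for b in ols]
lasso = [lasso_shrink(b, lam) for b in ols]

n_ridge_nonzero = sum(1 for b in ridge if b != 0.0)
n_lasso_nonzero = sum(1 for b in lasso if b != 0.0)
print(n_ridge_nonzero, n_lasso_nonzero)
```

Here Lasso zeroes every coefficient whose estimate is smaller than \(\lambda/2\) in absolute value, while Ridge only rescales.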
Be careful not to confuse retrodiction with prediction. Why not?


This paper estimated various models of auto production and pricing on pre-2008 data, then evaluated their retrodictive performance after the 2008 gas price shock. It supported the generalizability of microfounded models. In what scenarios are tests like this most valuable?
Which information set do you expect to be most predictive, and why?
All models were trained using 5-fold cross-validation.
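As a reminder of what 5-fold cross-validation does: split the data into five folds, fit on four, predict the held-out fold, and average the squared prediction errors over all held-out observations. The sketch below compares an intercept-only model to a one-predictor linear model on made-up data. The helper `kfold_mspe` is hypothetical and is not the course's `cv_mspe` function.

```python
import math

def fit_mean(xs, ys):
    """Intercept-only model: always predict the training mean."""
    m = sum(ys) / len(ys)
    return lambda x: m

def fit_line(xs, ys):
    """Simple linear regression via the closed-form least-squares solution."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    return lambda x: intercept + slope * x

def kfold_mspe(xs, ys, fit, k=5):
    """Mean squared prediction error over k held-out folds."""
    n = len(xs)
    # Interleaved folds keep the example deterministic; in practice,
    # shuffle the rows before splitting.
    folds = [list(range(i, n, k)) for i in range(k)]
    sq_errs = []
    for hold in folds:
        hold_set = set(hold)
        train_x = [xs[i] for i in range(n) if i not in hold_set]
        train_y = [ys[i] for i in range(n) if i not in hold_set]
        model = fit(train_x, train_y)
        sq_errs.extend((ys[i] - model(xs[i])) ** 2 for i in hold)
    return sum(sq_errs) / n

# Made-up data: a linear trend plus a small deterministic disturbance
xs = [float(i) for i in range(20)]
ys = [1.0 + 2.0 * x + 0.5 * math.sin(x) for x in xs]

mspe_mean = kfold_mspe(xs, ys, fit_mean)
mspe_line = kfold_mspe(xs, ys, fit_line)
print(mspe_mean, mspe_line)
```

Because every error is computed on data the model did not see during fitting, this criterion penalizes overfitting in a way that in-sample R-squared does not.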
Why might these criteria disagree? “Pessimists are often right. Optimists are often wealthy.”

We used the degenerate strategies as benchmarks. What do you see?

How did statistical criteria compare to economic criteria?

Past purchases on discount are, by far, the strongest predictor of models’ targeting decisions.
Conjoint extends demand modeling from “predict demand for existing products” to “predict demand for products that don’t exist yet.” When would a firm need this capability?
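To make the "products that don’t exist yet" idea concrete, here is a minimal sketch assuming conjoint data has already produced part-worth utilities for each attribute level. A hypothetical new product is scored by summing its part-worths, and logit shares are recomputed with it added to the choice set. Every number and name below is invented for illustration, and there is no outside option.

```python
import math

# Hypothetical part-worths per attribute level, as if estimated from conjoint data
part_worths = {
    "item": {"hot dog": 0.0, "nachos": 0.4},
    "size": {"regular": 0.0, "large": 0.6},
}
price_coef = -0.3  # hypothetical utility per dollar

def utility(profile):
    """Sum the part-worths of a profile's levels and add the price effect."""
    u = sum(part_worths[attr][profile[attr]] for attr in part_worths)
    return u + price_coef * profile["price"]

def logit_shares(profiles):
    """Logit choice shares across the profiles in the choice set."""
    expu = [math.exp(utility(pr)) for pr in profiles]
    total = sum(expu)
    return [e / total for e in expu]

existing = [
    {"item": "hot dog", "size": "regular", "price": 5.0},
    {"item": "nachos", "size": "regular", "price": 6.0},
]
# A product not currently offered, described only by its attribute levels
new_product = {"item": "nachos", "size": "large", "price": 8.0}

shares_before = logit_shares(existing)
shares_after = logit_shares(existing + [new_product])
print(shares_before, shares_after)
```

The new product needs no sales history, only attribute levels that respondents evaluated in the conjoint exercise.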

Let’s fill out a class conjoint about food at the ballpark; use code 920-89










… out_num) that maximizes ll_ratio(mdat1, out_num).
… cv_mspe(out_num, mdat1).
Use the print() command to output ll_ratio(mdat1, out_num) and cv_mspe(out_num, mdat1) within your script.

