Stanford CS329H: Machine Learning from Human Preferences | Autumn 2024

Key Concepts:

Choice Models: Tools to predict choice behavior of individuals or groups within a specific context.
Rational Choice Models: Choice models based on the assumption that individuals make rational decisions to maximize their utility.
Discrete Choice: Choice from a finite set of alternatives.
Utility: A latent variable representing the benefit, value, or reward an individual derives from an item or choice.
Featurization: Representing items and individuals with a set of features for model input.
Bradley-Terry, Plackett-Luce: Standard choice models for discrete choice.
Logistic Regression: A statistical model used for binary classification, often used in choice modeling with specific noise assumptions.
Probit Model: A binary choice model using a standard normal distribution for noise.
Multiclass Logistic Regression: An extension of logistic regression to handle multiple choice options.
Ordered Logit Model: A choice model for ordinal responses (e.g., ratings on a Likert scale) using thresholds.
Revealed Preference: Observing real choices made in real situations.
Stated Preference: Gathering hypothetical choices in a controlled environment.
Overfitting: A situation where a model performs well on training data but poorly on unseen data.

1. Introduction to Choice Models

The goal is to provide the technical tools needed to understand human preference learning in modern machine learning pipelines.
Choice models are tools to predict choice behavior of individuals or groups within a specific context.
The process involves observing choices, fitting a model to the data, and predicting future choices given new contexts.
It's important to engage with and critique the assumptions made when building choice models.

2. Applications of Choice Models

Marketing: Modeling preferences for car purchases based on features like brand, price, and demographics.
Transportation: Route planning algorithms that consider factors like weather, traffic, and user preferences (speed vs. shortest path).
Energy: Planning and logistics applications.
Activity Planning: Modeling an individual's activity sequence based on adjustable factors like driving or walking.
Language Modeling: Modeling decision-makers' preferences across documents.

3. Historical Context

4. Core Technology and Assumptions

Choice models involve asking humans about choices across alternatives and combining this with featurization.
Core technology developed in the 1950s and 1960s is still used today.
Models covered include Bradley-Terry and Plackett-Luce.
Assumptions about rationality are important.
Focus is on discrete or finite choices.
Context can be featurized or represented directly (e.g., using sentences).

5. Discrete Choice Models and Utility

Discrete choice models capture decision processes for individuals or groups.
They assume the existence of utility, which can be thought of as benefits, value, or reward.
The utility an individual derives from each item in a pair is inferred from how often they choose one item over the other.
True utility is assumed unobservable but can be measured via stated or revealed preferences.

6. Mathematical Formalism

For an individual N, given items I and J, the observation is whether they choose I over J (1 if yes, 0 if no).
Choices are assumed to be generated by an underlying utility function.
Features (Z or X) describe individual attributes and alternative choices.
A function maps features to utility.
A simple example is a linear model.
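
The setup above can be written compactly as follows. This is a sketch in assumed notation, not necessarily the notation used in the lecture: lowercase n, i, j stand for the individual and the two items, x_{ni} is the feature vector, h is the function mapping features to utility, beta is the weight vector of the linear case, and epsilon_{ni} is a noise term.

```latex
\begin{aligned}
u_{ni} &= h(x_{ni}) + \varepsilon_{ni},
  \qquad h(x_{ni}) = \beta^\top x_{ni} \quad \text{(linear case)} \\
y_{nij} &= \mathbb{1}\{\, u_{ni} > u_{nj} \,\} \\
P(y_{nij} = 1) &= P\big(u_{ni} > u_{nj}\big)
  = P\big(\varepsilon_{nj} - \varepsilon_{ni} < h(x_{ni}) - h(x_{nj})\big)
\end{aligned}
```

The last line shows that the choice probability depends only on the difference in utilities and on the distribution of the noise difference.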

7. Implications of the Choice Model

The utility cannot be fully estimated.
Adding a constant to all utilities does not change the choice model.
Only ordering information across alternatives is captured.
The probability of making a certain choice is the probability that the utility for item I is greater than the utility for all other items.
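The model is scale-free: multiplying all utilities and the noise by the same positive constant leaves the choices unchanged, and the ordering of alternatives is invariant to any monotonically increasing transformation of utility.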
Comparability across contexts is limited without normalization.
Normalization, such as assuming a standardized variance, can allow for comparability.
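
The invariances listed above can be checked numerically. The sketch below assumes a logit noise model, so choice probabilities are a softmax over utilities divided by a noise scale. The function name choice_probs, the scale parameter, and the utility values are all hypothetical, chosen only for illustration.

```python
import math

def choice_probs(utilities, scale=1.0):
    """Choice probabilities under a logit noise model with the given noise scale."""
    z = [u / scale for u in utilities]
    m = max(z)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

base = [1.0, 2.0, 0.5]              # hypothetical utilities for three items
shifted = [u + 10.0 for u in base]  # add the same constant to every utility
doubled = [2.0 * u for u in base]   # rescale the utilities ...

p_base = choice_probs(base)
p_shifted = choice_probs(shifted)               # identical to p_base
p_rescaled = choice_probs(doubled, scale=2.0)   # ... and the noise: identical to p_base
p_doubled_only = choice_probs(doubled)          # noise scale held fixed: probabilities change
```

Rescaling the utilities while holding the noise scale fixed changes the probabilities but not the ordering of alternatives, which is why fixing the noise variance is one way to make utilities comparable across contexts.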

8. Binary Choice Model

Restricting the choice to two options: pick the item or not.
The utility function is a linear model.
If the noise model is logistic, the probability of picking the item is a logistic function.
Fitting the model involves using logistic regression.
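
As a minimal sketch of the fitting step, the code below simulates binary choices from a linear utility with logistic noise and recovers the coefficients by gradient ascent on the average log-likelihood. The data, the true coefficients, the learning rate, and all function names are assumptions made for illustration; in practice a library logistic regression routine would normally be used instead.

```python
import math
import random

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def log_likelihood(beta, xs, ys):
    """Average log-likelihood of binary choices under the logit model."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(beta[0] + beta[1] * x)
        total += math.log(p) if y == 1 else math.log(1.0 - p)
    return total / len(xs)

def fit_binary_logit(xs, ys, lr=0.5, steps=200):
    """Gradient ascent on the average log-likelihood (intercept and one slope)."""
    beta = [0.0, 0.0]
    n = len(xs)
    for _ in range(steps):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            err = y - sigmoid(beta[0] + beta[1] * x)  # observed minus predicted
            g0 += err
            g1 += err * x
        beta[0] += lr * g0 / n
        beta[1] += lr * g1 / n
    return beta

# Simulated data: one feature, hypothetical true coefficients.
random.seed(0)
true_beta = [-0.5, 2.0]
xs = [random.uniform(-2.0, 2.0) for _ in range(500)]
ys = [1 if random.random() < sigmoid(true_beta[0] + true_beta[1] * x) else 0
      for x in xs]

beta_hat = fit_binary_logit(xs, ys)
```

With enough data, beta_hat should land near true_beta; the gap between the two is the usual sampling error of a maximum likelihood estimate.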

9. Noise Models and Extensions

Choosing a different noise model, such as a standard normal, results in a probit-type binary choice model.
Using an i.i.d. extreme value (Gumbel) distribution for the noise terms leads to a closed-form softmax (multinomial logit) solution, which can be written with a separate beta per alternative or with a shared beta.
The i.i.d. (independent and identically distributed) assumption for noise may not be realistic in real settings.
Correlations between noise terms can be modeled using a hierarchical model.
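
The effect of the noise model can be illustrated with a short sketch. The first two functions compare logit and probit choice probabilities for the same utility difference. The simulation then adds i.i.d. Gumbel noise to each utility and checks that the empirical choice frequencies approach the softmax form. The utilities, sample size, and function names are hypothetical.

```python
import math
import random

def logit_prob(v):
    """P(choose item) when the noise difference is standard logistic."""
    return 1.0 / (1.0 + math.exp(-v))

def probit_prob(v):
    """P(choose item) when the noise difference is standard normal."""
    return 0.5 * (1.0 + math.erf(v / math.sqrt(2.0)))

def softmax(utilities):
    m = max(utilities)
    exps = [math.exp(u - m) for u in utilities]
    total = sum(exps)
    return [e / total for e in exps]

def simulate_gumbel_choices(utilities, n_draws, rng):
    """Empirical choice frequencies when each utility gets i.i.d. Gumbel noise."""
    counts = [0] * len(utilities)
    for _ in range(n_draws):
        noisy = []
        for u in utilities:
            p = rng.uniform(1e-12, 1.0 - 1e-12)  # keep p strictly inside (0, 1)
            noisy.append(u - math.log(-math.log(p)))  # inverse-CDF Gumbel draw
        counts[noisy.index(max(noisy))] += 1  # the highest noisy utility is chosen
    return [c / n_draws for c in counts]

rng = random.Random(0)
utilities = [1.0, 0.0, -1.0]  # hypothetical utilities for three alternatives
empirical = simulate_gumbel_choices(utilities, 20000, rng)
analytic = softmax(utilities)
```

Here empirical and analytic should agree up to sampling error. For the same utility difference, the logit and probit probabilities differ mainly because the two noise distributions have different variances.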

10. Generalization and Model Fitting

The problem can be set up as a multiclass problem or a binary problem.
Any model class can be used (deep learning, decision tree, SVM).
Standard machine learning considerations, such as the bias-variance trade-off and overfitting, apply.
Individual differences can be modeled by assuming different utility functions or tying betas together.

11. Likert Scale Preferences

Extending the model to handle ordinal ratings (e.g., responses on a Likert scale) using thresholds.
In addition to fitting the parameters of the h function, the thresholds also need to be fit.
This can be set up as a maximum likelihood estimation problem.
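
A minimal sketch of the threshold construction, assuming the common ordered logit form in which the probability of a rating at or below category k is a logistic function of (threshold k minus utility). The threshold values and function names are hypothetical; the negative log-likelihood is the quantity that would be minimized jointly over the thresholds and the parameters of h.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def ordered_logit_probs(h, thresholds):
    """Probability of each ordinal category given utility h.

    thresholds must be sorted in increasing order; K thresholds give K + 1 categories.
    """
    cumulative = [sigmoid(t - h) for t in thresholds] + [1.0]  # P(rating <= k)
    probs, previous = [], 0.0
    for c in cumulative:
        probs.append(c - previous)  # difference of adjacent cumulative probabilities
        previous = c
    return probs

def negative_log_likelihood(hs, ratings, thresholds):
    """Negative log-likelihood of observed ratings (0-based category indices)."""
    return -sum(math.log(ordered_logit_probs(h, thresholds)[r])
                for h, r in zip(hs, ratings))

thresholds = [-2.0, -1.0, 1.0, 2.0]  # hypothetical cut points for a 5-point scale
neutral = ordered_logit_probs(0.0, thresholds)   # mass centered on the middle rating
positive = ordered_logit_probs(3.0, thresholds)  # mass shifted to the top rating
```

Raising the utility h moves probability mass toward higher categories, while the thresholds control how wide each category is.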

12. Plackett-Luce Model

A model for ranking over J items.
The probability model is a product of the probabilities of each choice given the previous choices.
Standard extensions can be applied, such as using different function classes or noise models.
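
Under the standard Plackett-Luce form, a ranking is built by repeatedly choosing the best remaining item with softmax probabilities, and the ranking's probability is the product of those sequential choice probabilities. The sketch below computes it in log space; the item names and utilities are hypothetical.

```python
import math
from itertools import permutations

def plackett_luce_log_prob(ranking, utilities):
    """Log-probability of a full ranking (best first) under Plackett-Luce.

    utilities maps each item to its utility.
    """
    remaining = list(ranking)
    log_prob = 0.0
    for item in ranking:
        # Log of the softmax normalizer over the items not yet ranked.
        m = max(utilities[j] for j in remaining)
        log_norm = m + math.log(sum(math.exp(utilities[j] - m) for j in remaining))
        log_prob += utilities[item] - log_norm
        remaining.remove(item)
    return log_prob

utilities = {"a": 1.0, "b": 0.0, "c": -0.5}  # hypothetical item utilities

# Probabilities over all possible rankings should sum to one.
total = sum(math.exp(plackett_luce_log_prob(list(r), utilities))
            for r in permutations(utilities))

# With two items the model reduces to the Bradley-Terry pairwise probability.
pair = math.exp(plackett_luce_log_prob(["a", "b"], {"a": 1.0, "b": 0.0}))
```

Summing plackett_luce_log_prob over observed rankings gives the log-likelihood that would be maximized when fitting the utilities.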

13. Summarizing the Choice Modeling Process

14. Observing Preferences: Revealed vs. Stated

Revealed Preference: Observing real choices in real situations.
Stated Preference: Gathering hypothetical choices in a controlled environment.
Stated preferences allow for controlled experiments but may be unrealistic.
Revealed preferences capture real behavior but may suffer from confounding and limited coverage.
In language models, experiments often use stated preferences.

15. Conclusion

Choice models are a powerful set of tools for predicting human behavior.
The choice of model, noise distribution, and observation method depends on the specific application and the assumptions one is willing to make.
Understanding the limitations and assumptions of these models is crucial for their effective use.
