AI Voter Personas: US 2024

Problem

How can we generate structured, data-grounded responses to new questions without fielding new surveys?

Approach

  • Variance-weighted K-means clustering (15 voter segments from 51 policy variables)
  • Silhouette and stability diagnostics
  • 10-question probabilistic assignment quiz
  • Structured LLM conditioning grounded in cluster-level distributions
  • Controlled prompting to prevent drift
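
The clustering, diagnostic, and assignment steps above can be sketched on toy data. This is a minimal illustration under stated assumptions, not the project's actual pipeline: the weighting scheme (each variable's share of total variance), the deterministic farthest-point seeding, the softmax over negative squared distances, and every function name are hypothetical choices made for the example. The real project uses 15 segments and 51 variables; the toy data here has 2 groups and 3 variables.

```python
import math
from statistics import pvariance


def variance_weights(rows):
    """Weight each variable by its share of total variance (assumed scheme)."""
    variances = [pvariance(col) for col in zip(*rows)]
    total = sum(variances)
    return [v / total for v in variances]


def dist(a, b, w):
    """Weighted Euclidean distance between two response vectors."""
    return math.sqrt(sum(wi * (ai - bi) ** 2 for ai, bi, wi in zip(a, b, w)))


def nearest(row, centroids, w):
    return min(range(len(centroids)), key=lambda j: dist(row, centroids[j], w))


def kmeans(rows, k, w, iters=100):
    """Lloyd iterations with the weighted distance and farthest-point seeding."""
    centroids = [rows[0]]
    while len(centroids) < k:
        # Next seed: the row farthest from all seeds chosen so far.
        centroids.append(max(rows, key=lambda r: min(dist(r, c, w) for c in centroids)))
    for _ in range(iters):
        labels = [nearest(r, centroids, w) for r in rows]
        updated = []
        for j in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == j]
            # Keep the old centroid if a cluster ends up empty.
            updated.append([sum(c) / len(members) for c in zip(*members)]
                           if members else centroids[j])
        if updated == centroids:
            break
        centroids = updated
    return centroids, [nearest(r, centroids, w) for r in rows]


def mean_silhouette(rows, labels, w):
    """Average silhouette score; points in singleton clusters score 0."""
    scores = []
    for i, r in enumerate(rows):
        by_cluster = {}
        for j, other in enumerate(rows):
            if j != i:
                by_cluster.setdefault(labels[j], []).append(dist(r, other, w))
        own = by_cluster.pop(labels[i], None)
        if not own or not by_cluster:
            scores.append(0.0)
            continue
        a = sum(own) / len(own)                                 # cohesion
        b = min(sum(d) / len(d) for d in by_cluster.values())   # separation
        scores.append((b - a) / max(a, b) if max(a, b) else 0.0)
    return sum(scores) / len(scores)


def soft_assign(answers, centroids, w, temperature=1.0):
    """Softmax over negative squared distances: one probability per segment."""
    scores = [-dist(answers, c, w) ** 2 / temperature for c in centroids]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    return [e / sum(exps) for e in exps]


# Toy responses: two well-separated groups; the third variable never varies.
rows = [[1.0, 1.0, 3.0], [1.0, 2.0, 3.0], [2.0, 1.0, 3.0],
        [6.0, 7.0, 3.0], [7.0, 6.0, 3.0], [7.0, 7.0, 3.0]]
w = variance_weights(rows)
centroids, labels = kmeans(rows, k=2, w=w)
score = mean_silhouette(rows, labels, w)
probs = soft_assign([1.5, 1.5, 3.0], centroids, w)
print(w, labels, round(score, 3), [round(p, 3) for p in probs])
```

In this sketch the constant third variable receives zero weight, so it has no influence on segment membership. A quiz built on the same idea would pass only the answered variables to soft_assign and report the resulting probabilities rather than a single hard label.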

Validation

  • Tested on 200 held-out ANES respondents
  • Predictions meaningfully above chance
  • Accuracy improved when demographics were included
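
One way to make "above chance" concrete is to compare held-out accuracy with a baseline and ask how likely that accuracy would be under the baseline alone. The sketch below assumes "chance" means always guessing the most common answer, which may differ from the definition used in the project. The labels are synthetic placeholders, not ANES results, and the function names are hypothetical.

```python
from collections import Counter
from math import comb


def accuracy(pred, true):
    """Share of held-out respondents whose answer was predicted correctly."""
    return sum(p == t for p, t in zip(pred, true)) / len(true)


def majority_baseline(true):
    """Accuracy of always guessing the most common answer (assumed 'chance')."""
    return Counter(true).most_common(1)[0][1] / len(true)


def binom_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p): exact one-sided test against chance."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Synthetic placeholder answers for eight hypothetical respondents.
true = ["A", "A", "A", "A", "B", "B", "C", "C"]
pred = ["A", "A", "A", "B", "B", "B", "C", "A"]

acc = accuracy(pred, true)
base = majority_baseline(true)
hits = sum(p == t for p, t in zip(pred, true))
p_value = binom_tail(hits, len(true), base)
print(f"accuracy={acc:.2f} baseline={base:.2f} p={p_value:.3f}")
```

With only eight toy respondents the p-value is large even though accuracy exceeds the baseline, which is why a held-out sample of a few hundred respondents gives a more informative comparison.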

Limitations

  • Smooths over individual inconsistency
  • Augments but does not replace real surveys

Guillermo Lezama
PhD Economist | Data Scientist | Causal Inference & ML Systems