Research Question
What behavioral indicators reliably signal attempts to game deliberative processes?
Related Existing Resources
Research
Adversarial testing for Generative AI
Google’s guide defines adversarial testing as systematically evaluating ML models against malicious or inadvertently harmful input. It covers explicit queries (containing policy-violating language) and implicit queries (seemingly harmless but involving sensitive topics). The four-stage workflow inv...
Research
Strategy-proofness and Arrow's conditions: Existence and correspondence theorems for voting procedures and social welfare functions
Satterthwaite’s landmark 1975 paper on strategy-proofness and Arrow’s conditions investigates the relationship between preventing strategic manipulation in voting procedures and satisfying the conditions of Arrow’s impossibility theorem. This foundational work in mechanism design theory demonstrates existence ...
Research
Strategic Classification
Hardt et al. (2015) address classifier manipulation by strategic actors, modeling the problem as a sequential game between classifier designers and individuals seeking favorable classification who may alter attributes to game the system. For natural cost function classes, they developed computati...
Research
Strategic Classification is Causal Modeling in Disguise
Miller, Milli, and Hardt (2020) reveal a fundamental connection between strategic classification and causal inference, distinguishing between gaming (circumventing the system) and genuine improvement. Their central argument is that designing classifiers that incentivize improvement must inevitabl...