Research Question

What behavioral indicators reliably signal attempts to game deliberative processes?

Related Existing Resources

Research

Adversarial testing for Generative AI

Google’s guide defines adversarial testing as systematically evaluating ML models against malicious or inadvertently harmful input. It covers explicit queries (containing policy-violating language) and implicit queries (seemingly harmless but involving sensitive topics). The four-stage workflow inv...
Research

Strategy-proofness and Arrow's conditions: Existence and correspondence theorems for voting procedures and social welfare functions

Satterthwaite’s landmark 1975 paper investigates the relationship between preventing strategic manipulation in voting procedures (strategy-proofness) and satisfying Arrow’s conditions for social welfare functions. This foundational work in mechanism design theory demonstrates existence ...
Research

Strategic Classification

Hardt et al. (2015) address classifier manipulation by strategic actors, modeling the problem as a sequential game between classifier designers and individuals seeking favorable classification who may alter attributes to game the system. For natural cost function classes, they developed computati...
Research

Strategic Classification is Causal Modeling in Disguise

Miller, Milli, and Hardt (2020) reveal a fundamental connection between strategic classification and causal inference, distinguishing between gaming (circumventing the system) and genuine improvement. Their central argument is that designing classifiers that incentivize improvement must inevitabl...