Introducing Democratic Fine-Tuning
Joe Edelman’s Democratic Fine-Tuning (DFT) process aligns LLMs with human values through collective deliberation. It relies on Values Cards, on which participants articulate underlying values such as “protecting my community” rather than divisive language, and on Moral Graphs, collaborative data structures that map relationships between values to create a “wisdom gradient”. Participants work through three stages: articulating considerations, selecting the wisest values, and identifying hierarchical relationships between values. The output is training data for reward models that fine-tune LLMs toward wise rather than merely obedient behavior.
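As a rough illustration of the Moral Graph idea described above, the following Python sketch stores Values Cards as nodes and participant judgments of the form "this value is wiser than that one" as weighted directed edges. It is not the DFT implementation: the names `ValuesCard`, `MoralGraph`, `add_vote`, `net_scores`, and `preference_pairs`, the net-vote scoring rule, and the second card title are all assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ValuesCard:
    """Hypothetical card: an id plus a short statement of the value."""
    card_id: str
    title: str


@dataclass
class MoralGraph:
    """Hypothetical graph: cards are nodes, 'wiser than' votes are edges."""
    cards: dict = field(default_factory=dict)
    # Maps (less_wise_id, wiser_id) to the number of participants who agreed.
    votes: dict = field(default_factory=lambda: defaultdict(int))

    def add_card(self, card):
        self.cards[card.card_id] = card

    def add_vote(self, less_wise_id, wiser_id):
        """Record one participant's judgment that wiser_id is the wiser value."""
        if less_wise_id not in self.cards or wiser_id not in self.cards:
            raise KeyError("both cards must be added before voting")
        self.votes[(less_wise_id, wiser_id)] += 1

    def net_scores(self):
        """Assumed scoring rule: votes received as wiser minus votes as less wise."""
        scores = {card_id: 0 for card_id in self.cards}
        for (less_wise_id, wiser_id), count in self.votes.items():
            scores[wiser_id] += count
            scores[less_wise_id] -= count
        return scores

    def preference_pairs(self):
        """Return (preferred, rejected) pairs where one direction has more votes."""
        pairs = []
        for (less_wise_id, wiser_id), count in self.votes.items():
            reverse = self.votes.get((wiser_id, less_wise_id), 0)
            if count > reverse:
                pairs.append((wiser_id, less_wise_id))
        return pairs


graph = MoralGraph()
graph.add_card(ValuesCard("a", "Protecting my community"))
# The second title is invented for this example, not taken from DFT.
graph.add_card(ValuesCard("b", "Understanding what my community needs"))

graph.add_vote("a", "b")  # two participants consider "b" the wiser value
graph.add_vote("a", "b")
graph.add_vote("b", "a")  # one participant disagrees

print(graph.net_scores())
print(graph.preference_pairs())
```

Under these assumptions, pairs like those returned by `preference_pairs` are the kind of comparison data a reward model could be trained on, and the net scores give one crude reading of a "wisdom gradient". The actual process may aggregate judgments quite differently.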