Resource

Introducing Democratic Fine-Tuning

Joe Edelman’s Democratic Fine-Tuning (DFT) process aims to align LLMs with human values through collective deliberation. It relies on two tools: Values Cards, on which participants articulate underlying values such as “protecting my community” rather than using divisive language, and Moral Graphs, collaborative data structures that map relationships between values to create a “wisdom gradient”. Participants work through three stages: articulating considerations, selecting the wisest values, and identifying hierarchical relationships among them. The result is training data for reward models that fine-tune LLMs toward wise rather than merely obedient behavior.
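To make the Values Card and Moral Graph ideas concrete, here is a minimal Python sketch of how they might be represented. It is an illustration under stated assumptions, not the DFT authors' implementation: the class names, the fields, the net-vote scoring, and the `preference_pairs` export format are all hypothetical choices made for this example.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ValuesCard:
    """A value articulated by a participant (fields are hypothetical)."""
    title: str
    description: str = ""


@dataclass
class MoralGraph:
    """Directed graph over values cards.

    An edge (less_wise -> wiser) stores how many participants judged the
    second value to be wiser than the first. This vote-counting scheme is
    an assumption made for illustration.
    """
    votes: dict = field(default_factory=lambda: defaultdict(int))

    def add_vote(self, less_wise: ValuesCard, wiser: ValuesCard) -> None:
        """Record one participant's judgment that `wiser` improves on `less_wise`."""
        self.votes[(less_wise, wiser)] += 1

    def wisdom_scores(self) -> dict:
        """Net endorsement per card: votes received as wiser minus votes against."""
        scores = defaultdict(int)
        for (less_wise, wiser), count in self.votes.items():
            scores[wiser] += count
            scores[less_wise] -= count
        return dict(scores)

    def preference_pairs(self):
        """Yield (preferred, rejected, margin) where one direction has more votes.

        Pairs like these are one plausible input format for a reward model.
        """
        for (less_wise, wiser), count in self.votes.items():
            reverse = self.votes.get((wiser, less_wise), 0)
            if count > reverse:
                yield (wiser, less_wise, count - reverse)


# Example: two hypothetical cards and three participant judgments.
obey = ValuesCard("Following instructions")
community = ValuesCard("Protecting my community")

graph = MoralGraph()
graph.add_vote(obey, community)   # two participants see "community" as wiser
graph.add_vote(obey, community)
graph.add_vote(community, obey)   # one participant disagrees

scores = graph.wisdom_scores()
pairs = list(graph.preference_pairs())
```

In this sketch the "wisdom gradient" is approximated by the direction in which net votes flow between cards. A real deployment would likely need richer structure, such as the context in which each judgment applies.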

Experimental Practice
Creators: Joe Edelman and Oliver Klingefjord
Year: 2023