Value lock-in in AI systems refers to the phenomenon where AI models, trained on vast amounts of human data, reflect and propagate existing societal biases, entrenching misguided moral beliefs and practices on a large scale4. This can reinforce problematic behaviors such as climate inaction and discrimination.
ProgressGym addresses AI value lock-in by introducing "progress alignment," a solution that incorporates mechanisms emulating human-driven moral progress3. The framework uses historical text data and models to track, predict, and co-evolve with human values, ensuring AI alignment with current and future values while mitigating the risks of locking in misguided moral beliefs and practices14.