Length-Instruction Fine-Tuning (LIFT) is a training method that addresses length bias in instruction-following models by augmenting training data with explicit length instructions. Unlike prior approaches that add length penalties to evaluation benchmarks or rely on fine-tuning techniques such as RLHF, LIFT enables models to be controlled at inference time so that they adhere to specified length constraints. LIFT incorporates Direct Preference Optimization (DPO), constructing preference pairs that reflect both length constraints and response quality, which improves the model's ability to generate accurate and appropriately concise responses.
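A minimal sketch of the two ideas above: prepending an explicit length instruction to each prompt, and building a DPO-style preference pair in which a response that violates the stated length limit is rejected even when it would otherwise score higher on quality. All function and field names here are illustrative assumptions, not the paper's actual implementation.

```python
def augment_with_length_instruction(prompt: str, max_words: int) -> str:
    """Prepend an explicit length instruction to the original prompt.
    The instruction wording is a hypothetical template."""
    return f"Answer the following in at most {max_words} words.\n\n{prompt}"

def word_count(text: str) -> int:
    return len(text.split())

def build_preference_pair(prompt: str, resp_a: str, resp_b: str,
                          max_words: int, quality: dict) -> dict:
    """Construct a (chosen, rejected) pair reflecting both the length
    constraint and response quality. `quality` maps each response text
    to a scalar quality score (an assumed stand-in for a judge model)."""
    a_ok = word_count(resp_a) <= max_words
    b_ok = word_count(resp_b) <= max_words
    if a_ok and not b_ok:
        # Length compliance dominates: the over-length response is rejected
        # even if its quality score is higher.
        chosen, rejected = resp_a, resp_b
    elif b_ok and not a_ok:
        chosen, rejected = resp_b, resp_a
    else:
        # Both satisfy (or both violate) the limit: fall back to quality.
        if quality[resp_a] >= quality[resp_b]:
            chosen, rejected = resp_a, resp_b
        else:
            chosen, rejected = resp_b, resp_a
    return {
        "prompt": augment_with_length_instruction(prompt, max_words),
        "chosen": chosen,
        "rejected": rejected,
    }
```

Pairs in this format can then be fed to a standard DPO trainer; the key design choice sketched here is that length compliance overrides quality when exactly one response respects the limit.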
The LIFT method was used to fine-tune Llama 2 and Llama 3 models. These models were trained using augmented datasets that included length instructions, enabling them to generate responses that adhere to specific length constraints while maintaining high response quality.