Goldfish loss reduces memorization in language models, mitigating privacy and copyright risks in industrial applications. It prevents models from reproducing exact sequences from their training data while maintaining performance comparable to standard-trained models. Goldfish-trained models resist verbatim reproduction and are less susceptible to data extraction attacks, offering a practical way to enhance privacy in industrial settings.
The "goldfish loss" technique reduces LLM memorization by excluding a random subset of tokens from the loss computation during training. This prevents the model from memorizing and reproducing exact sequences from its training data. A hashed masking approach ensures consistent token masking based on the context of preceding tokens, avoiding the leakage of entire passages while effectively learning language patterns during training.
The hashed masking approach strengthens the goldfish loss by making token masking consistent: whether a token is dropped depends on a hash of a localized window of preceding tokens. This consistency matters for duplicate passages in web documents, which recur with different attributions, headers, and surrounding content. Because the same local context always hashes to the same masking decision, every copy of a passage drops the same tokens, so the model cannot piece the full passage together across copies, yet it still learns the important language patterns during training.
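A minimal sketch of such a hashed mask is shown below, assuming SHA-256 over the preceding token ids; the drop frequency `k` and the context width are illustrative placeholders rather than the paper's exact settings:

```python
import hashlib
import torch

def hashed_goldfish_mask(input_ids, k=4, context_width=13):
    """Return a boolean mask that is True where a token's loss should be dropped.

    The decision for each position hashes the `context_width` preceding
    token ids, so identical local contexts (even across duplicated
    documents) always produce the same masking decision.
    """
    batch, seq_len = input_ids.shape
    drop = torch.zeros(batch, seq_len, dtype=torch.bool)
    for b in range(batch):
        for t in range(context_width, seq_len):
            # Hash the preceding token ids to get a deterministic decision.
            ctx = input_ids[b, t - context_width:t].tolist()
            digest = hashlib.sha256(str(ctx).encode("utf-8")).digest()
            # Drop roughly one in k tokens, consistently for this context.
            if int.from_bytes(digest[:8], "little") % k == 0:
                drop[b, t] = True
    return drop
```

The resulting mask can be applied by setting the corresponding target positions to the loss's ignore index before computing the cross-entropy, as in the previous sketch.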