FairDeDup reduces social bias in AI by removing redundant data while incorporating controllable, human-defined dimensions of diversity [4]. This keeps AI training cost-effective and accurate, and it promotes fairness by letting people define what is fair in their specific setting rather than relying on biased internet data or large-scale datasets [4].
Deduplication in AI training is the removal of redundant information from the data used to train AI systems, which helps lower the high computing costs of training [3]. It can also mitigate harmful social biases that AI models might otherwise learn from biased datasets, leading to fairer and more accurate AI training.
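To make the idea concrete, the sketch below contrasts plain deduplication with a diversity-aware variant. It is a toy illustration under stated assumptions, not the FairDeDup algorithm itself: the greedy grouping, the cosine threshold of 0.95, the per-sample `attributes` labels, and all function names are hypothetical choices made for this example.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def group_near_duplicates(embeddings, threshold=0.95):
    """Greedily group samples: each sample joins the first group whose
    first member is at least `threshold` similar, else starts a new group."""
    groups = []
    for i, emb in enumerate(embeddings):
        for group in groups:
            if cosine(embeddings[group[0]], emb) >= threshold:
                group.append(i)
                break
        else:
            groups.append([i])
    return groups


def plain_dedup(embeddings, threshold=0.95):
    """Keep the first sample of every near-duplicate group."""
    return [group[0] for group in group_near_duplicates(embeddings, threshold)]


def diversity_aware_dedup(embeddings, attributes, threshold=0.95):
    """Keep one sample per group, preferring the attribute value that is
    least represented among the samples kept so far (hypothetical rule)."""
    counts = {}
    kept = []
    for group in group_near_duplicates(embeddings, threshold):
        # Ties are broken by the lower index so the result is deterministic.
        choice = min(group, key=lambda i: (counts.get(attributes[i], 0), i))
        counts[attributes[choice]] = counts.get(attributes[choice], 0) + 1
        kept.append(choice)
    return kept


# Toy data: samples 0 and 1 are near-duplicates, as are samples 2 and 3.
embeddings = [[1.0, 0.0], [0.99, 0.05], [0.0, 1.0], [0.05, 0.99]]
# Hypothetical human-defined diversity labels, one per sample.
attributes = ["a", "b", "a", "b"]

print(plain_dedup(embeddings))                        # [0, 2]
print(diversity_aware_dedup(embeddings, attributes))  # [0, 3]
```

Both variants keep two of the four samples, so the compute savings are the same. The plain version keeps only samples labeled `"a"`, while the diversity-aware version keeps one sample of each label, which is the kind of trade-off a human-defined diversity dimension is meant to control.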