BM42 is designed to advance hybrid search by bridging the gap between the traditional BM25 algorithm and modern transformer models, aiming to improve search accuracy, efficiency, and scalability for Retrieval-Augmented Generation (RAG) applications. It keeps the strengths of BM25 but uses transformer attention matrices to determine term importance in documents, offering a versatile, efficient, and accurate solution for today's search applications.
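At a high level, and with notation assumed here for illustration rather than taken from a formal specification, the BM42 score can be sketched as BM25's IDF weighting paired with an attention-derived importance for each query term:

$$
\operatorname{score}(D, Q) = \sum_{q_i \in Q} \operatorname{IDF}(q_i) \cdot \operatorname{Attention}(\text{[CLS]}, q_i)
$$

where $\operatorname{Attention}(\text{[CLS]}, q_i)$ is the attention weight that the [CLS] token assigns to query term $q_i$ in document $D$.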
BM42 integrates transformer models by using their attention matrices to determine term importance: it reads the attention row corresponding to the special [CLS] token, which aggregates the meaning of the whole sequence, so the attention it pays to each token is a good proxy for that token's importance in the document. Crucially, this signal stays reliable even for the shorter texts typical of RAG applications, where BM25's term-frequency statistics carry little information.
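As an illustration only, here is a minimal Python sketch of extracting per-token importance from the [CLS] attention row with Hugging Face Transformers. The model name, the choice of the last layer, and averaging across attention heads are assumptions for this sketch, not Qdrant's exact implementation:

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Any BERT-style encoder works; this model is just an example choice.
MODEL = "sentence-transformers/all-MiniLM-L6-v2"
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModel.from_pretrained(MODEL, output_attentions=True)

text = "BM42 blends BM25-style ranking with transformer attention."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# outputs.attentions is a tuple, one tensor per layer,
# each of shape (batch, num_heads, seq_len, seq_len).
last_layer = outputs.attentions[-1]

# Row 0 is the [CLS] token's attention over every token in the
# sequence; average across heads for a single weight per token.
cls_attention = last_layer[0, :, 0].mean(dim=0)

tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
for token, weight in zip(tokens, cls_attention.tolist()):
    print(f"{token:>12s}  {weight:.4f}")
```

In a full pipeline these attention weights would then be combined with IDF values and stored as a sparse vector for retrieval, but that bookkeeping is omitted from this sketch.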
BM42 also overcomes several limitations of SPLADE. It retains the interpretability and simplicity of BM25 while avoiding SPLADE's heavy computational requirements and its problems with tokenization and domain dependency. As a result, BM42 is more efficient, has a low memory footprint, and transfers across languages and domains more readily, all while gauging token importance accurately on short RAG-style documents.