In 2022, we introduced the Eddy AI chatbot as part of our 2.0 release. Eddy AI uses a Retrieval-Augmented Generation (RAG) approach, powered by MongoDB's vector search capabilities. The initial release focused on semantic matching, and the UI listed all the source articles used to generate a response so users could trace its lineage. Since launch, our team has been monitoring the performance of the Eddy AI chatbot, which includes:
- Tracking content-retrieval latency: the time taken to retrieve the right context from the MongoDB collections, since articles are stored as chunks.
- Tracking keyword-only usage in the chatbot UI: understanding how often users enter a bare keyword rather than a full question
- Tracking model drift in the accuracy of generated responses
The Eddy AI chatbot serves thousands of prompts every day, and monitoring these prompts holistically gives us an intuition for how customers actually use it. We observed that most customers enter keywords into the UI and expect a response. Semantic matching works best when a question carries enough context, so when only keywords are entered, Eddy AI often produces an “I do not know” response. That response undermines confidence in Eddy AI's capabilities and hurts adoption, but it also gave us valuable insight into customer behavior.
Staying true to one of our core values, quality, we strive to serve customers beyond their expectations. We began exploring ways to bridge this gap, which led us to hybrid search.
Our Journey into Hybrid Search
While exploring how to use keywords as part of vector search, we discovered the hybrid search approach. Hybrid search uses both keywords and full questions within the RAG pipeline. Because two searches run instead of one, more chunks are retrieved, which creates a new problem: selecting the “right” chunks to generate a response from. Reranking algorithms address this by ordering the chunks and eliminating noisy ones. A widely used fusion algorithm is Reciprocal Rank Fusion (RRF), which combines the chunks returned by the “keyword” search and the “semantic” search.
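To make the fusion step concrete, here is a minimal sketch of RRF in Python. The chunk IDs and the constant k = 60 are illustrative values drawn from the RRF literature, not our production code; the idea is simply that each chunk's fused score is the sum of 1/(k + rank) across the keyword and semantic result lists, so chunks ranked highly by both searches rise to the top while noisy, single-list chunks fall away.

```python
# Minimal, illustrative sketch of Reciprocal Rank Fusion (RRF).
# Chunk IDs and k=60 are example values, not Document360's production code.

def reciprocal_rank_fusion(ranked_lists, k=60):
    """Fuse several ranked lists of chunk IDs into one ordering.

    Each input list is ordered best-first; k=60 is the constant
    commonly used in the RRF literature.
    """
    scores = {}
    for ranking in ranked_lists:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] = scores.get(chunk_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first
    return sorted(scores, key=scores.get, reverse=True)

# Example: chunk "c2" appears near the top of both lists, so it wins.
keyword_hits = ["c1", "c2", "c5"]    # from keyword search
semantic_hits = ["c2", "c3", "c1"]   # from semantic (vector) search
print(reciprocal_rank_fusion([keyword_hits, semantic_hits]))
```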
We initially tried to implement this approach with custom code, and later found that MongoDB had recently released a native hybrid search capability.
Our Hybrid Search Implementation
We built a MongoDB hybrid search pipeline in which:
- MongoDB Atlas Search performs the “keyword” search after keywords are extracted from the prompt
- MongoDB Vector Search performs the “semantic” matching on the prompt itself
- The RRF re-ranker orders the combined chunks

A simplified sketch of this retrieval flow is shown below.
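The cluster URI, index names (`default`, `vector_index`), field names (`content`, `embedding`), and limits in the sketch are illustrative assumptions rather than our production configuration.

```python
# Illustrative sketch of the hybrid retrieval flow with PyMongo against Atlas.
# Index names, field names, and limits are assumptions, not our production values.
from pymongo import MongoClient

client = MongoClient("mongodb+srv://<cluster-uri>")
chunks = client["kb"]["article_chunks"]

def keyword_search(keywords: str, limit: int = 10):
    # Atlas Search ($search) over the chunk text, using keywords extracted from the prompt.
    return list(chunks.aggregate([
        {"$search": {"index": "default",
                     "text": {"query": keywords, "path": "content"}}},
        {"$limit": limit},
        {"$project": {"_id": 1, "content": 1}},
    ]))

def semantic_search(query_embedding: list[float], limit: int = 10):
    # Atlas Vector Search ($vectorSearch) over precomputed chunk embeddings (ANN).
    return list(chunks.aggregate([
        {"$vectorSearch": {"index": "vector_index",
                           "path": "embedding",
                           "queryVector": query_embedding,
                           "numCandidates": 100,
                           "limit": limit}},
        {"$project": {"_id": 1, "content": 1}},
    ]))

# The two result lists are then fused with RRF (as sketched earlier),
# and the top-ranked chunks are passed to the LLM as context.
```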
This solution was pushed to production recently and has been receiving rave reviews. The accuracy of generated responses improved significantly once we switched the vector stage from approximate nearest neighbor (ANN) to exact nearest neighbor (ENN) search for higher precision.
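In Atlas Vector Search, this switch amounts to setting `exact: true` on the `$vectorSearch` stage (ENN scans all indexed vectors, so `numCandidates` is no longer supplied); the snippet below is a sketch of that variant, reusing the illustrative index and field names from the earlier example.

```python
# ENN variant of the semantic stage: exact nearest-neighbor search trades
# latency for precision. Index/field names are still illustrative.
exact_semantic_stage = {
    "$vectorSearch": {
        "index": "vector_index",
        "path": "embedding",
        "queryVector": query_embedding,  # embedding of the user prompt, as before
        "exact": True,                   # ENN: no numCandidates needed
        "limit": 10,
    }
}
```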
If the user enters only keywords, only the Atlas Search part of the hybrid pipeline is activated, and it can still pick the right chunks. In the future, we plan to use the synonyms/related-terms feature in MongoDB Atlas Search to further improve precision.
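As a rough sketch of what that could look like: Atlas Search keeps synonym documents in a regular collection, registers that collection in the search index definition under a mapping name, and then opts in at query time. The mapping name, source collection, and terms below are hypothetical examples, not a finalized design.

```python
# Hedged sketch of the Atlas Search synonyms feature we plan to evaluate.
# The mapping name, source collection, and terms are hypothetical examples.

# 1. Synonym documents live in an ordinary collection (e.g. kb.synonyms):
synonym_doc = {
    "mappingType": "equivalent",
    "synonyms": ["kb", "knowledge base", "help center"],
}

# 2. The search index definition registers kb.synonyms as a synonym source
#    under a mapping name (here "eddy_synonyms") -- done in the index JSON.

# 3. The text operator then references that mapping at query time:
keyword_stage_with_synonyms = {
    "$search": {
        "index": "default",
        "text": {
            "query": "kb article",
            "path": "content",
            "synonyms": "eddy_synonyms",
        },
    }
}
```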
Scaling with the Latest GenAI Technologies
As part of our continuous innovation, we recently switched to OpenAI's newest large language model, GPT-4.1 mini, to increase the accuracy of generated responses. The hybrid search approach has improved the chatbot's response accuracy, and Eddy AI can now respond even when the customer enters only a keyword in the interface. We're also exploring MongoDB's latest advancements in GenAI:
$rankFusion: We’ve tried out $rankFusion to streamline hybrid search and are impressed with how it simplifies the pipeline (a hedged sketch follows this list). Because $rankFusion exposes scoring details, it can also serve other use cases, such as a recommendation engine. We have created a pipeline that incorporates $rankFusion into the upcoming product release.
Voyage AI embeddings: With MongoDB's recent acquisition of Voyage AI, we plan to create embeddings natively at the data layer rather than calling a third-party API and then storing the embeddings in MongoDB collections. We are also exploring other MongoDB GenAI capabilities for emerging use cases, such as building AI agents.
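To show roughly how $rankFusion collapses the earlier two-step flow into a single aggregation, here is a hedged sketch. The stage layout follows the documented $rankFusion shape as we understand it, while the pipeline names, weights, and index/field names are our own illustrative choices; it reuses `chunks`, `keywords`, and `query_embedding` from the earlier PyMongo sketch.

```python
# Hedged sketch: one aggregation that fuses keyword and semantic results
# server-side with $rankFusion (MongoDB 8.1+). Reuses `chunks`, `keywords`,
# and `query_embedding` from the earlier sketch; names and weights are illustrative.
hybrid_pipeline = [
    {"$rankFusion": {
        "input": {"pipelines": {
            "keyword": [
                {"$search": {"index": "default",
                             "text": {"query": keywords, "path": "content"}}},
                {"$limit": 20},
            ],
            "semantic": [
                {"$vectorSearch": {"index": "vector_index",
                                   "path": "embedding",
                                   "queryVector": query_embedding,
                                   "numCandidates": 100,
                                   "limit": 20}},
            ],
        }},
        "combination": {"weights": {"keyword": 1, "semantic": 1}},
        "scoreDetails": True,   # exposes per-pipeline contributions to the fused rank
    }},
    {"$limit": 10},
    {"$project": {"content": 1,
                  "scoreDetails": {"$meta": "scoreDetails"}}},
]
top_chunks = list(chunks.aggregate(hybrid_pipeline))
```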
Elevating Eddy AI Through Innovation
We remain committed to evolving Eddy AI to serve our customers better. We monitored all prompts, analyzed the data to understand customer search trends, and came to deeply understand the customer pain point. Our move to MongoDB hybrid search and our adoption of the latest GenAI technology are part of our customer-centric approach: improving the accuracy of the responses our Eddy AI chatbot generates for its users.
We believe that combining the right technology with customer-centric design is the key to unlocking AI’s full potential. As MongoDB continues to innovate, it is important to apply the right technology to customer problems. At Document360, we stay focused on both innovation and customer experience.
We’re excited about what’s next.