Insights
Over the last year, weʼve helped companies unlock the full potential of their data. Equipped with a foundation of quicker and more actionable insights, they are defining the frontiers of whatʼs possible with AI in their respective industries — and achieving market-leading positions as a result.
Retrieval-Augmented Generation (RAG) is at the heart of this.
RAG enhances a modelʼs responses by retrieving relevant information from your own data sources and incorporating it into the prompt during generation.
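At its simplest, the flow is: embed the userʼs question, retrieve the closest passages, and inject them into the prompt before generation. A toy sketch of that loop (the bag-of-words `embed` function and the two-document corpus are illustrative stand-ins for a real embedding model and document store):

```python
import math

# Toy embedding: word counts over a tiny vocabulary. A real system would
# call an embedding model here instead.
VOCAB = ("refund", "policy", "shipping", "returns")

def embed(text):
    words = [w.strip(".,?") for w in text.lower().split()]
    return [words.count(v) for v in VOCAB]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

DOCUMENTS = [
    "Our refund policy allows returns within 30 days.",
    "Standard shipping takes 3 to 5 business days.",
]

def retrieve(question, k=1):
    q = embed(question)
    return sorted(DOCUMENTS, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def build_prompt(question):
    # The retrieved passages are injected into the prompt before generation.
    context = "\n".join(retrieve(question))
    return f"Answer using this context:\n{context}\n\nQuestion: {question}"

print(build_prompt("What is your refund policy?"))
```

The final prompt, context included, is what gets sent to the language model; everything upstream of that call is where most of the engineering effort in a RAG system lives.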
Weʼve noticed businesses exploring RAG are often confronted with a labyrinth of options and complexities. This slows their progress. This guide aims to help you navigate these challenges and set your projects up for success.
The success of any RAG system hinges on the quality of your data. Itʼs a simple principle, but one that can be surprisingly difficult to get right.
Data quality is key
Take time to ensure that your data is accurate, up-to-date, and well-structured. If you do this right, you will only have to do it once.
We worked with a leading investment management company to create a unified knowledge base across terabytes of data, forming a strong foundation for RAG systems that deliver value across their organisation.
Large Language Models (LLMs) play a critical role here too. The days of spending countless hours categorising files manually are over: you can quickly configure LLMs to clean and label your data, transforming chaos into clarity.
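As a sketch of what that configuration can look like, the helpers below build a classification prompt per document and parse the modelʼs one-word reply. The `call_llm` function, the label set, and the stubbed reply are hypothetical placeholders; in production you would call your model provider:

```python
# Sketch of LLM-assisted labelling: build a classification prompt per file,
# send it to a chat model, and parse the single-label reply.
LABELS = ("contract", "invoice", "report", "other")

def labelling_prompt(text):
    return (
        f"Classify this document as exactly one of: {', '.join(LABELS)}.\n\n"
        f"Document:\n{text[:2000]}\n\nLabel:"
    )

def parse_label(reply):
    # Be defensive: models sometimes add punctuation or capitalisation.
    cleaned = reply.strip().lower().rstrip(".")
    return cleaned if cleaned in LABELS else "other"

def label_document(text, call_llm):
    return parse_label(call_llm(labelling_prompt(text)))

# Stubbed model reply for illustration; swap in a real API call.
stub_llm = lambda prompt: "Invoice."
print(label_document("Amount due: 1,200 by 30 June...", stub_llm))
```

Keeping the parsing defensive matters at scale: across thousands of files, even a small rate of malformed replies would otherwise corrupt your labels.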
Maintaining data privacy
For many organisations, making data more accessible across teams is new territory, often raising questions about who should see what. It may even uncover situations where employees have had access to data they shouldnʼt.
These are critical challenges. But they are solvable.
RAG systems can be built in a way that addresses these concerns, even down to managing access at a single document level. This ensures users only see what they are authorised to see, helping your organisation remain compliant with regulations and internal policies whilst empowering teams to collaborate more effectively, without fear of data leaks or privacy breaches.
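One way to realise document-level permissions is to filter on access metadata before ranking, so unauthorised content never reaches the prompt. A minimal sketch, in which the group names and document records are illustrative:

```python
# Sketch of document-level access control at retrieval time. Each document
# carries an access list, and the retriever filters on the caller's groups
# *before* ranking, so unauthorised content never enters the prompt.
DOCUMENTS = [
    {"text": "Q3 board pack: revenue up 12%.", "allowed": {"exec"}},
    {"text": "Public FAQ: opening hours.", "allowed": {"exec", "support", "sales"}},
]

def retrieve_for_user(query, user_groups):
    visible = [d for d in DOCUMENTS if d["allowed"] & user_groups]
    # ...rank `visible` against `query` as usual (ranking omitted here)...
    return [d["text"] for d in visible]

print(retrieve_for_user("results", {"support"}))
```

Filtering before ranking, rather than after, is the safer design: restricted content never competes for a place in the context window in the first place.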
For any project to succeed, it must address real challenges. Engage your users early to uncover their pain points and try to understand how you can solve them.
In many cases, RAG can be an effective tool to help alleviate these challenges. Across industries, RAG often lies at the heart of larger solutions. Weʼve leveraged it to build cutting-edge client applications, ranging from customer service chatbots handling millions of interactions to better-than-human data cleansing.
By building small prototypes that incorporate RAG and sharing them with users for feedback, you can refine your systems to be both intuitive and user-friendly. Keeping users at the centre of your development process ensures youʼre creating technology that teams truly value — and actually use. This is critical for translating your AI investment into competitive advantage.
We worked with a global drinks company on a worldwide marketing campaign focused on helping students develop start-up ideas that would make a positive impact on people, community and the planet. The ambitious goal was to increase applications by over 20 times compared to 2023. To achieve this, we integrated GenAI into every aspect of the user experience. From generating personalised starter ideas to simulating expert advice on crafting compelling communications, GenAI played a pivotal role in driving engagement and delivering exceptional results.
Context management is king in RAG systems. Itʼs all about getting the right data to the right model at the right time.
There are many ways to optimise your systemʼs performance, but we often see the biggest gains from improving the retrieval step - the search for the data needed to answer a userʼs question.
What works best will depend on your specific use case, but we can share a few best practices that will help improve your systemʼs performance across many domains.
Step 1: Getting your data structure right
Once youʼve cleaned and labelled your data, there are a few different options for storing it.
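One common option is a vector store, where each chunk is saved alongside its embedding and the metadata produced during cleaning and labelling. A minimal in-memory sketch of the record shape (the field names and example values are illustrative; a production system would use a dedicated vector database):

```python
# Minimal in-memory record store sketch: each chunk keeps its text, its
# embedding, and the metadata produced during cleaning and labelling.
store = []

def add_chunk(text, embedding, metadata):
    store.append({"text": text, "vec": embedding, "meta": metadata})

add_chunk("Returns accepted within 30 days.", [0.1, 0.9],
          {"source": "policy.pdf", "label": "returns"})
add_chunk("Delivery takes 3 to 5 days.", [0.8, 0.2],
          {"source": "faq.pdf", "label": "shipping"})

# Metadata lets you narrow the search space before any vector comparison.
returns_chunks = [r["text"] for r in store if r["meta"]["label"] == "returns"]
print(returns_chunks)
```

Carrying the labels from the cleaning step into the store is what makes later filtering - by source, topic, or access rights - cheap and reliable.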
Step 2: Enhancing your database search
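One widely used enhancement at this step is hybrid search, which blends keyword matching with vector similarity so that exact terms and paraphrases both rank well. A toy sketch, in which the precomputed `vector_scores` and the weighting `alpha` are illustrative:

```python
# Toy hybrid search sketch: blend a vector-similarity score (from your
# embedding search) with simple keyword overlap.
def keyword_score(query, doc):
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / len(q) if q else 0.0

def hybrid_rank(query, docs, vector_scores, alpha=0.5):
    scored = sorted(
        ((alpha * vector_scores[i] + (1 - alpha) * keyword_score(query, doc), doc)
         for i, doc in enumerate(docs)),
        reverse=True,
    )
    return [doc for _, doc in scored]

docs = ["refund policy details", "shipping times overview"]
print(hybrid_rank("refund policy", docs, vector_scores=[0.9, 0.2]))
```

Tuning `alpha` per use case matters: legal and technical queries often lean on exact terminology, while conversational queries benefit from the semantic side.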
Step 3: Enhancing the data you retrieve
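One option at this step is to expand each retrieved chunk with its neighbours before passing it to the model, so answers arenʼt built from isolated fragments. A minimal sketch (the chunks and window size are illustrative, and chunk ordering is assumed known, e.g. page order):

```python
# Sketch of context expansion: after retrieval, widen each hit to include
# its neighbouring chunks so the model sees complete surrounding context.
chunks = [
    "Intro to our returns process.",
    "Returns accepted within 30 days.",
    "Exceptions apply to sale items.",
]

def expand_with_neighbours(hit_index, window=1):
    lo = max(0, hit_index - window)
    hi = min(len(chunks), hit_index + window + 1)
    return " ".join(chunks[lo:hi])

# The middle chunk was the retrieval hit; the model receives all three.
print(expand_with_neighbours(1))
```

The trade-off is context length: a wider window gives the model more to work with but costs more tokens per query.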
When embedding models donʼt quite work
For more nuanced data, such as technical or legal data, common embedding models can struggle to capture subtle but pivotal distinctions. In these cases, you can instead use an LLM to label relevant pieces of information more precisely.
These designs deliver significantly improved retrieval accuracy, with trade-offs in latency and cost. However, with the cost of AI having dropped by over 99% in the past two years, bespoke systems like this are more accessible than ever.
Once youʼve built your RAG system, you face one last hurdle - getting your team to trust it.
Define clear metrics
Start by identifying the key metrics that reflect your systemʼs performance. Common metrics include accuracy, relevance, speed, and user satisfaction. Tailor these metrics to align with your organisationʼs priorities. For instance, a customer support bot might prioritise safety and speed, whilst an investment knowledge base may emphasise accuracy and relevance.
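A small evaluation harness can turn those metrics into a repeatable report. A sketch, assuming a labelled set of question-answer pairs; the stub system and test case are placeholders for your real pipeline and data:

```python
import time

# Sketch of a tiny evaluation harness: run the system over labelled test
# cases and report the chosen metrics (here, accuracy and mean latency).
def evaluate(system, test_cases):
    correct, latencies = 0, []
    for question, expected in test_cases:
        start = time.perf_counter()
        answer = system(question)
        latencies.append(time.perf_counter() - start)
        correct += int(expected.lower() in answer.lower())
    return {
        "accuracy": correct / len(test_cases),
        "mean_latency_s": sum(latencies) / len(latencies),
    }

# Stubbed system for illustration; swap in your RAG pipeline.
stub = lambda q: "Returns are accepted within 30 days."
print(evaluate(stub, [("What is the returns window?", "30 days")]))
```

Running the same harness after every change gives you a regression check as well as a benchmark, which becomes invaluable once multiple people are tuning the system.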
Leverage synthetic data
If youʼre building a RAG system for a workflow without a lot of historic process data, it can be difficult to evaluate the performance of your system. In this case, synthetic data generation can be useful to get a benchmark. Using LLMs, you can quickly generate example user inputs or underlying data to test how your system performs. This helps you tune the system before pilot testing.
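A sketch of that idea, with the LLM call stubbed out; the prompt wording, the `call_llm` stand-in, and the one-question-per-line parsing are all illustrative assumptions:

```python
# Sketch of synthetic test generation: for each source document, ask an LLM
# to write questions users might plausibly ask about it.
def question_prompt(doc, n=3):
    return f"Write {n} questions a user might ask about this document:\n\n{doc}"

def generate_test_questions(doc, call_llm, n=3):
    reply = call_llm(question_prompt(doc, n))
    return [line.lstrip("- ").strip() for line in reply.splitlines() if line.strip()]

# Stubbed model reply for illustration; swap in a real API call.
stub_llm = lambda prompt: "- What is the returns window?\n- Are sale items refundable?"
print(generate_test_questions("Returns accepted within 30 days...", stub_llm))
```

The generated questions can then be fed straight into your evaluation metrics, giving you a benchmark before a single real user has touched the system.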
Put it in front of real users
Synthetic benchmarks are useful, but nothing beats real-world testing. As soon as you can, deploy your system for a pilot with your most engaged end users.
Build trust through transparency
Build explainability into your system. Transparency is key for building user trust, particularly in regulated or high-stakes industries.
RAG was the star of the enterprise AI show in 2024. But, forward-looking organisations are already asking: whatʼs next?
Future-proof your system
New AI models and tools are emerging every day. Build a flexible RAG system so you can quickly integrate them without major overhauls.
Embrace fine-tuning
Organisations at the forefront of enterprise AI are turning to fine-tuning to unlock more complex workflows and higher-value automation. Fine-tuning involves retraining a general purpose model to perform more effectively within a specific task or domain.
At a high level, the enterprise use cases for fine-tuning fall into two broad categories.
We developed a “Prompt Guard” for an international transportation firm’s customer-facing chatbot to block inappropriate content and protect their reputation. Initially, we implemented a prompt-engineered GPT-4o-mini solution, achieving c.92% accuracy in rejecting unsuitable messages. By prototyping and collecting data from this solution, we built a dataset to fine-tune GPT-4o-mini, ultimately surpassing 99% accuracy in detecting unsuitable messages while significantly reducing false positives.
We worked with a leading international insurance firm to migrate an internal productivity copilot from GPT-4o to a fine-tuned GPT-4o-mini, achieving a 50x cost saving whilst maintaining performance.
There are several ways to fine-tune LLMs. A common entry point is Supervised Fine-Tuning (SFT), where you train the model to produce correct, domain-specific responses based on a curated set of input-output examples.
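A sketch of what preparing SFT data can look like. The record shape follows the JSONL chat format used by OpenAIʼs fine-tuning API; the example content itself is invented for illustration:

```python
import json

# Sketch of preparing SFT training data. Each example pairs an input with
# the exact response the fine-tuned model should learn to produce.
examples = [
    {
        "messages": [
            {"role": "system", "content": "You classify support messages."},
            {"role": "user", "content": "My parcel never arrived."},
            {"role": "assistant", "content": "category: delivery_issue"},
        ]
    },
]

# One JSON object per line, as the fine-tuning API expects.
with open("train.jsonl", "w") as f:
    for example in examples:
        f.write(json.dumps(example) + "\n")
```

The curation of these examples - not the training run itself - is usually where most of the effort goes; a few hundred high-quality pairs often beat thousands of noisy ones.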
OpenAI have recently introduced two new fine-tuning methods which can be used to push models to the next level.
Prepare for agents
The future of RAG doesnʼt end with improved retrieval and cost reductions - itʼs about orchestrating advanced AI components to enable entirely new classes of solutions.
What does this look like in practice?
This may all sound like science fiction. Itʼs not. Itʼs happening now.
We recently partnered with a major sports league to deploy an AI agent that leverages RAG to analyse fresh game insights in the context of historical data, autonomously generates tailored content for fans, and decides when to publish based on its own criteria for whatʼs timely and exciting.
This system isnʼt just delivering information — itʼs making editorial judgments and taking action on them.
Deploying a successful RAG project isnʼt just about adopting the latest technology — itʼs about delivering measurable business value. At Tomoro, we specialise in aligning advanced AI solutions with your strategic priorities to achieve real impact. Whether youʼre launching your first RAG initiative or looking to take your production-ready system to the next level, our team is here to guide you every step of the way.
Are you ready to lead the charge into 2025?
Connect with our specialists today and start transforming your vision into reality.
Tomoro works with the most ambitious business & engineering leaders to realise the AI-native future of their organisation. We deliver agent-based solutions which fit seamlessly into businesses’ workforces, from design to build to scaled deployment.
Founded by experts with global experience in delivering applied AI solutions for tier 1 financial services, telecommunications and professional services firms, Tomoro’s mission is to help pioneer the reinvention of business through deeply embedded AI agents.
Powered by our world-class applied AI R&D team, working in close alliance with OpenAI, we are proven leaders in turning generative AI into market-leading competitive advantage for our clients.
We’re looking for a small number of the most ambitious clients to work with in this phase. If you think your organisation could be the right fit, please get in touch.