To KG, or not KG, that is the Question

Author: Rishabh Sagar

When NOT to use knowledge graphs in retrieval based applications


“The wisdom of modelling the world around us using a network of graphs is an ancient one.”


As wise as these words are, they are also fake! They were generated using GPT-4 and Dalle-3 to help me make a point.

When attempting to solve a problem with just a hammer in hand, every challenge tends to look like a nail — especially if the hammer is a shiny new one that you just learned how to use. In that state, every new piece of information can feed confirmation bias.

In the last article, we discussed the benefits of using graph databases as RAG backends when building LLM-based applications.

We demonstrated some key benefits of grounding such applications with information in graph databases:

| Observed Benefit | Description |
| --- | --- |
| Reduced hallucinations | Improved recall reduces hallucinations: average Recall@5 on news articles was consistently 1.2% above vector-based retrieval in our experiments. |
| Reduced input token size | A graph database backend allows increased granularity, capturing data at fact level (details below). We observed up to 20% fewer input tokens thanks to the granular selection step, which let us better select for relevance and increase the precision of the retrieval process. |
| Better data management | Grouping facts around specific entities enables "entity-level" data ownership, which simplifies operational management and reduces complexity in data management processes. It also incorporates a time dimension and decay factor for recency metrics during the retrieval phase. |
| Knowledge inference opportunities | Using graph databases as RAG backends creates opportunities to actively mine for inferences based on knowledge already held in the corpus, boosting answer quality at the generation stage. |

However, graph databases are certainly not the only correct solution. Depending on your application requirements, the process of selecting the backend can be more nuanced.

In this article, we will discuss the contours of this decision-making process and explore scenarios where a humble vector-based retrieval technique might outshine other options. We do this with the intention of highlighting the importance of using the right tool for the job and avoiding the “one size fits all” design-pattern fallacy.

The traditional approach to RAG

In traditional RAG implementations, the knowledge corpus is processed through an embedding model and the resulting output is stored, typically, in a vector store.

At retrieval time, the user question is processed through the same embedding model and, using a similarity algorithm (e.g. cosine similarity), the “distance” between the user question and the data chunks in the vector database is calculated. The chunks closest to the user question are deemed relevant and retrieved, then used as context to answer the question.

These knowledge chunks, along with the original question and accompanying context, are wrapped in a system prompt and sent to a foundation model to generate an appropriate response.
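The retrieval step described above can be sketched in a few lines. This is a minimal illustration rather than a production implementation: the toy corpus and its three-dimensional vectors are invented stand-ins for the output of a real embedding model.

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity: 1.0 means identical direction, 0.0 means orthogonal.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def retrieve(query_vec, chunks, k=2):
    # Score every stored chunk against the query and keep the k closest.
    scored = sorted(chunks, key=lambda c: cosine_similarity(query_vec, c[1]), reverse=True)
    return [text for text, _ in scored[:k]]

# Toy corpus: in a real system these vectors come from an embedding model.
corpus = [
    ("Refund policy: refunds within 30 days.", [0.9, 0.1, 0.0]),
    ("Shipping times: 3-5 business days.",     [0.1, 0.9, 0.0]),
    ("Warranty covers manufacturing defects.", [0.0, 0.2, 0.9]),
]
query = [0.8, 0.2, 0.1]  # pretend this embeds "How do refunds work?"
context = retrieve(query, corpus, k=1)
```

The retrieved `context` would then be injected into the system prompt alongside the user's question before the call to the foundation model.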


In defence of traditional approaches to retrieval

When choosing RAG backends, here are some design considerations to reflect on:

The need for relationships

If your use case does not fundamentally benefit from the interconnected nature of the underlying dataset, then the more traditional “semantic search and stuff” pattern might be the way to go.

This approach, which leans more towards vector-based retrieval systems, is particularly effective in scenarios where the data structure is less complex and does not inherently require the intricate relational mapping that graph databases excel at.

In use cases like customer policy chatbots, for example, the backend data often consists of self-contained documentation. This type of data is typically structured and doesn't necessarily benefit from the relational interconnectivity provided by a graph database.

During our research and development, we found that the cost of pre-processing this data into a graph data model does not yield major precision or recall benefits. The effort and resources required to transform and maintain these documents in a graph format are often not justified by the marginal improvements in the quality of results.

Furthermore, in the context of LLM generations, the ultimate goal is to provide accurate and relevant information or responses. For a customer policy chatbot, the effectiveness is measured by how accurately and quickly the system can retrieve and provide policy information in response to customer queries. Our findings suggest that vector-based retrieval systems are more than capable of achieving this without the additional complexity of a graph database.

These systems use advanced embeddings to understand the semantics of both the user query and the document content, enabling them to fetch the most relevant information effectively.

A heuristic to consider: if you are not using graph algorithms like shortest-path traversal or similar-neighbour analysis in the retrieval stage, you might not be using the graph database to its fullest potential. If your results are satisfactory without these techniques, it might be worth simplifying the backend stack.
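As a rough illustration of what “using the graph as a graph” means, here is a minimal breadth-first shortest-path traversal over a hypothetical entity graph (the entities and edges are invented for the example). If nothing in your retrieval stage resembles this kind of multi-hop operation, a flat vector index may serve you just as well.

```python
from collections import deque

def shortest_path(graph, start, goal):
    # Breadth-first search: returns the shortest chain of entities
    # linking start to goal, or None if they are unconnected.
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for neighbour in graph.get(path[-1], []):
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(path + [neighbour])
    return None

# Hypothetical entity graph as adjacency lists.
graph = {
    "Acme Corp":  ["Jane Doe", "Widget X"],
    "Jane Doe":   ["Acme Corp", "Patent 123"],
    "Patent 123": ["Jane Doe", "Widget X"],
    "Widget X":   ["Acme Corp", "Patent 123"],
}
chain = shortest_path(graph, "Acme Corp", "Patent 123")
```

A retrieval stage that genuinely needs this kind of relational hop — “how is this company connected to this patent?” — is a strong signal that a graph backend is earning its keep.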

Solution complexity

Using graph databases requires pre-processing the data, which may include steps like data decomposition, entity resolution, and relationship discovery. These steps require careful consideration and can add to the application's complexity.

Data decomposition involves breaking down complex data structures into simpler, graph-compatible formats, which can be a time-consuming and intricate process.

Entity resolution, the task of identifying and linking mentions of the same entity across different datasets, demands sophisticated algorithms and can be particularly challenging in the presence of ambiguous or incomplete data.

Relationship discovery, essential for constructing the graph, involves identifying and categorizing the connections between entities, which can be complex due to the nuanced nature of real-world relationships.
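To make the cost of these steps concrete, here is a deliberately simplified sketch of the entity-resolution step described above, using nothing more than name normalisation. The entity names are invented, and real entity resolution typically needs fuzzy matching, embeddings, or clustering on top of this.

```python
import re

def normalise(name):
    # Crude canonicalisation: lowercase, strip punctuation and
    # common corporate suffixes. Real systems need far more than this.
    cleaned = re.sub(r"[^\w\s]", "", name.lower())
    cleaned = re.sub(r"\b(inc|ltd|corp|corporation)\b", "", cleaned)
    return " ".join(cleaned.split())

def resolve_entities(mentions):
    # Group raw mentions under a shared canonical key.
    resolved = {}
    for mention in mentions:
        resolved.setdefault(normalise(mention), []).append(mention)
    return resolved

mentions = ["Acme Corp.", "ACME Corporation", "acme corp", "Globex Inc."]
entities = resolve_entities(mentions)
```

Even this toy version hints at the ambiguity problem: every edge case (abbreviations, typos, shared names) adds rules, and each rule is another thing to maintain before a single node reaches the graph.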

During the discovery or in the value-proving stage, this added complexity can be distracting and counter-productive to the overall aim of the project. It can lead to increased development time and costs, as well as requiring a higher level of expertise from the team. In the early stages of a project, where the focus should be on proving the concept and demonstrating value quickly, these complexities can become significant roadblocks. They can divert resources and attention away from core project objectives, potentially delaying time-to-market and impacting the project's momentum.

Moreover, implementing a rigid structure for graph databases can limit flexibility in dealing with evolving data models. As the project progresses and requirements change, adapting the graph database structure can be cumbersome. This rigidity can be particularly challenging in dynamic environments where the data and relationships are constantly evolving, requiring frequent updates to the database schema.

In contrast, vector-based retrieval systems, with their ability to handle unstructured data and dynamically interpret relationships, can offer a more agile and scalable solution in such scenarios. These systems can adapt more readily to changes in data and project requirements, allowing for faster iteration and more focused development efforts.

Therefore, while graph databases offer powerful capabilities for modelling complex relationships, it's crucial to weigh these benefits against the potential for added complexity and inflexibility in certain project stages or environments. Understanding the specific needs and constraints of a project is key to choosing the right technology solution, whether it be a graph database, vector-based retrieval system, or a combination of different technologies.

Data Modelling

Much of the power of graph databases comes from their ability to implement an interconnected data model. This modelling work, however, can be time-consuming and often requires several iterations to perfect. The initial design of a graph database model is rarely the final one, as real-world data and requirements are complex and multifaceted. As we delve deeper into the data and its relationships, the initial model often proves inadequate or overly simplistic, necessitating a redesign.

On average, we have had to perform major re-designs of the model approximately 3–5 times when building out graph backends. This frequency of redesign is usually a consequence of the iterative discovery process as the solution scales. In the early stages, the model might seem adequate, but as the system scales and more diverse data is introduced, limitations and inefficiencies become apparent. For instance, certain relationships or entities that were not considered critical in the initial stages may emerge as vital as the project evolves. Similarly, as the data grows, performance issues might arise, prompting a need to restructure the model for better efficiency.

Each redesign of the graph model involves not just changes to the database structure, but also modifications to the application logic that interacts with the database. This can lead to significant development overhead and can disrupt the project timeline. Moreover, it requires a deep understanding of both the data and the graph database technology, making it a resource-intensive task.

Furthermore, these redesigns are often a response to the evolving understanding of the project's requirements. As stakeholders gain a clearer view of what they want and need from the application, the data model must adapt accordingly. This evolving nature of project requirements can make graph database implementations somewhat unpredictable and harder to plan for in terms of resources and timelines.

In contrast, vector-based retrieval systems offer a level of flexibility and adaptability that can be advantageous in rapidly changing and scaling environments. These systems are inherently more capable of handling unstructured data and are built on a much simpler data structure.

Retrieval speed

Backends that support vector-based retrieval regularly outperform graph databases in terms of raw retrieval speed. This advantage becomes particularly significant in scenarios where rapid data access is critical. Vector-based systems are optimized for quick, efficient retrieval of relevant information from large datasets, making them ideal for applications requiring high-speed data processing.

In our testing, we found that depending on the complexity of the model, retrieval latency can be almost 50% higher with graph backends. This difference is particularly noticeable in applications that handle large volumes of data or require real-time processing.

For instance, in a use-case like a code explainer chatbot, where users expect immediate and relevant suggestions, the extra time taken by graph databases for retrieval can negatively impact the user experience. On the other hand, vector-based systems can quickly process and retrieve information, ensuring a seamless and responsive user interaction.

The inherent structure of graph databases, while excellent for representing complex relationships, can become a bottleneck when it comes to the speed of data retrieval. The traversal of nodes and edges in a graph database, especially in a densely interconnected network, requires more computational resources and time, thereby increasing latency. This issue is compounded as the size of the database grows and the complexity of the relationships increases.

If your solution requires raw speed, especially at scale, it might be worth considering re-designing the use case to minimize or eliminate real-time graph retrieval. In cases where high performance is a priority, segregating the data processing tasks can be beneficial. For example, non-critical data processing tasks that do not require immediate response times can continue to utilize graph databases for their depth of relational analysis, while time-sensitive tasks can leverage the speed of vector-based retrieval.
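One way to realise this split is a simple routing layer, sketched below with hypothetical placeholder functions: latency-sensitive queries go straight to the fast vector index, while everything else can afford the deeper (and slower) graph traversal.

```python
def vector_lookup(query):
    # Placeholder for a fast vector-index lookup (hypothetical).
    return f"vector-result for {query!r}"

def graph_analysis(query):
    # Placeholder for a slower, relationship-aware graph query (hypothetical).
    return f"graph-result for {query!r}"

def route(query, latency_sensitive):
    # Time-sensitive requests take the fast path; background or
    # analytical requests take the relational one.
    return vector_lookup(query) if latency_sensitive else graph_analysis(query)

answer = route("What is the refund window?", latency_sensitive=True)
```

In practice the routing decision might come from the request type, an SLA tag, or a classifier rather than an explicit flag, but the principle is the same: do not pay graph-traversal latency on the hot path unless the answer requires it.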


In conclusion, while graph databases offer unparalleled depth in modelling complex relationships and interconnectivity, they are not a one-size-fits-all solution for every RAG-based LLM application. The decision to use graph databases should be driven by the specific nature of the data and the application's requirements. Where speed, efficiency, and scalability are paramount, vector-based retrieval systems often provide a more suitable alternative, especially when handling large volumes of unstructured data or when real-time processing is required.

It's crucial for developers and decision-makers to carefully consider the trade-offs between the depth of analysis provided by graph databases and the agility and speed of vector-based systems. Understanding these nuances will enable teams to make informed choices that align with their project's goals, resource constraints, and timelines, ultimately leading to more successful and efficient implementations of RAG-based LLM applications.

This nuanced approach towards choosing the right technology backend — be it graph databases or vector-based retrieval systems — will not only optimize the performance of LLM applications but also ensure a more strategic allocation of resources and a better alignment with the project's overall objectives.

Tomoro works with the most ambitious business & engineering leaders to realise the AI-native future of their organisation. We deliver agent-based solutions which fit seamlessly into businesses’ workforce, from design to build to scaled deployment.

Founded by experts with global experience in delivering applied AI solutions for tier 1 financial services, telecommunications and professional services firms, Tomoro’s mission is to help pioneer the reinvention of business through deeply embedded AI agents.

Powered by our world-class applied AI R&D team, working in close alliance with OpenAI, we are a team of proven leaders in turning generative AI into market-leading competitive advantage for our clients.

We’re looking for a small number of the most ambitious clients to work with in this phase. If you think your organisation could be the right fit, please get in touch.