Insights
READER NOTE/INTENTION OF ARTICLE
This is technical-lite blog post that explains the risks of GenAI applications, while demonstrating that Tomoro has the expertise to build them safely for your company.
Over the last year, we've partnered with a leading game developer to deploy a GenAI chatbot that fields roughly 40,000 messages a day from a player base that includes children. Balancing speed, accuracy, safety, and fun is essential:
When creating a chatbot to represent your brand, you don't just want it to be safe - it needs to sound authentically you.
In the case of a game company, that means marrying safety with fun and user excitement - especially for younger audiences.
The LLM's that underpin modern GenAI applications are very flexible (which is why they generalise so well to many problems) but also under-constrained. This means that they can often go ‘off-piste' and return a wide variety of text (some of which might not be appropriate).
Exposing an LLM directly to the public will certainly result in unexpected behaviour and without proper mitigations, can run the risk of:
News headlines from unintended LLM use
These incidents underscore why building and maintaining safety in GenAI systems isn't optional. This is especially true when young children are involved, you can't simply rely on disclaimers — robust guardrails are essential to keep content appropriate.
Much like the RAG (Retrieval-Augmented Generation) systems we've built for clients, we leverage multiple AI components to reduce the risks of jailbreaking while keeping performance high.
A simplified architecture diagram of our system
To ensure system safety, we use LLM classifiers on both inputs and outputs:
To maintain speed and affordability:
A recent study by Anthropic described Constitutional Classifiers—a system that relies on additional classifiers, trained via “constitutional" rules, to detect malicious or unsafe requests. They reported:
We see a future where these constitutional approaches can complement or even replace traditional classification layers—offering a more comprehensive defense against evolving jailbreaking tactics.
Classification layers prevents malicious queries from ever reaching the generative core—and ensures poor outputs are never delivered.
By focusing on fun, lore-driven warnings and playful refusals, users are more inclined to accept the constraints of the system.
Regular red-teaming, plus frequent monitoring of player behaviour, will help spot vulnerabilities before they are known to the wider player base. These can then be fed back into the few shot classifier prompts to prevent them being used in the future (presently, this is a manual process but could be automated).
Regardless if you're a gaming company, a bank or even a government department, if you are going to implement a customer service chatbot at scale you need to aim for BOTH user design and trustworthiness.
At Tomoro we build customer-facing solutions for organisations & brand where:
If you'd like to learn more, please get in touch.
Tomoro works with the most ambitious business & engineering leaders to realise the AI-native future of their organisation. We deliver agent-based solutions which fit seamlessly into businesses’ workforce; from design to build to scaled deployment.
Founded by experts with global experience in delivering applied AI solutions for tier 1 financial services, telecommunications and professional services firms, Tomoro’s mission is to help pioneer the reinvention of business through deeply embedded AI agents.
Powered by our world-class applied AI R&D team, working in close alliance with Open AI, we are a team of proven leaders in turning generative AI into market-leading competitive advantage for our clients.
Continue Reading
We’re looking for a small number of the most ambitious clients to work with in this phase, if you think your organisation could be the right fit please get in touch.