Trismik: £2.2 Million Pre-Seed Funding Closed For LLM Evaluation Platform

Trismik, an emerging company in the realm of artificial intelligence (AI), has recently announced the completion of a £2.2 million pre-seed financing round. This funding was led by Twinpath Ventures and included participation from notable investors such as Cambridge Enterprise Ventures, Parkwalk Advisors, Fund F, Vento Ventures, and various angel investors associated with Ventures Together.

As research labs and companies race to build more powerful AI models, a critical challenge looms over the field: our ability to measure the capabilities of these systems is diminishing. Traditional benchmarks have reached a saturation point, with multiple models now exceeding 90% accuracy on widely used tests such as MMLU (Massive Multitask Language Understanding) and GSM8K (a dataset of mathematical reasoning tasks). High scores might seem encouraging, but when most models score near the ceiling, a benchmark no longer separates them. That makes it hard for businesses to assess how well their models perform specific tasks and to communicate the results to stakeholders and decision-makers.

Existing public benchmarks often fail to reflect the specialized requirements of particular domains or the distributions of proprietary data that enterprises depend on. This misalignment can hinder continuous model evaluation and improvement. Furthermore, emerging regulatory frameworks demand different types of assessments to meet compliance standards, which complicates the landscape further. These overlapping challenges create a perfect storm: teams lack the precise measurements needed for informed model selection and struggle to adapt their testing methodologies over time to ensure trustworthy deployment, all while competitive market pressures demand rapid development cycles.

Against this backdrop, a team from the University of Cambridge has proposed a solution, renewing the debate over how best to evaluate next-generation AI models. Operating as the newly established company Trismik, the Cambridge scientists advocate combining established psychometric approaches used to assess human intelligence with adaptive testing methods that automatically adjust the difficulty of questions in real time.

Trismik’s founding team brings together a wealth of experience, combining former engineering and commercial leaders from industry giants such as Amazon and Salesforce with leading scientists from the University of Cambridge. Their groundbreaking approach employs Item Response Theory (IRT) and Computerized Adaptive Testing (CAT)—the scientific foundations of conventional human intelligence testing—to refine AI evaluations. Much like educational psychologists tailor the complexity of questions to the individual test-taker, Trismik’s platform dynamically adjusts the AI evaluation complexity, enabling it to accurately represent the capabilities of the model being evaluated.
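To make the idea concrete, here is a minimal sketch of adaptive testing built on a two-parameter logistic (2PL) IRT model. It is an illustration under stated assumptions, not Trismik's implementation: the item bank, the `answer_fn` callback that stands in for querying a model, the grid-based ability estimate, and the standard normal prior are all hypothetical choices made for this example.

```python
import math


def p_correct(theta, a, b):
    """2PL IRT: chance that ability `theta` answers an item with
    discrimination `a` and difficulty `b` correctly."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta, a, b):
    """Fisher information of a 2PL item at ability `theta`."""
    p = p_correct(theta, a, b)
    return a * a * p * (1.0 - p)


def estimate_theta(responses, grid):
    """Grid-search MAP ability estimate (assumed standard normal prior)."""
    def log_posterior(theta):
        total = -0.5 * theta * theta  # log of the N(0, 1) prior, up to a constant
        for a, b, correct in responses:
            p = p_correct(theta, a, b)
            total += math.log(p if correct else 1.0 - p)
        return total
    return max(grid, key=log_posterior)


def adaptive_test(items, answer_fn, n_questions):
    """Ask the most informative remaining item, then re-estimate ability."""
    grid = [i / 10 for i in range(-40, 41)]  # candidate abilities from -4 to 4
    theta = 0.0                              # start from an average ability
    remaining = list(items)
    responses = []
    for _ in range(min(n_questions, len(remaining))):
        a, b = max(remaining, key=lambda it: item_information(theta, it[0], it[1]))
        remaining.remove((a, b))
        responses.append((a, b, answer_fn(a, b)))
        theta = estimate_theta(responses, grid)
    return theta, responses


# Hypothetical item bank: (discrimination, difficulty) pairs, difficulty -3 to 3.
item_bank = [(1.5, d / 4) for d in range(-12, 13)]


# Toy stand-in for a model: it answers every item up to difficulty 1.0 correctly.
def toy_model(a, b):
    return b <= 1.0


theta_hat, asked = adaptive_test(item_bank, toy_model, n_questions=10)
```

In this toy run only 10 of the 25 items are asked, and the selection concentrates on difficulties near the current ability estimate, which is where each answer is most informative. A real evaluation would replace `toy_model` with a call that scores a model's answer to the chosen question, and would need item parameters calibrated from actual response data.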

At the helm of this initiative is Professor Nigel Collier, a prominent researcher in Natural Language Processing (NLP) at Cambridge and Trismik’s Chief Scientific Officer. Collier began his academic career in the 1990s with a PhD focused on machine translation using neural networks.

Over the years, he has developed a strong commitment to ensuring AI systems operate as reliable partners for humanity, rather than posing risks. Collier’s prolific output, comprising over 200 published papers, reflects his deep curiosity about the potential for AI assessment to be executed with the same fairness and efficiency as that of human intelligence, ultimately motivating the development of Trismik’s distinctive adaptive evaluation approach.

In 2023, Professor Collier met co-founder Rebekka Mikkola, a repeat founder and enterprise sales executive dedicated to building in AI and advocating for women in technology. The pair gained early support from Cambridge Enterprise, and their initial collaboration with a major UK telecommunications company led to Trismik’s first Minimum Viable Product (MVP). In 2025, Marco Basaldella, a former Amazon scientist and TEDx speaker, joined the founding team as Chief Technology Officer (CTO). This mix of scientific, engineering, and commercial expertise underpins Trismik’s mission.

If Trismik’s approach works as envisioned, it could change how model capabilities are understood by shifting the focus from simple accuracy percentages to more nuanced distributions that reflect a model’s abilities. For organizations spending over $100,000 per month on GPU computing for evaluations, Trismik’s method can reduce costs by up to 95% while delivering richer insights. Adaptive tests have correlated closely with full assessments: in four of the five datasets tested, Spearman correlations exceeded 0.96 while using only 8.5% of the questions typically required.
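For readers unfamiliar with the statistic, a Spearman correlation measures how well two sets of scores agree on ranking, which is the relevant question when a short adaptive test is meant to order models the same way a full benchmark does. The sketch below is a plain standard-library version for illustration only. The score lists are invented toy numbers, not Trismik results, and the names `full_scores` and `adaptive_scores` are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Invented toy scores for five models (NOT real results).
full_scores = [0.62, 0.71, 0.75, 0.83, 0.90]   # accuracy on a full benchmark
adaptive_scores = [-0.8, -0.1, 0.2, 1.1, 0.9]  # ability estimates from a short test

rho = spearman(full_scores, adaptive_scores)
```

Here the two score lists disagree only on the order of the top two models, so the correlation is high but below 1. A value near 1 means the short test ranks models almost exactly as the full benchmark does, even though the two scales (accuracy versus ability estimate) are different.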

Having finalized its fundraising round, Trismik is now poised to concentrate on the launch of its flagship product, tailored for AI developers. This product will encompass both classical and adaptive evaluation methodologies, along with an expanding repertoire of high-quality public datasets that address various aspects such as factuality, alignment, safety, logical reasoning, and domain-specific knowledge. This breadth of resources will equip researchers and scientists with the necessary benchmarks to better interpret and understand model performance.

Currently, Trismik’s LLM Experimentation Platform offers lightweight, user-friendly tools for rapid and reliable evaluation of AI models. Looking ahead, Trismik aims to develop the platform into a full environment for designing, running, and analyzing LLM (Large Language Model) experiments, including fine-tuning and prompt engineering. The company intends to begin onboarding users towards the end of 2025, with early access available through its official website.

The founders anticipate the unveiling of their enterprise solution by early 2026, which will further solidify Trismik’s position as a leader in the AI evaluation landscape.

KEY QUOTES:

“If we want to trust AI, our methods have to be as rigorous as our ideas. Benchmark saturation is creating problems in every domain, from general knowledge, to reasoning, math, and coding. Scientists, researchers and technical teams face mounting pressure as evaluation is exploding in importance and has become essential for tying AI to trust. We need an evaluation framework that scales and can support this.”

Professor Nigel Collier, a Cambridge NLP researcher and Trismik’s Chief Scientific Officer

“The AI evaluation market is at an inflection point. Every AI team we speak with is drowning in evaluation overhead, it has become the hidden bottleneck preventing teams from shipping faster and with confidence. Trismik’s approach is compelling because it applies proven scientific methods from a completely different domain to solve this problem. When you can reduce evaluation time by two orders of magnitude while actually increasing measurement precision, you fundamentally change what’s possible in AI development cycles.”

John Spindler, Twinpath Ventures

“Trismik exemplifies Cambridge’s continued contribution to global AI development with the team combining world-class academic credentials and practical industry experience that has given them the unique authority to define how AI capabilities should be measured. By solving a pivotal challenge in AI adoption, Trismik is positioned to drive trust at scale – we’re excited to support their journey to market.”

Dr. Christine Martin, Head of Ventures at Cambridge Enterprise