Patronus AI recently launched the first automated evaluation and security platform that helps companies safely use large language models (LLMs). Using proprietary AI, the new platform enables enterprise development teams to score model performance, generate adversarial test cases, benchmark models, and more. Patronus AI also automates and scales the manual, costly model evaluation methods prevalent in the enterprise today, so organizations can deploy LLMs confidently while minimizing the risk of model failures and misaligned outputs.
Patronus AI was launched by machine learning experts Anand Kannappan and Rebecca Qian. Before Patronus AI, Rebecca led responsible NLP research at Meta AI, and Anand pioneered explainable ML frameworks at Meta Reality Labs. They founded the company after experiencing firsthand the difficulties of evaluating AI outputs and recognizing early on that LLM evaluation would become a massive challenge for enterprises.
Patronus AI uses state-of-the-art machine learning technology to test and score any language model and identify potential failures. The platform automates:
1.) Scoring – Scores model performance in real-world scenarios and on key criteria such as hallucinations and safety.
2.) Test generation – Automatically generates adversarial test suites at scale.
3.) Benchmarking – Compares models to help customers identify the best model for specific use cases.
Driving the company’s launch is a $3 million seed funding round led by Lightspeed Venture Partners, with participation from Factorial Capital, Replit CEO Amjad Masad, Gokul Rajaram, and several Fortune 500 executives and board members.
Early platform partners include leading AI companies such as Cohere, Nomic AI, and Naologic. Several high-profile companies in traditional industries like financial services will also pilot Patronus AI in the coming months.
“Every company is looking for ways to use LLMs today, yet they are concerned that unexpected model behavior, incorrect outputs and hallucinations will put their business and customers at risk. Whether off-the-shelf, open-source or custom, models today remain inadequately vetted and tested in real-world scenarios. And until now, the process of evaluating LLMs has been extremely inefficient and unscalable, producing unreliable results.”
– Anand Kannappan, CEO and co-founder, Patronus AI
“AI has become a must-have for businesses, as they seek to realize its full potential and not be left behind in the LLM revolution. But no responsible company is going to put their reputation on the line by leveraging risky models. Patronus AI not only has the technology to tackle this problem head-on, they have a world-class team from Meta, Airbnb and Samsung, with the expertise to help organizations safely navigate LLMs. We’re thrilled to be on this journey with them and look forward to playing a role in their continued growth.”
– Nnamdi Iregbulem, Partner at Lightspeed Venture Partners