Kolena: Putting AI and ML reliability to the test

2 hours ago

Mohamed Elgendy. Image: Kolena

This San Francisco-based start-up wants to make sure that AI and machine learning models are consistent and reliable, no matter the situation.

The AI surge continues to capture people’s attention with constant waves of innovation and upgrades. New chatbots and tools are appearing frequently as companies try to gain an upper hand in an increasingly competitive industry.

And while the world is continuously struck by the growing capabilities of AI, the tech has also seen its fair share of scrutiny due to ‘hallucinations’, from laughable answers from Google’s AI Overview feature to GDPR concerns sparked by incorrect information from OpenAI’s ChatGPT.

Blunders like these naturally cast doubt on the efficiency and accuracy of AI, and ultimately erode trust in the technology. Our latest Start-up of the Week, Kolena, wants to help AI developers ensure that their models are reliable in every situation.

Kolena is a platform for testing and validating AI and machine learning (ML) models. According to CEO Mohamed Elgendy, Kolena wants to make these systems “resilient and adaptive to real-world changes” by checking that they are consistently accurate across different scenarios.

‘Testing on steroids’

Kolena – which is based in San Francisco – was founded in 2021 by Elgendy, CTO Andrew Shi and CPO Gordon Hart, all of whom have an extensive background in AI and ML. The idea for the start-up evolved from a project in which the co-founders showcased a computer vision ML model that was designed to spot weapons in airport x-ray scans.

According to Elgendy, while the model initially had 99pc accuracy, it faltered during a live demo when it failed to detect a gun in an otherwise empty bin. After looking into the issue, the team realised that a previous version of the model with a lower accuracy of 97.2pc was better at detecting a gun inside an empty bin but needed tuning for other “less important scenarios”.

“In other words, the less ‘statistically accurate’ model was actually the better model for the customer,” explains Elgendy. “The team would have caught this in minutes if they were validating their models at a granular level versus the overall object level or ‘class’ detected (gun).

“The lesson the team realised was that traditional model testing frameworks don’t provide the level of granularity the industry desperately needs to validate the performance of AI products with confidence.”

It was this realisation that led Kolena to design its ‘model quality framework’, which adapts software concepts such as unit testing and regression testing to fit the ML development process, with the hopes of improving on traditional model-testing methods.
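To make the idea concrete, here is a minimal Python sketch of scenario-level testing. It is not Kolena's API or framework; the function names, the scenario labels and the toy numbers are all hypothetical, chosen only to mirror the empty-bin story above. It scores two model versions per scenario rather than only in aggregate, then flags any scenario where the newer version does worse, much as a regression test would.

```python
from collections import defaultdict

def make_records(scenario, n_correct, n_total):
    """Build toy (scenario, correct) records; illustrative data only."""
    return [(scenario, i < n_correct) for i in range(n_total)]

def overall_accuracy(records):
    """Single aggregate accuracy across every record."""
    return sum(int(correct) for _, correct in records) / len(records)

def accuracy_by_scenario(records):
    """Accuracy computed separately for each scenario label."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for scenario, correct in records:
        totals[scenario] += 1
        hits[scenario] += int(correct)
    return {s: hits[s] / totals[s] for s in totals}

def regressions(old_scores, new_scores, tolerance=0.0):
    """Scenarios where the new model scores below the old one."""
    return sorted(
        s for s in old_scores
        if s in new_scores and new_scores[s] < old_scores[s] - tolerance
    )

# Hypothetical results: the newer model wins overall but slips on a rare scenario.
old_model = make_records("cluttered_bin", 91, 95) + make_records("empty_bin", 5, 5)
new_model = make_records("cluttered_bin", 95, 95) + make_records("empty_bin", 3, 5)

old_scores = accuracy_by_scenario(old_model)
new_scores = accuracy_by_scenario(new_model)

print("overall:", overall_accuracy(old_model), "->", overall_accuracy(new_model))
print("regressed scenarios:", regressions(old_scores, new_scores))
```

In this toy data the newer model has the higher overall accuracy, yet the per-scenario check still flags the empty-bin case, which an aggregate metric alone would hide.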

Through its testing and validation solution, Kolena helps developers to improve their AI systems by scrutinising AI and ML models in precise scenarios. “This is testing on steroids: not just ticking boxes, but diving deep into the nuances of model accuracy,” says Elgendy.

The team also hopes to solve what Elgendy describes as the “most formidable barrier to widespread AI adoption” – trust.

“The team also realised that their approach could bridge the gap in ML tooling and solve one of AI’s biggest problems – the lack of trust in models’ effectiveness – through more rigorous testing.”

Value of validation

Today, Kolena has no shortage of success. According to Elgendy, the start-up supports a number of Fortune 500 companies and government organisations, as well as European AI standardisation institutes and AI start-ups in industries such as robotics, healthcare, autonomous vehicles and banking.

In terms of funding, the company raised a $15m Series A investment in September 2023, led by Lobby Capital with participation from SignalFire and Bloomberg Beta. Along with a previous seed funding round, this investment brought the total amount of funding for the start-up to $21m, and the company is not currently seeking additional funding.

While on this road to success, Elgendy says the team has had to stay mindful of AI regulations, which he describes as one of the big challenges of the industry today due to uncertainty and new testing requirements.

“Fortunately, our team at Kolena has been aware for some time that this challenge would eventually become a major issue in our industry,” he says.

“Since we first started Kolena, we’ve been attuned to guidance from agencies like NIST – which is working to execute president Biden’s executive order on AI nationally – and to the approaches being taken by EU regulators, which are much further along in developing their regulatory frameworks.”

But while Kolena welcomes the development of regulations, Elgendy says that more needs to be done.

“While NIST and some state legislators are working to define quality standards and regulations – like, for example, SB 1047 here in California – regulators and lawmakers are not in a position to keep pace with the speed of innovation that we’re seeing across the AI industry,” he says. “What we need are clear standards that are developed, applied and continually updated by leaders and companies within the industry itself, in order to make sure that we’re anticipating where model failures may occur and head them off before they can create problems.”

And while AI model failures such as Google Gemini telling people to put glue on pizza are “somewhat benign”, Elgendy says that if model failures were to occur in military applications or self-driving cars, the consequences could be serious and cause considerable damage.

“Those kinds of failures undermine our ability to trust AI and limit the benefits we can realise from the technology.

“Fortunately, we are seeing start-ups, regulators, researchers and large companies come together to establish standards and deal with these challenges.”

Colin Ryan is a copywriter/copyeditor at Silicon Republic

editorial@siliconrepublic.com