CentML gets $27 million from Nvidia, others to run AI models more efficiently | TechCrunch

Image credit: Acus Stiller/Bloomberg/Getty Images

Contrary to what you may have heard, the era of big seed rounds is not over, at least not in the AI sector.

CentML, a startup developing tools to reduce the cost and improve the performance of deploying machine learning models, announced this morning that it has raised $27 million from Gradient Ventures, TR Ventures, Nvidia and Misha Bilenko, vice president of artificial intelligence at Microsoft Azure.

CentML originally closed its seed round in 2022, but over the past few months, as interest in its product grew, it expanded the round, bringing its total raised to $30.5 million.

According to CentML co-founder and CEO Gennady Pekhimenko, the new capital will be used to strengthen CentML’s product development and research efforts, as well as to expand the startup’s engineering team and its broader workforce of 30 people across the U.S. and Canada.

Pekhimenko, an associate professor at the University of Toronto, co-founded CentML last year with Akbar Nurlybayev and Ph.D. students Shang Wang and Anand Jayarajan. Pekhimenko says they share a vision of building technology that broadens access to compute in the face of a worsening AI chip shortage.

“The costs of machine learning, the shortage of talent and the shortage of chips: every AI and machine learning company faces at least one of these challenges, and most face several at the same time,” Pekhimenko told TechCrunch in an email interview. Because of high demand from companies and startups, high-end chips are often simply unavailable. That forces companies either to shrink the models they deploy or to accept higher inference latency for the models they do deploy.

Most companies training models, especially generative AI models along the lines of ChatGPT and Stable Diffusion, rely heavily on GPU-based hardware. GPUs’ ability to perform many calculations in parallel makes them well suited to training today’s most capable AI.

But there aren’t enough chips to go around.

In its summer earnings report, Microsoft warned that it faces a shortage of the server hardware needed to run its AI services, which could lead to disruptions. And Nvidia’s best-performing AI cards are reportedly sold out until 2024.

This has led some companies, including OpenAI, Google, AWS, Meta and Microsoft, to build, or explore building, their own custom chips for model training. But even that has not proven to be a panacea. Meta’s efforts have been plagued by problems, leading the company to scrap some of its experimental hardware. And Wired recently reported that Google has been unable to keep up with demand for its in-house AI chip, the Tensor Processing Unit (TPU).

With spending on AI-focused chips expected to reach $53 billion this year and to more than double over the next four years, according to Gartner, Pekhimenko felt the time was right to launch software that could make models run more efficiently on existing hardware.

Training AI and machine learning models is increasingly expensive, Pekhimenko said, and he claims that CentML’s optimization technology can cut those costs by up to 80% without compromising speed or accuracy.

That is quite a claim. But at a high level, CentML’s software is relatively easy to understand.

The platform attempts to identify bottlenecks during model training and to predict the total time and cost of deploying a model. Beyond this, CentML provides access to a compiler, which translates a model’s source code into machine code that hardware such as GPUs can execute, to automatically optimize training workloads for the best performance on the target hardware.
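CentML has not published the details of its compiler’s interface, so the snippet below is only a rough, generic illustration of the idea using PyTorch’s built-in torch.compile: the compiler traces a model and generates kernels tuned to whatever hardware it runs on, rather than a description of CentML’s own tooling.

```python
# Generic illustration only; this is NOT CentML's API, just a stand-in
# showing what "compile a model for the target hardware" looks like.
import torch
import torch.nn as nn

# Toy model standing in for a real training workload.
model = nn.Sequential(nn.Linear(512, 1024), nn.ReLU(), nn.Linear(1024, 10))

# torch.compile traces the model and emits kernels optimized for the
# hardware it runs on (a GPU if one is available, otherwise the CPU).
compiled_model = torch.compile(model)

x = torch.randn(32, 512)
out = compiled_model(x)  # first call triggers compilation; later calls reuse it
print(out.shape)         # torch.Size([32, 10])
```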

Pekhimenko claims that CentML’s software does not degrade models and requires little to no effort from engineers to use.

“For one of our customers, we optimized their Llama 2 model to run three times faster using Nvidia A10 graphics cards,” he added.

CentML is not the first to take a software-based approach to model optimization. It has competitors in MosaicML, which Databricks acquired for $1.3 billion in June, and OctoML, which raised $85 million in November 2021 for its machine learning acceleration platform.

But Pekhimenko claims that CentML’s techniques do not lead to a loss of model accuracy, as MosaicML’s sometimes can, and that CentML’s compiler is a newer, more efficient generation than OctoML’s.

In the near future, CentML plans to turn its attention beyond optimizing model training to optimizing inference as well, i.e. running models after they have been trained. GPUs are still heavily used for inference today, and Pekhimenko sees that as a promising avenue for the company’s growth.

Pekhimenko said the CentML platform can run any model. CentML generates optimized code for a variety of GPUs and reduces the memory required to deploy models, allowing teams to deploy them on smaller, cheaper GPUs.
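CentML has not detailed how it shrinks deployment memory. As a generic sketch of why a smaller weight footprint lets a model fit on a smaller GPU (and explicitly not the company’s method), casting a model’s weights to half precision roughly halves the memory they occupy:

```python
# Generic illustration only; not CentML's technique, just the arithmetic
# behind "less memory means a smaller, cheaper GPU can hold the model".
import torch.nn as nn

# Toy model; a real deployment would load pretrained weights instead.
model = nn.Sequential(nn.Linear(4096, 4096), nn.ReLU(), nn.Linear(4096, 4096))

def param_bytes(m: nn.Module) -> int:
    # Total bytes occupied by the model's weights.
    return sum(p.numel() * p.element_size() for p in m.parameters())

print(f"fp32 weights: {param_bytes(model) / 1e6:.1f} MB")
model = model.half()  # cast weights to 16-bit floats
print(f"fp16 weights: {param_bytes(model) / 1e6:.1f} MB")
```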


