The term “inflection point” is overused, but it certainly applies to the current state of artificial intelligence. Technology providers—and the companies that depend on them—can choose one of two roads to AI development: proprietary or open source. This dichotomy has existed for decades, with both sides achieving great levels of success. However, I would argue that the stakes for AI are higher than we’ve ever seen, and that the open source model is critical for the productive, economically feasible, and safe productization and consumption of AI.
And, in terms of open source, the Kubernetes project should serve as the blueprint for the way in which we develop, govern, fund, and support AI projects, large language models (LLMs), training paradigms, and more.
Kubernetes is an open source success story—not for a single company, but for all of the companies, non-profit foundations, and independent individual contributors involved. Yes, it’s a container orchestration solution that has effectively met a market need. But, more importantly in this context, Kubernetes is one of the best functioning communities in the history of technology development.
Since Kubernetes joined the Cloud Native Computing Foundation (CNCF) in 2016, thousands of organizations and tens of thousands of individuals have contributed to the project, according to a CNCF report. These contributors include for-profit companies, non-profit foundations, universities, governments, and, importantly, independent contributors (those not affiliated with or paid by an organization).
Sharing the cost of innovation
In finance and product development, it’s common to think in terms of value creation and value capture. The Kubernetes project has created immense value in the marketplace. And, if you think about it, the Kubernetes project has also captured value for anyone involved with it. Contributors—be they individuals, companies, non-profits, or governments—gain not only a voice in what the project can do, but also the cachet of being connected with a widely used and highly regarded technology and community. Much like working at Goldman Sachs or Google, if you contribute to the Kubernetes project for three to four years, you can get a job anywhere.
For businesses, any cost invested in paying developers, quality engineers, documentation writers, program managers, etc., to work on Kubernetes has the potential for significant return, especially when compared with proprietary efforts to develop a similarly expensive code base. If I’m a proprietary business, I may invest $100 million in R&D to get a $200 million return from selling a product. If I’m an open source business, I may invest $20 million while other organizations invest the remaining $80 million, but I still get a $200 million return. There are a lot of $100 million to $300 million businesses built on open source, and it’s a lot better to have others help you fund the R&D of your code base!
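To make that arithmetic concrete, here is a minimal Python sketch. The dollar figures are the illustrative ones from the paragraph above; the function name and the "return multiple" framing (revenue divided by a company's own R&D spend) are assumptions made for this example, not a standard financial metric.

```python
def return_multiple(own_investment_musd: float, revenue_musd: float) -> float:
    """Revenue earned per dollar of a company's own R&D spend.

    Both arguments are in millions of US dollars. This is a hypothetical
    helper for illustration only.
    """
    if own_investment_musd <= 0:
        raise ValueError("own investment must be positive")
    return revenue_musd / own_investment_musd


# Proprietary: one company funds the entire $100M code base itself.
proprietary = return_multiple(own_investment_musd=100, revenue_musd=200)

# Open source: the same $100M code base, but the company funds only $20M
# and other contributors cover the remaining $80M.
open_source = return_multiple(own_investment_musd=20, revenue_musd=200)

print(f"proprietary: {proprietary:.0f}x, open source: {open_source:.0f}x")
```

Under these assumed numbers, the same $200 million return costs the open source business one-fifth of the R&D outlay, which is the whole point of sharing the cost of innovation.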
This model will be all the more important for AI because the costs associated with AI are astronomical. And the more popular AI gets, and the bigger LLMs become, the higher the costs will go. I’m talking costs across the board, from the people who develop and maintain AI models to the compute power required to run them. Having every organization spend billions of dollars on foundation models simply won’t scale.
In start-up circles, it’s common knowledge that venture capital doesn’t want to fund any more new businesses based on selling a foundation model. This is partly because there’s too much competition (for example, Meta and Mistral are giving away their foundation models for free) and partly because VCs anticipate that they will get better returns on investment by building solutions on top of these foundation models.
Financial cost is but one metric; cognitive load is another. The number of companies and individuals involved in the Kubernetes project doesn’t just have financial benefits; it also ensures that code conforms to expectations and meets quality benchmarks. Many hands make light work, but they also multiply ideas, expertise, and scrutiny. AI projects without such critical developer mass are unsustainable and won’t have the same quality or velocity. This could lead to consolidation in the AI space, as happened in container orchestration before it (Apache Mesos and Docker Swarm couldn’t compete with Kubernetes). Critical mass is particularly important with AI because the stakes are potentially much higher. The fewer the participants (and the less the participants are aligned with open source principles), the greater the chance for bias and unchecked errors, the repercussions of which we can’t even imagine right now.
On the bright side, if everybody’s contributing to an open source model, we could be talking about trillions of parameters. Under open source principles, models of different sizes (7B, 70B, 1T parameters) could each be matched to the tasks they suit, and they would be trained transparently too. You’d be getting the best and brightest ideas, and review, from all of these different people to train them.
A killer value proposition
That amounts to a pretty killer value proposition for open source AI: It’s cheaper, it includes great ideas from many people, and anybody can use it for anything they want. The upstream InstructLab project—which enables pretty much anyone to improve LLMs in less time and at a lower cost than is currently possible—is attempting to achieve exactly what I’ve described.
Also, don’t discount the AI supply chain piece of this. It’s all about risk reduction: Do you want to put model development in the hands of one vendor that does all of it in secret? Or do you want to put it out in the open source community and trust a bunch of companies, non-profits, governments, and individual contributors, working together to show and check their work, to do that? I know which one makes me less nervous.
Kubernetes is not the only open source project that can serve as a powerful example for AI (Linux, anyone?), but the relatively short timeline of Kubernetes (so far) provides a clear picture of the factors that have led to the project’s success and how that has played out for the product companies, service companies, non-profits, governments, and other organizations making use of it.
An open source environment that includes many contributors, all coalesced around enabling people to use and fine-tune projects in a sane and secure way, is the only path to a realistic future for trusted AI. Instead of relying on global institutions or economic interdependence, open source AI provides a solution that should satisfy any hard-nosed, skeptical, offensive realists who believe that most private companies don’t do what’s best; they do what they can get away with. 🙂
At Red Hat, Scott McCarty is senior principal product manager for RHEL Server, arguably the largest open source software business in the world. Scott is a social media startup veteran, an e-commerce old timer, and a weathered government research technologist, with experience across a variety of companies and organizations, from seven-person startups to 12,000-employee technology companies. This has culminated in a unique perspective on open source software development, delivery, and maintenance.
—
New Tech Forum provides a venue to explore and discuss emerging enterprise technology in unprecedented depth and breadth. The selection is subjective, based on our pick of the technologies we believe to be important and of greatest interest to InfoWorld readers. InfoWorld does not accept marketing collateral for publication and reserves the right to edit all contributed content. Send all inquiries to newtechforum@infoworld.com.
Copyright © 2024 IDG Communications, Inc.