Project Bluefin and the future of operating systems

Even with all of the advances in IT, whether it’s modular hardware, massive cloud computing resources, or small-form-factor edge devices, IT still has a scale problem. Not physically—it’s easy to add more boxes, more storage, and more “stuff” in that respect. The challenge with scale is getting your operations to work as intended at that level, and it starts with making sure you can build, deploy, and maintain applications effectively and efficiently as you grow. This means that the basic building block of devops, the operating system, needs to scale—quickly, smoothly, and seamlessly.

I’ll say this up front: This is hard. Very hard.

But we could be entering into an(other) age of enlightenment for the operating system. I’ve seen what the future of operating systems at scale could be, and it starts with Project Bluefin. But how does a new and relatively obscure desktop Linux project foretell the next enterprise computing model? Three words: containerized operating system.

In a nutshell, this model is a container image with a full Linux distro in it, including the kernel. You pull a base image, build on it, push your work to a registry server, pull it down on a different machine, lay it down on disk, and boot it up on bare metal or a virtual machine. This makes it easy for users to build, share, test, and deploy operating systems—just like they do today with applications inside containers.
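As a rough sketch of that cycle, here is roughly what it could look like with Podman and bootc-style tooling. The registry and image names are placeholders, and quay.io/fedora/fedora-bootc is just one publicly available bootable base image; your tooling of choice may differ.

    # Containerfile: start from a bootable base image and layer your changes on top
    FROM quay.io/fedora/fedora-bootc:40
    RUN dnf install -y tmux && dnf clean all

Then build and push the OS image exactly as you would an application image, and on a target machine that was provisioned from a bootc-based image, switch it to the new image and reboot:

    podman build -t registry.example.com/os/my-os:1.0 .
    podman push registry.example.com/os/my-os:1.0

    # On the target machine (bare metal or a VM)
    sudo bootc switch registry.example.com/os/my-os:1.0
    sudo reboot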

What is Project Bluefin?

Linux containers changed the game when it came to cloud-native development and deployment of hybrid cloud applications, and now the technology is poised to do the same with enterprise operating systems. To be clear, Project Bluefin is not an enterprise product—rather, it’s a desktop platform geared in large part toward gamers—but I believe it’s a harbinger of bigger things to come.

“Bluefin is Fedora,” said Bluefin’s founder, Jorge Castro, during a video talk at last year’s ContainerDays Conference. “It’s a Linux for your computer with special tweaks that we’ve atomically layered on top in a unique way that we feel solves a lot of the problems that have been plaguing Linux desktops.”

Indeed, with any Linux environment, users do things to make it their own. This could be for a number of reasons, including the desire to add or change packages, or even because of certain business rules. Fedora, for example, has rules about integrating only upstream open source content. If you wanted to add, say, Nvidia drivers, you’d have to glue them into Fedora yourself and then deploy it. Project Bluefin adds this kind of special sauce ahead of time to make the OS—in this case, Fedora—easier to deploy.

The “default” version of Bluefin is a GNOME desktop with a dock on the bottom, app indicators on top, and the Flathub store enabled out of the box. “You don’t have to do any configuration or anything,” Castro said. “You don’t really have to care about where they come from. … We take care of the codecs for you, we do a bunch of hardware enablement, your game controller’s going to work. There’s going to be things that might not work in default Fedora that we try to fix, and we also try to bring in as many things as we can, including Nvidia drivers. There’s no reason anymore for your operating system to compile a module every time you do an upgrade. We do it all in CI, and it’s great. We fully automate the maintenance of the desktop because we’re shooting for a Chromebook. … It comes with a container runtime, like all good cloud-native desktops should.”
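Because Bluefin is delivered as a bootable container image, moving an existing Fedora Silverblue machine onto it amounts to rebasing to that image. A minimal sketch, assuming the ghcr.io/ublue-os/bluefin image reference; check the project’s documentation for the current, signed image name:

    # Rebase an existing rpm-ostree (Fedora Silverblue) system to the Bluefin image
    rpm-ostree rebase ostree-unverified-registry:ghcr.io/ublue-os/bluefin:latest
    systemctl reboot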

How Bluefin portends enterprise potential

The way Castro describes how and why Project Bluefin was built sounds strikingly similar to the reasons why developers, architects, sysadmins, and anyone else who consumes enterprise operating systems create core builds. And therein lies the enterprise potential, although most people don’t yet see that the problem Bluefin solves is the same business problem we have in the enterprise.

It all starts with the “special tweaks” Castro mentioned. 

Take, for example, a big bank. It takes what the operating system vendor gives it and layers on special tweaks to make the OS fit for purpose in its environment, based on its business rules. These tweaks pile up and can become quite complicated. The bank might add security hardening, libraries and codecs for compression, encryption algorithms, security keys, LDAP configurations, specially licensed software, or drivers. There can be hundreds of customizations in a large organization with complex requirements. In fact, whenever a complex piece of software transfers custody between two organizations, it almost always requires special tweaks. This is the nature of large enterprise computing.
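In a containerized OS model, that pile of tweaks becomes an ordinary image definition that lives next to the application code. A minimal sketch of such a core build, with hypothetical file names, package names, and repository paths standing in for a real organization’s requirements:

    # Containerfile for a hypothetical corporate core build
    FROM quay.io/fedora/fedora-bootc:40

    # Security hardening and compliance tooling
    RUN dnf install -y openscap-scanner aide && dnf clean all

    # Directory services: the organization's LDAP/SSSD configuration
    COPY sssd.conf /etc/sssd/sssd.conf

    # Corporate CA certificates and keys
    COPY corp-ca.pem /etc/pki/ca-trust/source/anchors/corp-ca.pem
    RUN update-ca-trust

    # Internal repository and specially licensed, in-house software (illustrative)
    COPY corp.repo /etc/yum.repos.d/corp.repo
    RUN dnf install -y corp-agent && dnf clean all

Every customization is an explicit, reviewable line in this file rather than a step buried in a build document.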

It gets even more complicated within an organization. Distinct internal experts such as security engineers, network admins, sysadmins, architects, database admins, and developers collaborate (or try to, anyway) to build a single stack of software fit for purpose within that specific organization’s rules and guidelines. This is particularly true for the OS at the edge or with AI, where developers play a stronger role in configuring the underlying OS. To get a single workload right, it could require 50 to 100 interactions among all of these experts. Each of these interactions takes time, increases costs, and widens the margin for error.

It gets even harder when you start adding in partners and external consultants. 

Today, all of those experts speak different languages. Configuration management and provisioning tools like Kickstart help, but they’re not elegant when it comes to complex and sometimes hostile collaboration between and within organizations. But what if you could use containers as the native language for developing and deploying operating systems? You would solve the same problems, especially the people problems, that application containers solved, only now at the level of the OS.

AI and ML are ripe for containerized OSes

Artificial intelligence and machine learning are particularly interesting use cases for a containerized operating system because they are hybrid by nature. A base model often is trained, fine-tuned, and tested by quality engineers and within a chatbot application—all in different places. Then, perhaps, it goes back for more fine-tuning and is finally deployed in production in a different environment. All of this screams for the use of containers but also requires hardware acceleration, even in development, for quicker inference and less annoyance. The faster an application runs, and the shorter the inner development loop, the happier developers and quality engineering people will be.

For example, think about an AI workload that is deployed locally on a developer’s laptop, maybe as a VM. The workload includes a pre-trained model and a chatbot. Wouldn’t it be nice if it ran with hardware acceleration for faster inference, so that the chatbot responds more quickly?

Now, say developers are poking around with the chatbot and discover a problem. They create a new labeled user interaction (question and answer document) to fix the problem and want to send it to a cluster with Nvidia cards for more fine-tuning. Once it’s been trained further, the developers want to deploy the model at the edge on a smaller device that does some inferencing. Each of these environments has different hardware and different drivers, but developers just want the convenience of working with the same artifacts—a container image, if possible.

The idea is that you get to deploy the workload everywhere, in the same way, with only slight tweaking. You take this operating system image and share it on a Windows or Linux laptop. You move it into a dev-test environment, train it some more in a CI/CD pipeline, maybe even move it to a training cluster that does further refinement on specialized hardware. Then you deploy it into production in a data center, in a virtual data center in a cloud, or at the edge.
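One way to express that with today’s container tooling is a single image definition with a build argument that selects the acceleration stack per target, so the laptop, the training cluster, and the edge device are all built from the same source. Everything here is illustrative: the base image, the driver package name, the service unit, and the model layout are placeholders.

    # Containerfile: one definition, several hardware targets
    FROM quay.io/fedora/fedora-bootc:40
    ARG GPU_STACK=none

    # Install a GPU stack only for targets that have the hardware
    # (real driver packages depend on which repositories you enable)
    RUN if [ "$GPU_STACK" = "nvidia" ]; then dnf install -y nvidia-driver && dnf clean all; fi

    # Bake in the model and the chatbot service so every target ships the same artifact
    COPY model/ /usr/share/models/chatbot/
    COPY chatbot.service /usr/lib/systemd/system/chatbot.service
    RUN systemctl enable chatbot.service

Then build the variants from the same file:

    podman build --build-arg GPU_STACK=nvidia -t registry.example.com/ai/chatbot-os:gpu .
    podman build --build-arg GPU_STACK=none -t registry.example.com/ai/chatbot-os:edge .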

The promise and the current reality

What I’ve just described is currently challenging to accomplish. In a big organization, it can take six months to do core builds. Then comes a quarterly update, which takes another three months to prepare for. The complexity of the work involved increases the time it takes to get a new product to market, never mind “just” updating something. In fact, updates may be the biggest value proposition of a containerized OS model: You could update with a single command once the core build is complete. Updates wouldn’t be running yum anymore—they’d just roll from point A to point B. And, if the update failed, you’d just roll back. This model is especially compelling at the edge, where bandwidth and reliability are concerns.
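With bootable container tooling such as bootc, the day-two story really is close to a single command per host, assuming the host was originally deployed from a container image:

    # Pull and stage the new version of the image this host tracks, then reboot into it
    sudo bootc upgrade
    sudo reboot

    # If the new image misbehaves, queue a return to the previous deployment
    sudo bootc rollback
    sudo reboot

    # See which image is currently booted and what is staged or available for rollback
    sudo bootc status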

A containerized OS model would also open new doors for apps that organizations decided not to containerize, for whatever reason. You could just shove the applications into an OS image and deploy the image on bare metal or in a virtual machine. In this scenario, the applications gain some, albeit not all, of the advantages of containers. You get the benefits of better collaboration between subject matter experts, a standardized highway to deliver cargo (OCI container images and registries), and simplified updates and rollbacks in production.
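In practice, that can be as simple as layering the application onto the organization’s core build image. The image reference, repository file, package, service, and paths below are hypothetical:

    # Containerfile: bake a legacy, non-containerized application into the OS image
    FROM registry.example.com/os/corp-base:1.0

    # Install the application and its configuration directly into the image
    COPY legacy-billing.repo /etc/yum.repos.d/legacy-billing.repo
    RUN dnf install -y legacy-billing-app && dnf clean all
    COPY billing.conf /etc/legacy-billing/billing.conf
    RUN systemctl enable legacy-billing.service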

A containerized OS would also theoretically provide governance and provenance benefits. Just as with containerized apps, everything in a containerized OS would be committed in Git. You’d be able to build an image from scratch and know exactly what’s in it, then deploy the OS exactly from that image. Furthermore, you could use the same testing, linting, and scanning infrastructure, including automation in CI/CD.
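A rough sketch of that provenance loop using ordinary container tooling; the repository URL and image names are placeholders:

    # The OS definition lives in version control, like any application
    git clone https://git.example.com/os/corp-base.git && cd corp-base

    # Build the image from that exact commit and record the digest that was pushed
    podman build -t registry.example.com/os/corp-base:1.1 .
    podman push --digestfile corp-base.digest registry.example.com/os/corp-base:1.1

    # Ask the image itself what it contains before anything boots from it
    podman run --rm registry.example.com/os/corp-base:1.1 rpm -qa | sort > manifest.txt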

Of course, there would be some obstacles to overcome. If you’re deploying the operating system as a container image, for example, you have to think about secrets in a different way. You can’t just have passwords embedded in the OS anymore. You have the same problem with containerized apps. Kubernetes addresses it today with its built-in Secrets mechanism, but there would definitely need to be some work done around secrets for an operating system deployed as an image.

There are many questions to answer and scenarios to think through before a containerized OS becomes an enterprise reality. But Project Bluefin hints at a containerized OS future that makes too much sense not to come to fruition. It will be interesting to see if and how the industry embraces this new paradigm.

At Red Hat, Scott McCarty is senior principal product manager for RHEL Server, arguably the largest open source software business in the world. Scott is a social media startup veteran, an e-commerce old timer, and a weathered government research technologist, with experience across a variety of companies and organizations, from seven-person startups to 12,000-employee technology companies. This has culminated in a unique perspective on open source software development, delivery, and maintenance.

New Tech Forum provides a venue to explore and discuss emerging enterprise technology in unprecedented depth and breadth. The selection is subjective, based on our pick of the technologies we believe to be important and of greatest interest to InfoWorld readers. InfoWorld does not accept marketing collateral for publication and reserves the right to edit all contributed content. Send all inquiries to newtechforum@infoworld.com.

