As cloud spending continues to surge, managing and forecasting costs has become a top priority for businesses operating in the cloud. One key area of focus is compute costs, which can represent anywhere from 50% to 80% of a company’s overall cloud bill. With worldwide cloud spending expected to grow 18.5% this year to $576.5 billion, businesses need a plan in place to forecast those costs accurately.
For a young, digital-native business, tackling cloud spend can be extremely intimidating, and putting it off can result in poorly optimized systems and overburdened teams, both of which drive up expenses. From unexpected surges and spikes in usage to exhaustive manual work for engineers, thoughtlessly designed cloud architecture can be a significant impediment to success.
Often, these poorly optimized structures are at their most damaging as companies grow and scale up. Growth periods are a perfect storm: usage and complexity increase at a time when employees’ attention is pulled in several directions, making mistakes more difficult to identify quickly. A cost overrun can compound quickly while engineers have less time than usual to dedicate to system audits.
For that reason, building a thorough ramp plan for increased cloud spending is essential during growth. There are a number of areas to consider when growing cloud spend, and attending to all of them can help teams scale as painlessly as possible.
Planning Ahead
Before embarking on a new cloud venture, the team must be aligned on what its needs are, as many factors shape strategy.
Lead engineers should first consider their user base. Of course, that includes simple questions like how much user volume they expect and what the average cost per user will be, but it goes deeper than that. Your scale plan needs to be four-dimensional – literally. Instead of just looking at totals, consider where your spend will be needed geographically, and also when the spend will spike.
While the engineers figure out the cost details, the C-suite should think about bigger-picture questions. For CTOs, that means figuring out what tools and platforms the project will run on. The solutions you’re already using may fit your current needs and budget, but that can easily change as you scale. Different providers have different strengths and weaknesses, operate in different regions, and have ever-evolving savings opportunities that may align with spending goals.
CEOs and CHROs need to think about staffing. Engineers and architects are not perfectly interchangeable, and skillsets and software need to align. If a team of engineers trained on Google Cloud is assigned to develop an AWS infrastructure, the learning curve will come at a cost. And while the global workforce is closer than ever, your architects should have some familiarity with the regions they’re assisting with for both compliance and customer support purposes.
Lastly, your CFO and Chief Product Officer need to work with engineers to develop a spending strategy. There is a range of options available from the major hyperscalers, including long-term commitments, on-demand spend, spot instances and more. There’s no single “right” approach, but it is essential that the entire team understands why the strategy they select is right for them. Third-party providers can offer invaluable assistance and guidance in building a plan, but ultimately their counsel only helps if there’s cooperation and participation from the project leaders.
Understand Needs
AWS, Google and Microsoft offer varying forms of compute commitment discounts that all map to a similar principle: Commit to a spend for a certain amount of time and save money on those workloads. While that sounds fairly straightforward, the customizable nature of these plans makes them anything but. There are several variables that factor into a compute commitment, and the more specific the promise, the greater the savings:
- Time: The most straightforward of the factors. Commitments are typically budgeted in one- or three-year buckets, and the longer the commitment, the greater the savings. However, committing for too long can leave you overspending if traffic goes down, and it limits flexibility to make strategic changes.
- Regions: Depending on where your user base is, this can have serious implications for your strategy, as changing the region in which your workloads operate may not be possible depending on the type of commitment you signed. Expansion into new geographies will likely require different commitments specific to that region.
- Machine Types: The right machine type varies greatly depending on your use case, with the different families and sizes providing various combinations of CPU, memory, storage, and networking capacity. As with regional restrictions, your machine type may not be adjustable under the commitment that you signed, so a new application or use case may call for a new machine family, and thus another commitment.
- Internal Scope: It is important to figure out what machine types you need to properly operate your software, but that is only the start. You also need to figure out what your full capacity needs to be, how much usage could fluctuate, which DevOps teams will be utilizing the resources, and what strategy you’d like to employ to balance commitments and on-demand spend. For example, a company may choose to cover 50% of its workloads with a 3-year commitment, 25% with 1-year plans, and run the remaining 25% as on-demand. This strategy offers wiggle room for accidental overprovisioning while still delivering some savings.
- Services Needed: What services are you relying on, and which providers will you get them from? Some more complex structures require containerization to keep things organized and functional, while others may not. Figuring out what these are and how they fit into your commitment strategy is a key step in the process.
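To make the 50/25/25 example under Internal Scope concrete, here is a minimal Python sketch of how such a mix might be costed. The discount rates (50% off for the 3-year commitment, 30% off for the 1-year plan) and the $100,000 baseline are hypothetical placeholders, not published provider pricing, and the `Commitment` class and `monthly_bill` function are illustrative names invented for this example. The sketch also assumes committed capacity is billed in full whether or not it is used, which is the general shape of these plans but varies by provider and product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commitment:
    name: str
    share: float     # fraction of the baseline workload this commitment covers
    discount: float  # fraction off the on-demand rate (hypothetical value)


def monthly_bill(commitments, baseline_cost, usage_ratio=1.0):
    """Estimate one month's bill for a mix of commitments plus on-demand.

    baseline_cost: what the full baseline workload would cost on demand.
    usage_ratio:   actual usage divided by the baseline (1.0 = as planned).
    Assumption: commitments are paid in full even when usage falls short,
    and any usage above the committed share is billed at on-demand rates.
    """
    committed_share = sum(c.share for c in commitments)
    committed_cost = (
        sum(c.share * (1 - c.discount) for c in commitments) * baseline_cost
    )
    on_demand_share = max(0.0, usage_ratio - committed_share)
    return committed_cost + on_demand_share * baseline_cost


# The mix from the example: 50% on a 3-year commitment, 25% on 1-year plans,
# and the remaining 25% left on demand. Discounts are assumed for illustration.
MIX = [
    Commitment("3-year", share=0.50, discount=0.50),
    Commitment("1-year", share=0.25, discount=0.30),
]
BASELINE = 100_000.0  # hypothetical monthly on-demand cost of the baseline

if __name__ == "__main__":
    for usage in (0.4, 1.0, 1.2):
        blended = monthly_bill(MIX, BASELINE, usage)
        on_demand_only = monthly_bill([], BASELINE, usage)
        print(
            f"usage {usage:.0%}: blended ${blended:,.0f} "
            f"vs on-demand only ${on_demand_only:,.0f}"
        )
```

Under these assumed rates, the mix costs $67,500 at planned usage versus $100,000 fully on demand. If usage falls to 40% of plan, the committed portion alone still costs $42,500, more than the $40,000 an all on-demand setup would pay. That is the overcommitment risk the Time factor describes, and real discount rates will shift where that break-even point falls.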
Resources at Your Fingertips
Especially for a young company, answering all these questions can be daunting, and gathering the technical expertise to tackle them isn’t cheap. Hyperscalers like AWS and Google understand that and offer some resources to help teams understand the basics. Both offer documentation and training tools designed to help get initial migrations off the ground.
That said, these are self-help resources rather than formal training. For teams looking for a more hands-on approach, partner companies like MSPs and resellers can offer a closer relationship, ranging from consulting and advice all the way to full cloud management. Ideal partners will understand your business as well as you do and combine that with their expertise to fast-track a sustainable growth plan.
Whether you choose to work with a partner or fly solo, carefully structured cloud growth can be a major driver of business success, but a messy scale-up process can lead to unexpected costs, outages and more. Planning ahead, balancing risk and costs, and auditing regularly are crucial steps in the trajectory of a growing cloud investment, just as they are for a growing business.
By Craig Lowell