It’s Friday morning, and you are excited. Today your new generative AI system running on a large public cloud provider is going to begin to work with the existing e-commerce systems that drive 80% of the company’s revenue.
The company is expecting the new ability to generate more sales while providing a better understanding of the customers using the site. The system will also be able to craft custom bundled deals on the fly. Marketing estimates this will increase the average single sale by 30%. Game changer.
There has been such a push to implement this that the cloud and website development teams skipped most of the stress testing, relying instead on the promise that these cloud-based systems “should be able to scale.”
The e-commerce systems communicate with the generative AI system using several APIs. These allow the applications on the e-commerce site to prompt the generative AI system, including sending data. The generative AI system then returns the desired answers.
But all is not well. As the number of users on the e-commerce system rises above 5,000, increasing the load on the APIs that connect it to the generative AI system, performance drops off a cliff. The number of users abandoning the site rises significantly, so much that the e-commerce team rolls back to the last version of the site and removes the connection with the new generative AI system.
I see this type of scenario quite often. The systems are designed well, but the APIs are undervalued, bringing performance, scaling, and latency problems. Many hide these issues by tossing resources at them, such as more server instances, which is easy to do in a public cloud. But resources are not free, and eventually, those APIs need to be fixed.
The basics of API design
At the core of every failure where an API doesn’t work as planned is a design that ignored one or more of the basics of a good API. Let’s go back to those basics. We’ve known this stuff for decades, but it hasn’t been the priority. Often, when I tell a client what I’m about to tell you, it comes across as new information. That’s scary when you’re talking to the API development team.
What are the basics of good API design? Let’s review the big ones:
Scalability is a huge one, meaning that we need to design the APIs to handle increasing requests without degraded performance. Here are a few tricks: Implement caching strategies and load balancers, and ensure that the underlying architecture can dynamically allocate resources as demand increases.
Modularity means building APIs as a set of modular services. Separation allows individual components to be developed, deployed, and scaled independently. This reduces complexity, improves maintainability, and increases the chances that the code can be reused.
Statelessness is in line with RESTful principles. APIs should not retain client state between requests; each request should carry everything the server needs to process it. This is sound design. Being stateless enhances scalability and reliability because any server within the cluster can handle any request independently.
Efficient data handling means optimizing the size of the data packets being transmitted back to the requester. API responses should exclude unnecessary data. This minimizes latency and bandwidth usage.
Monitoring and testing
Most of those who build and deploy APIs can’t tell me the point at which the API will be saturated, meaning it stops responding in a timely manner. They have little or no idea how the API behaves at different levels of stress. The only way to find this out is through monitoring, testing, and using metrics to see if the API is indeed optimized, based on the principles I covered above.
I recommend the following performance metrics:
- Constantly monitor API latency, the time it takes for a request to travel from the client to the server and for the response to return to the client.
- Measure throughput, the number of successful messages delivered over a given period. This allows you to understand the API’s capacity.
- Watch API error rates to proactively mitigate any reliability issues. Too many errors means something is wrong and needs to be located and corrected.
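As a rough illustration of how these three metrics fit together, the Python sketch below summarizes a batch of recorded requests and then looks for the load level at which an API saturates. The `RequestSample` type, the function names, and the thresholds are assumptions made for this example; a real deployment would normally pull these numbers from its monitoring stack rather than compute them by hand.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Optional, Sequence


@dataclass
class RequestSample:
    latency_ms: float   # round-trip time measured at the client
    ok: bool            # True if the call succeeded


def percentile(values: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile; pct is in the range (0, 100]."""
    if not values:
        raise ValueError("no values to summarize")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(samples: List[RequestSample],
              window_seconds: float) -> Dict[str, float]:
    """Latency, throughput, and error rate for one measurement window."""
    if not samples or window_seconds <= 0:
        raise ValueError("need samples and a positive window")
    latencies = [s.latency_ms for s in samples]
    failures = sum(1 for s in samples if not s.ok)
    successes = len(samples) - failures
    return {
        "p50_ms": percentile(latencies, 50),
        "p95_ms": percentile(latencies, 95),
        "throughput_rps": successes / window_seconds,  # successful calls/sec
        "error_rate": failures / len(samples),
    }


def saturation_point(results_by_load: Dict[int, Dict[str, float]],
                     max_p95_ms: float,
                     max_error_rate: float) -> Optional[int]:
    """Lowest tested load whose summary breaches either limit, else None."""
    for load in sorted(results_by_load):
        summary = results_by_load[load]
        if (summary["p95_ms"] > max_p95_ms
                or summary["error_rate"] > max_error_rate):
            return load
    return None
```

Running `summarize` once per load level during a stress test and passing the results to `saturation_point` gives a first estimate of where the API stops responding in a timely manner. The limits you pass in are a business decision, not something the code can choose for you.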
Judging from the many cloud systems I come across, it’s apparent that we’ve lost our way regarding API design, development, and operation. I suspect API design is either not taught in many cloud development training courses or is covered only in passing.
As we build cloud-based systems that are better optimized, meaning they use fewer resources but provide above-and-beyond performance, we need to get better at all components of an application or set of applications. The API is key these days and should be prioritized more during design. Too much to ask?
Copyright © 2024 IDG Communications, Inc.