OpenAI/Azure’s ChatGPT in Production: What to Expect

TL;DR

Building apps with OpenAI/Azure’s ChatGPT SaaS is simple, as long as you know its limitations and can deal with issues as they come up. Moving to production is where things get more complex: you can run into stability, scalability, and service-degradation problems that hurt performance. In this post, I’ll walk you through the common problems we’ve faced on projects that use popular LLM APIs and highlight what to watch for as your team gets started.

Getting Started with AI: Key Considerations for Using OpenAI/Azure’s ChatGPT in Production

AI is everywhere these days, and using solutions like OpenAI/Azure’s ChatGPT in production is becoming increasingly common. With transformer-based Large Language Models (LLMs) taking center stage, the adoption of Machine Learning is spreading rapidly. It’s no longer a question of whether you should use this technology—because of its business impact and the push from the market (or maybe your CEO), it’s a must-have. My experience developing ShareTheBoard has significantly expanded my ML knowledge, enabling me to offer practical, valuable advice.

The good news is, to introduce some of this ML magic to your team, you no longer need a group of ML experts, huge datasets, or countless hours spent on model architecture and tuning. Now, all it takes is an OpenAI account, a linked credit card, and you’re ready to start building AI solutions—just like when you first adopted cloud services.

But, as always, there are a few important things to consider before you dive in!

Going Full Speed with ChatGPT on OpenAI/Azure

Here’s the exciting part: Even without a ton of Machine Learning experience, OpenAI or Azure lets you build a working Proof of Concept (PoC) in a matter of days, with a production-ready version possible within weeks. And we’re not just talking about startup projects; you can even update legacy software that’s been around for a decade or more.

You no longer have to worry about:

The usual ML tasks like picking the right neural network, gathering data, and training models.
Deployment hassles like figuring out the best GPU for your LLM or scaling up.

This is a game-changer! We’ve spent countless hours building neural network-based applications, and the traditional methods took serious time and effort—data collection, model training, and deployment decisions all demanded dedicated resources. But now, with the OpenAI API, most of these challenges are behind us.

Our story

We recently launched a new service for a client, using the then brand-new vision-to-text LLM from OpenAI (a capability now included in GPT-4o). We had previously used traditional algorithms and open-source tools, but OpenAI’s LLM was something else: just exceptional. We got the beta version of our new feature out in no time. It felt like a dream, until we encountered the first issues, that is!

Anticipating Disruptions with OpenAI/Azure’s ChatGPT in Production

It's essential to recognize that OpenAI/Azure's ChatGPT production issues do happen, and anticipating them can spare you a major headache.

Typically, when you choose a SaaS model, you expect to feel secure, with the peace of mind that maintenance is off your plate. But that might not be the case with OpenAI or Azure (if you’re using Microsoft’s services).

You might encounter service degradation more often than expected. The unpredictable nature of these disruptions can be particularly troubling.

Here are some of the most frustrating issues we’ve faced:

OpenAI API Service Unavailable
OpenAI ChatGPT Model Degradation
OpenAI & Azure Token Limits
API Request Delays

Although the list is short, the impact on your application’s stability can be significant, potentially leading to failures and lengthy meetings with stakeholders, especially as your user base grows.

OpenAI API Service Unavailable

At first, when we encountered failures, we were convinced the issues were on our end. However, a quick investigation revealed that the problems were actually coming from the API we were using.

One issue led to another—our requests were failing without any clear explanation, and eventually, even the entire OpenAI management portal went down. Fortunately, this happened while we were still testing.

To stay on top of these issues, we added an OpenAI health check to our alerting system. Unfortunately, I’ve seen the “Service OpenAI is experiencing some issues” alert more often than I’d like.
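
A health check like this can be quite small. The sketch below is a minimal illustration, not our production code: `HealthCheck`, `probe`, and `alert` are hypothetical names, and the probe is assumed to be any cheap call you choose (for example, a tiny request to the API) that returns True on success. It alerts only after several consecutive failures, so a single slow response does not wake anyone up.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class HealthCheck:
    """Tracks consecutive probe failures and alerts once a threshold is reached."""

    probe: Callable[[], bool]      # assumption: returns True when the API answered correctly
    alert: Callable[[str], None]   # assumption: sends a message to your alerting channel
    failure_threshold: int = 3     # consecutive failures required before alerting
    consecutive_failures: int = 0
    alerted: bool = False

    def run_once(self) -> bool:
        """Run one probe; return True if the service looked healthy."""
        try:
            healthy = self.probe()
        except Exception:
            # Any exception from the probe counts as a failed check.
            healthy = False

        if healthy:
            if self.alerted:
                self.alert("OpenAI API recovered")
            self.consecutive_failures = 0
            self.alerted = False
            return True

        self.consecutive_failures += 1
        if self.consecutive_failures >= self.failure_threshold and not self.alerted:
            self.alert("Service OpenAI is experiencing some issues")
            self.alerted = True
        return False
```

You would call `run_once()` from whatever scheduler you already use (a cron job, a background task) and pass in your real alerting function.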

OpenAI ChatGPT Model Degradation

OpenAI’s newer GPT-4o model, while more affordable, has shown a decline in output quality, both in our experience and according to feedback from some of our clients.

The format and consistency of its outputs have also shifted, which can break application logic that relies on a predictable response structure.

We’ve had to adapt by tweaking our post-processing layer, but these aren’t the issues you want to be dealing with once your project is stable.
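
To give a sense of what such a post-processing layer can look like, here is a minimal sketch. It assumes the prompt asks the model for a JSON object; `parse_model_reply` and the keys in `REQUIRED_KEYS` are hypothetical and stand in for whatever structure your app expects. The function tolerates a reply wrapped in a Markdown code fence and rejects anything that does not match the expected shape, so the caller can retry or fall back.

```python
import json
from typing import Optional

REQUIRED_KEYS = {"title", "items"}  # hypothetical schema, for illustration only
FENCE = "`" * 3                     # Markdown code fence marker


def parse_model_reply(raw: str) -> Optional[dict]:
    """Return the parsed reply, or None if it does not match the expected shape."""
    text = raw.strip()
    if text.startswith(FENCE):
        # Drop the opening fence line (with its optional language tag)
        # and everything from the closing fence onwards.
        text = text.split("\n", 1)[1] if "\n" in text else ""
        text = text.rsplit(FENCE, 1)[0]
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict) or not REQUIRED_KEYS.issubset(data):
        return None
    return data
```

Returning None instead of raising keeps the decision (retry, use a default, show an error) with the calling code.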

OpenAI & Azure Token Limits

You might think that with enough budget, you can freely use the Azure or OpenAI APIs.

However, as we discovered while working on a client's project, both services enforce rate limits and quotas (for example, on tokens per minute) that you cannot simply pay your way past.

Although this limit is relatively high, it can quickly feel restrictive as your service grows in popularity.
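
One way to avoid hitting the limit unexpectedly is to track usage on your side before sending a request. The sketch below is a simplified, single-process illustration under stated assumptions: `TokenBudget` is a hypothetical helper, the per-minute figure is whatever your account actually allows, and the token count per request is an estimate you supply yourself.

```python
import time
from collections import deque
from typing import Callable, Deque, Tuple


class TokenBudget:
    """Client-side sliding-window budget for tokens per minute."""

    def __init__(self, tokens_per_minute: int,
                 clock: Callable[[], float] = time.monotonic):
        self.limit = tokens_per_minute
        self.clock = clock
        self.window: Deque[Tuple[float, int]] = deque()  # (timestamp, tokens)
        self.used = 0

    def _expire(self) -> None:
        # Forget reservations that are older than 60 seconds.
        cutoff = self.clock() - 60.0
        while self.window and self.window[0][0] <= cutoff:
            _, tokens = self.window.popleft()
            self.used -= tokens

    def try_reserve(self, tokens: int) -> bool:
        """Reserve tokens for a request; return False if it would exceed the budget."""
        self._expire()
        if self.used + tokens > self.limit:
            return False
        self.window.append((self.clock(), tokens))
        self.used += tokens
        return True
```

When `try_reserve` returns False, the caller can queue the request or delay it instead of sending it and getting rejected by the API.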

API Request Delays

When managing multiple complex requests, you might notice an increase in response times, leading to timeouts.

This might not be obvious during testing, but as your solution and prompts grow, slow responses and timeouts can make your service unusable.
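
A common mitigation is an explicit timeout combined with a limited number of retries and exponential backoff. The sketch below is generic rather than tied to a specific SDK: `call_with_retry` is a hypothetical helper, and it assumes your `request` function accepts a timeout in seconds and raises Python's built-in `TimeoutError` when that timeout is exceeded (real client libraries raise their own exception types, which you would catch instead).

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry(
    request: Callable[[float], T],
    timeout_s: float = 30.0,
    max_attempts: int = 4,
    base_delay_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call request(timeout_s), retrying on TimeoutError with exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return request(timeout_s)
        except TimeoutError:
            if attempt == max_attempts:
                raise  # out of attempts: let the caller handle the failure
            # Wait 1x, 2x, 4x, ... the base delay, plus jitter so that many
            # clients do not retry at exactly the same moment.
            delay = base_delay_s * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, delay / 2))
    raise AssertionError("unreachable")
```

Keeping `max_attempts` small matters here: retries add load to a service that is already slow, so they help with short spikes rather than with a sustained slowdown.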

Additional Resources

There’s a site tracking ongoing OpenAI issues, where you can see that its uptime is around 99.7%. That equates to roughly 30 minutes of downtime per week, or about 26 hours (just over one day) per year. Click here to access it.
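
As a quick sanity check, the downtime implied by an uptime percentage is easy to compute yourself. The helper name `downtime_hours` is just for illustration:

```python
def downtime_hours(uptime_percent: float, period_hours: float) -> float:
    """Expected downtime, in hours, over a period at a given uptime percentage."""
    return (1 - uptime_percent / 100) * period_hours


weekly_minutes = downtime_hours(99.7, 7 * 24) * 60   # about 30 minutes
yearly_hours = downtime_hours(99.7, 365 * 24)        # about 26 hours

print(f"{weekly_minutes:.0f} minutes per week, {yearly_hours:.0f} hours per year")
```

The same function lets you compare this figure against the availability target you have promised your own users.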

I also recommend taking a look at this article about the Azure OpenAI Service's quotas and limits.

How to Manage OpenAI/Azure’s ChatGPT Production Issues

We've come across and addressed some tricky ChatGPT issues in our work. Let’s explore how to tackle them.

The right answer depends on your tech expertise, business goals, and budget.

One practical approach is to acknowledge ChatGPT's imperfections. Labeling your service as 'Beta' can set user expectations about possible hiccups. It might seem like a big step, but it's often a smart choice when you're just starting out.

If that doesn’t work for you, identify your key issues and tackle them one by one.

For example:

Managing Token Usage: As you approach your token limit, consider splitting accounts by application or feature. This way, a resource-heavy app won’t consume all your tokens. You could also segment your environments or the regions you support.
Reducing Delays: Check the regions you operate in. The U.S. is often the most crowded, so switching to a less crowded service region could help.
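
Both ideas above can be expressed as a small routing table. The sketch below is only an illustration under stated assumptions: the deployment names, region names, and environment variable names are made up, and `pick_deployment` is a hypothetical helper. Each feature gets its own account (and therefore its own quota), with regions listed in order of preference.

```python
from dataclasses import dataclass
from typing import AbstractSet, Dict, List


@dataclass(frozen=True)
class Deployment:
    name: str         # hypothetical label for this account/deployment
    region: str       # illustrative region name
    api_key_env: str  # environment variable that holds this account's key


# Hypothetical routing table: one entry per feature, in order of preference.
ROUTES: Dict[str, List[Deployment]] = {
    "chat": [
        Deployment("chat-primary", "swedencentral", "CHAT_KEY_SE"),
        Deployment("chat-fallback", "eastus", "CHAT_KEY_US"),
    ],
    "batch-summaries": [
        Deployment("batch-primary", "eastus", "BATCH_KEY_US"),
    ],
}


def pick_deployment(feature: str,
                    unavailable: AbstractSet[str] = frozenset()) -> Deployment:
    """Return the first preferred deployment for a feature that is not marked unavailable."""
    for deployment in ROUTES[feature]:
        if deployment.name not in unavailable:
            return deployment
    raise LookupError(f"no available deployment for feature {feature!r}")
```

With this split, a heavy batch job can exhaust only its own quota, and the chat feature can move to its fallback region when the primary one is slow or rate limited.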

We’d be happy to consult, diagnose, and resolve your AI-related issues, backed by our 5.0 rating on Clutch and hundreds of successful projects!

OpenAI/Azure GPT issues are a can of worms: there are plenty of them out there, with very few known solutions.

If you have Machine Learning experience, you might be able to manage these challenges with your in-house team.

If the issues become overwhelming and you need expert guidance, give us a shout at hello@frompolandwithdev.com or fill out the form below!

And in case your project requires a professional website, we recently published an insightful article on which CMS is best for you. Be sure to check it out!
