Key Takeaways

Sky-High Ambitions Meet Cloud Realities: Hathora aimed for rapid growth by relying on cloud infrastructure, hosting on AWS with Kubernetes, catering to game developers ranging from small teams to massive studios.

MMO Market: A Billion-Dollar Playground: The MMO games market is projected to expand by nearly $30 billion by 2029, presenting both vast opportunities and serious infrastructure challenges for companies like Hathora.

Success and Strain: Too Much, Too Fast: Hathora's platform was adopted by over 100 studios quickly, but larger client demands exposed the limitations and high costs of their cloud-only approach.

Revisiting Strategy: Learning the Hard Way: Realizing their infrastructure inadequacies, Hathora's team re-evaluated their business plan to find sustainable ways to support and profit from large game studios.

Lessons for Tech Leaders: Adapting to Scale: Hathora's experience serves as a lesson for tech leaders on the importance of infrastructure scalability and the balance between cloud costs and growth strategies.

Sometimes, the very same infrastructure that helps you achieve a successful launch ends up becoming a hindrance as your business flies skyward.

In the case of Hathora, that was both literally true and a tad ironic: The company is in the business of providing critical infrastructure and server orchestration for video game studios to host and scale their multiplayer online games.

It’s a huge market by most measures. Research firm Technavio projected that the massively multiplayer online (MMO) games market will grow by nearly $30 billion between 2025 and 2029, for example. Other analysts project similar growth over the next five-ish years.

Hathora wanted to launch and scale its platform fast, so it followed what has become a common playbook for many startups, small teams, and other businesses: It went all-in on cloud, running everything on Amazon Web Services with EKS, the cloud giant’s fully managed Kubernetes service. Its ambitions were seemingly limitless: Hathora wanted to serve game developers of all sizes, from the tiniest teams to the largest studios and everyone in between.

And it worked – up to a point. More than 100 game studios adopted the Hathora platform within six months of its initial launch. That initial success revealed a fundamental problem given Hathora’s goal of serving game studios of all sizes and from around the globe: When the company began working with larger and larger studios, the economics of its cloud-only infrastructure quickly became unsustainable – so much so that it risked losing out on a large, lucrative segment of the overall market.

In this case study, we’ll get a first-hand look at how Hathora’s CTO and team transformed its initial business and technology strategy with potentially dramatic results – as well as some lessons learned for other technology leaders in any industry.

The Company & CTO

Company: Hathora, a global provider of critical infrastructure and server orchestration for video game studios to host their multiplayer online games.

The CTO: Harsh Pandey, who is also the company’s founder:

“We built Hathora to give game studios of all sizes access to global-scale infrastructure without needing to build it themselves. But when we started pricing things out for larger studios, the economics broke down. Cloud costs, especially data egress, made our offering unsustainable for exactly the kind of customers we wanted to serve.”

Headquarters: New York City

Customer Footprint: Today, Hathora (literally) serves customers around the world from 14 regions spread across six continents.

The Business Problem

Big ambitions usually bring big challenges, and that was precisely the case for Harsh and Hathora.

Some of that is fundamental to the nature of the gaming industry: MMO games aren’t mom-and-pop storefronts with predictable traffic patterns. Rather, multiplayer games must serve constant updates to every single connected player simultaneously in a highly performant way.

That means tons of data regularly moving back and forth between back-end infrastructure and individual players, who can number in the millions for larger games. And if you think your customers are demanding, well, let’s just say that serious gamers aren’t necessarily a patient bunch when it comes to game performance and user experience.

Running game server workloads from Hathora’s cloud-only platform worked well for smaller studios. But when a larger studio approached the company about pricing, Harsh and the team realized the limitations of the financial model.

“We knew we had a challenge to address when a large game studio using our game server hosting platform for internal playtests asked about pricing for launch,” Harsh told us. “Their projected bandwidth bill came out to over a million dollars a month.”

That was four times the normal expected compute costs for running the game, according to Harsh, which wasn’t sustainable for the studio or for Hathora. It put Hathora at a proverbial crossroads: stick with its original cloud-only strategy but risk losing the large-studio segment, or make a change.

“The problem to fix was pricing, plain and simple,” Harsh says.

The solution, however, was technical: Hathora needed to transform its own infrastructure to better meet the unique demands of game server workloads, while also adopting a pricing model that even the largest studios could easily adopt.

What They Did About It and Why

Tech pros of all kinds have heard plenty of pitches about the benefits of migrating infrastructure and workloads to the cloud, or simply launching as a cloud-only shop. That’s been a major trendline for years, and Hathora wasn’t an exception.

To solve its large-customer problem, however, Hathora moved in nearly the opposite direction, adopting a hybrid cloud architecture and ultimately making a big bet on an old stalwart: bare metal servers.

Harsh describes the company’s evolution:

“We shifted to a hybrid model and tracked infrastructure cost per concurrent session, latency performance across global regions, and the engineering effort needed to operate everything. By moving 80% of our workloads to bare metal, we can cut compute costs nearly in half and bring bandwidth costs down by more than 90%. We now manage over 30,000 cores across 14 regions with a relatively small engineering team. That level of efficiency and reach would have been impossible with our original architecture.”
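The figures Harsh cites can be turned into a rough before-and-after calculation. The Python sketch below is illustrative only: it takes the large studio's projected bandwidth bill ($1 million a month, roughly four times compute) as a baseline and applies the platform-wide reductions from the quote (compute "nearly in half," bandwidth "more than 90%") as if they carried over to that one bill, which the article does not claim. The session count and all names in the code are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class MonthlyCost:
    compute: float    # USD per month
    bandwidth: float  # USD per month

    @property
    def total(self) -> float:
        return self.compute + self.bandwidth


def apply_reductions(cost: MonthlyCost, compute_cut_pct: int, bandwidth_cut_pct: int) -> MonthlyCost:
    """Reduce each cost line by the given whole-number percentage."""
    return MonthlyCost(
        compute=cost.compute * (100 - compute_cut_pct) / 100,
        bandwidth=cost.bandwidth * (100 - bandwidth_cut_pct) / 100,
    )


def cost_per_session(cost: MonthlyCost, concurrent_sessions: int) -> float:
    """Infrastructure cost per concurrent session, one of the metrics Harsh mentions."""
    if concurrent_sessions <= 0:
        raise ValueError("concurrent_sessions must be positive")
    return cost.total / concurrent_sessions


# Baseline from the article: bandwidth of about $1M/month, four times compute.
cloud_only = MonthlyCost(compute=250_000, bandwidth=1_000_000)

# Assumption: the quoted platform-wide cuts apply unchanged to this bill.
hybrid = apply_reductions(cloud_only, compute_cut_pct=50, bandwidth_cut_pct=90)

SESSIONS = 100_000  # hypothetical concurrent sessions, not from the article

savings = 1 - hybrid.total / cloud_only.total

print(f"cloud-only: ${cloud_only.total:,.0f}/month, "
      f"${cost_per_session(cloud_only, SESSIONS):.2f} per session")
print(f"hybrid:     ${hybrid.total:,.0f}/month, "
      f"${cost_per_session(hybrid, SESSIONS):.2f} per session")
print(f"estimated savings: {savings:.0%}")
```

Under these assumed inputs, the monthly total falls from $1.25 million to $225,000, a reduction of about 82%. Bandwidth makes up 80% of the cloud-only bill, so the bandwidth cut does most of the work.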

Hathora still uses cloud infrastructure for scalability and demand spikes, but it now accounts for roughly 20% of workloads instead of 100%. The stack leans on two bare-metal providers plus AWS and GCP cloud resources, rather than keeping everything on AWS. Harsh and team opted for Talos Linux, a minimalist distro from Sidero Labs designed specifically for Kubernetes environments, and Omni (also from Sidero Labs) for orchestration from a single control plane across its hybrid environments.
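The split between a bare-metal baseline and cloud capacity for spikes can be pictured as a simple overflow rule. The Python sketch below is a toy model and not Hathora's scheduler, which the article does not describe: the function names, the capacity figure, and the hourly demand series are all hypothetical, chosen so that the cloud share works out to the roughly 20% mentioned above.

```python
from __future__ import annotations


def place_sessions(demand: int, bare_metal_capacity: int) -> tuple[int, int]:
    """Fill fixed bare-metal capacity first; overflow goes to cloud.

    Returns (sessions_on_bare_metal, sessions_on_cloud).
    """
    if demand < 0 or bare_metal_capacity < 0:
        raise ValueError("demand and capacity must be non-negative")
    on_bare_metal = min(demand, bare_metal_capacity)
    return on_bare_metal, demand - on_bare_metal


def cloud_share(hourly_demand: list[int], bare_metal_capacity: int) -> float:
    """Fraction of total session-hours that overflow to cloud."""
    total = sum(hourly_demand)
    if total == 0:
        return 0.0
    on_cloud = sum(place_sessions(d, bare_metal_capacity)[1] for d in hourly_demand)
    return on_cloud / total


# Hypothetical demand (concurrent sessions per hour) with an evening spike.
hourly_demand = [4_000, 6_000, 8_000, 14_000, 18_000, 10_000]
capacity = 10_000  # hypothetical bare-metal capacity in concurrent sessions

share = cloud_share(hourly_demand, capacity)
print(f"share of session-hours served from cloud: {share:.0%}")
```

With these made-up numbers, 12,000 of 60,000 session-hours spill over to cloud. The design point is that fixed-cost bare metal absorbs the predictable baseline, while pay-as-you-go cloud capacity is used only when demand exceeds it.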

Needless to say, this wasn’t as simple as turning a light switch on or off. Moreover, there were pros and cons to consider. Going cloud-only initially helped Hathora launch quickly and successfully. Moving away from that model meant giving up some of the advantages cloud offers, at least at first:

“Making a change meant giving up a lot of the conveniences cloud platforms offer,” Harsh says. “We would lose easy autoscaling, built-in monitoring, and managed services. That added operational complexity. We also had to prove we could match or exceed cloud performance. There was no guarantee we’d find vendors or tooling that would work the way we needed. But we believed that getting cost and performance under control was essential to our business model, and we were ready to take that risk.”

The bet is paying off, paving the way for the next phase of Hathora’s growth: the company now serves customers on six continents. It can deliver the same critical infrastructure to the largest studios at a significantly lower cost than before. It also offers more granular control over the unique requirements of game workloads, providing node-level control with Talos Linux and unified cluster orchestration with Omni, regardless of where workloads run.

“These tools are minimal, secure, and easy to operate across different infrastructure types,” Harsh says. “We evaluated our stack based on performance, cost, and how much control we would retain. The new setup lets us operate with a small team and scale globally without being tied to any one vendor.”

Indeed, Hathora’s small engineering team has become an unofficial KPI of sorts. The company saw immediate performance and cost gains after migrating its first full region to the hybrid model, and it subsequently scaled rapidly, now running 14 regions around the world.

“Every time someone finds out our infrastructure is managed by just a few engineers, it’s a moment that reinforces how far we’ve come.”

Key Insights & Lessons Learned

You don’t have to be in the digital infrastructure or video games industries to apply similar principles and lessons in your own business.

  • Listen to your customers: Harsh and the team realized they had a problem – and that they needed to be proactive about solving it – by listening to their customers, in particular the large studio that had been using Hathora for internal playtests and later asked what it would cost to run the actual game on the platform. CTOs are as much responsible for solving customers’ problems as they are for building technology solutions – the two go hand-in-hand.
  • Don’t be afraid to pivot. “Cloud first,” “cloud native,” and similar terms should be treated as adaptable strategies, not sacred dogma. The same principle applies widely: just because you start with a particular strategy doesn’t mean you have to stick with it forever. Hathora could have stood pat and stayed entirely in the cloud, but doing so would have limited its growth potential and ultimately the performance of the games running on its infrastructure.

Pivoting to a hybrid model enabled the company to serve customers of all sizes and improve performance because of greater flexibility and control when addressing the specific requirements of game workloads. Hathora has since been able to launch an enterprise tier that gives the largest studios predictable pricing and strong performance guarantees.

  • Be diligent in your analysis. That kind of transformation can’t be accomplished by guesswork. Harsh and team did a full proof-of-concept with Talos and Omni to put their strategic shift to the test:

“Once we validated that we could orchestrate everything through one system, we migrated our first full region. The performance gains were immediate. From there, we rolled out the model globally, adding vendors and locations as we went,” Harsh said.

The team analyzed the pros and cons – including the initial loss of cloud’s benefits that Harsh outlined above – to minimize surprises. It knew it had to find a Kubernetes-friendly solution that could orchestrate everything across multiple environments and regions, and essentially treat every machine the same whether it ran on bare metal or in a cloud.

Harsh noted that they also went deep on vendor benchmarking, comparing AMD and Intel CPUs, testing real-world network performance, and evaluating vendors on how well they performed against AWS Global Accelerator.

“We made sure our new setup wasn’t just cheaper, but also faster and more reliable in the regions that mattered.”

  • Play the long game: Transformative changes in direction like this one aren’t just about solving one-time problems. Done right, they open up more and more possibilities going forward. The long-term mindset is already bearing fruit for Hathora. They’re continuing to develop new features for larger studios, such as dedicated cluster options, regional peering for low-latency handoffs, and enhanced observability during live events.

Harsh also said that Hathora’s new infrastructure model for gaming is showing potential value in other industries that require real-time responsiveness and high throughput, and the company’s long-term plan is to roll out new solutions for developers outside of the games industry.

“Moving to a hybrid model gave us long-term flexibility. We can now enter new regions quickly, test new hardware vendors without lock-in, and give customers more control over how their infrastructure is deployed. The platform has become much more modular and adaptable.”

Kevin Casey

Kevin Casey is an award-winning technology and business writer with deep expertise in digital media. He covers all things IT, with a particular interest in cloud computing, software development, security, careers, leadership, and culture. Kevin's stories have been mentioned in The New York Times, The Wall Street Journal, CIO Journal, and other publications. His InformationWeek.com story on ageism in the tech industry, "Are You Too Old For IT?," won an Azbee Award from the American Society of Business Publication Editors (ASBPE), and he's a former Community Choice honoree in the Small Business Influencer Awards. In the corporate world, he's worked for startups and Fortune 500 firms – as well as with their partners and customers – to develop content driven by business goals and customer needs. He can turn almost any subject matter into stories that connect with their intended audience, and has done so for companies like Red Hat, Verizon, New Relic, Puppet Labs, Intuit, American Express, HPE, Dell, and others. Kevin teaches writing at Duke University, where he is a Lecturing Fellow in the nationally recognized Thompson Writing Program.