Why Disrupting Cloud Infrastructure Requires an Economic Shift

Andy Manoske
12 min read · Apr 3, 2018

Why disrupting cloud infrastructure requires major economic, not just technological, innovation

What it looked like to connect to a game of Counter-Strike in 2003. The declining cost of video game servers highlights how industrial organization plays a defining role in lowering the cost of computing

When I was fourteen I wanted to run a Counter-Strike server.

This was easier said than done in the early 2000s. Even in Silicon Valley, finding remotely managed computing power with a reliable, strong network connection was expensive and difficult. Back then my only options for getting a Linux-based computer I could use to run a CS 1.4 server were the following:

Rent a server from a colo (colocation) provider.
This was what most e-commerce businesses did back then, and while it was the most reliable option it was also the most expensive. A colo Linux box running Red Hat with enough RAM and a T-1 line could cost hundreds of dollars a month, and with the bandwidth I’d need to serve games of de_dust2 at a high tick rate I’d likely be paying anywhere from $200–300 per month — far more than my $20/week allowance would permit.

Rent time on an Application Service Provider (or ASP).
ASPs were brand new and were the precursor to modern “clouds” like AWS and Azure. Unfortunately, ASPs were very application-specific, as virtualization technology like the hypervisor was still a few years away from breaking out of enterprise computing. And even more unfortunately, most ASPs didn’t like dealing with 14-year-old kids trying to get them to run a Counter-Strike server.

While less expensive than a colo at somewhere in the high tens to low hundreds of dollars per month, it was a no-go.

Rent a game server from a Managed Service Provider (or MSP).
Have a gaming-specific MSP manage the process of setting up, deploying, and maintaining the Counter-Strike server. I wouldn’t have root access to manage the actual machine running CS, but at least I’d be able to play on it.

Game MSPs were the cheapest option available. But even at $35/month (or $50/month in 2018 dollars), this was still extremely expensive for a kid on a $20/week allowance.

Instead I did what any nerdy teenager would do: I hacked it. I port-scanned my gateway to find an inbound port my ISP didn’t block, and I ran my Counter-Strike server for my friends on a box in my room.

Granted, this also meant I had to keep a Linux server constantly running in my room, massive lag spikes sometimes occurred when my ISP ran whatever esoteric service actually belonged on that port, and in retrospect this probably violated our EULA. But so it goes: a kid’s got to get out of CAL-O, right?

Counter-Strike: Global Offensive. Some things are different, but everything is still pretty lulzy

Flash forward to 2016. Thirteen years later, I’m twenty-seven. And I want to run a Counter-Strike server.

The world has changed a lot since 2003. Colos and MSPs were significantly cheaper in 2016 than they were in 2003. A configuration capable of running CS:GO at a high tick rate would likely run in the high tens of dollars per month, and MSPs charged anywhere from $20–$50 per month, depending on specs, for a dedicated server that you could actually control at the operating system level.

But modern cloud computing environments like Amazon Web Services provide an alternative. Here I can run my CS server off the fractional computing power of Amazon’s infrastructure for pennies on the dollar.

In fact, through some careful bandwidth management and a barebones AMI, I was able to get our server running mostly within Amazon’s free tier. My friends and I got all of our ridiculousness going for under $10/month, with a strong network backing the server and total control over our Counter-Strike setup.
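For the curious, here’s a minimal sketch of what spinning up such an instance looks like with Python’s boto3 library. The AMI ID, key pair, and security group below are placeholders, and installing the actual game server is left out; this is an illustration, not a turnkey recipe.

```python
import boto3

# Minimal sketch: launch a small EC2 instance to host a game server.
# The AMI ID, key pair, and security group are placeholders; use your own.
# CS:GO's dedicated server listens on UDP 27015 by default, so the
# security group must allow that traffic in.
ec2 = boto3.client("ec2", region_name="us-west-2")

response = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",            # placeholder barebones Linux AMI
    InstanceType="t2.micro",                    # free-tier eligible instance size
    KeyName="my-keypair",                       # placeholder SSH key pair
    SecurityGroupIds=["sg-0123456789abcdef0"],  # placeholder; must allow UDP 27015
    MinCount=1,
    MaxCount=1,
)
print(response["Instances"][0]["InstanceId"])
```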

In the thirteen years since I was forced to dodgily run a CS server on my Linux box in my bedroom, computing has changed tremendously. Comparing the costs to run a server in 2003 with the costs to deploy roughly the same server in 2016, one feels compelled to point to Moore’s Law and say this is purely a result of technology improving over that time.

Given that the modern iPhone X is an order of magnitude more powerful than my gaming computer from my early teens (and, in relative terms, less than a third of the cost), doesn’t it make sense that the server I use to run my video game would be similarly an order of magnitude cheaper?

Not necessarily.

Yes, dramatically cheaper computing power has a role to play here. Computing costs dropped precipitously between 2003 and 2016, though server processors like Intel Xeons remained at the high end of processor pricing, costing hundreds of dollars.

Most importantly, given how most server providers charge, bandwidth prices dropped such that serving the same volume of data in the mid-2010s likely cost 20% of what it had thirteen years prior. Even with these significant cost changes, however, these achievements alone can’t explain why running a Counter-Strike server in 2016 was an order of magnitude cheaper than in 2003.
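A quick back-of-the-envelope check makes the gap concrete. The inputs below are the rough figures quoted in this article, not measured data:

```python
# Back-of-the-envelope check using this article's own rough estimates.
colo_2003 = 250.0  # midpoint of the $200-300/month colo estimate from 2003
aws_2016 = 10.0    # roughly what the mostly-free-tier AWS setup cost in 2016

# If only component costs had fallen (e.g., bandwidth dropping to ~20% of
# 2003 prices), we'd expect a server to cost something like:
expected_from_tech_alone = colo_2003 * 0.20  # ~$50/month

# What actually happened is a far bigger drop:
actual_ratio = colo_2003 / aws_2016  # ~25x cheaper: an order of magnitude

print(f"Expected from cheaper tech alone: ~${expected_from_tech_alone:.0f}/month")
print(f"Actual improvement: ~{actual_ratio:.0f}x cheaper")
```

Cheaper silicon and bandwidth predict a server in the tens of dollars per month; the actual price landed well below that, and the remainder is explained by economics rather than technology.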

To understand this, we need to look back at the history of AWS.

In 2006 Amazon premiered Amazon Web Services, or AWS. AWS was a suite of public-facing infrastructure hosting services that sold spare computing power from Amazon’s enormous retail computing colossus to help maximize profit margins.

Powered by advances in virtualization technology that helped harvest spare processor time, memory, and bandwidth left over from Amazon.com transactions, AWS would grow from a revenue-generating side project into a $17.4 billion per year business. For comparison, AWS’ annual revenue is nearly as much as the entire global coffee industry.

AWS created the modern cloud computing industry and handily beat out its competition in colo, as well as period ASPs like Heroku, for one simple reason: it was dirt cheap. Because AWS used spare computing power from Amazon’s retail infrastructure (costs that had already been “paid”, both in deploying and staffing those data centers and in electricity and heating/cooling), Amazon was able to charge rock-bottom prices for hosting users’ applications and steadily undercut the competition.

AWS is a prime example of Economies of Scale. Because the initial cost of AWS’ operation was effectively already paid for (and because more efficient means of virtualization helped computing power scale linearly across data centers), AWS was able to grow quickly and cheaply off the back of Amazon.com’s global computing infrastructure. Amazon passed these cost savings on to the consumer, and those savings helped turn a commanding lead into dominance as the number one cloud computing vendor in the world by the late 2010s.

Economies of scale for a vendor can provide immediate cost savings that can be passed onto the consumer. But they can also provide the basis for stifling competition in the long run, thereby creating natural oligopolies or monopolies.
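In standard textbook form (a conventional formalization; the symbols F, c, and q are mine, not the article’s): if a vendor has already sunk a large fixed cost F into data centers and pays only a small marginal cost c per unit of compute sold, its average cost falls as volume q grows:

$$AC(q) = \frac{F}{q} + c, \qquad \frac{d\,AC(q)}{dq} = -\frac{F}{q^{2}} < 0$$

An incumbent serving an enormous q operates close to c; an entrant with a small q is stuck paying F/q on top of it, which is why the savings Amazon passed on were so hard for competitors to match.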

Natural oligopolies and natural monopolies are market structures wherein a small group of sellers (or even a single seller) commands market power through “natural” characteristics of the good they’re selling and the market they operate in. In naturally oligopolistic or monopolistic markets, these characteristics are typically persistent economies of scale combined with a high barrier to entry due to the costs of starting up and operating a competitive business at scale.

Cloud computing is a textbook example of a natural oligopoly. Vendors like Amazon, Microsoft, and Google have been able to capitalize on their pre-existing global computing infrastructure to offer cloud infrastructure services like AWS, Azure, and GCP at a fraction of the cost of what colo vendors (or even other potential cloud vendors) could provide.

Combined with powerful virtualization technology, expertise earned through years of growing that existing infrastructure, and lots of cash and credit, these vendors can scale their infrastructure linearly and much more cheaply than their competition.

But cost savings are just one half of the equation. The other component of the natural oligopoly in cloud computing comes from Barriers to Entry. It is phenomenally expensive to set up, deploy, staff, and run a cloud provider that can operate at the same scale as Amazon AWS. Doing so would cost tens of billions of dollars in simply deploying the network alone — and likely billions more per year to operate.

Economies of scale and barriers to entry have ensured that AWS, Azure, and GCP are the dominant top 3 cloud vendors in the world. Few vendors in the world even have enough money to deploy a competitive product, much less operate one reliably at scale while providing cost-competitive offerings similar to the top 3 cloud vendors.

So what about decentralization?

The surging rise of Bitcoin prices in 2017 has led to significant interest in blockchains and other types of distributed ledger technology (or DLT) as mechanisms for tracking the state of an ongoing series of transactions in a verifiable and potentially very secure way.

DLT is very interesting, and it provides some unique benefits for solving consensus problems in an environment where computing nodes can’t or don’t trust each other. This unique benefit (combined with lots of Dutch Tulip-level overhype around Bitcoin’s price) has led some to ask if blockchain technology can be used to decentralize infrastructure.

While I’m personally at my wit’s end with a lot of the cryptocurrency community’s behavior — as are most people who work in cryptography — exploring the possibility of using DLT to address traditionally monolithic computing tasks is something I think is an extremely apt, if not noble, endeavor.

After all, the Bitcoin network grew on the backs of hobbyist miners to become a multi-billion dollar market cap asset. Why can’t the next cloud infrastructure vendor?

There are several DLT-based decentralized computing projects attempting to do just that. For example, Akash uses an implementation of Tendermint’s consensus algorithm, a Byzantine Fault Tolerant (BFT) protocol that takes a Raft-esque approach to building a blockchain.

The creators of Tendermint are similarly engaged in building Cosmos, an implementation of Tendermint’s Proof of Stake algorithm that hosts application-agnostic ledgers which, theoretically, could similarly be used to process general-purpose computing tasks.
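To make the BFT idea concrete, here’s a heavily simplified sketch of the greater-than-two-thirds quorum rule at the heart of Tendermint-style consensus. This illustrates only the quorum math; the real protocol also involves proposal rounds, vote locking, and evidence of misbehavior.

```python
# Heavily simplified illustration of the >2/3 quorum rule used in
# Tendermint-style BFT consensus. This is the quorum math only; the
# real protocol also has proposal rounds, locking, and slashing evidence.

def block_commits(votes_for_block: int, total_validators: int) -> bool:
    """A block commits only if strictly more than 2/3 of validators vote for it."""
    return 3 * votes_for_block > 2 * total_validators

# With 4 validators the network tolerates 1 Byzantine node:
assert block_commits(3, 4)      # 3 of 4 precommits: the block commits
assert not block_commits(2, 4)  # 2 of 4 precommits: no quorum, the round stalls
```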

These are projects with commendable goals, and arguably some of the best publicly available approaches to this problem — I’m aware of a few more similar DLT-based computing platforms being quietly developed at several large corporations to capitalize on their spare computing power.

But while this line of investigation is fair, I’m concerned about one serious problem with any DLT-based approach to decentralizing cloud infrastructure:

Destabilizing the cloud computing market is not a technology problem. It is an economics problem.

Virtualization is the technology that enabled the modern cloud. Without the innovation of the hypervisor, a piece of technology that allows multiple operating systems to share the same hardware, environments such as AWS and Azure would not exist.

In the mid-2000s, hypervisors left expensive supercomputer and mainframe environments and entered the mainstream in the form of VMware’s ESX suite. Open source variants such as Linux KVM followed suit, and a year after KVM’s release Microsoft shipped its own Hyper-V hypervisor with Windows. By the time AWS hit its stride, hypervisors were available on every major computing platform and were natively included within the Windows and Linux kernels.

Virtualization technology has since continued to democratize the cost of computing. Virtualization-accelerating instruction sets such as Intel VT-x are now standard in most processors, allowing even basic desktop computers to performantly run multiple servers on the same platform.
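On Linux, you can see whether a machine advertises these hardware virtualization extensions with a quick check of /proc/cpuinfo (a Linux-specific sketch; other operating systems expose this differently):

```python
# Linux-only sketch: check /proc/cpuinfo for hardware virtualization flags.
# "vmx" is Intel VT-x; "svm" is AMD's equivalent (AMD-V).
with open("/proc/cpuinfo") as f:
    cpuinfo = f.read()

if "vmx" in cpuinfo:
    print("Intel VT-x supported")
elif "svm" in cpuinfo:
    print("AMD-V supported")
else:
    print("No hardware virtualization extensions advertised")
```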

In the mid-2010s, virtualization advanced further with the rise of Docker and other container technologies: lightweight environments that run single-instance applications with minimal computing overhead. Containers have since given way to new, even lighter-weight microservice architectures that cobble together the compute, networking, and storage for an application from whatever a system has available.

Known as serverless architectures, these modern microservice suites highlight just how far computing has come in the last ten years: a task that costs pennies to run on AWS Lambda today would have cost hundreds of dollars a year on a colo server a decade prior.
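For a sense of just how little machinery that requires, a Lambda function is nothing more than a handler. The sketch below assumes the Python runtime, and the event shape is illustrative:

```python
# A minimal AWS Lambda handler (Python runtime). Lambda bills per request
# and per unit of execution time, so a tiny task like this costs pennies
# at hobbyist volumes, versus a monthly bill just to keep a colo box on.
def handler(event, context):
    name = event.get("name", "world")
    return {"statusCode": 200, "body": f"Hello, {name}!"}
```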

Virtualization is elemental to cloud computing, and platforms such as Hyper-V and KVM literally run environments like Azure and AWS. So given virtualization’s major advances over the last 10 years, you would expect those advances to have had a disruptive impact on the cloud computing market.

There’s just one problem with that prediction: that isn’t what happened.

A slide from Urs Hölzle’s presentation at Google Cloud Next highlighting the discrepancy between public cloud pricing and the price improvements predicted by Moore’s Law

Instead, while the cost of computing as a whole has dropped significantly over the past decade, cloud computing prices have trailed slothfully behind. According to Google Fellow Urs Hölzle, most cloud computing prices have dropped only in the mid single digits year over year, while the cost of computing has dropped several times that rate over the same interval.

The easiest explanation is also possibly the most nefarious: cloud vendors have no incentive to drop pricing more precipitously, and thus don’t really care. In a world where these vendors are competing against alternatives like colos that cost orders of magnitude more for their services, there’s no reason why Amazon EC2 pricing needs to track the cost benefits of computing’s advance under Moore’s Law.

But what about other cloud vendors? If computing power is getting cheaper and cheaper — due largely to Moore’s Law as well as software advances in fields such as virtualization — why hasn’t someone come to seriously threaten the oligopoly of Google, Amazon, and Microsoft?

Because this isn’t a technology problem. This is an economics problem.

Going back to our analysis of natural oligopolies and natural monopolies, we noted there’s a “one-two” punch of economic conditions that need to be satisfied.

The first is Economies of Scale. To challenge the top 3 cloud vendors in this field, you would need to figure out a way to outcompete their advantages in expertise and in growing their environments.

Unfortunately, despite the rapid advance of computing from the mid-2000s until today, no technology has premiered that provides a small team (or even a small enterprise) with the capability to deploy, manage, and grow an infrastructure as fast as Google, Microsoft, and Amazon. Virtualization and infrastructure tools such as OpenStack have helped teams operate at large scale, but nobody save large technology organizations such as Facebook can scale as efficiently as the top cloud vendors.

The second condition for a natural oligopoly is Barriers to Entry. Again, while technology certainly helps here, it doesn’t override the commanding lead that vendors such as AWS and Azure have over the cloud.

For example, software such as OpenStack and Kubernetes has provided cost-effective mechanisms for setting up and deploying a large cloud infrastructure.

But these software suites still need physical servers, as well as physical data centers, to run. Real estate costs and staffing costs dictate a massive barrier to entry in providing a truly independent alternative to the top cloud vendors, ensuring that once again only large technology companies could possibly pose a significant risk.

So what about the blockchain? While DLT is great, it remains to be seen whether it really addresses these two major traits that give the cloud vendors their edge.

For DLT to really make a dent (in a way that technology elemental to cloud infrastructure, such as containers or virtualization, has failed to do) it would need to provide a disruptive means of outcompeting Amazon at buying data centers, at running large-scale applications reliably, and at scaling more cheaply and effectively across the globe than the entire organization of AWS can muster with its hundreds of billions of dollars in cash and credit.

That, to be blunt, seems extremely unlikely — not least because most blockchain systems will likely run on cloud vendors such as AWS and Azure (who could simply turn them off if they ever had an issue).

Decentralizing the cloud is an economics problem, not a technological one. But while blockchains and serverless tech alone won’t destabilize the current trinity of cloud providers, major economic shifts in international business may do just that.

Many of the big three’s economies of scale are being destabilized by new data sovereignty laws. In China, for example, domestic PRC laws around cryptography and computing have ensured that Chinese vendors such as Huawei and AliCloud will hold dominance against foreign vendors such as AWS in one of the largest economies in the world.

The EU’s GDPR similarly restricts the physical movement of data across geographies, ensuring that domestic telcos, who otherwise have a limited competitive advantage vs. Microsoft and other major cloud vendors within the EU, are able to capitalize on their own Economies of Scale and subject the big three to the costs of high legal barriers to entry.

In a world where data has become more regionalized due to international politics, the sovereignty of local computing power poses a greater economic destabilizer to the current cloud computing market than any technological advancement of the last twenty years.

While this may shake up the ordering (or even make the market less oligopolistic), real decentralization of cloud computing is unlikely until some major economic shift changes the dynamics of the market for cloud infrastructure.


Andy Manoske

Security and devops product leader. Prev: HashiCorp's first product manager and creator of Vault Enterprise, security PM @ NetApp, AlienVault. Warhammer player.