Cloud bursting is a way to deal with one of IT’s constant challenges: balancing demand against the compute and storage capacity needed to fulfil it.
Cloud bursting allows an organisation to rapidly migrate datacentre workloads to the cloud’s theoretically unlimited compute and storage when required.
Here, hybrid cloud combines fixed, on-premise infrastructure with the on-demand capacity of the cloud. But how does cloud bursting work, and how practical is it for day-to-day IT operations?
“Cloud-bursting was a seductive idea,” says Bryan Betts from analyst Freeform Dynamics. “If demand peaks and you run out of compute capacity on-premise, shift some jobs out to the elastic cloud and run them there. Then, when the peak is past, pull it all back on-premise and carry on as before.”
“The reality is rather different, of course. Cloud-bursting is genuinely feasible – indeed, some organisations use it all the time – but the inevitable caveats and limitations are a lot more obvious now.”
As Betts suggests, cloud bursting is not a new idea – it has been available for at least 10 years – but growing interest in hybrid cloud, cloud-native applications, and technologies such as containers are prompting more IT teams to see what it can offer in their environments.
To date, take-up has been strongest in sectors such as retail, which have large – but fairly predictable – peaks in demand.
Cloud bursting pre-requisites
Cloud bursting offers a technical solution to those short-term peaks. Its purpose is to transfer workloads and their associated data to the cloud. Once the peak has passed, the workload – including the data – transfers back to the original host system.
Cloud bursting can be between on-premise IT and cloud infrastructure, between private and public cloud, or even potentially between public cloud services. Transferring workloads from private to public cloud and back is the most common scenario currently.
Cloud bursting is an ongoing, two-way flow between local and cloud infrastructure. This makes it more complex than systems that simply offload data to the cloud, such as archiving or disaster recovery. It also requires data and workloads that are suited to the approach – and not all are.
“In terms of technical prerequisites, the most challenging piece is the data, and in real-time scenarios it’s generally not possible to keep moving data around,” says Dale Vile from Freeform Dynamics.
“We’re therefore largely talking about moving only the compute element. However, this in turn mandates high-speed and low-latency connectivity between the VM [virtual machine], container or serverless function and the data source.”
Bursting lends itself to systems with extreme variability in demand, where the system’s operator can use public cloud spot prices to meet that demand economically. This allows firms to avoid building in-house capacity for large, but possibly infrequent, peaks.
At the same time, however, cloud bursting might not be best-suited to mission-critical or performance-sensitive applications. Bursting to the cloud introduces delays and additional latency, and poses questions around data integrity and security.
An organisation might, for example, put a customer service application on to cloud bursting infrastructure because website customers may tolerate a little latency. But such delays would be less acceptable in bank-to-bank financial transactions.
However, as Freeform Dynamics’s Vile points out, cloud service providers now offer high-speed, high bandwidth connectivity directly to customer datacentres or even to other cloud service providers so they can cater for more demanding applications. “You can, for example, burst from reserved or bare metal instances on one cloud to dynamic resources on another,” he says.
Applications also need to allow for cloud bursting. An application that is tightly bound to a relational database is less well-suited to bursting than the emerging generation of applications designed for distributed, multicloud and hybrid environments.
Where organisations need to work with a single database or dataset, it may be possible to burst compute to the cloud and leave data on-premise. This can work where latency is not a key factor, or for applications where capacity limits are not in the database but elsewhere such as in a web application layer.
Cloud bursting in practice
Most service providers categorise cloud bursting into three main types: distributed load balancing, automated bursting, and manual bursting. Even with manual processes, some degree of automated monitoring is needed.
According to Amazon Web Services: “IT administrators configure the capacity limits of their on-premise computing resources. When an application’s workload reaches the limit, the application switches to using public cloud resources. Load balancer technology redirects incoming requests for the application to the cloud.”
Distributed load balancing is perhaps the most sophisticated tier. It automatically splits workloads between on-premise hardware and standby cloud capacity, according to load thresholds that admins set for in-house infrastructure.
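As a sketch, the routing decision AWS describes can be reduced to a simple threshold check: once on-premise utilisation crosses a configured limit, new requests are redirected to cloud endpoints. The capacity figure, threshold and endpoint names below are illustrative assumptions, not any provider’s actual API:

```python
# Illustrative sketch of threshold-based burst routing.
# Capacity figures and endpoint names are hypothetical.
ON_PREM_CAPACITY = 100   # max concurrent requests on-premise
BURST_THRESHOLD = 0.8    # burst once 80% of capacity is in use

def route_request(active_on_prem_requests: int) -> str:
    """Return the endpoint a new request should be sent to."""
    utilisation = active_on_prem_requests / ON_PREM_CAPACITY
    if utilisation >= BURST_THRESHOLD:
        return "cloud-endpoint"    # redirect to public cloud resources
    return "on-prem-endpoint"      # stay on local infrastructure
```

In a real deployment, the same logic would sit inside the load balancer’s health and capacity checks rather than application code.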
Automated bursting suits more regular bursts. Users can set up the limits for bursting to the cloud and for scaling down.
Manual bursting relies on IT teams to spin up and take down additional capacity, usually prompted by monitoring software. This is best-suited to temporary, but possibly large, capacity peaks or testing.
Services such as AWS’s EC2 Auto Scaling and Microsoft’s Azure Virtual Machine Scale Sets are designed to allow VM-based workloads to scale into the cloud to manage capacity. Increasingly, though, IT architects are looking at containers as an alternative, because Kubernetes can automatically scale clusters and pods. So, if an application can run in containers, it can also take advantage of cloud bursting.
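In Kubernetes, that pod-level scaling is typically expressed as a HorizontalPodAutoscaler. The manifest below is an illustrative config fragment – the deployment name `webapp` and the replica and CPU thresholds are assumptions for the example, not recommendations:

```yaml
# Illustrative HorizontalPodAutoscaler; names and thresholds are assumed.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: webapp-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: webapp
  minReplicas: 3
  maxReplicas: 30
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
```

Paired with a cluster autoscaler that can add cloud-hosted nodes, this is what lets a containerised application burst without manual intervention.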
Cloud bursting’s economics – and its skills deficit
Cloud bursting can save money by allowing customers to use cheaper, public cloud resources instead of buying additional on-premise capacity that is likely to be under-utilised.
And, with the option of bursting from private to public cloud or even between clouds, it allows organisations to optimise spending by switching resources between zones or even providers. Potentially, organisations can reduce private cloud provisioning in favour of cheaper, public cloud offerings too, assuming that security and data management risks can be controlled.
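The economic argument can be made concrete with some back-of-the-envelope arithmetic. All the prices and workload figures below are invented purely to illustrate the comparison, not real quotes:

```python
# Hypothetical cost comparison: provisioning on-premise for the peak
# versus provisioning for the baseline and bursting to the cloud.
# All figures are illustrative.

BASELINE_SERVERS = 10        # servers needed for everyday load
PEAK_SERVERS = 25            # servers needed during the seasonal peak
PEAK_HOURS_PER_YEAR = 300    # hours per year the peak actually lasts

ON_PREM_COST_PER_SERVER_YEAR = 3_000   # amortised hardware + running cost
CLOUD_COST_PER_INSTANCE_HOUR = 0.50    # on-demand/spot price

# Option 1: buy enough on-premise capacity to cover the peak.
provision_for_peak = PEAK_SERVERS * ON_PREM_COST_PER_SERVER_YEAR

# Option 2: buy baseline capacity and burst the difference to the cloud.
burst_instances = PEAK_SERVERS - BASELINE_SERVERS
provision_and_burst = (
    BASELINE_SERVERS * ON_PREM_COST_PER_SERVER_YEAR
    + burst_instances * PEAK_HOURS_PER_YEAR * CLOUD_COST_PER_INSTANCE_HOUR
)

print(provision_for_peak)    # 75000
print(provision_and_burst)   # 32250.0
```

The gap narrows, of course, as the peak lasts longer or cloud prices rise – which is exactly why the market intelligence discussed below matters.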
This does, however, require investment. First, firms need compatible applications, or need to invest in cloud-native or otherwise scalable software. Second, bursting might require upgrades in the datacentre, such as deployment of cloud providers’ on-premise technology, and better bandwidth.
IT departments will also need strong market intelligence to identify which cloud provider offers are going to bring savings. This is especially important for architectures designed to make use of spot markets.
Lastly, customers need real-time monitoring, even for the simplest manual bursting scenario. Without good monitoring, organisations will not know when to trigger the burst, or when to repatriate a workload and scale back cloud provision – and they risk service interruption or wasted money.
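The burst/repatriate decision is usually implemented with hysteresis – two separate thresholds – so that utilisation hovering around a single limit does not make the workload flap between on-premise and cloud. A minimal sketch, with thresholds that are illustrative assumptions:

```python
# Illustrative burst/repatriate decision with hysteresis.
# Thresholds are assumptions for this sketch, not recommendations.
BURST_AT = 0.85        # burst when utilisation rises above 85%
REPATRIATE_AT = 0.60   # pull back only once it falls below 60%

def next_state(utilisation: float, bursting: bool) -> bool:
    """Return True if the workload should be running in the cloud."""
    if not bursting and utilisation > BURST_AT:
        return True        # trigger the burst
    if bursting and utilisation < REPATRIATE_AT:
        return False       # repatriate and scale back cloud spend
    return bursting        # otherwise, stay where we are
```

The gap between the two thresholds is what prevents a workload at, say, 70% utilisation from being moved back and forth on every monitoring cycle.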
All this also needs skills. Firms already face a shortage of skilled IT staff, especially those with cloud know-how.
Because cloud bursting straddles IT infrastructure, cloud infrastructure, and capacity and cost planning, these skills are particularly scarce, so firms might need to grow their own talent.
“If you want the ability to handle demand spikes in real-time without also risking busting your budget, you’re going to need the ability to continuously monitor app performance, check the spot-market prices of various cloud providers, and automatically move the workloads around. It’s do-able, but it isn’t trivial,” warns Freeform Dynamics’s Betts.