I’ve been spending some time looking through Microsoft’s Azure offerings lately. With the company championing services and products that can be used across any hardware running any operating system, developers are being encouraged to move the business logic of their apps into the cloud. Whilst server-side logic has been a staple of cross-platform development for decades, the most frequently trumpeted benefit of moving to the cloud is scalability.
In this piece I’ll be taking a look at the different options available in the Azure cloud for scaling services and infrastructure, and where they are most beneficial.
Before I go any further it’s worth defining two terms you’ll see used in this article: “scaling out” and “scaling up”. All of Azure’s services can be broken down into instances of resources – a resource being, say, a virtual machine, of which there could be one or several instances. Scaling out means increasing the number of instances available – for example by adding more virtual machines – allowing parallel processing to handle increased load. Scaling up means increasing the capability of the resource itself – for instance by increasing its processing power or capacity – thereby increasing its throughput.
Autoscaling is provided for most of the application products available in Azure, and it is usually a form of scaling out based on load. Almost all autoscaling can be configured to behave differently during different time periods. For instance, if you know that your service will be doing most of its work during the working week, you can have one configuration used on weekdays and another at weekends. The available schedules are:
- Day and night schedules
- Weekdays and weekend schedules
- Week day, week night, and weekend schedule (by combining the first two options)
- Specific date ranges (should you know of an expected surge of demand for a given hour, day or week)
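To make the idea of schedule-based configurations concrete, here is a minimal sketch of how such a selection might work. This is purely illustrative – the profile names, instance counts, and selection logic are my own inventions, not how Azure implements its schedules:

```python
from datetime import datetime

# Hypothetical profiles: instance-count bounds for two example schedules.
# The values are illustrative, not Azure defaults.
PROFILES = {
    "weekday": {"min_instances": 2, "max_instances": 10},
    "weekend": {"min_instances": 1, "max_instances": 4},
}

def active_profile(now: datetime) -> dict:
    """Return the scaling profile that applies at the given time (Mon=0 .. Sun=6)."""
    return PROFILES["weekend" if now.weekday() >= 5 else "weekday"]
```

In Azure you would express the same intent through the portal’s schedule settings rather than code, but the effect is the same: a different set of scaling bounds depending on when the load arrives.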
Autoscaling takes different forms depending on the product that you’re working with, so let’s take a look at a few of the scenarios available.
The simplest way to add a data-storing back end to a phone application is often to use an Azure Mobile Service. Microsoft essentially offers to manage all the infrastructure for you, so that you don’t have to configure or maintain any virtual machines, completely divorcing the component from the infrastructure. There are some limitations to what is possible in a mobile service, but chances are you’ll want to start here if you’re writing a mobile app and aren’t absolutely certain that a mobile service isn’t suitable.
For autoscaling, Microsoft assigns “units” to a mobile service – which I expect does ultimately mean a virtual machine somewhere – when using the Basic or Standard tier (autoscaling is unavailable on the Free tier). A scaling-out operation is then performed simply against the number of API calls: units are made available as API calls increase, then withdrawn as the calls decrease.
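The shape of that call-driven scaling rule can be sketched in a few lines. The threshold of calls per unit and the unit limits below are assumptions for illustration – Azure’s real thresholds depend on your tier and configuration:

```python
def target_units(api_calls_per_min: int, calls_per_unit: int = 1000,
                 min_units: int = 1, max_units: int = 6) -> int:
    """Scale out one unit per `calls_per_unit` calls, clamped to the tier's limits."""
    needed = -(-api_calls_per_min // calls_per_unit)  # ceiling division
    return max(min_units, min(max_units, needed))
```

So a burst of 2,500 calls per minute would, under these assumed numbers, call for three units, dropping back to one unit once traffic subsides.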
When creating a new virtual machine through the advanced dialog, it is possible to add it to an existing cloud service and subsequently to an availability set. An availability set can then be autoscaled in the same way as a cloud service would be, by turning on virtual machines according to load, or at specific times. It’s important to know that you cannot “quick create” a virtual machine and add it to an existing cloud service, you must go through the longer option of creating “from gallery”.
Availability sets not only grant you flexible scalability, but are actually required by Microsoft to meet their 99.95% Service Level Agreement: essentially allowing for one of the virtual machines to be turned off for scheduled maintenance whilst the others pick up the load.
Once you have at least two virtual machines in an availability set, you can configure autoscaling by CPU usage, or by the number of messages in one of your configured message queues. This is, again, a method of scaling out your service.
Websites running on Azure (like this very one you are reading), can also be configured to Autoscale. Microsoft keeps the process very simple for the Shared and Basic plans – no scaling is available for the Free plan – with the option to scale out by increasing the number of instances of the website available.
For the Standard website plan we gain much more control, as the autoscaling changes to a virtual-machine-based configuration whereby instances are started or stopped according to a target CPU usage range. Note that there is also an option to specify the virtual machine instance size – a manual decision, but the first example in this post of scaling up.
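The target-CPU-range behaviour amounts to a simple control loop: step the instance count up when CPU sits above the band, down when it sits below. The band edges and limits here are hypothetical values of my choosing, not Azure defaults:

```python
def adjust_instances(current: int, cpu_percent: float,
                     target_low: float = 60.0, target_high: float = 80.0,
                     min_instances: int = 1, max_instances: int = 10) -> int:
    """Step the instance count toward a target CPU band (illustrative sketch)."""
    if cpu_percent > target_high and current < max_instances:
        return current + 1   # overloaded: scale out
    if cpu_percent < target_low and current > min_instances:
        return current - 1   # underused: scale in
    return current           # within the band: hold steady
```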
I can’t claim to be any kind of database expert, I’ve worked with colleagues who have given their lives to SQL and I’m in constant awe of their abilities. Whilst I shy away from the specifics I think we all acknowledge that databases are responsible for keeping the tech world turning, and they are certainly core to many Azure products.
With that in mind, I’ll quickly note that autoscaling is available on databases of any tier. However, where the retired “Web” and “Business” tiers simply offer a maximum size – with the database being scaled up as required – the newer Basic, Standard and Premium tiers provide performance-level options, allowing you to scale up the throughput capability of your database. This performance-level scaling would be essential for many big-data projects.
We’ve covered several of Azure’s extensive autoscaling options, but there are other, manual ways to scale, and some further tools available if you’re looking to create something bulletproof.
By far the simplest way to deal with increased load is, of course, to upgrade the virtual machine itself: more cores, more memory, or – in the case of the lower tiers – moving from a shared core to a dedicated one. All of these upgrades scale up the capability of the VM.
If your service runs on virtual machines and is struggling under the workload, then increasing the machine size is a logical first step. It does, however, mean a few moments of downtime while the VM is reconfigured, and it does not provide redundancy in the way an availability set would.
The Traffic Manager
Azure’s Traffic Manager is a simple load balancing service, but one that most users should be considering. For those unfamiliar with a load balancer, it accepts DNS queries, then distributes them across different endpoints which would usually be physical sites (in this case imagine a “site” being a cloud service instance or a virtual machine availability set). Using a load balancer provides redundancy and makes upgrades and maintenance a breeze; one site can be removed from the load balancer whilst it is worked on, then the balancer can be swapped over to work on another site with no apparent outage to the user.
Once a traffic manager is created, a single DNS endpoint is assigned and ready to use. From then on you can configure your traffic manager’s load-balancing method to be round robin, performance, or failover.
Failover can be a useful configuration for scaling, in that when one site hits too much load and begins to fail, traffic will be sent to another site; but the other two options provide more direct methods of scaling out.
In round-robin mode traffic is distributed evenly amongst sites: the first user is sent to the first site, the second user to the second site, and so on. Whilst this is not a particularly smart way of scaling – it requires all sites to be available at all times – it works well and should provide a consistent experience.
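The rotation itself is a textbook pattern. A minimal sketch, with made-up endpoint names standing in for your cloud service instances:

```python
from itertools import cycle

def round_robin(endpoints):
    """Yield the next endpoint for each incoming request, in rotation."""
    return cycle(endpoints)

# Hypothetical endpoints standing in for two deployed sites.
rotation = round_robin(["site-a.example.net", "site-b.example.net"])
```

Each call to `next(rotation)` hands back the next site in turn, looping back to the first once the list is exhausted – exactly the first-user-to-first-site behaviour described above.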
Setting the traffic manager to performance mode is perfect if your Azure service needs to be global. Say you have multiple cloud services set up in Azure at different geographical locations (perhaps one in the US and one in Europe): the traffic manager will send requests to the lowest-latency endpoint, ensuring your users in Europe get the same fast response times as those in America.
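At its core, performance routing is just “pick the endpoint with the lowest measured latency for this client”. A sketch of that decision, with invented region names and latency figures:

```python
def best_endpoint(latencies_ms: dict) -> str:
    """Return the endpoint with the lowest measured latency for a given client."""
    return min(latencies_ms, key=latencies_ms.get)

# Hypothetical measurements from a client in Europe (milliseconds).
measured = {"us-east": 120.0, "eu-west": 35.0}
```

Azure maintains its own latency tables between networks and regions rather than probing per request, but the routing decision reduces to the same comparison.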
So which do I use?
We’ve explored some examples of scaling in the Azure cloud, so which do I recommend you use? Whichever one or several suits your purpose, of course!
Yes it’s a bit of a cliché but your mileage really will vary with each of these scaling options. If you’re running a small website like this one you probably don’t need any scaling for load, and scaling for availability isn’t too important as Azure mostly takes care of that for you.
With that said, if you’re setting up a service with any kind of SLA or uptime requirements in general, you should absolutely be taking advantage of Autoscaling and probably using something like the traffic manager as well. Taking virtual machines as an example, setting up an availability set will give you the impressive 99.95% uptime SLA from Azure, and also allows you to manually turn on multiple virtual machines to cover you whilst you perform any application maintenance/upgrades on them.
The traffic manager, though, is something any high-availability service should be using. By utilising a traffic manager in your software you give yourself complete flexibility for whatever the future may throw at your service, as you’ll be able to build new architecture behind it.
Of course, adding new endpoints to a traffic manager takes time, so unless you’re expecting the increased load you’re likely to be caught out with some downtime. Using a manually configured traffic manager together with Azure’s excellent autoscaling capabilities is bound to be a winning combination.