Building highly available applications with Amazon RDS

Development | Denis Susac

Wednesday, Jul 13, 2016 • 13 min read

A couple of months ago we published a series of posts describing how to set up a fault-tolerant infrastructure for our applications. This time, we'll learn how to use Amazon services to do the same thing while saving a lot of time.

Setting up a private cloud infrastructure, as described in that series, was a lot of fun, but it took us a lot of time. Maintaining this type of “private cloud” environment is not a task for the faint-hearted. Most importantly, you can only scale up within the limits of the existing hardware platform; going above that limit requires purchasing additional servers, and involves additional installation and configuration.

Amazon Web Services (AWS) provides a platform that is well suited for building fault-tolerant software. Similar services are offered by other cloud providers, and most of them provide the infrastructure needed for building fault-tolerant systems that operate with a minimal amount of human interaction and up-front financial investment. We chose AWS because of its sheer size, popularity, and price structure. More importantly, we wanted to avoid complicated and time-consuming administrative tasks such as database installation and upgrades; storage management; replication for high availability and read throughput; and backups for disaster recovery.

Over the past few years, PostgreSQL has become the preferred open source relational database for many developers, and we are using it in most of our new applications. Amazon Relational Database Service (RDS) makes it easy to set up, operate, and scale PostgreSQL databases in the cloud, and also supports Amazon Aurora, MySQL, MariaDB, Oracle and SQL Server. It is an ideal choice both for smaller shops without an experienced DBA and for larger organizations that need to scale their solutions.

Let’s go through the steps needed to set up a fully functional, fault-tolerant environment able to support an ASP.NET application that scales out to multiple servers. Below is a high-level overview of the complete solution.

Overview Diagram

Amazon Virtual Private Cloud (VPC)

Amazon Virtual Private Cloud (VPC) lets you provision a logically isolated section of the “big” Amazon cloud where you can launch AWS resources in a virtual network that you define. Since our database instance only needs to be available to the web server - and not to the public Internet - we create a VPC with both public and private subnets. The web server will be hosted in the public subnet. The database instance will be hosted in a private subnet. The web server will be able to connect to the database instance because it is hosted within the same VPC, but the database instance will not be available to the public Internet, providing greater security.

Sign up for Amazon Web Services and go to the VPC dashboard. If you are not placed in your preferred region by default, choose the AWS region in the upper right corner of the screen. To begin creating a VPC, click the Start VPC Wizard button on the VPC Dashboard.

VPC Wizard

Choose VPC with Public and Private Subnets. In addition to containing a public subnet, this configuration adds a private subnet whose instances are not addressable from the Internet. This creates a /16 network (65531 IP addresses available) with two /24 subnets (251 IP addresses available in each). Public subnet instances use Elastic IPs to access the Internet. Private subnet instances access the Internet via Network Address Translation (NAT).

In Step 2 (see figure below), enter your VPC name (we opted for baasic-vpc), choose the availability zone for the public and private subnets, and enter the names of the public and private subnets (baasic-public and baasic-private in our case). Choose the NAT instance size; click on Use a NAT instance instead if the option is not displayed. You can find a detailed discussion on selecting the right NAT instance size here; we opted for t2.small.

CIDR blocks should be left as they are - 10.0.1.0/24 for the private subnet, and 10.0.0.0/24 for the public subnet. Here is a simple online tool for calculating CIDR ranges.

VPC Wizard Step 2
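
The address counts quoted by the wizard are easy to verify. Here is a minimal sketch using Python's standard ipaddress module; it assumes the wizard's default VPC range of 10.0.0.0/16 and relies on the fact that AWS reserves five addresses in every subnet.

```python
import ipaddress

# AWS reserves five addresses in every subnet (network address, VPC router,
# DNS, one reserved for future use, and the broadcast address).
AWS_RESERVED_PER_SUBNET = 5


def usable_addresses(cidr: str) -> int:
    """Return how many addresses AWS leaves available in a CIDR block."""
    return ipaddress.ip_network(cidr).num_addresses - AWS_RESERVED_PER_SUBNET


# Assumption: 10.0.0.0/16 is the VPC range offered by the wizard by default.
vpc = ipaddress.ip_network("10.0.0.0/16")
public = ipaddress.ip_network("10.0.0.0/24")
private = ipaddress.ip_network("10.0.1.0/24")

print(usable_addresses("10.0.0.0/16"))  # 65531
print(usable_addresses("10.0.0.0/24"))  # 251

# Both subnets must sit inside the VPC range and must not overlap each other.
print(public.subnet_of(vpc), private.subnet_of(vpc))  # True True
print(public.overlaps(private))  # False
```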

Click the Create VPC button and your VPC will be set up and ready in a matter of seconds.

Security groups

Before creating a subnet group, we need to add a second private subnet to the VPC. An Amazon RDS DB subnet group for a DB instance in a VPC needs either two private subnets or two public subnets, placed in different availability zones. Because the RDS DB instance in this tutorial is private, the second subnet has to be private as well. A second public subnet will be needed later, for load balancing web servers, so we might as well create it in this step.

On the VPC dashboard, click on Subnets in the left navigation pane.
Click the Create subnet button.
In the Create subnet dialog, enter the name (baasic-private-2), select the VPC created in the previous step, choose a different availability zone from the one used for the first private subnet, and enter the CIDR block 10.0.2.0/24.
To ensure that the second private subnet uses the same route table as the first private subnet, go to the VPC Dashboard, choose Subnets, and then choose the first private subnet that was created for the VPC. The Route Table tab below the list shows its route table. Note the Current Route Table value, for example, rtb-a23f16c5, and make sure that the second private subnet shows the same value. If it does not, click Edit in the Route Table tab and fix that.
Repeat the steps above to create a second public subnet (baasic-public-2). You can use the next CIDR block, 10.0.3.0/24. Make sure that it uses the “public” route table.
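
With four subnets in play, it is worth double-checking the plan before clicking through the console. The sketch below is only an illustration: the subnet names and CIDR blocks come from this post, while the zone labels (zone-a, zone-b) are placeholders for whatever availability zones you picked. It checks that no two blocks overlap and that both the public and the private pair span two zones.

```python
import ipaddress
from itertools import combinations

# (name, CIDR block, availability zone, kind). Zone labels are placeholders.
SUBNETS = [
    ("baasic-public", "10.0.0.0/24", "zone-a", "public"),
    ("baasic-private", "10.0.1.0/24", "zone-a", "private"),
    ("baasic-private-2", "10.0.2.0/24", "zone-b", "private"),
    ("baasic-public-2", "10.0.3.0/24", "zone-b", "public"),
]


def check_plan(subnets):
    """Return a list of problems found in the subnet plan (empty if none)."""
    problems = []
    # No two subnets may share addresses.
    for (name_a, cidr_a, _, _), (name_b, cidr_b, _, _) in combinations(subnets, 2):
        if ipaddress.ip_network(cidr_a).overlaps(ipaddress.ip_network(cidr_b)):
            problems.append(f"{name_a} overlaps {name_b}")
    # The DB subnet group and the load balancer each need two zones.
    for kind in ("public", "private"):
        zones = {zone for _, _, zone, k in subnets if k == kind}
        if len(zones) < 2:
            problems.append(f"{kind} subnets cover only {len(zones)} zone(s)")
    return problems


print(check_plan(SUBNETS))  # []
```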

We can now create a security group for public access. To connect to public instances in your VPC, you have to add inbound rules to your VPC security group that allow traffic from the public Internet. Go to the VPC Dashboard, choose Security Groups, and then choose Create Security Group.

We used baasic-securitygroup for the Name tag and Group name. You have to enter the group description and choose the VPC it belongs to - that will be the VPC created before. Click on Yes, Create.

To add inbound rules to the security group, find the IP address that you will use to connect to instances in your VPC via RDP. You can use the service at http://checkip.amazonaws.com to easily find your public IP address.

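Click on Security Groups, select the newly created group, and enter the inbound rules. Enter rules for HTTP, HTTPS, RDP, and any other protocol that you need. If you use 0.0.0.0/0 as the source, you enable all IP addresses to access your public instances on that port; this is what a public web site needs for HTTP and HTTPS, while RDP is better restricted to the IP address you found in the previous step.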

Public Subnet Inbound Rules

To keep your Amazon RDS DB instance private, create a second security group for private access. To connect to private instances in your VPC, you add inbound rules to your VPC security group that allow traffic from your web server only.

Create another security group (we called it baasic-db-securitygroup) and choose the previously created VPC. In the Inbound rules, select PostgreSQL (5432) as the Type and add Group ID of the previously created public security group as Source.

Private Subnet Inbound Rules
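
For those who prefer scripting to clicking, the two rule sets can be written down as plain data. The sketch below builds them in the shape that the boto3 EC2 client accepts in the IpPermissions argument of authorize_security_group_ingress. The admin IP address and the group ID are made-up placeholders, and restricting RDP to a single address is our assumption about a sensible setup rather than something the console enforces.

```python
def tcp_rule(port, cidr=None, source_group_id=None):
    """Build one inbound TCP rule for a single port."""
    rule = {"IpProtocol": "tcp", "FromPort": port, "ToPort": port}
    if cidr is not None:
        rule["IpRanges"] = [{"CidrIp": cidr}]
    if source_group_id is not None:
        rule["UserIdGroupPairs"] = [{"GroupId": source_group_id}]
    return rule


def public_rules(admin_ip):
    """Rules for the public group: web traffic from anywhere, RDP from one IP."""
    return [
        tcp_rule(80, cidr="0.0.0.0/0"),          # HTTP
        tcp_rule(443, cidr="0.0.0.0/0"),         # HTTPS
        tcp_rule(3389, cidr=f"{admin_ip}/32"),   # RDP, single address only
    ]


def database_rules(web_group_id):
    """Rules for the private group: PostgreSQL from the web servers only."""
    return [tcp_rule(5432, source_group_id=web_group_id)]


# Placeholder values for illustration only.
for rule in public_rules("203.0.113.10") + database_rules("sg-0123456789abcdef0"):
    print(rule)
```

Referencing the public group's ID as the source, instead of an IP range, means that any web server launched into that group later can reach the database without touching the rules again.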

We are now ready to create an RDS instance.

Amazon Relational Database Service (RDS)

Let’s log into the Amazon RDS dashboard. Go to Instances, Launch DB instance, select PostgreSQL, and select Production if this is a production deployment. The Dev/Test tier creates a Single-AZ db.t2.micro instance with 20 GB of storage, which may be enough for simple test scenarios. We used db.m4.large as the DB instance class and 9.4.7 as the version (newer versions are available), enabled Multi-AZ Deployment, chose General Purpose (SSD) as the Storage Type, and set the Allocated Storage to 400 GB.

A word about Multi-AZ deployments, which turned out to be one of the coolest database features we use in AWS. When you provision a Multi-AZ DB Instance, Amazon automatically creates a primary DB Instance and synchronously replicates the data to a standby instance in a different Availability Zone. Each AZ has its own physically distinct infrastructure. In the case of an infrastructure failure, Amazon RDS performs an automatic failover to the standby, so that you can resume database operations as soon as the failover is complete. In our initial tests, this took less than a minute. Since the endpoint for your DB Instance remains the same after a failover, your application can resume database operation without the need for any manual intervention.
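
Existing connections are still dropped while a failover is in progress, so the application has to be willing to reconnect and try again. The following is a minimal, driver-agnostic sketch of that idea, not the code we run in production: it retries an operation with exponential backoff. The exception types are an assumption; a real database driver raises its own connection errors, which you would list in retry_on.

```python
import time


def run_with_retry(operation, attempts=7, first_delay=1.0, max_delay=30.0,
                   retry_on=(ConnectionError, TimeoutError), sleep=time.sleep):
    """Call operation(); on a connection-level error, wait and try again.

    With the defaults, the waits add up to about a minute
    (1 + 2 + 4 + 8 + 16 + 30 seconds) before the last error is re-raised.
    """
    delay = first_delay
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts, let the caller see the error
            sleep(delay)
            delay = min(delay * 2, max_delay)


# Simulated query that fails twice, as it might during a failover.
calls = {"count": 0}


def flaky_query():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("database is failing over")
    return "ok"


waits = []  # record the waits instead of really sleeping
print(run_with_retry(flaky_query, sleep=waits.append), waits)  # ok [1.0, 2.0]
```

Because the endpoint name stays the same, the retried operation only needs to open a fresh connection to the same host name.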

Some of you might ask why we chose General Purpose instead of Provisioned IOPS. The official recommendation is to use Provisioned IOPS for production applications that require fast and consistent I/O performance, as this storage type delivers fast, predictable, and consistent throughput performance, which is critical in database environments. The problem with it, apart from being pricey, is that it is only cost effective if the volume of reads and writes is steady, because the maximum throughput capacity is always reserved for you. That is rarely a realistic scenario, but you end up paying for that capacity whether you use it or not.

Nicolas Potter discusses an interesting alternative in his post “Why buying Provisioned IOPS on RDS may be a mistake.” In a nutshell, for every GB of General Purpose storage you buy, you get 3 base IOPS at no extra charge, so 100 GB of space gives you 300 “provisioned” IOPS for free. While that might not be good enough for your database load, Amazon has a neat trick: it credits you whenever your DB uses fewer IOPS than your guaranteed rate, and lets you spend those credits to burst above it. This approach does not fit all scenarios, but it lets you save a significant amount of money while getting access to more storage.
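
The arithmetic behind that rule of thumb fits in a few lines. This sketch only models the baseline of 3 IOPS per GB quoted above; it deliberately ignores burst credits and any minimum or maximum limits that AWS applies.

```python
BASE_IOPS_PER_GB = 3  # baseline quoted above for General Purpose (SSD) storage


def baseline_iops(allocated_gb: int) -> int:
    """Baseline IOPS that come with a General Purpose (SSD) volume."""
    return allocated_gb * BASE_IOPS_PER_GB


for size_gb in (100, 400):
    print(f"{size_gb} GB -> {baseline_iops(size_gb)} baseline IOPS")
# 100 GB -> 300 baseline IOPS
# 400 GB -> 1200 baseline IOPS
```

In other words, the 400 GB we allocated comes with a baseline of 1200 IOPS, which is why over-allocating storage can be cheaper than buying Provisioned IOPS.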

Going back to the RDS DB instance creation wizard, enter the DB instance identifier (we used baasic for that), master username and a password. Choose Next Step and set the following values in the Configure Advanced Settings page:

Choose the existing VPC in the VPC field, and leave Create new DB Subnet Group selected in the Subnet Group field.
Set Publicly Accessible to No, and set the VPC Security Group to the “private” security group we created (baasic-db-securitygroup in our case).
Leave the database name empty so that Amazon RDS does not create a database when it creates this DB instance. We will restore the databases from backups later.
Check the backup and maintenance windows if you want to move them to non-peak times.
Create the instance and write down its endpoint. You are going to use it in the application connection strings.
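
The same choices can be captured as data, which is handy if you ever want to recreate the instance from a script. The sketch below collects the wizard settings from this post into the keyword arguments of the boto3 RDS client's create_db_instance call. The credentials, the security group ID and the subnet group name are placeholders that you supply yourself; unlike the wizard, the API expects the DB subnet group to exist already.

```python
def db_instance_settings(master_username, master_password,
                         db_security_group_id, subnet_group_name):
    """Collect the wizard choices from this post as API keyword arguments."""
    return {
        "DBInstanceIdentifier": "baasic",
        "Engine": "postgres",
        "EngineVersion": "9.4.7",
        "DBInstanceClass": "db.m4.large",
        "MultiAZ": True,
        "StorageType": "gp2",           # General Purpose (SSD)
        "AllocatedStorage": 400,        # in GB
        "MasterUsername": master_username,
        "MasterUserPassword": master_password,
        "VpcSecurityGroupIds": [db_security_group_id],
        "DBSubnetGroupName": subnet_group_name,
        "PubliclyAccessible": False,
        # No "DBName" key: we restore the databases from backups later.
    }


def create_instance(settings):
    """Send the request. Needs the boto3 package and AWS credentials."""
    import boto3  # imported lazily so the rest of the sketch runs without it
    return boto3.client("rds").create_db_instance(**settings)


# Placeholder values for illustration only.
settings = db_instance_settings("dbadmin", "change-me",
                                "sg-0123456789abcdef0", "baasic-db-subnets")
print(settings["DBInstanceClass"], settings["MultiAZ"], settings["PubliclyAccessible"])
```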

While we are at it, we might as well set up PostgreSQL parameters for our instance. There is no way to access postgresql.conf or other configuration files in Amazon RDS. Instead, click on the Parameter Groups link in the left navigation. Click on the Create Parameter Group button, and enter a group name (baasic in our case) and a description. Click on Create. This will create a new parameter group with parameters you can edit, which is not the case with the default group. Select the newly created group, click on the Edit Parameters button and change the parameters as required. For example, we routinely change the max_prepared_transactions parameter. Keep in mind that a parameter group only takes effect once it is assigned to the DB instance.
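
A parameter change can be described in the same data-first way. The sketch below builds one entry for the Parameters argument of the boto3 RDS client's modify_db_parameter_group call. The value 100 is purely illustrative, as the right number depends on your application. Since max_prepared_transactions is a static PostgreSQL setting, the change is marked to apply at the next reboot.

```python
def parameter_change(name, value, static=True):
    """Describe one parameter change for a DB parameter group.

    Static parameters only take effect after the instance is rebooted,
    dynamic ones can be applied immediately.
    """
    return {
        "ParameterName": name,
        "ParameterValue": str(value),
        "ApplyMethod": "pending-reboot" if static else "immediate",
    }


def apply_changes(group_name, changes):
    """Send the changes. Needs the boto3 package and AWS credentials."""
    import boto3  # imported lazily so the rest of the sketch runs without it
    return boto3.client("rds").modify_db_parameter_group(
        DBParameterGroupName=group_name, Parameters=changes)


# The value 100 is a made-up example, not a recommendation.
changes = [parameter_change("max_prepared_transactions", 100)]
print(changes)
```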

Amazon Elastic Compute Cloud (EC2)

Now it’s time to set up the web servers. Sign into the AWS Management Console and open the Amazon EC2 console. We are going to set up a Windows web server that will later serve as a template for launching additional servers when we need to scale our applications.

Choose Launch Instance and select Microsoft Windows Server 2012 R2 Base. Choose the appropriate instance size; in our case it was m4.large.
Start with a single instance, select the VPC we created in the first step, choose the public subnet, and enable Auto-assign Public IP.
In Step 4, Add Storage, set the size of the root volume as needed (50 GB in our case).
In Step 5, Tag instance, you may enter the unique name for this instance that will make it easy to recognize it in the future. We have chosen baasic-web-1 as a value for the Name key.
In Step 6, Configure Security Group, choose Select an existing security group and select the “public” group created in the first step (in our case that was baasic-securitygroup).
Select Launch.
On the Select an existing key pair or create a new key pair page, shown on the following screen, choose Create a new key pair and set Key pair name to tutorial-key-pair. Choose Download Key Pair, and then save the key pair file on your local machine. You use this key pair file to connect to your EC2 instance.

To retrieve the administrator password of this instance, right click on the instance in the dashboard and select Get Windows Password. It will ask for a key pair that was generated in the previous step to decrypt the password. The next screen will show the connection information.

You will now be able to connect to the EC2 virtual server via RDP. We usually proceed by installing pgAdmin on this instance and checking the connection to the PostgreSQL instance created in the previous step. When restoring individual databases, we typically first create an empty database via pgAdmin, paying attention to pick up the right settings in the Definition tab - Encoding is usually set to UTF8, Template to template0, Collation to hr_HR.UTF-8, and Character type to hr_HR.UTF-8. These settings should match the settings of the source database that is going to be restored.

Right-click on this newly created database in pgAdmin, select Restore, pick the dump file of the database you want to restore, and click on the Display objects button. You should be able to check the box next to the name of the dump file in the Objects tab, and then click on the OK button.
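
The same two steps can be done without pgAdmin, which is useful when there are many databases to restore. The sketch below is an illustration under a few assumptions: the database name, user name, endpoint and dump file are placeholders, and the dump is in a format that pg_restore understands. It generates the CREATE DATABASE statement with the settings from the Definition tab, and the pg_restore command line that corresponds to the Restore dialog.

```python
def create_database_sql(name, locale="hr_HR.UTF-8"):
    """SQL for an empty database with the settings described above."""
    # Only plain identifiers are accepted, to keep the quoting trivial.
    if not name.isidentifier():
        raise ValueError(f"unsupported database name: {name!r}")
    return (
        f'CREATE DATABASE "{name}" '
        f"WITH ENCODING 'UTF8' TEMPLATE template0 "
        f"LC_COLLATE '{locale}' LC_CTYPE '{locale}';"
    )


def pg_restore_command(endpoint, username, database, dump_file):
    """Argument list for pg_restore, ready to pass to subprocess.run()."""
    return [
        "pg_restore",
        "--host", endpoint,      # the RDS endpoint you wrote down earlier
        "--port", "5432",
        "--username", username,
        "--dbname", database,
        dump_file,
    ]


# Placeholder values for illustration only.
print(create_database_sql("baasic_app"))
print(" ".join(pg_restore_command("baasic.example.rds.amazonaws.com",
                                  "dbadmin", "baasic_app", "baasic_app.dump")))
```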

Now would be the right time to install Application and Web server roles using the Server manager (Server manager – Manage – Add Roles and Features).

Roles and Features Wizard

Select ASP.NET 4.5 under .NET Framework 4.5 Features.

ASP.NET Feature

Check Web Server (IIS) Support in Role Services.

Web Server Role

We also typically install the URL Rewrite module in IIS. The easiest way to do that is to use the Web Platform Installer from within the IIS Manager console.

URL Rewrite Module

Now we are ready to install our ASP.NET applications on this virtual server. After everything is done, we need to create an image of the installed web server that will be used to create additional instances. Before doing that, run the Ec2ConfigService utility on the original instance, go to the Image tab, and select Random in the Administrator Password section. Click on Shutdown with Sysprep, and the machine will shut down after being prepared for creating working images from it.

Let’s create a custom AMI of our server by right-clicking on it in the EC2 console, and choosing Image - Create image.

Create AMI Image

Enter the image name and description, and leave the default values for everything else. Any snapshots backing your new EBS image can be managed on the snapshots screen after successful image creation – see the Images – AMIs menu option on the left side in the EC2 AWS console.

Create AMI Image Details

After the process finishes, you can create new instances by clicking the Launch Instance button in the EC2 dashboard and selecting My AMIs.

My AMIs

You can now create a second EC2 instance from this AMI in the second public subnet (baasic-public-2). Together with the first one, it will be used to balance the user load between multiple web servers.

ElastiCache

Amazon ElastiCache is a web service that makes it easy to deploy, operate, and scale an in-memory cache. It supports two open-source in-memory caching engines, Memcached and Redis. We are using Redis in our projects, and ElastiCache supports Master / Slave replication and Multi-AZ, which can be used to achieve cross zone redundancy.

In the ElastiCache console, click on Cache Subnet Groups, and then click Create Cache Subnet Group.

Enter a name for the group (baasic-cache in our case), enter a description and select the VPC created in the first step.
Click on Add all the subnets, and then remove the public subnets from the list.
In the ElastiCache console, click Get Started Now.
Select Redis.
Enter a Replication group name (we have used baasic-redis), choose the Node Type, and the number of read replicas.
In the Advanced settings, select the previously created cache group (baasic-cache), and choose the default VPC Security Group. After you click on the Launch Replication Group button in the next step, your cache cluster will become ready in several minutes. After the provisioning process is finished, you’ll get the cache endpoint that will be used in the web application.
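
For reference, the console choices above map onto the keyword arguments of the boto3 ElastiCache client's create_replication_group call roughly as in the sketch below. The node type and the description are placeholders, since they depend on your workload. The sketch also makes one assumption explicit: automatic failover only makes sense when there is at least one read replica to promote.

```python
def replication_group_settings(node_type, read_replicas):
    """Collect the Redis replication group choices as API keyword arguments."""
    return {
        "ReplicationGroupId": "baasic-redis",
        "ReplicationGroupDescription": "Redis cache for the web servers",
        "Engine": "redis",
        "CacheNodeType": node_type,
        # One primary node plus the requested number of read replicas.
        "NumCacheClusters": 1 + read_replicas,
        # Failover needs a replica that can be promoted to primary.
        "AutomaticFailoverEnabled": read_replicas >= 1,
        "CacheSubnetGroupName": "baasic-cache",
    }


# The node type is a placeholder for illustration only.
print(replication_group_settings("cache.m3.medium", read_replicas=1))
```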

Certificate Manager

We are about to deploy the load balancer to our infrastructure, but since we plan to use HTTPS, we need to take care of certificates. We are using a wildcard certificate (of the form *.baasic.com) on our web servers, but we need to set up the certificate on the load balancer as well. To do that, go to the Certificate Manager console, and click on the Get started button.

Certificate Manager

Enter the domain name with a wildcard (we have used *.baasic.com).
Click Review and Request, and then Confirm and request.
A certificate approval mail is then sent to the domain owner. Click on a link in the mail message.
Click on the I Approve button. We are good to go!

Certificate Manager

Load Balancers

Go to the Load balancing section of the AWS dashboard and click on Create Load Balancer.
Enter the Load Balancer name (we have used baasic-lb). In the Create LB Inside dropdown, select the VPC created at the beginning (baasic-vpc in our case).
Add HTTPS protocol in the Load Balancer Protocol section.
Add the baasic-public and baasic-public-2 subnets, as they are in different availability zones.

Load Balancer Step 1

In Step 2, choose Select an existing security group and assign baasic-securitygroup.
In Step 3, choose the existing certificate we created in the Amazon Certificate Manager console in the previous step.

Load Balancer Step 3

In Step 4, enter the URL for the health check ping path. An HTTP or HTTPS GET request is issued to the instance on the ping port and the ping path. If the load balancer receives any response other than “200 OK” within the response timeout period, the instance is considered unhealthy. If the response includes a body, your application must either set the Content-Length header to a value greater than or equal to zero, or specify Transfer-Encoding with a value set to ‘chunked’.

Load Balancer Step 4
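
To get a feel for how strict this check is, you can run a similar probe yourself before registering the instances. The sketch below is our own approximation of the rule described above, not the load balancer's actual implementation: an instance counts as healthy only if it answers with status 200 within the timeout. The URL in the usage note is a placeholder.

```python
import time
import urllib.error
import urllib.request


def is_healthy(status_code, elapsed_seconds, timeout_seconds):
    """Apply the rule from the text: only a timely 200 counts as healthy."""
    return status_code == 200 and elapsed_seconds <= timeout_seconds


def probe(url, timeout_seconds=5.0):
    """Issue a GET request against the ping path and judge the response.

    Note: urllib follows redirects on its own, so a redirect that ends in a
    200 looks healthy here, although the redirect itself is not a 200.
    """
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout_seconds) as response:
            status = response.status
    except urllib.error.HTTPError as error:
        status = error.code  # the server answered, but with 4xx or 5xx
    except (urllib.error.URLError, OSError):
        return False  # no answer at all, or the request timed out
    return is_healthy(status, time.monotonic() - start, timeout_seconds)


print(is_healthy(200, 0.2, 5.0))  # True
print(is_healthy(302, 0.2, 5.0))  # False
print(is_healthy(200, 6.0, 5.0))  # False
```

A call such as probe("http://localhost/health.html") on the web server itself is a quick way to confirm that the ping path returns a plain 200 without a login redirect.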

In Step 5, add the two EC2 web server instances we created earlier.

Load Balancer Step 5

Leave everything else as is and click on the Create button. You are now ready to roll out your solution! You can additionally apply autoscaling policies to your infrastructure: you simply define rules that specify when to launch additional EC2 instances and when to terminate them. We usually do not use this feature in smaller-scale deployments, but it is very useful for larger solutions.

Wow, this has to be the longest blog post that I’ve ever written. Anyway, I hope you’ll find it useful when setting up your own solutions based on AWS.