Utilizing the Amazon Web Services (AWS) Storage Gateway to archive backups

Recently I have been exploring some practical, relatively low-cost ways to use cloud services in my lab environment.  During this exploration I found a great video tutorial produced by Luke Miller (a Veeam SE Manager) that walks through how to use the AWS Storage Gateway to place backup copy jobs onto AWS S3 storage.  I liked the video so much that I implemented it in my lab to augment my existing Veeam setup and Vice Versa backup copies.  I have been so impressed with the ease of deployment and the effectiveness of S3 as an archive storage tier for my lab that I wanted to share my setup for anyone interested in doing the same.

The Goal

Just a quick overview of the goal: utilize a 15 TB volume backed by Amazon S3 as a remote backup archive, presented to a local on-premises Windows 2012 R2 application server.

Per Amazon: The AWS Storage Gateway is an on-premises virtual appliance that provides seamless and secure integration between your on-premises applications and AWS's storage infrastructure. The service enables you to securely upload data to the AWS cloud for scalable and cost-effective storage. The AWS Storage Gateway allows you to create iSCSI storage volumes, storing all or just your recently accessed data on-premises for low-latency access, while asynchronously uploading this data to Amazon S3, minimizing the need to scale your local storage infrastructure.  The gateway fits into an existing infrastructure as shown in the diagram provided below - in my case the application server will be running Windows 2012 R2, the host is ESXi 5.5, and I am running both direct-attached storage and a DataGravity NFS datastore.

Deploying the gateway

The gateway VM can be downloaded from your AWS Management Console, which provides a nice step-by-step walkthrough to deploy the gateway.  I am using the Gateway-Cached volume configuration so that I can store most of the cold, archive data up in Amazon S3.  I plan on testing the Virtual Tape and Stored Volume configurations as well, but that is for another post.

I am utilizing ESXi in my lab, so I will be deploying the gateway from an OVA file.  Supported hardware, hypervisor versions, networking, etc. are well documented, so I can begin the deployment.

I will assume that if you are reading this you most likely have no issue deploying an OVA file in your environment, but Amazon does document the process very well in case you need a reference.  Once the gateway is deployed on my ESXi host, AWS recommends validating NTP on the host and syncing the gateway guest VM's time with the host.

Provision Local Disk

The gateway VM does need disk allocated for both its cache storage and its upload buffer.  The guideline is to allocate at least 20 percent of your existing file store for cache storage, and at least 150 GB as an upload buffer.  For this deployment I will be using two 150 GB virtual hard disks - one for the cache storage and one for the upload buffer.  AWS publishes some great guidelines and recommendations if you need further detail on sizing these.
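
Those sizing rules are easy to sanity-check in a few lines. A minimal sketch in Python - the function name is my own, and returning the 150 GB floor for the buffer is my reading of the guideline, not an AWS formula:

```python
CACHE_FRACTION = 0.20   # at least 20 percent of the existing file store (AWS guideline)
MIN_BUFFER_GB = 150.0   # at least 150 GB of upload buffer (AWS guideline)

def gateway_disk_sizes_gb(file_store_gb):
    """Return (cache_gb, upload_buffer_gb) to allocate on the gateway VM."""
    cache_gb = file_store_gb * CACHE_FRACTION
    return (cache_gb, MIN_BUFFER_GB)
```

For a roughly 750 GB file store this works out to a 150 GB cache disk and a 150 GB buffer disk, which matches the two disks I am provisioning here.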

It is important to change the SCSI controller type for these disks to VMware Paravirtual.

Now we are ready to power on the gateway VM, and the last step is to activate it within AWS.  This is done by entering the IP address of the gateway VM once it powers on.  Yes, it is possible to specify a static IP address for your gateway by launching its console and working through the network configuration menu, but in my case I used the DHCP address that it picks up.
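
If you ever want to script activation rather than click through the console, the gateway answers an HTTP request on its IP address with a redirect whose query string carries an activation key. A small helper, assuming that flow (the example URL in the usage below is made up for illustration):

```python
from urllib.parse import urlparse, parse_qs

def activation_key_from_redirect(redirect_url):
    """Pull the activation key out of the redirect the gateway issues.

    The console workflow handles this behind the scenes; this helper is
    only useful when scripting the activation yourself.
    """
    query = parse_qs(urlparse(redirect_url).query)
    return query["activationKey"][0]
```

The extracted key can then be passed to the `aws storagegateway activate-gateway` CLI command; for a one-off lab deployment the console workflow above is simpler.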

AWS does charge a $125 per month fee for each activated gateway.  There is a 60-day free trial, which I am currently using to determine whether I find value in the service - and so far, yes, I am seeing value.
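
To keep an eye on that value question, a back-of-the-envelope estimate helps. The $125 gateway fee comes from the pricing above; the per-GB S3 rate below is an assumption you should replace with current pricing, and the sketch ignores request and data-transfer charges:

```python
GATEWAY_FEE = 125.00      # flat fee per activated gateway per month (AWS pricing above)
S3_PRICE_PER_GB = 0.03    # assumed S3 standard storage rate; check current pricing

def monthly_estimate(stored_gb):
    """Rough monthly bill: flat gateway fee plus storage actually consumed."""
    return GATEWAY_FEE + stored_gb * S3_PRICE_PER_GB
```

The nice part of a cached volume is that you pay for what is actually stored, not the 15 TB provisioned, so a lab holding about 1 TB of backup copies lands somewhere around $155 per month under these assumptions.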

Configuring the local storage of the gateway VM is really just a matter of specifying which disk will be used for the cache storage and which for the upload buffer.  Both of my disks are the same size, so the choice is pretty much a no-brainer here, but it is important to take note of which is which.

Lastly, we provision the capacity of the S3-backed volume that we want to present.  In my case I will be testing a number of different backup and copy jobs, so I provisioned a 15 TB volume.
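
Worth noting if you later script this step: the Storage Gateway API expresses volume capacity in bytes (the `VolumeSizeInBytes` parameter), so a small conversion helper is handy. The commented boto3 call below is a sketch with hypothetical identifiers, not my actual configuration:

```python
def tb_to_bytes(tb):
    """Convert binary terabytes (TiB) to the byte count the API expects."""
    return int(tb * 2**40)

# Sketch only -- substitute your own gateway ARN, target name, and NIC:
# import boto3
# client = boto3.client("storagegateway")
# client.create_cached_iscsi_volume(
#     GatewayARN="arn:aws:storagegateway:...",
#     VolumeSizeInBytes=tb_to_bytes(15),
#     TargetName="veeam-archive",
#     NetworkInterfaceId="10.0.0.10",
#     ClientToken="veeam-archive-1",
# )
```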

Presenting the S3 volume to the Windows application server

Now that I have 15 TB of cloud storage to work with, let's complete the deployment by connecting the dots and presenting this storage to my application server.  This works by using the AWS gateway VM as an iSCSI target, and in my case connecting with the built-in Windows 2012 R2 iSCSI initiator.


Once the disk is visible to the server, it can be brought online, formatted, and assigned a drive letter for use.  I will initially be using this as a Veeam Backup Copy repository, so I assigned it the drive letter V:.

Now the disk is ready to use, and you can start consuming AWS S3 storage natively from your application server. My initial use case is a Backup Copy repository for my lab backups with Veeam, giving me an offsite copy of my backups.


Overall I have been very pleased with the ease of presenting cloud storage to my lab environment; the AWS Storage Gateway setup, configuration, and operation are extremely straightforward.  After running this for a little while we will see what the monthly bill looks like.  Thanks, Luke, for the nice video overview.