Remembering to Clean Up with Terraform

One of my favorite uses of Terraform is to quickly turn up an infrastructure environment with only a few lines of code. Of equal importance is the ability to tear down parts of the environment when they are no longer needed or need to be rebuilt. Terraform helps me leverage elasticity in building, destroying, and rebuilding as necessary.

Reminders

If you are like me, you tend to forget things and need reminders. I have been building out environments now for some time in an automated way, but I am not always the best at remembering to tear them down when I am done. Don’t get me wrong, the act of tearing things down is easy with commands like terraform destroy, but remembering to do so is where I have a gap.

To close that gap, I wanted to create a monitoring and trigger mechanism that would remind me when my infrastructure is sitting idle so that I can go clean it up. Since many of my deployments are in AWS, the two tools I will leverage to accomplish this are CloudWatch and SNS. For those not familiar, CloudWatch is a monitoring and management service from Amazon that provides operational metrics on the health of a given environment. SNS is a notification service that allows you to send messages to a variety of endpoints, including SMS text messages, which are a great way to remind me to do things.

Incorporating Monitoring into My Build

Defining CloudWatch and SNS is relatively easy in Terraform as both resources can be defined using the Terraform AWS provider. Examples for both can be found on the Terraform website, and I have folded them both into a module I created on GitHub.

We will use these resources to monitor when our autoscaling group goes idle, which I define as less than 2% CPU every minute for 5 minutes. When that occurs, the module sends a text message to the supplied phone number. To keep it simple, the module accepts both the autoscaling group to monitor and the phone number to send messages to as variables. There is nothing preventing us from also defining the thresholds and polling intervals as variables, and that is something we should probably do in the future to make the module more robust.
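To make the idle rule concrete, here is a minimal Python sketch of the condition described above: CPU below 2% for each of five consecutive one-minute samples. It only illustrates the rule and is not how CloudWatch evaluates alarms internally; the function name and its defaults are assumptions chosen for illustration.

```python
from typing import Sequence


def is_idle(cpu_percent: Sequence[float],
            threshold: float = 2.0,
            periods: int = 5) -> bool:
    """Return True when the most recent `periods` samples are all below `threshold`.

    Each sample stands for one minute of average CPU utilization (in percent).
    The defaults mirror the rule in this post; both are assumptions you can tune.
    """
    if periods <= 0:
        raise ValueError("periods must be positive")
    if len(cpu_percent) < periods:
        return False  # not enough data yet to call the group idle
    return all(sample < threshold for sample in cpu_percent[-periods:])


if __name__ == "__main__":
    print(is_idle([35.0, 1.2, 0.8, 1.1, 0.9, 1.0]))  # last five samples are quiet
    print(is_idle([1.0, 1.0, 4.5, 1.0, 1.0]))        # one busy minute breaks the run
```

Exposing `threshold` and `periods` as parameters is the same idea as turning the thresholds and polling intervals into module variables.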

Using the Cloud-Watch Module

To make use of this module, we simply need to edit the main.tf file we have been using in development to include the cloud-watch module, which we will call from GitHub. We will pass the name of the auto scaling group created within the webserver_cluster module as an input for monitoring and prompt for the phone number to send the alert message to.

Now when we deploy our fleet, two CloudWatch alarms will be created against the deployed auto-scaling group: one that reports on idle time in a 5 minute window, and another that reports on idle time in a 5 hour window. The idea is that if I miss the first text message, I will get the second so that I can perform a terraform destroy to tear down the environment when it is not being utilized.
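As a rough illustration of how the two windows relate, the sketch below models each alarm as a period length multiplied by a number of evaluation periods, which is how a CloudWatch alarm window is normally sized. The post only states the overall window lengths, so the specific period and evaluation-period values, along with the class and alarm names, are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlarmWindow:
    """Timing settings for one alarm (field names follow the usual CloudWatch terms)."""
    name: str
    period_seconds: int      # length of one evaluation period
    evaluation_periods: int  # consecutive periods that must breach the threshold

    @property
    def window_seconds(self) -> int:
        """Total time the metric must stay below the threshold before the alarm fires."""
        return self.period_seconds * self.evaluation_periods


# Assumed settings: five 1-minute periods and five 1-hour periods.
SHORT = AlarmWindow("idle-5-minutes", period_seconds=60, evaluation_periods=5)
LONG = AlarmWindow("idle-5-hours", period_seconds=3600, evaluation_periods=5)

if __name__ == "__main__":
    for alarm in (SHORT, LONG):
        print(f"{alarm.name}: {alarm.window_seconds // 60} minutes")
```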

Now that I have included the cloud-watch module in my development main.tf file, let’s initialize (terraform init), plan (terraform plan), and deploy (terraform apply).

Notification and Cleanup

I can see that Terraform successfully created my alarms in CloudWatch and tied them to the auto-scaling group it created when deploying the fleet.

Output from running a terraform apply, listing the DNS name and autoscaling group of the server fleet.

CloudWatch Alarm - Two were created, one for 5 minute intervals and the other for 5 hour intervals.

Now when the environment goes idle, an alarm will trigger and send me a text message. Should I not take care of it at that time, another text message will be sent in 5 hours if the environment remains idle.

Text Message from AWS SNS notifying me that my auto-scaling group has had idle CPU for the last 5 minutes.

Since Terraform makes it easy to clean up (terraform destroy), I will be sure to perform that step so that I do not incur costs for unused assets and environments. A terraform destroy cleans up not only the environment it deployed but also the alarms and SNS notifications it created during buildout.

This post is part of a Terraform series.

Managing VMware Proactively with Runecast

As technologists, how do you troubleshoot problems? Does this cycle sound familiar:

  1. bad thing happens

  2. Google the error message

  3. land on a knowledge base/blog article

  4. fix the issue

  5. if not… rinse and repeat.

I am sure we can all relate.

As admins/operators, we are constantly pressed to get things working in short order. As a result, much of our time is spent reactively troubleshooting issues, digging through logs and using Google to try to find a fix. While most of us have the desire to stay on top of issues and become more proactive, the reality is that it is an uphill battle.

Runecast - For VMware Admins, Built by VMware Admins

At VMworld this year I got to learn a little bit about Runecast during their Tech Field Day presentation. Runecast is a company focused on helping VMware admins with the task of keeping their environments healthy, secure and in line with published best practices. The company was founded by a set of practitioners who ran the VMware Center of Excellence for IBM, where they were in the trenches living and breathing VMware for many years. From these trenches arose the idea for a product that would help tackle many of the common operational problems VMware operators face: combing through VMware knowledge base articles and logs, awareness of and adherence to VMware best practices, security compliance checks, and spending too much time troubleshooting. A product was born to address these problems, and it is called Runecast Analyzer.

Runecast Analyzer deploys as a single VM that aggregates information from a variety of sources, including the VMware Knowledge Base, social media, and security hardening guides, and synthesizes it into a central repository. Using this information, the analyzer runs a discovery against the environment to identify potential issues before they can cause outages. The technical details of how this is done are nicely explained in the white board/chalk talk presented by Runecast co-founder, Stanimir Markov.

Runecast Analyzer

Once deployed, Runecast Analyzer can be accessed via its web interface and presents several different views highlighting the health of your VMware environment. Runecast is not a performance and capacity alerting tool (as there are many of those available), but rather places its focus on configuration, manageability, security and VMware best practice conformance. The dashboard below shows the overall health based on those standards and allows you to drill down into items that may be of most importance in your organization.

RunecastDashboard.png

Looking at the inventory view for critical items across this vSphere environment, it is easy to see a series of patches that should be installed on the vCenter and ESXi hosts. The details of this critical alert provide the relative risk rating, KB article reference, and resolution details for how to address the problem. Runecast does not currently provide the ability to take remediation action from within the web interface, but that is something that may be provided in the future.

RunecastPatch.png

Best Practices, Security Hardening, Compliance & VMware KBs

In its first iterations, Runecast Analyzer was focused on analyzing configuration items contained within VMware Knowledge Base articles, best practices and security hardening guides. Recently the Analyzer has been expanded to include log analysis and specific security/compliance standards (DISA STIG and PCI DSS). This means that it can cross check against VMware logs for known issues, as well as call out items that don’t comply with specific security standards.

Below is a shot of the inventory view in which all items can be categorized, sorted and filtered based on what is most important including a categorization by product and impact. It is encouraging to see these new items added into the product, and I can envision additional sources and levels of analysis being included moving forward.

RunecastCategories.png


Try it Out

During the Tech Field Day presentation there was a cool demo of Runecast Analyzer which you should check out, but why not try it for yourself? Runecast provides both an online/interactive demo and a free trial of Runecast Analyzer for you to run in your environment. Also, if you happen to be a vExpert, you can take advantage of their NFR offering. This was my first exposure to Runecast, and overall I would have to say I am highly impressed. This is a product for VMware admins, built by VMware admins and aimed at helping VMware admins move ever closer to proactive management of their environments.

Disclaimer:  I was personally invited to attend Tech Field Day Xtra at VMworld 2018. I was not compensated for my time or travel.  I am not required to blog on any content; blog posts are not edited or reviewed by the presenters or Tech Field Day team before publication.

Creating A CloudMapper Virtual Appliance using Packer

One of my favorite visualization tools for diagramming Amazon Web Services (AWS) environments is Duo CloudMapper. CloudMapper helps you understand visually what exists in your AWS accounts by running a collection against the environment and providing an interactive web page. This is extremely handy for identifying possible network misconfigurations, along with a slew of other benefits. For a full list of why I like this tool, check out my post on How to Visualize Your Cloud Deployments with CloudMapper.

Despite its power, one of the challenges I have found is simply getting it started and working. CloudMapper is open source and built upon other open source products, and I have found that there are inevitably build and dependency issues that suck up my time before I can simply use the tool. For these reasons, and to make things easier in general, I chose to create and deploy CloudMapper as a virtual appliance.

Building the Virtual Appliance

I utilized Packer to provision my CloudMapper virtual appliance. Packer is excellent for creating machine images for multiple platforms from a single source configuration. In this case we will build out an Amazon Machine Image (AMI) with Packer, which will take care of all package installation and dependencies for the build out. You can learn more about all the Packer goodness on the HashiCorp website and Paul Kirby provides a nice overview in his Packer PluralSight course.

  1. Install Packer

  2. Download the cloudmapper.packer template from my GitHub account. (Packer templates are simply JSON files that specify the various components used to create the machine image, and where the build of the image will be saved. In our case we will be creating and deploying our virtual appliance into AWS, but Packer comes with support to build images for Amazon EC2, CloudStack, DigitalOcean, Docker, Google Compute Engine, Microsoft Azure, QEMU, VirtualBox, VMware, and more.)

  3. Specify AWS Credentials for creating our virtual appliance. There are a number of ways to accomplish this but we will use environment variables.

       $ export AWS_ACCESS_KEY_ID="awsaccesskey" 
       $ export AWS_SECRET_ACCESS_KEY="awsecretkey"
  4. Build the image.

    $ packer build -var aws_region="us-west-2" -var ami_id="ami-6cd6f714" -var python_version="3.5.6" cloudmapper.packer
        # aws_region is the region where the image will be stored.
        # ami_id is the base Amazon Linux image in that region.
        # python_version is the Python version of your choice.

    There are currently some issues with CloudMapper and Python 3.7, so I am using the recommended version of 3.5.6.

  5. The build process will take ~10-15 minutes as it needs to compile and pull down all of the components. Once it is complete, Packer will report the ID of your new AMI, which can now be used for deployment.

packami.png
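If you would rather script steps 3 and 4, the following Python sketch checks that the two AWS environment variables are set and assembles the same packer build command shown above. The helper names are hypothetical; the sketch only builds and prints the command, leaving the actual run to you.

```python
import os
from typing import Dict, List, Mapping

REQUIRED_ENV = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def missing_credentials(env: Mapping[str, str]) -> List[str]:
    """Return the names of required AWS variables that are unset or empty."""
    return [name for name in REQUIRED_ENV if not env.get(name)]


def packer_build_command(template: str, variables: Dict[str, str]) -> List[str]:
    """Build the argv for `packer build`, with one `-var key=value` pair per variable."""
    command = ["packer", "build"]
    for key, value in variables.items():
        command += ["-var", f"{key}={value}"]
    command.append(template)
    return command


if __name__ == "__main__":
    missing = missing_credentials(os.environ)
    if missing:
        print("Set these before building:", ", ".join(missing))
    command = packer_build_command(
        "cloudmapper.packer",
        {"aws_region": "us-west-2", "ami_id": "ami-6cd6f714", "python_version": "3.5.6"},
    )
    print(" ".join(command))  # pass `command` to subprocess.run to start the build
```

Keeping the command as a list avoids shell quoting problems if a variable value ever contains spaces.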

Deploying the Virtual Appliance

Now that the image for our virtual appliance is available in AWS, let’s deploy it and run CloudMapper. My preferred way to deploy would be using Terraform, but for the purposes of this post we will walk through the manual steps.

  • Launch an instance using the newly created CloudMapper image. You can accept the defaults, making sure your instance gets a public IP with SSH access.

myami.png

  • Configure CloudMapper by logging in via SSH and performing the final initialization steps. (While these could be automated and built into the image, I get sensitive about saving AWS credentials anywhere, even if my image is private. I prefer to specify them when needed.)

    $ aws configure

    You can specify a full access account to run CloudMapper, but I like least privilege, so I have set up a “Visualization” IAM user with the privileges specified in the CloudMapper readme.

cloudmapperiam.png
myIAMAccount.png
  • Configure CloudMapper’s account information in the config.json file to match your AWS credentials:

    $ cd ~/cloudmapper
    $ pipenv run python3 cloudmapper.py configure add-account --config-file config.json --name AWS_USERNAME --id AWS_ACCOUNT_ID
       # AWS_USERNAME is the “friendly name” tied to the IAM account.
       # AWS_ACCOUNT_ID is the 12-digit ID of the AWS account that the credentials from aws configure belong to.

  • Run CloudMapper’s collection against the environment. The collection phase can take some time, as it is truly pulling all the metadata information for your entire AWS account across all components and regions.

    $ pipenv run python3 cloudmapper.py collect --account AWS_USERNAME
  • Prepare the results and launch the webserver to display them.

    $ pipenv run python3 cloudmapper.py prepare --config config.json --account AWS_USERNAME
    $ pipenv run python3 cloudmapper.py webserver --public

  • Create and attach a security group to the instance to make the site publicly available.

securitygroupcloudmapperweb.png
securitygroupcloudmapperwebassign.png
  • Browse to the public DNS address of your virtual appliance on port 8000.

Please note that these steps run the instance with a publicly available website. You can certainly deploy it to a private subnet and access it through a bastion server or similar, which is the recommended approach. It would also make sense to put this site behind a login, which I have noted as an opportunity for further improvement. Be sure to stop this instance when you are done using it.
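One lightweight way to reduce that exposure is to restrict the security group to your own address instead of opening port 8000 to everyone. The Python sketch below builds such a rule as plain data in the IpPermissions layout used by the EC2 API; the function name and the example address are assumptions, and nothing is created in AWS.

```python
import ipaddress
from typing import Any, Dict


def web_ingress_rule(source_ip: str, port: int = 8000) -> Dict[str, Any]:
    """Describe a TCP ingress rule that admits a single IPv4 source address.

    The dictionary follows the IpPermissions layout used by the EC2 API.
    It is only data here, so calling this function changes nothing in AWS.
    """
    address = ipaddress.ip_address(source_ip)  # raises ValueError on bad input
    if address.version != 4:
        raise ValueError("this sketch only handles IPv4 addresses")
    return {
        "IpProtocol": "tcp",
        "FromPort": port,
        "ToPort": port,
        "IpRanges": [{"CidrIp": f"{address}/32", "Description": "CloudMapper web UI"}],
    }


if __name__ == "__main__":
    # 203.0.113.0/24 is reserved for documentation; substitute your own address.
    print(web_ingress_rule("203.0.113.10"))
```

The resulting dictionary can be entered by hand in the console or passed to whichever AWS tooling you use to manage the security group.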

devstagingprod_cloudmapper.png

Further Improvements

Having a readily available virtual appliance that just works is perfect, but there are some further improvements that I think would be handy:

  • Create a Docker image of CloudMapper that can be run as a container. (There are some folks who have built this.)

  • Save the collection data to an external volume so that it doesn’t live in the running appliance.

  • Create a virtual appliance that can be deployed on other Packer-supported platforms, namely vSphere and Azure.

  • Lock down the website behind a username and password.

How to Visualize Your Cloud Deployments - CloudMapper

As you are aware, I am a big fan of visualizations. In fact, one of my most popular sets of posts centers on using RVTools to collect and visualize a VMware environment. As much of my focus is now centered on cloud deployments, I wanted to highlight some of the tools I have found particularly useful for visualizing AWS and Azure. These are:

  1. CloudMapper

  2. CloudCraft

  3. Hava

CloudMapper

CloudMapper is a tool from Duo Security for visualizing Amazon Web Services (AWS) cloud environments. It was built out of a need to help people perform their jobs more easily by providing simple and interactive visualizations of their AWS account. CloudMapper runs a collection process against your AWS account to prepare and build an interactive visualization of each component along with its connections. Some have called it Google Maps for your AWS account, and to put it simply, CloudMapper shows how your AWS environment actually looks.

To see the level of interaction, check out their online demo of a deployed application in the us-east-1 region. Below is the CloudMapper visualization of the web application deployment highlighted in several of my Terraform posts.

cloudmapper.png

CloudMapper was built by Scott Piper in conjunction with Duo Security and, luckily for us, they have open sourced their work and continue to maintain it. Of the three tools mentioned, it definitely provides the most robust view in terms of connectivity and security for visualizing an AWS environment. To get started using CloudMapper, check out the product page as well as the installation and setup details on GitHub.

Benefits:

  • Especially good for seeing how resources are connected, and visualizing your AWS environment.

  • Interactive web diagram is extremely handy for understanding and validating your deployment.

  • I have found CloudMapper to be the most thorough tool of the three highlighted.

  • Free / Open Source

Nice to Haves:

  • Setup is several steps and more involved compared to the other tools. I did run into a number of compatibility issues with some of the backend Python packages - which reminded me that yes, it is open source.

  • Collection phase can take some time, as it is truly pulling all the metadata information for your entire AWS account across all components and regions.

  • Would be nice to have this exported in different formats - currently supports PNG and JSON only. Visio and PDF are some of the other formats that similar tools support.

  • Support only for AWS; it would be nice to see support for other clouds (Azure, GCP, etc.).