My last blog article received some great attention, which highlights the growing demand for understanding where sensitive data lives on corporate endpoints.  That post walked through the steps taken with a great free product from Veeam Software called Veeam Endpoint Backup FREE, and coupled it with the power of DataGravity's File Analytics on a simple SMB share.

In fact, the popularity of the article spurred a question at a local VMUG: could DataGravity provide the same level of insight into replicated VMs sitting at a DR location?  It just so happens that this customer is already using Veeam Backup & Replication to back up and replicate VMs to a DR environment every night - ready to fail over, should the need arise.  The customer asked, 'Rather than simply replicating these VMs to an otherwise unintelligent data repository, could I replicate to a DataGravity datastore at my DR location to make use of the VM File Analytics?'  This is a great use case and the answer is YES.

You may be asking - Replicated VMs - doesn't DataGravity have to act as your primary storage?  DataGravity is indeed enterprise primary storage, but nothing prohibits it from also serving as a great backup/replication target.  By design, DataGravity natively performs its analytics on VMDK files residing on NFS datastores, regardless of whether those datastores serve as primary storage or as a replication target, and regardless of the VM's power state. This means VMs can be replicated, remain powered off at the DR location, and be analyzed non-intrusively for sensitive information. Very powerful.  Let's see how to do it:

Create Target VM Datastore for Replica VMs

The first step would be to create a VM datastore which will be used as a replication target.  We will call it 'DataMRI' - because we plan on taking a deeper look into the health of our replicated VMs, so the name fits.


This is a simple four-step process, which the DataGravity 'Create Datastore' wizard will walk you through.

  1. Specify a name and size for the datastore
  2. Specify which ESXi hosts the datastore should be attached to
  3. Specify the Discovery Point Policy - this is the frequency at which the Data Analytics will be run on the replicated VMs.
  4. Validate the datastore is attached to your ESXi hosts at the DR location.  You can see that the datastore is online and ready to receive VMs.

Configure Veeam Replication Job

Since this customer is already using Veeam Replication, it is very simple to modify an existing replication job, or create a new one, that specifies the newly created datastore (DataMRI) at the DR location as the replication target.  As I have indicated in previous articles, I like the simplicity and flexibility that Veeam provides for my backups, and their replication engine serves very well in a modern data protection architecture.  It offers image-based VM replicas, built-in WAN Acceleration, and the ability to fail over to and fail back from those replicas.

To create a Replication Job, log into Veeam and step through the New Replication Job setup.

  1. Create a New VMware Replication Job
  2. Specify a name for the Replication Job
  3. Specify the VMs to include for replication.
  4. Specify the Destination to replicate to: Host, Resource Pool, Folder, and the Datastore (DataMRI) that was created in 'Create Target VM Datastore for Replica VMs' above.
  5. Specify a suffix for the replicated VMs.  I like to append a suffix so I know that these are replicas, and I appreciate this option in Veeam.  I used the suffix _DataMRI, and I chose to keep only 1 restore point.
  6. Schedule the time for the replication to occur - I chose every night at 2:00 AM.
  7. Review the summary details and save the job.
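
For readers who like to see settings written down, here is a minimal Python sketch of the choices made in the steps above. It is only a model for illustration and does not call any Veeam API; the class, its field names and the job name 'DR-Replication' are my own assumptions, while the suffix, restore point count and 2:00 AM schedule come from the steps above.

```python
from dataclasses import dataclass
from datetime import datetime, time, timedelta


@dataclass(frozen=True)
class ReplicationJob:
    """Hypothetical model of the job settings; not a Veeam API object."""
    name: str
    vms: tuple
    target_datastore: str = "DataMRI"
    replica_suffix: str = "_DataMRI"
    restore_points: int = 1
    run_at: time = time(2, 0)  # nightly at 2:00 AM

    def replica_names(self) -> list:
        # Each replica keeps the source VM name plus the suffix.
        return [vm + self.replica_suffix for vm in self.vms]

    def next_run(self, now: datetime) -> datetime:
        # Today at run_at if that is still ahead of us, otherwise tomorrow.
        candidate = datetime.combine(now.date(), self.run_at)
        if candidate <= now:
            candidate += timedelta(days=1)
        return candidate


# "DR-Replication" is an illustrative job name, not one from the original setup.
job = ReplicationJob(name="DR-Replication", vms=("DGVDI01",))
print(job.replica_names())
print(job.next_run(datetime(2015, 6, 1, 9, 30)))
```

The suffix makes it easy to tell replicas apart from production VMs when they show up later in the analytics views.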


Replicate VMs

Now that we have the target datastore defined for our VM replicas and the replication job in place, let's replicate. The timing of the replication will of course follow the schedule you specified above in the Veeam Replication job, and we are able to view the status of the job on a per-VM basis.

We can also see the real-time performance on the target DataGravity datastore, as well as the status of the VM replicas which are starting to populate the target.

Now that we have VMs starting to come over to the target DataGravity datastore, let's take a look at the Analytics and information that these VMs are holding.

VM Analytics

To begin, let's take a look at the File Analytics view from within DataGravity.  This view allows us to search and uncover all the details for our replicated VMs and the data which they contain.

Looking at one of the replicated VMs - DGVDI01_DataMRI - we can start to see some critical information.  We can see the Top Users of the VM, the Most Active Users on that VM, Dormant Data and File Growth over time. Additionally, we can see that the VM contains a number of files with Social Security and Credit Card numbers.  There are 15 files with Social Security numbers on this VM, so let's dig into that.

The impressive thing is that the replicated VM doesn't need to be powered on at all to see this level of detail, so it doesn't disrupt the data protection architecture or DR procedure.

Further detail on the makeup of these files, as well as a full listing, is only a click away.  Here is a list of the 15 files stored on this VM that contain Social Security numbers.  This is making tremendous use of our VM Replicas.
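
To give a feel for what flagging a Social Security number involves, here is a minimal Python sketch. It is my own illustration and not DataGravity's detection logic, which this post does not describe; the function name and the sample text are assumptions. It looks for the common NNN-NN-NNNN layout and drops values that are never issued.

```python
import re

# Common NNN-NN-NNNN layout; the word boundaries avoid matching inside longer digit runs.
SSN_PATTERN = re.compile(r"\b(\d{3})-(\d{2})-(\d{4})\b")


def find_ssn_candidates(text):
    """Return SSN-looking strings from text, skipping never-issued ranges."""
    hits = []
    for match in SSN_PATTERN.finditer(text):
        area, group, serial = match.groups()
        # Area 000, 666 and 900-999 are never assigned.
        if area in ("000", "666") or area.startswith("9"):
            continue
        # Group 00 and serial 0000 are never assigned either.
        if group == "00" or serial == "0000":
            continue
        hits.append(match.group(0))
    return hits


sample = "Employee 123-45-6789, placeholder 000-12-3456, tax ID 912-34-5678"
print(find_ssn_candidates(sample))
```

A pattern match is only a candidate; any real tool would need more context than this to keep false positives down.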


Analyze VMs and Find Sensitive Data with Zero Impact

Replication can now serve as more than just a safety net.  Why not include the power of Veeam Replication with DataGravity Analytics inside your backup and DR strategy?  You will not only be ready to fail over in the event of a service disruption, but also be informed of how that data is growing, as well as what sensitive information is being saved within the infrastructure.

Many thanks to my customer base for presenting such a great use case, and allowing me to share.

Finding Sensitive Data on Endpoints with Veeam Endpoint Backup FREE and DataGravity

I have for a long time been a huge fan of Veeam - both as a customer and as a virtualization community member.  I cut my teeth with their FastSCP product (remember that?) to efficiently move files between ESX/ESXi hosts and datastores.  It was awesome, and the best part about it was that Veeam offered it completely for free. In fact, they still do as part of Veeam Backup Free edition.  Fast forward a number of years, and Veeam has done it again.  This time they have released Veeam Endpoint Backup - a completely free standalone solution to help protect Windows endpoints.

Knowing their reputation for developing products that simply 'just work', I was eager to try out this new Endpoint product.  In fact I recently had a customer who asked if they might be able to use the product to save data from some of their Windows clients up to a DataGravity SMB share.  Now that caught my attention, and sure enough 'it just works'.  Let's check out how.

Install Veeam Endpoint Backup Free & Configure Backup

There are several tutorials on the internet to show you how to install Veeam Endpoint Backup, so I will spare you all of the 'Next, Next, Next, Finish' details.  It really is that simple.  I tested this with Windows 7, but it can also run on Windows 8, 2008 R2 and 2012.

Once installed, you simply need to configure the backup of the endpoint.  I chose to back up the entire computer to a shared folder on my DataGravity array which also serves as a backup repository for the Veeam backups of my VMs.  I scheduled this backup to run every night at a specific time, but one cool option is to schedule it to run whenever the backup target is available.  Veeam Endpoint Backup actually throttles the backup activity so it doesn't compete with other applications running on your endpoint, and it doesn't mess around backing up stuff that doesn't matter like temporary and page files.  Very nice.
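
The idea of skipping temporary and page files can be sketched in a few lines of Python. This is purely my own illustration of an exclusion filter, not Veeam's actual exclusion list or logic; the file names, folder names and function name below are assumptions.

```python
from pathlib import PureWindowsPath

# Assumed exclusions for illustration only; not taken from Veeam documentation.
EXCLUDED_FILE_NAMES = {"pagefile.sys", "swapfile.sys"}
EXCLUDED_DIR_NAMES = {"temp", "tmp"}


def should_back_up(path):
    """Return False for page files and anything under a temp folder."""
    p = PureWindowsPath(path)
    if p.name.lower() in EXCLUDED_FILE_NAMES:
        return False
    # Check every folder component above the file itself.
    return not any(part.lower() in EXCLUDED_DIR_NAMES for part in p.parts[:-1])


print(should_back_up(r"C:\pagefile.sys"))
print(should_back_up(r"C:\Users\sales\Documents\expenses.xlsx"))
```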


Run a Backup of your Windows Endpoint

Now that we have configured and started to protect our Windows endpoints, we can check the status of these Veeam restore points very quickly from the Control Panel.  You can open this up from the endpoint itself by selecting the Veeam icon in the system tray.

This will allow you to see the status of all of your restore points, and drill into any of them to initiate a recovery.


To begin a restore, simply select the 'Restore Files' option under any restore point.  This launches the Backup Browser, which allows us to specify the file-level items to restore.  Behind the scenes it opens the appropriate Veeam VBK and VIB files in the backup repository and presents them in a directory tree (mounted to the VeeamFLR directory).  In our case we won't actually be restoring the files to the original endpoint, but rather making use of the Copy function to extract all of these files up to a data-aware SMB share on the DataGravity array named End Point Data.
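
I used the Copy function in the Backup Browser, but the same extraction could be scripted against the mounted directory tree. Below is a minimal Python sketch under that assumption; the function name is my own, and both paths are parameters because the exact mount location and share path will differ per environment.

```python
import shutil
from pathlib import Path


def copy_restored_files(mount_root, share_root):
    """Copy every file under mount_root to share_root, keeping the folder layout."""
    mount_root, share_root = Path(mount_root), Path(share_root)
    copied = []
    for src in sorted(mount_root.rglob("*")):
        if not src.is_file():
            continue
        dest = share_root / src.relative_to(mount_root)
        dest.parent.mkdir(parents=True, exist_ok=True)
        # copy2 also tries to preserve timestamps, not just file contents.
        shutil.copy2(src, dest)
        copied.append(dest)
    return copied
```

Calling `copy_restored_files(mounted_folder, share_path)` returns the list of destination paths, which is handy for a quick sanity check of what landed on the share.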

Checking for Sensitive Information in Endpoint Data

We can now look at the data demographics of this endpoint within DataGravity to identify dormant data, file category growth, top consumers of space, as well as any sensitive items.  Looking at the File Analytics of this endpoint data we can see that there are several files with Credit Card numbers being saved.
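
DataGravity surfaces these demographics for you, but to make terms like dormant data and top consumers of space concrete, here is a rough Python sketch that computes similar numbers for any folder. It is not how DataGravity works internally; the function name and the 365-day dormancy threshold are assumptions chosen for illustration.

```python
import time
from collections import Counter
from pathlib import Path


def summarize_share(root, dormant_days=365, now=None):
    """Return (dormant files, bytes per file extension) for the tree under root."""
    now = time.time() if now is None else now
    cutoff = now - dormant_days * 86400  # seconds in a day
    dormant = []
    bytes_by_ext = Counter()
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        info = path.stat()
        bytes_by_ext[path.suffix.lower() or "(none)"] += info.st_size
        # A file counts as dormant if it has not been modified since the cutoff.
        if info.st_mtime < cutoff:
            dormant.append(path)
    # most_common() lists the largest space consumers first.
    return sorted(dormant), bytes_by_ext.most_common()
```

Modification time is only one possible definition of dormant; last access time would be another reasonable choice.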

Looking at the details of these files containing credit cards, we can see that this endpoint has Excel spreadsheets and Word documents with the Sales Team expense account information.  These include the credit card numbers of the team being stored in clear text.
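
For the curious, spotting a credit card number in clear text usually combines a digit pattern with the Luhn checksum. The Python sketch below shows that general technique; it is my own example and not DataGravity's implementation, and the function names and the 13 to 19 digit range are assumptions. The sample uses a well-known test card number, not a real one.

```python
import re

# 13 to 19 digits, optionally separated by single spaces or hyphens.
CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")


def luhn_valid(digits):
    """Luhn checksum: double every second digit from the right, sum, check mod 10."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_card_candidates(text):
    """Return digit strings that look like card numbers and pass the Luhn check."""
    hits = []
    for match in CARD_PATTERN.finditer(text):
        digits = re.sub(r"[ -]", "", match.group(0))
        if luhn_valid(digits):
            hits.append(digits)
    return hits


sample_text = "Card 4111 1111 1111 1111 on file; mistyped as 4111 1111 1111 1112"
print(find_card_candidates(sample_text))
```

The checksum step is what separates a likely card number from any other long run of digits in a spreadsheet.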

We can also see from the search that there is content being saved out to Dropbox and Google Drive from this endpoint.
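
A simple way to approximate that kind of search yourself is to look for sync client folder names in file paths. This Python sketch assumes the clients use folders named 'Dropbox' and 'Google Drive', which may not hold on every machine; the function name and sample paths are hypothetical.

```python
from pathlib import PureWindowsPath

# Assumed sync folder names; users can rename or relocate these folders.
SYNC_FOLDER_NAMES = {"dropbox": "Dropbox", "google drive": "Google Drive"}


def sync_service_for(path):
    """Return the sync service a Windows path appears to belong to, or None."""
    for part in PureWindowsPath(path).parts:
        service = SYNC_FOLDER_NAMES.get(part.lower())
        if service:
            return service
    return None


print(sync_service_for(r"C:\Users\sales\Dropbox\expenses.xlsx"))
print(sync_service_for(r"C:\Users\sales\Documents\notes.docx"))
```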



For my customer, this series of steps was exactly what they were looking for: (1) getting a backup of their most important PCs, and (2) understanding whether sensitive data is being saved, carried around (laptops), or synced from these PCs.  The economics of the solution certainly couldn't be beat.  This highlights just one use case for the Veeam Endpoint product paired with DataGravity, but it certainly can offer much more: volume-level restores, bare-metal restores with recovery media, integration with Veeam Backup & Replication - the list goes on and on, and is a topic for a separate post.  Nice work Veeam.