Securing Files Containing Sensitive Data

Every day I see more and more sensitive information being saved in places where security is wide open. Credit Cards, Intellectual Property, Social Security Numbers, Private Certificates - you name it, I have seen it. So I wanted to build on a series of recent posts to demonstrate how PowerShell can be used to help secure sensitive files. In this workflow we will identify sensitive files using the DataGravity Discovery system, secure them with PowerShell, and validate our updates.

THE WORKFLOW

  1. Identify files containing sensitive data with DataGravity and export the file names (CSV format)
  2. Run the ChangeFilePermissions.ps1 PowerShell script
  3. Validate Permission Changes and restore original permissions if required.

Identify Sensitive Data

In earlier posts I have highlighted easy ways to find sensitive data using DataGravity search and dynamic tagging. One example of these sensitive tags is Social Security numbers lurking in unsecured files. The simple search below returns a list of files residing on a public share.

We can export this information out to a CSV file and use it as an input parameter in the next step.

THE SCRIPT

The full ChangeFilePermissions PowerShell script is available in my PowerShell repo on GitHub. Let's look at an example of how to run it:

ChangeFilePermissions.ps1 -ShareFilePath "\\CorporateDrive\Public" -csvFilePath "c:\temp\public.csv" -SensitiveTag "SS" -logFile "C:\Temp\FilesPermissionChanges.log"

Script parameters:

-ShareFilePath is the path to the share where the files containing sensitive data live. In our example it is the public share.

-csvFilePath is the path to the exported CSV listing all files, including those that contain sensitive information. This is an export from a DataGravity search.

-SensitiveTag is the sensitive tag(s) to look for when selecting which files to secure (e.g. SS, CC, Email Address, etc.)

-logFile is an optional path for logging which files have been secured.
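
The published script handles the details, but the core idea boils down to this: import the CSV, keep the rows tagged with the sensitive tag we asked for, and stamp a deny rule onto each matching file's ACL. Here is a minimal sketch of that loop - the CSV column names (FileName, Tags) and the "deny Everyone read" rule are illustrative assumptions on my part, so check them against your actual export and the script on GitHub:

  param(
    [string]$ShareFilePath,
    [string]$csvFilePath,
    [string]$SensitiveTag,
    [string]$logFile
  )

  # Assumed CSV columns: FileName and Tags - check your DataGravity export
  $files = Import-Csv -Path $csvFilePath | Where-Object { $_.Tags -match $SensitiveTag }

  foreach ($file in $files) {
    $fullPath = Join-Path $ShareFilePath $file.FileName
    if (Test-Path $fullPath) {
      $acl = Get-Acl -Path $fullPath
      # Illustrative rule: deny Everyone read access to the file
      $rule = New-Object System.Security.AccessControl.FileSystemAccessRule("Everyone", "Read", "Deny")
      $acl.AddAccessRule($rule)
      Set-Acl -Path $fullPath -AclObject $acl
      if ($logFile) { Add-Content -Path $logFile -Value "Secured $fullPath" }
    }
  }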

Securing Files with Sensitive Data

It is worth emphasizing that when automating changes to security permissions we must BE CAREFUL and have our UNDO button handy. Just as fast as you can automate a process, you can have a royal mess on your hands. Check out my UNDO button later in the post.

In the example below, we are running the ChangeFilePermissions.ps1 script against the public folder to deny access to all files that contain social security numbers. The script can be modified to include other sensitive tags or a combination of tags.

We secured 30 files on the share by changing their security access.  This is validated by looking at the activity timeline within DataGravity, which confirms that the 'set ACL' operation was performed on the files.

VALIDATE SECURITY

We can also validate that the security of the files was updated by attempting to access one of them and verifying that we no longer have permission to view it. This is confirmed by i) the output log from the script and ii) the Security tab of the file's properties.
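
If you prefer to spot-check from PowerShell as well, Get-Acl shows the new deny entry directly (the file path below is just an example):

  Get-Acl -Path "\\CorporateDrive\Public\example.xlsx" |
    Select-Object -ExpandProperty Access |
    Format-Table IdentityReference, FileSystemRights, AccessControlType -AutoSize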

The UNDO Button

I personally always like to have a back-out plan when making a large number of changes - after all, who doesn't like an UNDO button? DataGravity's Discovery Points work very well as my UNDO button, so I recommend creating a manual Discovery Point before running the script.

This gives us the ability to restore any or all of the modified files to their original security settings. You can see that it is easy to view previous versions of any file and restore them if needed, including the original permissions.
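
Discovery Points are my UNDO button of choice here, but if you also want a script-side fallback, one option is to capture each file's original security descriptor before changing it and re-apply it later with Set-Acl. This is not part of the published script - just a hedged sketch of the idea, with an illustrative path and log location:

  # Illustrative path - in practice this would run inside the script's file loop
  $fullPath = "\\CorporateDrive\Public\example.xlsx"

  # Before the change: save the original security descriptor as an SDDL string
  $originalSddl = (Get-Acl -Path $fullPath).Sddl
  Add-Content -Path "C:\Temp\OriginalAcls.csv" -Value ('"{0}","{1}"' -f $fullPath, $originalSddl)

  # To undo: rebuild the ACL from the saved SDDL and re-apply it
  $acl = Get-Acl -Path $fullPath
  $acl.SetSecurityDescriptorSddlForm($originalSddl)
  Set-Acl -Path $fullPath -AclObject $acl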

I hope you find this walkthrough and script valuable in making your environment more secure.

Finding Duplicate Files with PowerShell

Let's explore a script that leverages DataGravity's file fingerprints to identify the top 10 duplicate files on a given department share or virtual machine.

The Workflow

  1. Export fingerprints and file names to File List (CSV format)
  2. Run the FindDuplicateFiles.ps1 PowerShell script
  3. List the top 10 duplicate files and the space they are consuming

Files and Fingerprints

DataGravity makes it easy to identify files and their unique SHA-1 fingerprints on a share or virtual machine (VMware or Hyper-V).  In this example we are going to gather the file names and fingerprints in the Sales department share.

The Script:

FindDuplicateFiles.ps1 -csvFilePath "c:\temp\sales.csv" -top 10

Script parameters:

-csvFilePath is the path to the CSV file we downloaded in the first step which contains a list of the files and file fingerprints.  This is an export from DataGravity's Search.

-top is an optional parameter that, if specified, limits the output to that number of duplicate files.
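
Under the hood the script really just groups the CSV rows by fingerprint: any fingerprint that shows up more than once is a duplicate, and the wasted space is the file size times the number of extra copies. A condensed sketch of that logic is below - the column names (FileName, Fingerprint, Size) are assumptions about the export, so adjust them to match your CSV and see the full script on GitHub for the real thing:

  param(
    [string]$csvFilePath,
    [int]$top = 10
  )

  # Assumed CSV columns: FileName, Fingerprint and Size (in bytes)
  Import-Csv -Path $csvFilePath |
    Group-Object -Property Fingerprint |
    Where-Object { $_.Count -gt 1 } |
    ForEach-Object {
      [pscustomobject]@{
        Fingerprint = $_.Name
        Copies      = $_.Count
        # Space consumed by the redundant copies (all but one)
        WastedBytes = ([long]$_.Group[0].Size) * ($_.Count - 1)
        Example     = $_.Group[0].FileName
      }
    } |
    Sort-Object -Property WastedBytes -Descending |
    Select-Object -First $top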

Listing and Validating Duplicates

Let's run the script to return the top 10 duplicate files and their file sizes.

These results are easy to validate, as the example below returns the duplicate files consuming the most space.
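
If you want to double-check a suspected pair outside of DataGravity, you can hash the files yourself - matching SHA-1 values confirm the duplicate (the paths below are illustrative):

  Get-FileHash -Algorithm SHA1 -Path "\\CorporateDrive\Sales\Q3-Deck.pptx", "\\CorporateDrive\Sales\Archive\Q3-Deck.pptx" |
    Format-Table Hash, Path -AutoSize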

The full PowerShell script is listed below and is available in my PowerShell repo on GitHub.

Deleting Dormant Data with PowerShell

One of my favorite forms of managing data is to DELETE it.  One of my favorite ways to delete things is with SPEED and CONFIDENCE.

I have been quoted as saying that "DELETE is the best form of de-duplication" - in fact it is 100% dedupe. Some of the best data to DELETE is the stuff that no one is using: dormant data.  So putting my automation hat on, let's explore a script that helps DELETE things quickly but still provides us with the ability to UNDO using DataGravity File Analytics for Dormant Data.

The workflow:

  1. Export Dormant Data to CSV File List
  2. Run the ArchiveDormantData.ps1 PowerShell script
  3. Optionally create an archive TXT stub noting that the file has been deleted
  4. Validate space savings and recover individual files if required.  

Identify Dormant Data:

DataGravity makes it easy to identify and download a list of all the dormant data.  In this example we are going to grab anything that hasn't been updated, read or touched within a year or more on the Marketing share.

The script:

ArchiveDormantData.ps1 -ShareFilePath "\\CorporateDrive\Marketing" -csvFilePath "c:\temp\Marketing.csv" -logFile "C:\Temp\DormantDataDelete.log" -ArchiveStub

Script parameters:

-ShareFilePath is the path to the share where the dormant data to be deleted lives. In our example it is the Marketing share.

-csvFilePath is the path to the CSV file we downloaded in the first step, which contains a list of the files to be deleted. This is an export from DataGravity's Dormant Data.

-logFile is an optional path for logging what has been removed.

-ArchiveStub is an optional parameter that, if specified, creates a TXT stub in place of each deleted file.
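
As before, the full script is on GitHub, but the core loop is straightforward: for each file in the CSV, delete it, optionally drop a TXT stub in its place, and log what happened. A minimal sketch is below - the FileName column and the stub wording are my own assumptions, so the published script may differ:

  param(
    [string]$ShareFilePath,
    [string]$csvFilePath,
    [string]$logFile,
    [switch]$ArchiveStub
  )

  # Assumed CSV column: FileName (relative to the share)
  foreach ($row in (Import-Csv -Path $csvFilePath)) {
    $fullPath = Join-Path $ShareFilePath $row.FileName
    if (Test-Path $fullPath) {
      Remove-Item -Path $fullPath -Force
      if ($ArchiveStub) {
        # Leave a breadcrumb so users know where the file went
        Set-Content -Path "$fullPath.archived.txt" -Value "Deleted as dormant data on $(Get-Date)."
      }
      if ($logFile) { Add-Content -Path $logFile -Value "Deleted $fullPath" }
    }
  }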

Validate and Recover if necessary (The undo button)

If you get anything wrong or delete the wrong thing, it is always handy to have an UNDO button. There are several ways to do that using backup/recovery tools, and in this case, since we are already using DataGravity, we can create a manual Discovery Point before making the changes and restore any files if required.

The full PowerShell script is listed below and is available in my PowerShell repo on GitHub. Big thanks to Will Urban for the heavy lifting on this one. Happy DELETING.