Preparing for a File Share Migration to SharePoint with Power BI
Here at Envision IT we've helped many clients with migrations to Office 365. This often involves migrating content off legacy on-premises file shares into SharePoint Online and OneDrive. Clients typically have multiple file shares, often containing decades of legacy documents in deep folder trees.
These file shares typically take two forms. One is personal drives, where each user has their own private folder to keep their own content in. These migrate quite easily to the user's OneDrive library, where each Office 365 user gets 1 TB of private storage. Cleanup may be desirable before migrating, but often that is left to the individual users.
The other type is department or organizational file shares. These are more difficult to plan for. We strongly discourage just fork-lifting the entire structure into a SharePoint library. While often painful, this is the time to do some serious housekeeping and tidy things up. It is also the time to reorganize the content, ideally into a Teams-centric set of SharePoint site collections and folders that correspond to organizational, department, and project Teams and Channels.
The first step to prepare for this migration is to inventory what is in these file shares. This can be done at a high level by selecting all the folders at the top level of a file share in Windows Explorer, and selecting Properties. This will give you a count of the files, folders, and their total size, which is a good place to start.
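The same high-level tally can be produced from PowerShell instead of Windows Explorer. A minimal sketch, assuming a hypothetical share path of `\\server\share`:

```powershell
# Quick high-level inventory: count all files under the share and sum their sizes.
# \\server\share is a placeholder - substitute your own file share path.
$filePath = "\\server\share"
$stats = Get-ChildItem $filePath -Recurse -File -ErrorAction SilentlyContinue |
    Measure-Object -Property Length -Sum

# Report the totals, converting bytes to gigabytes
"{0} files, {1:N1} GB total" -f $stats.Count, ($stats.Sum / 1GB)
```

Unlike the Properties dialog, this approach can be scripted across many shares and the results captured for comparison later.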
To go deeper, we usually use Power BI. While it has a built-in connector specifically for inventorying a file system, we don't actually use it. The challenge with that approach is that tweaking the data in Power Query requires a live connection to the data source, which we often don't have in this situation. Instead, we use a simple PowerShell script.
```powershell
# Read in the parameters
$filePath = Read-Host "Enter the full path to inventory"
$outputFile = Read-Host "Enter the full path and filename for the output CSV file"

# Get the file details
$files = Get-ChildItem $filePath -Recurse |
    Where-Object { ! $_.PSIsContainer } |
    Select-Object Attributes, CreationTime, LastAccessTime, LastWriteTime,
        IsReadOnly, Length, Extension, FullName, Name, Directory

# Get the owner information from the file ACLs
$owners = $files.FullName | Get-Acl

# Go through the two arrays and merge them into one
$fileCount = $files.Count
$results = for ($i = 0; $i -lt $fileCount; $i++) {
    # Create the merged array item
    [PSCustomObject]@{
        Attributes     = $files[$i].Attributes
        CreationTime   = $files[$i].CreationTime
        LastAccessTime = $files[$i].LastAccessTime
        LastWriteTime  = $files[$i].LastWriteTime
        IsReadOnly     = $files[$i].IsReadOnly
        Length         = $files[$i].Length
        Extension      = $files[$i].Extension
        FullName       = $files[$i].FullName
        Name           = $files[$i].Name
        Directory      = $files[$i].Directory
        Owner          = $owners[$i].Owner
    }
}

# Output the CSV results
$results | Export-Csv $outputFile -Encoding UTF8 -NoTypeInformation
```
The script doesn't collect any of the contents of the files, just the metadata, such as filename, dates created and modified, file type, and size. Care still needs to be taken, though, as filenames themselves can contain confidential information. The person who runs the script needs to have access to all of the folders in the file share.
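Before running the full inventory, it can help to verify that the account actually has access everywhere. A pre-flight sketch, assuming the same hypothetical `\\server\share` path, that lists any folders the current account cannot read:

```powershell
# Pre-flight check: walk the folder tree and capture any access-denied errors,
# so permission gaps can be fixed before running the inventory script.
# \\server\share is a placeholder - substitute your own file share path.
$filePath = "\\server\share"
Get-ChildItem $filePath -Recurse -Directory `
    -ErrorAction SilentlyContinue -ErrorVariable accessErrors | Out-Null

# Each error record's TargetObject is the path that could not be read
$accessErrors | ForEach-Object { $_.TargetObject }
```

An empty result means the account can traverse the whole tree; otherwise, each listed path needs permissions granted (or a conscious decision to exclude it) before the inventory will be complete.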
In order to collect the data, follow these steps:
- Open a PowerShell window, or the Windows PowerShell ISE
- Paste the above script in the window
- Enter the full path to inventory, and the path and filename to save the output CSV file into
Once you have collected the data, download a copy of our Power BI File Share Analysis.zip file. If you'd like to play around with the queries and need some sample data, that is also in the zip file.
To update the Power BI file to work with your CSV data file, follow these steps:
- Download Power BI Desktop from https://powerbi.microsoft.com/en-us/desktop/ if you don't already have it
- Download and open the above PBIX file
- Open Power Query from the Transform data button in the ribbon
- Click on the FilePath parameter in the list of Queries on the left
- Enter the path to your CSV file
- Click on the BasePath parameter
- Enter the appropriate base path to remove from your dataset
- Click Close and Apply to apply your changes and return to the dashboard
- Explore your dataset