Treesize Script

I’ve been trying to create a script that returns the largest folders on a drive. From what I can tell, Get-ChildItem is my only real option. I have tried dir /s and even Joakim Svendsen’s “folder sizes using COM and by default with a fallback to robocopy.exe” function, but nothing seems to give much of an improvement in speed.
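For reference, the COM part of that function boils down to something like this (a bare-bones sketch of the idea rather than Svendsen’s actual code, which wraps it in error handling and the robocopy fallback; the path is just a placeholder):

$fso = New-Object -ComObject Scripting.FileSystemObject
# Size is calculated by the COM object and includes all sub-folders, but it throws on
# folders the current user can't read, which is why a fallback is needed
'{0:N2} GB' -f ($fso.GetFolder('C:\Users').Size / 1GB)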

I have tried three different approaches…

Recursive sub-folder enumeration

This solution runs Get-ChildItem a very large number of times, and the majority of files on the system are counted multiple times.

Surprisingly, this solution works really well. On my test drive it completes in about 60 seconds and consumes about 200 MB of RAM. However, the memory requirements increase sharply on systems with a large number of files and folders, such as file servers. This just feels inefficient to me, so I started looking at alternatives.

function getSubFolderSizes {
    param(
        [string]$targetFolder,
        [Int64]$minSize = 1GB,
        [Int64]$parentSize,
        [String]$indent = " ",
        [Int]$depth = 1
    )
    if ($depth -le 3) { # depths greater than this threshold cause the function to return $null, signaling that the higher-level folder should be used
        # Get a list of sub-directories in the current targetFolder
        $colItems = Get-ChildItem $targetFolder -Directory -ErrorAction SilentlyContinue |
            Where-Object { $_.FullName.Substring(0,1) -eq $targetFolder.Substring(0,1) }
        # Check each sub-folder
        foreach ($i in $colItems) {
            # Measure the size of all files in this sub-folder
            Get-ChildItem -Depth 1000 -LiteralPath $i.FullName -File -Force -Recurse -ErrorAction SilentlyContinue |
                Measure-Object -Property Length -Sum -ErrorAction SilentlyContinue |
                Where-Object { $_.Sum -ge $minSize } |
                ForEach-Object {
                    # This sub-folder's size is above the threshold
                    # Add it to our output
                    ("{0,15}{1,7}{2,-1}" -f ("{0:N2}" -f ($_.Sum / 1GB) + " GB"),
                                            ("{0:P0}" -f ($_.Sum / ($lowestVolume.Size - $lowestVolume.SizeRemaining)) + " "),
                                            ($indent + $i.FullName))
                    # And dig deeper
                    getSubFolderSizes $i.FullName $minSize $_.Sum (" " + $indent) ($depth + 1)
                }
        }
    }
}

$minFolderSize = 1GB
$lowestVolume  = Get-Volume C
getSubFolderSizes ($lowestVolume.DriveLetter + ":") $minFolderSize ($lowestVolume.Size - $lowestVolume.SizeRemaining)

A single Get-ChildItem for the entire drive and then recursing through this list

This solution requires saving the results of that single GCI in memory, and doing so consumed ALL available memory. Piping the output to a file on my test system produced only an 80 MB file, but reading it back into a variable consumed gigabytes of RAM. Go figure.
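To make the idea concrete, a stripped-down sketch of this approach looks something like the following (not the exact script, just the general shape; the function name is made up):

# One big GCI held in memory...
$allFiles = Get-ChildItem -LiteralPath 'C:\' -File -Force -Recurse -ErrorAction SilentlyContinue |
    Select-Object FullName, Length

# ...and per-folder totals computed by filtering that in-memory list.
# $folder should include a trailing backslash so 'C:\Users\' doesn't also match 'C:\Users2\'
function getFolderSizeFromList {
    param([string]$folder)
    ($allFiles |
        Where-Object { $_.FullName.StartsWith($folder, [System.StringComparison]::OrdinalIgnoreCase) } |
        Measure-Object -Property Length -Sum).Sum
}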

I attempted to write the results to a file and then stream them back in, but my calculation routine required non-sequential access to the results, and I couldn’t find a nice way of doing that.

At any rate, the single top-level GCI command also took longer to complete than the entire recursive script (not including the actual file size calculations). This was a very counter-intuitive result for me, since it counts each file only once. It seems this command doesn’t scale well at all, with the result that a large number of small GCIs runs faster than a single large one.
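For anyone who wants to reproduce that comparison, Measure-Command can time the two enumeration styles side by side (the path below is just a placeholder; the per-sub-folder version skips files sitting directly in the root, which is fine for a rough timing):

# One large enumeration of everything under the target
Measure-Command {
    Get-ChildItem -LiteralPath 'D:\Data' -File -Force -Recurse -ErrorAction SilentlyContinue | Out-Null
}

# Many smaller enumerations, one per top-level sub-folder
Measure-Command {
    Get-ChildItem -LiteralPath 'D:\Data' -Directory -ErrorAction SilentlyContinue | ForEach-Object {
        Get-ChildItem -LiteralPath $_.FullName -File -Force -Recurse -ErrorAction SilentlyContinue | Out-Null
    }
}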

A single Get-ChildItem that simply calculates the size of each folder regardless of sub-folders and presents the largest X results

This solution trades usefulness for performance. It completes in only 30 seconds and uses almost no RAM, regardless of the number of files and folders, but it completely fails to report on a folder containing hundreds of sub-folders, each with a small amount of data, e.g. every user’s iTunes backup folder.

param(
    [String]$Path,
    [Int]$Count = 10,
    [Int64]$MinSize = 1GB
)

$tempOutput = New-Object PSObject -Property @{ Length = 0; Path = $null }

Get-ChildItem -Depth 1000 -LiteralPath $Path -File -Force -Recurse -ErrorAction SilentlyContinue |
    Select-Object Directory, Length |
    ForEach-Object {
        if ($_.Directory.FullName -eq $tempOutput.Path) {
            $tempOutput.Length += $_.Length
        } else {
            if ($tempOutput.Length -ge $MinSize) {
                $tempOutput | Select-Object Path, @{ Name = "Size (GB)"; Expression = { "{0:N2}" -f ($_.Length / 1GB) } }
            }
            $tempOutput.Length = 0
            $tempOutput.Path = $_.Directory.FullName
        }
    } |
    Sort-Object "Size (GB)" -Descending |
    Select-Object -First $Count
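Saved as a standalone .ps1 (the file name below is just an example), it gets called like this:

.\Get-TopFolders.ps1 -Path 'C:\' -Count 20 -MinSize 500MB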

So, at the end of the day I don’t have a solution that I’m comfortable using.

Any critiques or bright ideas?

submitted by /u/Servinal