Hunting for Hashes: Algorithm Unknown? No problem!

Intro

Hashes are a fundamental tool in technical fields. Hash values are routinely used to verify data integrity, for example confirming that a file was not altered during transfer, and to detect known malicious files through hash hunting. In security operations, threat hunting for known indicators is a core activity, often performed with tools such as Endpoint Detection and Response (EDR) and Security Information and Event Management (SIEM) systems.

Scenario

During internal investigations, hashes can prove invaluable. Suppose mislabeled documents containing sensitive information were uploaded to a network share accessible to unauthorized individuals. The hash values already on record for those documents can then be used to locate any copies. In this case no EDR sensor was available, so a manual search was needed on the computers of everyone who had access to the share, to find and remove the files if they existed.

To add to the challenge, the recorded hash values had not been generated with a single algorithm. The list was a mix of SHA256, MD5, and SHA1 hashes, which adds a small inconsistency to work around.

Solution

The script below addresses this challenge by searching for files that match a list of hashes, regardless of which algorithm produced each hash, as long as that algorithm is listed in the $Algorithms array. Which algorithms are actually usable depends on your operating system and enforced policies.
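Because the usable algorithms depend on the host, it can help to check them before starting a long hunt. The following is a minimal sketch and not part of the original script: 'Test-HashAlgorithm' is a hypothetical helper name, and the only assumption is that Get-FileHash is available (PowerShell 4.0 or later). A host that enforces FIPS-compliant cryptography, for example, may reject MD5.

```powershell
# Hypothetical helper (not part of the original script): reports which
# Get-FileHash algorithms actually work on this host.
function Test-HashAlgorithm {
    param(
        [string[]]$Algorithms = @("MD5", "SHA1", "SHA256", "SHA384", "SHA512")
    )

    # Hash a small throwaway file so no real data is touched
    $Probe = [System.IO.Path]::GetTempFileName()
    try {
        Set-Content -LiteralPath $Probe -Value "probe"

        foreach ($Algorithm in $Algorithms) {
            try {
                # -ErrorAction Stop makes failures terminating so the catch block sees them
                Get-FileHash -LiteralPath $Probe -Algorithm $Algorithm -ErrorAction Stop | Out-Null
                $Algorithm   # emit the name only when hashing succeeded
            }
            catch {
                Write-Warning "$Algorithm is not usable here: $($_.Exception.Message)"
            }
        }
    }
    finally {
        Remove-Item -LiteralPath $Probe -ErrorAction SilentlyContinue
    }
}

# Names of the algorithms that succeeded on this machine
$Usable = Test-HashAlgorithm
```

The returned names could be used to trim the $Algorithms array. The hash-hunting script itself follows.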

<#

.SYNOPSIS
find_hashes_from_list.ps1: finds files on a host that match a provided list of hashes.

.DESCRIPTION
This PowerShell script is designed to help Cyber Security Incident Response teams identify files on a host that match a provided list of hashes. The list may mix hashes generated with different algorithms, as long as each algorithm is included in the '$Algorithms' array. If you know which algorithm produced the hashes in your list, trim the '$Algorithms' array to that algorithm alone to speed up the script.

.EXAMPLE
.\find_hashes_from_list.ps1

.NOTES
Pick the method of preference to load the hashes before the hash hunt begins. Comment out methods you choose not to use.
Method 1: Space-delimited array (easy copy/paste)
Method 2: Load hash list from a '.txt' file
Method 3: Load hash list from a '.csv' file where the column name containing hashes is named 'Hashes'
Modify the following variables before execution:
1. $OutputFile
2. $Paths
3. $Hashes

.OUTPUTS
$OutputFile

# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
#>

# Define the path of the output file
$OutputFile = "C:\temp\file_hashes_found.txt"

# Define an array of paths to start searching recursively (this variable can also use single path input)
$Paths = "C:\path\to\search\1", "C:\path\to\search\2"


<#
############
# Method 1 #
############
# This method takes the hashes as a single whitespace-delimited string, then uses the '-split' operator to turn it into an array of hashes
$Hashes = "hash1 hash2 hash3 hash4"
# Trim first and split on any run of whitespace so stray spaces from copy/paste do not create empty entries
$Hashes = $Hashes.Trim() -split '\s+'
#>


<#
############
# Method 2 #
############
# Load the list of hashes from the '.txt' file
$Hashes = Get-Content -Path "C:\path\to\hashes.txt"
#>


<#
############
# Method 3 #
############
# Import the hashes from a CSV file where the column name of the hash list is 'Hashes'
$Hashes = Import-Csv -Path "C:\path\to\hashes.csv" | Select-Object -ExpandProperty Hashes
#>



###################
# Start Hash Hunt #
###################

# Define an array of algorithms to use
$Algorithms = @("MD5", "SHA1", "SHA256", "SHA384", "SHA512")

# Loop through each path
foreach ($Path in $Paths) {

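    # Use Get-ChildItem to search recursively for files in the path.
    # -File skips directories (which cannot be hashed); unreadable folders are skipped quietly
    $Files = Get-ChildItem -Path $Path -Recurse -File -ErrorAction SilentlyContinue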

    # Loop through each file found
    foreach ($File in $Files) {

        # Loop through each algorithm
        foreach ($Algorithm in $Algorithms) {

            # Use Try/Catch to handle any errors that may occur when calculating the hash
            try {

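                # Use Get-FileHash to calculate the hash of the file using the current algorithm.
                # -LiteralPath avoids wildcard expansion of file names containing [ or ];
                # -ErrorAction Stop turns failures into terminating errors so the catch block sees them
                $Hash = Get-FileHash -LiteralPath $File.FullName -Algorithm $Algorithm -ErrorAction Stop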

                # Check if the hash matches one of the hashes in the array
                if ($Hashes -contains $Hash.Hash) {

                    # Use Add-Content to write the output to the file
                    Add-Content -Path $OutputFile -Value "$($File.FullName) ($Algorithm): $($Hash.Hash)"
                }
            }
            catch {

                # Press on: the file could not be hashed (locked, access denied) or the algorithm is unavailable
                continue
            }
        }
    }
}
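For long hash lists, the same idea can be sketched as a reusable function that returns objects instead of writing text lines. This is an illustrative variant under stated assumptions, not the original author's script: the name 'Find-FileByHash' and its parameters are hypothetical, and the '::new()' syntax assumes PowerShell 5.0 or later. It stores the hashes in a case-insensitive HashSet so each lookup is fast regardless of list size.

```powershell
# Hypothetical function; the name and parameters are illustrative only.
function Find-FileByHash {
    param(
        [Parameter(Mandatory)][string[]]$Paths,
        [Parameter(Mandatory)][string[]]$Hashes,
        [string[]]$Algorithms = @("MD5", "SHA1", "SHA256", "SHA384", "SHA512")
    )

    # Case-insensitive set: lookups stay fast however many hashes are supplied
    $Wanted = [System.Collections.Generic.HashSet[string]]::new([System.StringComparer]::OrdinalIgnoreCase)
    foreach ($Hash in $Hashes) {
        # Trim stray whitespace from pasted values
        [void]$Wanted.Add($Hash.Trim())
    }

    Get-ChildItem -Path $Paths -Recurse -File -ErrorAction SilentlyContinue | ForEach-Object {
        $File = $_
        foreach ($Algorithm in $Algorithms) {
            try {
                $Result = Get-FileHash -LiteralPath $File.FullName -Algorithm $Algorithm -ErrorAction Stop
            }
            catch {
                # File could not be hashed with this algorithm; try the next one
                continue
            }

            if ($Wanted.Contains($Result.Hash)) {
                # Emit one object per match so the caller decides how to store it
                [pscustomobject]@{
                    Path      = $File.FullName
                    Algorithm = $Algorithm
                    Hash      = $Result.Hash
                }
            }
        }
    }
}

# Usage against a throwaway folder (the folder and file names are examples only)
$Demo = Join-Path ([System.IO.Path]::GetTempPath()) ("hashhunt_" + [guid]::NewGuid())
New-Item -ItemType Directory -Path $Demo | Out-Null
$Sample = Join-Path $Demo "sample.txt"
Set-Content -LiteralPath $Sample -Value "sensitive"

# Lowercase on purpose, to show that matching ignores case
$Known = (Get-FileHash -LiteralPath $Sample -Algorithm SHA1).Hash.ToLower()
$Found = Find-FileByHash -Paths $Demo -Hashes $Known

Remove-Item -LiteralPath $Demo -Recurse -Force
```

Because the function emits objects, the results can be piped to Export-Csv or Out-File as needed rather than being tied to a single output format.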

Let's see this code in action. For this task, I have three file hashes to hunt down, and I do not know which hashing algorithm produced each one.

Hash 1: D7EE485737448009075F45E7A47450B33B1D8E32
Hash 2: 51F41F467CF0F6E7E8714085E18529A435E9ABC6AB27103F66B37E0192CF9040
Hash 3: 06E66584763AC876543C4DBF8F22791D

Using Method 1 of the script provided above, I place these hashes into the '$Hashes' array, set '$Paths' to the root of the drive ('C:\'), and execute the script. The output, if any, is written to "C:\temp\file_hashes_found.txt", one line per match, showing the location of the file, the algorithm that produced the match, and the hash value that matched.
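As an aside, the length of a hash already hints at its algorithm, which can shorten the hunt. The sketch below is an assumption-laden shortcut and not part of the original script: 'Get-LikelyHashAlgorithm' is a hypothetical helper, and it considers only the five algorithms in the script's '$Algorithms' array (other algorithms can produce digests of the same length). By length, the three hashes above would map to SHA1, SHA256, and MD5 respectively.

```powershell
# Hypothetical helper: guesses the algorithm from the number of hex characters.
# Only the five algorithms from the script's $Algorithms array are considered.
function Get-LikelyHashAlgorithm {
    param([Parameter(Mandatory)][string]$Hash)

    switch ($Hash.Trim().Length) {
        32      { "MD5" }       # 128-bit digest
        40      { "SHA1" }      # 160-bit digest
        64      { "SHA256" }    # 256-bit digest
        96      { "SHA384" }    # 384-bit digest
        128     { "SHA512" }    # 512-bit digest
        default { "Unknown" }
    }
}

$Hashes = @(
    "D7EE485737448009075F45E7A47450B33B1D8E32",
    "51F41F467CF0F6E7E8714085E18529A435E9ABC6AB27103F66B37E0192CF9040",
    "06E66584763AC876543C4DBF8F22791D"
)

# Unique list of algorithms worth computing for this particular hash list
$Algorithms = $Hashes |
    ForEach-Object { Get-LikelyHashAlgorithm -Hash $_ } |
    Where-Object { $_ -ne "Unknown" } |
    Sort-Object -Unique
```

Restricting '$Algorithms' this way avoids computing SHA384 and SHA512 for every file when no hash in the list could match them.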

Conclusion

Using hashes to identify files is a precise and efficient method. A hashing algorithm produces a fixed-length value, called a hash value, that is for practical purposes unique to a given file's contents, although MD5 and SHA1 are known to be vulnerable to deliberately crafted collisions. By comparing each file's hash against a list of known values, a script can quickly locate the desired files even when the list is long, eliminating the need for manual searches. Additionally, hash values remain constant as long as the file contents remain unchanged, whatever the file is named or wherever it is stored, making them invaluable for identifying files or file changes in fields such as computer forensics, data management, and software development.
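The point about hash values following the contents rather than the file name can be checked with a short experiment. This is a minimal sketch using throwaway files in the temp folder; the file names and contents are arbitrary examples.

```powershell
# Throwaway files in the temp folder; names and contents are arbitrary
$Original = [System.IO.Path]::GetTempFileName()
$Copy     = "$Original.renamed"

Set-Content -LiteralPath $Original -Value "quarterly report"
Copy-Item -LiteralPath $Original -Destination $Copy

# Same contents under a different name
$HashOriginal = (Get-FileHash -LiteralPath $Original -Algorithm SHA256).Hash
$HashCopy     = (Get-FileHash -LiteralPath $Copy -Algorithm SHA256).Hash

# Append to the copy so its contents differ from the original
Add-Content -LiteralPath $Copy -Value "!"
$HashEdited   = (Get-FileHash -LiteralPath $Copy -Algorithm SHA256).Hash

"Same contents, different name, hashes equal: $($HashOriginal -eq $HashCopy)"
"After editing the copy, hashes equal:        $($HashOriginal -eq $HashEdited)"

Remove-Item -LiteralPath $Original, $Copy
```

The first comparison should come out true and the second false, which is why a renamed or relocated copy of a document is still found by its hash while an edited copy is not.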