Splunk Lantern

Typosquatting clicks on a network

 

Recently, several users on your network have fallen victim to phishing attempts that used typosquatting. Typosquatting is a technique that alters a few letters of a legitimate domain in the hope of leading a user to a malicious one, for example www.mycampany.com instead of www.mycompany.com. It relies on users not reading URLs carefully. As part of ongoing efforts to educate your users, you'd like to compile a list of the typosquatted domains they have visited. You already have a list of legitimate domains that your users commonly visit.

Required data

Firewall data

Procedure

This sample search uses Palo Alto Networks data. You can replace this source with any other firewall data used in your organization. In addition, you must install the URL Toolbox app, because the search relies on its ut_parse_extended and ut_levenshtein macros.

Run the following search. You can optimize it by specifying an index and adjusting the time range.

sourcetype=pan:threat url=*
| stats count BY url
| eval list="mozilla" 
| `ut_parse_extended(url, list)`
| stats sum(count) AS count BY ut_domain
| where ut_domain!="<legitimate domain.com>"
| eval company_domain="<legitimate domain.com>"
| `ut_levenshtein(ut_domain, company_domain)`
| eval ut_levenshtein= min(ut_levenshtein)
| where ut_levenshtein < 3
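
To make the first half of this search concrete outside Splunk, the following is a rough Python sketch of reducing URLs to domains and counting events per domain. It is an illustration only, not how URL Toolbox is implemented. The names registered_domain, count_by_domain, and MULTI_LABEL_SUFFIXES, along with the sample URLs, are hypothetical, and the three-entry suffix set is a deliberately tiny stand-in for a full top-level domain list.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Hypothetical, deliberately tiny stand-in for a full suffix list.
# A real list has thousands of entries.
MULTI_LABEL_SUFFIXES = {"co.uk", "com.au", "co.jp"}

SCHEME = re.compile(r"^[a-zA-Z][a-zA-Z0-9+.-]*://")


def registered_domain(url: str) -> str:
    """Return an approximate registered domain for a URL or bare host."""
    # Firewall logs often omit the scheme; urlsplit needs "//" to find the host.
    if not SCHEME.match(url):
        url = "//" + url
    host = (urlsplit(url).hostname or "").lower()
    labels = host.split(".")
    if len(labels) < 2:
        return host
    # Keep three labels when the last two form a known multi-label suffix.
    keep = 3 if ".".join(labels[-2:]) in MULTI_LABEL_SUFFIXES else 2
    return ".".join(labels[-keep:])


def count_by_domain(urls):
    """Count events per domain, similar in spirit to stats ... BY ut_domain."""
    return Counter(registered_domain(u) for u in urls)


# Hypothetical sample of values from the url field.
sample_urls = [
    "www.mycompany.com/login",
    "www.mycampany.com/login",
    "http://mail.mycampany.com/reset?id=1",
    "shop.example.co.uk/cart",
]
domain_counts = count_by_domain(sample_urls)
print(domain_counts)
```

Subdomains such as www and mail collapse into the same domain, which is why the search groups by ut_domain rather than by the raw url value.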

Search explanation

The following table explains what each part of this search does. You can adjust this query based on the specifics of your environment.

Splunk Search Explanation

sourcetype=pan:threat 

Search only threat events from Palo Alto Networks data.

url=*

Search data with a value in the url field.

| stats count BY url

Show the event count for each value in the url field.

| eval list="mozilla" 

Set the list field to mozilla so that the Mozilla catalog of top-level domains is used when the URLs are parsed.

This eval command is required for the next line in the search (ut_parse_extended) to work.

| `ut_parse_extended(url, list)`

Parse the URLs based on the Mozilla top level domain list.

The punctuation surrounding a Splunk macro is always a back tick (`), not a single quote (').

| stats sum(count) AS count BY ut_domain

Total the number of times each value in the ut_domain field appeared and group the results by those values.

| where ut_domain!="<legitimate domain.com>"

Show only results where the domains extracted are not the legitimate domain.

| eval company_domain="<legitimate domain.com>"

Establish the legitimate domain as the one to compare the other domains against.

| `ut_levenshtein(ut_domain, company_domain)`

Run the ut_levenshtein macro against the newly extracted ut_domain field to compare each domain to the legitimate domain.

| eval ut_levenshtein= min(ut_levenshtein)

Extract the minimum Levenshtein score. The Levenshtein score is the number of single-character changes (insertions, deletions, or substitutions) needed to transform one string into another.

| where ut_levenshtein < 3

Show only results where the Levenshtein score is less than 3. 
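
To see what the Levenshtein score in the last steps measures, the following Python sketch computes it with the standard dynamic-programming approach and then applies the same kind of filter as the search. It is a plain illustration under stated assumptions, not the URL Toolbox implementation. The function names levenshtein and likely_typosquats, the max_distance parameter, and the observed counts are all hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, or substitutions."""
    # previous[j] is the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or keep a match
            ))
        previous = current
    return previous[-1]


def likely_typosquats(domain_counts, legitimate_domains, max_distance=2):
    """Return (domain, count, score) for near misses of any legitimate domain."""
    legit = set(legitimate_domains)
    hits = []
    for domain, count in domain_counts.items():
        if domain in legit:
            continue  # same idea as the where ut_domain!=... step
        # Keep the smallest score across all legitimate domains.
        score = min(levenshtein(domain, good) for good in legit)
        if score <= max_distance:  # for the default, same as "less than 3"
            hits.append((domain, count, score))
    return sorted(hits, key=lambda hit: hit[2])


# Hypothetical per-domain event counts.
observed = {
    "mycompany.com": 120,
    "mycampany.com": 4,
    "mycompanny.com": 2,
    "example.org": 37,
}
results = likely_typosquats(observed, ["mycompany.com"])
for domain, count, score in results:
    print(domain, count, score)
```

Here mycampany.com (one substituted letter) and mycompanny.com (one inserted letter) each score 1 against mycompany.com, while the unrelated example.org scores well above the cutoff and is dropped.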

Next steps

You can use the domains you find to create training that helps network users understand how they are likely to be fooled. In addition, you can set up alerts so you know when users are accessing sites that have a high probability of being malicious. A Levenshtein threshold of 1 to 2 is good for alerting. With anything more than 2, you risk getting lots of false positives.

Finally, you might be interested in other processes associated with these use cases: