
Geographically improbable access detected

 

Concurrent authentication attempts from multiple IP addresses can indicate unauthorized sharing of credentials, or even stolen credentials. Improbable travel anomalies, such as logging in from two geographically distant locations at the same time, can be an indicator of exploited or misused credentials. You want to search for this type of anomalous authentication attempt.

Required data

Authentication data. This sample search uses AWS CloudTrail data. You can replace this source with any other authentication data used in your organization.

Procedure

  1. Identify all relevant IT assets from a data mapping exercise conducted by the Data Privacy Officer's team. These are all IT assets relevant to the full audit trail of data processing activities: not only data stores and repositories that house sensitive personal data (PD) and personally identifiable information (PII), but also any technologies involved in the processing, storage, transmission, receipt, rendering, encryption, decryption, relaying, or handling of such data in any capacity. 
  2. Ensure that those assets are configured properly to report logging activity to an appropriate central repository. 
  3. Use your data mapping results to build a lookup that associates systems to their system category.
  4. Run the following search. You can optimize it by specifying an index and adjusting the time range.
sourcetype=aws:cloudtrail user=*
|sort 0 user, _time 
|streamstats window=1 current=f values(_time) AS last_time values(src) AS last_src by user 
|lookup <name of lookup you created in step 3> accountId 
|where isnotnull(category) AND last_src != src AND _time - last_time < 8*60*60 
|iplocation last_src 
|rename lat AS last_lat lon AS last_lon City AS last_City Country AS last_Country Region AS last_Region 
|eval location = last_City . "|" . last_Country . "|" . last_Region 
|iplocation src 
|eval rlat1 = pi()*last_lat/180, rlat2=pi()*lat/180, rlat = pi()*(lat-last_lat)/180, rlon= pi()*(lon-last_lon)/180 
|eval a = sin(rlat/2) * sin(rlat/2) + cos(rlat1) * cos(rlat2) * sin(rlon/2) * sin(rlon/2) 
|eval c = 2 * atan2(sqrt(a), sqrt(1-a)) 
|eval distance = 6371 * c, time_difference_hours = round((_time - last_time) / 3600,2), speed=round(distance/time_difference_hours,2) 
|fields - rlat* a c 
|eval day=strftime(_time, "%m/%d/%Y") 
|stats values(accountId) values(awsRegion) values(eventName) values(distance) values(eval(mvappend(last_Country, Country))) AS Country values(eval(mvappend(last_City, City))) AS City values(eval(mvappend(last_Region, Region))) AS Region  values(lat) values(lon)  values(userAgent) max(speed) AS max_speed_kph min(time_difference_hours) AS min_time_difference_hours BY day user distance

Search explanation

The following pairs explain what each part of this search achieves. You can adjust this query based on the specifics of your environment.

sourcetype=aws:cloudtrail

Search only AWS CloudTrail data.

user=* 

Return events where the user field is set to a value. 

|sort 0 user, _time 

Sort all the results (0 means no limit, so the entire result set is sorted) by user and then by time, in ascending order.

|streamstats window=1 current=f values(_time) AS last_time values(src) AS last_src by user 

For each user, carry forward the previous event's timestamp and source IP as last_time and last_src (window=1 looks back exactly one event, and current=f excludes the current event). For example, if a user logs in at 09:00 from one IP and at 10:00 from another, the 10:00 event carries the 09:00 timestamp and IP in these fields.

|lookup <name of lookup you created in step 3> accountId 

Look up the account ID in the categorization lookup you created.
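
For example, the lookup could be a simple CSV that maps AWS account IDs to a system category from your data mapping exercise. This is a hypothetical sketch; the account IDs and category names are placeholders:

accountId,category
111111111111,in-scope-production
222222222222,in-scope-hr

Events whose accountId does not appear in the lookup receive no category field, which is what the isnotnull(category) filter in the next step uses to discard out-of-scope systems.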

|where isnotnull(category) AND last_src != src AND _time - last_time < 8*60*60 

Filter for logins that are in scope (the lookup returned a category), where the source IP changed, and where the previous login was less than eight hours earlier (8*60*60 seconds), a window too short for travel between distant parts of the globe.

|iplocation last_src 

Extract location information from the MaxMind database that is built into Splunk Enterprise.

|rename lat AS last_lat lon AS last_lon City AS last_City Country AS last_Country Region AS last_Region 

Rename the location fields returned for the previous source IP so that the second iplocation call does not overwrite them. The last_City, last_Country, and last_Region fields are used again in the final stats command.

|eval location = last_City . "|" . last_Country . "|" . last_Region 

Combine the city, country, and region of the previous source IP into a single pipe-delimited location field.

|iplocation src 

Resolve the current source IP address.

|eval rlat1 = pi()*last_lat/180, rlat2=pi()*lat/180, rlat = pi()*(lat-last_lat)/180, rlon= pi()*(lon-last_lon)/180
|eval a = sin(rlat/2) * sin(rlat/2) + cos(rlat1) * cos(rlat2) * sin(rlon/2) * sin(rlon/2)

|eval c = 2 * atan2(sqrt(a), sqrt(1-a))
|eval distance = 6371 * c, time_difference_hours = round((_time - last_time) / 3600,2), speed=round(distance/time_difference_hours,2)

Calculate the shortest distance between two points on the surface of the earth (the great-circle distance) from the two pairs of latitude and longitude, then derive the elapsed time in hours and the implied travel speed in kilometers per hour. These evals implement the haversine formula, a numerically stable way to compute great-circle distance; the formula is shown below for reference. 
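
For reference, with the two latitudes φ1, φ2 and longitudes λ1, λ2 converted to radians (the rlat1, rlat2, rlat, and rlon evals above), the haversine formula is:

a = \sin^2\left(\frac{\varphi_2-\varphi_1}{2}\right) + \cos\varphi_1 \cos\varphi_2 \sin^2\left(\frac{\lambda_2-\lambda_1}{2}\right)

c = 2\,\operatorname{atan2}\left(\sqrt{a}, \sqrt{1-a}\right), \qquad d = R\,c

where R = 6371 km is the Earth's mean radius, so distance is in kilometers and speed in km/h.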

|fields - rlat* a c

Remove the intermediate calculation fields from the results.

|eval day=strftime(_time, "%m/%d/%Y") 

Return the date of the event.

|stats values(accountId) values(awsRegion) values(eventName) values(distance) values(eval(mvappend(last_Country, Country))) AS Country values(eval(mvappend(last_City, City))) AS City values(eval(mvappend(last_Region, Region))) AS Region  values(lat) values(lon)  values(userAgent) max(speed) AS max_speed_kph min(time_difference_hours) AS min_time_difference_hours BY day user distance

Collect all the values into one row per user, per day, and per distance (that is, per pair of locations). 

This example shows specific AWS data fields. If you're using another log source, such as VPN logs, you might choose other fields, as sketched below.
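
As a hypothetical sketch, assuming a VPN sourcetype named vpn:authentication with user and src fields (adjust both to match your VPN logs), the opening of the search would become:

sourcetype=vpn:authentication user=* src=*
|sort 0 user, _time 
|streamstats window=1 current=f values(_time) AS last_time values(src) AS last_src by user 
|where last_src != src AND _time - last_time < 8*60*60 

The location, distance, and speed calculations that follow carry over unchanged.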

Next steps

Below is a sample table showing two users who have logged in from two locations that are geographically improbable given the time differences. (Not all fields from the search above are shown.) 

City        Country         day       distance    max_speed_kph   user
Singapore   Singapore       1/23/18   15562.4     44464.07        alice
San Jose    United States   1/23/18   3213.199    321319.86       chuck

False positives fall into two big buckets. The first is unreliable IP geolocation, particularly outside major economic areas (for example, the US and the larger countries in Western Europe); the free MaxMind GeoIP database that ships with Splunk Enterprise tends to be less accurate in these locations, which leads some customers to add the paid version to their Splunk installations. The second is centralized IP space, such as someone in the US using a Korean VPN service, or a networking service that routes nationwide traffic through a single set of IPs. For example, years ago all traffic from a major US cellular carrier originated from the same IP space, which geolocated to Ohio.
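
One way to reduce noise from centralized IPs is to maintain a lookup of known shared egress addresses (VPN concentrators, carrier NAT ranges, and so on) and filter them out before the distance calculation. This is a hypothetical sketch; the lookup name and fields are assumptions:

|lookup known_shared_egress_ips src OUTPUT is_shared 
|where isnull(is_shared) 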

When this search fires, reach out to the user involved to find out whether they know why their account was used in two places, and review what actions were taken, particularly if one of the locations was unusual. If the user cannot explain the activity, also ask whether they have shared their credentials with anyone else. You can also review what other activity occurred from the same remote IP addresses, as in the sketch below.
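
For example, a pivot search along these lines summarizes other activity from a suspect address. This is a sketch; adjust the sourcetype and fields to your environment, and substitute the address in question:

sourcetype=aws:cloudtrail src=<suspect IP address>
|stats count earliest(_time) AS first_seen latest(_time) AS last_seen values(eventName) AS events BY user
|convert ctime(first_seen) ctime(last_seen)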

Additionally, you should monitor the mapped IT assets for changes in logging status, adjust for known outages, and prioritize incident response for any hosts that fail to report and are not scheduled for downtime.
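
A sketch of such a monitoring search follows, assuming a 24-hour reporting threshold and a lookup named asset_categories that maps hosts to their system category (both are assumptions to adapt):

| tstats latest(_time) AS last_seen WHERE index=* BY host 
| lookup asset_categories host 
| eval hours_silent = round((now() - last_seen)/3600,1) 
| where isnotnull(category) AND hours_silent > 24 
| sort - hours_silent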

GDPR Relevance: Detecting and proving that only authorized individuals access, handle, and process personal data is an industry best practice and can be considered an effective security control, as required by Article 32. Demonstrating that credentials are not being misused or exploited helps prove compliance during data privacy audits by authorities (Article 58) and helps counter compensation claims (Article 82). This applies to the processing of personal data by the controller, and must also be addressed when contractors or sub-processors from third countries or international organizations access and transfer personal data (Article 15). Ensure you have a GDPR-mandated audit trail with individual user accounts for each processor accessing personal data. 

Finally, you might be interested in other processes associated with the Complying with General Data Protection Regulation use case.