Splunk Lantern

First time accessing an internal Git repository

Developers are often granted access to the Git (or other software lifecycle) repositories that their responsibilities require. The first time a user accesses a given repository could be perfectly normal or, if the repository contains code not relevant to the developer's role, could be an anomaly worth investigating. You want to monitor first-access instances so you can investigate if needed.

Required data

Code management data

Procedure

This sample search uses Atlassian Bitbucket as a source. You can replace this source with any other web server data used in your organization.

Run the following search. You can optimize it by specifying an index and adjusting the time range.

source="*/atlassian-bitbucket-access.log" 
|rex "GET /projects/[^/]*/repos/(?<git_repo>[^/]*)"
|rex "(?<git_repo>[^/]*)\.git"
|rex "git\.[^ /]{1,}/projects/[^/]*/repos/(?<git_repo>[^/]*)"
|search  git_repo="*"
|stats earliest(_time) AS earliest latest(_time) AS latest  BY user, git_repo
|where earliest > relative_time(now(), "-1d@d")
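
As a hedged illustration of the optimization mentioned above, the following variant restricts the search to a single index and an explicit time window. The index name bitbucket and the 30-day window are assumptions made for this example, not values from the sample. Substitute whatever applies in your environment.

```spl
index=bitbucket source="*/atlassian-bitbucket-access.log" earliest=-30d@d latest=now
| rex "GET /projects/[^/]*/repos/(?<git_repo>[^/]*)"
| rex "(?<git_repo>[^/]*)\.git"
| rex "git\.[^ /]{1,}/projects/[^/]*/repos/(?<git_repo>[^/]*)"
| search git_repo="*"
| stats earliest(_time) AS earliest latest(_time) AS latest BY user, git_repo
| where earliest > relative_time(now(), "-1d@d")
```

The time window matters because "first access" here means first access within the period searched. A longer window gives the search more history to compare against, at the cost of a slower search.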

Search explanation

The table provides an explanation of what each part of this search achieves. You can adjust this query based on the specifics of your environment.

Splunk Search Explanation

source="*/atlassian-bitbucket-access.log"

Pull in your Git dataset.

Your source may be different from the Atlassian sample shown here. Update the source according to your environment.

|rex "GET /projects/[^/]*/repos/(?<git_repo>[^/]*)" 

|rex "(?<git_repo>[^/]*)\.git"

|rex "git\.[^ /]{1,}/projects/[^/]*/repos/(?<git_repo>[^/]*)"

Extract the repository name into the git_repo field.

These regular expressions have worked in a couple of environments, but you may need to adapt them to yours.

|search  git_repo="*"

Filter for logs that include a git_repo field.

|stats earliest(_time) AS earliest latest(_time) AS latest  BY user, git_repo

Calculate the earliest and the latest time that each user accessed each repo.

|where earliest > relative_time(now(), "-1d@d")

Return results where the earliest access time falls after the start of the previous day. The -1d@d modifier goes back one day and then snaps to midnight.
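
If you need to adapt the regular expressions, one way to check a pattern before running it against real data is to apply it to a single synthetic event. The sketch below does this with makeresults. The request string is a made-up example written to fit the first pattern, not an actual Bitbucket log line, and DEMO and example-repo are hypothetical names.

```spl
| makeresults
| eval _raw="GET /projects/DEMO/repos/example-repo/browse HTTP/1.1"
| rex "GET /projects/[^/]*/repos/(?<git_repo>[^/]*)"
| table _raw git_repo
```

With this input, git_repo should contain example-repo. Replace the string with a line copied from your own access log to confirm that the pattern extracts the repository name you expect.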

Next steps

This search has no false positives in the traditional sense: every time it fires, it accurately reflects the first occurrence in the time period you are searching over (or, if you use the lookup cache feature, the first occurrence over whatever time period you built the lookup from). Because a first access can be perfectly normal, you should not review these alerts directly (except for access to extremely sensitive repositories). Instead, use them for context or to aggregate risk.
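
The lookup cache approach mentioned above can be sketched roughly as follows. This is a minimal sketch rather than a definitive implementation: the lookup file name git_repo_first_seen.csv is hypothetical, and the search assumes the lookup already exists, for example because you built it once by running the stats portion over a longer time range and writing the results with outputlookup.

```spl
source="*/atlassian-bitbucket-access.log" earliest=-1d@d
| rex "GET /projects/[^/]*/repos/(?<git_repo>[^/]*)"
| rex "(?<git_repo>[^/]*)\.git"
| rex "git\.[^ /]{1,}/projects/[^/]*/repos/(?<git_repo>[^/]*)"
| search git_repo="*"
| stats earliest(_time) AS earliest latest(_time) AS latest BY user, git_repo
| inputlookup append=t git_repo_first_seen.csv
| stats min(earliest) AS earliest max(latest) AS latest BY user, git_repo
| outputlookup git_repo_first_seen.csv
| where earliest > relative_time(now(), "-1d@d")
```

The search reads only recent events, merges them with the cached first-seen and last-seen times, writes the merged result back to the lookup, and then keeps only the user and repo pairs whose first access is recent. This lets a scheduled search cover a long history without rescanning it on every run.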

When this search returns values that warrant follow-up, initiate your incident response process and identify the user account accessing the specific repo. Contact the user and their manager to determine whether the access was authorized. If the user did not access the repo, try to determine whether another party has stolen and used the user's credentials.

Finally, you might be interested in other processes associated with the Monitoring use of Git repositories use case.