
Configuring Splunk DB Connect for use with Google BigQuery

 

While Splunk DB Connect supports a wide range of out-of-the-box database connectors, Google's BigQuery is not one of them. With a little effort, however, you can build a custom JDBC add-on for DB Connect that allows the Splunk platform to interface with your BigQuery data warehouse. This article shows you the steps required for building this add-on, along with instructions for configuring the BigQuery connection within the DB Connect app.

For generic information on configuring Splunk DB Connect, see Configuring Splunk DB Connect and the guidance on working with DB Connect on Splunk Docs.

Connecting DB Connect to a BigQuery data warehouse will incur costs from Google Cloud. Splunk is not responsible for queries that result in large or unexpected bills. Always test your queries in the BigQuery console to preview the bytes scanned and determine if the associated cost of the scan is acceptable.

Prerequisites

To configure DB Connect for use with BigQuery, you need:

  • A Splunk Enterprise or Splunk Cloud Platform environment
  • curl, unzip, and jq CLI utilities
  • Google Cloud SDK (gcloud)
  • Splunk DB Connect installed with configured JRE
  • A Google Cloud project and a user with administrative privileges for managing service accounts and IAM
  • Familiarity with the BigQuery billing model

JDBC driver add-on information

Installing JDBC drivers for supported DB Connect databases is a relatively simple process. Each database has its drivers packaged into a dedicated add-on as described in the documentation. Driver installation and upgrades can be managed independently of the DB Connect app.

If you want to use JDBC drivers for an unsupported database such as BigQuery, you should follow the unsupported driver installation process. Part of this process involves adding files to the DB Connect app itself. This will work fine for environments that are fully under your control. However, for Splunk Cloud Platform environments, this option is not available.

To work around this issue, you can create a private add-on that looks and behaves like the ones Splunk supplies for its officially supported database drivers. When DB Connect scans for Splunk-supplied JDBC drivers, it traverses the app directory structure looking for add-ons that begin with Splunk_JDBC_. By using this same naming convention and a special directory structure, you can create your own custom add-ons containing unsupported JDBC drivers. These private add-ons can be installed in Splunk Cloud Platform environments using the private app install process.
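
For illustration, a private driver add-on might be laid out as follows. This layout is an assumption modeled on the Splunk-supplied JDBC driver add-ons rather than an official template; the essential pieces are the Splunk_JDBC_ name prefix, standard app metadata, and a directory holding the driver JAR files:

Splunk_JDBC_BigQuery/
    default/
        app.conf                    # standard Splunk app metadata
    drivers/
        GoogleBigQueryJDBC42.jar    # assumed driver JAR name
        libs/                       # assumed location for driver dependency JARs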

Build the add-on

The shell script found in this repository builds a custom add-on containing the Google-supplied JDBC drivers. It downloads the JDBC drivers, creates configuration files, and generates an add-on suitable for installation in a Splunk Enterprise or Splunk Cloud Platform environment.

After the script finishes executing, an add-on file named Splunk_JDBC_BigQuery_100.tgz will be present in the current working directory.
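
If you prefer to see what such a build involves, the process might look like the sketch below. The driver download URL is a placeholder and the app.conf contents are a minimal assumption; the script in the repository remains the authoritative version:

# Illustrative sketch only -- the repository script is authoritative.
APP=Splunk_JDBC_BigQuery
DRIVER_ZIP_URL=<URL-OF-GOOGLE-BIGQUERY-JDBC-DRIVER-ZIP>

# Download and unpack the Google-supplied JDBC driver into the add-on
mkdir -p ${APP}/default ${APP}/drivers
curl -sL "${DRIVER_ZIP_URL}" -o driver.zip
unzip -q driver.zip -d ${APP}/drivers

# Minimal app metadata so the Splunk platform recognizes the package as an app
cat > ${APP}/default/app.conf <<'EOF'
[launcher]
version = 1.0.0
description = Custom JDBC add-on for Google BigQuery

[package]
id = Splunk_JDBC_BigQuery
EOF

# Package the add-on for installation
tar -czf Splunk_JDBC_BigQuery_100.tgz ${APP}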

Install the add-on

In addition to the DB Connect app itself, you must install the new custom JDBC add-on. The installation process depends on whether the environment is self-hosted or Splunk Cloud Platform. See the Splunk Cloud Platform documentation for more information.
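
For a self-hosted Splunk Enterprise instance, one way to install the package is with the Splunk CLI; for Splunk Cloud Platform, upload the file through the private app install process instead. A minimal sketch, assuming the add-on file is in the current working directory:

$SPLUNK_HOME/bin/splunk install app Splunk_JDBC_BigQuery_100.tgz
$SPLUNK_HOME/bin/splunk restart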

Prepare Google Cloud environment

Create a service account

  1. Export several environment variables for use throughout the configuration instructions.
    export SA_NAME=dbx-bq
    export PROJECT_ID=<PROJECT_ID>
    export KEY_FILE=dbx-bq-sa.json
    export SA_EMAIL=${SA_NAME}@${PROJECT_ID}.iam.gserviceaccount.com
  2. Create a service account in a Google Cloud project.
    gcloud iam service-accounts create ${SA_NAME} --project=${PROJECT_ID}
    
  3. Export a JSON credential for this new service account.
    gcloud iam service-accounts keys create ${KEY_FILE} --iam-account=${SA_EMAIL}
    
  4. Grant this service account the bigquery.user role within the project scope. Optionally, to use the dbxoutput command provided by DB Connect, you can grant bigquery.admin instead. Note that the bigquery.user role is sufficient to use the dbxquery command.
    gcloud projects add-iam-policy-binding ${PROJECT_ID} --member=serviceAccount:${SA_EMAIL} --role=roles/bigquery.user
    
  5. To use the JSON credential within the Splunk platform, you need to flatten the JSON into a single line. The easiest way to do this is with the jq utility:
    jq -c . ${KEY_FILE}
    
  6. This should output the JSON on a single line similar to the output below.
    {"type":"service_account","project_id":"<REDACTED>,"private_key_id":"e580bb185173189d15a653408a24957ffb8d2872","private_key":"-----BEGIN PRIVATE KEY-----\n<REDACTED>\n-----END PRIVATE KEY-----\n","client_email":"dev-dbx-bq-sa@<REDACTED>.iam.gserviceaccount.com","client_id":"109787105253850867414","auth_uri":"https://accounts.google.com/o/oauth2/auth","token_uri":"https://oauth2.googleapis.com/token","auth_provider_x509_cert_url":"https://www.googleapis.com/oauth2/v1/certs","client_x509_cert_url":"https://www.googleapis.com/robot/v1/metadata/x509/dev-dbx-bq-sa%40<REDACTED>.iam.gserviceaccount.com","universe_domain":"googleapis.com"}
    
  7. Copy and paste this flattened JSON into a text file so you can reference it later in your DB Connect configuration.
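
Optionally, you can confirm that the role binding from step 4 took effect by filtering the project's IAM policy down to the new service account:

gcloud projects get-iam-policy ${PROJECT_ID} \
  --flatten="bindings[].members" \
  --filter="bindings.members:serviceAccount:${SA_EMAIL}" \
  --format="table(bindings.role)"

The output should list roles/bigquery.user, or roles/bigquery.admin if you granted that instead.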

Configure DB Connect

To prepare DB Connect for a BigQuery connection, you must first create a placeholder identity and a new connection.

Create an identity

Within the Splunk DB Connect App, navigate to Configuration > Databases > Identities. Click New Identity and select Basic Identity.


For the Identity Name, Username, and Password fields, enter bq and then click Save.

Create a connection

  1. Navigate to Configuration > Databases > Connections. Click New Connection.
  2. Enter bq as the connection name. Select bq as the Identity, BigQuery as the Connection Type, and Etc/UTC for Timezone.
  3. Place a check mark next to the Edit JDBC URL option. This allows you to edit the JDBC connection URL and inject a service account credential.
  4. To create the custom JDBC URL, use the following format with the changes described below (a shell sketch for generating this URL from the exported environment variables appears after these steps):
    jdbc:bigquery://https://www.googleapis.com/bigquery/v2:443;EnableSession=1;ProjectId=<YOUR-PROJECT>;OAuthType=0;OAuthServiceAcctEmail=<YOUR-SERVICE-ACCOUNT-EMAIL>;OAuthPvtKey=<JSON-SERVICE-ACCOUNT-KEY>;
    
    • Substitute your project ID for <YOUR-PROJECT>.
    • Substitute the email of the service account created in the previous section for <YOUR-SERVICE-ACCOUNT-EMAIL>.
    • Substitute the single-line version of the service account key that you reformatted with the jq command for <JSON-SERVICE-ACCOUNT-KEY>.
    • Ensure the JDBC URL ends with a semicolon.

    Here is a completed example:
    jdbc:bigquery://https://www.googleapis.com/bigquery/v2:443;EnableSession=1;ProjectId=my-project-151018;OAuthType=0;OAuthServiceAcctEmail=dev-dbx-bq-sa@my-project-151018.iam.gserviceaccount.com;OAuthPvtKey={"type":"service_account","project_id":"my-project-151018","private_key_id":"8715a33233d1d7f3a1f8c6138eb780b474c67422","private_key":"-----BEGIN PRIVATE KEY-----\n<REDACTED>\n-----END PRIVATE KEY-----\n","client_email":"dev-dbx-bq-sa@my-project-151018.iam.gserviceaccount.com","client_id":"109787105253850867414","auth_uri":"https://accounts.google.com/o/oauth2/auth","token_uri":"https://oauth2.googleapis.com/token","auth_provider_x509_cert_url":"https://www.googleapis.com/oauth2/v1/certs","client_x509_cert_url":"https://www.googleapis.com/robot/v1/metadata/x509/dev-dbx-bq-sa%40my-project-151018.iam.gserviceaccount.com","universe_domain":"googleapis.com"};
    


  5. Ignore the Advanced settings and click Save to finish.
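
Because assembling the JDBC URL by hand is error-prone, you might instead generate it from the environment variables exported earlier. This is a convenience sketch, not part of the official procedure:

# Flatten the key and splice it into the JDBC URL template
KEY_JSON=$(jq -c . ${KEY_FILE})
echo "jdbc:bigquery://https://www.googleapis.com/bigquery/v2:443;EnableSession=1;ProjectId=${PROJECT_ID};OAuthType=0;OAuthServiceAcctEmail=${SA_EMAIL};OAuthPvtKey=${KEY_JSON};"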

Test

Now that the BigQuery connection is defined in DB Connect, you're ready to issue your first query. The example below queries a public data set.

Performing this query results in BigQuery charges. It is your responsibility to perform a dry run in the BigQuery console to estimate costs.
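
One way to estimate the scan before running the query through DB Connect is a dry run with the bq CLI (part of the Google Cloud SDK), which reports the bytes that would be processed without executing the query; the query validator in the BigQuery console shows the same figure:

bq query --use_legacy_sql=false --dry_run \
  'SELECT * FROM `bigquery-public-data.noaa_historic_severe_storms.storms_2023`'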

Navigate to the Search app and enter the following in the Splunk search box:

| dbxquery query="SELECT * FROM `bigquery-public-data.noaa_historic_severe_storms.storms_2023`" connection="bq"


This should return approximately 87,564 rows.
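
If you only want to confirm connectivity rather than retrieve the full table, an unfiltered COUNT(*) is a cheaper check: for native BigQuery tables it is typically answered from table metadata, so a dry run reports zero bytes scanned. For example:

| dbxquery query="SELECT count(*) FROM `bigquery-public-data.noaa_historic_severe_storms.storms_2023`" connection="bq"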

Limitations

  • The private key for the service account is visible in the JDBC URL and is not stored encrypted.
  • Custom add-ons containing JAR files must go through manual vetting in a Splunk Cloud Platform environment.
  • Customers should review GoogleSQL DML limitations.

Next steps

These Splunk resources might help you understand and implement the recommendations in this article: