Configuring Splunk DB Connect for use with Google BigQuery
While Splunk DB Connect supports a wide range of out-of-the-box database connectors, Google's BigQuery is not one of them. With a little effort, however, you can build a custom JDBC add-on for DB Connect that allows the Splunk platform to interface with your BigQuery data warehouse. This article shows you the steps required for building this add-on, along with instructions for configuring the BigQuery connection within the DB Connect app.
For generic information on configuring Splunk DB Connect, see Configuring Splunk DB Connect and the guidance on working with DB Connect on Splunk Docs.
Connecting DB Connect to a BigQuery data warehouse will incur costs from Google Cloud. Splunk is not responsible for queries that result in large or unexpected bills. Always test your queries in the BigQuery console to preview the bytes scanned and determine if the associated cost of the scan is acceptable.
Prerequisites
To configure DB Connect for use with BigQuery, you need:
- A Splunk Enterprise or Splunk Cloud Platform environment
- curl, unzip, and jq CLI utilities
- gcloud SDK
- Splunk DB Connect installed with configured JRE
- Google Cloud project with a user who possesses administrative privileges for managing service accounts and IAM
- Familiarity with the BigQuery billing model
JDBC driver add-on information
Installing JDBC drivers for supported DB Connect databases is a relatively simple process. Each database has its drivers packaged into a dedicated add-on as described in the documentation. Driver installation and upgrades can be managed independently of the DB Connect app.
If you want to use JDBC drivers for an unsupported database such as BigQuery, you should follow the unsupported driver installation process. Part of this process involves adding files to the DB Connect app itself. This will work fine for environments that are fully under your control. However, for Splunk Cloud Platform environments, this option is not available.
To work around this issue, you can create a private add-on that looks and behaves like the ones Splunk supplies for its officially supported database drivers. When DB Connect scans for Splunk-supplied JDBC drivers, it traverses the app directory structure looking for add-ons that begin with Splunk_JDBC_
. By using this same naming convention and a special directory structure, you can create your own custom add-ons containing unsupported JDBC drivers. These private add-ons can be installed in Splunk Cloud Platform environments using the private app install process.
Build the add-on
The shell script found in this repository can be used to build a custom add-on containing the Google-supplied JDBC drivers. It downloads the JDBC drivers, creates configuration files, and generates an add-on suitable for installation in a Splunk Enterprise or Splunk Cloud Platform environment.
After the script is finished executing, an add-on file with the name Splunk_JDBC_BigQuery_100.tgz
will be present in the current working directory.
Install the add-on
In addition to installing the DB Connect App, the new custom JDBC add-on must also be installed. The installation process depends on whether the environment is self-hosted or Splunk Cloud Platform. See Splunk Cloud documentation for more information.
Prepare Google Cloud environment
Create a service account
- Export several environment variables for use throughout the configuration instructions.
export SA_NAME=dbx-bq export PROJECT_ID=<PROJECT_ID> export KEY_FILE=dbx-bq-sa.json export SA_EMAIL=${SA_NAME}@${PROJECT_ID}.iam.gserviceaccount.com
- Create a service account in a Google Cloud project.
gcloud iam service-accounts create ${SA_NAME} --project=${PROJECT_ID}
- Export a JSON credential for this new service account.
gcloud iam service-accounts keys create ${KEY_FILE} --iam-account=${SA_EMAIL}
- Grant this service account the
bigquery.user
role within the project scope. Optionally, to use thedbxoutput
command provided by DB Connect, you can grantbigquery.admin
instead. Note that thebigquery.user
role is sufficient to use thedbxquery
command.gcloud projects add-iam-policy-binding ${PROJECT_ID} --member=serviceAccount:${SA_EMAIL} --role=roles/bigquery.user
- In order to use the JSON credential within the {{platform}, you need to flatten the JSON into a single line. The easiest way to do this is using the
jq
utility:jq -c . ${KEY_FILE}
- This should output the JSON on a single line similar to the output below.
{"type":"service_account","project_id":"<REDACTED>,"private_key_id":"e580bb185173189d15a653408a24957ffb8d2872","private_key":"-----BEGIN PRIVATE KEY-----\n<REDACTED>\n-----END PRIVATE KEY-----\n","client_email":"dev-dbx-bq-sa@<REDACTED>.iam.gserviceaccount.com","client_id":"109787105253850867414","auth_uri":"https://accounts.google.com/o/oauth2/auth","token_uri":"https://oauth2.googleapis.com/token","auth_provider_x509_cert_url":"https://www.googleapis.com/oauth2/v1/certs","client_x509_cert_url":"https://www.googleapis.com/robot/v1/metadata/x509/dev-dbx-bq-sa%40<REDACTED>.iam.gserviceaccount.com","universe_domain":"googleapis.com"}
- Copy and paste this flattened JSON into a text file so you can reference it later in your DB Connect configuration.
Configure DB Connect
To prepare DB Connect for a BigQuery connection, you must first create a placeholder identity and a new connection.
Create an identity
Within the Splunk DB Connect App, navigate to Configuration > Databases > Identities. Click New Identity and select Basic Identity.
For the Identity Name, Username, and Password fields, enter bq and then click Save.
Create a connection
- Navigate to Configuration > Databases > Connections. Click New Connection.
- Enter bq as the connection name. Select bq as the Identity, BigQuery as the Connection Type, and Etc/UTC for Timezone.
- Place a check mark next to the Edit JDBC URL option. This allows you to edit the JDBC connection URL and inject a service account credential.
- To create the custom JDBC URL, use the following format with the changes described below:
jdbc:bigquery://https://www.googleapis.com/bigquery/v2:443;EnableSession=1;ProjectId=<YOUR-PROJECT>;OAuthType=0;OAuthServiceAcctEmail=<YOUR-SERVICE-ACCOUNT-EMAIL>;OAuthPvtKey=<JSON-SERVICE-ACCOUNT-KEY>;
- Substitute your project ID for
<YOUR-PROJECT>
. - Substitute the email of the service account created in the account creation steps from the previous section for
<YOUR-SERVICE-ACCOUNT-EMAIL>
. - Substitute the single-line version of the service account key you previously reformatted with the
jq
command for the<JSON-SERVICE-ACCOUNT-KEY>.
- Ensure the JDBC URL ends with a semicolon.
Here is a completed example:jdbc:bigquery://https://www.googleapis.com/bigquery/v2:443;EnableSession=1;ProjectId=my-project-151018;OAuthType=0;OAuthServiceAcctEmail=dev-dbx-bq-sa@my-project-151018.iam.gserviceaccount.com;OAuthPvtKey={"type":"service_account","project_id":"my-project-151018","private_key_id":"8715a33233d1d7f3a1f8c6138eb780b474c67422","private_key":"-----BEGIN PRIVATE KEY-----\n<REDACTED>-----END PRIVATE KEY-----\n","client_email":"dev-dbx-bq-sa@gsa-project-151018.iam.gserviceaccount.com","client_id":"109787105253850867414","auth_uri":"https://accounts.google.com/o/oauth2/auth","token_uri":"https://oauth2.googleapis.com/token","auth_provider_x509_cert_url":"https://www.googleapis.com/oauth2/v1/certs","client_x509_cert_url":"https://www.googleapis.com/robot/v1/metadata/x509/dev-dbx-bq-sa%40my-project-151018.iam.gserviceaccount.com","universe_domain":"googleapis.com"};
- Substitute your project ID for
- Ignore the Advanced settings and click Save to finish.
Test
Now that the BigQuery connection has been defined in DB Connect, you're ready to issue your first query. In this example, a public data set is used to do this.
Performing this query results in BigQuery charges. It is your responsibility to perform the dry-run in the BigQuery console to estimate costs.
Navigate to the Search app and enter the following in the Splunk search box:
| dbxquery query="SELECT * FROM `bigquery-public-data.noaa_historic_severe_storms.storms_2023`" connection="bq"
This should return ~ 87,564 rows.
Limitations
- The private key for the service account is visible in the JDBC URL and is not stored encrypted.
- Custom add-ons containing jar files must go through manual vetting in a Splunk Cloud environment.
- Customers should review GoogleSQL DML limitations.
Next steps
These Splunk resources might help you understand and implement the recommendations in this article:
- Splunk Docs: About Splunk DB Connect
- Splunk Docs: How Splunk DB Connect works
- Splunk Docs: Splunk DB Connect release notes
- Splunk Docs: Splunk DB Connect Prerequisites and Java runtime environment requirements