Building a custom OpenTelemetry collector
This article shows how to build a custom OpenTelemetry (OTel) collector distribution. By creating a custom collector, you can significantly reduce binary size and include only the components needed for your monitoring requirements.
What is OpenTelemetry?
OpenTelemetry (OTel) is a collection of tools, APIs, and SDKs used to instrument, generate, collect, and export telemetry data including metrics, logs, and traces. This data helps you analyze your software's performance and behavior.
Splunk maintains the Splunk Distribution of the OpenTelemetry Collector, and if you're using a Splunk forwarder, you can distribute this collector using the Splunk Add-On for OpenTelemetry Collector.
OpenTelemetry provides several advantages for infrastructure and application monitoring, particularly for organizations looking to standardize their observability stack. OTel is widely expected to become the standard for performance monitoring tools, which makes it a future-proof choice for new implementations.
Optimizing the Splunk Add-On for OpenTelemetry Collector
The Splunk Add-On for OpenTelemetry Collector provides comprehensive monitoring capabilities, but this flexibility can come at a cost. The add-on can be hundreds of megabytes in size, whereas purpose-built tools can be much smaller. Version 1.7.0 of the Splunk Add-On for OpenTelemetry Collector exceeds 450MB as a compressed archive and is even larger when expanded. For simpler use cases, for example those that only require host metrics monitoring without application instrumentation or JVM monitoring, this footprint can be significantly reduced.
Removing unnecessary components
The command below shows an example of how you can optimize the Splunk Add-On for OpenTelemetry Collector for Linux-only host metrics collection by removing several directories from the root level of the add-on:
cd Splunk_TA_otel  # Ensure you are in the correct directory before running these commands
rm -Rf windows_x86_64/
rm -Rf configs/
The configs directory is removed above, but an equivalent directory is created later in this process, when you develop a configuration app for the Splunk Add-On for OpenTelemetry Collector.
Extracting and optimizing the agent bundle
Next, extract the agent bundle and remove unnecessary components:
tar -xzf linux_x86_64/bin/agent-bundle_linux_amd64.tar.gz -C bin/
rm linux_x86_64/bin/agent-bundle_linux_amd64.tar.gz
This creates the directory structure at /opt/splunk/etc/apps/Splunk_TA_otel/bin/agent-bundle. Within the agent bundle, you can remove additional files that aren't needed for basic host metrics collection:
cd Splunk_TA_otel/bin/agent-bundle  # Ensure you are in the correct directory before running the commands below
rm -Rf run var lib collectd-python collectd-java jre signalfx_types.db
While these optimizations reduce the overall package size, the OTel collector binary itself remains fairly large (373MB in version 1.7.0 or 391MB in version 1.8.0). This size is likely due to the collector's flexibility and comprehensive feature set. For scenarios where you don't need the extra features, building a custom collector can reduce the package size.
Building a custom OTel collector
The OpenTelemetry Collector Builder (OCB) is the official tool for creating custom collector distributions. It allows you to specify exactly which components to include, resulting in a smaller, more focused binary.
Creating the builder configuration
The builder uses a YAML configuration file to define which components to include. The following configuration is based on OCB version 0.139.0 and overall matches the components used in the Splunk Add-On for OpenTelemetry Collector, but it pulls from upstream OTel sources. Save this as otelcol-builder.yaml:
dist:
  name: otelcol-custom
  description: Local OpenTelemetry Collector binary
  output_path: /tmp/dist
exporters:
  - gomod: github.com/open-telemetry/opentelemetry-collector-contrib/exporter/splunkhecexporter v0.139.0
receivers:
  - gomod: github.com/open-telemetry/opentelemetry-collector-contrib/receiver/hostmetricsreceiver v0.139.0
processors:
  - gomod: go.opentelemetry.io/collector/processor/batchprocessor v0.139.0
  - gomod: go.opentelemetry.io/collector/processor/memorylimiterprocessor v0.139.0
  - gomod: github.com/open-telemetry/opentelemetry-collector-contrib/processor/resourcedetectionprocessor v0.139.0
providers:
  - gomod: go.opentelemetry.io/collector/confmap/provider/envprovider v1.45.0
  - gomod: go.opentelemetry.io/collector/confmap/provider/fileprovider v1.45.0
  - gomod: go.opentelemetry.io/collector/confmap/provider/httpprovider v1.45.0
  - gomod: go.opentelemetry.io/collector/confmap/provider/httpsprovider v1.45.0
  - gomod: go.opentelemetry.io/collector/confmap/provider/yamlprovider v1.45.0
extensions:
  - gomod: github.com/open-telemetry/opentelemetry-collector-contrib/extension/healthcheckextension v0.139.0
The resourcedetectionprocessor shown above significantly impacts binary size: when testing this build without it, the binary was approximately 28MB, while including it increased the size to 98MB. In this case, resource detection was useful, so it remains in this build.
Running the build process
You have two options for building the collector: Docker, or an OCB binary available from the builder releases on GitHub. For more information on this process, see Building a custom collector.
Option 1: Using Docker
If you are able to use Docker, use the official Docker image to run the build.
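The exact invocation depends on the image you use. As a sketch only (the image path and tag below are assumptions based on the OpenTelemetry Collector builder releases, so verify the current image name before running), you mount the builder configuration and an output directory into the container and pass the configuration file to the builder:
# Illustrative only: confirm the builder image name and tag on the OCB releases page
docker run --rm \
  -v "$(pwd)/otelcol-builder.yaml:/tmp/otelcol-builder.yaml:ro" \
  -v "$(pwd)/dist:/tmp/dist" \
  ghcr.io/open-telemetry/opentelemetry-collector-releases/opentelemetry-collector-builder:0.139.0 \
  --config=/tmp/otelcol-builder.yaml
Because output_path in otelcol-builder.yaml is /tmp/dist, mounting a host directory at that path makes the compiled binary available outside the container.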
Option 2: Using the OCB binary
If you are unable to use Docker, you can download the appropriate OCB binary for your platform from GitHub, and then run:
./ocb_0.139.0_linux_amd64 --config=otelcol-builder.yaml
You will need Go and Git to be available on the server building the binary.
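End to end, the binary option might look like the following sketch. The download URL pattern is an assumption based on the release asset naming, so verify the exact URL on the builder releases page; the build itself still requires Go on the machine:
# Illustrative download; verify the exact release tag and asset name on the OCB releases page
curl -L -o ocb_0.139.0_linux_amd64 \
  "https://github.com/open-telemetry/opentelemetry-collector-releases/releases/download/cmd%2Fbuilder%2Fv0.139.0/ocb_0.139.0_linux_amd64"
chmod +x ocb_0.139.0_linux_amd64
./ocb_0.139.0_linux_amd64 --config=otelcol-builder.yaml
# The compiled collector lands in the output_path from the builder config, typically named after dist.name
ls -lh /tmp/dist/otelcol-custom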
The custom binary expects slightly different command-line syntax than the Splunk Add-On for OpenTelemetry Collector passes by default. You need to modify the line that sets SPLUNK_OTEL_FLAGS in Splunk_TA_otel/linux_x86_64/bin/Splunk_TA_otel.sh (line 308 at the time of writing) so it reads:
SPLUNK_OTEL_FLAGS="--config-file=file:$splunk_config_value"
This change allows the agent to start correctly using the Splunk Add-On for OpenTelemetry Collector.
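If you prefer to script the change, a sed command similar to the following could apply it; the line number is specific to the add-on version tested here, so verify it in your copy of the script first:
# Overwrite the SPLUNK_OTEL_FLAGS assignment (line 308 at the time of writing) with the syntax
# the custom binary expects; single quotes keep $splunk_config_value from being expanded by the shell
sed -i '308s|.*|SPLUNK_OTEL_FLAGS="--config-file=file:$splunk_config_value"|' \
  Splunk_TA_otel/linux_x86_64/bin/Splunk_TA_otel.sh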
Developing a configuration app for the Splunk Add-On for OpenTelemetry Collector
Separating the OTel configuration from the Splunk Add-On for OpenTelemetry Collector means you can deploy different configurations to different server groups without redeploying the entire add-on.
The application is named A_Splunk_TA_otel_config to ensure it loads before Splunk_TA_otel in the lexicographical app loading order used in the Splunk platform. This is important because configuration precedence depends on app load order.
mkdir -p /opt/splunk/etc/apps/A_Splunk_TA_otel_config/configs
cd /opt/splunk/etc/apps/A_Splunk_TA_otel_config/configs
# Quote the heredoc delimiter so the ${...} placeholders are written literally instead of being expanded by the shell
cat > ta-agent-config_new.yaml <<'EOF'
# Default configuration file for the Linux (deb/rpm) and Windows MSI collector packages
# If the collector is installed without the Linux/Windows installer script, the following
# environment variables are required to be manually defined or configured below:
# - SPLUNK_ACCESS_TOKEN: The Splunk access token to authenticate requests
# - SPLUNK_API_URL: The Splunk API URL, e.g. https://api.us0.signalfx.com
# - SPLUNK_BUNDLE_DIR: The path to the Smart Agent bundle, e.g. /usr/lib/splunk-otel-collector/agent-bundle
# - SPLUNK_COLLECTD_DIR: The path to the collectd config directory for the Smart Agent, e.g. /usr/lib/splunk-otel-collector/agent-bundle/run/collectd
# - SPLUNK_INGEST_URL: The Splunk ingest URL, e.g. https://ingest.us0.signalfx.com
# - SPLUNK_LISTEN_INTERFACE: The network interface the agent receivers listen on.
# - SPLUNK_TRACE_URL: The Splunk trace endpoint URL, e.g. https://ingest.us0.signalfx.com/v2/trace
extensions:
  health_check:
    endpoint: "${SPLUNK_LISTEN_INTERFACE}:13134"
receivers:
  hostmetrics:
    collection_interval: 1m
    scrapers:
      cpu:
        metrics:
          system.cpu.utilization:
            enabled: true
          system.cpu.time:
            enabled: false
      disk:
        metrics:
          system.disk.weighted_io_time:
            enabled: false
          system.disk.merged:
            enabled: false
      filesystem:
      memory:
        metrics:
          system.memory.limit:
            enabled: true
          system.linux.memory.available:
            enabled: true
      network:
        metrics:
          system.network.packets:
            enabled: false
      # System load average metrics https://en.wikipedia.org/wiki/Load_(computing)
      #load:
      # Paging/Swap space utilization and I/O metrics
      paging:
        metrics:
          system.paging.usage:
            enabled: false
          system.paging.faults:
            enabled: false
      # Aggregated system process count metrics
      processes:
      # System processes metrics, disabled by default
      # process:
processors:
  batch:
  # Enabling the memory_limiter is strongly recommended for every pipeline.
  # Configuration is based on the amount of memory allocated to the collector.
  # For more information about memory limiter, see
  # https://github.com/open-telemetry/opentelemetry-collector/blob/main/processor/memorylimiter/README.md
  memory_limiter:
    check_interval: 2s
    limit_mib: ${env:SPLUNK_MEMORY_LIMIT_MIB}
  # Detect if the collector is running on a cloud system, which is important for creating unique cloud provider dimensions.
  # Detector order is important: the `system` detector goes last so it can't preclude cloud detectors from setting host/os info.
  # Resource detection processor is configured to override all host and cloud attributes because instrumentation
  # libraries can send wrong values from container environments.
  # https://docs.splunk.com/Observability/gdi/opentelemetry/components/resourcedetection-processor.html#ordering-considerations
  resourcedetection:
    detectors: [gcp, ecs, ec2, azure, system]
    override: true
exporters:
  splunk_hec/platform:
    token: "${SPLUNK_ACCESS_TOKEN}"
    # URL to a Splunk instance to send data to.
    endpoint: "${SPLUNK_INGEST_URL}"
    index: "metrics"
    source: "otel"
    sourcetype: "otel"
    tls:
      insecure_skip_verify: true
service:
  extensions: [health_check]
  pipelines:
    metrics:
      receivers: [hostmetrics]
      processors: [memory_limiter, batch, resourcedetection]
      exporters: [splunk_hec/platform]
EOF
mkdir -p /opt/splunk/etc/apps/A_Splunk_TA_otel_config/local
cd /opt/splunk/etc/apps/A_Splunk_TA_otel_config/local
echo "example" > access_token
cat > app.conf <<EOF
# Autogenerated file
[install]
state = enabled
EOF
cat > inputs.conf <<EOF
[monitor://\$SPLUNK_HOME/var/log/splunk/Splunk_TA_otel.log]
_TCP_ROUTING =
index = _internal
sourcetype = Splunk_TA_otel
[Splunk_TA_otel://Splunk_TA_otel]
disabled=false
start_by_shell=false
interval = 60
index = _internal
sourcetype = Splunk_TA_otel
splunk_access_token_file=\$SPLUNK_HOME/etc/apps/A_Splunk_TA_otel_config/local/access_token
#splunk_api_url=https://api.us0.signalfx.com
splunk_api_url=https://localhost:8089
splunk_bundle_dir=\$SPLUNK_HOME/etc/apps/Splunk_TA_otel/bin/agent-bundle
splunk_collectd_dir=\$SPLUNK_HOME/etc/apps/Splunk_TA_otel/bin/agent-bundle/run/collectd
splunk_ingest_url=https://localhost:8088/services/collector
splunk_listen_interface=localhost
splunk_realm=us0
splunk_config=\$SPLUNK_HOME/etc/apps/A_Splunk_TA_otel_config/configs/ta-agent-config_new.yaml
splunk_otel_log_file=\$SPLUNK_HOME/var/log/splunk/otel.log
splunk_memory_limit_mib=200
splunk_memory_total_mib=200
[http://OpenTelemetry Collector]
disabled = 0
index = metrics
indexes = metrics
sourcetype = Splunk_TA_otel
token = example
EOF
The app creates a HEC token named “example”, and gets the OTel agent running and sending to an index named “metrics”. This configuration is designed to run on a heavy forwarder and forward metrics locally via HEC. For a universal forwarder deployment, you'll need to modify the splunk_ingest_url setting and remove the HEC token configuration from inputs.conf.
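For example, on a universal forwarder the inputs.conf change might look like the following sketch; the hostname is a placeholder for your own HEC endpoint, and the access_token file should then contain a token that is valid on that endpoint:
# Hypothetical universal forwarder variation: send to a remote HEC endpoint instead of a local one,
# and omit the [http://OpenTelemetry Collector] stanza from inputs.conf entirely
splunk_ingest_url=https://hec.example.com:8088/services/collector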
The hostmetrics section has been customized to collect specific metrics while disabling others that might not be needed. This reduces data volume and focuses on the most relevant performance indicators. For full documentation, see Host metrics receiver.
This configuration assumes you're using the Splunk platform without Splunk Observability Cloud: only a HEC token is configured, and it receives the hostmetrics data into a metrics index.
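After both apps are in place, you may need to restart the Splunk platform (or redeploy through your deployment server) for the new inputs to take effect, for example:
$SPLUNK_HOME/bin/splunk restart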
Next steps
Building a custom OpenTelemetry collector requires initial effort to configure correctly, but after you establish a reliable build process, you can significantly reduce the collector's footprint for specific scenarios. Shrinking each agent from hundreds of megabytes to tens of megabytes adds up quickly, especially in large-scale deployments.
You should note that the Splunk platform does not provide pre-built dashboards for OpenTelemetry metrics, so you'll need to create custom dashboards if you're not using Splunk Observability Cloud. Despite this consideration, custom OTel collectors offer significant benefits for resource-constrained environments, security-conscious deployments, and organizations that want precise control over their observability stack.
In addition, these resources might help you understand and implement this guidance:
- OpenTelemetry Documentation: Collector overview
- OpenTelemetry Documentation: Building a custom collector
- GitHub: OpenTelemetry Collector core
- GitHub: OpenTelemetry Collector contrib
- GitHub: OpenTelemetry Collector builder releases

