
Enabling access between Kubernetes indexer clusters and external search heads

 

This article discusses a project with the goal of using the Splunk Operator for Kubernetes (SOK) to create Splunk indexer clusters running on a new Kubernetes environment. The problem is that indexers running on Kubernetes (K8s) cannot be accessed by search heads that are not also running on K8s.

Why are external search heads unable to connect to indexers running on Kubernetes?

Kubernetes establishes an internal network that enables communication between K8s pods in the cluster. To enable incoming traffic to a Kubernetes-based Splunk cluster manager pod, we can use a Kubernetes service that allows external servers to access the REST port.

The challenge with that method lies in the way the cluster manager replies to requests for the indexer generation. The JSON payload includes IP addresses of the indexers that are pods within K8s. Therefore, the returned IPs cannot be accessed by servers that are not inside the K8s cluster.

This problem is shown in the following excerpt of a JSON response from a Splunk cluster manager:

{
 "host_port_pair":"10.192.9.18:8089",
 "peer":"splunk-example-idxc-site2-indexer-12",
 "site":"site2",
 "status":"Up"
},

The host_port_pair is an internal K8s pod IP address.

Despite successfully connecting to the cluster manager through a service, the search head encounters a problem when making subsequent REST calls to the indexers because the K8s pod IPs are not routable from outside the cluster.
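
To illustrate, the following is a sketch of what an external host sees when querying the indexer generation through the exposed REST port. The hostname, credentials, and the use of the /services/cluster/manager/generation endpoint are assumptions for illustration only:

# Query the indexer generation through the cluster manager REST port exposed by a K8s service.
# The hostname and credentials below are placeholders.
curl -k -u admin:changeme \
  "https://cluster-manager.company.com:8089/services/cluster/manager/generation?output_mode=json"

# The peers in the response carry internal pod IPs (for example, 10.192.9.18:8089),
# which are not routable from hosts outside the Kubernetes cluster.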

A visual depiction would look like this (note that NodePort should be a HostPort in the diagram below):

[Diagram: an external search head connecting through a service to the Kubernetes-based cluster manager, with the returned indexer pod IPs unreachable from outside the cluster]

Why not move the search heads into K8s?

In our scenario, we had on-premises bare metal Splunk search heads, and it did not make sense to rebuild the hardware as K8s nodes. Other reasons against moving search heads into K8s included:

  • Minimal benefit - We had no need for more search head cluster members.
  • Added complexity - K8s and the SOK introduce new challenges that we were not prepared for.
  • Search head management - We did not have access to two of the search heads, so we were unable to move them into K8s.

How can you handle ingress traffic into Kubernetes?

Within Kubernetes you can use a “service”. The options are:

  • HostPort
  • NodePort 
  • LoadBalancer / Gateway API

Since we were on-premises, the logical options were HostPort or NodePort.

The LoadBalancer service is commonly used in cloud environments. However, we didn’t have MetalLB or an alternative K8s bare metal load balancer available, so this was eliminated as an option.

A K8s service can expose the cluster manager’s REST port to Splunk search heads outside the K8s cluster. However, there are internal complications with how the Splunk platform works with the indexer “generation”. This problem is specific to Splunk indexer clusters and searching. Kubernetes services do not pose an issue for ingress of Splunk data via HEC, S2S, web interface ports (8000), or REST ports.
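
For reference, a NodePort service for the cluster manager’s REST port could look like the following minimal sketch; the service name, selector label, and node port are assumptions rather than values from our deployment:

apiVersion: v1
kind: Service
metadata:
    name: splunk-example-cm-rest
spec:
    type: NodePort
    selector:
        app.kubernetes.io/name: cluster-manager    # assumed label on the SOK cluster manager pod
    ports:
    - name: splunkd-rest
      port: 8089
      targetPort: 8089
      nodePort: 30089    # assumed value within the default 30000-32767 NodePort range

With a NodePort service, external search heads can reach the cluster manager on any K8s node IP at the chosen port; a HostPort achieves a similar result bound to the specific node running the pod.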

Solution

After the team we worked with from HPE completed a proof of concept (POC), implementing the solution became straightforward. The solution involved installing the flanneld and kube-proxy software on the search heads located outside the Kubernetes cluster.

At the Kubernetes level, we created a “node” to represent each search head, with NoSchedule and NoExecute taints. We used the attribute unschedulable: true to ensure a pod could not be scheduled on the node, and we called these nodes “external nodes”. The external nodes used a manually allocated IP at the end of the K8s CIDR range to avoid clashes with standard K8s IP allocation.

The flanneld configuration on the external search heads queried the K8s API server, and the node name matched the “node” we created in K8s.

To add further context, flanneld implements the “overlay network”, and kube-proxy generates the service routing rules (iptables rules in our case).

The following is a YAML example of an external node:

apiVersion: v1
kind: Node
metadata:
    name: externalhost.company.com
    labels:
        node-role.kubernetes.io/external-host: ""
spec:
    podCIDR: 10.207.255.225/32
    taints:
    - effect: NoSchedule
      key: node-role.kubernetes.io/external-host
    - effect: NoExecute
      key: node-role.kubernetes.io/external-host
    unschedulable: true
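
Creating and checking the node object follows the usual kubectl workflow (the file name below is just an example):

kubectl apply -f external-node.yaml
kubectl get node externalhost.company.com -o wide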

Kube-proxy and flanneld were configured to use a kubeconfig file so they could authenticate as a K8s node. The following is the relevant line from the kube-proxy systemd unit file:

ExecStart=/usr/local/bin/kube-proxy \
  --bind-address=0.0.0.0 \
  --cluster-cidr=10.192.0.0/12 \
  --kubeconfig=/root/.kube/admin.conf \
  --nodeport-addresses=127.0.0.1/8
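
The flanneld unit on the external search heads followed the same pattern. The following is a sketch rather than our exact unit file; the interface name and kubeconfig path are assumptions, and --kube-subnet-mgr tells flanneld to obtain its subnet configuration from the Kubernetes API rather than etcd:

# flanneld must register under the same node name as the external "node" object created above
# (for example, via an Environment=NODE_NAME=... line in the unit or the host's hostname).
ExecStart=/usr/local/bin/flanneld \
  --kube-subnet-mgr \
  --kubeconfig-file=/root/.kube/admin.conf \
  --iface=eth0 \
  --ip-masq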

How does this solution work?

The flanneld software allows network communication to the Kubernetes pod IPs. This gives the external search heads access to any pod’s internal IP as if they were running inside the K8s cluster.

Kube-proxy allows the service-to-pod mapping to function as expected, in our case by updating iptables rules. We later removed kube-proxy from the solution because we did not require the K8s DNS name resolution that was used in the POC.

In the initial testing, we updated /etc/resolv.conf to use the K8s DNS service IP. This allowed the use of K8s internal DNS names. Since search-head-to-indexer communication went directly to the IP addresses of the indexer pods, we reverted this change in production and used our default company DNS servers.
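
For reference, the temporary POC configuration was just a resolver entry pointing at the K8s DNS service; the nameserver IP and search domains below are placeholders for whatever your cluster uses:

# /etc/resolv.conf during the POC only (placeholder values)
nameserver 10.193.0.10
search svc.cluster.local cluster.local company.com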

Within the on-premises environment, we saw no issues with this setup. Search-head-to-indexer communication worked well and there was no noticeable difference when compared to connecting to non-K8s indexers.

In the cloud environment, we encountered issues with cloud-to-on-premises traffic. The VXLAN/UDP packets were being dropped, and troubleshooting was very complicated. This might be a consideration if you choose to go down this path.

One suggestion I have is to ensure the same maximum transmission unit (MTU) is available throughout the network path. Otherwise, packets will be fragmented, which can increase the drop rate. I have further details on the challenges we found with the SOK in the article Splunk Operator for Kubernetes (SOK) — Lessons from our implementation.
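
As a quick check for MTU mismatches, you can send non-fragmentable ICMP probes between a search head and a K8s node and inspect the MTU flannel selected for its VXLAN interface. The hostnames below are placeholders, and the sizes assume a standard 1500-byte Ethernet MTU:

# 1472 = 1500 - 20 (IP header) - 8 (ICMP header); this succeeds only if the path supports a 1500-byte MTU
ping -M do -s 1472 k8s-node.company.com

# On a K8s node, confirm the MTU flannel chose for its VXLAN interface
# (VXLAN adds roughly 50 bytes of overhead, so this is typically 1450 on a 1500-byte network)
ip link show flannel.1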

Conclusion

While the Splunk Operator for Kubernetes generally assumes that search heads run within the same K8s cluster as the indexers, we managed to create a solution that allowed external search heads to continue to function using flanneld and kube-proxy. Special recognition for this solution goes to the HPE team that was responsible for the proof of concept of “external nodes”.

These additional resources might help you understand and implement this guidance: