Load balancing traffic to Edge Processors in Amazon EKS

Now that our Edge Processor deployment is up and running and scaled to our needs, the final step in our scaling process is to create a path from our data sources into the Edge Processor instances running in containers.

Kubernetes ingress is a broad topic with many available solutions, depending on your Kubernetes implementation. This series assumes the use of a very basic Amazon EKS deployment of Kubernetes to keep the process as simple as possible. However, the concepts and steps can be adapted to other Kubernetes environments.

In this article, we’ll use a Kubernetes LoadBalancer service to expose the containers to external sources. In Amazon EKS, creating a LoadBalancer service automatically provisions an AWS Network Load Balancer (NLB). On other Kubernetes platforms, you might need to provision and configure your own NLBs or Application Load Balancers (ALBs).
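
The exact load balancer type you get depends on your cluster’s load balancer controller and its defaults. As a sketch, assuming the legacy in-tree AWS cloud provider, you can request an NLB explicitly by annotating the service metadata (the AWS Load Balancer Controller uses its own annotation values instead):

metadata:
  name: ep-service
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: "nlb"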

The manifest for creating a LoadBalancer service

Here is a sample manifest for a LoadBalancer service.

apiVersion: v1
kind: Service
metadata:
  name: ep-service
spec:
  type: LoadBalancer
  ports:
    - name: s2s
      protocol: TCP
      port: 9997
      targetPort: 9997
    - name: hec
      protocol: TCP
      port: 8088
      targetPort: 8088
  selector:
    app: ep

While there are many other options available to fine-tune and control your data ingress rules, this manifest represents the minimum configuration needed to establish an ingress data path.

This manifest assumes that your Edge Processor will accept data from Splunk clients on the standard port of TCP/9997 and HTTP Event Collector data on TCP/8088. Because Edge Processor also supports syslog on arbitrary TCP and UDP ports, you might need to add those ports and protocols to this list.

Additionally, by placing a load balancer between our data sources and our containers, we gain some flexibility between the ports that our data sources send to and the actual ports that our Edge Processors listen on. This helps in scenarios where the global port settings of your Edge Processors can’t easily be changed but your data sources require a different port. The most common example is syslog, where your servers are locked to a well-known port such as 514 but your Edge Processors listen on a port such as 5514.
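
For example, here is a minimal sketch of an extra entry for the ports list above, assuming your servers send syslog over TCP/514 and your Edge Processors listen on TCP/5514:

    - name: syslog
      protocol: TCP
      port: 514
      targetPort: 5514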

Finally, our selector needs to match the pod label we set in our deployment manifest from the previous article (Cloud version / OnPrem version). In your deployments, you will most likely customize these labels to match and support your overall topology. We’re keeping things simple in this manifest with just ep.
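
To create the service, apply the manifest with kubectl (the file name ep-service.yaml is just an example):

kubectl apply -f ep-service.yaml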

After you’ve applied this manifest, AWS provisions the NLB, and your agents can begin sending data to the NLB’s DNS name on the appropriate ports; that data is then routed to your Edge Processors. You can find the DNS name using kubectl get svc, or you can review the provisioned load balancers in the AWS console or CLI.
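
If you want just the DNS name, you can also read it directly from the service status (a sketch using the ep-service name from the manifest above):

kubectl get svc ep-service -o jsonpath='{.status.loadBalancer.ingress[0].hostname}'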

Autoscaling Edge Processor using Amazon EKS horizontal pod autoscaling

A key benefit of running Edge Processor in Kubernetes is the ability to scale up resources dynamically based on the utilization of the pods, using horizontal pod autoscaling (HPA). As with load balancing, EKS offers much of this capability out of the box without requiring many prerequisites to prepare and configure. Before getting into the HPA itself, let’s quickly explore the dimensions we’ll use for scaling.

In many horizontal autoscaling situations, a wide variety of system and application metrics can be measured and used to determine when to scale. The same is true for Edge Processor, but for this example we’ll focus on CPU and memory, which tend to be the best indicators that additional resources are needed. Over time, as you refine your metrics collection and find trends in your system usage, you might decide that network traffic or other Edge Processor-specific metrics are more important, and you can adjust your autoscaling rules accordingly.

To get started, we need to install and enable the Kubernetes metrics server. By doing this, we expose memory as a measurable metric since CPU is the only metric exposed in EKS by default. With both CPU and memory available, we can build our HPA.

Installing the metrics server


Run the following command to install the metrics server directly from GitHub:

kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml

It will take a few minutes after deploying the metrics server before we have data to report. You can check the metrics using the following command:

kubectl get -n default --raw "/apis/metrics.k8s.io/v1beta1/pods"

If you get the error “Error from server (ServiceUnavailable): the server is currently unable to handle the request”, the metrics server hasn’t finished initializing.
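
Once the metrics server is serving data, you can also spot-check per-pod CPU and memory usage with kubectl top:

kubectl top pods -n default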

Understand the HPA manifest

Here is an example HPA manifest:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: ep-deployment-hpa
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: ep-deployment
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 70
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70

Explanation

apiVersion: autoscaling/v2 
kind: HorizontalPodAutoscaler 
metadata: 
  name: ep-deployment-hpa 
  namespace: default

Boilerplate HPA manifest. Customize it for your environment.
spec: 
  scaleTargetRef: 
    apiVersion: apps/v1 
    kind: Deployment 
    name: ep-deployment 
  minReplicas: 2 
  maxReplicas: 10

Selects the autoscaling target and sets the minimum and maximum number of replicas.

This scales the number of replicas only; it does not change the CPU or memory assigned to each pod.

  metrics:
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 70
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70

The metrics to measure and the utilization thresholds that trigger scaling.

You can learn more about how the horizontal pod autoscaler uses metrics to determine scale in the Kubernetes HorizontalPodAutoscaler documentation.
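
Apply the HPA manifest the same way as the service manifest (the file name ep-hpa.yaml is just an example):

kubectl apply -f ep-hpa.yaml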

After you’ve applied your HPA manifest, you can review the status of the autoscaler with the following:

kubectl get hpa

The output displays current CPU and memory utilization and the number of replicas currently managed by the autoscaler. New pods are automatically added to your load balancer for zero-touch scaling.
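
For more detail on scaling activity, including the events recorded when replicas are added or removed, you can describe the autoscaler by name (using the ep-deployment-hpa name from the manifest above):

kubectl describe hpa ep-deployment-hpa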


Next steps

With these steps, you now have a scalable, load-balanced Edge Processor infrastructure in Kubernetes, ready to handle your data routing requirements.