Managing telecommunications content delivery
In your role managing content delivery for a telecommunications organization, you need to monitor many potential issues, including response times, cache hit ratios, total traffic, HTTP errors, and last-mile services. In addition, executives want information on content delivery revenue and volume so they can plan accordingly. This guide provides a number of searches for gathering content delivery data, from which you can create dashboards for both the operational and the business use case.
Data required
- Content delivery network (CDN) operations logs
- Content provider data with customer IDs
Procedures
- Splunk recommends that customers look into using data models, report acceleration, or summary indexing when searching across hundreds of GBs of events in a single search. The searches provided here are a good starting point, but depending on your data, search time range, and other factors, more can be done to ensure that they scale appropriately.
- Your typical telecommunications transactions might include more than four steps, and some commands, parameters, and field names in the searches below might need to be adjusted to match your environment. In addition, to optimize the searches shown below, you should specify an index and a time range when appropriate.
Stage 1: Search and investigation
Average response time for content delivery
This search gives an understanding of the average response times in the CDN. The locations in the CDN where such values are high provide insight as to where technical issues might exist and which issues might need to be investigated.
This search requires CDN application logs with time series data and the response time for each request.
sourcetype=<content delivery network general operations> | eval time=strptime(_time,"%Y-%m-%d %H:%M:%S") | convert timeformat="%Y-%m-%d %H:%M:%S" mktime(_time) AS _time | sort 0 _time | fields _time responseTime | timechart fixedrange=false span=1m avg(responseTime) AS avgResponseTime | eval avgResponseTime=round(avgResponseTime,3)
Cache hit ratio with threshold alerts
This example shows the ratio of requests that are served from cached content and raises an alert when that ratio falls below a defined threshold. Understanding how much content is being delivered by CDN caches might allow you to better understand how to operationalize your networks for content delivery. Alerting when your cache hit ratio drops below the threshold might help you provide a better customer experience and reduce costs associated with transfers.
This search requires CDN application logs with time-series data for a total message size transferred per request, the content provider/identifier, and the status of the content, for example, if the requested content is cached or not. Additionally, a defined threshold for the alert is required and must be set in the search itself (in this case: 50%). This threshold can be adjusted to better reflect the service’s operational requirements.
sourcetype=<content delivery network general operations> | eval _time=strptime(_time,"%Y-%m-%d %H:%M:%S") | fields _time messageBytes cached cp | stats count values(messageBytes) AS totalBytes BY _time cp cached | stats sum(count) AS totalRequests sum(totalBytes) AS totalBytes BY cached _time | stats sum(totalRequests) AS totalRequested sum(eval(totalRequests*if(cached=="true",1,0))) AS "cached" BY _time | eval cacheRequestedHitRatio=round(cached/totalRequested * 100, 2) | timechart span=5min fixedrange=false avg(cacheRequestedHitRatio) AS avgCacheRequestedHitRatio | eval lowCacheHitThreshold = 50.0, avgCacheRequestedHitRatio=round(avgCacheRequestedHitRatio,2) | where avgCacheRequestedHitRatio < lowCacheHitThreshold | table _time avgCacheRequestedHitRatio
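To sanity-check the threshold logic outside Splunk, the following Python sketch reproduces the idea on a handful of made-up events. The sample data, function name, and constants are hypothetical. As a simplification, it pools all requests in each 5-minute bucket, whereas the search averages the per-timestamp ratios within each bucket, so the two can differ slightly on real data.

```python
from collections import defaultdict

# Hypothetical sample events: (epoch seconds, value of the "cached" field).
events = [
    (0, "true"), (60, "false"), (120, "false"), (200, "false"),  # bucket 0
    (300, "true"), (360, "true"), (420, "false"),                 # bucket 300
]

LOW_CACHE_HIT_THRESHOLD = 50.0  # percent, the same value used in the search
BUCKET_SECONDS = 300            # mirrors span=5min


def low_hit_ratio_buckets(events, threshold=LOW_CACHE_HIT_THRESHOLD):
    """Return {bucket_start: hit ratio in percent} for buckets below the threshold."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for ts, cached in events:
        bucket = ts - ts % BUCKET_SECONDS  # floor the timestamp to its bucket
        totals[bucket] += 1
        if cached == "true":
            hits[bucket] += 1
    ratios = {b: round(hits[b] / totals[b] * 100, 2) for b in totals}
    return {b: r for b, r in sorted(ratios.items()) if r < threshold}


alerts = low_hit_ratio_buckets(events)
print(alerts)
```

In this sample, only the first bucket (1 cached request out of 4) falls below the 50% threshold; the second bucket (2 out of 3) does not.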
High last-mile round-trip times for content services
This search finds round-trip times for the "last mile" of the CDN that exceed a defined threshold. Understanding latency from the core/backhaul network to the last mile, or edge services/customer premises equipment (CPE), might enable informed decisions about capacity management and planning. High round-trip times can indicate high load on the network and a potential need for more capacity.
This search requires CDN application logs with time-series data for the round-trip time of last-mile equipment. Additionally, an alerting threshold value is defined in the search (in this case, 49.95, in the same units as the round-trip time field). This threshold can be adjusted to better reflect the service’s operational requirements.
sourcetype=<content delivery network general operations> | eval time=strptime(_time,"%Y-%m-%d %H:%M:%S") | convert timeformat="%Y-%m-%d %H:%M:%S" mktime(_time) AS _time | sort 0 _time | fields _time netPerflastMileRTT cp | stats avg(netPerflastMileRTT) AS avgLastMileRTT values(cp) AS CPE BY _time cp | eval avgLastMileRTT=round(avgLastMileRTT,3), LastMileHigh = 49.95 | where avgLastMileRTT > LastMileHigh | table _time cp avgLastMileRTT
High-latency edge networks on the CDN
This example identifies the edge or regional CDNs with the slowest average response times. If you can identify when edges or regions are experiencing high response times and degradation in services, you can better troubleshoot and remediate such issues. You might also be able to use such data to better understand regional user behavior and forecast network capacity.
This search requires CDN application logs with time-series data for the response time and IP address data for clients accessing the CDN.
sourcetype=<content delivery network general operations> | eval time=strptime(_time,"%Y-%m-%d %H:%M:%S") | convert timeformat="%Y-%m-%d %H:%M:%S" mktime(_time) AS _time | sort 0 _time | fields _time responseTime requestClientIp | iplocation requestClientIp | eval City=case(City=="","Unknown",1=1,City) | eval Region=case(Region=="","Unknown",1=1,Region) | eval HostingEdge = Region." - ".Country | stats avg(responseTime) AS avgResponseTime BY HostingEdge | eval "Average Response Time"=round(avgResponseTime,3) | rename HostingEdge AS "Edge Hosting Region" | sort - "Average Response Time" | fields "Edge Hosting Region" "Average Response Time"
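The grouping and ranking that this search performs can be illustrated with a short Python sketch. The sample events and the function name are hypothetical, and the region and country values stand in for what iplocation would return for each client IP address.

```python
from collections import defaultdict


def slowest_edges(events):
    """events: iterable of (region, country, response_time) tuples.

    Returns (edge label, average response time) pairs, slowest first.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for region, country, response_time in events:
        # An empty region is labeled "Unknown", as in the search.
        edge = f"{region or 'Unknown'} - {country}"
        sums[edge] += response_time
        counts[edge] += 1
    averages = {edge: round(sums[edge] / counts[edge], 3) for edge in sums}
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)


# Hypothetical sample: (Region, Country, responseTime).
sample = [
    ("Texas", "United States", 0.120),
    ("Texas", "United States", 0.180),
    ("", "Germany", 0.400),
    ("Bavaria", "Germany", 0.090),
]
ranking = slowest_edges(sample)
for edge, avg in ranking:
    print(edge, avg)
```

The edge with an empty region is kept rather than dropped, so traffic that cannot be geolocated to a region still appears in the ranking.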
Total cache hit ratio of traffic within alert thresholds
This example shows the share of traffic, measured in bytes, that is served from your CDN cache, and flags intervals where that share falls below a defined threshold. Based on the value of this ratio, the business might be able to make informed decisions about where to increase or decrease caching capabilities.
This search requires CDN application logs with time-series data for a total message size transferred per request and the status of the content, for example, if the requested content is cached or not. The alert threshold is defined within the search, in this case 48%. This threshold can be adjusted to better reflect the service’s operational requirements.
sourcetype=<content delivery network general operations> | eval _time=strptime(_time,"%Y-%m-%d %H:%M:%S") | fields _time messageBytes cached | timechart span=5m fixedrange=false sum(messageBytes) AS "totalBytes" sum(eval(messageBytes * if(cached=="true", 1, 0))) AS "cachedBytes" | eval cacheTrafficHitRatio= round(cachedBytes/totalBytes * 100, 2), lowHitRatio = 48 | where cacheTrafficHitRatio < lowHitRatio | table _time cacheTrafficHitRatio
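A traffic-based hit ratio weights each request by its size, so it can differ noticeably from a request-based ratio. The Python sketch below works through the arithmetic for a single time bucket. The sample values and the function name are hypothetical.

```python
LOW_HIT_RATIO = 48  # percent, the same value used in the search


def cache_traffic_hit_ratio(events):
    """events: list of (message_bytes, cached_flag) pairs for one time bucket.

    Returns the percentage of bytes served from cache, or None if no bytes moved.
    """
    total_bytes = sum(size for size, _ in events)
    if total_bytes == 0:
        return None
    cached_bytes = sum(size for size, cached in events if cached == "true")
    return round(cached_bytes / total_bytes * 100, 2)


# Hypothetical bucket: 2 of 4 requests are cached, but they are the small ones.
sample = [(1000, "true"), (3000, "false"), (500, "true"), (500, "false")]
ratio = cache_traffic_hit_ratio(sample)
is_alert = ratio is not None and ratio < LOW_HIT_RATIO
print(ratio, is_alert)
```

Half of the sample requests are cache hits, but only 1,500 of the 5,000 bytes come from cache, so the traffic-based ratio is 30% and the bucket would be flagged.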
Total traffic transferred over time with thresholds
This example shows the total amount of traffic your CDN uses and how to alert based on static thresholds for maximum or minimum traffic loads. When traffic is high, you can use this data to decide if you want to increase capacity or content delivery pricing.
This search requires CDN application logs with time-series data and total bytes of transferred data.
sourcetype=<content delivery network general operations> | eval time=strptime(_time,"%Y-%m-%d %H:%M:%S") | convert timeformat="%Y-%m-%d %T" mktime(_time) AS _time | sort 0 _time | fields _time messageBytes | eval trafficMbs=messageBytes/pow(1024,2) | timechart span=15m fixedrange=false sum(trafficMbs) AS trafficMbs | eval alertLowLevel = 1800.0 | eval alertHighLevel = 4350.0 | where ( trafficMbs < alertLowLevel ) OR ( trafficMbs > alertHighLevel ) | table _time trafficMbs
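Note that trafficMbs is a volume rather than a rate: it is the number of bytes in each 15-minute bucket divided by 1024 squared. The Python sketch below applies the same conversion and the same two static thresholds to hypothetical per-bucket byte totals. The bucket labels and the function name are made up for illustration.

```python
ALERT_LOW_MB = 1800.0   # same value as alertLowLevel in the search
ALERT_HIGH_MB = 4350.0  # same value as alertHighLevel in the search


def out_of_range(bucket_bytes):
    """bucket_bytes: {bucket label: total bytes transferred in that bucket}.

    Returns the buckets whose traffic, in units of 1024**2 bytes, is outside the band.
    """
    flagged = {}
    for bucket, total_bytes in bucket_bytes.items():
        traffic_mb = total_bytes / 1024 ** 2
        if traffic_mb < ALERT_LOW_MB or traffic_mb > ALERT_HIGH_MB:
            flagged[bucket] = round(traffic_mb, 1)
    return flagged


# Hypothetical 15-minute buckets.
sample = {
    "09:00": 1_000 * 1024 ** 2,  # below the low threshold
    "09:15": 2_500 * 1024 ** 2,  # inside the band
    "09:30": 5_000 * 1024 ** 2,  # above the high threshold
}
flagged = out_of_range(sample)
print(flagged)
```

Because the thresholds are static, they need to be revisited whenever the bucket size changes: a 1-minute span would produce much smaller per-bucket totals than a 15-minute span.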
Stage 2: Proactive monitoring
Last-mile round-trip times for content services
This example gives the average round-trip time for the "last mile" of the CDN. Understanding latency from the core/backhaul network to the "last mile", or edge services/customer premises equipment (CPE), can help enable informed decisions about capacity management and planning. When these times are high, such signals might indicate that there is high load on the network and that capacity should potentially be increased.
This search requires CDN application logs with time-series data for the round-trip time of “last-mile” equipment.
sourcetype=<content delivery network general operations> | eval time=strptime(_time,"%Y-%m-%d %H:%M:%S") | convert timeformat="%Y-%m-%d %T" mktime(_time) AS _time | fields _time netPerflastMileRTT | sort 0 _time | timechart fixedrange=false span=1m avg(netPerflastMileRTT) AS avgLastMileRTT
Total number of HTTP request errors over time
This search allows you to find the total number of HTTP errors within your CDN. Identifying where such values are high might give insight into where you have hardware, software, or general content delivery issues.
This search requires CDN application logs with time-series data that include the HTTP response code for each request. At a minimum, the logs must contain response codes of 400 and above.
sourcetype=<content delivery network general operations> | eval time=strptime(_time,"%Y-%m-%d %H:%M:%S") | convert timeformat="%Y-%m-%d %T" mktime(_time) AS _time | sort 0 _time | fields _time status | eval statusFailed=if(status >= 400, 1, 0) | search statusFailed="1" | timechart fixedrange=false span=1m sum(statusFailed) AS "Error Events"
Total traffic transferred over time
This search shows the total amount of traffic within the CDN. Identifying when and where traffic is high gives insight into where you might need to increase capacity or increase content delivery pricing.
This search requires CDN application logs with time-series data and total bytes of data transferred.
sourcetype=<content delivery network general operations> | eval time=strptime(_time,"%Y-%m-%d %H:%M:%S") | convert timeformat="%Y-%m-%d %T" mktime(_time) AS _time | sort 0 _time | fields _time messageBytes | eval trafficMbs=messageBytes/pow(1024,2) | timechart span=1m fixedrange=false sum(trafficMbs) AS trafficMbs
Stage 3: Operational visibility
Average bandwidth used over time by the CDNs
This search gives a visual representation of how much bandwidth is used to consume content from the CDN. This data helps with more informed decisions about network capacity and planning.
This search requires CDN application logs with time-series data for the amount of data downloaded and the total time it took for the download to complete. This information is then used to calculate an average bandwidth used.
sourcetype=<content delivery network general operations> | eval time=strptime(_time,"%Y-%m-%d %H:%M:%S") | convert timeformat="%Y-%m-%d %T" mktime(_time) AS _time | fields _time messageBytes netPerfdownloadTime | sort 0 _time | eval bandwidthUsedMbps = ((messageBytes/netPerfdownloadTime)*1000/1024/128) | timechart fixedrange=false span=1m avg(bandwidthUsedMbps) AS avgBandwidthUsedMbps
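The unit conversion in the bandwidthUsedMbps expression can be unpacked step by step. The Python sketch below assumes that netPerfdownloadTime is recorded in milliseconds, which is what the multiplication by 1000 implies but is not stated in the guide. The function name and sample values are hypothetical.

```python
def bandwidth_mbps(message_bytes, download_time_ms):
    """Mirror (messageBytes / netPerfdownloadTime) * 1000 / 1024 / 128.

    Assumes download_time_ms is in milliseconds (an inference, not a documented fact).
    """
    bytes_per_second = message_bytes / download_time_ms * 1000  # ms -> s
    kib_per_second = bytes_per_second / 1024                    # bytes -> KiB
    return kib_per_second / 128                                 # 128 KiB = 1 Mibit


# Hypothetical request: 1,310,720 bytes (1.25 MiB, or 10 Mibit) downloaded in 2 seconds.
example = bandwidth_mbps(message_bytes=1_310_720, download_time_ms=2_000)
print(example)
```

For this sample request the result is approximately 5, that is, 10 Mibit delivered over 2 seconds. If your logs record download time in seconds instead, the factor of 1000 should be dropped.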
Cache hit ratio by total requests
This search looks at the ratio of requests that access content from the cache network, not directly from the central content provider. Understanding how much content is being delivered by CDN caches can help you ascertain how to better operationalize your networks for content delivery.
This search requires CDN application logs with time-series data for a total message size transferred per request, the content provider/identifier, and the status of the content, for example, if the requested content is cached or not.
sourcetype=<content delivery network general operations> | eval time=strptime(_time,"%Y-%m-%d %H:%M:%S") | fields _time messageBytes cached cp | stats count values(messageBytes) AS totalBytes BY _time cp cached | stats sum(count) AS totalRequests sum(totalBytes) AS totalBytes BY cached | stats sum(totalRequests) AS totalRequested sum(eval(totalRequests*if(cached=="true",1,0))) AS "cached" | eval cacheRequestedHitRatio=round(cached/totalRequested * 100, 2) | fields cacheRequestedHitRatio
Total cache hit ratio by traffic
This search looks at the ratio of traffic that accesses content from the cache network, not directly from the central content provider. This data can help you make more informed decisions about where to increase or decrease caching capabilities.
This search requires CDN application logs with time-series data for a total message size transferred per request and the status of the content, for example, if the requested content is cached or not.
sourcetype=<content delivery network general operations> | eval time=strptime(_time,"%Y-%m-%d %H:%M:%S") | fields _time messageBytes cached | stats sum(messageBytes) AS "totalBytes", sum(eval(messageBytes * if(cached=="true",1,0))) AS "cachedBytes" | eval cacheTrafficHitRatio=round(cachedBytes/totalBytes*100,2) | fields cacheTrafficHitRatio
Requests by geographic locations
This search provides a visual representation of where requests for content originate from. Understanding where your requests come from can help with more informed decisions about network capacity and planning.
This search requires CDN application logs with time-series data and IP addresses of requestors.
sourcetype=<content delivery network general operations> | eval _time=strptime(_time,"%Y-%m-%d %H:%M:%S") | convert timeformat="%Y-%m-%d %H:%M:%S" mktime(_time) AS _time | sort 0 _time | fields _time requestClientIp | iplocation requestClientIp | geostats count latfield=lat longfield=lon
Total number of requests for the CDNs
This search provides the total number of requests that your CDN serves, including both cached and non-cached content. This data can help you make more informed decisions about when and where to increase or decrease caching capabilities.
This search requires CDN application logs with time-series data for a total message size transferred per request and the status of the content, for example, if the requested content is cached or not.
sourcetype=<content delivery network general operations> | eval _time=strptime(_time, "%Y-%m-%d %H:%M:%S") | stats count
Total amount of traffic served by the CDNs
This search shows the total amount of data transferred out of a CDN, which can help you make more informed decisions about when and where to increase or decrease caching capabilities.
This search requires CDN application logs with time-series data for a total message size transferred per request.
sourcetype=<content delivery network general operations> | eval time=strptime(_time,"%Y-%m-%d %H:%M:%S") | fields _time messageBytes | stats sum(messageBytes) AS totalTrafficBytes | eval totalTrafficGBytes=totalTrafficBytes/pow(10,9) | fields totalTrafficGBytes
Stage 4: Business insights
Highest- and lowest-volume content customers by requests
Having a real-time understanding of the amount of content being delivered for customers might be beneficial, particularly if you are able to identify who are the highest- and lowest-volume customers. This information helps you make informed decisions about capacity, billing rates, services, and marketing campaigns.
This search requires CDN logs that contain the requests and an identifier for the customer. This data is then enriched with a lookup that uses the customer identifier. Then the search calculates the number of requests against the CDN.
sourcetype=<content delivery network general operations> | eval time=strptime(_time,"%Y-%m-%d %H:%M:%S") | fields cp | lookup <cdn contentproviders> contentProviderCode AS cp | stats count BY contentProviderDescription | sort - count | head 5
Total revenue generated
This search provides real-time insight into the revenue generated across services, which helps the business plan and prioritize.
This search requires CDN logs with requests, IP addresses of the client, total bytes transferred, and an identifier for the customer. This data is then enriched with a lookup that uses the customer identifier and IP location data. After you have these regions, the cost for requests and data transfers for the region is added to the event. Finally, the search calculates the costs for both the requests and the data transferred, then aggregates these to a single value.
sourcetype=<content delivery network general operations> | eval time=strptime(_time,"%Y-%m-%d %H:%M:%S") | convert timeformat="%Y-%m-%d %T" mktime(_time) AS _time | sort 0 _time | fields _time cp requestClientIp status messageBytes | lookup <cdn contentproviders> contentProviderCode AS cp OUTPUT contentProviderName | iplocation requestClientIp allfields=true | fields _time cp Continent contentProviderName status messageBytes | eval megabytesTransferred=((messageBytes/pow(1024,2) )) | lookup cdn_content_costs billableRegion AS Continent OUTPUT costPerMBTransferred costPerRequest | eval xferRevenue = costPerMBTransferred * megabytesTransferred | timechart fixedrange=false span=1h eval(sum(xferRevenue) + sum(costPerRequest)) AS totalRevenue
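The revenue arithmetic can be checked by hand with a small Python sketch. The cost table below is a hypothetical stand-in for the cdn_content_costs lookup, with invented rates, and the sample events represent a single one-hour bucket. Events whose continent has no row in the table contribute nothing, which is how this sketch models a lookup that returns no match.

```python
# Hypothetical stand-in for the cdn_content_costs lookup; the rates are invented.
COSTS = {
    "Europe": {"costPerMBTransferred": 0.02, "costPerRequest": 0.001},
    "Asia": {"costPerMBTransferred": 0.05, "costPerRequest": 0.002},
}


def total_revenue(events, costs=COSTS):
    """events: iterable of (continent, message_bytes) pairs for one time bucket."""
    revenue = 0.0
    for continent, message_bytes in events:
        rate = costs.get(continent)
        if rate is None:
            continue  # no matching cost row, so the event adds no revenue
        megabytes = message_bytes / 1024 ** 2
        # Transfer revenue plus the flat per-request charge.
        revenue += rate["costPerMBTransferred"] * megabytes + rate["costPerRequest"]
    return revenue


# Hypothetical bucket: 10 MB to Europe, 2 MB to Asia, 1 MB to an unpriced region.
sample = [("Europe", 10 * 1024 ** 2), ("Asia", 2 * 1024 ** 2), ("Antarctica", 1024 ** 2)]
revenue = total_revenue(sample)
print(round(revenue, 3))
```

For this sample, the Europe event contributes 0.02 x 10 + 0.001 = 0.201 and the Asia event contributes 0.05 x 2 + 0.002 = 0.102, for a total of about 0.303 in whatever currency the cost table uses.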
Revenue being generated from requests and data transfers
A real-time understanding of the amount of content delivered for your customers might be very beneficial, especially if you are able to identify where content is delivered. With different costs across the world associated with requests against the CDN and data transferred, being able to aggregate and show revenue metrics in a central location can give real-time operational insight into where revenue is potentially generated.
These searches require CDN logs that contain the requests, IP address of the client, total bytes transferred, and an identifier for the customer. They then calculate the costs for both the requests and the data transferred.
Revenue generated by requests
sourcetype=<content delivery network general operations> | eval time=strptime(_time,"%Y-%m-%d %H:%M:%S") | convert timeformat="%Y-%m-%d %T" mktime(_time) AS _time | sort 0 _time | fields _time cp requestClientIp status | lookup <cdn contentproviders> contentProviderCode AS cp OUTPUT contentProviderName | iplocation requestClientIp allfields=true | fields _time cp Continent contentProviderName status | lookup cdn_content_costs billableRegion AS Continent OUTPUT costPerRequest | timechart fixedrange=false span=1h sum(costPerRequest) AS billableRequests
Revenue generated by data transfers
sourcetype=<content delivery network general operations> | eval time=strptime(_time,"%Y-%m-%d %H:%M:%S") | convert timeformat="%Y-%m-%d %T" mktime(_time) AS _time | sort 0 _time | fields _time cp requestClientIp status messageBytes | lookup <cdn contentproviders> contentProviderCode AS cp OUTPUT contentProviderName | iplocation requestClientIp allfields=true | fields _time cp Continent contentProviderName status messageBytes | eval megabytesTransferred=((messageBytes/pow(1024,2))) | lookup cdn_content_costs billableRegion AS Continent OUTPUT costPerMBTransferred | eval revenue = costPerMBTransferred * megabytesTransferred | timechart fixedrange=false span=1h sum(revenue) AS revenue
Next steps
You might be interested in the following additional telecommunications use cases: