Understanding and defending against threats to AI apps
Due to their non-deterministic nature, AI applications introduce unique security challenges and risks that traditional security tools weren't designed to address. Unlike traditional applications, AI models can produce unpredictable outputs, which makes them susceptible to prompt injection attacks, data extraction attempts, and safety violations.
This article provides an introduction to different types of AI risk and how you can protect against them, with a focus on first-party applications, for example, an internal AI system used by your organization's employees that uses a RAG database or a customer service chatbot used by your organization's customers. The scope of this article doesn't include third-party AI systems that are not developed by your internal IT teams.
Understanding different types of AI risk
It's important to understand the ways that AI risk are classified, so SOC analysts can:
- Classify incoming alerts using standardized terminology
- Prioritize response efforts based on threat category severity and business impact
- Communicate effectively with stakeholders using industry-recognized framework references
- Build detections that align with industry-standard techniques
AI risks can be broadly categorized into two main categories of risks, as shown in the following table.
| Safety risks | Security risks |
|---|---|
|
|
These risks increase as AI applications become more complex and handle more sensitive data, progressing from simple chatbots to RAG-enabled applications to agentic AI systems with autonomous decision-making capabilities.
Understanding the threats facing AI systems requires familiarity with three key industry frameworks that categorize and map these risks:
- OWASP LLM Top 10: The Open Worldwide Application Security Project (OWASP) has established a Top 10 list of the most critical security risks for Large Language Model applications, as well as a Top 10 for Agentic Applications. This framework helps teams prioritize remediation efforts by ranking the most critical and commonly exploited vulnerabilities in LLM applications.
- MITRE ATLAS Framework: The MITRE ATLAS (Adversarial Threat Landscape for AI Systems) framework maps the adversarial attack lifecycle specifically for machine learning systems. It follows a kill-chain methodology similar to the traditional MITRE ATT&CK framework but adapted for AI-specific threats. This framework helps security teams understand that AI attacks follow a structured progression from initial reconnaissance of AI assets through to ultimate impact, whether that's data exfiltration, model theft, or operational disruption.
- NIST Adversarial Machine Learning (AI 100-2): The National Institute of Standards and Technology (NIST) published the Adversarial Machine Learning taxonomy (NIST AI 100-2 E2023) to provide a standardized vocabulary and classification for attacks targeting AI systems. This framework categorizes threats across the AI lifecycle and defines the attacker goals, capabilities, and knowledge levels required to run them.
Cisco AI Defense, which integrates with Splunk Enterprise Security, uses the Integrated AI Security and Safety Framework to assist AI and security communities in navigating AI security and safety threats. This vendor-agnostic framework includes descriptions, examples, and mappings to AI security standards that Cisco co-developed alongside all three frameworks mentioned above.
The industry-standard frameworks mentioned above are complex and sometimes overlapping, and there are additional frameworks you might wish to use, depending on your organization and industry. The Integrated AI Security and Safety Framework is designed to be mutually exclusive and collectively exhaustive in terms of the kinds of threats that are faced.
A three-step framework for developing secure AI applications
You can use a three-pillar framework for securing AI applications throughout their lifecycle: Discovery, Detection, and Protection.
Discovery: Uncover AI assets
The first pillar focuses on gaining visibility into the AI assets within your environment. This includes:
- Identifying all AI models in use across the organization
- Cataloging AI agents, MCP servers and their associated permissions
- Mapping datasets used for training and fine-tuning
This discovery process helps organizations understand their AI attack surface and identify "Shadow AI" that might have been deployed without proper security oversight.
Detection: Perform red teaming
The second pillar focuses on proactively identifying vulnerabilities in AI models before they can be exploited. You can use red teaming techniques like:
- Vulnerability assessment: Using techniques like TAP (Tree of Attacks with Pruning) to systematically test model susceptibility to adversarial prompts. Identifying safety and security vulnerabilities across models with automated red teaming
- Scanning AI supply chain for threats: Scanning model files, repos, and MCP servers to proactively block malicious or unsafe AI assets before operations are impacted
- Continuous testing: Integrating with CI/CD workflows and performing ongoing assessments as models are updated or fine-tuned
Protection: Mitigate threats in real time
The third pillar focuses on enforcing security policies at runtime to block attacks in real-time:
- Input validation: Detecting and blocking attacks like prompt injection attempts before they reach the model
- Output filtering: Preventing sensitive data, system prompts, or harmful content from being returned to users
- Policy enforcement: Applying organization-specific rules for acceptable AI behavior
Integrating Cisco AI Defense with Splunk Enterprise Security
Cisco AI Defense integrates with Splunk Enterprise Security (ES) using the Cisco Security Cloud app. This unified app serves as the single integration point for all Cisco Security products, including AI Defense, and aligns with the best practices discussed in this article.
By integrating with AI Defense, security teams can protect against AI risks across the Threat Detection, Investigation and Response (TDIR) workflow by:
- Pulling in alerts and incidents from AI Defense, mapping them to the Splunk Common Information Model (CIM), correlating them with ES, and visualizing them in dashboards
- Improving AI risk detection accuracy and accelerating investigations using curated AI risk detections from Cisco Threat Intelligence Labs aggregated with broader security context from ES
- Enhancing security operations to identify AI risks across applications during development and at runtime
- Performing AI model validation and vulnerability assessment
- Leveraging an out-of-the-box ES detection that creates a search and surfaces potential attacks against the AI models running in your environment
The screenshots below show examples of AI Defense-generated analytics within Splunk Enterprise Security:

The Cisco AI Defense dashboard:

Findings within the analyst queue:

And a drill-down into these findings:

Additional resources
The content in this article comes from a .Conf presentation, one of the thousands of Splunk resources available to help users succeed.
In addition, these resources might help you understand and implement this guidance:
- Cisco Solution Overview: Cisco AI Defense
- Cisco Docs: Cisco Security Cloud user guide
- Cisco Blog: Using AI to automatically jailbreak GPT-4 and other LLMs in under a minute
- Cisco Blog: Bypassing OpenAI’s structured outputs: Another simple jailbreak
- Cisco Blog: Extracting training data from chatbots

