Understanding and defending against threats to AI apps

Last updated
Save as PDF
Share
1. Share
2. Tweet
3. Share

Due to their non-deterministic nature, AI applications introduce unique security challenges and risks that traditional security tools weren't designed to address. Unlike traditional applications, AI models can produce unpredictable outputs, which makes them susceptible to prompt injection attacks, data extraction attempts, and safety violations.

This article provides an introduction to different types of AI risk and how you can protect against them, with a focus on first-party applications, for example, an internal AI system used by your organization's employees that uses a RAG database or a customer service chatbot used by your organization's customers. The scope of this article doesn't include third-party AI systems that are not developed by your internal IT teams.

Understanding different types of AI risk

It's important to understand the ways that AI risk are classified, so SOC analysts can:

Classify incoming alerts using standardized terminology
Prioritize response efforts based on threat category severity and business impact
Communicate effectively with stakeholders using industry-recognized framework references
Build detections that align with industry-standard techniques

AI risks can be broadly categorized into two main categories of risks, as shown in the following table.

Safety risks	Security risks
Hate speech and toxicity in responses Inappropriate content generation Hallucinations and misinformation Harmful advice or instructions	Prompt injection attacks Sensitive data extraction Model jailbreaking Meta prompt extraction Infrastructure compromise

These risks increase as AI applications become more complex and handle more sensitive data, progressing from simple chatbots to RAG-enabled applications to agentic AI systems with autonomous decision-making capabilities.

Understanding the threats facing AI systems requires familiarity with three key industry frameworks that categorize and map these risks:

OWASP LLM Top 10: The Open Worldwide Application Security Project (OWASP) has established a Top 10 list of the most critical security risks for Large Language Model applications, as well as a Top 10 for Agentic Applications. This framework helps teams prioritize remediation efforts by ranking the most critical and commonly exploited vulnerabilities in LLM applications.
MITRE ATLAS Framework: The MITRE ATLAS (Adversarial Threat Landscape for AI Systems) framework maps the adversarial attack lifecycle specifically for machine learning systems. It follows a kill-chain methodology similar to the traditional MITRE ATT&CK framework but adapted for AI-specific threats. This framework helps security teams understand that AI attacks follow a structured progression from initial reconnaissance of AI assets through to ultimate impact, whether that's data exfiltration, model theft, or operational disruption.
NIST Adversarial Machine Learning (AI 100-2): The National Institute of Standards and Technology (NIST) published the Adversarial Machine Learning taxonomy (NIST AI 100-2 E2023) to provide a standardized vocabulary and classification for attacks targeting AI systems. This framework categorizes threats across the AI lifecycle and defines the attacker goals, capabilities, and knowledge levels required to run them.

Cisco AI Defense, which integrates with Splunk Enterprise Security, uses the Integrated AI Security and Safety Framework to assist AI and security communities in navigating AI security and safety threats. This vendor-agnostic framework includes descriptions, examples, and mappings to AI security standards that Cisco co-developed alongside all three frameworks mentioned above.

The industry-standard frameworks mentioned above are complex and sometimes overlapping, and there are additional frameworks you might wish to use, depending on your organization and industry. The Integrated AI Security and Safety Framework is designed to be mutually exclusive and collectively exhaustive in terms of the kinds of threats that are faced.

A three-step framework for developing secure AI applications

You can use a three-pillar framework for securing AI applications throughout their lifecycle: Discovery, Detection, and Protection.

Discovery: Uncover AI assets

The first pillar focuses on gaining visibility into the AI assets within your environment. This includes:

Identifying all AI models in use across the organization
Cataloging AI agents, MCP servers and their associated permissions
Mapping datasets used for training and fine-tuning

This discovery process helps organizations understand their AI attack surface and identify "Shadow AI" that might have been deployed without proper security oversight.

Detection: Perform red teaming

The second pillar focuses on proactively identifying vulnerabilities in AI models before they can be exploited. You can use red teaming techniques like:

Vulnerability assessment: Using techniques like TAP (Tree of Attacks with Pruning) to systematically test model susceptibility to adversarial prompts. Identifying safety and security vulnerabilities across models with automated red teaming
Scanning AI supply chain for threats: Scanning model files, repos, and MCP servers to proactively block malicious or unsafe AI assets before operations are impacted
Continuous testing: Integrating with CI/CD workflows and performing ongoing assessments as models are updated or fine-tuned

Protection: Mitigate threats in real time

The third pillar focuses on enforcing security policies at runtime to block attacks in real-time:

Input validation: Detecting and blocking attacks like prompt injection attempts before they reach the model
Output filtering: Preventing sensitive data, system prompts, or harmful content from being returned to users
Policy enforcement: Applying organization-specific rules for acceptable AI behavior

Integrating Cisco AI Defense with Splunk Enterprise Security

Cisco AI Defense integrates with Splunk Enterprise Security (ES) using the Cisco Security Cloud app. This unified app serves as the single integration point for all Cisco Security products, including AI Defense, and aligns with the best practices discussed in this article.

By integrating with AI Defense, security teams can protect against AI risks across the Threat Detection, Investigation and Response (TDIR) workflow by:

Pulling in alerts and incidents from AI Defense, mapping them to the Splunk Common Information Model (CIM), correlating them with ES, and visualizing them in dashboards
Improving AI risk detection accuracy and accelerating investigations using curated AI risk detections from Cisco Threat Intelligence Labs aggregated with broader security context from ES
Enhancing security operations to identify AI risks across applications during development and at runtime
Performing AI model validation and vulnerability assessment
Leveraging an out-of-the-box ES detection that creates a search and surfaces potential attacks against the AI models running in your environment

The screenshots below show examples of AI Defense-generated analytics within Splunk Enterprise Security:

The Cisco AI Defense dashboard:

Findings within the analyst queue:

And a drill-down into these findings:

Additional resources

The content in this article comes from a .Conf presentation, one of the thousands of Splunk resources available to help users succeed.

In addition, these resources might help you understand and implement this guidance:

Cisco Solution Overview: Cisco AI Defense
Cisco Docs: Cisco Security Cloud user guide
Cisco Blog: Using AI to automatically jailbreak GPT-4 and other LLMs in under a minute
Cisco Blog: Bypassing OpenAI’s structured outputs: Another simple jailbreak
Cisco Blog: Extracting training data from chatbots
Splunk OnDemand Services: Use these credit-based services for direct access to Splunk technical consultants with a variety of technical services from a pre-defined catalog. Most customers have OnDemand Services per their Success Plan. Engage the ODS team at ondemand@cisco.com if you would like assistance.