Spider-Sense: Intrinsic Risk Sensing for Efficient Agent Defense with Hierarchical Adaptive Screening

Zhenxiong Yu, Zhi Yang, Zhiheng Jin, Shuhe Wang, Heng Zhang, Yanlin Fei, Lingfeng Zeng, Fangqi Lou, Shuo Zhang, Tu Hu, Jingping Liu, Rongze Chen, Xingyu Zhu, Kunyi Wang, Chaofa Yuan, Xin Guo, Zhaowei Liu, Feipeng Zhang, Jie Huang, Huacan Wang, Ronghao Chen, Liwen Zhang

Published: February 5, 2026
Authors: 22
Word Count: 10,627
Code: Includes code


Enhance agent security with real-time, adaptive risk sensing.

Abstract

As large language models (LLMs) evolve into autonomous agents, their real-world applicability has expanded significantly, accompanied by new security challenges. Most existing agent defense mechanisms adopt a mandatory checking paradigm, in which security validation is forcibly triggered at predefined stages of the agent lifecycle. In this work, we argue that effective agent security should be intrinsic and selective rather than architecturally decoupled and mandatory. We propose Spider-Sense, an event-driven defense framework based on Intrinsic Risk Sensing (IRS), which allows agents to maintain latent vigilance and trigger defenses only upon risk perception. Once triggered, Spider-Sense invokes a hierarchical defense mechanism that trades off efficiency and precision: it resolves known patterns via lightweight similarity matching while escalating ambiguous cases to deep internal reasoning, thereby eliminating reliance on external models. To facilitate rigorous evaluation, we introduce S^2Bench, a lifecycle-aware benchmark featuring realistic tool execution and multi-stage attacks. Extensive experiments demonstrate that Spider-Sense achieves competitive or superior defense performance, attaining the lowest Attack Success Rate (ASR) and False Positive Rate (FPR), with only a marginal latency overhead of 8.3%.
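The two-tier screening described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the token-level Jaccard similarity, the `high` and `low` thresholds, the `Verdict` type, and the `deep_check` callback (standing in for the agent's deep internal reasoning) are all assumptions made for illustration. It also assumes the function is called only after the agent has already sensed risk.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


def jaccard(a: str, b: str) -> float:
    """Token-set Jaccard similarity; a stand-in for any cheap similarity measure."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


@dataclass
class Verdict:
    blocked: bool
    stage: str    # "pattern_match" or "deep_reasoning"
    score: float  # best similarity against the known patterns


def screen(
    content: str,
    known_attacks: Sequence[str],
    deep_check: Callable[[str], bool],
    high: float = 0.8,  # hypothetical threshold: confident match
    low: float = 0.2,   # hypothetical threshold: confident non-match
) -> Verdict:
    """Hypothetical hierarchical screening, invoked once risk has been sensed."""
    # Tier 1: lightweight similarity matching against known attack patterns.
    best = max((jaccard(content, p) for p in known_attacks), default=0.0)
    if best >= high:
        return Verdict(True, "pattern_match", best)
    if best <= low:
        return Verdict(False, "pattern_match", best)
    # Tier 2: ambiguous score, so escalate to the slower reasoning step.
    return Verdict(deep_check(content), "deep_reasoning", best)
```

In this sketch only inputs whose similarity falls between the two thresholds pay the cost of the deeper check, which is one way to read the efficiency and precision trade-off the abstract mentions.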

Key Takeaways

1. Spider-Sense introduces intrinsic risk sensing for agent security.
2. Triggers defenses only upon perceived risk, keeping latency overhead to a marginal 8.3%.
3. Uses hierarchical adaptive screening to balance efficiency and precision in security checks.

Limitations

Requires initial setup and calibration for effective use.
May struggle with completely novel, unseen threat patterns.

Keywords

large language models, autonomous agents, security challenges, mandatory checking paradigm, event-driven defense, Intrinsic Risk Sensing, hierarchical defense mechanism, lightweight similarity matching, deep internal reasoning, lifecycle-aware benchmark, Attack Success Rate, False Positive Rate
