This is a page that curates AI-related papers published worldwide. All content here is summarized using Google Gemini and operated on a non-profit basis. Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.
Manuel Costa, Boris K opf, Aashish Kolluri, Andrew Paverd, Mark Russinovich, Ahmed Salem, Shruti Tople, Lukas Wutschitz, Santiago Zanella-B eguelin
Outline
This paper explores leveraging Information Flow Control (IFC) to protect against vulnerabilities such as prompt injection for the security of increasingly autonomous and capable AI agents. We present a formal model for inferring the security and expressiveness of agent planners, characterize classes of properties that can be enforced with dynamic taint-tracking, and construct a task taxonomy to evaluate the security and utility tradeoffs of planner designs. Building on this exploration, we present Fides, a planner that tracks confidentiality and integrity labels, deterministically enforces security policies, and introduces novel primitives for selective information hiding. Evaluations on AgentDojo demonstrate that this approach can accomplish a wide range of tasks while maintaining security guarantees. A tutorial illustrating the concepts introduced in this paper can be found at https://github.com/microsoft/fides .
A novel method for strengthening security against vulnerabilities such as prompt injection in AI agents using Information Flow Control (IFC) is presented.
◦
We provide a formal model and task classification for inferring the security and expressiveness of agent planners.
◦
Development and experimental validation of a novel planner, Fides, capable of deterministically enforcing security policies and selectively hiding information.
◦
Experimental results using AgentDojo demonstrate the utility and wide-ranging applicability of Fides.
•
Limitations:
◦
A more in-depth analysis of the performance and scalability of the Fides planner is needed.
◦
Further research is needed on generalizability across different types of AI agents and task environments.
◦
The need to evaluate resistance to complex security threats and attacks that may arise in real-world applications.
◦
Additional explanation or documentation beyond the tutorial may be required.