Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Network-Level Prompt and Trait Leakage in Local Research Agents

Created by
  • Haebom

Authors

Hyejun Jeong, Mohammadreza Teymoorianfard, Abhinav Kumar, Amir Houmansadr, Eugene Bagdasarian

Outline

This paper shows that Web Research Agents (WRAs) are vulnerable to inference attacks by passive network adversaries such as Internet Service Providers (ISPs). Organizations and individuals may deploy WRAs locally for privacy, legal, or financial reasons. Unlike sporadic human web browsing, WRAs visit 70-140 domains with distinguishable temporal correlations, which enables unique fingerprinting attacks. The authors present a novel prompt and user attribute leakage attack that relies only on the network-level metadata of WRAs (i.e., the IP addresses visited and the times of the visits). They build a new dataset of WRA traces from user search queries and from queries generated by synthetic personas, and they define a behavioral metric, OBELS, that comprehensively evaluates the similarity between original and inferred prompts. The attack recovers over 73% of the functional and domain knowledge of user prompts. In multi-session settings, it recovers 19 of 32 potential attributes with high accuracy, and it remains effective under partial observation and noisy conditions. Finally, the authors discuss mitigation strategies that limit domain diversity or obfuscate traces, and show that these reduce attack effectiveness by an average of 29% without significant utility impact.
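To make the fingerprinting intuition concrete, the sketch below shows how a passive observer might separate an agent-like burst of traffic from ordinary browsing using only visit times and destinations. It is a minimal illustration, not the authors' attack: the `Visit` record, the helper names, and the 30-second idle gap are assumptions, and it presumes observed IP addresses have already been mapped to domains. Only the 70-domain threshold is taken from the summary above.

```python
from dataclasses import dataclass


@dataclass
class Visit:
    timestamp: float  # seconds since the start of the capture
    domain: str       # assumed to be already resolved from the observed IP


def split_into_bursts(visits, max_gap=30.0):
    """Group visits into bursts separated by more than max_gap idle seconds.

    max_gap is a hypothetical tuning parameter, not a value from the paper.
    """
    bursts = []
    current = []
    for visit in sorted(visits, key=lambda v: v.timestamp):
        if current and visit.timestamp - current[-1].timestamp > max_gap:
            bursts.append(current)
            current = []
        current.append(visit)
    if current:
        bursts.append(current)
    return bursts


def burst_features(burst):
    """Summarize one burst using metadata only (no payload inspection)."""
    domains = {v.domain for v in burst}
    return {
        "distinct_domains": len(domains),
        "visits": len(burst),
        "duration_s": burst[-1].timestamp - burst[0].timestamp,
    }


def looks_like_wra(burst, min_domains=70):
    """Flag bursts whose domain diversity reaches the 70-140 range cited above."""
    return burst_features(burst)["distinct_domains"] >= min_domains


# Synthetic example: one dense agent-like burst, then a short human-like one.
agent = [Visit(float(i), f"site{i}.example") for i in range(80)]
human = [Visit(1000.0 + 5 * i, f"news{i}.example") for i in range(3)]
bursts = split_into_bursts(agent + human)
flags = [looks_like_wra(b) for b in bursts]
```

A real attack would go further and infer the prompt from which domains were visited. This sketch only covers the first step, recognizing that a session was produced by an agent at all.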

Takeaways, Limitations

Takeaways:
Presents a new prompt and user attribute leakage attack that uses only the network metadata of WRAs.
Quantitatively evaluates the effectiveness of the attack using a behavioral metric, OBELS.
Validates mitigation strategies such as limiting domain diversity and obfuscating traces.
Highlights the privacy and security risks of WRAs.
Limitations:
Current research may be limited to specific types of WRAs and network environments.
The effectiveness of mitigation strategies may vary depending on specific circumstances.
Further research is needed into more sophisticated and diverse attack techniques.
Further research is needed on the generality and limitations of the OBELS metric.
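As an illustration of the "limit domain diversity" mitigation mentioned in the summary, the sketch below caps how many distinct hosts an agent may contact in one session. It is an assumption-laden example rather than the authors' defense: the `DomainBudget` class, its `allow` method, and the cap value are all hypothetical.

```python
from urllib.parse import urlsplit


class DomainBudget:
    """Allow requests to at most max_domains distinct hosts per session."""

    def __init__(self, max_domains):
        self.max_domains = max_domains  # hypothetical cap chosen by the operator
        self.seen = set()

    def allow(self, url):
        host = urlsplit(url).hostname
        if host is None:
            return False  # not an absolute URL, so refuse rather than guess
        if host in self.seen:
            return True   # revisiting a known host adds no new diversity
        if len(self.seen) >= self.max_domains:
            return False  # budget exhausted: block hosts not seen before
        self.seen.add(host)
        return True


budget = DomainBudget(max_domains=2)
decisions = [
    budget.allow("https://a.example/x"),
    budget.allow("https://b.example/"),
    budget.allow("https://c.example/"),      # third distinct host is refused
    budget.allow("https://a.example/other"), # already-seen host still allowed
]
```

An agent would call `allow` before each fetch and skip or reroute refused URLs. A tighter cap means a less distinctive trace but also fewer sources for the agent to draw on.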