A curation of awesome tools, documents and projects about LLM Security.
Contributions are always welcome. Please read the Contribution Guidelines before contributing.
- Not what you've signed up for: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection
- Visual Adversarial Examples Jailbreak Large Language Models
- Jailbroken: How Does LLM Safety Training Fail?
- Are aligned neural networks adversarially aligned?
- Rebuff: a self-hardening prompt injection detector
- Garak: a LLM vulnerability scanner
- LLMFuzzer: a fuzzing framework for LLMs
- Hacking Auto-GPT and escaping its docker container
- Prompt Injection Cheat Sheet: How To Manipulate AI Language Models
- Indirect Prompt Injection Threats
- Prompt injection: What’s the worst that can happen?
- OWASP Top 10 for Large Language Model Applications
- PoisonGPT: How we hid a lobotomized LLM on Hugging Face to spread fake news
- ChatGPT Plugins: Data Exfiltration via Images & Cross Plugin Request Forgery
- Gandalf: a prompt injection wargame
- LangChain vulnerable to code injection - CVE-2023-29374
- Jailbreak Chat
- Adversarial Prompting
- @llm_sec
- Blog : Embrace The Red
- Newsletter : AI safety takes