
Arbitrary Code Execution without Sandboxing in ExecuterAgent #247

@finitearth

Description

I noticed that the `ExecuterAgent` executes LLM-generated Python and Bash code directly on the host machine using `subprocess.Popen`.

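For concreteness, here is a simplified sketch of that pattern (function and argument names are illustrative, not the exact code in the repository):

```python
import subprocess

def execute_code(code: str) -> str:
    # Hypothetical sketch of the unsandboxed pattern: LLM output is
    # handed straight to an interpreter running as the host user.
    proc = subprocess.Popen(
        ["python", "-c", code],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
    )
    out, err = proc.communicate()
    # Whatever `code` contains runs with full access to the host:
    # the user's files, credentials, and network.
    return out if proc.returncode == 0 else err
```
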
This is a significant security risk. Beyond the danger of a buggy generation causing accidental damage, it opens a direct attack vector: an attacker could manipulate the LLM (e.g., via indirect prompt injection) into intentionally generating malicious code, which makes any system or agent built on this repository vulnerable to being hijacked. An attack could lead to severe consequences such as:

  • Stealing sensitive data from `~/.ssh/` or `~/.aws/`.
  • Deleting user files (`rm -rf /`).
  • Installing malware on the host system.

Considering that agentic systems may become more capable, I suggest adding a warning to the README.md so users understand the risk before running the code. Something like this would be great:

---
## ⚠️ Security Warning ⚠️

This tool allows large language models (LLMs) to execute arbitrary code directly on your machine. This is inherently dangerous and can be exploited by bad actors. It is strongly recommended to always run this code in a sandboxed environment, such as a Docker container or a dedicated VM, to protect your system and data.
---

A more robust solution could be to make sandboxing the default execution method, for example using Docker. The `execute_code` function could be modified to spin up a minimal, isolated Docker container for each execution.

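To make that suggestion concrete, here is a minimal sketch using the `docker` Python SDK. The base image, resource limits, and error handling are assumptions for illustration, not a drop-in replacement for the existing function:

```python
import docker  # pip install docker

def execute_code_sandboxed(code: str) -> str:
    # Each call gets a fresh, throwaway container; nothing from the
    # host filesystem is mounted, so ~/.ssh, ~/.aws, etc. stay out of
    # reach even if the generated code is malicious.
    client = docker.from_env()
    try:
        output = client.containers.run(
            "python:3.12-slim",     # illustrative base image
            ["python", "-c", code],
            remove=True,            # delete the container when done
            network_disabled=True,  # block data exfiltration
            mem_limit="256m",       # cap resource usage
        )
        return output.decode()
    except docker.errors.ContainerError as e:
        return e.stderr.decode() if e.stderr else str(e)
```

Spinning up a fresh container per execution adds some latency, but it bounds the blast radius of any single generation, which seems like the right trade-off for a default.
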
Thanks for the great work on this repository!
