@sbrivio-rh sbrivio-rh commented Oct 8, 2025

Note: this is the corrected version of moby/moby#51130, which I opened against the wrong repository. I'm just copying over the whole description from there.


I'm a maintainer and author of passt (https://passt.top/), a user-mode networking implementation, that's used to connect containers, with pasta(1), and virtual machines, with passt(1), in an unprivileged way, without creating network interfaces.

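By the way, Moby optionally uses pasta(1) to connect rootless containers via rootlesskit:
  https://github.com/rootless-containers/rootlesskit/blob/236f31ec2258a1da1b1a9b62b168dd5f9a840f83/pkg/network/pasta/pasta.go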

Given that these tools deal with network packets from untrusted workloads, we pay particular attention to their security posture.

The project implements a rather substantial sandboxing mechanism, so that, once the initialisation phase completes, passt(1) and pasta(1) only have access to an empty filesystem with a zero-size limit, and relinquish access to any resources they don't need by detaching namespaces:
  https://passt.top/passt/tree/isolation.c
  https://passt.top/#security

Users report that they can't use passt(1) in Docker containers, with one notable example at:
  https://bugs.passt.top/show_bug.cgi?id=116

and resort to running modified builds of passt:
  https://bugs.passt.top/show_bug.cgi?id=116#c6

with sandboxing features entirely disabled. This is of course not something we support, so it's not a particular concern in terms of maintainability, but it still forces users to disable important security features, and it's a rather alarming trend.

As a side note, Flatpak has a similar issue:
  flatpak/flatpak#5921

and, same there, users routinely run custom builds of applications that ship strict native sandboxing features (including passt, Chromium, and Firefox) with those features disabled. This is not in the best interest of security and surely not in the best interest of those users.

To fix this, enable unshare() regardless of the CAP_SYS_ADMIN capability, so that unprivileged applications can perform appropriate, strict sandboxing.

I'm well aware of CVE-2022-0185 and CVE-2022-0492, but, since then, there have been significant hardening efforts in the affected portions of the kernel, and the current situation appears substantially different now.

Despite the original intention, a blanket ban on unprivileged unshare() nowadays appears to be detrimental to the security of containerised applications, instead of contributing to it, as an increasing number of applications finally start using namespaces for their own sandboxing, which is generally stricter than what any container runtime can provide.

Link: https://bugs.passt.top/show_bug.cgi?id=116
Reported-by: simonvanderlans@gmail.com
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>

- What I did
I took unshare(2), the system call, out of the CAP_SYS_ADMIN gate in the default seccomp profile.

- How I did it
I did it proudly, with a keyboard. I used so-called shortcuts that allowed me to conceptually cut one line of a text file and paste it to another location.

- How to verify it
Run passt in a Docker container.
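
A more direct check, independent of passt, is to probe the system call itself. The following is a minimal sketch in C, not code taken from passt: `probe_unshare()` is a hypothetical helper name, and the choice of running the call in a forked child (so that the caller's own namespaces are left untouched) is an assumption made for illustration.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of passt: try unshare(flags) in a forked
 * child, so that the calling process keeps its namespaces.
 *
 * Returns 0 if unshare() succeeded, the errno value it failed with
 * (for example EPERM if a seccomp profile or the kernel denies it),
 * or -1 if the probe itself could not be carried out.
 */
int probe_unshare(int flags)
{
	pid_t pid = fork();
	int status;

	if (pid < 0)
		return -1;

	if (pid == 0) {
		/* Child: report the outcome through the exit status.
		 * Linux errno values fit in the 8 bits available there.
		 */
		if (unshare(flags) == 0)
			_exit(0);
		_exit(errno);
	}

	/* Parent: wait for the child, retrying if interrupted by a signal */
	while (waitpid(pid, &status, 0) < 0) {
		if (errno != EINTR)
			return -1;
	}

	/* A child killed by a signal (e.g. by a seccomp kill action)
	 * did not exit normally: treat it as a failed probe.
	 */
	if (!WIFEXITED(status))
		return -1;

	return WEXITSTATUS(status);
}
```

Called as `probe_unshare(CLONE_NEWUSER)` from a small test program inside a container using the default seccomp profile, this would be expected to return `EPERM` without this change and `0` with it, assuming the host kernel itself permits unprivileged user namespaces.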

- Human readable description for the release notes

The unshare(2) system call is now permitted in the default seccomp profile, enabling users to run applications that provide native sandboxing capabilities based on Linux namespaces.

- A picture of a cute animal (not mandatory but encouraged)

Inspired by a submission at https://user.xmission.com/~emailbox/ascii_cats.htm:

fsc              ._
              .-'  `-.
           .-'        \
          ;    .-'\    ;
          `._.'    ;   |
                   |   |
                   ;   :
                  ;   :
                  ;   :
                 /   /
                ;   :                   ,
                ;   |               .-"7|
              .-'"  :            .-' .' :
           .-'       \         .'  .'   `.
         .'           `-. ""-.-'`""    `",`-._..--"7
         ;    .          `-.J `-,    ;"`.;|,_,    ;
       _.'    |         `"" `. ."""--. o \:.-. _.'
    .""       :            ,--`;   ,  `--/}o,' ;
    ;   .___.'        /     ,--.`-. `-..7_.-  /_
     \   :   `..__.._;    .'__;    `---..__.-'-.`"-,
     .'   `--. |   \_;    \'   `-._.-")     \\  `-,
     `.   -.`_):      `.   `-"""`.   ;__.' ;/ ;   "
       `-.__7"  `-..._.'`7     -._;'  ``"-''
                         `--.,__.'  let me run unshare() or isolation code in passt will face GRAVITY
                         

I'm a maintainer and author of passt (https://passt.top/), a user-mode
networking implementation, that's used to connect containers, with
pasta(1), and virtual machines, with passt(1), in an unprivileged way,
without creating network interfaces.

By the way, Moby optionally uses pasta(1) to connect rootless
containers via rootlesskit:
  https://github.com/rootless-containers/rootlesskit/blob/236f31ec2258a1da1b1a9b62b168dd5f9a840f83/pkg/network/pasta/pasta.go

Given that these tools deal with network packets from untrusted
workloads, we pay particular attention to their security posture.

The project implements a rather substantial sandboxing mechanism, so
that, once the initialisation phase completes, passt(1) and pasta(1)
only have access to an empty filesystem with a zero-size limit, and
relinquish access possibilities to any resources they don't need, by
means of detaching namespaces:
  https://passt.top/passt/tree/isolation.c
  https://passt.top/#security

Users report that they can't use passt(1) in Docker containers, with
one notable example at:
  https://bugs.passt.top/show_bug.cgi?id=116

and resort to running modified builds of passt:
  https://bugs.passt.top/show_bug.cgi?id=116#c6

with sandboxing features entirely disabled. This is of course not
something we support, so it's not a particular concern in terms of
maintainability, but it still forces users to disable important
security features, and it's a rather alarming trend.

As a side note, Flatpak has a similar issue:
  flatpak/flatpak#5921

and, same there, users routinely run custom builds of applications
that ship strict native sandboxing features (including passt,
Chromium, and Firefox) with those features disabled. This is not
in the best interest of security and surely not in the best interest
of those users.

To fix this, enable unshare() regardless of the CAP_SYS_ADMIN
capability, so that unprivileged applications can perform appropriate
sandboxing.

I'm well aware of CVE-2022-0185 and CVE-2022-0492, but, since then,
there have been significant hardening efforts in the affected portions
of the kernel, and the current situation appears substantially
different now.

Despite the original intention, a blanket ban on unprivileged
unshare() nowadays appears to be detrimental to the security of
containerised applications, instead of contributing to it, as an
increasing number of applications finally start using namespaces for
their own sandboxing, which is generally stricter than what any
container runtime can provide.

Link: https://bugs.passt.top/show_bug.cgi?id=116
Reported-by: simonvanderlans@gmail.com
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
@sbrivio-rh
Author

I just found #4 as I moved this merge request to the right repository. I'm not sure what to do with this one, as it's partially a duplicate, but passt(1) and pasta(1) need unshare(2) flags that are not covered by that one.

"uname",
"unlink",
"unlinkat",
"unshare",
Member

Can we have this as a non-default built-in profile like --security-opt seccomp=allow-unshare-user?

Or, if we are going to have this as the default, we will need to provide a seccomp=disallow-unshare-user option.

Originally posted by @AkihiroSuda in #42441

Author

Thanks, I wasn't aware of moby/moby#42441.

I would argue that allowing unshare() should be the default, otherwise container developers will hit https://bugs.passt.top/show_bug.cgi?id=116#c0 and keep distributing less secure builds of software because they have no practical way to ask users to add options when they run containers. See also https://bugs.passt.top/show_bug.cgi?id=116#c9.

I can take care of adjusting this pull request (if it makes sense at all) along the lines of moby/moby#42455, which already implemented your suggestion.
