这是indexloc提供的服务,不要输入任何密码
Skip to content

Remove linking behavior #1369

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

nathanhammond
Copy link
Contributor

This is a followup to #1354 which completely removes the previous linking behavior.

@vercel
Copy link

vercel bot commented Jun 10, 2022

@nathanhammond is attempting to deploy a commit to the Vercel Team on Vercel.

A member of the Team first needs to authorize it.

@vercel
Copy link

vercel bot commented Jun 10, 2022

The latest updates on your projects. Learn more about Vercel for Git ↗︎

1 Ignored Deployment
Name Status Preview Updated
turbo-site ⬜️ Ignored (Inspect) Jun 10, 2022 at 0:30AM (UTC)

Comment on lines +22 to +37
// We need to know if it is a symlink to a directory since we actually
// resolve all things now instead of persisting the link.
//
// We can't simply switch to os.Stat above without throwing errors in
// places where we didn't previously.
isSymlink := err == nil && info.Mode()&os.ModeSymlink == os.ModeSymlink
isSymlinkToDir := false
if isSymlink {
// We intentionally do not error on broken symlinks.
info, _ := os.Stat(from)
isSymlinkToDir = info.IsDir()
}

isDir := info.IsDir() || isSymlinkToDir

if isDir {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've operated as conservatively as possible in this space in terms of not changing behavior for broken symlinks... however, I do want to go ahead and break on broken symlinks. (And I want to do so in this PR.)

It won't change our caching population or restoration behavior in any way but it will result in a new error.

Thoughts?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Converted to a draft PR to discuss our path forward here.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems to me like breaking on a broken symlink would change our cache restoration behavior.

IIUC, the current state of the world is that we:

  1. cache symlinks verbatim, including broken ones
  2. on restore we unroll them (MkdirAll rather than SymLink)
  3. In the event of a broken symlink, on restore, we skip it entirely (return godirwalk.SkipThis).

Note also that currently the restore behavior is shared w/ turbo prune.

My opinions on changing behavior:

  1. cache restoration should mirror Put, we should restore the links verbatim
  2. for the purposes of caching, we should not make any distinction about whether or not a link is broken
  3. Ideal for prune would probably be to error on links that traverse out of the set of files we're copying? I think? I'm less sure on this one.

I think we want anything that changes behavior to be a separate PR though.

Copy link
Contributor Author

@nathanhammond nathanhammond Jun 13, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Our pre-#1354 symlink behavior is likely irredeemably broken and in my estimation should be disregarded.


Before #1354:

if (fromMode & os.ModeSymlink) != 0 {
// Don't try to hard-link to a symlink, that doesn't work reliably across all platforms.
// Instead recreate an equivalent symlink in the new location.
dest, err := os.Readlink(from)
if err != nil {
return err
}
// Make sure the link we're about to create doesn't already exist
if err := os.Remove(to); err != nil && !errors.Is(err, os.ErrNotExist) {
return err
}
return os.Symlink(dest, to)
}

  • We write files into the cache as files.
  • We add another hardlink in the case of encountering a hard link.
  • We write symlinks into the cache in a non-portable way:
    1. Determining the target of the symlink that we found in the output.
    2. Remove any pre-existing symlink from the cache directory at that path, which is weird and seems unnecessary.
    3. Creating a new symlink to existing symlink's target from the cache directory.
    4. Do no checking as to whether the symlink is broken or not.

I can't find anywhere it checked the target or forced it to be relative within the same output directory so I am pretty sure that it did not attempt to account for something like this:

// Might work, might not work:
os.Symlink("../vercel/turborepo/turbo.json", "relative-symlink")
relativeTarget, relativeErr := os.Readlink("./relative-symlink")
fmt.Printf("relativeTarget %v, relativeErr %v", relativeTarget, relativeErr)
// relativeTarget ../vercel/turborepo/turbo.json, relativeErr <nil>

// You can see where this is going to be a problem when it restores on your computer...
os.Symlink("/Users/nathanhammond/repos/vercel/turborepo/turbo.json", "absolute-symlink")
absoluteTarget, absoluteErr := os.Readlink("./absolute-symlink")
fmt.Printf("absoluteTarget %v, absoluteErr %v", absoluteTarget, absoluteErr)
// absoluteTarget /Users/nathanhammond/repos/vercel/turborepo/turbo.json, absoluteErr <nil>

The only "safe" implementation I can imagine for caching symlinks is if the target is relative, present, and within the same output directory (and therefore the same cache). As a result, I don't believe that any of our pre-#1354 caches that contained symlinks should have been treated as portable across systems.

Further, by including symlinks, we introduce an edge case for the tar handling that if the target within the archive doesn't exist on disk yet we have to take additional (recursive) passes until either everything is restored, or we discover that we can't do it.


After #1354:

  • We write a file into the cache when we read a file.
  • We write a file into the cache when we read a hardlink.
  • We write a file into the cache when we read a symlink.
  • We explicitly swallow the error and do nothing in the case of encountering a broken symlink.

// CopyFile copies a file from 'from' to 'to', with an attempt to perform a copy & rename
// to avoid chaos if anything goes wrong partway.
func CopyFile(from string, to string, mode os.FileMode) error {
fromFile, err := os.Open(from)
if err != nil {
fileInfos, err := os.Lstat(from)
isSymlink := err == nil && fileInfos.Mode()&os.ModeSymlink == os.ModeSymlink
if isSymlink {
// We have a broken symlink. Don't try to copy it.
return nil
}
return err
}
defer fromFile.Close()
return writeFileFromStream(fromFile, to, mode)
}

Previously, since we called Lstat and copied the link we did not break when encountering broken symlinks—we simply didn't know anything about them. However, since it was now going to try and Stat them it would break. An additional clause was added in order to protect against that edge case. Valid symlink targets would be read correctly, broken symlinks would no longer be copied.

This was fine because, even pre-#1354, we did not restore broken symlinks so it doesn't matter if they make it into the cache or not (there was no user-detectable behavior change):

// We currently don't restore broken symlinks. This is probably a bug


After #1369 my goal is:

  • We write a file into the cache when we read a file. (No change.)
  • We write a file into the cache when we read a hardlink. (No change.)
  • We write a file into the cache when we read a symlink. (No change.)
  • We throw the error in the case of encountering a broken symlink.

This makes user-apparent a scenario we have thus far silently ignored and reduces the complexity of the code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, my apologies for not providing all of this context in the first pass. It's intended to be a small change, but the context in which the change is being made is significant.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bonus failure mode that I just realized we need to account for: I suspect our hashes are run against the symlinks themselves in the output directory, not the thing that they target (I haven't looked).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's valid to cache broken symlinks:

  • A symlink could be broken at the time we read it, but not at the time of use.
  • A task could be intending to symlink to some absolute path on the machine where the code will ultimately be deployed.
  • A task could be linking to something marked as a global dependency, but not in our cache, and would be valid on restore.
  • A user might just want a broken symlink for some reason

Unless there's some specific consequence to our own correctness, I do not think we should be failing caching or turbo runs because the task's output includes a broken symlink.

Copy link
Contributor Author

@nathanhammond nathanhammond Jun 14, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem with caching the symlinks themselves is that we can never be sure if they're portable across runs or systems and there is little we can do to resolve that.

  • Turborebo in all published versions has silently skipped broken symlinks. In differing versions we have cached symlinks and the symlink targets.
  • Bazel you have to explicitly opt in to caching broken symlinks --experimental_allow_unresolved_symlinks, and they control the symlink process for symlinks that they cache to enforce that they're project-aware.
  • Gradle doesn't cache build outputs with broken symlinks and they cache by file.
  • nx caches links via cp -a (which will cache a broken symlink), with transparent fallback from cp -a to fse.copy (which caches files and will error on broken symlinks).
  • Can't quickly find info on Maven, but all of the examples of what people want to do with it are basically "subvert the DAG" with "this thing will eventually show up, promise" and it's terrifying. They're reporting errors where, because the DAG runs in an order they don't expect, the symlinks aren't present. (This reinforces my belief that "trust me, I've got it" is likely to go wrong and implicitly create DAG races.)
  • Rush disallows symlinks in the cache entirely. (Search for "symlink".)

I feel like our solution should be "safe by default" with required explicit opt-in to dangerous behavior (caching symlinks, whether broken or not). I feel like Bazel's approach is the one we should head towards if we want to support symlinks.

The "correctness-focused" set of options:

  • Bazel: Ensures correctness at the cost of verbosity to support symlinking, and an opt-in for broken symlinks.
  • Gradle: Ensures correctness at the cost of cache space, with no opt-in for caching broken symlinks. (This mirrors Turborepo post-Disable linking. #1354.)
  • Rush: Ensures correctness with a too-conservative solution.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure that bazel and maven and gradle are necessarily the right comparisons, at least currently. Those are producing redistributable bundles. Each of our tasks doesn't necessarily produce something intended to be distributed on its own. We are caching intermediate states, in addition to final states.

Given that, I don't think it should be our responsibility to ensure that the links produced by user code are valid at any particular point in time. IMO it's a bug that we have any special treatment of symlinks.

But maybe I'm missing something about what's dangerous about caching symlinks?

@nathanhammond nathanhammond marked this pull request as draft June 10, 2022 12:37
@@ -108,20 +108,6 @@ func TestPut(t *testing.T) {
// Verify that we got the files that we're expecting
dstCachePath := filepath.Join(dst, hash)

dstAPath := filepath.Join(dstCachePath, src, "child", "a")
got, err := turbofs.SameFile(aPath, dstAPath)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than testing that they are hard-linked, I think we ought to test that they exist? We could assert that the contents match too, although maybe that test belongs at a lower level, and we want to assert that our cache contains the files we think it should, and that they are in the right location.

@@ -206,20 +192,6 @@ func TestFetch(t *testing.T) {
}
t.Logf("files %v", files)

dstAPath := filepath.Join(dstOutputPath, "child", "a")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As above, I think we should test that the file exists.

//
// We can't simply switch to os.Stat above without throwing errors in
// places where we didn't previously.
isSymlink := err == nil && info.Mode()&os.ModeSymlink == os.ModeSymlink
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We've already checked for a non-nil err above.

Comment on lines +22 to +37
// We need to know if it is a symlink to a directory since we actually
// resolve all things now instead of persisting the link.
//
// We can't simply switch to os.Stat above without throwing errors in
// places where we didn't previously.
isSymlink := err == nil && info.Mode()&os.ModeSymlink == os.ModeSymlink
isSymlinkToDir := false
if isSymlink {
// We intentionally do not error on broken symlinks.
info, _ := os.Stat(from)
isSymlinkToDir = info.IsDir()
}

isDir := info.IsDir() || isSymlinkToDir

if isDir {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems to me like breaking on a broken symlink would change our cache restoration behavior.

IIUC, the current state of the world is that we:

  1. cache symlinks verbatim, including broken ones
  2. on restore we unroll them (MkdirAll rather than SymLink)
  3. In the event of a broken symlink, on restore, we skip it entirely (return godirwalk.SkipThis).

Note also that currently the restore behavior is shared w/ turbo prune.

My opinions on changing behavior:

  1. cache restoration should mirror Put, we should restore the links verbatim
  2. for the purposes of caching, we should not make any distinction about whether or not a link is broken
  3. Ideal for prune would probably be to error on links that traverse out of the set of files we're copying? I think? I'm less sure on this one.

I think we want anything that changes behavior to be a separate PR though.

@gsoltis
Copy link
Contributor

gsoltis commented Jun 16, 2022

As discussed offline, let's leave symlinking behavior in its existing state and come back to it later.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants