Remove linking behavior #1369

nathanhammond · 2022-06-10T12:27:04Z

This is a followup to #1354 which completely removes the previous linking behavior.

vercel · 2022-06-10T12:27:09Z

@nathanhammond is attempting to deploy a commit to the Vercel Team on Vercel.

A member of the Team first needs to authorize it.

vercel · 2022-06-10T12:30:24Z

The latest updates on your projects. Learn more about Vercel for Git ↗︎

1 Ignored Deployment

Name	Status	Preview	Updated
turbo-site	⬜️ Ignored (Inspect)		Jun 10, 2022 at 0:30AM (UTC)

nathanhammond · 2022-06-10T12:34:58Z

cli/internal/fs/copy_file.go

+	// We need to know if it is a symlink to a directory since we actually
+	// resolve all things now instead of persisting the link.
+	//
+	// We can't simply switch to os.Stat above without throwing errors in
+	// places where we didn't previously.
+	isSymlink := err == nil && info.Mode()&os.ModeSymlink == os.ModeSymlink
+	isSymlinkToDir := false
+	if isSymlink {
+		// We intentionally do not error on broken symlinks.
+		info, _ := os.Stat(from)
+		isSymlinkToDir = info.IsDir()
+	}
+
+	isDir := info.IsDir() || isSymlinkToDir
+
+	if isDir {


I've operated as conservatively as possible in this space in terms of not changing behavior for broken symlinks... however, I do want to go ahead and break on broken symlinks. (And I want to do so in this PR.)

It won't change our caching population or restoration behavior in any way but it will result in a new error.

Thoughts?

(Converted to a draft PR to discuss our path forward here.)

It seems to me like breaking on a broken symlink would change our cache restoration behavior.

IIUC, the current state of the world is that we:

cache symlinks verbatim, including broken ones

on restore we unroll them (MkdirAll rather than SymLink)

In the event of a broken symlink, on restore, we skip it entirely (return godirwalk.SkipThis).

Note also that currently the restore behavior is shared w/ turbo prune.

My opinions on changing behavior:

cache restoration should mirror Put, we should restore the links verbatim

for the purposes of caching, we should not make any distinction about whether or not a link is broken

Ideal for prune would probably be to error on links that traverse out of the set of files we're copying? I think? I'm less sure on this one.

I think we want anything that changes behavior to be a separate PR though.

Our pre-#1354 symlink behavior is likely irredeemably broken and in my estimation should be disregarded.

Before #1354:

turborepo/cli/internal/fs/copy_file.go

Lines 18 to 30 in 3dc851d

if (fromMode & os.ModeSymlink) != 0 {

// Don't try to hard-link to a symlink, that doesn't work reliably across all platforms.

// Instead recreate an equivalent symlink in the new location.

dest, err := os.Readlink(from)

if err != nil {

return err

}

// Make sure the link we're about to create doesn't already exist

if err := os.Remove(to); err != nil && !errors.Is(err, os.ErrNotExist) {

return err

}

return os.Symlink(dest, to)

}

We write files into the cache as files.

We add another hardlink in the case of encountering a hard link.

We write symlinks into the cache in a non-portable way:

Determining the target of the symlink that we found in the output.

Remove any pre-existing symlink from the cache directory at that path, which is weird and seems unnecessary.

Creating a new symlink to existing symlink's target from the cache directory.

Do no checking as to whether the symlink is broken or not.

I can't find anywhere it checked the target or forced it to be relative within the same output directory so I am pretty sure that it did not attempt to account for something like this:

// Might work, might not work: os.Symlink("../vercel/turborepo/turbo.json", "relative-symlink") relativeTarget, relativeErr := os.Readlink("./relative-symlink") fmt.Printf("relativeTarget %v, relativeErr %v", relativeTarget, relativeErr) // relativeTarget ../vercel/turborepo/turbo.json, relativeErr <nil> // You can see where this is going to be a problem when it restores on your computer... os.Symlink("/Users/nathanhammond/repos/vercel/turborepo/turbo.json", "absolute-symlink") absoluteTarget, absoluteErr := os.Readlink("./absolute-symlink") fmt.Printf("absoluteTarget %v, absoluteErr %v", absoluteTarget, absoluteErr) // absoluteTarget /Users/nathanhammond/repos/vercel/turborepo/turbo.json, absoluteErr <nil>

The only "safe" implementation I can imagine for caching symlinks is if the target is relative, present, and within the same output directory (and therefore the same cache). As a result, I don't believe that any of our pre-#1354 caches that contained symlinks should have been treated as portable across systems.

Further, by including symlinks, we introduce an edge case for the tar handling that if the target within the archive doesn't exist on disk yet we have to take additional (recursive) passes until either everything is restored, or we discover that we can't do it.

After #1354:

We write a file into the cache when we read a file.

We write a file into the cache when we read a hardlink.

We write a file into the cache when we read a symlink.

We explicitly swallow the error and do nothing in the case of encountering a broken symlink.

turborepo/cli/internal/fs/fs.go

Lines 63 to 79 in b1210d5

// CopyFile copies a file from 'from' to 'to', with an attempt to perform a copy & rename

// to avoid chaos if anything goes wrong partway.

func CopyFile(from string, to string, mode os.FileMode) error {

fromFile, err := os.Open(from)

if err != nil {

fileInfos, err := os.Lstat(from)

isSymlink := err == nil && fileInfos.Mode()&os.ModeSymlink == os.ModeSymlink

if isSymlink {

// We have a broken symlink. Don't try to copy it.

return nil

}

return err

}

defer fromFile.Close()

return writeFileFromStream(fromFile, to, mode)

}

Previously, since we called Lstat and copied the link we did not break when encountering broken symlinks—we simply didn't know anything about them. However, since it was now going to try and Stat them it would break. An additional clause was added in order to protect against that edge case. Valid symlink targets would be read correctly, broken symlinks would no longer be copied.

This was fine because, even pre-#1354, we did not restore broken symlinks so it doesn't matter if they make it into the cache or not (there was no user-detectable behavior change):

turborepo/cli/internal/cache/cache_fs_test.go

Line 228 in b1210d5

// We currently don't restore broken symlinks. This is probably a bug

After #1369 my goal is:

We write a file into the cache when we read a file. (No change.)

We write a file into the cache when we read a hardlink. (No change.)

We write a file into the cache when we read a symlink. (No change.)

We throw the error in the case of encountering a broken symlink.

This makes user-apparent a scenario we have thus far silently ignored and reduces the complexity of the code.

Also, my apologies for not providing all of this context in the first pass. It's intended to be a small change, but the context in which the change is being made is significant.

Bonus failure mode that I just realized we need to account for: I suspect our hashes are run against the symlinks themselves in the output directory, not the thing that they target (I haven't looked).

I think it's valid to cache broken symlinks:

A symlink could be broken at the time we read it, but not at the time of use.

A task could be intending to symlink to some absolute path on the machine where the code will ultimately be deployed.

A task could be linking to something marked as a global dependency, but not in our cache, and would be valid on restore.

A user might just want a broken symlink for some reason

Unless there's some specific consequence to our own correctness, I do not think we should be failing caching or turbo runs because the task's output includes a broken symlink.

The problem with caching the symlinks themselves is that we can never be sure if they're portable across runs or systems and there is little we can do to resolve that.

Turborebo in all published versions has silently skipped broken symlinks. In differing versions we have cached symlinks and the symlink targets.

Bazel you have to explicitly opt in to caching broken symlinks --experimental_allow_unresolved_symlinks, and they control the symlink process for symlinks that they cache to enforce that they're project-aware.

Gradle doesn't cache build outputs with broken symlinks and they cache by file.

nx caches links via cp -a (which will cache a broken symlink), with transparent fallback from cp -a to fse.copy (which caches files and will error on broken symlinks).

Can't quickly find info on Maven, but all of the examples of what people want to do with it are basically "subvert the DAG" with "this thing will eventually show up, promise" and it's terrifying. They're reporting errors where, because the DAG runs in an order they don't expect, the symlinks aren't present. (This reinforces my belief that "trust me, I've got it" is likely to go wrong and implicitly create DAG races.)

Rush disallows symlinks in the cache entirely. (Search for "symlink".)

I feel like our solution should be "safe by default" with required explicit opt-in to dangerous behavior (caching symlinks, whether broken or not). I feel like Bazel's approach is the one we should head towards if we want to support symlinks.

The "correctness-focused" set of options:

Bazel: Ensures correctness at the cost of verbosity to support symlinking, and an opt-in for broken symlinks.

Gradle: Ensures correctness at the cost of cache space, with no opt-in for caching broken symlinks. (This mirrors Turborepo post-Disable linking. #1354.)

Rush: Ensures correctness with a too-conservative solution.

I'm not sure that bazel and maven and gradle are necessarily the right comparisons, at least currently. Those are producing redistributable bundles. Each of our tasks doesn't necessarily produce something intended to be distributed on its own. We are caching intermediate states, in addition to final states.

Given that, I don't think it should be our responsibility to ensure that the links produced by user code are valid at any particular point in time. IMO it's a bug that we have any special treatment of symlinks.

But maybe I'm missing something about what's dangerous about caching symlinks?

gsoltis · 2022-06-10T18:55:44Z

cli/internal/cache/cache_fs_test.go

@@ -108,20 +108,6 @@ func TestPut(t *testing.T) {
 	// Verify that we got the files that we're expecting
 	dstCachePath := filepath.Join(dst, hash)

-	dstAPath := filepath.Join(dstCachePath, src, "child", "a")
-	got, err := turbofs.SameFile(aPath, dstAPath)


Rather than testing that they are hard-linked, I think we ought to test that they exist? We could assert that the contents match too, although maybe that test belongs at a lower level, and we want to assert that our cache contains the files we think it should, and that they are in the right location.

gsoltis · 2022-06-10T18:58:13Z

cli/internal/cache/cache_fs_test.go

@@ -206,20 +192,6 @@ func TestFetch(t *testing.T) {
 	}
 	t.Logf("files %v", files)

-	dstAPath := filepath.Join(dstOutputPath, "child", "a")


As above, I think we should test that the file exists.

gsoltis · 2022-06-10T19:01:59Z

cli/internal/fs/copy_file.go

+	//
+	// We can't simply switch to os.Stat above without throwing errors in
+	// places where we didn't previously.
+	isSymlink := err == nil && info.Mode()&os.ModeSymlink == os.ModeSymlink


We've already checked for a non-nil err above.

gsoltis · 2022-06-10T19:25:19Z

cli/internal/fs/copy_file.go

+	// We need to know if it is a symlink to a directory since we actually
+	// resolve all things now instead of persisting the link.
+	//
+	// We can't simply switch to os.Stat above without throwing errors in
+	// places where we didn't previously.
+	isSymlink := err == nil && info.Mode()&os.ModeSymlink == os.ModeSymlink
+	isSymlinkToDir := false
+	if isSymlink {
+		// We intentionally do not error on broken symlinks.
+		info, _ := os.Stat(from)
+		isSymlinkToDir = info.IsDir()
+	}
+
+	isDir := info.IsDir() || isSymlinkToDir
+
+	if isDir {


It seems to me like breaking on a broken symlink would change our cache restoration behavior.

IIUC, the current state of the world is that we:

cache symlinks verbatim, including broken ones

on restore we unroll them (MkdirAll rather than SymLink)

In the event of a broken symlink, on restore, we skip it entirely (return godirwalk.SkipThis).

Note also that currently the restore behavior is shared w/ turbo prune.

My opinions on changing behavior:

cache restoration should mirror Put, we should restore the links verbatim

for the purposes of caching, we should not make any distinction about whether or not a link is broken

Ideal for prune would probably be to error on links that traverse out of the set of files we're copying? I think? I'm less sure on this one.

I think we want anything that changes behavior to be a separate PR though.

gsoltis · 2022-06-16T17:13:02Z

As discussed offline, let's leave symlinking behavior in its existing state and come back to it later.

Remove linking behavior

be38780

nathanhammond requested review from jaredpalmer, gsoltis, gaspar09 and tknickman as code owners June 10, 2022 12:27

nathanhammond commented Jun 10, 2022

View reviewed changes

nathanhammond marked this pull request as draft June 10, 2022 12:37

gsoltis suggested changes Jun 10, 2022

View reviewed changes

This was referenced Jun 13, 2022

Cache symlinks in filesystem cache #1068

Merged

Unify symlink handling in http cache #1071

Merged

nathanhammond closed this Jan 18, 2023

	if (fromMode & os.ModeSymlink) != 0 {
	// Don't try to hard-link to a symlink, that doesn't work reliably across all platforms.
	// Instead recreate an equivalent symlink in the new location.
	dest, err := os.Readlink(from)
	if err != nil {
	return err
	}
	// Make sure the link we're about to create doesn't already exist
	if err := os.Remove(to); err != nil && !errors.Is(err, os.ErrNotExist) {
	return err
	}
	return os.Symlink(dest, to)
	}

	// CopyFile copies a file from 'from' to 'to', with an attempt to perform a copy & rename
	// to avoid chaos if anything goes wrong partway.
	func CopyFile(from string, to string, mode os.FileMode) error {
	fromFile, err := os.Open(from)
	if err != nil {
	fileInfos, err := os.Lstat(from)
	isSymlink := err == nil && fileInfos.Mode()&os.ModeSymlink == os.ModeSymlink

	if isSymlink {
	// We have a broken symlink. Don't try to copy it.
	return nil
	}
	return err
	}
	defer fromFile.Close()
	return writeFileFromStream(fromFile, to, mode)
	}

Remove linking behavior #1369

Remove linking behavior #1369

Uh oh!

Conversation

nathanhammond commented Jun 10, 2022

Uh oh!

vercel bot commented Jun 10, 2022

Uh oh!

vercel bot commented Jun 10, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

nathanhammond Jun 13, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

nathanhammond Jun 14, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

gsoltis commented Jun 16, 2022

Uh oh!

Uh oh!

vercel bot commented Jun 10, 2022 •

edited

Loading

nathanhammond Jun 13, 2022 •

edited

Loading

nathanhammond Jun 14, 2022 •

edited

Loading