这是indexloc提供的服务,不要输入任何密码
Skip to content

Add root boundary to untarring #1409

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Jun 16, 2022
Merged

Add root boundary to untarring #1409

merged 4 commits into from
Jun 16, 2022

Conversation

gsoltis
Copy link
Contributor

@gsoltis gsoltis commented Jun 14, 2022

Also start adding some tests for untarring.

@vercel
Copy link

vercel bot commented Jun 14, 2022

The latest updates on your projects. Learn more about Vercel for Git ↗︎

1 Ignored Deployment
Name Status Preview Updated
turbo-site ⬜️ Ignored (Inspect) Jun 16, 2022 at 6:01PM (UTC)

@gsoltis gsoltis marked this pull request as ready for review June 14, 2022 16:57
@@ -188,13 +188,14 @@ func (cache *httpCache) retrieve(hash string) (bool, []string, int, error) {
duration = intVar
}
artifactReader := resp.Body
defer func() { _ = artifactReader.Close() }()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I'm understanding correctly, this isn't closing this particular reader unless we return before line 214? I'd prefer to use separate names rather than reusing the name and having to explain what in the world is going on and capturing in the closure. We introduce spooky action at a distance with this one.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was trying not to mess with this code too much, but I can restructure it to be a bit clearer.

This calls Close() on whatever artifactReader points to, which is a thing we [maybe] update later on. But we can be more explicit.

}
defer gzr.Close()
defer func() { _ = gzr.Close() }()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check out the lint rule change in the git PR that makes this ceremony unnecessary.

if isChild, err := root.ContainsPath(filename); err != nil {
return nil, err
} else if !isChild {
return nil, fmt.Errorf("cannot untar file to %v", filename)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we bail here should we clean up? In fact, at any error return from inside of this loop we might want to consider that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(We could do that by returning files and then clean up at the restoreTar call site.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can't revert. We can delete files we've written, but we can't restore to previous state because we have potentially truncated existing files. Either way, you're in an inconsistent state and need to start over.

It might be worth considering what to do in this scenario, your best bet is likely engineering a cache miss. If you've gotten into this scenario accidentally, you likely need to make a code change anyways, which will force a miss.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-entrant to the task with no caching is probably the right answer, possibly after deleting any files created by the attempted restore. (Rechecking the original input hashes to make sure we're in an okay state as well?)

@@ -316,5 +335,6 @@ func newHTTPCache(opts Opts, config *config.Config, recorder analytics.Recorder)
teamId: config.TeamId,
enabled: config.TurboJSON.RemoteCacheOptions.Signature,
},
repoRoot: config.Cwd,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renaming keeps moving up my list.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be a good commit in the "move main to cobra" PR.

}
files = append(files, hdr.Name)
filename := root.Join(hdr.Name)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is hdr.Name always a Unix path? The test has expectedFiles as / separated, if so, this line implicitly Cleans them to system separators.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we convert all paths to posix-style on write into the tar file.

Copy link
Contributor

@nathanhammond nathanhammond Jun 16, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, can you add a comment here so that I remember to make it do the right thing inside of turbopath?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Traced through this a little more, and I think it gets to our not-great cache abstraction. I don't think I've changed the behavior here, but I also think the return value, which will be posix-style repo-relative paths, is both wrong but also works due to Windows accepting /-delimited paths. This is used when a cache hit from remote cache is used to populate the filesystem cache. The return type here should match the argument type for files to cache.Put. That should be RepoRelativeSystemPath.

I'll add a comment, but I think this is best cleaned up when we have the types to clarify what we actually want.

@nathanhammond nathanhammond added the pr: fixship A PR which is approved with notes and does not require re-review. label Jun 16, 2022
Copy link
Contributor

@nathanhammond nathanhammond left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okidoke, now that I know what's going on, only requests are these two:

  • Rename the second artifactReader, if that issues a defer right after opening it would also do the right thing.
  • Comment on the hdr.Name spot.

@gsoltis gsoltis added pr: automerge Kodiak will merge these automatically after checks pass and removed pr: fixship A PR which is approved with notes and does not require re-review. labels Jun 16, 2022
@gsoltis gsoltis merged commit 501b6a6 into main Jun 16, 2022
@gsoltis gsoltis deleted the gsoltis/root_and_test_untar branch June 16, 2022 18:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
pr: automerge Kodiak will merge these automatically after checks pass
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants