-
-
Notifications
You must be signed in to change notification settings - Fork 3.5k
scripts(termux_download_deb_pac): optimization and refactoring #24621
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
5068f94
to
eb6f7dc
Compare
Makes sense.
That's probably not worth worrying about but go ahead.
Since when does that cause additional overhead? |
|
10 million iterations. |
Yes. Currently it does not matter but I am planning to parralelize packages downloading and after that this optimization will be more effective. |
My local testing puts the per-instance difference of This codepath is nowhere near hot enough to apply that level of pedantry. |
I am planning to write a cron script for package building validation. |
4 arches. Those run in parallel. We're saving 15ms (Milliseconds, not seconds), for each |
Ok, I'll revert this one after getting home. |
I did a quick test of these changes on pacman package loading, and I can say that the current changes break pacman package loading completely. |
I tried to build pacman packages in termux-packages repo and everything was fine, but it seems like it downloaded packages from debian repos. Does pacman-packages-related building require pacman-packages's version of |
For testing you need to reconfigure |
02da454
to
1d62bb0
Compare
Note that this has been rewritten in my repo.json changes pull including separate functions and any changes will be replaced in coming weeks. This also applies to most of the other dependencies/repo code. So you might want to hold off till after. |
When exactly do you want to merge your PRs? And maybe you have links to your PRs? |
Those changes are not final yet, so haven't pushed. I will working on them after current Termux app and keychain releases, when I work on the new bootstraps. There would be one for docker, another for repo.json (pull is at #21181 but changes there are outdated and different from my local changes) and another for other refactoring of the build system (some changes are at agnostic-apollo@8ab564a). Instead of doing major refactoring for structuring, if there are any high latency parts of the code taking significant seconds (not micro or mill seconds), feel free to make changes for that, I will handle any conflicts against those. |
Moved duplicated code to a separate function to make code more readable and editable. Removed invocations of external commands like `sed` and `seq` to avoid binary execution overhead, using bash builtins instead. Replaced all `test` and `[` with `[[` to improve speed. Grouped some variable declarations under single expressions to avoid parsing overhead.
1d62bb0
to
752d705
Compare
These changes are preparation for making parallel downloading of dependencies, but refactoring old code at least to get rid of duplicates is good too.
Parallel downloads can save minutes of CI time during building big packages or packages with big amount of dependencies, i.e. firefox, thunderbird or kdenlive. BTW do our servers have any kind of ratelimiting, preventing parallel downloads of packages or anything similar? |
@Maxython now pacman-packages should be fine too. |
return 1 | ||
fi | ||
|
||
termux_download "${TERMUX_REPO_URL[${idx}-1]}/${PKG_PATH}" \ | ||
termux_download "${TERMUX_REPO_URL[${idx}]}/${PKG_PATH}" \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've always wondered about this, but, there is a set of very similar code in the file generate-bootstraps.sh
. Do you think it would be appropriate or possible to use some of these optimizations to also optimize that, by making that script import and use the same code that this script is using?
curl --fail --location --output "$package_tmpdir/package.deb" "$package_url" |
That script already sources properties.sh
, so as I understand it, that means that it can already only work inside the Docker container, so there would be no meaningful downside to making it also source other files, especially if it would result in optimization and DRY.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new bootstrap will use new repo.json format and will share code with build-package.sh. I also also removed the duplicate code changed above. There are lot of other issues to handle in current code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh ok, I didn't know that you already changed generate-bootstraps.sh
in the future version you mentioned.
I have been using code that I heavily copied from generate-bootstraps.sh
somewhere else as well, with a comment about where it came from, so I'll want to pause that and wait for the new generate-bootstraps.sh
to come so I can keep my copied implementation mostly up-to-date with the reference implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
${TERMUX_REPO_URL[${idx}-1]}
It is related to using seq
instead of iterating over array keys.
read -d "\n" PKG_PATH PKG_HASH <<< "$(jq -r '.["'"$PACKAGE"'"] | | ||
select((.VERSION == "'"$VERSION_PACMAN"'" or "'"$TERMUX_WITHOUT_DEPVERSION_BINDING"'" == "true") and .SHA256SUM and .FILENAME) | | ||
"'"$arch"'/\(.FILENAME) \(.SHA256SUM)"' "$pkg_file_path")" || : |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest the following changes to improve the readability of jq
algorithms:
read -d "\n" PKG_PATH PKG_HASH <<< "$(jq -r '.["'"$PACKAGE"'"] | | |
select((.VERSION == "'"$VERSION_PACMAN"'" or "'"$TERMUX_WITHOUT_DEPVERSION_BINDING"'" == "true") and .SHA256SUM and .FILENAME) | | |
"'"$arch"'/\(.FILENAME) \(.SHA256SUM)"' "$pkg_file_path")" || : | |
read -d "\n" PKG_PATH PKG_HASH <<< "$(jq -r ".[\"$PACKAGE\"] | | |
select((.VERSION == \"$VERSION_PACMAN\" or \"$TERMUX_WITHOUT_DEPVERSION_BINDING\" == \"true\") and .SHA256SUM and .FILENAME) | | |
\"${arch}/\"+.FILENAME, .SHA256SUM" "$pkg_file_path")" || : |
} | ||
|
||
termux_download_deb_pac_get_pkg_hash_and_version() { | ||
local arch="$1" pkg_file_path="$TERMUX_COMMON_CACHEDIR-$1/$PACKAGE_FILE_PATH" PKG_PATH="" PKG_HASH="" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The arch
variable in the termux_download_deb_pac_get_pkg_hash_and_version()
function is used only once, although it could be used in the pkg_file_path
variable, but because they are in the same line, it is not possible to do so. Perhaps you should move the arch
variable before the pkg_file_path
variable or remove the arch
variable.
Moved duplicated code to a separate function to make code more readable and editable.
Removed invocations of external commands like
sed
andseq
to avoid binary execution overhead, using bash builtins instead.Replaced all
test
and[
with[[
to improve speed.Grouped some variable declarations under single expressions to avoid parsing overhead.
Replaced
${var}
with$var
in some places to reduce parsing overhead.