-
Notifications
You must be signed in to change notification settings - Fork 158
feat(scripts/pkg): filter out outdated mirrors #169
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
|
df904db to
4d6a031
Compare
We do not fetch file content itself so I called the variable
Logging of time difference between default mirror and current mirror last update time required changing logging logic a bit but I figured it out. Now output looks like this: |
9cb1342 to
3ded0da
Compare
|
I have a sneaking suspicion as to why some users are running into a bit of trouble here... Currently I am rewriting it with a bit more advanced logging and I found an interesting thing: |
|
Only some of mirrors are more or less up-to-date, others are outdated or unusable. |
|
You need to download the |
Yeah, I suspected this too, that's why this was important to implement to filter them out and report to mirror owners, and permanently remove them if they remain too out of date. |
But why not? Seems like reading "last-modified" header of file is enough for finding out if file is up-to-date. Rsync explicitly syncs timestamps of files during sync. |
|
I'll check it though. |
|
If a mirror url is not hosting a Release file and instead is hosting a different file or redirecting to a different url like in case of domain expiry or new host, then you are comparing last modified time of the redirected page, not the release file which doesn't even exist on the server. Look at following, no Release file exists at that url, but time is of right now. Basically header timestamp is not what should be compared, and it can be anything as per server policies. $ curl --head https://deb.kcubeterm.com/termux-main
HTTP/2 521
date: Mon, 19 May 2025 21:37:32 GMT
content-length: 0
server: cloudflare
cache-control: private, no-store |
|
Also not everyone may be using rsync and/or preserving timestamps of files. |
|
|
You are assuming 521 will always be returned. Redirection could happen to a valid url too, like homepage. |
|
With obtaining range 0-255 and parsing date from the release file the situation is not very much different. |
|
I was wrong about And even in this case the situation is not better. |
|
Also AFAIK in the case if you not specify What do you think? |
|
|
I was expecting some mirrors to have problems with |
|
Ok, so I am putting work in this direction on hold until I get your response on this. But I still think checking Last-modified is fine because it will be or synchronized with termux's server or it will be a date of the last sync so it is still better than the current master branch solution. |
|
Thanks, its not just about last modified time, we also need to ensure the url is for a valid Release file for the |
|
If you think download ~1MB data is fine (~13KB * 70 files, somebody still uses limited connections) I can download the whole file content as well. But I just want to mention currently |
|
That is definitely not fine, only following part needs to be downloaded for validation, that is Origin: termux-main stable
Label: termux-main stable
Suite: stable
Codename: stable
Date: Wed, 21 May 2025 13:07:54 UTC
Architectures: aarch64 arm i686 x86_64
Components: main$ curl --silent --fail https://packages.termux.dev/apt/termux-main/dists/stable/Release1 | head -c 200
$ echo "${PIPESTATUS[0]} ${PIPESTATUS[1]}"
22 0
$ curl --silent --fail https://packages.termux.dev/apt/termux-main/dists/stable/Release | head -c 200
Origin: termux-main stable
Label: termux-main stable
Suite: stable
Codename: stable
Date: Wed, 21 May 2025 13:07:54 UTC
Architectures: aarch64 arm i686 x86_64
Components: main
$ echo "${PIPESTATUS[0]} ${PIPESTATUS[1]}"
0 0 |
|
So maybe in this case we should do |
|
No, at least origin and architectures should be validated too. Regex engine should run internally in bash. Maybe |
|
I just wanted to add something simple, but somehow it spiraled into a full script refactor, I'm not even sure how! And I wonder how I have more deletions than insertions. So, there are some differences to original behaviour.
|
3ded0da to
266736f
Compare
|
So now the output is like this |
|
Probably I can make it print only hostnames instead of full mirror urls, or even hostname + location to make output a bit more useful. Should I do so? |
266736f to
61af27f
Compare
|
Seems like |
61af27f to
8de8a81
Compare
That sounds fine because, even if the full URL of the selected mirror is desired for convenience (like clicking it in a terminal emulator that supports opening URLs in order to browse it in a web browser), |
robertkirkman
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tested this and I tried to find anything wrong with it, but I have not found anything wrong in my opinion and this overall runs much faster than the current version.
|
Seems like performing However, apt does not update the CC @agnostic-apollo because you merged this change. |
Lies!!! Trying to ruin my good name! |
|
Will have to think more on the problem. Thanks for notification. |
|
|
That didn't add |
|
Any specific reason to avoid using |
|
There shouldn't be a functional difference as far I see. But there should have been an existence check added in |
8de8a81 to
cc1dea6
Compare
See #4 (comment)
With this change mirrors are being automatically filtered out in the case if last sync was more than 12 hours ago which should be fine.
But I am not sure what will happen if the last change in termux-packages happens more than 12 hours ago, probably it will break mirror picking, needs some tests.
Also it implements nice indication for outdated mirrors, like (example):
Closes termux/termux-app#4517