Please reconsider handling of tile download errors #88

Open
jsbien opened this issue Nov 20, 2020 · 8 comments
Labels
enhancement (New feature or request), good first issue (Good for newcomers)

Comments

Contributor

jsbien commented Nov 20, 2020

The log from downloading a dictionary volume (created with script) is 21M in size. Now I know I should grep it for ERROR.
Because of some network problems I had 3 cases of
Only ??? tiles out of ??? could be downloaded. The resulting image was still created.
I would have noticed the problem earlier if such images had a prefix, e.g. incomplete.
It would also be convenient to have the URL of the whole affected image (currently only the URL of the tile is printed).
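
For a log of this size, a small helper can pull out the relevant lines. This is only a sketch: it assumes that failures are logged on lines containing the literal word ERROR, as mentioned above. The function name `report_errors` and the file name `download.log` are made up for illustration.

```shell
# Count and list the error lines of a log file.
# Assumption: failures appear on lines containing the literal word ERROR.
report_errors() {
    log="$1"
    # grep -c prints the number of matching lines (0 if there are none)
    count=$(grep -c 'ERROR' "$log")
    echo "$count error line(s) in $log"
    # -n prefixes each match with its line number in the log;
    # "|| true" keeps the function's status at 0 when nothing matches
    grep -n 'ERROR' "$log" || true
}
```

Usage would be something like `report_errors download.log`.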

lovasoa added the enhancement and good first issue labels on Nov 20, 2020
Owner

lovasoa commented Nov 20, 2020

If you are doing a batch download, you should probably check the exit status of dezoomify-rs after it has run. You should also probably tweak the network-related settings; in particular, you should increase the number of retries when a download fails and the time between consecutive retries.

Contributor Author

jsbien commented Nov 20, 2020

This is my command:
time curl "https://polona.pl/iiif/item/MTI2MzI0NjU/manifest.json" | jq -r ".items[].id" | xargs -n 1 ./dezoomify-rs -l --parallelism 1 --timeout 60s --retry-delay 60s
Is there an easy way to add exit status checking? Anyway, I can live with it. As for the number of retries, I hope the network problems will not occur again. Moreover, I'm not in a hurry and I don't want to increase the server load.
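
One possible way to add the check is to replace `xargs` with a small shell loop that records every URL for which the command failed. This is a sketch, not a tested recipe for this tool: it assumes that dezoomify-rs exits with a non-zero status when a download fails, and the thread does not confirm what the status is for a partially saved image. `download_all`, `DOWNLOADER` and `failed.txt` are hypothetical names; the options are the ones from the command above.

```shell
# DOWNLOADER is a hypothetical variable; it defaults to the command used in this thread.
DOWNLOADER="${DOWNLOADER:-./dezoomify-rs -l --parallelism 1 --timeout 60s --retry-delay 60s}"

# Read one URL per line from stdin, run the downloader on each,
# and append the URLs whose download failed to the file given as $1.
download_all() {
    failed_log="$1"
    : > "$failed_log"        # start with an empty list of failures
    while IFS= read -r url; do
        # $DOWNLOADER is deliberately unquoted so its options are split into words;
        # </dev/null stops the downloader from consuming the URL list on stdin
        if ! $DOWNLOADER "$url" </dev/null; then
            printf '%s\n' "$url" >> "$failed_log"
        fi
    done
    # succeed only if no failure was recorded
    [ ! -s "$failed_log" ]
}
```

It could then be used as `curl ... | jq -r ".items[].id" | download_all failed.txt || echo "some downloads failed, see failed.txt"`. Unlike `xargs`, which only reports that some invocation failed, the loop keeps the list of affected image URLs.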

Owner

lovasoa commented Nov 20, 2020

Increasing the number of retries will decrease the server load, not increase it. With only one retry, when the server starts to be overloaded and responds with errors, you will quickly move on to the next tile and make one more request to the already overloaded server. With, let's say, 10 retries (and a parallelism of 1), dezoomify-rs will try 10 times with an exponential backoff strategy: it will make the second try after 10s, the next one after waiting another 20s, then 40s, and so on. This will be slower, but you will be sure not to overwhelm the server.
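
To make the arithmetic concrete, here is a small illustration of the doubling schedule described above. It only reproduces the 10s, 20s, 40s pattern from the example and is not the actual dezoomify-rs implementation; `backoff_delays` is a made-up helper name.

```shell
# Print the wait before each retry, doubling the delay every time.
# $1: initial delay in seconds, $2: number of retries
backoff_delays() {
    delay="$1"
    retries="$2"
    i=1
    while [ "$i" -le "$retries" ]; do
        printf 'retry %d: wait %ds\n' "$i" "$delay"
        delay=$((delay * 2))   # exponential backoff: double the wait
        i=$((i + 1))
    done
}

backoff_delays 10 3
# retry 1: wait 10s
# retry 2: wait 20s
# retry 3: wait 40s
```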

Contributor Author

jsbien commented Nov 20, 2020

Thanks for the explanation. What about including it in the help? Currently it says:
--retry-delay
Amount of time to wait before retrying a request that failed [default: 2s]
So the default is different?

Owner

lovasoa commented Nov 21, 2020

Yes, this should be included in the help. Are you interested in making a contribution? The argument documentation is in src/arguments.rs and the remaining documentation is in README.md.

Contributor Author

jsbien commented Nov 21, 2020

Please have a look at my fork and check whether I have understood correctly what is going on.

Owner

lovasoa commented Nov 21, 2020

You can open a pull request here: https://github.com/lovasoa/dezoomify-rs/compare

I'll comment on it.


drzraf commented Feb 26, 2024

What's the exit status in case of partially saved images? Grepping for errors is problematic. Something like --with-errors or --without-errors is needed for users who prefer file integrity over partial results.

3 participants