
Added untar progress similar to existing unzip #17519

Merged: 5 commits, Jan 17, 2025

Conversation

@perseoGI (Contributor)

Changelog: Feature: Add progress reporting while unpacking tarball files
Docs: omit

Closes #14589

  • Refer to the issue that supports this Pull Request.
  • If the issue has missing info, explain the purpose/use case/pain/need that covers this Pull Request.
  • I've read the Contributing guide.
  • I've followed the PEP8 style guides for Python code.
  • I've opened another PR in the Conan docs repo to the develop branch, documenting this one.

@memsharded (Member) left a comment:

There are different pieces in this PR.
For the untargz, the same approach as used in the FileUploader seems to make sense, but reusing the same code, prepared for a later refactor.

For the zip uncompress, if it can use the same FileProgress class, great; if not, maybe leave that untouched.

Review comments on conan/tools/files/files.py (outdated, resolved)
@memsharded (Member) left a comment:

Having FileProgress wrap the full file object, including the open/close context, is a bit risky, as the file descriptor will not be properly closed. This is one of the reasons the previous wrapper did not inherit from io.FileIO and wrapped only the minimum necessary for the progress.

Review comments on conan/tools/files/files.py (outdated, resolved)
@memsharded memsharded self-assigned this Dec 27, 2024
@perseoGI (Contributor, Author) replied:

> Having FileProgress wrap the full file object, including the open/close context, is a bit risky, as the file descriptor will not be properly closed. This is one of the reasons the previous wrapper did not inherit from io.FileIO and wrapped only the minimum necessary for the progress.

I understand your concern. I tried the original FileProgress approach but wanted to write a more "general" FileProgress wrapper that could work in different scenarios, such as this one:
https://github.com/perseoGI/conan/blob/8fe92be6d52f5fa357fed25d83ddced6b28b8f15/conan/tools/files/files.py?plain=1#L356
The only way I found to get tarfile progress without any impact on the decompression is to create a wrapper around the fileobj class.
Please look at the new changes, which now use a with statement, simplifying the consumer side.

filesize = os.path.getsize(abs_path)
with open(abs_path, mode='rb') as file_handler:
    big_file = filesize > 100000000  # 100 MB
    file_handler = FileProgress(file_handler, filesize) if big_file else file_handler
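As a sketch, the pattern above could feed tarfile directly, passing the (possibly wrapped) handle as fileobj. Here untargz_with_progress and the wrapper parameter are illustrative assumptions, not the PR's actual API:

```python
import os
import tarfile

BIG_FILE_THRESHOLD = 100_000_000  # 100 MB, matching the snippet above

def untargz_with_progress(abs_path, dest, wrapper=None):
    # 'wrapper' stands in for a FileProgress-like factory taking
    # (fileobj, total_size) and returning a file-like object; when it is
    # None or the file is small, the raw handle is used untouched.
    filesize = os.path.getsize(abs_path)
    with open(abs_path, mode="rb") as file_handler:
        if wrapper is not None and filesize > BIG_FILE_THRESHOLD:
            file_handler = wrapper(file_handler, filesize)
        # "r:*" lets tarfile detect the compression from the stream
        with tarfile.open(fileobj=file_handler, mode="r:*") as tar:
            tar.extractall(path=dest)
```

The open() context still owns the descriptor, so the wrapper only ever decorates reads.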
(Member) left a comment:

I'd like to keep this pattern: avoid the overhead of the progress for small files, do it only for big files.

(Member) left a comment:

Also when applied to the unzip

(Contributor, Author) left a comment:

By using a TimedOutput every 10 seconds (see https://github.com/conan-io/conan/pull/17519/files#diff-332104132c8c5a1bf5b07a3835b599373594e2d5a32dfba98955b7c2377ba721R104), it may not be necessary to check whether the file is big.

You could be trying to download a 1 MB file, and if your bandwidth is poor, it could take a while.
Progress bars, or in this case "progress prints", are IMHO useful for users, as they get feedback that something is being done. So the 10-second interval sounds better to me than the file-size check!
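The TimedOutput mentioned here throttles messages by time. A minimal standalone stand-in for the idea (illustrative only, not Conan's actual TimedOutput class) could look like:

```python
import time

class TimedPrinter:
    """Emit a message at most once per `interval` seconds, silently
    dropping the messages in between."""

    def __init__(self, interval, out=print):
        self._interval = interval
        self._out = out
        self._last = 0.0  # epoch of the last emitted message; 0.0 means
                          # the very first info() call always prints

    def info(self, msg):
        now = time.time()
        if now - self._last >= self._interval:
            self._last = now
            self._out(msg)
```

With a 10-second interval, a read loop can call info() on every chunk and still print only occasionally, which is the argument being made above.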

(Member) left a comment:

If downloading a 1 MB file takes a long time, you are not going to be able to use Conan to install dependencies at all, as packages very frequently weigh hundreds of MB.

The issue is not about the output. Even for small files that don't take 10 seconds to download and thus print nothing, there are other risks:

  • If there is some bug or issue, the chances that it bites are much higher. A ton of files, maybe 95% of the files Conan downloads, are under 100 MB. It is better to reduce the executed code to the minimum: the only code guaranteed to have no bugs is the code that is not written (or, in this case, not executed).
  • There will always be some performance penalty. Maybe it is small or negligible in this case, but performance is already becoming an important factor as Conan use cases keep scaling. Using the progress wrapper for all files, irrespective of whether they are small and will never print anything, might still have an impact on file read performance.

I know I am probably being excessively concerned about this, but that is mostly because of previous scars from progress bars and other "beautiful UX nice-to-have" things that end up bringing very hard-to-debug issues. So for me, in this area, the less the better.
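The per-read overhead being discussed is measurable. A rough sketch with timeit (all names are illustrative, and no numbers are claimed since they depend entirely on the machine):

```python
import io
import timeit

DATA = b"x" * (1 << 20)  # 1 MiB of dummy data

def read_plain():
    # Baseline: read the buffer in 64 KiB chunks with no bookkeeping
    buf = io.BytesIO(DATA)
    while buf.read(65536):
        pass

class Counted(io.BytesIO):
    # Wrapped variant: one counter update per read() call, roughly the
    # cost the progress wrapper adds even when nothing is ever printed
    def read(self, size=-1):
        block = super().read(size)
        self.count = getattr(self, "count", 0) + len(block)
        return block

def read_wrapped():
    buf = Counted(DATA)
    while buf.read(65536):
        pass

if __name__ == "__main__":
    # Absolute numbers are machine-dependent; only the ratio is of interest
    print("plain  :", timeit.timeit(read_plain, number=200))
    print("wrapped:", timeit.timeit(read_wrapped, number=200))
```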

(Contributor, Author) left a comment:

Okay, I understand your concerns. I've slightly changed the code to report progress (and only execute that part of the code) if the file size is greater than 100 MB, as it was originally:

def read(self, size: int = -1) -> bytes:
    current_percentage = int(self.tell() * 100.0 / self._total_size) if self._total_size != 0 else 0
    self._t.info(f"{self.msg} {self._filename}: {current_percentage}%")
    return super().read(size)
(Member) left a comment:

It might be better to accumulate the read bytes and avoid the extra self.tell() call
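That suggestion might look like the following sketch, where io.BytesIO only keeps the example self-contained (the PR wraps a real file) and report stands in for the self._t.info sink; all names are assumptions:

```python
import io

class ProgressReader(io.BytesIO):
    """Illustrative variant of the read() above: accumulate the read
    bytes in a counter instead of calling self.tell() on every read."""

    def __init__(self, data, report):
        super().__init__(data)
        self._total_size = len(data)
        self._read_bytes = 0
        self._report = report  # stand-in for self._t.info

    def read(self, size=-1):
        block = super().read(size)
        self._read_bytes += len(block)  # one cheap addition per read
        if self._total_size:
            self._report(self._read_bytes * 100 // self._total_size)
        return block
```

The counter update costs a single integer addition per call, avoiding the extra method dispatch of tell(), which is the point of the review comment.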

@perseoGI (Contributor, Author), Jan 16, 2025:

Yes that is a good one!

(Contributor, Author) left a comment:

Done in the last commit

@memsharded memsharded merged commit 4bb0bcc into conan-io:develop2 Jan 17, 2025
33 checks passed
@memsharded memsharded added this to the 2.12.0 milestone Jan 17, 2025
Merging this pull request may close:
[feature] Report progress of unpacking source files