Closed
Labels
api: storage — Issues related to the googleapis/python-storage API.
status: investigating — The issue is under investigation, which is determined to be non-trivial.
Description
I'm attempting to stream-write an uncompressed zip file (200MB+ -> 1GB+ eventually, mostly ~3MB images) to avoid writing to disk. Unfortunately, when the zip file is closed it attempts to at least partially flush the stream, and the storage client seems to assume that a flush will only occur to close the streamed file. That isn't the case here (and seems wrong in general, since buffers are often flushed when they reach saturation).
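For context, `zipfile` always flushes the underlying file object when the archive is closed, even if nothing was written to it. A stdlib-only sketch illustrates this; `FlushRecorder` is a hypothetical in-memory stand-in for the blob writer, not part of the storage client:

```python
import io
import zipfile


class FlushRecorder(io.BytesIO):
    """Hypothetical in-memory stand-in for the blob writer; counts flush() calls."""

    def __init__(self):
        super().__init__()
        self.flush_calls = 0

    def flush(self):
        self.flush_calls += 1
        super().flush()


buf = FlushRecorder()
with zipfile.ZipFile(buf, mode="w") as zip_file:
    pass  # even an empty archive triggers a flush when the ZipFile closes

assert buf.flush_calls >= 1  # zipfile's _write_end_record() flushed the stream
```

Any writer that raises on a mid-stream `flush()` will therefore break when handed directly to `zipfile.ZipFile`.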
Environment details
- OS type and version: Ubuntu 20.04 (custom devcontainer docker image)
- Python version: 3.8.10
- pip version: 21.3
- google-cloud-storage version: 1.42.3
Code example
```python
import zipfile
from google.cloud.storage import Blob, Client

client = Client("some-account")
blob = Blob.from_string("gs://some-bucket/some-folder/something.zip", client)

with blob.open(mode="wb") as blob_file, \
        zipfile.ZipFile(blob_file, mode="w") as zip_file:
    # Empty/not empty doesn't matter, the same error is generated
    pass
```

Stack trace
```
Traceback (most recent call last):
  File "s.py", line 14, in <module>
    pass
  File "/usr/lib/python3.8/zipfile.py", line 1312, in __exit__
    self.close()
  File "/usr/lib/python3.8/zipfile.py", line 1839, in close
    self._write_end_record()
  File "/usr/lib/python3.8/zipfile.py", line 1947, in _write_end_record
    self.fp.flush()
  File "/workspaces/someproject/.pyenv/lib/python3.8/site-packages/google/cloud/storage/fileio.py", line 401, in flush
    raise io.UnsupportedOperation(
io.UnsupportedOperation: Cannot flush without finalizing upload. Use close() instead.
```
Workaround
Insert an io.BufferedWriter:
```python
import io
import zipfile
from google.cloud.storage import Blob, Client

client = Client("some-account")
blob = Blob.from_string("gs://some-bucket/some-folder/something.zip", client)

with blob.open(mode="wb") as blob_file, \
        io.BufferedWriter(blob_file) as binary_file, \
        zipfile.ZipFile(binary_file, mode="w") as zip_file:
    # Empty/not empty doesn't matter, the same error is generated
    pass
```

This results in the file being written to cloud storage (and, at least for a simple case, with correctly written contents), but it prints a new error:
```
Traceback (most recent call last):
  File "s.py", line 24, in <module>
    pass
  File "/workspaces/someproject/.pyenv/lib/python3.8/site-packages/google/cloud/storage/fileio.py", line 406, in close
    self._checkClosed()  # Raises ValueError if closed.
  File "/workspaces/someproject/.pyenv/lib/python3.8/site-packages/google/cloud/storage/fileio.py", line 413, in _checkClosed
    raise ValueError("I/O operation on closed file.")
ValueError: I/O operation on closed file.
```
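The ValueError appears to arise because `io.BufferedWriter.close()` also closes the wrapped `blob_file`, so the outer `with` statement then calls `close()` a second time on an already-closed writer. The pattern can be reproduced with the standard library alone; `StrictCloser` below is a hypothetical stand-in for a writer whose `close()` is not idempotent (mirroring the observed behavior of the blob writer in 1.42.3):

```python
import contextlib
import io


class StrictCloser(io.BytesIO):
    """Hypothetical raw stream whose close() raises if called twice."""

    def close(self):
        if self.closed:
            raise ValueError("I/O operation on closed file.")
        super().close()


raw = StrictCloser()
buffered = io.BufferedWriter(raw)
buffered.write(b"data")
buffered.close()  # flushes, then closes the wrapped raw stream as well

with contextlib.suppress(ValueError):
    raw.close()  # second close raises in this stand-in; suppressed here
```

This suggests the second error is cosmetic (the upload has already been finalized by the first close); making the writer's `close()` idempotent, as the built-in `io` classes are, would avoid it.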