Images in the exported dataset is different to the origin one #8771

Open
2 tasks done
imyhxy opened this issue Dec 4, 2024 · 2 comments
Labels
bug (Something isn't working), dataset

Comments


imyhxy commented Dec 4, 2024

Actions before raising this issue

  • I searched the existing issues and did not find anything similar.
  • I read/searched the docs

Steps to Reproduce

Create a task with a JPEG image, then export the dataset with images. Use md5sum to compare the image in the exported dataset with the original. They do not match. Sometimes an 8 MB JPEG becomes 16 MB in the exported dataset.
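The md5sum comparison above can be scripted for a whole dataset. The following is only a rough sketch: the function names `md5_of` and `compare_dirs` are made up for illustration, and it assumes the original and exported images sit in two flat directories with matching file names, which may not reflect the layout of a real export archive.

```python
import hashlib
from pathlib import Path


def md5_of(path: Path) -> str:
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_dirs(original_dir: Path, exported_dir: Path) -> list[tuple[str, int, int]]:
    """List (file name, original size, exported size) for files whose bytes differ.

    Assumption: both directories are flat and use the same file names.
    Files present on only one side are skipped.
    """
    mismatches = []
    for original in sorted(original_dir.iterdir()):
        exported = exported_dir / original.name
        if not original.is_file() or not exported.is_file():
            continue
        if md5_of(original) != md5_of(exported):
            mismatches.append(
                (original.name, original.stat().st_size, exported.stat().st_size)
            )
    return mismatches
```

Calling `compare_dirs(Path("originals"), Path("export/images"))` (both paths are placeholders) returns an empty list when every exported file is byte-identical to its source; the reported sizes make a size jump such as 8 MB to 16 MB easy to spot.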

Expected Behavior

The original image and the exported image should be exactly the same, byte for byte.

Possible Solution

Do not decode the original image when exporting a dataset with images. Just copy the binary content.
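To illustrate the idea, here is a minimal sketch of a byte-for-byte copy. It is not CVAT's export code: `copy_image_verbatim` is a hypothetical helper, and where such a step would fit in the real export path is an open question. The point is only that the file never passes through an image library, so no decode and re-encode step can change its contents or size.

```python
import hashlib
import shutil
from pathlib import Path


def copy_image_verbatim(src: Path, dst: Path) -> str:
    """Copy src to dst byte for byte and return the MD5 digest of the copy.

    Hypothetical helper for illustration only. The image is treated as an
    opaque file, so its bytes are never decoded or re-encoded.
    """
    dst.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(src, dst)
    return hashlib.md5(dst.read_bytes()).hexdigest()
```

With this approach the digest returned for the copy matches the md5sum of the source file, which is the behavior the issue asks for.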

Context

No response

Environment

  • git hash log:
commit c737f083ac6d9eaf5013acef1392b5922105f3c6 (HEAD, tag: v2.22.0)
Merge: 8d990c986 333df3563
Author: cvat-bot[bot] <147643061+cvat-bot[bot]@users.noreply.github.com>
Date:   Mon Nov 11 13:57:35 2024 +0000

    Merge pull request #8678 from cvat-ai/release-2.22.0
    
    Release v2.22.0
  • docker version
Client: Docker Engine - Community
 Version:           25.0.2
 API version:       1.44
 Go version:        go1.21.6
 Git commit:        29cf629
 Built:             Thu Feb  1 00:23:03 2024
 OS/Arch:           linux/amd64
 Context:           default

Server: Docker Engine - Community
 Engine:
  Version:          25.0.2
  API version:      1.44 (minimum version 1.24)
  Go version:       go1.21.6
  Git commit:       fce6e0c
  Built:            Thu Feb  1 00:23:03 2024
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.6.28
  GitCommit:        ae07eda36dd25f8a1b98dfbf587313b99c0190bb
 runc:
  Version:          1.1.12
  GitCommit:        v1.1.12-0-g51d5e94
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0
  • system version
Ubuntu 24.04.1 LTS
imyhxy added the bug label on Dec 4, 2024
@bsekachev
Member

Hello, thanks for the report.


sriramsowmithri9807 commented Dec 11, 2024

Hi @imyhxy,

I’m experiencing the same issue with the exported dataset images differing from the original ones. Thank you for detailing the steps to reproduce the problem.

Actions Before Raising This Issue

I also searched existing issues and found nothing similar.
I reviewed the documentation for any relevant information.

Steps to Reproduce

Created a task with a JPEG image.
Exported the dataset with images.
Used md5sum to compare the images in the exported dataset with the originals.
As you noted, the checksums do not match, and I’ve observed that sometimes an 8MB JPEG file becomes 16MB in the exported dataset.

Expected Behavior

I expected the original image and the exported image to be identical.

Possible Solution

I agree with your suggestion: it would be best to avoid decoding the original image when exporting the dataset with images and instead copy the binary content directly.

Context

I’m using the following environment setup:

Git Hash Log:

commit c737f083ac6d9eaf5013acef1392b5922105f3c6 (HEAD, tag: v2.22.0)
Merge: 8d990c986 333df3563
Author: cvat-bot[bot] <147643061+cvat-bot[bot]@users.noreply.github.com>
Date:   Mon Nov 11 13:57:35 2024 +0000

Docker Version:

Client: Docker Engine - Community
 Version:           25.0.2
 API version:       1.44
 Go version:        go1.21.6
 Git commit:        29cf629
 Built:             Thu Feb  1 00:23:03 2024
 OS/Arch:           linux/amd64
System Version:

Ubuntu 24.04.1 LTS

If there are any updates or further investigations into this issue, I would be keen to follow along. Thank you for your attention to this matter!

Best,
Sowmithri Sriram.
sriramsowmithri9807 --> github

Projects
Status: To do