
Shift augmentation in ShiftScaleRotate works incorrectly for keypoints and bboxes #182

Closed
mortido opened this issue Feb 7, 2019 · 13 comments
Labels
bug Something isn't working

@mortido

mortido commented Feb 7, 2019

Version: 1.12
Shift augmentation in ShiftScaleRotate works incorrectly for keypoints and bboxes. Please compare how it is applied to the image:
https://github.com/albu/albumentations/blob/c26383ecd9eeb51d57185bfd699179a8a41f7b6d/albumentations/augmentations/functional.py#L143

BBoxes:
https://github.com/albu/albumentations/blob/c26383ecd9eeb51d57185bfd699179a8a41f7b6d/albumentations/augmentations/functional.py#L635

and keypoints:
https://github.com/albu/albumentations/blob/c26383ecd9eeb51d57185bfd699179a8a41f7b6d/albumentations/augmentations/functional.py#L861

'dx' and 'dy' are percentage values of the image width and height. Since we don't have access to the image shape during these transforms, it may be better to set the shift range in pixels rather than in percent.

@albu
Contributor

albu commented Feb 7, 2019

bboxes and keypoints use normalized coordinates internally
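
To make the point about normalized coordinates concrete, here is a minimal sketch. It assumes boxes are stored as (x_min, y_min, x_max, y_max) fractions of the image size; the helper names are hypothetical and are not the library's functions. It shows that adding the fraction dx to a normalized coordinate gives the same result as adding dx * width pixels to the pixel coordinate.

```python
def normalize_bbox(bbox, rows, cols):
    """Pixel (x_min, y_min, x_max, y_max) -> fractions of width/height."""
    x_min, y_min, x_max, y_max = bbox
    return (x_min / cols, y_min / rows, x_max / cols, y_max / rows)


def denormalize_bbox(bbox, rows, cols):
    """Fractions of width/height -> pixel coordinates."""
    x_min, y_min, x_max, y_max = bbox
    return (x_min * cols, y_min * rows, x_max * cols, y_max * rows)


def shift_normalized(bbox, dx, dy):
    """Shift a normalized box by fractions dx, dy (hypothetical helper)."""
    x_min, y_min, x_max, y_max = bbox
    return (x_min + dx, y_min + dy, x_max + dx, y_max + dy)


def shift_pixels(bbox, dx, dy, rows, cols):
    """Shift a pixel box by dx * width and dy * height, as done for the image."""
    x_min, y_min, x_max, y_max = bbox
    return (x_min + dx * cols, y_min + dy * rows,
            x_max + dx * cols, y_max + dy * rows)


rows, cols = 100, 200
box_px = (20.0, 10.0, 60.0, 50.0)
dx, dy = 0.1, 0.2

via_normalized = denormalize_bbox(
    shift_normalized(normalize_bbox(box_px, rows, cols), dx, dy), rows, cols
)
via_pixels = shift_pixels(box_px, dx, dy, rows, cols)
```

The two results agree up to floating-point error, but only if the coordinates really are normalized at the moment the shift is applied.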

mortido closed this as completed Feb 7, 2019
@mortido
Author

mortido commented Feb 7, 2019

Actually... I'm going to reopen it.
I added logging to the method at https://github.com/albu/albumentations/blob/c26383ecd9eeb51d57185bfd699179a8a41f7b6d/albumentations/augmentations/functional.py#L858
and it doesn't look like the keypoints are normalized. There is also a commit that removes the normalize_keypoint function: 41a5fdf

mortido reopened this Feb 7, 2019
@albu
Contributor

albu commented Feb 7, 2019

Thanks. I'll pass it to @BloodAxe

@BloodAxe
Contributor

BloodAxe commented Feb 7, 2019

Thanks, I'll take a look at this issue over the weekend.

ternaus added the bug Something isn't working label Feb 10, 2019
@diegombar

Hi all,
First, thank you for such a useful library!
I noticed that ShiftScaleRotate may still not be working correctly for bounding boxes. When applying only a rotation, the rotated bounding boxes are close to the objects in the rotated image but do not correspond exactly to them (at least for rectangular images whose height and width differ). Some objects actually end up outside the rotated bounding box.

The function in question:
https://github.com/albu/albumentations/blob/09573604506b0f6b2eec3b8d6555faf10f0c5bc4/albumentations/augmentations/functional.py#L143-L153

Denormalizing the coordinates before the rotation and renormalizing them afterwards solved the issues I had:

import cv2
import numpy as np


def bbox_shift_scale_rotate(bbox, angle, scale, dx, dy, interpolation, rows, cols, **params):
    height, width = rows, cols
    center = (width / 2, height / 2)
    # Build the affine matrix in pixel units, as is done for the image.
    matrix = cv2.getRotationMatrix2D(center, angle, scale)
    matrix[0, 2] += dx * width
    matrix[1, 2] += dy * height
    # Four corners of the normalized bbox in homogeneous coordinates.
    x = np.array([bbox[0], bbox[2], bbox[2], bbox[0]])
    y = np.array([bbox[1], bbox[1], bbox[3], bbox[3]])
    ones = np.ones(shape=(len(x)))
    points_ones = np.vstack([x, y, ones]).transpose()
    # Denormalize to pixels before applying the matrix.
    points_ones[:, 0] *= width
    points_ones[:, 1] *= height
    tr_points = matrix.dot(points_ones.T).T
    # Renormalize the transformed corners.
    tr_points[:, 0] /= width
    tr_points[:, 1] /= height
    # Axis-aligned box that encloses the transformed corners.
    return [min(tr_points[:, 0]), min(tr_points[:, 1]), max(tr_points[:, 0]), max(tr_points[:, 1])]

It looks like normalization introduces different rescalings for each axis that make the transformation different from the one applied to images. Do you agree or did I miss something?
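
The mismatch described above can be reproduced without any image. The sketch below is a simplified illustration, not the library's code: `rotation_matrix` is a hypothetical pure-Python helper that follows the formula documented for cv2.getRotationMatrix2D, and the image size and point are arbitrary values chosen for the example. It rotates one point by 90 degrees, once in pixel space (then normalizes) and once directly in normalized space.

```python
import math


def rotation_matrix(center, angle_deg, scale=1.0):
    """2x3 affine matrix, following the documented cv2.getRotationMatrix2D formula."""
    cx, cy = center
    alpha = scale * math.cos(math.radians(angle_deg))
    beta = scale * math.sin(math.radians(angle_deg))
    return [
        [alpha, beta, (1 - alpha) * cx - beta * cy],
        [-beta, alpha, beta * cx + (1 - alpha) * cy],
    ]


def apply(matrix, point):
    """Apply a 2x3 affine matrix to a 2D point."""
    x, y = point
    return (
        matrix[0][0] * x + matrix[0][1] * y + matrix[0][2],
        matrix[1][0] * x + matrix[1][1] * y + matrix[1][2],
    )


width, height = 200, 100          # non-square image (example values)
angle = 90
point_px = (150.0, 50.0)          # normalized: (0.75, 0.5)

# Rotate in pixel space around the image center, then normalize.
m_px = rotation_matrix((width / 2, height / 2), angle)
x, y = apply(m_px, point_px)
in_pixel_space = (x / width, y / height)

# Rotate directly in normalized space around (0.5, 0.5).
m_norm = rotation_matrix((0.5, 0.5), angle)
in_normalized_space = apply(m_norm, (point_px[0] / width, point_px[1] / height))
```

For this 200 x 100 image the two results differ in the y coordinate (about 0.0 versus 0.25). With width equal to height they coincide, which is consistent with the problem showing up only on non-square images.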

@johnsutor

I'm experiencing issues with bounding boxes in YOLO format. The center coordinates are correct, but the width and height of the bounding box are way off. This is the chunk of code I use to read a YOLO annotation, apply augmentations to the data, and then save the augmented result (for a single object):

# Read in the object bounding boxes
with open(WORK_DIR + '/renders/' + img[:-4] + '.txt', 'r') as f:
    labels = f.readline()
    # Skip the leading class id and keep the four box values
    coords = labels.split(' ')[1:]
    coords = [float(i) for i in coords]

# Append the class id (0) after the four box values
coords.append(0)

# Apply the transforms to the image
image = imread(WORK_DIR + '/renders/' + img, pilmode='RGB')
image = transforms(image=image, bboxes=[coords])

print(f"Applying background and augmentations to {WORK_DIR + '/renders/' + img}")

print(f"before: {coords} after: {image['bboxes']}")

imsave(WORK_DIR + '/renders/' + img, image["image"])

# Save the new annotation
with open(WORK_DIR + '/renders/' + img[:-4] + '.txt', 'w') as f:
    f.write(f"0 {image['bboxes'][0][0]} {image['bboxes'][0][1]} {image['bboxes'][0][2]} {image['bboxes'][0][3]}")
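
The snippet above does not show how `transforms` was built, so one thing worth checking is that the pipeline's bbox_params declares format="yolo", matching the annotation file. As a reference for sanity-checking the printed values, here is a minimal sketch of the YOLO layout (x_center, y_center, width, height, all normalized) and its conversion to corner coordinates. The helper names and the example box are assumptions made for illustration.

```python
def yolo_to_corners(box):
    """(x_center, y_center, width, height) -> (x_min, y_min, x_max, y_max)."""
    x_c, y_c, w, h = box
    return (x_c - w / 2, y_c - h / 2, x_c + w / 2, y_c + h / 2)


def corners_to_yolo(box):
    """(x_min, y_min, x_max, y_max) -> (x_center, y_center, width, height)."""
    x_min, y_min, x_max, y_max = box
    return ((x_min + x_max) / 2, (y_min + y_max) / 2, x_max - x_min, y_max - y_min)


yolo_box = (0.5, 0.5, 0.2, 0.4)        # example values, normalized
corners = yolo_to_corners(yolo_box)    # roughly (0.4, 0.3, 0.6, 0.7)
round_trip = corners_to_yolo(corners)  # back to the original box
```

Comparing the boxes before and after augmentation in corner form makes it easier to see whether only the size changed or the whole box moved.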

@cornzyblack

I am still experiencing the same issue as well.

@AlessandroMondin

AlessandroMondin commented Oct 11, 2022

Adding evidence on a COCO image:
on the left are the ground-truth bboxes, on the right the augmented bboxes.
It looks quite horrible, to be honest. Do you suggest any workaround?
[Image: Screenshot 2022-10-11 at 12 52 52]

Here below the augmentations:

TRAIN_TRANSFORMS = A.Compose(
    [
        A.Resize(width=640, height=640, interpolation=cv2.INTER_LINEAR),
        A.ColorJitter(brightness=0.6, contrast=0.6, saturation=0.6, hue=0.6, p=0.4),
        A.OneOf(
            [
                A.ShiftScaleRotate(
                    rotate_limit=20, p=0.5, border_mode=cv2.BORDER_CONSTANT
                ),
                A.Affine(shear=15, p=0.5, mode=cv2.BORDER_CONSTANT),
            ],
            p=1.0,
        ),
        A.HorizontalFlip(p=0.5),
        A.Blur(p=0.1),
        A.CLAHE(p=0.1),
        A.Posterize(p=0.1),
        A.ToGray(p=0.1),
        A.ChannelShuffle(p=0.05),
        A.Normalize(mean=[0, 0, 0], std=[1, 1, 1], max_pixel_value=255),
        ToTensorV2(),
    ],
    bbox_params=A.BboxParams(format="coco", min_visibility=0.4, label_fields=[]),
)

VAL_TRANSFORM = A.Compose(
    [
        A.LongestMaxSize(max_size=IMAGE_SIZE),
        A.PadIfNeeded(
            min_height=IMAGE_SIZE, min_width=IMAGE_SIZE, border_mode=cv2.BORDER_CONSTANT
        ),
        A.Normalize(mean=[0, 0, 0], std=[1, 1, 1], max_pixel_value=255),
        ToTensorV2(),
    ],
    bbox_params=A.BboxParams(format="coco", min_visibility=0.4, label_fields=[]),
)

@AlessandroMondin

Unfortunately, the result does not change.
[Image: Screenshot 2022-10-11 at 13 15 26]

TRAIN_TRANSFORMS = A.Compose(
    [
        A.Resize(width=640, height=640, interpolation=cv2.INTER_LINEAR),
        A.ColorJitter(brightness=0.4, contrast=0.4, saturation=0.4, hue=0.4, p=0.4),
        A.OneOf(
            [
                A.ShiftScaleRotate(
                    rotate_limit=20,
                    p=0.5,
                    border_mode=cv2.BORDER_CONSTANT,
                    rotate_method="ellipse",
                ),
                A.Affine(shear=15, p=0.5, mode=cv2.BORDER_CONSTANT),
            ],
            p=1.0,
        ),
        A.HorizontalFlip(p=0.5),
        A.Blur(p=0.1),
        A.CLAHE(p=0.1),
        A.Posterize(p=0.1),
        A.ToGray(p=0.1),
        A.ChannelShuffle(p=0.05),
        A.Normalize(mean=[0, 0, 0], std=[1, 1, 1], max_pixel_value=255),
        ToTensorV2(),
    ],
    bbox_params=A.BboxParams(format="coco", min_visibility=0.4, label_fields=[]),
)

@Dipet
Collaborator

Dipet commented Oct 14, 2022

Looks like we need to add rotate_method=ellipse for Affine.
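
For readers unfamiliar with the idea behind an "ellipse" rotation method, here is a rough sketch of the concept only; it is not albumentations' implementation, and all function names are hypothetical. Rotating the four corners of a box and taking the enclosing axis-aligned box inflates it, whereas rotating points sampled on the ellipse inscribed in the box gives a tighter result. For simplicity the sketch rotates around the box center; the size of the enclosing box does not depend on the rotation center.

```python
import math


def rotate_point(point, center, angle_deg):
    """Rotate a point around a center by angle_deg degrees."""
    theta = math.radians(angle_deg)
    dx, dy = point[0] - center[0], point[1] - center[1]
    return (
        center[0] + dx * math.cos(theta) - dy * math.sin(theta),
        center[1] + dx * math.sin(theta) + dy * math.cos(theta),
    )


def enclosing_bbox(points):
    """Axis-aligned box (x_min, y_min, x_max, y_max) around a set of points."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (min(xs), min(ys), max(xs), max(ys))


def rotate_bbox_corners(bbox, angle_deg):
    """Rotate the four corners and enclose them."""
    x_min, y_min, x_max, y_max = bbox
    center = ((x_min + x_max) / 2, (y_min + y_max) / 2)
    corners = [(x_min, y_min), (x_max, y_min), (x_max, y_max), (x_min, y_max)]
    return enclosing_bbox([rotate_point(c, center, angle_deg) for c in corners])


def rotate_bbox_ellipse(bbox, angle_deg, samples=360):
    """Rotate points sampled on the inscribed ellipse and enclose them."""
    x_min, y_min, x_max, y_max = bbox
    center = ((x_min + x_max) / 2, (y_min + y_max) / 2)
    a, b = (x_max - x_min) / 2, (y_max - y_min) / 2
    points = [
        (center[0] + a * math.cos(2 * math.pi * k / samples),
         center[1] + b * math.sin(2 * math.pi * k / samples))
        for k in range(samples)
    ]
    return enclosing_bbox([rotate_point(p, center, angle_deg) for p in points])


box = (-1.0, -1.0, 1.0, 1.0)
by_corners = rotate_bbox_corners(box, 45)
by_ellipse = rotate_bbox_ellipse(box, 45)
corner_width = by_corners[2] - by_corners[0]    # about 2.83
ellipse_width = by_ellipse[2] - by_ellipse[0]   # about 2.0
```

The ellipse variant assumes the object roughly fills the inscribed ellipse; an object that reaches into the corners of its box can end up partly outside the tighter result.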

@mikel-brostrom

Any updates on this @Dipet?

@Dipet
Collaborator

Dipet commented Jan 31, 2023

Hi, sorry, there are no updates on this feature.


10 participants