[CLOUD_CONTAINER_DELETION] Core logic for container compaction #1646

Merged
merged 23 commits into from
Nov 24, 2020

Conversation

ankagrawal
Collaborator

Container deletion logic in cloud.

@codecov-commenter

Codecov Report

Merging #1646 into master will decrease coverage by 0.13%.
The diff coverage is 41.58%.

@@             Coverage Diff              @@
##             master    #1646      +/-   ##
============================================
- Coverage     73.56%   73.42%   -0.14%     
- Complexity     8639     8748     +109     
============================================
  Files           609      619      +10     
  Lines         47183    47827     +644     
  Branches       5962     6035      +73     
============================================
+ Hits          34708    35119     +411     
- Misses        10694    10903     +209     
- Partials       1781     1805      +24     
Impacted Files Coverage Δ Complexity Δ
...om/github/ambry/cloud/CloudContainerCompactor.java 0.00% <0.00%> (ø) 0.00 <0.00> (?)
...java/com/github/ambry/cloud/CloudRequestAgent.java 88.57% <0.00%> (-2.61%) 9.00 <0.00> (ø)
...hub/ambry/cloud/azure/AzureContainerCompactor.java 20.45% <20.45%> (ø) 2.00 <2.00> (?)
...com/github/ambry/cloud/ContainerDeletionEntry.java 24.48% <24.48%> (ø) 3.00 <3.00> (?)
...m/github/ambry/cloud/azure/CosmosDataAccessor.java 61.15% <29.41%> (-2.49%) 40.00 <1.00> (+1.00) ⬇️
...ithub/ambry/cloud/azure/AzureCloudDestination.java 73.86% <54.54%> (-1.58%) 35.00 <6.00> (ø)
.../java/com/github/ambry/cloud/StaticVcrCluster.java 80.00% <60.00%> (-6.21%) 10.00 <1.00> (ø)
...n/java/com/github/ambry/cloud/HelixVcrCluster.java 72.97% <64.28%> (-2.44%) 12.00 <1.00> (+1.00) ⬇️
...ithub/ambry/cloud/azure/AzureStorageCompactor.java 71.08% <75.00%> (+0.17%) 25.00 <1.00> (ø)
...ub/ambry/cloud/CloudContainerDeletionSyncTask.java 78.57% <78.57%> (ø) 3.00 <3.00> (?)
... and 85 more


@codecov-io

codecov-io commented Oct 15, 2020

Codecov Report

Merging #1646 into master will decrease coverage by 57.94%.
The diff coverage is 0.00%.

@@              Coverage Diff              @@
##             master    #1646       +/-   ##
=============================================
- Coverage     73.65%   15.71%   -57.95%     
+ Complexity     8804     2088     -6716     
=============================================
  Files           629      630        +1     
  Lines         48371    48539      +168     
  Branches       6076     6094       +18     
=============================================
- Hits          35626     7626    -28000     
- Misses        10872    40476    +29604     
+ Partials       1873      437     -1436     
Impacted Files Coverage Δ Complexity Δ
...main/java/com/github/ambry/config/CloudConfig.java 0.00% <0.00%> (-100.00%) 0.00 <0.00> (-1.00)
...java/com/github/ambry/cloud/CloudRequestAgent.java 0.00% <0.00%> (-88.58%) 0.00 <0.00> (-9.00)
...n/java/com/github/ambry/cloud/HelixVcrCluster.java 0.00% <0.00%> (-75.00%) 0.00 <0.00> (-12.00)
.../java/com/github/ambry/cloud/StaticVcrCluster.java 0.00% <0.00%> (-86.21%) 0.00 <0.00> (-10.00)
...c/main/java/com/github/ambry/cloud/VcrMetrics.java 0.00% <0.00%> (-100.00%) 0.00 <0.00> (-2.00)
.../com/github/ambry/cloud/VcrReplicationManager.java 0.00% <0.00%> (-73.00%) 0.00 <0.00> (-15.00)
...com/github/ambry/cloud/azure/AzureCloudConfig.java 0.00% <0.00%> (-100.00%) 0.00 <0.00> (-1.00)
...ithub/ambry/cloud/azure/AzureCloudDestination.java 0.00% <0.00%> (-73.60%) 0.00 <0.00> (-35.00)
...mbry/cloud/azure/AzureCloudDestinationFactory.java 0.00% <0.00%> (-75.00%) 0.00 <0.00> (-5.00)
.../github/ambry/cloud/azure/AzureCompactionUtil.java 0.00% <0.00%> (ø) 0.00 <0.00> (?)
... and 499 more


@ankagrawal ankagrawal changed the title [CLOUD_CONTAINER_DELETION] Core logic for container compaction (WIP) [CLOUD_CONTAINER_DELETION] Core logic for container compaction Oct 19, 2020
@lightningrob
Contributor

Haven't reviewed yet, but Codecov says the diff coverage is just over 40%, so you may want to beef that up.

Comment on lines 390 to 393
cloudContainerCompactionIntervalHours = verifiableProperties.getInt(CLOUD_CONTAINER_COMPACTION_INTERVAL_HOURS, 24);
cloudBlobCompactionStartupDelaySecs = verifiableProperties.getInt(CLOUD_BLOB_COMPACTION_STARTUP_DELAY_SECS, 600);
cloudContainerCompactionStartupDelaySecs =
verifiableProperties.getInt(CLOUD_CONTAINER_COMPACTION_STARTUP_DELAY_SECS, 600);
Contributor

suggest using getIntInRange

Collaborator Author

fixed.
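
For readers following the suggestion above, here is a minimal sketch of what a range-checked integer lookup does. `RangeCheckedConfig` is a hypothetical stand-in built on `java.util.Properties`, not Ambry's `VerifiableProperties`, and the key name in the usage note is made up for illustration.

```java
import java.util.Properties;

/** Hypothetical stand-in for a range-checked config reader (an assumption, not Ambry's class). */
final class RangeCheckedConfig {
  private final Properties props;

  RangeCheckedConfig(Properties props) {
    this.props = props;
  }

  /**
   * Returns the configured int, or defaultVal when the key is absent.
   * Rejects any value (including the default) outside the inclusive range [start, end].
   */
  int getIntInRange(String name, int defaultVal, int start, int end) {
    String raw = props.getProperty(name);
    int value = (raw == null) ? defaultVal : Integer.parseInt(raw.trim());
    if (value < start || value > end) {
      throw new IllegalArgumentException(
          name + " has value " + value + " which is not in the range [" + start + ", " + end + "]");
    }
    return value;
  }
}
```

With a reader like this, an interval setting could be loaded as `config.getIntInRange("container.compaction.interval.hours", 24, 1, Integer.MAX_VALUE)` (hypothetical key), so a misconfigured `0` or negative value fails at startup instead of silently producing a bad schedule.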

Comment on lines 159 to 164
cosmosContainerDeletionBatchSize =
verifiableProperties.getInt(COSMOS_CONTAINER_DELETION_BATCH_SIZE, DEFAULT_COSMOS_CONTAINER_DELETION_BATCH_SIZE);
containerCompactionAbsPurgeLimit =
verifiableProperties.getInt(CONTAINER_COMPACTION_ABS_PURGE_LIMIT, DEFAULT_CONTAINER_COMPACTION_ABS_PURGE_LIMIT);
containerCompactionCosmosQueryLimit = verifiableProperties.getInt(CONTAINER_COMPACTION_COSMOS_QUERY_LIMIT,
DEFAULT_COSMOS_CONTAINER_DELETION_BATCH_SIZE);
Contributor

same here, better to use getIntInRange

Collaborator Author

It's not 100% clear what the theoretical upper limit of this value should be, but it definitely shouldn't be less than 1. Made the changes accordingly.

Contributor

@lightningrob left a comment

Mostly looks good. A few concerns mentioned.

throws DocumentClientException {
String query = String.format(CONTAINER_BLOBS_QUERY, accountId, containerId);
SqlQuerySpec querySpec = new SqlQuerySpec(query);
FeedOptions feedOptions = new FeedOptions();
Contributor

The code from here down is nearly identical to getDeadBlobs. Please refactor it into a common class that takes a SqlQuerySpec as an argument.
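
One possible shape for that refactoring, as a sketch only: the shared "run a query and collect results" loop lives in one place and each caller supplies its own query. `QuerySpec` and the executor function are hypothetical stand-ins for the Cosmos SDK types, which are not shown in this thread.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Function;

/** Sketch: one shared query loop, parameterized by the query spec. All types here are assumptions. */
final class QueryRunner<T> {
  /** Minimal stand-in for SqlQuerySpec: only carries the query text. */
  static final class QuerySpec {
    final String text;

    QuerySpec(String text) {
      this.text = text;
    }
  }

  // Stand-in for the data-store call that turns a query into a stream of documents.
  private final Function<QuerySpec, Iterator<T>> executor;

  QueryRunner(Function<QuerySpec, Iterator<T>> executor) {
    this.executor = executor;
  }

  /** Runs any query and collects at most maxEntries results. */
  List<T> runQuery(QuerySpec spec, int maxEntries) {
    List<T> results = new ArrayList<>();
    Iterator<T> it = executor.apply(spec);
    while (results.size() < maxEntries && it.hasNext()) {
      results.add(it.next());
    }
    return results;
  }
}
```

Under this arrangement, the dead-blob lookup and the container-blob lookup would each build their own spec and delegate to `runQuery`, so feed options, metrics, and error handling would only need to be written once.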

Contributor

Also, since we aren't time bucketing the queries like dead blob compaction does, we could run into the same throttling issues that prompted the use of time bucketing there.

Collaborator Author

This is true. I plan to post a final follow-up PR that adds two more optimizations:

  1. Do the container compaction in multiple threads.
  2. Do a time based bucketing for deletion.
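
To illustrate the time-bucketing idea in the second item, here is a minimal sketch that splits a time range into fixed-size windows; each window would then become a range predicate on some timestamp field in a separate, smaller query. Which field is bucketed and how large the windows are is not stated in this thread, so `TimeBuckets` and its parameters are assumptions.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: split [startMs, endMs) into consecutive windows of at most bucketMs each. */
final class TimeBuckets {
  /** Half-open time window [startMs, endMs). */
  static final class Window {
    final long startMs;
    final long endMs;

    Window(long startMs, long endMs) {
      this.startMs = startMs;
      this.endMs = endMs;
    }
  }

  static List<Window> split(long startMs, long endMs, long bucketMs) {
    if (bucketMs <= 0) {
      throw new IllegalArgumentException("bucketMs must be positive");
    }
    List<Window> windows = new ArrayList<>();
    long from = startMs;
    while (from < endMs) {
      // The last window is clipped so it never runs past endMs.
      long to = (endMs - from > bucketMs) ? from + bucketMs : endMs;
      windows.add(new Window(from, to));
      from = to;
    }
    return windows;
  }
}
```

Issuing one bounded query per window keeps each request small, which is the property that helped with throttling in dead blob compaction according to the review comment above. The windows are also independent of each other, so they could be handed to separate worker threads.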

// Read the existing record
String id = CosmosContainerDeletionEntry.generateContainerDeletionEntryId(accountId, containerId);
String docLink = getContainerDeletionEntryDocumentLink(id);
RequestOptions options = getRequestOptions(id);
Contributor

A lot of this logic is very similar to updateMetadata(). Possible to reuse/combine?

Collaborator Author

Will add this as an action item for a follow-up PR.

ambry-cloud/src/test/resources/azure-test.properties (outdated, resolved)
Contributor

@SophieGuo410 left a comment

Overall lgtm. Added some minor comments.

* Compact blobs of the deprecated container from cloud. This method is one of the two entry points in the
* {@link AzureContainerCompactor} class along with {@link AzureContainerCompactor#deprecateContainers(Collection, Collection)}.
* Note that this method is not thread safe as it is expected to run in a single thread.
*/
Contributor

minor: missing Javadoc for @param assignedPartitions

Collaborator Author

done.

@@ -13,9 +13,12 @@
*/
package com.github.ambry.cloud.azure;

import com.codahale.metrics.Timer;
Contributor

minor: you can remove these since they are not used.

Collaborator Author

done.

static final String DELETE_PENDING_PARTITIONS_KEY = "deletePendingPartitions";
private static final String ID_KEY = "id";
Contributor

Not used?

Collaborator Author

removed.

containerDeletionEntrySet.remove(containerDeletionEntry);
for (String partitionId : containerDeletionEntry.getDeletePendingPartitions()) {
try {
int blobCompactedCount =
Contributor

It seems blobCompactedCount is never used?

Collaborator Author

I will use it in a future PR, when I plan to add more metrics to the code.

List<? extends PartitionId> assignedPartitions) throws CloudStorageException {
Set<CosmosContainerDeletionEntry> containerDeletionEntrySet =
requestAgent.doWithRetries(() -> cosmosDataAccessor.getDeprecatedContainers(containerDeletionQueryBatchSize),
"GetDeprectedContainers", null);
Contributor

minor: GetDeprectedContainers -> GetDeprecatedContainers

Collaborator Author

done.

private static final String EXPIRED_BLOBS_QUERY = constructDeadBlobsQuery(CloudBlobMetadata.FIELD_EXPIRATION_TIME);
private static final String DELETED_BLOBS_QUERY = constructDeadBlobsQuery(CloudBlobMetadata.FIELD_DELETION_TIME);
private static final String CONTAINER_BLOBS_QUERY =
- "SELECT TOP %d * FROM c WHERE c.accountId=%d and c.containerId=%d";
+ "SELECT TOP %d " + LIMIT_PARAM + " FROM c WHERE c.accountId=" + ACCOUNT_ID_PARAM + " and c.containerId="
Contributor

I think Cosmos will reject this. "%d" needs to be removed, and you need the "*" before FROM.

Collaborator Author

fixed.
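
For reference, a sketch of the query string with both points from the comment above applied: the leftover `%d` is dropped so `TOP` takes the named parameter directly, and the `*` projection sits before `FROM`. The constant values match the ones quoted elsewhere in this review. The wrapper class name is made up, and binding the actual values is assumed to happen through the SDK's query-parameter mechanism rather than `String.format`.

```java
/** Sketch: parameterized Cosmos SQL text; the class name is hypothetical. */
final class ContainerBlobsQuery {
  static final String LIMIT_PARAM = "@limit";
  static final String ACCOUNT_ID_PARAM = "@accountId";
  static final String CONTAINER_ID_PARAM = "@containerId";

  // No "%d" placeholders: every variable part is a named parameter,
  // and "*" appears between the TOP clause and FROM.
  static final String CONTAINER_BLOBS_QUERY =
      "SELECT TOP " + LIMIT_PARAM + " * FROM c WHERE c.accountId=" + ACCOUNT_ID_PARAM
          + " and c.containerId=" + CONTAINER_ID_PARAM;

  private ContainerBlobsQuery() {
  }
}
```

Because the text no longer contains format placeholders, a caller that still passes it through `String.format` with arguments would be doing redundant work, so call sites would likely need to switch to parameter binding at the same time.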

private static final String EXPIRED_BLOBS_QUERY = constructDeadBlobsQuery(CloudBlobMetadata.FIELD_EXPIRATION_TIME);
private static final String DELETED_BLOBS_QUERY = constructDeadBlobsQuery(CloudBlobMetadata.FIELD_DELETION_TIME);
private static final String CONTAINER_BLOBS_QUERY =
- "SELECT TOP %d * FROM c WHERE c.accountId=%d and c.containerId=%d";
+ "SELECT TOP %d " + LIMIT_PARAM + " FROM c WHERE c.accountId=" + ACCOUNT_ID_PARAM + " and c.containerId="
+ + CONTAINER_ID_PARAM;
private static final String BULK_DELETE_QUERY = "SELECT c._self FROM c WHERE c.id IN (%s)";
private static final String DEPRECATED_CONTAINERS_QUERY =
"SELECT TOP %d * from c WHERE c.deleted=false order by c.deleteTriggerTimestamp";
Contributor

Can you change this to LIMIT_PARAM too? I missed it earlier.

Collaborator Author

fixed.

Contributor

@lightningrob left a comment

Fix the Cosmos queries.

@@ -72,15 +72,15 @@
private static final String LIMIT_PARAM = "@limit";
private static final String ACCOUNT_ID_PARAM = "@accountId";
private static final String CONTAINER_ID_PARAM = "@containerId";
private static final String MAX_ENTRIES_PARAM = "@maxEntries";
Contributor

What's the difference between LIMIT and MAX_ENTRIES params?

Collaborator Author

Removed MAX_ENTRIES.

Contributor

It doesn't look like it.

Collaborator Author

@ankagrawal left a comment

Addressed all review comments.

Contributor

@lightningrob left a comment

LGTM, thanks for making the changes.

Contributor

@SophieGuo410 left a comment

LGTM!

@lightningrob lightningrob merged commit ba801c3 into linkedin:master Nov 24, 2020
SophieGuo410 added a commit to SophieGuo410/ambry that referenced this pull request Dec 3, 2020
…din#1646)

* Initial implementation of Helix task to sync deleted containers between cloud and helix account service.
6 participants