
accelerate clustering with sparse-dense vector and parallel sorting #183

Merged 1 commit into amzn:mainline on Nov 14, 2022

Conversation

yaushian
Contributor

Issue #, if available:

Description of changes: The current PECOS hierarchical clustering implementation stores cluster centers as dense vectors, which costs O(2^(L-1) · d) at layer L, where d is the total feature dimension and can be in the millions or more. On extremely large datasets, L is large at the bottom layers while the center vectors are sparse, which makes operations on dense vectors inefficient. Empirically, we find this makes the bottom layers more than 10x slower than the top layers on large datasets. We instead store centers as sparse-dense vectors (sdvec), whose time complexity is O(2^L · p), where p is the average number of non-zero elements of the sparse center vectors and is significantly smaller than d at the bottom layers.
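For illustration, here is a minimal sketch of the sparse-dense vector idea: dense value storage plus a list of "touched" indices, so iterating the nonzeros costs O(p) instead of O(d). The member names and layout below are assumptions for exposition, not PECOS's actual implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Sketch of a sparse-dense vector ("sdvec"): a dense value array plus a list
// of touched indices. Dense storage gives O(1) random access for dot products
// against sparse vectors; the touched list makes iteration O(p), not O(d).
// Names (sdvec, touched) are illustrative; PECOS's actual layout may differ.
struct sdvec {
    std::vector<float> val;         // dense storage, length d
    std::vector<uint32_t> touched;  // indices currently holding nonzero mass
    std::vector<bool> is_touched;   // O(1) membership test

    explicit sdvec(std::size_t d) : val(d, 0.0f), is_touched(d, false) {}

    // axpy-style update of one coordinate, recording first-time touches
    void add(uint32_t idx, float v) {
        if (!is_touched[idx]) {
            is_touched[idx] = true;
            touched.push_back(idx);
        }
        val[idx] += v;
    }

    // Dot product with a sparse vector given as (index, value) pairs:
    // O(x.nnz), because the dense val array gives O(1) random access.
    float dot(const std::vector<std::pair<uint32_t, float>>& x) const {
        float ret = 0.0f;
        for (const auto& p : x) ret += p.second * val[p.first];
        return ret;
    }
};
```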

With sdvec, the computational bottleneck shifts from the bottom layers to the top layer, since the top layer performs sorting over the whole dataset. We further accelerate clustering training via parallel sorting.
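A toy two-thread version of the parallel-sorting idea (not the routine used in this PR): sort the halves concurrently, then merge in place. Production implementations recurse over many threads, but the source of the speedup is the same.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Illustrative two-way parallel sort: the two halves are sorted on separate
// threads (disjoint ranges, so no synchronization is needed), then merged.
void parallel_sort2(std::vector<int>& a) {
    const std::size_t mid = a.size() / 2;
    std::thread t([&a, mid] { std::sort(a.begin(), a.begin() + mid); });
    std::sort(a.begin() + mid, a.end());  // this thread sorts the upper half
    t.join();
    std::inplace_merge(a.begin(), a.begin() + mid, a.end());
}
```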

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@OctoberChang OctoberChang changed the title sparse-dense vector and parallel sorting for acceleration accelerate clustering with sparse-dense vector and parallel sorting Nov 11, 2022
@yaushian yaushian force-pushed the clustering_acceleration branch 2 times, most recently from 7e9183d to b1fe2c8 Compare November 14, 2022 20:20
- std::vector<f32_dvec_t> center1; // need to be duplicated to handle parallel clustering
- std::vector<f32_dvec_t> center2; // for spherical kmeans
+ std::vector<f32_sdvec_t> center1; // need to be duplicated to handle parallel clustering
+ std::vector<f32_sdvec_t> center2; // for spherical kmeans
Contributor

remove extra whitespace

}

for(int thread_id = 0; thread_id < threads; thread_id++) {
do_axpy(1.0,center_tmp_thread[thread_id],cur_center);
Contributor

add white space between arguments, e.g., do_axpy(1.0, XXX, YYY).
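For context, the per-thread accumulation pattern under review can be sketched as follows. The do_axpy below is a simplified stand-in mimicking the PECOS helper's shape (y += alpha * x on dense float vectors), written with the argument spacing the reviewer asks for.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Simplified stand-in for the merge step above: each thread accumulates
// partial centers into its own copy, and the copies are then reduced with
// an axpy (y += alpha * x). Not the actual PECOS do_axpy implementation.
static void do_axpy(float alpha, const std::vector<float>& x, std::vector<float>& y) {
    for (std::size_t i = 0; i < y.size(); i++) y[i] += alpha * x[i];
}

// Reduce per-thread center copies into a single center vector.
static std::vector<float> reduce_centers(const std::vector<std::vector<float>>& center_tmp_thread) {
    std::vector<float> cur_center(center_tmp_thread[0].size(), 0.0f);
    for (std::size_t thread_id = 0; thread_id < center_tmp_thread.size(); thread_id++) {
        do_axpy(1.0f, center_tmp_thread[thread_id], cur_center);
    }
    return cur_center;
}
```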

@@ -83,7 +141,7 @@ namespace pecos {
index_type* idx;
value_type* val;
sparse_vec_t(index_type nnz=0, index_type* idx=NULL, value_type* val=NULL): nnz(nnz), idx(idx), val(val) {}

Contributor

remove extra white space?

float32_t do_dot_product(const sparse_vec_t<IX, VX>& x, const sdvec_t<IY, VY>& y) {
float32_t ret = 0;
for(size_t s=0; s < x.nnz; s++) {
auto idx = x.idx[s];
Contributor

In other similar use cases (do_axpy), you are using auto&. Why the disparity?

return do_dot_product(y, x);
}
float32_t ret = 0;
for(size_t s=0; s < x.nr_touch; s++) {
Contributor

Consider using the same style of for loop, for (size_t s = 0; s < ...; s++), in all the functions for consistency.

@OctoberChang OctoberChang merged commit e4f61bf into amzn:mainline Nov 14, 2022