accelerate clustering with sparse-dense vector and parallel sorting #183
7e9183d to b1fe2c8
pecos/core/utils/clustering.hpp
Outdated
std::vector<f32_dvec_t> center1; // need to be duplicated to handle parallel clustering
std::vector<f32_dvec_t> center2; // for spherical kmeans
std::vector<f32_sdvec_t> center1; // need to be duplicated to handle parallel clustering
std::vector<f32_sdvec_t> center2; // for spherical kmeans
remove extra whitespace
pecos/core/utils/clustering.hpp
Outdated
}

for(int thread_id = 0; thread_id < threads; thread_id++) {
do_axpy(1.0,center_tmp_thread[thread_id],cur_center);
Add white space between arguments, e.g., do_axpy(1.0, XXX, YYY).
pecos/core/utils/matrix.hpp
Outdated
@@ -83,7 +141,7 @@ namespace pecos {
index_type* idx;
value_type* val;
sparse_vec_t(index_type nnz=0, index_type* idx=NULL, value_type* val=NULL): nnz(nnz), idx(idx), val(val) {}
remove extra white space?
pecos/core/utils/matrix.hpp
Outdated
float32_t do_dot_product(const sparse_vec_t<IX, VX>& x, const sdvec_t<IY, VY>& y) {
float32_t ret = 0;
for(size_t s=0; s < x.nnz; s++) {
auto idx = x.idx[s];
In other similar use cases (do_axpy), you are using auto &. Why such disparity?
pecos/core/utils/matrix.hpp
Outdated
return do_dot_product(y, x);
}
float32_t ret = 0;
for(size_t s=0; s < x.nr_touch; s++) {
Consider using the same style of for loop, for (size_t s = 0; s < ...; s++), in all the functions for consistency.
b1fe2c8 to 140dac0
140dac0 to 6b1d16f
Issue #, if available:
Description of changes: The current PECOS hierarchical clustering implementation stores cluster centers as dense vectors, which leads to O(2^(L-1) * d) cost for layer L, where d is the total feature dimension and can be in the millions or more. On extremely large datasets, L is large at the bottom layers while the center vectors are sparse, which makes operations on dense vectors inefficient. Empirically, we find this makes bottom layers more than 10x slower than top layers on large datasets. We use sparse-dense vectors (sdvec) as an alternative way to store centers; the time complexity becomes O(2^L * p), where p is the average number of non-zero elements in the sparse center vectors and is significantly smaller than d at bottom layers.
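For readers unfamiliar with the idea, here is a minimal sketch of a sparse-dense vector. It is an illustration only and not the PECOS implementation: the names `SparseDenseVec`, `add_entry`, `axpy`, and `dot` are hypothetical, and the layout (a dense value array plus a list of touched indices) is an assumption about how such a structure can be built. The point it shows is that updates, dot products, and resets cost time proportional to the number of touched entries rather than to d.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sparse-dense vector (illustrative, not PECOS's sdvec_t).
// Dense storage gives O(1) random access; the touched-index list lets us
// iterate over and reset only the entries that were actually written.
struct SparseDenseVec {
    std::vector<float> val;             // dense values, length d
    std::vector<uint8_t> touched;       // 1 if the slot has been written
    std::vector<uint32_t> touched_idx;  // indices written so far

    explicit SparseDenseVec(size_t d) : val(d, 0.0f), touched(d, 0) {}

    size_t nr_touch() const { return touched_idx.size(); }

    // Return a writable reference to entry i, recording it on first touch.
    float& add_entry(uint32_t i) {
        if (!touched[i]) {
            touched[i] = 1;
            touched_idx.push_back(i);
        }
        return val[i];
    }

    // Reset only the touched entries: O(nr_touch) instead of O(d).
    void clear() {
        for (uint32_t i : touched_idx) {
            val[i] = 0.0f;
            touched[i] = 0;
        }
        touched_idx.clear();
    }
};

// y += alpha * x, where sparse x is given as parallel (idx, xv) arrays.
void axpy(float alpha, const std::vector<uint32_t>& idx,
          const std::vector<float>& xv, SparseDenseVec& y) {
    for (size_t s = 0; s < idx.size(); s++) {
        y.add_entry(idx[s]) += alpha * xv[s];
    }
}

// Dot product of sparse x with y; untouched slots of y are zero,
// so reading the dense array directly is safe.
float dot(const std::vector<uint32_t>& idx, const std::vector<float>& xv,
          const SparseDenseVec& y) {
    float ret = 0.0f;
    for (size_t s = 0; s < idx.size(); s++) {
        ret += xv[s] * y.val[idx[s]];
    }
    return ret;
}
```

The trade-off in this sketch is memory: each center still allocates d slots, but the work per update scales with the number of non-zeros, which is what matters when many sparse centers are accumulated at the bottom layers.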
With sdvec, the computational bottleneck switches from bottom layers to the top layer since the top layer performs sorting on the whole dataset. We further accelerate clustering training via parallel sorting.
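One common way to parallelize a sort is sketched below: split the array into blocks, sort the blocks in parallel, then merge neighbouring blocks pairwise. This is an assumption about the general technique, not the code in this PR; `chunked_parallel_sort` and its parameters are hypothetical names. The OpenMP pragmas are optional, so without OpenMP enabled the function still sorts correctly, just serially.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the PR): sort `v` by sorting `num_chunks`
// blocks independently, then merging neighbouring sorted blocks pairwise.
template <typename T, typename Cmp>
void chunked_parallel_sort(std::vector<T>& v, int num_chunks, Cmp cmp) {
    const size_t n = v.size();
    if (num_chunks < 1) num_chunks = 1;

    // bound[c] .. bound[c + 1] is the half-open range of block c.
    std::vector<size_t> bound(num_chunks + 1);
    for (int c = 0; c <= num_chunks; c++) {
        bound[c] = n * c / num_chunks;
    }

    // Phase 1: sort every block independently.
    #pragma omp parallel for schedule(dynamic)
    for (int c = 0; c < num_chunks; c++) {
        std::sort(v.begin() + bound[c], v.begin() + bound[c + 1], cmp);
    }

    // Phase 2: bottom-up merge; the sorted run width doubles each round.
    for (int width = 1; width < num_chunks; width *= 2) {
        #pragma omp parallel for schedule(dynamic)
        for (int c = 0; c < num_chunks - width; c += 2 * width) {
            const int mid = c + width;
            const int hi = std::min(c + 2 * width, num_chunks);
            std::inplace_merge(v.begin() + bound[c], v.begin() + bound[mid],
                               v.begin() + bound[hi], cmp);
        }
    }
}
```

In the clustering setting the elements would typically be data-point indices ordered by a score, with `cmp` comparing the scores. Note that the later merge rounds have fewer independent merges, so the speedup of this simple scheme is sub-linear in the number of threads.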
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.