Update README.rst
Doc on return values
matteodellamico authored May 24, 2024
1 parent 96a2c3f commit c22a0a1
Showing 1 changed file with 35 additions and 5 deletions.
README.rst: 40 changes (35 additions, 5 deletions)
@@ -25,15 +25,14 @@ Installation
python3 setup.py install

A project allowing scalable hierarchical clustering, thanks to an
-approximated version of OPTICS, on arbitrary data and distance measures.
+approximated version of HDBSCAN, on arbitrary data and distance measures.

Quickstart
----------

-Look at the HDBSCAN documentation for the meaning of the return values
-of the `cluster` method. There are plenty of configuration options,
-inherited from HNSWs and HDBSCAN, but the only compulsory argument is a
-dissimilarity function between arbitrary data elements::
+There are plenty of configuration options, inherited from HNSWs and HDBSCAN,
+but the only compulsory argument is a dissimilarity function between arbitrary
+data elements::

import flexible_clustering
@@ -50,6 +49,37 @@ dissimilarity function between arbitrary data elements::
Make sure to run everything from *outside* the source directory, to
avoid confusing Python's import path.
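
To illustrate what a dissimilarity function over non-vector data can look like, here is a minimal sketch of a Jaccard distance between sets. It assumes the function is simply a callable that takes two elements and returns a non-negative number. The name `jaccard_distance` and the sample data are hypothetical and not part of this project.

```python
def jaccard_distance(a, b):
    """Dissimilarity between two collections, treated as sets.

    Returns 1 - |a & b| / |a | b|, a value between 0.0 (identical sets)
    and 1.0 (no element in common).
    """
    a, b = set(a), set(b)
    union = a | b
    if not union:
        # Two empty sets are treated as identical.
        return 0.0
    return 1.0 - len(a & b) / len(union)


# Hypothetical data: each element is a set of tokens, not a vector.
docs = [{"red", "apple"}, {"green", "apple"}, {"blue", "car"}]

print(jaccard_distance(docs[0], docs[1]))  # one of three tokens is shared
print(jaccard_distance(docs[0], docs[2]))  # nothing is shared
```

Any other function with the same shape, such as an edit distance between strings, could be substituted.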

Return Values
-------------

From the `HDBSCAN source code <https://hdbscan.readthedocs.io/en/latest/_modules/hdbscan/hdbscan_.html>`_:

labels : ndarray, shape (n_samples, )
    Cluster labels for each point. Noisy samples are given the label -1.

probabilities : ndarray, shape (n_samples, )
    Cluster membership strengths for each point. Noisy samples are assigned
    0.

cluster_persistence : array, shape (n_clusters, )
    A score of how persistent each cluster is. A score of 1.0 represents
    a perfectly stable cluster that persists over all distance scales,
    while a score of 0.0 represents a perfectly ephemeral cluster. These
    scores can be used to gauge the relative coherence of the clusters
    output by the algorithm.

condensed_tree : record array
    The condensed cluster hierarchy used to generate clusters.

single_linkage_tree : ndarray, shape (n_samples - 1, 4)
    The single linkage tree produced during clustering in scipy
    hierarchical clustering format
    (see http://docs.scipy.org/doc/scipy/reference/cluster.hierarchy.html).

min_spanning_tree : ndarray, shape (n_samples - 1, 3)
    The minimum spanning tree as an edge list. If gen_min_span_tree was
    False this will be None.
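
The first two return values described above can be read together, as in the following minimal sketch. It assumes `labels` and `probabilities` have already been obtained from a clustering run. The helper `summarize`, its `min_strength` threshold, and the sample values are hypothetical and not part of this project or of HDBSCAN. The helper accepts any sequence, so NumPy arrays and plain lists both work.

```python
from collections import Counter


def summarize(labels, probabilities, min_strength=0.5):
    """Summarize a clustering result.

    labels: one cluster label per point, with -1 marking noise.
    probabilities: one membership strength per point (0 for noise).
    min_strength: hypothetical cut-off below which a clustered point
        is reported as a weak member.
    """
    labels = [int(label) for label in labels]
    probabilities = [float(p) for p in probabilities]

    # Number of points per cluster, ignoring noise.
    sizes = Counter(label for label in labels if label != -1)
    # Indices of points labelled as noise.
    noise = [i for i, label in enumerate(labels) if label == -1]
    # Indices of clustered points with low membership strength.
    weak = [
        i
        for i, (label, p) in enumerate(zip(labels, probabilities))
        if label != -1 and p < min_strength
    ]
    return {"cluster_sizes": dict(sizes), "noise": noise, "weak_members": weak}


# Made-up values for six points, for illustration only.
labels = [0, 0, 1, -1, 1, 0]
probabilities = [1.0, 0.8, 0.9, 0.0, 0.3, 0.6]
summary = summarize(labels, probabilities)
print(summary)
```

Points listed under `weak_members` belong to a cluster but sit near its edge, so they may deserve a second look before being treated as firm assignments.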

Demo/Example
------------
