boruvka joblib error #22

@eyaler

hdbscan 0.6.5, sklearn 0.17.0
When calling HDBSCAN.fit() with algorithm=boruvka_kdtree or boruvka_balltree, I sometimes get the following error. It works fine with algorithm=prims_kdtree or prims_balltree.

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "c:\python2764\Lib\multiprocessing\forking.py", line 380, in main
    prepare(preparation_data)
  File "c:\python2764\Lib\multiprocessing\forking.py", line 495, in prepare
    '__parents_main__', file, path_name, etc
  ...( references to my code calling HDBSCAN.fit() )...
  File "C:\Users\eyalg\virtualenv\future64\lib\site-packages\hdbscan\hdbscan_.py", line 531, in fit
    self.min_spanning_tree) = hdbscan(X, **self.get_params())
  File "C:\Users\eyalg\virtualenv\future64\lib\site-packages\hdbscan\hdbscan_.py", line 363, in hdbscan
    gen_min_span_tree)
  File "C:\Users\eyalg\virtualenv\future64\lib\site-packages\sklearn\externals\joblib\memory.py", line 283, in __call__
    return self.func(*args, **kwargs)
  File "C:\Users\eyalg\virtualenv\future64\lib\site-packages\hdbscan\hdbscan_.py", line 163, in _hdbscan_boruvka_kdtree
    alg = KDTreeBoruvkaAlgorithm(tree, min_samples, metric=metric, leaf_size=leaf_size // 3)
  File "hdbscan/_hdbscan_boruvka.pyx", line 335, in hdbscan._hdbscan_boruvka.KDTreeBoruvkaAlgorithm.__init__ (hdbscan_hdbscan_boruvka.c:4746)
  File "hdbscan/_hdbscan_boruvka.pyx", line 364, in hdbscan._hdbscan_boruvka.KDTreeBoruvkaAlgorithm._compute_bounds (hdbscan_hdbscan_boruvka.c:5401)
  File "C:\Users\eyalg\virtualenv\future64\lib\site-packages\sklearn\externals\joblib\parallel.py", line 771, in __call__
    n_jobs = self._initialize_pool()
  File "C:\Users\eyalg\virtualenv\future64\lib\site-packages\sklearn\externals\joblib\parallel.py", line 518, in _initialize_pool
    raise ImportError('[joblib] Attempting to do parallel computing '
ImportError: [joblib] Attempting to do parallel computing without protecting your import on a system that does not support forking. To use parallel-computing in a script, you must protect your main loop using "if __name__ == '__main__'". Please see the joblib documentation on Parallel for more information
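
For reference, here is a minimal sketch of the guard the error message asks for. It is an illustration only, not the code from this report: it uses the standard-library multiprocessing module as a stand-in for the worker pool that joblib starts, and the names square and main are hypothetical. In a real script, the data loading and the HDBSCAN.fit() call would presumably sit inside main().

```python
import multiprocessing


def square(x):
    # Hypothetical stand-in for the work dispatched to worker processes.
    return x * x


def main():
    # Assumption: in the real script, loading the data and calling
    # HDBSCAN.fit() with a boruvka algorithm would go here instead.
    pool = multiprocessing.Pool(processes=2)
    try:
        return pool.map(square, [1, 2, 3])
    finally:
        pool.close()
        pool.join()


if __name__ == '__main__':
    # On platforms without fork (e.g. Windows), each worker re-imports this
    # module. The guard keeps that re-import from starting the pool again.
    multiprocessing.freeze_support()
    print(main())
```

The point of the layout is that nothing at module level starts parallel work; everything that does is reachable only through the guarded block at the bottom.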
