This folder contains code to reproduce the key experiments in "Label Disentanglement in Partition-based Extreme Multilabel Classification" (NeurIPS 2021).
- Clone the repository and enter the `examples/overlap-xmc` directory.
- Create a virtual environment (a minimal sketch is given below) and install the dependencies by running the following commands:

```bash
pip install numba==0.52.0
pip install scipy==1.4.1
```

If you're unfamiliar with Python virtual environments, check out the user guide.
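If it helps, here is a minimal sketch of the environment setup using Python's built-in `venv` module; the environment name `overlap-xmc-env` is only a placeholder and not part of the released scripts:

```bash
# Create and activate an isolated environment (the name is arbitrary)
python3 -m venv overlap-xmc-env
source overlap-xmc-env/bin/activate

# Install the pinned dependencies inside the environment
pip install numba==0.52.0 scipy==1.4.1
```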
After installing, create the following folders:

```bash
mkdir dataset/ dataset-binned/ model/
```
The XMC datasets can be downloaded as follows:

```bash
# Available datasets: eurlex-4k, wiki10-31k, amazoncat-13k, amazon-670k, wiki-500k, amazon-3m
DATASET="wiki10-31k"
wget https://archive.org/download/pecos-dataset/xmc-base/${DATASET}.tar.gz
tar -zxvf ./${DATASET}.tar.gz
```
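If you plan to reproduce results on every benchmark, the same download commands can be looped over all dataset names listed above; this is only a convenience sketch, and the full set of archives needs a substantial amount of disk space:

```bash
# Download and extract every benchmark listed above
for DATASET in eurlex-4k wiki10-31k amazoncat-13k amazon-670k wiki-500k amazon-3m; do
    wget https://archive.org/download/pecos-dataset/xmc-base/${DATASET}.tar.gz
    tar -zxvf ./${DATASET}.tar.gz
done
```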
Then, move the extracted data into the `./dataset` folder:

```bash
cp -rf ./xmc-base ./dataset && rm -rf ./xmc-base
```
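As a quick sanity check, you can list the copied files; based on the copy command above, the data should now sit under `./dataset/xmc-base/<dataset>` (an assumption about where the scripts expect it):

```bash
# The per-dataset files should now be under ./dataset/xmc-base/${DATASET}
ls -lh ./dataset/xmc-base/${DATASET}/
```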
To train and evaluate our label disentanglement model, run:

```bash
# Reproduce ours + XR-Linear:
bash run_base.sh ${DATASET}
# Reproduce Figure 2:
bash run_binned.sh ${DATASET}
# Reproduce VI (launch only after `run_binned.sh` has completed):
bash run_metric.sh
```
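To run the main pipeline over all benchmarks in one go, the per-dataset command can simply be looped; this is a convenience sketch, and training on the larger datasets is compute-intensive (see the recommended platform below):

```bash
# Reproduce ours + XR-Linear on every downloaded benchmark
for DATASET in eurlex-4k wiki10-31k amazoncat-13k amazon-670k wiki-500k amazon-3m; do
    bash run_base.sh ${DATASET}
done
```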
Recommended platform for training: AWS p3.16xlarge instance or equivalent.
- Be careful about the scipy and numba versions (they need to be 1.4.1 and 0.52.0, respectively); a quick check is sketched after this list.
- Reproducing our method + X-Transformer results requires the extracted features, which we still need to upload; they are currently missing.
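To confirm that the pinned versions are the ones actually active in your environment, a one-line check such as the following can help (just a sketch, not part of the release):

```bash
# Print the installed numba and scipy versions; they should read 0.52.0 and 1.4.1
python -c "import numba, scipy; print('numba', numba.__version__, '| scipy', scipy.__version__)"
```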
Please consider citing this paper if you find our work helpful:
```bibtex
@inproceedings{liu2021label,
  title={Label disentanglement in partition-based extreme multilabel classification},
  author={Liu, Xuanqing and Chang, Wei-Cheng and Yu, Hsiang-Fu and Hsieh, Cho-Jui and Dhillon, Inderjit S},
  booktitle={Thirty-Fifth Conference on Neural Information Processing Systems},
  year={2021}
}
```