Computational platform: PyTorch 1.4.0, NVIDIA GeForce RTX 3090 (GPU), Intel i9-10900X (CPU), CUDA Toolkit 10.0
Development language: Python 3.6/C++
The required libraries are listed below and can be installed with the command pip install -r requirements.txt.
numpy, scipy, tqdm, scikit-learn, sentencepiece=0.1.91, transformers, tensorboardX, nltk, os, sys, collections, itertools, argparse, subprocess, pickle, cudatoolkit=10.0, pytorch==1.4.0
We provide all the data sets (profession data set, hobby data set, and 20News data set) in the folder data/datasets/.
Profession data set (obtained from the authors of [2])
attribute values: 71; user utterances: 5747
used by the previous work: CHARM, DSCGN
Hobby data set (obtained from the authors of [2])
attribute values: 149; user utterances: 5787
used by the previous work: CHARM, DSCGN
Note that we follow the same task setting as previous personal attribute prediction papers [2-4]: attribute values are NOT explicitly mentioned in the utterances, and the given candidate attribute values are ranked based on the underlying semantics of the utterances.
20News data set (obtained from [1])
classes: 5; documents: 17871
used by the previous work: X-Class
Note that PEARL is tested on the weakly supervised text classification task to verify its universality, flexibility and effectiveness.
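To make the personal attribute prediction setting above concrete, the following is a minimal sketch of what "ranking candidate attribute values" means. It is not PEARL's method: the cosine-similarity scoring, the function name rank_candidates, and the toy vectors are all assumptions made purely for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_candidates(utterance_vec, candidates):
    """Return candidate attribute values sorted by similarity, best first.

    `candidates` maps an attribute value to its vector. Both the vectors and
    the scoring function are hypothetical stand-ins, not PEARL's representations.
    """
    scored = [(name, cosine(utterance_vec, vec)) for name, vec in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy vectors invented for this example only.
candidates = {
    "teacher": [1.0, 0.0, 0.0],
    "chef": [0.0, 1.0, 0.0],
    "nurse": [0.0, 0.0, 1.0],
}
utterance_vec = [0.2, 0.9, 0.1]

ranking = rank_candidates(utterance_vec, candidates)
for name, score in ranking:
    print(name, round(score, 3))
```

The point of the sketch is only that the output is an ordering over all candidate values, so a prediction does not depend on the attribute value appearing verbatim in the utterance.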
CUDA_VISIBLE_DEVICES=[gpu_id] python static_representations.py --dataset_name profession
CUDA_VISIBLE_DEVICES=[gpu_id] python utterance_word_representations.py --dataset_name profession
Similarly, the hobby (resp. 20News) data set can be preprocessed by replacing "profession" with "hobby" (resp. "20News").
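The preprocessing commands above can also be driven from a small Python helper that loops over the data sets. This is a sketch under assumptions: the helper names (build_preprocess_commands, preprocess) and the dry_run flag are hypothetical and not part of the repository; only the two script names and the --dataset_name argument come from the commands above.

```python
import os
import subprocess

# Script names and the --dataset_name flag are taken from the commands above.
PREPROCESS_SCRIPTS = [
    "static_representations.py",
    "utterance_word_representations.py",
]
DATASETS = ["profession", "hobby", "20News"]


def build_preprocess_commands(dataset_name):
    """Build the two preprocessing commands for one data set (hypothetical helper)."""
    return [
        ["python", script, "--dataset_name", dataset_name]
        for script in PREPROCESS_SCRIPTS
    ]


def preprocess(dataset_name, gpu_id="0", dry_run=True):
    """Print the commands (dry_run=True) or run them on the chosen GPU."""
    # Setting CUDA_VISIBLE_DEVICES in the child environment mirrors the
    # "CUDA_VISIBLE_DEVICES=[gpu_id] python ..." prefix used on the shell.
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=str(gpu_id))
    commands = build_preprocess_commands(dataset_name)
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, env=env, check=True)
    return commands


if __name__ == "__main__":
    for name in DATASETS:
        preprocess(name, gpu_id="0", dry_run=True)
```

With dry_run=True the helper only prints the commands; pass dry_run=False from the repository root to execute them.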
python iterate_frame_profession.py
Similarly, PEARL can run on the hobby (resp. 20News) data set via the command "python iterate_frame_hobby.py" (resp. "python iterate_frame_20News.py").
[1] Lang K. NewsWeeder: Learning to filter netnews. Machine Learning Proceedings 1995, 331-339.
[2] Tigunova A, Yates A, Mirza P, et al. CHARM: Inferring personal attributes from conversations. EMNLP'20, 5391-5404.
[3] Liu Y, Chen H, Shen W. Personal Attribute Prediction from Conversations. WWW'2022, 223-227.
[4] Tigunova A, Yates A, Mirza P, et al. Listening between the lines: Learning personal attributes from conversations. WWW'2019, 1818-1828.