How to run HAABSADA, a Hybrid Approach for Aspect Based Sentiment Analysis with Data Augmentation

Data Augmentation extenstion on HAABSA. In order to use the code please follow the following steps:

Go to https://github.com/ofwallaart/HAABSA and follow the steps on the github page. There are some possible problems that you can come across when following his, which I hope to tackle as much as possible.

When running on windows, note that you need compatible c++ visual studio build tools (https://visualstudio.microsoft.com/downloads/).
If you want to run on GPU (NVIDIA) note that you need compatible drivers, Cuda (v9.0) and cuDNN (v6.4)
If you need nltk files --> in your environment run python afterwards import nltk afterwards ntlk.download(NLTK_FILE)
Make sure your environment is active (with Scripts/activate) --> you should see the name of your environment next to the command line

Clone this repository using git clone https://github.com/tomasLiesting/HAABSADA.git and, in your file explorer, overwrite the files in HAABSA with the files in this repo
In your environment, cd to HAABSA, and again run pip install -r requirements.txt
Make sure you have access to the google translate api, if you want to use backtranslation. You have to create an account and get 300€ for free. This is done as follows:

pip install --upgrade google-cloud-translate
Afterwards go to this page https://cloud.google.com/docs/authentication/getting-started#windows
Create a service account
Create a new project
Create a service account with as role project owner
Download the json authentication (and place it somewhere you like)
In the environment run $env:GOOGLE_APPLICATION_CREDENTIALS="[PATH]" (where PATH is the path to your json authentication, e.g. $env:GOOGLE_APPLICATION_CREDENTIALS="C:\Users\username\Downloads\my-key.json"

In the environment, run python main.py

How to modify parameters

In the file config.py, there are several Data Augmentation variables. These can be modified. To reproduce the 1% and 0.5% improvement on the 2015 and 2016 SemEval datasets, make sure that:

EDA_type = 'adjusted'
EDA_deletion = 0
EDA_replacement = 1
EDA_insertion = 1
EDA_swap = 1
EDA_pct = .2
backtranslation_langs = 'None'
use_word_mixup = 0
original_multiplier = 3

Furthermore, in main.py make sure that:

loadData = True
augment_data = True
useOntology = True
runLCRROTALT = True For the rest all parameters in main.py should be False.

Name		Name	Last commit message	Last commit date
Latest commit History 15 Commits
.idea		.idea
OntologyReasoner.py		OntologyReasoner.py
README.md		README.md
Thesis_HAABSADA_TomasLiesting453177.pdf		Thesis_HAABSADA_TomasLiesting453177.pdf
att_layer.py		att_layer.py
config.py		config.py
dataReader2016.py		dataReader2016.py
data_augmentation.py		data_augmentation.py
eda.py		eda.py
lcrModelAlt.py		lcrModelAlt.py
loadData.py		loadData.py
main.py		main.py
main_cross.py		main_cross.py
main_hyper.py		main_hyper.py
nn_layer.py		nn_layer.py
requirements.txt		requirements.txt
utils.py		utils.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

How to run HAABSADA, a Hybrid Approach for Aspect Based Sentiment Analysis with Data Augmentation

How to modify parameters

About

Releases

Packages

Languages

11novi/HAABSADA

Folders and files

Latest commit

History

Repository files navigation

How to run HAABSADA, a Hybrid Approach for Aspect Based Sentiment Analysis with Data Augmentation

How to modify parameters

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages