input shape issue with LSTM model #16

Open
alexv1247 opened this issue May 3, 2019 · 9 comments

@alexv1247

I want to use the promising alipy library to perform active learning on a time-series classifier built with Keras and TensorFlow. The model is an LSTM, which requires a 3D input shape.
Is there any way to use the alipy library in that case?
So far I have changed your 10-minute tutorial code so that it uses my own model. However, the code stops at the very top, when I initialize the ToolBox, because of the input shape of the X data.

@tangypnuaa
Collaborator

Hi, most of the existing AL strategies are designed for a 2D feature array, so you should extract a 2D feature matrix from your dataset for AL selection, e.g., the hidden state of the LSTM, or the output of the penultimate layer of a VGG.
In other words, you can use the 3D raw data to train your model and use the 2D feature matrix to select unlabeled data in AL.
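
A minimal sketch of that split, assuming a hypothetical encode callable that maps a batch of 3D sequences to one vector per sample. For a Keras LSTM this could be the predict method of a sub-model that ends at the last LSTM layer; that wiring is an assumption and not something ALiPy provides. The stand-in encoder below (mean over time) is for illustration only. The point is that the 2D matrix and the 3D array share row indices, so indices selected on one can be used to train on the other.

```python
import numpy as np


def build_selection_features(encode, x3d, batch_size=256):
    """Turn (n_samples, timesteps, channels) data into a 2D feature matrix.

    `encode` is a hypothetical callable returning one feature vector
    (for example the last LSTM hidden state) per input sequence.
    """
    x3d = np.asarray(x3d)
    chunks = []
    for start in range(0, len(x3d), batch_size):
        batch = x3d[start:start + batch_size]
        # Flatten whatever the encoder returns to one row per sample.
        chunks.append(np.asarray(encode(batch)).reshape(len(batch), -1))
    return np.concatenate(chunks, axis=0)


def mean_over_time(batch):
    """Stand-in encoder for illustration only: average over the time axis."""
    return batch.mean(axis=1)


x3d = np.random.default_rng(0).normal(size=(10, 5, 3))  # raw data, used for training
x2d = build_selection_features(mean_over_time, x3d, batch_size=4)  # used for AL selection

# Row i of x2d describes sequence i of x3d, so selected indices carry over.
selected = [2, 7]
train_batch = x3d[selected]
```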

@alexv1247
Author

Okay, understood.
I will give it a go and share my experience.

@alexv1247
Author

I got the 2D feature matrix out of the hidden states etc.
However, an error still occurs, even though I am just using the tutorial code.

classifier = KerasClassifier(lstm_model)

x_train, y_train, x_test, y_test = get_numpy_data()
feature_matrix = np.load(r'C:\Users\alexv\PycharmProjects\Active_Learning\Daten\npy_Daten  \feature_matrix_training.npy')

alibox = ToolBox(X=feature_matrix, y=y_train, query_type='AllLabels', saving_path='.')

# Split data
alibox.split_AL(test_ratio=0.3, initial_label_rate=0.1, split_count=10)

# load LSTM model
model = classifier

# The cost budget is 50 times querying
stopping_criterion = alibox.get_stopping_criterion('num_of_queries', 50)

# Use pre-defined strategy
QBCStrategy = alibox.get_query_strategy(strategy_name='QueryExpectedErrorReduction')
QBC_result = []

for round in range(10):
    # Get the data split of one fold experiment
    train_idx, test_idx, label_ind, unlab_ind = alibox.get_split(round)
    # Get intermediate results saver for one fold experiment
    saver = alibox.get_stateio(round)

    while not stopping_criterion.is_stop():
        # Select a subset of Uind according to the query strategy
        # Passing model=None to use the default model for evaluating the committees' disagreement
        select_ind = QBCStrategy.select(label_ind, unlab_ind, models=classifier)
        label_ind.update(select_ind)
        unlab_ind.difference_update(select_ind)

        # Update model and calc performance according to the model you are using
        model.fit(X=x_train[label_ind.index], y=y_train[label_ind.index])
        pred = model.predict(x_train[test_idx])
        accuracy = alibox.calc_performance_metric(y_true=y_train[test_idx],
                                              y_pred=pred,
                                              performance_metric='accuracy_score')

        # Save intermediate results to file
        st = alibox.State(select_index=select_ind, performance=accuracy)
        saver.add_state(st)
        saver.save()

        # Passing the current progress to stopping criterion object
        stopping_criterion.update_information(saver)
    # Reset the progress in stopping criterion object
    stopping_criterion.reset()
    QBC_result.append(copy.deepcopy(saver))

analyser = alibox.get_experiment_analyser(x_axis='num_of_queries')
analyser.add_method(method_name='QBC', method_results=QBC_result)
print(analyser)
analyser.plot_learning_curves(title='Example of AL', std_area=True)

The error is the following:
Traceback (most recent call last):
File "C:\ProgramData\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 3296, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "", line 1, in
runfile('C:/Users/alexv/PycharmProjects/Active_Learning/active_learning_types/standard_alipy.py', wdir='C:/Users/alexv/PycharmProjects/Active_Learning/active_learning_types')
File "C:\Program Files\JetBrains\PyCharm 2019.1.1\helpers\pydev_pydev_bundle\pydev_umd.py", line 197, in runfile
pydev_imports.execfile(filename, global_vars, local_vars) # execute the script
File "C:\Program Files\JetBrains\PyCharm 2019.1.1\helpers\pydev_pydev_imps_pydev_execfile.py", line 18, in execfile
exec(compile(contents+"\n", file, 'exec'), glob, loc)
File "C:/Users/alexv/PycharmProjects/Active_Learning/active_learning_types/standard_alipy.py", line 26, in
QBCStrategy = alibox.get_query_strategy(strategy_name='QueryExpectedErrorReduction')
File "C:\ProgramData\Anaconda3\lib\site-packages\alipy\toolbox.py", line 390, in get_query_strategy
strategy = eval(strategy_name + "(X=self._X, y=self._y, **kwargs)")
File "<string>", line 1, in <module>
File "C:\ProgramData\Anaconda3\lib\site-packages\alipy\query_strategy\query_labels.py", line 751, in __init__
super(QureyExpectedErrorReduction, self).__init__(X, y)
TypeError: super(type, obj): obj must be an instance or subtype of type

@tangypnuaa
Collaborator

Hi, sorry for the inconvenience.
This is a bug that has been fixed in the latest code.
Please upgrade ALiPy with pip install --upgrade alipy, or re-install it from the master branch.

By the way, there may be other issues in this line of your code: select_ind = QBCStrategy.select(label_ind, unlab_ind, models=classifier). The keyword argument is model, not models, and the EER method needs to re-train the classification model many times. So there are 2 constraints on the given model:
1. It is a sklearn model (or a model that implements the sklearn API).
2. It has the probabilistic output function predict_proba.

If you want to use the Keras classifier, please see issue 2. Or just use another strategy which does not need a prediction model (e.g., QUIRE).
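
As a rough illustration of the two constraints, here is a sketch of a thin adapter that gives a wrapped model the fit / predict_proba interface. The class name ProbaAdapter is hypothetical. It assumes the wrapped model has fit(X, y, **kwargs) and a predict(X) that already returns class probabilities, as a Keras model with a softmax or sigmoid output layer does. Whether such an adapter is sufficient for a particular strategy depends on how that strategy copies and retrains the model, which this sketch does not cover.

```python
import numpy as np


class ProbaAdapter:
    """Expose fit / predict / predict_proba on top of a wrapped model.

    Assumption: the wrapped model's predict(X) returns probabilities,
    either one column per class or a single sigmoid column.
    """

    def __init__(self, model, **fit_kwargs):
        self.model = model
        self.fit_kwargs = fit_kwargs  # e.g. epochs=5, verbose=0 for Keras

    def fit(self, X, y):
        self.model.fit(X, y, **self.fit_kwargs)
        return self

    def predict_proba(self, X):
        proba = np.asarray(self.model.predict(X), dtype=float)
        proba = proba.reshape(len(proba), -1)
        if proba.shape[1] == 1:
            # Single sigmoid unit: expand to the two-column form.
            proba = np.hstack([1.0 - proba, proba])
        return proba

    def predict(self, X):
        # Hard labels are the index of the most probable class.
        return self.predict_proba(X).argmax(axis=1)
```

The adapter passes X through untouched, so a 3D array reaches the wrapped model in its original shape.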

@alexv1247
Author

Hi, I am still wondering about the strategy you proposed for fixing the 3D issue.
Since my model always requires a 3D input shape, I don't really get how selecting the unlabeled data with a 2D feature matrix should work. I assume that it will always fail when it tries to select the new index, because it will try to calculate the prediction probability with a 2D feature matrix, which is not possible because the model is designed for a 3D input matrix.

From what I have understood so far, the query strategies depend heavily on the outcome of the predict_proba method? If that is the case, the strategies should work with a 2D or 3D feature matrix, because they use the model's own predict_proba method? So in conclusion, if the model accepts a 3D input feature matrix, the query strategy method should as well?

Please correct me if I am wrong, but I don't see the need for this 2D constraint.

@tangypnuaa
Collaborator

Hi, alexv1247!
That's a very good question.

Actually, there are two different settings for sample selection in active learning. One relies purely on an unsupervised approach, selecting samples based on the data structure of the unlabeled samples without any knowledge of the ground-truth labels (e.g., QUIRE, GraphDensity, Random); the other selects samples with the help of a supervised classifier initially trained on a small seed set of labeled samples.

For the 1st setting, QUIRE and GraphDensity need to calculate a kernel, which is defined on vectors.
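
To illustrate why a kernel needs one vector per sample, here is a sketch of a generic RBF kernel matrix computed from a 2D feature matrix. This is not ALiPy's code; the function name and the gamma value are illustrative assumptions.

```python
import numpy as np


def rbf_kernel_matrix(X, gamma=1.0):
    """Pairwise RBF kernel exp(-gamma * ||x_i - x_j||^2) for row vectors."""
    X = np.asarray(X, dtype=float)
    if X.ndim != 2:
        # A 3D array has no single vector per sample to compare.
        raise ValueError("expected a 2D array of shape (n_samples, n_features)")
    sq_norms = (X ** 2).sum(axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    # Clip tiny negative values caused by floating-point rounding.
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))


features = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
K = rbf_kernel_matrix(features, gamma=0.5)
```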

For the 2nd setting, some methods depend only on the outcome of predict_proba (e.g., Uncertainty). You don't even need to provide the feature matrix or the model to select unlabeled data:

unc = QueryInstanceUncertainty()
unc.select_by_prediction_mat(unlabel_index, predict, batch_size=1)
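
For intuition, here is a sketch of what selection from a prediction matrix can look like. It is a standalone illustration, not ALiPy's implementation; the function name and the measure argument are assumptions. Only the probabilities are used, so the shape of the original input data does not matter.

```python
import numpy as np


def select_most_uncertain(unlabel_index, predict, batch_size=1, measure="entropy"):
    """Pick the unlabeled indices whose predicted probabilities are least certain.

    predict has shape (n_unlabeled, n_classes); row i belongs to unlabel_index[i].
    """
    unlabel_index = np.asarray(list(unlabel_index))
    predict = np.asarray(predict, dtype=float)
    if predict.shape[0] != len(unlabel_index):
        raise ValueError("predict needs one row per unlabeled index")
    if measure == "entropy":
        score = -(predict * np.log(predict + 1e-12)).sum(axis=1)
    elif measure == "least_confident":
        score = 1.0 - predict.max(axis=1)
    else:
        raise ValueError("unknown measure: " + measure)
    # Highest score first; a stable sort keeps the input order on ties.
    order = np.argsort(-score, kind="stable")[:batch_size]
    return unlabel_index[order].tolist()


example_index = [10, 11, 12]
example_predict = [[0.9, 0.1], [0.5, 0.5], [0.7, 0.3]]
picked = select_most_uncertain(example_index, example_predict)
```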

qbc = QueryInstanceQBC(method='query_by_bagging', disagreement='KL_divergence')
committee_predict = [predict_mat1, predict_mat2, predict_mat3]
qbc.select_by_prediction_mat(unlabel_index, committee_predict, batch_size=1)
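
The committee case can be sketched in the same spirit. The code below scores each sample by the mean KL divergence of every committee member from the averaged (consensus) prediction, a common way to measure disagreement. It is an illustration under that assumption and not ALiPy's implementation; the function names are hypothetical.

```python
import numpy as np


def kl_disagreement(committee_predict):
    """Mean KL divergence of each member from the committee consensus.

    committee_predict is a list of (n_samples, n_classes) probability matrices.
    """
    probs = np.asarray(committee_predict, dtype=float)  # (members, samples, classes)
    eps = 1e-12  # avoids log(0)
    consensus = probs.mean(axis=0)
    kl = (probs * (np.log(probs + eps) - np.log(consensus + eps))).sum(axis=2)
    return kl.mean(axis=0)


def select_by_disagreement(unlabel_index, committee_predict, batch_size=1):
    """Return the unlabeled indices the committee disagrees on most."""
    index = list(unlabel_index)
    score = kl_disagreement(committee_predict)
    order = np.argsort(-score, kind="stable")[:batch_size]
    return [index[i] for i in order]


# Three members, two samples: they agree on the first and disagree on the second.
member_a = [[0.8, 0.2], [0.9, 0.1]]
member_b = [[0.8, 0.2], [0.1, 0.9]]
member_c = [[0.8, 0.2], [0.5, 0.5]]
scores = kl_disagreement([member_a, member_b, member_c])
most_disputed = select_by_disagreement([20, 21], [member_a, member_b, member_c])
```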

For the EER method, it is technically sound to use a Keras model, because EER does not use the features at all. However, we use the sklearn function check_X_y to make sure the input is legal, which enforces a 2D X and a 1D y. Maybe you can try this to bypass the validity check:

eer = QureyExpectedErrorReduction(X=None, y=None)
eer.X = X3d
eer.y = y
eer.select(label_index, unlabel_index, model=keras_model, batch_size=1)

Note that your model should implement the fit and predict_proba methods. Alternatively, you can modify the source code directly to make the EER algorithm work with a Keras model.
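
For intuition about why EER only needs fit and predict_proba, here is a simplified sketch of expected error reduction with 0/1 loss. It is not ALiPy's implementation. The function name, the deepcopy-based retraining, and the toy CentroidModel are illustrative assumptions, and labels are assumed to be 0..n_classes-1 in the same order as the predict_proba columns. Note that copy.deepcopy may not work for every Keras model.

```python
import copy

import numpy as np


def eer_select(model, X, y, label_index, unlabel_index, n_classes):
    """Return the unlabeled index with the lowest expected 0/1 error.

    X is only ever passed to the model, so it may be 2D or 3D.
    Every class must already appear among the labeled samples.
    """
    label_index = list(label_index)
    unlabel_index = list(unlabel_index)
    model.fit(X[label_index], y[label_index])
    posterior = model.predict_proba(X[unlabel_index])

    scores = []
    for pos, candidate in enumerate(unlabel_index):
        rest = unlabel_index[:pos] + unlabel_index[pos + 1:]
        expected_error = 0.0
        for label in range(n_classes):
            # Retrain a copy as if the candidate had this label.
            trial = copy.deepcopy(model)
            trial.fit(X[label_index + [candidate]], np.append(y[label_index], label))
            proba = trial.predict_proba(X[rest])
            error = np.sum(1.0 - proba.max(axis=1))
            # Weight by how likely the current model thinks this label is.
            expected_error += posterior[pos, label] * error
        scores.append(expected_error)
    return unlabel_index[int(np.argmin(scores))]


class CentroidModel:
    """Toy stand-in for a real classifier; accepts 3D input."""

    def __init__(self, n_classes):
        self.n_classes = n_classes

    def fit(self, X, y):
        flat = X.reshape(len(X), -1)
        self.centroids = np.stack(
            [flat[y == c].mean(axis=0) for c in range(self.n_classes)]
        )
        return self

    def predict_proba(self, X):
        flat = X.reshape(len(X), -1)
        dist = ((flat[:, None, :] - self.centroids[None, :, :]) ** 2).sum(axis=2)
        # Softmax over negative distances, shifted for numerical stability.
        weights = np.exp(-(dist - dist.min(axis=1, keepdims=True)))
        return weights / weights.sum(axis=1, keepdims=True)


rng = np.random.default_rng(0)
X3d = rng.normal(size=(8, 4, 2))
y = np.array([0, 1, 0, 1, 0, 1, 0, 1])
chosen = eer_select(CentroidModel(2), X3d, y, [0, 1], [2, 3, 4, 5], n_classes=2)
```

The nested loop retrains the model once per candidate and per class, which is why EER is expensive with a neural network.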

However, some other methods use a specific model, prescribed by the paper, which is trained and used inside the algorithm (e.g., LAL, BMDR, AURO, AUDI, HALC).

Specifically, LAL uses a random forest (RF) to extract features of the unlabeled data for score prediction; a higher score means a more valuable sample. But some of these features can only be provided by an RF (the average depth of the trees in the forest, the out-of-bag estimate). If you want to use another model, you may have to design a new feature extraction method for score prediction.

AURO and AUDI use a LabelRanking model, which accepts instance-label pairs for training and can provide special information that is used in active selection. HALC uses a 'one vs the rest' SVM for multi-label classification.

The optimization of BMDR depends directly on a linear classification model according to the paper; if you want to switch to another model or another loss function, you may have to re-implement the optimization function.

In these cases, some of the specified models do not accept a 3D input shape.

@alexv1247
Author

Thanks for the quick and good answer.
So from your point of view, would it make sense to select the labels with my 2D feature matrix, using the default model of each strategy, and then use these labels to train my model with the 3D feature matrix?
This approach seems a bit fragile to me.
But I am very interested to hear your opinion.


@tangypnuaa
Collaborator

tangypnuaa commented May 5, 2019

The 2D feature matrix is obtained from your target model, so I think it's OK to use it for AL selection. But using another model to get the probability predictions in AL may harm performance.

If you are labeling a real dataset, the better way is to use your target model. So I suggest using Uncertainty with the select_by_prediction_mat method, which only needs the probability predictions, to select data. Moreover, uncertainty is stable and effective in most cases.
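
The workflow suggested here can be sketched as a plain loop: train the target model on the 3D data, take its probability predictions for the unlabeled pool, and query the least confident sample. This is an illustration only and does not use ALiPy; the function name is hypothetical, the model is any object with fit and predict_proba, and oracle_labels stands in for whatever annotation process supplies the labels.

```python
import numpy as np


def uncertainty_loop(model, X3d, oracle_labels, label_index, unlabel_index, n_queries):
    """Query one sample per round using the target model's own probabilities.

    Returns the queried indices in order and the final labeled index list.
    """
    labeled = list(label_index)
    unlabeled = list(unlabel_index)
    queried = []
    for _ in range(n_queries):
        if not unlabeled:
            break
        # The model sees the raw 3D data; no 2D feature matrix is needed.
        model.fit(X3d[labeled], oracle_labels[labeled])
        proba = np.asarray(model.predict_proba(X3d[unlabeled]))
        # Least confident: the sample with the smallest top-class probability.
        pick = unlabeled[int(np.argmin(proba.max(axis=1)))]
        labeled.append(pick)
        unlabeled.remove(pick)
        queried.append(pick)
    return queried, labeled
```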

If you are comparing with existing AL methods and going to write a paper, just make sure that the target model for performance evaluation AND the 2D feature matrix for selection are the same for all compared methods. It's OK to use another model IF the compared method uses a specified model.

@Qizal

Qizal commented Sep 11, 2019

I have downloaded the LSTM. Now how do I add it to MATLAB?
