
some fixes #317

Open · tzukpolinsky wants to merge 19 commits into master
Conversation

tzukpolinsky (Contributor):

So the list of fixes is:
1. Reordered code in the enum class and the machine learning interface.
2. Fixed an issue where non-existent files were being called for app visibility.
3. Resolved a problem with loading the ini file due to the absence of the 'file_types' parameter in the FileSelect class (a sketch of the idea follows this list).

There are likely more fixes; feel free to inquire about each one. Manual testing was conducted prior to creating this pull request.
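To illustrate item 3, a minimal sketch of the kind of default the fix implies, with a hypothetical class and parameter layout (not SimBA's exact FileSelect signature):

    from tkinter import Frame
    from tkinter.filedialog import askopenfilename
    from typing import List, Optional, Tuple


    class FileSelect(Frame):
        """Hypothetical sketch: 'file_types' is optional, so a caller that loads
        an ini file without specifying it no longer fails."""

        def __init__(self, parent, title: str = "Select file",
                     file_types: Optional[List[Tuple[str, str]]] = None, **kwargs):
            super().__init__(parent, **kwargs)
            self.title = title
            # Fall back to 'all files' when the caller does not pass file_types.
            self.file_types = file_types if file_types is not None else [("All files", "*.*")]

        def browse(self) -> str:
            return askopenfilename(title=self.title, filetypes=self.file_types)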

2. changed the value in the pop-up to the enum value for generalization
….9.0 it has np.float which was removed in later versions of numpy
…part_directionality_df_dir

so this is an implementation of how to read the directory tree (a rough sketch follows)
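A rough illustration of reading such a directory tree of per-body-part dataframes; the CSV extension and nested layout are assumptions, not necessarily what the commit does:

    import os
    import pandas as pd


    def read_directionality_tree(root_dir: str) -> dict:
        """Walk a directory tree and load every CSV found, keyed by its
        path relative to root_dir (a sketch; the real layout may differ)."""
        dataframes = {}
        for dirpath, _dirnames, filenames in os.walk(root_dir):
            for name in filenames:
                if name.lower().endswith(".csv"):
                    full_path = os.path.join(dirpath, name)
                    rel_key = os.path.relpath(full_path, root_dir)
                    dataframes[rel_key] = pd.read_csv(full_path)
        return dataframes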
1. added freezing feature extractor
2. fixed bug in data.py that didn't run the user-defined class
3. changed the arg name in the read_all_files_in_folder_mp_futures func to be more specific
4. fixed error in the "apply all" button in the ROI menu.
5. moved the extract-features button below the "apply" button, for UX reasons, but not important
6. added a new function that takes the annotation file and concatenates it with the feature file (a sketch follows this list).
7.
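For item 6, a hedged pandas sketch of joining an annotation file onto its feature file; the file format and the column-wise join are assumptions:

    import pandas as pd


    def concat_annotations_with_features(features_path: str, annotations_path: str,
                                         out_path: str) -> pd.DataFrame:
        """Column-wise concatenation of a feature file and its annotation file,
        assuming both are frame-indexed CSVs of equal length."""
        features = pd.read_csv(features_path, index_col=0)
        annotations = pd.read_csv(annotations_path, index_col=0)
        merged = pd.concat([features, annotations], axis=1)
        merged.to_csv(out_path)
        return merged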
1. added imbalanced random forest and the infrastructure for more machine learning algorithms that have the "fit" and "predict" methods (a sketch follows)
2. added 2 buttons that enable running a model on all the project data.
3. added a method to detect and return the Graphviz dot executable if it exists on the computer; tested on Linux.
and moved all the machine learning vals to a single enum
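For items 1 and 3, a sketch of the two ideas, assuming imbalanced-learn is available and that Graphviz detection is a plain PATH lookup (names here are illustrative, not the PR's actual functions):

    import shutil
    from typing import Optional

    from sklearn.ensemble import RandomForestClassifier
    from imblearn.ensemble import BalancedRandomForestClassifier


    def find_graphviz_dot() -> Optional[str]:
        """Return the path to the Graphviz 'dot' executable if it is on PATH, else None."""
        return shutil.which("dot")


    def create_classifier(name: str, **hyperparameters):
        """Any estimator exposing fit() and predict() can be slotted in here."""
        registry = {
            "rf": RandomForestClassifier,
            "imbalanced_rf": BalancedRandomForestClassifier,
        }
        return registry[name](**hyperparameters)


    # Usage sketch:
    # model = create_classifier("imbalanced_rf", n_estimators=2000, n_jobs=-1)
    # model.fit(X_train, y_train)
    # predictions = model.predict(X_test)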
assigning changes in enums
# Conflicts:
#	simba/SimBA.py
#	simba/data_processors/directing_animal_to_bodypart.py
#	simba/mixins/config_reader.py
#	simba/mixins/pop_up_mixin.py
#	simba/mixins/train_model_mixin.py
#	simba/model/train_rf.py
#	simba/ui/pop_ups/direction_animal_to_bodypart_settings_pop_up.py
#	simba/utils/data.py
#	simba/utils/enums.py
# Conflicts:
#	simba/ui/pop_ups/validation_plot_pop_up.py
#	simba/utils/enums.py
@@ -351,7 +354,7 @@ class Defaults(Enum):
 LARGE_MAX_TASK_PER_CHILD = 1000
 CHUNK_SIZE = 1
 SPLASH_TIME = 2500
-WELCOME_MSG = f'Welcome fellow scientists! \n SimBA v.{pkg_resources.get_distribution("simba-uw-tf-dev").version} \n '
+WELCOME_MSG = f'Welcome fellow scientists! \n SimBA v.{pkg_resources.get_distribution("simba-uw-tf-dev").version if importlib.util.find_spec("simba-uw-tf-dev") is not None else "dev"} \n '
Collaborator:

For some reason, importlib.util.find_spec("simba-uw-tf-dev") returns None in my python3.6 environment. Does something like this work on your end?

    SPLASH_TIME = 2500
    try:
        WELCOME_MSG = f'Welcome fellow scientists! \n SimBA v.{pkg_resources.get_distribution("simba-uw-tf-dev").version} \n '
    except pkg_resources.DistributionNotFound:
        WELCOME_MSG = f'Welcome fellow scientists! \n SimBA v. "dev" \n '
    BROWSE_FOLDER_BTN_TEXT = 'Browse Folder'

Contributor Author:

Yes, I don't like the try/except methodology in Python, but obviously it would work.
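For reference, one way to get the same fallback without try/except is to query the pkg_resources working set; importlib.util.find_spec looks up importable modules, not installed distributions, so it returns None for a distribution name like "simba-uw-tf-dev". A sketch, not necessarily what should be merged:

    import pkg_resources

    # Distributions are keyed by project name in the active working set; .get()
    # returns None when the package is not installed, so no exception handling is needed.
    _dist = pkg_resources.working_set.by_key.get("simba-uw-tf-dev")
    WELCOME_MSG = (
        f'Welcome fellow scientists! \n SimBA '
        f'v.{_dist.version if _dist is not None else "dev"} \n '
    )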

@@ -10,3 +10,5 @@ Ear_left_1,Ear_right_1,Nose_1,Tail_base_1,Ear_left_2,Right_ear_2,Nose_2,Tail_bas
Ear_left_1,Ear_right_1,Nose_1,Center_1,Lat_left_1,Lat_right_1,Tail_base_1,Ear_left_2,Ear_right_2,Nose_2,Center_2,Lat_left_2,Lat_right_2,Tail_base_2,,
Ear_left_1,Ear_right_1,Nose_1,Center_1,Lat_left_1,Lat_right_1,Tail_base_1,Tail_end_1,Ear_left_2,Ear_right_2,Nose_2,Center_2,Lat_left_2,Lat_right_2,Tail_base_2,Tail_end_2
3D,,,,,,,,,,,,,,,
nose,left_ear,right_ear,tail_base,bug_center
nose,right_hand,left_hand,right_leg,left_leg,tail_base
Collaborator:

Can we remove these two lines and revert to the original bp_names.csv? Otherwise others get to see your body-part configs.

Contributor Author:

Agree, my bad.

@@ -10,3 +10,5 @@ Multi-animals; 4 body-parts
Multi-animals; 7 body-parts
Multi-animals; 8 body-parts
3D tracking
mouse_and_bug
from_below
Collaborator:

same

Contributor Author:

Agree, my bad.

@@ -10,3 +10,5 @@
2
2
1
1
1
Collaborator:

same

Contributor Author:

Agree, my bad.

Collaborator:

👍🏻 I know in development this FileExistError can be a bit of a headache :) but a bit nervous about letting users overwrite previous data by mistake.

Collaborator:

This makes a lot of sense, thanks @tzukpolinsky

Collaborator:

Just a note, we should chat about this one; not sure when this is needed.

@@ -90,30 +90,78 @@ def __init__(self, config_path: str):
"25",
validation="numeric",
)
self.estimators_entrybox.entry_set(val=2000)
Collaborator:

Thanks for doing this one... I remember you mentioned it but never got to it.

Collaborator:

Maybe worth adding a default value just in case there is trouble reading from the config? E.g., n_estimators = read_config_entry(self.config, ConfigKey.CREATE_ENSEMBLE_SETTINGS.value, MachineLearningMetaKeys.RF_ESTIMATORS.value, data_type=Dtypes.INT.value, default_value=2000)?

Collaborator:

Also, I realized that when filling in the settings from the project_config, say it finds "random undersample", then the random undersample ratio should also be filled in and the entry box has to be activated. Right now it is not activated:
[screenshot]
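A hypothetical sketch of the behaviour being requested, with illustrative widget and variable names rather than SimBA's actual attributes:

    from tkinter import DISABLED, NORMAL


    def sync_undersample_widgets(dropdown, ratio_entrybox, setting: str, ratio: str) -> None:
        """Hypothetical helper: mirror the config's undersample setting into the GUI."""
        dropdown.setChoices(setting)
        if setting.lower().strip() == "random undersample":
            ratio_entrybox.entry_set(val=ratio)
            ratio_entrybox.set_state(NORMAL)   # enable so the user can edit the ratio
        else:
            ratio_entrybox.set_state(DISABLED)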

)

self.estimators_entrybox.entry_set(val=n_estimators)
max_features = read_config_entry(
Collaborator:

And same here? max_features = read_config_entry(self.config, ConfigKey.CREATE_ENSEMBLE_SETTINGS.value, MachineLearningMetaKeys.RF_MAX_FEATURES.value, data_type=Dtypes.STR.value, default_value=self.max_features_options[0])

self.criterion_dropdown = DropDownMenu(
self.hyperparameters_frm, "Criterion: ", self.criterion_options, "25"
)
self.criterion_dropdown.setChoices(self.criterion_options[0])
train_test_size = read_config_entry(
Collaborator:

    train_test_size = read_config_entry(self.config, ConfigKey.CREATE_ENSEMBLE_SETTINGS.value, MachineLearningMetaKeys.TT_SIZE.value, data_type=Dtypes.STR.value, default_value="0.2")

self.train_test_type_dropdown = DropDownMenu(
self.hyperparameters_frm,
"Train-test Split Type: ",
Options.TRAIN_TEST_SPLIT.value,
"25",
)
self.train_test_type_dropdown.setChoices(Options.TRAIN_TEST_SPLIT.value[0])
Collaborator:

train_test_split_type = read_config_entry( self.config, ConfigKey.CREATE_ENSEMBLE_SETTINGS.value, MachineLearningMetaKeys.TRAIN_TEST_SPLIT_TYPE.value, data_type=Dtypes.STR.value, default_value=Options.TRAIN_TEST_SPLIT.value[0]) ?

self.train_test_size_dropdown.setChoices(self.meta["train_test_size"])
self.min_sample_leaf_eb.entry_set(val=self.meta["rf_min_sample_leaf"])
self.undersample_settings_dropdown.setChoices(self.meta["under_sample_setting"])
self.behavior_name_dropdown.setChoices(self.meta[MachineLearningMetaKeys.CLASSIFIER])
Collaborator:

shouldn't this be MachineLearningMetaKeys.CLASSIFIER.**value**?

@@ -798,24 +843,22 @@ def load_config(self):
self.shap_absent.set_state(DISABLED)
self.shap_save_it_dropdown.enable()

if "train_test_split_type" in self.meta.keys():
self.train_test_type_dropdown.setChoices(self.meta["train_test_split_type"])
if MachineLearningMetaKeys.TRAIN_TEST_SPLIT_TYPE in self.meta.keys():
Collaborator (@sronilsson, Jan 3, 2024):

Looks like a typo here, with the train test size set for the train test split type?
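Presumably the intended branch uses the split-type key and dropdown; a fragment sketch, assuming the enum value matches the old "train_test_split_type" string:

    if MachineLearningMetaKeys.TRAIN_TEST_SPLIT_TYPE.value in self.meta.keys():
        self.train_test_type_dropdown.setChoices(
            self.meta[MachineLearningMetaKeys.TRAIN_TEST_SPLIT_TYPE.value]
        )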

@@ -423,13 +391,13 @@ class Methods(Enum):
THIRD_PARTY_ANNOTATION_FILE_NOT_FOUND = "Annotations data file NOT FOUND"


class MetaKeys(Enum):
Collaborator (@sronilsson, Jan 4, 2024):

In the meta files, the header is named "classifier_name"; ideally I'd like to keep that so as not to break compatibility with config files created by older versions. I'll insert another key in MachineLearningMetaKeys with the classifier_name value. I bump into KeyError: 'classifier' at self.clf_name = meta_dict[MachineLearningMetaKeys.CLASSIFIER.value] when trying grid search RF.
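A hedged sketch of a lookup that tolerates both the legacy meta-file header and the new enum value (the fallback key string is an assumption):

    # Try the new key first, then fall back to the legacy meta-file header.
    clf_name = meta_dict.get(
        MachineLearningMetaKeys.CLASSIFIER.value,
        meta_dict.get("classifier_name"),
    )
    if clf_name is None:
        raise KeyError("Neither the classifier key nor 'classifier_name' was found in the meta file")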

Collaborator:

Correction, I changed the value to align in both the meta configs and the project config...
