From 3c51cd44ccc809e7399fcf6a79e5a2f3131f460e Mon Sep 17 00:00:00 2001
From: h-aze
\n\n
\n\nA super simple Python package for performing hyperparameter tuning (or, more generally, launching jobs and saving results) on a cluster using SLURM. It takes advantage of the fact that lots of jobs (including hyperparameter tuning) are embarrassingly parallel! With slune you can divide your compute into lots of separately scheduled jobs, meaning that each small job can get running on your cluster more quickly, speeding up your workflow! Often significantly!
\n\nSlune is super-easy to use! We have helper functions which can execute everything you need done for you, letting you speed up your work without wasting time.
\n\nSlune is barebones by design. This means that you can easily write code to integrate with slune if you want to do something a bit different! You can also work out what each function is doing pretty easily.
\n\nSlune is flexible. In designing this package I've tried to make as few assumptions as possible, meaning that it can be used for lots of stuff outside hyperparameter tuning (or also within)! For example, you can get slune to give you paths for where to save things, submit lots of jobs in parallel for any sort of script and do grid search! And there's more to come!
\n\nLet's go through a quick example of how we can use slune ... first let's define a model that we want to train:
\n\n# Simple Regularized Linear Regression without using external libraries\n\n# Function to compute the mean of a list\ndef mean(values):\n    return sum(values) / float(len(values))\n\n# Function to compute the covariance between two lists\ndef covariance(x, mean_x, y, mean_y):\n    covar = 0.0\n    for i in range(len(x)):\n        covar += (x[i] - mean_x) * (y[i] - mean_y)\n    return covar\n\n# Function to compute the variance of a list\ndef variance(values, mean):\n    return sum((x - mean) ** 2 for x in values)\n\n# Function to compute coefficients for a simple regularized linear regression\ndef coefficients_regularized(x, y, alpha):\n    mean_x, mean_y = mean(x), mean(y)\n    var_x = variance(x, mean_x)\n    covar = covariance(x, mean_x, y, mean_y)\n    b1 = (covar + alpha * var_x) / (var_x + alpha)\n    b0 = mean_y - b1 * mean_x\n    return b0, b1\n\n# Function to make predictions with a simple regularized linear regression model\ndef linear_regression_regularized(train_X, train_y, test_X, alpha):\n    b0, b1 = coefficients_regularized(train_X, train_y, alpha)\n    predictions = [b0 + b1 * x for x in test_X]\n    return predictions\n\n# ------------------\n# The above is code for a simple regularized linear regression model that we want to train.\n# Now let's fit the model and use slune to save how well our model performs!\n# ------------------\n\nif __name__ == "__main__":\n    # First let's load in the value for the regularization parameter alpha that has been passed to this script from the command line. We will use the slune helper function lsargs to do this. \n    # lsargs returns a tuple of the python path and a list of arguments passed to the script. We can then use this to get the alpha value.\n    from slune import lsargs\n    python_path, args = lsargs()\n    alpha = float(args[0])\n\n    # Mock training dataset, function is y = 1 + 1 * x\n    X = [1, 2, 3, 4, 5]\n    y = [2, 3, 4, 5, 6]\n\n    # Mock test dataset\n    test_X = [6, 7, 8]\n    test_y = [7, 8, 9]\n    test_predictions = linear_regression_regularized(X, y, test_X, alpha)\n\n    # Now let's load in a function that we can use to get a saver object that uses the default method of logging (we call this object a slog = saver + logger). The saving will be coordinated by a csv saver object which saves and reads results from csv files stored in a hierarchy of directories.\n    from slune import get_csv_slog\n    csv_slog = get_csv_slog(params = args)\n\n    # Let's now calculate the mean squared error of our predictions and log it!\n    # (mean needs a list, not a generator, because it calls len on its argument)\n    mse = mean([(test_y[i] - test_predictions[i])**2 for i in range(len(test_y))])\n    csv_slog.log({'mse': mse})\n\n    # Let's now save our logged results!\n    csv_slog.save_collated()\n
\nNow let's write some code that will submit some jobs to train our model using different hyperparameters!!
\n\n# Let's now load in a class that will coordinate our search! We're going to do a grid search.\n# SearcherGrid is the class we can use to coordinate a grid search. We pass it a dictionary of hyperparameters and the values we want to try for each hyperparameter. We also pass it the number of runs we want to do for each combination of hyperparameters.\nfrom slune.searchers import SearcherGrid\ngrid_searcher = SearcherGrid({'alpha' : [0.25, 0.5, 0.75]}, runs = 1)\n\n# Let's now import a function which will submit a job for our model. The script_path specifies the path to the script that contains the model we want to train. The template_path specifies the path to the template script that we want to specify the job with, and cargs is a list of constant arguments we want to pass to the script for each tuning. \n# We set slog to None because we don't want to skip jobs that have already been run before.\nfrom slune import sbatchit\nscript_path = 'model.py'\ntemplate_path = 'template.sh'\nsbatchit(script_path, template_path, grid_searcher, cargs=[], slog=None)\n
\nNow we've submitted our jobs we will wait for them to finish \ud83d\udd5b\ud83d\udd50\ud83d\udd51\ud83d\udd52\ud83d\udd53\ud83d\udd54\ud83d\udd55\ud83d\udd56\ud83d\udd57\ud83d\udd58\ud83d\udd59\ud83d\udd5a\ud83d\udd5b, now that they are finished we can read the results!
\n\nfrom slune import get_csv_slog\ncsv_slog = get_csv_slog(params = None)\nparams, value = csv_slog.read(params = [], metric_name = 'mse', select_by ='min')\nprint(f'Best hyperparameters: {params}')\nprint(f'Their MSE: {value}')\n
\nAmazing! \ud83e\udd73 We have successfully used slune to train our model. I hope this gives you a good flavour of how you can use slune and how easy it is to use!
\n\nPlease check out the examples folder for notebooks detailing in more depth some potential ways you can use slune. The docs are not yet up and running \ud83d\ude22 but they are coming soon!
\n\nHowever, I am trying to keep this package as bloatless as possible to make it easy for you to tweak and configure to your individual needs. It's written in a simple and compartmentalized manner for this reason. You can of course use the helper functions and let slune handle everything under the hood, but, you can also very quickly and easily write your own classes to work with other savers, loggers and searchers to do as you please.
\n\nTo install the latest version use:
\n\npip install slune-lib\n
\nTo install the latest dev version use (CURRENTLY RECOMMENDED):
\n\n# With https\npip install "git+https://github.com/h-aze/slune.git#egg=slune-lib"\n
\nHere we will outline the different kinds of classes that are used in slune and how they interact with each other. There are 3 types:
\n\nThe base module is where the base classes for each of these types are defined. The base classes are:
\n\nTo create a new searcher, logger or saver, you must inherit from the appropriate base class and implement the required methods. The required methods will have the '@abc.abstractmethod' decorator above them and will throw errors if they are not implemented. The compulsory methods allow for well-defined interactions between the different classes and should allow for any combination of searcher, logger and saver to be used together.
\n\nPlease read the docs for the base classes to see what methods are required to be implemented and how they should be implemented.
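The subclassing pattern described above can be sketched with the standard library alone. The base class below is a hypothetical stand-in written for this illustration (it is not slune's own base class), and SearcherList and SearcherIncomplete are made-up names. It only aims to show how '@abc.abstractmethod' makes Python refuse to instantiate a subclass that skips a required method.

```python
import abc


class SketchBaseSearcher(abc.ABC):
    """Hypothetical stand-in for a searcher base class (not slune's actual code)."""

    @abc.abstractmethod
    def next_tune(self):
        """Return the next configuration to try."""

    @abc.abstractmethod
    def check_existing_runs(self, saver):
        """Tell the searcher to check storage for existing runs."""


class SearcherList(SketchBaseSearcher):
    """Toy searcher that walks through a fixed list of configurations."""

    def __init__(self, configs):
        self.configs = list(configs)
        self.index = 0
        self.saver = None

    def next_tune(self):
        if self.index >= len(self.configs):
            raise IndexError("Reached the end of the configurations")
        config = self.configs[self.index]
        self.index += 1
        return config

    def check_existing_runs(self, saver):
        # Just remember the saver; a real searcher would query it for existing runs
        self.saver = saver


class SearcherIncomplete(SketchBaseSearcher):
    """Forgets to implement check_existing_runs."""

    def next_tune(self):
        return {}


searcher = SearcherList([{"alpha": 0.25}, {"alpha": 0.5}])
first = searcher.next_tune()

# Instantiating a class with a missing abstract method raises TypeError
try:
    SearcherIncomplete()
    incomplete_error = None
except TypeError as err:
    incomplete_error = err
```

The error is raised when the incomplete class is instantiated, not when it is defined, so a missing method shows up as soon as you try to use the class.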
\n"}, {"fullname": "src.slune", "modulename": "src.slune", "kind": "module", "doc": "\n"}, {"fullname": "src.slune.base", "modulename": "src.slune.base", "kind": "module", "doc": "\n"}, {"fullname": "src.slune.base.BaseSearcher", "modulename": "src.slune.base", "qualname": "BaseSearcher", "kind": "class", "doc": "Base class for all Searchers.
\n\nThis must be subclassed to create different Searcher classes.\nPlease name your searcher class Searcher
Initialises the searcher.
\n", "signature": "(*args, **kwargs)"}, {"fullname": "src.slune.base.BaseSearcher.next_tune", "modulename": "src.slune.base", "qualname": "BaseSearcher.next_tune", "kind": "function", "doc": "Returns the next configuration to try.
\n", "signature": "(self, *args, **kwargs):", "funcdef": "def"}, {"fullname": "src.slune.base.BaseSearcher.check_existing_runs", "modulename": "src.slune.base", "qualname": "BaseSearcher.check_existing_runs", "kind": "function", "doc": "Used to tell searcher to check if there are existing runs in storage.
\n\nIf there are existing runs, the searcher should skip them \nbased on the number of runs we would like for each job.\nThis may require a 'runs' attribute to be set in the searcher.\nIt will probably also require access to a Saver object,\nso we can use its saving protocol to check if there are existing runs.\nIn this case it is advised that this function takes a Saver object as an argument,\nand that the searcher is initialized with a 'runs' attribute.
\n", "signature": "(self, *args, **kwargs):", "funcdef": "def"}, {"fullname": "src.slune.base.BaseLogger", "modulename": "src.slune.base", "qualname": "BaseLogger", "kind": "class", "doc": "Base class for all Loggers.
\n\nThis must be subclassed to implement different Logger classes.\nPlease name your logger class Logger
Initialises the logger.
\n", "signature": "(*args, **kwargs)"}, {"fullname": "src.slune.base.BaseLogger.log", "modulename": "src.slune.base", "qualname": "BaseLogger.log", "kind": "function", "doc": "Logs the metric/s for the current hyperparameter configuration.
\n\nShould store metrics in some way so we can later save it using a Saver.
\n", "signature": "(self, *args, **kwargs):", "funcdef": "def"}, {"fullname": "src.slune.base.BaseLogger.read_log", "modulename": "src.slune.base", "qualname": "BaseLogger.read_log", "kind": "function", "doc": "Returns the value of a metric from the log based on a selection criterion.
\n", "signature": "(self, *args, **kwargs):", "funcdef": "def"}, {"fullname": "src.slune.base.BaseSaver", "modulename": "src.slune.base", "qualname": "BaseSaver", "kind": "class", "doc": "Base class for all savers.
\n\nThis must be subclassed to implement different Saver classes.\nPlease name your saver class Saver
Initialises the saver.
\n\nAssigns the logger instance to self.logger and makes its methods accessible through self.log and self.read_log.
\n\nSaves the current results in logger to storage.
\n", "signature": "(self, *args, **kwargs):", "funcdef": "def"}, {"fullname": "src.slune.base.BaseSaver.read", "modulename": "src.slune.base", "qualname": "BaseSaver.read", "kind": "function", "doc": "Reads results from storage.
\n", "signature": "(self, *args, **kwargs):", "funcdef": "def"}, {"fullname": "src.slune.base.BaseSaver.exists", "modulename": "src.slune.base", "qualname": "BaseSaver.exists", "kind": "function", "doc": "Checks if results already exist in storage.
\n\nShould return integer indicating the number of runs that exist in storage for the given parameters.
\n", "signature": "(self, *args, **kwargs):", "funcdef": "def"}, {"fullname": "src.slune.loggers", "modulename": "src.slune.loggers", "kind": "module", "doc": "\n"}, {"fullname": "src.slune.loggers.default", "modulename": "src.slune.loggers.default", "kind": "module", "doc": "\n"}, {"fullname": "src.slune.loggers.default.LoggerDefault", "modulename": "src.slune.loggers.default", "qualname": "LoggerDefault", "kind": "class", "doc": "Logs metric/s in a data frame.
\n\nStores the metric/s in a data frame that we can later save in storage.\nLogs by creating data frame out of the metrics and then appending it to the current results data frame.
\n\nInitialises the logger.
\n", "signature": "(*args, **kwargs)"}, {"fullname": "src.slune.loggers.default.LoggerDefault.results", "modulename": "src.slune.loggers.default", "qualname": "LoggerDefault.results", "kind": "variable", "doc": "\n"}, {"fullname": "src.slune.loggers.default.LoggerDefault.log", "modulename": "src.slune.loggers.default", "qualname": "LoggerDefault.log", "kind": "function", "doc": "Logs the metric/s given.
\n\nStores them in a data frame that we can later save in storage.\nAll metrics provided will be saved as a row in the results data frame,\nthe first column is always the time stamp at which log is called.
\n\nReads log and returns value according to select_by.
\n\nReads the values for given metric for given log and chooses metric value to return based on select_by.
\n\n\n\n\n\n
\n- value (float): Minimum or maximum value of the metric.
\n
TODO: \n - Add more options for select_by.\n - Should be able to return other types than float?
\n", "signature": "(\tself,\tdata_frame: pandas.core.frame.DataFrame,\tmetric_name: str,\tselect_by: str = 'max') -> float:", "funcdef": "def"}, {"fullname": "src.slune.savers", "modulename": "src.slune.savers", "kind": "module", "doc": "\n"}, {"fullname": "src.slune.savers.csv", "modulename": "src.slune.savers.csv", "kind": "module", "doc": "\n"}, {"fullname": "src.slune.savers.csv.SaverCsv", "modulename": "src.slune.savers.csv", "qualname": "SaverCsv", "kind": "class", "doc": "Saves the results of each run in a CSV file in hierarchy of directories.
\n\nEach directory is named after a parameter - value pair in the form \"--parameter_name=value\".\nThe paths to csv files then define the configuration under which the results were obtained,\nfor example if we only have one parameter \"learning_rate\" with value 0.01 used to obtain the results,\nto save those results we would create a directory named \"--learning_rate=0.01\" and save the results in a csv file in that directory.
\n\nIf we have multiple parameters, for example \"learning_rate\" with value 0.01 and \"batch_size\" with value 32,\nwe would create a directory named \"--learning_rate=0.01\" with a subdirectory named \"--batch_size=32\",\nand save the results in a csv file in that subdirectory.
\n\nWe use this structure to then read the results from the csv files by searching for the directory that matches the parameters we want,\nand then reading the csv file in that directory.
\n\nThe order in which we create the directories is determined by the order in which the parameters are given,\nso if we are given [\"--learning_rate=0.01\", \"--batch_size=32\"] we would create the directories in the following order:\n\"--learning_rate=0.01/--batch_size=32\".
\n\nThe directory structure generated will also depend on existing directories in the root directory,\nif there are existing directories in the root directory that match some subset of the parameters given,\nwe will create the directory tree from the deepest matching directory.
\n\nFor example if we only have the following path in the root directory:\n\"--learning_rate=0.01/--batch_size=32\"\nand we are given the parameters [\"--learning_rate=0.01\", \"--batch_size=32\", \"--num_epochs=10\"],\nwe will create the path:\n\"--learning_rate=0.01/--batch_size=32/--num_epochs=10\".\non the other hand if we are given the parameters [\"--learning_rate=0.02\", \"--num_epochs=10\", \"--batch_size=32\"],\nwe will create the path:\n\"--learning_rate=0.02/--batch_size=32/--num_epochs=10\".
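The path-building rule in the two examples above can be sketched as follows. This is a simplified, in-memory illustration that assumes the existing paths are given as plain strings; build_path and strip_value are hypothetical helper names, not part of slune, and the real saver works on actual directories.

```python
def strip_value(param):
    """Turn '--name=value' into '--name=' and drop surrounding blanks."""
    return param.split("=", 1)[0].strip() + "="


def build_path(params, existing_paths):
    """Order params along the deepest matching existing path, then append the rest."""
    by_name = {strip_value(p): p.strip() for p in params}
    best = []
    for path in existing_paths:
        # Walk down the existing path while its parameter names appear in params
        prefix = []
        for part in path.split("/"):
            name = strip_value(part)
            if name in by_name and name not in prefix:
                prefix.append(name)
            else:
                break
        if len(prefix) > len(best):
            best = prefix
    # Parameters with no matching directory keep the order in which they were given
    ordered = best + [name for name in by_name if name not in best]
    return "/".join(by_name[name] for name in ordered)


existing = ["--learning_rate=0.01/--batch_size=32"]
extended = build_path(
    ["--learning_rate=0.01", "--batch_size=32", "--num_epochs=10"], existing
)
reordered = build_path(
    ["--learning_rate=0.02", "--num_epochs=10", "--batch_size=32"], existing
)
fresh = build_path(["--learning_rate=0.01", "--batch_size=32"], [])
```

Matching is done on parameter names only, which is why the second call reuses the existing directory order even though its learning rate differs.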
\n\nHandles parallel runs trying to create the same directories by waiting a random time (under 1 second) before creating the directory.\nShould work pretty well in practice, however, may occasionally fail depending on the number of jobs launched at the same time.
\n\nInitialises the csv saver.
\n\nStrips the parameter values.
\n\nStrips the parameter values from the list of parameters given,\ni.e. [\"--parameter_name=parameter_value\", ...] -> [\"--parameter_name=\", ...]
\n\nAlso gets rid of blank spaces.
\n\n\n\n", "signature": "(self, params: List[str]) -> List[str]:", "funcdef": "def"}, {"fullname": "src.slune.savers.csv.SaverCsv.get_match", "modulename": "src.slune.savers.csv", "qualname": "SaverCsv.get_match", "kind": "function", "doc": "\n
\n- stripped_params (list of str): List of strings containing the parameters used, in form [\"--parameter_name=\", ...].
\n
Searches the root directory for a path that matches the parameters given.
\n\nIf only partial matches are found, returns the deepest matching directory with the missing parameters appended.\nBy deepest we mean the directory with the most parameters matching.\nIf no matches are found, creates a path using the parameters.\nCreates the path using the parameters in the order they are given, \ni.e. [\"--learning_rate=0.01\", \"--batch_size=32\"] -> \"--learning_rate=0.01/--batch_size=32\".
\n\nIf we find a partial match, we add the missing parameters to the end of the path,\ni.e. if we have the path \"--learning_rate=0.01\" in the root \nand are given the parameters [\"--learning_rate=0.01\", \"--batch_size=32\"],\nwe will create the path \"--learning_rate=0.01/--batch_size=32\".
\n\n\n\n", "signature": "(self, params: List[str]) -> str:", "funcdef": "def"}, {"fullname": "src.slune.savers.csv.SaverCsv.get_path", "modulename": "src.slune.savers.csv", "qualname": "SaverCsv.get_path", "kind": "function", "doc": "\n
\n- match (str): Path to the directory that matches the parameters given.
\n
Creates a path using the parameters.
\n\nDoes this by first checking for existing paths in the root directory that match the parameters given.
\n\nSee get_match for how we create the path. \nOnce we have the path, we check if there is already a csv file with results in that path;\nif there is, we increment the number in the results file name that we will use.
\n\nFor example if we get back the path \"--learning_rate=0.01/--batch_size=32\",\nand there exists a csv file named \"results_0.csv\" in the final directory,\nwe will name our csv file \"results_1.csv\".
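The file-numbering step described above might look roughly like this. It is a sketch under the assumption that result files are always named "results_N.csv"; next_results_name is a hypothetical helper, not a slune function.

```python
import os
import re
import tempfile


def next_results_name(directory):
    """Return 'results_N.csv' with N one above the highest number already present."""
    pattern = re.compile(r"results_(\d+)\.csv")
    if not os.path.isdir(directory):
        return "results_0.csv"
    numbers = []
    for name in os.listdir(directory):
        match = pattern.fullmatch(name)
        if match:
            numbers.append(int(match.group(1)))
    return f"results_{max(numbers) + 1}.csv" if numbers else "results_0.csv"


with tempfile.TemporaryDirectory() as root:
    run_dir = os.path.join(root, "--learning_rate=0.01", "--batch_size=32")
    os.makedirs(run_dir)
    first_name = next_results_name(run_dir)
    # Create an empty results file so the next call sees it
    open(os.path.join(run_dir, first_name), "w").close()
    second_name = next_results_name(run_dir)
```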
\n\n\n\n", "signature": "(self, params: List[str]) -> str:", "funcdef": "def"}, {"fullname": "src.slune.savers.csv.SaverCsv.save_collated_from_results", "modulename": "src.slune.savers.csv", "qualname": "SaverCsv.save_collated_from_results", "kind": "function", "doc": "\n
\n- csv_file_path (str): Path to the csv file where we will store the results for the current run.
\n
Saves results to csv file.
\n\nIf the csv file already exists, \nwe append the collated results from the logger to the end of the csv file.\nIf the csv file does not exist,\nwe create it and save the results to it.
\n\nTODO: \n - Could be making too many assumptions about the format in which we get the results from the logger,\n should be able to work with any logger.\n We should only be assuming that we are saving results to a csv file.
\n", "signature": "(self, results: pandas.core.frame.DataFrame):", "funcdef": "def"}, {"fullname": "src.slune.savers.csv.SaverCsv.save_collated", "modulename": "src.slune.savers.csv", "qualname": "SaverCsv.save_collated", "kind": "function", "doc": "Saves results to csv file.
\n", "signature": "(self):", "funcdef": "def"}, {"fullname": "src.slune.savers.csv.SaverCsv.read", "modulename": "src.slune.savers.csv", "qualname": "SaverCsv.read", "kind": "function", "doc": "Finds the min/max value of a metric from all csv files in the root directory that match the parameters given.
\n\n\n\n", "signature": "(\tself,\tparams: List[str],\tmetric_name: str,\tselect_by: str = 'max',\tavg: bool = True) -> (typing.List[str], <class 'float'>):", "funcdef": "def"}, {"fullname": "src.slune.savers.csv.SaverCsv.exists", "modulename": "src.slune.savers.csv", "qualname": "SaverCsv.exists", "kind": "function", "doc": "\n
\n- best_params (list of str): Contains the arguments used to get the 'best' value of the metric (determined by select_by).
\n- best_value (float): Best value of the metric (determined by select_by).
\n
Checks if results already exist in storage.
\n\n\n\n", "signature": "(self, params: List[str]) -> int:", "funcdef": "def"}, {"fullname": "src.slune.savers.csv.SaverCsv.get_current_path", "modulename": "src.slune.savers.csv", "qualname": "SaverCsv.get_current_path", "kind": "function", "doc": "\n
\n- num_runs (int): Number of runs that exist in storage for the given parameters.
\n
Getter function for the current_path attribute.
\n\n\n\n", "signature": "(self) -> str:", "funcdef": "def"}, {"fullname": "src.slune.searchers", "modulename": "src.slune.searchers", "kind": "module", "doc": "\n"}, {"fullname": "src.slune.searchers.grid", "modulename": "src.slune.searchers.grid", "kind": "module", "doc": "\n"}, {"fullname": "src.slune.searchers.grid.SearcherGrid", "modulename": "src.slune.searchers.grid", "qualname": "SearcherGrid", "kind": "class", "doc": "\n
\n- current_path (str): Path to the csv file where we will store the results for the current run.
\n
Searcher for grid search.
\n\nGiven dictionary of parameters and values to try, creates grid of all possible configurations,\nand returns them one by one for each call to next_tune.
\n\nInitialises the searcher.
\n\nCreates search grid.
\n\nGenerates all possible combinations of values for each argument in the given dictionary using recursion.
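A recursive grid construction of the kind described above could be written as below. This is an illustrative sketch, not the package's own implementation; make_grid is a hypothetical name.

```python
def make_grid(param_dict):
    """Recursively build every combination of the values in param_dict."""
    if not param_dict:
        # Base case: a single empty configuration
        return [{}]
    items = list(param_dict.items())
    name, values = items[0]
    # Build the grid for the remaining parameters, then prepend each value of the first
    rest = make_grid(dict(items[1:]))
    return [{name: value, **combo} for value in values for combo in rest]


grid = make_grid({"alpha": [0.25, 0.5], "batch_size": [16, 32]})
```

With two values for each of two parameters the grid has four entries, and the last parameter varies fastest.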
\n\n\n\n", "signature": "(self, param_dict: dict) -> List:", "funcdef": "def"}, {"fullname": "src.slune.searchers.grid.SearcherGrid.check_existing_runs", "modulename": "src.slune.searchers.grid", "qualname": "SearcherGrid.check_existing_runs", "kind": "function", "doc": "\n
\n- all_combinations (list): A list of dictionaries, each containing one combination of argument values.
\n
We save a pointer to the saver's exists method to check if there are existing runs.
\n\n\n\n\nn < runs -> run the remaining runs\n n >= runs -> skip all runs
\n
Skips runs if they are in storage already.
\n\nWill check if there are existing runs for the current configuration,\nif there are existing runs we tally them up \nand skip configs or runs of a config based on the number of runs we want for each config.
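The skipping rule can be illustrated with a small planning function. Here count_existing is a hypothetical callable standing in for a saver's exists method, and plan_jobs is a made-up name. The sketch only shows the tallying logic, not how slune stores its state.

```python
def plan_jobs(grid, runs, count_existing):
    """List the (config, run_index) pairs that still need to be run."""
    plan = []
    for config in grid:
        n = count_existing(config)
        # n below runs: only the remaining runs are scheduled
        # n at or above runs: the range is empty, so the config is skipped
        for run_index in range(n, runs):
            plan.append((config, run_index))
    return plan


grid = [{"alpha": 0.25}, {"alpha": 0.5}, {"alpha": 0.75}]
already_stored = {0.25: 2, 0.5: 1}  # made-up counts of existing runs per alpha
plan = plan_jobs(grid, runs=2, count_existing=lambda c: already_stored.get(c["alpha"], 0))
```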
\n\n\n\n", "signature": "(self, grid_index: int) -> Tuple[int, int]:", "funcdef": "def"}, {"fullname": "src.slune.searchers.grid.SearcherGrid.next_tune", "modulename": "src.slune.searchers.grid", "qualname": "SearcherGrid.next_tune", "kind": "function", "doc": "\n
\n- grid_index (int): Index of the next configuration in the grid.
\n- run_index (int): Index of the next run for the current configuration.
\n
Returns the next configuration to try.
\n\nWill skip existing runs if check_existing_runs has been called.\nFor more information on how this works check the method descriptions for check_existing_runs and skip_existing_runs.\nWill raise an error if we have reached the end of the grid.\nTo iterate through all configurations, use a for loop like so: \n for config in searcher: ...
\n\n\n\n", "signature": "(self) -> dict:", "funcdef": "def"}, {"fullname": "src.slune.slune", "modulename": "src.slune.slune", "kind": "module", "doc": "\n"}, {"fullname": "src.slune.slune.submit_job", "modulename": "src.slune.slune", "qualname": "submit_job", "kind": "function", "doc": "\n
\n- next_config (dict): The next configuration to try.
\n
Submits a job using specified Bash script
\n\nSubmits jobs based on arguments given by searcher.
\n\nFor each job runs the script stored at script_path with selected parameter values given by searcher\nand the arguments given by cargs.
\n\nUses the sbatch script with path sbatch_path to submit each job to the cluster.
\n\nIf given a Saver object, uses it to check if there are existing runs for each job and skips them,\nbased on the number of runs we would like for each job (which is stored in the saver).
\n\nReturns the script name and a list of the arguments passed to the script.
\n", "signature": "() -> (<class 'str'>, typing.List[str]):", "funcdef": "def"}, {"fullname": "src.slune.slune.garg", "modulename": "src.slune.slune", "qualname": "garg", "kind": "function", "doc": "Finds the argument/s with name arg_names in the list of arguments args_ls and returns its value/s.
\n\n\n\n", "signature": "(\targs: List[str],\targ_names: Union[str, List[str]]) -> Union[str, List[str]]:", "funcdef": "def"}, {"fullname": "src.slune.slune.get_csv_slog", "modulename": "src.slune.slune", "qualname": "get_csv_slog", "kind": "function", "doc": "\n
\n- arg_value (str or list of str): String or list of strings containing the values of the arguments found.
\n
Returns a SaverCsv object with the given parameters and root directory.
\n\n\n\n", "signature": "(\tparams: Optional[dict] = None,\troot_dir: Optional[str] = 'slune_results') -> slune.base.BaseSaver:", "funcdef": "def"}, {"fullname": "src.slune.utils", "modulename": "src.slune.utils", "kind": "module", "doc": "\n"}, {"fullname": "src.slune.utils.find_directory_path", "modulename": "src.slune.utils", "qualname": "find_directory_path", "kind": "function", "doc": "\n
\n- SaverCsv (Saver): Saver object with the given parameters and root directory.\n Initialized with a LoggerDefault object as its logger.
\n
Searches the root directory for a path of directories that matches the strings given in any order.\nIf only a partial match is found, returns the deepest matching path.\nIf no matches are found, returns root_directory.\nReturns a stripped matching path of directories, i.e. where we convert '--string=value' to '--string='.
\n\n\n\n", "signature": "(\tstrings: List[str],\troot_directory: Optional[str] = '.') -> Tuple[int, str]:", "funcdef": "def"}, {"fullname": "src.slune.utils.get_numeric_equiv", "modulename": "src.slune.utils", "qualname": "get_numeric_equiv", "kind": "function", "doc": "\n
\n- max_depth (int): Depth of the deepest matching path.
\n- max_path (string): Path of the deepest matching path.
\n
Replaces directories in path with existing directories with the same numerical value.
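One way to picture numerical equivalence: "--learning_rate=0.010" and "--learning_rate=0.01" name the same value, so the existing spelling should be reused. The sketch below assumes the existing directory names are supplied as a flat list of strings; numeric_equiv and same_setting are hypothetical helpers, and the real function works against the directory tree.

```python
def same_setting(a, b):
    """True if two '--name=value' strings share a name and a numerically equal value."""
    name_a, _, value_a = a.partition("=")
    name_b, _, value_b = b.partition("=")
    if name_a != name_b:
        return False
    try:
        return float(value_a) == float(value_b)
    except ValueError:
        # Non-numeric values must match exactly
        return value_a == value_b


def numeric_equiv(path, existing_dirs):
    """Swap each directory in path for an existing, numerically equal spelling."""
    parts = []
    for part in path.split("/"):
        match = next((d for d in existing_dirs if same_setting(part, d)), part)
        parts.append(match)
    return "/".join(parts)


equiv = numeric_equiv(
    "--learning_rate=0.010/--batch_size=32.0/--optimizer=adam",
    ["--learning_rate=0.01", "--batch_size=32"],
)
```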
\n\n\n\n", "signature": "(og_path: str, root_directory: Optional[str] = '.') -> str:", "funcdef": "def"}, {"fullname": "src.slune.utils.dict_to_strings", "modulename": "src.slune.utils", "qualname": "dict_to_strings", "kind": "function", "doc": "\n
\n- equiv (str): Path with values changed to match existing directories if values are numerically equivalent, with root directory at beginning.
\n
Converts a dictionary into a list of strings in the form of '--key=value'.
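The conversion described above amounts to a one-line list comprehension. The function below is a sketch of that behaviour, reusing the documented name for readability; it is not copied from the package.

```python
def dict_to_strings(d):
    """Convert {'key': value} pairs into ['--key=value', ...] strings."""
    return [f"--{key}={value}" for key, value in d.items()]


flags = dict_to_strings({"learning_rate": 0.01, "batch_size": 32})
```

The strings come out in the dictionary's insertion order, which matters because the csv saver builds its directory path from the order of the parameters.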
\n\n\n\n", "signature": "(d: dict) -> List[str]:", "funcdef": "def"}, {"fullname": "src.slune.utils.find_csv_files", "modulename": "src.slune.utils", "qualname": "find_csv_files", "kind": "function", "doc": "\n
\n- out (list of str): List of strings in the form of '--key=value'.
\n
Recursively finds all csv files in all subdirectories of the root directory and returns their paths.
\n\n\n\n", "signature": "(root_directory: Optional[str] = '.') -> List[str]:", "funcdef": "def"}, {"fullname": "src.slune.utils.get_all_paths", "modulename": "src.slune.utils", "qualname": "get_all_paths", "kind": "function", "doc": "\n
\n- csv_files (list of str): List of strings containing the paths to all csv files found.
\n
Find all possible paths of csv files that have directory matching one of each of all the parameters given.
\n\nFinds all paths of csv files in all subdirectories of the root directory that have a directory in their path matching one of each of all the parameters given.
\n\n\n\n", "signature": "(dirs: List[str], root_directory: Optional[str] = '.') -> List[str]:", "funcdef": "def"}]; + /** pdoc search index */const docs = [{"fullname": "src", "modulename": "src", "kind": "module", "doc": "\n\n\n
\n- matches (list of str): List of strings containing the paths to all csv files found.
\n
\n\n
\n\nA super simple Python package for performing hyperparameter tuning (or, more generally, launching jobs and saving results) on a cluster using SLURM. It takes advantage of the fact that lots of jobs (including hyperparameter tuning) are embarrassingly parallel! With slune you can divide your compute into lots of separately scheduled jobs, meaning that each small job can get running on your cluster more quickly, speeding up your workflow! Often significantly!
\n\nSlune is super-easy to use! We have helper functions which can execute everything you need done for you, letting you speed up your work without wasting time.
\n\nSlune is barebones by design. This means that you can easily write code to integrate with slune if you want to do something a bit different! You can also work out what each function is doing pretty easily.
\n\nSlune is flexible. In designing this package I've tried to make as few assumptions as possible, meaning that it can be used for lots of stuff outside hyperparameter tuning (or also within)! For example, you can get slune to give you paths for where to save things, submit lots of jobs in parallel for any sort of script and do grid search! And there's more to come!
\n\nThe docs are here.
\n\nLet's go through a quick example of how we can use slune ... first let's define a model that we want to train:
\n\n# Simple Regularized Linear Regression without using external libraries\n\n# Function to compute the mean of a list\ndef mean(values):\n    return sum(values) / float(len(values))\n\n# Function to compute the covariance between two lists\ndef covariance(x, mean_x, y, mean_y):\n    covar = 0.0\n    for i in range(len(x)):\n        covar += (x[i] - mean_x) * (y[i] - mean_y)\n    return covar\n\n# Function to compute the variance of a list\ndef variance(values, mean):\n    return sum((x - mean) ** 2 for x in values)\n\n# Function to compute coefficients for a simple regularized linear regression\ndef coefficients_regularized(x, y, alpha):\n    mean_x, mean_y = mean(x), mean(y)\n    var_x = variance(x, mean_x)\n    covar = covariance(x, mean_x, y, mean_y)\n    b1 = (covar + alpha * var_x) / (var_x + alpha)\n    b0 = mean_y - b1 * mean_x\n    return b0, b1\n\n# Function to make predictions with a simple regularized linear regression model\ndef linear_regression_regularized(train_X, train_y, test_X, alpha):\n    b0, b1 = coefficients_regularized(train_X, train_y, alpha)\n    predictions = [b0 + b1 * x for x in test_X]\n    return predictions\n\n# ------------------\n# The above is code for a simple regularized linear regression model that we want to train.\n# Now let's fit the model and use slune to save how well our model performs!\n# ------------------\n\nif __name__ == "__main__":\n    # First let's load in the value for the regularization parameter alpha that has been passed to this script from the command line. We will use the slune helper function lsargs to do this. \n    # lsargs returns a tuple of the python path and a list of arguments passed to the script. We can then use this to get the alpha value.\n    from slune import lsargs\n    python_path, args = lsargs()\n    alpha = float(args[0])\n\n    # Mock training dataset, function is y = 1 + 1 * x\n    X = [1, 2, 3, 4, 5]\n    y = [2, 3, 4, 5, 6]\n\n    # Mock test dataset\n    test_X = [6, 7, 8]\n    test_y = [7, 8, 9]\n    test_predictions = linear_regression_regularized(X, y, test_X, alpha)\n\n    # Now let's load in a function that we can use to get a saver object that uses the default method of logging. The saving will be coordinated by a csv saver object which saves and reads results from csv files stored in a hierarchy of directories.\n    from slune import get_csv_saver\n    csv_saver = get_csv_saver(params = args)\n\n    # Let's now calculate the mean squared error of our predictions and log it!\n    # (mean needs a list, not a generator, because it calls len on its argument)\n    mse = mean([(test_y[i] - test_predictions[i])**2 for i in range(len(test_y))])\n    csv_saver.log({'mse': mse})\n\n    # Let's now save our logged results!\n    csv_saver.save_collated()\n
\nNow let's write some code that will submit some jobs to train our model using different hyperparameters!!
\n\n# Let's now load in a class that will coordinate our search! We're going to do a grid search.\n# SearcherGrid is the class we can use to coordinate a grid search. We pass it a dictionary of hyperparameters and the values we want to try for each hyperparameter. We also pass it the number of runs we want to do for each combination of hyperparameters.\nfrom slune.searchers import SearcherGrid\ngrid_searcher = SearcherGrid({'alpha' : [0.25, 0.5, 0.75]}, runs = 1)\n\n# Let's now import a function which will submit a job for our model. The script_path specifies the path to the script that contains the model we want to train. The template_path specifies the path to the template script that we want to specify the job with, and cargs is a list of constant arguments we want to pass to the script for each tuning. \n# We set saver to None because we don't want to skip jobs that have already been run before.\nfrom slune import sbatchit\nscript_path = 'model.py'\ntemplate_path = 'template.sh'\nsbatchit(script_path, template_path, grid_searcher, cargs=[], saver=None)\n
\nNow we've submitted our jobs we will wait for them to finish \ud83d\udd5b\ud83d\udd50\ud83d\udd51\ud83d\udd52\ud83d\udd53\ud83d\udd54\ud83d\udd55\ud83d\udd56\ud83d\udd57\ud83d\udd58\ud83d\udd59\ud83d\udd5a\ud83d\udd5b, now that they are finished we can read the results!
\n\nfrom slune import get_csv_saver\ncsv_saver = get_csv_saver(params = None)\nparams, value = csv_saver.read(params = [], metric_name = 'mse', select_by ='min')\nprint(f'Best hyperparameters: {params}')\nprint(f'Their MSE: {value}')\n
\nAmazing! \ud83e\udd73 We have successfully used slune to train our model. I hope this gives you a good idea of how you can use slune and how easy it is to use!
\n\nPlease check out the examples folder for notebooks detailing in more depth some potential ways you can use slune and of course please check out the docs!
\n\nStill in early stages! First thing on the horizon is better integration with SLURM:
\n\nHowever, I am trying to keep this package as bloatless as possible to make it easy for you to tweak and configure to your individual needs. It's written in a simple and compartmentalized manner for this reason. You can of course use the helper functions and let slune handle everything under the hood, but you can also very quickly and easily write your own classes to work with other savers, loggers and searchers to do as you please.
\n\nTo install latest version use:
\n\npip install slune-lib\n
\nTo install latest dev version use (CURRENTLY RECOMMENDED):
\n\n# With https\npip install "git+https://github.com/h-0-0/slune.git#egg=slune-lib"\n
\nHere we will outline the different kinds of classes that are used in slune and how they interact with each other. There are 3 types:
\n\nThe base module is where the base classes for each of these types are defined. The base classes are:
\n\nTo create a new searcher, logger or saver, you must inherit from the appropriate base class and implement the required methods. The required methods will have the '@abc.abstractmethod' decorator above them and will throw errors if they are not implemented. The compulsory methods allow for well-defined interactions between the different classes and should allow for any combination of searcher, logger and saver to be used together.
\n\nPlease read the docs for the base classes to see what methods are required to be implemented and how they should be implemented.
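To make the subclassing rule concrete, here is a minimal sketch of how an abstract base class enforces its required methods. The names `BaseSearcherSketch`, `SearcherFixed` and `SearcherIncomplete` are hypothetical and only mirror the method names listed in the docs; this is not slune's own code.

```python
import abc


class BaseSearcherSketch(abc.ABC):
    """Hypothetical stand-in for a searcher base class (not slune's code)."""

    @abc.abstractmethod
    def next_tune(self, *args, **kwargs):
        """Return the next configuration to try."""

    @abc.abstractmethod
    def check_existing_runs(self, *args, **kwargs):
        """Tell the searcher how to detect runs already in storage."""


class SearcherFixed(BaseSearcherSketch):
    """Toy searcher that hands out a fixed list of configurations in order."""

    def __init__(self, configs):
        self.configs = list(configs)
        self.index = 0
        self.exists = None

    def next_tune(self):
        if self.index >= len(self.configs):
            raise IndexError("Reached the end of the configurations")
        config = self.configs[self.index]
        self.index += 1
        return config

    def check_existing_runs(self, exists_fn):
        # Keep a reference to a callable that reports how many runs exist.
        self.exists = exists_fn


class SearcherIncomplete(BaseSearcherSketch):
    """Forgets check_existing_runs, so Python refuses to instantiate it."""

    def next_tune(self):
        return {}
```

Instantiating `SearcherIncomplete()` raises a `TypeError`, which is the "will throw errors if they are not implemented" behaviour described above.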
\n\nIf you would like to contribute to slune please first familiarize yourself with the package by taking a look at the docs. In particular please read about the class design, the base classes and take a look at the code for the helper functions in the slune module.
\n\nTo contribute to the package please either submit a pull request for an open issue or open a new issue. If you are unsure about whether to open a new issue or in general have any problems please open a discussion in the discussions tab.
\n\nChecklist for contributing:
\n\nBase class for all Searchers.
\n\nThis must be subclassed to create different Searcher classes.\nPlease name your searcher class Searcher
Initialises the searcher.
\n", "signature": "(*args, **kwargs)"}, {"fullname": "src.slune.base.BaseSearcher.next_tune", "modulename": "src.slune.base", "qualname": "BaseSearcher.next_tune", "kind": "function", "doc": "Returns the next configuration to try.
\n", "signature": "(self, *args, **kwargs):", "funcdef": "def"}, {"fullname": "src.slune.base.BaseSearcher.check_existing_runs", "modulename": "src.slune.base", "qualname": "BaseSearcher.check_existing_runs", "kind": "function", "doc": "Used to tell searcher to check if there are existing runs in storage.
\n\nIf there are existing runs, the searcher should skip them \nbased on the number of runs we would like for each job.\nThis may require a 'runs' attribute to be set in the searcher.\nIt will probably also require access to a Saver object,\nso we can use its saving protocol to check if there are existing runs.\nIn this case it is advised that this function takes a Saver object as an argument,\nand that the searcher is initialized with a 'runs' attribute.
\n", "signature": "(self, *args, **kwargs):", "funcdef": "def"}, {"fullname": "src.slune.base.BaseLogger", "modulename": "src.slune.base", "qualname": "BaseLogger", "kind": "class", "doc": "Base class for all Loggers.
\n\nThis must be subclassed to implement different Logger classes.\nPlease name your logger class Logger
Initialises the logger.
\n", "signature": "(*args, **kwargs)"}, {"fullname": "src.slune.base.BaseLogger.log", "modulename": "src.slune.base", "qualname": "BaseLogger.log", "kind": "function", "doc": "Logs the metric/s for the current hyperparameter configuration.
\n\nShould store metrics in some way so we can later save it using a Saver.
\n", "signature": "(self, *args, **kwargs):", "funcdef": "def"}, {"fullname": "src.slune.base.BaseLogger.read_log", "modulename": "src.slune.base", "qualname": "BaseLogger.read_log", "kind": "function", "doc": "Returns value of a metric from the log based on a selection criteria.
\n", "signature": "(self, *args, **kwargs):", "funcdef": "def"}, {"fullname": "src.slune.base.BaseSaver", "modulename": "src.slune.base", "qualname": "BaseSaver", "kind": "class", "doc": "Base class for all savers.
\n\nThis must be subclassed to implement different Saver classes.\nPlease name your saver class Saver
Initialises the saver.
\n\nAssigns the logger instance to self.logger and makes its methods accessible through self.log and self.read_log.
\n\nSaves the current results in logger to storage.
\n", "signature": "(self, *args, **kwargs):", "funcdef": "def"}, {"fullname": "src.slune.base.BaseSaver.read", "modulename": "src.slune.base", "qualname": "BaseSaver.read", "kind": "function", "doc": "Reads results from storage.
\n", "signature": "(self, *args, **kwargs):", "funcdef": "def"}, {"fullname": "src.slune.base.BaseSaver.exists", "modulename": "src.slune.base", "qualname": "BaseSaver.exists", "kind": "function", "doc": "Checks if results already exist in storage.
\n\nShould return integer indicating the number of runs that exist in storage for the given parameters.
\n", "signature": "(self, *args, **kwargs):", "funcdef": "def"}, {"fullname": "src.slune.loggers", "modulename": "src.slune.loggers", "kind": "module", "doc": "\n"}, {"fullname": "src.slune.loggers.default", "modulename": "src.slune.loggers.default", "kind": "module", "doc": "\n"}, {"fullname": "src.slune.loggers.default.LoggerDefault", "modulename": "src.slune.loggers.default", "qualname": "LoggerDefault", "kind": "class", "doc": "Logs metric/s in a data frame.
\n\nStores the metric/s in a data frame that we can later save in storage.\nLogs by creating data frame out of the metrics and then appending it to the current results data frame.
\n\nInitialises the logger.
\n", "signature": "(*args, **kwargs)"}, {"fullname": "src.slune.loggers.default.LoggerDefault.results", "modulename": "src.slune.loggers.default", "qualname": "LoggerDefault.results", "kind": "variable", "doc": "\n"}, {"fullname": "src.slune.loggers.default.LoggerDefault.log", "modulename": "src.slune.loggers.default", "qualname": "LoggerDefault.log", "kind": "function", "doc": "Logs the metric/s given.
\n\nStores them in a data frame that we can later save in storage.\nAll metrics provided will be saved as a row in the results data frame,\nthe first column is always the time stamp at which log is called.
\n\nReads log and returns value according to select_by.
\n\nReads the values for given metric for given log and chooses metric value to return based on select_by.
\n\n\n\n\n\n
\n- value (float): Minimum or maximum value of the metric.
\n
TODO: \n - Add more options for select_by.\n - Should be able to return other types than float?
\n", "signature": "(\tself,\tdata_frame: pandas.core.frame.DataFrame,\tmetric_name: str,\tselect_by: str = 'max') -> float:", "funcdef": "def"}, {"fullname": "src.slune.savers", "modulename": "src.slune.savers", "kind": "module", "doc": "\n"}, {"fullname": "src.slune.savers.csv", "modulename": "src.slune.savers.csv", "kind": "module", "doc": "\n"}, {"fullname": "src.slune.savers.csv.SaverCsv", "modulename": "src.slune.savers.csv", "qualname": "SaverCsv", "kind": "class", "doc": "Saves the results of each run in a CSV file in hierarchy of directories.
\n\nEach directory is named after a parameter - value pair in the form \"--parameter_name=value\".\nThe paths to csv files then define the configuration under which the results were obtained,\nfor example if we only have one parameter \"learning_rate\" with value 0.01 used to obtain the results,\nto save those results we would create a directory named \"--learning_rate=0.01\" and save the results in a csv file in that directory.
\n\nIf we have multiple parameters, for example \"learning_rate\" with value 0.01 and \"batch_size\" with value 32,\nwe would create a directory named \"--learning_rate=0.01\" with a subdirectory named \"--batch_size=32\",\nand save the results in a csv file in that subdirectory.
\n\nWe use this structure to then read the results from the csv files by searching for the directory that matches the parameters we want,\nand then reading the csv file in that directory.
\n\nThe order in which we create the directories is determined by the order in which the parameters are given,\nso if we are given [\"--learning_rate=0.01\", \"--batch_size=32\"] we would create the directories in the following order:\n\"--learning_rate=0.01/--batch_size=32\".
\n\nThe directory structure generated will also depend on existing directories in the root directory,\nif there are existing directories in the root directory that match some subset of the parameters given,\nwe will create the directory tree from the deepest matching directory.
\n\nFor example if we only have the following path in the root directory:\n\"--learning_rate=0.01/--batch_size=32\"\nand we are given the parameters [\"--learning_rate=0.01\", \"--batch_size=32\", \"--num_epochs=10\"],\nwe will create the path:\n\"--learning_rate=0.01/--batch_size=32/--num_epochs=10\".\non the other hand if we are given the parameters [\"--learning_rate=0.02\", \"--num_epochs=10\", \"--batch_size=32\"],\nwe will create the path:\n\"--learning_rate=0.02/--batch_size=32/--num_epochs=10\".
\n\nHandles parallel runs trying to create the same directories by waiting a random time (under 1 second) before creating the directory.\nShould work pretty well in practice, however, may occasionally fail depending on the number of jobs launched at the same time.
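As a rough illustration of the directory layout described above, the sketch below joins parameter strings into nested directories in the order given. `params_to_path` is a hypothetical helper, not part of slune, and it deliberately ignores the matching against existing directories that SaverCsv performs.

```python
import os
from typing import List


def params_to_path(params: List[str], root_dir: str = "slune_results") -> str:
    """Join '--name=value' strings into nested directories, in the order given.

    Assumption: no directories exist yet, so no reordering is needed.
    """
    cleaned = [p.strip() for p in params]
    return os.path.join(root_dir, *cleaned)


path = params_to_path(["--learning_rate=0.01", "--batch_size=32"])
print(path)
```

On a POSIX system the printed path ends in `--learning_rate=0.01/--batch_size=32`, matching the ordering rule stated in the docstring.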
\n\nInitialises the csv saver.
\n\nStrips the parameter values.
\n\nStrips the parameter values from the list of parameters given,\nie. [\"--parameter_name=parameter_value\", ...] -> [\"--parameter_name=\", ...]
\n\nAlso gets rid of blank spaces.
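A minimal sketch of the stripping step, assuming each parameter is a string of the form `--name=value`. `strip_values` is a hypothetical name used for illustration and may differ from how SaverCsv does it internally.

```python
from typing import List


def strip_values(params: List[str]) -> List[str]:
    """Turn ['--name=value', ...] into ['--name=', ...], dropping blank spaces."""
    stripped = []
    for param in params:
        # Keep only the text before the first '=' and remove any spaces in it.
        name = param.split("=", 1)[0].replace(" ", "")
        stripped.append(name + "=")
    return stripped
```

Splitting on the first `=` only means values that themselves contain `=` do not affect the stripped name.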
\n\n\n\n", "signature": "(self, params: List[str]) -> List[str]:", "funcdef": "def"}, {"fullname": "src.slune.savers.csv.SaverCsv.get_match", "modulename": "src.slune.savers.csv", "qualname": "SaverCsv.get_match", "kind": "function", "doc": "\n
\n- stripped_params (list of str): List of strings containing the parameters used, in form [\"--parameter_name=\", ...].
\n
Searches the root directory for a path that matches the parameters given.
\n\nIf only partial matches are found, returns the deepest matching directory with the missing parameters appended.\nBy deepest we mean the directory with the most parameters matching.\nIf no matches are found creates a path using the parameters.\nCreates path using parameters in the order they are given, \nie. [\"--learning_rate=0.01\", \"--batch_size=32\"] -> \"--learning_rate=0.01/--batch_size=32\".
\n\nIf we find a partial match, we add the missing parameters to the end of the path,\nie. if we have the path \"--learning_rate=0.01\" in the root \nand are given the parameters [\"--learning_rate=0.01\", \"--batch_size=32\"],\nwe will create the path \"--learning_rate=0.01/--batch_size=32\".
\n\n\n\n", "signature": "(self, params: List[str]) -> str:", "funcdef": "def"}, {"fullname": "src.slune.savers.csv.SaverCsv.get_path", "modulename": "src.slune.savers.csv", "qualname": "SaverCsv.get_path", "kind": "function", "doc": "\n
\n- match (str): Path to the directory that matches the parameters given.
\n
Creates a path using the parameters.
\n\nDoes this by first checking for existing paths in the root directory that match the parameters given.
\n\nCheck get_match for how we create the path, \nonce we have the path we check if there is already a csv file with results in that path,\nif there is we increment the number of the results file name that we will use.
\n\nFor example if we get back the path \"--learning_rate=0.01/--batch_size=32\",\nand there exists a csv file named \"results_0.csv\" in the final directory,\nwe will name our csv file \"results_1.csv\".
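The file-numbering rule above can be sketched as follows. `next_results_file` is a hypothetical helper; it assumes result files are named `results_<n>.csv` as in the example and picks one past the highest existing index.

```python
import os
import re
import tempfile


def next_results_file(directory: str) -> str:
    """Return the path of the next unused 'results_<n>.csv' in directory."""
    pattern = re.compile(r"^results_(\d+)\.csv$")
    taken = []
    if os.path.isdir(directory):
        for name in os.listdir(directory):
            match = pattern.match(name)
            if match:
                taken.append(int(match.group(1)))
    next_index = max(taken) + 1 if taken else 0
    return os.path.join(directory, f"results_{next_index}.csv")


# Usage: an empty directory yields results_0.csv, then results_1.csv.
with tempfile.TemporaryDirectory() as tmp:
    first = next_results_file(tmp)
    open(first, "w").close()
    second = next_results_file(tmp)
    print(os.path.basename(first), os.path.basename(second))
```

Note that this sketch is not safe against two jobs choosing the same name at the same moment; the class docstring mentions a random wait as the mitigation used for directory creation.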
\n\n\n\n", "signature": "(self, params: List[str]) -> str:", "funcdef": "def"}, {"fullname": "src.slune.savers.csv.SaverCsv.save_collated_from_results", "modulename": "src.slune.savers.csv", "qualname": "SaverCsv.save_collated_from_results", "kind": "function", "doc": "\n
\n- csv_file_path (str): Path to the csv file where we will store the results for the current run.
\n
Saves results to csv file.
\n\nIf the csv file already exists, \nwe append the collated results from the logger to the end of the csv file.\nIf the csv file does not exist,\nwe create it and save the results to it.
\n\nTODO: \n - Could be making too many assumptions about the format in which we get the results from the logger,\n should be able to work with any logger.\n We should only be assuming that we are saving results to a csv file.
\n", "signature": "(self, results: pandas.core.frame.DataFrame):", "funcdef": "def"}, {"fullname": "src.slune.savers.csv.SaverCsv.save_collated", "modulename": "src.slune.savers.csv", "qualname": "SaverCsv.save_collated", "kind": "function", "doc": "Saves results to csv file.
\n", "signature": "(self):", "funcdef": "def"}, {"fullname": "src.slune.savers.csv.SaverCsv.read", "modulename": "src.slune.savers.csv", "qualname": "SaverCsv.read", "kind": "function", "doc": "Finds the min/max value of a metric from all csv files in the root directory that match the parameters given.
\n\n\n\n", "signature": "(\tself,\tparams: List[str],\tmetric_name: str,\tselect_by: str = 'max',\tavg: bool = True) -> (typing.List[str], <class 'float'>):", "funcdef": "def"}, {"fullname": "src.slune.savers.csv.SaverCsv.exists", "modulename": "src.slune.savers.csv", "qualname": "SaverCsv.exists", "kind": "function", "doc": "\n
\n- best_params (list of str): Contains the arguments used to get the 'best' value of the metric (determined by select_by).
\n- best_value (float): Best value of the metric (determined by select_by).
\n
Checks if results already exist in storage.
\n\n\n\n", "signature": "(self, params: List[str]) -> int:", "funcdef": "def"}, {"fullname": "src.slune.savers.csv.SaverCsv.get_current_path", "modulename": "src.slune.savers.csv", "qualname": "SaverCsv.get_current_path", "kind": "function", "doc": "\n
\n- num_runs (int): Number of runs that exist in storage for the given parameters.
\n
Getter function for the current_path attribute.
\n\n\n\n", "signature": "(self) -> str:", "funcdef": "def"}, {"fullname": "src.slune.searchers", "modulename": "src.slune.searchers", "kind": "module", "doc": "\n"}, {"fullname": "src.slune.searchers.grid", "modulename": "src.slune.searchers.grid", "kind": "module", "doc": "\n"}, {"fullname": "src.slune.searchers.grid.SearcherGrid", "modulename": "src.slune.searchers.grid", "qualname": "SearcherGrid", "kind": "class", "doc": "\n
\n- current_path (str): Path to the csv file where we will store the results for the current run.
\n
Searcher for grid search.
\n\nGiven dictionary of parameters and values to try, creates grid of all possible configurations,\nand returns them one by one for each call to next_tune.
\n\nInitialises the searcher.
\n\nCreates search grid.
\n\nGenerates all possible combinations of values for each argument in the given dictionary using recursion.
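For illustration, a recursive grid builder could look like the sketch below. `all_combinations` is a hypothetical function written for this example, assuming each dictionary value is a list of candidate values; SearcherGrid's actual implementation may differ.

```python
from typing import Any, Dict, List


def all_combinations(param_dict: Dict[str, List[Any]]) -> List[Dict[str, Any]]:
    """Recursively build every combination of the values in param_dict."""
    if not param_dict:
        # Base case: one empty configuration to extend.
        return [{}]
    (name, values), *rest = param_dict.items()
    tail = all_combinations(dict(rest))
    return [{name: value, **combo} for value in values for combo in tail]


grid = all_combinations({"alpha": [0.25, 0.5], "batch_size": [16, 32]})
print(len(grid))
```

The number of configurations is the product of the list lengths, so two parameters with two values each give four configurations.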
\n\n\n\n", "signature": "(self, param_dict: dict) -> List:", "funcdef": "def"}, {"fullname": "src.slune.searchers.grid.SearcherGrid.check_existing_runs", "modulename": "src.slune.searchers.grid", "qualname": "SearcherGrid.check_existing_runs", "kind": "function", "doc": "\n
\n- all_combinations (list): A list of dictionaries, each containing one combination of argument values.
\n
We save a pointer to the saver's exists method to check if there are existing runs.
\n\n\n\n\nn < runs -> run the remaining runs\n n >= runs -> skip all runs
\n
Skips runs if they are in storage already.
\n\nWill check if there are existing runs for the current configuration,\nif there are existing runs we tally them up \nand skip configs or runs of a config based on the number of runs we want for each config.
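The skipping rule (n < runs: run the remaining runs; n >= runs: skip all runs) can be sketched like this. `plan_runs` and the `exists` callable are hypothetical stand-ins for the searcher and the saver's `exists` method.

```python
from typing import Any, Callable, Dict, List, Tuple

Config = Dict[str, Any]


def plan_runs(
    configs: List[Config],
    wanted_runs: int,
    exists: Callable[[Config], int],
) -> List[Tuple[Config, int]]:
    """Return (config, runs still needed) pairs, skipping completed configs."""
    plan = []
    for config in configs:
        n = exists(config)  # number of runs already in storage
        remaining = max(wanted_runs - n, 0)
        if remaining > 0:
            plan.append((config, remaining))
    return plan


# Usage with an in-memory stand-in for storage.
stored = {0.25: 2, 0.5: 1}
configs = [{"alpha": 0.25}, {"alpha": 0.5}, {"alpha": 0.75}]
print(plan_runs(configs, 2, lambda c: stored.get(c["alpha"], 0)))
```

Here the configuration with two stored runs is skipped entirely, while the others are scheduled only for the runs they are missing.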
\n\n\n\n", "signature": "(self, grid_index: int) -> Tuple[int, int]:", "funcdef": "def"}, {"fullname": "src.slune.searchers.grid.SearcherGrid.next_tune", "modulename": "src.slune.searchers.grid", "qualname": "SearcherGrid.next_tune", "kind": "function", "doc": "\n
\n- grid_index (int): Index of the next configuration in the grid.
\n- run_index (int): Index of the next run for the current configuration.
\n
Returns the next configuration to try.
\n\nWill skip existing runs if check_existing_runs has been called.\nFor more information on how this works check the method descriptions for check_existing_runs and skip_existing_runs.\nWill raise an error if we have reached the end of the grid.\nTo iterate through all configurations, use a for loop like so: \n for config in searcher: ...
\n\n\n\n", "signature": "(self) -> dict:", "funcdef": "def"}, {"fullname": "src.slune.slune", "modulename": "src.slune.slune", "kind": "module", "doc": "\n"}, {"fullname": "src.slune.slune.submit_job", "modulename": "src.slune.slune", "qualname": "submit_job", "kind": "function", "doc": "\n
\n- next_config (dict): The next configuration to try.
\n
Submits a job using specified Bash script
\n\nSubmits jobs based on arguments given by searcher.
\n\nFor each job runs the script stored at script_path with selected parameter values given by searcher\nand the arguments given by cargs.
\n\nUses the sbatch script with path sbatch_path to submit each job to the cluster.
\n\nIf given a Saver object, uses it to check if there are existing runs for each job and skips them,\nbased on the number of runs we would like for each job (which is stored in the saver).
\n\nReturns the script name and a list of the arguments passed to the script.
\n", "signature": "() -> (<class 'str'>, typing.List[str]):", "funcdef": "def"}, {"fullname": "src.slune.slune.garg", "modulename": "src.slune.slune", "qualname": "garg", "kind": "function", "doc": "Finds the argument/s with name arg_names in the list of arguments args_ls and returns its value/s.
\n\n\n\n", "signature": "(\targs: List[str],\targ_names: Union[str, List[str]]) -> Union[str, List[str]]:", "funcdef": "def"}, {"fullname": "src.slune.slune.get_csv_saver", "modulename": "src.slune.slune", "qualname": "get_csv_saver", "kind": "function", "doc": "\n
\n- arg_value (str or list of str): String or list of strings containing the values of the arguments found.
\n
Returns a SaverCsv object with the given parameters and root directory.
\n\n\n\n", "signature": "(\tparams: Optional[dict] = None,\troot_dir: Optional[str] = 'slune_results') -> slune.base.BaseSaver:", "funcdef": "def"}, {"fullname": "src.slune.utils", "modulename": "src.slune.utils", "kind": "module", "doc": "\n"}, {"fullname": "src.slune.utils.find_directory_path", "modulename": "src.slune.utils", "qualname": "find_directory_path", "kind": "function", "doc": "\n
\n- SaverCsv (Saver): Saver object with the given parameters and root directory.\n Initialized with a LoggerDefault object as its logger.
\n
Searches the root directory for a path of directories that matches the strings given in any order.\nIf only a partial match is found, returns the deepest matching path.\nIf no matches are found returns root_directory.\nReturns a stripped matching path of directories, ie. where we convert '--string=value' to '--string='.
\n\n\n\n", "signature": "(\tstrings: List[str],\troot_directory: Optional[str] = '.') -> Tuple[int, str]:", "funcdef": "def"}, {"fullname": "src.slune.utils.get_numeric_equiv", "modulename": "src.slune.utils", "qualname": "get_numeric_equiv", "kind": "function", "doc": "\n
\n- max_depth (int): Depth of the deepest matching path.
\n- max_path (string): Path of the deepest matching path.
\n
Replaces directories in path with existing directories with the same numerical value.
\n\n\n\n", "signature": "(og_path: str, root_directory: Optional[str] = '.') -> str:", "funcdef": "def"}, {"fullname": "src.slune.utils.dict_to_strings", "modulename": "src.slune.utils", "qualname": "dict_to_strings", "kind": "function", "doc": "\n
\n- equiv (str): Path with values changed to match existing directories if values are numerically equivalent, with root directory at beginning.
\n
Converts a dictionary into a list of strings in the form of '--key=value'.
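A minimal sketch of this conversion, assuming values are formatted with `str()`. `dict_to_strings_sketch` is a hypothetical name to avoid implying it is slune's own function.

```python
from typing import Any, Dict, List


def dict_to_strings_sketch(d: Dict[str, Any]) -> List[str]:
    """Convert {'key': value} into ['--key=value'], keeping insertion order."""
    return [f"--{key}={value}" for key, value in d.items()]
```

For example, `{"alpha": 0.25}` becomes `["--alpha=0.25"]`, which is the form used for directory names and script arguments elsewhere in these docs.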
\n\n\n\n", "signature": "(d: dict) -> List[str]:", "funcdef": "def"}, {"fullname": "src.slune.utils.find_csv_files", "modulename": "src.slune.utils", "qualname": "find_csv_files", "kind": "function", "doc": "\n
\n- out (list of str): List of strings in the form of '--key=value'.
\n
Recursively finds all csv files in all subdirectories of the root directory and returns their paths.
\n\n\n\n", "signature": "(root_directory: Optional[str] = '.') -> List[str]:", "funcdef": "def"}, {"fullname": "src.slune.utils.get_all_paths", "modulename": "src.slune.utils", "qualname": "get_all_paths", "kind": "function", "doc": "\n
\n- csv_files (list of str): List of strings containing the paths to all csv files found.
\n
Find all possible paths of csv files that have directory matching one of each of all the parameters given.
\n\nFinds all paths of csv files in all subdirectories of the root directory that have a directory in their path matching one of each of all the parameters given.
\n\n\n\n", "signature": "(dirs: List[str], root_directory: Optional[str] = '.') -> List[str]:", "funcdef": "def"}]; // mirrored in build-search-index.js (part 1) // Also split on html tags. this is a cheap heuristic, but good enough. diff --git a/docs/.html/src.html b/docs/.html/src.html index 81d7a74..9fd9666 100644 --- a/docs/.html/src.html +++ b/docs/.html/src.html @@ -17,6 +17,7 @@\n
\n- matches (list of str): List of strings containing the paths to all csv files found.
\n
Slune is flexible. In designing this package I've tried to make as few assumptions as possible meaning that it can be used for lots of stuff outside hyperparameter tuning! (or also within!) For example, you can get slune to give you paths for where to save things, submit lots of jobs in parallel for any sort of script and do grid search! and there's more to come!
+The docs are here.
+Let's go through a quick example of how we can use slune ... first let's define a model that we want to train:
@@ -128,16 +132,16 @@Now we've submitted our jobs we will wait for them to finish 🕛🕐🕑🕒🕓🕔🕕🕖🕗🕘🕙🕚🕛, now that they are finished we can read the results!
from slune import get_csv_slog
-csv_slog = get_csv_slog(params = None)
-params, value = csv_slog.read(params = [], metric_name = 'mse', select_by ='min')
+from slune import get_csv_saver
+csv_saver = get_csv_saver(params = None)
+params, value = csv_saver.read(params = [], metric_name = 'mse', select_by ='min')
print(f'Best hyperparameters: {params}')
print(f'Their MSE: {value}')
Amazing! 🥳 We have successfully used slune to train our model. I hope this gives you a good flavour of how you can use slune and how easy it is to use!
+Amazing! 🥳 We have successfully used slune to train our model. I hope this gives you a good idea of how you can use slune and how easy it is to use!
-Please check out the examples folder for notebooks detailing in more depth some potential ways you can use slune. The docs are not yet up and running 😢 but they are coming soon!
+Please check out the examples folder for notebooks detailing in more depth some potential ways you can use slune and of course please check out the docs!
Still in early stages! First thing on the horizon is better integration with SLURM:
+# With https
-pip install "git+https://github.com/h-aze/slune.git#egg=slune-lib"
+pip install "git+https://github.com/h-0-0/slune.git#egg=slune-lib"
To create a new searcher, logger or saver, you must inherit from the appropriate base class and implement the required methods. The required methods will have the '@abc.abstractmethod' decorator above them and will throw errors if they are not implemented. The compulsory methods allow for well-defined interactions between the different classes and should allow for any combination of searcher, logger and saver to be used together.
Please read the docs for the base classes to see what methods are required to be implemented and how they should be implemented.
+ +If you would like to contribute to slune please first familiarize yourself with the package by taking a look at the docs. In particular please read about the class design, the base classes and take a look at the code for the helper functions in the slune module.
+ +To contribute to the package please either submit a pull request for an open issue or open a new issue. If you are unsure about whether to open a new issue or in general have any problems please open a discussion in the discussions tab.
+ +Checklist for contributing:
+ +1""" 2.. include:: ../README.md 3.. include:: ../CLASSDESIGN.md -4""" +4.. include:: ../CONTRIBUTING.md +5"""
95def get_csv_slog(params: Optional[dict]= None, root_dir: Optional[str]='slune_results') -> BaseSaver: - 96 """ Returns a SaverCsv object with the given parameters and root directory. - 97 - 98 Args: - 99 - params (dict, optional): Dictionary of parameters to be passed to the SaverCsv object, default is None. -100 -101 - root_dir (str, optional): Path to the root directory to be used by the SaverCsv object, default is 'slune_results'. -102 -103 Returns: -104 - SaverCsv (Saver): Saver object with the given parameters and root directory. -105 Initialized with a LoggerDefault object as its logger. -106 -107 """ -108 -109 return SaverCsv(LoggerDefault(), params = params, root_dir=root_dir) + +diff --git a/docs/.html/src/slune/utils.html b/docs/.html/src/slune/utils.html index 5a28ba6..ab3cb90 100644 --- a/docs/.html/src/slune/utils.html +++ b/docs/.html/src/slune/utils.html @@ -22,6 +22,7 @@95def get_csv_saver(params: Optional[dict]= None, root_dir: Optional[str]='slune_results') -> BaseSaver: + 96 """ Returns a SaverCsv object with the given parameters and root directory. + 97 + 98 Args: + 99 - params (dict, optional): Dictionary of parameters to be passed to the SaverCsv object, default is None. +100 +101 - root_dir (str, optional): Path to the root directory to be used by the SaverCsv object, default is 'slune_results'. +102 +103 Returns: +104 - SaverCsv (Saver): Saver object with the given parameters and root directory. +105 Initialized with a LoggerDefault object as its logger. +106 +107 """ +108 +109 return SaverCsv(LoggerDefault(), params = params, root_dir=root_dir)src.slune +