In this lab you will learn how the Automated Machine Learning capability in Azure Machine Learning (AML) can support life cycle management of manufactured vehicles and help create better vehicle maintenance plans. To accomplish this, you will train a regression model that predicts the number of days until battery failure, using Automated Machine Learning in AML studio.
-
In the Azure portal, open the available machine learning workspace.
-
Select Launch now under the Try the new Azure Machine Learning studio message.
-
When you first launch the studio, you may need to set the directory and subscription. If so, a setup screen appears.
For the directory, select Udacity, and for the subscription, select Azure Sponsorship. For the machine learning workspace, you may see multiple options listed. Select any of them (it doesn't matter which) and then click Get started.
-
Select Automated ML in the left navigation bar.
-
Select New automated ML run to start creating a new experiment.
-
Select Create dataset and choose the From web files option from the drop-down.
-
Fill in the training data URL in the Web URL field: https://introtomlsampledata.blob.core.windows.net/data/battery-lifetime/training-formatted.csv, make sure the name is set to training-formatted-dataset, and select Next to load a preview of the parsed training data.
-
In the Settings and preview page, for the Column headers field, select All files have same headers. Scroll to the right to observe all of the columns in the data.
-
Select Next to check the schema and then confirm the dataset details by selecting Next and then Create on the confirmation page.
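The header and schema steps above can be pictured with a minimal sketch in plain Python. It is only an illustration of what "column headers" and "schema" mean for a CSV file, not what the studio runs internally. The sample rows and the column names Battery_ID and Battery_Age_Days are hypothetical; only Survival_In_Days comes from the lab.

```python
import csv
import io

# Hypothetical sample data. Only the Survival_In_Days column name is taken
# from the lab; the other columns and all values are made up for illustration.
SAMPLE_CSV = """Battery_ID,Battery_Age_Days,Survival_In_Days
b-001,120,1610
b-002,845,903
"""


def infer_type(values):
    """Return 'integer', 'decimal', or 'string' for a list of raw CSV strings."""
    for caster, name in ((int, "integer"), (float, "decimal")):
        try:
            for value in values:
                caster(value)
            return name
        except ValueError:
            continue
    return "string"


def preview_schema(csv_text):
    """Treat the first row as headers and guess a type for every column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    return {col: infer_type([row[col] for row in rows]) for col in reader.fieldnames}


schema = preview_schema(SAMPLE_CSV)
print(schema)
```

Checking that the target column is present and numeric before starting a run is a cheap way to catch a wrong header setting early.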
-
Now you should be able to select the newly created dataset for your experiment. Select the training-formatted-dataset dataset and select Next to move to the experiment run details page.
-
You will now configure the basic settings of the Automated ML run by providing the following values for the experiment name, target column, and training compute:
- Experiment name: automlregression
- Target column: select Survival_In_Days
- Training compute target: select qs-compute
-
Select Next and select Regression in the Task type and settings page.
-
Select View additional configuration settings to open the advanced settings section. Provide the following settings:
- Primary metric: Normalized root mean squared error
- Exit criterion > Metric score threshold: 0.09
- Validation > Validation type: k-fold cross validation
- Validation > Number of Cross Validations: 5
- Concurrency > Max concurrent iterations: 1
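To make the validation setting above concrete, here is a minimal sketch of how 5-fold cross validation partitions rows: each row is used for validation exactly once and for training in the other four folds. The helper name k_fold_indices and the 12-row example are assumptions for illustration; the way Automated ML actually splits the data (for example, whether it shuffles rows) is not shown here.

```python
def k_fold_indices(n_rows, k=5):
    """Split row indices 0..n_rows-1 into k (train, validation) index pairs.

    Illustrative only: rows are taken in order, without shuffling.
    """
    if not 2 <= k <= n_rows:
        raise ValueError("k must be between 2 and the number of rows")
    # Spread the remainder over the first folds so sizes differ by at most one.
    base, extra = divmod(n_rows, k)
    folds = []
    start = 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        validation = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_rows))
        folds.append((train, validation))
        start += size
    return folds


folds = k_fold_indices(n_rows=12, k=5)
for fold_number, (train, validation) in enumerate(folds, start=1):
    print(fold_number, "validation rows:", validation)
```

The primary metric reported for a model is then typically an aggregate over the five validation folds, which gives a steadier estimate than a single train/validation split.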
-
Select Save and then Finish to begin the automated machine learning process.
-
Wait until the Run status becomes Running in the Run Detail page.
-
The experiment will run for about 15 minutes. While it runs and once it completes, you should check the Models tab on the Run Detail page to observe the model performance for the primary metric for different runs.
-
In the models list, notice that the iteration with the best normalized root mean squared error score appears at the top. This metric measures the error between the predicted and actual values, so the model with the lowest normalized root mean squared error is the best model.
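As a rough illustration of the metric, the sketch below computes a normalized root mean squared error by dividing the RMSE by the range of the actual values. That normalization is an assumption about how the metric is defined, and the function name and the sample numbers are made up for this example.

```python
import math


def normalized_rmse(actual, predicted):
    """RMSE divided by the range (max - min) of the actual values.

    Assumption: normalization by the range of the actual values.
    """
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    value_range = max(actual) - min(actual)
    if value_range == 0:
        raise ValueError("actual values must not all be identical")
    mse = sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    return math.sqrt(mse) / value_range


# Made-up days-until-failure values, each prediction off by 10 days.
actual_days = [100, 200, 300, 400]
predicted_days = [110, 190, 310, 390]
score = normalized_rmse(actual_days, predicted_days)
print(round(score, 4))
```

Because the score is scaled by the range of the target, it can be compared against a fixed threshold such as the 0.09 exit criterion regardless of the units of the target column.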
-
Select Experiments on the left navigation pane and select the experiment automlregression to see the list of available runs.
-
Select the Include child runs option to examine model performance for the primary metric across different runs. By default, the left chart shows the normalized_median_absolute_error value for each run. Select the pen icon in the right corner of the normalized_median_absolute_error chart and configure it to display the normalized_root_mean_squared_error metric instead.
Congratulations! You have trained a regression model using automated machine learning in the visual interface. You can continue to experiment in the environment, or close the lab environment tab and return to the Udacity portal to continue with the lesson.