- 14 Jun 2024
- 6 Minutes to read
- Print
- DarkLight
Creating an Azure ML forecast scenario
- Updated on 14 Jun 2024
- 6 Minutes to read
- Print
- DarkLight
This topic describes how to create an Azure ML forecast scenario and all available options.
From the Data model home page, under the "Analytics" tile, click on "Azure ML". Here you can configure your Azure ML forecast scenario once you have met the Azure ML feature requirements and set up an Azure ML Data source connection. Once you create an Azure AutoML scenario, you can create a Procedure step to execute the scenarios under the Analytics Action group.
Creating a new Azure AutoML scenario
To create a new Azure AutoML forecast scenario, proceed as follows:
Go to the Analytics tile of the desired Data model in the Data model area and then click on the Azure ML tile
Click on the orange plus icon in the top left corner next to "Azure ML" to open the configuration panel
Enter the name of the Scenario in the "Name" field
Choose an existing Azure ML cloud Data source from the "Connection" dropdown list
Configure the following options under the "Data" section:
(Optional) SELECT. Click on the "SELECT" button to apply a selection. In this case, the forecast analyzes the time series only related to members included in the selection
Forecast horizon. Enter the number of time periods you want to forecast. For example, if you are doing a monthly forecast, you can enter "1" to forecast only one future month
The type of time period depends on the Time Entity in the Structure of the target Cube. The time period can only be Day or Month.
Observed data. Click on the "CONFIGURE LAYOUT" button to configure the Layout that contains the historical data set. Enter the Data Blocks that will be analyzed by the forecasting algorithms, which include the main observed data and the covariates
Target Cube. Choose the target Cube that will contain the result of the forecast scenario after it is run. The target Cube is usually the one that contains the main observed data
Except for the Time Entity, all other Entities in the Structure of the target Cube are used as time series identifiers. While all other Data Blocks in the Layout are used as covariates.
Observed time range. Click on the "FROM" and "TO" buttons to choose the start and end of the observed time range. The default values for the start and end of the time range are "First loaded period" and "Last loaded period" respectively, which cover the values of the observed data by starting from the oldest to the newest.
Validation set. Choose one of the following validation options, which determines the set of historical data that will be used as a test to validate the forecast accuracy:
Kfold cross validation. This type of validation uses the Kfold cross method, which divides the dataset into K equally sized subsets, or folds. The model is then trained and evaluated K times, each time using a different fold as the validation set and the remaining folds as the training set
Validation percentage. This type of validation uses the indicated percentage size of the historical dataset
Validation periods. This type of validation uses the indicated periods of the historical dataset
Select the preferred competition models that will be used for the training from the "Models for training competition" dropdown list. You can also click on "All" to select all competition models
By selecting "All", Azure ML will also include any custom models created directly in your Azure ML Workspace.
The more models you include, the more time is required to complete the training process.
Click on "SAVE" to save the forecast scenario.
Managing and running Azure ML forecast scenarios
You can perform different actions on one or more existing Azure ML forecast scenarios by selecting them and clicking on the different buttons that appear above the forecast scenario list.
The available actions are described below:
To edit one or more forecast scenario, select one or more forecasts and click the pencil icon above the table to open the scenario configuration panel. You can also click on a single scenario directly from the table to modify the desired settings explained in the paragraph above
When editing multiple forecast scenarios, only the name and connection can be edited together.
To delete one or more forecast scenarios, select one or more and click on the trash icon
To run a forecast scenario with the training phase, select the desired one and click on "RUN". This will execute the training process and first inference phase, which are done in the following macro steps:
Connects to the indicated Azure ML workspace and Blob storage
Serializes the time series of the Board Layout and creates the MLTable. The MLTable name will be in the following format: BoardScenarioName (i.e. BoardSalesForecast)
The MLTable is created in the following way: - An MLTable column is created for each Entity that is in the Structure of the target Cube - An MLTable column is created to contain the values of the target Cube - An MLTable column is created for each covariate Data Block
Uploads the historical data, configured in the Board Layout, in the automatically created MLTable Data asset found in the "Data" area of the Azure ML Workspace
Creates the Azure ML training experiment and automated ML job in the "Jobs" area of the Azure ML Workspace. The job name will be in the following format: BoardScenarioNameXP-AAAAMMDDHHMMSS (i.e. BoardSalesForecastXP-20230704100953)
Executes the training job by initiating the competition of the selected models based on the forecast scenario configuration
Creates an Azure ML Environment that contains the necessary dependencies
Creates an Azure ML Endpoint in the "Endpoints" area of your Azure ML Workspace where the winning model is deployed. The Endpoint name will be in the following format: board-ScenarioName (i.e. board-SalesForecast)
Executes the first inference phase on the created Azure ML endpoint
The first inference phase is executed based on the forecast horizon configured in the forecast scenario.
Retrieves the inference phase result (prediction) and storage in the target Cube
The training phase in Azure is the process that requires the most time to complete. The time required heavily depends on the size of your historical dataset and the selected competition models.
In the case where you perform a retraining phase, all related resources in Azure ML will be deleted and recreated based on the new forecast scenario configuration.
A popup window that shows the progress of this process appears when you click on the "RUN" button. You can click on the "X" button to hide the popup window.
However, the whole process explained above is still seamlessly executed in the background and you can check its progress in the Running tasks area.
Once the process is finished, the user will receive a notification near the Top Menu.
You can also check each step of the training process by selecting the forecast scenario and clicking on the "Training logs" tab in the forecast scenario configuration panel. Click on "RELOAD LOGS" to refresh the log details
Board automatically checks the progress of the training phase every 20 seconds in order to notify the user when it is completed.
To run a forecast scenario without the training phase, select the desired one and click on "RUN INFERENCE ONLY". A popup window that shows the progress of this process will appear. You can click on the "X" button to hide the popup window, however, the whole process is still seamlessly executed in the background and you can check its progress in the Running tasks area. Once the process is finished, the user will receive a notification near the Top Menu.
The "RUN INFERENCE ONLY" button will be enabled only after the training process has been performed at least once.
Board data is serialized in the same way as in the training process. The MLTable is created in the following way: - An MLTable column is created for each Entity that is in the Structure of the target Cube - An MLTable column is for the target Cube - An MLTable column is created for each covariate Data Block
New time series cannot be used for inference calls. If there are new time series, you must perform the training phase again, otherwise the result of the inference call will be zero.