Machine Learning Training

The Machine Learning Training API allows you to construct, control, and train a machine learning model on Tizen devices.

The main features of the Machine Learning Training API include:

  • Constructing a deep neural network (DNN)
    • You can construct a DNN model using a model description file or by writing code through the Machine Learning Training API.
  • Training with your own data
    • The Machine Learning Training API also allows you to train the model with your own data, either through file I/O or by defining a data generator.
  • Evaluating the model during training
    • You can easily validate and test your model during the training process by defining the dataset.
Note

The example code on this page does not handle all error cases. In real applications, errors must be handled more extensively than in the examples shown here.

Prerequisites

To enable your application to use Machine Learning Training API:

  1. To use the functions and data types of the Machine Learning Training API, include the <nntrainer.h> header file in your application:

    #include <nntrainer.h>
    
  2. To use the Machine Learning Training API, include the following features in your tizen-manifest.xml file:

    <feature name="http://tizen.org/feature/machine_learning">true</feature>
    <feature name="http://tizen.org/feature/machine_learning.training">true</feature>
    

To save or load model files from outside the application's own resources, the application must request permission by adding the following privileges to the tizen-manifest.xml file:

<privileges>
  <!-- For accessing media storage -->
  <privilege>http://tizen.org/privilege/mediastorage</privilege>
  <!-- For accessing external storage -->
  <privilege>http://tizen.org/privilege/externalstorage</privilege>
</privileges>

Building blocks

The following are the four major components of the Machine Learning Training API:

Model

Model is a wrapper component that holds the topology of layers, the optimizer, and the dataset. The model performs training and saves the updated parameters, which can later be used for inference. In the following figure, data represents the input data or features, and label is the actual value to be compared against the prediction:

[Figure: model]

Currently, only a sequential neural network is supported:

// Create model
ml_train_model_h model;
ml_train_model_construct(&model);

/* Configure model (omitted for brevity) */

// Compile model. This freezes the model; afterwards, the model cannot be modified.
ml_train_model_compile(model, "loss=cross", "batch_size=16", NULL);

// Run model
ml_train_model_run(model, "epochs=2", "save_path=model.bin", NULL);

// Destroy after use
ml_train_model_destroy(model);

A number of properties can be set in the ml_train_model_compile() and ml_train_model_run() phases:

  • ml_train_model_compile()
    • loss (categorical): Loss function to be set
      • cross: cross-entropy loss
      • mse: mean squared error
  • ml_train_model_run()
    • epochs (integer): Number of epochs to train the model
    • save_path (file path): Model path for saving and reading parameters
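
For example, the mse value from the list above selects the mean squared error loss instead of cross-entropy; the batch size here is illustrative:

// Compile with mean squared error loss instead of cross-entropy
ml_train_model_compile(model, "loss=mse", "batch_size=16", NULL);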

Layer

Layer is a component that performs the actual computation while managing internal trainable parameters. Currently, the input layer and fully connected layer types are supported:

// Create layer
ml_train_layer_h layer;
ml_train_layer_create(&layer, ML_TRAIN_LAYER_TYPE_FC);

// Configure layer
ml_train_layer_set_property(layer, "unit=10", "activation=softmax", "bias_initializer=zeros", NULL);

// After adding the layer to the model,
// you do not need to destroy the layer, since ownership is transferred to the model.
ml_train_model_add_layer(model, layer);

The following are the available properties for each layer type:

  • Universal properties: properties that apply to every layer
    • name (string): An identifier for each layer
    • input_shape (string): Formatted string as "channel:height:width". If there is no channel, it must be 1. The first layer of the model must have input_shape; the other layers can omit it, as it is calculated at the compile phase.
    • activation (categorical): Activation type
      • tanh: hyperbolic tangent
      • sigmoid: sigmoid function
      • relu: ReLU function
      • softmax: softmax function
    • weight_initializer (categorical): Weight initializer
      • zeros: zero initialization
      • lecun_normal: LeCun normal initialization
      • lecun_uniform: LeCun uniform initialization
      • xavier_normal: Xavier normal initialization
      • xavier_uniform: Xavier uniform initialization
      • he_normal: He normal initialization
      • he_uniform: He uniform initialization
    • bias_initializer (categorical): Bias initializer, with the same categories as weight_initializer
    • weight_regularizer (categorical): Weight regularizer. Currently, only l2norm is supported
      • l2norm: L2 weight regularizer
    • weight_regularizer_constant (float): Weight regularizer constant
    • flatten (boolean): Flatten the shape from c:h:w to 1:1:c*h*w
  • ML_TRAIN_LAYER_TYPE_INPUT: Input layer
    • normalization (boolean): Normalize the input if true
    • standardization (boolean): Standardize the input if true
  • ML_TRAIN_LAYER_TYPE_FC: Fully connected layer
    • unit (integer): Number of outputs
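
For illustration, a fully connected layer using several of the universal properties above could be configured as follows; the layer name and all property values are illustrative choices, not required ones:

ml_train_layer_h fc;
int status = ml_train_layer_create(&fc, ML_TRAIN_LAYER_TYPE_FC);

// Illustrative values: a 32-unit ReLU layer with He uniform initialization,
// L2 weight regularization, and input flattening enabled.
status = ml_train_layer_set_property(fc, "name=hidden",
                                         "unit=32",
                                         "activation=relu",
                                         "weight_initializer=he_uniform",
                                         "weight_regularizer=l2norm",
                                         "weight_regularizer_constant=0.005",
                                         "flatten=true", NULL);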

Optimizer

Optimizer determines how to update the model parameters according to the loss from the prediction. Currently, the Stochastic Gradient Descent (SGD) optimizer and the Adam optimizer are supported:

// Create an optimizer
ml_train_optimizer_h optimizer;
ml_train_optimizer_create(&optimizer, ML_TRAIN_OPTIMIZER_TYPE_SGD);

// Configure the optimizer
ml_train_optimizer_set_property(optimizer, "learning_rate=0.001", NULL);

// Set the optimizer to the model.
// You do not need to destroy the optimizer afterwards, since ownership is transferred to the model.
ml_train_model_set_optimizer(model, optimizer);

The following are the available properties for each optimizer type:

  • Universal properties: properties that apply to every optimizer
    • learning_rate (float): Initial learning rate for the optimizer
  • ML_TRAIN_OPTIMIZER_TYPE_SGD: Stochastic Gradient Descent optimizer
  • ML_TRAIN_OPTIMIZER_TYPE_ADAM: Adam optimizer
    • decay_steps (float): Decay steps
    • decay_rate (float): Decay rate
    • beta1 (float): beta1 coefficient for Adam
    • beta2 (float): beta2 coefficient for Adam
    • epsilon (float): epsilon coefficient for Adam

Dataset

Dataset is in charge of feeding data into the model. The dataset can be created either from a callback function or from a file. For more information, see the Configure the model section.

The following code is an example of handling a dataset:

// Create dataset
ml_train_dataset_h dataset;
ml_train_dataset_create_with_generator(&dataset, generator_train_cb, generator_valid_cb, generator_test_cb);

// Configure dataset
ml_train_dataset_set_property(dataset, "buffer_size=100", NULL);

// After setting the dataset to the model,
// you do not need to destroy the dataset, since ownership is transferred to the model.
ml_train_model_set_dataset(model, dataset);

Construct a model

A model can be constructed with ml_train_model_construct(). If you have a file that describes the model, it can be used for the initial construction with ml_train_model_construct_with_conf(). Even if the model is constructed from a file, switching, modifying, or setting a component is possible until you compile the model with ml_train_model_compile().
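
For example, the following sketch (assuming a model.ini file exists in the current working directory) constructs a model from the file and then replaces its optimizer before compiling:

ml_train_model_h model;
ml_train_optimizer_h optimizer;
int status = ml_train_model_construct_with_conf("model.ini", &model);

// Until ml_train_model_compile() is called, components can still be switched:
// here the optimizer described in the file is replaced with plain SGD.
status = ml_train_optimizer_create(&optimizer, ML_TRAIN_OPTIMIZER_TYPE_SGD);
status = ml_train_optimizer_set_property(optimizer, "learning_rate=0.01", NULL);
status = ml_train_model_set_optimizer(model, optimizer);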

Construct a model from a description file

As of now, only INI formatted files (*.ini) are supported for constructing a model from a file.

Create a model from INI formatted file

The special sections [Model] and [Dataset] refer to the model and dataset objects, respectively. The rest of the INI sections each map to a layer. The keys and values in each section set the properties of that layer. All keys and values are treated as case-insensitive.

The following is an example of a *.ini file:

[Model] # Special section that describes model itself
Type = NeuralNetwork  # Model type : only NeuralNetwork is supported as of now
Optimizer = adam  # Optimizer : Adaptive Moment Estimation(adam)

#### optimizer related properties
Learning_rate = 0.0001  # Learning rate for the optimizer
Decay_rate = 0.96 # The decay rate for decaying the learning rate
Decay_steps = 1000       # decay step for the exponentially decayed learning rate
beta1 = 0.9     # beta 1 for adam
beta2 = 0.9999  # beta 2 for adam
epsilon = 1e-7  # epsilon for adam

#### Model compile related properties
batch_size = 9
loss = cross      # Cost(loss) function : cross entropy(cross)

####  Model run related properties
Epochs = 20     # Epochs
save_path = "model.bin"   # model path to save and read parameters

[DataSet] # Special section that describes dataset
BufferSize=9
TrainData="trainingSet.dat"
ValidData="validationSet.dat"
LabelData="label.dat"

# Layer Sections, each section name refers to name of the layer
[inputlayer]
Type = input
Input_Shape = 1:1:62720 # Input dimension in channel:height:width
Normalization = true

[outputlayer]
Type = fully_connected
Unit = 2    # Width of output dimension
bias_initializer = zeros
weight_initializer = xavier_uniform
Activation = sigmoid  # activation : sigmoid, softmax
weight_regularizer = l2norm
weight_regularizer_constant = 0.005

The following restrictions must be adhered to:

  • The model file must have a [Model] section.
  • The model file must have at least one layer.
  • Valid keys must have valid values. Invalid keys in each section are ignored.
Note

All paths are relative to the current working directory unless an absolute path is stated. Set save_path and the dataset from the code rather than describing them inside the model file.

The following example constructs a model from an INI file:

int status = ML_ERROR_NONE;
char *res_path = app_get_resource_path();
char model_path[1024];
ml_train_model_h model;

snprintf(model_path, sizeof(model_path), "%s/model.ini", res_path);
free(res_path);

status = ml_train_model_construct_with_conf(model_path, &model);
if(status != ML_ERROR_NONE) {
  // handle error
}

Construct a model in code

An empty model can be constructed with ml_train_model_construct().

Configure the model

After constructing a model, the model can be configured.

Note

The example code in this section reproduces the model description from Create a model from INI formatted file, except for the following differences:

  • Relative paths are replaced with the dynamic application resource and data paths.
  • Model-related properties that can only be set at the compile or run phase are set there.
  • ml_train_dataset_create_with_generator() is demonstrated, since it cannot be covered by the description file.

First, an empty model needs to be created:

ml_train_model_h model;
ml_train_model_construct(&model);

Add a layer

ml_train_model_add_layer() appends a layer to the end of the graph in the model:

int status = ML_ERROR_NONE;
ml_train_layer_h layers[2];

// create and add input layer
status = ml_train_layer_create(&layers[0], ML_TRAIN_LAYER_TYPE_INPUT);
if(status != ML_ERROR_NONE) {
  // handle error
}

status = ml_train_layer_set_property(layers[0], "name=inputlayer",
                                                "input_shape=1:1:62720",
                                                "normalization=true", NULL);
if(status != ML_ERROR_NONE) {
  // handle error
}
status = ml_train_model_add_layer(model, layers[0]);

// create and add fully connected layer
status = ml_train_layer_create(&layers[1], ML_TRAIN_LAYER_TYPE_FC);
status = ml_train_layer_set_property(layers[1], "name=outputlayer",
                                                "unit=2",
                                                "bias_initializer=zeros",
                                                "weight_initializer=xavier_uniform",
                                                "activation=sigmoid", NULL);
status = ml_train_model_add_layer(model, layers[1]);

Set an optimizer

Creating an optimizer and setting it to the model can be done in the same manner as for a layer:

int status = ML_ERROR_NONE;
ml_train_optimizer_h optimizer;

status = ml_train_optimizer_create(&optimizer, ML_TRAIN_OPTIMIZER_TYPE_ADAM);
status = ml_train_optimizer_set_property(optimizer, "learning_rate=0.0001",
                                                    "decay_rate=0.96",
                                                    "decay_steps=1000",
                                                    "beta1=0.9",
                                                    "beta2=0.9999",
                                                    "epsilon=1e-7", NULL);
status = ml_train_model_set_optimizer(model, optimizer);

Set a dataset

There are two ways to create a dataset: from a file or from a callback. In either case, you must provide streams of tensor data and arrays of values representing the labels, usually one-hot encoded.

Set a dataset from a file

To create a dataset from a file, a training set and a label set must be provided. A validation set and a test set can be provided if needed.

A label set is a file that contains the actual labels, one per line. For example:

airplane
automobile
bird
cat
deer
dog
frog
horse
ship
truck

Each data file, except for the label set, must contain raw float array data interleaved with one-hot encoded labels:

[float array for data1][label (one-hot encoded) for data1][float array][label]...
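
For illustration, the following minimal sketch appends one sample in this format to a data file. write_sample(), FEATURE_SIZE, and NUM_CLASSES are hypothetical names introduced here, and the sizes are assumptions matching the example model on this page:

#include <stdio.h>

#define FEATURE_SIZE 62720 /* Assumed float count per sample (1:1:62720) */
#define NUM_CLASSES  2     /* Assumed number of labels */

/* Hypothetical helper: write raw float features followed by a one-hot label. */
int write_sample(FILE *fp, const float *feature, int class_index) {
  float label[NUM_CLASSES] = { 0 };
  label[class_index] = 1.0f;

  if (fwrite(feature, sizeof(float), FEATURE_SIZE, fp) != FEATURE_SIZE)
    return -1;
  if (fwrite(label, sizeof(float), NUM_CLASSES, fp) != NUM_CLASSES)
    return -1;
  return 0;
}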

After preparing the label set and the data files, create the dataset as follows:

int status = ML_ERROR_NONE;
ml_train_dataset_h dataset;

char *res_path = app_get_resource_path();
char train_path[1024];
char label_prop[1024];

snprintf(train_path, sizeof(train_path), "%s/training.dat", res_path);
snprintf(label_prop, sizeof(label_prop), "label_data=%s/label.dat", res_path);
free(res_path);

// Validation and test files can be passed instead of NULL, if available.
status = ml_train_dataset_create_with_file(&dataset, train_path, NULL, NULL);
if(status != ML_ERROR_NONE) {
  // handle error
}

status = ml_train_dataset_set_property(dataset, label_prop,
                                       "buffer_size=9", NULL);
status = ml_train_model_set_dataset(model, dataset);

As shown above, the buffer_size property can also be set for the dataset.

Set a dataset from a generator

Creating a dataset from a generator function is also possible.

  1. Prepare a callback function (a fuller sketch follows after this list):

    /**
    * @brief      Get a batch of data for training
    * @param[out] data buffer to be filled with input data
    * @param[out] label buffer to be filled with label data
    * @param[out] last set to true when the data is finished
    * @param[in]  user_data private data for the callback
    * @retval     status for handling errors
    */
    int get_train_data(float **data, float **label, bool *last, void *user_data) {
      /* code that fills data, label and last */
      return ML_ERROR_NONE;
    }
    
  2. Create a dataset from the callback function:

    int status = ML_ERROR_NONE;
    ml_train_dataset_h dataset;
    
    // Validation and test callbacks can be omitted.
    status = ml_train_dataset_create_with_generator(&dataset, get_train_data,
                                                    NULL, NULL);
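
For illustration, the following is a minimal sketch of such a training callback, assuming the callback fills one sample per call and that the feature size, class count, and sample count match the example model; a real implementation would read actual samples instead of filling zeros:

#include <stdbool.h>
#include <nntrainer.h>

#define FEATURE_SIZE 62720 /* Assumed float count per sample */
#define NUM_CLASSES  2     /* Assumed number of labels */
#define NUM_SAMPLES  1000  /* Assumed samples per epoch */

int get_train_data(float **data, float **label, bool *last, void *user_data) {
  static int count = 0;

  // A real implementation would fill data[0] from storage or memory.
  for (int i = 0; i < FEATURE_SIZE; ++i)
    data[0][i] = 0.0f;

  // One-hot encode an illustrative label.
  for (int i = 0; i < NUM_CLASSES; ++i)
    label[0][i] = 0.0f;
  label[0][count % NUM_CLASSES] = 1.0f;

  // Signal the end of an epoch after NUM_SAMPLES samples.
  count++;
  *last = (count % NUM_SAMPLES == 0);
  return ML_ERROR_NONE;
}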
    

Compile the model

Compiling a model finalizes the model with the loss function. Once compiled, any modification to the properties of the model is restricted, and adding layers or changing the optimizer or the dataset of the model is not permitted either:

int status = ML_ERROR_NONE;

status = ml_train_model_compile(model, "loss=cross", "batch_size=9", NULL);

Train the model

Now, the model is ready to train. Train the model as follows:

int status = ML_ERROR_NONE;

status = ml_train_model_run(model, "epochs=20", "save_path=model.bin", NULL);

Destroy the model

After training, the model must be destroyed with ml_train_model_destroy(). ml_train_model_add_layer(), ml_train_model_set_optimizer(), and ml_train_model_set_dataset() transfer ownership to the model, so the layers, optimizer, and dataset that belong to the model are destroyed along with it.
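
Therefore, only the model handle needs to be destroyed explicitly:

// Destroying the model also releases the layers, optimizer, and dataset it owns.
ml_train_model_destroy(model);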

  • Dependencies
    • Tizen 6.0 and Higher for Mobile
    • Tizen 6.0 and Higher for Wearable