About Deeplodocus Configurations
The configurations detailed by files in the config directory are the core of any Deeplodocus project - these manage each of the different modules and how they interact. They are split into nine different files, each responsible for a different domain of the project, as follows:
- Project - configure overarching project variables
- Data - configure your datasets
- Model - set up your model
- Losses -configure your loss functions.
- Metrics - configure any metric functions
- Transform - configure your transformation routines
- Optimizer - configure your optimizer
- Training - set your training conditions
- History - configurations for storing training and validation history
On startup, each of the configuration files are loaded into a single variable named config and the data type of each entry is checked. If any entries are missing or cannot be converted to the expected data type, they will be added/corrected with a default value and a warning will be issued to explain the change made.
Config is a Namespace object, more details of which can be found in the Namespace section of the API page. Users have direct acces to the config variable via Deeplodocus terminal, for example:
- The summary method can be used to view config -
config.summary() - A snapshot of the config file can be taken -
config.snapshot() - Sub-domains can be viewed individually -
config.project.summary() - Variables can be edited -
config.training.num_epochs = 10
The following sections detail all of the expected entries for each configuration YAML file.
Project
Overarching project variables are specified in the project.yaml file.
# Example project.yaml file
session: "version01"
cv_library: "opencv"
device: "auto"
device_ids: "auto"
logs:
history_train_batches: True
history_train_epochs: True
history_validation: True
notification: True
on_wake: Null
session
The name of the current Deeplodocus session. History, log and weight files will be saved to a directory named with the session variable.
- Data type: str
- Default value: "version01"
cv_library
The computer vision library to use.
- Data type: str
- Default value: "opencv"
- Supported options:
- For OpenCV use: "opencv"
- For PILLOW use: "pil"
device
The hardware device to use for executing forward and backward passes of the model.
- Data type: str
- Default value: "auto"
- Supported options:
- For CPU use: "cpu"
- For CUDA devices use: "cuda"
- "auto" will use CUDA devices if any are available, otherwise CPUs
device_ids
The index values of CUDA devices to use.
- Data type: [int]
- Default value: "auto"
- Supported options:
- "auto" will use all available CUDA devices
- [0, 1, ... n] will use CUDA devices which have index values in the given list
logs: history_train_batches
Whether or not training loss and metric values for each batch should be written to a history CSV file.
- Data type: bool
- Default value: True
logs: history_train_epochs
Whether or not training loss and metric values for each epoch should be written to a history CSV file.
- Data type: bool
- Default value: True
logs: history_validation
Whether or not validation loss and metric values for each epoch should be written to a history CSV file.
- Data type: bool
- Default value: True
logs: notification
Whether or not all Deeplodocus notifications.
- Data type: bool
- Default value: True
on_wake
A list of commands to run on startup.
- Data type: [str]
- Default value: None
Data
# Example data.yaml file
TODO
Model
A single model can be specified in the model.yaml file.
# Example model.yaml file
name: "VGG16"
module: "deeplodocus.app.models.vgg"
from_file: False
file: Null
input_size:
- [3, 244, 244]
kwargs: Null
name
The name of the model to use.
- Data type: str
- Default value: VGG16
module
Path to the module that contains the chosen model. If no module is given, Deeplodocs will search through deeplodocus.app and existing PyTorch modules for classes and functions with names that match the one given. The first object found with the requested name will be loaded and the user will be informed of its origin through a notification.
- Data type: str
- Default value: None
Note: More informaton about pre-built deeplodocus models can be found here.
from_file
If 'from_file' is true the model will be loaded from existing a pre-trained model/weight file specified by 'file'. Otherwise, the model will be loaded from the given 'name' and 'module' only.
- Data type: bool
- Default value: False
file
If 'from_file' is True, the model will be loaded according to the path to a model/weights file specified by 'file'. If a weights file contains the module path and name of its model, Deeplodocus will automatically attempt to use that model, otherwise the name and module specified in model.yaml will be used as the model.
- Data type: str
- Default value: None
input_size
The shape of each input to the model. Models may have multiple inputs, thus this entry is list of lists. 'input_size' is not a obligatory entry, but is required to print summaries of the model.
- Data type: [[int]]
- Default value: None
kwargs
Any keyword arguments to be parsed to the model.
- Data type: dict
- Default value: None
Losses
Any number of losses can be specified in losses.yaml. Give each loss a unique name, followed by a series of defining entries, as seen below. The unique name given will be used when displaying loss values and saving to training and validation history.
# Example loss.yaml file
LossName:
name: "CrossEntropyLoss"
module: Null
weight: 1
kwargs: Null
AnotherLossName:
name: "CrossEntropyLoss"
module: Null
weight: 1
kwargs: Null
name
The name of the loss object to use.
- Data type: str
- Default value: "CrossEntropyLoss"
module
Path to the module that contains the chosen loss. If no module is given, Deeplodocs will search through deeplodocus.app and existing PyTorch modules for loss functions with names that match the one given. The first object found with the requested name will be loaded and the user will be informed of its origin through a notification.
- Data type: str
- Default value: None
Note: More information about existing PyTorch loss functions can be found here, and some additional deeplodocus losses can be found here.
weight
The loss weight.
- Data type: float
- Default value: 1
kwargs
Any keyword arguments to be parsed to the loss.
- Data type: dict
- Default value: None
Metrics
Any number of metrics can be specified in metrics.yaml. Give each metric a unique name, followed by a series of defining entries, as seen below. The unique name given will be used when displaying metric values and saving to training and validation history.
# Example metrics.yaml file
MetricName:
name: "accuracy"
module: Null
kwargs: Null
AnotherMetricName:
name: "accuracy"
module: Null
kwargs: Null
name
The name of the metric object to use.
- Data type: str
- Default value: "accuracy"
module
Path to the module that contains the chosen loss. If no module is given, Deeplodocs will search through deeplodocus.app and existing PyTorch modules for loss functions with names that match the one given. The first object found with the requested name will be loaded and the user will be informed of its origin through a notification.
- Data type: str
- Default value: None
Note: More information about existing metrics that come with Deeplodocus can be found here.
kwargs
Any keyword arguments to be parsed to the metric.
- Data type: dict
- Default value: None
Transform
A series of input, label, additional data and output transformers can be specified in the transform.yaml file. More informaton about existing transforms can be found here.
# Example transform.yaml file
train:
name: Train Transform Manager
inputs: Null
labels: Null
additional_data: Null
outputs: Null
validation:
name: Validation Transform Manager
inputs: Null
labels: Null
additional_data: Null
outputs: Null
test:
name: Test Transform Manager
inputs: Null
labels: Null
additional_data: Null
outputs: Null
predict:
name: Prediction Transform Manager
inputs: Null
labels: Null
additional_data: Null
outputs: Null
train: name
A name transform manager dedicated to the training pipeline.
- Data type: str
- Default value: Train Transform Manager
validation: name
A name transform manager dedicated to the validation pipeline.
- Data type: str
- Default value: Validation Transform Manager
test: name
A name transform manager dedicated to the testing pipeline.
- Data type: str
- Default value: Test Transform Manager
predict: name
A name transform manager dedicated to the prediction pipeline.
- Data type: str
- Default value: Prediction Transform Manager
inputs
For each of the train, test, validation and predict pipelines, you can specify a path to a transformer YAML file for each of the model inputs.
- Data type: [str]
- Default value: Null
labels
- Data type: [str]
- Default value: Null
For each of the train, test, validation and predict pipelines, you can specify a path to a transformer YAML file for each of the model labels.
additional_data
- Data type: [str]
- Default value: Null
For each of the train, test, validation and predict pipelines, you can specify a path to a transformer YAML file for each of the model additional.
outputs
For each of the train, test, validation and predict pipelines, you can specify a path to a transformer YAML file for each of the model output.
- Data type: [str]
- Default value: Null
Optimizer
A single optimizer for the model should be specified in optimizer.yaml.
# Example optimizer.yaml file
name: Adam
module: Null
kwargs: Null
name
The name of the optimizer to use.
- Data type: str
- Default value: "Adam"
module
Path to the module that contains the chosen loss. If no module is given, Deeplodocs will search through deeplodocus.app and existing PyTorch modules for loss functions with names that match the one given. The first object found with the requested name will be loaded and the user will be informed of its origin through a notification.
- Data type: str
- Default value: None
Note: More informaton about existing PyTorch optimizers can be found here.
kwargs
Any keyword arguments to be parsed to the optimizer.
- Data type: dict
- Default value: None
Training
# Example training.yaml file
num_epochs: 10
initial_epoch: 0
shuffle: "default"
saver:
method: "pytorch"
save_signal: "auto"
overwrite: False
overwatch:
name: "Total Loss"
condition: "less"
num_epochs
The epoch number to train to.
- Data type: int
- Default value: 10
initial_epoch
The number of the initial epoch.
- Data type: int
- Default value: 0
shuffle
How the training dataset should be shuffled.
- Data type: str
- Default value: "default"
- Supported options:
- "none" - no shuffling
- "all" / "default" - all instances are shuffled at the start of each epoch.
- "batches" - instances remain in the same batch, and the order of the batches is shuffled.
- "pick" - instances are randomly selected from the dataset.
Note: Use "pick" when wanting to shuffle a dataset whilst simultaneously restricting the size of the dataset through the number entry in the data.yaml file.
saver: method
The format to save the the model as.
- Data type: str
- Default value: "pytorch"
- Supported options:
- "pytorch" - currenty the only supported option is saving as a pytorch weights file.
saver: save_signal
How regularly a signal to save the model weights should be dispatched.
- Data type: str
- Default value: "auto"
- Supported options:
- "auto" - the model will be saved at the end of each epoch if the overwatch condition is met.
- "batch" - the model will be saved at the end of each batch.
- "epoch" - the model will be saved at the end of each epoch.
saver: overwrite
Whether or not the model weights should be overwritten on each save signal.
- Data type: bool
- Default value: False
overwatch: name
The name of the overwatch metric to watch. This can be the name of any loss or metric in use, or set to "Total Loss" to watch the weighted sum of all losses.
- Data type: str
- Default value: "Total Loss"
overwatch: condition
The condition for comparing previous overwatch metric values with the current value.
- Data type: str
- Default value: "less"
- Supported options:
- "<", "smaller", "less" - The current overwatch metric must be lower than the previous lowest.
- ">", "bigger", "greater", "more" - The current overwatch metric must be greater than the previous greatest.
History
# Example history.yaml file
verbose: "default"
memorize: "batch"
verbose
TODO: description
- Data type: str
- Default value: "default"
memorize
TODO: description
- Data type: str
- Default value: "batch"