Struct leaf::solver::SolverConfig
pub struct SolverConfig {
    pub name: String,
    pub network: LayerConfig,
    pub objective: LayerConfig,
    pub solver: SolverKind,
    pub minibatch_size: usize,
    pub lr_policy: LRPolicy,
    pub base_lr: f32,
    pub gamma: f32,
    pub stepsize: usize,
    pub clip_gradients: Option<f32>,
    pub weight_decay: Option<f32>,
    pub regularization_method: Option<RegularizationMethod>,
    pub momentum: f32,
}
Configuration for a Solver
Fields
name | Name of the solver. |
network | The LayerConfig that is used to initialize the network. |
objective | The LayerConfig that is used to initialize the objective. |
solver | The Solver implementation to be used. |
minibatch_size | Accumulate gradients over minibatch_size instances. Default: 1 |
lr_policy | The learning rate policy to be used. Default: Fixed |
base_lr | The base learning rate. Default: 0.01 |
gamma | gamma as used in the calculation of most learning rate policies. Default: 0.1 |
stepsize | The stepsize used in Step and Sigmoid learning policies. Default: 10 |
clip_gradients | The threshold for clipping gradients. If the L2 norm of the gradient values exceeds this threshold, they are scaled down so that their L2 norm equals clip_gradients. Default: None |
weight_decay | The global weight decay multiplier for regularization. Regularization can prevent overfitting. If set to None, no weight decay is applied. |
regularization_method | The method of regularization to use. There are different methods for regularization; the two most common ones are L1 regularization and L2 regularization. See RegularizationMethod for all implemented methods. Currently only L2 regularization is implemented. See Issue #23. |
momentum | The momentum multiplier for SGD solvers. For more information see SGD with momentum. The value should always be between 0 and 1 and dictates how much of the previous gradient update will be added to the current one. Default: 0 |
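The clip_gradients behavior described above can be sketched as a small standalone function. This is an illustrative implementation of L2-norm gradient clipping under the semantics stated in the field description, not Leaf's actual code; the function name is hypothetical.

```rust
/// Illustrative sketch: scale gradients so their L2 norm does not
/// exceed `threshold`, as described for `clip_gradients`.
fn clip_by_l2_norm(grads: &mut [f32], threshold: f32) {
    // Global L2 norm over all gradient values.
    let norm = grads.iter().map(|g| g * g).sum::<f32>().sqrt();
    if norm > threshold {
        // Rescale so the new L2 norm equals `threshold`.
        let scale = threshold / norm;
        for g in grads.iter_mut() {
            *g *= scale;
        }
    }
}

fn main() {
    let mut grads = vec![3.0_f32, 4.0]; // L2 norm = 5
    clip_by_l2_norm(&mut grads, 1.0);   // scaled by 1/5
    println!("{:?}", grads);
}
```

With clip_gradients set to None, this step would simply be skipped.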
Methods
impl SolverConfig
fn get_learning_rate(&self, iter: usize) -> f32
Return the learning rate for a supplied iteration.
The way the learning rate is calculated depends on the configured LRPolicy.
Used by the Solver to calculate the learning rate for the current iteration. The calculated learning rate has a different effect on training depending on the type of Solver you are using.
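As a rough sketch, the Fixed and Step policies can be computed from base_lr, gamma, and stepsize as below. This is a standalone illustration of the standard formulas (Fixed keeps base_lr constant; Step multiplies by gamma every stepsize iterations), not Leaf's implementation; the Schedule struct is hypothetical.

```rust
// Hypothetical stand-in for the learning-rate fields of SolverConfig.
#[derive(Clone, Copy)]
enum LRPolicy {
    Fixed,
    Step,
}

struct Schedule {
    lr_policy: LRPolicy,
    base_lr: f32,
    gamma: f32,
    stepsize: usize,
}

impl Schedule {
    /// Return the learning rate for a supplied iteration.
    fn get_learning_rate(&self, iter: usize) -> f32 {
        match self.lr_policy {
            // Fixed: the rate never changes.
            LRPolicy::Fixed => self.base_lr,
            // Step: drop the rate by a factor of `gamma`
            // every `stepsize` iterations.
            LRPolicy::Step => {
                self.base_lr * self.gamma.powi((iter / self.stepsize) as i32)
            }
        }
    }
}

fn main() {
    let s = Schedule {
        lr_policy: LRPolicy::Step,
        base_lr: 0.01,
        gamma: 0.1,
        stepsize: 10,
    };
    // At iteration 25 the Step policy has dropped twice: 0.01 * 0.1^2.
    println!("{}", s.get_learning_rate(25));
}
```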