The weights of a neural network cannot be calculated using an analytical method. Instead, the weights must be discovered via an empirical optimization procedure called stochastic gradient descent.
The optimization problem addressed by stochastic gradient descent for neural networks is challenging and the space of solutions (sets of weights) may be comprised of many good solutions (called global optima) as well as easy to find, but low in skill solutions (called local optima).
The amount of change to the model during each step of this search process, or the step size, is called the “learning rate” and provides perhaps the most important hyperparameter to tune for your neural network in order to achieve good performance on your problem.
Learning rate controls how quickly or slowly a neural network model learns a problem.
How to configure the learning rate with sensible defaults, diagnose behavior, and develop a sensitivity analysis.
How to further improve performance with learning rate schedules, momentum, and adaptive learning rates.
references:
https://machinelearningmastery.com/learning-rate-for-deep-learning-neural-networks/
No comments:
Post a Comment