feat: add neural network optimizers module #13685
shretadas wants to merge 2 commits into TheAlgorithms:master from
Conversation
- Add SGD (Stochastic Gradient Descent) optimizer
- Add MomentumSGD with momentum acceleration
- Add NAG (Nesterov Accelerated Gradient) optimizer
- Add Adagrad with adaptive learning rates
- Add Adam optimizer combining momentum and RMSprop
- Include comprehensive doctests (61 tests, all passing)
- Add abstract BaseOptimizer for a consistent interface
- Include detailed mathematical documentation
- Add educational examples and performance comparisons
- Follow repository guidelines: type hints, error handling, pure Python

Implements standard optimization algorithms for neural network training with an educational focus and comprehensive test coverage.
Automated review generated by algorithms-keeper. If there's any problem regarding this review, please open an issue about it.
algorithms-keeper commands and options
algorithms-keeper actions can be triggered by commenting on this PR:
- @algorithms-keeper review to trigger the checks for only added pull request files
- @algorithms-keeper review-all to trigger the checks for all the pull request files, including the modified files. As we cannot post review comments on lines not part of the diff, this command will post all the messages in one comment.

NOTE: Commands are in beta and so this feature is restricted only to a member or owner of the organization.
Raises:
    ValueError: If parameters and gradients have different shapes
"""

def _adagrad_update_recursive(params, grads, acc_grads):
Please provide return type hint for the function: _adagrad_update_recursive. If the function does not return a value, please provide the type hint as: def function() -> None:
Please provide type hint for the parameter: params
Please provide type hint for the parameter: grads
Please provide type hint for the parameter: acc_grads
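For context, one way the requested annotations could look, sketched on flat lists of floats. The PR's helper apparently recurses over nested containers, so the element types, default hyperparameters, and return value below are assumptions, not the PR's actual implementation:

```python
def _adagrad_update_recursive(
    params: list[float],
    grads: list[float],
    acc_grads: list[float],
    learning_rate: float = 0.01,
    epsilon: float = 1e-8,
) -> list[float]:
    """Apply one Adagrad step on flat lists (illustrative sketch only)."""
    if len(params) != len(grads):
        raise ValueError("parameters and gradients have different shapes")
    for i, grad in enumerate(grads):
        acc_grads[i] += grad * grad  # accumulate squared gradients
        # scale each step by the inverse root of the accumulated squares
        params[i] -= learning_rate * grad / (acc_grads[i] ** 0.5 + epsilon)
    return params
```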
bias_correction1 = 1 - self.beta1 ** self._time_step
bias_correction2 = 1 - self.beta2 ** self._time_step

def _adam_update_recursive(params, grads, first_moment, second_moment):
Please provide return type hint for the function: _adam_update_recursive. If the function does not return a value, please provide the type hint as: def function() -> None:
Please provide type hint for the parameter: params
Please provide type hint for the parameter: grads
Please provide type hint for the parameter: first_moment
Please provide type hint for the parameter: second_moment
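The bias-correction lines quoted above are the standard Adam correction for zero-initialized moment estimates. A minimal, self-contained sketch of the step they belong to, written on flat lists under an illustrative name (adam_step) rather than the PR's nested recursive helper:

```python
def adam_step(
    params: list[float],
    grads: list[float],
    first_moment: list[float],
    second_moment: list[float],
    time_step: int,  # must start at 1, otherwise the corrections divide by zero
    learning_rate: float = 0.001,
    beta1: float = 0.9,
    beta2: float = 0.999,
    epsilon: float = 1e-8,
) -> list[float]:
    """One Adam update on flat parameter lists (illustration only)."""
    if len(params) != len(grads):
        raise ValueError("parameters and gradients have different shapes")
    bias_correction1 = 1 - beta1**time_step
    bias_correction2 = 1 - beta2**time_step
    for i, grad in enumerate(grads):
        first_moment[i] = beta1 * first_moment[i] + (1 - beta1) * grad
        second_moment[i] = beta2 * second_moment[i] + (1 - beta2) * grad * grad
        m_hat = first_moment[i] / bias_correction1  # debiased first moment
        v_hat = second_moment[i] / bias_correction2  # debiased second moment
        params[i] -= learning_rate * m_hat / (v_hat**0.5 + epsilon)
    return params
```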
x_adagrad = [-1.0, 1.0]
x_adam = [-1.0, 1.0]

def rosenbrock(x, y):
Please provide return type hint for the function: rosenbrock. If the function does not return a value, please provide the type hint as: def function() -> None:
Please provide descriptive name for the parameter: x
Please provide type hint for the parameter: x
Please provide descriptive name for the parameter: y
Please provide type hint for the parameter: y
| """Rosenbrock function: f(x,y) = 100*(y-x²)² + (1-x)²""" | ||
| return 100 * (y - x*x)**2 + (1 - x)**2 | ||
|
|
||
| def rosenbrock_gradient(x, y): |
Please provide return type hint for the function: rosenbrock_gradient. If the function does not return a value, please provide the type hint as: def function() -> None:
Please provide descriptive name for the parameter: x
Please provide type hint for the parameter: x
Please provide descriptive name for the parameter: y
Please provide type hint for the parameter: y
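A hedged sketch of what the requested hints and descriptive names might look like for this benchmark pair; the names x_coord/y_coord and the tuple return type are assumptions, while the gradient formula follows directly from the Rosenbrock definition quoted above:

```python
def rosenbrock(x_coord: float, y_coord: float) -> float:
    """Rosenbrock function: f(x, y) = 100*(y - x^2)^2 + (1 - x)^2."""
    return 100 * (y_coord - x_coord * x_coord) ** 2 + (1 - x_coord) ** 2


def rosenbrock_gradient(x_coord: float, y_coord: float) -> tuple[float, float]:
    """Analytic gradient of the Rosenbrock function.

    df/dx = -400*x*(y - x^2) - 2*(1 - x)
    df/dy =  200*(y - x^2)
    """
    dx = -400 * x_coord * (y_coord - x_coord * x_coord) - 2 * (1 - x_coord)
    dy = 200 * (y_coord - x_coord * x_coord)
    return dx, dy
```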
Raises:
    ValueError: If parameters and gradients have different shapes
"""

def _check_shapes_and_get_velocity(params, grads, velocity):
Please provide return type hint for the function: _check_shapes_and_get_velocity. If the function does not return a value, please provide the type hint as: def function() -> None:
Please provide type hint for the parameter: params
Please provide type hint for the parameter: grads
Please provide type hint for the parameter: velocity
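One possible annotated shape for this helper, assuming flat float lists and a velocity buffer that is created lazily on the first call; both are assumptions, since the PR's body is not quoted here:

```python
def _check_shapes_and_get_velocity(
    params: list[float],
    grads: list[float],
    velocity: list[float] | None,
) -> list[float]:
    """Validate shapes and return a usable velocity buffer."""
    if len(params) != len(grads):
        raise ValueError("parameters and gradients have different shapes")
    if velocity is None:
        velocity = [0.0] * len(params)  # start from rest on the first update
    elif len(velocity) != len(params):
        raise ValueError("velocity buffer does not match parameter shape")
    return velocity
```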
Raises:
    ValueError: If parameters and gradients have different shapes
"""

def _nag_update_recursive(params, grads, velocity):
Please provide return type hint for the function: _nag_update_recursive. If the function does not return a value, please provide the type hint as: def function() -> None:
Please provide type hint for the parameter: params
Please provide type hint for the parameter: grads
Please provide type hint for the parameter: velocity
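For reference, a minimal Nesterov step, assuming the caller evaluates gradients at the look-ahead point params + momentum * velocity (that look-ahead is what separates NAG from plain momentum SGD); the name nag_step, the flat-list types, and the default hyperparameters are illustrative, not the PR's:

```python
def nag_step(
    params: list[float],
    grads_at_lookahead: list[float],
    velocity: list[float],
    learning_rate: float = 0.01,
    momentum: float = 0.9,
) -> list[float]:
    """One NAG update using gradients taken at the look-ahead position."""
    if len(params) != len(grads_at_lookahead):
        raise ValueError("parameters and gradients have different shapes")
    for i, grad in enumerate(grads_at_lookahead):
        velocity[i] = momentum * velocity[i] - learning_rate * grad
        params[i] += velocity[i]
    return params
```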
x_momentum = [2.5]
x_nag = [2.5]

def gradient_f(x):
Please provide return type hint for the function: gradient_f. If the function does not return a value, please provide the type hint as: def function() -> None:
Please provide descriptive name for the parameter: x
Please provide type hint for the parameter: x
| """Gradient of f(x) = 0.1*x^4 - 2*x^2 + x is f'(x) = 0.4*x^3 - 4*x + 1""" | ||
| return 0.4 * x**3 - 4 * x + 1 | ||
|
|
||
| def f(x): |
Please provide descriptive name for the function: f
Please provide return type hint for the function: f. If the function does not return a value, please provide the type hint as: def function() -> None:
Please provide descriptive name for the parameter: x
Please provide type hint for the parameter: x
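One way to satisfy both the descriptive-name and type-hint requests for this pair; objective/objective_gradient and x_value are placeholder names, while the formulas are the ones quoted in the docstring above:

```python
def objective(x_value: float) -> float:
    """Benchmark objective f(x) = 0.1*x^4 - 2*x^2 + x."""
    return 0.1 * x_value**4 - 2 * x_value**2 + x_value


def objective_gradient(x_value: float) -> float:
    """Derivative f'(x) = 0.4*x^3 - 4*x + 1."""
    return 0.4 * x_value**3 - 4 * x_value + 1
```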
Raises:
    ValueError: If parameters and gradients have different shapes
"""

def _check_and_update_recursive(params, grads):
Please provide return type hint for the function: _check_and_update_recursive. If the function does not return a value, please provide the type hint as: def function() -> None:
Please provide type hint for the parameter: params
Please provide type hint for the parameter: grads
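A sketch of how a fully annotated recursive SGD helper could look; the nested-list handling and the recursive type alias are assumptions based only on the helper's name and the documented ValueError:

```python
Nested = list["float | Nested"]  # floats or arbitrarily nested lists of floats


def _check_and_update_recursive(
    params: Nested, grads: Nested, learning_rate: float = 0.01
) -> Nested:
    """Recursively apply a plain SGD step, validating shapes level by level."""
    if len(params) != len(grads):
        raise ValueError("parameters and gradients have different shapes")
    updated: Nested = []
    for param, grad in zip(params, grads):
        if isinstance(param, list):
            updated.append(_check_and_update_recursive(param, grad, learning_rate))
        else:
            updated.append(param - learning_rate * grad)
    return updated
```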
[pre-commit.ci] auto fixes from pre-commit.com hooks; for more information, see https://pre-commit.ci
Neural Network Optimizers Module
This PR introduces a comprehensive neural network optimizers module that implements five widely used optimization algorithms for machine learning and deep learning. The primary goal is to enhance the educational value of the repository by including well-documented, tested, and modular implementations.
Fixes #13662
What's Added
- BaseOptimizer for a consistent interface

Technical Details
Algorithms Implemented:
Directory Structure:
neural_network/optimizers/
├── __init__.py # Package initialization
├── README.md # Comprehensive documentation
├── base_optimizer.py # Abstract base class
├── sgd.py # Stochastic Gradient Descent
├── momentum_sgd.py # SGD with Momentum
├── nag.py # Nesterov Accelerated Gradient
├── adagrad.py # Adagrad optimizer
├── adam.py # Adam optimizer
├── test_optimizers.py # Comprehensive test suite
└── IMPLEMENTATION_SUMMARY.md # Technical implementation details
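Only the file layout is visible in this description, so the following is a hedged sketch of how base_optimizer.py and sgd.py could fit together; the update method name and constructor signature are assumptions, not the PR's actual API:

```python
from abc import ABC, abstractmethod


class BaseOptimizer(ABC):
    """Shared interface: every optimizer exposes a single update step."""

    def __init__(self, learning_rate: float = 0.01) -> None:
        if learning_rate <= 0:
            raise ValueError("learning_rate must be positive")
        self.learning_rate = learning_rate

    @abstractmethod
    def update(self, params: list[float], grads: list[float]) -> list[float]:
        """Return parameters after one optimization step."""


class SGD(BaseOptimizer):
    """Plain stochastic gradient descent: theta <- theta - lr * grad."""

    def update(self, params: list[float], grads: list[float]) -> list[float]:
        if len(params) != len(grads):
            raise ValueError("parameters and gradients have different shapes")
        return [p - self.learning_rate * g for p, g in zip(params, grads)]


if __name__ == "__main__":
    # Minimize f(x) = x^2 (gradient 2x), starting from x = 5.0.
    optimizer = SGD(learning_rate=0.1)
    x = [5.0]
    for _ in range(50):
        x = optimizer.update(x, [2 * x[0]])
    print(round(x[0], 4))  # approaches 0.0
```

Keeping the abstract class this thin lets each optimizer file add only its own state (velocity, accumulated gradients, moment estimates) while sharing the learning-rate validation.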
Testing Coverage
Describe Your Change
Checklist
Fixes #13662.

Notes for maintainers and reviewers
[ ] to [x], and save. Only [x] is accepted to mark completion.