Skip to content

Comments

Neural network optimizers module#13689

Closed
shretadas wants to merge 2 commits intoTheAlgorithms:masterfrom
shretadas:master
Closed

Neural network optimizers module#13689
shretadas wants to merge 2 commits intoTheAlgorithms:masterfrom
shretadas:master

Conversation

@shretadas
Copy link

Neural Network Optimizers Module

This PR adds a comprehensive neural network optimizers module implementing 5 standard optimization algorithms used in machine learning and deep learning.
Fixes #13662

What's Added:

  • Add SGD (Stochastic Gradient Descent) optimizer
  • Add MomentumSGD with momentum acceleration
  • Add NAG (Nesterov Accelerated Gradient) optimizer
  • Add Adagrad with adaptive learning rates
  • Add Adam optimizer combining momentum and RMSprop
  • Include comprehensive doctests (61 tests, all passing)
  • Add abstract BaseOptimizer for consistent interface
  • Include detailed mathematical documentation
  • Add educational examples and performance comparisons
  • Follow repository guidelines: type hints, error handling, pure Python

Implements standard optimization algorithms for neural network training with educational focus and comprehensive testing coverage.

Technical Details:

Algorithms Implemented:

  • SGD: θ = θ - α∇θ (basic gradient descent)
  • MomentumSGD: v = βv + (1-β)∇θ, θ = θ - αv
  • NAG: Uses lookahead gradients for better convergence
  • Adagrad: Adaptive learning rates per parameter
  • Adam: Combines momentum + adaptive learning rates

Files Added:

neural_network/optimizers/
├── init.py # Package initialization
├── README.md # Comprehensive documentation
├── base_optimizer.py # Abstract base class
├── sgd.py # Stochastic Gradient Descent
├── momentum_sgd.py # SGD with Momentum
├── nag.py # Nesterov Accelerated Gradient
├── adagrad.py # Adagrad optimizer
├── adam.py # Adam optimizer
├── test_optimizers.py # Comprehensive test suite
└── IMPLEMENTATION_SUMMARY.md # Technical implementation details

Testing Coverage:

  • 61 comprehensive doctests (100% pass rate)
  • Error handling for all edge cases
  • Multi-dimensional parameter support
  • Performance comparison examples

Describe your change:

  • Add an algorithm?
  • Fix a bug or typo in an existing algorithm?
  • Add or change doctests? -- Note: Please avoid changing both code and tests in a single pull request.
  • Documentation change?

Checklist:

  • I have read CONTRIBUTING.md.
  • This pull request is all my own work -- I have not plagiarized.
  • I know that pull requests will not be merged if they fail the automated tests.
  • This PR only changes one algorithm file. To ease review, please open separate PRs for separate algorithms.
  • All new Python files are placed inside an existing directory.
  • All filenames are in all lowercase characters with no spaces or dashes.
  • All functions and variable names follow Python naming conventions.
  • All function parameters and return values are annotated with Python type hints.
  • All functions have doctests that pass the automated testing.
  • All new algorithms include at least one URL that points to Wikipedia or another similar explanation.
  • If this pull request resolves one or more open issues then the description above includes the issue number(s) with a closing keyword: "Fixes #ISSUE-13662".

- Add SGD (Stochastic Gradient Descent) optimizer
- Add MomentumSGD with momentum acceleration
- Add NAG (Nesterov Accelerated Gradient) optimizer
- Add Adagrad with adaptive learning rates
- Add Adam optimizer combining momentum and RMSprop
- Include comprehensive doctests (61 tests, all passing)
- Add abstract BaseOptimizer for consistent interface
- Include detailed mathematical documentation
- Add educational examples and performance comparisons
- Follow repository guidelines: type hints, error handling, pure Python

Implements standard optimization algorithms for neural network training
with educational focus and comprehensive testing coverage.
@algorithms-keeper algorithms-keeper bot added documentation This PR modified documentation files require descriptive names This PR needs descriptive function and/or variable names require type hints https://docs.python.org/3/library/typing.html labels Oct 22, 2025
Copy link

@algorithms-keeper algorithms-keeper bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Click here to look at the relevant links ⬇️

🔗 Relevant Links

Repository:

Python:

Automated review generated by algorithms-keeper. If there's any problem regarding this review, please open an issue about it.

algorithms-keeper commands and options

algorithms-keeper actions can be triggered by commenting on this PR:

  • @algorithms-keeper review to trigger the checks for only added pull request files
  • @algorithms-keeper review-all to trigger the checks for all the pull request files, including the modified files. As we cannot post review comments on lines not part of the diff, this command will post all the messages in one comment.

NOTE: Commands are in beta and so this feature is restricted only to a member or owner of the organization.

Raises:
ValueError: If parameters and gradients have different shapes
"""
def _adagrad_update_recursive(params, grads, acc_grads):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please provide return type hint for the function: _adagrad_update_recursive. If the function does not return a value, please provide the type hint as: def function() -> None:

Please provide type hint for the parameter: params

Please provide type hint for the parameter: grads

Please provide type hint for the parameter: acc_grads

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

def _check_and_update_recursive(
params: list[float],
grads: list[float]
) -> list[float]:
...

bias_correction1 = 1 - self.beta1 ** self._time_step
bias_correction2 = 1 - self.beta2 ** self._time_step

def _adam_update_recursive(params, grads, first_moment, second_moment):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please provide return type hint for the function: _adam_update_recursive. If the function does not return a value, please provide the type hint as: def function() -> None:

Please provide type hint for the parameter: params

Please provide type hint for the parameter: grads

Please provide type hint for the parameter: first_moment

Please provide type hint for the parameter: second_moment

x_adagrad = [-1.0, 1.0]
x_adam = [-1.0, 1.0]

def rosenbrock(x, y):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please provide return type hint for the function: rosenbrock. If the function does not return a value, please provide the type hint as: def function() -> None:

Please provide descriptive name for the parameter: x

Please provide type hint for the parameter: x

Please provide descriptive name for the parameter: y

Please provide type hint for the parameter: y

"""Rosenbrock function: f(x,y) = 100*(y-x²)² + (1-x)²"""
return 100 * (y - x*x)**2 + (1 - x)**2

def rosenbrock_gradient(x, y):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please provide return type hint for the function: rosenbrock_gradient. If the function does not return a value, please provide the type hint as: def function() -> None:

Please provide descriptive name for the parameter: x

Please provide type hint for the parameter: x

Please provide descriptive name for the parameter: y

Please provide type hint for the parameter: y

Raises:
ValueError: If parameters and gradients have different shapes
"""
def _check_shapes_and_get_velocity(params, grads, velocity):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please provide return type hint for the function: _check_shapes_and_get_velocity. If the function does not return a value, please provide the type hint as: def function() -> None:

Please provide type hint for the parameter: params

Please provide type hint for the parameter: grads

Please provide type hint for the parameter: velocity

Raises:
ValueError: If parameters and gradients have different shapes
"""
def _nag_update_recursive(params, grads, velocity):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please provide return type hint for the function: _nag_update_recursive. If the function does not return a value, please provide the type hint as: def function() -> None:

Please provide type hint for the parameter: params

Please provide type hint for the parameter: grads

Please provide type hint for the parameter: velocity

x_momentum = [2.5]
x_nag = [2.5]

def gradient_f(x):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please provide return type hint for the function: gradient_f. If the function does not return a value, please provide the type hint as: def function() -> None:

Please provide descriptive name for the parameter: x

Please provide type hint for the parameter: x

"""Gradient of f(x) = 0.1*x^4 - 2*x^2 + x is f'(x) = 0.4*x^3 - 4*x + 1"""
return 0.4 * x**3 - 4 * x + 1

def f(x):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please provide descriptive name for the function: f

Please provide return type hint for the function: f. If the function does not return a value, please provide the type hint as: def function() -> None:

Please provide descriptive name for the parameter: x

Please provide type hint for the parameter: x

Raises:
ValueError: If parameters and gradients have different shapes
"""
def _check_and_update_recursive(params, grads):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please provide return type hint for the function: _check_and_update_recursive. If the function does not return a value, please provide the type hint as: def function() -> None:

Please provide type hint for the parameter: params

Please provide type hint for the parameter: grads

@algorithms-keeper algorithms-keeper bot added the awaiting reviews This PR is ready to be reviewed label Oct 22, 2025
@algorithms-keeper algorithms-keeper bot added the tests are failing Do not merge until tests pass label Oct 22, 2025
@shretadas shretadas closed this Oct 22, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

awaiting reviews This PR is ready to be reviewed documentation This PR modified documentation files require descriptive names This PR needs descriptive function and/or variable names require type hints https://docs.python.org/3/library/typing.html tests are failing Do not merge until tests pass

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add neural network optimizers module to enhance training capabilities

1 participant