
Commit af03ccb (1 parent: db78eac)

[pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Every change in this commit appears to be whitespace-only: the pre-commit hooks stripped trailing whitespace, which is why each removed line is re-added with identical visible content.

File tree: 10 files changed, +527 -458 lines (three of the changed files are shown below).

Diffs are shown in unified format: a leading space marks a context line, "-" marks a removed line, and "+" marks an added line.

neural_network/optimizers/IMPLEMENTATION_SUMMARY.md (13 additions, 13 deletions)
@@ -2,8 +2,8 @@
 
 ## 🎯 Feature Request Implementation
 
-**Issue:** "Add neural network optimizers module to enhance training capabilities"
-**Requested by:** @Adhithya-Laxman
+**Issue:** "Add neural network optimizers module to enhance training capabilities"
+**Requested by:** @Adhithya-Laxman
 **Status:** **COMPLETED**
 
 ## 📦 What Was Implemented
@@ -15,7 +15,7 @@ neural_network/optimizers/
 ├── base_optimizer.py   # Abstract base class for all optimizers
 ├── sgd.py              # Stochastic Gradient Descent
 ├── momentum_sgd.py     # SGD with Momentum
-├── nag.py              # Nesterov Accelerated Gradient
+├── nag.py              # Nesterov Accelerated Gradient
 ├── adagrad.py          # Adaptive Gradient Algorithm
 ├── adam.py             # Adaptive Moment Estimation
 ├── README.md           # Comprehensive documentation
@@ -28,7 +28,7 @@ neural_network/optimizers/
    - Basic gradient descent: `θ = θ - α * g`
    - Foundation for understanding optimization
 
-2. **MomentumSGD**
+2. **MomentumSGD**
    - Adds momentum for acceleration: `v = β*v + (1-β)*g; θ = θ - α*v`
    - Reduces oscillations and speeds convergence
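As an aside that is not part of the commit: the two update rules quoted in this hunk translate directly into a few lines of Python. The sketch below is illustrative only; the function names `sgd_step` and `momentum_step` are invented for the example and do not mirror the module's class-based `BaseOptimizer` interface.

```python
# Illustrative sketch of the SGD and momentum update rules quoted above.
# Hypothetical function names; the actual module wraps these rules in classes.


def sgd_step(params: list[float], grads: list[float], lr: float = 0.01) -> list[float]:
    """θ = θ - α * g"""
    return [p - lr * g for p, g in zip(params, grads)]


def momentum_step(
    params: list[float],
    grads: list[float],
    velocity: list[float],
    lr: float = 0.01,
    beta: float = 0.9,
) -> tuple[list[float], list[float]]:
    """v = β*v + (1-β)*g;  θ = θ - α*v"""
    velocity = [beta * v + (1 - beta) * g for v, g in zip(velocity, grads)]
    return [p - lr * v for p, v in zip(params, velocity)], velocity


if __name__ == "__main__":
    params, velocity = [5.0], [0.0]
    for _ in range(100):
        grads = [2 * params[0]]  # gradient of f(x) = x², minimum at x = 0
        params, velocity = momentum_step(params, grads, velocity)
    print(params)  # approaches [0.0]
```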

@@ -52,7 +52,7 @@ neural_network/optimizers/
 - **Type Safety**: Full type hints throughout (`typing`, `Union`, `List`)
 - **Educational Focus**: Clear mathematical formulations in docstrings
 - **Comprehensive Testing**: Doctests + example scripts
-- **Consistent Interface**: All inherit from `BaseOptimizer`
+- **Consistent Interface**: All inherit from `BaseOptimizer`
 - **Error Handling**: Proper validation and meaningful error messages
 
 ### 📝 Code Quality Features
@@ -80,21 +80,21 @@ The implementation was validated on multiple test problems:
 - Momentum accelerates convergence but can overshoot
 - Adam provides robust performance with adaptive learning
 
-### Multi-dimensional (f(x,y) = x² + 10y²)
+### Multi-dimensional (f(x,y) = x² + 10y²)
 - Tests adaptation to different parameter scales
 - Adagrad and Adam handle scale differences well
 - Momentum methods show improved stability
 
 ### Rosenbrock Function (Non-convex)
-- Classic challenging optimization benchmark
+- Classic challenging optimization benchmark
 - Adam significantly outperformed other methods
 - Demonstrates real-world applicability
 
 ## 🎯 Educational Value
 
 ### Progressive Complexity
 1. **SGD**: Foundation - understand basic gradient descent
-2. **Momentum**: Build intuition for acceleration methods
+2. **Momentum**: Build intuition for acceleration methods
 3. **NAG**: Learn about lookahead and overshoot correction
 4. **Adagrad**: Understand adaptive learning rates
 5. **Adam**: See how modern optimizers combine techniques
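For context (not part of the commit): the benchmark surfaces named in this hunk are easy to write down. Below is a minimal sketch of the poorly scaled quadratic and the Rosenbrock function with their analytic gradients; the repository's actual test harness may differ.

```python
# Benchmark functions referenced in the summary above; illustrative sketch only.


def quadratic_2d(x: float, y: float) -> float:
    """f(x, y) = x² + 10y², a bowl that is much steeper in y than in x."""
    return x**2 + 10 * y**2


def quadratic_2d_grad(x: float, y: float) -> list[float]:
    """∇f = (2x, 20y)."""
    return [2 * x, 20 * y]


def rosenbrock(x: float, y: float, a: float = 1.0, b: float = 100.0) -> float:
    """f(x, y) = (a - x)² + b(y - x²)², non-convex with a narrow curved valley."""
    return (a - x) ** 2 + b * (y - x**2) ** 2


def rosenbrock_grad(x: float, y: float, a: float = 1.0, b: float = 100.0) -> list[float]:
    """∂f/∂x = -2(a - x) - 4bx(y - x²);  ∂f/∂y = 2b(y - x²)."""
    return [-2 * (a - x) - 4 * b * x * (y - x**2), 2 * b * (y - x**2)]
```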
@@ -106,7 +106,7 @@ The implementation was validated on multiple test problems:
 
 ### Code Patterns
 - Abstract base classes and inheritance
-- Recursive algorithms for nested data structures
+- Recursive algorithms for nested data structures
 - State management in optimization algorithms
 - Type safety in scientific computing
 
@@ -126,7 +126,7 @@ from neural_network.optimizers import SGD, Adam, Adagrad
 
 optimizers = {
     "sgd": SGD(0.01),
-    "adam": Adam(0.001),
+    "adam": Adam(0.001),
     "adagrad": Adagrad(0.01)
 }
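As a usage note that is not part of the commit: the hunk above constructs each optimizer with a learning rate, and the hunk header further down (`updated = optimizer.update(params_2d, grads_2d)`) shows the shared `update(params, grads)` method. Here is a hedged sketch of how the dictionary might be driven, assuming the package from this commit is importable; the target function and step count are arbitrary choices for the example.

```python
# Hypothetical driver for the optimizer dictionary shown in the diff above.
from neural_network.optimizers import SGD, Adam, Adagrad

optimizers = {
    "sgd": SGD(0.01),
    "adam": Adam(0.001),
    "adagrad": Adagrad(0.01),
}

# Minimize f(x) = x² from the same starting point with every optimizer.
positions = {name: [5.0] for name in optimizers}

for step in range(100):
    for name, optimizer in optimizers.items():
        grads = [2 * positions[name][0]]  # ∇f(x) = 2x
        positions[name] = optimizer.update(positions[name], grads)

for name, pos in positions.items():
    print(f"{name}: x = {pos[0]:.4f}")
```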

@@ -147,7 +147,7 @@ updated = optimizer.update(params_2d, grads_2d)
 
 ### For the Repository
 - **Gap Filled**: Addresses missing neural network optimization algorithms
-- **Educational Value**: High-quality learning resource for ML students
+- **Educational Value**: High-quality learning resource for ML students
 - **Code Quality**: Demonstrates best practices in scientific Python
 - **Completeness**: Makes the repo more comprehensive for ML learning
 
@@ -163,7 +163,7 @@ The modular design makes it easy to add more optimizers:
 
 ### Future Additions Could Include
 - **RMSprop**: Another popular adaptive optimizer
-- **AdamW**: Adam with decoupled weight decay
+- **AdamW**: Adam with decoupled weight decay
 - **LAMB**: Layer-wise Adaptive Moments optimizer
 - **Muon**: Advanced Newton-Schulz orthogonalization method
 - **Learning Rate Schedulers**: Time-based adaptation
@@ -185,7 +185,7 @@ class NewOptimizer(BaseOptimizer):
 - **Incremental Complexity**: SGD → Momentum → NAG → Adagrad → Adam
 - **Documentation**: Comprehensive docstrings and README
 - **Type Hints**: Full type safety throughout
-- **Testing**: Doctests + comprehensive test suite
+- **Testing**: Doctests + comprehensive test suite
 - **Educational Value**: Clear explanations and examples
 
 ### Additional Value Delivered
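The `@@ -185,7` hunk header shows the extension pattern the summary documents: subclass `BaseOptimizer` and implement the update step. The sketch below is a hedged illustration of that pattern; the constructor signature and the `learning_rate` attribute are assumptions inferred from the usage examples in this diff and may not match `base_optimizer.py` exactly.

```python
# Hypothetical example of adding a new optimizer by subclassing BaseOptimizer.
# The base-class constructor and attribute names are assumed, not confirmed.
from neural_network.optimizers.base_optimizer import BaseOptimizer


class SignSGD(BaseOptimizer):
    """Toy optimizer that steps by the sign of each gradient component."""

    def __init__(self, learning_rate: float = 0.01) -> None:
        super().__init__(learning_rate)  # assumed BaseOptimizer.__init__ signature

    def update(self, params: list[float], grads: list[float]) -> list[float]:
        # θ = θ - α * sign(g)
        return [
            p - self.learning_rate * ((g > 0) - (g < 0))
            for p, g in zip(params, grads)
        ]
```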

neural_network/optimizers/README.md (5 additions, 5 deletions)
@@ -14,7 +14,7 @@ The most basic optimizer that updates parameters in the direction opposite to th
 ### 2. MomentumSGD (SGD with Momentum)
 Adds a momentum term that accumulates past gradients to accelerate convergence and reduce oscillations.
 
-**Update Rule:**
+**Update Rule:**
 ```
 v = β * v + (1-β) * g
 θ = θ - α * v
@@ -97,10 +97,10 @@ x_adam = [5.0]
 for i in range(20):
     grad_sgd = [gradient_quadratic(x_sgd[0])]
     grad_adam = [gradient_quadratic(x_adam[0])]
-
+
     x_sgd = sgd.update(x_sgd, grad_sgd)
     x_adam = adam.update(x_adam, grad_adam)
-
+
     print(f"Step {i+1}: SGD={x_sgd[0]:.4f}, Adam={x_adam[0]:.4f}")
 ```
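To actually run the comparison shown in that hunk, the definitions above line 97 of the README (not visible in this diff) are needed. Below is a hedged, self-contained reconstruction; `gradient_quadratic`, the learning rates, and the SGD starting point are plausible guesses, and only the loop body comes from the diff.

```python
# Self-contained version of the SGD-vs-Adam comparison from the README hunk.
# gradient_quadratic, the learning rates, and x_sgd's start are assumptions.
from neural_network.optimizers import SGD, Adam


def gradient_quadratic(x: float) -> float:
    """Gradient of f(x) = x², whose minimum is at x = 0."""
    return 2 * x


sgd = SGD(0.1)
adam = Adam(0.1)

x_sgd = [5.0]
x_adam = [5.0]  # starting point shown in the hunk header above

for i in range(20):
    grad_sgd = [gradient_quadratic(x_sgd[0])]
    grad_adam = [gradient_quadratic(x_adam[0])]

    x_sgd = sgd.update(x_sgd, grad_sgd)
    x_adam = adam.update(x_adam, grad_adam)

    print(f"Step {i+1}: SGD={x_sgd[0]:.4f}, Adam={x_adam[0]:.4f}")
```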

@@ -153,7 +153,7 @@ for step in range(100):
         x, y = positions[name]
         grad = rosenbrock_grad(x, y)
         positions[name] = optimizer.update(positions[name], grad)
-
+
     if step % 20 == 19:
         print(f"\nStep {step + 1}:")
         for name, pos in positions.items():
@@ -209,7 +209,7 @@ where `f(θ)` is typically a loss function and `θ` represents the parameters of
 The optimizers differ in how they use gradient information `g = ∇f(θ)` to update parameters:
 
 1. **SGD** uses gradients directly
-2. **Momentum** accumulates gradients over time
+2. **Momentum** accumulates gradients over time
 3. **NAG** uses lookahead to reduce overshooting
 4. **Adagrad** adapts learning rates based on gradient history
 5. **Adam** combines momentum with adaptive learning rates
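The list above names the ideas without spelling out Adam's rule, the only one not quoted anywhere in this diff. For reference, here is the standard bias-corrected Adam update as a hedged sketch; the repository's `adam.py` may choose different defaults or variable names.

```python
# Standard Adam update for one step t (1-indexed); illustrative sketch only.
# m and v carry the running first and second moments between calls.


def adam_step(
    params: list[float],
    grads: list[float],
    m: list[float],
    v: list[float],
    t: int,
    lr: float = 0.001,
    beta1: float = 0.9,
    beta2: float = 0.999,
    eps: float = 1e-8,
) -> list[float]:
    new_params = []
    for i, (p, g) in enumerate(zip(params, grads)):
        m[i] = beta1 * m[i] + (1 - beta1) * g       # first moment (momentum)
        v[i] = beta2 * v[i] + (1 - beta2) * g * g   # second moment (adaptive scale)
        m_hat = m[i] / (1 - beta1**t)               # bias correction
        v_hat = v[i] / (1 - beta2**t)
        new_params.append(p - lr * m_hat / (v_hat**0.5 + eps))
    return new_params
```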

neural_network/optimizers/__init__.py (2 additions, 2 deletions)
@@ -8,7 +8,7 @@
 Available optimizers:
 - SGD: Stochastic Gradient Descent
 - MomentumSGD: SGD with momentum
-- NAG: Nesterov Accelerated Gradient
+- NAG: Nesterov Accelerated Gradient
 - Adagrad: Adaptive Gradient Algorithm
 - Adam: Adaptive Moment Estimation
 
@@ -21,4 +21,4 @@
 from .adagrad import Adagrad
 from .adam import Adam
 
-__all__ = ["SGD", "MomentumSGD", "NAG", "Adagrad", "Adam"]
+__all__ = ["SGD", "MomentumSGD", "NAG", "Adagrad", "Adam"]
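Not part of the commit, but a quick sanity check of the public API this `__init__.py` exports, assuming the repository root is on `sys.path`:

```python
# Confirm that every name advertised in __all__ resolves to an optimizer class.
import neural_network.optimizers as optimizers

for name in optimizers.__all__:  # ["SGD", "MomentumSGD", "NAG", "Adagrad", "Adam"]
    print(name, getattr(optimizers, name))
```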
