
Commit f543fac

Merge branch 'master' into feat-lru-cache-implementation
2 parents: 60afa57 + 3cea941

File tree: 12 files changed (+220 −37 lines)

.github/workflows/build.yml

Lines changed: 2 additions & 7 deletions
```diff
@@ -9,13 +9,7 @@ jobs:
   build:
     runs-on: ubuntu-latest
     steps:
-      - run:
-          sudo apt-get update && sudo apt-get install -y libtiff5-dev libjpeg8-dev libopenjp2-7-dev
-          zlib1g-dev libfreetype6-dev liblcms2-dev libwebp-dev tcl8.6-dev tk8.6-dev python3-tk
-          libharfbuzz-dev libfribidi-dev libxcb1-dev
-          libxml2-dev libxslt-dev
-          libhdf5-dev
-          libopenblas-dev
+      - run: sudo apt-get update && sudo apt-get install -y libhdf5-dev
       - uses: actions/checkout@v5
       - uses: astral-sh/setup-uv@v7
         with:
@@ -32,6 +26,7 @@ jobs:
           --ignore=computer_vision/cnn_classification.py
           --ignore=docs/conf.py
           --ignore=dynamic_programming/k_means_clustering_tensorflow.py
+          --ignore=machine_learning/local_weighted_learning/local_weighted_learning.py
           --ignore=machine_learning/lstm/lstm_prediction.py
           --ignore=neural_network/input_data.py
           --ignore=project_euler/
```

CONTRIBUTING.md

Lines changed: 1 addition & 1 deletion
````diff
@@ -99,7 +99,7 @@ We want your work to be readable by others; therefore, we encourage you to note
   ruff check
   ```
 
-- Original code submission require docstrings or comments to describe your work.
+- Original code submissions require docstrings or comments to describe your work.
 
 - More on docstrings and comments:
````

DIRECTORY.md

Lines changed: 2 additions & 0 deletions
```diff
@@ -195,6 +195,7 @@
   * [Permutations](data_structures/arrays/permutations.py)
   * [Prefix Sum](data_structures/arrays/prefix_sum.py)
   * [Product Sum](data_structures/arrays/product_sum.py)
+  * [Rotate Array](data_structures/arrays/rotate_array.py)
   * [Sparse Table](data_structures/arrays/sparse_table.py)
   * [Sudoku Solver](data_structures/arrays/sudoku_solver.py)
   * Binary Tree
@@ -623,6 +624,7 @@
   * [Sequential Minimum Optimization](machine_learning/sequential_minimum_optimization.py)
   * [Similarity Search](machine_learning/similarity_search.py)
   * [Support Vector Machines](machine_learning/support_vector_machines.py)
+  * [T Stochastic Neighbour Embedding](machine_learning/t_stochastic_neighbour_embedding.py)
   * [Word Frequency Functions](machine_learning/word_frequency_functions.py)
   * [Xgboost Classifier](machine_learning/xgboost_classifier.py)
   * [Xgboost Regressor](machine_learning/xgboost_regressor.py)
```

machine_learning/t_stochastic_neighbour_embedding.py

Lines changed: 178 additions & 0 deletions

```python
"""
t-distributed stochastic neighbor embedding (t-SNE)

For more details, see:
https://en.wikipedia.org/wiki/T-distributed_stochastic_neighbor_embedding
"""

import doctest

import numpy as np
from numpy import ndarray
from sklearn.datasets import load_iris


def collect_dataset() -> tuple[ndarray, ndarray]:
    """
    Load the Iris dataset and return features and labels.

    Returns:
        tuple[ndarray, ndarray]: Feature matrix and target labels.

    >>> features, targets = collect_dataset()
    >>> features.shape
    (150, 4)
    >>> targets.shape
    (150,)
    """
    iris_dataset = load_iris()
    return np.array(iris_dataset.data), np.array(iris_dataset.target)


def compute_pairwise_affinities(data_matrix: ndarray, sigma: float = 1.0) -> ndarray:
    """
    Compute high-dimensional affinities (P matrix) using a Gaussian kernel.

    Args:
        data_matrix: Input data of shape (n_samples, n_features).
        sigma: Gaussian kernel bandwidth.

    Returns:
        ndarray: Symmetrized probability matrix.

    >>> x = np.array([[0.0, 0.0], [1.0, 0.0]])
    >>> probabilities = compute_pairwise_affinities(x)
    >>> float(round(probabilities[0, 1], 3))
    0.25
    """
    n_samples = data_matrix.shape[0]
    # Pairwise squared distances via ||a - b||^2 = ||a||^2 - 2 a.b + ||b||^2.
    squared_sum = np.sum(np.square(data_matrix), axis=1)
    squared_distance = np.add(
        np.add(-2 * np.dot(data_matrix, data_matrix.T), squared_sum).T, squared_sum
    )

    affinity_matrix = np.exp(-squared_distance / (2 * sigma**2))
    np.fill_diagonal(affinity_matrix, 0)

    affinity_matrix /= np.sum(affinity_matrix)
    return (affinity_matrix + affinity_matrix.T) / (2 * n_samples)


def compute_low_dim_affinities(embedding_matrix: ndarray) -> tuple[ndarray, ndarray]:
    """
    Compute low-dimensional affinities (Q matrix) using a Student-t distribution.

    Args:
        embedding_matrix: Low-dimensional embedding of shape (n_samples, n_components).

    Returns:
        tuple[ndarray, ndarray]: (Q probability matrix, numerator matrix).

    >>> y = np.array([[0.0, 0.0], [1.0, 0.0]])
    >>> q_matrix, numerators = compute_low_dim_affinities(y)
    >>> q_matrix.shape
    (2, 2)
    """
    squared_sum = np.sum(np.square(embedding_matrix), axis=1)
    # Student-t kernel with one degree of freedom: 1 / (1 + ||y_i - y_j||^2).
    numerator_matrix = 1 / (
        1
        + np.add(
            np.add(-2 * np.dot(embedding_matrix, embedding_matrix.T), squared_sum).T,
            squared_sum,
        )
    )
    np.fill_diagonal(numerator_matrix, 0)

    q_matrix = numerator_matrix / np.sum(numerator_matrix)
    return q_matrix, numerator_matrix


def apply_tsne(
    data_matrix: ndarray,
    n_components: int = 2,
    learning_rate: float = 200.0,
    n_iter: int = 500,
) -> ndarray:
    """
    Apply t-SNE for dimensionality reduction.

    Args:
        data_matrix: Original dataset (features).
        n_components: Target dimension (2D or 3D).
        learning_rate: Step size for gradient descent.
        n_iter: Number of iterations.

    Returns:
        ndarray: Low-dimensional embedding of the data.

    >>> features, _ = collect_dataset()
    >>> embedding = apply_tsne(features, n_components=2, n_iter=50)
    >>> embedding.shape
    (150, 2)
    """
    if n_components < 1 or n_iter < 1:
        raise ValueError("n_components and n_iter must be >= 1")

    n_samples = data_matrix.shape[0]
    rng = np.random.default_rng()
    embedding = rng.standard_normal((n_samples, n_components)) * 1e-4

    high_dim_affinities = compute_pairwise_affinities(data_matrix)
    high_dim_affinities = np.maximum(high_dim_affinities, 1e-12)

    embedding_increment = np.zeros_like(embedding)
    momentum = 0.5

    for iteration in range(n_iter):
        low_dim_affinities, numerator_matrix = compute_low_dim_affinities(embedding)
        low_dim_affinities = np.maximum(low_dim_affinities, 1e-12)

        affinity_diff = high_dim_affinities - low_dim_affinities

        # Gradient of KL(P || Q): 4 * sum_j (p_ij - q_ij) * num_ij * (y_i - y_j),
        # written row-wise as 4 * (rowsum(A) * y_i - (A @ Y)_i) with
        # A = affinity_diff * numerator_matrix.
        gradient = 4 * (
            np.multiply(
                np.sum(affinity_diff * numerator_matrix, axis=1)[:, np.newaxis],
                embedding,
            )
            - np.dot(affinity_diff * numerator_matrix, embedding)
        )

        embedding_increment = momentum * embedding_increment - learning_rate * gradient
        embedding += embedding_increment

        if iteration == int(n_iter / 4):
            momentum = 0.8

    return embedding


def main() -> None:
    """
    Run t-SNE on the Iris dataset and display the first 5 embeddings.

    >>> main()  # doctest: +ELLIPSIS
    t-SNE embedding (first 5 points):
    [[...
    """
    features, _labels = collect_dataset()
    embedding = apply_tsne(features, n_components=2, n_iter=300)

    if not isinstance(embedding, np.ndarray):
        raise TypeError("t-SNE embedding must be an ndarray")

    print("t-SNE embedding (first 5 points):")
    print(embedding[:5])

    # Optional visualization (Ruff/mypy compliant)
    # import matplotlib.pyplot as plt
    # plt.scatter(embedding[:, 0], embedding[:, 1], c=labels, cmap="viridis")
    # plt.title("t-SNE Visualization of the Iris Dataset")
    # plt.xlabel("Dimension 1")
    # plt.ylabel("Dimension 2")
    # plt.show()


if __name__ == "__main__":
    doctest.testmod()
    main()
```
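
For readers checking the implementation above against the literature, these are the standard t-SNE quantities it computes (textbook formulas from the cited Wikipedia article, not part of this commit; note that this implementation normalizes the Gaussian affinity matrix globally with a fixed bandwidth, rather than row-wise with per-point bandwidths as in the original paper):

```latex
% High-dimensional affinities (Gaussian kernel), then symmetrization over n points:
p_{ij} \propto \exp\!\left( -\frac{\lVert x_i - x_j \rVert^2}{2\sigma^2} \right),
\qquad
P \leftarrow \frac{P + P^{\top}}{2n}

% Low-dimensional affinities (Student-t with one degree of freedom):
q_{ij} = \frac{\left( 1 + \lVert y_i - y_j \rVert^2 \right)^{-1}}
              {\sum_{k \neq l} \left( 1 + \lVert y_k - y_l \rVert^2 \right)^{-1}}

% Gradient of C = KL(P || Q), applied with momentum and a fixed learning rate:
\frac{\partial C}{\partial y_i}
  = 4 \sum_j (p_{ij} - q_{ij}) (y_i - y_j)
    \left( 1 + \lVert y_i - y_j \rVert^2 \right)^{-1}
```

As a quick sanity check of the `compute_pairwise_affinities` doctest: for two points at distance 1, both off-diagonal Gaussian affinities equal e^{-1/2}; global normalization makes each 1/2, and symmetrizing with n = 2 gives (1/2 + 1/2) / (2 · 2) = 0.25.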

maths/factorial.py

Lines changed: 1 addition & 1 deletion
```diff
@@ -56,7 +56,7 @@ def factorial_recursive(n: int) -> int:
         raise ValueError("factorial() only accepts integral values")
     if n < 0:
         raise ValueError("factorial() not defined for negative values")
-    return 1 if n in {0, 1} else n * factorial(n - 1)
+    return 1 if n in {0, 1} else n * factorial_recursive(n - 1)
 
 
 if __name__ == "__main__":
```

maths/fibonacci.py

Lines changed: 1 addition & 1 deletion
```diff
@@ -183,7 +183,7 @@ def fib_memoization(n: int) -> list[int]:
     """
     if n < 0:
         raise ValueError("n is negative")
-    # Cache must be outside recursuive function
+    # Cache must be outside recursive function
     # other it will reset every time it calls itself.
     cache: dict[int, int] = {0: 0, 1: 1, 2: 1}  # Prefilled cache
```

maths/volume.py

Lines changed: 1 addition & 1 deletion
```diff
@@ -555,7 +555,7 @@ def main():
     print(f"Torus: {vol_torus(2, 2) = }")  # ~= 157.9
     print(f"Conical Frustum: {vol_conical_frustum(2, 2, 4) = }")  # ~= 58.6
     print(f"Spherical cap: {vol_spherical_cap(1, 2) = }")  # ~= 5.24
-    print(f"Spheres intersetion: {vol_spheres_intersect(2, 2, 1) = }")  # ~= 21.21
+    print(f"Spheres intersection: {vol_spheres_intersect(2, 2, 1) = }")  # ~= 21.21
     print(f"Spheres union: {vol_spheres_union(2, 2, 1) = }")  # ~= 45.81
     print(
         f"Hollow Circular Cylinder: {vol_hollow_circular_cylinder(1, 2, 3) = }"
```

pyproject.toml

Lines changed: 5 additions & 4 deletions
```diff
@@ -3,10 +3,9 @@ name = "thealgorithms-python"
 version = "0.0.1"
 description = "TheAlgorithms in Python"
 authors = [ { name = "TheAlgorithms Contributors" } ]
-requires-python = ">=3.13"
+requires-python = ">=3.14"
 classifiers = [
   "Programming Language :: Python :: 3 :: Only",
-  "Programming Language :: Python :: 3.13",
 ]
 dependencies = [
   "beautifulsoup4>=4.12.3",
@@ -23,6 +22,7 @@ dependencies = [
   "pillow>=11.3",
   "rich>=13.9.4",
   "scikit-learn>=1.5.2",
+  "scipy>=1.16.2",
   "sphinx-pyproject>=0.3",
   "statsmodels>=0.14.4",
   "sympy>=1.13.3",
@@ -48,7 +48,7 @@ euler-validate = [
 ]
 
 [tool.ruff]
-target-version = "py313"
+target-version = "py314"
 
 output-format = "full"
 lint.select = [
@@ -109,7 +109,7 @@ lint.ignore = [
   # `ruff rule S101` for a description of that rule
   "B904",  # Within an `except` clause, raise exceptions with `raise ... from err` -- FIX ME
   "B905",  # `zip()` without an explicit `strict=` parameter -- FIX ME
-  "EM101",  # Exception must not use a string literal, assign to variable first
+  "EM101",  # Exception must not use a string literal, assign to a variable first
   "EXE001",  # Shebang is present but file is not executable -- DO NOT FIX
   "G004",  # Logging statement uses f-string
   "ISC001",  # Conflicts with ruff format -- DO NOT FIX
@@ -125,6 +125,7 @@ lint.ignore = [
   "S311",  # Standard pseudo-random generators are not suitable for cryptographic purposes -- FIX ME
   "SIM905",  # Consider using a list literal instead of `str.split` -- DO NOT FIX
   "SLF001",  # Private member accessed: `_Iterator` -- FIX ME
+  "UP037",  # FIX ME
 ]
 
 lint.per-file-ignores."data_structures/hashing/tests/test_hash_map.py" = [
```

requirements.txt

Lines changed: 0 additions & 19 deletions
This file was deleted.

scripts/README.md

Lines changed: 27 additions & 0 deletions
```markdown
# Dealing with the onslaught of Hacktoberfest

* https://hacktoberfest.com

Each year, October brings a swarm of new contributors participating in Hacktoberfest. This event has its pros and cons, but it presents a monumental workload for the few active maintainers of this repo. The maintainer workload is further impacted by a new version of CPython being released in the first week of each October.

To help make our algorithms more valuable to visitors, our CONTRIBUTING.md file outlines several strict requirements, such as tests, type hints, descriptive names, and functions and/or classes. Maintainers reviewing pull requests should try to encourage improvements that meet these goals, but when the workload becomes overwhelming (esp. in October), pull requests that do not meet them should be closed.

Below are a few [`gh`](https://cli.github.com) scripts that close pull requests not matching the definition of an acceptable algorithm as defined in CONTRIBUTING.md. I tend to run these scripts in the following order.

* close_pull_requests_with_require_descriptive_names.sh
* close_pull_requests_with_require_tests.sh
* close_pull_requests_with_require_type_hints.sh
* close_pull_requests_with_failing_tests.sh
* close_pull_requests_with_awaiting_changes.sh
* find_git_conflicts.sh

### Run on 14 Oct 2025: 107 of 541 (19.77%) pull requests closed.

Script run | Open pull requests | Pull requests closed
--- | --- | ---
None | 541 | 0
require_descriptive_names | 515 | 26
require_tests | 498 | 17
require_type_hints | 496 | 2
failing_tests | 438 | ___58___
awaiting_changes | 434 | 4
git_conflicts | [ broken ] | 0
```
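
The shell scripts listed in this README are not included in the commit, so the following is only a rough sketch of the pattern such a script might follow, written in Python against the real `gh pr list` / `gh pr close` commands. The `meets_contributing_requirements` helper is a hypothetical stand-in for each script's actual policy (descriptive names, tests, type hints, and so on):

```python
"""Hypothetical sketch of a close_pull_requests_* script (assumes an
authenticated GitHub CLI; the actual scripts are not in this commit)."""

import json
import subprocess


def meets_contributing_requirements(pull_request: dict) -> bool:
    # Stand-in policy: the real scripts check for descriptive names, tests,
    # type hints, etc. Here we only flag one-word, non-descriptive titles.
    return len(pull_request["title"].split()) > 1


# List open pull requests as JSON (PR number and title only).
listing = subprocess.run(
    ["gh", "pr", "list", "--state", "open", "--limit", "500",
     "--json", "number,title"],
    capture_output=True, text=True, check=True,
)

for pull_request in json.loads(listing.stdout):
    if not meets_contributing_requirements(pull_request):
        # Close with a comment pointing the author at CONTRIBUTING.md.
        subprocess.run(
            ["gh", "pr", "close", str(pull_request["number"]),
             "--comment", "Closing: please see the CONTRIBUTING.md requirements."],
            check=True,
        )
```

Running one check per pass, as the ordered list above suggests, keeps each closure reason unambiguous for the PR author.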
