From fa1e96c10ad37d74d66155eb07c3a9f4381800f4 Mon Sep 17 00:00:00 2001 From: Johanmkr Date: Wed, 26 Feb 2025 07:55:19 +0100 Subject: [PATCH 1/7] Started own documentation. --- doc/Johan_page.md | 33 +++++++++++++++++++++++++++++++++ 1 file changed, 33 insertions(+) create mode 100644 doc/Johan_page.md diff --git a/doc/Johan_page.md b/doc/Johan_page.md new file mode 100644 index 0000000..fa7505b --- /dev/null +++ b/doc/Johan_page.md @@ -0,0 +1,33 @@ +# Individual task for Johan + +## My implementation tasks + +* Data: Implement [MNIST](../CollaborativeCoding/dataloaders/mnist_4_9.py) dataset with digits between 4-9. +* Model: [MLP-model](../CollaborativeCoding/models/johan_model.py/) with 4 hidden layers, each with 77 neurons and ReLU activation. +* Evaluation metric: [Precision](../CollaborativeCoding/metrics/precision.py). + +## Implementation choices + +### Dataset + +The choices regarding the dataset were mostly done in conjunction with Jan (@hzavadil98) as we were both using the MNIST dataset. Jan had the idea to download the binary files and construct the images from those. The group decided collaboratorily to make the package download the data once and store it for all of use to use. Hence the individual implementations are fairly similar, at least for the two MNIST dataloaders. Were it not for these individual tasks, there would have been one dataloader class, initialised with two separate ranges for labels 0-3 and 4-9. However, individual dataloaders had to be created to comply with the exam description. For my implementation, the labels had to be mapped to a range starting at 0: $(4-9) \to (0,5)$ since the cross-entropy loss function in PyTorch expect this range. + +## Experiences with running someone else's code + +## Experiences having someone else to run my code + +## I learned how to use these tools during this course + +### Git-stuff + +### WandB + +### Docker, Kubernetes and Springfield + +### Proper documentation + +### Nice ways of testing code + +### UV + + From de6d161a6e4a8dc6f1902869f192bfcf81a28bae Mon Sep 17 00:00:00 2001 From: Johanmkr Date: Wed, 26 Feb 2025 12:01:42 +0100 Subject: [PATCH 2/7] started writing my own documentation --- .gitignore | 2 ++ doc/Johan_page.md | 18 ++++++++++-------- 2 files changed, 12 insertions(+), 8 deletions(-) diff --git a/.gitignore b/.gitignore index d68c5ec..ae5af3a 100644 --- a/.gitignore +++ b/.gitignore @@ -21,6 +21,8 @@ local* formatting.x testrun.x storage/ +Makefile +job.yaml # Byte-compiled / optimized / DLL files __pycache__/ diff --git a/doc/Johan_page.md b/doc/Johan_page.md index fa7505b..f325fe6 100644 --- a/doc/Johan_page.md +++ b/doc/Johan_page.md @@ -1,22 +1,22 @@ # Individual task for Johan -## My implementation tasks +## My implementation tasks * Data: Implement [MNIST](../CollaborativeCoding/dataloaders/mnist_4_9.py) dataset with digits between 4-9. -* Model: [MLP-model](../CollaborativeCoding/models/johan_model.py/) with 4 hidden layers, each with 77 neurons and ReLU activation. +* Model: [MLP-model](../CollaborativeCoding/models/johan_model.py) with 4 hidden layers, each with 77 neurons and ReLU activation. * Evaluation metric: [Precision](../CollaborativeCoding/metrics/precision.py). ## Implementation choices -### Dataset +### Dataset -The choices regarding the dataset were mostly done in conjunction with Jan (@hzavadil98) as we were both using the MNIST dataset. Jan had the idea to download the binary files and construct the images from those. 
The group decided collaboratorily to make the package download the data once and store it for all of use to use. Hence the individual implementations are fairly similar, at least for the two MNIST dataloaders. Were it not for these individual tasks, there would have been one dataloader class, initialised with two separate ranges for labels 0-3 and 4-9. However, individual dataloaders had to be created to comply with the exam description. For my implementation, the labels had to be mapped to a range starting at 0: $(4-9) \to (0,5)$ since the cross-entropy loss function in PyTorch expect this range.
+The choices regarding the dataset were mostly done in conjunction with Jan (@hzavadil98) as we were both using the MNIST dataset. Jan had the idea to download the binary files and construct the images from those. The group decided collaboratively to make the package download the data once and store it for all of use to use. Hence, the individual implementations are fairly similar, at least for the two MNIST dataloaders. Were it not for these individual tasks, there would have been one dataloader class, initialised with two separate ranges for labels 0-3 and 4-9. However, individual dataloaders had to be created to comply with the exam description. For my implementation, the labels had to be mapped to a range starting at 0: $(4-9) \to (0,5)$ since the cross-entropy loss function in PyTorch expect this range.
 
-## Experiences with running someone else's code 
+## Experiences with running someone else's code
 
-## Experiences having someone else to run my code 
+## Experiences having someone else to run my code
 
-## I learned how to use these tools during this course 
+## I learned how to use these tools during this course
 
 ### Git-stuff
 
@@ -26,8 +26,10 @@ The choices regarding the dataset were mostly done in conjunction with Jan (@hza
 
 ### Proper documentation
 
-### Nice ways of testing code 
+### Nice ways of testing code
 
 ### UV
 
+## General thoughts on collaboration
+
+As someone new to this University and to how the IT setup works here, this project was very fruitful. The positive thing about working with peers who are highly skilled in the relevant field is that the learning curve is steep, and the take-home for me was significant. The con is that I constantly felt behind, spent half the time just understanding the changes implemented by others, and felt that my contributions to the overall project were less significant. However, working with skilled peers has boosted my understanding of how things work around here, especially git (fancier commands than add, commit, push and pull), as well as Docker, cluster operations, Kubernetes, and logging metrics on Weights and Biases.
From 421fe42f3c769d6e06aa96f7296f6b174e96d6d7 Mon Sep 17 00:00:00 2001 From: Johanmkr Date: Sun, 2 Mar 2025 13:34:13 +0100 Subject: [PATCH 3/7] Added some pyproject stuff --- pyproject.toml | 1 + uv.lock | 80 ++++++++++++++++++++++++++++++++++++++++++++++---- 2 files changed, 75 insertions(+), 6 deletions(-) diff --git a/pyproject.toml b/pyproject.toml index 15b0167..40a2ebb 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -12,6 +12,7 @@ dependencies = [ "numpy>=2.2.2", "pandas>=2.2.3", "pip>=25.0", + "pip-system-certs>=4.0", "pytest>=8.3.4", "ruff>=0.9.4", "scalene>=1.5.51", diff --git a/uv.lock b/uv.lock index 97ea764..83b5e4c 100644 --- a/uv.lock +++ b/uv.lock @@ -1,4 +1,5 @@ version = 1 +revision = 1 requires-python = ">=3.11.5" resolution-markers = [ "python_full_version >= '3.12' and sys_platform == 'linux'", @@ -332,6 +333,7 @@ dependencies = [ { name = "numpy" }, { name = "pandas" }, { name = "pip" }, + { name = "pip-system-certs" }, { name = "pytest" }, { name = "ruff" }, { name = "scalene" }, @@ -355,6 +357,7 @@ requires-dist = [ { name = "numpy", specifier = ">=2.2.2" }, { name = "pandas", specifier = ">=2.2.3" }, { name = "pip", specifier = ">=25.0" }, + { name = "pip-system-certs", specifier = ">=4.0" }, { name = "pytest", specifier = ">=8.3.4" }, { name = "ruff", specifier = ">=0.9.4" }, { name = "scalene", specifier = ">=1.5.51" }, @@ -1198,7 +1201,7 @@ name = "nvidia-cudnn-cu12" version = "9.1.0.70" source = { registry = "https://pypi.org/simple" } dependencies = [ - { name = "nvidia-cublas-cu12" }, + { name = "nvidia-cublas-cu12", marker = "sys_platform == 'linux'" }, ] wheels = [ { url = "https://files.pythonhosted.org/packages/9f/fd/713452cd72343f682b1c7b9321e23829f00b842ceaedcda96e742ea0b0b3/nvidia_cudnn_cu12-9.1.0.70-py3-none-manylinux2014_x86_64.whl", hash = "sha256:165764f44ef8c61fcdfdfdbe769d687e06374059fbb388b6c89ecb0e28793a6f", size = 664752741 }, @@ -1209,7 +1212,7 @@ name = "nvidia-cufft-cu12" version = "11.2.1.3" source = { registry = "https://pypi.org/simple" } dependencies = [ - { name = "nvidia-nvjitlink-cu12" }, + { name = "nvidia-nvjitlink-cu12", marker = "sys_platform == 'linux'" }, ] wheels = [ { url = "https://files.pythonhosted.org/packages/27/94/3266821f65b92b3138631e9c8e7fe1fb513804ac934485a8d05776e1dd43/nvidia_cufft_cu12-11.2.1.3-py3-none-manylinux2014_x86_64.whl", hash = "sha256:f083fc24912aa410be21fa16d157fed2055dab1cc4b6934a0e03cba69eb242b9", size = 211459117 }, @@ -1228,9 +1231,9 @@ name = "nvidia-cusolver-cu12" version = "11.6.1.9" source = { registry = "https://pypi.org/simple" } dependencies = [ - { name = "nvidia-cublas-cu12" }, - { name = "nvidia-cusparse-cu12" }, - { name = "nvidia-nvjitlink-cu12" }, + { name = "nvidia-cublas-cu12", marker = "sys_platform == 'linux'" }, + { name = "nvidia-cusparse-cu12", marker = "sys_platform == 'linux'" }, + { name = "nvidia-nvjitlink-cu12", marker = "sys_platform == 'linux'" }, ] wheels = [ { url = "https://files.pythonhosted.org/packages/3a/e1/5b9089a4b2a4790dfdea8b3a006052cfecff58139d5a4e34cb1a51df8d6f/nvidia_cusolver_cu12-11.6.1.9-py3-none-manylinux2014_x86_64.whl", hash = "sha256:19e33fa442bcfd085b3086c4ebf7e8debc07cfe01e11513cc6d332fd918ac260", size = 127936057 }, @@ -1241,7 +1244,7 @@ name = "nvidia-cusparse-cu12" version = "12.3.1.170" source = { registry = "https://pypi.org/simple" } dependencies = [ - { name = "nvidia-nvjitlink-cu12" }, + { name = "nvidia-nvjitlink-cu12", marker = "sys_platform == 'linux'" }, ] wheels = [ { url = 
"https://files.pythonhosted.org/packages/db/f7/97a9ea26ed4bbbfc2d470994b8b4f338ef663be97b8f677519ac195e113d/nvidia_cusparse_cu12-12.3.1.170-py3-none-manylinux2014_x86_64.whl", hash = "sha256:ea4f11a2904e2a8dc4b1833cc1b5181cde564edd0d5cd33e3c168eff2d1863f1", size = 207454763 }, @@ -1444,6 +1447,18 @@ wheels = [ { url = "https://files.pythonhosted.org/packages/85/8a/1ddf40be20103bcc605db840e9ade09c8e8c9f920a03e9cfe88eae97a058/pip-25.0-py3-none-any.whl", hash = "sha256:b6eb97a803356a52b2dd4bb73ba9e65b2ba16caa6bcb25a7497350a4e5859b65", size = 1841506 }, ] +[[package]] +name = "pip-system-certs" +version = "4.0" +source = { registry = "https://pypi.org/simple" } +dependencies = [ + { name = "wrapt" }, +] +sdist = { url = "https://files.pythonhosted.org/packages/27/9a/4e949d0a281c5dd45c8d5b02b03fe32044936234675e967de49317a1daee/pip_system_certs-4.0.tar.gz", hash = "sha256:db8e6a31388d9795ec9139957df1a89fa5274fb66164456fd091a5d3e94c350c", size = 5622 } +wheels = [ + { url = "https://files.pythonhosted.org/packages/70/82/78c30a18858d484acd13a3aea22ead89c66f200e118d1aa4b4bae392efee/pip_system_certs-4.0-py2.py3-none-any.whl", hash = "sha256:47202b9403a6f40783a9674bbc8873f5fc86544ec01a49348fa913e99e2ff68b", size = 6070 }, +] + [[package]] name = "platformdirs" version = "4.3.6" @@ -2704,3 +2719,56 @@ sdist = { url = "https://files.pythonhosted.org/packages/8a/98/2d9906746cdc6a6ef wheels = [ { url = "https://files.pythonhosted.org/packages/0b/2c/87f3254fd8ffd29e4c02732eee68a83a1d3c346ae39bc6822dcbcb697f2b/wheel-0.45.1-py3-none-any.whl", hash = "sha256:708e7481cc80179af0e556bbf0cc00b8444c7321e2700b8d8580231d13017248", size = 72494 }, ] + +[[package]] +name = "wrapt" +version = "1.17.2" +source = { registry = "https://pypi.org/simple" } +sdist = { url = "https://files.pythonhosted.org/packages/c3/fc/e91cc220803d7bc4db93fb02facd8461c37364151b8494762cc88b0fbcef/wrapt-1.17.2.tar.gz", hash = "sha256:41388e9d4d1522446fe79d3213196bd9e3b301a336965b9e27ca2788ebd122f3", size = 55531 } +wheels = [ + { url = "https://files.pythonhosted.org/packages/cd/f7/a2aab2cbc7a665efab072344a8949a71081eed1d2f451f7f7d2b966594a2/wrapt-1.17.2-cp311-cp311-macosx_10_9_universal2.whl", hash = "sha256:ff04ef6eec3eee8a5efef2401495967a916feaa353643defcc03fc74fe213b58", size = 53308 }, + { url = "https://files.pythonhosted.org/packages/50/ff/149aba8365fdacef52b31a258c4dc1c57c79759c335eff0b3316a2664a64/wrapt-1.17.2-cp311-cp311-macosx_10_9_x86_64.whl", hash = "sha256:4db983e7bca53819efdbd64590ee96c9213894272c776966ca6306b73e4affda", size = 38488 }, + { url = "https://files.pythonhosted.org/packages/65/46/5a917ce85b5c3b490d35c02bf71aedaa9f2f63f2d15d9949cc4ba56e8ba9/wrapt-1.17.2-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:9abc77a4ce4c6f2a3168ff34b1da9b0f311a8f1cfd694ec96b0603dff1c79438", size = 38776 }, + { url = "https://files.pythonhosted.org/packages/ca/74/336c918d2915a4943501c77566db41d1bd6e9f4dbc317f356b9a244dfe83/wrapt-1.17.2-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:0b929ac182f5ace000d459c59c2c9c33047e20e935f8e39371fa6e3b85d56f4a", size = 83776 }, + { url = "https://files.pythonhosted.org/packages/09/99/c0c844a5ccde0fe5761d4305485297f91d67cf2a1a824c5f282e661ec7ff/wrapt-1.17.2-cp311-cp311-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:f09b286faeff3c750a879d336fb6d8713206fc97af3adc14def0cdd349df6000", size = 75420 }, + { url = 
"https://files.pythonhosted.org/packages/b4/b0/9fc566b0fe08b282c850063591a756057c3247b2362b9286429ec5bf1721/wrapt-1.17.2-cp311-cp311-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:1a7ed2d9d039bd41e889f6fb9364554052ca21ce823580f6a07c4ec245c1f5d6", size = 83199 }, + { url = "https://files.pythonhosted.org/packages/9d/4b/71996e62d543b0a0bd95dda485219856def3347e3e9380cc0d6cf10cfb2f/wrapt-1.17.2-cp311-cp311-musllinux_1_2_aarch64.whl", hash = "sha256:129a150f5c445165ff941fc02ee27df65940fcb8a22a61828b1853c98763a64b", size = 82307 }, + { url = "https://files.pythonhosted.org/packages/39/35/0282c0d8789c0dc9bcc738911776c762a701f95cfe113fb8f0b40e45c2b9/wrapt-1.17.2-cp311-cp311-musllinux_1_2_i686.whl", hash = "sha256:1fb5699e4464afe5c7e65fa51d4f99e0b2eadcc176e4aa33600a3df7801d6662", size = 75025 }, + { url = "https://files.pythonhosted.org/packages/4f/6d/90c9fd2c3c6fee181feecb620d95105370198b6b98a0770cba090441a828/wrapt-1.17.2-cp311-cp311-musllinux_1_2_x86_64.whl", hash = "sha256:9a2bce789a5ea90e51a02dfcc39e31b7f1e662bc3317979aa7e5538e3a034f72", size = 81879 }, + { url = "https://files.pythonhosted.org/packages/8f/fa/9fb6e594f2ce03ef03eddbdb5f4f90acb1452221a5351116c7c4708ac865/wrapt-1.17.2-cp311-cp311-win32.whl", hash = "sha256:4afd5814270fdf6380616b321fd31435a462019d834f83c8611a0ce7484c7317", size = 36419 }, + { url = "https://files.pythonhosted.org/packages/47/f8/fb1773491a253cbc123c5d5dc15c86041f746ed30416535f2a8df1f4a392/wrapt-1.17.2-cp311-cp311-win_amd64.whl", hash = "sha256:acc130bc0375999da18e3d19e5a86403667ac0c4042a094fefb7eec8ebac7cf3", size = 38773 }, + { url = "https://files.pythonhosted.org/packages/a1/bd/ab55f849fd1f9a58ed7ea47f5559ff09741b25f00c191231f9f059c83949/wrapt-1.17.2-cp312-cp312-macosx_10_13_universal2.whl", hash = "sha256:d5e2439eecc762cd85e7bd37161d4714aa03a33c5ba884e26c81559817ca0925", size = 53799 }, + { url = "https://files.pythonhosted.org/packages/53/18/75ddc64c3f63988f5a1d7e10fb204ffe5762bc663f8023f18ecaf31a332e/wrapt-1.17.2-cp312-cp312-macosx_10_13_x86_64.whl", hash = "sha256:3fc7cb4c1c744f8c05cd5f9438a3caa6ab94ce8344e952d7c45a8ed59dd88392", size = 38821 }, + { url = "https://files.pythonhosted.org/packages/48/2a/97928387d6ed1c1ebbfd4efc4133a0633546bec8481a2dd5ec961313a1c7/wrapt-1.17.2-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:8fdbdb757d5390f7c675e558fd3186d590973244fab0c5fe63d373ade3e99d40", size = 38919 }, + { url = "https://files.pythonhosted.org/packages/73/54/3bfe5a1febbbccb7a2f77de47b989c0b85ed3a6a41614b104204a788c20e/wrapt-1.17.2-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:5bb1d0dbf99411f3d871deb6faa9aabb9d4e744d67dcaaa05399af89d847a91d", size = 88721 }, + { url = "https://files.pythonhosted.org/packages/25/cb/7262bc1b0300b4b64af50c2720ef958c2c1917525238d661c3e9a2b71b7b/wrapt-1.17.2-cp312-cp312-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:d18a4865f46b8579d44e4fe1e2bcbc6472ad83d98e22a26c963d46e4c125ef0b", size = 80899 }, + { url = "https://files.pythonhosted.org/packages/2a/5a/04cde32b07a7431d4ed0553a76fdb7a61270e78c5fd5a603e190ac389f14/wrapt-1.17.2-cp312-cp312-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:bc570b5f14a79734437cb7b0500376b6b791153314986074486e0b0fa8d71d98", size = 89222 }, + { url = "https://files.pythonhosted.org/packages/09/28/2e45a4f4771fcfb109e244d5dbe54259e970362a311b67a965555ba65026/wrapt-1.17.2-cp312-cp312-musllinux_1_2_aarch64.whl", hash = 
"sha256:6d9187b01bebc3875bac9b087948a2bccefe464a7d8f627cf6e48b1bbae30f82", size = 86707 }, + { url = "https://files.pythonhosted.org/packages/c6/d2/dcb56bf5f32fcd4bd9aacc77b50a539abdd5b6536872413fd3f428b21bed/wrapt-1.17.2-cp312-cp312-musllinux_1_2_i686.whl", hash = "sha256:9e8659775f1adf02eb1e6f109751268e493c73716ca5761f8acb695e52a756ae", size = 79685 }, + { url = "https://files.pythonhosted.org/packages/80/4e/eb8b353e36711347893f502ce91c770b0b0929f8f0bed2670a6856e667a9/wrapt-1.17.2-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:e8b2816ebef96d83657b56306152a93909a83f23994f4b30ad4573b00bd11bb9", size = 87567 }, + { url = "https://files.pythonhosted.org/packages/17/27/4fe749a54e7fae6e7146f1c7d914d28ef599dacd4416566c055564080fe2/wrapt-1.17.2-cp312-cp312-win32.whl", hash = "sha256:468090021f391fe0056ad3e807e3d9034e0fd01adcd3bdfba977b6fdf4213ea9", size = 36672 }, + { url = "https://files.pythonhosted.org/packages/15/06/1dbf478ea45c03e78a6a8c4be4fdc3c3bddea5c8de8a93bc971415e47f0f/wrapt-1.17.2-cp312-cp312-win_amd64.whl", hash = "sha256:ec89ed91f2fa8e3f52ae53cd3cf640d6feff92ba90d62236a81e4e563ac0e991", size = 38865 }, + { url = "https://files.pythonhosted.org/packages/ce/b9/0ffd557a92f3b11d4c5d5e0c5e4ad057bd9eb8586615cdaf901409920b14/wrapt-1.17.2-cp313-cp313-macosx_10_13_universal2.whl", hash = "sha256:6ed6ffac43aecfe6d86ec5b74b06a5be33d5bb9243d055141e8cabb12aa08125", size = 53800 }, + { url = "https://files.pythonhosted.org/packages/c0/ef/8be90a0b7e73c32e550c73cfb2fa09db62234227ece47b0e80a05073b375/wrapt-1.17.2-cp313-cp313-macosx_10_13_x86_64.whl", hash = "sha256:35621ae4c00e056adb0009f8e86e28eb4a41a4bfa8f9bfa9fca7d343fe94f998", size = 38824 }, + { url = "https://files.pythonhosted.org/packages/36/89/0aae34c10fe524cce30fe5fc433210376bce94cf74d05b0d68344c8ba46e/wrapt-1.17.2-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:a604bf7a053f8362d27eb9fefd2097f82600b856d5abe996d623babd067b1ab5", size = 38920 }, + { url = "https://files.pythonhosted.org/packages/3b/24/11c4510de906d77e0cfb5197f1b1445d4fec42c9a39ea853d482698ac681/wrapt-1.17.2-cp313-cp313-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:5cbabee4f083b6b4cd282f5b817a867cf0b1028c54d445b7ec7cfe6505057cf8", size = 88690 }, + { url = "https://files.pythonhosted.org/packages/71/d7/cfcf842291267bf455b3e266c0c29dcb675b5540ee8b50ba1699abf3af45/wrapt-1.17.2-cp313-cp313-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:49703ce2ddc220df165bd2962f8e03b84c89fee2d65e1c24a7defff6f988f4d6", size = 80861 }, + { url = "https://files.pythonhosted.org/packages/d5/66/5d973e9f3e7370fd686fb47a9af3319418ed925c27d72ce16b791231576d/wrapt-1.17.2-cp313-cp313-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:8112e52c5822fc4253f3901b676c55ddf288614dc7011634e2719718eaa187dc", size = 89174 }, + { url = "https://files.pythonhosted.org/packages/a7/d3/8e17bb70f6ae25dabc1aaf990f86824e4fd98ee9cadf197054e068500d27/wrapt-1.17.2-cp313-cp313-musllinux_1_2_aarch64.whl", hash = "sha256:9fee687dce376205d9a494e9c121e27183b2a3df18037f89d69bd7b35bcf59e2", size = 86721 }, + { url = "https://files.pythonhosted.org/packages/6f/54/f170dfb278fe1c30d0ff864513cff526d624ab8de3254b20abb9cffedc24/wrapt-1.17.2-cp313-cp313-musllinux_1_2_i686.whl", hash = "sha256:18983c537e04d11cf027fbb60a1e8dfd5190e2b60cc27bc0808e653e7b218d1b", size = 79763 }, + { url = 
"https://files.pythonhosted.org/packages/4a/98/de07243751f1c4a9b15c76019250210dd3486ce098c3d80d5f729cba029c/wrapt-1.17.2-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:703919b1633412ab54bcf920ab388735832fdcb9f9a00ae49387f0fe67dad504", size = 87585 }, + { url = "https://files.pythonhosted.org/packages/f9/f0/13925f4bd6548013038cdeb11ee2cbd4e37c30f8bfd5db9e5a2a370d6e20/wrapt-1.17.2-cp313-cp313-win32.whl", hash = "sha256:abbb9e76177c35d4e8568e58650aa6926040d6a9f6f03435b7a522bf1c487f9a", size = 36676 }, + { url = "https://files.pythonhosted.org/packages/bf/ae/743f16ef8c2e3628df3ddfd652b7d4c555d12c84b53f3d8218498f4ade9b/wrapt-1.17.2-cp313-cp313-win_amd64.whl", hash = "sha256:69606d7bb691b50a4240ce6b22ebb319c1cfb164e5f6569835058196e0f3a845", size = 38871 }, + { url = "https://files.pythonhosted.org/packages/3d/bc/30f903f891a82d402ffb5fda27ec1d621cc97cb74c16fea0b6141f1d4e87/wrapt-1.17.2-cp313-cp313t-macosx_10_13_universal2.whl", hash = "sha256:4a721d3c943dae44f8e243b380cb645a709ba5bd35d3ad27bc2ed947e9c68192", size = 56312 }, + { url = "https://files.pythonhosted.org/packages/8a/04/c97273eb491b5f1c918857cd26f314b74fc9b29224521f5b83f872253725/wrapt-1.17.2-cp313-cp313t-macosx_10_13_x86_64.whl", hash = "sha256:766d8bbefcb9e00c3ac3b000d9acc51f1b399513f44d77dfe0eb026ad7c9a19b", size = 40062 }, + { url = "https://files.pythonhosted.org/packages/4e/ca/3b7afa1eae3a9e7fefe499db9b96813f41828b9fdb016ee836c4c379dadb/wrapt-1.17.2-cp313-cp313t-macosx_11_0_arm64.whl", hash = "sha256:e496a8ce2c256da1eb98bd15803a79bee00fc351f5dfb9ea82594a3f058309e0", size = 40155 }, + { url = "https://files.pythonhosted.org/packages/89/be/7c1baed43290775cb9030c774bc53c860db140397047cc49aedaf0a15477/wrapt-1.17.2-cp313-cp313t-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:40d615e4fe22f4ad3528448c193b218e077656ca9ccb22ce2cb20db730f8d306", size = 113471 }, + { url = "https://files.pythonhosted.org/packages/32/98/4ed894cf012b6d6aae5f5cc974006bdeb92f0241775addad3f8cd6ab71c8/wrapt-1.17.2-cp313-cp313t-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:a5aaeff38654462bc4b09023918b7f21790efb807f54c000a39d41d69cf552cb", size = 101208 }, + { url = "https://files.pythonhosted.org/packages/ea/fd/0c30f2301ca94e655e5e057012e83284ce8c545df7661a78d8bfca2fac7a/wrapt-1.17.2-cp313-cp313t-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:9a7d15bbd2bc99e92e39f49a04653062ee6085c0e18b3b7512a4f2fe91f2d681", size = 109339 }, + { url = "https://files.pythonhosted.org/packages/75/56/05d000de894c4cfcb84bcd6b1df6214297b8089a7bd324c21a4765e49b14/wrapt-1.17.2-cp313-cp313t-musllinux_1_2_aarch64.whl", hash = "sha256:e3890b508a23299083e065f435a492b5435eba6e304a7114d2f919d400888cc6", size = 110232 }, + { url = "https://files.pythonhosted.org/packages/53/f8/c3f6b2cf9b9277fb0813418e1503e68414cd036b3b099c823379c9575e6d/wrapt-1.17.2-cp313-cp313t-musllinux_1_2_i686.whl", hash = "sha256:8c8b293cd65ad716d13d8dd3624e42e5a19cc2a2f1acc74b30c2c13f15cb61a6", size = 100476 }, + { url = "https://files.pythonhosted.org/packages/a7/b1/0bb11e29aa5139d90b770ebbfa167267b1fc548d2302c30c8f7572851738/wrapt-1.17.2-cp313-cp313t-musllinux_1_2_x86_64.whl", hash = "sha256:4c82b8785d98cdd9fed4cac84d765d234ed3251bd6afe34cb7ac523cb93e8b4f", size = 106377 }, + { url = "https://files.pythonhosted.org/packages/6a/e1/0122853035b40b3f333bbb25f1939fc1045e21dd518f7f0922b60c156f7c/wrapt-1.17.2-cp313-cp313t-win32.whl", hash = 
"sha256:13e6afb7fe71fe7485a4550a8844cc9ffbe263c0f1a1eea569bc7091d4898555", size = 37986 }, + { url = "https://files.pythonhosted.org/packages/09/5e/1655cf481e079c1f22d0cabdd4e51733679932718dc23bf2db175f329b76/wrapt-1.17.2-cp313-cp313t-win_amd64.whl", hash = "sha256:eaf675418ed6b3b31c7a989fd007fa7c3be66ce14e5c3b27336383604c9da85c", size = 40750 }, + { url = "https://files.pythonhosted.org/packages/2d/82/f56956041adef78f849db6b289b282e72b55ab8045a75abad81898c28d19/wrapt-1.17.2-py3-none-any.whl", hash = "sha256:b18f2d1533a71f069c7f82d524a52599053d4c7166e9dd374ae2136b7f40f7c8", size = 23594 }, +] From 72112992d2745108e1bfa5c4dc3d76c71760a79a Mon Sep 17 00:00:00 2001 From: Johanmkr Date: Sun, 2 Mar 2025 13:42:23 +0100 Subject: [PATCH 4/7] Updadet gitignore --- .gitignore | 1 + 1 file changed, 1 insertion(+) diff --git a/.gitignore b/.gitignore index 44221b4..7a4b2b8 100644 --- a/.gitignore +++ b/.gitignore @@ -26,6 +26,7 @@ testrun.x storage/ Makefile job.yaml +.vscode # Byte-compiled / optimized / DLL files __pycache__/ From 8d0bea64027d1e3278b695d9a033f24b528f3ef4 Mon Sep 17 00:00:00 2001 From: Johanmkr Date: Mon, 3 Mar 2025 10:23:34 +0100 Subject: [PATCH 5/7] Updated gitignore to not log the .tex files --- .gitignore | 1 + 1 file changed, 1 insertion(+) diff --git a/.gitignore b/.gitignore index 7a4b2b8..658cd7f 100644 --- a/.gitignore +++ b/.gitignore @@ -27,6 +27,7 @@ storage/ Makefile job.yaml .vscode +tex/ # Byte-compiled / optimized / DLL files __pycache__/ From 86944055a87e5b4f45da615c1f07e6f99a13da56 Mon Sep 17 00:00:00 2001 From: Johanmkr Date: Mon, 3 Mar 2025 10:23:52 +0100 Subject: [PATCH 6/7] updated the readme with my results from the run on Springfield --- README.md | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/README.md b/README.md index fc1d2ad..9b695c7 100644 --- a/README.md +++ b/README.md @@ -120,6 +120,15 @@ The table below presents the detailed results, showcasing the model's performanc | Validation | 1.019 | 0.995 | 0.680 | 0.680 | 0.680 | 0.680 | | Test | 1.196 | 0.985 | 0.634 | 0.634 | 0.634 | 0.634 | +### ChristianModel & USPS_0-6 +ChristianModel was trained on the USPS_0-6 dataset. The model was trained for a total of 20 epochs, utilizing all five metrics macro-averaged. Please find the results in the following table: + +| Dataset Split | Loss | Entropy | Accuracy | Precision | Recall | F1 | +|---------------|-------|---------|----------|-----------|--------|-------| +| Train | 0.040 | 0.070 | 0.980 | 0.981 | 0.140 | 0.981 | +| Validation | 0.071 | 0.074 | 0.973 | 0.975 | 0.140 | 0.974 | +| Test | 0.247 | 0.096 | 0.931 | 0.934 | 0.134 | 0.932 | + ## Citing Please consider citing this repository if you end up using it for your work. Several citation methods can be found under the "About" section. From 25bf62cea53e952c7aef4fd28636b937cbeaa47c Mon Sep 17 00:00:00 2001 From: Johanmkr Date: Mon, 3 Mar 2025 10:24:03 +0100 Subject: [PATCH 7/7] Updated individual documentation --- doc/Johan_page.md | 93 +++++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 90 insertions(+), 3 deletions(-) diff --git a/doc/Johan_page.md b/doc/Johan_page.md index f325fe6..284f260 100644 --- a/doc/Johan_page.md +++ b/doc/Johan_page.md @@ -10,26 +10,113 @@ ### Dataset -The choices regarding the dataset were mostly done in conjunction with Jan (@hzavadil98) as we were both using the MNIST dataset. Jan had the idea to download the binary files and construct the images from those. 
The group decided collaboratively to make the package download the data once and store it for all of use to use. Hence, the individual implementations are fairly similar, at least for the two MNIST dataloaders. Were it not for these individual tasks, there would have been one dataloader class, initialised with two separate ranges for labels 0-3 and 4-9. However, individual dataloaders had to be created to comply with the exam description. For my implementation, the labels had to be mapped to a range starting at 0: $(4-9) \to (0,5)$ since the cross-entropy loss function in PyTorch expect this range.
+The choices regarding the dataset were mostly made in conjunction with Jan (@hzavadil98), as we were both using the MNIST dataset. Jan had the idea to download the binary files and construct the images from those. The group decided collaboratively to make the package download the data once and store it for all of us to use. Hence, the individual implementations are fairly similar, at least for the two MNIST dataloaders. Were it not for these individual tasks, there would have been one dataloader class, initialized with two separate ranges for labels 0-3 and 4-9. However, individual dataloaders had to be created to comply with the exam description. For my implementation, the labels had to be mapped to a range starting at 0: $(4-9) \to (0,5)$, since the cross-entropy loss function in PyTorch expects this range.
+
+The dataset class is built on the PyTorch ``Dataset`` module.
+
+* ``__init__``: Initialize the dataset. Shifts the labels $(4-9) \to (0,5)$ to comply with the expectations of the cross-entropy loss function. Parameters:
+    * ``data_path``: Path to where MNIST is/should be stored.
+    * ``sample_ids``: Array of indices specifying which samples to load.
+    * ``train``: Whether to load the training or the test split (boolean), default is False.
+    * ``transform``: Transforms, if any, default is None.
+    * ``nr_channels``: Number of channels, default is 1.
+* ``__len__``: Return the number of samples in the dataset.
+* ``__getitem__``: Return the item at the specified index. Reads the binary file at the correct position in order to generate the sample. Parameters:
+    * ``idx``: Index of the desired sample.
+
+### Model
+
+The model is a straightforward MLP consisting of 4 hidden layers with 77 neurons each. The activation function is ReLU. The final output is raw logits.
+
+The model inherits from the PyTorch base class ``nn.Module``, so the only necessary methods are:
+
+* ``__init__``: Initialize the network with the following parameters:
+    * ``image_shape``: Shape of the image (channels, height, width).
+    * ``num_classes``: Number of classes to predict. This controls the output dimension.
+* ``forward``: One forward pass of the model. Parameters:
+    * ``x``: One batch of data.
+
+For any batch size ``N``, an example would be:
+
+* Grayscale MNIST images have shape (1,28,28).
+* Input shape: (``N``,1,28,28)
+* First layer output shape: (``N``,77)
+* Second layer output shape: (``N``,77)
+* Third layer output shape: (``N``,77)
+* Fourth (final) layer output shape: (``N``, ``num_classes``)
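+
+To make the structure concrete, here is a minimal sketch of such a model, following the shape walkthrough above. The class name ``SimpleMLP`` is illustrative only, and details may differ from the actual implementation in ``johan_model.py``:
+
+```python
+import torch
+import torch.nn as nn
+
+
+class SimpleMLP(nn.Module):
+    """MLP with 77-neuron hidden layers, ReLU activations, and logit outputs."""
+
+    def __init__(self, image_shape, num_classes):
+        super().__init__()
+        channels, height, width = image_shape
+        # Four Linear layers, as in the shape walkthrough:
+        # C*H*W -> 77 -> 77 -> 77 -> num_classes.
+        self.net = nn.Sequential(
+            nn.Flatten(),  # (N, C, H, W) -> (N, C*H*W)
+            nn.Linear(channels * height * width, 77),
+            nn.ReLU(),
+            nn.Linear(77, 77),
+            nn.ReLU(),
+            nn.Linear(77, 77),
+            nn.ReLU(),
+            nn.Linear(77, num_classes),  # raw logits, no softmax
+        )
+
+    def forward(self, x):
+        return self.net(x)
+
+
+# A batch of 32 grayscale MNIST images, six classes (digits 4-9 mapped to 0-5).
+model = SimpleMLP(image_shape=(1, 28, 28), num_classes=6)
+logits = model(torch.randn(32, 1, 28, 28))
+print(logits.shape)  # torch.Size([32, 6])
+```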
+
+### Metric
+
+The precision metric is calculated as follows:
+
+$$
+Precision = \frac{TP}{TP+FP},
+$$
+
+where $TP$ and $FP$ are the numbers of true and false positives, respectively. Hence, precision is a measure of how often the model is correct whenever it predicts the target class.
+
+It can be calculated in two ways:
+
+* Macro-averaging: The precision is calculated for each class separately and then averaged over the classes (default).
+* Micro-averaging: Find $TP$ and $FP$ across all classes and calculate the precision once with these pooled values.
+
+The precision metric is also a subclass of the PyTorch ``nn.Module`` class. It has the following methods:
+
+* ``__init__``: Class initialization. Creates the class variables ``y_true`` and ``y_pred``, which are used to calculate the metric. Parameters:
+    * ``num_classes``: The number of classes.
+    * ``macro_averaging``: Boolean flag that controls how to average the precision. Default is False.
+* ``forward``: Appends the true and predicted values to the class variables. Parameters:
+    * ``y_true``: Tensor with true values.
+    * ``y_pred``: Tensor with predicted values.
+* ``_micro_avg_precision``: Computes the micro-averaged precision. Parameters:
+    * ``y_true``: Tensor with true values.
+    * ``y_pred``: Tensor with predicted values.
+* ``_macro_avg_precision``: Computes the macro-averaged precision. Parameters:
+    * ``y_true``: Tensor with true values.
+    * ``y_pred``: Tensor with predicted values.
+* ``__returnmetric__``: Return the micro- or macro-averaged precision of the samples stored in the class variables ``y_true`` and ``y_pred``.
+* ``__reset__``: Empties the lists of samples stored in the class variables ``y_true`` and ``y_pred``.
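+
+As an illustration, a condensed sketch of such a metric class is given below. It assumes ``y_pred`` already contains predicted class indices (not logits); the actual implementation in ``precision.py`` may differ in details:
+
+```python
+import torch
+import torch.nn as nn
+
+
+class Precision(nn.Module):
+    """Accumulates predictions and computes micro- or macro-averaged precision."""
+
+    def __init__(self, num_classes, macro_averaging=False):
+        super().__init__()
+        self.num_classes = num_classes
+        self.macro_averaging = macro_averaging
+        self.y_true, self.y_pred = [], []
+
+    def forward(self, y_true, y_pred):
+        # Only store the batch; the metric is computed later over everything seen.
+        self.y_true.append(y_true)
+        self.y_pred.append(y_pred)
+
+    def _micro_avg_precision(self, y_true, y_pred):
+        # Pool TP and FP over all classes and compute the precision once.
+        # For single-label multiclass predictions this equals the accuracy.
+        return (y_true == y_pred).float().mean()
+
+    def _macro_avg_precision(self, y_true, y_pred):
+        # Per-class precision TP_c / (TP_c + FP_c), then average over classes.
+        per_class = []
+        for c in range(self.num_classes):
+            predicted_c = y_pred == c
+            tp = ((y_true == c) & predicted_c).sum()
+            denom = predicted_c.sum()
+            per_class.append(tp.float() / denom if denom > 0 else torch.tensor(0.0))
+        return torch.stack(per_class).mean()
+
+    def __returnmetric__(self):
+        y_true, y_pred = torch.cat(self.y_true), torch.cat(self.y_pred)
+        if self.macro_averaging:
+            return self._macro_avg_precision(y_true, y_pred)
+        return self._micro_avg_precision(y_true, y_pred)
+
+    def __reset__(self):
+        self.y_true, self.y_pred = [], []
+
+
+# Example usage with dummy predictions over two "batches".
+metric = Precision(num_classes=6, macro_averaging=True)
+metric(torch.tensor([0, 1, 2]), torch.tensor([0, 1, 1]))
+metric(torch.tensor([3, 4]), torch.tensor([3, 4]))
+print(metric.__returnmetric__())
+metric.__reset__()
+```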
 
## Experiences with running someone else's code
 
+This was an interesting experience, as things did not go exactly as expected. I was initially unable to run the code with my assigned dataset (USPS 0-6). I had never encountered this error before, but with the help of Erik (IT guy) and Christian, a conflict between the ``urllib`` package and the newest Python version was identified. Christian knew of a solution and suggested a fix, which solved my problem. Hence, communicating well with the author of the code I was trying to run proved essential when I encountered errors.
 
## Experiences having someone else to run my code
 
+I have not heard anything from whoever ran my code, so I assume everything went well.
 
## I learned how to use these tools during this course
 
+Coming from a non-machine-learning background, where the coding routines are slightly different in terms of code organization and the use of tools like Docker and WandB, the learning curve has been steep. I am grateful to have been given the chance to collaborate with skilled peers, from whom I have learned a lot.
 
### Git-stuff
 
+My experience with git prior to this project consisted mainly of having a local repository that I pulled from and pushed to a GitHub repo, using one branch only; version control for myself only. Here I learned to utilize the features of git when collaborating with others, which include:
+
+* Working with Issues and assigning and dividing work.
+* Working on separate branches, with merge protection on the main branch.
+* Using pull requests, where we had to review each other's code before merging into main.
+* Using GitHub Actions to automate workflows like unit testing and documentation building.
+* Using tags and creating releases.
+* Making the repository function like a package and making it installable.
+* Writing clear, documented code and building documentation.
 
### WandB
 
+It was insightful to learn the basics of using Weights and Biases to track progress when training and evaluating models. I have used TensorBoard (slightly) prior to this, but WandB seems like a better option.
 
### Docker, Kubernetes and Springfield
 
+Completely new to all of this. I spent quite some time trying to understand what Docker and Kubernetes are, since we were supposed to run the experiments on the local cluster, Springfield. There was a whole process of setting everything up, creating the SSH secrets, and getting everything to work. I have some prior experience with SSH, but I had never used any container software, nor schedulers like Kubernetes.
 
### Proper documentation
 
+Writing good documentation is always necessary, and having training in this was fruitful. Combining this with GitHub Actions and Sphinx made it far easier to have an updated version of the documentation readily available.
 
### Nice ways of testing code
 
+The combination of having testing as part of the GitHub Actions workflow and using the more advanced features of ``pytest`` (like parametrized testing) was new to me, and a very nice thing to learn. It automated testing and made it significantly easier to ensure that any code we pushed to the main branch performed well and did not lose any intended functionality.
 
### UV
 
-## General thoughts on collaboration
-
+I switched to UV as my package manager for this project, and it is VERY good. Really fast and versatile.
-As someone new to this University and to how the IT setup works here, this project was very fruitful. The positive thing about working with peers who are highly skilled in the relevant field is that the learning curve is steep, and the take-home for me was significant. The con is that I constantly felt behind, spent half the time just understanding the changes implemented by others, and felt that my contributions to the overall project were less significant. However, working with skilled peers has boosted my understanding of how things work around here, especially git (fancier commands than add, commit, push and pull), as well as Docker, cluster operations, Kubernetes, and logging metrics on Weights and Biases.