Commit ac9ceb0

✨ add website
0 parents  commit ac9ceb0

44 files changed: 22728 additions & 0 deletions

.github/workflows/deploy.yml

Lines changed: 51 additions & 0 deletions

```yaml
name: Deploy to GitHub Pages

on:
  push:
    branches:
      - main
    # Review gh actions docs if you want to further define triggers, paths, etc
    # https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#on

jobs:
  build:
    name: Build Docusaurus
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - uses: actions/setup-node@v4
        with:
          node-version: 18
          cache: npm

      - name: Install dependencies
        run: npm ci
      - name: Build website
        run: npm run build

      - name: Upload Build Artifact
        uses: actions/upload-pages-artifact@v3
        with:
          path: build

  deploy:
    name: Deploy to GitHub Pages
    needs: build

    # Grant GITHUB_TOKEN the permissions required to make a Pages deployment
    permissions:
      pages: write # to deploy to Pages
      id-token: write # to verify the deployment originates from an appropriate source

    # Deploy to the github-pages environment
    environment:
      name: github-pages
      url: ${{ steps.deployment.outputs.page_url }}

    runs-on: ubuntu-latest
    steps:
      - name: Deploy to GitHub Pages
        id: deployment
        uses: actions/deploy-pages@v4
```

.gitignore

Lines changed: 20 additions & 0 deletions

```
# Dependencies
/node_modules

# Production
/build

# Generated files
.docusaurus
.cache-loader

# Misc
.DS_Store
.env.local
.env.development.local
.env.test.local
.env.production.local

npm-debug.log*
yarn-debug.log*
yarn-error.log*
```

README.md

Lines changed: 41 additions & 0 deletions

# Website

This website is built using [Docusaurus](https://docusaurus.io/), a modern static website generator.

## Installation

```bash
yarn
```

## Local Development

```bash
yarn start
```

This command starts a local development server and opens up a browser window. Most changes are reflected live without having to restart the server.

## Build

```bash
yarn build
```

This command generates static content into the `build` directory and can be served using any static content hosting service.

## Deployment

Using SSH:

```bash
USE_SSH=true yarn deploy
```

Not using SSH:

```bash
GIT_USER=<Your GitHub username> yarn deploy
```

If you are using GitHub Pages for hosting, this command is a convenient way to build the website and push to the `gh-pages` branch.
Lines changed: 173 additions & 0 deletions

---
title: Something You Need to Know About GPUs
description: A beginner-friendly guide to understanding the relationship between GPU drivers, CUDA, PyTorch, and cuDNN.
slug: driver-cuda-cudnn
tags: [gpu, cuda]
---

:::note
When using the provided server, everything including the driver and CUDA toolkit is already installed, so you might not need to worry about these details initially. However, I strongly encourage you to understand these concepts because you might one day need to maintain your own server (though hopefully you won't have to).
:::

## Introduction

Back in the day, I always wondered why we could run PyTorch code on our local machine without installing the CUDA toolkit ourselves, but when it came to compiling or building a library locally, we suddenly needed the CUDA toolkit. What's going on under the hood?

In this article, we’ll break down the mystery behind CUDA, cuDNN, and all the other buzzwords. By the end, you’ll have a clearer (and hopefully less intimidating) understanding of how they all fit together.

<!-- truncate -->

:::warning
This article will not walk you through the installation of these components.
:::

## A Crash Course in the Terminology

Before we dive into how everything connects, let’s quickly introduce the key players. You might already be familiar with some of them!

* **Driver**: Think of it as a "bridge" that helps your system talk to the GPU. Without it, your GPU would just sit there, looking pretty but doing nothing.
* **PyTorch**: A popular deep learning framework in Python. It makes building and training neural networks a breeze.
* **CUDA toolkit**: A set of tools, libraries, and a compiler that helps you write code to run on NVIDIA GPUs. Basically, it’s what makes your GPU do the heavy lifting.
* **cuDNN**: A specialized library built on top of CUDA, designed to make deep learning operations (like convolutions) super fast and efficient.

```mermaid
graph TD
    PyTorch["PyTorch"]
    cuDNN["cuDNN"]
    CUDA["CUDA Toolkit"]
    Driver["NVIDIA Driver"]
    GPU["GPU (Hardware)"]

    PyTorch --> cuDNN
    PyTorch --> CUDA
    cuDNN --> CUDA
    CUDA --> Driver
    Driver --> GPU
```

## Relation Between CUDA and Driver

When you run `nvidia-smi`, you might see something like this:

```plaintext
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 570.86.15 Driver Version: 570.86.15 CUDA Version: 12.8 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA H100 NVL Off | 00000000:0A:00.0 Off | 0 |
| N/A 49C P0 98W / 400W | 17258MiB / 95830MiB | 0% Default |
| | | Disabled |
+-----------------------------------------+------------------------+----------------------+
| 1 NVIDIA H100 NVL Off | 00000000:0D:00.0 Off | 0 |
| N/A 41C P0 63W / 400W | 4MiB / 95830MiB | 0% Default |
| | | Disabled |
+-----------------------------------------+------------------------+----------------------+
...
```

At this point, you might ask yourself:

* What does this "CUDA Version" actually mean?
* Does this mean I have CUDA 12.8 installed on my system?

Before answering, let’s clear up the difference between **CUDA** and the **CUDA Toolkit**.

When we say *CUDA*, it often refers to the whole platform that allows us to run GPU-accelerated code — including the drivers, libraries, and development tools. The **CUDA Toolkit**, on the other hand, is part of the *CUDA* platform.

Inside CUDA, there are two main APIs:

1. **Driver API**: Lower-level, gives you fine-grained control.
2. **Runtime API**: Higher-level, easier to use, and most user code (like PyTorch) depends on this.

Now, the **CUDA Version** shown in `nvidia-smi` refers to the **maximum CUDA runtime version supported by your installed driver** (essentially, what your driver can handle). It doesn’t necessarily mean you have that version of the CUDA Toolkit installed.

So, does this mean you have CUDA 12.8 installed? The answer is: *kind of, but not really*.

* **Yes**, because your driver supports CUDA 12.8 through its driver API.
* **No**, because to actually compile and run your own GPU programs (e.g., custom CUDA kernels), you still need to install the CUDA Toolkit, which includes the runtime libraries, compiler (`nvcc`), and other development tools.

That’s why you often still need to install a specific CUDA Toolkit version on your system, even if your driver "supports" a higher CUDA version.

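Want to see the difference on your own machine? You can compare the two directly: the version the driver advertises versus the toolkit (if any) that is actually installed. A minimal check, assuming a Linux box where `nvcc` only exists if a CUDA Toolkit has been installed:

```bash
# CUDA version the *driver* supports (this is what nvidia-smi shows in its header)
nvidia-smi | grep "CUDA Version"

# CUDA Toolkit version actually installed, if any
# ("command not found" here just means no toolkit is on your PATH, even though the driver works fine)
nvcc --version
```

The two numbers don't have to match: a driver that reports CUDA 12.8 can happily live alongside a CUDA Toolkit 12.4 installation, or no toolkit at all.
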
## Multiple CUDA Toolkit Versions

As we mentioned earlier, the CUDA Toolkit is basically just a set of libraries and tools. Because of that, you can actually install **multiple versions** of the toolkit on the same machine without much trouble!

The only thing you need to do is tell your programs which version you want them to use. That’s why you’ll often see instructions or articles asking you to set environment variables like `CUDA_HOME`, `LD_LIBRARY_PATH`, and add the CUDA binaries to your `PATH`.

Here’s an example of how you might do it:

```bash
CUDA_VERSION=12.4

export CUDA_HOME="/usr/local/cuda-${CUDA_VERSION}"
export PATH="/usr/local/cuda-${CUDA_VERSION}/bin:${PATH}"
export LD_LIBRARY_PATH="/usr/local/cuda-${CUDA_VERSION}/lib64"
export LIBRARY_PATH="/usr/local/cuda-${CUDA_VERSION}/lib64"
```

This way, you can switch between different CUDA versions depending on which one your project needs — super handy if you’re working on multiple projects or need to match different library requirements.

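After setting these variables, it's worth confirming that your shell really picks up the toolkit you intended. A quick sanity check, assuming the variables above have been exported in your current shell:

```bash
# Which nvcc is found first, and which installation the shell thinks is "home"
which nvcc
echo "$CUDA_HOME"
```
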
By the way, CUDA toolkits are usually installed under `/usr/local`. You can take a peek by running:

```bash
ls /usr/local
```

If you've installed CUDA before, you’ll probably see directories like `cuda-11.8`, `cuda-12.4`, and so on.

## CUDA Toolkit and PyTorch

Remember that question:

> Why don’t we need to install the CUDA Toolkit separately when using PyTorch?

Well, the short answer is: **we actually do need CUDA libraries for PyTorch**, but when you install PyTorch using `pip` or `conda`, they handle it for you behind the scenes.

In fact, when you install PyTorch this way, it comes bundled with the specific CUDA libraries it needs to run on your GPU. For example, if you install via `pip`, you’ll also get packages like:

* `nvidia-cublas-cuxx`
* `nvidia-cuda-runtime-cuxx`
* `nvidia-cudnn-cuxx`

Here, the "xx" stands for the CUDA major version the build targets (e.g., `cu11` for CUDA 11.x, `cu12` for CUDA 12.x).

It’s important to note that these are **just the necessary libraries**, not the full CUDA Toolkit. The libraries come as [**shared objects**](https://dmerej.info/blog/post/symlinks-and-so-files-on-linux/) (e.g., `.so` files on Linux, `.dll` files on Windows). This means they are pre-compiled and dynamically linked at runtime, so you don’t need to compile them yourself.

If you’re curious and want to see them yourself, you can find these files under something like `{PATH_TO_PYTHON}/lib/python3.{VERSION}/site-packages/torch/lib`.

That’s why you can get GPU acceleration in PyTorch without manually installing the full toolkit — the heavy lifting is already packaged for you!

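If you'd like to confirm what your own installation bundles, here are a few quick checks. This is just a sketch, assuming a `pip`-installed PyTorch in the currently active Python environment on Linux:

```bash
# CUDA and cuDNN versions this PyTorch build was compiled against
python -c "import torch; print(torch.version.cuda, torch.backends.cudnn.version())"

# NVIDIA runtime libraries that pip pulled in alongside torch
pip list | grep -i nvidia

# The shared objects bundled inside the torch package itself
ls "$(python -c 'import torch, os; print(os.path.dirname(torch.__file__))')/lib"
```
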
## CUDA Toolkit — Useful or Not?

You might ask: *So... do I actually need to install the CUDA Toolkit?*

Well, if you’re just using PyTorch as-is (and installing it via `pip` or `conda`), you **don’t** need to install the CUDA Toolkit manually.

However, if you want to use third-party [custom CUDA extensions](https://docs.pytorch.org/tutorials/advanced/cpp_extension.html) — for example, if you need a super-fast deformable attention module in DETR — you’ll need to **manually compile** the `.cu` files. In this case, having the CUDA Toolkit (which includes the compiler `nvcc`) is essential.

When compiling your own CUDA code, it’s important to make sure the **CUDA Toolkit version** matches the version PyTorch expects. There’s also the concept of **Compute Capability (CC)** — which basically describes the GPU architecture version.

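Before kicking off a build, it helps to look at all three of these in one place. A minimal sketch, assuming a `pip`-installed PyTorch and the `nvcc` you plan to compile with already on your `PATH`:

```bash
# CUDA version your PyTorch build expects
python -c "import torch; print(torch.version.cuda)"

# CUDA Toolkit (nvcc) version that will actually compile your .cu files
nvcc --version

# Compute Capability of each visible GPU, e.g. (8, 6) for an RTX 3090
python -c "import torch; print([torch.cuda.get_device_capability(i) for i in range(torch.cuda.device_count())])"
```
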
If you compile your code for an older CC, it can usually still run on newer GPUs. But if you compile for a newer CC, it might not run on older GPUs.

If you want to support multiple GPUs, you can set the `TORCH_CUDA_ARCH_LIST` environment variable when compiling. For example, if you want to support both an RTX 3090 (`sm_86`) and an H100 (`sm_90`), you can do:

```bash
TORCH_CUDA_ARCH_LIST="8.6 9.0"
```

Interestingly, if you don’t set `TORCH_CUDA_ARCH_LIST`, PyTorch will automatically use the architectures it detects from the GPUs visible during build. It also includes a `+PTX` option, which allows the code to run on future GPUs with higher CCs.

Here’s a note from the [PyTorch docs](https://docs.pytorch.org/docs/stable/cpp_extension.html) that explains this nicely:

> "By default the extension will be compiled to run on all archs of the cards visible during the building process of the extension, plus PTX. ... The +PTX option causes extension kernel binaries to include PTX instructions for the specified CC. PTX is an intermediate representation that allows kernels to runtime-compile for any CC ≥ the specified CC (for example, 8.6+PTX generates PTX that can runtime-compile for any GPU with CC ≥ 8.6)."

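To make this concrete, here is a hypothetical build invocation for a third-party extension (the project path is made up, and many extension repos ship a `setup.py` based on `torch.utils.cpp_extension`):

```bash
# Hypothetical example: build a third-party CUDA extension for an RTX 3090 and an H100,
# with +PTX so GPUs with even higher CCs can still JIT-compile the kernels
cd path/to/some_cuda_extension   # placeholder path
TORCH_CUDA_ARCH_LIST="8.6 9.0+PTX" python setup.py build_ext --inplace
```
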
## Conclusion

That’s it! We’ve untangled the web of CUDA, cuDNN, drivers, toolkits, and how they all play together with PyTorch.

To sum it up: if you're just running PyTorch out of the box, you don’t have to worry too much — most of the magic is handled for you behind the scenes. But once you start diving into custom CUDA code or advanced optimizations, knowing how all these pieces fit together becomes super important (and actually pretty fun!).

Hopefully, this article helped clear up the mystery and gave you a deeper appreciation for what’s happening under the hood when your GPU starts humming along.

blog/authors.yml

Whitespace-only changes.
