diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
index 7d3a028..bb809ca 100644
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -108,7 +108,9 @@ The steps below will give you a general idea of how to prepare your local enviro
     npm run test:e2e
     ```
 
-10. Once you're happy with your changes, add and commit them to your branch, then push the branch to your fork.
+10. To run the worker locally, see [Dev Setup](./docs/dev-setup.md).
+
+11. Once you're happy with your changes, add and commit them to your branch, then push the branch to your fork.
 
     ```bash
     git add .
@@ -119,7 +121,7 @@ The steps below will give you a general idea of how to prepare your local enviro
 > [!IMPORTANT]\
 > Before committing and opening a Pull Request, please go first through our [Commit](#commit-guidelines) and [Pull Request](#pull-request-policy) guidelines outlined below.
 
-11. Create a Pull Request.
+12. Create a Pull Request.
 
 ### CLI Commands
 
diff --git a/docs/README.md b/docs/README.md
new file mode 100644
index 0000000..5714cfa
--- /dev/null
+++ b/docs/README.md
@@ -0,0 +1,12 @@
+# Documentation
+
+Documentation for the Release Worker.
+
+## Table of Contents
+
+- [Architecture](./architecture.md)
+- [Dev Setup](./dev-setup.md)
+- [Debugging Production](./debugging-prod.md)
+- [Deploying](./deploying.md)
+- [R2](./r2.md)
+- [Node.js Release Process](./release-process.md)
diff --git a/docs/architecture.md b/docs/architecture.md
new file mode 100644
index 0000000..0e3ffc6
--- /dev/null
+++ b/docs/architecture.md
@@ -0,0 +1,55 @@
+# Architecture
+
+Documentation on the architecture of the worker (e.g. how it works and how it fits into Node.js' infrastructure).
+
+## Network Request Flow
+
+A high-level overview of how a request flows through Node.js' infrastructure:
+
+```mermaid
+flowchart LR
+  request[Request] --> cloudflare(Cloudflare Routing Rules)
+  cloudflare -- /dist/, /download/, /docs/, /api/, /metrics/ --> worker@{ shape: procs, label: "Release Worker"}
+  cloudflare -- /... --> website(Website)
+  worker -- Cache miss --> r2[(R2 bucket)]
+  worker -- Error --> originServer(Origin Server)
+  originServer
+  website
+  r2
+```
+
+## Worker Request Flow
+
+The Release Worker uses a middleware approach to routing requests.
+
+When an instance of the worker starts up, it registers a number of routes and their middlewares.
+It then builds a "chain" of middlewares to call in the same order they're given to handle the request.
+
+When a request hits the worker, the router gives it to the first middleware in the chain.
+That middleware can then either handle the request and return a response or pass it on to the next middleware.
+This goes on until the request is handled or we run out of middlewares to handle the request, at which point we throw an error.
+
+We currently have the following middlewares (in no particular order):
+
+- [CacheMiddleware](../src/middleware/cacheMiddleware.ts) - Caches responses to GET requests.
+- [R2Middleware](../src/middleware/r2Middleware.ts) - Fetches resources from R2.
+- [OriginMiddleware](../src/middleware/originMiddleware.ts) - Fetches resources from the origin server.
+  Used as a fallback if the R2 middleware fails.
+- [NotFoundMiddleware](../src/middleware/notFoundMiddleware.ts) - Handles not found requests.
+- [OptionsMiddleware](../src/middleware/optionsMiddleware.ts) - Handles OPTIONS requests.
+- [SubstituteMiddleware](../src/middleware/subtituteMiddleware.ts) - Handles requests that need URL substituting (i.e. `/dist/latest/` -> `/dist/`) and then feeds them back into the router.
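The chain behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the worker's actual code: `SketchRequest`, `SketchResponse`, `Middleware`, `runChain`, `r2Like`, and `originLike` are hypothetical names, and the sketch is synchronous and uses "return `undefined`" to mean "pass the request on", which simplifies the error-triggered fallback the docs describe.

```typescript
// Hypothetical types for illustration; the real worker's interfaces may differ.
type SketchRequest = { method: string; path: string };
type SketchResponse = { status: number; body: string };

// A middleware either returns a response or returns undefined to pass the request on.
type Middleware = (req: SketchRequest) => SketchResponse | undefined;

// Calls each middleware in the order given until one of them handles the request.
function runChain(chain: Middleware[], req: SketchRequest): SketchResponse {
  for (const middleware of chain) {
    const response = middleware(req);
    if (response !== undefined) {
      return response;
    }
  }
  // We ran out of middlewares without the request being handled.
  throw new Error(`No middleware handled ${req.method} ${req.path}`);
}

// Stand-in for an R2 lookup that only knows about a single object.
const r2Like: Middleware = (req) =>
  req.path === "/dist/index.json" ? { status: 200, body: "from r2" } : undefined;

// Stand-in for the origin fallback; it handles whatever reaches it.
const originLike: Middleware = (req) => ({
  status: 200,
  body: `from origin: ${req.path}`,
});

const chain: Middleware[] = [r2Like, originLike];
```

With this chain, a request for `/dist/index.json` is answered by `r2Like`, any other path falls through to `originLike`, and an empty chain throws.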
+
+### Diagram
+
+```mermaid
+flowchart TD
+  request[Request] --> worker(Release Worker)
+  worker --> routerHandle("Router.handle")
+  routerHandle -- HTTP GET --> cacheMiddleware("Cache Middleware")
+  routerHandle -- HTTP HEAD --> r2Middleware
+  routerHandle -- HTTP OPTIONS --> optionsMiddleware("Options Middleware")
+  routerHandle -- Request --> substituteMiddleware("Substitute Middleware")
+  substituteMiddleware -- Substituted Request --> routerHandle
+  cacheMiddleware -- Cache miss --> r2Middleware("R2 Middleware")
+  r2Middleware -- Error --> originMiddleware("Origin Middleware")
+```
diff --git a/docs/debugging-prod.md b/docs/debugging-prod.md
new file mode 100644
index 0000000..b3e3695
--- /dev/null
+++ b/docs/debugging-prod.md
@@ -0,0 +1,16 @@
+# Debugging Prod
+
+Steps to aid with debugging the Release Worker's production environment.
+
+> [!NOTE]
+> This is mostly meant for Node.js Web Infra team members.
+> Some of these steps require access to resources only made available to Collaborators.
+
+## Steps
+
+- Check [Sentry](https://nodejs-org.sentry.io/issues/?project=4506191181774848).
+  All errors should be reported here.
+
+- If you can reproduce the issue locally, you can debug it with Cloudflare's implementation of [Chrome's DevTools](https://developers.cloudflare.com/workers/observability/dev-tools/).
+
+- Cloudflare provides basic stats on the worker's Cloudflare dash page [here](https://dash.cloudflare.com/07be8d2fbc940503ca1be344714cb0d1/workers/services/view/dist-worker/production).
diff --git a/docs/deploy.md b/docs/deploy.md
deleted file mode 100644
index ed79506..0000000
--- a/docs/deploy.md
+++ /dev/null
@@ -1,19 +0,0 @@
-# Deploy
-
-This worker is auto-deployed by Github Actions. The workflow is defined [here](../.github/workflows/deploy.yml).
-
-## Staging
-
-This worker is deployed to staging every time a pull request is merged into the main branch.
-
-## Prod
-
-This worker is deployed into prod by a manual trigger.
-
-## Actions Setup
-
-How to setup the actions for automated deployments.
-
-- Create a Cloudflare API Token (https://developers.cloudflare.com/fundamentals/api/get-started/create-token/)
-- Set Github secret `CF_API_TOKEN` on the repo to the token you generated
-- They should be working now
diff --git a/docs/deploying.md b/docs/deploying.md
new file mode 100644
index 0000000..ba0ec02
--- /dev/null
+++ b/docs/deploying.md
@@ -0,0 +1,11 @@
+# Deploying the Worker
+
+Guide on how to deploy the Release Worker.
+
+## Staging Deployments
+
+The Release Worker is automatically deployed to its staging environment when a new commit is pushed to the `main` branch through the [Deploy Worker](https://github.com/nodejs/release-cloudflare-worker/actions/workflows/deploy.yml) action.
+
+## Production Deployments
+
+The Release Worker is deployed to its production environment by a Collaborator manually running the [Deploy Worker](https://github.com/nodejs/release-cloudflare-worker/actions/workflows/deploy.yml) action.
diff --git a/docs/dev-setup.md b/docs/dev-setup.md
index 440f591..c5360c8 100644
--- a/docs/dev-setup.md
+++ b/docs/dev-setup.md
@@ -1,33 +1,38 @@
 # Dev Setup
 
-Guide to setting up this worker for development.
+Documentation on how to run the Release Worker locally.
 
-## Have Node Installed
+## Steps
 
-Node needs to be installed for the thing that serves Node downloads (latest LTS/even numbered major recommended)
+### 1. Prepare environment
 
-## Install Dependencies
+Read and follow the [Getting Started](../CONTRIBUTING.md) guide to get your local environment set up.
 
-Run `npm install`
+### 2. Set up your Cloudflare account
 
-## Testing
+Currently we run the worker in [remote mode](https://developers.cloudflare.com/workers/testing/local-development/#develop-using-remote-resources-and-bindings) as there isn't a nice way to locally populate an R2 bucket.
+This means that, to run the Release Worker locally, you must have a Cloudflare account that has an R2 bucket named
+`dist-prod`.
+You will also need to populate the bucket yourself.
 
-To run unit tests, `npm run test:unit`. To run e2e (end-to-end) tests, `npm run test:e2e`.
+Both of these will hopefully change in the future to make running the Release Worker easier.
 
-See the [/test](../tests/) folder for more info on testing.
+### 3. Create secrets for directory listings
 
-## Running Locally
+This step is optional but recommended.
 
-Spin up a Workerd instance on your machine that serves this worker
+The Release Worker uses R2's S3 API for directory listings.
+In order for directory listings to work, you need to make an R2 API key for your `dist-prod` bucket and provide it to the worker.
 
-### Login to Cloudflare Dash From Wrangler CLI
+Generating the API key can be done through the Cloudflare dashboard [here](https://dash.cloudflare.com/?account=/r2/api-tokens).
 
-Run `wrangler login`
+Then, make a `.dev.vars` file in the root of this repository with the following:
 
-### R2 Bucket
+```
+S3_ACCESS_KEY_ID=
+S3_ACCESS_KEY_SECRET=
+```
 
-Create a R2 bucket named `dist-prod`. This is the bucket that the worker read from. It will either need to have a copy of Node's dist folder in it or something mimicing the folder there.
+### 4. Run the worker
 
-### Starting the Local Server
-
-Run `npm start`. This starts a Workerd instance in remote mode.
+Start the worker locally with `npm start`. You may be prompted to log into your Cloudflare account.
diff --git a/docs/r2.md b/docs/r2.md
new file mode 100644
index 0000000..932d73a
--- /dev/null
+++ b/docs/r2.md
@@ -0,0 +1,34 @@
+# R2
+
+## What is it?
+
+[R2](https://developers.cloudflare.com/r2/) is Cloudflare's blob storage provider.
+We use it to store all of the release assets served by the Release Worker.
+
+## Noteworthy points
+
+### Directories
+
+R2 stores objects in a flat namespace, meaning directories do not exist in R2.
+
+However, R2 allows characters such as slashes (/) in an object's name.
+For directories we can then specify a prefix (like `nodejs/release/`) and R2 will only return objects whose names start with that prefix.
+
+### Bindings API
+
+R2 allows integration with Workers through their [bindings API](https://developers.cloudflare.com/r2/api/workers/workers-api-usage/).
+We use this when fetching files.
+
+### S3 API
+
+Due to some performance issues we were seeing with R2's `list` binding command, we opted to use R2's S3 API for listing directories.
+
+### Buckets
+
+We have two R2 buckets:
+
+- `dist-staging` - Holds staged releases. This bucket is private and should not be publicly accessible.
+
+- `dist-prod` - Holds released versions of Node.js. Everything in this bucket should be considered publicly accessible.
+
+(see [Release Process](./release-process.md) for more information on how we use these buckets)
diff --git a/docs/release-process.md b/docs/release-process.md
new file mode 100644
index 0000000..b6d4142
--- /dev/null
+++ b/docs/release-process.md
@@ -0,0 +1,55 @@
+# Release Process
+
+Documentation on the general order of events that happen when releasing a new version of Node.js.
+
+> [!NOTE]
+> This focuses on the flow of release assets (binaries, doc files).
+> This may not include the full process for releases (i.e. getting necessary approvals).
+
+## Release types
+
+### Mainline releases
+
+Mainline releases refer to the main release branch of Node.js.
+
+### Nightly releases
+
+Node.js has multiple release branches that are promoted nightly.
+
+- `nightly` - Nightly builds from the `main` Node.js branch
+- `v8-canary` - Builds with the latest V8 canary
+- `rc` - Release candidates
+- `test` - Test builds
+
+<details>
+<summary>Deprecated release branches</summary>
+
+These branches no longer receive new releases.
+
+- `chakracore-nightly` - Chakracore nightly builds
+- `chakracore-rc` - Chakracore release candidates
+- `chakracore-release` - Chakracore releases
+
+</details>
+
+## Release flow
+
+### 1. Release CI is triggered
+
+New builds are scheduled on the release CI (https://ci-release.nodejs.org).
+These builds compile Node.js on the various platforms and compile the docs.
+
+Upon a build completing successfully, the build's output (binaries, doc files) will then be uploaded to the origin server and the `dist-staging` bucket in Node.js' Cloudflare account.
+
+The release assets synced to the origin server are under the `/home/staging/nodejs/` path.
+The release assets synced to the `dist-staging` bucket are under the `/nodejs/` [_prefix_](./r2.md#directories).
+
+### 2. Release promotion
+
+When a release is ready to be published, it is promoted.
+For mainline releases, this is done by the releaser running the [`release.sh`](https://github.com/nodejs/node/tree/main/tools/release.sh) script in the Node.js repository.
+For nightly releases, this is done once a day by [automated tooling](https://github.com/nodejs/build/blob/main/ansible/www-standalone/tools/promote/promote_nightly.sh).
+
+On the origin server, the release's assets are copied from `/home/staging/nodejs/` to `/home/dist/nodejs/`.
+
+For R2, the release's assets are copied from the `dist-staging` bucket to the `dist-prod` bucket.
diff --git a/docs/sops/README.md b/docs/sops/README.md
new file mode 100644
index 0000000..a16a41b
--- /dev/null
+++ b/docs/sops/README.md
@@ -0,0 +1,9 @@
+# Standard Operating Procedures
+
+Documents detailing standardized processes for the Release Worker.
+
+## Table of Contents
+
+- [Incident Flow](./incident-flow.md)
+- [Rolling Back a Release](./rolling-back-a-release.md)
+- [Switching between the Worker and Origin Server](./switch-between-worker-and-origin.md)
diff --git a/docs/sops/incident-flow.md b/docs/sops/incident-flow.md
new file mode 100644
index 0000000..6560da1
--- /dev/null
+++ b/docs/sops/incident-flow.md
@@ -0,0 +1,43 @@
+# Incident Flow
+
+Procedure for what to do if there's an incident with the Release Worker.
+
+## Steps
+
+1. If the incident was caused by a recent change, try
+   [rolling back the release](./rolling-back-a-release.md).
+
+2. If the incident affects traffic towards the Release Worker, update the Node.js status page (https://status.nodejs.org).
+   If it is an ongoing security incident that we cannot disclose publicly yet, do not include the details of the incident in the status page.
+
+   - Optionally, but preferably, echo updates on social media.
+
+   - For any prolonged incidents, please consider pinning an issue tracking the incident so as to avoid spam.
+
+   - Please also monitor any issues in repositories such as this one,
+     [nodejs/node](https://github.com/nodejs/node),
+     and [nodejs/nodejs.org](https://github.com/nodejs/nodejs.org)
+     for users asking about the incident and link them to the status page.
+
+3. Follow the [steps for debugging the worker when it's deployed](../debugging-prod.md).
+
+4. If there is an ongoing security incident requiring code changes, a force push to the `main` branch can be performed by a [Collaborator](../../CONTRIBUTING.md#contributing) if there is reasonable risk that opening a PR with the change would allow more bad actors to exploit the vulnerability.
+   The code changes must still be approved by another Collaborator before the force push is performed, however.
+
+5. If the issue requires support from Cloudflare, try reaching out through the
+   `ext-nodejs-cloudflare` channel in the OpenJS Slack.
+
+6. If needed, create an issue on this repository to serve as a discussion board
+   for any changes that need to be made to prevent the same incident from
+   happening again.
+
+## What qualifies as an incident?
+
+There are no exact criteria; however, these cases will most likely call for an incident to be declared:
+
+1. The production deployment of the Release Worker is unavailable to the public or is otherwise operating in a way that impacts users' ability to interact with it en masse.
+   This includes behaviors that we are responsible for and those that Cloudflare is responsible for.
+
+2. There is an ongoing security issue that involves the production deployment of the Release Worker.
+
+Note, however, that the Node.js Web Infrastructure, Build, and TSC teams can declare an incident whenever they see fit.
diff --git a/docs/sops/rolling-back-a-release.md b/docs/sops/rolling-back-a-release.md
new file mode 100644
index 0000000..256ddc8
--- /dev/null
+++ b/docs/sops/rolling-back-a-release.md
@@ -0,0 +1,34 @@
+# Rolling Back A Release
+
+> [!WARNING]
+> Rolling back a release should only be done when necessary,
+> such as a quick fix for an ongoing incident,
+> and by a [Collaborator](../../CONTRIBUTING.md#contributing).
+> The Web Infrastructure team should be aware each time this happens.
+
+## Option A: via GitHub Actions
+
+This is the preferred way, but takes a little bit longer.
+
+1. Create a new branch
+
+2. [Revert the commit](https://git-scm.com/docs/git-revert)
+
+3. Push & create a new PR
+
+4. Merge PR & Deploy it
+
+If the rollback is prompted by an incident where the worker is entirely unavailable (i.e. all requests failing) or there is a security vulnerability present,
+a Collaborator may forcibly push the commit reverting the release onto the `main` branch.
+
+## Option B: via Cloudflare Dash
+
+This requires `Workers Admin` permissions on Node.js' Cloudflare account.
+
+1. Go to the Release Worker's [deployment page](https://dash.cloudflare.com/?account=/workers/services/view/dist-worker/production/deployments)
+
+2. Find the previously deployed version in the table
+
+3. Click the three dots on the right side of the version's entry, then click `Rollback to v...`
+
+4. Make a revert commit to reflect the change in Git (see [Option A](#option-a-via-github-actions)).
diff --git a/docs/sops/switch-between-worker-and-origin.md b/docs/sops/switch-between-worker-and-origin.md
new file mode 100644
index 0000000..3fdafeb
--- /dev/null
+++ b/docs/sops/switch-between-worker-and-origin.md
@@ -0,0 +1,27 @@
+# Switching Between The Worker and The Origin Server
+
+Steps for toggling production traffic between the Release Worker and the origin server.
+
+This is most relevant during incidents involving the Release Worker.
+
+## Option A. Worker Routes
+
+You need write access to Node.js' Cloudflare account for this option.
+
+> [!NOTE]
+> This assumes the Cloudflare config for the origin server has remained intact
+> and is still production-ready.
+
+### Steps
+
+- Go to https://dash.cloudflare.com/07be8d2fbc940503ca1be344714cb0d1/nodejs.org/workers
+
+- Disable the routes that point to `dist-worker-prod`
+
+## Option B. Release Worker Routing
+
+- Go to [src/routes/index.ts](../../src/routes/index.ts).
+
+- Order the `R2Middleware` and `OriginMiddleware` entries to reflect the order in which they should be invoked.
+  For example, prioritizing the origin server over R2 means the `OriginMiddleware` should appear before the `R2Middleware`.
+  The reverse applies when prioritizing R2.
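
The effect of the ordering in Option B can be illustrated with a small sketch. It does not reproduce `src/routes/index.ts`: `Source`, `Fetcher`, `orderFetchers`, and `serve` are hypothetical names, and the fetchers are synchronous stand-ins for the real middlewares. It assumes, as the architecture docs describe, that the later entry acts as a fallback when the earlier one fails.

```typescript
// Hypothetical stand-ins; these are not the contents of src/routes/index.ts.
type Source = "r2" | "origin";
type Fetcher = { name: Source; fetch: (path: string) => string };

const r2Fetcher: Fetcher = { name: "r2", fetch: (path) => `r2:${path}` };
const originFetcher: Fetcher = { name: "origin", fetch: (path) => `origin:${path}` };

// The preferred source goes first; the other one remains as the fallback.
function orderFetchers(prefer: Source): Fetcher[] {
  return prefer === "r2" ? [r2Fetcher, originFetcher] : [originFetcher, r2Fetcher];
}

// Tries each fetcher in order and moves on to the next one if it throws.
function serve(fetchers: Fetcher[], path: string): string {
  let lastError: unknown;
  for (const fetcher of fetchers) {
    try {
      return fetcher.fetch(path);
    } catch (error) {
      lastError = error;
    }
  }
  throw lastError ?? new Error("no fetchers configured");
}
```

Under these assumptions, `serve(orderFetchers("origin"), path)` answers from the origin stand-in first, and swapping the argument to `"r2"` restores R2 as the primary source with the origin as the fallback.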