|
25 | 25 |
|
26 | 26 |
|
27 | 27 | <link rel="icon" href="../../../assets/images/dstack-fav-32.ico"> |
28 | | - <meta name="generator" content="mkdocs-1.6.1, mkdocs-material-9.6.4+insiders-4.53.15"> |
| 28 | + <meta name="generator" content="mkdocs-1.6.1, mkdocs-material-9.6.5+insiders-4.53.15"> |
29 | 29 |
|
30 | 30 |
|
31 | 31 |
|
|
2880 | 2880 | </label> |
2881 | 2881 | <ul class="md-nav__list" data-md-component="toc" data-md-scrollfix> |
2882 | 2882 |
|
| 2883 | + <li class="md-nav__item"> |
| 2884 | + <a href="#efficient-distributed-training-with-aws-efa" class="md-nav__link"> |
| 2885 | + <span class="md-ellipsis"> |
| 2886 | + |
| 2887 | + Efficient distributed training with AWS EFA |
| 2888 | + |
| 2889 | + </span> |
| 2890 | + </a> |
| 2891 | + |
| 2892 | +</li> |
| 2893 | + |
2883 | 2894 | <li class="md-nav__item"> |
2884 | 2895 | <a href="#auto-shutdown-for-inactive-dev-environmentsno-idle-gpus" class="md-nav__link"> |
2885 | 2896 | <span class="md-ellipsis"> |
|
3422 | 3433 | </label> |
3423 | 3434 | <ul class="md-nav__list" data-md-component="toc" data-md-scrollfix> |
3424 | 3435 |
|
| 3436 | + <li class="md-nav__item"> |
| 3437 | + <a href="#efficient-distributed-training-with-aws-efa" class="md-nav__link"> |
| 3438 | + <span class="md-ellipsis"> |
| 3439 | + |
| 3440 | + Efficient distributed training with AWS EFA |
| 3441 | + |
| 3442 | + </span> |
| 3443 | + </a> |
| 3444 | + |
| 3445 | +</li> |
| 3446 | + |
3425 | 3447 | <li class="md-nav__item"> |
3426 | 3448 | <a href="#auto-shutdown-for-inactive-dev-environmentsno-idle-gpus" class="md-nav__link"> |
3427 | 3449 | <span class="md-ellipsis"> |
@@ -3473,6 +3495,51 @@ <h1 id="2025">2025<a class="headerlink" href="#2025" title="Permanent link">&par |
3473 | 3495 | <article class="md-post md-post--excerpt"> |
3474 | 3496 | <header class="md-post__header"> |
3475 | 3497 |
|
| 3498 | + <div class="md-post__meta md-meta"> |
| 3499 | + <ul class="md-meta__list"> |
| 3500 | + <li class="md-meta__item"> |
| 3501 | + <time datetime="2025-02-20 00:00:00+00:00">February 20, 2025</time></li> |
| 3502 | + |
| 3503 | + <li class="md-meta__item"> |
| 3504 | + in |
| 3505 | + |
| 3506 | + <a href="../../category/fleets/" class="md-meta__link">Fleets</a></li> |
| 3507 | + |
| 3508 | + |
| 3509 | + |
| 3510 | + <li class="md-meta__item"> |
| 3511 | + |
| 3512 | + 3 min read |
| 3513 | + |
| 3514 | + </li> |
| 3515 | + |
| 3516 | + |
| 3517 | + </ul> |
| 3518 | + |
| 3519 | + </div> |
| 3520 | + </header> |
| 3521 | + <div class="md-post__content md-typeset"> |
| 3522 | + <h2 id="efficient-distributed-training-with-aws-efa"><a class="toclink" href="../../distributed-training-with-aws-efa/">Efficient distributed training with AWS EFA</a></h2> |
| 3523 | +<p><a href="https://aws.amazon.com/hpc/efa/" target="_blank">Amazon Elastic Fabric Adapter (EFA) <span class="twemoji external"><svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24"><path d="m11.93 5 2.83 2.83L5 17.59 6.42 19l9.76-9.75L19 12.07V5z"/></svg></span></a> is a high-performance network interface designed for AWS EC2 instances, enabling |
| 3524 | +ultra-low latency and high-throughput communication between nodes. This makes it an ideal solution for scaling |
| 3525 | +distributed training workloads across multiple GPUs and instances.</p> |
| 3526 | +<p>With the latest release of <code>dstack</code>, you can now leverage AWS EFA to supercharge your distributed training tasks.</p> |
| 3527 | +<p><img src="https://github.com/dstackai/static-assets/blob/main/static-assets/images/distributed-training-with-aws-efa-v2.png?raw=true" width="630"/></p> |
| 3528 | + |
| 3529 | + |
| 3530 | + <nav class="md-post__action"> |
| 3531 | + <a href="../../distributed-training-with-aws-efa/"> |
| 3532 | + Continue reading |
| 3533 | + </a> |
| 3534 | + </nav> |
| 3535 | + |
| 3536 | + |
| 3537 | + </div> |
| 3538 | +</article> |
| 3539 | + |
| 3540 | + <article class="md-post md-post--excerpt"> |
| 3541 | + <header class="md-post__header"> |
| 3542 | + |
3476 | 3543 | <div class="md-post__meta md-meta"> |
3477 | 3544 | <ul class="md-meta__list"> |
3478 | 3545 | <li class="md-meta__item"> |
|