Commit fe9baf1

Deploying to gh-pages from @ dstackai/dstack@7beadf5 🚀
1 parent 16fc34e commit fe9baf1

27 files changed

+896
-551
lines changed

33.9 KB
-20.6 KB
Binary file not shown.

assets/images/social/examples.png

338 Bytes

assets/images/social/index.png

-5.06 KB

blog/case-studies/index.html

Lines changed: 76 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -3609,6 +3609,17 @@
36093609
</label>
36103610
<ul class="md-nav__list" data-md-component="toc" data-md-scrollfix>
36113611

3612+
<li class="md-nav__item">
3613+
<a href="#how-toffee-streamlines-inference-and-cut-gpu-costs-with-dstack" class="md-nav__link">
3614+
<span class="md-ellipsis">
3615+
3616+
How Toffee streamlines inference and cuts GPU costs with dstack
3617+
3618+
</span>
3619+
</a>
3620+
3621+
</li>
3622+
36123623
<li class="md-nav__item">
36133624
<a href="#how-ea-uses-dstack-to-fast-track-ai-development" class="md-nav__link">
36143625
<span class="md-ellipsis">
@@ -3750,6 +3761,17 @@
37503761
</label>
37513762
<ul class="md-nav__list" data-md-component="toc" data-md-scrollfix>
37523763

3764+
<li class="md-nav__item">
3765+
<a href="#how-toffee-streamlines-inference-and-cut-gpu-costs-with-dstack" class="md-nav__link">
3766+
<span class="md-ellipsis">
3767+
3768+
How Toffee streamlines inference and cuts GPU costs with dstack
3769+
3770+
</span>
3771+
</a>
3772+
3773+
</li>
3774+
37533775
<li class="md-nav__item">
37543776
<a href="#how-ea-uses-dstack-to-fast-track-ai-development" class="md-nav__link">
37553777
<span class="md-ellipsis">
@@ -3826,6 +3848,17 @@
38263848
</label>
38273849
<ul class="md-nav__list" data-md-component="toc" data-md-scrollfix>
38283850

3851+
<li class="md-nav__item">
3852+
<a href="#how-toffee-streamlines-inference-and-cut-gpu-costs-with-dstack" class="md-nav__link">
3853+
<span class="md-ellipsis">
3854+
3855+
How Toffee streamlines inference and cuts GPU costs with dstack
3856+
3857+
</span>
3858+
</a>
3859+
3860+
</li>
3861+
38293862
<li class="md-nav__item">
38303863
<a href="#how-ea-uses-dstack-to-fast-track-ai-development" class="md-nav__link">
38313864
<span class="md-ellipsis">
@@ -3855,6 +3888,49 @@ <h1 id="case-studies">Case studies<a class="headerlink" href="#case-studies" tit
38553888
<article class="md-post md-post--excerpt">
38563889
<header class="md-post__header">
38573890

3891+
<div class="md-post__meta md-meta">
3892+
<ul class="md-meta__list">
3893+
<li class="md-meta__item">
3894+
<time datetime="2025-12-05 00:00:00+00:00">December 5, 2025</time></li>
3895+
3896+
<li class="md-meta__item">
3897+
in
3898+
3899+
<a href="./" class="md-meta__link">Case studies</a></li>
3900+
3901+
3902+
3903+
<li class="md-meta__item">
3904+
3905+
4 min read
3906+
3907+
</li>
3908+
3909+
3910+
</ul>
3911+
3912+
</div>
3913+
</header>
3914+
<div class="md-post__content md-typeset">
3915+
<h2 id="how-toffee-streamlines-inference-and-cut-gpu-costs-with-dstack"><a class="toclink" href="../toffee/">How Toffee streamlines inference and cuts GPU costs with dstack</a></h2>
3916+
<p>In a recent engineering <a href="https://research.toffee.ai/blog/how-we-use-dstack-at-toffee">blog post</a>, Toffee shared how they use <code>dstack</code> to run large language and image-generation models across multiple GPU clouds while keeping their core backend on AWS. This case study summarizes the key insights and highlights how <code>dstack</code> became the backbone of Toffee’s multi-cloud inference stack.</p>
3917+
<p><img src="https://dstack.ai/static-assets/static-assets/images/dstack-toffee.png" width="630" /></p>
3918+
3919+
3920+
<nav class="md-post__action">
3921+
<a href="../toffee/">
3922+
<span>Continue reading</span>
3923+
<span class="icon"><svg viewBox="0 0 13 10" xmlns="http://www.w3.org/2000/svg"><path d="M12.823 4.164L8.954.182a.592.592 0 0 0-.854 0 .635.635 0 0 0 0 .88l2.836 2.92H.604A.614.614 0 0 0 0 4.604c0 .344.27.622.604.622h10.332L8.1 8.146a.635.635 0 0 0 0 .88.594.594 0 0 0 .854 0l3.869-3.982a.635.635 0 0 0 0-.88z" fill-rule="nonzero" fill="currentColor" class="fill-main"></path></svg></span>
3924+
</a>
3925+
</nav>
3926+
3927+
3928+
</div>
3929+
</article>
3930+
3931+
<article class="md-post md-post--excerpt">
3932+
<header class="md-post__header">
3933+
38583934
<div class="md-post__meta md-meta">
38593935
<ul class="md-meta__list">
38603936
<li class="md-meta__item">

blog/ea-gtc25/index.html

Lines changed: 1 addition & 1 deletion
@@ -3792,7 +3792,7 @@
37923792
<span class="md-ellipsis">
37933793

37943794

3795-
NVIDIA GTC 2025
3795+
NVIDIA GTC 2025
37963796

37973797

37983798

blog/index.html

Lines changed: 65 additions & 67 deletions
@@ -3486,6 +3486,17 @@
34863486
</label>
34873487
<ul class="md-nav__list" data-md-component="toc" data-md-scrollfix>
34883488

3489+
<li class="md-nav__item">
3490+
<a href="#how-toffee-streamlines-inference-and-cut-gpu-costs-with-dstack" class="md-nav__link">
3491+
<span class="md-ellipsis">
3492+
3493+
How Toffee streamlines inference and cuts GPU costs with dstack
3494+
3495+
</span>
3496+
</a>
3497+
3498+
</li>
3499+
34893500
<li class="md-nav__item">
34903501
<a href="#sglang-router-integration-and-disaggregated-inference-roadmap" class="md-nav__link">
34913502
<span class="md-ellipsis">
@@ -3583,17 +3594,6 @@
35833594
</span>
35843595
</a>
35853596

3586-
</li>
3587-
3588-
<li class="md-nav__item">
3589-
<a href="#supporting-hot-aisle-amd-ai-developer-cloud" class="md-nav__link">
3590-
<span class="md-ellipsis">
3591-
3592-
Supporting Hot Aisle AMD AI Developer Cloud
3593-
3594-
</span>
3595-
</a>
3596-
35973597
</li>
35983598

35993599
</ul>
@@ -3859,6 +3859,17 @@
38593859
</label>
38603860
<ul class="md-nav__list" data-md-component="toc" data-md-scrollfix>
38613861

3862+
<li class="md-nav__item">
3863+
<a href="#how-toffee-streamlines-inference-and-cut-gpu-costs-with-dstack" class="md-nav__link">
3864+
<span class="md-ellipsis">
3865+
3866+
How Toffee streamlines inference and cuts GPU costs with dstack
3867+
3868+
</span>
3869+
</a>
3870+
3871+
</li>
3872+
38623873
<li class="md-nav__item">
38633874
<a href="#sglang-router-integration-and-disaggregated-inference-roadmap" class="md-nav__link">
38643875
<span class="md-ellipsis">
@@ -3956,17 +3967,6 @@
39563967
</span>
39573968
</a>
39583969

3959-
</li>
3960-
3961-
<li class="md-nav__item">
3962-
<a href="#supporting-hot-aisle-amd-ai-developer-cloud" class="md-nav__link">
3963-
<span class="md-ellipsis">
3964-
3965-
Supporting Hot Aisle AMD AI Developer Cloud
3966-
3967-
</span>
3968-
</a>
3969-
39703970
</li>
39713971

39723972
</ul>
@@ -3993,6 +3993,49 @@ <h1 id="blog">Blog<a class="headerlink" href="#blog" title="Permanent link">&par
39933993
<article class="md-post md-post--excerpt">
39943994
<header class="md-post__header">
39953995

3996+
<div class="md-post__meta md-meta">
3997+
<ul class="md-meta__list">
3998+
<li class="md-meta__item">
3999+
<time datetime="2025-12-05 00:00:00+00:00">December 5, 2025</time></li>
4000+
4001+
<li class="md-meta__item">
4002+
in
4003+
4004+
<a href="case-studies/" class="md-meta__link">Case studies</a></li>
4005+
4006+
4007+
4008+
<li class="md-meta__item">
4009+
4010+
4 min read
4011+
4012+
</li>
4013+
4014+
4015+
</ul>
4016+
4017+
</div>
4018+
</header>
4019+
<div class="md-post__content md-typeset">
4020+
<h2 id="how-toffee-streamlines-inference-and-cut-gpu-costs-with-dstack"><a class="toclink" href="toffee/">How Toffee streamlines inference and cuts GPU costs with dstack</a></h2>
4021+
<p>In a recent engineering <a href="https://research.toffee.ai/blog/how-we-use-dstack-at-toffee">blog post</a>, Toffee shared how they use <code>dstack</code> to run large language and image-generation models across multiple GPU clouds while keeping their core backend on AWS. This case study summarizes the key insights and highlights how <code>dstack</code> became the backbone of Toffee’s multi-cloud inference stack.</p>
4022+
<p><img src="https://dstack.ai/static-assets/static-assets/images/dstack-toffee.png" width="630" /></p>
4023+
4024+
4025+
<nav class="md-post__action">
4026+
<a href="toffee/">
4027+
<span>Continue reading</span>
4028+
<span class="icon"><svg viewBox="0 0 13 10" xmlns="http://www.w3.org/2000/svg"><path d="M12.823 4.164L8.954.182a.592.592 0 0 0-.854 0 .635.635 0 0 0 0 .88l2.836 2.92H.604A.614.614 0 0 0 0 4.604c0 .344.27.622.604.622h10.332L8.1 8.146a.635.635 0 0 0 0 .88.594.594 0 0 0 .854 0l3.869-3.982a.635.635 0 0 0 0-.88z" fill-rule="nonzero" fill="currentColor" class="fill-main"></path></svg></span>
4029+
</a>
4030+
</nav>
4031+
4032+
4033+
</div>
4034+
</article>
4035+
4036+
<article class="md-post md-post--excerpt">
4037+
<header class="md-post__header">
4038+
39964039
<div class="md-post__meta md-meta">
39974040
<ul class="md-meta__list">
39984041
<li class="md-meta__item">
@@ -4376,51 +4419,6 @@ <h2 id="introducing-passive-gpu-health-checks"><a class="toclink" href="gpu-helt
43764419
</div>
43774420
</article>
43784421

4379-
<article class="md-post md-post--excerpt">
4380-
<header class="md-post__header">
4381-
4382-
<div class="md-post__meta md-meta">
4383-
<ul class="md-meta__list">
4384-
<li class="md-meta__item">
4385-
<time datetime="2025-08-11 00:00:00+00:00">August 11, 2025</time></li>
4386-
4387-
<li class="md-meta__item">
4388-
in
4389-
4390-
<a href="changelog/" class="md-meta__link">Changelog</a></li>
4391-
4392-
4393-
4394-
<li class="md-meta__item">
4395-
4396-
3 min read
4397-
4398-
</li>
4399-
4400-
4401-
</ul>
4402-
4403-
</div>
4404-
</header>
4405-
<div class="md-post__content md-typeset">
4406-
<h2 id="supporting-hot-aisle-amd-ai-developer-cloud"><a class="toclink" href="hotaisle/">Supporting Hot Aisle AMD AI Developer Cloud</a></h2>
4407-
<p>As the ecosystem around AMD GPUs matures, developers are looking for easier ways to experiment with ROCm, benchmark new architectures, and run cost-effective workloads—without manual infrastructure setup. </p>
4408-
<p><code>dstack</code> is an open-source orchestrator designed for AI workloads, providing a lightweight, container-native alternative to Kubernetes and Slurm.</p>
4409-
<p><img src="https://dstack.ai/static-assets/static-assets/images/dstack-hotaisle.png" width="630"/></p>
4410-
<p>Today, we’re excited to announce native integration with <a href="https://www.hotaisle.io/">Hot Aisle</a>, an AMD-only GPU neocloud offering VMs and clusters at highly competitive on-demand pricing. </p>
4411-
4412-
4413-
<nav class="md-post__action">
4414-
<a href="hotaisle/">
4415-
<span>Continue reading</span>
4416-
<span class="icon"><svg viewBox="0 0 13 10" xmlns="http://www.w3.org/2000/svg"><path d="M12.823 4.164L8.954.182a.592.592 0 0 0-.854 0 .635.635 0 0 0 0 .88l2.836 2.92H.604A.614.614 0 0 0 0 4.604c0 .344.27.622.604.622h10.332L8.1 8.146a.635.635 0 0 0 0 .88.594.594 0 0 0 .854 0l3.869-3.982a.635.635 0 0 0 0-.88z" fill-rule="nonzero" fill="currentColor" class="fill-main"></path></svg></span>
4417-
</a>
4418-
</nav>
4419-
4420-
4421-
</div>
4422-
</article>
4423-
44244422

44254423

44264424
