Commit 807fd5e

Deploying to gh-pages from @ dstackai/dstack@0255f94 🚀
1 parent b80b41a commit 807fd5e

16 files changed

+4679
-352
lines changed


blog/changelog/index.html

Lines changed: 66 additions & 70 deletions
Original file line numberDiff line numberDiff line change
@@ -3575,6 +3575,17 @@
35753575
</label>
35763576
<ul class="md-nav__list" data-md-component="toc" data-md-scrollfix>
35773577

3578+
<li class="md-nav__item">
3579+
<a href="#sglang-router-integration-and-disaggregated-inference-roadmap" class="md-nav__link">
3580+
<span class="md-ellipsis">
3581+
3582+
SGLang router integration and disaggregated inference roadmap
3583+
3584+
</span>
3585+
</a>
3586+
3587+
</li>
3588+
35783589
<li class="md-nav__item">
35793590
<a href="#orchestrating-gpus-on-kubernetes-clusters" class="md-nav__link">
35803591
<span class="md-ellipsis">
@@ -3672,17 +3683,6 @@
36723683
</span>
36733684
</a>
36743685

3675-
</li>
3676-
3677-
<li class="md-nav__item">
3678-
<a href="#built-in-ui-for-monitoring-essential-gpu-metrics" class="md-nav__link">
3679-
<span class="md-ellipsis">
3680-
3681-
Built-in UI for monitoring essential GPU metrics
3682-
3683-
</span>
3684-
</a>
3685-
36863686
</li>
36873687

36883688
</ul>
@@ -3861,6 +3861,17 @@
38613861
</label>
38623862
<ul class="md-nav__list" data-md-component="toc" data-md-scrollfix>
38633863

3864+
<li class="md-nav__item">
3865+
<a href="#sglang-router-integration-and-disaggregated-inference-roadmap" class="md-nav__link">
3866+
<span class="md-ellipsis">
3867+
3868+
SGLang router integration and disaggregated inference roadmap
3869+
3870+
</span>
3871+
</a>
3872+
3873+
</li>
3874+
38643875
<li class="md-nav__item">
38653876
<a href="#orchestrating-gpus-on-kubernetes-clusters" class="md-nav__link">
38663877
<span class="md-ellipsis">
@@ -3958,17 +3969,6 @@
39583969
</span>
39593970
</a>
39603971

3961-
</li>
3962-
3963-
<li class="md-nav__item">
3964-
<a href="#built-in-ui-for-monitoring-essential-gpu-metrics" class="md-nav__link">
3965-
<span class="md-ellipsis">
3966-
3967-
Built-in UI for monitoring essential GPU metrics
3968-
3969-
</span>
3970-
</a>
3971-
39723972
</li>
39733973

39743974
</ul>
@@ -3989,6 +3989,50 @@ <h1 id="changelog">Changelog<a class="headerlink" href="#changelog" title="Perma
39893989
<article class="md-post md-post--excerpt">
39903990
<header class="md-post__header">
39913991

3992+
<div class="md-post__meta md-meta">
3993+
<ul class="md-meta__list">
3994+
<li class="md-meta__item">
3995+
<time datetime="2025-11-25 00:00:00+00:00">November 25, 2025</time></li>
3996+
3997+
<li class="md-meta__item">
3998+
in
3999+
4000+
<a href="./" class="md-meta__link">Changelog</a></li>
4001+
4002+
4003+
4004+
<li class="md-meta__item">
4005+
4006+
3 min read
4007+
4008+
</li>
4009+
4010+
4011+
</ul>
4012+
4013+
</div>
4014+
</header>
4015+
<div class="md-post__content md-typeset">
4016+
<h2 id="sglang-router-integration-and-disaggregated-inference-roadmap"><a class="toclink" href="../sglang-router/">SGLang router integration and disaggregated inference roadmap</a></h2>
4017+
<p><a href="https://github.com/dstackai/dstack/">dstack</a> provides a streamlined way to handle GPU provisioning and workload orchestration across GPU clouds, Kubernetes clusters, or on-prem environments. Built for interoperability, dstack bridges diverse hardware and open-source tooling.</p>
4018+
<p><img src="https://dstack.ai/static-assets/static-assets/images/dstack-sglang-router.png" width="630"/></p>
4019+
<p>As disaggregated, low-latency inference emerges, we aim to ensure this new stack runs natively on <code>dstack</code>. To move this forward, we’re introducing native integration between dstack and <a href="https://docs.sglang.ai/advanced_features/router.html">SGLang’s Model Gateway</a> (formerly known as the SGLang Router).</p>
4020+
4021+
4022+
<nav class="md-post__action">
4023+
<a href="../sglang-router/">
4024+
<span>Continue reading</span>
4025+
<span class="icon"><svg viewBox="0 0 13 10" xmlns="http://www.w3.org/2000/svg"><path d="M12.823 4.164L8.954.182a.592.592 0 0 0-.854 0 .635.635 0 0 0 0 .88l2.836 2.92H.604A.614.614 0 0 0 0 4.604c0 .344.27.622.604.622h10.332L8.1 8.146a.635.635 0 0 0 0 .88.594.594 0 0 0 .854 0l3.869-3.982a.635.635 0 0 0 0-.88z" fill-rule="nonzero" fill="currentColor" class="fill-main"></path></svg></span>
4026+
</a>
4027+
</nav>
4028+
4029+
4030+
</div>
4031+
</article>
4032+
4033+
<article class="md-post md-post--excerpt">
4034+
<header class="md-post__header">
4035+
39924036
<div class="md-post__meta md-meta">
39934037
<ul class="md-meta__list">
39944038
<li class="md-meta__item">
@@ -4412,54 +4456,6 @@ <h2 id="supporting-gpu-provisioning-and-orchestration-on-nebius"><a class="tocli
44124456
</div>
44134457
</article>
44144458

4415-
<article class="md-post md-post--excerpt">
4416-
<header class="md-post__header">
4417-
4418-
<div class="md-post__meta md-meta">
4419-
<ul class="md-meta__list">
4420-
<li class="md-meta__item">
4421-
<time datetime="2025-04-03 00:00:00+00:00">April 3, 2025</time></li>
4422-
4423-
<li class="md-meta__item">
4424-
in
4425-
4426-
<a href="./" class="md-meta__link">Changelog</a></li>
4427-
4428-
4429-
4430-
<li class="md-meta__item">
4431-
4432-
2 min read
4433-
4434-
</li>
4435-
4436-
4437-
</ul>
4438-
4439-
</div>
4440-
</header>
4441-
<div class="md-post__content md-typeset">
4442-
<h2 id="built-in-ui-for-monitoring-essential-gpu-metrics"><a class="toclink" href="../metrics-ui/">Built-in UI for monitoring essential GPU metrics</a></h2>
4443-
<p>AI workloads generate vast amounts of metrics, making it essential to have efficient monitoring tools. While our recent
4444-
update introduced the ability to export available metrics to Prometheus for maximum flexibility, there are times when
4445-
users need to quickly access essential metrics without the need to switch to an external tool.</p>
4446-
<p><img src="https://dstack.ai/static-assets/static-assets/images/dstack-metrics-ui-v3-min.png" width="630"/></p>
4447-
<p>Previously, we introduced a <a href="../dstack-metrics/">CLI command</a> that allows users to view essential GPU metrics for both NVIDIA
4448-
and AMD hardware. Now, with this latest update, we’re excited to announce the addition of a built-in dashboard within
4449-
the <code>dstack</code> control plane.</p>
4450-
4451-
4452-
<nav class="md-post__action">
4453-
<a href="../metrics-ui/">
4454-
<span>Continue reading</span>
4455-
<span class="icon"><svg viewBox="0 0 13 10" xmlns="http://www.w3.org/2000/svg"><path d="M12.823 4.164L8.954.182a.592.592 0 0 0-.854 0 .635.635 0 0 0 0 .88l2.836 2.92H.604A.614.614 0 0 0 0 4.604c0 .344.27.622.604.622h10.332L8.1 8.146a.635.635 0 0 0 0 .88.594.594 0 0 0 .854 0l3.869-3.982a.635.635 0 0 0 0-.88z" fill-rule="nonzero" fill="currentColor" class="fill-main"></path></svg></span>
4456-
</a>
4457-
</nav>
4458-
4459-
4460-
</div>
4461-
</article>
4462-
44634459

44644460

44654461

blog/changelog/page/2/index.html

Lines changed: 70 additions & 69 deletions
Original file line numberDiff line numberDiff line change
@@ -3573,6 +3573,17 @@
35733573
</label>
35743574
<ul class="md-nav__list" data-md-component="toc" data-md-scrollfix>
35753575

3576+
<li class="md-nav__item">
3577+
<a href="#built-in-ui-for-monitoring-essential-gpu-metrics" class="md-nav__link">
3578+
<span class="md-ellipsis">
3579+
3580+
Built-in UI for monitoring essential GPU metrics
3581+
3582+
</span>
3583+
</a>
3584+
3585+
</li>
3586+
35763587
<li class="md-nav__item">
35773588
<a href="#supporting-mpi-and-ncclrccl-tests" class="md-nav__link">
35783589
<span class="md-ellipsis">
@@ -3670,17 +3681,6 @@
36703681
</span>
36713682
</a>
36723683

3673-
</li>
3674-
3675-
<li class="md-nav__item">
3676-
<a href="#using-tpus-for-fine-tuning-and-deploying-llms" class="md-nav__link">
3677-
<span class="md-ellipsis">
3678-
3679-
Using TPUs for fine-tuning and deploying LLMs
3680-
3681-
</span>
3682-
</a>
3683-
36843684
</li>
36853685

36863686
</ul>
@@ -3859,6 +3859,17 @@
38593859
</label>
38603860
<ul class="md-nav__list" data-md-component="toc" data-md-scrollfix>
38613861

3862+
<li class="md-nav__item">
3863+
<a href="#built-in-ui-for-monitoring-essential-gpu-metrics" class="md-nav__link">
3864+
<span class="md-ellipsis">
3865+
3866+
Built-in UI for monitoring essential GPU metrics
3867+
3868+
</span>
3869+
</a>
3870+
3871+
</li>
3872+
38623873
<li class="md-nav__item">
38633874
<a href="#supporting-mpi-and-ncclrccl-tests" class="md-nav__link">
38643875
<span class="md-ellipsis">
@@ -3956,17 +3967,6 @@
39563967
</span>
39573968
</a>
39583969

3959-
</li>
3960-
3961-
<li class="md-nav__item">
3962-
<a href="#using-tpus-for-fine-tuning-and-deploying-llms" class="md-nav__link">
3963-
<span class="md-ellipsis">
3964-
3965-
Using TPUs for fine-tuning and deploying LLMs
3966-
3967-
</span>
3968-
</a>
3969-
39703970
</li>
39713971

39723972
</ul>
@@ -3987,6 +3987,54 @@ <h1 id="changelog">Changelog<a class="headerlink" href="#changelog" title="Perma
39873987
<article class="md-post md-post--excerpt">
39883988
<header class="md-post__header">
39893989

3990+
<div class="md-post__meta md-meta">
3991+
<ul class="md-meta__list">
3992+
<li class="md-meta__item">
3993+
<time datetime="2025-04-03 00:00:00+00:00">April 3, 2025</time></li>
3994+
3995+
<li class="md-meta__item">
3996+
in
3997+
3998+
<a href="../../" class="md-meta__link">Changelog</a></li>
3999+
4000+
4001+
4002+
<li class="md-meta__item">
4003+
4004+
2 min read
4005+
4006+
</li>
4007+
4008+
4009+
</ul>
4010+
4011+
</div>
4012+
</header>
4013+
<div class="md-post__content md-typeset">
4014+
<h2 id="built-in-ui-for-monitoring-essential-gpu-metrics"><a class="toclink" href="../../../metrics-ui/">Built-in UI for monitoring essential GPU metrics</a></h2>
4015+
<p>AI workloads generate vast amounts of metrics, making it essential to have efficient monitoring tools. While our recent
4016+
update introduced the ability to export available metrics to Prometheus for maximum flexibility, there are times when
4017+
users need to quickly access essential metrics without the need to switch to an external tool.</p>
4018+
<p><img src="https://dstack.ai/static-assets/static-assets/images/dstack-metrics-ui-v3-min.png" width="630"/></p>
4019+
<p>Previously, we introduced a <a href="../../../dstack-metrics/">CLI command</a> that allows users to view essential GPU metrics for both NVIDIA
4020+
and AMD hardware. Now, with this latest update, we’re excited to announce the addition of a built-in dashboard within
4021+
the <code>dstack</code> control plane.</p>
4022+
4023+
4024+
<nav class="md-post__action">
4025+
<a href="../../../metrics-ui/">
4026+
<span>Continue reading</span>
4027+
<span class="icon"><svg viewBox="0 0 13 10" xmlns="http://www.w3.org/2000/svg"><path d="M12.823 4.164L8.954.182a.592.592 0 0 0-.854 0 .635.635 0 0 0 0 .88l2.836 2.92H.604A.614.614 0 0 0 0 4.604c0 .344.27.622.604.622h10.332L8.1 8.146a.635.635 0 0 0 0 .88.594.594 0 0 0 .854 0l3.869-3.982a.635.635 0 0 0 0-.88z" fill-rule="nonzero" fill="currentColor" class="fill-main"></path></svg></span>
4028+
</a>
4029+
</nav>
4030+
4031+
4032+
</div>
4033+
</article>
4034+
4035+
<article class="md-post md-post--excerpt">
4036+
<header class="md-post__header">
4037+
39904038
<div class="md-post__meta md-meta">
39914039
<ul class="md-meta__list">
39924040
<li class="md-meta__item">
@@ -4440,53 +4488,6 @@ <h3 id="how-it-works" style="display:none"><a class="toclink" href="../../../dst
44404488
</div>
44414489
</article>
44424490

4443-
<article class="md-post md-post--excerpt">
4444-
<header class="md-post__header">
4445-
4446-
<div class="md-post__meta md-meta">
4447-
<ul class="md-meta__list">
4448-
<li class="md-meta__item">
4449-
<time datetime="2024-09-10 00:00:00+00:00">September 10, 2024</time></li>
4450-
4451-
<li class="md-meta__item">
4452-
in
4453-
4454-
<a href="../../" class="md-meta__link">Changelog</a></li>
4455-
4456-
4457-
4458-
<li class="md-meta__item">
4459-
4460-
4 min read
4461-
4462-
</li>
4463-
4464-
4465-
</ul>
4466-
4467-
</div>
4468-
</header>
4469-
<div class="md-post__content md-typeset">
4470-
<h2 id="using-tpus-for-fine-tuning-and-deploying-llms"><a class="toclink" href="../../../tpu-on-gcp/">Using TPUs for fine-tuning and deploying LLMs</a></h2>
4471-
<p>If you’re using or planning to use TPUs with Google Cloud, you can now do so via <code>dstack</code>. Just specify the TPU version and the number of cores
4472-
(separated by a dash), in the <code>gpu</code> property under <code>resources</code>. </p>
4473-
<p>Read below to find out how to use TPUs with <code>dstack</code> for fine-tuning and deploying
4474-
LLMs, leveraging open-source tools like Hugging Face’s
4475-
<a href="https://github.com/huggingface/optimum-tpu">Optimum TPU</a>
4476-
and <a href="https://docs.vllm.ai/en/latest/getting_started/tpu-installation.html">vLLM</a>.</p>
4477-
4478-
4479-
<nav class="md-post__action">
4480-
<a href="../../../tpu-on-gcp/">
4481-
<span>Continue reading</span>
4482-
<span class="icon"><svg viewBox="0 0 13 10" xmlns="http://www.w3.org/2000/svg"><path d="M12.823 4.164L8.954.182a.592.592 0 0 0-.854 0 .635.635 0 0 0 0 .88l2.836 2.92H.604A.614.614 0 0 0 0 4.604c0 .344.27.622.604.622h10.332L8.1 8.146a.635.635 0 0 0 0 .88.594.594 0 0 0 .854 0l3.869-3.982a.635.635 0 0 0 0-.88z" fill-rule="nonzero" fill="currentColor" class="fill-main"></path></svg></span>
4483-
</a>
4484-
</nav>
4485-
4486-
4487-
</div>
4488-
</article>
4489-
44904491

44914492

44924493
