<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>LLM on Conselara Labs</title>
    <link>https://conselara.dev/tags/llm/</link>
    <description>Recent content in LLM on Conselara Labs</description>
    <generator>Hugo</generator>
    <language>en</language>
    <lastBuildDate>Thu, 14 May 2026 00:00:00 +0000</lastBuildDate>
    <atom:link href="https://conselara.dev/tags/llm/index.xml" rel="self" type="application/rss+xml" />
    <item>
      <title>Building a Two-Node Ray Cluster for Distributed LLM Inference on DGX Spark</title>
      <link>https://conselara.dev/notes/two-node-ray-cluster-dgx-spark/</link>
      <pubDate>Thu, 14 May 2026 00:00:00 +0000</pubDate>
      <guid>https://conselara.dev/notes/two-node-ray-cluster-dgx-spark/</guid>
      <description>Step-by-step setup for a two-node Ray cluster running Qwen3-235B across two DGX Sparks, including RoCE verification, model sync, NCCL configuration, and the specific gotchas that cause silent failures.</description>
    </item>
    <item>
      <title>DGX Spark GB10 Hardware Reference: SM121 Architecture, Memory, and Networking</title>
      <link>https://conselara.dev/notes/dgx-spark-gb10-hardware-reference/</link>
      <pubDate>Thu, 14 May 2026 00:00:00 +0000</pubDate>
      <guid>https://conselara.dev/notes/dgx-spark-gb10-hardware-reference/</guid>
      <description>Architecture constraints, memory behavior, networking setup, and kernel compatibility for the NVIDIA DGX Spark GB10 Grace Blackwell Superchip (SM121).</description>
    </item>
    <item>
      <title>vLLM on DGX Spark: What the SM121 Architecture Actually Requires</title>
      <link>https://conselara.dev/notes/vllm-dgx-spark-sm121-gotchas/</link>
      <pubDate>Wed, 13 May 2026 00:00:00 +0000</pubDate>
      <guid>https://conselara.dev/notes/vllm-dgx-spark-sm121-gotchas/</guid>
      <description>Hard-won config rules for running vLLM on DGX Spark GB10 (SM121). Covers broken backends, unified memory limits, quantization traps, multi-node NCCL, and flags that silently destroy throughput.</description>
    </item>
    <item>
      <title>We Replaced an MCP Server with FastAPI and It Worked Everywhere</title>
      <link>https://conselara.dev/notes/mcp-to-fastapi-lessons-learned/</link>
      <pubDate>Tue, 12 May 2026 00:00:00 +0000</pubDate>
      <guid>https://conselara.dev/notes/mcp-to-fastapi-lessons-learned/</guid>
      <description>We built a company knowledge base server as an MCP SSE endpoint. It worked in Claude Code and nowhere else. Here is what we learned and how we fixed it.</description>
    </item>
    <item>
      <title>AI Across a Health Research Information Platform</title>
      <link>https://conselara.dev/notes/ai-health-research-platform/</link>
      <pubDate>Sat, 09 May 2026 00:00:00 +0000</pubDate>
      <guid>https://conselara.dev/notes/ai-health-research-platform/</guid>
      <description>How we are integrating AI into a federal health research platform: publication discovery, LLM evaluations, AI-assisted development, and vetting AI-powered CMS modules.</description>
    </item>
    <item>
      <title>DGX Spark Model Comparison: What Fits and What Runs (SM121, 128 GB)</title>
      <link>https://conselara.dev/notes/dgx-spark-model-comparison/</link>
      <pubDate>Sat, 09 May 2026 00:00:00 +0000</pubDate>
      <guid>https://conselara.dev/notes/dgx-spark-model-comparison/</guid>
      <description>A quick-reference comparison of open-weight models for a single DGX Spark GB10 — what fits in 128 GB unified memory, expected throughput, and SM121 compatibility notes.</description>
    </item>
    <item>
      <title>vLLM Model Selection for DGX Spark (SM121)</title>
      <link>https://conselara.dev/notes/dgx-spark-model-selection/</link>
      <pubDate>Sat, 09 May 2026 00:00:00 +0000</pubDate>
      <guid>https://conselara.dev/notes/dgx-spark-model-selection/</guid>
      <description>How to choose a model for the DGX Spark GB10 (SM121). Covers architecture compatibility, quantization format requirements, and what to expect from each option.</description>
    </item>
    <item>
      <title>Running Qwen3.5-122B on a Single DGX Spark</title>
      <link>https://conselara.dev/notes/dgx-spark-qwen35-122b/</link>
      <pubDate>Tue, 05 May 2026 00:00:00 +0000</pubDate>
      <guid>https://conselara.dev/notes/dgx-spark-qwen35-122b/</guid>
      <description>Qwen3.5-122B-A10B, quantized to NVFP4, runs at 51 tok/s on a single DGX Spark. This post covers the SM121 constraints, required vLLM flags, and what to expect.</description>
    </item>
  </channel>
</rss>
