// Launch and tune SGLang serving — RadixAttention prefix caching, structured generation, tensor parallelism, vLLM benchmarks.

SGLang Inference Server
Verified Creator

repo --stat
  • stars: 6.2k
  • forks: 341
  • last update: Jun 24, 2026
  • license: MIT
  • version: v1.3.0

quickstart.sh
3 steps
  1. Install

    // Drops SKILL.md into ~/.claude/skills/

    $ claude skills add sglang-inference-server
  2. Invoke

    // Run from any project directory

    $ claude --skill sglang-inference-server "launch an SGLang server for this model"
  3. Iterate

    // Re-run with edits — Claude keeps the skill loaded

    $ claude --skill sglang-inference-server "now refactor it"
sglang-inference-server/
references
SKILL.md
readonly
name: SGLang Inference Server
slug: sglang-inference-server
version: v1.3.0
license: MIT
author: @sgl-serve
repository: github.com/sgl-serve/sglang-inference-server
categories:
tags: #sglang #llm-serving #radixattention #inference #vllm
description:

Launch and tune SGLang serving — RadixAttention prefix caching, structured generation, tensor parallelism, vLLM benchmarks.

features.md
3 capabilities

// What you can do with it

  • Automates the tedious parts of the workflow.
  • Gives Claude the right context, tools, and guardrails.
  • Produces consistent, reviewable output every time.

README.md

sglang-inference-server/README.md
5 sections
