// Launch and tune SGLang serving — RadixAttention prefix caching, structured generation, tensor parallelism, vLLM benchmarks.

SGLang Inference Server · Verified Creator

git log --oneline --stat

Stars: 6.2k
Forks: 341
Updated: Jun 24, 2026

License: MIT
Version: v1.3.0

quickstart.sh

3 steps

1
Install
// Drops SKILL.md into ~/.claude/skills/
```
$ claude skills add sglang-inference-server
```

2
Invoke
// Run from any project directory
```
$ claude --skill sglang-inference-server "fine-tune on this CSV"
```

3
Iterate
// Re-run with edits; Claude keeps the skill loaded
```
$ claude --skill sglang-inference-server "now refactor it"
```

sglang-inference-server/

references

SKILL.md

readonly

name:: SGLang Inference Server
slug:: sglang-inference-server
version:: v1.3.0
license:: MIT
author:: @sgl-serve
repository:: github.com/sgl-serve/sglang-inference-server
categories:: ML / AI
tags:: #sglang #llm-serving #radixattention #inference #vllm
description:: Launch and tune SGLang serving — RadixAttention prefix caching, structured generation, tensor parallelism, vLLM benchmarks.

features.md

3 capabilities

// What you can do with it

Automates the tedious parts of the workflow.
Gives Claude the right context, tools, and guardrails.
Produces consistent, reviewable output every time.

README.md

sglang-inference-server/README.md

5 sections
