Connect Claude Code to ContextStellar
Auto-score every prompt in your Claude Code sessions. Get real-time quality feedback injected into your context window — and contribute to a scoring model that gets smarter with every use.
How it works
Hook fires on each tool use
Claude Code sends a POST request with the tool input before executing it.
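As a rough sketch, the hook's POST body might look like the following. The exact field names and shape are illustrative assumptions, not the documented ContextStellar schema:

```python
import json

# Hypothetical PreToolUse payload — field names are illustrative
# assumptions, not the documented ContextStellar request schema.
payload = {
    "hook_event_name": "PreToolUse",
    "tool_name": "Bash",
    "tool_input": {"command": "pytest -q"},
}

# The hook serializes this and POSTs it to the configured URL
# with your Bearer token before the tool runs.
body = json.dumps(payload)
```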
ContextStellar scores the prompt
Our 5-dimension scorer evaluates token efficiency, density, structure, specificity, and cache-friendliness.
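A weighted combination over those five dimensions could look like this minimal sketch. The specific weights and the 0–100 scale are assumptions for illustration, not ContextStellar's actual tuning:

```python
# Assumed default weights over the five scoring dimensions (illustrative).
DEFAULT_WEIGHTS = {
    "token_efficiency": 0.25,
    "density": 0.20,
    "structure": 0.20,
    "specificity": 0.20,
    "cache_friendliness": 0.15,
}

def overall_score(dimension_scores: dict[str, float]) -> float:
    """Combine per-dimension scores (0-100) into one weighted score."""
    return sum(DEFAULT_WEIGHTS[d] * dimension_scores[d] for d in DEFAULT_WEIGHTS)

scores = {"token_efficiency": 80, "density": 70, "structure": 90,
          "specificity": 60, "cache_friendliness": 75}
result = overall_score(scores)  # 75.25 with these example inputs
```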
Feedback injected as context
The score, grade, and top tip are returned as additionalContext, so Claude sees the feedback in real time.
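Consuming that feedback might look like the following sketch. The response keys other than additionalContext (which is named above) are assumptions:

```python
import json

# Hypothetical hook response — only "additionalContext" is named in the
# docs above; the message format inside it is an illustrative assumption.
raw = '{"additionalContext": "Score 82/100 (B+). Tip: front-load key constraints."}'
feedback = json.loads(raw)["additionalContext"]
# Claude Code injects this string into the model's context window.
```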
Outcomes improve the model
PostToolUse hooks capture success/failure. Over time, learned weights adapt to what actually works.
Setup (2 minutes)
Step 1: Enter your API key
Get your API key from the dashboard settings. If you don't have a project yet, create one first.
Step 2: Add to your Claude Code settings
Paste this into ~/.claude/settings.json (or your project's .claude/settings.json):
{
  "hooks": {
    "PreToolUse": [
      {
        "url": "https://contextstellar.com/api/v1/hooks/claude-code",
        "headers": {
          "Authorization": "Bearer cs_live_YOUR_API_KEY"
        }
      }
    ],
    "PostToolUse": [
      {
        "url": "https://contextstellar.com/api/v1/hooks/claude-code",
        "headers": {
          "Authorization": "Bearer cs_live_YOUR_API_KEY"
        }
      }
    ],
    "Stop": [
      {
        "url": "https://contextstellar.com/api/v1/hooks/claude-code",
        "headers": {
          "Authorization": "Bearer cs_live_YOUR_API_KEY"
        }
      }
    ]
  }
}
Step 3: Start using Claude Code
That's it! Every tool call in your Claude Code sessions will now be scored. You'll see quality feedback in Claude's context, and your outcomes will train the adaptive scoring model over time.
The Learning Loop
ContextStellar's scoring model improves with every session. Here's how:
Dimension Weights
The 5 scoring dimensions start with hand-tuned weights. As outcome data accumulates, weights shift toward dimensions that actually predict success.
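One simple way such a shift could work is to nudge each weight toward that dimension's correlation with success and renormalize. This update rule is an assumption for illustration, not ContextStellar's actual algorithm:

```python
# Illustrative weight update: nudge each dimension's weight toward its
# observed correlation with success, floor at a small positive value,
# then renormalize so the weights still sum to 1. Hypothetical rule.
def update_weights(weights: dict[str, float],
                   success_corr: dict[str, float],
                   lr: float = 0.05) -> dict[str, float]:
    raw = {d: max(w + lr * success_corr[d], 0.01) for d, w in weights.items()}
    total = sum(raw.values())
    return {d: v / total for d, v in raw.items()}

weights = {"token_efficiency": 0.25, "density": 0.20, "structure": 0.20,
           "specificity": 0.20, "cache_friendliness": 0.15}
# Hypothetical correlations: specificity predicts success, density hurts.
corr = {"token_efficiency": 0.3, "density": -0.1, "structure": 0.1,
        "specificity": 0.5, "cache_friendliness": 0.0}
new_weights = update_weights(weights, corr)
```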
Prompt Type Adaptation
Technical prompts weight specificity higher. Creative prompts preserve density. The model learns per-type weights automatically.
Outcome Signals
Tool success/failure, latency, and token cost form the reward signal. Prompts that lead to successful, efficient outcomes reinforce their patterns.
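Those three signals could collapse into a scalar reward along these lines. The penalty weights and normalization caps are assumptions, not ContextStellar's actual formula:

```python
# Sketch of combining outcome signals into one reward value in [0, 1].
# The 0.25 penalty weights and the caps are illustrative assumptions.
def reward(success: bool, latency_ms: float, tokens: int) -> float:
    base = 1.0 if success else 0.0
    latency_penalty = min(latency_ms / 10_000, 1.0) * 0.25  # cap at 10s
    cost_penalty = min(tokens / 100_000, 1.0) * 0.25        # cap at 100k tokens
    return max(base - latency_penalty - cost_penalty, 0.0)

r = reward(success=True, latency_ms=2_000, tokens=10_000)  # 0.925 here
```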
Confidence Blending
Learned weights blend with defaults based on sample count. With <50 samples, defaults dominate. At 500+, learned weights take over.
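A minimal sketch of that blending, using the thresholds above: full trust in learned weights at 500 samples, near-total reliance on defaults below 50. The linear ramp is an assumption; only the endpoints come from the text:

```python
# Confidence blending: interpolate between a default weight and a
# learned weight based on sample count. Linear ramp to 500 samples
# is an assumed schedule; the 50/500 thresholds come from the docs.
def blended_weight(default: float, learned: float, n_samples: int) -> float:
    alpha = min(n_samples / 500, 1.0)  # trust in the learned weight
    return (1 - alpha) * default + alpha * learned

early = blended_weight(0.25, 0.40, 25)    # defaults dominate: 0.2575
mature = blended_weight(0.25, 0.40, 500)  # learned takes over: 0.40
```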
Privacy
- Prompt content is never stored. Only a SHA-256 hash is kept for deduplication.
- Only the 5-dimension score breakdown (numbers, not text) is persisted for learning.
- Outcome data (success/failure, latency) contains no prompt content.
- All data is scoped to your project and never shared across organizations.
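The hash-only storage described above can be demonstrated in a few lines: only a fixed-length digest is derived from the prompt, and the original text is not recoverable from it. The sample prompt is made up:

```python
import hashlib

# Hash-only storage: the prompt text is reduced to a SHA-256 digest
# for deduplication; the digest reveals nothing about the content.
prompt = "Refactor the auth module to use dependency injection."  # example text
digest = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
# digest is 64 hex characters regardless of prompt length
```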
Questions? Read the guide or check the dashboard.