
Local LLM Setup Guide

Hardware: Gaming rig · 32GB RAM · RTX 3070 (8GB VRAM) · Windows
Purpose: Campaign transcript parsing + Intelligence Brief generation
Runtime: Ollama (GPU-accelerated, local, no cloud, no API costs)
Phase 1
Install Ollama
1 Download Ollama
  1. Go to https://ollama.com/download/windows
  2. Click Download for Windows
  3. Run the installer — standard .exe, just click through
  4. Ollama installs as a background service and adds itself to your system tray
2 Verify Installation

Open PowerShell or Command Prompt (not as admin — just normal):

ollama --version

You should see something like ollama version 0.x.x. If you get "not recognized," restart your terminal or reboot.

3 Verify GPU Detection
ollama list

This should return an empty list (no models yet). The important thing is it runs without errors. Ollama auto-detects your 3070 and will use it for inference.

To confirm GPU is being used, after you pull your first model and run it, check Task Manager → Performance → GPU. You should see VRAM utilization spike when the model loads.

Phase 2
Pull Your Models
4 Pull Llama 3.1 8B (Primary Model)
ollama pull llama3.1:8b

Downloads ~4.7GB. 4-bit quantized by default — fits your 8GB VRAM cleanly. Wait for it to finish.

5 Pull Mistral 7B (Secondary — for convergent analysis later)
ollama pull mistral:7b

Downloads ~4.1GB. You don't need this immediately, but having it ready means you can run two-model comparison whenever you want.

6 Verify Models Are Installed
ollama list

You should see both models listed with their sizes.


Phase 3
Test It
7 Quick Smoke Test
ollama run llama3.1:8b "What is a d20?"

You should get a fast, coherent response about twenty-sided dice. If this works, your GPU inference is running. With a prompt argument like this, ollama prints the response and exits; if you end up in an interactive session, type /bye to leave it.

8 Test Structured Output (This Is What Matters)
ollama run llama3.1:8b

This opens an interactive session. Paste this:

Extract the following rolls into CSV format with columns:
pc, roll_type, die, result, modifier, total, target_ac, hit_flag

Session transcript excerpt:
Feliciano attacks Enforcer A with his longsword. He rolls a 14,
plus 4 to hit, total 18 against AC 14. That's a hit. He rolls
damage: 1d8+2, gets a 6, total 8 slashing damage.
Hillie fires Eldritch Blast at the Archer. She rolls a 7, plus 4
to hit, total 11 against AC 12. Miss — the blast scorches the scaffolding.
Feliciano makes a Constitution saving throw against poison. He rolls
a 3, plus 2, total 5 against DC 13. Fail.

Output CSV only, no explanation.

You should get clean CSV output. If you do, your parsing pipeline works. Type /bye to exit.
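Before trusting the model's CSV, it's worth sanity-checking it mechanically. A minimal Python sketch, assuming the 8-column schema from the smoke test above (the sample csv_text matches what a correct parse of the excerpt would look like):

```python
import csv
import io

# Sample output in the 8-column schema used by the smoke test above.
csv_text = """pc,roll_type,die,result,modifier,total,target_ac,hit_flag
Feliciano,attack,d20,14,4,18,14,1
Hillie,attack,d20,7,4,11,12,0
Feliciano,save,d20,3,2,5,13,0
"""

def check_rolls(text):
    """Return a list of problems found in the model's CSV output."""
    problems = []
    for i, row in enumerate(csv.DictReader(io.StringIO(text)), start=1):
        result, mod, total = int(row["result"]), int(row["modifier"]), int(row["total"])
        if result + mod != total:
            problems.append(f"row {i}: {result}+{mod} != {total}")
        if row["hit_flag"] not in ("1", "0", "-1"):
            problems.append(f"row {i}: bad hit_flag {row['hit_flag']}")
    return problems

print(check_rolls(csv_text))  # prints [] — the arithmetic checks out
```

An empty list means every row's total matches result + modifier; anything else is a row to compare against the transcript by hand.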


Phase 4
Set Up Your Prompts
9 Create a Project Folder
mkdir C:\BlindSpot
mkdir C:\BlindSpot\prompts
mkdir C:\BlindSpot\data
mkdir C:\BlindSpot\briefs
10 Create the Parsing Prompt

Create a file at C:\BlindSpot\prompts\parse_rolls.txt with this content:

SYSTEM: You are a D&D session transcript parser. Your job is to extract
structured data from session transcripts. Output ONLY CSV data, no
explanation, no markdown formatting, no commentary.

SCHEMA — ROLL LOG:
pc,roll_type,die,result,modifier,total,target,hit_flag,session,scene

FIELD DEFINITIONS:
- pc: Character name (e.g., Feliciano, Hillie)
- roll_type: attack, save, check, damage, initiative
- die: d20, d8, d10, d6, d4, d12, d100
- result: Natural die result (before modifiers)
- modifier: Bonus/penalty applied
- total: result + modifier
- target: AC or DC the roll was against (0 if unknown)
- hit_flag: 1 = success/hit, 0 = fail/miss, -1 = unknown
- session: Session number (e.g., S01)
- scene: Scene number within session (e.g., 1, 2, 3)

RULES:
- Extract EVERY roll mentioned in the transcript
- If a roll result is not explicitly stated, skip it
- If a modifier is not stated, use 0
- If a target AC/DC is not stated, use 0
- Natural 1 is always a miss (hit_flag = 0)
- Natural 20 is always a hit (hit_flag = 1)
- Damage rolls: hit_flag = -1 (not applicable)
- Do not invent or infer rolls that aren't in the text
- Output the CSV header row first, then data rows

USER: Here is the transcript for session [SESSION_NUMBER]:

[PASTE TRANSCRIPT HERE]
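Rather than pasting the transcript into the template by hand each session, the two placeholders can be filled with a couple of lines of Python. A sketch — the function name is hypothetical, but the placeholder strings match the template above exactly:

```python
def fill_template(template, session_number, transcript):
    """Substitute the two placeholders used by parse_rolls.txt."""
    return (template
            .replace("[SESSION_NUMBER]", session_number)
            .replace("[PASTE TRANSCRIPT HERE]", transcript))

# Illustrative usage with files (paths are examples, not required names):
# from pathlib import Path
# prompt = fill_template(Path(r"C:\BlindSpot\prompts\parse_rolls.txt").read_text(),
#                        "S01",
#                        Path(r"C:\BlindSpot\data\session01_transcript.txt").read_text())
# Path(r"C:\BlindSpot\data\session01_parse_input.txt").write_text(prompt)
```

The written file is then ready to pipe into ollama in Phase 5.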
11 Create the Intelligence Brief Prompt

Create a file at C:\BlindSpot\prompts\campaign_brief.txt with this content:

SYSTEM: You are a Campaign Intelligence Analyst for a D&D 5e campaign.
You read structured campaign metrics and produce a concise pre-session
briefing for the DM. Your tone is direct, analytical, and actionable —
like an audit findings report, not a creative writing piece.

OUTPUT FORMAT:
1. CAMPAIGN HEALTH SCORE (0-100, with 1-line justification)
2. TOP 3 PREP PRIORITIES (numbered, 2-3 sentences each)
3. FLAGS (bullet list of active concerns with data citations)
4. RECOMMENDATIONS (2-3 specific, actionable suggestions)

RULES:
- Cite specific numbers from the data (don't generalize)
- If spotlight split exceeds 60/40, flag it
- If any quest has been active 4+ sessions with no progress, flag it
- If any PC's d20 average is below 9.0 over 30+ rolls, flag it
- If any faction clock is at 75%+ capacity, flag it
- If resource usage (spell slots, abilities) is below 50%, flag it
- Keep the entire brief under 400 words
- Do not invent data points not provided in the input
- End with "Next session focus:" and a single sentence

USER: Here are the campaign metrics for pre-session [SESSION_NUMBER] prep:

CAMPAIGN: [CAMPAIGN NAME]
SESSIONS PLAYED: [N]
TOTAL ROLLS: [N]

PC STATS:
[paste PC performance summary]

QUEST STATUS:
[paste quest tracker summary]

FACTION STATE:
[paste faction heat/favor summary]

RESOURCE USAGE:
[paste resource burn rates]

DICE SUMMARY:
[paste dice averages and distribution notes]

SPOTLIGHT:
[paste actions-per-PC breakdown]
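The flag thresholds in the rules above are deterministic, so you can pre-compute them in Python and let the LLM focus on interpretation rather than arithmetic. A sketch under the assumption that your exported metrics look roughly like the dict below (the field names are illustrative, not a required format):

```python
def compute_flags(metrics):
    """Apply the brief prompt's flag thresholds to exported campaign metrics."""
    flags = []
    # Spotlight: flag if the most active PC exceeds 60% of total actions.
    actions = metrics["actions_per_pc"]            # e.g. {"Feliciano": 70, "Hillie": 30}
    top_share = max(actions.values()) / sum(actions.values())
    if top_share > 0.60:
        flags.append(f"spotlight split {top_share:.0%} exceeds 60/40")
    # Dice: flag any PC averaging below 9.0 on d20s over 30+ rolls.
    for pc, (avg, n) in metrics["d20_avg"].items():   # e.g. {"Hillie": (8.4, 42)}
        if n >= 30 and avg < 9.0:
            flags.append(f"{pc} d20 average {avg} over {n} rolls")
    # Factions: flag any clock at 75%+ capacity (fill is 0.0-1.0).
    for faction, fill in metrics["faction_clocks"].items():
        if fill >= 0.75:
            flags.append(f"{faction} clock at {fill:.0%}")
    return flags
```

Paste the returned flags into the FLAGS section of the input, and the model only has to phrase them, not derive them.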

Phase 5
Run Your First Parse
12 After Your First Session
  1. Copy the session transcript (or the key portions with rolls) into a text file
  2. Open PowerShell
  3. Run:
# Option A: Interactive (paste the transcript in)
ollama run llama3.1:8b

# Then paste your parse_rolls prompt with the transcript

# Option B: Pipe a file (if you saved the full prompt + transcript)
Get-Content C:\BlindSpot\data\session01_parse_input.txt | ollama run llama3.1:8b
  4. Copy the CSV output into C:\BlindSpot\data\session01_rolls.csv
  5. Import that CSV into your Excel engine's fact table
13 Generate Your First Brief

Once you have 2-3 sessions of data in the engine and dashboard metrics:

  1. Export the dashboard summary values as text
  2. Paste them into the campaign_brief prompt template
  3. Run:
Get-Content C:\BlindSpot\data\session03_brief_input.txt | ollama run llama3.1:8b > C:\BlindSpot\briefs\session04_brief.txt

(PowerShell doesn't support the < input-redirection operator, so pipe the file in with Get-Content.)
  4. Open the brief. Read it. Prep your session.

Phase 6
Convergent Analysis (Optional, Later)
14 Run the Same Brief Through Mistral
Get-Content C:\BlindSpot\data\session03_brief_input.txt | ollama run mistral:7b > C:\BlindSpot\briefs\session04_brief_mistral.txt
15 Compare

Open both briefs side by side. Where they agree = high confidence. Where they diverge = DM judgment call. You don't need automation for this yet — eyeball comparison is fine for the first few sessions.
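If you later want more than an eyeball comparison, Python's difflib can split the two briefs into agreeing and diverging lines. A crude sketch — it treats only exact line matches as agreement, which is a rough proxy for free text:

```python
import difflib

def compare_briefs(brief_a, brief_b):
    """Split two briefs' lines into (agree, diverge) using a line-level diff."""
    a = [line for line in brief_a.splitlines() if line.strip()]
    b = [line for line in brief_b.splitlines() if line.strip()]
    agree, diverge = [], []
    for tag, i1, i2, j1, j2 in difflib.SequenceMatcher(None, a, b).get_opcodes():
        if tag == "equal":
            agree.extend(a[i1:i2])
        else:
            diverge.extend(a[i1:i2] + b[j1:j2])
    return agree, diverge
```

Lines in `agree` are high confidence; lines in `diverge` are the DM judgment calls.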


Reference
Troubleshooting
"ollama not recognized"
Restart your terminal. If still broken, check that C:\Users\[you]\AppData\Local\Programs\Ollama is in your PATH.
Model runs slow (< 5 tokens/sec)
Check Task Manager → GPU. If VRAM isn't being used, Ollama may be falling back to CPU. Run ollama ps while a model is loaded to see where it's running. Restart the Ollama service from the system tray.
Out of memory errors
Close other GPU-intensive apps (games, browser with hardware acceleration). Your 3070 has 8GB — the 8B model needs ~5GB of free VRAM, which leaves only ~3GB for everything else.
Response seems wrong or hallucinated
For parsing: spot-check 10 rolls against the transcript. If accuracy is below 90%, your prompt needs tightening — add more examples to the system prompt.
For briefs: remember the LLM is interpreting, not computing. The numbers come from your Excel engine. If the numbers are right and the interpretation is off, adjust the brief prompt thresholds.
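The 10-roll spot-check can itself be a tiny script: hand-transcribe ten rolls as a reference CSV and measure what fraction the model recovered. A sketch, assuming both inputs use the parse_rolls schema (the function name and key fields are illustrative):

```python
import csv
import io

def spot_check_accuracy(parsed_csv, reference_csv,
                        keys=("pc", "die", "result", "total")):
    """Fraction of hand-checked reference rows found (on the key fields) in the parsed CSV."""
    parsed = [tuple(r[k] for k in keys)
              for r in csv.DictReader(io.StringIO(parsed_csv))]
    reference = [tuple(r[k] for k in keys)
                 for r in csv.DictReader(io.StringIO(reference_csv))]
    hits = sum(1 for row in reference if row in parsed)
    return hits / len(reference)
```

Anything below 0.9 on your reference sample is the signal to tighten the parsing prompt.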
Want to try a different model?
ollama pull phi3:medium — Phi-3 Medium 14B, partially offloads to CPU, slower but more capable
ollama pull gemma2:9b — Google's Gemma 2 9B, good at structured tasks
ollama pull llama3.2:3b — smaller, faster, less capable, good for simple parsing

Result
What You Have When This Is Done
Total setup time: ~30 minutes (mostly waiting for model downloads)