Initial commit: SIC harness (backend, web, pi-adapter, configs, docs)

- pnpm monorepo: apps/api (Fastify + SQLite + SSE), apps/web (React+Vite), packages/shared, packages/pi-adapter
- Local auth (admin/webhook-runner roles) + Keycloak JWT ready
- Multi-session chat with reliable history (user persisted before LLM, assistant persisted after stream)
- Markdown knowledge base with /api/docs/search + /api/docs/:id
- YAML webhook catalog with backend-only execution, retry/backoff, audit (webhook_runs), and per-user rate limit
- Skills config (sre-on-call, blameless-postmortem, security-incident) injected into LLM system prompt
- LLM provider failover chain (config/models.yml fallback + LLM_FALLBACK_CHAIN override)
- Context-aware webhooks panel + backend id-mention safety net
- Per-message stats (time/duration/tokens/model), Markdown+GFM render, code & table copy/download buttons
- Vitest suite, end-to-end smoke test (scripts/smoke.mjs), per-session system prompt override
- /metrics Prometheus endpoint + /api/metrics JSON, request-id correlation
- dotenv with explicit repo-root path; envString/envNumber helpers (handles empty-string env)
- Runbooks + SOPs under knowledge/ in English; README, docs, and INDEX.md in English
2026-06-29 16:20:53 +02:00
commit 62728b2200
89 changed files with 11992 additions and 0 deletions

docs/agents/api-agent.md

@@ -0,0 +1,17 @@
# API Agent
Owns the Fastify backend.
## Focus
- Design HTTP/SSE contracts first.
- Persist all critical state in SQLite.
- Validate ownership with `session_id + user_id`.
- Emit JSON logs.
- Keep `/healthz` and `/readyz` simple.
## Do not
- Do not keep sessions in memory.
- Do not expose real webhook URLs to clients.
- Do not execute webhooks without explicit confirmation.


@@ -0,0 +1,14 @@
# PI Adapter Agent
Owns the isolation layer around the `pi.dev` / LLM provider runtime.
## Focus
- Expose a stable contract to the backend.
- Support OpenAI-compatible providers.
- Return a structured response: `answer`, `recommended_actions`, `internal_docs`.
## Do not
- Do not mix backend HTTP rules with model logic.
- Do not let the model execute tools directly in Phase 1.
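
The structured response named under Focus could be typed and validated along these lines. This is only a sketch: the three top-level field names come from this document, while the inner fields (`webhookId`, `reason`, `id`, `title`) and the function `parseAdapterResponse` are hypothetical names chosen for illustration.

```typescript
// Hypothetical item shapes; only the three top-level keys are taken from the doc.
interface RecommendedAction { webhookId: string; reason: string; }
interface InternalDocRef { id: string; title: string; }

interface AdapterResponse {
  answer: string;
  recommended_actions: RecommendedAction[];
  internal_docs: InternalDocRef[];
}

function isRecord(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

function isAction(v: unknown): v is RecommendedAction {
  return isRecord(v) && typeof v.webhookId === "string" && typeof v.reason === "string";
}

function isDocRef(v: unknown): v is InternalDocRef {
  return isRecord(v) && typeof v.id === "string" && typeof v.title === "string";
}

// Turns raw model output into the stable contract. Malformed items are
// dropped, and non-JSON output is kept as a plain answer with no actions.
function parseAdapterResponse(raw: string): AdapterResponse {
  let data: unknown = null;
  try {
    data = JSON.parse(raw);
  } catch {
    data = null;
  }
  const obj = isRecord(data) ? data : {};
  const actions: unknown[] = Array.isArray(obj.recommended_actions) ? obj.recommended_actions : [];
  const docs: unknown[] = Array.isArray(obj.internal_docs) ? obj.internal_docs : [];
  return {
    answer: typeof obj.answer === "string" ? obj.answer : raw,
    recommended_actions: actions.filter(isAction),
    internal_docs: docs.filter(isDocRef),
  };
}
```

Because the function returns data only, the backend stays the sole place where a recommended action can turn into an execution.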


@@ -0,0 +1,11 @@
# Security & Reliability Agent
Owns the review of isolation, audit, and execution rules.
## Checklist
- Every message query filters by `session_id` AND `user_id`.
- Roles are validated for every webhook before it is shown and again before it is executed.
- Every execution is recorded in `webhook_runs`.
- The frontend never receives real webhook URLs.
- No critical state lives only in memory.
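
Two of the checklist items (role validation at both steps, and never sending real URLs to the frontend) can be sketched as below. The `WebhookEntry` shape, its field names, and the three functions are assumptions made for this example; they are not the project's actual YAML schema or API.

```typescript
// Hypothetical catalog entry; field names are illustrative assumptions.
interface WebhookEntry {
  id: string;
  title: string;
  url: string; // real target, must stay on the backend
  allowedRoles: string[];
}

// What the frontend is allowed to see: everything except the URL.
interface PublicWebhook {
  id: string;
  title: string;
  allowedRoles: string[];
}

function canUse(entry: WebhookEntry, userRoles: string[]): boolean {
  return entry.allowedRoles.some((role) => userRoles.includes(role));
}

// First check: only permitted webhooks are listed, and the URL is left out.
function visibleWebhooks(catalog: WebhookEntry[], userRoles: string[]): PublicWebhook[] {
  return catalog
    .filter((entry) => canUse(entry, userRoles))
    .map((entry) => ({ id: entry.id, title: entry.title, allowedRoles: entry.allowedRoles }));
}

// Second check: roles are validated again at execution time, because the
// id sent by the client cannot be trusted to come from the filtered list.
function resolveForExecution(catalog: WebhookEntry[], id: string, userRoles: string[]): WebhookEntry {
  const entry = catalog.find((e) => e.id === id);
  if (!entry || !canUse(entry, userRoles)) {
    throw new Error("webhook not found or not permitted");
  }
  return entry;
}
```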

docs/agents/web-agent.md

@@ -0,0 +1,15 @@
# Web Agent
Owns the React + Vite UI.
## Focus
- Three-column layout: sessions, chat, right panel.
- Consume SSE from `/api/chat/stream`.
- Show recommended actions without auto-executing them.
- Rebuild state from the API; local memory is never the source of truth.
## Do not
- Do not call webhooks directly from the browser.
- Do not store tokens or secrets in the frontend.
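
If the stream from `/api/chat/stream` is read manually (for example with `fetch` rather than the browser's `EventSource`), the UI needs a small incremental parser for the SSE wire format. The sketch below handles `event:` and `data:` fields, comment lines, and frames split across chunks. The class name `SseParser` and the event names used in the usage note (`token`, `done`) are assumptions; this document does not define the event vocabulary.

```typescript
interface SseEvent {
  event: string; // defaults to "message" when the frame has no event field
  data: string;
}

class SseParser {
  private buffer = "";

  // Feed one decoded text chunk; returns every event completed by it.
  push(chunk: string): SseEvent[] {
    // Normalize CRLF so that a blank line is always "\n\n".
    this.buffer = (this.buffer + chunk).replace(/\r\n/g, "\n");
    const events: SseEvent[] = [];
    let end = this.buffer.indexOf("\n\n");
    while (end !== -1) {
      const frame = this.buffer.slice(0, end);
      this.buffer = this.buffer.slice(end + 2);
      let event = "message";
      const data: string[] = [];
      for (const line of frame.split("\n")) {
        if (line.startsWith(":")) continue; // comment or keep-alive
        const colon = line.indexOf(":");
        const field = colon === -1 ? line : line.slice(0, colon);
        let value = colon === -1 ? "" : line.slice(colon + 1);
        if (value.startsWith(" ")) value = value.slice(1); // one optional space
        if (field === "event") event = value;
        else if (field === "data") data.push(value);
      }
      // Frames without any data field are not dispatched.
      if (data.length > 0) events.push({ event, data: data.join("\n") });
      end = this.buffer.indexOf("\n\n");
    }
    return events;
  }
}
```

A caller would append the `data` of each `token` event to the message being rendered and, on `done`, reload the session from the API so that the persisted history remains the source of truth.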


@@ -0,0 +1,2 @@
=== PROMPT ===
=== RESPUESTA ===


@@ -0,0 +1,11 @@
# Short definition
`SIC — Super Incident Commander` is a multi-session web interface for consulting a centralized `pi.dev` engine, with persistent history, simple search over internal documentation, and webhook recommendations that are only executed from the backend after explicit user confirmation.
## Target user
Small team, up to 5 concurrent users.
## Successful MVP
A user opens the UI, creates or resumes a session, asks a question, receives a streamed response, sees related documentation, gets recommended actions, and can execute a confirmed webhook. Everything is persisted and auditable.

docs/reliable-history.md

@@ -0,0 +1,57 @@
# Reliable History
## Goal
Guarantee that the chat history is reconstructible, isolated by user, and consistent even if the backend restarts.
## Mandatory rules
1. Persist the user message before calling the LLM.
2. Persist the assistant response when the stream finishes.
3. If the LLM fails, record the failure in metadata or as a controlled error message.
4. Do not keep critical conversational state in memory.
5. All session and message queries must filter by `session_id` AND `user_id`.
6. Webhooks must be audited even when they fail.
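
Rules 1 to 3 describe an ordering that can be sketched as a single turn handler. Everything named here is hypothetical: `MessageStore` stands in for the SQLite-backed persistence layer, `llm` for the provider call, and the `"error"` role is just one possible way to record a controlled failure.

```typescript
type Role = "user" | "assistant" | "error";

interface StoredMessage {
  sessionId: string;
  userId: string;
  role: Role;
  content: string;
  metadata?: string; // JSON text, mirroring the metadata column
}

// Stand-in for the durable store; a real one would write to SQLite.
interface MessageStore {
  insert(msg: StoredMessage): void;
}

async function handleTurn(
  store: MessageStore,
  sessionId: string,
  userId: string,
  text: string,
  llm: (prompt: string) => AsyncIterable<string>,
  onToken: (token: string) => void,
): Promise<void> {
  // Rule 1: the user message is durable before the model is called.
  store.insert({ sessionId, userId, role: "user", content: text });

  let answer = "";
  try {
    for await (const token of llm(text)) {
      answer += token;
      onToken(token); // forwarded to the client while streaming
    }
    // Rule 2: the assistant message is persisted once the stream has finished.
    store.insert({ sessionId, userId, role: "assistant", content: answer });
  } catch (err) {
    // Rule 3: the failure is recorded as a controlled error message,
    // with the cause and any partial output kept in metadata.
    store.insert({
      sessionId,
      userId,
      role: "error",
      content: "The model call failed.",
      metadata: JSON.stringify({ error: String(err), partial: answer }),
    });
  }
}
```

With this ordering, a restart in the middle of a stream loses at most the unfinished assistant reply; the user's question is already stored.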
## Base tables
```sql
CREATE TABLE IF NOT EXISTS chat_sessions (
id TEXT PRIMARY KEY,
user_id TEXT NOT NULL,
title TEXT,
created_at TEXT NOT NULL,
updated_at TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS chat_messages (
id TEXT PRIMARY KEY,
session_id TEXT NOT NULL,
user_id TEXT NOT NULL,
role TEXT NOT NULL,
content TEXT NOT NULL,
metadata TEXT,
created_at TEXT NOT NULL,
FOREIGN KEY (session_id) REFERENCES chat_sessions(id)
);
CREATE TABLE IF NOT EXISTS webhook_runs (
id TEXT PRIMARY KEY,
webhook_id TEXT NOT NULL,
user_id TEXT NOT NULL,
session_id TEXT NOT NULL,
status TEXT NOT NULL,
request_payload TEXT,
response_status INTEGER,
created_at TEXT NOT NULL
);
```
## Security invariant
```sql
WHERE session_id = ?
AND user_id = ?
```
Any session or message query that omits this filter is a design error.
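
One way to make the invariant hard to forget is to build message queries through a single helper that always binds both values. This is a sketch only: `ownedMessagesQuery` and the `Query` shape are hypothetical, and the column names are taken from the `chat_messages` table above.

```typescript
interface Query {
  sql: string;
  params: string[];
}

// Builds the history query for one session. Both identifiers are mandatory,
// so a caller cannot read a session without also proving ownership.
function ownedMessagesQuery(sessionId: string, userId: string): Query {
  if (!sessionId || !userId) {
    throw new Error("session_id and user_id are both required");
  }
  return {
    sql:
      "SELECT id, role, content, metadata, created_at " +
      "FROM chat_messages " +
      "WHERE session_id = ? AND user_id = ? " +
      "ORDER BY created_at ASC",
    params: [sessionId, userId],
  };
}
```

The returned `sql` and `params` can be handed to whichever SQLite driver the backend uses; values are bound as parameters rather than concatenated into the statement.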