Circuit breaker

parent 70df689701
commit b7126b0d07

3 changed files with 234 additions and 3 deletions
@@ -408,6 +408,68 @@ Then run the dev server with `npm run dev -- --host` to bind to all interfaces.

- Verify `OIDC_ISSUER` and `OIDC_JWKS_URI` are correct
- The backend must be able to reach Authentik at `http://authentik-server:9000` (Docker network)

## Required OIDC Claims

The Kaboot backend validates the following JWT claims:

| Claim | Required | Description |
|-------|----------|-------------|
| `sub` | Yes | User identifier (subject) |
| `iss` | Yes | Must match the `OIDC_ISSUER` env var |
| `aud` | Recommended | Must match the `OIDC_AUDIENCE` env var if set |
| `preferred_username` | No | Display name; falls back to `sub` |
| `email` | No | User's email address |
| `groups` | No | Array of group names for access control |
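As an illustration only (the real validation lives in `server/src/middleware/auth.ts`), the optional-claim rules in the table can be sketched as a small mapping function over an already-verified payload; signature, `iss`, and `aud` checks are assumed to happen elsewhere:

```typescript
// Illustrative sketch: apply the claim rules from the table above to a
// decoded, already-verified JWT payload. Names here are assumptions, not
// the project's actual types.
interface KabootClaims {
  sub: string;
  username: string;
  email?: string;
  groups: string[];
}

function mapClaims(payload: Record<string, unknown>): KabootClaims {
  const sub = payload.sub;
  if (typeof sub !== 'string' || sub.length === 0) {
    throw new Error('Missing required `sub` claim');
  }
  return {
    sub,
    // `preferred_username` is optional; fall back to `sub`.
    username: typeof payload.preferred_username === 'string' ? payload.preferred_username : sub,
    email: typeof payload.email === 'string' ? payload.email : undefined,
    // `groups` is optional; default to no memberships.
    groups: Array.isArray(payload.groups)
      ? payload.groups.filter((g): g is string => typeof g === 'string')
      : [],
  };
}
```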

### Audience Claim Configuration

The `aud` (audience) claim prevents tokens issued for other applications from being accepted. The blueprint configures this automatically via the `kaboot` scope.

**Backend Configuration:**

```bash
# Set to your OIDC client ID
OIDC_AUDIENCE=kaboot-spa
```

**Frontend Configuration:**

The frontend must request the `kaboot` scope so the token includes the audience claim:

```typescript
// src/config/oidc.ts
scope: 'openid profile email offline_access groups kaboot'
```

**Manual Authentik Setup (if not using blueprints):**

1. Go to **Customisation** > **Property Mappings**
2. Click **Create** > **Scope Mapping**
3. Configure:

   | Field | Value |
   |-------|-------|
   | Name | `Kaboot Audience Scope` |
   | Scope name | `kaboot` |
   | Expression | `return {"aud": "kaboot-spa"}` |

4. Go to **Applications** > **Providers** > **Kaboot OAuth2**
5. Edit and add the new scope mapping to **Scopes**

### Groups Claim for Access Control

The `groups` claim controls access to premium features:

| Group | Access Level |
|-------|--------------|
| `kaboot-users` | Basic access (required to use the app) |
| `kaboot-ai-access` | Unlimited AI generations, OCR access |
| `kaboot-admin` | Admin features |
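For reference, gating a route on one of these groups could look like the following sketch. This is a hypothetical Express-style helper, not the project's actual middleware; it assumes `req.user.groups` was populated from the verified token:

```typescript
// Hypothetical middleware factory: allow the request only if the
// authenticated user's `groups` claim contains the required group.
interface AuthedRequest { user?: { groups: string[] } }
interface Res { status(code: number): { json(body: object): void } }

function requireGroup(group: string) {
  return (req: AuthedRequest, res: Res, next: () => void): void => {
    if (req.user?.groups.includes(group)) {
      next();
    } else {
      res.status(403).json({ error: 'Insufficient group membership' });
    }
  };
}
```

Usage might be `router.post('/api/generate', requireGroup('kaboot-ai-access'), handler)`; the route name is illustrative.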

The groups scope mapping is configured in the blueprint. For manual setup:

1. Create a scope mapping with the expression:

   ```python
   return {"groups": [group.name for group in request.user.ak_groups.all()]}
   ```

2. Add it to the provider's scopes.

## Production Notes

For production deployment, see [PRODUCTION.md](./PRODUCTION.md) for full instructions.

docs/SECURITY_PLAN.md (new file, 114 lines)

@@ -0,0 +1,114 @@
# AUDIT

## High / Critical Findings

- **Client-exposed Gemini system key.** `vite.config.ts` injects `process.env.GEMINI_API_KEY` into the frontend bundle, and `services/geminiService.ts` falls back to `process.env.API_KEY`; if a server/system key is set, it becomes visible to all clients and bypasses backend access controls. Recommend removing env injection for secrets and routing all system AI through `POST /api/generate`. Evidence: `vite.config.ts`, `services/geminiService.ts:6-12`.
- **Untrusted document parsing runs native tooling.** `server/src/services/documentParser.ts` runs LibreOffice plus multiple parsers on attacker-supplied files; even without command injection, crafted documents are a common RCE vector. Treat as high risk unless sandboxed: run conversions in a container/worker with seccomp/AppArmor, low privileges, CPU/memory limits, and timeouts. Evidence: `server/src/services/documentParser.ts:223-226`.
- **Memory-heavy file uploads without per-endpoint rate limits.** `multer.memoryStorage()` accepts up to 50 MB and parsing happens synchronously; repeated requests can exhaust RAM/CPU. Recommend per-route rate limits plus streaming to disk or queueing with concurrency caps. Evidence: `server/src/routes/upload.ts:11-18`.
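The per-route limit recommended above is usually a one-liner with a library such as `express-rate-limit`; the core mechanism is a fixed window counter per client key, which can be sketched in a self-contained form (illustrative only, not the project's code):

```typescript
// Minimal fixed-window rate limiter sketch. In production a maintained
// library (e.g. express-rate-limit) is the usual choice; this just shows
// the counting logic behind the recommendation.
class FixedWindowLimiter {
  private counts = new Map<string, { windowStart: number; count: number }>();

  constructor(private windowMs: number, private max: number) {}

  // Returns true if the request identified by `key` (e.g. client IP) is
  // allowed in the current window; `now` is injectable for testing.
  allow(key: string, now: number = Date.now()): boolean {
    const entry = this.counts.get(key);
    if (!entry || now - entry.windowStart >= this.windowMs) {
      this.counts.set(key, { windowStart: now, count: 1 });
      return true;
    }
    entry.count++;
    return entry.count <= this.max;
  }
}
```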

## Medium Findings

- **JWT audience not validated.** `requireAuth` verifies issuer/alg but not `aud`; a token minted for another client with the same issuer could be accepted. Recommend adding audience validation (client_id) and optionally `azp`. Evidence: `server/src/middleware/auth.ts:58-64`.
- **Tokens and API keys stored in localStorage.** The OIDC user store and API keys are persisted client-side, which is vulnerable to XSS. Mitigation: prefer in-memory storage or httpOnly cookies; avoid persisting API keys. Evidence: `src/config/oidc.ts:17`, `hooks/useUserPreferences.ts:17-48`, `hooks/useGame.ts:62-74`.
- **Public game lookup is enumerable if PINs are short.** `GET /api/games/:pin` is unauthenticated (rate-limited) and returns game metadata; if the PIN length is small, enumeration is plausible. Consider longer PINs, additional throttling, or auth for non-host data. Evidence: `server/src/routes/games.ts:85-107`.
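On the PIN-length point: an 8-digit PIN gives 10^8 possibilities, so even a generous rate limit makes an enumeration sweep impractical. A cryptographically sourced generator might look like this (an illustrative helper, not the project's current generator):

```typescript
import { randomInt } from 'node:crypto';

// Illustrative: build an n-digit PIN from a CSPRNG rather than Math.random,
// so PINs are neither predictable nor biased.
function generatePin(digits: number = 8): string {
  let pin = '';
  for (let i = 0; i < digits; i++) {
    pin += randomInt(0, 10).toString();
  }
  return pin;
}
```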

## Low Findings / Hardening

- **Auth error details echoed to client.** `requireAuth` returns `details: err.message`, which can leak verifier internals. Recommend generic responses and server-only logs. Evidence: `server/src/middleware/auth.ts:66-69`.
- **Logging includes share tokens and PINs.** When `LOG_REQUESTS=true`, paths like `/api/shared/:token` are logged. Consider redacting or omitting sensitive paths. Evidence: `server/src/index.ts:52-58`, `server/src/routes/shared.ts`.
- **CORS uses `credentials: true` with a configurable origin.** If cookies are ever used in the future, misconfiguration could enable CSRF. Recommend disabling credentials unless needed and enforcing HTTPS origins in production. Evidence: `server/src/index.ts:42-46`.

## Positive Controls Observed

- Parameterized SQL across routes (no obvious SQLi). `server/src/routes/*.ts`.
- JWT verification uses RS256 and JWKS with caching. `server/src/middleware/auth.ts`.
- Stripe webhook signature verification present. `server/src/routes/payments.ts:337`.
- Helmet with CSP enabled on backend responses. `server/src/index.ts:17-28`.
- No `dangerouslySetInnerHTML` found in the frontend.

## Key Recommendation Order

1. Remove client-side exposure of `GEMINI_API_KEY` and force system AI through the backend. (`vite.config.ts`, `services/geminiService.ts`)
2. Sandbox document parsing / LibreOffice conversions. (`server/src/services/documentParser.ts`)
3. Add audience validation for JWTs. (`server/src/middleware/auth.ts`)
4. Add upload-specific throttling and reduce memory pressure. (`server/src/routes/upload.ts`)
5. Reduce client-side secret persistence (move to in-memory / httpOnly). (`src/config/oidc.ts`, `hooks/useUserPreferences.ts`)

# Security Plan (Kaboot)

This plan turns the audit findings into a tracked, testable remediation roadmap.

## Goals

- [x] Eliminate client-side exposure of system AI secrets and enforce server-side access control only.
- [x] Reduce risk of RCE/DoS in document parsing and uploads. *(sandboxing, rate limits, circuit breaker all done)*
- [ ] Harden token validation and client storage practices. *(JWT `aud` validation done; client storage pending)*
- [ ] Improve auditability, logging hygiene, and safe rollout.

## Scope

- [ ] Backend entrypoints and middleware: `server/src/index.ts`, `server/src/routes/*.ts`, `server/src/middleware/auth.ts`.
- [ ] Document processing: `server/src/services/documentParser.ts`, `server/src/routes/upload.ts`.
- [ ] Frontend auth/storage and AI clients: `src/config/oidc.ts`, `hooks/useUserPreferences.ts`, `hooks/useAuthenticatedFetch.ts`, `services/geminiService.ts`.
- [ ] Build/config surfaces: `vite.config.ts`, `.env.example`, `docker-compose*.yml`, `docs/*`.

## Assumptions

- [ ] Backend is the only trusted system for system AI generation.
- [ ] Auth provider is Authentik with OIDC JWTs.
- [ ] Uploads may be attacker-controlled and should be treated as hostile.

## Phase 0 - Baseline and Inventory

- [ ] Inventory all secrets and confirm none are present in the repo or frontend bundles.
- [ ] Verify the current auth boundary: which routes require `requireAuth` vs. which are public.
- [ ] Record current rate limits and error handling for uploads, generation, games, and shared endpoints.
- [ ] Add a basic secret scan in CI (gitleaks/trufflehog) and document false-positive exclusions.

## Phase 1 - Critical Remediations

### 1) Remove Client Exposure of System AI Key

- [x] Remove `process.env.GEMINI_API_KEY` exposure from `vite.config.ts`.
- [x] Remove the `process.env.API_KEY` fallback in `services/geminiService.ts`.
- [x] Ensure all system AI usage flows through `POST /api/generate` only.
- [ ] Validate that built frontend bundles do not contain Gemini keys.
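That last check can be approximated by scanning the built assets for key-shaped strings. The `dist/` path and the `AIza` prefix (the conventional start of Google API keys) are assumptions here, not confirmed project details:

```shell
# Sketch: succeed only if nothing key-shaped appears under the given directory.
scan_bundle() {
  ! grep -rqE 'AIza[0-9A-Za-z_-]{35}' "$1"
}

# Usage after `npm run build`:
#   scan_bundle dist/ || { echo 'potential Gemini key in bundle' >&2; exit 1; }
```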

### 2) Sandbox Document Processing

- [x] Move LibreOffice conversion to a sandboxed worker/container (no network, low privileges).
- [x] Enforce per-job CPU/memory limits and execution timeouts.
- [x] Store temp files in a private directory (mode 0700) and clean up consistently.
- [x] Add a circuit breaker for repeated failures or timeouts.

### 3) Upload Abuse Protection

- [x] Add route-specific rate limiting for `POST /api/upload`.
- [x] Cap concurrent uploads per user and per IP.
- [ ] Consider streaming to disk or queueing (avoid large in-memory buffers when possible).

### 4) JWT Audience Validation

- [x] Validate `aud` (and `azp` if present) in `server/src/middleware/auth.ts`.
- [x] Document required OIDC claims in `docs/AUTHENTIK_SETUP.md`.
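The `aud` check itself is small; per RFC 7519 the claim may be a single string or an array, and both cases must be handled. A standalone sketch (illustrative; the actual check is in `server/src/middleware/auth.ts`):

```typescript
// Illustrative: accept a token only if its `aud` claim covers our client ID.
// RFC 7519 allows `aud` to be a string or an array of strings.
function audienceMatches(aud: string | string[] | undefined, expected: string): boolean {
  if (aud === undefined) return false;
  return Array.isArray(aud) ? aud.includes(expected) : aud === expected;
}
```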

## Phase 2 - Hardening

### 1) Games Endpoint Risk Reduction

- [x] Increase PIN entropy or move to a join token.
- [ ] Require host proof (secret or auth) for state-mutation endpoints.
- [ ] Add per-endpoint throttling for public lookups.

### 2) Logging and Error Hygiene

- [ ] Redact sensitive paths/tokens in request logs (`server/src/index.ts`).
- [x] Return generic auth errors to clients; keep details server-only (`server/src/middleware/auth.ts`).

### 3) CORS and CSRF Posture

- [ ] Review the use of `credentials: true` and disable it if not required.
- [ ] Enforce HTTPS origins in production configs.

### 4) Privacy and Disclosure

- [ ] Add a user-facing notice that documents may be sent to external AI providers.
- [ ] Document data handling and retention for uploads and AI generation.

## Phase 3 - Verification

- [ ] Add tests for JWT `aud` mismatches and invalid tokens.
- [ ] Add upload tests for size limits, malformed files, and parser errors.
- [ ] Add regression tests for shared and game endpoints (rate limiting and access).
- [ ] Run dependency vulnerability checks in CI (`npm audit` or Snyk).
- [ ] Verify no secrets appear in logs or built assets.

## Phase 4 - Rollout

- [ ] Deploy to staging and validate auth, upload, generation, and payments end-to-end.
- [ ] Rotate exposed keys once the client bundle no longer includes secrets.
- [ ] Use feature flags for upload pipeline changes.
- [ ] Monitor latency, 429s, error rates, and upload failures for 48h post-deploy.

## Deliverables

- [ ] Code changes with references and rationale per file.
- [ ] Updated docs for system AI, uploads, and privacy.
- [ ] Test report and monitoring checklist.
- [ ] Post-remediation security summary with residual risks.

server/src/services/documentParser.ts

@@ -11,6 +11,40 @@ const PROCESSING_TIMEOUT_MS = 30000;
const SANDBOX_URL = process.env.SANDBOX_URL || 'http://localhost:3002';
const USE_SANDBOX = process.env.USE_SANDBOX !== 'false';

const CIRCUIT_BREAKER_THRESHOLD = 5;
const CIRCUIT_BREAKER_RESET_MS = 60000;

class CircuitBreaker {
  private failures = 0;
  private lastFailure = 0;
  private open = false;

  recordSuccess(): void {
    this.failures = 0;
    this.open = false;
  }

  recordFailure(): void {
    this.failures++;
    this.lastFailure = Date.now();
    if (this.failures >= CIRCUIT_BREAKER_THRESHOLD) {
      this.open = true;
    }
  }

  isOpen(): boolean {
    if (!this.open) return false;
    if (Date.now() - this.lastFailure > CIRCUIT_BREAKER_RESET_MS) {
      this.open = false;
      this.failures = 0;
      return false;
    }
    return true;
  }
}

const sandboxCircuit = new CircuitBreaker();

export const GEMINI_NATIVE_TYPES = [
  'application/pdf',
  'text/plain',

@@ -210,6 +244,10 @@ const LEGACY_TO_MODERN: Record<string, string> = {
};

async function convertViaSandbox(buffer: Buffer, extension: string): Promise<Buffer> {
  if (sandboxCircuit.isOpen()) {
    throw new Error('CIRCUIT_OPEN');
  }

  const controller = new AbortController();
  const timeout = setTimeout(() => controller.abort(), PROCESSING_TIMEOUT_MS);

@@ -223,14 +261,20 @@ async function convertViaSandbox(buffer: Buffer, extension: string): Promise<Buf

    if (!response.ok) {
      const error = await response.json().catch(() => ({ error: 'Conversion failed' }));
      sandboxCircuit.recordFailure();
      throw new Error(error.error || 'Document conversion failed');
    }

    sandboxCircuit.recordSuccess();
    return Buffer.from(await response.arrayBuffer());
  } catch (err) {
    if (err instanceof Error && err.name === 'AbortError') {
      sandboxCircuit.recordFailure();
      throw new Error('Document conversion timed out. Try a smaller file.');
    }
    if (err instanceof Error && err.message !== 'CIRCUIT_OPEN') {
      sandboxCircuit.recordFailure();
    }
    throw err;
  } finally {
    clearTimeout(timeout);

@@ -282,9 +326,20 @@ async function convertLocally(buffer: Buffer, extension: string): Promise<Buffer
 }

 async function extractWithLibreOffice(buffer: Buffer, extension: string, useOcr: boolean = false): Promise<string> {
-  const convertedBuffer = USE_SANDBOX
-    ? await convertViaSandbox(buffer, extension)
-    : await convertLocally(buffer, extension);
+  let convertedBuffer: Buffer;
+
+  if (USE_SANDBOX) {
+    try {
+      convertedBuffer = await convertViaSandbox(buffer, extension);
+    } catch (err) {
+      if (err instanceof Error && err.message === 'CIRCUIT_OPEN') {
+        throw new Error('Document conversion service temporarily unavailable. Please try again in a minute.');
+      }
+      throw err;
+    }
+  } else {
+    convertedBuffer = await convertLocally(buffer, extension);
+  }

   const config = useOcr ? { extractAttachments: true, ocr: true, ocrLanguage: 'eng' } : {};
   const ast = await officeParser.parseOffice(convertedBuffer, config);
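The breaker's open/reset behavior can be exercised in isolation. The sketch below mirrors the `CircuitBreaker` class from this commit but inlines the constants (threshold 5, reset 60 s) and injects the clock so the reset window is testable without waiting:

```typescript
// Standalone mirror of the commit's CircuitBreaker, with `now` passed in
// explicitly for testability. After 5 consecutive failures the breaker
// opens; once 60s pass since the last failure it closes again, which is
// what produces the CIRCUIT_OPEN error in convertViaSandbox.
class TestableCircuitBreaker {
  private failures = 0;
  private lastFailure = 0;
  private open = false;

  recordSuccess(): void {
    this.failures = 0;
    this.open = false;
  }

  recordFailure(now: number): void {
    this.failures++;
    this.lastFailure = now;
    if (this.failures >= 5) this.open = true;
  }

  isOpen(now: number): boolean {
    if (!this.open) return false;
    if (now - this.lastFailure > 60000) {
      // Reset window elapsed: close the breaker and allow a retry.
      this.open = false;
      this.failures = 0;
      return false;
    }
    return true;
  }
}
```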