F-003 fix: Sanitize SQL dump for safe dev use

rikrdo
2026-05-25 08:14:34 +02:00
parent 3d41579ad3
commit e6feea5ee6
24 changed files with 483 additions and 1187942 deletions

@@ -34,3 +34,21 @@
- `spec/bdd/features/config/legacy-config.feature`
- `work/artifacts/F-002/architect.md`
- `work/artifacts/F-002/implementer.md`
## F-003 — Sanitize SQL dump for safe dev use
### Acceptance criteria
- Repo no longer stores the raw production-like SQL dump as the active development baseline.
- Tracked SQL baseline contains only safe synthetic or non-sensitive data for local module work.
- Safe local data handling is documented.
- Local development remains possible through the sanitized baseline and docs.
- `./scripts/verify.sh` stays green after the change.
### Evidence targets
- `project/sql/db-25052026.sql`
- `project/sql/README.md`
- `spec/sdd/components/development-data-baseline.md`
- `spec/sdd/decisions/003-replace-raw-sql-with-sanitized-dev-baseline.md`
- `spec/bdd/features/data/sanitized-sql-baseline.feature`
- `work/artifacts/F-003/architect.md`
- `work/artifacts/F-003/implementer.md`

@@ -0,0 +1,18 @@
@F-003 @smoke @security @regression
Feature: Safe SQL baseline exists for legacy module development
  As a maintainer
  I want a tracked SQL baseline without sensitive live data
  So I can develop locally without keeping a raw production snapshot in git

  Scenario: Tracked SQL baseline is sanitized
    Given the repo contains one tracked SQL baseline for the legacy module
    When feature F-003 is applied
    Then the tracked SQL baseline does not contain customer or live order snapshot data
    And the baseline contains only safe schema and synthetic seed data needed for local module work

  Scenario: Local private data handling is documented
    Given a maintainer may still need a private raw dump outside git
    When feature F-003 is applied
    Then the repo documents where private local data should live
    And the tracked SQL baseline remains safe for commit and push
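
The "does not contain customer or live order snapshot data" step could be backed by an automated scan. The following is a minimal sketch, not the check used by this commit. It assumes that synthetic seed rows use only reserved example domains for e-mail addresses; the function name `find_suspect_emails` and the table name in the usage note are hypothetical.

```python
import re

# Assumption: synthetic seed data uses only reserved example domains (RFC 2606).
SAFE_DOMAINS = {"example.com", "example.org", "example.net"}

# Deliberately simple pattern: local part, "@", then a dotted domain.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@([A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+)")


def find_suspect_emails(sql_text: str) -> list[str]:
    """Return e-mail addresses whose domain is not a reserved example domain."""
    suspects = []
    for match in EMAIL_RE.finditer(sql_text):
        domain = match.group(1).lower()
        if domain not in SAFE_DOMAINS:
            suspects.append(match.group(0))
    return suspects
```

For example, `find_suspect_emails("INSERT INTO oc_customer VALUES (1,'a@example.com'),(2,'john@shop.ro');")` flags only `john@shop.ro`. An e-mail scan covers one kind of sensitive value; names, phone numbers, and order data would need their own checks.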

@@ -47,3 +47,26 @@ Keep page behavior the same while removing hard-coded secrets from tracked PHP f
- auth redesign
- worker refactor beyond config use
- deploy automation
## F-003 — Sanitize SQL dump for safe dev use
### Problem
The SQL dump currently tracked in the repo looks like a production snapshot.
It contains sensitive and production-like data.
This is unsafe as a tracked development baseline.
### Objective
Replace the raw dump in the working tree with a safe development baseline.
Keep local development possible for the legacy PHP module.
Document how to handle private data outside git.
### Scope
- In scope:
  - define safe SQL baseline strategy
  - replace current tracked dump with sanitized development dump
  - document private local dump handling
  - keep module development possible with synthetic seed data
- Out of scope:
  - production database changes
  - app logic changes
  - full OpenCart dataset preservation

@@ -7,7 +7,7 @@ The module also runs one batch worker that updates OpenCart product descriptions
Current raw source path was `project/new`.
Target stable path is `project/web/index/new`.
SQL dump target path is `project/sql/db-25052026.sql`.
SQL baseline path is `project/sql/db-25052026.sql` and now contains sanitized synthetic development data.
## Main flows
1. User opens product form.

@@ -1,24 +1,25 @@
# Component: Development data baseline
## Responsibility
Provide one local SQL dump so maintainers can inspect schema and seed dev database.
Provide one safe local SQL baseline so maintainers can seed a development database for the legacy PHP module.
## Interfaces
- Input:
- SQL import command run by maintainer
- Output:
- local MariaDB database with OpenCart and custom tables
- local MariaDB database with the schema and synthetic seed data needed by the module
## Dependencies
- `project/sql/db-25052026.sql`
- `project/sql/README.md`
- local MariaDB/MySQL server
## Limits
- Dump may contain production-like data.
- Dump is large.
- Dump is not safe for public sharing without review.
- Baseline is intentionally smaller than the former raw snapshot.
- Baseline covers current module needs, not the full production dataset.
- Private raw snapshots must stay outside git.
## Success criteria
- [ ] Dump path is stable and explicit
- [ ] Design docs call it dev baseline only
- [ ] Move does not alter dump content
- [ ] Tracked dump contains only safe synthetic or non-sensitive data
- [ ] Docs explain private local dump handling
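
The component's input is an SQL import command run by the maintainer. As an illustration only, the sketch below builds such a command for the `mysql` command-line client, which reads SQL statements from standard input. The database name `legacy_dev` and the helper `build_import_command` are hypothetical, and credentials are assumed to come from the maintainer's own client configuration.

```python
import shlex

# Path taken from the component's dependency list.
BASELINE_PATH = "project/sql/db-25052026.sql"


def build_import_command(db_name: str,
                         sql_path: str = BASELINE_PATH,
                         client: str = "mysql") -> str:
    """Build a shell command that feeds the baseline file to the SQL client.

    Every part is shell-quoted so unusual names cannot break the command.
    """
    return (
        f"{shlex.quote(client)} {shlex.quote(db_name)} "
        f"< {shlex.quote(sql_path)}"
    )


if __name__ == "__main__":
    # Hypothetical database name; prints the command instead of running it.
    print(build_import_command("legacy_dev"))
```

Recent MariaDB installs also ship the client as `mariadb`, so `client="mariadb"` may be the better choice there.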

@@ -0,0 +1,31 @@
# ADR-003: Replace raw SQL snapshot with sanitized dev baseline
## Status
Accepted
## Context
The tracked SQL file under `project/sql/db-25052026.sql` looked like a production snapshot.
It exposed production-like and sensitive data in the working tree.
The legacy PHP module still needs a database baseline for local work.
## Decision
Keep the same tracked SQL path but replace its content with a sanitized development baseline.
The new baseline contains only the schema and synthetic seed data needed by the legacy PHP module.
Document how to keep any private raw dump outside git.
## Consequences
- Good:
  - active repo tree stops shipping raw sensitive SQL data
  - local setup remains possible with a smaller safe dataset
  - module development gets a focused baseline for current pages and worker
- Bad:
  - baseline no longer mirrors the full production dataset
  - some future work may need extra synthetic fixtures
## Alternatives considered
1. Keep raw dump and add warning only - rejected because data risk remains in tracked files.
2. Remove all SQL baseline files - rejected because local development would become harder.
3. Rewrite full git history now - rejected because scope is too large for this feature.
## Date
2026-05-25

@@ -35,3 +35,9 @@
- Versioned file stores example values only.
- Ignored local file stores real local secrets and URLs.
- All PHP entry points must read DB, OpenAI, and route values through config helper.
## F-003 technical notes
- Keep one tracked SQL baseline for safe local development.
- Baseline should contain synthetic or non-sensitive seed data only.
- Baseline should cover the tables needed by the legacy module pages and worker.
- Private raw dumps must stay outside git or in ignored local paths only.
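
The "one tracked SQL baseline" and "private raw dumps must stay outside git" notes could be enforced with a small guard over the list of tracked files, for example the output of `git ls-files`. This is a sketch under assumptions: the dump extensions below are guesses at typical names, and `unexpected_tracked_dumps` is a hypothetical helper that is not part of `./scripts/verify.sh`.

```python
# The only SQL file that is expected to be tracked.
BASELINE = "project/sql/db-25052026.sql"

# Assumption: extensions that a private raw dump would typically carry.
DUMP_SUFFIXES = (".sql", ".sql.gz", ".dump")


def unexpected_tracked_dumps(tracked_paths):
    """Return tracked dump-like files other than the sanitized baseline."""
    return sorted(
        path
        for path in tracked_paths
        if path.lower().endswith(DUMP_SUFFIXES) and path != BASELINE
    )
```

A non-empty result means a dump-like file other than the baseline is tracked and should be moved to an ignored local path.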