Database Schema Review
Review database schemas for normalization, index strategy, constraints, migration safety, and query efficiency.
The Problem
Schema design mistakes are among the most expensive bugs in software. A missing index causes a query to table-scan millions of rows. A missing foreign key allows orphaned records that corrupt reports. A denormalized column drifts out of sync because the update logic has a race condition. Unlike application code, schema changes on large production tables require careful migration planning — you cannot just “fix it later” without downtime risk.
The Prompt
Review the following database schema. Act as a database architect evaluating for data integrity, query performance, and evolution safety.
DATABASE: [PostgreSQL / MySQL / MongoDB / SQLite]
EXPECTED SCALE: [e.g., 10M rows in orders, 500k users]
SCHEMA:
[paste CREATE TABLE statements, Prisma schema, or migration files]
COMMON QUERIES:
[paste the most frequent queries this schema serves]
Evaluate across these dimensions:
1. **Normalization**
- Is the schema in 3NF? If denormalized, is it intentional with justification?
- Are there duplicated columns that will drift out of sync?
- Are many-to-many relationships properly modeled with junction tables?
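For the many-to-many point above, a junction table with a composite primary key is the standard shape. A hypothetical sketch (table and column names are illustrative, not from the schema under review):

```sql
-- Illustrative junction table for a many-to-many between users and roles.
CREATE TABLE user_roles (
  user_id BIGINT NOT NULL REFERENCES users(id) ON DELETE CASCADE,
  role_id BIGINT NOT NULL REFERENCES roles(id) ON DELETE CASCADE,
  PRIMARY KEY (user_id, role_id)  -- prevents duplicate assignments
);
-- The PK already covers user-first lookups; add the reverse for role-first queries:
CREATE INDEX idx_user_roles_role ON user_roles (role_id);
```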
2. **Index Strategy**
- Do all foreign keys have indexes?
- Do WHERE clause columns in frequent queries have appropriate indexes?
- Are there composite indexes for multi-column lookups?
- Are there unused or redundant indexes adding write overhead?
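On PostgreSQL, one way to spot indexes that add write overhead without serving reads is the statistics catalog. A sketch (assumes usage statistics have accumulated since the last stats reset — a freshly restarted database will report everything as unused):

```sql
-- Indexes never used for a scan since statistics were last reset (PostgreSQL).
SELECT schemaname, relname AS table_name, indexrelname AS index_name
FROM pg_stat_user_indexes
WHERE idx_scan = 0
ORDER BY relname, indexrelname;
```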
3. **Constraints & Integrity**
- Are NOT NULL constraints applied where data is required?
- Are foreign keys actually declared, not just implied by naming convention?
- Are unique constraints set where business rules demand uniqueness?
- Are CHECK constraints used for enum-like or range-bounded values?
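A CHECK constraint for an enum-like column, and a unique constraint enforcing a business rule, might look like this (the status values and the one-cart-per-user rule are hypothetical examples):

```sql
-- Range-bound an enum-like column at the database level.
ALTER TABLE orders
  ADD CONSTRAINT chk_orders_status
  CHECK (status IN ('pending', 'paid', 'shipped', 'cancelled'));

-- Business-rule uniqueness: e.g., at most one cart per user.
ALTER TABLE carts
  ADD CONSTRAINT uq_carts_user UNIQUE (user_id);
```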
4. **Data Types**
- Are column types right-sized (e.g., not varchar(255) where varchar(50) suffices)?
- Are monetary values stored as DECIMAL, not FLOAT?
- Are timestamps using TIMESTAMPTZ (timezone-aware)?
- Are UUIDs used for external-facing IDs instead of sequential integers?
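Taken together, those data-type rules might produce a table like this sketch (PostgreSQL syntax; `gen_random_uuid()` is built in from PostgreSQL 13, earlier versions need the pgcrypto extension; the table and columns are illustrative):

```sql
CREATE TABLE payments (
  id         BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY, -- internal key
  public_id  UUID NOT NULL DEFAULT gen_random_uuid() UNIQUE,  -- external-facing ID
  amount     DECIMAL(10,2) NOT NULL,                          -- exact, not FLOAT
  currency   CHAR(3) NOT NULL,                                -- ISO 4217 code
  created_at TIMESTAMPTZ NOT NULL DEFAULT now()               -- timezone-aware
);
```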
5. **Query Efficiency**
- Will the common queries perform full table scans?
- Are N+1 patterns implied by the schema shape (no join path)?
- Would a materialized view or computed column improve read performance?
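Where the same aggregate is read far more often than its inputs change, a materialized view can replace a repeated GROUP BY scan. A PostgreSQL sketch over hypothetical tables (note that `REFRESH ... CONCURRENTLY` requires a unique index on the view):

```sql
CREATE MATERIALIZED VIEW order_totals AS
SELECT user_id, COUNT(*) AS order_count, SUM(total) AS lifetime_value
FROM orders
GROUP BY user_id;

-- Required for concurrent refresh:
CREATE UNIQUE INDEX idx_order_totals_user ON order_totals (user_id);

-- Refresh without blocking concurrent reads:
REFRESH MATERIALIZED VIEW CONCURRENTLY order_totals;
```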
6. **Evolution Safety**
- Can new columns be added without locking the table (NULL default)?
- Are soft deletes implemented where audit trails are needed?
- Is there a schema versioning or migration strategy?
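One common soft-delete shape uses a nullable timestamp plus a partial index so queries over live rows stay fast. A PostgreSQL sketch with illustrative names:

```sql
-- NULL means the row is active; a timestamp records when it was soft-deleted.
ALTER TABLE users ADD COLUMN deleted_at TIMESTAMPTZ;

-- Index (and enforce uniqueness on) only the live rows most queries touch:
CREATE UNIQUE INDEX idx_users_email_active
  ON users (email) WHERE deleted_at IS NULL;
```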
For each issue, provide:
- **Table/Column**: What is affected
- **Severity**: data-risk (integrity) / performance / improvement
- **Impact**: What breaks at scale or under concurrent access
- **Fix**: DDL statement or schema change with migration notes
Example Output
## Schema Review: 3 issues found
### Data Risk: Missing Foreign Key
Table: order_items.product_id — no FK constraint to products.id
Impact: Deleting a product leaves orphaned order items. Reports show "unknown product."
Fix:
```sql
ALTER TABLE order_items
  ADD CONSTRAINT fk_order_items_product
  FOREIGN KEY (product_id) REFERENCES products(id) ON DELETE RESTRICT;
```
### Performance: Missing Index on Frequent Query
Table: orders — query `SELECT * FROM orders WHERE user_id = ? AND status = ?`
Impact: Full table scan on 10M rows. Response time degrades from 5ms to 3s.
Fix:
```sql
CREATE INDEX idx_orders_user_status ON orders(user_id, status);
```
### Improvement: Float for Currency
Table: products.price FLOAT
Impact: Floating point arithmetic: 0.1 + 0.2 = 0.30000000000000004. Invoice totals drift.
Fix:
```sql
ALTER TABLE products ALTER COLUMN price TYPE DECIMAL(10,2);
```
When to Use
Run this before creating migration files, when onboarding to a legacy database, or during performance investigations where slow queries point to schema issues. Particularly valuable before the database grows past the point where ALTER TABLE becomes a multi-hour operation.
Pro Tips
- Include your queries — schema quality is meaningless without knowing how data is accessed. Always provide the top 5-10 queries.
- Specify your scale — index advice for 10,000 rows differs dramatically from 10,000,000 rows.
- Ask for migration scripts — follow up with “Generate safe migration SQL for each fix, using zero-downtime patterns (add column nullable, backfill, then set NOT NULL).”
- Review ORM-generated schemas — ORMs like Prisma and Sequelize make schema decisions you might not agree with. Always review the generated SQL.
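The zero-downtime pattern mentioned in the tips above can be sketched as three separate migrations (PostgreSQL; the table, column, and backfill value are hypothetical):

```sql
-- Migration 1: add the column nullable — no table rewrite, only a brief lock.
ALTER TABLE orders ADD COLUMN placed_via TEXT;

-- Migration 2: backfill in batches to keep row locks short.
UPDATE orders
SET placed_via = 'web'
WHERE placed_via IS NULL AND id BETWEEN 1 AND 1000000;
-- ...repeat for subsequent id ranges...

-- Migration 3: once fully backfilled, enforce the constraint.
ALTER TABLE orders ALTER COLUMN placed_via SET NOT NULL;
```

Note that the final `SET NOT NULL` still scans the table to validate; on very large tables, adding a `CHECK (placed_via IS NOT NULL) NOT VALID` constraint and validating it separately keeps that lock window short.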