
Schema Migrations & Evolution

Learn how to safely update and evolve your datatable schemas in production with automatic migrations, rollback capabilities, and comprehensive audit trails.

Overview

The Data Service provides automatic schema migrations powered by Alembic, the industry-standard migration framework. When you update a table's schema, the system:

  • Automatically detects changes via SHA256 hash comparison
  • Generates DDL operations to migrate your data
  • Executes safely in transactions with timeout protection
  • Tracks everything in an audit trail with rollback SQL
  • Prevents conflicts with row-level locking

Production-Ready

All schema changes are transactional, reversible, and tracked. You can confidently evolve schemas in production.
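The hash-based change detection above can be sketched in a few lines. The service does not document its exact canonicalization, so `schema_hash` here is an assumption: it hashes the schema's sorted-key JSON form, which makes cosmetic key reordering a no-op.

```python
import hashlib
import json

def schema_hash(schema: dict) -> str:
    """Hash a schema's canonical JSON form; key order must not matter."""
    canonical = json.dumps(schema, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

old = {"fields": [{"name": "id", "type": "integer"}], "primaryKey": ["id"]}
new = {"primaryKey": ["id"], "fields": [{"name": "id", "type": "integer"}]}

# Same content in a different key order hashes identically, so no
# migration would be triggered for a cosmetic reordering of the JSON.
print(schema_hash(old) == schema_hash(new))  # True
```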

When Migrations Happen

Migrations are triggered automatically when you update a table's json_schema field:

PATCH /api/apps/myapp/datatables/users/
Content-Type: application/json

{
"json_schema": {
"fields": [
{"name": "id", "type": "integer"},
{"name": "email", "type": "string"},
{"name": "phone", "type": "string"} // ← New field added
],
"primaryKey": ["id"]
}
}

What happens behind the scenes:

  1. Hash comparison detects the schema changed
  2. Row lock prevents concurrent modifications
  3. Alembic compares old vs new schema
  4. DDL operations generated (ALTER TABLE ADD COLUMN phone...)
  5. Validation checks (safe operation? FK constraints valid?)
  6. Execution in transaction with 30-second timeout
  7. SchemaOperation record created with rollback SQL
  8. Cache invalidation triggers
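Step 6 is the key safety property: because the DDL runs inside a single transaction, a failure partway through leaves the table exactly as it was. A minimal illustration using SQLite, which (like PostgreSQL) supports transactional DDL:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.isolation_level = None  # manage transactions explicitly
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")

try:
    conn.execute("BEGIN")
    conn.execute("ALTER TABLE users ADD COLUMN phone TEXT")
    # A later statement in the same migration fails...
    conn.execute("ALTER TABLE users ADD COLUMN phone TEXT")  # duplicate -> error
    conn.execute("COMMIT")
except sqlite3.OperationalError:
    conn.execute("ROLLBACK")  # the earlier ADD COLUMN is undone too

cols = [row[1] for row in conn.execute("PRAGMA table_info(users)")]
print(cols)  # ['id', 'email'] -- no partial migration survived
```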

Supported Operations

✅ Safe Operations (No Data Loss)

These operations run automatically without special flags:

| Operation | Example | DDL Generated |
|---|---|---|
| Add nullable column | Add phone field | ALTER TABLE users ADD COLUMN phone TEXT |
| Add column with default | Add status with default 'active' | ALTER TABLE users ADD COLUMN status TEXT DEFAULT 'active' |
| Rename column | Rename email → email_address (with hint) | ALTER TABLE users RENAME COLUMN email TO email_address |
| Drop column | Remove deprecated_field | ALTER TABLE users DROP COLUMN deprecated_field |
| Widen column type | Change VARCHAR(50) → TEXT | ALTER TABLE users ALTER COLUMN name TYPE TEXT |
| Make column nullable | Remove NOT NULL constraint | ALTER TABLE users ALTER COLUMN phone DROP NOT NULL |
| Add foreign key | Add FK to organizations | ALTER TABLE users ADD CONSTRAINT fk_org FOREIGN KEY... |

⚠️ Potentially Unsafe Operations

These operations require explicit permission via the allow_data_loss flag:

| Operation | Risk | Mitigation |
|---|---|---|
| Add NOT NULL column | Fails if existing rows have NULL | Add with DEFAULT value or make nullable |
| Narrow column type | TEXT → VARCHAR(10) may truncate | Validate data fits new type first |
| Change incompatible types | TEXT → INTEGER may fail | Ensure all values are convertible |
| Remove NOT NULL | May violate application logic | Review application code first |

Breaking Changes

The system will reject unsafe operations unless you explicitly set allow_data_loss=True in the provider configuration. This protects your data from accidental loss.
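A sketch of how such a gate might look. The operation names here are illustrative, not the service's internal identifiers:

```python
# Illustrative classification of schema operations; the names are
# assumptions, not the service's actual internal identifiers.
SAFE_OPS = {"add_nullable_column", "add_column_with_default", "rename_column",
            "drop_column", "widen_type", "drop_not_null", "add_foreign_key"}
UNSAFE_OPS = {"add_not_null_column", "narrow_type", "incompatible_type_change"}

def check_operations(ops: list[str], allow_data_loss: bool = False) -> list[str]:
    """Reject unsafe operations unless data loss is explicitly allowed."""
    blocked = [op for op in ops if op in UNSAFE_OPS and not allow_data_loss]
    if blocked:
        raise ValueError(f"Unsafe operations require allow_data_loss=True: {blocked}")
    return ops

check_operations(["add_nullable_column"])                # passes
check_operations(["narrow_type"], allow_data_loss=True)  # passes, explicitly permitted
```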

Preview Changes Before Applying

Always preview major schema changes before applying them in production.

Preview Endpoint

POST /api/apps/myapp/datatables/users/preview-changes/
Content-Type: application/json

{
  "new_schema": {
    "fields": [
      {"name": "id", "type": "integer"},
      {"name": "email", "type": "string"},
      {"name": "phone_number", "type": "string", "constraints": {"maxLength": 20}}
    ],
    "primaryKey": ["id"]
  }
}

Response

{
  "operations": [
    {
      "type": "add_column",
      "column": "phone_number",
      "sql": "ALTER TABLE myapp_users ADD COLUMN phone_number VARCHAR(20)",
      "is_safe": true
    }
  ],
  "is_safe": true,
  "warnings": [],
  "estimated_duration_ms": 50
}

Understanding the Preview

  • operations: List of DDL changes that will be executed
  • is_safe: true if no data loss risk
  • warnings: Important notices about potential issues
  • estimated_duration_ms: Approximate execution time

Best Practice

Run preview in development first, review the SQL, then apply to production.
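One way to act on a preview programmatically is a small gate that only proceeds when the response is safe and warning-free. `should_apply` is a hypothetical helper, not part of the API:

```python
def should_apply(preview: dict) -> bool:
    """Gate a production migration on the preview response."""
    if not preview.get("is_safe", False):
        return False
    if preview.get("warnings"):
        return False  # surface warnings for human review instead
    return True

# Sample preview response, matching the shape documented above
preview = {
    "operations": [{"type": "add_column", "column": "phone_number",
                    "sql": "ALTER TABLE myapp_users ADD COLUMN phone_number VARCHAR(20)",
                    "is_safe": True}],
    "is_safe": True,
    "warnings": [],
    "estimated_duration_ms": 50,
}
print(should_apply(preview))  # True
```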

Validate Schema Before Saving

Catch errors before attempting migration:

POST /api/apps/myapp/datatables/users/validate-schema/
Content-Type: application/json

{
  "schema": {
    "fields": [
      {"name": "id", "type": "integer"},
      {"name": "org_id", "type": "integer"},
      {"name": "email", "type": "string"}
    ],
    "primaryKey": ["id"],
    "foreignKeys": [
      {
        "fields": ["org_id"],
        "reference": {
          "resource": "organizations",
          "fields": ["id"]
        }
      }
    ]
  }
}

Response

{
  "valid": true,
  "errors": [],
  "warnings": [
    "Foreign key target table 'organizations' must be materialized before adding FK constraint"
  ]
}

Migration History & Audit Trail

Every schema operation is permanently tracked for compliance and debugging.

View Operation History

GET /api/apps/myapp/datatables/users/operation-history/

Response

{
  "count": 15,
  "results": [
    {
      "revision_id": "a1b2c3d4e5f6",
      "operation_type": "ALTER_TABLE",
      "status": "SUCCESS",
      "alembic_operations": [
        {"type": "add_column", "column": "phone_number"}
      ],
      "executed_sql": "ALTER TABLE myapp_users ADD COLUMN phone_number VARCHAR(20);",
      "rollback_sql": "ALTER TABLE myapp_users DROP COLUMN phone_number;",
      "started_at": "2024-01-15T10:30:00Z",
      "completed_at": "2024-01-15T10:30:01.234Z",
      "duration_ms": 1234,
      "parent_operation": null
    },
    {
      "revision_id": "f6e5d4c3b2a1",
      "operation_type": "CREATE_TABLE",
      "status": "SUCCESS",
      "started_at": "2024-01-10T09:00:00Z",
      "completed_at": "2024-01-10T09:00:02.567Z",
      "duration_ms": 2567,
      "parent_operation": null
    }
  ]
}

Understanding the History

  • revision_id: Unique 12-character identifier (Alembic-compatible)
  • operation_type: CREATE_TABLE, ALTER_TABLE, DROP_TABLE
  • status: PENDING, IN_PROGRESS, SUCCESS, FAILED, ROLLED_BACK
  • alembic_operations: Structured list of operations performed
  • executed_sql: Exact SQL that was run
  • rollback_sql: Auto-generated SQL to reverse the operation
  • parent_operation: Links operations in a migration chain
  • duration_ms: How long the operation took

Audit Compliance

All operations are permanently logged. You can prove exactly what changed, when, and by whom.

Rollback & Recovery

Every schema migration can be reversed using automatically generated rollback SQL.

When to Rollback

  • ❌ Migration failed mid-execution
  • ❌ Wrong schema applied by accident
  • ❌ Application incompatible with new schema
  • ❌ Data corruption detected after migration

How Rollback Works

Rollback SQL is automatically generated for every operation:

| Forward Operation | Rollback Operation |
|---|---|
| ADD COLUMN phone TEXT | DROP COLUMN phone |
| DROP COLUMN deprecated | ADD COLUMN deprecated TEXT (with original definition) |
| ALTER COLUMN type TEXT | ALTER COLUMN type INTEGER (back to old type) |
| ADD CONSTRAINT fk_org | DROP CONSTRAINT fk_org |
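The inverse mapping above is mechanical for most operations. A hypothetical sketch (the service generates and stores its own rollback SQL; this merely mirrors the table):

```python
# Illustrative inverse-DDL generation; op shapes are assumptions.
def rollback_sql(op: dict, table: str) -> str:
    """Return the SQL that reverses a single forward operation."""
    if op["type"] == "add_column":
        return f"ALTER TABLE {table} DROP COLUMN {op['column']};"
    if op["type"] == "add_constraint":
        return f"ALTER TABLE {table} DROP CONSTRAINT {op['name']};"
    if op["type"] == "drop_column":
        # Reversible only if the original definition was recorded
        return f"ALTER TABLE {table} ADD COLUMN {op['column']} {op['definition']};"
    raise NotImplementedError(op["type"])

print(rollback_sql({"type": "add_column", "column": "phone"}, "users"))
# ALTER TABLE users DROP COLUMN phone;
```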

Execute Rollback

POST /api/apps/myapp/datatables/users/rollback/
Content-Type: application/json

{
  "revision_id": "a1b2c3d4e5f6"
}

Response

{
  "success": true,
  "message": "Successfully rolled back operation a1b2c3d4e5f6",
  "operations_reversed": 1,
  "new_status": "ROLLED_BACK"
}

Rollback Limitations

Cannot Always Rollback

Some operations cannot be fully reversed:

  • Dropped columns: If column definition wasn't stored, cannot recreate
  • Type conversions: If data is incompatible with old type, rollback fails
  • Data loss: Dropped data cannot be recovered from rollback SQL alone
  • Already rolled back: Cannot rollback an operation twice

Mitigation: Always have database backups before major schema changes.

Column Rename Detection

The system automatically detects column renames when you use the x-rename-from hint in your schema.

How It Works

Without rename hint (DATA LOSS):

{
  "fields": [
    {
      "name": "email_address",
      "type": "string"
    }
  ]
}

Alembic sees:

  • Old column email missing in new schema → DROP COLUMN email
  • New column email_address not in database → ADD COLUMN email_address
  • Result: All email data is lost!

With rename hint (DATA PRESERVED):

{
  "fields": [
    {
      "name": "email_address",
      "type": "string",
      "x-rename-from": "email"
    }
  ]
}

System detects:

  • New column has x-rename-from: "email" hint
  • Old column email exists and is being dropped
  • Result: Generates ALTER TABLE RENAME COLUMN email TO email_address
  • Data preserved!

Example: Renaming Multiple Columns

{
  "fields": [
    {
      "name": "id",
      "type": "integer"
    },
    {
      "name": "email_address",
      "type": "string",
      "x-rename-from": "email"
    },
    {
      "name": "full_name",
      "type": "string",
      "x-rename-from": "name"
    },
    {
      "name": "phone_number",
      "type": "string",
      "x-rename-from": "phone"
    }
  ],
  "primaryKey": ["id"]
}

Generated SQL:

ALTER TABLE users RENAME COLUMN email TO email_address;
ALTER TABLE users RENAME COLUMN name TO full_name;
ALTER TABLE users RENAME COLUMN phone TO phone_number;
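The rename planning above can be sketched as a schema diff that honors the x-rename-from hint. `plan_column_changes` is illustrative and ignores column types for brevity:

```python
def plan_column_changes(old_fields, new_fields, table):
    """Emit RENAME for columns carrying x-rename-from; drop+add otherwise."""
    old_names = {f["name"] for f in old_fields}
    new_names = {f["name"] for f in new_fields}
    sql, renamed_from = [], set()
    for f in new_fields:
        src = f.get("x-rename-from")
        if src and src in old_names:
            sql.append(f"ALTER TABLE {table} RENAME COLUMN {src} TO {f['name']};")
            renamed_from.add(src)
        elif f["name"] not in old_names:
            sql.append(f"ALTER TABLE {table} ADD COLUMN {f['name']} TEXT;")
    for f in old_fields:
        # Only columns that were neither kept nor renamed get dropped
        if f["name"] not in new_names and f["name"] not in renamed_from:
            sql.append(f"ALTER TABLE {table} DROP COLUMN {f['name']};")
    return sql

old = [{"name": "id"}, {"name": "email"}, {"name": "name"}, {"name": "phone"}]
new = [
    {"name": "id"},
    {"name": "email_address", "x-rename-from": "email"},
    {"name": "full_name", "x-rename-from": "name"},
    {"name": "phone_number", "x-rename-from": "phone"},
]
for stmt in plan_column_changes(old, new, "users"):
    print(stmt)
```

Without the hints, the same diff would produce three DROP and three ADD statements, which is exactly the data-loss scenario described above.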

Migration Log Output

When rename is detected, you'll see in the logs:

INFO: Column 'email_address' marked as rename from 'email'
INFO: Detected column rename: users.email → email_address
INFO: Migration #1: AlterColumnOp
- Table: users
- Column: email
- New Name: email_address

Best Practice

Always use x-rename-from when renaming columns in production. Without it, the system cannot distinguish a rename from a drop+add operation.

Remove Hint After Migration

After the migration succeeds, you can optionally remove the x-rename-from hint from your schema. It's only needed during the rename operation itself.

Best Practices

Development Workflow

  1. Test locally first

    # Create test app
    curl -X POST /api/apps/ -d '{"name": "test-migrations"}'

    # Test schema change
    curl -X PATCH /api/apps/test-migrations/datatables/users/ \
      -H "Content-Type: application/json" \
      -d '{"json_schema": {...}}'
  2. Use preview endpoint

    # Preview changes before applying
    curl -X POST /api/apps/myapp/datatables/users/preview-changes/ \
      -d '{"new_schema": {...}}'
  3. Validate schema

    # Check for errors before saving
    curl -X POST /api/apps/myapp/datatables/users/validate-schema/ \
      -d '{"schema": {...}}'
  4. Review operation history

    # Verify migration succeeded
    curl -X GET /api/apps/myapp/datatables/users/operation-history/

Production Strategies

Zero-Downtime Migrations

For large tables, use a multi-step approach:

Step 1: Add nullable column

{
  "fields": [
    ...existing fields...,
    {"name": "new_field", "type": "string"} // No constraints
  ]
}

Step 2: Populate data (run script)

UPDATE myapp_users SET new_field = 'default_value' WHERE new_field IS NULL;

Step 3: Add NOT NULL constraint (separate migration)

{
  "fields": [
    ...existing fields...,
    {"name": "new_field", "type": "string", "constraints": {"required": true}}
  ]
}
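The three steps can be captured as a small plan builder. `backfill_plan` is a hypothetical helper that emits the step-1 and step-3 field definitions plus the backfill statement:

```python
def backfill_plan(table, field, default):
    """Three-step zero-downtime plan: add nullable, backfill, enforce NOT NULL."""
    step1 = {"name": field, "type": "string"}  # nullable, no constraints
    step2 = (f"UPDATE {table} SET {field} = '{default}' "
             f"WHERE {field} IS NULL;")
    step3 = {"name": field, "type": "string", "default": default,
             "constraints": {"required": True}}
    return step1, step2, step3

add, backfill, enforce = backfill_plan("myapp_users", "new_field", "default_value")
print(backfill)
# UPDATE myapp_users SET new_field = 'default_value' WHERE new_field IS NULL;
```

Keeping the three steps as separate deployments means every intermediate state is one the running application can tolerate.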

Handling Breaking Changes

Instead of this (BREAKS!):

// Removing a field applications depend on
{
  "fields": [
    {"name": "id", "type": "integer"},
    {"name": "email", "type": "string"}
    // "name" field removed ❌
  ]
}

Do this (Safe):

// 1. Deploy code that doesn't use "name"
// 2. THEN remove field from schema
{
  "fields": [
    {"name": "id", "type": "integer"},
    {"name": "email", "type": "string"}
  ]
}

Backup Before Major Changes

# PostgreSQL backup
pg_dump -h localhost -U user -d database -n myapp_schema > backup.sql

# Apply migration
curl -X PATCH /api/apps/myapp/datatables/users/ ...

# If issues, restore
psql -h localhost -U user -d database < backup.sql

Common Patterns

✅ Adding a Nullable Column (Safe)

{
  "fields": [
    ...existing fields...,
    {
      "name": "phone_number",
      "type": "string",
      "constraints": {"maxLength": 20}
    }
  ]
}

✅ Adding a Column with Default (Safe)

{
  "fields": [
    ...existing fields...,
    {
      "name": "status",
      "type": "string",
      "default": "active"
    }
  ]
}

✅ Removing Unused Columns (Safe)

{
  "fields": [
    {"name": "id", "type": "integer"},
    {"name": "email", "type": "string"}
    // "deprecated_field" removed
  ]
}

⚠️ Changing Column Type (Check Compatibility)

{
  "fields": [
    {
      "name": "age",
      "type": "integer" // Changed from "string"
    }
  ]
}

Validation needed: Ensure all existing age values are valid integers.

⚠️ Adding NOT NULL Constraint (Provide Default)

{
  "fields": [
    {
      "name": "status",
      "type": "string",
      "default": "active",
      "constraints": {"required": true}
    }
  ]
}

Safety Features

Automatic Protections

The system provides multiple layers of safety:

| Feature | Protection | Benefit |
|---|---|---|
| DDL Timeout | 30-second limit on schema operations | Prevents runaway migrations |
| Transaction Isolation | All operations in ACID transactions | All-or-nothing execution |
| Row Locking | SELECT FOR UPDATE prevents concurrent edits | No race conditions |
| Schema Cache Invalidation | Automatic cache clearing | No stale schema data |
| FK Validation | Validates target tables/columns exist | No broken references |
| Identifier Validation | SQL injection prevention | Security |

Concurrent Modifications

Problem: Two users update the same table schema simultaneously.

Solution: Row-level locking prevents conflicts.

# User A starts updating
PATCH /datatables/users/ # → Locks row

# User B tries to update same table
PATCH /datatables/users/ # → Waits for lock

# User A finishes
# → Lock released

# User B's update now executes
# → May fail if incompatible with A's changes

Retry Strategy: If you get a lock timeout error, retry the request.
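A simple retry wrapper with exponential backoff covers the lock-timeout case. The `TimeoutError` mapping here is an assumption; adapt it to whatever error your HTTP client raises:

```python
import time

def with_retry(request_fn, attempts=3, base_delay=0.5):
    """Retry a schema update that may hit a row-lock timeout."""
    for attempt in range(attempts):
        try:
            return request_fn()
        except TimeoutError:
            if attempt == attempts - 1:
                raise
            time.sleep(base_delay * 2 ** attempt)  # exponential backoff

# Simulated PATCH that fails twice with a lock timeout, then succeeds
calls = {"n": 0}
def flaky_patch():
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError("lock timeout")
    return {"status": "SUCCESS"}

print(with_retry(flaky_patch))  # {'status': 'SUCCESS'} after two retries
```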

API Reference

Update Table Schema

PATCH /api/apps/{app_slug}/datatables/{table_name}/
Content-Type: application/json

{
  "json_schema": {
    "fields": [...],
    "primaryKey": [...],
    "foreignKeys": [...]
  }
}

Response: 200 OK

{
  "id": 123,
  "name": "users",
  "json_schema": {...},
  "is_materialized": true,
  "physical_table_name": "myapp_users",
  "schema_hash": "a1b2c3..."
}

Preview Schema Changes

POST /api/apps/{app_slug}/datatables/{table_name}/preview-changes/
Content-Type: application/json

{
  "new_schema": {...}
}

Response: 200 OK

{
  "operations": [
    {"type": "add_column", "column": "phone", "sql": "..."}
  ],
  "is_safe": true,
  "warnings": [],
  "estimated_duration_ms": 50
}

Validate Schema

POST /api/apps/{app_slug}/datatables/{table_name}/validate-schema/
Content-Type: application/json

{
  "schema": {...}
}

Response: 200 OK

{
  "valid": true,
  "errors": [],
  "warnings": []
}

Get Operation History

GET /api/apps/{app_slug}/datatables/{table_name}/operation-history/

Query Parameters:

  • limit (optional): Number of results (default: 10)
  • offset (optional): Pagination offset

Response: 200 OK

{
  "count": 50,
  "next": "/api/apps/myapp/datatables/users/operation-history/?offset=10",
  "previous": null,
  "results": [...]
}

Rollback Operation

POST /api/apps/{app_slug}/datatables/{table_name}/rollback/
Content-Type: application/json

{
  "revision_id": "a1b2c3d4e5f6"
}

Response: 200 OK

{
  "success": true,
  "message": "Successfully rolled back operation a1b2c3d4e5f6",
  "operations_reversed": 1
}

Troubleshooting

Common Errors

"Foreign key constraint violation"

Error: Cannot add FK because target table doesn't exist.

Solution: Ensure referenced table is materialized first.

# Check if target table exists
curl -X GET /api/apps/myapp/datatables/organizations/

# Materialize if needed
curl -X POST /api/apps/myapp/datatables/organizations/materialize/

"Cannot convert type X to type Y"

Error: Existing data incompatible with new type.

Solution: Validate/convert data first, then change type.

-- Check if all values are convertible
SELECT age FROM users WHERE age !~ '^[0-9]+$';

-- Fix invalid values
UPDATE users SET age = NULL WHERE age !~ '^[0-9]+$';

-- Now safe to change type

"DDL timeout exceeded"

Error: Operation took longer than 30 seconds.

Solution:

  1. Check for locks on the table
  2. Ensure table isn't extremely large (>10M rows)
  3. Contact support for timeout adjustment

"Schema validation failed"

Error: Invalid Frictionless schema format.

Solution: Validate against Frictionless specification.

// Missing required field
{
"fields": [
{"type": "integer"} // ❌ Missing "name"
]
}

// Fixed
{
"fields": [
{"name": "id", "type": "integer"} // ✅
]
}
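A lightweight pre-flight check for the most common shape errors (a field missing name or type) can run before you ever call the API. It mirrors, but does not replace, the server-side Frictionless validation:

```python
def validate_fields(schema: dict) -> list[str]:
    """Return human-readable errors for malformed field definitions."""
    errors = []
    for i, field in enumerate(schema.get("fields", [])):
        if "name" not in field:
            errors.append(f"fields[{i}]: missing required key 'name'")
        if "type" not in field:
            errors.append(f"fields[{i}]: missing required key 'type'")
    return errors

print(validate_fields({"fields": [{"type": "integer"}]}))
# ["fields[0]: missing required key 'name'"]
print(validate_fields({"fields": [{"name": "id", "type": "integer"}]}))  # []
```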

Recovery Procedures

Scenario: Migration failed, table in inconsistent state.

Recovery Steps:

  1. Check operation status

    curl /api/apps/myapp/datatables/users/operation-history/
  2. If status is FAILED, review error

    {
      "status": "FAILED",
      "error_message": "Column 'phone' already exists"
    }
  3. Manual fix if needed

    -- Connect to database
    psql -h localhost -U user -d database

    -- Fix the issue
    ALTER TABLE myapp_users DROP COLUMN phone;
  4. Mark operation as rolled back (if fixed manually)

    # Contact support to update operation status

Getting Help

If you encounter issues:

  1. Check operation history for error details
  2. Review preview to understand what was attempted
  3. Check logs for detailed error messages
  4. Contact support with revision_id for investigation

What's Next?