Skip to main content

Frictionless Table Schema Guide

Taruvi's Data Service uses Frictionless Table Schema as the standard format for defining table structures. This guide explains how to create and work with schemas.

What is Frictionless Table Schema?

Frictionless Table Schema is a simple, open standard for declaring table structure. It's JSON-based, human-readable, and widely supported.

Basic Structure

A minimal schema requires a fields array:

{
"fields": [
{
"name": "id",
"type": "integer"
},
{
"name": "name",
"type": "string"
}
]
}

Field Types

String Types

{
"name": "email",
"type": "string",
"format": "email"
}

Formats: default, email, uri, binary, uuid

Numeric Types

{
"name": "age",
"type": "integer"
},
{
"name": "price",
"type": "number"
}

Integer: Whole numbers Number: Decimal numbers

Boolean Type

{
"name": "is_active",
"type": "boolean"
}

Date & Time Types

{
"name": "created_at",
"type": "datetime"
},
{
"name": "birth_date",
"type": "date"
},
{
"name": "start_time",
"type": "time"
},
{
"name": "year_established",
"type": "year"
}

Array & Object Types

{
"name": "tags",
"type": "array"
},
{
"name": "metadata",
"type": "object"
}

Field Constraints

Required Fields

{
"name": "email",
"type": "string",
"constraints": {
"required": true
}
}

Unique Constraint

{
"name": "username",
"type": "string",
"constraints": {
"unique": true
}
}

Min/Max Values

{
"name": "age",
"type": "integer",
"constraints": {
"minimum": 0,
"maximum": 150
}
},
{
"name": "price",
"type": "number",
"constraints": {
"minimum": 0.01
}
}

String Length

{
"name": "username",
"type": "string",
"constraints": {
"minLength": 3,
"maxLength": 50
}
}

Pattern (Regex)

{
"name": "phone",
"type": "string",
"constraints": {
"pattern": "^\\+?[1-9]\\d{1,14}$"
}
}

Enum Values

{
"name": "status",
"type": "string",
"constraints": {
"enum": ["draft", "published", "archived"]
}
}

Primary Keys

Define primary key(s) at the schema level:

{
"fields": [
{
"name": "id",
"type": "integer"
}
],
"primaryKey": ["id"]
}

Composite Primary Keys:

{
"primaryKey": ["user_id", "project_id"]
}

Foreign Keys

Reference other tables:

{
"fields": [
{
"name": "user_id",
"type": "integer"
}
],
"foreignKeys": [
{
"fields": ["user_id"],
"reference": {
"resource": "users",
"fields": ["id"]
}
}
]
}

Complete Examples

User Table

{
"fields": [
{
"name": "id",
"type": "integer",
"title": "User ID",
"description": "Unique identifier for the user",
"constraints": {
"required": true,
"unique": true
}
},
{
"name": "email",
"type": "string",
"format": "email",
"title": "Email Address",
"constraints": {
"required": true,
"unique": true
}
},
{
"name": "username",
"type": "string",
"title": "Username",
"constraints": {
"required": true,
"unique": true,
"minLength": 3,
"maxLength": 50,
"pattern": "^[a-zA-Z0-9_-]+$"
}
},
{
"name": "first_name",
"type": "string",
"title": "First Name",
"constraints": {
"required": true
}
},
{
"name": "last_name",
"type": "string",
"title": "Last Name",
"constraints": {
"required": true
}
},
{
"name": "age",
"type": "integer",
"title": "Age",
"constraints": {
"minimum": 13,
"maximum": 150
}
},
{
"name": "is_active",
"type": "boolean",
"title": "Active Status",
"description": "Whether the user account is active"
},
{
"name": "created_at",
"type": "datetime",
"title": "Created At",
"description": "When the user account was created"
}
],
"primaryKey": ["id"],
"title": "Users",
"description": "User accounts and profile information"
}

Product Table with Foreign Key

{
"fields": [
{
"name": "id",
"type": "integer",
"constraints": {
"required": true,
"unique": true
}
},
{
"name": "name",
"type": "string",
"constraints": {
"required": true
}
},
{
"name": "price",
"type": "number",
"constraints": {
"required": true,
"minimum": 0
}
},
{
"name": "category",
"type": "string",
"constraints": {
"enum": ["electronics", "clothing", "food", "books"]
}
},
{
"name": "seller_id",
"type": "integer",
"description": "ID of the user selling this product",
"constraints": {
"required": true
}
},
{
"name": "in_stock",
"type": "boolean"
},
{
"name": "tags",
"type": "array",
"description": "Product tags for search and categorization"
}
],
"primaryKey": ["id"],
"foreignKeys": [
{
"fields": ["seller_id"],
"reference": {
"resource": "users",
"fields": ["id"]
}
}
]
}

Field Properties

Required Properties

  • name (string): Field name (required)
  • type (string): Data type (required)

Optional Properties

  • title (string): Human-readable field name
  • description (string): Field documentation
  • format (string): Specific format for the type
  • constraints (object): Validation rules
  • example (any): Example value
  • rdfType (string): RDF type for semantic web
  • x-rename-from (string): Original column name when renaming (preserves data during migrations)

Validation

Schemas are automatically validated when creating or updating DataTables:

POST /api/apps/my-app/datasources/my-ds/datatables/

Validation Errors:

{
"json_schema": [
"Invalid Frictionless schema: Field 'age' has invalid type 'integ'. Valid types are: string, number, integer, boolean, object, array, date, time, datetime, year, yearmonth, duration, geopoint, geojson, any"
]
}

Schema Updates

When updating a schema, the schema_hash is automatically recalculated:

PATCH /api/apps/my-app/datasources/my-ds/datatables/users/
Content-Type: application/json

{
"json_schema": {
// Updated schema
}
}

The system tracks schema changes through SchemaOperation records for audit purposes.

Best Practices

1. Always Define Primary Keys

{
"primaryKey": ["id"]
}

2. Use Descriptive Names

{
"name": "user_email", // ✅ Clear
"title": "User Email Address",
"description": "Primary contact email for the user"
}
{
"name": "e", // ❌ Unclear
"type": "string"
}

3. Add Constraints for Data Quality

{
"name": "email",
"type": "string",
"format": "email", // ✅ Format validation
"constraints": {
"required": true, // ✅ Not null
"unique": true // ✅ No duplicates
}
}

4. Document Your Fields

{
"name": "status",
"type": "string",
"title": "Order Status",
"description": "Current status of the order in the fulfillment process",
"constraints": {
"enum": ["pending", "processing", "shipped", "delivered", "cancelled"]
}
}

5. Use Appropriate Types

// ✅ Correct types
{
"name": "age",
"type": "integer" // Whole numbers
},
{
"name": "price",
"type": "number" // Decimals
},
{
"name": "created_at",
"type": "datetime" // Timestamps
}

// ❌ Wrong types
{
"name": "age",
"type": "string" // Avoid storing numbers as strings
}

Column Renames

When renaming a column, use the x-rename-from hint to preserve existing data:

{
"fields": [
{
"name": "email_address",
"type": "string",
"format": "email",
"x-rename-from": "email"
}
]
}

How it works:

  • Without hint: System drops email column and creates new email_address column → DATA LOSS
  • With hint: System generates ALTER TABLE RENAME COLUMN email TO email_addressDATA PRESERVED

Example rename migration:

# Step 1: Get current schema
GET /api/apps/my-app/datasources/my-ds/datatables/users/

# Step 2: Update schema with rename hint
PATCH /api/apps/my-app/datasources/my-ds/datatables/users/
Content-Type: application/json

{
"json_schema": {
"fields": [
{
"name": "id",
"type": "integer"
},
{
"name": "email_address",
"type": "string",
"format": "email",
"x-rename-from": "email"
},
{
"name": "name",
"type": "string"
}
],
"primaryKey": ["id"]
}
}

Result: Column email is safely renamed to email_address with all data intact.

Best Practice

Always use x-rename-from when renaming columns in production to avoid data loss.

Common Patterns

Timestamp Fields

{
"fields": [
{
"name": "created_at",
"type": "datetime",
"title": "Created At"
},
{
"name": "updated_at",
"type": "datetime",
"title": "Updated At"
}
]
}

Status/State Fields

{
"name": "status",
"type": "string",
"constraints": {
"enum": ["draft", "review", "approved", "rejected"]
}
}

User Reference

{
"fields": [
{
"name": "created_by_id",
"type": "integer"
}
],
"foreignKeys": [
{
"fields": ["created_by_id"],
"reference": {
"resource": "users",
"fields": ["id"]
}
}
]
}

Custom Indexes

For performance optimization, you can define custom indexes in your schema:

{
"fields": [
{"name": "id", "type": "integer"},
{"name": "email", "type": "string"},
{"name": "status", "type": "string"}
],
"primaryKey": ["id"],
"indexes": [
{
"name": "idx_users_email",
"fields": ["email"],
"unique": false,
"method": "btree",
"comment": "Fast email lookup"
},
{
"name": "idx_users_email_lower",
"fields": ["email"],
"expression": "LOWER(email)",
"unique": true,
"method": "btree",
"comment": "Case-insensitive unique email"
}
]
}

Index Properties:

  • name: Unique index name
  • fields: Array of column names to index
  • unique: Create UNIQUE index (optional, default: false)
  • method: Index type - btree, hash, gin, gist, brin (optional, default: btree)
  • expression: SQL expression for expression indexes (optional)
  • where: WHERE clause for partial indexes (optional)
  • comment: Human-readable description (optional)

For comprehensive index documentation, see Index Management Guide.

Resources

API Integration

Get Schema Only

GET /api/apps/{app_slug}/datasources/{ds_slug}/datatables/{name}/schema/

Validate Data Against Schema

POST /api/apps/{app_slug}/datasources/{ds_slug}/datatables/{name}/validate_data/
Content-Type: application/json

{
"data": [
{"id": 1, "email": "user@example.com", "name": "John Doe"},
{"id": 2, "email": "jane@example.com", "name": "Jane Smith"}
]
}

Response:

{
"is_valid": true,
"errors": [],
"validated_rows": 2
}

What's Next?