|
| 1 | +# Database Schema Documentation |
| 2 | + |
| 3 | +## Overview |
| 4 | + |
| 5 | +This document describes the database schema for the Spring Boot Demo Implementation, which uses a simplified version of the Sakila database schema. The schema is designed for a film catalog API that allows querying films by title. |
| 6 | + |
| 7 | +## ER Diagram |
| 8 | + |
| 9 | +The following Entity Relationship diagram shows the complete database schema using Chen's notation: |
| 10 | + |
| 11 | +```plantuml |
| 12 | +@startchen |
| 13 | +
|
| 14 | +title Sakila Database Schema - ER Diagram |
| 15 | +note top : Spring Boot Demo Implementation\nSimplified Sakila Database Schema |
| 16 | +
|
| 17 | +entity Film { |
| 18 | + film_id : SERIAL <<key>> |
| 19 | + title : VARCHAR(255) |
| 20 | + description : TEXT |
| 21 | + release_year : YEAR |
| 22 | + language_id : SMALLINT |
| 23 | + original_language_id : SMALLINT |
| 24 | + rental_duration : SMALLINT |
| 25 | + rental_rate : NUMERIC(4,2) |
| 26 | + length : SMALLINT |
| 27 | + replacement_cost : NUMERIC(5,2) |
| 28 | + rating : MPAA_RATING |
| 29 | + last_update : TIMESTAMP |
| 30 | + special_features : TEXT[] |
| 31 | + fulltext : TSVECTOR |
| 32 | +} |
| 33 | +
|
| 34 | +entity Language { |
| 35 | + language_id : SERIAL <<key>> |
| 36 | + name : CHAR(20) |
| 37 | + last_update : TIMESTAMP |
| 38 | +} |
| 39 | +
|
| 40 | +relationship SPOKEN_IN { |
| 41 | +} |
| 42 | +
|
| 43 | +relationship ORIGINAL_LANGUAGE { |
| 44 | +} |
| 45 | +
|
| 46 | +Film -N- SPOKEN_IN |
| 47 | +SPOKEN_IN -1- Language |
| 48 | +
|
| 49 | +Film -(0,1)- ORIGINAL_LANGUAGE |
| 50 | +ORIGINAL_LANGUAGE -1- Language |
| 51 | +
|
| 52 | +note right of Film : Contains film catalog information\nwith rating, rental details,\nand full-text search capability |
| 53 | +
|
| 54 | +note right of Language : Defines available languages\nfor films (spoken and original) |
| 55 | +
|
| 56 | +note right of SPOKEN_IN : Primary language relationship\n(required - NOT NULL) |
| 57 | +
|
| 58 | +note right of ORIGINAL_LANGUAGE : Original language relationship\n(optional - can be NULL) |
| 59 | +
|
| 60 | +@endchen |
| 61 | +``` |
| 62 | + |
| 63 | +## Tables |
| 64 | + |
| 65 | +### Film Table |
| 66 | + |
| 67 | +The `film` table is the core entity of the schema, containing comprehensive information about each film in the catalog. |
| 68 | + |
| 69 | +**Columns:** |
| 70 | +- `film_id` (SERIAL, PRIMARY KEY): Unique identifier for each film |
| 71 | +- `title` (VARCHAR(255), NOT NULL): Film title |
| 72 | +- `description` (TEXT): Film description/synopsis |
| 73 | +- `release_year` (YEAR): Year of film release (domain constraint: 1901-2155) |
| 74 | +- `language_id` (SMALLINT, NOT NULL, FK): Primary spoken language (references language.language_id) |
| 75 | +- `original_language_id` (SMALLINT, FK): Original language if different from spoken language |
| 76 | +- `rental_duration` (SMALLINT, DEFAULT 3, NOT NULL): Rental period in days |
| 77 | +- `rental_rate` (NUMERIC(4,2), DEFAULT 4.99, NOT NULL): Cost to rent the film |
| 78 | +- `length` (SMALLINT): Film duration in minutes |
| 79 | +- `replacement_cost` (NUMERIC(5,2), DEFAULT 19.99, NOT NULL): Cost to replace the film |
| 80 | +- `rating` (MPAA_RATING, DEFAULT 'G'): MPAA film rating (G, PG, PG-13, R, NC-17) |
| 81 | +- `last_update` (TIMESTAMP, DEFAULT now(), NOT NULL): Last modification timestamp |
| 82 | +- `special_features` (TEXT[]): Array of special features |
| 83 | +- `fulltext` (TSVECTOR): Full-text search vector for title and description |
| 84 | + |
| 85 | +**Indexes:** |
| 86 | +- `idx_title`: B-tree index on title for fast title-based searches |
| 87 | +- `film_fulltext_idx`: GiST index on fulltext for full-text search operations |
| 88 | + |
| 89 | +**Triggers:** |
| 90 | +- `last_updated`: Automatically updates last_update timestamp on row modifications |
| 91 | + |
| 92 | +### Language Table |
| 93 | + |
| 94 | +The `language` table defines the available languages for films. |
| 95 | + |
| 96 | +**Columns:** |
| 97 | +- `language_id` (SERIAL, PRIMARY KEY): Unique identifier for each language |
| 98 | +- `name` (CHAR(20), NOT NULL): Language name |
| 99 | +- `last_update` (TIMESTAMP, DEFAULT now(), NOT NULL): Last modification timestamp |
| 100 | + |
| 101 | +**Triggers:** |
| 102 | +- `last_updated`: Automatically updates last_update timestamp on row modifications |
| 103 | + |
| 104 | +## Relationships |
| 105 | + |
| 106 | +### Film ↔ Language Relationships |
| 107 | + |
| 108 | +The schema defines two relationships between Film and Language entities: |
| 109 | + |
| 110 | +1. **SPOKEN_IN (Required Relationship)** |
| 111 | + - **Cardinality**: Many films to one language (N:1) |
| 112 | + - **Foreign Key**: `film.language_id` → `language.language_id` |
| 113 | + - **Constraint**: NOT NULL (every film must have a primary language) |
| 114 | + - **Referential Integrity**: ON UPDATE CASCADE, ON DELETE RESTRICT |
| 115 | + |
| 116 | +2. **ORIGINAL_LANGUAGE (Optional Relationship)** |
| 117 | + - **Cardinality**: Many films to one language (N:1, optional) |
| 118 | + - **Foreign Key**: `film.original_language_id` → `language.language_id` |
| 119 | +   - **Constraint**: NULL allowed (left NULL when the film has no separate original language) |
| 120 | + - **Use Case**: For dubbed films where original language differs from spoken language |
| 121 | + |
| 122 | +## Custom Types and Domains |
| 123 | + |
| 124 | +### MPAA_Rating Enum |
| 125 | +```sql |
| 126 | +CREATE TYPE mpaa_rating AS ENUM ( |
| 127 | + 'G', -- General Audiences |
| 128 | + 'PG', -- Parental Guidance Suggested |
| 129 | + 'PG-13', -- Parents Strongly Cautioned |
| 130 | + 'R', -- Restricted |
| 131 | + 'NC-17' -- Adults Only |
| 132 | +); |
| 133 | +``` |
| 134 | + |
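On the Java side, the labels `PG-13` and `NC-17` are not valid identifiers, so an application enum needs an explicit mapping to the database labels. The following is a minimal sketch of one way to do that; the `MpaaRating` enum and its `fromLabel` method are hypothetical and not part of the demo application.

```java
import java.util.Arrays;

/** Hypothetical Java mirror of the mpaa_rating enum type (an assumption, not project code). */
enum MpaaRating {
    G("G"),
    PG("PG"),
    PG_13("PG-13"),
    R("R"),
    NC_17("NC-17");

    // Exact label as stored in the PostgreSQL enum.
    private final String label;

    MpaaRating(String label) {
        this.label = label;
    }

    String label() {
        return label;
    }

    /** Resolves a database label such as "PG-13" to its constant. */
    static MpaaRating fromLabel(String label) {
        return Arrays.stream(values())
                .filter(r -> r.label.equals(label))
                .findFirst()
                .orElseThrow(() -> new IllegalArgumentException("Unknown MPAA rating: " + label));
    }
}
```

Keeping the label in a field avoids relying on `name()`, which would return `PG_13` rather than the stored value.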
| 135 | +### Year Domain |
| 136 | +```sql |
| 137 | +CREATE DOMAIN year AS integer |
| 138 | + CONSTRAINT year_check CHECK (((VALUE >= 1901) AND (VALUE <= 2155))); |
| 139 | +``` |
| 140 | + |
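An application may want to reject an out-of-range year before the database does. The sketch below mirrors the `year_check` bounds in Java; the `ReleaseYears` class is a hypothetical helper, and it treats `null` as valid on the assumption that `release_year` stays nullable, as in the column list above.

```java
/** Hypothetical helper mirroring the year domain's CHECK constraint (not project code). */
final class ReleaseYears {
    static final int MIN_YEAR = 1901;
    static final int MAX_YEAR = 2155;

    private ReleaseYears() {
    }

    /**
     * Returns true when the value would pass year_check.
     * A null year is accepted because a CHECK constraint does not reject NULL.
     */
    static boolean isValid(Integer year) {
        return year == null || (year >= MIN_YEAR && year <= MAX_YEAR);
    }
}
```

The database constraint remains the source of truth; a check like this only produces an earlier and friendlier error.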
| 141 | +## Database Features |
| 142 | + |
| 143 | +### Full-Text Search |
| 144 | +- The `fulltext` column uses PostgreSQL's tsvector type for efficient text search |
| 145 | +- Automatically populated with English language text search vectors |
| 146 | +- Indexed with GiST for fast text search operations |
| 147 | + |
| 148 | +### Automatic Timestamps |
| 149 | +- Both tables use triggers to automatically update `last_update` timestamps |
| 150 | +- Records when each row was last modified without relying on application code |
| 151 | + |
| 152 | +### Data Integrity |
| 153 | +- Foreign key constraints ensure referential integrity |
| 154 | +- Check constraints validate data ranges (year domain) |
| 155 | +- NOT NULL constraints ensure required data is present |
| 156 | + |
| 157 | +## Usage in Spring Boot Application |
| 158 | + |
| 159 | +The Spring Boot application uses Spring Data JDBC with a simplified entity model: |
| 160 | + |
| 161 | +```java |
| 162 | +@Table("film") |
| 163 | +public record Film( |
| 164 | + @Id @Column("film_id") Integer filmId, |
| 165 | + @Column("title") String title |
| 166 | +) {} |
| 167 | +``` |
| 168 | + |
| 169 | +**Note**: The Spring Data JDBC record maps only the fields the film query API needs (`film_id`, `title`). The full database schema supports more comprehensive film catalog operations. |
| 170 | + |
| 171 | +## Migration Scripts |
| 172 | + |
| 173 | +The schema is managed using Flyway migrations: |
| 174 | +- `V1__sakila_schema.sql`: Creates tables, types, indexes, and constraints |
| 175 | +- `V2__sakila_film_data.sql`: Populates initial film data |
| 176 | + |
| 177 | +## Performance Considerations |
| 178 | + |
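| 179 | +1. **Title Search Optimization**: B-tree index on the title column supports equality lookups and, given a suitable collation or operator class, efficient prefix searches |
| 180 | +2. **Full-Text Search**: GiST index enables fast text search across film titles and descriptions |
| 181 | +3. **Foreign Key Indexes**: PostgreSQL does not index foreign key columns automatically, so joins on `language_id` and `original_language_id` are index-assisted only if those indexes are created explicitly |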
| 182 | +4. **Trigger Efficiency**: Simple timestamp update triggers with minimal overhead |
| 183 | + |
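A prefix search on `title` is usually written as `title LIKE ?` with a pattern ending in `%`. The sketch below shows one way to build that pattern from user input while escaping the `LIKE` wildcards; the `TitlePrefix` class is hypothetical, and it assumes PostgreSQL's default backslash escape character. Whether the planner can use `idx_title` for such a query depends on the database collation or on the index's operator class, which this document does not specify.

```java
/** Hypothetical helper for building LIKE patterns for title prefix searches (not project code). */
final class TitlePrefix {
    private TitlePrefix() {
    }

    /**
     * Escapes the LIKE metacharacters (backslash, percent, underscore) in the
     * given prefix and appends a trailing '%' so the pattern matches any title
     * starting with that prefix.
     */
    static String toLikePattern(String prefix) {
        StringBuilder pattern = new StringBuilder(prefix.length() + 1);
        for (char c : prefix.toCharArray()) {
            if (c == '\\' || c == '%' || c == '_') {
                pattern.append('\\');
            }
            pattern.append(c);
        }
        return pattern.append('%').toString();
    }
}
```

The result is meant to be bound as a query parameter, for example to `SELECT film_id, title FROM film WHERE title LIKE ?`, rather than concatenated into the SQL text.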
| 184 | +## Future Extensions |
| 185 | + |
| 186 | +The schema design supports easy extension for additional Sakila entities: |
| 187 | +- Actor and film_actor tables for cast information |
| 188 | +- Category and film_category tables for genre classification |
| 189 | +- Store, rental, and payment tables for rental business logic |
| 190 | +- Customer and address tables for customer management |
| 191 | + |
| 192 | +This modular approach allows incremental schema evolution while maintaining data integrity and performance. |