Skip to main content

Schemas

What is a schema?

In databases, a schema is the logical structure that defines how data is organized within a database, including tables, fields, data types, and the relationships between them. It is like a blueprint or design that describes the shape and structure of the database, but does not contain the data itself.

Schemas often use visual representations to communicate the architecture of the database, becoming the foundation of an organization's data management discipline. This process of designing database schemas is also known as data modeling.

Practical example:

If the database were a house, the schema would be the architect's blueprint: what rooms there are (tables), what objects each room contains (fields), and how they connect to each other (relationships).


Why is a schema important?

  • Organizes data clearly and logically.
  • Defines the rules that must be followed for entering and relating data.
  • Helps ensure the consistency and integrity of information.
  • Facilitates system maintenance, scalability, and comprehension.

Types of schemas:

  • Conceptual schemas:

    Offer an overview of what the system will contain, how it will be organized, and what business rules are involved. Conceptual models are usually created as part of the process of gathering the initial requirements of the project.

    Features:

    • Represents the overall view of the system, without technical details.

    • Shows what data there is and how it relates.

    • Commonly represented with entity-relationship (ER) diagrams.

    • Does not worry about how that data will be implemented or stored. Example:
      A Student entity with attributes name, email, and major.

      Question it answers: What information do I need to save?

  • Logical schemas:

    They are less abstract than conceptual schemas. They clearly define the schema objects with information, such as table names, field names, relationships between entities, and integrity constraints, that is, any rule governing the database. However, they usually do not include any technical requirements.

    Features:

    • It is the translation of the conceptual schema into a model that can be understood by a DBMS.

    • Defines tables, columns, primary and foreign keys, data types, and relationships.

    • It is independent of physical storage, but adapts to the type of database (relational, object-oriented, etc.).

      Example:
      students table:

      • id (primary key)
      • name
      • email
      • major_id (foreign key)

      Question it answers: How do I structure the information within the system?

  • Physical schemas:

    Provide the technical information that the logical database schema type lacks, in addition to contextual information, such as table names, field names, relationships between entities, etc. That is, they also include the syntax that will be used to create the data structures within the disk storage.

    Features:

  • Describes how data is actually saved on disk.

  • Includes technical details such as:

    • File organization.

    • Block or page size.

    • Indexes to speed up searches.

    • Compression and storage.

      Example:

      • The file for the students table is saved in a .dat file, with B-Tree indexes on the email field.

      Question it answers: How is data physically stored?

How do these schemas relate?

The three levels of schema connect to each other:

  1. Conceptual: defines what data the system needs.
  2. Logical: structures that information into tables and relationships.
  3. Physical: saves that information in organized and optimized files.

This separation allows you to modify one level without completely affecting the others. For example, you can change how a table is saved on disk without changing the system logic.


Complete Example

Scenario: Library system

Conceptual Schema

  • Entities: Book, Author, Loan
  • Relationships: A book can have several authors; a loan is associated with a book.

Logical Schema

  • books table (id, title, year)
  • authors table (id, name)
  • books_authors table (book_id, author_id)
  • loans table (id, book_id, date, user)

Physical Schema

  • Data from books stored in a binary file.
  • Indexes on title to speed up searches.
  • Each file divided into 4 KB blocks for reading efficiency.

Advantages of using schemas

  • Clear and coherent organization of data.
  • Facilitates system maintenance.
  • Avoids duplication and errors.
  • Allows optimizing queries and operations.
  • Improves communication between analysts, designers, and developers.