Normalization in Relational Databases

This is a continuation of my posts on relational databases, started here. Previous posts on this theme, “Data Modeling“, “Let’s Talk About Database Schema“, “The Exasperating Topic of Database Indexes“, “The Keys & Relationships of Relational Databases“, and “Relational Database Query Optimization“.

Consider you’re a die-hard fan of progressive death metal, with a particular affinity for bands like Allegaeon. Over the years, you’ve accumulated a vast collection of CDs, vinyls, and other memorabilia.

Your collection has grown so much that you decide to document every item meticulously. Each piece of memorabilia contains information about the album, track titles, band members, and so forth. If you scribbled down every detail in one continuous list, you’d end up with a lot of repeated information. For instance, both “Proponent for Sentience” and “Formshifter” would mention the same band members like Riley McShane and Michael Stancel.

Normalization is akin to setting up separate lists or sections in your documentation. One section purely for “Band Members” where you detail members of Allegaeon over time. Another section for “Albums”, where instead of listing band members all over again, you simply refer back to the “Band Members” section. This kind of organization cuts down redundancy and ensures if, say, a band member leaves, you have only one spot to update.

Continue reading “Normalization in Relational Databases”

Relational Database Query Optimization

This is a continuation of my posts on relational databases, started here. Previous posts on this theme, “Data Modeling“, “Let’s Talk About Database Schema“, “The Exasperating Topic of Database Indexes“, and “The Keys & Relationships of Relational Databases“.

Query optimization refers to the process of selecting the most efficient way to execute a SQL query. The goal is to retrieve the requested data as quickly and efficiently as possible, making the best use of system resources. Given that there can be multiple strategies to retrieve the same set of data, the optimizer’s role is to choose the best one based on factors like data distribution, statistics, and available resources.

Query Optimizers

Continue reading “Relational Database Query Optimization”

The Keys & Relationships of Relational Databases

This is a continuation of my posts on relational databases, started here. Previous posts on this theme, “Designing a Relational Database – Data Modeling, “Let’s Talk About Database Schema“, and “The Exasperating Topic of Database Indexes“.

In the context of relational databases, “keys” and “relationships” are fundamental concepts that help in ensuring data integrity, organizing data efficiently, and enabling meaningful data retrieval. Let’s break these down:

1. Keys

A key is an attribute (or a set of attributes) that uniquely identifies a record in a table. Various types of keys exist in relational databases:

  • Primary Key: This uniquely identifies each record in a table. A primary key cannot have NULL values, and each value must be unique. A table can have only one primary key, which may consist of single or multiple columns.
  • Foreign Key: This is an attribute or set of attributes in a table that refers to the primary key in another table. It establishes a relationship between two tables. A foreign key ensures records in one table correspond to records in another.
  • Unique Key: Similar to a primary key, it uniquely identifies each record in the table. The difference is that a table can have more than one unique key, but only one primary key. Unique keys can have NULL values, but the values must be unique.
  • Composite Key: When more than one attribute is used to uniquely identify a record in the table, such a key is called a composite key.
  • Candidate Key: Any attribute or set of attributes that could serve as the primary key is a candidate key. It means it can uniquely identify records in the table.
  • Super Key: It is a set of attributes that, when taken collectively, can be used to uniquely identify records. A super key can have additional attributes that are not strictly necessary for unique identification.

2. Relationships

Relationships in relational databases determine how tables connect to one another and how data correlates:

  • One-to-One (1:1) Relationship: In this relationship, one record in a table is related to one and only one record in another table. For instance, each employee in a company might have one unique work ID.
  • One-to-Many (1:N) or Many-to-One (N:1) Relationship: One record in a table can be related to one or more records in another table. For example, one customer can place many orders, but each order is made by one customer.
  • Many-to-Many (M:N) Relationship: Multiple records in one table are related to multiple records in another table. For instance, students and courses. One student can enroll in many courses, and one course can have many students. This relationship is typically resolved in relational databases using a junction table (or bridge table).

Relationships are typically enforced using Foreign Keys. When a foreign key in one table refers to the primary key of another, it creates a linkage that ensures the validity of the data and maintains referential integrity.

Continue reading “The Keys & Relationships of Relational Databases”

The Exasperating Topic of Database Indexes

Previous posts on this theme, “Designing a Relational Database – Data Modeling“ and “Let’s Talk About Database Schema“.

Indexes aren’t necessary right? Of course not, they’re just peripherally superficial things that sit like leaches on a database table’s column!

Ok, bad database satire aside, let’s get real. Indexes are extremely useful for a plethora of reasons. They play a crucial role in optimizing the speed of database operations, especially retrieval operations.

Demystifying the Magic Behind Relational Database Indexes

Ever tried finding a specific recipe in a massive cookbook? Imagine if there wasn’t an index! Just as that cookbook index saves you hours of flipping through page after page, a relational database index works wonders by helping the system quickly locate rows within a table without the tedious process of examining each one.

So, How Does This Magic Work?

Continue reading “The Exasperating Topic of Database Indexes”

Let’s Talk About Database Schema

Previous post on this theme, “Designing a Relational Database – Data Modeling“.

Seriously, let’s talk about schema in the abstract and the literal implemented schema in some of the most popular databases.

What is a schema?

In general, outside the specific realm of relational databases, a “schema” is a conceptual framework or blueprint that defines the structure, relationships, and constraints of data or information. It provides a way to describe and organize data in a structured manner. This concept of schema is not unique to databases; for instance, in GraphQL, a schema defines the types, queries, mutations, and the relationships between them, outlining the set of possible operations that can be executed against the API and the shape of the data that is returned.

The Oddity of Different Implementations in Databases

From the viewpoint of someone familiar with the general idea of a schema, it can indeed seem unusual that databases like SQL Server, Oracle, MariaDB/MySQL, and PostgreSQL each interpret and implement schemas in slightly (or sometimes, vastly) different ways. While the core idea behind a schema as a structured container or namespace for database objects remains somewhat consistent, the exact nature, utility, and behavior of schemas vary across these systems.

Why Each Database Implements Schema Differently

  1. Historical and Legacy Reasons: Many database systems started their journey decades ago. Over time, as they evolved, they built upon their existing architectures, leading to variations in features like schema. For instance, the idea of a schema in Oracle being closely tied to a user comes from Oracle’s early architectural decisions.
  2. Design Philosophy and Target Audience: Some databases were designed with specific audiences in mind. Oracle, for example, was designed for enterprise-level applications, which might have influenced their user-centric schema design. On the other hand, MySQL, initially envisioned for web applications, treats schemas synonymously with databases, perhaps for simplicity.
  3. Standards Compliance vs. Practicality: While there are ANSI SQL standards that databases might strive to adhere to, there’s also a balance to strike between standards compliance and offering features that are deemed more practical or beneficial for the database’s primary users. For instance, PostgreSQL, which aims to be highly standards-compliant, also introduces advanced features not in the standard when it sees fit.
  4. Competitive Differentiation: Sometimes, databases introduce or tweak features to differentiate themselves in the market. These variations can sometimes lead to differences in core concepts like schemas.
  5. Community and Governance Influence: Open-source databases like PostgreSQL and MariaDB can be influenced by their developer communities. Different communities might prioritize certain features or philosophies, leading to divergent implementations.
  6. Flexibility and Extensibility Concerns: Databases might implement schemas in ways they believe are more flexible or extensible for future changes. This might lead them to adopt non-standard or unique approaches.

In essence, the varying implementation of schemas in different databases is a result of a blend of historical decisions, design philosophies, target audiences, and practical considerations. It’s a testament to the evolving nature of database systems and the diverse needs they aim to address.

Looking @ Relational Database Schema

Continue reading “Let’s Talk About Database Schema”