What is RDD?
What is RDD?
Relational Database Design (RDD) organizes data into two-dimensional tables consisting of rows and columns. Each table row is a record, with columns that are attributes or fields of the record. A collection of related tables is known as a schema. The purpose of such a design is to allow flexible access to the data stored in the schema using the Structured Query Language (SQL).
Why is Relational Database Design Important?
The relational database design process maps a business function into a data model that can be managed by a Relational Database Management System (RDBMS). The RDBMS stores data about the business function, making it easy to store transactional information that can be retrieved by applications and used by data analytics solutions such as Business Intelligence (BI) systems.
Thanks to its Atomicity Consistency Isolation Durability (ACID) support, the relational model provides businesses with robust transaction processing. ACID offers the following capabilities:
Atomicity
Atomicity enforces the all-or-nothing rule to protect the integrity of the data in a relational database system. If a transaction makes multiple changes, they are all committed as a unit. If any change fails for any reason, the whole transaction is rolled back. If a transaction is in flight when the server crashes, all uncommitted transactions are rolled back by default.
Consistency
Consistency is maintained by disallowing any data that does not match the datatype of a field, which is a constraint that governs valid data values.
Isolation
Isolation ensures each transaction is executed in its own protected space so no other transaction can interfere with another.
Durability
Durability protects committed changes to guarantee they are written to non-volatile storage, so if a server does fail, committed transactions are protected.
The Relational Database Design Process
Below are the steps a database designer takes to map a business function to a relational database.
Modeling Entities
The first step in the design process is identifying all the entities involved in a business transaction.
We can use a sales order management application as an example. The entities involved in processing orders might be Employees, Departments, Products, Customers and Orders.
Each entity will have attributes or details about them that are important to track. In the case of an order, these might be the Customer Name, customer number, Product ID and Product Description, Quantity, Order date and the item amount.
It’s important to know how the entities are connected and whether the relationship is one-to-one, many-to-one, one-to-many, optional or mandatory.
Now, we have enough information to create an Entity Relationship model.
Normalization
Normalization removes repeating groups and organizes the data model by ensuring each element is connected using primary and foreign key values.
The series of normalization are defined as:
- First normal form.
- Second normal form.
- Third normal form.
- Boyce-Codd normal form.
- Fifth normal form.
Physical Design
We can move on to the physical design phase with the logical design completed. The database schema closely matches the fully normalized model for a transaction-centric use case. It uses a minimal number of indexes to maximize throughput, as updating indexes can be slow.
In a decision support of analytics use case, the data can be denormalized, and additional indexes can be added. An analytic database schema can be organized into a star or snowflake schema to support the most common queries.
Storage
Traditional relational databases store a record as a whole row, whereas column-oriented databases store tables in separate columns. This has the advantage of storing the data in the order it was sorted when loaded. This is often the natural primary index, which minimizes the need for indexes. The second reason columnar databases excel for analytic workloads is that they can skip all the columns unrelated to a query, so the result set is much smaller and much more likely to fit in cache memory.
Actian and Relational Databases
The Actian Data Platform includes multiple relational databases, including Actian Vector for high-speed analytic workloads and Actian Ingres for Transactional workloads. The Actian Data Platform runs on-premises and on multiple cloud platforms, including AWS, Azure, and Google Cloud, so you can run your analytics wherever your data resides.