In no particular order, we will look at architecting the database as we continue our tour of software architecture from scratch. There are several key considerations to keep in mind while doing this work. A wrong decision or improper architectural choice can have a far-reaching impact as we build from the backend. Here are some areas that we want to think through as we design our backend solution.
Primary KeysOne of the most commonly utilized values in a database is the primary key. This is the value that uniquely identifies a record within the database in a specific table. You can limit that uniqueness to the database and even table, or you can make it universally unique (GUID). While there are pros and cons for each approach, our specific solution should dictate the approach we take. Generally speaking, integers that are unique within a table are the simplest solution. Thus, this is also the most index-friendly approach.
A GUID can allow you to do complex things with moving around your data. However, it has a higher storage cost and is not as quick to find in an index as an integer in most systems. There are situations where a GUID is required. Therefore, ensure you have defendable reasons for whichever choice you make.
Data TypesA common mistake by those new to databases is to have large and simple defaults for data types. For example, every string is hundreds of characters in length, and every number is as big and precise as possible. This method is a significant waste of space and resources. While space is cheap and this can work for small databases (like a simple customer contact database or small catalog), this does not scale.
Think about simple space calculations. If you allow for 100 characters in length everywhere but often need less than 20 and never more than 80, you are using a lot of empty space. This can be a small 20 bytes per row, but it adds up. A thousand rows will be only 20K of wasted space. However, think about that as you move into millions or billions of rows. Even worse, this calculation was for only one column. The waste adds up very quickly.
Space is an issue, but so is interoperability and usability. Dates, in particular, have a broad range of formats, so find one to use as your default and stick to it. These issues are essential in the back-end, but they will re-appear when we look at the front end as well.
Naming and AuditingStandards are always helpful for maintenance. The easier it is for a new developer to jump in and feel familiar with the system, the better. There are sweeping architectural decisions to settle on early in the process. These include things like how data auditing will be implemented, naming conventions, and what work will be held in the back-end as opposed to other layers.
Architecting the database may seem like a simple task of identifying tables and columns. However, there is much more to be done before you can finish this architectural task.