Learn Data Modelling by Example PT 1 Beginner Level
Barry Williams
Welcome
This is Part 1 of our book, which has been produced in response to a number of requests from
visitors to our Database Answers Web Site.
It is intended for beginners to Data Modeling.
It incorporates a selection from our Library of about 950 data models that are featured on
our Web site:
http://www.databaseanswers.org/data_models/index.htm
I hope you enjoy this Book and would be very pleased to have your comments at
barryw@databaseanswers.org.
Barry Williams
Principal Consultant
Database Answers Ltd.
London, England
In this tutorial, we will follow two young tourists as they visit Windsor Castle and create a
data model.
Our tourists are Dimple, a 10-year-old girl, who likes sightseeing and ice cream
and Toby, Dimple's 12-year-old brother, who likes sightseeing and designing data models.
1.1.1 What is this?
This is a tutorial on data modeling for young people that represents a typical data modeling
project and illustrates the basic principles involved.
A physical database can easily be generated from a data model using a commercial
data modeling tool.
1.2 Topics
In this chapter, we will cover some basic concepts in data modeling:
Reference Data
[Dimple]: Toby, it's great being in London, which is so exciting and buzzing.
[Toby]: I'm glad you like it, Dimple. What would you like to do today?
[Dimple]: Toby, we have seen Buckingham Palace, where the Queen of the
United Kingdom lives, and now I'd like to visit Windsor Castle, because it's
one of the most popular tourist attractions in the UK, and it's just a short
trip from London.
[Dimple]: Wow, Toby, Windsor has a beautiful castle and here is a royal park with lots of
deer.
And we have people - local people, tourists, students, people passing through, people
working here, people here on business and so on.
[Dimple]: Hmmm - so how do we translate what we know to help us get started with our
data model?
[Toby]: Let's start a diagram with people and establishments.
This simple diagram is going to grow into a data model.
[Dimple]: Toby, I am one of these people so how do I create a unique identity for myself to
make me different from everybody else?
[Toby]: We will give every person a unique identifier and every establishment its own
unique identifier.
When we use these we call them Primary Keys, and show them in the diagram with a PK
on the left-hand side.
[Dimple]: That sounds good, Toby, but I don't know what it means.
[Toby]: Well, Dimple, let's look at how we use these identifiers...
We can call these boxes tables - or entities if we want to speak to professional data
modelers.
A table simply stores data about one particular kind of Thing of Interest.
For example, people or establishments.
Each record in a table will be identified by its own unique identifier, which we call the
primary key.
It is not usually easy to find a specific item of data already in the table that will always be
unique.
For example, in the United States, Social Security Numbers (SSNs) are supposed to be
unique, but (for various legitimate reasons) that is not always the case.
Also, foreign visitors and tourists will not have SSNs.
Therefore, it is best practice to create a new field just for this purpose.
This will be what is called an auto-increment data type, which will be generated
automatically by the Database Management System (DBMS) at run-time.
This is called a surrogate key and it does not have any other purpose.
It is simply a key that stands for something else.
It is a meaningless integer that is generated automatically by the database management
software, such as Oracle or SQL Server. The values are usually consecutive integers,
starting with 1,2,3,4 and so on.
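The idea of a surrogate key can be sketched in a few lines of Python, using the built-in sqlite3 module and an in-memory database. This is only an illustration under stated assumptions: the table name People and the fields Person_ID and Person_Name are hypothetical, and SQLite stands in for a DBMS such as Oracle or SQL Server.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Person_ID is the surrogate key: a meaningless integer generated by the DBMS.
# (Table and field names are hypothetical, chosen for this sketch.)
conn.execute(
    """
    CREATE TABLE People (
        Person_ID   INTEGER PRIMARY KEY AUTOINCREMENT,
        Person_Name TEXT NOT NULL
    )
    """
)

# We supply only the name; the database assigns the identifier.
for name in ("Dimple", "Toby"):
    conn.execute("INSERT INTO People (Person_Name) VALUES (?)", (name,))

people = conn.execute(
    "SELECT Person_ID, Person_Name FROM People ORDER BY Person_ID"
).fetchall()
print(people)  # [(1, 'Dimple'), (2, 'Toby')]
```

Notice that we never choose the Person_ID values ourselves, which is exactly why they carry no meaning.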
Now we can see how useful our identifiers can be because we can include the person and
establishment identifiers in our Visits Table.
Then we draw a little box called Products and say that every product has a type.
In other words, there is a relationship between the Products and Product_Types boxes.
The lines are called relationships and they are very important in data modeling.
We are now creating an Entity-Relationship Diagram or "ERD".
This diagram shows only a line for the relationship:
The symbol at the products end is called crow's feet and it shows the many end.
The short straight line at the Product_Types end shows the one end.
In other words, this line shows a one-to-many relationship.
Dimple, let me explain about the dotted line. It means that the relationship results in a
Foreign Key in the products table. This is shown by the FK symbol next to the
product_type_code field and it means that there is a link back to the Product_Types.
However, the primary key is only the Product_ID, and of course, this is shown by the PK
symbol next to the Product_ID field.
Later, when we talk about inheritance, we will use a straight line, in contrast to this dotted
line here. This is to show that the foreign key field is also a primary key.
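The one-to-many relationship and its foreign key can be tried out with a short Python sketch using the built-in sqlite3 module. The field names Product_Type_Description and Product_Name, and the sample values, are assumptions made for this illustration; only Product_ID and product_type_code come from the diagram described above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this setting is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript(
    """
    CREATE TABLE Product_Types (
        Product_Type_Code        TEXT PRIMARY KEY,
        Product_Type_Description TEXT NOT NULL
    );
    CREATE TABLE Products (
        Product_ID        INTEGER PRIMARY KEY AUTOINCREMENT,
        -- The FK: a link back to the 'one' end of the relationship.
        Product_Type_Code TEXT NOT NULL
            REFERENCES Product_Types (Product_Type_Code),
        Product_Name      TEXT NOT NULL
    );
    """
)

insert_product = "INSERT INTO Products (Product_Type_Code, Product_Name) VALUES (?, ?)"

# One product type can have many products (sample values are made up).
conn.execute("INSERT INTO Product_Types VALUES ('DRINK', 'Drinks')")
conn.execute(insert_product, ("DRINK", "Caramel Macchiato"))
conn.execute(insert_product, ("DRINK", "Hot Chocolate"))

# A product whose type does not exist in Product_Types is refused.
try:
    conn.execute(insert_product, ("TOY", "Teddy Bear"))
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

drink_count = conn.execute(
    "SELECT COUNT(*) FROM Products WHERE Product_Type_Code = 'DRINK'"
).fetchone()[0]
print(rejected, drink_count)  # True 2
```

The refused insert shows what the foreign key is for: every product must point at a product type that really exists.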
I have to say something a bit difficult about primary keys right now.
In the Products Table, we have to allow for a very large number of products being stored.
Therefore we use an ID field for the primary key.
We then create this ID field automatically as a number (called an auto-increment integer).
This number has no meaning and is simply used to identify each record uniquely among
possibly millions or hundreds of millions.
However, things are different for type fields.
These are what we call enumerated data and are typically reference data.
They are always relatively small in number and we choose a code for the primary key
because we can create them and review them manually.
It also helps us to create a code that we can use and refer to, in contrast to the ID fields
that have no meaning.
Typical examples would be:
Sizes Small, Medium and Large, where we are accustomed to seeing S, M and L.
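The difference between a meaningless ID and a meaningful code can be seen in a small Python sketch with the built-in sqlite3 module. The table names Ref_Sizes and Drinks and their fields are hypothetical, invented only for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript(
    """
    -- Reference data: few rows, so a readable code is the primary key.
    CREATE TABLE Ref_Sizes (
        Size_Code        TEXT PRIMARY KEY,
        Size_Description TEXT NOT NULL
    );
    -- Possibly millions of rows, so an auto-increment ID is the primary key.
    CREATE TABLE Drinks (
        Drink_ID   INTEGER PRIMARY KEY AUTOINCREMENT,
        Size_Code  TEXT NOT NULL REFERENCES Ref_Sizes (Size_Code),
        Drink_Name TEXT NOT NULL
    );
    """
)

conn.executemany(
    "INSERT INTO Ref_Sizes VALUES (?, ?)",
    [("S", "Small"), ("M", "Medium"), ("L", "Large")],
)
conn.execute("INSERT INTO Drinks (Size_Code, Drink_Name) VALUES ('L', 'Hot Chocolate')")

row = conn.execute(
    """
    SELECT d.Drink_ID, d.Size_Code, s.Size_Description
    FROM Drinks d
    JOIN Ref_Sizes s ON s.Size_Code = d.Size_Code
    """
).fetchone()
print(row)  # (1, 'L', 'Large')
```

The code 'L' can be read and checked by a person, while the Drink_ID of 1 tells us nothing by itself.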
We know that they are organized in groups, like food and drink, and each of these has more
groups and so on, right down to the particular product, like caramel macchiato or a panini.
This top-down organization is called a hierarchy and appears all over the place.
Luckily we can show this very easily and neatly in our data model.
[Dimple]: Toby, with so many people, establishments and purchases how do they keep
track of everything?
[Toby]: Well, Dimple, by this time, everything has its own identifier that is used wherever
they need to keep track.
[Dimple]: OK, that sounds sensible. And do they use these identifiers in a database?
Some of these local people are shoppers and some of them will be working in the shops.
We will call the workers staff and we know different things about them than the things we
know about the tourists.
For example, we will probably know the gender of everybody just by looking at them.
For staff, we will usually also know their date of birth and their home address.
We can see a field marked as PF in the three tables for ceremonial guards, staff and
tourists.
This is unusual because it means a field that is a Primary Key in the three tables and also a
Foreign Key to the People Table.
Therefore, if our first record was a ceremonial guard, then we would have a record in the
People Table with a Person_ID of 1 and a record in the Ceremonial Guards Table with a Guard_ID of
1.
Similarly, if our second record was a member of staff, we would have a record in the People
Table with a Person_ID of 2 and a record in the Staff Table with a Staff_ID of 2.
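A PF field can be sketched in Python with the built-in sqlite3 module. This is a minimal illustration, not the book's actual design: the field names Gender, Date_of_Birth and Home_Address follow the description above, while the table name Ceremonial_Guards, the helper add_person and all sample values are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript(
    """
    CREATE TABLE People (
        Person_ID INTEGER PRIMARY KEY AUTOINCREMENT,
        Gender    TEXT
    );
    -- Guard_ID and Staff_ID are 'PF' fields: each is the Primary Key of its
    -- own table and also a Foreign Key to the People Table.
    CREATE TABLE Ceremonial_Guards (
        Guard_ID INTEGER PRIMARY KEY REFERENCES People (Person_ID)
    );
    CREATE TABLE Staff (
        Staff_ID      INTEGER PRIMARY KEY REFERENCES People (Person_ID),
        Date_of_Birth TEXT,
        Home_Address  TEXT
    );
    """
)


def add_person(gender):
    """Hypothetical helper: insert a person and return the generated Person_ID."""
    return conn.execute("INSERT INTO People (Gender) VALUES (?)", (gender,)).lastrowid


# First record: a ceremonial guard (sample values are made up).
guard_id = add_person("M")
conn.execute("INSERT INTO Ceremonial_Guards (Guard_ID) VALUES (?)", (guard_id,))

# Second record: a member of staff, who reuses the Person_ID as the Staff_ID.
staff_id = add_person("F")
conn.execute(
    "INSERT INTO Staff (Staff_ID, Date_of_Birth, Home_Address) VALUES (?, ?, ?)",
    (staff_id, "1990-01-01", "Windsor"),
)

print(guard_id, staff_id)  # 1 2
```

Because the identifier in the lower table is copied from the People Table, a person's key is the same number wherever that person appears.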
[Toby]: Dimple, if we bring together everything we have talked about, we will see that we
have quite a good data model that any professional would be proud of.
[Dimple]: OK, Toby. Do you think I will understand it?
[Toby]: Let me help you by making a list of the business rules for our model:
[Toby]: OK, Dimple - we have a very nice data model and now we can take the break I
promised you.
[Dimple]: That's great, Toby - can I have an ice cream?
[Toby]: Sure, but before we do I should say something about PF, which appears in the Staff
Table.
It's unusual and it's called PF because it means a field which is a Primary Key in the Staff
Table and a Foreign Key to the People Table.
[Dimple]: Hmmm, I've got a headache, Toby - can we please go and get an ice cream?
[Toby]: OK, Dimple. You've been a very good girl and you deserve a break.
You can admire what we have created, which is this very professional-looking data model.
There are lots of people in Windsor, including ceremonial guards, staff and tourists.
There are also lots of establishments, like shops and the Castle.
We have moved this Chapter to a separate document that you can download here :
http://www.databaseanswers.org/downloads/Chapter_2_Learn_Data_Modelling_Book_for_Denmark.pdf
A physical database can easily be generated from a data model using a commercial
data modeling tool.
3.2 Topics
In this chapter, we will cover some basic concepts in data modeling:
Reference Data
[Toby]: I'm glad you like it, Dimple. What would you like to do today?
[Dimple]: Toby, we have come to Turkey, and I would like to see Istanbul
and visit the Blue Mosque, because it's one of the most popular tourist
attractions here, then I would like to do some shopping, then see the sea,
and I would like to finish up at Starbucks.
[Toby]: OK. Let's go.
We are starting from Istanbul, which is a beautiful place.
And we have people - local people, tourists, students, people passing through, people
working here, people here on business and so on.
[Dimple]: Hmmm - so how do we translate what we know to help us get started with our
data model?
[Toby]: Let's start a diagram with people and establishments.
This simple diagram is going to grow into a data model.
[Dimple]: Toby, I am one of these people so how do I create a unique identity for myself to
make me different from everybody else?
[Toby]: We will give every person a unique identifier and every establishment its own
unique identifier.
When we use these we call them Primary Keys, and show them in the diagram with a PK
on the left-hand side.
[Dimple]: That sounds good, Toby, but I don't know what it means.
[Toby]: Well, Dimple, let's look at how we use these identifiers...
We have managed to find a quiet area where a very happy man is selling a Turkish favorite,
called SIMIT ;0)
So, in other words, we have one person, who is the happy man, and one establishment,
which is his simple stall.
Now we can see how useful our identifiers can be because we can include the person and
establishment identifiers in our visits table.
Then the Person_ID field becomes a link to a record for a person in the Person Table.
This link is what is called a Foreign Key and we can see it's shown with 'FK' on the left-hand side.
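A visits table that links a person to an establishment can be sketched in Python with the built-in sqlite3 module. The fields Visit_ID, Visit_Date, Person_Name and Establishment_Name, and the sample values, are assumptions made for this illustration; only the person and establishment identifiers come from the text above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript(
    """
    CREATE TABLE People (
        Person_ID   INTEGER PRIMARY KEY AUTOINCREMENT,
        Person_Name TEXT NOT NULL
    );
    CREATE TABLE Establishments (
        Establishment_ID   INTEGER PRIMARY KEY AUTOINCREMENT,
        Establishment_Name TEXT NOT NULL
    );
    -- Each visit carries two Foreign Keys: who visited, and where.
    CREATE TABLE Visits (
        Visit_ID         INTEGER PRIMARY KEY AUTOINCREMENT,
        Person_ID        INTEGER NOT NULL REFERENCES People (Person_ID),
        Establishment_ID INTEGER NOT NULL
            REFERENCES Establishments (Establishment_ID),
        Visit_Date       TEXT NOT NULL
    );
    """
)

# Sample values are made up for the sketch.
person_id = conn.execute(
    "INSERT INTO People (Person_Name) VALUES ('Dimple')"
).lastrowid
stall_id = conn.execute(
    "INSERT INTO Establishments (Establishment_Name) VALUES ('Simit Stall')"
).lastrowid
conn.execute(
    "INSERT INTO Visits (Person_ID, Establishment_ID, Visit_Date) VALUES (?, ?, ?)",
    (person_id, stall_id, "2024-06-01"),
)

# Follow the two foreign keys back to the names they stand for.
visit = conn.execute(
    """
    SELECT p.Person_Name, e.Establishment_Name
    FROM Visits v
    JOIN People p ON p.Person_ID = v.Person_ID
    JOIN Establishments e ON e.Establishment_ID = v.Establishment_ID
    """
).fetchone()
print(visit)  # ('Dimple', 'Simit Stall')
```

The Visits Table stores only identifiers; the names are looked up through the foreign keys when we need them.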
Then we draw a little box called Products and say that every product has a type.
In other words, there is a relationship between the Products and Product_Types boxes.
The lines are called relationships and they are very important in data modeling.
We are now creating an Entity-Relationship Diagram or ERD.
This diagram shows only a line for the relationship:
The symbol at the products end is called crow's feet and it shows the many end.
The short straight line at the Product_Types end shows the one end.
In other words, this line shows a one-to-many relationship.
Dimple, let me explain about the dotted line. It means that the relationship results in a
foreign key in the Products Table. This is shown by the FK symbol next to the
product_type_code field and it means that there is a link back to the Product_Types.
I have to say something a bit difficult about primary keys right now.
In the Products Table, we have to allow for a very large number of products being stored.
Therefore we use an ID field for the Primary key.
We then create this ID field automatically as a number (called an auto-increment integer).
This number has no meaning and is simply used to identify each record uniquely among
possibly millions or hundreds of millions.
However, things are different for type fields.
These are what we call enumerated data and are typically reference data.
They are always relatively small in number and we choose a code for the primary key
because we can create them and review them manually.
It also helps us to create a code that we can use and refer to, in contrast to the ID fields
that have no meaning.
Typical examples would be:
Sizes Small, Medium and Large, where we are accustomed to seeing S, M and L.
This menu board shows a typical menu in a Turkish restaurant that serves a wide
range of food and drink.
We can see that they are organized in groups, like desserts and hot and cold drinks, and
each of these has products, like apple baklava or Turkish coffee.
This top-down organization is called a hierarchy and appears all over the place in our
world.
Luckily we can show this very easily and neatly in our data model.
[Menu board showing the groups: Desserts, Hot Beverage List, Apple Tree Juices, Cold Beverage List, Other Cold Beverages]
[Dimple]: Toby, with so many tourists, stalls, shops and things to buy, how do we keep
track of everything?
[Toby]: Well, Dimple, by this time, everything has its own identifier that is used wherever
they need to keep track.
[Dimple]: OK, that sounds sensible. And do we use these identifiers in a database?
[Toby]: Yes, Dimple, and in this diagram, we can see that we can use the unique identifiers
that are shown as PK, for primary keys.
We can see that we have a PK for every entity or table so we can be pretty sure we can get
from any table to any other table.
This is called navigating around the data model and is a good test for a well-designed data
model.
We usually know different things about the stallholders and workers than the things we
know about the tourists.
For example, we will probably know the gender of everybody just by looking at them.
For workers, we might also know things related to their employment, such as their date of
birth and their home address.
We can see a field marked as PF in the tables for staff and tourists.
This is unusual because it means a field that is a Primary Key in both tables and also a
Foreign Key to the People Table.
Therefore, if our first record was a member of staff, then we would have a record in the
People Table with a Person_ID of 1 and a record in the Staff Table with a Staff_ID of 1.
Similarly, if our second record was a tourist, we would have a record in the People Table
with a Person_ID of 2 and a record in the Tourists Table with a Tourist_ID of 2.
[Toby]: People make reservations every day all around the world.
These reservations have a lot in common:
Hotel bookings, airline bookings, theatres and shows, appointments to see a doctor
or dentist and so on.
The basic common things are a date and time, usually a specific facility, like a hotel,
an airline seat, a theatre and so on.
This means that we can identify what they have in common and what they have that is
different and specific to the type of appointment.
http://www.databaseanswers.org/data_models/generic_reservations/generic_reservations_inheritance_for_turkey.htm
This means that the values don't change much and I can use them to define what the valid
values can be.
This is a technique that professional data modelers use but we don't need to worry about it
today.
[Dimple]: I'm glad to hear it, Toby!
Although it isn't difficult to understand and it seems like a good idea.
[Toby]: In our small example, we have only four kinds of reference data altogether: gender, types of establishment, people and products.
[Toby]: Dimple, if we bring together everything we have talked about, we will see that we
have quite a good data model that any professional would be proud of.
[Dimple]: OK, Toby. Do you think I will understand it?
[Toby]: Let me help you by making a list of the business rules for our model:
[Toby]: OK, Dimple - we have a very nice data model and now we can take the break I
promised you.
[Dimple]: That's great, Toby - can we go to Starbucks?
[Toby]: Sure, but before we do I should say something about PF, which appears in the Staff
Table.
It's unusual and it's called PF because it means a field which is a Primary Key in the Staff
Table and a Foreign Key to the People Table.
[Dimple]: Hmmm, I've got a headache, Toby - can we please go to Starbucks?
[Toby]: OK, Dimple. You've been a very good girl and you deserve a break.
You can admire what we have created, which is this very professional-looking data model.
There are lots of people in Windsor, including ceremonial guards, staff and tourists.
There are also lots of establishments, like shops and the Castle.
Tourists made visits to establishments where they made purchases of products.
Reference Data
At the end of this tutorial, we will have produced a data model, which is commonly referred
to as an Entity-Relationship Diagram, or 'ERD'.
4.1.1 What is this?
This chapter is a description of the relational theory as originally established by Ted Codd,
who, at the time, was a research scientist with IBM.
4.1.2 Why is it important?
The basic concepts are important because the relational theory is very powerful and
provides a sound theoretical foundation for databases that have become essential since
their first appearance in the early 1970s.
They were the creation of a brilliant research scientist called Ted Codd, who was working for
an IBM Research Lab at the time. It is reported that he faced internal criticism initially
because it was considered that his new idea would affect sales of established IBM database
products.
A physical database can easily be generated from a data model using a commercial
data modeling tool.
Getting Started:
The area we have chosen for this tutorial is a data model for a simple Order Processing
System for Starbucks.
We have done it this way because many people are familiar with Starbucks and it provides
an application that is easy to relate to.
We think about the area we are going to model.
We can see customers ordering products (food, drinks and so on).
Our approach has three steps:
These things will be called Entities in a Data Model and Tables in a Database.
1. At this stage, we show only the entities with no relationships and minimum
attributes and specify only the primary key and one details field that will be
replaced later on.
2. The Primary Key field(s) should always be first.
3. You will notice that the first field in the Customers_version2 Table is the
Customer_ID.
4. It has a PK symbol beside it, which indicates that it is the primary key for the
table.
5. The primary key is very important and is the way that we can recognize each
individual record in the table.
Creating a primary key in the Dezign tool:
1. Right-click on the Entity
3. Choose Attributes
This is shown by the symbol that has three small lines at that end of the relationship
dotted line, which is referred to as crow's feet.
Optional Key Fields
Strictly speaking, a customer does not have to place an order. He or she could change
their mind and walk out without ordering anything. In other words, we would say that
TERM
DEFINITION
Customer
Demand
Sometimes it is useful to see the key fields to ensure that everything looks alright.
When we look closely at this data model, we can see that the primary key is composed of
the Order_ID and Product_ID fields.
This reflects the underlying logic, which states that every combination of order and product
is unique.
In the database, this will define a new record.
When we see this situation in a database, we can say that this reflects a many-to-many
relationship.
However, we can also show the same situation in a slightly different way, which reflects the
standard design approach of using a surrogate key as the primary key and showing the
demand and product IDs simply as foreign keys.
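The two ways of showing this situation can be compared in a Python sketch with the built-in sqlite3 module. The table names Order_Products_V1 and Order_Products_V2, the Quantity field and the helper duplicate_is_rejected are hypothetical. The UNIQUE constraint in the second version is an added assumption, included so that the rule "every combination of order and product is unique" still holds once the surrogate key takes over as primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript(
    """
    CREATE TABLE Orders   (Order_ID   INTEGER PRIMARY KEY);
    CREATE TABLE Products (Product_ID INTEGER PRIMARY KEY);

    -- Version 1: the primary key is the combination of the two foreign keys.
    CREATE TABLE Order_Products_V1 (
        Order_ID   INTEGER NOT NULL REFERENCES Orders (Order_ID),
        Product_ID INTEGER NOT NULL REFERENCES Products (Product_ID),
        Quantity   INTEGER NOT NULL,
        PRIMARY KEY (Order_ID, Product_ID)
    );

    -- Version 2: a surrogate primary key; the two IDs are simply foreign keys.
    CREATE TABLE Order_Products_V2 (
        Order_Product_ID INTEGER PRIMARY KEY AUTOINCREMENT,
        Order_ID   INTEGER NOT NULL REFERENCES Orders (Order_ID),
        Product_ID INTEGER NOT NULL REFERENCES Products (Product_ID),
        Quantity   INTEGER NOT NULL,
        UNIQUE (Order_ID, Product_ID)  -- assumption: keep each combination unique
    );

    INSERT INTO Orders (Order_ID) VALUES (1);
    INSERT INTO Products (Product_ID) VALUES (1);
    """
)


def duplicate_is_rejected(table):
    """Insert the same order/product pair twice; report whether the second fails."""
    sql = f"INSERT INTO {table} (Order_ID, Product_ID, Quantity) VALUES (1, 1, 2)"
    conn.execute(sql)
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


v1_rejected = duplicate_is_rejected("Order_Products_V1")
v2_rejected = duplicate_is_rejected("Order_Products_V2")
print(v1_rejected, v2_rejected)  # True True
```

Both versions refuse a repeated order and product pair; they differ only in which field or fields act as the primary key.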
TERM
DEFINITION
Order
Product
This diagram shows how the hierarchies of products and product types that we have just
discussed are shown in our Entity-Relationship Diagram.
You will notice that the table called 'Product_Types' has a dotted line coming out on the
right-hand side and going back in again on the top-right corner.
Data analysts call this a recursive or reflexive relationship, or informally, simply rabbit ears.
In plain English, we would say that the table is joined to itself and it means that a record in
this table can be related to another record in the table. This approach is how we handle the
situation where each product can be in a hierarchy and related to another Product.
For example, a product called Panini could be in a product sub-category called
'Miscellaneous Sandwiches', which could be in a higher product category called 'Cold Food',
which itself could be in a higher product super-category called simply 'Food'.
Next time you go into Starbucks, take a look at the board behind the counter and try to
decide how you would design the products area of the data model.
You should pay special attention to the little 'zeros' at each end of the dotted line.
These are how we implement the fact that the Parent Product Type Code is optional,
because the highest level will not have a parent.
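A table joined to itself can be sketched in Python with the built-in sqlite3 module. The codes FOOD, COLD and SAND, the field name Product_Type_Description and the query itself are assumptions made for this illustration; the optional parent code follows the description above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

conn.execute(
    """
    CREATE TABLE Product_Types (
        Product_Type_Code        TEXT PRIMARY KEY,
        -- Optional (may be NULL): the highest level has no parent.
        Parent_Product_Type_Code TEXT
            REFERENCES Product_Types (Product_Type_Code),
        Product_Type_Description TEXT NOT NULL
    )
    """
)

# Hypothetical codes; parents are inserted before their children.
conn.executemany(
    "INSERT INTO Product_Types VALUES (?, ?, ?)",
    [
        ("FOOD", None, "Food"),
        ("COLD", "FOOD", "Cold Food"),
        ("SAND", "COLD", "Miscellaneous Sandwiches"),
    ],
)

# Walk up the hierarchy from one product type to the top level.
path = [
    row[0]
    for row in conn.execute(
        """
        WITH RECURSIVE Ancestors (Code, Parent_Code, Description, Depth) AS (
            SELECT Product_Type_Code, Parent_Product_Type_Code,
                   Product_Type_Description, 0
            FROM Product_Types
            WHERE Product_Type_Code = ?
            UNION ALL
            SELECT t.Product_Type_Code, t.Parent_Product_Type_Code,
                   t.Product_Type_Description, a.Depth + 1
            FROM Product_Types t
            JOIN Ancestors a ON t.Product_Type_Code = a.Parent_Code
        )
        SELECT Description FROM Ancestors ORDER BY Depth
        """,
        ("SAND",),
    )
]
print(path)  # ['Miscellaneous Sandwiches', 'Cold Food', 'Food']
```

The walk stops at 'Food' because its parent code is empty, which is the optional end of the relationship.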
This tutorial is also available on the Database Answers Web site:
http://www.databaseanswers.org/tutorial4_data_modeling/index.htm
A number of data models show examples of inheritance, including:
Charities
Event Registrations
Games Store
Insurance Brokers
New Egg
Photo Catalogs
Shrek 2 Movie
Vehicle Imports
Then we think about the fact that every unit is part of a larger organization.
In other words, every unit reports to a higher level within the overall organization.
The unit at the very top of the organization has no one to report to, and a unit at the lowest
level does not have any other unit reporting to it.
In other words, this relationship is optional at the top and bottom levels.
We show this by the small letter O at each end of the line that marks the relationship.
Conceptual or Logical
o This focuses on a business-oriented specification of a situation that identifies
the things of interest and how they are related.
Physical
o This introduces aspects that relate to implementation in a specific
database.
Inheritance can appear in a logical data model but it disappears in the physical data model,
which is what ultimately becomes the database.
Relational databases do not support inheritance. Therefore our thinking must include the
question of when we stop showing the inheritance relationship and replace it with two one-to-many relationships. Business users tend to be comfortable with many-to-many but for
data modelers, DBAs and developers it is usually better to replace them.
Inheritance is a very simple and very powerful concept. We can see examples of inheritance
in practice when we look around us every day. For example, when we think about Houses,
we implicitly include bungalows and ski lodges, and maybe even apartments, beach huts
and house boats.
In a similar way, when we discuss aircraft we might be talking about rotary aircraft, fixed
wing aircraft and unmanned aircraft.
However, when we want to design or review a data model that includes aircraft, then we
need to analyze how different kinds of aircraft are shown in the design of the data model.
We use the concept of Inheritance to achieve this. Inheritance in data modeling is just the
same as the general meaning of the word. It means that at a high level, we identify the
general name of the Thing of Interest and the characteristics that all of these things share.
For example, an aircraft will have a name for the type of aircraft, such as Tornado and it will
be of a certain type, such as fixed-wing or rotary.
At the lower level of fixed-wing aircraft, an aircraft will have a minimum length for the
runway that the aircraft needs in order to take off.
This situation is shown in the following diagram:
Of course, it is not always possible to determine the Date From value, and it is not always
something that it is appropriate to ask every customer.
Therefore, a better and more general approach is to use a key (that we discussed in Section
3.3) for a record and leave the Date From field optional.
We can accommodate more than one person at the same address. We need to
do this because different members of a family may sign up separately with
Amazon.
With this approach, we can always be sure that we have 100% good address
data in our database.
It has a relatively small number of values, usually less than a few dozen and
never more than a few hundred.
Data in Reference Data Tables can be used to populate drop-down lists for
users.
4.11.4 Standards
http://www.databaseanswers.org/data_models/gaming_gears_of_war/index.htm
the Weapons
the Rules of Engagement between the Good Guys and the Bad Guys
His Profile reads: A military man to the core, Colonel Victor Hoffman demands discipline and sacrifice from
those under his command.
His Profile reads: Few have given more and lost as much as Marcus Fenix.
A promising soldier during the Pendulum Wars, Marcus saw everything change on
Emergence Day.
Marcus bravely fought the Locust for ten years, then, during an intense battle, he
abandoned his post to rescue his father, Professor Adam Fenix.
But he arrived too late.
Marcus was tried for dereliction of duty and sentenced to 40 years in
Jacinto Maximum Security Prison.
Incarcerated for four years before being released to fight Locust again, Marcus was later
promoted to sergeant.
His Profile reads: Private Damon Baird is a dedicated tech-head and professional skeptic.
In Baird's world, if something can go wrong, it probably already has.
His sarcasm can keep people at a distance, which is why Baird prefers the company of
machines.
He believes in the Coalition's cause, but he's often frustrated with command decisions, and
took offense when Hoffman promoted Marcus Fenix to lead Delta Squad instead of him.
His Profile reads: As the youngest member of Delta Squad during the Lightmass Offensive, what Private
Anthony Carmine lacked in combat experience, he made up for in unbridled enthusiasm.
His Profile reads: A seasoned fighter who's positive even in the darkest of hours, Dominic Santiago freed his
best friend Marcus Fenix from Jacinto Maximum Security Prison and recruited him into Delta
Squad.
His battlefield intensity is rivalled only by his loyalty to Marcus--and his wife, Maria.
Dominic's relentless search for his wife finally ended during Operation: Hollow Storm, when
he and Marcus found her in a Locust processing facility, barely alive and irrevocably twisted.
Marcus left his side to allow Dom a final moment with his beloved Maria before ending her
suffering.
Her Profile reads: As Delta's Control contact, Anya Stroud guided Delta Squad on their mission to destroy the
Locust, providing vital intel and strategic advice to the squad in the field.
Her Profile reads: Samantha "Sam" Byrne's father, Sgt. Samuel Byrne, fell in battle at the siege of Anvil Gate
in Anvegad, Kashkur before the birth of his daughter.
5.2.8 Summary
At this point, we can see that all Soldiers have Gender, Names, Ranks and a military
background.
Therefore, our Soldiers Table looks like this:
Description of the Boomshot Grenade Locust is: A Boomshot Grenade Locust is a short- to mid-range grenade launcher that can easily take
down a target in a single shot.
Description of the Hammer of Dawn is: The Hammer of Dawn is an Imulsion-powered satellite that rains down a devastating
particle energy stream.
It can wipe out anything from small Locust squads to entire city blocks.
Description of the Longshot is: The Longshot Sniper Rifle is a high-powered, bolt-action sniper rifle with a powerful zoom
sight.
Description of the OneShot is: The OneShot is an intimidating and obscenely powerful long-range sniper rifle capable of
destroying most foes in a single shot.
Description of the Flamethrower is: The Scorcher Flamethrower is a short- to mid-range weapon that emits a concentrated
stream of fire that chars your enemies.
Description of the Turret is: The Troika Turret is a high-powered, turret-mounted Locust machine gun that fires
continuous rounds across the battlefield.
The Brumak's Profile reads: To stand in the Brumak's shadow is to stare death in the face.
These hulking war machines possess a deadly assortment of weapons, from wrist-mounted
machine guns to over-the-shoulder rocket launchers.
For any chance of survival against a Brumak, blast away bits of its armor to reveal the soft,
weak spots underneath.
The Grenadier's Profile reads: Locust Grenadiers are never afraid to get up close and personal.
They have a hard-charging kamikaze attack and rush their enemy with little concern for
their own welfare.
They specialize in both grenades and the Gnasher Shotgun, drawing their targets out of
cover with one before blasting them to pieces with the other.
5.5.4 RAAM
The RAAM Profile reads: An imposing figure, RAAM towers over all humans, his silent demeanor concealing a violent
and merciless nature.
In battle, RAAM is a formidable opponent who wields a Troika Machine Gun while controlling
the Kryll that he sometimes employs as a shield. RAAM met his demise at the hands of
Marcus Fenix aboard the Tyro Pillar, where his reign of terror came to an abrupt and
welcome end.
This Model shows all the components in the Complete Design Pattern:
Weapons will be involved for an Event that is a fight between Soldiers and Locusts.