A movie studio is characterized by its name and address. A studio may have many film crews working for it. Each film crew at a studio is identified by its number: 1, 2, 3, and so on. It also has a crew chief.
Draw an ER diagram to model these entities.
State clearly any assumptions you make in your design.
How many entities are there? What are their keys?
How many relationships? What are their cardinalities?
Does the key for the Crew entity create any problems?
Here's what I see in the problem description:
But the description leads to a violation of one of the most important elements of a data model: A key uniquely identifies a tuple. In this case, though, the same crew numbers are used at multiple studios, so they are not unique.
Maybe we need an artificial key:
This works, but it is dissatisfying in another way. The model is no longer quite as faithful to the domain.
Can we do better?
This kind of situation arises more often in data modeling than might seem obvious at first. It is so common that there is a tweak to ER and relational models for recording and handling it: the weak entity.
We are still exploring conceptual modeling and logical modeling in parallel, in the form of the Entity Relationship model and the relational data model. This week, we will bring them together and prepare to focus our attention on the theory and practice of the relational data model.
A movie studio can have many film crews working for it. Each film crew at a studio is identified by its number: 1, 2, 3, and so on. But other studios use the same numbers to identify their crews, so the crew number attribute cannot be the key. It's not clear that crew number + crew chief can be a compound key, either.
To identify a crew uniquely, we need both the name of the studio and the crew number. Studio name is not an attribute of the crew, though, so it has to be passed through the WorksFor relationship.
It is legal for an entity's key to include attributes that belong to a related entity. This kind of entity is the called a weak entity.
The studio film crew illustrates a form of weak entity that occurs frequently in the world: one entity is part of another, in a hierarchical relationship. A crew is part of a studio. Rather than create a meaningless artificial key, we can uniquely identify a crew by identifying the studio of which it is a part.
In our ER models, we represent weak entities using a special notation:
We use a double rectangle to indicate a weak entity. A double diamond marks the many-to-one relationship that ties a weak entity to the source of the rest of its key.
The result is a model that is faithful to the domain.
Another example is hierarchical classification in biology. Each species takes its name from both the species name itself, say sapiens, and the genus that contains it, say Homo. This supports faithfully models the biological domain: a genus can contain multiple species, and species names can be reused across two genera.
An ER diagram of the relationship between species and genera might look like this:
Can you think of any other examples in domains you're familiar with?
In a university's information system, we will have Student and Course as entities:
We would like to record the fact that a student takes a course. This sounds like a natural place to create a relationship, but we have a problem:
We cannot use a "Takes" relationship, because it needs to have an attribute of its own: the student earns a grade in the course. We need another entity: an EnrollmentRecord:
What will be the key in the new entity? Will the grade be part? No, the key for an EnrollmentRecord will comprise the keys from the corresponding Student and Course.
If we decide to record more information in an EnrollmentRecord -- say, the semester in which the student takes the course -- then EnrollmentRecord may contribute a field to the key. (Our model would now enable us to record that students take a course multiple times.)
EnrollmentRecord is an example of a second source of weak entities: when we create entities that aren't a natural part of the domain. We also see this in the process that creates an artificial entity in order to convert a multi-way relationship into a set of binary relationships. We will see this idea in this week's online session.
After seeing a couple of examples, we can describe a bit more formally how weak entities and their relationships work. This can help us recognize and use weak entities effectively.
If E is a weak entity, then its key consists of:
R is a supporting relationship for E from F if:
The bullets in the second lists seem pretty straightforward, except perhaps for the second one: why can every E be related to only one F? Think about it for a second... If one enrollment record could be related to two students, then our model would store the same grade for both students!
The theory of weak entities is bigger than this set of rules. (For example, a supporting entity F can be weak, too! In that case, it will have a supporting entity G that contributes to the key of E.) But this set of rules prepares us to work with most of the situations that we will see in data modeling.
You may have noticed that the entities in ER modeling and the relations in the relational data model are quite similar. The correspondence is so strong that, as a first approximation, converting an ER model into a set of relations is simple:
Consider the bank account we modeled in last session's closing exercise:
The first rule tells us to create relations named Customer and Account, with these schema:
Customer(ssNo, name, phone, address) Account(number, type, balance)These schema describe instances such as these:
ssNo name phone address ----------- ----- -------- ------- 123-45-6789 Eugene W 273-5919 305 ITTC 321-45-6789 John Smith 452-9159 339 WRT number type balance ------ -------- ------- 12345 checking 100 96114 savings 25 54678 savings 1000
The second rule tells us to create a relation named Owns, with this schema:
Owns(ssNo, number)This schema allows us represent account ownership:
ssNo number ----------- ------ 123-45-6789 12345 321-45-6789 96114 123-45-6789 54678
And we are done. This approach works for a broad variety of ER models.
There are a few tweaks we can make to these rules to handle special cases. Let's look at one now and one later.
Combining two relations Sometimes, a relationship results in a table that can be folded into one of its entities' tables.
Consider the Owns relation. In a bank where the relationship between customers and accounts is m:n -- that is, each customer can own multiple accounts, and each account can have multiple owners -- then the Owns relation is necessary for us to record the possible combinations.
However, in our closing exercise, the relationship is 1:n. A customer may have one or more accounts, which have one owner each. This means that the Owns relation can contain many tuples with the same customer, but there will be only one tuple for each account.
ssNo number ----------- ------ 123-45-6789 12345 321-45-6789 96114 123-45-6789 54678The account number is all we need to identify a tuple!
This is a case of an entity E and an n:1 relationship R from E to F. In these cases, it can be clearer and more efficient to fold the relation for R into the relation for E, resulting in a schema that contains:
In our example, E is Account, F is Customer, and R is Owns, an n:1 relationship from Account to Customer. We can fold the Owns relation into the Account relation, resulting in this schema:
Customer(ssNo, name, phone, address) Account(number, type, balance, owner)The attribute owner will be the ssNo of the customer who owns the account.
number type balance owner ------ -------- ------- ---------- 12345 checking 100 123-45-6789 96114 savings 25 321-45-6789 54678 savings 1000 123-45-6789"Owner" is a better name for the attribute than "ssNo". We will want to define the domain of the attribute to be the set of customer keys.
This conversion may cause a feeling of deja vu... Last time, we looked at some of your questions from an ER modeling exercise. Many of you were wondering about the conditions under which an attribute could be an entity or a relationship could be an attribute. Combining two relations into one when one of them adds no new information illustrates one such condition: a relationship in an ER model can become an attribute in a relational schema!
Handling Weak Entities. You might imagine that weak entities require a slightly different approach. You'd be right. Let's study that approach in this week's online session.
This week's online activity is available. Work through it before class next week. There is a deliverable for this one, so send it to me after you are done.
Homework 1 was due this morning. Homework 2 will be available on Friday. Exam 1 will not be next week.