Korth dbms ebook pdf
Database System Concepts by Silberschatz, Korth and Sudarshan is now in its 7th edition and is one of the cornerstone texts of database education. It presents the fundamental concepts of database management in an intuitive manner geared toward allowing students to begin working with databases as quickly as possible. It also contains additional material that can be used as supplements or as introductory material for an advanced course. Assertions are seeing increasing use. Triggers are widely used, although each database supports its own syntax and semantics for triggers; triggers were standardized as part of SQL, and we can expect databases to provide support for SQL triggers.
Functional dependencies are now taught as part of normalization instead of be- ing part of the integrity constraints chapter as they were in the 3rd edition. The rea- son for the change is that they are used almost exclusively in database design, and no database system to our knowledge supports functional dependencies as integrity constraints. Covering them in the context of normalization helps motivate students to spend the effort to understand the intricacies of reasoning with functional depen- dencies.
Security is a major topic in its own right. Since any system is only as secure as its weakest component, a system builder must consider all aspects of security. This chapter focuses only on those security issues that are specific to databases. In an advanced course, this material can be supplemented by discussion of security issues in operating systems and in distributed systems.
Changes from 3rd edition: Trigger coverage is now based on the SQL standard. At the time of publica- tion of the 3rd edition, triggers had not been standardized. The notion of roles for authorization has been introduced in this edition, now that it is a part of the SQL standard.
Coverage of encryption has been updated to cover recent developments. Answer: create table loan loan-number char 10 , branch-name char 15 , amount integer, primary key loan-number , foreign key branch-name references branch create table borrower customer-name char 20 , loan-number char 10 , primary key customer-name, loan-number , foreign key customer-name references customer, foreign key loan-number references loan Declaring the pair customer-name, loan-number of relation borrower as primary key ensures that the relation does not contain duplicates.
Identify referential-integrity con- straints that should hold, and include them in the DDL definition. Other choices for not null at- tributes may be acceptable. Consider a database that includes the following relations: salaried-worker name, office, phone, salary hourly-worker name, hourly-wage address name, street, city Suppose that we wish to require that every name that appears in address appear in either salaried-worker or hourly-worker, but not necessarily in both.
Propose a syntax for expressing such constraints. Discuss the actions that the system must take to enforce a constraint of this form. For simplicity, we present a variant of the SQL syntax. As part of the create table expression for address we include foreign key name references salaried-worker or hourly-worker b. To enforce this constraint, whenever a tuple is inserted into the address rela- tion, a lookup on the name value must be made on the salaried-worker relation and if that lookup failed on the hourly-worker relation or vice-versa.
The foreign-key clause requires that every manager also be an employee. Explain exactly what happens when a tuple in the relation manager is deleted. Answer: The tuples of all employees of the manager, at all levels, get deleted as well! This happens in a series of steps. The initial deletion will trigger deletion of all the tuples corresponding to direct employees of the manager. These deletions will in turn cause deletions of second level employee tuples, and so on, till all direct and indirect employee tuples are deleted.
Describe how the trigger mechanism can be used to implement the on delete cascade option, when a tuple is deleted from s. Answer: We define triggers for each relation whose primary-key is referred to by the foreign-key of some other relation.
The trigger would be activated when- ever a tuple is deleted from the referred-to relation. The action performed by the trigger would be to visit all the referring relations, and delete all the tuples in them whose foreign-key attribute value is the same as the primary-key attribute value of the deleted tuple in the referred-to relation.
These set of triggers will take care of the on delete cascade operation. Answer: The assertion-name is arbitrary. We have chosen the name perry. Note that since the assertion applies only to the Perryridge branch we must restrict attention to only the Perryridge tuple of the branch relation rather than writing a constraint on the entire relation.
Answer: create trigger check-delete-trigger after delete on account referencing old row as orow for each row delete from depositor where depositor. Write active rules to maintain the view, that is, to keep it up to date on insertions to and deletions from depositor or account. Do not bother about updates. Answer: For inserting into the materialized view branch-cust we must set a database trigger on an insert into depositor and account. We assume that the database system uses immediate binding for rule execution.
Further, assume that the current version of a relation is denoted by the relation name itself, while the set of newly inserted tuples is denoted by qualifying the relation name with the prefix — inserted. The active rules for this insertion are given below — define trigger insert into branch-cust via depositor after insert on depositor referencing new table as inserted for each statement insert into branch-cust select branch-name, customer-name from inserted, account where inserted.
The deletion of a tuple from branch-cust is similar to insertion, except that a deletion from either depositor or account will cause the natural join of these relations to have a lesser number of tuples.
We denote the newly deleted set of tuples by qualifying the relation name with the keyword deleted. For each item on your list, state whether this concern relates to physical security, human security, operating- system security, or database security. Answer: Let us consider the problem of protecting our sample bank database. Some security measures at each of the four levels are mentioned below - a.
Physical level - The system from which the relations can be accessed and modified should be placed in a locked, well-guarded, and impregnable room. Passwords for gaining access to the database should be known only to trusted users. Operating System level - Login passwords should be difficult to guess and they should be changed regularly.
No user should be able to gain unautho- rized access to the system due to a software bug in the operating system. Database System level - The users should be authorized access only to rele- vant parts of the database.
A view containing the account numbers and customer names but not the balances for all accounts at the Deer Park branch. Exercises 79 b. A view containing the names and addresses of all customers who have an account with the bank, but do not have a loan. A view containing the name and average account balance of every customer of the Rock Ridge branch. Hint: See the discussion of views in Chapter 3. Answer: To insert account-number, name into the view deer-park we insert the tuple Deer Park, account-number, null into the account relation and the tuple name, account-number into the depositor relation.
Updates to the views no-debt and avg-bal present serious problems. If we insert into the no-debt view, the system must reject the insertion if the customer has a loan.
The overhead of updating through this view is so high that most systems would disallow update. The avg-bal view cannot be updated since the result of an aggregate operation depends on several tuples, not just one. In this chapter, we described the use of views as a security mechanism. Do these two purposes for views ever conflict? However, as the following example shows, the two purposes do conflict in case the mechanisms are not designed carefully.
However, the user will be denied access to delete this erroneous tuple by the security mechanism. Answer: Index and resource authorization should be special categories to al- low certain users to create relations and the indices to operate on them while preventing these time-consuming and schema-changing operations from being available to many users.
Separating index and resource authorization allows a user to build an index on existing relations, say, for optimization purposes, but allows us to deny that user the right to create new relations. Discuss an advantage and a disadvantage of such an approach. Answer: Database systems have special requirements which are typically more refined than most operating systems. Encrypted data allows authorized users to access data without worrying about other users or the system administrator gaining any information.
Encryption of data may simplify or even strengthen other authorization mechanisms. For example, distribution of the cryptographic key amongst only trusted users is both, a simple way to control read access, and an added layer of security above that offered by views. Suggest a scheme for the secure stor- age of passwords. Be sure that your scheme allows the system to test passwords supplied by users who are attempting to log into the system. Answer: A scheme for storing passwords would be to encrypt each password, and then use a hash index on the user-id.
The user-id can be used to easily access the encrypted password. The password being used in a login attempt is then en- crypted and compared with the stored encryption of the correct password. Undergraduates frequently find this chapter difficult.
It is acceptable to cover only Sections 7. However, a careful study of data dependencies and normalization is a good way to introduce students to the formal aspects of relational database theory. There are many ways of stating the definitions of the normal forms. We have cho- sen a style which we think is the easiest to present and which most clearly conveys the intuition of the normal forms. Changes from 3rd edition: There are many changes to this chapter from the 3rd edition.
Functional dependencies are now covered in this chapter, instead of Chap- ter 6. The reason is that normalization provides the real motivation for functional dependencies, since they are used primarily for normalization.
We have described a simplified procedure for functional dependency inference based on attribute closure, and provided simplified procedures to test for normal forms. The process of practical relational schema design has been described in signifi- cantly more detail, along with some design problems that are not caught by the usual normalization process. Explain why each of these properties may indicate a bad relational- database design.
This is a bad relational database design because it increases the storage re- quired for the relation and it makes updating the relation more difficult. This is bad re- lational database design because all the unrelated attributes must be filled with null values otherwise a tuple without the unrelated information can- not be inserted into the relation. It is a bad relational database design because certain queries cannot be answered using the reconstructed relation that could have been answered using the original relation.
Since A is a candidate key see Exercise 7. Answer: Certain functional dependencies are called trivial functional depen- dencies because they are satisfied by all relations.
C does not functionally determine A because the first and third tuples have the same C but different A values. The same tuples also show B does not functionally determine A. Like- wise, A does not functionally determine C because the first two tuples have the same A value and different C values.
The same tuples also show B does not functionally determine C. Relation of Exercise 7. The following relation r is a counterexample to the rule.
No more dependencies in F apply now. Also none of the attributes in the left side or right side of any of the FDs is extraneous. Therefore the canonical cover Fc is equal to F. Show that this algorithm is more efficient than the one presented in Figure 7. Otherwise f dcount would be nonzero and the if condition would be false. A full proof can be given by induction on the depth of recursion for an execution of addin, but such a proof can be expected only from students with a good mathematical background.
The entry for attribute A is a list of integers. Hence A will be added to result. To see that this algorithm is more efficient than the one presented in the chap- ter note that we scan each FD once in the main program. The resulting array appears has size proportional to the size of the given FDs. The recursive calls to addin result in processing linear in the size of appears. Hence the algorithm has time complexity which is linear in the size of the given FDs. On the other hand, the algorithm given in the text has quadratic time complexity, as it may perform the loop as many times as the number of FDs, in each loop scanning all of them once.
Also write an SQL assertion that enforces the functional dependency. Assume that no null values are present. The query is given below. Therefore, this is a lossy join. Thus, t[R1 ] 1 t[R2 ] The cartesian product of single tuples generates one tuple. Finally, the projection clause removes duplicate attribute names. That is, t is equal to the result of this join. We shall see in Exercise 7. A simpler argument is as follows: F1 contains no dependencies with D on the right side of the arrow.
F2 contains no dependencies with B on the left side of the arrow. Hint: Show that the join of all the projections onto the schemas of the decomposition cannot have more tuples than the original relation. Answer: Let F be a set of functional dependencies that hold on a schema R. Let X be a candidate key for R. Consider a legal instance r of R. Consider the use of the algorithm given in Figure 7.
We use induction on the number of times that the f or loop in this algorithm is executed. Thus we have proved that the size of j equals that of r. Using the result of Exercise 7. Answer: The three design goals are lossless-join decompositions, dependency preserving decompositions, and minimization of repetition of information.
They are desirable so we can maintain an accurate database, check correctness of up- dates quickly, and use the smallest amount of space possible. Answer: From Exercise 7. By the algorithm of Figure 7. This is in BCNF. Answer: BCNF is not always dependency preserving. Therefore, we may want to choose another normal form specifically, 3NF in order to make checking de- pendencies easier during updates. This would avoid joins to check dependen- cies and increase system performance.
Answer: First we note that the dependencies given in Exercise 7. Generating the schema from the algorithm of Figure 7. Schema A, B, C contains a candidate key. Thus, it was not necessary to apply the algorithm as we have done above. The single original schema is trivially a lossless join, dependency-preserving decomposi- tion.
We can restate our definition of 3NF as follows: A relation schema R is in 3NF with respect to a set F of functional dependencies if there are no nonprime attributes A in R for which A is transitively dependent on a key for R. Show that this new definition is equivalent to the original one. Answer: Suppose R is in 3NF according to the textbook definition.
We show that it is in 3NF according to the definition in the exercise. Suppose R is not in 3NF according the the textbook definition. Show that every 3NF schema is in 2NF. Hint: Show that every partial depen- dency is a transitive dependency. Answer: Referring to the definitions in Exercise 7. Let A be a non-prime attribute in R.
Hence we have proved that every 3NF schema is also in 2NF. See Exercise 7. To avoid this we must go to 3NF. Repetition of information is allowed in 3NF in some but not all of the cases where it is allowed in 2NF. Thus, in general, 3NF reduces repetition of information.
Since we can always achieve a lossless join 3NF decomposition, there is no loss of information needed in going from 2NF to 3NF. However, in case we choose this decomposition, retrieving information about the relationship be- tween A, B and C requires a join of two relations, which is avoided in the cor- responding 2NF decomposition.
Thus, the decision of which normal form to choose depends upon how the cost of dependency checking compares with the cost of the joins. Usually, the 3NF would be preferred. Dependency checks need to be made with every insert or update to the instances of a 2NF schema, whereas, only some queries will require the join of instances of a 3NF schema.
Explain problems that they may cause. Answer: Dangling tuples can arise when one tuple is inserted into a decom- posed relation but no corresponding tuple is inserted into the other relations in the decomposition. They can cause incorrect values to be returned by queries which form the join of a decomposed relation since the dangling tuple might not be included.
As we saw in Chapter 5, dangling tuples can be avoided by the specification of referential integrity constraints.
This chapter and the next chapter form a logical unit and should be taught consecutively. It is possible to teach these chapters before covering normalization Chapter 7. However, it is quite possible that students may already be familiar with the basic concepts of object orientation, and with an object-oriented programming lan- guages. For such students Section 8. However, it is important to point out the motivation for object-oriented features in the context of a database, and how the requirements differ from those of a programming language.
A persistent object-oriented language should be merely a front- end to a database. It is important to remind students of all of the features that a database system must have, so that, they can distinguish full-fledged object-oriented database systems from systems that provide an object-oriented front-end, but pro- vide little in the way of database facilities such as a query facility, an on-line catalog, concurrency control and recovery.
There are several commercial object-oriented database systems available on the market, and a few public domain systems as well. Some of the commercial systems also offer low-cost or free copies for academic use. The commercial object-oriented database systems include Objectivity www.
Changes from 3rd edition: Some examples have been updated to make them more intuitive. The coverage of ODMG has been updated to ODMG-2, including the new syntax with a d prefix for keywords , and the new d rel ref feature to declare relationships. List all specific system components that would need to be modified. Computer-aided design b. Multimedia databases Answer: Each of the applications includes large, specialized data items e. These data items have operations specific to them e. These data items are of vari- able length making it impractical to store them in the short fields that are al- lowed in records for such database systems.
Thus, the data model, data manip- ulation language, and data definition language need to be changed. Also, long-duration and nested transactions are typical of these applications. Changes to the concurrency and recovery subsystems are likely to be needed. Answer: An entity is simply a collection of variables or data items. An object is an encapsulation of data as well as the methods code to operate on the data.
The data members of an object are directly visible only to its methods. For all vehicles, it includes the vehicle identification number, license num- ber, manufacturer, model, date of purchase, and color. Use inheritance where appropriate. Illustrate your explanation with an example. Answer: A class inherits the variables and methods of all its immediate super- classes. Thus it could inherit a variable or method of the same name from more than one super-class.
When that particular variable or method of an object of the sub-class is referenced, there is an ambiguity regarding which of the super- classes provides the inheritance. For instance, let there be classes teacher and student, both having a variable department. If a class teachingAssistant inherits from both of these classes, any reference to the department variable of a teachingAssistant object is ambiguous.
Answer: Tuple equality is determined by data values. Object identity is inde- pendent of data values, since object-oriented systems use built-in identity. It has all the properties that objects of class A have, plus additional ones of its own. It can of course provide its own implementations for the inherited methods.
There need not be any simi- larities in the properties of A and B. Neither B nor A inherit anything from the other. Might it be simpler to use only persistent objects, with unneeded objects deleted at the end of an execution? This is because of the over-heads in preserving transaction semantics, security and integrity. Since a transient ob- ject is purely local to the transaction which created it and does not enter the database, all these over-heads are avoided.
Thus, in order to provide efficient access to purely local and temporary data, transient objects are provided by persistent programming languages. Give schema definitions corresponding to the relational schema shown in Figure 3. Write programs to compute each of the queries in Exercise 3. The schema definitions can be written in two different ways, one of which is a direct translation from the relational schema, while the other uses object- oriented features more directly.
We present queries for the second schema. Answer: To represent ternary relationships, create a class corresponding to the relationship and refer to the entities in this class. For example, to represent the ternary relationship in Figure 2.
Contrast this implementation with that of pointers as they exist in general-purpose languages, such as C or Pascal. These ADTs should provide the typical pointer operations like incrementing and dereferencing, so their usage and regular pointer usage is uniform. Regular pointers on the other hand are usually built-in types, implemented as part of the language. Answer: If an object is created without any references to it, it can neither be accessed nor deleted via a program.
The only way is for the database system to locate and delete such objects by itself. This is called garbage collection. One way to do garbage collection is by the method of mark and sweep.
First, the objects referred to directly by programs are marked. Then references from these objects to other objects are followed, and those referred objects are marked. This pro- cedure is followed repeatedly until no more unmarked objects can be reached by following reference chains from the marked objects. At this point, all these remaining unmarked objects are deleted.
Is such a system necessarily a database system? Answer: A database system must provide for such features as transactions, queries associative retrieval of objects , security, and integrity.
A persistent ob- ject system may not offer such features. Such extended systems are called object- relational systems. Since the chapter was introduced in the 3rd edition most commer- cial database systems have added some support for object-relational features, and these features have been standardized as part of SQL It would be instructive to assign students exercises aimed at finding applications where the object-relational model, in particular complex objects, would be better suited than the traditional relational model.
Changes from 3rd edition: The query language features are now based on the SQL standard, which was not ready when the 3rd edition was published; that edition was based from features from several different proposals for extending SQL. Exercises 9. Suppose the database contains a relation emp Emp. Write the following queries in SQL with the extensions described in this chapter. Find the names of all employees who have a child who has a birthday in March.
List all skill types in the relation emp. SkillSet as s, s. ExamSet as x where s. SkillSet as s 9. List any functional or multivalued dependencies that you assume. Also list all referential-integrity constraints that should be present in the first- and fourth-normal-form schemas. Answer: To put the schema into first normal form, we flatten all the attributes into a single relation schema. The MVDs capture the fact there is no relationship between the children of an employee and his or her skills-information.
The ename attribute is a foreign key in Child and in Skill, referring to the Employee relation. Give a relational schema in third normal form that represents the same information. Recall the constraints on sub- tables, and give all constraints that must be imposed on the relational schema so that every database instance of the relational schema can also be represented by an instance of the schema with inheritance. Instead of placing only the name attribute of People in Students and Teachers, both its attributes can be included.
Sudarshan , The slides and figures are authorized for personal use, and for use in conjunction with a course for which Database System Concepts is the prescribed text.
Instructors are free to modify the slides to their taste, as long as the modified slides acknowledge the source and the fact that they have been modified. Paper copies of the slides may be sold strictly at the price of reproduction, to students of courses where the book is the prescribed text.
Any use that differs from the above, and any for profit sale of the slides in any form requires the consent of the copyright owners; contact Avi Silberschatz avi cs. Korth S. To browse Academia. Skip to main content. Transaction Management Introduction Transactions Concurrency Control Recovery System iii VI. Database System Architecture Introduction Database System Architecture Distributed Databases Parallel Databases VII. Other Topics Introduction Application Development and Administration
0コメント