Database concepts
Concept of a Database: Why database?
File system to store and manage data electronically.
A file can be understood as a container to store data in a computer. Files can be stored on the storage device of a computer system. Contents of a file can be texts, computer program code, comma separated values (CSV), etc. Likewise, pictures, audios/videos, web pages are also files.
In traditional approach, independent application programs access their own independent data files. This results in many problems in data storage and retrieval. In the traditional file approach, data is stored in flat files, which are maintained by the file system, under the control of operating system.
Disadvantages of the Traditional Approach
• Data Security: The data as maintained in the flat file is easily accessible and therefore not secure
• Data Redundancy: Often the same information is duplicated in two or more files.
• Data Isolation: Data isolation means that all the related data is not available in one file. Generally, the data is scattered in various files, and the files may be in different formats, therefore writing new application programs to retrieve the appropriate data is difficult.
• Data Dependence: Application programs are data dependent. It is impossible to change the physical representation (how the data is physically represented in storage) or access technique (how it is physically accessed) without affecting the application.
• Lack of Flexibility: Only pre-determined requests for information can be met. It is not flexible enough to satisfy unanticipated queries.
• Concurrent Access Anomalies: Same piece of data is allowed to be updated simultaneously which leads to inconsistencies.
Database Approach:
In database approach, all applications access a common database, which is a centralized data storage system.
Advantages of the Database approach
• Redundancy of data storage is reduced.
• Inconsistency in data is eliminated.
• Data sharing between applications is possible.
Database Management System (DBMS)
DBMS is software that interfaces between applications and a database for all data processing activities. In other words, DBMS can be defined as "A Software that controls the organization, storage, retrieval, security and integrity of data in a database. It accepts requests from the application and instructs the operating system to transfer the appropriate data."
The major DBMS vendors are Oracle, IBM, and Microsoft. MySQL is a popular open source DBMS. Users of a DBMS End Users, Application Programmers, Database Designer and Database Administrators use a DBMS, either directly or indirectly.
How users interact with a Database?
1. End users send queries to the DBMS through applications.
2. The DBMS translates the queries.
3. The DBMS retrieves data from the database.
4. The DBMS sends data to the application, which present the data to the end users.
Functions of a DBMS
Function of DBMS can be broadly classified as follows:
• Data Definition Language (DDL): Defining the data structure Defining structure of data to be stored in database.
• Data Manipulation Language (DML): Manipulating data, Retrieving data, data adding, data modifying, deleting data.
• Data Control Language (DCL): Data Security and control of data access prevents unauthorized access of data through User-ids and Passwords.
Major Features of a DBMS
Data Security: The DBMS can prevent unauthorised users from viewing or updating the database. Security is achieved through Login, password, views etc. For example, in an employee database, some users may be able to view salaries while others may view only work history and medical data.
Data Integrity: The DBMS can ensure that no more than one user can update the same record at the same time. It will not allow user to insert duplicate records. For example, no two customers with the same customer number can be entered.
Interactive Query: A DBMS provides a query language and report writer that lets users interactively interrogate the database. These essential components give users access to all management information as needed.
Data Independence: When a DBMS is used, the details of the data structure are not stated in each application program.
DBMS: Advantages
• Data can be handled as a common resource. The database is a model of the real world.
• Centralized management and economy of scale availability of integrated services.
• Reduction of redundancies and inconsistencies.
• Data independence.
RDBMS
Relational Database Management System is a type of DBMS in which the database is organized and accessed according to the relationships between data values.
This is called relational database because all the data is stored into different tables and relations are established using primary keys or other keys known as Foreign Keys.
The RDBMS was invented by a team lead by Dr. Edmund F. Codd and funded by IBM in the early 1970’s. The Relational Model is based on the principles of relational algebra. Example RDBMS Systems: Oracle, SQL Server, DB2, Sybase, etc.
A Relational DataBase Management System (RDBMS) is a software that:
-Enables you to implement a database with tables, columns and indexes.
- Guarantees the Referential Integrity between rows of various tables.
- Updates the indexes automatically.
- Interprets an SQL Query and combines information from various tables.
Basics of Relational Database
Relational Database Management System (RDBMS): RDBMS is the most popular form of DBMS used in the world. It uses a relational database to organize data. A relational database comprises relations, which are represented as tables.
RDBMS Terminology
Database: A database is a collection of tables, with related data.
Table: A table is a matrix with data. A table in a database looks like a simple spreadsheet.
Column: One column (data element) contains data of one and the same kind, for example the column post code.
Row: A row (= tuple, entry or record) is a group of related data. For example, the data of one subscription.
Redundancy: Storing data twice, redundantly to make the system faster.
Primary Key: A primary key is unique. A key value cannot occur twice in one table.With a key, you can only find one row.
Foreign Key: A foreign key is the linking pin between two tables.
Relation: A relation stores information about an object in the real world. A relation is represented as a table.
Attribute: Each attribute of a relation stores a piece of information about an object. Attributes are represented as columns in tables and can be arranged in any order. Each attribute in a relation is unique and contain atomic values i.e. Atomic value contain a single value of data and Non Atomic values contain a set of values. The number of attributes in a relation is called the degree of the relation.
Tuple: A row in a table is called a tuple of the relation. The number of tuples in a relation is known as the cardinality of the relation. Tuples in a table are unique and can be arranged in any order.
Domain: A domain is a set of valid atomic values that an attribute can take. Within a single database, an attribute cannot have different domains associated with it. A domain can include a null value, if the value for the domain is unknown or does not exist.
The concept of Data Integrity
Data Integrity:
Data Integrity refers to the correctness and completeness of data in a database.
Integrity Constraints: Integrity constraints allow only correct changes to be made to a database. There are two types of integrity constraints - entity integrity and referential integrity.
Entity Integrity: Entity Integrity ensures that for each row in a table, the value of the primary key is unique and is not null.
Referential Integrity: Referential Integrity ensures that for each row in a table, the value of the foreign key is present in the reference table.
Domain constraints: An entity refers to any data recorded in the database. The primary key is used as the identifier of the row. Duplication of data is not possible. No component of the primary key is allowed to accept a NULL value.
Interpreting Data
Entity: An entity is an object that exists in the real world and is distinguishable from other objects. Each entity is represented as a table in a relational database.
Types of Entities: Entities can be classified in two ways - based on existence and based on subsets. Based on existence, entities can be classified as Dominant and Weak entities. Based on subsets, entities can be classifies as Supertypes and Subtypes.
Relationships
A relationship is an association between two entities.
Types of Relationships: Relationships are classified into three types based on the occurrence of the related entities.
One-to-One (1-1)
• ONE Student has only ONE Roll no.
• ONE Computer has only ONE mac address
• ONE Person has only ONE Pan Card
One-to-Many (1-M)
• ONE Mother has MANY children
• ONE Book has MANY pages
• ONE House has MANY windows
Many-to-Many (M-M)
• ONE Institute has MANY faculty members &
• ONE faculty member teaches in MANY institutes.&
• ONE Bank has MANY account holders &
• ONE account holder holds accounts in MANY Banks.
Language Support for an RDBMS
There are many RDBMS such as MySQL, Microsoft SQL Server, PostgreSQL, Oracle, etc. that allow us to create a database consisting of relations. These RDBMS also allow us to store, retrieve and manipulate data on that database through queries.