Before entering into the world of database let’s learn some basic Database Fundamentals

There are many ways to store data like in Notepad/Excel/ACCESS. Let’s have a look on these.

NOTEPAD: Which presents in any operating system (Windows/Linux/Mac) for free and can be used to store data. If the data is of small size and had less content then we can use notepad effectively but if the data is large and had more data with more columns then we can’t use it because we will face following problems

      1)   It doesn’t support usage of tables
      2)   We can’t update it if the data size is more
      3)   There will be redundancy problem

To overcome above problems, we will go for another software which is not free but cheap.

Creating sample database in Notepad
Migrating from Notepad to Excel
Data Redundancy problem 

MS EXCEL: Which comes with Microsoft Office which can be used in windows. For Mac and Linux operating systems it comes in different flavor open office. Excel had lot more options than notepad. By using Excel, we can store large amount of data and we can compare the data with different tables. We also can draw statistical graph from the data directly for comparison. That’s the reason it is used by many small-scale businesses who had large data in their systems. But with Excel also we had following problems

       1) There is a problem of redundancy
       2)   File size

To overcome these problems, we are moving to another product which is also from Microsoft.

  • Creating database using EXCEL
  • Modifying, updating database in EXCEl
  • Pivot table and some concepts in EXCEL
  • Functions in EXCEL used for database

MS ACCESS: This is also a product which comes with MS Office. To overcome data redundancy we use certain rules (SQL commands) in it to store data. By using these rules we can easily search/delete/update/modify any required data filed. We can directly import EXCEL data into ACCESS whenever needed. By using Access we will face certain problems like

1)   File size

2)   Security

As the data size increases there will be a problem of adding content /updating /searching /security.

  • Migrating from Excel to Access
  • Creating database using Access
  • Updating or deleting a cell in Access

To overcome the following potential problem, 

  • Size
  • Updating
  • Accuracy
  • Security
  • Redundancy
  • Importance

We use database concept to maintain these databases. The concept which we are going to use is DBMS.

Database is made up of our rules & Data Base Management System (DBMS) will hold those rules.

There are several types of DBMS among those few are ORACLE, SQL SERVER, MY SQL, POSTGRE SQL, MONGO SQL.

We use this Data Base Management System software’s to manipulate and control one (or) more databases.

DBMS falls into broad category. The foremost common one is RDBMS (Relational Data Base Management Systems) 

 Other types of DBMS are:

  • Hierarchical DBMS
  • Network DBMS
  • Object-Oriented DBMS
  • No SQL DBMS

According to Wikipedia RDBMS defined as:

“A relational database management system (RDBMS) is a database management system (DBMS) that is based on the relational model as invented by E. F. Codd, of IBM’s San Jose Research Laboratory. Many popular databases currently in use are based on the relational database model.” 

RDBMS features:

  • A database without a table is an empty shell
  • A database is constructed using one/more tables
  • Each table consist of rows and columns
  • Every single row represents only one single thing
  • Each column gives a particular data. Column says which type of data we need to insert in that row.
  • We can store data in a database using tables and by using values and primary keys we find data in those tables.
  • Each row in a database requires a key. And this key will say the row which we needed (key is a way to identify just one particular row).
  • We create unique rows to identify things

There are two types of keys that we use in these tables

  1. Primary Key (PK)
  2. Foreign Key (FK)
  3. Candidate Key
  4. Composite Key

 PRIMARY KEY (PK):

The primary key is very important because it holds or shows the data of particular row and which connects the data from one table to another. The primary key of a relational table uniquely Identifies each record in the table. It can either be a normal attribute that is generated to be unique (such as social security number in a table with no more than one record per session) or it can be generated by the DBMS (such as globally unique identifier or GUID in Microsoft SQL server). Primary keys may consist of a single attribute or multiple attributes in combination. 

Points to remember:

  • A unique column is the column which doesn’t have same number or text in the entire row
  • All database allows us to create unique columns to identify the particular data
  • If we don’t have any unique data then we need to create one since database is made up of our rules 

FOREIGN KEY (FK):

A foreign key is a field in a relational table that matches the primary key column of another table. The foreign key can be used to cross-reference tables. 

Some tricky points:

  1. Customer ID in customer table is a primary key but when it is added to order table it isn’t a primary key any more.
  2. Customer ID in ordered table need not be unique because same customer can place different orders.

FK will be existed when there is a relation between tables. These are the common relations that we see in tables

  1. One-To-One
  2. One-To-Many
  3. Many-To-Many

 

One-To-One relationship says the relation between one table to other table. 

One-To-Many is the most common relationship in database. It’s simply nothing but having relation of one table with many tables. This can be explained by following example

Many-To-Many relationship is nothing but having relationship between many tables. We can’t create this directly in most databases, but we can do it indirectly. 

Here one book may have different authors and one author may have different books so it is a many-to-many relation. We can’t just place Author ID since it holds only one number. To create many-to-many relation we need to create a new “junction/linking table”.

CANDIDATE KEY:

A candidate key is a combination of attributes that can be uniquely used to identify a database record without any external data. Each table may have one/more candidate keys. One of these candidate keys is selected as the table primary key. 

COMPOSITE KEY:

A composite key in the context of relational database is a combination of two/more columns in a table that can be used to uniquely identify each row in the table. Uniqueness is only guaranteed when the columns are combined. When taken individually the columns don’t guarantee uniqueness.

Here in the above table there is no single column that can make a uniqueness to make a primary key. So, if we combine two columns then we will get uniqueness. Therefore, we are making a primary key by combining two columns. This type of primary key is known as composite key

Points to keep in mind:

  • When we are entering into data/changing data we need transactions.
  • When we are working with transactions we come across ACID test

Atomic         : Transaction should complete completely or return to its original state

Consistent   : Takes the database from one state to other by applying rules of database

Isolated        : Data in transaction is essentially locked for that moment

Durable        : Transaction should be complete/durable 

PLANNING DATABASE

To plan database we use ER diagrams. In ER diagrams

  1. Entities will become tables
  2. Attributes will become columns 

To minimize the redundancy and dependency in relational database we Database Normalization technique.

DATABASE NORMALIZATION

Normalization usually involves dividing large tables into smaller (and less redundant) tables and defining relationships between them. The objective is to isolate data so that additions, deletions, and modifications of a field can be made in just one table and then propagated through the rest if the database using the defined relationships.

DATABASE DENORMALIZATION

In a relational database, denormalization is an approach to speeding up read performance (data retrieval) in which the administrator selectively adds back specific instances of redundant data after the data structure has been normalized. A denormalized database shouldn’t be confused with a database that has never been normalized.