File System vs DBMS: A Comparison of Data Storage and Management

Introduction

In this blog post, we compare two ways of storing and managing data: file systems and database management systems (DBMSs). We explain what they are, how they work, and what their advantages and disadvantages are.

What is a File System?

A file system is a way of organizing and storing files on a storage device, such as a hard disk, a flash drive, or a CD-ROM. A file system consists of different files that are grouped into directories or folders. Each file has a name, a location, and some attributes, such as size, type, permissions, etc. A file system performs basic operations, such as creating, deleting, renaming, copying, moving, and searching files.

A file system can be used to store any kind of data, such as text documents, images, audio, video, etc. However, a file system does not have any knowledge of the structure or meaning of the data inside the files. For example, a file system does not know that a file contains a table of student records or a list of products. Beyond file-level permissions, a file system also does not provide any mechanisms for enforcing the integrity or consistency of the data inside the files, or for coordinating concurrent updates to individual records.
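As a rough illustration of this point, the following Python sketch writes a small comma-separated file and then looks at it the way the file system does. The file name and contents are illustrative assumptions; the point is only that the file system reports metadata and returns bytes, while splitting those bytes into rows and fields is left to the application.

```python
import os
import stat
import tempfile

# Illustrative content; the file system has no idea these are "student records".
content = "roll_no,name,course\n101,Rajesh,MCA\n"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "student.txt")
    # newline="" keeps "\n" as-is on every platform, so the size is predictable.
    with open(path, "w", encoding="utf-8", newline="") as f:
        f.write(content)

    # What the file system tracks: metadata about the file as a whole.
    info = os.stat(path)
    size_in_bytes = info.st_size
    is_regular_file = stat.S_ISREG(info.st_mode)
    permissions = stat.filemode(info.st_mode)  # e.g. '-rw-r--r--' on many Unix systems

    # What the file system hands back: an uninterpreted sequence of bytes.
    with open(path, "rb") as f:
        raw = f.read()

# Interpreting the bytes as rows and fields is entirely the application's job.
rows = [line.split(",") for line in raw.decode("utf-8").splitlines()]

print(size_in_bytes, is_regular_file, permissions)
print(rows)
```

Nothing in the metadata says how many records the file holds or what its columns mean; that knowledge lives only in the code that parses it.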

What is a DBMS?

A DBMS is a software application that manages a collection of related data. A DBMS stores data in a structured and organized way, using tables, records, fields, keys, indexes, etc. A DBMS also provides various functions and tools for manipulating, querying, analyzing, and maintaining the data. For example, a DBMS can perform operations such as inserting, updating, deleting, selecting, sorting, filtering, grouping, aggregating, joining, and calculating data.

A DBMS can be used to store any kind of data that has some logical relationships and dependencies. For example, a DBMS can store data about students, courses, grades, teachers, etc. A DBMS also provides mechanisms for ensuring data integrity, security, consistency, and concurrency. For example, a DBMS can enforce rules such as primary keys, foreign keys, unique constraints, check constraints, etc.
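As a small, hedged illustration of a few of the operations listed above, the sketch below uses SQLite through Python's standard `sqlite3` module. SQLite is only a convenient stand-in for a DBMS here; the `student` table, its columns, and the course change for roll number 102 are assumptions made up for the example.

```python
import sqlite3

# Throwaway in-memory database; the table layout is an assumption for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student ("
    " roll_no INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " course TEXT NOT NULL)"
)

# Inserting
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(101, "Rajesh", "MCA"), (102, "Riya", "MBA"), (103, "Amit", "B.Tech")],
)

# Updating (a hypothetical change, only to show the operation)
conn.execute("UPDATE student SET course = ? WHERE roll_no = ?", ("MCA", 102))

# Selecting, filtering, and sorting
mca_names = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM student WHERE course = ? ORDER BY name", ("MCA",)
    )
]

# Grouping and aggregating
per_course = conn.execute(
    "SELECT course, COUNT(*) FROM student GROUP BY course ORDER BY course"
).fetchall()

print(mca_names)   # ['Rajesh', 'Riya']
print(per_course)  # [('B.Tech', 1), ('MCA', 2)]
```

Each operation is stated declaratively as a query; the DBMS decides how to locate, sort, and group the rows.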

Comparison between File System and DBMS

The following table summarizes some of the main differences between file systems and DBMSs:

Criteria	File System	DBMS
Structure	Hierarchy of directories; file contents are unstructured	Structured (tables with a defined schema)
Data Redundancy	High	Low
Data Independence	Low	High
Data Consistency	Low	High
Data Integrity	Difficult to enforce	Easy to enforce
Data Security	Low	High
Data Recovery	No backup or recovery mechanism	Backup and recovery mechanism
Data Manipulation	No efficient query processing	Efficient query processing
Data Sharing	Difficult to share data among multiple users or applications	Easy to share data among multiple users or applications
Data Abstraction	No abstraction of data details	Abstraction of data details
Complexity	Low	High
Cost	Low	High

Example of File System and DBMS

To illustrate the difference between file systems and DBMSs, let us consider an example of storing data about students, subjects, and results.

File System Approach

In the file system approach, we can create three files: student.txt, subject.txt, and result.txt. Each file contains some fields separated by commas. For example,

student.txt:

roll_no,name,course
101,Rajesh,MCA
102,Riya,MBA
103,Amit,B.Tech

subject.txt:

sub_code,name,max_marks
CS101,C Programming,100
CS102,DBMS,100
CS103,OOP,100

result.txt:

roll_no,name,course,sub_code,name,max_marks,obtained_marks
101,Rajesh,MCA,CS101,C Programming,100,85
101,Rajesh,MCA,CS102,DBMS,100,90
101,Rajesh,MCA,CS103,OOP,100,80
102,Riya,MBA,CS101,C Programming,100,75
102,Riya,MBA,CS102,DBMS,100,95
102,Riya,MBA,CS103,OOP,100,70
103,Amit,B.Tech,CS101,C Programming,100,65
103,Amit,B.Tech,CS102,DBMS,100,60
103,Amit,B.Tech,CS103,OOP,100,55

In this approach, we can see that there are some problems:

There is a lot of data redundancy, as some fields are repeated in more than one file. For example, the name and course of each student are repeated in the result file. This wastes storage space and increases the risk of data inconsistency.
There is no data independence, as any change in the file structure or format will affect the applications that use the files. For example, if we want to add a new field or change the order of the fields in the student file, we have to modify all the applications that read or write the student file.
There is no data consistency, as there is no way to ensure that the data in different files are synchronized and valid. For example, there is no way to prevent inserting a record in the result file for a student who does not exist in the student file, or for a subject that does not exist in the subject file.
There is no data integrity, as there is no way to enforce rules or constraints on the data values. For example, there is no way to ensure that the obtained marks are less than or equal to the max marks, or that the roll number is unique for each student across all the files.
There is no fine-grained data security, as file permissions apply only to whole files and cannot protect individual records or fields. For example, anyone who has access to the files can read, write, delete, or copy all of their contents without any further restriction or authentication.
There is no data recovery, as there is no backup or recovery mechanism in case of system failure or data loss. For example, if the system crashes while entering some data in the result file, the content of the file may be corrupted or lost.
There is no efficient data manipulation, as there is no query language or tool for performing complex operations on the data. For example, if we want to find out the average marks of each student or the highest marks in each subject, we have to write a program that reads and processes all the files.
There is no transaction atomicity. If an update spans multiple files and a failure occurs partway through, the data may be modified in one file but not in the corresponding file, leaving the files inconsistent with each other.
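To make the data-manipulation and consistency points above more concrete, here is a minimal Python sketch of the kind of program the file system approach requires. The file contents are held in strings so the example is self-contained; the function name `average_marks` and the extra row for roll number 999 are hypothetical, introduced only for illustration.

```python
import csv
import io
from collections import defaultdict

# In-memory stand-ins for student.txt and result.txt from the example.
STUDENT_TXT = """\
roll_no,name,course
101,Rajesh,MCA
102,Riya,MBA
103,Amit,B.Tech
"""

RESULT_TXT = """\
roll_no,name,course,sub_code,name,max_marks,obtained_marks
101,Rajesh,MCA,CS101,C Programming,100,85
101,Rajesh,MCA,CS102,DBMS,100,90
101,Rajesh,MCA,CS103,OOP,100,80
102,Riya,MBA,CS101,C Programming,100,75
102,Riya,MBA,CS102,DBMS,100,95
102,Riya,MBA,CS103,OOP,100,70
103,Amit,B.Tech,CS101,C Programming,100,65
103,Amit,B.Tech,CS102,DBMS,100,60
103,Amit,B.Tech,CS103,OOP,100,55
"""


def average_marks(result_text):
    """Average obtained marks per roll number, computed by hand."""
    reader = csv.reader(io.StringIO(result_text))
    next(reader)  # skip the header line
    totals = defaultdict(lambda: [0, 0])  # roll_no -> [sum, count]
    for row in reader:
        # Field positions are hard-coded: reordering the columns in the
        # file would silently break this program (no data independence).
        roll_no, obtained = row[0], int(row[6])
        totals[roll_no][0] += obtained
        totals[roll_no][1] += 1
    return {roll_no: s / n for roll_no, (s, n) in totals.items()}


averages = average_marks(RESULT_TXT)
print(averages)  # {'101': 85.0, '102': 80.0, '103': 60.0}

# Nothing stops us from appending a result for a student who does not exist,
# with marks above the maximum. The file system accepts any bytes.
tampered = RESULT_TXT + "999,Nobody,MCA,CS101,C Programming,100,120\n"
known = {row[0] for row in list(csv.reader(io.StringIO(STUDENT_TXT)))[1:]}
orphans = sorted(set(average_marks(tampered)) - known)
print(orphans)  # ['999']
```

Every new question (highest marks per subject, students below a threshold, and so on) needs another hand-written loop like this one, and any validity check has to be reimplemented in each program that writes the files.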

DBMS Approach

In the DBMS approach, we can create three tables: Student, Subject, and Result. Each table has some columns and rows. For example,

Student:

roll_no	name	course
101	Rajesh	MCA
102	Riya	MBA
103	Amit	B.Tech

Subject:

sub_code	name	max_marks
CS101	C Programming	100
CS102	DBMS	100
CS103	OOP	100

Result:

roll_no	sub_code	obtained_marks
101	CS101	85
101	CS102	90
101	CS103	80
102	CS101	75
102	CS102	95
102	CS103	70
103	CS101	65
103	CS102	60
103	CS103	55
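The three tables above can be sketched with SQLite via Python's `sqlite3` module (an assumption made for convenience; any relational DBMS would do). The column types are also assumptions, since the post does not specify them. The sketch shows that the wide rows of result.txt can be recomputed with a join whenever needed, and that the average and highest marks come from one query each.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.executescript("""
CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT, course TEXT);
CREATE TABLE subject (sub_code TEXT PRIMARY KEY, name TEXT, max_marks INTEGER);
CREATE TABLE result  (roll_no INTEGER, sub_code TEXT, obtained_marks INTEGER);
""")

students = [(101, "Rajesh", "MCA"), (102, "Riya", "MBA"), (103, "Amit", "B.Tech")]
subjects = [("CS101", "C Programming", 100), ("CS102", "DBMS", 100), ("CS103", "OOP", 100)]
results = [
    (101, "CS101", 85), (101, "CS102", 90), (101, "CS103", 80),
    (102, "CS101", 75), (102, "CS102", 95), (102, "CS103", 70),
    (103, "CS101", 65), (103, "CS102", 60), (103, "CS103", 55),
]
conn.executemany("INSERT INTO student VALUES (?, ?, ?)", students)
conn.executemany("INSERT INTO subject VALUES (?, ?, ?)", subjects)
conn.executemany("INSERT INTO result VALUES (?, ?, ?)", results)

# Rebuild the wide, redundant rows of result.txt on demand with a join,
# instead of storing the student and subject details nine times.
wide_rows = conn.execute("""
    SELECT s.roll_no, s.name, s.course,
           sub.sub_code, sub.name, sub.max_marks,
           r.obtained_marks
    FROM result AS r
    JOIN student AS s   ON s.roll_no = r.roll_no
    JOIN subject AS sub ON sub.sub_code = r.sub_code
    ORDER BY s.roll_no, sub.sub_code
""").fetchall()

# Average marks of each student.
averages = conn.execute(
    "SELECT roll_no, AVG(obtained_marks) FROM result"
    " GROUP BY roll_no ORDER BY roll_no"
).fetchall()

# Highest marks in each subject.
top_marks = conn.execute(
    "SELECT sub_code, MAX(obtained_marks) FROM result"
    " GROUP BY sub_code ORDER BY sub_code"
).fetchall()

print(wide_rows[0])  # (101, 'Rajesh', 'MCA', 'CS101', 'C Programming', 100, 85)
print(averages)      # [(101, 85.0), (102, 80.0), (103, 60.0)]
print(top_marks)     # [('CS101', 85), ('CS102', 95), ('CS103', 80)]
```

Because each student's name is stored once, correcting it is a single-row update rather than an edit to every line that mentions the student.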

In this approach, we can see that there are some advantages:

There is less data redundancy, as some fields are not repeated in more than one table. For example, the name and course of each student are stored only in the Student table, and the name and max marks of each subject are stored only in the Subject table. This saves storage space and reduces the risk of data inconsistency.
There is more data independence, as many changes in the table structure will not affect the applications that use the tables. For example, if we add a new column or change the order of the columns in the Student table, applications that refer to columns by name keep working without modification.
There is more data consistency, as there is a way to ensure that the data in different tables are synchronized and valid. For example, we can use primary keys and foreign keys to link the tables and prevent inserting records that do not match other tables. For instance, we can make roll_no the primary key of the Student table and sub_code the primary key of the Subject table. We can also make roll_no and sub_code foreign keys of the Result table, referencing the Student and Subject tables respectively. This way, we can ensure that every record in the Result table corresponds to a valid student and a valid subject.
There is more data integrity, as there is a way to enforce rules or constraints on the data values. For example, we can use unique constraints to ensure that the roll number is unique for each student, and check constraints (or triggers, when the limit is stored in another table) to ensure that the obtained marks do not exceed the max marks.
There is more data security, as there is a way to protect the tables from unauthorized access or modification. For example, we can use user accounts, passwords, roles, permissions, etc. to control who can read, insert, update, or delete data.
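The primary key, foreign key, and check constraints described above can be sketched as follows, again using SQLite through Python's `sqlite3` module as a stand-in DBMS. Several details are assumptions: the column types, the composite primary key on Result, the helper name `rejected`, and the fixed 0 to 100 range in the check constraint. The range is hard-coded because a SQLite CHECK constraint cannot contain a subquery, so it cannot look up max_marks in the Subject table; comparing against the per-subject maximum would need a trigger or application logic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE student (
    roll_no INTEGER PRIMARY KEY,
    name    TEXT NOT NULL,
    course  TEXT NOT NULL
);
CREATE TABLE subject (
    sub_code  TEXT PRIMARY KEY,
    name      TEXT NOT NULL,
    max_marks INTEGER NOT NULL
);
CREATE TABLE result (
    roll_no        INTEGER NOT NULL REFERENCES student(roll_no),
    sub_code       TEXT    NOT NULL REFERENCES subject(sub_code),
    -- Assumed fixed range; see the note above about max_marks.
    obtained_marks INTEGER NOT NULL CHECK (obtained_marks BETWEEN 0 AND 100),
    PRIMARY KEY (roll_no, sub_code)
);
""")

conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(101, "Rajesh", "MCA"), (102, "Riya", "MBA")],
)
conn.execute("INSERT INTO subject VALUES (?, ?, ?)", ("CS101", "C Programming", 100))
conn.execute("INSERT INTO result VALUES (?, ?, ?)", (101, "CS101", 85))


def rejected(sql, params):
    """Return True if the DBMS refuses the statement with an integrity error."""
    try:
        conn.execute(sql, params)
    except sqlite3.IntegrityError:
        return True
    return False


# Primary key: a second student with roll number 101 is refused.
dup_student = rejected("INSERT INTO student VALUES (?, ?, ?)", (101, "Someone", "MBA"))
# Foreign keys: results must point at an existing student and subject.
unknown_student = rejected("INSERT INTO result VALUES (?, ?, ?)", (999, "CS101", 50))
unknown_subject = rejected("INSERT INTO result VALUES (?, ?, ?)", (101, "CS999", 50))
# Check constraint: marks outside the allowed range are refused.
marks_too_high = rejected("INSERT INTO result VALUES (?, ?, ?)", (102, "CS101", 120))

result_count = conn.execute("SELECT COUNT(*) FROM result").fetchone()[0]
print(dup_student, unknown_student, unknown_subject, marks_too_high)  # True True True True
print(result_count)  # 1
```

The rules are declared once in the schema and apply to every application that writes to the tables. User accounts, roles, and permissions are not shown here because SQLite has no user management; client-server DBMSs typically provide them through statements such as GRANT and REVOKE.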

Conclusion

File-processing systems have major disadvantages:

i. Data redundancy and inconsistency

ii. Difficulty in accessing data

iii. Data isolation

iv. Integrity problems

v. Atomicity problems

vi. Concurrent-access anomalies

vii. Security problems

This is why we have DBMSs!
