How do we store data so huge that it is beyond our capacity?

INTRODUCTION

Before the Internet, everything was going fine. After the Internet took off, technology companies such as Google and Facebook started facing a problem: the number of users is increasing day by day, and so is their data. There are approximately 4.57 billion Internet users in the world, and almost 346 million new users came online in one year.

WHAT ARE THE ISSUES FACED?

Any entry made by a user and stored in a database is data, and industries can use that data for commercial purposes. The issue is that the amount of data is increasing exponentially day by day, which raises the question:

HOW IS DATA INCREASING?

  1. SOCIAL MEDIA: Social media is where people connect with each other online and share their emotions and experiences through images, audio, video, etc. It is one of the major sources of Big Data. Instagram, Facebook, and WhatsApp collect a lot of data, such as personal details, pictures, and likes or reactions.
  • FACEBOOK: Facebook is a social media platform that had almost 2.7 billion active users as of the second quarter of 2020. It generates 4 petabytes of data per day, as people chat and upload images, videos, etc.
  2. GOOGLE: Google is a search engine with 4 billion users. It processes 3.5 billion searches per day, which works out to about 40,000 searches per second on average. Google processes approximately 20 petabytes of data per day through an average of 100,000 MapReduce jobs spread across its massive computing clusters.
  3. INTERNET OF THINGS (IoT): IoT connects a device to the Internet and makes it smarter. Nowadays we have smart air conditioners, smart rooms, etc. These devices generate a humongous amount of data. It is estimated that by 2025 there will be about 41.6 billion connected IoT devices generating data.
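The per-second figures behind these daily totals can be checked with simple arithmetic. The sketch below uses only the numbers quoted in the list above. The choice of decimal units (1 petabyte = 1,000,000 gigabytes) is an assumption made for the illustration.

```python
# Rough sanity check of the daily figures quoted in the list above.
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds

# Google: 3.5 billion searches per day (figure quoted above).
searches_per_day = 3_500_000_000
searches_per_second = searches_per_day / SECONDS_PER_DAY
print(f"{searches_per_second:,.0f} searches per second")  # prints: 40,509 searches per second

# Facebook: 4 petabytes per day (figure quoted above).
# Assumption: decimal units, so 1 PB = 1,000,000 GB.
facebook_gb_per_day = 4 * 1_000_000
facebook_gb_per_second = facebook_gb_per_day / SECONDS_PER_DAY
print(f"{facebook_gb_per_second:.1f} GB per second")
```

So "about 40,000 searches per second" is a rounded version of the exact quotient.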

There are many other reasons why data is increasing.

WHAT IS BIG DATA?

Big Data is a problem: a tsunami of data that is increasing exponentially day by day. Examples of Big Data come from science, astronomy, sensor networks, medical records, social data, etc.

Problems with Big Data:

  1. Huge Volumes
  2. Data in different types and formats
  3. Impacting the Business

CHALLENGES

  1. STORING THE DATA: Data is arriving in huge volumes, and where to store it is a big issue. Storing such a huge amount of data in a traditional system is not possible. Buying one expensive machine with a huge storage capacity is not a good idea either because, as challenge 3 shows, it raises another issue. Suppose we have a 500 MB file but only 200 MB of storage left: what do we do?
  2. VARIOUS FORMATS OF DATA: Earlier, we stored data in relational databases, but currently about 80% of data is unstructured. There are now different types of data (structured, unstructured, and semi-structured), and it is hard to handle all of them in a traditional manner.
  3. PROCESSING DATA FASTER: Take an example. We have a 100 MB hard disk and store data on it. More data arrives, so we increase its size from 100 MB to 500 MB. Then still more data arrives, and we increase its size again and again. Now all the data is stored, but have you thought about how we are going to retrieve or process it? Although CPU speed, RAM, and disk capacity have improved a lot, disk read/write speed has not. For the last 7 to 10 years, the read/write speed of a disk has stayed at around 80 MB/s.

So these are the problems industries face when data turns into Big Data.
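To put a number on the processing challenge, here is a minimal sketch that estimates how long it takes to read a dataset at the 80 MB/s quoted above. The 1 TB dataset size, the 100 disks, and the assumption of perfectly even, overhead-free parallel reads are all hypothetical values chosen for illustration.

```python
def read_time_seconds(size_mb: float, speed_mb_per_s: float, disks: int = 1) -> float:
    """Idealized time to read `size_mb` of data spread evenly over `disks`
    disks that are all read in parallel at `speed_mb_per_s` each."""
    return size_mb / (speed_mb_per_s * disks)


DATASET_MB = 1_000_000   # hypothetical 1 TB dataset (decimal units)
DISK_SPEED = 80          # MB/s, the figure quoted in the article

one_disk = read_time_seconds(DATASET_MB, DISK_SPEED)
hundred_disks = read_time_seconds(DATASET_MB, DISK_SPEED, disks=100)

print(f"1 disk:    {one_disk / 3600:.2f} hours")
print(f"100 disks: {hundred_disks / 60:.2f} minutes")
```

Under these assumptions a single disk needs 12,500 seconds (about three and a half hours), while 100 disks reading in parallel need 125 seconds. Making one disk bigger does nothing for this number, which is why a single large disk does not solve the problem.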

TYPES OF BIG DATA

Structured data is data organized in rows and columns, as in a relational database.

Examples: stock information, credit card details, hospital medical records, bank records, etc.

Facebook developed its own SQL-based query language for handling Big Data, known as Hive Query Language.

Unstructured data includes images, audio, video, etc. Almost 80% of data is unstructured, and much of it is generated by social media.

Semi-structured data includes JSON, XML, CSV files, tab-delimited files, log files, etc.

Log files record what happens in an application from login to logout. On Facebook, for example, when we log in, what we do, and when we log out are all stored in a log file.
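As a small illustration of semi-structured data, the sketch below parses a few JSON log records. The field names (`user`, `event`, `time`, `media`) and the values are hypothetical and are not taken from any real application's log format.

```python
import json

# Hypothetical log lines: every record is valid JSON, but the records do
# not all share the same set of fields, which is typical of semi-structured data.
log_lines = [
    '{"user": "alice", "event": "login", "time": "2020-09-01T10:00:00"}',
    '{"user": "alice", "event": "post", "time": "2020-09-01T10:05:00", "media": "image"}',
    '{"user": "alice", "event": "logout", "time": "2020-09-01T10:30:00"}',
]

records = [json.loads(line) for line in log_lines]

# Fields that every record has can be read directly.
events = [record["event"] for record in records]

# Optional fields need a default, because there is no fixed schema.
media = [record.get("media") for record in records]

print(events)
print(media)
```

Unlike a relational table, nothing forces each record to have the same columns, so the reading code has to cope with missing fields.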

Characteristics of Big Data:

Big Data is characterized by three important properties:

  1. Volume
  2. Velocity
  3. Variety

Volume

Data is generated from many sources:

  • Hospitals keeping records of all patients, doctors, nurses, medical staff, etc.
  • Social media.
  • Google Drive and Dropbox.
  • Organizations, etc.

The amount generated is called the volume, or size, of data.

Velocity

Velocity is the speed at which data comes in and goes out.

Example: if I make a post on LinkedIn, velocity is how fast the post is stored, processed, and retrieved by others. It is the speed of data.

Variety

Data comes in many formats, such as CSV, Excel, video, songs, text, PDF, photos, etc. This is called the variety of data.

DISTRIBUTED STORAGE

Distributed storage is used when a file cannot be stored on one PC, so we split the file and store the pieces on different PCs. Let's understand with an example: we have a 100 MB file but only 50 MB of storage on a machine, so we cannot store the file as it is. Rather than scaling vertically (making one machine bigger), we can scale horizontally: split the file and spread the pieces across several machines.
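The splitting idea can be sketched in a few lines. This is a toy model, not how any particular system implements it: the function names, the node names `pc1` and `pc2`, and the round-robin placement are all assumptions made for illustration, and 100 bytes stand in for the 100 MB file.

```python
from typing import Dict, List, Tuple


def split_into_blocks(data: bytes, block_size: int) -> List[bytes]:
    """Cut the data into consecutive blocks of at most `block_size` bytes."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def place_blocks(blocks: List[bytes], nodes: List[str]) -> Dict[str, List[Tuple[int, bytes]]]:
    """Hand the blocks out to the nodes in round-robin order, remembering
    each block's index so the file can be put back together later."""
    placement: Dict[str, List[Tuple[int, bytes]]] = {node: [] for node in nodes}
    for index, block in enumerate(blocks):
        placement[nodes[index % len(nodes)]].append((index, block))
    return placement


def reassemble(placement: Dict[str, List[Tuple[int, bytes]]]) -> bytes:
    """Collect the blocks from every node and join them in index order."""
    indexed = [pair for stored in placement.values() for pair in stored]
    return b"".join(block for _, block in sorted(indexed, key=lambda pair: pair[0]))


file_data = bytes(range(100))   # 100 bytes standing in for a 100 MB file
blocks = split_into_blocks(file_data, block_size=50)
placement = place_blocks(blocks, nodes=["pc1", "pc2"])

print({node: [index for index, _ in stored] for node, stored in placement.items()})
print(reassemble(placement) == file_data)
```

Each node only has to hold 50 bytes, yet the full file can still be rebuilt from the pieces as long as the block order is recorded somewhere.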

SOLUTION TO BIG DATA

The solution to Big Data, Hadoop, was given by Doug Cutting. Hadoop is named after his son's toy elephant.

Hadoop stores and processes data in a distributed and parallel manner.
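The parallel processing idea can be illustrated with a toy version of the MapReduce pattern mentioned earlier. This is plain Python running on one machine, not the Hadoop API: the function names and the two text chunks are hypothetical, and in a real cluster each chunk would be mapped on a different machine.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def map_phase(chunk: str) -> List[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one chunk of text."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs: List[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all the emitted values by their key (the word)."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for word, count in pairs:
        groups[word].append(count)
    return groups


def reduce_phase(groups: Dict[str, List[int]]) -> Dict[str, int]:
    """Reduce: add up the values of each key to get the final counts."""
    return {word: sum(counts) for word, counts in groups.items()}


# Two chunks of one larger text, as if stored on two different machines.
chunks = ["big data is big", "data is everywhere"]

mapped = [pair for chunk in chunks for pair in map_phase(chunk)]
word_counts = reduce_phase(shuffle(mapped))
print(word_counts)
```

Because `map_phase` looks at one chunk at a time, the chunks can be processed independently, and that independence is what allows the work to be spread over many machines.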

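TWO MAIN COMPONENTS OF HADOOP ARE:

  1. HDFS (Hadoop Distributed File System), which stores the data
  2. MapReduce, which processes the data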

HADOOP ARCHITECTURE

MASTER/SLAVE ARCHITECTURE

The NameNode is the master and the DataNodes are the slaves.

The NameNode runs on expensive hardware and stores the metadata.

The DataNodes run on commodity hardware and store the actual files, which are divided into pieces (input splits) and copied according to the replication factor.
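The division of labor between the two kinds of node can be sketched with a toy model. Everything here is an assumption made for illustration: the function names, the DataNode names, and the simple round-robin placement, which is much simpler than the placement rules a real Hadoop cluster uses. The replication factor of 3 matches the HDFS default.

```python
from typing import Dict, List, Set


def assign_replicas(num_blocks: int, datanodes: List[str],
                    replication_factor: int) -> Dict[int, List[str]]:
    """Toy NameNode metadata: for each block, the list of DataNodes that
    hold a copy of it. Copies go to consecutive nodes, wrapping around."""
    if replication_factor > len(datanodes):
        raise ValueError("replication factor cannot exceed the number of DataNodes")
    return {
        block_id: [datanodes[(block_id + r) % len(datanodes)]
                   for r in range(replication_factor)]
        for block_id in range(num_blocks)
    }


def readable_blocks(metadata: Dict[int, List[str]], failed: Set[str]) -> List[int]:
    """Blocks that still have at least one copy on a DataNode that is up."""
    return [block_id for block_id, holders in metadata.items()
            if any(node not in failed for node in holders)]


datanodes = ["dn1", "dn2", "dn3", "dn4"]   # hypothetical DataNode names
metadata = assign_replicas(num_blocks=4, datanodes=datanodes, replication_factor=3)

for block_id, holders in metadata.items():
    print(block_id, holders)

# Even if two DataNodes fail, every block still has a surviving copy.
print(readable_blocks(metadata, failed={"dn1", "dn2"}))
```

The metadata dictionary is small compared with the blocks themselves, which fits the split described above: one reliable master that knows where everything is, and many cheap machines that hold the data.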

#bigdata #hadoop #bigdatamanagement #arthbylw #vimaldaga #righteducation #educationredefine #rightmentor
#worldrecordholder #ARTH #linuxworld #makingindiafutureready #righeudcation
