
What Is Big Data?

Big data refers to extremely large data sets that are difficult to process using traditional database management tools. It presents challenges related to storage, processing, and approaches to handling the data. Hadoop was developed as an open-source solution to these problems, providing a framework to distribute storage and parallelize processing across clusters of commodity hardware. Major companies like IBM, Microsoft, Cloudera, and Hortonworks now offer their own Hadoop distributions to address big data.

Uploaded by

Vikas Sinha
Copyright
© All Rights Reserved
What is Big Data?

Big data is the term for a collection of data sets so large and complex that it becomes difficult to process
using on-hand database management tools or traditional data processing applications.

What challenges does Big Data produce?

What problems does Big Data pose?

What are the approaches to handling Big Data?

Big Data is a problem of storage and processing; Hadoop is a solution.

During 2003-2004, Google released white papers on GFS (Google File System) and MR (MapReduce).

Doug Cutting (Yahoo) developed Hadoop and MR (now open source under Apache).
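The notes name MR (MapReduce) without showing what it does. As a rough illustration only, here is a single-machine sketch of the map, shuffle, and reduce steps, assuming a word-count task. The task, the function names, and the sample input are all assumptions made for illustration; they are not from the notes and this is not Hadoop's API. On a real cluster the framework runs the map and reduce steps in parallel across many machines.

```python
from collections import defaultdict


def map_phase(line):
    # Map: emit a (word, 1) pair for every word in one input line.
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    # Shuffle: group all values by key, as the framework would do
    # between the map and reduce steps.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    # Reduce: combine all the counts collected for one word.
    return key, sum(values)


def word_count(lines):
    # Run the three steps in sequence on one machine (hypothetical driver).
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    sample = ["big data big cluster", "data on commodity hardware"]
    print(word_count(sample))
```

Each call to map_phase depends only on its own line, and each call to reduce_phase depends only on its own key, which is what lets a framework spread the work over many nodes.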

Hadoop Distributions

BigInsights (IBM): to sell WebSphere

HDInsight (Microsoft): to sell Azure (.NET)

Cloudera

Hortonworks

IBM Definition of Big Data

Volume

Velocity

Variety

Hadoop is fast, runs on commodity hardware, and requires no ETL or database.
