Introduction To Data Science

This document provides an introduction to data science, including definitions of data, big data, and the challenges presented by large datasets. It discusses what data scientists do, such as extracting knowledge and insights from data. Examples are given of the large amounts of data being collected from sources like social media, online purchases, and medical studies. The high demand for data scientists is also noted. Key aspects of data science are outlined, including aggregating, analyzing, searching, and discovering patterns in data.

Introduction to Data Science

By Balijapalli Abhinav Reddy


Electronics and Communication Engineering
Annamacharya Institute of Technology & Sciences, Tirupati
CONTENTS
Data, Big Data and Challenges
Data Science
Introduction
Why Data Science
Data Scientists
What do they do?
Major/Concentration in Data Science
What courses to take.
DATA ALL AROUND

Lots of data is being collected and warehoused
Web data, e-commerce
Financial transactions, bank/credit transactions
Online trading and purchasing
Social Network
HOW MUCH DATA DO WE HAVE?
Google processes 20 PB a day (2008)
Facebook has 60 TB of daily logs
eBay has 6.5 PB of user data + 50 TB/day (5/2009)
1000 genomes project: 200 TB
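To put the daily figures above in perspective, they can be converted into a sustained data rate. The sketch below is only illustrative arithmetic: it assumes decimal units (1 TB = 10^12 bytes, 1 PB = 10^15 bytes) and an even spread over 24 hours, neither of which is stated on the slide. The names `daily_volumes` and `bytes_per_second` are made up for this example.

```python
# Convert the daily volumes quoted on the slide into an average throughput.
# Assumptions (not from the slide): decimal units and a uniform load all day.
SECONDS_PER_DAY = 24 * 60 * 60
TB = 10**12
PB = 10**15


def bytes_per_second(bytes_per_day):
    """Average rate in bytes/second for a given daily volume."""
    return bytes_per_day / SECONDS_PER_DAY


# Daily volumes as quoted on the slide.
daily_volumes = {
    "Google (2008)": 20 * PB,
    "Facebook daily logs": 60 * TB,
    "eBay new data (5/2009)": 50 * TB,
}

for name, volume in daily_volumes.items():
    rate_gb = bytes_per_second(volume) / 10**9
    print(f"{name}: about {rate_gb:.2f} GB/s on average")
```

Under these assumptions, 20 PB a day works out to roughly 231 GB every second, which is why such data cannot be handled by a single machine.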
BIG DATA
Big Data is any data that is expensive to manage and hard to extract value from
Volume
The size of the data
Velocity
The latency of data processing relative to the growing demand for interactivity
Variety and Complexity
The diversity of sources, formats, quality, and structures
TYPES OF DATA WE HAVE

Relational Data (Tables/Transaction/Legacy Data)
Text Data (Web)
Semi-structured Data (XML)
Graph Data
Social Network, Semantic Web (RDF), …
Streaming Data
You can only afford to scan the data once
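The single-scan constraint on streaming data means statistics have to be updated incrementally, without storing the stream. A minimal sketch, assuming numeric readings and using Welford's online algorithm (a standard technique, not something named on the slide); `RunningStats` and `summarize` are illustrative names.

```python
class RunningStats:
    """Mean and variance maintained in one pass with constant memory."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        # Welford's update: adjust the mean, then accumulate the deviation.
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self):
        # Population variance; 0.0 for an empty stream.
        return self._m2 / self.count if self.count else 0.0


def summarize(stream):
    """Consume an iterable exactly once and return its running statistics."""
    stats = RunningStats()
    for value in stream:
        stats.update(value)
    return stats


# A generator can only be read once, like a real stream (sample values are made up).
readings = (x for x in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
stats = summarize(readings)
print(stats.count, stats.mean, stats.variance)
```

Each value is seen once and then discarded, so memory use stays constant no matter how long the stream runs.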
WHAT TO DO WITH THESE DATA?
Aggregation and Statistics
Data warehousing and OLAP
Indexing, Searching, and Querying
Keyword based search
Pattern matching (XML/RDF)
Knowledge discovery
Data Mining
Statistical Modeling
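Two of the tasks listed above, aggregation and keyword-based search, can be sketched in a few lines of Python. This is a toy illustration, not how a data warehouse or search engine is actually built: the records, documents, and function names (`total_by_customer`, `build_index`, `search`) are all made up for the example.

```python
from collections import defaultdict

# Hypothetical transaction records: (customer, amount).
transactions = [
    ("alice", 30.0),
    ("bob", 12.5),
    ("alice", 7.5),
    ("carol", 20.0),
    ("bob", 2.5),
]


def total_by_customer(rows):
    """Aggregation: sum the amounts per customer (a group-by)."""
    totals = defaultdict(float)
    for customer, amount in rows:
        totals[customer] += amount
    return dict(totals)


# Hypothetical documents keyed by id.
documents = {
    1: "big data is hard to manage",
    2: "data science extracts knowledge from data",
    3: "statistics and data mining",
}


def build_index(docs):
    """Indexing: map each word to the set of documents that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def search(index, query):
    """Keyword search: ids of documents containing every query word."""
    words = query.lower().split()
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


index = build_index(documents)
print(total_by_customer(transactions))
print(search(index, "data mining"))
```

Building the index once up front is what makes later searches fast: a query looks up a few word entries instead of rescanning every document.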
BIG DATA AND DATA SCIENCE
“… the sexy job in the next 10 years will be statisticians.” Hal Varian, Google Chief Economist
The U.S. will need 140,000-190,000 predictive analysts and 1.5 million managers/analysts by 2018 (McKinsey Global Institute, June 2011)
New Data Science institutes are being created or repurposed: NYU, Columbia, Washington, UCB, ...
New degree programs, courses, boot-camps:
e.g., at Berkeley: Stats, I-School, CS, Astronomy…
One proposal (elsewhere) for an MS in “Big Data Science”
WHAT IS DATA SCIENCE
An area that manages, manipulates, extracts, and interprets knowledge from tremendous amounts of data
Data science (DS) is a multidisciplinary field of study whose goal is to address the challenges in big data
Data science principles apply to all data, big and small
WHY IS IT ?
Gartner’s 2014 Hype Cycle
DATA SCIENCE: REAL LIFE EXAMPLES
Companies learn your secrets, shopping patterns, and preferences
For example, can we know if a woman is pregnant, even if she doesn’t want us to know?
Data science and elections (2008, 2012)
1 million people installed the Obama Facebook app that gave access to info on “friends”
DATA SCIENTISTS

Data Scientist
The Sexiest Job of the 21st Century
They find stories and extract knowledge. They are not reporters.
DATA SCIENTISTS

Data scientists are the key to realizing the opportunities presented by big data. They bring structure to it, find compelling patterns in it, and advise executives on the implications for products, processes, and decisions.
WHAT DO DATA SCIENTISTS DO?
National Security
Cyber Security
Business Analytics
Engineering
Healthcare
And more…
CONCENTRATION IN DATA SCIENCE
Mathematics and Applied Mathematics
Applied Statistics/Data Analysis
Solid Programming Skills (R, Python, Julia, SQL)
Data Mining
Database Storage and Management
Machine Learning and discovery
THANK YOU
