Python Unit 1

svu bcom python unit 1

Uploaded by

rajanikanth

UNIT -1

Introduction to Data Science

Data science and its importance:


Data science uses scientific systems, algorithms, processes, and other methods to gain
insight and knowledge from data in different forms, both unstructured and structured. It is a lot
like data mining.

The concept of data science is to help unify statistics, machine learning, data analysis,
and other related methods. That way people will better understand and analyze information with
data. It uses different theories and techniques that are drawn from different fields within the
context of computer science, information science, statistics, and mathematics.

Data science has popped up in many different contexts over the past 30 years, but it didn't
become established until recent years. In the '60s it was referred to by the term datalogy. In 2001,
William Cleveland introduced data science as its own discipline.

The IEEE launched a Task Force on Data Science and Advanced Analytics in 2013. The
European Association for Data Science was established in Luxembourg in the same year. The
IEEE held its first international conference on the topic in 2014. Later in 2014, The Data
Incubator created a data science fellowship.

Data is at the core of data science. There are troves of raw information that are being
streamed in and then stored in data warehouses. There is a lot to learn through mining it. There
are advanced capabilities that can be built from it. This means that data science is basically using
data in creative ways to add business value.

The main aspect of data science is discovering new results from data. People are
exploring at a granular level to understand and mine complex inferences, behaviors, and trends.
It's about uncovering hidden information that may be able to help companies make smarter
choices for their business.

For example:
o Netflix mines data to look for movie viewing patterns to better
understand users' interests and to make decisions on the Netflix series it should
produce.

o Target tries to find the major customer segments in its customer base and their
shopping behaviors, which helps them to guide messaging to other market groups.

o Procter & Gamble looks to time series models to help it understand
future demand and plan production levels.

So how does the data scientist mine all this information? It begins with data exploration.
When a data scientist is given a challenging question, they become a detective. They will start to
investigate leads, and then try to understand characteristics or patterns in the data. This means
they need a lot of analytical creativity.

One of the classic examples of a data product is an engine which takes in user data, and then
creates a personalized recommendation based upon that data.

The following are some examples of data products:


o The recommendation engine that Amazon uses suggests new items to its users,
which is determined by their algorithms.

o Spotify recommends new music.

o Netflix recommends new movies.

o The spam filter in Gmail is a data product. This is a behind the scenes algorithm
that processes the incoming mail and decides whether or not it is junk.

The computer vision that is used for self-driving cars is also a data product. Machine
learning algorithms can recognize pedestrians, traffic lights, other cars, and so on.
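
As a rough illustration of a data product like the recommendation engines listed above, the following Python sketch scores items by how often they are bought by users with overlapping purchases. The purchase histories and the function name `recommend` are made up for this example; real engines such as Amazon's use far more sophisticated methods.

```python
from collections import Counter

# Hypothetical purchase histories: user -> set of items bought.
purchases = {
    "asha": {"laptop", "mouse", "laptop bag"},
    "ravi": {"laptop", "mouse"},
    "meena": {"laptop", "laptop bag", "headphones"},
    "john": {"mouse", "mouse pad"},
}


def recommend(user, purchases, top_n=2):
    """Suggest items bought by users who share at least one item with `user`."""
    owned = purchases[user]
    scores = Counter()
    for other, items in purchases.items():
        if other == user:
            continue
        overlap = len(owned & items)  # number of shared items measures similarity
        if overlap == 0:
            continue
        for item in items - owned:  # only suggest items the user does not have
            scores[item] += overlap
    # Sort by score (high to low), then by name so ties are deterministic.
    ranked = sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))
    return [item for item, _ in ranked[:top_n]]


print(recommend("ravi", purchases))  # ['laptop bag', 'headphones']
```

The engine takes in user data (the purchase sets) and returns a personalized suggestion, which is the essence of the data product described above.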

Data science and data scientists add value to businesses in many different ways.

Let's look at the eight major areas.


1. Empowers officers and managers to make better decisions.
2. Directs action based upon trends that will help define goals.
3. Challenges the staff to use best practices and to focus on important issues.
4. Identifies opportunities.
5. Supports decisions with data-driven, quantifiable evidence.
6. Tests management decisions.
7. Identifies and refines target audiences.
8. Helps to recruit the best talent for the organization.

---
Q. What is data science and its importance?
Q. Explain how data science is used.

---
Advantages of data science:

Data science can be seen everywhere, from the information within your smartphone and
apps to the idea of a car that drives itself. This new modern phenomenon is why data scientists
are becoming more necessary.

When data science is brought into a business, it brings along with it several different benefits.
Among those are the following seven:
1. It will monetize data:
Facebook turns the data that it gets from its subscribers into money, and so can any
business. For example, many retail sites show a section that says
“Customers Who Bought This Item Also Bought,” listing items that are likely to
earn the retailer another sale. This type of creative analysis is what allows a company to
increase its revenue.

2. It will mitigate company risk:


A data scientist will analyze client churn patterns and allow a company to react
proactively if it notices a trend of customers starting to favor another business. To win
those customers back, the business can send out teaser deals or discounts. The data
scientist will also evaluate the data of other businesses that a company is
looking to partner with. This will help to minimize the possible risk.

3. It will help a company get a better understanding of their customers:


The behaviors of customers will change with time, and it's hard to monitor those changes
without using data science. For example, Airbnb, a website that helps hosts and travelers
rent and find affordable places to stay, recently looked at the behavior of its consumers during
website searches and changed its search algorithm to give them better results. Because of this
change, its reservations and bookings went up.

4. It will give businesses unique insights:


Let's assume that a data scientist finds out that there is a connection between winter
snowstorms and the sale of Cocoa Krispies. If a grocer had this information, they could
strategically place Cocoa Krispies in the store during snowstorms to increase their
sales. This connection would be nearly impossible to find without data science.

5. It will help with business expansion:


A data scientist could end up uncovering new markets that could be interested in a
business's service or product. Even when an advertising campaign is solid, the data scientist
can look it over and figure out the type of customers gained by a certain
initiative so that the business can adjust future campaigns. Data science can be used to find new
trends, or to figure out which inventory items will have a faster impact on revenues.

6. It will improve forecasting:


Neural networks and machine learning have been used to mine business data for quite
some time to predict future results, and there are a lot of data scientists that have skills in both.
For example, a business that deals in car repair could analyze spikes in visits over the past few
years so that they can come up with a better schedule for their employees.

7. It will provide businesses with objective decisions:


Data is a powerful tool that speaks for itself. Having verifiable and solid data on hand
will help a business make better decisions that are based on objectivity. This takes
precedent and emotion out of the problem. If emotions, egos, or the tendency to always do
things the same way have caused a business problem in the past, data science is able to help.
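
The forecasting benefit (item 6) can be sketched with a very naive seasonal average. The visit counts below are invented for a hypothetical car repair shop, and `busy_months` is an illustrative helper, not a standard forecasting method; a real project would more likely use time series or machine learning models.

```python
from statistics import mean

# Hypothetical monthly visit counts for a car repair shop, two years of data.
visits = {
    2022: [40, 38, 45, 50, 52, 70, 75, 72, 55, 48, 42, 60],
    2023: [42, 40, 47, 53, 55, 74, 79, 70, 57, 50, 44, 64],
}

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def busy_months(visits, factor=1.2):
    """Return months whose average visits exceed `factor` times the overall average."""
    # Average each calendar month across all years (a naive seasonal forecast).
    monthly_avg = [mean(year[m] for year in visits.values()) for m in range(12)]
    overall = mean(monthly_avg)
    return [MONTHS[m] for m, avg in enumerate(monthly_avg) if avg > factor * overall]


print(busy_months(visits))  # ['Jun', 'Jul', 'Aug']
```

With this output, the shop could plan to schedule more employees for the summer months.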
---
Q. What are the advantages of data science?
---
The process of data science:

When a non-technical boss asks a data scientist to figure out a data problem, the
description can end up being ambiguous at first. It becomes your job as the data
scientist to change the task into a problem, figure out how you can solve it, and then
present your solution to the boss. This process uses several steps:

o Frame the problem: Who is the client? What are they asking you, exactly, to
solve? How are you able to translate the ambiguous request into a well-defined and
concrete problem?

o Collect the data that you need to solve the problem. Do you already have
access to this data? If you do, what parts of this data can help? If you don't, what
data do you need? What resources, such as infrastructure, time, and money, do you
need to get the data to a usable form?

o Process your data: Raw data is very rarely able to be used right out of the
box. There will be errors in the collection, missing values, corrupt records, and lots
of other challenges you have to take care of. You first have to clean the data to
change it into a form that you will be able to analyze.

o Explore the data: After you have the data cleaned, you need a high-level
understanding of the information that is contained in it. What are the obvious
correlations or trends that you see within the data? What high-level characteristics
does it have, and are any of them more important than the others?

o Perform in-depth analysis: This is typically the core of the project. This is
where you use the machinery of data analysis to find the best predictions and
insights.

o Communicate the results of your analysis: All of the technical results and
analysis that you have found aren't very valuable unless you are able to explain them in
a way that is compelling and comprehensible. Data storytelling is an underrated
but critical skill that a data scientist needs to build and use.
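
The "process" and "explore" steps above can be sketched in a few lines of Python using only the standard library. The raw data, column names, and the helper functions `clean` and `explore` are assumptions made for illustration. Dropping bad rows is just one possible cleaning strategy; filling in missing values is another.

```python
import csv
import io
from statistics import mean

# Hypothetical raw export containing one missing value and one corrupt record.
RAW = """customer,age,monthly_spend
A,34,120.5
B,,80.0
C,29,not_available
D,41,150.0
E,25,60.0
"""


def clean(raw_text):
    """Process step: keep only rows whose numeric fields can be parsed."""
    rows = []
    for record in csv.DictReader(io.StringIO(raw_text)):
        try:
            rows.append({
                "customer": record["customer"],
                "age": int(record["age"]),
                "monthly_spend": float(record["monthly_spend"]),
            })
        except ValueError:
            continue  # skip records with missing or corrupt values
    return rows


def explore(rows):
    """Explore step: a high-level summary of the cleaned data."""
    return {
        "rows": len(rows),
        "mean_age": mean(r["age"] for r in rows),
        "mean_spend": mean(r["monthly_spend"] for r in rows),
    }


cleaned = clean(RAW)
summary = explore(cleaned)
print(summary)
```

Here customers B and C are dropped during cleaning, so the summary describes only the three usable records.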
---
Q. Explain the process of Data Science
---
Responsibilities of a data scientist:

A data scientist has several responsibilities:

o Recommend the most cost-effective changes that should be made to existing
strategies and procedures.
o Communicate findings and predictions to IT and management departments
through effective reports and visualizations of data.
o Come up with new algorithms to figure out problems and create new tools to
automate work.
o Devise data-driven solutions to challenges that are most pressing.
o Examine and explore data from several different angles to find hidden
opportunities, weaknesses, and trends.
o Thoroughly prune and clean data to get rid of the irrelevant information.
o Employ sophisticated analytics programs, statistical methods, and machine
learning to get data ready for use in prescriptive and predictive modeling.
o Extract data from several external and internal sources.
o Conduct undirected research and create open-ended questions.

Different companies will have a different idea of data scientist tasks. There
are some businesses that will treat their data scientists like glorified data analysts,
or combine the duties with data engineering. There are others that need top-level
analytics experts that are skilled in intense data visualizations and machine
learning.

As data scientists reach new experience levels or change jobs, the
responsibilities they face will change as well. For example, a person who works
alone for a mid-sized company may spend most of their day cleaning and munging
data. High-level employees who are part of a business that offers data-based
services could have to create new products or structure big data projects on an
almost daily basis.

---
Q. Explain the responsibilities of a data scientist.
---
Qualifications of data scientists:

There are three education options that you will need to look at when considering a
career in data science.

1. Graduate certificates and degrees provide recognized academic qualifications,
networking, internships, and structure for your resume. This will end up costing
you a lot of money and time.
2. Self-guided courses and MOOCs are cheap or free, targeted, and short. They
will let you complete your projects within your own timeframe, but they will
require you to structure your own career path.
3. Boot camps are a lot faster and more intense than traditional degrees. They may
even be taught by data scientists, but they will not provide you with a degree that
puts initials after your name.

Academic qualifications are probably more important than you think. It's very rare
for a person who doesn't have an advanced quantitative degree to have the skills
that a data scientist needs.

Burtch Works, in its salary report, found that 88% of data scientists have at
least a master's degree and 46% have a PhD. For the most part, these degrees are in
rigorous scientific, quantitative, or technical subjects, including statistics and
math (32%), computer science (19%), and engineering (16%).

Many companies are desperate to find candidates that have real-world skills.
If you have the technical knowledge, it could trump the preferred degree
requirements.

The skills one needs to be a data scientist:

1) Technical skills:
o Cloud tools such as Amazon S3.
o Big data platforms such as Hadoop, Hive, and Pig.
o Python, Perl, Java, C/C++
o SQL databases, as well as database querying languages.
o SAS and R languages.
o Unstructured data techniques.
o Data visualization and reporting techniques.
o Data munging and cleaning.
o Data mining
o Software engineering skills
o Machine learning techniques and tools.
o Statistics
o Math

This list is always changing as data science changes.

2) Business Skills:
o Industry knowledge: It's important to understand how your chosen industry works
and how the data is utilized, collected, and analyzed.
o Intellectual curiosity: Data scientists have to explore new territories
and find unusual and creative ways to solve problems.
o Effective communication: Data scientists have to explain their discoveries and
techniques to non-technical and technical audiences in a way that they can understand.
o Analytic problem-solving: Data scientists approach high-level challenges with
clear eyes on what is important. They employ the right methods and approaches to
make the best use of human resources and time.

---
Q. What are the qualifications required for a data scientist?
---

Would you be a good data scientist?

To figure out whether or not you would make a good data scientist, ask yourself
these questions:

o Are you interested in broadening your skills and taking on new challenges?
o Do you communicate well both visually and verbally?
o Do you enjoy problem-solving and individualized work?
o Are you interested in data analysis and collection?
o Do you have substantial work experience in the areas involved in data
science?
o Do you have a degree in marketing, management information systems,
computer science, statistics, or mathematics?

If you were able to answer yes to any of these questions, then you will probably
find a lot of enjoyment in data science.

It's important that data scientists have knowledge of statistics or math. It's also
important that they have a natural curiosity, along with critical thinking and creativity.
What are you able to do with the data? What undiscovered information is
hidden within the data? You need to have the ability to connect the dots and a
desire to find answers to questions you haven't been asked when you notice
data that is full of potential.

---
Q. How can one be a good data scientist?
Q. What should you do to become a good data scientist?
---

Why use Python for data science:

Python was created by Guido van Rossum in the late 1980s and is an
interpreted high-level programming language that is typically used for general-
purpose programming.

The design of Python emphasizes readability, notably through its use of
significant whitespace (indentation). It gives a programmer the chance to write
clear programs on small and large scales. Van Rossum is Python's principal author.
Python supports structured and object-oriented programming, and many of its features
support functional and aspect-oriented programming, including through metaobjects.
Many other paradigms are supported through extensions, including logic
programming and design by contract.
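
A small sketch may help show what these points look like in practice. The `Account` class and its data are hypothetical. The example only illustrates indentation-based blocks and the structured, object-oriented, and functional styles side by side.

```python
# Blocks are delimited by indentation, not by braces or keywords.
class Account:
    """Object-oriented style: state and behaviour live together."""

    def __init__(self, owner, balance=0):
        self.owner = owner
        self.balance = balance

    def deposit(self, amount):
        if amount <= 0:
            raise ValueError("deposit must be positive")
        self.balance += amount
        return self.balance


def total_balance(accounts):
    """Structured (procedural) style: an explicit loop."""
    total = 0
    for account in accounts:
        total += account.balance
    return total


accounts = [Account("asha", 100), Account("ravi", 250)]
accounts[0].deposit(50)

# Functional style: the same total via a generator expression,
# plus a filter and map written with lambdas.
functional_total = sum(a.balance for a in accounts)
rich_owners = list(map(lambda a: a.owner,
                       filter(lambda a: a.balance > 200, accounts)))

print(total_balance(accounts), functional_total, rich_owners)
```

Both totals agree, which shows that the paradigms are interchangeable ways of expressing the same computation in one language.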

Python has dynamic typing, and it uses a combination of reference counting and a
cycle-detecting garbage collector to manage memory. Instead of building
all of the functionality into the core, Python was designed to be highly
extensible. This compact modularity has made it a popular means of
adding programmable interfaces to existing applications.
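
Dynamic typing and the two memory-management mechanisms can be observed directly. This is a sketch that assumes the standard CPython interpreter, since `sys.getrefcount` and the exact behaviour of `gc.collect` are implementation details. The `Node` class is a made-up helper.

```python
import gc
import sys

# Dynamic typing: a name can be rebound to objects of different types.
value = 42
first_type = type(value).__name__
value = "forty-two"
second_type = type(value).__name__

# Reference counting: binding another name to an object raises its count.
data = []
before = sys.getrefcount(data)
alias = data
after = sys.getrefcount(data)


class Node:
    """Tiny helper used to build a reference cycle."""

    def __init__(self):
        self.partner = None


# Cycle detection: two objects that refer to each other never reach a
# reference count of zero, so the garbage collector has to find them.
gc.collect()  # clear any earlier garbage first
a, b = Node(), Node()
a.partner, b.partner = b, a
del a, b
unreachable = gc.collect()  # number of unreachable objects found

print(first_type, second_type, after - before, unreachable >= 2)
```

The exact value of `unreachable` varies between Python versions, so the sketch only checks that at least the two `Node` objects were found.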

Van Rossum wanted to create a small core language with a large standard library and
an easily extensible interpreter. This desire came from his frustration with ABC,
which used the opposite approach.

So, because Python is designed as a simple and dynamic language, and because as an
open-source project it nowadays adopts features from other languages, Python
could be the best choice for data science.
---
Q. Why Python for data science?
---
