Python_for_DataScience

The document provides an overview of Python programming basics, including setting up the working directory, file handling, data types, operators, and functions. It also covers data structures such as lists, tuples, sets, and dictionaries, along with their associated functions, as well as an introduction to the Pandas library for data manipulation. Additionally, it discusses Numpy for numerical computing and Matplotlib for data visualization, along with examples of regression and classification in data analysis.

Uploaded by

Mahesh Kumar
Copyright
© © All Rights Reserved
Unit I Basics of Python 10

Introduction – Setting working directory – Creating and saving, File execution, clearing
console, removing variables from environment, clearing environment – variable creation –
Operators – Data types and its associated operations – sequence data types – conditions and
branching – Functions-Virtual Environments

Introduction to Python

Python is a versatile and popular programming language known for its simplicity and
readability. It is widely used in various fields, including web development, data analysis,
artificial intelligence, and scientific computing.

Setting Working Directory

The working directory is the folder where Python scripts are executed and where files are
read from or written to. To set the working directory:

import os
# To get the current working directory
print(os.getcwd())
# To change the working directory
os.chdir('path_to_directory')
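The calls above can be tried safely with a short sketch like the one below. It switches into a temporary directory and then restores the original one; using `tempfile` is an assumption made only so the example runs on any machine.

```python
import os
import tempfile

original_dir = os.getcwd()            # remember where we started

with tempfile.TemporaryDirectory() as tmp:
    os.chdir(tmp)                     # switch the working directory
    inside_tmp = os.path.samefile(os.getcwd(), tmp)
    os.chdir(original_dir)            # switch back before the temp dir is removed

restored = os.getcwd() == original_dir
print(inside_tmp, restored)
```

Restoring the original directory matters because relative paths used later in the script are resolved against whatever the working directory is at that moment.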

Creating and Saving Files

Creating and saving files in Python can be done using the open() function:

# Create and write to a file
with open('example.txt', 'w') as file:
    file.write("Hello, World!")

# Read from a file
with open('example.txt', 'r') as file:
    content = file.read()
    print(content)

File Execution

To execute a Python file from the console or terminal:

python filename.py

Clearing Console
Clearing the console can be done using a system call:

import os

# Clear console (Windows)


os.system('cls')

# Clear console (Unix/Linux/MacOS)


os.system('clear')

Removing Variables from Environment

To remove a variable from the environment:

# Create a variable
x = 10

# Delete the variable


del x

Clearing Environment

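Python has no built-in command for clearing the entire environment. You can delete every
name in the global scope, but this also removes imported modules and special names such as
__name__, so restarting the interpreter (or using %reset in IPython) is usually safer:

# Delete all names in the global scope (including imports and names such as __name__)
globals().clear()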

Variable Creation

Variables in Python are created by simply assigning a value to a name:

x = 5          # Integer
y = 3.14       # Float
name = "John"  # String

Operators

Python supports various operators:

• Arithmetic Operators: +, -, *, /, %, //, **
• Comparison Operators: ==, !=, >, <, >=, <=
• Logical Operators: and, or, not
• Assignment Operators: =, +=, -=, *=, /=, %=, //=, **=
• Bitwise Operators: &, |, ^, ~, <<, >>
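As a small illustration of the operator groups listed above, the sketch below applies a few of them to two integers. The variable names are chosen only for this example.

```python
a, b = 7, 2

true_div = a / b        # 3.5  (true division always returns a float)
floor_div = a // b      # 3    (floor division)
remainder = a % b       # 1    (modulo)
power = a ** b          # 49   (exponentiation)

bit_and = a & b         # 0b111 & 0b010 == 2
bit_xor = a ^ b         # 0b111 ^ 0b010 == 5
shifted = a << 1        # 14   (shift left by one bit doubles the value)

is_greater = (a > b) and not (a == b)   # True

a += 3                  # augmented assignment: a is now 10
```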

Data Types and Associated Operations

• Numbers: Integers, Floats, Complex numbers
  o Operations: Arithmetic, type conversion, etc.
• Strings: Immutable sequences of characters
  o Operations: Concatenation, slicing, formatting, etc.
• Lists: Mutable sequences
  o Operations: Indexing, slicing, appending, inserting, removing, etc.
• Tuples: Immutable sequences
  o Operations: Indexing, slicing, etc.
• Sets: Unordered collections of unique elements
  o Operations: Union, intersection, difference, etc.
• Dictionaries: Key-value pairs
  o Operations: Accessing, updating, removing elements, etc.

Sequence Data Types

• Lists

my_list = [1, 2, 3, 4, 5]

• Tuples

my_tuple = (1, 2, 3, 4, 5)

• Strings

my_string = "Hello, World!"

• Ranges

my_range = range(1, 10)
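All four sequence types support the same indexing, slicing, and `len()` operations. The sketch below reuses the variables defined above to show this; the result names are chosen only for illustration.

```python
my_list = [1, 2, 3, 4, 5]
my_tuple = (1, 2, 3, 4, 5)
my_string = "Hello, World!"
my_range = range(1, 10)

# Indexing works the same way on every sequence type
first_items = (my_list[0], my_tuple[0], my_string[0], my_range[0])

last_two = my_list[-2:]           # negative indices count from the end
hello = my_string[:5]             # slice of the first five characters
evens = list(my_range[1::2])      # every second value, starting at index 1
lengths = [len(s) for s in (my_list, my_tuple, my_string, my_range)]

print(first_items, last_two, hello, evens, lengths)
```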

Conditions and Branching

Python uses if, elif, and else for conditional branching:

x = 10
if x > 0:
    print("Positive")
elif x < 0:
    print("Negative")
else:
    print("Zero")

Functions

Functions are defined using the def keyword:

def greet(name):
    return f"Hello, {name}!"

print(greet("Alice"))

Virtual Environments

Virtual environments allow you to create isolated Python environments for different projects:

# Create a virtual environment
python -m venv myenv

# Activate the virtual environment (Windows)


myenv\Scripts\activate

# Activate the virtual environment (Unix/Linux/MacOS)


source myenv/bin/activate

# Deactivate the virtual environment


deactivate
Unit II PYTHON DATA STRUCTURES, PACKAGES 10

List – Tuples- Set – Dictionary – Its associated functions - File handling - Modes– Reading
and writing files - Introduction to Pandas – Series – Data frame – Indexing and loading –
Data manipulation – Merging – Group by – Scales – Pivot table – Date and time.

Lists

Lists are ordered, mutable collections of items.

Creating Lists:

my_list = [1, 2, 3, 4, 5]

Functions and Methods:

• append(x): Add an item to the end.
• extend(iterable): Extend list by appending elements from an iterable.
• insert(i, x): Insert an item at a given position.
• remove(x): Remove first item with value x.
• pop([i]): Remove and return item at position i (default last).
• clear(): Remove all items.
• index(x[, start[, end]]): Return index of first item with value x.
• count(x): Return number of times x appears.
• sort(key=None, reverse=False): Sort items in place.
• reverse(): Reverse the elements in place.
• copy(): Return a shallow copy.
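The list methods above can be chained into a short walk-through. This is a minimal sketch; the list contents are arbitrary, and the comments show the list after each step.

```python
nums = [3, 1, 2]

nums.append(5)            # [3, 1, 2, 5]
nums.extend([4, 1])       # [3, 1, 2, 5, 4, 1]
nums.insert(0, 9)         # [9, 3, 1, 2, 5, 4, 1]
nums.remove(1)            # removes the first 1 -> [9, 3, 2, 5, 4, 1]
last = nums.pop()         # returns 1 -> [9, 3, 2, 5, 4]
nums.sort()               # [2, 3, 4, 5, 9]
nums.reverse()            # [9, 5, 4, 3, 2]

snapshot = nums.copy()    # shallow copy, independent of later changes to nums
position = nums.index(4)  # 2
occurrences = nums.count(9)  # 1
```

Note that `sort()` and `reverse()` modify the list in place and return `None`; use `sorted(nums)` when a new list is wanted.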

Tuples

Tuples are ordered, immutable collections of items.

Creating Tuples:

my_tuple = (1, 2, 3, 4, 5)

Functions and Methods:

• count(x): Return the number of times x appears.
• index(x): Return the index of the first item with value x.

Sets

Sets are unordered collections of unique items.

Creating Sets:

my_set = {1, 2, 3, 4, 5}

Functions and Methods:

• add(x): Add an item.
• remove(x): Remove an item; raises KeyError if it is not present.
• discard(x): Remove an item if present.
• pop(): Remove and return an arbitrary item.
• clear(): Remove all items.
• union(*others): Return the union.
• intersection(*others): Return the intersection.
• difference(*others): Return the difference.
• symmetric_difference(other): Return the symmetric difference.
• issubset(other): Check if set is subset of other.
• issuperset(other): Check if set is superset of other.
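A brief sketch of the set methods listed above, using two small sets invented for the example:

```python
a = {1, 2, 3, 4}
b = {3, 4, 5}

union = a.union(b)                        # {1, 2, 3, 4, 5}
common = a.intersection(b)                # {3, 4}
only_a = a.difference(b)                  # {1, 2}
either_not_both = a.symmetric_difference(b)   # {1, 2, 5}

subset_check = {3, 4}.issubset(a)         # True
superset_check = a.issuperset(b)          # False, because 5 is not in a

a.discard(99)                             # no error when the item is missing
try:
    a.remove(99)                          # remove() raises KeyError instead
    remove_raised = False
except KeyError:
    remove_raised = True
```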

Dictionaries

Dictionaries are collections of key-value pairs with unique keys. Since Python 3.7 they
preserve insertion order.

Creating Dictionaries:

my_dict = {'a': 1, 'b': 2, 'c': 3}

Functions and Methods:

• keys(): Return a new view of the dictionary's keys.
• values(): Return a new view of the dictionary's values.
• items(): Return a new view of the dictionary's items.
• get(key[, default]): Return the value for key if key is in the dictionary, else default
  (None if no default is given).
• setdefault(key[, default]): Insert key with a value of default if key is not in the
  dictionary, and return the value for key.
• update([other]): Update the dictionary with the key/value pairs from other.
• pop(key[, default]): Remove the specified key and return the corresponding value (or
  default if the key is missing).
• popitem(): Remove and return the most recently inserted (key, value) pair.
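The dictionary methods above can be seen together in a short sketch. The dictionary `stock` and its keys are hypothetical.

```python
stock = {'a': 1, 'b': 2}

missing = stock.get('z', 0)        # 0: key absent, so the default is returned
stock.setdefault('c', 3)           # inserts 'c': 3 because 'c' was absent
stock.update({'a': 10, 'd': 4})    # overwrites 'a', adds 'd'
removed = stock.pop('b')           # 2
last_pair = stock.popitem()        # ('d', 4), the most recently inserted pair

remaining_keys = list(stock.keys())
remaining_items = list(stock.items())
```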

File Handling

Modes:

• 'r': Read (default).
• 'w': Write (truncate file).
• 'x': Create (fail if exists).
• 'a': Append.
• 'b': Binary mode.
• 't': Text mode (default).
• '+': Update (read and write).
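The difference between the modes is easiest to see by applying several of them to the same file. The sketch below works inside a temporary directory; the file name `notes.txt` is hypothetical.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'notes.txt')

    with open(path, 'x') as f:        # 'x' creates the file; fails if it exists
        f.write('first\n')

    with open(path, 'a') as f:        # 'a' appends to the end
        f.write('second\n')

    try:
        open(path, 'x')               # the file now exists, so this must fail
        exclusive_failed = False
    except FileExistsError:
        exclusive_failed = True

    with open(path, 'r') as f:        # 'r' reads text
        lines = f.read().splitlines()

    with open(path, 'w') as f:        # 'w' truncates the existing content
        f.write('replaced\n')

    with open(path, 'rb') as f:       # 'b' returns bytes instead of str
        raw = f.read()

print(lines, exclusive_failed, raw)
```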
Reading and Writing Files:

# Writing to a file
with open('example.txt', 'w') as file:
    file.write("Hello, World!")

# Reading from a file
with open('example.txt', 'r') as file:
    content = file.read()

Introduction to Pandas

Pandas is a powerful data manipulation library in Python.

Series

A Series is a one-dimensional labeled array capable of holding any data type.

Creating a Series:

import pandas as pd

data = [1, 2, 3, 4, 5]
series = pd.Series(data)

DataFrame

A DataFrame is a two-dimensional labeled data structure.

Creating a DataFrame:

data = {
    'Column1': [1, 2, 3],
    'Column2': [4, 5, 6]
}
df = pd.DataFrame(data)
Indexing and Loading

Indexing:

df['Column1']               # Access a single column
df[['Column1', 'Column2']]  # Access multiple columns
df.iloc[0]                  # Access a row by integer position
df.loc[0]                   # Access a row by index label

Loading Data:

df = pd.read_csv('file.csv') # Load CSV file


df = pd.read_excel('file.xlsx') # Load Excel file
Data Manipulation

Basic Operations:

df['NewColumn'] = df['Column1'] + df['Column2'] # Add a new column


df.drop('Column1', axis=1, inplace=True) # Drop a column
df.rename(columns={'OldName': 'NewName'}, inplace=True) # Rename a column

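Merging

Combining DataFrames:

pd.concat() stacks DataFrames along an axis (rows by default), while pd.merge() joins them
on shared key columns, like a database join.

df1 = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
df2 = pd.DataFrame({'A': [5, 6], 'B': [7, 8]})
merged_df = pd.concat([df1, df2])  # Stacks df2 below df1; the index repeats (0, 1, 0, 1)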
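For a key-based join, `pd.merge()` is the usual tool. The following is a minimal sketch with two made-up tables that share a hypothetical `emp_id` column.

```python
import pandas as pd

employees = pd.DataFrame({'emp_id': [1, 2, 3], 'name': ['Ann', 'Bob', 'Cy']})
salaries = pd.DataFrame({'emp_id': [1, 2, 4], 'salary': [50000, 60000, 55000]})

# Inner join: keep only emp_id values present in both tables
inner = pd.merge(employees, salaries, on='emp_id', how='inner')

# Left join: keep every employee; missing salaries become NaN
left = pd.merge(employees, salaries, on='emp_id', how='left')

print(inner)
print(left)
```

The `how` argument also accepts `'right'` and `'outer'`, which keep all keys from the right table or from both tables respectively.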

Group By

Grouping data:

grouped = df.groupby('Column1')
summary = grouped['Column2'].sum()
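A self-contained version of the same idea is sketched below. The `sales` table and its `region` and `amount` columns are invented for the example.

```python
import pandas as pd

sales = pd.DataFrame({
    'region': ['N', 'S', 'N', 'S', 'N'],
    'amount': [10, 20, 30, 40, 50],
})

# One aggregate per group
totals = sales.groupby('region')['amount'].sum()

# Several aggregates at once
stats = sales.groupby('region')['amount'].agg(['mean', 'count'])

print(totals)
print(stats)
```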

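Scales

Scaling data can be done using libraries such as sklearn.preprocessing. StandardScaler
rescales each numeric column to zero mean and unit variance and returns a NumPy array, so
the DataFrame passed to it must contain only numeric columns.

Example:

from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
scaled_data = scaler.fit_transform(df)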
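Standard scaling computes z = (x - mean) / std for each column, using the population standard deviation. The sketch below reproduces that calculation with NumPy alone on a small made-up feature matrix; it illustrates the formula and is not the scikit-learn implementation itself.

```python
import numpy as np

# Hypothetical feature matrix: rows are samples, columns are features
data = np.array([[1.0, 100.0],
                 [2.0, 200.0],
                 [3.0, 300.0]])

mean = data.mean(axis=0)        # per-column mean
std = data.std(axis=0)          # per-column population standard deviation (ddof=0)
scaled = (data - mean) / std    # z-scores, computed column by column via broadcasting

print(scaled)
```

After scaling, both columns have the same values even though the original features differed by a factor of 100, which is the point of putting features on a common scale.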

Pivot Table

Creating a pivot table:

pivot = df.pivot_table(values='Value', index='Index', columns='Columns', aggfunc='mean')
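In the line above, 'Value', 'Index' and 'Columns' stand for column names in your own data. A runnable sketch with a made-up `records` table might look like this:

```python
import pandas as pd

records = pd.DataFrame({
    'city': ['Pune', 'Pune', 'Delhi', 'Delhi'],
    'year': [2023, 2024, 2023, 2024],
    'sales': [100, 120, 90, 110],
})

# Rows are cities, columns are years, cells hold the mean of 'sales'
pivot = records.pivot_table(values='sales', index='city',
                            columns='year', aggfunc='mean')

print(pivot)
```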


Date and Time

Handling date and time data:

df['Date'] = pd.to_datetime(df['Date'])
df['Year'] = df['Date'].dt.year
df['Month'] = df['Date'].dt.month
df['Day'] = df['Date'].dt.day
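The same accessor works on any column of date strings. The sketch below uses three hypothetical dates and also shows date arithmetic.

```python
import pandas as pd

df = pd.DataFrame({'Date': ['2024-01-15', '2024-02-29', '2024-12-31']})

df['Date'] = pd.to_datetime(df['Date'])        # parse strings into datetimes
df['Month'] = df['Date'].dt.month
df['Weekday'] = df['Date'].dt.day_name()       # e.g. 'Monday' for 2024-01-15

# Subtracting datetimes gives a Timedelta; .days extracts whole days
span_days = (df['Date'].max() - df['Date'].min()).days

print(df)
print(span_days)
```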

Unit III: Packages for Data Analysis

Numpy – 1D and 2D numpy – Associated operations – Broadcasting – Linear algebra and
related operations – Indexing and other operations – Matplotlib – scatterplot – line plot – bar
plot – histogram – box plot – pair plot – Case study on regression and classification.

Numpy

Numpy is a powerful numerical computing library in Python, providing support for large
multi-dimensional arrays and matrices along with a large collection of high-level
mathematical functions.

1D and 2D Numpy Arrays

Creating 1D Arrays:

import numpy as np

arr_1d = np.array([1, 2, 3, 4, 5])

Creating 2D Arrays:

arr_2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

Associated Operations

Basic Operations:

# Element-wise addition
result = arr_1d + 2

# Element-wise subtraction
result = arr_1d - 2

# Element-wise multiplication
result = arr_1d * 2

# Element-wise division
result = arr_1d / 2
Aggregations:

# Sum of elements
np.sum(arr_1d)

# Mean of elements
np.mean(arr_1d)

# Standard deviation
np.std(arr_1d)

# Maximum and minimum


np.max(arr_1d)
np.min(arr_1d)

Broadcasting

Broadcasting allows Numpy to perform element-wise operations on arrays of different
shapes by virtually stretching dimensions of size 1 (or missing dimensions) to match the
larger array.

Example:

arr = np.array([1, 2, 3])


scalar = 2

result = arr + scalar # [3, 4, 5]
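Broadcasting also applies between arrays, not only between an array and a scalar. The sketch below adds a row vector and a column vector to a 2D array; the shapes are chosen only for illustration.

```python
import numpy as np

matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])          # shape (2, 3)
row = np.array([10, 20, 30])            # shape (3,)   -> added to every row
col = np.array([[100], [200]])          # shape (2, 1) -> added to every column

plus_row = matrix + row                 # [[11, 22, 33], [14, 25, 36]]
plus_col = matrix + col                 # [[101, 102, 103], [204, 205, 206]]

# Shapes (2, 3) and (2,) are not compatible, so this raises ValueError
try:
    matrix + np.array([1, 2])
    incompatible = False
except ValueError:
    incompatible = True
```

Shapes are compared from the trailing dimension backwards; two dimensions are compatible when they are equal or one of them is 1.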

Linear Algebra and Related Operations

Dot Product:

a = np.array([1, 2])
b = np.array([3, 4])

dot_product = np.dot(a, b) # 11

Matrix Multiplication:

A = np.array([[1, 2], [3, 4]])


B = np.array([[5, 6], [7, 8]])

result = np.matmul(A, B)

Inverse of a Matrix:

matrix = np.array([[1, 2], [3, 4]])


inverse = np.linalg.inv(matrix)
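A quick way to check an inverse is to multiply it back and compare with the identity matrix. The sketch below does that and also solves a small linear system; the right-hand side `rhs` is made up for the example.

```python
import numpy as np

matrix = np.array([[1.0, 2.0],
                   [3.0, 4.0]])

inverse = np.linalg.inv(matrix)
identity_ok = np.allclose(matrix @ inverse, np.eye(2))   # A times A^-1 is I

det = np.linalg.det(matrix)             # 1*4 - 2*3 = -2 (up to rounding)

# Solve matrix @ solution = rhs without forming the inverse explicitly
rhs = np.array([5.0, 11.0])
solution = np.linalg.solve(matrix, rhs)  # x = 1, y = 2

print(identity_ok, det, solution)
```

For solving systems, `np.linalg.solve` is generally preferred over multiplying by the inverse because it is faster and numerically more stable.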
Indexing and Other Operations

Indexing:

arr = np.array([1, 2, 3, 4, 5])

# Accessing elements
element = arr[0] # 1

# Slicing
subarray = arr[1:3] # [2, 3]

Reshaping:

arr = np.array([[1, 2, 3], [4, 5, 6]])


reshaped = arr.reshape((3, 2)) # [[1, 2], [3, 4], [5, 6]]

Matplotlib

Matplotlib is a plotting library for creating static, interactive, and animated visualizations in
Python.

Scatter Plot

Creating a Scatter Plot:

import matplotlib.pyplot as plt

x = np.array([1, 2, 3, 4, 5])
y = np.array([5, 4, 3, 2, 1])

plt.scatter(x, y)
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.title('Scatter Plot')
plt.show()

Line Plot

Creating a Line Plot:

plt.plot(x, y)
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.title('Line Plot')
plt.show()
Bar Plot

Creating a Bar Plot:

categories = ['A', 'B', 'C']


values = [10, 20, 15]

plt.bar(categories, values)
plt.xlabel('Categories')
plt.ylabel('Values')
plt.title('Bar Plot')
plt.show()
Histogram

Creating a Histogram:

data = np.random.randn(1000)

plt.hist(data, bins=30)
plt.xlabel('Value')
plt.ylabel('Frequency')
plt.title('Histogram')
plt.show()
Box Plot

Creating a Box Plot:

data = [np.random.normal(0, std, 100) for std in range(1, 4)]

plt.boxplot(data, vert=True, patch_artist=True)


plt.xlabel('Distribution')
plt.ylabel('Value')
plt.title('Box Plot')
plt.show()
Pair Plot

Creating a Pair Plot:

import seaborn as sns


import pandas as pd

df = pd.DataFrame({
'A': np.random.randn(100),
'B': np.random.randn(100),
'C': np.random.randn(100),
'D': np.random.randn(100)
})
sns.pairplot(df)
plt.show()

Case Study: Regression and Classification

Regression

Linear Regression Example:

from sklearn.linear_model import LinearRegression

# Sample data
X = np.array([[1], [2], [3], [4], [5]])
y = np.array([1, 2, 3, 4, 5])

# Create and train the model


model = LinearRegression()
model.fit(X, y)

# Make predictions
predictions = model.predict(X)

# Plot results
plt.scatter(X, y, color='blue')
plt.plot(X, predictions, color='red')
plt.xlabel('X')
plt.ylabel('y')
plt.title('Linear Regression')
plt.show()
Classification

Logistic Regression Example:

from sklearn.linear_model import LogisticRegression


from sklearn.datasets import load_iris

# Load data
iris = load_iris()
X = iris.data
y = iris.target

# Create and train the model


model = LogisticRegression(max_iter=200)
model.fit(X, y)

# Make predictions
predictions = model.predict(X)

# Plot results
plt.scatter(X[:, 0], X[:, 1], c=predictions, cmap='viridis')
plt.xlabel('Feature 1')
plt.ylabel('Feature 2')
plt.title('Logistic Regression Classification')
plt.show()

Programs for Python for Data Science

Basic python programs:

Addition of two numbers

a = int(input("enter first no: "))
b = int(input("enter second no: "))
c = a + b
print("the sum is", c)

Output:
enter first no: 5
enter second no: 6
the sum is 11

Area of rectangle

l = int(input("enter the length of rectangle: "))
b = int(input("enter the breadth of rectangle: "))
a = l * b
print(a)

Output:
enter the length of rectangle: 5
enter the breadth of rectangle: 6
30

Area & circumference of circle

r = float(input("enter the radius of circle: "))
a = 3.14 * r * r
c = 2 * 3.14 * r
print("the area of circle", a)
print("the circumference of circle", c)

Output:
enter the radius of circle: 4
the area of circle 50.24
the circumference of circle 25.12

Calculate simple interest

p = int(input("enter principal amount: "))
n = int(input("enter no of years: "))
r = int(input("enter rate of interest: "))
si = p * n * r / 100
print("simple interest is", si)

Output:
enter principal amount: 5000
enter no of years: 4
enter rate of interest: 6
simple interest is 1200.0

Calculate engineering cutoff

p = int(input("enter physics marks: "))
c = int(input("enter chemistry marks: "))
m = int(input("enter maths marks: "))
cutoff = p / 4 + c / 4 + m / 2
print("cutoff =", cutoff)

Output:
enter physics marks: 100
enter chemistry marks: 99
enter maths marks: 96
cutoff = 97.75

Check voting eligibility

age = int(input("enter your age: "))
if age >= 18:
    print("eligible for voting")
else:
    print("not eligible for voting")

Output:
enter your age: 19
eligible for voting

Find greatest of three numbers

a = int(input("enter the value of a: "))
b = int(input("enter the value of b: "))
c = int(input("enter the value of c: "))
if a > b:
    if a > c:
        print("the greatest no is", a)
    else:
        print("the greatest no is", c)
else:
    if b > c:
        print("the greatest no is", b)
    else:
        print("the greatest no is", c)

Output:
enter the value of a: 9
enter the value of b: 1
enter the value of c: 8
the greatest no is 9
Programs on for loop
Print n natural numbers

for i in range(1, 5, 1):
    print(i)

Output (one value per line): 1 2 3 4

Print n odd numbers

for i in range(1, 10, 2):
    print(i)

Output (one value per line): 1 3 5 7 9

Print n even numbers

for i in range(2, 10, 2):
    print(i)

Output (one value per line): 2 4 6 8

Print squares of numbers

for i in range(1, 5, 1):
    print(i * i)

Output (one value per line): 1 4 9 16

Print cubes of numbers

for i in range(1, 5, 1):
    print(i * i * i)

Output (one value per line): 1 8 27 64

Programs on while loop

Print n natural numbers

i = 1
while i <= 5:
    print(i)
    i = i + 1

Output (one value per line): 1 2 3 4 5

Print n even numbers

i = 2
while i <= 10:
    print(i)
    i = i + 2

Output (one value per line): 2 4 6 8 10

Print n odd numbers

i = 1
while i <= 10:
    print(i)
    i = i + 2

Output (one value per line): 1 3 5 7 9

Print n squares of numbers

i = 1
while i <= 5:
    print(i * i)
    i = i + 1

Output (one value per line): 1 4 9 16 25

Print n cubes of numbers

i = 1
while i <= 3:
    print(i * i * i)
    i = i + 1

Output (one value per line): 1 8 27

Find sum of n numbers

i = 1
total = 0  # named total to avoid shadowing the built-in sum()
while i <= 10:
    total = total + i
    i = i + 1
print(total)

Output:
55

Factorial of n numbers / product of n numbers

i = 1
product = 1
while i <= 10:
    product = product * i
    i = i + 1
print(product)

Output:
3628800
Sum of two numbers using function

def add():
    a = int(input("enter a value: "))
    b = int(input("enter b value: "))
    c = a + b
    print("the sum is", c)

add()

Output:
enter a value: 6
enter b value: 4
the sum is 10

Area of rectangle using function

def area():
    l = int(input("enter the length of rectangle: "))
    b = int(input("enter the breadth of rectangle: "))
    a = l * b
    print("the area of rectangle is", a)

area()

Output:
enter the length of rectangle: 20
enter the breadth of rectangle: 5
the area of rectangle is 100

Swap two values of variables

def swap():
    a = int(input("enter a value: "))
    b = int(input("enter b value: "))
    c = a
    a = b
    b = c
    print("a=", a, "b=", b)

swap()

Output:
enter a value: 3
enter b value: 5
a= 5 b= 3

Check whether the number is divisible by 5 or not

def div():
    n = int(input("enter n value: "))
    if n % 5 == 0:
        print("the number is divisible by 5")
    else:
        print("the number not divisible by 5")

div()

Output:
enter n value: 10
the number is divisible by 5

Find remainder and quotient of given numbers

def remainder():
    a = int(input("enter a: "))
    b = int(input("enter b: "))
    R = a % b
    print("the remainder is", R)

def quotient():
    a = int(input("enter a: "))
    b = int(input("enter b: "))
    Q = a / b
    print("the quotient is", Q)

remainder()
quotient()

Output:
enter a: 6
enter b: 3
the remainder is 0
enter a: 8
enter b: 4
the quotient is 2.0

Convert the temperature

def ctof():
    c = float(input("enter temperature in centigrade: "))
    f = round((1.8 * c) + 32, 2)  # rounded to two decimal places
    print("the temperature in Fahrenheit is", f)

def ftoc():
    f = float(input("enter temp in Fahrenheit: "))
    c = round((f - 32) / 1.8, 2)  # rounded to two decimal places
    print("the temperature in centigrade is", c)

ctof()
ftoc()

Output:
enter temperature in centigrade: 37
the temperature in Fahrenheit is 98.6
enter temp in Fahrenheit: 100
the temperature in centigrade is 37.78
Program for basic calculator

def add():
    a = int(input("enter a value: "))
    b = int(input("enter b value: "))
    c = a + b
    print("the sum is", c)

def sub():
    a = int(input("enter a value: "))
    b = int(input("enter b value: "))
    c = a - b
    print("the diff is", c)

def mul():
    a = int(input("enter a value: "))
    b = int(input("enter b value: "))
    c = a * b
    print("the mul is", c)

def div():
    a = int(input("enter a value: "))
    b = int(input("enter b value: "))
    c = a / b  # true division always returns a float
    print("the div is", c)

add()
sub()
mul()
div()

Output:
enter a value: 10
enter b value: 10
the sum is 20
enter a value: 10
enter b value: 10
the diff is 0
enter a value: 10
enter b value: 10
the mul is 100
enter a value: 10
enter b value: 10
the div is 1.0
NUMPY ARRAYS

ALGORITHM

Step1: Start

Step2: Import numpy module

Step3: Print the basic characteristics and operations of the array

Step4: Stop

PROGRAM

import numpy as np

# Creating array object
arr = np.array([[1, 2, 3],
                [4, 2, 5]])

# Printing type of arr object
print("Array is of type: ", type(arr))

# Printing array dimensions (axes)
print("No. of dimensions: ", arr.ndim)

# Printing shape of array
print("Shape of array: ", arr.shape)

# Printing size (total number of elements) of array
print("Size of array: ", arr.size)

# Printing type of elements in array
print("Array stores elements of type: ", arr.dtype)

OUTPUT

Array is of type:  <class 'numpy.ndarray'>
No. of dimensions:  2
Shape of array:  (2, 3)
Size of array:  6
Array stores elements of type:  int32

(The element type is platform dependent; on most Linux and macOS systems it is int64.)

PROGRAM TO PERFORM ARRAY SLICING

import numpy as np

a = np.array([[1, 2, 3], [3, 4, 5], [4, 5, 6]])
print(a)
print("After slicing")
print(a[1:])  # all rows from index 1 onwards

Output

[[1 2 3]
 [3 4 5]
 [4 5 6]]
After slicing
[[3 4 5]
 [4 5 6]]

CREATE A DATAFRAME USING A LIST OF ELEMENTS.

ALGORITHM

Step1: Start

Step2: import numpy and pandas module

Step3: Create DataFrames from a NumPy array, a dictionary, another DataFrame, and a Series

Step4: Print the output

Step5: Stop

PROGRAM

import numpy as np
import pandas as pd

data = np.array([['', 'Col1', 'Col2'],
                 ['Row1', 1, 2],
                 ['Row2', 3, 4]])
print(pd.DataFrame(data=data[1:, 1:],
                   index=data[1:, 0],
                   columns=data[0, 1:]))

# Take a 2D array as input to your DataFrame
my_2darray = np.array([[1, 2, 3], [4, 5, 6]])
print(pd.DataFrame(my_2darray))

# Take a dictionary as input to your DataFrame
my_dict = {1: ['1', '3'], 2: ['1', '2'], 3: ['2', '4']}
print(pd.DataFrame(my_dict))

# Take a DataFrame as input to your DataFrame
my_df = pd.DataFrame(data=[4, 5, 6, 7], index=range(0, 4), columns=['A'])
print(pd.DataFrame(my_df))

# Take a Series as input to your DataFrame
my_series = pd.Series({"United Kingdom": "London", "India": "New Delhi",
                       "United States": "Washington", "Belgium": "Brussels"})
print(pd.DataFrame(my_series))

df = pd.DataFrame(np.array([[1, 2, 3], [4, 5, 6]]))

# Use the `shape` property
print(df.shape)

# Or use the `len()` function with the `index` property
print(len(df.index))

Output:

     Col1 Col2
Row1    1    2
Row2    3    4
   0  1  2
0  1  2  3
1  4  5  6
   1  2  3
0  1  1  2
1  3  2  4
   A
0  4
1  5
2  6
3  7
                         0
United Kingdom      London
India            New Delhi
United States   Washington
Belgium           Brussels
(2, 3)
2
BASIC PLOTS USING MATPLOTLIB

ALGORITHM

Step1: Start

Step2: import Matplotlib module

Step3: Create basic plots using Matplotlib

Step4: Print the output

Step5: Stop

Program:3a

# importing the required module
import matplotlib.pyplot as plt

# x axis values
x = [1, 2, 3]

# corresponding y axis values
y = [2, 4, 1]

# plotting the points
plt.plot(x, y)

# naming the x axis
plt.xlabel('x - axis')

# naming the y axis
plt.ylabel('y - axis')

# giving a title to my graph
plt.title('My first graph!')

# function to show the plot
plt.show()

Output:

Program:3b

import matplotlib.pyplot as plt

a = [1, 2, 3, 4, 5]
b = [0, 0.6, 0.2, 15, 10, 8, 16, 21]
plt.plot(a)

# "o" is for circles and "r" is for red
plt.plot(b, "or")
plt.plot(list(range(0, 22, 3)))

# naming the x-axis
plt.xlabel('Day ->')

# naming the y-axis
plt.ylabel('Temp ->')

c = [4, 2, 6, 8, 3, 20, 13, 15]
plt.plot(c, label='4th Rep')

# get the current axes
ax = plt.gca()

# hide the right and top boundary lines of the graph body
ax.spines['right'].set_visible(False)
ax.spines['top'].set_visible(False)

# set the bounds of the left boundary line to a fixed range
ax.spines['left'].set_bounds(-3, 40)

# set the positions of the x-axis tick marks
plt.xticks(list(range(-3, 10)))

# set the positions of the y-axis tick marks
plt.yticks(list(range(-3, 20, 3)))

# the legend shows which colour denotes which series
ax.legend(['1st Rep', '2nd Rep', '3rd Rep', '4th Rep'])

# annotate writes text on the graph; xy gives the position on the graph
plt.annotate('Temperature V / s Days', xy=(1.01, -2.15))

# give a title to the graph
plt.title('All Features Discussed')
plt.show()


Output:

Program:

import matplotlib.pyplot as plt

a = [1, 2, 3, 4, 5]
b = [0, 0.6, 0.2, 15, 10, 8, 16, 21]
c = [4, 2, 6, 8, 3, 20, 13, 15]

# use fig whenever you want the output in a new window;
# figsize specifies the size of that window
fig = plt.figure(figsize=(10, 10))

# creating multiple plots in a single figure
sub1 = plt.subplot(2, 2, 1)
sub2 = plt.subplot(2, 2, 2)
sub3 = plt.subplot(2, 2, 3)
sub4 = plt.subplot(2, 2, 4)

sub1.plot(a, 'sb')
# x-axis ticks advance by 1 within the specified range
sub1.set_xticks(list(range(0, 10, 1)))
sub1.set_title('1st Rep')

sub2.plot(b, 'or')
# x-axis ticks advance by 2 within the specified range
sub2.set_xticks(list(range(0, 10, 2)))
sub2.set_title('2nd Rep')

# a list can be passed directly to plot() instead of a variable
sub3.plot(list(range(0, 22, 3)), 'vg')
sub3.set_xticks(list(range(0, 10, 1)))
sub3.set_title('3rd Rep')

sub4.plot(c, 'Dm')
# similarly, set the ticks for the y-axis:
# range(start (inclusive), end (exclusive), step)
sub4.set_yticks(list(range(0, 24, 2)))
sub4.set_title('4th Rep')

# without plt.show() no plot will be visible
plt.show()

Output:
Normal Curve

ALGORITHM

Step 1: Start the Program

Step 2: Import packages scipy and call function scipy.stats

Step 3: Import packages numpy, matplotlib and seaborn

Step 4: Create the distribution

Step 5: Visualize the distribution

Step 6: Stop the process

Program:

# import required libraries
from scipy.stats import norm
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sb

# Creating the distribution
data = np.arange(1, 10, 0.01)
pdf = norm.pdf(data, loc=5.3, scale=1)

# Visualizing the distribution
sb.set_style('whitegrid')
# recent seaborn releases require x and y to be passed as keyword arguments
sb.lineplot(x=data, y=pdf, color='black')
plt.xlabel('Heights')
plt.ylabel('Probability Density')
plt.show()

Output:
CORRELATION AND SCATTER PLOTS

ALGORITHM

Step 1: Start the Program

Step 2: Create variable y1, y2

Step 3: Create variable x, y3 using random function

Step 4: Plot the scatter plot

Step 5: Print the result

Step 6: Stop the process

Program:

# Scatterplot and Correlations
import numpy as np
import matplotlib.pyplot as plt

# Data
x = np.random.randn(100)
y1 = x * 5 + 9
y2 = -5 * x
y3 = np.random.randn(100)

# Plot
plt.rcParams.update({'figure.figsize': (10, 8), 'figure.dpi': 100})
plt.scatter(x, y1, label=f'y1, Correlation = {np.round(np.corrcoef(x, y1)[0, 1], 2)}')
plt.scatter(x, y2, label=f'y2, Correlation = {np.round(np.corrcoef(x, y2)[0, 1], 2)}')
plt.scatter(x, y3, label=f'y3, Correlation = {np.round(np.corrcoef(x, y3)[0, 1], 2)}')

# Title, legend and display
plt.title('Scatterplot and Correlations')
plt.legend()
plt.show()

Output
SIMPLE LINEAR REGRESSION

ALGORITHM

Step 1: Start the Program

Step 2: Import numpy and matplotlib package

Step 3: Define coefficient function

Step 4: Calculate cross-deviation and deviation about x

Step 5: Calculate regression coefficients

Step 6: Plot the Linear regression and define main function

Step 7: Print the result

Step 8: Stop the process

PROGRAM:

import numpy as np
import matplotlib.pyplot as plt

def estimate_coef(x, y):
    # number of observations/points
    n = np.size(x)

    # mean of x and y vector
    m_x = np.mean(x)
    m_y = np.mean(y)

    # calculating cross-deviation and deviation about x
    SS_xy = np.sum(y*x) - n*m_y*m_x
    SS_xx = np.sum(x*x) - n*m_x*m_x

    # calculating regression coefficients
    b_1 = SS_xy / SS_xx
    b_0 = m_y - b_1*m_x
    return (b_0, b_1)

def plot_regression_line(x, y, b):
    # plotting the actual points as scatter plot
    plt.scatter(x, y, color="m", marker="o", s=30)

    # predicted response vector
    y_pred = b[0] + b[1]*x

    # plotting the regression line
    plt.plot(x, y_pred, color="g")

    # putting labels
    plt.xlabel('x')
    plt.ylabel('y')

    # function to show plot
    plt.show()

def main():
    # observations / data
    x = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
    y = np.array([1, 3, 2, 5, 7, 8, 8, 9, 10, 12])

    # estimating coefficients
    b = estimate_coef(x, y)
    print("Estimated coefficients:\nb_0 = {}\nb_1 = {}".format(b[0], b[1]))

    # plotting regression line
    plot_regression_line(x, y, b)

if __name__ == "__main__":
    main()

Output :

Estimated coefficients (values rounded to four decimal places):

b_0 = 1.2364

b_1 = 1.1697

Graph:
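As a cross-check on the hand-written formulas, the same line can be fitted with `np.polyfit`. This is a sketch of an alternative, not the method used in the program above; for the data in `main()` both routes give a slope near 1.17 and an intercept near 1.24.

```python
import numpy as np

x = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
y = np.array([1, 3, 2, 5, 7, 8, 8, 9, 10, 12])

# Degree-1 polynomial fit; coefficients are returned highest power first
slope, intercept = np.polyfit(x, y, 1)

# The same quantities from the closed-form sums of squares
n = x.size
ss_xy = np.sum(x * y) - n * x.mean() * y.mean()   # cross-deviation
ss_xx = np.sum(x * x) - n * x.mean() ** 2         # deviation about x
manual_slope = ss_xy / ss_xx
manual_intercept = y.mean() - manual_slope * x.mean()

print(round(slope, 4), round(intercept, 4))
print(round(manual_slope, 4), round(manual_intercept, 4))
```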
MATPLOTLIB

Draw a line in a diagram from position (1, 3) to (2, 8) then to (6, 1) and
finally to position (8, 10):

import matplotlib.pyplot as plt


import numpy as np

xpoints = np.array([1, 2, 6, 8])


ypoints = np.array([3, 8, 1, 10])

plt.plot(xpoints, ypoints)
plt.show()
To draw a line from (1, 3) to (8, 10), pass two arrays, [1, 8] and [3, 10], to the plot
function:

import matplotlib.pyplot as plt


import numpy as np
xpoints = np.array([1, 8])
ypoints = np.array([3, 10])
plt.plot(xpoints, ypoints)
plt.show()

Markers

Use the marker keyword argument to mark each point with a circle. When only y-values are
passed to the plot function, the x-values default to 0, 1, 2, 3:

import matplotlib.pyplot as plt


import numpy as np
ypoints = np.array([3, 8, 1, 10])
plt.plot(ypoints, marker = 'o')
plt.show()
Marker Size
Use the ms (markersize) keyword argument to set the size of the markers, here 20:

import matplotlib.pyplot as plt


import numpy as np
ypoints = np.array([3, 8, 1, 10])
plt.plot(ypoints, marker = 'o', ms = 20)
plt.show()
Marker Color
Use the mec (markeredgecolor) keyword argument to set the colour of the marker edge, here
red with marker size 20. The mfc (markerfacecolor) argument sets the fill colour, as in
the commented alternatives:

import matplotlib.pyplot as plt


import numpy as np
ypoints = np.array([3, 8, 1, 10])
plt.plot(ypoints, marker = 'o', ms = 20, mec = 'r')
# plt.plot(ypoints, marker = 'o', ms = 20, mec = '#4CAF50', mfc = '#4CAF50')
# plt.plot(ypoints, marker = 'o', ms = 20, mec = 'hotpink', mfc = 'hotpink')
plt.show()
Create Labels for a Plot
With Pyplot, you can use the xlabel() and ylabel() functions to set a label
for the x- and y-axis.

import numpy as np
import matplotlib.pyplot as plt

x = np.array([80, 85, 90, 95, 100, 105, 110, 115, 120, 125])
y = np.array([240, 250, 260, 270, 280, 290, 300, 310, 320, 330])
plt.plot(x, y)
plt.title("Sports Watch Data")
plt.xlabel("Average Pulse")
plt.ylabel("Calorie Burnage")
plt.show()
Set Font Properties for Title and Labels

You can use the fontdict parameter in xlabel(), ylabel(), and title() to set font
properties for the title and labels.

Example

Set font properties for the title and labels:

import numpy as np

import matplotlib.pyplot as plt

x = np.array([80, 85, 90, 95, 100, 105, 110, 115, 120, 125])

y = np.array([240, 250, 260, 270, 280, 290, 300, 310, 320, 330])

font1 = {'family':'serif','color':'blue','size':20}

font2 = {'family':'serif','color':'darkred','size':15}
plt.title("Sports Watch Data", fontdict = font1)

plt.xlabel("Average Pulse", fontdict = font2)

plt.ylabel("Calorie Burnage", fontdict = font2)

plt.plot(x, y)

plt.show()

Matplotlib Scatter
With Pyplot, you can use the scatter() function to draw a scatter plot.

The scatter() function plots one dot for each observation. It needs two arrays
of the same length, one for the values of the x-axis, and one for values on
the y-axis:
import matplotlib.pyplot as plt
import numpy as np
x = np.array([5,7,8,7,2,17,2,9,4,11,12,9,6])
y = np.array([99,86,87,88,111,86,103,87,94,78,77,85,86])
plt.scatter(x, y)
plt.show()

ColorMap

import matplotlib.pyplot as plt


import numpy as np
x = np.array([5,7,8,7,2,17,2,9,4,11,12,9,6])
y = np.array([99,86,87,88,111,86,103,87,94,78,77,85,86])
colors = np.array([0, 10, 20, 30, 40, 45, 50, 55, 60, 70, 80, 90, 100])
plt.scatter(x, y, c=colors, cmap='viridis')
plt.colorbar()
plt.show()

Creating Bars
With Pyplot, you can use the bar() function to draw bar graphs:

import matplotlib.pyplot as plt


import numpy as np
x = np.array(["A", "B", "C", "D"])
y = np.array([3, 8, 1, 10])
plt.bar(x,y)
plt.show()
import matplotlib.pyplot as plt
import numpy as np
x = np.array(["A", "B", "C", "D"])
y = np.array([3, 8, 1, 10])
plt.bar(x, y, color = "red")
plt.show()

Histogram
A histogram is a graph showing frequency distributions.

It is a graph showing the number of observations within each given interval.

In Matplotlib, we use the hist() function to create histograms.

The hist() function takes an array of numbers as its argument and builds the histogram
from it.
import matplotlib.pyplot as plt
import numpy as np
x = np.random.normal(170, 10, 250)
plt.hist(x)
plt.show()

Creating Pie Charts


With Pyplot, you can use the pie() function to draw pie charts:

import matplotlib.pyplot as plt


import numpy as np
y = np.array([35, 25, 25, 15])
plt.pie(y)
plt.show()
import matplotlib.pyplot as plt
import numpy as np
y = np.array([35, 25, 25, 15])
mylabels = ["Apples", "Bananas", "Cherries", "Dates"]
plt.pie(y, labels = mylabels)
plt.show()

Explode
The explode parameter, if specified, and not None, must be an array with one
value for each wedge.
Each value represents how far from the center each wedge is displayed:

import matplotlib.pyplot as plt


import numpy as np
y = np.array([35, 25, 25, 15])
mylabels = ["Apples", "Bananas", "Cherries", "Dates"]
myexplode = [0.2, 0, 0, 0]
plt.pie(y, labels = mylabels, explode = myexplode)
plt.show()

Legend
To add a list explaining each wedge, use the legend() function:

import matplotlib.pyplot as plt


import numpy as np

y = np.array([35, 25, 25, 15])


mylabels = ["Apples", "Bananas", "Cherries", "Dates"]

plt.pie(y, labels = mylabels)


plt.legend()
plt.show()

Python program to perform Data Manipulation operations using Pandas package.

import pandas as pd
# Create a DataFrame
data = {
    'Name': ['John', 'Emma', 'Sam', 'Lisa', 'Tom'],
    'Age': [25, 30, 28, 32, 27],
    'Country': ['USA', 'Canada', 'Australia', 'UK', 'Germany'],
    'Salary': [50000, 60000, 55000, 70000, 52000]
}
df = pd.DataFrame(data)
print("Original DataFrame:")
print(df)
# Selecting columns
name_age = df[['Name', 'Age']]
print("\nName and Age columns:")
print(name_age)
# Filtering rows
filtered_df = df[df['Country'] == 'USA']
print("\nFiltered DataFrame (Country = 'USA'):")
print(filtered_df)
# Sorting by a column
sorted_df = df.sort_values('Salary', ascending=False)
print("\nSorted DataFrame (by Salary in descending order):")
print(sorted_df)
# Aggregating data
average_salary = df['Salary'].mean()
print("\nAverage Salary:", average_salary)
# Adding a new column
df['Experience'] = [3, 6, 4, 8, 5]
print("\nDataFrame with added Experience column:")
print(df)
# Updating values
df.loc[df['Name'] == 'Emma', 'Salary'] = 65000
print("\nDataFrame after updating Emma's Salary:")
print(df)
# Deleting a column
df = df.drop('Experience', axis=1)
print("\nDataFrame after deleting Experience column:")
print(df)
