Python Programming: Installation & Basics
Lawal O. O
INTRODUCTION AND INSTALLATION
Python is a widely used, general-purpose, high-level programming language. It was created by
Guido van Rossum, first released in 1991, and is now developed under the stewardship of the Python
Software Foundation. Its design emphasizes code readability, and its syntax lets programmers express
concepts in fewer lines of code.
Python is a programming language that lets you work quickly and integrate systems more effectively.
There are two major versions of Python: Python 2 and Python 3. Python 2 is no longer maintained, and
the examples in these notes use Python 3.
Features of Python
1. Object-oriented - Python supports concepts such as polymorphism, operator overloading and
multiple inheritance.
2. Indentation - Python uses indentation, rather than braces or keywords, to mark blocks of code,
which keeps programs visually consistent.
3. Free (open source) - Downloading and installing Python is free and easy.
4. Portable - Python runs on virtually every major platform in use today. As long as a compatible
Python interpreter is installed, Python programs run in the same manner irrespective of platform.
5. Easy to use and learn -
a. No separate compile step. Python programs are compiled automatically to an intermediate form
called byte code, which the interpreter then executes. This gives Python the development speed of an
interpreter without the performance loss inherent in purely interpreted languages.
b. The structure and syntax are intuitive and easy to grasp.
6. Interpreted language - Python code is processed at runtime by the Python interpreter.
7. Interactive programming language - Users can type statements directly into the Python interpreter
and see the results immediately.
8. Straightforward syntax - Python's syntax is simple and straightforward, which contributes to its
popularity.
INSTALLATION
Several freely available environments can run Python scripts. One of them is IDLE (Integrated
Development and Learning Environment), which is installed when you install the Python software from [Link]
RUNNING PYTHON
There are two modes for using the Python interpreter:
1. Interactive Mode
2. Script Mode
In interactive mode, the chevron at the beginning of a line, i.e., the symbol >>>, is the prompt the Python
interpreter uses to indicate that it is ready. If the programmer types 2+6, the interpreter replies 8.
Alternatively, programmers can store Python source code in a file with the .py extension and use
the interpreter to execute the contents of the file. To run the script, you give the interpreter the
name of the file. For example, if you have a script named [Link] and you are
working on Unix, you run the script by typing:
python [Link]
Interactive mode is convenient for small pieces of code because you can type and execute them
immediately. When the code is longer than a few lines, script mode is better because the file can be
modified and reused later.
Example:
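A minimal sketch of script mode, assuming a hypothetical file named add_numbers.py (the file name and variable names are illustrative and do not come from the original notes). Saving the lines below in that file and running python add_numbers.py executes them from top to bottom:

```python
# Contents of a hypothetical script file, saved for example as add_numbers.py
first = 2
second = 6
total = first + second           # the same calculation as typing 2+6 at the >>> prompt
print("The sum is", total)       # a script must print results explicitly
```

Unlike interactive mode, a script does not echo the value of an expression automatically, which is why print() is used here.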
PYTHON DATA TYPES
Python has several standard data types. The type of a value determines which operations can be
performed on it and how it is stored.
1. Int
Int, or integer, is a whole number, positive or negative, without decimals, of unlimited length.
Examples
>>> print(24656354687654+2)
24656354687656
>>> print(20)
20
>>> print(0b10)
2
>>> a=10
>>> print(a)
10
2. Float
Float, or "floating point number" is a number, positive or negative, containing one or more decimals.
Float can also be scientific numbers with an "e" to indicate the power of 10.
Examples
>>> y=2.8
>>> y
2.8
>>> y=2.8
>>> print(type(y))
<class 'float'>
>>> type(.4)
<class 'float'>
x = 35e3
y = 12E4
z = -87.7e100
print(type(x))
print(type(y))
print(type(z))
Output:
<class 'float'>
<class 'float'>
<class 'float'>
3. Boolean
Objects of Boolean type may have one of two values, True or False:
>>> type(True)
<class 'bool'>
>>> type(False)
<class 'bool'>
4. String
Strings in Python are contiguous sequences of characters enclosed in quotation marks.
Python allows either single or double quotes.
• 'hello' is the same as "hello".
• Strings can be output to screen using the print function. For example: print("hello").
>>> print("ict poly")
ict poly
>>> type("ict poly")
<class 'str'>
>>> print('ict poly')
ict poly
>>> " "
' '
If you want to include either type of quote character within the string, the simplest way is to delimit the
string with the other type. If a string is to contain a single quote, delimit it with double quotes and vice
versa:
>>> print("ict is an autonomous (') poly")
ict is an autonomous (') poly
>>> print('ict is an autonomous (") poly')
ict is an autonomous (") poly
5. List
a. A list is a general-purpose data structure and one of the most widely used in Python.
b. A list is a collection which is ordered and changeable and allows duplicate members. (It grows and
shrinks as needed, is a sequence type, and is sortable.)
c. To use a list, you must create it first. Do this using square brackets and separate the values with
commas.
d. A list can be constructed in several ways.
Example 1
>>> list1=[1,2,3,'A','B',7,8,[10,11]]
>>> print(list1)
[1, 2, 3, 'A', 'B', 7, 8, [10, 11]]
Example 2
>>> x=list()
>>> x
[]
Example 3
>>> tuple1=(1,2,3,4)
>>> x=list(tuple1)
>>> x
[1, 2, 3, 4]
6. Variables
A variable is a name that refers to a value stored in memory. When you assign a value to a variable,
the interpreter creates an object of the appropriate type and binds the name to it.
The type belongs to the value rather than to the name, so by assigning values of different types you
can make a variable refer to integers, decimals or strings.
Rules for Python variables:
a. A variable name must start with a letter or the underscore character
b. A variable name cannot start with a number
c. A variable name can only contain alpha-numeric characters and underscores (A-Z, a-z, 0-9, and _)
d. Variable names are case-sensitive (age, Age and AGE are three different variables)
Python allows you to assign a single value to several variables simultaneously.
For example:
a = b = c = 1
Here, an integer object is created with the value 1, and all three variables refer to that same
object. You can also assign multiple objects to multiple variables.
For example:
a, b, c = 1, 2, "john"
Here, two integer objects with values 1 and 2 are assigned to variables a and b respectively, and one
string object with the value "john" is assigned to the variable c.
Output Variables
The Python print() function is often used to output variables.
Variables do not need to be declared with any particular type and can even change type after they have
been set.
x = 5 # x is of type int
x = "john " # x is now of type str
print(x)
Output: john
To combine text with a string variable, Python uses the + operator:
Example 1
x = "awesome"
print("Python is " + x)
Output
Python is awesome
You can also use the + operator to join one string variable to another:
Example 2
x = "Python is "
y = "awesome"
z=x+y
print(z)
Output
Python is awesome
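The + operator behaves differently depending on the types involved. The sketch below uses illustrative variable names (count and price are assumptions, not from the notes) to show that + adds numbers, joins strings, and raises a TypeError when a string and a number are mixed without conversion:

```python
count = 3
price = 2.5

print(count + price)              # numbers are added arithmetically
print("Items: " + str(count))     # convert the number to a string before joining

try:
    print("Items: " + count)      # joining str and int directly is not allowed
except TypeError as error:
    print("TypeError:", error)
```

Converting with str() is one option; passing several arguments to print(), as in print("Items:", count), avoids the conversion altogether.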
7. Expressions
An expression is a combination of values, variables, and operators. The interpreter evaluates an
expression to produce a value, which can then be stored in a variable with the assignment operator (=).
Examples
>>> x=10
>>> z=x+20
>>> z
30
>>> x=10
>>> y=20
>>> c=x+y
>>> c
30
Expressions are built from identifiers, literals, and operators:
Identifiers: Any name that is used to define a class, function, variable, module, or object is an identifier.
Literals: These are notations for constant values written directly in the source code. In Python, there
are string literals, bytes literals, integer literals, floating point literals, and imaginary literals.
Operators: In Python you can perform the following operations using the corresponding tokens.
Operator                     Token
Add                          +
Subtract                     -
Multiply                     *
Division                     /
Integer (floor) division     //
Remainder                    %
Binary left shift            <<
Binary right shift           >>
Bitwise and                  &
Bitwise or                   |
Less than                    <
Greater than                 >
Less than or equal to        <=
Greater than or equal to     >=
Check equality               ==
Check not equal              !=
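The operators listed above can be tried out with a short sketch. The operand values 7 and 2 are arbitrary choices for illustration; in Python 3 the / operator returns a float, while // discards the fractional part:

```python
a, b = 7, 2

print(a + b)    # 9
print(a - b)    # 5
print(a * b)    # 14
print(a / b)    # 3.5  (true division always gives a float)
print(a // b)   # 3    (floor division)
print(a % b)    # 1    (remainder)
print(a << 1)   # 14   (bits shifted left by one place)
print(a >> 1)   # 3    (bits shifted right by one place)
print(a & b)    # 2    (0b111 & 0b010)
print(a | b)    # 7    (0b111 | 0b010)
print(a < b, a >= b, a == b, a != b)   # False True False True
```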
PRECEDENCE OF OPERATORS
Operator precedence determines the order in which the parts of an expression are evaluated.
For example, in x = 7 + 3 * 2, x is assigned 13, not 20, because the operator * has higher precedence
than +: the interpreter first multiplies 3 * 2 and then adds the result to 7.
Example 1
>>> 3+4*2
11
Multiplication is evaluated before addition.
>>> (10+10)*2
40
Parentheses () override the precedence of the arithmetic operators.
Example 2
a = 20
b = 10
c = 15
d = 5
e = (a + b) * c / d        # (30 * 15) / 5
print("Value of (a + b) * c / d is", e)
e = ((a + b) * c) / d      # (30 * 15) / 5
print("Value of ((a + b) * c) / d is", e)
e = (a + b) * (c / d)      # 30 * (15 / 5)
print("Value of (a + b) * (c / d) is", e)
e = a + (b * c) / d        # 20 + (150 / 5)
print("Value of a + (b * c) / d is", e)
Output
C:/Users/MRCET/AppData/Local/Programs/Python/Python38-32/pyyy/[Link]
Value of (a + b) * c / d is 90.0
Value of ((a + b) * c) / d is 90.0
Value of (a + b) * (c / d) is 90.0
Value of a + (b * c) / d is 50.0
COMMENTS
1. A single-line comment begins with a hash (#) symbol; everything from the # to the end of the line
is treated as a comment.
2. A multi-line comment is useful when we need to comment on many lines. Python has no dedicated
multi-line comment syntax, so a string in triple double quotes (""") or triple single quotes (''')
that is not assigned to anything is commonly used for this purpose.
Example
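The original listing for this example is not reproduced in these notes. A minimal sketch of both comment styles, assuming two illustrative variables a and b chosen so that the program prints 30, might look like this:

```python
# This is a single-line comment: the program adds two numbers.
a = 10
b = 20  # a comment can also follow code on the same line

"""
This triple-quoted string is not assigned to anything,
so it has no effect and reads like a multi-line comment.
"""
print(a + b)
```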
Output
30
LISTS
The following operations can be performed on a list:
1. List operations
These operations include indexing, slicing, concatenation (+), repetition (*), and membership tests (in).
Lists respond to the + and * operators much like strings; they mean concatenation and repetition here
too, except that the result is a new list, not a string.
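The operations named above can be sketched as follows. The list names numbers and letters are illustrative; note that + and * build new lists and leave the originals unchanged:

```python
numbers = [10, 20, 30]
letters = ['a', 'b']

print(numbers[0])        # 10  (indexing starts at 0)
print(numbers[-1])       # 30  (negative indexes count from the end)

combined = numbers + letters
print(combined)          # [10, 20, 30, 'a', 'b']  (concatenation)
print(letters * 2)       # ['a', 'b', 'a', 'b']    (repetition)

print(20 in numbers)     # True   (membership test)
print(99 in numbers)     # False
print(numbers)           # [10, 20, 30]  (the original list is unchanged)
```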
2. List slices
List slicing accesses a specific portion (a subset) of a list for some operation while the original
list remains unaffected. The slicing operator takes up to three parameters, all of which are optional.
Syntax of list slicing:
list_name[start:stop:step]
When start is omitted the slice begins at the first element, when stop is omitted it runs to the end
of the list, and when step is omitted it defaults to 1 (these defaults assume a positive step).
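A short sketch of the slicing syntax, using an illustrative list named values whose elements equal their own indexes so the results are easy to check:

```python
values = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

print(values[2:5])     # [2, 3, 4]        start and stop (stop is excluded)
print(values[:3])      # [0, 1, 2]        start omitted
print(values[7:])      # [7, 8, 9]        stop omitted
print(values[::2])     # [0, 2, 4, 6, 8]  every second element
print(values[1:8:3])   # [1, 4, 7]        start, stop and step together
print(values[::-1])    # [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]  a reversed copy
print(len(values))     # 10  (the original list is unaffected)
```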
3. List Methods
The following operations are commonly used to modify a list:
a. del b. append() c. extend() d. insert() e. pop() f. remove() g. reverse()
h. sort()
Note that del is a statement rather than a method.
Example 2
>>> del x
>>> x # the name x no longer exists, so this raises NameError
d. Insert: To add an item at the specified index, use the insert() method:
>>> x=[1,2,4,6,7]
>>> x.insert(2,10) #insert(index no, item to be inserted)
>>> x
[1, 2, 10, 4, 6, 7]
Example 2
>>> x.insert(4,['a',11])
>>> x
[1, 2, 10, 4, ['a', 11], 6, 7]
e. Pop: The pop() method removes the item at the specified index (or the last item if no index is
given) and returns it.
>>> x=[1, 2, 10, 4, 6, 7]
>>> x.pop()
7
>>> x
[1, 2, 10, 4, 6]
Example 2
>>> x=[1, 2, 10, 4, 6]
>>> x.pop(2)
10
>>> x
[1, 2, 4, 6]
f. Remove: The remove() method removes the first occurrence of the specified item from a list.
>>> x=[1,33,2,10,4,6]
>>> x.remove(33)
>>> x
[1, 2, 10, 4, 6]
>>> x.remove(4)
>>> x
[1, 2, 10, 6]
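The notes list append(), extend(), reverse() and sort() but show no listings for them. A minimal sketch with an illustrative list x follows; each of these methods changes the list in place and returns None:

```python
x = [3, 1, 2]

x.append(5)          # add one item to the end
print(x)             # [3, 1, 2, 5]

x.extend([4, 0])     # add every item of another list to the end
print(x)             # [3, 1, 2, 5, 4, 0]

x.reverse()          # reverse the order in place
print(x)             # [0, 4, 5, 2, 1, 3]

x.sort()             # sort in ascending order in place
print(x)             # [0, 1, 2, 3, 4, 5]
```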
4. List loop
Loops are control structures used to repeat a given section of code a certain number of times or until a
particular condition is met. They are discussed in the ITERATION section.
5. Mutability
A mutable object can be changed after it is created, and an immutable object can't.
[1, 3, 5, 7, 8, 10]
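The listing that produced the output above is not reproduced in these notes. A minimal sketch, assuming a hypothetical starting list in which the last element is reassigned, contrasts a mutable list with an immutable string:

```python
numbers = [1, 3, 5, 7, 8, 9]   # hypothetical starting values
numbers[5] = 10                # lists are mutable: an element can be replaced
print(numbers)                 # [1, 3, 5, 7, 8, 10]

word = "cat"
try:
    word[0] = "b"              # strings are immutable: this is not allowed
except TypeError as error:
    print("TypeError:", error)
print(word)                    # cat
```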
6. Aliasing
1. An alias is a second name for a piece of data; creating one is often easier (and more useful) than
making a copy.
2. If the data is immutable, aliases don't matter because the data can't change.
3. But if the data can change, aliases can result in a lot of hard-to-find bugs.
4. Aliasing happens whenever one variable's value is assigned to another variable.
For example:
a = [81, 82, 83]
b = [81, 82, 83]
print(a == b)
print(a is b)
b=a
print(a == b)
print(a is b)
b[0] = 5
print(a)
Output
True
False
True
True
[5, 82, 83]
Because the same list has two different names, a and b, we say that it is aliased. Changes made with one
alias affect the other. In the example above, you can see that a and b refer to the same list after executing
the assignment statement b = a.
7. Cloning Lists
If we want to modify a list and also keep a copy of the original, we need to be able to make a copy of the
list itself, not just the reference. This process is sometimes called cloning, to avoid the ambiguity of the
word copy.
The easiest way to clone a list is to use the slice operator. Taking any slice of a creates a new list. In this
case the slice happens to consist of the whole list.
Example
a = [81, 82, 83]
b = a[:] # make a clone using slice
print(a == b)
print(a is b)
b[0] = 5
print(a)
print(b)
Output
True
False
[81, 82, 83]
[5, 82, 83]
Now we are free to make changes to b without worrying about a.
8. List parameters
Passing a list as an argument actually passes a reference to the list, not a copy of the list. Since lists are
mutable, changes made to the elements referenced by the parameter change the same list that the
argument is referencing.
For example, the function below takes a list as an argument and multiplies each element in the list by 2:
def doubleStuff(a_list):
    """Overwrite each element in a_list with double its value."""
    for position in range(len(a_list)):
        a_list[position] = 2 * a_list[position]

things = [2, 5, 9]
print(things)
doubleStuff(things)
print(things)
Output
[2, 5, 9]
[4, 10, 18]
9. List comprehension
List comprehensions provide a concise way to create lists. Common applications are to make new lists
where each element is the result of some operations applied to each member of another sequence or
iterable, or to create a subsequence of those elements that satisfy a certain condition.
For example, assume we want to create a list of squares, like:
>>> list1=[]
>>> for x in range(10):
        list1.append(x**2)

>>> list1
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
(or)
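The comprehension form is missing from these notes at this point. A sketch of the equivalent list comprehension builds the same list of squares in one line; the second list, evens, is an added illustration of the optional filtering condition:

```python
# Same result as the loop: the square of every number from 0 to 9
list1 = [x**2 for x in range(10)]
print(list1)    # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

# An optional condition keeps only the elements that satisfy it
evens = [x for x in range(10) if x % 2 == 0]
print(evens)    # [0, 2, 4, 6, 8]
```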
TUPLES
A tuple is a collection which is ordered and unchangeable. In Python, tuples are written with round
brackets.
1. It supports all the sequence operations that do not modify the sequence.
2. It is immutable, but member objects may be mutable.
3. If the contents of a collection shouldn't change, use a tuple to prevent items from accidentally
being added, changed, or deleted.
4. Tuples are generally a little more memory-efficient than lists because their size is fixed when
they are created.
Topics associated with tuples include tuple assignment and tuples as return values.
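Tuple assignment and tuples as return values can be sketched as follows. The helper name min_max is hypothetical and serves only to illustrate returning two values at once:

```python
def min_max(values):
    """Return the smallest and largest value as a tuple."""
    return min(values), max(values)   # the comma builds a tuple

# Tuple assignment: the right-hand side is unpacked into the names on the left
a, b = 1, 2
a, b = b, a                           # swap two variables without a temporary
print(a, b)                           # 2 1

# A tuple return value can be unpacked in the same way
low, high = min_max([4, 9, 2, 7])
print(low, high)                      # 2 9
```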
Example 1
>>> x=(1,2,3)
>>> print(x)
(1, 2, 3)
>>> x
(1, 2, 3)
Example 2
>>> x=()
>>> x
()
Example 3
>>> x=[4,5,66,9]
>>> y=tuple(x)
>>> y
(4, 5, 66, 9)
Example 4
>>> x=1, 2, 3, 4
>>> x
(1, 2, 3, 4)
a. Access tuple items: Access tuple items by referring to the index number inside square brackets.
Indexing starts at 0, so index 2 refers to the third item.
>>> x=('a','b','c','g')
>>> print(x[2])
c
b. Change tuple items: Once a tuple is created, you cannot change its values. Tuples are
unchangeable.
>>> x=(2,5,7,'4',8)
>>> x[1]=10
Traceback (most recent call last):
File "<pyshell#41>", line 1, in <module>
x[1]=10
TypeError: 'tuple' object does not support item assignment
c. Loop through a tuple: We can loop through the values of a tuple using a for loop.
>>> x=4,5,6,7,2,'aa'
>>> for i in x:
        print(i)
4
5
6
7
2
aa
d. count(): Returns the number of times a specified value occurs in a tuple.
>>> x=(1,2,3,4,5,6,2,10,2,11,12,2)
>>> x.count(2)
4
e. index(): Searches the tuple for a specified value and returns the position of its first occurrence.
>>> x=(1,2,3,4,5,6,2,10,2,11,12,2)
>>> x.index(2)
1
(Or)
>>> x=(1,2,3,4,5,6,2,10,2,11,12,2)
>>> y=x.index(2)
>>> print(y)
1
f. len(): To find the number of items present in a tuple, we use len().
>>> x=(1,2,3,4,5,6,2,10,2,11,12,2)
>>> y=len(x)
>>> print(y)
12
Conditionals: Boolean values and operators, conditional (if), alternative (if-else), chained conditional
(if-elif-else), Iteration, while, for, break, continue.
4. CONDITIONAL (IF):
The if statement contains a logical expression that is used to compare data; a decision is made
based on the result of the comparison.
Syntax:
if expression:
    statement(s)
If the boolean expression evaluates to True, the block of statement(s) inside the if statement is
executed. If it evaluates to False, execution continues with the first statement after the end of the
if block.
If Statement Flowchart:
Fig: Operation of if statement
Example 1
a = 3
if a > 2:
    print(a, "is greater")
print("done")
a = -1
if a < 0:
    print(a, "a is smaller")
print("Finish")
Output
3 is greater
done
-1 a is smaller
Finish
Example 2
a = 10
if a > 9:
    print("a is Greater than 9")
Output
a is Greater than 9
5. ALTERNATIVE IF (IF-ELSE)
An else statement can be combined with an if statement. The else statement contains the block of code
(the false block) that executes if the conditional expression in the if statement resolves to 0 or a
False value.
The else statement is optional, and there can be at most one else statement following an if.
Syntax of if - else:
if test expression:
    Body of if statement
else:
    Body of else statement
If - else Flowchart:
Example 1
a = int(input('enter the number '))
if a > 5:
    print("a is greater")
else:
    print("a is not greater than 5")
Output
enter the number 2
a is not greater than 5
Example 2
a=10
b=20
if a > b:
    print("A is Greater than B")
else:
    print("B is Greater than A")
Output
B is Greater than A
6. CHAINED CONDITIONAL (IF-ELIF-ELSE)
Syntax of if - elif - else:
if test expression:
    Body of if statement
elif test expression:
    Body of elif statement
else:
    Body of else statement
Output
enter the number5
enter the number2
enter the number9
a is greater
>>>
enter the number2
enter the number5
enter the number9
c is greater
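The listing that produced the output above is not reproduced in these notes. The sketch below is an independent illustration of a chained conditional rather than a reconstruction of that listing; the function name largest_of_three is hypothetical, and fixed arguments replace the keyboard input so the result is repeatable:

```python
def largest_of_three(a, b, c):
    """Report which of three numbers is the greatest using if-elif-else."""
    if a >= b and a >= c:      # tested first
        return "a is greatest"
    elif b >= c:               # tested only when the first condition is False
        return "b is greatest"
    else:                      # runs when no condition above was True
        return "c is greatest"

print(largest_of_three(9, 5, 2))   # a is greatest
print(largest_of_three(2, 9, 5))   # b is greatest
print(largest_of_three(2, 5, 9))   # c is greatest
```

Only the first branch whose condition is True runs; the remaining branches are skipped.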
Example 2
var = 100
if var == 200:
    print("1 - Got a true expression value")
    print(var)
elif var == 150:
    print("2 - Got a true expression value")
    print(var)
elif var == 100:
    print("3 - Got a true expression value")
    print(var)
else:
    print("4 - Got a false expression value")
    print(var)
print("Good bye!")
Output
3 - Got a true expression value
100
Good bye!
7. ITERATION
A loop statement allows us to execute a statement or group of statements multiple times as long as a
condition is true. Repeated execution of a set of statements with the help of loops is called iteration.
Loop statements are used when we need to run the same code again and again, each time with a different
value.
Statements:
Python iteration (loop) statements are of three types:
1. While Loop
2. For Loop
3. Nested For Loops
1. While loop
a. Loops are either infinite or conditional. A Python while loop keeps repeating the block of code
defined inside it until its condition becomes false.
b. The while loop contains a boolean expression, and the code inside the loop is executed repeatedly
as long as that expression is true.
c. The statements that are executed inside a while loop can be a single line of code or a block of
multiple statements.
Syntax:
while expression:
    statement(s)
Flowchart:
Example 1
i = 1
while i <= 6:
    print("Mrcet college")
    i = i + 1
Output
Mrcet college
Mrcet college
Mrcet college
Mrcet college
Mrcet college
Mrcet college
Example 2
i = 1
while i < 10:
    print(i)
    i = i + 1
Output
1
2
3
4
5
6
7
8
9
Example 3
a = 1
b = 1
while a < 10:
    print('Iteration', a)
    a = a + 1
    b = b + 1
    if b == 4:
        break    # leave the loop as soon as b reaches 4
print('While loop terminated')
Output
Iteration 1
Iteration 2
Iteration 3
While loop terminated
Example 4
count = 0
while count < 9:
    print("The count is:", count)
    count = count + 1
print("Good bye!")
Output
The count is: 0
The count is: 1
The count is: 2
The count is: 3
The count is: 4
The count is: 5
The count is: 6
The count is: 7
The count is: 8
Good bye!
8. FOR LOOP
The Python for loop is used to repeat a group of statements once for each item in a sequence.
It iterates over the items of lists, tuples, strings, dictionaries and other iterable objects.
Example
numbers = [1, 2, 4, 6, 11, 20]
seq = 0
for val in numbers:
    seq = val * val
    print(seq)
Output
1
4
16
36
121
400
Flowchart:
Output
college 1 is M
college 2 is R
college 3 is C
college 4 is E
college 5 is T
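The listing that produced the output above is not reproduced in these notes. One way to obtain the same lines, assuming a hypothetical list named college that holds the five letters, is to iterate over the list with enumerate():

```python
college = ["M", "R", "C", "E", "T"]   # assumed contents, inferred from the output

# enumerate() yields (position, item) pairs; start=1 makes counting begin at 1
for position, letter in enumerate(college, start=1):
    print("college", position, "is", letter)
```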
2. Iterating over a Tuple
primes = (2, 3, 5, 7)    # avoid naming a variable tuple, which would hide the built-in type
print('These are the first four prime numbers ')
# Iterating over the tuple
for a in primes:
    print(a)
Output
These are the first four prime numbers
2
3
5
7
Output
Keys are:
ces
it
ece
Values are:
block1
block2
block3
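The listing for this output is also missing. A sketch that would print the same keys and values, assuming a hypothetical dictionary named college built from the values shown above, iterates over a dictionary in two ways:

```python
college = {"ces": "block1", "it": "block2", "ece": "block3"}   # assumed contents

print("Keys are:")
for key in college:                 # iterating over a dict yields its keys
    print(key)

print("Values are:")
for value in college.values():      # values() yields the stored values
    print(value)
```

Dictionaries keep their insertion order in Python 3.7 and later, which is why the keys and values appear in the order they were written.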
Output
M
R
C
E
T
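The listing for the output above is not reproduced either. Assuming it iterated over a string (the variable name and its value are inferred from the output), a sketch would be:

```python
college = "MRCET"        # assumed value, inferred from the output

for letter in college:   # a string is iterated one character at a time
    print(letter)
```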
Nested for loops
Example 1
for i in range(1,6):
    for j in range(0,i):
        print(i, end=" ")
    print('')
Output
1
2 2
3 3 3
4 4 4 4
5 5 5 5 5
Example 2
for i in range(1,6):
    for j in range(5,i-1,-1):
        print(i, end=" ")
    print('')
Output
1 1 1 1 1
2 2 2 2
3 3 3
4 4
5
FUNCTIONS
Fruitful functions:
Topics: fruitful functions (return values, parameters, local and global scope, function composition,
recursion); strings (string slices, immutability, string functions and methods, string module); Python
arrays (accessing the elements of an array, array methods).
In a fruitful function, the return statement includes a return value. The statement means "Return
immediately from this function and use the following expression as a return value."
In other words, any function that returns a value is called a fruitful function. A function that does
not return a value is called a void function.
Return values
The keyword return is used to send a value back to the caller of the function.
# returns the area of a circle with the given radius:
def area(radius):
    temp = 3.14 * radius**2
    return temp
print(area(4))
(OR)
def area(radius):
    return 3.14 * radius**2
print(area(2))
Sometimes it is useful to have multiple return statements, one in each branch of a conditional:
def absolute_value(x):
    if x < 0:
        return -x
    else:
        return x
Since these return statements are in an alternative conditional, only one will be executed.
As soon as a return statement executes, the function terminates without executing any subsequent
statements. Code that appears after a return statement, or any other place the flow of execution can never
reach, is called dead code.
In a fruitful function, it is a good idea to ensure that every possible path through the function hits a
return statement. Consider this counter-example:
def absolute_value(x):
    if x < 0:
        return -x
    if x > 0:
        return x
This function is incorrect because if x happens to be 0, neither condition is true, and the function
ends without hitting a return statement. If the flow of execution gets to the end of a function, the
return value is None, which is not the absolute value of 0.
>>> print(absolute_value(0))
None
Python provides a built-in function called abs that computes absolute values.
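A minimal sketch of a version in which every path reaches a return statement, checked against the built-in abs for a negative value, zero and a positive value (the test values are arbitrary):

```python
def absolute_value(x):
    """Return the absolute value of x; every path ends in a return."""
    if x < 0:
        return -x
    return x          # reached for x > 0 and for x == 0

for number in (-3, 0, 2.5):
    # compare the hand-written function with the built-in abs
    print(number, absolute_value(number), abs(number))
```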
1. Write a Python function that takes two lists and returns True if they have at least one common
member.
def common_data(list1, list2):
    for x in list1:
        for y in list2:
            if x == y:
                return True
    return False    # no common member was found
print(common_data([1,2,3,4,5], [1,2,3,4,5]))
print(common_data([1,2,3,4,5], [1,7,8,9,510]))
print(common_data([1,2,3,4,5], [6,7,8,9,10]))
Output
True
True
False
PARAMETERS
Parameters are the names listed in a function definition, while arguments are the values passed to the
function when it is called.
Example 1
# here a and b are parameters
def add(a, b):    # function definition
    return a + b
# 12 and 13 are arguments
# function call
result = add(12, 13)
print(result)
Output
25
Example 2
def Fun1():
    print("function 1")
Fun1()
Output
function 1
Example 3
def fun2(a):
    print(a)
fun2("hello")
Output
hello
Example 4
def fun3():
    return "welcome to python"
print(fun3())
Output
welcome to python
Example 5
def fun4(a):
    return a
print(fun4("python is better than c"))
Output
python is better than c
RECURSION
The following image shows the working of a recursive function called recurse.
Factorial of a number is the product of all the integers from 1 to that number. For example, the factorial
of 6 (denoted as 6!) is 1*2*3*4*5*6 = 720.
Below are examples of recursive function to find the factorial of an integer:
print("five factorial",fact(5))
Output
zero factorial 1
five factorial 120
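The definition of fact is missing from these notes; only its final print call and the output survive. A minimal sketch of a recursive function that would produce the two output lines above, assuming 0 is used as the base condition, is:

```python
def fact(n):
    """Return n! for a non-negative integer n."""
    if n == 0:                  # base condition: 0! is defined as 1
        return 1
    return n * fact(n - 1)      # recursive step on a smaller number

print("zero factorial", fact(0))
print("five factorial", fact(5))
```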
Output
The factorial of 4 is 24
def factorial(x):
    """This is a recursive function
    to find the factorial of an integer"""
    if x == 1:
        return 1
    else:
        return (x * factorial(x-1))
num = 3
print("The factorial of", num, "is", factorial(num))
Output
The factorial of 3 is 6
When we call this function with a positive integer, it recursively calls itself with a smaller number
each time.
Each call multiplies its number by the factorial of the number below it, until the number equals one.
This recursive call can be explained in the following steps.
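The step-by-step figure is not reproduced in these notes. As a substitute, the sketch below prints each call and each return value; factorial_traced and its depth parameter are hypothetical names added purely for illustration:

```python
def factorial_traced(x, depth=0):
    """Same logic as factorial(), with each call and return printed."""
    indent = "  " * depth                     # deeper calls are indented further
    print(f"{indent}factorial({x}) called")
    if x == 1:                                # base condition
        print(f"{indent}factorial(1) returns 1")
        return 1
    result = x * factorial_traced(x - 1, depth + 1)
    print(f"{indent}factorial({x}) returns {x} * {result // x} = {result}")
    return result

factorial_traced(3)
```

The calls go down from 3 to 1 first, and the multiplications happen on the way back up: 1, then 2 * 1 = 2, then 3 * 2 = 6.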
Our recursion ends when the number reduces to 1. This is called the base condition. Every recursive
function must have a base condition that stops the recursion; otherwise the function keeps calling itself.
The Python interpreter limits the depth of recursion so that runaway recursion does not overflow the
stack. By default, the maximum recursion depth is 1000. If the limit is exceeded, a RecursionError is
raised. Let's look at one such case.
def recursor():
    recursor()
recursor()
Output
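The traceback itself is not reproduced in these notes. A sketch that reads the current limit with sys.getrecursionlimit() and catches the error instead of letting the program stop (the outcome variable is an illustrative addition):

```python
import sys

def recursor():
    recursor()      # no base condition: every call makes another call

print("Recursion limit:", sys.getrecursionlimit())

try:
    recursor()
    outcome = "finished normally"
except RecursionError as error:
    outcome = "RecursionError"
    print("RecursionError:", error)
```

The exact limit and message can differ between Python versions and environments, so no fixed output is shown here.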
Advantages of Recursion
Disadvantages of Recursion
MODULES
Advantages of modular programming
a. Simplicity: Rather than focusing on the entire problem at hand, a module typically focuses on
one relatively small portion of the problem. If you're working on a single module, you'll have a
smaller problem domain to wrap your head around. This makes development easier and less
error-prone.
b. Maintainability: Modules are typically designed so that they enforce logical boundaries
between different problem domains. If modules are written in a way that minimizes
interdependency, there is decreased likelihood that modifications to a single module will have an
impact on other parts of the program. This makes it more viable for a team of many programmers
to work collaboratively on a large application.
c. Reusability: Functionality defined in a single module can be easily reused (through an
appropriately defined interface) by other parts of the application. This eliminates the need to
duplicate code.
d. Scoping: Modules typically define a separate namespace, which helps avoid collisions between
identifiers in different areas of a program.
Functions, modules and packages are all constructs in Python that promote code modularization.
A file containing Python code, e.g. example.py, is called a module, and its module name would be
example.
Example
def add(a, b):
    """This function adds two numbers and returns the result."""
    result = a + b
    return result
Here, we have defined a function add() inside a module named example. The function takes in two
numbers and returns their sum.
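Reloading a module:
def hi(a,b):
    print(a+b)
hi(4,4)
Output
8
>>> import add
8
>>> import add
>>> import add
>>>
Assuming the code above is saved as a module named add, its body runs only on the first import in an
interpreter session. Later import statements reuse the already loaded module, so nothing is printed again.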
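To run a module's code again within the same session, the standard library provides importlib.reload(). The sketch below is self-contained: it writes a throwaway module to a temporary folder, and the module name demo_add_module is a hypothetical choice made to avoid clashing with a real module:

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as folder:
    # Create a small module file whose body prints 8 when it is executed.
    Path(folder, "demo_add_module.py").write_text(
        "def hi(a, b):\n    print(a + b)\n\nhi(4, 4)\n"
    )
    sys.path.insert(0, folder)                 # let import find the new file
    try:
        import demo_add_module                 # first import: body runs, prints 8
        import demo_add_module                 # already loaded: nothing is printed
        importlib.reload(demo_add_module)      # body runs again, prints 8
    finally:
        sys.path.remove(folder)                # undo the path change
        sys.modules.pop("demo_add_module", None)
```

Reloading is mainly useful while experimenting interactively, after the source file of an imported module has been edited.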
1. Datetime module
Output
2000-09-18
Output
[Link]
Example 4: Write a python program to print date, time for today and now.
import datetime
a = datetime.datetime.today()
b = datetime.datetime.now()
print(a)
print(b)
Output
2019-11-29 [Link].235581
2019-11-29 [Link].235581
Example 5: Write a python program to add some days to your present date and print the date added.
import datetime
a = datetime.date.today()
b = datetime.timedelta(days=7)
print(a + b)
Output
2019-12-06
Example 6: Write a Python program to print the number of days remaining until your birthday.
import datetime
a = datetime.date.today()
b = datetime.date(2020,5,27)
c = b - a
print(c)
Output
180 days, 0:00:00
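The outputs of Examples 5 and 6 depend on the day the programs are run. A sketch with a fixed start date, assuming the examples were run on 2019-11-29 as the output of Example 4 suggests, gives repeatable results:

```python
import datetime

start = datetime.date(2019, 11, 29)        # fixed date instead of date.today()
one_week = datetime.timedelta(days=7)
print(start + one_week)                    # 2019-12-06

birthday = datetime.date(2020, 5, 27)
remaining = birthday - start               # subtracting dates gives a timedelta
print(remaining)                           # 180 days, 0:00:00
print(remaining.days)                      # 180
```

Subtracting two date objects yields a timedelta; its days attribute holds the whole number of days.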
2. Time module
Example 1: Write a python program to display time.
import time
print(time.time())
Output
1575012547.1584706
While The Python Language Reference describes the exact syntax and semantics of the Python
language, this library reference manual describes the standard library that is distributed with Python. It
also describes some of the optional components that are commonly included in Python distributions.
Python’s standard library is very extensive, offering a wide range of facilities as indicated by the long
table of contents listed below. The library contains built-in modules (written in C) that provide access to
system functionality such as file I/O that would otherwise be inaccessible to Python programmers, as
well as modules written in Python that provide standardized solutions for many problems that occur in
everyday programming. Some of these modules are explicitly designed to encourage and enhance the
portability of Python programs by abstracting away platform-specifics into platform-neutral APIs.
The Python installers for the Windows platform usually include the entire standard library and often also
include many additional components. For Unix-like operating systems Python is normally provided as a
collection of packages, so it may be necessary to use the packaging tools provided with the operating
system to obtain some or all of the optional components.
In addition to the standard library, there is a growing collection of several thousand components (from
individual programs and modules to packages and entire application development frameworks),
available from the Python Package Index.
Introduction
o Notes on availability
Built-in Functions
Built-in Constants
o Constants added by the site module
Built-in Types
o Truth Value Testing
o Boolean Operations — and, or, not
o Comparisons
o Numeric Types — int, float, complex
o Iterator Types
o Sequence Types — list, tuple, range
o Text Sequence Type — str
o Binary Sequence Types — bytes, bytearray, memoryview
o Set Types — set, frozenset
o Mapping Types — dict
o Context Manager Types
o Generic Alias Type
o Other Built-in Types
o Special Attributes
Built-in Exceptions
o Base classes
o Concrete exceptions
o Warnings
o Exception hierarchy
Text Processing Services
o string — Common string operations
o re — Regular expression operations
o difflib — Helpers for computing deltas
o textwrap — Text wrapping and filling
o unicodedata — Unicode Database
o stringprep — Internet String Preparation
o readline — GNU readline interface
o rlcompleter — Completion function for GNU readline
Binary Data Services
o struct — Interpret bytes as packed binary data
o codecs — Codec registry and base classes
Data Types
o datetime — Basic date and time types
o zoneinfo — IANA time zone support
o calendar — General calendar-related functions
o collections — Container datatypes
o collections.abc — Abstract Base Classes for Containers
o heapq — Heap queue algorithm
o bisect — Array bisection algorithm
o array — Efficient arrays of numeric values
o weakref — Weak references
o types — Dynamic type creation and names for built-in types
o copy — Shallow and deep copy operations
o pprint — Data pretty printer
o reprlib — Alternate repr() implementation
o enum — Support for enumerations
o graphlib — Functionality to operate with graph-like structures
Numeric and Mathematical Modules
o numbers — Numeric abstract base classes
o math — Mathematical functions
o cmath — Mathematical functions for complex numbers
o decimal — Decimal fixed point and floating point arithmetic
o fractions — Rational numbers
o random — Generate pseudo-random numbers
o statistics — Mathematical statistics functions
Functional Programming Modules
o itertools — Functions creating iterators for efficient looping
o functools — Higher-order functions and operations on callable objects
o operator — Standard operators as functions
File and Directory Access
o pathlib — Object-oriented filesystem paths
o os.path — Common pathname manipulations
o fileinput — Iterate over lines from multiple input streams
o stat — Interpreting stat() results
o filecmp — File and Directory Comparisons
o tempfile — Generate temporary files and directories
o glob — Unix style pathname pattern expansion
o fnmatch — Unix filename pattern matching
o linecache — Random access to text lines
o shutil — High-level file operations
Data Persistence
o pickle — Python object serialization
o copyreg — Register pickle support functions
o shelve — Python object persistence
o marshal — Internal Python object serialization
o dbm — Interfaces to Unix “databases”
o sqlite3 — DB-API 2.0 interface for SQLite databases
Data Compression and Archiving
o zlib — Compression compatible with gzip
o gzip — Support for gzip files
o bz2 — Support for bzip2 compression
o lzma — Compression using the LZMA algorithm
o zipfile — Work with ZIP archives
o tarfile — Read and write tar archive files
File Formats
o csv — CSV File Reading and Writing
o configparser — Configuration file parser
o netrc — netrc file processing
o xdrlib — Encode and decode XDR data
o plistlib — Generate and parse Apple .plist files
Cryptographic Services
o hashlib — Secure hashes and message digests
o hmac — Keyed-Hashing for Message Authentication
o secrets — Generate secure random numbers for managing secrets
Generic Operating System Services
o os — Miscellaneous operating system interfaces
o io — Core tools for working with streams
o time — Time access and conversions
o argparse — Parser for command-line options, arguments and sub-commands
o getopt — C-style parser for command line options
o logging — Logging facility for Python
o logging.config — Logging configuration
o logging.handlers — Logging handlers
o getpass — Portable password input
o curses — Terminal handling for character-cell displays
o curses.textpad — Text input widget for curses programs
o curses.ascii — Utilities for ASCII characters
o curses.panel — A panel stack extension for curses
o platform — Access to underlying platform’s identifying data
o errno — Standard errno system symbols
o ctypes — A foreign function library for Python
Concurrent Execution
o threading — Thread-based parallelism
o multiprocessing — Process-based parallelism
o multiprocessing.shared_memory — Provides shared memory for direct access across
processes
o The concurrent package
o concurrent.futures — Launching parallel tasks
o subprocess — Subprocess management
o sched — Event scheduler
o queue — A synchronized queue class
o contextvars — Context Variables
o _thread — Low-level threading API
Networking and Interprocess Communication
o asyncio — Asynchronous I/O
o socket — Low-level networking interface
o ssl — TLS/SSL wrapper for socket objects
o select — Waiting for I/O completion
o selectors — High-level I/O multiplexing
o asyncore — Asynchronous socket handler
o asynchat — Asynchronous socket command/response handler
o signal — Set handlers for asynchronous events
o mmap — Memory-mapped file support
Internet Data Handling
o email — An email and MIME handling package
o json — JSON encoder and decoder
o mailcap — Mailcap file handling
o mailbox — Manipulate mailboxes in various formats
o mimetypes — Map filenames to MIME types
o base64 — Base16, Base32, Base64, Base85 Data Encodings
o binhex — Encode and decode binhex4 files
o binascii — Convert between binary and ASCII
o quopri — Encode and decode MIME quoted-printable data
o uu — Encode and decode uuencode files
Structured Markup Processing Tools
o html — HyperText Markup Language support
o html.parser — Simple HTML and XHTML parser
o html.entities — Definitions of HTML general entities
o XML Processing Modules
o xml.etree.ElementTree — The ElementTree XML API
o xml.dom — The Document Object Model API
o xml.dom.minidom — Minimal DOM implementation
o xml.dom.pulldom — Support for building partial DOM trees
o xml.sax — Support for SAX2 parsers
o xml.sax.handler — Base classes for SAX handlers
o xml.sax.saxutils — SAX Utilities
o xml.sax.xmlreader — Interface for XML parsers
o xml.parsers.expat — Fast XML parsing using Expat
Internet Protocols and Support
o webbrowser — Convenient Web-browser controller
o cgi — Common Gateway Interface support
o cgitb — Traceback manager for CGI scripts
o wsgiref — WSGI Utilities and Reference Implementation
o urllib — URL handling modules
o urllib.request — Extensible library for opening URLs
o urllib.response — Response classes used by urllib
o urllib.parse — Parse URLs into components
o urllib.error — Exception classes raised by urllib.request
o urllib.robotparser — Parser for robots.txt
o http — HTTP modules
o http.client — HTTP protocol client
o ftplib — FTP protocol client
o poplib — POP3 protocol client
o imaplib — IMAP4 protocol client
o nntplib — NNTP protocol client
o smtplib — SMTP protocol client
o smtpd — SMTP Server
o telnetlib — Telnet client
o uuid — UUID objects according to RFC 4122
o socketserver — A framework for network servers
o http.server — HTTP servers
o http.cookies — HTTP state management
o http.cookiejar — Cookie handling for HTTP clients
o xmlrpc — XMLRPC server and client modules
o xmlrpc.client — XML-RPC client access
o xmlrpc.server — Basic XML-RPC servers
o ipaddress — IPv4/IPv6 manipulation library
Multimedia Services
o audioop — Manipulate raw audio data
o aifc — Read and write AIFF and AIFC files
o sunau — Read and write Sun AU files
o wave — Read and write WAV files
o chunk — Read IFF chunked data
o colorsys — Conversions between color systems
o imghdr — Determine the type of an image
o sndhdr — Determine type of sound file
o ossaudiodev — Access to OSS-compatible audio devices
Internationalization
o gettext — Multilingual internationalization services
o locale — Internationalization services
Program Frameworks
o turtle — Turtle graphics
o cmd — Support for line-oriented command interpreters
o shlex — Simple lexical analysis
Graphical User Interfaces with Tk
o tkinter — Python interface to Tcl/Tk
o tkinter.colorchooser — Color choosing dialog
o tkinter.font — Tkinter font wrapper
o Tkinter Dialogs
o tkinter.messagebox — Tkinter message prompts
o tkinter.scrolledtext — Scrolled Text Widget
o tkinter.dnd — Drag and drop support
o tkinter.ttk — Tk themed widgets
o tkinter.tix — Extension widgets for Tk
o IDLE
o Other Graphical User Interface Packages
Development Tools
o typing — Support for type hints
o pydoc — Documentation generator and online help system
o Python Development Mode
o Effects of the Python Development Mode
o ResourceWarning Example
o Bad file descriptor error example
o doctest — Test interactive Python examples
o unittest — Unit testing framework
o unittest.mock — mock object library
o unittest.mock — getting started
o 2to3 - Automated Python 2 to 3 code translation
o test — Regression tests package for Python
o test.support — Utilities for the Python test suite
o test.support.socket_helper — Utilities for socket tests
o test.support.script_helper — Utilities for the Python execution tests
o test.support.bytecode_helper — Support tools for testing correct bytecode generation
Debugging and Profiling
o Audit events table
o bdb — Debugger framework
o faulthandler — Dump the Python traceback
o pdb — The Python Debugger
o The Python Profilers
o timeit — Measure execution time of small code snippets
o trace — Trace or track Python statement execution
o tracemalloc — Trace memory allocations
Software Packaging and Distribution
o distutils — Building and installing Python modules
o ensurepip — Bootstrapping the pip installer
o venv — Creation of virtual environments
o zipapp — Manage executable Python zip archives
Python Runtime Services
o sys — System-specific parameters and functions
o sysconfig — Provide access to Python’s configuration information
o builtins — Built-in objects
o __main__ — Top-level script environment
o warnings — Warning control
o dataclasses — Data Classes
o contextlib — Utilities for with-statement contexts
o abc — Abstract Base Classes
o atexit — Exit handlers
o traceback — Print or retrieve a stack traceback
o __future__ — Future statement definitions
o gc — Garbage Collector interface
o inspect — Inspect live objects
o site — Site-specific configuration hook
Custom Python Interpreters
o code — Interpreter base classes
o codeop — Compile Python code
Importing Modules
o zipimport — Import modules from Zip archives
o pkgutil — Package extension utility
o modulefinder — Find modules used by a script
o runpy — Locating and executing Python modules
o importlib — The implementation of import
o Using importlib.metadata
Python Language Services
o parser — Access Python parse trees
o ast — Abstract Syntax Trees
o symtable — Access to the compiler’s symbol tables
o symbol — Constants used with Python parse trees
o token — Constants used with Python parse trees
o keyword — Testing for Python keywords
o tokenize — Tokenizer for Python source
o tabnanny — Detection of ambiguous indentation
o pyclbr — Python module browser support
o py_compile — Compile Python source files
o compileall — Byte-compile Python libraries
o dis — Disassembler for Python bytecode
o pickletools — Tools for pickle developers
Miscellaneous Services
o formatter — Generic output formatting
MS Windows Specific Services
o msilib — Read and write Microsoft Installer files
o msvcrt — Useful routines from the MS VC++ runtime
o winreg — Windows registry access
o winsound — Sound-playing interface for Windows
Unix Specific Services
o posix — The most common POSIX system calls
o pwd — The password database
o spwd — The shadow password database
o grp — The group database
o crypt — Function to check Unix passwords
o termios — POSIX style tty control
o tty — Terminal control functions
o pty — Pseudo-terminal utilities
o fcntl — The fcntl and ioctl system calls
o pipes — Interface to shell pipelines
o resource — Resource usage information
o nis — Interface to Sun’s NIS (Yellow Pages)
o syslog — Unix syslog library routines
Superseded Modules
o optparse — Parser for command line options
o imp — Access the import internals
Undocumented Modules
o Platform specific modules
Object-oriented programming (OOP) is a way of programming that uses “objects” to represent data together with the methods that operate on it. It is also an approach for writing neat, reusable code instead of redundant code. The program is divided into self-contained objects, which act like several mini-programs. Each object represents a different part of the application and has its own logic and data; the objects communicate with one another.
Major OOP (object-oriented programming) concepts in Python include Class, Object, Method,
Inheritance, Polymorphism, Data Abstraction, and Encapsulation.
CLASS
A class is a blueprint for objects: it defines the attributes and behavior that all objects of that type have in common. It groups data and the functions that work on that data so that code is easy to reuse.
For example, think of an office 'employee' as a class; 'emp_name', 'emp_age', 'emp_salary' and 'emp_id' are then its attributes, and each individual employee is an object of that class.
class class1():    # class1 is the name of the class
    pass           # an empty body; attributes and methods go here
OBJECTS
An object is an instance of a class. It is an entity that has state (its data) and behavior (its methods), and it can access the data defined by its class.
Example
class employee():
    def __init__(self, name, age, id, salary):    # the constructor
        self.name = name    # self refers to the instance being created
        self.age = age
        self.salary = salary
        self.id = id
Explanation: Objects such as 'emp1' and 'emp2' are instantiated from the class 'employee'. The attribute __dict__ is a dictionary that holds all the values of an object such as 'emp1', keyed by attribute name (name, age, salary, id). __init__ acts like a constructor and is invoked whenever an object is created.
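The explanation above mentions the objects emp1 and emp2 and the __dict__ attribute. A minimal sketch of how they might be used follows; the argument values are illustrative assumptions, not taken from the original example.

```python
class employee:
    def __init__(self, name, age, id, salary):
        # Each assignment stores a value on the instance being created.
        self.name = name
        self.age = age
        self.salary = salary
        self.id = id


# Two independent objects built from the same class (illustrative values).
emp1 = employee("harshit", 22, 1234, 1000)
emp2 = employee("arjun", 23, 2234, 2000)

# __dict__ maps attribute names to the values held by this one object.
print(emp1.__dict__)
print(emp2.name, emp2.age)
```

Each object keeps its own copy of the attributes, so changing emp1.age has no effect on emp2.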
1. Inheritance
2. Polymorphism
3. Encapsulation
4. Abstraction
1. INHERITANCE
In programming terms, inheritance means that a child class receives the characteristics (attributes and methods) of a parent class without having to redefine them. The new class is called the derived/child class and the one from which it is derived is called the parent/base class.
Ever heard relatives say “you look exactly like your father/mother”? The reason behind this is called 'inheritance'.
a. Single Inheritance
Single level inheritance enables a derived class to inherit characteristics from a single parent class.
Example
print(emp1.age)
Output: 22
Explanation:
The parent class has a constructor (__init__) that initializes the attributes from the parameters ('name', 'age' and 'salary').
The child class 'childemployee' inherits the properties of the parent class, and the objects 'emp1' and 'emp2' are instantiated with those parameters.
Finally, the age of emp1 is printed. You can print other things as well, such as the whole dictionary, the name or the salary.
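Single inheritance as described above can be sketched as follows. This is an illustration under assumptions: the class bodies, the extra id parameter and the argument values are hypothetical, chosen only to match the names used in the explanation.

```python
class employee:
    def __init__(self, name, age, salary):
        self.name = name
        self.age = age
        self.salary = salary


class childemployee(employee):
    def __init__(self, name, age, salary, id):
        # Let the parent initialise the attributes it already knows about.
        super().__init__(name, age, salary)
        self.id = id


emp1 = childemployee("harshit", 22, 1000, 1234)
print(emp1.age)                      # attribute set by the parent class
print(isinstance(emp1, employee))    # a child object is also an employee
```

Calling super().__init__() avoids repeating the parent's assignments in the child class.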
b. Multilevel Inheritance
Multi-level inheritance enables a derived class to inherit properties from an immediate parent class, which in turn inherits properties from its own parent class.
Example
print(emp1.age)
print(emp2.age)
Output: 22,23
Explanation:
Here the superclass is employee and its child class is childemployee1. childemployee1 in turn acts as the parent of childemployee2.
Two objects, 'emp1' and 'emp2', are instantiated: emp1 is passed the parameters “name”, “age” and “salary” of the superclass “employee”, and emp2 is passed “name”, “age”, “salary” and “id” of the parent class “childemployee1”.
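A chain of three classes like the one described above might look like the sketch below. The class bodies and argument values are assumptions made for illustration; only the class names come from the explanation.

```python
class employee:
    def __init__(self, name, age, salary):
        self.name = name
        self.age = age
        self.salary = salary


class childemployee1(employee):
    # Inherits everything from employee unchanged.
    pass


class childemployee2(childemployee1):
    def __init__(self, name, age, salary, id):
        super().__init__(name, age, salary)    # reaches employee.__init__
        self.id = id


emp1 = childemployee1("harshit", 22, 1000)
emp2 = childemployee2("arjun", 23, 2000, 1234)
print(emp1.age)
print(emp2.age)

# The lookup order for childemployee2 walks up the whole chain.
chain = [cls.__name__ for cls in childemployee2.__mro__]
print(chain)
```

The __mro__ attribute shows the order in which Python searches the classes for an attribute or method.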
c. Hierarchical Inheritance
Hierarchical level inheritance enables more than one derived class to inherit properties from a parent
class.
Example
class employee():
    def __init__(self, name, age, salary):    # hierarchical inheritance
        self.name = name
        self.age = age
        self.salary = salary
class childemployee1(employee):
    def __init__(self, name, age, salary):
        self.name = name
        self.age = age
        self.salary = salary
class childemployee2(employee):
    def __init__(self, name, age, salary):
        self.name = name
        self.age = age
        self.salary = salary
emp1 = childemployee1('harshit',22,1000)
emp2 = childemployee2('arjun',23,2000)
print(emp1.age)
print(emp2.age)
Output: 22,23
Explanation:
In the above example there are two child classes, “childemployee1” and “childemployee2”. Both inherit functionality from a common parent class, “employee”.
The objects 'emp1' and 'emp2' are instantiated with the parameters 'name', 'age' and 'salary'.
d. Multiple Inheritance
Multiple inheritance enables one derived class to inherit properties from more than one base class.
Example
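class employee1():    # parent classes reconstructed from the calls below
    def __init__(self, name, age, salary):
        self.name = name
        self.age = age
        self.salary = salary
class employee2():
    def __init__(self, name, age, salary, id):
        self.name = name
        self.age = age
        self.salary = salary
        self.id = id
class childemployee(employee1, employee2):
    def __init__(self, name, age, salary, id):
        self.name = name
        self.age = age
        self.salary = salary
        self.id = id
emp1 = employee1('harshit',22,1000)
emp2 = employee2('arjun',23,2000,1234)
print(emp1.age)
print(emp2.id)
Output: 22,1234
Explanation: In the above example there are two parent classes, “employee1” and “employee2”, and a child class “childemployee” that inherits from both. The objects 'emp1' and 'emp2' are instantiated with the parameters of the parent classes.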
2. POLYMORPHISM
Polymorphism is an OOP concept in which one task can be performed in several different ways. Put simply, it is a property that allows an object, or a single interface, to take multiple forms.
You have probably used GPS to navigate a route. Isn't it amazing how many different routes you come across for the same destination, depending on the traffic? From a programming point of view this is called 'polymorphism'.
a. Compile-time Polymorphism
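Compile-time polymorphism, also called static polymorphism, is resolved while the program is being compiled. One common example is “method overloading”. Python resolves method calls at run time and does not support overloading by signature, so the example below imitates the idea with one function that accepts objects of different classes.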
Example
class employee1():
    def name(self):
        print("Harshit is his name")
    def salary(self):
        print("3000 is his salary")
    def age(self):
        print("22 is his age")
class employee2():
    def name(self):
        print("Rahul is his name")
    def salary(self):
        print("4000 is his salary")
    def age(self):
        print("23 is his age")
def func(obj):    # calls the same three methods on whatever object it receives
    obj.name()
    obj.salary()
    obj.age()
obj_emp1 = employee1()
obj_emp2 = employee2()
func(obj_emp1)
func(obj_emp2)
Output:
Harshit is his name
3000 is his salary
22 is his age
Rahul is his name
4000 is his salary
23 is his age
Explanation:
In the above program there are two classes, 'employee1' and 'employee2', each with the methods 'name', 'salary' and 'age', which print fixed values without taking any input from the user.
The main part is a function with 'obj' as its parameter that calls all three methods, i.e. 'name', 'salary' and 'age'.
Finally, the objects obj_emp1 and obj_emp2 are instantiated from the two classes and passed to the function. This is presented here as method overloading, which in other languages allows a class to have more than one method under the same name; in Python a later definition with the same name replaces the earlier one, so the example relies on the two classes sharing method names instead.
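Because Python keeps only the last definition of a method with a given name, overloading is usually imitated with default arguments. The sketch below illustrates that idea; the method name pay and its parameters are hypothetical and not part of the original example.

```python
class employee:
    def pay(self, base, bonus=0):
        # One definition covers both call shapes; bonus is optional.
        return base + bonus


emp = employee()
print(emp.pay(3000))         # called with one argument  -> 3000
print(emp.pay(3000, 500))    # called with two arguments -> 3500
```

The same method name therefore accepts different numbers of arguments without a second definition.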
b. Run-time Polymorphism
Run-time polymorphism, also called dynamic polymorphism, is resolved while the program is running. One common example of run-time polymorphism is “method overriding”.
Example
class employee():
    def __init__(self, name, age, id, salary):
        self.name = name
        self.age = age
        self.salary = salary
        self.id = id
    def earn(self):
        pass
class childemployee1(employee):
    def earn(self):    # overrides employee.earn
        print("no money")
class childemployee2(employee):
    def earn(self):    # overrides employee.earn
        print("has money")
c = childemployee1('harshit', 22, 1234, 1000)    # sample values
c.earn()
d = childemployee2('arjun', 23, 2234, 2000)      # sample values
d.earn()
Explanation: In the above example there are two classes, 'childemployee1' and 'childemployee2', derived from the same base class 'employee'. Here's the catch: one did not receive money whereas the other one did. How did this happen? The base class defines an empty method earn() using pass (a statement used when you do not want to execute any code). Each derived class then defines a method with the same name, printing 'no money' and 'has money' respectively. Lastly, two objects are created and the method is called on each.
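Method overriding is easiest to see when the same call is made on objects of different classes. The following sketch assumes simplified classes without a constructor and returns strings instead of printing them; the base-class return value "unknown" is an illustrative choice.

```python
class employee:
    def earn(self):
        return "unknown"


class childemployee1(employee):
    def earn(self):            # overrides employee.earn
        return "no money"


class childemployee2(employee):
    def earn(self):            # overrides employee.earn
        return "has money"


# The same call picks a different method depending on the object's class.
workers = (employee(), childemployee1(), childemployee2())
results = [worker.earn() for worker in workers]
print(results)
```

Which earn() runs is decided at run time from the class of each object, which is why this is called run-time polymorphism.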
3. ENCAPSULATION
In its basic form, encapsulation means bundling data, and the methods that work on it, inside a single class and restricting direct access to that data. Unlike Java, Python does not have a private keyword. Instead, an attribute that should not be accessed directly is prefixed with an underscore.
For Example
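class employee(object):
    def __init__(self):
        self.name = 1234        # public
        self._age = 1234        # protected by convention
        self.__salary = 1234    # private (name-mangled)
object1 = employee()
print(object1.name)
print(object1._age)
print(object1.__salary)
Output
1234
1234
Traceback (most recent call last):
  File "C:/Users/Harshit_Kant/PycharmProjects/test1/venv/[Link]", line 10, in <module>
    print(object1.__salary)
AttributeError: 'employee' object has no attribute '__salary'
Explanation: Why the underscores, and why the error? A single leading underscore (_age) only signals, by convention, that an attribute is internal. A double leading underscore (__salary) makes Python store the attribute under a mangled name (_employee__salary), so it cannot be accessed directly as object1.__salary from outside the class.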
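The effect of the double underscore can be inspected directly. The sketch below reuses the class from the example and shows where the value is actually stored; it is meant as an illustration of name mangling, not as a recommended way to reach private data.

```python
class employee:
    def __init__(self):
        self.name = 1234        # public
        self._age = 1234        # "internal" by convention only
        self.__salary = 1234    # stored as _employee__salary


object1 = employee()
print(object1._age)             # still reachable; the underscore is a hint

try:
    object1.__salary
except AttributeError as exc:
    print("AttributeError:", exc)

# The mangled name shows where the value actually lives.
print(object1._employee__salary)
attribute_names = sorted(vars(object1))
print(attribute_names)
```

Name mangling guards against accidental access and name clashes in subclasses; it is not a security mechanism.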
Example 2
class employee():
    def __init__(self):
        self.__maxearn = 1000000
    def earn(self):
        print("earning is:{}".format(self.__maxearn))
    def setmaxearn(self, price):    # setter method
        self.__maxearn = price
emp1 = employee()
emp1.earn()
emp1.__maxearn = 10000    # creates a new attribute; the private one is unchanged
emp1.earn()
emp1.setmaxearn(10000)
emp1.earn()
Output:
earning is:1000000
earning is:1000000
earning is:10000
Explanation: Using a setter method provides indirect access to the private attribute. Here the class employee stores the maximum earning of the employee in the private attribute __maxearn, and the setter method setmaxearn() takes price as its parameter and assigns it.
This is a clear example of encapsulation: direct access to the private attribute is restricted, and the setter method is used to grant access.
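Python also offers the built-in property decorator as a more idiomatic way to write getters and setters. The sketch below is one possible variant of the example; the property name maxearn and the rule that rejects negative values are assumptions added for illustration.

```python
class employee:
    def __init__(self):
        self.__maxearn = 1000000

    @property
    def maxearn(self):
        # Read access goes through this method.
        return self.__maxearn

    @maxearn.setter
    def maxearn(self, price):
        # Write access can validate before storing the value.
        if price < 0:
            raise ValueError("maxearn cannot be negative")
        self.__maxearn = price


emp1 = employee()
emp1.maxearn = 10000      # calls the setter
print(emp1.maxearn)

try:
    emp1.maxearn = -5
except ValueError as exc:
    print("rejected:", exc)
```

The caller uses plain attribute syntax, while the class keeps control over what values are accepted.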
4. ABSTRACTION
Abstraction means that you show only the essential features of a particular process and hide the implementation details from the user. It is used to simplify complex problems by modeling classes appropriate to the problem. Suppose you booked a movie ticket from bookmyshow using net banking or any other payment method. You don't know how the PIN is generated or how the verification is done. This is called 'abstraction'.
An abstract class cannot be instantiated, which simply means you cannot create objects of this type of class. It can only be used as a base from which other classes inherit functionality.
Example
from abc import ABC, abstractmethod
class employee(ABC):    # abstract base class
    @abstractmethod
    def emp_id(self, id):
        pass
class childemployee1(employee):
    def emp_id(self, id):
        print("emp_id is 12345")
emp1 = childemployee1()
emp1.emp_id(12345)
Explanation: In the above example, an abstract base class and an abstract method are imported from the abc module, and the rest of the program has a parent class and a derived class. An object is instantiated from the derived class 'childemployee1', which supplies the implementation of the abstract method.
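The statement that an abstract class cannot be instantiated can be checked with the standard abc module. The sketch below assumes a simplified emp_id method without parameters.

```python
from abc import ABC, abstractmethod


class employee(ABC):
    @abstractmethod
    def emp_id(self):
        """Every concrete employee class must provide this."""


class childemployee1(employee):
    def emp_id(self):
        return 12345


try:
    employee()                      # abstract: cannot be instantiated
except TypeError as exc:
    print("TypeError:", exc)

emp1 = childemployee1()             # concrete: works
print(emp1.emp_id())
```

A subclass becomes instantiable only once it overrides every method marked with @abstractmethod.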
Programming in Python is arguably more efficient and faster compared to other languages.
Python is famous for its portability.
It is platform independent.
Python supports SQL cursors.
In many programming languages, the application developer needs to take care of opening and closing the database connections to avoid further exceptions and errors. In Python, the database API simplifies much of this connection handling.
Python supports relational database systems.
Python database APIs are compatible with various databases, so it is very easy to migrate and
port database application interfaces.
Python and MySQL are a good combination to develop database applications. After starting the MySQL
service on Linux, you need to acquire MySQLdb, a Python DB-API for MySQL to perform database
operations. You can check whether the MySQLdb module is installed in your system with the following
command:
>>> import MySQLdb
If this command runs successfully, you can now start writing scripts for your database.
1. CREATE DATABASE
You can create a database in MySQL using the CREATE DATABASE query.
Syntax
Example
After establishing connection with MySQL, to manipulate data in it you need to connect to a database.
You can connect to an existing database or, create your own.
You would need special privileges to create or to delete a MySQL database. So if you have access to the
root user, you can create any database.
2. CREATE TABLE
The CREATE TABLE statement is used to create tables in a MySQL database. Here, you need to specify the name of the table and the definition (name and datatype) of each column.
Syntax
Example
The following query creates a table named EMPLOYEE in MySQL with five columns namely,
FIRST_NAME, LAST_NAME, AGE, SEX and, INCOME.
NOTE: The DESC statement gives you the description of the specified table. Using this you can verify
if the table has been created or not as shown below:
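As a runnable illustration, the sketch below creates the EMPLOYEE table with Python's built-in sqlite3 module instead of MySQL, so that it works without a database server. The column types are assumptions (the text only names the columns), and PRAGMA table_info is SQLite's rough counterpart to MySQL's DESC.

```python
import sqlite3

# An in-memory database; nothing is written to disk.
conn = sqlite3.connect(":memory:")
cursor = conn.cursor()

# Column types are assumptions; the text above only names the columns.
cursor.execute("""
    CREATE TABLE EMPLOYEE (
        FIRST_NAME TEXT NOT NULL,
        LAST_NAME  TEXT,
        AGE        INTEGER,
        SEX        TEXT,
        INCOME     REAL
    )
""")

# One row per column; index 1 of each row is the column name.
columns = [row[1] for row in cursor.execute("PRAGMA table_info(EMPLOYEE)")]
print(columns)
conn.close()
```

With a MySQL driver the connection call and some type names would differ, but the cursor.execute() pattern stays the same.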
3. INSERT
You can add new rows to an existing table of MySQL using the INSERT INTO statement. In this, you
need to specify the name of the table, column names, and values (in the same order as column names).
Syntax
INSERT INTO TABLE_NAME (column1, column2,column3,...columnN)
VALUES (value1, value2, value3,...valueN);
Example
NOTE: You can verify the records of the table after insert operation using the SELECT statement as:
It is not always mandatory to specify the names of the columns: if you pass the values of a record in the same order as the columns of the table, you can execute the INSERT statement without the column names as follows:
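Both forms of INSERT described above can be tried from Python. This sketch uses the built-in sqlite3 module in place of MySQL (an assumption made so the code runs without a server); the two sample rows are taken from the JOIN output shown later in these notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cursor = conn.cursor()
cursor.execute(
    "CREATE TABLE EMPLOYEE "
    "(FIRST_NAME TEXT, LAST_NAME TEXT, AGE INTEGER, SEX TEXT, INCOME REAL)"
)

# Column names given explicitly; ? placeholders keep values out of the SQL text.
cursor.execute(
    "INSERT INTO EMPLOYEE (FIRST_NAME, LAST_NAME, AGE, SEX, INCOME) "
    "VALUES (?, ?, ?, ?, ?)",
    ("Krishna", "Sharma", 26, "M", 2000),
)

# Column names omitted; the values must follow the table's column order.
cursor.execute(
    "INSERT INTO EMPLOYEE VALUES (?, ?, ?, ?, ?)",
    ("Raj", "Kandukuri", 20, "M", 7000),
)
conn.commit()

rows = cursor.execute("SELECT * FROM EMPLOYEE").fetchall()
print(rows)
conn.close()
```

Placeholders are preferred over building the SQL string by hand because the driver then handles quoting of the values.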
4. SELECT
You can retrieve/fetch data from a table in MySQL using the SELECT query. This query/statement returns the contents of the specified table in tabular form, which is called a result set.
Syntax
Example
The query below retrieves the FIRST_NAME and Country values from the table.
You can also retrieve all the values of each record using * instead of the name of the columns as:
5. WHERE
If you want to fetch, delete or update particular rows of a table in MySQL, you need to use the WHERE clause to specify a condition that filters the rows of the table for the operation.
For example, if you have a SELECT statement with a WHERE clause, only the rows which satisfy the specified condition will be retrieved.
Syntax
Example
The MySQL statement below retrieves the records of the employees whose income is greater than
4000.
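A SELECT with a WHERE clause can be sketched from Python as follows. The built-in sqlite3 module stands in for MySQL here (an assumption so the code runs anywhere), and the sample rows are those that appear in the JOIN output later in these notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cursor = conn.cursor()
cursor.execute(
    "CREATE TABLE EMPLOYEE "
    "(FIRST_NAME TEXT, LAST_NAME TEXT, AGE INTEGER, SEX TEXT, INCOME REAL)"
)
cursor.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?, ?, ?, ?)",
    [
        ("Krishna", "Sharma", 26, "M", 2000),
        ("Raj", "Kandukuri", 20, "M", 7000),
        ("Ramya", "Ramapriya", 29, "F", 5000),
        ("Mac", "Mohan", 26, "M", 2000),
    ],
)

# Only the named columns, and only rows that satisfy the condition.
cursor.execute("SELECT FIRST_NAME, INCOME FROM EMPLOYEE WHERE INCOME > ?", (4000,))
well_paid = cursor.fetchall()
print(well_paid)
conn.close()
```

Only Raj and Ramya have an income above 4000, so only their rows are returned.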
6. ORDER BY
To retrieve contents of a table in specific order, invoke the execute() method on the cursor object and,
pass the SELECT statement along with ORDER BY clause, as a parameter to it.
Example
In the example below we are creating a table named EMPLOYEE, populating it, and retrieving its records in ascending order of age, using the ORDER BY clause.
import mysql.connector
Output
In the same way you can retrieve data from a table in descending order using the ORDER BY clause.
Example
import mysql.connector
Output
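Sorting in both directions with ORDER BY might look like the sketch below. It assumes the built-in sqlite3 module rather than a MySQL driver, uses the sample rows from the JOIN output later in these notes, and adds FIRST_NAME as a tie-breaker so the order is fully determined.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cursor = conn.cursor()
cursor.execute(
    "CREATE TABLE EMPLOYEE "
    "(FIRST_NAME TEXT, LAST_NAME TEXT, AGE INTEGER, SEX TEXT, INCOME REAL)"
)
cursor.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?, ?, ?, ?)",
    [
        ("Krishna", "Sharma", 26, "M", 2000),
        ("Raj", "Kandukuri", 20, "M", 7000),
        ("Ramya", "Ramapriya", 29, "F", 5000),
        ("Mac", "Mohan", 26, "M", 2000),
    ],
)

# Youngest first; equal ages are ordered by first name.
cursor.execute("SELECT FIRST_NAME FROM EMPLOYEE ORDER BY AGE ASC, FIRST_NAME ASC")
ascending = [row[0] for row in cursor.fetchall()]

# Oldest first; DESC reverses the age ordering only.
cursor.execute("SELECT FIRST_NAME FROM EMPLOYEE ORDER BY AGE DESC, FIRST_NAME ASC")
descending = [row[0] for row in cursor.fetchall()]

print(ascending)
print(descending)
conn.close()
```

ASC is the default direction, so it can be left out; DESC must be written explicitly.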
7. DELETE
To delete records from a MySQL table, you need to use the DELETE FROM statement. To remove
specific records, you need to use WHERE clause along with it.
Syntax
Example
Assume we have created a table in MySQL with name EMPLOYEES as:
This MySQL statement deletes the record of the employee with FIRST_NAME ”Mac”.
If you retrieve the contents of the table, you can see only 3 records since we have deleted one.
If you execute the DELETE statement without the WHERE clause all the records from the specified
table will be deleted.
If you retrieve the contents of the table, you will get an empty set as shown below −
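The two DELETE cases described above, with and without a WHERE clause, can be sketched as follows. The built-in sqlite3 module is used in place of MySQL (an assumption), with the sample rows from the JOIN output later in these notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cursor = conn.cursor()
cursor.execute(
    "CREATE TABLE EMPLOYEE "
    "(FIRST_NAME TEXT, LAST_NAME TEXT, AGE INTEGER, SEX TEXT, INCOME REAL)"
)
cursor.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?, ?, ?, ?)",
    [
        ("Krishna", "Sharma", 26, "M", 2000),
        ("Raj", "Kandukuri", 20, "M", 7000),
        ("Ramya", "Ramapriya", 29, "F", 5000),
        ("Mac", "Mohan", 26, "M", 2000),
    ],
)

# With WHERE: only the matching record is removed.
cursor.execute("DELETE FROM EMPLOYEE WHERE FIRST_NAME = ?", ("Mac",))
remaining = cursor.execute("SELECT COUNT(*) FROM EMPLOYEE").fetchone()[0]

# Without WHERE: every record in the table is removed.
cursor.execute("DELETE FROM EMPLOYEE")
after_full_delete = cursor.execute("SELECT COUNT(*) FROM EMPLOYEE").fetchone()[0]

print(remaining, after_full_delete)
conn.close()
```

The table itself still exists after the second DELETE; only its rows are gone.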
8. DROP TABLE
You can remove an entire table using the DROP TABLE statement. You just need to specify the name
of the table you need to delete.
Syntax
Example
Before deleting a table get the list of tables using the SHOW TABLES statement as follows:
mysql> SHOW TABLES;
This statement removes the table named sample from the database completely:
Since we have deleted the table named sample from MySQL, if you get the list of tables again you will not find the table named sample in it.
You can drop a table whenever you need to, using the DROP statement of MySQL, but you need to be very careful while deleting any existing table because the data lost cannot be recovered after the table is deleted.
If you try to drop a table which does not exist in the database, an error occurs as:
You can prevent this error by adding IF EXISTS to the DROP TABLE statement, so that the table is dropped only if it exists.
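The behaviour of DROP TABLE with and without IF EXISTS can be observed with the sketch below. It uses the built-in sqlite3 module instead of MySQL (an assumption); since SHOW TABLES is MySQL-specific, the hypothetical helper table_names() reads SQLite's sqlite_master catalog instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cursor = conn.cursor()
cursor.execute("CREATE TABLE sample (id INTEGER)")


def table_names(cur):
    # sqlite_master lists the tables of an SQLite database.
    query = "SELECT name FROM sqlite_master WHERE type = 'table'"
    return [row[0] for row in cur.execute(query)]


before = table_names(cursor)
cursor.execute("DROP TABLE sample")

try:
    cursor.execute("DROP TABLE sample")          # table no longer exists
except sqlite3.OperationalError as exc:
    print("error:", exc)

cursor.execute("DROP TABLE IF EXISTS sample")    # silently does nothing
after = table_names(cursor)
print(before, after)
conn.close()
```

The second plain DROP raises an error, while the IF EXISTS form completes without one.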
9. UPDATE
UPDATE operation on any database updates one or more records, which are already available in the
database. You can update the values of existing records in MySQL using the UPDATE statement. To
update specific rows, you need to use the WHERE clause along with it.
Syntax
UPDATE table_name
SET column1 = value1, column2 = value2...., columnN = valueN
WHERE [condition];
You can combine any number of conditions using the AND or the OR operators.
Example
SEX CHAR(1),
INCOME FLOAT
);
The following MySQL statement increases the age of all male employees by one year:
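An UPDATE of this kind might be run from Python as sketched below. The built-in sqlite3 module stands in for MySQL (an assumption), and the sample rows are those shown in the JOIN output later in these notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cursor = conn.cursor()
cursor.execute(
    "CREATE TABLE EMPLOYEE "
    "(FIRST_NAME TEXT, LAST_NAME TEXT, AGE INTEGER, SEX TEXT, INCOME REAL)"
)
cursor.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?, ?, ?, ?)",
    [
        ("Krishna", "Sharma", 26, "M", 2000),
        ("Raj", "Kandukuri", 20, "M", 7000),
        ("Ramya", "Ramapriya", 29, "F", 5000),
        ("Mac", "Mohan", 26, "M", 2000),
    ],
)

# Increase the age of every male employee by one year.
cursor.execute("UPDATE EMPLOYEE SET AGE = AGE + 1 WHERE SEX = ?", ("M",))
updated = cursor.rowcount        # number of rows the UPDATE changed
conn.commit()

ages = dict(cursor.execute("SELECT FIRST_NAME, AGE FROM EMPLOYEE"))
print(updated, ages)
conn.close()
```

Only the three rows with SEX = 'M' change; Ramya's age stays at 29.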
10. JOIN
When you have divided the data between two tables, you can fetch combined records from these two tables using a JOIN.
Example
Suppose we have created a table with name EMPLOYEE and populated data into it as shown below:
The following example retrieves data from the two tables, combined on the CONTACT column of the EMPLOYEE table and the ID column of the CONTACT table.
import mysql.connector
Output
[('Krishna', 'Sharma', 26, 'M', 2000, 101, 101, 'Krishna@[Link]', 9848022338, 'Hyderabad'),
('Raj', 'Kandukuri', 20, 'M', 7000, 102, 102, 'Raja@[Link]', 9848022339, 'Vishakhapatnam'),
('Ramya', 'Ramapriya', 29, 'F', 5000, 103, 103, 'Krishna@[Link]', 9848022337, 'Pune'),
('Mac', 'Mohan', 26, 'M', 2000, 104, 104, 'Raja@[Link]', 9848022330, 'Mumbai')]
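A reduced version of this join can be reproduced with the built-in sqlite3 module, used here instead of MySQL so that the sketch runs without a server. The tables are cut down to a few columns, and the column name CITY is an assumption; the names, contact ids and cities come from the output above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cursor = conn.cursor()
cursor.execute("CREATE TABLE EMPLOYEE (FIRST_NAME TEXT, AGE INTEGER, CONTACT INTEGER)")
cursor.execute("CREATE TABLE CONTACT (ID INTEGER PRIMARY KEY, CITY TEXT)")
cursor.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?, ?)",
    [("Krishna", 26, 101), ("Raj", 20, 102), ("Ramya", 29, 103), ("Mac", 26, 104)],
)
cursor.executemany(
    "INSERT INTO CONTACT VALUES (?, ?)",
    [(101, "Hyderabad"), (102, "Vishakhapatnam"), (103, "Pune"), (104, "Mumbai")],
)

# Each employee row is paired with the contact row whose ID matches.
cursor.execute("""
    SELECT EMPLOYEE.FIRST_NAME, CONTACT.CITY
    FROM EMPLOYEE
    INNER JOIN CONTACT ON EMPLOYEE.CONTACT = CONTACT.ID
    ORDER BY CONTACT.ID
""")
joined = cursor.fetchall()
print(joined)
conn.close()
```

An INNER JOIN returns only rows that have a match in both tables; an employee without a matching contact row would be left out.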
What is Big Data? Big Data is a collection of data that is huge in volume and still growing exponentially with time. Its size and complexity are so great that no traditional data management tool can store or process it efficiently. In short, big data is still data, but of enormous size.
According to Gartner, big data is high-volume, high-velocity and high-variety information assets that demand cost-effective, innovative forms of information processing for enhanced insight and decision making.
Big Data refers to complex and large data sets that have to be processed and analyzed to uncover valuable information that can benefit businesses and organizations.
However, there are certain basic tenets of Big Data that will make it even simpler to answer what is Big
Data:
It refers to a massive amount of data that keeps on growing exponentially with time.
It is so voluminous that it cannot be processed or analyzed using conventional data processing
techniques.
It includes data mining, data storage, data analysis, data sharing, and data visualization.
The term is an all-comprehensive one including data, data frameworks, along with the tools and
techniques used to process and analyze the data.
1. The New York Stock Exchange generates about one terabyte of new trade data per day.
2. Social media - Statistics show that 500+ terabytes of new data are ingested into the databases of the social media site Facebook every day. This data is mainly generated through photo and video uploads, message exchanges, comments etc.
3. A single jet engine can generate 10+ terabytes of data in 30 minutes of flight time. With many thousands of flights per day, the data generated reaches many petabytes.
1. Structured
2. Unstructured
3. Semi-structured
1. Structured
Any data that can be stored, accessed, and processed in a fixed format is termed 'structured' data.
Over time, computer science has developed effective techniques for working with this kind of data
(where the format is well known in advance) and for deriving value from it. However, issues now arise
when the size of such data grows to a huge extent; typical sizes are in the range of multiple zettabytes.
Note: 10^21 bytes equal 1 zettabyte; equivalently, one billion terabytes form a zettabyte.
Looking at these figures one can easily understand why the name Big Data is given and imagine the
challenges involved in its storage and processing.
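The unit arithmetic in the note above can be checked with a few lines of Python. This is a minimal sketch that assumes decimal (SI) prefixes; the constant names are illustrative only.

```python
# Unit sizes in bytes, assuming decimal (SI) prefixes.
GIGABYTE = 10**9
TERABYTE = 10**12
ZETTABYTE = 10**21

# How many terabytes and gigabytes fit in one zettabyte.
terabytes_per_zettabyte = ZETTABYTE // TERABYTE
gigabytes_per_zettabyte = ZETTABYTE // GIGABYTE

print(terabytes_per_zettabyte)  # 1000000000 (one billion)
print(gigabytes_per_zettabyte)  # 1000000000000 (one trillion)
```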
Data stored in a relational database management system is one example of 'structured' data.
2. Unstructured
Any data whose form or structure is unknown is classified as unstructured data. In addition to its huge
size, unstructured data poses multiple challenges when it is processed to derive value from it.
A typical example of unstructured data is a heterogeneous data source containing a combination of
simple text files, images, videos, etc. Nowadays, organizations have a wealth of data available to them,
but unfortunately they do not know how to derive value from it because the data is in its raw,
unstructured form.
3. Semi-structured
Semi-structured data can contain both forms of data. It looks structured, but it is not defined by,
for example, a table definition in a relational DBMS. An example of semi-structured data is data
represented in an XML file.
<rec><name>Prashant Rao</name><sex>Male</sex><age>35</age></rec>
<rec><name>Seema R.</name><sex>Female</sex><age>41</age></rec>
<rec><name>Satish Mane</name><sex>Male</sex><age>29</age></rec>
<rec><name>Subrato Roy</name><sex>Male</sex><age>26</age></rec>
<rec><name>Jeremiah J.</name><sex>Male</sex><age>35</age></rec>
Please note that web application data, which is unstructured, consists of log files, transaction history
files etc. OLTP systems are built to work with structured data wherein data is stored in relations (tables).
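Records like the XML lines above can be read into Python with the standard library. The sketch below is only an illustration: the wrapping `<records>` root element is an assumption added so the text parses as a single XML document, and only three of the records are reused.

```python
import xml.etree.ElementTree as ET

# The <records> root element is an assumption; the records themselves come from the example above.
xml_text = """<records>
<rec><name>Prashant Rao</name><sex>Male</sex><age>35</age></rec>
<rec><name>Seema R.</name><sex>Female</sex><age>41</age></rec>
<rec><name>Satish Mane</name><sex>Male</sex><age>29</age></rec>
</records>"""

root = ET.fromstring(xml_text)

# Turn each <rec> element into a dictionary, converting age to an integer.
people = [
    {
        "name": rec.findtext("name"),
        "sex": rec.findtext("sex"),
        "age": int(rec.findtext("age")),
    }
    for rec in root.findall("rec")
]

for person in people:
    print(person)
```

The tags give each value a label, but nothing enforces that every record carries the same fields, which is what makes the data semi-structured.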
Big Data can be described by the following characteristics:
(i) Volume
(ii) Variety
(iii) Velocity
(iv) Variability
(i) Volume – The name Big Data itself refers to enormous size. The size of data plays a crucial role
in determining the value that can be derived from it. Whether a particular data set can be considered
Big Data at all also depends on its volume. Hence, 'Volume' is one characteristic that must be
considered when dealing with Big Data.
(ii) Variety – Variety refers to heterogeneous sources and the nature of data, both structured and
unstructured. In earlier days, spreadsheets and databases were the only sources of data considered
by most applications. Nowadays, data in the form of emails, photos, videos, monitoring devices,
PDFs, audio, etc. is also considered in analysis applications. This variety of unstructured data
poses certain issues for storing, mining, and analyzing data.
(iii) Velocity – The term 'velocity' refers to the speed at which data is generated. How fast the data is
generated and processed to meet demand determines the real potential in the data. Big Data velocity
deals with the speed at which data flows in from sources like business processes, application logs,
networks, social media sites, sensors, mobile devices, etc. The flow of data is massive and
continuous.
(iv) Variability – This refers to the inconsistency that the data can show at times, which hampers
the ability to handle and manage the data effectively.
Benefits of Big Data processing:
1. Businesses can utilize outside intelligence while making decisions - Access to social data from search
engines and sites like Facebook and Twitter is enabling organizations to fine-tune their business strategies.
2. Improved customer service - Traditional customer feedback systems are being replaced by new
systems designed with Big Data technologies. In these new systems, Big Data and natural language
processing technologies are used to read and evaluate consumer responses.
3. Early identification of risk to the product or services, if any, and better operational efficiency - Big Data
technologies can be used to create a staging area or landing zone for new data before identifying what
data should be moved to the data warehouse. In addition, such integration of Big Data technologies and
the data warehouse helps an organization offload infrequently accessed data.
Big Data is among the most valuable commodities today. The amount of data generated by
companies is increasing at a rapid pace. IDC projects that worldwide data will reach 175
zettabytes by 2025. A zettabyte is equivalent to a trillion gigabytes. Multiply that by 175 to
get a sense of how fast data is growing.
Choosing a programming language for the Big Data field is very project-specific and depends on the
project's goal. For many project goals, Python is a strong choice for Big Data because of its
readability and its capacity for statistical analysis.
Python is a fast-growing programming language, and the combination of Python and Big Data is a
popular choice among developers because it needs less code and has tremendous library support.
The benefits of using Python in Big Data and Big Data Analytics include:
1) Simple coding - Python programming involves simpler coding than many other programming
languages. Programs can be written in a few lines of code, and because Python is dynamically typed,
data types are identified automatically. The language can also handle lengthy tasks within a short time.
2) Open-source and easy to learn - Python is an open-source programming language developed under
a community-based model. It is free to use, and it supports multiple platforms and can run in many
environments (Linux, Windows, etc.).
Python is also easy to learn because of its simple syntax. This simple, readable syntax helps Big Data
professionals focus on managing Big Data and deriving insights, rather than spending time on the
technical nuances of the language. This is one of the primary reasons to choose Python for Big Data.
3) Python supports multiple libraries - Python is popular partly because of its extensive library
support. These libraries save time and make the language even more popular.
Most of these Python libraries are useful for data analytics, visualization, numerical computing, and
machine learning. Big Data requires a lot of scientific computing and data analysis, which makes
Python and Big Data great companions.
Pandas – Free software library for analyzing and handling data. Offers multiple data structures for
manipulating data. Pandas also provides tools for reading and writing data between different data
formats and in-memory data structures.
NumPy – Free software library for computing with arrays and multidimensional matrices. Provides
high-level mathematical functions for random number generation, Fourier
transforms, linear algebra, etc.
Scikit-learn – Free software library for machine learning tasks such as regression, classification,
and clustering.
SciPy – Widely used library for scientific and technical computing on data. Supports numerical
integration, interpolation, optimization, and special functions.
4) Python is highly compatible with Hadoop - Both Python and Hadoop are open source, and Python
integrates well with the Hadoop big data platform. Developers prefer to use Python with Hadoop
because of its extensive library support. Python also has the Pydoop package, which provides support
for working with Hadoop.
5) Python enables fast development and data processing - Python's simple syntax and easy-to-manage
code let developers write and prototype data processing programs quickly. Pure Python code is
generally not faster at run time than compiled languages; its data processing speed comes largely from
libraries such as NumPy and Pandas, which run their core operations in compiled code. Together with
the close correspondence between the code and what it executes, this keeps Python one of the most
popular options for Big Data in the tech industry.
6) Scope - Python is an object-oriented language that supports advanced data structures. It lets
users employ built-in data structures including lists, sets, tuples, dictionaries, and many more.
Through its libraries, it also supports scientific computing operations on data frames, matrices, etc.
These features broaden the language's scope and enable it to simplify and speed up data
operations. This is what makes Python and Big Data a powerful combination.
7) Python has data processing support - Python and its libraries support processing unconventional
and unstructured data, which is a common requirement in Big Data, for example when analyzing
social media data. That is one reason why big data companies treat Python as an essential
requirement.
8) Python is portable - This is the most crucial reason why Python is popular in data science. Many
cross-language operations are performed easily on Python because of its portable and extensible nature.
Many data scientists prefer using graphics processing units for their Machine Learning models, and the
portable nature of Python is well-suited for this.
9) Python has large community support - Big data analysis usually deals with complicated problems
that need community support for solutions. Python has large and active community support, which helps
data scientists and programmers with expert backing on coding related issues. Also, corporate support is
a significant part of the success of Python for Big Data. Top tech companies like Facebook, Instagram,
Netflix, etc., use Python in their products.
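10) Scalability - Scalability matters a lot when dealing with data. As data volume grows, Python
programs can be scaled by moving work into vectorized libraries or by distributing it across machines,
often with few code changes. Raw execution speed in pure Python is generally lower than in compiled
languages like Java, so this flexibility, rather than raw speed, is the main scalability advantage.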
A data analyst uses programming tools to mine large amounts of complex data, and find relevant
information from this data.
In short, an analyst is someone who derives meaning from messy data. A data analyst needs to have
skills in the following areas, in order to be useful in the workplace:
1. Domain Expertise — In order to mine data and come up with insights that are relevant to their
workplace, an analyst needs to have domain expertise.
2. Programming Skills —As a data analyst, you will need to know the right libraries to use in
order to clean data, mine, and gain insights from it.
3. Statistics — An analyst might need to use some statistical tools to derive meaning from data.
4. Visualization Skills — A data analyst needs to have great data visualization skills, in order to
summarize and present data to a third party.
5. Storytelling — Finally, an analyst needs to communicate their findings to a stakeholder or
client. This means that they will need to create a data story, and have the ability to narrate it.
Data Analysis is a process of inspecting, cleaning, transforming, and modeling data with the goal of
discovering useful information, suggesting conclusions, and supporting decision-making.
1. NUMPY
Syntax
import numpy as np
[1 2 3]
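The `[1 2 3]` shown above is the form NumPy uses when printing a one-dimensional array. A minimal sketch that produces it follows; the variable name `a` and the extra operations are assumptions added for illustration.

```python
import numpy as np

# Build a one-dimensional array from a Python list.
a = np.array([1, 2, 3])

print(a)        # [1 2 3]
print(a.shape)  # (3,)

# Arithmetic is applied element by element, without an explicit loop.
print(a * 2)    # [2 4 6]
```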
2. PANDAS
Pandas is an open-source Python library providing efficient, easy-to-use data structures and data analysis
tools. The name Pandas is derived from "panel data", an econometrics term for multidimensional data.
Pandas is well suited for many different kinds of data:
Pandas has provided three data structures (Series, DataFrame, and Panel, the last of which has been
removed from recent releases), all of which are built on top of the NumPy array and are value-mutable.
Syntax
import pandas as pd
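As a brief illustration of the two most commonly used structures, the sketch below builds a Series and a DataFrame and changes values in place. The names and ages are sample values chosen for illustration, and the variable names are assumptions.

```python
import pandas as pd

# A Series is a one-dimensional labeled array.
ages = pd.Series([35, 41, 29], index=["Prashant", "Seema", "Satish"])

# A DataFrame is a two-dimensional table made of named columns.
df = pd.DataFrame({"name": ["Prashant", "Seema", "Satish"], "age": [35, 41, 29]})

# Both structures are value-mutable: individual values can be changed in place.
ages["Seema"] = 42
df.loc[0, "age"] = 36

print(ages)
print(df)
print(df["age"].mean())
```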
3. MATPLOTLIB
1. Matplotlib is a Python library designed for producing graphs, charts,
etc., in order to provide interactive data visualisation.
2. Matplotlib is inspired by the MATLAB software and reproduces many of its features.
Syntax
# Import Matplotlib submodule for plotting
import matplotlib.pyplot as plt
DATASET
A data set is a collection of data, normally presented in a tabular pattern. Every column
describes a particular variable, and each row corresponds to a given member of the data set.
A data set lists values for each variable, such as the height, weight, temperature, or
volume of an object, or the values of random numbers. Each value in the set is known as a datum. The
data set consists of data for one or more members, one per row.
A data set is an ordered collection of data. In practice, a data set can be a group of tables,
schemas, and other objects. The data are organized according to a model that helps to process the
needed information. A data set is any permanently saved collection of information that usually
contains either case-level data, gathered data, or statistical guidance level data.
In Statistics, we have different types of data sets available for different types of information. They are:
A data set (or dataset) is a collection of data, usually presented in tabular form whereas a database is an
organized collection of data for one or more purposes, usually in digital form.
IMPORTING AND EXPORTING DATASET IN PYTHON
When running Python programs, we need datasets for data analysis. Python has various modules
that help us import external data in various file formats into a Python program. In this section
we will see how to import data of various formats into a Python program.
The csv module enables us to read each row in a file using a comma as the delimiter. We first
open the file in read-only mode and then assign the delimiter. Finally, we use a for loop to read each
row from the CSV file.
Example
import csv
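The steps described above can be sketched as follows. The file name `sample.csv` and its contents are assumptions made so that the sketch runs on its own; in practice the path would point to an existing data file.

```python
import csv
import os
import tempfile

# Create a small sample file so the sketch is self-contained (hypothetical data).
path = os.path.join(tempfile.mkdtemp(), "sample.csv")
with open(path, "w", newline="") as f:
    f.write("name,age,score\nLata,22,45\nAnil,21,56\n")

rows = []
# Open the file in read-only mode, assign the delimiter, then loop over the rows.
with open(path, "r", newline="") as f:
    reader = csv.reader(f, delimiter=",")
    for row in reader:
        rows.append(row)
        print(row)
```

Each row comes back as a list of strings, so numeric columns need an explicit conversion such as `int(row[1])` before they are used in calculations.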
With pandas
The pandas library can handle most file types, including CSV files. In this program, let us see
how the pandas library handles an Excel file. The example below opens the Excel version of the above
file with ExcelFile, parses the sheet named "customers", and prints the first 10 rows.
Example
import pandas as pd
df = pd.ExcelFile("E:\\[Link]")
data = df.parse("customers")
print(data.head(10))
Output
0 7590-VHVEG Female Month-to-month Yes No
1 5575-GNVDE Male One year No No
2 3668-QPYBK Male Month-to-month Yes Yes
3 7795-CFOCW Male One year No No
4 9237-HQITU Female Month-to-month Yes Yes
5 9305-CDSKC Female Month-to-month Yes Yes
6 1452-KIOVK Male Month-to-month Yes No
7 6713-OKOMC Female Month-to-month No No
8 7892-POOKP Female Month-to-month Yes Yes
9 6388-TABGU Male One year No No
EXPORTING
The goal here is to export a dataset to Excel. Before you export the data, you need to
create a DataFrame to capture the information in Python. Next, you need to define the path
where you would like to store the exported Excel file.
CSV stands for comma-separated values. This file format is commonly used when
exporting and importing data to and from spreadsheets and database tables. The csv module was
added to Python's standard library as a result of PEP 305. It provides classes and functions to
perform read and write operations on CSV files following the recommendations of PEP 305.
CSV is a common export format for Microsoft's Excel spreadsheet software. However, the csv module
can also handle data written in other dialects.
The csv module's interface includes the following writer and reader functions:
1. writer()
This function in the csv module returns a writer object that converts data into delimited strings and
stores them in a file object. The function needs a file object with write permission as a parameter.
Every row written to the file ends with a newline character. To prevent extra blank lines between rows,
open the file with the newline parameter set to ''.
writerow()
This method writes the items of an iterable (list, tuple, or string) as one row, separated by commas.
writerows()
This method takes a list of iterables as its parameter and writes each one as a comma-separated line
of items in the file.
Example
This example shows the use of writer() function. First a file is opened in ‘w’ mode. This file is used to
obtain writer object. Each tuple in list of tuples is then written to file using writerow() method.
import csv
persons=[('Lata',22,45),('Anil',21,56),('John',20,60)]
csvfile=open('[Link]','w', newline='')
obj=csv.writer(csvfile)
for person in persons:
    obj.writerow(person)
csvfile.close()
Output
This will create ‘[Link]’ file in current directory. It will show following data.
Lata,22,45
Anil,21,56
John,20,60
Instead of iterating over the list to write each row individually, we can use writerows() method.
import csv
csvfile=open('[Link]','w', newline='')
persons=[('Lata',22,45),('Anil',21,56),('John',20,60)]
obj=csv.writer(csvfile)
obj.writerows(persons)
csvfile.close()
2. reader()
This function returns a reader object which returns an iterator of lines in the csv file. Using the regular
for loop, all lines in the file are displayed in following example:
Example
import csv
csvfile=open('[Link]','r', newline='')
obj=csv.reader(csvfile)
for row in obj:
    print(row)
csvfile.close()
Output
['Lata', '22', '45']
['Anil', '21', '56']
['John', '20', '60']
The reader object is an iterator. Hence, it supports next() function which can also be used to display all
lines in csv file instead of a FOR loop.
import csv
csvfile=open('[Link]','r', newline='')
obj=csv.reader(csvfile)
while True:
    try:
        row=next(obj)
        print(row)
    except StopIteration:
        break
csvfile.close()
The csv module also defines a Dialect class. A dialect is a set of formatting conventions for a
particular CSV variant. The list of available dialects can be obtained with the list_dialects() function.
>>> csv.list_dialects()
['excel', 'excel-tab', 'unix']
In addition to iterables, csv module can export a dictionary object to CSV file and read it to populate
Python dictionary object.
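A minimal sketch of the dictionary-based interface follows, using the csv module's DictWriter and DictReader classes. The field names `name`, `age`, and `marks` are assumptions (the earlier example does not name its columns), and an in-memory buffer stands in for a file on disk.

```python
import csv
import io

# Field names are assumed for illustration; the values come from the earlier example.
persons = [
    {"name": "Lata", "age": 22, "marks": 45},
    {"name": "Anil", "age": 21, "marks": 56},
    {"name": "John", "age": 20, "marks": 60},
]

# An in-memory text buffer stands in for a file opened with newline=''.
buffer = io.StringIO(newline="")
writer = csv.DictWriter(buffer, fieldnames=["name", "age", "marks"])
writer.writeheader()       # first line holds the field names
writer.writerows(persons)  # one CSV line per dictionary

# Rewind and read the data back; each row becomes a dictionary keyed by the header.
buffer.seek(0)
rows = list(csv.DictReader(buffer))
for row in rows:
    print(dict(row))
```

As with reader(), every value read back is a string, so `row["age"]` is `'22'` rather than the integer 22.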
CORRELATION
Statistics and data science are often concerned about the relationships between two or more variables (or
features) of a dataset. Each data point in the dataset is an observation, and the features are the properties
or attributes of those observations.
Every dataset you work with uses variables and observations. For example, you might be interested in
understanding the following:
In the examples above, the height, shooting accuracy, years of experience, salary, population density,
and gross domestic product are the features or variables. The data related to each player, employee, and
each country are the observations.
When data is represented in the form of a table, the rows of that table are usually the observations, while
the columns are the features. Take a look at this employee table:
Name    Years of Experience    Salary
Ann     30                     120,000
Rob     21                     105,000
Tom     19                     90,000
Ivy     10                     82,000
In this table, each row represents one observation, or the data about one employee (either Ann, Rob,
Tom, or Ivy). Each column shows one property or feature (name, experience, or salary) for all the
employees.
If you analyze any two features of a dataset, then you’ll find some type of correlation between those two
features. Consider the following figures:
1. Negative correlation (red dots): In the plot on the left, the y values tend to decrease as the x
values increase. This shows strong negative correlation, which occurs when large values of one
feature correspond to small values of the other, and vice versa.
2. Weak or no correlation (green dots): The plot in the middle shows no obvious trend. This is a
form of weak correlation, which occurs when an association between two features is not obvious
or is hardly observable.
3. Positive correlation (blue dots): In the plot on the right, the y values tend to increase as the x
values increase. This illustrates strong positive correlation, which occurs when large values of
one feature correspond to large values of the other, and vice versa.
TYPES OF CORRELATION
The correlation between experience and salary is positive because higher experience corresponds to a
larger salary and vice versa.
Note: When you’re analyzing correlation, you should always have in mind that correlation does not
indicate causation. It quantifies the strength of the relationship between the features of a dataset.
Sometimes, the association is caused by a factor common to several features of interest.
Correlation is tightly connected to other statistical quantities like the mean, standard deviation,
variance, and covariance.
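The link between correlation and these quantities can be made concrete with the employee table. The sketch below computes Pearson's correlation coefficient from its definition (covariance divided by the product of the standard deviations) and compares it with NumPy's `corrcoef`. Choosing Pearson's coefficient is an assumption here, since the text does not name a specific measure.

```python
import numpy as np

# Experience and salary columns from the employee table.
experience = np.array([30, 21, 19, 10])
salary = np.array([120_000, 105_000, 90_000, 82_000])

# Pearson's r from its definition: covariance / (std of x * std of y).
cov_xy = np.mean((experience - experience.mean()) * (salary - salary.mean()))
r_manual = cov_xy / (experience.std() * salary.std())

# The same quantity taken from NumPy's correlation matrix.
r_numpy = np.corrcoef(experience, salary)[0, 1]

print(r_manual, r_numpy)
```

Both values agree and are close to +1, which matches the statement that experience and salary are positively correlated in this table.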
FEATURES OF MongoDB
MongoDB is a scalable, flexible NoSQL document database platform designed to overcome the
limitations of the relational database approach and of other NoSQL solutions. MongoDB is well known
for its horizontal scaling and load balancing capabilities, which have given application developers an
unprecedented level of flexibility and scalability.
MongoDB Atlas is a leading global cloud database service for modern applications. Using Atlas,
developers can deploy fully managed cloud databases across AWS, Azure, or Google Cloud.
Best-in-class data security and privacy practices mean that developers can rest easy knowing that
they have instant access to the availability, scalability, and compliance they require for enterprise-level
application development.
MongoDB provides developers with a number of useful out-of-the-box capabilities, whether you need to
run privately on site or in the public cloud.
1. Ad-hoc queries for optimized, real-time analytics - When designing the schema of a database, it is
impossible to know in advance all the queries that will be performed by end users. An ad hoc query is a
short-lived command whose value depends on a variable. Each time an ad hoc query is executed, the
result may be different, depending on the variables in question.
Optimizing the way in which ad-hoc queries are handled can make a significant difference at scale,
when thousands to millions of variables may need to be considered. This is why MongoDB, a document-
oriented, flexible schema database, stands apart as the cloud database platform of choice for enterprise
applications that require real-time analytics. With ad-hoc query support that allows developers to update
ad-hoc queries in real time, the improvement in performance can be game-changing.
MongoDB supports field queries, range queries, and regular expression searches. Queries can return
specific fields and also account for user-defined functions. This is made possible because MongoDB
indexes BSON documents and uses the MongoDB Query Language (MQL).
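The three query styles named above can be illustrated without a running server. The sketch below is plain Python, not MongoDB or its driver: `matches` and `find` are hypothetical helpers that interpret a small subset of MQL-style filter documents against an in-memory list (`$gte`, `$lt`, and `$regex` are operator names taken from the MongoDB Query Language). The sample records reuse the data from the XML example.

```python
import re

# Sample documents (values reused from the XML example).
documents = [
    {"name": "Prashant Rao", "sex": "Male", "age": 35},
    {"name": "Seema R.", "sex": "Female", "age": 41},
    {"name": "Satish Mane", "sex": "Male", "age": 29},
    {"name": "Subrato Roy", "sex": "Male", "age": 26},
]

# Toy implementations of three MQL-style operators.
OPERATORS = {
    "$gte": lambda value, operand: value >= operand,
    "$lt": lambda value, operand: value < operand,
    "$regex": lambda value, operand: re.search(operand, value) is not None,
}

def matches(doc, query):
    """Return True if doc satisfies every condition in the filter document."""
    for field, condition in query.items():
        if field not in doc:
            return False
        value = doc[field]
        if isinstance(condition, dict):
            # Operator form, e.g. {"age": {"$gte": 30, "$lt": 40}}
            if not all(OPERATORS[op](value, arg) for op, arg in condition.items()):
                return False
        elif value != condition:
            # Plain equality form, e.g. {"sex": "Female"}
            return False
    return True

def find(docs, query, fields=None):
    """Filter docs by query; optionally return only the listed fields."""
    result = [d for d in docs if matches(d, query)]
    if fields is not None:
        result = [{k: d[k] for k in fields if k in d} for d in result]
    return result

field_query = find(documents, {"sex": "Female"})
range_query = find(documents, {"age": {"$gte": 30, "$lt": 40}})
regex_query = find(documents, {"name": {"$regex": "^S"}}, fields=["name"])

print(field_query)
print(range_query)
print(regex_query)
```

In a real deployment the same filter documents would be sent to the server, which evaluates them against stored documents and can use indexes to do so efficiently.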
2. Indexing appropriately for better query executions - In our experience, the number one issue that
many technical support teams fail to address with their users is indexing. Done right, indexes are
intended to improve search speed and performance. A failure to properly define appropriate indices can
and usually will lead to a myriad of accessibility issues, such as problems with query execution and load
balancing.
Without the right indices, a database is forced to scan documents one by one to identify the ones that
match the query statement. But if an appropriate index exists for each query, user requests can be
optimally executed by the server. MongoDB offers a broad range of indices and features with language-
specific sort orders that support complex access patterns to datasets.
Notably, MongoDB indices can be created on demand to accommodate real-time, ever-changing query
patterns and application requirements. They can also be declared on any field within any of your
documents, including those nested within arrays.
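The difference between scanning every document and using an index can be sketched in plain Python. This is a conceptual illustration only: the dictionary built by the hypothetical `build_index` helper is a stand-in for a real index structure and does not reflect how MongoDB stores its indexes. The city values are sample data.

```python
# Sample documents; each has an identifier and a city field.
cities = ["Pune", "Mumbai", "Pune", "Vishakhapatnam", "Mumbai"]
documents = [{"_id": i, "city": city} for i, city in enumerate(cities)]

def scan(docs, field, value):
    """Without an index: inspect every document one by one."""
    return [d for d in docs if d.get(field) == value]

def build_index(docs, field):
    """Map each distinct field value to the positions of the documents holding it."""
    index = {}
    for position, d in enumerate(docs):
        index.setdefault(d.get(field), []).append(position)
    return index

# Build the index once, then answer queries with a direct lookup.
city_index = build_index(documents, "city")
via_index = [documents[p] for p in city_index.get("Pune", [])]
via_scan = scan(documents, "city", "Pune")

print(via_index)
print(via_index == via_scan)  # True
```

Both approaches return the same documents; the indexed lookup simply avoids touching documents that cannot match, which is where the speed-up on large collections comes from.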
3. Replication for better data availability and stability - When your data only resides in a single
database, it is exposed to multiple potential points of failure, such as a server crash, service
interruptions, or even good old hardware failure. Any of these events would make accessing your data
nearly impossible.
Replication allows you to sidestep these vulnerabilities by deploying multiple servers for disaster
recovery and backup. Horizontal scaling across multiple servers that house the same data (or shards of
that same data) means greatly increased data availability and stability. Naturally, replication also helps
with load balancing. When multiple users access the same data, the load can be distributed evenly across
servers.
In MongoDB, replica sets are employed for this purpose. A primary server or node accepts all write
operations and applies those same operations across secondary servers, replicating the data. If the
primary server should ever experience a critical failure, any one of the secondary servers can be elected
to become the new primary node. And if the former primary node comes back online, it does so as a
secondary server for the new primary node.
4. Sharding - When dealing with particularly large datasets, sharding (the process of splitting larger
datasets across multiple distributed collections), helps the database distribute and better execute what
might otherwise be problematic and cumbersome queries. Without sharding, scaling a growing web
application with millions of daily users is nearly impossible.
Like replication via replica sets, sharding in MongoDB allows for much greater horizontal
scalability. Horizontal scaling means that each shard in every cluster houses a portion of the dataset in
question, essentially functioning as a separate database. The collection of distributed server shards forms
a single, comprehensive database much better suited to handling the needs of a popular, growing
application with zero downtime.
All operations in a sharding environment are handled through a lightweight process called mongos.
Mongos can direct queries to the correct shard based on the shard key. Naturally, proper sharding also
contributes significantly to better load balancing.
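The idea of routing by shard key can be sketched as a toy router in plain Python. This is not how mongos is implemented: the shard names, the `route` helper, and the hash-and-modulo rule are all assumptions chosen to show the principle that the shard key alone decides where a document lives. The customer IDs are reused from the earlier output listing.

```python
import hashlib

SHARDS = ["shard0", "shard1", "shard2"]  # hypothetical shard names

def route(shard_key_value):
    """Pick a shard deterministically from the shard key value (toy hash rule)."""
    digest = hashlib.md5(str(shard_key_value).encode("utf-8")).hexdigest()
    return SHARDS[int(digest, 16) % len(SHARDS)]

# Customer IDs serve as the shard key values.
customer_ids = ["7590-VHVEG", "5575-GNVDE", "3668-QPYBK", "7795-CFOCW", "9237-HQITU"]

# Distribute the documents across the shards.
cluster = {name: [] for name in SHARDS}
for customer_id in customer_ids:
    cluster[route(customer_id)].append({"customer_id": customer_id})

# A query on the shard key only needs to visit one shard.
target = route("3668-QPYBK")
found = [d for d in cluster[target] if d["customer_id"] == "3668-QPYBK"]

print(cluster)
print(target, found)
```

Because the same key always maps to the same shard, a query that includes the shard key can be sent to a single shard instead of being broadcast to all of them.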
5. Load balancing - At the end of the day, optimal load balancing remains one of the holy grails of
large-scale database management for growing enterprise applications. Properly distributing millions of
client requests to hundreds or thousands of servers can lead to a noticeable (and much appreciated)
difference in performance.
Fortunately, via horizontal scaling features like replication and sharding, MongoDB supports large-scale
load balancing. The platform can handle multiple concurrent read and write requests for the same data
with best-in-class concurrency control and locking protocols that ensure data consistency. There’s no
need to add an external load balancer—MongoDB ensures that each and every user has a consistent
view and quality experience with the data they need to access.