Training Report On Data Science With Python
on
DATA SCIENCE WITH PYTHON
Submitted by
HARSHIT RAJ
ROLL NUMBER- 2124449
Programme Name: Data Science
Under the Guidance of
NIELIT, PATNA CENTRE
(June-July, 2019)
DECLARATION
I hereby declare that I have completed my four weeks of summer training
at NIELIT, Patna Centre from 1 July 2023 to 30 July 2023 under the
guidance of Ankit Kumar. I declare that I have worked with full
dedication during these four weeks of training and that my learning
outcomes fulfil the requirements of training for the award of degree of
Data Science with Python.
Name of Student:
Date: Registration no:
ACKNOWLEDGEMENT
History of Python
Python was conceived in the late 1980s by Guido van Rossum at the
National Research Institute for Mathematics and Computer Science (CWI) in
the Netherlands as a successor to the ABC language, capable of
exception handling and interfacing with the Amoeba operating system.
Python features a dynamic type system and automatic memory management.
It supports multiple programming paradigms, including object-oriented,
imperative, functional and procedural, and has a large and comprehensive
standard library. Van Rossum picked the name Python for the new language
from a TV show, Monty Python's Flying Circus.
In December 1989 van Rossum began writing the first Python interpreter as
a hobby project, and on 16 October 2000 Python 2.0 was released with
many new features. Van Rossum described the beginning in his own words:
...In December 1989, I was looking for a "hobby" programming
project that would keep me occupied during the week around
Christmas. My office ... would be closed, but I had a home computer,
and not much else on my hands. I decided to write an interpreter for
the new scripting language I had been thinking about lately: a
descendant of ABC that would appeal to Unix/C hackers. I chose Python
as a working title for the project, being in a slightly irreverent mood
(and a big fan of Monty Python's Flying Circus).
Why Python?
The language's core philosophy is summarized in the document The Zen
of Python (PEP 20), which includes aphorisms such as:
• Beautiful is better than ugly
• Explicit is better than implicit
• Simple is better than complex
• Complex is better than complicated
• Readability counts
A simple Program to print "Hello World"
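The program itself is not reproduced in this report. A minimal sketch of what such a program might look like is given here; the variable name message is only illustrative.

```python
# Store the text in a variable (the name is illustrative) and print it.
message = "Hello World"
print(message)
```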
Characteristics of Python
Interpreted Language: Python is processed at runtime by the Python
interpreter.
Easy to read: Python source code is clearly defined and easy on the
eyes.
Portable: Python code can run on a wide variety of hardware
platforms, all having the same interface.
Extendable: Users can add low-level modules to the Python interpreter.
Scalable: Python provides better structure and support for large
programs than shell scripts.
Object-Oriented Language: It supports object-oriented features and
techniques of programming.
Tuples-
• Immutable in nature, i.e., they cannot be changed.
• No type restriction.
• Indexing and slicing work the same as in strings and lists.
• Constructing tuples.
• Basic tuple methods.
• Immutability.
• When to use tuples?
• We can use tuples to represent things that shouldn't change, such as
days of the week, or dates on a calendar, etc.
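The points above can be illustrated with a short sketch. The values and variable names are made up for illustration and are not taken from the training material.

```python
# Constructing a tuple; elements may be of mixed types (no type restriction).
t = (1, "two", 3.0)

# Indexing and slicing work as they do for strings and lists.
first = t[0]       # first element
last_two = t[1:]   # every element after the first

# Basic tuple methods: count() and index().
days = ("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun")
n_mon = days.count("Mon")     # how many times "Mon" occurs
pos_wed = days.index("Wed")   # position of "Wed"

# Immutability: item assignment raises TypeError.
try:
    days[0] = "Monday"
    changed = True
except TypeError:
    changed = False
```

Because the days of the week never change, a tuple is a natural container for them: any accidental attempt to modify it fails immediately.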
Sets-
• A set contains unique, unordered elements, and we can
construct one by using the set() function.
• Convert a list into a set-
• k = set(l)
• If the distinct elements of the list l are 1, 2, 3, 4, 6 and 7, k becomes {1, 2, 3, 4, 6, 7}
• Basic syntax-
• x = set()
• x.add(1)
• x.add(1)
• The second call makes no change in x, because 1 is already present
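A runnable sketch of the set operations described above follows. The contents of the list l are not given in this report, so the list used here is an assumed example containing repeated values.

```python
# Assumed example list with repeated values (the report does not give l).
l = [1, 2, 3, 4, 6, 7, 1, 2]

# Converting a list into a set drops the duplicates.
k = set(l)

# Building a set element by element.
x = set()
x.add(1)
x.add(1)   # adding an element that is already present changes nothing
```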
Packages and Modules in Python
1. NumPy
NumPy is a Python package. It stands for 'Numerical Python'. It is a
library consisting of multidimensional array objects and a collection of
routines for processing arrays.
Numeric, the ancestor of NumPy, was developed by Jim Hugunin.
Another package Numarray was also developed, having some additional
functionalities. In 2005, Travis Oliphant created NumPy package by
incorporating the features of Numarray into Numeric package. There
are many contributors to this open source project.
Operations using NumPy
Using NumPy, a developer can perform the following operations —
• Mathematical and logical operations on arrays.
• Fourier transforms and routines for shape manipulation.
• Operations related to linear algebra. NumPy has in-built functions
for linear algebra and random number generation.
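Each kind of operation listed above can be sketched in a line or two. The arrays and variable names below are illustrative assumptions, chosen only to keep the example small.

```python
import numpy as np

a = np.array([[1.0, 2.0], [3.0, 4.0]])
b = np.array([[0.0, 1.0], [1.0, 0.0]])

# Mathematical and logical operations act element by element.
total = a + b
mask = a > 2

# Shape manipulation: view the 2x2 array as a flat array of 4 values.
flat = a.reshape(4)

# Fourier transform of a unit impulse (every frequency component equals 1).
spectrum = np.fft.fft(np.array([1.0, 0.0, 0.0, 0.0]))

# Linear algebra: a matrix times its inverse gives the identity matrix.
inverse = np.linalg.inv(a)
identity = a @ inverse

# Random number generation with a seeded generator.
rng = np.random.default_rng(seed=0)
samples = rng.random(3)
```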
Simple program to create a matrix-
First we import the numpy package, then pass the input to a
NumPy function as a list, which creates the matrix.
Many more operations can be performed with NumPy, such as taking the
sine of given values or creating a zero matrix. An image can also be
represented as an array.
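The original program is not reproduced in this report. A minimal sketch of the steps described might look like the following; the input values and the tiny all-black "image" are assumptions made for illustration.

```python
import numpy as np

# Create a matrix from a nested list (illustrative values).
rows = [[1, 2, 3], [4, 5, 6]]
matrix = np.array(rows)
print(matrix)

# Take the sine of given values.
sines = np.sin(np.array([0.0, np.pi / 2]))

# Create a zero matrix.
zeros = np.zeros((2, 2))

# An image is just an array: here a hypothetical 4x4 RGB image,
# with one unsigned 8-bit value per colour channel.
image = np.zeros((4, 4, 3), dtype=np.uint8)
```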
2. Matplotlib
Matplotlib is a library for making 2D plots of arrays in Python. Although
it has its origins in emulating the MATLAB graphics commands, it is
independent of MATLAB, and can be used in a Pythonic, object-oriented
way. Although Matplotlib is written primarily in pure Python, it makes
heavy use of NumPy and other extension code to provide good
performance even for large arrays.
Matplotlib is designed with the philosophy that you should be able to
create simple plots with just a few commands, or just one! If you want
to see a histogram of your data, you shouldn't need to instantiate
objects, call methods, set properties, and so on; it should just work.
These are some examples of Matplotlib.
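The original figures are not reproduced in this report. As a hedged sketch of the "few commands" idea, the following draws a histogram and a line plot. The data is randomly generated, the output file name is an assumption, and the Agg backend is selected only so the sketch runs without a display.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit this on a desktop session
import matplotlib.pyplot as plt
import numpy as np

# Illustrative data: 1000 normally distributed values.
rng = np.random.default_rng(seed=0)
data = rng.normal(size=1000)

# Two plots side by side: a histogram and a sine curve.
fig, (ax_hist, ax_line) = plt.subplots(1, 2, figsize=(8, 3))
counts, bin_edges, _ = ax_hist.hist(data, bins=20)
ax_hist.set_title("Histogram")

x = np.linspace(0, 2 * np.pi, 100)
ax_line.plot(x, np.sin(x))
ax_line.set_title("sin(x)")

# Save the figure to a temporary file (hypothetical file name).
out_path = os.path.join(tempfile.gettempdir(), "example_plots.png")
fig.savefig(out_path)
```

In an interactive session, plt.show() would display the figure in a window instead of saving it.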
3. Pandas
Pandas is an open-source, BSD-licensed Python library providing
high-performance, easy-to-use data structures and tools for data
manipulation and analysis in the Python programming language. Python
with Pandas is used in a wide range of academic and commercial fields,
including finance, economics, statistics and analytics.
The name Pandas is derived from "panel data", an econometrics term
for multidimensional data sets.
Key Features of Pandas-
• Fast and efficient DataFrame object with default and customized
indexing.
• Tools for loading data into in-memory data objects from different
file formats.
• Data alignment and integrated handling of missing data.
• Reshaping and pivoting of data sets.
• Label-based slicing, indexing and subsetting of large data sets.
• Columns from a data structure can be deleted or inserted.
• Group by data for aggregation and transformations.
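Several of these features can be seen together in one short sketch. The CSV text, column names and index labels below are hypothetical and stand in for a real data file.

```python
import io

import pandas as pd

# Hypothetical CSV text standing in for a data file on disk.
csv_text = """city,year,sales
Patna,2022,10
Delhi,2022,
Patna,2023,30
Delhi,2023,40
"""

# Loading: read_csv accepts a path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))
df.index = ["a", "b", "c", "d"]   # customized index labels

# Missing data: the empty sales field is read as NaN.
n_missing = int(df["sales"].isna().sum())
filled = df["sales"].fillna(0.0)

# Label-based slicing (both end labels are included).
subset = df.loc["b":"c", ["city", "sales"]]

# Inserting and deleting a column.
df["sales_doubled"] = df["sales"] * 2
df = df.drop(columns="sales_doubled")

# Group by with aggregation (sum() skips NaN values).
totals = df.groupby("city")["sales"].sum()

# Pivoting: one row per year, one column per city.
wide = df.pivot(index="year", columns="city", values="sales")
```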
Pandas deals with the following three data structures -
• Series
• DataFrame
• Panel (deprecated in pandas 0.20 and removed in pandas 0.25)
These data structures are built on top of NumPy arrays, which means
they are fast.
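A brief sketch of the two structures that remain in current pandas, and of their link to NumPy arrays, is given here. The values and labels are illustrative assumptions.

```python
import numpy as np
import pandas as pd

# Series: a one-dimensional labelled array.
s = pd.Series([1.5, 2.5, 3.5], index=["x", "y", "z"])

# DataFrame: a two-dimensional table whose columns may differ in type.
table = pd.DataFrame({"a": [1, 2, 3], "b": [4.0, 5.0, 6.0]})

# The underlying data can be obtained as NumPy arrays.
s_values = s.to_numpy()
table_values = table.to_numpy()
```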
4. OpenCV
OpenCV was started at Intel in 1999 by Gary Bradski, and the first
release came out in 2000. Vadim Pisarevsky joined Gary Bradski to
manage Intel's Russian software OpenCV team. In 2005, OpenCV was
used on Stanley, the vehicle that won the 2005 DARPA Grand Challenge.
Later its active development continued under the support of Willow
Garage, with Gary Bradski and Vadim Pisarevsky leading the project.
Right now, OpenCV supports a lot of algorithms related to Computer
Vision and Machine Learning, and it is expanding day by day.
Below is the list of contributors who submitted tutorials to OpenCV-
Python.
-Alexander Mordvintsev (GSoC-2013 mentor)
-Abid Rahman K. (GSoC-2013 intern)
Use the function cv2.imread() to read an image. The image should be in
the working directory, or a full path to the image should be given.
The second argument is a flag which specifies the way the image should
be read.
• cv2.IMREAD_COLOR : Loads a color image. Any transparency of the
image will be neglected. It is the default flag.
• cv2.IMREAD_GRAYSCALE : Loads the image in grayscale mode
• cv2.IMREAD_UNCHANGED : Loads the image as such, including the alpha
channel
Use the function cv2.imshow() to display an image in a window. The
window automatically fits to the image size.
This program converts an image from colour to black and white (grayscale).
Conclusion
After completing this Data Science with Python training, I have learned
what data science is, why it is important, and the different libraries
involved in data science. I also learned the different skills that data
science requires.
Learning Outcome
After completing the training, I am able to:
• Develop relevant programming abilities.
• Demonstrate proficiency with statistical analysis of data.
• Develop the skill to build and assess data-based models.
• Execute statistical analysis with professional statistical software.
• Demonstrate skill in data management.
• Apply data science concepts and methods to solve problems in
real-world contexts and communicate these solutions effectively.