Introduction and Python Basics
Data Science
• Data science involves using
methods to analyze massive
amounts of data and extract the
knowledge it contains.
Data Format
• Structured Data
• Unstructured Data
– Natural language
– Machine-generated data
– Network data
– Audio, image, and video
Which language is used in data science?
Life Cycle of Data Analytics
Problem Definition -> Data Collection -> Data Cleaning -> Model Building -> Implementation -> Results Communication
About this course
• Syllabus
• Meeting time
• Homework assignments
• Prerequisite
PYTHON BASICS
Reading Materials
[Figure: reading materials ordered from easy to hard]
Google Colaboratory
• Google Colaboratory, or "Colab" for short, allows you to write and execute
Python in your browser, with
– Zero configuration required
– Free access to GPUs
– Easy sharing
• Watch this introduction video:
https://www.youtube.com/watch?v=inN8seMm7UI
• Google Colab link for this lecture:
– https://colab.research.google.com/drive/1EfShYcQpP-I7hBiKQ3vZKXIuwyFyCn7N?usp=sharing
– File -> Save a copy in Drive (this creates a copy of the notebook and saves it to
your Drive; you can rename the copy later).
Variable
• A variable is a name that holds a value, and that value may change.
You can use variables to store all kinds of information, including
numbers and strings (text). The following example code shows
how to assign numerical values to variables.
x=3
y=4
z=x+y
print(z)
Output:
7
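Variables hold strings in the same way as numbers. The following is a minimal sketch; the names `course`, `students`, and `label` and their values are illustrative and do not come from the slides.

```python
course = "Data Science"   # a string (text) value
students = 30             # a numerical value (illustrative)

# str() converts the number to text so it can be joined to a string
label = course + ": " + str(students)
print(label)

# A variable can be reassigned; the old value is replaced
students = students + 1
print(students)
```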
Operators
• Division, powers, int, round
x=2
y=3
print("division:", x/y)
print("square:", x**2)
print("square root:", x**0.5)
print("integer part:", int(2/3))
print("rounded to 3 digits:", round(2/3,3))
Output:
division: 0.6666666666666666
square: 4
square root: 1.4142135623730951
integer part: 0
rounded to 3 digits: 0.667
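Note that `round` goes to the nearest value; it does not always round up. The sketch below compares it with `int`, `math.floor`, and `math.ceil`, and also shows integer division and remainder. The variable names are illustrative only.

```python
import math

value = 2 / 3

truncated = int(value)            # drops the fractional part
nearest = round(value, 3)         # nearest value with 3 decimal digits
rounded_up = math.ceil(value)     # smallest integer >= value
rounded_down = math.floor(value)  # largest integer <= value

quotient = 7 // 2    # integer (floor) division
remainder = 7 % 2    # remainder of the division

print(truncated, nearest, rounded_up, rounded_down)
print(quotient, remainder)
```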
Python Indentation
• Indentation refers to the spaces at the beginning of a code line.
The indentation in Python is very important. Python uses
indentation to indicate a block of code. We will see some
examples in the following slides.
If Statement
• Examples
a=3
if a>3:
    print(a, "is greater than 3")
else:
    print(a, "is less than or equal to 3")

a=5
if a>3:
    print(a, "is greater than 3")
elif a<3:
    print(a, "is less than 3")
else:
    print(a, "is equal to 3")
For Loop
• Examples
for i in range(10):
    print(i)

for i in range(1,10,2):
    print(i)

for i in range(10):
    if i==5:
        continue
    print(i)
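To see which values each loop visits, the values can be collected into lists. This is a small sketch with illustrative variable names; the last loop adds `break`, which is not on the slide, to contrast it with `continue`.

```python
all_values = list(range(10))         # start at 0, stop before 10
odd_values = list(range(1, 10, 2))   # start at 1, stop before 10, step 2

# continue skips the rest of the current iteration only
skipped = []
for i in range(10):
    if i == 5:
        continue
    skipped.append(i)

# break leaves the loop entirely
stopped = []
for i in range(10):
    if i == 5:
        break
    stopped.append(i)

print(all_values)
print(odd_values)
print(skipped)
print(stopped)
```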
List
• A list is a collection of values, organized in order. A list is
created using square brackets.
• Example: Define a list with numerical values.
a = [1,2,3,4,5]
print(a)
Output:
[1, 2, 3, 4, 5]
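Elements of a list are accessed by position. Indexing starts at 0, and negative indexes count from the end. A minimal sketch using the list `a` defined above; the names `first`, `second`, and `last` are illustrative.

```python
a = [1, 2, 3, 4, 5]

first = a[0]     # index 0 is the first element
second = a[1]
last = a[-1]     # index -1 is the last element

print("first element:", first)
print("second element:", second)
print("last element:", last)

a.append(6)      # add a value to the end of the list
print(len(a))    # number of elements in the list
```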
How to join a list of lists into a single list in Python?
a = [['x','b'],['c'],['d']]
print(sum(a,[]))
Output:
['x', 'b', 'c', 'd']
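`sum(a, [])` builds a new list at every step, so it can become slow for long lists. Two common alternatives are sketched below; they are not from the slides, and the variable names are illustrative.

```python
from itertools import chain

a = [['x', 'b'], ['c'], ['d']]

# chain.from_iterable walks through each inner list in turn
flat_chain = list(chain.from_iterable(a))

# the same result with a nested list comprehension
flat_comp = [item for sub in a for item in sub]

print(flat_chain)
print(flat_comp)
```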
Dictionary (1)
• Dictionaries are collections like lists. However, every element in a
dictionary has two parts: a key and a value. Looking up a key in a
dictionary returns the value linked to that key. Dictionaries are
declared using curly braces, and each element is declared first
by its key, then a colon, and then its value.
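The curly-brace syntax described above can be sketched as follows. The name `person` and the key `'country'` are illustrative; `get` is the standard dictionary method for looking up a key that may be missing.

```python
# key: value pairs inside curly braces
person = {'name': 'liming', 'age': 25}

name = person['name']                       # look up a value by its key
has_age = 'age' in person                   # check whether a key exists
country = person.get('country', 'unknown')  # default value if the key is missing

print(name, has_age, country)
```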
Dictionary (2)
• Example: Define a dictionary.
a = {}
a['name'] = 'liming'
a['age'] = 25
a['gender'] = 'male'
print(a)
print(a['name'])
Output:
{'name': 'liming', 'age': 25, 'gender': 'male'}
liming
Dictionary (3)
• Example: Keys of a dictionary.
a = {}
a['name'] = 'lilei'
a['age'] = 25
a['gender'] = 'male'
print(a.keys())
Output:
dict_keys(['name', 'age', 'gender'])
Dictionary (4)
• Example: Define a list of dictionaries.
c = []
a = {}
a['name'] = 'lilei'
a['age'] = 25
a['gender'] = 'male'
b={}
b['name'] = 'hanmeimei'
b['age'] = 25
b['gender'] = 'female'
c.append(a)
c.append(b)
print(c)
Output:
[{'name': 'lilei', 'age': 25, 'gender': 'male'}, {'name': 'hanmeimei', 'age': 25, 'gender': 'female'}]
Dictionary (6)
• Example: Sort a dictionary by value or key. Note: the result is
a list of (key, value) tuples, not a dictionary.
#sort by value
a = {1:4,2:3,5:12,8:6}
b = sorted(a.items(),key=lambda e:e[1])
print(b)
#sort by key
a = {1:4,2:3,5:12,8:6}
b = sorted(a.items(),key=lambda e:e[0])
print(b)
Output:
[(2, 3), (1, 4), (8, 6), (5, 12)]
[(1, 4), (2, 3), (5, 12), (8, 6)]
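If a dictionary is needed again after sorting, the sorted pairs can be passed to `dict()`. This sketch relies on dictionaries keeping insertion order, which is guaranteed from Python 3.7; the names `by_value` and `by_value_desc` are illustrative.

```python
a = {1: 4, 2: 3, 5: 12, 8: 6}

# sorted() returns a list of (key, value) tuples; dict() turns it back
# into a dictionary that keeps the sorted order
by_value = dict(sorted(a.items(), key=lambda e: e[1]))

# reverse=True sorts from largest to smallest value
by_value_desc = dict(sorted(a.items(), key=lambda e: e[1], reverse=True))

print(by_value)
print(by_value_desc)
```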
Dictionary (7)
• Example: Sort a list of dictionaries
a = [{'name':'david','age':19},{'name':'victor','age':32},{'name':'mike','age':29},{'name':'lisa','age':29}]
b = sorted(a,key=lambda x:(-x['age'], x['name']))
for bb in b:
    print(bb)
Output:
{'name': 'victor', 'age': 32}
{'name': 'lisa', 'age': 29}
{'name': 'mike', 'age': 29}
{'name': 'david', 'age': 19}
Set
• Sets are just like lists except that they are unordered and they
do not allow duplicate values.
• Example 1: How to define a set from a list and how to define a
list from a set.
a = [1,2,3,4,5,1,2,3]
b = set(a)
print(b)
c = list(b)
print(c)
Output:
{1, 2, 3, 4, 5}
[1, 2, 3, 4, 5]
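Because sets are unordered, converting a list to a set and back does not promise to keep the original order. The sketch below shows one common way to remove duplicates while keeping order, followed by basic set operations. None of this is from the slides, and all names are illustrative.

```python
a = [3, 1, 2, 3, 1]

# dict keys are unique and keep insertion order (Python 3.7+)
unique_in_order = list(dict.fromkeys(a))

x = {1, 2, 3}
y = {3, 4}
union = x | y     # values in either set
common = x & y    # values in both sets
only_x = x - y    # values in x but not in y

print(unique_in_order)
print(union, common, only_x)
```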
Count unique values in a list
Counter({1: 6, 2: 3, 3: 3})
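A possible way to produce a result like the one above is `collections.Counter`. The input list here is an assumption, chosen only so that the counts match the output shown; the slide's original list is not known.

```python
from collections import Counter

# Hypothetical input: six 1s, three 2s, three 3s
values = [1, 1, 1, 1, 1, 1, 2, 2, 2, 3, 3, 3]

counts = Counter(values)          # maps each value to how often it occurs
print(counts)

num_unique = len(counts)          # number of distinct values
top = counts.most_common(1)       # most frequent value with its count
print(num_unique, top)
```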
Function
• A function is a reusable block of code that can accept some arguments
(also called inputs or parameters) and possibly return an object
(which may be a tuple containing multiple objects).
• Example 1: how to define a function and how to use it.
def totalsum(a,b):
    return a+b
a=1
b=2
c = totalsum(a,b)
print(c)
Output:
3
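A function can return several objects at once by returning a tuple. A minimal sketch; the function name `sum_and_product` is illustrative and not from the slides.

```python
def sum_and_product(a, b):
    # The comma builds a tuple, so two values are returned together
    return a + b, a * b

# The returned tuple can be unpacked into separate variables
total, product = sum_and_product(3, 4)
print(total, product)
```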
Tokenize a sentence
import re
text = "If you have any questions, feel free to contact us (test_1)."
words = re.findall(r'[a-zA-Z0-9_]+|[,.]+', text)
print(words)
Output:
['If', 'you', 'have', 'any', 'questions', ',', 'feel', 'free', 'to', 'contact', 'us', 'test_1', '.']
Extract phone number
import re
2004959559
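One way to obtain a digits-only phone number like the one above is sketched below. The input sentence and the pattern are assumptions made for illustration (a 10-digit number with optional parentheses, spaces, dots, or hyphens); the slide's original text and pattern are not known.

```python
import re

# Hypothetical input text
text = "Please call (200) 495-9559 for details."

# 3 digits, 3 digits, 4 digits, with optional separators between groups
pattern = r'\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}'
match = re.search(pattern, text)

# \D matches any non-digit character; removing them leaves only the digits
phone = re.sub(r'\D', '', match.group()) if match else None
print(phone)
```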
Random Number Generation
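A minimal sketch using Python's standard `random` module. The seed, ranges, and variable names are illustrative choices and not taken from the slides.

```python
import random

random.seed(0)   # fixing the seed makes the results repeatable

r = random.random()                     # float in the range [0.0, 1.0)
n = random.randint(1, 10)               # integer from 1 to 10, both included
choice = random.choice(['a', 'b', 'c']) # one element picked from a list
sample = random.sample(range(100), 5)   # 5 distinct values from 0..99

print(r, n, choice, sample)
```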
This code outputs the names of all files on your desktop. Remember to change the username (lg) to yours.
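A sketch of how file names in a folder can be listed with `os.listdir`. The helper `list_file_names` is hypothetical, and the desktop location is an assumption: it is built from the current user's home folder, so no username needs to be typed in, and the loop is skipped if that folder does not exist.

```python
import os

def list_file_names(folder):
    """Return the sorted names of the files (not subfolders) in folder."""
    return sorted(
        name for name in os.listdir(folder)
        if os.path.isfile(os.path.join(folder, name))
    )

# Assumption: the desktop is a folder named "Desktop" in the home folder
desktop = os.path.join(os.path.expanduser("~"), "Desktop")
if os.path.isdir(desktop):
    for name in list_file_names(desktop):
        print(name)
```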
Endswith
This code shows how to check if a string ends with a specific phrase.
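A minimal sketch of `str.endswith`. The file names and variable names are illustrative and not from the slides.

```python
file_name = "report.txt"   # hypothetical file name

is_txt = file_name.endswith(".txt")              # True if the string ends with ".txt"
is_data = file_name.endswith((".csv", ".json"))  # a tuple checks several endings at once

# Typical use: keep only the .txt files from a list of names
names = ["a.txt", "b.csv", "NOTES.TXT"]
txt_files = [n for n in names if n.lower().endswith(".txt")]

print(is_txt, is_data, txt_files)
```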
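This code shows how to import a txt file into a list. You can create a .txt file on your desktop and test how the code works.
path = "c://users//lgao5//desktop//test.txt"
file = open(path,'r')
lines = file.read().splitlines()   # one list element per line of the file
file.close()
print(lines)
This code shows how to write the elements of a list to a txt file, one element per line.
# Assumption: the open() call for fp is not shown on the slide;
# the output path and the append mode ('a') below are placeholders.
out_path = "c://users//lgao5//desktop//output.txt"
fp = open(out_path,'a')
lst = [1,2,'f']
for l in lst:
    l = str(l).replace("\n"," ")   # keep each element on a single line
    l = str(l).replace("\r"," ")
    fp.write("\n"+l)
fp.close()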
Exercises
1. Write a Python function to return the length of a given string.
2. Write a Python function to return the area of a circle given its
radius.
3. Write a Python function to return the difference between a given
number and 20; if the difference is less than 0, return 0.
4. Write a Python function to concatenate all elements (except the
last element) of a given list into a string and return the string.
5. Write a Python function to return the area of the triangle
determined by three given points (x1, y1), (x2, y2), and (x3,y3).