Module 5-Networked Programs
Networked Programs
ARUN K H,ISE, AIT
IP Address
An Internet Protocol address (IP address) is a unique numerical
label assigned to each device connected to a computer network that
uses the Internet Protocol for communication. An IPv4 address is
written as four numbers separated by full stops, for example
192.168.1.10.
Public and Private IP Address
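The difference between private and public addresses can be checked with Python's standard ipaddress module. This is a minimal sketch; the three sample addresses and the helper name classify are chosen here for illustration only.

```python
import ipaddress

def classify(text):
    """Return 'private' or 'public' for an IP address given as a string."""
    addr = ipaddress.ip_address(text)
    return "private" if addr.is_private else "public"

# Sample addresses (illustrative assumptions, not taken from the slides)
for text in ["192.168.1.10", "10.0.0.5", "8.8.8.8"]:
    print(text, "is", classify(text))
```

Addresses in ranges such as 192.168.x.x and 10.x.x.x are reserved for private networks, which is why the first two are reported as private.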
• Python provides two levels of access to network services. At a
low level, you can access the basic socket support in the
underlying operating system, which allows you to implement
clients and servers for both connection-oriented and
connectionless protocols. At a higher level, libraries such as
urllib handle application-level protocols such as HTTP for you.
Why use Sockets?
What are Sockets?
• Sockets are the endpoints of a bidirectional communications channel.
Sockets may communicate within a process, between processes on the
same machine, or between processes on different continents.
• A single network connection has two sockets, one for each
communicating device or program.
• A single device can have any number of sockets, distinguished by the
port number that each one uses.
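The idea of two endpoints of one bidirectional channel can be sketched with socket.socketpair() from the standard library, which returns two sockets that are already connected to each other inside a single process. The variable names below are illustrative.

```python
import socket

# socketpair() returns two connected socket objects: the two endpoints
# of one bidirectional channel
left, right = socket.socketpair()

left.sendall(b"ping")        # data written to one endpoint ...
request = right.recv(1024)   # ... is read from the other

right.sendall(b"pong")       # and the channel works in both directions
reply = left.recv(1024)

print(request, reply)
left.close()
right.close()
```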
List of some important socket methods in Python Network/Internet
programming.
Method            Description
socket.socket()   used to create sockets (required on both the server
                  and the client ends)
socket.accept()   used to accept a connection. It returns a pair of
                  values (conn, address), where conn is a new socket
                  object for sending or receiving data and address is
                  the address of the socket at the other end of the
                  connection
socket.bind()     used to bind the socket to the address that is
                  specified as a parameter
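Server
import socket
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.bind((socket.gethostname(), 1234))  # same host and port the client connects to
s.listen(5)                           # allow up to 5 pending connections
while True:
    clt, adr = s.accept()
    print(f"Connection to {adr} established")
    # An f-string is a string literal prefixed with f that
    # contains Python expressions inside braces
    # Send information to the client socket
    clt.send(bytes("Socket Programming in Python", "utf-8"))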
Client
import socket
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect((socket.gethostname(), 1234))
msg = s.recv(1024)
print(msg.decode("utf-8"))
NOTE: gethostname() is used when the client and server are on the same
computer. Otherwise, connect to the server's local IP address (on a
LAN) or its public IP address (over a WAN).
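The server and client above normally run as two separate programs. As a rough single-file sketch of the same exchange, the server side can run in a background thread. The loopback address 127.0.0.1, port 0 (which asks the operating system for any free port) and the helper name serve_once are assumptions made for this sketch.

```python
import socket
import threading

def serve_once(server_sock):
    """Accept one client, send a greeting, then close that connection."""
    conn, addr = server_sock.accept()
    with conn:
        conn.sendall(b"Socket Programming in Python")

# Server side: create, bind and listen (port 0 = any free port, an assumption)
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)
host, port = server.getsockname()

worker = threading.Thread(target=serve_once, args=(server,))
worker.start()

# Client side: connect and read until the server closes the connection
client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect((host, port))
chunks = []
while True:
    data = client.recv(1024)
    if not data:  # empty bytes means the other end has closed
        break
    chunks.append(data)
client.close()
worker.join()
server.close()

message = b"".join(chunks).decode("utf-8")
print(message)
```

Reading in a loop until recv() returns empty bytes avoids assuming that the whole message arrives in one call.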
The World’s Simplest Web Browser
Perhaps the easiest way to show how the HTTP protocol works is to write a very
simple Python program that makes a connection to a web server and follows the rules
of the HTTP protocol to request a document and display what the server sends back.
import socket
mysock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
mysock.connect(('data.pr4e.org', 80))
cmd = 'GET http://data.pr4e.org/romeo.txt HTTP/1.0\r\n\r\n'.encode()
mysock.send(cmd)
while True:
    data = mysock.recv(20)
    if len(data) < 1:
        break
    print(data.decode(), end='')
mysock.close()
Retrieving an image over HTTP
import socket
import time
HOST = 'data.pr4e.org'
PORT = 80
mysock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
mysock.connect((HOST, PORT))
mysock.sendall(b'GET http://data.pr4e.org/cover3.jpg HTTP/1.0\r\n\r\n')
count = 0
picture = b""
while True:
    data = mysock.recv(5120)
    if len(data) < 1:
        break
    time.sleep(0.25)
    count = count + len(data)
    print(len(data), count)
    picture = picture + data
mysock.close()
# Look for the end of the header (2 CRLF)
pos = picture.find(b"\r\n\r\n")
print('Header length', pos)
print(picture[:pos].decode())
# Skip past the header and save the picture data
picture = picture[pos+4:]
fhand = open("stuff.jpg", "wb")
fhand.write(picture)
fhand.close()
Retrieving web pages with urllib
• While we can manually send and receive data over HTTP using the
socket library, there is a much simpler way to perform this common
task in Python by using the urllib library.
• Using urllib, you can treat a web page much like a file. You simply
indicate which web page you would like to retrieve and urllib handles
all of the HTTP protocol and header details.
import urllib.request
fhand = urllib.request.urlopen('http://data.pr4e.org/romeo.txt')
for line in fhand:
    print(line.decode().strip())
Compute the frequency of each word in the file romeo.txt
import urllib.request
fhand = urllib.request.urlopen('http://data.pr4e.org/romeo.txt')
counts = dict()
for line in fhand:
    words = line.decode().split()
    for word in words:
        counts[word] = counts.get(word, 0) + 1
print(counts)
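Once the counts dictionary exists, the most frequent words can be listed by sorting its items. The sketch below works offline: it hard-codes two lines of text (assumed here to resemble the start of romeo.txt) instead of reading from the network.

```python
# Two hard-coded lines stand in for the downloaded file (an assumption)
text = """But soft what light through yonder window breaks
It is the east and Juliet is the sun"""

counts = dict()
for line in text.splitlines():
    for word in line.split():
        counts[word] = counts.get(word, 0) + 1

# Sort by descending count, then alphabetically to break ties
ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
for word, n in ranked[:3]:
    print(word, n)
```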
Parsing HTML using regular expressions
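A minimal sketch of this approach: search the page text for href="..." patterns and capture the URL inside the quotes. The HTML string below is a made-up sample so the sketch runs without a network connection. A regular expression like this is brittle and will miss links written in other styles.

```python
import re

# Made-up sample page (an assumption for illustration)
html = '''<h1>The First Page</h1>
<p>If you like, you can switch to the
<a href="http://www.dr-chuck.com/page2.htm">Second Page</a>.</p>
<p><a href="https://www.python.org/">Python</a></p>'''

# Capture everything between href=" and the next double quote,
# as long as it starts with http:// or https://
links = re.findall(r'href="(https?://.*?)"', html)
for link in links:
    print(link)
```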
What is Web Scraping?
• Web Scraping is the technique of automatically extracting data from
websites using software/script.
• Python is among the most popular languages for web scraping. It is
a general-purpose language and handles most web crawling related
tasks smoothly.
• Scrapy and Beautiful Soup are among the widely used Python-based
frameworks that make scraping in this language straightforward.
Some Python libraries for web scraping:
• Beautiful Soup
• Scrapy
• Requests
• LXML
• Selenium
Parsing HTML and scraping the web
• Using this technique, Google spiders its way through nearly all of the
pages on the web.
Parsing HTML using BeautifulSoup
• Beautiful Soup tolerates highly flawed HTML and still lets you easily
extract the data you need.
• We will use urllib to read the page and then use Beautiful Soup to
extract the href attributes from the anchor (a) tags.
import urllib.request
from bs4 import BeautifulSoup
import ssl
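The extraction step described above (pulling href attributes out of anchor tags) can be sketched without third-party packages. Note that this uses the standard library's html.parser module in place of Beautiful Soup, so it only illustrates the idea and is not the Beautiful Soup API. The class name LinkCollector and the sample HTML are assumptions.

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collect the href attribute of every anchor (a) tag."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href":
                    self.links.append(value)

# Deliberately sloppy sample HTML: the paragraph is never closed
html = ('<p>Unclosed paragraph '
        '<a href="http://www.dr-chuck.com/page2.htm">Second Page</a>'
        '<br><a name="x">anchor without href</a>')

collector = LinkCollector()
collector.feed(html)
print(collector.links)
```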
Sometimes you want to retrieve a non-text (or binary) file such as an image or
video file. The data in these files is generally not useful to print out, but you
can easily make a copy of a URL to a local file on your hard disk using urllib.
import urllib.request
img = urllib.request.urlopen('http://data.pr4e.org/cover3.jpg').read()
fhand = open('cover3.jpg', 'wb')
fhand.write(img)
fhand.close()
HTML
• HTML stands for Hyper Text Markup Language
HTML (cont..)
• HTML elements tell the browser how to display the
content
A Simple HTML Document
<!DOCTYPE html>
<html>
<head>
<title>Page Title</title>
</head>
<body>
</body>
</html>
• The <!DOCTYPE html> declaration defines this document
to be HTML5
• The first tag in a pair is the start tag, the second tag is
the end tag
• The end tag is written like the start tag, but with
a forward slash inserted before the tag name
XML
• XML stands for eXtensible Markup Language.
Example
<?xml version="1.0" encoding="UTF-8"?>
<Wishes>
  <to>Vinu</to>
  <from>Kavitha</from>
  <heading>Happy Birthday</heading>
</Wishes>
A second example, describing a person:
<person>
  <name>Chuck</name>
  <phone type="intl">
    +1 734 303 4456
  </phone>
  <email hide="yes"/>
</person>
Tree representation of XML
XML Does Not Use Predefined Tags
• The tags in the example above (like <to> and <from>) are not
defined in any XML standard.
• HTML works with predefined tags like <p>, <h1>, <table>, etc.
• With XML, the author must define both the tags and the
document structure.
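Because the tags are author-defined, a parser does not need to know them in advance. As a small sketch, the standard xml.etree.ElementTree module can read the Wishes example and report whatever tags it finds. The closing </Wishes> tag is assumed here so that the document is well-formed.

```python
import xml.etree.ElementTree as ET

data = '''<Wishes>
  <to>Vinu</to>
  <from>Kavitha</from>
  <heading>Happy Birthday</heading>
</Wishes>'''

root = ET.fromstring(data)

# The parser discovers the author-defined tags from the document itself
tags = [child.tag for child in root]
for child in root:
    print(child.tag, '->', child.text)
```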
XML is Extensible
XML Simplifies Things
Parsing XML
Here is a simple application that parses some XML and extracts some data elements from the
XML:
import xml.etree.ElementTree as ET
data = '''
<person>
  <name>Chuck</name>
  <phone type="intl">
    +1 734 303 4456
  </phone>
  <email hide="yes"/>
</person>'''
tree = ET.fromstring(data)  # converts the string in data into a tree of XML nodes
print('Name:', tree.find('name').text)  # search the tree for the node whose tag is 'name'
print('Attr:', tree.find('email').get('hide'))  # read the 'hide' attribute of the email node
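Looping through nodes
import xml.etree.ElementTree as ET
input = '''
<stuff>
  <users>
    <user x="2">
      <id>001</id>
      <name>Chuck</name>
    </user>
    <user x="7">
      <id>009</id>
      <name>Brent</name>
    </user>
  </users>
</stuff>'''
stuff = ET.fromstring(input)
# list of subtrees that represent the user structures in the XML tree
lst = stuff.findall('users/user')
print('User count:', len(lst))
# visit each user node and print its child elements and its x attribute
for item in lst:
    print('Name', item.find('name').text)
    print('Id', item.find('id').text)
    print('Attribute', item.get('x'))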
Is there an even simpler format?
JSON
JSON Syntax Rules
JSON Data - A Name and a Value
JSON Objects
{"firstName":"John", "lastName":"Doe"}
JSON encoding that is roughly equivalent to the XML
example seen earlier:
{
"name" : "Chuck",
"phone" : {
"type" : "intl",
"number" : "+1 734 303 4456"
},
"email" : {
"hide" : "yes"
}
}
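A short sketch of reading this JSON in Python: json.loads() turns the text into nested dictionaries, so values are reached by indexing with the names. The variable names are illustrative.

```python
import json

data = '''{
  "name" : "Chuck",
  "phone" : {
    "type" : "intl",
    "number" : "+1 734 303 4456"
  },
  "email" : {
    "hide" : "yes"
  }
}'''

info = json.loads(data)   # JSON objects become Python dictionaries
print('Name:', info["name"])
print('Phone type:', info["phone"]["type"])
print('Hide:', info["email"]["hide"])
```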
Parse JSON - Convert from JSON to Python
Example:
import json
# some JSON:
x = '{ "name":"John", "age":30, "city":"New York"}'
# parse x:
y = json.loads(x)
# the result is a Python dictionary:
print(y["age"])  # output: 30
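JSON arrays are written inside square brackets. Here the name
"employees" holds an array of three objects:
"employees":[
  {"firstName":"John", "lastName":"Doe"},
  {"firstName":"Anna", "lastName":"Smith"},
  {"firstName":"Peter", "lastName":"Jones"}
]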
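In Python, a JSON array becomes a list, so the employees can be processed with an ordinary for loop. In this sketch the fragment above is wrapped in braces (an assumption) so that it forms a complete JSON document.

```python
import json

# The fragment above, wrapped in { } to make a complete JSON object
text = '''{"employees":[
  {"firstName":"John", "lastName":"Doe"},
  {"firstName":"Anna", "lastName":"Smith"},
  {"firstName":"Peter", "lastName":"Jones"}
]}'''

doc = json.loads(text)

# doc["employees"] is a Python list of dictionaries
names = []
for person in doc["employees"]:
    names.append(person["firstName"] + " " + person["lastName"])
print(names)
```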
Example of an SOA
Advantages of SOA
• The owners of the data can set the rules about the use of
their data
The following is a simple application to prompt the user for a search string, call
the Google geocoding API, and extract information from the returned JSON.
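The two main steps of such an application, building the request URL and extracting fields from the returned JSON, can be sketched offline as follows. Everything specific here is an assumption for illustration: the helper names, the placeholder key YOUR_API_KEY, and SAMPLE_RESPONSE, which is a hand-written, heavily trimmed stand-in for a real reply. No network request is made.

```python
import json
import urllib.parse

SERVICE_URL = "https://maps.googleapis.com/maps/api/geocode/json"

def build_url(address, api_key):
    """Encode the search string and key as URL query parameters."""
    params = {"address": address, "key": api_key}
    return SERVICE_URL + "?" + urllib.parse.urlencode(params)

# Hand-written sample in the general shape of a geocoding reply (assumption)
SAMPLE_RESPONSE = '''{
  "status": "OK",
  "results": [
    {"formatted_address": "Ann Arbor, MI, USA",
     "geometry": {"location": {"lat": 42.2808256, "lng": -83.7430378}}}
  ]
}'''

def extract_location(text):
    """Return (address, lat, lng) from the JSON text, or None on failure."""
    js = json.loads(text)
    if js.get("status") != "OK" or not js.get("results"):
        return None
    first = js["results"][0]
    loc = first["geometry"]["location"]
    return first["formatted_address"], loc["lat"], loc["lng"]

url = build_url("Ann Arbor, MI", "YOUR_API_KEY")
print(url)
print(extract_location(SAMPLE_RESPONSE))
```

In a real program the URL would be passed to urllib.request.urlopen() and the response text given to extract_location().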
Security and API usage
• It is common that we need some kind of “API key” to make
use of a vendor’s API.
• A few years ago, before OAuth, a user was typically
authorized on a website by assigning a unique ID and a
user-chosen password. Nowadays, alongside the usual
sign-up form, you often also see a dialog box offering to
sign in with an existing account.
• Nowadays, you can sign up and log in on third-party sites
using your existing accounts on Facebook, Twitter, GitHub,
etc.
OAuth Protocol
• Officially, it is described as: “OAuth is an authorization
framework that enables a third-party application to obtain a
limited access to an HTTP service.”
Twitter API
• Twitter moved from an “open and public API” to an API that
required the use of “OAuth signatures” on each API request.
def oauth():
    return {"consumer_key": "h7Lu...Ng",
            "consumer_secret": "dNKenAC3New...mmn7Q",
            "token_key": "10185562-eibxCp9n2...P4GEQQOSGI",
            "token_secret": "H0ycCFemmC4wyf1...qoIpBo"}
The Twitter web service is accessed using a URL like this:
https://api.twitter.com/1.1/statuses/user_timeline.json
But once all of the security information has been added, the
URL will look more like:
https://api.twitter.com/1.1/statuses/user_timeline.json?count=2&oauth_version=1.0&oauth_token=101...SGI&screen_name=drchuck&oauth_nonce=09239679&oauth_timestamp=1380395644&oauth_signature=rLK...BoD&oauth_consumer_key=h7Lu...GNg&oauth_signature_method=HMAC-SHA1
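The program below retrieves the timeline for a particular Twitter user and
returns it to us in JSON format in a string. We simply print the first 250
characters of the string:
import urllib.request
import ssl
import twurl  # assumption: the course helper module that signs URLs with the keys in hidden.py
# Create App and get the four strings, put them in hidden.py
TWITTER_URL = 'https://api.twitter.com/1.1/statuses/user_timeline.json'
ctx = ssl.create_default_context()
acct = input('Enter Twitter Account: ')
# add the request parameters and the OAuth signature to the URL
url = twurl.augment(TWITTER_URL, {'screen_name': acct, 'count': '2'})
print('Retrieving', url)
connection = urllib.request.urlopen(url, context=ctx)
data = connection.read().decode()
print(data[:250])
headers = dict(connection.getheaders())
# print the number of requests still allowed in this rate-limit window
print('Remaining', headers['x-rate-limit-remaining'])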
In the following example, we retrieve a user’s Twitter friends, parse the returned
JSON, and extract some of the information about the friends.
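The parsing step can be sketched offline as follows. SAMPLE below is a hypothetical, heavily trimmed response written for this sketch (the screen names and status text are invented); it assumes the reply holds a "users" list in which each user has a "screen_name" and, optionally, a "status" with a "text" field.

```python
import json

# Hypothetical trimmed response; not real account data
SAMPLE = '''{"users": [
  {"screen_name": "friend_one", "status": {"text": "First sample status"}},
  {"screen_name": "friend_two"}
]}'''

js = json.loads(SAMPLE)

summary = []
for user in js["users"]:
    status = user.get("status")   # some users may have no status
    text = status["text"][:50] if status else "(no status found)"
    summary.append((user["screen_name"], text))
    print(user["screen_name"], "-", text)
```

Using get() for the optional status avoids a KeyError when a friend has never posted.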