Web Caches, CDNs, and P2Ps

The document discusses web caching, content distribution networks, and peer-to-peer file sharing networks. Web caches store content locally to reduce response times and network traffic. Content distribution networks replicate content across many servers to improve performance. Peer-to-peer networks allow users to share files directly without a central server.

Web caches (proxy server)

Goal: satisfy client request without involving origin server
user sets browser: Web accesses via cache
browser sends all HTTP requests to cache
  object in cache: cache returns object
  else cache requests object from origin server, then returns object to client
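The request flow on this slide can be sketched as a small lookup routine. This is only an illustration: `WebCache`, `fake_origin`, and the in-memory dictionary are hypothetical names chosen for the example, not part of any real proxy server.

```python
from typing import Callable, Dict


class WebCache:
    """Toy proxy cache: serve from the local store, else ask the origin."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        self._store: Dict[str, bytes] = {}
        self._fetch = fetch_from_origin  # hypothetical origin-server call
        self.hits = 0
        self.misses = 0

    def get(self, url: str) -> bytes:
        if url in self._store:
            # Object in cache: return it without involving the origin server.
            self.hits += 1
            return self._store[url]
        # Otherwise request the object from the origin, keep a copy, return it.
        self.misses += 1
        obj = self._fetch(url)
        self._store[url] = obj
        return obj


def fake_origin(url: str) -> bytes:
    """Stand-in for the origin server (assumption for this sketch)."""
    return ("body of " + url).encode()


cache = WebCache(fake_origin)
first = cache.get("http://www.foo.com/index.html")   # miss, goes to origin
second = cache.get("http://www.foo.com/index.html")  # hit, served locally
```

The sketch never checks freshness; a real cache would also have to decide when a stored copy is too old to return.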

Applications (part 3)

More about Web caching

Cache acts as both client and server
Cache can do up-to-date check using If-modified-since HTTP header
  Issue: should cache take risk and deliver cached object without checking?
  Heuristics are used.
Typically cache is installed by ISP (university, company, residential ISP)

Why Web caching?
Reduce response time for client request.
Reduce traffic on an institution's access link.
Internet dense with caches enables poor content providers to effectively deliver content

Caching example (1)

Assumptions
  average object size = 100,000 bits
  avg. request rate from institution's browsers to origin servers = 15/sec
  delay from institutional router to any origin server and back to router = 2 sec
Consequences
  utilization on LAN = 15%
  utilization on access link = 100%
  total delay = Internet delay + access delay + LAN delay
    = 2 sec + minutes + milliseconds
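The utilization figures in caching example (1) follow from dividing the offered traffic by the link rate. The sketch below reproduces that arithmetic. The two link rates are assumptions: the slide text does not state them, and 10 Mbps (LAN) and 1.5 Mbps (access link) are simply the values that yield the stated 15% and 100%.

```python
OBJECT_SIZE_BITS = 100_000      # from the slide
REQUESTS_PER_SEC = 15           # from the slide
LAN_RATE_BPS = 10_000_000       # assumption: not stated on the slide
ACCESS_RATE_BPS = 1_500_000     # assumption: not stated on the slide


def utilization(object_bits: int, req_per_sec: float, link_bps: float) -> float:
    """Traffic intensity: offered bits per second divided by the link rate."""
    return object_bits * req_per_sec / link_bps


lan_util = utilization(OBJECT_SIZE_BITS, REQUESTS_PER_SEC, LAN_RATE_BPS)
access_util = utilization(OBJECT_SIZE_BITS, REQUESTS_PER_SEC, ACCESS_RATE_BPS)
print(f"LAN utilization: {lan_util:.0%}")
print(f"access link utilization: {access_util:.0%}")
```

When utilization approaches 100%, queueing delay on the link grows without bound, which is why the access delay on the slide is given in minutes.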


Caching example (2)

Possible solution
  increase bandwidth of access link to, say, 10 Mbps
Consequences
  utilization on LAN = 15%
  utilization on access link = 15%
  Total delay = Internet delay + access delay + LAN delay
    = 2 sec + msecs + msecs
  often a costly upgrade

Caching example (3)

Install cache
  suppose hit rate is .4
Consequence
  40% requests will be satisfied almost immediately
  60% requests satisfied by origin server
  utilization of access link reduced to 60%, resulting in negligible delays (say 10 msec)
  total delay = Internet delay + access delay + LAN delay
    = .6*2 sec + .6*.01 secs + milliseconds < 1.3 secs
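The delay estimate for the cache scenario (hit rate .4) can be checked with a few lines of arithmetic. This is a sketch under one stated assumption: the slide only says "milliseconds" for the LAN delay, so `LAN_DELAY_S` below is a placeholder value of 5 msec.

```python
HIT_RATE = 0.4                # from the slide
INTERNET_DELAY_S = 2.0        # from the slide
ACCESS_DELAY_S = 0.01         # "say 10 msec" on the slide
LAN_DELAY_S = 0.005           # assumption: stands in for "milliseconds"


def average_delay(hit_rate: float) -> float:
    """Average delay per request for a given cache hit rate."""
    miss_rate = 1.0 - hit_rate
    # Only misses pay the Internet and access-link delay;
    # every request crosses the LAN.
    return miss_rate * INTERNET_DELAY_S + miss_rate * ACCESS_DELAY_S + LAN_DELAY_S


with_cache = average_delay(HIT_RATE)
print(f"average delay with cache: {with_cache:.3f} s")
```

With these numbers the result stays below the 1.3 sec bound on the slide, and it is also lower than the roughly 2 sec obtained by upgrading the access link alone.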


Content distribution networks (CDNs)

The content providers are the CDN customers.

Content replication
CDN company installs hundreds of CDN servers throughout Internet
  in lower-tier ISPs, close to users
CDN replicates its customers' content in CDN servers. When provider updates content, CDN updates servers

CDN example

Origin server
  www.foo.com
  distributes HTML
  replaces:
    http://www.foo.com/sports/ruth.gif
  with
    http://www.cdn.com/www.foo.com/sports/ruth.gif
CDN company
  cdn.com
  distributes gif files
  uses its authoritative DNS server to route redirect requests
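The URL replacement in the CDN example can be sketched as a small rewriting function. The function name `to_cdn_url` is hypothetical; the host names are the ones used on the slide, and the rewriting rule (origin host becomes the first path component) is inferred from that single example.

```python
from urllib.parse import urlsplit

CDN_HOST = "www.cdn.com"  # CDN host name taken from the slide's example


def to_cdn_url(origin_url: str) -> str:
    """Rewrite an origin URL so the object is fetched via the CDN host."""
    parts = urlsplit(origin_url)
    # Keep the origin host as the first path component, as in the slide.
    return f"{parts.scheme}://{CDN_HOST}/{parts.netloc}{parts.path}"


rewritten = to_cdn_url("http://www.foo.com/sports/ruth.gif")
print(rewritten)
```

The origin server would apply such a rewrite to the object references in the HTML it serves, so that browsers resolve the CDN host name and fetch the gif files from a CDN server.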


More about CDNs

routing requests
  CDN creates a map, indicating distances from leaf ISPs and CDN nodes
  when query arrives at authoritative DNS server:
    server determines ISP from which query originates
    uses map to determine best CDN server
not just Web pages
  streaming stored audio/video
  streaming real-time audio/video
  CDN nodes create application-layer overlay network

P2P file sharing

Example
Alice runs P2P client application on her notebook computer
Intermittently connects to Internet; gets new IP address for each connection
Asks for file ABC
Application displays other peers that have copy of ABC.
Alice chooses one of the peers, Bob.
File is copied from Bob's PC to Alice's notebook: HTTP
While Alice downloads, other users uploading from Alice.
Alice's peer is both a Web client and a transient Web server.
All peers are servers = highly scalable!
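The CDN request-routing step described under "More about CDNs" (determine the querying ISP, then consult the map) can be sketched as a lookup over a distance table. Everything in the table is made up for illustration: the ISP names, server names, and distance values are hypothetical, and a real CDN would derive them from measurements.

```python
from typing import Dict

# Hypothetical distance map: ISP name -> {CDN server name -> distance}.
DISTANCE_MAP: Dict[str, Dict[str, int]] = {
    "isp-east": {"cdn-nyc": 2, "cdn-sfo": 9},
    "isp-west": {"cdn-nyc": 8, "cdn-sfo": 1},
}


def best_cdn_server(isp: str) -> str:
    """Return the CDN server with the smallest recorded distance to the ISP."""
    distances = DISTANCE_MAP[isp]
    return min(distances, key=distances.get)


east_choice = best_cdn_server("isp-east")
west_choice = best_cdn_server("isp-west")
```

In the scheme on the slide, the authoritative DNS server would return the address of the chosen server in its reply.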


P2P: centralized directory

original Napster design
1) when peer connects, it informs central server:
  IP address
  content
2) Alice queries for ABC
3) Alice requests file from Bob

P2P: problems with centralized directory

Single point of failure
Performance bottleneck
Copyright infringement

file transfer is decentralized, but locating content is highly centralized
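The registration and query steps of the centralized design can be sketched with a single in-memory index. This is an illustrative model only: `CentralDirectory` and the IP addresses are hypothetical, and the sketch says nothing about how Napster itself was implemented.

```python
from collections import defaultdict
from typing import DefaultDict, List, Set


class CentralDirectory:
    """Toy central server mapping file names to the peers that hold them."""

    def __init__(self) -> None:
        self._holders: DefaultDict[str, Set[str]] = defaultdict(set)

    def register(self, ip_address: str, files: List[str]) -> None:
        # Step 1: a connecting peer reports its IP address and content.
        for name in files:
            self._holders[name].add(ip_address)

    def query(self, name: str) -> List[str]:
        # Step 2: return the addresses of peers that hold the file.
        return sorted(self._holders.get(name, set()))


directory = CentralDirectory()
directory.register("10.0.0.7", ["ABC", "XYZ"])   # Bob (example address)
directory.register("10.0.0.9", ["ABC"])          # another peer
peers_with_abc = directory.query("ABC")          # Alice's query
```

Step 3, the file transfer, happens directly between the peers and never touches the directory, which is why only the lookup is a single point of failure.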


P2P: decentralized directory

Each peer is either a group leader or assigned to a group leader.
Group leader tracks the content in all its children.
Peer queries group leader; group leader may query other group leaders.

More about decentralized directory

overlay network
  peers are nodes
  edges between peers and their group leaders
  edges between some pairs of group leaders
  virtual neighbors
bootstrap node
  connecting peer is either assigned to a group leader or designated as leader
advantages of approach
  no centralized directory server
    location service distributed over peers
    more difficult to shut down
disadvantages of approach
  bootstrap node needed
  group leaders can get overloaded
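The two-level lookup (peer asks its group leader, leader may ask other leaders) can be sketched as follows. The class `GroupLeader`, its fields, and the peer names are hypothetical, and the sketch forwards a query only one leader hop, which is a simplifying assumption.

```python
from typing import Dict, List, Optional, Set


class GroupLeader:
    """Toy group leader that indexes the content of its child peers."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.index: Dict[str, Set[str]] = {}       # file name -> child peers
        self.neighbors: List["GroupLeader"] = []   # edges to other leaders

    def add_child(self, peer: str, files: List[str]) -> None:
        for f in files:
            self.index.setdefault(f, set()).add(peer)

    def lookup(self, file_name: str) -> Optional[str]:
        """Check own children first, then ask neighboring leaders once."""
        if file_name in self.index:
            return min(self.index[file_name])
        for leader in self.neighbors:
            if file_name in leader.index:
                return min(leader.index[file_name])
        return None


east = GroupLeader("east")
west = GroupLeader("west")
east.neighbors.append(west)
west.add_child("bob", ["ABC"])
east.add_child("alice", [])
holder = east.lookup("ABC")   # Alice's leader has to ask the other leader
```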


P2P: Query flooding

Gnutella
no hierarchy
use bootstrap node to learn about others
join message

Send query to neighbors
Neighbors forward query
If queried peer has object, it sends message back to querying peer

P2P: more on query flooding

Pros
  peers have similar responsibilities: no group leaders
  highly decentralized
  no peer maintains directory info
Cons
  excessive query traffic
  query radius: query may not find content even when it is present in the network
  bootstrap node
  maintenance of overlay network
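Query flooding with a limited query radius can be sketched as a breadth-first traversal of the overlay. This is a simplified model, not the Gnutella protocol itself: the overlay, the peer names, and the function `flood_query` are hypothetical, and replies are collected directly instead of being sent back along the query path.

```python
from collections import deque
from typing import Dict, List, Set


def flood_query(
    neighbors: Dict[str, List[str]],
    holdings: Dict[str, Set[str]],
    start: str,
    wanted: str,
    radius: int,
) -> List[str]:
    """Breadth-first flood from `start`, forwarding at most `radius` hops."""
    found: List[str] = []
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        peer, hops = queue.popleft()
        if peer != start and wanted in holdings.get(peer, set()):
            found.append(peer)
        if hops == radius:
            continue  # radius exhausted: do not forward further
        for nxt in neighbors.get(peer, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, hops + 1))
    return found


# Hypothetical overlay: alice - p1 - p2 - bob, and only bob holds ABC.
overlay = {"alice": ["p1"], "p1": ["alice", "p2"], "p2": ["p1", "bob"], "bob": ["p2"]}
files = {"bob": {"ABC"}}
near = flood_query(overlay, files, "alice", "ABC", radius=2)
far = flood_query(overlay, files, "alice", "ABC", radius=3)
```

With a radius of 2 the query stops one hop short of Bob, which illustrates the "query radius" drawback: the content exists but is not found.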


Structured P2P networks

Traditional P2P file-sharing systems do not operate efficiently
  Spend too many messages on constructing and maintaining the overlay network
  Perform random global searches mostly by flooding the network
One advantage of the traditional scheme is that the documents can be placed anywhere and the document will be found if at least one of the machines holding a copy is up and reachable.

Structured P2P networks

Structured P2P networks (file sharing example) have two questions to consider:
  How do we map objects onto nodes?
  How do we route requests to the node that is responsible for the object?
For the first question, the simplest solution just uses hashing.
  hash(x) -> n
  where x is the object identifier and n is the node identifier onto which the object is placed.
What properties do we need from the hash function?
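The mapping hash(x) -> n can be sketched with a standard digest reduced modulo the number of nodes. The choices here are assumptions made for illustration: SHA-256 stands in for the unspecified hash function, `NUM_NODES` is an arbitrary fixed node count, and the object names are made up.

```python
import hashlib
from collections import Counter

NUM_NODES = 101  # assumption: a fixed, known number of nodes


def node_for(object_id: str, num_nodes: int = NUM_NODES) -> int:
    """hash(x) -> n: digest the identifier, then reduce modulo the node count."""
    digest = hashlib.sha256(object_id.encode()).digest()
    return int.from_bytes(digest, "big") % num_nodes


# Count how many of 10,000 made-up objects land on each node.
load = Counter(node_for(f"object-{i}") for i in range(10_000))
print("busiest node holds", max(load.values()), "objects")
print("quietest node holds", min(load.values()), "objects")
```

Two properties are visible in the sketch: the mapping is deterministic, so every peer computes the same node for the same object, and the load is spread roughly evenly across the nodes.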


Structured P2P networks

Problems with traditional hashing:
  When nodes join and leave, the hashing function will be affected
    hash(x) { return x % 101 }
  Need to know the exact number of hosts (in this example, 101)
To address these issues, structured P2P networks use consistent hashing.

Consistent hashing

Consistent hashing maps both objects and nodes onto a 128-bit ID space that is organized as a circle.
  Hash(object_name) -> objid
  Hash(IP_addr) -> nodeid
Because the ID space is very large, an objid and nodeid would not (most likely) coincide.
Select the node whose id is closest in this 128-bit space to the object id.
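The placement rule (pick the node whose ID is closest to the object ID on the circle) can be sketched as below. Several details are assumptions: a truncated SHA-256 digest stands in for the unspecified hash function, the IP addresses and file names are made up, and "closest" is taken as the shorter way around the ring.

```python
import hashlib
from typing import Dict, List

ID_BITS = 128
RING = 1 << ID_BITS


def make_id(name: str) -> int:
    """Map a name (object name or IP address) to a 128-bit ID."""
    # Assumption: truncated SHA-256 stands in for the unspecified hash.
    return int.from_bytes(hashlib.sha256(name.encode()).digest()[:16], "big")


def ring_distance(a: int, b: int) -> int:
    """Distance between two IDs, going the shorter way around the circle."""
    d = abs(a - b)
    return min(d, RING - d)


def closest_node(objid: int, node_ids: List[int]) -> int:
    return min(node_ids, key=lambda n: ring_distance(n, objid))


def placement(objects: List[str], node_ips: List[str]) -> Dict[str, int]:
    ids = [make_id(ip) for ip in node_ips]
    return {o: closest_node(make_id(o), ids) for o in objects}


objects = [f"file-{i}" for i in range(2000)]
nodes = [f"10.0.0.{i}" for i in range(1, 21)]
before = placement(objects, nodes)
after = placement(objects, nodes + ["10.0.0.99"])   # one node joins
moved = [o for o in objects if before[o] != after[o]]
new_id = make_id("10.0.0.99")
print(f"{len(moved)} of {len(objects)} objects moved")
```

Every object that changes owner moves to the newly joined node; all other assignments stay put, in contrast to `x % 101`, where changing the modulus reshuffles most objects.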


Consistent hashing

Like ordinary hashing, distributes objects evenly across the nodes. However, unlike ordinary hashing, only a small number of objects have to move when a node (hash bucket) leaves or joins.
How does a user who wants to access an object know which node holds the object?
  Each node keeps a complete table of node IDs and associated IP addresses; search the list for the closest node ID and access the node!
    Not practical for large networks (i.e., not scalable)
  Another approach: route the request to the appropriate node

Distributed hash tables

Suppose you are at node 65a1fc (hex) and trying to locate objid d46a1c
Your node ID does not share any prefix with the target object ID
You know a node that shares at least the prefix d; it is closer than you to this object
Ask this node to locate object d46a1c for you
Assuming node d13da3 knows another node with an even longer prefix, the message will be forwarded even further
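The hop-by-hop forwarding toward a longer shared prefix can be sketched as follows. Node IDs 65a1fc and d13da3 and object ID d46a1c come from the slide; the other two node IDs and the `known_nodes` table are made up for illustration, and the rule "forward to the known node with the longest shared prefix" is a simplification.

```python
from typing import Dict, List


def shared_prefix_len(a: str, b: str) -> int:
    """Number of leading hex digits that two IDs have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def route(start: str, objid: str, known: Dict[str, List[str]]) -> List[str]:
    """Forward hop by hop to a known node sharing a strictly longer prefix."""
    path = [start]
    current = start
    while True:
        here = shared_prefix_len(current, objid)
        better = [n for n in known.get(current, [])
                  if shared_prefix_len(n, objid) > here]
        if not better:
            return path  # no known node is closer: routing stops here
        current = max(better, key=lambda n: shared_prefix_len(n, objid))
        path.append(current)


# Hypothetical overlay: d4213f and d462ba are invented node IDs.
known_nodes = {
    "65a1fc": ["d13da3"],
    "d13da3": ["d4213f"],
    "d4213f": ["d462ba"],
}
hops = route("65a1fc", "d46a1c", known_nodes)
print(" -> ".join(hops))
```

Each hop matches at least one more digit of the object ID, so the number of hops is bounded by the number of digits in the ID.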


Distributed hash tables

As the message moves through the ID space, the actual message moves through the Internet
Each node maintains a leaf set
  these are nodes that are numerically closest to the node.
Leaf node peers with other leaf nodes within the same set of leaves. Suppose a leaf node is unable to do some operation because of some error condition; that work may be offloaded onto another leaf node

Distributed hash tables

Each node maintains a route table as shown here
  Routing table is a 2-D array. Has a row for each hex digit in the ID (32 rows for a 128-bit ID)
  Entry in row i shares a prefix of length i with this node; the entry in the j-th column has hex value j at the (i+1)-th position
Routing table at 65a1fcx
  x denotes an unspecified suffix
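The row and column rule for the routing table can be sketched by filling a 2-D array from a list of known node IDs. For brevity the sketch uses 6-digit IDs, so the table has 6 rows instead of the 32 rows of a 128-bit ID. Apart from 65a1fc and d13da3, the node IDs are made up, and keeping the first candidate found for each cell is an assumption.

```python
from typing import List, Optional

HEX = "0123456789abcdef"


def shared_prefix_len(a: str, b: str) -> int:
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def build_routing_table(node_id: str, known_ids: List[str]) -> List[List[Optional[str]]]:
    """table[i][j]: a known node sharing i leading digits with node_id and
    having hex digit j at 0-based position i (the (i+1)-th digit), or None."""
    table: List[List[Optional[str]]] = [[None] * 16 for _ in node_id]
    for other in known_ids:
        if other == node_id:
            continue
        i = shared_prefix_len(node_id, other)   # row: length of shared prefix
        j = HEX.index(other[i])                 # column: first differing digit
        if table[i][j] is None:
            table[i][j] = other                 # keep the first candidate
    return table


# IDs are lowercase hex strings of equal length (assumption of this sketch).
table = build_routing_table("65a1fc", ["d13da3", "6f0000", "65b000", "65a212"])
```

To route toward an object ID, a node looks at the row given by the prefix it already shares with that ID and the column given by the object's next digit.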


Distributed hash tables

Adding a node to overlay is much like routing a locate-object message
New node must at least know a member of the P2P network (preferably the closest one)
Learns about other nodes through the routing process and fills out its routing table.
Existing nodes also update their routing tables based on new arrivals
