Academic year: 2021

Internet Services

COMP5213: Computer and Network Organization

Instructors: Javid Taheri

Today: Internet Services

• Web Caching
• Domain Name System
• Content Distribution Network
• Web Services

The University of Sydney

Typical Workload (Web Pages)

• Multiple (typically small) objects per page
• File sizes
  - Heavy-tailed
  - Pareto distribution for tail
  - Lognormal for body of distribution

Web caches (proxy server)

Goal: satisfy client request without involving origin server

• User sets browser: Web accesses via cache
• Browser sends all HTTP requests to cache
  - object in cache: cache returns object
  - else cache requests object from origin server, then returns object to client

[Figure: clients send HTTP requests to the proxy server and receive HTTP responses; on a miss the proxy server sends an HTTP request to an origin server]
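The hit/miss behaviour of a proxy cache can be sketched in a few lines. This is only an illustration: the class name `ProxyCache`, the `fake_origin` stand-in, and the in-memory dictionary are assumptions made for the example, not part of any real proxy.

```python
from typing import Callable, Dict


class ProxyCache:
    """Toy web cache: serve from memory on a hit, ask the origin on a miss."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        self._fetch = fetch_from_origin      # hypothetical origin-fetch callback
        self._store: Dict[str, bytes] = {}   # URL -> cached object
        self.hits = 0
        self.misses = 0

    def get(self, url: str) -> bytes:
        if url in self._store:               # object in cache: return it
            self.hits += 1
            return self._store[url]
        self.misses += 1                     # else fetch from origin and keep a copy
        body = self._fetch(url)
        self._store[url] = body
        return body


def fake_origin(url: str) -> bytes:
    """Stand-in for a real origin server (no network access)."""
    return f"content of {url}".encode()


if __name__ == "__main__":
    cache = ProxyCache(fake_origin)
    cache.get("http://www.foo.com/index.html")   # miss: goes to the origin
    cache.get("http://www.foo.com/index.html")   # hit: served from the cache
    print(cache.hits, cache.misses)
```

A real cache would also bound its size and check freshness before returning a stored object.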

More about Web caching

• Cache acts as both client and server
• Cache can do up-to-date check using If-Modified-Since HTTP header
  - Issue: should cache take risk and deliver cached object without checking?
  - Heuristics are used.
• Typically cache is installed by ISP (university, company, residential ISP)

Why Web caching?

• Reduce response time for client request.
• Reduce traffic on an institution's access link.
• Internet dense with caches enables “poor” content providers to effectively deliver content
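The up-to-date check can be illustrated with the standard library's HTTP date helpers. This is a sketch under assumptions: the function names are made up for the example, and the origin side is reduced to a single timestamp comparison.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def conditional_get_headers(host: str, cached_at: datetime) -> dict:
    """Headers a cache could send to ask 'has this changed since my copy?'."""
    return {
        "Host": host,
        # usegmt=True needs a UTC datetime and yields an HTTP-style date
        "If-Modified-Since": format_datetime(cached_at, usegmt=True),
    }


def origin_status(if_modified_since: str, last_modified: datetime) -> int:
    """Origin side: 304 (Not Modified) if unchanged, else 200 with a fresh copy."""
    since = parsedate_to_datetime(if_modified_since)
    return 304 if last_modified <= since else 200
```

A 304 reply carries no body, so the check costs a round trip but not a full transfer.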

Caching example (1)

• Assumptions
  - average object size = 100,000 bits
  - avg. request rate from institution's browsers to origin servers = 15/sec
  - delay from institutional router to any origin server and back to router = 2 sec
• Consequences
  - utilization on LAN = 15%
  - utilization on access link = 100%
  - total delay = Internet delay + access delay + LAN delay = 2 sec + minutes + milliseconds

[Figure: origin servers on the public Internet; institutional network with a 10 Mbps LAN, a 1.5 Mbps access link, and an institutional cache]

Caching example (2)

• Possible solution
  - increase bandwidth of access link to, say, 10 Mbps
• Consequences
  - utilization on LAN = 15%
  - utilization on access link = 15%
  - total delay = Internet delay + access delay + LAN delay = 2 sec + msecs + msecs
  - often a costly upgrade

[Figure: origin servers on the public Internet; institutional network with a 10 Mbps LAN, a 10 Mbps access link, and an institutional cache]

Caching example (3)

• Install cache
  - suppose hit rate is 0.4
• Consequence
  - 40% requests will be satisfied almost immediately
  - 60% requests satisfied by origin server
  - utilization of access link reduced to 60%, resulting in negligible delays (say 10 msec)
  - total delay = Internet delay + access delay + LAN delay = 0.6 × 2 sec + 0.6 × 0.01 sec + milliseconds < 1.3 sec

[Figure: origin servers on the public Internet; institutional network with a 10 Mbps LAN, a 1.5 Mbps access link, and an institutional cache]
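The numbers in the three caching examples follow from simple arithmetic, reproduced in the sketch below. The constants come from the stated assumptions; the function names are invented for illustration, and the millisecond LAN delay is ignored.

```python
OBJECT_BITS = 100_000        # average object size (bits)
REQUESTS_PER_SEC = 15        # average request rate from the institution
INTERNET_DELAY = 2.0         # seconds, router -> origin server -> router


def utilization(link_bps: float, hit_rate: float = 0.0) -> float:
    """Fraction of a link consumed by the requests that miss the cache."""
    offered_bps = REQUESTS_PER_SEC * OBJECT_BITS * (1.0 - hit_rate)
    return offered_bps / link_bps


def avg_delay_with_cache(hit_rate: float, access_delay: float) -> float:
    """Average delay with a cache, counting only requests that go to the origin."""
    miss = 1.0 - hit_rate
    return miss * INTERNET_DELAY + miss * access_delay


if __name__ == "__main__":
    print(utilization(10e6))                 # LAN
    print(utilization(1.5e6))                # 1.5 Mbps access link, no cache
    print(utilization(1.5e6, hit_rate=0.4))  # same link with the cache
    print(avg_delay_with_cache(0.4, 0.01))   # seconds
```

The offered load is 15 × 100,000 = 1.5 Mbps, which is why the 1.5 Mbps access link saturates without a cache.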

Problems

• Over 50% of all HTTP objects are uncacheable. Why?
• Not easily solvable
  - Dynamic data → stock prices, scores, web cams
  - CGI scripts → results based on passed parameters
• Obvious fixes
  - SSL → encrypted data is not cacheable
  - Most web clients don't handle mixed pages well → many generic objects transferred with SSL
  - Cookies → results may be based on passed data
  - Hit metering → owner wants to measure # of hits for revenue, etc.

Caching Proxies – Sources for Misses

• Capacity
  - How large a cache is necessary or equivalent to infinite
  - On disk vs. in memory → typically on disk
• Compulsory
  - First time access to document
  - Non-cacheable documents
  - CGI-scripts
  - Personalized documents (cookies, etc.)
  - Encrypted data (SSL)
• Consistency
  - Document has been updated/expired before reuse
• Conflict
  - No such misses

Today: Internet Services

• Web Caching
• Domain Name System
• Content Distribution Network
• Web Services

DNS: Domain Name System

• People: many identifiers
  - name, passport #
• Internet hosts, routers:
  - IP address (32 bit): used for addressing datagrams
  - “name”, e.g., www.it.usyd.edu.au: used by humans
• Q: how to map between IP addresses and names?

Domain Name System:

• distributed database implemented in hierarchy of many name servers
• application-layer protocol: hosts, routers, and name servers communicate to resolve names (address/name translation)
  - note: core Internet function, implemented as application-layer protocol

Hierarchical Name Space

• Each node in hierarchy stores a list of names that end with same suffix
  - Suffix = path up tree
• E.g., given this tree, where would following be stored:
  - cnn.com
  - mit.edu
  - cpu0.cs.usyd.edu.au

[Figure: name tree. root → edu, net, org, au, com, cn; au → edu, gov, com, net; edu (under au) → unsw, usyd, uts, anu; usyd → cs, ee; cs → ug; ug → shade, i.e. shade.ug.cs.usyd.edu.au]
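"Suffix = path up tree" can be made concrete by listing the nodes a name passes through on the way down from the root. The helper name `suffix_path` is an assumption for this sketch.

```python
from typing import List


def suffix_path(name: str) -> List[str]:
    """Suffixes of a DNS name, from the node just below the root down to the name."""
    labels = name.rstrip(".").split(".")
    # Walk from the rightmost label (closest to the root) to the full name
    return [".".join(labels[i:]) for i in range(len(labels) - 1, -1, -1)]


if __name__ == "__main__":
    for node in suffix_path("cpu0.cs.usyd.edu.au"):
        print(node)
```

Each element of the list names one node of the tree; the name itself is stored at the last node's parent.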

Hierarchical Name Space (cont)

• Zone = contiguous section of name space
  - E.g., complete tree, single node or subtree
• A zone has an associated set of name servers
• Must store list of names and tree links

[Figure: the same name tree, with three example zones marked: a subtree, a single node, and the complete tree]

DNS Design

• Zones are created by convincing owner node to create/delegate a subzone
  - Records within zone stored on multiple redundant name servers
  - Primary/master name server updated manually
  - Secondary/redundant servers updated by zone transfer of name space
  - Zone transfer is a bulk transfer of the “configuration” of a DNS server; it uses TCP to ensure reliability
• Example:
  - CS.USYD.EDU.AU created by USYD.EDU.AU administrators
  - Who creates USYD.EDU.AU?

DNS name servers

• no server has all name-to-IP address mappings
• local name servers:
  - each ISP, company has local (default) name server
  - host DNS query first goes to local name server
• authoritative name server:
  - for a host: stores that host's IP address, name
  - can perform name/address translation for that host's name

Why not centralize DNS?

• single point of failure
• traffic volume
• distant centralized database
• maintenance

DNS: Root name servers

• contacted by local name server that cannot resolve name
• root name server:
  - contacts authoritative name server if name mapping not known
  - gets mapping
  - returns mapping to local name server

13 root name servers worldwide:

  - a NSI Herndon, VA
  - b USC-ISI Marina del Rey, CA
  - c PSInet Herndon, VA
  - d U Maryland College Park, MD
  - e NASA Mt View, CA
  - f Internet Software C. Palo Alto, CA
  - g DISA Vienna, VA
  - h ARL Aberdeen, MD
  - i NORDUnet Stockholm
  - j NSI (TBD) Herndon, VA
  - k RIPE London
  - l ICANN Marina del Rey, CA
  - m WIDE Tokyo

Typical Resolution

[Figure: the client asks the local DNS server for www.it.usyd.edu.au. The local server queries the root & au DNS server (reply: NS metro.ucc.usyd.edu.au), then the metro.ucc.usyd.edu.au DNS server (reply: NS crux.cs.usyd.edu.au), then the crux.cs.usyd.edu.au DNS server (reply: A www=IPaddr)]

Typical Resolution

• Steps for resolving www.it.usyd.edu.au
  - Application calls gethostbyname() (RESOLVER)
  - Resolver contacts local name server (S1)
  - S1 queries root server (S2) for (www.it.usyd.edu.au)
  - S2 returns NS record for usyd.edu.au (S3)
  - What about A record for S3?
    - This is what the additional information section is for (PREFETCHING)
  - S1 queries S3 for www.it.usyd.edu.au
  - S3 returns A record for www.it.usyd.edu.au
• Can return multiple A records → what does this mean?

Lookup Methods

• Recursive query:
  - Server goes out and searches for more info (recursive)
  - Only returns final answer or “not found”
• Iterative query:
  - Server responds with as much as it knows (iterative)
  - “I don't know this name, but ask this server”
• Workload impact on choice?
  - Local server typically does recursive
  - Root/distant server does iterative

[Figure: requesting host surf.eurecom.fr looks up shade.ug.cs.usyd.edu.au via local name server dns.eurecom.fr, the root name server, intermediate name server metro.ucc.usyd.edu.au, and authoritative name server crux.cs.usyd.edu.au; messages numbered 1 to 8]
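The split between a recursive local server and iterative remote servers can be simulated without any network traffic. Everything in the table below is an assumption for illustration: the delegation suffixes, the server contents, and the documentation address 192.0.2.10.

```python
from typing import Dict, List, Tuple

# Hypothetical server tables: each server either knows the A record
# or refers the asker to a server responsible for a matching suffix.
SERVERS: Dict[str, Dict[str, Dict[str, str]]] = {
    "root": {"A": {}, "NS": {"usyd.edu.au": "metro.ucc.usyd.edu.au"}},
    "metro.ucc.usyd.edu.au": {"A": {}, "NS": {"it.usyd.edu.au": "crux.cs.usyd.edu.au"}},
    "crux.cs.usyd.edu.au": {"A": {"www.it.usyd.edu.au": "192.0.2.10"}, "NS": {}},
}


def iterative_query(server: str, name: str) -> Tuple[str, str]:
    """One iterative step: ('A', address), ('NS', next server) or ('NXDOMAIN', '')."""
    table = SERVERS[server]
    if name in table["A"]:
        return "A", table["A"][name]
    for suffix, next_server in table["NS"].items():
        if name == suffix or name.endswith("." + suffix):
            return "NS", next_server          # "ask this server instead"
    return "NXDOMAIN", ""


def resolve(name: str, start: str = "root") -> Tuple[str, List[str]]:
    """What a local name server does for its client: follow referrals to the end."""
    server, asked = start, []
    while True:
        asked.append(server)
        kind, value = iterative_query(server, name)
        if kind == "A":
            return value, asked
        if kind == "NXDOMAIN":
            return "", asked
        server = value
```

Here `resolve` plays the recursive role (the client sees only the final answer), while each call to `iterative_query` models a remote server that answers with what it knows.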

DNS: caching and updating records

• once (any) name server learns mapping, it caches mapping
  - cache entries timeout (disappear) after some time
• update/notify mechanisms under design by IETF
  - RFC 2136
  - http://www.faqs.org/rfcs/rfc2136.html

DNS records

• DNS: distributed db storing resource records (RR)

RR format: (name, value, type, class, ttl)

• Type=A
  - name is hostname
  - value is IP address
• Type=NS
  - name is domain (e.g. foo.com)
  - value is hostname of authoritative name server for this domain
• Type=CNAME
  - name is alias name for some “canonical” (the real) name, e.g. www.ibm.com is really servereast.backup2.ibm.com
  - value is canonical name
• Type=MX
  - value is name of mailserver associated with name

DNS protocol, messages

• DNS protocol: query and reply messages, both with same message format

msg header

• identification: 16 bit # for query, reply to query uses same #
• flags:
  - query or reply
  - recursion desired
  - recursion available
  - reply is authoritative

DNS protocol, messages

Message sections after the header:

• questions: name, type fields for a query
• answers: RRs in response to query
• authority: records for authoritative servers
• additional: “helpful” info that may be used
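The header fields listed above map onto a fixed 12-byte layout, which can be packed and unpacked with `struct`. This is a sketch of the standard wire format; the function names are assumptions, and only a single type A question is supported.

```python
import struct


def build_query(name: str, query_id: int, recursion_desired: bool = True) -> bytes:
    """Pack a DNS query: 12-byte header followed by one question (type A, class IN)."""
    flags = 0x0100 if recursion_desired else 0x0000   # RD bit; QR=0 means "query"
    # identification, flags, #questions, #answer RRs, #authority RRs, #additional RRs
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)
    # Each label is sent as a length byte plus its characters, ending with a zero byte
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    question = qname + struct.pack("!HH", 1, 1)       # QTYPE=A (1), QCLASS=IN (1)
    return header + question


def parse_header(message: bytes) -> dict:
    """Unpack the identification, flag bits, and section counts from a message."""
    query_id, flags, qd, an, ns, ar = struct.unpack("!HHHHHH", message[:12])
    return {
        "id": query_id,
        "is_reply": bool(flags & 0x8000),              # QR
        "authoritative": bool(flags & 0x0400),         # AA
        "recursion_desired": bool(flags & 0x0100),     # RD
        "recursion_available": bool(flags & 0x0080),   # RA
        "counts": (qd, an, ns, ar),
    }
```

Because queries and replies share this format, the same `parse_header` works on both; a reply is matched to its query by the identification number.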

Workload and Caching

• What workload do you expect for different servers?
  - Why might this be a problem? How can we solve this problem?
• DNS responses are cached
  - Quick response for repeated translations
  - Other queries may reuse some parts of lookup
    - NS records for domains
• DNS negative queries are cached
  - Don't have to repeat past mistakes
  - E.g. misspellings, search strings in resolv.conf
• Cached data periodically times out
  - Lifetime (TTL) of data controlled by owner of data
  - TTL passed with every record
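TTL-based expiry and negative caching can be sketched with a small dictionary-backed cache. The class name, the use of `None` for a negative answer, and passing the current time explicitly (to keep the example deterministic) are all choices made for this illustration.

```python
from typing import Dict, Optional, Tuple


class DnsCache:
    """Toy cache of name -> address with per-entry expiry (TTL in seconds)."""

    def __init__(self) -> None:
        self._entries: Dict[str, Tuple[Optional[str], float]] = {}

    def put(self, name: str, address: Optional[str], ttl: float, now: float) -> None:
        # address=None records a negative answer ("no such name")
        self._entries[name] = (address, now + ttl)

    def get(self, name: str, now: float) -> Tuple[bool, Optional[str]]:
        """Return (cached?, address). Expired entries are dropped."""
        entry = self._entries.get(name)
        if entry is None:
            return False, None
        address, expires_at = entry
        if now >= expires_at:
            del self._entries[name]
            return False, None
        return True, address
```

A result of `(True, None)` means "known not to exist", which saves repeating a lookup for a misspelled name until the entry times out.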

Subsequent Lookup Example

[Figure: the client asks the local DNS server for ftp.it.usyd.edu.au; the root & au DNS server and the usyd.edu.au DNS server are shown but the local server queries only the it.usyd.edu.au DNS server, which replies ftp=IPaddr]

Reliability

• DNS servers are replicated
  - Name service available if ≥ one replica is up
  - Queries can be load balanced between replicas
• Try alternate servers on timeout
• What's a good value for a timeout?
  - Hard to tell → what are the tradeoffs?
  - Better be conservative!
• Exponential backoff when retrying same server
  - Why do we need this?
• Same identifier for all queries
  - Don't care which server responds
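Exponential backoff means that each retry to the same server waits longer than the previous one. A minimal sketch, with an assumed initial timeout and doubling factor:

```python
from typing import List


def retry_timeouts(initial: float, retries: int, factor: float = 2.0) -> List[float]:
    """Timeouts (in seconds) for successive retries to the same server."""
    return [initial * factor ** attempt for attempt in range(retries)]


if __name__ == "__main__":
    print(retry_timeouts(1.0, 4))
```

Growing the wait keeps many clients from hammering a server that is slow because it is already overloaded.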

Prefetching

• Name servers can add additional data to any response
• Typically used for prefetching
  - CNAME/MX/NS typically point to another host name

DNS: Summary

• Motivations → large distributed database
  - Scalability
  - Independent update
  - Robustness
• Hierarchical database structure
  - Zones
  - How is a lookup done
• Caching/prefetching and TTLs

Today: Internet Services

• Web Caching
• Domain Name System
• Content Distribution Network
• Web Services

Components in a CDN

[Figure: clients, redirectors, geographically distributed surrogate servers (caches), and backend servers for aaa.com, bbb.com, and ccc.com]

Content distribution networks (CDNs)

• The content providers are the CDN customers.
• Content replication
  - CDN company installs hundreds of CDN servers throughout Internet
    - in lower-tier ISPs, close to users
  - CDN replicates its customers' content in CDN servers. When provider updates content, CDN updates servers

[Figure: origin server in North America → CDN distribution node → CDN servers in S. America, Europe, and Asia]

CDN example

origin server (www.foo.com)

• distributes HTML
• replaces http://www.foo.com/sports/ruth.gif with http://www.cdn.com/www.foo.com/sports/ruth.gif

CDN company (cdn.com)

• distributes gif files
• uses its authoritative DNS server to route redirect requests

[Figure: (1) HTTP request for www.foo.com/sports/sports.html to the origin server; (2) DNS query for www.cdn.com to the CDN's authoritative DNS server; (3) HTTP request for www.cdn.com/www.foo.com/sports/ruth.gif to a nearby CDN server]
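The URL replacement done by the origin server amounts to moving the original host name into the path of a CDN URL. A minimal sketch, where the helper name `to_cdn_url` is an assumption:

```python
from urllib.parse import urlsplit

CDN_HOST = "www.cdn.com"   # the CDN host name used in the example


def to_cdn_url(url: str, cdn_host: str = CDN_HOST) -> str:
    """Rewrite http://<origin>/<path> as http://<cdn_host>/<origin>/<path>."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{cdn_host}/{parts.netloc}{parts.path}"


if __name__ == "__main__":
    print(to_cdn_url("http://www.foo.com/sports/ruth.gif"))
```

Keeping the origin host in the path lets the CDN server know where to fetch the object if it does not hold a copy.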

More about CDNs

• routing requests
  - CDN creates a “map”, indicating distances from leaf ISPs and CDN nodes
  - when query arrives at authoritative DNS server:
    - server determines ISP from which query originates
    - uses “map” to determine best CDN server
• not just Web pages
  - streaming stored audio/video
  - streaming real-time audio/video
    - CDN nodes create application-layer overlay network

How Akamai Works

• Clients fetch html document from primary server
  - E.g. fetch index.html from cnn.com
• URLs for replicated content are replaced in html
  - E.g. <img src="http://cnn.com/af/x.gif"> replaced with <img src="http://a73.g.akamaitech.net/7/23/cnn.com/af/x.gif">
• Client is forced to resolve aXYZ.g.akamaitech.net hostname

How Akamai Works

• How is content replicated?
• Akamai only replicates static content
• Modified name contains original file name
• Akamai server is asked for content
  - First checks local cache

How Akamai Works

[Figure: end-user, cnn.com (content provider), DNS root server, Akamai server, Akamai high-level DNS server, Akamai low-level DNS server, and nearby matching Akamai server; steps numbered 1 to 12, including "Get index.html", "Get foo.jpg", and "Get /cnn.com/foo.jpg"]

Akamai – Subsequent Requests

[Figure: same components as the previous figure; only steps 1, 2, and 7 to 10 remain, including "Get index.html" and "Get /cnn.com/foo.jpg" to the nearby matching Akamai server]

Today: Internet Services

• Web Caching
• Domain Name System
• Content Distribution Network
• Web Services

Web History

• 1945:
  - Vannevar Bush, “As we may think”, Atlantic Monthly, July, 1945.
    - Describes the idea of a distributed hypertext system.
    - A “memex” that mimics the “web of trails” in our minds.

“Consider a future device for individual use, which is a sort of mechanized private file and library. It needs a name, and to coin one at random, "memex" will do. A memex is a device in which an individual stores all his books, records, and communications, and which is mechanized so that it may be consulted with exceeding speed and flexibility. It is an enlarged intimate supplement to his memory.”

Web History

• 1989:
  - Tim Berners-Lee (CERN) writes internal proposal to develop a distributed hypertext system.
    - Connects “a web of notes with links.”
    - Intended to help CERN physicists in large projects share and manage information
• 1990:
  - Tim Berners-Lee writes a graphical browser for NeXT machines.

Web History (cont)

• 1992
  - NCSA server released
  - 26 WWW servers worldwide
• 1993
  - Marc Andreessen releases first version of NCSA Mosaic browser
  - Mosaic version released for (Windows, Mac, Unix).
  - Web (port 80) traffic at 1% of NSFNET backbone traffic.
  - Over 200 WWW servers worldwide.
• 1994
  - Andreessen and colleagues leave NCSA to form “Mosaic Communications Corp” (predecessor to Netscape).

Internet Hosts

• How many of the 2^32 IP addresses have registered domain names?

Web Servers

• Clients and servers communicate using the HyperText Transfer Protocol (HTTP)
  - Client and server establish TCP connection
  - Client requests content
  - Server responds with requested content
  - Client and server close connection (eventually)
• Current version is HTTP/1.1
  - RFC 2616, June, 1999.
  - http://www.w3.org/Protocols/rfc2616/rfc2616.html

[Figure: Web client (browser) sends an HTTP request to the Web server, which returns an HTTP response (content). Protocol layers: IP carries datagrams, TCP carries streams, HTTP carries Web content]

Web Content

• Web servers return content to clients
  - content: a sequence of bytes with an associated MIME (Multipurpose Internet Mail Extensions) type
• Example MIME types
  - text/html: HTML document
  - text/plain: Unformatted text
  - application/postscript: PostScript document
  - image/gif: Binary image encoded in GIF format
  - image/jpeg: Binary image encoded in JPEG format

Static and Dynamic Content

• The content returned in HTTP responses can be either static or dynamic.
  - Static content: content stored in files and retrieved in response to an HTTP request
    - Examples: HTML files, images, audio clips.
    - Request identifies content file
  - Dynamic content: content produced on-the-fly in response to an HTTP request
    - Example: content produced by a program executed by the server on behalf of the client.
    - Request identifies file containing executable code
• Bottom line: All Web content is associated with a file that is managed by the server.

URLs

• Each file managed by a server has a unique name called a URL (Uniform Resource Locator)
• URLs for static content:
  - http://www.cs.cmu.edu:80/index.html
  - http://www.cs.cmu.edu/index.html
  - http://www.cs.cmu.edu
  - Each identifies a file called index.html, managed by a Web server at www.cs.cmu.edu that is listening on port 80.
• URLs for dynamic content:
  - http://www.cs.cmu.edu:8000/cgi-bin/proc?15000&213
  - Identifies an executable file called proc, managed by a Web server at www.cs.cmu.edu that is listening on port 8000, that should be called with two argument strings: 15000 and 213.

How Clients and Servers Use URLs

• Example URL: http://www.cmu.edu:80/index.html
• Clients use prefix (http://www.cmu.edu:80) to infer:
  - What kind of server to contact (Web server)
  - Where the server is (www.cmu.edu)
  - What port it is listening on (80)
• Servers use suffix (/index.html) to:
  - Determine if request is for static or dynamic content.
    - No hard and fast rules for this.
    - Convention: executables reside in cgi-bin directory
  - Find file on file system.
    - Initial “/” in suffix denotes home directory for requested content.
    - Minimal suffix is “/”, which all servers expand to some default home page
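The prefix/suffix split can be reproduced with `urllib.parse.urlsplit` from the standard library. The helper `split_url` and its dictionary keys are assumptions for this sketch, and the `cgi-bin` test is the convention mentioned above, not a rule.

```python
from urllib.parse import urlsplit


def split_url(url: str) -> dict:
    """Separate the part the client uses (scheme, host, port) from the server's suffix."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        "port": parts.port or 80,                 # assume port 80 when the URL omits it
        "suffix": parts.path or "/",              # minimal suffix is "/"
        "args": parts.query.split("&") if parts.query else [],
        "dynamic": "/cgi-bin/" in parts.path,     # convention only
    }


if __name__ == "__main__":
    print(split_url("http://www.cs.cmu.edu:8000/cgi-bin/proc?15000&213"))
    print(split_url("http://www.cs.cmu.edu"))
```

The client only needs the first three fields to open a connection; the suffix and arguments are sent to the server inside the request.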

HTTP Requests

• HTTP request is a request line, followed by zero or more request headers
• Request line: <method> <uri> <version>
  - <version> is HTTP version of request (HTTP/1.0 or HTTP/1.1)
  - <uri> is typically URL for proxies, URL suffix for servers.
    - A URL is a type of URI (Uniform Resource Identifier)
    - See http://www.ietf.org/rfc/rfc2396.txt
  - <method> is either GET, POST, OPTIONS, HEAD, PUT, DELETE, or TRACE.

HTTP Requests (cont)

• HTTP methods:
  - GET: Retrieve static or dynamic content
    - Arguments for dynamic content are in URI
    - Workhorse method (99% of requests)
  - POST: Retrieve dynamic content
    - Arguments for dynamic content are in the request body
  - OPTIONS: Get server or file attributes
  - HEAD: Like GET but no data in response body
  - PUT: Write a file to the server!
  - DELETE: Delete a file on the server!
  - TRACE: Echo request in response body
    - Useful for debugging.
• Request headers: <header name>: <header data>
  - Provide additional information to the server.

HTTP Versions

• Major differences between HTTP/1.1 and HTTP/1.0
  - HTTP/1.0 uses a new connection for each transaction.
  - HTTP/1.1 also supports persistent connections
    - multiple transactions over the same connection
    - Connection: Keep-Alive
  - HTTP/1.1 requires HOST header
    - Host: www.cmu.edu
    - Makes it possible to host multiple websites at single Internet host
  - HTTP/1.1 supports chunked encoding (described later)
    - Transfer-Encoding: chunked
  - HTTP/1.1 adds additional support for caching

HTTP Responses

• HTTP response is a response line followed by zero or more response headers.
• Response line: <version> <status code> <status msg>
  - <version> is HTTP version of the response.
  - <status code> is numeric status.
  - <status msg> is corresponding English text.
    - 200 OK: Request was handled without error
    - 301 Moved: Provide alternate URL
    - 403 Forbidden: Server lacks permission to access file
    - 404 Not found: Server couldn't find the file.
• Response headers: <header name>: <header data>
  - Provide additional information about response
  - Content-Type: MIME type of content in response body.

GET Request to Apache Server From Firefox Browser

GET /~bryant/test.html HTTP/1.1
Host: www.cs.cmu.edu
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.2.11) Gecko/20101012 Firefox/3.6.11
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 115
Connection: keep-alive
CRLF (\r\n)

Note: URI is just the suffix, not the entire URL
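The structure of a request (request line, headers, terminating empty line) can be seen by parsing one. This is a simplified sketch: `parse_request` is a made-up helper that ignores the request body and repeated headers, and `REQUEST` is a shortened version of the request above.

```python
from typing import Dict, Tuple


def parse_request(raw: str) -> Tuple[str, str, str, Dict[str, str]]:
    """Split a request into (method, uri, version, headers)."""
    head = raw.split("\r\n\r\n", 1)[0]          # headers end at the empty line
    lines = head.split("\r\n")
    method, uri, version = lines[0].split(" ")  # request line has three fields
    headers: Dict[str, str] = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()   # header names are case-insensitive
    return method, uri, version, headers


REQUEST = (
    "GET /~bryant/test.html HTTP/1.1\r\n"
    "Host: www.cs.cmu.edu\r\n"
    "Connection: keep-alive\r\n"
    "\r\n"
)

if __name__ == "__main__":
    print(parse_request(REQUEST))
```

The server combines the `Host` header with the URI suffix to decide which site and which file the request refers to.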

GET Response From Apache Server

HTTP/1.1 200 OK
Date: Fri, 29 Oct 2010 19:48:32 GMT
Server: Apache/2.2.14 (Unix) mod_ssl/2.2.14 OpenSSL/0.9.7m mod_pubcookie/3.3.2b PHP/5.3.1
Accept-Ranges: bytes
Content-Length: 479
Keep-Alive: timeout=15, max=100
Connection: Keep-Alive
Content-Type: text/html

<html>
<head><title>Some Tests</title></head>
<body>
<h1>Some Tests</h1>
. . .
</body>
</html>

Serving Dynamic Content

• Client sends request to server.
• If request URI contains the string “/cgi-bin”, then the server assumes that the request is for dynamic content.

[Figure: client sends "GET /cgi-bin/env.pl HTTP/1.1" to server]

Serving Dynamic Content (cont)

• The server creates a child process and runs the program identified by the URI in that process

[Figure: server does fork/exec to start env.pl]

Serving Dynamic Content (cont)

• The child runs and generates the dynamic content.
• The server captures the content of the child and forwards it without modification to the client

[Figure: env.pl passes content to the server, which passes the content to the client]

Issues in Serving Dynamic Content

• How does the client pass program arguments to the server?
• How does the server pass these arguments to the child?
• How does the server pass other info relevant to the request to the child?
• How does the server capture the content produced by the child?
• These issues are addressed by the Common Gateway Interface (CGI) specification.

[Figure: client sends request to server; server creates env.pl; content flows from env.pl to the server and on to the client]

CGI

• Because the children are written according to the CGI spec, they are often called CGI programs.
• Because many CGI programs are written in Perl, they are often called CGI scripts.
• However, CGI really defines a simple standard for transferring information between the client (browser), the server, and the child process.

The add.com Experience

[Figure: browser screenshot with the input URL annotated (host, port, CGI program, args) and the output page]

Serving Dynamic Content With GET

• Question: How does the client pass arguments to the server?
• Answer: The arguments are appended to the URI
• Can be encoded directly in a URL typed to a browser or a URL in an HTML link
  - http://add.com/cgi-bin/adder?n1=15213&n2=18243
  - adder is the CGI program on the server that will do the addition.
  - argument list starts with “?”
  - arguments separated by “&”
  - spaces represented by “+” or “%20”
• URI often generated by an HTML form

<FORM METHOD=GET ACTION="cgi-bin/adder">
<p>X <INPUT NAME="n1">
<p>Y <INPUT NAME="n2">
<p><INPUT TYPE=submit>
</FORM>

Serving Dynamic Content With GET

• URL:
  - cgi-bin/adder?n1=15213&n2=18243
• Result displayed on browser:

Welcome to add.com: THE Internet addition portal.
The answer is: 15213 + 18243 -> 33456
Thanks for visiting!

Serving Dynamic Content With GET

• Question: How does the server pass these arguments to the child?
• Answer: In environment variable QUERY_STRING
  - A single string containing everything after the “?”
  - For add: QUERY_STRING = "n1=15213&n2=18243"
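A CGI program along the lines of adder could read QUERY_STRING as sketched below. This is not the program used on the slides: the function name `adder_response`, the plain-text output, and the default query string are assumptions for the example.

```python
import os
from urllib.parse import parse_qs


def adder_response(query_string: str) -> str:
    """Build the CGI output (header, blank line, body) for n1 + n2."""
    args = parse_qs(query_string)               # "n1=1&n2=2" -> {"n1": ["1"], "n2": ["2"]}
    n1, n2 = int(args["n1"][0]), int(args["n2"][0])
    body = f"The answer is: {n1} + {n2} -> {n1 + n2}\n"
    return f"Content-Type: text/plain\r\n\r\n{body}"


if __name__ == "__main__":
    # The server is assumed to have set QUERY_STRING before starting this process.
    print(adder_response(os.environ.get("QUERY_STRING", "n1=0&n2=0")), end="")
```

The child writes its result to standard output, which is how the server captures the content and forwards it to the client.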

Additional CGI Environment Variables

• General
  - SERVER_SOFTWARE
  - SERVER_NAME
  - GATEWAY_INTERFACE (CGI version)
• Request-specific
  - SERVER_PORT
  - REQUEST_METHOD (GET, POST, etc.)
  - QUERY_STRING (contains GET args)
  - REMOTE_HOST (domain name of client)
  - REMOTE_ADDR (IP address of client)
  - CONTENT_TYPE (for POST, type of data in message body, e.g., text/html)

Even More CGI Environment Variables

• In addition, the value of each header of type type received from the client is placed in environment variable HTTP_type
  - Examples (any “-” is changed to “_”):
    - HTTP_ACCEPT
    - HTTP_HOST
    - HTTP_USER_AGENT
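The naming rule for these variables is mechanical, as this small sketch shows. The helper name `cgi_env_name` is an assumption.

```python
def cgi_env_name(header_name: str) -> str:
    """Map an HTTP request header name to its CGI environment variable name."""
    # Upper-case the header name, turn "-" into "_", and prefix with HTTP_
    return "HTTP_" + header_name.upper().replace("-", "_")


if __name__ == "__main__":
    for header in ("Accept", "Host", "User-Agent"):
        print(header, "->", cgi_env_name(header))
```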

 

THANK YOU

Questions?
