Dr. Mark Humphrys

School of Computing. Dublin City University.


The Web



Performance (server-side)

Many things can be done on the server-side to speed up Web performance:
  1. Multi-threading. Start serving a new client while still responding to the previous client.

  2. Cache of (maybe huge numbers of) files in memory.
    Disk reads are slow. So don't make separate disk access for every file request. Instead maintain cache in RAM of frequently accessed files and/or small files which are easy to hold in RAM.
    OS will cache files in RAM and Web server can also cache files in RAM.

    Example of caching files in RAM

    Example: Search my genealogy site. It searches the text of over 2,400 web pages.
    But search is instant. I think it is caching every single web page in RAM.
    Web pages are text files and so are small compared to images and video. 2,400 pages is only about 20 MB in total. Could easily hold all that in RAM.
    Entire site is about 60 GB. So the HTML text is less than 1/1000 of the site. This is normal enough.

  3. Multiple disks. Site could be spread over multiple disks to allow many reads going on at once.

  4. Defragmentation of disks. Reduce seek times.

  5. Multiple servers. "Server farm".

  6. Content delivery network (CDN) - copies of resources are served from many distributed servers.
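
A minimal sketch of item 2 above (caching files in RAM), not how any particular web server does it. The class name `FileCache`, the 1 MB size limit and the `disk_reads` counter are assumptions made for illustration. The sketch never re-reads a file that changes on disk, which a real cache would need to handle.

```python
import os
import tempfile


class FileCache:
    """Keep the bytes of small files in RAM so repeat requests skip the disk."""

    def __init__(self, max_file_bytes=1_000_000):
        self.max_file_bytes = max_file_bytes  # assumed cut-off for "small" files
        self._store = {}                      # path -> file contents
        self.disk_reads = 0                   # counter, only for demonstration

    def get(self, path):
        if path in self._store:
            return self._store[path]          # cache hit: no disk access
        with open(path, "rb") as f:           # cache miss: read from disk once
            data = f.read()
        self.disk_reads += 1
        if len(data) <= self.max_file_bytes:
            self._store[path] = data          # remember small files only
        return data


def demo():
    """Request the same page twice and report how often the disk was read."""
    with tempfile.TemporaryDirectory() as d:
        page = os.path.join(d, "index.html")
        with open(page, "wb") as f:
            f.write(b"<html>hello</html>")
        cache = FileCache()
        first = cache.get(page)
        second = cache.get(page)
        return first == second, cache.disk_reads
```

Calling `demo()` serves the page twice but reads the disk only once.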

  

Related to how the site is designed:
  1. Minification - Do various transforms to JS and other files to reduce size (reduce download time) and make parsing faster.
    Text files tend to be tiny anyway.

  2. Bundling of Files. One network request for a bundled JS file for the page, instead of 20 network requests for 20 JS files.
    Same for CSS - bundle into one CSS file.
    Reducing network calls can make a big difference.

    Example of file bundling

    Example: On my Ancient Brain site, at time of writing I have 5 JS files for each page, that I bundle into one JS file page.js.

    And I have 13 CSS files for each page, that I bundle into one CSS file main.css.

  3. Small / low-resolution images (for any images used inline).
    Can click to expand.
    Definition of "small" changes over time.
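
A minimal sketch of item 2 above (bundling), assuming that simple concatenation in a fixed order is enough. This is not the build step actually used on Ancient Brain. The input file names `a.js` and `b.js` are hypothetical. The `/* ... */` marker is used because it is a valid comment in both JS and CSS.

```python
import os
import tempfile


def bundle(paths, out_path):
    """Concatenate the given text files, in order, into one output file."""
    with open(out_path, "w", encoding="utf-8") as out:
        for p in paths:
            with open(p, encoding="utf-8") as f:
                # Mark where each source file starts inside the bundle.
                out.write(f"/* {os.path.basename(p)} */\n")
                out.write(f.read().rstrip("\n") + "\n")
    return out_path


def demo():
    """Bundle two tiny (hypothetical) JS files into page.js and return the result."""
    with tempfile.TemporaryDirectory() as d:
        paths = []
        for i, name in enumerate(["a.js", "b.js"]):
            p = os.path.join(d, name)
            with open(p, "w", encoding="utf-8") as f:
                f.write(f"var x{i} = {i};\n")
            paths.append(p)
        out = bundle(paths, os.path.join(d, "page.js"))
        with open(out, encoding="utf-8") as f:
            return f.read()
```

The order of `paths` matters for JS, since later files may depend on earlier ones. The page then needs one request for `page.js` instead of one request per source file.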


  


For high-demand sites: Multiple copies of entire site - "server farm" - front end routes requests to different CPUs.

Problem: OK to have all (small size) requests come in through one front end and get routed to searching nodes.
Not OK to have all (large size) replies go back through one front end - bottleneck.
Solution: TCP handoff - trick to have the searching node reply directly in a manner that is invisible to client.
The reply load is therefore distributed over all the nodes.




Caching in HTTP




Server logs

HTTP servers can log all accesses. Can have separate log for errors.



Typical web server logs.
(Apart from being colour-coded. Normal logs are not colour-coded.)
From askapache.com.
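
A minimal sketch of reading an access log, assuming the log uses the Common Log Format (host, identity, user, time, request line, status, size). The logs pictured may use a different format. The sample line is made up for illustration.

```python
import re

# One named group per field of the Common Log Format.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+)'
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    m = LOG_PATTERN.match(line)
    if m is None:
        return None
    entry = m.groupdict()
    entry["status"] = int(entry["status"])
    # A size of "-" means no body was sent.
    entry["size"] = 0 if entry["size"] == "-" else int(entry["size"])
    return entry


# Made-up sample line (192.0.2.1 is an address reserved for documentation).
SAMPLE = '192.0.2.1 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326'
```

`parse_line(SAMPLE)` gives a dict with the client address, the request line, the status code and the reply size, which is enough to count hits or errors per page.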





URI schemes

Shows how the Web has tried to provide a unifying interface to all Internet protocols, data and activities.
  

Some URL formats.


  
URI schemes fall into these groups: those listed above (in use); obsolete; others (media); others (phone); others.
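
A minimal sketch of how one parser can handle several schemes, using Python's standard `urllib.parse`. The example URIs are my own choice and use placeholder hosts, not the ones from the figure above.

```python
from urllib.parse import urlsplit


def describe(uri):
    """Split a URI into (scheme, network location, path)."""
    parts = urlsplit(uri)
    return parts.scheme, parts.netloc, parts.path


# Example URIs chosen for illustration only.
EXAMPLES = [
    "https://www.example.com/index.html",
    "ftp://ftp.example.com/pub/file.txt",
    "mailto:someone@example.com",
    "file:///home/user/notes.txt",
]

for uri in EXAMPLES:
    print(describe(uri))
```

The scheme is always the part before the first colon. What follows depends on the scheme: `https` and `ftp` name a host, while `mailto` and a local `file` URI have an empty network location.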




HTTP client

Web browser

Uses MIME types to decide how to handle content. Content the browser cannot display itself goes to:
(a) Plug-in - Runs inside browser process.
(b) Helper application - Separate process.
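
A minimal sketch of dispatching on MIME type. A real browser takes the type from the server's Content-Type header. Here it is guessed from the file name with Python's standard `mimetypes` module, to keep the example self-contained. The set `INTERNAL` of types handled inside the browser is an assumption.

```python
import mimetypes

# Assumed list of types the browser can display itself.
INTERNAL = {"text/html", "text/plain", "image/png", "image/jpeg"}


def dispatch(filename):
    """Return (MIME type, what the browser does with content of that type)."""
    mime_type, _encoding = mimetypes.guess_type(filename)
    if mime_type in INTERNAL:
        return mime_type, "display in browser"
    return mime_type, "hand to plug-in or helper application"


print(dispatch("index.html"))
print(dispatch("paper.pdf"))
```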





Keeping state

HTTP is stateless. Keeping state means relating one client-server request to other requests from the same client.

Identify user (pay-to-view, register, personalisation).
Shopping carts.
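
A minimal sketch of one common way to keep state: the server gives the client a session id in a cookie, and the client sends it back with every later request. The cookie name `sid`, the in-memory `SESSIONS` dict and the function `handle_request` are assumptions made for illustration.

```python
import secrets
from http.cookies import SimpleCookie

SESSIONS = {}  # session id -> per-user state, held on the server


def handle_request(cookie_header, add_item=None):
    """Return (Set-Cookie value or None, cart contents) for one request."""
    cookie = SimpleCookie(cookie_header or "")
    sid = cookie["sid"].value if "sid" in cookie else None
    set_cookie = None
    if sid not in SESSIONS:
        # Unknown client: create a session and ask the browser to remember its id.
        sid = secrets.token_hex(16)
        SESSIONS[sid] = {"cart": []}
        out = SimpleCookie()
        out["sid"] = sid
        out["sid"]["httponly"] = True
        set_cookie = out["sid"].OutputString()
    if add_item is not None:
        SESSIONS[sid]["cart"].append(add_item)
    return set_cookie, list(SESSIONS[sid]["cart"])


def demo():
    """Two separate requests from the same client share one shopping cart."""
    set_cookie, _cart = handle_request(None, add_item="book")
    cookie_header = set_cookie.split(";")[0]  # what the browser sends back: "sid=..."
    _none, cart = handle_request(cookie_header, add_item="pen")
    return cart
```

The two calls in `demo()` are independent requests, but the second one sees the item added by the first because both carry the same session id.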






Performance (client-side)

Many things can be done on the client-side to speed up Web performance.

Actually, all of these things, though taking place on the client, involve server support too:

  1. Client-side caching
    • Browser maintains cache (in memory or disk or both).
    • How to see your cache files in various browsers.
    • Server tells you what to cache.

  2. Site-wide (or ISP-wide) cache via proxy server.

  3. Lazy load - of images etc.

  4. Infinite scroll - Load more of page on scroll to bottom.
    Use with moderation. See article about why this is only suitable for some types of sites.

  5. Delayed loading of resources.
    Delayed running of scripts.
    Fetch some resources / run some JS only after initial page is rendered.
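
A minimal sketch of the server support behind item 1 above (client-side caching), assuming the standard HTTP headers `Cache-Control`, `ETag` and `If-None-Match`. The function `respond` and the one-hour lifetime are assumptions made for illustration.

```python
import hashlib


def respond(body, if_none_match=None, max_age=3600):
    """Return (status, headers, body) for a request for a cacheable resource."""
    # The ETag is a fingerprint of the content. It changes when the content changes.
    etag = '"' + hashlib.sha256(body).hexdigest()[:16] + '"'
    headers = {"ETag": etag, "Cache-Control": f"max-age={max_age}"}
    if if_none_match == etag:
        # The browser's cached copy is still good: send no body.
        return 304, headers, b""
    return 200, headers, body


status, headers, body = respond(b"<html>hello</html>")
print(status, headers["Cache-Control"])

# Later the browser revalidates by sending the ETag it was given.
status, _headers, body = respond(b"<html>hello</html>", if_none_match=headers["ETag"])
print(status, len(body))
```

The first reply tells the browser it may reuse the page for an hour. After that, the browser asks again with the ETag and gets a short 304 reply if nothing has changed.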



DCU proxy servers

DCU is (apparently) not using proxy servers any more. But they are still in use outside DCU.
  
In DCU, some machines may communicate with the outside world through a proxy server.
Some communicate directly (not through a proxy).


  1. wwwproxy.computing.dcu.ie = 136.206.11.243 (forwards requests through 136.206.11.249)
    • port: 8000

  2. proxy.dcu.ie alternates between different IP addresses (for load balancing)
    • port: 8080 or 3128
    • lookup shows it alternates randomly between:
      1. 136.206.1.17
      2. 136.206.1.20


To set proxy, something like:
  1. Firefox - Tools - Options - Advanced - Network - Settings
  2. IE - Tools - Options - Connections - LAN settings
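
Programs other than browsers can be pointed at a proxy too. A minimal sketch using Python's standard `urllib.request`, assuming the proxy `proxy.dcu.ie` on port 8080 listed above. It only builds the opener and makes no network request, since the proxy may no longer exist. The function name `make_proxy_opener` is hypothetical.

```python
import urllib.request


def make_proxy_opener(proxy_url):
    """Build an opener that would send http and https requests via the proxy."""
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    opener = urllib.request.build_opener(handler)
    return opener, handler


# Host and port taken from the list above. Nothing is fetched here.
opener, handler = make_proxy_opener("http://proxy.dcu.ie:8080")
print(handler.proxies)
```

Calling `opener.open(url)` would then route the request through the proxy.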

You may use a proxy auto-config (PAC) file:

  1. https://www.computing.dcu.ie/proxy.pac
  2. http://proxy.dcu.ie/proxy.pac


Test the IP address other sites see:


