Dr. Mark Humphrys

School of Computing. Dublin City University.



Network programming in Java


Package:
import java.net.*;
You can use these classes to (a) communicate with any server, (b) construct your own server.



Java network programming reference




InetAddress class - Get IP address of hostname

Find numeric IP address (or return error) given hostname.

InetAddress

From Graba:

 

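import java.net.*;
import java.io.*;

public class ip 
{
  public static void main ( String[] args ) throws IOException 
  {
    if (args.length < 1)     // no hostname given on the command line
    {
      System.out.println("Usage: java ip hostname");
      return;
    }

    String hostname = args[0];

    try 
    {
      // look up the hostname (no lookup is needed if it is already a numeric address)
      InetAddress ipaddress = InetAddress.getByName(hostname);
      System.out.println("IP address: " + ipaddress.getHostAddress());
    }
    catch ( UnknownHostException e )
    {
      System.out.println("Could not find IP address for: " + hostname);
    }
  }
}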

Run it:


$ javac ip.java

$ java ip www.computing.dcu.ie 
IP address: 136.206.217.26

Q. Write a program to find the hostname given an IP address.
Note getByName() is flexible in its input: it accepts either a hostname or a numeric IP address written as text.
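A minimal sketch of that flexibility, using only java.net. The class name FlexibleLookup and the helper toAddress are made up for this illustration. When the input is already a numeric address, getByName() only checks the format and does no lookup.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical demo class (not from Graba): getByName() with different inputs.
class FlexibleLookup
{
  // Returns the numeric address for a hostname or IP literal, or null if unknown.
  static String toAddress ( String input )
  {
    try
    {
      return InetAddress.getByName(input).getHostAddress();
    }
    catch ( UnknownHostException e )
    {
      return null;
    }
  }

  public static void main ( String[] args )
  {
    // A literal IP address is parsed directly, no DNS query is needed.
    System.out.println( toAddress("127.0.0.1") );
    // A hostname goes through the normal resolver.
    System.out.println( toAddress("localhost") );
  }
}
```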

See DNS lookup.




InetAddress.getLocalHost()

To find your own numeric IP address in Java:
  1. getLocalHost
    Works on DCU Win.
    May not work on DCU Linux.


In general:


My local host is not to be confused with: 127.0.0.1
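The difference between your real address and the loopback address can be checked in code. The sketch below is one possible approach, not the only one: it prints what getLocalHost() returns and also lists every non-loopback address found through NetworkInterface, which may help on machines where getLocalHost() only gives a loopback address. LocalIp and its method names are hypothetical.

```java
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;

class LocalIp
{
  // True if the textual address is a loopback address such as 127.0.0.1.
  static boolean isLoopback ( String literal )
  {
    try
    {
      return InetAddress.getByName(literal).isLoopbackAddress();
    }
    catch ( UnknownHostException e )
    {
      return false;
    }
  }

  // All addresses of this machine that are not loopback addresses.
  static List<String> nonLoopbackAddresses ()
  {
    List<String> result = new ArrayList<>();
    try
    {
      Enumeration<NetworkInterface> nis = NetworkInterface.getNetworkInterfaces();
      if (nis == null) return result;

      for (NetworkInterface ni : Collections.list(nis))
      {
        for (InetAddress a : Collections.list(ni.getInetAddresses()))
        {
          if (!a.isLoopbackAddress()) result.add(a.getHostAddress());
        }
      }
    }
    catch ( SocketException e )
    {
      System.out.println("Could not list network interfaces: " + e.getMessage());
    }
    return result;
  }

  public static void main ( String[] args ) throws UnknownHostException
  {
    System.out.println("getLocalHost(): " + InetAddress.getLocalHost().getHostAddress());
    System.out.println("Non-loopback:   " + nonLoopbackAddresses());
  }
}
```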



TCP Sockets

Sockets.
Ports.
TCP connection to a server: Open a "socket" to connect to a "port" number.
Connection-oriented.
You must explicitly call socket.close() when you are finished.
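One way to make sure close() always happens is try-with-resources (Java 7 and later). The sketch below is self-contained and hypothetical: it opens a throwaway ServerSocket on a free local port just so the client has something to connect to, then checks that the client socket was closed when the try block ended.

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;

class CloseDemo
{
  // Connects to a temporary local server and reports whether the
  // client socket was closed once the try block ended.
  static boolean closedAfterUse ()
  {
    try ( ServerSocket server = new ServerSocket(0) )     // port 0 = any free port
    {
      Socket client = new Socket("127.0.0.1", server.getLocalPort());

      try ( Socket s = client )
      {
        System.out.println("Connected: " + s.isConnected());
      }                                 // s.close() is called here automatically

      return client.isClosed();
    }
    catch ( IOException e )
    {
      System.out.println("Socket error: " + e.getMessage());
      return false;
    }
  }

  public static void main ( String[] args )
  {
    System.out.println("Closed after use: " + closedAfterUse());
  }
}
```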



Query open ports

Port scanner - look at some machines in DCU to find ports that are "open" - providing a service.

It does this by trying to open a socket to that port.

 

import java.net.*;
import java.io.*;

public class ports 
{
  public static void main ( String[] args ) throws IOException 
  {
    if (args.length < 1)
    {
      System.out.println("Usage: java ports hostname");
      return;
    }

    String hostname = args[0];

    try 
    {
      // this is to see if host exists:
      InetAddress ipaddress = InetAddress.getByName(hostname);
    }
    catch ( UnknownHostException e )
    {
      System.out.println("Could not find host: " + hostname);
      return;
    }

//  int p =  21;    // ftp
//  int p =  22;    // ssh / sftp
//  int p =  23;    // telnet
//  int p =  25;    // smtp
    int p =  80;    // http
//  int p = 110;    // pop3
//  int p = 143;    // imap

    // try-with-resources closes the socket, even if an exception is thrown
    try ( Socket s = new Socket(hostname, p) )
    {
      System.out.println("A server is running on port " + p + ".");
    }
    catch (IOException e)
    {
      System.out.println("No server on port " + p + ".");
    }
  }
}

Can now look for http servers:


$ java ports www.dcu.ie
A server is running on port 80.

$ java ports dgrayweb.computing.dcu.ie
A server is running on port 80.

$ java ports mailhost.computing.dcu.ie
A server is running on port 80.

POP3 servers (after changing p to 110 and recompiling):


$ java ports mailhost.computing.dcu.ie
A server is running on port 110.

Search for IMAP servers.

Search for ssh servers from outside DCU for:

  1. student.computing.dcu.ie
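A plain new Socket(host, port) can block for a long time when a firewall silently drops the packets. The sketch below is one way around that, assuming a 2 second timeout is acceptable (the value is arbitrary): it connects with an explicit timeout and loops over the same port numbers as the program above. PortCheck and isOpen are hypothetical names.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

class PortCheck
{
  // True if a TCP connection to host:port succeeds within timeoutMs milliseconds.
  static boolean isOpen ( String host, int port, int timeoutMs )
  {
    try ( Socket s = new Socket() )
    {
      s.connect(new InetSocketAddress(host, port), timeoutMs);
      return true;
    }
    catch ( IOException e )      // refused, timed out, or unknown host
    {
      return false;
    }
  }

  public static void main ( String[] args )
  {
    String host = (args.length > 0) ? args[0] : "localhost";
    int[] ports = { 21, 22, 23, 25, 80, 110, 143 };   // ftp, ssh, telnet, smtp, http, pop3, imap

    for (int p : ports)
    {
      System.out.println("Port " + p + ": " + (isOpen(host, p, 2000) ? "open" : "closed or filtered"));
    }
  }
}
```

Usage: java PortCheck mailhost.computing.dcu.ie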







Socket class - Show raw HTTP request and response

This is what a HTTP request and response actually look like.
A HTTP client normally hides this from you.

e.g. Get instructions for How to email me:


 

// HTTP GET through socket, not through "URL" class

import java.net.*;
import java.io.*;

public class sget 
{
  public static void main ( String[] args ) throws IOException 
  {
    Socket s = null;

    try 
    {
      String host = "humphryscomputing.com";
      String file = "/howtomailme.html";
      int port = 80;
    
      s = new Socket(host, port);

      OutputStream out = s.getOutputStream();
      PrintWriter outw = new PrintWriter(out, false);
      outw.print("GET " + file + " HTTP/1.0\r\n");
      outw.print("Host: " + host + "\r\n");     // needed by servers that host several sites on one IP address
      outw.print("Accept: text/plain, text/html, text/*\r\n");
      outw.print("\r\n");                       // blank line ends the request header
      outw.flush();

      InputStream in = s.getInputStream();
      InputStreamReader inr = new InputStreamReader(in);
      BufferedReader br = new BufferedReader(inr);
      String line;
      while ((line = br.readLine()) != null) 
      {
        System.out.println(line);
      }
      // br.close();		// Q. Do I need this?
    } 
    catch (UnknownHostException e) {} 
    catch (IOException e) {}

    if (s != null)
    {
      try
      {
        s.close();
      }
      catch ( IOException ioEx ) {}
    }
  }
}

From:

flush() - send this now.
The PrintWriter buffers what you print, and TCP itself sends a variable number of bytes, so output may be held back (to collect a larger amount) before sending.
flush() tells the writer to pass on what it has now.
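The buffering can be seen without any network. In this sketch a ByteArrayOutputStream stands in for the socket's output stream (an assumption made purely for the demo), and the code counts how many bytes have arrived before and after flush(). FlushDemo is a hypothetical name.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintWriter;

class FlushDemo
{
  // Returns { bytes delivered before flush(), bytes delivered after flush() }.
  static int[] bytesBeforeAndAfterFlush ()
  {
    ByteArrayOutputStream sink = new ByteArrayOutputStream();   // stands in for the socket
    PrintWriter outw = new PrintWriter(sink, false);            // autoflush off, as in sget

    outw.print("GET / HTTP/1.0\r\n");
    int before = sink.size();     // the text is still sitting in the writer's buffer

    outw.flush();
    int after = sink.size();      // now the bytes have been handed over

    return new int[] { before, after };
  }

  public static void main ( String[] args )
  {
    int[] n = bytesBeforeAndAfterFlush();
    System.out.println("Before flush: " + n[0] + " bytes, after flush: " + n[1] + " bytes");
  }
}
```

Nothing reaches the sink until flush() is called. After it, all 16 bytes of the request line are there.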

Output:

$ java sget
HTTP/1.1 200 OK
Date: Tue, 26 Mar 2013 21:51:59 GMT
Server: Apache/2.2.3 (Unix) DAV/2 mod_ssl/2.2.3 OpenSSL/0.9.8l PHP/5.2.6 SVN/1.6.12
Accept-Ranges: bytes
Content-Type: text/html

(the URL content)



HTTP methods.
HEAD - can be used to test whether a URL exists without downloading its content.
Q. Change the code to do a HEAD request only.
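One possible shape for this, as a sketch rather than a model answer: the request text is built in a small helper so that only the method name changes between GET and HEAD. The class name HeadRequest and the helper buildRequest are hypothetical, and the Host header is an addition in this sketch. The host and file are the ones used in sget.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.Socket;

class HeadRequest
{
  // Builds the text of a minimal HTTP/1.0 request for the given method.
  static String buildRequest ( String method, String host, String file )
  {
    return method + " " + file + " HTTP/1.0\r\n"
         + "Host: " + host + "\r\n"
         + "\r\n";                       // blank line ends the header
  }

  public static void main ( String[] args )
  {
    String host = "humphryscomputing.com";   // same host and file as sget
    String file = "/howtomailme.html";

    try ( Socket s = new Socket(host, 80) )
    {
      PrintWriter outw = new PrintWriter(s.getOutputStream(), false);
      outw.print(buildRequest("HEAD", host, file));
      outw.flush();

      BufferedReader br = new BufferedReader(new InputStreamReader(s.getInputStream()));
      String line;
      while ((line = br.readLine()) != null)
      {
        System.out.println(line);        // status line and headers only, no body
      }
    }
    catch ( IOException e )
    {
      System.out.println("Request failed: " + e.getMessage());
    }
  }
}
```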



HTTP headers.
Request - sent by client.
Response - returned by server.
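Both kinds of header have the same layout: a first line, then "Name: value" lines, then a blank line. A simplified sketch of pulling them apart follows. It assumes each header sits on one line and that a repeated name may overwrite the earlier value, which a full HTTP parser would not do. HeaderParse is a hypothetical name.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class HeaderParse
{
  // Splits the header part of a raw HTTP message into name -> value pairs.
  // The first line (request line or status line) is stored under the empty key "".
  static Map<String,String> parse ( String rawMessage )
  {
    Map<String,String> headers = new LinkedHashMap<>();
    String[] lines = rawMessage.split("\r?\n");

    for (int i = 0; i < lines.length; i++)
    {
      String line = lines[i];
      if (line.isEmpty()) break;                 // blank line = end of header

      if (i == 0)
      {
        headers.put("", line);
        continue;
      }

      int colon = line.indexOf(':');
      if (colon > 0)
      {
        headers.put(line.substring(0, colon).trim(), line.substring(colon + 1).trim());
      }
    }
    return headers;
  }

  public static void main ( String[] args )
  {
    String response = "HTTP/1.1 200 OK\r\n"
                    + "Accept-Ranges: bytes\r\n"
                    + "Content-Type: text/html\r\n"
                    + "\r\n"
                    + "(the URL content)";
    System.out.println( parse(response) );
  }
}
```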





Sending a HTTP POST request

HTTP POST is used for things like sending arbitrary-length data through a HTML form.
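In a POST the data travels in the body of the request, after the blank line, and the Content-Length header says how many bytes to expect. A sketch of what such a request looks like follows. It only builds and prints the text, it does not send it. The host example.com, the path /search and the field names are placeholders, not a real form handler.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

class PostSketch
{
  // Encodes name/value pairs as application/x-www-form-urlencoded text.
  static String encodeForm ( String... pairs )
  {
    StringBuilder sb = new StringBuilder();
    try
    {
      for (int i = 0; i + 1 < pairs.length; i += 2)
      {
        if (sb.length() > 0) sb.append('&');
        sb.append(URLEncoder.encode(pairs[i], "UTF-8"))
          .append('=')
          .append(URLEncoder.encode(pairs[i + 1], "UTF-8"));
      }
    }
    catch ( UnsupportedEncodingException e )
    {
      throw new IllegalStateException(e);     // UTF-8 is always supported
    }
    return sb.toString();
  }

  // Builds the full text of a POST request: headers, blank line, then the body.
  static String buildPost ( String host, String path, String body )
  {
    int length = body.getBytes(StandardCharsets.UTF_8).length;
    return "POST " + path + " HTTP/1.0\r\n"
         + "Host: " + host + "\r\n"
         + "Content-Type: application/x-www-form-urlencoded\r\n"
         + "Content-Length: " + length + "\r\n"
         + "\r\n"
         + body;
  }

  public static void main ( String[] args )
  {
    String body = encodeForm("q", "java sockets", "lang", "en");
    // example.com and /search are placeholders
    System.out.print( buildPost("example.com", "/search", body) );
  }
}
```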



telnet to HTTP server

All commands are plain text. You can just telnet to port 80 and type HTTP commands:

$ telnet www.computing.dcu.ie 80
GET /index.html HTTP/1.1
Host: www.computing.dcu.ie

(blank line to end header)



Write your own client to control ftp, telnet, POP3 ..

This page shows how to write your own HTTP client, using sockets directly or using the URL class.
Your program can then control HTTP.

You can study the commands of any other service and write a client for that too.
Use a socket to connect to the port and then send the appropriate commands.
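As a sketch of the idea, here is a tiny client for a line-based protocol: read the greeting, send one command, read one reply. POP3 works this way (the server greets with "+OK ..." and answers QUIT with "+OK ..."). So that the example runs anywhere, it talks to a stand-in server started inside the same program. That fake server and all the names here (LineClient, talk, startFakeServer) are hypothetical and only imitate the first lines of a POP3 session.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.net.ServerSocket;
import java.net.Socket;

class LineClient
{
  // Connects, reads the server greeting, sends one command, reads one reply.
  static String[] talk ( String host, int port, String command )
  {
    try ( Socket s = new Socket(host, port) )
    {
      BufferedReader in = new BufferedReader(new InputStreamReader(s.getInputStream()));
      PrintWriter out = new PrintWriter(s.getOutputStream(), false);

      String greeting = in.readLine();
      out.print(command + "\r\n");
      out.flush();
      String reply = in.readLine();
      return new String[] { greeting, reply };
    }
    catch ( IOException e )
    {
      throw new UncheckedIOException(e);
    }
  }

  // Stand-in server for local testing only: greets, reads one line, says goodbye.
  static int startFakeServer ()
  {
    try
    {
      ServerSocket server = new ServerSocket(0);     // port 0 = any free port
      int port = server.getLocalPort();

      Thread t = new Thread(() -> {
        try ( ServerSocket ss = server; Socket c = ss.accept() )
        {
          BufferedReader in = new BufferedReader(new InputStreamReader(c.getInputStream()));
          PrintWriter out = new PrintWriter(c.getOutputStream(), false);
          out.print("+OK fake POP3 server ready\r\n");
          out.flush();
          in.readLine();                             // wait for the client's command
          out.print("+OK bye\r\n");
          out.flush();
        }
        catch ( IOException e )
        {
          System.out.println("Fake server error: " + e.getMessage());
        }
      });
      t.setDaemon(true);
      t.start();
      return port;
    }
    catch ( IOException e )
    {
      throw new UncheckedIOException(e);
    }
  }

  public static void main ( String[] args )
  {
    int port = startFakeServer();
    String[] r = talk("127.0.0.1", port, "QUIT");
    System.out.println("Greeting: " + r[0]);
    System.out.println("Reply:    " + r[1]);
  }
}
```

To try it against a real service, call talk() with that server's hostname and port instead of the fake one.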




URL class - Download HTTP page

From The Java Developers Almanac:


 

// download text content of URL

import java.net.*;
import java.io.*;

public class jget 
{
  public static void main ( String[] args ) throws IOException 
  {
    try 
    {
        URL url = new URL( args[0] );
    
        BufferedReader in = new BufferedReader(new InputStreamReader(url.openStream()));
        String str;

        while ((str = in.readLine()) != null) 
        {
          System.out.println(str);
        }

        in.close();
    } 
    catch (MalformedURLException e) {} 
    catch (IOException e) {}
  }
}

e.g. Get the page on How to email me:

  $ java jget "https://humphryscomputing.com/howtomailme.html"


Q. Download the page to a file.

Q. Can you parse it to extract the email address?
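A possible starting point for both questions, offered as a sketch only. It copies the bytes of the URL into a file with Files.copy, then scans the saved text with a deliberately loose regular expression. The pattern will miss unusual addresses and any address that the page hides from scripts. SaveAndScan is a hypothetical name, and reading the file as ISO-8859-1 is an assumption made so that decoding can never fail.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SaveAndScan
{
  // Deliberately loose pattern; real email syntax is more complicated.
  static final Pattern EMAIL =
      Pattern.compile("[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}");

  // Returns every substring of the text that looks like an email address.
  static List<String> extractEmails ( String text )
  {
    List<String> found = new ArrayList<>();
    Matcher m = EMAIL.matcher(text);
    while (m.find())
    {
      found.add(m.group());
    }
    return found;
  }

  public static void main ( String[] args ) throws IOException
  {
    if (args.length < 2)
    {
      System.out.println("Usage: java SaveAndScan URL outputfile");
      return;
    }
    Path dest = Paths.get(args[1]);

    // Copy the raw bytes of the URL content straight into the file.
    try ( InputStream in = URI.create(args[0]).toURL().openStream() )
    {
      Files.copy(in, dest, StandardCopyOption.REPLACE_EXISTING);
    }

    String text = new String(Files.readAllBytes(dest), StandardCharsets.ISO_8859_1);
    System.out.println("Addresses found: " + extractEmails(text));
  }
}
```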



Exercise

Insert proper error statements into the catch sections above.
Note where the following are caught:
  1. Bad URL syntax.
  2. Host does not exist.
    Insert a new catch to catch this one.
  3. Host exists but URL not found.
Some exceptions are generated by the URL constructor and others are generated by URL.openStream().
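One way the three cases could be separated, as a sketch rather than the expected answer. It relies on the fact that UnknownHostException and FileNotFoundException are both subclasses of IOException, so they must be caught before the general IOException. UrlErrors and classify are hypothetical names.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.UnknownHostException;

class UrlErrors
{
  // Tries to open the URL and names the kind of failure, if any.
  @SuppressWarnings("deprecation")   // new URL(String) is deprecated in recent Java versions
  static String classify ( String address )
  {
    URL url;
    try
    {
      url = new URL(address);                    // only checks the syntax
    }
    catch ( MalformedURLException e )
    {
      return "Bad URL syntax: " + e.getMessage();
    }

    try ( InputStream in = url.openStream() )    // this is where the network is used
    {
      return "OK";
    }
    catch ( UnknownHostException e )             // must come before IOException
    {
      return "Host does not exist: " + e.getMessage();
    }
    catch ( FileNotFoundException e )            // thrown for a HTTP 404
    {
      return "Host exists but URL not found: " + e.getMessage();
    }
    catch ( IOException e )
    {
      return "Other I/O error: " + e;
    }
  }

  public static void main ( String[] args )
  {
    for (String a : args)
    {
      System.out.println(a + " -> " + classify(a));
    }
  }
}
```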



URLConnection class - Get and parse HTTP headers only

From The Java Developers Almanac:


 

// get the HTTP headers 

import java.net.*;
import java.io.*;

public class jhttp
{
  public static void main ( String[] args ) throws IOException 
  {
    try 
    {    
      URL url = new URL( args[0] );

      URLConnection c = url.openConnection();
    
      for (int i=0; ; i++) 
      {
        String name = c.getHeaderFieldKey(i);
        String value = c.getHeaderField(i);
    
        if (name == null && value == null)     // end of headers
        {
          break;         
        }

        if (name == null)     // first line of headers
        {
          System.out.println("Server HTTP version, Response code:");
          System.out.println(value);
          System.out.print("\n");
        }
        else
        {
          System.out.println(name + "=" + value);
        }
      }
    } 
    catch (Exception e) 
    {
      System.out.println("Exception: " + e);     // e.g. bad URL syntax, unknown host
    }
  }
}

Output:


Server HTTP version, Response code:
HTTP/1.1 200 OK

Date=Mon, 22 Nov 2004 11:43:09 GMT
Server=Apache/2.0.47 (Unix) PHP/5.0.2
Last-Modified=Thu, 18 Nov 2004 10:32:20 GMT
ETag="19495e-3cd-e7abf500"
Accept-Ranges=bytes
Content-Length=973
Keep-Alive=timeout=15, max=100
Connection=Keep-Alive
Content-Type=text/html; charset=ISO-8859-1



HttpURLConnection class - Get HTTP return code only


 
 
import java.net.*;
import java.io.*;

public class hrc
{
  public static void main ( String[] args ) throws Exception 
  {
    try 
    {    
      URL url = new URL( args[0] );

      HttpURLConnection c = (HttpURLConnection) url.openConnection();
    
      System.out.println( c.getResponseCode() );
    } 
    catch (Exception e) 
    {
      System.out.println("Exception: " + e);     // e.g. bad URL syntax, unknown host
    }
  }
}


Output:


$ java hrc "https://humphrysfamilytree.com/surnames.html"
200

$ java hrc "https://humphrysfamilytree.com/djkjkjkll"
404

$ java hrc "https://humphrysfamilytree.com/Icons/"
403

$ java hrc "http://ddfdfdjjdg.com/surnames.html"
(Exception)




Page not found

If the file is not found you will normally get 404, though there are some other possibilities.

https://www.computing.dcu.ie/BADPAGE will give something like:


Server HTTP version, Response code:
HTTP/1.1 404 Not Found

Date=Mon, 22 Nov 2004 12:15:27 GMT
Server=Apache/2.0.47 (Unix) PHP/5.0.2
Content-Length=318
Keep-Alive=timeout=15, max=100
Connection=Keep-Alive
Content-Type=text/html; charset=iso-8859-1

Q. Write a program to check if a URL exists and return yes/no.
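A sketch of one approach, under the assumption that "exists" means "the server answers a HEAD request with a 2xx code". Other definitions are possible, and a site that answers missing pages with 200 will fool it. UrlExists and its methods are hypothetical names, and the 5 second timeouts are arbitrary.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

class UrlExists
{
  // Interpretation used in this sketch: any 2xx response counts as "exists".
  static boolean isSuccess ( int responseCode )
  {
    return responseCode >= 200 && responseCode < 300;
  }

  // Sends a HEAD request and reports whether the response code is 2xx.
  static boolean exists ( String address )
  {
    try
    {
      HttpURLConnection c = (HttpURLConnection) URI.create(address).toURL().openConnection();
      c.setRequestMethod("HEAD");          // headers only, no content download
      c.setConnectTimeout(5000);
      c.setReadTimeout(5000);
      return isSuccess(c.getResponseCode());
    }
    catch ( IOException | IllegalArgumentException | ClassCastException e )
    {
      return false;                        // bad URL, unknown host, not a http URL, ...
    }
  }

  public static void main ( String[] args )
  {
    for (String a : args)
    {
      System.out.println(a + ": " + (exists(a) ? "yes" : "no"));
    }
  }
}
```

Usage: java UrlExists "https://www.computing.dcu.ie/BADPAGE"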



HTTP response codes.


My 404 re-direct

On my site I catch 404 errors with a re-direct to a script.
Try:
 https://humphryscomputing.com/BADPAGE 
Do you get response 404 or 200?




Sites that restrict scripts

Some sites don't provide content to scripts, only to browsers. For example:

  1. Write a Java program to download the Google home page. This is ok.
  2. Write a Java program to download the result of a Google search. It will be blocked.

Solutions:

  1. Set "user agent" to pretend to be a browser (see below).
    • How to Fetch a page from Google
    • This is a bit cheeky, but should be ok if you don't hit the site too often. That is, the remote site is asking you not to hit them with a script many times. They won't mind the occasional scripted hit. But respect their wishes by making sure you don't hit them many times or they may block your IP address.

  2. Google Developers - The correct way to interact with Google via script.



Exercise

Use Java to make requests for:
  1. Google home page (with http)
  2. Google home page (with https)
  3. Google search (with http)
  4. Google search (with https)

  5. YouTube home page (with http)
  6. YouTube home page (with https)
  7. YouTube search (with http)
  8. YouTube search (with https)
And examine:
  1. The HTTP response code
  2. Whether you can actually get the payload with Java


How to set user agent to pretend to be a browser

Set the http.agent system property when starting Java:

On Windows:

$ java  "-Dhttp.agent=Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)"   prog

On Linux:

$ java  -Dhttp.agent="Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)"   prog
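The header can also be set inside the program, per connection, with setRequestProperty(). A minimal sketch follows. AgentSketch and withAgent are hypothetical names, and the agent string is the same one used on the command line above. Nothing is sent over the network here: openConnection() only prepares the connection.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLConnection;

class AgentSketch
{
  // Opens (but does not yet connect) a connection with the given User-Agent header.
  static URLConnection withAgent ( String address, String agent )
  {
    try
    {
      URLConnection c = URI.create(address).toURL().openConnection();
      c.setRequestProperty("User-Agent", agent);   // must be set before connecting
      return c;
    }
    catch ( IOException e )
    {
      throw new IllegalArgumentException("Cannot open " + address, e);
    }
  }

  public static void main ( String[] args )
  {
    String agent = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)";
    URLConnection c = withAgent("https://www.google.com/", agent);
    System.out.println("Will send User-Agent: " + c.getRequestProperty("User-Agent"));
  }
}
```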


