Dr. Mark Humphrys

School of Computing. Dublin City University.



Einstein - stock prices


Einstein - stock prices (stockanalysis.com)

Go on the Internet and get today's stock price.
Usage like:

  
We will use this site: stockanalysis.com
  
Do "View source" on the page and we see that the price is in a div like this:
 <div class="text-4xl font-bold ... ">298.79</div>

The bit in dots "..." seems to vary. Assuming the "text-4xl font-bold" bit is stable, this div can be identified and extracted.
Note our script will fail in the future if the site format changes.
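
As a rough illustration of why the stable part of the class is enough, the sketch below runs grep over three made-up lines. The extra class names ("block", "sm:inline", "text-sm") are assumptions for the example only, not taken from the real page.

```shell
# Three made-up lines: two price divs whose extra classes differ, and one unrelated div.
sample='<div class="text-sm">Alphabet</div>
<div class="text-4xl font-bold block">298.79</div>
<div class="text-4xl font-bold sm:inline">156.47</div>'

# grep keeps every line containing the stable text, whatever follows it in the class list.
matches=$(printf '%s\n' "$sample" | grep 'text-4xl font-bold')
echo "$matches"

# Count the matching lines: the unrelated div is not among them.
count=$(printf '%s\n' "$sample" | grep -c 'text-4xl font-bold')
echo "$count"
```

Both price lines match and the unrelated div is dropped, so the varying part of the class does not matter to grep.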

  

Lab exam needs

For reasons explained before, in the lab exam the script needs the extension .sh and must be called with "./" in front of it.
Also, for this lab we will call it getprice6.sh.

More lab exam needs - Reduce network requests

  

Getting started

On the command line (not in your script), run wget once to get the goog stock price page and save the output to a file called thepage.htm.

Now the structure of your program will be:

cat thepage.htm | [various modifications to extract price]

And we run it with no argument, just:

 ./getprice6.sh  
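
A minimal sketch of that starting point, assuming a one-line stand-in for thepage.htm so it runs without the network. The stand-in content is made up; the real file comes from your one-off wget.

```shell
# Stand-in for the saved page (made up for illustration).
printf '<div class="text-4xl font-bold">298.79</div>\n' > thepage.htm

# Write the script skeleton. Quoting 'EOF' stops the shell expanding anything inside.
cat > getprice6.sh <<'EOF'
#!/bin/sh
# Read the saved page. The extraction stages get appended to this line with "|".
cat thepage.htm
EOF

# Make it executable, then run it with no argument.
chmod +x getprice6.sh
./getprice6.sh
```

At this stage the script just prints the page back. Each recipe step is then added as one more stage on the pipe.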
  

Stop to debug

  

Recipe

Follow this recipe to extract the price.
  1. Pipe the data to tr to delete all Windows end-of-line characters (carriage returns).
  2. Pipe the result to tr to delete all newline characters (this puts the entire file onto a single line).

  3. Use sed to put a new line before each <div
    and a new line after each </div>
    For clues, see how to put new line before and after HTML tags.

  4. grep for "text-4xl font-bold"
    Now you should have just a single line like:
     <div class="text-4xl font-bold ... ">156.47</div> 

  5. Use sed to remove </div>

  6. Use sed to remove everything from start of line to >
    You want a regular expression of:
     Start of line - Any sequence of chars - > 


You could in fact extract the price in many different ways.
Any method will do, so long as it extracts the price.
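
One possible shape for the recipe is sketched below. It assumes GNU sed, which accepts \n in the replacement text, and it runs on a made-up stand-in for thepage.htm. The function name extract_price is only for illustration.

```shell
# Made-up stand-in page: Windows line endings, price div split across two lines.
printf '<html><body>\r\n<div class="x">Alphabet</div><div class="text-4xl font-bold block">\r\n156.47</div>\r\n</body></html>\r\n' > thepage.htm

extract_price() {
  cat thepage.htm |
    tr -d '\r' |                                    # 1. delete Windows end-of-line characters
    tr -d '\n' |                                    # 2. put the whole file on one line
    sed 's/<div/\n<div/g; s/<\/div>/<\/div>\n/g' |  # 3. newline before <div and after </div>
    grep 'text-4xl font-bold' |                     # 4. keep only the price line
    sed 's/<\/div>//' |                             # 5. remove </div>
    sed 's/^.*>//'                                  # 6. remove from start of line to >
}

price=$(extract_price)
echo "$price"
```

The expression in step 6 is greedy, so it removes everything up to the last > on the line. That is why </div> has to be removed first.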


Testing

Please do not debug it on Einstein! That is entirely the wrong way to think.

Run it locally.
Test it locally.
Debug it locally.

Debug each step of the recipe first, before moving to the next step.
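
One way to check a step before adding the next is to cut the pipeline short and count what comes out. This is a sketch only, using a made-up stand-in for thepage.htm and assuming GNU sed for the \n in the replacement.

```shell
# Made-up stand-in page with Windows line endings.
printf '<p>Alphabet</p>\r\n<div class="text-4xl font-bold">298.79</div>\r\n' > thepage.htm

# Check steps 1 and 2: with every newline deleted, wc -l should report 0.
newlines=$(cat thepage.htm | tr -d '\r' | tr -d '\n' | wc -l)
echo "newlines after step 2: $newlines"

# Check step 3 before adding grep: count the lines that now start with <div.
divs=$(cat thepage.htm | tr -d '\r' | tr -d '\n' |
  sed 's/<div/\n<div/g; s/<\/div>/<\/div>\n/g' | grep -c '^<div')
echo "lines starting with <div: $divs"
```

If a count is not what you expect, the problem is in the step you just added, not in a later one.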

When you are happy with it, upload it to Einstein.
Note the version you upload to Einstein will not be doing a wget.
Einstein will test it on a different file thepage.htm for a different stock (but with the same format).



Afterwards (optional)

After you have got your marks, you can fix up the program as follows:
  1. The program will do the wget every time it runs.
  2. Construct the URL based on the command-line argument.
  3. Remember the two types of quotes.
  4. Use wget to fetch the URL and then go straight to the pipe. No need to have an intermediate file thepage.htm
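
A hedged sketch of those changes follows. The base address and URL layout are hypothetical placeholders, not the real site's pattern, and the function names build_url and getprice are for illustration. The extraction stages assume GNU sed. getprice is defined but not called here, since calling it needs the network.

```shell
# Hypothetical address layout: replace with the real site's URL pattern.
base='https://example.com/stocks'

# Double quotes let the shell expand $base and $1 when the URL is built.
build_url() {
  echo "$base/$1/"
}

# Fetch and extract in one pipeline, with no intermediate file.
# -q hides wget's progress output, -O - writes the page to standard output.
getprice() {
  wget -q -O - "$(build_url "$1")" |
    tr -d '\r\n' |
    sed 's/<div/\n<div/g; s/<\/div>/<\/div>\n/g' |
    grep 'text-4xl font-bold' |
    sed 's/<\/div>//; s/^.*>//'
}

build_url goog        # prints https://example.com/stocks/goog/
echo '$base/goog/'    # single quotes: prints the literal text $base/goog/
```

The last two lines show the two types of quotes: variables are expanded inside double quotes and left as literal text inside single quotes.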
On a normal setup, it does not need .sh, it can be called anything, and it will be in the PATH.
So we can call it getprice and run it like:
  
