Einstein - stock prices (stockanalysis.com)
Go on the Internet and get today's stock price.
Usage like:
- getprice goog
- getprice msft
We will use this site: stockanalysis.com
Use "View source" on the page for a stock and we see that the price is in a section like this:
<div class="text-4xl font-bold ... ">298.79</div>
The bit in dots "..." seems to vary.
Assuming the "text-4xl font-bold" bit is stable,
this div can be identified and extracted.
Note our script will fail in the future if the site format changes.
Lab exam needs
For reasons explained before, in the lab exam the script needs the extension .sh and must be called with "./" in front of it.
Also, for this lab we will call it getprice6.
More lab exam needs - Reduce network requests
- To reduce network requests, for the lab exam we will not wget the page each time we run.
- Rather we will wget once into a file, and then work with that file.
- At the end, after we have got the marks, we can put wget back in the script.
Getting started
On the command-line (not in your script), run wget once to get the goog stock price page and output it to a file called thepage.htm
Now the structure of your program will be:
cat thepage.htm | [various modifications to extract price]
And we run it with no argument, just:
./getprice6.sh
Stop to debug
- Does thepage.htm contain the valid content of the goog page, inside of which is the actual stock price?
- If not, there is no point continuing further.
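One quick way to do this check is to look for the price marker in the saved file. This is only a sketch: the function name check_page and the file sample.htm are made up for illustration, and sample.htm is a tiny stand-in for the real thepage.htm.

```shell
#!/bin/bash
# Sketch: report whether a saved page contains the price marker.
# check_page is a hypothetical helper name, not part of the lab.
check_page() {
    # grep -q prints nothing; it only sets the exit status
    if grep -q 'text-4xl font-bold' "$1"; then
        echo "OK: price marker found in $1"
    else
        echo "PROBLEM: price marker not found in $1"
    fi
}

# Hypothetical stand-in for the downloaded page, for demonstration only
printf '<html>\n<div class="text-4xl font-bold ... ">298.79</div>\n</html>\n' > sample.htm

check_page sample.htm
```

On your own machine you would run the check against thepage.htm instead of sample.htm.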
Recipe
Follow this recipe to extract the price.
- Pipe the data to tr to delete all Windows end of line characters.
- Pipe the previous to tr to delete all new line characters (put the entire file onto a single line).
- Use sed to put a new line before each <div and a new line after each </div>
For clues, see how to put new line before and after HTML tags.
- grep for "text-4xl font-bold"
Now you should have just a single line like:
<div class="text-4xl font-bold ... ">156.47</div>
- Use sed to remove </div>
- Use sed to remove everything from start of line to >
You want a regular expression of:
Start of line - Any sequence of chars - >
You could in fact extract the price in many different ways.
Any method will do, so long as it extracts the price.
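As one possible reading of the recipe, the steps can be chained into a single pipeline. This is a sketch, not the only answer. It assumes GNU sed (where \n in the replacement text means a new line), and the function name extract_price and the file sample.htm are invented for the demonstration.

```shell
#!/bin/bash
# Sketch of the recipe as one pipeline; reads the page on standard input.
# Assumes GNU sed, where \n in the replacement inserts a new line.
extract_price() {
    # Step 1: delete Windows end of line characters (carriage returns)
    tr -d '\r' |
    # Step 2: delete new line characters, so the page is one long line
    tr -d '\n' |
    # Step 3: new line before each <div and after each </div>
    sed -e 's/<div/\n<div/g' -e 's/<\/div>/<\/div>\n/g' |
    # Step 4: keep only the line holding the price
    grep 'text-4xl font-bold' |
    # Step 5: remove the closing tag
    sed -e 's/<\/div>//' |
    # Step 6: remove everything from start of line to the last >
    sed -e 's/^.*>//'
}

# Hypothetical stand-in for thepage.htm, with Windows line endings
printf '<html>\r\n<div class="x">\r\n<div class="text-4xl font-bold ... ">156.47</div>\r\n</div>\r\n</html>\r\n' > sample.htm

cat sample.htm | extract_price    # prints 156.47
```

The last sed relies on .* being greedy: it matches up to the final > on the line, which is the one just before the price.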
Testing
Please do not debug it on Einstein!
That is entirely the wrong way to think.
Run it locally.
Test it locally.
Debug it locally.
Debug each step of the recipe first, before moving to the next step.
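For example, before adding any sed step you could confirm that the first two steps really leave the page on a single line. A minimal sketch, using a made-up helper count_newlines and a made-up test file sample.htm in place of thepage.htm:

```shell
#!/bin/bash
# Sketch: run only steps 1 and 2, then count the new line characters left.
# count_newlines is a hypothetical helper name; it reads standard input.
count_newlines() {
    # wc -l counts new line characters; the final tr strips the
    # padding spaces that some systems put around the number
    tr -d '\r' | tr -d '\n' | wc -l | tr -d ' '
}

# Hypothetical stand-in for thepage.htm, with Windows line endings
printf '<html>\r\n<div class="text-4xl font-bold ... ">156.47</div>\r\n</html>\r\n' > sample.htm

cat sample.htm | count_newlines    # prints 0

# Peek at the start of the single line to see what the next step will receive
cat sample.htm | tr -d '\r' | tr -d '\n' | head -c 40
echo
```

If the count is not 0, fix that step before going on to the next one.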
When you are happy with it, upload it to Einstein.
Note the version you upload to Einstein will not be doing a wget.
Einstein will test it on a different file thepage.htm for a different stock (but with the same format).
Afterwards (optional)
After you have got your marks, you can fix up the program as follows:
- The program will do the wget every time it runs.
- Construct the URL based on the command-line argument.
- Remember the two types of quotes.
- Use wget to fetch the URL and then go straight to the pipe.
No need to have an intermediate file thepage.htm
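A rough sketch of the URL construction and the fetch follows. The URL pattern is an assumption made for illustration, so check it against the real site. The function names build_url and fetch_page are also invented here. It takes "the two types of quotes" to mean single and double quotes.

```shell
#!/bin/bash
# Sketch: build the page URL from a stock symbol.
# The URL pattern below is an assumption, not confirmed by these notes.
build_url() {
    # Double quotes: $1 is replaced by the argument
    echo "https://stockanalysis.com/stocks/$1/"
}

# Single quotes: no replacement happens, $1 stays as literal text
echo 'https://stockanalysis.com/stocks/$1/'

# With double quotes the symbol is filled in
build_url goog

# Hypothetical helper: fetch the page and write it to standard output.
# -q keeps wget quiet, -O - sends the page to standard output, not a file.
# It is defined here but not called, so this sketch makes no network request.
fetch_page() {
    wget -q -O - "$(build_url "$1")"
}

# In the real script you would then pipe straight into the recipe:
#   fetch_page "$1" | [various modifications to extract price]
```

Because the page goes straight to standard output, the rest of the pipeline is unchanged from the version that started with cat thepage.htm.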
On a normal setup, it does not need .sh, it can be called anything, and it will be in the PATH.
So we can call it getprice and run it like:
- getprice goog
- getprice msft