Einstein - Change files
We are going to write a Shell script called "cweb"
to change one string to another in a group of files.
Usage:
cweb oldstring newstring (list of files)
For example, to change a string in a list of web pages:
cweb oldstring newstring *html
Setup
Get files to change
- To run this locally, you need some files to change.
- Download your own copy of the Shakespeare files.
- Then we can practice running the script on them.
Usage in lab exam
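In the special environment of the lab exam, we have some issues:
the program is a file called cweb3.sh, it sits in the same directory as the files it changes,
and we run it as ./cweb3.sh.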
Intro - See how "sed" works
Before even writing the program, first see how "sed" works:
- "cd" into the "macbeth" directory in your own copy of the Shakespeare files.
- Search for all occurrences of the string "Scotland":
grep Scotland *html
- Change "Scotland" to "Tralee" in the output stream by piping the output of grep into sed:
grep Scotland *html | sed -e "s|Scotland|Tralee|g"
- This changes the output stream on the command line. It does not change the files.
- Hint: The "g" is important. Why?
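The effect of the "g" can be tried out on a single line of text. This is only a minimal sketch: it assumes a throwaway file called sample.txt with made-up contents, not one of the Shakespeare files.

```shell
# Make a throwaway file; the name and contents are made up for this demo.
workdir=$(mktemp -d)
printf 'Scotland, poor Scotland\n' > "$workdir/sample.txt"

# Without "g": only the first match on each line is changed.
first_only=$(sed -e "s|Scotland|Tralee|" "$workdir/sample.txt")

# With "g": every match on each line is changed.
all_matches=$(sed -e "s|Scotland|Tralee|g" "$workdir/sample.txt")

echo "$first_only"
echo "$all_matches"

# sed wrote to the output stream only; the file itself is untouched.
unchanged=$(cat "$workdir/sample.txt")
echo "$unchanged"
```

Compare the first two lines of output with each other, and the third with the original file.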
Modify this script
Here is a script to get started.
Anything in [brackets] has to be fixed by you.
# Read in the arguments:
OLDSTRING=[first argument]
NEWSTRING=[second argument]
# the following gets rid of the first two arguments
shift
shift
# from now on, "all arguments" means "all arguments from 3rd one on"
# go through all arguments one by one:
for x in [all arguments]
do
echo "We need to change $OLDSTRING to $NEWSTRING in file $x"
ls -l "$x"
echo
done
As discussed above, for the special lab test environment, we put the program in the same directory as the files.
We could test with various Shakespeare files,
but I suggest we just test with the Macbeth files.
Go into your macbeth directory. Put the program in here and we will test it like:
./cweb3.sh Scotland Tralee complete.html
./cweb3.sh Scotland Tralee *html
It should just do an "ls" of the files in question and say which string needs to be changed to which,
without actually changing anything.
Hint: In Shell, there are no spaces around the "=" when assigning a variable.
Test the above locally. Do not move on until this is working.
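To see what "shift" does to the argument list, here is a small sketch separate from the cweb script. It assumes a hypothetical function called show_args, which has its own numbered arguments in the same way a script does, and the file names are made up.

```shell
# Hypothetical helper, not part of cweb: a function has its own $1, $2, ...
# so it can stand in for a script called with the same arguments.
show_args() {
    first=$1
    second=$2
    shift
    shift
    # After two shifts, "$@" holds only what used to be the 3rd argument onward.
    for x in "$@"
    do
        echo "first=$first second=$second file=$x"
    done
}

# The last file name contains a space on purpose.
output=$(show_args old new a.html "b c.html")
echo "$output"
```

Quoting "$@" keeps a file name with a space in it as one argument, so the loop runs twice here, not three times.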
Make new file
We will now try to make a new file with the string changed.
- After the "ls -l" in the script, we will insert a line like this:
cat "$x" | [sed command] > tmpfile
- We "cat" the contents of the file.
- Send that via a pipe to "sed" to change one string to another.
- "sed" should change the old string to the new string, whatever they are.
It should not be hardwired to change Scotland to Tralee.
- The output goes to a temporary file called tmpfile
Test it with:
./cweb3.sh Scotland Tralee complete.html
./cweb3.sh Scotland Tralee *html
Check the size of the temporary file (compared to the size of the original).
Check whether it has the old string changed to the new string.
If the size of the temporary file is zero, check what the "sed" string actually is (echo the sed string).
Top tip: Remember the two types of quotes.
Test the above locally. Do not move on until this is working.
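The two types of quotes behave differently when the sed string contains variables. A minimal sketch, using made-up values in place of the script's arguments and echoing each sed string so you can see what sed would actually receive:

```shell
# Made-up values standing in for the script's variables.
OLDSTRING=Scotland
NEWSTRING=Tralee

# Double quotes: the shell substitutes the variables before sed sees the string.
double_quoted="s|$OLDSTRING|$NEWSTRING|g"

# Single quotes: the shell leaves the text alone, so sed would look for
# the literal characters $OLDSTRING.
single_quoted='s|$OLDSTRING|$NEWSTRING|g'

echo "$double_quoted"
echo "$single_quoted"

# Run both versions on the same line of input.
with_double=$(echo "King of Scotland" | sed -e "$double_quoted")
with_single=$(echo "King of Scotland" | sed -e "$single_quoted")
echo "$with_double"
echo "$with_single"
```

Only the double-quoted version changes the input line; the single-quoted one finds nothing to replace.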
Finish
- When you are confident the temporary file is good, edit the script.
Use file commands to copy the temporary file back to the original file.
- And do another "ls -l" to show the file after it has been changed.
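The temporary-file step can be tried in isolation before editing the script. This is a sketch under stated assumptions: it works in a scratch directory on a made-up file called page.html, hardwires the two strings for brevity, and uses cp as one possible file command.

```shell
# Work in a scratch directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'Hail, king of Scotland!\n' > page.html   # made-up test file

x=page.html
ls -l "$x"

# Write the changed text to a temporary file first...
cat "$x" | sed -e "s|Scotland|Tralee|g" > tmpfile

# ...then copy it over the original. cp is one possible file command here;
# mv would also work and removes tmpfile in the same step.
cp tmpfile "$x"
rm tmpfile

ls -l "$x"
result=$(cat "$x")
echo "$result"
```

The temporary file is needed because redirecting the output straight back onto the original file would empty it: the shell truncates the output file before the command gets to read it.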
Full test before upload:
- In the macbeth directory, search for occurrences of the string "Scotland" in one file:
grep Scotland complete.html
- Change all occurrences of "Scotland" to "Tralee":
./cweb3.sh Scotland Tralee complete.html
- Search for occurrences of "Scotland". Should be gone.
- Search for occurrences of "Tralee". Should be lots.
- Change all occurrences of "Tralee" back to "Scotland".
- Search for occurrences of "Scotland". Should be lots.
- Search for occurrences of "Tralee". Should be gone.
- When that is working, test with multiple files:
./cweb3.sh Scotland Tralee *html
- When that is working, upload to Einstein for marks.
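The round trip above can also be checked with counts rather than by eye. A rough sketch, assuming a made-up file test.html in a scratch directory and a hypothetical function called change standing in for the finished script:

```shell
workdir=$(mktemp -d)
printf 'Scotland\nO Scotland, Scotland!\nEngland\n' > "$workdir/test.html"

# Hypothetical stand-in for the finished script: change $1 to $2 in file $3.
change() {
    sed -e "s|$1|$2|g" "$3" > "$workdir/tmpfile"
    cp "$workdir/tmpfile" "$3"
}

# grep -c counts matching lines (not individual matches). It exits non-zero
# when nothing matches, so "|| true" keeps that from counting as a failure.
change Scotland Tralee "$workdir/test.html"
scotland_after=$(grep -c Scotland "$workdir/test.html" || true)
tralee_after=$(grep -c Tralee "$workdir/test.html" || true)

change Tralee Scotland "$workdir/test.html"
scotland_back=$(grep -c Scotland "$workdir/test.html" || true)
tralee_back=$(grep -c Tralee "$workdir/test.html" || true)

echo "after change: Scotland=$scotland_after Tralee=$tralee_after"
echo "after undo:   Scotland=$scotland_back Tralee=$tralee_back"
```

One caveat: the round trip only restores the original if the new string did not already occur in the file before the first change.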
This script is very useful
Imagine using this script to change one string to another in 1,000 web pages without having to open any editors (or indeed do any work).
In a normal environment, the program can have any name (it does not have to end in .sh)
and can live in your bin directory, on the PATH.
We then call it like:
cweb oldstring newstring *html */*html
Or, if we hard-code into it where all your web pages are, we could do a version that is called like this:
cweb oldstring newstring
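Calling a program by its bare name works once it is executable and sits in a directory on the PATH. A minimal sketch, assuming a scratch directory in place of your real bin directory and a placeholder script in place of the real cweb:

```shell
# Scratch directory standing in for your own bin directory.
bindir=$(mktemp -d)

# A placeholder script; the real cweb would go here instead.
printf '#!/bin/sh\necho "cweb called with $# arguments"\n' > "$bindir/cweb"
chmod +x "$bindir/cweb"

# Put the directory on the PATH for this shell session only.
PATH="$bindir:$PATH"

# No "./" and no ".sh" needed now.
output=$(cweb oldstring newstring a.html b.html)
echo "$output"
```

The PATH change here lasts only for the current shell session.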