School of Computing. Dublin City University.
File - A named section of disk.
Files implementation:
Not necessarily a contiguous section of disk (but that fact may be hidden from users and programs).
Normally both user and programmer never deal with disk directly,
but only by calling named files.
In some high-performance applications (e.g. writing a high-speed search engine), you may need to implement your own file system, but this is difficult and full of dangers.
Windows file system can spread over multiple pieces of hardware. Each given its own (single-letter) drive:
drive:\dir\file
Can also partition a single piece of hardware into multiple drives.
UNIX file system can spread over multiple pieces of hardware too.
But everything appears as sub-directories of a single file hierarchy.
Path may indicate hardware, something equivalent to:
/drive/dir/file
or may hide hardware entirely:
/dir/file
Can organise files in separate dirs
(Many web authors seem not to have discovered sub-dirs!).
Crucial to keep user files separate from system files (Why?).
Windows
C:\Users\me
UNIX
$HOME
Can reuse same file names in different sub-dirs (like index.html).
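A minimal sketch of reusing a file name in different sub-dirs. The directory names photos and notes are made up for illustration, and everything happens in a scratch directory created with mktemp.

```shell
# Sketch: the same file name reused in two sub-directories.
# "photos" and "notes" are hypothetical directory names.
site=$(mktemp -d)                       # scratch directory, nothing real is touched
mkdir "$site/photos" "$site/notes"
echo "photo index" > "$site/photos/index.html"
echo "notes index" > "$site/notes/index.html"

# Both files are called index.html, but their full paths differ,
# so they do not clash and keep their own contents.
index_count=$(find "$site" -name index.html | wc -l | tr -d ' ')
photos_text=$(cat "$site/photos/index.html")
notes_text=$(cat "$site/notes/index.html")
echo "index.html files found: $index_count"

rm -rf "$site"                          # clean up the scratch directory
```

The full path, not the file name alone, is what identifies the file.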
photos.kenya.apr.1963.html
Legacy systems:
phka0463.htm
Short file names are good, though, for:
Maybe short URLs:
http://en.wikipedia.org/wiki/Othello
make the web a more pleasant experience than long URLs:
http://dmoz-odp.org/Arts/Literature/World_Literature/British/Shakespeare/Works/Plays/Tragedies/Othello/.
It is nice to have short, "guessable" URLs.
See "URL as UI"
See URL shortening.
(Used e.g. on Twitter.)
Q. Is there still a problem with that URL?
Some web server set-ups generate super-complex URLs, which can then get pasted into documents.
[Image: this is apparently a real ad.]
Can selectively break the hierarchy with shortcuts.
ln -s dir shortcut
or in Windows see "Create Shortcut".
e.g. On one system I used, there was no /bin dir:
$ ls -l /bin
lrwxrwxrwx 1 root root 9 Apr 14 1997 /bin -> ./usr/bin
ln -s file secondname
Or have pointer in one dir to file in other dir.
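The two ln -s forms above can be tried safely in a scratch directory. This is only a sketch: the names dir, shortcut, file and secondname follow the notes, while the scratch directory and the file contents are assumptions for the demo.

```shell
# Sketch: a shortcut to a directory and a second name for a file.
work=$(mktemp -d)                       # scratch directory for the demo
mkdir "$work/dir"
echo "hello" > "$work/dir/file"

ln -s "$work/dir" "$work/shortcut"          # shortcut to a directory
ln -s "$work/dir/file" "$work/secondname"   # second name for a file

ls -l "$work"                           # links are listed as  name -> target

# Reading through either link reaches the same file.
via_dir_link=$(cat "$work/shortcut/file")
via_file_link=$(cat "$work/secondname")
# readlink prints where a link points, without following it.
link_target=$(readlink "$work/secondname")
echo "secondname -> $link_target"

rm -rf "$work"                          # removes the links, not anything outside
```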
On DCU Linux you will see lots of pointers:
$ ls -l /bin | grep '^l'
$ ls -l /usr/bin | grep '^l'
$ ls -l /usr/bin/touch
lrwxrwxrwx 1 root root 10 May 1 2020 /usr/bin/touch -> /bin/touch
/bin/ls -> /usr/bin/ls
Q. Why do programs sometimes call a specific path to a program, e.g. they call /bin/ls rather than just ls ?
With shortcuts, a recursive search of the disk can get into an infinite loop, or at least produce duplicates. e.g. List all files on disk: if you follow symbolic links, you may list files twice.
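A small sketch of the duplication problem, using find. By default find does not follow symbolic links; the -L option tells it to follow them. The directory names a and b are made up for the demo.

```shell
# Sketch: following symbolic links makes one file appear twice.
work=$(mktemp -d)                       # scratch directory for the demo
mkdir "$work/a"
echo "data" > "$work/a/file.txt"
ln -s "$work/a" "$work/b"               # b is a shortcut to a

# Default: links are not followed, so the file is listed once.
plain_count=$(find "$work" -type f | wc -l | tr -d ' ')

# -L: links are followed, so the same file shows up under a/ and under b/.
followed_count=$(find -L "$work" -type f | wc -l | tr -d ' ')

echo "without following links: $plain_count"
echo "following links:         $followed_count"

rm -rf "$work"
```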
Q. Also, if you delete a file, do you delete the symbolic links to it?
If so, how do you find them - do you have reverse directory of them?
Also, I make symbolic link to other user's file.
They delete file. They can't delete my link.
A. If link doesn't work, so what.
Might even leave it dangling as reminder.
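A sketch of a dangling link, assuming made-up names their-file and my-link. The test operator -L checks that the link itself exists, while -e follows the link and so fails once the target is gone.

```shell
# Sketch: what happens to a symbolic link when its target is deleted.
work=$(mktemp -d)                       # scratch directory for the demo
echo "theirs" > "$work/their-file"
ln -s "$work/their-file" "$work/my-link"

rm "$work/their-file"                   # the owner deletes the file

# The link is still there (-L), but it no longer leads anywhere (-e fails).
if [ -L "$work/my-link" ] && [ ! -e "$work/my-link" ]; then
    link_state="dangling"
else
    link_state="ok"
fi
echo "my-link is $link_state"

rm -rf "$work"
```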
If your directory is accessible by others on your local machine, someone on your machine can make it readable by the world on the Web (either maliciously or accidentally):
cd /homes/your-userid/public_html
ln -s /homes/other-userid/dir shortcut
The world can then read other user's directory through:
http://host/~your-userid/shortcut/
Has valid uses too. Might want to make one of your own dirs visible without having to have it under public_html, e.g. public_html disk is full, dir is on another disk.
Another example - ftp may only drop you in home directory rather than root directory and you may not be able to go upwards. What you do is put symbolic links in your home directory and you can access any directory through them:
ln -s /var/mail email
ln -s /htdocs ht
General conclusion is that a basic hierarchy, with some cross-links for difficult points, is an excellent way to structure complex data (e.g. Open Directory) - rather than a total cross-link free-for-all on one hand (e.g. the Web with just search engines and no directories), or a rigid hierarchy on the other (e.g. Dewey library system).
Interestingly, family trees are also basically hierarchical, with arbitrary cross-links, rather than strictly hierarchical as many people seem to think.
If it's data (1's and 0's), there's no real excuse for losing it. You can make automated copies and store them all over the world. Disk space is big and cheap. Machines are often idle. The network is always on. Backups can be automated across the network by scripts.
In future, backup and long-term storage will be an increasingly important service, like a bank.
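As a rough sketch of a scripted backup: the script below archives a directory into a dated tar file and then lists the archive to check it. The source and destination here are scratch directories standing in for real ones, and thesis.txt is a made-up file. In practice a script like this might be run on a schedule (e.g. from cron) with the destination on another machine.

```shell
# Sketch: a dated backup archive, made and then checked.
src=$(mktemp -d)                        # stands in for the directory to protect
dest=$(mktemp -d)                       # stands in for a backup disk or remote mount
echo "important" > "$src/thesis.txt"    # hypothetical file to back up

stamp=$(date +%Y-%m-%d)                 # one archive per day, named by date
archive="$dest/backup-$stamp.tar.gz"

# -C changes into the source first, so the archive holds relative paths.
tar -czf "$archive" -C "$src" .

# Listing the archive is a cheap check that the backup is readable.
listing=$(tar -tzf "$archive")
echo "$listing"

rm -rf "$src" "$dest"                   # clean up the scratch directories
```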
Which of these is the most dangerous: