Introduction to HTML
Website Structure
Copyright © 2000 - 2002 Randy D. Ralph.  All rights reserved.
In place June 7, 2000.

The public component of a website is really a repository for information files which are delivered to the client on demand.  The better organized the information is, the easier the site will be to build, manage and maintain.  The key to good organization is a good system for filing information.

Directory Structure -

Just as information is filed in folders in a file cabinet drawer for ease of storage and access, information is stored within a website file directory structure for the same purposes.  The directory structure consists of a system of virtual file folders each containing a portion of the data within the website (see diagram below).

Website Directory Structure

A three-tiered website directory structure, like the hypothetical one in the diagram above, can be displayed as if it were an organization chart because files are stored in hierarchically arranged directories.  What is not intuitively obvious about hierarchical directory structures is that lower level directories are contained entirely within the upper level directories.  In the example above there are only three directories at the root level.  All subordinate directories and files are completely contained within these three primary directories.

In this example the entire website "MY STUFF" is contained within the root (topmost) directory.  A website home page generally resides in the root directory.  Lower level web pages, which generally comprise the bulk of the content of a website, reside within the appropriate subdirectories.

File Paths -

The route that must be taken down through a hierarchical directory structure to locate a file within a subordinate directory is called a path.  An analogy might be useful to demonstrate the concept of a path.

If you were in the front foyer (root level) of your house and wanted to get a jar of jam on the second shelf of the refrigerator in the kitchen, the path to the jar of jam might be diagrammed like this:

foyer > kitchen > refrigerator > second shelf > jar of jam

or, in computer parlance:

/foyer/kitchen/refrigerator/second shelf/jar of jam

Extending this simple logic to the example website directory structure shown in the diagram above, suppose that there were a graphics file named toms_wedding.gif stored in directory Tom.  The full path to this file would be:

/MY STUFF/Photos/Friends/Tom/toms_wedding.gif

Note that spaces, special characters and capitalization count and must match exactly!  File paths on the Internet are almost always case sensitive, because the Unix/Linux systems which pervade the Internet treat Photos and photos as two different names.  Note also that the directory names are separated by a forward slash / rather than the backward slash \ familiar from DOS paths on microcomputers.  When the ultimate target of a path is itself a directory rather than a file (unlike the example above, which ends in a file name), it is good form to end the path with a forward slash as well, for example /MY STUFF/Photos/Friends/Tom/
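The points above can be tried out in a short Python sketch.  It is only an illustration: it uses the standard pathlib and urllib.parse modules to take the example path apart, to show that a differently capitalized path is a different path, and to show how the space in MY STUFF would be written once the path is placed in a URL.  The all-lowercase variant of the path is made up for the comparison.

```python
from pathlib import PurePosixPath
from urllib.parse import quote

# The example path from the text, written with forward slashes.
path = PurePosixPath("/MY STUFF/Photos/Friends/Tom/toms_wedding.gif")

# Each step down the hierarchy is one component of the path.
components = path.parts
file_name = path.name
parent_dir = str(path.parent)

# A made-up lowercase variant: on a case-sensitive system it is a different path.
same_path = path == PurePosixPath("/my stuff/photos/friends/tom/toms_wedding.gif")

# A literal space is not allowed in a URL; it is percent-encoded as %20.
url_path = quote(str(path))

print(components)
print(file_name)
print(parent_dir)
print(same_path)
print(url_path)
```

The comparison prints False, which is the reason a link typed with the wrong capitalization fails on a case-sensitive server.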

Proper paths to the content of a website are critical because they are integral to the URLs that are used to point users to the right stuff.  URLs which contain incorrect path information are often referred to as broken or malformed and will lead users into dead ends.
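As a rough sketch of how broken paths might be caught before users find them, the Python below compares a list of link paths against an inventory of the files a site contains.  The inventory, the link list and the function name find_broken_paths are all hypothetical, and the check is a plain exact-match test, so a difference in capitalization or a missing letter counts as broken.

```python
# Hypothetical inventory of the files a site actually contains.
site_files = {
    "/index.html",
    "/books/index.html",
    "/photos/index.html",
}


def find_broken_paths(link_paths, existing_files):
    """Return the link paths that match no existing file exactly."""
    return [p for p in link_paths if p not in existing_files]


# Hypothetical links: one correct, one with wrong capitalization,
# one with a wrong file extension.
links = ["/books/index.html", "/Books/index.html", "/photos/index.htm"]
broken = find_broken_paths(links, site_files)
print(broken)
```

Only the first link survives the check; the other two would lead users into dead ends.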

Accretion vs. Planning -

Most novice web authors make the mistake of allowing their websites to grow by simple accretion.  They add more HTML documents as their site grows.  They often make the mistake of placing all the HTML documents, and the files that support them, in a single root level directory under non-mnemonic names.  As time goes by the content of this directory becomes more and more complex and, eventually, arcane.

Rather than allowing a website to grow by accretion, it is far better to organize the site from the outset into a clear directory structure which can accommodate all the content logically.  All that needs to be done is to develop an outline of the prospective site content and to mirror this in the directory structure. 

In the examples of website structure shown below, the first (labeled Acceptable) places all the content in a single root level directory and uses mnemonic file names for constituent HTML documents.  That is OK for small and uncomplicated websites, but for larger, more complex websites it shortly becomes inadequate and makes maintenance and management much more difficult.

In the second structure (labeled Better), there are directories for each category of information within the website.  This does not require that HTML documents receive mnemonic names because each is contained within its own directory.  Note that all the HTML documents in the website are called simply index.html.

Acceptable:

/mystuff.org
    books.html
    index.html
    photos.html
    records.html
 
 
Content of the site resides entirely within the root directory.  Content is managed by giving mnemonic names to HTML documents which contain various components of the website.  This makes URLs to the content less obvious and makes the organization of the site difficult for a user to follow.
 
Better:

/mystuff.org
    index.html
    /books
        index.html
    /photos
        index.html
    /records
        index.html
 
 
Content of the site resides in a directory structure which mirrors the structure of the information.  Content is managed by segregating it into a sensible directory structure.  This makes URLs to the content plain and the organization of the site much easier for users to follow.
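A layout like the Better example above can be generated straight from an outline of the site.  The Python below is a minimal sketch under stated assumptions: the outline list and the function name build_site are hypothetical, the index files are created empty, and everything is built in a temporary directory so that nothing on a real site is touched.

```python
import tempfile
from pathlib import Path

# Hypothetical outline of the site content, taken from the Better layout.
outline = ["books", "photos", "records"]


def build_site(root, sections):
    """Create one directory per section, each with its own index.html."""
    root = Path(root)
    root.mkdir(parents=True, exist_ok=True)
    (root / "index.html").touch()  # home page in the root directory
    for section in sections:
        section_dir = root / section
        section_dir.mkdir(exist_ok=True)
        (section_dir / "index.html").touch()
    # Report every index file, relative to the site root.
    return sorted(p.relative_to(root).as_posix() for p in root.rglob("index.html"))


with tempfile.TemporaryDirectory() as tmp:
    created = build_site(Path(tmp) / "mystuff.org", outline)

print(created)
```

Adding a new category to the site then amounts to adding one entry to the outline.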
 

File Naming Conventions -

There are magic (index or default) HTML document file names defined on most servers.  Generally the file names index.html, index.htm, home.html and home.htm are default names which will load automatically when a URL points to a directory name rather than a discrete file.  Using the default file names makes URLs to the content on a website more intelligible to the user and easier to remember.  Additionally, the default index file intercepts any user who enters the directory and forces display of the content it contains rather than, as happens on many servers, a bare listing of the files in the directory.

In the Better example above note that the default index file name is used in every directory of the website.  This means that URLs which address the content need only reference the directory in which information resides.  Shown below are the URLs which would be required to address information on the websites in the examples above:

Acceptable:

http://www.mystuff.org/books.html

Better:

http://www.mystuff.org/books/
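The way a server turns a directory URL into a file can be pictured with a small Python sketch.  This is not how any particular server is implemented: the function name resolve and the file inventory are hypothetical, and the order in which the default names are tried is an assumption, since the actual list and its order are set in each server's configuration.

```python
# Default document names listed in the text; the order is an assumption.
DEFAULT_NAMES = ["index.html", "index.htm", "home.html", "home.htm"]


def resolve(url_path, existing_files, default_names=DEFAULT_NAMES):
    """Map a URL path to a file, trying default names for directory paths."""
    if not url_path.endswith("/"):
        # The path names a discrete file: it either exists or it does not.
        return url_path if url_path in existing_files else None
    for name in default_names:
        candidate = url_path + name
        if candidate in existing_files:
            return candidate
    return None  # a directory with no default document


# Hypothetical inventory of files on the site.
files = {"/index.html", "/books/index.html", "/photos/home.htm"}

print(resolve("/books/", files))
print(resolve("/photos/", files))
print(resolve("/records/", files))
```

With this inventory, /books/ resolves to /books/index.html and /photos/ to /photos/home.htm, while /records/ resolves to nothing because no default document exists there.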

One of the difficulties novice web authors have is understanding index files and how they work.  They can't understand that multiple files can have the same name but different content if they reside within different directories on a website.  An analogy might be helpful.

Consider how many men named Smith may live in your neighborhood.  They all share the same name but you have no difficulty in telling one Smith from another.  Why?  Because they all look different (have different content) and live in different houses (directories).  So what's the problem with index files all bearing the same name - index.html - but having different content and residing in different directories on your site?  None!

A word of warning.  If you use a directory structure to contain content on your website and use default file names for index files in each directory, then, like Luke Skywalker, you must know where you are and what you're doing all the time.  A good rule of thumb is to download code from the server to the correct local directory before modifying it.  That way, you are assured of having the most recent copy available to you and you do not run the risk of overwriting existing code on the server with older versions or with index files from other directories.

Another word of warning.  No matter how diligent and careful you are, the day will dawn when, before consuming your first cup of coffee, you mistakenly overwrite good code on the server with bad code from your local copy.  Be prepared.  Make backups of your entire website at least quarterly to permanent storage media so that you can repair damage quickly and easily. 
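A backup of this kind can be as simple as one dated archive of the whole site directory.  The Python below is a minimal sketch: the function name backup_site, the directory names and the fixed date are hypothetical, and the demonstration runs entirely inside a temporary directory.  It relies on shutil.make_archive from the standard library.

```python
import shutil
import tempfile
from datetime import date
from pathlib import Path


def backup_site(site_dir, backup_dir, today=None):
    """Archive the whole site directory into a dated .zip file."""
    today = today or date.today()
    site_dir = Path(site_dir)
    # make_archive appends the .zip extension to this base name itself.
    base_name = Path(backup_dir) / f"{site_dir.name}-{today.isoformat()}"
    archive = shutil.make_archive(str(base_name), "zip", root_dir=str(site_dir))
    return Path(archive)


with tempfile.TemporaryDirectory() as tmp:
    # Build a tiny stand-in site to back up.
    site = Path(tmp) / "mystuff.org"
    (site / "books").mkdir(parents=True)
    (site / "index.html").write_text("<html></html>")
    (site / "books" / "index.html").write_text("<html></html>")

    backups = Path(tmp) / "backups"
    backups.mkdir()

    archive = backup_site(site, backups, today=date(2000, 6, 7))
    archive_name = archive.name
    archive_exists = archive.exists()

print(archive_name)
```

Because the date is part of the file name, successive backups sit side by side instead of overwriting one another.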
