Artronic Development (ARDE)

Crafting and Maintaining a Useful Web Site

David W. Eaton
602-953-0336
dwe@arde.com
Artronic Development
4848 E. Cactus Rd. - Suite 505-224
Scottsdale, Arizona 85254
http://www.arde.com
April 1996

Introduction

Building a Web site can be a rewarding business advantage for your company. It can deliver your message to customers, suppliers and potential employees as well as provide a wealth of information to your staff.

Working with this relatively new technology can be enjoyable. Building a few HTML pages and posting them is not very difficult; however, crafting and maintaining a useful Web site is no small task. It takes knowledge, skill, and substantial time. Foresight and planning are the keys to success, and awareness of current trends and tools helps ensure your Web information can be read by your audience.

What Will We Cover?

This presentation assumes you already have a working knowledge of the Web and HTML and moves on to provide guidance for your planning and implementation process. It includes recommendations to help you avoid some of the common pitfalls, such as long unwieldy pages and sites that are hard to navigate or cannot be read with some browsers or over slow networks. It also suggests some tools to use that can help develop and test your site to ensure it is one people will want to visit.

Where Will We Get?

Since the technology is still developing, the very nature of the Web continues to change. It is necessary to grow with the technology so that your site can continue to provide its benefits. It won't be long before what is discussed here is out of date.

Be sure to allocate appropriate resources to the project and allow proper training and time to research new developments to keep a site current. Alternatively, all or part of your Web site activities could be outsourced, as is being done by companies both large and small.


General Guidance

Planning

Perhaps the most obvious key to building a good Web site is to know what it is you want to deliver. Determine whether you want to build a simple company billboard or an intricate set of product specifications and information. User input and interaction are found frequently on Web sites, but they require an additional level of planning and implementation. Determine how much time and energy you can allocate to your Web site project, then proceed.

Plan your site's logical layout. A simple outline is helpful when writing a paper, but the dynamics of a Web site demand more. For a site to be easy to navigate, you need to plan not only its contents but also the hyperlinks your visitors will need and expect. Key information buried a half dozen links away from your welcome page may never be found before frustrated users give up and assume the information is not there. While a Table of Contents can improve navigation, carefully laid out hierarchical links with appropriate cross-reference links can lead readers quickly to what they need. Avoid excessive linking. It becomes distracting and makes it harder for readers to maintain their train of thought. Link new terms to pages that explain them, but resist linking every occurrence of those terms.

Plan your site's physical layout. Although logical links can ping-pong back and forth across directory and machine boundaries, you may find server configuration easier if you provide reader access authorizations based on directory structures. Try keeping material beneath a common directory entry if it requires the same authentication before it can be read. Similarly, if you have different groups providing content to different portions of your Web site, controlling author access with file system permissions will be easier if you have grouped those portions into appropriate subdirectories.

Document your intentions. Particularly if you have multiple authors and administrators, it is important to let them know your site's ground rules and objectives. This will help retain the same "look and feel" of your site across departmental boundaries. These guidelines are easier to find and reference if you make them part of a private section of your Web site with appropriate authentication configured for those who need the information.

Implementation

Use templates. Perhaps one of the easiest ways to maintain a consistent page appearance is to provide authors with a template they can use for creating new pages. In addition to providing the required HTML structure, it can include site-specific lines such as the <LINK ..> tag used to identify a page owner:
  <LINK REV="made" HREF="mailto:webmaster@your.domain.name">
Not all browsers honor this tag, but it is a good idea to provide one. Since some Web users are not set up to send e-mail, they cannot use the "mailto" anchor. Thus, you may also want to provide a hyperlink to a page which allows them to submit comments via a Web form.

Maintain your hyperlinks. This can be one of the most time-consuming and often forgotten tasks. If you decide to rearrange your site and move pages from one location to another, leave information at the old locations to direct readers to the new URLs. Even if you have corrected all the links on your own site (and we would hope you have), cached versions of your pages, individual "hotlists" or "bookmarks", and explicit links on pages at other sites may still contain the old URLs. Watch your log files and remove the re-directing pages only after references to them have dropped below a level you determine is acceptable. If you know other sites reference a page you have moved, send e-mail to alert them of the change. You might be able to determine the correct person by using your server "referer_log" to determine how people reached your pages, then read the referring pages to determine an owner. If you still are not certain who is the correct person, try mailing to "webmaster" at the domain where the page is located. Most sites honor this convention and have someone to respond to such mail.
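The referer_log review described above can be partly automated. The following is a minimal Python sketch, not a finished tool. It assumes each log line has the form "REFERRING_URL -> /local/path"; that layout, the function name referers_for, and the sample host names are all assumptions made for illustration, so adjust the parsing to match what your server actually writes.

```python
from collections import Counter

def referers_for(log_lines, moved_path):
    """Count the referring URLs that still point at one moved page.

    Assumes each line looks like 'REFERRING_URL -> /local/path'
    (an assumed referer_log layout; adjust for your server).
    """
    counts = Counter()
    for line in log_lines:
        referer, sep, target = line.strip().partition(" -> ")
        # Skip lines that lack the separator or refer to other pages.
        if sep and target == moved_path:
            counts[referer] += 1
    # Most frequent referrers first: these owners are worth contacting.
    return counts.most_common()

# Hypothetical sample data standing in for a real log file.
sample_log = [
    "http://other.example/links.html -> /old/products.html",
    "http://other.example/links.html -> /old/products.html",
    "http://third.example/index.html -> /old/products.html",
    "http://other.example/links.html -> /welcome.html",
]

for referer, hits in referers_for(sample_log, "/old/products.html"):
    print(hits, referer)
```

The pages listed first send the most readers to the old URL, so their owners are the ones to notify before the re-directing page is removed.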

Some off-site cross reference problems can be reduced by keeping your links pointing to the highest appropriate level of the other site and letting that site provide navigation to the lower level details. Since the overall architecture of a site changes less frequently than the names and locations of individual pages, there is less chance your links will become obsolete. You can increase the likelihood you will be notified of changes if you first seek permission from the page owner to create the link. (After all, it is usually in their best interest for readers of your page to be able to continue to find theirs.)

Metrics

Sites always want to know if their information is getting to people who need it. Web servers can be configured to keep logs of parameters such as accesses, errors, references, and other data. Some servers come with report routines; NCSA sites can use add-on packages such as Roy Fielding's wwwstat at http://www.ics.uci.edu/WebSoft/wwwstat/. Be careful how you interpret the analysis, however. Some browsers cache the pages you've accessed so they can be displayed more quickly the next time, without requiring network access. Some sites require that browsers pass through a caching server before accessing the global Internet for the same reason. This means that the "hits" recorded in your logs will understate the actual number of people who accessed your pages. Of course, there is no way to tell whether an access really means someone read a page, and some accesses are from "robots" searching the Web. Understanding these subtleties will help you interpret the logs.
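As a rough illustration of why raw "hits" need interpretation, the Python sketch below counts successful page requests and the number of distinct requesting hosts per page. It assumes the access log is written in the Common Log Format (host, two identity fields, a bracketed date, the quoted request, a status code, and a byte count); that format and the function name summarize are assumptions, not something every server guarantees.

```python
import re
from collections import defaultdict

# One Common Log Format line, for example:
# host - - [26/Apr/1996:10:00:00 -0700] "GET /page.html HTTP/1.0" 200 2048
LOG_LINE = re.compile(
    r'^(\S+) \S+ \S+ \[[^\]]+\] "(\S+) (\S+)[^"]*" (\d{3}) \S+$'
)

def summarize(lines):
    """Return {path: (hits, distinct_hosts)} for successful GET requests."""
    hits = defaultdict(int)
    hosts = defaultdict(set)
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if not match:
            continue  # ignore lines that are not in the assumed format
        host, method, path, status = match.groups()
        if method == "GET" and status == "200":
            hits[path] += 1
            hosts[path].add(host)
    return {path: (hits[path], len(hosts[path])) for path in hits}
```

Even the distinct host count is only an approximation: a caching server shows up as a single host no matter how many readers sit behind it, and a robot looks like any other visitor.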

On-line Friendliness

Among other things, the Web is a publication medium. However, electronic publication is subject to different constraints and guidelines than print. The combination of network bandwidth, transmission speed, screen size, and attention span is better served by small chunks of concise information that reference more detailed information. This is consistent with techniques taught by leading electronic documentation companies and organizations such as Information Mapping, Inc.

The on-line format should be simple and easy to follow. Appropriate use of bulleted lists rather than prose will cut size and enhance readability. (OK, I know I didn't do that here. This originally was prepared for publication where a limited amount of page real estate was available. I apologize.) As a rule of thumb, try to keep pages to two or three screens (or fewer) of related information. Once the content has been written, go back and select the appropriate words to use as hyperlink anchors to related pages. This will help ensure that your content will read smoothly, yet allow those who want more detailed information to obtain it quickly. Avoid adding special words such as "click here for more info". They cause a break in the reader's train of thought. Smaller pages of focused information also make it easier for other pages to link to the content.

As you craft your pages, strive more for appropriate information structure than for a particular visual appearance of the page layout. Different browsers will render the components of your page in different manners. For example, some may center level 1 headings (<H1>) or make them all capital letters. Others will use a larger font size. With some software, the reader can specify the font style, size and color according to their liking. These features may be particularly important for those with vision problems or color blindness. Time spent on page layout may be lost if the reader's browser displays pages in ways you have not foreseen.


Testing

Another difference between electronic presentation media and print is that electronic pages require testing to ensure your reader sees the information you intend.

Be aware of browsers

As mentioned above, different brands of browsers may display the same page in different ways. In fact, different versions or different platform ports of the same brand of browser may display the page differently. If you can control the browsers your readers will use, you can limit your testing. In the case of most public Web sites, however, you will need to accommodate a wide variety of browsers, including some, such as Lynx, which cannot display graphics.

For example, a brief analysis of the percentage of browsers used most often to access the InterWorks Web site over several months shows:

 Month      Netscape   NCSA Mosaic   Lynx   Other
 Feb 1996     70.1         8.1        2.4    19.4
 Jan 1996     72.5         9.7        2.9    14.9
 Dec 1995     60.2        10.9        2.5    26.4
 Nov 1995     64.2        11.3        1.9    22.6
 Oct 1995     69.0        16.2        2.3    12.5
 Sep 1995     61.3        19.2        3.3    16.2

The "Other" accesses to the InterWorks pages were spread across more than 15 different browsers and more than a dozen robot search engines. While it is impractical to test with each of the dozens of possible browsers, knowing which ones are used most frequently will help avoid problems for the majority of your readers. Adhering to the standards set by the World Wide Web Consortium is the best way to assure your Web page can be read by the broadest set of Web users.

Some of the most common problem areas are also some of the "flashiest":

Your server logs may tell you what browsers are accessing your site, but they can't tell you the characteristics of the systems being used. Preformatted information that displays nicely on your high resolution screen may be too wide and require awkward scrolling on a small low resolution screen. High color images may look very poor on systems with low quality screens or small video memory. Some graphics may be virtually invisible on a monochrome screen since different colors may display in the same shade of grey.

Broken Links

You should periodically test your links to find out if they are still valid. This can be as simple as manually traversing a small site and ensuring that all the links still work and display the information you intended for your reader to find. (Even if the page is still there, its contents may have changed.) On a large site, you may want to automate this process. There are tools available on the net to help verify links still point to something (though it may not be what you intended). One of the simplest methods is to search your site for all off-site links and massage that list into a command file to use a tool such as Jeffrey Friedl's webget (was httpget) perl script. Run the command file and review the results for errors, then make the necessary corrections.
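The first step described above, collecting the off-site links from a page, can be sketched in Python as follows. This is an illustration under stated assumptions rather than a replacement for a tool such as webget: the class and function names are hypothetical, and the sketch only builds the list of URLs. Actually fetching each URL and reviewing the results still has to be done separately.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

class LinkCollector(HTMLParser):
    """Collect the HREF value of every anchor tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser reports tag and attribute names in lower case.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def off_site_links(html_text, own_host):
    """Return links whose host differs from own_host, without duplicates."""
    collector = LinkCollector()
    collector.feed(html_text)
    result = []
    for href in collector.links:
        host = urlparse(href).hostname  # None for relative and mailto links
        if host and host != own_host.lower() and href not in result:
            result.append(href)
    return result
```

Running this over every page on a site and merging the results gives the list of off-site URLs to verify.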

HTML syntax errors

It may not be as difficult as writing program code, but the volume of incorrect HTML on the Web is testimony to the fact that errors can be made. Most browsers are very forgiving, so some errors may not be noticed by simple inspection through your favorite browser. However, other browsers may display strange results or no page at all. Pay particular attention to your HTML tags and be certain they conform to the appropriate specification. Even if you are using an HTML authoring tool, it is a good idea to check the current syntax since the specification is still undergoing changes. You may want to bring up your pages in a browser such as Arena, which can identify and flag improper HTML, or run them through a syntax checker such as Neil Bowers' weblint, which can be found at several locations on the Web.

Graphics Issues

Graphics can enhance a presentation, provide increased understanding, and make it pleasant to view a page. However, improper and inappropriate graphics can not only spoil the look of a page but also keep people from reading your pages at all.

Don't load a page down with large or excessive numbers of graphics. Keep those you do use relevant to the material and small enough to download quickly even over slower transmission lines. The Bandwidth Conservation Revival page was a good reference site for ideas concerning keeping images smaller without losing their integrity.

Never forget that some browsers can't display graphics and some users of graphical browsers turn off graphics download until and unless they find something they really want to view. Always use the ALT attribute to provide those readers with a meaningful indication of the information they might be missing.
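A check for missing ALT text is easy to automate. The Python sketch below lists the SRC of every IMG tag that has no ALT attribute or only an empty one. The names AltChecker and images_missing_alt are hypothetical, and treating an empty ALT as a problem is a choice made here to match the advice above about providing a meaningful indication.

```python
from html.parser import HTMLParser

class AltChecker(HTMLParser):
    """Record the SRC of each IMG tag lacking non-empty ALT text."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            attr_map = dict(attrs)
            # A valueless or absent ALT arrives as None; treat as empty.
            if not (attr_map.get("alt") or "").strip():
                self.missing.append(attr_map.get("src"))

def images_missing_alt(html_text):
    """Return the SRC values of images without meaningful ALT text."""
    checker = AltChecker()
    checker.feed(html_text)
    return checker.missing
```

A page author could run this over each new page before posting it and fill in the ALT text for anything it reports.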

Consider displaying a small (in bytes as well as in screen real estate) thumbnail of a large graphic and let readers decide to select it if they want the full size image. If it is very large, provide an indication of its size beside or as part of the anchor that references the full image.

The IMG tag allows for optional HEIGHT and WIDTH attributes, but how these are used may depend on the reader's browser. Some browsers fetch the complete text and graphics before anything is displayed. Some display all the text, leaving a place for the graphics encountered, then go back and start filling in the images. If HEIGHT and WIDTH attributes were specified, this type of browser usually allocates enough screen space for the graphics so that the page won't "jump" as the graphics files are processed.

Not all browsers treat these attributes the same, however. Some reserve the space indicated, then if the image is actually a different size they re-allocate the page for the true image size. This causes that page jump, but displays the image as it was built. Other browsers use the provided HEIGHT and WIDTH values and rescale the actual image to fit within that provided value. This is fine if that's what you wanted, but if you really provided a new image and forgot to go back and correct the HTML with the new values, some of your readers may be seeing some very strange pages.
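One way to catch stale HEIGHT and WIDTH values is to compare them with the size stored in the image file itself. The Python sketch below does this for GIF files only, relying on the fact that a GIF header carries the logical screen width and height as two little-endian 16-bit values immediately after the six-byte signature. The function names are hypothetical, and other image formats would need their own readers.

```python
import struct

def gif_size(data):
    """Return (width, height) read from the first 10 bytes of a GIF file."""
    if len(data) < 10 or data[:6] not in (b"GIF87a", b"GIF89a"):
        raise ValueError("not a GIF file")
    # Bytes 6-9 hold width and height as unsigned little-endian shorts.
    width, height = struct.unpack("<HH", data[6:10])
    return width, height

def size_matches(data, declared_width, declared_height):
    """True if the WIDTH and HEIGHT written in the HTML match the file."""
    return gif_size(data) == (declared_width, declared_height)
```

For a real file, the header can be read with open(path, "rb").read(10) and passed to size_matches along with the values from the IMG tag.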

A misuse of this feature (that I have found on more than a few sites) is to send a very large image, tie up the line time, then make my browser scale and dither it down to what should have been provided as a small "thumbnail" in the first place. On a re-scaling browser the visual result may be the same, but the access time is much longer. On a re-allocating browser, the reader gets to look at the giant image amidst the text, often with a note to "click here to view full size image." Please don't let this happen to your pages.

Graphics are very effective when used as image maps. The reader chooses what they want to see next by selecting the appropriate location on the screen. But be certain to provide text methods for reaching the same information or you will again lose some readership. Also, resist making a graphic which is nothing but words scattered around and used as an image map. If you don't really have a graphical presentation, use a text approach.

Equally irritating and wasteful of bandwidth and reader time are graphic-only representations of a presentation designed for an overhead projector when the presentation consists of titles and bulleted lists with a fancy border around them. Consider providing these in a simple text form, perhaps with a small graphic logo at the top or bottom.

When you are mixing text and graphics on a single line, experiment with the ALIGN=MIDDLE attribute. This may make your page read more smoothly and display more cleanly.

Watch for emerging graphics standards. Some browsers already support the Portable Network Graphics (PNG) draft standard and others are making plans to do so.


Specific Techniques

Forms

Forms are a wonderful way to solicit information from your readers. Always make it clear what "button" or selection actually submits, dispatches, or sends the form. It is also a good idea to provide a button that clears the form so the user can start over.

Be certain your form is concise and easy to follow. Remember to view it with several different sized windows to be certain your reader will see what you expect. It is usually best to request specific answers from check boxes, radio buttons, or pull-down lists when possible, rather than text entry. This will make analysis and error checking easier, too.

Using explicit default selections and text entries can make the form simpler for your reader, or can steer them toward the choices you would prefer while still providing alternatives.

Remembering to request the submitter's e-mail address will make it easier for you to get back to them, but expect to get quite a few that are incorrect or that at least have typos in them. Saving the user's host name as part of the response may help you detect and correct some of these typos.

Always provide feedback to your user when the submission has been completed. Echoing their selections back to the screen will allow them to print or save the entry for future reference in case they wish to follow up. If your processing script detects an error, try to be as explicit as you can. This will make it easier for them to make the correction. You may even want to have the error at the top of a regenerated form in which you supply the answers they provided. They can then make the correction and continue, rather than needing to go "back" and re-submit.
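The advice about explicit error messages can be sketched as a small checking routine in a form-processing script. The Python below is only an illustration: the field names "name" and "email", the function name check_submission, and the very loose e-mail test are all assumptions, and a real script would check whatever fields its own form defines.

```python
def check_submission(fields):
    """Return a list of explicit error messages; empty if all is well.

    'fields' maps form field names to the submitted text. The field
    names used here ('name', 'email') are hypothetical.
    """
    errors = []
    name = fields.get("name", "").strip()
    email = fields.get("email", "").strip()

    if not name:
        errors.append("Please enter your name.")

    # Loose plausibility test only: something@host.domain, no spaces.
    local, at, domain = email.partition("@")
    if not at or not local or "." not in domain or " " in email:
        errors.append(
            f"The e-mail address {email!r} does not look complete; "
            "expected something like user@host.domain."
        )
    return errors
```

If the list comes back non-empty, the script can print the messages at the top of a regenerated form that already contains the submitted answers, so the reader corrects only what is wrong.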

Provide an appropriate means to exit the form and proceed to some meaningful spot or at least to your welcome page.

Subordinate information

Some pages use the unordered list tag (<UL>) without its corresponding list item tag (<LI>) in order to try to provide indented paragraphs. Resist this. While it may work sometimes, it is incorrect HTML and some browsers will provide a bullet on the first text following the UL anyway, making your page look strange. If what you really have is an index or glossary, consider using the <DL>, <DT>, and <DD> construct to provide it. While we are discussing lists, remember that the list item tag (<LI>) does require a surrounding qualifier to specify whether it is an unordered (bullet) list (<UL>,</UL>) or an ordered (numeric) list (<OL>,</OL>).
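Both list mistakes described above can be detected mechanically. The Python sketch below reports a UL or OL that closes without containing any LI, and an LI that appears outside any UL or OL. It is a simplified check with hypothetical names, not a full syntax checker such as weblint, and it does not try to handle mismatched closing tags.

```python
from html.parser import HTMLParser

class ListChecker(HTMLParser):
    """Flag <UL>/<OL> without <LI>, and <LI> outside any list."""

    LIST_TAGS = {"ul", "ol"}

    def __init__(self):
        super().__init__()
        self.open_lists = []  # one [tag, li_count] entry per open list
        self.problems = []

    def handle_starttag(self, tag, attrs):
        if tag in self.LIST_TAGS:
            self.open_lists.append([tag, 0])
        elif tag == "li":
            if self.open_lists:
                self.open_lists[-1][1] += 1
            else:
                self.problems.append("<LI> outside <UL> or <OL>")

    def handle_endtag(self, tag):
        if tag in self.LIST_TAGS and self.open_lists:
            open_tag, li_count = self.open_lists.pop()
            if li_count == 0:
                self.problems.append(f"<{open_tag.upper()}> without any <LI>")

def list_problems(html_text):
    """Return a list of list-structure problems found in the page."""
    checker = ListChecker()
    checker.feed(html_text)
    return checker.problems
```

A page that uses UL purely for indentation is reported as a list without any LI, which is the cue to restructure it or switch to the DL construct.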

What extensions are safe to use?

This is a loaded question. It really depends on what you need to convey, who your audience is, and what transpires as the Web changes and grows. Some suggestions have been made already in this document, but a few highlights are listed below.

Backgrounds are becoming more commonplace. Provided you recognize that some viewers won't see them, and they are not integral to the understanding of the content of your page, they should be fine. Be aware that dark backgrounds with white letters will be harder for readers to print and the printed copy may look strange. Finally, always be aware that some people try very hard to set up their browser with color schemes pleasing to them. Trying to force your color choices on them may not be welcomed. In fact, the wrong color schemes may make it impossible for a color blind person to read your page as you built it.

The proposed HTML3 attribute ALIGN="CENTER" will display as you want it to on more browsers than will the Netscape <CENTER> tag.

The HTML3 figure tag may result in lost information or double image displays on some older browsers.

Embedded image maps can show the user each URL which will be selected, but not all browsers support it yet. Provide the traditional method as well to be safe.

Few browsers support Java applets yet. If you use Java, at least provide alternate text so the rest of your readers know where they can find the information the applet is providing.

Restricted access

If you are going to link to a restricted portion of your Web site from a non-restricted portion, be certain to inform your readers of this and tell them who may read the restricted portions and what they must do to gain entry. If you are using a password scheme, be prepared to field queries from those who have forgotten their password. Tell them who to contact to get the password or to have theirs reset.

Are We There Yet?

It should be obvious by now that the answer is "No." The Web is still changing and will continue to do so. As long as it does, there's not a "there" to be and the journey isn't completed. The best you can strive for is a site that provides the content and image you want today and is ready for tomorrow's changes.

Be prepared to invest resources, time and energy keeping up to date as this roller coaster reaches new heights and takes new turns. Most of all, enjoy the ride.


Where is this paper?

This presentation was prepared by Artronic Development for the benefit of our prospective clients and our associates. An outline and an abstract are available for your convenience. This is an HTML document located at URL "http://www.arde.com/Papers/CraftWeb/paper.html".

To learn how the Web can help improve business, please refer to our paper titled "Business Case for a Web Site", which is also on this Web site.


(Originally written 26 APR 1996; updated 26 NOV 2004)
Contact webmaster@www.arde.com with questions or problems. Copyright 1996 Artronic Development