Math on the Web:
HTML compared to LaTeX

Goals to accomplish during class

  1. Increase your familiarity with HTML
  2. Learn how to build links from one Web page to others
  3. Learn how to make images appear on a Web page


There is a very important distinction between the two markup languages that we have met, namely, LaTeX and HTML: LaTeX is designed to produce typeset documents whereas HTML is designed to allow a web browser to load documents quickly and then display them on a computer monitor.

In particular, the term "hypertext" in the name HTML refers to the features in HTML that allow a document to contain "links" to other documents (or files, such as graphics files, sound files and programs).

Something to keep in mind when you are designing a web page: the person who views it will lose interest if it takes too long to load. Also keep in mind that a document will load most slowly at the times when the most people are using the web.

While reviewing some features of HTML, we will look briefly at how the two languages differ in the way they handle some things.


Basic Features of HTML

Containers

We will refer to a matched pair of HTML tags (a start tag and matching end tag) as a container. For example, the <EM> </EM> tags taken together form a container into which we can pour some text that we want emphasized. This has a parallel in LaTeX with the command \emph, which is followed by text enclosed in curly brackets. You have probably noticed other parallels between constructions in LaTeX and constructions in HTML, and you can find many more. However, the consistent and explicit use of containers is one feature that distinguishes HTML from LaTeX.
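To make the parallel concrete, here is a minimal sketch of the EM container in use. The sentence itself is only an illustration, and the LaTeX counterpart is given in an HTML comment for comparison.

```html
<!-- LaTeX counterpart, for comparison: This word is \emph{important}. -->
<P>This word is <EM>important</EM>.</P>
```

Most browsers render the contents of the EM container in italics, although the exact appearance is left to the browser.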

Tags and Attributes

Actually, the pair of angle brackets, < and >, also is a kind of container (although it is not generally referred to as a container in discussions of HTML). The simplest tags that we have seen just have the tag name as the contents of the start tag and the tag name preceded by a slash as the contents of the end tag, as in the example of <H1> </H1>, where the tag name is H1. This is used to make a large boldface "heading", called a level 1 heading. The syntax <tag-name> looks extremely clumsy at first, and if you are new to HTML, it is probably not at all obvious to you yet why it was chosen.

By default, the level 1 heading (like the others, levels 2 through 6, the smallest) is placed against the left margin. If you want to center the heading instead, you can make a fancier start tag with the name H1 and with the "attribute" ALIGN, to which you give the "value" CENTER this way: <H1 ALIGN=CENTER>. The end tag is still just </H1>. End tags never take attributes. Similarly, you can produce a whole paragraph of centered text by using the tag <P ALIGN=CENTER>. Here is a short example.

     <P ALIGN=CENTER>
         This is a very short centered paragraph.
     </P>
 

It is rendered as

This is a very short centered paragraph.

NOTE: The </P> closing tag may be omitted. This is because browsers understand that when they encounter a <P> tag, it means that the previous paragraph has ended. However, since HTML now allows certain attributes to be assigned to the <P> tag, it may be a good idea to include the closing tag.

Spacing and Text Flow

Normally, the text in an HTML container is allowed to "flow" when it is displayed by the web browser, which means that a carriage-return is rendered as a single space in the web browser window, and that any amount of white space in the source file is rendered by the browser as a single space. This is unlike LaTeX, where a blank line or several blank lines in the source file will mark the transition to a new paragraph. As in LaTeX, though, horizontal white space in the source file does not matter, and you can make your lines any length in the source file without changing at all how the file is displayed.
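Here is a small sketch of text flow, using made-up sentences. The first container holds extra spaces, line breaks, and even a blank line in the source.

```html
<P>
This      sentence   has   extra   spaces
and
several line breaks in the source.

A blank line in the source does not start a new paragraph.
</P>
<P>A new paragraph needs its own P tag.</P>
```

The browser displays the contents of the first container as one flowing paragraph with single spaces between the words. Only the second <P> tag starts a new paragraph.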

In the browser window, lines of text flow from the left margin to the right margin. A short line is left justified by default. However, in a centered paragraph, a short line is centered. If you want to see the lines of text on this page get shorter, just use your mouse to grab the corner box of the Netscape window (mouse down) and move the border to make the window narrower.

Text flow in HTML allows you to format your source file to make the logical structure clear. You can give each heading its own line, and you can put in extra blank lines to mark paragraphs and sections of the document.

You should make a mental note of the structure <tag-name attribute-name=value>. We will see this used for other purposes soon: for making links to text in the same document, for links to other documents or to graphics, and for putting graphics in-line.

Notice that the special use made of < and > in HTML means that whenever you want to have these symbols "printed" on the screen, you have to refer to them a bit obliquely. Namely, the left angle bracket (or less than sign) < is "printed" by using &lt;, and similarly, for > use &gt;. Such a construct is called a character entity. The ampersand, &, also has a special role for the browser, so is displayed with the character entity &amp;. The character ; does not have a special role, so you can use it for punctuation, as usual.
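As an illustration, the following sketch uses all three character entities in one paragraph. The sentences are invented for the example.

```html
<P>
    A start tag looks like &lt;EM&gt; and an end tag looks like &lt;/EM&gt;.
    Write AT&amp;T with an entity for the ampersand; the semicolon here is
    ordinary punctuation.
</P>
```

In the browser window the first sentence shows the tags literally, as <EM> and </EM>, instead of emphasizing anything, and the second sentence shows AT&T.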

Parts of an HTML Document

Let's take time to look at the most basic constituents of an HTML document.

The Head

This container is the top part of your document. It should contain at least the title of the document. It may contain other information, as we shall see later.

The Title

It is easy to guess what goes in this container. The document title is typically displayed in the title bar at the top of the browser window, not inside the window itself. The title is also what is displayed on everybody's bookmark list, so your title should be descriptive, and relatively short -- at most 64 characters. A title is also used to identify your page for search engines (such as Google). Make sure that your title is not entirely context dependent. For example, the title "Introduction" is not so hot. Better: "Math on the Web: Introduction" or "Math 640-499: Introduction".

The Body

The body is the second--and usually largest--part of your HTML document. It contains the essential content of your document (displayed within your browser window).
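Putting the three pieces together, a bare skeleton might look like the sketch below. The title is one of the suggestions given above, and the body text is a placeholder. This skeleton leaves out the extra top matter that a validator expects.

```html
<HTML>
<HEAD>
<TITLE>Math on the Web: Introduction</TITLE>
</HEAD>
<BODY>
<H1>Introduction</H1>
<P>The content shown in the browser window goes here.</P>
</BODY>
</HTML>
```

Notice that the HEAD and BODY containers sit side by side inside the HTML container, and that the TITLE container sits inside the HEAD.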


In-class Activities

Personalize your Web page

If you have not already put personal information in the Web page that you created last week in this course, please do that now. The page should have a better title, and it should have some information that pertains only to you. You may want to take a few of your favorite URLs from your bookmarks file and make them into a list.

Make a link to another document

It is not difficult to make a link to another document if you have its URL (Uniform Resource Locator). For example, suppose we want to provide the reader of our Web page with a link to the World Wide Web Consortium's (W3C) HTML validator service. We put an "anchor" tag <A> to make a container, and in that container we put some text to describe the target page. The anchor tag takes a couple of attributes. The one that we need here is the HREF attribute, and we give it the URL as its value, placed inside of quotation marks. Here is the example:
   <A HREF="http://validator.w3.org">HTML validator</A>
This is rendered as: HTML validator

The text inside the container is underlined and rendered in a color that depends on the browser, the user's preferences, and whether the URL has been visited by the browser recently.

Put a link to the above Web page (or any other Web page that you like) in the Web page that you built last week.
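If you would like to collect several links into a list, one possible sketch follows. It assumes the UL (unordered list) and LI (list item) containers, which we have not discussed here, and the second URL is only an example destination.

```html
<UL>
    <LI><A HREF="http://validator.w3.org">HTML validator</A></LI>
    <LI><A HREF="http://www.rutgers.edu">Rutgers University</A></LI>
</UL>
```

Each list item holds one anchor container, so every entry in the list is a separate link.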

Put an Image in your World-Wide Web page

You can provide a link to an image in the same way as our link to another document, provided the image is given by a file in one of the standard graphics formats: GIF (Graphics Interchange Format, invented by CompuServe), JPEG (Joint Photographic Experts Group), or a few others that are now supported by some browsers. The type of the file is usually indicated by the extension .gif or .jpeg (or .jpg on Windows machines).
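For instance, a link of this kind to a GIF file could be written as in the sketch below. The descriptive text is our own choice, and the URL is assumed to point at an image file that still exists.

```html
<A HREF="http://www.rutgers.edu/images/header.gif">Rutgers header image (GIF)</A>
```

The reader sees only the descriptive text until the link is clicked, at which point the browser fetches and displays the image file.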

However, instead of having the reader click some descriptive text, you can have the image loaded with the rest of the page. It's then called an "in-line image". The tag is <IMG>, and it has no matching end tag. Here's how you get the Rutgers home-page header on your Web page.

<IMG SRC="http://www.rutgers.edu/images/header.gif"  ALT="Rutgers Header GIF">
It produces the image below:

[Image: Rutgers Header GIF]

You should put this image or another one in the Web page that you started last week.

About the ALT attribute: the ALT="Rutgers Header GIF" part of the tag supplies text that is shown whenever the browser cannot display the image. Browsers do not need this unless there is a problem with displaying the image, but if you leave it out, the W3C validator will report that your web page is not valid HTML 4.01 Transitional.

If you try to validate your web page now, you will find that some things are missing. Your web page needs some extra top matter, like that in the HTML source of the page you are now viewing, which is

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
            "http://www.w3.org/TR/html4/loose.dtd">

<HTML>
<HEAD>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<TITLE>Math on the Web -- HTML and LaTeX compared</TITLE>
</HEAD>
This is followed by <BODY>, etc. After you have inserted material like that above into your HTML source file, you should try the validator link given above to see whether what you wrote is strictly legal according to the W3C (that is, the World Wide Web Consortium).
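As a rough sketch of how the pieces fit together, here is a complete small page that combines the top matter with a link and an in-line image. The title and body text are placeholders, and we have not run this exact file through the validator, so treat it as a starting point and not as a guaranteed pass.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
            "http://www.w3.org/TR/html4/loose.dtd">
<HTML>
<HEAD>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<TITLE>Math on the Web: My Page</TITLE>
</HEAD>
<BODY>
<H1 ALIGN=CENTER>My Page</H1>
<P>
    Check this page with the
    <A HREF="http://validator.w3.org">HTML validator</A>.
</P>
<P>
    <IMG SRC="http://www.rutgers.edu/images/header.gif" ALT="Rutgers Header GIF">
</P>
</BODY>
</HTML>
```

The DOCTYPE line comes first, before the HTML container, and every image carries an ALT attribute.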

Homework for after class

Of course, you may start on the homework during class if you have time.

Read further in HTML, the Definitive Guide (Musciano and Kennedy). You should go as far as section 2 of chapter 5.

Use your browser to look at the source code for some web pages. You can look under the VIEW menu in most browsers, and select SOURCE or PAGE SOURCE. This is one of the best ways to learn how web pages are designed, but you may find the complexity of some web pages overwhelming, so look for some fairly simple examples. For now, avoid frames and cascading style sheets.


Email: Martin L. Karel