Wuff, Wuff!!: http://www.webdevout.net

3/16/2009

http://www.webdevout.net

Web Development Tools - very good webdev resources

The W3C Link Checker, developed by the same organization that standardized HTML and CSS, is a free and open source web-based tool that scans your website and reports any broken links it finds, along with other relevant information. You can also specify how many levels of linked pages it should follow.
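To make the idea of checking links to a given depth concrete, here is a minimal sketch in Python. It is not how the W3C Link Checker is implemented. The names `LinkExtractor`, `find_broken_links` and the `fetch` callable are hypothetical, and `fetch` is assumed to return a `(status_code, html_text)` pair so the crawl logic can be tried without touching the network.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkExtractor(HTMLParser):
    """Collects the href values of <a> tags found in an HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def find_broken_links(start_url, fetch, max_depth=1):
    """Breadth-first link check starting at start_url.

    fetch(url) is a hypothetical callable returning (status_code, html_text).
    max_depth=0 checks only the links on the start page; each extra level
    also checks the links on the pages reached so far.
    Returns a list of (page_url, broken_target_url) pairs.
    """
    cache = {}

    def get(url):
        # Fetch each URL at most once.
        if url not in cache:
            cache[url] = fetch(url)
        return cache[url]

    broken = []
    visited = set()
    queue = [(start_url, 0)]
    while queue:
        page_url, depth = queue.pop(0)
        if page_url in visited:
            continue
        visited.add(page_url)
        status, html = get(page_url)
        if status >= 400:
            # Only the start page can get here; other pages are
            # checked before they are queued.
            broken.append((None, page_url))
            continue
        extractor = LinkExtractor()
        extractor.feed(html)
        for href in extractor.links:
            # Resolve relative links and drop any #fragment.
            target, _fragment = urldefrag(urljoin(page_url, href))
            if urlparse(target).scheme not in ("http", "https"):
                continue  # skip mailto:, javascript:, and so on
            target_status, _ = get(target)
            if target_status >= 400:
                broken.append((page_url, target))
            elif depth < max_depth:
                queue.append((target, depth + 1))
    return broken


# A tiny in-memory "site" standing in for real HTTP requests.
PAGES = {
    "http://example.test/": (200, '<a href="a.html">A</a> <a href="/missing">gone</a>'),
    "http://example.test/a.html": (200, '<a href="b.html#top">B</a> <a href="/">home</a>'),
    "http://example.test/b.html": (200, '<a href="dead.html">dead</a>'),
}


def fake_fetch(url):
    return PAGES.get(url, (404, ""))
```

Calling `find_broken_links("http://example.test/", fake_fetch, max_depth=2)` walks all three pages and reports the two dead targets. A real checker would also stay within one site, respect robots.txt, and report redirects, none of which this sketch attempts.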

The HTML Validator extension for Firefox uses a local HTML validation tool to validate every page you visit, with the results displayed as an icon in the status bar and with a summary and highlights in the View Source window. This can help you conveniently identify errors in the markup without needing to consult the W3C validator every time. It's available for both Windows and Linux and has a very small performance impact.

Version 0.7.x uses a local install of HTML Tidy as its backend, which is less reliable than the W3C validator. However, as of version 0.8 (still in beta at the time of writing), there is an optional SGML mode that uses the same backend as the W3C validator and produces very accurate results.

Validity and Well-formedness

Introduction to validity
SGML documents, including XML documents, should come with a DTD, usually referenced through a simple doctype declaration. The DTD tells the user agent which elements and attributes may exist in the document and where they may appear. If an element occurs at an unexpected place in the document, or has an attribute that violates the rules set in the DTD, this is called a “validation error”. A “valid” document is one that conforms to the rules specified in the DTD, as well as to the basic SGML parsing rules.
When you run a webpage through the W3C HTML Validator, it is checking for validity.
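As a rough illustration of what "conforming to the rules in the DTD" means, the following Python sketch checks a document against a made-up rule table. The `RULES` table and the `validation_errors` function are hypothetical stand-ins: a real DTD also constrains the order and number of child elements, and a real validator such as the W3C's reads the actual DTD rather than a hand-written dictionary.

```python
import xml.etree.ElementTree as ET

# Toy stand-in for a DTD: for each element, the child elements and
# attributes it may have. These rules are invented for illustration.
RULES = {
    "html": {"children": {"head", "body"}, "attrs": set()},
    "head": {"children": {"title"}, "attrs": set()},
    "title": {"children": set(), "attrs": set()},
    "body": {"children": {"p"}, "attrs": {"class"}},
    "p": {"children": {"em"}, "attrs": {"class"}},
    "em": {"children": set(), "attrs": set()},
}


def validation_errors(element, errors=None):
    """Return a list of messages for every rule the element tree violates."""
    if errors is None:
        errors = []
    rule = RULES.get(element.tag)
    if rule is None:
        errors.append(f"unknown element <{element.tag}>")
        return errors
    for attr in element.attrib:
        if attr not in rule["attrs"]:
            errors.append(f"attribute '{attr}' not allowed on <{element.tag}>")
    for child in element:
        # Unknown children are reported by the recursive call instead.
        if child.tag in RULES and child.tag not in rule["children"]:
            errors.append(f"<{child.tag}> not allowed inside <{element.tag}>")
        validation_errors(child, errors)
    return errors


GOOD = ET.fromstring(
    '<html><head><title>t</title></head>'
    '<body><p class="x">hi <em>there</em></p></body></html>'
)
BAD = ET.fromstring(
    '<html><body><title>oops</title><p align="left">hi</p></body></html>'
)
```

Here `validation_errors(GOOD)` returns an empty list, while `validation_errors(BAD)` reports a misplaced `<title>` and a disallowed `align` attribute. Both inputs are well-formed; only the second breaks the (toy) rules.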

Introduction to well-formedness
XML was designed to have an extra set of rules called “well-formedness” rules. Well-formedness has nothing to do with the types of elements and attributes in the document. Instead, it is a basic syntax which all XML documents must follow. It deals with the individual characters which delimit tags, attributes, processing instructions, marked sections, character data, etc.

User agent requirements with XML
When a user agent (such as a web browser) parses an XML document, it is supposed to check for well-formedness. If it comes across any well-formedness error, the user agent immediately quits trying to parse the page, and it will sometimes display a parse error message instead.
Although user agents are supposed to check for well-formedness, they are not required to check for validity, and web browsers usually don't.
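The behavior described above can be observed with any conforming XML parser. The sketch below uses Python's standard `xml.etree.ElementTree` module; the helper name `well_formedness_error` is an assumption made for this example. The parser stops at the first well-formedness error, and, like most browsers, it does not check validity at all.

```python
import xml.etree.ElementTree as ET


def well_formedness_error(text):
    """Return None if text is well-formed XML, else the parser's error message."""
    try:
        ET.fromstring(text)
    except ET.ParseError as exc:
        # Parsing stops at the first error; nothing after it is processed.
        return str(exc)
    return None


SAMPLES = {
    "fine": "<p>ok</p>",
    "overlapping tags": "<p><b>bad</p></b>",
    "unclosed element": "<p>never closed",
    "unquoted attribute": "<p class=x>text</p>",
    # Well-formed, even though no DTD defines this element or attribute.
    "unknown vocabulary": '<made-up-element foo="bar"/>',
}

for label, sample in SAMPLES.items():
    print(label, "->", well_formedness_error(sample) or "well-formed")
```

The last sample passes because well-formedness is purely about syntax: the parser never asks whether `made-up-element` is allowed anywhere.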

Regarding XHTML
XHTML was designed to be an XML version of HTML. However, for various reasons discussed in the Beware of XHTML article, most XHTML pages on the Web are not parsed as XML by most popular web browsers. Instead, browsers usually treat the page as if it were simply HTML with some odd unrecognized "/" characters and attributes here and there. This means that browsers won't check the XHTML page for well-formedness and, as the author, you can't expect the browser to give any indication of whether or not the page is well-formed.
Now, just because most popular browsers usually won't treat the page as XML, that doesn't mean nothing will. XHTML is supposed to be parsable as XML, and you should expect user agents to try to parse it as such (and many will). That means you have to make sure the document is well-formed.
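The difference between the two parsing modes can be sketched in Python by feeding the same page to a lenient HTML parser and to a strict XML parser. This is only an analogy for what browsers do, not a model of any particular browser; the page content and the helper names `html_start_tags` and `parses_as_xml` are made up for the example. The unescaped ampersand is a well-formedness error in XML, while the HTML parser simply treats it as text.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class TagCollector(HTMLParser):
    """Records the name of every start tag the lenient HTML parser sees."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def html_start_tags(text):
    """Parse text as HTML and return the start tags found (never raises here)."""
    collector = TagCollector()
    collector.feed(text)
    collector.close()
    return collector.tags


def parses_as_xml(text):
    """Return True only if an XML parser accepts the whole document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Hypothetical XHTML page with an unescaped ampersand in the text.
PAGE = (
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<head><title>Menu</title></head>"
    "<body><p>Fish & chips</p></body></html>"
)
FIXED_PAGE = PAGE.replace("Fish & chips", "Fish &amp; chips")

print(html_start_tags(PAGE))      # the HTML parser reads the page without complaint
print(parses_as_xml(PAGE))        # the XML parser rejects it
print(parses_as_xml(FIXED_PAGE))  # escaping the ampersand makes it well-formed
```

An author who only ever tests in HTML mode gets no hint that the first version of the page is broken for every consumer that parses it as XML.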

Validity is not well-formedness
But what a lot of people don't realize is that it's possible for a document to be valid yet not well-formed. This issue comes up a lot more often than you may think, and is often completely overlooked until someone stumbles upon the fringe cases where something tries to parse the page as XML.
The following are examples of XHTML documents which are perfectly valid from an SGML point of view but not well-formed. Even though the W3C HTML Validator gives these pages a green light, they will completely fail to load in any XML parser. Notes: Most XHTML pages on the Web today are parsed as HTML most of the time, as explained in the Beware of XHTML article. Also, because Internet Explorer doesn't yet support XHTML parsed as XML, you'll have to use a different browser to see the problems.
