Friday, February 09, 2007

HTML: Making HTML Validation Easier

Creating pedantically well-formed HTML is a nice way to increase your karma as a Web developer ;) However, it can also be a tedious waste of time. Fortunately, there are a couple of tricks to make it easier.

For instance, which DTD should you use per today's best practices? Apparently, this is a religious argument. I suspect no fewer than two comments below will be on this topic. I've spent way too much time looking at the various arguments, but I was finally persuaded by this long discussion. I've chosen:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
Wait! Before you disagree with me, let me tell you my secret!

Although 4.01 Transitional is the least hacky thing per today's standards (per the discussion linked to above), my actual HTML source is in XHTML! Genshi, my templating engine, lets me choose whether to output HTML or XHTML. Hence, I get the best of both worlds, and I can change my mind later!
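
To make the "one source, two outputs" idea concrete, here is a minimal sketch. It does not use Genshi; as a stand-in, it uses the `xml.etree.ElementTree` module from Python's standard library, which can serialize the same parsed tree with either HTML or XML rules. The sample markup and variable names are made up for illustration, and the source omits the XHTML namespace to keep the output short.

```python
import xml.etree.ElementTree as ET

# XHTML-style source: every element is explicitly closed.
# (Illustrative markup, not taken from a real template.)
SOURCE = '<form action=""><input type="text" name="q" /><br /></form>'

root = ET.fromstring(SOURCE)

# Same parsed tree, two serializations.
as_html = ET.tostring(root, encoding="unicode", method="html")
as_xhtml = ET.tostring(root, encoding="unicode", method="xml")

print(as_html)   # empty elements are written HTML-style, e.g. <br>
print(as_xhtml)  # empty elements are written self-closed, e.g. <br />
```

The point is only that the choice between HTML and XHTML can be deferred to serialization time, which is what makes it cheap to change your mind later.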

The next question is how to catch silly HTML mistakes. For instance, I didn't know that you must specify an action on form tags, even if you're just going to set it to "". As wicked cool as Firebug is, I don't think it provides HTML validation. Furthermore, using the W3C's validator can be painful if you're working on an application that requires form-based logins and is not publicly accessible. Copying and pasting the HTML source gets old quickly!
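
As a rough illustration of the kind of mistake in question, here is a small sketch that flags form tags without an action. It is not a validator and is no substitute for one; it only checks two attributes that HTML 4.01 marks as required (action on form, alt on img) using Python's standard `html.parser` module. The class name, function name, and sample page are hypothetical.

```python
from html.parser import HTMLParser

# Attributes that HTML 4.01 marks as required on these elements.
REQUIRED = {"form": "action", "img": "alt"}


class RequiredAttrChecker(HTMLParser):
    """Collects (line, tag, missing_attribute) tuples while parsing."""

    def __init__(self):
        super().__init__()
        self.problems = []

    def handle_starttag(self, tag, attrs):
        needed = REQUIRED.get(tag)
        # An empty value such as action="" still counts as present.
        if needed and needed not in dict(attrs):
            line, _column = self.getpos()
            self.problems.append((line, tag, needed))


def find_missing_attrs(html_source):
    checker = RequiredAttrChecker()
    checker.feed(html_source)
    checker.close()
    return checker.problems


# Illustrative page: the first form and the image are missing attributes.
page = '<form>\n<img src="logo.png">\n</form>\n<form action=""></form>'
for line, tag, attr in find_missing_attrs(page):
    print(f"line {line}: <{tag}> is missing the {attr} attribute")
```

Because it works on a string, a check like this can run against pages behind a login, which is exactly where the public W3C validator is awkward to use.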

The Firefox plugin Html Validator meets this need quite well. It shows me a little icon to tell me how the page is validating. At any time, I can "View Page Source" to see the warnings and errors. Html Validator works by embedding Tidy in the extension. Although the workflow isn't nearly as streamlined, I also like Offline Page Validator. It's a plugin that simply loads the W3C validator with the source of the current page. This is nice because it uses the W3C validator, but it takes work to right-click and select "Validate Page". Nonetheless, it has the additional benefit that it's a rather simple plugin, whereas Html Validator actually has a compiled copy of Tidy in it.

In summary, Genshi lets me switch between HTML and XHTML with a simple configuration change, and Html Validator lets me see if the page is validating correctly just by looking at an icon.

Happy validating!


Ian Bicking said...

paste.debug.wdg_validate runs all your pages through a validator (something you would do during development), using the (unfortunately rather old) wdg validator command-line script. Doing that with Tidy as well might also be nice. I don't know if it's better than an extension really, but it's there if you are interested.
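
For readers who haven't seen this approach, here is a rough sketch of the general idea: a WSGI middleware that buffers each HTML response during development and hands it to a checker. This is not Paste's code or API; the class name, the `check` and `report` parameters, and the demo app are all hypothetical, and only the standard library is used.

```python
class ValidateMiddleware:
    """Hypothetical WSGI middleware: runs `check` over each HTML response."""

    def __init__(self, app, check, report=print):
        self.app = app
        self.check = check    # callable: HTML text -> list of messages
        self.report = report  # where messages go (stdout by default)

    def __call__(self, environ, start_response):
        captured = {}

        def capture(status, headers, exc_info=None):
            # Remember the headers so we can inspect Content-Type later.
            captured["headers"] = headers
            return start_response(status, headers, exc_info)

        # Buffer the whole body; acceptable for development use only.
        result = self.app(environ, capture)
        try:
            body = b"".join(result)
        finally:
            if hasattr(result, "close"):
                result.close()

        headers = {k.lower(): v for k, v in captured.get("headers", [])}
        if headers.get("content-type", "").startswith("text/html"):
            for message in self.check(body.decode("utf-8", "replace")):
                self.report(message)
        return [body]


# Demo with a toy app and a toy checker (both made up for illustration).
def demo_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/html; charset=utf-8")])
    return [b"<form><input name='q'></form>"]


def missing_action(html):
    return ["form without action"] if "<form>" in html else []


messages = []
wrapped = ValidateMiddleware(demo_app, missing_action, report=messages.append)
body = b"".join(wrapped({}, lambda status, headers, exc_info=None: None))
print(messages)
```

The `check` callable is where a real validator (the WDG script, Tidy, or anything else) would be plugged in; the response itself is passed through unchanged.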

Shannon -jj Behrens said...

Thanks, Ian.

I knew about paste.debug.wdg_validate, but I opted to use a Firefox extension so as not to create an additional dependency for other developers.

It's also nice to have the extension for cases when I'm not using Paste, although that doesn't happen as much these days ;)

Patrick Corcoran said...

Hi JJ,

HTML4 Trans written in XHTML is exactly what I'm recommending in my book. (You must have Googled "why not to use xhtml", eh?)

As far as validating goes, I use the Web Developer extension. It validates the page with every reload, running it against the W3C's validator, or whichever validator you choose to replace it with. And it displays the results as an icon+button, with the button taking you straight to the W3C page telling you what you failed. (It does the same thing for CSS too.)

Shannon -jj Behrens said...

> HTML4 Trans written in XHTML is exactly what I'm recommending in my book.

Actually, it's more subtle than that. I write XHTML, but Genshi converts it into HTML4 on the fly! The link in my post talks about why.

> As far as validating goes, I use the Web Developer extension. It validates the page with every reload

Are you sure? I just removed an alt attribute from one of my images, and I still get a green check mark.

Shannon -jj Behrens said...

Patrick Corcoran, by the way, thanks for reading! ;)

Patrick Corcoran said...


I too have noticed that the web dev toolbar extension doesn't update as frequently as with every reload. (I guess I was knowingly exaggerating :)

It does update eventually, though, making it good enough for keeping an eye on stuff over the long term without having to do explicit validation. And of course, on-demand validation is always current.

It doesn't validate DHTML though, which is too bad. I had a bug on a page yesterday because my JavaScript was writing in CSS positioning values without specifying 'px'. I wish the extension could check validation based upon the dynamically-generated DOM representation, not just the initial source code.

Patrick Corcoran said...

Oh, and one more OT thing: captchas are frigging stupid software design.

Why can't they validate me by making sure the DOM events I generate are consistent with a human??


if num-keystrokes < post-length and no-paste-event
if not has-mouse-moved and not-tab-key-pressed
if time-on-page < minimum-human-time
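
The three conditions above could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the pseudocode doesn't say what happens when a condition is true, so the sketch assumes each one counts as a sign of an automated submission and simply reports which ones fired. The `PageActivity` fields, the function name, and the three-second threshold are all invented for the example.

```python
from dataclasses import dataclass


@dataclass
class PageActivity:
    """Hypothetical summary of the DOM events seen before a form post."""
    keystrokes: int
    post_length: int
    pasted: bool
    mouse_moved: bool
    tab_pressed: bool
    seconds_on_page: float


MIN_HUMAN_SECONDS = 3.0  # assumed threshold, not from the original comment


def bot_signals(activity, min_seconds=MIN_HUMAN_SECONDS):
    """Return a description of each heuristic that fired (empty if none)."""
    signals = []
    # Fewer keystrokes than characters posted, and nothing was pasted.
    if activity.keystrokes < activity.post_length and not activity.pasted:
        signals.append("too few keystrokes for the text, and no paste")
    # Neither the mouse nor the Tab key was used to reach the form fields.
    if not activity.mouse_moved and not activity.tab_pressed:
        signals.append("no mouse movement and no Tab key navigation")
    # The form came back faster than a person could plausibly fill it in.
    if activity.seconds_on_page < min_seconds:
        signals.append("form submitted too quickly")
    return signals


human = PageActivity(120, 100, False, True, False, 45.0)
script = PageActivity(0, 100, False, False, False, 0.2)
print(bot_signals(human))
print(bot_signals(script))
```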