HTML: Making HTML Validation Easier

Creating pedantically well-formed HTML is a nice thing to do to increase your karma as a Web developer ;) However, it can also be a tedious waste of time. Fortunately, there are a couple of tricks to make it easier.

For instance, which DTD should you use per today's best practices? Apparently, this is a religious argument. I suspect no fewer than two comments below will be on this topic. I've spent way too much time looking at the various arguments, but I was finally persuaded by this long discussion. I've chosen:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
Wait! Before you disagree with me, let me tell you my secret!

Although 4.01 Transitional is the least hacky thing per today's standards (per the discussion linked to above), my actual HTML source is in XHTML! Genshi, my templating engine, lets me choose whether to output HTML or XHTML. Hence, I get the best of both worlds, and I can change my mind later!
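
To illustrate the idea of one well-formed source serialized two ways, here is a minimal sketch that uses Python's standard-library `xml.etree.ElementTree` as a stand-in. It is not Genshi's API, and the `render` helper and the sample markup are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical XHTML-style fragment: well-formed XML with self-closing tags.
SOURCE = '<form action=""><input type="text" name="q" /><br /></form>'


def render(source, method):
    """Parse well-formed markup once, then serialize it as 'xml' or 'html'."""
    root = ET.fromstring(source)
    return ET.tostring(root, encoding="unicode", method=method)


if __name__ == "__main__":
    print(render(SOURCE, "xml"))   # keeps self-closing tags such as <br />
    print(render(SOURCE, "html"))  # writes void elements as <br> with no end tag
```

The point is only that the output flavor becomes a single argument, so the source can stay strict while the decision about what to send to browsers remains reversible.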

The next question is how to catch silly HTML mistakes. For instance, I didn't know that you must specify an action on form tags, even if you're just going to set it to "". As wicked cool as Firebug is, I don't think it provides HTML validation. Furthermore, using the W3C's validator can be painful if you're working on an application that requires form-based logins and is not publicly accessible. Copying and pasting the HTML source gets old quickly!
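
As a rough illustration of the kind of mistake a validator catches, here is a small sketch built on Python's standard-library `html.parser`. It is nowhere near a real validator: it checks only two attributes that HTML 4.01 marks as required (`action` on `form` and `alt` on `img`), and the names `REQUIRED_ATTRS`, `RequiredAttrChecker`, and `check` are invented for this example.

```python
from html.parser import HTMLParser

# Attributes that HTML 4.01 marks as required; only two rules are listed here.
REQUIRED_ATTRS = {"form": "action", "img": "alt"}


class RequiredAttrChecker(HTMLParser):
    """Collect (line, tag, missing_attribute) tuples while parsing."""

    def __init__(self):
        super().__init__()
        self.problems = []

    def handle_starttag(self, tag, attrs):
        # Tag names arrive lowercased; an empty value such as action="" counts as present.
        required = REQUIRED_ATTRS.get(tag)
        if required and required not in dict(attrs):
            line, _column = self.getpos()
            self.problems.append((line, tag, required))


def check(html_source):
    """Return a list of missing required attributes found in html_source."""
    checker = RequiredAttrChecker()
    checker.feed(html_source)
    checker.close()
    return checker.problems
```

Calling `check(page_source)` on a page fetched from inside a logged-in session would sidestep the copy-and-paste step, though it covers far less than the W3C validator or Tidy.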

The Firefox plugin Html Validator solves this need quite well. It shows me a little icon to tell me how the page is validating. At any time, I can "View Page Source" to see the warnings and errors. Html Validator works by embedding Tidy in the extension. Although the workflow isn't nearly as streamlined, I also like Offline Page Validator. It's a plugin that simply loads the W3C validator with the source of the current page. This is nice because it uses the W3C validator, but it takes work to right-click and select "Validate Page". Nonetheless, it has the additional benefit that it's a rather simple plugin, whereas Html Validator actually has a compiled copy of Tidy in it.

In summary, Genshi lets me switch between HTML and XHTML with a simple configuration change, and Html Validator lets me see if the page is validating correctly just by looking at an icon.

Happy validating!


Ian Bicking said…
paste.debug.wdg_validate runs all your pages through a validator (something you would do during development), using the (unfortunately rather old) wdg validator command-line script. Doing that with Tidy as well might also be nice. I don't know if it's better than an extension really, but it's there if you are interested.
jjinux said…
Thanks, Ian.

I knew about paste.debug.wdg_validate, but I opted to use a Firefox extension so as not to create an additional dependency for other developers.

It's also nice to have the extension for cases when I'm not using Paste, although that doesn't happen as much these days ;)
Hi JJ,

HTML4 Trans written in XHTML is exactly what I'm recommending in my book. (You must have Googled "why not to use xhtml", eh?)

As far as validating goes, I use the Web Developer extension. It validates the page with every reload, running it against the W3C's validator, or whichever validator you choose to replace it with. And it displays the results as an icon+button, with the button taking you straight to the W3C page telling you what you failed. (It does the same thing for CSS too.)
jjinux said…
> HTML4 Trans written in XHTML is exactly what I'm recommending in my book.

Actually, it's more subtle than that. I write XHTML, but Genshi converts it into HTML4 on the fly! The link in my post talks about why.

> As far as validating goes, I use the Web Developer extension. It validates the page with every reload

Are you sure? I just removed an alt attribute from one of my images, and I still get a green check mark.
jjinux said…
Patrick Corcoran, by the way, thanks for reading! ;)

I too have noticed that the web dev toolbar extension doesn't update as frequently as with every reload. (I guess I was knowingly exaggerating :)

It does update eventually, though, making it good enough for keeping an eye on stuff over the long term without having to do explicit validation. And of course, on-demand validation is always current.

It doesn't validate DHTML though, which is too bad. I had a bug on a page yesterday because my JavaScript was writing in CSS positioning values without specifying 'px'. I wish the extension could check validation based upon the dynamically-generated DOM representation, not just the initial source code.
Oh, and one more OT thing: captchas are frigging stupid software design.

Why can't they validate me by making sure the DOM events I generate are consistent with a human??


if num-keystrokes < post-length and no-paste-event
if not has-mouse-moved and not-tab-key-pressed
if time-on-page < minimum-human-time
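
Spelled out in Python, those three checks might look something like the sketch below. Everything here is an assumption layered on the pseudocode: the `PageActivity` record, the idea that any single failed check flags the visitor, and the three-second threshold are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class PageActivity:
    """Hypothetical summary of the DOM events seen before a form is submitted."""
    num_keystrokes: int
    post_length: int
    paste_event: bool
    mouse_moved: bool
    tab_key_pressed: bool
    seconds_on_page: float


# Assumed threshold; a real value would need tuning.
MINIMUM_HUMAN_SECONDS = 3.0


def looks_automated(activity):
    """Return True if any of the three checks from the pseudocode fires."""
    # Fewer keystrokes than characters posted, and nothing was pasted.
    typed_too_little = (
        activity.num_keystrokes < activity.post_length and not activity.paste_event
    )
    # Neither the mouse nor the Tab key was used to move between fields.
    never_navigated = not activity.mouse_moved and not activity.tab_key_pressed
    # The form came back faster than a person could plausibly fill it in.
    too_fast = activity.seconds_on_page < MINIMUM_HUMAN_SECONDS
    return typed_too_little or never_navigated or too_fast
```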
