
HTML: Making HTML Validation Easier

Creating pedantically well-formed HTML is a nice thing to do to increase your karma as a Web developer ;) However, it can also be a tedious waste of time. Fortunately, there are a couple of tricks to make it easier.

For instance, which DTD should you use per today's best practices? Apparently, this is a religious argument. I suspect no fewer than two comments below will be on this topic. I've spent way too much time looking at the various arguments, but I was finally persuaded by this long discussion: http://webkit.org/blog/?p=68. I've chosen:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html40/loose.dtd">
Wait! Before you disagree with me, let me tell you my secret!

Although 4.01 Transitional is the least hacky choice by today's standards (per the discussion linked above), my actual HTML source is in XHTML! Genshi, my templating engine, lets me choose whether to output HTML or XHTML. Hence, I get the best of both worlds, and I can change my mind later!
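
To make that concrete, here's a minimal sketch of what I mean. The template and view code are made up for illustration, and the doctype strings are the ones I remember Genshi accepting, so double-check them against the Genshi docs. The point is that the same XHTML source can be serialized as either HTML 4.01 Transitional or XHTML:

from genshi.template import MarkupTemplate

# The source template is written as well-formed XHTML.
template = MarkupTemplate("""
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:py="http://genshi.edgewall.org/">
  <body>
    <form action="" method="post">
      <input type="text" name="q" />
    </form>
  </body>
</html>
""")

# Serialize it as HTML 4.01 Transitional...
html_output = template.generate().render(
    'html', doctype='html-transitional')

# ...or serialize the very same template as XHTML instead.
xhtml_output = template.generate().render(
    'xhtml', doctype='xhtml-transitional')

In a real application, the serialization method and doctype would come from a config setting, which is why changing my mind later is a one-line change.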

The next question is how to catch silly HTML mistakes. For instance, I didn't know that you must specify an action attribute on form tags, even if you're just going to set it to "". As wicked cool as Firebug is, I don't think it provides HTML validation. Furthermore, using the W3C's validator can be painful if you're working on an application that requires form-based logins and is not publicly accessible. Copying and pasting the HTML source gets old quickly!

The Firefox plugin Html Validator fills this need quite well. It shows me a little icon to tell me how the page is validating. At any time, I can "View Page Source" to see the warnings and errors. Html Validator works by embedding Tidy in the extension. Although the workflow isn't nearly as streamlined, I also like Offline Page Validator. It's a plugin that simply loads the W3C validator with the source of the current page. This is nice because it uses the W3C validator, but it takes work to right-click and select "Validate Page". On the other hand, it has the benefit of being a rather simple plugin, whereas Html Validator actually has a compiled copy of Tidy in it.

In summary, Genshi lets me switch between HTML and XHTML with a simple configuration change, and Html Validator lets me see if the page is validating correctly just by looking at an icon.

Happy validating!

Comments

Ian Bicking said…
paste.debug.wdg_validate runs all your pages through a validator (something you would do during development), using the (unfortunately rather old) wdg validator command-line script. Doing that with Tidy as well might also be nice. I don't know if it's better than an extension really, but it's there if you are interested.
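
(For reference, wiring that up looks roughly like the following; the class name and import path are from memory, so treat the details as approximate. The middleware pipes each HTML response through the WDG validator during development.)

from paste.debug.wdg_validate import WDGValidateMiddleware

def app(environ, start_response):
    # A trivial WSGI app standing in for your real application.
    start_response('200 OK', [('Content-Type', 'text/html')])
    return ['<html><body><p>Hello</p></body></html>']

# Wrap the app only in your development configuration.
app = WDGValidateMiddleware(app)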
jjinux said…
Thanks, Ian.

I knew about paste.debug.wdg_validate, but I opted to use a Firefox extension so as not to create an additional dependency for other developers.

It's also nice to have the extension for cases when I'm not using Paste, although that doesn't happen as much these days ;)
Patrick Corcoran said…
Hi JJ,

HTML4 Trans written in XHTML is exactly what I'm recommending in my book. (You must have Googled "why not to use xhtml", eh?)

As far as validating goes, I use the Web Developer extension. It validates the page with every reload, running it against the W3C's validator, or whichever validator you choose to replace it with. And it displays the results as an icon+button, with the button taking you straight to the W3C page telling you what you failed. (It does the same thing for CSS too.)
jjinux said…
> HTML4 Trans written in XHTML is exactly what I'm recommending in my book.

Actually, it's more subtle than that. I write XHTML, but Genshi converts it into HTML4 on the fly! The link in my post talks about why.

> As far as validating goes, I use the Web Developer extension. It validates the page with every reload

Are you sure? I just removed an alt attribute from one of my images, and I still get a green check mark.
jjinux said…
Patrick Corcoran, by the way, thanks for reading! ;)
Patrick Corcoran said…
-jj,

I too have noticed that the web dev toolbar extension doesn't update with every reload. (I guess I was knowingly exaggerating :)

It does update eventually, though, making it good enough for keeping an eye on stuff over the long term without having to do explicit validation. And of course, on-demand validation is always current.

It doesn't validate DHTML though, which is too bad. I had a bug on a page yesterday because my JavaScript was writing in CSS positioning values without specifying 'px'. I wish the extension could check validation based upon the dynamically-generated DOM representation, not just the initial source code.
Oh, and one more OT thing: captchas are frigging stupid software design.

Why can't they validate me by making sure the DOM events I generate are consistent with a human??

Pseudo-code:

if num-keystrokes < post-length and no-paste-event
or
if not has-mouse-moved and not-tab-key-pressed
or
if time-on-page < minimum-human-time
then
show-and-require-captcha
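
In Python, that heuristic might look roughly like this (all of the names and the threshold are made up):

MINIMUM_HUMAN_TIME = 5.0  # seconds; an assumed threshold

def needs_captcha(num_keystrokes, post_length, paste_event,
                  mouse_moved, tab_key_pressed, time_on_page):
    # Too few keystrokes to have typed the post, and no paste to explain it.
    typed_too_little = num_keystrokes < post_length and not paste_event
    # No mouse movement and no tabbing between fields looks scripted.
    no_interaction = not mouse_moved and not tab_key_pressed
    # Submitted faster than a human could plausibly read and type.
    too_fast = time_on_page < MINIMUM_HUMAN_TIME
    return typed_too_little or no_interaction or too_fast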
