Marramgrass

HTML Meets XPath

Dang. There was a blog post I was going to write for today, but haven’t. There was another one I could have written, but this isn’t it, either. Instead, there’s this fairly unplanned one.

I’m working on a few different projects at the moment, and for one of them I went looking for a particular tool and couldn’t find it. So I wrote it.

Today I'm introducing a little Mac desktop application called HTML Meets XPath. (Snappy, I know.)

The idea is simple. There are a number of ways to extract data from a web page. Most are a pain; some are slightly less so. One way is to use XPath to query the elements of the page, treating it as you would a tree of XML nodes. HTML Meets XPath accepts some HTML, either fetched from a URL or read from a local file, and lets you enter an XPath query to run against the page. It then displays the results.

That’s it.

Possibly handy if you’re trying to pin down which query to use on a given page, maybe even useful if you’re just learning how to use XPath at all.
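If the idea of querying a page as a tree of nodes is new to you, here's a rough sketch of it in Python. This is not the app's code, just an illustration of the technique. The sample page, the `run_query` helper and the query itself are all made up for the example. It also leans on two assumptions: the markup is well-formed enough to parse as XML (real-world HTML often isn't, and needs tidying first), and the query stays within the limited XPath subset that Python's built-in `xml.etree.ElementTree` understands.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed page used only for illustration.
PAGE = """<html>
  <body>
    <h1>Example page</h1>
    <ul id="links">
      <li><a href="/one">One</a></li>
      <li><a href="/two">Two</a></li>
    </ul>
    <p class="note">Not a link.</p>
  </body>
</html>"""


def run_query(markup, query):
    """Parse the markup into a tree and return the elements matching the query.

    Assumes the markup is well-formed XML; ElementTree raises ParseError otherwise.
    """
    root = ET.fromstring(markup)
    return root.findall(query)


# Every <a> element anywhere inside the list whose id is "links".
matches = run_query(PAGE, ".//ul[@id='links']//a")
hrefs = [a.get("href") for a in matches]
texts = [a.text for a in matches]

print(hrefs)
print(texts)
```

Changing the query string and re-running is the whole workflow: try `.//p[@class='note']` to pick out the paragraph instead of the links.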

A few things to note:

  • HTML Meets XPath is written by me, Mark Goody. It's published by and copyright Unexpectedly Spiky Ltd, all rights reserved. (Yes, it's the first public act of Unexpectedly Spiky Ltd. More on that to come.)
  • However, it's a free download, at least for version 1.x. If it develops into an all-singing, all-dancing 2.0, that may change. But that's unlikely.
  • Please don't republish the files elsewhere. If you want to spread the word, that's cool, but please do it by pointing people here.
  • HTML Meets XPath is presented with no warranties or guarantees of any kind. It's very rough, and not just about the edges. I plan to refine it quite rapidly (next job is to make the output a bit more readable), but be aware that this is an early 1.0. It shouldn't do anything horrible to your computer, but if it does then I accept no responsibility. If you're not happy with that, don't download it.
  • I've only tested the software on Mac OS X 10.6.2. It may work on 10.5, but I haven't tried it, so let's just say it requires 10.6.

If you want to give it a go, please download the .zip (582 KB), unarchive it and drag the app to wherever you want it to live.

All feedback (good and bad) is welcome. Comment here, or drop me an email. Cheers.

UPDATE: Version 1.0.2 is now uploaded — run “Check for Updates…” from the application menu to get it. The interface now survives resizing the window! The devil’s in the detail, folks…