One of the things that I’ve wanted to do since Christmas is get the BeBook converted from the horribly clunky HTML it’s presently in to an XML-based solution. I say solution because nothing about XML is as straightforward as it ought to be. Separating the content from the formatting is a truly excellent idea, and one that I feel will continue to grow in support, but XML, XSLT and all the other letter-jungle terms you’ll come across when looking at it are more complicated than they should be.

Well, despite all that, progress is being made. I finally decided to do something about it all and have written a Perl script (using the very helpful HTML::TreeBuilder module) that can convert the BeBook HTML into valid XML documents. We have some test XSLT files and even a very basic CSS file, so if you point Firefox at one of our XML files you can see nicely (well, sort of) formatted pages. Slaad (the mad Australian) is working on a better CSS file so that it’ll look nicer, but the simple fact that we’re now starting to separate content from formatting should allow us to get more value from the BeBook.
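
For anyone curious what this kind of conversion looks like, here is a rough sketch of the idea. To be clear, this is not the actual Perl script: it is a small Python stand-in using only the standard library, and the element names (`chapter`, `title`, `para`, `emphasis`) and the `TAG_MAP` table are made up for illustration, not the BeBook's real schema. It walks the HTML, keeps only the tags that carry meaning, renames them, and drops everything that is purely layout.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Hypothetical mapping from HTML tags to content-oriented XML element names.
# Any tag not listed here is treated as layout and dropped (its text is kept).
TAG_MAP = {"h1": "title", "p": "para", "b": "emphasis", "i": "emphasis", "code": "code"}


class HtmlToXml(HTMLParser):
    """Build an ElementTree from loosely structured HTML."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.root = ET.Element("chapter")
        self.stack = [self.root]   # XML elements currently open
        self.open_tags = []        # the HTML tag that opened each of them

    def handle_starttag(self, tag, attrs):
        if tag not in TAG_MAP:
            return  # layout-only tag: ignore it, attributes and all
        elem = ET.SubElement(self.stack[-1], TAG_MAP[tag])
        self.stack.append(elem)
        self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag not in self.open_tags:
            return  # stray or unmapped closing tag
        # Close everything up to and including the matching tag, so that
        # badly nested HTML still produces a well-formed tree.
        while self.open_tags:
            self.stack.pop()
            if self.open_tags.pop() == tag:
                break

    def handle_data(self, data):
        parent = self.stack[-1]
        if parent is self.root and not data.strip():
            return  # skip whitespace between top-level blocks
        if len(parent):
            # Text after a child element belongs in that child's tail.
            last = parent[-1]
            last.tail = (last.tail or "") + data
        else:
            parent.text = (parent.text or "") + data


def html_to_xml(html: str) -> str:
    """Convert an HTML fragment to a well-formed XML string."""
    parser = HtmlToXml()
    parser.feed(html)
    parser.close()
    return ET.tostring(parser.root, encoding="unicode")


if __name__ == "__main__":
    print(html_to_xml("<h1>Intro</h1><p>Hello <b>world</b>!</p>"))
```

The nice part of building a tree rather than doing search-and-replace on the text is that the output is well-formed even when the input HTML has unclosed or badly nested tags, which is much of the reason for using a tree-building parser in the first place.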

It’ll be a while before the script is ready to convert the whole book, and even then it won’t be perfect, so volunteers to read through the results and find/fix the mistakes would be very helpful.

The other advantage this should offer (assuming I can ever figure out how we stick it all together, that is) is the ability to add chapters very, very easily.