For my own reference, as well as that of readers who are in the same boat, I pulled together the following links to help make sense of the alphabet soup inherent in self-publishing solutions. My objective is simply to provide a single post that will replace the repeated searches I’ve been running whenever I can’t remember how XML is different from HTML is different from XHTML.
- Brian O’Leary, in a post titled Alphabet Soup, tackles the issue head on. If you get confused by XML, HTML and XHTML, this is the post for you.
- In a post titled Web Standards for E-books, Joe Clark dives deeper. There’s a lot here and I’m not sure I understand or agree with all of it, but it definitely wrestles with the issues I’m wrestling with.
- Gizmodo leads with a tabloid headline: Giz Explains: How You’re Gonna Get Screwed By Ebook Formats. Despite the hype, the article is still worth a read, in large part because it projects all these tech issues onto the current marketplace. Again, I’m not sure I agree with the conclusions, but the article frames the right debates.
- Jedisaber has an .epub eBooks Tutorial that I found extremely helpful. It includes a list of tools, with commentary about same, as well as many other useful bits of information. If you’re thinking of creating an ePub file, this is the place to start.
As suggested in a recent post, it’s always a good idea to look for workflow examples that you can copy or emulate. You may not agree with all of the other person’s choices, or need to follow their examples word for word, but anything is better than reinventing the wheel.
Where the rubber meets the road for me in all this jargon is getting my content distributed. I am concerned about embarking down a technological path that either dies out or takes my content hostage. I don’t want to have to keep changing native file formats, or create new documents for new services or sites that use proprietary tools as a means of also holding customers hostage. I’m interested in flexibility and utility and portability, and I’m constantly judging tech solutions by those criteria.
Update: Keith Fahlgren has a post about ePub and CSS that’s worth reading, if only to give you an idea of what’s coming in terms of compatibility issues. In the comments to the thread, Liz Castro says, “It’s browser wars all over again,” and I fear she may be right. My one hope is that the maturity and deep pockets of many of the market players will keep the insanity to a minimum.
— Mark Barrett
“I can’t remember how XML is different from HTML”
I realize you already know this, but here’s the story I’ve been telling on this recently:
XML is (among other things) a set of rules. It defines how to name and organize containers, which can hold other containers or words (sometimes both). HTML is the set of container names that we use to build web pages (containers for headings, paragraphs, lists, images, etc.), but its structural rules are really loose, which sometimes means it’s easier for humans to write (but harder for computers to read, as computers love hard and fast rules). XHTML takes HTML’s set of container names and makes sure that the document adheres to all of XML’s rules (which often makes it easier for computers to process).
I’ll take all the explanations I can get. 🙂
More to the point, I’ve found that stuff like this really needs to be explained ten different ways, because different people process things differently. Your explanation helps because it makes the relationships clear while also explaining the reason for those relationships.
XHTML is XML-compliant HTML.
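For anyone who processes things better by seeing them run, here is a minimal sketch of the loose-versus-strict distinction described in the comments above, using only Python's standard library. The two markup snippets and the helper names (`is_well_formed_xml`, `TagCollector`, `html_start_tags`) are made up for illustration and are not taken from any of the linked posts. The idea is simply that an XML parser refuses markup with unclosed containers, while an HTML parser reads it without complaint.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Hypothetical sample markup, invented for this illustration.
# Loose HTML: neither <p> nor <br> is ever closed.
loose_html = "<p>First paragraph<br><p>Second paragraph"
# XHTML-style: same container names, but every container is closed
# and everything sits inside a single root element, as XML requires.
strict_xhtml = "<div><p>First paragraph<br/></p><p>Second paragraph</p></div>"


def is_well_formed_xml(markup: str) -> bool:
    """Return True if a strict XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


class TagCollector(HTMLParser):
    """Forgiving HTML parser that records each opening tag it sees."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def html_start_tags(markup: str) -> list:
    """Feed markup to the forgiving parser and return the tags found."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.tags


if __name__ == "__main__":
    print("Loose HTML accepted as XML? ", is_well_formed_xml(loose_html))
    print("XHTML accepted as XML?      ", is_well_formed_xml(strict_xhtml))
    print("Tags the HTML parser found: ", html_start_tags(loose_html))
```

The XML parser rejects the loose snippet and accepts the strict one, while the HTML parser still picks out the paragraph and line-break tags from the loose snippet. That is the trade-off in miniature: loose rules are friendlier to the person typing, strict rules are friendlier to the software reading.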