Friday, June 01, 2007

Web microformats for coordinated government reports and accessible PDF?

The 31 May Web Services Group meeting was on Web microformats and accessible PDF.


John Allsopp gave a passionate overview of Microformats and a small plug for his book "Microformats: Empowering Your Markup for Web 2.0".

Microformats are a lightweight approach to what the Semantic Web is attempting to do. While the Semantic Web requires you to recode your data in an XML format, such as RDF, microformats add some semantic content to existing HTML. Applications such as the "Operator" extension to Firefox can be used to read the microformat data and transfer it to other applications.

John gave several examples, including XFN, the "XHTML Friends Network", for representing (human) relationships in XHTML. This adds a 'rel' attribute with a controlled vocabulary to the <a href> tag, so a hypertext link not only points to someone, but says what your relationship is to them.
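To make the XFN idea concrete, here is a minimal sketch in Python of how a program might read the relationship values off a link. The sample markup, the URLs and the class name XFNParser are my own illustration, not something shown in the talk; "friend", "met" and "colleague" are values from the XFN vocabulary.

```python
from html.parser import HTMLParser


class XFNParser(HTMLParser):
    """Collect (href, [rel values]) pairs from <a> tags that carry a rel attribute."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        rel = attr.get("rel")
        if rel:
            # rel holds a space-separated list of relationship values
            self.links.append((attr.get("href"), rel.split()))


# Hypothetical sample markup: the person and URLs are made up for illustration.
SAMPLE = (
    '<p>My colleague '
    '<a href="http://example.org/jane" rel="friend met colleague">Jane</a> '
    'and an <a href="http://example.org/">ordinary link</a>.</p>'
)

parser = XFNParser()
parser.feed(SAMPLE)
xfn_links = parser.links
print(xfn_links)
```

The second link has no rel attribute, so it is skipped. A real tool would presumably also check the values against the XFN vocabulary rather than accept any rel value.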

One refreshing aspect of this talk was the plea to web designers not to be obsessed with every web page rendering the same in every browser. John used the example of conditional CSS selectors, which do not work with older browsers. These can be used, for example, to put icons on all external links in a web page. John argued it was worth having such a feature in browsers which could support it, and that we should not get too worried about old browsers which do not.

Other microformats mentioned included:

* hCard: The HTML version of the vCard format for electronic business cards.
* hCalendar: The HTML version of iCalendar. Works similarly to hCard.
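As a rough illustration of how hCard data sits inside ordinary HTML, the sketch below pulls a few properties out of a page using only the Python standard library. The class names "vcard", "fn", "org", "tel" and "email" come from hCard; the sample person, the HCardParser class and the decision to read only these four properties are my assumptions for the example. It is deliberately simplified and is not a full hCard parser.

```python
from html.parser import HTMLParser

# A small subset of hCard property class names, chosen for this example.
HCARD_PROPERTIES = {"fn", "org", "tel", "email"}


class HCardParser(HTMLParser):
    """Extract simple text properties from elements inside class="vcard".

    Simplifications: no nested elements inside a property, and no void
    elements (such as <br>) inside the card, since they have no end tag.
    """

    def __init__(self):
        super().__init__()
        self.cards = []
        self._card = None   # dict for the card being read, or None
        self._depth = 0     # nesting depth inside the current vcard element
        self._prop = None   # property whose text is being collected

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if self._card is None:
            if "vcard" in classes:
                self._card = {}
                self._depth = 1
            return
        self._depth += 1
        for name in classes:
            if name in HCARD_PROPERTIES:
                self._prop = name
                break

    def handle_data(self, data):
        if self._card is not None and self._prop:
            self._card[self._prop] = self._card.get(self._prop, "") + data

    def handle_endtag(self, tag):
        if self._card is None:
            return
        self._prop = None
        self._depth -= 1
        if self._depth == 0:
            # The element that opened the card has closed.
            self.cards.append(self._card)
            self._card = None


# Hypothetical sample card: the name, agency and address are made up.
SAMPLE = """
<div class="vcard">
  <span class="fn">Jane Citizen</span>,
  <span class="org">Example Agency</span>,
  <a class="email" href="mailto:jane@example.org">jane@example.org</a>
</div>
"""

parser = HCardParser()
parser.feed(SAMPLE)
hcards = parser.cards
print(hcards)
```

The point is that the page still reads as a normal web page to a person, while a program can recover the structured contact details from the class names.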

Less well established microformats:

* hReview: For assessments of products and services.
* hResume: For resumes and CVs.
* hAtom: Atom feed in HTML format. Perhaps XSLT could be used to generate hAtom from Atom in the browser?
* hCitation: This would be very useful for authors citing materials, but seems a long way from being a standard.
* Geo microformat: For the physical location of something, used by Flickr.
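The geo microformat is simple enough to sketch in a few lines. The example below assumes the basic pattern of a "geo" element containing "latitude" and "longitude" elements with decimal degrees as their text. The coordinates and the GeoParser class are illustrative only, and I have not checked how any particular site actually marks up its pages.

```python
from html.parser import HTMLParser


class GeoParser(HTMLParser):
    """Collect (latitude, longitude) pairs from geo microformat markup."""

    def __init__(self):
        super().__init__()
        self.points = []
        self._current = {}   # coordinates read so far for the point in progress
        self._field = None   # "latitude" or "longitude" while inside that element

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if "geo" in classes:
            self._current = {}
        for name in ("latitude", "longitude"):
            if name in classes:
                self._field = name

    def handle_data(self, data):
        if self._field:
            self._current[self._field] = float(data.strip())
            self._field = None
            if len(self._current) == 2:
                self.points.append(
                    (self._current["latitude"], self._current["longitude"])
                )
                self._current = {}


# Hypothetical sample markup: the coordinates are made up for illustration.
SAMPLE = (
    '<span class="geo">'
    '<span class="latitude">-35.2809</span>; '
    '<span class="longitude">149.1300</span>'
    '</span>'
)

parser = GeoParser()
parser.feed(SAMPLE)
geo_points = parser.points
print(geo_points)
```

This sketch only handles coordinates given as element text. Other ways of writing the values would need extra handling.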

Pingerati supports microformats, but exactly what it is for I am not yet sure. The first link I clicked on went to Twitter, as mentioned in the NLA talk yesterday.

The microformats approach is a clever idea and may be useful for accessibility (as suggested by Brian Hardy at the beginning of his talk). It might also be a way to ease authors and web designers out of a rigid graphical/print based approach to information design. XHTML and CSS were supposed to provide better separation of content and layout. But these added a lot of complexity for the creator, with little in return. If the author and web designer can build their content using microformats, that might provide easier creation and better structure.

As an example, those in a government agency preparing an annual report would be provided with a microformat enabled web based system. Hundreds of staff would enter their parts of the report in web pages using the appropriate microformats. Some of these would be extracted from spreadsheets and databases. Plain text would be entered in a web based editor. Staff coordinating the publication would add the rules for layout, images and style. The system would then generate the web, PDF, and print versions of the annual report.

That might seem very ambitious, but a system for creating complex documents larger than annual reports in web, PDF and print already exists. This is the ICE system for course content. But that system relies on authors using a supplied word processing template and having extra software on their PCs to interface to the central system. Using microformats and a web based system would remove the need for WP templates and PC software.


Brian Hardy from Vision Australia talked about Accessibility of PDF files. This was a more detailed version of his talk to university people.

Brian's message was that PDF could be made accessible to people with a disability, with a little extra effort in document creation. He started with some simple tips:
  • Bookmarks: Have bookmarks open by default in the PDF document.
  • Tag PDF for screen readers.
  • PDF help page is not useful, nor are PDF icons (the text "PDF" will do). Include the size of the file.
Simple Access issues:
  • Reduce file size. PDF creation tools and addons have options for reducing file size.
  • Remove non-essential graphics.
  • Offer report in sections.
Accessibility issues:
  • Well marked word processing documents convert well.
  • Fix up with tools.
  • Test with readers.
  • "Print to PDF" is not recommended.
Brian discussed the use of Adobe's own PDF creation tools. It occurred to me that Open Office has PDF creation built in and version 2 seems to do an okay job.

Vision Australia has delivered a draft report on accessible PDF for government agencies, which should be available soon.

I asked Andrew if it would be simpler to produce accessible web pages to complement the PDF print versions. Andrew said at least one agency found it easier to produce HTML than fix up the PDF. This would seem a sensible option, as accessibility tools for web pages are better developed and there is less expectation that a web page will look identical to a printed document than there is with PDF.

An audience member asked a question about business processes. It seemed that most of the discussion of PDF assumed that reports were being produced in the old fashioned for-print way. If the document were produced by loading the content into a management system and then generating the web and PDF, the issue of accessibility should disappear. It should be a simple matter of configuring the content management system to include the PDF accessibility options, just as they do for the web.

There was some discussion as to whether agencies having a common look and feel would make the process easier. If each agency did not try to produce unique pretty printed documents and PDF facsimiles of them, then accessible documents would be much easier to produce. It was argued that Queensland and NZ were adopting a Common User Experience (CUE) for web sites and this might be done for the Commonwealth.

Perhaps a more ambitious target could be set to produce one coordinated annual report for the Commonwealth government, similar to the federal budget web site. The Finance Department would provide a web based system for all agencies to enter their annual report information. This would then be published online. It would be possible to see an agency based view, or a cross section of the same category of information comparing all agencies. PDF versions would be available for printing. All the material would use the one standard Commonwealth branding.

ps: I am presenting some of the courses in the series "A System Approach to Management of Government Information" at ANU later in the year. These are for public service senior executives on how to implement e-document management and e-archiving.
