Wednesday, May 25, 2011

Publishing BBC Metadata on the Web

Greetings from the opening of the Meta 2011 Conference at ANU University House in Canberra. Tom Scott, from the BBC, is the first speaker, on Publishing BBC Metadata. Tom mentioned the Semantic Web in his first few words. He asked "What is the web?", showing Tim Berners-Lee's original paper "Information Management: A Proposal" (CERN, March 1989).

Tom demonstrated the BBC Nature website, which, in addition to ordinary web pages, provides structured data using RSS and RDF, and semantic mark-up using microformats. This data is available for others to use and is also used by the BBC to create new stories.

Tom also mentioned DBpedia, an attempt to structure Wikipedia data. At this point he argued that there is no metadata and that what is commonly thought of as data is actually metadata. In a reference to Stephen Hawking, Tom said "Turtles all the way down". This is a metaphor for infinite recursion; however, I would argue it is "metadata all the way down". James Gleick argues in his book "The Information: A History, a Theory, a Flood" that the ability to reason abstractly came after writing ("if all horses are white ..."). That seems unlikely, as I am sure horse breeders reasoned about the nature of a good horse before written language. Data and metadata are intertwined by their nature, not due to a human invention.

Tom argued that we need to move from the document web to the data web, the web of things, which is what the semantic web is for. However, after spending many years trying to understand the semantic web and teach it to university students (supervising several masters students doing projects on using it for cataloguing indigenous cultural material), I think this is a concept which needs to be further refined and simplified to be widely used. Tim Berners-Lee's key contribution with the World Wide Web was to take an existing complex electronic document standard (SGML) and simplify it to make something easy enough to use (HTML). Ever since, information professionals have argued that HTML is flawed. Some tinkered with SGML and produced XML, others tinkered with HTML to make XHTML, but the simplicity of HTML was lost. In my view the semantic web similarly needs simplification, even if the purists then say it is incomplete.

Tom then explained that the BBC use metadata for program guides. What matters is not the metadata but the information it describes. This is the key point which information professionals tend to find so obvious that they forget to explain it. They may say metadata is data about data, but they do not say why this is useful. That is a topic I will explore in my talk to the conference tomorrow, with Senator Lundy, on "Designing for Democratic Dialogue: More than Mating iPads" (11.00 am on Thursday 26th May, 2011).

Next on the program today we have Greg Stone, Chief Technology Officer, Microsoft Australia, and Professor John McMillan, Australian Information Commissioner, who is launching the new government information policy.
