Wednesday, February 14, 2007

When Everything is Digital

The white paper "Coping when everything is digital? Digital Documents and Issues in Document Retention" by Julian Gillespie, Patrick Fair, Adrian Lawrence and David Vaile (Cyberspace Law and Policy Center, UNSW 2004), provides a good introduction. It is addressed to legal departments, finance, records managers, IT, corporate executives and others in organisations. But it suffers from having been written by lawyers from a legal point of view and misses the point of having digital documents: making organisations more efficient.

The white paper asks if the organisation has a policy on retention and destruction of digital documents. But apart from government agencies, many organisations are unlikely to have a policy on paper records, let alone electronic ones. If the organisation has a policy for any sort of documents, that is a good first step.

The authors cite research from the USA claiming that most documents are now digital and 70% are never printed. Those dealing with increasing paper use in offices are probably wishing the figure was higher. ;-)

Australian and US court cases involving electronic documents are cited. The authors warn of the legal risks in deleting records which should have been retained and argue for a methodological approach. They give examples of digital documents: imaged versions of paper documents, word processing files, spreadsheets, presentations, email, databases, logs of networks and web access, financial transaction records and web pages.

While giving a good overview of the issues, the authors failed to give the obvious solution until near the end of the paper (on page 41 of 55): implement the relevant standards and guidelines. Also there does not appear to be any mention of the guidelines applying to state and federal government agencies. Perhaps the authors feel that if they mentioned the standards at the beginning, the reader would wonder why a 55 page white paper was required at all (as do I).

The authors introduce the concept of "meta data" by using the example of electronic mail messages. This is a good approach as, unlike other electronic documents, the metadata for email messages is usually visible in the header. The authors point out where to find dates, times and addresses in message headers. Curiously they don't point out the subject, which is an important metadata item. Instead they emphasize the spam warning inserted in the header by spam detection software and discuss "suspicious" email. While this is an important topic, it is not relevant to document retention. Similarly, the authors mention the disclaimers inserted in messages and make the valid, but irrelevant, point that these disclaimers are untested in Australian law.
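To show how much metadata is sitting in an ordinary message header, here is a minimal sketch using Python's standard email library. The sample message, the addresses and the extract_metadata function are all made up for illustration; they are not from the white paper.

```python
from email import message_from_string
from email.utils import getaddresses, parsedate_to_datetime

# A made-up message; every name and address here is hypothetical.
RAW_MESSAGE = """\
From: Alice Example <alice@example.org>
To: Bob Example <bob@example.org>
Subject: Contract renewal
Date: Wed, 14 Feb 2007 09:30:00 +1100
Message-ID: <1234@example.org>

Please find the draft attached.
"""


def extract_metadata(raw):
    """Pull the header fields a records system would most likely want to keep."""
    msg = message_from_string(raw)
    return {
        "subject": msg["Subject"],
        "sent": parsedate_to_datetime(msg["Date"]),
        "from": [addr for _, addr in getaddresses(msg.get_all("From", []))],
        "to": [addr for _, addr in getaddresses(msg.get_all("To", []))],
        "message_id": msg["Message-ID"],
    }


metadata = extract_metadata(RAW_MESSAGE)
for key, value in metadata.items():
    print(key, "=", value)
```

The subject comes out alongside the dates and addresses with no extra effort, which is why it seems an odd item to leave out.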

The authors discuss backups and archives of digital documents. As I discovered when helping write Commonwealth government guidelines on electronic documents, IT people use the terms "archive" and "backup" interchangeably. Records managers and archivists use "archive" in a different sense. The authors here characterize backups as being to guard against disk failure and archives as being for perpetuity. But then they go on to say that archives are likely to be available for months or years, which is a long way from "perpetuity" and falls within what might be considered a backup. The distinction between backups and archives is not a useful one and the authors should have avoided the issue.

The more important point, only made in a couple of sentences, is that an old file may not be readable due to the software which created it being no longer available. The authors mention PDF-A (a version of PDF intended for archives) and XML. This is an important point needing more analysis. Recent progress on XML based standards and on their adoption by the National Archives of Australia shows promise for long term access.

The authors go on to detail obligations for keeping documents. Unfortunately this is from a legalistic point of view. The emphasis is on what you have to keep when there is litigation. This gives a very skewed view of why an organisation would keep documents. Organisations should be keeping documents in order to support their operations. Document keeping should be within what the law requires and allows, but that should not be the primary reason.

The paper ends by discussing document management systems. Unfortunately these systems are not widely used outside large companies and government. Even in organisations with such systems, many of the day-to-day documents are outside the system in email, word processors and the like.

The current approach to electronic document management is not working and, while well meaning, "Coping when everything is digital? Digital Documents and Issues in Document Retention" does not really help. It says things we already knew and will probably only be read by people who already knew them.

After some years discussing this issue, and having helped write well intentioned, earnest documents on the need for e-document management (which were completely ignored), I believe a different approach is required. As an IT professional when faced with a problem of people not doing what is needed, I try to automate the problem out of existence; that approach is needed for e-document management.

With a related issue, accessible web design for the disabled, for some years I attempted to interest executives and organisations. As with document management, there are clear guidelines and laws requiring its use and organisations have even been fined for non-compliance. But most people are just not interested. Instead I decided to train the people who write the web software to implement the standards. That way the standards would be built into the web systems. People using the web tools would be complying with the guidelines without knowing it. This has proved much more successful.

The same approach can be applied to digital document management. Those designing document systems can be trained to build the needed management systems into the software.

Previously this would have been difficult to do as most of the document creation would have been with large monolithic packages with their own binary proprietary formats and primitive document management systems (such as Microsoft Office). However, web based systems are being increasingly used. These systems are easier to modify, to integrate with records management systems, are more likely to use standards and can be easily deployed across an organisation.

Also packages using standard formats, such as OpenOffice.Org, are available to use in place of Microsoft Office, or to convert Microsoft Office files to standard formats. Microsoft is also, at last, making efforts to comply with document standards.

As a result there may be less need to explain electronic document management to lawyers and executives. It will be built into the software and those using it will be prompted for their record management policies so the software can implement them. If the needed software is available free as open source, it will be difficult for any executive to argue against its use.
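As a rough idea of what "the software can implement them" might look like, here is a minimal sketch. The retention periods, the Record class and the may_destroy function are all hypothetical; real periods would come from the organisation's own policy and the law that applies to it.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical retention periods in years, keyed by document type.
RETENTION_YEARS = {"invoice": 7, "email": 2, "contract": 10}


@dataclass
class Record:
    title: str
    kind: str
    created: date
    legal_hold: bool = False  # set when the record is needed for litigation


def disposal_date(record):
    """Earliest date on which the record may be destroyed under the policy."""
    target_year = record.created.year + RETENTION_YEARS[record.kind]
    try:
        return record.created.replace(year=target_year)
    except ValueError:
        # Created on 29 February and the target year is not a leap year.
        return record.created.replace(year=target_year, day=28)


def may_destroy(record, today):
    """A record on legal hold is never destroyed, whatever its age."""
    if record.legal_hold:
        return False
    return today >= disposal_date(record)


old_email = Record("Lunch plans", "email", date(2005, 1, 10))
print(may_destroy(old_email, date(2007, 2, 14)))
```

The person creating the document never sees any of this. They answer the policy questions once, at set-up, and the system applies the answers from then on.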

This may sound unlikely, but it has already been successfully implemented in at least one area: academic electronic publishing. The OJS system implements XML based metadata standards which allow easy export of document records and backup of publications. The software is free open source and can be downloaded and installed. During configuration the user is asked if they want metadata to be exported and if they want the publications available in a standard archive format. The user just has to click a few buttons for this to happen. I did this for the ACS Digital Library and the papers in the library are now available world wide, including in the Arrow Discovery Service.
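To give a feel for XML based metadata export, here is a minimal sketch which writes a record using Dublin Core element names. This is not the OJS code or its actual output format; the wrapping "record" element, the to_metadata_xml function and the sample values are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Dublin Core element namespace; the "record" wrapper below is hypothetical.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def to_metadata_xml(record):
    """Serialise a dict of metadata fields as Dublin Core elements."""
    root = ET.Element("record")
    for field in ("title", "creator", "date", "format"):
        if field in record:
            child = ET.SubElement(root, f"{{{DC_NS}}}{field}")
            child.text = record[field]
    return ET.tostring(root, encoding="unicode")


# Made-up sample record.
example = {
    "title": "A Sample Paper",
    "creator": "A. Author",
    "date": "2007-02-14",
    "format": "application/pdf",
}
xml_text = to_metadata_xml(example)
print(xml_text)
```

Because the output uses a published vocabulary rather than a private format, another system can read the records without knowing anything about the software which produced them.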

Obviously, keeping internal organisation documents in a secure archive will be more difficult than keeping open access academic papers intended for unlimited distribution. But the same concepts can be applied. Standard formats and interfaces can be implemented in the tools used.
