
Metadata Watch Report #6 and Standards Framework Report #3


Appendix C: Publishing Domain

Correspondent: Laurie Causton, Clearbay Limited

Current state of domain

There has been perhaps less activity than usual since the last report, possibly owing in part to the summer break, or perhaps because a number of the metadata initiatives have reached a level of maturity and more are now in, or entering, operational life.

CrossRef certainly is up and running. It enlarged its membership in July, with five new publishers who between them offer nearly three hundred journals, bringing the total membership to 77 publishers. The total number of CrossRef-enabled journals now stands at well over 5,000. Four new affiliates also joined.

The International DOI Foundation has also moved into commercial application, but remains active on the ‘development’ front, making progress on collaboration with other bodies in more than one area (more on this below).

Also, the IPTC have released NITF Version 3.0, with more intelligent handling of tables, a cleaner DTD, and improved metadata support.

ONIX Release 2.0 appeared over July and August. This is a major new release introducing coverage of electronic books, adding many new elements and codes, making structural changes to enable coverage to be extended more widely to non-book media, and facilitating the structured description of product content, including book tables of contents. Because of the speed of change in this area, the e-book aspects are being maintained and updated separately from the rest of Release 2.0.

Indecs2 (the rights data dictionary) is under way, funded by the record and film industries, Accenture, and Microsoft among others, with the IDF and EDItEUR/ONIX as partners.

Overlaps and gaps identification

The trend towards greater co-operation to minimise overlaps continues. The International DOI Foundation and the Content ID Forum both develop specifications for content identification and metadata to enable e-commerce and rights transactions for copyrighted information. In August, they made an agreement to collaborate on building an infrastructure for the management of digital intellectual property. Norman Paskin, Director of the IDF, and Hiroshi Yasuda, President of cIDf, stated: "Convergence rather than divergence of the two systems will benefit the wider community of users. The similarity in approach of these two major initiatives was recognized as ground to reconcile differences to create interoperable infrastructure. IDF and cIDf will share information on system development and work with open collaborations such as participation in MPEG-21 (Moving Picture Experts Group)."

The Annual IDF Meeting equally stressed the need for information exchange between the media industries; organizations from the publishing, music, television, library and technology industries attended this meeting to share information on the development of policy and technical infrastructure for digital copyright management.

Trends

The idea of the ‘processable’ digital object, and by consequence the DOI, appears to continue to grow in significance; at the Annual IDF meeting, a number of very definite views of its future were expressed. Robert Kahn, founder of the Corporation for National Research Initiatives, explained how digital object approaches are vital to business and society in the information age, as is reconceptualizing the Internet from the movement of data packets to the management of information. Doug Armati, author of a 1995 report on Information Identifiers, endorsed the wider remit of DOI, calling in fact for widening the DOI Foundation's operations "to facilitate policy and organizational infrastructure globally that will result in DOIs being used to identify EVERY possible Digital Object -- not just in the media industries."

Main issues

While not strictly a metadata issue, a recent US court ruling is of interest, given the attention currently being paid in the publishing metadata domain to e-books and rights management issues. Random House recently took the digital bookseller RosettaBooks to court for selling, in digital format, eight books that Random House publishes on paper. The court, however, ruled that RosettaBooks may sell these electronic versions, with the judge saying that Random House's right to "print, publish and sell the work in book form" does not apply to e-books. This recalls the New York Times Supreme Court case over electronic rights, where the court ruled that, unless specified in contracts, newspaper and magazine publishers do not automatically own the right to resell freelance contributors' stories to digital database companies. Like some of those freelance contracts, the Random House book contracts were signed before e-books or the Internet were an issue. One view is that, if the RosettaBooks stance prevails, e-rights to thousands of old titles could conceivably become available.

Special reviews

The central bodies of all of the publishing metadata initiatives covered in Metadata Watch were contacted recently to enquire about status, achievements and the future. The commentary below is based upon the replies received.

CrossRef

The continuing progress of CrossRef has been noted earlier in this report, and the organisation feels that it has been quite exceptional in managing to involve a large number of leading publishers in a highly collaborative endeavour. It is currently extending the XML schema to accommodate book and conference proceedings records as well, and metadata is seen as central to its mission of providing an infrastructure for citation linking.

The biggest challenge at present is interfacing with the library community, and specifically helping libraries, whose records do not usually contain article-level metadata, incorporate article-level linking. Another challenge is providing a flexible interface when it comes to the type of metadata that can be accepted in queries to the system to retrieve DOIs.

CrossRef is actively involved in the setting of identifier standards - in particular, for DOI deposit, resolution, and extended services, all of which involve metadata, and in working with other standards organizations such as EDItEUR to achieve compatible metadata requirements across a variety of publication types.

Some overlap is recognised, with EDItEUR on the ONIX-for-serials front, and with the ISTC initiative on standard identifiers.

DOI

The DOI has moved from a development concept to full commercial application, with several Registration Agencies, such as CrossRef, now appointed. There is strong interest from supporters, partners and liaisons in the US, Europe and Asia, and there are more links with other sectors and initiatives. Currently there are around 4 million DOIs, with 8 million resolutions in September 2001. Future plans focus on consolidation, promotion and collaboration:

  • More marketing to get the DOI message out;
  • Firming up, and scaling up, the operational foundations, and providing robust tools for commercial operations, such as metadata (namespace) tools;
  • “Business development” – development of DOI interest and applications by and in other sectors and applications;
  • Working with other activities which are part of the bigger picture of e-commerce, rights, etc., as much as possible by leveraging other efforts rather than by the International DOI Foundation doing it all themselves.

Metadata is seen as central to the DOI; its functionality is predicated on the concept of a structured, interoperable metadata framework, and is based on indecs concepts. In fact, a major achievement of DOI is seen as the bringing together of “the techniques of persistent actionable identification and structured metadata.”

There are nevertheless issues to be resolved:

  • Funding: Standards must be developed for the long term, and voluntary effort alone is not enough. The current economic climate is “now biting.”
  • Coverage: The IDF feels that coverage is not deep enough (text and technology sectors have recognised the need for DOI in part but some large companies are not participating), nor is it broad enough (non-text sectors should be involved, but while significant interest has been shown, this is not being matched by funding or participation).
  • Perceptions: The IDF is seen as neither “a standards organisation” nor “a commercial solution”, while in fact it sees its main role as infrastructure creation, for digital commerce of intellectual property, and such “killer plumbing” infrastructure is hard to sell; “killer applications are more easily understood.”

The IDF sees an increased recognition of the need for tools and techniques to deal with structured metadata, ranging from the Semantic Web through to individual application offerings like Adobe’s XMP. In the case of the DOI, overlap with other metadata initiatives is deliberate, relating to “those metadata efforts which have adopted the same indecs-like view of metadata, such as ONIX (www.editeur.org).” At the same time, there is increasing recognition that activities developed in different sectors may find it useful to interoperate, to share techniques and tools, and to avoid re-inventing the wheel, through mappings, common principles, etc.

ISTC

The major achievement of ISTC is seen as the provision to the text supply chain of a tool for managing e-commerce and rights processes at the work level.

The Committee Draft of ISTC has been submitted to the Secretariat, and will be sent out for comments in the next few months. The next big step will be the process for choosing the international Registration Authority to run the system; setting up the international management structure and implementing the standard in the supply chain are the current major issues for ISTC.

Metadata is a central part of ISTC, with the submission of core metadata required before any ISTC is issued. Concurrent standards are recognised, in that the ISTC metadata scheme is a stream of ONIX.

Overlaps exist; the ISTC overlaps closely with the other related ISO initiatives – the ISWC music work code and the ISAN audiovisual work code – and it heavily references the ISBN system; with regard to the latter, it is expected that the two systems will be linked at the identifier level.

However, convergence is important, particularly with regard to interoperability of metadata, both for e-commerce and rights management.

NewsML and NITF

NewsML is at version 1.0 and is being widely implemented, with a Schema version to be released shortly. NITF is now at version 3.0, with an enhanced table model, as noted earlier. Both are expected to evolve as more feedback is received from implementers and users. More specialised content markup solutions are also expected for use in NewsML, like SportsML.

The IPTC see the major achievement of NITF as opening the news domain to XML markup for text content in a non-proprietary way. NewsML is an open content management and exchange envelope that can be used across many information domains; it is now being used in the news and finance areas but other implementations are also being progressed.

There are issues to be addressed. NITF requires XML-aware systems, but these are becoming more widely available. NewsML is a complex standard with considerable power, but it needs expertise to implement, as well as advanced XML-aware and configurable content management systems; such systems, however, are only just emerging onto the market.

Both initiatives rely on metadata to populate parts of the data structure. While NITF is more concerned with content markup, NewsML is a package for all types of data and is a carrier of rich metadata. The IPTC feels that NewsML has also made a useful contribution to the metadata scene - the TopicSet constructs that feature in NewsML are becoming accepted as a good way to store and use metadata.

Overlaps do not appear to be an issue; currently there do not appear to be any other XML-based open standards that compete directly with NITF and NewsML. There are, on the other hand, opportunities for collaboration: the PRISM work on metadata for publishing could be made into TopicSets for use with NewsML.

Collaboration in fact is an objective to an extent; the IPTC is trying to work with “other parties who are prepared to make their work open to achieve convergence.” Even so, this necessarily focuses on that which is relevant to its members' needs.

Open eBook Foundation

The Publication Structure version 1.01 for Open eBook was released in July 2001, with version 2.0 in development.

On the metadata front, a Metadata/Identifiers Working Group has been established to assess requirements, evaluate relevant work by other standards organisations and recommend a solution to meet the requirements.

Metadata is seen as very important for all activities. It is nominally part of the Publication Structure Specification, and will also be critical to any standards in the Rights and Rules area.

The Publication Structure is seen as one of the major achievements of Open eBook, together with getting international companies and organisations together to discuss, plan and create for the benefit of the consumer and a competitive marketplace.

The current main issue is maintaining momentum and participation in an extremely uncertain economic and political time.

While there are no known significant overlaps between Open eBook and other standards or metadata initiatives, the avoidance of such overlaps is important.

Commentary

The responses received reflected the trends which have been evident over the period of the Metadata Watch reports:

  • Metadata, in the initiatives which the reports are monitoring, is more than purely descriptive; it is increasingly oriented towards rights management issues, and to e-commerce requirements in general.
  • Collaboration and moves to address overlaps are increasing. At the very least, overlaps are recognised and taken into account, and in many cases their avoidance is considered to be key. This is evidenced by the approach taken by Open eBook where, in evaluating metadata aspects, assessment of relevant work by other standards organisations is a central activity, and by the recent announcements by the International DOI Foundation and the Content ID Forum.



Maintained by: UK Office for Library and Information Networking (UKOLN)
Last updated: 13 November 2001