It’s hot again. Enriched publications. For academic publishers, the importance of enriching publications could lie primarily in the added value this can offer to researchers, both readers and authors. This immediately raises a number of questions. Which criteria must an enriched publication and its associated dataset(s) meet to actually be considered an enrichment for an article or other type of publication? Are authors really prepared to release their research data? Which applications and software can the reader use? What is the status of an enriched article compared to a traditional article? What is the impact on current practice (e.g. reading and publishing)?

These are important questions that an academic publisher has to take into account when considering whether to offer this kind of service to authors and readers. These days it is vital for a publisher to be in the vanguard of technological developments in scientific information. Close collaboration with research groups and university libraries, in order to properly use and re-use data and supplementary files, is therefore an essential requirement. Scholars are usually the pre-eminent critical users of the newest technological applications, whether in the laboratory or through online applications when communicating with each other.[1]

Open Access

The internet makes new forms of production, storage and distribution of information possible. This has led to a transition in the field of scientific publishing. Libraries are modernising, and they now offer scholars different options for durable storage via digital publication archives (repositories) and cloud-storage services like Figshare and Harvard Dataverse. The idea of free access to information has ultimately developed into the Open Access movement, with the milestone of the ‘Berlin Declaration on Open Access to Knowledge in the Sciences and Humanities’ that was formulated in 2003. The idea of Open Access stated in this declaration has expanded to incorporate the idea that Open Access contributions can include original scientific research results, raw data and metadata, source materials, digital representations of pictorial and graphical materials, and scholarly multimedia material. More and more media content is becoming available in the public domain; see, for instance, the OpenImages project in the Netherlands.

Enhanced or enriched?

But what exactly is meant by an ‘enriched publication’? In a state-of-the-art study about enriched publications (Woutersen-Windhouwer & Brandsma, 2008), the authors distinguish three major publication models. The modular model of Kircz (1998) splits a publication into independent modules such as abstract, problem definition, methodology, etc. The second is the semantic publication model by Hunter (2006), which bears some resemblance to the modular model, but focuses on the workflow. What these two models have in common is that they predict the end of the traditional publication model. The third model is a more generic one. It creates a loosely coupled system of independent objects such as images, texts, datasets, etc. The report summarizes what kind of resources can exist within an enriched publication and suggests the following:

“An enhanced (or enriched) publication is a publication that is enhanced with research data, extra materials, post publication data or database records, and that has an object-based structure with explicit links between the objects. In this definition an object can be (part of) an article, a data set, an image, a movie, a comment, a module or a link to information in a database.”

Since 2009 I’ve been involved in a few projects enriching scholarly journal articles and monographs. All of these projects remained in an experimental phase. In 2009 and the years that followed it was simply too early. Storage wasn’t the problem, but repositories weren’t equipped, there were no APIs available, and persistent identifiers like DOIs for data weren’t common practice.

But above all, authors weren’t ready for it.[2]

I always make a clear distinction between ‘enhanced publications’ (hyperlinked content and data, with added metadata) and ‘enriched publications’. The latter type offers much more functionality (preferably inside the digital publication itself), such as visualizations of data and facilities to explore and analyse data, which provide a better understanding of the data set and the underlying research.[3] Enriched publications can contain different types of enrichments, such as research data, visualizations, annotations, websites, and maps, and they can be composed in different ways. To ensure the scientific integrity and complete usability of the enriched publication, it is important for all the components, together with all the relations (e.g. links and online data from other sources), to be preserved in a durable manner.

E-books are an attractive format for this sort of enrichment, particularly those based on the EPUB3 standard, which allows the addition of multimedia (just like HTML5). Books are much larger, however, so enrichment may be more complicated, as may intellectual property rights and licenses.

In addition to supporting the textual publication with, for example, data or visualizations, these kinds of publications also promote the availability of re-usable scientific research data and, above all, allow verification of the outcomes of research. Ideally, repositories should have an API that allows publishers to obtain data from the archive and to enrich a publication dynamically by processing the extracted data set simultaneously. There is an important rule here, though: in my opinion enrichments should not be overdone. Less is more. Since enrichments of publications have visualization and specialized analysis features, it is important to provide them in a durable way, to ensure the integrity of the publication. This requires that all components needed for these features (hosting, software, configuration and real-time conversion of data) are guaranteed by a durable organization. One of the remaining issues is that such organizations often restrict themselves to providing durable formats (like CSV), while modern software often requires more up-to-date formats (like JSON). But this is rapidly changing.
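
The gap between durable archive formats and the formats modern software expects can often be bridged at request time. As a minimal sketch, assuming a small archived CSV file with a header row (the column names and values are invented for illustration), Python's standard library can turn it into JSON that a browser-based visualization could consume:

```python
import csv
import io
import json


def csv_to_json(csv_text: str) -> str:
    """Convert CSV text with a header row into a JSON array of objects."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [dict(row) for row in reader]
    return json.dumps(rows, indent=2)


# Hypothetical archived data set; columns and values are illustrative only.
archived_csv = "year,articles\n2009,12\n2010,18\n"

if __name__ == "__main__":
    print(csv_to_json(archived_csv))
```

All values stay strings here, because CSV carries no type information. A real conversion service would need a schema or a codebook from the archive to restore numbers and dates.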

Researchers still think mainly in terms of the traditional article and are not used to employing their raw datasets in an interactive manner to support their arguments. This demands a cultural shift from researchers. The infrastructure is present and will be developed further. Now it is time for the researcher to start making use of it. The point is to make enriched publications more visible and, more importantly, creditable, considering the time and effort they require.

Metadata and Discovery

An important feature of enriched publications is the ability to improve the discovery of research data. Readers of enriched publications are able to explore the accompanying data, to verify the conclusions and to develop their own theories. In a traditional publication, the relation between the article and the research data is implicit, and researchers can only discover the data by reading the article. This can be overcome by defining an explicit relation between the article and the research data within the metadata. This allows portals and search engines to notify researchers of the research data when they find a publication.
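
What such an explicit relation could look like can be sketched in a few lines. This is only an illustration under stated assumptions: the record structure is made up, the relation name "IsSupplementedBy" is borrowed from DataCite's vocabulary, and the identifiers are hypothetical. The point is that a portal can read the link from the metadata without parsing the article text:

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Relation:
    relation_type: str  # e.g. "IsSupplementedBy" (name borrowed from DataCite)
    target_id: str      # identifier of the related object


@dataclass
class Record:
    identifier: str
    title: str
    relations: List[Relation] = field(default_factory=list)


def related_datasets(record: Record) -> List[str]:
    """Return identifiers of data sets explicitly linked to a publication."""
    return [r.target_id for r in record.relations
            if r.relation_type == "IsSupplementedBy"]


# Hypothetical identifiers, for illustration only.
article = Record(
    identifier="doi:10.1234/example.article",
    title="An example article",
    relations=[Relation("IsSupplementedBy", "doi:10.1234/example.dataset")],
)

if __name__ == "__main__":
    print(related_datasets(article))
```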

The Object Reuse and Exchange protocol from the Open Archives Initiative (OAI-ORE) allows the definition of an enriched publication as an identifiable aggregation of a publication and enhancements or research data. For instance, this protocol was applied to build a demonstrator of enhanced publications in the Netherlands (Hogenaar & Hoogerwerf, 2008) for the DRIVER II project.
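
To give an impression of what such an aggregation looks like, here is a minimal sketch that writes the core statements of an ORE Resource Map as N-Triples. The `ore:describes` and `ore:aggregates` terms come from the OAI-ORE vocabulary; the example.org URIs and the helper functions are hypothetical, and a complete Resource Map would carry further metadata (such as its creator and modification date) that is left out here:

```python
from typing import List

ORE = "http://www.openarchives.org/ore/terms/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def triple(subject: str, predicate: str, obj: str) -> str:
    """Format one N-Triples statement whose three terms are all URIs."""
    return f"<{subject}> <{predicate}> <{obj}> ."


def resource_map(rem_uri: str, aggregation_uri: str, parts: List[str]) -> str:
    """Describe an aggregation and the resources it bundles."""
    lines = [
        triple(rem_uri, ORE + "describes", aggregation_uri),
        triple(aggregation_uri, RDF_TYPE, ORE + "Aggregation"),
    ]
    lines += [triple(aggregation_uri, ORE + "aggregates", p) for p in parts]
    return "\n".join(lines)


# Hypothetical URIs for an article and its data set, for illustration only.
example = resource_map(
    "https://example.org/rem/1",
    "https://example.org/aggregation/1",
    ["https://example.org/article.pdf", "https://example.org/dataset.csv"],
)

if __name__ == "__main__":
    print(example)
```

The aggregation gets its own URI, which is what makes the enriched publication identifiable as a whole even when its parts live in different repositories.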

Preservation and permanent access

The introduction of enriched publications in scholarly communication creates new challenges with regard to preservation and permanent access. Besides the publications and the research data, they introduce two more elements that need to be preserved. One is the relation between the enrichments and the publications. Technically, this is not an issue, since these relations can easily be described in XML/OAI-ORE. The question is who takes responsibility for them, as the components can exist in different repositories. The other element is preservation of the visualizations: this is both a technical and an organizational challenge.

The technical challenge is introduced by the dependency on software for visualization, since solutions do not yet exist for preserving software. To ensure the scientific integrity and complete usability of the enriched publication, it is important for all the components, together with all the relations (e.g. links and online data from other sources), to be preserved in a durable manner. For instance, the Royal Library in the Netherlands has facilities to preserve complete website environments, but it cannot preserve all the functionalities of the tools. The only solution is to preserve the basic functionality by replacing the tools over time, which turns the challenge into an organizational one: who can and will take responsibility for doing so?

Added value of enrichments

The enrichment of articles is clearly still being developed; that it is going so slowly is probably due to unfamiliarity with the available functionalities and possibilities. And yet, enriched publications in theory offer an evident added value compared with traditional publications, even those available online. First of all, they enable authors to visualise and integrate parts of their research material in the publication. Thus, they can present the foundation for their findings better. Second, the methodology and results of the study can be made more comprehensible and thus more verifiable for the reader. This makes the authors more vulnerable, but will ultimately lead to an improved quality of the research. The authors will have to subject the obtained research data to higher standards of quality before turning to publication. Currently, scientists are often still hesitant to make their research data accessible to the outside world, for example because commercial interests are involved or there may be doubts about the specific methodology of the study. The enrichment of publications will not change this situation immediately, but such publications could contribute to the transparency of science. The manner and the extent to which this will happen remain to be seen. Third, the information that is stored in online data archives will have to be made more accessible and applicable. The archive and storage function of this sort of databank will have to be supplemented with a publication application that ensures that the research data will be used more intensively in future scientific research. The fact that readers can make use of more precise and detailed research material can only promote the scientific debate.

Developments in Media Studies

Quite recently I came across a paper (pre-print) entitled: ‘Open media scholarship: The case for open access in media studies’. One of the arguments this study makes is:

“the topics that we write about are inescapably multimedia, so our publishing platforms should be capable—at the very least—of embedding the objects that we study”[4]

It only mentions a few examples of new publishing platforms. It also mentions open access for scholarly monographs as an example. I think this is something else. Open Access to monographs in itself is neither an enhancement nor an enrichment; it could only lead to new enhanced formats when the right tools are used.

One of the projects mentioned as an example of good practice in enriching scholarly output is MediaCommons. As the founders, Kathleen Fitzpatrick and Avi Santo, say:

“As media scholars can make the “form must follow content” argument convincingly, and as tenure qualifications in media studies often include work done in media other than print already, we hope that media studies will provide a key point of entry for a broader reshaping of publishing in the humanities.”[5]

Over the last 10 years the project has discussed questions such as what counts as scholarship anyway, and has experimented with open peer review and much more.

This is all very good. But what I still find remarkable is that in most journals from existing, traditional publishing houses the embedding of videos, the addition of dynamic data and visualisations, and so on, is hardly being implemented. It seems that almost all online innovations come from scholar-led platforms like MediaCommons, or from start-ups developing specific writing and reading tools. And these tools make it easy to add media content.

The introduction of the audio-visual essay has been a huge success, and on all levels you see a growing number of audio-visual essays. For a good start, take a look at In Transition[6], a side-project of MediaCommons. I won’t go into details here about the discussion of the academic status of audio-visual essays, nor will I go into details about how to organise peer review for these new objects of research. I just want to point to examples of new publishing formats. One of the issues with these ‘objects’ is of course sustainability. In almost all examples I’ve seen, videos are stored with Vimeo or YouTube: commercial web services with clauses in their terms that enable them to remove content when they think it needs to be removed. And the objects have no persistent links.

Other nice examples are the online authoring and reading tools, like Apple’s iBooks Author (with which you can easily make your own enhanced e-book). I believe tools like Scalar are really interesting for full integration of media content.

From their website: ‘The Alliance for Networking Visual Culture seeks to enrich the intellectual potential of our fields to inform understandings of an expanding array of visual practices as they are reshaped within digital culture, while also creating scholarly contexts for the use of digital media in film, media and visual studies. By working with humanities centers, scholarly societies, and key library, archive, and university press partners, we are investigating and developing sustainable platforms for publishing interactive and rich media scholarship.’[7]

In July 2016 Thomas van den Berg and Miklos Kiss launched their ‘Film Studies in Motion: From Audiovisual Essay to Academic Research Video’ enhanced book project.[8] The Scalar reading tool enables the reader to play around with the content and to make customized data visualizations. It has institutional backing and is still in a developmental phase.

Nevertheless, it’s becoming easier to make your own enriched editions of papers, articles or books. There are many more examples of DIY software tools that take academic standards seriously.

The book Film Studies in Motion opens with the following quote by Mark and Deborah Parker in their book The DVD and the Study of Film: The Attainable Text:

“One of the great ironies of film study is that its ‘evidence’ (a term itself derived from Latin and meaning ‘out of the seen’) has so limited a visibility in print form.”[9]

I couldn’t agree more. There is so much more out there. I will keep on collecting good examples of useful tools and post them here.

Let’s get to work and get the images actually moving.

