In March 2001 JISC commissioned a TechWatch Report into the use of content management systems (CMS) for University Web Sites[1]. Since then we have seen an exponential increase in the number of institutional staff who have editorial control of public-facing web pages. This has been one factor in a significant growth in the number of .ac.uk web pages created each day. There is value in allowing institutional staff (especially highly-rated teachers and researchers) to create public .ac.uk web pages in order to increase the online visibility and reputation of the institution. However, the institution also needs common structures and vocabularies with which to organise and maintain the public-facing website as it grows in size and amasses more legacy content. One key aspect of that organisation is the use of identifiers for web pages and data.
The overall aim of the Identifiers programme area of work is to improve the extent to which identifiers for public .ac.uk websites are planned and managed within institutions, and to contribute to the technologies and skills required to do that. These improvements may simply be better use of structured URIs for web pages, or may extend to the presentation of institutional information as linked data that can easily be used both inside and outside the institution[2].
- There is a useful glossary of terms available from UKOLN: http://www.ukoln.ac.uk/jisc-ie/blog/identifiers/identifiers-quick-reference/
- To help clarify the problem space, JISC arranged a community consultation on persistent identifiers, and the definitions and notes from that meeting are of value: http://identifiers2010.jiscpress.org/
- The Cabinet Office has released guidance on designing URI sets for public sector websites: http://www.cabinetoffice.gov.uk/media/301253/puiblic_sector_uri.pdf
Please note that all of the example links below are listed under the following “jiscPID” tag: http://www.delicious.com/tag/jiscpid
The following relate to the “Objectives” section of the Call document (the Call document should be read prior to reading this Briefing Paper). The examples below provide context for these objectives. They are not prescriptive: they are intended only as a guide to previous work in the problem space, and proposals should not be restricted by them.
Projects will need to engage a range of stakeholders from within the institution and make sure there are common communication methods for discussing the structuring of the university website or parts thereof.
● There has been experimental work done at UKOLN on the pragmatics of utilising linked data in a content management system: http://www.ukoln.ac.uk/jisc-ie/blog/2010/09/15/consuming-and-producing-linked-data-in-a-content-management-system/
● The Guardian has reported on how it has addressed structuring of data for journalism for its web platforms: http://www.guardian.co.uk/news/datablog/2010/aug/10/government-data-information-architecture
● JISC Strategic Content Alliance has published this report on the value to institutions of improving Search Engine Optimisation, which persistent identifiers support directly: http://sca.jiscinvolve.org/wp/2010/01/16/download-new-seo-report-with-case-studies/
Projects will need to be aware of how they can progress their proposed structure for the University’s public web pages; some examples of how to track that progress are provided below:
● Tim Berners-Lee’s 5 Stars of Linked Data: http://inkdroid.org/journal/2010/06/04/the-5-stars-of-open-linked-data/
● The 4 Steps of structuring data as adopted by the Resource Discovery Taskforce: http://www.ukoln.ac.uk/jisc-ie/blog/2010/08/19/aggregation-and-the-resource-discovery-taskforce-vision/
● Paul Walk’s reflection on his experience as a technical manager: http://blog.paulwalk.net/2010/09/21/institutions-and-the-web-done-better/
● Microsoft’s Jon Udell talks about the value of re-using parts of other organisations’ identifier structures: http://blog.jonudell.net/2009/08/31/the-joy-of-webscale-identifiers/
Projects are encouraged to build on pre-existing structured vocabularies where possible. Please note that many of these vocabulary structures are still in draft and are subject to change. Projects are not expected to use OWL/RDF, but rather they should consider the list of vocabulary terms as part of their URI structure. Proposals should actively discuss the use of any vocabulary as part of the project (see above):
● Vocabulary structure for the creation of researcher profiles: http://vocab.ox.ac.uk/res/researchers.htm
● Vocabulary structure for describing online communities and their participants: http://sioc-project.org/ontology
● Vocabulary structure for describing an organisation and the things it provides as part of its business: http://www.heppnetz.de/projects/goodrelations/
● Vocabulary for writing and annotating scientific publications: http://salt.semanticauthoring.org/
● Vocabularies for describing higher and further education institutions (UCAS, HESA, Athens): http://www.jiscmu.ac.uk/news/view/189
● Vocabulary for creating time based structures for web pages: http://www.mementoweb.org/guide/
● Various vocabularies used with bibliographic metadata: http://www.w3.org/2005/Incubator/lld/wiki/Library_Data_Resources
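As an illustration of using vocabulary terms as part of a URI structure, the following is a minimal sketch only. The base domain, the list of terms and the function names are all assumptions made for the example and are not drawn from any of the vocabularies listed above; a project would substitute the terms agreed within its own institution.

```python
import re
from urllib.parse import quote

# Hypothetical controlled vocabulary; a real project would take its
# terms from a published vocabulary agreed within the institution.
VOCABULARY_TERMS = {"person", "organisation", "publication", "course"}

# Hypothetical base for identifiers; not a real institutional domain.
BASE = "http://id.example.ac.uk"


def slugify(text: str) -> str:
    """Lower-case the text and replace runs of non-alphanumerics with hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def build_uri(term: str, local_id: str) -> str:
    """Return BASE/term/local-id, rejecting terms outside the vocabulary."""
    if term not in VOCABULARY_TERMS:
        raise ValueError(f"'{term}' is not in the agreed vocabulary")
    return f"{BASE}/{term}/{quote(slugify(local_id))}"


print(build_uri("person", "60"))
print(build_uri("course", "BSc Computer Science"))
```

Rejecting terms that are not in the agreed list is one simple way to keep URIs consistent as more staff create pages; whether that check sits in a CMS, a script or an editorial guideline is a decision for each project.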
Projects will need to be mindful of how they are communicating their proposed structures to a range of people across the institution. It is especially important that senior managers can easily engage with the proposed structure. Some examples of how that might be achieved are below:
● The University of Southampton provides a mindmap for the structure of the data on their website: http://mind42.com/pub/mindmap?mid=605c3bad-3980-4d4b-9155-75b33af8860d
● A step-by-step walk-through for how data was worked with and identifiers created for various website data: http://www.jenitennison.com/blog/node/145
● A personal account of coming to terms with structured data at the start of the work is described here: http://blogs.ukoln.ac.uk/locah/2010/09/22/creating-linked-data-more-reflections-from-the-coal-face/
In building their prototype structured public-facing web pages, projects will need to consider both the URI structures and how the data at those URIs is made available; examples of both are provided below:
- Public URI sets: http://www.cabinetoffice.gov.uk/media/301253/puiblic_sector_uri.pdf
- Cool URIs: http://www.w3.org/TR/cooluris/
- How the BBC makes websites: http://www.bbc.co.uk/blogs/radiolabs/2009/01/how_we_make_websites.shtml
- The Web Science Trust provides an example of URI structures, for instance in how it presents the information about each of its people: http://webscience.org/people.html
- The University of Southampton publishes its researchers’ profiles as machine-readable data: http://www.ecs.soton.ac.uk/people/lac = http://graphite.ecs.soton.ac.uk/browser/?uri=http://id.ecs.soton.ac.uk/person/60
- Projects should consider the value of open data structure standards, e.g. CSV, RSS/ATOM, JSON, RDFa, RDF, etc. Advice is available via UKOLN’s developer contact community forum: http://devcsi.ukoln.ac.uk/dev-contact/
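To show how the data behind a single URI might be offered in more than one open format, here is a short hedged sketch using only the Python standard library. The record, its field names and the example.ac.uk URI are hypothetical and serve only to illustrate the idea; they do not describe how any of the sites listed above are implemented.

```python
import csv
import io
import json

# Hypothetical record; the URI pattern and field names are illustrative only.
record = {
    "uri": "http://id.example.ac.uk/person/60",
    "name": "A. Researcher",
    "department": "Computer Science",
}


def to_json(rec: dict) -> str:
    """Serialise the record as indented JSON with sorted keys."""
    return json.dumps(rec, indent=2, sort_keys=True)


def to_csv(rec: dict) -> str:
    """Serialise the record as CSV: one header row and one data row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rec), lineterminator="\n")
    writer.writeheader()
    writer.writerow(rec)
    return buffer.getvalue()


print(to_json(record))
print(to_csv(record))
```

Keeping one source record and generating each format from it means the identifier in the "uri" field stays the same whichever representation a consumer requests.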
While the re-structuring of the identifiers and data at the .ac.uk public pages should directly benefit the institution, it should also benefit the wider sector, and the following references suggest how that might be achieved:
● Webscale identifiers and how they can add value to an organisation’s website: http://www.bbc.co.uk/blogs/radiolabs/2008/06/the_simple_joys_of_webscale_id.shtml
● Use of common human readable key-value pairs as identifiers: http://patterns.dataincubator.org/book/shared-keys.html
● Curation of websites and the long term value in structuring them that way: http://derivadow.com/2010/03/11/some-thoughts-on-moving-beyond-the-resource/
[1] Browning, P. and Lowndes, M. (2001) Content management systems http://www.jisc.ac.uk/whatwedo/services/techwatch/reports/horizonscanning/hs0102