All 17 projects funded by JISC in Phase 2 of the Discovery programme met in Birmingham today to share updates and ideas as they wind down their efforts. It was a very stimulating meeting, not least because the shared Discovery dialogue seems to have developed significantly during 2012. The Phase 1 projects undertook some very useful experiments, but the Phase 2 projects have taken things up a notch.
Here, in very raw form, are the recurrent themes that I recorded as takeaways from the session:
A – Data and access points
- Time and Place are priority access points
- URIs offer an effective base level linking strategy
- Collection level descriptions have potential as finding aids across domains
- User generated content, such as annotations, has a place at the table
B – People
- Community is a vital driver – open communities maintain momentum; specialist enthusiasms and ways of working provide strong use cases
- For embedding new metadata practice, start where the workers are – add-ins to Calm and MODS demonstrate that
- More IT experience / skills are required on the ground
C – The way the web works
- Aggregators crawl, they don’t query … OAI-PMH, robots.txt, etc
- Google’s strength shouts ‘Do it my way’ – and we should take heed (but we do need both/and)
- Currency of data is important – there may be a tension with time lags associated with crawling
- Aggregators need to know what is where to build or add value so … we don’t need a registry?
- No man is an island – it’s a collaborative world with requirements to interact with complementary services such as DBpedia, Europeana, Google, Historypin, Pleiades, UKAT, VIAF
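
As a rough illustration of the "crawl, don't query" point, the sketch below shows how an aggregator might page through a repository with OAI-PMH. It only builds the request URLs and reads the resumption token from a response. The base URL, the function names and the sample response are assumptions made for illustration, not taken from any of the projects.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespace used by OAI-PMH 2.0 response documents.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request URL (base_url is a hypothetical repository)."""
    if resumption_token:
        # resumptionToken is an exclusive argument: it replaces metadataPrefix.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


def next_token(response_xml):
    """Return the resumptionToken in a response, or None when the list is complete."""
    root = ET.fromstring(response_xml)
    token = root.find(f".//{OAI_NS}resumptionToken")
    if token is None or not (token.text or "").strip():
        return None
    return token.text.strip()


# Illustrative response fragment, not real harvested data.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

first_url = list_records_url("https://example.org/oai")
follow_up_url = list_records_url("https://example.org/oai",
                                 resumption_token=next_token(SAMPLE))
```

A real harvester would fetch each URL, store the records, and repeat until `next_token` returns `None`. How often it does so is where the tension with currency of data comes in.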
D – Tools and technology
- There is opportunity / obligation to leverage expert authority data and vocabularies – examples as above and more, such as Victoria County History, …
- Commonly used software tools include Drupal, Solr/Lucene, Elasticsearch, JavaScript, Twitter Bootstrap
- JSON and RDF are strong format choices amongst the developers
- Beware SPARQL endpoints and triple stores, especially in terms of performance
- APIs are essential – but little use without both documentation and example code
- OSS tools have been built by several projects … but how do we leverage them (e.g. Bibsoup, Alicat)
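
To make a few of these themes concrete (JSON as a format, URIs as the base linking strategy, time and place as access points), here is a minimal sketch of a collection-level description serialised as JSON. Every field name, value and URI is hypothetical and does not reflect any project's actual schema.

```python
import json

# Hypothetical collection-level description; all URIs and values are illustrative.
collection = {
    "uri": "https://example.org/collections/parish-maps",
    "title": "Parish maps",
    "time": {"start": 1840, "end": 1860},
    "place": {"label": "Birmingham",
              "uri": "https://example.org/places/birmingham"},
    "subjects": [{"label": "Maps",
                  "uri": "https://example.org/subjects/maps"}],
}


def linked_uris(node):
    """Recursively collect every string stored under a 'uri' key."""
    found = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "uri" and isinstance(value, str):
                found.append(value)
            else:
                found.extend(linked_uris(value))
    elif isinstance(node, list):
        for item in node:
            found.extend(linked_uris(item))
    return found


# Serialise for an API response, then read it back as a consumer would.
serialised = json.dumps(collection, indent=2)
round_tripped = json.loads(serialised)
uris = linked_uris(round_tripped)
```

In practice the place and subject URIs would point at shared authorities of the kind listed above rather than at local identifiers, so that an aggregator can link records across collections.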
