
Licence to Share

Research and Collaboration through Go-Geo! and ShareGeo

Guy McGarva, Geoservices Support, Project Manager for ShareGeo

Nicola Osborne, Social Media Officer for EDINA

David Medyckyj-Scott, Research & Geo-data Services Team Manager

UK e-Science All Hands Meeting 2009: Sharing & Collaboration, Wednesday 9th December 2009

Outline of this talk

• Introduction to ShareGeo & Go-Geo!

• The ShareGeo and Go-Geo! Community.

• The benefits and challenges of sharing geospatial data.

• Our experiences of enabling sharing of geospatial data sets (so far).

• Challenges and opportunities for the future.

Introduction to ShareGeo & Go-Geo!

http://edina.ac.uk/digimap

What is the ‘community’?

• 40,000 users in 160 institutions.

• Scientists, researchers and students who use geospatial data.

• Those with data to share.

• Individuals looking for existing data and relevant resources.

• Those seeking visibility or reputation gains through building an active Depositor Profile.

• Engagement via the Digimap Blog, emails, RSS feeds, website updates, newsletters and Go-Geo! Twitter stream.

• Large natural overlap between the ShareGeo and Go-Geo! user communities.

[Chart: Digimap Users by status: Undergraduate 64%, Postgraduate 22%, Staff 14%.]

[Chart: Digimap Users by subject area. Categories as extracted: Biotechnology and ..., Engineering and Physical Sciences, Natural Environment, Medical Sciences, Economic and Social Science, Particle Physics and Astronomy, Information Services. Values in extraction order, not reliably matched to categories: 4%, 9%, 13%, 33%, 1%, 1%, 17%, 19%, 4%.]

Why Share Geospatial data?

• Significant collections of geospatial data have already been created.

• High cost (time and money) associated with collecting data.

• Existing data can form useful components of new data sets.

• Research, and therefore use of data, can extend over long time periods.

• Increase visibility and/or create a record of data.

• Benefit from and work with derived licensed data.

Examples of types of Geospatial data in ShareGeo

GPS, Land use, DTM, Grids, Boundaries, Imagery, Derived OS data

Challenges of sharing Geospatial Data

• Sharing Licensed Data, particularly derived data, requires compliance with often complex licenses.

• Increased sharing of data is useful only if data has integrity and is of consistent quality [1].

• Commercial arrangements and third party data are subject to additional restrictions.

• There are technical issues such as format.

• Any shared data will be subject to a distributed trust network: you must trust any potential downloader not to expose or misuse data.

Derived Data & Licenses

Publish Metadata in Go-Geo! using GeoDoc

Private

– Create Metadata in GeoDoc.

– No one will be able to discover your metadata.

– You can export metadata to XML to save locally.

Institutional Node

– Create Metadata in GeoDoc.

– Publish metadata to an Institution Node.

– Only members of your institution will be able to discover your metadata.

– You can export metadata to XML to save locally.

Public

– Create Metadata in GeoDoc.

– Publish metadata as Public.

– Anyone searching Go-Geo! or the www will be able to discover your metadata.

– You can export metadata to XML to save locally.

Research Cluster Node (planned)

– Create Metadata in GeoDoc.

– Publish metadata as Cluster Node.

– Anyone registered as part of your cluster will be able to discover your metadata.

– You can export metadata to XML to save locally.
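The publication options on this slide amount to a small visibility model, which can be sketched roughly as below. This is an illustration only, not GeoDoc's or Go-Geo!'s actual implementation: the names `Scope`, `User`, `MetadataRecord` and `can_discover` are all hypothetical, and the treatment of the planned cluster node is an assumption based on the slide wording.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Scope(Enum):
    """Hypothetical publication scopes mirroring the four options on the slide."""
    PRIVATE = "private"
    INSTITUTION = "institution"
    CLUSTER = "cluster"  # listed as planned on the slide
    PUBLIC = "public"


@dataclass(frozen=True)
class User:
    institution: Optional[str] = None
    clusters: frozenset = frozenset()


@dataclass(frozen=True)
class MetadataRecord:
    title: str
    scope: Scope
    owner_institution: Optional[str] = None
    cluster: Optional[str] = None


def can_discover(record: MetadataRecord, user: User) -> bool:
    """Return True if `user` could find `record` by searching (assumed rules)."""
    if record.scope is Scope.PUBLIC:
        # Anyone searching can discover public metadata.
        return True
    if record.scope is Scope.INSTITUTION:
        # Only members of the publishing institution can discover it.
        return (user.institution is not None
                and user.institution == record.owner_institution)
    if record.scope is Scope.CLUSTER:
        # Only users registered in the record's cluster can discover it.
        return record.cluster is not None and record.cluster in user.clusters
    # PRIVATE: no one can discover the record through search.
    return False
```

For example, a record published to an institution node with `owner_institution="University A"` would be discoverable by `User(institution="University A")` but not by a user from another institution.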

ShareGeo & Go-Geo!

• ShareGeo is intended for when:

– You are willing or able to share your data.

– You want to reuse existing datasets.

– You are sharing within a known community.

• Go-Geo! provides an alternative sharing mechanism for those who:

– Need to publicise the existence of data.

– Need to share metadata about ongoing work where the collected data may still be changing.

– Cannot trace all licenses for, and therefore cannot share, complex data combinations (‘Grey’ data).

– Want to share metadata publicly or with peers.

Our contribution experiences

• Usage of both services is steadily growing…

• But most of our users are consumers not creators:

– Around 0.5% of ShareGeo users upload data, but 22% of ShareGeo users download data.

– 1.5% of Go-Geo! users create metadata records; 18% of all Go-Geo! visitors access these.
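As a rough worked example, the percentages above can be turned into consumer-to-creator ratios. The helper name `consumers_per_creator` is hypothetical, and the Go-Geo! figure is only indicative because the two percentages are quoted against different bases (users versus visitors).

```python
def consumers_per_creator(creator_pct: float, consumer_pct: float) -> float:
    """Ratio of consuming users to contributing users, from two percentages."""
    if creator_pct <= 0:
        raise ValueError("creator_pct must be positive")
    return consumer_pct / creator_pct


# Figures quoted on the slide.
sharegeo_ratio = consumers_per_creator(creator_pct=0.5, consumer_pct=22)
gogeo_ratio = consumers_per_creator(creator_pct=1.5, consumer_pct=18)

print(f"ShareGeo: about {sharegeo_ratio:.0f} downloaders per uploader")      # 44
print(f"Go-Geo!: about {gogeo_ratio:.0f} readers per metadata creator")      # 12
```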

Our contribution experiences compared to others’

• 1:9:90 is the often quoted rule for online participation [2].

• Under 0.0001% of Firefox users contribute to development or testing [3, 4].

• 0.02% of Wikipedia users are (active) editors/contributors [5].

• 7% of OpenStreetMap users make some type of edit each month [6].

• 10% of Twitter users author 90% of all Tweets [7].

• 24% of the British public have voted for a reality TV show [8].

• 49% of active UK Internet users have a profile on a Social Networking Site [9].

• 62% of the registered UK voters voted in the 2005 General Election [10].

Our Experiences to Date

• Far more downloads than uploads.

• Comments back from ShareGeo users include:

“I am not sure if my data would be of interest to others”

“I have often considered adding data to ShareGeo, given how often individuals must reproduce the work of obtaining and pre-processing datasets; however the license agreements for each datasets prohibit such action.”

“If I had any data to share I would definitely use ShareGeo.”

Sharing Behaviours & Cultures

Academic culture

– Funding is competitive.

– Access can be very restricted (especially pre-publication).

– Commercial restrictions may apply.

– Collaboration is rarely directly rewarded.

– It is hard to trust the reliability of others’ data.

– As a contributor there are concerns that:

• Data could be misused (maliciously or not; “Data is often viewed through a tribal prism” [11, 12]).

• You might somehow be liable for what others do with/derive from your data.

• You could receive time-consuming questions about your data.

Licensing culture

– Perceived as complex and litigious.

– Can be intimidating even if data is licensed for sharing.

Personal vs. community benefit

– Greatest benefit to the community paradoxically when a contributor is exiting it (e.g. graduating students).

– Selfless attitude & strong sense of community rare.

Challenges for the Future

• Raise awareness and increase impact of ShareGeo and Go-Geo!

• Increase the number of both passive users and proactive creators.

• Define and publicize benefits to depositors, particularly around:

– Community benefits (continuity, reuse & saved costs).

– Personal reputation benefits (e.g. citation).

• Engage in the Making Public Data Public initiative

Opportunities

“Making Public Data Public” initiative [13]:

• Prime Minister announced, in November 2009 [14], that Ordnance Survey is going to make “some” data available for free:

– Electoral and local authority boundaries

– Postcode areas

– Mid-scale mapping

• Data from other agencies including crime, transport, health, education to be included.

• Should reduce barriers to sharing data.

Short Term Technical Improvements

• Integrate with standard desktop apps for ‘one-touch’ submission e.g. using SWORD.

• Visualize data with plug-in applications.

• Option to expose metadata either to search engines (Google) directly or via Go-Geo!

• Provide data in alternative formats, including web-services.

• Add more social features such as annotations, tagging and ratings.

Long Term Policy Improvements

• Source more open data (especially as more types of data become open).

• Create ‘open access’ ShareGeo for unlicensed and/or less restrictively licensed materials.

• Measure, and display, the impact (re/use) of data more effectively.

• Improve visibility of data reuse and of the impact of ShareGeo (e.g. through citations).

• Seamless interoperability (around policy, licensing, access levels etc.) with the Go-Geo! metadata portal.

Thank You

• If you have any questions we would be very happy to answer them.

• Or email us:

• [email protected]

• [email protected]

• Or if you have any general comments about ShareGeo or Go-Geo!, email:

• [email protected]

Links

• ShareGeo: http://edina.ac.uk/projects/sharegeo/

• Go-Geo!: http://www.gogeo.ac.uk/

• EDINA: http://www.edina.ac.uk/

References

1. Brown, Ian (2009). Cybercrime and data sharing. Slides presented at the Fifth Annual European Geospatial Intelligence Conference in London on 22nd January 2009. Accessed 1st December 2009: http://www.slideshare.net/blogzilla/cybercrime-and-data-sharing-presentation

2. Nielsen, Jakob (2006). Participation Inequality: Encouraging More Users to Contribute. Jakob Nielsen’s Alertbox (9th October 2006). Accessed 1st December 2009: http://www.useit.com/alertbox/participation_inequality.html

3. Mozilla (2009). Our Contributors. Accessed 2nd December 2009: http://www.mozilla.org/credits/

4. Shankland, Stephen (2009). After 5 years, Firefox faces new challenges. CNET news (9th November 2009). Accessed 2nd December 2009: http://news.cnet.com/8301-17939_109-10392542-2.html

5. Wikimedia (2009). Wikimedia Monthly Report Card (October 2009). Accessed 2nd December 2009: http://stats.wikimedia.org/reportcard/#fragment-63

6. OpenStreetMap (2009). Stats – OpenStreetMap. Accessed 3rd December 2009: http://wiki.openstreetmap.org/wiki/Stats

7. Heil, Bill and Piskorski, Mikolaj (2009). New Twitter Research: Men Follow Men and Nobody Tweets. Harvard Business School Blog (1st June 2009). Accessed 2nd December 2009: http://blogs.harvardbusiness.org/cs/2009/06/new_twitter_research_men_follo.html

8. Ipsos MORI (2008) quoted in Wardle, Claire and Williams, Andrew (2008). UGC@thebbc: understanding its impact upon contributors, non-contributors and BBC News. Cardiff School of Journalism, Media and Cultural Studies, Cardiff. Accessed 1st December 2009: http://www.bbc.co.uk/blogs/knowledgeexchange/cardiffone.pdf

9. Dutton, W.H., Helsper, E.J., and Gerber, M.M. (2009). The Internet in Britain: 2009. Oxford Internet Institute, University of Oxford. Accessed 2nd December 2009: http://www.oii.ox.ac.uk/microsites/oxis/publications.cfm

10. BBC (2005). Election 2005: Results. Accessed 1st December 2009: http://news.bbc.co.uk/1/hi/uk_politics/vote_2005/constituencies/default.stm

11. Rusbridger, Alan (2009). Climate science: Inconvenient truths. Editorial for Guardian.co.uk: Comment is Free section (3rd December 2009). Accessed 3rd December 2009: http://www.guardian.co.uk/commentisfree/2009/dec/03/climate-sceptics-hackers-leaked-emails

12. Hickman, Leo and Randerson, James (2009). Climate sceptics claim leaked emails are evidence of collusion among scientists. Guardian.co.uk (20th November 2009). Accessed 3rd December 2009: http://www.guardian.co.uk/environment/2009/nov/20/climate-sceptics-hackers-leaked-emails

13. Cabinet Office (2009). Stephen Timms reports progress on Making Public Data Public. Cabinet Office Digital Engagement blog (27th October 2009). Accessed 3rd December 2009: http://blogs.cabinetoffice.gov.uk/digitalengagement/post/2009/10/27/Stephen-Timms-reports-progress-on-Making-Public-Data-Public.aspx

14. Prime Minister’s Office (2009). Ordnance Survey to open up data – PM. Number10.gov.uk (17th November 2009). Accessed 3rd December 2009: http://www.number10.gov.uk/Page21343