Community Development with the Open Planets Foundation
Helping developers and practitioners to collaborate to improve digital preservation software.
INTRODUCTION AND AGENDA
• Your Presenter
• Session Topics
• My Role
17/07/2015
A Little About Me
Carl Wilson
Software Configuration Manager
Open Planets Foundation
Email : [email protected]
Skype : carl.f.wilson
GitHub : carlwilson
Twitter : carlwilsoneu
Google+ : [email protected]
Manifesto
• State the case for community development
• A little about the OPF philosophy
• Best practice for starting projects
• Continuous integration with Travis-CI
• Packaging & releasing software
• Managing development with GitHub
• Continuous Improvement
• Next Steps
My Role
• Software Configuration Manager
• NOT developing software
• NOT guiding project direction
• Improving development practices
• Improving software quality
• Practical assistance
WHY ONE COMMUNITY?
• Benefits of Community
• More Developers, Users, & Testers
• One Community
• Sharing
A Small Community
• Digital Preservation still a fairly niche concern
• Research, heritage, and public institutions all
facing financial challenges
• Two heads are better than one
• A problem shared IS a problem halved
• We can all benefit from the sum of our best
efforts and expertise
More………
• Developers
Contribute fixes, features & help shape development
• Users
Use the software & help shape development
• Everyone
Test software, report bugs, and request features
One Community Because….
• Consistent Methods
Conventions make life easier
• One Hub for Software
Easier to find tools to fit your needs
• A Small Community
Not so large that fragmentation is beneficial or
organisation is problematic
What We Can’t (Always) Share…..
• Large Software Systems
Member organisations vary in size and needs
Internal conventions and workflows differ
• Data / Content
Scale Issues – moving data takes a long time
Security Issues – data in transit can be “stolen”
Copyright Issues – a legal and reputational minefield
Solutions for Sharing Software
• Share Small Focussed Tools & Components
Reusable components that make up large systems
Web services, particularly REST APIs
Tools that perform a single task well, e.g. jpylyzer
• Favour Cross Platform Development
Web Apps, Java, Python, Ruby, PHP, etc.
Cross compilation – not for the faint hearted
• Favour Linux
Free Operating Systems == virtualization goodness
Solutions for Sharing Data
• Use Open Corpora
GovDocs, Atlas of Digital Damages, Public Domain/CC
• Create & Share Small Focussed Corpora
OPF can help members create and host open test
corpora illustrating real world risks and problems
https://github.com/openplanets/format-corpus
Web harvest and adapt CC licensed content??
OPF PHILOSOPHY
• Simplicity Itself
• The OPF Software Lifecycle
• Community Involvement & Member Support
• Leveraging Public Infrastructure
Keeping it Simple
• OPF Software Philosophy
Borrows from Unix philosophy
– Small is beautiful.
– Make each program do one thing well.
– Build a prototype as soon as possible.
– Choose portability over efficiency.
• Ideally projects should be as small as practical
Small and Regular
• OPF Contribution Philosophy
– Heroic contributions are gratefully received but not required
– Small but regular contributions are key to sustainability
– Regular not necessarily frequent
– Leverage the power of habit
OPF Software Curation Process
http://wiki.opf-labs.org/display/PT/The+Software+Curation+Process
We’re All Individuals….
• Different Strokes for Different Folks
People should use the tools they’re comfortable with
People should develop for the platform they use
• OPF won’t Mandate
Platforms, Languages, Tools, or Methods
BUT
• We’re NOT experts in everything
OPF will specialise in supporting platforms and tools
that all members are free to use
• Open Platforms are Favoured
For all sorts of reasons, see the next slide……
Why Open?
• All members can access open platforms
• Open platforms reach a wider community
• Using Linux simplifies licensing
In turn simplifying virtualisation, for:
– Continuous Integration
– Demonstrator Machines
– Training Machines
– Deployment on Scalable Infrastructure
• It’s in our name…….
Always Supporting
• We won’t withhold support for Windows or
Mac development
• We will go the extra mile(s) to help produce
apt and rpm packages, or maven artefacts
Use External Infrastructure
• OPF can’t afford to host everything
• Publicly hosted services mean development in
the open
• All members of the digital preservation
community can access public services
OPF Weapons of Choice
• GitHub
For revision/source control, issue tracking
• Travis-CI
For free, multi language, GitHub integrated CI
• Jenkins
For when Travis is not enough
• Bintray
For hosting binary packages
Four Services, One Login……
GitHub IDs for All
Sign up for GitHub, and join the OPF organisation
http://wiki.opf-labs.org/display/PT/GitHub+Guide
• [email protected]
OPF GitHub organisation admin
• GitHub ID, and OPF Teams used for :
Travis-CI, Jenkins, & Bintray via GitHub OAuth
STARTING A PROJECT
• README & LICENSE files
• OPF Metadata
• GitHub Pages
Source Code Availability
• Support for Public Source Projects Only
If it doesn’t happen in the open, it doesn’t happen
• GitHub is the OPF Tool of Choice
Ideally on the OPF organisation page:
https://github.com/openplanets
• If it’s Publicly Available & Open…
We’ll do all we can to support it
What Belongs in a Source Repo
• GUIDELINE: if it’s editable, it belongs in the repo:
– code, e.g. java, c, shell scripts, etc.
– config files, e.g. xml, properties files, etc.
– good documentation, e.g. READMEs, install
instructions, etc.
– build files, e.g. poms, ant build files, makefiles
– working unit tests (preferably with good coverage)
– small test data sets for unit tests
– good checkin comments
Try To Avoid Adding
• build artefacts (exes, wars, jars, .class files, Maven
target dirs)
• containers (zips, etc.). Use the raw data and a build
tool to create the container. Git handles
compression so zips don't help.
• volatile data
• large test files, they make the source awkward to
download
• third party artefacts (jars, apps, dlls); provide links
and documentation instead.
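The list above can be turned into a quick pre-commit audit. The following is a minimal sketch, not an OPF tool: the suffix list is only illustrative, and the 5 MB threshold for a "large" file is an assumption, since the slides give no figure.

```python
import os

# Illustrative suffixes taken from the slide; extend to suit the project.
UNWANTED_SUFFIXES = (".exe", ".war", ".jar", ".class", ".dll", ".zip")
# Hypothetical threshold for a "large" file; the slides do not state one.
LARGE_FILE_BYTES = 5 * 1024 * 1024


def find_unwanted_files(root, max_bytes=LARGE_FILE_BYTES):
    """Return sorted (relative_path, reason) pairs for files that probably
    should not be committed to the source repository."""
    findings = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Do not descend into Git's own metadata directory.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            path = os.path.join(dirpath, name)
            rel = os.path.relpath(path, root)
            parent_dirs = rel.split(os.sep)[:-1]
            if "target" in parent_dirs:
                findings.append((rel, "inside a Maven target directory"))
            elif name.lower().endswith(UNWANTED_SUFFIXES):
                findings.append((rel, "build artefact or container"))
            elif os.path.getsize(path) > max_bytes:
                findings.append((rel, "large file"))
    return sorted(findings)
```

Calling `find_unwanted_files(".")` from the root of a working copy lists candidates to remove or to ignore before the next commit.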
A Good READ(ME) Part 1
• All projects should have a README containing:
– A project “mission statement”, i.e. what the software
is intended to do, and an indication of target users.
This is NOT a detailed feature list, but a succinct
summary of the project’s aims.
– Here’s an example from LibreOffice:
“LibreOffice is the power-packed free, libre and open source personal
productivity suite for Windows, Macintosh and GNU/Linux, that gives you
six feature-rich applications for all your document production and data
processing needs: Writer, Calc, Impress, Draw, Math and
Base. Support and documentation is free from our large, dedicated
community of users, contributors and developers. You, too, can get
involved!”
A Good READ(ME) Part 2
• The README should also give developer level
instructions for getting the software running.
– This doesn’t mean that an installation package is
required, just enough to get an experienced dev up
and running
• It should also list pre-requisites required to run
the software:
– e.g. Python 2.7, Java 6, Operating System, etc.
• Finally it should give some idea of the
development status of the project
State your License
• Each project should explicitly state its licence
conditions.
• Open Licenses Listed & Identified @
The Open Source Initiative :
http://opensource.org/licenses
• A LICENSE file in the root of the project is
sufficient at first.
• Get in Touch for Advice (it’s a dry subject)
A Little YAML Goes a Long Way
• One More Piece of Glue
A single file .opf.yml lives in the root directory
• Human/Machine Readable Project Metadata
Name, “vendor”, contacts, links, & extensible
http://wiki.opf-labs.org/display/PT/GitHub+Guide#GitHubGuide-InitialREADME.mdFile
• Where Hooks and Links Live
Non-Travis CI, Code Quality Reports, web sites, etc.
GitHub Pages & Wikis
• GitHub Also Provide a Place for Web Pages
Example: SCAPE SCOUT Software
Project : https://github.com/openplanets/scout
Branch : https://github.com/openplanets/scout/tree/gh-pages
Pages Site : http://openplanets.github.com/scout/
More Info : http://pages.github.com/
• GitHub also provide project Wikis
SCOUT Wiki : https://github.com/openplanets/scout/wiki
A little ad-hoc, we’d prefer use of http://wiki.opf-labs.org
More Info : https://github.com/blog/774-git-powered-wikis-improved
CONTINUOUS INTEGRATION
• Continuous Integration 101
• Travis-CI
• OPF Jenkins
CI for Everyone
• Continuous Integration
– Software Engineering Practice
– Developer contributions to project merged ASAP
– Project built whenever new code is added
– Unit tests run
– Developers know when the code base is broken
– Users can see if a project builds
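The cycle described here (build on every change, run the unit tests, report whether the code base is broken) can be sketched in a few lines. This is only an illustration of the idea, not how Travis-CI or Jenkins are implemented; the function name `run_pipeline` and the step names are hypothetical.

```python
import subprocess


def run_pipeline(steps):
    """Run (name, command) steps in order and stop at the first failure,
    so a broken build is reported as early as possible.

    Returns a list of (name, passed) pairs for the steps that were run.
    """
    results = []
    for name, command in steps:
        completed = subprocess.run(command, capture_output=True, text=True)
        passed = completed.returncode == 0
        results.append((name, passed))
        if not passed:
            break  # later steps are skipped once the build is broken
    return results
```

For a Maven project the steps might be `[("build", ["mvn", "compile"]), ("unit tests", ["mvn", "test"])]`. A hosted CI service runs the equivalent on every push and publishes the result, which is how users can see whether a project builds.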
Introducing Travis-CI
• A Hosted Continuous Integration Service
– Free for the Open Source Community
– Integrated with GitHub
– Support for C, C++, Java, JavaScript (node.js), Perl,
PHP, Python, Ruby, and others….
– Support for test environment services
MySQL, PostgreSQL, MongoDB, ….
– Example C3PO
https://github.com/openplanets/c3po/blob/master/.travis.yml
Builds the Maven project, against 3 JDKs with a MongoDB instance
OPF Jenkins
• For When Travis is not Enough
– http://jenkins.opf-labs.org
• Scheduled Builds, e.g. Maven Site Builds
• Replaces Bamboo http://bamboo.opf-labs.org
• Work In Progress
– Adding plugins
– Adding projects
– Automated code quality metrics
RELEASING SOFTWARE
• Release Early
• Release Often
• Packaging
• Bintray
When to Release?
• Release Early
As soon as usable functionality available
As early as feasible
Testing starts early
• Release Often
As often as feasible
Incremental feature releases
Testing starts early
How to Package
• At First Convenience is King
Developer build instructions initially OK
Any form of package is better than none
• Get Something Downloadable / Executable
And get in touch: [email protected]
Where to Host Packages?
• GitHub Withdrew Download Support in December 2012
• Enter Bintray
Social Service for software packages (GitHub for Binaries)
Support for APT, RPM, & Maven repos
Vanilla repos for all other platforms
https://bintray.com/user/organization/profile/openplanets
• Sign into Bintray via GitHub OAuth
Request openplanets organisation membership
MANAGING DEVELOPMENT WITH GITHUB
• GitHub Issues & Milestones
• A GitHub Workflow
• Practitioner Involvement
GitHub Issues
• The Key to Improving Software
Plato Project : https://github.com/openplanets/plato/issues
• Issues are labelled:
– Defect (bug)
– Enhancement
– Feature
• Labelling is User Defined
Standard labels / colours do help though
GitHub Milestones
• Group Issues, Synch with Branches & Releases
Plato Project : https://github.com/openplanets/plato/issues/milestones
• Documentation
https://github.com/blog/831-issues-2-0-the-next-generation
• One Suggested Workflow
https://gist.github.com/olegp/1548203
GitHub Workflow
• This documents the GitHub workflow
http://scottchacon.com/2011/08/31/github-flow.html
Get Involved
• Anyone can raise an issue to:
– Report a bug
– Request new functionality
– Request enhancements
• Milestones and Issues give a Project Roadmap
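As a rough illustration of how milestones and issues add up to a roadmap, the sketch below groups open issues by milestone. It assumes the input is the parsed JSON list returned by the GitHub REST API issues endpoint for a repository, and relies only on the `title`, `state`, `milestone`, and `pull_request` fields; the function name `roadmap` and the "Unscheduled" bucket are hypothetical.

```python
from collections import defaultdict


def roadmap(issues):
    """Group open issue titles by milestone title.

    Issues without a milestone are collected under 'Unscheduled'.
    """
    grouped = defaultdict(list)
    for issue in issues:
        if "pull_request" in issue:
            continue  # the issues endpoint also returns pull requests
        if issue.get("state") != "open":
            continue
        milestone = issue.get("milestone") or {}
        grouped[milestone.get("title", "Unscheduled")].append(issue["title"])
    return dict(grouped)
```

The result maps each milestone to the work still outstanding for it, which is the roadmap a practitioner sees when browsing a project's issue tracker.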
CONTINUOUS IMPROVEMENT
• GitHub for Test Data
• Documenting a Project
• What Makes a Healthy Project?
• Project Health Check
Making Test Results Available
• Publish Test Data on GitHub
Initial work part of the SCAPE project
https://github.com/openplanets/cc-benchmark-tests
http://data.openplanetsfoundation.org/cc-benchmark-tests/Visualisations/d3-parsets/
• Uses the README to provide context
Based upon the Open Data Institute’s Open Data Certificate
http://www.theodi.org/
http://theodi.github.io/open-data-certificate/
http://theodi.github.io/open-data-certificate/certificate.html
Improve Documentation
• Developers are NOT always good writers
• Early users are well placed to write
documentation, and often do it better
• OPF can help co-ordinate documentation
efforts
What Makes a Healthy Project?
• A Good README, LICENSE, etc.
• Signs of Recent Commit Activity
• An Active Issue Tracker
Open issues are usually a good sign
• Continuous Integration in Place and Passing
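Some of these signs can be checked from a local checkout alone. The sketch below is a minimal example under stated assumptions: it looks only for a README, a LICENSE (or LICENCE) file, and a `.travis.yml` in the repository root, and says nothing about commit activity, issues, or whether the build passes. The function name `local_health_check` is hypothetical.

```python
import os


def local_health_check(repo_dir):
    """Report which basic project files are present in the repository root."""
    names = {name.lower() for name in os.listdir(repo_dir)}
    # Accept README, README.md, README.txt and so on.
    has_readme = any(n == "readme" or n.startswith("readme.") for n in names)
    # Accept either spelling, with or without an extension.
    has_license = any(n.split(".")[0] in ("license", "licence") for n in names)
    return {
        "readme": has_readme,
        "license": has_license,
        "travis_config": ".travis.yml" in names,
    }
```

A project that returns `False` for any of the three keys has an obvious first improvement to make.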
OPF Project Health Check
• Initial release uses GitHub, and Travis REST
APIs to collect and publish health metrics
• http://projects.opf-labs.org/
Currently updated once a day
• Still to incorporate Jenkins and Bintray API
June 2013
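The GitHub half of such a health check could look roughly like the sketch below. It is not the OPF implementation: the function names and the 90-day "recently active" window are assumptions, and the Travis, Jenkins, and Bintray parts are left out. It relies on the `full_name`, `pushed_at`, `open_issues_count`, and `license` fields of the repository JSON returned by the public GitHub REST API.

```python
import json
import urllib.request
from datetime import datetime, timezone

GITHUB_API = "https://api.github.com"


def fetch_repo(owner, name):
    """Fetch repository metadata (unauthenticated requests are rate limited)."""
    url = f"{GITHUB_API}/repos/{owner}/{name}"
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def summarise_repo(repo, now, recent_days=90):
    """Derive simple health metrics from a repository JSON document.

    recent_days is a hypothetical threshold for 'recent commit activity'.
    """
    pushed = datetime.strptime(repo["pushed_at"], "%Y-%m-%dT%H:%M:%SZ")
    pushed = pushed.replace(tzinfo=timezone.utc)
    days_since_push = (now - pushed).days
    return {
        "name": repo["full_name"],
        "days_since_push": days_since_push,
        "recently_active": days_since_push <= recent_days,
        # Note: GitHub counts open pull requests in this figure as well.
        "open_issues": repo["open_issues_count"],
        "has_license": repo.get("license") is not None,
    }
```

A daily job could call `summarise_repo(fetch_repo("openplanets", "scout"), datetime.now(timezone.utc))` for each project and publish the results.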
THE WAY FORWARD
• Improving Presenters & Processes
• Automating Everything
• Getting Involved
• External Contributions
Improving the SCM (Me)
• Open to Suggestion
Get in touch and help me to help you
• From Me to You
I hereby promise to Tweet & Blog a little
Members are free to suggest topics for blog posts
Further webinars, again requests considered
Improving the Process
• We Aim to Please
Process should be simple, painless, and ideally fun
If it’s broken we’ll try to fix it
• Facilitation NOT Micro Management
You’re free to use tools and methods of your choice
If you think that it’s better it’s worth sharing
Better documentation helps everyone
Automate if Possible
• REST APIs for Reporting and Raising Awareness
• Automated Regression Testing on Open Datasets
• Automated Publication of Results
• The Aim?
An automated sausage factory for software
development process and testing
OPF’s What You Make It
• The OPF CAN Co-ordinate
Website : http://openplanetsfoundation.org
Wiki : http://wiki.opf-labs.org
GitHub : https://github.com/openplanets
• The OPF CAN Help
Cross organisational boundaries
Practical remote one on one help for members
• The OPF cannot Do It All For You
Ask not just “what the OPF can do for you”, ask also
“what can I do for the OPF”
External Contributions
• The Appropriate use of a Fork
• External Contributors :
– Fork the project
– Work on a feature branch, preferably small
– Submit a pull request to the original
– Project leader reviews the code
– Rinse and repeat until trust is established
Upcoming….
• OPF Hackathon 15th – 17th May
A Practical Approach to Disk Images & Digital Forensics
Copenhagen University Library – The Black Diamond
• OPF Hackathon 3rd – 5th June
Tackling Real-World Collection Challenges with Digital Forensic
Tools and Methods
University of North Carolina
• SPRUCE Mashup 2nd – 4th July
SPRUCE Mashup London 2
London, Hubworking Liverpool Street
Credit Where It’s Due
• David Tarrant
OPF / University Of Southampton
• Andy Jackson
The British Library
• Johan van der Knijff
The National Library of the Netherlands
• Paul Wheatley
SPRUCE Projects
• Peter May
The British Library / SCAPE
References
• Book on community management (free PDF available)
http://www.artofcommunityonline.org/
• A nice article on the importance of READMEs
http://tom.preston-werner.com/2010/08/23/readme-driven-development.html
• Travis CI site and documentation
https://travis-ci.org/
http://about.travis-ci.org/docs
• Bintray site and documentation
https://bintray.com/
https://bintray.com/docs/help/bintrayuserguide.html
Licensing
This work by Open Planets Foundation is licensed under
a Creative Commons Attribution 3.0 Unported License.