Transcript Document
Community Development with the Open Planets Foundation
Helping developers and practitioners to collaborate to improve digital preservation software.

Your Presenter
Session Topics
My Role
INTRODUCTION AND AGENDA
17/07/2015 2

A Little About Me
Carl Wilson
Software Configuration Manager
Open Planets Foundation
Email : [email protected]
Skype : carl.f.wilson
GitHub : carlwilson
Twitter : carlwilsoneu
Google+ : [email protected]

Manifesto
• State the case for community development
• A little about the OPF philosophy
• Best practice for starting projects
• Continuous integration with Travis-CI
• Packaging & releasing software
• Managing development with GitHub
• Continuous Improvement
• Next Steps

My Role
• Software Configuration Manager
• NOT developing software
• NOT guiding project direction
• Improving development practices
• Improving software quality
• Practical assistance

Benefits of Community
More Developers, Users, & Testers
One Community
Sharing
WHY ONE COMMUNITY?

A Small Community
• Digital Preservation still a fairly niche concern
• Research, heritage, and public institutions all facing financial challenges
• Two heads are better than one
• A problem shared IS a problem halved
• We can all benefit from the sum of our best efforts and expertise

More………
• Developers
  Contribute fixes, features & help shape development
• Users
  Use the software & help shape development
• Everyone
  Tests software, reports bugs, and requests features

One Community Because….
• Consistent Methods
  Conventions make life easier
• One Hub for Software
  Easier to find tools to fit your needs
• A Small Community
  Not so large that fragmentation is beneficial or organisation problematic

What We Can’t (Always) Share…..
• Large Software Systems
  Member organisations vary in size and needs
  Internal conventions and workflows differ
• Data / Content
  Scale Issues – moving data takes a long time
  Security Issues – data in transit can be “stolen”
  Copyright Issues – a legal and reputational minefield

Solutions for Sharing Software
• Share Small Focussed Tools & Components
  Reusable components that make up large systems
  Web services, particularly REST APIs
  Tools that perform a single task well, e.g. jpylyzer
• Favour Cross Platform Development
  Web Apps, Java, Python, Ruby, PHP, etc.
  Cross compilation – not for the faint hearted
• Favour Linux
  Free Operating Systems == virtualization goodness

Solutions for Sharing Data
• Use Open Corpora
  GovDocs, Atlas of Digital Damages, Public Domain/CC
• Create and Share Small Focussed Corpora
  OPF can help members create and host open test corpora illustrating real world risks and problems
  https://github.com/openplanets/format-corpus
  Web harvest and adapt CC licensed content??

Simplicity Itself
The OPF Software Lifecycle
Community Involvement & Member Support
Leveraging Public Infrastructure
OPF PHILOSOPHY

Keeping it Simple
• OPF Software Philosophy
  Borrows from Unix philosophy
  – Small is beautiful.
  – Make each program do one thing well.
  – Build a prototype as soon as possible.
  – Choose portability over efficiency.
• Ideally projects should be as small as practical

Small and Regular
• OPF Contribution Philosophy
  – Heroic contributions gratefully received, not required
  – Small but regular contributions key to sustainability
  – Regular, not necessarily frequent
  – Leverage the power of habit

OPF Software Curation Process
http://wiki.opf-labs.org/display/PT/The+Software+Curation+Process

We’re All Individuals….
• Different Strokes for Different Folk
  People should use the tools they’re comfortable with
  People should develop for the platform they use
• OPF won’t Mandate Platforms, Languages, Tools, or Methods

BUT
• We’re NOT experts in everything
  OPF will specialise in supporting platforms and tools that all members are free to use
• Open Platforms are Favoured
  For all sorts of reasons, see the next slide……

Why Open?
• All members can access open platforms
• Open platforms reach a wider community
• Using Linux simplifies licensing
  In turn simplifying virtualisation, for:
  – Continuous Integration
  – Demonstrator Machines
  – Training Machines
  – Deployment on Scalable Infrastructure
• It’s in our name…….

Always Supporting
• We won’t withhold support for Windows or Mac development
• We will go the extra mile(s) to help produce apt and rpm packages, or maven artefacts

Use External Infrastructure
• OPF can’t afford to host everything
• Publicly hosted services mean development in the open
• All members of the digital preservation community can access public services

OPF Weapons of Choice
• GitHub
  For revision/source control, issue tracking
• Travis-CI
  For free, multi language, GitHub integrated CI
• Jenkins
  For when Travis is not enough
• Bintray
  For hosting binary packages

Four Services, One Login……
GitHub IDs for All
Sign up for GitHub, and join the OPF organisation
http://wiki.opf-labs.org/display/PT/GitHub+Guide
• [email protected] OPF GitHub organisation admin
• GitHub ID, and OPF Teams used for :
  Travis-CI, Jenkins, & Bintray via GitHub OAuth

README & LICENSE files
OPF Metadata
GitHub Pages
STARTING A PROJECT

Source Code Availability
• Support for Public Source Projects Only
  If it doesn’t happen in the open, it doesn’t happen
• GitHub is the OPF Tool of Choice
  Ideally on the OPF organisation page: https://github.com/openplanets
• If it’s Publicly Available &
Open… We’ll do all we can to support it

What Belongs in a Source Repo
• GUIDELINE, if it’s editable it belongs in the repo:
  – code, e.g. java, c, shell scripts, etc.
  – config files, e.g. xml, properties files, etc.
  – good documentation, e.g. READMEs, install instructions, etc.
  – build files, e.g. poms, ant build files, make
  – working unit tests (preferably with good coverage)
  – small test data sets for unit tests
  – good checkin comments

Try To Avoid Adding
• build artefacts (exes, wars, jars, .class files, Maven target dirs)
• containers (zips, etc.). Use the raw data and a build tool to create the container. Git handles compression so zips don't help.
• volatile data
• large test files, they make the source awkward to download
• third party artefacts (jars, apps, dlls); provide links and documents instead.

A Good READ(ME) Part 1
• All projects should have a README containing:
  – A project “mission statement”, i.e. what the software is intended to do, and an indication of target users. This is NOT a detailed feature list, but a succinct summary of the project’s aims.
  – Here’s an example from LibreOffice:
    “LibreOffice is the power-packed free, libre and open source personal productivity suite for Windows, Macintosh and GNU/Linux, that gives you six feature-rich applications for all your document production and data processing needs: Writer, Calc, Impress, Draw, Math and Base. Support and documentation is free from our large, dedicated community of users, contributors and developers. You, too, can get involved!”

A Good READ(ME) Part 2
• The README should also give developer level instructions for getting the software running.
  – This doesn’t mean that an installation package is required, just enough to get an experienced dev up and running
• It should also list pre-requisites required to run the software:
  – e.g. Python 2.7, Java 6, Operating System, etc.
• Finally it should give some idea of the development status of the project

State your License
• Each project should explicitly state its licence conditions.
• Open Licenses Listed & Identified @ The Open Source Initiative : http://opensource.org/licenses
• A LICENCE file in the root of the project is sufficient at first.
• Get in Touch for Advice (it’s a dry subject)

A Little YAML Goes a Long Way
• One More Piece of Glue
  A single file .opf.yml lives in the root directory
• Human/Machine Readable Project Metadata
  Name, “vendor”, contacts, links, & extensible
  http://wiki.opf-labs.org/display/PT/GitHub+Guide#GitHubGuide-InitialREADME.mdFile
• Where Hooks and Links Live
  Non-Travis CI, Code Quality Reports, web sites, etc.

GitHub Pages & Wikis
• GitHub Also Provide a Place for Web Pages
  Example: SCAPE SCOUT Software
  Project : https://github.com/openplanets/scout
  Branch : https://github.com/openplanets/scout/tree/gh-pages
  Pages Site : http://openplanets.github.com/scout/
  More Info : http://pages.github.com/
• GitHub also provide project Wikis
  SCOUT Wiki : https://github.com/openplanets/scout/wiki
  A little ad-hoc, we’d prefer use of http://wiki.opf-labs.org
  More Info : https://github.com/blog/774-git-powered-wikis-improved

Continuous Integration 101
Travis-CI
OPF Jenkins
CONTINUOUS INTEGRATION

CI for Everyone
• Continuous Integration
  – Software Engineering Practice
  – Developer Contributions to project merged ASAP
  – Project built whenever new code is added
  – Unit Tests run
  – Developers know when the code base is broken
  – Users can see if a project builds

Introducing Travis-CI
• A Hosted Continuous Integration Service
  – Free for the Open Source Community
  – Integrated with GitHub
  – Support for C, C++, Java, JavaScript (node.js), Perl, PHP, Python, Ruby, and others….
  – Support for test environment services MySQL, PostgreSQL, MongoDB, ….
  – Example C3PO
    https://github.com/openplanets/c3po/blob/master/.travis.yml
    Builds the Maven project, against 3 JDKs with a MongoDB instance

OPF Jenkins
• For When Travis is not Enough
  – http://jenkins.opf-labs.org
• Scheduled Builds, e.g. Maven Site Builds
• Replaces Bamboo http://bamboo.opf-labs.org
• Work In Progress
  – Adding plugins
  – Adding projects
  – Automated code quality metrics

Release Early Release Often
Packaging
Bintray
RELEASING SOFTWARE

When to Release?
• Release Early
  As soon as usable functionality is available
  As early as feasible
  Testing starts early
• Release Often
  As often as feasible
  Incremental feature releases
  Testing starts early

How to Package
• At First Convenience is King
  Developer build instructions initially OK
  Any form of package is better than none
• Get Something Downloadable / Executable
  And get in touch: [email protected]

Where to Host Packages?
• GitHub Withdrew Download Support in December 2012
• Enter Bintray
  Social Service for software packages (GitHub for Binaries)
  Support for APT, RPM, & Maven repos
  Vanilla repos for all other platforms
  https://bintray.com/user/organization/profile/openplanets
• Sign into Bintray via GitHub OAuth
  Request openplanets organisation membership

GitHub Issues & Milestones
A GitHub Workflow
Practitioner Involvement
MANAGING DEVELOPMENT WITH GITHUB

GitHub Issues
• The Key to Improving Software
  Plato Project : https://github.com/openplanets/plato/issues
• Issues are labelled:
  – Defect (bug)
  – Enhancement
  – Feature
• Labelling is User Defined
  Standard labels / colours do help though

GitHub Milestones
• Group Issues, Synch with Branches & Releases
  Plato Project : https://github.com/openplanets/plato/issues/milestones
• Documentation
  https://github.com/blog/831-issues-2-0-the-next-generation
• One Suggested Workflow
  https://gist.github.com/olegp/1548203

GitHub Workflow
• This documents the
GitHub workflow http://scottchacon.com/2011/08/31/github-flow.html

Get Involved
• Anyone can raise an issue to:
  – Report a bug
  – Request new functionality
  – Request enhancements
• Milestones and Issues give a Project Roadmap

GitHub for Test Data
Documenting a Project
What Makes a Healthy Project?
Project Health Check
CONTINUOUS IMPROVEMENT

Making Test Results Available
• Publish Test Data on GitHub
  Initial work part of the SCAPE project
  https://github.com/openplanets/cc-benchmark-tests
  http://data.openplanetsfoundation.org/cc-benchmark-tests/Visualisations/d3-parsets/
• Uses the README to provide context
  Based upon the Open Data Institute’s Open Data Certificate
  http://www.theodi.org/
  http://theodi.github.io/open-data-certificate/
  http://theodi.github.io/open-data-certificate/certificate.html

Improve Documentation
• Developers are NOT always good writers
• Early users are well placed to write documentation, and often do it better
• OPF can help co-ordinate documentation efforts

What Makes a Healthy Project?
• A Good README, LICENSE, etc.
• Signs of Recent Commit Activity
• An Active Issue Tracker
  Open issues are usually a good sign
• Continuous Integration in Place and Passing

OPF Project Health Check
• Initial release uses GitHub and Travis REST APIs to collect and publish health metrics
• http://projects.opf-labs.org/
  Currently updated once a day
• Still to incorporate Jenkins and Bintray API
  June 2013

Improving Presenters & Processes
Automating Everything
Getting Involved
External Contributions
THE WAY FORWARD

Improving the SCM (Me)
• Open to Suggestion
  Get in touch and help me to help you
• From Me to You
  I hereby promise to Tweet & Blog a little
  Members are free to suggest topics for blog posts
  Further webinars, again requests considered

Improving the Process
• We Aim to Please
  Process should be simple, painless, and ideally fun
  If it’s broken we’ll try to fix it
• Facilitation NOT Micro Management
  You’re free to use tools and methods of your choice
  If you think that it’s better it’s worth sharing
  Better documentation helps everyone

Automate if Possible
• REST APIs for Reporting and Raising Awareness
• Automated Regression Testing on Open Datasets
• Automated Publication of Results
• The Aim?
An automated sausage factory for software development process and testing

OPF’s What You Make It
• The OPF CAN Co-ordinate
  Website : http://openplanetsfoundation.org
  Wiki : http://wiki.opf-labs.org
  GitHub : https://github.com/openplanets
• The OPF CAN Help
  Cross organisational boundaries
  Practical remote one on one help for members
• The OPF cannot Do It All For You
  Ask not just “what the OPF can do for you”, ask also “what can I do for the OPF”

External Contributions
• The Appropriate use of a Fork
• External Contributors :
  – Fork the project
  – Work on a feature branch, preferably small
  – Submit a pull request to the original
  – Project leader reviews the code
  – Rinse and repeat until trust is established

Upcoming….
• OPF Hackathon 15th – 17th May
  A Practical Approach to Disk Images & Digital Forensics
  Copenhagen University Library – The Black Diamond
• OPF Hackathon 3rd – 5th June
  Tackling Real-World Collection Challenges with Digital Forensic Tools and Methods
  University of North Carolina
• SPRUCE Mashup 2nd – 4th July
  SPRUCE Mashup London 2
  London, Hubworking Liverpool Street

Credit Where It’s Due
• David Tarrant
  OPF / University Of Southampton
• Andy Jackson
  The British Library
• Johan van der Knijff
  The National Library of the Netherlands
• Paul Wheatley
  SPRUCE Projects
• Peter May
  The British Library / SCAPE

References
• Book on community management (free PDF available)
  http://www.artofcommunityonline.org/
• A nice article on the importance of READMEs
  http://tom.preston-werner.com/2010/08/23/readme-driven-development.html
• Travis CI site and documentation
  https://travis-ci.org/
  http://about.travis-ci.org/docs
• Bintray site and documentation
  https://bintray.com/
  https://bintray.com/docs/help/bintrayuserguide.html

Licensing
This work by Open Planets Foundation is licensed under a Creative Commons Attribution 3.0 Unported License.
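
Appendix: a health check sketch
The "What Makes a Healthy Project?" checklist from the Continuous Improvement section can be expressed as a handful of yes/no checks. The Python below is a minimal sketch only, not the code behind the OPF Project Health Check. The ProjectSnapshot record, its field names, the two function names, and the 90-day window for "recent" commits are all assumptions made for illustration. A real collector would fill the record from the GitHub and Travis REST APIs, which this sketch leaves out.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class ProjectSnapshot:
    """Hypothetical record of facts gathered about one repository."""
    name: str
    has_readme: bool
    has_license: bool
    last_commit: date
    open_issues: int
    ci_passing: bool


def health_report(project, today, recent_days=90):
    """Return one True/False entry per item on the checklist.

    recent_days is an assumed threshold; the slides do not define "recent".
    """
    return {
        "readme": project.has_readme,
        "license": project.has_license,
        "recent_commits": today - project.last_commit <= timedelta(days=recent_days),
        # The slides treat open issues as a sign of an active tracker.
        "active_issue_tracker": project.open_issues > 0,
        "ci_passing": project.ci_passing,
    }


def health_score(project, today, recent_days=90):
    """Fraction of checklist items that pass, between 0.0 and 1.0."""
    checks = health_report(project, today, recent_days)
    return sum(checks.values()) / len(checks)


if __name__ == "__main__":
    # Made-up example values, not data from any real OPF project.
    example = ProjectSnapshot(
        name="example-tool",
        has_readme=True,
        has_license=True,
        last_commit=date(2013, 5, 1),
        open_issues=3,
        ci_passing=False,
    )
    print(health_report(example, today=date(2013, 6, 1)))
    print(health_score(example, today=date(2013, 6, 1)))
```

Keeping the checks as plain booleans means a published report can show which item failed rather than only a number, which fits the aim of raising awareness rather than ranking projects.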