Docker Pipeline for Reproducible Research - Paul Amazona - FOSSASIA 2018

Published on: Saturday, 24 March 2018

Speaker: Paul Amazona, Core Lead DataKind Singapore
Info: https://2018.fossasia.org/event/speakers.html#paul-amazona3269

DataKind Singapore has been using Docker to reproduce the environments used during DataDive events. A DataDive is a weekend-long event where non-profit organizations work alongside data science volunteers, developers and designers to analyze data and gain insight into their programs, the communities they serve and more.

Apart from versioning the files/scripts that we used for analysis, we also version the environments where we ran such scripts. In this session, I'll be sharing the Docker Continuous Integration (CI) pipeline/conventions we use in DataKind Singapore to promote reproducibility.     

If you have time, I encourage you to watch the talk I gave last year:    
https://engineers.sg/video/datalearn-docker-for-reproducible-research-datakind-sg--1468     
It focused on the basics of Docker and how we were planning to use it in an upcoming DataKind event.

In this FOSSASIA workshop, I'll focus on our Continuous Integration (CI) setup: how to set up the Dockerfile and related environment files on GitHub, and how to use Quay.io for continuous integration.

 
If you want to follow along during the workshop,
I recommend creating accounts in advance on the following platforms:
https://github.com 
https://quay.io 
https://labs.play-with-docker.com/            

For the hands-on, I'll go through the following:
1. Setting up a Dockerfile on GitHub (for a sample Jupyter notebook)
2. Setting up Quay.io for triggered Docker builds
3. Consuming Docker images from Quay.io
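
As a rough illustration of steps 1 and 3, here is a minimal Python sketch. It is not taken from the workshop material. The base image (python:3.6-slim), the requirements.txt file name, and the example-org/example-notebook repository are all hypothetical placeholders, and the sketch assumes the requirements file itself lists Jupyter. Step 2 (the build trigger) is configured on the Quay.io side and is not modeled here.

```python
import shlex


def render_dockerfile(base_image, requirements_file="requirements.txt", port=8888):
    """Return Dockerfile text for a notebook image with pinned dependencies.

    Assumption: the requirements file pins every package, including Jupyter,
    so that rebuilding the image later yields the same environment.
    """
    lines = [
        f"FROM {base_image}",
        "WORKDIR /work",
        f"COPY {requirements_file} /tmp/requirements.txt",
        "RUN pip install --no-cache-dir -r /tmp/requirements.txt",
        f"EXPOSE {port}",
        # Start the notebook server, listening on all interfaces in the container.
        f'CMD ["jupyter", "notebook", "--ip=0.0.0.0", "--port={port}", '
        '"--no-browser", "--allow-root"]',
    ]
    return "\n".join(lines) + "\n"


def image_reference(namespace, repository, tag, registry="quay.io"):
    """Build a full image reference such as quay.io/<namespace>/<repository>:<tag>."""
    return f"{registry}/{namespace}/{repository}:{tag}"


def pull_and_run_commands(image, port=8888):
    """Return the shell commands that fetch and start a published image."""
    return [
        shlex.join(["docker", "pull", image]),
        shlex.join(["docker", "run", "--rm", "-p", f"{port}:{port}", image]),
    ]


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    print(render_dockerfile("python:3.6-slim"))
    image = image_reference("example-org", "example-notebook", "v1")
    for command in pull_and_run_commands(image):
        print(command)
```

Pulling a specific tag instead of a moving one such as latest is what ties an analysis to one exact environment, which is the point of versioning environments alongside scripts.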

     
Workshop Material:
https://datakind-sg.github.io/chapter-one/docker-pipeline-for-reproducible-research.html

Who is DataKind?   

DataKind is a non-profit organization which seeks to harness the power of data science in the service of humanity. Founded in New York in 2011, DataKind has since started chapters in the UK, Singapore, Bangalore, the San Francisco Bay Area, Washington DC and Dublin. Our DataKind Singapore chapter was founded in August 2014, and our goal is to connect data science volunteers with non-profit organizations to help them analyze their data for good.

Track: Cloud, Container, DevOps
Room: Training room 4-1
Date: Saturday, 24th March, 2018

Event Page: http://2018.fossasia.org
Follow FOSSASIA on Twitter: https://twitter.com/fossasia/
Like FOSSASIA on Facebook: https://www.facebook.com/fossasia/

Produced by Engineers.SG
