Coding, Data Management and Shiny Applications Using RStudio for Ecologists and Evolutionary Biologists

Mon, 2017/05/15 - Fri, 2017/05/19
Loch Lomond, Scotland

The course will introduce programming logic using R syntax. Participants will solve problems involving heterogeneous biological datasets and the combined use of different statistical packages, which demonstrates the advantages of learning programming skills. RMarkdown will be used to illustrate the advantages of literate programming and the possibilities for sharing and archiving code. Next, participants will learn how to design relational databases (RDB), which can be used to manage and analyse large biological datasets. They will learn the basics of the SQL language and how to use it from R with the package {RMySQL}. Finally, they will use the Shiny tool (RStudio) to build interactive applications that analyse and display data in response to user inputs. Throughout the course we will emphasise best practices for data, code and analyses that foster reproducibility and transparency in science and the long-term availability of scientific data. By the end of the course, participants are expected to be able to develop small, tailored applications and to read and analyse datasets using a variety of statistical tools.

This course is being delivered by Aline Quadros, an ecologist and researcher at the Leibniz Centre for Tropical Marine Research in Germany.

This course will run from 15th to 19th May 2017 at SCENE field station, Loch Lomond, Scotland.

Intended Audience:
Researchers and postgraduate students working with evolutionary biology and ecological data who want more autonomy and flexibility in their quantitative analyses and need to access and analyse large datasets with R.

Day 1
Module 1: Programming Logic
R syntax (Variable types – operators – conditionals – loops – writing functions)
Programming and commenting code with RMarkdown

Day 2
Module 2: Data structures
R syntax (arrays, lists, data frames, matrices)
Data wrangling with {dplyr} and {tidyr}; the {ff} package and data tables for large datasets (e.g. transcriptomics; whole-genome data)
Best practices of data acquisition, organization and storage

Day 3
Module 3: Relational databases
Introduction to the SQL language and MySQL (an open-source relational database management system)
Accessing and analysing large datasets using the package {RMySQL}
As an example, we will combine DNA sequence datasets with IUCN Red List data to illustrate the use of RDB with biological datasets.

Day 4
Module 4: Introduction to Shiny (RStudio)
Shiny – Server and user interface commands
As an example, we will use Shiny to develop a small application where users can select different species and genes, then run and visualise phylogenetic trees with {ape} running in the background.

Day 5
Module 5: Wrapping-up
Development and presentation of individual projects combining data wrangling skills and user inputs using Shiny (RStudio)

Teaching Format:
The course will be highly practical, with a series of hands-on, step-by-step, problem-solving exercises that combine the different tools to solve ecological and evolutionary biology problems. Participants are invited to think of a problem that requires programming skills to solve, and can bring their own data for a case study. At the end of each day, participants will have time to work on their own projects and apply the skills learned that day.

We offer two packages:
COURSE ONLY – Includes lunch and refreshments.
ALL INCLUSIVE – Includes breakfast, lunch, dinner, refreshments, minibus to and from the meeting point, and accommodation. Accommodation is in multiple-occupancy (max 3 people), single-sex en-suite rooms. Arrival is on Sunday 14th May and departure on Friday 19th May (PM).
