Description
R is an open-source software environment for statistical computing that is rapidly becoming the tool of choice for data analysis in the life sciences and elsewhere. It is developed by a large international community of scientists and programmers and is at the forefront of new developments in statistical computing. Additionally, R is the foundation of Bioconductor, a similar open-source project focussed on the development of bioinformatics analysis tools. Bioconductor rose to prominence when it became the standard environment for the analysis of microarray gene expression data, and it has maintained and extended this position with the advent of new technologies and the integration of different types of 'omics data. As such, understanding the basic functionality of R is of benefit to undergraduates, graduates and researchers across diverse fields.
This short course provides a gentle introduction to the R software and programming environment. It should take you approximately 6 to 8 hours in total to work through the material. There are five sections: Introduction and basics, Variables and data types, Inbuilt functions, Data frames, and Plotting. Materials are taught through PDF documents and videos, with quizzes and assignments provided to test your knowledge. Upon completion of the course you will understand how to manipulate data within R, perform basic data analysis procedures and create plots. This course provides a foundation for more advanced topics and techniques.
Frequently Asked Questions
1) Who is this resource for?
This site is for anyone wishing to learn the basics of R and computer programming in general, although the tasks focus on examples from the biosciences.
2) What prior knowledge do I need?
No prior knowledge of computer programming is required but some familiarity with basic mathematical and statistical concepts is assumed. These include: mean, median, variance, exponents, logarithms, summations.
3) How do I use the course?
The sections of the course should be tackled sequentially. Each section has an accompanying PDF document to work through. There are also videos to guide you through some of the materials. The quiz or assignment at the end of each section will test what you have learned. At the end of the course is a short exercise in which you will manipulate a data set that comes with the R installation. This exercise is designed to be more involved than the quizzes and will require you to use the skills developed through the different parts of the course.
Because this is a free course and open to anyone outside UCL, your assignment will not be graded. However, there is an answer sheet that should be available to download so you can compare your answers. We do not currently offer a certificate for this course.
Prerequisites
Some familiarity with basic mathematical and statistical concepts such as mean, median, variance, exponents, logarithms, summations.
Max Reuter, Reader in Evolutionary Genetics (m.reuter@ucl.ac.uk)
Max joined UCL as a postdoc in 2004, then held a NERC research fellowship and was appointed Lecturer in 2009 and Reader in 2010. His research group investigates the evolutionary genetics of plastic genotypes using experimental approaches in fruit flies and yeast, bioinformatics, and modelling. Max teaches evolutionary genetics and statistics. He has played a leading role in making R an integral part of the biology curriculum at UCL.
Chris Barnes, Lecturer in Systems Biology (christopher.barnes@ucl.ac.uk)
Chris joined UCL in 2012. He is a Lecturer in the Department of Cell and Developmental Biology and initially developed the SysMIC e-learning resource for interdisciplinary training for bioscience researchers. A physicist by training, Chris was awarded a PhD in particle physics in 2005. Since then he has worked in statistical genetics and genomics at the Wellcome Trust Sanger Institute, and systems and synthetic biology at Imperial College London. He runs the Computational Systems and Synthetic Biology group at UCL.