Scientific program

Prerequisites and requirements

The audience is expected to have a good working knowledge of R (e.g. handling data frames and performing simple calculations). Attendees are requested to bring their own laptops with RStudio (http://www.rstudio.com/) and the R package mixOmics already installed.

More details on the covered topics

1. Key methodologies in mixOmics and their variants

          A. Exploration of one data set and how to estimate missing values

          B. Identification of biomarkers to discriminate different treatment groups

          C. Integration of two data sets and identification of biomarkers

          D. Repeated measurements design

          E. Introduction to the integration of more than two data sets

2. Review of the graphical outputs implemented in mixOmics

          A. Sample plot representation

          B. Variable plot representation for data integration

          C. Other useful graphical outputs

3. Case studies and applications

The following statistical concepts will be introduced: covariance and correlation, multiple linear regression, classification and prediction, cross-validation, selection of diagnostic or prognostic markers, and l1 and l2 penalties in a regression framework. Each methodology will be illustrated with a case study (we will alternate between theory and application).

Note that mixOmics is not limited to biological data; it can be applied to other types of data where integration is required.

Target group

The course is intended for data analysts in the fields of bioinformatics, computational biology and applied statistics who have good statistical knowledge and a good working knowledge of R. It will be particularly useful to those interested in:

1. Exploring large data sets.

2. Selecting features with methods implementing LASSO-based penalisations.

3. Using graphical techniques to better visualise data.

4. Understanding and/or applying multivariate projection methodologies to large data sets.

Results

After completing this workshop, participants will be able to:

1. Understand the fundamental principles of multivariate projection-based dimension reduction techniques.

2. Perform statistical integration and feature selection using recently developed multivariate methodologies.

3. Apply those methods to high-throughput biological studies, including their own studies.
