Introduction

Anthony Mathelier, PhD

Deputy Group Leader, Dr Wyeth W. Wasserman's lab
University of British Columbia, Center for Molecular Medicine and Therapeutics, Child and Family Research Institute
980 West 28th Ave, Room 3109
V5Z 4H4, Vancouver, BC, Canada

Google Scholar profile

anthony[DOT]mathelier[AT]gmail[DOT]com

Research Activities

I am currently working as a postdoctoral fellow in Dr. Wyeth W. Wasserman's laboratory. I am interested in finding signals of gene expression regulation through genomic sequence analysis. During my PhD, I looked for signals of gene regulation at the post-transcriptional level through miRNA regulation in eukaryotes and for genome-wide spatial organization in the prokaryote Escherichia coli. As a postdoctoral fellow, I am working on accurately identifying transcription factor binding sites and on assessing the impact of cis-regulatory variations on gene expression. Finally, I am involved in the Functional Annotation of the Mammalian Genome (FANTOM5) project, which aims to build a complete promoter map to uncover the transcriptional regulatory networks defining every human cell type.

Despite improvements in methods for identifying functional variations within protein-coding exons, the prediction of functional variations within cis-regulatory sequences, called cis-regulatory variations (CRVs), remains an unmet challenge. The aim of my project is to (i) improve models for the detection of transcription factor binding sites (TFBSs) using ChIP-Seq data and (ii) use the derived models to predict and prioritize CRVs.
Transcription factors are proteins implicated in transcriptional regulation by activating or repressing genes. Finding where these proteins bind to DNA is of key importance to decipher gene regulation at the transcriptional level. A greater understanding of the regulation of transcription promises to improve human genetic analysis by specifying critical gene components that are currently inaccessible to investigators. Classically, computational prediction of TFBSs is based on models that assign a weight to each nucleotide at each position of the site. Several TFBS types that are important for human health cannot be detected reliably with these older methods, because the sites have confounding properties such as dinucleotide composition and variable lengths (e.g. variable spacing between half-sites). We are developing a novel TFBS prediction system that uses statistical models to detect such TFBS configurations.
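To illustrate the classical approach mentioned above (not the new prediction system), here is a minimal sketch of a position weight matrix scan. The count matrix, pseudocount, uniform background, and all function names are made up for the example. Because each position is scored independently and the motif length is fixed, such a model cannot capture dinucleotide dependencies or variable spacing between half-sites.

```python
import math

# Hypothetical nucleotide counts for a 4-bp motif (one column per position).
# The numbers are invented for illustration; real matrices come from
# collections of experimentally supported binding sites.
COUNTS = {
    "A": [8, 0, 0, 1],
    "C": [0, 9, 0, 1],
    "G": [1, 0, 10, 0],
    "T": [1, 1, 0, 8],
}


def build_pwm(counts, pseudocount=1.0, background=0.25):
    """Turn counts into log2-odds weights, one dict per motif position.

    Assumes a uniform background and a fixed pseudocount (both are
    simplifying assumptions of this sketch).
    """
    length = len(counts["A"])
    pwm = []
    for i in range(length):
        total = sum(counts[n][i] for n in "ACGT") + 4 * pseudocount
        column = {}
        for n in "ACGT":
            freq = (counts[n][i] + pseudocount) / total
            column[n] = math.log2(freq / background)
        pwm.append(column)
    return pwm


def score_site(pwm, site):
    """Sum the per-position weights of a candidate site."""
    return sum(column[base] for column, base in zip(pwm, site))


def scan(pwm, sequence):
    """Score every window of motif length; return (start, score) pairs."""
    width = len(pwm)
    return [
        (start, score_site(pwm, sequence[start:start + width]))
        for start in range(len(sequence) - width + 1)
    ]


def best_hit(pwm, sequence):
    """Return the (start, score) pair of the highest-scoring window."""
    return max(scan(pwm, sequence), key=lambda hit: hit[1])


if __name__ == "__main__":
    pwm = build_pwm(COUNTS)
    print(best_hit(pwm, "TTACGTAA"))
```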
The global relationship between TFBS recognition and nucleotide variations remains largely uncharacterized, both experimentally and in silico. Recent studies have identified causal CRVs responsible for striking phenotypes, as well as extensive genetic variation within human TFBSs that correlates with differences in gene expression. Combining models and procedures for predicting TFBSs at genome scale with data arising from full-genome re-sequencing will permit us to identify phenotype-conferring CRVs.
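One simple way to think about scoring a candidate CRV is to compare the best binding-site score around the variant for the reference and the alternative allele. The sketch below is an illustration under strong assumptions, not the procedure used in this project: the weights are invented, the model is a plain per-position weight matrix, and the names `variant_effect` and `rank_variants` are hypothetical.

```python
# Hypothetical log-odds weights for a 4-bp motif (one dict per position).
# The values are invented for illustration only.
WEIGHTS = [
    {"A": 1.5, "C": -2.0, "G": -1.0, "T": -1.0},
    {"A": -2.0, "C": 1.6, "G": -2.0, "T": -1.0},
    {"A": -2.0, "C": -2.0, "G": 1.8, "T": -2.0},
    {"A": -1.0, "C": -1.0, "G": -2.0, "T": 1.5},
]


def best_score(weights, sequence):
    """Highest motif score over all windows of the sequence.

    Assumes the sequence is at least as long as the motif.
    """
    width = len(weights)
    return max(
        sum(weights[j][sequence[i + j]] for j in range(width))
        for i in range(len(sequence) - width + 1)
    )


def variant_effect(weights, sequence, position, alt_base):
    """Score change (alternative minus reference) for a single-base variant.

    Only windows overlapping the variant position (0-based) are compared.
    """
    width = len(weights)
    start = max(0, position - width + 1)
    end = min(len(sequence), position + width)
    alt_sequence = sequence[:position] + alt_base + sequence[position + 1:]
    ref_best = best_score(weights, sequence[start:end])
    alt_best = best_score(weights, alt_sequence[start:end])
    return alt_best - ref_best


def rank_variants(weights, sequence, variants):
    """Sort (position, alt_base) pairs by decreasing absolute score change."""
    return sorted(
        variants,
        key=lambda v: abs(variant_effect(weights, sequence, v[0], v[1])),
        reverse=True,
    )


if __name__ == "__main__":
    reference = "TTACGTAA"
    candidates = [(4, "T"), (0, "A"), (7, "C")]
    for position, alt in rank_variants(WEIGHTS, reference, candidates):
        delta = variant_effect(WEIGHTS, reference, position, alt)
        print(position, alt, round(delta, 2))
```

A strongly negative change suggests that the variant disrupts a predicted site, and a strongly positive one suggests that it creates one. Either way the result is only a prediction to be prioritized for follow-up.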

During my PhD, I looked for different kinds of signals in eukaryotic and prokaryotic genomes.
MicroRNAs are a class of endogenous RNAs, 18-25 nt long, that are derived from a precursor and involved in post-transcriptional regulation. Studies have shown that the analysis of microRNA expression levels may improve the diagnosis of unknown cancers. We proposed two complementary in silico approaches for finding microRNAs in eukaryotic genomes. The first one predicts microRNAs from already known microRNAs or from deep sequencing data and uses secondary structure properties of precursors for computational validation. The second approach looks for new microRNAs organized in clusters, using a novel ab initio methodology that requires no a priori knowledge and can optionally take deep sequencing data into account.
The structure of dynamic folds in microbial genomes is largely unknown. We looked for periodicity in gene positions within genomes, using a parametrized spectral analysis approach based on single-genome analysis. We found, in Escherichia coli, a global genomic period suggesting an encoded three-dimensional chromosomal organization that highlights two independent positional networks of functional genes. The methodology developed in this analysis can be applied to any set of genes and can serve as a blueprint for large-scale bacterial and archaeal analyses.
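As a rough illustration of how periodicity in gene positions can be detected, the sketch below computes a basic periodogram over a set of positions and picks the strongest candidate period. It is a generic textbook technique, not the parametrized spectral analysis developed in this work. The positions, candidate periods, and function names are all hypothetical.

```python
import cmath
import math


def periodogram(positions, period):
    """Spectral power of a set of positions at a given period.

    Each position is mapped to a phase on the unit circle. If the
    positions are regularly spaced by `period`, the phases align and
    the power approaches the number of positions.
    """
    n = len(positions)
    total = sum(cmath.exp(2j * math.pi * x / period) for x in positions)
    return abs(total) ** 2 / n


def strongest_period(positions, candidate_periods):
    """Return the candidate period with the highest spectral power."""
    return max(candidate_periods, key=lambda p: periodogram(positions, p))


if __name__ == "__main__":
    # Synthetic positions spaced every 100 units (arbitrary units).
    positions = [100 * k + 7 for k in range(20)]
    candidates = range(60, 141, 10)
    print(strongest_period(positions, candidates))
```

Note that exact sub-multiples of the true period (here, 50 or 25) would align the phases just as well, so the candidate range has to be chosen with such harmonics in mind.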