Data Scientist

Date Posted: February 8th, 2018

Data Scientist sought by Episona, Inc. in Pasadena, CA to develop data models and databases to classify fertile and infertile patients using epigenetic markers extracted from large human genome methylation data; work with the methpipe manual to map reads to a reference genome and to compute methylation levels for the CpGiant sequencing data; analyze sequencing data to compute statistics such as the distribution of read counts, distribution of methylation levels, on- and off-target sites, distance to closest target region, read counts and methylation levels from different samples, etc.; utilize the UCSC liftOver program to navigate between reference genomes; utilize the minfi package in R to process the raw 450K array data; examine various normalization techniques available for the 450K array data such as SWAN and BMIQ; compare methylation levels from sequencing data with those from array data; analyze the methylation levels from CpGiant data at the specific CpG sites of interest in the 450K array data; compare methylation levels from CpGiant data with the normal methylation range from the control data; create genome browser tracks of the CpGiant sequencing data for visualization on the UCSC genome browser; research various machine learning algorithms to transform the 450K methylation level to the CpGiant methylation level.

Requires Ph.D. or foreign equivalent degree in Genetics, Biology, Biomedical Engineering or related field, and one (1) year of experience in the offered position or as a bioinformatics research associate, postdoctoral associate or in a related occupation. Experience must include applying statistical hypothesis testing, including error rates, chi-square tests and G-tests; using data visualization; utilizing Python and Java; utilizing algorithms such as Monte Carlo sampling, importance sampling and k-means clustering; and developing linear, normal and lognormal models.

