Postdoc Position in Machine learning tools to combine sequence and biologically heterogeneous data (Ref. 6/2017)

Publication date:
26 Jan 2017
Staff Type:
Research Staff
Research Type:
Postdoctoral researcher
Research Program:
Genomics

Job description

The Center for Research in Agricultural Genomics (CRAG) is looking for a Postdoc Researcher to join the Statistical and Population Genomics Group.

CRAG is an independent research institution engaged in leading-edge basic and applied plant and farm animal sciences. CRAG is established as a Consortium of the Spanish National Research Council (CSIC), Institute of Agrifood Research and Technology (IRTA), Autonomous University of Barcelona (UAB), and University of Barcelona (UB). The Center is located at the UAB campus, and currently hosts 200 members from across the world.

Research Programs at CRAG (from basic science to applied research using plant experimental model systems, crops and farm animals) make extensive use of large sets of genetic and genomic data, including high-throughput sequencing-based methods, high-throughput genotyping, complete genome sequences and genome re-sequencing.

Group description:

The numerical genomics group at the Centre for Research in Agricultural Genomics (CRAG) is primarily interested in the use of high-throughput sequencing technology (NGS) for population and statistical genomics. Topics of particular interest are the footprint of domestication and artificial selection, and the use of sequence data for genomic selection. CRAG was recently awarded a Severo Ochoa project for excellence centers in Spain. This project will be carried out in cooperation with international and national groups whose long experience in machine learning complements our expertise.


We are looking for a Postdoc Researcher with a computational profile and an interest in biological applications.

Big data are characterized not only by their size but also by their heterogeneity and noisiness. These features apply in particular to genomic data: with the advent of new sequencing technologies, both their size and their complexity have been increasing exponentially. We aim to combine several available sources of information: not only phenotypes and sequence data but also, e.g., annotation features or metabolic pathways. An important goal is to provide not only efficient predictors but also tools to interpret the results biologically. We will explore machine learning tools such as ensemble methods or deep learning to investigate two main problems: (i) genomic prediction, and (ii) inference of selective sweeps.

The tasks fall within the framework of the project AGL2016-78709-R.

Laboratory: Dr. Miguel Pérez Enciso


Educational requirements: PhD in Biology, Biochemistry or similar.

Qualifications and Experience: Programming ability and experience in Python. Quantitative and/or population background.

Additional Specifications: Experience in machine learning.

Required skills:

Strong skills in communication, organization, and interpersonal relations.

We offer:

Contract duration: 22 months (extendable up to three years if funding is available).

Hours/week: Full time.

Submission of applications:

Interested candidates, please submit:

  • Letter of motivation describing past and current projects (1.5 pages max.)
  • Full CV including a list of publications, experience and techniques.
  • Three references (including email address and phone number)

Deadline: Applications will be accepted and the position will remain open until it is filled.
