SNP Data Analysis

New sequencing and marker genotyping technologies promise to accelerate genetic diversity research and increase selection gains through molecular breeding. At the same time, the large volumes of data they produce can overwhelm researchers. Single nucleotide polymorphism (SNP) markers have recently become popular because of several key advantages: they are more abundant in the genome, and multiplexed SNP genotyping is more efficient and cost-effective than simple sequence repeat (SSR) genotyping. SNP chips enable rapid scans of the rice genome at different levels of resolution for applications such as diversity analysis, fingerprinting, QTL and association mapping, and marker-assisted selection. In addition, a wealth of genetic information can enable candidate gene discovery for important traits at fine-mapped gene and QTL regions. Extracting useful information from these large data sets for rice diversity and breeding applications requires a solid understanding of advanced bioinformatics and data analysis techniques.

The SNP Analysis and Candidate Gene Discovery Training Course will provide IRRI researchers and scholars with the essential knowledge required to navigate the new DNA sequence and SNP data sets. The goal is to introduce participants to the tools available for handling these types of data and to help them apply those tools in their own research projects. The course will include hands-on sessions so that participants can become familiar with the different software programs and online tools for SNP analysis and candidate gene discovery.


To develop rice researchers' skills in SNP analysis and in bioinformatics for candidate gene discovery.


  • To present an overview of methodologies for rice SNP analysis;
  • To train participants in the use of stand-alone and online software tools for SNP analysis and candidate gene discovery;
  • To demonstrate SNP analysis and gene discovery workflows for rice research.


The course includes hands-on computer training in which participants work through exercises individually, with assistance from resource persons, to raise their skill level in:

  • Searching and retrieving SNP data from SNP-Seek and other online databases;
  • Variant calling using the GATK and TASSEL-GBS pipelines;
  • SNP data manipulation using IRRI GSL Galaxy tools;
  • Navigating and handling large SNP datasets using GUI-based tools such as Flapjack;
  • Conducting genetic diversity and population structure analysis using MEGA and Admixture;
  • Performing genome-wide association analysis using TASSEL;
  • Using various bioinformatics tools for more in-depth SNP analysis and candidate gene discovery.


The course is offered to researchers and scholars who conduct SNP analysis with bioinformatics support in their projects. A maximum of 20 participants will be accepted. Each participant must bring a laptop for use in the training.

To apply for a slot in the course, download the application and screening form. Participants are selected according to the screening criteria.

Course Content

  • Introduction to SNP discovery
  • Navigating Online SNP databases
  • Introduction to allele-calling software for SNP genotyping
  • Introduction to variant calling pipelines from NGS data
  • Manipulation and visualization of large SNP datasets
  • Diversity and population structure analysis
  • Haplotype analysis
  • Association studies
  • Bioinformatics tools for candidate gene discovery from GWAS peaks and QTL regions