Schema for GWAS Catalog - NHGRI-EBI Catalog of Published Genome-Wide Association Studies
  Database: hg38    Primary Table: gwasCatalog    Row Count: 226,127   Data last updated: 2020-11-18
Format description: NHGRI's collection of Genome-Wide Association Studies SNPs
field      | example                        | SQL type             | info   | description
bin        | 591                            | smallint(5) unsigned | range  | Indexing field to speed chromosome range queries.
chrom      | chr1                           | varchar(255)         | values | Reference sequence chromosome or scaffold
chromStart | 832872                         | int(10) unsigned     | range  | Start position in chromosome
chromEnd   | 832873                         | int(10) unsigned     | range  | End position in chromosome
name       | rs2977608                      | varchar(255)         | values | ID of SNP associated with trait
pubMedID   | 31969693                       | int(10) unsigned     | range  | PubMed ID of publication of the study
author     | Coleman JRI                    | varchar(255)         | values | First author of publication
pubDate    | 2020-01-23                     | varchar(255)         | values | Date of publication
journal    | Mol Psychiatry                 | varchar(255)         | values | Journal of publication
title      | Genome-wide gene-environmen... | varchar(1024)        | values | Title of publication
trait      | Major depressive disorder      | varchar(255)         | values | Disease or trait assessed in study
initSample | 29,475 European ancestry ca... | longblob             |        | Initial sample size
replSample | NA                             | longblob             |        | Replication sample size
region     | 1p36.33                        | varchar(255)         | values | Chromosome band / region of SNP
genes      | NR                             | longblob             |        | Reported Gene(s)
riskAllele | rs2977608-A                    | longblob             |        | Strongest SNP-Risk Allele
riskAlFreq | 0.259029                       | varchar(255)         | values | Risk Allele Frequency
pValue     | 8E-6                           | varchar(255)         | values | p-Value
pValueDesc |                                | varchar(255)         | values | p-Value Description
orOrBeta   | 1.0729614                      | varchar(255)         | values | Odds ratio or beta
ci95       | [1.04-1.1]                     | varchar(255)         | values | 95% Confidence Interval
platform   | Affymetrix [7791636] (imputed) | varchar(255)         | values | Platform and [SNPs passing QC]
cnv        | N                              | enum('Y', 'N')       | values | Y if Copy Number Variant
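
The bin field can be recomputed from chromStart and chromEnd. The sketch below assumes the standard UCSC binning scheme (five levels, with the finest bins covering 128 kb); neither the constants nor the function name bin_from_range come from this page, so treat it as an illustration rather than the browser's own code.

```python
# Sketch of the standard UCSC binning scheme (an assumption; not stated on this page).
BIN_OFFSETS = [512 + 64 + 8 + 1, 64 + 8 + 1, 8 + 1, 1, 0]  # finest level first
BIN_FIRST_SHIFT = 17  # finest bins cover 2**17 bases (128 kb)
BIN_NEXT_SHIFT = 3    # each coarser level covers 8 times more


def bin_from_range(start: int, end: int) -> int:
    """Return the smallest bin that fully contains the 0-based, half-open range [start, end)."""
    start_bin = start >> BIN_FIRST_SHIFT
    end_bin = (end - 1) >> BIN_FIRST_SHIFT
    for offset in BIN_OFFSETS:
        if start_bin == end_bin:
            return offset + start_bin
        start_bin >>= BIN_NEXT_SHIFT
        end_bin >>= BIN_NEXT_SHIFT
    raise ValueError(f"range {start}-{end} does not fit the standard scheme (512 Mb limit)")


print(bin_from_range(832872, 832873))  # 591, the bin shown in the example column
```

A range query can then restrict itself to the few bins that overlap the region of interest instead of scanning every row on the chromosome.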

Connected Tables and Joining Fields
  hg38.snp144.name (via gwasCatalog.name)
  hg38.snp144CodingDbSnp.name (via gwasCatalog.name)
  hg38.snp144Common.name (via gwasCatalog.name)
  hg38.snp144Flagged.name (via gwasCatalog.name)
  hg38.snp144Mult.name (via gwasCatalog.name)
  hg38.snp144OrthoPt4Pa2Rm3.name (via gwasCatalog.name)
  hg38.snp144Seq.acc (via gwasCatalog.name)
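
Each of these joins matches gwasCatalog.name (the rsID) against the listed column; note that snp144Seq is joined on acc rather than name. As a rough illustration only, the sketch below builds two tiny in-memory SQLite tables and joins them on name. The snp144 table here is a cut-down stand-in with a single extra column, and its func value is a placeholder, not a real annotation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE gwasCatalog (name TEXT, trait TEXT, pValue TEXT);
    CREATE TABLE snp144 (name TEXT, func TEXT);  -- hypothetical cut-down stand-in
    INSERT INTO gwasCatalog VALUES
        ('rs2977608', 'Major depressive disorder', '8E-6'),
        ('rs13303010', 'Pancreatic cancer', '8E-14');
    INSERT INTO snp144 VALUES ('rs2977608', 'placeholder');  -- placeholder value
""")

# Inner join on the rsID: only SNPs present in both tables are returned.
rows = conn.execute("""
    SELECT g.name, g.trait, s.func
    FROM gwasCatalog AS g
    JOIN snp144 AS s ON s.name = g.name
    ORDER BY g.name
""").fetchall()
print(rows)
```

Against the real database the same ON clause would apply; a LEFT JOIN would keep catalog rows whose rsID is absent from the SNP table.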

Sample Rows
 
bin | chrom | chromStart | chromEnd | name | pubMedID | author | pubDate | journal | title | trait | initSample | replSample | region | genes | riskAllele | riskAlFreq | pValue | pValueDesc | orOrBeta | ci95 | platform | cnv
591 | chr1 | 832872 | 832873 | rs2977608 | 31969693 | Coleman JRI | 2020-01-23 | Mol Psychiatry | Genome-wide gene-environment analyses of major depressive disorder and reported lifetime traumatic experiences in UK Biobank. | Major depressive disorder | 29,475 European ancestry cases, 63,482 European ancestry controls | NA | 1p36.33 | NR | rs2977608-A | 0.259029 | 8E-6 |  | 1.0729614 | [1.04-1.1] | Affymetrix [7791636] (imputed) | N
591 | chr1 | 845016 | 845017 | rs141175086 | 26955885 | Lane JM | 2016-03-09 | Nat Commun | Genome-wide association analysis identifies novel loci for chronotype in 100,420 individuals from the UK Biobank. | Morning vs. evening chronotype | 8,724 European ancestry evening chronotype individuals, 26,948 European ancestry morning chronotype individuals | NA | 1p36.33 | LINC01128 | rs141175086-C | 0.998 | 4E-8 |  | 2.16 | [1.34-3.49] | Affymetrix [73355677] (imputed) | N
592 | chr1 | 946652 | 946653 | rs2272756 | 30895295 | Jonnalagadda M | 2019-03-21 | Genome Biol Evol | A genome-wide association study of skin and iris pigmentation among individuals of South Asian ancestry. | Skin reflectance (Melanin index) | 720 South Asian ancestry individuals |  | 1p36.33 | NR | rs2272756-A | NR | 3E-6 |  | 0.324533 | [0.19-0.46] unit increase | Affymetrix, Illumina [at least 398118] (imputed) | N
592 | chr1 | 959138 | 959139 | rs115438739 | 31596458 | Greenwood TA | 2019-10-09 | JAMA Psychiatry | Genome-wide Association of Endophenotypes for Schizophrenia From the Consortium on the Genetics of Schizophrenia (COGS) Study. | California verbal learning test score | 523 European ancestry schizophrenia cases, 100 Latino schizophrenia cases, 827 European ancestry controls, 83 Latino controls | NA | 1p36.33 | NOC2L | rs115438739-A | NR | 7E-6 |  | 7.540324 | [4.4-10.69] unit increase | Illumina [> 6200000] (imputed) | N
592 | chr1 | 959192 | 959193 | rs13303010 | 29422604 | Klein AP | 2018-02-08 | Nat Commun | Genome-wide meta-analysis identifies five new susceptibility loci for pancreatic cancer. | Pancreatic cancer | 9,040 European ancestry cases, 12,496 European ancestry controls | up to 2,737 cases, up to 4,752 controls | 1p36.33 | NOC2L | rs13303010-G | 0.11 | 8E-14 |  | 1.26 | [1.19-1.35] | Illumina [11381182] (imputed) | N
592 | chr1 | 960325 | 960326 | rs13303327 | 30598549 | Morris JA | 2018-12-31 | Nat Genet | An atlas of genetic influences on osteoporosis in humans and mice. | Heel bone mineral density | 426,824 British ancestry individuals | NA | 1p36.33 | KLHL17 | rs13303327-G | 0.0857658 | 4E-13 |  | 0.0237969 | [0.017-0.03] unit increase | NR [13737936] (imputed) | N
592 | chr1 | 960325 | 960326 | rs13303327 | 30595370 | Kichaev G | 2018-12-27 | Am J Hum Genet | Leveraging Polygenic Functional Enrichment to Improve GWAS Power. | Heel bone mineral density | approximately 446,000 European ancestry individuals | NA | 1p36.33 |  | rs13303327-? | NR | 5E-13 |  |  |  | NR [~ 8900000] (imputed) | N
592 | chr1 | 960325 | 960326 | rs13303327 | 30048462 | Kim SK | 2018-07-26 | PLoS One | Identification of 613 new loci associated with heel bone mineral density and a polygenic risk score for bone mineral density, os ... | Heel bone mineral density | 394,929 European ancestry individuals | NA | 1p36.33 |  | rs13303327-? | NR | 8E-17 |  | 0.0289903 | [0.022-0.036] unit increase | Affymetrix [20259828] (imputed) | N
592 | chr1 | 962485 | 962486 | rs201385366 | 31551469 | Tabassum R | 2019-09-24 | Nat Commun | Genetic architecture of human plasma lipidome and its link to cardiovascular disease. | Lysophosphatidylethanolamine levels | 2,045 European ancestry individuals | NA | 1p36.33 | KLHL17 | rs201385366-T | NR | 4E-8 | (LPE(22:6;0)) | 0.8736 | [0.56-1.18] unit decrease | Illumina [~ 9300000] (imputed) | N
592 | chr1 | 965138 | 965139 | rs140019196 | 31136621 | van de Putte R | 2019-05-28 | PLoS One | Exome chip association study excluded the involvement of rare coding variants with large effect sizes in the etiology of anorect ... | Anorectal malformation | 568 European ancestry cases, 1,860 European ancestry controls | NA | 1p36.33 | KLHL17 | rs140019196-G | 0.0 | 1E-12 |  |  |  | Illumina [239042] | N
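
Several numeric-looking fields (riskAlFreq, pValue, orOrBeta) are stored as varchar and may be blank or hold markers such as NR or NA, so they need conversion before any arithmetic. The sketch below is one way to do that; the helper name to_float and the choice to treat blank, NA, and NR alike as missing are assumptions, not part of the schema.

```python
from typing import Optional

# Assumed markers for "no value"; NR and NA both appear in the sample rows.
MISSING = {"", "NA", "NR"}


def to_float(text: str) -> Optional[float]:
    """Convert a varchar numeric field to float, or None when blank, NA, or NR."""
    text = text.strip()
    if text in MISSING:
        return None
    try:
        return float(text)
    except ValueError:
        return None  # anything unparseable is treated as missing in this sketch


# Values taken from the first sample row.
row = {"riskAlFreq": "0.259029", "pValue": "8E-6", "orOrBeta": "1.0729614"}
parsed = {key: to_float(value) for key, value in row.items()}
print(parsed)
```

Returning None rather than 0.0 keeps missing frequencies and effect sizes from being mistaken for real measurements in later filtering.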

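Note: all start coordinates in our database are 0-based, not 1-based. For example, the first sample row (chromStart 832872, chromEnd 832873) covers the single base at 1-based position 832873.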
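
Converting a row to the 1-based positions used in most publications only requires adding 1 to chromStart. The sketch below assumes the usual UCSC convention that chromEnd is exclusive (a half-open interval); the function name to_one_based is hypothetical.

```python
def to_one_based(chrom: str, chrom_start: int, chrom_end: int) -> str:
    """Format a 0-based, half-open interval as a 1-based, fully closed position string."""
    if not 0 <= chrom_start < chrom_end:
        raise ValueError("expected 0 <= chromStart < chromEnd")
    # Only the start shifts; the exclusive 0-based end equals the inclusive 1-based end.
    return f"{chrom}:{chrom_start + 1}-{chrom_end}"


print(to_one_based("chr1", 832872, 832873))  # chr1:832873-832873
```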

GWAS Catalog (gwasCatalog) Track Description
 

Description

This track displays single nucleotide polymorphisms (SNPs) identified by published Genome-Wide Association Studies (GWAS), as collected in the NHGRI-EBI GWAS Catalog, which is published jointly by the National Human Genome Research Institute (NHGRI) and the European Bioinformatics Institute (EMBL-EBI). Some abbreviations are used in the table fields above; for example, NR means not reported.

From http://www.ebi.ac.uk/gwas/docs/about:

The Catalog is a quality controlled, manually curated, literature-derived collection of all published genome-wide association studies assaying at least 100,000 SNPs and all SNP-trait associations with p-values < 1.0 x 10^-5 (Hindorff et al., 2009). For more details about the Catalog curation process and data extraction procedures, please refer to the Methods page.

Methods

From http://www.ebi.ac.uk/gwas/docs/methods:

The GWAS Catalog data is extracted from the literature. Extracted information includes publication information, study cohort information such as cohort size, country of recruitment and subject ethnicity, and SNP-disease association information including SNP identifier (i.e. RSID), p-value, gene and risk allele. Each study is also assigned a trait that best represents the phenotype under investigation. When multiple traits are analysed in the same study either multiple entries are created, or individual SNPs are annotated with their specific traits. Traits are used both to query and visualise the data in the Catalog's web form and diagram-based query interfaces.

Data extraction and curation for the GWAS Catalog is an expert activity; each step is performed by scientists supported by a web-based tracking and data entry system which allows multiple curators to search, annotate, verify and publish the Catalog data. Papers that qualify for inclusion in the Catalog are identified through weekly PubMed searches. They then undergo two levels of curation. First all data, including association information for SNPs, traits and general information about the study, are extracted by one curator. A second curator then performs an additional round of curation to double-check the accuracy and consistency of all the information. Finally, an automated pipeline performs validation of the extracted data; see the Quality control and SNP mapping section of the source Methods page for more details. This information is then used for queries and in the production of the diagram.

References

Hindorff LA, Sethupathy P, Junkins HA, Ramos EM, Mehta JP, Collins FS, Manolio TA. Potential etiologic and functional implications of genome-wide association loci for human diseases and traits. Proc Natl Acad Sci U S A. 2009 Jun 9;106(23):9362-7. PMID: 19474294; PMC: PMC2687147