PRS
mkdir -p materials/gwas materials/pgs materials/ld_ref tools LiftOver PRS
├── tools
│ ├── plink.zip
│ ├── plink2.zip
│ ├── liftOver
│ ├── PRSice_linux.zip
│ ├── PRScs
│ | └── ...
│ └── PRScsx
│ └── ...
├── materials
│ ├── gwas
│ | └──
│ ├── pgs
│ | └──
│ └── ld_ref
│ └──
├── demo_hg19.bed
├── demo_hg19.bim
├── demo_hg19.fam
├── demo_hg38_imput.bed
├── demo_hg38_imput.bim
├── demo_hg38_imput.fam
├── demo.phecov
├── LiftOver
├── PRS
└── ...
Main tools for quality control and association tests
wget http://hgdownload.soe.ucsc.edu/admin/exe/linux.x86_64/liftOver
wget https://github.com/choishingwan/PRSice/releases/download/2.3.5/PRSice_linux.zip
unzip PRSice_linux.zip -d PRSice2
PRSice2/PRSice_linux
git clone https://github.com/getian107/PRScs.git
git clone https://github.com/getian107/PRScsx.git
pip install numpy scipy h5py
Materials
Download the chain file to convert genomic coordinates between different genome assemblies
cd ../materials
# chain file for converting hg38 to hg19
wget http://hgdownload.soe.ucsc.edu/goldenPath/hg38/liftOver/hg38ToHg19.over.chain.gz
# chain file for converting hg19 to hg38
wget http://hgdownload.soe.ucsc.edu/goldenPath/hg19/liftOver/hg19ToHg38.over.chain.gz
cd ld_ref
wget -O ldblk_1kg_eas.tar.gz https://www.dropbox.com/s/7ek4lwwf2b7f749/ldblk_1kg_eas.tar.gz?dl=0
wget -O ldblk_1kg_eur.tar.gz https://www.dropbox.com/s/mt6var0z96vb6fv/ldblk_1kg_eur.tar.gz?dl=0
wget -O snpinfo_mult_1kg_hm3 https://www.dropbox.com/s/rhi806sstvppzzz/snpinfo_mult_1kg_hm3?dl=0 # necessary for PRS-CSx
tar -zxvf ldblk_1kg_eas.tar.gz
tar -zxvf ldblk_1kg_eur.tar.gz
cd ..
cd gwas
wget https://ftp.ebi.ac.uk/pub/databases/gwas/summary_statistics/GCST90428001-GCST90429000/GCST90428597/harmonised/GCST90428597.h.tsv.gz
wget -O phenocode-20002_1466.tsv.gz https://pheweb.org/UKB-Neale/download/20002_1466 # from UKB PheWeb
wget -O gout_ckb.tsv.zip https://pheweb.ckbiobank.org/download/m10 # from CKB PheWeb
gunzip GCST90428597.h.tsv.gz
gunzip phenocode-20002_1466.tsv.gz
unzip -P [] gout_ckb.tsv.zip # m10.tsv
cd ..
GCST90428597.h.tsv
chromosome |
base_pair_location |
effect_allele |
other_allele |
beta |
standard_error |
effect_allele_frequency |
p_value |
rsid |
rs_id |
n |
hm_coordinate_conversion |
hm_code |
variant_id |
1 |
817186 |
A |
G |
0.0627 |
0.0546 |
0.8534 |
0.2508 |
rs3094315 |
rs3094315 |
17054 |
lo |
10 |
1_817186_G_A |
1 |
818802 |
G |
A |
0.0391 |
0.0548 |
0.706 |
0.4756 |
rs3131969 |
rs3131969 |
8220 |
lo |
11 |
1_818802_A_G |
1 |
818812 |
G |
A |
0.0376 |
0.0549 |
0.7064 |
0.4934 |
rs3131968 |
rs3131968 |
8220 |
lo |
11 |
1_818812_A_G |
... |
|
|
|
|
|
|
|
|
|
|
|
|
|
phenocode-20002_1466.tsv
#chrom |
pos |
ref |
alt |
rsids |
nearest_genes |
pval |
beta |
ac |
1 |
693731 |
A |
G |
rs12238997 |
AL669831.1 |
0.12 |
0.00073 |
78182.9 |
1 |
717587 |
G |
A |
rs144155419 |
AL669831.1 |
0.42 |
-0.001 |
10567.6 |
1 |
730087 |
T |
C |
rs148120343 |
AL669831.1 |
0.53 |
0.00042 |
38079.4 |
... |
|
|
|
|
|
|
|
|
m10.tsv
variant_id |
chromosome |
base_pair_location |
other_allele |
effect_allele |
p_value |
effect_allele_frequency |
beta |
standard_error |
Download the variant weights for a disease from PGS Catalog
cd pgs
wget https://ftp.ebi.ac.uk/pub/databases/spot/pgs/scores/PGS002030/ScoringFiles/Harmonized/PGS002030_hmPOS_GRCh38.txt.gz
gunzip -c PGS002030_hmPOS_GRCh38.txt.gz > PGS002030_hmPOS_GRCh38.txt
grep -v "#" PGS002030_hmPOS_GRCh38.txt > tmp && mv tmp PGS002030_hmPOS_GRCh38.txt
cd ..
PGS002030_hmPOS_GRCh38.txt
rsID |
chr_name |
chr_position |
effect_allele |
other_allele |
effect_weight |
hm_source |
hm_rsID |
hm_chr |
hm_pos |
hm_inferOtherAllele |
rs6671356 |
1 |
1040026 |
C |
T |
1.76815822071497e-08 |
ENSEMBL |
rs6671356 |
1 |
1104646 |
|
rs6604968 |
1 |
1041700 |
G |
A |
1.41265515757531e-08 |
ENSEMBL |
rs6604968 |
1 |
1106320 |
|
rs12726255 |
1 |
1049950 |
G |
A |
1.40560016472365e-08 |
ENSEMBL |
rs12726255 |
1 |
1114570 |
|
... |
|
|
|
|
|
|
|
|
|
|
In this file, the rsID, chr_name, and chr_position fields are derived from GRCh37, while the hm_rsID
, hm_chr
, and hm_pos
fields correspond to GRCh38.
LiftOver
cd LiftOver
awk '{print "chr"$1,$4-1,$4,$2}' demo_hg38_imput.bim > demo_imput_hg38.bed
../tools/liftOver demo_imput_hg38.bed ../materials/hg38ToHg19.over.chain.gz demo_imput_hg19.bed unmapped.txt
awk '{gsub("chr", "", $1); print $4, $1, $3}' demo_imput_hg19.bed > hg38tohg19.txt
grep -v "#" unmapped.txt | awk '{print $4}' > hg38tohg19_rm.txt
cd ..
plink --bfile demo_hg38_imput --allow-extra-chr --exclude LiftOver/hg38tohg19_rm.txt --update-chr LiftOver/hg38tohg19.txt 2 1 --update-map LiftOver/hg38tohg19.txt 3 1 --chr 1-22 --make-bed --out demo_hg19_imput
demo_imput_hg38.bed
|
|
|
|
chr1 |
789894 |
789895 |
rs111203396 |
chr1 |
789941 |
789942 |
rs567893021 |
chr1 |
790098 |
790099 |
rs371966738 |
... |
|
|
|
demo_imput_hg38.bed
|
|
|
|
chr1 |
725274 |
725275 |
rs111203396 |
chr1 |
725321 |
725322 |
rs567893021 |
chr1 |
725478 |
725479 |
rs371966738 |
... |
|
|
|
unmapped.txt
|
#Deleted in new |
chr1 1649313 1649314 rs1198088331 |
#Deleted in new |
chr1 1649379 1649380 rs1383679870 |
#Deleted in new |
chr1 1649495 1649496 rs1287624275 |
... |
hg38tohg19.txt
|
|
|
rs111203396 |
1 |
725275 |
rs567893021 |
1 |
725322 |
rs371966738 |
1 |
725479 |
... |
|
|
hg38tohg19_rm.txt
|
rs1198088331 |
rs1383679870 |
rs1287624275 |
... |
PLINK v1.90b6.24 64-bit (6 Jun 2021) www.cog-genomics.org/plink/1.9/
(C) 2005-2021 Shaun Purcell, Christopher Chang GNU General Public License v3
Logging to demo_hg19_imput.log.
Options in effect:
--allow-extra-chr
--bfile demo_hg38_imput
--chr 1-22
--exclude LiftOver/hg38tohg19_rm.txt
--make-bed
--out demo_hg19_imput
--update-chr LiftOver/hg38tohg19.txt 2 1
--update-map LiftOver/hg38tohg19.txt 3 1
1031497 MB RAM detected; reserving 515748 MB for main workspace.
16537409 variants loaded from .bim file.
27719 people (0 males, 0 females, 27719 ambiguous) loaded from .fam.
Ambiguous sex IDs written to demo_hg19_imput.nosex .
--update-map: 16496680 values updated.
Warning: Base-pair positions are now unsorted!
--exclude: 16496680 variants remaining.
Using 1 thread (no multithreaded calculations invoked).
Before main variant filters, 27719 founders and 0 nonfounders present.
Calculating allele frequencies... done.
Total genotyping rate is 0.994176.
16496680 variants and 27719 people pass filters and QC.
Note: No phenotypes present.
--update-chr: 16496680 values updated.
--make-bed to demo_hg19_imput.bed + demo_hg19_imput.bim + demo_hg19_imput.fam
... done.
External Score
plink --bfile demo_hg19_imput --score materials/pgs/PGS002030_hmPOS_GRCh38.txt 1 4 6 --out PRS/demo_hg19_imput_PGS002030
# plink --bfile demo_hg19_imput --score materials/pgs/PGS002030_hmPOS_GRCh38.txt 1 4 6 sum --out PRS/demo_hg19_imput_PGS002030
plink2 --bfile demo_hg19_imput --score materials/pgs/PGS002030_hmPOS_GRCh38.txt 1 4 6 cols=+scoresums --out PRS/demo_hg19_imput_PGS002030
PLINK v1.90b6.24 64-bit (6 Jun 2021) www.cog-genomics.org/plink/1.9/
(C) 2005-2021 Shaun Purcell, Christopher Chang GNU General Public License v3
Logging to PRS/demo_hg19_imput_PGS002030.log.
Options in effect:
--bfile demo_hg19_imput
--out PRS/demo_hg19_imput_PGS002030
--score materials/pgs/PGS002030_hmPOS_GRCh38.txt 1 4 6
1031497 MB RAM detected; reserving 515748 MB for main workspace.
16496116 variants loaded from .bim file.
2000 people (0 males, 0 females, 2000 ambiguous) loaded from .fam.
Ambiguous sex IDs written to PRS/demo_hg19_imput_PGS002030.nosex .
Using 1 thread (no multithreaded calculations invoked).
Before main variant filters, 2000 founders and 0 nonfounders present.
Calculating allele frequencies... done.
Total genotyping rate is 0.994153.
16496116 variants and 2000 people pass filters and QC.
Note: No phenotypes present.
Warning: 14515 lines skipped in --score file (14439 due to variant ID mismatch,
76 due to allele code mismatch); see PRS/demo_hg19_imput_PGS002030.nopred for
details.
--score: 148696 valid predictors loaded.
--score: Results written to PRS/demo_hg19_imput_PGS002030.profile .
PLINK v2.00a3.7LM 64-bit Intel (24 Oct 2022) www.cog-genomics.org/plink/2.0/
(C) 2005-2022 Shaun Purcell, Christopher Chang GNU General Public License v3
Logging to PRS/demo_hg19_imput_PGS002030.log.
Options in effect:
--bfile demo_hg19_imput
--out PRS/demo_hg19_imput_PGS002030
--score materials/pgs/PGS002030_hmPOS_GRCh38.txt 1 4 6 cols=+scoresums
Start time: Mon Apr 14 14:53:28 2025
1031497 MiB RAM detected; reserving 515748 MiB for main workspace.
Using up to 128 threads (change this with --threads).
2000 samples (0 females, 0 males, 2000 ambiguous; 2000 founders) loaded from
demo_hg19_imput.fam.
16496116 variants loaded from demo_hg19_imput.bim.
Note: No phenotype data present.
Calculating allele frequencies... done.
Warning: 14439 --score file entries were skipped due to missing variant IDs,
and 76 were skipped due to mismatching allele codes.
(Add the 'list-variants' modifier to see which variants were actually used for
scoring.)
--score: 148696 variants processed.
--score: Results written to PRS/demo_hg19_imput_PGS002030.sscore .
End time: Mon Apr 14 14:53:31 2025
demo_hg19_imput_PGS002030.profile
FID |
IID |
PHENO |
CNT |
CNT2 |
SCORE |
TWB0001H02 |
TWB0001H02 |
-9 |
296222 |
115155 |
-3.62423e-09 |
2014-01-B08 |
2014-01-B08 |
-9 |
296186 |
115847 |
-1.39565e-08 |
TWB0011F05 |
TWB0011F05 |
-9 |
296418 |
114938 |
3.20549e-08 |
... |
|
|
|
|
|
demo_hg19_imput_PGS002030.sscore
FID |
IID |
ALLELE_CT |
NAMED_ALLELE_DOSAGE_SUM |
SCORE1_AVG |
SCORE1_SUM |
TWB0001H02 |
TWB0001H02 |
296222 |
115155 |
-3.62423e-09 |
-0.00107782 |
2014-01-B08 |
2014-01-B08 |
296186 |
115847 |
-1.39565e-08 |
-0.00415055 |
TWB0011F05 |
TWB0011F05 |
296418 |
114938 |
3.20549e-08 |
0.00953287 |
... |
|
|
|
|
|
Clumping + Thresholding
PRScie2
tools/PRSice2/PRSice_linux \
--target demo_hg19 \
--base materials/gwas/GCST90428597.h.tsv \
--binary-targe T \
--snp rsid \
--A1 effect_allele \
--stat beta \
--pvalue p_value \
--beta \
--pheno demo.phecov \
--pheno-col gout \
--cov demo.phecov \
--cov-col age,sex \
--out PRS/demo_hg19.prsice2
Rscript tools/PRSice2/PRSice \
--prsice tools/PRSice2/PRSice_linux \
--target demo_hg19 \
--base materials/gwas/GCST90428597.h.tsv \
--binary-targe T \
--snp rsid \
--A1 effect_allele \
--stat beta \
--pvalue p_value \
--beta \
--pheno demo.phecov \
--pheno-col gout \
--cov demo.phecov \
--cov-col age,sex \
--out PRS/demo_hg19.prsice2 \
--extract PRS/demo_hg19.prsice2.valid
PRSice 2.3.5 (2021-09-20)
https://github.com/choishingwan/PRSice
(C) 2016-2020 Shing Wan (Sam) Choi and Paul F. O'Reilly
GNU General Public License v3
If you use PRSice in any published work, please cite:
Choi SW, O'Reilly PF.
PRSice-2: Polygenic Risk Score Software for Biobank-Scale Data.
GigaScience 8, no. 7 (July 1, 2019)
2025-04-14 15:21:34
tools/PRSice2/PRSice_linux \
--a1 effect_allele \
--bar-levels 0.001,0.05,0.1,0.2,0.3,0.4,0.5,1 \
--base materials/gwas/GCST90428597.h.tsv \
--beta \
--binary-target T \
--clump-kb 250kb \
--clump-p 1.000000 \
--clump-r2 0.100000 \
--cov demo.phecov \
--cov-col age,sex \
--interval 5e-05 \
--lower 5e-08 \
--num-auto 22 \
--out PRS/demo_hg19.prsice2 \
--pheno demo.phecov \
--pheno-col gout \
--pvalue p_value \
--seed 2013053753 \
--snp rsid \
--stat beta \
--target demo_hg19 \
--thread 1 \
--upper 0.5
Initializing Genotype file: demo_hg19 (bed)
Start processing GCST90428597.h
==================================================
Base file: materials/gwas/GCST90428597.h.tsv
Header of file is:
chromosome base_pair_location effect_allele other_allele beta standard_error effect_allele_frequency p_value rsid rs_id n hm_coordinate_c nversion hm_code variant_id
Reading 100.00%
8714421 variant(s) observed in base file, with:
Error: A total of 6 duplicated SNP ID detected out of
8714415 input SNPs! Valid SNP ID (post --extract /
--exclude, non-duplicated SNPs) stored at
PRS/demo_hg19.prsice2.valid. You can avoid this
error by using --extract PRS/demo_hg19.prsice2.valid
PRSice 2.3.5 (2021-09-20)
https://github.com/choishingwan/PRSice
(C) 2016-2020 Shing Wan (Sam) Choi and Paul F. O'Reilly
GNU General Public License v3
If you use PRSice in any published work, please cite:
Choi SW, O'Reilly PF.
PRSice-2: Polygenic Risk Score Software for Biobank-Scale Data.
GigaScience 8, no. 7 (July 1, 2019)
2025-04-14 15:40:31
./tools/PRSice2/PRSice_linux \
--a1 effect_allele \
--bar-levels 0.001,0.05,0.1,0.2,0.3,0.4,0.5,1 \
--base materials/gwas/GCST90428597.h.tsv \
--beta \
--binary-target T \
--clump-kb 250kb \
--clump-p 1.000000 \
--clump-r2 0.100000 \
--cov demo.phecov \
--cov-col age,sex \
--extract PRS/demo_hg19.prsice2.valid \
--interval 5e-05 \
--lower 5e-08 \
--num-auto 22 \
--out PRS/demo_hg19.prsice2 \
--pheno demo.phecov \
--pheno-col gout \
--pvalue p_value \
--seed 685772835 \
--snp rsid \
--stat beta \
--target demo_hg19 \
--thread 1 \
--upper 0.5
Initializing Genotype file: demo_hg19 (bed)
Start processing GCST90428597.h
==================================================
SNP extraction/exclusion list contains 4 columns, will
assume first column contains the SNP ID
Base file: materials/gwas/GCST90428597.h.tsv
Header of file is:
chromosome base_pair_location effect_allele other_allele beta standard_error effect_allele_frequency p_value rsid rs_id n hm_coordinate_c nversion hm_code variant_id
Reading 100.00%
8714421 variant(s) observed in base file, with:
12 variant(s) excluded based on user input
8714409 total variant(s) included from base file
Loading Genotype info from target
==================================================
2000 people (1384 male(s), 616 female(s)) observed
2000 founder(s) included
Warning: Currently not support haploid chromosome and sex
chromosomes
28613 variant(s) not found in previous data
59 variant(s) with mismatch information
38747 ambiguous variant(s) excluded
564753 variant(s) included
Phenotype file: demo.phecov
Column Name of Sample ID: FID+IID
Note: If the phenotype file does not contain a header, the
column name will be displayed as the Sample ID which is
expected.
There are a total of 1 phenotype to process
Start performing clumping
Clumping Progress: 100.00%
Number of variant(s) after clumping : 96391
Processing the 1 th phenotype
gout is a binary phenotype
1000 control(s)
1000 case(s)
Processing the covariate file: demo.phecov
==============================
Include Covariates:
Name Missing Number of levels
age 0 -
sex 0 -
After reading the covariate file, 2000 sample(s) included
in the analysis
Start Processing
Processing 100.00%
There are 1 region(s) with p-value less than 1e-5. Please
note that these results are inflated due to the overfitting
inherent in finding the best-fit PRS (but it's still best
to find the best-fit PRS!).
You can use the --perm option (see manual) to calculate an
empirical P-value.
Begin plotting
Current Rscript version = 2.3.3
Plotting Bar Plot
Plotting the high resolution plot
demo_hg19.prsice2.prsice
Pheno |
Set |
Threshold |
R2 |
P |
Coefficient |
Standard.Error |
Num_SNP |
- |
Base |
5e-08 |
0.0204506 |
1.18029e-10 |
13.0836 |
2.03103 |
41 |
- |
Base |
5.005e-05 |
0.0199556 |
2.09099e-10 |
34.0981 |
5.36598 |
121 |
... |
|
|
|
|
|
|
|
- |
Base |
0.00010005 |
0.0209524 |
7.90434e-11 |
44.9678 |
6.91556 |
159 |
... |
|
|
|
|
|
|
|
demo_hg19.prsice2.summary
Phenotype |
Set |
Threshold |
PRS.R2 |
Full.R2 |
Null.R2 |
Prevalence |
Coefficient |
Standard.Error |
P |
Num_SNP |
- |
Base |
0.00010005 |
0.0328111 |
0.382376 |
0.361424 |
- |
44.9678 |
6.91556 |
7.90434e-11 |
159 |
demo_hg19.prsice2.best
FID |
IID |
In_Regression |
PRS |
sample1 |
sample1 |
Yes |
-0.00295987187 |
sample2 |
sample2 |
Yes |
0.00780391272 |
sample3 |
sample3 |
Yes |
-0.00158728225 |
... |
|
|
|
demo_hg19.prsice2_BARPLOT_2025-04-14.png

Shrinkage
PRS-CS/PRS-CSx
- create input summary file
awk 'BEGIN {print "SNP\tA1\tA2\tBETA\tP"} NR > 1 {print $1"\t"$5"\t"$4"\t"$8"\t"$6}' materials/gwas/m10.tsv > PRS/ckb.txt
awk 'BEGIN {print "SNP\tA1\tA2\tBETA\tP"} NR > 1 {print $5"\t"$4"\t"$3"\t"$8"\t"$7}' materials/gwas/phenocode-20002_1466.tsv > PRS/ukb.txt
ukb.txt
SNP |
A1 |
A2 |
BETA |
P |
rs12238997 |
G |
A |
0.00073 |
0.12 |
rs144155419 |
A |
G |
-0.001 |
0.42 |
rs148120343 |
C |
T |
0.00042 |
0.53 |
... |
|
|
|
|
* running PRS-CS
tools/PRScs/PRScs.py --ref_dir=materials/ld_ref/ldblk_1kg_eas \
--bim_prefix=demo_hg19_imput \
--sst_file=PRS/ckb.txt \
--n_gwas=75956 --seed=1 --out_dir=PRS/prscs
--ref_dir=materials/ld_ref/ldblk_1kg_eas
--bim_prefix=demo_hg19_imput
--sst_file=PRS/ckb.txt
--a=1
--b=0.5
--phi=None
--n_gwas=75956
--n_iter=1000
--n_burnin=500
--thin=5
--out_dir=PRS/prscs_
--chrom=['1']
--beta_std=FALSE
--write_psi=FALSE
--write_pst=FALSE
--seed=1
##### process chromosome 1 #####
... parse reference file: materials/ld_ref/ldblk_1kg_eas/snpinfo_1kg_hm3 ...
... 85604 SNPs on chromosome 1 read from materials/ld_ref/ldblk_1kg_eas/snpinfo_1kg_hm3 ...
... parse bim file: demo_hg19_imput.bim ...
... 1294576 SNPs on chromosome 1 read from demo_hg19_imput.bim ...
... parse sumstats file: PRS/ckb.txt ...
... 6850163 SNPs read from PRS/ckb.txt ...
... 78898 common SNPs in the reference, sumstats, and validation set ...
... parse reference LD on chromosome 1 ...
... MCMC ...
--- iter-100 ---
--- iter-200 ---
--- iter-300 ---
--- iter-400 ---
--- iter-500 ---
--- iter-600 ---
--- iter-700 ---
--- iter-800 ---
--- iter-900 ---
--- iter-1000 ---
... Estimated global shrinkage parameter: ...
... Done ...
...
##### process chromosome 22 #####
... parse reference file: materials/ld_ref/ldblk_1kg_eas/snpinfo_1kg_hm3 ...
... 14968 SNPs on chromosome 22 read from materials/ld_ref/ldblk_1kg_eas/snpinfo_1kg_hm3 ...
... parse bim file: demo_hg19_imput.bim ...
... 213675 SNPs on chromosome 22 read from demo_hg19_imput.bim ...
... parse sumstats file: PRS/ckb.txt ...
... 6850163 SNPs read from PRS/ckb.txt ...
... 12673 common SNPs in the reference, sumstats, and validation set ...
... parse reference LD on chromosome 22 ...
... MCMC ...
--- iter-100 ---
--- iter-200 ---
--- iter-300 ---
--- iter-400 ---
--- iter-500 ---
--- iter-600 ---
--- iter-700 ---
--- iter-800 ---
--- iter-900 ---
--- iter-1000 ---
... Estimated global shrinkage parameter: 9.25e-05 ...
... Done ...
prscs_pst_eff_a1_b0.5_phiauto_chr1.txt
|
|
|
|
|
|
1 |
rs3131969 |
754182 |
G |
A |
1.115942e-06 |
1 |
rs1048488 |
760912 |
C |
T |
-1.175908e-04 |
1 |
rs4040617 |
779322 |
G |
A |
-1.562966e-04 |
... |
|
|
|
|
|
cat PRS/prscs_pst_eff_a1_b0.5_phiauto_chr*.txt > PRS/prscs_pst_eff_a1_b0.5_phiauto.txt
plink2 --bfile demo_hg19_imput --score PRS/prscs_pst_eff_a1_b0.5_phiauto.txt 2 4 6 cols=+scoresums --out PRS/demo_hg19_imput_prscs
PLINK v2.00a3.7LM 64-bit Intel (24 Oct 2022) www.cog-genomics.org/plink/2.0/
(C) 2005-2022 Shaun Purcell, Christopher Chang GNU General Public License v3
Logging to PRS/demo_hg19_imput_prscs.log.
Options in effect:
--bfile demo_hg19_imput
--out PRS/demo_hg19_imput_prscs
--score PRS/prscs_pst_eff_a1_b0.5_phiauto.txt 2 4 6 cols=+scoresums
Start time: Tue Apr 15 12:08:51 2025
1031497 MiB RAM detected; reserving 515748 MiB for main workspace.
Using up to 128 threads (change this with --threads).
2000 samples (0 females, 0 males, 2000 ambiguous; 2000 founders) loaded from
demo_hg19_imput.fam.
16496116 variants loaded from demo_hg19_imput.bim.
Note: No phenotype data present.
Calculating allele frequencies... done.
--score: 957101 variants processed.
--score: Results written to PRS/demo_hg19_imput_prscs.sscore .
End time: Tue Apr 15 12:08:55 2025
ROC_decile.png

mkdir PRS/prscsx
tools/PRScsx/PRScsx.py --ref_dir=materials/ld_ref \
--bim_prefix=demo_hg19_imput \
--sst_file=PRS/ckb.txt,PRS/ukb.txt \
--n_gwas=75956,337159 --pop=EAS,EUR --seed=1 --meta=TRUE --out_dir=PRS/prscsx --out_name=demo
--ref_dir=materials/ld_ref
--bim_prefix=demo_hg19_imput
--sst_file=['PRS/ckb.txt', 'PRS/ukb.txt']
--a=1
--b=0.5
--phi=None
--n_gwas=[75956, 337159]
--pop=['EAS', 'EUR']
--n_iter=2000
--n_burnin=1000
--thin=5
--out_dir=PRS/prscsx
--out_name=demo
--chrom=['1', '2', '3', '4', '5']
--meta=TRUE
--write_pst=FALSE
--seed=1
*** 2 discovery populations detected ***
##### process chromosome 1 #####
... parse reference file: materials/ld_ref/snpinfo_mult_1kg_hm3 ...
... 109168 SNPs on chromosome 1 read from materials/ld_ref/snpinfo_mult_1kg_hm3 ...
... parse bim file: demo_hg19_imput.bim ...
... 1294576 SNPs on chromosome 1 read from demo_hg19_imput.bim ...
... parse EAS sumstats file: PRS/ckb.txt ...
... 6850163 SNPs read from PRS/ckb.txt ...
... 78898 common SNPs in the EAS reference, EAS sumstats, and validation set ...
... parse EUR sumstats file: PRS/ukb.txt ...
... 10894596 SNPs read from PRS/ukb.txt ...
... 82349 common SNPs in the EUR reference, EUR sumstats, and validation set ...
... parse EAS reference LD on chromosome 1 ...
... parse EUR reference LD on chromosome 1 ...
... align reference LD on chromosome 1 across populations ...
... 85336 valid SNPs across populations ...
... MCMC ...
--- iter-100 ---
--- iter-200 ---
--- iter-300 ---
--- iter-400 ---
--- iter-500 ---
--- iter-600 ---
--- iter-700 ---
--- iter-800 ---
--- iter-900 ---
--- iter-1000 ---
--- iter-1100 ---
--- iter-1200 ---
--- iter-1300 ---
--- iter-1400 ---
--- iter-1500 ---
--- iter-1600 ---
--- iter-1700 ---
--- iter-1800 ---
--- iter-1900 ---
--- iter-2000 ---
... Estimated global shrinkage parameter: 3.29e-05 ...
... Done ...
...
##### process chromosome 22 #####
... parse reference file: materials/ld_ref/snpinfo_mult_1kg_hm3 ...
... 18944 SNPs on chromosome 22 read from materials/ld_ref/snpinfo_mult_1kg_hm3 ...
... parse bim file: demo_hg19_imput.bim ...
... 213675 SNPs on chromosome 22 read from demo_hg19_imput.bim ...
... parse EAS sumstats file: PRS/ckb.txt ...
... 6850163 SNPs read from PRS/ckb.txt ...
... 12673 common SNPs in the EAS reference, EAS sumstats, and validation set ...
... parse EUR sumstats file: PRS/ukb.txt ...
... 10894596 SNPs read from PRS/ukb.txt ...
... 13399 common SNPs in the EUR reference, EUR sumstats, and validation set ...
... parse EAS reference LD on chromosome 22 ...
... parse EUR reference LD on chromosome 22 ...
... align reference LD on chromosome 22 across populations ...
... 13861 valid SNPs across populations ...
... MCMC ...
--- iter-100 ---
--- iter-200 ---
--- iter-300 ---
--- iter-400 ---
--- iter-500 ---
--- iter-600 ---
--- iter-700 ---
--- iter-800 ---
--- iter-900 ---
--- iter-1000 ---
--- iter-1100 ---
--- iter-1200 ---
--- iter-1300 ---
--- iter-1400 ---
--- iter-1500 ---
--- iter-1600 ---
--- iter-1700 ---
--- iter-1800 ---
--- iter-1900 ---
--- iter-2000 ---
... Estimated global shrinkage parameter: 2.48e-05 ...
... Done ...
demo_EAS_pst_eff_a1_b0.5_phiauto_chr10.txt
...
demo_EAS_pst_eff_a1_b0.5_phiauto_chr9.txt
demo_EUR_pst_eff_a1_b0.5_phiauto_chr10.txt
...
demo_EUR_pst_eff_a1_b0.5_phiauto_chr9.txt
demo_META_pst_eff_a1_b0.5_phiauto_chr10.txt
...
demo_META_pst_eff_a1_b0.5_phiauto_chr9.txt
prscs_pst_eff_a1_b0.5_phiauto_chr1.txt
|
|
|
|
|
|
1 |
rs3131969 |
754182 |
G |
A |
1.115942e-06 |
1 |
rs1048488 |
760912 |
C |
T |
-1.175908e-04 |
1 |
rs4040617 |
779322 |
G |
A |
-1.562966e-04 |
... |
|
|
|
|
|
cat PRS/prscsx/demo_EAS_pst_eff_a1_b0.5_phiauto_chr*.txt > PRS/prscsx_pst_eff_a1_b0.5_phiauto_EAS.txt
cat PRS/prscsx/demo_EUR_pst_eff_a1_b0.5_phiauto_chr*.txt > PRS/prscsx_pst_eff_a1_b0.5_phiauto_EUR.txt
cat PRS/prscsx/demo_META_pst_eff_a1_b0.5_phiauto_chr*.txt > PRS/prscsx_pst_eff_a1_b0.5_phiauto_META.txt
wc -l PRS/prscsx_pst_eff_a1_b0.5_phiauto_EAS.txt # 957101 variants
wc -l PRS/prscsx_pst_eff_a1_b0.5_phiauto_EUR.txt # 1000352 variants
wc -l PRS/prscsx_pst_eff_a1_b0.5_phiauto_META.txt # 1035617 variants
plink2 --bfile demo_hg19_imput --score PRS/prscsx_pst_eff_a1_b0.5_phiauto_META.txt 2 4 6 cols=+scoresums --out PRS/demo_hg19_imput_prscsx_meta
PLINK v2.00a3.7LM 64-bit Intel (24 Oct 2022) www.cog-genomics.org/plink/2.0/
(C) 2005-2022 Shaun Purcell, Christopher Chang GNU General Public License v3
Logging to PRS/demo_hg19_imput_prscsx_meta.log.
Options in effect:
--bfile demo_hg19_imput
--out PRS/demo_hg19_imput_prscsx_meta
--score PRS/prscsx_pst_eff_a1_b0.5_phiauto_META.txt 2 4 6 cols=+scoresums
Start time: Wed Apr 16 14:22:52 2025
1031497 MiB RAM detected; reserving 515748 MiB for main workspace.
Using up to 128 threads (change this with --threads).
2000 samples (0 females, 0 males, 2000 ambiguous; 2000 founders) loaded from
demo_hg19_imput.fam.
16496116 variants loaded from demo_hg19_imput.bim.
Note: No phenotype data present.
Calculating allele frequencies... done.
--score: 1035617 variants processed.
--score: Results written to PRS/demo_hg19_imput_prscsx_meta.sscore .
End time: Wed Apr 16 14:22:56 2025
ROC_decile.png
