Mendelian Randomization Report [Smoking and Graves’ disease (GD)]

Exposure 1: Smoking Initiation on Graves’ disease (GD)

Exposure 2: Age of Initiation Smoking on Graves’ disease (GD)

Exposure 3: Cigarettes per day on Graves’ disease (GD)

Exposure 4: Life time smoking on Graves’ disease (GD)

Exposure 1

Introduction

Data Preparation

1- Number of total SNPs in exposure: 25,008 SNPs (<\(5 \times 10^-8\))

2- Number of SNPs exposure with p-value < \(5 \times 10^-8\): 7,846 SNPs

3- Number of SNPs exposure after clumping : 93 SNPs

4- Number of total SNPs in outcome: 23,868 SNPs

5- Number of common variants between exposure and outcome: 90 SNPs (“rs10114490” “rs72896886” “rs56820925 have been eliminated)

6- Number of SNPs after replacing proxies: 3 SNPs from NIH LDproxy database according to EUR ancestry have been selected: We replace “rs10114490” , “rs72896886” and “rs56820925 by rs7871108&rs72896891&rs57703976 with R2 1 & 0.91 & 1, respectively).So, 93 SNPs remained.

7- Number of SNPs after harmonization (action=2) = 84 SNPs (Removing the following SNPs for incompatible alleles: rs13246563, rs72896886 and Removing the following SNPs for being palindromic with intermediate allele frequencies: rs10956809, rs1160685, rs13246563, rs2186122, rs6508144, rs7585579, rs7921378, rs9540729)

8- Number of SNPs after removing HLA region with exploring in HLA Genes, Nomenclature = 84 SNP

9- Number of SNPs after removing those that have MAF < 0.01 = 84 SNPs

10- Checking pleiotropy by PhenoScanner:

How many SNPs have been eliminated after checking the PhenoScanner website: 84 SNPs

Checking weakness of the instruments:

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   29.80   31.95   35.45   42.68   46.50  145.00

How many SNPs have been eliminated with checking the weakness: 0 SNP

RUN an initial MR:

##   id.exposure id.outcome outcome exposure                    method nsnp
## 1      Tq1nmA     wKlseC outcome exposure                  MR Egger   84
## 2      Tq1nmA     wKlseC outcome exposure           Weighted median   84
## 3      Tq1nmA     wKlseC outcome exposure Inverse variance weighted   84
## 4      Tq1nmA     wKlseC outcome exposure               Simple mode   84
## 5      Tq1nmA     wKlseC outcome exposure             Weighted mode   84
##             b        se      pval
## 1 -0.09125143 0.8483600 0.9146058
## 2  0.24684067 0.2457584 0.3151840
## 3 -0.03829324 0.1687440 0.8204775
## 4  0.66388271 0.6027112 0.2738658
## 5  0.58120491 0.5665021 0.3078942

##   id.exposure id.outcome outcome exposure                    method        Q
## 1      Tq1nmA     wKlseC outcome exposure                  MR Egger 82.76598
## 2      Tq1nmA     wKlseC outcome exposure Inverse variance weighted 82.77008
##   Q_df    Q_pval
## 1   82 0.4555403
## 2   83 0.4864693
##   id.exposure id.outcome outcome exposure egger_intercept         se      pval
## 1      Tq1nmA     wKlseC outcome exposure     0.001404963 0.02205274 0.9493568

Testing Outlier with PRESSO test

## $`Main MR results`
##        Exposure       MR Analysis Causal Estimate        Sd     T-stat
## 1 beta.exposure               Raw     -0.03829324 0.1685102 -0.2272459
## 2 beta.exposure Outlier-corrected              NA        NA         NA
##     P-value
## 1 0.8207914
## 2        NA
## 
## $`MR-PRESSO results`
## $`MR-PRESSO results`$`Global Test`
## $`MR-PRESSO results`$`Global Test`$RSSobs
## [1] 84.68025
## 
## $`MR-PRESSO results`$`Global Test`$Pvalue
## [1] 0.478
##   id.exposure id.outcome outcome exposure                    method nsnp
## 1      Tq1nmA     wKlseC outcome exposure                  MR Egger   84
## 2      Tq1nmA     wKlseC outcome exposure           Weighted median   84
## 3      Tq1nmA     wKlseC outcome exposure Inverse variance weighted   84
## 4      Tq1nmA     wKlseC outcome exposure               Simple mode   84
## 5      Tq1nmA     wKlseC outcome exposure             Weighted mode   84
##             b        se      pval
## 1 -0.09125143 0.8483600 0.9146058
## 2  0.24684067 0.2428262 0.3093761
## 3 -0.03829324 0.1687440 0.8204775
## 4  0.66388271 0.5925601 0.2657903
## 5  0.58120491 0.5739547 0.3141789

##   id.exposure id.outcome outcome exposure                    method        Q
## 1      Tq1nmA     wKlseC outcome exposure                  MR Egger 82.76598
## 2      Tq1nmA     wKlseC outcome exposure Inverse variance weighted 82.77008
##   Q_df    Q_pval
## 1   82 0.4555403
## 2   83 0.4864693
##   id.exposure id.outcome outcome exposure egger_intercept         se      pval
## 1      Tq1nmA     wKlseC outcome exposure     0.001404963 0.02205274 0.9493568

Studentized residuals:

Radial test

## 
## Radial IVW
## 
##                     Estimate Std.Error    t value  Pr(>|t|)
## Effect (Mod.2nd) -0.03829345 0.1685102 -0.2272470 0.8202317
## Iterative        -0.03829345 0.1685102 -0.2272470 0.8202317
## Exact (FE)       -0.03917752 0.1687454 -0.2321695 0.8164064
## Exact (RE)       -0.03918637 0.1724962 -0.2271723 0.8208483
## 
## 
## Residual standard error: 0.999 on 83 degrees of freedom
## 
## F-statistic: 0.05 on 1 and 83 DF, p-value: 0.821
## Q-Statistic for heterogeneity: 82.7689 on 83 DF , p-value: 0.4865057
## 
##  No significant outliers 
## Number of iterations = 2
## [1] "No significant outliers"

Cook’s distance

In statistics, Cook’s distance or Cook’s D is a commonly used estimate of the influence of a data point when performing a least-squares regression analysis.[1] In a practical ordinary least squares analysis, Cook’s distance can be used in several ways:

1- To indicate influential data points that are particularly worth checking for validity.

2- To indicate regions of the design space where it would be good to be able to obtain more data points.

It is named after the American statistician R. Dennis Cook, who introduced the concept in 1977.

Refernce

##         20         22         23         25         66         73         77 
## 0.14237308 0.05712271 0.07812693 0.07705094 0.05064444 0.07619427 0.05108820

Run After deleting new outlier: Final Results:

##   id.exposure id.outcome outcome exposure                    method nsnp
## 1      Tq1nmA     wKlseC outcome exposure                  MR Egger   65
## 2      Tq1nmA     wKlseC outcome exposure           Weighted median   65
## 3      Tq1nmA     wKlseC outcome exposure Inverse variance weighted   65
## 4      Tq1nmA     wKlseC outcome exposure               Simple mode   65
## 5      Tq1nmA     wKlseC outcome exposure             Weighted mode   65
##           b        se       pval
## 1 0.2718293 0.9183576 0.76820739
## 2 0.4431695 0.2616346 0.09029398
## 3 0.4080681 0.1900178 0.03175145
## 4 0.7028273 0.6093891 0.25306373
## 5 0.6834861 0.5736731 0.23788812

##   id.exposure id.outcome outcome exposure                    method        Q
## 1      Tq1nmA     wKlseC outcome exposure                  MR Egger 36.59845
## 2      Tq1nmA     wKlseC outcome exposure Inverse variance weighted 36.62145
##   Q_df    Q_pval
## 1   63 0.9968696
## 2   64 0.9976589
##   id.exposure id.outcome outcome exposure egger_intercept         se      pval
## 1      Tq1nmA     wKlseC outcome exposure     0.003634029 0.02396613 0.8799618

Sensitivity analyses with MendelianRandomization Package

## 
## Inverse-variance weighted method
## (variants uncorrelated, random-effect model)
## 
## Number of Variants : 65 
## 
## ------------------------------------------------------------------
##  Method Estimate Std Error 95% CI       p-value
##     IVW    0.408     0.190 0.036, 0.780   0.032
## ------------------------------------------------------------------
## Residual standard error =  0.756 
## Residual standard error is set to 1 in calculation of confidence interval when its estimate is less than 1.
## Heterogeneity test statistic (Cochran's Q) = 36.6214 on 64 degrees of freedom, (p-value = 0.9977). I^2 = 0.0%.
##                     Method Estimate Std Error 95% CI        P-value
##              Simple median    0.485     0.267  -0.038 1.007   0.069
##            Weighted median    0.451     0.264  -0.066 0.969   0.087
##  Penalized weighted median    0.440     0.264  -0.078 0.958   0.096
##                                                                    
##                        IVW    0.408     0.190   0.036 0.780   0.032
##              Penalized IVW    0.408     0.190   0.036 0.780   0.032
##                 Robust IVW    0.408     0.199   0.018 0.798   0.040
##       Penalized robust IVW    0.408     0.199   0.018 0.798   0.040
##                                                                    
##                   MR-Egger    0.272     0.918  -1.528 2.072   0.767
##                (intercept)    0.004     0.024  -0.043 0.051   0.879
##         Penalized MR-Egger    0.272     0.918  -1.528 2.072   0.767
##                (intercept)    0.004     0.024  -0.043 0.051   0.879
##            Robust MR-Egger    0.246     0.979  -1.673 2.164   0.802
##                (intercept)    0.004     0.025  -0.045 0.054   0.865
##  Penalized robust MR-Egger    0.246     0.979  -1.673 2.164   0.802
##                (intercept)    0.004     0.025  -0.045 0.054   0.865

id.exposure id.outcome exposure outcome snp_r2.exposure snp_r2.outcome correct_causal_direction steiger_pval
Tq1nmA wKlseC exposure outcome 0.0081828 8.99e-05 TRUE 0
## $r2_exp
## [1] 0
## 
## $r2_out
## [1] 0.25
## 
## $r2_exp_adj
## [1] 0
## 
## $r2_out_adj
## [1] 0.25
## 
## $correct_causal_direction
## [1] FALSE
## 
## $steiger_test
## [1] 0
## 
## $correct_causal_direction_adj
## [1] FALSE
## 
## $steiger_test_adj
## [1] 0
## 
## $vz
## [1] NaN
## 
## $vz0
## [1] 0
## 
## $vz1
## [1] NaN
## 
## $sensitivity_ratio
## [1] NaN
## 
## $sensitivity_plot

Working with MRraps

## $beta.hat
## [1] 0.413683
## 
## $beta.se
## [1] 0.1956862
## 
## $beta.p.value
## [1] 0.03451424
## 
## $naive.se
## [1] 0.1933894
## 
## $chi.sq.test
## [1] 36.55829
##   over.dispersion loss.function  beta.hat   beta.se
## 1           FALSE            l2 0.4136830 0.1956862
## 2           FALSE         huber 0.4070273 0.2007585
## 3           FALSE         tukey 0.4115066 0.2007664
## 4            TRUE            l2 0.4136829 0.1956947
## 5            TRUE         huber 0.4070274 0.2007669
## 6            TRUE         tukey 0.4115066 0.2007752
## 
## MR-Lasso method 
## 
## Number of variants : 65 
## Number of valid instruments : 65 
## Tuning parameter : 0.2946335 
## ------------------------------------------------------------------
##  Exposure Estimate Std Error 95% CI       p-value
##  exposure    0.408     0.190 0.036, 0.780   0.032
## ------------------------------------------------------------------
## 
## Constrained maximum likelihood method (MRcML) 
## Number of Variants:  65 
## Results for:  cML-MA-BIC 
## ------------------------------------------------------------------
##      Method Estimate    SE Pvalue        95% CI
##  cML-MA-BIC    0.412 0.192  0.031 [0.037,0.788]
## ------------------------------------------------------------------
## 
## Debiased inverse-variance weighted method
## (Over.dispersion:TRUE)
## 
## Number of Variants : 65 
## ------------------------------------------------------------------
##  Method Estimate Std Error 95% CI       p-value Condition
##    dIVW    0.418     0.195 0.036, 0.799   0.032   338.528
## ------------------------------------------------------------------
## 
## Mode-based method of Hartwig et al
## (weighted, delta standard errors [not assuming NOME], bandwidth factor = 1)
## 
## Number of Variants : 65 
## ------------------------------------------------------------------
##  Method Estimate Std Error  95% CI       p-value
##     MBE    0.683     0.593 -0.479, 1.846   0.249
## ------------------------------------------------------------------

Exposure 2

Introduction

Data Preparation

1- Number of total SNPs in exposure: 2,076 SNPs

2- Number of SNPs exposure with p-value < \(10^-8\) = 447 SNPs

3- Number of SNPs exposure after clumping = 7 SNPs

4- Number of total SNPs in outcome: 1,968 SNPs

5- Number of common variants between exposure and outcome:7

6- Number of SNPs after replacing proxies: -

7- Number of SNPs after harmonization (action=2) = 7 SNPs

8- Number of SNPs after removing HLA region with exploring in HLA Genes, Nomenclature = 7 SNP

9- Number of SNPs after removing those that have MAF < 0.01 = 7 SNPs

10- Checking pleiotropy by PhenoScanner:

How many SNPs have been eliminated after checking the PhenoScanner website: 7 SNPs

Checking weakness of the instruments:

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   30.70   32.65   33.40   39.20   46.15   52.70

How many SNPs have been eliminated with checking the weakness: 0 SNP

RUN an initial MR:

##   id.exposure id.outcome outcome exposure                    method nsnp
## 1      poh7vM     tYC4bs outcome exposure                  MR Egger    7
## 2      poh7vM     tYC4bs outcome exposure           Weighted median    7
## 3      poh7vM     tYC4bs outcome exposure Inverse variance weighted    7
## 4      poh7vM     tYC4bs outcome exposure               Simple mode    7
## 5      poh7vM     tYC4bs outcome exposure             Weighted mode    7
##           b       se      pval
## 1 3.3133470 2.607398 0.2597313
## 2 1.3987757 1.065708 0.1893408
## 3 1.2576244 0.835104 0.1320802
## 4 0.4435410 1.599845 0.7909001
## 5 0.9554978 1.404340 0.5216325

##   id.exposure id.outcome outcome exposure                    method        Q
## 1      poh7vM     tYC4bs outcome exposure                  MR Egger 5.741414
## 2      poh7vM     tYC4bs outcome exposure Inverse variance weighted 6.541648
##   Q_df    Q_pval
## 1    5 0.3321991
## 2    6 0.3653194
##   id.exposure id.outcome outcome exposure egger_intercept         se      pval
## 1      poh7vM     tYC4bs outcome exposure     -0.04724274 0.05659145 0.4418768

Testing Outlier with PRESSO test

## $`Main MR results`
##        Exposure       MR Analysis Causal Estimate       Sd   T-stat   P-value
## 1 beta.exposure               Raw        1.257624 0.835104 1.505949 0.1827924
## 2 beta.exposure Outlier-corrected              NA       NA       NA        NA
## 
## $`MR-PRESSO results`
## $`MR-PRESSO results`$`Global Test`
## $`MR-PRESSO results`$`Global Test`$RSSobs
## [1] 8.562676
## 
## $`MR-PRESSO results`$`Global Test`$Pvalue
## [1] 0.446
##   id.exposure id.outcome outcome exposure                    method nsnp
## 1      poh7vM     tYC4bs outcome exposure                  MR Egger    7
## 2      poh7vM     tYC4bs outcome exposure           Weighted median    7
## 3      poh7vM     tYC4bs outcome exposure Inverse variance weighted    7
## 4      poh7vM     tYC4bs outcome exposure               Simple mode    7
## 5      poh7vM     tYC4bs outcome exposure             Weighted mode    7
##           b       se      pval
## 1 3.3133470 2.607398 0.2597313
## 2 1.3987757 1.092365 0.2003688
## 3 1.2576244 0.835104 0.1320802
## 4 0.4435410 1.602862 0.7912822
## 5 0.9554978 1.369968 0.5116219

##   id.exposure id.outcome outcome exposure                    method        Q
## 1      poh7vM     tYC4bs outcome exposure                  MR Egger 5.741414
## 2      poh7vM     tYC4bs outcome exposure Inverse variance weighted 6.541648
##   Q_df    Q_pval
## 1    5 0.3321991
## 2    6 0.3653194
##   id.exposure id.outcome outcome exposure egger_intercept         se      pval
## 1      poh7vM     tYC4bs outcome exposure     -0.04724274 0.05659145 0.4418768

Studentized residuals:

Radial test

## 
## Radial IVW
## 
##                  Estimate Std.Error  t value  Pr(>|t|)
## Effect (Mod.2nd) 1.258673 0.8356837 1.506160 0.1320261
## Iterative        1.258675 0.8356846 1.506160 0.1320260
## Exact (FE)       1.286563 0.8038056 1.600590 0.1094678
## Exact (RE)       1.284338 0.7257424 1.769688 0.1271788
## 
## 
## Residual standard error: 1.04 on 6 degrees of freedom
## 
## F-statistic: 2.27 on 1 and 6 DF, p-value: 0.183
## Q-Statistic for heterogeneity: 6.488224 on 6 DF , p-value: 0.3707738
## 
##  No significant outliers 
## Number of iterations = 3
## [1] "No significant outliers"

Cook’s distance

In statistics, Cook’s distance or Cook’s D is a commonly used estimate of the influence of a data point when performing a least-squares regression analysis.[1] In a practical ordinary least squares analysis, Cook’s distance can be used in several ways:

1- To indicate influential data points that are particularly worth checking for validity.

2- To indicate regions of the design space where it would be good to be able to obtain more data points.

It is named after the American statistician R. Dennis Cook, who introduced the concept in 1977.

Refernce

##        3 
## 2.950427

Run After deleting new outlier: Final Results:

##   id.exposure id.outcome outcome exposure                    method nsnp
## 1      poh7vM     tYC4bs outcome exposure                  MR Egger    5
## 2      poh7vM     tYC4bs outcome exposure           Weighted median    5
## 3      poh7vM     tYC4bs outcome exposure Inverse variance weighted    5
## 4      poh7vM     tYC4bs outcome exposure               Simple mode    5
## 5      poh7vM     tYC4bs outcome exposure             Weighted mode    5
##          b        se       pval
## 1 1.607576 2.6388686 0.58543258
## 2 1.568729 1.1746335 0.18171103
## 3 2.136997 0.9267736 0.02111923
## 4 1.326165 1.4610788 0.41539488
## 5 1.468391 1.3284507 0.33102058

##   id.exposure id.outcome outcome exposure                    method        Q
## 1      poh7vM     tYC4bs outcome exposure                  MR Egger 2.943647
## 2      poh7vM     tYC4bs outcome exposure Inverse variance weighted 2.989560
##   Q_df    Q_pval
## 1    3 0.4003956
## 2    4 0.5595741
##   id.exposure id.outcome outcome exposure egger_intercept         se      pval
## 1      poh7vM     tYC4bs outcome exposure      0.01376197 0.06422619 0.8440717

Sensitivity analyses with MendelianRandomization Package

## 
## Inverse-variance weighted method
## (variants uncorrelated, random-effect model)
## 
## Number of Variants : 5 
## 
## ------------------------------------------------------------------
##  Method Estimate Std Error 95% CI       p-value
##     IVW    2.137     0.927 0.321, 3.953   0.021
## ------------------------------------------------------------------
## Residual standard error =  0.865 
## Residual standard error is set to 1 in calculation of confidence interval when its estimate is less than 1.
## Heterogeneity test statistic (Cochran's Q) = 2.9896 on 4 degrees of freedom, (p-value = 0.5596). I^2 = 0.0%.
##                     Method Estimate Std Error 95% CI        P-value
##              Simple median    1.644     1.241  -0.788 4.076   0.185
##            Weighted median    1.581     1.127  -0.628 3.791   0.161
##  Penalized weighted median    1.581     1.127  -0.628 3.791   0.161
##                                                                    
##                        IVW    2.137     0.927   0.321 3.953   0.021
##              Penalized IVW    2.137     0.927   0.321 3.953   0.021
##                 Robust IVW    2.110     0.669   0.798 3.422   0.002
##       Penalized robust IVW    2.110     0.669   0.798 3.422   0.002
##                                                                    
##                   MR-Egger    1.608     2.639  -3.565 6.780   0.542
##                (intercept)    0.014     0.064  -0.112 0.140   0.830
##         Penalized MR-Egger    1.608     2.639  -3.565 6.780   0.542
##                (intercept)    0.014     0.064  -0.112 0.140   0.830
##            Robust MR-Egger    1.583     2.307  -2.939 6.105   0.493
##                (intercept)    0.014     0.050  -0.084 0.112   0.783
##  Penalized robust MR-Egger    1.583     2.307  -2.939 6.105   0.493
##                (intercept)    0.014     0.050  -0.084 0.112   0.783

id.exposure id.outcome exposure outcome snp_r2.exposure snp_r2.outcome correct_causal_direction steiger_pval
poh7vM tYC4bs exposure outcome 0.0006169 1.81e-05 TRUE 0
## $r2_exp
## [1] 0
## 
## $r2_out
## [1] 0.25
## 
## $r2_exp_adj
## [1] 0
## 
## $r2_out_adj
## [1] 0.25
## 
## $correct_causal_direction
## [1] FALSE
## 
## $steiger_test
## [1] 0
## 
## $correct_causal_direction_adj
## [1] FALSE
## 
## $steiger_test_adj
## [1] 0
## 
## $vz
## [1] NaN
## 
## $vz0
## [1] 0
## 
## $vz1
## [1] NaN
## 
## $sensitivity_ratio
## [1] NaN
## 
## $sensitivity_plot

Working with MRraps

## $beta.hat
## [1] 2.165151
## 
## $beta.se
## [1] 0.9686016
## 
## $beta.p.value
## [1] 0.02539521
## 
## $naive.se
## [1] 0.9575782
## 
## $chi.sq.test
## [1] 2.92799
##   over.dispersion loss.function beta.hat   beta.se
## 1           FALSE            l2 2.165151 0.9686016
## 2           FALSE         huber 2.165151 0.9937637
## 3           FALSE         tukey 2.129517 0.9928745
## 4            TRUE            l2 2.165153 0.9692145
## 5            TRUE         huber 2.165151 0.9944238
## 6            TRUE         tukey 2.129518 0.9935361
## 
## MR-Lasso method 
## 
## Number of variants : 5 
## Number of valid instruments : 5 
## Tuning parameter : 0.5065273 
## ------------------------------------------------------------------
##  Exposure Estimate Std Error 95% CI       p-value
##  exposure    2.137     0.927 0.321, 3.953   0.021
## ------------------------------------------------------------------
## 
## Constrained maximum likelihood method (MRcML) 
## Number of Variants:  5 
## Results for:  cML-MA-BIC 
## ------------------------------------------------------------------
##      Method Estimate    SE Pvalue        95% CI
##  cML-MA-BIC    2.164 0.949  0.023 [0.304,4.023]
## ------------------------------------------------------------------
## 
## Debiased inverse-variance weighted method
## (Over.dispersion:TRUE)
## 
## Number of Variants : 5 
## ------------------------------------------------------------------
##  Method Estimate Std Error 95% CI       p-value Condition
##    dIVW    2.187     0.962 0.301, 4.073   0.023    91.947
## ------------------------------------------------------------------
## 
## Mode-based method of Hartwig et al
## (weighted, delta standard errors [not assuming NOME], bandwidth factor = 1)
## 
## Number of Variants : 5 
## ------------------------------------------------------------------
##  Method Estimate Std Error  95% CI       p-value
##     MBE    1.468     1.249 -0.980, 3.916   0.240
## ------------------------------------------------------------------

Exposure 3

Introduction

Data Preparation

1- Number of total SNPs in exposure: 5,213 SNPs

2- Number of SNPs exposure with p-value < \(10^-8\) = 2,129 SNPs

3- Number of SNPs exposure after clumping = 23 SNPs

4- Number of total SNPs in outcome: 4,943 SNPs

5- Number of common variants between exposure and outcome: 21 SNPs(“rs7951365” and “rs2273500” have been eliminated)

6- Number of SNPs after replacing proxies: 1 SNPs from NIH LDproxy database according to EUR ancestry have been selected: We replace “rs7951365” by rs7933830 with R2 1. So, 22 SNPs remained.

7- Number of SNPs after harmonization (action=2) = 21 SNPs (Removing the following SNPs for being palindromic with intermediate allele frequencies:rs787362)

8- Number of SNPs after removing HLA region with exploring in HLA Genes, Nomenclature = 21 SNP

9- Number of SNPs after removing those that have MAF < 0.01 = 21 SNPs

10- Checking pleiotropy by PhenoScanner:

How many SNPs have been eliminated after checking the PhenoScanner website: 21 SNPs

Checking weakness of the instruments:

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   30.90   33.95   36.90   75.55   58.45  366.00

How many SNPs have been eliminated with checking the weakness: 0 SNP

RUN an initial MR:

##   id.exposure id.outcome outcome exposure                    method nsnp
## 1      DCkPcM     E0lAMo outcome exposure                  MR Egger   11
## 2      DCkPcM     E0lAMo outcome exposure           Weighted median   11
## 3      DCkPcM     E0lAMo outcome exposure Inverse variance weighted   11
## 4      DCkPcM     E0lAMo outcome exposure               Simple mode   11
## 5      DCkPcM     E0lAMo outcome exposure             Weighted mode   11
##           b        se       pval
## 1 0.2468015 0.5054122 0.63700200
## 2 0.4242802 0.2873233 0.13976553
## 3 0.4941513 0.2214949 0.02568225
## 4 0.3028634 0.4142543 0.48149459
## 5 0.3890023 0.3125860 0.24170540

##   id.exposure id.outcome outcome exposure                    method        Q
## 1      DCkPcM     E0lAMo outcome exposure                  MR Egger 1.602865
## 2      DCkPcM     E0lAMo outcome exposure Inverse variance weighted 1.899316
##   Q_df    Q_pval
## 1    9 0.9963093
## 2   10 0.9970552
##   id.exposure id.outcome outcome exposure egger_intercept         se      pval
## 1      DCkPcM     E0lAMo outcome exposure      0.01463246 0.02687456 0.5993448

Testing Outlier with PRESSO test

## $`Main MR results`
##        Exposure       MR Analysis Causal Estimate         Sd   T-stat
## 1 beta.exposure               Raw       0.4941513 0.09652998 5.119148
## 2 beta.exposure Outlier-corrected              NA         NA       NA
##        P-value
## 1 0.0004512495
## 2           NA
## 
## $`MR-PRESSO results`
## $`MR-PRESSO results`$`Global Test`
## $`MR-PRESSO results`$`Global Test`$RSSobs
## [1] 2.164131
## 
## $`MR-PRESSO results`$`Global Test`$Pvalue
## [1] 0.995
##   id.exposure id.outcome outcome exposure                    method nsnp
## 1      DCkPcM     E0lAMo outcome exposure                  MR Egger   11
## 2      DCkPcM     E0lAMo outcome exposure           Weighted median   11
## 3      DCkPcM     E0lAMo outcome exposure Inverse variance weighted   11
## 4      DCkPcM     E0lAMo outcome exposure               Simple mode   11
## 5      DCkPcM     E0lAMo outcome exposure             Weighted mode   11
##           b        se       pval
## 1 0.2468015 0.5054122 0.63700200
## 2 0.4242802 0.2851799 0.13681336
## 3 0.4941513 0.2214949 0.02568225
## 4 0.3028634 0.4450724 0.51164674
## 5 0.3890023 0.2990912 0.22256335

##   id.exposure id.outcome outcome exposure                    method        Q
## 1      DCkPcM     E0lAMo outcome exposure                  MR Egger 1.602865
## 2      DCkPcM     E0lAMo outcome exposure Inverse variance weighted 1.899316
##   Q_df    Q_pval
## 1    9 0.9963093
## 2   10 0.9970552
##   id.exposure id.outcome outcome exposure egger_intercept         se      pval
## 1      DCkPcM     E0lAMo outcome exposure      0.01463246 0.02687456 0.5993448

Studentized residuals:

Radial test

## 
## Radial IVW
## 
##                   Estimate  Std.Error  t value     Pr(>|t|)
## Effect (Mod.2nd) 0.4940907 0.09651567 5.119280 3.067048e-07
## Iterative        0.4940907 0.09651567 5.119280 3.067048e-07
## Exact (FE)       0.4952785 0.22216433 2.229334 2.579168e-02
## Exact (RE)       0.4952773 0.09534169 5.194761 4.043473e-04
## 
## 
## Residual standard error: 0.434 on 10 degrees of freedom
## 
## F-statistic: 26.21 on 1 and 10 DF, p-value: 0.000451
## Q-Statistic for heterogeneity: 1.887378 on 10 DF , p-value: 0.9971327
## 
##  No significant outliers 
## Number of iterations = 2
## [1] "No significant outliers"

Cook’s distance

In statistics, Cook’s distance or Cook’s D is a commonly used estimate of the influence of a data point when performing a least-squares regression analysis.[1] In a practical ordinary least squares analysis, Cook’s distance can be used in several ways:

1- To indicate influential data points that are particularly worth checking for validity.

2- To indicate regions of the design space where it would be good to be able to obtain more data points.

It is named after the American statistician R. Dennis Cook, who introduced the concept in 1977.

Refernce

par(mfrow = c(2, 2))
model <- lm(data$beta.outcome~data$beta.exposure-1)
plot(model)

par(mfrow = c(1, 1))

cooksD <- cooks.distance(model)
influential <- cooksD[(cooksD > (3 * mean(cooksD, na.rm = TRUE)))]
influential
##         5 
## 0.2299813

Run After deleting new outlier: Final Results:

##   id.exposure id.outcome outcome exposure                    method nsnp
## 1      DCkPcM     E0lAMo outcome exposure                  MR Egger   11
## 2      DCkPcM     E0lAMo outcome exposure           Weighted median   11
## 3      DCkPcM     E0lAMo outcome exposure Inverse variance weighted   11
## 4      DCkPcM     E0lAMo outcome exposure               Simple mode   11
## 5      DCkPcM     E0lAMo outcome exposure             Weighted mode   11
##           b        se       pval
## 1 0.2468015 0.5054122 0.63700200
## 2 0.4242802 0.2761511 0.12443883
## 3 0.4941513 0.2214949 0.02568225
## 4 0.3028634 0.4286115 0.49594001
## 5 0.3890023 0.3168860 0.24771799

##   id.exposure id.outcome outcome exposure                    method        Q
## 1      DCkPcM     E0lAMo outcome exposure                  MR Egger 1.602865
## 2      DCkPcM     E0lAMo outcome exposure Inverse variance weighted 1.899316
##   Q_df    Q_pval
## 1    9 0.9963093
## 2   10 0.9970552
##   id.exposure id.outcome outcome exposure egger_intercept         se      pval
## 1      DCkPcM     E0lAMo outcome exposure      0.01463246 0.02687456 0.5993448

Sensitivity analyses with MendelianRandomization Package

## 
## Inverse-variance weighted method
## (variants uncorrelated, random-effect model)
## 
## Number of Variants : 11 
## 
## ------------------------------------------------------------------
##  Method Estimate Std Error 95% CI       p-value
##     IVW    0.494     0.221 0.060, 0.928   0.026
## ------------------------------------------------------------------
## Residual standard error =  0.436 
## Residual standard error is set to 1 in calculation of confidence interval when its estimate is less than 1.
## Heterogeneity test statistic (Cochran's Q) = 1.8993 on 10 degrees of freedom, (p-value = 0.9971). I^2 = 0.0%.
##                     Method Estimate Std Error 95% CI        P-value
##              Simple median    0.447     0.327  -0.195 1.088   0.172
##            Weighted median    0.425     0.282  -0.127 0.977   0.131
##  Penalized weighted median    0.425     0.282  -0.127 0.977   0.131
##                                                                    
##                        IVW    0.494     0.221   0.060 0.928   0.026
##              Penalized IVW    0.494     0.221   0.060 0.928   0.026
##                 Robust IVW    0.480     0.167   0.152 0.807   0.004
##       Penalized robust IVW    0.480     0.167   0.152 0.807   0.004
##                                                                    
##                   MR-Egger    0.247     0.505  -0.744 1.237   0.625
##                (intercept)    0.015     0.027  -0.038 0.067   0.586
##         Penalized MR-Egger    0.247     0.505  -0.744 1.237   0.625
##                (intercept)    0.015     0.027  -0.038 0.067   0.586
##            Robust MR-Egger    0.254     0.334  -0.402 0.909   0.448
##                (intercept)    0.014     0.025  -0.034 0.063   0.565
##  Penalized robust MR-Egger    0.254     0.334  -0.402 0.909   0.448
##                (intercept)    0.014     0.025  -0.034 0.063   0.565

id.exposure id.outcome exposure outcome snp_r2.exposure snp_r2.outcome correct_causal_direction steiger_pval
DCkPcM E0lAMo exposure outcome 0.0022022 1.5e-05 TRUE 0
## $r2_exp
## [1] 0
## 
## $r2_out
## [1] 0.25
## 
## $r2_exp_adj
## [1] 0
## 
## $r2_out_adj
## [1] 0.25
## 
## $correct_causal_direction
## [1] FALSE
## 
## $steiger_test
## [1] 0
## 
## $correct_causal_direction_adj
## [1] FALSE
## 
## $steiger_test_adj
## [1] 0
## 
## $vz
## [1] NaN
## 
## $vz0
## [1] 0
## 
## $vz1
## [1] NaN
## 
## $sensitivity_ratio
## [1] NaN
## 
## $sensitivity_plot

Working with MRraps

## $beta.hat
## [1] 0.4952775
## 
## $beta.se
## [1] 0.2262889
## 
## $beta.p.value
## [1] 0.02861901
## 
## $naive.se
## [1] 0.2248014
## 
## $chi.sq.test
## [1] 1.887352
##   over.dispersion loss.function  beta.hat   beta.se
## 1           FALSE            l2 0.4952775 0.2262889
## 2           FALSE         huber 0.4952775 0.2321674
## 3           FALSE         tukey 0.4923685 0.2321515
## 4            TRUE            l2 0.4952774 0.2263075
## 5            TRUE         huber 0.4952775 0.2321867
## 6            TRUE         tukey 0.4923685 0.2321712
## 
## MR-Lasso method 
## 
## Number of variants : 11 
## Number of valid instruments : 11 
## Tuning parameter : 0.2300059 
## ------------------------------------------------------------------
##  Exposure Estimate Std Error 95% CI       p-value
##  exposure    0.494     0.221 0.060, 0.928   0.026
## ------------------------------------------------------------------
## 
## Constrained maximum likelihood method (MRcML) 
## Number of Variants:  11 
## Results for:  cML-MA-BIC 
## ------------------------------------------------------------------
##      Method Estimate    SE Pvalue        95% CI
##  cML-MA-BIC    0.495 0.222  0.026 [0.059,0.931]
## ------------------------------------------------------------------
## 
## Debiased inverse-variance weighted method
## (Over.dispersion:TRUE)
## 
## Number of Variants : 11 
## ------------------------------------------------------------------
##  Method Estimate Std Error 95% CI       p-value Condition
##    dIVW    0.501     0.225 0.059, 0.942   0.026   247.269
## ------------------------------------------------------------------
## 
## Mode-based method of Hartwig et al
## (weighted, delta standard errors [not assuming NOME], bandwidth factor = 1)
## 
## Number of Variants : 11 
## ------------------------------------------------------------------
##  Method Estimate Std Error  95% CI       p-value
##     MBE    0.389     0.251 -0.103, 0.881   0.122
## ------------------------------------------------------------------

Exposure 4

Introduction

Data Preparation

1- Number of total SNPs in exposure: 52,663 SNPs

2- Number of SNPs exposure with p-value < \(10^-8\) = 10,413 SNPs

3- Number of SNPs exposure after clumping = 126 SNPs

4- Number of total SNPs in outcome: 50,272 SNPs

5- Number of common variants between exposure and outcome: 118 SNPs(“rs13016665”, “rs12623702”, “rs73220544” “rs317021”, “rs10823968”, “rs12244388”, “rs7297175” and “rs369230” have been eliminated)

6- Number of SNPs after replacing proxies: 3 SNPs from NIH LDproxy database according to EUR ancestry have been selected: We replace “rs12623702”, “rs73220544” and “rs7297175” by rs1477031, rs62280815 and rs11171739 with R2 0.99, 1 and 0.98.So, 121 SNPs remained.

7- Number of SNPs after harmonization (action=2) = 115 SNPs (Removing the following SNPs for incompatible alleles: rs73220544 and Removing the following SNPs for being palindromic with intermediate allele frequencies: rs10922907, rs2401924, rs2678670, rs3769949, rs6692614)

8- Number of SNPs after removing HLA region with exploring in HLA Genes, Nomenclature = 115 SNP

9- Number of SNPs after removing those that have MAF < 0.01 = 115 SNPs

10- Checking pleiotropy by PhenoScanner:

How many SNPs have been eliminated after checking the PhenoScanner website: 115 SNPs

Checking weakness of the instruments:

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   30.05   32.95   36.98   44.16   46.43  172.80

How many SNPs have been eliminated with checking the weakness: 0 SNP

RUN an initial MR:

##   id.exposure id.outcome outcome exposure                    method nsnp
## 1      WHewoY     Tlmzdq outcome exposure                  MR Egger  115
## 2      WHewoY     Tlmzdq outcome exposure           Weighted median  115
## 3      WHewoY     Tlmzdq outcome exposure Inverse variance weighted  115
## 4      WHewoY     Tlmzdq outcome exposure               Simple mode  115
## 5      WHewoY     Tlmzdq outcome exposure             Weighted mode  115
##             b        se      pval
## 1 -0.11078683 1.5711560 0.9439100
## 2  0.32205223 0.5403863 0.5511975
## 3  0.03876063 0.3880606 0.9204373
## 4 -0.94968081 1.3879703 0.4952231
## 5 -0.47869326 1.2203833 0.6956072

##   id.exposure id.outcome outcome exposure                    method        Q
## 1      WHewoY     Tlmzdq outcome exposure                  MR Egger 131.9453
## 2      WHewoY     Tlmzdq outcome exposure Inverse variance weighted 131.9566
##   Q_df    Q_pval
## 1  113 0.1075349
## 2  114 0.1198860
##   id.exposure id.outcome outcome exposure egger_intercept         se      pval
## 1      WHewoY     Tlmzdq outcome exposure     0.001590906 0.01619172 0.9219044

Testing Outlier with PRESSO test

## $`Main MR results`
##        Exposure       MR Analysis Causal Estimate        Sd     T-stat
## 1 beta.exposure               Raw      0.03876063 0.3880606 0.09988294
## 2 beta.exposure Outlier-corrected              NA        NA         NA
##     P-value
## 1 0.9206127
## 2        NA
## 
## $`MR-PRESSO results`
## $`MR-PRESSO results`$`Global Test`
## $`MR-PRESSO results`$`Global Test`$RSSobs
## [1] 134.1068
## 
## $`MR-PRESSO results`$`Global Test`$Pvalue
## [1] 0.132
##   id.exposure id.outcome outcome exposure                    method nsnp
## 1      WHewoY     Tlmzdq outcome exposure                  MR Egger  115
## 2      WHewoY     Tlmzdq outcome exposure           Weighted median  115
## 3      WHewoY     Tlmzdq outcome exposure Inverse variance weighted  115
## 4      WHewoY     Tlmzdq outcome exposure               Simple mode  115
## 5      WHewoY     Tlmzdq outcome exposure             Weighted mode  115
##             b        se      pval
## 1 -0.11078683 1.5711560 0.9439100
## 2  0.32205223 0.5303160 0.5436627
## 3  0.03876063 0.3880606 0.9204373
## 4 -0.94968081 1.4126015 0.5027581
## 5 -0.47869326 1.2435181 0.7009911

##   id.exposure id.outcome outcome exposure                    method        Q
## 1      WHewoY     Tlmzdq outcome exposure                  MR Egger 131.9453
## 2      WHewoY     Tlmzdq outcome exposure Inverse variance weighted 131.9566
##   Q_df    Q_pval
## 1  113 0.1075349
## 2  114 0.1198860
##   id.exposure id.outcome outcome exposure egger_intercept         se      pval
## 1      WHewoY     Tlmzdq outcome exposure     0.001590906 0.01619172 0.9219044

Studentized residuals:

Radial test

## 
## Radial IVW
## 
##                    Estimate Std.Error    t value  Pr(>|t|)
## Effect (Mod.2nd) 0.03876064 0.3880606 0.09988296 0.9204372
## Iterative        0.03876064 0.3880606 0.09988296 0.9204372
## Exact (FE)       0.03981889 0.3606924 0.11039570 0.9120956
## Exact (RE)       0.03967046 0.3725559 0.10648190 0.9153872
## 
## 
## Residual standard error: 1.076 on 114 degrees of freedom
## 
## F-statistic: 0.01 on 1 and 114 DF, p-value: 0.921
## Q-Statistic for heterogeneity: 131.9563 on 114 DF , p-value: 0.1198897
## 
##  No significant outliers 
## Number of iterations = 2
## [1] "No significant outliers"

Cook’s distance

In statistics, Cook’s distance or Cook’s D is a commonly used estimate of the influence of a data point when performing a least-squares regression analysis.[1] In a practical ordinary least squares analysis, Cook’s distance can be used in several ways:

1- To indicate influential data points that are particularly worth checking for validity.

2- To indicate regions of the design space where it would be good to be able to obtain more data points.

It is named after the American statistician R. Dennis Cook, who introduced the concept in 1977.

Refernce

##         10         23         24         44         58         65         69 
## 0.03980598 0.03034967 0.04876298 0.10730566 0.03610658 0.03062523 0.02538265 
##         77         97        107 
## 0.03869120 0.07938431 0.02884155

Run After deleting new outlier: Final Results:

##   id.exposure id.outcome outcome exposure                    method nsnp
## 1      WHewoY     Tlmzdq outcome exposure                  MR Egger   95
## 2      WHewoY     Tlmzdq outcome exposure           Weighted median   95
## 3      WHewoY     Tlmzdq outcome exposure Inverse variance weighted   95
## 4      WHewoY     Tlmzdq outcome exposure               Simple mode   95
## 5      WHewoY     Tlmzdq outcome exposure             Weighted mode   95
##            b        se        pval
## 1 0.04590925 1.7929416 0.979626824
## 2 1.15306148 0.5689894 0.042712774
## 3 1.23129422 0.3999407 0.002079116
## 4 0.79865026 1.4012922 0.570079143
## 5 1.03890826 1.2390246 0.403883377

##   id.exposure id.outcome outcome exposure                    method        Q
## 1      WHewoY     Tlmzdq outcome exposure                  MR Egger 62.22993
## 2      WHewoY     Tlmzdq outcome exposure Inverse variance weighted 62.68993
##   Q_df    Q_pval
## 1   93 0.9940896
## 2   94 0.9946374
##   id.exposure id.outcome outcome exposure egger_intercept         se      pval
## 1      WHewoY     Tlmzdq outcome exposure      0.01236733 0.01823475 0.4993109

Sensitivity analyses with MendelianRandomization Package

## 
## Inverse-variance weighted method
## (variants uncorrelated, random-effect model)
## 
## Number of Variants : 95 
## 
## ------------------------------------------------------------------
##  Method Estimate Std Error 95% CI       p-value
##     IVW    1.231     0.400 0.447, 2.015   0.002
## ------------------------------------------------------------------
## Residual standard error =  0.817 
## Residual standard error is set to 1 in calculation of confidence interval when its estimate is less than 1.
## Heterogeneity test statistic (Cochran's Q) = 62.6899 on 94 degrees of freedom, (p-value = 0.9946). I^2 = 0.0%.
##                     Method Estimate Std Error 95% CI        P-value
##              Simple median    1.181     0.554   0.095 2.267   0.033
##            Weighted median    1.176     0.553   0.092 2.260   0.033
##  Penalized weighted median    1.169     0.553   0.084 2.254   0.035
##                                                                    
##                        IVW    1.231     0.400   0.447 2.015   0.002
##              Penalized IVW    1.231     0.400   0.447 2.015   0.002
##                 Robust IVW    1.168     0.359   0.465 1.872   0.001
##       Penalized robust IVW    1.168     0.359   0.465 1.872   0.001
##                                                                    
##                   MR-Egger    0.046     1.793  -3.468 3.560   0.980
##                (intercept)    0.012     0.018  -0.023 0.048   0.498
##         Penalized MR-Egger    0.046     1.793  -3.468 3.560   0.980
##                (intercept)    0.012     0.018  -0.023 0.048   0.498
##            Robust MR-Egger    0.126     1.178  -2.184 2.435   0.915
##                (intercept)    0.011     0.013  -0.015 0.037   0.411
##  Penalized robust MR-Egger    0.126     1.178  -2.184 2.435   0.915
##                (intercept)    0.011     0.013  -0.015 0.037   0.411

id.exposure id.outcome exposure outcome snp_r2.exposure snp_r2.outcome correct_causal_direction steiger_pval
WHewoY Tlmzdq exposure outcome 0.0769009 0.0014356 TRUE 0
## $r2_exp
## [1] 0
## 
## $r2_out
## [1] 0.25
## 
## $r2_exp_adj
## [1] 0
## 
## $r2_out_adj
## [1] 0.25
## 
## $correct_causal_direction
## [1] FALSE
## 
## $steiger_test
## [1] 0
## 
## $correct_causal_direction_adj
## [1] FALSE
## 
## $steiger_test_adj
## [1] 0
## 
## $vz
## [1] NaN
## 
## $vz0
## [1] 0
## 
## $vz1
## [1] NaN
## 
## $sensitivity_ratio
## [1] NaN
## 
## $sensitivity_plot

Working with MRraps

## $beta.hat
## [1] 1.250703
## 
## $beta.se
## [1] 0.4117848
## 
## $beta.p.value
## [1] 0.002387288
## 
## $naive.se
## [1] 0.4068934
## 
## $chi.sq.test
## [1] 62.54091
##   over.dispersion loss.function beta.hat   beta.se
## 1           FALSE            l2 1.250703 0.4117848
## 2           FALSE         huber 1.207223 0.4224118
## 3           FALSE         tukey 1.199051 0.4223997
## 4            TRUE            l2 1.250703 0.4118096
## 5            TRUE         huber 1.207223 0.4224358
## 6            TRUE         tukey 1.199051 0.4224242
## 
## MR-Lasso method 
## 
## Number of variants : 95 
## Number of valid instruments : 95 
## Tuning parameter : 0.2253352 
## ------------------------------------------------------------------
##  Exposure Estimate Std Error 95% CI       p-value
##  exposure    1.231     0.400 0.447, 2.015   0.002
## ------------------------------------------------------------------
## 
## Constrained maximum likelihood method (MRcML) 
## Number of Variants:  95 
## Results for:  cML-MA-BIC 
## ------------------------------------------------------------------
##      Method Estimate    SE Pvalue        95% CI
##  cML-MA-BIC    1.249 0.404  0.002 [0.458,2.041]
## ------------------------------------------------------------------
## 
## Debiased inverse-variance weighted method
## (Over.dispersion:TRUE)
## 
## Number of Variants : 95 
## ------------------------------------------------------------------
##  Method Estimate Std Error 95% CI       p-value Condition
##    dIVW    1.261     0.410 0.457, 2.065   0.002   405.910
## ------------------------------------------------------------------
## 
## Mode-based method of Hartwig et al
## (weighted, delta standard errors [not assuming NOME], bandwidth factor = 1)
## 
## Number of Variants : 95 
## ------------------------------------------------------------------
##  Method Estimate Std Error  95% CI       p-value
##     MBE    1.039     1.195 -1.302, 3.380   0.384
## ------------------------------------------------------------------