Title: Investigating the causality between Serum 25 hydroxy vitamin D concentration on cardioembolic stroke
1- Number of total SNPs in exposure: 7,250,104 SNPs
2- Number of SNPs exposure with p-value < \(10^-8\) = 16,012 SNPs
3- Number of SNPs exposure after clumping = 150 SNPs
4- Number of total SNPs in outcome: 8,306,091 SNPs
5- Number of common variants between exposure and outcome: 112 SNPs
6- Number of SNPs after harmonization (action=2) = 110 SNPs
7- Number of SNPs after removing HLA region = 110 SNP
8- Number of SNPs after removing those that have MAF < 0.01 = 110 SNPs
How many SNPs have been eliminated with checking the phenoscanner website: 0 SNPs
data <- fread("data.txt")
data$F<-(data$beta.exposure/data$se.exposure)^2
summary(data$F)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 29.78 34.98 42.88 118.79 68.94 2567.54
How many SNPs have been eliminated with checking the weakness: 0 SNP
res<-mr(data)
res
## id.exposure id.outcome outcome exposure method nsnp
## 1 w10boa k0xTZ9 outcome exposure MR Egger 110
## 2 w10boa k0xTZ9 outcome exposure Weighted median 110
## 3 w10boa k0xTZ9 outcome exposure Inverse variance weighted 110
## 4 w10boa k0xTZ9 outcome exposure Simple mode 110
## 5 w10boa k0xTZ9 outcome exposure Weighted mode 110
## b se pval
## 1 -0.1582032 0.16218781 0.33152582
## 2 -0.1414437 0.15063818 0.34774977
## 3 -0.1221412 0.09678982 0.20697681
## 4 -0.5416048 0.30771357 0.08119709
## 5 -0.1780771 0.13741357 0.19774028
plot(data$beta.exposure,data$beta.outcome)
text(data$beta.exposure,
data$beta.outcome,
labels = data$SNP,
pos = 4)
#scatter plot
p1 <- mr_scatter_plot(res, data)
p1[[1]]
#Heterogeneity testing
mr_heterogeneity<- mr_heterogeneity(data); mr_heterogeneity
## id.exposure id.outcome outcome exposure method Q
## 1 w10boa k0xTZ9 outcome exposure MR Egger 131.6582
## 2 w10boa k0xTZ9 outcome exposure Inverse variance weighted 131.7523
## Q_df Q_pval
## 1 108 0.06058768
## 2 109 0.06816311
#pleiotropy testing
mr_pleiotropy_test<- mr_pleiotropy_test(data); mr_pleiotropy_test
## id.exposure id.outcome outcome exposure egger_intercept se pval
## 1 w10boa k0xTZ9 outcome exposure 0.001187683 0.00427599 0.7817304
#plot of single SNP MR:
res_single <- mr_singlesnp(data); p2 <- mr_forest_plot(res_single); p2[[1]]
#plot of LOO:
res_loo <- mr_leaveoneout(data); p3 <- mr_leaveoneout_plot(res_loo); p3[[1]]
#Funnel plot
p4 <- mr_funnel_plot(res_single); p4[[1]]
## $`Main MR results`
## Exposure MR Analysis Causal Estimate Sd T-stat
## 1 beta.exposure Raw -0.1221412 0.09678982 -1.261922
## 2 beta.exposure Outlier-corrected NA NA NA
## P-value
## 1 0.2096704
## 2 NA
##
## $`MR-PRESSO results`
## $`MR-PRESSO results`$`Global Test`
## $`MR-PRESSO results`$`Global Test`$RSSobs
## [1] 133.587
##
## $`MR-PRESSO results`$`Global Test`$Pvalue
## [1] 0.077
## id.exposure id.outcome outcome exposure method nsnp
## 1 w10boa k0xTZ9 outcome exposure MR Egger 110
## 2 w10boa k0xTZ9 outcome exposure Weighted median 110
## 3 w10boa k0xTZ9 outcome exposure Inverse variance weighted 110
## 4 w10boa k0xTZ9 outcome exposure Simple mode 110
## 5 w10boa k0xTZ9 outcome exposure Weighted mode 110
## b se pval
## 1 -0.1582032 0.16218781 0.33152582
## 2 -0.1414437 0.13879759 0.30817241
## 3 -0.1221412 0.09678982 0.20697681
## 4 -0.5416048 0.30870173 0.08216196
## 5 -0.1780771 0.13890941 0.20257478
## id.exposure id.outcome outcome exposure method Q
## 1 w10boa k0xTZ9 outcome exposure MR Egger 131.6582
## 2 w10boa k0xTZ9 outcome exposure Inverse variance weighted 131.7523
## Q_df Q_pval
## 1 108 0.06058768
## 2 109 0.06816311
## id.exposure id.outcome outcome exposure egger_intercept se pval
## 1 w10boa k0xTZ9 outcome exposure 0.001187683 0.00427599 0.7817304
##
## Radial IVW
##
## Estimate Std.Error t value Pr(>|t|)
## Effect (Mod.2nd) -0.1221414 0.09678978 -1.261925 0.2069759
## Iterative -0.1221414 0.09678978 -1.261925 0.2069759
## Exact (FE) -0.1234126 0.08804344 -1.401724 0.1609976
## Exact (RE) -0.1231890 0.08810024 -1.398282 0.1648669
##
##
## Residual standard error: 1.099 on 109 degrees of freedom
##
## F-statistic: 1.59 on 1 and 109 DF, p-value: 0.21
## Q-Statistic for heterogeneity: 131.7325 on 109 DF , p-value: 0.06831866
##
## Outliers detected
## Number of iterations = 2
## SNP Q_statistic p.value
## 1 rs532436 25.43846 4.567209e-07
library(data.table)
data<-fread("data.txt")
reg_1<-lm(data$beta.outcome~data$beta.exposure-1)
data$st_1<-rstandard(reg_1)
#Histogram plot
hist(data$st_1)
How many SNPs have been eliminated with checking the weakness: 0 SNP
datacc_1 <- fread("datacc_1.txt")
res_cc_1<-mr(datacc_1)
res_cc_1
## id.exposure id.outcome outcome exposure method nsnp
## 1 w10boa k0xTZ9 outcome exposure MR Egger 103
## 2 w10boa k0xTZ9 outcome exposure Weighted median 103
## 3 w10boa k0xTZ9 outcome exposure Inverse variance weighted 103
## 4 w10boa k0xTZ9 outcome exposure Simple mode 103
## 5 w10boa k0xTZ9 outcome exposure Weighted mode 103
## b se pval
## 1 -0.1576632 0.14756508 0.28787278
## 2 -0.1448570 0.14499155 0.31775972
## 3 -0.1665251 0.08908217 0.06157511
## 4 -0.5255756 0.30254676 0.08537554
## 5 -0.1697537 0.13983600 0.22757085
mr_heterogeneity_datacc_1<- mr_heterogeneity(datacc_1); mr_heterogeneity_datacc_1
## id.exposure id.outcome outcome exposure method Q
## 1 w10boa k0xTZ9 outcome exposure MR Egger 89.19564
## 2 w10boa k0xTZ9 outcome exposure Inverse variance weighted 89.20132
## Q_df Q_pval
## 1 101 0.7932955
## 2 102 0.8130977
mr_pleiotropy_test_datacc_1<- mr_pleiotropy_test(datacc_1); mr_pleiotropy_test_datacc_1
## id.exposure id.outcome outcome exposure egger_intercept se pval
## 1 w10boa k0xTZ9 outcome exposure -0.0002928949 0.00388821 0.940102
In statistics, Cook’s distance or Cook’s D is a commonly used estimate of the influence of a data point when performing a least-squares regression analysis.[1] In a practical ordinary least squares analysis, Cook’s distance can be used in several ways:
1- To indicate influential data points that are particularly worth checking for validity.
2- To indicate regions of the design space where it would be good to be able to obtain more data points.
It is named after the American statistician R. Dennis Cook, who introduced the concept in 1977.
library(data.table)
library(TwoSampleMR)
datacc_1 <- fread("datacc_1.txt")
#Residuals vs fitted:
plot(lm(datacc_1$beta.outcome~datacc_1$beta.exposure-1), which=1)
#Q-Q Residuals:
plot(lm(datacc_1$beta.outcome~datacc_1$beta.exposure-1), which=2)
#Standardized Residuals:
plot(lm(datacc_1$beta.outcome~datacc_1$beta.exposure-1), which=3)
#Cook distance:
plot(lm(datacc_1$beta.outcome~datacc_1$beta.exposure-1), which=4)
dataRC <- fread("dataRC.txt")
res_RC<-mr(dataRC)
res_RC
## id.exposure id.outcome outcome exposure method nsnp
## 1 w10boa k0xTZ9 outcome exposure MR Egger 99
## 2 w10boa k0xTZ9 outcome exposure Weighted median 99
## 3 w10boa k0xTZ9 outcome exposure Inverse variance weighted 99
## 4 w10boa k0xTZ9 outcome exposure Simple mode 99
## 5 w10boa k0xTZ9 outcome exposure Weighted mode 99
## b se pval
## 1 -0.2334039 0.16146571 0.15153152
## 2 -0.1765377 0.14937491 0.23726791
## 3 -0.1989857 0.09549837 0.03719151
## 4 -0.5271198 0.32539786 0.10846221
## 5 -0.2393457 0.14040596 0.09142457
mr_heterogeneity_dataRC<- mr_heterogeneity(dataRC); mr_heterogeneity_dataRC
## id.exposure id.outcome outcome exposure method Q
## 1 w10boa k0xTZ9 outcome exposure MR Egger 78.64052
## 2 w10boa k0xTZ9 outcome exposure Inverse variance weighted 78.71040
## Q_df Q_pval
## 1 97 0.9135906
## 2 98 0.9239006
mr_pleiotropy_test_dataRC<- mr_pleiotropy_test(dataRC); mr_pleiotropy_test_dataRC
## id.exposure id.outcome outcome exposure egger_intercept se pval
## 1 w10boa k0xTZ9 outcome exposure 0.001064541 0.004026936 0.7920672
##
## Inverse-variance weighted method
## (variants uncorrelated, random-effect model)
##
## Number of Variants : 99
##
## ------------------------------------------------------------------
## Method Estimate Std Error 95% CI p-value
## IVW -0.199 0.095 -0.386, -0.012 0.037
## ------------------------------------------------------------------
## Residual standard error = 0.896
## Residual standard error is set to 1 in calculation of confidence interval when its estimate is less than 1.
## Heterogeneity test statistic (Cochran's Q) = 78.7104 on 98 degrees of freedom, (p-value = 0.9239). I^2 = 0.0%.
## Method Estimate Std Error 95% CI P-value
## Simple median -0.291 0.166 -0.616 0.034 0.079
## Weighted median -0.177 0.149 -0.468 0.114 0.234
## Penalized weighted median -0.179 0.148 -0.469 0.112 0.228
##
## IVW -0.199 0.095 -0.386 -0.012 0.037
## Penalized IVW -0.199 0.095 -0.386 -0.012 0.037
## Robust IVW -0.218 0.088 -0.390 -0.046 0.013
## Penalized robust IVW -0.218 0.088 -0.390 -0.046 0.013
##
## MR-Egger -0.233 0.161 -0.550 0.083 0.148
## (intercept) 0.001 0.004 -0.007 0.009 0.792
## Penalized MR-Egger -0.233 0.161 -0.550 0.083 0.148
## (intercept) 0.001 0.004 -0.007 0.009 0.792
## Robust MR-Egger -0.216 0.130 -0.470 0.039 0.097
## (intercept) 0.000 0.004 -0.008 0.008 0.988
## Penalized robust MR-Egger -0.216 0.130 -0.470 0.039 0.097
## (intercept) 0.000 0.004 -0.008 0.008 0.988
## $r2_exp
## [1] 0
##
## $r2_out
## [1] 0.25
##
## $r2_exp_adj
## [1] 0
##
## $r2_out_adj
## [1] 0.25
##
## $correct_causal_direction
## [1] FALSE
##
## $steiger_test
## [1] NA
##
## $correct_causal_direction_adj
## [1] FALSE
##
## $steiger_test_adj
## [1] NA
##
## $vz
## [1] NaN
##
## $vz0
## [1] 0
##
## $vz1
## [1] NaN
##
## $sensitivity_ratio
## [1] NaN
##
## $sensitivity_plot
## $beta.hat
## [1] -0.200448
##
## $beta.se
## [1] 0.09647138
##
## $beta.p.value
## [1] 0.03772802
##
## $naive.se
## [1] 0.09603428
##
## $chi.sq.test
## [1] 78.67845
## over.dispersion loss.function beta.hat beta.se
## 1 FALSE l2 -0.2004480 0.09647138
## 2 FALSE huber -0.2159041 0.09898388
## 3 FALSE tukey -0.2142594 0.09898336
## 4 TRUE l2 -0.2004483 0.09647265
## 5 TRUE huber -0.2159041 0.09898433
## 6 TRUE tukey -0.2142595 0.09898382
##
## MR-Lasso method
##
## Number of variants : 99
## Number of valid instruments : 99
## Tuning parameter : 0.2464102
## ------------------------------------------------------------------
## Exposure Estimate Std Error 95% CI p-value
## exposure -0.199 0.095 -0.386, -0.012 0.037
## ------------------------------------------------------------------
##
## Constrained maximum likelihood method (MRcML)
## Number of Variants: 99
## Results for: cML-MA-BIC
## ------------------------------------------------------------------
## Method Estimate SE Pvalue 95% CI
## cML-MA-BIC -0.201 0.096 0.036 [-0.389,-0.013]
## ------------------------------------------------------------------
##
## Debiased inverse-variance weighted method
## (Over.dispersion:TRUE)
##
## Number of Variants : 99
## ------------------------------------------------------------------
## Method Estimate Std Error 95% CI p-value Condition
## dIVW -0.201 0.096 -0.390, -0.012 0.037 1098.764
## ------------------------------------------------------------------
##
## Mode-based method of Hartwig et al
## (weighted, delta standard errors [not assuming NOME], bandwidth factor = 1)
##
## Number of Variants : 99
## ------------------------------------------------------------------
## Method Estimate Std Error 95% CI p-value
## MBE -0.239 0.163 -0.560, 0.081 0.143
## ------------------------------------------------------------------