RDDtools works in an object-oriented way: the user has to define once the characteristic of the data, creating a rdd_data object, on which different anaylsis tools can be applied.
Load the package, and load the built-in dataset from [Lee 2008]:
library(rddtools)
data(house)
Declare the data to be a rdd_data object:
<- rdd_data(y=house$y, x=house$x, cutpoint=0) house_rdd
You can now directly summarise and visualise this data:
summary(house_rdd)
#> ### rdd_data object ###
#>
#> Cutpoint: 0
#> Type: Sharp
#> Sample size:
#> -Full : 6558
#> -Left : 2740
#> -Right: 3818
#> Covariates: no
plot(house_rdd)
Estimate parametrically, by fitting a 4th order polynomial.
<- rdd_reg_lm(rdd_object=house_rdd, order=4)
reg_para
reg_para#> ### RDD regression: parametric ###
#> Polynomial order: 4
#> Slopes: separate
#> Number of obs: 6558 (left: 2740, right: 3818)
#>
#> Coefficient:
#> Estimate Std. Error t value Pr(>|t|)
#> D 0.076590 0.013239 5.7851 7.582e-09 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
plot(reg_para)
#> [1] "Mass points detected in the running variable."
Run a simple local regression, using the [Imbens and Kalyanaraman 2012] bandwidth.
<- rdd_bw_ik(house_rdd)
bw_ik <- rdd_reg_np(rdd_object=house_rdd, bw=bw_ik)
reg_nonpara print(reg_nonpara)
#> ### RDD regression: nonparametric local linear###
#> Bandwidth: 0.2938561
#> Number of obs: 3200 (left: 1594, right: 1606)
#>
#> Coefficient:
#> Estimate Std. Error z value Pr(>|z|)
#> D 0.079924 0.009465 8.4443 < 2.2e-16 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
One can easily check the sensitivity of the estimate to different bandwidths:
plotSensi(reg_nonpara, from=0.05, to=1, by=0.1)
Or run the Placebo test, estimating the RDD effect based on fake cutpoints:
plotPlacebo(reg_nonpara)
Design sensitivity tests check whether the discontinuity found can actually be attributed ot other causes. Two types of tests are available:
use simply the function dens_test(), on either the raw data, or the regression output:
dens_test(reg_nonpara)
#>
#> McCrary Test for no discontinuity of density around cutpoint
#>
#> data: reg_nonpara
#> z-val = 1.2952, p-value = 0.1952
#> alternative hypothesis: Density is discontinuous around cutpoint
#> sample estimates:
#> Discontinuity
#> 0.1035008
Two tests available: + equal means of covariates: covarTest_mean() + equal density of covariates: covarTest_dens()
We need here to simulate some data, given that the Lee (2008) dataset contains no covariates. We here simulate three variables, with the second having a different mean on the left and the right.
set.seed(123)
<- nrow(house)
n_Lee <- data.frame(z1 = rnorm(n_Lee, sd=2),
Z z2 = rnorm(n_Lee, mean = ifelse(house<0, 5, 8)),
z3 = sample(letters, size = n_Lee, replace = TRUE))
<- rdd_data(y = house$y, x = house$x, covar = Z, cutpoint = 0) house_rdd_Z
Tests correctly reject equality of the second, and correctly do not reject equality for the first and third.