Two small, partially overlapping datasets, built to mimic non-independent samples selected with a one-stage, stratified, element sampling design. They allow you to run the R code contained in the ‘Examples’ section of the ReGenesees function svyDelta.

data(Delta.el)

Format

Two data frames, s1 and s2, with 20 observations each and the following 5 variables.

  • For both samples s1 and s2:

id

Identifier of sample units, numeric

strata

Stratification variable, a factor with 2 levels: A and B

w

Sampling weights, numeric

y

A numeric variable

x

A numeric variable, correlated with y

Details

The two samples, s1 and s2, have 8 units in common, resulting in an overlap rate of 8 / 20 = 0.4. One could think of them as, e.g., two consecutive waves of a rotating panel with a 40% overlap.
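
The overlap arithmetic can be checked directly from the unit identifiers. The following is a minimal sketch, assuming the id values are typed in by hand from the listings in the ‘Examples’ section so that it runs without loading ReGenesees; id1 and id2 are illustrative names, not objects shipped with the package.

```r
# Unit identifiers copied from the printed s1 and s2 (illustrative names,
# entered by hand so the sketch does not depend on ReGenesees)
id1 <- 1:20
id2 <- c(21, 2, 23, 4, 25, 6, 27, 8, 29, 30,
         31, 12, 33, 14, 35, 16, 37, 18, 39, 40)

# Units observed in both samples
common <- intersect(id1, id2)
n_common <- length(common)

# Overlap rate relative to the (equal) size of the two samples
overlap_rate <- n_common / length(id1)

print(common)
print(overlap_rate)
```

With the datasets loaded, the same quantity should be obtainable as length(intersect(s1$id, s2$id)) / nrow(s1).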

Common units are unambiguously identified by variable id.

The stratification is static: (1) s1 and s2 use the same strata (i.e. levels A and B), and (2) no common unit changed stratum from s1 to s2.
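
Both conditions can be tested programmatically. The sketch below is one possible check in base R, assuming only the id and strata columns, copied by hand from the listings in the ‘Examples’ section; d1 and d2 are hypothetical stand-ins for s1 and s2, not package objects.

```r
# Identifiers and strata copied from the printed s1 and s2
# (d1 and d2 are illustrative stand-ins holding just two columns)
d1 <- data.frame(id = 1:20,
                 strata = rep(c("A", "B"), each = 10))
d2 <- data.frame(id = c(21, 2, 23, 4, 25, 6, 27, 8, 29, 30,
                        31, 12, 33, 14, 35, 16, 37, 18, 39, 40),
                 strata = rep(c("A", "B"), each = 10))

# Condition (1): both samples use the same set of strata
same_strata <- setequal(unique(d1$strata), unique(d2$strata))

# Condition (2): no common unit changed stratum between the samples
both <- merge(d1, d2, by = "id", suffixes = c("1", "2"))
no_changers <- all(both$strata1 == both$strata2)

print(c(same_strata = same_strata, no_changers = no_changers))
```

A dynamic stratification would make at least one of the two flags FALSE.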

The ‘Examples’ section of svyDelta illustrates the effect of dynamic stratification by injecting new strata and stratum-changer units into the samples.

See also

svyDelta for calculating estimates and sampling errors of Measures of Change from two not necessarily independent samples, and Delta.clus for two artificial overlapping samples of clusters.

Examples

data(Delta.el)

# Have a look:
s1
#>    id strata        w    y        x
#> 1   1      A 18.55982  0.5 1.162374
#> 2   2      A 21.06586  1.0 1.265054
#> 3   3      A 21.93166  1.5 2.030517
#> 4   4      A 19.98340  2.0 1.281605
#> 5   5      A 18.51504  2.5 2.364694
#> 6   6      A 20.28786  3.0 2.734145
#> 7   7      A 19.47544  3.5 3.179758
#> 8   8      A 20.08048  4.0 3.302226
#> 9   9      A 19.35298  4.5 2.565109
#> 10 10      A 21.64282  5.0 2.227502
#> 11 11      B 30.33911  5.5 2.710793
#> 12 12      B 29.99442  6.0 5.148122
#> 13 13      B 31.15830  6.5 4.952516
#> 14 14      B 30.75910  7.0 3.942981
#> 15 15      B 29.87766  7.5 4.827640
#> 16 16      B 29.97200  8.0 3.757939
#> 17 17      B 29.66927  8.5 3.687931
#> 18 18      B 30.54008  9.0 4.980502
#> 19 19      B 30.21610  9.5 7.083637
#> 20 20      B 30.05322 10.0 5.710445
s2
#>    id strata        w         y        x
#> 1  21      A 20.78341  2.512918 2.696096
#> 2   2      A 21.06586  3.308235 3.087006
#> 3  23      A 18.76287  5.003790 3.670774
#> 4   4      A 19.98340  2.977190 2.171311
#> 5  25      A 18.07945  4.510136 1.465740
#> 6   6      A 20.28786  3.165896 1.143721
#> 7  27      A 19.98976  3.504128 2.627141
#> 8   8      A 20.08048  4.540171 1.879891
#> 9  29      A 20.28916  4.985899 2.365247
#> 10 30      A 20.57337  6.370713 3.869849
#> 11 31      B 30.46191  6.681330 2.236143
#> 12 12      B 29.99442  7.067339 3.314140
#> 13 33      B 29.54631  6.551161 4.297401
#> 14 14      B 30.75910  8.740280 5.599236
#> 15 35      B 31.98209  8.118962 4.011871
#> 16 16      B 29.97200  8.900935 4.677009
#> 17 37      B 29.38564  9.446146 5.044527
#> 18 18      B 30.54008  9.596294 2.891738
#> 19 39      B 30.13084  9.444555 2.989280
#> 20 40      B 29.54043 10.408366 5.223684

# Have a look at the overlap subsample of 8 units:
sc <- merge(s1, s2, by = "id", suffixes = c("1", "2"))
sc
#>   id strata1       w1 y1       x1 strata2       w2       y2       x2
#> 1  2       A 21.06586  1 1.265054       A 21.06586 3.308235 3.087006
#> 2  4       A 19.98340  2 1.281605       A 19.98340 2.977190 2.171311
#> 3  6       A 20.28786  3 2.734145       A 20.28786 3.165896 1.143721
#> 4  8       A 20.08048  4 3.302226       A 20.08048 4.540171 1.879891
#> 5 12       B 29.99442  6 5.148122       B 29.99442 7.067339 3.314140
#> 6 14       B 30.75910  7 3.942981       B 30.75910 8.740280 5.599236
#> 7 16       B 29.97200  8 3.757939       B 29.97200 8.900935 4.677009
#> 8 18       B 30.54008  9 4.980502       B 30.54008 9.596294 2.891738

# Have a look at the full rotation structure (40% overlap in each stratum):
s <- merge(s1, s2, by = "id", all = TRUE, suffixes = c("1", "2"))
s <- s[order(s$strata1, s$strata2), ]
s
#>    id strata1       w1   y1       x1 strata2       w2        y2       x2
#> 2   2       A 21.06586  1.0 1.265054       A 21.06586  3.308235 3.087006
#> 4   4       A 19.98340  2.0 1.281605       A 19.98340  2.977190 2.171311
#> 6   6       A 20.28786  3.0 2.734145       A 20.28786  3.165896 1.143721
#> 8   8       A 20.08048  4.0 3.302226       A 20.08048  4.540171 1.879891
#> 1   1       A 18.55982  0.5 1.162374    <NA>       NA        NA       NA
#> 3   3       A 21.93166  1.5 2.030517    <NA>       NA        NA       NA
#> 5   5       A 18.51504  2.5 2.364694    <NA>       NA        NA       NA
#> 7   7       A 19.47544  3.5 3.179758    <NA>       NA        NA       NA
#> 9   9       A 19.35298  4.5 2.565109    <NA>       NA        NA       NA
#> 10 10       A 21.64282  5.0 2.227502    <NA>       NA        NA       NA
#> 12 12       B 29.99442  6.0 5.148122       B 29.99442  7.067339 3.314140
#> 14 14       B 30.75910  7.0 3.942981       B 30.75910  8.740280 5.599236
#> 16 16       B 29.97200  8.0 3.757939       B 29.97200  8.900935 4.677009
#> 18 18       B 30.54008  9.0 4.980502       B 30.54008  9.596294 2.891738
#> 11 11       B 30.33911  5.5 2.710793    <NA>       NA        NA       NA
#> 13 13       B 31.15830  6.5 4.952516    <NA>       NA        NA       NA
#> 15 15       B 29.87766  7.5 4.827640    <NA>       NA        NA       NA
#> 17 17       B 29.66927  8.5 3.687931    <NA>       NA        NA       NA
#> 19 19       B 30.21610  9.5 7.083637    <NA>       NA        NA       NA
#> 20 20       B 30.05322 10.0 5.710445    <NA>       NA        NA       NA
#> 21 21    <NA>       NA   NA       NA       A 20.78341  2.512918 2.696096
#> 22 23    <NA>       NA   NA       NA       A 18.76287  5.003790 3.670774
#> 23 25    <NA>       NA   NA       NA       A 18.07945  4.510136 1.465740
#> 24 27    <NA>       NA   NA       NA       A 19.98976  3.504128 2.627141
#> 25 29    <NA>       NA   NA       NA       A 20.28916  4.985899 2.365247
#> 26 30    <NA>       NA   NA       NA       A 20.57337  6.370713 3.869849
#> 27 31    <NA>       NA   NA       NA       B 30.46191  6.681330 2.236143
#> 28 33    <NA>       NA   NA       NA       B 29.54631  6.551161 4.297401
#> 29 35    <NA>       NA   NA       NA       B 31.98209  8.118962 4.011871
#> 30 37    <NA>       NA   NA       NA       B 29.38564  9.446146 5.044527
#> 31 39    <NA>       NA   NA       NA       B 30.13084  9.444555 2.989280
#> 32 40    <NA>       NA   NA       NA       B 29.54043 10.408366 5.223684

# As anticipated, strata are static:
with(s, table(strata1, strata2, useNA = "ifany"))
#>        strata2
#> strata1 A B <NA>
#>    A    4 0    6
#>    B    0 4    6
#>    <NA> 6 6    0
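
To make the notion of a Measure of Change concrete, the sketch below computes a simple point estimate: the difference between the weighted means of y in the two samples, using base R's weighted.mean. It is only an illustration under stated assumptions: the w and y values are copied by hand (rounded as printed) from the listings above, the names w1, y1, w2, y2 are hypothetical, and no sampling error is computed. Estimating the variance of such a difference has to account for the overlap between the samples, which is what svyDelta is meant for.

```r
# Weights and y values copied from the printed s1 (rounded as displayed)
w1 <- c(18.55982, 21.06586, 21.93166, 19.98340, 18.51504,
        20.28786, 19.47544, 20.08048, 19.35298, 21.64282,
        30.33911, 29.99442, 31.15830, 30.75910, 29.87766,
        29.97200, 29.66927, 30.54008, 30.21610, 30.05322)
y1 <- seq(0.5, 10, by = 0.5)

# Weights and y values copied from the printed s2 (rounded as displayed)
w2 <- c(20.78341, 21.06586, 18.76287, 19.98340, 18.07945,
        20.28786, 19.98976, 20.08048, 20.28916, 20.57337,
        30.46191, 29.99442, 29.54631, 30.75910, 31.98209,
        29.97200, 29.38564, 30.54008, 30.13084, 29.54043)
y2 <- c(2.512918, 3.308235, 5.003790, 2.977190, 4.510136,
        3.165896, 3.504128, 4.540171, 4.985899, 6.370713,
        6.681330, 7.067339, 6.551161, 8.740280, 8.118962,
        8.900935, 9.446146, 9.596294, 9.444555, 10.408366)

# Weighted mean of y in each sample
m1 <- weighted.mean(y1, w1)
m2 <- weighted.mean(y2, w2)

# Point estimate of the change in the mean of y from s1 to s2
delta <- m2 - m1

print(c(mean1 = m1, mean2 = m2, change = delta))
```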