A small dataset mimicking sample data selected with two-stage, stratified cluster sampling without replacement. It is used to run the R code in the ‘Examples’ sections of the ReGenesees package help pages.

data(fpcdat)

Format

A data frame with 28 observations on the following 13 variables.

psu

Identifier of the primary sampling units, numeric

ssu

Identifier of the second stage sampling units, numeric

stratum

Stratification variable, a factor with 5 levels: S.1, S.2, S.3, S.4, S.5

sr

Strata type, integer with values 0 (NSR strata) and 1 (SR strata)

fpc1

First stage finite population corrections, given as population sizes (in terms of psu clusters) inside strata, numeric

fpc2

Second stage finite population corrections, given as population sizes (in terms of ssu clusters) inside the corresponding sampled psu, numeric

x

A numeric variable

y

A numeric variable

dom1

A variable defining unplanned estimation domains, factor with 3 levels: A, B, C

dom2

A variable defining unplanned estimation domains, factor with 6 levels: a, b, c, d, e, f

w

Direct weights, numeric

z

A numeric variable

pl.domain

A variable defining planned estimation domains, factor with 3 levels: pd.1, pd.2, pd.3

Details

Though very small, the fpcdat dataset packs in many interesting features. The sampling design is complex, with both self-representing (SR) and not-self-representing (NSR) strata. Sampling fractions are deliberately not negligible, in order to stress the effects of finite population corrections on variance estimation. Moreover, because the observations are so few, computations on the fpcdat dataset make it easy to check and understand the effects of setting or changing the global variance estimation options of the ReGenesees package (see e.g. ReGenesees.options).
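To make the "not negligible" sampling fractions concrete, the following base R sketch recomputes them for stratum S.1 and compares the implied design weights with the w column. The seven rows are transcribed from the head() and str() output printed on this page. The sketch assumes that these rows are the whole sample in S.1 and that w is the inverse of the product of the two stage-wise sampling fractions; neither assumption is stated by the package documentation. The names s1, n1, n2, f1, f2 and w_check are illustrative only.

```r
# Rows 1-7 of fpcdat (stratum S.1), transcribed from the printed output.
# Illustrative reconstruction, not the full dataset.
s1 <- data.frame(
  psu  = c(1, 1, 1, 2, 2, 3, 3),
  ssu  = c(0, 1, 1, 2, 3, 4, 5),
  fpc1 = rep(20, 7),               # psus in the population of the stratum
  fpc2 = c(4, 4, 4, 2, 2, 3, 3)    # ssus in the population of each sampled psu
)

# Sampled psus in the stratum (assumption: all of them appear in these rows)
n1 <- length(unique(s1$psu))

# Sampled ssus within each psu, expanded back to one value per row
n2_by_psu <- tapply(s1$ssu, s1$psu, function(v) length(unique(v)))
n2 <- as.numeric(n2_by_psu[as.character(s1$psu)])

# Stage-wise sampling fractions
f1 <- n1 / s1$fpc1
f2 <- n2 / s1$fpc2

# Design weights implied by the two sampling fractions (assumed formula)
w_check <- 1 / (f1 * f2)

# Values of w printed by head(fpcdat) for rows 1-6
w_printed <- c(13.333333, 13.333333, 13.333333, 6.666667, 6.666667, 10.000000)
all.equal(round(w_check[1:6], 6), w_printed)
```

Under these assumptions the first-stage sampling fraction in S.1 is 3/20 = 0.15, and the second-stage fractions range from 1/2 to 1, so ignoring the finite population corrections would noticeably overstate the variance.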

See also

ReGenesees.options for setting/changing variance estimation options.

Examples

data(fpcdat)
head(fpcdat)
#>   psu ssu stratum sr fpc1 fpc2  x    y dom1 dom2         w         z pl.domain
#> 1   1   0     S.1  0   20    4 10 9.21    B    a 13.333333 122.39639      pd.1
#> 2   1   1     S.1  0   20    4  3 6.77    A    a 13.333333 120.71089      pd.1
#> 3   1   1     S.1  0   20    4  4 4.68    B    c 13.333333  95.96800      pd.1
#> 4   2   2     S.1  0   20    2  9 8.92    C    a  6.666667  88.26737      pd.1
#> 5   2   3     S.1  0   20    2  3 7.76    A    d  6.666667 113.77454      pd.1
#> 6   3   4     S.1  0   20    3  8 8.14    A    b 10.000000  92.73225      pd.1
str(fpcdat)
#> 'data.frame': 28 obs. of 13 variables:
#>  $ psu      : int 1 1 1 2 2 3 3 4 4 4 ...
#>  $ ssu      : int 0 1 1 2 3 4 5 6 6 6 ...
#>  $ stratum  : Factor w/ 5 levels "S.1","S.2","S.3",..: 1 1 1 1 1 1 1 2 2 2 ...
#>  $ sr       : int 0 0 0 0 0 0 0 0 0 0 ...
#>  $ fpc1     : int 20 20 20 20 20 20 20 12 12 12 ...
#>  $ fpc2     : int 4 4 4 2 2 3 3 2 2 2 ...
#>  $ x        : int 10 3 4 9 3 8 5 0 6 5 ...
#>  $ y        : num 9.21 6.77 4.68 8.92 7.76 8.14 0.47 0.49 1.16 4.01 ...
#>  $ dom1     : Factor w/ 3 levels "A","B","C": 2 1 2 3 1 1 2 2 1 1 ...
#>  $ dom2     : Factor w/ 6 levels "a","b","c","d",..: 1 1 3 1 4 2 2 5 3 2 ...
#>  $ w        : num 13.33 13.33 13.33 6.67 6.67 ...
#>  $ z        : num 122.4 120.7 96 88.3 113.8 ...
#>  $ pl.domain: Factor w/ 3 levels "pd.1","pd.2",..: 1 1 1 1 1 1 1 2 2 2 ...