A Small But Not Trivial Artificial Sample Data Set

A small dataset mimicking sample data selected with two-stage, stratified cluster sampling without replacement. It can be used to run the R code in the ‘Examples’ sections of the ReGenesees package help pages.

data(fpcdat)

Format

A data frame with 28 observations on the following 13 variables.

psu: Identifier of the primary sampling units, numeric
ssu: Identifier of the second stage sampling units, numeric
stratum: Stratification variable, a factor with 5 levels: S.1, S.2, S.3, S.4, S.5
sr: Strata type, integer with values 0 (NSR strata) and 1 (SR strata)
fpc1: First stage finite population corrections, given as population sizes (in terms of psu clusters) inside strata, numeric
fpc2: Second stage finite population corrections, given as population sizes (in terms of ssu clusters) inside the corresponding sampled psu, numeric
x: A numeric variable
y: A numeric variable
dom1: A variable defining unplanned estimation domains, factor with 3 levels: A, B, C
dom2: A variable defining unplanned estimation domains, factor with 6 levels: a, b, c, d, e, f
w: Direct weights, numeric
z: A numeric variable
pl.domain: A variable defining planned estimation domains, factor with 3 levels: pd.1, pd.2, pd.3

Details

Though very small, the fpcdat dataset packs in many interesting features. The sampling design is complex, with both self-representing (SR) and non-self-representing (NSR) strata. Sampling fractions are deliberately non-negligible, in order to stress the effects of finite population corrections on variance estimation. Moreover, because the observations are so few, computations on the fpcdat dataset make it easy to check and understand the effects of setting or changing the global variance estimation options of the ReGenesees package (see e.g. ReGenesees.options).
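To make the point about non-negligible sampling fractions concrete, the following base R sketch rebuilds the design columns of the first seven rows of fpcdat (values copied from the str() output in the Examples section) and derives the sampling fractions they imply. It rests on two assumptions that the source does not state: that these seven rows are the whole of stratum S.1, and that units are selected with equal probability at each stage. The names s1, n1, n2, f1, f2 and w_implied are illustrative only and are not part of ReGenesees.

```r
# Design columns of the first seven rows of fpcdat, copied from the
# str() output in the Examples section (assumed to be all of stratum S.1).
s1 <- data.frame(
  psu  = c(1, 1, 1, 2, 2, 3, 3),
  ssu  = c(0, 1, 1, 2, 3, 4, 5),
  fpc1 = rep(20, 7),               # PSUs in the stratum population
  fpc2 = c(4, 4, 4, 2, 2, 3, 3)    # SSUs in the population of each sampled PSU
)

# Number of distinct PSUs sampled in the stratum
n1 <- length(unique(s1$psu))

# Number of distinct SSUs sampled within each PSU, expanded back to row level
n2_by_psu <- tapply(s1$ssu, s1$psu, function(v) length(unique(v)))
n2 <- as.numeric(n2_by_psu[as.character(s1$psu)])

# Stage-wise sampling fractions (assumption: equal-probability selection)
f1 <- n1 / s1$fpc1
f2 <- n2 / s1$fpc2

# Direct weights implied by the two fractions; for the first six rows these
# can be compared with column w in the head(fpcdat) output
w_implied <- 1 / (f1 * f2)

print(unique(f1))      # first-stage sampling fraction in the stratum
print(w_implied)
```

Under these assumptions the first-stage sampling fraction is 3/20 = 0.15, and the implied weights agree with the w values printed by head(fpcdat) for rows 1 to 6, so ignoring the finite population correction in this stratum would overstate the first-stage variance contribution noticeably.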

Examples

data(fpcdat)
head(fpcdat)
#>   psu ssu stratum sr fpc1 fpc2  x    y dom1 dom2         w         z pl.domain
#> 1   1   0     S.1  0   20    4 10 9.21    B    a 13.333333 122.39639      pd.1
#> 2   1   1     S.1  0   20    4  3 6.77    A    a 13.333333 120.71089      pd.1
#> 3   1   1     S.1  0   20    4  4 4.68    B    c 13.333333  95.96800      pd.1
#> 4   2   2     S.1  0   20    2  9 8.92    C    a  6.666667  88.26737      pd.1
#> 5   2   3     S.1  0   20    2  3 7.76    A    d  6.666667 113.77454      pd.1
#> 6   3   4     S.1  0   20    3  8 8.14    A    b 10.000000  92.73225      pd.1
str(fpcdat)
#> 'data.frame':	28 obs. of  13 variables:
#>  $ psu      : int  1 1 1 2 2 3 3 4 4 4 ...
#>  $ ssu      : int  0 1 1 2 3 4 5 6 6 6 ...
#>  $ stratum  : Factor w/ 5 levels "S.1","S.2","S.3",..: 1 1 1 1 1 1 1 2 2 2 ...
#>  $ sr       : int  0 0 0 0 0 0 0 0 0 0 ...
#>  $ fpc1     : int  20 20 20 20 20 20 20 12 12 12 ...
#>  $ fpc2     : int  4 4 4 2 2 3 3 2 2 2 ...
#>  $ x        : int  10 3 4 9 3 8 5 0 6 5 ...
#>  $ y        : num  9.21 6.77 4.68 8.92 7.76 8.14 0.47 0.49 1.16 4.01 ...
#>  $ dom1     : Factor w/ 3 levels "A","B","C": 2 1 2 3 1 1 2 2 1 1 ...
#>  $ dom2     : Factor w/ 6 levels "a","b","c","d",..: 1 1 3 1 4 2 2 5 3 2 ...
#>  $ w        : num  13.33 13.33 13.33 6.67 6.67 ...
#>  $ z        : num  122.4 120.7 96 88.3 113.8 ...
#>  $ pl.domain: Factor w/ 3 levels "pd.1","pd.2",..: 1 1 1 1 1 1 1 2 2 2 ...
