Add Variables to Design Objects

Modifies an analytic object by adding new variables to it.

des.addvars(design, ...)

Arguments

design	Object of class `analytic` (or inheriting from it) containing survey data and sampling design metadata.
...	`tag = expr` arguments defining columns to be added to `design`.

Details

This function adds the new variables defined by the tag = expr arguments to the data frame contained in design. A tag can be given either as an identifier or as a character string; expr can be any expression that can meaningfully be evaluated in the design environment.

For each tag = expr argument passed through ..., the added column is named after the tag and holds the values obtained by evaluating expr on design. Any expression supplied without a tag is ignored and therefore has no effect on the return value of des.addvars.

Variables added to the input object must be new: des.addvars cannot be used to modify the values of a pre-existing design column. This is an intentional feature, meant to safeguard the integrity of the relations between survey data and sampling design metadata stored in design.
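The three rules above (tags name the new columns, untagged expressions are ignored, existing columns cannot be overwritten) can be illustrated with a small base-R sketch on a plain data frame. This is only an analogue written for illustration, not the ReGenesees implementation: the function name add_vars_sketch and the toy data are hypothetical, and the sketch assumes every expression is evaluated against the original columns only.

```r
# Hypothetical base-R analogue of the rules described above.
# NOT the ReGenesees code: add_vars_sketch is an illustrative name.
add_vars_sketch <- function(data, ...) {
  caller <- parent.frame()
  # Capture the arguments unevaluated, together with their tags
  exprs <- as.list(substitute(list(...)))[-1L]
  tags <- names(exprs)
  if (is.null(tags)) tags <- rep("", length(exprs))

  # Rule 1: expressions supplied without a tag are silently dropped
  keep <- nzchar(tags)
  exprs <- exprs[keep]
  tags <- tags[keep]

  # Rule 2: pre-existing columns must not be modified
  clash <- intersect(tags, names(data))
  if (length(clash) > 0L) {
    stop("Variable(s) already present: ", paste(clash, collapse = ", "))
  }

  # Rule 3: evaluate each expression with the data columns in scope
  # (assumption: all expressions see the original columns only)
  values <- lapply(exprs, function(e) eval(e, data, caller))
  for (i in seq_along(values)) {
    data[[tags[i]]] <- values[[i]]
  }
  data
}

# Toy data, made up for this sketch
toy <- data.frame(income = c(500, 1200, 2000), age5c = c(1, 3, 5))

# One tag given as an identifier, one as a string, one untagged expression
out <- add_vars_sketch(toy, ones = 1, "income2" = income^2, income * 10)
names(out)

# Trying to overwrite an existing column fails
res <- try(add_vars_sketch(toy, income = income / 2), silent = TRUE)
inherits(res, "try-error")
```

In this sketch the untagged income * 10 contributes no column, and the attempt to redefine income raises an error rather than altering the data.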

Value

An object of the same class as design, containing the new variables but supplied with exactly the same metadata.

References

Zardetto, D. (2015) “ReGenesees: an Advanced R System for Calibration, Estimation and Sampling Error Assessment in Complex Sample Surveys”. Journal of Official Statistics, 31(2), 177-203. doi:10.1515/jos-2015-0013.

Examples

data(data.examples)

# Creation of an analytic object:
des<-e.svydesign(data=example,ids=~towcod+famcod,strata=~SUPERSTRATUM,
     weights=~weight)

# Adding the new 'ones' variable to estimate the number
# of final units in the population: 
des<-des.addvars(des,ones=1)
svystatTM(des,~ones)
#>         Total       SE
#> ones 924101.3 17172.68

# Recoding a qualitative variable:
des<-des.addvars(des,agerange=factor(ifelse(age5c==1,
                                     "young","not-young")))
svystatTM(des,~agerange,estimator="Mean")
#>                        Mean          SE
#> agerangenot-young 0.8604824 0.006775833
#> agerangeyoung     0.1395176 0.006775833
svystatTM(des,~income,~agerange,estimator="Mean",conf.int=TRUE)
#>            agerange Mean.income SE.Mean.income CI.l(95%).Mean.income
#> not-young not-young   1303.7618       9.081213             1285.9630
#> young         young    962.6162      17.684967              927.9543
#>           CI.u(95%).Mean.income
#> not-young             1321.5607
#> young                  997.2781

# Algebraic operations on numeric variables:
des<-des.addvars(des,z2=z^2)
svystatTM(des,~z2,estimator="Mean")
#>        Mean       SE
#> z2 20623.27 356.5924

# A more interesting example: estimating the
# percentage of population with income below
# the poverty threshold (defined as 0.6 times
# the median income for the whole population):
Median.Income <- coef(svystatQ(des, ~income,probs=0.5))
Median.Income
#> income 
#>   1244 
des <- des.addvars(des,
                   status = factor(
                          ifelse(income < (0.6 * Median.Income),
                          "poor", "non-poor")
                                  )
                   )
svystatTM(des,~status,estimator="Mean")
#>                     Mean          SE
#> statusnon-poor 0.8842155 0.006131183
#> statuspoor     0.1157845 0.006131183
# Mean income for poor and non-poor:
svystatTM(des,~income,~status,estimator="Mean")
#>            status Mean.income SE.Mean.income
#> non-poor non-poor   1349.3443       7.855904
#> poor         poor    544.5881      10.161308

### NOTE: The procedure above yields *correct point estimates* of the share of
###       poor population and of their average income, while *variance
###       estimation is only approximate*, since it neglects the sampling
###       variability of the estimated poverty threshold.
