Modifies an analytic object by adding new variables to it.

des.addvars(design, ...)

Arguments

design

Object of class analytic (or inheriting from it) containing survey data and sampling design metadata.

...

tag = expr arguments defining columns to be added to design.

Details

This function adds to the data frame contained in design the new variables defined by the tag = expr arguments. A tag can be specified either as an identifier or as a character string; expr can be any expression that can be meaningfully evaluated in the design environment.
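The two ways of writing a tag are equivalent at the level of R's call syntax, which the following base R sketch illustrates. The helper f is hypothetical and only reports the argument names it receives; it assumes nothing about how des.addvars is implemented internally.

```r
# Hypothetical helper (not part of ReGenesees): returns the names (tags)
# of the arguments passed through '...'.
f <- function(...) names(list(...))

f(ones = 1)     # tag written as an identifier
f("ones" = 1)   # tag written as a character string

# Both calls bind the value 1 to the same argument name:
identical(f(ones = 1), f("ones" = 1))
#> [1] TRUE
```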

For each argument tag = expr bound to the formal argument ..., the added column is named after the tag and takes the values obtained by evaluating expr on design. Any input expression not supplied with a tag is ignored and therefore has no effect on the des.addvars return value.

Variables to be added to the input object have to be new: it is not possible to use des.addvars to modify the values in a pre-existing design column. This is an intentional feature meant to safeguard the integrity of the relations between survey data and sampling design metadata stored in design.
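The rules described in this section (tagged expressions become columns, untagged expressions are ignored, existing columns cannot be overwritten) can be mimicked on a plain data frame. The sketch below is an illustration only: addvars_sketch is a hypothetical base R helper, not the ReGenesees implementation. Its error message is made up, and its choice to evaluate each expression against the input columns is an assumption of the sketch.

```r
# Hypothetical helper (NOT part of ReGenesees): applies the documented
# tag = expr rules to an ordinary data frame.
addvars_sketch <- function(data, ...) {
  exprs <- as.list(substitute(list(...)))[-1L]   # unevaluated arguments
  tags <- names(exprs)
  if (is.null(tags)) tags <- rep("", length(exprs))

  # Rule 1: expressions supplied without a tag are silently ignored.
  keep <- tags != ""
  exprs <- exprs[keep]
  tags <- tags[keep]

  # Rule 2: pre-existing columns must not be modified.
  clash <- intersect(tags, names(data))
  if (length(clash) > 0L) {
    stop("cannot modify pre-existing column(s): ",
         paste(clash, collapse = ", "))
  }

  # Rule 3: each expr is evaluated with the input columns in scope
  # (sketch assumption), falling back to the caller's environment.
  out <- data
  for (i in seq_along(exprs)) {
    out[[tags[i]]] <- eval(exprs[[i]], data, parent.frame())
  }
  out
}

df <- data.frame(income = c(500, 1500, 2500), z = c(1, 2, 3))

# 'income * 2' has no tag, so it does not produce a column:
out <- addvars_sketch(df, ones = 1, z2 = z^2, income * 2)
names(out)
#> [1] "income" "z"      "ones"   "z2"

# Attempting to overwrite an existing column raises an error:
res <- try(addvars_sketch(df, income = income * 2), silent = TRUE)
inherits(res, "try-error")
#> [1] TRUE
```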

Value

An object of the same class as design, containing the new variables but supplied with exactly the same metadata.

References

Zardetto, D. (2015) “ReGenesees: an Advanced R System for Calibration, Estimation and Sampling Error Assessment in Complex Sample Surveys”. Journal of Official Statistics, 31(2), 177-203. doi: https://doi.org/10.1515/jos-2015-0013.

See also

e.svydesign to bind survey data and sampling design metadata, e.calibrate for calibrating weights.

Examples

data(data.examples)

# Creation of an analytic object:
des<-e.svydesign(data=example,ids=~towcod+famcod,strata=~SUPERSTRATUM,
     weights=~weight)

# Adding the new 'ones' variable to estimate the number
# of final units in the population:
des<-des.addvars(des,ones=1)
svystatTM(des,~ones)
#>         Total       SE
#> ones 924101.3 17172.68

# Recoding a qualitative variable:
des<-des.addvars(des,agerange=factor(ifelse(age5c==1,
     "young","not-young")))
svystatTM(des,~agerange,estimator="Mean")
#>                        Mean          SE
#> agerangenot-young 0.8604824 0.006775833
#> agerangeyoung     0.1395176 0.006775833
svystatTM(des,~income,~agerange,estimator="Mean",conf.int=TRUE)
#>            agerange Mean.income SE.Mean.income CI.l(95%).Mean.income
#> not-young not-young   1303.7618       9.081213             1285.9630
#> young         young    962.6162      17.684967              927.9543
#>           CI.u(95%).Mean.income
#> not-young             1321.5607
#> young                  997.2781

# Algebraic operations on numeric variables:
des<-des.addvars(des,z2=z^2)
svystatTM(des,~z2,estimator="Mean")
#>        Mean       SE
#> z2 20623.27 356.5924

# A more interesting example: estimating the
# percentage of population with income below
# the poverty threshold (defined as 0.6 times
# the median income for the whole population):
Median.Income <- coef(svystatQ(des, ~income,probs=0.5))
Median.Income
#> income
#>   1244
des <- des.addvars(des,
                   status = factor(
                            ifelse(income < (0.6 * Median.Income),
                            "poor", "non-poor")
                                  )
                  )
svystatTM(des,~status,estimator="Mean")
#>                     Mean          SE
#> statusnon-poor 0.8842155 0.006131183
#> statuspoor     0.1157845 0.006131183

# Mean income for poor and non-poor:
svystatTM(des,~income,~status,estimator="Mean")
#>            status Mean.income SE.Mean.income
#> non-poor non-poor   1349.3443       7.855904
#> poor         poor    544.5881      10.161308

### NOTE: Procedure above yields *correct point estimates* of the share of poor
###       population and their average income, while *variance estimation is
###       approximated* since we neglected the sampling variability of the
###       estimated poverty threshold.