Prepare survey data and control totals to run a calibration task on multiple regression coefficients.

prep.calBeta(design, model, Beta,
             by = NULL, partition = FALSE, drop.z = TRUE)

pop.calBeta(design)

Arguments

design

Object of class analytic (or inheriting from it) containing survey data and sampling design metadata. For function pop.calBeta, a prepared design object as returned by function prep.calBeta.

model

Formula (or list of formulas) specifying the linear model(s) whose regression coefficients will be used as calibration benchmarks (see ‘Details’).

Beta

Numeric vector (or list of numeric vectors) of regression coefficients to be used as calibration benchmarks (see ‘Details’).

by

If regression coefficients are known at domain level (i.e. within subpopulations), a formula specifying the domains (see ‘Details’). Specify NULL (the default option) if the regression coefficients are known at the overall population level.

partition

In case domains have been specified through argument by, should a partitioned calibration task be performed? The default is FALSE, which selects a global calibration task.

drop.z

Can the prepared calibration variables (see ‘Details’) be dropped upon completion of the calibration task? Specify TRUE (the default) if you want to save memory space.

Details

Special Purpose Calibration tasks
ReGenesees 2.1 introduced support for ‘special purpose calibration’ tasks, i.e. facilities to calibrate survey weights so as to match complex population parameters, instead of ordinary population totals.

When calibration benchmarks come in the form of population parameters that are complex, non-linear functions of auxiliary variables (like, in the present case, multiple regression coefficients), calibration constraints have to be linearized first. This generates synthetic linearized variables z, which are the actual calibration variables the calibration algorithm will eventually work on. Typically, control totals for these synthetic linearized variables z will all be zero. See, e.g. [Lesage, 2011].

Put briefly:

  • Function prep.calBeta generates and binds to design the synthetic linearized variables z needed for the calibration task.

  • Then, given the prepared design object returned by prep.calBeta, function pop.calBeta generates the control totals data frame for those z variables.

Of course, once prepared, survey data and control totals returned by the above functions will be fed in input to function e.calibrate, which will run the calibration task.

Function prep.calBeta()
Function prep.calBeta makes it possible to:

  1. Calibrate on regression coefficients of different linear models, each known at the overall population level.

  2. Calibrate on regression coefficients of a single linear model, known possibly for different subpopulations.

Note that, as detailed below, the key argument to switch from case 1. to case 2. is by. For extensive illustration of both the above cases 1. and 2., please see the ‘Examples’ section.

The mandatory argument model specifies the linear model(s) whose regression coefficients will be used as calibration benchmarks. Use a formula object to specify a single linear model, or a list of formula objects to specify several different linear models. For details on model specification, see e.g. lm.
The design variables referenced by model formula(s) must be numeric or factor and must not contain any missing value (NA).

The mandatory argument Beta specifies the vector(s) of regression coefficients that will be used as calibration benchmarks. Use a numeric vector to specify regression coefficients of a single linear model, or a list of numeric vectors to specify regression coefficients of several different linear models. The Beta vector(s) must not contain any missing value (NA).

If argument model is passed as a list, argument Beta must be passed as a list too, and the length of both must be the same. If this is the case, Beta vectors will be positionally tied to linear model formulas contained in model.

Each vector of regression coefficients specified through Beta must be consistent with the corresponding model, namely match its model matrix columns (as generated using actual design data). Function prep.calBeta will check for this consistency and raise an informative error message in case of failure.
Note that, in order to ensure consistency with model, not only the length, but also the order of Beta elements matters. If in doubt, you can easily learn about the right ordering of Beta coefficients, given model, by calling function svystatB as follows: svystatB(design, model). This will, of course, return the estimated regression coefficients of model before calibration.
Note that each Beta vector can have names or not. If a Beta vector has names that match the expected ones (given the corresponding model formula), but appear in a different order, then function prep.calBeta will suitably re-order them and inform you with a warning message.

As anticipated, argument by can be used to enable calibration on regression coefficients known at domain level (i.e. within subpopulations).

If passed, by must be a formula: for example, the statement by=~B1:B2 defines as domains the subpopulations determined by crossing the modalities of variables B1 and B2. Notice that a formula like by=~B1+B2 will be automatically translated into the factor-crossing formula by=~B1:B2. The design variables referenced by by (if any) should be of type factor, otherwise they will be coerced. Note that, to prevent obvious collinearity issues, the variables referenced by argument by must not appear in the input model formula: otherwise, the program will stop and print an error message.

If you specify domains through argument by, you will be allowed to specify just a single linear model through argument model. Instead, through argument Beta, you will have to specify domain-level vectors of regression coefficients. Therefore, Beta will be necessarily passed as a list. Note that, in this case, the Beta list must have as many components as the domains defined through argument by, and the two will be matched positionally by function prep.calBeta. Therefore, the order of elements in the Beta list matters. Specifically, Beta vectors must appear in the same order as the domains specified by argument by. You can easily learn about the right ordering of by domains (and hence Beta elements) by calling function svystatB as follows: svystatB(design, model, by). This will, of course, return the estimated regression coefficients of model within by domains before calibration.

Lastly, if you specify domains through argument by, you will be allowed to decide whether design should be prepared for a global or a partitioned calibration task (see e.calibrate). Recall that partitioned calibration tasks are computationally more efficient, but produce exactly the same results. Note that, if you select partition = TRUE, than the calibration domains used to split the global calibration task will be the same domains specified via argument by. More explicitly: function e.calibrate will eventually be called on prep.calBeta's output object silently assuming partition = by.

The synthetic linearized variables z prepared by prep.calBeta will eventually be used by function e.calibrate to solve the calibration task. Argument drop.z allows you to instruct e.calibrate to drop these variables from its output object - or rather keep them within it - upon completion of the calibration task. The default has been set to TRUE to reduce memory usage. In case you want to be able to consistently trim weights after calibration via function trimcal, you must specify drop.z = FALSE.

Function pop.calBeta()
Given the prepared design object returned by prep.calBeta, function pop.calBeta generates the control totals data frame needed to run the calibration task. This dataframe suitably accomodates the control totals of the synthetic linearized variables z prepared by prep.calBeta.

Note that the data frame object returned by pop.calBeta is already filled, with control totals that are all zero, and it is ready to be directly passed to function e.calibrate (see the ‘Examples’ section).

Note, lastly, that printing this control totals data frame might not be very telling: to better understand its structure you should instead leverage function pop.desc, for which a method dedicated to ‘special purpose calibration’ tasks is available (see the ‘Examples’ section).

Linear Algebra Remark

Contrary to ordinary calibration tasks, the control totals of a ‘special purpose calibration’ task are typically all zero. Therefore, calibration constraints of such tasks - unlike ordinary ones - define a system of linear equations that is homogeneous. Homogeneous systems always admit the trivial zero solution, which, in calibration terms, would mean output weights that are identically zero (and thus statistically unacceptable). For this reason, the possibility that function e.calibrate ends up with a false convergence of the special purpose calibration task cannot, in general, be ruled out. Note that functions e.calibrate and check.cal are able to detect such false convergence events and warn the user about it.

A satisfactory countermeasure to this issue is to set calibration bounds to any interval that does not include zero (see argument bounds of e.calibrate).
A second, very attractive alternative would be to run the special purpose calibration task jointly with some ordinary calibration task, as the joint calibration constraints would then define a non-homogeneous system. This solution can be obtained straightforwardly using function pop.fuse.

Value

  • For function pop.calBeta, an object of the same class as design, storing the freshly created synthetic linearized variables z as columns of its $variables slot (see the ‘Examples’ section).

  • For function pop.calBeta, a control totals data frame for those z variables, with class spc.pop (see the ‘Examples’ section).

References

Lesage, E. (2011). “The use of estimating equations to perform a calibration on complex parameters”. Survey Methodology. 37.

See also

e.calibrate to calibrate weights, svystatB to compute estimates and sampling errors of Multiple Regression Coefficients, pop.desc to obtain a natural language description of control totals (including those for special purpose calibration tasks), pop.fuse to fuse population totals prepared for ordinary and special purpose calibration tasks.

Examples

# Load sbs data: data(sbs) # Create a design object: sbsdes <- e.svydesign(data = sbs, ids = ~id, strata = ~strata, weights = ~weight, fpc = ~fpc) ############################################################################# # Calibrate on the regression coefficients of a *single model* known at the # # *overall population level* # ############################################################################# # Suppose you know the coefficients of the following linear model, which you # obtained fitting the model on some external source (e.g. Census or register # data): model0 <- y ~ emp.num*emp.cl # Here, use the sbs sampling frame available in ReGenesees to simulate the # external source and compute the values of the regression coefficients: B0 <- coef(lm(model0, data = sbs.frame)) B0
#> (Intercept) emp.num emp.cl(9,19] #> 283.275194 16.331333 -57.821776 #> emp.cl(19,49] emp.cl(49,99] emp.cl(99,Inf] #> -97.929104 95.802161 1981.066079 #> emp.num:emp.cl(9,19] emp.num:emp.cl(19,49] emp.num:emp.cl(49,99] #> 8.261965 11.644131 6.751837 #> emp.num:emp.cl(99,Inf] #> -0.572829
# Now, prepare the survey design for calibration: sbsdes0 <- prep.calBeta(design = sbsdes, model = model0, Beta = B0) # Have a look at the freshly created *synthetic* auxiliary variables: head(sbsdes0$variables)
#> id public emp.num emp.cl nace5 nace2 area cens region va.cl va #> 1 1268 0 38 (19,49] 1210 1 32 0 Center 22 5500.0 #> 2 1358 0 30 (19,49] 1240 1 32 0 Center 19 1500.0 #> 3 13819 0 25 (19,49] 1131 1 41 0 Center 16 400.0 #> 4 15749 0 22 (19,49] 1111 1 43 0 Center 1 0.0 #> 5 8431 0 29 (19,49] 1121 1 31 0 Center 2 0.5 #> 6 7572 0 50 (49,99] 1132 1 41 0 Center 11 60.0 #> dom1 nace.macro dom2 strata va.imp1 va.imp2 #> 1 1.(19,49] Agriculture Agriculture.Center Center.1.(19,49].0 5500.0 5500.0 #> 2 1.(19,49] Agriculture Agriculture.Center Center.1.(19,49].0 1500.0 1500.0 #> 3 1.(19,49] Agriculture Agriculture.Center Center.1.(19,49].0 400.0 400.0 #> 4 1.(19,49] Agriculture Agriculture.Center Center.1.(19,49].0 0.0 0.0 #> 5 1.(19,49] Agriculture Agriculture.Center Center.1.(19,49].0 0.5 0.5 #> 6 1.(49,99] Agriculture Agriculture.Center Center.1.(49,99].0 60.0 60.0 #> y weight fpc ent dom3 y_.Intercept. y_emp.num #> 1 1636.6075 1.40 0.7142857 1 C -2.735590e-13 4.440892e-16 #> 2 1002.4378 1.40 0.7142857 1 C 1.565414e-14 -2.775558e-17 #> 3 444.4637 1.40 0.7142857 1 D 3.108624e-13 -6.661338e-16 #> 4 252.1287 1.40 0.7142857 1 D 3.890221e-13 -8.881784e-16 #> 5 466.5918 1.40 0.7142857 1 D 3.765876e-13 -8.881784e-16 #> 6 742.9053 1.25 0.8000000 1 B -3.318235e-12 7.993606e-15 #> y_emp.cl.9.19. y_emp.cl.19.49. y_emp.cl.49.99. y_emp.cl.99.Inf. #> 1 1.776357e-14 -2.590537e-01 3.552714e-15 -1.421085e-14 #> 2 -7.771561e-16 -5.831354e-03 -3.330669e-16 5.551115e-16 #> 3 -1.421085e-14 -3.717904e-01 -7.105427e-15 7.105427e-15 #> 4 -1.776357e-14 -6.547574e-01 -8.881784e-15 8.881784e-15 #> 5 -2.131628e-14 -2.010425e-01 -7.105427e-15 1.065814e-14 #> 6 7.105427e-15 -1.421085e-14 -3.917770e+00 4.263256e-14 #> y_emp.num.emp.cl.9.19. y_emp.num.emp.cl.19.49. y_emp.num.emp.cl.49.99. #> 1 0.000000e+00 1.225995e-02 -4.440892e-16 #> 2 1.387779e-17 -9.993173e-06 1.387779e-17 #> 3 2.220446e-16 8.367914e-03 4.440892e-16 #> 4 6.661338e-16 1.683377e-02 4.440892e-16 #> 5 4.440892e-16 1.823722e-03 4.440892e-16 #> 6 -6.217249e-15 -6.217249e-15 4.896417e-02 #> y_emp.num.emp.cl.99.Inf. #> 1 -4.440892e-16 #> 2 2.775558e-17 #> 3 6.661338e-16 #> 4 8.881784e-16 #> 5 8.881784e-16 #> 6 -7.105427e-15
# Then, prepare the control totals dataframe for the calibration task: pop0 <- pop.calBeta(sbsdes0) # Have a look... class(pop0)
#> [1] "spc.pop" "pop.totals" "data.frame"
pop.desc(pop0)
#> # Data frame of control totals for a *special purpose calibration* task #> - Benchmark parameters: Regression Coefficients #> - Benchmarks known at level: Overall Population #> - Calibration task: Global #> #> ## Regression models #> #> y ~ emp.num * emp.cl #> #> ## Benchmark vectors (Beta) #> #> (Intercept) emp.num emp.cl(9,19] #> 283.275194 16.331333 -57.821776 #> emp.cl(19,49] emp.cl(49,99] emp.cl(99,Inf] #> -97.929104 95.802161 1981.066079 #> emp.num:emp.cl(9,19] emp.num:emp.cl(19,49] emp.num:emp.cl(49,99] #> 8.261965 11.644131 6.751837 #> emp.num:emp.cl(99,Inf] #> -0.572829 #>
# ...and note that all the control totals are zero! pop0
#> y_.Intercept. y_emp.num y_emp.cl.9.19. y_emp.cl.19.49. y_emp.cl.49.99. #> 1 0 0 0 0 0 #> y_emp.cl.99.Inf. y_emp.num.emp.cl.9.19. y_emp.num.emp.cl.19.49. #> 1 0 0 0 #> y_emp.num.emp.cl.49.99. y_emp.num.emp.cl.99.Inf. #> 1 0 0
# Lastly, calibrate: sbscal0 <- e.calibrate(sbsdes0, pop0) # Check that calibration estimates of regression coefficients now match the # input B0 values derived from the external source: svystatB(sbscal0, model0)
#> RegCoef.y SE #> (Intercept) 283.275194 3.110855e-12 #> emp.num 16.331333 4.447773e-13 #> emp.cl(9,19] -57.821776 7.072171e-12 #> emp.cl(19,49] -97.929104 3.000343e-11 #> emp.cl(49,99] 95.802161 1.565824e-10 #> emp.cl(99,Inf] 1981.066079 3.164952e-12 #> emp.num:emp.cl(9,19] 8.261965 6.701616e-13 #> emp.num:emp.cl(19,49] 11.644131 1.162343e-12 #> emp.num:emp.cl(49,99] 6.751837 2.327074e-12 #> emp.num:emp.cl(99,Inf] -0.572829 4.395589e-13
B0
#> (Intercept) emp.num emp.cl(9,19] #> 283.275194 16.331333 -57.821776 #> emp.cl(19,49] emp.cl(49,99] emp.cl(99,Inf] #> -97.929104 95.802161 1981.066079 #> emp.num:emp.cl(9,19] emp.num:emp.cl(19,49] emp.num:emp.cl(49,99] #> 8.261965 11.644131 6.751837 #> emp.num:emp.cl(99,Inf] #> -0.572829
# OK ######################################################################### # Calibrate simultaneously on the regression coefficients of *different # # models* known at the *overall population level* # ######################################################################### # Suppose you know the coefficients of the following linear models, which you # obtained fitting the models on some external sources model1 <- va.imp2 ~ emp.num:emp.cl + nace.macro - 1 model2 <- y ~ dom3 - 1 # Here, use the sbs sampling frame available in ReGenesees to simulate the # external sources and compute the values of the regression coefficients: B1 <- coef(lm(model1, data = sbs.frame)) B2 <- coef(lm(model2, data = sbs.frame)) ## First, just for illustration, calibrate only on B1 regression coefficients: sbsdes1 <- prep.calBeta(sbsdes, model = model1, Beta = B1) pop1 <- pop.calBeta(sbsdes1) sbscal1 <- e.calibrate(sbsdes1, pop1) # Check that calibration estimates of regression coefficients now match the # input B1 values derived from the external source... svystatB(sbscal1, model1)
#> RegCoef.va.imp2 SE #> nace.macroAgriculture 5924.29979 2.957017e-12 #> nace.macroIndustry 6032.07596 1.866033e-12 #> nace.macroCommerce 9082.05961 1.085778e-12 #> nace.macroServices 5926.88094 1.347099e-12 #> emp.num:emp.cl[6,9] -766.68839 1.676626e-13 #> emp.num:emp.cl(9,19] -366.77419 1.160064e-13 #> emp.num:emp.cl(19,49] -112.98311 3.172506e-14 #> emp.num:emp.cl(49,99] -25.49606 1.492446e-14 #> emp.num:emp.cl(99,Inf] 12.87226 2.332265e-15
B1
#> nace.macroAgriculture nace.macroIndustry nace.macroCommerce #> 5924.29979 6032.07596 9082.05961 #> nace.macroServices emp.num:emp.cl[6,9] emp.num:emp.cl(9,19] #> 5926.88094 -766.68839 -366.77419 #> emp.num:emp.cl(19,49] emp.num:emp.cl(49,99] emp.num:emp.cl(99,Inf] #> -112.98311 -25.49606 12.87226
# ...but, of course, do *not* match those of B2: svystatB(sbscal1, model2)
#> RegCoef.y SE #> dom3A 1438.495 62.95115 #> dom3B 1602.706 41.15951 #> dom3C 1512.369 31.57178 #> dom3D 1561.797 27.61109
B2
#> dom3A dom3B dom3C dom3D #> 1486.811 1586.184 1503.338 1540.387
# OK ## Second, just for illustration, calibrate only on B2 regression coefficients: sbsdes2 <- prep.calBeta(sbsdes, model = model2, Beta = B2) pop2 <- pop.calBeta(sbsdes2) sbscal2 <- e.calibrate(sbsdes2, pop2) # Check that calibration estimates of regression coefficients now match the # input B2 values derived from the external source... svystatB(sbscal2, model2)
#> RegCoef.y SE #> dom3A 1486.811 5.946116e-14 #> dom3B 1586.184 1.221214e-13 #> dom3C 1503.338 7.099216e-14 #> dom3D 1540.387 1.266409e-13
B2
#> dom3A dom3B dom3C dom3D #> 1486.811 1586.184 1503.338 1540.387
# ...but, of course, do *not* match those of B1: svystatB(sbscal2, model1)
#> RegCoef.va.imp2 SE #> nace.macroAgriculture 5798.62214 232.4651711 #> nace.macroIndustry 5780.83789 198.1482534 #> nace.macroCommerce 9263.58984 253.6379337 #> nace.macroServices 5621.53228 188.1720452 #> emp.num:emp.cl[6,9] -735.21455 25.8519684 #> emp.num:emp.cl(9,19] -349.45447 14.3496646 #> emp.num:emp.cl(19,49] -95.39318 11.9194364 #> emp.num:emp.cl(49,99] -24.39966 3.7194072 #> emp.num:emp.cl(99,Inf] 13.33464 0.3550024
B1
#> nace.macroAgriculture nace.macroIndustry nace.macroCommerce #> 5924.29979 6032.07596 9082.05961 #> nace.macroServices emp.num:emp.cl[6,9] emp.num:emp.cl(9,19] #> 5926.88094 -766.68839 -366.77419 #> emp.num:emp.cl(19,49] emp.num:emp.cl(49,99] emp.num:emp.cl(99,Inf] #> -112.98311 -25.49606 12.87226
# OK ## Now, calibrate *simultaneously* on *B1 and B2* regression coefficients # Prepare the survey design for the joint calibration task: sbsdes1_2 <- prep.calBeta(sbsdes, model = list(model1, model2), Beta = list(B1, B2)) # Prepare the control totals dataframe for the joint calibration task: pop1_2 <- pop.calBeta(sbsdes1_2) # Have a look to the control totals (note the presence of two models and # Beta vectors): pop.desc(pop1_2)
#> # Data frame of control totals for a *special purpose calibration* task #> - Benchmark parameters: Regression Coefficients #> - Benchmarks known at level: Overall Population #> - Calibration task: Global #> #> ## Regression models #> #> $mod1 #> va.imp2 ~ emp.num:emp.cl + nace.macro - 1 #> #> $mod2 #> y ~ dom3 - 1 #> #> ## Benchmark vectors (Beta) #> #> $mod1 #> nace.macroAgriculture nace.macroIndustry nace.macroCommerce #> 5924.29979 6032.07596 9082.05961 #> nace.macroServices emp.num:emp.cl[6,9] emp.num:emp.cl(9,19] #> 5926.88094 -766.68839 -366.77419 #> emp.num:emp.cl(19,49] emp.num:emp.cl(49,99] emp.num:emp.cl(99,Inf] #> -112.98311 -25.49606 12.87226 #> #> $mod2 #> dom3A dom3B dom3C dom3D #> 1486.811 1586.184 1503.338 1540.387 #>
pop1_2
#> va.imp2_nace.macroAgriculture_mod1 va.imp2_nace.macroIndustry_mod1 #> 1 0 0 #> va.imp2_nace.macroCommerce_mod1 va.imp2_nace.macroServices_mod1 #> 1 0 0 #> va.imp2_emp.num.emp.cl.6.9._mod1 va.imp2_emp.num.emp.cl.9.19._mod1 #> 1 0 0 #> va.imp2_emp.num.emp.cl.19.49._mod1 va.imp2_emp.num.emp.cl.49.99._mod1 #> 1 0 0 #> va.imp2_emp.num.emp.cl.99.Inf._mod1 y_dom3A_mod2 y_dom3B_mod2 y_dom3C_mod2 #> 1 0 0 0 0 #> y_dom3D_mod2 #> 1 0
# Lastly, run the calibration: sbscal1_2 <- e.calibrate(sbsdes1_2, pop1_2) # Check that calibration estimates of regression coefficients now match *both* # the B1 and B2 values derived from the external sources: svystatB(sbscal1_2, model1)
#> RegCoef.va.imp2 SE #> nace.macroAgriculture 5924.29979 3.588752e-12 #> nace.macroIndustry 6032.07596 1.824968e-12 #> nace.macroCommerce 9082.05961 1.080236e-12 #> nace.macroServices 5926.88094 1.492405e-12 #> emp.num:emp.cl[6,9] -766.68839 2.027428e-13 #> emp.num:emp.cl(9,19] -366.77419 8.760977e-14 #> emp.num:emp.cl(19,49] -112.98311 2.926817e-14 #> emp.num:emp.cl(49,99] -25.49606 2.258273e-14 #> emp.num:emp.cl(99,Inf] 12.87226 2.870303e-15
B1
#> nace.macroAgriculture nace.macroIndustry nace.macroCommerce #> 5924.29979 6032.07596 9082.05961 #> nace.macroServices emp.num:emp.cl[6,9] emp.num:emp.cl(9,19] #> 5926.88094 -766.68839 -366.77419 #> emp.num:emp.cl(19,49] emp.num:emp.cl(49,99] emp.num:emp.cl(99,Inf] #> -112.98311 -25.49606 12.87226
svystatB(sbscal1_2, model2)
#> RegCoef.y SE #> dom3A 1486.811 1.862539e-13 #> dom3B 1586.184 5.412718e-14 #> dom3C 1503.338 2.803516e-14 #> dom3D 1540.387 5.373730e-14
B2
#> dom3A dom3B dom3C dom3D #> 1486.811 1586.184 1503.338 1540.387
# OK ############################################################################### # Calibrate simultaneously on the regression coefficients of a *single model* # # known for *different subpopulations* # ############################################################################### # NOTE: In this case, both *global* and *partitioned* calibration tasks are # possible, and both will be illustrated below. # Suppose you know the coefficients of the following linear model, which you # obtained fitting the model on some external source (e.g. Census or register # data) *within subpopulations* defined by some factor variable(s): model <- va.imp2 ~ emp.num:emp.cl + nace.macro - 1 # Here, use the sbs sampling frame available in ReGenesees to simulate the # external source and suppose the subpopulations are defined by variable 'dom3'. # Thus, compute the values of the regression coefficients as follows: B <- by(sbs.frame, sbs.frame$dom3, function(df) coef(lm(model, data = df))) B
#> sbs.frame$dom3: A #> nace.macroAgriculture nace.macroIndustry nace.macroCommerce #> 5963.62356 5681.88344 8265.52592 #> nace.macroServices emp.num:emp.cl[6,9] emp.num:emp.cl(9,19] #> 5822.60113 -699.58266 -349.73020 #> emp.num:emp.cl(19,49] emp.num:emp.cl(49,99] emp.num:emp.cl(99,Inf] #> -72.12368 -23.92154 14.58152 #> ------------------------------------------------------------ #> sbs.frame$dom3: B #> nace.macroAgriculture nace.macroIndustry nace.macroCommerce #> 6302.598282 7182.964446 9276.480795 #> nace.macroServices emp.num:emp.cl[6,9] emp.num:emp.cl(9,19] #> 6661.276494 -840.667048 -427.082945 #> emp.num:emp.cl(19,49] emp.num:emp.cl(49,99] emp.num:emp.cl(99,Inf] #> -146.687539 -52.557432 9.449148 #> ------------------------------------------------------------ #> sbs.frame$dom3: C #> nace.macroAgriculture nace.macroIndustry nace.macroCommerce #> 5187.85116 4773.29153 8108.27060 #> nace.macroServices emp.num:emp.cl[6,9] emp.num:emp.cl(9,19] #> 4750.07096 -628.95434 -267.87040 #> emp.num:emp.cl(19,49] emp.num:emp.cl(49,99] emp.num:emp.cl(99,Inf] #> -79.52702 -8.59304 16.25235 #> ------------------------------------------------------------ #> sbs.frame$dom3: D #> nace.macroAgriculture nace.macroIndustry nace.macroCommerce #> 5864.84680 6070.15362 9470.73103 #> nace.macroServices emp.num:emp.cl[6,9] emp.num:emp.cl(9,19] #> 6022.45539 -790.41164 -383.86883 #> emp.num:emp.cl(19,49] emp.num:emp.cl(49,99] emp.num:emp.cl(99,Inf] #> -118.21161 -18.76583 13.47778
## Let's start with the *global* solution # Prepare the survey design for the calibration task (note that the 'by' # argument is used): sbsdes.g <- prep.calBeta(sbsdes, model, Beta = B, by = ~dom3) # Prepare the control totals for the calibration task: pop.g <- pop.calBeta(sbsdes.g) # Have a look to the control totals (note the presence of one Beta vector *for # each domain*): pop.desc(pop.g)
#> # Data frame of control totals for a *special purpose calibration* task #> - Benchmark parameters: Regression Coefficients #> - Benchmarks known at level: Domains [dom3] #> - Calibration task: Global #> #> ## Regression models #> #> va.imp2 ~ emp.num:emp.cl + nace.macro - 1 #> #> ## Domain benchmark vectors (Beta) #> $A #> nace.macroAgriculture nace.macroIndustry nace.macroCommerce #> 5963.62356 5681.88344 8265.52592 #> nace.macroServices emp.num:emp.cl[6,9] emp.num:emp.cl(9,19] #> 5822.60113 -699.58266 -349.73020 #> emp.num:emp.cl(19,49] emp.num:emp.cl(49,99] emp.num:emp.cl(99,Inf] #> -72.12368 -23.92154 14.58152 #> #> $B #> nace.macroAgriculture nace.macroIndustry nace.macroCommerce #> 6302.598282 7182.964446 9276.480795 #> nace.macroServices emp.num:emp.cl[6,9] emp.num:emp.cl(9,19] #> 6661.276494 -840.667048 -427.082945 #> emp.num:emp.cl(19,49] emp.num:emp.cl(49,99] emp.num:emp.cl(99,Inf] #> -146.687539 -52.557432 9.449148 #> #> $C #> nace.macroAgriculture nace.macroIndustry nace.macroCommerce #> 5187.85116 4773.29153 8108.27060 #> nace.macroServices emp.num:emp.cl[6,9] emp.num:emp.cl(9,19] #> 4750.07096 -628.95434 -267.87040 #> emp.num:emp.cl(19,49] emp.num:emp.cl(49,99] emp.num:emp.cl(99,Inf] #> -79.52702 -8.59304 16.25235 #> #> $D #> nace.macroAgriculture nace.macroIndustry nace.macroCommerce #> 5864.84680 6070.15362 9470.73103 #> nace.macroServices emp.num:emp.cl[6,9] emp.num:emp.cl(9,19] #> 6022.45539 -790.41164 -383.86883 #> emp.num:emp.cl(19,49] emp.num:emp.cl(49,99] emp.num:emp.cl(99,Inf] #> -118.21161 -18.76583 13.47778 #> #>
pop.g
#> va.imp2_nace.macroAgriculture_A va.imp2_nace.macroIndustry_A #> 1 0 0 #> va.imp2_nace.macroCommerce_A va.imp2_nace.macroServices_A #> 1 0 0 #> va.imp2_emp.num.emp.cl.6.9._A va.imp2_emp.num.emp.cl.9.19._A #> 1 0 0 #> va.imp2_emp.num.emp.cl.19.49._A va.imp2_emp.num.emp.cl.49.99._A #> 1 0 0 #> va.imp2_emp.num.emp.cl.99.Inf._A va.imp2_nace.macroAgriculture_B #> 1 0 0 #> va.imp2_nace.macroIndustry_B va.imp2_nace.macroCommerce_B #> 1 0 0 #> va.imp2_nace.macroServices_B va.imp2_emp.num.emp.cl.6.9._B #> 1 0 0 #> va.imp2_emp.num.emp.cl.9.19._B va.imp2_emp.num.emp.cl.19.49._B #> 1 0 0 #> va.imp2_emp.num.emp.cl.49.99._B va.imp2_emp.num.emp.cl.99.Inf._B #> 1 0 0 #> va.imp2_nace.macroAgriculture_C va.imp2_nace.macroIndustry_C #> 1 0 0 #> va.imp2_nace.macroCommerce_C va.imp2_nace.macroServices_C #> 1 0 0 #> va.imp2_emp.num.emp.cl.6.9._C va.imp2_emp.num.emp.cl.9.19._C #> 1 0 0 #> va.imp2_emp.num.emp.cl.19.49._C va.imp2_emp.num.emp.cl.49.99._C #> 1 0 0 #> va.imp2_emp.num.emp.cl.99.Inf._C va.imp2_nace.macroAgriculture_D #> 1 0 0 #> va.imp2_nace.macroIndustry_D va.imp2_nace.macroCommerce_D #> 1 0 0 #> va.imp2_nace.macroServices_D va.imp2_emp.num.emp.cl.6.9._D #> 1 0 0 #> va.imp2_emp.num.emp.cl.9.19._D va.imp2_emp.num.emp.cl.19.49._D #> 1 0 0 #> va.imp2_emp.num.emp.cl.49.99._D va.imp2_emp.num.emp.cl.99.Inf._D #> 1 0 0
# Run the calibration: sbscal.g <- e.calibrate(sbsdes.g, pop.g) # Check that calibration estimates of regression coefficients now match the # input B values derived from the external source *for each domain*: svystatB(sbscal.g, model, ~dom3)
#> dom3 va.imp2_nace.macroAgriculture va.imp2_nace.macroIndustry #> A A 5963.624 5681.883 #> B B 6302.598 7182.964 #> C C 5187.851 4773.292 #> D D 5864.847 6070.154 #> va.imp2_nace.macroCommerce va.imp2_nace.macroServices #> A 8265.526 5822.601 #> B 9276.481 6661.276 #> C 8108.271 4750.071 #> D 9470.731 6022.455 #> va.imp2_emp.num:emp.cl[6,9] va.imp2_emp.num:emp.cl(9,19] #> A -699.5827 -349.7302 #> B -840.6670 -427.0829 #> C -628.9543 -267.8704 #> D -790.4116 -383.8688 #> va.imp2_emp.num:emp.cl(19,49] va.imp2_emp.num:emp.cl(49,99] #> A -72.12368 -23.92154 #> B -146.68754 -52.55743 #> C -79.52702 -8.59304 #> D -118.21161 -18.76583 #> va.imp2_emp.num:emp.cl(99,Inf] SE.va.imp2_nace.macroAgriculture #> A 14.581521 1.464051e-11 #> B 9.449148 2.626572e-12 #> C 16.252348 7.715128e-12 #> D 13.477778 2.826564e-12 #> SE.va.imp2_nace.macroIndustry SE.va.imp2_nace.macroCommerce #> A 7.267283e-12 8.222697e-12 #> B 5.508275e-13 2.371340e-12 #> C 3.242956e-12 4.110657e-12 #> D 1.055484e-12 6.637389e-12 #> SE.va.imp2_nace.macroServices SE.va.imp2_emp.num:emp.cl[6,9] #> A 7.466187e-12 9.070664e-13 #> B 1.652352e-12 2.072976e-13 #> C 3.462708e-12 5.007919e-13 #> D 2.859232e-12 4.591229e-13 #> SE.va.imp2_emp.num:emp.cl(9,19] SE.va.imp2_emp.num:emp.cl(19,49] #> A 5.768278e-13 2.066474e-13 #> B 4.520645e-14 2.322698e-14 #> C 3.992955e-13 1.247204e-13 #> D 1.631044e-13 9.610965e-14 #> SE.va.imp2_emp.num:emp.cl(49,99] SE.va.imp2_emp.num:emp.cl(99,Inf] #> A 3.087308e-13 7.951077e-15 #> B 9.279691e-15 5.404357e-15 #> C 7.720180e-14 9.461398e-15 #> D 5.160165e-14 1.556487e-14
B
#> sbs.frame$dom3: A #> nace.macroAgriculture nace.macroIndustry nace.macroCommerce #> 5963.62356 5681.88344 8265.52592 #> nace.macroServices emp.num:emp.cl[6,9] emp.num:emp.cl(9,19] #> 5822.60113 -699.58266 -349.73020 #> emp.num:emp.cl(19,49] emp.num:emp.cl(49,99] emp.num:emp.cl(99,Inf] #> -72.12368 -23.92154 14.58152 #> ------------------------------------------------------------ #> sbs.frame$dom3: B #> nace.macroAgriculture nace.macroIndustry nace.macroCommerce #> 6302.598282 7182.964446 9276.480795 #> nace.macroServices emp.num:emp.cl[6,9] emp.num:emp.cl(9,19] #> 6661.276494 -840.667048 -427.082945 #> emp.num:emp.cl(19,49] emp.num:emp.cl(49,99] emp.num:emp.cl(99,Inf] #> -146.687539 -52.557432 9.449148 #> ------------------------------------------------------------ #> sbs.frame$dom3: C #> nace.macroAgriculture nace.macroIndustry nace.macroCommerce #> 5187.85116 4773.29153 8108.27060 #> nace.macroServices emp.num:emp.cl[6,9] emp.num:emp.cl(9,19] #> 4750.07096 -628.95434 -267.87040 #> emp.num:emp.cl(19,49] emp.num:emp.cl(49,99] emp.num:emp.cl(99,Inf] #> -79.52702 -8.59304 16.25235 #> ------------------------------------------------------------ #> sbs.frame$dom3: D #> nace.macroAgriculture nace.macroIndustry nace.macroCommerce #> 5864.84680 6070.15362 9470.73103 #> nace.macroServices emp.num:emp.cl[6,9] emp.num:emp.cl(9,19] #> 6022.45539 -790.41164 -383.86883 #> emp.num:emp.cl(19,49] emp.num:emp.cl(49,99] emp.num:emp.cl(99,Inf] #> -118.21161 -18.76583 13.47778
# OK ## Let's proceed with the *partitioned* solution # Prepare the survey design for the calibration task (note that 'by' and # 'partition' arguments are used): sbsdes.p <- prep.calBeta(sbsdes, model, Beta = B, by = ~dom3, partition = TRUE) # Prepare the control totals for the calibration task: pop.p <- pop.calBeta(sbsdes.p) # Have a look to the control totals (note the presence of one Beta vector *for # each domain*): pop.desc(pop.p)
#> # Data frame of control totals for a *special purpose calibration* task #> - Benchmark parameters: Regression Coefficients #> - Benchmarks known at level: Domains [dom3] #> - Calibration task: Partitioned #> #> ## Regression models #> #> va.imp2 ~ emp.num:emp.cl + nace.macro - 1 #> #> ## Domain benchmark vectors (Beta) #> $A #> nace.macroAgriculture nace.macroIndustry nace.macroCommerce #> 5963.62356 5681.88344 8265.52592 #> nace.macroServices emp.num:emp.cl[6,9] emp.num:emp.cl(9,19] #> 5822.60113 -699.58266 -349.73020 #> emp.num:emp.cl(19,49] emp.num:emp.cl(49,99] emp.num:emp.cl(99,Inf] #> -72.12368 -23.92154 14.58152 #> #> $B #> nace.macroAgriculture nace.macroIndustry nace.macroCommerce #> 6302.598282 7182.964446 9276.480795 #> nace.macroServices emp.num:emp.cl[6,9] emp.num:emp.cl(9,19] #> 6661.276494 -840.667048 -427.082945 #> emp.num:emp.cl(19,49] emp.num:emp.cl(49,99] emp.num:emp.cl(99,Inf] #> -146.687539 -52.557432 9.449148 #> #> $C #> nace.macroAgriculture nace.macroIndustry nace.macroCommerce #> 5187.85116 4773.29153 8108.27060 #> nace.macroServices emp.num:emp.cl[6,9] emp.num:emp.cl(9,19] #> 4750.07096 -628.95434 -267.87040 #> emp.num:emp.cl(19,49] emp.num:emp.cl(49,99] emp.num:emp.cl(99,Inf] #> -79.52702 -8.59304 16.25235 #> #> $D #> nace.macroAgriculture nace.macroIndustry nace.macroCommerce #> 5864.84680 6070.15362 9470.73103 #> nace.macroServices emp.num:emp.cl[6,9] emp.num:emp.cl(9,19] #> 6022.45539 -790.41164 -383.86883 #> emp.num:emp.cl(19,49] emp.num:emp.cl(49,99] emp.num:emp.cl(99,Inf] #> -118.21161 -18.76583 13.47778 #> #>
pop.p
#> dom3 va.imp2_nace.macroAgriculture va.imp2_nace.macroIndustry #> 1 A 0 0 #> 2 B 0 0 #> 3 C 0 0 #> 4 D 0 0 #> va.imp2_nace.macroCommerce va.imp2_nace.macroServices #> 1 0 0 #> 2 0 0 #> 3 0 0 #> 4 0 0 #> va.imp2_emp.num.emp.cl.6.9. va.imp2_emp.num.emp.cl.9.19. #> 1 0 0 #> 2 0 0 #> 3 0 0 #> 4 0 0 #> va.imp2_emp.num.emp.cl.19.49. va.imp2_emp.num.emp.cl.49.99. #> 1 0 0 #> 2 0 0 #> 3 0 0 #> 4 0 0 #> va.imp2_emp.num.emp.cl.99.Inf. #> 1 0 #> 2 0 #> 3 0 #> 4 0
# Run the calibration: sbscal.p <- e.calibrate(sbsdes.p, pop.p) # Check that calibration estimates of regression coefficients now match the # input B values derived from the external source *for each domain*: svystatB(sbscal.p, model, ~dom3)
#> dom3 va.imp2_nace.macroAgriculture va.imp2_nace.macroIndustry #> A A 5963.624 5681.883 #> B B 6302.598 7182.964 #> C C 5187.851 4773.292 #> D D 5864.847 6070.154 #> va.imp2_nace.macroCommerce va.imp2_nace.macroServices #> A 8265.526 5822.601 #> B 9276.481 6661.276 #> C 8108.271 4750.071 #> D 9470.731 6022.455 #> va.imp2_emp.num:emp.cl[6,9] va.imp2_emp.num:emp.cl(9,19] #> A -699.5827 -349.7302 #> B -840.6670 -427.0829 #> C -628.9543 -267.8704 #> D -790.4116 -383.8688 #> va.imp2_emp.num:emp.cl(19,49] va.imp2_emp.num:emp.cl(49,99] #> A -72.12368 -23.92154 #> B -146.68754 -52.55743 #> C -79.52702 -8.59304 #> D -118.21161 -18.76583 #> va.imp2_emp.num:emp.cl(99,Inf] SE.va.imp2_nace.macroAgriculture #> A 14.581521 1.473273e-11 #> B 9.449148 2.358546e-12 #> C 16.252348 4.851001e-12 #> D 13.477778 8.698131e-13 #> SE.va.imp2_nace.macroIndustry SE.va.imp2_nace.macroCommerce #> A 1.638006e-11 9.124250e-12 #> B 9.455565e-13 1.396646e-12 #> C 2.496062e-12 4.006464e-12 #> D 5.323422e-13 1.417058e-12 #> SE.va.imp2_nace.macroServices SE.va.imp2_emp.num:emp.cl[6,9] #> A 9.249272e-12 1.176730e-12 #> B 1.157037e-12 1.186337e-13 #> C 3.800487e-12 4.371859e-13 #> D 6.853105e-13 1.249636e-13 #> SE.va.imp2_emp.num:emp.cl(9,19] SE.va.imp2_emp.num:emp.cl(19,49] #> A 7.084993e-13 3.896926e-13 #> B 6.554536e-14 2.534028e-14 #> C 3.492117e-13 1.098546e-13 #> D 1.275862e-13 4.537337e-14 #> SE.va.imp2_emp.num:emp.cl(49,99] SE.va.imp2_emp.num:emp.cl(99,Inf] #> A 3.243194e-13 2.668893e-14 #> B 4.825743e-14 9.466470e-15 #> C 7.614446e-14 2.720381e-15 #> D 2.060717e-14 1.629309e-14
B
#> sbs.frame$dom3: A #> nace.macroAgriculture nace.macroIndustry nace.macroCommerce #> 5963.62356 5681.88344 8265.52592 #> nace.macroServices emp.num:emp.cl[6,9] emp.num:emp.cl(9,19] #> 5822.60113 -699.58266 -349.73020 #> emp.num:emp.cl(19,49] emp.num:emp.cl(49,99] emp.num:emp.cl(99,Inf] #> -72.12368 -23.92154 14.58152 #> ------------------------------------------------------------ #> sbs.frame$dom3: B #> nace.macroAgriculture nace.macroIndustry nace.macroCommerce #> 6302.598282 7182.964446 9276.480795 #> nace.macroServices emp.num:emp.cl[6,9] emp.num:emp.cl(9,19] #> 6661.276494 -840.667048 -427.082945 #> emp.num:emp.cl(19,49] emp.num:emp.cl(49,99] emp.num:emp.cl(99,Inf] #> -146.687539 -52.557432 9.449148 #> ------------------------------------------------------------ #> sbs.frame$dom3: C #> nace.macroAgriculture nace.macroIndustry nace.macroCommerce #> 5187.85116 4773.29153 8108.27060 #> nace.macroServices emp.num:emp.cl[6,9] emp.num:emp.cl(9,19] #> 4750.07096 -628.95434 -267.87040 #> emp.num:emp.cl(19,49] emp.num:emp.cl(49,99] emp.num:emp.cl(99,Inf] #> -79.52702 -8.59304 16.25235 #> ------------------------------------------------------------ #> sbs.frame$dom3: D #> nace.macroAgriculture nace.macroIndustry nace.macroCommerce #> 5864.84680 6070.15362 9470.73103 #> nace.macroServices emp.num:emp.cl[6,9] emp.num:emp.cl(9,19] #> 6022.45539 -790.41164 -383.86883 #> emp.num:emp.cl(19,49] emp.num:emp.cl(49,99] emp.num:emp.cl(99,Inf] #> -118.21161 -18.76583 13.47778
# OK ## Lastly, check that calibration weights obtained using the *global* and ## *partitioned* solution are the same: g.range(sbscal.g)
#> g.min g.max #> 0.09254688 2.11921444
g.range(sbscal.p)
#> g.min g.max #> 0.09254688 2.11921444
all.equal(weights(sbscal.g), weights(sbscal.p))
#> [1] TRUE
# OK ########################################################### # BONUS TIP: Calibration on the mean (or on domain means) # # of one variable or multiple variables. # ########################################################### # Since the domain mean of a numeric variable can be thought as a # regression coefficient (see the 'Examples' section of ?svystatB), # you can use the ReGenesees facilities documented above to # *calibrate on the mean (or on domain means)* of *one variable # or multiple variables*. # NOTE: The examples below cover the following cases of calibration on: # (i) The overall mean of a single variable. # (ii) The means of a single variable within domains of just # one type. # (iii) The domain means of several variables, with multiple # and different domain types. # Load artificial household survey data and define a survey design: data(data.examples) exdes <- e.svydesign(data = example, ids = ~towcod + famcod, strata = ~SUPERSTRATUM, weights = ~weight) exdes <- des.addvars(exdes, ones = 1) ## CASE (i): Calibrate on the overall mean of a single variable # Suppose you know with satisfactory accuracy the *average income* of your # target population (but you do *not* have reliable information on the # *total income*, nor on the *total number of individuals*): income.AVG <- 1270 # You can calibrate on *average income* as follows: exdes.new <- prep.calBeta(exdes, income ~ 1, Beta = income.AVG) pop.new <- pop.calBeta(exdes.new) excal.new <- e.calibrate(exdes.new, pop.new) # Now, check that calibration estimate of average income now match the known # value derived from the external source *without residual uncertainty*: svystatTM(excal.new, ~income, estimator = "Mean")
#> Mean SE #> income 1270 3.148794e-14
income.AVG
#> [1] 1270
# ...while there is *residual uncertainty* in the estimates of the numerator and # denominator totals: svystatTM(excal.new, ~income + ones, estimator = "Total")
#> Total SE #> income 1172423407 21707058.68 #> ones 923168 17092.17
# OK ## CASE (ii): Calibrate on the means of a single variable within domains # of just one kind # You can calibrate on *domain means* along the lines illustrated above (note, # however, that argument 'by' would provide an alternative way to achieve # the same result). # Suppose you know with satisfactory accuracy the *average income* by the # crossclassification of sex and marital status: income.AVG.sex.marstat <- c(f.married = 1310, m.married = 1260, f.unmarried = 1150, m.unmarried = 1200, f.widowed = 1380, m.widowed = 1300) # Run the calibration on *average income by sex:marstat* as follows: exdes.new <- prep.calBeta(exdes, income ~ sex:marstat -1, Beta = income.AVG.sex.marstat)
#> Warning: Names of regression coefficients do not agree with model matrix columns! #> Please double-check the consistency of 'model' and 'Beta' input values.
pop.new <- pop.calBeta(exdes.new) excal.new <- e.calibrate(exdes.new, pop.new) # Now, check that calibration estimates of average income by domains now match # the known values derived from the external source *without residual # uncertainty*: svystatTM(excal.new, ~income, ~sex:marstat, estimator = "Mean")
#> sex marstat Mean.income SE.Mean.income #> f.married f married 1310 2.167736e-14 #> m.married m married 1260 3.149995e-15 #> f.unmarried f unmarried 1150 6.709128e-14 #> m.unmarried m unmarried 1200 1.835315e-14 #> f.widowed f widowed 1380 2.339318e-14 #> m.widowed m widowed 1300 2.183212e-14
income.AVG.sex.marstat
#> f.married m.married f.unmarried m.unmarried f.widowed m.widowed #> 1310 1260 1150 1200 1380 1300
# ...while there is *residual uncertainty* in the estimates of the numerator and # denominator totals: svystatTM(excal.new, ~income + ones, ~sex:marstat, estimator = "Total")
#> sex marstat Total.income Total.ones SE.Total.income SE.Total.ones #> f.married f married 358161674 273405.86 12162919 9284.671 #> m.married m married 330429504 262245.64 12318236 9776.378 #> f.unmarried f unmarried 182847978 158998.24 9370624 8148.369 #> m.unmarried m unmarried 184206581 153505.48 9020391 7516.993 #> f.widowed f widowed 51190368 37094.47 5600383 4058.249 #> m.widowed m widowed 48627191 37405.53 4725622 3635.094
# OK ## CASE (iii): Calibrate on the domain means of several variables, with multiple # and different domain types. # Suppose you know with satisfactory accuracy: # - the average income by sex: # - the average income by marstat: # - the average of variable z by age (variable 'age5c', 5 classes): income.AVG.sex <- c("f" = 1245, "m" = 1250) income.AVG.marstat <- c("married" = 1260, "unmarried" = 1230, "widowed" = 1290) z.AVG.age5c <- c("1" = 125, "2" = 130, "3" = 135, "4" = 125, "5" = 140) # Run the calibration as follows: exdes.new <- prep.calBeta(exdes, model = list(income ~ sex -1, income ~ marstat -1, z ~ age5c -1), Beta = list(income.AVG.sex, income.AVG.marstat, z.AVG.age5c) )
#> Warning: Names of regression coefficients do not agree with model matrix columns for input list element: 1! #> Please double-check the consistency of 'model' and 'Beta' input values.
#> Warning: Names of regression coefficients do not agree with model matrix columns for input list element: 2! #> Please double-check the consistency of 'model' and 'Beta' input values.
#> Warning: Names of regression coefficients do not agree with model matrix columns for input list element: 3! #> Please double-check the consistency of 'model' and 'Beta' input values.
pop.new <- pop.calBeta(exdes.new) excal.new <- e.calibrate(exdes.new, pop.new) # Now, check that calibration estimates match the known domain means derived # from the external source: svystatTM(excal.new, ~income, ~sex, estimator = "Mean")
#> sex Mean.income SE.Mean.income #> f f 1245 2.397528e-13 #> m m 1250 2.886708e-14
income.AVG.sex
#> f m #> 1245 1250
svystatTM(excal.new, ~income, ~marstat, estimator = "Mean")
#> marstat Mean.income SE.Mean.income #> married married 1260 5.530487e-14 #> unmarried unmarried 1230 3.157294e-14 #> widowed widowed 1290 5.158032e-14
income.AVG.marstat
#> married unmarried widowed #> 1260 1230 1290
svystatTM(excal.new, ~z, ~age5c, estimator = "Mean")
#> age5c Mean.z SE.Mean.z #> 1 1 125 1.109197e-14 #> 2 2 130 1.152637e-14 #> 3 3 135 5.679110e-15 #> 4 4 125 1.212958e-14 #> 5 5 140 4.419762e-15
z.AVG.age5c
#> 1 2 3 4 5 #> 125 130 135 125 140
# OK