Quickly estimates the totals of the auxiliary variables of a calibration model.

aux.estimates(design,
              calmodel = if (inherits(template, "pop.totals"))
                         attr(template, "calmodel"),
              partition = if (inherits(template, "pop.totals"))
                          attr(template, "partition") else FALSE,
              template = NULL)

Arguments

design

Object of class analytic (or inheriting from it) containing survey data and sampling design metadata.

calmodel

Formula defining the linear structure of the calibration model.

partition

Formula specifying the variables that define the "calibration domains" for the model (see ‘Details’); FALSE (the default) implies no calibration domains.

template

An object of class pop.totals, be it a template or the actual known totals data frame for the calibration task.

Details

The main purpose of function aux.estimates is to simplify the task of estimating the totals of all the auxiliary variables involved in a calibration model (separately within each calibration domain, if any are specified). Although such totals can also be estimated by repeatedly invoking function svystatTM, this can prove very tricky in practice, because real-world calibration tasks (e.g. in the field of Official Statistics) can simultaneously involve hundreds of auxiliary variables. Moreover, the total estimates provided by function svystatTM are always accompanied by sampling errors, whose estimation is computationally very demanding.

Function aux.estimates, by contrast, provides only estimates of totals (i.e. without associated sampling errors), which makes it very fast. Moreover, aux.estimates can compute all the totals of the auxiliary variables of a calibration model in a single call, no matter how complex the model is. Lastly, the totals estimated by aux.estimates are returned in exactly the same standard format in which the known population totals for the related calibration task must be represented (see pop.template, population.check, fill.template).

Note that, although it was designed to handle the auxiliary variables involved in calibration models, function aux.estimates can also be used to compute general estimates of totals within subpopulations very efficiently (see ‘Examples’).
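To make concrete what "estimating the totals of the auxiliary variables" means, here is a minimal base R sketch. It illustrates the quantity being estimated and is not how aux.estimates is implemented. The toy data frame and the helper ht_totals are hypothetical. The sketch assumes that the auxiliary variables are the columns of the model matrix generated by the calmodel formula, and that the estimate is the weighted column sum computed within each calibration domain.

```r
# Hypothetical toy sample (NOT the sbs data): one row per sampled unit
toy <- data.frame(
  dom    = factor(c("A", "A", "B", "B", "B")),                  # calibration domain
  size   = factor(c("small", "large", "small", "small", "large")),
  emp    = c(5, 40, 8, 6, 120),                                 # number of employees
  weight = c(10, 2, 12, 12, 1)                                  # sampling weights
)

# Hypothetical helper: weighted totals of the model-matrix columns by domain
ht_totals <- function(data, calmodel, partition_var, weight_var) {
  X <- model.matrix(calmodel, data = data)  # one column per auxiliary variable
  w <- data[[weight_var]]
  # X * w scales row i by w[i]; rowsum() then adds the rows within each domain
  rowsum(X * w, group = data[[partition_var]])
}

est <- ht_totals(toy, ~ emp:size - 1, "dom", "weight")
# Domain A: large = 40 * 2 = 80, small = 5 * 10 = 50
# Domain B: large = 120 * 1 = 120, small = 8 * 12 + 6 * 12 = 168
est
```

No variance computation is involved anywhere in this sketch, which is consistent with the reason given above for why estimating totals alone is fast.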

Value

An object of class pop.totals (thus inheriting from class data.frame), storing the estimated totals in a standard format.

See also

e.svydesign to bind survey data and sampling design metadata, svystatTM for calculating estimates and standard errors of totals, e.calibrate for calibrating weights, pop.template for constructing known totals data frames in compliance with the standard required by e.calibrate, population.check to check that the known totals data frame satisfies that standard, fill.template to automatically fill the template when a sampling frame is available.

Examples

# Load sbs data:
data(sbs)

# Build a design object:
sbsdes<-e.svydesign(data=sbs,ids=~id,strata=~strata,weights=~weight,fpc=~fpc)

# Now suppose you have to perform a calibration process which
# exploits as auxiliary information:
# i) the total number of employees (emp.num)
#    by class of number of employees (emp.cl) crossed with nace.macro;
# ii) the total number of enterprises (ent)
#     by region crossed with nace.macro;

# Build a template for the known totals:
pop<-pop.template(sbsdes,
     calmodel=~emp.num:emp.cl + region -1,
     partition=~nace.macro)

# Use the fill.template function to automatically compute
# the totals from the universe (sbs.frame) and safely fill
# the template:
pop<-fill.template(sbs.frame,template=pop)
#>
#> # Coherence check between 'universe' and 'template': OK
#>

pop
#>    nace.macro regionNorth regionCenter regionSouth emp.num:emp.cl[6,9]
#> 1 Agriculture         283           61         114                 729
#> 2    Industry        5080         2290        1925               15494
#> 3    Commerce        1902          554         604                9931
#> 4    Services        3109          636         760               11888
#>   emp.num:emp.cl(9,19] emp.num:emp.cl(19,49] emp.num:emp.cl(49,99]
#> 1                 2615                  3063                  2000
#> 2                31575                 69898                 82256
#> 3                12654                 13421                  8353
#> 4                15835                 22799                 22768
#>   emp.num:emp.cl(99,Inf]
#> 1                   8761
#> 2                 380357
#> 3                  25166
#> 4                 244831

# You can now use aux.estimates to verify how much difference
# exists between the target totals and the initial HT estimates:
aux.HT<-aux.estimates(sbsdes,template=pop)
aux.HT
#>    nace.macro regionNorth regionCenter regionSouth emp.num:emp.cl[6,9]
#> 1 Agriculture         283           61         114            731.3167
#> 2    Industry        5080         2290        1925          15427.1899
#> 3    Commerce        1902          554         604           9909.9941
#> 4    Services        3109          636         760          11994.3642
#>   emp.num:emp.cl(9,19] emp.num:emp.cl(19,49] emp.num:emp.cl(49,99]
#> 1             2624.436              2922.562              1999.615
#> 2            31917.721             69637.743             84114.682
#> 3            12319.514             13644.274              8177.270
#> 4            15626.539             21886.497             23109.134
#>   emp.num:emp.cl(99,Inf]
#> 1                   8761
#> 2                 380357
#> 3                  25166
#> 4                 244831

# If you calibrate, ...
sbscal<-e.calibrate(sbsdes,pop)

# ... you can verify that CAL estimates exactly match the known totals:
aux.CAL<-aux.estimates(sbscal,template=pop)
aux.CAL
#>    nace.macro regionNorth regionCenter regionSouth emp.num:emp.cl[6,9]
#> 1 Agriculture         283           61         114                 729
#> 2    Industry        5080         2290        1925               15494
#> 3    Commerce        1902          554         604                9931
#> 4    Services        3109          636         760               11888
#>   emp.num:emp.cl(9,19] emp.num:emp.cl(19,49] emp.num:emp.cl(49,99]
#> 1                 2615                  3063                  2000
#> 2                31575                 69898                 82256
#> 3                12654                 13421                  8353
#> 4                15835                 22799                 22768
#>   emp.num:emp.cl(99,Inf]
#> 1                   8761
#> 2                 380357
#> 3                  25166
#> 4                 244831

# Recall that you can also use aux.estimates for computing
# general estimates of totals inside subpopulations (even
# not related to any calibration task).
# E.g. estimate the total of value added inside areas:
aux.estimates(sbsdes,~va.imp2-1,~area)
#>    area     va.imp2
#> 1    11 19676130.06
#> 2    12  4154270.83
#> 3    13  7559215.95
#> 4    14  1172101.78
#> 5    15  4058592.18
#> 6    16   589516.48
#> 7    17  1961458.41
#> 8    21   492271.11
#> 9    22   148638.82
#> 10   23   125882.78
#> 11   24    85940.47
#> 12   31  4086071.44
#> 13   32  2781773.81
#> 14   33  1055307.50
#> 15   34  5776831.62
#> 16   41   499092.70
#> 17   42   127847.21
#> 18   43   114218.64
#> 19   51   734492.92
#> 20   52   628805.61
#> 21   53    71104.66
#> 22   61   253466.99
#> 23   62    76083.79
#> 24   63    64764.31

# ...and compare to svystatTM (notice also
# the increased execution time):
svystatTM(sbsdes,~va.imp2,~area)
#>    area Total.va.imp2 SE.Total.va.imp2
#> 11   11   19676130.06        675015.62
#> 12   12    4154270.83        268571.62
#> 13   13    7559215.95        458784.00
#> 14   14    1172101.78        116567.07
#> 15   15    4058592.18        496491.73
#> 16   16     589516.48         46146.21
#> 17   17    1961458.41        132339.76
#> 21   21     492271.11         73943.48
#> 22   22     148638.82         30108.12
#> 23   23     125882.78         16611.29
#> 24   24      85940.47         18159.72
#> 31   31    4086071.44        298354.00
#> 32   32    2781773.81        240456.28
#> 33   33    1055307.50        155673.78
#> 34   34    5776831.62        389088.52
#> 41   41     499092.70         85159.34
#> 42   42     127847.21         11601.29
#> 43   43     114218.64         34182.20
#> 51   51     734492.92        103743.49
#> 52   52     628805.61         65170.80
#> 53   53      71104.66         15903.39
#> 61   61     253466.99         52830.99
#> 62   62      76083.79         20809.48
#> 63   63      64764.31         26085.19