bounds.hint.Rd
Suggests a sound bounds
value for which kottcalibrate
is likely to converge.
bounds.hint(deskott, df.population, calmodel = if (inherits(df.population, "pop.totals")) attr(df.population, "calmodel"), partition = if (inherits(df.population, "pop.totals")) attr(df.population, "partition") else FALSE)
Argument | Description |
---|---|
deskott | Object of class kott.design. |
df.population | Data frame containing the known population totals for the auxiliary variables. |
calmodel | Formula defining the linear structure of the calibration model. |
partition | Formula specifying the variables that define the "calibration domains" for the model; FALSE (the default) means that no calibration domains are used. |
The function bounds.hint
returns a bounds
value for which kottcalibrate
is likely to converge. This interval is only a sound hint, not an exact result (see 'Note').
The mandatory argument deskott
identifies the kott.design
object on which the calibration problem is defined.
The mandatory argument df.population
identifies the known totals data frame.
The argument calmodel
symbolically defines the calibration model you want to use: it identifies the auxiliary variables and the constraints for the calibration problem. The deskott
variables referenced by calmodel
must be numeric
or factor
and must not contain any missing value (NA
). The argument can be omitted provided df.population
is an object of class pop.totals
(see population.check
).
The optional argument partition
specifies the variables that define the calibration domains for the model. The default value (FALSE
) means either that there are no calibration domains or that you want to solve the problem globally (even though it could be factorised). The deskott
variables referenced by partition
(if any) must be factor
and must not contain any missing value (NA
). The argument can be omitted provided df.population
is an object of class pop.totals
(see population.check
).
A numeric vector of length 2, representing the suggested value for the bounds
argument of kottcalibrate
. The attributes of that vector store additional information, which can help explain why a given calibration problem is (un)feasible (see 'Examples').
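As a rough illustration of how those attributes can be inspected, the sketch below builds a small mock object with the same attribute names that appear in the printed output in 'Examples' ("star.interval" and "bounds", the latter holding lower and upper matrices indexed by replicate and calibration domain). The numbers are invented, and the object is hand-made rather than returned by bounds.hint.

```r
# Mock object mimicking the attribute layout printed in 'Examples'.
# All values are illustrative; a real object comes from bounds.hint().
lower <- matrix(c(0.80, 0.74, 0.77, 0.71), nrow = 2,
                dimnames = list(c("original", "replicate.1"), c("6", "7")))
upper <- matrix(c(1.53, 2.05, 0.93, 0.95), nrow = 2,
                dimnames = list(c("original", "replicate.1"), c("6", "7")))
hint <- structure(c(0.04, 2.72),
                  star.interval = c(0.71, 2.05),
                  bounds = list(lower = lower, upper = upper))

star <- attr(hint, "star.interval")  # interval that any feasible bounds must cover
b <- attr(hint, "bounds")            # per-replicate, per-domain ratio ranges

# Sub-tasks (replicate x domain) that a candidate bounds value fails to cover
candidate <- c(0.72, 1.89)
lower_violations <- which(b$lower < candidate[1], arr.ind = TRUE)
upper_violations <- which(b$upper > candidate[2], arr.ind = TRUE)
```

Each row of lower_violations or upper_violations gives the row (replicate) and column (domain) index of a sub-task for which the candidate interval is too narrow.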
Assessing the feasibility of an arbitrary calibration problem is not an easy task. The problem is even more difficult whenever additional "range restrictions" are imposed. Indeed, even if one assumes that the calibration constraints define a consistent system, one also has to choose the bounds
such that the feasible region is non-empty.
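For reference, the range-restricted problem can be written compactly as follows. The notation is an assumption borrowed from the standard calibration literature (e.g. Deville et al. 1993) and does not appear elsewhere on this page: \(d_k\) are the initial weights of the sampled units \(k \in s\), \(g_k\) the g-weights, \(\mathbf{x}_k\) the auxiliary variables, \(\mathbf{t}_x\) the known totals, and \([L,U]\) the interval passed as bounds.

\[
\sum_{k \in s} d_k \, g_k \, \mathbf{x}_k = \mathbf{t}_x ,
\qquad
L \le g_k \le U \quad \text{for all } k \in s .
\]

The problem is feasible when at least one set of g-weights satisfies the calibration constraints and the range restrictions at the same time.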
One can argue that there must exist a minimum-length interval \(I=[L,U]\) such that, if it is covered by bounds
, the specified calibration problem is feasible. Unfortunately, computing that minimum-length interval \(I\) exactly would require solving a large linear programming problem [Vanderhoeft 2001]. As an alternative, a trial-and-error procedure has frequently been proposed [Deville et al. 1993; Sautory 1993]: (i) start with a very large interval bounds.0
; (ii) if convergence is achieved, shrink it so as to obtain a new interval bounds.1
; (iii) repeat until you get a sufficiently tight feasible interval bounds.n
. The drawback is that this procedure can cost a lot of computer time since, for each choice of the bounds
, the full calibration problem has to be solved.
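The trial-and-error procedure can be sketched as a simple loop. This is only a toy illustration: converges() is a hypothetical stand-in for a full calibration run (in practice, one call to kottcalibrate per candidate interval), and shrinking the interval towards 1 by a fixed factor is an assumed rule, not one prescribed by the cited papers.

```r
# Hypothetical stand-in for a full calibration run: TRUE when calibration
# would converge with the given bounds. In this toy version, convergence
# simply requires covering a "true" minimal interval.
converges <- function(bounds, minimal = c(0.8, 1.3)) {
  bounds[1] <= minimal[1] && bounds[2] >= minimal[2]
}

# Shrink a feasible starting interval until the next step would fail.
shrink_bounds <- function(start, converges, factor = 0.8, max_iter = 50) {
  current <- start
  if (!converges(current)) {
    stop("calibration does not converge with the starting bounds")
  }
  for (i in seq_len(max_iter)) {
    # Assumed shrinking rule: pull both endpoints towards 1
    candidate <- 1 + (current - 1) * factor
    if (!converges(candidate)) break  # too tight: keep the last feasible interval
    current <- candidate
  }
  current
}

tight <- shrink_bounds(c(0.01, 10), converges)
```

Every call to converges() corresponds to solving the whole calibration problem once, which is where the computing cost of this approach comes from.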
By contrast, it is rather easy to find a specific interval \(I^*=[L^*,U^*]\) such that, if it is not covered by bounds
, the current calibration problem is surely unfeasible. This means that any feasible bounds
value must necessarily contain the \(I^*\) interval. The function bounds.hint
: (i) first identifies such an \(I^*\) interval (by computing the range of the ratios between known population totals and the corresponding direct Horvitz-Thompson estimates), (ii) then builds a new interval \(I^{sugg}\) with the same midpoint and double the length. The latter is the suggested value for the bounds
argument of kottcalibrate
. The return value of bounds.hint
should be understood as a useful starting guess for bounds
, even though there is no guarantee that the calibration algorithm will actually converge.
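The two steps can be reproduced by hand on toy numbers. The sketch below is not the package's code: hint_sketch() is a hypothetical helper, the totals and estimates are invented, and a single set of ratios is used. The intuition, assuming non-negative auxiliary variables, is that weights adjusted by factors within \([L,U]\) can only rescale a Horvitz-Thompson estimate by a factor between \(L\) and \(U\), so every ratio of known total to estimate must lie inside the bounds.

```r
# Hypothetical helper (not part of the package): derive I* and I^sugg
# from known totals and the corresponding direct HT estimates.
hint_sketch <- function(totals, ht_estimates) {
  ratios <- totals / ht_estimates  # known total / HT estimate, per variable
  star <- range(ratios)            # I* = [min ratio, max ratio]
  midpoint <- mean(star)
  len <- diff(star)
  # Same midpoint, double length: half-width becomes the full length of I*
  list(star = star, sugg = c(midpoint - len, midpoint + len))
}

# Invented numbers, for illustration only
totals <- c(x1 = 1200, x2 = 830, x3 = 455)
ht_estimates <- c(x1 = 1000, x2 = 1000, x3 = 500)

res <- hint_sketch(totals, ht_estimates)
res$star  # 0.83 1.20
res$sugg  # 0.645 1.385
```

As the output in 'Examples' suggests, the actual function tracks these ratio ranges separately for each replicate and each calibration domain before taking the overall range.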
Vanderhoeft, C. (2001) "Generalized Calibration at Statistics Belgium", Statistics Belgium Working Paper n. 3, http://www.statbel.fgov.be/studies/paper03_en.asp.
Deville, J.C., Sarndal, C.E. and Sautory, O. (1993) "Generalized Raking Procedures in Survey Sampling", Journal of the American Statistical Association, Vol. 88, No. 423, pp.1013-1020.
Sautory, O. (1993) "La macro CALMAR: Redressement d'un Echantillon par Calage sur Marges", Document de travail de la Direction des Statistiques Demographiques et Sociales, no. F9310.
kottcalibrate
for calibrating replicate weights, pop.template
for constructing known totals data frames in compliance with the standard required by kottcalibrate
, population.check
to check that the known totals data frame satisfies that standard and g.range
to compute the range of the obtained g-weights.
# Load sample data (the only reason for fixing
# the RNG seed is to achieve reproducible examples)
data(data.examples)
set.seed(123)

# Creation of the object to be calibrated:
kdes<-kottdesign(data=example,ids=~towcod+famcod,strata=~SUPERSTRATUM,
                 weights=~weight,nrg=15)

# Calibration (global solution) on the joint distribution
# of sex and marstat (totals in pop03). Get a hint for feasible bounds:
hint<-bounds.hint(kdes,pop03,~marstat:sex-1)
#>
#> A starting suggestion: try to calibrate with bounds=c(0.899, 1.129)
#>
#> Remark: this is just a hint, not an exact result
#> Feasible bounds for calibration problem must cover the interval [0.956, 1.071]
#>

# Let's first verify if calibration converges with the suggested
# value for the bounds argument (i.e. c(0.899, 1.129) ):
kdescal03<-kottcalibrate(deskott=kdes,df.population=pop03,
                         calmodel=~marstat:sex-1,calfun="logit",bounds=hint)

# Now let's verify that calibration fails, if bounds don't cover
# the interval [0.956, 1.071]:
# NOT RUN {
kdescal03<-kottcalibrate(deskott=kdes,df.population=pop03,
                         calmodel=~marstat:sex-1,calfun="logit",bounds=c(0.95, 1.03))
# }

# Calibration (iterative solution) on the totals for the quantitative
# variables x1, x2 and x3 in the subpopulations defined by the
# regcod variable (totals in pop04p): Get a hint for feasible bounds:
hint<-bounds.hint(kdes,pop04p,~x1+x2+x3-1,~regcod)
#>
#> A starting suggestion: try to calibrate with bounds=c(0.038, 2.72)
#>
#> Remark: this is just a hint, not an exact result
#> Feasible bounds for calibration problem must cover the interval [0.709, 2.049]
#>

# Let's verify if calibration converges with the suggested
# value for the bounds argument (i.e. c(0.038, 2.72) ):
kdescal04p<-kottcalibrate(deskott=kdes,df.population=pop04p,
                          calmodel=~x1+x2+x3-1,partition=~regcod,calfun="logit",
                          bounds=hint,aggregate.stage=2)

# Now let's verify that calibration fails, if bounds don't cover
# the interval [0.709, 2.049]:
# NOT RUN {
kdescal04p<-kottcalibrate(deskott=kdes,df.population=pop04p,
                          calmodel=~x1+x2+x3-1,partition=~regcod,calfun="logit",
                          bounds=c(0.71,1.89),aggregate.stage=2)
# }

# By analysing kottcal.status one can see which sub-task (replicate
# and regcod) made calibration fail. The failing call above is not run
# here, so the status printed below refers to the last successful call
# (bounds=hint) and all return codes are 0:
kottcal.status
#> $call
#> kottcalibrate(deskott = kdes, df.population = pop04p, calmodel = ~x1 +
#>     x2 + x3 - 1, partition = ~regcod, calfun = "logit", bounds = hint,
#>     aggregate.stage = 2)
#>
#> $return.code
#>              6 7 10
#> original     0 0  0
#> replicate.1  0 0  0
#> replicate.2  0 0  0
#> replicate.3  0 0  0
#> replicate.4  0 0  0
#> replicate.5  0 0  0
#> replicate.6  0 0  0
#> replicate.7  0 0  0
#> replicate.8  0 0  0
#> replicate.9  0 0  0
#> replicate.10 0 0  0
#> replicate.11 0 0  0
#> replicate.12 0 0  0
#> replicate.13 0 0  0
#> replicate.14 0 0  0
#> replicate.15 0 0  0
#>

# a failure is easily explained by inspecting the "bounds"
# attribute of the bounds.hint output object:
hint
#> [1] 0.038 2.720
#> attr(,"star.interval")
#> [1] 0.7085247 2.0491790
#> attr(,"bounds")
#> attr(,"bounds")$call
#> bounds.hint(kdes, pop04p, ~x1 + x2 + x3 - 1, ~regcod)
#>
#> attr(,"bounds")$lower
#>                      6         7        10
#> original     0.8045835 0.7735073 0.8987247
#> replicate.1  0.7443905 0.7530853 0.8052112
#> replicate.2  0.8692697 0.7212123 0.9761906
#> replicate.3  0.8202755 0.7743616 0.8669947
#> replicate.4  0.8175836 0.7633970 0.8639573
#> replicate.5  0.7598176 0.7085247 0.9005086
#> replicate.6  0.8380082 0.7850858 0.9016069
#> replicate.7  0.7831603 0.7501212 0.8892474
#> replicate.8  0.7577942 0.7527572 0.9105671
#> replicate.9  0.7803752 0.7705280 0.9600142
#> replicate.10 0.8982550 0.7813897 0.8809964
#> replicate.11 0.7340459 0.7687987 0.8885818
#> replicate.12 0.9029446 0.8027554 0.9701130
#> replicate.13 0.8579082 0.7359546 0.8810956
#> replicate.14 0.7635730 0.7884327 0.8419929
#> replicate.15 0.7915788 0.7716666 0.9854769
#> all          0.7340459 0.7085247 0.8052112
#>
#> attr(,"bounds")$upper
#>                     6         7       10
#> original     1.534247 0.9326981 1.297280
#> replicate.1  1.549311 0.9464416 1.308133
#> replicate.2  1.526246 1.0000166 1.249711
#> replicate.3  1.585542 0.8927598 1.272591
#> replicate.4  1.558306 0.9737890 1.269757
#> replicate.5  1.481951 0.8543082 1.273368
#> replicate.6  1.490087 0.9053758 1.370347
#> replicate.7  2.049179 0.9615852 1.307158
#> replicate.8  1.299465 0.8923319 1.303436
#> replicate.9  1.356716 0.9831481 1.251063
#> replicate.10 1.972342 0.9286247 1.322551
#> replicate.11 1.457476 0.8923319 1.307297
#> replicate.12 1.498696 1.1089917 1.330477
#> replicate.13 1.497552 0.8899994 1.387637
#> replicate.14 1.497552 0.9740486 1.251564
#> replicate.15 1.497552 0.8476070 1.281982
#> all          2.049179 1.1089917 1.387637
#>
#> attr(,"class")
#> [1] "bounds.hint" "numeric"

# indeed, with the values printed above, the specified upper bound (1.89)
# is too low for regcod 6 in replicate.7 (2.049) and replicate.10 (1.972)