sbs.Rd
The sbs data frame stores artificial sbs-like sampling data, while sbs.frame is the artificial sampling frame from which the sbs units were drawn. Together they make it possible to run the R code in the ‘Examples’ sections of the ReGenesees package help pages.
data(sbs)
The sbs data frame mimics data observed in a Structural Business Statistics survey, under a one-stage stratified unit sampling design. The sample is made up of 6909 units, for which the following 22 variables were observed:
id
Identifier of the sampling units (enterprises), numeric
public
Does the enterprise belong to the Public Sector? factor with levels 0 (No) and 1 (Yes)
emp.num
Number of employees, numeric
emp.cl
Number of employees classified into 5 categories, factor with levels [6,9] (9,19] (19,49] (49,99] (99,Inf] (note that small enterprises with fewer than 6 employees fell outside the scope of the survey)
nace5
Economic Activity code with 5 digits, factor with 596 levels
nace2
Economic Activity code with 2 digits, factor with 57 levels
area
Territorial Division, factor with 24 levels
cens
Flag identifying statistical units to be censused (hence defining take-all strata), factor with levels 0 (No) and 1 (Yes)
region
Macroregion, factor with levels North, Center, South
va.cl
Class of Value Added, factor with 27 levels
va
Value Added, numeric (contains NAs)
dom1
A planned estimation domain, factor with 261 levels (dom1 crosses nace2 and emp.cl)
nace.macro
Economic Activity Macrosector, factor with levels Agriculture, Industry, Commerce, Services
dom2
A planned estimation domain, factor with 12 levels (dom2 crosses nace.macro and region)
strata
Stratification Variable, a factor with 664 levels (obtained by crossing variables region, nace2, emp.cl and cens)
va.imp1
Value Added Imputed1, numeric (NAs were replaced with average values computed inside imputation strata obtained by crossing region, nace.macro and emp.cl)
va.imp2
Value Added Imputed2, numeric (NAs were replaced with median values computed inside imputation strata obtained by crossing region, nace.macro and emp.cl)
y
A numeric variable correlated with va
weight
Direct weights, numeric
fpc
Finite Population Corrections (given as sampling fractions inside strata), numeric
ent
Convenience numeric variable identically equal to 1 (sometimes useful, e.g. to estimate the total number of enterprises)
dom3
An unplanned estimation domain, factor with 4 levels
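Several variables in the list above are derived from others: emp.cl classifies emp.num, dom1 and strata cross existing factors, and va.imp1 and va.imp2 fill in missing va values inside imputation cells. The following base R sketch shows one way such variables could be built. It is only an illustration and not necessarily how the data set was actually produced. The toy data frame copies a few columns from the first six sample rows printed in the Examples output, while the va and cell vectors in the imputation part are hypothetical values invented for this sketch.

```r
# A few columns of the first six sample rows, as printed in the Examples output.
toy <- data.frame(
  emp.num = c(38, 30, 25, 22, 29, 50),
  nace2   = factor(rep(1, 6)),
  region  = factor(rep("Center", 6), levels = c("North", "Center", "South")),
  cens    = factor(rep(0, 6), levels = c(0, 1))
)

# Size classes: intervals closed on the right, lowest class closed at 6.
toy$emp.cl <- cut(toy$emp.num,
                  breaks = c(6, 9, 19, 49, 99, Inf),
                  include.lowest = TRUE)

# Crossing factors: interaction() joins the level labels with ".".
toy$dom1   <- interaction(toy$nace2, toy$emp.cl, drop = TRUE)
toy$strata <- interaction(toy$region, toy$nace2, toy$emp.cl, toy$cens,
                          drop = TRUE)
print(toy[, c("emp.num", "emp.cl", "dom1", "strata")])

# Imputation inside cells (hypothetical values, not taken from sbs).
va   <- c(10, 20, 90, NA, 5, NA, 7)
cell <- c("a", "a", "a", "a", "b", "b", "b")

# Replace the NAs of x with a summary f() of its observed values.
fill <- function(x, f) replace(x, is.na(x), f(x, na.rm = TRUE))

va.mean   <- ave(va, cell, FUN = function(x) fill(x, mean))    # cell a: 40
va.median <- ave(va, cell, FUN = function(x) fill(x, median))  # cell a: 20
print(cbind(va, va.mean, va.median))
```

On these six rows the resulting labels agree with the emp.cl, dom1 and strata values shown in the Examples output, which suggests, but does not prove, that the same kind of construction was used.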
The sbs.frame sampling frame (from which sbs units have been drawn) contains 17318 units.
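The weight, fpc and ent variables can be related to the size of the frame with a small calculation. The sketch below uses only the weight and fpc values of the first six sample rows printed in the Examples output, so its result covers those six rows and not the whole sample. With the full data loaded, the same expression sum(sbs$weight * sbs$ent) could be compared with nrow(sbs.frame); whether the two agree exactly is not stated here.

```r
# weight and fpc of the first six sample rows, as printed in the Examples output.
toy <- data.frame(
  weight = c(1.40, 1.40, 1.40, 1.40, 1.40, 1.25),
  fpc    = c(0.7142857, 0.7142857, 0.7142857, 0.7142857, 0.7142857, 0.8),
  ent    = 1
)

# Weighted total of ent: the number of frame units these six rows stand for.
n.hat <- sum(toy$weight * toy$ent)
print(n.hat)

# In these rows the direct weight is the reciprocal of the sampling fraction
# (the tolerance allows for the rounding of the printed fpc values).
inverse.ok <- isTRUE(all.equal(toy$weight, 1 / toy$fpc, tolerance = 1e-6))
print(inverse.ok)
```

Multiplying by ent looks redundant here, but keeping a column of ones makes "number of enterprises" an ordinary variable that can be passed to any function that estimates totals.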
#>      id public emp.num  emp.cl nace5 nace2 area cens region va.cl     va
#> 1  1268      0      38 (19,49]  1210     1   32    0 Center    22 5500.0
#> 2  1358      0      30 (19,49]  1240     1   32    0 Center    19 1500.0
#> 3 13819      0      25 (19,49]  1131     1   41    0 Center    16  400.0
#> 4 15749      0      22 (19,49]  1111     1   43    0 Center     1    0.0
#> 5  8431      0      29 (19,49]  1121     1   31    0 Center     2    0.5
#> 6  7572      0      50 (49,99]  1132     1   41    0 Center    11   60.0
#>        dom1  nace.macro               dom2             strata va.imp1 va.imp2
#> 1 1.(19,49] Agriculture Agriculture.Center Center.1.(19,49].0  5500.0  5500.0
#> 2 1.(19,49] Agriculture Agriculture.Center Center.1.(19,49].0  1500.0  1500.0
#> 3 1.(19,49] Agriculture Agriculture.Center Center.1.(19,49].0   400.0   400.0
#> 4 1.(19,49] Agriculture Agriculture.Center Center.1.(19,49].0     0.0     0.0
#> 5 1.(19,49] Agriculture Agriculture.Center Center.1.(19,49].0     0.5     0.5
#> 6 1.(49,99] Agriculture Agriculture.Center Center.1.(49,99].0    60.0    60.0
#>           y weight       fpc ent dom3
#> 1 1636.6075   1.40 0.7142857   1    C
#> 2 1002.4378   1.40 0.7142857   1    C
#> 3  444.4637   1.40 0.7142857   1    D
#> 4  252.1287   1.40 0.7142857   1    D
#> 5  466.5918   1.40 0.7142857   1    D
#> 6  742.9053   1.25 0.8000000   1    B
str(sbs)
#> 'data.frame': 6909 obs. of 22 variables:
#>  $ id        : int  1268 1358 13819 15749 8431 7572 9701 9661 11899 15136 ...
#>  $ public    : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
#>  $ emp.num   : int  38 30 25 22 29 50 67 55 52 12 ...
#>  $ emp.cl    : Factor w/ 5 levels "[6,9]","(9,19]",..: 3 3 3 3 3 4 4 4 4 2 ...
#>  $ nace5     : Factor w/ 504 levels "1000","1100",..: 13 17 8 3 5 9 17 6 8 3 ...
#>  $ nace2     : Factor w/ 57 levels "1","2","5","11",..: 1 1 1 1 1 1 1 1 1 1 ...
#>  $ area      : Factor w/ 24 levels "11","12","13",..: 13 13 16 18 12 16 14 13 16 18 ...
#>  $ cens      : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
#>  $ region    : Factor w/ 3 levels "North","Center",..: 2 2 2 2 2 2 2 2 2 2 ...
#>  $ va.cl     : Factor w/ 27 levels "1","2","3","4",..: 22 19 16 1 2 11 23 16 16 1 ...
#>  $ va        : num  5500 1500 400 0 0.5 60 7000 400 400 0 ...
#>  $ dom1      : Factor w/ 261 levels "1.(19,49]","1.(49,99]",..: 1 1 1 1 1 2 2 2 2 3 ...
#>  $ nace.macro: Factor w/ 4 levels "Agriculture",..: 1 1 1 1 1 1 1 1 1 1 ...
#>  $ dom2      : Factor w/ 12 levels "Agriculture.Center",..: 1 1 1 1 1 1 1 1 1 1 ...
#>  $ strata    : Factor w/ 664 levels "Center.1.(19,49].0",..: 1 1 1 1 1 2 2 2 2 3 ...
#>  $ va.imp1   : num  5500 1500 400 0 0.5 60 7000 400 400 0 ...
#>  $ va.imp2   : num  5500 1500 400 0 0.5 60 7000 400 400 0 ...
#>  $ y         : num  1637 1002 444 252 467 ...
#>  $ weight    : num  1.4 1.4 1.4 1.4 1.4 1.25 1.25 1.25 1.25 1.5 ...
#>  $ fpc       : num  0.714 0.714 0.714 0.714 0.714 ...
#>  $ ent       : num  1 1 1 1 1 1 1 1 1 1 ...
#>  $ dom3      : Factor w/ 4 levels "A","B","C","D": 3 3 4 4 4 2 3 2 2 4 ...
str(sbs.frame)
#> 'data.frame': 17318 obs. of 20 variables:
#>  $ id        : int  1 2 3 4 5 6 7 8 9 10 ...
#>  $ public    : Factor w/ 2 levels "0","1": 1 1 1 1 2 1 2 1 1 1 ...
#>  $ emp.num   : int  21 35 20 18 689 12 172 51 14 9 ...
#>  $ emp.cl    : Factor w/ 5 levels "[6,9]","(9,19]",..: 3 3 3 2 5 2 5 4 2 1 ...
#>  $ nace5     : Factor w/ 596 levels "1000","1100",..: 388 51 127 226 497 480 497 478 346 480 ...
#>  $ nace2     : Factor w/ 57 levels "1","2","5","11",..: 34 7 11 20 45 40 45 40 33 40 ...
#>  $ area      : Factor w/ 24 levels "11","12","13",..: 1 1 3 1 2 1 1 1 1 1 ...
#>  $ cens      : Factor w/ 2 levels "0","1": 1 1 1 1 2 1 2 1 1 1 ...
#>  $ region    : Factor w/ 3 levels "North","Center",..: 1 1 1 1 1 1 1 1 1 1 ...
#>  $ va.cl     : Factor w/ 27 levels "1","2","3","4",..: 21 23 19 17 27 NA 22 NA NA NA ...
#>  $ va        : num  3500 7000 1500 600 70000 NA 5500 NA NA NA ...
#>  $ dom1      : Factor w/ 261 levels "1.(19,49]","1.(49,99]",..: 154 18 37 85 208 182 208 181 151 184 ...
#>  $ nace.macro: Factor w/ 4 levels "Agriculture",..: 3 2 2 2 4 4 4 4 3 4 ...
#>  $ dom2      : Factor w/ 12 levels "Agriculture.Center",..: 5 8 8 8 11 11 11 11 5 11 ...
#>  $ strata    : Factor w/ 664 levels "Center.1.(19,49].0",..: 344 210 228 277 406 373 406 372 341 375 ...
#>  $ va.imp1   : num  3500 7000 1500 600 70000 ...
#>  $ va.imp2   : num  3500 7000 1500 600 70000 3500 5500 750 750 400 ...
#>  $ y         : num  1374 2074 457 455 13584 ...
#>  $ ent       : num  1 1 1 1 1 1 1 1 1 1 ...
#>  $ dom3      : Factor w/ 4 levels "A","B","C","D": 4 3 1 4 1 2 1 2 3 3 ...