What is ReGenesees

ReGenesees (R Evolved Generalized Software for Sampling Estimates and Errors in Surveys) is an R package for design-based and model-assisted analysis of complex sample surveys.

It is the outcome of a long-term research and development project aimed at defining a new standard for calibration, estimation and sampling error assessment, to be adopted in all large-scale sample surveys routinely carried out by Istat (the Italian National Institute of Statistics).

Installation

You can install the development version of ReGenesees from GitHub as follows:

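# devtools provides install_github(); skip this line if it is already installed
install.packages("devtools")
# Build and install the development version from the GitHub repository
devtools::install_github("DiegoZardetto/ReGenesees")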

The latest released version of ReGenesees can be downloaded from the Istat website or from the European Commission platform Joinup (where older versions are available too).
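
Whichever route you choose, a quick check can confirm that the package is visible to your R session. The snippet below is only a minimal sketch: it relies on base R utilities (requireNamespace() and packageVersion()) rather than on any ReGenesees function, and the variable name regenesees_available is purely illustrative.

```r
# Check whether ReGenesees can be found in the current library paths.
# requireNamespace() returns TRUE/FALSE instead of raising an error.
regenesees_available <- requireNamespace("ReGenesees", quietly = TRUE)

if (regenesees_available) {
  # Attach the package and report which version was installed
  library(ReGenesees)
  print(packageVersion("ReGenesees"))
} else {
  message("ReGenesees is not installed yet: see the installation steps above.")
}
```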

Main Statistical Functions

  • Complex Sampling Designs

    • Multistage, stratified and clustered sampling designs
    • Sampling with equal or unequal probabilities, with or without replacement
    • “Mixed” sampling designs (i.e. with both Self-Representing and Non-Self-Representing strata)
  • Calibration

    • Global and partitioned (for factorizable calibration models)
    • Unit-level and cluster-level weights adjustment
    • Homoscedastic and heteroscedastic models
    • Linear, raking and logit distance functions
    • Bounded and unbounded weights adjustment
    • Multi-step calibration
    • Calibration on multiple regression coefficients
    • Consistent trimming of calibration weights
  • Basic Estimators

    • Horvitz-Thompson
    • Calibration Estimators
  • Variance Estimation

    • Multistage formulation (via Bellhouse's recursive algorithm)
    • Ultimate Cluster approximation
    • Collapsed strata technique for handling lonely PSUs
    • Taylor-linearization of nonlinear “smooth” estimators
    • Generalized Variance Functions (GVF) method
  • Estimates and Sampling Errors (standard error, variance, coefficient of variation, confidence interval, design effect) for:

    • Totals
    • Means
    • Absolute and relative frequency distributions (marginal, conditional and joint)
    • Ratios between totals
    • Shares and ratios between shares
    • Multiple regression coefficients
    • Quantiles
    • Population variance and standard deviation of numeric variables
    • Measures of Change derived from two not necessarily independent samples
  • Estimates and Sampling Errors for Complex Estimators

    • Handles arbitrary differentiable functions of Horvitz-Thompson or Calibration estimators
    • Complex Estimators can be freely defined by the user
    • Automated Taylor-linearization
    • Design covariance and correlation between Complex Estimators
  • Estimates and Sampling Errors for Subpopulations (Domains)

    • All the analyses above can be carried out for arbitrary domains
  • Sample Size Requirements and Power Calculations for:

    • Estimators of proportions and comparisons between two proportions
    • Estimators of means and comparisons between two means
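
To make a few of the ideas listed above concrete (design weights, the Horvitz-Thompson estimator, its sampling variance, and linear calibration), here is a small base R sketch. It deliberately does not call any ReGenesees function, so it should not be read as the package's API or as its internal implementation. The toy sample, the stratum sizes N_h and the known population totals are all made-up assumptions, and the design is assumed to be stratified simple random sampling without replacement.

```r
# Toy stratified sample (all values are invented for illustration)
samp <- data.frame(
  stratum = c("A", "A", "A", "B", "B"),
  y       = c(10, 12, 8, 20, 30),  # study variable
  x       = c(1, 2, 1, 3, 4),      # auxiliary variable
  stringsAsFactors = FALSE
)
N_h <- c(A = 30, B = 40)                          # assumed stratum population sizes
n_h <- table(samp$stratum)                        # stratum sample sizes
n   <- as.numeric(n_h[names(N_h)])

# Design weight = inverse inclusion probability = N_h / n_h under stratified SRS
samp$d <- as.numeric(N_h[samp$stratum] / n_h[samp$stratum])

# Horvitz-Thompson estimate of the total of y
ht_total <- sum(samp$d * samp$y)

# Variance of the HT total under stratified SRS without replacement:
# sum over strata of N_h^2 * (1 - n_h / N_h) * s_h^2 / n_h
s2_h   <- tapply(samp$y, samp$stratum, var)
var_ht <- sum(N_h^2 * (1 - n / N_h) * s2_h[names(N_h)] / n)
se_ht  <- sqrt(var_ht)

# Linear calibration: adjust the weights so that they reproduce assumed known
# population totals of the auxiliary variables (population size and total of x)
X      <- cbind(1, samp$x)                        # calibration model ~ 1 + x
totals <- c(70, 200)                              # assumed known totals
lambda <- solve(crossprod(X, samp$d * X), totals - crossprod(X, samp$d))
samp$w <- samp$d * (1 + drop(X %*% lambda))       # calibrated weights

# Calibration estimate of the total of y
cal_total <- sum(samp$w * samp$y)

print(c(HT = ht_total, SE = se_ht, CAL = cal_total))
print(colSums(samp$w * X))                        # matches the known totals
```

With the linear distance function the calibrated weights have the closed form used here; the raking and logit distance functions, and bounded adjustments, generally require an iterative solution instead.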

Citation

Zardetto, D. (2015). “ReGenesees: An Advanced R System for Calibration, Estimation and Sampling Error Assessment in Complex Sample Surveys”. Journal of Official Statistics, 31(2), 177-203. https://sciendo.com/article/10.1515/jos-2015-0013.

Graphical User Interface

A companion R package named ReGenesees.GUI is also available on GitHub. It provides a user-friendly point-and-click graphical interface for ReGenesees.

Sponsors

The ReGenesees project was conceived at Istat in late 2006, and Istat has been ReGenesees’ primary sponsor ever since.

Since April 2021, development, maintenance and support of ReGenesees have also been actively sponsored by the World Bank.

Disclaimer

If you come across malfunctions or flaws in this website, please bear in mind that it has been automatically generated from the sources of the ReGenesees package and has no human maintainers.

In particular, the printed output in the ‘Examples’ sections of some functions - e.g. svystatL() and, through it, e.calibrate() and Corr() - is known to mistakenly show error messages that do not actually exist in the package.