| Title: | Optimum Sample Allocation in Stratified Sampling |
|---|---|
| Description: | Provides exact analytical algorithms for computing optimum sample allocations in stratified sampling. Supports classical Neyman-Tschuprow allocation, minimum-cost allocation under a variance constraint, and multi-domain allocation with controlled precision. Handles lower and upper bounds, cost constraints, and multiple domains. Includes helper functions for variance computation, allocation summaries, rounding, and example datasets for testing and benchmarking. |
| Authors: | Wojciech Wójciak [aut, cre], Jacek Wesołowski [sad], Robert Wieczorkowski [ctb], David Munoz Tord [ctb] |
| Maintainer: | Wojciech Wójciak <[email protected]> |
| License: | GPL-2 |
| Version: | 3.0.1 |
| Built: | 2026-06-05 11:26:19 UTC |
| Source: | https://github.com/wwojciech/stratallo |
Functions implementing selected optimum allocation algorithms for solving the optimum sample allocation problem, formulated as follows:
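Minimize
$$f(x_1, \ldots, x_H) = \sum_{h=1}^H \frac{A_h^2}{x_h}$$
over $\mathbb{R}_+^H$, subject to
$$\sum_{h=1}^H c_h x_h = c,$$
and either
$$x_h \le M_h, \quad h = 1, \ldots, H,$$
or
$$x_h \ge m_h, \quad h = 1, \ldots, H,$$
where $c > 0$, $c_h > 0$, $A_h > 0$, $m_h > 0$, $M_h > 0$, $h = 1, \ldots, H$,
are given numbers.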
The following is a list of all available algorithms along with the functions that implement them:
RNA - rna(),
LRNA - rna(),
SGA - sga(),
SGAPLUS - sgaplus(),
COMA - coma().
See the documentation of a specific function for details.
The inequality constraints are optional. The user can choose whether and how they are imposed in the optimization problem, depending on the chosen algorithm:
Lower bounds $m_h$ can be specified only for the LRNA algorithm (by setting cmp = .Primitive("<=") for rna()).
Upper bounds $M_h$ are supported by all other algorithms.
Simultaneous constraints (both lower and upper bounds) are not supported by these functions.
The costs $c_h$ of surveying one element in stratum $h$ can be specified by the user only for the RNA and LRNA algorithms.
For the remaining algorithms, these costs are fixed at 1, i.e., $c_h = 1$, $h = 1, \ldots, H$.
    rna(
      total_cost,
      A,
      bounds = NULL,
      unit_costs = 1,
      cmp = .Primitive(">="),
      details = FALSE
    )

    sga(total_cost, A, M)

    sgaplus(total_cost, A, M)

    coma(total_cost, A, M)
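| Argument | Description |
|---|---|
| total_cost | Total cost of the survey. |
| A | Population constants, one per stratum. |
| bounds | Optional lower or upper bounds on the sample sizes in strata. See also cmp. |
| unit_costs | Costs of surveying one element in each stratum. |
| cmp | Comparison operator that determines how bounds are interpreted: .Primitive(">=") (default) for upper bounds (RNA), .Primitive("<=") for lower bounds (LRNA). The value of this argument has no effect if bounds is NULL. |
| details | Whether to return a list containing the optimum allocations and strata assignments instead of the allocation vector alone. |
| M | Upper bounds on the sample sizes in strata. |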
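If no inequality constraints are imposed, the allocation is given by the Neyman allocation:
$$x_h = \frac{A_h}{\sqrt{c_h}} \cdot \frac{c}{\sum_{i=1}^H A_i \sqrt{c_i}}, \quad h = 1, \ldots, H,$$
where $c$ denotes the total cost and $c_h$ the cost of surveying one element in stratum $h$.
For the stratified $\pi$-estimator of the population total under stratified simple random sampling without replacement design, the parameters of the objective function are
$$A_h = N_h S_h, \quad h = 1, \ldots, H,$$
where $N_h$ denotes the size of stratum $h$ and $S_h$ is the standard deviation of the study variable in stratum $h$.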
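The unconstrained case can be illustrated with a few lines of base R. This is only a sketch, not the package's implementation: the helper name neyman_cost_sketch is hypothetical, the stratum sizes and standard deviations are made up, and the closed form used is the Lagrange-multiplier solution of minimizing sum(A^2 / x) subject to sum(unit_costs * x) == total_cost.

```r
# Hypothetical helper (not part of stratallo): Neyman allocation under the
# cost constraint sum(unit_costs * x) == total_cost, with no bounds imposed.
neyman_cost_sketch <- function(total_cost, A, unit_costs = 1) {
  unit_costs <- rep_len(unit_costs, length(A))
  total_cost * (A / sqrt(unit_costs)) / sum(A * sqrt(unit_costs))
}

N <- c(300, 400, 500, 200)     # illustrative stratum sizes
S <- c(10, 10, 10, 10)         # illustrative standard deviations
A <- N * S                     # A_h = N_h * S_h
unit_costs <- c(1, 2, 1, 4)    # illustrative unit costs

x <- neyman_cost_sketch(total_cost = 190, A = A, unit_costs = unit_costs)
x
sum(unit_costs * x)            # the allocation spends the whole budget
```

With all unit costs equal to 1, the same helper reduces to the classical proportional-to-A_h allocation.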
A numeric vector of optimum sample allocations in strata. In the case
of rna() only, the return value may also be a list containing the
optimum allocations and strata assignments.
rna(): Implements the Recursive Neyman Algorithm (RNA), which handles one-sided upper-bound constraints, and its counterpart, the Lower Recursive Neyman Algorithm (LRNA), designed for the optimum allocation problem with one-sided lower-bound constraints. The RNA is described in Wesołowski et al. (2022), whereas the LRNA is introduced in Wójciak (2023).
sga(): The Stenger-Gabler algorithm (SGA), as proposed by Stenger and Gabler (2005) and described in Wesołowski et al. (2022). This algorithm solves the optimum allocation problem with one-sided upper-bound constraints. It assumes unit costs are constant and equal to 1, i.e., $c_h = 1$, $h = 1, \ldots, H$.
sgaplus(): A modified Stenger-Gabler-type algorithm, described in Wójciak (2019), implemented as the Sequential Allocation (version 1) algorithm. This algorithm solves the optimum allocation problem with one-sided upper-bound constraints. It assumes unit costs are constant and equal to 1, i.e., $c_h = 1$, $h = 1, \ldots, H$.
coma(): The Change of Monotonicity Algorithm (COMA), described in Wesołowski et al. (2022), solves the optimum allocation problem with one-sided upper-bound constraints. It assumes unit costs are constant and equal to 1, i.e., $c_h = 1$, $h = 1, \ldots, H$.
These functions are optimized for internal use and should typically not
be called directly by users. Use opt() or optcost() instead.
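To give a feel for the recursive Neyman idea with upper bounds, here is a minimal base-R sketch under unit costs. It is an illustration written for this page, not the code of rna(): the name rna_sketch is hypothetical, and the loop simply fixes every stratum whose Neyman allocation reaches its upper bound at that bound, then reallocates the remaining sample among the other strata.

```r
# Hypothetical illustration (not the package's rna()): recursive Neyman
# allocation with upper bounds M and unit costs equal to 1.
rna_sketch <- function(n, A, M) {
  x <- numeric(length(A))
  active <- rep(TRUE, length(A))
  repeat {
    # Neyman allocation of the remaining sample among active strata.
    x[active] <- n * A[active] / sum(A[active])
    over <- active & (x >= M)
    if (!any(over)) break
    # Fix violating strata at their upper bounds and shrink the problem.
    x[over] <- M[over]
    n <- n - sum(M[over])
    active <- active & !over
    if (!any(active)) break
  }
  x
}

A <- c(3000, 4000, 5000, 2000)
M <- c(100, 90, 70, 80)
x <- rna_sketch(n = 300, A = A, M = M)
x        # 84 90 70 56
sum(x)   # 300
```

In this run stratum 3 is fixed first, then stratum 2, and the remaining 140 units are split between strata 1 and 4 in proportion to A.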
Wójciak W (2023). “Another Solution for Some Optimum Allocation Problem.” Statistics in Transition new series, 24(5), 203-219. doi:10.59170/stattrans-2023-071.
Wesołowski J, Wieczorkowski R, Wójciak W (2022). “Optimality of the Recursive Neyman Allocation.” Journal of Survey Statistics and Methodology, 10(5), 1263-1275. ISSN 2325-0984. doi:10.1093/jssam/smab018.
Wójciak W (2019). Optimal Allocation in Stratified Sampling Schemes. Master's thesis, Warsaw University of Technology. http://home.elka.pw.edu.pl/~wwojciak/msc_wwojciech_optimum_alloc.pdf.
Stenger H, Gabler S (2005). “Combining random sampling and census strategies - Justification of inclusion probabilities equal to 1.” Metrika, 61(2), 137–156. doi:10.1007/s001840400328.
Särndal C, Swensson B, Wretman J (1992). Model Assisted Survey Sampling. Springer New York, NY. ISBN 978-0-387-40620-6.
    A <- c(3000, 4000, 5000, 2000)
    m <- c(50, 40, 10, 30) # lower bounds
    M <- c(100, 90, 70, 80) # upper bounds

    rna(total_cost = 190, A = A, bounds = M)
    rna(total_cost = 190, A = A, bounds = m, cmp = .Primitive("<="))
    rna(total_cost = 300, A = A, bounds = M)

    sga(total_cost = 190, A = A, M = M)
    sgaplus(total_cost = 190, A = A, M = M)
    coma(total_cost = 190, A = A, M = M)
A utility that returns a simple data.frame summarizing the allocation
returned by opt() or optcost().
    alloc_summary(x, A, m = NULL, M = NULL)
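| Argument | Description |
|---|---|
| x | Sample allocation, as returned by opt() or optcost(). |
| A | Population constants, one per stratum. |
| m | Optional lower bounds on the sample sizes in strata. |
| M | Optional upper bounds on the sample sizes in strata. |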
A data.frame with $H + 1$ rows and up to seven variables, where $H$ is the number of strata.
The first $H$ rows correspond to strata $h = 1, \ldots, H$, while the last row contains column totals (where applicable).
The columns include:
Population constant $A_h$.
Lower bound $m_h$ (if provided).
Upper bound $M_h$ (if provided).
The optimized sample size $x_h$.
Boolean indicator: $x_h = m_h$.
Boolean indicator: $x_h = M_h$.
Boolean indicator: $m_h < x_h < M_h$ (or simply the internal Neyman allocation if no bounds were violated).
    A <- c(3000, 4000, 5000, 2000)
    m <- c(100, 90, 70, 80)
    M <- c(200, 150, 300, 210)

    xopt_1 <- opt(n = 400, A, m)
    alloc_summary(xopt_1, A, m)

    xopt_2 <- opt(n = 540, A, m, M)
    alloc_summary(xopt_2, A, m, M)
Diagnostic tools to verify the objective function and constraint satisfaction for allocations computed by the rdca() algorithm.
    rdca_obj_cnstr(x, n, H_counts, N, S, rho2, J = integer(0))

    rdca_cnstr_check(x, n, H_counts, N, S, rho2, J = integer(0), tol_max = 0.1)

    rdca_optcond_sU(H_counts, S, rho, s, U, return_diff = FALSE)
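| Argument | Description |
|---|---|
| x | (numeric) Sample allocation to be checked. |
| n | Total sample size. |
| H_counts | Strata counts in each domain. |
| N | Stratum sizes. |
| S | Standard deviations of the study variable in strata. |
| rho2 | total^2 * kappa, where total are the totals in domains and kappa are the priority weights for domains. |
| J | Set of domains; see rdca(). |
| tol_max | Tolerance used when checking the equality constraints. |
| rho | total * sqrt(kappa). |
| s | The s element of the list returned by dca() with details = TRUE. |
| U | Set of take-max strata; see dca(). |
| return_diff | Whether to return, for each unblocked stratum, a numeric value measuring the optimality condition instead of a logical vector; this can be used to assess by how much the condition is satisfied or violated. |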
rdca_obj_cnstr(): Compute the value of the objective function and constraint functions for a
given allocation.
rdca_cnstr_check(): Check whether the equality and inequality constraints are satisfied
for a given allocation, within a specified tolerance.
The tolerance applies to equality constraints only.
rdca_optcond_sU(): Check the optimality condition related to .
Specifically, verifies whether
for all such that is not fully blocked
by .
    H_counts <- c(2, 2) # 2 domains with 2 strata each
    N <- c(140, 110, 135, 190)
    S <- sqrt(c(180, 20, 5, 4))
    total <- c(2, 3)
    kappa <- c(0.4, 0.6)
    rho <- total * sqrt(kappa)
    rho2 <- total^2 * kappa
    n <- 500
    (x <- dca(n, H_counts, N, S, rho, rho2))

    # internal functions (not exported) – examples skipped
    ## Not run:
    rdca_obj_cnstr(x, n, H_counts, N, S, rho2)
    rdca_obj_cnstr(x, n, H_counts, N, S, rho2, 2)
    rdca_obj_cnstr(x, n, H_counts, N, S, rho2, NULL)
    ## End(Not run)

    ## Not run:
    rdca_cnstr_check(x, n, H_counts, N, S, rho2)
    rdca_cnstr_check(x, n, H_counts, N, S, rho2, 1)
    rdca_cnstr_check(x, n, H_counts, N, S, rho2, 2)
    rdca_cnstr_check(x, n, H_counts, N, S, rho2, NULL)
    ## End(Not run)

    ## Not run:
    (n <- dca_nmax(H_counts, N, S) - 1)

    U <- 1
    (x <- dca(n, H_counts, N, S, rho, rho2, U = U, details = TRUE))
    rdca_optcond_sU(H_counts, S, rho, x$s, U) # TRUE

    U <- 2
    (x <- dca(n, H_counts, N, S, rho, rho2, U = U, details = TRUE))
    rdca_optcond_sU(H_counts, S, rho, x$s, U) # FALSE

    U <- 3
    (x <- dca(n, H_counts, N, S, rho, rho2, U = U, details = TRUE))
    rdca_optcond_sU(H_counts, S, rho, x$s, U) # TRUE

    U <- 1:2 # domain 2 blocked
    (x <- dca(n, H_counts, N, S, rho, rho2, U = U, details = TRUE))
    rdca_optcond_sU(H_counts, S, rho, x$s, U) # no unblocked strata in `U`
    ## End(Not run)
Example datasets containing artificial populations for testing and demonstrating optimum sample allocation algorithms.
    pop10s_bounds_ucost
    pop507s_ucost
    pop969s_ucost
    pop2d4s
    pop9d278s
pop10s_bounds_ucost: Population with 10 strata, lower and upper bounds on sample sizes, and associated surveying costs. A matrix with 10 rows and 5 columns:
stratum size.
standard deviation of study variable in the stratum.
lower bound for sample size in the stratum.
upper bound for sample size in the stratum.
cost of surveying one element in the stratum.
pop507s_ucost: Population with 507 strata and associated surveying costs. A matrix with 507 rows and 3 columns:
stratum size.
standard deviation of study variable in the stratum.
cost of surveying one element in the stratum.
pop969s_ucost: Population with 969 strata and associated surveying costs. A matrix with 969 rows and 3 columns:
stratum size.
standard deviation of study variable in the stratum.
cost of surveying one element in the stratum.
pop2d4s: Population with 2 domains and 4 strata. A list with the following elements:
strata counts in each domain.
stratum sizes.
standard deviations of study variable in strata.
totals in domains, i.e., the sum of the study variable values for population elements in each domain.
priority weights for domains.
total * sqrt(kappa).
total^2 * kappa.
See dca_nmax() or dca().
pop9d278s: Population with 9 domains and 278 strata. A list with the following elements:
strata counts in each domain.
stratum sizes.
standard deviations of study variable in strata.
totals in domains, i.e., the sum of the study variable values for population elements in each domain.
priority weights for domains.
total * sqrt(kappa).
total^2 * kappa.
See dca_nmax() or dca().
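The two derived quantities listed for the multi-domain datasets, total * sqrt(kappa) and total^2 * kappa, are related by a simple squaring. The base-R sketch below uses illustrative values for total and kappa (the same numbers as in the two-domain examples on this page); the variable names rho and rho2 follow the arguments of dca() and rdca().

```r
# Illustrative inputs for two domains.
total <- c(2, 3)       # totals of the study variable in each domain
kappa <- c(0.4, 0.6)   # priority weights for the domains

rho  <- total * sqrt(kappa)
rho2 <- total^2 * kappa

all.equal(rho^2, rho2)  # rho2 is the element-wise square of rho
```

This is why rdca() can default rho2 to rho^2 when only rho is supplied.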
Functions implementing the Domain-Controlled Allocation (DCA) algorithm described in Wesołowski (2019) and Wójciak (2026). The algorithm solves the following optimum allocation problem, formulated in mathematical optimization terms:
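Minimize
$$T$$
over $(T, \mathbf{x}) \in \mathbb{R} \times \mathbb{R}_+^{|\mathcal{H}|}$,
subject to
$$\sum_{h \in \mathcal{H}_d} \left( \frac{1}{x_{d,h}} - \frac{1}{N_{d,h}} \right) N_{d,h}^2 S_{d,h}^2 = T \rho_d^2, \quad d \in \mathcal{D},$$
$$\sum_{(d,h) \in \mathcal{H}} x_{d,h} = n,$$
where:
$\mathbf{x} = (x_{d,h},\ (d,h) \in \mathcal{H})$: the optimization variable,
$\mathcal{H}$: the set of domain-stratum indices,
$\mathcal{D}$: the set of domain indices,
$\mathcal{H}_d$: the set of strata indices in domain $d$,
$N_{d,h}$: size of stratum $(d,h)$,
$S_{d,h}$: standard deviation of the study variable in stratum $(d,h)$,
$\rho_d = t_d \sqrt{\kappa_d}$, where $t_d$ denotes the total in domain $d$, i.e., the sum of the values of the study variable for population elements in domain $d$, and $\kappa_d$ is a priority weight for domain $d$,
$n$: total sample size.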
    dca0(n, H_counts, N, S, rho, rho2, details = FALSE)

    dca(n, H_counts, N, S, rho, rho2, U = NULL, details = FALSE)

    dca_nmax(H_counts, N, S)
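| Argument | Description |
|---|---|
| n | Total sample size. |
| H_counts | Strata counts in each domain. |
| N | Stratum sizes. |
| S | Standard deviations of the study variable in strata. |
| rho | total * sqrt(kappa), where total are the totals in domains and kappa are the priority weights for domains. |
| rho2 | total^2 * kappa. |
| details | Whether to return a list with the optimal allocation and other internal details of the algorithm instead of the allocation alone. |
| U | Optional set of take-max strata. |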
For , the optimal value satisfies ,
where
See Proposition 2.1 in Wesołowski (2019) or
Wójciak (2026) for details.
The value is less than or equal to sum(N) and can be
computed with dca_nmax().
If details = FALSE, the optimal allocation is returned.
Otherwise, a list is returned containing the optimal allocation (element named x) along with other internal details of this algorithm.
In particular, the lambda element of the list corresponds to the optimal
.
dca0(): Domain-Controlled Allocation algorithm by Wesołowski (2019).
dca(): Domain-Controlled Allocation algorithm by Wesołowski (2019), optionally using a set of take-max strata as described in Wójciak (2026).
dca_nmax(): Computes the maximum total sample size such that the optimization problem solved by the Domain-Controlled Allocation (DCA) algorithm admits a strictly positive optimal value.
These functions are optimized for internal use and should typically not
be called directly by users. They are designed to handle a large number of
invocations, specifically recursive calls from rdca(), and, as a result,
parameter assertions are minimal.
Wójciak W (2026). Multi-Domain Optimum Sample Allocation with Controlled-Precision under Upper-Bound Constraints. Ph.D. thesis, Warsaw University of Technology. http://home.elka.pw.edu.pl/~wwojciak/phd_wwojciech_optimum_alloc.pdf.
Wesołowski J (2019). “Multi-domain Neyman-Tchuprov optimal allocation.” Statistics in Transition new series, 20(4), 1–12. doi:10.21307/stattrans-2019-031.
Wesołowski J, Wieczorkowski R (2017). “An eigenproblem approach to optimal equal-precision sample allocation in subpopulations.” Communications in Statistics - Theory and Methods, 46(5), 2212–2231. doi:10.1080/03610926.2015.1040501.
    # Two domains with 1 and 3 strata, respectively,
    # that is, H = {(1,1), (2,1), (2,2), (2,3)}.
    H_counts <- c(1, 3)
    N <- c(140, 110, 135, 190) # (N_{1,1}, N_{2,1}, N_{2,2}, N_{2,3})
    S <- sqrt(c(180, 20, 5, 4)) # (S_{1,1}, S_{2,1}, S_{2,2}, S_{2,3})
    total <- c(2, 3)
    kappa <- c(0.4, 0.6)
    rho <- total * sqrt(kappa) # (rho_1, rho_2)
    rho2 <- total^2 * kappa
    sum(N) # 575
    n_max <- dca_nmax(H_counts, N, S) # 519.0416

    n <- floor(n_max) - 1
    dca0(n, H_counts, N, S, rho, rho2)
    x0 <- dca0(n, H_counts, N, S, rho, rho2, details = TRUE)
    x0$x
    x0$lambda
    x0$k
    x0$v
    x0$s

    n <- ceiling(n_max) + 1
    x0 <- dca0(n, H_counts, N, S, rho, rho2, details = TRUE)
    x0$x
    x0$lambda

    n <- floor(n_max) - 1
    x1 <- dca(n, H_counts, N, S, rho, rho2, details = TRUE)
    x1$x
    x1$x_Uc
    x1$lambda
    x1$s

    dca(n, H_counts, N, S, rho, rho2, U = 1)
    x2 <- dca(n, H_counts, N, S, rho, rho2, U = 1, details = TRUE)
    x2$x
    x2$x_Uc
    x2$lambda
    x2$s
Prototype (under testing).
    dca_M(n, H_counts, N, S, rho, rho2, M = N, U = NULL)
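| Argument | Description |
|---|---|
| n | Total sample size. |
| H_counts | Strata counts in each domain. |
| N | Stratum sizes. |
| S | Standard deviations of the study variable in strata. |
| rho | total * sqrt(kappa), where total are the totals in domains and kappa are the priority weights for domains. |
| rho2 | total^2 * kappa. |
| M | Upper bounds on the sample sizes in strata; defaults to N. |
| U | Optional set of take-max strata. |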
    H_counts <- c(5, 2)
    H_names <- rep(seq_along(H_counts), times = H_counts)
    S <- c(154, 178, 134, 213, 124, 102, 12)
    N <- c(100, 100, 100, 100, 100, 100, 100)
    M <- c(80, 90, 70, 40, 10, 90, 100)
    names(M) <- names(N) <- H_names
    total <- c(13, 2)
    kappa <- c(0.8, 0.2)
    n <- 150

    # experimental function (not exported) – examples skipped
    ## Not run:
    dca_M(n, H_counts, N, S, total, kappa, M = M, U = 5)
    #         1         1         1         1         1         2         2
    # 12.754880 14.742653 11.098402 17.641490 10.000000 74.945462  8.817113
    ## End(Not run)
Computes the optimum allocation for the following multi-domain optimum allocation problem, formulated in mathematical optimization terms:
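Minimize
$$T$$
over $(T, \mathbf{x}) \in \mathbb{R} \times \mathbb{R}_+^{|\mathcal{H}|}$,
subject to
$$\sum_{h \in \mathcal{H}_d} \left( \frac{1}{x_{d,h}} - \frac{1}{N_{d,h}} \right) N_{d,h}^2 S_{d,h}^2 = T \rho_d^2, \quad d \in \mathcal{D},$$
$$\sum_{(d,h) \in \mathcal{H}} x_{d,h} = n,$$
$$x_{d,h} \le N_{d,h}, \quad (d,h) \in \mathcal{H},$$
where:
$\mathbf{x} = (x_{d,h},\ (d,h) \in \mathcal{H})$: the optimization variable,
$\mathcal{H}$: the set of domain-stratum indices,
$\mathcal{D}$: the set of domain indices,
$\mathcal{H}_d$: the set of strata indices in domain $d$,
$N_{d,h}$: size of stratum $(d,h)$,
$S_{d,h}$: standard deviation of the study variable in stratum $(d,h)$,
$\rho_d = t_d \sqrt{\kappa_d}$, where $t_d$ denotes the total in domain $d$, i.e., the sum of the values of the study variable for population elements in domain $d$, and $\kappa_d$ is a priority weight for domain $d$,
$n$: total sample size.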
    dopt(n, H_counts, N, S, total, kappa, return_T = FALSE)
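| Argument | Description |
|---|---|
| n | Total sample size. |
| H_counts | Strata counts in each domain. |
| N | Stratum sizes. |
| S | Standard deviations of the study variable in strata. |
| total | Totals in domains, i.e., the sum of the study variable values for population elements in each domain. |
| kappa | Priority weights for domains. |
| return_T | Whether to return the optimal value of the objective function along with the allocation. |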
The dopt() function uses the RDCA algorithm implemented in rdca().
If return_T = FALSE (default), a numeric vector containing the optimal sample allocations for each stratum.
If return_T = TRUE, a list containing the numeric vector of optimal sample allocations (element xopt) and the optimal value of the objective function.
Wójciak W (2026). Multi-Domain Optimum Sample Allocation with Controlled-Precision under Upper-Bound Constraints. Ph.D. thesis, Warsaw University of Technology. http://home.elka.pw.edu.pl/~wwojciak/phd_wwojciech_optimum_alloc.pdf.
rdca(), dca(), dca_nmax(), opt(), optcost().
    # Three domains with 2, 2, and 3 strata, respectively,
    # that is, H = {(1,1), (1,2), (2,1), (2,2), (3,1), (3,2), (3,3)}.
    H_counts <- c(2, 2, 3)
    # (N_{1,1}, N_{1,2}, N_{2,1}, N_{2,2}, N_{3,1}, N_{3,2}, N_{3,3})
    N <- c(140, 110, 135, 190, 200, 40, 70)
    # (S_{1,1}, S_{1,2}, S_{2,1}, S_{2,2}, S_{3,1}, S_{3,2}, S_{3,3})
    S <- c(180, 20, 5, 4, 35, 9, 40)
    total <- c(2, 3, 5)
    kappa <- c(0.5, 0.2, 0.3)
    n <- 828

    # Optimum allocation.
    dopt(n, H_counts, N, S, total, kappa)

    # Example population with 9 domains and 278 strata
    p <- pop9d278s
    sum(p$N)
    n <- 5000
    x <- dopt(n, p$H_counts, p$N, p$S, p$total, p$kappa, return_T = TRUE)
    x
    all(x$xopt <= p$N)
    sum(x$xopt)
Algorithm for optimum sample allocation in stratified sampling under lower- and upper-bound constraints, based on fixed-point iteration.
    fpia2(v0, Nh, Sh, mh = NULL, Mh = NULL, lambda0 = NULL, maxiter = 100)

    glambda(lambda, n, Ah, mh = NULL, Mh = NULL)

    philambda(lambda, n, Ah, mh = NULL, Mh = NULL)
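| Argument | Description |
|---|---|
| v0 | Variance. |
| mh | Lower bounds on the sample sizes in strata. |
| Mh | Upper bounds on the sample sizes in strata. |
| lambda0 | Initial value of lambda. |
| maxiter | Maximum number of iterations. |
| lambda | Value of lambda at which the helper function is evaluated. |
| tol | Tolerance. |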
A list with elements:
Vector of optimal allocation sizes.
Number of iterations performed.
fpia2(): Variant of fpia() using variance-based parametrization.
glambda(): Helper function for fpia().
philambda(): Helper function for fpia().
Münnich RT, Sachs EW, Wagner M (2012). “Numerical solution of optimal allocation problems in stratified sampling under box constraints.” AStA Advances in Statistical Analysis, 96(3), 435–450. doi:10.1007/s10182-011-0176-z.
Determines whether a numeric vector contains both negative and positive
values. Zero (0) is treated as neutral and does not count as either sign.
    has_mixed_signs(x)
| Argument | Description |
|---|---|
| x | Numeric vector to test. |
TRUE if the vector contains both positive and negative values,
FALSE otherwise.
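The behavior described above can be mimicked in one line of base R. The function below is a hypothetical stand-in written for illustration (has_mixed_signs() itself is internal and not exported), and it assumes the input contains no missing values.

```r
# Hypothetical re-implementation for illustration; not the package's code.
# Zeros satisfy neither comparison, so they are ignored.
has_mixed_signs_sketch <- function(x) {
  any(x < 0) && any(x > 0)
}

has_mixed_signs_sketch(1:5)           # FALSE
has_mixed_signs_sketch(c(-1, -2, 3))  # TRUE
has_mixed_signs_sketch(c(0, -1))      # FALSE
has_mixed_signs_sketch(c(0, 1, -1))   # TRUE
```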
    # internal functions (not exported) – examples skipped
    ## Not run:
    has_mixed_signs(1:5)
    has_mixed_signs(-(1:5))
    has_mixed_signs(c(-1, -2, 3))
    has_mixed_signs(c(0, -1))
    has_mixed_signs(c(0, 1))
    has_mixed_signs(c(0, 1, -1))
    ## End(Not run)
Internal utility functions used by dca0(), dca(), and rdca() that
perform operations on sets of domain–strata indices and manage the mapping
between strata and domains.
    H_cnt2dind(H_counts)

    H_cnt2glbidx(H_counts)

    H_get_strata_indices(H_counts, d)
| Argument | Description |
|---|---|
| H_counts | Strata counts in each domain. |
| d | Index of the domain. |
H_cnt2dind(): Creates a vector of domain indicators from a vector of strata counts per
domain;
each element of the vector is the index of the domain to which the
corresponding stratum belongs.
H_cnt2glbidx(): Creates unique indices for strata across multiple domains.
Returns a list of integer vectors, where the $d$-th element contains
the unique indices of the strata in domain $d$.
H_get_strata_indices(): Get the globally unique indices of strata belonging to a specific domain.
These functions are internal and should typically not be called directly by users.
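The mapping between strata and domains described above can be reproduced with base R alone. The snippet is a sketch of the idea, not the internal helpers themselves; the variable names dind and glbidx are made up for this illustration.

```r
H_counts <- c(2, 2, 3)  # three domains with 2, 2, and 3 strata

# Domain indicator for every stratum.
dind <- rep(seq_along(H_counts), times = H_counts)

# Globally unique stratum indices, grouped by domain.
glbidx <- unname(split(seq_len(sum(H_counts)), dind))

dind         # 1 1 2 2 3 3 3
glbidx[[3]]  # 5 6 7
```

Strata are numbered consecutively across domains, so domain 3 owns the last three global indices.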
    H_counts <- c(2, 2, 3) # three domains with 2, 2, and 3 strata respectively

    # internal functions (not exported) – examples skipped
    ## Not run:
    H_cnt2dind(H_counts) # 1 1 2 2 3 3 3
    ## End(Not run)

    # internal functions (not exported) – examples skipped
    ## Not run:
    H_cnt2glbidx(H_counts)
    ## End(Not run)

    # internal functions (not exported) – examples skipped
    ## Not run:
    H_get_strata_indices(H_counts, 3) # 5 6 7
    ## End(Not run)
Compares two numeric vectors element-wise using an adaptive tolerance
sequence ranging from 10^-19 to 10^tol_max. The smallest tolerance at
which the values are considered equal is returned as the corresponding name
in the output.
    is_equal(x, y, tol_max = -1)
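| Argument | Description |
|---|---|
| x | First numeric vector to compare. |
| y | Second numeric vector to compare. |
| tol_max | Exponent of the largest tolerance tried; the tolerance sequence ends at 10^tol_max. |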
A logical vector indicating whether each element pair is equal within the detected tolerance. The names reflect the tolerance used.
    # internal functions (not exported) – examples skipped
    ## Not run:
    is_equal(c(3, 4), c(3, 4))
    is_equal(c(3, 4), c(3.01, 4.11))
    is_equal(c(3, 4), c(3.01, 4.11), tol_max = 0)
    ## End(Not run)
Utility functions for checking whether an object is empty, where emptiness
is defined as being NULL or having length 0.
    is_empty(x)

    is_nonempty(x)
| Argument | Description |
|---|---|
| x | Object to test. |
is_empty(): Returns TRUE if the object is NULL or has length 0, and FALSE otherwise.
is_nonempty(): Logical negation of is_empty().
This function directly checks if length(x) > 0L for performance reasons,
avoiding the extra negation step that would occur if using !is_empty(x).
It is optimized for repeated use in algorithms where is_nonempty() is called
many times.
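A minimal base-R sketch of the two checks follows. The function names carry a _sketch suffix to stress that they are illustrative stand-ins rather than the internal helpers; the sketch relies on the fact that length(NULL) is 0 in R, so a single length test covers both cases.

```r
# Illustrative stand-ins; not the package's internal functions.
is_empty_sketch <- function(x) length(x) == 0L

# Direct length check, avoiding a call to is_empty_sketch() plus a negation.
is_nonempty_sketch <- function(x) length(x) > 0L

is_empty_sketch(NULL)             # TRUE
is_empty_sketch(character(0))     # TRUE
is_empty_sketch(1)                # FALSE
is_nonempty_sketch(character(0))  # FALSE
```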
    # internal functions (not exported) – examples skipped
    ## Not run:
    is_empty(NULL)
    is_empty(character(0))
    is_empty(1)
    ## End(Not run)

    ## Not run:
    is_nonempty(NULL)
    is_nonempty(character(0))
    is_nonempty(1)
    ## End(Not run)
Computes the optimum allocation for the following optimum allocation problem, formulated in mathematical optimization terms:
Minimize
$$f(x_1, \ldots, x_H) = \sum_{h=1}^H \frac{A_h^2}{x_h}$$
over $\mathbb{R}_+^H$, subject to
$$\sum_{h=1}^H x_h = n,$$
$$m_h \le x_h \le M_h, \quad h = 1, \ldots, H,$$
where $n > 0$, $A_h > 0$, $m_h > 0$, $M_h > 0$, such that
$m_h < M_h$, $h = 1, \ldots, H$, and
$\sum_{h=1}^H m_h \le n \le \sum_{h=1}^H M_h$, are given numbers.
Inequality constraints are optional, and the user can choose whether and how
they are applied to the optimization problem. This is controlled using the
m and M arguments as follows:
No inequality constraints: both m and M must be NULL (default).
Lower bounds only ($m_h \le x_h$): specify m, and set M = NULL.
Upper bounds only ($x_h \le M_h$): specify M, and set m = NULL.
Box constraints ($m_h \le x_h \le M_h$): specify both m and M.
    opt(n, A, m = NULL, M = NULL, M_algorithm = "rna")
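| Argument | Description |
|---|---|
| n | Total sample size. |
| A | Population constants, one per stratum. |
| m | Optional lower bounds on the sample sizes in strata. |
| M | Optional upper bounds on the sample sizes in strata. |
| M_algorithm | Name of the algorithm used when only upper bounds are imposed: "rna" (default), "sga", "sgaplus", or "coma". |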
The opt() function uses different allocation algorithms depending on which
inequality constraints are applied. Each algorithm is implemented in a
separate R function, which is generally not intended to be called directly
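by the end user. The algorithms are:
RNA, SGA, SGAPLUS, or COMA (selected via M_algorithm) for one-sided upper-bound constraints, implemented in rna(), sga(), sgaplus(), and coma();
LRNA for one-sided lower-bound constraints, implemented in rna();
RNABOX for box constraints, implemented in rnabox().
See the documentation of each specific function for more details about the corresponding algorithm.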
A numeric vector of the optimal sample allocations for each stratum.
If no inequality constraints are applied, the allocation follows the Neyman allocation:
$$x_h = \frac{A_h}{\sum_{i=1}^H A_i} \, n, \quad h = 1, \ldots, H.$$
For a stratified $\pi$-estimator of the population total using stratified simple random sampling without replacement design, the objective function parameters are:
$$A_h = N_h S_h, \quad h = 1, \ldots, H,$$
where $N_h$ is the size of stratum $h$ and $S_h$ is the standard deviation of the study variable in stratum $h$.
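For readers who want to see where the Neyman allocation comes from, the following short derivation applies a Lagrange multiplier to the unconstrained version of the problem (minimize $\sum_h A_h^2 / x_h$ subject to $\sum_h x_h = n$). It is a standard textbook argument sketched here for convenience; the multiplier symbol $\lambda$ is introduced only for this derivation.

```latex
\begin{aligned}
\mathcal{L}(x_1, \ldots, x_H, \lambda)
  &= \sum_{h=1}^H \frac{A_h^2}{x_h}
     + \lambda \Bigl( \sum_{h=1}^H x_h - n \Bigr), \\
\frac{\partial \mathcal{L}}{\partial x_h}
  &= -\frac{A_h^2}{x_h^2} + \lambda = 0
  \;\Longrightarrow\; x_h = \frac{A_h}{\sqrt{\lambda}}, \\
\sum_{h=1}^H x_h = n
  &\;\Longrightarrow\; \sqrt{\lambda} = \frac{1}{n} \sum_{i=1}^H A_i
  \;\Longrightarrow\; x_h = \frac{A_h}{\sum_{i=1}^H A_i} \, n, \\
f(x_1, \ldots, x_H)
  &= \sum_{h=1}^H \frac{A_h^2}{x_h}
   = \frac{1}{n} \Bigl( \sum_{h=1}^H A_h \Bigr)^2 .
\end{aligned}
```

The last line gives the minimum value of the objective function attained by the Neyman allocation.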
Särndal C, Swensson B, Wretman J (1992). Model Assisted Survey Sampling. Springer New York, NY. ISBN 978-0-387-40620-6.
optcost(), rna(), sga(), sgaplus(), coma(), rnabox().
    A <- c(3000, 4000, 5000, 2000)
    m <- c(100, 90, 70, 50)
    M <- c(300, 400, 200, 90)

    # One-sided lower bounds.
    opt(n = 340, A = A, m = m)
    opt(n = 400, A = A, m = m)
    opt(n = 700, A = A, m = m)

    # One-sided upper bounds.
    opt(n = 190, A = A, M = M)
    opt(n = 700, A = A, M = M)

    # Box-constraints.
    opt(n = 340, A = A, m = m, M = M)
    opt(n = 500, A = A, m = m, M = M)
    x <- opt(n = 800, A = A, m = m, M = M)
    x

    # Variance corresponding to the allocation x.
    var_st(x = x, A = A, A0 = 45000)

    # Execution-time comparison of different algorithms using the microbenchmark package.
    ## Not run:
    N <- pop969s_ucost[, "N"]
    S <- pop969s_ucost[, "S"]
    A <- N * S
    nfrac <- c(0.005, seq(0.05, 0.95, 0.05))
    n <- setNames(as.integer(nfrac * sum(N)), nfrac)
    lapply(
      n,
      function(ni) {
        microbenchmark::microbenchmark(
          RNA = opt(ni, A, M = N, M_algorithm = "rna"),
          SGA = opt(ni, A, M = N, M_algorithm = "sga"),
          SGAPLUS = opt(ni, A, M = N, M_algorithm = "sgaplus"),
          COMA = opt(ni, A, M = N, M_algorithm = "coma"),
          times = 200,
          unit = "us"
        )
      }
    )
    ## End(Not run)
Computes stratum sample sizes that minimize the total survey cost for a given target variance of a stratified estimator, optionally subject to one-sided upper bounds on the stratum sample sizes. Specifically, the function solves the following optimization problem:
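Minimize
$$c(x_1, \ldots, x_H) = \sum_{h=1}^H c_h x_h$$
over $\mathbb{R}_+^H$, subject to
$$\sum_{h=1}^H \frac{A_h^2}{x_h} - A_0 = V,$$
$$x_h \le M_h, \quad h = 1, \ldots, H,$$
where $A_0$, $A_h > 0$, $c_h > 0$, $M_h > 0$, $h = 1, \ldots, H$,
and $V > \sum_{h=1}^H \frac{A_h^2}{M_h} - A_0$, are given numbers.
The upper-bound constraints $x_h \le M_h$ are optional. If they are not
imposed, it is only required that $V > 0$.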
    optcost(V, A, A0, M = NULL, unit_costs = 1)
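| Argument | Description |
|---|---|
| V | Target variance of the stratified estimator. |
| A | Population constants, one per stratum. |
| A0 | Population constant entering the variance expression. |
| M | Optional upper bounds on the sample sizes in strata. |
| unit_costs | Costs of surveying one element in each stratum. |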
The allocation is computed using the LRNA algorithm, described in Wójciak (2023).
The solution is valid for stratified sampling designs in which the variance
of the stratified estimator can be expressed as
$$V_{st} = \sum_{h=1}^H \frac{A_h^2}{x_h} - A_0,$$
where $H$ is the number of strata, $x_1, \ldots, x_H$ are the stratum
sample sizes, and $A_0$, $A_h > 0$ do not depend on $x_h$, $h = 1, \ldots, H$.
A numeric vector containing the optimal sample allocation for each stratum.
For the stratified $\pi$-estimator of the population total under
stratified simple random sampling without replacement design, the
parameters take the form
$$A_h = N_h S_h, \quad h = 1, \ldots, H, \qquad A_0 = \sum_{h=1}^H N_h S_h^2,$$
where $N_h$ is the size of stratum $h$ and $S_h$ is the
standard deviation of the study variable in stratum $h$.
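The variance that optcost() targets can be evaluated directly for any allocation. The sketch below assumes the variance has the form sum(A^2 / x) - A0 with A = N * S and A0 = sum(N * S^2), as for the stratified estimator of the total under simple random sampling without replacement in each stratum; the helper name var_sketch and the population values are hypothetical.

```r
# Hypothetical helper (not part of stratallo): variance of the stratified
# estimator of the total for allocation x, assuming sum(A^2 / x) - A0.
var_sketch <- function(x, A, A0) sum(A^2 / x) - A0

N <- c(300, 400, 500, 200)  # illustrative stratum sizes
S <- c(10, 10, 10, 10)      # illustrative standard deviations
A <- N * S
A0 <- sum(N * S^2)

v_census <- var_sketch(x = N, A = A, A0 = A0)       # census in every stratum
v_tenth <- var_sketch(x = N / 10, A = A, A0 = A0)   # 10% sample per stratum
v_census  # 0
v_tenth   # 1260000
```

Sampling every element leaves no sampling variance, which is a quick sanity check on the choice of A and A0.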
Wójciak W (2023). “Another Solution for Some Optimum Allocation Problem.” Statistics in Transition new series, 24(5), 203-219. doi:10.59170/stattrans-2023-071.
    A <- c(3000, 4000, 5000, 2000)
    M <- c(100, 90, 70, 80)
    x <- optcost(1017579, A = A, A0 = 579, M = M)
    x
Implements the Recursive Domain-Controlled Allocation (RDCA) algorithm described in Wójciak (2026). The algorithm solves the following optimum allocation problem, formulated in mathematical optimization terms:
Minimize
over ,
subject to
where:
the optimization variable,
$\mathcal{H}$ - the set of domain-stratum indices,
$\mathcal{D}$ - the set of domain indices,
$\mathcal{H}_d$ - the set of strata indices in domain $d$,
$N_{d,h}$ - size of stratum $(d,h)$,
$S_{d,h}$ - standard deviation of the study variable in stratum $(d,h)$,
$\rho_d = t_d \sqrt{\kappa_d}$, where $t_d$ denotes the total in domain $d$, i.e., the sum of the
values of the study variable for population elements in domain $d$,
and $\kappa_d$ is a priority weight for domain $d$,
$n$ - total sample size.
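The flat vectors passed to `rdca()` list strata domain by domain, with `H_counts` giving the number of strata per domain. The base-R sketch below shows one way to recover the domain index of each stratum and to evaluate a per-domain variance for a trial allocation. It assumes simple random sampling without replacement within strata, and the allocation `x` is a hypothetical placeholder, not an optimum.

```r
H_counts <- c(2, 2, 3)                    # strata per domain
N <- c(140, 110, 135, 190, 200, 40, 70)   # stratum sizes, domain by domain
S <- c(180, 20, 5, 4, 35, 9, 40)          # stratum standard deviations
total <- c(2, 3, 5)                       # domain totals
kappa <- c(0.5, 0.2, 0.3)                 # domain priority weights

# Domain label of every stratum: 1 1 2 2 3 3 3.
domain <- rep(seq_along(H_counts), times = H_counts)

rho <- total * sqrt(kappa)
rho2 <- total^2 * kappa                   # same as rho^2, without the square root

# Hypothetical trial allocation (half of every stratum), for illustration only.
x <- N / 2

# Variance of the estimated total in each domain (assumed STSI design).
var_domain <- as.numeric(tapply(N^2 * S^2 * (1 / x - 1 / N), domain, sum))
n_domain <- as.numeric(tapply(x, domain, sum))
```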
rdca(n, H_counts, N, S, rho, rho2 = rho^2, U = NULL, J = NULL)
n |
( |
H_counts |
( |
N |
( |
S |
( |
rho |
( |
rho2 |
( |
U |
( For example, if If
|
J |
( |
The upper-bound constraints for the strata of a given domain are guaranteed
to be preserved only if that domain is in J.
The parameter J is used in the recursion.
The optimization problem specified above is solved when J = NULL, i.e., when
J contains all domains.
This function is optimized for internal use and should typically not be
called directly by users.
It is designed to handle a large number of invocations, specifically
recursive invocations of rdca(), and, as a result, parameter assertions
are minimal.
Wójciak W (2026). Multi-Domain Optimum Sample Allocation with Controlled-Precision under Upper-Bound Constraints. Ph.D. thesis, Warsaw University of Technology. http://home.elka.pw.edu.pl/~wwojciak/phd_wwojciech_optimum_alloc.pdf.
# Three domains with 2, 2, and 3 strata, respectively,
# that is, H = {(1,1), (1,2), (2,1), (2,2), (3,1), (3,2), (3,3)}.
H_counts <- c(2, 2, 3)
# (N_{1,1}, N_{1,2}, N_{2,1}, N_{2,2}, N_{3,1}, N_{3,2}, N_{3,3})
N <- c(140, 110, 135, 190, 200, 40, 70)
# (S_{1,1}, S_{1,2}, S_{2,1}, S_{2,2}, S_{3,1}, S_{3,2}, S_{3,3})
S <- c(180, 20, 5, 4, 35, 9, 40)
total <- c(2, 3, 5)
kappa <- c(0.5, 0.2, 0.3)
rho <- total * sqrt(kappa) # (rho_1, rho_2, rho_3)
rho2 <- total^2 * kappa
sum(N)
n <- 828
# Optimum allocation.
rdca(n, H_counts, N, S, rho, rho2)
# Upper bounds enforced only for domain 1.
rdca(n, H_counts, N, S, rho, rho2, J = 1)
Iterative implementation of the Recursive Domain-Controlled Allocation (RDCA) algorithm. This implementation has not been tested.
rdca_iter(n, H_counts, N, S, rho, rho2 = NULL, ref_domain = 1L)
n |
( |
H_counts |
( |
N |
( |
S |
( |
rho |
( |
rho2 |
( |
ref_domain |
( |
H_counts <- c(2, 2, 3)
N <- c(140, 110, 135, 190, 200, 40, 70)
S <- sqrt(c(180, 20, 5, 4, 35, 9, 40))
total <- c(2, 3, 5)
kappa <- c(0.5, 0.2, 0.3)
rho <- total * sqrt(kappa)
(n <- dca_nmax(H_counts, N, S) - 1)
# experimental function (not exported) – examples skipped
## Not run:
rdca_iter(n, H_counts, N, S, rho)
# 140.0000 103.6139 132.1970 166.4127 195.9701 19.8750 70.0000
## End(Not run)
Experimental variants of the Recursive Neyman Algorithm (RNA).
rna_rec(
  total_cost,
  A,
  bounds = NULL,
  unit_costs = rep(1, length(A)),
  cmp = .Primitive(">=")
)
rna_prior(
  total_cost,
  A,
  bounds = NULL,
  check = NULL,
  cmp = .Primitive(">="),
  details = FALSE
)
total_cost |
(
|
A |
( |
bounds |
(
See also |
unit_costs |
( |
cmp |
( The value of this argument has no effect if |
check |
( |
details |
( |
rna_rec(): Recursive implementation of the RNA.
rna_prior(): A variant of the Recursive Neyman Algorithm (RNA) that uses prior information
about strata for which allocation constraints may be violated. For all other
strata, allocations are assumed to satisfy the bounds.
This code has not been extensively tested and may change in future releases.
A <- c(3000, 4000, 5000, 2000)
M <- c(100, 90, 70, 80) # upper bounds
# experimental function (not exported) – examples skipped
## Not run:
rna_rec(total_cost = 190, A = A, bounds = M)
rna_rec(total_cost = 312, A = A, bounds = M)
rna_rec(total_cost = 339, A = A, bounds = M)
rna_rec(total_cost = 340, A = A, bounds = M)
## End(Not run)
Implements the Recursive Neyman Algorithm for Optimum Sample Allocation under Box Constraints (RNABOX), as proposed in Wesołowski et al. (2024). The algorithm solves the following optimum allocation problem, formulated in mathematical optimization terms:
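Minimize
$$f(x_1, \ldots, x_H) = \sum_{h=1}^{H} \frac{A_h^2}{x_h}$$
over $\mathbb{R}_+^H$, subject to
$$\sum_{h=1}^{H} x_h = n,$$
$$m_h \le x_h \le M_h, \quad h = 1, \ldots, H,$$
where $n > 0$, $A_h > 0$, $m_h > 0$, $M_h > 0$, such that
$m_h < M_h$, $h = 1, \ldots, H$, and
$\sum_{h=1}^{H} m_h \le n \le \sum_{h=1}^{H} M_h$, are given numbers.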
Inequality constraints are optional and may be omitted.
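The box-constrained problem can be approached by nesting two recursions: an inner recursive Neyman step that fixes strata at their upper bounds, and an outer loop that fixes strata at their lower bounds and re-runs the inner step on the rest. The base-R sketch below follows that idea under simplifying assumptions (no input checks, both bounds supplied). The names `rna_upper_sketch` and `rnabox_sketch` are hypothetical, and the code is an illustration rather than the package's `rnabox()` implementation.

```r
# Inner step: Neyman allocation with strata recursively fixed at upper bounds.
rna_upper_sketch <- function(n, A, M) {
  x <- M                           # strata removed from `free` stay at M
  free <- rep(TRUE, length(A))
  repeat {
    if (!any(free)) break
    s <- (n - sum(M[!free])) / sum(A[free])
    x[free] <- s * A[free]         # Neyman allocation among free strata
    over <- free & (x >= M)        # upper bound reached or exceeded
    if (!any(over)) break
    free[over] <- FALSE
    x[over] <- M[over]
  }
  x
}

# Outer loop: fix strata at lower bounds, then re-run the inner step.
rnabox_sketch <- function(n, A, m, M) {
  x <- m
  low <- rep(FALSE, length(A))     # strata fixed at their lower bound
  repeat {
    act <- !low
    x[act] <- rna_upper_sketch(n - sum(m[low]), A[act], M[act])
    under <- act & (x <= m)        # lower bound reached or violated
    if (!any(under)) break
    low[under] <- TRUE
    x[under] <- m[under]
  }
  x
}

N <- c(454, 10, 116, 2500, 2240, 260, 39, 3000, 2500, 400)
S <- c(0.9, 5000, 32, 0.1, 3, 5, 300, 13, 20, 7)
A <- N * S
m <- c(322, 3, 57, 207, 715, 121, 9, 1246, 1095, 294) # lower bounds
M <- N                                                 # upper bounds

opt_regular <- rnabox_sketch(6000, A, m, M)
opt_vertex <- rnabox_sketch(4076, A, m, M)
```

In the second call every stratum ends up at one of its bounds, which is the situation the documentation's example calls a vertex allocation.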
rnabox(
  n,
  A,
  bounds_inner = NULL,
  bounds_outer = NULL,
  cmp_inner = .Primitive(">="),
  cmp_outer = .Primitive("<=")
)
n |
(
|
A |
( |
bounds_inner |
( If both
|
bounds_outer |
( If both
|
cmp_inner |
(
The value of this argument has no effect if
|
cmp_outer |
(
The value of this argument has no effect if
|
A numeric vector of optimum sample allocations in strata.
The rnabox() function is optimized for internal use and should
typically not be called directly by users. Use opt() instead.
Wesołowski J, Wieczorkowski R, Wójciak W (2024). “Recursive Neyman algorithm for optimum sample allocation under box constraints on sample sizes in strata.” Survey Methodology, 50(2), 487–511. ISSN 1492-0921. https://www150.statcan.gc.ca/n1/en/catalogue/12-001-X201200111682.
opt(), optcost(), rna(), sga(), sgaplus(), coma()
N <- c(454, 10, 116, 2500, 2240, 260, 39, 3000, 2500, 400)
S <- c(0.9, 5000, 32, 0.1, 3, 5, 300, 13, 20, 7)
A <- N * S
m <- c(322, 3, 57, 207, 715, 121, 9, 1246, 1095, 294) # lower bounds
M <- N # upper bounds
# Regular allocation.
n <- 6000
opt_regular <- rnabox(n, A, M, m)
# Vertex allocation.
n <- 4076
opt_vertex <- rnabox(n, A, M, m)
round_ran(x)
round_oric(x)
x |
( |
An integer vector.
round_ran(): Random rounding of numbers.
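A number $x$ is rounded to an integer $y$ according to the following
rule:
$$y = \lfloor x \rfloor + I(x - \lfloor x \rfloor > \varepsilon),$$
where the indicator function $I: \{\text{FALSE}, \text{TRUE}\} \to \{0, 1\}$ is
defined as
$$I(z) = \begin{cases} 0, & z \text{ is FALSE}, \\ 1, & z \text{ is TRUE}, \end{cases}$$
and $\varepsilon$ is a random number drawn from the $\mathrm{Uniform}(0, 1)$
distribution.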
round_oric(): Optimal rounding under integer constraints, as proposed by
Cont and Heidari (2014).
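The random rounding rule described for `round_ran()` rounds a number up with probability equal to its fractional part and down otherwise. A minimal base-R sketch of that rule follows. The name `round_ran_sketch` is hypothetical, and the way random numbers are drawn here is an assumption, so results for a given seed need not match the package function.

```r
# Hypothetical helper: round up when the fractional part exceeds a
# Uniform(0, 1) draw, otherwise round down.
round_ran_sketch <- function(x) {
  frac <- x - floor(x)
  eps <- runif(length(x))
  as.integer(floor(x) + (frac > eps))
}

set.seed(5)
y <- round_ran_sketch(c(4.5, 4.1, 4.9))
y
```

Whole numbers are left unchanged by this rule, because a fractional part of zero never exceeds the random draw.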
Cont R, Heidari M (2014). “Optimal rounding under integer constraints.” 1501.00014, https://arxiv.org/abs/1501.00014.
x <- c(4.5, 4.1, 4.9)
set.seed(5)
round_ran(x) # 5 4 4
set.seed(6)
round_ran(x) # 4 4 5
round_oric(x) # 4 4 5
Fast integer-valued algorithms for optimum allocations under constraints in stratified sampling proposed in Friedrich et al. (2015).
SimpleGreedy2(v0, Nh, Sh, mh = rep(1, length(Nh)), Mh = Nh, nh = mh)
CapacityScaling2(
  v0,
  Nh,
  Sh,
  mh = rep(1, length(Nh)),
  Mh = rep(Inf, length(Nh))
)
v0 |
( |
Nh |
( |
Sh |
( |
mh |
( |
Mh |
( |
nh |
( |
n |
( |
Ah |
( |
For fpia(): an integer vector of optimum sample sizes allocated to
each stratum.
SimpleGreedy2(): Variant of the SimpleGreedy algorithm based on a variance stopping
rule.
CapacityScaling2(): Variant of the CapacityScaling algorithm based on a variance stopping
rule.
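A greedy allocation with a variance stopping rule can be sketched as follows: start from the lower bounds and repeatedly add one unit to the stratum where it reduces the variance most, stopping once the variance drops to the target `v0`. The base-R code below is a simplified illustration of that idea, assuming simple random sampling without replacement within strata. The name `simple_greedy_sketch` is hypothetical, and the code is not the package's `SimpleGreedy2()`.

```r
# Hypothetical helper: greedy integer allocation with a variance stopping rule.
simple_greedy_sketch <- function(v0, Nh, Sh, mh = rep(1, length(Nh)), Mh = Nh) {
  # Variance of the stratified estimator of the total (assumed STSI design).
  var_stsi <- function(nh) sum(Nh^2 * Sh^2 * (1 / nh - 1 / Nh))
  nh <- mh
  while (var_stsi(nh) > v0 && any(nh < Mh)) {
    # Variance reduction from one extra unit: Nh^2 * Sh^2 * (1/nh - 1/(nh + 1)).
    gain <- Nh^2 * Sh^2 / (nh * (nh + 1))
    gain[nh >= Mh] <- -Inf       # strata at their upper bound cannot grow
    h <- which.max(gain)
    nh[h] <- nh[h] + 1
  }
  nh
}

Nh <- c(300, 400, 500, 200)
Sh <- c(2, 5, 3, 1)
v0 <- 1e5
nh <- simple_greedy_sketch(v0, Nh, Sh)
sum(Nh^2 * Sh^2 * (1 / nh - 1 / Nh))  # attained variance, not above v0
```

Adding one unit at a time keeps the result integer-valued, at the price of one loop iteration per sampled unit.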
Friedrich U, Münnich R, de Vries S, Wagner M (2015). “Fast integer-valued algorithms for optimal allocations under constraints in stratified sampling.” Computational Statistics & Data Analysis, 92, 1-12. ISSN 0167-9473. doi:10.1016/j.csda.2015.06.003.
Optimum Sample Allocation in Stratified Sampling
Wojciech Wójciak [email protected]
Wesołowski J, Wieczorkowski R, Wójciak W (2026). “R package stratallo - source code.” https://github.com/wwojciech/stratallo.
Wesołowski J, Wieczorkowski R, Wójciak W (2023). “Numerical Performance of the RNABOX Algorithm.” https://github.com/rwieczor/recursive_Neyman_rnabox.
Wesołowski J, Wieczorkowski R, Wójciak W (2022). “Optimality of the Recursive Neyman Allocation.” Journal of Survey Statistics and Methodology, 10(5), 1263-1275. ISSN 2325-0984. doi:10.1093/jssam/smab018.
Wójciak W (2023). “Another Solution for Some Optimum Allocation Problem.” Statistics in Transition new series, 24(5), 203-219. doi:10.59170/stattrans-2023-071.
Wójciak W (2019). Optimal Allocation in Stratified Sampling Schemes. Master's thesis, Warsaw University of Technology. http://home.elka.pw.edu.pl/~wwojciak/msc_wwojciech_optimum_alloc.pdf.
Wójciak W (2026). Multi-Domain Optimum Sample Allocation with Controlled-Precision under Upper-Bound Constraints. Ph.D. thesis, Warsaw University of Technology. http://home.elka.pw.edu.pl/~wwojciak/phd_wwojciech_optimum_alloc.pdf.
Stenger H, Gabler S (2005). “Combining random sampling and census strategies - Justification of inclusion probabilities equal to 1.” Metrika, 61(2), 137–156. doi:10.1007/s001840400328.
Särndal C, Swensson B, Wretman J (1992). Model Assisted Survey Sampling. Springer New York, NY. ISBN 978-0-387-40620-6.
Estimator of the Population Total
Computes the value of the variance function of the stratified
estimator of the population total, which has the following generic form:
$$V_{st}(x_1, \ldots, x_H) = \sum_{h=1}^{H} \frac{A_h^2}{x_h} - A_0,$$
where $H$ denotes the total number of strata, $x_1, \ldots, x_H$ are
the stratum sample sizes, and $A_0$ and $A_h > 0$, for
$h = 1, \ldots, H$, are population constants that do not depend on the
$x_h$.
var_st(x, A, A0)
var_stsi(x, N, S)
x |
( |
A |
( |
A0 |
( |
N |
( |
S |
( |
The value of the variance $V_{st}$ for a given allocation vector
$x = (x_1, \ldots, x_H)$.
var_st(): The value of the variance $V_{st}$.
var_stsi(): The value of the variance $V_{st}$ for the case of
simple random sampling without replacement design within each stratum.
This particular case yields:
$$A_h = N_h S_h, \quad h = 1, \ldots, H, \qquad A_0 = \sum_{h=1}^{H} N_h S_h^2,$$
where $N_h$ denotes the size of stratum $h$ and $S_h$ is the
corresponding stratum standard deviation of the study variable, for
$h = 1, \ldots, H$.
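The generic variance form and its special case for simple random sampling without replacement can be checked against each other numerically. The base-R sketch below re-implements both formulas under the hypothetical names `var_st_sketch` and `var_stsi_sketch` and compares the result with the textbook expression, the sum over strata of N^2 * (1/x - 1/N) * S^2.

```r
# Hypothetical re-implementations of the two variance formulas.
var_st_sketch <- function(x, A, A0) sum(A^2 / x) - A0
var_stsi_sketch <- function(x, N, S) {
  var_st_sketch(x, A = N * S, A0 = sum(N * S^2))
}

N <- c(300, 400, 500, 200)
S <- c(2, 5, 3, 1)
x <- c(27, 88, 66, 9)

v_generic <- var_stsi_sketch(x, N, S)
# Textbook form for stratified simple random sampling without replacement.
v_textbook <- sum(N^2 * (1 / x - 1 / N) * S^2)
all.equal(v_generic, v_textbook)
```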
Särndal C, Swensson B, Wretman J (1992). Model Assisted Survey Sampling. Springer New York, NY. ISBN 978-0-387-40620-6.
N <- c(300, 400, 500, 200)
S <- c(2, 5, 3, 1)
x <- c(27, 88, 66, 9)
A <- N * S
A0 <- sum(N * S^2)
var_st(x, A, A0)
N <- c(3000, 4000, 5000, 2000)
S <- rep(1, 4)
M <- c(100, 90, 70, 80)
x <- opt(n = 320, A = N * S, M = M)
var_stsi(x = x, N, S)