
Parker and Vannest (2009) proposed non-overlap of all pairs (NAP) as an effect size index for use in single-case research. NAP is defined in terms of all pair-wise comparisons between the data points in two different phases for a given case (i.e., a treatment phase versus a baseline phase). For an outcome that is desirable to increase, NAP is the proportion of all such pair-wise comparisons where the treatment phase observation exceeds the baseline phase observation, with pairs that are exactly tied getting a weight of 1/2. NAP belongs to the family of non-overlap measures, which also includes the percentage of non-overlapping data, the improvement rate difference, and several other indices. It is exactly equivalent to Vargha and Delaney’s (2000) modified Common Language Effect Size and has been proposed as an effect size index in other contexts too (e.g., Acion, Peterson, Temple, & Arndt, 2006).


The developers of NAP have created a web-based tool for calculating it (as well as several other non-overlap indices), and I have the impression that the tool is fairly widely used. For example, Roth, Gillis, and DiGennaro Reed (2014) and Whalon, Conroy, Martinez, and Welch (2015) both used NAP in their meta-analyses of single-case research, and both noted that they used singlecaseresearch.org for calculating the effect size measure. Given that the web tool is being used, it is worth scrutinizing the methods behind the calculations it reports. As of this writing, the standard error and confidence intervals reported along with the NAP statistic are incorrect and should not be used. After introducing a bit of notation, I’ll explain why the existing methods are deficient. I’ll also suggest some methods for calculating standard errors and confidence intervals that are potentially more accurate.


Suppose that we have data from the baseline phase and treatment phase for a single case. Let \(m\) denote the number of baseline observations and \(n\) denote the number of treatment phase observations. Let \(y^A_1,...,y^A_m\) denote the baseline phase data and \(y^B_1,...,y^B_n\) denote the treatment phase data. Then NAP is calculated as

\[ \text{NAP} = \frac{1}{m n} \sum_{i=1}^m \sum_{j=1}^n \left[ I\left(y^B_j > y^A_i\right) + 0.5 I\left(y^B_j = y^A_i\right) \right] \]

What is NAP an estimate of? The parameter of interest is the probability that a randomly selected treatment phase observation will exceed a randomly selected baseline phase observation (again, with an adjustment for ties):

\[ \theta = \text{Pr}(Y^B > Y^A) + 0.5 \text{Pr}(Y^B = Y^A). \]

Vargha and Delaney call \(\theta\) the measure of stochastic superiority.

NAP is very closely related to another non-overlap index called Tau (Parker, Vannest, Davis, & Sauber, 2011). Tau is nothing more than a linear re-scaling of NAP to the range of [-1, 1]:

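\[ \text{Tau} = \frac{S}{m n} = 2 \times \text{NAP} - 1, \]

where

\[ S = \sum_{i=1}^m \sum_{j=1}^n \left[ I\left(y^B_j > y^A_i\right) - I\left(y^B_j < y^A_i\right) \right]. \]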

The \(S\) is Kendall’s S statistic, which is closely related to the Mann-Whitney \(U\) test.
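To make the relationships among NAP, Tau, Kendall's S, and the Mann-Whitney statistic concrete, here is a minimal sketch in R. The data values are made up purely for illustration (they do not come from any study), and the object names `d`, `nap`, `S`, `tau`, and `W` are my own choices.

```r
# Hypothetical data, for illustration only
yA <- c(2, 3, 3, 5)   # baseline phase
yB <- c(3, 6, 7)      # treatment phase
m <- length(yA)
n <- length(yB)

# n x m matrix of all pairwise differences yB_j - yA_i
d <- outer(yB, yA, "-")

# NAP: proportion of pairs showing improvement, with ties weighted 1/2
nap <- mean((d > 0) + 0.5 * (d == 0))

# Kendall's S: (number of improving pairs) - (number of worsening pairs)
S <- sum(sign(d))
tau <- S / (m * n)

# Mann-Whitney statistic as reported by wilcox.test(); the warning about
# ties concerns only the p-value, not the statistic itself
W <- unname(suppressWarnings(wilcox.test(yB, yA))$statistic)

c(NAP = nap, Tau = tau, rescaled_NAP = 2 * nap - 1, W_over_mn = W / (m * n))
```

With these values, Tau and 2 × NAP - 1 coincide, and dividing the Mann-Whitney statistic by the number of pairs recovers NAP.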

Here is an R function for calculating NAP:

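    NAP <- function(yA, yB) {
      m <- length(yA)
      n <- length(yB)
      # for every (baseline, treatment) pair: 1 if improved, 0.5 if tied
      U <- sum(sapply(yA, function(i) sapply(yB, function(j) (j > i) + 0.5 * (j == i))))
      U / (m * n)
    }

Using the data from the worked example in Parker & Vannest (2009), the function result agrees with their reported NAP of 0.96:

    yA <- c(4, 3, 4, 3, 4, 7, 5, 2, 3, 2)
    yB <- c(5, 9, 7, 9, 7, 5, 9, 11, 11, 10, 9)
    NAP(yA, yB)
    ## [1] 0.9636364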

Standard errors

The web tool reports a standard error for NAP (it is labelled as “SDnap”), which from what I can tell is based on the formula

\[ \text{SE}_{\text{Tau}} = \sqrt{\frac{m + n + 1}{3 m n}}. \]

This formula appears to actually be the standard error for Tau, rather than for NAP. Since \(\text{NAP} = \left(\text{Tau} + 1\right) / 2\), the standard error for NAP should be half as large:

\[ \text{SE}_{null} = \sqrt{\frac{m + n + 1}{12 m n}} \]

(cf. Grissom & Kim, 2001, p. 141). However, even the latter formula is not always correct. It is valid only when the observations are all mutually independent and when the treatment phase data are drawn from the same distribution as the baseline phase data, that is, when the treatment has no effect on the outcome. I’ve therefore denoted it as \(\text{SE}_{null}\).
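As a quick numerical check of the factor of two between the two formulas, here is a short sketch in R. The phase lengths `m = 10` and `n = 11` are illustrative values that I picked, and the variable names are my own.

```r
# Illustrative phase lengths (assumed values)
m <- 10   # number of baseline observations
n <- 11   # number of treatment observations

# Standard error on the Tau scale (range [-1, 1])
se_tau <- sqrt((m + n + 1) / (3 * m * n))

# Standard error on the NAP scale (range [0, 1]), valid only under no effect
se_null <- sqrt((m + n + 1) / (12 * m * n))

round(c(SE_Tau = se_tau, SE_null = se_null, ratio = se_tau / se_null), 3)
```

Because NAP is Tau shifted and then halved, the ratio is exactly 2 for any choice of phase lengths; reporting the Tau-scale value alongside NAP overstates the uncertainty by that factor even when the null formula is otherwise appropriate.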

Other standard error estimators

Because an equivalent effect size measure is used in other contexts like clinical medicine, there has actually been a fair bit of research into better approaches for assessing the uncertainty in NAP. Hanley and McNeil (1982) proposed an estimator for the sampling variance of NAP that is designed for continuous outcome measures, where exact ties are impossible. Modifying it slightly (and in entirely ad hoc fashion) to account for ties, let

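\[ \begin{aligned} Q_1 &= \frac{1}{m n^2}\sum_{i=1}^m \left[\sum_{j=1}^n \left( I\left(y^B_j > y^A_i\right) + 0.5 I\left(y^B_j = y^A_i\right) \right)\right]^2 \\ Q_2 &= \frac{1}{m^2 n}\sum_{j=1}^n \left[\sum_{i=1}^m \left( I\left(y^B_j > y^A_i\right) + 0.5 I\left(y^B_j = y^A_i\right) \right)\right]^2. \end{aligned} \]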

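Then the Hanley-McNeil variance estimator is

\[ V_{HM} = \frac{1}{m n} \left[ \text{NAP}\left(1 - \text{NAP}\right) + (n - 1)\left(Q_1 - \text{NAP}^2\right) + (m - 1)\left(Q_2 - \text{NAP}^2\right) \right], \]

with \(\text{SE}_{HM} = \sqrt{V_{HM}}\).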

The same authors also propose a different estimator, which is based on the assumption that the outcome data are exponentially distributed. Even though this is a strong and often inappropriate assumption, there is evidence that this estimator works even for other, non-exponential distributions. Newcombe (2006) suggested a further modification of their estimator, and I’ll describe his version. Let \(h = (m + n) / 2 - 1\). Then

\[ V_{New} = \frac{h}{m n} \text{NAP}\left(1 - \text{NAP}\right) \left[ \frac{1}{h} + \frac{1 - \text{NAP}}{2 - \text{NAP}} + \frac{\text{NAP}}{1 + \text{NAP}} \right], \]

with \(\text{SE}_{New} = \sqrt{V_{New}}\).

Here are R functions to calculate each of these variance estimators.

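    V_HM <- function(yA, yB) {
      m <- length(yA)
      n <- length(yB)
      # m x n matrix of pairwise scores (rows: baseline, columns: treatment)
      U <- sapply(yB, function(j) (j > yA) + 0.5 * (j == yA))
      t <- sum(U) / (m * n)
      Q1 <- sum(rowSums(U)^2) / (m * n^2)
      Q2 <- sum(colSums(U)^2) / (m^2 * n)
      (t * (1 - t) + (n - 1) * (Q1 - t^2) + (m - 1) * (Q2 - t^2)) / (m * n)
    }

    V_New <- function(yA, yB) {
      m <- length(yA)
      n <- length(yB)
      U <- sapply(yB, function(j) (j > yA) + 0.5 * (j == yA))
      t <- sum(U) / (m * n)
      h <- (m + n) / 2 - 1
      t * (1 - t) * (1 + h * (1 - t) / (2 - t) + h * t / (1 + t)) / (m * n)
    }

    sqrt(V_HM(yA, yB))
    ## [1] 0.03483351
    sqrt(V_New(yA, yB))
    ## [1] 0.04370206

For the worked example dataset from Parker & Vannest, the Newcombe estimator yields a standard error that is about 25% larger than the Hanley-McNeil estimator. Both of these are substantially smaller than the null standard error, which in this example is \(\text{SE}_{null} = 0.129\).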

A small simulation

Simulation methods can be used to examine how well these various standard error formulas estimate the actual sampling variation of NAP. For simplicity, I’ll simulate normally distributed data where

\[ Y^A \sim N(0, 1) \quad \text{and} \quad Y^B \sim N\left(\sqrt{2}\, \Phi^{-1}(\theta), 1\right) \]

for varying values of the effect size estimand (\(\theta\)) and a couple of different sample sizes.
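The simulation code sets the mean shift as `delta = sqrt(2) * qnorm(theta)`. Assuming unit-variance normal observations in both phases (my reading of the set-up), the difference of two independent draws has variance 2, so \(\theta = \Phi(\delta / \sqrt{2})\). The sketch below checks that mapping by Monte Carlo; the value `theta = 0.8`, the seed, and the number of draws are arbitrary choices of mine.

```r
set.seed(42)

theta <- 0.8                      # target value of Pr(Y^B > Y^A), chosen for illustration
delta <- sqrt(2) * qnorm(theta)   # mean shift implied by theta

# Independent draws from the two phase distributions
draws <- 1e5
yA <- rnorm(draws, mean = 0, sd = 1)
yB <- rnorm(draws, mean = delta, sd = 1)

# Closed-form value and Monte Carlo estimate of theta
theta_exact <- pnorm(delta / sqrt(2))
theta_mc <- mean(yB > yA)

round(c(delta = delta, theta_exact = theta_exact, theta_mc = theta_mc), 3)
```

Ties have probability zero for continuous data, so the tie adjustment in the definition of \(\theta\) plays no role in this check.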

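    library(dplyr)
    library(tidyr)
    library(ggplot2)

    # simulate NAP and both variance estimators under one condition
    sample_NAP <- function(delta, m, n, iterations) {
      NAPs <- replicate(iterations, {
        yA <- rnorm(m)
        yB <- rnorm(n, mean = delta)
        c(NAP = NAP(yA, yB), V_HM = V_HM(yA, yB), V_New = V_New(yA, yB))
      })
      data.frame(sd = sd(NAPs["NAP",]),
                 SE_HM = sqrt(mean(NAPs["V_HM",])),
                 SE_New = sqrt(mean(NAPs["V_New",])))
    }

    # grid of conditions (these particular values are assumed; the text
    # only indicates that 5 is the smallest phase length)
    theta <- seq(0.5, 0.95, 0.05)
    m <- c(5, 10, 15, 20, 30)
    n <- c(5, 10, 15, 20, 30)

    expand.grid(theta = theta, m = m, n = n) %>%
      group_by(theta, m, n) %>%
      mutate(delta = sqrt(2) * qnorm(theta)) ->
      params

    params %>%
      do(sample_NAP(.$delta, .$m, .$n, iterations = 2000)) %>%
      mutate(se_null = sqrt((m + n + 1) / (12 * m * n))) %>%
      gather("sd","val", sd, SE_HM, SE_New, se_null) ->
      NAP_sim

    ggplot(NAP_sim, aes(theta, val, color = sd)) +
      facet_grid(n ~ m, labeller = "label_both") +
      geom_line() +
      theme_bw() + theme(legend.position = "bottom")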


In the above figure, the actual sampling standard deviation of NAP (in red) and the value of \(\text{SE}_{null}\) (in purple) are plotted against the true value of \(\theta\), with separate plots for various combinations of \(m\) and \(n\). The expected values of the standard errors \(\text{SE}_{HM}\) and \(\text{SE}_{New}\) (actually the square root of the expectation of the variance estimators) are depicted in green and blue, respectively. The value of \(\text{SE}_{null}\) agrees with the actual standard error when \(\delta = 0\), but the two diverge when there is a positive treatment effect. It appears that \(\text{SE}_{HM}\) and \(\text{SE}_{New}\) both under-estimate the actual standard error when \(m\) or \(n\) is equal to 5, and over-estimate for the largest values of \(\theta\). However, both of these estimators offer a marked improvement over \(\text{SE}_{null}\).


Confidence intervals

The web tool also reports 85% and 90% confidence intervals for NAP. These confidence intervals appear to have the same two problems as the standard errors. First, they are constructed as CIs for Tau rather than for NAP. For the \(100\% \times (1 - \alpha)\) CI, let \(z_{\alpha / 2}\) be the appropriate critical value from a standard normal distribution. The CIs reported by the web tool are given by

\[ \text{Tau} \pm \text{SE}_{\text{Tau}} \times z_{\alpha / 2}. \]

This is probably just an oversight in the programming, which could be corrected by instead using

\[ \text{NAP} \pm \text{SE}_{null} \times z_{\alpha / 2}. \]

In parallel with the standard error formulas, I’ll call this formula the null confidence interval. Funnily enough, the upper bound of the null CI is the same as the upper bound of the Tau CI. However, the lower bound is going to be quite a bit larger than the lower bound for the Tau CI, so that the null CI will be much narrower.

The second problem is that even the null CI has poor coverage properties because it is based on \(\text{SE}_{null}\), which can drastically over-estimate the standard error of NAP for non-null values.

Other confidence intervals

As I noted above, there has been a fair amount of previous research into how to construct CIs for \(\theta\), the parameter estimated by NAP. As is often the case with these sorts of problems, there are many different methods available, scattered across the literature. Fortunately, there are two (at least) fairly comprehensive simulation studies that compare the performance of various methods under a wide range of conditions. Newcombe (2006) examined a range of methods based on inverting Wald-type test statistics (which give CIs of the form \(\text{estimate} \pm \text{SE} \times z_{\alpha / 2}\), where \(\text{SE}\) is some standard error estimate) and score-based methods (in which the standard error is estimated using the candidate parameter value). Based on an extensive simulation, he suggested a score-based method in which the end-points of the CI are defined as the values of \(\theta\) that satisfy:

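\[ (\text{NAP} - \theta)^2 = \frac{z^2_{\alpha / 2} h \theta (1 - \theta)}{m n}\left[\frac{1}{h} + \frac{1 - \theta}{2 - \theta} + \frac{\theta}{1 + \theta}\right], \]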

where \(h = (m + n) / 2 - 1\). This equation is a fourth-degree polynomial in \(\theta\), easily solved using a numerical root-finding algorithm.

In a different simulation study, Ruscio and Mullen (2012) examined the performance of a selection of different confidence intervals for \(\theta\), including several methods not considered by Newcombe. Among the methods that they examined, they find that the bias-corrected, accelerated (BCa) bootstrap CI performs particularly well (and seems to outperform the score-based CI recommended by Newcombe).

Neither Newcombe (2006) nor Ruscio and Mullen (2012) considered constructing a confidence interval by directly pivoting the Mann-Whitney U test (the same technique used to construct confidence intervals for the Hodges-Lehmann estimator of location shift), although it seems to me that this would be possible and potentially an attractive approach in the context of SCDs. The main caveat is that such a CI would require stronger distributional assumptions than those studied in the simulations, such as that the distributions of \(Y^A\) and \(Y^B\) differ by an additive (or multiplicative) constant. In any case, it seems like it would be worth exploring this approach too.
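For readers unfamiliar with the pivoting technique mentioned here, base R's `wilcox.test()` already uses it to produce the Hodges-Lehmann estimate of location shift and its confidence interval. The sketch below only illustrates that existing machinery; it does not produce a CI for \(\theta\), which would require the extra step of mapping a shift onto \(\theta\) under some distributional assumption. The data are made up for illustration and contain no ties.

```r
# Hypothetical data without ties, for illustration only
yA <- c(2.1, 3.4, 2.8, 4.0, 3.1)        # baseline phase
yB <- c(4.6, 5.2, 3.9, 6.1, 5.5, 4.9)   # treatment phase

# Inverting the Mann-Whitney / Wilcoxon rank-sum test gives a CI for an
# additive shift between the two distributions
hl <- wilcox.test(yB, yA, conf.int = TRUE, conf.level = 0.90)

hl$estimate   # Hodges-Lehmann estimate: median of all pairwise differences
hl$conf.int   # 90% CI for the location shift
```

The interval endpoints are order statistics of the pairwise differences `yB[j] - yA[i]`, which is why the approach leans on the assumption that the two phases differ only by a constant shift.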

Another small simulation

Here is an R function for calculating several different CIs for \(\theta\), including the null CI, Wald-type CIs based on \(V_{HM}\) and \(V_{New}\), and the score-type CI recommended by Newcombe (2006). I haven’t programmed the BCa bootstrap because it would take a bit more thought to figure out how to simulate it efficiently.

The following code simulates the coverage rates of nominal 90% CIs based on each of these methods, following the same simulation set-up as above.

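    # CIs for theta: null, Wald-type (HM and Newcombe), and Newcombe's score CI
    NAP_CIs <- function(yA, yB, alpha = .05) {
      m <- length(yA)
      n <- length(yB)
      U <- sapply(yB, function(j) (j > yA) + 0.5 * (j == yA))
      t <- sum(U) / (m * n)

      # variance estimators
      V_null <- (m + n + 1) / (12 * m * n)
      Q1 <- sum(rowSums(U)^2) / (m * n^2)
      Q2 <- sum(colSums(U)^2) / (m^2 * n)
      V_HM <- (t * (1 - t) + (n - 1) * (Q1 - t^2) + (m - 1) * (Q2 - t^2)) / (m * n)
      h <- (m + n) / 2 - 1
      V_New <- t * (1 - t) * (1 + h * (1 - t) / (2 - t) + h * t / (1 + t)) / (m * n)

      # Wald-type confidence intervals
      z <- qnorm(1 - alpha / 2)
      SEs <- sqrt(c(null = V_null, HM = V_HM, Newcombe = V_New))
      Wald_lower <- t - z * SEs
      Wald_upper <- t + z * SEs

      # score-type confidence interval: roots of the score equation after
      # multiplying both sides by m * n * (2 - x) * (1 + x)
      f <- function(x) m * n * (x - t)^2 * (2 - x) * (1 + x) -
        z^2 * x * (1 - x) * (2 + h + (1 + 2 * h) * x * (1 - x))
      score_lower <- if (t > 0) uniroot(f, c(0, t))$root else 0
      score_upper <- if (t < 1) uniroot(f, c(t, 1))$root else 1

      list(NAP = t,
           CI = data.frame(lower = c(Wald_lower, score = score_lower),
                           upper = c(Wald_upper, score = score_upper)))
    }

    NAP_CIs(yA, yB)
    ## $NAP
    ## [1] 0.9636364
    ##
    ## $CI
    ##              lower     upper
    ## null     0.7106061 1.2166666
    ## HM       0.8953639 1.0319088
    ## Newcombe 0.8779819 1.0492908
    ## score    0.7499741 0.9950729

    # coverage rates of each CI under one simulation condition
    sample_CIs <- function(delta, m, n, alpha = .05, iterations) {
      CIs <- replicate(iterations, {
        yA <- rnorm(m)
        yB <- rnorm(n, mean = delta)
        NAP_CIs(yA, yB, alpha = alpha)$CI
      }, simplify = FALSE)
      theta <- pnorm(delta / sqrt(2))
      coverage <- sapply(CIs, function(x) (x$lower < theta) & (theta < x$upper))
      data.frame(CI = rownames(CIs[[1]]), coverage = rowMeans(coverage))
    }

    params %>%
      do(sample_CIs(delta = .$delta, m = .$m, n = .$n, alpha = .10, iterations = 5000)) ->
      NAP_CI_sim

    ggplot(NAP_CI_sim, aes(theta, coverage, color = CI)) +
      facet_grid(n ~ m, labeller = "label_both", scales = "free_y") +
      geom_line() +
      labs(y = "coverage") +
      geom_hline(yintercept = .90, linetype = "dashed") +
      theme_bw() + theme(legend.position = "bottom")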


The figure above plots the coverage rates of several different confidence intervals for \(\theta\): the null CI (in blue), the HM Wald CI (red), the Newcombe Wald CI (green), and the Newcombe score CI (purple). The dashed horizontal line is the nominal coverage rate of 90%. It can be seen that the null CI has the correct coverage only when \(\theta \leq .6\); for larger values of \(\theta\), its coverage becomes too conservative (tending towards 100%). The Wald-type CIs have below-nominal coverage rates, which improve as the sample size in each phase increases but remain too liberal even at the largest sample size considered. Finally, Newcombe’s score CI maintains close-to-nominal coverage over a wider range of \(\theta\) values. Although these CIs have below-nominal coverage for the smallest sample sizes, they generally have good coverage over that range of \(\theta\) values when the sample size in each phase is 10 or more. It is also notable that their coverage rates appear to become more accurate as the sample size in a given group increases, even if the sample size in the other group is fairly small and remains constant.


My aim in this post was to highlight the problems with how singlecaseresearch.org calculates standard errors and CIs for the NAP statistic. Some of these issues could easily be resolved by correcting the relevant formulas so that they are appropriate for NAP rather than Tau. However, even with these corrections, better approaches exist for calculating standard errors and CIs. I’ve highlighted some promising ones above, which seem worthy of further investigation. But I should also emphasize that these methods do come with some important caveats too.


First, all of the methods I’ve discussed are premised on having mutually independent observations. In the presence of serial correlation, I would anticipate that any of these standard errors will be too small and any of the confidence intervals will be too narrow. (This could readily be verified through simulation, although I have not done so here.)
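A minimal sketch of such a check is given below. It compares the sampling standard deviation of NAP under independent and under first-order autoregressive errors when there is no treatment effect. Everything specific here is an assumption of mine rather than part of the analysis above: the AR(1) error model, the autocorrelation `phi = 0.5`, the phase lengths, the seed, and the helper names `nap()` and `sim_nap()`.

```r
set.seed(20240101)

m <- 10            # baseline observations (assumed)
n <- 10            # treatment observations (assumed)
phi <- 0.5         # assumed first-order autocorrelation
iterations <- 2000

# NAP from the matrix of all pairwise differences, ties weighted 1/2
nap <- function(yA, yB) {
  d <- outer(yB, yA, "-")
  mean((d > 0) + 0.5 * (d == 0))
}

# Simulate NAP for one continuous series split into two phases, no effect
sim_nap <- function(phi) {
  replicate(iterations, {
    y <- if (phi == 0) {
      rnorm(m + n)
    } else {
      as.numeric(arima.sim(model = list(ar = phi), n = m + n))
    }
    nap(y[1:m], y[(m + 1):(m + n)])
  })
}

se_null <- sqrt((m + n + 1) / (12 * m * n))
sd_indep <- sd(sim_nap(0))
sd_ar1 <- sd(sim_nap(phi))

round(c(se_null = se_null, sd_indep = sd_indep, sd_ar1 = sd_ar1), 3)
```

Under independence the simulated standard deviation should sit close to the null formula, whereas with positive autocorrelation it should come out noticeably larger, in line with the expectation that the independence-based standard errors are too small for serially correlated data.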

Second, my small simulations are based on the assumption of normally distributed, homoskedastic observations in each phase, which is not a particularly good model for the types of outcome measures commonly used in single-case research. In some of my other work, I’ve developed statistical models for data collected by systematic direct observation of behavior, which is the most prevalent type of outcome data in single-case research. Before recommending any particular method, the performance of the standard error formulas (e.g., the Hanley-McNeil and Newcombe estimators) and CI methods (such as Newcombe’s score CI) should be examined under more realistic models for behavioral observation data.
