| Title: | Exploratory Data Analysis, Group Comparison Tools, and Other Procedures |
|---|---|
| Description: | Provides a comprehensive set of tools for descriptive statistics, graphical data exploration, outlier detection, homoscedasticity testing, and multiple comparison procedures. Includes manual implementations of Levene's test, Bartlett's test, and the Fligner-Killeen test, as well as post hoc comparison methods such as Tukey, Scheffé, Games-Howell, Brunner-Munzel, and others. This version introduces two new procedures: the Jonckheere-Terpstra trend test and the Jarque-Bera test with Glinskiy's (2024) correction. Designed for use in teaching, applied statistical analysis, and reproducible research. It also includes a post hoc Test Planner that helps you decide which procedure is most suitable. |
| Authors: | Carlos Jiménez-Gallardo [aut, cre] |
| Maintainer: | Carlos Jiménez-Gallardo <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 2.2.0 |
| Built: | 2026-06-06 06:32:27 UTC |
| Source: | https://github.com/cran/Analitica |
The Analitica package provides tools for exploratory statistical analysis, data visualization, and comparison of groups using both parametric and non-parametric methods. It supports univariate and grouped descriptive summaries, outlier detection, homoscedasticity testing, and multiple post hoc procedures.
Designed for applied analysis workflows, this package includes intuitive plotting functions and manual implementations of key statistical tests often needed in educational or research contexts.
descripYG: Descriptive statistics with visualizations (histograms, boxplots, density ridges).
Levene.Test: Manual implementation of Levene’s test for homogeneity of variances.
BartlettTest: Manual implementation of Bartlett’s test.
FKTest: Manual implementation of the Fligner-Killeen test.
grubbs_outliers: Outlier detection based on Grubbs' test.
GHTest, DuncanTest, SNKTest, etc.: Post hoc comparison procedures.
Carlos Jiménez-Gallardo
Creates a bar plot of group means with error bars representing either the standard deviation (SD) or the standard error (SE).
bar_error(dataSet, vD, vI, variation = "sd", title = "Bar plot with error bars", label_y = "Y Axis", label_x = "X Axis")
dataSet |
A |
vD |
A string indicating the name of the numeric dependent variable. |
vI |
A string indicating the name of the categorical independent variable (grouping variable). |
variation |
Type of variation to display: |
title |
Title of the plot. Default is "Bar plot with error bars". |
label_y |
Label for the Y-axis. Default is "Y Axis". |
label_x |
Label for the X-axis. Default is "X Axis". |
A ggplot object representing the plot.
data(d_e, package = "Analitica")
bar_error(d_e, vD = Sueldo_actual, vI = labor, variation = "sd")
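To make the two `variation` options concrete, the following base-R sketch computes the quantities such a plot displays: group means with SD and SE. It is not the package's internal code; it uses the built-in `iris` data rather than `d_e`, and it assumes the usual definition SE = SD / sqrt(n). The name `resumen` is illustrative.

```r
# Group means with SD and SE, computed in base R (illustrative only).
grp_mean <- tapply(iris$Sepal.Length, iris$Species, mean)
grp_sd   <- tapply(iris$Sepal.Length, iris$Species, sd)
grp_n    <- tapply(iris$Sepal.Length, iris$Species, length)
grp_se   <- grp_sd / sqrt(grp_n)  # standard error of the mean (assumed definition)

resumen <- data.frame(
  group = names(grp_mean),
  mean  = as.vector(grp_mean),
  sd    = as.vector(grp_sd),
  se    = as.vector(grp_se)
)
print(resumen)
```

SE bars shrink as group size grows, while SD bars describe the spread of the raw observations, so the two choices answer different questions.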
Conducts Bartlett's test to evaluate whether multiple groups have equal variances, based on a formula interface and raw data vectors, without requiring a fitted model. This implementation provides flexibility for exploratory variance testing in custom workflows.
BartlettTest(formula, data, alpha = 0.05)
formula |
A formula of the form |
data |
A data frame containing the variables specified in the formula. |
alpha |
Significance level for the test (default is 0.05). |
Bartlett’s test is appropriate when group distributions are approximately normal. It tests the null hypothesis that all groups have equal variances (homoscedasticity).
Advantages:
- Straightforward to compute.
- High sensitivity to variance differences under normality.
Disadvantages:
- Highly sensitive to non-normal distributions.
- Less robust than alternatives like Levene’s test for skewed or heavy-tailed data.
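As a rough illustration of what a manual Bartlett computation involves, the sketch below builds the chi-squared statistic from group variances and checks it against `stats::bartlett.test`. It uses the built-in `iris` data and the textbook formula; it is an assumption that the package follows the same steps.

```r
# Bartlett's statistic by hand (textbook formula), checked against base R.
y <- iris$Sepal.Length
g <- iris$Species

n_i  <- tapply(y, g, length)   # group sizes
s2_i <- tapply(y, g, var)      # group variances
k    <- length(n_i)
N    <- sum(n_i)

s2_p <- sum((n_i - 1) * s2_i) / (N - k)                     # pooled variance
num  <- (N - k) * log(s2_p) - sum((n_i - 1) * log(s2_i))
den  <- 1 + (sum(1 / (n_i - 1)) - 1 / (N - k)) / (3 * (k - 1))
K2   <- num / den                                           # chi-squared statistic
p_value <- pchisq(K2, df = k - 1, lower.tail = FALSE)

ref <- bartlett.test(Sepal.Length ~ Species, data = iris)
c(manual = K2, stats = unname(ref$statistic))
```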
An object of class "homocedasticidad", containing:
Statistic: Bartlett's chi-squared test statistic.
df: Degrees of freedom associated with the test.
p_value: The p-value for the test statistic.
Decision: A character string indicating the conclusion ("Heterocedastic" or "Homocedastic").
Method: A character string indicating the method used ("Bartlett").
Bartlett, M. S. (1937). "Properties of sufficiency and statistical tests." Proceedings of the Royal Society of London, Series A, 160(901), 268–282.
data(d_e, package = "Analitica")
res <- BartlettTest(Sueldo_actual ~ labor, data = d_e)
summary(res)
summary(BartlettTest(Sueldo_actual ~ as.factor(labor), data = d_e))
Performs the Brunner-Munzel test using a permutation approach, suitable for comparing two independent samples when the assumption of equal variances may not hold.
BMpTest(grupo1, grupo2, alpha = 0.05, alternative = c("two.sided", "less", "greater"), nperm = 10000, seed = NULL)
grupo1 |
A numeric vector representing the first group. |
grupo2 |
A numeric vector representing the second group. |
alpha |
Significance level (default is 0.05). |
alternative |
Character string specifying the alternative hypothesis: one of "two.sided", "less", or "greater". |
nperm |
Number of permutations to perform (default = 10000). |
seed |
Optional random seed for reproducibility (default is NULL). |
This version computes an empirical p-value based on resampling, without relying on the t-distribution approximation.
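The idea of an empirical p-value can be sketched as follows. This is a minimal base-R illustration, not the package's code: `bm_stat` and `perm_pvalue` are hypothetical helper names, the statistic follows the usual Brunner-Munzel rank formula, and the (1 + count) / (nperm + 1) convention for a two-sided p-value is an assumption.

```r
# Brunner-Munzel statistic from pooled and within-group ranks.
bm_stat <- function(x, y) {
  n1 <- length(x); n2 <- length(y)
  r  <- rank(c(x, y))                      # pooled ranks
  r1 <- r[seq_len(n1)]; r2 <- r[n1 + seq_len(n2)]
  m1 <- mean(r1); m2 <- mean(r2)
  v1 <- sum((r1 - rank(x) - m1 + (n1 + 1) / 2)^2) / (n1 - 1)
  v2 <- sum((r2 - rank(y) - m2 + (n2 + 1) / 2)^2) / (n2 - 1)
  n1 * n2 * (m2 - m1) / ((n1 + n2) * sqrt(n1 * v1 + n2 * v2))
}

# Two-sided permutation p-value: reshuffle group labels and recompute.
perm_pvalue <- function(x, y, nperm = 999, seed = 1) {
  set.seed(seed)
  obs    <- bm_stat(x, y)
  pooled <- c(x, y)
  n1     <- length(x)
  perm <- replicate(nperm, {
    idx <- sample(length(pooled), n1)      # random relabelling
    bm_stat(pooled[idx], pooled[-idx])
  })
  (1 + sum(abs(perm) >= abs(obs))) / (nperm + 1)
}

set.seed(42)
x <- rnorm(15)
y <- rnorm(15, mean = 2)
p_perm <- perm_pvalue(x, y)
p_perm
```

Because the reference distribution is built from the data themselves, the result does not depend on the t approximation, at the cost of `nperm` recomputations of the statistic.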
An object of class "comparacion" and "brunnermunzel_perm", containing:
Resultados: A data frame with comparison name, mean difference, empirical p-value, and significance.
Promedios: A named numeric vector of group means.
Orden_Medias: Group names ordered by their mean.
Metodo: Description of the method used.
Brunner, E., & Munzel, U. (2000). "The nonparametric Behrens-Fisher problem: Asymptotic theory and a small-sample approximation." Biometrical Journal, 42(1), 17–25.
data(d_e, package = "Analitica")
g1 <- d_e$Sueldo_actual[d_e$labor == 1]
g2 <- d_e$Sueldo_actual[d_e$labor == 2]
resultado <- BMpTest(g1, g2)
summary(resultado)
Performs the Brunner-Munzel nonparametric test for two independent groups, which estimates the probability that a randomly selected value from one group is less than a randomly selected value from the other group.
BMTest(grupo1, grupo2, alpha = 0.05, alternative = c("two.sided", "less", "greater"))
grupo1 |
Numeric vector of values from group 1. |
grupo2 |
Numeric vector of values from group 2. |
alpha |
Significance level (default = 0.05). |
alternative |
Character string specifying the alternative hypothesis: one of "two.sided", "less", or "greater". |
This test is suitable when group variances are unequal and/or sample sizes differ. It does not assume equal variances and is often used as a more robust alternative to the Wilcoxon test.
Advantages:
- Handles unequal variances and non-normality.
- Recommended when variance homogeneity is questionable.
Disadvantages:
- Less well-known and supported.
- Requires large sample sizes for accurate inference.
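The probability that the test estimates, P(X < Y) + 0.5 * P(X = Y), can be computed directly for small samples. The sketch below is illustrative only; `p_hat_est` is a hypothetical helper, not a package function.

```r
# Estimate P(X < Y) + 0.5 * P(X = Y) over all pairs (x_i, y_j).
p_hat_est <- function(x, y) {
  less <- outer(x, y, "<")    # TRUE where x_i < y_j
  ties <- outer(x, y, "==")   # TRUE where x_i == y_j
  mean(less) + 0.5 * mean(ties)
}

x <- c(1, 2, 3, 4)
y <- c(3, 4, 5, 6)
p_hat_est(x, y)  # 13 of 16 pairs have x < y, 2 are ties: 0.875
```

A value of 0.5 corresponds to no tendency for either group to produce larger values, which is the null hypothesis of the test.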
An object of class "comparacion" and "brunnermunzel", containing:
Resultados: A data frame with test statistics, p-value, and estimated effect size.
Promedios: A named numeric vector of group means.
Orden_Medias: Group names ordered by their mean values (descending).
Metodo: A character string describing the test and hypothesis.
p_hat: Estimated probability that a value from grupo1 is less than a value from grupo2 (plus 0.5 * ties).
Brunner, E., & Munzel, U. (2000). "The nonparametric Behrens-Fisher problem: Asymptotic theory and a small-sample approximation." Biometrical Journal, 42(1), 17–25. <https://doi.org/10.1002/(SICI)1521-4036(200001)42:1
data(d_e, package = "Analitica")
g1 <- d_e$Sueldo_actual[d_e$labor == 1]
g2 <- d_e$Sueldo_actual[d_e$labor == 2]
resultado <- BMTest(g1, g2, alternative = "greater")
summary(resultado)
Performs all pairwise comparisons using the Wilcoxon rank-sum test (Mann-Whitney) with Bonferroni correction for multiple testing.
BonferroniNPTest(formula, data, alpha = 0.05)
formula |
A formula of the form |
data |
A data frame containing the variables. |
alpha |
Significance level (default is 0.05). |
Suitable for non-parametric data where ANOVA assumptions are violated.
Advantages:
- Simple and intuitive non-parametric alternative to ANOVA post hoc tests.
- Strong control of Type I error via Bonferroni correction.
- Works with unequal group sizes.
Disadvantages:
- Conservative with many groups.
- Only valid for pairwise comparisons; does not support complex contrasts.
- Only useful in completely randomized or single-factor designs.
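For orientation, the same kind of analysis can be approximated with base R's `pairwise.wilcox.test`. This sketch is a cross-check under assumptions, not the package's implementation: it uses the normal approximation (`exact = FALSE`) because `iris` contains ties, and the package's handling of ties and continuity correction may differ.

```r
# Pairwise Wilcoxon rank-sum tests with Bonferroni adjustment in base R.
res <- pairwise.wilcox.test(iris$Sepal.Length, iris$Species,
                            p.adjust.method = "bonferroni", exact = FALSE)
res$p.value

# The adjustment multiplies each raw p-value by the number of
# comparisons (3 here) and caps the result at 1.
x1  <- iris$Sepal.Length[iris$Species == "setosa"]
x2  <- iris$Sepal.Length[iris$Species == "versicolor"]
raw <- wilcox.test(x1, x2, exact = FALSE)$p.value
adj <- min(1, 3 * raw)
adj
```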
An object of class "bonferroni_np" and "comparaciones", containing:
Resultados: Data frame with comparisons, W-statistics, raw and adjusted p-values, and significance levels.
Promedios: Mean ranks of each group.
Orden_Medias: Group names ordered from highest to lowest rank.
Metodo: Name of the method used ("Bonferroni (non-parametric)").
Wilcoxon, F. (1945). Individual Comparisons by Ranking Methods. Biometrics Bulletin, 1(6), 80–83. doi:10.2307/3001968
Dunn, O. J. (1964). Multiple Comparisons Using Rank Sums. Technometrics, 6(3), 241–252. doi:10.1080/00401706.1964.10490181
Shaffer, J. P. (1995). Multiple Hypothesis Testing. Annual Review of Psychology, 46(1), 561–584. doi:10.1146/annurev.ps.46.020195.003021
data(iris)
BonferroniNPTest(Sepal.Length ~ Species, data = iris)
Performs pairwise t-tests with Bonferroni adjustment for multiple comparisons. This method controls the family-wise error rate by dividing the alpha level by the number of comparisons.
BonferroniTest(modelo, comparar = NULL, alpha = 0.05)
modelo |
An |
comparar |
Character vector with the name(s) of the factor(s) to compare:
- One name: main effect (e.g., "treatment" or "A")
- Several names: interaction (e.g., |
alpha |
Significance level (default 0.05). |
Advantages:
- Very simple and easy to implement.
- Strong control of Type I error.
- Applicable to any set of independent comparisons.
Disadvantages:
- Highly conservative, especially with many groups.
- Can lead to low statistical power (increased Type II error).
- Does not adjust test statistics, only p-values.
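Dividing alpha by the number of comparisons and multiplying each p-value by that number lead to the same decisions. The sketch below shows this with base R's `pairwise.t.test` and `p.adjust` on the built-in `iris` data; it does not reproduce the package's output structure.

```r
# Raw pairwise t-test p-values (pooled SD), without adjustment.
raw   <- pairwise.t.test(iris$Sepal.Length, iris$Species,
                         p.adjust.method = "none")$p.value
p_raw <- raw[!is.na(raw)]   # keep the lower-triangle comparisons
m     <- length(p_raw)      # number of comparisons
alpha <- 0.05

# Two equivalent ways of applying the Bonferroni rule.
reject_threshold <- p_raw <= alpha / m
reject_adjusted  <- p.adjust(p_raw, method = "bonferroni") <= alpha
identical(reject_threshold, reject_adjusted)
```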
An object of class "bonferroni" and "comparaciones" containing:
Resultados: a data.frame with columns Comparacion, Diferencia, SE, t_value,
p_value (unadjusted), p_ajustada (Bonferroni), Valor_Critico (critical difference), and Significancia.
Promedios: a named vector of group means as defined by comparar.
Orden_Medias: group names ordered from highest to lowest mean.
Metodo: "Bonferroni-adjusted t-test".
Termino: the term being compared (e.g., "A", "B", or "A:B").
MSerror, df_error, N: useful for plots with error bars.
Dunn, O. J. (1964). Multiple Comparisons Using Rank Sums. Technometrics, 6(3), 241–252. doi:10.1080/00401706.1964.10490181
Wilcoxon, F. (1945). Individual Comparisons by Ranking Methods. Biometrics Bulletin, 1(6), 80–83. doi:10.2307/3001968
# CRD (completely randomized design)
data(d_e, package = "Analitica")
mod1 <- aov(Sueldo_actual ~ as.factor(labor), data = d_e)
summary(mod1)
resultado <- BonferroniTest(mod1)
summary(resultado)
# RCBD: y ~ treatment + block
mod2 <- aov(Sueldo_actual ~ as.factor(labor) + Sexo, data = d_e)
res <- BonferroniTest(mod2, comparar = "as.factor(labor)")
summary(res); plot(res)
# Factorial: y ~ A * B
mod3 <- aov(Sueldo_actual ~ as.factor(labor) * Sexo, data = d_e)
resAB <- BonferroniTest(mod3, comparar = c("as.factor(labor)", "Sexo")) # compares A:B cells
summary(resAB, n = Inf); plot(resAB, horizontal = TRUE)
Performs the Brown-Forsythe test using absolute deviations from the median of each group, followed by a one-way ANOVA on those deviations.
BrownForsytheTest(formula, data, alpha = 0.05)
formula |
A formula of the form |
data |
A data frame containing the variables. |
alpha |
Significance level (default is 0.05). |
This test is a robust alternative to Bartlett's test, especially useful when the assumption of normality is violated or when outliers are present.
Advantages:
- More robust than Bartlett's test under non-normal distributions.
- Less sensitive to outliers due to the use of the median.
Disadvantages:
- Lower power than Bartlett's test when normality strictly holds.
- Assumes that absolute deviations follow similar distributions across groups.
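The two steps of the procedure (absolute deviations from group medians, then a one-way ANOVA on them) can be sketched in base R. This is an illustration on the built-in `iris` data, not the package's code.

```r
# Step 1: absolute deviations from each group's median.
y   <- iris$Sepal.Length
g   <- iris$Species
med <- tapply(y, g, median)
z   <- abs(y - med[as.character(g)])

# Step 2: one-way ANOVA on the deviations.
fit     <- anova(lm(z ~ g))
F_stat  <- fit[["F value"]][1]
p_value <- fit[["Pr(>F)"]][1]
df      <- fit[["Df"]]        # numerator and denominator degrees of freedom

c(F = F_stat, df1 = df[1], df2 = df[2], p = p_value)
```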
An object of class "homocedasticidad", with:
Statistic: F-statistic.
df1: Numerator degrees of freedom.
df2: Denominator degrees of freedom.
p_value: P-value.
Decision: "Heterocedastic" or "Homocedastic".
Method: "Brown-Forsythe".
Brown, M. B., & Forsythe, A. B. (1974). "Robust Tests for the Equality of Variances". Journal of the American Statistical Association, 69(346), 364–367.
data(d_e, package = "Analitica")
res <- BrownForsytheTest(Sueldo_actual ~ labor, data = d_e)
summary(res)
Performs non-parametric pairwise comparisons based on rank-transformed data using the Conover-Iman procedure. This method is typically applied as a post hoc test following a significant Kruskal-Wallis test to identify specific group differences.
ConoverTest(formula, data, alpha = 0.05, method.p = "holm")
formula |
A formula of the form |
data |
A data frame containing the variables specified in the formula. |
alpha |
Significance level for hypothesis testing (default is 0.05). |
method.p |
Method used to adjust p-values for multiple comparisons (default is "holm"). |
The Conover-Iman test uses rank-based t-statistics, offering improved statistical power over Dunn's test while maintaining flexibility in sample size.
Advantages:
- More powerful than Dunn’s test, especially with moderate group differences.
- Robust to non-normal data and suitable for ordinal or skewed distributions.
- Allows for unequal sample sizes across groups.
Disadvantages:
- Sensitive to heteroscedasticity (non-constant variances).
- Requires appropriate p-value adjustment to control the family-wise error rate.
- Only useful in completely randomized or single-factor designs.
An object of class "conover" and "comparaciones", containing:
Resultados: A data frame with pairwise comparisons, t-statistics, raw and adjusted p-values, and significance markers.
Promedios: A named numeric vector with mean ranks for each group.
Orden_Medias: A character vector with group names sorted from highest to lowest rank.
Metodo: A string describing the method used ("Conover (no parametrico)").
Conover, W. J. & Iman, R. L. (1979). "Multiple comparisons using rank sums." Technometrics, 21(4), 489–495.
data(d_e, package = "Analitica")
ConoverTest(Sueldo_actual ~ labor, data = d_e)
Dataset to be used as an example. The variables are:
data(d_e)
A data.frame with N rows and M columns. Typical variables may include:
Employee ID
Employee sex
Date of birth
Years of education
Work area within the company
Current salary
Starting salary when joining the company
Months working at the company
Months of experience
Estimated monthly income
Minority membership
Performs a descriptive analysis on a numeric dependent variable, either globally or grouped by an independent variable. Displays summary statistics such as mean, standard deviation, skewness, and kurtosis, and generates associated plots (histogram, boxplot, or density ridges).
descripYG(dataset, vd, vi = NULL)
dataset |
A |
vd |
A numeric variable to analyze (dependent variable). |
vi |
An optional grouping variable (independent variable, categorical or numeric). |
A data.frame with descriptive statistics. Also prints plots to the graphics device.
data(d_e, package = "Analitica")
descripYG(d_e, vd = Sueldo_actual)
descripYG(d_e, vd = Sueldo_actual, vi = labor)
descripYG(d_e, Sueldo_actual, labor)
Robust non-parametric method for multiple comparisons after Kruskal-Wallis. Uses rank-based pairwise tests with a pooled variance estimate.
DSCFTest(formula, data, alpha = 0.05, method.p = "holm", na.rm = TRUE, include_kw = TRUE)
formula |
y ~ group |
data |
A data.frame containing the variables. |
alpha |
Significance level (default 0.05); used only for the significance stars. |
method.p |
P-value adjustment method (default "holm"). |
na.rm |
Remove NA values (TRUE by default). |
include_kw |
If TRUE, add a summary of the Kruskal-Wallis test. |
Advantages:
- Strong control of Type I error with unequal sample sizes.
- More powerful than Dunn in many conditions.
Disadvantages:
- Computationally more complex.
- Less commonly available in standard software.
- Only useful in completely randomized or single-factor designs.
An object with classes c("comparaciones", "dscf").
Dwass, M. (1960). Some k-sample rank-order tests. In I. Olkin et al. (Eds.), Contributions to Probability and Statistics: Essays in Honor of Harold Hotelling (pp. 198–202). Stanford University Press.
data(d_e, package = "Analitica")
DSCFTest(Sueldo_actual ~ labor, data = d_e)
Performs the Duncan test for pairwise comparisons after an ANOVA. This method is more liberal than Tukey's HSD, using a stepwise approach with critical values from the studentized range distribution.
DuncanTest(modelo, comparar = NULL, alpha = 0.05)
modelo |
An |
comparar |
Character vector with the name(s) of the factor(s) to compare:
- One name: main effect (e.g., "treatment" or "A")
- Several names: interaction (e.g., |
alpha |
Significance level (default 0.05). |
Advantages:
- High power for detecting differences.
- Simple to interpret and implement.
Disadvantages:
- Inflates Type I error rate.
- Not recommended for confirmatory research.
An object of class "duncan" and "comparaciones" containing:
Resultados: a data.frame with columns Comparacion, Diferencia, SE, t_value,
p_value (unadjusted), p_ajustada (duncan), Valor_Critico (critical difference), and Significancia.
Promedios: a named vector of group means as defined by comparar.
Orden_Medias: group names ordered from highest to lowest mean.
Metodo: "Duncan t-test".
Termino: the term being compared (e.g., "A", "B", or "A:B").
MSerror, df_error, N: useful for plots with error bars.
Duncan, D. B. (1955). "Multiple range and multiple F tests." Biometrics, 11(1), 1-42.
# CRD (completely randomized design)
data(d_e, package = "Analitica")
mod1 <- aov(Sueldo_actual ~ as.factor(labor), data = d_e)
resultado <- DuncanTest(mod1)
summary(resultado)
plot(resultado)
# RCBD
mod2 <- aov(Sueldo_actual ~ as.factor(labor) + Sexo, data = d_e)
res <- DuncanTest(mod2, comparar = "as.factor(labor)")
summary(res); plot(res)
# Factorial
mod3 <- aov(Sueldo_actual ~ as.factor(labor) * Sexo, data = d_e)
resAB <- DuncanTest(mod3, comparar = c("as.factor(labor)", "Sexo")) # A:B cells
summary(resAB, n = Inf); plot(resAB, horizontal = TRUE)
Performs Dunn's test for pairwise comparisons following a Kruskal-Wallis test. Suitable for non-parametric data (ordinal or non-normal), using rank sums. Includes Holm correction by default for multiple comparisons.
DunnTest(formula, data, alpha = 0.05, method.p = "holm")
formula |
A formula of the form |
data |
A data frame containing the variables. |
alpha |
Significance level (default is 0.05). |
method.p |
Method for p-value adjustment (default is "holm"). |
Advantages:
- Simple and widely used non-parametric alternative to Tukey's test.
- Handles unequal sample sizes.
- Compatible with various p-value corrections (e.g., Holm, Bonferroni).
Disadvantages:
- Less powerful than DSCF or Conover when sample sizes vary widely.
- Requires ranking all data and can be conservative depending on adjustment.
An object of class "dunn" and "comparaciones", including:
Resultados: Data frame with group comparisons, z-values, raw and adjusted p-values, and significance.
Promedios: Mean ranks of each group.
Orden_Medias: Group names ordered from highest to lowest rank.
Metodo: "Dunn (no paramétrico)".
Dunn, O. J. (1964). Multiple comparisons using rank sums. *Technometrics*, 6(3), 241–252. doi:10.1080/00401706.1964.10490181
data(d_e, package = "Analitica")
DunnTest(Sueldo_actual ~ labor, data = d_e)
Performs a non-parametric Fligner-Killeen test for equality of variances across two or more groups, using raw vectors via a formula interface.
FKTest(formula, data, alpha = 0.05)
formula |
A formula of the form |
data |
A data frame containing the variables in the formula. |
alpha |
Significance level (default is 0.05). |
This test is particularly useful when the assumption of normality is violated, as it is robust to outliers and distributional deviations. It serves as a reliable alternative to Bartlett’s test when data do not follow a normal distribution.
Advantages:
- Non-parametric: no assumption of normality.
- Robust to outliers.
- Suitable for heterogeneous sample sizes.
Disadvantages:
- Less powerful than parametric tests under normality.
- May be computationally intensive with large datasets.
An object of class "homocedasticidad", containing:
The Fligner-Killeen chi-squared statistic.
Degrees of freedom.
The p-value for the test.
"Homoscedastic" or "Heteroscedastic" depending on the test result.
A string indicating the method used ("Fligner-Killeen").
Fligner, M. A., & Killeen, T. J. (1976). "Distribution-free two-sample tests for scale." Journal of the American Statistical Association, 71(353), 210–213. <https://doi.org/10.1080/01621459.1976.10480351>
data(d_e, package = "Analitica")
res <- FKTest(Sueldo_actual ~ labor, data = d_e)
summary(res)
A modification of Tukey's test for use with moderately unequal sample sizes.
GabrielTest(modelo, comparar = NULL, alpha = 0.05)
modelo |
An |
comparar |
Character vector with the name(s) of the factor(s) to compare:
- One name: main effect (e.g., "treatment" or "A")
- Several names: interaction (e.g., |
alpha |
Significance level (default 0.05). |
Advantages:
- More powerful than Tukey for unequal group sizes.
- Controls error rates effectively with moderate imbalance.
Disadvantages:
- Can be anti-conservative with large differences in group sizes.
- Less common in standard statistical software.
An object of class "gabriel" and "comparaciones" containing:
Resultados: a data.frame with columns Comparacion, Diferencia, SE, t_value,
p_value (unadjusted), p_ajustada (gabriel), Valor_Critico (critical difference), and Significancia.
Promedios: a named vector of group means as defined by comparar.
Orden_Medias: group names ordered from highest to lowest mean.
Metodo: "Gabriel t-test".
Termino: the term being compared (e.g., "A", "B", or "A:B").
MSerror, df_error, N: useful for plots with error bars.
Hochberg, Y., & Tamhane, A. C. (1987). Multiple Comparison Procedures.
# CRD (completely randomized design)
data(d_e, package = "Analitica")
mod1 <- aov(Sueldo_actual ~ as.factor(labor), data = d_e)
resultado <- GabrielTest(mod1)
summary(resultado)
plot(resultado)
# RCBD
mod2 <- aov(Sueldo_actual ~ as.factor(labor) + Sexo, data = d_e)
res <- GabrielTest(mod2, comparar = "as.factor(labor)")
summary(res); plot(res)
# Factorial
mod3 <- aov(Sueldo_actual ~ as.factor(labor) * Sexo, data = d_e)
resAB <- GabrielTest(mod3, comparar = c("as.factor(labor)", "Sexo")) # A:B cells
summary(resAB, n = Inf); plot(resAB, horizontal = TRUE)
Performs the Games-Howell test for pairwise comparisons after ANOVA, without assuming equal variances or sample sizes. It is suitable when Levene or Bartlett test indicates heterogeneity of variances.
GHTest(modelo, comparar = NULL, alpha = 0.05)
modelo |
An |
comparar |
Character vector with the name(s) of the factor(s) to compare:
- One name: main effect (e.g., "treatment" or "A")
- Several names: interaction (e.g., |
alpha |
Significance level (default is 0.05). |
Advantages:
- Excellent for heteroscedastic data.
- Controls Type I error across unequal group sizes.
Disadvantages:
- Slightly conservative in small samples.
- More complex to compute than Tukey.
An object of class "gameshowell" and "comparaciones",
which contains:
Resultados: A data frame with pairwise comparisons, including:
Comparacion, Diferencia, t_value, gl,
p_value, and Significancia.
Promedios: A named numeric vector of group means as defined by comparar.
Orden_Medias: Group names ordered from highest to lowest mean.
Metodo: A character string indicating the method used ("Games-Howell").
Termino: The term being compared (e.g., "A", "B", or "A:B").
Games, P. A., & Howell, J. F. (1976). "Pairwise Multiple Comparison Procedures with Unequal N's and/or Variances: A Monte Carlo Study". Journal of Educational Statistics, 1(2), 113–125. <https://doi.org/10.1002/j.2162-6057.1976.tb00211.x>
data(d_e, package = "Analitica")
mod <- aov(Sueldo_actual ~ as.factor(labor), data = d_e)
# Comparison on the first factor of the model
resultado <- GHTest(mod)
summary(resultado)
plot(resultado)
# With blocks, comparing only the factor of interest
mod2 <- aov(Sueldo_actual ~ as.factor(labor) + Sexo, data = d_e)
res2 <- GHTest(mod2, comparar = "as.factor(labor)")
summary(res2)
plot(res2)
# Model with interaction
mod3 <- aov(Sueldo_actual ~ as.factor(labor) * Sexo, data = d_e)
# Main effect
resA <- GHTest(mod3, comparar = "as.factor(labor)")
# Interaction
resAB <- GHTest(mod3, comparar = c("as.factor(labor)", "Sexo"))
summary(resAB)
plot(resAB)
Detects one or more outliers in a numeric variable using the iterative Grubbs' test, which assumes the data follow a normal distribution.
grubbs_outliers(dataSet, vD, alpha = 0.05)
dataSet |
A |
vD |
Unquoted name of the numeric variable to be tested for outliers. |
alpha |
Significance level for the test (default is 0.05). |
The function applies Grubbs' test iteratively, removing the most extreme value and retesting until no further significant outliers are found. The test is valid only under the assumption of normality.
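The iterative scheme can be sketched in a few lines of base R. This is a minimal illustration on a plain numeric vector, assuming the standard two-sided Grubbs critical value; `grubbs_iterative` is a hypothetical helper, and the package function works on a data frame column instead.

```r
# Iterative Grubbs procedure: flag the most extreme value while it is significant.
grubbs_iterative <- function(x, alpha = 0.05) {
  is_out <- rep(FALSE, length(x))
  repeat {
    idx <- which(!is_out)            # observations still under test
    n   <- length(idx)
    if (n < 3) break                 # the test needs at least 3 values
    xs  <- x[idx]
    if (sd(xs) == 0) break           # no variation left to test
    dev <- abs(xs - mean(xs))
    G   <- max(dev) / sd(xs)
    # Two-sided critical value based on the t distribution.
    t_crit <- qt(alpha / (2 * n), df = n - 2, lower.tail = FALSE)
    G_crit <- (n - 1) / sqrt(n) * sqrt(t_crit^2 / (n - 2 + t_crit^2))
    if (G <= G_crit) break           # most extreme value is not significant
    is_out[idx[which.max(dev)]] <- TRUE
  }
  is_out
}

x <- c(9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.3, 25.0)
flags <- grubbs_iterative(x)
x[flags]  # only the value 25 is flagged
```

Each pass recomputes the mean and SD without the values already flagged, which is why a single extreme point does not mask the assessment of the remaining data.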
A data.frame identical to the input, with an added logical column outL
indicating which observations were identified as outliers (TRUE or FALSE).
Grubbs, F. E. (1969). "Procedures for Detecting Outlying Observations in Samples." Technometrics, 11(1), 1–21. doi:10.1080/00401706.1969.10490657
data(d_e, package = "Analitica")
d <- grubbs_outliers(d_e, Sueldo_actual)
Performs pairwise t-tests with p-values adjusted using Holm’s sequential method.
HolmTest(modelo, comparar = NULL, alpha = 0.05)
modelo |
An |
comparar |
Character vector with the name(s) of the factor(s) to compare:
- One name: main effect (e.g., "treatment" or "A")
- Several names: interaction (e.g., |
alpha |
Significance level (default 0.05). |
Advantages:
- Controls family-wise error rate more efficiently than Bonferroni.
- Easy to apply over any set of p-values.
Disadvantages:
- Does not adjust test statistics, only p-values.
- Slightly more conservative than false discovery rate (FDR) methods.
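Holm's sequential adjustment is easy to reproduce by hand, which also shows why it is never more conservative than Bonferroni. The p-values below are made up for illustration; the result is checked against base R's `p.adjust`.

```r
# Holm step-down adjustment by hand (illustrative p-values).
p_raw <- c(0.010, 0.040, 0.030, 0.005)
m     <- length(p_raw)
ord   <- order(p_raw)

# Smallest p-value is multiplied by m, the next by m - 1, and so on;
# cummax() keeps the adjusted values monotone, pmin() caps them at 1.
stepdown <- (m - seq_len(m) + 1) * p_raw[ord]
p_holm   <- numeric(m)
p_holm[ord] <- pmin(1, cummax(stepdown))

p_holm                                # 0.03 0.06 0.06 0.02
p.adjust(p_raw, method = "holm")      # same values
p.adjust(p_raw, method = "bonferroni")  # 0.04 0.16 0.12 0.02
```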
An object of class "holm" and "comparaciones" containing:
Resultados: a data.frame with columns Comparacion, Diferencia, SE, t_value,
p_value (unadjusted), p_ajustada (Holm), Valor_Critico (critical difference), and Significancia.
Promedios: a named vector of group means as defined by comparar.
Orden_Medias: group names ordered from highest to lowest mean.
Metodo: "Holm t-test".
Termino: the term being compared (e.g., "A", "B", or "A:B").
MSerror, df_error, N: useful for plots with error bars.
Holm, S. (1979). A simple sequentially rejective multiple test procedure. Scandinavian Journal of Statistics, 6(2), 65–70.
data(d_e, package = "Analitica")
mod <- aov(Sueldo_actual ~ as.factor(labor), data = d_e)
resultado <- HolmTest(mod)
summary(resultado)
plot(resultado)
# RCBD
mod <- aov(Sueldo_actual ~ as.factor(labor) + Sexo, data = d_e)
res <- HolmTest(mod, comparar = "as.factor(labor)")
summary(res); plot(res) # uses p_ajustada automatically
# Factorial
mod2 <- aov(Sueldo_actual ~ as.factor(labor) * Sexo, data = d_e)
resAB <- HolmTest(mod2, comparar = c("as.factor(labor)", "Sexo"))
summary(resAB, n = Inf); plot(resAB, horizontal = TRUE)
Performs the Jarque-Bera test for normality with optional corrections proposed by Glinskiy et al. (2024), depending on whether the mean, variance, or both are known a priori.
JBGTest(y, mu = NULL, sigma2 = NULL, alpha = 0.05)
y |
A numeric vector to test for normality. |
mu |
Optional known mean value. Default is NULL. |
sigma2 |
Optional known variance value. Default is NULL. |
alpha |
Significance level for the test (default is 0.05). |
An object of class "normalidad", containing:
statistic: Test statistic value.
df: Degrees of freedom (always 2).
p_value: P-value of the test.
decision: Conclusion about normality.
variant: Type of JB test applied.
method: "Jarque-Bera (Glinskiy)"
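For orientation, the classical Jarque-Bera statistic (mean and variance both estimated from the sample) can be computed by hand as below. This sketch does not reproduce the Glinskiy et al. corrections for a known mean or variance; jb_classic() is a hypothetical helper and the data are simulated.

```r
# Classical Jarque-Bera statistic, chi-square with 2 degrees of freedom
jb_classic <- function(y) {
  n  <- length(y)
  d  <- y - mean(y)
  m2 <- mean(d^2)              # biased (1/n) second central moment
  S  <- mean(d^3) / m2^1.5     # sample skewness
  K  <- mean(d^4) / m2^2       # sample kurtosis
  JB <- n / 6 * (S^2 + (K - 3)^2 / 4)
  c(statistic = JB, df = 2,
    p_value = pchisq(JB, df = 2, lower.tail = FALSE))
}

set.seed(42)
res_norm <- jb_classic(rnorm(500))  # simulated normal data
res_exp  <- jb_classic(rexp(500))   # simulated right-skewed data
print(rbind(normal = res_norm, exponential = res_exp))
```

The skewed sample produces a much larger statistic because both the skewness term and the excess-kurtosis term move away from zero.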
Glinskiy, V., Ismayilova, Y., Khrushchev, S., Logachov, A., Logachova, O., Serga, L., Yambartsev, A., & Zaykov, K. (2024). Modifications to the Jarque–Bera test. Mathematics, 12(16), 2523. https://doi.org/10.3390/math12162523
data(d_e, package = "Analitica")
JBGTest(d_e$Sueldo_actual)
# summary() displays the result in a different format
summary(JBGTest(d_e$Sueldo_actual))
Performs the Jonckheere-Terpstra test to evaluate the presence of a monotonic trend (increasing or decreasing) across three or more independent ordered groups. This test is non-parametric and is particularly useful when the independent variable is ordinal and the response is continuous or ordinal.
JT_Test(formula, data)
formula |
A formula of the type y ~ group, where 'group' is an ordered factor. |
data |
A data.frame containing the variables in the formula. |
The Jonckheere-Terpstra test compares all pairwise combinations of groups and counts the number of times values in higher-ordered groups exceed those in lower-ordered groups. This implementation includes a full correction for ties in the data, which ensures more accurate inference.
Advantages: - Non-parametric: does not assume normality or equal variances. - More powerful than Kruskal-Wallis when there is an a priori ordering of groups. - Tie correction included, improving robustness in real-world data.
Disadvantages: - Requires that the group variable be ordered (ordinal). - Detects overall trend but not specific group differences. - Sensitive to large numbers of ties or very unbalanced group sizes.
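The pairwise counting described above can be sketched in base R using the same small data set as the package example. This is a simplified illustration: it computes J and its null expectation only, and the tie-corrected variance used by JT_Test is not reproduced. The helper pair_count() is hypothetical.

```r
df <- data.frame(
  group = factor(rep(1:3, each = 6), ordered = TRUE),
  y = c(40,35,38,43,44,41, 38,40,47,44,40,42, 48,40,45,43,46,44)
)
g <- split(df$y, df$group)  # groups in their stated order
k <- length(g)

# For one pair of groups: number of (lower, higher) value pairs in which the
# higher-ordered group has the larger value; ties count as 1/2
pair_count <- function(lo, hi) {
  sum(outer(lo, hi, "<")) + 0.5 * sum(outer(lo, hi, "=="))
}

J <- 0
for (i in 1:(k - 1)) {
  for (j in (i + 1):k) {
    J <- J + pair_count(g[[i]], g[[j]])
  }
}

n    <- lengths(g)
N    <- sum(n)
mu_J <- (N^2 - sum(n^2)) / 4  # expected J under the null hypothesis

cat("J =", J, " E[J] =", mu_J, "\n")
```

A value of J well above its expectation points towards an increasing trend; the standardized statistic Z additionally requires the variance of J.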
An object of class "jonckheere" with:
J: Total Jonckheere-Terpstra statistic.
J_pares: Pairwise J statistics between group combinations.
mu_J: Expected value of J under the null hypothesis.
var_J: Variance of J (with complete tie correction).
Z: Standardized test statistic.
p_value: Two-sided p-value.
Trend: Detected trend ("increasing", "decreasing", or "none").
Method: Description of the method.
Hollander, M., Wolfe, D. A., & Chicken, E. (2014). Nonparametric statistical methods (3rd ed., p. 202). Wiley.
df <- data.frame(
  group = factor(rep(1:3, each = 6), ordered = TRUE),
  y = c(40,35,38,43,44,41,38,40,47,44,40,42,48,40,45,43,46,44)
)
res <- JT_Test(y ~ group, data = df)
Performs Levene's test for equality of variances across groups using a formula interface. This test evaluates the null hypothesis that the variances are equal across groups, and is commonly used as a preliminary test before ANOVA or other parametric analyses.
Levene.Test(
  formula,
  data,
  alpha = 0.05,
  center = "median",
  decompose = TRUE,
  anova_type = c("I", "II", "III")
)
formula |
y ~ factors (e.g., y ~ A or y ~ A * B). |
data |
data.frame with variables in the formula. |
alpha |
Significance level (default 0.05). |
center |
"median" (Brown-Forsythe, default) or "mean" (classical Levene). |
decompose |
logical. If TRUE and there are >= 2 factors, run ANOVA on |Y - cell_center|. |
anova_type |
"I", "II" (2-way only) or "III" (any number of factors, no 'car'). |
Levene's test is based on an analysis of variance (ANOVA) applied to the absolute deviations from each group's center (either the mean or, more robustly, the median). It is less sensitive to departures from normality than Bartlett's test.
Advantages: - Robust to non-normality, especially when using the median. - Suitable for equal or unequal sample sizes across groups. - Widely used in practice for checking homoscedasticity.
Disadvantages: - Less powerful than parametric alternatives under strict normality.
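The idea of running an ANOVA on absolute deviations can be sketched in a few lines of base R. The example uses the built-in PlantGrowth data rather than the package's d_e data set, and it covers only the one-factor case with the median as center; it is not the package's own implementation.

```r
data(PlantGrowth)

# Absolute deviations from each group's median (Brown-Forsythe variant)
z <- with(PlantGrowth, abs(weight - ave(weight, group, FUN = median)))

# One-way ANOVA on the deviations; its F test is the Levene statistic
tab    <- anova(lm(z ~ group, data = PlantGrowth))
F_stat <- tab[["F value"]][1]
p_val  <- tab[["Pr(>F)"]][1]
dfs    <- tab[["Df"]]  # between groups, within groups

cat("F =", round(F_stat, 4), " df =", dfs, " p =", round(p_val, 4), "\n")
```

Replacing median with mean in the ave() call gives the classical Levene variant.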
An object of class "homocedasticidad", containing:
F statistic of the Levene test.
Degrees of freedom (between and within groups).
The p-value for the test.
"Homoscedastic" or "Heteroscedastic" depending on the test result.
A string indicating the method used ("Levene").
Levene, H. (1960). "Robust Tests for Equality of Variances." In Contributions to Probability and Statistics: Essays in Honor of Harold Hotelling (pp. 278–292). Stanford University Press.
data(d_e, package = "Analitica")
res <- Levene.Test(Sueldo_actual ~ as.factor(labor), data = d_e)
summary(res)

# RCBD
resB <- Levene.Test(Sueldo_actual ~ as.factor(labor) + Sexo, data = d_e)
summary(resB)

# Two-way ANOVA
resC <- Levene.Test(Sueldo_actual ~ as.factor(labor) * Sexo, data = d_e)
summary(resC)
Performs Fisher's Least Significant Difference (LSD) test: unadjusted pairwise t-tests following a significant ANOVA.
LSDTest(modelo, comparar = NULL, alpha = 0.05)
modelo |
An |
comparar |
Character vector with the name(s) of the factor(s) to compare:
- One name: main effect (e.g., "treatment" or "A")
- Several names: interaction (e.g., |
alpha |
Significance level (default 0.05). |
Advantages: - Very powerful when assumptions are met. - Simple and easy to interpret.
Disadvantages: - High risk of Type I error without correction. - Not recommended if many comparisons are made.
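The quantities behind an LSD comparison (pooled standard error, t value, unadjusted p-value, and critical difference) can be sketched in base R. The example uses the built-in PlantGrowth data and is an illustration under the usual one-way ANOVA assumptions, not the package's internal code.

```r
data(PlantGrowth)
mod <- aov(weight ~ group, data = PlantGrowth)

alpha    <- 0.05
df_error <- df.residual(mod)
MSerror  <- deviance(mod) / df_error  # residual mean square

means <- tapply(PlantGrowth$weight, PlantGrowth$group, mean)
n     <- tapply(PlantGrowth$weight, PlantGrowth$group, length)

pairs <- combn(names(means), 2)       # all pairwise comparisons
dif   <- as.numeric(means[pairs[1, ]] - means[pairs[2, ]])
se    <- as.numeric(sqrt(MSerror * (1 / n[pairs[1, ]] + 1 / n[pairs[2, ]])))
t_val <- dif / se
p_val <- 2 * pt(abs(t_val), df_error, lower.tail = FALSE)
lsd   <- qt(1 - alpha / 2, df_error) * se  # critical difference

out <- data.frame(
  Comparison  = paste(pairs[1, ], pairs[2, ], sep = " - "),
  Difference  = dif, SE = se, t_value = t_val,
  p_value     = p_val, LSD = lsd,
  Significant = abs(dif) > lsd
)
print(out)
```

A pair is declared different when the absolute mean difference exceeds the critical difference, which is equivalent to an unadjusted p-value below alpha.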
An object of class "lsd" and "comparaciones" containing:
Resultados: a data.frame with columns Comparacion, Diferencia, SE, t_value,
p_value (unadjusted), p_ajustada (LSD), Valor_Critico (critical difference), and Significancia.
Promedios: a named vector of group means as defined by comparar.
Orden_Medias: group names ordered from highest to lowest mean.
Metodo: "LSD t-test".
Termino: the term being compared (e.g., "A", "B", or "A:B").
MSerror, df_error, N: useful for plots with error bars.
Fisher, R. A. (1935). The Design of Experiments. Oliver & Boyd.
data(d_e, package = "Analitica")
mod <- aov(Sueldo_actual ~ as.factor(labor), data = d_e)
resultado <- LSDTest(mod)
summary(resultado)
plot(resultado)

# RCBD
mod <- aov(Sueldo_actual ~ as.factor(labor) + Sexo, data = d_e)
res <- LSDTest(mod, comparar = "as.factor(labor)")
summary(res); plot(res)  # plot will use p_value

# Factorial
mod2 <- aov(Sueldo_actual ~ as.factor(labor) * Sexo, data = d_e)
resAB <- LSDTest(mod2, comparar = c("as.factor(labor)","Sexo"))
summary(resAB, n = Inf); plot(resAB, horizontal = TRUE)
Performs Mauchly's test of sphericity for repeated-measures designs. Uses a
formula interface of the form dv ~ within | id, where:
dv: numeric response variable,
within: within-subjects factor (repeated levels),
id: subject/sample identifier.
MauchlyTest(formula, data, alpha = 0.05, digits = 4, do_print = TRUE)
formula |
Formula of the form dv ~ within | id. |
data |
|
alpha |
Significance level (default 0.05). |
digits |
Decimals for printing (default 4). |
do_print |
if |
Calculates Mauchly's W statistic, its corrected chi-square approximation,
the p-value, and the correction coefficients for departures from
sphericity (Greenhouse–Geisser and Huynh–Feldt).
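The textbook formulas for these quantities can be sketched in base R for a single-group design. The data are simulated in wide format (one row per subject), the variable names are illustrative, and the package may differ in details such as the exact form of the chi-square correction.

```r
set.seed(1)
n <- 10; k <- 4
# Simulated wide data: rows are subjects, columns are within-subject levels;
# the added vector is a per-subject effect (recycled down the rows)
Y <- matrix(rnorm(n * k), n, k) + rnorm(n, 10, 2)

S <- cov(Y)
# Orthonormal contrasts, orthogonal to the constant vector
C  <- contr.helmert(k)
C  <- sweep(C, 2, sqrt(colSums(C^2)), "/")
Tm <- t(C) %*% S %*% C
p  <- k - 1

W    <- det(Tm) / (sum(diag(Tm)) / p)^p                        # Mauchly's W
chi2 <- -((n - 1) - (2 * p^2 + p + 2) / (6 * p)) * log(W)      # first-order approximation
df   <- p * (p + 1) / 2 - 1
p_value <- pchisq(chi2, df, lower.tail = FALSE)

GG <- sum(diag(Tm))^2 / (p * sum(Tm^2))                        # Greenhouse-Geisser epsilon
HF <- min(1, (n * p * GG - 2) / (p * (n - 1 - p * GG)))        # Huynh-Feldt epsilon, capped at 1

round(c(W = W, Chi2 = chi2, df = df, p_value = p_value, GG = GG, HF = HF), 4)
```

W computed this way should match stats::mauchly.test() applied to the multivariate fit lm(Y ~ 1) with X = ~1; the p-value of that function can differ slightly because it adds a second-order term.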
An object of class "sphericity" with:
A string with the method.
A list with W and Chi2.
Degrees of freedom.
The p-value.
"Sphericity" or "No sphericity", according to alpha.
GG and HF (the Greenhouse–Geisser and Huynh–Feldt correction coefficients).
A list with n (subjects), k (levels), S (covariances), and eigen (eigenvalues).
Mauchly, J. W. (1940). Significance test for sphericity of a normal n-variate distribution. The Annals of Mathematical Statistics, 11(2), 204–209. https://doi.org/10.1214/aoms/1177731915
# Minimal example (fictitious data):
set.seed(1)
d <- data.frame(
  id = rep(1:10, each = 4),
  within = rep(paste0("t", 1:4), times = 10),
  y = as.numeric(rep(rnorm(10, 10, 2), each = 4)) +
    rep(c(0, .5, 1.2, .8), times = 10) +
    rnorm(40, 0, 1)
)
res <- MauchlyTest(y ~ within | id, data = d, do_print = TRUE)
summary(res)
Performs the Mann-Whitney U test (Wilcoxon rank-sum) for two independent groups, using a manual implementation. Suitable when the assumptions of parametric tests (normality, homogeneity of variances) are not met.
MWTest(
  grupo1,
  grupo2,
  alpha = 0.05,
  alternative = c("two.sided", "less", "greater"),
  continuity = TRUE
)
grupo1 |
Numeric vector for the first group. |
grupo2 |
Numeric vector for the second group. |
alpha |
Significance level (default = 0.05). |
alternative |
Character string specifying the alternative hypothesis.
Options are "two.sided", "less", or "greater". |
continuity |
Logical indicating whether to apply continuity correction (default = TRUE). |
Advantages: - Does not assume normality. - More powerful than t-test for skewed distributions.
Disadvantages: - Only compares two groups at a time. - Sensitive to unequal variances or shapes. - Only applicable to completely randomized or single-factor designs.
This implementation allows one- or two-sided alternatives and optionally applies a continuity correction.
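The rank-based U statistic and its normal approximation with continuity correction can be sketched in base R. The two samples are invented and contain no ties, so the simple no-ties variance applies; this is an illustration, not the package's code.

```r
# Invented samples without ties
g1 <- c(12.1, 14.3, 11.8, 15.2, 13.7, 16.0)
g2 <- c(10.4, 11.1, 12.9, 9.8, 10.9, 11.5, 12.3)
n1 <- length(g1); n2 <- length(g2)

# U for group 1: rank sum of group 1 minus its minimum possible value
r  <- rank(c(g1, g2))
U1 <- sum(r[seq_len(n1)]) - n1 * (n1 + 1) / 2

mu    <- n1 * n2 / 2
sigma <- sqrt(n1 * n2 * (n1 + n2 + 1) / 12)  # valid when there are no ties

# Two-sided test with continuity correction (shift of 0.5 towards the mean)
z <- (U1 - mu - sign(U1 - mu) * 0.5) / sigma
p_two_sided <- 2 * pnorm(abs(z), lower.tail = FALSE)

cat("U =", U1, " z =", round(z, 4), " p =", round(p_two_sided, 4), "\n")
```

With these inputs the result agrees with stats::wilcox.test(g1, g2, exact = FALSE, correct = TRUE). For a one-sided alternative, the 0.5 shift takes a fixed sign and a single tail of the normal distribution is used.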
An object of class "comparacion" and "mannwhitney", containing:
Resultados: A data frame with the comparison name, difference in means, p-value, and significance.
Promedios: A named numeric vector of group means.
Orden_Medias: A character vector of group names ordered from highest to lowest mean.
Metodo: A string describing the test and hypothesis direction.
Mann, H. B., & Whitney, D. R. (1947). "On a Test of Whether One of Two Random Variables is Stochastically Larger than the Other." Annals of Mathematical Statistics, 18(1), 50–60.
data(d_e, package = "Analitica")
g1 <- d_e$Sueldo_actual[d_e$labor == 1]
g2 <- d_e$Sueldo_actual[d_e$labor == 2]
resultado <- MWTest(g1, g2, alternative = "greater")
summary(resultado)
Performs the Nemenyi test after a significant Kruskal-Wallis or Friedman test. Based on the studentized range distribution applied to mean ranks.
NemenyiTest(formula, data, alpha = 0.05)
formula |
A formula of the form |
data |
A data frame containing the variables. |
alpha |
Significance level (default is 0.05). |
Advantages: - Easy to implement for equal-sized groups. - Conservative control of family-wise error rate.
Disadvantages: - Only valid with equal group sizes. - No p-values are directly calculated (based on critical differences only). - Only applicable to completely randomized or single-factor designs.
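For the Kruskal-Wallis setting with equal group sizes, the studentized-range calculation on mean ranks can be sketched in base R as follows. The data are simulated as in the package example; the formulas are the standard large-sample ones (no tie correction) and may differ in detail from the package's implementation.

```r
set.seed(123)
datos <- data.frame(
  grupo  = rep(c("A", "B", "C", "D"), each = 10),
  medida = c(rnorm(10, 10), rnorm(10, 12), rnorm(10, 15), rnorm(10, 11))
)

alpha <- 0.05
N <- nrow(datos); k <- 4; n <- 10

# Mean rank of each group after ranking all observations jointly
mean_rank <- tapply(rank(datos$medida), datos$grupo, mean)

# Standard error on the studentized-range scale (equal n, no ties)
se <- sqrt(N * (N + 1) / (12 * n))

pairs <- combn(names(mean_rank), 2)
d  <- as.numeric(abs(mean_rank[pairs[1, ]] - mean_rank[pairs[2, ]]))
q  <- d / se
p  <- ptukey(q, nmeans = k, df = Inf, lower.tail = FALSE)
CD <- qtukey(1 - alpha, nmeans = k, df = Inf) * se  # critical difference

print(data.frame(
  Comparison  = paste(pairs[1, ], pairs[2, ], sep = " - "),
  Rank_diff   = d, q = q, p_value = p,
  Significant = d > CD
))
```

Comparing each rank difference with the critical difference and comparing the studentized-range p-value with alpha lead to the same decisions.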
An object of class "nemenyi" and "comparaciones", including:
Resultados: Data frame with group comparisons, rank differences, critical value, p-values, and significance codes.
Promedios: Mean ranks of each group.
Orden_Medias: Group names ordered from highest to lowest rank.
Metodo: Name of the method ("Nemenyi (no paramétrico)").
Nemenyi, P. (1963). Distribution-free multiple comparisons [Doctoral dissertation, Princeton University].
set.seed(123)
datos <- data.frame(
  grupo = rep(c("A", "B", "C", "D"), each = 10),
  medida = c(
    rnorm(10, mean = 10),
    rnorm(10, mean = 12),
    rnorm(10, mean = 15),
    rnorm(10, mean = 11)
  )
)
table(datos$grupo)
#>  A  B  C  D
#> 10 10 10 10

# Apply the Nemenyi test
resultado <- NemenyiTest(medida ~ grupo, data = datos)

# View the results
summary(resultado)

# Or simply
resultado$Resultados

# View the ordering of the means (ranks)
resultado$Orden_Medias
Generic plot for multiple-comparison tests (with multcompView letters) v2.0.1
## S3 method for class 'comparaciones'
plot(
  x,
  alpha = 0.05,
  p_column = c("auto", "p_ajustada", "p_value", "p"),
  horizontal = FALSE,
  fill = "steelblue",
  label_size = 5,
  label_color = "black",
  angle_x = 45,
  show_se = FALSE,
  se_type = c("se", "ci95"),
  pad_frac = 0.35,
  errorbar_width = 0.2,
  ...
)
x |
An object of class "comparaciones". |
alpha |
Significance threshold for the letters (default 0.05). |
p_column |
Which p-value column: "auto","p_ajustada","p_value","p". |
horizontal |
If TRUE, draw horizontal bars. |
fill |
Bar fill color. |
label_size |
Letter size. |
label_color |
Letter color. |
angle_x |
Angle of x-axis labels (if |
show_se |
If TRUE and |
se_type |
"se" (default) or "ci95". |
pad_frac |
Fraction of y-span used to place letters (default 0.35). |
errorbar_width |
Width of errorbar whiskers. |
... |
Not used. |
A ggplot object.
One-shot planner for factor or cell comparisons, reporting m, FWER, suggested adjustments (Bonferroni/Sidak) and a post hoc recommendation (Holm, Tukey, Duncan, Gabriel, Scheffe, SNK, etc.) before testing.
Posthoc_planner(
  model,
  compare = NULL,
  alpha = 0.05,
  scope = c("factor", "cells"),
  equal_var = TRUE,
  unequal_n = FALSE,
  independence = TRUE,
  liberal_ok = FALSE,
  orientation = c("rows", "cols"),
  digits = 4,
  percent_digits = 1,
  observed_cells = TRUE
)
model |
aov or lm object (complete model). Data are reconstructed with model.frame(). |
compare |
Character with the name(s) of the factor(s) to compare: - One name: main effect. - Several names: if scope="cells" compares A:B:... cells; if scope="factor", reports each factor. If omitted, uses all factors when scope="factor", or the first factor when scope="cells". |
alpha |
Overall significance level (FWER target), default 0.05. |
scope |
"factor" compares each factor separately; "cells" compares interaction cells. |
equal_var |
Logical; assume homoscedasticity (default TRUE). |
unequal_n |
Logical; expect moderate imbalance of group sizes (default FALSE). |
independence |
Logical; if TRUE reports FWER "under independence" (default TRUE). |
liberal_ok |
Logical; allows more liberal suggestions (LSD/Duncan/SNK) (default FALSE). |
orientation |
"rows" (metrics as rows, default) or "cols". |
digits |
Decimal places for numeric output, default 4. |
percent_digits |
Decimal places for percentages, default 1. |
observed_cells |
Logical; in scope="cells", count only observed cells (drop NA). Default TRUE. |
data.frame. - orientation="rows": first column "Metric", rest columns are units (factor/cells). - orientation="cols": one row per unit, metrics as columns. Includes: g levels, m comparisons, global alpha, Bonferroni/Sidak alphas, FWERs (under independence), "Suggested p-value adjustment" and "Post hoc suggestion".
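The core metrics in this table follow from the number of levels g alone and can be sketched in base R. planner_metrics() is a hypothetical helper written for illustration, not a function of the package, and the column names are likewise illustrative; FWER values assume independent comparisons.

```r
# Number of pairwise comparisons, adjusted per-comparison alphas, and
# family-wise error rates under independence, for g groups or cells
planner_metrics <- function(g, alpha = 0.05) {
  m <- choose(g, 2)
  alpha_bonf  <- alpha / m
  alpha_sidak <- 1 - (1 - alpha)^(1 / m)
  data.frame(
    g = g, m = m, alpha = alpha,
    alpha_bonferroni = alpha_bonf,
    alpha_sidak      = alpha_sidak,
    FWER_unadjusted  = 1 - (1 - alpha)^m,
    FWER_bonferroni  = 1 - (1 - alpha_bonf)^m,
    FWER_sidak       = 1 - (1 - alpha_sidak)^m
  )
}

# 3 levels (e.g., PlantGrowth groups) and 6 cells (e.g., a 2 x 3 factorial)
tab <- planner_metrics(c(3, 6))
print(tab)
```

With 6 cells there are already 15 pairwise comparisons, so the unadjusted family-wise error rate is far above the nominal alpha, which is the situation the planner is meant to flag before testing.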
Bonferroni, C. (1936). *Teoria statistica delle classi e calcolo delle probabilità*. Pubblicazioni del R. Istituto Superiore di Scienze Economiche e Commerciali di Firenze.
Fisher, R. A. (1935). *The design of experiments*. Oliver & Boyd.
Duncan, D. B. (1955). Multiple range and multiple F tests. *Biometrics, 11*(1), 1–42.
Gabriel, K. R. (1978). A simple method of multiple comparisons of means. *Journal of the American Statistical Association, 73*(364), 724–729.
Games, P. A., & Howell, J. F. (1976). Pairwise multiple comparison procedures with unequal n’s and/or variances: A Monte Carlo study. *Journal of Educational Statistics, 1*(2), 113–125.
Holm, S. (1979). A simple sequentially rejective multiple test procedure. *Scandinavian Journal of Statistics, 6*(2), 65–70.
Newman, D. (1939). The distribution of range in samples from a normal population, expressed in terms of an independent estimate of standard deviation. *Biometrika, 31*(1/2), 20–36.
Scheffé, H. (1953). A method for judging all contrasts in the analysis of variance. *Biometrika, 40*(1–2), 87–104.
Šidák, Z. (1967). Rectangular confidence regions for the means of multivariate normal distributions. *Journal of the American Statistical Association, 62*(318), 626–633.
Tukey, J. W. (1949). Comparing individual means in the analysis of variance. *Biometrics, 5*(2), 99–114.
## =========================
## Example 1: One-way ANOVA
## =========================
# Data: PlantGrowth (3 balanced groups)
data(PlantGrowth)
m1 <- aov(weight ~ group, data = PlantGrowth)

# Compare by factor (default scope = "factor")
Posthoc_planner(m1)

# Variant: output by columns
Posthoc_planner(m1, orientation = "cols")

# Variant: stricter alpha
Posthoc_planner(m1, alpha = 0.01)

## ==============================================
## Example 2: Two factors and cell comparison
## ==============================================
# Data: ToothGrowth (supplement x dose)
data(ToothGrowth)
TG <- ToothGrowth
TG$dose <- factor(TG$dose)  # treat "dose" as a factor
m2 <- aov(len ~ supp * dose, data = TG)

# scope = "cells" compares the interaction cells (supp:dose)
Posthoc_planner(
  m2,
  compare = c("supp","dose"),
  scope = "cells",        # compare cells
  observed_cells = TRUE   # count only observed cells (default)
)

# You can also request the per-factor summary within the same model
Posthoc_planner(
  m2,
  compare = c("supp","dose"),
  scope = "factor"        # report for each factor separately
)
Performs Scheffe's post hoc test after fitting an ANOVA model. This test compares all possible pairs of group means, using a critical value based on the F-distribution.
ScheffeTest(modelo, comparar = NULL, alpha = 0.05)
modelo |
An |
comparar |
Character vector with the name(s) of the factor(s) to compare:
- One name: main effect (e.g., "treatment" or "A")
- Several names: interaction (e.g., |
alpha |
Significance level (default 0.05). |
The Scheffe test is a conservative method, making it harder to detect significant differences, but reducing the likelihood of Type I errors (false positives). It is especially appropriate when the comparisons were not pre-planned and the number of contrasts is large.
Assumptions: normally distributed residuals and homogeneity of variances.
Advantages: - Very robust to violations of assumptions. - Suitable for complex comparisons, not just pairwise.
Disadvantages: - Very conservative; reduced power. - Not ideal for detecting small differences.
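The F-based criterion for a pairwise Scheffe comparison can be sketched in base R with the built-in PlantGrowth data. Here the observed F is defined as the squared difference divided by (g - 1) times SE2; whether ScheffeTest scales F_obs in exactly this way is an assumption, although the resulting decisions do not depend on that convention.

```r
data(PlantGrowth)
mod <- aov(weight ~ group, data = PlantGrowth)

alpha    <- 0.05
df_error <- df.residual(mod)
MSerror  <- deviance(mod) / df_error

means <- tapply(PlantGrowth$weight, PlantGrowth$group, mean)
n     <- tapply(PlantGrowth$weight, PlantGrowth$group, length)
g     <- length(means)

pairs <- combn(names(means), 2)
dif   <- as.numeric(means[pairs[1, ]] - means[pairs[2, ]])
SE2   <- as.numeric(MSerror * (1 / n[pairs[1, ]] + 1 / n[pairs[2, ]]))

F_obs     <- dif^2 / ((g - 1) * SE2)
p_value   <- pf(F_obs, g - 1, df_error, lower.tail = FALSE)
crit_diff <- sqrt((g - 1) * qf(1 - alpha, g - 1, df_error) * SE2)

print(data.frame(
  Comparison  = paste(pairs[1, ], pairs[2, ], sep = " - "),
  Difference  = dif, SE2 = SE2, F_obs = F_obs,
  Critical    = crit_diff, p_value = p_value,
  Significant = abs(dif) > crit_diff
))
```

Because the critical value protects every possible contrast among the g means, the Scheffe p-value for a pair is never smaller than the corresponding unadjusted t-test p-value.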
An object of class "scheffe" and "comparaciones" with:
Resultados: data.frame with Comparacion, Diferencia,
SE2 (= MSerror*(1/n_i+1/n_j)), F_obs, Valor_Critico, p_value, Significancia.
Promedios, Orden_Medias, Metodo="Scheffe", Termino.
MSerror, df_error, N (useful for plot.comparaciones()).
Scheffe, H. (1953). "A method for judging all contrasts in the analysis of variance." Biometrika, 40(1/2), 87–104. <https://doi.org/10.1093/biomet/40.1-2.87>
data(d_e, package = "Analitica")
mod <- aov(Sueldo_actual ~ as.factor(labor), data = d_e)
resultado <- ScheffeTest(mod)
summary(resultado)
plot(resultado)

# RCBD
mod <- aov(Sueldo_actual ~ as.factor(labor) + Sexo, data = d_e)
res <- ScheffeTest(mod, comparar = "as.factor(labor)")
summary(res); plot(res)  # plot will use p_value

# Factorial
mod2 <- aov(Sueldo_actual ~ as.factor(labor) * Sexo, data = d_e)
resAB <- ScheffeTest(mod2, comparar = c("as.factor(labor)","Sexo"))
summary(resAB, n = Inf); plot(resAB, horizontal = TRUE)
Performs pairwise comparisons using the Sidak correction to adjust p-values and control the family-wise error rate in multiple testing scenarios. This method assumes independence between comparisons and is slightly less conservative than Bonferroni.
SidakTest(modelo, comparar = NULL, alpha = 0.05)
modelo |
An |
comparar |
Character vector with the name(s) of the factor(s) to compare:
- One name: main effect (e.g., "treatment" or "A")
- Several names: interaction (e.g., |
alpha |
Significance level (default 0.05). |
The Sidak procedure adjusts the significance level to maintain an overall alpha across all pairwise tests, providing an effective post hoc tool following ANOVA or similar global tests.
Advantages: - Controls the family-wise error rate under independence assumption. - Slightly more powerful than Bonferroni. - Simple to compute and interpret.
Disadvantages: - Assumes independence of tests (may not hold in correlated data). - Less robust when variances are unequal or data are non-normal.
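The Sidak adjustment itself is a one-line transformation of the unadjusted p-values. The sketch below uses invented p-values and compares the result with Bonferroni; it illustrates the adjustment only, not the pairwise t-tests that produce the p-values.

```r
# Hypothetical unadjusted p-values from m = 4 pairwise comparisons
p_raw <- c(0.010, 0.040, 0.030, 0.005)
m     <- length(p_raw)
alpha <- 0.05

p_sidak <- 1 - (1 - p_raw)^m          # Sidak-adjusted p-values
p_bonf  <- pmin(1, m * p_raw)         # Bonferroni-adjusted, for comparison

# Equivalent view: keep the raw p-values and lower the per-comparison alpha
alpha_sidak <- 1 - (1 - alpha)^(1 / m)

print(rbind(raw = p_raw, sidak = p_sidak, bonferroni = p_bonf))
cat("Per-comparison alpha (Sidak):", round(alpha_sidak, 5), "\n")
```

Each Sidak-adjusted value is slightly smaller than its Bonferroni counterpart, which is the source of the small gain in power.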
An object of class "sidak" and "comparaciones" containing:
Resultados: a data.frame with columns Comparacion, Diferencia, SE, t_value,
p_value (unadjusted), p_ajustada (Sidak), Valor_Critico (critical difference), and Significancia.
Promedios: a named vector of group means as defined by comparar.
Orden_Medias: group names ordered from highest to lowest mean.
Metodo: "Sidak-adjusted t-test".
Termino: the term being compared (e.g., "A", "B", or "A:B").
MSerror, df_error, N: useful for plots with error bars.
Sidak, Z. (1967). "Rectangular confidence regions for the means of multivariate normal distributions." Journal of the American Statistical Association, 62(318), 626–633.
data(d_e, package = "Analitica")
mod <- aov(Sueldo_actual ~ as.factor(labor), data = d_e)
resultado <- SidakTest(mod)
summary(resultado)
plot(resultado)

# RCBD
mod <- aov(Sueldo_actual ~ as.factor(labor) + Sexo, data = d_e)
res <- SidakTest(mod, comparar = "as.factor(labor)")
summary(res); plot(res)  # plot will use p_value

# Factorial
mod2 <- aov(Sueldo_actual ~ as.factor(labor) * Sexo, data = d_e)
resAB <- SidakTest(mod2, comparar = c("as.factor(labor)","Sexo"))
summary(resAB, n = Inf); plot(resAB, horizontal = TRUE)
Performs the Student-Newman-Keuls (SNK) post hoc test for pairwise comparisons after fitting an ANOVA model. The test uses a stepwise approach where the critical value depends on the number of means spanned between groups (range r).
SNKTest(modelo, comparar = NULL, alpha = 0.05)
modelo |
An |
comparar |
Character vector with the name(s) of the factor(s) to compare:
- One name: main effect (e.g., "treatment" or "A")
- Several names: interaction (e.g., |
alpha |
Significance level (default 0.05). |
SNK is more powerful and less conservative than Tukey’s HSD: it increases the chance of detecting real differences at the cost of a slightly higher Type I error rate.
Assumptions: normality, homogeneity of variances, and independence of observations.
Advantages: - More powerful than Tukey when differences are large. - Intermediate control of Type I error.
Disadvantages: - Error control is not family-wise. - Type I error increases with more comparisons.
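How the critical value changes with the number of means spanned can be sketched in base R. The example uses the built-in PlantGrowth data (equal group sizes) and shows only the critical ranges; the full stepwise rule, in which pairs inside a non-significant range are not tested further, is omitted.

```r
data(PlantGrowth)
mod <- aov(weight ~ group, data = PlantGrowth)

alpha    <- 0.05
df_error <- df.residual(mod)
MSerror  <- deviance(mod) / df_error
n        <- 10                      # observations per group (balanced)

means <- sort(tapply(PlantGrowth$weight, PlantGrowth$group, mean),
              decreasing = TRUE)
g <- length(means)

# Critical range for a pair whose ordered means span r positions
r <- 2:g
crit_range <- sapply(r, function(ri) {
  qtukey(1 - alpha, nmeans = ri, df = df_error) * sqrt(MSerror / n)
})

# Tukey's HSD uses the largest span (r = g) for every pair
hsd <- qtukey(1 - alpha, nmeans = g, df = df_error) * sqrt(MSerror / n)

print(data.frame(r = r, SNK_critical_range = crit_range, Tukey_HSD = hsd))
```

Adjacent means (r = 2) are compared against a smaller critical range than under Tukey's HSD, which explains both the extra power and the weaker error control.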
An object of class "SNK" and "comparaciones" containing:
Resultados: a data.frame with columns Comparacion, Diferencia, SE, t_value,
p_value (unadjusted), p_ajustada (SNK), Valor_Critico (critical difference), and Significancia.
Promedios: a named vector of group means as defined by comparar.
Orden_Medias: group names ordered from highest to lowest mean.
Metodo: "SNK t-test".
Termino: the term being compared (e.g., "A", "B", or "A:B").
MSerror, df_error, N: useful for plots with error bars.
Student, Newman, and Keuls (1952). "Student-Newman-Keuls Procedure". See also: <https://doi.org/10.1002/bimj.200310019>
data(d_e, package = "Analitica")
mod <- aov(Sueldo_actual ~ as.factor(labor), data = d_e)
resultado <- SNKTest(mod)
summary(resultado)
plot(resultado)

# RCBD
mod <- aov(Sueldo_actual ~ as.factor(labor) + Sexo, data = d_e)
res <- SNKTest(mod, comparar = "as.factor(labor)")
summary(res); plot(res)  # plot will use p_value

# Factorial
mod2 <- aov(Sueldo_actual ~ as.factor(labor) * Sexo, data = d_e)
resAB <- SNKTest(mod2, comparar = c("as.factor(labor)","Sexo"))
summary(resAB, n = Inf); plot(resAB, horizontal = TRUE)
Displays a formatted summary of the results from a pairwise comparison test
of two independent groups. Compatible with objects returned by functions like
BMTest() or MWTest().
## S3 method for class 'comparacion'
summary(object, ...)
object |
An object of class "comparacion". |
... |
Additional arguments (currently ignored). |
Invisibly returns a one-row data frame with the summary statistics.
Displays a summary of variance homogeneity tests such as Bartlett, Fligner-Killeen, or Levene (the global one-way test and, if present, the factorial decomposition on |Y - cell_center|).
## S3 method for class 'homocedasticidad'
summary(object, digits = 4, ...)
object |
An object of class "homocedasticidad". |
digits |
Number of digits for F; default 4. |
... |
Currently ignored. |
Invisibly returns the input object.
Performs the Tamhane T2 test for pairwise comparisons after an ANOVA model, assuming unequal variances and/or unequal sample sizes. This test is appropriate when the assumption of homogeneity of variances is violated, such as when Levene's test or Bartlett's test is significant.
T2Test(modelo, comparar = NULL, alpha = 0.05)
modelo |
An object of class |
comparar |
Character vector with the name(s) of the factor(s) to compare:
- One name: main effect (e.g., "treatment" or "A")
- Several names: interaction (e.g., |
alpha |
Significance level (default is 0.05). |
The test uses a modified t-test with Welch-Satterthwaite degrees of freedom and a conservative approach to control for multiple comparisons.
Advantages: - Controls Type I error under heteroscedasticity. - No assumption of equal sample sizes.
Disadvantages: - Conservative; may reduce power. - Not as powerful as Games-Howell in some contexts.
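The Welch-type t statistic with Welch-Satterthwaite degrees of freedom can be sketched in base R with the built-in PlantGrowth data. The multiplicity step shown here is the Sidak-type adjustment commonly associated with Tamhane's T2; that T2Test uses exactly this adjustment is an assumption.

```r
data(PlantGrowth)
y   <- split(PlantGrowth$weight, PlantGrowth$group)
m_i <- sapply(y, mean)
v_i <- sapply(y, var)
n_i <- lengths(y)

pairs <- combn(names(y), 2)
a <- pairs[1, ]; b <- pairs[2, ]

# Separate-variance standard error and Welch-Satterthwaite degrees of freedom
se2   <- as.numeric(v_i[a] / n_i[a] + v_i[b] / n_i[b])
t_val <- as.numeric(m_i[a] - m_i[b]) / sqrt(se2)
gl    <- se2^2 / as.numeric((v_i[a] / n_i[a])^2 / (n_i[a] - 1) +
                            (v_i[b] / n_i[b])^2 / (n_i[b] - 1))

p_raw <- 2 * pt(abs(t_val), gl, lower.tail = FALSE)
m     <- ncol(pairs)
p_T2  <- 1 - (1 - p_raw)^m   # Sidak-type adjustment over m comparisons

print(data.frame(
  Comparison = paste(a, b, sep = " - "),
  t_value = t_val, gl = gl, p_raw = p_raw, p_T2 = p_T2
))
```

Each unadjusted row matches a Welch two-sample test as returned by stats::t.test() with its default var.equal = FALSE.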
An object of class "tamhanet2" and "comparaciones", containing:
Resultados: A data frame with pairwise comparisons, mean differences,
t_value, gl, p_value, and significance codes.
Promedios: A named numeric vector of group means as defined by comparar.
Orden_Medias: Group names ordered from highest to lowest mean.
Metodo: A character string indicating the method used ("Tamhane T2").
Termino: The term being compared (e.g., "A", "B", or "A:B").
Tamhane, A. C. (1977). "Multiple comparisons in model I one-way ANOVA with unequal variances." Communications in Statistics - Theory and Methods, 6(1), 15–32. <https://doi.org/10.1080/03610927708827524>
data(d_e, package = "Analitica")
mod <- aov(Sueldo_actual ~ as.factor(labor), data = d_e)
resultado <- T2Test(mod)
summary(resultado)
plot(resultado)

# With blocks, comparing only the factor of interest
mod2 <- aov(Sueldo_actual ~ as.factor(labor) + Sexo, data = d_e)
res2 <- T2Test(mod2, comparar = "as.factor(labor)")
summary(res2)
plot(res2)

# Model with interaction
mod3 <- aov(Sueldo_actual ~ as.factor(labor) * Sexo, data = d_e)
res3 <- T2Test(mod3, comparar = c("as.factor(labor)", "Sexo"))
summary(res3)
plot(res3)
Performs Dunnett's T3 test for pairwise comparisons after an ANOVA model. This test is recommended when group variances are unequal and sample sizes differ. It is based on the studentized range distribution and provides conservative control over Type I error without assuming homoscedasticity.
T3Test(modelo, comparar = NULL, alpha = 0.05)
modelo |
An object of class |
comparar |
Character vector with the name(s) of the factor(s) to compare:
- One name: main effect (e.g., "treatment" or "A")
- Several names: interaction (e.g., |
alpha |
Significance level (default is 0.05). |
Advantages:
- More powerful than T2 when group sizes are small.
- Adjusted for unequal variances.
Disadvantages:
- Complex critical value estimation.
- Less frequently used and harder to find in software.
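Because the recommendation above hinges on unequal variances and differing group sizes, it can help to inspect both before choosing a procedure. The sketch below is one possible check, using only base R and the built-in `InsectSprays` data (an assumption for illustration, unrelated to the package's `d_e` data). It calls `stats::bartlett.test` rather than the package's own BartlettTest, and the 0.05 threshold is likewise just an example.

```r
# Per-group sample sizes and variances
groups <- split(InsectSprays$count, InsectSprays$spray)
group_summary <- data.frame(
  n        = lengths(groups),
  variance = sapply(groups, var)
)
print(group_summary)

# Ratio of the largest to the smallest group variance (descriptive only)
var_ratio <- max(group_summary$variance) / min(group_summary$variance)

# Formal test of homogeneity of variances (base R implementation)
bt <- bartlett.test(count ~ spray, data = InsectSprays)

# TRUE suggests considering a procedure that does not assume equal variances
heteroscedastic <- bt$p.value < 0.05
print(c(var_ratio = var_ratio, p_value = bt$p.value))
```

Bartlett's test is sensitive to departures from normality, so the variance ratio and a plot of the groups are worth reading alongside the p-value.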
An object of class "dunnettt3" and "comparaciones", containing:
Resultados: A data frame with pairwise comparisons, mean differences,
q_value, gl, p_value, and significance indicators.
Promedios: A named numeric vector of group means as defined by comparar.
Orden_Medias: A character vector of group names ordered from highest to lowest mean.
Metodo: A character string with the test name ("Dunnett T3").
Termino: The term being compared (e.g., "A", "B", or "A:B").
Dunnett, C. W. (1980). "Pairwise multiple comparisons in the unequal variance case." Journal of the American Statistical Association, 75(372), 796–800. <https://doi.org/10.1080/01621459.1980.10477558>
data(d_e, package = "Analitica")
mod <- aov(Sueldo_actual ~ as.factor(labor), data = d_e)
resultado <- T3Test(mod)
summary(resultado)
plot(resultado)

# With blocks, comparing only the factor of interest
mod2 <- aov(Sueldo_actual ~ as.factor(labor) + Sexo, data = d_e)
res2 <- T3Test(mod2, comparar = "as.factor(labor)")
summary(res2)
plot(res2)

# Model with interaction
mod3 <- aov(Sueldo_actual ~ as.factor(labor) * Sexo, data = d_e)
res3 <- T3Test(mod3, comparar = c("as.factor(labor)", "Sexo"))
summary(res3)
plot(res3)
Performs Tukey's Honest Significant Difference (HSD) test for all pairwise comparisons after fitting an ANOVA model. This post hoc method uses the studentized range distribution and is appropriate when variances are equal across groups and observations are independent.
TukeyTest(modelo, comparar = NULL, alpha = 0.05)
modelo |
An |
comparar |
Character vector with the name(s) of the factor(s) to compare:
- One name: main effect (e.g., "treatment" or "A")
- Several names: interaction (e.g., |
alpha |
Significance level (default 0.05). |
Tukey's test controls the family-wise error rate and is widely used when group comparisons have not been planned in advance.
Advantages:
- Strong control of Type I error rate.
- Ideal for balanced designs with equal variances.
Disadvantages:
- Assumes equal variances and sample sizes.
- Less powerful with heteroscedasticity.
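To make the role of the studentized range distribution concrete, the following sketch reproduces Tukey-adjusted p-values and critical differences by hand for a one-way layout, using base R only. The built-in `PlantGrowth` data and all object names are assumptions for illustration; the column names merely echo those listed in this help page, and this is not the code inside TukeyTest.

```r
# One-way ANOVA on a balanced built-in data set (3 groups of 10)
mod <- aov(weight ~ group, data = PlantGrowth)

groups  <- split(PlantGrowth$weight, PlantGrowth$group)
means   <- sapply(groups, mean)
n       <- lengths(groups)
k       <- length(means)
df_err  <- df.residual(mod)
ms_err  <- deviance(mod) / df_err      # residual mean square

# All pairwise comparisons (each column of 'pairs' is one pair)
pairs <- combn(names(means), 2)
diff  <- unname(means[pairs[2, ]] - means[pairs[1, ]])

# Standard error on the studentized-range scale (note the division by 2)
se <- unname(sqrt(ms_err / 2 * (1 / n[pairs[1, ]] + 1 / n[pairs[2, ]])))

# Studentized range statistic, adjusted p-value, and critical difference
q_value  <- abs(diff) / se
p_adj    <- ptukey(q_value, nmeans = k, df = df_err, lower.tail = FALSE)
critical <- qtukey(0.95, nmeans = k, df = df_err) * se

tukey_table <- data.frame(
  Comparacion   = paste(pairs[2, ], pairs[1, ], sep = " - "),
  Diferencia    = diff,
  p_ajustada    = p_adj,
  Valor_Critico = critical
)
print(tukey_table)
```

For this data set the adjusted p-values agree with the `p adj` column of `stats::TukeyHSD(mod)`. With unequal group sizes the same formula gives the Tukey-Kramer variant.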
An object of class "tukey" and "comparaciones" containing:
Resultados: a data.frame with columns Comparacion, Diferencia, SE, t_value,
p_value (unadjusted), p_ajustada (Tukey), Valor_Critico (critical difference), and Significancia.
Promedios: a named vector of group means as defined by comparar.
Orden_Medias: group names ordered from highest to lowest mean.
Metodo: "Tukey test".
Termino: the term being compared (e.g., "A", "B", or "A:B").
MSerror, df_error, N: useful for plots with error bars.
Tukey, J. W. (1949). "Comparing individual means in the analysis of variance." Biometrics, 5(2), 99–114. <https://doi.org/10.2307/3001913>
# CRD case (completely randomized design)
data(d_e, package = "Analitica")
mod1 <- aov(Sueldo_actual ~ as.factor(labor), data = d_e)
summary(mod1)
resultado <- TukeyTest(mod1)
summary(resultado)
plot(resultado)

# RBD case (randomized block design)
mod2 <- aov(Sueldo_actual ~ as.factor(labor) + Sexo, data = d_e)
summary(mod2)
# Compare levels of the treatment factor (error term adjusted by the block model)
res <- TukeyTest(mod2, comparar = "as.factor(labor)")
summary(res)
plot(res)

# Two-way factorial design case
mod3 <- aov(Sueldo_actual ~ as.factor(labor) * Sexo, data = d_e)
summary(mod3)
# Means of as.factor(labor) (averaging over B)
resA <- TukeyTest(mod3, comparar = "as.factor(labor)")
summary(resA)
plot(resA)
# Means of the interaction between factor A and factor B
resB <- TukeyTest(mod3, comparar = c("as.factor(labor)", "Sexo"))
summary(resB)
plot(resB)