Volume 65 | Issue 8 | Year 2019 | Article Id. IJMTT-V65I8P516 | DOI : https://doi.org/10.14445/22315373/IJMTT-V65I8P516
Many tests for the analysis of continuous data assume that the data follow a normal distribution (e.g., ANOVA and regression). In some research areas, it is common to obtain a dataset with a disproportionately high number of zero values that might otherwise follow a normal or a Poisson distribution. Such datasets are often referred to as ‘zero-inflated’, and their analysis can be challenging. Zero-inflated datasets arise, for example, in plant science. We conducted a simulation study comparing the performance of the zero-inflated Poisson model with that of a standard ANOVA model and a regular Poisson model on different types of zero-inflated data. The factors varied were the underlying distribution, the number of populations, the sample size, and the percentage of zeros. We first assess Type I error and then compare power across the models.
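To make the Type I error assessment concrete, the following is a minimal sketch of one such simulation for the ANOVA model only. It is not the authors' code. The settings (three populations, 20 observations each, 50% structural zeros, a Poisson mean of 3, 2000 replications) and all function names are illustrative assumptions, and fitting the zero-inflated Poisson model itself is not shown. All groups are drawn from the same zero-inflated Poisson distribution, so every rejection is a false positive.

```python
import math
import random


def draw_zip(rng, zero_prob, lam):
    """Draw one zero-inflated Poisson value: a structural zero with
    probability zero_prob, otherwise a Poisson(lam) count (Knuth's method)."""
    if rng.random() < zero_prob:
        return 0
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def anova_p_value(groups):
    """One-way ANOVA p-value for exactly three groups.

    With numerator df = 2, the F survival function has the closed form
    (df2 / (df2 + 2F)) ** (df2 / 2), so no statistics library is needed.
    """
    assert len(groups) == 3, "closed form below assumes three groups"
    n_total = sum(len(g) for g in groups)
    means = [sum(g) / len(g) for g in groups]
    grand_mean = sum(sum(g) for g in groups) / n_total
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    ss_within = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    df2 = n_total - 3
    if ss_within == 0:  # degenerate case: no within-group variation
        return 1.0 if ss_between == 0 else 0.0
    f_stat = (ss_between / 2) / (ss_within / df2)
    return (df2 / (df2 + 2 * f_stat)) ** (df2 / 2)


def type_i_error(n_sims=2000, n_per_group=20, zero_prob=0.5, lam=3.0,
                 alpha=0.05, seed=1):
    """Estimate the ANOVA rejection rate when all groups share one
    zero-inflated Poisson distribution (the null hypothesis is true)."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sims):
        groups = [[draw_zip(rng, zero_prob, lam) for _ in range(n_per_group)]
                  for _ in range(3)]
        if anova_p_value(groups) < alpha:
            rejections += 1
    return rejections / n_sims


rate = type_i_error()
print(f"Estimated Type I error of ANOVA at alpha = 0.05: {rate:.3f}")
```

An estimate close to the nominal 0.05 would suggest that ANOVA holds its level under these particular settings. A power comparison would repeat the same loop with different Poisson means or zero percentages across the groups.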
Lucas Young, Rhonda Magel, Curt Doetkott, "Type I Error Assessment And Power Comparison Of Anova And Zero-Inflated Methods On Zero-Inflated Data," International Journal of Mathematics Trends and Technology (IJMTT), vol. 65, no. 8, pp. 139-153, 2019. Crossref, https://doi.org/10.14445/22315373/IJMTT-V65I8P516