Assessing the Tolerance of Overdispersion in Geographically Weighted Poisson Regression for Spatial Count Data
Muhammad Nur Aidi1*, Indahwati2, Puput Cahya Ambarwati3
MSI Journal of AI and Technology | | Page 01 to 16
Abstract
Count data, particularly in spatial contexts, often exhibit overdispersion and spatial heterogeneity, which violate the assumptions of traditional Poisson regression. Geographically Weighted Poisson Regression (GWPR) extends Poisson regression by accommodating spatial variability in regression parameters, but it assumes equidispersion, an assumption frequently violated in practice. An alternative, the Geographically Weighted Negative Binomial Regression (GWNBR), accounts for overdispersion but is computationally intensive. This study evaluates the robustness of GWPR under varying levels of overdispersion through simulation. Data were generated across 49 spatial locations with two explanatory variables and three levels of overdispersion: negligible, moderate, and severe. Root Mean Square Error (RMSE) was used to assess model performance. Results indicate that GWPR performs reliably when overdispersion is low to moderate, with only a marginal increase in RMSE. However, as overdispersion becomes severe, GWPR's accuracy declines substantially. The findings suggest that GWPR remains appropriate for spatial count data under mild overdispersion but should be replaced by GWNBR in high-overdispersion contexts.
Keywords: Count data; Overdispersion; Spatial heterogeneity; Geographically Weighted Poisson Regression (GWPR); Geographically Weighted Negative Binomial Regression (GWNBR); Simulation; Root Mean Square Error (RMSE).
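The data-generation step described in the abstract above can be sketched as follows. This is a minimal illustration only, not the authors' simulation design: the 7 x 7 grid layout, the coefficient surfaces, the dispersion values (0, 0.5, 2.0), and all function names are assumptions. It does not fit GWPR or GWNBR; it only draws counts from a gamma-Poisson (negative binomial) mixture at 49 locations and reports the RMSE of the counts about their true means, which shows how the noise any fitted model must cope with grows with overdispersion.

```python
import math
import random


def poisson_draw(mean, rng):
    """Draw one Poisson variate with Knuth's multiplication method."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def overdispersed_draw(mean, dispersion, rng):
    """Gamma-Poisson mixture with variance = mean + dispersion * mean**2."""
    if dispersion <= 0:
        return poisson_draw(mean, rng)  # equidispersed case
    # Gamma multiplier with mean 1 and variance equal to `dispersion`.
    multiplier = rng.gammavariate(1.0 / dispersion, dispersion)
    return poisson_draw(mean * multiplier, rng)


def make_location_means(rng, side=7):
    """True means on a side x side grid (49 locations by default).

    The coefficients drift with the grid coordinates (u, v) to mimic
    spatial heterogeneity; the specific values are hypothetical.
    """
    means = []
    for u in range(side):
        for v in range(side):
            x1, x2 = rng.random(), rng.random()  # two explanatory variables
            beta1 = 0.1 + 0.05 * u
            beta2 = -0.1 + 0.05 * v
            means.append(math.exp(1.0 + beta1 * x1 + beta2 * x2))
    return means


def rmse_about_true_mean(dispersion, replications=200, seed=1):
    """RMSE between simulated counts and their true means."""
    rng = random.Random(seed)
    means = make_location_means(rng)
    squared_errors = []
    for _ in range(replications):
        for mu in means:
            y = overdispersed_draw(mu, dispersion, rng)
            squared_errors.append((y - mu) ** 2)
    return math.sqrt(sum(squared_errors) / len(squared_errors))


if __name__ == "__main__":
    for label, alpha in (("negligible", 0.0), ("moderate", 0.5), ("severe", 2.0)):
        print(label, round(rmse_about_true_mean(alpha), 3))
```

In a fuller study, each simulated data set would be passed to a GWPR fit, and the RMSE would be computed from the fitted values rather than from the true means.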
CLASSIFICATION OF WOOD STRENGTH CLASSES BASED ON MECHANICAL PROPERTIES USING CLUSTER ANALYSIS
Muhammad Nur Aidi1*
MSI Journal of AI and Technology | | Page 01 to 16
Abstract
This paper presents a classification process of various wood types by applying a hybrid methodology that combines unsupervised and supervised classification techniques. The unsupervised classification employs Hierarchical Cluster Analysis to group wood types based on their mechanical characteristics. The resulting clusters are then used as input for Discriminant Analysis to derive discriminant functions and validate the classification. The combined approach clearly demonstrates the effectiveness of clustering in distinguishing between groups, as shown by significantly different mean values for each mechanical variable across the clusters. This approach not only enhances classification accuracy but also provides a clearer understanding of wood strength classes, particularly for lesser-known wood species.
Keywords: Wood classification, mechanical properties, cluster analysis, discriminant analysis, multivariate analysis, wood strength class.
Evaluating the Equidispersion Assumption in Poisson Distributions Through Simulation: A Study on Variance Mean Ratio Behavior
Siti Hariati Astuti1, Muhammad Nur Aidi2*, Kusman Sadik2
MSI Journal of AI and Technology | | Page 01 to 16
Abstract
This study explores the fundamental assumption of equidispersion in Poisson-distributed count data, wherein the mean equals the variance. Although the Poisson model is widely used for modeling counts of rare events in fields such as epidemiology, telecommunications, and operations research, its assumption of equidispersion is frequently violated in real-world applications. Using simulated datasets, this research investigates the behavior of the Variance Mean Ratio (VMR) under different values of the Poisson parameter (λ) and sample sizes (n). Simulations were conducted across λ values ranging from 1 to 20 and sample sizes of 20, 40, 60, 80, and 100, each replicated 100 times. The study evaluates the stability and accuracy of the equidispersion property, employing histograms and statistical diagnostics to assess distributional characteristics. The results offer insights into when the Poisson distribution adequately models count data and when alternative models, such as the negative binomial or zero-inflated models, may be required due to overdispersion. This analysis contributes to more informed and accurate modeling of discrete count data in statistical applications.
Keywords: Poisson distribution, count data, equidispersion, overdispersion, variance mean ratio, simulation, discrete probability distribution, statistical modeling.
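The simulation design described in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names, the seed, the use of the sample variance with an n - 1 denominator, and the Knuth sampler are all choices made here for a standard-library-only example. For Poisson data the VMR should scatter around 1, with less scatter as n grows.

```python
import math
import random
import statistics


def poisson_draw(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def variance_mean_ratio(sample):
    """Sample variance (n - 1 denominator) divided by the sample mean."""
    return statistics.variance(sample) / statistics.mean(sample)


def simulate_vmr(lam, n, replications=100, seed=0):
    """Return the VMR of each replicated Poisson sample of size n."""
    rng = random.Random(seed)
    ratios = []
    for _ in range(replications):
        sample = [poisson_draw(lam, rng) for _ in range(n)]
        if sum(sample) > 0:  # the VMR is undefined for an all-zero sample
            ratios.append(variance_mean_ratio(sample))
    return ratios


if __name__ == "__main__":
    # A subset of the grid in the abstract (lambda 1 to 20, n 20 to 100).
    for lam in (1, 5, 20):
        for n in (20, 60, 100):
            ratios = simulate_vmr(lam, n)
            print(lam, n,
                  round(statistics.mean(ratios), 3),
                  round(statistics.stdev(ratios), 3))
```

Each printed row gives λ, n, the mean VMR across replications, and its standard deviation, so the stabilizing effect of larger samples can be read directly from the last column.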
DIFFERENTIATION OF PINE STAND AGE CLASSES THROUGH DISCRIMINANT ANALYSIS
Muhammad Nur Aidi1*
MSI Journal of AI and Technology | | Page 01 to 18
Abstract
Indonesia's forest ecosystems are essential for both ecological stability and economic productivity. Effective forest management relies on accurate data, with stand tables serving as key tools for understanding forest structure. Traditionally derived from field measurements, stand tables can now be developed using remote sensing data, including aerial photographs. This study explores the potential of photographic variables to distinguish forest age classes in pine plantations managed by Perum Perhutani. Using 40 observations of aerial imagery interpreted for qualitative and quantitative variables, non-hierarchical cluster analysis was applied to group forest stands into six age classes. Discriminant analysis was then conducted to identify significant variables and develop classification functions. The results show that quantitative variables, such as crown cover, crown diameter, and tree height, significantly differentiate forest age classes, while qualitative variables like tone and topography were less effective. The first two discriminant functions explained 98.5% of the variance, confirming their strong discriminatory power. This approach demonstrates that aerial photographic variables, particularly quantitative ones, offer a promising alternative for forest age classification, enabling more efficient and large-scale forest inventory and planning.
Keywords: Forest inventory; aerial photography; stand tables; cluster analysis; discriminant analysis; forest age classification; remote sensing; pine plantations; Perum Perhutani; quantitative variables.
Logistic Discriminant Analysis Using SMOTE for Anemia Classification in Women of Reproductive Age
Muhammad Nur Aidi1*, Amamlia Nailul Husna2, Rahma Anisa3, Elisa Diana Juanti4
MSI Journal of AI and Technology | | Page 01 to 14
Abstract
Anemia is a condition in which the hemoglobin (Hb) level is below normal, defined as less than 12 g/dL for women of reproductive age. Identifying how anemia depends on its risk factors is important for distinguishing anemic from non-anemic women with a classification method. The data in this study include both numerical and categorical variables, so logistic discriminant analysis is used as the classification method. The dependent variable is imbalanced, with far more non-anemia observations than anemia observations, so SMOTE is applied to handle the imbalance before modeling. The logistic discriminant separates the observations according to the dependent variable and yields a model in which the variables that affect the dependent variable can be identified from the significant model coefficients. The results show that the logistic discriminant classification model achieved fairly good performance, with 73.68% accuracy. The variables that affect anemia status in this study are pneumonia, tuberculosis, hepatitis, diabetes mellitus, malaria, gestational age, and age group.
Keywords: anemia, logistic discriminant, SMOTE.
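The oversampling idea behind SMOTE, as used in the abstract above, can be sketched as follows. This is a minimal illustration for numeric features only, under stated assumptions: the function name `smote_sketch`, the parameter defaults, and the toy data are hypothetical, and the study's handling of categorical variables is not reproduced. Each synthetic observation is placed at a random point on the segment between a minority-class observation and one of its k nearest minority-class neighbours.

```python
import math
import random


def smote_sketch(minority, n_synthetic, k=5, seed=0):
    """Generate synthetic minority-class points by neighbour interpolation.

    `minority` is a list of numeric feature vectors (at least two).
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # k nearest minority neighbours of the chosen point, by Euclidean distance.
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append([b + gap * (q - b) for b, q in zip(base, neighbour)])
    return synthetic


if __name__ == "__main__":
    # Hypothetical minority-class observations with two numeric features.
    anemia_cases = [[10.5, 24.0], [11.2, 31.0], [9.8, 28.0], [11.0, 35.0]]
    for point in smote_sketch(anemia_cases, n_synthetic=3, k=2):
        print([round(value, 2) for value in point])
```

After oversampling, the synthetic points would be appended to the minority class in the training data before the classifier is fitted; the test data are normally left unchanged.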