The liver is one of the body's most essential organs, supporting metabolism and overall health. Successful treatment and better patient outcomes depend on early and accurate diagnosis of Liver Disease (LD). This study proposes a system for predicting LD that combines Machine Learning (ML) algorithms, namely Decision Tree, Random Forest, Extra Tree Classifier (ETC), LightGBM, and AdaBoost, with the Tree-Structured Parzen Estimator (TPE) method for hyperparameter tuning. No previous study in the literature has used ML algorithms with TPE to predict LD. For this research, the Indian Liver Patients’ Dataset, with 583 instances and 11 attributes, was used. During pre-processing, upsampling was used to address the class imbalance problem, normalization to scale the dataset, and feature selection to retain the important features. The proposed models were evaluated and compared using 10-fold cross-validation with several metrics, including accuracy, precision, recall, and F1-score. The best accuracy, 95.8%, was achieved by the ETC tuned with TPE.
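The pre-processing and evaluation steps named in this abstract (upsampling, normalization, 10-fold cross-validation) can be illustrated with a minimal standard-library sketch. The function names and the random-duplication strategy are assumptions made for illustration; the abstract does not say which upsampling or scaling variant the authors used, and the classifier and TPE tuning steps are omitted.

```python
import random

def upsample_minority(rows, labels, seed=0):
    """Randomly duplicate rows of smaller classes until every class matches the largest."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = max(len(members) for members in by_class.values())
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        # Draw the missing rows with replacement from the class's own members.
        extra = [rng.choice(members) for _ in range(target - len(members))]
        for row in members + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels

def min_max_normalize(rows):
    """Scale each column to [0, 1]; constant columns map to 0.0."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]

def k_fold_indices(n, k=10, seed=0):
    """Yield (train_idx, test_idx) index lists for k-fold cross-validation."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, folds[i]
```

In practice, resampling is usually applied inside each training fold rather than before the split, so that duplicated rows do not appear in both the training and test sets of the same fold.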
The National Neonatal Screening Program of the Brazilian Unified Health Care System screens newborns for six diseases. Mass spectrometry equipment is needed to expand the program and to detect additional diseases more efficiently and effectively. However, only four neonatal screening centers have equipment capable of carrying out the expanded test, and any expansion of health service capacity should consider both the rationalization of costs and the comprehensiveness and accessibility of care for the population. This study uses analytics to estimate the cost and service level of centralized and distributed logistics networks for performing the expanded test for newborns throughout Brazil. We evaluate the accessibility of the current infrastructure of the neonatal screening program and propose a novel location–allocation model to create a more integrated infrastructure that reduces disparities and increases the accessibility of neonatal screening services.
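The abstract does not specify its location–allocation formulation. As a hedged illustration of the general idea, the sketch below solves a tiny p-median problem by brute force: choose p sites to open and assign each demand point to its cheapest open site. The p-median choice, the function name, and the toy numbers are all assumptions, not the authors' model.

```python
from itertools import combinations

def solve_p_median(demand, cost, p):
    """Brute-force p-median.

    demand[i]  : weight of demand point i (e.g. number of samples to test)
    cost[i][j] : unit cost of serving demand point i from candidate site j
    Returns (total_cost, open_sites, assignment), where assignment[i] is the
    open site serving demand point i.
    """
    n_sites = len(cost[0])
    best = None
    for sites in combinations(range(n_sites), p):
        # Each demand point is served by its cheapest open site.
        assignment = [min(sites, key=lambda j: cost[i][j])
                      for i in range(len(demand))]
        total = sum(w * cost[i][j]
                    for i, (w, j) in enumerate(zip(demand, assignment)))
        if best is None or total < best[0]:
            best = (total, sites, assignment)
    return best

# Hypothetical toy instance: 4 demand regions, 3 candidate laboratories.
demand = [10, 20, 5, 15]
cost = [[1, 4, 6],
        [3, 1, 5],
        [6, 4, 1],
        [5, 2, 2]]
centralized = solve_p_median(demand, cost, p=1)   # one laboratory
distributed = solve_p_median(demand, cost, p=2)   # two laboratories
```

Comparing the results for small and large p mirrors the centralized-versus-distributed trade-off. Enumeration only works for toy instances; realistic networks would need an integer programming solver.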
In recent decades, breast cancer has become the most prevalent cancer among women worldwide and a significant contributor to female mortality. Early identification of breast cancer can drastically decrease patient mortality and greatly improve the chance of effective treatment. Machine learning models have become crucial for classifying cancer and for improving both the accuracy and efficiency of diagnostic and treatment strategies. This study therefore focuses on early detection of breast cancer using a variety of machine learning algorithms and aims to identify the most effective feature selection process on an amalgamated dataset. Initially, we evaluated five traditional models and two meta-models on separate datasets. To find the most valuable features, the study used the Least Absolute Shrinkage and Selection Operator (LASSO) and SHapley Additive exPlanations (SHAP) as selection methods and analyzed them using a wide range of performance measures. We then applied these models to the combined dataset and observed that the merged dataset was significantly beneficial for breast cancer diagnosis. The analysis of the feature selection strategies showed that the majority of models performed more accurately with SHAP-based selection. Notably, three traditional models and two meta-classifiers obtained an accuracy of 99.82%, outperforming state-of-the-art methods. This advance lays a foundation for refining diagnostic tools and for further progress in this field.
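To make the idea of SHAP-based feature selection concrete, the following sketch computes exact Shapley values by enumerating feature subsets and ranks features by mean absolute attribution. It is an illustration under assumptions: the single-baseline masking, the function names, and the toy linear model are hypothetical, and the SHAP library relies on faster approximations rather than this exponential-time enumeration.

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict at x; absent features take baseline values."""
    n = len(x)

    def value(subset):
        masked = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(masked)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi

def rank_features(predict, samples, baseline):
    """Rank feature indices by mean absolute Shapley value, largest first."""
    n = len(baseline)
    totals = [0.0] * n
    for x in samples:
        for i, v in enumerate(shapley_values(predict, x, baseline)):
            totals[i] += abs(v)
    means = [t / len(samples) for t in totals]
    order = sorted(range(n), key=lambda i: means[i], reverse=True)
    return order, means

# Hypothetical toy model: a linear score in which the third feature is irrelevant.
def toy_model(features):
    return 2.0 * features[0] - 1.0 * features[1] + 0.0 * features[2]
```

A selection step would keep the top-ranked features and retrain the classifiers on that subset. How many features to keep is a tuning choice that the abstract does not state.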
The rapid spread of coronavirus disease 2019 (COVID-19) initially presented unprecedented challenges for clinicians, policymakers, and healthcare systems, as there was limited evidence on the efficacy of various control measures. This study provides a detailed overview of the global progression of COVID-19 mortality in the context of vaccine rollout, using public surveillance data from 145 countries sourced from the World Health Organization and the World Bank. The primary focus is to analyze shifts in the trend of new COVID-19 mortality worldwide before and after the introduction of COVID-19 vaccines. To this end, we propose a longitudinal mixed effects model that relates the mortality trend to vaccination rollout and other pertinent covariates. The modeling approach accommodates variation in the timing of COVID-19 vaccine rollout among countries, as well as the correlation of observations within the same country. Our findings highlight the significant impact of new cases, cardiovascular death rate, senior population, stringency index, and reproduction rate on mortality. However, we find that the impact of vaccination is not statistically significant, as evidenced by a relatively large p-value. Furthermore, the study reveals substantial disparities in mortality rates among countries across four income groups.
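The abstract does not give the model equation. One common way to write a longitudinal mixed effects model with a country-specific change point at rollout is sketched below; every symbol is an assumption introduced for illustration and may differ from the authors' specification.

```latex
% y_{it}: mortality outcome for country i at time t
% T_i: country-specific vaccine rollout time;  V_{it} = 1 if t >= T_i, else 0
% (t - T_i)_+ = max(t - T_i, 0): time elapsed since rollout
% x_{it}: other covariates (e.g. new cases, stringency index)
\begin{align}
  y_{it} &= \beta_0 + \beta_1 t + \beta_2 V_{it} + \beta_3 (t - T_i)_+
            + x_{it}^{\top}\gamma + b_{0i} + b_{1i} t + \varepsilon_{it}, \\
  (b_{0i}, b_{1i})^{\top} &\sim \mathcal{N}(0, D), \qquad
  \varepsilon_{it} \sim \mathcal{N}(0, \sigma^2).
\end{align}
```

In this form, $\beta_2$ and $\beta_3$ capture a level shift and a slope change after rollout, the country-specific $T_i$ handles differences in rollout timing, and the random effects $b_{0i}, b_{1i}$ induce correlation among observations from the same country.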
The primary goal of this research is to examine the impact of balancing data on prediction quality and inference in multilevel logistic regression models. Logistic regression is a valuable approach for modeling the binary outcomes common in health applications. The class imbalance problem, where one of the two outcome categories occurs much more often than the other, is common in healthcare data, such as when modeling the risk factors for rare diseases. The issue is particularly relevant for medical data that combine individual measurements with other data measured at the level of a geographic region, such as environmental risk factors. For this work, both prediction and model interpretation are of interest. A simulation model is proposed to test the impact of balancing strategies on the multilevel logistic model's parameter estimation, inference, and predictive performance. The simulated data emulate characteristics of a Gestational Diabetes Mellitus (GDM) dataset from Indiana's Medicaid program. Several datasets were simulated with varying levels of complexity, involving the balance of the outcome variable and predictors. These datasets exhibited high- or low-frequency occurrences in specific intersections of variables, often called ‘cells.’ The impact of the balancing strategies on prediction and inference was assessed using different techniques, such as the two one-sided tests (TOST) equivalence test, power analysis, and predictive measures. To the best of our knowledge, this is the first study to explore the impact of using balanced samples on coefficient estimation and prediction measures in multilevel logistic modeling, and it finds evidence of the benefits of using balanced samples in this context.
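The simulation design described here can be approximated, in a much reduced form, by generating data from a random-intercept logistic model and then balancing the outcome. The sketch below is a standard-library illustration only: the parameter values, the single predictor, the function names, and the choice of random undersampling are assumptions, and the model-fitting and TOST steps are not shown.

```python
import math
import random

def simulate_multilevel_logistic(n_groups, n_per_group, intercept, slope,
                                 sigma_u, seed=0):
    """Simulate (group, x, y) rows from a random-intercept logistic model:
    logit P(y_ij = 1) = intercept + slope * x_ij + u_j,  u_j ~ N(0, sigma_u^2).
    """
    rng = random.Random(seed)
    data = []
    for g in range(n_groups):
        u = rng.gauss(0.0, sigma_u)          # region-level random intercept
        for _ in range(n_per_group):
            x = rng.gauss(0.0, 1.0)          # individual-level predictor
            p = 1.0 / (1.0 + math.exp(-(intercept + slope * x + u)))
            y = 1 if rng.random() < p else 0
            data.append((g, x, y))
    return data

def undersample_majority(data, seed=0):
    """Randomly drop majority-class rows until both outcome classes are equal in size."""
    rng = random.Random(seed)
    pos = [r for r in data if r[2] == 1]
    neg = [r for r in data if r[2] == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    return minority + rng.sample(majority, len(minority))

# Hypothetical settings: 20 regions, 200 individuals each, a rare outcome.
full = simulate_multilevel_logistic(20, 200, intercept=-3.0, slope=1.0, sigma_u=0.5)
balanced = undersample_majority(full)
```

Fitting the same multilevel model to `full` and `balanced` and comparing the coefficient estimates and predictive measures would be the next step; that requires a mixed-model library and is left out of this sketch.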

