Harmonization and aggregation of heterogeneous data from Human Biomonitoring (HBM) studies is critical to enhance the reliability of conclusions and to move towards FAIR (i.e., Findable, Accessible, Interoperable, Reusable) data. We introduce the HBM Data Toolkit, developed by the Flemish Institute for Technological Research (Vlaamse Instelling voor Technologisch Onderzoek - VITO), with the primary goal of optimizing data integrity and interoperability, key steps towards FAIR, while using flexible templates and ensuring data confidentiality. The HBM Data Toolkit was built in 2023–2024 and made available to stakeholders (via https://hbm.vito.be/tools) within the Partnership for the Assessment of Risks from Chemicals (PARC, eu-parc.eu). The toolkit consists of four modules: data harmonization, data validation, derived-variables calculation, and summary-statistics calculation. A Python package was created to interpret the templates, making validation and transformation possible. Using Pyodide and WebAssembly, the toolkit runs entirely in the web browser, enabling secure, local execution of Python code without uploading any data. In the validation module, input files in a common format (i.e., Excel) are used to configure data templates, aligning with the standards and formats specified under the HBM4EU project (hbm4eu.eu) and PARC. The HBM Data Toolkit allows harmonized data storage in the Personal Exposure and Health (PEH) data platform, and formatted, validated HBM data are made compatible with the Monte Carlo Risk Assessment (MCRA) platform. In the derived-variables calculation module, the toolkit also allows users to impute censored data and to standardize/normalize biomarker data. Furthermore, summary statistics (e.g., geometric mean, percentiles) can be calculated, visualized in the European HBM dashboard, and integrated into the Information Platform for Chemical Monitoring (IPCHEM).
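The derived-variables and summary-statistics steps described above can be sketched as follows. This is a minimal illustration, not the toolkit's actual implementation: it assumes non-detects are flagged as missing values and substituted with LOD/√2, one common convention in HBM studies, and the function names (`impute_censored`, `summary_stats`) are hypothetical.

```python
import numpy as np

def impute_censored(values, lod):
    """Substitute non-detects (NaN) with LOD/sqrt(2).

    LOD/sqrt(2) substitution is one common convention for
    left-censored biomarker data; other methods exist.
    """
    out = np.asarray(values, dtype=float).copy()
    out[np.isnan(out)] = lod / np.sqrt(2)
    return out

def summary_stats(values):
    """Geometric mean and selected percentiles, as typically reported in HBM studies."""
    values = np.asarray(values, dtype=float)
    return {
        "geometric_mean": float(np.exp(np.mean(np.log(values)))),
        "P50": float(np.percentile(values, 50)),
        "P95": float(np.percentile(values, 95)),
    }

# Hypothetical biomarker concentrations with two non-detects (LOD = 0.5)
conc = impute_censored([0.8, np.nan, 1.5, 2.3, np.nan], lod=0.5)
stats = summary_stats(conc)
```

The geometric mean is computed on the log scale, which is standard for the right-skewed, log-normal-like distributions typical of biomarker concentrations.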
In conclusion, the toolkit proves effective in advancing data quality, harmonization, and aggregation in HBM studies. With local execution, user-friendly codebooks, and standardized schemas, it supports a unified framework enabling consistent analysis and interpretation across diverse studies and datasets.