Small data in model calibration for optical tissue phantom validation
Adam Władziński, Grzegorz Orlicki, Michał Barczak, Małogrzata Szczerska, Jacek Łubiński, Filip Janiak
Other Conferences, published 2024-06-12. DOI: 10.1117/12.3021367 (https://doi.org/10.1117/12.3021367)
Abstract
Machine learning algorithms traditionally rely on large datasets for high accuracy. However, advances in the field are now enabling the exploration of solutions in niche engineering areas with smaller datasets. This article reviews the challenges and solutions in working with small datasets, particularly in optoelectronics and biomedical engineering. In optoelectronics, small datasets are key for designing and validating photonic systems, as experiments with living tissues can be costly and complex. The article discusses optimizing photonic response simulations and system calibration using machine learning models that are effective with smaller datasets. In biomedical engineering, the focus is on 3D-printed tissue phantoms, which mimic living tissue properties for non-invasive validation of photonic devices in diagnostics. The study explores how small data techniques like transfer learning, bootstrapping, regularization, and K-fold cross-validation can improve interpretations from small datasets, enhance predictive capabilities, and address data scarcity issues.
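As a rough illustration of three of the techniques named above (regularization, K-fold cross-validation, and bootstrapping), the following sketch fits a one-feature ridge calibration line on a small synthetic dataset. It is not the authors' method: the data, the linear sensor model, the sample size of 20, and every function name (`fit_ridge`, `kfold_mse`, `bootstrap_slope_interval`) are assumptions made for illustration only, and transfer learning is not covered.

```python
import random
import statistics


def fit_ridge(xs, ys, lam):
    """Fit y ≈ slope * x + intercept with an L2 penalty `lam` on the slope."""
    x_mean = statistics.fmean(xs)
    y_mean = statistics.fmean(ys)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / (sxx + lam)  # lam > 0 shrinks the slope toward zero
    return slope, y_mean - slope * x_mean


def kfold_mse(xs, ys, lam, k, rng):
    """Mean held-out squared error over k folds."""
    idx = list(range(len(xs)))
    rng.shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        slope, intercept = fit_ridge(
            [xs[i] for i in train], [ys[i] for i in train], lam
        )
        errs = [(ys[i] - (slope * xs[i] + intercept)) ** 2 for i in held_out]
        scores.append(statistics.fmean(errs))
    return statistics.fmean(scores)


def bootstrap_slope_interval(xs, ys, lam, n_boot, rng):
    """Approximate 95% percentile interval for the slope via resampling."""
    n = len(xs)
    slopes = []
    for _ in range(n_boot):
        sample = [rng.randrange(n) for _ in range(n)]  # draw with replacement
        slope, _ = fit_ridge(
            [xs[i] for i in sample], [ys[i] for i in sample], lam
        )
        slopes.append(slope)
    slopes.sort()
    return slopes[int(0.025 * n_boot)], slopes[int(0.975 * n_boot) - 1]


# Hypothetical calibration data: 20 sensor readings against a known
# phantom property, generated from an assumed linear model plus noise.
rng = random.Random(0)
xs = [rng.uniform(0.0, 1.0) for _ in range(20)]
ys = [2.0 * x + 0.5 + rng.gauss(0.0, 0.05) for x in xs]

# Pick the penalty with the lowest cross-validated error.
candidates = [0.0, 0.01, 0.1, 1.0]
cv_scores = {lam: kfold_mse(xs, ys, lam, k=5, rng=random.Random(1)) for lam in candidates}
best_lam = min(cv_scores, key=cv_scores.get)

# Quantify slope uncertainty; a positive penalty keeps the fit defined
# even for a degenerate resample.
interval = bootstrap_slope_interval(xs, ys, lam=0.01, n_boot=1000, rng=random.Random(2))
print(best_lam, interval)
```

Each candidate penalty is scored on the same fold split (the fold generator is reseeded per candidate), so the comparison between penalties is not confounded by different partitions. With only 20 samples the cross-validated error is itself noisy, which is why the bootstrap interval is reported alongside the point estimate.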