The recent proliferation of advanced data collection technologies for Patient Generated Health Data (PGHD) has made remote health monitoring more accessible. However, the volume and complexity of the medical data generated present a significant challenge for traditional patient monitoring approaches, impeding the effective extraction of useful information. In this context, it is imperative to develop a robust and cost-effective framework that provides scalability and handles the heterogeneity of PGHD in real time. Such a system could serve as a reference and guide future research on monitoring patients undergoing treatment at home. This study presents a real-time visual analytics framework offering insightful visual representations of multimodal big data. The proposed system was designed following the principles of User Centered Design (UCD) to ensure that it meets the needs and expectations of medical practitioners. The usability of the framework was evaluated through its application to the visualization of kinematic data of patients' upper-limb movements during neuromotor rehabilitation exercises.
{"title":"Real-Time Visual Analytics for Remote Monitoring of Patient’s Health","authors":"Maryam Boumrah, S. Garbaya, A. Radgui","doi":"10.24132/csrn.3301.61","DOIUrl":"https://doi.org/10.24132/csrn.3301.61","url":null,"abstract":"The recent proliferation of advanced data collection technologies for Patient Generated Health Data (PGHD) has made remote health monitoring more accessible. However, the complex nature of the big volume of medical generated data presents a significant challenge for traditional patient monitoring approaches, impeding the effective extraction of useful information. In this context, it is imperative to develop a robust and cost-effective framework that provides the scalability and deals with the heterogeneity of PGHD in real-time. Such a system could serve as a reference and would guide future research for monitoring patient undergoing a treatment at home conditions. This study presents a real-time visual analytics framework offering insightful visual representations of the multimodal big data. The proposed system was designed following the principles of User Centered Design (UCD) to ensure that it meets the needs and expectations of medical practitioners. The usability of this framework was evaluated by its application to the visualization of kinematic data of the upper limbs’ movement of patients during neuromotor rehabilitation exercises.","PeriodicalId":322214,"journal":{"name":"Computer Science Research Notes","volume":"2012 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131110738","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Until now, it has been impossible to imagine industrial manual assembly without humans, owing to their flexibility and adaptability. But the assembly process does not always benefit from human intervention: assembler errors caused by disturbance, distraction, or inattention call for intelligent support of the employee, and the task is well suited to deep learning approaches because of its constantly recurring, repetitive data patterns. However, labels for such data are often not sufficiently available. In this work, a spatio-temporal transformer model is used to address this scarcity of labels in an industrial setting. A pseudo-labeling method from the field of semi-supervised transfer learning is applied for model training, and the entire architecture is adapted to the fine-grained recognition of human hand actions in assembly. This approach significantly improves the generalization of the model during training across different variations of strong and weak classes from the ground truth, and demonstrates that deep learning technologies can be applied in an industrial setting even with few labels. In addition to the main goal of improving the model's generalization capabilities by using less data during training and exploring different variations of appropriate ground truth and new classes, the model's recognition capabilities are improved by adding convolution to the temporal embedding layer, which increases test accuracy by over 5% compared to a similar predecessor model.
{"title":"Semi-Supervised Learning Approach for Fine Grained Human Hand Action Recognition in Industrial Assembly","authors":"Fabian Sturm, Rahul Sathiyababu, E. Hergenroether, M. Siegel","doi":"10.24132/csrn.3301.58","DOIUrl":"https://doi.org/10.24132/csrn.3301.58","url":null,"abstract":"Until now, it has been impossible to imagine industrial manual assembly without humans due to their flexibility and adaptability. But the assembly process does not always benefit from human intervention. The error-proneness of the assembler due to disturbance, distraction or inattention requires intelligent support of the employee and is ideally suited for deep learning approaches because of the permanently occurring and repetitive data patterns. However, there is the problem that the labels of the data are not always sufficiently available. In this work, a spatio-temporal transformer model approach is used to address the circumstances of few labels in an industrial setting. A pseudo-labeling method from the field of semi-supervised transfer learning is applied for model training, and the entire architecture is adapted to the fine-grained recognition of human hand actions in assembly. This implementation significantly improves the generalization of the model during the training process over different variations of strong and weak classes from the ground truth and proves that it is possible to work with deep learning technologies in an industrial setting, even with few labels. In addition to the main goal of improving the generalization capabilities of the model by using less data during training and exploring different variations of appropriate ground truth and new classes, the recognition capabilities of the model are improved by adding convolution to the temporal embedding layer, which increases the test accuracy by over 5% compared to a similar predecessor model.","PeriodicalId":322214,"journal":{"name":"Computer Science Research Notes","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117064208","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Tracking errors severely impact the effectiveness of augmented reality display techniques for indoor navigation. In this work we examine the sources of error and the accuracy of existing tracking technologies. From these we derive design criteria for robust display techniques and present objective evaluation criteria, which allow indoor navigation techniques to be assessed without, or in preparation for, quantitative user studies. Based on these criteria we propose a new error-tolerant display technique called Bending Words, in which words move along the navigation path to guide the user. Bending Words outranks the other evaluated display techniques on many of the tested criteria and provides a robust, error-tolerant alternative to established augmented reality indoor navigation display techniques.
{"title":"Error-Robust Indoor Augmented Reality Navigation: Evaluation Criteria and a New Approach","authors":"Oliver Scheibert, Jannis Möller, S. Grogorick, M. Eisemann","doi":"10.24132/csrn.3301.17","DOIUrl":"https://doi.org/10.24132/csrn.3301.17","url":null,"abstract":"Tracking errors severely impact the effectiveness of augmented reality display techniques for indoor navigation. In this work we take a look at the sources of error and accuracy of existing tracking technologies. We derive important design criteria for robust display techniques and present objective criteria. These serve evaluation of indoor navigation techniques without or in preparation of quantitative user studies. Based on these criteria we propose a new error tolerant display technique called Bending Words, where words move along the navigation path guiding the user. Bending Words outranks the other evaluated display techniques in many of the tested criteria and provides a robust, error-tolerant alternative to established augmented reality indoor navigation display techniques.","PeriodicalId":322214,"journal":{"name":"Computer Science Research Notes","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124514071","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Ray tracing remains of interest to the Computer Graphics community for its elegant framing of how light interacts with objects, its ability to easily support multiple light sources, and its simple framework for merging synthetic and real cameras. The recent trend of providing implementations at the chip level means that ray tracing's constant quest for realism will propel its use in real-time applications. AR/VR, animation, the 3D games industry, large-scale 3D simulations, and future social computing platforms are just a few examples of possible major impact. Ray tracing also appeals to the HCI community because it extends well across 3D space and time, seamlessly blending synthetic and real cameras at multiple scales to support storytelling. This presentation will include a few milestones from my work, such as the Slicing Extent technique and Directed Safe Zones. Our recent applications of machine learning techniques to creating novel synthetic views, which could also provide a future doorway to handling dynamic scenes as more compute power becomes available, will also be presented. It is once again a renaissance for ray tracing, which for the last 50+ years has remained the most elegant technique for modeling light phenomena in virtual worlds at whatever scale compute power could support.
{"title":"Raytracing Renaissance: An elegant framework for modeling light at Multiple Scales","authors":"S. Semwal","doi":"10.24132/csrn.3301.2","DOIUrl":"https://doi.org/10.24132/csrn.3301.2","url":null,"abstract":"Ray tracing remains of interest to Computer Graphics community with its elegant framing of how light interacts with objects, being able to easily support multiple light sources, and simple framework of merging synthetic and real cameras. Recent trends to provide implementations at the chip-level means raytracing’s constant quest of realism would propel its usage in real-time applications. AR/VR, Animations, 3DGames Industry, 3D-large scale simulations, and future social computing platforms are just a few examples of possible major impact. Raytracing is also appealing to HCI community because raytracing extends well along the 3D-space and time, seamlessly blending both synthetic and real cameras at multiple scales to support storytelling. This presentation will include a few milestones from my work such as the Slicing Extent technique and Directed Safe Zones. Our recent applications of applying machine learning techniques creating novel synthetic views, which could also provide a future doorway to handle dynamic scenes with more compute power as needed, will also be presented. It is once again renaissance for ray tracing which for last 50+ years has remained the most elegant technique for modeling light phenomena in virtual worlds at whatever scale compute power could support.","PeriodicalId":322214,"journal":{"name":"Computer Science Research Notes","volume":"59 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133043618","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We present a novel methodological approach for the interactive editing of big point clouds. Based on the mathematics of fiber bundles, the proposed approach models a data structure that is efficient for visualization, modification, and I/O, and includes an unlimited multi-level set of editing states useful for expressing and maintaining multiple undo histories. Backed by HDF5 as a high-performance file format, this data structure naturally allows persistent storage of the history of modification actions, a unique new feature of our approach. The challenges of visually based manual editing of big point clouds are discussed and a suitable rendering solution is presented. The implemented solution, with its features that follow as consequences of the underlying methodology, is also compared with two major mainstream applications providing point-cloud editing tools.
{"title":"The Method of Mixed States for Interactive Editing of Big Point Clouds","authors":"W. Benger, A. Voicu, R. Baran, Loredana Gonciulea, Cosmin Barna, F. Steinbacher","doi":"10.24132/csrn.3301.21","DOIUrl":"https://doi.org/10.24132/csrn.3301.21","url":null,"abstract":"We present a novel methodological approach for the interactive editing of big point clouds. Based on the mathematics of fiber bundles, the proposed approach to model a data structure that is efficient for visualization, modification and I/O including an unlimited multi-level set of editing states useful for expressing and maintaining multiple undo histories. Backed by HDF5 as high performance file format, this data structure naturally allows persistent storage for the history of modification actions, an unique new feature of our approach. The challenges of visually based manual editing of big point clouds are discussed and a proper rendering solution is presented. The implemented solution and its features as consequences of the underlying methodology is compared with two major mainstream applications providing point-cloud editing tools as well.","PeriodicalId":322214,"journal":{"name":"Computer Science Research Notes","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129025026","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In recent academic literature, Sex and Gender have become synonyms, even though distinct definitions exist. This gives rise to the question: which of the two are face image classifiers actually identifying? It is argued and explained why CNN-based classifiers will generally identify gender, while feeding face recognition feature vectors into a neural network will tend to verify sex rather than gender. It is shown for the first time how state-of-the-art sex classification can be performed using Embedded Prototype Subspace Classifiers (EPSC), and how the projection depth can be learned efficiently. The automatic gender classification produced by the InsightFace project is used as a baseline and compared to the results given by the EPSC, which takes the feature vectors produced by InsightFace as input. It turns out that the projection depth needed is much larger for these face feature vectors than for, say, a classifier on MNIST or similar data. Therefore, one important contribution is a simple method to determine the optimal depth for any kind of data. Furthermore, it is shown how the weights in the final layer can be set to make the choice of depth stable and independent of the kind of learning data. The resulting EPSC is extremely lightweight and yet very accurate, reaching over 98% accuracy on several datasets.
{"title":"Sex Classification of Face Images using Embedded Prototype Subspace Classifiers","authors":"A. Hast","doi":"10.24132/csrn.3301.7","DOIUrl":"https://doi.org/10.24132/csrn.3301.7","url":null,"abstract":"In recent academic literature Sex and Gender have both become synonyms, even though distinct definitions do exist. This give rise to the question, which of those two are actually face image classifiers identifying? It will be argued and explained why CNN based classifiers will generally identify gender, while feeding face recognition feature vectors into a neural network, will tend to verify sex rather than gender. It is shown for the first time how state of the art Sex Classification can be performed using Embedded Prototype Subspace Classifiers (EPSC) and also how the projection depth can be learned efficiently. The automatic Gender classification, which is produced by the emph{InsightFace} project, is used as a baseline and compared to the results given by the EPSC, which takes the feature vectors produced by emph{InsightFace} as input. It turns out that the depth of projection needed is much larger for these face feature vectors than for an example classifying on MNIST or similar. Therefore, one important contribution is a simple method to determine the optimal depth for any kind of data. Furthermore, it is shown how the weights in the final layer can be set in order to make the choice of depth stable and independent of the kind of learning data. The resulting EPSC is extremely light weight and yet very accurate, reaching over $98%$ accuracy for several datasets.","PeriodicalId":322214,"journal":{"name":"Computer Science Research Notes","volume":"60 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133757922","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
3D landscape generation is an interdisciplinary field that requires expertise in both computer graphics and geographic information systems (GIS). It is a complex and time-consuming process. In this paper, we present a new approach to simplify the 3D environment generation process by creating a go-between data model containing a list of available source data and the steps to use them. To feed the data model, we introduce a formal language that describes the process's sequence. We propose an adapted format, designed to be both human-readable and machine-readable, allowing easy creation and modification of the scenery. We demonstrate the utility of our approach by implementing a prototype system that generates 3D landscapes, with a use case suited to multipurpose simulation. Our system takes a description as input and outputs a complete 3D environment, including terrain and feature elements such as buildings created by a chosen geometrical process. Experiments show that our approach reduces the time and effort required to generate a 3D environment, making it accessible to a wider range of users without extensive knowledge of GIS. In conclusion, our custom language and implementation provide a simple and effective solution to the complexity of 3D terrain generation, making it a valuable tool for users in the area.
{"title":"Operational theater generation by a descriptive language","authors":"Matis Ghiotto, B. Desbenoit, Romain Raffin","doi":"10.24132/csrn.3301.19","DOIUrl":"https://doi.org/10.24132/csrn.3301.19","url":null,"abstract":"3D landscapes generation is an interdisciplinary field that requires expertise in both computer graphics and geographic informations systems (GIS). It is a complex and time-consuming process. In this paper, we present a new approach to simplify 3D environment generation process, by creating a go-between data-model containing a list of available source data and steps to use them. To feed the data-model, we introduce a formal language that describes the process\"s sequence. We propose an adapted format, designed to be human-readable and machine-readable, allowing for easy creation and modification of the scenery. We demonstrate the utility of our approach by implementing a prototype system to generate 3D landscapes with a use-case fit for multipurpose simulation. Our system takes a description as input and outputs a complete 3D environment, including terrain and feature elements such as buildings created by chosen geometrical process. Experiments show that our approach reduces the time and effort required to generate a 3D environment, making it accessible to a wider range of users without extensive knowledge of GIS. In conclusion, our custom language and implementation provide a simple and effective solution to the complexity of 3D terrain generation, making it a valuable tool for users in the area.","PeriodicalId":322214,"journal":{"name":"Computer Science Research Notes","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131455334","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Research on light vision mechanisms in biosystems and on the mechanisms behind deficits in color discrimination [1] reveals that not only white light is polychromatic: all light waves are. The spectrum of white light is composed of aggregations of only four monochromatic waves: magenta UV 384 nm, cyan 432 nm, yellow 576 nm, and magenta IR 768 nm, grouped into five bi-chromatic waves: cinnabar red (magenta IR + yellow), green (yellow + cyan), indigo (cyan + magenta UV), and two semi-bright bi-chromatic waves, porphyry IR (a semi-infrared wave composed of the magenta IR 768 nm wave and the colorless infrared wave at 864 nm) and porphyry UV (a semi-ultraviolet wave composed of the magenta UV 384 nm wave and the colorless ultraviolet wave at 288 nm). Light waves composed in this way create light sensations through the mechanism of additive synthesis. The method allows a new approach to interpreting the composition of bright waves, the phenomenon of color decomposition, and the additive synthesis that constitutes the principle of color production in computers. These newly elaborated models of color physics also form a basis for interpreting the mechanisms of color vision.
{"title":"Polychromatism of all light waves: new approach to the analysis of the physical and perceptive color aspects","authors":"Justyna Niewiadomska-Kaplar","doi":"10.24132/csrn.3301.43","DOIUrl":"https://doi.org/10.24132/csrn.3301.43","url":null,"abstract":"Research on light vision mechanisms in biosystems and on the mechanisms of formation of deficits in color discrimination[1] reveals that not only white light is polychromatic but all light waves are. The spectrum of white light is composed of aggregations of only 4 monochromatic waves: magenta UV 384 nm, cyan 432 nm, yellow 576 nm and magenta IR 768 nm, grouped in 5 bi-chromatic waves: cinnabar red (magenta IR + yellow), green (yellow + cyan), indigo (cyan + magenta UV) and also two semi-bright bi-chromatic waves - porphyry IR (semi-infrared wave composed of the magenta IR 768 nm wave and the colorless infrared wave 864 nm) and porphyry UV (semi-ultraviolet wave composed of the magenta UV 384 nm wave and the colorless ultraviolet wave 288 nm). The light waves thus composed create the light sensations due to the mechanism of additive synthesis. The method allows a new approach to interpret the composition of the bright waves, the phenomenon of decomposition of colours and additive synthesis that constitutes the principle of colour production in computers. The new elaborate models of colour physics also constitute the basis by interpretation of the mechanisms of vision of colours.","PeriodicalId":322214,"journal":{"name":"Computer Science Research Notes","volume":"57 4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123385467","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We present a novel method for image reconstruction from scattered data, based on multigrid relaxation of the Poisson equation and convolutional neural networks (CNN). We first formulate the image reconstruction problem as a Poisson equation with irregular boundary conditions, then propose a fast multigrid method for solving such an equation, and finally enhance the reconstructed image with a CNN to recover details. The method works incrementally, so additional points can be added, and the number of points does not affect the reconstruction speed. Furthermore, the multigrid and CNN techniques ensure that the output image resolution has only a minor impact on the reconstruction speed. We evaluated the method on the CompCars dataset, where it achieves up to 40% error reduction compared to a reconstruction-only approach and 9% compared to a CNN-only approach.
{"title":"Fast Incremental Image Reconstruction with CNN-enhanced Poisson Interpolation","authors":"Blaž Erzar, Žiga Lesar, Matija Marolt","doi":"10.24132/csrn.3301.24","DOIUrl":"https://doi.org/10.24132/csrn.3301.24","url":null,"abstract":"We present a novel image reconstruction method from scattered data based on multigrid relaxation of the Poisson equation and convolutional neural networks (CNN). We first formulate the image reconstruction problem as a Poisson equation with irregular boundary conditions, then propose a fast multigrid method for solving such an equation, and finally enhance the reconstructed image with a CNN to recover the details. The method works incrementally so that additional points can be added, and the amount of points does not affect the reconstruction speed. Furthermore, the multigrid and CNN techniques ensure that the output image resolution has only minor impact on the reconstruction speed. We evaluated the method on the CompCars dataset, where it achieves up to 40% error reduction compared to a reconstruction-only approach and 9% compared to a CNN-only approach.","PeriodicalId":322214,"journal":{"name":"Computer Science Research Notes","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115745308","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The paper presents a method for detecting dangerous situations near pedestrian crossings using an in-car camera system. The approach uses deep learning-based object detection to locate pedestrians and vehicles and analyzes their behavior to identify potential hazards. The system incorporates vehicle sensor data for enhanced accuracy. Evaluation results show high accuracy in detecting dangerous situations. The proposed system can potentially enhance pedestrian and driver safety in urban transportation.
{"title":"Detection of Dangerous Situations Near Pedestrian Crossings using In-Car Camera","authors":"M. Kubanek, Lukasz Karbowiak, J. Bobulski","doi":"10.24132/csrn.3301.41","DOIUrl":"https://doi.org/10.24132/csrn.3301.41","url":null,"abstract":"The paper presents a method for detecting dangerous situations near pedestrian crossings using an in-car camera system. The approach utilizes deep learning-based object detection to identify pedestrians and vehicles, analyzing their behavior to identify potential hazards. The system incorporates vehicle sensor data for enhanced accuracy. Evaluation results show high accuracy in detecting dangerous situations. The proposed system can potentially enhance pedestrian and driver safety in urban transportation.","PeriodicalId":322214,"journal":{"name":"Computer Science Research Notes","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129657460","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}