Investigation on Encoder-Decoder Networks for Segmentation of Very Degraded X-Ray CT Tomograms
Idris Dulau, M. Beurton-Aimar, Yeykuang Hwu, B. Recur
DOI: 10.24132/csrn.3301.3
Field of View Nano-CT X-ray synchrotron imaging is used to acquire brain neuronal features from Golgi-stained bio-samples. It theoretically requires a large number of acquired radiographs to compensate for reconstruction noise, which is reinforced by the sparsity of brain features. However, reducing the number of radiographs is essential in routine applications, and doing so results in degraded tomograms. In such cases, traditional segmentation methods are no longer able to distinguish neuronal structures from the surrounding noise. We investigate several existing deep-learning networks and define new ones to segment brain features from very degraded tomograms. We demonstrate the superiority of the proposed networks over existing ones.
{"title":"Investigation on Encoder-Decoder Networks for Segmentation of Very Degraded X-Ray CT Tomograms","authors":"Idris Dulau, M. Beurton-Aimar, Yeykuang Hwu, B. Recur","doi":"10.24132/csrn.3301.3","DOIUrl":"https://doi.org/10.24132/csrn.3301.3","url":null,"abstract":"Field of View Nano-CT X-Ray synchrotron imaging is used for acquiring brain neuronal features from Golgi-stained bio-samples. It theoretically requires a large number of acquired radiographs for compensating reconstruction noise reinforced by the brain features sparsity. However reducing the number of radiographs is essential in routine applications but it results to degraded tomograms. In such a case, traditional segmentation methods are no longer able to distinguish neuronal structures from surrounding noise. We investigate several existing deep-learning networks and we define new ones to segment brain features from very degraded tomograms. We demonstrate the superiority of the proposed networks compared to existing ones.","PeriodicalId":322214,"journal":{"name":"Computer Science Research Notes","volume":"58 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121658205","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Massively Parallel CPU-based Virtual View Synthesis with Atomic Z-test
J. Stankowski, A. Dziembowski
DOI: 10.24132/csrn.3301.32

In this paper we deal with the problem of real-time virtual view synthesis, which is crucial in practical immersive video systems. The majority of existing real-time view synthesizers described in the literature require dedicated hardware. In the proposed approach, the view synthesis algorithm is implemented on a CPU, increasing its usability for users equipped with consumer devices such as personal computers or laptops. The novelty of the proposed algorithm lies in the atomic z-test function, which allows for parallelization of the depth reprojection step, which was not possible in previous works. The proposal was evaluated on a test set containing miscellaneous perspective and omnidirectional sequences, in terms of both quality and computational time. The results were compared to the state-of-the-art view synthesis algorithm, RVS.
{"title":"Massively Parallel CPU-based Virtual View Synthesis with Atomic Z-test","authors":"J. Stankowski, A. Dziembowski","doi":"10.24132/csrn.3301.32","DOIUrl":"https://doi.org/10.24132/csrn.3301.32","url":null,"abstract":"In this paper we deal with the problem of real-time virtual view synthesis, which is crucial in practical immersive video systems. The majority of existing real-time view synthesizers described in literature require using dedicated hardware. In the proposed approach, the view synthesis algorithm is implemented on a CPU increasing its usability for users equipped with consumer devices such as personal computers or laptops. The novelty of the proposed algorithm is based on the atomic z-test function, which allows for parallelization of the depth reprojection step, what was not possible in previous works. The proposal was evaluated on a test set containing miscellaneous perspective and omnidirectional sequences, both in terms of quality and computational time. The results were compared to the state-of-the-art view synthesis algorithm – RVS.","PeriodicalId":322214,"journal":{"name":"Computer Science Research Notes","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126408303","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
On Importance of Scene Structure for Hardware-Accelerated Ray Tracing
Martin Káčerik, Jiří Bittner
DOI: 10.24132/csrn.3301.60

Ray tracing is typically accelerated by organizing the scene geometry into an acceleration data structure. Hardware-accelerated ray tracing, available through modern graphics APIs, exposes an interface to the acceleration structure (AS) builder, which constructs the structure from the input scene geometry. However, this process is opaque, with limited knowledge of and control over the internal algorithm. Additional control is available through the layout of the AS builder's input data, i.e., the scene geometry structured in a user-defined way. In this work, we evaluate the impact of different scene structurings on the run-time performance of ray-triangle intersections in the context of hardware-accelerated ray tracing. We discuss the possible causes of significantly different outcomes (up to 1.4 times) for the same scene and identify the potential to reduce the cost through automatic optimization of the input structure.
{"title":"On Importance of Scene Structure for Hardware-Accelerated Ray Tracing","authors":"Martin Káčerik, Jiří Bittner","doi":"10.24132/csrn.3301.60","DOIUrl":"https://doi.org/10.24132/csrn.3301.60","url":null,"abstract":"Ray tracing is typically accelerated by organizing the scene geometry into an acceleration data structure. Hardware-accelerated ray tracing, available through modern graphics APIs, exposes an interface to the acceleration structure (AS) builder that constructs it given the input scene geometry. However, this process is opaque, with limited knowledge and control over the internal algorithm. Additional control is available through the layout of the AS builder input data, the geometry of the scene structured in a user-defined way. In this work, we evaluate the impact of a different scene structuring on the run time performance of the ray-triangle intersections in the context of hardware-accelerated ray tracing. We discuss the possible causes of significantly different outcomes (up to 1.4 times) for the same scene and identify a potential to reduce the cost by automatic input structure optimization.","PeriodicalId":322214,"journal":{"name":"Computer Science Research Notes","volume":"52 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131503131","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Blocky Volume Package: a Web-friendly Volume Storage and Compression Solution
Žiga Lesar, Ciril Bohak, M. Marolt
DOI: 10.24132/csrn.3301.25

The Blocky Volume Package (BVP) format is a distributed, platform-independent and API-independent format for storing static and temporal volumetric data. It is designed for efficient transfer over a network, supporting sparse volumes, multiple resolutions, random access, and streaming, as well as providing a strict framework for supporting a wide palette of encoding formats. The BVP format achieves this by dividing a volume or a volume sequence into blocks that can be compressed and reused. The metadata for the blocks are stored in separate files so that a client has all the information required for loading and decoding the blocks before the actual transmission, decoding and rendering take place. This design allows for random access and parallel loading, and has been specifically tailored for efficient use on the web platform by adhering to the current living standards. In the paper, we compare the BVP format with some of the most commonly implemented volume storage formats and show that the BVP format supports most major features of these formats while remaining easy to implement and extend.
{"title":"Blocky Volume Package: a Web-friendly Volume Storage and Compression Solution","authors":"Žiga Lesar, Ciril Bohak, M. Marolt","doi":"10.24132/csrn.3301.25","DOIUrl":"https://doi.org/10.24132/csrn.3301.25","url":null,"abstract":"The Blocky Volume Package (BVP) format is a distributed, platform-independent and API-independent format for storing static and temporal volumetric data. It is designed for efficient transfer over a network by supporting sparse volumes, multiple resolutions, random access, and streaming, as well as providing a strict framework for supporting a wide palette of encoding formats. The BVP format achieves this by dividing a volume or a volume sequence into blocks that can be compressed and reused. The metadata for the blocks are stored in separate files so that a client has all the information required for loading and decoding the blocks before the actual transmission, decoding and rendering take place. This design allows for random access and parallel loading and has been specifically designed for efficient use on the web platform by adhering to the current living standards. In the paper, we compare the BVP format with some of the most often implemented volume storage formats, and show that the BVP format supports most major features of these formats while at the same time being easily implementable and extensible.","PeriodicalId":322214,"journal":{"name":"Computer Science Research Notes","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126586141","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Temporal Segmentation of Actions in Fencing Footwork Training
F. Malawski, Marek Krupa
DOI: 10.24132/csrn.3301.28

Automatic analysis of actions in sports training can provide useful feedback for athletes. Fencing is one of the sports disciplines in which correct technique for performing actions is very important. For any practical application, temporal segmentation of movement in continuous training is crucial. In this work, we consider detecting and classifying actions in a sequence of fencing footwork exercises. We apply pose estimation to RGB videos and then perform per-frame motion classification using both classical machine learning and deep learning methods. From sequences of frames with the same class, we find data segments containing specific actions. For evaluation, we provide extended manual labels for a fencing footwork dataset previously used in other works. Results indicate that the proposed methods are effective at detecting four footwork actions, obtaining an F1 score of 0.98 for the recognition of action segments and 0.92 for per-frame classification. In the evaluation of our approach, we also provide a comparison with other data modalities, including depth-based pose estimation and inertial signals. Finally, we include an example of qualitative analysis of the performance of detected actions to show how this approach can be used for training support.
{"title":"Temporal Segmentation of Actions in Fencing Footwork Training","authors":"F. Malawski, Marek Krupa","doi":"10.24132/csrn.3301.28","DOIUrl":"https://doi.org/10.24132/csrn.3301.28","url":null,"abstract":"Automatic analysis of actions in sports training can provide useful feedback for athletes. Fencing is one of the sports disciplines in which the correct technique for performing actions is very important. For any practical application, temporal segmentation of movement in continuous training is crucial. In this work, we consider detecting and classifying actions in a sequence of fencing footwork exercises. We apply pose estimation to RGB videos and then we perform per-frame motion classification, using both classical machine learning and deep learning methods. Using sequences of frames with the same class we find data segments with specific actions. For evaluation, we provide extended manual labels for a fencing footwork dataset previously used in other works. Results indicate that the proposed methods are effective at detecting four footwork actions, obtaining 0.98 F1 score for recognition of action segments and 0.92 F1 score for per-frame classification. In the evaluation of our approach, we provide also a comparison with other data modalities, including depth-based pose estimation and inertial signals. Finally, we include an example of qualitative analysis of the performance of detected actions, to show how this approach can be used for training support.","PeriodicalId":322214,"journal":{"name":"Computer Science Research Notes","volume":"61 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115933450","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Generating Realistic River Patterns with Space Colonization
H. Feng, B. Wünsche, Alex Shaw
DOI: 10.24132/csrn.3301.26

River generation is an integral part of realistic terrain generation, since rivers shape terrains and changes in terrain, e.g., due to tectonic movements, can change the paths of rivers. Fast existing terrain generation methods often result in unrealistic river patterns, whereas physically realistic techniques, e.g., those building on erosion models, are usually slow. In this paper, we investigate whether the Space Colonization Algorithm can be modified to generate realistic river patterns. We present several extensions of the Space Colonization Algorithm and show, through a user study with n = 55 participants, that some variants of the algorithm are capable of generating river patterns that are indistinguishable from real ones. Although our technique cannot generate all types of natural river patterns, our results suggest that it can prove useful for developing plausible 2D maps and can potentially form the basis for new terrain generation techniques.
{"title":"Generating Realistic River Patterns with Space Colonization","authors":"H. Feng, B. Wünsche, Alex Shaw","doi":"10.24132/csrn.3301.26","DOIUrl":"https://doi.org/10.24132/csrn.3301.26","url":null,"abstract":"River generation is an integral part of realistic terrain generation, since rivers shape terrains and changes in terrain, e.g., due to tectonic movements can change the path of rivers. Fast existing terrain generation methods often result in non-realistic river patterns, whereas physically-realistic techniques, e.g., building on erosion models, are usually slow. In this paper we investigate whether the Space Colonization Algorithm can be modified to generate realistic river patterns. We present several extensions of the Space Colonization Algorithm and show with a user study with $n=55$ participants that some variants of the algorithm are capable of generating river patterns that are indistinguishable from real river patterns. Although our technique can not generate all types of natural river patterns, our results suggest that it can prove useful for developing plausible 2D maps and potentially can form the basis for new terrain generation techniques.","PeriodicalId":322214,"journal":{"name":"Computer Science Research Notes","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121963096","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The use of Artificial Intelligence for Automatic Waste Segregation in the Garbage Recycling Process
J. Bobulski, M. Kubanek
DOI: 10.24132/csrn.3301.40

The problem of recycling secondary raw materials remains unresolved despite many years of work on the issue. Among the many obstacles is the difficulty of sorting individual waste fractions. Modern computer vision and artificial intelligence techniques can be used to facilitate this task and help solve the problem. In our work, we propose constructing an intelligent garbage bin containing a camera and a microcomputer, along with software that uses these techniques to sort waste. The role of the software is to recognize the type of waste and assign it to one of five main categories: paper, plastic, metal, glass and cardboard. The proposed method uses image recognition techniques based on a convolutional neural network. The results confirm that using artificial intelligence methods significantly helps in sorting waste. The proposed solution can be used in homes and in public places such as parks, cinemas or playgrounds.
{"title":"The use of Artificial Intelligence for Automatic Waste Segregation in the Garbage Recycling Process","authors":"J. Bobulski, M. Kubanek","doi":"10.24132/csrn.3301.40","DOIUrl":"https://doi.org/10.24132/csrn.3301.40","url":null,"abstract":"The problem of recycling secondary raw materials remains unresolved, despite many years of work on this issue. Among the many obstacles that arise is also the difficulty of sorting individual waste fractions. To facilitate this task and help solve this problem, modern computer vision and artificial intelligence techniques can be used. In our work, we propose constructing an intelligent garbage bin containing a camera and a microcomputer along with software that uses these techniques to sort waste. The role of the software is to recognize the type of waste and assign it to one of five main categories: paper, plastic, metal, glass and cardboard. The proposed method uses image recognition techniques with a convolutional neural network. The results confirm that using artificial intelligence methods significantly helps in sorting waste. The proposed solution can be used in homes and public places such as parks, cinemas or playgrounds.","PeriodicalId":322214,"journal":{"name":"Computer Science Research Notes","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131915335","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Illustrating Geometric Algebra and Differential Geometry in 5D Color Space
W. Benger
DOI: 10.24132/csrn.3301.1

Geometric Algebra (GA) is popular for its immediate geometric interpretations of algebraic objects and operations. It is based on Clifford algebra on vector spaces and extends the linear algebra of vectors by operations such as an invertible product, i.e., division by vectors. This formalism allows for a complete algebra on vectors, just as for scalar or complex numbers, and it is particularly suitable for rotations in arbitrary dimensions. In Euclidean 3D space, quaternions are known to be numerically superior to rotation matrices and are already widely used in computer graphics. However, their meaning beyond the numerical formalism often remains mysterious. GA allows for an intuitive interpretation in terms of planes of rotation and extends this concept to arbitrary dimensions by embedding vectors into a higher-dimensional, but still intuitively graspable, space of multi-vectors. However, our intuition of more than three spatial dimensions is deficient. The space of colors forms a vector space as well, though one of non-spatial nature, spanned by the primary colors red, green, blue. The GA formalism can be applied here as well, amalgamating surprisingly with the notion of vectors and co-vectors known from differential geometry: tangential vectors on a manifold correspond to the additive colors red/green/blue, whereas co-vectors from the co-tangential space correspond to the subtractive primary colors magenta, yellow, cyan. GA in turn considers vectors, bi-vectors and anti-vectors as part of its generalized multi-vector zoo of algebraic objects. In 3D space, vectors, anti-vectors, bi-vectors and co-vectors are all three-dimensional objects that can be identified with each other, so their distinction is concealed. Confusion arises from notions such as “normal vectors” vs. “axial vectors”. Higher-dimensional spaces exhibit the differences more clearly. Using colors instead of spatial dimensions, we can expand our intuition by considering “transparency” as an independent, fourth property of a color vector. We can thereby explore 4D GA as an alternative to spacetime in special/general relativity. However, even in 4D, possibly confusing ambiguities remain between vectors, co-vectors, bi-vectors and bi-co-vectors: bi-vectors and bi-co-vectors, both six-dimensional objects, are visually equivalent. They become unequivocal only in five or higher dimensions. Envisioning five-dimensional geometry is even more challenging to the human mind, but in color space we can add another property, “texture”, to constitute a five-dimensional vector space. The properties of a bi-vector and a bi-co-vector become evident there: we can still study all possible combinations of colors/transparency/texture visually. This higher-dimensional yet intuitive approach demonstrates the need to distinguish among different kinds of vectors before identifying them in special situations, which also clarifies the meanings of algebraic objects in 3D Euclidean space and allows for better formulations of …
{"title":"Illustrating Geometric Algebra and Differential Geometry in 5D Color Space","authors":"W. Benger","doi":"10.24132/csrn.3301.1","DOIUrl":"https://doi.org/10.24132/csrn.3301.1","url":null,"abstract":"Geometric Algebra (GA) is popular for its immediate geometric interpretations of algebraic objects and operations. It is based on Clifford Algebra on vector spaces and extends linear algebra of vectors by operations such as an invertible product, i.e. divisions by vectors. This formalism allows for a complete algebra on vectors same as for scalar or complex numbers. It is particularly suitable for rotations in arbitrary dimensions. In Euclidean 3D space quaternions are known to be numerically superior to rotation matrices and already widely used in computer graphics. However, their meaning beyond its numerical formalism often remains mysterious. GA allows for an intuitive interpretation in terms of planes of rotations and extends this concept to arbitrary dimensions by embedding vectors into a higher dimensional, but still intuitively graspable space of multi-vectors. However, out intuition of more than three spatial dimensions is deficient. The space of colors forms a vector space as well, though one of non-spatial nature, but spun by the primary colors red, green, blue. The GA formalism can be applied here as well, amalgamating surprisingly with the notion of vectors and co-vectors known from differential geometry: tangential vectors on a manifold correspond to additive colors red/green/blue, whereas co-vectors from the co-tangential space correspond to subtractive primary colors magenta, yellow, cyan. GA in turn considers vectors, bi-vectors and anti-vectors as part of its generalized multi-vector zoo of algebraic objects. In 3D space vectors, anti-vectors, bi-vectors and covectors are all three-dimensional objects that can be identified with each other, so their distinction is concealed. Confusions arise from notions such as “normal vectors” vs. “axial vectors”. Higher dimensional spaces exhibit the differences more clearly. Using colors instead of spatial dimensions we can expand our intuition by considering \"transparency\" as an independent, four-dimensional property of a color vector. We can thereby explore 4D GA alternatively to spacetime in special/general relativity. However, even in 4D possibly confusing ambiguities remain between vectors, co-vectors, bi-vectors and bi-co-vectors: bi-vectors and bi-co-vectors - both six-dimensional objects - are visually equivalent. They become unequivocal only in five or higher dimensions. Envisioning five-dimensional geometry is even more challenging to the human mind, but in color space we can add another property, \"texture\" to constitute a five-dimensional vector space. The properties of a bi-vector and a bi-co-vector becomes evident there: We can still study all possible combinations of colors/transparency/texture visually. 
This higher-dimensional yet intuitive approach demonstrates the need to distinguish among different kinds of vectors before identifying them in special situations, which also clarifies the meanings of algebraic objects in 3D Euclidean space and allows for better formulations of ","PeriodicalId":322214,"journal":{"name":"Computer Science Research Notes","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131707146","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
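A small dimension-counting check makes the abstract's grade argument concrete: a k-vector in n dimensions has C(n, k) components, and a bi-co-vector dualizes to an (n-2)-vector. The script below (our illustration, not the author's code) shows why bi-vectors and bi-co-vectors coincide in 4D, where both land in grade 2 with six components, but separate into distinct grades in 5D.

```python
# Grade arithmetic behind the bivector / bi-co-vector ambiguity.
from math import comb

for n in (3, 4, 5):
    bivec = comb(n, 2)          # components of a bivector (grade 2)
    dual_grade = n - 2          # grade of the dualized bi-co-vector
    print(f"n={n}: bivector has {bivec} components; "
          f"bi-co-vector dualizes to grade {dual_grade} "
          f"({comb(n, dual_grade)} components)")
# n=4: both are grade 2 with 6 components -> indistinguishable.
# n=5: grade 2 vs. grade 3 -> distinct objects, though both have 10 components.
```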
Modeling and Rendering with eXpressive B-Spline Curves
H. Seah, Budianto Tandianus, Yiliang Sui
DOI: 10.24132/csrn.3301.10

The eXpressive B-Spline Curve (XBSC) is a resolution-independent and computationally efficient technique for vector-based stroke modeling and rendering, with flexibility in defining and adjusting the shape and other parameters of a stroke. It generalizes the existing Disk B-Spline Curve (DBSC) geometric representation, which is itself a generalization of the Disk Bézier curve. XBSC allows flexible shape and color manipulation and the rendering of strokes with asymmetrical shape control and rich color management. These properties make XBSC suitable for modeling freeform stroke shapes and animation, specifically squash and stretch, a common technique for conveying elasticity and flexibility through shape changes. During the squash-and-stretch animation computation, we constrain the shape of the XBSC stroke to conserve its area. To achieve this, we apply a simulated annealing algorithm that iteratively adjusts the XBSC while maintaining its area. We show several drawing, rendering and deformation examples to demonstrate the robustness of XBSC.
{"title":"Modeling and Rendering with eXpressive B-Spline Curves","authors":"H. Seah, Budianto Tandianus, Yiliang Sui","doi":"10.24132/csrn.3301.10","DOIUrl":"https://doi.org/10.24132/csrn.3301.10","url":null,"abstract":"eXpressive B-Spline Curve (XBSC) is a resolution-independent and computationally efficient technique for vector-based stroke modeling and rendering with the flexibility in defining and adjusting the shape and other parameters of the stroke. It generalizes the existing Disk B-Spline Curve (DBSC) geometric representation, which itself is a generalization of the Disk Bézier curve. XBSC allows flexible shape and color manipulation and rendering of strokes with asymmetrical shape control and rich color management. These properties make XBSC suitable for modeling freeform stroke shapes and animation, specifically in squash and stretch, a common technique to bestow elasticity and flexibility in shape changes. During the squash and stretch animation computation, we constrain the shape of the XBSC stroke to conserve its area. To achieve this, we apply the simulated annealing algorithm to iteratively adjust the XBSC while maintaining its area. We show several drawings, rendering and deformation examples to demonstrate the robustness of XBSC.","PeriodicalId":322214,"journal":{"name":"Computer Science Research Notes","volume":"104 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122101553","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
On Unguided Automatic Colorization of Monochrome Images
A. Sluzek
DOI: 10.24132/csrn.3301.38

Image colorization is a challenging problem due to the infinitely many RGB solutions for a given grayscale picture. Therefore, human assistance, either direct or indirect, is usually essential for achieving visually plausible colorization. This paper aims to perform colorization using only a grayscale image as the data source, without any reliance on metadata or human hints. The method assumes an (arbitrary) rgb2gray model and utilizes a few simple heuristics. Despite its probabilistic elements, the results are visually acceptable and repeatable, making this approach feasible (e.g., for aesthetic purposes) in domains where only monochrome visual representations exist. The paper explains the method, presents exemplary results, and discusses a few supplementary issues.
{"title":"On Unguided Automatic Colorization of Monochrome Images","authors":"A. Sluzek","doi":"10.24132/csrn.3301.38","DOIUrl":"https://doi.org/10.24132/csrn.3301.38","url":null,"abstract":"Image colorization is a challenging problem due to the infinite RGB solutions for a grayscale picture. Therefore, human assistance, either directly or indirectly, is essential for achieving visually plausible colorization. This paper aims to perform colorization using only a grayscale image as the data source, without any reliance on metadata or human hints. The method assumes an (arbitrary) rgb2gray model and utilizes a few simple heuristics. Despite probabilistic elements, the results are visually acceptable and repeatable, making this approach feasible (e.g. for aesthetic purposes) in domains where only monochrome visual representations exist. The paper explains the method, presents exemplary results, and discusses a few supplementary issues.","PeriodicalId":322214,"journal":{"name":"Computer Science Research Notes","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128447984","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}