Generating Music from an Image
Gwenaelle Cunha Sergio, R. Mallipeddi, Jun-Su Kang, Minho Lee
Proceedings of the 3rd International Conference on Human-Agent Interaction, 2015-10-21
DOI: 10.1145/2814940.2814978 (https://doi.org/10.1145/2814940.2814978)
Images can convey emotion just like music. If so, then given an image, it may be possible to obtain a piece of music that produces a similar reaction in the listener/viewer. The challenge lies in how to do that. In this paper, we analyze the image using the HSV color space model and assume that each of the three components is related to basic music elements such as tone, pitch, rhythm, and loudness. The image is then scanned from left to right and top to bottom to generate a sequence of notes. Finally, the emotional Mean Opinion Score (MOS) is used to evaluate the performance of the proposed method. This work could be an important contribution to the field of HCI because it can improve the interaction between computers and humans who are visually and/or hearing impaired. In the current work, we consider only two emotions: positive and negative.
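The scanning idea in the abstract can be illustrated with a minimal Python sketch. The abstract does not say which HSV component drives which musical element, so the mapping used here (hue to pitch, saturation to note duration, value to loudness) is purely an assumption for illustration, as are the function names `pixel_to_note` and `image_to_notes`, the MIDI-style numbering, and the tiny hand-written image. It should not be read as the authors' method.

```python
import colorsys

# Assumed mapping (NOT taken from the paper):
#   hue        -> pitch, one of 12 semitones starting at MIDI note 60 (middle C)
#   saturation -> duration in beats (short for dull colors, longer for vivid ones)
#   value      -> loudness as a MIDI-style velocity in 0..127

def pixel_to_note(rgb):
    """Convert one 8-bit (R, G, B) pixel to a (pitch, duration, velocity) tuple."""
    r, g, b = (c / 255.0 for c in rgb)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    # Hue is circular, so a hue close to 1.0 wraps back to the first semitone.
    pitch = 60 + round(h * 12) % 12
    duration = 0.25 if s < 0.5 else 0.5
    velocity = round(v * 127)
    return pitch, duration, velocity

def image_to_notes(pixels):
    """Scan rows top to bottom, and each row left to right, one note per pixel."""
    return [pixel_to_note(rgb) for row in pixels for rgb in row]

# Hypothetical 2x2 image given as nested lists of RGB tuples.
image = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (0, 0, 0)],
]
notes = image_to_notes(image)
for note in notes:
    print(note)
```

A practical version would probably downsample the image or average over blocks of pixels first, since emitting one note per pixel yields very long sequences; that choice is also left open by the abstract.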