Title: Classification, Segmentation and Chronological Prediction of Cinematic Sound
Author: Pedro Silva
Venue: 2012 11th International Conference on Machine Learning and Applications
Published: 2012-12-12
DOI: 10.1109/ICMLA.2012.172 (https://doi.org/10.1109/ICMLA.2012.172)
Citations: 5
Abstract
This paper presents work on the classification, segmentation and chronological prediction of cinematic sound using support vector machines (SVM) trained with sequential minimal optimization (SMO). The classes are speech, music, environmental sound and silence, plus all pairwise combinations that exclude silence. A model combining simple adjacency rules with probabilistic output from logistic regression segments fixed-length parts into auditory scenes. Evaluated on a 44-film dataset against k-nearest neighbor, Naive Bayes and standard SVM classifiers, the SMO classifier gives superior results on all performance metrics. We then propose sample size optimizations for building similar datasets. Finally, we use meta-features built from classification as descriptors in a chronological model for predicting the period of production of a given soundtrack. A decision table classifier estimates the year of production of an unknown soundtrack with a mean absolute error of approximately five years.
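The abstract does not spell out the adjacency rules or the evaluation code, so the following is only a minimal sketch under stated assumptions, not the paper's method. It assumes each fixed-length window already carries a predicted class label and a confidence value (for example, a probability from a logistic model). The rule used here (relabel a low-confidence window when both neighbours agree), the threshold of 0.6, the one-second window length and all function names are hypothetical. The last helper shows how a mean absolute error in years is computed.

```python
from itertools import groupby


def smooth_labels(labels, confidences, threshold=0.6):
    """Hypothetical adjacency rule: relabel a low-confidence window
    when its two neighbours carry the same label."""
    smoothed = list(labels)
    for i in range(1, len(labels) - 1):
        neighbours_agree = labels[i - 1] == labels[i + 1]
        if confidences[i] < threshold and neighbours_agree:
            smoothed[i] = labels[i - 1]
    return smoothed


def merge_into_scenes(labels, window_seconds=1.0):
    """Collapse runs of identical labels into (start, end, label) scenes."""
    scenes = []
    start = 0
    for label, run in groupby(labels):
        length = len(list(run))
        end = start + length
        scenes.append((start * window_seconds, end * window_seconds, label))
        start = end
    return scenes


def mean_absolute_error(true_years, predicted_years):
    """Average absolute difference between true and predicted years."""
    errors = [abs(t - p) for t, p in zip(true_years, predicted_years)]
    return sum(errors) / len(errors)


# Illustrative input only; these values are not taken from the paper.
labels = ["speech", "speech", "music", "speech", "music", "music"]
confidences = [0.9, 0.8, 0.4, 0.9, 0.95, 0.9]
scenes = merge_into_scenes(smooth_labels(labels, confidences))
```

With the illustrative input above, the single low-confidence "music" window between two "speech" windows is absorbed, leaving one speech scene followed by one music scene.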