Background: Music is an integral part of our lives and is often played in public places such as restaurants. In a bar scenario, people exposed to music with alcohol-related lyrics consumed significantly more alcohol than those exposed to music with fewer alcohol-related lyrics. Existing methods for quantifying alcohol exposure in song lyrics rely on manual annotation, which is burdensome and time-intensive. In this paper, we aim to build a deep learning algorithm (LYDIA) that can automatically detect and identify alcohol exposure and its context in song lyrics.
Methods: We identified 673 potentially alcohol-related words, including brand names, urban slang, and beverage names. We collected the lyrics of all of Billboard's top-100 songs from 1959 to 2020 (N = 6110). We developed an annotation tool to label both the alcohol relation of each word (alcohol, non-alcohol, or unsure) and the context of the word in the song lyrics (positive, negative, or neutral).
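The candidate-word step described above can be illustrated with a minimal sketch. It assumes a simple case-insensitive, whole-word lexicon match that returns each hit with a short context window, ready for labeling. The four-term lexicon, the `Candidate` record, and the `find_candidates` helper are hypothetical names introduced for illustration; they are not the study's 673-word list or its annotation tool.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical mini-lexicon for illustration only; the study's
# 673-term list is not reproduced here.
CANDIDATE_TERMS = ["whiskey", "beer", "shots", "gin and juice"]


@dataclass
class Candidate:
    term: str        # matched lexicon entry, lowercased
    start: int       # character offset of the match in the lyrics
    end: int
    snippet: str     # surrounding text shown to the annotator
    relation: Optional[str] = None  # "alcohol", "non-alcohol", or "unsure"
    context: Optional[str] = None   # "positive", "negative", or "neutral"


def build_pattern(terms: List[str]) -> "re.Pattern[str]":
    # Longest terms first so multi-word entries win over shorter ones.
    ordered = sorted(terms, key=len, reverse=True)
    body = "|".join(re.escape(t) for t in ordered)
    return re.compile(r"\b(?:" + body + r")\b", re.IGNORECASE)


def find_candidates(lyrics: str, terms: List[str], window: int = 40) -> List[Candidate]:
    """Return every lexicon hit with `window` characters of context on each side."""
    pattern = build_pattern(terms)
    hits = []
    for m in pattern.finditer(lyrics):
        lo = max(0, m.start() - window)
        hi = min(len(lyrics), m.end() + window)
        hits.append(Candidate(m.group(0).lower(), m.start(), m.end(), lyrics[lo:hi]))
    return hits


lyrics = "Sippin' on gin and juice while the shots rang out"
found = find_candidates(lyrics, CANDIDATE_TERMS)
```

In this toy line, "shots" is matched even though it probably refers to gunfire. Ambiguous hits of this kind are the reason each candidate carries a separate relation label (alcohol, non-alcohol, or unsure) in addition to a context label.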
Results: LYDIA achieved an accuracy of 86.6% in identifying the alcohol relation of a word and 72.9% in identifying its context. LYDIA distinguished words with a positive relation to alcohol from those with a negative relation with an accuracy of 97.24%, and positive from negative contexts with an accuracy of 98.37%.
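One way a three-class accuracy and a two-class accuracy can differ on the same predictions is sketched below. The abstract does not state how the two-class figures were computed, so restricting the evaluation to items whose true label is one of the two outer classes is an assumption made here for illustration. The labels and predictions are toy data, not study results.

```python
from typing import Iterable, Optional, Set


def accuracy(y_true: Iterable[str], y_pred: Iterable[str],
             keep: Optional[Set[str]] = None) -> float:
    """Fraction of correct predictions.

    If `keep` is given, only items whose TRUE label is in `keep` are scored
    (an assumed protocol; a prediction outside `keep` counts as an error).
    """
    pairs = [(t, p) for t, p in zip(y_true, y_pred) if keep is None or t in keep]
    if not pairs:
        raise ValueError("no items to score")
    return sum(t == p for t, p in pairs) / len(pairs)


# Toy labels for illustration only.
y_true = ["alcohol", "alcohol", "non-alcohol", "unsure", "non-alcohol", "unsure"]
y_pred = ["alcohol", "unsure", "non-alcohol", "alcohol", "non-alcohol", "unsure"]

three_class = accuracy(y_true, y_pred)
two_class = accuracy(y_true, y_pred, keep={"alcohol", "non-alcohol"})
```

On this toy data the restricted score is higher than the three-class score, because the ambiguous middle class is where half of the errors occur.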
Conclusion: LYDIA can automatically identify alcohol exposure and its context in song lyrics, which will allow for the swift analysis of future lyrics and can be used to help raise awareness about the amount of alcohol in music.
Highlights:
- Developed a deep learning algorithm (LYDIA) to identify alcohol words in songs.
- LYDIA achieved an accuracy of 86.6% in identifying alcohol-relation of the words.
- LYDIA's accuracy in identifying positive, negative, or neutral context was 72.9%.
- LYDIA can automatically provide evidence of alcohol in millions of songs.
- This can raise awareness of harms of listening to songs with alcohol words.