Abstract: This paper presents a music genre classification system that uses Natural Language Processing (NLP) and machine learning to predict the genre of a song from its lyrics alone. Unlike traditional audio-based classification methods, the approach works only on the textual features of lyrics, which enables faster and more resource-efficient analysis. The system begins with data acquisition from publicly available song lyrics datasets, followed by text preprocessing (tokenization, stopword removal, and lemmatization) to standardize the input. Features are extracted with the Term Frequency–Inverse Document Frequency (TF-IDF) technique, which represents each lyric numerically and weights every word by how often it occurs in that lyric relative to how common it is across the whole corpus.
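The preprocessing and TF-IDF steps described above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the authors' implementation: the stopword list, the function names `preprocess` and `tfidf`, and the unsmoothed weighting tf × log(N / df) are assumptions, and lemmatization is omitted because it needs an external lexicon.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; the paper does not name its list).
STOPWORDS = {"a", "an", "and", "the", "in", "of", "on", "to", "is", "my", "i", "you"}


def preprocess(lyrics: str) -> list[str]:
    """Lowercase the text, tokenize on letters/apostrophes, and drop stopwords."""
    tokens = re.findall(r"[a-z']+", lyrics.lower())
    return [t for t in tokens if t not in STOPWORDS]


def tfidf(docs: list[list[str]]) -> list[dict[str, float]]:
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    n_docs = len(docs)
    # Document frequency: the number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc) or 1  # guard against empty documents
        vectors.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


if __name__ == "__main__":
    # Hypothetical lyric fragments, for illustration only.
    corpus = [
        "Riding down the dirt road in my truck",
        "Money on my mind, money in the bank",
    ]
    for vector in tfidf([preprocess(text) for text in corpus]):
        print(vector)
```

With this weighting, a word that appears in every lyric receives a weight of zero, while a word concentrated in a few lyrics receives a higher weight, which is the property that makes TF-IDF features useful for separating genres.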
The processed data is then used to train a supervised machine learning model, Logistic Regression, which learns the distinctive linguistic and stylistic patterns associated with genres such as Pop, Rock, Hip-Hop, and Country. Evaluated with accuracy, precision, recall, and F1-score, the model achieves an overall accuracy of approximately 85%. A web interface built with Streamlit accepts lyric input and returns a genre prediction in real time.
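For readers unfamiliar with the evaluation metrics named above, the following sketch computes accuracy and per-genre precision, recall, and F1-score from lists of true and predicted labels. The function name `evaluate` and the toy labels are hypothetical, and the abstract does not state how the authors averaged the per-class scores, so none is assumed here.

```python
def evaluate(y_true: list[str], y_pred: list[str]):
    """Return (accuracy, {label: (precision, recall, f1)}) for two label lists."""
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)

    per_class = {}
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == label and p == label for t, p in pairs)  # true positives
        fp = sum(t != label and p == label for t, p in pairs)  # false positives
        fn = sum(t == label and p != label for t, p in pairs)  # false negatives
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        per_class[label] = (precision, recall, f1)
    return accuracy, per_class


if __name__ == "__main__":
    # Toy labels for illustration only; these are not results from the paper.
    truth = ["Pop", "Pop", "Rock", "Rock"]
    predicted = ["Pop", "Rock", "Rock", "Rock"]
    accuracy, report = evaluate(truth, predicted)
    print(f"accuracy={accuracy:.2f}")
    for genre, (p, r, f1) in report.items():
        print(f"{genre}: precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

Reporting per-genre scores alongside overall accuracy helps show whether a headline figure hides weak performance on a less frequent genre.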
The proposed system demonstrates that lyrics carry significant semantic and emotional information that can be used to classify music genres effectively. This work contributes to the growing field of computational music analysis and can be extended to improve music recommendation engines, automated playlist generation, and sentiment-driven analysis of song lyrics.
DOI: 10.17148/IJIREEICE.2025.131037
[1] Pranav H, Sarvesh S, Hari Srinivas, Dr. Golda Dilip, "Music Genre Classification System Using Natural Language Processing," International Journal of Innovative Research in Electrical, Electronics, Instrumentation and Control Engineering (IJIREEICE), DOI: 10.17148/IJIREEICE.2025.131037