Ever wondered what patterns, if any, are waiting to be discovered in black metal lyrics? Well, I did, and I started by simply finding out which words occur most often in this data set. After some cleaning and pre-processing, I ended up with the lyrics of 76039 songs by 24086 bands from 116 different countries. Stop words (roughly, very common and very uninformative words like “the” or “or”) were removed in this pre-processing stage. In the end, a total of 258610 distinct words occur, with the number of occurrences summing up to 5304046.
The following bar plot shows the top 15 most used words across the whole lyrics data set.
The most common term, “death” (not at all unexpected), accounts for 0.7% of the total number of word occurrences. Other more or less expected words, such as “blood” or “darkness”, also make an appearance, but it is somewhat intriguing to find “time” in the top 5. So, what does this all mean? Well, not much (yet): simply counting the occurrences of individual words is not a good indicator of “meaning”, because it discards the context in which the words appear as well as the relationships between them. It does, however, provide very helpful hints.
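For readers who want to try this kind of counting themselves, here is a minimal sketch in Python of the steps described above: tokenize, drop stop words, count, and report each top word's share of all occurrences. It is not the code used for this analysis. The stop-word list, the function names (`tokenize`, `count_words`, `top_words`) and the three sample lines are all made up for illustration, and the simple regular expression assumes plain ASCII English text.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (an assumption for this
# sketch); a real list would contain a few hundred entries.
STOP_WORDS = {"the", "or", "a", "an", "and", "of", "in", "to", "is"}


def tokenize(text):
    """Lowercase the text and return runs of ASCII letters/apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def count_words(lyrics):
    """Count every non-stop-word token across an iterable of lyric strings."""
    counts = Counter()
    for song in lyrics:
        counts.update(t for t in tokenize(song) if t not in STOP_WORDS)
    return counts


def top_words(counts, k=15):
    """Return (word, occurrences, share of all occurrences) for the top k words."""
    total = sum(counts.values())
    return [(word, n, n / total) for word, n in counts.most_common(k)]


# Made-up sample lines, standing in for the real lyrics data set.
lyrics = [
    "Death in the darkness",
    "Blood and death",
    "The time of death",
]

counts = count_words(lyrics)
print(len(counts), "distinct words,", sum(counts.values()), "occurrences")
for word, n, share in top_words(counts, k=3):
    print(f"{word}: {n} ({share:.1%})")
```

The share is simply a word's count divided by the sum of all counts, which is the same ratio as the 0.7% figure quoted for the most common term.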