Blog Post 7
Space & Cepstrums
This week came with a couple of setbacks and some pretty cool breakthroughs.
At the end of last week I had a pretty robust model that could accurately classify a sound file according to which of two rooms it had been recorded in.
After this I decided to up the ante by making a new model that would have not two but eight distinct rooms to classify between.
I got the model working on a few training samples, but a problem arose when I tried to train it with more data: the bigger database was too big to be loaded into RAM.
I fixed this by decreasing the number of files in the training database.
After a few test runs, however, it turned out that the model was far less accurate than the previous binary classifier and took a lot longer to train.
After reading a few papers on music information retrieval techniques, I found that many projects use cepstral analysis to help measure audio similarity.
In DSP, a cepstrum is ‘the result of taking the inverse Fourier transform of the logarithm of the estimated spectrum of a signal’, which sounds (and is) pretty complicated, but it can kind of be thought of as a way to look at the rate of change in the different spectrum bands.
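To make that definition a bit more concrete, here is a minimal sketch of a real cepstrum in Python with NumPy. It is an illustration rather than the exact code from my project: the function name `real_cepstrum`, the small `eps` offset and the synthetic noise-plus-echo test signal are all assumptions made for the example.

```python
import numpy as np

def real_cepstrum(signal, eps=1e-12):
    """Inverse FFT of the log magnitude spectrum of a 1-D signal."""
    spectrum = np.fft.fft(signal)
    # eps is an assumed safeguard so that log(0) never occurs
    log_magnitude = np.log(np.abs(spectrum) + eps)
    # The log magnitude of a real signal is symmetric, so the result is real
    return np.fft.ifft(log_magnitude).real

# Synthetic stand-in for a recording: white noise plus one delayed,
# attenuated copy of itself (a single "echo")
rng = np.random.default_rng(0)
n, delay = 4096, 100
dry = rng.standard_normal(n)
wet = dry.copy()
wet[delay:] += 0.5 * dry[:-delay]

cepstrum = real_cepstrum(wet)

# Ignore the lowest quefrency bins, then look for the strongest peak
start = 20
peak = int(np.argmax(cepstrum[start:n // 2])) + start
print("strongest cepstral peak at sample", peak)
```

The echo shows up as a peak at the quefrency matching the delay, which is roughly why the cepstrum looked like a promising way to capture how a room colours a sound.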
By adding a preprocessing step where I take the cepstrum of each sample in the training set and generate a new data set from it, I was able to increase the accuracy of the model's classifications and decrease the size of the database dramatically, saving huge amounts of memory and training time.
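A rough sketch of what such a preprocessing step can look like is below. One big assumption here is how the data gets smaller: in this sketch only the first `n_coeffs` cepstral coefficients of each clip are kept and stored as 32-bit floats. The names `cepstral_features` and `build_dataset`, the value 256 and the random stand-in clips are all hypothetical choices for the example.

```python
import numpy as np

def cepstral_features(signal, n_coeffs=256, eps=1e-12):
    """Real cepstrum of one clip, truncated to its first n_coeffs values."""
    spectrum = np.fft.rfft(signal)
    log_magnitude = np.log(np.abs(spectrum) + eps)  # eps avoids log(0)
    cepstrum = np.fft.irfft(log_magnitude, n=len(signal))
    # Assumption: keep only the low-quefrency part as the feature vector
    return cepstrum[:n_coeffs].astype(np.float32)

def build_dataset(clips, n_coeffs=256):
    """Stack the per-clip feature vectors into one (n_clips, n_coeffs) array."""
    return np.stack([cepstral_features(clip, n_coeffs) for clip in clips])

# Random noise standing in for real one-second recordings at 16 kHz
rng = np.random.default_rng(0)
clips = [rng.standard_normal(16000) for _ in range(4)]

features = build_dataset(clips)
raw_bytes = sum(clip.nbytes for clip in clips)
print(features.shape, "raw bytes:", raw_bytes, "feature bytes:", features.nbytes)
```

How many coefficients to keep is a trade-off: too few and the room information may be thrown away, too many and the memory saving disappears.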
A further step I might take is testing whether the auto-cepstrum (the cepstrum of the autocorrelation signal), which is sometimes used in the analysis of signal data with echoes, produces more accurate results.
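For reference, a minimal sketch of an auto-cepstrum, following the definition above literally: autocorrelate the signal first, then take the cepstrum of the result. The function names and the synthetic echo signal are assumptions for illustration, and `np.correlate` is only practical for short clips like this one.

```python
import numpy as np

def real_cepstrum(signal, eps=1e-12):
    """Inverse FFT of the log magnitude spectrum of a 1-D signal."""
    log_magnitude = np.log(np.abs(np.fft.fft(signal)) + eps)
    return np.fft.ifft(log_magnitude).real

def auto_cepstrum(signal):
    """Cepstrum of the full (two-sided) autocorrelation of the signal."""
    # mode="full" returns all 2n - 1 lags, from -(n - 1) to n - 1
    autocorrelation = np.correlate(signal, signal, mode="full")
    return real_cepstrum(autocorrelation)

# Synthetic stand-in for a recording: white noise plus a single echo
rng = np.random.default_rng(0)
n, delay = 2048, 100
dry = rng.standard_normal(n)
wet = dry.copy()
wet[delay:] += 0.5 * dry[:-delay]

result = auto_cepstrum(wet)

# Ignore the lowest quefrency bins, then look for the strongest peak
start = 20
peak = int(np.argmax(result[start:n // 2])) + start
print("strongest auto-cepstral peak at sample", peak)
```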
Next week I plan to tweak the model, using TensorBoard to analyse the success of slightly different models, and to begin working more with GNN to implement the model with ECL.