Calculating a Pearson Correlation Matrix
One of the great things about graduate school is you get to constantly realize how ignorant you are about how things work. If the domain name weren’t already taken, I would think about moving my stuff over to seriously.dontusethiscode.com. It can get a bit depressing thinking about all the stuff there is to know. But if you think about it a bit more, it’s really just awesome how exciting our world is. I’m glad I’ve got this opportunity to learn and explore.
Anyway, I found a question on Reddit that reminded me of a post I wrote a while ago. One of the key steps in ‘Simple Sequence Similarity’ was calculating a Pearson correlation matrix. The Reddit problem isn’t exactly the same, but its nested iteration made me realize that I should probably show the improved code I currently use to calculate the correlation matrix. By “improved”, I mean ~500x faster.
Here’s a link to the notebook.
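If you just want the gist without opening the notebook, here is a minimal sketch of the general idea: replace the nested iteration over pairs of rows with a single matrix product. This is not necessarily the exact code from the notebook. The function names and the random test data are made up for illustration, and it assumes each row of `data` is one sequence and that no row is constant (a constant row would cause a division by zero).

```python
import numpy as np


def pearson_pair(x, y):
    """Pearson r between two 1-D arrays of equal length."""
    xc = x - x.mean()
    yc = y - y.mean()
    return (xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc))


def corr_matrix_loops(data):
    """Nested-iteration version: one Pearson r per pair of rows."""
    n = data.shape[0]
    out = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            out[i, j] = pearson_pair(data[i], data[j])
    return out


def corr_matrix_vectorized(data):
    """Vectorized version: center and scale each row, then one matrix product."""
    centered = data - data.mean(axis=1, keepdims=True)
    norms = np.sqrt((centered ** 2).sum(axis=1))
    unit = centered / norms[:, None]  # each row now has unit length
    return unit @ unit.T  # dot products of unit rows are the correlations


# Hypothetical data: 50 sequences with 200 observations each.
rng = np.random.default_rng(0)
data = rng.normal(size=(50, 200))

slow = corr_matrix_loops(data)
fast = corr_matrix_vectorized(data)
```

Both functions should give the same matrix up to floating-point error. NumPy's built-in `np.corrcoef(data)` also treats rows as variables by default, so it makes a handy cross-check. How much faster the vectorized version runs depends on the size of your data, so time it on your own problem.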
As a fun bonus (which I won’t dive into right now), there are at least four different examples of IPython magic throughout the notebook. As always, I welcome any comments or suggestions!
Cheers!