End of Year Review (Part II)
- There are several features of Python’s core language that I’ve played with and that would likely be useful, but I haven’t had the time to properly understand them or to work out how to use them efficiently in my code.
  - Generators can save memory. Something like a list of 10-mers could be produced lazily, one item at a time, instead of being held in memory all at once.
  - Decorators would be useful for cleanly modifying functions when I want to have several options for how to run some piece of core logic.
  - Inheritance is going to become more important as I continue to switch over to a more object-oriented (OOP) approach.
- Matplotlib has some serious disadvantages for plotting large datasets. I’ve made several attempts at investigating other plotting libraries, especially Bokeh, but nothing has clicked with me. Part of the problem is that some of these libraries are still under very active development, so things are changing quickly. Another issue is that I only devote a small amount of time to trying to plot something (like an animated plot), and if it doesn’t work out quickly, I give up because I feel like I’m falling behind on analyses.
- I was really hoping my Machine Learning course would give me the skills I needed to implement a neural network in Theano. Given the past few months, my guess is that I still have a serious time investment barrier before I would be able to properly incorporate a deep learning model into my research.
- I picked up the basics of git during my third rotation. I use it to store backup copies of my code, but I haven’t used it enough to employ it properly. I realize that if I used more of git (creating branches, etc.), it could save me from some goofy disaster while editing code or messing around with files.
- Similarly, I got fairly comfortable with the Killdevil cluster during my third rotation. Because it is difficult to view the Notebook when working on Killdevil, I often avoid using the cluster. This means I haven’t kept up my command-line skills as much as I should have. I was originally hoping to be regularly using tools like ‘grep’, ‘awk’ and bash scripting by now, but I can avoid a lot of it with the Ubuntu GUI. Now, every time I do need the cluster for something, it slows me down considerably.
- I’m in the same boat with regular expressions. Every once in a while I find a problem suitable for regex, google until I have the answer, and move on. I still don’t understand the tool well enough for the solutions I find to stick in my memory, so I don’t really learn.
- Statistics remains another weak area for me. The stats class I took at the beginning of grad school was a good introduction, but I need to find a way to stay fresh and continue to grow in this area. I still have a lot of trouble deciding which statistical test is appropriate in a given situation, for example.
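To make the generator point from the list above concrete, here is a minimal sketch of what a lazy 10-mer iterator might look like. The function name `iter_kmers` and the example sequence are made up for illustration and are not taken from any real analysis.

```python
def iter_kmers(sequence, k=10):
    """Yield each k-mer of `sequence` one at a time instead of building a list."""
    # range() is empty when the sequence is shorter than k, so nothing is yielded.
    for start in range(len(sequence) - k + 1):
        yield sequence[start:start + k]


# Hypothetical example sequence, 14 bases long.
seq = "ACGTACGTACGTAC"

# Only one k-mer exists in memory at a time while looping.
for kmer in iter_kmers(seq, k=10):
    print(kmer)
```

The trade-off is that a generator can only be walked through once; if the same k-mers are needed again, the generator has to be recreated (or the results stored in a list after all).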
There’s a theme here. Every couple of months, I run into a problem that can either be patched or worked around, or actually fixed with one of these tools. I decide to try out one of the tools and spend a couple of hours reading and experimenting. I learn a little bit, but not enough to fix whatever problem I’m having. I become frustrated because I wanted to have the problem fixed by the end of the day, and I know I can apply my workaround. So I just give up, apply my monkey patch, and move on. By the time I run into a similar problem a couple of months later, the little bit I learned during those few hours of reading and practice has been completely forgotten, and the cycle starts over. This is an unfortunate waste of time, especially since all of the tools listed are extremely valuable. I’ve been aware of this issue for a while now, and had hoped to have some method for avoiding it by now.