Tackling Webdev as a Bioinformatician: why is it so hard?

The web is amazing and amazingly complex

For context: I’m a bioinformatics engineer, and I’m building CodeStories. I used to say I was a scientist, though after a PhD in bioinformatics, interning at Google as a SWE, and now working full-time as an engineer, “scientist” isn’t as accurate as it once was. Regardless, more than 95% of my programming experience is scientific work in Python. I’m certainly not a web developer. I do, however, love the idea of the web. Writing software that is immediately accessible to anyone in the world with an internet connection still amazes me.

The problem is that I find web development tiresome and difficult. There are many more layers of complexity. Debugging is harder. The “right” way to solve a problem, or the “right” framework to choose, changes from year to year. UI is always a challenge when you’re as artistically underdeveloped as I am.

I’ve written the above paragraphs before. But the past couple of weeks have reminded me of them, because development has been particularly frustrating. Here are just a few of the problems I’ve been tackling.

I got this comment from a visitor. I fought an initial reaction of annoyance. He wasn’t wrong.

http://url1234.mycodestories.com/ls/click?upn=3QLThrqNigDEoSvVIPTzr5OzEa6O0MnUsvWv3olSZXl4L-2FYaN6SVBobENK8s9ueGGAFOA6m5yrLchJOvEChEY7P17X3veJ8UOiBnvmG9l-2Fp2elHVV9DwV85apRxA22lB1-2FSesba-2FJWayELpnnHzyCg-3D-3DGFRp_jqP8oJaQPFuAqB11Vl2sCfpACXj7r6HrULE4hOZOb1gAqnD5oEP2AQfL-2BW1jeWQlSQouzVbx8LjqaqL8GfEn7AGpGM-2B6MOKBPnOHE7dOT-2Fs8wnIUoU15UhdccNZ0PiVvFWLouo4TpW11lIQ7d2aCbRzzx5i6I7ozarMc-2BpC2xyMwzqKn8Dwd0TG9iz8YNtup-2Bi9ph4J4aPTMKpov4Ai8YphDw-2BKe9IgJz4vW4xgu8ACydvRsXEvI6e80ORBIwbJ-2BjXV9hp9hJlMEU8uGRyi8h-2F2tSygQULNLTyK8WlIyUZMUiMxX73U4Y7mrOvDMJLCI

A few things look weird here:

  1. Why’s there so much data in the url? A common refrain I’ll use for the remainder of the post is, “I don’t know”. One of my biggest gripes with web programming is the number of rabbit holes to fall down. There’s just not enough time. Here, for example, the url is being generated by django-allauth, a pretty nifty plugin that helps manage user accounts. I don’t know what data they’ve decided to serialize.
  2. What’s with the url1234 subdomain? This one was a little easier to figure out. I’m using SendGrid to send emails for me. When setting up my account, they asked if I wanted a custom subdomain. I said no, assuming that would mean that emails would be sent from @mycodestories.com. This was a pretty easy update to @email.mycodestories.com, but it would have been nice to have some prompt that gave a little more context for the decision I was making.
  3. Shouldn’t http really be https? I eventually filed an Issue asking for help on this one. The maintainers were incredibly gracious and helpful. But again, the complexity is overwhelming. In this single comment, here is a list of terms that you’d never come across as a bioinformatician: ["X-Forwarded-Proto", "proxying", "requests", "TLS connection", "HTTP", "loopback/localhost", "whitelisted proxy servers", "gunicorn", "nginx", "Apache/httpd", "HAProxy", "Django", "SECURE_SSL_REDIRECT", "SecurityMiddleware", "adapter", "URI"]. Over the past few years, I’ve come to understand some, but certainly not all, of these concepts and technologies.
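To make a few of those terms concrete, here is a minimal sketch of the Django settings that usually come up when an app runs behind a proxy that terminates TLS. The setting names are real Django and django-allauth settings, but that this is what the maintainers suggested, and that the deployment sits behind such a proxy, are assumptions rather than something stated above.

```python
# settings.py fragment: a sketch under the assumptions above,
# not the maintainers' exact recommendation.

# Trust the proxy's X-Forwarded-Proto header so that request.is_secure()
# returns True for requests that reached the proxy over HTTPS.
# Only safe when every request passes through a proxy that sets
# (or strips) this header.
SECURE_PROXY_SSL_HEADER = ("HTTP_X_FORWARDED_PROTO", "https")

# Ask Django's SecurityMiddleware to redirect plain-HTTP requests to HTTPS.
SECURE_SSL_REDIRECT = True

# django-allauth: default scheme for the URLs it generates
# (the library's default is "http").
ACCOUNT_DEFAULT_HTTP_PROTOCOL = "https"
```

With the first setting missing, Django sees only the proxy-to-app hop, which is plain HTTP, so links built from the request can come out as http even though the visitor arrived over https.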

On top of all this, users would occasionally click on the verification link and get a NET::ERR_CERT_COMMON_NAME_INVALID error from Chrome, along with the helpful message that:

the website sent back unusual and incorrect credentials. This may happen when an attacker is trying to pretend to be url1234.mycodestories.com, or a Wi-Fi sign-in screen has interrupted the connection. Your information is still secure because Google Chrome stopped the connection before any data was exchanged.

I finally just decided to temporarily stop sending verification emails.

Stripe’s payment APIs are beautiful

Taking payments is one of the most common activities on the web. It seems like it should be dead simple. From what I’d read, Stripe is somewhat famous for their high-quality APIs. I was excited to spend an evening on payments and flip the switch on being able to sign up paying customers. A week later… I’d decided that Stripe’s APIs are beautiful. But their documentation leaves a surprising amount to be desired. Part of the problem is that Stripe is on v3. Their docs haven’t kept up.

Forms

I began by trying to follow this guide on building a form to accept a credit card. Their inline code examples are intentionally incomplete, which is fine because they have a full example on GitHub. The problem is (and maybe I missed some easy reconciliation) that the two don’t match up. This discrepancy is another example of something I didn’t follow up on. As I was flipping through documentation pages trying to figure out what I was doing wrong while building my form, I came across Stripe Checkout. Watch out! Don’t get this page confused with the previous page, also entitled “Accept a Payment”.

Checkout

I decided to scrap making my own form and roll with Checkout instead. I made the mistake of trying to follow an unofficial tutorial showing exactly how to integrate Checkout into Django. It was a well-written guide, but things still weren’t clicking. Eventually I realized that the guide was a couple of years old, and thus outdated. Thankfully, I found an extremely helpful Stripe page, their migration guide. It was very easy to map the outdated tutorial instructions to the new standard.

Webhooks

In retrospect, it’s obvious that setting up Checkout means you’re only halfway done. The migration page nicely led me to “After the payment”, another page, where I could learn how to set up and respond to a webhook when a payment succeeds. Again, the docs almost work. On this page, the instructions tell you to create an event by:

event = stripe.Webhook.construct_event(
  payload, sig_header, endpoint_secret
)

Except that wasn’t working for me. After spending a bunch of time trying to figure out what I did wrong, I came across a separate but remarkably similar page that says I should build my event like this:

event = stripe.Event.construct_from(
  json.loads(payload), stripe.api_key
)

This version happens to work.
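One caveat about the difference between the two calls: stripe.Webhook.construct_event checks the Stripe-Signature header against the endpoint’s signing secret before building the event, while stripe.Event.construct_from only wraps the already-parsed JSON. So the second version may “work” simply because the check is skipped. The standard-library sketch below shows roughly what that check involves, following Stripe’s documented scheme (HMAC-SHA256 over "timestamp.payload"). The function name and the 300-second tolerance default are illustrative assumptions, and it is not a replacement for the library call.

```python
import hashlib
import hmac
import time
from typing import Optional


def verify_stripe_signature(payload: bytes, sig_header: str, secret: str,
                            tolerance: int = 300,
                            now: Optional[float] = None) -> bool:
    """Return True if sig_header carries a valid v1 signature for payload."""
    # The header looks like "t=1614556800,v1=<hex digest>[,v0=...]".
    parts = [item.split("=", 1) for item in sig_header.split(",") if "=" in item]
    timestamp = next((v for k, v in parts if k.strip() == "t"), None)
    candidates = [v for k, v in parts if k.strip() == "v1"]
    if timestamp is None or not timestamp.isdigit() or not candidates:
        return False

    # The signed message is the timestamp, a dot, and the RAW request body.
    signed_payload = timestamp.encode() + b"." + payload
    expected = hmac.new(secret.encode(), signed_payload,
                        hashlib.sha256).hexdigest()

    # Reject events whose timestamp is older than the tolerance window.
    current = time.time() if now is None else now
    if current - int(timestamp) > tolerance:
        return False

    # Constant-time comparison against every v1 signature in the header.
    return any(
        hmac.compare_digest(expected.encode(), c.encode()) for c in candidates
    )
```

Two common ways for this check to fail are passing a re-serialized body instead of the raw request bytes, and using the API key instead of the webhook endpoint’s signing secret. Whether either applied here is unknown.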

In my experience, once you’ve been around the block once or twice with a system like Stripe, all of these errors and mistakes seem trivial. As a beginner, all of these hurdles end up making things 5-10x more difficult than they could be. In this case, it ended up taking me about a week to get payments fully integrated and tested.

I can’t log in

This bit really capped off the week for me.

I’d rolled out a couple of soft launches of CodeStories, but because I hadn’t set up Stripe, I had ~40 people on a waitlist and no paying customers. After I finished setting up payments, I emailed everyone on the waitlist letting them know registration was open. My first potential customer contacted me with a question. I answered him, he said that sounded great, and he went to go enroll. Except, he couldn’t log in to his account… When I checked my account, neither could I…

At work, we sometimes run postmortems by using a technique called “The Five Whys”. Basically, you repeatedly ask why a problem occurred as a technique to get to the bottom of an issue. Let’s play.

  1. Why couldn’t I log in? Because after some amount of waiting, gunicorn SystemExits and the dyno gets rebooted.
  2. Why is gunicorn performing a SystemExit? Something in the bowels of the django-allauth login functionality was timing out.
  3. Why was django-allauth timing out? It tried to connect to redis, but it wasn’t able to do so, and didn’t handle the error gracefully.
  4. Why couldn’t django-allauth connect to redis? As recommended, I have a config var REDIS_URL that points to a URI. Mine was different than the one that I see when I open the redis admin page on Heroku. Importantly, I’ve had this environment up and running for over a year and have never had trouble with redis or logging in. And while I’d been doing a lot of signup testing for Stripe Integration, I was repeatedly creating new accounts then deleting them immediately after enrolling, without logging back in.
  5. Why was REDIS_URL incorrect? I DoN’t KnOw WhY!

My best guess is that it has something to do with this warning about redis credentials from Heroku:

Please note that these credentials are not permanent. Heroku rotates credentials periodically and updates applications where this datastore is attached.
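If credentials can change underneath the app, a cheap safeguard is a startup check that fails loudly when the configured Redis endpoint is unreachable, instead of letting a login request hang until gunicorn gives up. The sketch below uses only the standard library. The function names are hypothetical, and it tests only that a TCP connection can be opened, not that the password in the URL is still valid.

```python
import socket
from typing import Tuple
from urllib.parse import urlparse


def redis_host_port(url: str) -> Tuple[str, int]:
    """Extract (host, port) from a redis:// or rediss:// URL."""
    parsed = urlparse(url)
    if parsed.scheme not in ("redis", "rediss") or not parsed.hostname:
        raise ValueError("REDIS_URL does not look like a redis:// URL")
    # 6379 is the conventional Redis port when the URL omits one.
    return parsed.hostname, parsed.port or 6379


def redis_reachable(url: str, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the URL's host and port opens."""
    host, port = redis_host_port(url)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False
```

Calling redis_reachable(os.environ["REDIS_URL"]) when the app boots, and logging the host and port on failure, would point straight at a stale URL rather than at a mysterious timeout several layers down.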

It took me 4 straight hours of debugging to get to the bottom of this. By the time I was done, my customer was gone, and I haven’t heard from him since.

End Rant

This post is intentionally unpolished. Why? Because I want to get back to programming. On a personal level, the only way to overcome these issues is to learn more about the tools I’m using. And the only way to learn is to keep practicing.

More broadly, however, I’m writing this post as a plea to everyone building the web. Please stay conscious of those who know less than you. It’s easier said than done, and easily forgotten. I often fail at this myself. That’s why constant reminders are worthwhile. Build web tools and documentation that allow people like me, who aren’t web developers, to build tools like CodeStories for people who aren’t developers at all.

It’ll probably feel different tomorrow, but today it feels like we’ve managed to build systems over the last 40 years that are more complex than biology has been able to manage in 4,000,000,000 years of evolution.