Ankiport - Backend, Pt. 1


Oh look! I’m not dead! And I’m back after a New Year’s break to keep this series on track with some promised posts. Today (and next time), I’ll dive into Ankiport’s backend, and what I’ve learned about web app design by building it.

Ankiport is really only possible because of Kerrick Staley and his genanki library (so a big B I G thank you to you, sir!!), which lets you programmatically create Anki decks with Python 3. Python is one of my favorite languages and my go-to for getting started with any random idea I have, so I was pleasantly surprised to see that a library like genanki existed. As I mentioned in the previous post, I originally envisioned Ankiport as a CLI app, but after diving into the web dev world this past semester, I decided to restructure it into a web app. This turned out to be a good decision, since it meant that I now needed a backend for the app if I wanted to avoid rewriting genanki in JavaScript.

I’d already had experience writing a simple data API in Node/Express, so I knew what I needed from this backend: a simple API that would take a Quizlet and return the converted Anki file. In my attempt to create the CLI, I’d already written some Python code that could convert a Quizlet to an Anki deck, so thankfully I wasn’t starting completely from scratch. I knew there were ways to spawn processes from Node so that I could run my existing Python code that called genanki, but I figured it’d be good to learn as many new technologies as I could with this project, so I went with Flask instead. Flask is, of course, a Python library, so I could stay within the language my code was already written in, and after reading the docs, I found there wasn’t much difference between Flask and Express.

The first step in setting up my API backend was to rewrite my old Ankiport CLI code. Since this was a prototype that I’d thrown together in a few days, there were some kinks to iron out. The biggest change I made was with how I actually found the correct Quizlet to convert. I was originally taking in a Quizlet user’s username and searching through their decks until I found the specified one, but I discovered (why I didn’t find this at first is beyond me #derp) that the very smart API designers at Quizlet let you get a deck directly with the /set path. All you need is the deck ID, conveniently included in Quizlet URLs, which meant that I could fulfill my dream of taking in a URL and spitting out an Anki deck! 🎉🎊 I rewrote the code to query the /set endpoint and access the Quizlet set directly, iterating over the response and extracting each card into an Anki note, and then exporting the deck with the write_to_file("my_file_name") method in genanki. Now it was time to connect this to an API and see if it worked.
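
To make the URL-in, cards-out idea a bit more concrete, here’s a rough sketch of the two small steps on either side of the API call: pulling the set ID out of a URL, and turning the response into front/back pairs. This is not Ankiport’s actual code. The URL shape (`https://quizlet.com/<id>/<slug>/`), the `terms` / `term` / `definition` field names, and both function names are assumptions made up for illustration.

```python
import re

# Assumed URL shape: https://quizlet.com/<numeric id>/<slug>/
SET_ID_PATTERN = re.compile(r"quizlet\.com/(\d+)(?:/|$)")


def extract_set_id(url):
    """Return the numeric set ID from a Quizlet URL, or None if there isn't one."""
    match = SET_ID_PATTERN.search(url)
    return match.group(1) if match else None


def terms_to_pairs(set_json):
    """Turn a decoded /set-style response into (front, back) tuples.

    Assumes the response has a "terms" list whose items carry
    "term" and "definition" keys (a hypothetical shape).
    """
    return [(t["term"], t["definition"]) for t in set_json.get("terms", [])]
```

Each `(front, back)` pair would then become one Anki note before the deck gets exported.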

After getting the hang of Flask (I won’t go into the details of setting up a Flask app in this post, but you can find some great help here), I set up the main endpoint for my API: /port. I decided to take care of the URL parsing client-side, so the frontend would only send a set ID to my API. Since there was no data to upload, the endpoint would handle a simple GET request from the frontend carrying the set ID as a query param. I imported my Quizlet-to-Anki code (I call it ankiport-core), passed the set ID param into it, and returned the generated file as the response.

Here’s where I hit my first few roadblocks. First off, CORS. As I’m sure you know, browsers block a page from calling an API on a different origin unless that API opts in, and CORS is the mechanism for opting in. It’s there to keep you safe from malicious stuff coming from servers other than the one where your code is running. As soon as I started testing the API, I ran into CORS complaints (btw, if you haven’t heard of it, Postman is the BEST tool for testing APIs). Thankfully, there’s a library called flask_cors that lets you fix your CORS issues with a single line: CORS(app).

Another issue I ran into was with deployment. I decided that since I was learning new technologies, I wouldn’t go with Azure for deployment (since I’d used it for my capstone project) and opted instead for Google Cloud. After doing some reading, I found that Google App Engine would let me deploy the API with a simple command and be really cheap, so I set up a Google App Engine project for my API and beamed the code up. I immediately got some errors, however, and found that they came from my ankiport-core code, specifically from write_to_file(). As you can imagine, write_to_file() takes a file and writes the Anki deck to it. The problem is that Google App Engine apps run on a read-only filesystem, so the write part of write_to_file() wouldn’t really work on GAE.

I could have written to Google Cloud Storage, of course, but I wanted to avoid provisioning a storage bucket for the app: it would complicate my frontend code with strange requests, and I might have to pay for the storage. It seemed pointless to set up permanent storage for files that only needed to be delivered and downloaded to the user’s computer. I needed a way to skip writing the Anki deck to disk before sending it in the response, and simply hold it in memory until the API could beam it to the frontend.

After some sleuthing, I found a lead in the genanki source. write_to_file() uses the zipfile module to create the Anki deck file. The zipfile docs let me know that the ZipFile(file) constructor used in write_to_file() opens a new zipfile, “where file can be either a path to a file (a string) or a file-like object.” This piqued my interest, as I hadn’t heard of a file-like object before. After looking in the Python docs some more, I found that file-like objects include files (duh), but also BytesIO objects. For IO operations, a BytesIO object is treated like a file open in binary mode, which was exactly what I needed: a way to keep the deck data in memory instead of on disk, and one that already expects binary data, which is what comes out of genanki. I wrote a new method that would use a BytesIO object instead of a filename:

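    def makeDeckBytes(self, deckName, notes):
        '''
        Creates a deck with the given list of notes. Returns a BytesIO object
        containing all the binary data of the Anki deck.

        Needs `import io` and `import genanki` at the top of the module;
        gen_id() is assumed to be defined elsewhere.
        '''
        my_deck = genanki.Deck(gen_id(), deckName)
        deck_bytes = io.BytesIO()

        for note in notes:
            my_deck.add_note(note)
        # Pass the in-memory buffer where a filename would normally go
        genanki.Package(my_deck).write_to_file(deck_bytes)
        deck_bytes.seek(0)  # rewind so callers can read from the start
        return deck_bytes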

It worked like a charm! I could now switch my API code to make_response() instead of send_from_directory(), which I’d been using back when I wrote the file to disk. Easy breezy! This solution is more elegant, works on Google App Engine, and is (probably?) more performant since it’s not writing the deck to a file first.
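
If the file-like object trick is new to you too, here’s a tiny stdlib-only sketch of the same idea with no genanki involved: `zipfile.ZipFile` writes into an `io.BytesIO` buffer just as it would into a file on disk. The function name and the archive contents are made up for illustration.

```python
import io
import zipfile


def zip_in_memory(files):
    """Build a zip archive from a {name: text} dict without touching disk."""
    buffer = io.BytesIO()
    # ZipFile accepts a path or a file-like object; here it gets the buffer.
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, text in files.items():
            archive.writestr(name, text)
    return buffer.getvalue()  # the finished archive as plain bytes


payload = zip_in_memory({"hello.txt": "hi from memory"})

# Read it back, again from memory, to check the round trip.
with zipfile.ZipFile(io.BytesIO(payload)) as archive:
    names = archive.namelist()
    contents = archive.read("hello.txt").decode("utf-8")
```

Bytes like `payload` never exist as a file anywhere, which is the whole point on a host that won’t let you write one.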

After these problems, I didn’t have to change much until I added the custom styling feature. I’ll talk about how I implemented those changes, plus some small stuff related to response headers and downloading, in Backend, Pt. 2 – coming soon!

Cheers, Jahz.