The first problem was getting my data out of Fusion Tables and into DynamoDB. I had a CSV data export of my Fusion Tables from my
google-fusion-tables-backup script (though, Google Takeout now also provides CSV exports for Fusion Tables and is much easier to use). But how to get that into DynamoDB? AWS provides a tutorial for “Importing Data into DynamoDB”, but this doesn’t take a CSV - it instead takes an idiosyncratically-delimited file of the same format as a DynamoDB export. It also uses AWS Data Pipelines and Elastic Map Reduce, which may increase the cost of importing data over simply writing the data into DynamoDB directly. After looking around, I found some solutions to crib from for writing a simple Ruby script for reading a CSV into a DynamoDB table, a somewhat-slow 25 records at a time (you could, of course, speed this up by doing the batch inserts asynchronously). Since there are around 7.5 million rows of data in the Book Aligner alignment tables, this wound up costing around $9 worth of writes to import completely into DynamoDB. Ideally, if I were regularly updating the backend tables, I’d also write a smarter process that could just delete deprecated records and add new ones, instead of writing the entire table every time.
Execution failed due to configuration error: API Gateway does not have permission to assume the provided role”. The solution was to manually edit the role’s Trust Relationship, as described in this Stack Overflow answer. The second major issue after testing my API and making sure it worked how I wanted it to was actually deploying the API live on the web. The documentation is not fantastic on this, and it seems like you can only deploy using the command-line
aws utility, not the web interface - and the documentation also isn’t clear that you must specify the stage name with
--stage-name when creating the initial deployment, otherwise you’ll get the cryptic error “
An error occurred (BadRequestException) when calling the CreateDeployment operation: Invalid deployment content specified.CreateDeploymentInput should not be null.”
Aside from those relatively minor issues, I’m pretty happy with how easy it was to move everything over. The old Fusion Tables backend for the Book Aligner only got around 4,000 queries per month, so I don’t anticipate read costs being a major issue with DynamoDB (at $0.25 per million read requests). For comparison, a service like AWS Athena or Google BigQuery which costs $5 per terabyte of data scanned would wind up costing around $10/month in reads with the approximately 500MB of data in the Book Aligner tables.