/etc

Originally Published: 2019-08-26

If your Gmail storage is filling up, you’ll be directed to Google’s relatively pathetic instructions for clearing space in Gmail (at the time of this writing, those are “delete emails in Spam” and “delete emails with large attachments”). While that may free up a small amount of space, it’s unlikely to make much of a dent if you’ve been using Gmail for some time. A better way to free up space is to find the most frequent senders and delete all their email you don’t care about (newsletters, notifications, ads, spam, etc.).

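While I did find this Python script and corresponding blog post recommending the same approach, the script didn’t work correctly on my machine: the Python mailbox get_from() function seemed to return something other than the expected From: field for my mbox file. (For mbox messages, get_from() is documented to return the “From ” separator line that starts each message rather than the From: header, which would explain the mismatch.)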

So, if you’re familiar with the command line, you can use this one-liner on the mbox file you get from Google Takeout instead:

grep '^From:' ~/Downloads/Takeout/Mail/All\ mail\ Including\ Spam\ and\ Trash.mbox | cut -d'<' -f2 | tr -d '>' | sort | uniq -c | sort -rn > senders.txt

Sure, it’s not perfect, but it fails gracefully on From: lines that don’t conform: a line with no angle brackets passes through cut unchanged, so it is still counted, just with the “From:” prefix left on. You can then work your way through the resulting list, sorted from most to least frequent sender, and delete all mail from the senders you don’t care about (I suggest deleting lines in the output as you delete email, so you can keep track of what you’ve already deleted). It may take a little while for your storage quota report to update, but I deleted many tens of thousands of emails this way, freeing up about 5 GB of storage and getting comfortably back under the quota as a result.
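If you would rather stay in Python, the same count can be sketched with the standard library. This is only a rough equivalent of the one-liner, not the script mentioned above: the function name count_senders is made up for illustration, and it assumes the From: header is what you want to group by. It reads message["From"] and lets email.utils.parseaddr pull out the address, so lines without angle brackets are handled as well.

```python
import mailbox
from collections import Counter
from email.utils import parseaddr


def count_senders(mbox_path):
    """Count messages per sender address in an mbox file.

    count_senders is a hypothetical helper name, not part of any library.
    """
    counts = Counter()
    # create=False: raise an error instead of creating an empty file
    # when the path is wrong.
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for message in box:
            # message["From"] is the From: header; get_from() would give
            # the mbox "From " separator line instead.
            _, address = parseaddr(str(message.get("From", "")))
            counts[address.lower() or "(unknown)"] += 1
    finally:
        box.close()
    return counts
```

Calling count_senders(path).most_common(20) then gives the twenty most frequent senders with their message counts. On a multi-gigabyte Takeout file this will likely be slower than the grep pipeline, since every message is parsed.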

Update: If you want a more accurate picture of how your email storage is being used, I’ve written a short Ruby script that tries to calculate the total number of bytes for each sender in an mbox file from Google Takeout.
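To illustrate the idea of totalling bytes per sender, here is a minimal Python sketch. It is not the Ruby script itself: the function name bytes_per_sender is hypothetical, and it assumes that the size of each message as stored in the mbox file (attachments still encoded) is a good enough proxy, which may not match exactly what Gmail counts toward your quota.

```python
import mailbox
from collections import Counter
from email.parser import BytesHeaderParser
from email.utils import parseaddr


def bytes_per_sender(mbox_path):
    """Sum stored message sizes, in bytes, per sender address.

    bytes_per_sender is a hypothetical helper name used for illustration.
    """
    totals = Counter()
    parser = BytesHeaderParser()  # parses headers only, skips body structure
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for key in box.iterkeys():
            # Raw message as stored in the file, without the "From " line.
            raw = box.get_bytes(key)
            headers = parser.parsebytes(raw)
            _, address = parseaddr(str(headers.get("From", "")))
            totals[address.lower() or "(unknown)"] += len(raw)
    finally:
        box.close()
    return totals
```

Sorting with bytes_per_sender(path).most_common() puts the senders using the most space first, which can differ a lot from the ranking by message count when a few senders attach large files.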