Sep 30, 2024
We’re building an email sending platform and regularly encounter legacy firewalls and outdated corporate rules that make delivering email more complex than it may seem on the surface. Terrible email has existed for as long as email has existed. Along the way, restrictions have been put in place that inadvertently catch emails sent with the best of intentions, resulting in false positive spam scores. This is how we diagnosed and remedied that issue for one of our large senders.
Recently, a large university told us that a customer's emails would be delayed (but not blocked) by their automated systems due to a greylisting process. This meant the emails were temporarily rejected to ensure they were legitimate, but they would still be delivered after a resending attempt.
We knew our customer passed our own requirements because we also need to protect the reputation of our platform, so we took this as an opportunity to optimize our system against standards set by a picky IT department.
The resulting effort involved digging into spam filtering source code, our own MJML generation (the email markup language we use), and the MIME encoding package we relied on.
The university’s email infrastructure uses an open-source spam filtering system called Rspamd, which was created as an alternative to another open-source system called SpamAssassin. Rspamd categorizes emails into three levels of spamminess, based on a score derived from a set of code-driven rules.
Greylisted (Score >= 4): Our customer's email was classified at the lowest warning level, 'greylisted'. This means the email is temporarily rejected to verify its legitimacy by waiting for an automatic resending attempt. While this ensures the sender is genuine, it also causes delays in delivery.
Add Header (Score >= 6): The second level adds a header to the email, marking it as probable spam. This allows email clients like Gmail or Outlook to automatically move it to the spam folder, reducing the likelihood of the recipient seeing it.
Reject (Score >= 15): The third and most severe level is rejecting the email entirely, preventing it from reaching the recipient. This ensures that highly suspicious emails do not enter the recipient's mailbox at all.
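To make the three levels concrete, here is a minimal Python sketch that maps a score to an action. The thresholds are the ones quoted above; the function name and the "no action" label are our own illustration, and other Rspamd deployments may configure different values.

```python
# Thresholds as quoted above, ordered from most to least severe.
# Other deployments may configure different values.
THRESHOLDS = [
    (15.0, "reject"),
    (6.0, "add header"),
    (4.0, "greylist"),
]


def action_for(score: float) -> str:
    """Return the most severe action whose threshold the score meets."""
    for threshold, action in THRESHOLDS:
        if score >= threshold:
            return action
    return "no action"


print(action_for(5.24))  # greylist
print(action_for(1.24))  # no action
```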
Our customer's email was greylisted at a score of 5.24. We investigated fixes for every issue in the scoring report and ran our tests through an Rspamd Docker container to validate the results, eventually reducing the score from 5.24 to 1.24. Yay!
So what issues showed up in our customer's scorecard?
Rspamd scores contain three pieces of information: the problem, the original score, and the weighted score based on a customizable weight per rule. You can read the specific implementation in the Lua and C programming languages in the Rspamd GitHub repo.
The code is sophisticated and interesting to peruse, but let's break these rules down in plain English.
The Total Score
The first number of each row is a weighted value that is added to the total score, and the second number is the original unweighted value.
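As a rough illustration of how the rows add up, the sketch below assumes the weighted value is simply the unweighted value multiplied by the rule's configured weight. The rule names and numbers are hypothetical placeholders, not values from our customer's report.

```python
from dataclasses import dataclass


@dataclass
class SymbolResult:
    name: str
    raw: float     # original, unweighted value reported by the rule
    weight: float  # per-rule weight chosen by whoever runs the filter

    @property
    def weighted(self) -> float:
        # Assumption for this sketch: weighted = raw * weight.
        return self.raw * self.weight


def total_score(results) -> float:
    """Sum the weighted values of every rule that fired."""
    return sum(r.weighted for r in results)


# Hypothetical rows, for illustration only.
rows = [
    SymbolResult("EXAMPLE_RULE_A", raw=1.0, weight=0.5),
    SymbolResult("EXAMPLE_RULE_B", raw=1.0, weight=-0.01),
]
print(round(total_score(rows), 2))
```

A negative weight is how a rule can reduce the total rather than add to it.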
Many Invisible Parts
There are tricks to hide spam content from the human eye or computer vision, like choosing the same text color as the background color (just like early SEO hacks), or using a zero font size, zero opacity, CSS transparency, etc.
When you use more sophisticated MJML with inline CSS like we do, not all false-positives can be avoided, but we started off at a pretty low score already.
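If you want a rough pre-flight check of your own templates, a heuristic like the one below can flag inline styles that hide text. This is our own simplified sketch, not Rspamd's detection logic: it only looks at double-quoted `style` attributes and does not attempt the text-versus-background color comparison.

```python
import re

# Inline style declarations that make text invisible. The lookbehind keeps
# "color" from matching inside "background-color".
INVISIBLE_PATTERNS = [
    re.compile(r"(?<![-\w])font-size\s*:\s*0(?:px|pt|em|rem|%)?\s*(?:;|$)", re.I),
    re.compile(r"(?<![-\w])opacity\s*:\s*0(?:\.0+)?\s*(?:;|$)", re.I),
    re.compile(r"(?<![-\w])color\s*:\s*transparent\s*(?:;|$)", re.I),
]

STYLE_ATTR = re.compile(r'style\s*=\s*"([^"]*)"', re.I)


def invisible_styles(html: str) -> list:
    """Return every inline style value that matches a hiding pattern."""
    hits = []
    for style in STYLE_ATTR.findall(html):
        if any(p.search(style) for p in INVISIBLE_PATTERNS):
            hits.append(style)
    return hits


sample = (
    '<p style="font-size:0px;color:#fff">hidden</p>'
    '<p style="font-size: 14px; background-color: transparent">shown</p>'
)
print(invisible_styles(sample))
```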
URI Count Odd
When you send an email with HTML content, you also need to send a plain text version with the payload. Spam emails can contain invisible images for tracking, which do not show up in the plain text version.
This rule checks for hidden image links by counting the valid URIs in the HTML version, excluding image URIs, and comparing that against the count of URIs in the plain text version. We got this score down to zero by disabling image-to-text conversion in our MJML generation, without affecting our actual content.
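The comparison can be approximated with Python's standard library. This is a sketch of the idea as described above, not Rspamd's actual implementation: the class and function names are ours, and the URL regex is deliberately simple.

```python
import re
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect link targets and image sources separately."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.images = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])
        elif tag == "img" and attrs.get("src"):
            self.images.append(attrs["src"])


URL_RE = re.compile(r"https?://[^\s<>\"')]+")


def uri_counts(html: str, text: str) -> tuple:
    """Return (non-image URIs in the HTML part, URIs in the plain text part)."""
    collector = LinkCollector()
    collector.feed(html)
    return len(collector.links), len(URL_RE.findall(text))


html_part = '<a href="https://example.com/a">A</a><img src="https://example.com/pixel.png">'
print(uri_counts(html_part, "A: https://example.com/a"))
print(uri_counts(html_part, "A: https://example.com/a [image: https://example.com/pixel.png]"))
```

In the second call the plain text part carries an image URL that the HTML count excludes, so the two numbers disagree, which is the kind of mismatch that image-to-text conversion can introduce.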
Has List Unsub
Our emails already come with automated links to unsubscribe, so we banked some goodwill with a teeny tiny negative score here (negative means less spammy, positive is more spammy).
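For readers building their own sender, the header this rule looks for is `List-Unsubscribe`. A minimal sketch with Python's standard `email` package follows; the URL is a placeholder, and this is not how our platform generates the header.

```python
from email.message import EmailMessage


def with_unsubscribe(msg: EmailMessage, url: str) -> EmailMessage:
    """Attach a List-Unsubscribe header pointing at the given URL."""
    msg["List-Unsubscribe"] = f"<{url}>"
    return msg


msg = EmailMessage()
msg["Subject"] = "Weekly update"
msg.set_content("Hello!")
with_unsubscribe(msg, "https://example.com/unsubscribe")  # placeholder URL
print(msg["List-Unsubscribe"])
```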
Email was originally based on ASCII. The MIME standard evolved to support Unicode characters such as emojis and additional languages, as well as binary file attachments, by encoding them in a format that uses only ASCII characters. The most efficient but least human-legible of these is the Base64 encoding format.
Spammers use Base64 encoding to hide the contents of headers such as "From", "Reply-To", and “Subject”. Our interface at Loops supports editing sender names and subjects with UTF-8 characters, such as emojis.
If you're sending an email from the plain name "Alice [email protected]", there's no need to encode it as "=?utf-8?B?QWxpY2U=?= [email protected]". It’s an automatic yellow card when the spam filtering system decodes the header and realizes the encoding was unnecessary.
We use the handy and concise MIMEText npm package, and its latest version automatically encodes everything as base64. We resolved this with a fork of the latest package version, so we could pass a custom filter function to determine whether a MIME header needs encoding. We proposed a PR with unit tests for review, and imported our forked package in the meantime. Originally, we only used base64 encoding when we detected non-ASCII characters in a header, but we realized that even some ASCII characters, such as the colon “:” and angle brackets “<>”, needed encoding to pass server checks, hence the custom filter.
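The idea behind the filter can be sketched in a few lines of Python. This is not the code from our fork (which is JavaScript): the function names and the exact set of special characters are assumptions for illustration. It also ignores the length limit on encoded words, so long values would need splitting in real use.

```python
import base64

# ASCII characters that still trigger encoding in this sketch. The colon and
# angle brackets come from the text above; the real filter may differ.
SPECIALS = set(":<>")


def needs_encoding(value: str) -> bool:
    """True if the value has non-printable-ASCII or special characters."""
    return any(ord(ch) < 32 or ord(ch) > 126 or ch in SPECIALS for ch in value)


def encode_display_name(name: str) -> str:
    """Base64-encode the name as an encoded word only when required."""
    if not needs_encoding(name):
        return name
    b64 = base64.b64encode(name.encode("utf-8")).decode("ascii")
    return f"=?utf-8?B?{b64}?="


print(encode_display_name("Alice"))        # left as plain text
print(encode_display_name("Alice 🚀"))     # encoded
print(encode_display_name("Team: Loops"))  # encoded because of the colon
```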
A legitimate email goes through many servers: the outgoing server, the incoming server, internal network servers, etc. Each server in this relay adds a “Received” header to the email payload.
An email with only one receipt might be coming from a suspicious origin. Since our infrastructure doesn’t go through many layers, we’ll take the hit on this minor score unless we decide to add some layers later.
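Counting the hops on a raw message is straightforward with Python's standard library. The sketch below is only an illustration of what "one receipt" means; the host names and timestamps in the sample are made up.

```python
from email import message_from_string


def received_hops(raw: str) -> int:
    """Count the Received headers in a raw message."""
    msg = message_from_string(raw)
    return len(msg.get_all("Received", []))


# Made-up sample message with two relay hops.
raw = (
    "Received: from a.example by b.example; Mon, 30 Sep 2024 10:00:00 +0000\n"
    "Received: from c.example by a.example; Mon, 30 Sep 2024 09:59:59 +0000\n"
    "Subject: hi\n"
    "\n"
    "body\n"
)
print(received_hops(raw))  # 2
```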
The previous version of MIME encoding we used sometimes applied an encoding format called “7bit”, which isn’t compatible with Unicode characters. When we upgraded our MIME encoding package to use only base64 for headers, this went away.
We only had to comply with a few rules to reduce our spamminess score, but reading the source code for all the other Rspamd rules is like an archeological dig through decades of spam wars. The regular expressions alone seem to cover everything that any spammer has ever tried in the history of the internet, including the kitchen sink (https://github.com/rspamd/rspamd/blob/master/rules/regexp/headers.lua). You’ll see mentions of ancient email clients like “The Bat” and long-gone ISPs like “sympatico”.
There are also some curious rules, such as penalizing an email if the subject contains or ends with a “?” question mark or “!” exclamation mark. There’s also a rule penalizing subject lines that contain a currency symbol such as $, €, or ¥. This might not be obvious, but our rigorous technical analysis suggests you might get penalized for subject lines like, “Help, I am Prince! 💸 Can I transfer ¥30,000 to your account?”. So don’t do that maybe.
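A toy subject-line linter shows the spirit of these checks. It is a loose paraphrase of the rules described above, not a port of Rspamd's regular expressions, and the flag names are ours.

```python
# Currency symbols named in the text above; real rules may cover more.
CURRENCY_SYMBOLS = set("$€¥")


def subject_flags(subject: str) -> list:
    """Return the reasons a subject line might attract a penalty."""
    flags = []
    if "?" in subject:
        flags.append("question mark")
    if "!" in subject:
        flags.append("exclamation mark")
    if any(ch in CURRENCY_SYMBOLS for ch in subject):
        flags.append("currency symbol")
    return flags


print(subject_flags("Help, I am Prince! 💸 Can I transfer ¥30,000 to your account?"))
print(subject_flags("Your weekly summary"))
```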
Gmail lets you see the raw MIME headers of an email by choosing “Show original” from an option menu. Sending an email from:
should automatically encode the “From” header as:
When we looked at the raw headers in Gmail, it did not appear to be encoded:
So why were we being penalized?
It turns out that Gmail will strip the base64 encoding to make the raw output human-readable, so we verified our conditional encoding on other email platforms like Proton and iCloud mail. Since manual verification played tricks on us, we decided to pursue more automated verification.
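The sketch below shows why a decoded view hides the problem: an encoded header and a plain one decode to the same readable text, so only the raw value tells you which was sent. It uses Python's standard `email.header` module; the helper names and the regex are our own.

```python
import re
from email.header import decode_header, make_header

# Rough pattern for an encoded word: =?charset?B-or-Q?payload?=
ENCODED_WORD = re.compile(r"=\?[^?\s]+\?[BbQq]\?[^?\s]*\?=")


def readable(header_value: str) -> str:
    """Decode a header value into human-readable text."""
    return str(make_header(decode_header(header_value)))


def looks_encoded(header_value: str) -> bool:
    """True if the raw header value contains an encoded word."""
    return ENCODED_WORD.search(header_value) is not None


for raw_value in ("=?utf-8?B?QWxpY2U=?=", "Alice"):
    print(readable(raw_value), looks_encoded(raw_value))
```

Both values read as "Alice" once decoded, but only the first one was encoded on the wire.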
The best way to figure out how our email would be scored by Rspamd was to run Rspamd ourselves, so we added an Rspamd Docker container to our development environments. We automated an internal task to generate raw MIME text from a new set of sample emails that covered different combinations of ASCII and Unicode characters in various headers.
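A sample generator along these lines could look like the following. It is a sketch using Python's standard `email` package rather than our actual task, so its header encoding choices will differ from those of our MIME package; the names, addresses, and subjects are placeholders.

```python
from email.message import EmailMessage
from itertools import product

# Placeholder header values covering ASCII and Unicode variants.
NAMES = {"ascii": "Alice", "unicode": "Alice 🚀"}
SUBJECTS = {"ascii": "Welcome aboard", "unicode": "Welcome aboard 🎉"}


def build_samples() -> dict:
    """Return raw MIME bytes for every name/subject combination."""
    samples = {}
    for (name_kind, name), (subj_kind, subject) in product(
        NAMES.items(), SUBJECTS.items()
    ):
        msg = EmailMessage()
        msg["From"] = f"{name} <alice@example.com>"
        msg["To"] = "bob@example.com"
        msg["Subject"] = subject
        msg.set_content("Hello from the sample generator.")
        samples[f"from-{name_kind}_subject-{subj_kind}"] = msg.as_bytes()
    return samples


for key in sorted(build_samples()):
    print(key)
```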
Storing these samples as part of our git repo lets us diff the rendered emails as we add features. Then, we wrote tests to pass these MIME text payloads to our Rspamd docker container, and captured the resulting scores as part of our git repo. This historical record lets us do regression comparisons as our platform continues to evolve.
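A test harness for this can be quite small. The sketch below posts a raw message to Rspamd's HTTP scanning endpoint and compares per-rule scores against a stored baseline. The URL assumes a local container exposing the default worker port, the response fields (`score`, `action`, `symbols`) are what we understand the JSON reply to contain, and the helper names and tolerance are ours.

```python
import json
import urllib.request

# Assumption: a local Rspamd container with the normal worker on port 11333.
RSPAMD_URL = "http://localhost:11333/checkv2"


def scan(raw_mime: bytes, url: str = RSPAMD_URL) -> dict:
    """POST a raw MIME message to Rspamd and return the parsed JSON reply."""
    req = urllib.request.Request(url, data=raw_mime, method="POST")
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)


def summarize(result: dict) -> dict:
    """Keep only the fields worth committing to the repo."""
    symbols = result.get("symbols", {})
    return {
        "score": result.get("score"),
        "action": result.get("action"),
        "symbols": {name: s.get("score") for name, s in sorted(symbols.items())},
    }


def regressions(baseline: dict, current: dict, tolerance: float = 0.01) -> list:
    """Return rule names whose score rose by more than the tolerance."""
    worse = []
    for name, score in current["symbols"].items():
        before = baseline["symbols"].get(name, 0.0)
        if score - before > tolerance:
            worse.append(name)
    return worse
```

Committing the `summarize` output as JSON next to each sample keeps the diffs small and readable when a score changes.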
We untangled another email deliverability milestone with this deep dive, but there are still many challenges to address in the future. It's a cool space to be in: you can find an .html URL from twenty years ago and it will still have relevant and topical information, since in most cases email doesn't actually change.
Thanks for reading.