Stop deleting spam right now

Spam emails pose a serious risk to the security of our personal data, and this risk is widely underestimated. According to research GMX conducted last year, fewer than half of Brits (46%) even consider spam to be a cybersecurity risk at all, while one in four (24%) admitted having difficulty distinguishing spam emails from genuine newsletters or serious correspondence.

About the author

Jan Oetjen is responsible for the Mail and Portal businesses of United Internet AG.

In addition, there seems to be a serious lack of knowledge about how to handle malicious email that the provider's spam filter has missed and delivered to the inbox, with most email users doing the wrong thing. For example, 51% of Brits in our survey say they simply delete spam emails.

Deleting has a very limited effect given the enormous volume of spam hitting email services every day. While it gets that particular message out of sight, there are more effective ways to make life difficult for spammers.

The power of AI

Despite the eye-watering volumes of spam messages sent every day, the vast majority of it has a very small chance of making it into a user’s inbox today. Spam filters applied by email providers have evolved significantly and continue to become more sophisticated.

In their simplest form, spam filters still follow a set of rules to filter out messages with suspicious words such as ‘online pharmacy’, ‘Viagra’ or ‘Lottery Win’ that come from unknown or blacklisted IP addresses. Spammers can quickly adapt their messages through word obfuscation, i.e., simply adjusting the spelling of a word, and so outwit these simple filter rules.
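To make the weakness of such rules concrete, here is a minimal sketch in Python. It is an illustration only, not how any particular provider's filter works: the phrase list reuses the examples above, the block list and IP addresses are made up (taken from documentation-only ranges), and treating either a phrase match or a blocked IP as sufficient is an assumption.

```python
import re

# Hypothetical rule set, reusing the example phrases from the text.
SUSPICIOUS_PHRASES = ["online pharmacy", "viagra", "lottery win"]
# Hypothetical block list; the address comes from a documentation-only range.
BLOCKED_IPS = {"203.0.113.7"}


def is_spam_by_rules(body: str, sender_ip: str) -> bool:
    """Flag a message if the sender is blocked or a suspicious phrase appears."""
    if sender_ip in BLOCKED_IPS:
        return True
    text = body.lower()
    return any(
        re.search(r"\b" + re.escape(phrase) + r"\b", text)
        for phrase in SUSPICIOUS_PHRASES
    )


plain = "Cheap Viagra from our online pharmacy!"
obfuscated = "Cheap V1agra from our 0nline pharm4cy!"

print(is_spam_by_rules(plain, "198.51.100.4"))       # True: phrase matches
print(is_spam_by_rules(obfuscated, "198.51.100.4"))  # False: respelling evades the rules
print(is_spam_by_rules(obfuscated, "203.0.113.7"))   # True: sender is on the block list
```

Swapping a single character is enough to slip past the phrase rules, which is why every new spelling trick would need a new rule.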

To enable this kind of spam filter to recognize unwanted messages correctly, new rules must be added regularly to the filter system. This is a complicated process, as such adjustments need to be made for practically every new filter evasion that spammers come up with. Though this simple mechanism can still be applied effectively and can shield inboxes from large amounts of primitive spam, the analysis of individual words alone is no longer sufficient for reliable spam detection.

In spam fighting, Machine Learning (ML), a branch of Artificial Intelligence (AI), has successfully come into play in recent years, allowing computers to process enormous amounts of data and discover new patterns for themselves without the need to be manually programmed every time. Such Machine Learning-based spam filters can learn in several ways. 

This can be done, for example, by using existing data from already recognized spam mails. These emails are examined by ML for various characteristics that occur repeatedly. This information can then be used as a probable indicator for a spam email. From these patterns, the ML algorithm automatically updates its existing rules.
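One common way to learn from already recognized spam is a naive Bayes style word score. The sketch below is a simplified illustration under stated assumptions, not the method of any specific provider: the training messages are invented, tokens are plain whitespace-separated words, spam and ham are treated as equally likely beforehand, and the function names are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Split a message into lowercase whitespace-separated tokens."""
    return text.lower().split()


def train(spam_msgs, ham_msgs):
    """Count how often each token occurs in known spam and known ham."""
    spam_counts = Counter(t for m in spam_msgs for t in tokenize(m))
    ham_counts = Counter(t for m in ham_msgs for t in tokenize(m))
    vocab = set(spam_counts) | set(ham_counts)
    return spam_counts, ham_counts, vocab


def spam_log_odds(message, model):
    """Sum per-token log-odds; a positive score leans spam, a negative one ham."""
    spam_counts, ham_counts, vocab = model
    spam_total = sum(spam_counts.values())
    ham_total = sum(ham_counts.values())
    score = 0.0
    for tok in tokenize(message):
        if tok not in vocab:
            continue  # unseen tokens carry no evidence either way
        # Add-one (Laplace) smoothing avoids zero probabilities.
        p_spam = (spam_counts[tok] + 1) / (spam_total + len(vocab))
        p_ham = (ham_counts[tok] + 1) / (ham_total + len(vocab))
        score += math.log(p_spam / p_ham)
    return score


# Invented training data, for illustration only.
known_spam = ["win a free prize now", "free pills cheap pills", "claim your free prize"]
known_ham = ["meeting agenda for monday", "your invoice for march", "lunch on monday"]
model = train(known_spam, known_ham)

print(spam_log_odds("free prize inside", model) > 0)   # True: leans spam
print(spam_log_odds("agenda for monday", model) > 0)   # False: leans ham
```

Retraining on newly recognized spam updates the counts, so the scores shift without anyone writing a new rule by hand.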

Users’ feedback

A second important way that ML can enhance its own knowledge is through human intervention. This mechanism is both effective and rapid at discovering the new tricks that spammers come up with, as long as an email provider can rely on its users’ support. There are myriad ways to arrange various types of content in an email without even making it visible to the receiver. With broadband connections now commonplace and inboxes large enough to hold bulky messages, spammers increasingly exploit the flexibility of email’s open standard to evade spam filters.

For example, they may hide great quantities of unsuspicious content in an email body, hoping this will outweigh all the alerting characteristics of their mailing. Though ML can cope with such data “noise”, the number of possible variations and combinations may simply be too large to recognize quickly. It is equally important to ensure that no legitimate mailing (or “ham”) gets mistaken for spam. But as soon as users mark an email in their inbox as spam, they signal to the algorithm that something could be wrong with it. When the ML system receives more such signals from other users, it learns that this email’s characteristics can be used as an indicator for spam.
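The way many individual reports add up to one strong signal can be sketched as follows. This is a deliberately simplified illustration: the `FeedbackTracker` class, the exact-match fingerprint and the threshold of three distinct reporters are all assumptions, and a production system would feed such signals into its models rather than apply a fixed cut-off.

```python
import hashlib
from collections import defaultdict

REPORT_THRESHOLD = 3  # hypothetical number of distinct reporters needed


class FeedbackTracker:
    """Collects 'mark as spam' reports and counts distinct users per message."""

    def __init__(self, threshold=REPORT_THRESHOLD):
        self.threshold = threshold
        self._reporters = defaultdict(set)  # fingerprint -> set of user ids

    @staticmethod
    def fingerprint(body):
        # Normalize case and whitespace so trivial variations share a fingerprint.
        normalized = " ".join(body.lower().split())
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def report_spam(self, user_id, body):
        self._reporters[self.fingerprint(body)].add(user_id)

    def is_spam_indicator(self, body):
        reporters = self._reporters.get(self.fingerprint(body), set())
        return len(reporters) >= self.threshold


tracker = FeedbackTracker()
mail = "You have WON the lottery, click here"

tracker.report_spam("alice", mail)
tracker.report_spam("alice", mail)  # repeat reports by one user count once
tracker.report_spam("bob", mail)
print(tracker.is_spam_indicator(mail))  # False: only two distinct reporters

tracker.report_spam("carol", "you have won   the lottery, click here")
print(tracker.is_spam_indicator(mail))  # True: third distinct reporter
```

Counting distinct users rather than raw reports means a single person cannot push a legitimate mailing over the threshold alone.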

With prompt feedback from email users, the ML system can detect new malicious attempts at a much earlier stage. This not only stops a new wave of spam from spreading across numerous email accounts, but also helps the system become more intelligent and recognize future, probably even more sophisticated spam attacks. Artificial Intelligence will continue to play a major role in increasing the speed of spam detection. It is also the reason why today, despite the increasing volumes of spam messages being sent, you receive proportionally less spam than you did a decade ago.

The human touch

Even with these advances, a really effective spam filter cannot rely on AI alone. In the same way that social media platforms still employ an army of moderators to sanitize their content in tandem with algorithms, the best spam filters in the world need to combine AI with human intervention. 

An experienced email security expert is still able to assess the risk potential of a spam email more comprehensively than a machine. This is because humans are better at determining the possible “value chain” of the message, that is, how the message ultimately gets converted into cash. After all, the ultimate aim of spammers is still to get paid. Knowing this, the email security expert can ask, “How will the online fraudster get his money in real life?” or “What is the idea behind the spammer’s strategy in outwitting existing spam filters?” Thinking this through takes a lot of experience and some very specific expertise that only humans possess.

Three shades of spam

There is also another human factor that plays an important role in the handling of spam: how the users themselves perceive the category to which a suspicious email belongs. From the user’s point of view, spam can be classified into three categories. First, there is blocked spam. This is spam that is either not accepted by the provider’s email servers (because it is delivered by servers on block lists) or is detected as unwanted by spam filters, e.g., illegal advertising. Second, there is “red” spam that contains malicious links (e.g., phishing) or even malware.

Then there is a third category: “graymail.” It is called “gray” because it is neither on the list of blocked senders nor on the user’s list of approved senders. This is email that your spam filter is not sure what to do with, as some users mark it as spam and others do not. Emails from retailers often fall into this category, for example. The recipient usually opted in to receive them when making a purchase, but no longer wants them afterwards, so always moves them to the ‘Junk’ folder and perhaps clicks on ‘block the sender’ as well.

Over time, the personal inbox spam filter will learn what the recipient considers to be “graymail” based on these actions. A serious challenge for mail security experts adjusting email filters is to recognize which emails really fall into the spam category and which are legitimate newsletters that users should unsubscribe from rather than simply move to their spam folder. In the future, AI will be able to adjust and improve its reaction to this sort of message proactively, based on such continuous feedback.
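The idea that graymail is mail on which users disagree can be illustrated with a short sketch. Everything in it is a hypothetical simplification: the function names, the 10% and 90% cut-offs and the folder names are assumptions chosen for the example, not values used by any real filter.

```python
def classify_sender(spam_marks, deliveries, low=0.1, high=0.9):
    """Label a sender by the share of recipients who marked its mail as spam.

    The `low` and `high` cut-offs are hypothetical values for illustration.
    """
    if deliveries <= 0:
        raise ValueError("deliveries must be positive")
    ratio = spam_marks / deliveries
    if ratio >= high:
        return "spam"        # nearly everyone rejects it
    if ratio <= low:
        return "legitimate"  # nearly everyone accepts it
    return "graymail"        # users disagree


def folder_for_user(global_label, user_marked_junk):
    """For graymail, let the individual user's past actions decide the folder."""
    if global_label == "spam":
        return "Junk"
    if global_label == "graymail" and user_marked_junk:
        return "Junk"
    return "Inbox"


label = classify_sender(spam_marks=350, deliveries=1000)
print(label)                                           # graymail
print(folder_for_user(label, user_marked_junk=True))   # Junk
print(folder_for_user(label, user_marked_junk=False))  # Inbox
```

The same retailer newsletter therefore ends up in the Junk folder of one recipient and the inbox of another, depending on how each has treated it before.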

Despite the new closed messaging technologies, email is unarguably the Internet's biggest and most successful messaging standard. Email’s role as the backbone for most other digital services is also the reason why no effort, whether from email users or from email service providers, can be too great to keep email accounts safe and secure. It is good to know that with AI, we now have a powerful assistant in our ranks that will help us win the fight against spam.
