===Gathering of addresses===
To send spam, spammers need to obtain the email addresses of the intended recipients. To this end, both spammers themselves and list merchants gather huge lists of potential email addresses. Since spam is, by definition, unsolicited, this address harvesting is done without the consent (and sometimes against the expressed will) of the address owners. A single spam run may target tens of millions of possible addresses, many of which are invalid, malformed, or undeliverable.
===Obfuscating message content===
Many spam-filtering techniques work by searching for patterns in the headers or bodies of messages. For instance, a user may decide that all email they receive with the word "Viagra" in the subject line is spam, and instruct their mail program to automatically delete all such messages. To defeat such filters, the spammer may intentionally misspell commonly filtered words or insert other characters, often in a style similar to leetspeak. Because a given word can be written in many different ways, identifying every variant is harder for filter software. The principle is to leave the word readable to humans, who can easily recognize the intended word behind such misspellings, while making it unlikely to be recognized by a computer program. This is only somewhat effective, because modern filter patterns are designed to recognize blacklisted terms in their various misspelled forms. Other filters target the obfuscation methods themselves, such as non-standard punctuation or numerals inserted in unusual places.

Similarly, HTML-based email gives the spammer more tools to obfuscate text. Inserting HTML comments between letters can foil some filters. Another common ploy involves presenting the text as an image, which is either sent along with the message or loaded from a remote server.
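As a rough illustration of how a filter might counter these tricks, the following sketch strips HTML comments, reverses a few leetspeak-style substitutions, and drops inserted punctuation before comparing words against a blocklist. The substitution table and the names `LEET_MAP`, `normalize`, and `contains_blocked` are assumptions made for this example and do not describe any particular filter.

```python
import re

# Hypothetical substitution table; a real filter would need a far larger one.
LEET_MAP = str.maketrans({"1": "i", "@": "a", "4": "a", "0": "o",
                          "3": "e", "$": "s", "5": "s"})

HTML_COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)
NOT_LETTER_OR_SPACE = re.compile(r"[^a-z\s]+")


def normalize(text: str) -> str:
    """Undo some common obfuscations so that plain word matching can work."""
    text = HTML_COMMENT.sub("", text)          # remove comments placed between letters
    text = text.lower().translate(LEET_MAP)    # map look-alike characters to letters
    return NOT_LETTER_OR_SPACE.sub("", text)   # drop inserted punctuation and symbols


def contains_blocked(text: str, blocked: set[str]) -> bool:
    """Return True if any normalized word is on the blocklist."""
    return any(word in blocked for word in normalize(text).split())


if __name__ == "__main__":
    blocked = {"viagra"}
    for subject in ["V1agra", "Vi<!-- x -->agra", "v.i.@.g.r.a", "Meeting at 10"]:
        print(subject, contains_blocked(subject, blocked))
```

A normalizer like this also changes legitimate text (digits in ordinary messages are turned into letters), and it does nothing against text delivered as an image, so it is best read as one heuristic among several.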
===Defeating Bayesian filters===
As Bayesian filtering has become popular as a spam-filtering technique, spammers have started using methods to weaken it. To a rough approximation, Bayesian filters rely on word probabilities. If a message contains many words that are used only in spam, and few that are never used in spam, it is likely to be spam. To weaken Bayesian filters, some spammers now include lines of irrelevant, random words alongside the sales pitch, a technique known as Bayesian poisoning.

More broadly, machine learning can be used to identify and filter spam. The result is a game of escalation between spammers and anti-spam systems, in which spammers adjust their messages to evade each new identification and filtering technique.
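The effect of poisoning on a word-probability filter can be seen in a small sketch. It assumes a naive Bayes combination of per-word probabilities in log-odds form. The probability table and the name `spam_score` are invented for illustration, and the padding words are assumed to be ones the filter has mostly seen in legitimate mail.

```python
import math

# Hypothetical values of P(spam | word) learned from earlier mail.
WORD_SPAM_PROB = {
    "viagra": 0.99, "offer": 0.90,            # seen mostly in spam
    "meeting": 0.05, "garden": 0.10,          # seen mostly in legitimate mail
    "library": 0.08, "tuesday": 0.07,
}
UNKNOWN_WORD_PROB = 0.5  # assumption: unseen words count as neutral evidence


def spam_score(words: list[str]) -> float:
    """Combine per-word probabilities, treating words as independent evidence."""
    log_odds = 0.0
    for word in words:
        p = WORD_SPAM_PROB.get(word.lower(), UNKNOWN_WORD_PROB)
        log_odds += math.log(p) - math.log(1.0 - p)
    return 1.0 / (1.0 + math.exp(-log_odds))


if __name__ == "__main__":
    pitch = ["viagra", "offer"]
    poisoned = pitch + ["meeting", "garden", "library", "tuesday"]
    print("pitch only:   ", round(spam_score(pitch), 3))
    print("with padding: ", round(spam_score(poisoned), 3))
```

With these made-up numbers the padded message scores far lower than the bare pitch, which is the effect the spammer is after. Words the filter has never seen contribute nothing here, so truly random padding would only help the spammer to the extent that it happens to include words associated with legitimate mail.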
===Spam-support services===
A number of other online activities and business practices are considered by anti-spam activists to be connected to spamming. These are sometimes termed spam-support services: business services, other than the actual sending of spam itself, which permit the spammer to continue operating. Spam-support services can include processing orders for goods advertised in spam, hosting websites or DNS records referenced in spam messages, or a number of specific services as follows:

Some Internet hosting firms advertise bulk-friendly or bulletproof hosting. This means that, unlike most ISPs, they will not terminate a customer for spamming. These hosting firms operate as clients of larger ISPs, and many have eventually been taken offline by these larger ISPs as a result of complaints regarding spam activity. Thus, while a firm may advertise bulletproof hosting, it is ultimately unable to deliver without the connivance of its upstream ISP. However, some spammers have managed to get what is called a pink contract (see below), a contract with the ISP that allows them to spam without being disconnected.

A few companies produce spamware, or software designed for spammers. Spamware varies widely, but may include the ability to import thousands of addresses, to generate random addresses, to insert fraudulent headers into messages, to use dozens or hundreds of mail servers simultaneously, and to make use of open relays. The sale of spamware is illegal in eight U.S. states.

So-called millions CDs are commonly advertised in spam. These are CD-ROMs purportedly containing lists of email addresses, for use in sending spam to these addresses. Such lists are also sold directly online, frequently with the false claim that the owners of the listed addresses have requested (or "opted in") to be included. Such lists often contain invalid addresses. In recent years, these CDs have fallen almost entirely out of use, both because of the low quality of the addresses on them and because some email lists now exceed 20 GB in size, far more than a CD can hold.

A number of DNS blacklists (DNSBLs), including the MAPS RBL, Spamhaus SBL, SORBS and SPEWS, target the providers of spam-support services as well as spammers. DNSBLs blacklist IP addresses or ranges of IP addresses to persuade ISPs to terminate service to customers who are known spammers or who resell to spammers.

==Related vocabulary==