Wordfilter
A wordfilter (sometimes referred to as just "filter" or "censor") is a script, typically used on Internet forums or chat rooms, that automatically scans users' posts or comments as they are submitted and changes or censors particular words or phrases.
The most basic wordfilters search only for specific strings of letters, and remove or overwrite them regardless of their context. More advanced wordfilters make some exceptions for context (such as filtering "butt" but not "butter"), and the most advanced wordfilters may use regular expressions.
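The contrast can be illustrated with a minimal Python sketch (not drawn from any particular forum package) showing plain substring replacement versus word-boundary matching:

```python
import re

TEXT = "Pass the butter, you silly butt."

# Naive substring replacement: also mangles "butter".
naive = TEXT.replace("butt", "****")

# Word-boundary regex: only whole-word occurrences are censored.
bounded = re.sub(r"\bbutt\b", "****", TEXT, flags=re.IGNORECASE)

print(naive)    # Pass the ****er, you silly ****.
print(bounded)  # Pass the butter, you silly ****.
```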
Functions
Wordfilters can serve a number of functions.
Removal of vulgar language
A swear filter, also known as a profanity filter or language filter, is a software subsystem which modifies text to remove words deemed offensive by the administrator or community of an online forum. Swear filters are common in custom-programmed chat rooms and online video games, primarily MMORPGs. This is not to be confused with content filtering, which is usually built into Internet browsing programs by third-party developers to filter or block specific websites or types of websites. Swear filters are usually created or implemented by the developers of the Internet service.
Most commonly, wordfilters are used to censor language considered inappropriate by the operators of the forum or chat room. Expletives are typically partially replaced, completely replaced, or replaced by nonsense words.[1] This relieves the administrators or moderators of the task of constantly patrolling the board to watch for such language. This may also help the message board avoid content-control software installed on users' computers or networks, since such software often blocks access to Web pages that contain vulgar language.
Filtered phrases may be permanently replaced as they are saved (example: phpBB 1.x), or the original phrase may be saved but displayed as the censored text. In some software, users can view the text behind the wordfilter by quoting the post.
Swear filters typically take advantage of string replacement functions built into the programming language used to create the program, to swap out a list of inappropriate words and phrases with a variety of alternatives (a brief sketch of these strategies follows the list). Alternatives can include:
- Grawlix nonsense characters, such as !@#$%^&*
- Replacing a certain letter with a shift-number character or a similar-looking one.
- Asterisks (*) or similar symbols (#), either of a set length or matching the length of the original word being filtered. Alternatively, posters often replace certain letters with an asterisk.
- Minced oaths such as "heck" or "darn", or invented words such as "flum".
- Family-friendly words or phrases, or euphemisms, like "LOVE" or "I LOVE YOU", or completely different words which have nothing to do with the original word.
- Deletion of the post. In this case, the entire post is blocked and there is usually no way to fix it.
- Nothing at all. In this case, the offending word is deleted.
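These replacement strategies can be sketched in a few lines of Python; the banned term "frak" and the minced-oath mapping here are placeholders, not entries from any real filter list:

```python
import random
import re

PROFANITY = {"frak": "frell"}   # placeholder banned word -> minced-oath substitute
GRAWLIX = "!@#$%^&*"

def censor(match: re.Match, style: str = "asterisks") -> str:
    word = match.group(0)
    if style == "asterisks":    # asterisks matching the length of the original word
        return "*" * len(word)
    if style == "grawlix":      # nonsense characters
        return "".join(random.choice(GRAWLIX) for _ in word)
    if style == "minced":       # family-friendly or minced-oath replacement
        return PROFANITY.get(word.lower(), "heck")
    return ""                   # "nothing at all": the offending word is deleted

pattern = re.compile(r"\b(" + "|".join(map(re.escape, PROFANITY)) + r")\b", re.IGNORECASE)
print(pattern.sub(lambda m: censor(m, "asterisks"), "What the frak happened?"))
print(pattern.sub(lambda m: censor(m, "minced"), "What the frak happened?"))
```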
Some swear filters do a simple search for a string. Others have measures that ignore whitespace, and still others go as far as ignoring all non-alphanumeric characters and then filtering the plain text. This means that if the word "you" was set to be filtered, "y o u" or "y.o!u" would also be filtered.
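A hedged sketch of that normalization step in Python, using the "you" example from the text:

```python
import re

BANNED = {"you"}   # example word from the text above

def contains_banned(message: str) -> bool:
    # Collapse the message to bare alphanumerics so "y o u" and "y.o!u"
    # both reduce to "you" before the lookup.
    collapsed = re.sub(r"[^a-z0-9]", "", message.lower())
    return any(word in collapsed for word in BANNED)

print(contains_banned("y o u"))   # True
print(contains_banned("y.o!u"))   # True
print(contains_banned("hello"))   # False
```

Collapsing the message this way catches spaced-out spellings, but it also makes the false positives discussed below more likely, since unrelated adjacent words can run together into a banned string.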
Cliché control
Clichés—particular words or phrases constantly reused in posts, also known as "memes"—often develop on forums. Some users find that these clichés add to the fun, but other users find them tedious, especially when overused. Administrators may configure the wordfilter to replace the annoying cliché with a more embarrassing phrase, or remove it altogether.
Vandalism control
Internet forums are sometimes attacked by vandals who try to fill the forum with repeated nonsense messages, or by spammers who try to insert links to their commercial web sites. The site's wordfilter may be configured to remove the nonsense text used by the vandals, or to remove all links to particular websites from posts.
Lameness filter
Lameness filters are text-based wordfilters used by Slash-based websites such as Slashdot to stop junk comments from being posted in response to stories. Some of the things they are designed to filter include:
- Too many capital letters
- Too much repetition
- ASCII art
- Comments which are too short or long
- Use of HTML tags that try to break web pages
- Comment titles consisting solely of "first post"
- Any occurrence of a word or term deemed (by the programmers) to be offensive/vulgar
Circumventing filters
Since wordfilters are automated and look only for particular sequences of characters, users aware of the filters will sometimes try to circumvent them by changing their spelling just enough to avoid the filters. A user trying to avoid a vulgarity filter might replace one of the characters in the offending word with an asterisk, dash, or something similar. Some administrators respond by revising the wordfilters to catch common substitutions; others may make filter evasion a punishable offense of its own.[2] A simple example of evading a wordfilter would be entering symbols between letters, deliberately misspelling words, or using leet. More advanced techniques of wordfilter evasion include the use of images, hidden tags, or Cyrillic characters (i.e. a homograph spoofing attack).
Another method is to use a soft hyphen. A soft hyphen only indicates where a word may be split when breaking text lines and is not otherwise displayed. Placing one in the middle of a word breaks the word up so that, in some cases, the wordfilter no longer recognises it.
Some more advanced filters, such as those in the online game RuneScape, can detect bypassing. However, the downside of sensitive wordfilters is that legitimate phrases get filtered out as well.
Censorship aspects
Wordfilters are coded into the Internet forums or chat rooms and operate only on material submitted to the forum or chat room in question. This distinguishes wordfilters from content-control software, which is typically installed on an end user's PC or computer network and which can filter all Internet content sent to or from the PC or network in question. Since wordfilters alter users' words without their consent, some users consider them to be censorship, while others consider them an acceptable part of a forum operator's right to control the contents of the forum.
False positives
A common quirk with wordfilters, often considered either comical or aggravating by users, is that they often affect words that are not intended to be filtered. This is a typical problem when short words are filtered. For example, with the word "ass" censored, one may see, "Do you need istance for playing clical music?" instead of "Do you need assistance for playing classical music?" Multiple words may be filtered if whitespace is ignored, resulting in "as suspected" becoming " uspected". Prohibiting a phrase such as "hard on" will result in filtering innocuous statements such as "That was a hard one!" and "Sorry I was hard on you," into "That was a e!" and "Sorry I was you."
Some words that have been filtered accidentally can become replacements for profane words. One example of this is found on the Myst forum Mystcommunity. There, the word 'manuscript' was accidentally censored for containing the word 'anus', which resulted in 'm****cript'. The word was adopted as a replacement swear and carried over when the forum moved, and many substitutes, such as "'scripting", are used (though mostly by the older community members).
Place names may be filtered out unintentionally due to containing portions of swear words. In the early years of the internet, the British place name Penistone was often filtered out from spam and swear filters.[3]
Implementation
Many games, such as World of Warcraft and, more recently, Habbo Hotel and RuneScape, allow users to turn the filters off. Other games, especially free massively multiplayer online games such as Knight Online, do not have such an option.
Other games such as Medal of Honor and Call of Duty (except Call of Duty: World at War, Call of Duty: Black Ops, Call of Duty: Black Ops 2, and Call of Duty: Black Ops 3) do not give users the option to turn off scripted foul language, while Gears of War does.
In addition to games, profanity filters can be used to moderate user-generated content in forums, blogs, social media apps, kids' websites, and product reviews. There are many profanity filter APIs, such as WebPurify, that help replace swear words with other characters (e.g. "@#$!"). These profanity filter APIs work by searching for profanity and replacing it.
References
[edit]- ^ "When the **** did we get a wordfilter?". Retrieved 2006-10-01.
- ^ "GameFAQs Terms of Use". GameFAQs. Retrieved 2008-08-04.
- ^ Sheerin, Jude (29 March 2010). "How spam filters dictated Canadian magazine's fate". BBC Online. Retrieved 5 April 2011.
External links
- Online Text Obfuscator – replaces characters with similar Unicode chars from different character sets (e.g. Cyrillic)
- Text Filter – Text Tools Online: alphabetic sort, remove duplicates, delete all non-alphanumeric characters, only numbers, letters, etc.
- Random Strings - generates random strings of human-readable characters with profanity removed.
Wordfilter
Origins and History
Early Development in Online Forums
In the late 1970s and early 1980s, pioneering online forums such as Bulletin Board Systems (BBS)—first developed in 1978 by Ward Christensen and Randy Suess—and Usenet newsgroups, launched in 1979 by Tom Truscott and Jim Ellis at Duke University, relied exclusively on manual moderation to address profanity and disruptive content.[6][7] Sysops in BBS environments or volunteer moderators in Usenet groups reviewed posts, enforced community norms, and removed objectionable material, as automated tools were absent due to limited computing resources and the small scale of these dial-up-based networks.[8] Moderated Usenet hierarchies, introduced in the early 1980s, filtered submissions before propagation but depended on human judgment rather than scripts.[9]

The shift toward automated wordfilters accelerated in the mid-1990s amid the explosive growth of web-accessible forums and commercial services like America Online (AOL), which hosted chat rooms and discussion boards for millions of users.[10] Early implementations used rudimentary keyword-matching algorithms to scan user inputs in real-time, replacing detected profanities with asterisks or rejecting submissions outright. A prominent example emerged in April 1996, when AOL's profanity filter blocked account creations by residents of Scunthorpe, Lincolnshire, England, as the town name contained the substring "cunt"—highlighting the pitfalls of substring-based detection without contextual awareness.[11][12] This incident, affecting multiple UK locales like Penistone and Clitheroe, underscored the crude nature of initial filters, which prioritized broad blocking over precision to curb obscenity in growing online spaces.[11]

Contemporary parental control software, such as Net Nanny released in 1995, paralleled these developments by applying keyword scans to block web content containing terms like "sex," influencing forum administrators seeking scalable moderation for unmoderated posts.[13] Web forum precursors like WWWBoard (1995) and early CGI-based boards laid groundwork for integrated filtering scripts, enabling site owners to automate censorship amid rising user volumes and concerns over indecency, as later codified in the U.S. Communications Decency Act of 1996.[10][13] These tools marked a pragmatic evolution from labor-intensive oversight, though they often generated false positives and evasive user tactics like leetspeak.[11]
Expansion to Gaming and Wikis
As multiplayer online games proliferated in the late 1990s and early 2000s, wordfilters expanded from forum-based systems to in-game chat moderation, primarily to suppress profanity, harassment, and disruptive language in real-time player interactions. Early massively multiplayer online role-playing games (MMORPGs) like Ultima Online, launched in 1997, and EverQuest in 1999, incorporated basic keyword-based filters in their chat interfaces to enforce community standards, reflecting the growing need to manage large-scale user-generated content amid rising player bases.[14] By the mid-2000s, platforms such as World of Warcraft (2004) standardized these tools, often replacing offensive terms with asterisks or symbols to align with ESRB ratings and reduce toxicity, though implementations varied by developer priorities for family-friendly environments versus mature audiences.[15]

This adaptation addressed unique gaming challenges, including voice-to-text conversions and leetspeak circumventions, where players altered spellings (e.g., "pwn" for "own") to evade detection. Roblox, debuting in 2006, bundled a proprietary blacklist called diogenes.fnt with its client software to scan and block prohibited words in user chats, demonstrating how wordfilters evolved into embedded, client-side mechanisms for scalable enforcement in user-driven virtual worlds.[16] Such systems prioritized rapid scanning over nuanced context, leading to overfiltering incidents, but they became foundational for maintaining playable social spaces in genres like MOBAs and shooters, where unchecked language could exacerbate griefing.[17]

In wiki platforms, wordfilter expansion occurred later through extensions like MediaWiki's AbuseFilter, introduced around 2006-2007 in development and enabled project-wide by March 2009 on sites including English Wikipedia.[18] This tool extended forum-style keyword matching to edit previews and page creations, flagging or blocking inputs containing spam phrases, profanity, or vandalism patterns (e.g., mass insertion of links or slurs) to protect collaborative editing from anonymous disruptions. Unlike gaming's real-time focus, wiki filters emphasized preventive rulesets configurable by administrators, integrating variables like user edit history and IP patterns for higher precision.[19] By 2011, AbuseFilter was active on over 66 Wikimedia-hosted wikis, underscoring its role in scaling moderation for open-editing models amid rising spam from bots and trolls.[20] These implementations highlighted a shift toward programmable, condition-based filtering, though they retained limitations in handling creative evasions like obfuscated text.
Core Functions
Profanity and Obscenity Filtering
Profanity and obscenity filtering constitutes a primary function of wordfilters, employing automated algorithms to detect and neutralize offensive language in user-generated content across platforms such as online forums, multiplayer games, and collaborative wikis. These systems scan text inputs in real time, identifying terms classified as profane—such as expletives denoting sexual acts, excrement, or genitalia—or obscene, encompassing vulgar slang and slurs that violate platform decorum. Detection typically relies on predefined blacklists of banned words, with matches triggering substitutions like asterisks (e.g., "f***") or outright blocking of submissions to preserve a moderated environment suitable for diverse audiences, including minors.[21][22]

Basic implementations utilize exact string matching against curated dictionaries, often numbering in the thousands of entries, drawn from linguistic corpora and community reports. To counter evasion tactics, filters incorporate regular expressions (regex) for pattern recognition, capturing morphological variants, phonetic approximations (e.g., "fuk" or "phuck"), and obfuscations via symbols or numbers (e.g., "sh1t"). For example, a regex pattern like /f[u0o*]+[kc]{1,2}/i can approximate multiple spellings of a common expletive while ignoring case. Such methods emerged prominently in early 2000s gaming titles and forum software, where server-side processing ensured low-latency enforcement without compromising performance.[23][24]
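A small Python sketch of this pattern-based approach; "darn" stands in for an actual blocklisted expletive, and the character classes mirror the kinds of substitutions such patterns tolerate:

```python
import re

# One pattern per banned term, tolerating common symbol substitutions and
# repeated letters; "darn" is a placeholder, not a real blocklist entry.
VARIANT_PATTERNS = [
    re.compile(r"\bd[a@4*]+r+n+\b", re.IGNORECASE),
]

def is_flagged(text: str) -> bool:
    return any(p.search(text) for p in VARIANT_PATTERNS)

print(is_flagged("D@rn it"))       # True  - symbol substitution caught
print(is_flagged("daaarrnn!"))     # True  - repeated letters caught
print(is_flagged("darning sock"))  # False - the word boundary spares "darning"
```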
Despite their prevalence, keyword-centric approaches suffer from inherent brittleness, generating false positives that censor benign content—a phenomenon termed the "Scunthorpe problem" after inadvertent blocks of the town name "Scunthorpe" due to embedded profanity substrings like "cunt." Instances include filters flagging words such as "assassin," "therapist," or "bassinet," eroding user trust and usability, as documented in developer forums since at least 2008. Overfiltering occurs in roughly 5-10% of cases for simplistic systems, per anecdotal engineering reports, necessitating manual overrides or whitelist exceptions for proper nouns and domain-specific terms.[24]
Contemporary enhancements leverage fuzzy matching algorithms, such as Levenshtein distance for edit-distance tolerances up to 2-3 characters, and natural language processing (NLP) models trained on annotated datasets to evaluate contextual intent—distinguishing, for instance, "shit" as excrement from its use in phrases like "holy shit" versus non-profane "shift." Machine learning variants, integrated since the mid-2010s, achieve precision rates exceeding 90% in controlled benchmarks by analyzing syntactic roles and sentiment, though they demand ongoing retraining to adapt to evolving slang and require API calls for cloud-based inference, incurring latency and costs. In gaming contexts, like Unity-based titles, plugins such as Bad Word Filter PRO process multilingual inputs with customizable sensitivity levels, filtering over 10,000 terms across 20+ languages as of 2024 updates.[25][26]
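A self-contained sketch of the Levenshtein-distance check described above (illustrative only; production systems use optimized libraries, larger blocklists, and token-level preprocessing):

```python
def edit_distance(a: str, b: str) -> int:
    # Standard dynamic-programming Levenshtein distance.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]

BLOCKLIST = ["darn"]   # placeholder term

def fuzzy_flag(token: str, max_edits: int = 1) -> bool:
    return any(edit_distance(token.lower(), bad) <= max_edits for bad in BLOCKLIST)

print(fuzzy_flag("darm"))   # True  - one substitution away
print(fuzzy_flag("dern"))   # True
print(fuzzy_flag("door"))   # False - three edits away
```

Note that generous edit thresholds on short words aggravate exactly the false-positive problem discussed below, which is why tolerances are usually kept to one or two edits.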
Empirical evaluations underscore that while effective against overt profanity—reducing incidence by 70-80% in moderated chats per platform logs—filters falter against sophisticated circumventions, including zero-width spaces, homoglyphs (e.g., Cyrillic 'а' mimicking Latin 'a'), or rephrasings that preserve intent without direct keywords. Platform operators thus layer filters with human moderation queues for flagged edge cases, balancing automation's scalability against accuracy deficits rooted in language's combinatorial complexity.[27][28]
Cliché and Quality Control
In addition to profanity filtering, wordfilters serve a role in cliché detection and broader quality control by targeting overused phrases, repetitive expressions, and indicators of low-effort contributions that degrade discussion standards in online communities. Creators and moderators configure these filters to block or flag content such as generic praise like "great video" or appearance-based comments, which often signal superficial engagement rather than substantive input, thereby encouraging more original and valuable interactions.[29] This application extends beyond explicit offensiveness to enforce community norms around discourse quality, as seen in platforms where repetitive political rants or self-promotional tags trigger automated holds for review.[29]

In forum software like Reddit's AutoModerator, wordfilters integrate with thresholds—such as minimum word counts or banned phrase lists—to identify and quarantine low-quality posts, including those reliant on clichéd or templated language common in spam or bot-generated content.[30] For instance, moderators may blacklist overused idioms or formulaic responses that flood threads, reducing noise and prioritizing analytical contributions; empirical studies of such systems show they help maintain transparency and user trust by preempting dilution of high-value exchanges.[31] In gaming environments and wikis, similar mechanisms scan for clichéd hype phrases (e.g., "best ever") in chat or edit summaries, flagging them to prevent erosion of focused, skill-oriented or encyclopedic content.[29]

Challenges in this domain include balancing specificity to avoid overreach, as broad cliché filters risk suppressing legitimate slang or cultural references, necessitating creator-led customization with preview tools and analytics for refinement.[29] Tools like FilterBuddy demonstrate effective designs by categorizing filters for quality issues, allowing import of curated lists for non-profane nuisances and providing metrics on filtered volume to assess impact on community health.[32] Overall, these functions promote causal improvements in content ecosystems by incentivizing depth over rote repetition, though efficacy depends on ongoing tuning against evasion tactics like phrase variations.[29][30]
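A rough Python sketch of this kind of quality rule—minimum length plus a banned-phrase list—not the actual AutoModerator or FilterBuddy configuration syntax:

```python
CLICHES = {"first post", "great video", "best ever"}   # illustrative phrase list
MIN_WORDS = 5

def hold_for_review(comment: str) -> bool:
    text = comment.lower().strip()
    too_short = len(text.split()) < MIN_WORDS          # length threshold
    cliched = any(phrase in text for phrase in CLICHES) # banned-phrase check
    return too_short or cliched

print(hold_for_review("Great video"))                                            # True
print(hold_for_review("The pacing argument at 3:40 ignores the earlier cut."))   # False
```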
Vandalism and Spam Mitigation
Wordfilters address vandalism and spam in online communities by scanning user inputs against blacklists of prohibited keywords, phrases, or patterns commonly associated with malicious activity, such as promotional links, commercial solicitations, or nonsensical strings used in defacements. In forum software like Wix Forums, administrators enable a word filter to block posts containing specified spam words, entered as comma-separated lists, which prevents the submission of content matching those terms.[33] Similarly, Web Wiz Forums incorporates a configurable spam filter that matches exact words, URLs, or regular expressions typical of spam messages, automatically rejecting or flagging them to curb promotional flooding by bots or scripted accounts.[34]

For vandalism—often involving rapid, repetitive insertions of gibberish, obscenities, or disruptive text in editable platforms like wikis and bulletin boards—wordfilters mitigate impact by integrating keyword detection to trigger blocks, reverts, or notifications before content persists. In phpBB installations, community discussions highlight adaptations of word censoring mechanisms to prevent rather than merely replace spam-laden posts, targeting patterns from automated vandals who register accounts solely for disruption.[35] This automated layer reduces the influx of obvious low-quality edits, easing the burden on manual oversight in environments prone to coordinated attacks, such as open forums where spammers insert commercial links or vandals post repeated nonsense to overwhelm threads.[36]

Such systems prioritize predefined rules over contextual analysis, effectively halting entry-level threats like keyword-stuffed advertisements (e.g., terms like "buy viagra" or "casino online") that constitute the majority of spam volume in unmoderated spaces.[37] By enforcing these at the input stage, wordfilters maintain content integrity without requiring real-time human intervention for routine cases, though they complement rather than replace broader anti-bot measures like CAPTCHA or IP tracking.
Technical Implementation
Keyword-Based Matching Systems
Keyword-based matching systems form the foundational approach in wordfilter technologies, relying on predefined dictionaries or lists of prohibited terms to detect and block undesirable content in real-time text processing. These systems scan user input against a static or semi-static blacklist of keywords, such as profanity, slurs, or spam indicators, triggering actions like message rejection, redaction (e.g., replacing matched terms with asterisks), or flagging for review.[38][22] This method prioritizes computational efficiency, enabling deployment in high-volume environments like online forums and multiplayer games, where processing occurs at the server side before content is displayed.[29]

At their core, these systems employ string comparison algorithms to identify matches, typically converting input text to lowercase for case-insensitive detection and tokenizing it into words or substrings. Exact matching requires the full keyword to appear, while substring or partial matching flags any occurrence, though the latter increases false positive risks—such as blocking "assassin" due to the substring "ass"—prompting many implementations to favor whole-word boundaries (e.g., via delimiters like spaces or punctuation).[39] For scalability with extensive keyword lists (often thousands of entries), efficient data structures like hash sets or tries (prefix trees) are used; a trie allows single-pass scanning of the input by traversing branches corresponding to character sequences, minimizing time complexity to O(n + m), where n is input length and m is total keyword characters.[23]

Variations include weighted scoring, where multiple keyword hits accumulate to exceed a threshold before action, or integration with basic regular expressions for pattern flexibility (e.g., matching "f*ck" variants without full fuzzy logic). In practice, lists are curated from domain-specific sources, such as community-reported terms in gaming platforms, and updated periodically to address emerging slang, though their static nature limits adaptability to contextual nuances or obfuscations like intentional misspellings.[38] Empirical evaluations of keyword matching in text filtering report precision rates around 70-80% for pornography detection when hybridized with rules, but standalone systems suffer from over 20% false positives in diverse corpora due to polysemy and lack of semantic understanding.[40] Despite these constraints, keyword-based systems remain prevalent for their low latency and transparency, serving as a baseline in hybrid moderation pipelines.[29]
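A compact sketch of trie-based, whole-word scanning as described; "darn" and "heck" are placeholder blocklist entries, and a production implementation would handle punctuation and Unicode more carefully:

```python
class TrieNode:
    __slots__ = ("children", "terminal")
    def __init__(self):
        self.children = {}
        self.terminal = False

def build_trie(words):
    root = TrieNode()
    for w in words:
        node = root
        for ch in w.lower():
            node = node.children.setdefault(ch, TrieNode())
        node.terminal = True
    return root

def scan(text, root):
    """Yield (start, end) spans of blocklisted words, whole-word matches only."""
    text = text.lower()
    n = len(text)
    for i in range(n):
        # Only start a match at a word boundary.
        if i > 0 and text[i - 1].isalnum():
            continue
        node, j = root, i
        while j < n and text[j] in node.children:
            node = node.children[text[j]]
            j += 1
            # Require a word boundary after the match as well.
            if node.terminal and (j == n or not text[j].isalnum()):
                yield (i, j)

root = build_trie(["darn", "heck"])
print(list(scan("What the heck, darn it!", root)))   # [(9, 13), (15, 19)]
```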
Advanced Pattern Recognition and AI Integration
Advanced pattern recognition in wordfilters surpasses basic keyword matching by incorporating regular expressions (regex) to identify obfuscated or variant forms of prohibited content, such as leetspeak substitutions (e.g., "f*ck" or "sh1t") and partial word embeddings within larger strings.[23][41] This approach uses predefined patterns to capture morphological variations, acronyms, and contextual embeddings that evade simple dictionaries, enabling detection in dynamic environments like online forums and gaming chats where users intentionally distort terms to circumvent filters.[42] Regex engines, often optimized for performance in languages like Java or Perl, scan input strings against compiled pattern sets, flagging matches based on boundary conditions to avoid overreach into innocuous text.[43] However, regex-based systems remain rule-dependent and prone to computational overhead with expansive pattern libraries, limiting scalability for real-time applications.[42]

Integration of artificial intelligence (AI) and machine learning (ML) elevates wordfilters by enabling contextual and semantic analysis, where models trained on vast datasets of labeled toxic and benign text classify content based on intent, sarcasm, or cultural nuances rather than surface-level matches.[44] Supervised ML algorithms, such as those employing natural language processing (NLP) techniques like bag-of-words or embeddings (e.g., BERT variants), achieve higher precision by learning from examples of evasive profanity, reducing false positives in scenarios where words like "ass" appear in legitimate contexts (e.g., "assassin").[45] Hybrid systems combine regex for initial triage with ML classifiers for verification, as seen in libraries like check-swear, which leverage both to filter profanity in text communication.[41] Commercial implementations, such as WebPurify's API, incorporate AI-driven moderation to handle multilingual obscenity and evolving slang, processing inputs through neural networks that adapt via retraining on user feedback loops.[46]

Recent advancements include large language models (LLMs) for profanity detection, which generate probabilistic assessments of toxicity by evaluating entire sentences or dialogues, outperforming traditional methods in capturing subtle harassment or hate speech embedded without explicit swear words.[47] For instance, Azure OpenAI's content filtering system integrates safety classifiers alongside core models to preemptively block harmful generations, categorizing risks like hate or violence with configurable severity thresholds updated as of September 2025.[48] In educational platforms, custom ML solutions built on Amazon SageMaker have demonstrated improved accuracy over rule-based filters, achieving better recall for student-generated content by incorporating multimodal data like text sentiment.[49] Despite these gains, AI systems require ongoing dataset curation to mitigate biases in training data, which can skew detection toward certain dialects or amplify overfiltering in underrepresented languages.[50] Empirical evaluations, such as those in 2023 studies on explainable profanity detection, highlight that while AI enhances adaptability, interpretability remains a challenge for auditing false negatives in high-stakes moderation.[50]
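As a hedged illustration of the supervised-ML approach (not the classifiers used by any of the cited products), the following toy scikit-learn pipeline uses character n-grams, which tolerate simple obfuscations; the tiny training set is invented for demonstration only:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy labeled corpus: 1 = toxic, 0 = benign. A production system would train on a
# large annotated dataset; character n-grams help with spellings like "d@rn".
train_texts = [
    "you are a darn idiot", "shut up you clown", "what a d@rn loser",
    "thanks for the detailed answer", "great point, I agree", "see the docs for details",
]
train_labels = [1, 1, 1, 0, 0, 0]

model = make_pipeline(
    TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4)),
    LogisticRegression(),
)
model.fit(train_texts, train_labels)

# Likely [1 0] on this toy data, though so small a training set gives no guarantees.
print(model.predict(["you absolute d4rn clown", "thanks, that helps"]))
```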
Operational Limitations
False Positives and Overfiltering
False positives in wordfilters arise when legitimate content is erroneously blocked due to simplistic pattern-matching algorithms that prioritize substring detection over contextual analysis, often flagging harmless words containing profane substrings. This leads to overfiltering, where the system's sensitivity disrupts normal discourse without effectively curbing intended violations.[51] A prominent illustration is the Scunthorpe problem, named after the UK town whose name triggered blocks in early internet filters because it embeds the substring "cunt," preventing residents from registering accounts or sending emails through services like AOL in 1996. Similar issues affect place names such as Penistone (flagged for "penis") and words like "assassin" or "bass," where "ass" prompts censorship, rendering phrases like "kill the assassin" unusable in filtered chats.[12][52]

In gaming and forum environments, overfiltering manifests frequently; for example, Warframe's profanity filter has censored innocent terms unrelated to obscenity, drawing user complaints about its overzealous nature and lack of nuance. Likewise, in The Lord of the Rings Online, the basic filter catches excessive false positives, such as everyday words, without accommodating word boundaries or intent, prompting players to disable it where possible.[53][54] These incidents highlight how regex-based systems, common in early wordfilter deployments, amplify errors by treating partial matches as wholes, frustrating users and eroding trust in moderation tools.[55]

Consequences include hindered communication in real-time settings, where blocked sentences force rephrasing or silence, and increased circumvention attempts that undermine the filter's purpose. Advanced implementations mitigate this via whole-word matching or machine learning for context, but legacy systems in forums and games persist with high false-positive rates due to implementation simplicity.[56][57]
User Circumvention Techniques
Users employ various obfuscation methods to evade keyword-based wordfilters, primarily by altering the visual or structural representation of prohibited terms without changing their semantic intent. One prevalent technique involves leetspeak or character substitution, where letters in offensive words are replaced with visually similar numbers or symbols, such as substituting 'a' with '@', 'e' with '3', or 'i' with '1' to form variants like "f@ck" or "sh1t".[27][58] This approach exploits the limitations of simple string-matching algorithms that fail to normalize such substitutions, a circumvention noted as early as 2008 in developer discussions where users rapidly adapted after initial filter deployment.[23]

Another common evasion strategy is inserting non-alphabetic characters or spaces within words, such as "f u c k" or "sh-it", which disrupts exact-match detection while preserving readability for human recipients.[27] Advanced variants include embedding invisible Unicode characters, like the soft hyphen (U+00AD, inserted via Alt+0173 on Windows), to split words without visible alteration, as documented in gaming forums and filter evasion tools.[59] Misspellings, phonetic approximations (e.g., "fuhk"), or transliterations into foreign scripts further compound these issues, allowing users to convey intent through contextual inference rather than direct keywords.[27]

More sophisticated techniques leverage Unicode homoglyphs—characters from diverse scripts that visually mimic Latin letters, such as Cyrillic 'а' (U+0430) resembling 'a'—to construct undetectable profanity, as seen in tools designed for evading platform moderators on sites like Discord or Roblox.[60] Right-to-left (RTL) overrides (U+202E) can reverse word rendering, displaying filtered terms backwards while the underlying string matches innocently forward, a method reported in forum software vulnerabilities as of 2014.[61] Emojis or symbols as proxies (e.g., 🍆 for phallic references) and euphemistic phrasing, like indirect synonyms, represent semantic evasion, shifting reliance from lexical to contextual analysis that basic filters cannot perform.[27][62] These methods persist across gaming chats and wiki edits, where users iteratively test boundaries, underscoring the cat-and-mouse dynamic between filters and circumvention.[23]
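Defenses against these tricks usually normalize text before matching. A minimal Python sketch that strips invisible format characters (soft hyphens, zero-width spaces, RTL overrides), folds a few Cyrillic homoglyphs, and undoes common leetspeak substitutions; the mappings are illustrative, not exhaustive:

```python
import unicodedata

LEET_MAP = str.maketrans({"@": "a", "4": "a", "3": "e", "1": "i", "0": "o", "$": "s", "5": "s", "7": "t"})
HOMOGLYPHS = {"\u0430": "a", "\u0435": "e", "\u043e": "o", "\u0440": "p", "\u0441": "c"}  # Cyrillic look-alikes

def normalize(text: str) -> str:
    # 1. Drop invisible format characters (category Cf: soft hyphen, zero-width space, RTL override).
    text = "".join(ch for ch in text if unicodedata.category(ch) != "Cf")
    # 2. Map listed Cyrillic homoglyphs to their Latin look-alikes.
    text = "".join(HOMOGLYPHS.get(ch, ch) for ch in text)
    # 3. Undo common leetspeak substitutions and lowercase.
    return text.lower().translate(LEET_MAP)

print(normalize("d\u00ad@rn"))   # "darn" - soft hyphen stripped, "@" mapped to "a"
print(normalize("d\u0430rn"))    # "darn" - Cyrillic "а" mapped to Latin "a"
```

The normalized string is then fed to whichever blocklist or pattern matcher the platform already uses.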
Controversies and Ethical Debates
Censorship and Free Speech Implications
Wordfilters, as automated mechanisms for blocking predefined terms, inherently limit linguistic expression to enforce community standards or legal compliance, prompting debates over their alignment with free speech principles. While such tools effectively curb overt profanity and spam, they can extend to suppressing context-dependent or innocuous usage, fostering a chilling effect where users preemptively alter language to evade detection. This automated enforcement, often opaque in its algorithmic design, raises ethical questions about disproportionate restriction on discourse, particularly in platforms serving as modern public squares.[63][64]

In governmental contexts, wordfilter deployment intersects directly with constitutional protections. A 2021 federal court decision held that a police department's activation of Facebook's strong profanity filter—which automatically hid comments containing terms like "pig" and "jerk"—violated the First Amendment by viewpoint-discriminatorily suppressing public criticism without human review.[65][66] Such rulings underscore that public entities cannot leverage private tools to evade scrutiny, as keyword-based blocking risks capturing protected political speech. Private platforms, unbound by the First Amendment, retain discretion to moderate via wordfilters, yet this prerogative has drawn criticism for enabling de facto censorship of controversial viewpoints under the guise of neutrality.[67]

Broader implications extend to the erosion of intellectual freedom, where rigid word blocking impedes exposure to challenging ideas or historical discourse. For example, filters have historically flagged terms with dual meanings—such as medical references or reclaimed slang—effectively narrowing informational access in libraries, schools, and online forums.[64] Advocates for unrestricted expression, including organizations like the ACLU, warn that expanding filter scopes to "offensive" language amplifies risks of silencing marginalized voices or stifling debate, as platforms prioritize harm prevention over comprehensive dialogue.[68] Conversely, defenders frame wordfilters as editorial tools integral to platform viability, arguing that unchecked vitriol undermines user trust and engagement, though empirical critiques highlight frequent overreach without corresponding evidence of reduced toxicity.[69] These tensions reflect a causal trade-off: while wordfilters mitigate immediate offenses, their blunt implementation can distort public conversation, privileging algorithmic efficiency over nuanced human judgment and potentially entrenching biases in filter training data. Ongoing legal and policy scrutiny, including FTC inquiries into moderation practices, signals growing recognition of these imbalances, yet resolutions remain elusive amid competing imperatives of safety and openness.[70][71]
Effectiveness and Unintended Consequences
Wordfilters demonstrate partial effectiveness in reducing overt profanity in online environments, with keyword-based systems achieving detection rates of up to 80-90% for exact matches in controlled tests, but performance drops significantly against contextual variations or obfuscated language.[72] For instance, deep learning approaches integrated into filters have shown improved accuracy in spoken foul language detection, yet real-world deployment reveals limitations in handling dialects or slang, resulting in recall rates below 70% for non-standard profanity. Empirical evaluations indicate that while filters mitigate basic spam and vandalism, they often fail to address nuanced toxicity, as many harmful statements lack explicit banned words, leading to underfiltering of subtle harassment.[38]

Unintended consequences include high rates of false positives, where innocuous terms containing prohibited substrings—such as place names like "Scunthorpe" triggering blocks due to embedded profanity—are erroneously censored, disrupting legitimate communication and user trust.[73] These overfiltering errors, documented in profanity detection models with false positive rates exceeding 10-15% in diverse datasets, foster user frustration and reduced platform engagement, as evidenced by analyses showing banned word lists inadvertently suppressing fan interactions in moderated communities.[50][74]

Moreover, filters incentivize circumvention techniques like leetspeak (e.g., replacing letters with numbers or symbols) or invisible character insertion, which not only evade detection but can amplify toxicity by normalizing evasive, coded language that evades human oversight.[27] In toxicity moderation, reliance on profanity-heavy models leads to contextual misclassifications, such as flagging positive uses of swear words while missing non-profane aggression, thereby distorting dialogue and potentially eroding perceived fairness in enforcement.[75] Studies on deplatforming and filtering strategies highlight trade-offs, where aggressive word-based blocking curtails overt harm but risks broader chilling effects on expression, with empirical data from social platforms indicating unintended declines in user retention due to perceived overreach.[76] Overall, while wordfilters provide a foundational layer for quality control, their mechanistic limitations—prioritizing pattern matching over semantic understanding—often yield cascading issues that undermine long-term moderation efficacy.
Modern Applications and Evolutions
Deployment in Social Media and Gaming Platforms
Wordfilters are deployed on major social media platforms such as Facebook, Instagram, and Twitter (now X) primarily through built-in keyword blocking features that allow users or administrators to automatically hide or flag comments containing specified offensive or spammy terms.[77] These systems scan incoming text in real-time, replacing prohibited words with asterisks or removing the content entirely, as part of broader automated moderation to curb hate speech, harassment, and spam while reducing reliance on human reviewers.[78] For instance, Instagram and Facebook enable account holders to create custom muted keyword lists, which filter out posts or comments matching those terms from appearing in feeds or notifications, a feature rolled out progressively since around 2018 to empower user-led moderation.[79]

In gaming platforms, wordfilters are integral to chat systems, enforcing community guidelines by preemptively blocking profanity, slurs, and disruptive language to foster safer multiplayer environments, particularly for younger audiences. Roblox employs a server-side text filtering API that scans all user-generated chat messages, prohibiting transmission of offensive terms or personally identifiable information like phone numbers, with updates as recent as 2024 allowing limited opt-outs for verified group chats under strict criteria.[80] Similarly, Steam introduced a client-side profanity filter in August 2020, which automatically obscures commonly flagged strong language and slurs in in-game and community chats by replacing them with symbols, configurable via user settings to balance censorship with expression.[81] Platforms like League of Legends integrate toggleable language filters in their client software, allowing players to enable full profanity blocking or view unfiltered chat, as documented in official tutorials from 2023, though this risks exposing users to unmoderated toxicity in competitive matches.[82] Supercell titles, including Brawl Stars, deploy mandatory wordfilters as a baseline defense, detecting and muting harmful language across supported languages to prevent griefing, with the system prioritizing prevention over post-facto penalties.[3]

These implementations often combine simple regex-based keyword matching with contextual checks, but deployment varies by platform scale—social media emphasizes scalability for billions of daily posts, while gaming focuses on low-latency real-time filtering to avoid disrupting gameplay flow.[22]
Recent Innovations and Tools
In recent years, wordfilters have evolved from static keyword lists to dynamic machine learning models capable of contextual analysis, detecting obfuscated profanity such as leetspeak or intentional misspellings that evade traditional matching. This shift addresses limitations in rigid systems by training on diverse datasets to recognize intent and variations, improving accuracy in real-time applications like online gaming and social platforms.[83] A notable advancement includes an enhanced profanity filtering algorithm applied to Minecraft chats, which integrates hate speech detection and achieves 97.2% accuracy for leetspeak-masked terms, compared to 71.3% for prior methods, through token matching against expanded swear word lists including regional dialects.[84] Similarly, multilingual models like the Hinglish Profanity Filter target code-mixed languages common in social media, combining rule-based and probabilistic approaches to flag hybrid English-vernacular slurs.[85]

Commercial tools have proliferated, with Azure AI Content Moderator's updated text moderation API, released in June 2025, enabling scalable filtering of profanity in chat rooms, forums, and e-commerce via customizable machine learning classifiers that score content for severity.[86] OpenAI's 2024 omni-moderation model extends this to 40 languages, using fine-tuned transformers for nuanced detection beyond explicit words, though external benchmarks note occasional overfiltering of non-toxic slang.[87] Hybrid solutions, such as dynamic filtering via API integrations with large language models like ChatGPT, automate real-time censorship by prompting contextual evaluation, reducing reliance on predefined dictionaries.[88] Specialized APIs like Greip's Profanity Detection and WebPurify's AI filter further innovate by incorporating natural language processing for subtle toxicity, including bias and trolling, with WebPurify emphasizing hate speech and bullying in addition to curses for broader content safety.[89][46] These tools prioritize adaptability, with continuous retraining on user feedback to minimize false positives, though empirical evaluations highlight persistent challenges in cultural nuance across demographics.[22]
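As an illustration of the API-based approach, the following is a hedged sketch using the OpenAI Python SDK and the omni-moderation model mentioned above; exact field names and availability may vary by SDK version, and an OPENAI_API_KEY environment variable is assumed:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.moderations.create(
    model="omni-moderation-latest",
    input="User-submitted comment to screen before it is posted.",
)

result = response.results[0]
if result.flagged:
    # categories is a boolean-per-category object; model_dump() turns it into a dict.
    categories = result.categories.model_dump()
    print("Held for review:", [name for name, hit in categories.items() if hit])
else:
    print("Publish")
```

The hosted classifier replaces a locally maintained blocklist, at the cost of network latency and per-request billing.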
References
- https://en.wiktionary.org/wiki/wordfilter
- https://meta.wikimedia.org/wiki/AbuseFilter
- https://www.mediawiki.org/wiki/Extension:AbuseFilter
