Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Showing 36 changed files with 279,504 additions and 3 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -30,3 +30,5 @@ lu-mir-zeeguu-credentials.json | |
|
||
zenv* | ||
tools/_playground.py | ||
|
||
!zeeguu/core/word_filter/data/ |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
## Sources | ||
|
||
`bad-word-list` is cloned from: https://github.com/LDNOOBW/List-of-Dirty-Naughty-Obscene-and-Otherwise-Bad-Words | ||
|
||
Both `name-list.txt` and `city-names.txt` are from: https://github.com/FinNLP |
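
A minimal sketch of how files like these might be read, assuming each list is a UTF-8 text file with one word or phrase per line (as the lists shown in this commit appear to be). The helper name `load_word_list` and the demo file are hypothetical and not part of the commit.

```python
import tempfile
from pathlib import Path


def load_word_list(path):
    """Read a one-entry-per-line list into a set of case-folded strings.

    Hypothetical helper: assumes UTF-8 text with one word or phrase per line.
    """
    entries = set()
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            entry = line.strip()
            if entry:  # skip blank lines
                entries.add(entry.casefold())
    return entries


if __name__ == "__main__":
    # Demo on a throwaway file so the sketch runs without the real data.
    with tempfile.TemporaryDirectory() as tmp:
        demo = Path(tmp) / "demo-list.txt"
        demo.write_text("Alice\n\nNew York\nalice\n", encoding="utf-8")
        print(sorted(load_word_list(demo)))
```

Case-folding on load means that mixed-case entries compare equal to lowercase input, at the cost of losing the original casing.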
Large diffs are not rendered by default.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,58 @@ | ||
# Our List of Dirty, Naughty, Obscene, and Otherwise Bad Words # | ||
|
||
With millions of images in our library and billions of user-submitted keywords, we work hard at Shutterstock to make sure that bad words don't show up in places they shouldn't. This repo contains a list of words that we use to filter results from our autocomplete server and recommendation engine. | ||
|
||
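
The filtering described above can be sketched as follows. This is an illustration under stated assumptions, not Shutterstock's implementation: the names `BLOCKED`, `is_blocked`, and `filter_suggestions` are hypothetical, the blocked entries are mild placeholders, and entries are assumed to be lowercase words or space-separated phrases.

```python
import re

# Placeholder entries for illustration only; a real setup would load a full list.
BLOCKED = {"darn", "heck", "gosh darn"}


def is_blocked(text, blocked=BLOCKED):
    """Return True if any blocked word or phrase occurs in text as whole words."""
    # Case-fold, drop punctuation, and collapse whitespace to single spaces.
    normalized = " ".join(re.findall(r"\w+", text.casefold()))
    padded = f" {normalized} "
    # Padding with spaces makes each entry match only on word boundaries.
    return any(f" {entry} " in padded for entry in blocked)


def filter_suggestions(suggestions, blocked=BLOCKED):
    """Keep only the suggestions that contain no blocked entry."""
    return [s for s in suggestions if not is_blocked(s, blocked)]


if __name__ == "__main__":
    candidates = [
        "sunset beach",
        "what the heck",
        "heckle the comedian",
        "Gosh  darn it",
    ]
    print(filter_suggestions(candidates))
```

Matching on whole words keeps harmless strings such as "heckle" from being dropped, while multi-word entries are still caught. The approach relies on whitespace between words, so it would need a different matching strategy for scripts written without spaces, such as Chinese, Japanese, or Thai.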
Please add to it as you see fit (particularly in non-English languages) or use it to spice up your next game of Scrabble :) | ||
|
||
Obvious warning: These lists contain material that many will find offensive. (But that's the point!) | ||
|
||
Miscellaneous caveat: Clearly, what goes in these lists is subjective. In our case, the question we use is, "What wouldn't we want to *suggest* that people look at?" This of course varies across cultures, languages, and geographies, so in the end we just have to make our best guess. | ||
|
||
## Languages | ||
|
||
| Name | Code | | ||
| ---------------------------------- | ----------------- | | ||
| [Arabic](ar) | ar | | ||
| [Chinese](zh) | zh | | ||
| [Czech](cs) | cs | | ||
| [Danish](da) | da | | ||
| [Dutch](nl) | nl | | ||
| [English](en) | en | | ||
| [Esperanto](eo) | eo | | ||
| [Filipino](fil) | fil | | ||
| [Finnish](fi) | fi | | ||
| [French](fr) | fr | | ||
| [French (CA)](fr-CA-u-sd-caqc) | fr-CA-u-sd-caqc | | ||
| [German](de) | de | | ||
| [Hindi](hi) | hi | | ||
| [Hungarian](hu) | hu | | ||
| [Italian](it) | it | | ||
| [Japanese](ja) | ja | | ||
| [Kabyle](kab) | kab | | ||
| [Klingon](tlh) | tlh | | ||
| [Korean](ko) | ko | | ||
| [Norwegian](no) | no | | ||
| [Persian](fa) | fa | | ||
| [Polish](pl) | pl | | ||
| [Portuguese](pt) | pt | | ||
| [Russian](ru) | ru | | ||
| [Spanish](es) | es | | ||
| [Swedish](sv) | sv | | ||
| [Thai](th) | th | | ||
| [Turkish](tr) | tr | | ||
|
||
See also the [list of projects, documents, and organizations](USERS.md) that use these lists. | ||
|
||
## Node Module | ||
|
||
If you are using the word lists as `.json`, or in an `npm` project, you can install the word list using the [naughty-words](https://github.com/LDNOOBW/naughty-words-js) package. | ||
|
||
```bash | ||
npm install naughty-words | ||
``` | ||
|
||
© 2012–2020 Shutterstock, Inc. | ||
|
||
[![Creative Commons License](http://i.creativecommons.org/l/by/4.0/80x15.png)](http://creativecommons.org/licenses/by/4.0/) | ||
|
||
This work is licensed under a [Creative Commons Attribution 4.0 International License](http://creativecommons.org/licenses/by/4.0/). |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,31 @@ | ||
# Users of these lists | ||
|
||
The following projects, documents, and organizations use these lists of dirty, | ||
naughty, obscene, and otherwise bad words. To contribute additional uses, please | ||
either [create an issue](https://github.com/shutterstock/List-of-Dirty-Naughty-Obscene-and-Otherwise-Bad-Words/issues/new) | ||
or send a pull request. | ||
|
||
## Projects | ||
|
||
* [jQuery.ProfanityFilter](https://github.com/ChaseFlorell/jQuery.ProfanityFilter): | ||
jQuery plugin to filter out profane words on the client. | ||
* [grunt-naughty-words](https://www.npmjs.com/package/grunt-naughty-words): | ||
Node.js npm library that provides a grunt task for creating an array of | ||
profane words, and a regex to test against, as well as the capability to | ||
compile your own obscene word list. | ||
* [Arena Metrics](http://www.arena.co.ke): | ||
Arena Metrics uses these lists to prevent users from popping offensive words | ||
into polls and surveys. | ||
* [osm-compare](https://github.com/mapbox/osm-compare) uses these lists to flag profanity in name tags when finding what changed during a feature edit on OpenStreetMap. | ||
<!-- * [Liquidata](https://www.dolthub.com/repositories/Liquidata/bad-words) uses these lists to create demo SQL tables for [Dolt](https://www.reddit.com/r/git/comments/cdl022/dolt_its_git_for_data/) consumers. --> | ||
|
||
## Documents | ||
|
||
* [“Bad words” filter](http://stackoverflow.com/questions/24515/bad-words-filter) | ||
on Stack Overflow | ||
* [How do you implement a good profanity filter?](http://stackoverflow.com/questions/273516/how-do-you-implement-a-good-profanity-filter) | ||
on Stack Overflow | ||
|
||
## Organizations | ||
|
||
* [Shutterstock, Inc.](http://code.shutterstock.com/) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,38 @@ | ||
سكس | ||
طيز | ||
شرج | ||
لعق | ||
لحس | ||
مص | ||
تمص | ||
بيضان | ||
ثدي | ||
بز | ||
بزاز | ||
حلمة | ||
مفلقسة | ||
بظر | ||
كس | ||
فرج | ||
شهوة | ||
شاذ | ||
مبادل | ||
عاهرة | ||
جماع | ||
قضيب | ||
زب | ||
لوطي | ||
لواط | ||
سحاق | ||
سحاقية | ||
اغتصاب | ||
خنثي | ||
احتلام | ||
نيك | ||
متناك | ||
متناكة | ||
شرموطة | ||
عرص | ||
خول | ||
قحبة | ||
لبوة |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,41 @@ | ||
bordel | ||
buzna | ||
čumět | ||
čurák | ||
debil | ||
do piče | ||
do prdele | ||
dršťka | ||
držka | ||
flundra | ||
hajzl | ||
hovno | ||
chcanky | ||
chuj | ||
jebat | ||
kokot | ||
kokotina | ||
koňomrd | ||
kunda | ||
kurva | ||
mamrd | ||
mrdat | ||
mrdka | ||
mrdník | ||
oslošoust | ||
piča | ||
píčus | ||
píchat | ||
pizda | ||
prcat | ||
prdel | ||
prdelka | ||
sračka | ||
srát | ||
šoustat | ||
šulin | ||
vypíčenec | ||
zkurvit | ||
zkurvysyn | ||
zmrd | ||
žrát |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,20 @@ | ||
anus | ||
bøsserøv | ||
cock | ||
fisse | ||
fissehår | ||
fuck | ||
hestepik | ||
kussekryller | ||
lort | ||
luder | ||
pik | ||
pikhår | ||
pikslugeri | ||
piksutteri | ||
pis | ||
røv | ||
røvhul | ||
røvskæg | ||
røvspræke | ||
shit |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,66 @@ | ||
analritter | ||
arsch | ||
arschficker | ||
arschlecker | ||
arschloch | ||
bimbo | ||
bratze | ||
bumsen | ||
bonze | ||
dödel | ||
fick | ||
ficken | ||
flittchen | ||
fotze | ||
fratze | ||
hackfresse | ||
hure | ||
hurensohn | ||
ische | ||
kackbratze | ||
kacke | ||
kacken | ||
kackwurst | ||
kampflesbe | ||
kanake | ||
kimme | ||
lümmel | ||
MILF | ||
möpse | ||
morgenlatte | ||
möse | ||
mufti | ||
muschi | ||
nackt | ||
neger | ||
nigger | ||
nippel | ||
nutte | ||
onanieren | ||
orgasmus | ||
penis | ||
pimmel | ||
pimpern | ||
pinkeln | ||
pissen | ||
pisser | ||
popel | ||
poppen | ||
porno | ||
reudig | ||
rosette | ||
schabracke | ||
schlampe | ||
scheiße | ||
scheisser | ||
schiesser | ||
schnackeln | ||
schwanzlutscher | ||
schwuchtel | ||
tittchen | ||
titten | ||
vögeln | ||
vollpfosten | ||
wichse | ||
wichsen | ||
wichser |