Commit a280601

Merge pull request #91 from 2Toad/jp-issue-87
Fixes #87: Add support for multiple languages
2 parents 21f43ab + 53ec34d commit a280601

13 files changed

Lines changed: 1207 additions & 511 deletions

.prettierignore

Lines changed: 1 addition & 0 deletions
@@ -9,3 +9,4 @@ node_modules
 .nvmrc
 .prettierignore
 *.md
+Dockerfile

README.md

Lines changed: 24 additions & 2 deletions
@@ -4,7 +4,7 @@
 [![Downloads](https://img.shields.io/npm/dm/@2toad/profanity.svg)](https://www.npmjs.com/package/@2toad/profanity)
 [![Build status](https://github.com/2toad/profanity/actions/workflows/ci.yml/badge.svg)](https://github.com/2Toad/Profanity/actions/workflows/nodejs.yml)
 
-A JavaScript profanity filter with full TypeScript support
+A multi-language profanity filter with full TypeScript support
 
 ## Getting Started
 
@@ -45,12 +45,35 @@ Create an instance of the Profanity class to change the default options:
 import { Profanity } from '@2toad/profanity';
 
 const profanity = new Profanity({
+  languages: ['de'],
   wholeWord: false,
   grawlix: '*****',
   grawlixChar: '$',
 });
 ```
 
+### languages
+
+By default, this is set to `['en']` (English). You can change the default to any [supported language](./supported-languages.md), including multiple languages:
+
+```JavaScript
+const profanity = new Profanity({
+  languages: ["en", "de"],
+});
+```
+
+You can override this option by specifying the languages in `exists` or `censor`:
+
+```JavaScript
+profanity.exists('Je suis un connard', ["fr"]);
+// true
+
+profanity.censor('I like big butts and je suis un connard', CensorType.Word, ["en", "de", "fr"]);
+// I like big @#$%&! and je suis un @#$%&!
+```
+
+If no languages are specified in the method call, it will use the languages specified in the options.
+
 ### wholeWord
 
 By default, this is set to `true` so profanity only matches on whole words:
@@ -112,7 +135,6 @@ profanity.censor('I like big butts and I cannot lie', CensorType.AllVowels);
 // I like big b$tts and I cannot lie
 ```
 
-
 ## Customize the word list
 
 Add words:
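
The `languages` section added to the README in the hunk above says that languages passed to `exists` or `censor` override the ones set in the options, and that the options are used when none are passed. A minimal sketch of that fallback rule follows. `FilterOptions` and `resolveLanguages` are hypothetical names for illustration, not part of the `@2toad/profanity` API, and treating an empty array as "not specified" is an assumption the README does not confirm.

```typescript
// Hypothetical types and helper, for illustration only (not the library's API).
interface FilterOptions {
  languages: string[];
}

// Per-call languages win when provided; otherwise fall back to the options.
// Assumption: an empty per-call array is treated the same as "not specified".
function resolveLanguages(options: FilterOptions, perCall?: string[]): string[] {
  return perCall !== undefined && perCall.length > 0 ? perCall : options.languages;
}

const options: FilterOptions = { languages: ["en", "de"] };

// No per-call languages: the configured defaults are used.
const fromOptions = resolveLanguages(options);

// Per-call languages replace the configured defaults for this call only.
const overridden = resolveLanguages(options, ["fr"]);
```

Here `fromOptions` is `["en", "de"]` and `overridden` is `["fr"]`; the options object itself is left unchanged by the override.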

package.json

Lines changed: 2 additions & 2 deletions
@@ -1,7 +1,7 @@
 {
   "name": "@2toad/profanity",
   "version": "2.5.0",
-  "description": "A JavaScript profanity filter with full TypeScript support",
+  "description": "A multi-language profanity filter with full TypeScript support",
   "homepage": "https://github.com/2Toad/Profanity",
   "author": "2Toad",
   "license": "MIT",
@@ -17,7 +17,7 @@
     "pretest": "npm run build",
     "test": "mocha -r ts-node/register tests/**/*.spec.ts",
     "test:watch": "npm run test -- --watch",
-    "benchmark": "ts-node src/benchmark/benchmark.ts",
+    "benchmark": "docker-compose -f ./src/benchmark/docker-compose.yml up --build",
     "lint": "eslint . --cache",
     "lint:fix": "eslint . --fix",
     "format": "prettier . --write",

src/benchmark/Dockerfile

Lines changed: 11 additions & 0 deletions
@@ -0,0 +1,11 @@
+FROM node:20.17-alpine3.19
+
+WORKDIR /app
+
+COPY ../../package*.json ./
+
+RUN npm install
+
+COPY ../../ ./
+
+CMD ["npx", "ts-node", "src/benchmark/benchmark.ts"]

src/benchmark/benchmark.ts

Lines changed: 4 additions & 0 deletions
@@ -64,6 +64,10 @@ const { smallCleanText, smallProfaneText, largeCleanText, largeProfaneText } = t
 const defaultProfanity = new Profanity();
 const partialMatchProfanity = new Profanity({ wholeWord: false });
 
+// Pre-cache regexes
+defaultProfanity.exists("foo");
+partialMatchProfanity.exists("bar");
+
 // Benchmark exists() function
 suite
   .add("exists() - small clean text", () => {
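
The `// Pre-cache regexes` lines in the hunk above call `exists()` once on each instance before the suite runs. The comment suggests the regexes are built lazily and then cached, so the warm-up keeps compilation out of the timed runs. The sketch below shows that general pattern under that assumption. `CachedMatcher`, its `builds` counter, and the one-word list are hypothetical and are not taken from the library's source.

```typescript
// Hypothetical sketch of a lazily built, cached regex per language set.
// This is not the library's implementation; it only illustrates the pattern.
class CachedMatcher {
  private readonly words: Record<string, string[]>;
  private readonly cache = new Map<string, RegExp>();
  builds = 0; // counts regex compilations, to make the caching observable

  constructor(words: Record<string, string[]>) {
    this.words = words;
  }

  private regexFor(languages: string[]): RegExp {
    // Order-independent cache key, e.g. ["de", "en"] and ["en", "de"] share one entry.
    const key = [...languages].sort().join(",");
    const cached = this.cache.get(key);
    if (cached) return cached;

    const escaped: string[] = [];
    for (const lang of languages) {
      for (const word of this.words[lang] ?? []) {
        // Escape regex metacharacters so words are matched literally.
        escaped.push(word.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"));
      }
    }
    // With no words, use a pattern that never matches.
    const regex =
      escaped.length > 0 ? new RegExp(`\\b(?:${escaped.join("|")})\\b`, "i") : /(?!)/;
    this.builds += 1;
    this.cache.set(key, regex);
    return regex;
  }

  exists(text: string, languages: string[] = ["en"]): boolean {
    return this.regexFor(languages).test(text);
  }
}

const matcher = new CachedMatcher({ en: ["butts"] });
matcher.exists("foo"); // warm-up call: compiles and caches the "en" regex
const buildsAfterWarmUp = matcher.builds;
const found = matcher.exists("I like big butts"); // reuses the cached regex
```

After the warm-up call, later calls with the same languages reuse the cached regex, so a benchmark that starts afterwards measures matching only.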

src/benchmark/docker-compose.yml

Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
+services:
+  benchmark:
+    build:
+      context: ../..
+      dockerfile: ./src/benchmark/Dockerfile
+    deploy:
+      resources:
+        limits:
+          cpus: "1.0"
+          memory: 512M
+        reservations:
+          cpus: "1.0"
+          memory: 512M

src/benchmark/results.md

Lines changed: 30 additions & 13 deletions
@@ -3,24 +3,41 @@
 ## Test Environment
 
 - **OS**: Windows 11 - WSL2 (Ubuntu 22.04.4 LTS)
-- **CPU**: AMD Ryzen 9 5900HX 3.30 GHz
-- **RAM**: 64 GB
+- **CPU**: AMD Ryzen 9 5900HX 3.30 GHz (Benchmark constrained to 1 CPU core)
+- **RAM**: 64 GB (Benchmark constrained to 512 MB)
 
 ### Benchmarks
 
+#### v3.0.0
+```
+Using test data: v1
+exists() - small clean text x 2,263,763 ops/sec ±3.96% (83 runs sampled)
+exists() - small profane text x 1,831,670 ops/sec ±3.09% (86 runs sampled)
+exists() - large clean text x 38,185 ops/sec ±2.82% (84 runs sampled)
+exists() - large profane text x 686,951 ops/sec ±2.11% (87 runs sampled)
+exists() - partial match, small profane text x 1,624,503 ops/sec ±8.02% (78 runs sampled)
+censor() - Word, small profane text x 915,620 ops/sec ±6.16% (83 runs sampled)
+censor() - FirstChar, small profane text x 1,275,945 ops/sec ±2.68% (77 runs sampled)
+censor() - FirstVowel, small profane text x 902,065 ops/sec ±3.43% (81 runs sampled)
+censor() - AllVowels, small profane text x 942,445 ops/sec ±2.94% (84 runs sampled)
+censor() - Word, large profane text x 5,578 ops/sec ±2.17% (86 runs sampled)
+censor() - partial match, Word, small profane text x 869,941 ops/sec ±7.91% (82 runs sampled)
+Fastest: exists() - small clean text
+```
+
 #### v2.4.0
 ```
 Using test data: v1
-exists() - small clean text x 7,384,356 ops/sec ±1.24% (95 runs sampled)
-exists() - small profane text x 6,347,800 ops/sec ±1.25% (90 runs sampled)
-exists() - large clean text x 49,978 ops/sec ±0.56% (93 runs sampled)
-exists() - large profane text x 1,216,505 ops/sec ±2.03% (81 runs sampled)
-exists() - partial match, small profane text x 5,319,125 ops/sec ±1.04% (93 runs sampled)
-censor() - Word, small profane text x 1,899,374 ops/sec ±0.54% (95 runs sampled)
-censor() - FirstChar, small profane text x 3,233,749 ops/sec ±1.40% (87 runs sampled)
-censor() - FirstVowel, small profane text x 1,894,666 ops/sec ±0.92% (92 runs sampled)
-censor() - AllVowels, small profane text x 1,697,305 ops/sec ±2.07% (92 runs sampled)
-censor() - Word, large profane text x 9,563 ops/sec ±0.91% (87 runs sampled)
-censor() - partial match, Word, small profane text x 1,597,856 ops/sec ±1.19% (92 runs sampled)
+exists() - small clean text x 3,838,466 ops/sec ±3.34% (81 runs sampled)
+exists() - small profane text x 2,557,317 ops/sec ±7.47% (74 runs sampled)
+exists() - large clean text x 41,031 ops/sec ±2.82% (83 runs sampled)
+exists() - large profane text x 799,283 ops/sec ±2.16% (83 runs sampled)
+exists() - partial match, small profane text x 3,013,455 ops/sec ±5.68% (88 runs sampled)
+censor() - Word, small profane text x 1,328,481 ops/sec ±2.17% (86 runs sampled)
+censor() - FirstChar, small profane text x 2,197,796 ops/sec ±5.86% (84 runs sampled)
+censor() - FirstVowel, small profane text x 1,184,065 ops/sec ±4.31% (75 runs sampled)
+censor() - AllVowels, small profane text x 1,105,599 ops/sec ±7.69% (77 runs sampled)
+censor() - Word, large profane text x 5,594 ops/sec ±6.02% (85 runs sampled)
+censor() - partial match, Word, small profane text x 1,031,901 ops/sec ±2.86% (81 runs sampled)
 Fastest: exists() - small clean text
 ```
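
The two result blocks above can be compared line by line. The sketch below parses the `ops/sec` figure from one result line per version and computes the relative change, using the `exists() - small clean text` rows from the updated v2.4.0 and new v3.0.0 runs. `parseOpsPerSec` and `percentChange` are hypothetical helpers written for this note, and the line format is assumed to match the output shown in `results.md`.

```typescript
// Hypothetical helpers for comparing two result lines; not part of the repository.
interface BenchmarkLine {
  name: string;
  ops: number;
}

// Parses lines shaped like: "<name> x 1,234,567 ops/sec ±1.23% (NN runs sampled)".
function parseOpsPerSec(line: string): BenchmarkLine | undefined {
  const match = /^(.*) x ([\d,]+) ops\/sec/.exec(line);
  if (!match) return undefined;
  const name = match[1] ?? "";
  const rawOps = match[2] ?? "";
  return { name, ops: Number(rawOps.replace(/,/g, "")) };
}

// Relative change from `before` to `after`, in percent (negative means slower).
function percentChange(before: number, after: number): number {
  return ((after - before) / before) * 100;
}

const v240 = parseOpsPerSec(
  "exists() - small clean text x 3,838,466 ops/sec ±3.34% (81 runs sampled)",
);
const v300 = parseOpsPerSec(
  "exists() - small clean text x 2,263,763 ops/sec ±3.96% (83 runs sampled)",
);

// Roughly -41 for this pair of rows.
const change = v240 && v300 ? percentChange(v240.ops, v300.ops) : Number.NaN;
```

For this row the v3.0.0 throughput is about 41% lower than the constrained v2.4.0 run. Each figure carries a reported margin of roughly ±3 to ±4%, so the result is an approximate comparison rather than an exact one.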
