Talk:Spamwords

From OHRRPGCE-Wiki

Old discussion archived here


Bob the Hamster Oops! Some of my regexes were wrong. A dash inside square brackets has to be at the beginning (or at the end, or escaped); otherwise it gets interpreted as a range.
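To illustrate the dash problem, here is a minimal sketch in Python's `re` module. The wiki's actual spamword patterns are not shown on this page, so the character classes below are made-up examples, and the filter itself may well use a different regex engine; the dash rule shown here is the same in the common engines.

```python
import re

# Hypothetical character classes, not the wiki's real spamword patterns.
literal_dash = re.compile(r"[-az]")    # dash first: a literal "-", "a", or "z"
trailing_dash = re.compile(r"[az-]")   # dash last: also a literal dash
escaped_dash = re.compile(r"[a\-z]")   # escaped dash: literal anywhere
range_dash = re.compile(r"[a-z]")      # dash in the middle: the range a through z


def matches(pattern, text):
    """Return True if the whole text matches the pattern."""
    return pattern.fullmatch(text) is not None


# The range form accepts every lowercase letter but not the dash itself.
print(matches(range_dash, "m"), matches(range_dash, "-"))
# The literal forms accept the dash but not letters between a and z.
print(matches(literal_dash, "-"), matches(literal_dash, "m"))
```

A pattern meant to catch a dash that is accidentally written as a range will both miss the dash and match far more characters than intended.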

The new code is installed. I saw the notification from your test, which was blocked even though you did not add any banned text. This is the same problem that previously made me believe the diff was showing the last two changes rather than only the last one. You see, if the very last line in a page contains spammy text (like "viagra" from the previous test) and you add harmless text after it, technically you have changed that last line because it now has a newline \n at the end of it, so it gets included in the changed diff.
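The trailing-newline effect described above can be reproduced with a small sketch. This is not the wiki's diff code (which is not shown here); it is an assumed line-based diff built on Python's `difflib`, and the function name `added_lines` is hypothetical.

```python
import difflib


def added_lines(old, new, keepends):
    """Return the lines of `new` that a line-based diff reports as added or changed."""
    old_lines = old.splitlines(keepends=keepends)
    new_lines = new.splitlines(keepends=keepends)
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    added = []
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        if tag in ("replace", "insert"):
            added.extend(new_lines[j1:j2])
    return added


old_page = "buy viagra"                  # last line has no trailing newline
new_page = "buy viagra\nharmless text"   # harmless text appended after it

# Comparing lines with their endings: the old last line counts as changed,
# so the spammy word shows up in the "added" text.
print(added_lines(old_page, new_page, keepends=True))

# Comparing lines without their endings: only the new line is reported.
print(added_lines(old_page, new_page, keepends=False))
```

Stripping line endings before comparing (or making sure the stored page always ends with a newline) keeps the untouched last line out of the diff.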

In spite of that minor flaw, your new diff code is great! Now the chances of an already-rare false positive are vastly lower.

Mike C. Aaaahhhh... Ok then. I'll commit a patch to fix that. Although, MediaWiki should already stick a newline at the end... Anyway, it's good to know that I'm not insane, something that I've been questioning recently...


Mike C. Can we filter all hyperlinks yet? "https?:\/\/"... it's tempting, yes it is...

Bob the Hamster Ah, yes, actually. Requiring a user to be logged-in and not evil before they can post external links is reasonable.
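A rough sketch of the rule being discussed: reject added text containing an external link unless the editor is logged in. The function name and parameters are hypothetical, and the "not evil" part of the check is left out because nothing on this page says how it would be decided.

```python
import re

# Same idea as the pattern quoted above. The slashes need no escaping in
# Python; the backslashes are only needed where "/" is the regex delimiter.
LINK_PATTERN = re.compile(r"https?://", re.IGNORECASE)


def edit_allowed(added_text, logged_in):
    """Allow the edit unless an anonymous user is adding an external link."""
    if logged_in:
        return True
    return LINK_PATTERN.search(added_text) is None


print(edit_allowed("see http://example.com", logged_in=False))  # anonymous link
print(edit_allowed("see http://example.com", logged_in=True))   # logged-in link
print(edit_allowed("just fixing a typo", logged_in=False))      # no link at all
```

Running the check only against the added text, not the whole page, matters here: otherwise any anonymous edit to a page that already contains a link would be blocked.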


FyreWulff I think we could use this to block defacers too, couldn't we? like the recent 'buttsecks' edit, we could put words that would never appear in a legit entry here.

Bob the Hamster For at least a month, the filter has been quietly blocking multiple edits from a spammer who was posting links to hardcore gay porn. That "buttsecks" vandalism came from an IP address that closely resembled one he used (from the same dynamic block, I think), so I am guessing that he finally discovered that his edits were not getting saved, and vandalized that page out of frustration.


Bob the Hamster: For anyone curious about the latest round of gibberish spam, this spammer is just doing set-up for later spam attacks. It hopes that these spams will be ignored because they are small and contain no links or understandable spam content. This is supposed to skew Bayesian-style statistical filters to recognize these gibberish words as non-spam, so it can use them later in real spams, tilting the probability filter to its advantage. Also, my blocking these IP addresses is just an act of frustration, and I am only wasting my own time by doing it. Stormnet is Legion, with something like tens of millions of infected nodes around the world.
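The skewing effect can be shown with a toy per-word score. This is a deliberately simplified sketch with add-one smoothing, not a full Bayesian filter and not any particular product's algorithm; the gibberish word and the counts are invented for illustration.

```python
from collections import Counter


def spam_probability(word, spam_counts, ham_counts):
    """Toy per-word spam estimate with add-one (Laplace) smoothing."""
    spam_seen = spam_counts[word]
    ham_seen = ham_counts[word]
    return (spam_seen + 1) / (spam_seen + ham_seen + 2)


spam_counts = Counter()  # times each word was seen in messages marked as spam
ham_counts = Counter()   # times each word was seen in messages accepted as non-spam

gibberish = "xqzvb"      # invented gibberish word

# A word the filter has never seen is treated as neutral.
before = spam_probability(gibberish, spam_counts, ham_counts)

# Five gibberish posts are ignored, so the filter learns the word as non-spam.
ham_counts[gibberish] += 5
after = spam_probability(gibberish, spam_counts, ham_counts)

print(before, after)
```

Once the word scores well below neutral, adding it to a later real spam pulls that message's combined score toward non-spam, which is the advantage the spammer is after.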

Bob the Hamster: I was sick of deleting gibberish spam, so I enabled MathCaptcha. Logged-in users don't need to bother with it. This should stop all spam robots except the ones powered by sweatshop slaves.