[pLog-svn] r6088 - plog/branches/lifetype-1.2/class/security
Jon Daley
plogworld at jon.limedaley.com
Fri Nov 30 17:19:29 EST 2007
Should we have a small filter at the front that does these sorts of
checks, and then have the Bayesian filter at the end? Or perhaps the real
reason is that since the Bayesian filter actually saves the comment, it
needs to have additional checks, no matter where in the order it falls?
There are two filters before the Bayesian filter, and maybe that
logic could go in there?
It would be nice for filters to be able to lower the CPU usage on
comments that have invalid article ids, etc., since presumably those come
from spammers trying to mess with the system.
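
To make the ordering idea concrete, here is a rough sketch in Python (not
LifeType's actual PHP code) of a pipeline where a cheap sanity filter runs
first and the expensive Bayesian filter runs last. Every name here
(Comment, ARTICLES, request_sanity_filter, run_pipeline) is made up for
illustration, and the "Bayesian" step is only a keyword placeholder.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Comment:
    article_id: int
    text: str


@dataclass
class Article:
    comments_enabled: bool = True


@dataclass
class PipelineResult:
    accepted: bool
    rejected_by: str = ""
    filters_run: List[str] = field(default_factory=list)


# Hypothetical in-memory store standing in for the article database.
ARTICLES: Dict[int, Article] = {1: Article(True), 2: Article(False)}


def request_sanity_filter(comment: Comment) -> bool:
    """Cheap checks: the article must exist and accept comments."""
    article = ARTICLES.get(comment.article_id)
    return article is not None and article.comments_enabled


def bayesian_filter(comment: Comment) -> bool:
    """Placeholder for the expensive classifier (a keyword test here)."""
    return "viagra" not in comment.text.lower()


# Cheap filter first, expensive filter last.
PIPELINE: List[Tuple[str, Callable[[Comment], bool]]] = [
    ("sanity", request_sanity_filter),
    ("bayesian", bayesian_filter),
]


def run_pipeline(comment: Comment, pipeline=PIPELINE) -> PipelineResult:
    """Run filters in order and stop at the first one that rejects."""
    result = PipelineResult(accepted=True)
    for name, check in pipeline:
        result.filters_run.append(name)
        if not check(comment):
            result.accepted = False
            result.rejected_by = name
            break
    return result
```

With this ordering, a comment aimed at article id 0 is dropped by the
sanity filter and the classifier never runs for it.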
------------------------------------------------------------------------
r5918 | oscar | 2007-09-07 17:38:00 -0400 (Fri, 07 Sep 2007) | 5 lines
This should solve issues http://bugs.lifetype.net/view.php?id=1386
("Spammers are able to post comments even if comments are disabled for a
particular post") and http://bugs.lifetype.net/view.php?id=1387 ("comments
with article_id = 0 created by some spam bots")
The problem here was that since the bayesian filter is run *before* any
application logic is run, it should also check things like whether
comments are enabled or not and if the article is found at all or not,
even though these same checks are applied later on in the AddCommentAction
class. The articleId parameter was taken as is from the request, without
performing any check other than checking if it is an integer, so this
caused some comments to point to an article with an id of '0' because we
did not check if the article really existed before saving the spam
comment. And the same applies to the other situation, with the toggle for
enabling and disabling comments.
The solution was to add some additional logic to the BayesianFilter filter
class and perform these checks. That does indeed duplicate some of the
logic found later in the process flow, but I did not find a more elegant
solution for this (at least not without a redesign of the whole filter
architecture anyway).
------------------------------------------------------------------------
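
The checks described in the r5918 log message above could look roughly
like this. It is a Python sketch under assumptions, not the real
BayesianFilter class: process_comment, the articles dictionary, the
spam_store list and the return strings are all invented for illustration.

```python
def process_comment(article_id_raw, text, articles, spam_store, is_spam):
    """Validate the target article before a spam comment is saved.

    articles   -- hypothetical dict: article id -> {"comments_enabled": bool}
    spam_store -- hypothetical list collecting saved spam comments
    is_spam    -- callable standing in for the Bayesian classifier
    """
    # The integer check alone let article_id = 0 through.
    try:
        article_id = int(article_id_raw)
    except (TypeError, ValueError):
        return "rejected"

    # Added checks: the article must exist and have comments enabled.
    article = articles.get(article_id)
    if article is None or not article["comments_enabled"]:
        return "rejected"  # nothing is saved for a bogus target

    if is_spam(text):
        spam_store.append((article_id, text))
        return "spam_saved"
    return "passed"
```

The two added checks are the part that duplicates the later application
logic; they are needed here only because this filter saves the comment
itself.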
On Fri, 30 Nov 2007, Paul Westbrook wrote:
> Hello,
> That should be fine. But in revision 5918 it looks like it is
> intentional that the Bayesian filter runs first.
>
> --Paul
>
> On 11/30/07, Oscar Renalias <oscar at renalias.net> wrote:
>>
>> So can this issue be closed by placing the Bayesian filter at the end
>> of the pipeline chain?
>>
>> On Nov 30, 2007, at 6:48 AM, Jon Daley wrote:
>>
>>> On Fri, 30 Nov 2007, Mark Wu wrote:
>>>> Why can't we just put the Bayesian filter last in the order? It seems
>>>> to solve this problem more easily.
>>> Does that fix everything? It is certainly the easiest, coding- and
>>> performance-wise.
>>> With my thinking it seems like that fixes it - at least for now,
>>> because we don't have any other plugins that would use the inputs of
>>> others. And we can maybe do Mark's priority idea if we ever need
>>> that sort of thing.
>>> As long as it works for Paul's stuff, I think that sounds good. So,
>>> then we should take Mark's rev 6088 or whatever it is and use that,
>>> but modify it to pass in the previouslyRejected flag, and then put
>>> the bayesian at the end.
>>>
>>>> BTW, most LifeType installations on CJK sites do rely on the Bayesian
>>>> filter to protect against spam attacks. Because the tokenizing algorithm
>>>> can't separate CJK text into atomic tokens. We don't use stop words
>>>> and "white space" to separate a paragraph into "words".
>>> I am not sure what you are saying. It seems like you are saying
>>> the tokenizer doesn't work, so then it seems that the bayesian
>>> filter wouldn't be very good at all...
>>>
>>> Well, it's been 10 minutes since I read your idea of simply putting
>>> the bayesian filter at the end, and haven't come up with a reason
>>> why it won't work. So, probably good. Do you want to do it, or me?
>>>
>>> --
>>> Jon Daley
>>> http://jon.limedaley.com/
>>>
>>> Whenever people agree with me I always feel I must be wrong.
>>> -- Oscar Wilde
>>> _______________________________________________
>>> pLog-svn mailing list
>>> pLog-svn at devel.lifetype.net
>>> http://limedaley.com/mailman/listinfo/plog-svn
>>
>
--
Jon Daley
http://jon.limedaley.com/
All who would win joy, must share it; happiness was born a twin.
-- Lord Byron