[pLog-svn] r6088 - plog/branches/lifetype-1.2/class/security

Fri Nov 30 17:19:29 EST 2007

 	Should we have a small filter at the front that does these sorts of 
checks, and then have the bayesian filter at the end?  Or perhaps the real 
reason is that since the bayesian filter actually saves the comment, it 
needs to have additional checks, no matter where in the order it falls?
 	There are two filters before the bayesian filter, and maybe that 
logic could go in there?
 	It would be nice if the filters could lower the CPU usage on 
comments that have invalid article ids, etc., since presumably those come 
from spammers trying to mess with the system.
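
A rough sketch of that ordering, in Python rather than LifeType's PHP. All of the names here (Comment, ARTICLES, validity_filter, bayesian_filter, run_pipeline) are made up for illustration and are not LifeType classes. The idea is that a cheap check runs first and rejects comments for missing articles or disabled comments, so the expensive filter at the end never runs for them.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Comment:
    article_id: int
    text: str


# Hypothetical stand-in for the article table: id -> are comments enabled?
ARTICLES: Dict[int, bool] = {1: True, 2: False}

# Records which filters actually ran, to show the short-circuit.
calls: List[str] = []


def validity_filter(c: Comment) -> Optional[str]:
    """Cheap front filter: return a rejection reason, or None to pass."""
    calls.append("validity")
    if c.article_id not in ARTICLES:
        return "article not found"
    if not ARTICLES[c.article_id]:
        return "comments disabled"
    return None


def bayesian_filter(c: Comment) -> Optional[str]:
    """Placeholder for the expensive classifier; not a real Bayesian model."""
    calls.append("bayesian")
    return "spam" if "viagra" in c.text.lower() else None


def run_pipeline(c: Comment,
                 filters: List[Callable[[Comment], Optional[str]]]) -> Optional[str]:
    """Run filters in order and stop at the first rejection."""
    for f in filters:
        reason = f(c)
        if reason is not None:
            return reason
    return None


# Cheap checks first, Bayesian filter last.
PIPELINE = [validity_filter, bayesian_filter]
```

With this ordering, a comment for article id 0 is rejected by the first filter and the Bayesian placeholder is never called. Whether the real filter can live at the end depends on the saving behaviour discussed in r5918 below.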

------------------------------------------------------------------------
r5918 | oscar | 2007-09-07 17:38:00 -0400 (Fri, 07 Sep 2007) | 5 lines

This should solve issues http://bugs.lifetype.net/view.php?id=1386 
("Spammers are able to post comments even if comments are disabled for a 
particular post") and http://bugs.lifetype.net/view.php?id=1387 ("comments 
with article_id = 0 created by some spam bots")

The problem here was that since the bayesian filter is run *before* any 
application logic is run, it should also check things like whether 
comments are enabled or not and if the article is found at all or not, 
even though these same checks are applied later on in the AddCommentAction 
class. The articleId parameter was taken as is from the request, without 
performing any check other than checking if it is an integer, so this 
caused some comments to point to an article with an id of '0' because we 
did not check if the article really existed before saving the spam 
comment. And the same applies to the other situation, with the toggle for 
enabling and disabling comments.

The solution was to add some additional logic to the BayesianFilter filter 
class and perform these checks. That does indeed duplicate some of the 
logic found later in the process flow, but I did not find a more elegant 
solution for this (at least not without a redesign of the whole filter 
architecture anyway)
------------------------------------------------------------------------
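
For comparison, here is a minimal sketch of the r5918 approach as I read the log message above. It is Python, not the actual BayesianFilter code, and ARTICLES, saved_spam and looks_like_spam are hypothetical stand-ins. The point is only that a filter which saves the spam comment itself has to validate the article before saving.

```python
from typing import Dict, List, Tuple

# Hypothetical stand-in for the article table: id -> are comments enabled?
ARTICLES: Dict[int, bool] = {1: True, 2: False}

# Hypothetical stand-in for the table where spam comments are stored.
saved_spam: List[Tuple[int, str]] = []


def looks_like_spam(text: str) -> bool:
    """Placeholder classifier; a real Bayesian filter would score tokens."""
    return "cheap pills" in text.lower()


def bayesian_filter(article_id: int, text: str) -> bool:
    """Return True if the comment is rejected.

    The article checks duplicate what the later application logic does,
    but they have to happen here because this filter saves the comment.
    """
    if article_id not in ARTICLES or not ARTICLES[article_id]:
        return True  # reject without saving anything
    if looks_like_spam(text):
        saved_spam.append((article_id, text))  # only saved for a valid article
        return True
    return False
```

In this sketch a spam comment with article id 0, or for an article with comments disabled, is rejected but never stored.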

On Fri, 30 Nov 2007, Paul Westbrook wrote:
> Hello,
>   That should be fine.  But in revision 5918 it looks like it is
> intentional that the Bayesian filter runs first.
>
> --Paul
>
> On 11/30/07, Oscar Renalias <oscar at renalias.net> wrote:
>>
>> So can this issue be closed by placing the Bayesian filter at the end
>> of the pipeline chain?
>>
>> On Nov 30, 2007, at 6:48 AM, Jon Daley wrote:
>>
>>> On Fri, 30 Nov 2007, Mark Wu wrote:
>>>> Why can't we just put the bayesian filter last in the order? It seems
>>>> to solve this problem more easily.
>>>       Does that fix everything?  It is certainly the easiest, coding- and
>>> performance-wise.
>>>       With my thinking it seems like that fixes it - at least for now,
>>> because we don't have any other plugins that would use the inputs of
>>> others.  And we can maybe do Mark's priority idea if we ever need
>>> that sort of thing.
>>>       As long as it works for Paul's stuff, I think that sounds good.
>>> So, then we should take Mark's rev 6088 or whatever it is and use that,
>>> but modify it to pass in the previouslyRejected flag, and then put
>>> the bayesian at the end.
>>>
>>>> BTW, most LifeType installations on CJK sites do rely on the Bayesian
>>>> filter to protect against spam attacks. Because the tokenizing algorithm
>>>> can't separate CJK into atomic tokens. We don't use stop words
>>>> and "white space" to separate a paragraph into "words".
>>>       I am not sure what you are saying.  It seems like you are saying
>>> the tokenizer doesn't work, so then it seems that the bayesian
>>> filter wouldn't be very good at all...
>>>
>>>       Well, it's been 10 minutes since I read your idea of simply putting
>>> the bayesian filter at the end, and haven't come up with a reason
>>> why it won't work.  So, probably good.  Do you want to do it, or me?
>>>
>>> --
>>> Jon Daley
>>> http://jon.limedaley.com/
>>>
>>> Whenever people agree with me I always feel I must be wrong.
>>> -- Oscar Wilde
>>> _______________________________________________
>>> pLog-svn mailing list
>>> pLog-svn at devel.lifetype.net
>>> http://limedaley.com/mailman/listinfo/plog-svn
>>
>

-- 
Jon Daley
http://jon.limedaley.com/

All who would win joy, must share it; happiness was born a twin.
-- Lord Byron