[pLog-svn] r6088 - plog/branches/lifetype-1.2/class/security

Jon Daley plogworld at jon.limedaley.com
Wed Dec 5 23:08:45 EST 2007


 	That sounds good.  About the "extra" checks that are currently in the
bayesian filter (html, etc. validation of the comment fields): is there
any reason those shouldn't be in CommentFilter instead?
 	And, if we always have the CommentFilter, do we need the
NullFilter in there too?  (I haven't looked at the code for that part
yet).
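
A minimal sketch of the "ordered" pipeline Mark proposes in the message
quoted below, written in Python rather than LifeType's PHP to keep it
short. The Pipeline class, register_filter(), and order() are hypothetical
names, not existing LifeType API. The only behaviour taken from the
proposal is that filters registered without a priority end up just before
the last filter.

```python
# Hypothetical sketch: none of these names exist in LifeType.
class Pipeline:
    def __init__(self):
        self._filters = []  # entries are (priority, sequence, name)
        self._seq = 0       # registration order, used to break priority ties

    def register_filter(self, name, priority=None):
        if priority is None:
            # Assumption: a filter registered without a priority (a user
            # filter) is slotted just before the highest-priority filter,
            # i.e. the one that runs last.
            last = max((p for p, _, _ in self._filters), default=1)
            priority = last - 1
        self._filters.append((priority, self._seq, name))
        self._seq += 1

    def order(self):
        # Sort by priority first, then by registration order.
        return [name for _, _, name in sorted(self._filters)]


pipeline = Pipeline()
# Default pipeline, with the priorities from the proposal.
pipeline.register_filter("NullFilter", 1)
pipeline.register_filter("CommentFilter", 2)
pipeline.register_filter("BayesianFilter", 999)
# User pipeline: no priority given.
pipeline.register_filter("Any1Filter")
pipeline.register_filter("Any2Filter")
print(pipeline.order())
```

Existing plugins that register without a priority would keep working
unchanged, which is the backward-compatibility point of the proposal.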

On Thu, 6 Dec 2007, Mark Wu wrote:

> Hi Jon and Oscar:
>
> I am also busy this week.
>
> As I said, I will implement an "ordered" pipeline in 2.0 to solve this
> permanently. The idea is as follows:
>
> The default $pipeline:
>
> $registerFilter( "NullFilter", 1 );
> $registerFilter( "CommentFilter", 2 );
> $registerFilter( "BayesianFilter", 999 );
>
> User pipeline:
>
> $registerFilter( "Any1Filter" );
> $registerFilter( "Any2Filter" );
>
> User filters will be added before the last filter, so this also keeps
> backward compatibility.
>
> Mark
>
>> -----Original Message-----
>> From: plog-svn-bounces at devel.lifetype.net
>> [mailto:plog-svn-bounces at devel.lifetype.net] On Behalf Of Jon Daley
>> Sent: Wednesday, December 05, 2007 11:51 PM
>> To: LifeType Developer List
>> Subject: Re: [pLog-svn] r6088 -
>> plog/branches/lifetype-1.2/class/security
>>
>>  	I have it on my list to get done in the next couple days.
>>
>> On Wed, 5 Dec 2007, Oscar Renalias wrote:
>>
>>> Is anyone taking care of making these changes so that the Bayesian
>>> filter runs last?
>>>
>>> On Dec 1, 2007 11:36 AM, Oscar Renalias <oscar at renalias.net> wrote:
>>>> The bayesian filter needs to perform additional checks on the
>>>> incoming comment because if it's going to end up being saved in the
>>>> database (marked as spam), we first need to make sure that things
>>>> like the blog id and the article id are correct. But it's not
>>>> strictly necessary that it runs first, and in fact it doesn't really
>>>> matter, so I guess making it run last is still good enough for now.
>>>>
>>>> Oscar
>>>>
>>>>
>>>> On Dec 1, 2007, at 12:19 AM, Jon Daley wrote:
>>>>
>>>>>       Should we have a small filter at the front that does these
>>>>> sorts of checks, and then have the bayesian filter at the end?  Or
>>>>> perhaps the real reason is that since the bayesian filter actually
>>>>> saves the comment, it needs to have additional checks, no matter
>>>>> where in the order it falls?
>>>>>       There are two filters before the bayesian filter, and maybe
>>>>> that logic could go in there?
>>>>>       It would be nice to have filters be able to lower the cpu
>>>>> usage on comments that have invalid article ids, etc. since
>>>>> presumably that is spammers trying to mess with the system.
>>>>>
>>>>>
>>>>> ------------------------------------------------------------------------
>>>>> r5918 | oscar | 2007-09-07 17:38:00 -0400 (Fri, 07 Sep 2007) | 5 lines
>>>>>
>>>>> This should solve issues http://bugs.lifetype.net/view.php?id=1386
>>>>> ("Spammers are able to post comments even if comments are disabled
>>>>> for a particular post") and http://bugs.lifetype.net/view.php?id=1387
>>>>> ("comments with article_id = 0 created by some spam bots")
>>>>>
>>>>> The problem here was that since the bayesian filter is run *before*
>>>>> any application logic is run, it should also check things like
>>>>> whether comments are enabled or not and if the article is found at
>>>>> all or not, even though these same checks are applied later on in the
>>>>> AddCommentAction class. The articleId parameter was taken as is from
>>>>> the request, without performing any check other than checking if it
>>>>> is an integer, so this caused some comments to point to an article
>>>>> with an id of '0' because we did not check if the article really
>>>>> existed before saving the spam comment. And the same applies to the
>>>>> other situation, with the toggle for enabling and disabling comments.
>>>>>
>>>>> The solution was to add some additional logic to the BayesianFilter
>>>>> filter class and perform these checks. That does indeed duplicate
>>>>> some of the logic found later in the process flow, but I did not find
>>>>> a more elegant solution for this (at least not without a redesign of
>>>>> the whole filter architecture anyway).
>>>>>
>>>>> ------------------------------------------------------------------------
>>>>>
>>>>>
>>>>> On Fri, 30 Nov 2007, Paul Westbrook wrote:
>>>>>> Hello,
>>>>>>  That should be fine.  But in revision 5918 it looks like it is
>>>>>> intentional that the Bayesian filter runs first.
>>>>>>
>>>>>> --Paul
>>>>>>
>>>>>> On 11/30/07, Oscar Renalias <oscar at renalias.net> wrote:
>>>>>>>
>>>>>>> So can this issue be closed by placing the Bayesian filter at the
>>>>>>> end of the pipeline chain?
>>>>>>>
>>>>>>> On Nov 30, 2007, at 6:48 AM, Jon Daley wrote:
>>>>>>>
>>>>>>>> On Fri, 30 Nov 2007, Mark Wu wrote:
>>>>>>>>> Why can't we just put the bayesian filter last in the order? It
>>>>>>>>> seems to solve this problem more easily.
>>>>>>>>      Does that fix everything?  It is certainly the easiest
>>>>>>>> (coding- and performance-wise).
>>>>>>>>      With my thinking it seems like that fixes it - at least for
>>>>>>>> now, because we don't have any other plugins that would use the
>>>>>>>> inputs of others.  And we can maybe do Mark's priority idea if we
>>>>>>>> ever need that sort of thing.
>>>>>>>>      As long as it works for Paul's stuff, I think that sounds
>>>>>>>> good.  So, then we should take Mark's rev 6088 or whatever it is
>>>>>>>> and use that, but modify it to pass in the previouslyRejected flag,
>>>>>>>> and then put the bayesian at the end.
>>>>>>>>
>>>>>>>>> BTW, most LifeType installations on CJK sites do rely on the
>>>>>>>>> Bayesian Filter for protection against spam attacks, because the
>>>>>>>>> tokenize algorithm can't separate CJK into atomic tokens. We don't
>>>>>>>>> use stop words and "white space" to separate a paragraph into
>>>>>>>>> "words".
>>>>>>>>      I am not sure what you are saying.  It seems like you are
>>>>>>>> saying the tokenizer doesn't work, so then it seems that the
>>>>>>>> bayesian filter wouldn't be very good at all...
>>>>>>>>
>>>>>>>>      Well, it's been 10 minutes since I read your idea of simply
>>>>>>>> putting the bayesian filter at the end, and haven't come up with
>>>>>>>> a reason why it won't work.  So, probably good.  Do you want to do
>>>>>>>> it, or me?
>>>>>>>>
>>>>>>>> --
>>>>>>>> Jon Daley
>>>>>>>> http://jon.limedaley.com/
>>>>>>>>
>>>>>>>> Whenever people agree with me I always feel I must be wrong.
>>>>>>>> -- Oscar Wilde
>>>>>>>> _______________________________________________
>>>>>>>> pLog-svn mailing list
>>>>>>>> pLog-svn at devel.lifetype.net
>>>>>>>> http://limedaley.com/mailman/listinfo/plog-svn
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>> --
>>>>> Jon Daley
>>>>> http://jon.limedaley.com/
>>>>>
>>>>> All who would win joy, must share it; happiness was born a twin.
>>>>> -- Lord Byron
>>>>
>>>>
>>>
>>
>> --
>> Jon Daley
>> http://jon.limedaley.com/
>>
>> You are only as wise as others perceive you to be.
>> -- M. Shawn Cole
>
>
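
The interim fix discussed further down the quoted thread (run the Bayesian
filter last and pass it a previouslyRejected flag) might look roughly like
the following. This is a Python sketch, not LifeType code: the function
names, the dict-based comment, the keyword check standing in for Bayesian
scoring, and what the Bayesian filter does with the flag are all
assumptions made for illustration.

```python
def comment_filter(comment, previously_rejected):
    # Cheap structural check: reject comments without a positive article id.
    return comment.get("article_id", 0) > 0


def bayesian_filter(comment, previously_rejected):
    # Assumption: when an earlier filter has already rejected the comment,
    # skip the expensive classification entirely.
    if previously_rejected:
        return False
    # Stand-in for real Bayesian scoring (hypothetical keyword check).
    return "cheap pills" not in comment.get("text", "").lower()


def run_pipeline(filters, comment):
    # Run every filter in order, telling each one whether an earlier
    # filter has already rejected the comment.
    rejected = False
    for check in filters:
        if not check(comment, rejected):
            rejected = True
    return not rejected


# The Bayesian filter is placed last in the chain.
filters = [comment_filter, bayesian_filter]
print(run_pipeline(filters, {"article_id": 12, "text": "Nice post"}))
print(run_pipeline(filters, {"article_id": 0, "text": "Nice post"}))
```

With this ordering, a comment with an invalid article id is rejected by the
cheap check before the Bayesian filter does any real work.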

-- 
Jon Daley
http://jon.limedaley.com/

Why yes, a bulletproof vest.
Last words of James Rodges, on his final request before the firing squad

