[pLog-svn] r6088 - plog/branches/lifetype-1.2/class/security

Mark Wu markplace at gmail.com
Wed Dec 5 23:04:14 EST 2007


Hi Jon and Oscar:

I am also busy this week.

As I said, I will implement an "ordered" pipeline in 2.0 to solve this
permanently. The idea is as follows:

The default $pipeline:

$registerFilter( "NullFilter", 1 );
$registerFilter( "CommentFilter", 2 );
$registerFilter( "BayesianFilter", 999 );

User pipeline:

$registerFilter( "Any1Filter" );
$registerFilter( "Any2Filter" );

A filter registered without a priority is added before the last filter, so
this also keeps backward compatibility.
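
Here is a minimal sketch of how such an ordered pipeline could behave. It is
written in Python rather than LifeType's PHP, only to keep it short. The class
name OrderedPipeline, the methods register_filter and order, and the rule that
a filter without a priority gets 998 (one below the last filter) are all
assumptions for illustration, not the planned 2.0 implementation:

```python
class OrderedPipeline:
    """Hypothetical sketch: filters run in ascending priority order."""

    LAST = 999  # assumed priority of the filter that must run last

    def __init__(self):
        self._filters = []  # entries are (priority, sequence, name)
        self._seq = 0       # registration counter, keeps ties in order

    def register_filter(self, name, priority=None):
        # No explicit priority: slot the filter just before the last one.
        if priority is None:
            priority = self.LAST - 1
        self._filters.append((priority, self._seq, name))
        self._seq += 1

    def order(self):
        # Sort by priority first, then by registration order.
        return [name for _, _, name in sorted(self._filters)]


pipeline = OrderedPipeline()

# The default pipeline
pipeline.register_filter("NullFilter", 1)
pipeline.register_filter("CommentFilter", 2)
pipeline.register_filter("BayesianFilter", 999)

# User pipeline, registered the old way without a priority
pipeline.register_filter("Any1Filter")
pipeline.register_filter("Any2Filter")

print(pipeline.order())
```

With these registrations the two user filters land between CommentFilter and
BayesianFilter, so existing plugins that call the one-argument form would keep
working while the Bayesian filter still runs last.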

Mark 

> -----Original Message-----
> From: plog-svn-bounces at devel.lifetype.net 
> [mailto:plog-svn-bounces at devel.lifetype.net] On Behalf Of Jon Daley
> Sent: Wednesday, December 05, 2007 11:51 PM
> To: LifeType Developer List
> Subject: Re: [pLog-svn] r6088 - 
> plog/branches/lifetype-1.2/class/security
> 
>  	I have it on my list to get done in the next couple days.
> 
> On Wed, 5 Dec 2007, Oscar Renalias wrote:
> 
> > Is anyone taking care of making these changes so that the Bayesian 
> > filter runs last?
> >
> > On Dec 1, 2007 11:36 AM, Oscar Renalias <oscar at renalias.net> wrote:
> >> The bayesian filter needs to perform additional checks on the
> >> incoming comment because if it's going to end up being saved in the
> >> database (marked as spam), we first need to make sure that things
> >> like the blog id and the article id are correct. But it's not
> >> strictly necessary that it runs first, and in fact it doesn't really
> >> matter, so I guess making it run last is still good enough for now.
> >>
> >> Oscar
> >>
> >>
> >> On Dec 1, 2007, at 12:19 AM, Jon Daley wrote:
> >>
> >>>       Should we have a small filter at the front that does these 
> >>> sort of checks, and then have the bayesian filter at the end?  Or 
> >>> perhaps the real reason is that since the bayesian filter actually
> >>> saves the comment, it needs to have additional checks, no matter 
> >>> where in the order it falls?
> >>>       There are two filters before the bayesian filter, and maybe 
> >>> that logic could go in there?
> >>>       It would be nice to have filters be able to lower the cpu 
> >>> usage on comments that have invalid article ids, etc. since 
> >>> presumably that is spammers trying to mess with the system.
> >>>
> >>> ------------------------------------------------------------------------
> >>> r5918 | oscar | 2007-09-07 17:38:00 -0400 (Fri, 07 Sep 2007) | 5 lines
> >>>
> >>> This should solve issues http://bugs.lifetype.net/view.php?id=1386
> >>> ("Spammers are able to post comments even if comments are disabled
> >>> for a particular post") and http://bugs.lifetype.net/view.php?id=1387
> >>> ("comments with article_id = 0 created by some spam bots")
> >>>
> >>> The problem here was that since the bayesian filter is run *before*
> >>> any application logic is run, it should also check things like
> >>> whether comments are enabled or not and if the article is found at
> >>> all or not, even though these same checks are applied later on in the
> >>> AddCommentAction class. The articleId parameter was taken as is from
> >>> the request, without performing any check other than checking if it
> >>> is an integer, so this caused some comments to point to an article
> >>> with an id of '0' because we did not check if the article really
> >>> existed before saving the spam comment. And the same applies to the
> >>> other situation, with the toggle for enabling and disabling comments.
> >>>
> >>> The solution was to add some additional logic to the BayesianFilter
> >>> filter class and perform these checks, that does indeed duplicate
> >>> some of the logic found later in the process flow but I did not find
> >>> a more elegant solution for this (at least not without a redesign of
> >>> the whole filter architecture anyway)
> >>> ------------------------------------------------------------------------
> >>>
> >>>
> >>> On Fri, 30 Nov 2007, Paul Westbrook wrote:
> >>>> Hello,
> >>>>  That should be fine.  But in revision 5918 it looks like it is 
> >>>> intentional that the Bayesian filter runs first.
> >>>>
> >>>> --Paul
> >>>>
> >>>> On 11/30/07, Oscar Renalias <oscar at renalias.net> wrote:
> >>>>>
> >>>>> So can this issue be closed by placing the Bayesian filter at the
> >>>>> end of the pipeline chain?
> >>>>>
> >>>>> On Nov 30, 2007, at 6:48 AM, Jon Daley wrote:
> >>>>>
> >>>>>> On Fri, 30 Nov 2007, Mark Wu wrote:
> >>>>>>> Why can't we just put the bayesian filter last in the order? It
> >>>>>>> seems to solve this problem more easily.
> >>>>>>      Does that fix everything?  It is certainly the easiest 
> >>>>>> (coding and
> >>>>>> performance) wise.
> >>>>>>      With my thinking it seems like that fixes it - at least for
> >>>>>> now, because we don't have any other plugins that would use the
> >>>>>> inputs of others.  And we can maybe do Mark's priority idea if we
> >>>>>> ever need that sort of thing.
> >>>>>>      As long as it works for Paul's stuff, I think that sounds
> >>>>>> good.  So, then we should take Mark's rev 6088 or whatever it is
> >>>>>> and use that, but modify it to pass in the previouslyRejected flag,
> >>>>>> and then put the bayesian at the end.
> >>>>>>
> >>>>>>> BTW,  most lifetype installations in CJK site does rely on
> >>>>>>> Bayesian Filter to protect the spam attack. Because the tokenize
> >>>>>>> algorithm can't separate CJK into each atomic token. We don't
> >>>>>>> use stop words and "white space" to separate a paragraph into
> >>>>>>> "word".
> >>>>>>      I am not sure what you are saying.  It seems like you are 
> >>>>>> saying the tokenizer doesn't work, so then it seems that the 
> >>>>>> bayesian filter wouldn't be very good at all...
> >>>>>>
> >>>>>>      Well, it's been 10 minutes since I read your idea of simply
> >>>>>> putting the bayesian filter at the end, and haven't come up with a
> >>>>>> reason why it won't work.  So, probably good.  Do you want to do it,
> >>>>>> or me?
> >>>>>>
> >>>>>> --
> >>>>>> Jon Daley
> >>>>>> http://jon.limedaley.com/
> >>>>>>
> >>>>>> Whenever people agree with me I always feel I must be wrong.
> >>>>>> -- Oscar Wilde
> >>>>>> _______________________________________________
> >>>>>> pLog-svn mailing list
> >>>>>> pLog-svn at devel.lifetype.net
> >>>>>> http://limedaley.com/mailman/listinfo/plog-svn
> >>>>>
> >>>>>
> >>>>
> >>>
> >>> --
> >>> Jon Daley
> >>> http://jon.limedaley.com/
> >>>
> >>> All who would win joy, must share it; happiness was born a twin.
> >>> -- Lord Byron
> >>
> >>
> >
> 
> --
> Jon Daley
> http://jon.limedaley.com/
> 
> You are only as wise as others perceive you to be.
> -- M. Shawn Cole