[pLog-svn] r6088 - plog/branches/lifetype-1.2/class/security
Mark Wu
markplace at gmail.com
Wed Dec 5 23:04:14 EST 2007
Hi Jon and Oscar:
I am also busy this week.
As I said, I will implement an "ordered" pipeline in 2.0 to solve this
permanently. The idea is as follows:
The default $pipeline:
$registerFilter( "NullFilter", 1 );
$registerFilter( "CommentFilter", 2 );
$registerFilter( "BayesianFilter", 999 );
User pipeline:
$registerFilter( "Any1Filter" );
$registerFilter( "Any2Filter" );
A filter registered without a priority will be added before the last
filter, so this also keeps backward compatibility.
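To make the idea more concrete, here is a rough sketch of how the ordering
could work. It is not the real Pipeline code: the class name OrderedPipeline,
the method getOrderedFilters() and the rule that a filter without a priority
gets 998 (one less than the last default filter) are only assumptions for
illustration. The filter names are the ones from the lists above.

```php
<?php
// Hypothetical sketch only; this is not LifeType's actual Pipeline API.
class OrderedPipeline
{
    // Assumed priority of the last default filter (the Bayesian one).
    const LAST_PRIORITY = 999;

    private $filters = array();
    private $sequence = 0;

    // Register a filter by name. When no priority is given, the filter is
    // slotted in just before the last default filter.
    public function registerFilter($name, $priority = null)
    {
        if ($priority === null) {
            $priority = self::LAST_PRIORITY - 1;
        }
        $this->filters[] = array(
            'name'     => $name,
            'priority' => $priority,
            'sequence' => $this->sequence++,
        );
    }

    // Return the filter names in execution order: ascending priority,
    // with registration order breaking ties.
    public function getOrderedFilters()
    {
        $sorted = $this->filters;
        usort($sorted, function ($a, $b) {
            if ($a['priority'] !== $b['priority']) {
                return $a['priority'] - $b['priority'];
            }
            return $a['sequence'] - $b['sequence'];
        });
        return array_map(function ($f) {
            return $f['name'];
        }, $sorted);
    }
}

// Default pipeline.
$pipeline = new OrderedPipeline();
$pipeline->registerFilter("NullFilter", 1);
$pipeline->registerFilter("CommentFilter", 2);
$pipeline->registerFilter("BayesianFilter", 999);

// User (plugin) filters, registered the old way without a priority.
$pipeline->registerFilter("Any1Filter");
$pipeline->registerFilter("Any2Filter");

$order = $pipeline->getOrderedFilters();
```

With this, existing plugins that call registerFilter() with one argument keep
working, and the Bayesian filter still runs after them.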
Mark
> -----Original Message-----
> From: plog-svn-bounces at devel.lifetype.net
> [mailto:plog-svn-bounces at devel.lifetype.net] On Behalf Of Jon Daley
> Sent: Wednesday, December 05, 2007 11:51 PM
> To: LifeType Developer List
> Subject: Re: [pLog-svn] r6088 - plog/branches/lifetype-1.2/class/security
>
> I have it on my list to get done in the next couple days.
>
> On Wed, 5 Dec 2007, Oscar Renalias wrote:
>
> > Is anyone taking care of making these changes so that the Bayesian
> > filter runs last?
> >
> > On Dec 1, 2007 11:36 AM, Oscar Renalias <oscar at renalias.net> wrote:
> >> The bayesian filter needs to perform additional checks on the
> >> incoming comment because if it's going to end up being saved in the
> >> database (marked as spam), we first need to make sure that things
> >> like the blog id and the article id are correct. But it's not
> >> strictly necessary that it runs first, and in fact it doesn't really
> >> matter, so I guess making it run last is still good enough for now.
> >>
> >> Oscar
> >>
> >>
> >> On Dec 1, 2007, at 12:19 AM, Jon Daley wrote:
> >>
> >>> Should we have a small filter at the front that does these
> >>> sort of checks, and then have the bayesian filter at the end? Or
> >>> perhaps the real reason is that since the bayesian filter actually
> >>> saves the comment, it needs to have additional checks, no matter
> >>> where in the order it falls?
> >>> There are two filters before the bayesian filter, and maybe
> >>> that logic could go in there?
> >>> It would be nice to have filters be able to lower the cpu
> >>> usage on comments that have invalid article ids, etc. since
> >>> presumably that is spammers trying to mess with the system.
> >>>
> >>>
> >>> ------------------------------------------------------------------------
> >>> r5918 | oscar | 2007-09-07 17:38:00 -0400 (Fri, 07 Sep 2007) | 5 lines
> >>>
> >>> This should solve issues http://bugs.lifetype.net/view.php?id=1386
> >>> ("Spammers are able to post comments even if comments are disabled
> >>> for a particular post") and http://bugs.lifetype.net/view.php?id=1387
> >>> ("comments with article_id = 0 created by some spam bots")
> >>>
> >>> The problem here was that since the bayesian filter is run *before*
> >>> any application logic is run, it should also check things like
> >>> whether comments are enabled or not and if the article is found at
> >>> all or not, even though these same checks are applied later on in the
> >>> AddCommentAction class. The articleId parameter was taken as is from
> >>> the request, without performing any check other than checking if it
> >>> is an integer, so this caused some comments to point to an article
> >>> with an id of '0' because we did not check if the article really
> >>> existed before saving the spam comment. And the same applies to the
> >>> other situation, with the toggle for enabling and disabling comments.
> >>>
> >>> The solution was to add some additional logic to the BayesianFilter
> >>> filter class and perform these checks, that does indeed duplicate
> >>> some of the logic found later in the process flow but I did not find
> >>> a more elegant solution for this (at least not without a redesign of
> >>> the whole filter architecture anyway)
> >>>
> >>> ------------------------------------------------------------------------
> >>>
> >>>
> >>> On Fri, 30 Nov 2007, Paul Westbrook wrote:
> >>>> Hello,
> >>>> That should be fine. But in revision 5918 it looks like it is
> >>>> intentional that the Bayesian filter runs first.
> >>>>
> >>>> --Paul
> >>>>
> >>>> On 11/30/07, Oscar Renalias <oscar at renalias.net> wrote:
> >>>>>
> >>>>> So can this issue be closed by placing the Bayesian filter at the
> >>>>> end of the pipeline chain?
> >>>>>
> >>>>> On Nov 30, 2007, at 6:48 AM, Jon Daley wrote:
> >>>>>
> >>>>>> On Fri, 30 Nov 2007, Mark Wu wrote:
> >>>>>>> Why can't we just put the bayesian filter last in the order? It
> >>>>>>> seems to solve this problem more easily.
> >>>>>> Does that fix everything? It is certainly the easiest
> >>>>>> (coding and performance) wise.
> >>>>>> With my thinking it seems like that fixes it - at least for
> >>>>>> now, because we don't have any other plugins that would use the
> >>>>>> inputs of others. And we can maybe do Mark's priority idea if we
> >>>>>> ever need that sort of thing.
> >>>>>> As long as it works for Paul's stuff, I think that sounds
> >>>>>> good. So, then we should take Mark's rev 6088 or whatever it is
> >>>>>> and use that, but modify it to pass in the previouslyRejected
> >>>>>> flag, and then put the bayesian at the end.
> >>>>>>
> >>>>>>> BTW, most lifetype installations in CJK site does rely on
> >>>>>>> Bayesian Filter to protect the spam attack. Because the tokenize
> >>>>>>> algorithm can't separate CJK into each atomic token. We don't
> >>>>>>> use stop words and "white space" to separate a paragraph into
> >>>>>>> "word".
> >>>>>> I am not sure what you are saying. It seems like you are
> >>>>>> saying the tokenizer doesn't work, so then it seems that the
> >>>>>> bayesian filter wouldn't be very good at all...
> >>>>>>
> >>>>>> Well, it's been 10 minutes since I read your idea of simply
> >>>>>> putting the bayesian filter at the end, and haven't come up with a
> >>>>>> reason why it won't work. So, probably good. Do you want to do it,
> >>>>>> or me?
> >>>>>>
> >>>>>> --
> >>>>>> Jon Daley
> >>>>>> http://jon.limedaley.com/
> >>>>>>
> >>>>>> Whenever people agree with me I always feel I must be wrong.
> >>>>>> -- Oscar Wilde
> >>>>>> _______________________________________________
> >>>>>> pLog-svn mailing list
> >>>>>> pLog-svn at devel.lifetype.net
> >>>>>> http://limedaley.com/mailman/listinfo/plog-svn
> >>>>>
> >>>>>
> >>>>
> >>>
> >>> --
> >>> Jon Daley
> >>> http://jon.limedaley.com/
> >>>
> >>> All who would win joy, must share it; happiness was born a twin.
> >>> -- Lord Byron
> >>
> >>
> >
>
> --
> Jon Daley
> http://jon.limedaley.com/
>
> You are only as wise as others perceive you to be.
> -- M. Shawn Cole