[pLog-svn] r6088 - plog/branches/lifetype-1.2/class/security

Jon Daley plogworld at jon.limedaley.com
Wed Dec 5 10:50:31 EST 2007


 	I have it on my list to get done in the next couple days.
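
 	For reference, a rough sketch of the ordering discussed below, written
in Python rather than LifeType's PHP purely for illustration. Every name
here (Comment, Result, article_exists_filter, run_pipeline, the
KNOWN_ARTICLES set) is hypothetical and not taken from the LifeType code
base, and the way the previously-rejected flag is used is only an
assumption about what the thread proposes: cheap validity checks run first,
and the Bayesian filter runs last and skips its work when an earlier filter
has already rejected the comment.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Comment:
    article_id: int
    text: str


@dataclass
class Result:
    rejected: bool
    reason: str = ""


# A filter receives the comment plus a flag saying whether an earlier
# filter in the chain already rejected it (assumed semantics).
Filter = Callable[[Comment, bool], Result]

KNOWN_ARTICLES = {1, 2, 3}   # hypothetical stand-in for a database lookup
classified: List[str] = []   # records which comments reached the classifier


def article_exists_filter(comment: Comment, previously_rejected: bool) -> Result:
    # Cheap check that runs early: reject comments for unknown articles.
    if comment.article_id not in KNOWN_ARTICLES:
        return Result(True, "unknown article")
    return Result(False)


def bayesian_filter(comment: Comment, previously_rejected: bool) -> Result:
    # Runs last. If something upstream already rejected the comment,
    # skip the expensive classification (and any saving as spam).
    if previously_rejected:
        return Result(False)
    classified.append(comment.text)
    # Toy stand-in for a real Bayesian classifier.
    if "viagra" in comment.text.lower():
        return Result(True, "spam")
    return Result(False)


def run_pipeline(filters: List[Filter], comment: Comment) -> Result:
    rejected = False
    reasons: List[str] = []
    for f in filters:
        result = f(comment, rejected)
        if result.rejected:
            rejected = True
            reasons.append(result.reason)
    return Result(rejected, "; ".join(reasons))


# Validity checks first, Bayesian filter at the end of the chain.
PIPELINE: List[Filter] = [article_exists_filter, bayesian_filter]
```

With this ordering, a comment posted against article id 0 is rejected by
the first filter and never reaches the classifier, which is the CPU saving
mentioned further down the thread.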

On Wed, 5 Dec 2007, Oscar Renalias wrote:

> Is anyone taking care of making these changes so that the Bayesian
> filter runs last?
>
> On Dec 1, 2007 11:36 AM, Oscar Renalias <oscar at renalias.net> wrote:
>> The bayesian filter needs to perform additional checks on the incoming
>> comment because if it's going to end up being saved in the database
>> (marked as spam), we first need to make sure that things like the blog
>> id and the article id are correct. But it's not strictly necessary
>> that it runs first, and in fact it doesn't really matter, so I guess
>> making it run last is still good enough for now.
>>
>> Oscar
>>
>>
>> On Dec 1, 2007, at 12:19 AM, Jon Daley wrote:
>>
>>>       Should we have a small filter at the front that does these sorts
>>> of checks, and then have the bayesian filter at the end?  Or perhaps the
>>> real reason is that since the bayesian filter actually saves the
>>> comment, it needs to have additional checks, no matter where in the
>>> order it falls?
>>>       There are two filters before the bayesian filter, and maybe that
>>> logic could go in there?
>>>       It would be nice to have filters be able to lower the cpu usage on
>>> comments that have invalid article ids, etc. since presumably that is
>>> spammers trying to mess with the system.
>>>
>>> ------------------------------------------------------------------------
>>> r5918 | oscar | 2007-09-07 17:38:00 -0400 (Fri, 07 Sep 2007) | 5 lines
>>>
>>> This should solve issues http://bugs.lifetype.net/view.php?id=1386
>>> ("Spammers are able to post comments even if comments are disabled for a
>>> particular post") and http://bugs.lifetype.net/view.php?id=1387
>>> ("comments with article_id = 0 created by some spam bots")
>>>
>>> The problem here was that since the bayesian filter is run *before* any
>>> application logic is run, it should also check things like whether
>>> comments are enabled or not and if the article is found at all or not,
>>> even though these same checks are applied later on in the
>>> AddCommentAction class. The articleId parameter was taken as is from the
>>> request, without performing any check other than checking if it is an
>>> integer, so this caused some comments to point to an article with an id
>>> of '0' because we did not check if the article really existed before
>>> saving the spam comment. And the same applies to the other situation,
>>> with the toggle for enabling and disabling comments.
>>>
>>> The solution was to add some additional logic to the BayesianFilter
>>> filter class and perform these checks. That does indeed duplicate some
>>> of the logic found later in the process flow, but I did not find a more
>>> elegant solution for this (at least not without a redesign of the whole
>>> filter architecture anyway)
>>> ------------------------------------------------------------------------
>>>
>>>
>>> On Fri, 30 Nov 2007, Paul Westbrook wrote:
>>>> Hello,
>>>>  That should be fine.  But in revision 5918 it looks like it is
>>>> intentional that the Bayesian filter runs first.
>>>>
>>>> --Paul
>>>>
>>>> On 11/30/07, Oscar Renalias <oscar at renalias.net> wrote:
>>>>>
>>>>> So can this issue be closed by placing the Bayesian filter at the end
>>>>> of the pipeline chain?
>>>>>
>>>>> On Nov 30, 2007, at 6:48 AM, Jon Daley wrote:
>>>>>
>>>>>> On Fri, 30 Nov 2007, Mark Wu wrote:
>>>>>>> Why can't we just put the bayesian filter last in the order? It
>>>>>>> seems to solve this problem more easily.
>>>>>>      Does that fix everything?  It is certainly the easiest, both
>>>>>> coding- and performance-wise.
>>>>>>      With my thinking it seems like that fixes it - at least for now,
>>>>>> because we don't have any other plugins that would use the inputs of
>>>>>> others.  And we can maybe do Mark's priority idea if we ever need
>>>>>> that sort of thing.
>>>>>>      As long as it works for Paul's stuff, I think that sounds good.
>>>>>> So, then we should take Mark's rev 6088 or whatever it is and use
>>>>>> that, but modify it to pass in the previouslyRejected flag, and then
>>>>>> put the bayesian at the end.
>>>>>>
>>>>>>> BTW, most LifeType installations on CJK sites do rely on the
>>>>>>> Bayesian filter to protect against spam attacks. Because the
>>>>>>> tokenizing algorithm can't separate CJK into atomic tokens. We
>>>>>>> don't use stop words and "white space" to separate a paragraph
>>>>>>> into "words".
>>>>>>      I am not sure what you are saying.  It seems like you are saying
>>>>>> the tokenizer doesn't work, so then it seems that the bayesian
>>>>>> filter wouldn't be very good at all...
>>>>>>
>>>>>>      Well, it's been 10 minutes since I read your idea of simply
>>>>>> putting the bayesian filter at the end, and I haven't come up with a
>>>>>> reason why it won't work.  So, probably good.  Do you want to do it,
>>>>>> or me?
>>>>>>
>>>>>> --
>>>>>> Jon Daley
>>>>>> http://jon.limedaley.com/
>>>>>>
>>>>>> Whenever people agree with me I always feel I must be wrong.
>>>>>> -- Oscar Wilde
>>>>>> _______________________________________________
>>>>>> pLog-svn mailing list
>>>>>> pLog-svn at devel.lifetype.net
>>>>>> http://limedaley.com/mailman/listinfo/plog-svn
>>>>>
>>>>
>>>
>>> --
>>> Jon Daley
>>> http://jon.limedaley.com/
>>>
>>> All who would win joy, must share it; happiness was born a twin.
>>> -- Lord Byron

-- 
Jon Daley
http://jon.limedaley.com/

You are only as wise as others perceive you to be.
-- M. Shawn Cole

