[pLog-svn] BlogNameValidator() cause Chinese blog name error!

Oscar Renalias oscar at renalias.net
Wed Sep 12 04:18:50 EDT 2007


Please don't forget to update the test case BlogNameValidator_Test.
You will probably need to save the class file as UTF-8 if you need
to add Chinese characters, so please do so.

Oscar
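
A rough sketch of the behaviour discussed in this thread, in Python rather
than LifeType's PHP. Everything here is an assumption for illustration:
urlize() below is a hypothetical ASCII-only stand-in for
Textfilter::urlize(), and validate_blog_name() with its
needs_url_safe_name flag is not the real BlogNameValidator API. It only
shows why "台北教會" fails today and how the check could be skipped when
subdomains and custom URLs are disabled.

```python
import re


def urlize(text: str) -> str:
    """Hypothetical ASCII-only slug function (stand-in for Textfilter::urlize())."""
    # Collapse every run of characters outside a-z / 0-9 into a single dash.
    slug = re.sub(r"[^a-z0-9]+", "-", text.lower())
    return slug.strip("-")


def validate_blog_name(name: str, needs_url_safe_name: bool) -> bool:
    """Sketch of the proposed rule, not the actual validator.

    needs_url_safe_name is assumed to be True when subdomains or
    custom URLs are enabled, and False otherwise.
    """
    if not name.strip():
        return False  # an empty name is never acceptable
    if not needs_url_safe_name:
        return True  # no slug is needed, so accept the name as given
    # A slug is needed: reject names that urlize to an empty string.
    return urlize(name) != ""


# Cases a UTF-8 test file would want to cover.
chinese_name = "台北教會"  # "church in Taipei", from the thread
print(repr(urlize(chinese_name)))
print(validate_blog_name(chinese_name, needs_url_safe_name=False))
print(validate_blog_name(chinese_name, needs_url_safe_name=True))
```

The test file itself has to be saved as UTF-8 for the Chinese literal to
survive, which is the point made above.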

On 9/11/07, Mark Wu <markplace at gmail.com> wrote:
> Just saw this.
>
> If we all agree, I will try to modify the code based on our discussion. :)
>
> Mark
>
> > -----Original Message-----
> > From: plog-svn-bounces at devel.lifetype.net
> > [mailto:plog-svn-bounces at devel.lifetype.net] On Behalf Of
> > Oscar Renalias
> > Sent: Wednesday, September 12, 2007 2:03 AM
> > To: LifeType Developer List
> > Subject: Re: [pLog-svn] BlogNameValidator() cause Chinese
> > blog name error!
> >
> > It's not just blog_slug. It's also category_slug, album_slug,
> > resource_slug, global_article_category_slug,
> > blog_category_slug, and so on.
> >
> > The main reason for the BlogNameValidator class was to
> > prevent situations where custom URLs or subdomain URLs
> > would end up with an empty blog slug because the original
> > name consisted entirely of HTML or invalid characters. As
> > suggested by Mark, whenever subdomains and custom URLs
> > are disabled, we don't need to impose this restriction (and
> > the validator will accept whatever is passed).
> >
> > On 11 Sep 2007, at 20:56, Jon Daley wrote:
> >
> > >     Yes, I agree that looks ugly, and we should try not to do that.
> > > Has someone (Oscar) not liked the blog_slug, or is it just that it
> > > requires new code, so it isn't a trivial add?
> > >     For validating: it isn't a "free form string", it is a blog name, so
> > > [a-zA-Z] or whatever we do is good enough for English. Is there a
> > > similar regexp you can add that covers Chinese characters?  (Do we
> > > need to cover other languages too?)
> > >
> > > On Wed, 12 Sep 2007, Mark Wu wrote:
> > >
> > >> Hi Jon:
> > >
> > > Take a look at the Chinese (zh) Wikipedia. A Chinese string can be
> > > encoded as UTF-8 in a URL like this:
> > >
> > > http://zh.wikipedia.org/w/index.php?title=%E4%B8%AD%E4%B8%96%E7%B4%80%E9%A3%B2%E9%A3%9F%E6%96%87%E5%8C%96&variant=zh-tw
> > >
> > > The browser will accept this; both FF and IE do.
> > >
> > > For me, I don't like it. That's why I said "blog_slug" is a better
> > > solution for this. :D
> > >
> > > Agreed, we need to validate the input string. But I really have no
> > > idea how to validate a free form "string".
> > >
> > > And even though we use the domainize or urlize function to validate
> > > the blog name at the moment, we still use the original blog name input
> > > by the user (with only the HTML filtered) in our addBlogAction ...
> > >
> > > So if SQL injection can occur with the string validator, it can
> > > happen with blognamevalidator, too ...
> > >
> > > Mark
> > >
> > >> -----Original Message-----
> > >> From: plog-svn-bounces at devel.lifetype.net [mailto:plog-svn-
> > >> bounces at devel.lifetype.net] On Behalf Of Jon Daley
> > >> Sent: Wednesday, September 12, 2007 1:29 AM
> > >> To: LifeType Developer List
> > >> Subject: Re: [pLog-svn] BlogNameValidator() cause Chinese
> > blog name
> > >> error!
> > >>
> > >>    You said, "if you use UTF8, it would be fixed".  But then you said
> > >> that it would show %xx%yy - is that acceptable?
> > >>  Does it actually show those characters in the URL, or does the
> > >> browser/server change those back to "real" characters that look how
> > >> you want?
> > >>
> > >>    We need validation on input, and string validator doesn't count.
> > >> Can you write a validator that works for you, and doesn't allow SQL
> > >> injections?
> > >> On Wed, 12 Sep 2007, Mark Wu wrote:
> > >> > As you said, the issue is everywhere in LifeType where we convert
> > >> > a string to a valid URL, for example {xxxname} in custom URLs. It
> > >> > is an old problem. :(
> > >> > That's why most China/Taiwan users use {xxxid} instead of
> > >> > {xxxname} in custom URLs.
> > >> > ** I raised this issue before: I said maybe we have to add an
> > >> > xxx_slug for every object that needs to be urlized. But we all
> > >> > agreed it is not a good idea to add xxx_slug to XX objects. :)
> > >> > And, yes, the issue can be fixed if we only use UTF-8 ...
> > >> > After we urlize the Chinese sentence (encode the string to
> > >> > UTF-8), the string will become %xx%yy%zz.
> > >> > The "%xx%yy%zz" can be used in a URL path without any problem,
> > >> > but it does not work in a domain name ... That's another issue.
> > >> > Therefore I said it can't be fixed. :(
> > >> > Mark
> > >> > -----Original Message-----
> > >> > From: plog-svn-bounces at devel.lifetype.net
> > >> > [mailto:plog-svn-bounces at devel.lifetype.net] On Behalf
> > Of Jon Daley
> > >> > Sent: Wednesday, September 12, 2007 12:51 AM
> > >> > To: LifeType Developer List
> > >> > Subject: Re: [pLog-svn] BlogNameValidator() cause Chinese blog
> > >> name > error!
> > >> >
> > >> >          I understand that it returns an empty string, but the problem
> > >> > isn't in the blognamevalidator, but in urlize and domainize, which
> > >> > are used in other places in the code.  Don't you have issues
> > >> > elsewhere?
> > >> > > On Wed, 12 Sep 2007, Mark Wu wrote:
> > >> > > > Hi Jon:
> > >> > > > Agreed.
> > >> > > > But I don't think it can be fixed if we use the domainize()
> > >> > > > function, because domainize() and urlize() remove
> > >> > > > characters that are not allowed in a URL.
> > >> > > > Sometimes a whole Chinese sentence, after domainize() or
> > >> > > > urlize(), will return an empty string, or the same string as a
> > >> > > > different Chinese sentence.
> > >> > > > Take the Chinese sentence "台北教會" for example. It means
> > >> > > > "church in Taipei". After domainize(), it returns an EMPTY
> > >> > > > string, so the user cannot create a new blog .....
> > >> > > > That's why I said I have to change it back to the string validator
> > >> > > > ONLY IF the blog admin has not enabled subdomain or blogdomain.
> > >> > > > Otherwise most Chinese users cannot add a new blog at the moment,
> > >> > > > which is really not good.
> > >> > > > ** The best way to solve this is to add a blog_slug to blogInfo
> > >> > > > that is separate from the blog name. It would avoid all problems
> > >> > > > of this kind.
> > >> > > > Mark
> > >> > > > -----Original Message-----
> > >> > > From: plog-svn-bounces at devel.lifetype.net
> > >> > > [mailto:plog-svn-bounces at devel.lifetype.net] On Behalf Of Jon
> > >> Daley
> > >> > > Sent: Tuesday, September 11, 2007 9:24 PM
> > >> > > To: LifeType Developer List
> > >> > > Subject: Re: [pLog-svn] BlogNameValidator() cause Chinese blog
> > >> name > > error!
> > >> > >
> > >> > >        I don't think changing it to string validator is the right
> > >> > > answer, since we use urlize and domainize in other places, so if
> > >> > > they are broken for Chinese characters, they need to be fixed;
> > >> > > otherwise, you will have issues in other places too.
> > >> > >        A string validator doesn't do anything, so we can't count on
> > >> > > that to actually validate the data.
> > >> > > > > On Tue, 11 Sep 2007, Mark Wu wrote:
> > >> > > > > > Hi Oscar & Jon:
> > >> > > >
> > >> > > > It seems the new BlogNameValidator causes an error when a user
> > >> > > > enters a Chinese blog name.
> > >> > > >
> > >> > > > I am still checking on it; it seems the new Textfilter::domainize()
> > >> > > > or Textfilter::urlize() causes the error.
> > >> > > >
> > >> > > > If I cannot fix this bug, I will change it back to the string
> > >> > > > validator when the blog admin has not enabled the subdomain and
> > >> > > > blogdomain functions. That avoids this kind of problem.
> > >> > > >
> > >> > > > Mark
> > >> > > >
> > >> > > --
> > >> > > Jon Daley
> > >> > > http://jon.limedaley.com/
> > >> > >
> > >> > > The real world is a special case.
> > >> > > -- Horngren's Observation
> > >> > > _______________________________________________
> > >> > > pLog-svn mailing list
> > >> > > pLog-svn at devel.lifetype.net
> > >> > > http://limedaley.com/mailman/listinfo/plog-svn
> > >> > --
> > >> > Jon Daley
> > >> > http://jon.limedaley.com/
> > >> >
> > >> > Keep your face to the sunshine and you cannot see the shadow.
> > >> > -- Helen Keller
> > >> --
> > >> Jon Daley
> > >> http://jon.limedaley.com/
> > >> The secret to programming is not intelligence,
> > >>    though of course that helps.
> > >> It is not hard work or experience, though they help, too.
> > >> The secret to programming is having smart friends.
> > >
> > >
> > > --
> > > Jon Daley
> > > http://jon.limedaley.com/
> > >
> > > Use of excessive, unnecessary, commas, has always been,
> > >   one of my, pet
> > > peeves.
> >
>
