[pLog-svn] r5062 - plog/branches/lifetype-1.2/class/data
Mark Wu
markplace at gmail.com
Wed Mar 14 04:45:11 EDT 2007
The problem is in MySQL 5.
htmlDecode decodes the encoded HTML entities into normalized text. In
MySQL 5 the normalized_text field is set with a utf8 collation and
character set = utf8.
When we decode it, htmlDecode uses ISO-8859-1 to decode the UTF-8
string, and the decode function breaks some of the strings.
When the field is saved to MySQL 5, it shows an error saying "The data too
long", and the query fails.
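To make the breakage concrete, here is a minimal sketch (not the LifeType code). It assumes a hypothetical input of "é" written as a numeric entity and compares a chr()-style replacement with PHP's built-in UTF-8 aware decode. The chr() result is a single ISO-8859-1 byte, which is not a valid UTF-8 sequence, and that is the kind of value a strict utf8 column can refuse.

```php
<?php
// Hypothetical input: "é" written as a numeric entity.
$entity = '&#233;';

// What a chr()-based replacement effectively produces: one raw byte 0xE9.
$latin1 = preg_replace_callback(
    '~&#([0-9]+);~',
    function ($m) { return chr((int) $m[1]); },
    $entity
);

// What a UTF-8 aware decode produces: the two bytes 0xC3 0xA9.
$utf8 = html_entity_decode($entity, ENT_QUOTES, 'UTF-8');

// preg_match() with the /u modifier returns false when the subject is not valid UTF-8.
$isValidUtf8 = function ($s) { return preg_match('//u', $s) === 1; };

printf("chr():  %s valid=%s\n", bin2hex($latin1), var_export($isValidUtf8($latin1), true));
printf("UTF-8:  %s valid=%s\n", bin2hex($utf8), var_export($isValidUtf8($utf8), true));
```

The first line reports the byte e9 as invalid UTF-8, the second reports c3a9 as valid.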
But in MySQL 4 it works without any problem. :(
That is because it lets the field store characters that are not encoded as UTF-8.
That's the problem. That's why we never found this bug.
The original htmlDecode used HTML_SPECIALCHARS, so it didn't touch the CJK
characters; it just decoded those special characters.
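As an illustration of that behaviour, the sketch below uses PHP's built-in htmlspecialchars_decode() as a stand-in for the original function (an assumption, since the original source is not shown here). The sample string is hypothetical. Only the handful of special-character entities are translated, so the multibyte CJK bytes pass through unchanged.

```php
<?php
// Hypothetical sample: UTF-8 CJK text with escaped markup characters.
$encoded = '&lt;b&gt;中文 &amp; 日本語&lt;/b&gt;';

// Only &amp; &lt; &gt; &quot; and &#039; are translated;
// every other byte, including the CJK characters, is left as it is.
$decoded = htmlspecialchars_decode($encoded, ENT_QUOTES);

echo $decoded, "\n";
```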
The problem can be solved easily if we relax the constraints of MySQL 5 by
switching from strict mode back to the traditional (non-strict) mode. (But users
cannot touch the my.cnf/my.ini settings when they use shared hosting.)
Although it works after relaxing the MySQL 5 constraints, that is not a good
solution, because the CJK characters are still broken when we use the new
htmlDecode function.
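One possible direction, sketched under assumptions: let html_entity_decode() do the work with an explicit UTF-8 charset, so numeric and named entities both come out as UTF-8 bytes. The helper name htmlDecodeUtf8 is hypothetical and is not the LifeType function. It also assumes a PHP version whose html_entity_decode() supports UTF-8, and it only covers blogs that store UTF-8.

```php
<?php
// Hypothetical helper, not LifeType's htmlDecode().
function htmlDecodeUtf8($htmlString, $quoteStyle = ENT_QUOTES)
{
    // Passing the charset makes numeric and named entities decode to UTF-8 bytes.
    return html_entity_decode($htmlString, $quoteStyle, 'UTF-8');
}

// Hypothetical sample mixing raw CJK text with named, hex and decimal entities.
$input  = '中文 &eacute; &#x4e2d; &#25991; &amp; &quot;ok&quot;';
$output = htmlDecodeUtf8($input);

echo $output, "\n";

// The result should still be valid UTF-8, so a strict utf8 column can store it.
var_dump(preg_match('//u', $output) === 1);
```

Other character sets would still need the encoding passed in from the locale, which this sketch does not attempt.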
Mark
> -----Original Message-----
> From: plog-svn-bounces at devel.lifetype.net
> [mailto:plog-svn-bounces at devel.lifetype.net] On Behalf Of
> Oscar Renalias
> Sent: Wednesday, March 14, 2007 4:29 PM
> To: plog-svn at devel.lifetype.net
> Subject: Re: [pLog-svn] r5062 - plog/branches/lifetype-1.2/class/data
>
> htmlDecode is apparently only called from
> Textfilter::normalizeText() and Textfilter::slugify(), nothing else.
>
> But now that I look at this, how can this be the cause of
> the issues you, Mark, were mentioning? Where exactly was
> htmlDecode() causing issues?
>
> On 14 Mar 2007, at 05:22, Mark Wu wrote:
>
> > I think I found a way to fix it. But it is still under test!!
> >
> > We encode our key to UTF-8 before we use it.
> > function htmlDecode( $htmlString, $quote_style = ENT_QUOTES )
> > {
> >     // replace numeric entities
> >     $htmlString = preg_replace('~&#x([0-9a-f]+);~ei', 'chr(hexdec("\\1"))', $htmlString);
> >     $htmlString = preg_replace('~&#([0-9]+);~e', 'chr(\\1)', $htmlString);
> >     // replace literal entities
> >     $trans_table = get_html_translation_table( HTML_ENTITIES, $quote_style );
> >     foreach ( $trans_table as $key => $value ){
> >         $utf8_trans_table[$value] = utf8_encode( $key );
> >     }
> >     return strtr( $htmlString, $utf8_trans_table );
> > }
> >
> >
> > Another problem is that this only works under UTF-8; for other
> > character sets, I have no idea...
> >
> > So, we need to get the character set from the current locale and pass it
> > into textfilter. I am afraid it will change a lot.
> >
> > Is there any good way to pass the encoding/character set information to
> > textfilter?
> >
> > Mark
> >> -----Original Message-----
> >> From: plog-svn-bounces at devel.lifetype.net
> >> [mailto:plog-svn-bounces at devel.lifetype.net] On Behalf Of Oscar
> >> Renalias
> >> Sent: Tuesday, March 13, 2007 11:40 PM
> >> To: plog-svn at devel.lifetype.net
> >> Subject: Re: [pLog-svn] r5062 - plog/branches/lifetype-1.2/class/data
> >>
> >> Or how about using mb_eregi_replace() for the regular expression, if
> >> available?
> >>
> >> On 3/13/07, Oscar Renalias <oscar at renalias.net> wrote:
> >>> As Jon said, htmlDecode should be the opposite of filterHtmlEntities:
> >>>
> >>> $x = filterHtmlEntities(htmlDecode($x))
> >>>
> >>> So you can basically get the output of several strings passed through
> >>> filterHtmlEntities and then make sure that htmlDecode can revert them
> >>> back to what they were in the beginning. Jon may be able to give you
> >>> more precise examples.
> >>>
> >>> Oh and a test case would be good too, just to make sure :)
> >>>
> >>> On 3/13/07, Mark Wu <markplace at gmail.com> wrote:
> >>>> Hi Jon:
> >>>>
> >>>> I am looking at this problem now; do you have any real example for me to test?
> >>>>
> >>>> Mark
> >>>>
> >>>>> -----Original Message-----
> >>>>> From: plog-svn-bounces at devel.lifetype.net
> >>>>> [mailto:plog-svn-bounces at devel.lifetype.net] On Behalf Of Reto
> >>>>> Hugi
> >>>>> Sent: Tuesday, March 13, 2007 10:32 PM
> >>>>> To: plog-svn at devel.lifetype.net
> >>>>> Subject: Re: [pLog-svn] r5062 -
> >>>>> plog/branches/lifetype-1.2/class/data
> >>>>>
> >>>>> Jon Daley wrote:
> >>>>>> We can go ahead with it as it is, we probably should put a
> >>>>>> note somewhere that tells people how to fix it if they care about this.
> >>>>>> And then, hopefully, Mark can see if there is a way to fix it
> >>>>>> correctly for everyone.
> >>>>>>
> >>>>>
> >>>>> if this would include something like "sorry if we broke it.
> >>>>> if you want to fix it, take file xxx.php and replace it with the
> >>>>> currently distributed one" I would say we hold the release back a
> >>>>> couple of days to fix this issue once and for all. Isn't typing
> >>>>> stuff in a textarea (maybe on top of a wysiwyg editor) what any
> >>>>> basic blogging tool should actually do best?
> >>>>>
> >>>>> reto
> >>>>>
> >>>>>
> >>>>> _______________________________________________
> >>>>> pLog-svn mailing list
> >>>>> pLog-svn at devel.lifetype.net
> >>>>> http://limedaley.com/mailman/listinfo/plog-svn