[pLog-svn] r5062 - plog/branches/lifetype-1.2/class/data

Oscar Renalias oscar at renalias.net
Wed Mar 14 04:51:02 EDT 2007


You could probably extend Textfilter::htmlDecode() to accept an
encoding string and, as a side effect, Textfilter::normalizeText()
too. Then modify all calls to Textfilter::normalizeText() to provide
this parameter, which should default to iso-8859-1.
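
A rough sketch of the idea above, assuming PHP's built-in
html_entity_decode() is acceptable for the decoding step. The function
name htmlDecodeSketch() and the $encoding parameter are made up for
illustration; this is not the current Textfilter code.

```php
<?php
// Hypothetical sketch, not the actual LifeType Textfilter::htmlDecode().
// $encoding defaults to ISO-8859-1 so that existing callers keep working.
function htmlDecodeSketch( $htmlString, $quote_style = ENT_QUOTES, $encoding = 'ISO-8859-1' )
{
    // html_entity_decode() handles named and numeric entities and
    // emits the decoded characters in the requested encoding
    return html_entity_decode( $htmlString, $quote_style, $encoding );
}

// a caller that knows the blog locale is UTF-8 passes the encoding explicitly
$decoded = htmlDecodeSketch( 'caf&eacute; &#20013;', ENT_QUOTES, 'UTF-8' );
```

A normalizeText() variant would take the same extra parameter and simply
forward it, so callers that pass nothing keep the old behaviour.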

Would this be enough?

On 14 Mar 2007, at 10:45, Mark Wu wrote:

> The problem is in MySQL 5.
>
> htmlDecode decodes the encoded HTML entities into normalized text. In
> MySQL 5 the normalized_text field is set to collation=utf-8 and
> character-set=utf8.
>
> When we decode it, htmlDecode uses iso-8859-1 to decode the UTF-8
> string, and the decode function breaks some of the strings.
>
> When the field is saved to MySQL 5, it shows an error saying "The data
> too long", and the query fails.
>
> But in MySQL 4 it works without any problem. :(
>
> That is because MySQL 4 lets the field store characters that are not
> valid UTF-8.
>
> That's the problem, and that's why we never found this bug.
>
> The original htmlDecode used HTML_SPECIALCHARS, so it didn't touch the
> CJK characters; it only decoded the special characters.
>
> The problem can be solved easily if we relax the MySQL 5 constraint by
> turning strict mode to traditional mode. (But users cannot touch the
> my.cnf/my.ini settings when they are on shared hosting.)
>
> Although it works after relaxing the MySQL 5 constraint, that is not a
> good solution, because the CJK characters are still broken when we use
> the new htmlDecode function.
>
> Mark
>
>> -----Original Message-----
>> From: plog-svn-bounces at devel.lifetype.net
>> [mailto:plog-svn-bounces at devel.lifetype.net] On Behalf Of
>> Oscar Renalias
>> Sent: Wednesday, March 14, 2007 4:29 PM
>> To: plog-svn at devel.lifetype.net
>> Subject: Re: [pLog-svn] r5062 - plog/branches/lifetype-1.2/class/data
>>
>> htmlDecode is apparently only called from
>> Textfilter::normalizeText() and Textfilter::slugify(), nothing else.
>>
>> But now that I look at this, how can this be the cause of
>> the issues you were mentioning, Mark? Where exactly was
>> htmlDecode() causing issues?
>>
>> On 14 Mar 2007, at 05:22, Mark Wu wrote:
>>
>>> I think I found a way to fix it, but it is still under test!
>>>
>>> We encode each key to UTF-8 before we use it.
>>> 	function htmlDecode( $htmlString, $quote_style = ENT_QUOTES )
>>> 	{
>>> 		// replace numeric entities
>>> 		$htmlString = preg_replace('~&#x([0-9a-f]+);~ei', 'chr(hexdec("\\1"))', $htmlString);
>>> 		$htmlString = preg_replace('~&#([0-9]+);~e', 'chr(\\1)', $htmlString);
>>> 		// replace literal entities
>>> 		$trans_table = get_html_translation_table( HTML_ENTITIES, $quote_style );
>>> 		$utf8_trans_table = array();
>>> 		foreach ( $trans_table as $key => $value ) {
>>> 			// map each entity to the UTF-8 form of its character
>>> 			$utf8_trans_table[$value] = utf8_encode( $key );
>>> 		}
>>> 		return strtr( $htmlString, $utf8_trans_table );
>>> 	}
>>>
>>>
>>> Another problem is that this only works under UTF-8; for other
>>> character sets, I have no idea...
>>>
>>> So we need to get the character set from the current locale and pass
>>> it into textfilter. I'm afraid that will change a lot.
>>>
>>> Is there any good way to pass the encoding/character set information
>>> to textfilter?
>>>
>>> Mark
>>>> -----Original Message-----
>>>> From: plog-svn-bounces at devel.lifetype.net
>>>> [mailto:plog-svn-bounces at devel.lifetype.net] On Behalf Of Oscar
>>>> Renalias
>>>> Sent: Tuesday, March 13, 2007 11:40 PM
>>>> To: plog-svn at devel.lifetype.net
>>>> Subject: Re: [pLog-svn] r5062 - plog/branches/lifetype-1.2/class/data
>>>>
>>>> Or how about using mb_eregi_replace() for the regular expression, if
>>>> available?
>>>>
>>>> On 3/13/07, Oscar Renalias <oscar at renalias.net> wrote:
>>>>> As Jon said, htmlDecode should be the opposite of filterHtmlEntities:
>>>>>
>>>>> $x = filterHtmlEntities(htmlDecode($x))
>>>>>
>>>>> So you can basically get the output of several strings passed through
>>>>> filterHtmlEntities and then make sure that htmlDecode can revert them
>>>>> back to what they were in the beginning. Jon may be able to give you
>>>>> more precise examples.
>>>>>
>>>>> Oh and a test case would be good too, just to make sure :)
>>>>>
>>>>> On 3/13/07, Mark Wu <markplace at gmail.com> wrote:
>>>>>> Hi Jon:
>>>>>>
>>>>>> I am looking at this problem now. Do you have any real example for
>>>>>> me to test?
>>>>>>
>>>>>> Mark
>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: plog-svn-bounces at devel.lifetype.net
>>>>>>> [mailto:plog-svn-bounces at devel.lifetype.net] On Behalf Of Reto
>>>>>>> Hugi
>>>>>>> Sent: Tuesday, March 13, 2007 10:32 PM
>>>>>>> To: plog-svn at devel.lifetype.net
>>>>>>> Subject: Re: [pLog-svn] r5062 -
>>>>>>> plog/branches/lifetype-1.2/class/data
>>>>>>>
>>>>>>> Jon Daley wrote:
>>>>>>>>     We can go ahead with it as it is, we probably should put a
>>>>>>>> note somewhere that tells people how to fix it if they care about
>>>>>>>> this. And then, hopefully, Mark can see if there is a way to fix it
>>>>>>>> correctly for everyone.
>>>>>>>>
>>>>>>>
>>>>>>> if this would include something like "sorry if we broke it.
>>>>>>> if you want to fix it, take file xxx.php and replace it with the
>>>>>>> currently distributed one" I would say we hold the release back a
>>>>>>> couple of days to fix this issue once and for all. Isn't typing
>>>>>>> stuff in a textarea (maybe on top of a wysiwyg editor) what any
>>>>>>> basic blogging tool should actually do best?
>>>>>>>
>>>>>>> reto
>>>>>>>
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> pLog-svn mailing list
>>>>>>> pLog-svn at devel.lifetype.net
>>>>>>> http://limedaley.com/mailman/listinfo/plog-svn