[pLog-svn] r5091 - plog/branches/lifetype-1.2/class/data
Mark Wu
markplace at gmail.com
Thu Mar 15 02:39:21 EDT 2007
BTW, I also need some one to test it in MySQL4 utf-8 or latin-1 encoding.
I only can make sure it works in MySQL5 ...
Thanks. :(
Mark
> -----Original Message-----
> From: plog-svn-bounces at devel.lifetype.net
> [mailto:plog-svn-bounces at devel.lifetype.net] On Behalf Of
> mark at devel.lifetype.net
> Sent: Thursday, March 15, 2007 1:55 PM
> To: plog-svn at devel.lifetype.net
> Subject: [pLog-svn] r5091 - plog/branches/lifetype-1.2/class/data
>
> Author: mark
> Date: 2007-03-15 01:54:57 -0400 (Thu, 15 Mar 2007) New Revision: 5091
>
> Modified:
> plog/branches/lifetype-1.2/class/data/textfilter.class.php
> Log:
> Fixed the htmlDecode() according to the discussion thread in
> svn rev. 5062 and MSN discussion with Oscar.
>
> Modified: plog/branches/lifetype-1.2/class/data/textfilter.class.php
> ===================================================================
> ---
> plog/branches/lifetype-1.2/class/data/textfilter.class.php
> 2007-03-14 14:28:27 UTC (rev 5090)
> +++
> plog/branches/lifetype-1.2/class/data/textfilter.class.php
> 2007-03-15 05:54:57 UTC (rev 5091)
> @@ -236,12 +236,33 @@
> // replace numeric entities
> $htmlString =
> preg_replace('~&#x([0-9a-f]+);~ei', 'chr(hexdec("\\1"))',
> $htmlString);
> $htmlString = preg_replace('~&#([0-9]+);~e',
> 'chr(\\1)', $htmlString);
> - // replace literal entities
> - $trans_tbl = get_html_translation_table(
> HTML_SPECIALCHARS, $quote_style );
> - $trans_tbl = array_flip($trans_tbl);
> - $trans_tbl['''] = "'";
> - return strtr($htmlString, $trans_tbl);
> - }
> + // get the entity translation table from PHP
> (current encoding is ISO-8859-1)
> + $trans_table = get_html_translation_table(
> HTML_ENTITIES, $quote_style );
> + // when we want to decode the input string to
> normalized string, there are two factors
> + // we need to take into consideration:
> + // - Input string encoding
> + // - MySQL default-character-set encoding
> + // No matter what input string encoding does,
> the normalized text saved to MySQL should
> + // follow MySQL data validation. If we don't
> follow the constraint, then MySQL will raise
> + // an error for us. (It only happend in MySQL5
> strict mode)
> + // Therefore, we need to check the
> db_character_set in our config file to see we should
> + // use the UTF-8 translation table or ISO-8859-1
> translation table
> + // This should fixed the CJK/UTF-8 characters
> break by Jon's original modification.
> + //
> + // If possible, I really hope we can accept
> UTF-8 encoding only, it will make our life easier.
> + require_once( PLOG_CLASS_PATH .
> "class/config/configfilestorage.class.php" );
> + $config = new ConfigFileStorage();
> + if( $config->getValue(
> 'db_character_set' ) == 'utf8' ) {
> + // Convert the ISO-8859-1
> translation table to UTF-8
> + foreach ( $trans_table as $key
> => $value ){
> +
> $new_trans_table[$value] = utf8_encode( $key );
> + }
> + } else {
> + // Keep original ISO-8859-1
> translation table, just flip it
> + $new_trans_table = array_flip($trans_table);
> + }
> + return strtr( $htmlString, $new_trans_table );
> + }
>
> /**
> * Normalizes the given text. By 'normalizing',
> it means removing all html markup from the text as well @@
> -394,7 +415,7 @@
> *
> * ; / ? : @ & = + $ ,
> *
> - * It will convert accented characters such as , , ,
> etc to their non-accented counterparts (a, e, i) And
> + * It will convert accented characters such as ? ? ? etc to
> + their non-accented counterparts (a, e, i) And
> * any other non-alphanumeric character that hasn't
> been removed or replaced will be thrown away.
> *
> * @param string The string that we wish to convert
> into something that can be used as a URL @@ -408,8 +429,8 @@
> $string =
> str_replace(array(';','/','?',':','@','&','=','+','$',','),
> '', $string);
>
> // replace some characters to similar ones
> - $search = array(' ', '', '', '','','','','', '', '', '',
> - '', '', '', '', '', '', '', '', '' );
> + $search = array(' ', '?, '?, '?,'?,'?,'?,'?, '?, '?, '?,
> + '?, '?, '?, '?, '?, '?, '?, '?, '? );
> lt_include(
> PLOG_CLASS_PATH."class/config/config.class.php" );
> $config =& Config::getConfig();
> $separator = $config->getValue(
> "urlize_word_separator", URLIZE_WORD_SEPARATOR_DEFAULT ); @@
> -436,7 +457,7 @@
> *
> * ; / ? : @ & = + $ ,
> *
> - * It will convert accented characters such as , , , etc to
> + * It will convert accented characters such as ? ? ? etc to
> * their non-accented counterparts (a, e, i) And
> * any other non-alphanumeric character that hasn't
> been removed
> * or replaced will be thrown away.
> @@ -459,7 +480,7 @@
> // replace some characters to similar ones
> // underscores aren't allowed in domain names
> according to rfc specs, and
> // cause trouble in some browsers, particularly
> with cookies.
> - $search = array('-', '_',' ',
> '','','','','','','','','','','','','','','','','','','' );
> + $search = array('-', '_',' ',
> + '?,'?,'?,'?,'?,'?,'?,'?,'?,'?,'?,'?,'?,'?,'?,'?,'?,'?,'? );
> $replace = array( $sep, $sep, $sep,
> 'a','o','u','e','e','a','c','a','e','i','o','u','a','e','i','o
> ','u','e','i' );
> $string = str_replace($search, $replace, $string);
>
>
>
More information about the pLog-svn
mailing list