[pLog-svn] Email Regex

Mark Wu markplace at gmail.com
Tue Jul 18 17:15:30 GMT 2006


I don't like depending on PEAR either :)

I just looked at the code, and it looks like we can migrate the following
code into our validator.

So, Ammar, maybe you can try what Oscar suggested. It would be a plus for
our e-mail validator.

Mark

================= code from PEAR::Validate ======================

    function __emailRFC822(&$email, &$options)
    {
        static $address = null;
        static $uncomment = null;
        if (!$address) {
            // atom        =  1*<any CHAR except specials, SPACE and CTLs>
            $atom = '[^][()<>@,;:\\".\s\000-\037\177-\377]+\s*';
            // qtext       =  <any CHAR excepting <">,     ; => may be folded
            //         "\" & CR, and including linear-white-space>
            $qtext = '[^"\\\\\r]';
            // quoted-pair =  "\" CHAR                     ; may quote any char
            $quoted_pair = '\\\\.';
            // quoted-string = <"> *(qtext/quoted-pair) <">; Regular qtext or
            //                                             ;   quoted chars.
            $quoted_string = '"(?:' . $qtext . '|' . $quoted_pair . ')*"\s*';
            // word        =  atom / quoted-string
            $word = '(?:' . $atom . '|' . $quoted_string . ')';
            // local-part  =  word *("." word)             ; uninterpreted
            //                                             ; case-preserved
            $local_part = $word . '(?:\.\s*' . $word . ')*';
            // dtext       =  <any CHAR excluding "[",     ; => may be folded
            //         "]", "\" & CR, & including linear-white-space>
            $dtext = '[^][\\\\\r]';
            // domain-literal =  "[" *(dtext / quoted-pair) "]"
            $domain_literal = '\[(?:' . $dtext . '|' . $quoted_pair . ')*\]\s*';
            // sub-domain  =  domain-ref / domain-literal
            // domain-ref  =  atom                         ; symbolic reference
            $sub_domain = '(?:' . $atom . '|' . $domain_literal . ')';
            // domain      =  sub-domain *("." sub-domain)
            $domain = $sub_domain . '(?:\.\s*' . $sub_domain . ')*';
            // addr-spec   =  local-part "@" domain        ; global address
            $addr_spec = $local_part . '@\s*' . $domain;
            // route       =  1#("@" domain) ":"           ; path-relative
            $route = '@' . $domain . '(?:,@\s*' . $domain . ')*:\s*';
            // route-addr  =  "<" [route] addr-spec ">"
            $route_addr = '<\s*(?:' . $route . ')?' . $addr_spec . '>\s*';
            // phrase      =  1*word                       ; Sequence of words
            $phrase = $word . '+';
            // mailbox     =  addr-spec                    ; simple address
            //             /  phrase route-addr            ; name & addr-spec
            $mailbox = '(?:' . $addr_spec . '|' . $phrase . $route_addr . ')';
            // group       =  phrase ":" [#mailbox] ";"
            $group = $phrase . ':\s*(?:' . $mailbox . '(?:,\s*' . $mailbox . ')*)?;\s*';
            //     address     =  mailbox                      ; one addressee
            //                 /  group                        ; named list
            $address = '/^\s*(?:' . $mailbox . '|' . $group . ')$/';
            $uncomment =
            '/((?:(?:\\\\"|[^("])*(?:' . $quoted_string .
                                             ')?)*)((?<!\\\\)\((?:(?2)|.)*?(?<!\\\\)\))/';
        }
        // strip comments
        $email = preg_replace($uncomment, '$1 ', $email);
        return preg_match($address, $email);
    }

    /**
     * Validate an email
     *
     * @param string $email email to validate
     * @param mixed boolean (BC) $check_domain   Check or not if the domain exists
     *              array $options associative array of options
     *              'check_domain' boolean Check or not if the domain exists
     *              'use_rfc822' boolean Apply the full RFC822 grammar
     *
     * @return boolean true if valid email, false if not
     *
     * @access public
     */
    function email($email, $options = null)
    {
        $check_domain = false;
        $use_rfc822 = false;
        if (is_bool($options)) {
            $check_domain = $options;
        } elseif (is_array($options)) {
            extract($options);
        }

        // the base regexp for address
        $regex = '&^(?:                                               # recipient:
         ("\s*(?:[^"\f\n\r\t\v\b\s]+\s*)+")|                          #1 quoted name
         ([-\w!\#\$%\&\'*+~/^`|{}]+(?:\.[-\w!\#\$%\&\'*+~/^`|{}]+)*)) #2 OR dot-atom
         @(((\[)?                     #3 domain, 4 as IPv4, 5 optionally bracketed
         (?:(?:(?:(?:25[0-5])|(?:2[0-4][0-9])|(?:[0-1]?[0-9]?[0-9]))\.){3}
               (?:(?:25[0-5])|(?:2[0-4][0-9])|(?:[0-1]?[0-9]?[0-9]))))(?(5)\])|
         ((?:[a-z0-9](?:[-a-z0-9]*[a-z0-9])?\.)*[a-z](?:[-a-z0-9]*[a-z0-9])?))  #6 domain as hostname
         $&xi';

        if ($use_rfc822? Validate::__emailRFC822($email, $options) :
            preg_match($regex, $email)) {
            if ($check_domain && function_exists('checkdnsrr')) {
                list (, $domain)  = explode('@', $email);
                if (checkdnsrr($domain, 'MX') || checkdnsrr($domain, 'A')) {
                    return true;
                }
                return false;
            }
            return true;
        }
        return false;
    } 
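
To see why a looser local-part pattern would help, here is a minimal sketch
(not the PEAR code itself). It compares our current rule, as quoted further
down in this thread, with a simplified stand-in for the dot-atom branch of
the base regexp above. The function name sketch_is_valid_email() is
hypothetical, and wrapping the current rule in PCRE delimiters with the "i"
flag is an assumption about how it is applied.

```php
<?php
// Current LifeType rule as quoted in this thread. Assumption: it is applied
// case-insensitively, so it is wrapped here as a PCRE pattern with "i".
$old_rule = '/^[a-z0-9]*([-_.+]?[a-z0-9])+@[a-z0-9]+([-.]?[a-z0-9])+\.[a-z]{2,4}/i';

// Hypothetical helper: a cut-down version of the dot-atom branch of the PEAR
// base regexp. Hostname domains only; no quoted names, no IP literals.
function sketch_is_valid_email($email)
{
    // One atom of the local part; \w already covers letters, digits and "_".
    $atom  = '[-\w!#$%&\'*+~\/^`|{}]+';
    // One hostname label: alphanumeric at both ends, hyphens allowed inside.
    $label = '[a-z0-9](?:[-a-z0-9]*[a-z0-9])?';
    $regex = '/^' . $atom . '(?:\.' . $atom . ')*'
           . '@(?:' . $label . '\.)*[a-z](?:[-a-z0-9]*[a-z0-9])?$/i';
    return preg_match($regex, $email) === 1;
}

// The address Ammar reported: valid, but it contains a double underscore.
var_dump(preg_match($old_rule, 'ab__ar@hotmail.com'));   // int(0)
var_dump(sketch_is_valid_email('ab__ar@hotmail.com'));   // bool(true)
// Consecutive dots in the local part are still rejected.
var_dump(sketch_is_valid_email('foo..bar@example.com')); // bool(false)
```

The current rule allows at most one separator between alphanumerics, so it
cannot consume "__". The dot-atom pattern treats "_" as an ordinary atom
character and only gives "." a special role.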

> -----Original Message-----
> From: plog-svn-bounces at devel.lifetype.net 
> [mailto:plog-svn-bounces at devel.lifetype.net] On Behalf Of 
> Oscar Renalias
> Sent: Wednesday, July 19, 2006 12:39 AM
> To: plog-svn at devel.lifetype.net
> Subject: Re: [pLog-svn] Email Regex
> 
> I don't like to depend on PEAR either but if we can extract 
> their validation code and integrate it into our own 
> validation framework, I think it could be a good idea.
> 
> Try and have a look at the code and if it's not too 
> complicated, please go ahead.
> 
> On 18 Jul 2006, at 18:04, Jon Daley wrote:
> 
> > I would bet we don't want to depend on PEAR just for email
> > validation.  We want as few external dependencies as possible.
> >
> > What are the invalid email addresses that pass the regex?
> >
> > We could change it to this pretty easily:
> > "^[a-z0-9]*([_.+-]+[a-z0-9])+@[a-z0-9]+([-.]?[a-z0-9])+\.[a-z]{2,4}"
> >
> > The full expression from the RFC is about a page long; we could
> > include that if we needed to.  I think it is overkill for the
> > real world.
> >
> > On Tue, 18 Jul 2006, Ammar Ibrahim wrote:
> >
> >> The regex for the email validation rule is flawed.
> >>
> >>   define( "EMAIL_FORMAT_RULE_REG_EXP",
> >> "^[a-z0-9]*([-_.+]?[a-z0-9])+@[a-z0-9]+([-.]?[a-z0-9])+\.[a-z]{2,4}");
> >>
> >> Not only do some invalid emails pass this regex but, worse, some valid
> >> emails don't pass, e.g. ab__ar at hotmail.com (note the double
> >> underscores).
> >>
> >> My suggestion would be to use the PEAR email validator. I've used it
> >> quite a lot and it looks to be very good and thorough. The validator
> >> package is big, so we can just take the email validation part and
> >> integrate it in LifeType. If this is what we agree on, I can work on
> >> this and send you the code.
> >>
> >>
> >> - Ammar
> >>
> >
> > --
> > Jon Daley
> > http://jon.limedaley.com/
> >
> > "Diplomacy" is letting them have it your way.
> > _______________________________________________
> > pLog-svn mailing list
> > pLog-svn at devel.lifetype.net
> > http://devel.lifetype.net/mailman/listinfo/plog-svn
> >
> 


