Deane,
My response is embedded below.
activeperl-bounces@xxxxxxxxxxxxxxxxxxxxxxxx wrote on 04/26/2006 12:32:01 PM:
> Today's Topics:
> 3. Got this far with regex, now I'm stumped
> (Deane.Rothenmaier@xxxxxxxxxxxxx)
> ----------------------------------------------------------------------
> ------------------------------
>
> Message: 3
> Date: Wed, 26 Apr 2006 09:08:13 -0500
> From: Deane.Rothenmaier@xxxxxxxxxxxxx
> Subject: Got this far with regex, now I'm stumped
> To: activeperl@xxxxxxxxxxxxxxxxxxxxxxxx
> Message-ID: <OFF7B2E756.F74863B7-ON8625715C.004BB2BE-8625715C.004DAC50@xxxxxxxxxxxxx>
>
> Content-Type: text/plain; charset="us-ascii"
>
> Hi, all.
>
> I have a sub that uses a set of URL-parsing regexes that almost works:
It took me a bit to see the mistake, but I see it now.
>
> if ($url =~ m{^(.*)\.([^\.]+\...\...)$}) {
> $domain = $2;
> $child = $1;
> }
> else {
> if ($url =~ /^[^\.]+?\.\w{2,4}$/) {
> $domain = $url; # "www.xx.yy" should have ended up here. . .
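Written as a base pattern, your URL looks like \w{3}\.\w{2}\.\w{2}: three labels, two dots.
Your test is ^[^\.]+?\.\w{2,4}$, which allows exactly one dot, so it cannot match.
If the URL had looked like \w{3}\.\w{2} (one dot), it would have matched.
Dropping the lazy ? alone does not help, because [^\.]+ can never cross the first dot.
A looser test such as ^.+\.\w{2,4}$ would match, but it would also match the URLs
that are meant to fall through to the branch below.
Decide exactly which format you want this branch to accept, then rework the pattern to fit.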
> }
> else {
> $url =~ m{^(.*)\.(.+\.\w{2,4}).*$}; # . . . but it ended up here
> $domain = $2;
> $child = $1;
> }
> }
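A quick way to confirm where a URL lands is to try the three patterns in the same order as your sub and report the first one that matches. This is only a sketch: the patterns are copied from your post, but the helper name which_branch and the labels are mine, not from your code.

```perl
use strict;
use warnings;

# The three patterns from the posted sub, in the order they are tried.
my @patterns = (
    [ first  => qr{^(.*)\.([^\.]+\...\...)$} ],
    [ second => qr{^[^\.]+?\.\w{2,4}$} ],
    [ third  => qr{^(.*)\.(.+\.\w{2,4}).*$} ],
);

# which_branch is a hypothetical helper: it returns the label of the
# first pattern that matches, or 'none' if nothing matches.
sub which_branch {
    my ($url) = @_;
    for my $p (@patterns) {
        my ($label, $re) = @$p;
        return $label if $url =~ $re;
    }
    return 'none';
}

for my $url ('www.defgh.xx.yy', 'www.xx.yy', 'xx.yy') {
    printf "%-16s -> %s\n", $url, which_branch($url);
}
```

With these inputs, "www.defgh.xx.yy" is caught by the first pattern, "xx.yy" by the second, and "www.xx.yy" falls through to the third, which is the behavior you described.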
>
> It catches almost all the URL formats it needs to, like "www.defgh.xx.yy",
> but it misses one possible format, "www.xx.yy". For this URL the sub that
> uses the regex returns "xx.yy" as the domain and "www" as the child, which
> means that there's still something not quite right with the regex in the
> second if statement. The sub should've returned "www.xx.yy" as the domain,
> with no child. See the comments in the code sample for where that URL
> landed, vs. where it should've landed.
>
> I've ordered "Mastering Regular Expressions" but it hasn't arrived yet, so
> any help would be appreciated.
>
> Thanks,
>
> Deane
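One possible rework, as a sketch only. It assumes that a three-label name whose last two labels are exactly two characters each (like "www.xx.yy") should be kept whole as the domain; that rule is my guess at what you want, so adjust it if your data says otherwise. The sub name split_url is made up for the example, and the first and third patterns are unchanged from your post.

```perl
use strict;
use warnings;

# split_url is a hypothetical name; it returns ($domain, $child).
sub split_url {
    my ($url) = @_;
    my ($domain, $child) = ('', '');

    # Four or more labels ending in .xx.yy: split off the child.
    if ($url =~ m{^(.*)\.([^\.]+\...\...)$}) {
        ($child, $domain) = ($1, $2);
    }
    # Assumption: "name.tld" and "name.xx.yy" are both whole domains.
    elsif ($url =~ m{^[^.]+(?:\.\w{2,4}|\.\w{2}\.\w{2})$}) {
        $domain = $url;
    }
    # Everything else: last two labels are the domain, the rest is the child.
    elsif ($url =~ m{^(.*)\.(.+\.\w{2,4}).*$}) {
        ($child, $domain) = ($1, $2);
    }
    return ($domain, $child);
}

for my $url ('www.defgh.xx.yy', 'www.xx.yy', 'foo.example.com') {
    my ($domain, $child) = split_url($url);
    print "$url: domain=$domain child=$child\n";
}
```

The only change is the second test, which now accepts either one dot followed by a 2-4 character label, or two dots followed by two 2-character labels. Note that this also treats something like "abc.co.uk" as a whole domain, which may or may not be what you want.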
_______________________________________________
ActivePerl mailing list
ActivePerl@xxxxxxxxxxxxxxxxxxxxxxxx
To unsubscribe: http://listserv.ActiveState.com/mailman/mysubs