vADC Forum

Occasional Contributor
Posts: 12
Registered: ‎12-13-2013
Accepted Solution

TrafficScript regex to match the root domain name.

I want to create a rule that gets the Host header and strips off all but the root domain (the last two levels). For example, I want to take www.example.com and grab example.com for my $domain variable. I created a regular expression in a proof-of-concept script:

$host = http.getHostHeader();

$domain = string.regexSub($host, '([^.]+.[^.]+)$', "$1");

log.info("Root domain: " . $domain);

With an example host of www.example.com, $domain comes back as www.example.com. I don't see how, but the first bit [^.]+. seems to match www.example, even though my understanding of regex is that [^.]+ should exclude any string of characters containing a dot. I've also tried "([^\.]+\.[^\.]$)", a more kosher version of the regex in which I escape the dot. In either case the result is the same. Isn't there a way to negate a character in TrafficScript's version of regex ([^<somechar>])? Or is there a way to toggle greedy matching?
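One possible explanation, sketched here with Python's `re` module as a stand-in for TrafficScript's engine (an assumption: that `string.regexSub` replaces only the matched substring and leaves the rest of the input in place, as regex substitution functions generally do). Under that assumption the pattern does match only `example.com`, but replacing the match with `$1` puts the same text back, so the unmatched `www.` prefix survives. Consuming the prefix with a leading `^.*?` is one way to drop it.

```python
import re

host = "www.example.com"
pattern = r"([^.]+\.[^.]+)$"

# The pattern itself matches only the last two labels.
matched = re.search(pattern, host).group(1)

# Substitution rewrites just the matched part; the unmatched "www." prefix stays.
unchanged = re.sub(pattern, r"\1", host)

# Consuming the prefix as part of the match lets the replacement drop it.
root = re.sub(r"^.*?([^.]+\.[^.]+)$", r"\1", host)

print(matched)    # example.com
print(unchanged)  # www.example.com
print(root)       # example.com
```

If TrafficScript behaves the same way, prefixing the original pattern with `^.*?` in the `string.regexSub` call should leave only the last two labels in $domain. Negated character classes and greedy matching would then not be the cause.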

I have developed a workaround that splits the domain on the dots, fetches the last two strings and re-concatenates them with a dot. Even though it works, I feel all yucky inside. Besides, I want to understand better how TrafficScript implements regex, and this exercise has so far taught me very little.
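For reference, the split-and-rejoin workaround described above might look like the following. It is written in Python purely as an illustration, and the function name `root_domain` is hypothetical, not part of the original script.

```python
def root_domain(host: str) -> str:
    """Return the last two dot-separated labels of host."""
    labels = host.split(".")
    # Hosts with fewer than two labels (e.g. "localhost") come back unchanged.
    return ".".join(labels[-2:])


print(root_domain("www.example.com"))  # example.com
print(root_domain("example.com"))      # example.com
```

This sketch does not strip a port from the Host header, and it treats two-part suffixes such as co.uk as the root domain.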

Any help is appreciated.

Brocadian
Posts: 232
Registered: ‎11-29-2012

Re: TrafficScript regex to match the root domain name.

A quick Google search found this Stack Exchange reference, which points to the following regex:

/([0-9a-z-]{2,}\.[0-9a-z-]{2,3}\.[0-9a-z-]{2,3}|[0-9a-z-]{2,}\.[0-9a-z-]{2,3})$/i

Using regexr, I tested it and it seems to work:

[Screenshot: regexr test of the regex above]
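The same check can be repeated outside a web tester. The sketch below runs the regex above through Python's `re` module, which is an assumption about engine compatibility and not a statement about TrafficScript. The helper name `extract` and the sample hosts are made up for illustration.

```python
import re

# The regex quoted above, with the /i flag expressed as re.IGNORECASE.
pattern = re.compile(
    r"([0-9a-z-]{2,}\.[0-9a-z-]{2,3}\.[0-9a-z-]{2,3}"
    r"|[0-9a-z-]{2,}\.[0-9a-z-]{2,3})$",
    re.IGNORECASE,
)


def extract(host):
    """Return the text the pattern matches at the end of host, or None."""
    m = pattern.search(host)
    return m.group(1) if m else None


print(extract("www.example.com"))    # example.com
print(extract("www.example.co.uk"))  # example.co.uk
print(extract("www.ibm.com"))        # www.ibm.com
```

The last case shows a limitation of the pattern: a second-level label of two or three characters is treated as part of the suffix, so the whole host name is returned.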

Occasional Contributor
Posts: 12
Registered: ‎12-13-2013

Re: TrafficScript regex to match the root domain name.

Those lengthy regex tests work -- and maybe even better than mine for the set of all possible domain names -- if you use the regex tester you link.  I was using a different but similar website to test my regex.  The real problem is that TrafficScript matches all of the hostname using the very same regex.  That's really the point of my post.  Is there a TrafficScript-specific regex tester like the general ones we all know and love?  I'm trying to find out how TrafficScript's implementation is different from everyone else's.  My working theory is that it matches greedily, or that it doesn't know what the negate character is.  Or something.

Brocadian
Posts: 232
Registered: ‎11-29-2012

Re: Re: TrafficScript regex to match the root domain name.

Shawn, I think the problem is with your regex group matching.

Try this - it uses regex capture groups to make the first element $1, and everything else $2:


$host = http.getHostHeader();


$domain = string.regexSub($host, '([0-9a-z-]{2,})\.(.*)', "$2");


log.info("Request was for: " . $host . " Root domain is: " . $domain);


Log output looks like it works to me:

Virtual Server vs_web: Request was for: www.app1.subapp2.company.com Root domain is: app1.subapp2.company.com

Virtual Server vs_web: Request was for: www.app1.company.com Root domain is: app1.company.com

Virtual Server vs_web: Request was for: www.company.com Root domain is: company.com
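As the log lines show, this substitution removes the first label and keeps everything after it; it does not reduce the host to its last two levels. A rough check of the same pattern in Python's `re` module (a stand-in, assuming comparable matching behaviour; the function name `strip_first_label` is hypothetical) reproduces the logged results and suggests what would happen to a host with no leading label.

```python
import re


def strip_first_label(host):
    # Group 1 is the first label, group 2 is everything after the first dot.
    return re.sub(r"([0-9a-z-]{2,})\.(.*)", r"\2", host)


print(strip_first_label("www.app1.subapp2.company.com"))  # app1.subapp2.company.com
print(strip_first_label("www.company.com"))               # company.com
print(strip_first_label("company.com"))                   # com
```

If requests can arrive for the bare domain, or for hosts more than three levels deep, a pattern anchored to the end of the string may be a better fit.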

Brocadian
Posts: 232
Registered: ‎11-29-2012

Re: TrafficScript regex to match the root domain name.

Shawn Magill, have you had a chance to test the code I posted above? I am keen to see if it has resolved your issue.

A.

Occasional Contributor
Posts: 12
Registered: ‎12-13-2013

Re: TrafficScript regex to match the root domain name.

Yes, sorry for the late reply.  Looks like it works!

Brocadian
Posts: 232
Registered: ‎11-29-2012

Re: TrafficScript regex to match the root domain name.

Glad to hear!
