Identifying subdomains and top-level domains in a URI

Sat Jan 29 21:57:23 UTC 2005

Hi all,

I've got a Firefox extension in development that needs to react to the
site the person is viewing.  My challenge is to determine the base portion
of the URI--stripped of subdomains but including top-level domains.
E.g., for "http://www.google.com" I need to get "google.com", and for
"subdomain.domain.com.au", I need to get "domain.com.au".  My current
naive system just takes the last two chunks, which means it thinks all
web pages from Australia are the same site.  (They're all from
"com.au"!)

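For reference, the naive version amounts to roughly the following.  This
is a minimal sketch rather than my actual extension code: the function
name naiveBaseDomain is made up for illustration, and it assumes the
input is already a bare hostname, not a full URI.

```javascript
// Hypothetical helper: keep only the last two dot-separated chunks of a hostname.
function naiveBaseDomain(hostname) {
  const labels = hostname.toLowerCase().split(".");
  return labels.slice(-2).join(".");
}

console.log(naiveBaseDomain("www.google.com"));          // google.com
console.log(naiveBaseDomain("subdomain.domain.com.au")); // com.au (wrong; should be domain.com.au)
```
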
What's the intelligent way to do this?  My only thought is to work from
the right side of the URI, and keep grabbing chunks until I find one
that is NOT a top-level domain.  But this seems like a kludge.
Any ideas?  Thanks for any insight!
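
To make the right-to-left idea concrete, here is a rough sketch of what I
have in mind.  The names baseDomain and TLD_LABELS are hypothetical, the
label set is a tiny hand-picked sample rather than a real list, and the
function assumes it is handed a bare hostname, not a full URI.

```javascript
// Illustrative, incomplete set of chunks treated as "top-level".
// These entries are assumptions for the example; a real list would be far longer.
const TLD_LABELS = new Set(["com", "net", "org", "co", "au", "uk"]);

// Hypothetical helper: walk the chunks from right to left, skipping every
// chunk found in TLD_LABELS, then keep the first non-matching chunk plus
// everything to its right.
function baseDomain(hostname) {
  const labels = hostname.toLowerCase().split(".");
  let i = labels.length - 1;
  // Stop at index 0 so a hostname made only of listed chunks is returned whole.
  while (i > 0 && TLD_LABELS.has(labels[i])) {
    i--;
  }
  return labels.slice(i).join(".");
}

console.log(baseDomain("www.google.com"));          // google.com
console.log(baseDomain("subdomain.domain.com.au")); // domain.com.au
```

One reason this feels fragile: it matches chunks one at a time, so a site
whose own name happens to be in the list gets misjudged.  With the sample
set above, "www.net.com" comes back as "www.net.com" instead of "net.com".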

-stan