For me, a valid domain is something I'm able to register, or at least something that looks like I could register it. This is why I like to keep it separate from "localhost"-style names.
Finally, I was interested in the main question, whether avoiding regex would be faster, and this is my result:
<?php
function filter_hostname($name, $domain_only = false) {
    // entire hostname has a maximum of 253 ASCII characters
    if (!($len = strlen($name)) || $len > 253
        // .example.org and localhost- are not allowed
        || $name[0] == '.' || $name[0] == '-' || $name[$len - 1] == '.' || $name[$len - 1] == '-'
        // a.de is the shortest possible domain name and needs one dot
        || ($domain_only && ($len < 4 || strpos($name, '.') === false))
        // several combinations are not allowed
        || strpos($name, '..') !== false
        || strpos($name, '.-') !== false
        || strpos($name, '-.') !== false
        // only letters, numbers, dot and hyphen are allowed
        /*
        // a little bit slower
        || !ctype_alnum(str_replace(array('-', '.'), '', $name))
        */
        || preg_match('/[^a-z\d.-]/i', $name)
    ) {
        return false;
    }
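    // each label may contain up to 63 characters
    $offset = 0;
    while (($pos = strpos($name, '.', $offset)) !== false) {
        if ($pos - $offset > 63) {
            return false;
        }
        $offset = $pos + 1;
    }
    // the loop stops at the last dot, so check the final label as well
    if (strlen($name) - $offset > 63) {
        return false;
    }
    return $name;
}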
?>
Benchmark results compared with velcrow's function over 10000 iterations (the complete results contain many code variants; it was interesting to find the fastest):
filter_hostname($domain);// $domains: 0.43556308746338 $real_world: 0.33749794960022
is_valid_domain_name($domain);// $domains: 0.81832790374756 $real_world: 0.32248711585999
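Timings like these can be collected with a small loop around `microtime(true)`. The following is only a sketch of such a harness, not the script that produced the numbers above; the helper name `benchmark`, the sample hostnames and the stand-in check are assumptions made for illustration.

```php
<?php
// Hypothetical helper: calls $fn on every input, $iterations times,
// and returns the elapsed wall-clock time in seconds.
function benchmark(callable $fn, array $inputs, $iterations = 10000) {
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        foreach ($inputs as $input) {
            $fn($input);
        }
    }
    return microtime(true) - $start;
}

// Sample inputs, made up for illustration
$domains = array('example.org', 'a.de', 'localhost', '-bad-.example', str_repeat('a', 64) . '.org');

// Stand-in check so the sketch runs on its own; swap in the function under test
$check = function ($name) {
    return !preg_match('/[^a-z\d.-]/i', $name);
};

printf("%s: %.5f\n", 'stand-in check', benchmark($check, $domains, 1000));
```

To compare two validators, pass each one as the callable with the same input list and iteration count.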
The $real_world set did not contain extremely long domain names, to produce more realistic results. And now I can answer your question: with ctype_alnum() it would be possible to do this without regex, but as preg_match() was faster, I would prefer that.
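For completeness, a regex-free variant built on `ctype_alnum()` could look roughly like this. It is a minimal sketch and not the benchmarked code: the name `filter_hostname_noregex` is made up for illustration, it assumes the ctype extension is available (it usually is by default), and it assumes the default "C" locale so that `ctype_alnum()` only accepts ASCII letters and digits.

```php
<?php
// Hypothetical name; a regex-free sketch of the same checks.
function filter_hostname_noregex($name, $domain_only = false) {
    $len = strlen($name);
    // empty names and names longer than 253 characters are rejected
    if ($len === 0 || $len > 253) {
        return false;
    }
    // a leading or trailing dot or hyphen is not allowed
    if ($name[0] == '.' || $name[0] == '-' || $name[$len - 1] == '.' || $name[$len - 1] == '-') {
        return false;
    }
    // a domain needs at least one dot and four characters (a.de)
    if ($domain_only && ($len < 4 || strpos($name, '.') === false)) {
        return false;
    }
    // "..", ".-" and "-." are not allowed
    if (strpos($name, '..') !== false || strpos($name, '.-') !== false || strpos($name, '-.') !== false) {
        return false;
    }
    // only letters, digits, dot and hyphen: strip dots and hyphens, then test the rest
    if (!ctype_alnum(str_replace(array('-', '.'), '', $name))) {
        return false;
    }
    // each label may contain up to 63 characters
    foreach (explode('.', $name) as $label) {
        if (strlen($label) > 63) {
            return false;
        }
    }
    return $name;
}
```

Like the original, it returns the name unchanged on success and `false` otherwise, so the result should be compared with `!== false`.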
If you don't like the fact that "local.host" is a valid domain name, use this function instead, which validates against a public TLD list. Maybe someone will find the time to combine both.