
Re: Bug#575209 closed by Holger Levsen <holger@layer-acht.org> (Re: Bug#575209: general: Error resolving hostname [resent])



reopen 575209
reassign 575209 eglibc
found 575209 2.10.2-6
found 575209 2.11-0exp6
severity 575209 important
retitle 575209 Please resolve domain names with hyphens as border chars
tags 575209 + patch
thanks

Hi Holger et al (please drop -devel out of the list of CCs if you feel this is getting off-topic),

Sorry, but I find it unacceptable to close this bug by referring to a single paragraph in a (random) RFC [0]. There are several reasons why I think this bug *is* an issue:

- Sites with domain names like <ker-.deviantart.com> already exist! Do you think they should be accessible from any proprietary operating system, but not from Debian? Surely not!

- There is already an inconsistency among the different implementations in Debian (or Linux as a whole): ping and any other program using gethostbyname() fail to resolve such names, whereas nslookup and host succeed.

- The advice in the cited RFC is already ignored. Domain names that start with a digit, e.g. 12345.foo.bar, can be resolved, whereas the RFC tells us "They [labels] must start with a letter, end with a letter or digit [...]". So let's relax the rules in the RFC (they are only recommendations after all) a bit further and also allow hyphens as border characters in labels. It doesn't harm anyone; it just enables us to resolve a few more domain names that actually exist!

For further discussion, please see the bug reports opened against Ubuntu [1] and upstream [2]:
[1] <https://bugs.launchpad.net/glibc/+bug/144431>
[2] <http://sourceware.org/bugzilla/show_bug.cgi?id=4671>

Technically speaking, what IMHO needs to be done is to allow hyphenchar as a borderchar in resolv/res_comp.c in eglibc. Please find my patch below (and while we are at it, why not allow underscorechar as well?).


Cheers,
Fabian

----------8<----------

--- eglibc-2.10.2.orig/resolv/res_comp.c
+++ eglibc-2.10.2/resolv/res_comp.c
@@ -146,8 +146,8 @@
 		   || ((c) >= 0x61 && (c) <= 0x7a))
 #define digitchar(c) ((c) >= 0x30 && (c) <= 0x39)

-#define borderchar(c) (alphachar(c) || digitchar(c))
-#define middlechar(c) (borderchar(c) || hyphenchar(c) || underscorechar(c))
+#define borderchar(c) (alphachar(c) || digitchar(c) || hyphenchar(c))
+#define middlechar(c) (borderchar(c) || underscorechar(c))
 #define	domainchar(c) ((c) > 0x20 && (c) < 0x7f)

 int

---------->8----------
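
To make the effect of the patch easier to see, here is a minimal standalone sketch of the border/middle rule applied to a single label. It is not the eglibc code: label_ok() and its hyphen_border switch are hypothetical names made up for this illustration, and the ASCII values used for the hyphen (0x2d) and underscore (0x5f) are my assumption about what hyphenchar() and underscorechar() test for.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Character classes, named after the macros touched by the patch. */
static bool is_alpha(int c)
{
    return (c >= 0x41 && c <= 0x5a) || (c >= 0x61 && c <= 0x7a);
}
static bool is_digit(int c)      { return c >= 0x30 && c <= 0x39; }
static bool is_hyphen(int c)     { return c == 0x2d; } /* assumed value */
static bool is_underscore(int c) { return c == 0x5f; } /* assumed value */

/* hyphen_border == false: rule before the patch.
 * hyphen_border == true:  rule after the patch. */
static bool is_border(int c, bool hyphen_border)
{
    return is_alpha(c) || is_digit(c) || (hyphen_border && is_hyphen(c));
}

/* Middle characters are the same set before and after the patch. */
static bool is_middle(int c)
{
    return is_alpha(c) || is_digit(c) || is_hyphen(c) || is_underscore(c);
}

/* Hypothetical helper: check one label (no dots) of a host name.
 * The first and last characters must be border characters; all
 * characters in between must be middle characters. */
bool label_ok(const char *label, bool hyphen_border)
{
    size_t n = strlen(label);

    if (n == 0)
        return false;
    for (size_t i = 0; i < n; i++) {
        int c = (unsigned char)label[i];
        bool at_border = (i == 0 || i == n - 1);

        if (at_border ? !is_border(c, hyphen_border) : !is_middle(c))
            return false;
    }
    return true;
}
```

With this sketch, label_ok("ker-", false) rejects the label while label_ok("ker-", true) accepts it; labels such as "12345" or "foo-bar" pass under both rules. A leading underscore is still rejected in both cases, since the patch only moves the hyphen into the border set.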

[0] There are even other RFCs that either relax or contradict the advice of RFC 1035 (thanks to Christoph Loehr, who could even write a short essay about this):

RFC 1178:
"
      Don't use digits at the beginning of the name.

Many programs accept a numerical internet address as well as a
name. Unfortunately, some programs do not correctly
distinguish between the two and may be fooled, for example, by
a string beginning with a decimal digit.

Names consisting entirely of hexadecimal digits, such as
"beef", are also problematic, since they can be interpreted
entirely as hexadecimal numbers as well as alphabetic strings.

      Don't use non-alphanumeric characters in a name.
[...]
      Don't expect case to be preserved.
[...]
"

This is a relaxation of RFC 1035, as hyphen characters are not mentioned at all.

RFC 952:
"
No blank or space characters are permitted as part of a name. No distinction is made between upper and lower case. The first character must be an alpha character. The last character must not be a minus sign or period. A host which serves as a GATEWAY should have "-GATEWAY" or "-GW" as part of its name. Hosts which do not serve as Internet gateways should not use "-GATEWAY" and "-GW" as part of their names. A host which is a TAC should have "-TAC" as the last part of its host name, if it is a DoD host. Single character names or nicknames are not allowed.
"

This is contradicted in practice: c.psi.net contains a single-character name, yet it is resolved by gethostbyname().

RFC 1123:
"
The syntax of a legal Internet host name was specified in RFC-952 [DNS:4]. One aspect of host name syntax is hereby changed: the restriction on the first character is relaxed to allow either a letter or a digit. Host software MUST support this more liberal syntax.
"

This is clearly another relaxation of RFC 1035.

