Re: universal character names
- To: jason@xxxxxxxxxx
- Subject: Re: universal character names
- From: Martin von Loewis <loewis@xxxxxxxxxxxxxxxxxxxxxxx>
- Date: Tue, 11 Apr 2000 19:28:12 +0200
> UTF-8 is inappropriate for mangled names, as it uses values > 127 to
> encode non-ASCII characters.
Why is it not appropriate? AFAICT, the gABI places no restriction on byte
values above 127 in that respect. ch4.strtab.html says:
# String table sections hold null-terminated character sequences,
# commonly called strings.
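To make the string-table point concrete, here is a minimal sketch in C. It
is an illustration only: the function names utf8_encode_bmp and
utf8_is_strtab_safe are hypothetical and come from no ABI document or
existing implementation. It encodes a single code point from the UCS-2 range
as UTF-8 and checks that none of the resulting bytes is zero, which is all a
null-terminated string needs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: encode one code point from the UCS-2 range as UTF-8.
   Writes 1 to 3 bytes to out and returns the number of bytes written.
   Surrogate values are not treated specially in this sketch. */
static size_t utf8_encode_bmp(uint16_t cp, unsigned char *out)
{
    assert(out != NULL);
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    out[0] = (unsigned char)(0xE0 | (cp >> 12));
    out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
    out[2] = (unsigned char)(0x80 | (cp & 0x3F));
    return 3;
}

/* Hypothetical helper: returns 1 if the UTF-8 encoding of cp contains no
   zero byte, so it can sit inside a null-terminated string; 0 otherwise. */
static int utf8_is_strtab_safe(uint16_t cp)
{
    unsigned char buf[3];
    size_t n = utf8_encode_bmp(cp, buf);
    return memchr(buf, 0, n) == NULL;
}
```

Only U+0000 itself produces a zero byte. Every byte of a multi-byte sequence
has its high bit set, so such names contain bytes above 127 but never an
embedded terminator.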
I can see a number of alternatives. What matters is that we agree on the
rules, and that they interoperate with C99 implementations. Which rules we
pick is less important.
> GNU Java encodes names in UTF-8 internally. For the mangled name, if there
> are non-ASCII characters, it adds a 'U' to the beginning and encodes each
> such UCS-2 character as _%04x. See gcc/java/mangle.c.
In the C++ ABI, the natural adaptation of that approach would be to
mangle non-ASCII-containing identifiers as _U instead of _Z, right?
Unfortunately, that does not give a solution for C names. I believe
the GNU Java approach also cannot be extended to C99.
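For reference, the scheme as described in the quoted text could be sketched
roughly as follows. This is based only on that description, not on the
actual code in gcc/java/mangle.c; the function name mangle_ucs2 and its
signature are hypothetical. The sketch does not address how an ASCII name
that already contains '_' followed by hex digits is told apart from an
escape.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical sketch: if the name contains any non-ASCII UCS-2 character,
   prefix the output with 'U' and write each such character as _%04x.
   ASCII characters are copied unchanged.
   Returns the length of the result (excluding the terminator), or -1 if
   the output buffer of cap bytes is too small. */
static int mangle_ucs2(const uint16_t *name, size_t len, char *out, size_t cap)
{
    size_t pos = 0;
    size_t i;
    int has_non_ascii = 0;

    assert(name != NULL && out != NULL);

    for (i = 0; i < len; i++) {
        if (name[i] > 127)
            has_non_ascii = 1;
    }

    if (has_non_ascii) {
        if (pos + 1 >= cap)
            return -1;
        out[pos++] = 'U';
    }

    for (i = 0; i < len; i++) {
        if (name[i] <= 127) {
            /* Need room for this character plus the terminator. */
            if (pos + 1 >= cap)
                return -1;
            out[pos++] = (char)name[i];
        } else {
            /* Need room for '_', four hex digits, and the terminator. */
            if (pos + 5 >= cap)
                return -1;
            snprintf(out + pos, cap - pos, "_%04x", (unsigned)name[i]);
            pos += 5;
        }
    }

    if (pos >= cap)
        return -1;
    out[pos] = '\0';
    return (int)pos;
}
```

Under these assumptions, the three-character name 'f', U+00E9, 'o' becomes
"Uf_00e9o", which is pure ASCII, while a name with only ASCII characters
passes through unchanged.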
Regards,
Martin