2002-09-20 Kevin Buettner <kevinb@redhat.com>
From Eli Zaretskii <eliz@is.elta.co.il>:
* gdb.texinfo (Character Sets): Use @smallexample instead of @example. Use GNU/Linux instead of Linux.

2002-09-20 Jim Blandy <jimb@redhat.com>

* gdb.texinfo: Add character set documentation.
parent 608707ac84
commit a0eb71c570
@@ -1,3 +1,13 @@
|
||||
2002-09-20 Kevin Buettner <kevinb@redhat.com>
|
||||
|
||||
From Eli Zaretskii <eliz@is.elta.co.il>:
|
||||
* gdb.texinfo (Character Sets): Use @smallexample instead of
|
||||
@example. Use GNU/Linux instead of Linux.
|
||||
|
||||
2002-09-20 Jim Blandy <jimb@redhat.com>
|
||||
|
||||
* gdb.texinfo: Add character set documentation.
|
||||
|
||||
2002-09-19 Andrew Cagney <ac131313@redhat.com>
|
||||
|
||||
* gdb.texinfo (Packets): Revise `z' and `Z' packet documentation.
|
||||
|
@@ -4493,6 +4493,8 @@ Table}.
|
||||
* Vector Unit:: Vector Unit
|
||||
* Memory Region Attributes:: Memory region attributes
|
||||
* Dump/Restore Files:: Copy between memory and a file
|
||||
* Character Sets:: Debugging programs that use a different
|
||||
character set than GDB does
|
||||
@end menu
|
||||
|
||||
@node Expressions
|
||||
@@ -5879,6 +5881,254 @@ the @var{bias} argument is applied.
|
||||
|
||||
@end table
|
||||
|
||||
@node Character Sets
|
||||
@section Character Sets
|
||||
@cindex character sets
|
||||
@cindex charset
|
||||
@cindex translating between character sets
|
||||
@cindex host character set
|
||||
@cindex target character set
|
||||
|
||||
If the program you are debugging uses a different character set to
|
||||
represent characters and strings than the one @value{GDBN} uses itself,
|
||||
@value{GDBN} can automatically translate between the character sets for
|
||||
you. The character set @value{GDBN} uses we call the @dfn{host
|
||||
character set}; the one the inferior program uses we call the
|
||||
@dfn{target character set}.
|
||||
|
||||
For example, if you are running @value{GDBN} on a @sc{gnu}/Linux system, which
|
||||
uses the ISO Latin 1 character set, but you are using @value{GDBN}'s
|
||||
remote protocol (@pxref{Remote,Remote Debugging}) to debug a program
|
||||
running on an IBM mainframe, which uses the @sc{ebcdic} character set,
|
||||
then the host character set is Latin 1, and the target character set is
|
||||
@sc{ebcdic}. If you give @value{GDBN} the command @code{set
|
||||
target-charset ebcdic-us}, then @value{GDBN} translates between
|
||||
@sc{ebcdic} and Latin 1 as you print character or string values, or use
|
||||
character and string literals in expressions.
|
||||
|
||||
@value{GDBN} has no way to automatically recognize which character set
|
||||
the inferior program uses; you must tell it, using the @code{set
|
||||
target-charset} command, described below.
|
||||
|
||||
Here are the commands for controlling @value{GDBN}'s character set
|
||||
support:
|
||||
|
||||
@table @code
|
||||
@item set target-charset @var{charset}
|
||||
@kindex set target-charset
|
||||
Set the current target character set to @var{charset}. We list the
|
||||
character set names @value{GDBN} recognizes below, but if you invoke the
|
||||
@code{set target-charset} command with no argument, @value{GDBN} lists
|
||||
the character sets it supports.
|
||||
@end table
|
||||
|
||||
@table @code
|
||||
@item set host-charset @var{charset}
|
||||
@kindex set host-charset
|
||||
Set the current host character set to @var{charset}.
|
||||
|
||||
By default, @value{GDBN} uses a host character set appropriate to the
|
||||
system it is running on; you can override that default using the
|
||||
@code{set host-charset} command.
|
||||
|
||||
@value{GDBN} can only use certain character sets as its host character
|
||||
set. We list the character set names @value{GDBN} recognizes below, and
|
||||
indicate which can be host character sets, but if you invoke the
|
||||
@code{set host-charset} command with no argument, @value{GDBN} lists the
|
||||
character sets it supports, placing an asterisk (@samp{*}) after those
|
||||
it can use as a host character set.
|
||||
|
||||
@item set charset @var{charset}
|
||||
@kindex set charset
|
||||
Set the current host and target character sets to @var{charset}. If you
|
||||
invoke the @code{set charset} command with no argument, it lists the
|
||||
character sets it supports. @value{GDBN} can only use certain character
|
||||
sets as its host character set; it marks those in the list with an
|
||||
asterisk (@samp{*}).
|
||||
|
||||
@item show charset
|
||||
@itemx show host-charset
|
||||
@itemx show target-charset
|
||||
@kindex show charset
|
||||
@kindex show host-charset
|
||||
@kindex show target-charset
|
||||
Show the current host and target charsets. The @code{show host-charset}
|
||||
and @code{show target-charset} commands are synonyms for @code{show
|
||||
charset}.
|
||||
|
||||
@end table
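
The rules these commands follow can be sketched in C. This is only an illustration of the behavior described above, not @value{GDBN}'s implementation: the names @code{charset_state}, @code{set_target_charset}, @code{set_host_charset} and @code{set_charset} are hypothetical, the table holds the four character sets named in this section, and the sketch assumes lower-case names are matched exactly and that @code{set charset} is rejected for a character set that cannot serve as a host character set.

```c
#include <stddef.h>
#include <string.h>

/* One entry per character set named in this section.  */
struct charset_info
{
  const char *name;
  int valid_as_host;   /* Nonzero if usable as a host character set.  */
};

static const struct charset_info known_charsets[] = {
  {"ascii", 1},
  {"iso-8859-1", 1},
  {"ebcdic-us", 0},
  {"ibm1047", 0},
};

/* Hypothetical model of the debugger's current settings.  */
struct charset_state
{
  const char *host;
  const char *target;
};

/* Return the table entry for NAME, or NULL if it is not known.  */
static const struct charset_info *
lookup_charset (const char *name)
{
  size_t i;

  for (i = 0; i < sizeof known_charsets / sizeof known_charsets[0]; i++)
    if (strcmp (known_charsets[i].name, name) == 0)
      return &known_charsets[i];
  return NULL;
}

/* Any known character set may be the target.  Returns 0 on success.  */
int
set_target_charset (struct charset_state *s, const char *name)
{
  const struct charset_info *cs = lookup_charset (name);

  if (cs == NULL)
    return -1;
  s->target = cs->name;
  return 0;
}

/* Only some character sets may be the host.  Returns 0 on success.  */
int
set_host_charset (struct charset_state *s, const char *name)
{
  const struct charset_info *cs = lookup_charset (name);

  if (cs == NULL || !cs->valid_as_host)
    return -1;
  s->host = cs->name;
  return 0;
}

/* Set host and target together (assumed to fail if NAME is not a
   valid host character set).  Returns 0 on success.  */
int
set_charset (struct charset_state *s, const char *name)
{
  const struct charset_info *cs = lookup_charset (name);

  if (cs == NULL || !cs->valid_as_host)
    return -1;
  s->host = cs->name;
  s->target = cs->name;
  return 0;
}
```

In this model, calling @code{set_charset} with @code{"ascii"} and then @code{set_target_charset} with @code{"ibm1047"} leaves the host as @code{ascii} and the target as @code{ibm1047}.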
|
||||
|
||||
@value{GDBN} currently includes support for the following character
|
||||
sets:
|
||||
|
||||
@table @code
|
||||
|
||||
@item ASCII
|
||||
@cindex ASCII character set
|
||||
Seven-bit U.S. @sc{ascii}. @value{GDBN} can use this as its host
|
||||
character set.
|
||||
|
||||
@item ISO-8859-1
|
||||
@cindex ISO 8859-1 character set
|
||||
@cindex ISO Latin 1 character set
|
||||
The ISO Latin 1 character set. This extends @sc{ascii} with accented
|
||||
characters needed for French, German, and Spanish. @value{GDBN} can use
|
||||
this as its host character set.
|
||||
|
||||
@item EBCDIC-US
|
||||
@itemx IBM1047
|
||||
@cindex EBCDIC character set
|
||||
@cindex IBM1047 character set
|
||||
Variants of the @sc{ebcdic} character set, used on some of IBM's
|
||||
mainframe operating systems. (@sc{gnu}/Linux on the S/390 uses U.S. @sc{ascii}.)
|
||||
@value{GDBN} cannot use these as its host character set.
|
||||
|
||||
@end table
|
||||
|
||||
Note that these are all single-byte character sets. More work inside
|
||||
@value{GDBN} is needed to support multi-byte or variable-width character
|
||||
encodings, like the UTF-8 and UCS-2 encodings of Unicode.
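
To see why variable-width encodings need extra work, consider counting the characters in a UTF-8 string. In a single-byte character set the number of characters equals the number of bytes, so translation can proceed one byte at a time. In UTF-8 that no longer holds. The following sketch uses a hypothetical helper, @code{utf8_count_chars}, and assumes its input is well-formed UTF-8; it illustrates the problem and is not code from @value{GDBN}.

```c
#include <stddef.h>

/* Count the characters (code points) in the NUL-terminated UTF-8
   string S.  Continuation bytes have the bit pattern 10xxxxxx, so
   every byte that does not match that pattern starts a new
   character.  Assumes S is well-formed UTF-8.  */
size_t
utf8_count_chars (const char *s)
{
  size_t n = 0;

  for (; *s != '\0'; s++)
    if (((unsigned char) *s & 0xC0) != 0x80)
      n++;
  return n;
}
```

For the six-byte string whose second character is U+00E9 encoded as the bytes 0xC3 0xA9, this function reports five characters, while @code{strlen} reports six bytes.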
|
||||
|
||||
Here is an example of @value{GDBN}'s character set support in action.
|
||||
Assume that the following source code has been placed in the file
|
||||
@file{charset-test.c}:
|
||||
|
||||
@smallexample
|
||||
#include <stdio.h>
|
||||
|
||||
char ascii_hello[]
|
||||
= @{72, 101, 108, 108, 111, 44, 32, 119,
|
||||
111, 114, 108, 100, 33, 10, 0@};
|
||||
char ibm1047_hello[]
|
||||
= @{200, 133, 147, 147, 150, 107, 64, 166,
|
||||
150, 153, 147, 132, 90, 37, 0@};
|
||||
|
||||
int main (void)
|
||||
@{
|
||||
printf ("Hello, world!\n");
|
||||
@}
|
||||
@end smallexample
|
||||
|
||||
In this program, @code{ascii_hello} and @code{ibm1047_hello} are arrays
|
||||
containing the string @samp{Hello, world!} followed by a newline,
|
||||
encoded in the @sc{ascii} and @sc{ibm1047} character sets.
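
The correspondence between the two arrays can be checked with a small translation routine. The sketch below is not how @value{GDBN} translates: the pairing table contains only the byte values that appear in the two arrays above, not a full @sc{ibm1047} table, and the helper names @code{ibm1047_to_ascii} and @code{translate_hello} are hypothetical.

```c
#include <stddef.h>

/* Pairs of (IBM1047 code, ASCII code), read off position by position
   from ibm1047_hello and ascii_hello in charset-test.c.  */
static const unsigned char hello_pairs[][2] = {
  {200, 72},   /* H */
  {133, 101},  /* e */
  {147, 108},  /* l */
  {150, 111},  /* o */
  {107, 44},   /* , */
  {64, 32},    /* space */
  {166, 119},  /* w */
  {153, 114},  /* r */
  {132, 100},  /* d */
  {90, 33},    /* ! */
  {37, 10},    /* newline */
};

/* Translate one IBM1047 byte to ASCII.  Bytes missing from the small
   table above come back as '?'.  */
unsigned char
ibm1047_to_ascii (unsigned char c)
{
  size_t i;

  for (i = 0; i < sizeof hello_pairs / sizeof hello_pairs[0]; i++)
    if (hello_pairs[i][0] == c)
      return hello_pairs[i][1];
  return '?';
}

/* Translate the NUL-terminated IBM1047 string IN into OUT.  OUT must
   have room for as many bytes as IN, including the terminator.  */
void
translate_hello (const unsigned char *in, char *out)
{
  while (*in != 0)
    *out++ = (char) ibm1047_to_ascii (*in++);
  *out = '\0';
}
```

Applied to the bytes of @code{ibm1047_hello}, @code{translate_hello} produces the same bytes as @code{ascii_hello}.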
|
||||
|
||||
We compile the program, and invoke the debugger on it:
|
||||
|
||||
@smallexample
|
||||
$ gcc -g charset-test.c -o charset-test
|
||||
$ gdb -nw charset-test
|
||||
GNU gdb 2001-12-19-cvs
|
||||
Copyright 2001 Free Software Foundation, Inc.
|
||||
@dots{}
|
||||
(gdb)
|
||||
@end smallexample
|
||||
|
||||
We can use the @code{show charset} command to see what character sets
|
||||
@value{GDBN} is currently using to interpret and display characters and
|
||||
strings:
|
||||
|
||||
@smallexample
|
||||
(gdb) show charset
|
||||
The current host and target character set is `iso-8859-1'.
|
||||
(gdb)
|
||||
@end smallexample
|
||||
|
||||
For the sake of printing this manual, let's use @sc{ascii} as our
|
||||
initial character set:
|
||||
@smallexample
|
||||
(gdb) set charset ascii
|
||||
(gdb) show charset
|
||||
The current host and target character set is `ascii'.
|
||||
(gdb)
|
||||
@end smallexample
|
||||
|
||||
Let's assume that @sc{ascii} is indeed the correct character set for our
|
||||
host system --- in other words, let's assume that if @value{GDBN} prints
|
||||
characters using the @sc{ascii} character set, our terminal will display
|
||||
them properly. Since our current target character set is also
|
||||
@sc{ascii}, the contents of @code{ascii_hello} print legibly:
|
||||
|
||||
@smallexample
|
||||
(gdb) print ascii_hello
|
||||
$1 = 0x401698 "Hello, world!\n"
|
||||
(gdb) print ascii_hello[0]
|
||||
$2 = 72 'H'
|
||||
(gdb)
|
||||
@end smallexample
|
||||
|
||||
@value{GDBN} uses the target character set for character and string
|
||||
literals you use in expressions:
|
||||
|
||||
@smallexample
|
||||
(gdb) print '+'
|
||||
$3 = 43 '+'
|
||||
(gdb)
|
||||
@end smallexample
|
||||
|
||||
The @sc{ascii} character set uses the number 43 to encode the @samp{+}
|
||||
character.
|
||||
|
||||
@value{GDBN} relies on the user to tell it which character set the
|
||||
target program uses. If we print @code{ibm1047_hello} while our target
|
||||
character set is still @sc{ascii}, we get gibberish:
|
||||
|
||||
@smallexample
|
||||
(gdb) print ibm1047_hello
|
||||
$4 = 0x4016a8 "\310\205\223\223\226k@@\246\226\231\223\204Z%"
|
||||
(gdb) print ibm1047_hello[0]
|
||||
$5 = 200 '\310'
|
||||
(gdb)
|
||||
@end smallexample
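
The escaped string in that output follows directly from the byte values. The sketch below assumes a simple rule: bytes in the printable @sc{ascii} range (32 through 126) are shown as themselves and every other byte as a backslash followed by three octal digits. @value{GDBN}'s real printing code handles more cases (quotes, backslashes, repeats); @code{escape_bytes} is a hypothetical helper written only to reproduce this one example.

```c
#include <stddef.h>
#include <stdio.h>

/* Write an escaped rendering of the NUL-terminated byte string IN
   into OUT, which holds OUT_SIZE bytes.  Printable ASCII bytes are
   copied; all others become \ooo octal escapes.  Output is truncated
   if it does not fit.  Returns the number of characters written,
   not counting the terminating NUL.  */
size_t
escape_bytes (const unsigned char *in, char *out, size_t out_size)
{
  size_t used = 0;

  if (out_size == 0)
    return 0;

  for (; *in != 0; in++)
    {
      int printable = (*in >= 0x20 && *in <= 0x7e);
      size_t need = printable ? 1 : 4;

      /* Leave room for the terminating NUL.  */
      if (used + need >= out_size)
        break;

      if (printable)
        out[used++] = (char) *in;
      else
        {
          sprintf (out + used, "\\%03o", (unsigned int) *in);
          used += 4;
        }
    }
  out[used] = '\0';
  return used;
}
```

For the bytes of @code{ibm1047_hello}, the first byte, 200, is 310 in octal and is not printable, so it appears as @samp{\310}; byte 107 is the @sc{ascii} letter @samp{k} and is shown unchanged.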
|
||||
|
||||
If we invoke the @code{set target-charset} command without an argument,
|
||||
@value{GDBN} tells us the character sets it supports:
|
||||
|
||||
@smallexample
|
||||
(gdb) set target-charset
|
||||
Valid character sets are:
|
||||
ascii *
|
||||
iso-8859-1 *
|
||||
ebcdic-us
|
||||
ibm1047
|
||||
* - can be used as a host character set
|
||||
@end smallexample
|
||||
|
||||
We can select @sc{ibm1047} as our target character set, and examine the
|
||||
program's strings again. Now the @sc{ascii} string is wrong, but
|
||||
@value{GDBN} translates the contents of @code{ibm1047_hello} from the
|
||||
target character set, @sc{ibm1047}, to the host character set,
|
||||
@sc{ascii}, and they display correctly:
|
||||
|
||||
@smallexample
|
||||
(gdb) set target-charset ibm1047
|
||||
(gdb) show charset
|
||||
The current host character set is `ascii'.
|
||||
The current target character set is `ibm1047'.
|
||||
(gdb) print ascii_hello
|
||||
$6 = 0x401698 "\110\145%%?\054\040\167?\162%\144\041\012"
|
||||
(gdb) print ascii_hello[0]
|
||||
$7 = 72 '\110'
|
||||
(gdb) print ibm1047_hello
|
||||
$8 = 0x4016a8 "Hello, world!\n"
|
||||
(gdb) print ibm1047_hello[0]
|
||||
$9 = 200 'H'
|
||||
(gdb)
|
||||
@end smallexample
|
||||
|
||||
As above, @value{GDBN} uses the target character set for character and
|
||||
string literals you use in expressions:
|
||||
|
||||
@smallexample
|
||||
(gdb) print '+'
|
||||
$10 = 78 '+'
|
||||
(gdb)
|
||||
@end smallexample
|
||||
|
||||
The @sc{ibm1047} character set uses the number 78 to encode the @samp{+}
|
||||
character.
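
The two @code{print '+'} results can be summarized as a lookup keyed on the target character set. This is a sketch under stated assumptions: the literal is assumed to arrive in @sc{ascii} (the host character set in this session), the table covers only the characters whose @sc{ibm1047} codes appear in this section, and @code{encode_literal} and the @code{target_charset} enumeration are hypothetical names, not part of @value{GDBN}.

```c
#include <stddef.h>

enum target_charset { TARGET_ASCII, TARGET_IBM1047 };

/* Pairs of (ASCII code, IBM1047 code) that appear in this section.  */
static const unsigned char literal_pairs[][2] = {
  {43, 78},    /* + */
  {72, 200},   /* H */
  {101, 133},  /* e */
  {108, 147},  /* l */
  {111, 150},  /* o */
};

/* Return the numeric value a character literal C (given in ASCII)
   would have in the target character set TARGET, or -1 if C is not
   in the small table above.  */
int
encode_literal (char c, enum target_charset target)
{
  size_t i;
  unsigned char uc = (unsigned char) c;

  if (target == TARGET_ASCII)
    return uc;

  for (i = 0; i < sizeof literal_pairs / sizeof literal_pairs[0]; i++)
    if (literal_pairs[i][0] == uc)
      return literal_pairs[i][1];
  return -1;
}
```

With an @sc{ascii} target the plus sign keeps its value 43; with an @sc{ibm1047} target the same literal yields 78, matching the two sessions above.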
|
||||
|
||||
|
||||
@node Macros
|
||||
@chapter C Preprocessor Macros
|
||||
|
||||
|