| CTYPE(3) | Library Functions Manual | CTYPE(3) |
ctype - character classification and mapping functions
Standard C Library (libc, -lc)
#include <ctype.h>

int isalpha(int c);
int isupper(int c);
int islower(int c);
int isdigit(int c);
int isxdigit(int c);
int isalnum(int c);
int isspace(int c);
int ispunct(int c);
int isprint(int c);
int isgraph(int c);
int iscntrl(int c);
int isblank(int c);
int toupper(int c);
int tolower(int c);
The above functions perform character tests and conversions on the integer c.
See the specific manual pages for information about the test or conversion performed by each function.
In NetBSD 11, the
ctype functions will always crash with a signal on
certain invalid inputs as a diagnostic aid for applications; see
CAVEATS. Setting the environment variable
LIBC_ALLOWCTYPEABUSE before starting a program will
restore the old behavior of returning nonsense answers for these inputs, or
sometimes but not always crashing, depending on factors such as address
space layout randomization.
To print an upper-case version of a string to stdout, the following code can be used:
        const char *s = "xyz";
        while (*s != '\0') {
                putchar(toupper((unsigned char)*s));
                s++;
        }
isalnum(3), isalpha(3), isblank(3), iscntrl(3), isdigit(3), isgraph(3), islower(3), isprint(3), ispunct(3), isspace(3), isupper(3), isxdigit(3), tolower(3), toupper(3), ascii(7)
These functions, with the exception of
isblank(), conform to ANSI
X3.159-1989 (“ANSI C89”). All described
functions, including isblank(), also conform to
IEEE Std 1003.1-2001 (“POSIX.1”).
The argument of these functions is of type
int, but only a very restricted subset of values is
actually valid. The argument must either be the value of the macro
EOF (which has a negative value), or must be a
non-negative value within the range representable as
unsigned char. Passing invalid values leads to
undefined behavior.
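The rule above can be written down as a range check. This is only an illustrative sketch: ctype_arg_ok() is a hypothetical helper, not part of libc, and it is meant for assertions or debugging rather than as a substitute for passing correct values.

```c
#include <limits.h>
#include <stdio.h>

/*
 * Hypothetical helper: returns nonzero if c may be passed to a
 * ctype function, that is, if c is EOF or lies in the range
 * representable as unsigned char.
 */
static int
ctype_arg_ok(int c)
{
        return c == EOF || (c >= 0 && c <= UCHAR_MAX);
}
```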
Values of type int that were returned by
getc(3),
fgetc(3), and similar functions
or macros are already in the correct range, and may be safely passed to
these ctype functions without any casts.
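As a minimal sketch of this case, the following hypothetical function count_upper() passes the result of getc(3) straight to isupper(). The only requirement is that the result is kept in an int rather than a char, so that EOF stays distinct from every byte value.

```c
#include <ctype.h>
#include <stdio.h>

/*
 * Hypothetical helper: count the upper-case characters remaining
 * on fp.  No cast is needed because getc() returns either EOF or
 * a value in the range of unsigned char.
 */
static size_t
count_upper(FILE *fp)
{
        size_t n = 0;
        int c;          /* int, not char */

        while ((c = getc(fp)) != EOF) {
                if (isupper(c))
                        n++;
        }
        return n;
}
```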
Values of type char or
signed char must first be cast to
unsigned char, to ensure that the values are within
the correct range. Casting a negative-valued char or
signed char directly to int will
produce a negative-valued int, which will be outside
the range of allowed values (unless it happens to be equal to
EOF, but even that would not give the desired
result).
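For data held in char arrays, the cast belongs at each call site. The following sketch uses a hypothetical helper, skip_space(), to show the pattern with isspace(); the same cast applies to every function described here.

```c
#include <ctype.h>

/*
 * Hypothetical helper: return a pointer to the first character of
 * s that is not white space.  Each char is cast to unsigned char
 * before it is passed to isspace(), so bytes with the high bit set
 * are never passed as negative values.
 */
static const char *
skip_space(const char *s)
{
        while (*s != '\0' && isspace((unsigned char)*s))
                s++;
        return s;
}
```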
Because the bugs may manifest as silent misbehavior or as crashes
only when fed input outside the US-ASCII range, the
NetBSD implementation of the
ctype functions is designed to elicit a compiler
warning for code that passes inputs of type char in
order to flag code that may pass negative values at runtime that would lead
to undefined behavior:
#include <ctype.h>
#include <locale.h>
#include <stdio.h>

int
main(int argc, char **argv)
{
        if (argc < 2)
                return 1;

        setlocale(LC_ALL, "");

        printf(" char=%-4d isprint? %d\n",
            (int)*argv[1],
            isprint(*argv[1]) ? 1 : 0);
        printf("u_char=%-4d isprint? %d\n",
            (int)(unsigned char)*argv[1],
            isprint((unsigned char)*argv[1]) ? 1 : 0);
        return 0;
}
When compiling this program, GCC reports a warning for the line that passes char. At runtime, without the cast, you may get nonsense answers for some inputs, and that is the lucky case in which the program does not crash:
% gcc -Wall -o test test.c
In file included from /usr/include/ctype.h:100,
                 from test.c:1:
test.c: In function 'main':
test.c:15:21: warning: array subscript has type 'char' [-Wchar-subscripts]
   15 |             isprint(*argv[1]) ? 1 : 0);
      |                     ^
% LC_CTYPE=C ./test $(printf '\270')
 char=-72  isprint? 1
u_char=184  isprint? 0
% LC_CTYPE=C ./test $(printf '\377')
 char=-1   isprint? 0
u_char=255  isprint? 0
% LC_CTYPE=fr_FR.ISO8859-1 ./test $(printf '\377')
 char=-1   isprint? 0
u_char=255  isprint? 1
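The wording of the warning ("array subscript has type 'char'") suggests that the argument is used to index a table. The following is a minimal sketch of that general technique, not a copy of NetBSD's header: my_tab, MY_D, and MY_ISDIGIT are hypothetical names, and the layout assumes that EOF is -1.

```c
#include <limits.h>

#define MY_D    0x01    /* hypothetical "is a digit" flag */

/*
 * Slot 0 answers for -1 (assumed here to be EOF); slots 1 and up
 * answer for the values 0 through UCHAR_MAX.
 */
static const unsigned char my_tab[1 + UCHAR_MAX + 1] = {
        [1 + '0'] = MY_D, [1 + '1'] = MY_D, [1 + '2'] = MY_D,
        [1 + '3'] = MY_D, [1 + '4'] = MY_D, [1 + '5'] = MY_D,
        [1 + '6'] = MY_D, [1 + '7'] = MY_D, [1 + '8'] = MY_D,
        [1 + '9'] = MY_D,
};

/*
 * The argument is used directly as an array subscript, so a char
 * argument triggers -Wchar-subscripts, and a value such as -72
 * would read memory before the start of the table.
 */
#define MY_ISDIGIT(c)   ((int)((my_tab + 1)[(c)] & MY_D))
```

With a layout like this, what an out-of-range argument returns depends on whatever happens to be stored next to the table, which is consistent with the "nonsense answers" in the output above.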
Some implementations of libc, such as glibc as of 2018, hide the
undefined behavior by defining the functions to work for all integer inputs
representable by either unsigned char or
char, and suppress the warning. However, this is not
an excuse for avoiding conversion to unsigned char: if
EOF coincides with any such value, as it does when
it is -1 on platforms with signed char, programs that
pass char will still necessarily confuse the
classification and mapping of EOF with the
classification and mapping of some non-EOF inputs.
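The collision can be made concrete with two hypothetical helpers that return the value a ctype function would receive with and without the cast. On a platform where char is signed and 8 bits wide and EOF is -1, uncast_arg('\377') equals EOF, while cast_arg('\377') is 255 and remains distinguishable from it.

```c
#include <limits.h>
#include <stdio.h>

/* Hypothetical helper: the argument value when a char is passed as is. */
static int
uncast_arg(char ch)
{
        return ch;
}

/* Hypothetical helper: the argument value after the recommended cast. */
static int
cast_arg(char ch)
{
        return (unsigned char)ch;
}
```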
| September 14, 2025 | NetBSD 11.0 |