char-general-category - Unicode character category

LIBRARY

(import (rnrs))                     ;R6RS
(import (rnrs unicode))             ;R6RS

SYNOPSIS

(char-general-category char)

DESCRIPTION

Returns a symbol representing the Unicode general category of char.

The category is one of the following symbols.

              +-------+----------------------------+
              |Symbol |        Description         |
              +-------+----------------------------+
              |  Lu   |     Letter, uppercase      |
              +-------+----------------------------+
              |  Ll   |     Letter, lowercase      |
              +-------+----------------------------+
              |  Lt   |     Letter, titlecase      |
              +-------+----------------------------+
              |  Lm   |      Letter, modifier      |
              +-------+----------------------------+
              |  Lo   |       Letter, other        |
              +-------+----------------------------+
              |  Mn   |      Mark, nonspacing      |
              +-------+----------------------------+
              |  Mc   |  Mark, spacing combining   |
              +-------+----------------------------+
              |  Me   |      Mark, enclosing       |
              +-------+----------------------------+
              |  Nd   |   Number, decimal digit    |
              +-------+----------------------------+
              |  Nl   |       Number, letter       |
              +-------+----------------------------+
              |  No   |       Number, other        |
              +-------+----------------------------+
              |  Pc   |   Punctuation, connector   |
              +-------+----------------------------+
              |  Pd   |     Punctuation, dash      |
              +-------+----------------------------+
              |  Ps   |     Punctuation, open      |
              +-------+----------------------------+
              |  Pe   |     Punctuation, close     |
              +-------+----------------------------+
              |  Pi   | Punctuation, initial quote |
              +-------+----------------------------+
              |  Pf   |  Punctuation, final quote  |
              +-------+----------------------------+
              |  Po   |     Punctuation, other     |
              +-------+----------------------------+
              |  Sm   |        Symbol, math        |
              +-------+----------------------------+
              |  Sc   |      Symbol, currency      |
              +-------+----------------------------+
              |  Sk   |      Symbol, modifier      |
              +-------+----------------------------+
              |  So   |       Symbol, other        |
              +-------+----------------------------+
              |  Zs   |      Separator, space      |
              +-------+----------------------------+
              |  Zl   |      Separator, line       |
              +-------+----------------------------+
              |  Zp   |    Separator, paragraph    |
              +-------+----------------------------+
              |  Cc   |       Other, control       |
              +-------+----------------------------+
              |  Cf   |       Other, format        |
              +-------+----------------------------+
              |  Cs   |      Other, surrogate      |
              +-------+----------------------------+
              |  Co   |     Other, private use     |
              +-------+----------------------------+
              |  Cn   |    Other, not assigned     |
              +-------+----------------------------+
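
The first letter of each symbol in the table names its major class (L letter, M mark, N number, P punctuation, S symbol, Z separator, C other). As a minimal sketch, the major class can be read off the symbol's name. The helper name char-major-category is hypothetical and is not part of R6RS.

```scheme
(import (rnrs))

;; Hypothetical helper, not defined by R6RS: returns the first
;; letter of the general category symbol as a character,
;; e.g. #\L for any letter and #\Z for any separator.
(define (char-major-category c)
  (string-ref (symbol->string (char-general-category c)) 0))

(char-major-category #\a)        ; Ll, so the major class is #\L
(char-major-category #\space)    ; Zs, so the major class is #\Z
```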

RETURN VALUES

Returns a single symbol, one of those listed under DESCRIPTION.

EXAMPLES

(char-general-category #\a)
   => Ll
(char-general-category #\space)
   => Zs
(char-general-category #\x10FFFF)
   => Cn

APPLICATION USAGE

This procedure commonly appears in Scheme readers, because both R6RS and R7RS specify which non-ASCII characters may appear in identifiers in terms of Unicode general categories. Another use is in text-rendering algorithms.
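
As an illustration of the reader use case, the following sketch tests whether a non-ASCII character could begin an identifier. The category list is taken from the R6RS lexical syntax for constituent characters as best understood here and should be checked against the report. The names constituent-categories and non-ascii-constituent? are hypothetical.

```scheme
(import (rnrs))

;; Assumed from the R6RS lexical syntax: categories permitted for
;; identifier constituents with a scalar value above 127.
(define constituent-categories
  '(Lu Ll Lt Lm Lo Mn Nl No Pd Pc Po Sc Sm Sk So Co))

;; Hypothetical helper: #t if c is a non-ASCII character whose
;; general category is in the list above, #f otherwise.
(define (non-ascii-constituent? c)
  (and (> (char->integer c) 127)
       (memq (char-general-category c) constituent-categories)
       #t))

(non-ascii-constituent? #\x3BB)   ; Greek small letter lambda, Ll
(non-ascii-constituent? #\x2028)  ; line separator, Zl
```

A real reader would also have to handle the ASCII cases and the extra categories allowed in non-initial positions, which this sketch leaves out.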

COMPATIBILITY

This procedure is present only in R6RS (although any implementation that supports Unicode necessarily has access to the underlying data). The exact values returned depend on the version of the Unicode Character Database that was used to generate the implementation's lookup tables.

ERRORS

This procedure can raise exceptions with the following condition types:
&assertion (R6RS)
The wrong number of arguments was passed or an argument was outside its domain.
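
A caller that cannot guarantee a character argument might trap the condition with guard, as in the sketch below. The wrapper name safe-category is hypothetical, and the sketch assumes the implementation raises &assertion for a non-character argument, as described above.

```scheme
(import (rnrs))

;; Hypothetical wrapper: returns the general category of x, or #f
;; if the call raises an &assertion condition (for example when x
;; is not a character).
(define (safe-category x)
  (guard (e ((assertion-violation? e) #f))
    (char-general-category x)))

(safe-category #\a)   ; the category symbol for #\a
(safe-category "a")   ; #f, if the implementation raises &assertion
```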

SEE ALSO

char-alphabetic?(3scm)

STANDARDS

R6RS

AUTHORS

This page is part of the scheme-manpages project. It includes materials from the RnRS documents. More information can be found at https://github.com/schemedoc/manpages/.
