
Bug#1023878: ITP: libgrapheme -- Unicode string library with small footprint and high performance



Package: wnpp
Severity: wishlist
Owner: Paride Legovini <paride@debian.org>
X-Debbugs-Cc: debian-devel@lists.debian.org, paride@debian.org, dev@frign.de

* Package name    : libgrapheme
  Version         : 2.0.2
  Upstream Author : Laslo Hunhold <dev@frign.de>
* URL             : https://libs.suckless.org/libgrapheme/
* License         : ISC
  Programming Lang: C
  Description     : Unicode string library with small footprint and high performance

libgrapheme is an extremely simple freestanding C99 library providing
utilities for properly handling strings according to the Unicode
standard (currently version 15.0.0). It offers fully Unicode-compliant

 * grapheme cluster (i.e. user-perceived character) segmentation
 * word segmentation
 * sentence segmentation
 * detection of permissible line break opportunities
 * case detection (lower-, upper- and title-case)
 * case conversion (to lower-, upper- and title-case)

on UTF-8 strings and codepoint arrays, both of which can also be null-terminated.

The necessary lookup tables are automatically generated from the Unicode
standard data (contained in the tarball) and heavily compressed. Over
10,000 automatically generated conformance tests and over 150 unit tests
ensure conformance and correctness.

According to upstream, it is also considerably smaller and much faster
than the other established Unicode string libraries (ICU, GNU's
libunistring, libutf8proc).

I plan to maintain the package on Salsa under the debian/ namespace,
unless someone suggests an appropriate team. In that case I'd
be happy to team-maintain the package.

I already maintain packages from the same upstream, with whom I have
always had an excellent collaboration.

Paride

