GaBuZoMeu, a parser for language tags

First publication of this article on 19 September 2006
Last update on 25 August 2009

GaBuZoMeu is a set of programs to parse and check language tags (see RFC 5646, produced by the IETF working group LTRU, Language Tag Registry Update).

Language tags are used by several protocols (like HTTP) and languages (like XML) to indicate the human language you want or have. Examples of language tags are "fr" (French), "apa" (Apache), "sv-FI" (Swedish, as practiced in Finland) or "uk-Latn" (Ukrainian in the Latin script).

An IANA registry stores the registered values.

A language tag is well-formed if it is syntactically correct. This can be tested without access to the registry. A language tag is valid if, in addition, all its subtags are registered. This depends on the copy of the registry you use, since the registry changes over time.
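As an illustration only (this is not GaBuZoMeu's code, which is written in Haskell), the difference between the two checks can be sketched in Python. The sketch assumes a simplified subset of the RFC 5646 grammar: a two- or three-letter language subtag followed by optional script, region and variant subtags. Extended language subtags, extensions, private use and grandfathered tags are left out. The names SIMPLE_TAG, REGISTRY_COPY, is_well_formed and is_valid are invented for the example, and REGISTRY_COPY is a hand-written, hypothetical excerpt, not the real IANA registry.

```python
import re

# Simplified subset of the RFC 5646 grammar (an assumption of this sketch):
# language, then optional script, region and variants. Extended language
# subtags, extensions, private use and grandfathered tags are not handled.
SIMPLE_TAG = re.compile(
    r"(?P<language>[a-z]{2,3})"
    r"(?:-(?P<script>[a-z]{4}))?"
    r"(?:-(?P<region>[a-z]{2}|[0-9]{3}))?"
    r"(?P<variants>(?:-(?:[a-z0-9]{5,8}|[0-9][a-z0-9]{3}))*)",
    re.IGNORECASE | re.ASCII,
)

# Hypothetical, hand-written excerpt standing in for a local registry copy.
# Subtags are stored in lowercase because tags are compared case-insensitively.
REGISTRY_COPY = {
    "language": {"en", "fr", "uk"},
    "script": {"latn", "cyrl"},
    "region": {"au", "fr", "ua"},
    "variant": set(),
}


def is_well_formed(tag):
    """Syntax check only: no registry is consulted."""
    return SIMPLE_TAG.fullmatch(tag) is not None


def is_valid(tag, registry=REGISTRY_COPY):
    """Well-formed, and every subtag is present in the given registry copy."""
    match = SIMPLE_TAG.fullmatch(tag)
    if match is None:
        return False
    parts = match.groupdict()
    variants = [v for v in parts.pop("variants").lower().split("-") if v]
    for kind, value in parts.items():
        if value is not None and value.lower() not in registry[kind]:
            return False
    return all(v in registry["variant"] for v in variants)


# "de-DE" is well-formed but absent from this small excerpt, so it is
# reported as not valid against this particular copy.
for tag in ["fr", "en-AU", "uk-Latn", "de-DE", "e", "en--AU"]:
    print(tag, is_well_formed(tag), is_valid(tag))
```

Passing the registry as a parameter mirrors the point made above: the same tag can be valid against one copy of the registry and not against another, while well-formedness never changes.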

GaBuZoMeu includes the following programs:

  • check-wf checks the well-formedness of tag(s). Example of use: "check-wf fr en-AU".
  • check-valid checks the validity of tag(s). It is used the same way: "check-valid fr en-AU".

It complies with RFC 5646.

GaBuZoMeu is written in Haskell, so you'll need a Haskell compiler such as GHC.

To compile, just type "make". To check that everything is OK, you can type "make test". You should get zero "Errors" and zero "Failures".

GaBuZoMeu is distributed as free software, under the GNU General Public Licence. Remarks, patches and bug reports are welcome.

Written and maintained by Stéphane Bortzmeyer.

Thanks to AFNIC, the .fr registry, for making this work possible.

Some other language tag parsers can be found at the language tag Web site.

XML source of this page (this page is distributed under the terms of the GFDL licence)