To make programs easily translatable, all messages should be placed in dictionaries. A dictionary is made of message entries. Each message has a unique ID and a value. In the C++ source, programmers refer to those messages by ID whenever they want to print or say something.
Each time a programmer needs a new message, it has to be added to the message dictionary and referenced from the C++ source code. This is how most systems work (there are other translation systems out there).
The system used by Linuxconf is fundamentally different. Messages
are defined in the C++ source code and the dictionaries are
built by scanning all C++ source files.
Programmers must provide an ID and a value for each message right
in the source code. This is much easier (or nicer) than going
back and forth between the source and the dictionary.
Furthermore, the programmer sees the message definition directly
in the source. With other systems, only the message ID is visible
in the source.
Using the magic of the C preprocessor, the message value is
not compiled into the object code at all. Seen this way, the translation
system used by Linuxconf yields the same result as other systems.
It is just nicer to use for programmers.
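The preprocessor trick can be sketched as follows. This is a minimal illustration, not the header Linuxconf generates: the table demo_dictionary, the ID M_HELLO and the function greeting are hypothetical names made up for the example. Only the idea is taken from the text: the macro keeps the ID and discards the value.

```cpp
#include <cassert>

// Hypothetical stand-in for a dictionary loaded at run time.
static const char *demo_table[] = { "Bonjour" };
static const char **demo_dictionary = demo_table;

// The macro keeps the ID and drops the value, so the literal
// "Hello" below never reaches the compiler proper.
#define MSG_U(id, value) id
// Each ID expands to one slot of the dictionary table.
#define M_HELLO demo_dictionary[0]

// Returns whatever the dictionary currently holds for M_HELLO.
const char *greeting()
{
    assert(demo_dictionary != nullptr);  // the dictionary must be loaded first
    return MSG_U(M_HELLO, "Hello");
}
```

After preprocessing, the body of greeting() is just `return demo_dictionary[0];`. The English text in the source serves as documentation for the programmer and as input for the scanning step.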
Let's describe how a programmer uses the system.
It is best to define one message dictionary per sub-project or
sub-directory. This is easier to manage and avoids ID name space
congestion. For each source directory of Linuxconf you have
one "dic" file and one "m" file. Both files are produced
simply by doing
make msg
This command scans all C++ source files of the current directory
and updates the file ../messages/sources/DIRECTORY.dic and
the file DIRECTORY.m, where DIRECTORY is the name of the
current directory.
make msg uses the ../translate/msgscan utility to
scan the source. This utility looks for specific constructs in the
C++ source files. Here they are.
MSG_U macro
The MSG_U macro defines a new message. It defines both its ID
and its value. This macro is usable anywhere a C++ string would
be.
#include <stdio.h>
#include "prjfoo.m"
int foo()
{
	/* M_MSG1 is the message ID, the string is its value */
	printf (MSG_U(M_MSG1,"Entering function foo"));
	return 0;
}
MSG_U defines a single value: the U stands for unilingual.
MSG_B macro
The MSG_B macro is like the MSG_U macro, except that it
defines two values, allowing a programmer to write a message in
two languages at once. The B stands for bilingual. This
has not been used in the Linuxconf project but has proven
effective for other projects.
#include <stdio.h>
#include "prjfoo.m"
int foo()
{
	/* One ID, two values: one per language */
	printf (MSG_B(M_MSG1
		,"Entering function foo\n"
		,"Démarrage de la fonction foo\n"));
	return 0;
}
MSG_R macro
The MSG_R macro simply references an already defined
message. This message may have been defined in another source
file (of the same project). Like the other macros, MSG_R may
be used anywhere a C++ string is.
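A short sketch of the difference between defining and referencing a message. The macro definitions and the table demo_dictionary are stand-ins written for this example (in a real project they would come from the generated .m file), and M_NOTFOUND, report_open and report_stat are hypothetical names.

```cpp
#include <cassert>
#include <string>

// Stand-ins for what the generated .m header would provide.
static const char *demo_table[] = { "File not found" };
static const char **demo_dictionary = demo_table;
#define MSG_U(id, value) id   // defines a message: ID plus value
#define MSG_R(id) id          // references an existing message: ID only
#define M_NOTFOUND demo_dictionary[0]

// First use: the message is defined here, with its value.
std::string report_open()
{
    return std::string("open: ") + MSG_U(M_NOTFOUND, "File not found");
}

// Later use, possibly in another source file of the same project:
// only the ID is given, the value is not repeated.
std::string report_stat()
{
    return std::string("stat: ") + MSG_R(M_NOTFOUND);
}
```

Both functions end up reading the same dictionary slot, so a translator only sees the message once.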
MSG_VERSION macro
This macro has not been used so far. It would allow a programmer to raise the version number of a dictionary, preventing older applications from using a newer, potentially incompatible dictionary.
The msgclean utility also plays with the version number of the
dictionary. The MSG_VERSION macro is still a concept rather
than a useful addition. Stay tuned...
MSG_ macros
The MSG_ macros perform two tasks. First, they are easily
spotted by the msgscan utility. The parsing is simple and reliable
even if the C++ source code does not compile yet. Second, they
hide the retrieval mechanism (how the message value is retrieved from
the binary dictionary at runtime).
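To illustrate why such macros are easy to spot, here is a rough sketch of a scanner that extracts ID/value pairs from MSG_U(...) calls with a regular expression. It is not the msgscan implementation, whose internals are not described here. The function scan_msg_u and the Message struct are hypothetical, only the plain MSG_U form is handled, and the pattern ignores details such as escaped quotes, comments or concatenated string literals.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <vector>

// One message definition found in the source text.
struct Message {
    std::string id;
    std::string value;
};

// Collects every MSG_U(ID, "value") occurrence in the given source text.
std::vector<Message> scan_msg_u(const std::string &source)
{
    // MSG_U ( ID , "value" ), with optional whitespace (including
    // newlines) between the tokens.
    static const std::regex pattern(
        R"re(\bMSG_U\s*\(\s*([A-Za-z_][A-Za-z0-9_]*)\s*,\s*"([^"]*)"\s*\))re");
    std::vector<Message> found;
    for (std::sregex_iterator it(source.begin(), source.end(), pattern), end;
         it != end; ++it) {
        found.push_back({(*it)[1].str(), (*it)[2].str()});
    }
    return found;
}
```

Because the scanner only needs to recognise the macro call itself, it does not have to understand the surrounding C++ at all.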
The msgscan utility produces the .m file, which looks like this
for the simple example above.
FILE prjfoo.m:
extern const char **_dictionary_prjfoo;
#ifndef DICTIONNARY_REQUEST
#define DICTIONNARY_REQUEST \
const char **_dictionary_prjfoo;\
TRANSLATE_SYSTEM_REQ _dictionary_req_prjfoo\
("prjfoo",_dictionary_prjfoo,55,1);\
void dummy_dict_prjfoo(){}
#endif
#ifndef MSG_U
#define MSG_U(id,m) id
#define MSG_B(id,m,n) id
#define MSG_R(id) id
#endif
#define M_MSG1 _dictionary_prjfoo[0]
As you see, one global variable is created: _dictionary_prjfoo.
A special macro DICTIONARY_REQUEST is defined. This macro
should be placed in exactly one source file of the project. It is generally
placed in the file _dict.c presented later.
P_MSG_ macros
When you create a translatable message, you use the MSG_U or MSG_B macro (or MSG_R if you are reusing a previously defined message).
This translates to
_dictionary_DIRNAME[message_ID];
where _dictionary_DIRNAME is a char **
(a table allocated using malloc by calling the translat_load function).
DIRNAME is the name of the source directory.
So you can write code like this
void foo()
{
	static const char *tb[]={
		MSG_U(ID1,"blabla"),
		MSG_U(ID2,"...")
	};
	..
}
This translates to highly efficient code. There is a problem, however: the following sequence won't work.
static const char *tb[]={
	MSG_U(ID1,"blabla"),
	MSG_U(ID2,"...")
};
void foo()
{
	for (unsigned i=0; i<sizeof(tb)/sizeof(tb[0]); i++){
		const char *p = tb[i];
		.
		.
	}
	..
}
Here the table tb[] is initialised at program startup time, before main is even called. In the first example, tb[] inside function foo() is initialised the first time the function is called, with a sequence that looks like this (generated magically by the compiler)
static bool is_init=false;
if (!is_init){
	..
	is_init = true;
}
This is how most compilers handle that. When tb[] is defined outside a function, the variable _dictionary_DIRNAME[] is not yet initialized. The translat_load() function is called explicitly by the application after it knows which language to use and the location of the proper message dictionary.
Before main, _dictionary_DIRNAME is a NULL pointer.
So we need a way to "remember" that a given message has an ID and is
associated with a given table. The P_MSG_U macro is used to hide this
complexity. It creates a new object of type TRANS_NOTLOAD.
In this object, we store a pointer to the _dictionary_DIRNAME
variable and the message ID. Later, we can retrieve the message
by using the get() member function. The offending code would be
rewritten like this:
static TRANS_NOTLOAD *tb[]={
	P_MSG_U(ID1,"blabla"),
	P_MSG_U(ID2,"...")
};
void foo()
{
	for (unsigned i=0; i<sizeof(tb)/sizeof(tb[0]); i++){
		const char *p = tb[i]->get();
		.
		.
	}
	..
}
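The idea behind this deferred lookup can be sketched in a few lines. This is not the Linuxconf TRANS_NOTLOAD class, whose definition is not shown here: DeferredMsg, demo_dictionary, demo_load and french_table are hypothetical names chosen for the example. The sketch only assumes what the text states, namely that the object remembers which dictionary variable to consult and which message ID to fetch, and resolves the string when get() is called.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical dictionary pointer: null until the application loads it.
static const char **demo_dictionary = nullptr;

// Remembers where to look, not what is stored there.
class DeferredMsg {
    const char ***dict_;  // address of the dictionary pointer variable
    std::size_t id_;      // index of the message in the table
public:
    DeferredMsg(const char ***dict, std::size_t id) : dict_(dict), id_(id) {}
    const char *get() const
    {
        assert(*dict_ != nullptr);  // the dictionary must be loaded by now
        return (*dict_)[id_];
    }
};

// File-scope table: built before main(), while demo_dictionary is still null.
// That is safe because nothing dereferences the dictionary yet.
static DeferredMsg msg_first(&demo_dictionary, 0);
static DeferredMsg msg_second(&demo_dictionary, 1);
static DeferredMsg *tb[] = { &msg_first, &msg_second };

// Stand-in for the loading step the application performs once the
// language is known.
static const char *french_table[] = { "Premier", "Second" };
void demo_load()
{
    demo_dictionary = french_table;
}
```

Storing the address of the dictionary variable rather than its value is the point of the design: the variable can be filled in at any later time, and every get() call made afterwards sees the loaded table.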