
2. Principles

To make programs easily translatable, all messages should be placed in dictionaries. A dictionary is made of message entries, each with a unique ID and a value. In the C++ source, programmers refer to a message by its ID whenever they want to print or say something.

Each time a programmer needs a new message, they have to add it to the message dictionary and then reference it from the C++ source code. This is how most translation systems work (though other designs exist).

The system used by Linuxconf is fundamentally different. Messages are defined in the C++ source code, and the dictionaries are built by scanning all C++ source files. Programmers must provide an ID and a value for each message right in the source code. This is much easier (or nicer) than going back and forth between the source and the dictionary. Furthermore, the programmer sees the message definition directly in the source. With other systems, only the message ID is visible in the source.

Using the magic of the C preprocessor, the message value is not compiled into the object code at all. Seen this way, the translation system used by Linuxconf yields the same result as other systems. It is just nicer for programmers to use.
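To make the preprocessor point concrete, here is a minimal sketch. The MSG_U definition has the same shape as the one msgscan generates (see section 2.6); the table demo_table, the variable _dictionary_demo and the function greeting() are hypothetical stand-ins for a dictionary that would normally be loaded at runtime.

```cpp
// Hypothetical stand-in for a dictionary table loaded at runtime.
static const char *demo_table[] = { "On entre dans la fonction foo" };
static const char **_dictionary_demo = demo_table;

// Same shape as the generated definition: the second argument (the
// message value) is discarded by the preprocessor.
#define MSG_U(id,m)     id
#define M_MSG1  _dictionary_demo[0]

// Returns the entry found in the table, not the literal written here.
const char *greeting()
{
        return MSG_U(M_MSG1,"Entering function foo");
}
```

After preprocessing, the body of greeting() is just a table lookup, so the English literal only serves the programmer and the scanner.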

Let's describe how a programmer uses the system.

2.1 One dictionary per source directory

It is best to define one message dictionary per sub-project or sub-directory. This is easier to manage and avoids ID name space congestion. For each source directory of Linuxconf you have one "dic" file and one "m" file. Both files are produced simply by running

        make msg
        

This command scans all C++ source files in the current directory and updates the files ../messages/sources/DIRECTORY.dic and DIRECTORY.m, where DIRECTORY is the name of the current directory.

make msg uses the ../translate/msgscan utility to scan the sources. This utility looks for specific constructs in the C++ source files. Here they are.

2.2 The MSG_U macro

The MSG_U macro defines a new message. It defines both its ID and its value. This macro is usable anywhere a C++ string would be.

        #include <stdio.h>
        #include "prjfoo.m"

        int foo()
        {
                printf (MSG_U(M_MSG1,"Entering function foo"));
                return 0;
        }
        

MSG_U defines a single value. The U stands for unilingual.

2.3 The MSG_B macro

The MSG_B macro is like the MSG_U macro, except that it defines two values, allowing a programmer to write a message in two languages at once. The B stands for bilingual. This has not been used in the Linuxconf project but has proven effective in other projects.

        #include <stdio.h>
        #include "prjfoo.m"

        int foo()
        {
                printf (MSG_B(M_MSG1
                        ,"Entering function foo\n"
                        ,"Démarrage de la fonction foo\n"));
                return 0;
        }
        

2.4 The MSG_R macro

The MSG_R macro simply references an already defined message. This message may have been defined in another source file (of the same project). Like the other macros, MSG_R may be used anywhere a C++ string is.
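As a small illustration of MSG_R, the sketch below defines a message once with MSG_U and references it a second time with MSG_R. The macro definitions have the same shape as the generated ones shown in section 2.6; the table demo_table and the two functions are hypothetical, and a real project would get the macros from its .m file instead.

```cpp
// Hypothetical stand-in for the dictionary loaded at runtime.
static const char *demo_table[] = { "Entering function foo\n" };
static const char **_dictionary_prjfoo = demo_table;

// Same shape as the generated definitions.
#define MSG_U(id,m)     id
#define MSG_R(id)       id
#define M_MSG1  _dictionary_prjfoo[0]

// The message is defined here, with its ID and its value...
const char *defined_here()
{
        return MSG_U(M_MSG1,"Entering function foo\n");
}

// ...and only referenced here, possibly from another source file.
const char *referenced_elsewhere()
{
        return MSG_R(M_MSG1);
}
```

Both functions end up reading the same dictionary entry; the only difference is which one carries the definition that the scanner picks up.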

2.5 The MSG_VERSION macro

This macro has not been used so far. It would allow a programmer to raise the version number of a dictionary, preventing older applications from using a newer, potentially incompatible dictionary.

The msgclean utility also plays with the version number of the dictionary. For now, the MSG_VERSION macro is still a concept rather than a useful addition. Stay tuned...

2.6 The magic of the MSG_ macros

The MSG_ macros perform two tasks. First, they are easily spotted by the msgscan utility: the parsing is simple and reliable even if the C++ source code is not functional. Second, they hide the retrieval mechanism (how the message value is retrieved from the binary dictionary at runtime).
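To give a feel for why such macros are easy to spot, here is a rough sketch of a scanner. It is not the real msgscan: the function scan_msg_u() and its regular expression are assumptions made for illustration, and string literals containing escaped quotes are not handled.

```cpp
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper, not the real msgscan: collect (ID, value) pairs
// from every MSG_U(ID,"value") found in a piece of source text.
std::vector<std::pair<std::string, std::string>>
scan_msg_u(const std::string &source)
{
        // An identifier, a comma, then a string literal without embedded
        // quotes. Whitespace and newlines are allowed between the parts.
        static const std::regex pattern(
                R"re(MSG_U\s*\(\s*(\w+)\s*,\s*"([^"]*)"\s*\))re");
        std::vector<std::pair<std::string, std::string>> found;
        for (std::sregex_iterator it(source.begin(), source.end(), pattern), end;
             it != end; ++it) {
                found.emplace_back((*it)[1].str(), (*it)[2].str());
        }
        return found;
}
```

Because the scanner only looks for the macro pattern, it does not care whether the surrounding text is valid C++.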

The msgscan utility produces the .m file, which looks like this for the simple example above.

        FILE prjfoo.m:

        extern const char **_dictionary_prjfoo;
        #ifndef DICTIONNARY_REQUEST
                #define DICTIONNARY_REQUEST \
                const char **_dictionary_prjfoo;\
                TRANSLATE_SYSTEM_REQ _dictionary_req_prjfoo\
                        ("prjfoo",_dictionary_prjfoo,55,1);\
                void dummy_dict_prjfoo(){}
        #endif
        #ifndef MSG_U
                #define MSG_U(id,m)     id
                #define MSG_B(id,m,n)   id
                #define MSG_R(id)       id
        #endif
        #define M_MSG1  _dictionary_prjfoo[0]
        

As you can see, one global variable is created: _dictionary_prjfoo. A special macro, DICTIONARY_REQUEST, is also defined. This macro should be placed in exactly one source file of the project. It is generally placed in the file _dict.c, presented later.

2.7 The P_MSG_ macros

When you create a translatable message, you use the MSG_U or MSG_B macro (or MSG_R if you are reusing a previously defined message).

This translates to

        _dictionary_DIRNAME[message_ID];
        

where _dictionary_DIRNAME is a char ** (a table allocated with malloc by the translat_load function) and DIRNAME is the name of the source directory. So you can write things like this

        void foo()
        {
                static const char *tb[]={
                        MSG_U(ID1,"blabla"),
                        MSG_U(ID2,"...")
                };
                ..
        }
        

This is translated to highly efficient code. There is a catch, however: the following sequence won't work.

        static const char *tb[]={
                MSG_U(ID1,"blabla"),
                MSG_U(ID2,"...")
        };

        void foo()
        {
                for (unsigned i=0; i<sizeof(tb)/sizeof(tb[0]); i++){
                        const char *p = tb[i];
                        .
                        .
                }
                ..
        }
        

In this second example, the table tb[] is initialised at program startup time, before main is even called. In the first example, tb[] inside function foo() is initialised the first time the function is called, with a sequence that looks like this (generated magically by the compiler)

        static bool is_init=false;
        if (!is_init){
                ..
                is_init = true;
        }
        

This is how most compilers handle static variables inside a function. When tb[] is defined outside a function, however, the variable _dictionary_DIRNAME is not yet initialized at the time the table is built: the translat_load() function is only called explicitly by the application, after it knows which language to use and the location of the proper message dictionary.
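The timing difference can be observed with the sketch below. It deliberately departs from the real system in one respect: the dictionary pointer starts out pointing at a placeholder table instead of being NULL, so the file-scope table can be built without crashing. All names (placeholder, loaded, _dictionary_demo, demo_load and the two lookup functions) are hypothetical.

```cpp
// Hypothetical stand-ins. In the real system the pointer is NULL until
// translat_load() runs; a placeholder table is used here so that the
// effect can be observed safely.
static const char *placeholder[] = { "(not loaded)" };
static const char *loaded[] = { "Entering function foo" };
static const char **_dictionary_demo = placeholder;

#define MSG_U(id,m)     id
#define M_MSG1  _dictionary_demo[0]

// File-scope table: built before main(), so it captures whatever the
// dictionary pointer refers to at that moment.
static const char *global_tb[] = { MSG_U(M_MSG1,"Entering function foo") };

// Hypothetical stand-in for translat_load(): switch to the real table.
void demo_load()
{
        _dictionary_demo = loaded;
}

// Function-scope table: built on the first call to this function.
const char *local_lookup()
{
        static const char *tb[] = { MSG_U(M_MSG1,"Entering function foo") };
        return tb[0];
}

const char *global_lookup()
{
        return global_tb[0];
}
```

If demo_load() is called before the first call to local_lookup(), the function-scope table sees the loaded dictionary, while the file-scope table keeps the stale entry it captured at startup.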

Before main, _dictionary_DIRNAME is a NULL pointer. So we need a way to "remember" that a given message has an ID and is associated with a given table. The P_MSG_U macro is used to hide this complexity. It creates a new object of type TRANS_NOTLOAD. In this object, we store a pointer to the _dictionary_DIRNAME variable and the message ID. Later, we can retrieve the message by using the get() member function. The offending code would be rewritten like this:

        static TRANS_NOTLOAD *tb[]={
                P_MSG_U(ID1,"blabla"),
                P_MSG_U(ID2,"...") 
        };

        void foo()
        {
                for (unsigned i=0; i<sizeof(tb)/sizeof(tb[0]); i++){
                        const char *p = tb[i]->get();
                        .
                        .
                }
                ..
        }
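The idea behind TRANS_NOTLOAD can be sketched as follows. This is not the real class: DemoNotLoaded, its members, _dictionary_demo, demo_load() and message() are hypothetical names, and the real implementation may differ. The sketch only shows the principle described above, namely recording the address of the dictionary variable and the message ID, and deferring the lookup to get().

```cpp
#include <cassert>

// Hypothetical stand-in for the generated dictionary variable.
static const char **_dictionary_demo = nullptr;   // NULL until loaded

// Sketch of the idea only; the real TRANS_NOTLOAD class may differ.
class DemoNotLoaded {
        const char ***dict_;    // address of the dictionary variable
        int id_;                // index of the message in that table
public:
        DemoNotLoaded(const char ***dict, int id) : dict_(dict), id_(id) {}
        // Look the message up now, not when the object was created.
        const char *get() const
        {
                assert(*dict_ != nullptr);      // dictionary must be loaded
                return (*dict_)[id_];
        }
};

// Built before main(), while the dictionary is still NULL. This is
// harmless because only the variable's address and the ID are recorded.
static DemoNotLoaded *tb[] = {
        new DemoNotLoaded(&_dictionary_demo, 0),
        new DemoNotLoaded(&_dictionary_demo, 1)
};

// Hypothetical stand-in for translat_load().
static const char *loaded[] = { "blabla", "..." };
void demo_load()
{
        _dictionary_demo = loaded;
}

const char *message(unsigned i)
{
        return tb[i]->get();
}
```

Once demo_load() has run, message(0) and message(1) return the loaded entries, even though the table tb[] itself was built at startup.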
        

