Localizing Language Resource Files

Sandcastle uses a set of resource files to contain text such as table of contents item titles, common topic element titles, messages, etc. Copies of the resource files are located in the .\Components\Shared\Content subfolder beneath the root help file builder installation folder.

The Resource Files

The common content files are named SharedContent_*.xml and presentation style specific content files are named after their presentation style (Markdown_*.xml and OpenXml_*.xml). The suffix on the filenames indicates the language it represents. The corresponding files will be included based on the language specified in the help file builder project. If content files cannot be found for the selected language, the English (en-US) versions are used.

The supplied files can be translated into other languages for use in your help files. The suffix on the filename is the language ID based on the value returned by the CultureInfo.Name property (i.e. en-US, fr-FR, de-DE). Each file has a .xml extension.

A set of stop word list files also exists in the .\Components\Shared\StopWordList folder that are used when creating the full text index for the website build. The files contain a list of common words that are excluded from the full-text index. This prevents the index files from getting too large and containing useless information. You can add additional words to these files as needed. The files all named using the language ID based on the value returned by the CultureInfo.Name property (i.e. en-US, fr-FR, de-DE). Each file has a.txt extension.

Localizing Language Resources

When you do a build with a language other than English, the help file builder will automatically look for a files named after the selected language and will use the translated files if it finds them. If the translated files cannot be found, a warning is issued in the log file and the English versions will be used. To create a help file builder resource file for a new language, copy one of the existing files and change the name to use the culture name's ID as described above. If you are not sure what ID to use in the filename, go to the Locale Identifier Constants and Strings help topic and locate the locale that you need. The primary language and sub-language columns contain the ID values in parentheses. Combine the two values in parentheses from those columns separated by a dash for the ID to use in the filename. Edit the new file and translate the text to the selected language. If people are kind enough to supply additional translated files, they will be added to later releases.

Due to the number of presentation styles and the number of files that would be created, it is not possible to provide a matching set of Sandcastle's content files for each language. You will need to create localized copies of them for each language that you want. As above, if people are kind enough to supply additional translated files, they will be added to later releases. The standard files in the noted .\Content folders will always be used when English (en-US) is used as the selected language or if files for the selected language cannot be found. Note that these files may change in new releases of the help file builder so you may have to compare and merge changes with your localized versions after each new release is issued.

How the files are encoded is very important if they contain extended characters. To ensure that the help file builder and the Sandcastle tools properly interpret the encoding within the files, it is best to save the files such that they contain byte order marks at the start of the file for Unicode encoded formats as well as an XML header tag that specifies the correct encoding. In the absence of byte order marks, the encoding in the XML header tag ensures that the file is still interpreted correctly. The supplied default language resource files contain examples of this.

When using entities to represent special characters in the XML language resource files or in the header text, copyright text, etc, use the numeric form rather than the name form as the XML parser will not recognize them and will throw an exception. For example, if you specify Ä (Latin capital letter A with diaeresis) an exception will be generated. To fix it, use the numeric form instead (Ä). This also applies to symbols such as © in the copyright text. Instead, you should use © to get the copyright symbol.

See Also