Example of a wordfile used for syntax highlighting in UltraEdit

As discussed in the Syntax highlighting topic, UltraEdit applies syntax highlighting from definitions and configurations in wordfiles. Wordfiles must be in plain text, ANSI format with DOS style line endings and a .uew file extension. Wordfiles must be no more than 372 KB in size.
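
The requirements above can be checked before a file is copied into the wordfiles folder. The following Python sketch is only an illustration and is not part of UltraEdit. The function name check_wordfile is hypothetical, and the sketch assumes that "372 KB" means 372 x 1024 bytes and that a Unicode byte order mark indicates a file that is not in ANSI format.

```python
from typing import List

MAX_BYTES = 372 * 1024  # assumption: 1 KB = 1024 bytes


def check_wordfile(name: str, data: bytes) -> List[str]:
    """Return a list of detected problems; an empty list means none were found."""
    problems = []
    # Assumption: the extension check is not case sensitive.
    if not name.lower().endswith(".uew"):
        problems.append("extension is not .uew")
    if len(data) > MAX_BYTES:
        problems.append("file is larger than 372 KB")
    # Assumption: a UTF-8 or UTF-16 byte order mark means the file is not ANSI.
    if data.startswith((b"\xef\xbb\xbf", b"\xff\xfe", b"\xfe\xff")):
        problems.append("file starts with a Unicode byte order mark")
    # DOS style line endings: no CR or LF may remain once CR+LF pairs are removed.
    remainder = data.replace(b"\r\n", b"")
    if b"\n" in remainder or b"\r" in remainder:
        problems.append("file contains non-DOS line endings")
    return problems
```

An empty result only means that none of these particular checks failed; it does not prove that UltraEdit will accept the wordfile.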

The wordfiles are loaded on startup, or when modified via the Add another language... dialog. Wordfiles are loaded from the folder shown in the Editor Display » Syntax Highlighting section of Settings (subfolders are ignored).

The easiest way to create your own wordfile is to take one of the existing wordfiles and modify it as desired, saving it into the wordfiles directory with a unique name and .uew file extension. You will need to restart UltraEdit in order to initialize the new wordfile, but once initialized, any changes you make to it will be reflected in real-time when you switch to an open file being highlighted by the wordfile you're modifying.

This topic exhaustively documents all possible options and sections of the wordfile.

Language definition

The first step to creating a valid wordfile is defining the language it will apply highlighting for. This is done on the first line of the wordfile via the following syntax:

/L#"Language Name"

...where "#" is a number (no longer used for sorting purposes but still required) and "Language Name" is the name as it will appear in the list of available syntax highlighting languages.
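
To illustrate the shape of this line, the following Python sketch extracts the number and the language name from line 1 of a wordfile. It is a rough approximation for illustration only; the function name parse_language_definition and the regular expression are assumptions, not UltraEdit's own parser.

```python
import re
from typing import Optional, Tuple

# Matches "/L", a number, then the language name in double quotes at the start of line 1.
LANGUAGE_DEFINITION = re.compile(r'^/L(\d+)"([^"]*)"')


def parse_language_definition(first_line: str) -> Optional[Tuple[int, str]]:
    """Return (number, language name), or None if the line has no language definition."""
    match = LANGUAGE_DEFINITION.match(first_line)
    if match is None:
        return None
    return int(match.group(1)), match.group(2)
```

For a first line such as /L1"My language" Line Comment = //, the sketch returns the number 1 and the name My language; everything after the closing quote is left for the options described in the following sections.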

Language options

There are several keywords available for specifying syntax highlighting behavior options in the wordfile. These options are set by simply including the keyword in the wordfile. These are typically added immediately after the language definition on line 1 of the wordfile, but can be listed on their own line if you desire. Note: If the option is on its own line in the wordfile, it must be preceded by a forward slash, e.g.:

/DisableMLS

Option name           Behavior
Nocase                Disables case sensitivity for syntax highlighting. By default, syntax highlighting is case sensitive.
Noquote               Disables string highlighting for the language completely.
EnableMLS             Enables multi-line string support (an unclosed string may span multiple lines) for the language.
DisableMLS            Disables multi-line string support (an unclosed string will not be highlighted past the first line) for the language.
NestBlockComments     Enables the matching/pairing of nested block comment highlighting if the coding language supports these. (HTML, for example, does not.)
EnableCFByIndent      Overrides any code folding string and applies code folding based upon the code's indentation level (for languages like Python).
EnableSpellasYouType  Enables spell-as-you-type inline spell checking for the language (if set in the Spell checker » Miscellaneous section of Settings).

Special "_LANG" language flags

In order to apply correct syntax highlighting and other wordfile-based functionality, for several languages, special internal handling is required. These languages are determined via special flags in the wordfile. There are a fixed number of special language flags, and more may be added in the future. The available flags are described in detail below.

The language flag must exist on line 1 of the wordfile, somewhere after the language definition.

Important note: each language flag may be used in only one wordfile. Adding a language flag to multiple wordfiles will result in unexpected / incorrect highlighting for the associated language(s).

AASM_LANG Language flag for: AT&T Assembly

ASP_LANG Language flag for: ASP

Note: the .asp(x) file extension (and all variants) should be listed in the HTML wordfile to support multi-language embedded ASP highlighting. This will not adversely affect standalone ASP files.

COBOL_LANG Language flag for: COBOL

CSHARP_LANG Language flag for: C#

CSS_LANG Language flag for: CSS

C_LANG Language flag for: C/C++

ECMA_LANG Language flag for: EcmaScript

FORTRAN_LANG Language flag for: FORTRAN

With this flag, in highlighted source files UltraEdit treats a 'C', 'c' or '*' in the first column as a line comment indicator and the rest of the line is highlighted as a line comment.

HTML_LANG Language flag for: HTML

HTML is considerably different from other coding languages. With the HTML_LANG keyword, in the wordfile itself, the special characters "<" and optionally "/" can be prepended to any keyword in the wordfile without having to sort lines as is normally required for wordfile color groups. For example, all keywords beginning with "<a" or "</a" should be on the same line as other words beginning with "a". In the same way, all words beginning with "<b" or "</b" should be on the same line as other words beginning with "b", but on a different line from those starting with "<a", "</a", or "a".

This flag also controls HTML tag matching and some other special internal handling for HTML syntax highlighting.

This flag also identifies the language for which multi-language web-based highlighting is allowed; for example, CSS, JavaScript, or PHP embedded in a .html file.

JAVA_LANG Language flag for: Java

JSCRIPT_LANG Language flag for: JavaScript

LATEX_LANG Language flag for: LaTeX

With the LaTeX language flag, UltraEdit applies special handling so that words beginning with "\", as well as consecutive words, are recognized and highlighted appropriately.

This also allows the keywords to be sorted in the wordfile color group without all of them being on the same line. If the keyword begins with "\" then the second character is used to determine which line the word should be on. For example, all words beginning with "\a" should be on the same line as other words beginning with "\a" or "a". In the same way, all words beginning with "\b" should be on the same line as other words beginning with "\b" or "b" but on a different line from those starting with "\a".

MASM_LANG Language flag for: Microsoft Assembly

MATLAB_LANG Language flag for: MATLAB

NASM_LANG Language flag for: Netwide Assembly

PASCAL_LANG Language flag for: Pascal

PERL_LANG Language flag for: Perl

PHP_LANG Language flag for: PHP

Note: the .php file extension should be listed in the HTML wordfile to support multi-language embedded PHP highlighting. This will not adversely affect standalone PHP files.

PLB_LANG Language flag for: PLB

PUREBASIC_LANG Language flag for: PureBasic

PYTHON_LANG Language flag for: Python

RUBY_LANG Language flag for: Ruby

SQL_LANG Language flag for: SQL

VBSCRIPT_LANG Language flag for: VBScript

VB_LANG Language flag for: Visual Basic

XML_LANG Language flag for: XML

XSL_LANG Language flag for: XSL

Comments

Line comments

Line comments can be defined with the following syntax, typically placed on line 1 of the wordfile:

Line Comment = #
/Line Comment = # (if on its own line in wordfile)

The above would cause all text following a "#" character to be colored as a comment until the end of the line.

Note: If Block Comment On or Block Comment On Alt is defined in the wordfile, but the Block Comment Off/Block Comment Off Alt is not, the block commenting will stop at the end of the line. This effectively allows block comments to be used as line comments also. (See Block comments section below)

Line comments must be 5 characters or fewer. If a line comment has fewer than 5 characters and is defined on the first line of the wordfile, the characters must be followed by a space.

You can define a second set of line comments with the following syntax:

Line Comment Alt = //
/Line Comment Alt = // (if on its own line in wordfile)

The above would cause all text following "//" to be colored as a comment until the end of the line.

Some languages may require that a space follow the line comment character. To facilitate this, the following wordfile definition is available:

Line Comment Num = xCC
/Line Comment Num = xCC (if on its own line in wordfile)

...where x specifies the number of characters (1 to 5) and immediately following are the characters to be used as line comments. In the example above, x would be 3 since the line comment would be "CC " (note the space after "CC").
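
The relationship between the value and the resulting comment string can be sketched in Python as follows. This is an illustration only: the function name decode_line_comment_num is hypothetical, and padding the comment with trailing spaces up to the stated length is an assumption inferred from the "CC " example above.

```python
def decode_line_comment_num(value: str) -> str:
    """Decode a value such as "3CC" into the line comment string "CC "."""
    count = int(value[0])
    if not 1 <= count <= 5:
        raise ValueError("character count must be between 1 and 5")
    # Assumption: characters missing from the value are trailing spaces.
    return value[1:1 + count].ljust(count)
```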

By default, UltraEdit will treat all characters preceding a comment as valid, but some languages may require that line comments are only valid if they occur after specific characters. To facilitate this, the following wordfile definition is available:

Line Comment Preceding Chars = [a-z]
/Line Comment Preceding Chars = [a-z] (if on its own line in wordfile)

In the example above, UltraEdit would only identify line comments if the line comment character immediately follows any character a – z.

You can also use the tilde (~) to specify a negative matching set of characters (i.e., comments are not valid if they are preceded by the following characters):

Line Comment Preceding Chars = [~a-z]
/Line Comment Preceding Chars = [~a-z] (if on its own line in wordfile)

In the example above, UltraEdit would only identify line comments if the line comment character does not immediately follow any character a – z.
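
The positive and negative forms of this rule can be compared with a small Python sketch. It is an approximation for illustration, not UltraEdit's implementation: the function name comment_allowed is hypothetical, only simple sets such as [a-z] are handled, and the treatment of a comment in column 1 (where no preceding character exists) is an assumption.

```python
import re


def comment_allowed(line: str, index: int, preceding_chars: str) -> bool:
    """Check whether a line comment starting at line[index] passes the rule.

    preceding_chars is the wordfile value, for example "[a-z]" or "[~a-z]".
    """
    negate = preceding_chars.startswith("[~")
    body = preceding_chars[2:-1] if negate else preceding_chars[1:-1]
    if index == 0:
        # Assumption: with no preceding character, only a negative set is satisfied.
        return negate
    in_set = re.fullmatch("[" + body + "]", line[index - 1]) is not None
    return in_set != negate
```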

By default, UltraEdit will treat comment characters in any column as a valid comment, but some languages may require that line comments start in specific columns. To facilitate this, the following wordfile definition is available:

Line Comment Valid Columns = [1-7,10]
/Line Comment Valid Columns = [1-7,10] (if on its own line in wordfile)

In the example above, UltraEdit would only identify and highlight line comments if the line comment character occurs at columns 1 through 7, or at column 10.
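
As an illustration of how a column list such as [1-7,10] expands, the following Python sketch converts the value into a set of column numbers and tests a line against it. The function names are hypothetical, and the sketch assumes that columns are counted from 1 and that only the first occurrence of the comment string on a line is considered.

```python
from typing import Set


def parse_valid_columns(value: str) -> Set[int]:
    """Expand a value such as "[1-7,10]" into a set of 1-based column numbers."""
    columns = set()
    for part in value.strip("[]").split(","):
        first, _, last = part.partition("-")
        columns.update(range(int(first), int(last or first) + 1))
    return columns


def comment_column_valid(line: str, comment: str, columns: Set[int]) -> bool:
    """Check whether the first occurrence of comment starts in a valid column."""
    # str.find returns -1 when absent, giving column 0, which is never valid.
    return line.find(comment) + 1 in columns
```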

Block comments

Block comments can be defined with the following syntax, typically placed on line 1 of the wordfile:

Block Comment On = /* Block Comment Off = */
/Block Comment On = /*
/Block Comment Off = */ (if on its own line in wordfile)

The above would cause all text between "/*" and "*/" to be highlighted as a block comment, even if it spans multiple lines.

Note: If Block Comment On is defined in the wordfile, but Block Comment Off is not, the block commenting will stop at the end of the line. This effectively allows block comments to be used as line comments also.

You can define a second set of alternative block comments as well. These can be defined with the following syntax, typically placed on line 1 of the wordfile:

Block Comment On Alt = /* Block Comment Off Alt = */
/Block Comment On Alt = /*
/Block Comment Off Alt = */ (if on its own line in wordfile)

Both block comments and alternate block comments must be no more than 19 characters.

By default, UltraEdit will end block comment highlighting at the next block comment close string in the source code. To force UltraEdit to explicitly match all block comment start strings with all block comment end strings, use the NestBlockComments directive as described above.
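
The difference between the default behavior and NestBlockComments can be sketched in Python. This is a simplified model for illustration, not UltraEdit's scanner; the function name block_comment_end and its parameters are hypothetical.

```python
def block_comment_end(text: str, start: int, open_str: str = "/*",
                      close_str: str = "*/", nested: bool = False) -> int:
    """Return the index just past the block comment that opens at start.

    If the comment is never closed, the length of the text is returned.
    """
    depth = 0
    i = start
    while i < len(text):
        # Without nesting, only the first open string is counted.
        if text.startswith(open_str, i) and (nested or depth == 0):
            depth += 1
            i += len(open_str)
        elif text.startswith(close_str, i):
            depth -= 1
            i += len(close_str)
            if depth == 0:
                return i
        else:
            i += 1
    return len(text)
```

For the text /* a /* b */ c */ x, the default mode ends the comment after the first */, while the nested mode ends it after the second one.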

Strings

By default, in any syntax highlighted file, UltraEdit will highlight any text following a double quote (") or single quote (') as a string. The string highlighting occurs until a matching string character of the same type.

You can override the string characters in the wordfile, or disable them completely with the Noquote directive on line 1. To set custom string characters, use the following syntax, typically placed on line 1 of the wordfile:

String Chars = "
/String Chars = " (if on its own line in wordfile)

In the above, only text following " would be highlighted as a string. No more than 2 string characters can be defined.

If you have 2 different string characters set (or are using the defaults) and want a different highlighting color for each type of string, you can include one of the string characters in a color group on a line by itself, then configure the color for that color group in your theme. This will override the default string highlighting color in your theme. For example:

/L1"My language" Line Comment = // Block Comment On = /* Block Comment Off = */ String Chars = '"
/C1"Special strings"
"

In the above, any strings encapsulated by " would be highlighted with the color set for the "Special strings" color group, whereas any strings encapsulated by ' would be highlighted with the regular color for strings.

UltraEdit supports multi-line string highlighting, and this behavior is configurable. By default, UltraEdit will highlight strings spanning multiple lines. To disable this, add the DisableMLS directive to line 1 of the wordfile.

String literal prefix

You can define a string prefix character for string literals in the wordfile as well. For example, in C# the string literal is defined by an "@" sign preceding the string. To set the string literal character, use the following syntax, typically placed on line 1 of the wordfile:

String Literal Prefix = @
/String Literal Prefix = @ (if on its own line in wordfile)

In the example above, the "@" before the string indicates that a backslash is not an escape character which is useful for encapsulating file paths in strings. To illustrate, the following two statements are equivalent:

"c:\\data\\"
@"c:\data\"

The only special character in a @"..." literal is the double quote ("), which is simply doubled if you need to embed one.

Escape characters

Most languages support an escape character (usually "\") for overriding string characters (and other characters with special meanings as well). You can define the escape character using the following syntax, typically placed on line 1 of the wordfile:

Escape Char = \
/Escape Char = \ (if on its own line in wordfile)
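
The effect of an escape character on string highlighting can be sketched in Python. This is a simplified model for illustration only; the function name string_end is hypothetical, and the sketch assumes that the escape character always protects exactly one following character.

```python
def string_end(text: str, start: int, escape: str = "\\") -> int:
    """Return the index just past the string whose opening quote is text[start].

    If the string is never closed, the length of the text is returned.
    """
    quote = text[start]
    i = start + 1
    while i < len(text):
        if text[i] == escape:
            i += 2  # skip the escape character and the character it protects
        elif text[i] == quote:
            return i + 1
        else:
            i += 1
    return len(text)
```

With the escape character set to "\", the string "a \" b" is treated as one string; without an escape character, the same text would end at the second double quote.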

File extensions / names

File extensions

In most cases, UltraEdit uses a file's extension to determine which wordfile/language to use for its syntax highlighting. The file extensions for a particular wordfile / language are set in the wordfile, typically at the end of line 1, using the following syntax with each file extension separated by a space:

File Extensions = C CPP CC CXX H HPP AWK
/File Extensions = C CPP CC CXX H HPP AWK (if on its own line in wordfile)

File extensions are not case sensitive. When checking a file's extension, UltraEdit will only evaluate the characters after the last dot in the file's name. A maximum of 97 single-byte characters are supported for this setting.

You can set a default syntax highlighting language by adding an asterisk (*) to the file extensions definition. This would force all files not matching a file extension in any other wordfile to use this wordfile / language for syntax highlighting. This includes new unsaved files and files without an extension.
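
The matching rules above (only the text after the last dot is evaluated, case is ignored, and "*" acts as a fallback) can be sketched in Python. This is an illustration under stated assumptions, not UltraEdit's lookup code: the function name pick_language is hypothetical, and the language names and extension lists in the usage example are made up.

```python
from typing import Dict, Optional


def pick_language(file_name: str, languages: Dict[str, str]) -> Optional[str]:
    """Pick a language for file_name.

    languages maps a language name to its File Extensions value,
    for example {"C/C++": "C CPP CC CXX H HPP AWK"}.
    """
    # Only the characters after the last dot are evaluated, ignoring case.
    _, dot, extension = file_name.rpartition(".")
    extension = extension.upper() if dot else ""
    default = None
    for language, value in languages.items():
        listed = value.upper().split()
        if extension and extension in listed:
            return language
        if "*" in listed and default is None:
            default = language
    # No wordfile lists the extension: fall back to the one containing "*".
    return default
```

With {"C/C++": "C CPP CC CXX H HPP AWK", "Text": "TXT *"}, the file main.cpp resolves to "C/C++", while README and notes.md both fall back to "Text".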

File names

In some cases, you may want to assign a wordfile / language based upon the full file name. To do this, use the following syntax, typically placed on line 1 of the wordfile, with each file name separated by a space:

File Names = config myfile.xml
/File Names = config myfile.xml (if on its own line in wordfile)

In the example above, any file named "config" or "myfile.xml" would be highlighted with this wordfile.

A maximum of 125 single-byte characters are supported for this setting. Names including spaces may not be used.

Shebang parsing

Perl, XML, and other script source files are frequently saved without an extension. To ensure proper, automatic highlighting of these, if the file's extension does not match any of the existing wordfiles' definitions, UltraEdit will search for a language marker / shebang in the first line of the source file. The following defaults are used:

Language   Language identifier (must be in line 1)
Perl       #!/usr/bin/perl
PHP        #!/bin/php
Python     #!/bin/python
XML        <?xml

Other script files (especially Unix shell scripts) often use some form of the shebang line to identify the language or script to be used, for example "#!/bin/ksh". To support these, you can extend the internal functionality with the language marker setting using the following syntax on a separate line in the wordfile:

/LanguageMarker = #!/bin/ksh

You can also set multiple language markers using the following syntax:

/LanguageMarker = "ksh" "sh" "csh"

In the above example, any files with the following lines would be matched and the syntax highlighting of the wordfile would be applied:

#!/bin/ksh
#! /usr/bin/sh
#!/usr/local/bin/csh
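
One way to picture how the short markers relate to these shebang lines is the following Python sketch. It is an approximation for illustration and may differ from UltraEdit's actual matching: the function name shebang_matches is hypothetical, and comparing the marker with the last component of the interpreter path is an assumption.

```python
from typing import List


def shebang_matches(first_line: str, markers: List[str]) -> bool:
    """Check whether line 1 of a source file names one of the given interpreters."""
    if not first_line.startswith("#!"):
        return False
    words = first_line[2:].split()
    if not words:
        return False
    # Assumption: the marker is compared with the last component of the
    # interpreter path, so "sh" does not match "/usr/bin/perl".
    interpreter = words[0].rsplit("/", 1)[-1]
    return interpreter in markers
```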

Function list strings

In order to populate the Function list, UltraEdit uses function strings defined in the wordfile. Function strings are regular expressions used to match and identify functions in source code. (They don't necessarily have to be used for functions; they can be used for whatever you wish to list in the function list.) Function listing is not case sensitive, regardless of whether the Nocase directive is in the wordfile.

Historically, UltraEdit has used its legacy regular expression syntax for function strings. However, you can specify that function strings will use Perl regular expression syntax (recommended and used by most modern wordfiles) by adding the following to the top portion of the wordfile on its own line:

/Regexp Type = Perl

The recommended method for creating or modifying function strings is through the Modify groups dialog, accessible by right-clicking in the function list and selecting "Configuration." If modified through this GUI, functions can be arranged and displayed in a hierarchical tree-style view, with the function strings being written out to the wordfile automatically. When written out automatically, these function strings will look similar to the following:

/TGBegin "Functions"
/TGFindStr = "^(?!if\b|else\b|while\b|[\s*])(?:[\w*~_&]+?\s+){1,6}([\w:*~_&]+\s*)\([^);]*\)[^{;]*?(?:^[^\r\n{]*;?[\s]+){0,10}\{"
/TGBegin "Parameters"
/TGFindStr = "\s*([^,]+)"
/TGFindBStart = "\("
/TGFindBEnd = "\)"
/TGEnd
/TGBegin "Variables"
/TGFindStr = "^[ \t]*((?:static[ \t*]+)?(?:const[ \t*]+)?(?:(?:un)?signed[ \t*]+)?(?:long[ \t*]+)?[a-z0-9_]+[ \t*&]+[a-z0-9[\]_]+);"
/TGFindBStart = "\{"
/TGFindBEnd = "\}"
/TGFindStr = "^[ \t]*((?:static[ \t*]+)?(?:const[ \t*]+)?(?:(?:un)?signed[ \t*]+)?(?:long[ \t*]+)?[a-z0-9_]+[ \t*&]+[a-z0-9[\]_]+)[ \t]*=.+;"
/TGFindBStart = "\{"
/TGFindBEnd = "\}"
/TGEnd
/TGEnd

Since these function strings are created automatically by the editor, modifying them directly is neither recommended nor supported. However, you can create your own function strings and manually add them to the wordfile using the legacy function string syntax.

Legacy function strings

To use legacy function strings, add a line similar to the following with the regular expression that matches functions in your source files:

/Function String = "^[ \t]*function[ \t]([^\(])+"

Up to 6 function strings are supported, and each subsequent string should be on its own line with a sequential number starting with 1 for the second function string. For example:

/Function String = "<regexp1>"
/Function String 1 = "<regexp2>"
/Function String 2 = "<regexp3>"
/Function String 3 = "<regexp4>"
/Function String 4 = "<regexp5>"
/Function String 5 = "<regexp6>"

The regular expression you provide for the function string must be enclosed in double quotes and should tag (via the regular expression's tagging syntax) the portion that matches the function name, as this is what will be returned in the function list. In the first example above, which uses the regular expression ^[ \t]*function[ \t]([^\(])+, any preceding whitespace, the word "function", the whitespace after the word "function", and any number of characters that are not "(" would be matched, but only the last portion would be returned in the function list, as it is the only portion that is tagged via parentheses.
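
The idea of tagging can be tried outside the editor with any Perl-style regular expression engine. The following Python sketch uses Python's re module, whose syntax is similar but not identical to Perl's, with a hypothetical pattern that is not taken from any shipped wordfile. It shows that the whole declaration is matched but only the tagged group is listed.

```python
import re
from typing import List

# Hypothetical pattern: optional indentation, the word "function",
# whitespace, then the tagged (parenthesized) function name.
FUNCTION_STRING = r"^[ \t]*function[ \t]+([^(\s]+)"


def list_functions(source: str) -> List[str]:
    """Return the tagged part of every match, ignoring case as the function list does."""
    pattern = re.compile(FUNCTION_STRING, re.MULTILINE | re.IGNORECASE)
    return [match.group(1) for match in pattern.finditer(source)]
```

Lines that do not start with the word "function" (after optional indentation) are not matched, so an anonymous function assigned to a variable would not be listed by this particular pattern.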

Delimiters

Delimiters are defined as any non-word character. UltraEdit has a default set of built-in delimiters, but you can override these and define your own delimiters on a per-wordfile basis. This is useful if you have a non-standard coding language or a syntax that does not follow standard grammar rules. You can also use this if the language uses what are typically non-word characters as word characters.

To specify the delimiters, add a new line similar to the following to the wordfile:

/Delimiters = ~!@$%^&*()_-+=|\/{}[]:;"'<> ,.?/

If the delimiters setting does not exist in the wordfile, UltraEdit's built-in defaults are used.

With the exception of the "<" and ">" characters in HTML, a character that is a delimiter cannot also be part of a word. So, for example, you cannot specify keywords including the @ symbol and list @ as a delimiter in the wordfile. However, a delimiter can be included at the beginning of a keyword and be highlighted accordingly, but delimiters cannot be included in the middle of keywords. If a "compound" keyword, or a keyword that includes a delimiter character between two sections is desired, the delimiter character would need to be removed from the delimiters list in the wordfile, or the two portions of the keyword would need to be defined separately to highlight correctly.

It is possible to assign delimiter characters to the color sections. If you have a character that is a delimiter, such as a '+', and you wish it to be colored with one of the keyword color groups, you may add this character on a line of its own under the color section. The character will retain its recognition as a delimiter and be highlighted with the appropriate color.

Indent / unindent strings

UltraEdit provides automatic indentation on a per-language basis to indent or unindent a line when the indent / unindent string is typed. By default, UltraEdit uses "{" and "}" as indent and unindent strings, respectively. To override this and specify your own indent and unindent strings, use the following syntax on separate lines in the wordfile:

/Indent Strings = "{"
/Unindent Strings = "}"

Any number of words/strings may be specified in quotes (each word or string must be in a separate set of quotes).

When an indent string is typed in a file being highlighted by the wordfile, an indent will be added to the line. The indent value (tab/spaces, and number thereof) is in accordance with your indentation settings in the Editor » Word wrap/tab settings section of Settings. The indentation is the next tab stop over from the indentation of the preceding line (same as if the Tab key was pressed).

When an unindent string is typed in a file being highlighted by the wordfile, an indent will be removed from the line.

When reindenting existing text (see Reindent selection), you may want to avoid indenting certain lines that are commented out or are compiler directives, for example. You can prevent this by adding indentation ignore strings to the wordfile on a separate line using the following syntax:

/Ignore Strings SOL = "#" "//"

Any number of words/strings may be specified in quotes (each word or string must be in a separate set of quotes). For the word to match it must be the first character(s) of the line. If a line does begin with one of these ignore strings, it would not be indented, and the indenting of the next line would continue on as if this line didn't exist.

Code folding strings

UltraEdit provides code folding on a per-language basis. By default, UltraEdit uses "{" and "}" as open fold and close fold strings, respectively. To override this and specify your own open / close fold strings, use the following syntax on separate lines in the wordfile:

/Open Fold Strings = "{"
/Close Fold Strings = "}"

Alternatively, if you want code folding of the source language to be based on indentation rather than strings, add the EnableCFByIndent directive to line 1 of the wordfile. This will override any open/close fold strings set in the wordfile.

In some cases, you may want the fold logic to ignore lines containing a certain string. To facilitate this, add the following line to the wordfile:

/Ignore Fold Strings = "Exit Function"

Note that there are no default ignore fold strings.

UltraEdit also allows you to specify open and close fold strings that are recognized in block comments only via the following lines in the wordfile:

/Open Comment Fold Strings = "#Region"
/Close Comment Fold Strings = "#End Region"

Brace matching strings

UltraEdit provides brace matching (and the ability to jump to / select matching brace) on a per-language basis. UltraEdit uses the standard brace pairs as defaults: "(" and ")", "{" and "}", and "[" and "]". However, you can override these by using the following syntax on separate lines in the wordfile:

/Open Brace Strings = "If" "For" "Select Case" "Else" "ElseIf"
/Close Brace Strings = "End If" "Next" "End Select" "End If" "ElseIf"

It's important to note that, unlike code folding and indent/unindent strings, brace matching strings must be positionally matched in their lists. So in the above example, "If" will only brace match to "End If", "For" will only brace match to "Next", etc. "If" will not brace match to "Next", because "If" occurs first in the open brace string list while "Next" occurs second in the close brace string list. Because of this, there may be valid cases where there are duplicate open or close brace strings.
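
Positional matching can be pictured as pairing the two lists entry by entry. The following Python sketch is an illustration only, not UltraEdit's implementation; the function name brace_pairs is hypothetical.

```python
from typing import Dict, List, Set


def brace_pairs(open_strings: List[str], close_strings: List[str]) -> Dict[str, Set[str]]:
    """Map each open brace string to the close strings at the same list positions."""
    pairs: Dict[str, Set[str]] = {}
    for opener, closer in zip(open_strings, close_strings):
        pairs.setdefault(opener, set()).add(closer)
    return pairs
```

Applied to the two lists in the example above, "If" and "Else" both map to "End If", "For" maps to "Next", and "ElseIf" maps to itself, which is why "End If" appears twice in the close brace string list.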

Marker characters

There may be instances where all text between two characters should be highlighted. UltraEdit facilitates this via marker characters, which mark the first and last part of a string that UltraEdit will highlight. All characters between the two characters are highlighted including the marker characters themselves. To add marker characters, add the following to the wordfile on its own line:

/Marker Characters = "ab"

...where "a" is the first character of the string to be highlighted and "b" is the last character. All characters between "a" and "b" will be highlighted, including spaces. If the line is a comment or string, marker characters are ignored. Alphanumeric characters may be used, but whitespace characters (space/tab) are not supported as marker characters.

Marker character highlighting does not span multiple lines; the highlighting will end either at the first occurrence of the closing marker character or the end of the line, whichever comes first.

You can define up to 4 pairs of characters to highlight between as in:

/Marker Characters = "abcdefgh"

...where strings starting with "a" and ending with "b" are highlighted as are strings starting with "c" and ending with "d", etc.

You must also configure the color of the highlighted string by adding the two marker characters under the appropriate color group section as if they were a word such as "ab", "cd", etc. See the Keyword color groups section. Marker character highlighting will not work if they're not added to a color group.
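
The pairing of marker characters and the single-line limit can be sketched in Python. This is a simplified model for illustration: the function names marker_pairs and marked_text are hypothetical, and the sketch ignores the rule that marker characters have no effect inside comments and strings.

```python
from typing import List, Optional, Tuple


def marker_pairs(value: str) -> List[Tuple[str, str]]:
    """Split a Marker Characters value such as "abcdefgh" into (first, last) pairs."""
    return [(value[i], value[i + 1]) for i in range(0, len(value) - 1, 2)]


def marked_text(line: str, first: str, last: str) -> Optional[str]:
    """Return the text that would be highlighted on one line, or None if there is none."""
    start = line.find(first)
    if start == -1:
        return None
    end = line.find(last, start + 1)
    # Highlighting stops at the closing marker or at the end of the line.
    return line[start:] if end == -1 else line[start:end + 1]
```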

Keyword color groups

Keyword color groups comprise the bulk of most wordfiles. These sections are simply collections of string-based keywords that are highlighted within source files. UltraEdit supports up to 20 color groups. The contents of color groups can be arbitrarily determined by the user, although in most cases color groups are used to contain similar types of the language's keywords (for example, "Built-in functions," "Operators," "Basic keywords," etc).

Each color group's first line must consist of a "/C#" directive to designate the color group. The "#" must be a unique number in sequential order, starting with 1. Additionally, you can specify a name for the color group in quotes immediately following the "/C#" directive. While the name is optional, it is highly recommended to provide this as it is used to label the language's color group setting in the theme. The name can be up to 24 characters.

Keywords within a color group must be sorted alphabetically. Multiple keywords starting with the same character may be on one line, as long as they are separated by a space and sorted alphabetically. By default, keyword highlighting is case sensitive, but you can make highlighting case-insensitive by adding the Nocase directive to line 1 of the wordfile.

If the language is case sensitive, the letter "A" is different from "a" and so words starting with "A" must be on a different line than words starting with "a". Otherwise, "A" and "a" are seen as the same letter and keywords beginning with either of these should be grouped / sorted as such.

Examples:

/C1"Basic keywords"
auto
bool break
case char const continue
default defined do double
else enum extern
float for
goto
if int
long
register return
short signed sizeof static struct switch
typedef
union unsigned
void volatile
while
/C2"Data Types"
bool byte
char class
decimal delegate double
enum
float
int interface
long
object
sbyte short string struct
uint ulong ushort
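
When a keyword list is long, the sorting and line grouping shown in these examples can be generated rather than maintained by hand. The following Python sketch is one possible helper, not an UltraEdit tool: the function name layout_keywords is hypothetical, and the sort order (plain character order, with case folded only when nocase is set) is an assumption.

```python
from typing import Iterable, List


def layout_keywords(keywords: Iterable[str], nocase: bool = False) -> List[str]:
    """Sort keywords and place those sharing a first character on one line."""
    fold = str.lower if nocase else str
    words = sorted({w for w in keywords if w}, key=lambda w: (fold(w), w))
    lines: List[List[str]] = []
    for word in words:
        # Start a new line whenever the first character changes.
        if lines and fold(lines[-1][0][0]) == fold(word[0]):
            lines[-1].append(word)
        else:
            lines.append([word])
    return [" ".join(line) for line in lines]
```

The returned lines can be written beneath a "/C#" directive. With nocase set to True, words starting with "A" and "a" share a line; otherwise they are placed on separate lines.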

Keywords beginning with a sub-string

Many languages use words that begin with a predefined sub-string, while the rest of the word is not known in advance or is chosen by the programmer while writing the code; for example, variables beginning with "$" in PHP. Because variable names are completely arbitrary, it is not possible to highlight them with a fixed list of keywords. To facilitate these cases, UltraEdit provides sub-string support. The sub-strings must be defined within a color group, as with any other set of keywords; however, the line containing the sub-strings should start with "** " and all sub-strings should be on the same line, sorted alphabetically. For example:

/C3"Variables"
** $ aaa bbb

In this example, all words in the source file beginning with "$", "aaa", or "bbb" would be highlighted with the color of the "Variables" color group.

Keywords starting with "/"

As UltraEdit uses '/' as a command character within the wordfile, keywords that begin with this character require special handling. To highlight words beginning with a '/' the line should begin with '// ' followed by the keywords themselves, sorted alphabetically. For example:

/C4"Some / keywords"
// /akeyword /mykeyword /zkeyword