In addition to Perl regular expressions, UltraEdit supports two other "legacy" styles: a proprietary regular expression syntax and a basic Unix syntax. We recommend Perl regular expressions, which are far more powerful and robust than either legacy style.
UltraEdit syntax:
Symbol | Function |
---|---|
% | Matches the start of line - Indicates the search string must be at the beginning of a line but does not include any line terminator characters in the resulting string selected. |
$ | Matches the end of line - Indicates the search string must be at the end of line but does not include any line terminator characters in the resulting string selected. |
? | Matches any single character except newline. |
* | Matches any number of occurrences of any character except newline. |
+ | Matches one or more of the preceding single character/character set. At least one occurrence of the character must be found. |
++ | Matches the preceding single character/character set zero or more times. |
^b | Matches a page break. |
^p | Matches a newline (CR/LF) (paragraph) (DOS Files) |
^r | Matches a newline (CR Only) (paragraph) (MAC Files) |
^n | Matches a newline (LF Only) (paragraph) (UNIX Files) |
^t | Matches a tab character |
[xyz] | A character set. Matches any one of the characters between the brackets. |
[~xyz] | A negative character set. Matches any one character NOT between the brackets, including newline characters. |
^{A^}^{B^} | Matches expression A OR B |
^ | Overrides the following regular expression character |
^(...^) | Brackets or tags an expression to use in the replace command. A regular expression may have up to 9 tagged expressions, numbered according to their order in the regular expression. The corresponding replacement expression is ^x, for x in the range 1-9. Example: If ^(h*o^) ^(f*s^) matches "hello folks", ^2 ^1 would replace it with "folks hello". |
Note: ^ refers to the caret character '^', not the Ctrl key.
Examples:
m?n | matches "man", "men", "min" but not "moon". |
t*t | matches "test", "tonight" and the "tea t" portion of "tea time", but not "tea time" when a newline separates "tea " and "time". |
Te+st | matches "test", "teest", "teeeest" etc. but does not match "tst". |
[aeiou] | matches every lowercase vowel |
[,.?] | matches a literal ",", "." or "?". |
[0-9a-z] | matches any digit or lowercase letter |
[~0-9] | matches any character except a digit (~ means NOT the following) |
You may search for an expression A or B as follows:
"^{John^}^{Tom^}"
This will search for an occurrence of John or Tom. Nothing may appear between the closing ^} of the first expression and the opening ^{ of the second.
You may combine A or B and C or D in the same search as follows:
"^{John^}^{Tom^} ^{Smith^}^{Jones^}"
This will search for John or Tom followed by Smith or Jones.
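To see how the proprietary symbols in the first table relate to the Perl-style syntax recommended above, the following is a minimal sketch of a translator written in Python. The function name `ue_to_python` and both lookup tables are hypothetical, the mapping is only an assumption read off the table (for example, `^b` is treated as a form feed), and Python's `re` module is not UltraEdit's engine, so edge cases may behave differently.

```python
import re

# Assumed mapping of two-character caret sequences (not an official conversion table).
_ESCAPES = {"^t": r"\t", "^p": r"\r\n", "^r": r"\r", "^n": r"\n",
            "^b": r"\f",            # assumption: page break == form feed
            "^(": "(", "^)": ")"}   # tagged expression -> capturing group

# Assumed mapping of single-character symbols.
# "%" becomes "^", so compile with re.MULTILINE to anchor at every line start.
_SIMPLE = {"%": "^", "$": "$", "?": r"[^\r\n]", "*": r"[^\r\n]*", "+": "+"}


def ue_to_python(pattern: str) -> str:
    """Translate a subset of the legacy UltraEdit syntax to a Python regex."""
    out = []
    i, n = 0, len(pattern)
    while i < n:
        if pattern.startswith("^}^{", i):      # boundary between A and B
            out.append("|")
            i += 4
        elif pattern.startswith("^{", i):      # start of an A-or-B group
            out.append("(?:")
            i += 2
        elif pattern.startswith("^}", i):      # end of an A-or-B group
            out.append(")")
            i += 2
        elif pattern[i] == "^" and i + 1 < n:  # escape or overridden character
            pair = pattern[i:i + 2]
            out.append(_ESCAPES.get(pair, re.escape(pattern[i + 1])))
            i += 2
        elif pattern[i] == "[":                # character set, "~" negates
            end = pattern.find("]", i + 1)
            if end == -1:
                raise ValueError("unterminated character set")
            body = pattern[i + 1:end]
            negate = body.startswith("~")
            if negate:
                body = body[1:]
            body = body.replace("\\", "\\\\").replace("^", "\\^")
            out.append("[" + ("^" if negate else "") + body + "]")
            i = end + 1
        elif pattern.startswith("++", i):      # zero or more of the preceding
            out.append("*")
            i += 2
        elif pattern[i] in _SIMPLE:
            out.append(_SIMPLE[pattern[i]])
            i += 1
        else:                                  # ordinary literal character
            out.append(re.escape(pattern[i]))
            i += 1
    return "".join(out)


if __name__ == "__main__":
    print(ue_to_python("[~0-9]"))  # [^0-9]
    names = ue_to_python("^{John^}^{Tom^} ^{Smith^}^{Jones^}")
    print(bool(re.search(names, "Tom Jones")))  # True
```

The sketch checks the four-character `^}^{` boundary before the shorter caret sequences so that an A-or-B pair becomes a single non-capturing group, which leaves the numbering of `^(...^)` tagged expressions unchanged.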
Unix syntax:
Symbol | Function |
---|---|
\ | Indicates the next character has a special meaning. "n" on its own matches the character "n". "\n" matches a linefeed or newline character. See examples below (\d, \f, \n etc). |
^ | Matches/anchors the beginning of line. |
$ | Matches/anchors the end of line. |
* | Matches the preceding single character/character set zero or more times. |
+ | Matches one or more of the preceding single character/character set. At least one occurrence of the preceding character or one of the characters in preceding character set must be found. |
. | Matches any single character except a newline character. Does not match repeated newlines. |
(expression) | Brackets or tags an expression to use in the replace command. A regular expression may have up to 9 tagged expressions, numbered according to their order in the regular expression. The corresponding replacement expression is \x, for x in the range 1-9. Example: If (h.*o) (f.*s) matches "hello folks", \2 \1 would replace it with "folks hello". |
[xyz] | A character set. Matches any one of the characters between the brackets. |
[^xyz] | A negative character set. Matches any one character NOT between the brackets, including newline characters. |
\d | Matches a digit character. Equivalent to [0-9]. |
\D | Matches a nondigit character. Equivalent to [^0-9]. |
\f | Matches a form-feed character. |
\n | Matches a linefeed character. |
\r | Matches a carriage return character. |
\s | Matches any whitespace including space, tab, form-feed, etc., but not newline. |
\S | Matches any non-whitespace character but not newline. |
\t | Matches a tab character. |
\v | Matches a vertical tab character. |
\w | Matches any alphanumeric character including underscore. |
\W | Matches any character except alphanumeric characters and underscore. |
\p | Matches CR/LF (same as \r\n) to match a DOS line terminator. |
Note: ^ refers to the caret character '^', not the Ctrl key.
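The tagged-expression example from the Unix-style table can be tried outside the editor. The sketch below uses Python's `re` module, which happens to use the same `\1` to `\9` notation in replacement strings; the function name `swap_words` is hypothetical, and Python's engine is assumed, not guaranteed, to agree with UltraEdit's for this pattern.

```python
import re


def swap_words(text: str) -> str:
    """Swap the two tagged expressions, as in the table's example."""
    # Group 1 is (h.*o), group 2 is (f.*s); the replacement reverses them.
    return re.sub(r"(h.*o) (f.*s)", r"\2 \1", text)


if __name__ == "__main__":
    print(swap_words("hello folks"))  # folks hello
```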
Examples:
m.n | matches "man", "men", "min" but not "moon". |
Te+st | matches "test", "teest", "teeeest" etc. BUT NOT "tst". |
Te*st | matches "test", "teest", "teeeest" etc. AND "tst". |
[aeiou] | matches every lowercase vowel |
[,.?] | matches a literal ",", "." or "?". |
[0-9a-z] | matches any digit or lowercase letter |
[^0-9] | matches any character except a digit (^ means NOT the following) |
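The examples above can be checked with any engine that accepts the same basic symbols. The following sketch uses Python's `re` module as a stand-in; the helper `matches` is hypothetical, and case-insensitive matching is an assumption made because the examples pair "Te+st" with lowercase "test".

```python
import re


def matches(pattern: str, text: str) -> bool:
    """Return True if the whole text matches the pattern, ignoring case."""
    return re.fullmatch(pattern, text, re.IGNORECASE) is not None


if __name__ == "__main__":
    print(matches("m.n", "man"), matches("m.n", "moon"))      # True False
    print(matches("Te+st", "tst"), matches("Te*st", "tst"))   # False True
    print(matches("[^0-9]", "a"), matches("[^0-9]", "7"))     # True False
```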
You may search for an expression A or B as follows:
"(John|Tom)"
This will search for an occurrence of John or Tom. Do not put spaces around the | unless they are part of the text to match.
You may combine A or B and C or D in the same search as follows:
"(John|Tom) (Smith|Jones)"
This will search for John or Tom followed by Smith or Jones.
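As an illustration of the combined search, the sketch below runs the same pattern through Python's `re` module. The function name `find_names` and the sample sentence are made up for the example.

```python
import re

# Same pattern as above: (John or Tom), one space, (Smith or Jones).
NAME = re.compile(r"(John|Tom) (Smith|Jones)")


def find_names(text: str) -> list[str]:
    """Return every full match of the two-part name pattern."""
    return [m.group(0) for m in NAME.finditer(text)]


if __name__ == "__main__":
    print(find_names("Tom Jones met John Smith and Tom Brown."))
    # ['Tom Jones', 'John Smith']
```

"Tom Brown" is not reported because the second alternative group accepts only Smith or Jones.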
If regular expressions aren't enabled for a find/replace, the following special characters are also valid in the Find and Replace fields:
Notation | Represents |
---|---|
^t | Tab character |
^p | New line (DOS files - CR/LF, or hex 0D 0A) |
^r | Carriage return (hex 0D) |
^n | Line feed (new line in Unix based text files) (hex 0A) |
^b | Page break |
^s | Selected text |
^c | Clipboard contents (up to 30,000 characters) |
^^ | Literal "^" character |
Note: ^ refers to the caret character '^', not the Ctrl key.
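To make the notation concrete, here is a minimal sketch of how such a string could be expanded into the literal characters it represents. The function `expand_special` and its `selection` and `clipboard` parameters are hypothetical stand-ins for editor state, `^b` is assumed to mean a form feed (page break), and the 30,000-character clipboard limit is not modeled.

```python
def expand_special(text: str, selection: str = "", clipboard: str = "") -> str:
    """Expand caret notation into literal characters (illustrative only)."""
    table = {
        "t": "\t",       # tab
        "p": "\r\n",     # DOS new line, hex 0D 0A
        "r": "\r",       # carriage return, hex 0D
        "n": "\n",       # line feed, hex 0A
        "b": "\f",       # assumption: page break as form feed
        "s": selection,  # selected text, supplied by the caller
        "c": clipboard,  # clipboard contents, supplied by the caller
        "^": "^",        # literal caret
    }
    out = []
    i = 0
    while i < len(text):
        if text[i] == "^" and i + 1 < len(text) and text[i + 1] in table:
            out.append(table[text[i + 1]])
            i += 2
        else:
            out.append(text[i])  # ordinary character, copied unchanged
            i += 1
    return "".join(out)


if __name__ == "__main__":
    print(repr(expand_special("a^tb^p")))  # 'a\tb\r\n'
    print(expand_special("^^t"))           # ^t
```

Scanning left to right means `^^t` yields a literal caret followed by "t", not a caret followed by a tab.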