
STRINGS

 

string length string  Returns the length of string 

string index string index  Returns the char at the index'th position in string. The last character can be referenced as end. 

string range string first last  Returns a string composed of the characters from first to last

There are 6 string subcommands that do pattern and string matching. 
 
string compare string1 string2 
Compares string1 to string2 and returns  
    *    -1 ..... If string1 is less than string2 
    *     0 ..... If string1 is equal to string2 
    *     1 ..... If string1 is greater than string2

These comparisons are done lexicographically, not numerically. 
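
As a small sketch of what lexicographic comparison means in practice (the variable names are only illustrative), comparing 10 and 9 as strings gives a different answer than comparing them as numbers:

```tcl
# string compare looks at characters, not numeric values.
set asStrings [string compare 10 9]     ;# -1, because "1" sorts before "9"
set asNumbers [expr {10 > 9}]           ;# 1, numeric comparison via expr
set same      [string compare abc abc]  ;# 0
set greater   [string compare b a]      ;# 1
puts "$asStrings $asNumbers $same $greater"
```

When the values are really numbers, expr is usually the better tool; string compare is meant for text.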
    
string first string1 string2 Returns the index in string2 of the character that starts the first match to string1, or -1 if string1 does not occur in string2. 

string last string1 string2 Returns the index in string2 of the character that starts the last match to string1, or -1 if string1 does not occur in string2. 
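
The argument order is easy to get wrong: the first argument is the text being searched for, and the second is the string being searched. A minimal sketch, with illustrative variable names:

```tcl
set haystack "thirumurugan"

# "ru" occurs twice: at index 3 and at index 7.
set firstIdx [string first "ru" $haystack]    ;# 3
set lastIdx  [string last  "ru" $haystack]    ;# 7

# No match returns -1.
set missing  [string first "xyz" $haystack]   ;# -1
puts "$firstIdx $lastIdx $missing"
```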

string wordend string index Returns the index of the character just after the last one in the word which contains the index'th character of string. A word is any contiguous set of letters, numbers or underscore characters, or a single other character.
 
string wordstart string index Returns the index of the first character in the word which contains the index'th character of string. A word is any contiguous set of letters, numbers or underscore characters, or a single other character. 
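
The two word subcommands are typically used together to pull out the word surrounding a given position. The following is a sketch only; the sample string and variable names are assumptions made for illustration:

```tcl
set s "Tcl is_a string"

# Index 5 is the "s" inside the word "is_a" (underscore counts as a word character).
set start [string wordstart $s 5]   ;# 4, index of the first character of the word
set stop  [string wordend $s 5]     ;# 8, index just after the last character

# wordend points one past the word, so subtract 1 for string range.
set word [string range $s $start [expr {$stop - 1}]]
puts $word   ;# is_a
```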

string match pattern string Returns 1 if the pattern matches string. Pattern is a glob style pattern. 

These are the commands which modify a string. Note that none of these modify the string in place. In all cases a new string is returned. 
 
string tolower string Returns string with all the letters converted from upper to lower case.

string toupper string Returns string with all the letters converted from lower to upper case. 

string totitle string Returns string with the first letter converted from lower to upper case, and the rest converted to lower case. 

string trim string ?trimChars? Returns string with all occurrences of trimChars removed from both ends. By default trimChars are whitespace (spaces, tabs, newlines). 

string trimleft string ?trimChars? Returns string with all occurrences of trimChars removed from the left. By default trimChars are whitespace (spaces, tabs, newlines). 

string trimright string ?trimChars? Returns string with all occurrences of trimChars removed from the right. By default trimChars are whitespace (spaces, tabs, newlines). 
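
A short sketch of the case and trim subcommands follows. All of them are subcommands of string, and all return a new value while leaving the original variable untouched. The variable names are illustrative only:

```tcl
set raw "  hello WORLD  "

set trimmed [string trim $raw]              ;# "hello WORLD"
set lower   [string tolower $trimmed]       ;# "hello world"
set upper   [string toupper $trimmed]       ;# "HELLO WORLD"
set title   [string totitle $trimmed]       ;# "Hello world"

# An explicit trimChars argument replaces the default whitespace set.
set left    [string trimleft  "xxhixx" x]   ;# "hixx"
set right   [string trimright "xxhixx" x]   ;# "xxhi"

# The original string is unchanged.
puts "[string length $raw] [string length $trimmed]"   ;# 15 11
```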

format formatString ?arg1 arg2 ... argN? Returns a string formatted in the same manner as the ANSI C sprintf function. formatString is a description of the formatting to use. The full definition is in the format man page. A useful subset of the definition is that formatString consists of literal text, backslash sequences, and % fields. The % fields are strings which start with a % and end with one of:  
    *    s... Data is a string 
    *    d... Data is a decimal integer 
    *    x... Data is a hexadecimal integer 
    *    o... Data is an octal integer 
    *    f... Data is a floating point number

The % may be followed by  
    *    -... Left justify the data in this field (right justification is the default) 
    *    +... Always print a sign for numeric data, even when it is positive

These flags may be followed by a number giving the minimum field width to use for the data. 
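
The field types and flags above can be sketched as follows. Note that in Tcl's format the + flag forces a sign on numbers, while right justification is what you get when no - flag is given. The .2 precision in the last line is standard format syntax that the summary above does not cover; the variable names are illustrative:

```tcl
set leftJust  [format "%-8s|" Tcl]      ;# "Tcl     |"  left justified, width 8
set rightJust [format "%8s|" Tcl]       ;# "     Tcl|"  right justified by default
set hex       [format "%x" 255]         ;# ff
set oct       [format "%o" 8]           ;# 10
set signed    [format "%+d" 42]         ;# +42
set rounded   [format "%.2f" 3.14159]   ;# 3.14
puts "$leftJust $rightJust $hex $oct $signed $rounded"
```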

append 

append is very similar to lappend, but instead of appending elements to a list, it appends strings to a string. The command's structure is:

append varName ?value value ...?

Every argument following varName is appended to the current content of the varName variable, and the new content of the variable is returned.

Example:

 set s "thiru"
 thiru
 append s murugan
 thirumurugan
 append s x y z [string length $s]
 thirumuruganxyz12

string 

To perform different string operations, Tcl uses a single string manipulation command called string, which takes the operation to perform as its first argument. The meaning of the remaining arguments depends on the operation.

For instance, to get the length of a string, the first argument to the string command is length, which is the name of the operation, or the subcommand if you prefer. The other argument is the string itself.

string length "Tcl is a string processor"
25

The number 25 is of course the number of characters that are inside the string "Tcl is a string processor". It's important to know that Tcl strings are binary safe, so every kind of character can be inside a string, including the byte with value zero:

string length "ab\000xy"
5


string range

The range subcommand is used to extract parts of a string. It works much like the lrange command. Indexes can also be given in the form end-<index>. The formal command structure is:

string range string start-index end-index

Example:

puts [string range "Thirumurugan is a Tcl user" 7 end-10]
rugan is

string index

The index subcommand just extracts a single character from the whole string.

string index string index

 

Example:

 string index "thirumurugan" 3
r
 string index "thirumurugan" end
n

A more interesting real-world application of the string index command is the following procedure, which reverses the order of the characters in a string, transforming for example "Tcl" into "lcT". Because the final string is reversed, the procedure is called stringReverse.

proc stringReverse s {
    set res {}
    for {set i 0} {$i < [string length $s]} {incr i} {
        append res [string index $s end-$i]
    }
    return $res
}

source stringReverse.tcl 
stringReverse "Thirumurugan"
nagurumurihT
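
Tcl 8.5 and later also provide a built-in string reverse subcommand, so the hand-written procedure is mainly useful as an exercise. A quick check, assuming Tcl 8.5 or newer:

```tcl
# Built-in equivalent of the stringReverse procedure (requires Tcl 8.5+).
set reversed [string reverse "Thirumurugan"]
puts $reversed   ;# nagurumurihT
```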

 

string equal

A very frequent operation is comparing two strings. String equal does this by looking for an exact match, that is, the strings must match character by character to be considered the same by the command. The return value is 1 if the two strings passed as arguments are the same, otherwise 0 is returned:

string equal thiru murugan
0
string equal tcl tcl
1
string equal tcl TCL
0

"tcl" and "TCL" are not the same for string equal. If you want to compare in a case insensitive way, there is a -nocase option to change the behaviour and consider characters of different case the same:

string equal -nocase tcl TCL
1

Another interesting option is -length num, that limits the comparison to the first num characters:

string equal catch cats
0
string equal -length 3 catch cats
1

string compare

This subcommand is very similar to equal, but instead of returning true or false depending on whether the strings are the same, the command will return:

-1 if the first string is less than the second
0  if the first string is the same as the second
1  if the first string is greater than the second

This gives more information compared to string equal that may be useful for sorting or other tasks.
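
Because string compare returns -1, 0 or 1, it can serve directly as the comparison function for lsort -command. The following is a minimal sketch; compareNoCase is a hypothetical helper name, and it relies on the -nocase option that string compare accepts just like string equal:

```tcl
# Hypothetical helper: case-insensitive ordering built on string compare.
proc compareNoCase {a b} {
    return [string compare -nocase $a $b]
}

set words  {pear apple Banana}

# Default lsort orders by character code, so uppercase "B" comes first.
set plain  [lsort $words]
# With the helper, case is ignored when ordering.
set nocase [lsort -command compareNoCase $words]

puts $plain    ;# Banana apple pear
puts $nocase   ;# apple Banana pear
```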

string match


When there is a need for more powerful string matching capabilities, string match can be used in place of string equal because, instead of comparing two strings, the command compares a string against a pattern.

String match supports patterns composed of normal characters, and the following special sequences:

*             Matches any sequence of characters, even an empty string.
?             Matches any single character.
[chars]       Matches any single character in the set specified. A range can be given in the x-y form, like [a-z], which matches every character from a to z.
\x            Matches exactly x, without interpreting it in a special way. This is used to match *, ?, [, ], and \ as single characters.

Here are some example patterns and what they match, to make it easier to understand how this works:

*xyz*         can match xyz, thiruxyz, thiruxyzmurugan, and so on.
x?z           can match xaz, xxz, x8z, but can't match xz.
[ab]c         can match ac, bc.
[a-z]*[0-9]   can match alf4, biz5, but can't match 123, 2thiru1

string match ?-nocase? pattern string

The return value is 1 if string matches pattern and 0 otherwise. The -nocase option makes the match case insensitive.

 

Example:

 string match {[0-9]} 5
1
 string match thiru* thiru
1
 string match thiru* thirumurugan
1
 string match thiru* muruganthiru
0
 string match ?*@*.* antirez@invece.org
1
 string match ?*@?*.?* antirez@invece.org
1
 string match ?*@?*.?* antirez@org
0
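
The email-like pattern from the examples can be wrapped in a small procedure, and the same sketch shows the -nocase option and the backslash escape. looksLikeEmail is a hypothetical helper name, and the pattern is only a rough shape check, not real address validation:

```tcl
# Hypothetical helper: at least one character, "@", one or more characters,
# ".", and one or more characters.
proc looksLikeEmail {s} {
    return [string match {?*@?*.?*} $s]
}

set good [looksLikeEmail antirez@invece.org]   ;# 1
set bad  [looksLikeEmail antirez@org]          ;# 0

# -nocase ignores letter case while matching.
set caseBlind [string match -nocase {TCL*} "tcl rocks"]   ;# 1

# A backslash makes * a literal character instead of a wildcard.
set literalHit  [string match {a\*b} "a*b"]    ;# 1
set literalMiss [string match {a\*b} "axb"]    ;# 0
puts "$good $bad $caseBlind $literalHit $literalMiss"
```

Putting the pattern in braces keeps Tcl from treating [ ] as command substitution or processing the backslash before string match sees it.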

string map

String map is a powerful tool able to substitute occurrences of strings with other strings. The substitution is driven by a list of key-value pairs. For example the list {thiru murugan x {} y yy} will replace every occurrence of "thiru" with "murugan", will remove every occurrence of "x", and will duplicate every occurrence of "y". The command structure is the following:

string map ?-nocase? mapping string

Examples:

string map {x {}} exchange
echange
string map {1 Tcl 2 great} "1 is 2"
Tcl is great

Note that string map makes a single pass over the original string, so a key cannot match text that was produced by an earlier substitution:

string map {{ } xx x yyy} "Hello World"
HelloxxWorld

When the key-value pairs list is not constant, it's better to use the list command to create it:

set a thiru
thiru
set b murugan
murugan
string map [list $a $b $b $a] thirumurugan
muruganthiru
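
One common use of string map, sketched here under the assumption that only &, < and > need escaping, is making text safe to embed in HTML. escapeHtml is a hypothetical helper name. The single-pass behaviour matters: the & inside an inserted &lt; is not escaped a second time.

```tcl
# Hypothetical helper: escape the three characters that are special in HTML text.
proc escapeHtml {text} {
    return [string map {& &amp; < &lt; > &gt;} $text]
}

set escaped [escapeHtml "a < b && c"]
puts $escaped   ;# a &lt; b &amp;&amp; c

# -nocase makes the keys match regardless of letter case.
set fixed [string map -nocase {tcl Tcl} "TCL and tcl"]
puts $fixed     ;# Tcl and Tcl
```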

string is

String is tests if a string is a member of a given class, like integers, alphanumeric characters, spaces, and so on. 

string is class ?-strict? ?-failindex varname? string

By default the command returns 1 for empty strings, so the -strict option is used to reverse this behaviour and return 0 on empty strings (i.e. to not consider the empty string a member of the given class).

The class can be one of the following:

alnum                 alphabetic or digit character
alpha                 alphabetic character
ascii                 every character in the 7-bit ASCII range
boolean               any form allowed for Tcl booleans (0, 1, yes, no, ...)
control               a control character
digit                 a digit character
double                a valid Tcl double precision number
false                 any form allowed for Tcl boolean with false value
graph                 a printing character, except space
integer               any valid form of 32-bit integers
lower                 a lowercase alphabetic character
print                 a printing character including space
punct                 punctuation character
space                 any space character
true                  any form allowed for Tcl boolean with true value
upper                 an uppercase alphabetic character
wordchar              any word character: alphanumeric characters and connector punctuation such as underscore
xdigit                a hexadecimal digit

As you can see, some classes are oriented to a single character (like alnum), and some are useful for whole strings (like integer). If a string of more than one character is tested against a character-oriented class, every character of the string must belong to the class for the command to return 1.

 

Some examples:

string is integer 33902123
1
string is integer thirumurugan
0
string is upper K
1
string is lower K
0
string is upper "KKK"
1
string is upper "KKz"
0
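
The -strict and -failindex options from the command structure can be sketched as follows. The variable names are illustrative, and the empty-string result shown is the default behaviour described above:

```tcl
# Without -strict the empty string is accepted; with -strict it is rejected.
set emptyDefault [string is integer ""]           ;# 1
set emptyStrict  [string is integer -strict ""]   ;# 0

# -failindex stores the index of the first character that breaks the class.
set allDigits [string is digit -failindex bad "123x5"]   ;# 0
puts "$emptyDefault $emptyStrict $allDigits $bad"        ;# bad is 3
```

Using -strict is usually what you want when validating user input, since an empty field is rarely a valid number.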

© 2015 Thirumurugan
