Chapter 16
Strings
A string is a very common object in Zen. You have used strings throughout this manual. You have seen how to use string literals, how to concatenate strings, and how to compare strings. However, strings are much more powerful and support a wide variety of operations.
In this chapter, you will learn more about strings.
+++ title = “Understanding Strings” weight = 1 +++
In Zen, strings are not primitive reference types. Strings are implemented by the
String
class defined in the zen.core
package. All strings are objects and Zen
treats strings specially.
A string is a sequence of characters. You can create a string by enclosing your text in single quotes. The character sequence along with the single quotes form a string literal.
Unlike many other programming languages, string are not arrays of characters.
Internally the String
class stores the characters in an array, a detail which
you can safely ignore. But you always need to perceive strings as objects and
not arrays.
So far you have seen only one way to create a string object: to use the string
literal. However, the String
class exposes many constructors, which allow
you to create strings in many ways.
You can perform various operations with strings, such as comparing two strings, searching for substrings, concatenating two strings, and so on. However, you cannot modify them because string objects are immutable. In other words, once a string object is created, you cannot change its characters. You can only derive a new string which is a modified version of the original string. At first, this may seem to be a serious restriction. However, immutable strings make your life easier. You can still perform all types of string operations. Immutable strings are more efficient than mutable strings.
Zen provides other mechanisms which allow you to derive modified versions of a string. Each time you derive a new version of the string, you are creating a new string object. The original string is always left as it is.
If you have to make many changes to a string, deriving a new string for each
change is not efficient. For such situtations, Zen provides the StringBuilder
class, which internally represent strings as arrays. So whatever changes you make
will be quick and efficient. Once you are satisfied with the changes, these
classes allow you to derive an immutable string. After which, you can use the
string as usual.
The String and StringBuilder classes are defined in the zen.core
and zen.text
packages, respectively. Therefore, all these classes are available to your programs
automatically. All the three classes extend the CharacterSequence
abstract class,
which means most operations you find in strings are found in the other class, too.
Before we continue, here is a summary of strings.
- Strings can include escape sequences. An escape sequence allows you to represent a special character which you would not be able to type with your keyboard. It is written with a backward slash followed by a letter.
Here is the list of all the escape sequences availablen in Zen.
Escape Sequence | Description |
---|---|
\n | Newline |
\t | Tab |
\b | Backspace |
\r | Carriage Return |
\f | Form feed |
\’ | Apostrophe |
\” | Quotation mark |
\\ | Backslash |
-
Strings are objects. You enclose a string literal in single quotes. A string can contain zero or more characters, even if a string contains a single
-
Strings are an important part of programming languages, therefore Zen treats strings specially. In fact, strings support many special syntax. For example, string objects are created from string literals. However, you can use the constructors of the String class for more flexbility. You will learn about these constructors in the next lecture.
-
You can combine, or concatenate, strings using the concatenation operator (
+
). You can combine strings with other values, in which case Zen automatically converts the value to a string for you. In the case of objects, thetoString()
function is first invoked to derive a string representation.Here is an example of the concatenation operator.
var firstPart = 'Hello, '
var secondPart = 'world!'
var message = firstPart + secondPart
-
Zen offers a wrapper class for each primitive type defined by the Zen Virtual Machine. These wrapper classes allow you to convert a string to a corresponding value. You can do this with the
parse()
function these classes expose.Here is an example of how you can convert a string to an integer.
// StringToInteger.zen
function main(...arguments)
var text = '2019'
var year = Integer.parse(text)
printf('This is the year %d. After 11 years, it will be the year %d.\n', year + 11)
This example converts the string which contains '2019'
to its equivalent
integer value, which is 2019
. However, if you specify an invalid text these
functions throw NumberFormatException
.
-
Unlike other programming languages, Zen does not support characters inherently. However, integer values can be treated as characters and additional support is provided through the
Character
class.+++ title = “Constructors of the String Class” weight = 2 +++
The String
class exposes many constructors. You will learn about a few of
these constructors in this section. Please refer the Zen API documentation for
more information.
Creating Empty Strings
You can create an empty string by calling the default constructor. An empty string basically contains no characters.
Here is an example.
var emtpyString = new String()
Creating Strings From Array of Characters
The String
class implements a constructor which accepts an array of characters.
This allows you to create a string from an array of characters.
function new(sequence)
Here is an example which demonstrates how you can create strings using arrays.
var sequence = arrayEx(String, 'F', 'e', 'b', 'r', 'u', 'a', 'r', 'y' )
var string = new String(sequence)
This creates a string that holds the text 'February'
.
You can further specify a range within the array you want the constructor to copy using this constructor.
function new(sequence, startIndex, count)
Here, the startIndex
indicates the index from where the characters should be
copied. It is inclusive, which means the string at the start index is also
included. The count parameter indicates the number of characters to copy,
beginning the copy at the start index.
The following example demonstrates this constructor.
var sequence = arrayEx(String, 'F', 'e', 'b', 'r', 'u', 'a', 'r', 'y')
var string = new String(sequence, 3, 2)
This creates a string which represents 'ru'
. Remember indexes in Zen always
begin at 0
, therefore, 3
indicates the fourth character.
Cloning Strings
You can create a copy of the string object using the clone()
function.
Here is an example.
// StringCopy.zen
function main(...arguments)
var string1 = 'Hello'
var string2 = string1.clone()
print(string1 is string2)
print(string1 == string2)
The first string contains the text 'Hello'
. We create another copy of this string.
Since the is
operator tests if both the variables contain the same reference,
the first print statement prints false. In other words, the variables string1
and string2
hold two different references. However, they hold logically equal
objects because they both contain the same text. To prove that they contain the
same text, we use the ==
operator, which invokes the equals()
function. This
is why the second print statement prints true
.
There are many other constructors implemented in the String
class. We encourage
you to learn more about these constructors from the Zen API documentation.
+++ title = “Determining the Size of Strings” weight = 3 +++
You can determine the size of a string using the immutable size
property.
The size of a string is basically the number of code points it contains.
Here is an example which prints the size of a string.
// StringSize.zen
function main(...arguments)
var name = 'Shazam'
printf('The size of "%s" is %d.\n', name.size)
The example above produces the following output:
The size of "Shazam" is 6.
The size of a string is an important value when you want to manipulate strings.
+++ title = “Comparing Strings” weight = 4 +++
String Equality
You can compare two strings for equality using the equality operator (==
).
The behavior of the equality operator is overridden in the equals()
function,
which means you can use it to compare two strings for equality, too.
The declaration of the equals
function is shown here.
function equals(other)
It returns true
if both the strings contain the same characters in the same
order; false
otherwise. You need to remember that this function is case sensistive.
For example, if you compare 'Batman'
and 'batman'
for equality, it would
return false
.
Here is an example which demonstrates how you can compare two strings for equality.
// StringEquality.zen
function main(...arguments)
var string1 = 'Joker'
var string2 = 'Harley Quinn'
if string1.equals(string2)
printf('"%s" is equal to "%s"!\n', string1, string2)
else
printf('"%s" is not equal to "%s"!\n', string1, string2)
You can compare two strings for equality without considering case. For this, you need to invoke the overloaded version of the equals() function.
function equals(other, ignoreCase)
It returns true
if both the strings contain the same characters in the same
order; false
otherwise. You need to remember that this function is case
insensistive. For example, if you compare 'Camel'
and 'camel'
for equality,
it would return true
.
Here is an example which demonstrates how you can compare two strings for equality.
// StringEquality2.zen
function main(...arguments)
var string1 = 'Poison Ivy'
var string2 = 'poison ivy'
if string1.equals(string2, true)
printf('"%s" is equal to "%s"!\n', string1, string2)
else
printf('"%s" is not equal to "%s"!\n', string1, string2)
Comparing String Regions
You can compare a region inside your string with the region of another string.
You need to invoke the regionMatches()
function for this.
Here is the declaration of the regionMatches()
function.
function regionMatches(startIndex, other, otherStartIndex, count)
The startIndex
specifies the index at which the region begins for the string
object against which the function is invoked.
other
represents the other string to compare. The otherIndex
indicates the
index at which the region begins for the other string.
The count
parameter indicates the number of characters in both the regions.
You need to carefully specify this parameter.
You can force regionMatches()
function to ignore case. For this invoke the
overloaded version of the regionMatches()
.
function regionMatches(startIndex, other, otherStartIndex, count, ignoreCase)
The ignoreCase
boolean parameter indicates whether the function should ignore
case when comparing, or not.
Prefixes and Suffixes
You can determine if a string starts with a substring using the startsWith()
function. It returns true
if a string starts with the specified substring;
false
otherwise.
Here is the declaration of the startsWith()
function.
function startsWith(sequence)
Further, you can specify from which index startsWith()
begins comparing.
An overloaded version of the startsWith()
function accepts the starting index.
Here is the declaration of this function.
function startsWith(sequence, startIndex)
Here, the startIndex
specifies the index applies to the string object against
which the function is invoked. It indicates the index from where the comparison is
started.
Similarly, you can determine if a string ends with a substring using the endsWith()
function. It returns false
if a string ends with the specified substring; false
otherwise.
Here is the declaration of the endsWith()
function.
function endsWith(sequence)
Here is an example which demonstrates these functions.
function main(...arguments)
var superhero = 'Swamp Thing'
var r1 = superhero.startsWith('Swam')
var r2 = superhero.endsWith('ing')
var r3 = superhero.startsWith('Thing', 5)
printf('"%s" starts with "Swam": %z\n', superhero, r1)
printf('"%s" ends with "ing": %z\n', superhero, r2)
printf('"%s" starts with "Thing" after 5 characters: %z\n', superhero, r3)
This example produces the following output.
"Swamp Thing" starts with "Swam": true
"Swamp Thing" ends with "ing": true
"Swamp Thing" starts with "Thing" after 5 characters: true
Comparing Strings
Imagine you are sorting an array of strings. How can you determine whether one
string is greater than the other? So far you have learnt about functions that only
help you determine whether they are identical. In order to compare strings
for relativity, you need to use the compare()
function.
The compare()
function compares two strings lexicographically, which means
a A is lesser B, B is lesser than C and so on. For example, A is lesser than Z
and Y is greater than M. In this logic, lowercase letters are always greater
than uppercase letters.
The declaration of the compare()
function is shown here.
function compare(other)
-
It returns a negative number if the current string object is lesser than the specified string.
-
It returns a positive number if the current string object is greater than the specified string.
-
It returns zero when both the strings are equal.
Here is an example, which determines if a string is greater than the other.
// YatchVsDragon.zen
function main(...arguments)
var yatch = 'Yatch'
var dragon = 'Dragon'
if yatch.compare(dragon) > 0
print('Yatch is greater than Dragon.')
else if yatch.compare(dragon) < 0
print('Yatch is lesser than Dragon.')
else
print('Yatch is equal to Dragon.')
This program generate the following output.
Yatch is greater than Dragon.
As mentioned previously, the compare()
function is case sensitive. For case
insensistive comparison you need to invoke the overloaded version of the compare()
function which accepts a Boolean flag indicating whether the function has to
ignore case or not.
The declaration of the overloaded version of the compare()
function is shown
below.
function compare(other, ignoreCase)
+++ title = ‘Extracting Characters From Strings’ weight = 5 +++
You can extract characters from strings in a number of ways. We discuss about each technique in this section.
Internally, strings store characters in an array. The String
class overrides
the subscript operator, which allows you to access characters in a string as if
it were an array. Many functions in the String
class require indexes to operate.
Just like array, string indexes are zero based. So it would help you to understand
strings better if you think of them as arrays.
You can extract a single character from a string using the getCharacter()
function.
It accepts an index of the character you want to retrieve. You cannot specify
negative values. It throws a InvalidStringIndexException
if the index is invalid.
Here is the declaration of the getCharacter()
function.
@Operator symbol='[]'
function getCharacter(index)
Here is an example.
// AccessingCharacter.zen
function main(...arguments)
var color = 'Neon'
var c = color.getCharacter(2)
print(c)
This snipet prints the following output.
o
You can retrieve a range of characters at a time using the getCharacters()
function.
Here is the declaration of the getCharacters()
function.
function getCharacters(startIndex, stopIndex, destination, destinationIndex)
Here, the startIndex
specifies the index from which the copying begins. It is
inclusive.
The stopIndex
specifies the index at which the copying ends. This index is exclusive,
which means the character at this index is not copied.
The characters are copied to the destination
array you specify. The characters
are copied into the destination array beginning at the index specified by
destinationIndex
. You need to make sure that the destination array is large
enough to store the range of characters.
Here is an example which copies a range of characters.
// CharacterExtraction.zen
function main(...arguments)
var message = 'How you doin?'
var startIndex = 4
var stopIndex = 7
var destination = new Array(Character, 3)
message.getCharacters(startIndex, stopIndex, destination, 0)
var message2 = new String(destination)
print(message2)
This example prints the following.
you
If you want to convert all the characters in your string into a character array,
invoke the toCharacterArray()
function. It returns an array of characters from
your string.
The declaration of the toCharacterArray()
function is shown here.
function toCharacterArray()
You can use the getCharacters()
function to perform the same operation, except
getCharacters()
makes you write more code.
Extracting Substrings
The substring()
function allows you to extract a region of your string, usually
referred as substring.
The declaration of this function is shown here.
function substring(index)
Here, the index indicates the index from which the substring is extracted. The substring begins from the specified index and extends to the end of the string.
Here is an example of the substring()
function.
// SubstringExtraction.zen
function main(...arguments)
var lastName = 'Nikola Tesla'.substring(7)
printf('%s was one of the greatest inventors to ever live!\n', lastName)
This example prints the following output.
Tesla was one of the greatest inventors to ever live!
An overloaded version of the substring()
function allows you to extract a
range of characters from the string.
Here is the declaration of the substring()
function.
function substring(startIndex, stopIndex)
Here the startIndex
specifies the start index of the substring and the
stopIndex
specifies the end index of the substring. The startIndex
is
inclusive, whereas, the stopIndex
is exclusive.
Here is an example of the overloaded version of the substring()
function.
var substring = 'Nikola Tesla'.substring(2, 6)
println(substring)
This example prints the following output.
kola