Applications with a graphical user interface (and games surely fall into this category) interact with users by displaying text and by accepting textual input. We have already scratched the surface of this topic in the previous chapters using the QString class. Now, we will go into further detail.
String encodings
The C++ language does not specify the encoding of strings. Thus, any char* array and any std::string object can use an arbitrary encoding. When using these types for interaction with native APIs and third-party libraries, you have to refer to their documentation to find out which encoding they use. The encoding used by native APIs of the operating system usually depends on the current locale. Third-party libraries often use the same encoding as native APIs, but some libraries may expect another encoding, for example, UTF-8.
A string literal (that is, any bare text you wrap in quotation marks) uses an implementation-defined encoding. Since C++11, you have the option to specify the encoding your text will have:
u8"text" will produce a UTF-8 encoded const char[] array (const char8_t[] since C++20)
u"text" will produce a UTF-16 encoded const char16_t[] array
U"text" will produce a UTF-32 encoded const char32_t[] array
Unfortunately, the encoding used for interpreting the source files is still implementation-defined, so it's not safe to put non-ASCII symbols in string literals. You should use escape sequences (such as \unnnn) to write such literals.
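As a rough illustration of the three prefixes, the following sketch counts the code units that each encoding needs for the same escaped character. The codeUnits() helper is a hypothetical name introduced only for this example; everything else is standard C++11.

```cpp
#include <cstddef>

// Hypothetical helper: counts the code units in a string literal,
// excluding the terminating null.
template <typename CharT, std::size_t N>
constexpr std::size_t codeUnits(const CharT (&)[N]) {
    return N - 1;
}

// U+00E9 (e with an acute accent), written as an escape sequence.
static_assert(codeUnits(u8"\u00e9") == 2, "two bytes in UTF-8");
static_assert(codeUnits(u"\u00e9") == 1, "one 16-bit unit in UTF-16");
static_assert(codeUnits(U"\u00e9") == 1, "one 32-bit unit in UTF-32");

// U+1F600 lies outside the Basic Multilingual Plane.
static_assert(codeUnits(u8"\U0001F600") == 4, "four bytes in UTF-8");
static_assert(codeUnits(u"\U0001F600") == 2, "a surrogate pair in UTF-16");
static_assert(codeUnits(U"\U0001F600") == 1, "one unit in UTF-32");
```

Because the characters are written as escape sequences, the result does not depend on how the compiler interprets the encoding of the source file.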
Text in Qt is stored using the QString class, which uses Unicode internally. Unicode allows us to represent characters in almost all languages spoken in the world and is the de facto standard for native encoding of text in most modern operating systems. There are multiple Unicode-based encodings. The in-memory representation of a QString is essentially UTF-16: an array of 16-bit values where each Unicode character is represented by either one or two values.
When constructing a QString from a char array or an std::string object, it's important to use a proper conversion method that depends on the initial encoding of the text. By default, QString assumes UTF-8 encoding of the input text. UTF-8 is compatible with ASCII, so passing UTF-8 or ASCII-only text to QString(const char *str) is correct. QString provides a number of static methods to convert from other encodings, such as QString::fromLatin1() or QString::fromUtf16(). The QString::fromLocal8Bit() method assumes the encoding corresponding to the system locale.
If you have to combine both QString and std::string in one program, QString offers you the toStdString() and fromStdString() methods to perform a conversion. These methods also assume UTF-8 encoding of std::string, so you can't use them if your strings are in another encoding.
The default representation of string literals (for example, "text") is not UTF-16, so each time you convert one to a QString, an allocation and a conversion happen. This overhead can be avoided using the QStringLiteral macro:
QString str = QStringLiteral("I'm writing my games using Qt");
QStringLiteral does two things:
- It adds a u prefix to your string literal to ensure that it will be encoded in UTF-16 at compile time
- It cheaply creates a QString and instructs it to use the literal without performing any allocation or encoding conversion
It's a good habit to wrap all your string literals (except the ones that need to be translated) into QStringLiteral, but it is not required, so don't worry if you forget to do that.
QByteArray and QString
QString always contains UTF-16 encoded strings, but what if you have data in an encoding that is not yet known? And what if the data is not even text? In these cases, Qt uses the QByteArray class. When you read data directly from a file or receive it from a network socket, Qt returns the data as a QByteArray, indicating that this is an arbitrary array of bytes without any information about the encoding:
QFile file("/path/to/file");
file.open(QFile::ReadOnly);
QByteArray array = file.readAll();
The closest equivalent of QByteArray in the standard library would be std::vector<char>. As the name implies, this is just an array of bytes with some helpful methods. In the preceding example, if you know that the file you read is in UTF-8, you can convert the data to a string, as follows:
QString text = QString::fromUtf8(array);
If you have no idea what encoding the file uses, assuming the system encoding is usually the best guess, so QString::fromLocal8Bit() is a reasonable fallback. Similarly, when writing to a file, you need to convert the string to a byte array before passing it to the write() function:
QString text = "new file content\n";
QFile file("/path/to/file");
file.open(QFile::WriteOnly);
QByteArray array = text.toUtf8();
file.write(array);
Basic string operations
The most basic tasks that involve text strings are the ones where you add or remove characters from the string, concatenate strings, and access the string's content. In this regard, QString offers an interface that is compatible with std::string, but it also goes beyond that, exposing many more useful methods.
Adding data at the beginning or at the end of the string can be done using the prepend() and append() methods. Inserting data in the middle of a string can be done with the insert() method, which takes the position of the character where we need to start inserting as its first argument and the actual text as its second argument. All these methods have a couple of overloads that accept different objects that can hold textual data, including the classic const char* array.
Removing characters from a string is similar. The basic way to do this is to use the remove() method, which accepts the position at which to start deleting and the number of characters to delete:
QString str = QStringLiteral("abcdefghij");
str.remove(2, 4); // str = "abghij"
There is also a remove() overload that accepts another string. When called, all its occurrences are removed from the original string. This overload has an optional argument that states whether comparison should be done in the default case-sensitive (Qt::CaseSensitive) or case-insensitive (Qt::CaseInsensitive) way:
QString str = QStringLiteral("Abracadabra");
str.remove(QStringLiteral("ab"), Qt::CaseInsensitive);
// str = "racadra"
To concatenate strings, you can either simply add two strings together, or you can append one string to the other:
QString str1 = QStringLiteral("abc");
QString str2 = QStringLiteral("def");
QString str1_2 = str1 + str2;
QString str2_1 = str2;
str2_1.append(str1);
Accessing strings can be divided into two use cases. The first is when you wish to extract a part of the string. For this, you can use one of three methods: left() and right() return the given number of characters from the beginning or end of the string, and mid() extracts a substring of a specified length, starting from a given position in the string:
QString original = QStringLiteral("abcdefghij");
QString l = original.left(3); // "abc"
QString r = original.right(2); // "ij"
QString m = original.mid(2, 5); // "cdefg"
The second use case is when you wish to access a single character of the string. The index operator works with QString in a similar fashion as with std::string, returning a copy of or a non-const reference to the given character, which is represented by the QChar class. The at() method provides read-only access, as shown in the following code:
QString str = "foo";
QChar f = str[0]; // read: f holds a copy of the first character, 'f'
str[0] = 'g'; // write through the non-const index operator, str == "goo"
QChar g = str.at(0); // at() is read-only and returns a copy, 'g'
String search and lookup
The second group of functionalities is related to searching within the string. You can use methods such as startsWith(), endsWith(), and contains() to search for substrings at the beginning, at the end, or in an arbitrary place in the string. The number of occurrences of a substring in the string can be retrieved using the count() method.
If you need to know the exact position of the match, you can use indexOf() or lastIndexOf() to receive the position in the string where the match occurs. The first call searches forward, and the other one searches backward. Each of these calls takes two optional parameters: the first one is the position in the string where the search begins, and the second one determines whether the search is case-sensitive (similar to how remove() works). The start position lets you find all the occurrences of a given substring:
QString str = QStringLiteral("Orangutans like bananas.");
int pos = str.indexOf("an");
while (pos != -1) {
    qDebug() << "'an' found at position" << pos;
    pos = str.indexOf("an", pos + 1); // continue right after the last match
}
Dissecting strings
There is one more group of useful string functionalities that makes QString different from std::string: cutting strings into smaller parts and building larger strings from smaller pieces.
Very often, a string contains substrings that are glued together by a repeating separator (for example, "1,4,8,15"). While you can extract each field from the record using functions that you already know (for example, indexOf()), an easier way exists. QString contains a split() method that takes the separator string as its parameter and returns a list of strings, which is represented in Qt by the QStringList class. Then, dissecting the record into separate fields is as easy as calling the following code:
QString record = "1,4,8,15,16,24,42";
QStringList items = record.split(",");
for(const QString& item: items) {
qDebug() << item;
}
The inverse of this method is the join() method present in the QStringList class, which returns all the items in the list as a single string merged with a given separator:
QStringList fields = { "1", "4", "8", "15", "16", "24", "42" };
QString record = fields.join(",");
Converting between numbers and strings
QString also provides some methods for convenient conversion between textual and numerical values. Methods such as toInt(), toDouble(), or toLongLong() make it easy to extract numerical values from strings. All such methods take an optional bool *ok parameter. If you pass a pointer to a bool variable as this parameter, the variable will be set to true or false, depending on whether the conversion was successful or not. Methods returning integers also take a second optional parameter that specifies the numerical base (for example, binary, octal, decimal, or hexadecimal) of the value:
bool ok;
int v1 = QString("42").toInt(&ok, 10);
// v1 = 42, ok = true
long long v2 = QString("0xFFFFFF").toLongLong(&ok, 16);
// v2 = 16777215, ok = true
double v3 = QString("not really a number").toDouble(&ok);
//v3 = 0.0, ok = false
A static method called number() performs the conversion in the other direction: it takes a numerical value and number base and returns the textual representation of the value:
QString txt = QString::number(42); // txt = "42"
This function has some optional arguments that allow you to control the string representation of the number. For integers, you can specify the numerical base. For doubles, you can choose the scientific format 'e' or the conventional format 'f' and specify the number of digits after the decimal delimiter:
QString s1 = QString::number(42, 16); // "2a"
QString s2 = QString::number(42.0, 'f', 6); // "42.000000"
QString s3 = QString::number(42.0, 'e', 6); // "4.200000e+01"
The non-static setNum() method performs the same conversion in place on an existing string:
QString str;
str.setNum(1234); // str == "1234"
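Other useful functions
The section() method treats the string as a sequence of fields separated by a given separator and returns the fields from a start position to an end position. Negative positions count from the end of the string:
QString str;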
QString csv = "forename,middlename,surname,phone";
QString path = "/usr/local/bin/myapp"; // First field is empty
QString::SectionFlag flag = QString::SectionSkipEmpty;
str = csv.section(',', 2, 2); // str == "surname"
str = path.section('/', 3, 4); // str == "bin/myapp"
str = path.section('/', 3, 3, flag); // str == "myapp"
str = csv.section(',', -3, -2); // str == "middlename,surname"
str = path.section('/', -1); // str == "myapp"
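The simplified() method strips whitespace from both ends of the string and replaces each internal run of whitespace with a single space:
QString str = " lots\t of\nwhitespace\r\n ";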
str = str.simplified();
// str == "lots of whitespace";
QString also provides STL-style iterator functions such as begin() and end().
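The replace() method can substitute a range given by position and length, every match of a regular expression, or every occurrence of a substring:
QString x = "Say yes!";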
QString y = "no";
x.replace(4, 3, y);
// x == "Say no!"
QString s = "Banana";
s.replace(QRegularExpression("a[mn]"), "ox");
// s == "Boxoxa"
QString str = "colour behaviour flavour neighbour";
str.replace(QString("ou"), QString("o"));
// str == "color behavior flavor neighbor"
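The chop() method removes a given number of characters from the end of the string, while truncate() cuts the string off at a given position:
QString str("LOGOUT\r\n");
str.chop(2); // remove 2 characters from the end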
// str == "LOGOUT"
QString str = "Vladivostok";
str.truncate(4);
// str == "Vlad"
Using arguments in strings
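The arg() method replaces the lowest-numbered place marker (%1, %2, and so on) in the string with its argument. An optional field width pads the inserted value:
const int fieldWidth = 4;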
qDebug() << QStringLiteral("%1 | %2").arg(5, fieldWidth).arg(6, fieldWidth);
qDebug() << QStringLiteral("%1 | %2").arg(15, fieldWidth).arg(16, fieldWidth);
// output:
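// "   5 |    6"
// "  15 |   16"
Because the markers are numbered, a translator can reorder them, which makes arg() a natural way to build translatable messages:
QString str = tr("Copying file %1 of %2").arg(current).arg(total);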
Regular expressions
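Regular expressions are handled by the QRegularExpression class. The following pattern describes a weight: a number of up to three digits without a leading zero, followed by optional whitespace and a unit:
QRegularExpression regex("[1-9]\\d{0,2}\\s*(mg|g|kg)");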
regex.setPatternOptions(QRegularExpression::CaseInsensitiveOption);
qDebug() << regex.match("100 kg").hasMatch(); // true
qDebug() << regex.match("I don't know").hasMatch(); // false
The pattern options can also be passed directly to the constructor:
QRegularExpression regex("[1-9]\\d{0,2}\\s*(mg|g|kg)",
                         QRegularExpression::CaseInsensitiveOption);
When we need to test an input, all we have to do is call match(), passing the string we would like to check. In return, we get an object of the QRegularExpressionMatch type that contains all the information we need, and not only for checking validity. With QRegularExpressionMatch::hasMatch(), we can then determine whether the input matches our criteria: it returns true if the pattern was found and false otherwise.
After we have checked that the input is well formed, we have to extract the actual weight from the string. To be able to compare different inputs easily, we also need to transform all values to a common reference unit. In this case, it should be the milligram, the smallest unit. So, let's see what QRegularExpressionMatch can offer us for this task.
With capturedTexts(), we get a string list of the pattern's captured groups. For an input such as "23kg", this list will contain "23kg" and "kg". The first element is always the string that was fully matched by the pattern. The next elements are the substrings captured by the parentheses in the pattern. Since we are missing the actual number, we have to alter the pattern's beginning to ([1-9]\d{0,2}). Now, the list's second element is the number, and the third element is the unit. Thus, we can write the following:
int getWeight(const QString &input) {
    QRegularExpression regex("\\A([1-9]\\d{0,2})\\s*(mg|g|kg)\\z");
    regex.setPatternOptions(QRegularExpression::CaseInsensitiveOption);
    QRegularExpressionMatch match = regex.match(input);
    if (match.hasMatch()) {
        const QString number = match.captured(1);
        int weight = number.toInt();
        const QString unit = match.captured(2).toLower();
        if (unit == "g") {
            weight *= 1000;
        } else if (unit == "kg") {
            weight *= 1000000;
        }
        return weight;
    } else {
        return -1;
    }
}
In the function's first two lines, we set up the pattern and its option. Then, we match it against the passed argument. If QRegularExpressionMatch::hasMatch() returns true, the input is valid and we extract the number and unit. Instead of fetching the entire list of captured text with capturedTexts(), we query specific elements directly by calling QRegularExpressionMatch::captured(). The passed integer argument signifies the element's position inside the list. So, calling captured(1) returns the matched digits as a QString.
Lastly, let’s take a final look at how to find, for example, all numbers inside a string, even those leading with zeros:
QString input = QStringLiteral("123 foo 09 1a 3");
QRegularExpression regex("\\b\\d+\\b");
QRegularExpressionMatchIterator i = regex.globalMatch(input);
while (i.hasNext()) {
QRegularExpressionMatch match = i.next();
qDebug() << match.captured();
}
An important point: regular expressions can be used with most of the QString functions covered so far, including split(), indexOf(), contains(), count(), replace(), remove(), and section().
Lists of strings
The QStringList::filter() method returns the items that contain a given substring or match a regular expression:
QStringList list;
list << "Bill Murray" << "John Doe" << "Bill Clinton";
QStringList result;
result = list.filter("Bill");
// result: ["Bill Murray", "Bill Clinton"]
QStringList also provides indexOf() and lastIndexOf(), which look up an item by its full string or by a regular expression, as well as join(), removeDuplicates(), and replaceInStrings():
QStringList list;
list << "alpha" << "beta" << "gamma" << "epsilon";
list.replaceInStrings("a", "o");
// list == ["olpho", "beto", "gommo", "epsilon"]
QStringList list;
list << "alpha" << "beta" << "gamma" << "epsilon";
list.replaceInStrings(QRegularExpression("^a"), "o");
// list == ["olpha", "beta", "gamma", "epsilon"]
The list can be sorted in place with sort():
void QStringList::sort(Qt::CaseSensitivity cs = Qt::CaseSensitive)
QStringRef
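The QStringRef class provides a thin wrapper around QString substrings: it refers to a part of an existing string without copying it. QStringRef belongs to Qt 5; in Qt 6, its role is taken over by QStringView.
You can obtain a QStringRef from a QString with one of the following functions, or construct one directly with the constructor listed last:
leftRef(int n) const
midRef(int position, int n = -1) const
rightRef(int n) const
splitRef(const QString &sep, Qt::SplitBehavior behavior = Qt::KeepEmptyParts, Qt::CaseSensitivity cs = Qt::CaseSensitive) const
QStringRef(const QString *string, int position, int length)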