Introduction
In Elixir, strings are written between double quotes ("") and are, by default, UTF-8-encoded binaries. Each codepoint in a string is represented by one or more bytes.
Note
A codepoint, in this context, is the numeric value that Unicode assigns to a character. UTF-8 encodes each codepoint as a sequence of one to four bytes.
Elixir's support for strings is excellent. However, remember that under the hood, they are binaries!
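As a minimal sketch of what "strings are binaries" means in practice, the snippet below checks a string with is_binary/1 and compares it with a binary built from raw bytes. The variable name greeting is only illustrative.

```elixir
# A double-quoted string is a binary, so binary functions and syntax work on it.
greeting = "hello"

IO.inspect(is_binary(greeting))
# → true

# A binary built from the same bytes is equal to the string.
IO.inspect(<<104, 101, 108, 108, 111>> == greeting)
# → true

# Appending a null byte makes the binary non-printable,
# so inspecting it shows the raw bytes instead of text.
IO.inspect(greeting <> <<0>>)
# → <<104, 101, 108, 108, 111, 0>>
```

Appending <<0>> is a convenient way to peek at the bytes behind any string in iex.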
Note
Some characters need more than one byte to be represented in UTF-8. Take a look at the following examples:
iex> byte_size "aeiou"
5
iex> byte_size "àéíôù"
10
iex> String.length "aeiou"
5
iex> String.length "àéíôù"
5
Even though both strings have the same length, the number of bytes needed to represent them differs.
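To see where the extra bytes come from, the following sketch splits the accented string into codepoints and measures each one. It assumes the accented letters are precomposed codepoints, which is why the string is written with \u escapes. The variable name word is illustrative.

```elixir
# "àéíôù" written with escapes so that each letter is a single precomposed codepoint.
word = "\u00E0\u00E9\u00ED\u00F4\u00F9"

# The string contains five codepoints...
IO.inspect(String.codepoints(word))
# → ["à", "é", "í", "ô", "ù"]

# ...and each of them takes two bytes in UTF-8, for ten bytes in total.
IO.inspect(Enum.map(String.codepoints(word), &byte_size/1))
# → [2, 2, 2, 2, 2]

# The raw bytes behind the first letter, "à" (codepoint U+00E0).
IO.inspect(:binary.bin_to_list("\u00E0"))
# → [195, 160]
```

The plain vowels in "aeiou" fall in the ASCII range, where UTF-8 uses a single byte per codepoint, so for that string byte_size and String.length agree.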