Fundamental formats
We have already learned about the basics of text data and binary data. In this section, we'll look at these formats in a bit more detail and introduce some additional important data structures.
Text data
Earlier, we mentioned that, in general, text data can be viewed in a text editor. Text files can often be recognized by their file extensions; common ones include .csv (comma separated), .txt (plain text), .sql (SQL database script files), and others. Note that the extension is only a convention and does not guarantee the format of the contents. For example, it's not unusual to receive files with .txt extensions that are in .csv format.
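Because the extension cannot be trusted, it can help to inspect the contents directly. The following is a minimal sketch of one possible heuristic, not a definitive test: it treats text as comma separated when every non-empty row parses to the same number of fields. The function name looks_comma_separated, the min_columns parameter, and the sample strings are hypothetical and chosen only for illustration.

```python
import csv
import io


def looks_comma_separated(text, min_columns=2):
    """Heuristic (hypothetical helper): True when every non-empty row
    parses to the same number of comma-separated fields, and that
    number is at least min_columns."""
    rows = [row for row in csv.reader(io.StringIO(text)) if row]
    if not rows:
        return False
    widths = {len(row) for row in rows}
    return len(widths) == 1 and widths.pop() >= min_columns


# Made-up contents of two files that both carry a .txt extension.
sample_txt = "name,age,city\nAda,36,London\nLin,29,Taipei\n"
plain_txt = "This is a note.\nIt has commas, but not in every line.\n"

print(looks_comma_separated(sample_txt))  # True
print(looks_comma_separated(plain_txt))   # False
```

A check like this can misjudge real files (for example, a single-column CSV or prose that happens to have one comma per line), so it is best used as a first hint rather than a guarantee.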
However, there is an additional complexity that may arise, depending on how the data was created and stored. Text that looks identical on screen may be stored using a different binary representation of each character. These binary representations are called encodings, and in most cases, you will find data encoded in UTF-8 format...
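To make the idea of encodings concrete, here is a small sketch that stores the same text under two encodings and compares the resulting bytes. The example string and the choice of Latin-1 as the second encoding are assumptions made for illustration.

```python
# The same four characters, as a reader would see them.
text = "caf\u00e9"  # "café"; the last character is e with an acute accent

# Encode the text under two different encodings (Latin-1 is an
# illustrative choice, not one named in the text above).
utf8_bytes = text.encode("utf-8")
latin1_bytes = text.encode("latin-1")

print(utf8_bytes)    # b'caf\xc3\xa9'  (the accented letter takes two bytes)
print(latin1_bytes)  # b'caf\xe9'      (the accented letter takes one byte)

# Decoding with the wrong encoding does not always fail; it can
# silently produce garbled text instead.
garbled = utf8_bytes.decode("latin-1")
print(garbled)       # cafÃ©

# In the other direction the bytes are not valid UTF-8, so Python raises.
try:
    latin1_bytes.decode("utf-8")
    decode_failed = False
except UnicodeDecodeError:
    decode_failed = True
print(decode_failed)  # True
```

The plain ASCII letters produce identical bytes under both encodings, which is why an encoding mismatch often goes unnoticed until a file contains accented or non-Latin characters.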