Exploring data
By default, a newly deployed workspace comes with a managed Hive metastore. A metastore allows you to register datasets in various formats, such as Comma-Separated Values (CSV), Parquet, Delta, text, or JavaScript Object Notation (JSON), as external tables (https://docs.databricks.com/data/metastores/index.html). We will not go into much detail about the metastore here:
Figure 2.10 – The Data tab
It’s all right if you are not familiar with the term metastore. In simple terms, it is similar to a relational database, where data is organized into databases, tables, and schemas, and the end user interacts with that data using SQL. Similarly, in Databricks, end users can register datasets stored in cloud storage so that they’re available as tables and can be queried by name. You can learn more here: https://docs.databricks.com/spark/latest/spark-sql/language-manual/index...
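To make the idea of registering a dataset as a table more concrete, here is a minimal sketch in Python. It only builds the Spark SQL statement as a string, so it runs anywhere; the helper name `external_table_ddl`, the database `demo_db`, the table `trips`, and the path `/mnt/demo/trips` are all hypothetical and chosen for illustration, not taken from the workspace described here.

```python
# Formats mentioned in the text above; Spark supports others as well.
SUPPORTED_FORMATS = {"csv", "parquet", "delta", "text", "json"}


def external_table_ddl(database: str, table: str, fmt: str, location: str) -> str:
    """Build a Spark SQL statement that registers the files at `location`
    as a table in the metastore (hypothetical helper for illustration)."""
    if fmt.lower() not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported format: {fmt}")
    return (
        f"CREATE TABLE IF NOT EXISTS {database}.{table} "
        f"USING {fmt.upper()} "
        f"LOCATION '{location}'"
    )


# Hypothetical database, table, and storage path.
ddl = external_table_ddl("demo_db", "trips", "parquet", "/mnt/demo/trips")
print(ddl)

# In a Databricks notebook, where a `spark` session is predefined,
# the statement could then be executed and the table queried by name:
#   spark.sql("CREATE DATABASE IF NOT EXISTS demo_db")
#   spark.sql(ddl)
#   spark.sql("SELECT * FROM demo_db.trips LIMIT 10").show()
```

Once a statement like this has been executed, the files stay where they are in cloud storage; the metastore only records the table name, format, and location so that the data can be addressed with SQL.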