Data ingestion is the act of collecting data for transfer and storage. Data can come from many places, but most sources fall into one of four categories: databases, streams, logs, and files. Among these, databases are the most common. They are typically the main upstream transactional systems that serve as the primary data store for your applications. They come in both relational and non-relational flavors, and there are several techniques for extracting data from them.
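To make one database extraction technique concrete, here is a minimal sketch of incremental extraction using a timestamp watermark. It is only one of several possible approaches and is not prescribed by the text above. The `orders` table, its columns, and the function names `extract_incremental` and `make_demo_db` are hypothetical, and an in-memory SQLite database stands in for an upstream transactional system.

```python
import sqlite3


def extract_incremental(conn, last_watermark):
    """Return rows changed after last_watermark, plus the new watermark.

    Assumes a hypothetical `orders` table whose `updated_at` column holds
    ISO-8601 strings, which sort correctly as plain text.
    """
    rows = conn.execute(
        "SELECT id, amount, updated_at FROM orders "
        "WHERE updated_at > ? ORDER BY updated_at",
        (last_watermark,),
    ).fetchall()
    # Advance the watermark only if new rows were found.
    new_watermark = rows[-1][2] if rows else last_watermark
    return rows, new_watermark


def make_demo_db():
    """Build an in-memory database standing in for a transactional source."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL, updated_at TEXT)"
    )
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [
            (1, 19.99, "2024-01-01T10:00:00"),
            (2, 5.00, "2024-01-02T09:30:00"),
            (3, 42.50, "2024-01-03T14:15:00"),
        ],
    )
    conn.commit()
    return conn


if __name__ == "__main__":
    conn = make_demo_db()
    # Pull only the rows modified since the previous run's watermark.
    rows, watermark = extract_incremental(conn, "2024-01-01T23:59:59")
    print(rows)
    print(watermark)
```

Each run stores the returned watermark and passes it to the next run, so only new or changed rows are read from the source. A full-table extract is simpler but rereads everything each time, which is one reason an incremental approach like this may be preferable for large transactional tables.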
Streams are open-ended sequences of time-series data, such as clickstream data from websites or data from IoT devices, usually published to an API that we host. Logs are generated by applications, services, and operating systems. A data lake is a great place to store all of this data for centralized analysis. Files come from self-hosted filesystems or from third-party data feeds delivered over FTP or APIs. As shown in the following diagram, use the type of data your environment...