Clustered Columnstore Indexes
Clustered Columnstore indexes can make a dramatic difference and are the technology used to optimize real-time analytics. They can achieve an order of magnitude performance gain over a normal row table, a dramatic compression of the data, and minimize interference with real-time transaction processing.
A columnstore has rows and columns, but the data is stored in a column format.
A rowgroup is a set of rows that are compressed into a columnstore format — a maximum of a million rows (1,048,576).
There are an optimum number of rows in a rowgroup that are stored column-wise, and this represents a trade-off between large overhead, if there are too few rows, and an inability to perform in-memory operations if the rows are too big.
Each row consists of column segments, each of which represents a column from the compressed row.
Columnstore is illustrated in Figure 1.1, showing how to load data into a non-clustered columnstore index:
Figure 1.1: Loading data into a non-clustered columnstore index
A clustered columnstore index is how the columnstore table segments are stored in physical media. For performance reasons, and to avoid fragmenting the data, the columnstore index may store some data in a deltastore and a list of the IDs of deleted rows. All deltastore operations are handled by the system and not visible directly to the user. Deltastore and columnstore data is combined when queried.
A delta rowgroup is used to store columnstore indexes until there are enough to store in the columnstore. Once the maximum number of rows is reached, the delta rowgroup is closed, and a background process detects, compresses, and writes the delta rowgroup into the columnstore.
There may be more than one delta rowgroup. All delta rowgroups are described as the deltastore. While loading data, anything less than 102,400 rows will be kept in the deltastore until they group to the maximum size and are written to the columnstore.
Batch mode execution is used during a query to process multiple rows at once.
Loading a clustered columnstore index and the deltastore are shown in Figure 1.2.
Figure 1.2: Loading a clustered columnstore index
Further information can be found here: https://docs.microsoft.com/en-us/sql/relational-databases/indexes/get-started-with-columnstore-for-real-time-operational-analytics?view=sql-server-2017.
Adding Clustered Columnstore Indexes to memory-optimized tables
When using a memory-optimized table, add a non-clustered columnstore index. A clustered columnstore index is especially useful for running analytics on a transactional table.
A clustered columnstore index can be added to an existing memory-optimized table, as shown in the following code snippet:
-- Add a clustered columnstore index to a memory-optimized table ALTER TABLE MyMemOpttable ADD INDEX MyMemOpt_ColIndex clustered columnstore