 Learning Geospatial Analysis with Python

Unleash the power of Python 3 with practical techniques for learning GIS and remote sensing

Product type Paperback
Published in Nov 2023
Publisher Packt
ISBN-13 9781837639175
Length 432 pages
Edition 4th Edition
Author: Joel Lawhead

Understanding spatial indexing

Geospatial datasets are often very large files, easily reaching hundreds of megabytes or even several gigabytes in size. Geospatial software can be quite slow in trying to repeatedly access large files when performing analysis.

As discussed briefly in Chapter 1, Learning about Geospatial Analysis with Python, spatial indexing creates a guide, which allows the software to quickly locate query results without examining every single feature in the dataset. Spatial indexes allow the software to eliminate possibilities and perform more detailed searches or comparisons on a much smaller subset of the data.

Spatial indexing algorithms

Many spatial indexing algorithms are derivatives of well-established algorithms that have been used on non-spatial information for decades. The two most common spatial indexes are the quadtree and the R-tree. The quadtree has a three-dimensional counterpart called the octree, which is most commonly seen with point cloud data, specifically lidar data.

Quadtree index

The term quadtree actually covers a family of related algorithms based on a common theme. Each internal node in a quadtree index has exactly four children, which are typically square or rectangular in shape. When a node already holds a specified maximum number of features and another feature is added, the node splits into four child nodes.

Dividing a space into nested squares speeds up spatial searches. The software only has to handle five points at a time and can use simple greater-than/less-than comparisons to check whether a point is inside a node. Quadtree indexes are most commonly found in file-based index formats.
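
The splitting behavior described above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used by any particular GIS package: the QuadTree class, its capacity of five points per node, and the max_depth guard against endless splitting on duplicate points are all assumptions made for this example.

```python
class QuadTree:
    """Minimal point quadtree (illustrative sketch, hypothetical class)."""

    def __init__(self, xmin, ymin, xmax, ymax, capacity=5, depth=0, max_depth=8):
        self.bounds = (xmin, ymin, xmax, ymax)
        self.capacity = capacity    # assumed maximum number of points per leaf
        self.depth = depth
        self.max_depth = max_depth  # assumed guard so duplicates cannot split forever
        self.points = []
        self.children = None        # None for a leaf, otherwise four child nodes

    def contains(self, x, y):
        # Simple greater-than/less-than checks; half-open so a point on a
        # shared edge belongs to exactly one child.
        xmin, ymin, xmax, ymax = self.bounds
        return xmin <= x < xmax and ymin <= y < ymax

    def insert(self, x, y):
        if not self.contains(x, y):
            return False
        if self.children is not None:
            return self._insert_into_child(x, y)
        self.points.append((x, y))
        if len(self.points) > self.capacity and self.depth < self.max_depth:
            self._split()
        return True

    def _insert_into_child(self, x, y):
        return any(child.insert(x, y) for child in self.children)

    def _split(self):
        # Divide this node into four equal quadrants and push its points down.
        xmin, ymin, xmax, ymax = self.bounds
        xmid, ymid = (xmin + xmax) / 2, (ymin + ymax) / 2
        args = (self.capacity, self.depth + 1, self.max_depth)
        self.children = [
            QuadTree(xmin, ymin, xmid, ymid, *args),
            QuadTree(xmid, ymin, xmax, ymid, *args),
            QuadTree(xmin, ymid, xmid, ymax, *args),
            QuadTree(xmid, ymid, xmax, ymax, *args),
        ]
        points, self.points = self.points, []
        for px, py in points:
            self._insert_into_child(px, py)

    def query(self, qxmin, qymin, qxmax, qymax):
        """Return the points inside the query box, skipping disjoint nodes."""
        xmin, ymin, xmax, ymax = self.bounds
        if qxmax < xmin or qxmin >= xmax or qymax < ymin or qymin >= ymax:
            return []  # the whole branch is eliminated without visiting its points
        if self.children is None:
            return [(x, y) for x, y in self.points
                    if qxmin <= x <= qxmax and qymin <= y <= qymax]
        found = []
        for child in self.children:
            found.extend(child.query(qxmin, qymin, qxmax, qymax))
        return found


tree = QuadTree(0, 0, 100, 100)
for point in [(10, 10), (20, 20), (30, 30), (40, 40),
              (60, 60), (70, 70), (80, 20), (20, 80)]:
    tree.insert(*point)
print(tree.query(0, 0, 50, 50))
```

A query only descends into quadrants whose bounds intersect the search box, so most of the dataset is never examined.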

The following diagram shows a point dataset sorted by a quadtree algorithm. The black points are the actual dataset, while the boxes are the bounding boxes of the index. Note that none of the bounding boxes overlap. The diagram on the left shows the spatial representation of the index, while the diagram on the right shows the hierarchical relationship of a typical index, which is how spatial software sees the index and data.

This structure allows a spatial search algorithm to quickly eliminate possibilities when trying to locate one or more points in relation to some other set of features, as shown in the following diagram:

Figure 2.2 – Visual representation of a quadtree index spatially and organizationally

Now that we understand quadtree indexes, let’s look at another common type of spatial index, the R-tree.

R-tree indexes

R-tree indexes are more sophisticated than quadtrees. R-trees can handle multidimensional data, including 3D data, and are optimized to store the index in a way that is compatible with the way databases use disk space and memory. Nearby objects are grouped together using one of several spatial grouping algorithms. All objects in a group are bounded by a minimum bounding rectangle. These rectangles are aggregated into hierarchical nodes that are balanced at each level.

Unlike a quadtree, the bounding boxes of an R-tree may overlap across nodes. Due to their relative complexity and database-oriented structure, R-trees are most commonly found in spatial databases, as opposed to file-based formats.
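
The grouping and bounding idea can be illustrated with a deliberately simplified, single-level sketch. It is not a full R-tree: there is no hierarchy, balancing, or node splitting, and the grouping rule (sort by coordinate, then cut into fixed-size chunks) is an assumption chosen only to keep the example short. The function names mbr, intersects, build_leaves, and rtree_like_search are hypothetical.

```python
def mbr(points):
    """Minimum bounding rectangle of a group as (xmin, ymin, xmax, ymax)."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (min(xs), min(ys), max(xs), max(ys))


def intersects(a, b):
    """True if rectangles a and b share any area or edge."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def build_leaves(points, group_size=3):
    """Group nearby points and store each group with its bounding rectangle.

    Assumed grouping rule: sort the points, then cut them into chunks.
    """
    ordered = sorted(points)
    groups = [ordered[i:i + group_size]
              for i in range(0, len(ordered), group_size)]
    return [(mbr(group), group) for group in groups]


def rtree_like_search(leaves, box):
    """Test each group's rectangle first; inspect points only on a hit."""
    hits = []
    for rect, group in leaves:
        if not intersects(rect, box):
            continue  # the whole group is skipped with one rectangle test
        hits.extend(p for p in group
                    if box[0] <= p[0] <= box[2] and box[1] <= p[1] <= box[3])
    return hits


leaves = build_leaves([(1, 1), (2, 8), (3, 2), (6, 6), (7, 1), (8, 9)])
print(rtree_like_search(leaves, (0, 0, 4, 4)))
```

In a real R-tree, the rectangles would themselves be grouped into parent rectangles, so that a search descends a balanced tree rather than scanning a flat list of groups.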

The following diagram from https://en.wikipedia.org/wiki/File:R-tree.svg shows a balanced R-tree for a 2D point dataset:

Figure 2.3 – Visual representation of an R-tree index spatially and organizationally

Indexes break up large datasets, but to speed up searching, they may employ a technique called grids. We’ll look at that next.

Grids

Spatial indexes also often employ the concept of an integer grid. Geospatial coordinates are usually floating-point decimal numbers with anywhere from 2 to 16 decimal places.

Performing comparisons on floating-point numbers is far more computationally expensive than working with integers. Indexed searching is about first eliminating as many possibilities as it can with checks that don’t require full precision.

Most spatial indexing algorithms, therefore, map floating-point coordinates to a fixed-size integer grid. When searching for a particular feature, the software can use more efficient integer comparisons rather than working with floating-point numbers. Once the results are narrowed down, the software can access the full-resolution data.
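
As a rough sketch of this mapping, the following example snaps longitude/latitude pairs to a 256 x 256 grid, filters candidates with integer comparisons, and only then checks the full-precision coordinates. The function names to_grid, build_grid_index, and grid_search are hypothetical, the world extent is an assumed example, and the code assumes all points fall inside that extent.

```python
from collections import defaultdict

WORLD = (-180.0, -90.0, 180.0, 90.0)  # assumed extent: xmin, ymin, xmax, ymax


def to_grid(x, y, extent=WORLD, size=256):
    """Map a floating-point coordinate to an integer (column, row) cell."""
    xmin, ymin, xmax, ymax = extent
    col = int((x - xmin) / (xmax - xmin) * size)
    row = int((y - ymin) / (ymax - ymin) * size)
    # Clamp so that points on the maximum edge land in the last cell.
    return min(col, size - 1), min(row, size - 1)


def build_grid_index(points, extent=WORLD, size=256):
    """Bucket the full-precision points by their integer grid cell."""
    index = defaultdict(list)
    for point in points:
        index[to_grid(point[0], point[1], extent, size)].append(point)
    return index


def grid_search(index, box, extent=WORLD, size=256):
    """Coarse integer filter first, full-precision check second."""
    c0, r0 = to_grid(box[0], box[1], extent, size)
    c1, r1 = to_grid(box[2], box[3], extent, size)
    hits = []
    for (col, row), cell_points in index.items():
        if c0 <= col <= c1 and r0 <= row <= r1:  # integer comparisons only
            hits.extend(p for p in cell_points
                        if box[0] <= p[0] <= box[2] and box[1] <= p[1] <= box[3])
    return hits


cities = [(-90.07, 29.95), (2.35, 48.86), (139.69, 35.68)]
index = build_grid_index(cities)
print(grid_search(index, (-100.0, 20.0, -80.0, 40.0)))
```

A production index would look up the candidate cells directly instead of scanning every occupied cell, but the two-stage filter is the same.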

Grid sizes can be as small as 256 x 256 for simple file formats, or can be as large as 3 million x 3 million in large geospatial databases designed to incorporate every known coordinate system and possible resolution.

The integer mapping technique is very similar to the rendering technique that is used to plot data on a graphics canvas in mapping programs. The SimpleGIS script in Chapter 1, Learning about Geospatial Analysis with Python, also uses this technique to render points and polygons using the built-in Python turtle graphics engine.

Spatial indexing is used to speed up the searching and displaying of vector data. There’s a different technique to speed up the rendering of raster data. Let’s take a look at that method.
