Why QGIS?
Raw spatial data from government portals or FOIA releases rarely arrives web-ready. QGIS (formerly Quantum GIS) is a free, open-source desktop GIS application for:
- Cleaning and validating data
- Reprojecting coordinate systems
- Performing spatial analysis
- Exporting optimized web formats
Download: https://qgis.org
Data Import Workflows
Supported Formats
| Format | Extension | Source |
|---|---|---|
| Shapefile | .shp | Census, NOAA |
| GeoJSON | .geojson | Web APIs |
| KML/KMZ | .kml/.kmz | Google Earth |
| CSV (with coords) | .csv | Spreadsheets |
| GeoPackage | .gpkg | Modern standard |
| File Geodatabase | .gdb | ArcGIS exports |
Loading a Shapefile
- Layer → Add Layer → Add Vector Layer
- Browse to the .shp file
- Select and click Add
- Check Layer Properties for CRS
Loading CSV with Coordinates
- Layer → Add Layer → Add Delimited Text Layer
- Configure:
- X field: longitude column
- Y field: latitude column
- CRS: EPSG:4326 (WGS84)
- Click Add
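Before importing, it can help to confirm that the coordinate columns are numeric and not swapped. The following is a minimal sketch using only the Python standard library; the column names `longitude` and `latitude` and the sample rows are assumptions for illustration, not part of any particular dataset.

```python
import csv
import io


def find_bad_coordinates(csv_text, lon_field="longitude", lat_field="latitude"):
    """Return (row_number, reason) pairs for rows whose coordinates are unusable."""
    problems = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for row_number, row in enumerate(reader, start=2):  # row 1 is the header
        try:
            lon = float(row[lon_field])
            lat = float(row[lat_field])
        except (KeyError, TypeError, ValueError):
            problems.append((row_number, "missing or non-numeric"))
            continue
        # Valid WGS84 ranges; a latitude beyond +/-90 often means swapped columns
        if not (-180 <= lon <= 180 and -90 <= lat <= 90):
            problems.append((row_number, "out of range (swapped columns?)"))
    return problems


# Hypothetical sample: row 3 has swapped columns, row 4 has a blank longitude
sample = "name,longitude,latitude\nA,-97.5,35.4\nB,35.4,-97.5\nC,,31.0\n"
bad_rows = find_bad_coordinates(sample)
```

Rows reported here are worth fixing in the spreadsheet first, since the Delimited Text dialog will otherwise place them in the wrong location or skip them.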
Coordinate System Management
Understanding CRS
| Term | Meaning |
|---|---|
| GCS | Geographic Coordinate System (degrees) |
| PCS | Projected Coordinate System (meters/feet) |
| EPSG | Standard code for CRS definitions |
Common Systems
| EPSG | Name | Use Case |
|---|---|---|
| 4326 | WGS84 | Web display, GPS |
| 3857 | Web Mercator | Web map tiles |
| 5070 | Albers Equal Area | U.S. area calculations |
| 32610-32619 | WGS 84 / UTM zones 10N-19N | Regional precision (contiguous U.S.) |
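The UTM codes in the table follow a regular pattern: WGS 84 northern-hemisphere zones are 32600 plus the zone number, and southern-hemisphere zones are 32700 plus the zone number. A minimal sketch of picking the code from a longitude and latitude follows; the function name is hypothetical, and the sketch ignores the special zone exceptions around Norway and Svalbard.

```python
import math


def utm_epsg_for_lonlat(lon, lat):
    """Return the EPSG code of the WGS 84 / UTM zone containing (lon, lat)."""
    # UTM zones are 6 degrees wide, numbered 1-60 starting at 180 degrees west
    zone = int(math.floor((lon + 180) / 6)) + 1
    zone = min(zone, 60)  # lon == 180 would otherwise give zone 61
    base = 32600 if lat >= 0 else 32700
    return base + zone


san_francisco = utm_epsg_for_lonlat(-122.4, 37.8)  # zone 10N
boston = utm_epsg_for_lonlat(-71.06, 42.36)        # zone 19N
```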
Reprojecting Layers
- Right-click layer → Export → Save Features As
- Set CRS to target projection
- Or: Vector → Data Management Tools → Reproject Layer
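Critical: Never measure or buffer in a geographic CRS such as EPSG:4326, where the units are degrees. Reproject to a projected CRS with linear units first. Use an equal-area projection (like 5070) for area calculations. Equal-area projections preserve area, not distance, so for distances and buffers prefer a projection with low distance distortion over your study area, such as the local UTM zone; 5070 remains a common compromise for nationwide U.S. work.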
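To see why degrees are not a usable distance unit, consider how much ground one degree of longitude covers at different latitudes. This is a rough sketch that assumes a spherical Earth with the mean radius below; the function name is hypothetical.

```python
import math

EARTH_RADIUS_M = 6371008.8  # mean Earth radius in meters (spherical approximation)


def meters_per_degree_lon(lat_deg):
    """Approximate ground length of one degree of longitude at a given latitude."""
    return math.radians(1) * EARTH_RADIUS_M * math.cos(math.radians(lat_deg))


at_equator = meters_per_degree_lon(0)   # roughly 111 km
at_60_north = meters_per_degree_lon(60)  # roughly half of that
```

A buffer of "1 degree" is therefore about twice as wide on the ground at the equator as at 60 degrees north, which is why the layer should be in a meter-based CRS before buffering.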
Geometry Operations
Buffering
Create zones around features:
- Vector → Geoprocessing Tools → Buffer
- Input layer: Select source
- Distance: Enter in layer units
- Segments: 25+ for smooth curves
- End cap: Round
Example: 100-mile zone = 160,934 meters
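The conversion behind that figure is simple arithmetic, sketched here so the same constant can be reused for other distances. The helper name is hypothetical; the constant is the international mile.

```python
METERS_PER_MILE = 1609.344  # international mile, exact by definition


def miles_to_meters(miles):
    """Convert statute miles to meters for use as a buffer distance."""
    return miles * METERS_PER_MILE


hundred_mile_buffer = round(miles_to_meters(100))  # 160934
```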
Clipping
Cut one layer to another's boundary:
- Vector → Geoprocessing Tools → Clip
- Input: Layer to clip
- Overlay: Boundary layer
- Output: Clipped result
Dissolving
Merge multiple features into one:
- Vector → Geoprocessing Tools → Dissolve
- Useful for combining county boundaries into state
Spatial Joins
Combine attributes from overlapping features:
- Vector → Data Management Tools → Join Attributes by Location
- Select join type (intersects, within, etc.)
- Choose which attributes to keep
Simplification
Why Simplify?
Raw boundary files often contain far more vertices than a web map can show at typical zoom levels, which inflates file size and slows rendering.
| Original Size | Simplified | Web Ready |
|---|---|---|
| 50 MB | 5 MB | 500 KB |
Douglas-Peucker Algorithm
- Vector → Geometry Tools → Simplify
- Set tolerance (in layer units):
- National view: 1000-5000 meters
- State view: 100-500 meters
- County view: 10-50 meters
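- Review shared borders afterwards: the tool simplifies each feature independently, so adjacent polygons can end up with small gaps or overlaps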
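For readers who want to see what the tolerance actually controls, here is a minimal pure-Python sketch of the Douglas-Peucker idea: keep the endpoints, find the vertex farthest from the line between them, and recurse only if that vertex is farther away than the tolerance. It is an illustration, not QGIS's implementation, and it measures distance to the infinite line through the endpoints.

```python
import math


def _perpendicular_distance(point, start, end):
    """Distance from point to the line through start and end."""
    (x, y), (x1, y1), (x2, y2) = point, start, end
    dx, dy = x2 - x1, y2 - y1
    if dx == 0 and dy == 0:
        return math.hypot(x - x1, y - y1)
    return abs(dy * (x - x1) - dx * (y - y1)) / math.hypot(dx, dy)


def douglas_peucker(points, tolerance):
    """Simplify a polyline, dropping vertices closer than tolerance to the trend line."""
    if len(points) < 3:
        return list(points)
    max_dist, index = 0.0, 0
    for i in range(1, len(points) - 1):
        dist = _perpendicular_distance(points[i], points[0], points[-1])
        if dist > max_dist:
            max_dist, index = dist, i
    if max_dist <= tolerance:
        return [points[0], points[-1]]
    # Keep the farthest vertex and simplify each half separately
    left = douglas_peucker(points[: index + 1], tolerance)
    right = douglas_peucker(points[index:], tolerance)
    return left[:-1] + right


# Hypothetical line in meters: small wiggles plus one 400 m spike
line = [(0, 0), (1000, 5), (2000, -3), (3000, 400), (4000, 0)]
state_view = douglas_peucker(line, 100)
national_view = douglas_peucker(line, 500)
```

With a 100 m tolerance the 400 m spike and the vertex needed to approach it survive while the 5 m wiggle is dropped; with 500 m only the two endpoints remain.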
Balance Detail vs. Size
| Tolerance | Visual Impact |
|---|---|
| 10m | Imperceptible |
| 100m | Minor smoothing |
| 1000m | Visible simplification |
| 5000m | Significant generalization |
Spatial Analysis
Point in Polygon
Determine which polygon contains each point:
- Vector → Data Management Tools → Join Attributes by Location
- Input: Points layer
- Join: Polygon layer
- Predicate: "within"
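The "within" test at the heart of this join can be sketched with the classic ray-casting approach: cast a horizontal ray from the point and count how many polygon edges it crosses. This is a simplified illustration for single-ring polygons without holes; the function names and sample data are hypothetical, and points exactly on a boundary are not handled specially.

```python
def point_in_ring(x, y, ring):
    """Return True if (x, y) lies inside the polygon ring (list of (x, y) vertices)."""
    inside = False
    j = len(ring) - 1
    for i in range(len(ring)):
        xi, yi = ring[i]
        xj, yj = ring[j]
        # Does this edge straddle the horizontal line through the point?
        if (yi > y) != (yj > y):
            x_cross = xi + (y - yi) * (xj - xi) / (yj - yi)
            if x < x_cross:
                inside = not inside
        j = i
    return inside


def assign_polygons(points, polygons):
    """Map each point id to the name of the first polygon containing it, or None."""
    result = {}
    for point_id, (x, y) in points.items():
        result[point_id] = next(
            (name for name, ring in polygons.items() if point_in_ring(x, y, ring)),
            None,
        )
    return result


polygons = {
    "west": [(0, 0), (10, 0), (10, 10), (0, 10)],
    "east": [(10, 0), (20, 0), (20, 10), (10, 10)],
}
points = {"a": (5, 5), "b": (15, 5), "c": (25, 5)}
joined = assign_polygons(points, polygons)
```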
Population Within Zone
- Join census blocks to zone polygon
- Sum population fields
- Export statistics
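Once the join has tagged each census block with its zone, the summing step is a plain group-by. A minimal sketch, assuming the joined attribute table has been exported as a list of records; the field names `zone_id` and `population` and the sample values are hypothetical.

```python
from collections import defaultdict


def population_by_zone(joined_blocks, zone_field="zone_id", pop_field="population"):
    """Sum the population of census blocks grouped by the zone they were joined to."""
    totals = defaultdict(int)
    for block in joined_blocks:
        zone = block.get(zone_field)
        if zone is None:
            continue  # block did not fall within any zone
        totals[zone] += int(block.get(pop_field) or 0)  # treat NULL as zero
    return dict(totals)


# Hypothetical joined records
blocks = [
    {"zone_id": "border", "population": 1200},
    {"zone_id": "border", "population": "800"},
    {"zone_id": None, "population": 500},
    {"zone_id": "coast", "population": None},
]
totals = population_by_zone(blocks)
```

Note that this counts a block's whole population whenever the block joins to the zone, so blocks that straddle the zone edge can inflate the total.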
Hotspot Analysis
Identify clusters of enforcement activity:
- Processing → Toolbox → Heatmap (Kernel Density Estimation)
- Input: Incident points
- Radius: Search distance
- Output: Density raster
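To make the role of the radius concrete, here is a small sketch of kernel density on a grid. It assumes an unnormalized quartic kernel, where each point contributes (1 - (d/r)^2)^2 to cells within distance r; QGIS offers several kernel shapes and its output scaling may differ, so treat the numbers as relative only. Function names are hypothetical.

```python
import math


def quartic_density(cell_x, cell_y, points, radius):
    """Sum of quartic-kernel weights of all points within radius of a cell center."""
    total = 0.0
    for px, py in points:
        u = math.hypot(cell_x - px, cell_y - py) / radius
        if u < 1:
            total += (1 - u * u) ** 2
    return total


def density_grid(points, radius, x_min, y_min, cell_size, columns, rows):
    """Evaluate density at each cell center; rows are ordered from y_min upward."""
    return [
        [
            quartic_density(
                x_min + (col + 0.5) * cell_size,
                y_min + (row + 0.5) * cell_size,
                points,
                radius,
            )
            for col in range(columns)
        ]
        for row in range(rows)
    ]


# One hypothetical incident at (5, 5), 10-unit radius, 2x2 grid of 10-unit cells
grid = density_grid([(5, 5)], 10, 0, 0, 10, 2, 2)
```

A larger radius spreads each incident over more cells and smooths the surface; a smaller one produces sharper, more isolated peaks.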
Attribute Editing
Field Calculator
Add or modify attributes:
- Open Attribute Table
- Toggle Editing Mode
- Open Field Calculator
- Create expressions:
-- Concatenate fields
"facility_name" || ' - ' || "state"
-- Conditional values
CASE
  WHEN "facility_type" = 'SPC' THEN 'Federal'
  WHEN "facility_type" = 'CDF' THEN 'Private'
  ELSE 'Local'
END
-- Calculate area in square kilometers (layer must be in a projected CRS with meter units)
$area / 1000000
Batch Editing
For bulk attribute changes:
- Select features (attribute table or map)
- Open Field Calculator
- Check "Update existing field"
- Apply expression
Export for Web
GeoJSON Export
- Right-click layer → Export → Save Features As
- Format: GeoJSON
- CRS: EPSG:4326 (WGS84 longitude/latitude, as required by the GeoJSON specification, RFC 7946)
- Coordinate precision: 5 decimals (about 1 meter on the ground, sufficient for most uses)
- Check "RFC 7946" for strict compliance
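If a GeoJSON file was exported elsewhere with full floating-point precision, the same trimming can be done after the fact. A minimal standard-library sketch follows; the function names and the sample feature are hypothetical, and geometries without a `coordinates` member (such as GeometryCollection) are left untouched.

```python
import json


def round_coordinates(coords, precision=5):
    """Recursively round a GeoJSON coordinate array to the given decimal places."""
    if isinstance(coords, (int, float)):
        return round(coords, precision)
    return [round_coordinates(item, precision) for item in coords]


def compact_geojson(feature_collection, precision=5):
    """Return a compact JSON string with rounded coordinates; the input is not modified."""
    result = json.loads(json.dumps(feature_collection))  # cheap deep copy
    for feature in result.get("features", []):
        geometry = feature.get("geometry")
        if geometry and "coordinates" in geometry:
            geometry["coordinates"] = round_coordinates(geometry["coordinates"], precision)
    # Dropping whitespace between tokens saves additional bytes
    return json.dumps(result, separators=(",", ":"))


sample_fc = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "properties": {"name": "A"},
            "geometry": {"type": "Point", "coordinates": [-97.123456789, 35.987654321]},
        }
    ],
}
compact = compact_geojson(sample_fc)
```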
TopoJSON Conversion
For additional size reduction:
# Install topojson CLI
npm install -g topojson
# Convert GeoJSON to TopoJSON
geo2topo features=input.geojson > output.topojson
# Simplify further
toposimplify -p 0.01 output.topojson > simplified.topojson
Vector Tiles (MVT)
For MapLibre GL JS with massive datasets:
- Processing → Toolbox → Write Vector Tiles (MBTiles); note that Generate XYZ tiles (MBTiles) produces raster tiles, not MVT
- Or use the tippecanoe CLI:
tippecanoe -o output.mbtiles \
--minimum-zoom=0 \
--maximum-zoom=14 \
input.geojson
Automation with Python
PyQGIS Script
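from qgis.core import QgsVectorLayer
import processing  # available in the QGIS Python console; standalone scripts must initialize Processing first

# Load layer and stop early if the path or format is wrong
layer = QgsVectorLayer('/data/facilities.shp', 'facilities', 'ogr')
if not layer.isValid():
    raise RuntimeError('Could not load /data/facilities.shp')

# Reproject to EPSG:5070 so that layer units are meters
parameters = {
    'INPUT': layer,
    'TARGET_CRS': 'EPSG:5070',
    'OUTPUT': '/data/facilities_5070.shp'
}
processing.run('native:reprojectlayer', parameters)

# Buffer the reprojected layer
parameters = {
    'INPUT': '/data/facilities_5070.shp',
    'DISTANCE': 160934,  # 100 miles in meters
    'SEGMENTS': 25,  # smoother curves, as in the Buffering section
    'OUTPUT': '/data/facilities_buffer.shp'
}
processing.run('native:buffer', parameters)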
Batch Processing
Process multiple files:
import os
import processing

input_dir = '/data/raw/'
output_dir = '/data/processed/'
os.makedirs(output_dir, exist_ok=True)  # create the output folder if needed

for filename in os.listdir(input_dir):
    if filename.endswith('.shp'):
        input_path = os.path.join(input_dir, filename)
        output_name = os.path.splitext(filename)[0] + '.geojson'
        output_path = os.path.join(output_dir, output_name)
        # Reproject to WGS84 and write GeoJSON in one step
        processing.run('native:reprojectlayer', {
            'INPUT': input_path,
            'TARGET_CRS': 'EPSG:4326',
            'OUTPUT': output_path
        })
Data Validation
Check Geometry Validity
- Vector → Geometry Tools → Check Validity
- Fix invalid geometries:
- Vector → Geometry Tools → Fix Geometries
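As an illustration of what a validity check looks for, the sketch below tests a polygon ring against three simple rules: enough vertices, a closed ring, and non-zero area. It is deliberately incomplete and is not how QGIS implements the check; in particular it does not detect self-intersections, which the Check Validity tool does report. The function name is hypothetical.

```python
def ring_problems(ring):
    """Return a list of basic problems found in a polygon ring of (x, y) vertices."""
    problems = []
    if len(ring) < 4:
        problems.append("fewer than 4 points")
    if ring and ring[0] != ring[-1]:
        problems.append("ring not closed")
    # Shoelace formula: twice the signed area of the ring
    twice_area = sum(
        x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in zip(ring, ring[1:])
    )
    if twice_area == 0:
        problems.append("zero area")
    return problems


square = [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]
open_triangle = [(0, 0), (1, 0), (1, 1)]
collapsed = [(0, 0), (1, 1), (2, 2), (0, 0)]
```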
Remove Duplicates
- Processing → Toolbox → Delete Duplicate Geometries
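The same idea can be applied to an exported GeoJSON feature list. This is a minimal sketch that treats two features as duplicates only when their geometry objects serialize identically; it keeps the first occurrence and ignores attributes. The function name and sample features are hypothetical.

```python
import json


def drop_duplicate_geometries(features):
    """Return features with exact duplicate geometries removed, keeping the first seen."""
    seen = set()
    unique = []
    for feature in features:
        # Serialize the geometry so it can be used as a set key
        key = json.dumps(feature.get("geometry"), sort_keys=True)
        if key in seen:
            continue
        seen.add(key)
        unique.append(feature)
    return unique


features = [
    {"properties": {"id": 1}, "geometry": {"type": "Point", "coordinates": [-97.5, 35.4]}},
    {"properties": {"id": 2}, "geometry": {"type": "Point", "coordinates": [-97.5, 35.4]}},
    {"properties": {"id": 3}, "geometry": {"type": "Point", "coordinates": [-96.0, 36.1]}},
]
deduplicated = drop_duplicate_geometries(features)
```

Because the comparison is exact, points that differ only in the last decimal place are kept as separate features; rounding coordinates first makes the match more forgiving.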
Attribute Validation
-- In Field Calculator: flag missing data
CASE
  WHEN "facility_name" IS NULL THEN 'MISSING NAME'
  WHEN "latitude" IS NULL THEN 'MISSING COORDS'
  ELSE 'OK'
END
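The same rule can be applied outside QGIS, for example as a pre-publication check on exported records. A minimal sketch mirroring the expression above, where Python's `None` stands in for NULL; the function names and sample records are hypothetical.

```python
from collections import Counter


def validation_flag(record):
    """Mirror the Field Calculator CASE expression for one attribute record."""
    if record.get("facility_name") is None:
        return "MISSING NAME"
    if record.get("latitude") is None:
        return "MISSING COORDS"
    return "OK"


def summarize_flags(records):
    """Count how many records fall under each validation flag."""
    return dict(Counter(validation_flag(record) for record in records))


records = [
    {"facility_name": "Example A", "latitude": 31.7},
    {"facility_name": None, "latitude": 32.1},
    {"facility_name": "Example B", "latitude": None},
    {"facility_name": "Example C", "latitude": 29.4},
]
summary = summarize_flags(records)
```

As in the expression, the first matching rule wins, so a record missing both fields is reported only as MISSING NAME.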
Project Organization
Layer Naming Convention
facility_points_original
facility_points_5070
facility_points_simplified
facility_points_web
Save as GeoPackage
For portable, self-contained projects:
- Layer → Save As
- Format: GeoPackage
- Add multiple layers to single file
Related Resources
- 100-Mile Zone - Buffer workflow example
- Checkpoints - Facility data preparation
- Implementation - CI/CD data pipelines
- Tile Servers - Serving processed data