DuckDB httpfs · S3 · GCS · HTTP Range Requests

Query Petabytes Without
Downloading a Byte

GeoQ uses DuckDB's httpfs extension to stream only the data you need, directly from Amazon S3, Google Cloud Storage, or any HTTPS endpoint, using efficient HTTP range requests.


Connect Any Cloud, Any Endpoint

Amazon S3, Google Cloud Storage, and open HTTPS endpoints, each selected with a single URI prefix.

Amazon S3
Use s3://bucket/path/data.copc.laz directly. Credentials are resolved through the full AWS credential chain: environment variables, ~/.aws/credentials, IAM instance roles, or explicit secrets.
s3:// · IAM Roles · AWS SDK Chain
Google Cloud Storage
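Use gs://bucket/path/data.parquet with GCS credentials supplied through DuckDB's secrets manager. DuckDB's httpfs extension reads GCS through its S3-compatible interoperability API, so the secret holds HMAC keys, which can be issued for a service account.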
gs:// · Service Account · GCS
HTTPS / Public Endpoints
Any HTTPS URL pointing to a supported format works out of the box: USGS national datasets, OpenTopography COPCs, or your own CDN. No credentials needed for public data.
https:// · Public Data · CDN
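As a rough illustration of how a single URI prefix can select the backend, here is a minimal Python sketch. The `BACKENDS` table and the `classify_source` helper are hypothetical names made up for this example; they are not part of GeoQ or DuckDB.

```python
from urllib.parse import urlparse

# Hypothetical mapping from URI scheme to a backend label (illustration only).
BACKENDS = {
    "s3": "Amazon S3",
    "gs": "Google Cloud Storage",
    "https": "HTTPS endpoint",
}


def classify_source(uri: str) -> tuple[str, str, str]:
    """Return (backend, bucket_or_host, object_path) for a source URI."""
    parts = urlparse(uri)
    backend = BACKENDS.get(parts.scheme)
    if backend is None:
        raise ValueError(f"unsupported URI scheme: {parts.scheme!r}")
    # For s3:// and gs:// the network location is the bucket name;
    # for https:// it is the host.
    return backend, parts.netloc, parts.path.lstrip("/")


print(classify_source("s3://bucket/path/data.copc.laz"))
```

The same call works for `gs://` and `https://` sources; any other scheme raises a `ValueError` in this sketch.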

HTTP Range Requests

Cloud-native formats like COPC, COG (Cloud-Optimized GeoTIFF), and GeoParquet embed spatial indexes and metadata at known byte offsets. DuckDB reads only those offsets: fetching a 10 MB spatial tile from a 100 GB dataset transfers about 10 MB, plus a few small header and index reads.

This keeps GeoQ fast even over high-latency internet connections, because the query engine pushes spatial predicates into the file reader before any bulk data is fetched.
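To make the mechanism concrete, the sketch below builds (but does not send) an HTTP request for one byte range using only the Python standard library. The URL and the offsets are made-up values, and `range_request` is a hypothetical helper, not a GeoQ or DuckDB function.

```python
from urllib.request import Request

MB = 1024 * 1024


def range_request(url: str, offset: int, length: int) -> Request:
    """Build a GET request for `length` bytes starting at byte `offset`."""
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    last = offset + length - 1  # HTTP byte ranges are inclusive on both ends
    return Request(url, headers={"Range": f"bytes={offset}-{last}"})


# Hypothetical example: a 10 MB tile stored 40 GB into a large file.
req = range_request("https://example.com/data.copc.laz", 40 * 1024 * MB, 10 * MB)
print(req.get_header("Range"))
```

A server that supports range requests answers with status 206 (Partial Content) and sends only the requested bytes; a server that ignores the header returns the whole object with status 200.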

STEP 1
Read spatial index / header (~4 KB range request)
STEP 2
Resolve matching tiles/chunks from bbox (spatial index lookup)
STEP 3
Fetch only matched byte ranges (e.g. 3 × 2 MB = 6 MB of 100 GB)
RESULT
SQL query executed in-memory → output
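Steps 2 and 3 can be sketched in a few lines of Python. This is a simplified model: it assumes the spatial index has already been decoded into a flat list of tiles, whereas real formats use richer structures (COPC an octree hierarchy, COG a tile offset table). `Tile`, `intersects`, and `plan_ranges` are hypothetical names for illustration.

```python
from dataclasses import dataclass

BBox = tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y)


@dataclass(frozen=True)
class Tile:
    bbox: BBox
    offset: int  # byte offset of the tile within the file
    length: int  # tile size in bytes


def intersects(a: BBox, b: BBox) -> bool:
    """True when two axis-aligned boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def plan_ranges(index: list[Tile], query: BBox, max_gap: int = 0) -> list[tuple[int, int]]:
    """Return inclusive (first_byte, last_byte) ranges for tiles hitting `query`.

    Ranges separated by at most `max_gap` bytes are merged, trading a few
    extra bytes for fewer round trips.
    """
    hits = sorted((t for t in index if intersects(t.bbox, query)), key=lambda t: t.offset)
    ranges: list[tuple[int, int]] = []
    for tile in hits:
        first, last = tile.offset, tile.offset + tile.length - 1
        if ranges and first - ranges[-1][1] - 1 <= max_gap:
            ranges[-1] = (ranges[-1][0], max(ranges[-1][1], last))
        else:
            ranges.append((first, last))
    return ranges


index = [
    Tile((0.0, 0.0, 1.0, 1.0), 4096, 100),
    Tile((1.0, 0.0, 2.0, 1.0), 4196, 100),
    Tile((5.0, 5.0, 6.0, 6.0), 10000, 100),
    Tile((0.5, 0.5, 0.6, 0.6), 20000, 50),
]
ranges = plan_ranges(index, (0.4, 0.4, 1.5, 0.9))
print(ranges)
```

Each resulting pair maps directly onto one `Range: bytes=first-last` header. Raising `max_gap` is a common tuning knob on high-latency links, where one slightly larger request is often cheaper than two small ones.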

Zero-Config Credentials

AWS Credential Chain
GeoQ resolves credentials in order: explicit --aws-key flags → environment variables (AWS_ACCESS_KEY_ID) → ~/.aws/credentials profile → EC2/ECS instance metadata IAM role.
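The resolution order above can be modelled as a simple first-match chain. The sketch below is an illustration of that order, not GeoQ's actual code: `resolve_aws_credentials` is a hypothetical function, it ignores session tokens, and it does not actually query the instance metadata service.

```python
import configparser
import os
from pathlib import Path
from typing import Mapping, Optional

Creds = tuple[str, str]  # (access_key_id, secret_access_key)


def resolve_aws_credentials(
    explicit: Optional[Creds] = None,
    env: Optional[Mapping[str, str]] = None,
    credentials_file: Optional[Path] = None,
    profile: str = "default",
) -> tuple[str, Optional[Creds]]:
    """Return (source, credentials) from the first source that provides them."""
    # 1. Explicit values, e.g. passed on the command line.
    if explicit:
        return "explicit", explicit

    # 2. Environment variables.
    env = os.environ if env is None else env
    if env.get("AWS_ACCESS_KEY_ID") and env.get("AWS_SECRET_ACCESS_KEY"):
        return "environment", (env["AWS_ACCESS_KEY_ID"], env["AWS_SECRET_ACCESS_KEY"])

    # 3. Shared credentials file (INI format); a missing file is skipped.
    path = credentials_file or Path.home() / ".aws" / "credentials"
    parser = configparser.ConfigParser()
    parser.read(path)
    if parser.has_section(profile):
        section = parser[profile]
        if "aws_access_key_id" in section and "aws_secret_access_key" in section:
            return "profile", (section["aws_access_key_id"], section["aws_secret_access_key"])

    # 4. Fall through to the instance role; fetching it is out of scope here.
    return "instance-metadata", None
```

Passing `env` and `credentials_file` explicitly keeps the function easy to test without touching the real environment or home directory.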
DuckDB Secrets API
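Store credentials once with CREATE PERSISTENT SECRET in the DuckDB secrets manager. They are saved to disk and reused across sessions, so there is no env-var juggling. DuckDB writes persistent secrets in an unencrypted binary format (by default under ~/.duckdb/stored_secrets), so restrict access to that directory.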
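For reference, a persistent S3 secret is created with a single SQL statement. The Python sketch below only renders that statement as a string; `sql_literal` and `create_s3_secret_sql` are hypothetical helpers, and the secret name and key values are placeholders.

```python
def sql_literal(value: str) -> str:
    """Quote a string as a SQL literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def create_s3_secret_sql(name: str, key_id: str, secret: str, region: str) -> str:
    """Render a DuckDB CREATE PERSISTENT SECRET statement for S3 access."""
    if not name.isidentifier():
        raise ValueError("secret name must be a plain identifier")
    return (
        f"CREATE PERSISTENT SECRET {name} (\n"
        f"    TYPE s3,\n"
        f"    KEY_ID {sql_literal(key_id)},\n"
        f"    SECRET {sql_literal(secret)},\n"
        f"    REGION {sql_literal(region)}\n"
        f");"
    )


# Placeholder values for illustration only.
statement = create_s3_secret_sql("geoq_s3", "my_key_id", "my_secret_key", "us-west-2")
print(statement)
```

The rendered statement can be pasted into the DuckDB shell or, assuming the `duckdb` Python package is installed, run with `duckdb.connect().execute(statement)`. To rely on the AWS credential chain instead of fixed keys, DuckDB also accepts `CREATE SECRET (TYPE s3, PROVIDER credential_chain);`.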
Public Data: No Config
For anonymous-access S3 buckets or HTTPS endpoints, no credentials are needed. Just pass the URI and GeoQ detects and configures anonymous access automatically.

Cloud Source Commands

# Query a COPC point cloud directly from S3 (uses IAM role automatically)
geoq cloud point-cloud \
  --source s3://my-lidar-bucket/surveys/city.copc.laz \
  --bbox "-122.5,37.7,-122.3,37.9" \
  --output csv

# Query a Cloud-Optimized GeoTIFF from GCS
geoq cloud raster \
  --source gs://public-dem/ot_be_Brus_2020_LAZ_2021_cloud.tif \
  --band 1 \
  --output csv

# Query a public GeoParquet file over HTTPS
geoq cloud vector \
  --source https://data.source.coop/example/parcels.parquet \
  --sql "SELECT apn, area_sqft, geom FROM features WHERE area_sqft > 5000 LIMIT 100" \
  --output geojson

# Query a multi-file glob from S3 (st_read_multi extension)
geoq cloud vector \
  --source "s3://my-bucket/parcels/*.fgb" \
  --output geojson

Your Data, Still in the Cloud

No staging. No egress surprises. GeoQ reads only what the query needs.
