How do you benefit from spatial data?
Spatial data is an important component to any data-driven decision. It not only shows the location of a real-world asset, but it also helps to paint the picture of how that asset is impacted by its surroundings. Many everyday questions are answered by analyzing data based on location and proximity to other locations.
- Enterprises that own and manage land may want to not only store the correct outline of a property as a polygon but also manage revenue or make risk-based decisions given the location of the property.
- A transportation company will want to know the progress of its truck fleet at any given time. Estimating the time of delivery and solving for last mile logistics.
- A telecommunications company may want to assess the impact of weathering on its equipment, or the estimated customer reach given the topology of the land on which the tower is located.
- Or a tourist visiting a large city may want to choose the location of their lodgings based on proximity to landmarks and transportation.
Analysis involving spatial data impacts any enterprise or organization that owns, maintains, or manages physical assets.
However, working with spatial data can have its challenges. From different types of spatial representations to spatial reference considerations to availability of analytical functionality. Many solutions make it possible to work with spatial data, but they often focus on one aspect, or one set of workflows and it can be daunting to pick the right solution for your workflow.
How do you use PostGIS to work with spatial data?
Over the years, PostgreSQL has been a crucial part of many spatial workflows, primarily because of the extensive set of spatial capabilities available through the PostGIS extension. Created over 20 years ago, the extension kept growing in capabilities as the needs of the spatial market evolved. It allows for representation of data as points, lines and polygons, in two spatial data types.
Beginning with the st_geometry spatial type, which allows for persistence of points, lines and polygons projected to a flat surface such as a map or a monitor screen. It includes the standard set of spatial references and the ability to reproject between them, along with spatial indexes and extensive range of analytical functions.
As spatial workflows encompassed more global data, the st_geography spatial type was introduced, which represents data as points, arcs and polygons referenced to a spheroid.
This data is stored as latitude and longitude, using the WGS 84 datum, to approximate the shape of the globe. It is always persisted with spatial reference (SRID) of 4326. The type comes with its own set of analytical functions.
The PostGIS extension has the distinction of supporting both spatial types, the ability to perform analysis and maintain precision and accuracy of data in each type as well as the ability to cast data from one type to another. So, it can be used to maintain the correctness of projected data such a land parcel, or a road in st_geometry, and it can be used to represent data on a global scale such as ride sharing apps or airplane routes in st_geography.
In time, several other PostGIS extensions were created, each to work with different spatial representations, such as raster, point clouds and 3D data and perform operations such as geocoding or topology modeling.
How do you enable Azure Database for PostgreSQL – Flexible Server to work with spatial data?
Azure Database for PostgreSQL - Flexible Server is a fully managed database-as-a-service based on the community version of PostgreSQL which provides the ability to extend the functionality of the instance using extensions, which once loaded, will function like built-in features.
The following spatial extensions can be installed:
- PostGIS - primary spatial extension for PostgreSQL.
- PostGIS _Topology - model connectivity between features, such as road networks or land parcels
- PostGIS_Raster – store and analyze bands of raster data, from imagery to land use or soil types
- PostGIS_Tiger_Geocoder - use spatial data collected as part of the Census Bureau's census, to create address points.
- PostGIS_sfcgal - work with 3D data, such as building heights or surface dimensionality.
You can learn more about Azure Database for PostgreSQL – Flexible Server here.
Try it yourself!
In this example you are going to look for Airbnb rentals in New York City, within walking distance of a subway station. This workflow relies on the PostGIS extension, data publicly available from Airbnb and a backup from Introduction to PostGIS, available under the Creative Commons Attribution-Share-Alike 3.0 license, currently maintained by Paul Ramsey.
For an upcoming trip to New York City, you are looking for a rental within walking distance of many attractions and in close proximity to a subway station. The Broad Street subway station is in the heart of the financial district of Manhattan, and a short distance from the World Trade Center memorial. So, let's look for rental properties within different distances of this station.
First, let's prepare the environment!
Create an instance of PostgreSQL - Flexible Server using Portal
- Deploy an instance of PostgreSQL - Flexible Server
- Enable extensions on the newly created Flex Server
- Create a Blob Storage Account and load the Airbnb data
- Create a Blob Container Introduction to Blob Storage - Azure Storage
- Load Listings for New York City from Airbnb
Connect to your server using pgAdmin4
- Create a database: nyc
create database nyc;
- Enable Extensions in the nyc database
create extension azure_storage;
create extension postgis;
SELECT azure_storage.account_add('name_of_storage_account', 'secret_access_key');
- Restore the backup from the Introduction to PostGIS workshop, which includes a PostgreSQL backup of spatial data layers, gathered as part of the public census by the U.S. Census Bureau.
Backup and restore - Azure Database for PostgreSQL - Flexible Server
- Create the table to store nyc_listings_bnb
CREATE TABLE IF NOT EXISTS public.nyc_listings_bnb
(
id bigint,
name character varying(50),
host_id integer,
host_name character varying(50),
neighbourhood_group character varying(50),
neighbourhood character varying(50),
latitude numeric(10,5),
longitude numeric(10,5),
room_type character varying(50),
price money
);
- Load the Airbnb listings data from blob storage
INSERT INTO nyc_listings_bnb
SELECT id,name,host_id,host_name,neighbourhood_group,neighbourhood, latitude, longitude, room_type, price
FROM azure_storage.blob_get('name_of_storage_account','name_of_container','name_of_csv.csv'
,options:= azure_storage.options_csv_get(header=>true)) AS res (
id bigint,
name character varying(50),
host_id integer,
host_name character varying(50),
neighbourhood_group character varying(50),
neighbourhood character varying(50),
latitude numeric(10,5),
longitude numeric(10,5),
room_type character varying(50),
price money
);
- Construct a PostGIS spatial column for the Airbnb listings table
Alter table nyc_listings_bnb ADD COLUMN listing_geom geometry(point, 4326);
UPDATE nyc_listings_bnb SET listing_geom =ST_SetSRID(ST_Point(longitude, latitude), 4326);
Explore the data
You now have at least 7 layers of spatially enabled data available for exploration and analysis. Let's take a look at the Broad Street station.
select st_transform(geom, 4326)
from nyc_subway_stations
where name = 'Broad St';
PGAdmin4 includes an open-source viewer, Leaflet which will let you visualize a single spatial data layer, using OSM basemaps. The data to be visualized needs to be referenced to SRID 4326 to display against the basemap.
Perform Analysis
You can use the NYC Subway Stations layer to create different geofences around the Broad Street subway station. Using the ST_Buffer function you can create a polygon within a distance of a subway station, each of those polygons will be saved in a table. The distance will need to be given in the units of the data. In this case the data is in SRID 26918, which stores data in meters.
You can use the geofence layers to find available listings from the Airbnb listings data that lie within each polygon.
What if you were looking for an Airbnb listing that was within a hundred feet of a subway station, how many Airbnb listings would I be able to choose from?
create table nyc_geofence_100_Broad as
select st_transform(st_buffer(geom, 100), 4326) as geo_fence
from nyc_subway_stations
where name= 'Broad St';
select geo_fence from nyc_geofence_100_broad;
select host_id, host_name, l.listing_geom
from nyc_listings_bnb as l
join nyc_geofence_100_broad as g
on st_contains (g.geo_fence,l.listing_geom)
The analysis found that 13 Airbnb properties are available within 100 meters or 382 feet from the Broad St. subway station.
But what if you were willing to walk a third of a mile to a subway station, how many Airbnb listings would I be able to choose from?
create table nyc_geofence_500_Broad as
select st_transform(st_buffer(geom, 500), 4326) as geo_fence
from nyc_subway_stations
where name= 'Broad St';
select geo_fence from nyc_geofence_500_broad;
select host_id, host_name, l.listing_geom
from nyc_listings_bnb as l
join nyc_geofence_500_broad as g
on st_contains (g.geo_fence,l.listing_geom)
The analysis found that 514 Airbnb properties are available within 500 meters or .3 miles from the Broad St. subway station.
Extensibility of Azure Database for PostgreSQL - Flexible Server
Congratulations!
You just learned how to enhance your applications with spatial data by leveraging PostGIS with Azure Database for PostgreSQL - Flexible Server.
Updated May 16, 2024
Version 3.0Kasia_Tuszynska
Microsoft
Joined February 26, 2024
Azure Database for PostgreSQL Blog
Follow this blog board to get notified when there's new activity