| Author: | Sean Gillies, <sean.gillies@gmail.com> |
|---|---|
| Revision: | 1.2 |
| Date: | 6 April 2010 |
| Copyright: | This work is licensed under a Creative Commons Attribution 3.0 United States License. |
| Abstract: | This document explains how to use the Shapely Python package for computational geometry. |
Deterministic spatial analysis is an important component of computational approaches to problems in agriculture, ecology, epidemiology, sociology, and many other fields. What is the surveyed perimeter/area ratio of these patches of animal habitat? Which properties in this town intersect with the 50-year flood contour from this new flooding model? What are the extents of findspots for ancient ceramic wares with maker's marks "A" and "B", and where do the extents overlap? These are just a few of the possible questions addressable using non-statistical spatial analysis, and more specifically, computational geometry.
Shapely is a Python package for set-theoretic analysis and manipulation of planar features using (via Python's ctypes module) functions from the well known and widely deployed GEOS library. GEOS, a port of the Java Topology Suite, is the geometry engine of the PostGIS spatial extension for the PostgreSQL RDBMS. The designs of JTS and GEOS are largely guided by the Open Geospatial Consortium's Simple Features Access Specification [1] and Shapely adheres mainly to the same set of standard classes and operations. Shapely is deeply rooted in the geographic information systems (GIS) world, but aspires to be equally useful to programmers working on non-traditional problems.
PostGIS is a cornerstone of open source GIS, but isn't necessarily a solution to all problems in the GIS domain. Not all geographic data originate or reside in an RDBMS or are best processed using SQL. Shapely aims to bring industrial strength computational geometry primitives to bear on programming problems better addressed in an object-oriented style. Imagine a situation where we would like to find or index a substring within another string. Is there overlap between the strings, and if so, what is it? Or maybe we'd like to replace certain characters in a string with others. Now imagine that we're compelled to load the text strings from, say, a log file into a relational database to perform these operations because such string functions aren't available in a non-SQL context. No knock on the RDBMS, a tremendously useful thing, but if there's no mandate to manage (the "M" in "RDBMS") these strings over time in the database, we're using the wrong tool for the job in this imaginary scenario. Now, consider spatial entities like points, curves, and patches instead of character strings, and that they might originate not in the context of a traditional GIS, but from parsing and geocoding of texts or log files or from "social web" activities. If you agree that sometimes PostGIS (or another spatially-enabled RDBMS) is the wrong tool for your computational geometry job, Shapely might be for you.
The premise of Shapely, or one of the premises, is that Python programmers should be able to perform PostGIS type geometry operations outside of an RDBMS. Another is that Python idioms trump GIS (or Java, in this case, since the GEOS library is derived from JTS, a Java project) idioms. Shapely, in a nutshell, lets you do PostGIS-y stuff with geometries outside the context of a database using idiomatic Python. Computational geometry, with no extra baggage.
The fundamental types of geometric objects implemented by Shapely are points, curves, and surfaces. Each is associated with three sets of (possibly infinite) points in the plane. The interior, boundary, and exterior sets of a feature are mutually exclusive and their union coincides with the entire plane [2].
That may seem a bit esoteric, but will help clarify the meanings of Shapely's spatial predicates, and it's as deep into theory as this manual will go. Consequences of point-set theory, including some that manifest themselves as "gotchas", for different classes will be discussed later in this manual.
The point type is implemented by a Point class; curve by the LineString and LinearRing classes; and surface by a Polygon class. Shapely implements no smooth (i.e. having continuous tangents) curves. All curves must be approximated by linear splines. All rounded patches must be approximated by regions bounded by linear splines.
Collections of points are implemented by a MultiPoint class, collections of curves by a MultiLineString class, and collections of surfaces by a MultiPolygon class. These collections aren't computationally significant, but are useful for modeling certain kinds of features. A Y-shaped line feature, for example, is well modeled as a whole by a MultiLineString.
The standard data model has additional constraints specific to certain types of geometric objects that will be discussed in following sections of this manual.
See also http://www.vividsolutions.com/jts/discussion.htm#spatialDataModel for more illustrations of this data model.
The spatial data model is accompanied by a group of natural language relationships between geometric objects – contains, intersects, overlaps, touches, etc – and a theoretical framework for understanding them using the 3x3 matrix of the mutual intersections of their component point sets [2]: the DE-9IM. A comprehensive review of the relationships in terms of the DE-9IM is found in [4] and will not be reiterated in this manual.
Following the JTS technical specs [5], this manual will make a distinction between constructive (buffer, convex hull) and set-theoretic operations (intersection, union, etc). The individual operations will be fully described in a following section of the manual.
Even though the Earth is not flat – and for that matter not exactly spherical – there are many analytic problems that can be approached by transforming Earth features to a Cartesian plane, applying tried and true algorithms, and then transforming the results back to geographic coordinates. This practice is as old as the tradition of accurate paper maps.
Shapely does not support coordinate system transformations. All operations on two or more features presume that the features exist in the same Cartesian plane.
Geometric objects are created in the typical Python fashion, using the classes themselves as instance factories. A few of their intrinsic properties will be discussed in this section, others in the following sections on operations and serializations.
Instances of Point, LineString, and LinearRing have coordinate sequences. Coordinate sequences are immutable. Their parent features are mutable in that they can be assigned new coordinate sequences. A third z coordinate value may be used when constructing instances, but has no effect on geometric analysis. All operations are performed in the x-y plane.
In all constructors, numeric values are converted to type float. In other words, Point(0, 0) and Point(0.0, 0.0) produce geometrically equivalent instances. Shapely does not check the topological simplicity or validity of instances when they are constructed as the cost is unwarranted in most cases. Validating factories are trivially implemented, using the is_valid predicate, by users that require them.
Pseudo-code blocks in this section will use the following notation. Let a be a point in Cartesian coordinates, represented by a Python tuple of 2 ((x, y)) or 3 ((x, y, z)) numerical values. Let (a1, ..., aM) and (b1, ..., bN) be ordered sequences of M and N such points, defining the vertices of a curve.
The Point constructor takes positional coordinate values or point tuple parameters.
>>> from shapely.geometry import Point
>>> point = Point(0.0, 0.0)
>>> q = Point((0.0, 0.0))
A Point has zero area and zero length.
>>> point.area
0.0
>>> point.length
0.0
Its x-y bounding box is a (minx, miny, maxx, maxy) tuple.
>>> point.bounds
(0.0, 0.0, 0.0, 0.0)
Coordinate values are accessed via coords, x, y, and z properties.
>>> list(point.coords)
[(0.0, 0.0)]
>>> point.x
0.0
>>> point.y
0.0
The Point constructor also accepts another Point instance, thereby making a copy.
>>> Point(point)
<shapely.geometry.point.Point object at 0x...>
The LineString constructor takes an ordered sequence of point tuples.
>>> from shapely.geometry import LineString
>>> line = LineString((a1, ..., aM))
Repeated points in the ordered sequence are allowed, but may incur performance penalties and should be avoided. A LineString may cross itself (i.e. be complex and not simple).
A LineString has zero area and non-zero length.
>>> line = LineString([(0, 0), (1, 1)])
>>> line.area
0.0
>>> line.length
1.4142135623730951
Its x-y bounding box is a (minx, miny, maxx, maxy) tuple.
>>> line.bounds
(0.0, 0.0, 1.0, 1.0)
Coordinate values are accessed via the coords property.
>>> len(line.coords)
2
>>> list(line.coords)
[(0.0, 0.0), (1.0, 1.0)]
The constructor also accepts another LineString instance, thereby making a copy.
>>> LineString(line)
<shapely.geometry.linestring.LineString object at 0x...>
A sequence of Point instances is not a valid constructor parameter. A LineString is not composed of Point instances.
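The length reported for the line above can be reproduced by hand as the sum of the straight-line distances between consecutive vertices. The following is a minimal pure-Python sketch of that arithmetic, not Shapely's or GEOS's implementation; the helper name polyline_length is hypothetical, and it assumes each vertex is an (x, y) tuple.

```python
import math

def polyline_length(coords):
    """Sum the Euclidean lengths of consecutive segments in the x-y plane."""
    total = 0.0
    for (x0, y0), (x1, y1) in zip(coords, coords[1:]):
        total += math.hypot(x1 - x0, y1 - y0)
    return total

# The two-vertex line used in the example above: length is sqrt(2).
line_length = polyline_length([(0, 0), (1, 1)])
print(line_length)
```

The same function applied to an explicitly closed sequence of vertices gives the perimeter of a ring.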
The LinearRing constructor takes an ordered sequence of point tuples. The sequence may be explicitly closed by passing identical values in the first and last indices. Otherwise, the sequence will be implicitly closed by copying the first tuple to the last index.
>>> from shapely.geometry.polygon import LinearRing
>>> ring = LinearRing((a1, ..., aM))
Repeated points in the ordered sequence are allowed, but may incur performance penalties and should be avoided. A LinearRing may not cross itself, and may not touch itself at a single point. Note that Shapely will not prevent the creation of such rings, but exceptions will be raised when they are operated on.
A LinearRing has zero area and non-zero length.
>>> ring = LinearRing([(0, 0), (1, 1), (1, 0)])
>>> ring.area
0.0
>>> ring.length
3.4142135623730949
Its x-y bounding box is a (minx, miny, maxx, maxy) tuple.
>>> ring.bounds
(0.0, 0.0, 1.0, 1.0)
Coordinate values are accessed via the coords property.
>>> len(ring.coords)
4
>>> list(ring.coords)
[(0.0, 0.0), (1.0, 1.0), (1.0, 0.0), (0.0, 0.0)]
The LinearRing constructor also accepts another LineString or LinearRing instance, thereby making a copy.
>>> LinearRing(ring)
<shapely.geometry.polygon.LinearRing object at 0x...>
As with LineString, a sequence of Point instances is not a valid constructor parameter.
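The implicit closing described in this section, where the first tuple is copied to the last index unless the sequence is already closed, can be sketched in a few lines. This is only an illustration of the rule and of the float conversion mentioned earlier; close_ring is a hypothetical helper, not part of Shapely.

```python
def close_ring(coords):
    """Return a list of float tuples whose last point repeats the first."""
    pts = [tuple(float(v) for v in p) for p in coords]
    if pts and pts[0] != pts[-1]:
        pts.append(pts[0])  # implicit closing: copy the first point to the end
    return pts

print(close_ring([(0, 0), (1, 1), (1, 0)]))
# [(0.0, 0.0), (1.0, 1.0), (1.0, 0.0), (0.0, 0.0)]
```

A sequence that is already explicitly closed is returned with the same number of points.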
The Polygon constructor takes two positional parameters. The first is an ordered sequence of point tuples and is treated exactly as in the LinearRing case. The second is an optional unordered sequence of ring-like sequences specifying the interior boundaries or "holes" of the feature.
>>> from shapely.geometry import Polygon
>>> polygon = Polygon((a1, ..., aM), [(b1, ..., bN), ...])
Polygon rings may not cross each other, but may touch at single points. Again, Shapely will not prevent the creation of such features, but exceptions will be raised when they are operated on.
A Polygon has non-zero area and non-zero length.
>>> polygon = Polygon([(0, 0), (1, 1), (1, 0)])
>>> polygon.area
0.5
>>> polygon.length
3.4142135623730949
Its x-y bounding box is a (minx, miny, maxx, maxy) tuple.
>>> polygon.bounds
(0.0, 0.0, 1.0, 1.0)
Component rings are accessed via exterior and interiors properties.
>>> list(polygon.exterior.coords)
[(0.0, 0.0), (1.0, 1.0), (1.0, 0.0), (0.0, 0.0)]
>>> list(polygon.interiors)
[]
The Polygon constructor also accepts instances of LineString and LinearRing.
>>> coords = [(0, 0), (1, 1), (1, 0)]
>>> r = LinearRing(coords)
>>> s = Polygon(r)
>>> s.area
0.5
>>> t = Polygon(s.buffer(1.0).exterior, [r])
>>> t.area
6.5507620529190334
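The area of 0.5 reported for the triangular polygon above can be checked with the shoelace formula, subtracting the area of any holes from the area of the exterior ring. The sketch below is an independent illustration and not how GEOS computes area; ring_area and polygon_area are hypothetical names, and the holes are assumed to lie inside the shell without overlapping one another.

```python
def ring_area(coords):
    """Unsigned area enclosed by a ring of (x, y) tuples (shoelace formula)."""
    pts = list(coords)
    if pts[0] != pts[-1]:
        pts.append(pts[0])  # close the ring if needed
    twice_area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        twice_area += x0 * y1 - x1 * y0
    return abs(twice_area) / 2.0

def polygon_area(shell, holes=()):
    """Shell area minus the areas of the holes."""
    return ring_area(shell) - sum(ring_area(h) for h in holes)

print(polygon_area([(0, 0), (1, 1), (1, 0)]))  # 0.5
```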
The MultiPoint constructor takes an ordered sequence of point tuples.
>>> from shapely.geometry import MultiPoint
>>> points = MultiPoint([c1, ..., cN])
A MultiPoint has zero area and zero length.
>>> points = MultiPoint([(0.0, 0.0), (1.0, 1.0)])
>>> points.area
0.0
>>> points.length
0.0
Its x-y bounding box is a (minx, miny, maxx, maxy) tuple.
>>> points.bounds
(0.0, 0.0, 1.0, 1.0)
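The (minx, miny, maxx, maxy) tuple is simply the componentwise minimum and maximum over all vertices. A minimal sketch, assuming a flat sequence of (x, y) tuples and using the hypothetical helper name bounds_of:

```python
def bounds_of(coords):
    """Return (minx, miny, maxx, maxy) for a sequence of (x, y) tuples."""
    xs = [float(x) for x, y in coords]
    ys = [float(y) for x, y in coords]
    return (min(xs), min(ys), max(xs), max(ys))

print(bounds_of([(0.0, 0.0), (1.0, 1.0)]))  # (0.0, 0.0, 1.0, 1.0)
```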
Members of a multi-point collection are accessed via the geoms property.
>>> import pprint
>>> pprint.pprint(list(points.geoms))
[<shapely.geometry.point.Point object at 0x...>,
<shapely.geometry.point.Point object at 0x...>]
The constructor also accepts another MultiPoint instance or an unordered sequence of Point instances, thereby making copies.
>>> MultiPoint([Point(0, 0), Point(1, 1)])
<shapely.geometry.multipoint.MultiPoint object at 0x...>
The MultiLineString constructor takes an unordered sequence of line-like sequences.
>>> from shapely.geometry import MultiLineString
>>> lines = MultiLineString([(a1, ..., aM), (b1, ..., bN), ...])
A MultiLineString has zero area and non-zero length.
>>> coords = [((0, 0), (1, 1)), ((-1, 0), (1, 0))]
>>> lines = MultiLineString(coords)
>>> lines.area
0.0
>>> lines.length
3.4142135623730949
Its x-y bounding box is a (minx, miny, maxx, maxy) tuple.
>>> lines.bounds
(-1.0, 0.0, 1.0, 1.0)
Its members are instances of LineString and are accessed via the geoms property.
>>> len(lines.geoms)
2
>>> pprint.pprint(list(lines.geoms))
[<shapely.geometry.linestring.LineString object at 0x...>,
<shapely.geometry.linestring.LineString object at 0x...>]
The constructor also accepts another instance of MultiLineString or an unordered sequence of LineString instances, thereby making copies.
>>> MultiLineString(lines)
<shapely.geometry.multilinestring.MultiLineString object at 0x...>
>>> MultiLineString(lines.geoms)
<shapely.geometry.multilinestring.MultiLineString object at 0x...>
The MultiPolygon constructor takes a sequence of exterior ring and hole list tuples.
>>> from shapely.geometry import MultiPolygon
>>> polygons = MultiPolygon([((a1, ..., aM), [(b1, ..., bN), ...]), ...])
More explicit notation for the exterior and interior boundaries (or shells and holes) makes usage clearer.
>>> shell = (a1, ..., aM)
>>> holes = [(b1, ..., bN), ...]
>>> polygons = MultiPolygon([(shell, holes), ...])
Perhaps even more clearly, the constructor accepts an unordered sequence of Polygon instances, thereby making copies.
>>> polygons = MultiPolygon([polygon, s, t])
>>> len(polygons.geoms)
3
Its x-y bounding box is a (minx, miny, maxx, maxy) tuple.
>>> polygons.bounds
(-1.0, -1.0, 2.0, 2.0)
An "empty" feature is one with a point set that coincides with the empty set: not a None, but like (). Empty features can be created by calling the constructors with no arguments. Almost no operations are supported by empty features.
>>> line = LineString()
>>> line.is_empty
True
>>> line.length
0.0
>>> line.bounds
()
The coordinates of an empty feature can be set, after which the geometry is no longer empty.
>>> line.coords = [(0, 0), (1, 1)]
>>> line.is_empty
False
>>> line.length
1.4142135623730951
>>> line.bounds
(0.0, 0.0, 1.0, 1.0)
Shapely features provide standard [1] predicate properties and methods. All return True or False.
Standard unary predicates are implemented as instance properties.
has_z: Returns True if the feature has not only x and y, but also z coordinates.
>>> Point(0, 0).has_z
False
>>> Point(0, 0, 0).has_z
True
is_empty: Returns True if the feature's interior and boundary (in point set terms) coincide with the empty set.
>>> Point().is_empty
True
>>> Point(0, 0).is_empty
False
is_ring: Returns True if the feature is closed. A closed feature's boundary coincides with the empty set. Applicable to LineString and LinearRing. This predicate is somewhat redundant considering that closed-ness is in practice a class attribute.
>>> LineString([(0, 0), (1, 1), (1, -1)]).is_ring
False
>>> LinearRing([(0, 0), (1, 1), (1, -1)]).is_ring
True
is_simple: Returns True if the feature does not cross itself. Operations on non-simple LineStrings are fully supported by Shapely.
>>> LineString([(0, 0), (1, 1), (1, -1), (0, 1)]).is_simple
False
is_valid: Returns True if a feature is "valid" in the sense of [1]. A valid LinearRing may not cross itself or touch itself at a single point. A valid Polygon may not possess any overlapping exterior or interior rings. A valid MultiPolygon may not collect any overlapping polygons. Operations on invalid features may fail.
>>> MultiPolygon([Point(0, 0).buffer(2.0), Point(1, 1).buffer(2.0)]).is_valid
False
The two points above are close enough that the polygons resulting from the buffer operations (explained in a following section) overlap.
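The overlap can be anticipated with simple arithmetic: two disks share interior points when the distance between their centers is less than the sum of their radii. The sketch below treats the buffers as ideal disks, which is an approximation since buffers are polygonal; disks_overlap is a hypothetical helper, not a Shapely function.

```python
import math

def disks_overlap(c1, r1, c2, r2):
    """True if the interiors of two ideal disks share at least one point."""
    distance = math.hypot(c2[0] - c1[0], c2[1] - c1[1])
    return distance < r1 + r2

# Centers (0, 0) and (1, 1) are about 1.41 apart, far less than 2.0 + 2.0.
print(disks_overlap((0, 0), 2.0, (1, 1), 2.0))  # True
```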
The is_valid predicate could be used to write a validating factory function that returns only valid objects.
from shapely.geometry.polygon import LinearRing
from shapely.geos import TopologicalError

# Shapely does not validate on construction, so check explicitly.
def valid_ring(coordinates):
    ring = LinearRing(coordinates)
    if not ring.is_valid:
        raise TopologicalError(
            "Given coordinates do not determine a valid LinearRing")
    return ring
An exception is raised when the function is passed the self-crossing coordinates from the is_simple example above.
>>> coords = [(0, 0), (1, 1), (1, -1), (0, 1)]
>>> valid_ring(coords)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 5, in valid_ring
shapely.geos.TopologicalError: Given coordinates do not determine a valid LinearRing
This might also be done using a decorator.
from functools import wraps
from shapely.geos import TopologicalError

def is_valid(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        ob = func(*args, **kwargs)
        if not ob.is_valid:
            raise TopologicalError(
                "Given arguments do not determine a valid geometric object")
        return ob
    return wrapper
>>> @is_valid
... def ring(coordinates):
...     return LinearRing(coordinates)
...
>>> ring(coords)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 7, in wrapper
shapely.geos.TopologicalError: Given arguments do not determine a valid geometric object
Standard binary predicates are implemented as instance methods. These predicates evaluate topological, set-theoretic relationships. In a few cases the results may not be what one might expect. All take another geometric object as argument and return True or False.
contains (other): Returns True if the object's interior contains the boundary and interior of the other object and their boundaries do not touch at all. This predicate applies to all types, and is inverse to within: a.contains(b) == b.within(a) always evaluates to True.
>>> coords = [(0, 0), (1, 1)]
>>> LineString(coords).contains(Point(0.5, 0.5))
True
>>> Point(0.5, 0.5).within(LineString(coords))
True
A line's endpoints are part of its boundary and are therefore not contained.
>>> LineString(coords).contains(Point(1.0, 1.0))
False
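The distinction between a line's interior and its boundary can be made concrete for a single straight segment. The sketch below classifies a point as lying in the interior, on the boundary (an endpoint), or in the exterior of a segment; it is a simplified illustration of the point-set idea, not the DE-9IM machinery used by GEOS, and locate_on_segment and its tol parameter are hypothetical.

```python
def locate_on_segment(p, a, b, tol=1e-12):
    """Classify point p against segment a-b as 'interior', 'boundary' or 'exterior'."""
    if p == a or p == b:
        return "boundary"  # the endpoints form the boundary of the segment
    (px, py), (ax, ay), (bx, by) = p, a, b
    cross = (bx - ax) * (py - ay) - (by - ay) * (px - ax)
    if abs(cross) > tol:
        return "exterior"  # p is not collinear with a and b
    dot = (px - ax) * (bx - ax) + (py - ay) * (by - ay)
    sq_len = (bx - ax) ** 2 + (by - ay) ** 2
    return "interior" if 0 < dot < sq_len else "exterior"

print(locate_on_segment((0.5, 0.5), (0, 0), (1, 1)))  # interior
print(locate_on_segment((1.0, 1.0), (0, 0), (1, 1)))  # boundary
```

Under this reading, contains is true for the first point and false for the second, matching the results above.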
crosses (other): Returns True if the interior of the object intersects the interior of the other but does not contain it, and the dimension of the intersection is less than the dimension of the one or the other.
>>> LineString(coords).crosses(LineString([(0, 1), (1, 0)]))
True
A line does not cross a point that it contains.
>>> LineString(coords).crosses(Point(0.5, 0.5))
False
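For two straight segments, crossing at a single point interior to both can be tested with orientation signs. This is a minimal sketch of that classic test, offered as an illustration rather than as the algorithm GEOS uses; orientation and segments_cross are hypothetical names.

```python
def orientation(a, b, c):
    """Sign of the turn a -> b -> c: 1 for left, -1 for right, 0 for collinear."""
    v = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    return (v > 0) - (v < 0)

def segments_cross(a, b, c, d):
    """True if segments a-b and c-d meet at one point interior to both."""
    return (orientation(a, b, c) * orientation(a, b, d) < 0 and
            orientation(c, d, a) * orientation(c, d, b) < 0)

print(segments_cross((0, 0), (1, 1), (0, 1), (1, 0)))  # True
```

Segments that merely meet at an endpoint give a zero orientation and are reported as not crossing, which mirrors the difference between crosses and touches.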
disjoint (other): Returns True if the boundary and interior of the object do not intersect at all with those of the other. This predicate applies to all types and is the inverse of intersects.
>>> Point(0, 0).disjoint(Point(1, 1))
True
equals (other): Returns True if the set-theoretic boundary, interior, and exterior of the object coincide with those of the other. The coordinates passed to the object constructors are of these sets, and determine them, but are not the entirety of the sets. This is a potential "gotcha" for new users. Equivalent lines, for example, can be constructed differently:
>>> a = LineString([(0, 0), (1, 1)])
>>> b = LineString([(0, 0), (0.5, 0.5), (1, 1)])
>>> c = LineString([(0, 0), (0, 0), (1, 1)])
>>> a.equals(b)
True
>>> b.equals(c)
True
This predicate should not be mistaken for Python's == or is constructions.
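One way to see why the three lines above are equal is to collapse each coordinate sequence to its essential vertices by dropping repeated points and vertices that lie on a straight run. The sketch below does this for simple open polylines traversed in the same direction; it uses exact floating point comparison, which happens to work for these values, and it is not how GEOS evaluates equals. The name normalize is hypothetical.

```python
def normalize(coords):
    """Drop repeated points and vertices collinear with their neighbours."""
    pts = []
    for p in coords:
        p = (float(p[0]), float(p[1]))
        if pts and p == pts[-1]:
            continue  # repeated point adds nothing to the point set
        while len(pts) >= 2:
            (ax, ay), (bx, by) = pts[-2], pts[-1]
            cross = (bx - ax) * (p[1] - ay) - (by - ay) * (p[0] - ax)
            dot = (bx - ax) * (p[0] - bx) + (by - ay) * (p[1] - by)
            if cross == 0 and dot > 0:
                pts.pop()  # previous vertex lies on the straight run
            else:
                break
        pts.append(p)
    return pts

a = normalize([(0, 0), (1, 1)])
b = normalize([(0, 0), (0.5, 0.5), (1, 1)])
c = normalize([(0, 0), (0, 0), (1, 1)])
print(a == b == c)  # True
```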
intersects (other): Returns True if the boundary and interior of the object intersect in any way with those of the other. It is the super-relation of contains, crosses, equals, touches, and within.
touches (other): Returns True if the boundary of the object intersects only the boundary of the other, and their interiors do not intersect with any part of the other. Overlapping features do not therefore "touch", another potential "gotcha".
For example, the following lines touch at (1, 1), but do not overlap.
>>> a = LineString([(0, 0), (1, 1)])
>>> b = LineString([(1, 1), (2, 2)])
>>> a.touches(b)
True
within (other): Returns True if the object's boundary and interior intersect only with the interior of the other (not its boundary or exterior). This applies to all types and is the inverse of contains.
Shapely geometric objects have several methods that yield new objects not derived from their point sets. TODO: improve this.
buffer (distance, quadsegs=16): Returns a new object that contains, approximately, all points within a given distance of the original object. A positive distance has an effect of dilation; a negative distance, erosion. The optional quadsegs argument determines the number of segments used to approximate a quarter circle around a point.
>>> line = LineString([(0, 0), (1, 1), (0, 2), (2, 2), (3, 1), (1, 0)])
>>> dilated = line.buffer(0.5)
>>> eroded = dilated.buffer(-0.3)
Figure 1. Dilation of a line (left) and erosion of a polygon (right). New object is shown in blue.
The default buffer of a point is a polygonal patch with 99.8% of the area of the disk it approximates.
>>> p = Point(0, 0).buffer(10.0)
>>> len(p.exterior.coords)
66
>>> p.area
313.65484905459385
With a quadsegs value of 1, the buffer is a square patch.
>>> q = Point(0, 0).buffer(10.0, 1)
>>> len(q.exterior.coords)
5
>>> q.area
200.0
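Both areas above agree with the area of a regular polygon inscribed in the circle, with 4 × quadsegs sides: A = (n / 2) r² sin(2π / n). The following sketch evaluates that formula. That the buffer vertices lie exactly on the circle is an assumption inferred from the reported numbers, and inscribed_polygon_area is a hypothetical helper.

```python
import math

def inscribed_polygon_area(radius, quadsegs=16):
    """Area of a regular polygon with 4 * quadsegs sides inscribed in a circle."""
    n = 4 * quadsegs
    return 0.5 * n * radius ** 2 * math.sin(2 * math.pi / n)

default_area = inscribed_polygon_area(10.0)     # about 313.65
square_area = inscribed_polygon_area(10.0, 1)   # 200.0
print(default_area / (math.pi * 10.0 ** 2))     # about 0.998
```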
Passed a distance of 0, buffer can be used to "clean" self-touching polygons such as the classic "bowtie".
>>> coords = [(0, 0), (0, 2), (1, 1), (2, 2), (2, 0), (1, 1), (0, 0)]
>>> bowtie = Polygon(coords)
>>> bowtie.is_valid
False
>>> clean = bowtie.buffer(0)
>>> clean.is_valid
True
>>> clean
<shapely.geometry.multipolygon.MultiPolygon object at ...>
>>> len(clean)
2
>>> list(clean[0].exterior.coords)
[(0.0, 0.0), (0.0, 2.0), (1.0, 1.0), (0.0, 0.0)]
>>> list(clean[1].exterior.coords)
[(1.0, 1.0), (2.0, 2.0), (2.0, 0.0), (1.0, 1.0)]
Buffering splits the polygon in two at the point where they touch.
convex_hull: Returns the smallest convex Polygon containing all the points in the object unless the number of points in the object is less than three. For two points, the convex hull collapses to a LineString; for 1, a Point.
Figure 1. Convex hull (blue) of 6 points (left) and of 2 points (right).
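To make the idea of a convex hull concrete, here is a pure-Python sketch using Andrew's monotone chain algorithm. It is an independent illustration and not the algorithm GEOS necessarily uses; the function returns hull vertices as a list rather than a geometric object, and its name and return convention are assumptions of this example.

```python
def convex_hull(points):
    """Andrew's monotone chain: hull vertices in counter-clockwise order."""
    pts = sorted(set((float(x), float(y)) for x, y in points))
    if len(pts) <= 2:
        return pts  # one point or a two-point line, as described above

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]

hull = convex_hull([(0, 0), (2, 0), (2, 2), (0, 2), (1, 1), (1, 0.5)])
print(hull)
```

The two interior points are discarded and the four corners remain.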
simplify (tolerance, preserve_topology=True): Returns a simplified version of the geometric object. All points in the simplified object will be within the tolerance distance of the original geometry. By default a slower algorithm is used that preserves topology. If preserve_topology is set to False the much quicker Douglas-Peucker algorithm (TODO: cite) is used, and invalid geometries may result.
Figure 1. Simplification of a nearly circular polygon of radius 1 using a tolerance of 0.1 (left) and 0.5 (right).
>>> p = Point(0.0, 0.0)
>>> x = p.buffer(1.0)
>>> x.area
3.1365484905459389
>>> len(x.exterior.coords)
66
>>> s = x.simplify(0.05, preserve_topology=False)
>>> s.area
3.0614674589207187
>>> len(s.exterior.coords)
17
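The Douglas-Peucker algorithm mentioned above can be sketched in pure Python for an open polyline. This is a textbook recursive version given for illustration only, not the GEOS implementation; the function names are hypothetical, and the input is assumed to be a list of (x, y) tuples.

```python
import math

def point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))  # clamp the projection onto the segment
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def douglas_peucker(coords, tolerance):
    """Keep only vertices farther than tolerance from the simplified line."""
    if len(coords) < 3:
        return list(coords)
    first, last = coords[0], coords[-1]
    index, dmax = 0, 0.0
    for i in range(1, len(coords) - 1):
        d = point_segment_distance(coords[i], first, last)
        if d > dmax:
            index, dmax = i, d
    if dmax <= tolerance:
        return [first, last]
    left = douglas_peucker(coords[:index + 1], tolerance)
    right = douglas_peucker(coords[index:], tolerance)
    return left[:-1] + right  # avoid repeating the split vertex

print(douglas_peucker([(0, 0), (1, 0.05), (2, 0)], 0.1))  # [(0, 0), (2, 0)]
```

Because each segment is considered in isolation, nothing prevents the simplified line from crossing itself, which is why invalid geometries may result.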
TODO: Finish sections below this.
>>> polygon.boundary
<shapely.geometry.linestring.LineString object at ...>
>>> line_b.boundary
<shapely.geometry.multipoint.MultiPoint object at ...>
>>> point_r.boundary.is_empty
True
>>> hull = multi_point.convex_hull
>>> polygon.difference(hull)
<shapely.geometry.polygon.Polygon object at ...>
>>> polygon.envelope
<shapely.geometry.polygon.Polygon object at ...>
>>> polygon.intersection(hull)
<shapely.geometry.polygon.Polygon object at ...>
>>> polygon.symmetric_difference(hull)
<shapely.geometry.multipolygon.MultiPolygon object at ...>
Point unions were demonstrated above under convex hull. The union of polygons will be a polygon or a multi-polygon depending on whether they intersect or not:
>>> hull.union(polygon)
<shapely.geometry.polygon.Polygon object at ...>
>>> from shapely.ops import polygonize
>>> lines = [
... ((0, 0), (1, 1)),
... ((0, 0), (0, 1)),
... ((0, 1), (1, 1)),
... ((1, 1), (1, 0)),
... ((1, 0), (0, 0))
... ]
>>> result = polygonize(lines)
>>> list(result.geoms)
[<shapely.geometry.polygon.Polygon object at ...>, <shapely.geometry.polygon.Polygon object at ...>]
>>> lines = MultiLineString([
... ((0, 0), (1, 1)),
... ((2, 0), (2, 1), (1, 1))
... ])
>>> from shapely.ops import linemerge
>>> result = linemerge(lines)
>>> result # doctest: +ELLIPSIS
<shapely.geometry.linestring.LineString object at 0x...>
>>> Point(0,0).distance(Point(1,1))
1.4142135623730951
Shapely provides 4 avenues for interoperation with other Python and GIS software.
The WKT representation of any geometry object can be had via the wkt attribute:
>>> point_r.wkt
'POINT (-1.5000000000000000 1.2000000000000000)'
Hex-encode that string and you have a value that can be conveniently inserted directly into PostGIS
>>> point_r.wkt.encode('hex')
'504f494e5420282d312e3530303030303030303030303030303020312e3230303030303030303030303030303029'
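The str.encode('hex') call shown above exists only in Python 2. On Python 3 the same hex encoding of a text string can be obtained with the standard binascii module, as in this sketch; the WKT literal is copied from the output above rather than produced by Shapely.

```python
import binascii

wkt = "POINT (-1.5000000000000000 1.2000000000000000)"

# Encode the text to bytes first, then hex-encode the bytes.
hexed = binascii.hexlify(wkt.encode("ascii")).decode("ascii")
print(hexed)

# Decoding reverses the process and recovers the original string.
restored = binascii.unhexlify(hexed).decode("ascii")
```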
New geometries can be created from WKT representations using the shapely.wkt.loads factory (inspired by the pickle module)
>>> from shapely.wkt import loads
>>> loads('POINT (0 0)')
<shapely.geometry.point.Point object at ...>
The WKB representation of any geometry object can be had via the wkb attribute. New geometries can be created from WKB data using the shapely.wkb.loads factory. Use this format to interoperate with ogr.py:
>>> import ogr
>>> from shapely.wkb import loads
>>> source = ogr.Open("/tmp/world_borders.shp")
>>> borders = source.GetLayerByName("world_borders")
>>> feature = borders.GetNextFeature()
>>> loads(feature.GetGeometryRef().ExportToWkb())
<shapely.geometry.polygon.Polygon object at ...>
Shapely geometries provide the Numpy array interface which means that points, line strings, and polygon rings can be used as Numpy arrays:
>>> from numpy import array
>>> a = array(polygon.exterior)
>>> a
array([[-1., -1.],
[-1., 1.],
[ 1., 1.],
[ 1., -1.],
[-1., -1.]])
The numpy.asarray function does not copy coordinate values, at the price of slower NumPy access to the coordinates.
The shapely.geometry.as* functions can also be used to wrap numpy arrays, which can then be analyzed using Shapely while maintaining their original storage. A 1 x 2 array can be adapted to a point
>>> from shapely.geometry import asPoint
>>> a = array([1.0, 2.0])
>>> pa = asPoint(a)
>>> pa.wkt
'POINT (1.0000000000000000 2.0000000000000000)'
and a N x 2 array can be adapted to a line string
>>> from shapely.geometry import asLineString
>>> a = array([[1.0, 2.0], [3.0, 4.0]])
>>> la = asLineString(a)
>>> la.wkt
'LINESTRING (1.0000000000000000 2.0000000000000000, 3.0000000000000000 4.0000000000000000)'
There is no Numpy array representation of a polygon.
Any object that provides the GeoJSON-like Python geo interface can be adapted and used as a Shapely geometry using the shapely.geometry.asShape function. For example, a dictionary:
>>> from shapely.geometry import asShape
>>> d = {"type": "Point", "coordinates": (0.0, 0.0)}
>>> shape = asShape(d)
>>> shape.geom_type
'Point'
>>> list(shape.coords)
[(0.0, 0.0)]
Or a simple placemark-type object:
>>> class GeoThing(object):
...     def __init__(self, d):
...         self.__geo_interface__ = d
>>> thing = GeoThing({"type": "Point", "coordinates": (0.0, 0.0)})
>>> shape = asShape(thing)
>>> shape.geom_type
'Point'
>>> list(shape.coords)
[(0.0, 0.0)]
If you want to copy coordinate data to a new geometry, use the shapely.geometry.shape function instead.
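The geo interface convention itself is small enough to sketch without Shapely: an object either is a GeoJSON-like mapping or exposes one through a __geo_interface__ attribute. The helper below, geo_mapping, is hypothetical and only illustrates how a consumer might look up that mapping for simple geometries; it is not how asShape or shape are implemented.

```python
def geo_mapping(obj):
    """Return the GeoJSON-like dict for a dict or an object with __geo_interface__."""
    mapping = getattr(obj, "__geo_interface__", obj)
    if (not isinstance(mapping, dict)
            or "type" not in mapping or "coordinates" not in mapping):
        raise ValueError("object does not provide a GeoJSON-like geometry mapping")
    return mapping

class GeoThing(object):
    """Placemark-like object, as in the example above."""
    def __init__(self, d):
        self.__geo_interface__ = d

point_dict = {"type": "Point", "coordinates": (0.0, 0.0)}
print(geo_mapping(point_dict)["type"])                   # Point
print(geo_mapping(GeoThing(point_dict))["coordinates"])  # (0.0, 0.0)
```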
Shapely provides functions for efficient operations on large sets of geometries.
To find the subset of points that are contained within a polygon, use shapely.iterops.contains:
>>> from shapely.geometry import Polygon
>>> from shapely.geometry import Point
>>> coords = ((0.0, 0.0), (0.0, 1.0), (1.0, 1.0), (1.0, 0.0), (0.0, 0.0))
>>> polygon = Polygon(coords)
>>> points = [Point(0.5, 0.5), Point(2.0, 2.0)]
>>> from shapely import iterops
>>> list(iterops.contains(polygon, points, True))
[<shapely.geometry.point.Point object at ...>]
The second parameter to iterops.contains can be any kind of iterator, even a generator of objects. If it yields tuples, then the second element of the tuple will be ultimately yielded from iterops.contains.
>>> list(iterops.contains(polygon, iter((p, p.wkt) for p in points)))
['POINT (0.5000000000000000 0.5000000000000000)']
Shapely geometries can be pre-analyzed into a state that supports more efficient batches of operations. To test one polygon containment against a large batch of points, one should first use the prepared.prep function:
>>> from shapely.geometry import Point
>>> from shapely.prepared import prep
>>> points = [...] # large list of points
>>> polygon = Point(0.0, 0.0).buffer(1.0)
>>> prepared_polygon = prep(polygon)
>>> prepared_polygon
<shapely.prepared.PreparedGeometry object at ...>
>>> hits = filter(prepared_polygon.contains, points)
Prepared geometry instances have the following methods: contains, contains_properly, covers, and intersects. All have exactly the same arguments and usage as their counterparts in the standard geometries.
Shapely is written by Sean Gillies with contributions from Aron Bierbaum, Howard Butler, Kai Lautaportti (Hexagon IT), Frédéric Junod (Camptocamp SA), Eric Lemoine (Camptocamp SA) and ctypes tips from Justin Bronn (GeoDjango).
| [1] | John R. Herring, Ed., “OpenGIS Implementation Specification for Geographic information - Simple feature access - Part 1: Common architecture,” Oct. 2006. |
| [2] | M.J. Egenhofer and John R. Herring, Categorizing Binary Topological Relations Between Regions, Lines, and Points in Geographic Databases, Orono, ME: University of Maine, 1991. |
| [3] | E. Clementini, P. Di Felice, and P. van Oosterom, “A Small Set of Formal Topological Relationships Suitable for End-User Interaction,” Third International Symposium on Large Spatial Databases (SSD). Lecture Notes in Computer Science no. 692, David Abel and Beng Chin Ooi, Eds., Singapore: Springer Verlag, 1993, pp. 277-295. |
| [4] | C. Strobl, “Dimensionally Extended Nine-Intersection Model (DE-9IM),” Encyclopedia of GIS, S. Shekhar and H. Xiong, Eds., Springer, 2008, pp. 240-245. |
| [5] | Martin Davis, “JTS Technical Specifications,” Mar. 2003. |