这是indexloc提供的服务,不要输入任何密码
Skip to content

Conversation

@jpolchlo
Copy link

@jpolchlo jpolchlo commented Aug 8, 2017

This PR makes it possible to display map layers that are furnished by a remote service with a TMS-like web interface. With this change, any data-handling back end that implements a simple protocol may display map data in the GeoNotebook web interface.

[Note: What we are calling "TMS" may not adhere strictly to the definition of a TMS server. We'd like to name this interface in a way that is consistent with the community's conventions, such that we don't create confusion further down the line. We'll take recommendations regarding the proper nomenclature to use.]

The interface for the TMS server assumes a standard power-of-two pyramid established over the global extent using a WebMercator projection. At zoom level n, there are 2^n partitions of the extent along each axis, indexed from 0 to 2^n-1, with the cartesian product of the two axes' partitions giving the space of (x, y) keys for which a tile may exist. [Note: At the moment, we index using the Google/Bing/OSM convention where (0, 0) corresponds to the extreme northwest tile. If necessary, one could accommodate a flag to indicate that the tiles are being served in accordance with the TMS standard of an origin in the extreme southwest.]

To add a TMS-provided layer, one needs to provide an object that adheres to the following simple interface:

class TMS:
    def bind(self, host, port=None):
        # binds a service to a host at a specified port (random port if port=None)
        # subsequent calls to bind should fail until unbind is called
    def unbind(self):
        # stops the tile server from listening for further requests
        # subsequent calls to bind to a new host/port may restart the service
    @property
    def host(self):
        # the name of the host listening for tile requests
    @property
    def port(self):
        # returns the port number the service is listening to requests on
    @property
    def url_pattern(self):
        # returns a string template of the URL tile requests can be made on
        # the special tokens {z}, {x}, and {y} should be used as placeholders 
        # for the zoom level and x and y coordinates, respectively

Objects that are duck-typed with this interface (provide these methods/properties) may be wrapped in a TMSRasterData (from geonotebook.wrappers) and passed to M.add_layer as per usual. The TMS server will be bound to a port at the time of creation; removing a TMS-backed layer will unbind the service. [Note: I have not implemented special handling of the standard styling options (e.g., opacity) and have no expectation that they will have any effect. A subsequent PR could address this.]

jamesmcclain and others added 30 commits April 17, 2017 12:42
Also contains misguided attempt to put the tile server into the Jupyter
process.  The fact that a Spark context is needed there combines with
other factors to make this difficult.
Also contains misguided attempt to put the tile server into the Jupyter
process.  The fact that a Spark context is needed there combines with
other factors to make this difficult.
Tile server is very slow, trying to avoid thrashing.
…layer

Drop Layer Handles When `remove_layer` Is Called
Signed-off-by: jpolchlo <jpolchlopek@azavea.com>
Signed-off-by: jpolchlo <jpolchlopek@azavea.com>
echeipesh and others added 19 commits June 26, 2017 13:28
…-protobuf

Update GeoNoteBook to Work with ProtoBufSerialization
Signed-off-by: jpolchlo <jpolchlopek@azavea.com>
Signed-off-by: jpolchlo <jpolchlopek@azavea.com>
…nt-statements

Remove Stray Print Statements
…hronous fetches of remotely-served tiles

Signed-off-by: jpolchlo <jpolchlopek@azavea.com>
Signed-off-by: jpolchlo <jpolchlopek@azavea.com>
Signed-off-by: jpolchlo <jpolchlopek@azavea.com>
…lay TMS layers via proxy provider. Create TMSRasterData wrapper around object with bind and unbind methods, port, host, and url_pattern properties.

Signed-off-by: jpolchlo <jpolchlopek@azavea.com>
Signed-off-by: jpolchlo <jpolchlopek@azavea.com>
Signed-off-by: jpolchlo <jpolchlopek@azavea.com>
Signed-off-by: jpolchlo <jpolchlopek@azavea.com>
Signed-off-by: jpolchlo <jpolchlopek@azavea.com>
@kotfic
Copy link
Contributor

kotfic commented Aug 8, 2017

@jpolchlo Thanks! I'll start looking at this today/tomorrow

@jpolchlo
Copy link
Author

jpolchlo commented Aug 8, 2017

As far as testing is concerned, we have been using locationtech-labs/geopyspark to produce TMS servers. We can work together to build a test notebook if you need.

@aashish24
Copy link
Member

@jpolchlo this is great and thank you for the contribution.

@kotfic
Copy link
Contributor

kotfic commented Aug 14, 2017

@jpolchlo still trying to wrap my head around this a little do you have an example of the kind of object that TMSRasterData is supposed to wrap? Specifically it seems like the primary motivation for the bind/unbind functionality is handling randomly assigned port values?

If you could give me a better sense of what goes on inside the unbind function that TMSRasterData calls I think it would help me understand a little better. If i'm not mistaken needing to call unbind() is what motivates the introduction of disgorge and the _server_state variable passing through kernel.py/layer.py?

setup.py Outdated
# ensuring you can import mapnik before continuing to install GeoNotebook
# using Python 3.
# """)
# sys.exit(1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's change this to "Python bindings for Mapnik (https://github.com/mapnik/python-mapnik) are highly suggested for running GeoNotebook. ..." and then remove the sys.exit()

setup.py Outdated
"notebook",
"fiona",
"mapnik",
# "mapnik",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can just remove this given the above comment.

@jpolchlo
Copy link
Author

Yes, disgorge and _server_state are there to support clean shutdown of the TMS servers—not specifically to handle the random port assignment, though. Conceivably, the vis server could pass through a port assignment. It also seemed like a reasonable way to introduce state to the vis servers, in case that is useful in the future. I half expect that there's something I didn't consider in the system design that makes this dangerous.

The TMS object that I'm working with defines unbind() as in https://github.com/locationtech-labs/geopyspark/blob/master/geopyspark/geotrellis/tms.py#L176-L184, which calls out to https://github.com/locationtech-labs/geopyspark/blob/master/geopyspark-backend/geotrellis/src/main/scala/geopyspark/geotrellis/tms/Server.scala#L44-L48 (which in turn calls to the Scala code in that directory). It is entirely intended to shut down the server, free ports, and generally clean up after itself. In this case, there are processes in the background doing request aggregation so that we make fewer calls out to Spark that need to be terminated.

Signed-off-by: jpolchlo <jpolchlopek@azavea.com>
Copy link
Contributor

@kotfic kotfic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First thank you for all your hard work on this PR @jpolchlo . You've revealed several inadequacies in the API that i'd like to try and resolve and I'm very grateful.

There are, however, a few things here that I think maybe be too specific to go into the code base. Ideally those specific things could be pulled out into a third party extension (python package) though you will be the first external contributor to do that and so we may need to make some changes in the API surface to support that (e.g. things like disgorge).

If its possible i'd like to try and move functionality in TMSRasterData into a reader attached to a RasterData object, and functionality in TMSTileLayer into the vis server's ingest function. If this is possible then Geotrellis/GeoPySpark's TMS specific functionality could be put into an external package easily. We're happy to work to support whatever changes need to happen to the generic API to make that possible.

More specific comments have been added inline.
Thanks again!

from .utils import get_kernel_id
from .wrappers import RasterData, RasterDataCollection, VectorData
from .wrappers import (RasterData,
TMSRasterData,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general i'd like to try and avoid introducing new wrapper types like TMSRasterData and instead push functionality into reader's that expose the correct data/metadata through the RasterData API. It seems like the primary reason to do this is to test TMSRasterData in kernel.py and add a TMSTileLayer? Is that correct? If so would you consider checking data.reader for some kind of TMSRasterDataReader class that was added through something like a tms:// schema to setup.py so it uses a different reader?

name, remote, data=data, vis_url=vis_url, **kwargs
)

self.data = data
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be picked up from subclassing DataLayer - was there a reason to not subclass DataLayer?

self.data.bind("0.0.0.0")

vis_options = self.vis_options.serialize()
vis_options.update(kwargs)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It probably makes sense to extend the vis_options with kwargs, but its not clear to me that all kwargs passed to a layer will be visualization options. It seems like maybe we could look at adding white lists of options to RasterStyleOptions and VectorStyleOptions so that we could pass any of those that are passed into a Layer onto those classes then just call serialize - as is adding a layer, and then refreshing the page will not pick up these additional kwargs options on layer re-render (because they are not apart of the vis_options state maintained on the layer object).

)

self.data = data
self.data.bind("0.0.0.0")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the exception of the call to bind() it seems like SimpleLayer could be extended to incorporate all this functionality. This would be desirable, because then we would not need to check for TMSRasterData in kernel.py.

If i am reading this correctly, the call to bind is really a feature of the remote TMS Server? In which case i'm not sure the bind function belongs on the TMSRasterData API. Ideally the data nessisary for bind (e.g. the state maintained on the TMSRasterData object) should be used to call whatever initialization is necessary in the ingest function. Is this something that could be moved inside the vis_server?

self.servers = { } # Dictionary servers.

self._kernel = kernel
self._server_state = { }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Initially the vis server was intended to be an ephemeral object that implemented an API that coordinated information between the tornado process and the kernel process. The coordinating state was intended to be statically defined in the config file.

In general, I'm hesitant to introduce state now unless there is an immediate and compelling need. I'm also specifically concerned about maintaining the vis_server's state on a different object (e.g. the Geonotebook object). If there is a compelling need to maintain state then i'd like to see if we can think of a better solution.

Again, i may be mis-reading this, but it seems like the immediate need is motivated by the introduction of the disgorge and the need to call unbind on the tms object passed into TMSRasterData?

"KTile.ingest() returned error {}:\n\n{}".format(
r.status_code, r))

def disgorge(self, name, **kwargs):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Disgorge makes sense to me as a method that is missing from the vis_server API. It seems like if vis_server's have a shot at the data on add_layer() they should get a shot at the data on remove_layer(). With that said, it would be great if the API's could be similar (e.g. disgorge taking a RasterData data object rather than just a name).

If it took the RasterData object, and the RasterData object had the tms object saved as apart of it's state, then it should be able to call unbind without having to save state in _server_state right?

assign a random port at the specified host if the 'port' argument is unset.
"""
def __init__(self, tms, name=None):
self.tms = tms
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering if this tms object is something that can be created inside a reader rather than passed in directly to the TMSRasterData object?

The original goal was to provide 'uri' style syntax to RasterData() so we could reference a bunch of different data sources (e.g. files, postgis databases, exotic filesystems etc). That maybe be a pipe dream, but maybe you can point me towards where/how this tms object is actually being instantiated?

@jpolchlo
Copy link
Author

jpolchlo commented Sep 5, 2017

This is a great idea; I hadn't noticed the option of creating a new reader type, but it's fairly clear that it's the right thing to do. It may also resolve some of the questions below regarding bind and the TMS API. If we were to use RasterData('http://localhost:<port>/.../{z}/{x}/{y}.png') (though the argument for a tms schema is readily apparent), then it removes the responsibility from the reader class to start up the TMS endpoint, as it obviously needs to exist before that call can be made. In this case, using GeoPySpark constructs, adding a TMS layer will take the following form

tms = TMS.build(...)
tms.bind()
M.add_layer(RasterData(tms.url_pattern), display_option_1, display_option_2)

while removing it will just require M.remove_layer to be followed by tms.unbind().

The problem with such a solution is that TMS objects can be lost easily in this manner, and the ports continue to run backed by data with references that can never be released. It's easy to suggest that the user needs to take ultimate responsibility, but in an interactive environment with rapid iteration, it should be expected that, unless it's rolled into the reader, these objects will be mishandled. That's the motivation for the ingest/disgorge pairing managing the bind/unbind process. So, for sake of ease of use, and in the interest of reducing the bookkeeping burden from the user, figuring out how to make RasterData(tms) into valid syntax is worthwhile. If we must rely on a string argument so that the schema can be extracted and routed through the entry point mechanism, then some significant ease of use will be lost. It already seems as if VRTs get some special treatment inside the Ktile vis server, so figuring out how special display objects can generalize would be worthwhile. I'd love to get your thoughts on this.

I think in either case, documentation of the reader interface would be of use.

And, yes, it seems that if I move over to using the appropriate reader structure, then we will remove the need for state to be carried, and that change can be reverted.

@jpolchlo jpolchlo closed this Jul 16, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants