Skip to content

Sensorposition extractor should organize by plot #238

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
4 tasks
max-zilla opened this issue Jan 25, 2017 · 15 comments
Closed
4 tasks

Sensorposition extractor should organize by plot #238

max-zilla opened this issue Jan 25, 2017 · 15 comments
Assignees

Comments

@max-zilla
Copy link
Contributor

After discussion with me, @dlebauer , @robkooper , we have an updated organization strategy for geodata to better integrate with geodashboard. The current extractor code is here, deployed to a VM but currently not running pending these updates:
https://github.com/terraref/extractors-metadata/tree/master/sensorposition

the old way
geostream sensor = a site, e.g. Maricopa
geostream stream = an instrument, e.g. VNIR
geostream datapoint = a single moment in time dataset

the new way
geostream sensor = plot identifier, e.g. "season 1 range 1 row 1"
geostream stream = an instrument (there will be separate stream for each plot x instrument)
geostream datapoint = a single moment in time dataset

Basically:

  • we want a separate sensor for each of the ~850 plots we have on the field.
  • we want a stream for each plot x instrument, e.g. for 850 plots with 10 instruments = 8,500 streams
  • datapoint is basically the same

This will require some changes to the extractor.

  • need shapefile definitions of plot boundaries to include with extractor
  • need lookup logic to take lat/lon of dataset and determine which plot it intersects with/is contained by
  • need to update logic to dynamically create sensor if it doesn't exist (already written; just tweak)
  • need to update logic to dynamically create stream if it doesn't exist (already written; just tweak)

@dlebauer do we have such a shapefile readily available?
@Zodiase @yanliu-chn is this a fix I could ask your team to make once shapefile is available?

@max-zilla max-zilla self-assigned this Jan 25, 2017
@dlebauer
Copy link
Member

@max-zilla the plot boundaries can be queried from terraref.ncsa.illinois.edu/bety. The 'canonical' plots (full size). Unfortunately ... the site isn't up. Presumably if these boundaries are stored in the geostreams database, they can be called by extractors that want to know how to subset a plot (#159)?

@dlebauer
Copy link
Member

@max-zilla here is a set of shapefiles https://drive.google.com/drive/u/0/folders/0B5btztQSCEFxVUZ6SlRHVDBpdWM that represent the plots excluding the boundary rows. These are the shapes that should be used for subsetting images.

@max-zilla
Copy link
Contributor Author

OK. I included the shapefiles in the repo for now.

Looking up plots dynamically via Bety has some appeal. We could also potentially store in the geostreams database, that would make the extractor incompatible with generic Clowder deployments (without custom geostreams plot table) but the extractor's so specific to our project that maybe that's fine.

The shapefile is the best short-term option so we can at least start to get the geodashboard piece simmering.

@max-zilla
Copy link
Contributor Author

@Zodiase @yanliu-chn I think we could use Shapely to query the .shp by lat/long and get polygons (plots) the coords intersect with in python.

@ghost ghost assigned yanliu-chn Jan 26, 2017
@dlebauer
Copy link
Member

dlebauer commented Jan 26, 2017

@max-zilla
Copy link
Contributor Author

@yanliu-chn the RangePass field of the polygon is what we want - basically unique identifier for the plot.

If the lat/lon is between 2 polygons, the nearest plot should be sufficient.

@yanliu-chn
Copy link

Done. Please take the python file in /projects/arpae/sw/utilities/ and load it as a module.

export PYTHONPATH=`pwd`:$PYTHONPATH
python
>>> import plotid_by_latlon
>>> plotid_by_latlon.plotQuery('data/sorghumexpfall2016v5_lblentry_1to7.shp', -111.97495668222, 33.0760167027358)

@yanliu-chn
Copy link

yanliu-chn commented Jan 26, 2017

# plotid_by_latlon.py: Query plot id by <lon, lat> point
# Yan Y. Liu <yanliu@illinois.edu>
# 01/26/2017
# Usage: python plotid_by_latlon.py shpfile lon lat
# Output: id of the plot which contains, touches, or is closest to the point. None if something wrong
# Dependency: GDAL 2.0+ with GEOS, PROJ4, and python library support
import sys, os, string
from osgeo import gdal
from osgeo import ogr
from osgeo import osr

def plotQuery(shpFile = None, lon = 0, lat = 0):
    if not os.path.exists(shpFile):
        print "plotQuery(): ERROR shp file does not exist: " + str(shpFile)
        return None

    # open plot shp file
    ds = gdal.OpenEx( shpFile, gdal.OF_VECTOR | gdal.OF_READONLY)
    if ds is None :
        print "plotQuery(): ERROR Open failed: " + str(shpFile) + "\n"
        return None
    layerName = os.path.basename(shpFile).split('.shp')[0]
    lyr = ds.GetLayerByName( layerName )
    if lyr is None :
        print "plotQuery(): ERROR fetch layer: " + str(layerName) + "\n"
        return None

    # load shp file
    lyr.ResetReading()
    num_records = lyr.GetFeatureCount()
    lyr_defn = lyr.GetLayerDefn()
    t_srs = lyr.GetSpatialRef()

    # create point
    point = ogr.Geometry(ogr.wkbPoint)
    point.SetPoint_2D(0, lon, lat)
    s_srs = osr.SpatialReference()
    s_srs.ImportFromEPSG(4326)
    transform = osr.CoordinateTransformation(s_srs, t_srs)
    point.Transform(transform)

    fi_rangepass = lyr_defn.GetFieldIndex('RangePass')
    fi_range = lyr_defn.GetFieldIndex('Range')
    fi_pass = lyr_defn.GetFieldIndex('Pass')
    fi_macentry = lyr_defn.GetFieldIndex('MAC_ENTRY')

    min = 1000000000.0
    minid = None
    for f in lyr: # for each plot
        plotid = f.GetFieldAsString(fi_rangepass)
        rangeid = f.GetFieldAsInteger(fi_range)
        passid = f.GetFieldAsInteger(fi_pass)
        macentryid = f.GetFieldAsInteger(fi_macentry)
        geom = f.GetGeometryRef()
        if (geom.Contains(point) or geom.Touches(point)): # GDAL needs to support Covers() for better efficiency
            #print "plotQuery(): INFO point in plot"
            ds = None
            return plotid
        # calc distance and update nearest
        d = geom.Distance(point)
        if (d < min):
            min = d
            minid = plotid

    ds = None
    if minid is None:
        print "plotQuery(): ERROR searched but couldn't find nearest plot. Check data file or the point. "
        return None
    #print "plotQuery(): INFO point not in plot"
    return minid

# Example run:
# python plotid_by_latlon.py data/sorghumexpfall2016v5_lblentry_1to7.shp -111.97495668222 33.0760167027358
# plotQuery(): INFO point in plot
# 42-3
if __name__ == '__main__':
    shpFile = sys.argv[1] # shp file path
    lon = float(sys.argv[2]) # point lon
    lat = float(sys.argv[3]) # point lat
    print plotQuery(shpFile, lon, lat)

@max-zilla
Copy link
Contributor Author

@yanliu-chn awesome, thanks for the quick turnaround!

@max-zilla
Copy link
Contributor Author

@yanliu-chn trying to deploy this this morning - ran into a snag because python claims it cannot open the shapefile:

from osgeo import gdal
gdal.OpenEx()

Traceback (most recent call last):
  File "<pyshell#1>", line 1, in <module>
    gdal.OpenEx()
AttributeError: 'module' object has no attribute 'OpenEx'

...installing GDAL from scratch on the VM to see if I can resolve this, but might need your help if not.

@yanliu-chn
Copy link

it needs gdal 2.0+ since gdal changed the programming api significantly from 2.0. OpenEx is 2.0 stuff.

@max-zilla
Copy link
Contributor Author

@yanliu-chn got it installed and working now. Also got thrown off because your script is called plotid_by_latlon but the plotQuery() method requires long, lat order :-)

Now I'm dealing with:

2017-02-02 16:48:38,578 [Thread-1       ] INFO    : pyclowder.connectors - [58792ac64f0cd67174ba5eba] : START: Started processing
2017-02-02 16:48:38,578 [Thread-1       ] INFO    : root - sensor lat/lon: (33.07526085235525, -111.9749757421391)
Segmentation fault (core dumped)

Not sure where it's hitting this yet.

@yanliu-chn
Copy link

right. :) people always say latlon, not lonlat. but to make coding logic clear, we always do x and y, which is lon, lat.

the command line takes lon and lat. so if you can serve the point in that order, it should be fine.

@ghost ghost removed the help wanted label Feb 2, 2017
@max-zilla
Copy link
Contributor Author

@max-zilla
Copy link
Contributor Author

This is completed, closing this for now.

Ultimately this code has a lot of logic we'll want in other extractors for matching (lat,lon) to plot and posting to geostreams - I've added key portions to met and canopycover extractors but will keep considering how to generalize.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants