
Store location and field of view metadata associated with each file in Clowder #101


Closed
dlebauer opened this issue May 12, 2016 · 43 comments

@dlebauer
Member

dlebauer commented May 12, 2016

Description

Metadata files provide geospatial information that we should add to the Clowder metadata and to the Clowder PostGIS database:

  • convert sensor location and field of view into GeoJSON location metadata in Clowder, and
  • add field of view to the PostGIS database to enable cross-database queries
    • For more information about Clowder PostGIS database, speak with Luigi Marini and Rob Kooper.

Position of the gantry + camera

Assume that the center of the sensor is located at

position + location in camera box

e.g.:
["location in camera box X [m]", "location in camera box Y [m]", "location in camera box Z [m]"] + ["Position x [m]", "Position y [m]", "Position z [m]"]

Field of view

Assume the values provided for field of view are the lengths of the sides of the bounding box (its extent in X and in Y).
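
A minimal sketch of the two assumptions above, working in gantry coordinates (meters). The gantry position is a made-up value; the camera-box offset and field of view come from the example metadata in this issue. The function names are illustrative only, not part of any existing extractor.

```python
def sensor_center(gantry_position, box_offset):
    """Add the gantry position and the sensor's offset inside the camera box."""
    return tuple(p + o for p, o in zip(gantry_position, box_offset))


def fov_bounds(center_xy, fov_xy):
    """Return (x_min, y_min, x_max, y_max) of a box of size fov_xy centred on center_xy."""
    cx, cy = center_xy
    size_x, size_y = fov_xy
    return (cx - size_x / 2, cy - size_y / 2, cx + size_x / 2, cy + size_y / 2)


gantry_position = (100.0, 10.0, 1.0)  # hypothetical "Position x/y/z [m]"
box_offset = (0.877, 2.276, 0.578)    # "location in camera box X/Y/Z [m]"
center = sensor_center(gantry_position, box_offset)
bounds = fov_bounds(center[:2], (1.857, 1.246))
print(center, bounds)
```

The bounds are still relative to the gantry origin; converting them to geographic coordinates is a separate step.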

Transforming to geospatial coordinates

Info from Solmaz on converting XYZ in meters to coordinates (for hyperspectral, but has SE corner origin points) terraref/documentation#9

Check what level of precision is required given the resolution of the image (how many significant digits in WGS84, Web Mercator).
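
One rough way to estimate the precision, assuming a degree of latitude is about 111,320 m and that the 3296-pixel image axis spans the 1.857 m field of view (both are assumptions, not confirmed in this thread). Latitude is the limiting case, because a degree of longitude covers fewer meters away from the equator.

```python
import math

METERS_PER_DEGREE_LAT = 111320.0  # approximate length of one degree of latitude


def decimal_places_needed(ground_resolution_m):
    """Smallest number of decimal places in degrees that resolves the given distance."""
    resolution_deg = ground_resolution_m / METERS_PER_DEGREE_LAT
    return max(0, math.ceil(-math.log10(resolution_deg)))


pixel_size_m = 1.857 / 3296  # assumed mapping of FOV width to image width
print(decimal_places_needed(pixel_size_m))
```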

Example

Here is an example of the current metadata format (interpretation is discussed in terraref/reference-data#25). Until this is clarified, please assume that the center of the 'field of view' is located at:

      "optics focus setting (both)": "2m",
      "optics apperture setting (both)": "4",
      "Output data format": "Bayer GR8 3296x2472",
      "sensor purpose": "capture RGB images",
      "location in gantry system": "camera box, facing ground",
      "location in camera box X [m]": "0.877",
      "location in camera box Y [m]": "2.276",
      "location in camera box Z [m]": "0.578",
      "Field of view at 2m in X- Y- direction [m]": "[1.857 1.246]",
      "Bounding Box [m]": "[1.857     1.246]",

An example file on ROGER that you can start with is at /gpfs/largeblockFS/projects/arpae/terraref/raw_data/ua-mac/MovingSensor/ps2Top/2016-05-07/2016-05-07__15-58-43-382/36dfd7fe-e648-427d-b503-fdf47f198545_metadata.json
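
All values in the excerpt above are strings, including the bracketed, space-separated field of view. A small sketch of turning them into numbers, assuming the keys sit in one flat JSON object (the real file may nest them differently); `parse_vector` is a hypothetical helper.

```python
import json


def parse_vector(text):
    """Parse a string such as "[1.857     1.246]" into a list of floats."""
    return [float(value) for value in text.strip().strip("[]").split()]


metadata = json.loads("""
{
  "location in camera box X [m]": "0.877",
  "location in camera box Y [m]": "2.276",
  "location in camera box Z [m]": "0.578",
  "Field of view at 2m in X- Y- direction [m]": "[1.857 1.246]",
  "Bounding Box [m]": "[1.857     1.246]"
}
""")

box_offset = [float(metadata["location in camera box %s [m]" % axis]) for axis in "XYZ"]
fov = parse_vector(metadata["Field of view at 2m in X- Y- direction [m]"])
print(box_offset, fov)
```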

The Z reported in the metadata is calculated from the following equation.

ZHeightInM = 2.0f //focal length

  • 0.0f //average height of plants
  • 0.75f //distance ground box at zHeight = 0
  • 0.58f; //camera distance inside box

Context

This feature will be required to find all images based on location (e.g. those associated with a plant or plot)


others for comment: @robkooper @pless @solmazhajmohammadi

@caicai89-

location: position (gantry_system_variable_metadata)
+location in camera box (sensor_fixed_metadata)
Question 1: Are both the location and the position in UTM coordinates?

"Field of view at 2m in X- Y- direction [m]": "[1.857 1.246]"
Question 2: Is the field of view equal to "Field of view at 2m in X- Y- direction [m]"?

@dlebauer
Member Author

dlebauer commented May 17, 2016

This is relative to the SE corner of the gantry. See terraref/documentation#9

@TinoDornbusch
Contributor

note that the [0,0] coordinate of the gantry is at south east corner.

@TinoDornbusch
Contributor

South east. Wagner&Müller (the company that programmed the gantry) has strange ideas on definitions & procedures.

@TinoDornbusch
Contributor

@caicai89- Field of view would describe the size of the rectangle you would see at 2m distance with the sensor.

Note that for line sensors (3d, Hyperspec) this is a y dimension only.

@max-zilla
Contributor

Followup from meeting on 05/23 with @dlebauer @caicai89- @yanliu-chn @robkooper.

Summary workflow:

  1. Add message for RabbitMQ on dataset.addmetadata
  2. Write extractor to listen for new metadata on dataset - if we find XYZ position in meters, use Southeast origin lat/long + meter offset of camera box + sensor to determine sensor coordinates
  3. Use geostream API to add sensor coordinates to PostGIS database
  4. Later, show map of geostream datapoints and allow querying

Right now, the message for 1 does not exist. Suggest starting just by converting sample coordinates to lat/lon. Then we can implement that in extractor on actual metadata once other parts are done. After converting coordinates is done, we can enable PostGIS database on TERRA Clowder development instance (http://141.142.209.122/clowder/) and create some sample data there to test with.
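
As a starting point for converting sample coordinates, here is a small-offset sketch. It assumes gantry x points north and y points west from the southeast origin, and it uses a made-up origin latitude/longitude; neither the axis directions nor the origin value is confirmed in this thread. It uses a simple equirectangular approximation rather than a full UTM transform.

```python
import math

WGS84_RADIUS_M = 6378137.0  # WGS84 equatorial radius in meters


def gantry_to_latlon(x_m, y_m, origin_lat, origin_lon):
    """Convert a gantry offset in meters to (lat, lon).

    Assumes +x is north and +y is west of the southeast origin.
    """
    dlat = math.degrees(x_m / WGS84_RADIUS_M)
    dlon = math.degrees(-y_m / (WGS84_RADIUS_M * math.cos(math.radians(origin_lat))))
    return origin_lat + dlat, origin_lon + dlon


SE_ORIGIN = (33.0745, -111.9750)  # hypothetical origin, for illustration only
print(gantry_to_latlon(100.877, 12.276, *SE_ORIGIN))
```

Over a field a few hundred meters across, the error of this approximation is far smaller than the gantry dimensions, but a projected coordinate system would be the safer choice for production use.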

@caicai89-

@robkooper Could you share the geostream example with me and @yanliu-chn

@dlebauer
Member Author

Here is the example file from above (renamed to .txt to allow upload here)

2016-04-13_00-38-15_environmentlogger.json.txt

@max-zilla
Contributor

@caicai89- I think we will need to add a *.dataset.metadata.added message, there does not currently seem to be one.

*.dataset.file.added is valid, for example. Will talk with @robkooper but this should be a simple addition to Clowder along the lines of:

current.plugin[RabbitmqPlugin].foreach { p =>
  val dtkey = s"${p.exchange}.dataset.file.added"
  p.extract(ExtractorMessage(file.id, file.id, clowderurl, dtkey, Map.empty, file.length.toString, ds.id, ""))
}

...adjusted for metadata and triggered in the right places.

There are separate pull requests to add a dataset.file.added message and to add support in PyClowder for dataset extractors:

@caicai89-

  1. @max-zilla What is the queue name for the queue with message "*.dataset.metadata.added", "dataset.addmetadata"?
  2. @robkooper Have you prepared the sample code for me to learn and implement?

@max-zilla
Contributor

max-zilla commented Jun 8, 2016

@caicai89- the queue name should be defined by your extractor in config.py. For example, take a look at my pending pull request for PyClowder example dataset extractor:
https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/pyclowder/browse/sample-extractors/dataset-filecount?at=refs%2Fheads%2Fbugfix%2FCATS-554-add-pyclowder-support-for-dataset

  • in config.py I give extractorName=datasetFileCount
  • in dataset-filecount.py you can just use:
extractors.connect_message_bus(extractorName=extractorName, messageType=messageType,
                               processFileFunction=process_dataset, checkMessageFunction=check_message,
                               rabbitmqExchange=rabbitmqExchange, rabbitmqURL=rabbitmqURL)

...to automatically connect and listen for messages on Clowder exchange, datasetFileCount queue.

The parameters for on_message() and process_file() will have info about dataset. Can show you sometime if you want.

@caicai89-

@max-zilla thank you!

@robkooper
Member

Quick intro to geostreams:

  • datapoints are measurements at a specific time and a specific location; each needs a stream_id
  • streams are groupings of datapoints that logically belong together, for example a season at a plot; each needs a sensor_id
  • sites/sensors are high-level descriptions of a sensor at a location.

In our case we can see a sensor/site as a plot, a stream as a planting season, and a datapoint as a specific image.
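
The hierarchy described above can be pictured with a few plain data classes. This is only a mental model: the class and field names are hypothetical and do not mirror the Geostreams schema or API, and the names and coordinates are illustrative.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Datapoint:
    start_time: str
    location: Tuple[float, float]  # (lon, lat)
    properties: Dict[str, str] = field(default_factory=dict)


@dataclass
class Stream:
    name: str
    datapoints: List[Datapoint] = field(default_factory=list)


@dataclass
class Sensor:
    name: str
    streams: List[Stream] = field(default_factory=list)


plot = Sensor("example plot")        # sensor/site = plot
season = Stream("example season")    # stream = planting season
plot.streams.append(season)
season.datapoints.append(            # datapoint = one image
    Datapoint("2016-05-07T15:58:43", (-111.9748, 33.0763))
)
print(plot)
```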

Example code that works with the geostreams API:
https://opensource.ncsa.illinois.edu/bitbucket/projects/GEOD/repos/seagrant-parsers-py/browse/USGS/usgs-import.py

A second more complex example : https://opensource.ncsa.illinois.edu/bitbucket/projects/GEOD/repos/seagrant-parsers-py/browse/SeaBird/seabird-import.py

Create sensor : https://opensource.ncsa.illinois.edu/bitbucket/projects/GEOD/repos/seagrant-parsers-py/browse/SeaBird/seabird-import.py#333

Create stream : https://opensource.ncsa.illinois.edu/bitbucket/projects/GEOD/repos/seagrant-parsers-py/browse/SeaBird/seabird-import.py#354

Create datapoint : https://opensource.ncsa.illinois.edu/bitbucket/projects/GEOD/repos/seagrant-parsers-py/browse/SeaBird/seabird-import.py#395

@caicai89-

Thanks! @robkooper

@ghost ghost added the kind/discussion label Jul 7, 2016
@ghost

ghost commented Jul 7, 2016

@caicai89- this appears to be a discussion ... is there an actual task associated with this issue?

@dlebauer
Member Author

dlebauer commented Jul 7, 2016

Task is defined by title
On Thu, Jul 7, 2016 at 9:04 AM Rachel Shekar notifications@github.com
wrote:

@caicai89- https://github.com/caicai89- this appears to be a discussion
... is there an actual task associated with this issue?



@caicai89-

terraref/reference-data#32 (comment) is a discussion

@ghost

ghost commented Jul 7, 2016

@caicai89- have you "Stored location and field of view metadata associated with each file in Clowder"?

@ghost ghost modified the milestone: June 2016 Jul 7, 2016
@caicai89-

Not yet, last week Max helped me set up the testing environment, and I found some errors during testing. In addition, for the geostreams API part, I still have some questions.

@max-zilla
Contributor

max-zilla commented Aug 31, 2016

@caicai89- does the sensor ID creation make sense?

I think in addition to seeing if the metadata is there, we should check dataset name:

  • does it have a hyphen " - " in the name? if so, sensor name is the part before the hyphen
ds = "co2Sensor - 2016-06-10"
sensorname = ds.split(" - ")[0]  # "co2Sensor"
sensorMap = {"co2Sensor": "1"}  # etc., one entry per sensor
  • so add the datapoint to a stream based on which sensor it is. Your code should also create the necessary stream if it doesn't exist yet.
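
A sketch of the lookup described above. `create_stream` is a hypothetical callable standing in for whatever Geostreams call creates a stream and returns its ID; it is not an existing PyClowder function.

```python
def sensor_name_from_dataset(dataset_name):
    """Return the part before " - ", or None if the name has no such separator."""
    if " - " not in dataset_name:
        return None
    return dataset_name.split(" - ")[0]


def stream_id_for(sensor_name, stream_map, create_stream):
    """Look up the stream ID for a sensor, creating and caching it if missing."""
    if sensor_name not in stream_map:
        stream_map[sensor_name] = create_stream(sensor_name)
    return stream_map[sensor_name]


stream_map = {"co2Sensor": "1"}
name = sensor_name_from_dataset("co2Sensor - 2016-06-10")
print(name, stream_id_for(name, stream_map, create_stream=lambda n: "new-id"))
```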

@caicai89-

@max-zilla Will work on that later.

@ghost

ghost commented Sep 22, 2016

@caicai89- have you made the updates requested above? If not, can you put them in a new issue?

@max-zilla
Contributor

max-zilla commented Oct 25, 2016

@dlebauer @robkooper I finished up the remaining tasks on this extractor and deployed to clowder-dev instance. the process is ready:

INFO    : pyclowder.extractors -  Starting a New Thread for Process Dataset
sensor lat/lon: (33.07634044407777, -111.97480020961399)
F.O.V. NW lat/lon: (33.07634876600311, -111.97480697587183)
F.O.V. SE lat/lon: (33.076332122152046, -111.97479344335744)
checking stereoTop.
posting datapoint to stream 5
Successfully added datapoint.
  • this requires a "sensor" in Geostreams to exist (I created a UA-MAC sensor)
  • will accept a map of "instrument name" -> "stream ID" in config.py. when a new dataset is processed the instrument name, e.g. stereoTop is checked against this map to see whether a stream was predefined, and a stream is created if not. the following streams should be created and added to config ahead of time like I did on the dev instance (the IDs don't have to match these numbers, but they'll need to be updated):
"stereoTop": "5",
"flirIr": "6",
"co2Sensor": "7",
"cropCircle": "8",
"priSensor": "9",
"scanner3DTop": "11",
"ndviSensor": "10",
"ps2Top": "12",
"SWIR": "13",
"VNIR": "2"
  • datapoints are then added into the corresponding stream.

I verified via the API that the stream + datapoints exist, but I wasn't seeing the datapoints appear in the Clowder UI. This might be a TERRA-dev Clowder issue (they appeared locally in my testing) but the data are there.

We'll need to deploy PostGIS to the production server before we can deploy this there.

NOTE: This adds the datapoint as a "point" still (geojson support in Clowder is being reviewed) but in the properties of the dp we store the FOV polygon so we can use it later:

metadata = {
        "sources": host+"datasets/"+parameters['datasetId'],
        "file_ids": ",".join(fileIdList),
        "centroid": {
            "type": "Point",
            "coordinates": [sensor_latlon[1], sensor_latlon[0]]
        },
        "fov": {
            "type": "Polygon",
            "coordinates": [[[fov_nw_latlon[1], fov_nw_latlon[0], 0],
                             [fov_se_latlon[1], fov_se_latlon[0], 0] ]]
        }
    }

@dlebauer
Member Author

Are two points enough to define a polygon? I don't think we can assume that the edges are parallel to lat/lon ( @tingli3 ?)
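
For reference, GeoJSON (RFC 7946) defines a Polygon ring as a closed sequence of four or more positions whose first and last positions are identical, with exterior rings ordered counterclockwise. The sketch below expands two corners into such a ring. It assumes the field of view is aligned with the lat/lon axes, which is exactly the assumption questioned above; a rotated footprint would need all four corners transformed separately. The corner values are taken from the extractor log earlier in this thread.

```python
def bbox_polygon(nw_latlon, se_latlon):
    """Build a closed, counterclockwise GeoJSON Polygon from NW and SE corners."""
    north, west = nw_latlon
    south, east = se_latlon
    ring = [
        [west, north],  # NW
        [west, south],  # SW
        [east, south],  # SE
        [east, north],  # NE
        [west, north],  # back to NW to close the ring
    ]
    return {"type": "Polygon", "coordinates": [ring]}


fov = bbox_polygon(
    (33.07634876600311, -111.97480697587183),
    (33.076332122152046, -111.97479344335744),
)
print(fov)
```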

@ghost

ghost commented Nov 2, 2016

Deployed to clowder-dev. @robkooper - Ready to move to production

@ghost

ghost commented Nov 14, 2016

can this issue be closed?

@max-zilla
Contributor

I plan to install geostreams stuff today/tomorrow and deploy this on production, then I'll close.

@max-zilla
Contributor

I installed geostreams database on production and deployed the geospatial metadata extractor. Closing this issue.

@ghost ghost added the 4 - Done label Jan 3, 2017
@dlebauer
Member Author

@max-zilla at this point, where can I find the bounding box for the image files? For example, where are the field-of-view bounding boxes for the geotiffs here: https://terraref.ncsa.illinois.edu/clowder/datasets/58dd375a4f0c430e2bff1b21

@max-zilla
Contributor

@dlebauer those are stored in the geostreams plots. The sensorposition extractor has this JSON data for each dataset/file it processes:

{
            "fov": {
                "type": "Polygon",
                "coordinates": [
                    [
                        [
                            -111.97502740364716,
                            33.07628117585825,
                            0
                        ],
                        [
                            -111.97501541748397,
                            33.07626777606369,
                            0
                        ]
                    ]
                ]
            },
            "sources": "https://terraref.ncsa.illinois.edu/clowder/datasets/587f9adc4f0cd67174e61751",
            "centroid": {
                "type": "Point",
                "coordinates": [
                    -111.9750214105651,
                    33.07627447596113
                ]
            },
            "file_ids": "587f9add4f0cd67174e61757"
        }

...in this case, what you're looking for are file_ids. I just created this pull request:
https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/clowder/pull-requests/1128/overview

that will add a datapoints search filter so you can search like this:

http://localhost:9000/clowder/api/geostreams/datapoints?filter={"sources":"https://terraref.ncsa.illinois.edu/clowder/datasets/58dd375a4f0c430e2bff1b21"}

or

http://localhost:9000/clowder/api/geostreams/datapoints?filter={"file_ids":"58dd375a4f0c430e2bff1b21"}

etc. The datapoint FOV will have it. But Zongyang has been overhauling how the stereo RGB geotiffs are generated, so the FOVs in the geostreams will also need to be updated to match.
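
When building such a query from code, the JSON filter should be URL-encoded. A minimal sketch using only the standard library; it assumes the `filter` parameter behaves as described in the pull request above, and `datapoints_url` is a hypothetical helper.

```python
import json
from urllib.parse import parse_qs, urlencode, urlparse


def datapoints_url(base_url, **filters):
    """Return base_url with the filters encoded as a JSON 'filter' query parameter."""
    return base_url + "?" + urlencode({"filter": json.dumps(filters)})


url = datapoints_url(
    "http://localhost:9000/clowder/api/geostreams/datapoints",
    file_ids="58dd375a4f0c430e2bff1b21",
)
print(url)

# Round trip: decode the query string to check what the server would receive.
decoded = json.loads(parse_qs(urlparse(url).query)["filter"][0])
print(decoded)
```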

@ghost ghost removed the help wanted label Apr 20, 2017
@dlebauer
Member Author

@max-zilla has this been implemented? Does geostreams now store the FOV as a PostGIS geometry for each file?

@max-zilla
Contributor

yes, postGIS stores FOV now in sensorposition. closing.

@dlebauer
Member Author

dlebauer commented Aug 17, 2017

@max-zilla
Contributor

This is in the new extractor updates I have yet to deploy - no results to look at yet. should have some in next day or two.
