
Insert plot level height histogram into Clowder geostreams; height into BETYdb #210


Closed
2 of 3 tasks
ZongyangLi opened this issue Dec 8, 2016 · 67 comments
Labels: bety/application, sensor/laser3d

Comments

@ZongyangLi
Contributor

ZongyangLi commented Dec 8, 2016

Description

We have scripts to generate plot-level height histograms on Roger. The next step is to create a pipeline for this extractor.

Completion Criteria

  • @solmazhajmohammadi needs to provide the position of the calibration object as a parameter
    • we are currently making assumptions about where the (0,0,0) point of the field coordinate system is.
    • from @solmazhajmohammadi we know that "We have used a special object to calibrate the point clouds. (0,0,0) point in the point cloud is somewhere in middle of the field, we will provide the transformation matrix to the gantry coordinate system in metadata"
  • make sure the correlation between hand measurements and predictions is high enough
    • the correlation on 9/8, 9/15, 9/22, and 9/29 is around 90% under our assumptions
    • we found that on 8/31 almost all plots share the same highest height level (lv. 83); it seems points above lv. 83 disappeared.
  • add to pipeline
    • insert the histogram into the geostreams database as an attribute (a rough sketch follows this list)
    • needs a set of raw data from Clowder as input, like Create full field stitched mosaic #85
    • insert the estimate of canopy_height into BETYdb with an appropriate method
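
For the geostreams step, a rough sketch of what inserting a plot-level histogram as a datapoint could look like (the endpoint path, stream id, and payload fields are assumptions based on the Clowder geostreams REST API, not final extractor code):

import requests

def post_histogram_datapoint(host, key, stream_id, plot_geom, day, histogram):
    """POST one plot-level height histogram to Clowder geostreams as a datapoint."""
    payload = {
        "start_time": day + "T00:00:00Z",
        "end_time": day + "T23:59:59Z",
        "type": "Feature",
        "geometry": plot_geom,  # GeoJSON geometry of the plot
        "properties": {"height_histogram": list(histogram)},
        "stream_id": str(stream_id),
    }
    r = requests.post("%s/api/geostreams/datapoints?key=%s" % (host, key), json=payload)
    r.raise_for_status()
    return r.json()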
@dlebauer
Member

@ZongyangLi @rmgarnett @pless

For this extractor, I would suggest that we write the summary stats (histogram) into the metadata and insert a few statistics into BETYdb. For example, we have inserted a trait called '95th quantile height'.

But the key trait from the point cloud is the height estimate calibrated to field measurements. This trait will have the same name as the trait that Maria measured, i.e. 'canopy_height'. I think it would make sense for this extractor to use the calibrated model that Roman developed in #175.

@dlebauer
Member

@rmgarnett what are the (slope, intercept) parameters from the model in #175?

When estimating height at the plot level, can we also estimate uncertainty?

@rmgarnett

[hand height] = 28.2cm + 0.661 * [89th height percentile]

The RMSE/MAE gives a rough estimate of L2/L1 uncertainty. I will do a more thorough analysis in January now that all height distributions are extracted.
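
As a minimal sketch (not the final analysis), the formula and the RMSE-as-uncertainty idea could be applied like this, where p89_cm is a plot's 89th height percentile in cm (variable names are illustrative only):

def calibrated_height_cm(p89_cm, slope=0.661, intercept=28.2, rmse_cm=None):
    """[hand height] = 28.2 cm + 0.661 * [89th height percentile].

    If the RMSE of the fit is supplied, also return a crude +/- band,
    as a stand-in until the fuller uncertainty analysis is done.
    """
    est = intercept + slope * p89_cm
    if rmse_cm is None:
        return est
    return est, (est - rmse_cm, est + rmse_cm)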

@rmgarnett rmgarnett reopened this Dec 22, 2016
@dlebauer
Member

@rmgarnett I suspect RMSE scales with height?

From your plot it is hard to tell how the data are distributed because of overlapping points, but I gather they are strongly right-skewed. I wonder whether log-transforming x and y would be appropriate, since it would more evenly weight the smaller values. The small plants are important too!

(attached plot: img_2204)

@ghost added the sensor/laser3d, laser 3D, and bety/application labels Jan 3, 2017
@ZongyangLi
Contributor Author

@dlebauer
I have all of the height distribution data for Season 2 from 8/8 to 11/25, and I created 90th and 95th height percentile CSV files based on @rmgarnett's research.
90th percentile
95th percentile
Scanner3DTop data in Season 2 are much better than in Season 1, but the data from 10/13 to 11/04 are still unexpected; there are only a few points in those days' ply files.

I am wondering whether the point cloud files for those days can be fixed; if not, what is your opinion on putting them into BETYdb?

@dlebauer
Member

dlebauer commented Jan 5, 2017

@solmazhajmohammadi could you please check into whether we can recover useful data from 10/13 to 11/04?

@ZongyangLi we need to discuss with @rmgarnett about how to implement this extractor.

@dlebauer
Member

@rmgarnett have you made any progress on adding uncertainty?

@rmgarnett

rmgarnett commented Jan 10, 2017 via email

@dlebauer
Member

@ZongyangLi you can go ahead and insert the data that you have. We can create another issue for adding uncertainty to the height calculations (moving forward this should be done by default ... )

@solmazhajmohammadi

@dlebauer @ZongyangLi, for the data from 10/13 to 11/04, the png files were not collected correctly, but we can get the height information from the scans that were done at ~5 m.

@solmazhajmohammadi

@smarshall-bmr can you please scan the checkerboards to find the point cloud origin?

@ZongyangLi
Contributor Author

@solmazhajmohammadi, are you saying we should estimate the plot-level height based on the highest points in the remaining 3D data? That might differ from what we have done before, because we use all of the point cloud data to create a height histogram and compute quantiles to make the predictions.

@solmazhajmohammadi

@ZongyangLi This could be an option; otherwise, the data were collected with a wrong setting and we are not able to recover them.

@rmgarnett

I have been reinvestigating the hand measurements using @ZongyangLi's most-recent data. The final model may differ from what's written above, but it will be the same form. I presume the extractor will be easy to modify if we wish to change the model slightly?

@dlebauer
Member

dlebauer commented Jan 13, 2017 via email

@rmgarnett

Perfect.

@ZongyangLi
Contributor Author

@dlebauer
When I do that, I receive the error report below:
"mean": "31.505", "local_datetime": "2016-08-08T12:00:00", "access_level": "2", "error": "bad date specification; see error output", "species": { "scientificname": "Sorghum bicolor" },
"site": { "sitename": "MAC Field Scanner Field Plot 1603 Season 2", "error": "match not found" },
"method": { "name": "height Estimation from 3D Scanner using formula: [hand height] = 28.2cm + 0.661 * [89th height percentile]", "error": "match not found" },

@dlebauer
Member

dlebauer commented May 8, 2017

@ZongyangLi

  1. The date looks correct - what is the error output?
  2. The sitename needs to match a record in the database terraref.ncsa.illinois.edu/bety/sites
  3. The method name needs to match a name in the methods table terraref.ncsa.illinois.edu/bety/methods

@ZongyangLi
Contributor Author

@dlebauer

  1. The error is 'bad date specification'; I think it is caused by errors 2 and 3.
  2. This is the plot number definition problem I mentioned earlier in this issue (#210 (comment)); here I uploaded data for all 1728 plots, which may be causing error 2.
  3. The method name comes from your comment in Insert plot level height and / or percent cover into BETYdb #193 (comment); I failed to access the methods table.

@dlebauer
Member

dlebauer commented May 9, 2017

  1. Hmm, error 1 is the only one that says 'see error output'. @gsrohde any ideas?
  2. For Season 1, plot names for the 1728 plots are the same as for the 864 plots, but with E, W appended. We don't have the same 1728 plots specified for Season 2; I can create the 1728 plots for Season 2, or we could go with the 864 plots.
  3. Sorry, I was not clear. For the method, please create a new record here: https://terraref.ncsa.illinois.edu/bety/methods/new. Can you access that page?

@ZongyangLi
Contributor Author

@dlebauer I successfully created a new method called 'Scanner 3d ply data to height', and error 3 went away. Since the 1728 plots still do not work, I used the 864-plot definition to finish the operation.

Another error comes up:
"model_validation_errors": "{:mean=>[\"The value of mean for the height trait must be at most 120.\"]}",

By the way, @solmazhajmohammadi gave us the origin of the point cloud data last week; I am creating new height estimates using the new metadata.

Also, we are developing an algorithm that uses the stereo top images for 3D reconstruction, and we are trying to use this stereo 3D model to recover the missing height data from 2016/10/11 to 2016/11/04. Once the new data are available, I will resend them to BETYdb.

@dlebauer
Member

dlebauer commented May 9, 2017

@ZongyangLi thanks,

  • the error you observed is caused by the fact that each variable has a valid range, and in the case of 'height' the valid range is [0, 120]. The variable height has units of meters (as you can tell, these ranges are generally very generous; the tallest known tree is 115 m).
  • however, I think that we should be using the variable `canopy_height`, since this is the variable that Maria recorded, and I assume that these are the data used to fit the model (please confirm)
  • See also Merge plant_height, panicle_height, and spike_height reference-data#118 for disambiguating terms related to height.

In summary:

  • if the model is trained only on Maria's canopy_height measurements, we should call the output canopy_height.
  • if it is trained on both canopy_height and either panicle_height or its synonym spike_height, we should call the variable plant_height (plant_height = max(canopy_height, panicle_height)).
  • I am assuming that the algorithm does not differentiate plants with or without fruit.

@gsrohde

gsrohde commented May 10, 2017

@dlebauer and @ZongyangLi The message should probably say "See error list" instead. The error list contains the key-value pair date_data_errors=>[\"You can't have a local_datetime attribute on a trait-group's defaults element if no default site has been specified.\"]. But this itself is unclear except when an XML file, not a CSV file, is being uploaded. And I think the error itself is bogus in this case. I'm making an issue for this.

@dlebauer
Member

dlebauer commented May 11, 2017

  • change to plant_height

@ghost modified the milestones: May 2017, February 2017 on May 17, 2017
@ghost added the help wanted label on Jun 15, 2017
@ghost

ghost commented Jun 15, 2017

Currently this extractor generates two numpy files:

ls /projects/arpae/terraref/sites/ua-mac/Level_1/scanner3DTop_plant_height/2017-05-06/2017-05-06__05-05-58-934
scanner3DTop - 2017-05-06__05-05-58-934 highest.npy  scanner3DTop - 2017-05-06__05-05-58-934 histogram.npy

In #303 we mention pushing the max value to geostreams - this isn't implemented yet.

Additionally we should convert the .npy files to an image or GeoTIFF.
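
A rough sketch of what that conversion could look like (the bounds, EPSG code, and file handling here are placeholder assumptions, not the extractor's actual parameters):

import numpy as np
import rasterio
from rasterio.transform import from_bounds

def npy_to_geotiff(npy_path, tif_path, bounds, epsg=4326):
    """Write a 2-D numpy array (e.g. the 'highest' grid) as a single-band GeoTIFF.

    bounds = (west, south, east, north) of the scanned area in the target CRS;
    in practice these would come from the scan metadata / plot boundaries.
    """
    data = np.load(npy_path).astype("float32")
    if data.ndim == 1:
        data = data.reshape(-1, 1)  # promote a 1-D array to a single-column raster
    transform = from_bounds(*bounds, data.shape[1], data.shape[0])
    with rasterio.open(
        tif_path, "w", driver="GTiff",
        height=data.shape[0], width=data.shape[1],
        count=1, dtype="float32",
        crs="EPSG:%d" % epsg, transform=transform,
    ) as dst:
        dst.write(data, 1)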

@ghost removed the help wanted label on Jun 15, 2017
@ZongyangLi
Contributor Author

@max-zilla I have just uploaded the recent code to https://github.com/terraref/extractors-3dscanner/tree/master/plant_height
I know the code is really messy; I need to take some time to clean it up.

@max-zilla
Contributor

@ZongyangLi I ran a PLY file from February through the plant_height extractor; I'm making a pull request to show @nickheyek that writes .tif instead of .npy and pushes some values to geostreams.

The "highest" numpy array looks like this for a sample file:

>>> f = r"/Users/mburnette/globus/111fb573-a351-4bf8-9594-aedddc53a850__Top-heading-east_0.ply"                       
>>> plydata = PlyData.read(f)
>>> hist, highest = full_day_to_histogram.gen_height_histogram(plydata, False)
>>> highest
array([[ 1051.20507812],
       [ 1042.48278809],
       [ 1005.7802124 ],
       [ 1016.98535156],
       [ 1013.16113281],
       [ 1053.45422363],
       [ 1031.78356934],
       [ 1024.53540039],
       [ 1019.98809814],
       [ 1020.35986328],
       [ 1007.30383301],
       [  948.19873047],
       [  958.54162598],
       [  989.56195068],
       [    0.        ],
       [    0.        ]])
>>> len(highest)
16

What exactly do these 16 numbers represent? I see the "hist" array is also length 16, but each of its members is itself an array:

>>> hist
array([[ 0.,  0.,  0., ...,  0.,  0.,  0.],
       [ 0.,  0.,  0., ...,  0.,  0.,  0.],
       [ 0.,  0.,  0., ...,  0.,  0.,  0.],
       ..., 
       [ 0.,  0.,  0., ...,  0.,  0.,  0.],
       [ 0.,  0.,  0., ...,  0.,  0.,  0.],
       [ 0.,  0.,  0., ...,  0.,  0.,  0.]])
>>> len(hist)
16

If I were to summarize the "highest" numbers into geostreams, what would you recommend? Something like this is hard to interpret:

{
"highest": [[ 1051.20507812],
       [ 1042.48278809],
       [ 1005.7802124 ],
       [ 1016.98535156],
       [ 1013.16113281],
       [ 1053.45422363],
       [ 1031.78356934],
       [ 1024.53540039],
       [ 1019.98809814],
       [ 1020.35986328],
       [ 1007.30383301],
       [  948.19873047],
       [  958.54162598],
       [  989.56195068],
       [    0.        ],
       [    0.        ]]
}

Can I select the maximum number of this array as a simple "tallest plant in this plot" number?

@max-zilla self-assigned this Jun 22, 2017
@ZongyangLi
Contributor Author

  • What exactly do these 16 numbers represent?
    Every ply file is supposed to cover one row between the east and west boundaries of the field. At the very beginning, the field was divided into 16 columns; these 16 numbers represent the highest point in each column, and the unit is mm. Please use the most recently updated function named
    gen_height_histogram_for_Roman(plydata, scanDirection, out_dir, sensor_d, center_position)
    It accounts for the east- or west-side sensor, adds the newest origin offset on both sides, uses the ground plane as hist[0], and uses a 32-column definition.

With these numbers recorded, I integrate all scan results to create full-field 'highest' and 'histogram' files, using the corresponding JSON file.

  • And what about 'hist'?
    hist records the height distribution; its 16 rows have the same definition as in 'highest'. The 400 numbers in each element are bins that record how many points fall into each height level. You can treat bin 0 as the ground plane in the field, and each height level equals 1 cm. So a value of, for example, 5 in bin 400 means there are 5 points at height level 400, i.e. 4 meters in the real world.

  • Recommended way of summarizing the data (see the sketch after this list):

  1. Use 'gen_height_histogram_for_Roman' to create the hist and highest arrays, use the JSON file to determine the plot number for each bin, then convert to whatever unit fits the database.
  2. Note that each plot may be made up of several ply files, so a separate process may produce several different numbers for each plot for one day.
  3. Compute quantiles from 'hist' for each plot and use @rmgarnett's formula to create a height estimate. Reference code: https://github.com/terraref/extractors-3dscanner/blob/master/plant_height/draw_field_scanned_in_grid.py#L195
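
A condensed sketch of steps 2 and 3, under the bin convention described above (bin 0 = ground plane, 1 cm per bin); plot lookup and file handling are intentionally left out:

import numpy as np

def merge_plot_histograms(histograms):
    """Sum the 400-bin height histograms from every ply column that falls in one plot."""
    return np.sum(np.vstack(histograms), axis=0)

def estimated_height_cm(plot_hist, percentile=89, slope=0.661, intercept=28.2):
    """Approximate percentile height in cm (bin index ~ cm) fed into the calibration formula."""
    counts = np.asarray(plot_hist, dtype=float)
    if counts.sum() == 0:
        return None  # empty plot / missing scan
    cum = np.cumsum(counts) / counts.sum()
    p_cm = int(np.searchsorted(cum, percentile / 100.0))
    return intercept + slope * p_cm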

@dlebauer
Member

dlebauer commented Jun 22, 2017

Would it make sense to use the same workflow as we are using with other image data, i.e.:

  • full field stitch (optional ...)
  • plot subset
  • analysis,

so that each result represents a single plot?

@max-zilla
Contributor

max-zilla commented Jun 22, 2017

I believe we've discussed stitching point clouds into plots; I will try to track down the comments from @solmazhajmohammadi

  • if we simply merge, there will be messy stuff where two passes overlap
  • we could also just pick one if two overlap
  • or, we merge plot histograms afterwards

We need to convert the point cloud origin to the gantry coordinate system before merging; otherwise they will all be on the same spot (PDAL transformation matrix).
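
A minimal numpy sketch of that step, assuming the metadata supplies a 4x4 homogeneous transformation matrix T from point-cloud coordinates to the gantry coordinate system (the same matrix could be given to PDAL's filters.transformation when working with .las files):

import numpy as np
from plyfile import PlyData

def transform_to_gantry(ply_path, T):
    """Apply a 4x4 homogeneous transform T to the vertices of a PLY point cloud.

    Returns an (N, 3) array of XYZ coordinates in the gantry coordinate system.
    """
    vertices = PlyData.read(ply_path)["vertex"]
    xyz = np.column_stack([vertices["x"], vertices["y"], vertices["z"]])
    homogeneous = np.hstack([xyz, np.ones((xyz.shape[0], 1))])
    return (homogeneous @ np.asarray(T).T)[:, :3]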

@max-zilla modified the milestones: July 2017, May 2017 on Jun 22, 2017
@dlebauer
Member

dlebauer commented Jul 6, 2017

  • I don't think that the files overlap (after R and L ply are merged into a single las)
  • I think using the stitch then subset approach is okay.

@solmazhajmohammadi can you confirm that independent passes do not overlap each other?

@dlebauer
Member

dlebauer commented Jul 6, 2017

I think we can close this issue and open a new one with low priority that will deal with full field merge then split.

@solmazhajmohammadi

@dlebauer to confirm the independence between passes, we need to apply the transformation to all the ply files from a single day and merge them together.
The transformation is available in Issue 44.
