Proposed format for hyperspectral and other imaging data #14
The CF metadata look good. In the absence of strong reasons to the contrary, I recommend the dimension order time, lat, lon, wavelength, angle. Keep time as the first dimension if possible.
Great discussion of decisions about formats and metadata. While several aspects of the CF conventions are really useful and helpful, there are some fairly important caveats. These include lack of support for groups in the files (typically a requirement these days) and a sometimes difficult process of agreeing on names. These are related to the existing CF community tools and to the focus of the CF community on climate and forecast names. You might take a look at the CSDMS standard names as an example of an approach that is different from CF (http://csdms.colorado.edu/wiki/CSDMS_Standard_Names#.C2.A0_CSDMS_Standard_Names). It has a standard set of rules for creating and interpreting names, which is more flexible than the community approval mechanism used by CF. I would suggest that you want a metadata model that supports names from multiple communities rather than just one (ISO does that). It also has the advantage that we have a (proposed) standard approach for adding ISO-compliant metadata to HDF files, which can help ameliorate differences between collection and granule metadata.
Just noticed that the draft units have slight typos: "degrees north" should be "degrees_north" etc. for UDUnits compatibility.
@tedhabermann thank you for pointing out CSDMS - is this what you are referring to when you say "we have a (proposed) standard approach for adding ISO-compliant metadata to HDF files"? Given the diversity of disciplines, it is clear that we will have to support multiple naming conventions. Any advice on how to support multiple vocabularies would be appreciated. I proposed amending our database with a simple 'thesaurus' lookup table (#18 (comment)), and I would appreciate feedback on whether or not that solution is sufficient and robust, and on whether there is an existing framework for supporting multiple vocabularies. To be clear, proposing new variables for CF is not nearly as much of a priority as coming up with a solution for our project that is clearly defined and can be translated later if necessary. To this end, I have liberally begun making up names following the CF guidelines. I'll take a look at your webinar "Metadata Recommendations, Dialects, Evaluation & Improvement" from last month, and hopefully find some answers as well as a better understanding of these issues.
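A minimal sketch of that many-to-many 'thesaurus' idea, using an in-memory SQLite database. The table and column names here are hypothetical, not the actual betydb.org schema:

```python
import sqlite3

# Hypothetical schema: internal variables, terms from external
# vocabularies, and a many-to-many join table linking them.
db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE variables (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE vocab_terms (
    id INTEGER PRIMARY KEY, vocabulary TEXT, term TEXT);
CREATE TABLE thesaurus (
    variable_id INTEGER REFERENCES variables(id),
    term_id INTEGER REFERENCES vocab_terms(id));
""")
db.execute("INSERT INTO variables VALUES (1, 'air_temperature')")
db.execute("INSERT INTO vocab_terms VALUES (1, 'CF', 'air_temperature')")
db.execute(
    "INSERT INTO vocab_terms VALUES (2, 'CSDMS', 'atmosphere_air__temperature')")
db.executemany("INSERT INTO thesaurus VALUES (?, ?)", [(1, 1), (1, 2)])

# Look up every vocabulary's term for one internal variable.
rows = db.execute("""
    SELECT vt.vocabulary, vt.term FROM thesaurus th
    JOIN vocab_terms vt ON vt.id = th.term_id
    WHERE th.variable_id = 1""").fetchall()
```

Because the join table is symmetric, the same structure also answers the reverse question (which internal variable does a given CSDMS term map to?).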
@czender fixed - thanks
@dlebauer - using standard names is separate from including ISO metadata in the HDF files. See slides on ISO in HDF (http://www.slideshare.net/tedhabermann/granules-and-iso-metadata).
Following up on @dlebauer -- I've already written a few functions for spectral convolution for a bunch of common remote sensing platforms like Landsat and MODIS in the PEcAn RTM package Shawn, Toni, and I are developing. From that link, if you go to
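As a rough illustration of what such a spectral-convolution helper does (this is not the PEcAn RTM code, just a NumPy sketch with a made-up triangular "red" band):

```python
import numpy as np

def band_average(wl, spectrum, rsr_wl, rsr):
    """Simulate a broadband sensor band: interpolate the band's relative
    spectral response onto the hyperspectral wavelength grid, then take
    the response-weighted mean of the spectrum."""
    w = np.interp(wl, rsr_wl, rsr, left=0.0, right=0.0)
    return float(np.sum(w * spectrum) / np.sum(w))

# Toy data: a flat 0.30 reflectance spectrum and a triangular band.
wl = np.linspace(400.0, 1000.0, 601)      # nm, 1 nm spacing
refl = np.full_like(wl, 0.30)
rsr_wl = np.array([620.0, 660.0, 700.0])  # hypothetical band shape
rsr = np.array([0.0, 1.0, 0.0])
red = band_average(wl, refl, rsr_wl, rsr)  # ≈ 0.30 for a flat spectrum
```

Real platform RSR tables (Landsat, MODIS, etc.) slot in as `rsr_wl`/`rsr` with no other changes.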
I looked at the CSDMS conventions for names. I do not see any compelling reason to propose a bunch of names to either them or CF right now; it would slow down the prototyping. Once the workflow is established we can think about sustainable naming strategies. For now, choosing CF'ish names seems good enough.
(from #2)
I think you have the right idea of parsing this to attributes, but I will note that the .json files are not designed to meet a standard metadata convention. But presumably a CF-compliant file will? Ultimately, we will want them to be compliant or interoperable with an FGDC-endorsed ISO standard (https://www.fgdc.gov/metadata/geospatial-metadata-standards). Does that sound reasonable? Regarding gantry variable data like the x,y,z location and time, I think it would be useful to store these as metadata attributes in addition to either dimensions or variables. When you say 'variables', do you mean to store the single value of x,y,z in the metadata as a set of variables? Ultimately these will be used to define the coordinates of each pixel. This is something that I don't understand well and don't know if there is an easy answer to. As I understand it, we could transform the images to a flat x-y plane that would allow gridded dimensions, but if we map to x,y,z then they would be treated as variables. I'd appreciate your thoughts on this, and if you want to chat offline let me know.
Charlie uses the phrase "propose a bunch of names" in the context of CSDMS and CF, but that is not correct. The approaches to naming in CSDMS and CF are fundamentally different. You propose names to CF, and there is typically a long and arduous review process, particularly for names that are outside the context of "Climate and Forecast". In CSDMS, you create names that are consistent with a set of rules. There is no review process and no associated delay. That is why I suggest you think about using the CSDMS approach as the base approach: it gives you control over the names instead of a committee of climate/forecast people.
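The CSDMS pattern composes an object part and a quantity part separated by a double underscore. Here is a toy sketch of that rule, greatly simplified relative to the full CSDMS naming rules (which also cover operations, assumptions, and more):

```python
def csdms_style_name(object_parts, quantity):
    """Compose a CSDMS-style name: object parts joined by single
    underscores, separated from the quantity by a double underscore.
    A simplified sketch of the rule-based pattern only."""
    return "_".join(object_parts) + "__" + quantity

name = csdms_style_name(["atmosphere", "air"], "temperature")
# → "atmosphere_air__temperature"
```

The point of the rule-based approach is exactly this: any conforming name can be generated (and parsed) mechanically, with no committee review in the loop.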
On JSON vs. XML - Both of these representations are important in different contexts. XML is much more prevalent in the metadata world than JSON. If you think only in JSON then you are losing a lot of standard capability. At HDF we are working with both. What is really critical is the naming of the elements. For sensor metadata, an approach based on soft-typing is usually more flexible: instead of standardizing a set of parameter names, standardize a method of describing parameters, then use that everywhere. This approach has been used quite a bit in SensorML with a fair amount of success. Hard-typed names are almost always a problem in the long run. The tools I am working on take ISO-compliant XML and transform it into NcML, which can be imported into an HDF file. This works with all (AFAIK) ISO metadata standards. It might even work with SensorML (I have not looked at that). It gives you a standard set of names and paths for ISO content in your files. That is the important piece here - tools need to know standard paths for any metadata content.
The CSDMS approach looks fine to me, especially for quantities that are not likely to be covered by CF. And much of the data generated by this project is not typically thought of as covered by CF, though CF is trying to expand its domain. In any case, the "standard" names we are talking about are much too long to be useful to humans. They are generally stored as a "standard_name" or similar attribute of the field they describe. And the field has a much shorter and easier primary name, e.g., T for temperature. It is too early to worry about what the longer standard names will be until we have a workflow and have looked more carefully at what others call similar quantities.
Soapbox warning! I appreciate Charlie's point of view on this but, at the same time, wonder about how many decisions have been made with the logic "it will slow us down" or "let's do this now, then fix it later". I suspect that there have been many such decisions and that many of them have never been fixed as we move on to the next short-term event. These decisions ultimately lead to inoperable datasets that users have to deal with. If we are interested in evolving the culture, we need to do it at the beginning. Then the team gets used to meaningful parameter names instead of abbreviations, and we move on from there.
The soapbox here is always open! I really appreciate this discussion and agree with both of you. Our first task is to put the data somewhere safe and make it usable. Our fairly rigid timeline for data product development still gives time for iterative development: alpha release in 2016, beta in 2017, stable in 2018. Our goal is to allow for multiple rounds of feedback. After each release there will be hands-on workshops aimed at getting feedback on what's useful and what's not, and what we should change or create moving forward.
While I think CF may cover most of the variables in the sensor data products, I have followed the Guidelines for Construction of CF Standard Names to develop (i.e. make up, without intention of requesting approval) a list of CF-style names for variables related to ecosystem and plant physiology that aren't covered. Here is the list of proposed names. Changes / comments welcome. It's not perfect, but hopefully it will give us what we need for now. We will certainly need to support many vocabularies using a thesaurus (a many-to-many lookup tied to the primary variables table in betydb.org). CSDMS sounds like a good candidate. For data interoperability we will also support ICASA from the USDA and AgMIP (the agricultural modeling community), but that doesn't cover sensors.
Greetings from the Gantry... Attached is the first crack by Markus Radermacher on the metadata framework for imagery.
How do we deal with the relative spectral response (RSR) curves of each sensor?
This is what @robkooper originally suggested: making the calibration docs accessible via a web-based API. However, @czender suggested including it in each file to make the file completely self-documenting, with the rationale that while this is a lot of information, it is much smaller than the data. This also seems sensible. @rjstrand do you have the spectral response and other calibration data for all of the sensors?
Is the RSR (expected to be) constant in time, i.e., the same RSR per sensor forever? And how many numbers characterize the RSR at each wavelength?
Ideally, yeah, it would be constant, but sensors could drift with age. As to how many numbers, it depends on the precision of calibration.
Here are the spec sheets for the spectrometers for reference. And the sensor suite (also adding a sonic anemometer).
Is that all the information they provide? If so, it may be worth contacting the manufacturer.
Here are datasheets for the Hyperspec imaging sensors: Headwall VNIR imaging spectrometer 380-1000nm.pdf
@solmazhajmohammadi those were the same as attached to my message (in addition to the images of the full sensor suite). But do you know of, or can you get, the files from the sensor calibration done by the manufacturer (I think this was done in December) and any subsequent testing?
@dlebauer Not so far. I'll update you once I get further information.
@ALL The calibration information for the hyperspec devices was never provided in a digital format. We are in contact with Headwall. Please be patient.
This is a proposal for spectral and imaging data to be provided as HDF5 / NetCDF-4 data cubes for computing and downloading by end users.
Following CF naming conventions [1], these would be in a NetCDF-4 compatible / well-behaved HDF5 format. Also see [2] for example formats from NOAA.
Questions to address:
see also PecanProject/pecan#665
Radiance data
Variables
note: upwelling_spectral_radiance_in_air may only be an intermediate product (and perhaps isn't exported from some sensors?), so the focus is really on reflectance as a Level 2 product.
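If reflectance is the Level 2 product derived from radiance, one common approach (an assumption here, not necessarily the project's adopted calibration) is an empirical-line-style ratio against a white-reference panel measured under the same illumination:

```python
import numpy as np

def radiance_to_reflectance(radiance, white_ref, dark=0.0):
    """Empirical-line-style conversion (a sketch, not the project's
    adopted algorithm): reflectance as the dark-corrected, per-wavelength
    ratio of target radiance to white-reference radiance."""
    return (radiance - dark) / (white_ref - dark)

rad = np.array([10.0, 20.0, 30.0])    # target upwelling spectral radiance
white = np.array([40.0, 40.0, 40.0])  # white-panel radiance, same bands
refl = radiance_to_reflectance(rad, white)  # → [0.25, 0.5, 0.75]
```

Storing both the radiance and the white-reference spectrum in the file would let users recompute reflectance if the calibration is later revised.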
Dimensions
[1] http://cfconventions.org/Data/cf-standard-names/29/build/cf-standard-name-table.html
[2] http://www.nodc.noaa.gov/data/formats/netcdf/v1.1/