
Extractors for meteorological data #156


Closed
28 of 29 tasks
dlebauer opened this issue Aug 25, 2016 · 56 comments

@dlebauer (Member) commented Aug 25, 2016

Description

We have one script to process the environmental logger data.

  • We need scripts to insert data from MAC, Kansas, and UIUC into the geostreams database
  • We need to insert data from the environmental logger into the geostreams database
  • It would be good to update these files daily.

Further Suggestions / Request for Feedback

@robkooper Can we use BrownDog / PEcAn infrastructure?

How should these files be uploaded? They are small, so it is not necessarily worth setting up a Globus endpoint just for them if they can be downloaded via FTP.


(Appended below: task list and useful information)

Tasks

  • Input
    • Subscribe to related messages (*.dataset.files.added)
    • Process condition
      • All input ready (24 .dat files present)
      • Process not done yet (check metadata)
  • Process
    • Record property transformation (for mapping details see Extractors for meteorological data #156 (comment))
      • "TIMESTAMP" - Convert to ISO 8601 (and use as start_time and end_time)
      • "RECORD" - Discard
      • "BattV" - Discard
      • "PTemp_C" - Discard
      • "AirTC" - Convert to Kelvin
      • "RH" - Ensure unit is percent
      • "Pyro" - Direct use
      • "PAR_ref" - Direct use
      • "WindDir" - Given WS_ms is also present, convert into eastward_wind and northward_wind.
      • "WS_ms" - Ensure unit is meters per second
      • "Rain_mm_Tot" - Direct use
    • Assign proper time stamps
    • Assign coordinates
      • "can be SW corner of Gantry … we can start with 4 digits precision (10m?) and update this later"?
    • Assign stream_id
    • Assign sensor_id
    • Assign sensor_name
    • Add source info (dataset and file)
    • Add unit and sample method
  • Output
    • Save result (GeoStream API)
      • Assert record of sensor, station and stream
    • Mark input data as processed (use metadata).
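
The record-property transformations in the task list above can be sketched as a single mapping function. This is an illustrative sketch, not the extractor's actual code; in particular, the wind decomposition assumes WindDir is the direction the wind blows toward (matching the wind_to_direction field used later in this thread). If it is instead the meteorological "from" direction, both components flip sign.

```python
import math

def transform_record(rec):
    """Map one raw weather-station record onto geostreams properties.

    `rec` is a dict of raw column values. The to-direction wind
    convention here is an assumption to verify against the station docs.
    """
    theta = math.radians(rec["WindDir"])  # degrees -> radians
    ws = rec["WS_ms"]                     # already meters per second
    return {
        "air_temperature": rec["AirTC"] + 273.15,                    # deg C -> K
        "relative_humidity": rec["RH"],                              # percent
        "surface_downwelling_shortwave_flux_in_air": rec["Pyro"],    # W/m^2
        "surface_downwelling_photosynthetic_photon_flux_in_air": rec["PAR_ref"],
        "eastward_wind": ws * math.sin(theta),    # assumes "to" direction
        "northward_wind": ws * math.cos(theta),
        "precipitation_rate": rec["Rain_mm_Tot"], # direct use per task list
        # TIMESTAMP handled separately; RECORD, BattV, PTemp_C discarded
    }

row = {"AirTC": 26.74, "RH": 27.48, "Pyro": 0, "PAR_ref": 0,
       "WindDir": 65, "WS_ms": 2.45, "Rain_mm_Tot": 0}
props = transform_record(row)
```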

Raw data structure (.dat files)

{
   "WindDir":{
      "unit":"degrees",
      "sample_method":"Smp"
   },
   "PAR_ref":{
      "unit":"umol/s/m^2",
      "sample_method":"Smp"
   },
   "BattV":{
      "unit":"Volts",
      "sample_method":"Smp"
   },
   "TIMESTAMP":{
      "unit":"TS",
      "sample_method":""
   },
   "Rain_mm_Tot":{
      "unit":"mm",
      "sample_method":"Tot"
   },
   "Pyro":{
      "unit":"W/m^2",
      "sample_method":"Smp"
   },
   "RECORD":{
      "unit":"RN",
      "sample_method":""
   },
   "AirTC":{
      "unit":"Deg C",
      "sample_method":"Smp"
   },
   "WS_ms":{
      "unit":"meters/second",
      "sample_method":"Smp"
   },
   "RH":{
      "unit":"%",
      "sample_method":"Smp"
   },
   "PTemp_C":{
      "unit":"Deg C",
      "sample_method":"Smp"
   }
}
@Zodiase (Contributor) commented Sep 21, 2016

I'm a bit confused about the goal of this issue.

From the description I see these independent threads:

  1. Upload met data
  2. Process uploaded met data
  3. Provide met data on Roger

(1) doesn't seem to be something an extractor should be responsible for; how would the extractor be triggered?

For (2), what are the requirements for the processing?

How is (3) related to extractors?

I'm not sure how or where to start. Any pointers?

@dlebauer (Member, Author) commented Sep 22, 2016

For MAC weather station

  1. CSV files (*.dat) are on Roger here: /projects/arpae/terraref/sites/ua-mac/raw_data/weather
  2. An example of one dataset is https://terraref.ncsa.illinois.edu/clowder/datasets/57e115724f0cb775be69a949
  3. Convert to JSON and insert into the Clowder PostGIS sensor database (ask @caicai89- and @robkooper for details)

The API is documented here: https://terraref.ncsa.illinois.edu/clowder/assets/docs/api/index.html#!/datasets/addMetadata

But ask @max-zilla, @robkooper and @caicai89- for details.

The file format and variable names / units should follow specifications for PEcAn here: https://pecan.gitbooks.io/pecan-documentation/content/developers_guide/Adding-an-Input-Converter.html

@dlebauer (Member, Author) commented Sep 22, 2016

For schema see #130

Roughly, the schema looks like:

[image: clowder postgis schema ERD diagram]

  • SENSOR table --> UA MAC
  • STREAM will be 'weather station'
  • DATAPOINT will be all data at a single time point (?@robkooper)

@dlebauer (Member, Author) commented Sep 22, 2016

Here is an example of a record from one time point from here: https://greatlakesmonitoring.org/clowder/api/geostreams/datapoints?geocode=40.4868888889%2C-84.4817222222%2C0&since=2008-09-22+05%3A00%3A00&until=2014-07-03+19%3A00%3A00&format=json:

{
    "id": 1863734,
    "created": "2014-11-04T00:48:22Z",
    "start_time": "2008-09-22T10:00:00Z",
    "end_time": "2008-09-22T10:00:00Z",
    "properties": {
        "source": "http://www.heidelberg.edu/sites/default/files/dsmith/files/ChickasawData.xlsx",
        "srp-load": 0.4982,
        "Silica, mg/L": 9.09,
        "Sulfate, mg/L": 261.2,
        "nitrogen-load": 0.03,
        "Chloride, mg/L": 268.6,
        "phosphorus-load": 0.684,
        "SS, mg/L (suspended solids)": 12.3,
        "TKN, mg/L (Total Kjeldahl nitrogen)": 1.193
    },
    "type": "Feature",
    "geometry": {
        "type": "Point",
        "coordinates": [-84.4817222222, 40.4868888889, 0]
    },
    "stream_id": "7263",
    "sensor_id": "899",
    "sensor_name": "Chickasaw"
}

@ghost ghost changed the title Extractors for met data Extractors for meta data Sep 22, 2016
@ghost ghost changed the title Extractors for meta data Extractors for meterological data Sep 22, 2016
@dlebauer (Member, Author):

Here is the geostreams schema
geostream.sql.txt

@robkooper (Member):

Just keep in mind that you do not have access to the database; all operations have to be done through the API.

We discussed this in the past, and the thinking is to have sites represented as sensors in Clowder (ua-mac, ksu, etc.). Then have each sensor represented as a stream (VNIR, MET, stereo), and finally have each dataset, or in this case the actual values, represented as datapoints.

@ghost commented Sep 23, 2016

@dlebauer - is this different from #115? How?

@ghost ghost added this to the September 2016 milestone Sep 23, 2016
@ghost ghost added the 2 - Working <= 5 label Sep 23, 2016
@dlebauer (Member, Author):

@rachelshekar #115 just covers the environmental logger that is on the LemnaTec gantry / scanner; this is for met data more generally. The goal is to get it into a consistent format in Clowder, then create an extractor that converts the Clowder datastream to netCDF.

@dlebauer (Member, Author):

@robkooper I mostly wanted the SQL schema file to define the data model (since it is slightly different from the ERD diagram above).

@max-zilla (Contributor):

The goal is to insert data into PostGIS and convert to netCDF via an extractor.

@Zodiase (Contributor) commented Oct 5, 2016

@max-zilla Do you know what the extractor should subscribe to in order to monitor new met data files? I was thinking of subscribing to any new files in any dataset (with *.dataset.file.added), counting the .dat files in that dataset, and processing them once I count 24. Do you have a better way of doing this?
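
The trigger condition described here (all 24 hourly .dat files present, dataset not yet processed) can be sketched as a plain predicate, independent of any messaging framework. This is illustrative only; the "processed" metadata flag name is hypothetical.

```python
def dataset_ready(filenames, dataset_metadata, expected_count=24):
    """Return True when a dataset should be processed: all hourly .dat
    files are present and no prior metadata marks it as processed.

    `dataset_metadata` is a list of metadata dicts attached to the dataset;
    the "processed" key is a hypothetical marker the extractor would write.
    """
    dat_files = [f for f in filenames if f.endswith(".dat")]
    already_done = any(md.get("processed") for md in dataset_metadata)
    return len(dat_files) == expected_count and not already_done

# With 23 files the predicate stays False, so the extractor waits
# for the last hourly file before doing any work.
```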

@Zodiase (Contributor) commented Oct 5, 2016

@robkooper Could you explain a bit more about where (a specific dataset?) the extractor should get data from and where the output should go to?

@dlebauer (Member, Author) commented Oct 5, 2016

@Zodiase for the weather station at UA-MAC outside the scanner, the extractor should get data from the 'weather' directory (terraref/sites/ua-mac/raw_data/weather/) and insert it into the geostreams API.

@Zodiase (Contributor) commented Oct 5, 2016

@dlebauer Do you know how to use pyclowder to achieve that?

@robkooper (Member) commented Oct 5, 2016

You will need to register with clowder and say you are interested in a specific mimetype of files. The file will be downloaded and you are given a pointer to the file on disk. You can now work with that file and write the results in any location and notify clowder (using pyclowder) about this.

This might be a good point to add some functionality to pyclowder2 to deal with geostreams and make it easier for you.

@Zodiase (Contributor) commented Oct 5, 2016

@robkooper I know about the overall process, but I don't know exactly how I should save the results back to Clowder. How can I use pyclowder to "insert into the geostreams API", and is there a specific location the results should go?

I think what @dlebauer wants is to process those .dat files on a dataset basis (from what I have found, each dataset contains one day of data separated into 24 files). For that I'll just subscribe to dataset file added events unless anyone has a better way of doing it.

@max-zilla (Contributor) commented Oct 6, 2016

@Zodiase for the TERRA project our extractors have to be slightly more careful than others, because we want to write the output files to a specific location on Roger. However I don't think that matters here since we don't have output files, just insertion into geostreams database.

  • *.dataset.file.added and counting 24 .dat files is probably a reasonable place to start to trigger the extractor.
  • David is correct that the raw files will live on Roger at /sites/ua-mac/weather. When an extractor is triggered, pyClowder must get the dataset files to pass to your process_dataset(). @robkooper and I recently added pyClowder functionality to check whether the extractor has the file path mounted locally: in our case, we can mount /sites/ua-mac/weather onto the VM that will run this extractor, and then it won't need to download files. It will just return a pointer to where they exist locally. If the VM doesn't have this mount, it will have to download the files into /tmp.
  • The output of this extractor will be inserted into geostreams database using API endpoints in clowder such as POST /clowder/api/geostreams/streams, POST /clowder/api/sensors, POST /clowder/api/geostreams/datapoints.

I know @caicai89- has been looking at the geostreams API, and I need to update Clowder to support more complex geometries than points here - #157. I am not going to get to that until next week.

@dlebauer (Member, Author) commented Oct 6, 2016

@max-zilla this issue does not require inserting polygons ... are the other geostreams API endpoints available? (I don't see them here: https://terraref.ncsa.illinois.edu/clowder/assets/docs/api/index.html.)

@Zodiase (Contributor) commented Oct 6, 2016

@dlebauer Could you help me understand the format of the .dat files?

First 7 lines of any .dat file:

"TOA5","WeatherStation","CR1000","39656","CR1000.Std.29","CPU:F13WeatherStation.CR1","39725","SecData"
"TIMESTAMP","RECORD","BattV","PTemp_C","AirTC","RH","Pyro","PAR_ref","WindDir","WS_ms","Rain_mm_Tot"
"TS","RN","Volts","Deg C","Deg C","%","W/m^2","umol/s/m^2","degrees","meters/second","mm"
"","","Smp","Smp","Smp","Smp","Smp","Smp","Smp","Smp","Tot"
"2016-08-30 00:06:24",7276223,12.61,27.37,26.74,27.48,0,0,65,2.45,0
"2016-08-30 00:06:25",7276224,12.61,27.37,26.71,27.42,0,0,65,2.83,0
"2016-08-30 00:06:26",7276225,12.6,27.37,26.71,27.42,0,0,74,2.36,0
...

The data part looks like a typical CSV file with 11 columns. But what are the 4 lines above the data? Which one should I use as the column header? I tried to make sense of these 4 lines, and it looks to me like the second line is the column header and the third line is the units. The first and the fourth make no sense to me.

@max-zilla (Contributor) commented Oct 6, 2016

@Zodiase I am not positive, but I think this might help: https://www.manualslib.com/manual/538296/Campbell-Cr9000.html?page=43

It doesn't really explain the first line - I think that's just some information on the sensor/weather station that collected the data. If you look at page 42 of that link (the one before), I think it describes these: Station Name, Logger Serial Number, etc.

The fourth line looks like a description of how data was collected:

  • Smp = sampled
  • Tot = total
  • Avg (from the manual link) = average, etc.
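
Following that reading (line 1: station/logger info; line 2: column names; line 3: units; line 4: sample methods), a TOA5 file can be parsed with the standard csv module. A sketch under those assumptions, not the extractor's actual parser:

```python
import csv
import io

def parse_toa5(text):
    """Parse a Campbell Scientific TOA5 file into (environment, columns, rows).

    columns maps each column name to its unit and sample method;
    rows are dicts keyed by column name, values left as strings.
    """
    reader = csv.reader(io.StringIO(text))
    environment = next(reader)   # line 1: station/logger metadata
    names = next(reader)         # line 2: column headers
    units = next(reader)         # line 3: units (TS, RN, Volts, ...)
    methods = next(reader)       # line 4: sample method (Smp, Tot, Avg, ...)
    columns = {n: {"unit": u, "sample_method": m}
               for n, u, m in zip(names, units, methods)}
    rows = [dict(zip(names, r)) for r in reader if r]
    return environment, columns, rows

sample = '''"TOA5","WeatherStation","CR1000","39656","CR1000.Std.29","CPU:F13WeatherStation.CR1","39725","SecData"
"TIMESTAMP","RECORD","BattV","PTemp_C","AirTC","RH","Pyro","PAR_ref","WindDir","WS_ms","Rain_mm_Tot"
"TS","RN","Volts","Deg C","Deg C","%","W/m^2","umol/s/m^2","degrees","meters/second","mm"
"","","Smp","Smp","Smp","Smp","Smp","Smp","Smp","Smp","Tot"
"2016-08-30 00:06:24",7276223,12.61,27.37,26.74,27.48,0,0,65,2.45,0
'''
env, cols, rows = parse_toa5(sample)
```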

@dlebauer (Member, Author) commented Oct 6, 2016

That looks correct

@Zodiase (Contributor) commented Oct 6, 2016

@dlebauer So the extractor code I've worked on so far can be triggered and parses the raw input files without issues. The next step is to compose the JSON output you want. My understanding is that a data row such as "2016-08-30 00:06:24",7276223,12.61,27.37,26.74,27.48,0,0,65,2.45,0 should be converted into a single JSON document, and we discussed last time that each column would become one attribute in the properties field. But what about the rest of the JSON? The coordinates in the geometry, for instance. Could you give a thorough example of the output JSON you want for an input such as the data row above?

@dlebauer (Member, Author) commented Oct 6, 2016

{
    "id": 12345,
    "created": "2016-08-30 00:06:24 -08:00Z",
    "start_time": "2016-08-30 00:06:24 -08:00Z",
    "end_time": "2016-08-30 00:06:24 -08:00Z",
    "properties": {
        "source": "http://terraref.ncsa.illinois.edu/clowder/datasets/xyz123abc456",
        "air_temperature, K": "285.12",
        "relative_humidity, %": "27.37",
        "surface_downwelling_shortwave_flux_in_air, W m-2": 26.74,
        "surface_downwelling_photosynthetic_photon_flux_in_air, mol m-2 s-1": "0.02674",
        "wind_to_direction, degrees": 65,
        "wind_speed, m/s": 2.45,
        "precipitation_rate, mm/s": 0
    },
    "type": "Feature",
    "geometry": {
        "type": "Point",
        "coordinates": [-111.975071584772,
            33.074518691823,
            353.38
        ]
    },
    "stream_id": "123",
    "sensor_id": "123",
    "sensor_name": "UA-MAC F13 Weather Station"
}
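
A document in that shape can be assembled mechanically once a row is parsed. A sketch: the stream/sensor IDs, coordinates, and source URL below are placeholders that the real extractor would look up or configure, and the property names are simplified.

```python
def build_datapoint(iso_time, properties, coordinates, stream_id, sensor_id,
                    sensor_name, source):
    """Assemble a geostreams datapoint body in the shape shown above.

    For instantaneous samples start_time equals end_time; `id` and
    `created` are assigned by the geostreams service, not the extractor.
    """
    return {
        "start_time": iso_time,
        "end_time": iso_time,
        "properties": dict(properties, source=source),
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": list(coordinates)},
        "stream_id": str(stream_id),
        "sensor_id": str(sensor_id),
        "sensor_name": sensor_name,
    }

dp = build_datapoint(
    "2016-08-30T00:06:24-07:00",
    {"air_temperature": 299.89, "wind_speed": 2.45},
    (-111.975071584772, 33.074518691823, 353.38),  # lon, lat, elevation
    123, 123, "UA-MAC F13 Weather Station",
    "http://terraref.ncsa.illinois.edu/clowder/datasets/xyz123abc456",
)
```

The resulting dict would then be POSTed to the geostreams datapoints endpoint mentioned earlier in this thread.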

@ghost ghost modified the milestones: December 2016, September 2016 Nov 30, 2016
@Zodiase (Contributor) commented Dec 8, 2016

I've implemented the aggregation logic and the code is currently in this branch: https://github.com/terraref/extractors-meterological/tree/5-min-aggregation

@max-zilla Could you test it? I only added some testing code in parser.py and played with some test data (also included in the branch) locally. I think it would be better to test from a deployed extractor.

To change aggregation options:

The test data I played with yields this result:

[
   {
      "geometry":{
         "type":"Point",
         "coordinates":[
            33.0745666667,
            -111.9750833333,
            0
         ]
      },
      "start_time":"2016-08-30T00:06:24-07:00",
      "type":"Feature",
      "end_time":"2016-08-30T00:10:00-07:00",
      "properties":{
         "precipitation_rate":0.0,
         "wind_speed":1.6207870370370374,
         "surface_downwelling_shortwave_flux_in_air":0.0,
         "northward_wind":0.07488770951583902,
         "relative_humidity":26.18560185185185,
         "air_temperature":300.17606481481516,
         "eastward_wind":1.571286062845733,
         "surface_downwelling_photosynthetic_photon_flux_in_air":0.0
      }
   },
   {
      "geometry":{
         "type":"Point",
         "coordinates":[
            33.0745666667,
            -111.9750833333,
            0
         ]
      },
      "start_time":"2016-08-30T00:10:00-07:00",
      "type":"Feature",
      "end_time":"2016-08-30T00:15:00-07:00",
      "properties":{
         "precipitation_rate":0.0,
         "wind_speed":1.4256666666666669,
         "surface_downwelling_shortwave_flux_in_air":0.0,
         "northward_wind":-0.05141511827670856,
         "relative_humidity":24.226333333333386,
         "air_temperature":300.8981666666665,
         "eastward_wind":1.394382855930334,
         "surface_downwelling_photosynthetic_photon_flux_in_air":0.0
      }
   },
   {
      "geometry":{
         "type":"Point",
         "coordinates":[
            33.0745666667,
            -111.9750833333,
            0
         ]
      },
      "start_time":"2016-08-30T00:15:00-07:00",
      "type":"Feature",
      "end_time":"2016-08-30T00:20:00-07:00",
      "properties":{
         "precipitation_rate":0.0,
         "wind_speed":1.3858783783783772,
         "surface_downwelling_shortwave_flux_in_air":0.0,
         "northward_wind":-0.09425296463470188,
         "relative_humidity":23.29226351351351,
         "air_temperature":301.213952702703,
         "eastward_wind":1.348590540556527,
         "surface_downwelling_photosynthetic_photon_flux_in_air":0.0
      }
   },
   {
      "geometry":{
         "type":"Point",
         "coordinates":[
            33.0745666667,
            -111.9750833333,
            0
         ]
      },
      "start_time":"2016-08-30T00:20:00-07:00",
      "type":"Feature",
      "end_time":"2016-08-30T00:25:00-07:00",
      "properties":{
         "precipitation_rate":0.0,
         "wind_speed":0.8310000000000005,
         "surface_downwelling_shortwave_flux_in_air":0.0,
         "northward_wind":-0.35657497924484793,
         "relative_humidity":22.633933333333335,
         "air_temperature":301.50973333333326,
         "eastward_wind":0.7049300737104702,
         "surface_downwelling_photosynthetic_photon_flux_in_air":0.0
      }
   },
   {
      "geometry":{
         "type":"Point",
         "coordinates":[
            33.0745666667,
            -111.9750833333,
            0
         ]
      },
      "start_time":"2016-08-30T00:25:00-07:00",
      "type":"Feature",
      "end_time":"2016-08-30T00:30:00-07:00",
      "properties":{
         "precipitation_rate":0.0,
         "wind_speed":0.6694000000000001,
         "surface_downwelling_shortwave_flux_in_air":0.0,
         "northward_wind":-0.585180649157013,
         "relative_humidity":25.478600000000007,
         "air_temperature":301.2232333333329,
         "eastward_wind":0.30741648387327564,
         "surface_downwelling_photosynthetic_photon_flux_in_air":0.0
      }
   },
   {
      "geometry":{
         "type":"Point",
         "coordinates":[
            33.0745666667,
            -111.9750833333,
            0
         ]
      },
      "start_time":"2016-08-30T00:30:00-07:00",
      "type":"Feature",
      "end_time":"2016-08-30T00:35:00-07:00",
      "properties":{
         "precipitation_rate":0.0,
         "wind_speed":0.6296666666666666,
         "surface_downwelling_shortwave_flux_in_air":0.0,
         "northward_wind":-0.42173249926348644,
         "relative_humidity":26.469933333333355,
         "air_temperature":300.85969999999907,
         "eastward_wind":0.45458948531155813,
         "surface_downwelling_photosynthetic_photon_flux_in_air":0.0
      }
   },
   {
      "geometry":{
         "type":"Point",
         "coordinates":[
            33.0745666667,
            -111.9750833333,
            0
         ]
      },
      "start_time":"2016-08-30T00:35:00-07:00",
      "type":"Feature",
      "end_time":"2016-08-30T00:40:00-07:00",
      "properties":{
         "precipitation_rate":0.0,
         "wind_speed":0.8663333333333328,
         "surface_downwelling_shortwave_flux_in_air":0.0,
         "northward_wind":-0.6006981174489593,
         "relative_humidity":24.133233333333333,
         "air_temperature":300.97440000000034,
         "eastward_wind":0.5790642074746596,
         "surface_downwelling_photosynthetic_photon_flux_in_air":0.0
      }
   },
   {
      "geometry":{
         "type":"Point",
         "coordinates":[
            33.0745666667,
            -111.9750833333,
            0
         ]
      },
      "start_time":"2016-08-30T00:40:00-07:00",
      "type":"Feature",
      "end_time":"2016-08-30T00:45:00-07:00",
      "properties":{
         "precipitation_rate":0.0,
         "wind_speed":1.1200666666666672,
         "surface_downwelling_shortwave_flux_in_air":0.0,
         "northward_wind":-1.0444193473063164,
         "relative_humidity":21.460900000000024,
         "air_temperature":301.59006666666653,
         "eastward_wind":0.3707760504240207,
         "surface_downwelling_photosynthetic_photon_flux_in_air":0.0
      }
   },
   {
      "geometry":{
         "type":"Point",
         "coordinates":[
            33.0745666667,
            -111.9750833333,
            0
         ]
      },
      "start_time":"2016-08-30T00:45:00-07:00",
      "type":"Feature",
      "end_time":"2016-08-30T00:50:00-07:00",
      "properties":{
         "precipitation_rate":0.0,
         "wind_speed":1.3106333333333342,
         "surface_downwelling_shortwave_flux_in_air":0.0,
         "northward_wind":-1.249505862534591,
         "relative_humidity":21.709133333333313,
         "air_temperature":301.60549999999927,
         "eastward_wind":0.38198168724184367,
         "surface_downwelling_photosynthetic_photon_flux_in_air":0.0
      }
   },
   {
      "geometry":{
         "type":"Point",
         "coordinates":[
            33.0745666667,
            -111.9750833333,
            0
         ]
      },
      "start_time":"2016-08-30T00:50:00-07:00",
      "type":"Feature",
      "end_time":"2016-08-30T00:55:00-07:00",
      "properties":{
         "precipitation_rate":0.0,
         "wind_speed":1.297633333333334,
         "surface_downwelling_shortwave_flux_in_air":0.0,
         "northward_wind":-1.253133336504686,
         "relative_humidity":21.457600000000024,
         "air_temperature":301.69336666666703,
         "eastward_wind":0.324976158201803,
         "surface_downwelling_photosynthetic_photon_flux_in_air":0.0
      }
   },
   {
      "geometry":{
         "type":"Point",
         "coordinates":[
            33.0745666667,
            -111.9750833333,
            0
         ]
      },
      "start_time":"2016-08-30T00:55:00-07:00",
      "type":"Feature",
      "end_time":"2016-08-30T01:00:00-07:00",
      "properties":{
         "precipitation_rate":0.0,
         "wind_speed":1.3804999999999998,
         "surface_downwelling_shortwave_flux_in_air":0.0,
         "northward_wind":-1.3556587631934873,
         "relative_humidity":21.25273333333331,
         "air_temperature":301.7047000000008,
         "eastward_wind":0.23843479144932786,
         "surface_downwelling_photosynthetic_photon_flux_in_air":0.0
      }
   },
   {
      "geometry":{
         "type":"Point",
         "coordinates":[
            33.0745666667,
            -111.9750833333,
            0
         ]
      },
      "start_time":"2016-08-30T01:00:00-07:00",
      "type":"Feature",
      "end_time":"2016-08-30T01:05:00-07:00",
      "properties":{
         "precipitation_rate":0.0,
         "wind_speed":1.5816666666666679,
         "surface_downwelling_shortwave_flux_in_air":0.0,
         "northward_wind":-1.5763178363241495,
         "relative_humidity":22.110499999999984,
         "air_temperature":301.4501999999997,
         "eastward_wind":0.11952470035541446,
         "surface_downwelling_photosynthetic_photon_flux_in_air":0.0
      }
   },
   {
      "geometry":{
         "type":"Point",
         "coordinates":[
            33.0745666667,
            -111.9750833333,
            0
         ]
      },
      "start_time":"2016-08-30T01:05:00-07:00",
      "type":"Feature",
      "end_time":"2016-08-30T01:08:23-07:00",
      "properties":{
         "precipitation_rate":0.0,
         "wind_speed":1.682058823529412,
         "surface_downwelling_shortwave_flux_in_air":0.0,
         "northward_wind":-1.6719726912984594,
         "relative_humidity":23.09779411764704,
         "air_temperature":301.1047058823543,
         "eastward_wind":-0.14332925981518027,
         "surface_downwelling_photosynthetic_photon_flux_in_air":0.0
      }
   }
]

Notice that all the data entries are in clean 5-minute chunks, except for the first one and the last one (since some data in other datasets may belong to the same 5-minute chunks).
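
The chunking above amounts to flooring each timestamp to a 5-minute boundary and averaging within each bucket. A minimal sketch (averaging a single value series; the real aggregation averages all properties, including the wind components):

```python
from collections import defaultdict
from datetime import datetime, timedelta

def aggregate_5min(records):
    """Group (timestamp, value) samples into 5-minute buckets and average.

    Buckets at the edges of a file naturally come out shorter, matching
    the partial first/last chunks shown above.
    """
    buckets = defaultdict(list)
    for ts, value in records:
        # floor the timestamp to the enclosing 5-minute boundary
        floored = ts - timedelta(minutes=ts.minute % 5,
                                 seconds=ts.second,
                                 microseconds=ts.microsecond)
        buckets[floored].append(value)
    return {start: sum(vals) / len(vals)
            for start, vals in sorted(buckets.items())}

recs = [(datetime(2016, 8, 30, 0, 6, 24), 2.45),
        (datetime(2016, 8, 30, 0, 6, 25), 2.83),
        (datetime(2016, 8, 30, 0, 11, 0), 1.0)]
out = aggregate_5min(recs)   # two buckets: 00:05 and 00:10
```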

@max-zilla (Contributor):

I'm pulling this code today while updating the extractor and will test, @Zodiase

@ghost ghost added cyberGIS labels Jan 3, 2017
@ghost commented Jan 12, 2017

@max-zilla - please update

@max-zilla max-zilla assigned max-zilla and unassigned Zodiase Jan 12, 2017
@max-zilla (Contributor):

@Zodiase I integrated your code and deployed it to the extractor VM, but the hardware failure this week delayed my ability to test. Assigning this to myself so I can close once I confirm everything's good, but it's 99% ready.

@ghost ghost removed this from the December 2016 milestone Jan 12, 2017
@dlebauer dlebauer added this to the January 2017 milestone Jan 12, 2017
@dlebauer dlebauer changed the title Extractors for meterological data Extractors for meteorological data Jan 12, 2017
@ghost ghost removed the help wanted label Jan 12, 2017
@max-zilla (Contributor):

Remember UIUC and Kansas; Charlie may already have code to pull from netCDF.

@Zodiase (Contributor) commented Jan 19, 2017

@max-zilla How are datasets from UIUC and Kansas different from MAC datasets? Would the current message subscription (*.dataset.files.added) work for them? Do they have different schemas? And why would the extractor need code to pull data from netCDF? Shouldn't it only pull from Clowder datasets and insert into geostreams?

@max-zilla (Contributor):

@Zodiase I have not seen the UIUC/Kansas datasets yet. I put that comment there as a note during the meeting, but I think the netCDF note was not for this specific extractor but for other met data we might see.

@dlebauer (Member, Author):

Data flow should be raw --> geostreams --> netCDF

The raw--> netCDF developed alongside the hyperspectral extractor is a special case

The raw --> geostreams extractor may need special handling for each data source, so we should open separate issues for each additional source. Later we can write the geostreams --> netCDF extractor, and we will only need one.

@max-zilla (Contributor):

@robkooper @dlebauer this extractor raises a question given our discussions yesterday. If we want to store the geostream info by plot as discussed, we'll probably want a "plot" that covers the entire field as well (or a marker to the side of the field) to indicate that the met data is not specific to a plot but instead to the entire location.

I thought about adding the met datapoint to EVERY plot, but I think that would make the visualization of that metric busy and confusing.

Having a synthetic "full field" plot that is not in the lookup shapefile (so things are only assigned to it if we engineer them to do so) could eventually be handy for other reasons too.

@robkooper (Member):

I agree, having a plot that is the whole site for the met data should work.

@dlebauer (Member, Author) commented Jan 27, 2017 via email

@max-zilla (Contributor):

This is complete and running with the geostream component & 5-minute aggregations.

We can use #173 to discuss the netCDF portion.
