Process to handle scan programs #362


Closed
1 of 13 tasks
craig-willis opened this issue Sep 28, 2017 · 9 comments

@craig-willis
Contributor

craig-willis commented Sep 28, 2017

Per discussion with @smarshall-bmr, there are potentially multiple scans and scan programs on a given day. We need a process to capture the scan program information, update the cleaned metadata with the final program name/tag and update downstream extractors to respect this information (e.g., fullfield needs to run on a day for files associated with a program).

We need a way to determine if we have all files for that program in the raw data for the rulechecker.

@smarshall-bmr mentioned that program names may change over time, so we need to map to a standard name.
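Mapping changing program names onto a standard name could look roughly like this sketch (the alias table, canonical names, and function are hypothetical illustrations, not an existing part of the pipeline):

```python
import re

# Hypothetical alias table mapping historical script names to one canonical
# program name; the real table would be curated with @smarshall-bmr.
PROGRAM_ALIASES = {
    "SWIR_VNIR_Day1": "swir_vnir_day",
    "SWIR_VNIR_Day2": "swir_vnir_day",
    "VNIR_Stereo_Full_Field_0.02m": "vnir_stereo_full_field",
}

def normalize_program(script_filename):
    """Strip the trailing UUID and .cs extension, then map to a standard name."""
    stem = re.sub(
        r"_[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}\.cs$",
        "",
        script_filename,
    )
    return PROGRAM_ALIASES.get(stem, stem)

print(normalize_program("SWIR_VNIR_Day1_7e5b6f22-f990-43fc-92d4-fee2db330d4a.cs"))
# -> swir_vnir_day
```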

There is another issue with multiple scans on the same day with the same program. It's unclear whether this is something we need to address.

Completion criteria:

  • Complete set of programs/scripts in Github with documentation
    • what actually exists according to grep
    • access to all of the script copies on the ftp server
  • protocols describing each script (@smarshall-bmr)
  • Information about which programs are experimental vs. standard (i.e., should use in fullfield stitching) (@smarshall-bmr) and what the uses / intent of experimental runs are
  • Update the monitoring process to accumulate counts based on program
  • Update metadata cleaning process to add program information to dataset metadata and standardize any names
  • Update rulechecker to include the program in the unique key. This will result in potentially multiple fullfield images per day.
  • Define process for handling new scan programs going forward
    • need to discuss context, what Stuart's workflow is, and how to integrate checking and pipeline updates
    • process for documenting and preparing for new scripts (require PR before using a new scan - requires approval)
    • anything unrecognized will be categorized as experimental and unprocessed (at least beyond level 1) until recognized
  • Review output
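The experimental-vs-standard gatekeeping in the criteria above could be sketched like this (KNOWN_PROGRAMS and the tier names are hypothetical; the real list would come from @smarshall-bmr's annotations):

```python
# Hypothetical set of approved, documented scan programs; anything
# unrecognized is treated as experimental and not processed beyond level 1.
KNOWN_PROGRAMS = {"vnir_stereo_full_field", "swir_vnir_day"}

def classify_program(program_name):
    """Return the processing tier for a standardized program name."""
    if program_name in KNOWN_PROGRAMS:
        return "standard"       # eligible for fullfield stitching
    return "experimental"       # held at level 1 until documented and approved

assert classify_program("vnir_stereo_full_field") == "standard"
assert classify_program("test") == "experimental"
```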
@dlebauer
Member

Craig compiled a list of unique scripts that @smarshall-bmr can annotate:

In addition, the metadata field appears to provide a link to a copy of the script version run on a particular day, on an FTP server (ftp://10.160.21.2//gantry_data/LemnaTec/ScriptBackup/), but this is not resolving.

the metadata looks like

    "gantry_system_variable_metadata": {
      "...":,
      "Script path on local disk": "C:\\LemnaTec\\StoredScripts\\SWIR_VNIR_Day1.cs",
      "Script copy path on FTP server": "ftp://10.160.21.2//gantry_data/LemnaTec/ScriptBackup/SWIR_VNIR_Day1_7e5b6f22-f990-43fc-92d4-fee2db330d4a.cs",
      "..."
    }
  • @smarshall-bmr where is this ftp server?
  • @jdmaloney do you know where this ftp server is, and could you put these in an appropriate place under raw_data (like ua-mac/raw_data/gantry_data/LemnaTec/ScriptBackup/)?
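For reference, pulling these fields out of a dataset's metadata might look like this sketch (field names are copied from the sample above; `ntpath` parses the Windows-style path regardless of the host OS):

```python
import ntpath

def script_info(metadata):
    """Extract the script name and its FTP backup copy from dataset metadata."""
    gsv = metadata["gantry_system_variable_metadata"]
    local = gsv["Script path on local disk"]  # Windows path, e.g. C:\LemnaTec\StoredScripts\...
    return {
        "script_name": ntpath.basename(local),
        "ftp_copy": gsv.get("Script copy path on FTP server"),
    }

md = {
    "gantry_system_variable_metadata": {
        "Script path on local disk": "C:\\LemnaTec\\StoredScripts\\SWIR_VNIR_Day1.cs",
        "Script copy path on FTP server": "ftp://10.160.21.2//gantry_data/LemnaTec/ScriptBackup/SWIR_VNIR_Day1_7e5b6f22-f990-43fc-92d4-fee2db330d4a.cs",
    }
}
print(script_info(md)["script_name"])  # -> SWIR_VNIR_Day1.cs
```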

@max-zilla
Contributor

we have the /gantry_data directory accessible on the cache server, but the ScriptBackup is not currently whitelisted for the transfer pipeline. there are 1,687 scripts listed there, e.g.:

SWIR_VNIR_stereoVIS_IR_Sensors_whiteTarget_lightsOFF_a7b474cb-234a-4724-833d-4847cc83d2d4.cs
test_006a98f5-746f-4bbf-9a30-27fcb32cbebf.cs
test_2a0c5379-837c-4889-be6b-38b554158af3.cs
test_6f6f542d-1f1b-411a-a801-df489838e1cd.cs
test_9a0bdead-54fd-4f17-8cbd-fefdbfbf3c4f.cs
Tutorial_53a5cc92-6340-41bb-bd14-e80c960433fd.cs
Tutorial_7600ad45-a660-4e77-97a8-6a09217b000d.cs
Tutorial_ed3471a2-2912-40bf-b15e-bd94ad19be68.cs
VNIR_Stereo_Full_Field_0.02m_47e46752-088b-43dc-9f4a-81b5a77304ba.cs
VNIR_Stereo_Full_Field_0.02m_49925286-0e49-4eeb-9667-0d3d5347f3e0.cs
VNIR_Stereo_Full_Field_0.02m_4e732583-6f17-4b55-b2eb-1fcee83d7fa7.cs
VNIR_Stereo_Full_Field_0.02m_61dd2093-c69c-43c5-b645-184eff39d513.cs
VNIR_Stereo_Full_Field_0.02m_7591c364-f646-4a1a-98c1-cb2570403cf7.cs
VNIR_Stereo_Full_Field_0.02m_9ec4f9b3-f52c-43f9-918a-0bdd9f695f3d.cs
VNIR_Stereo_Full_Field_0.02m_a246d47d-33fa-4ffd-b8d3-803c289ecc69.cs
VNIR_Stereo_Full_Field_0.02m_a372086f-31c4-4fcd-a450-91523ea651e4.cs
VNIR_Stereo_Full_Field_0.02m_c7e5954b-42b7-4dd4-95f4-576fefa16324.cs
VNIR_Stereo_Full_Field_0.02m_e302dfdc-51a0-4cb3-94aa-fe37c1bf92da.cs
VNIR_Stereo_Full_Field_0.02m_e752aad9-44a1-440c-a393-bd2567ec2a47.cs
VNIR_Stereo_Full_Field_0.02m_ea2e8db1-6fde-4855-9d7e-c9dedaf1febd.cs
VNIR_Stereo_Full_Field_0.02m_f2219962-2103-4b05-bc5a-a99d41391974.cs
VNIRtest_4m_cb2741e5-2d62-4998-80aa-295583e50b63.cs

@craig-willis this probably obviates your scan unless we have scripts in your results that don't appear here.

Here is a list I generated.
script_list.txt
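Collapsing the 1,687 backup filenames to unique program names (for comparison against the shorter annotated list) could be done by stripping the trailing UUID, e.g. this sketch, assuming one filename per line:

```python
import re
from collections import Counter

UUID_SUFFIX = re.compile(
    r"_[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}\.cs$"
)

def unique_programs(filenames):
    """Count script-backup copies per program name (UUID suffix removed)."""
    return Counter(UUID_SUFFIX.sub("", name.strip()) for name in filenames)

# Sample entries from the listing above.
sample = [
    "test_006a98f5-746f-4bbf-9a30-27fcb32cbebf.cs",
    "test_2a0c5379-837c-4889-be6b-38b554158af3.cs",
    "VNIRtest_4m_cb2741e5-2d62-4998-80aa-295583e50b63.cs",
]
print(unique_programs(sample))  # -> Counter({'test': 2, 'VNIRtest_4m': 1})
```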

@craig-willis
Contributor Author

The list I generated has 80 entries.

https://docs.google.com/spreadsheets/d/1tMkPT2jtMgTficfSDx80-RpB6PXuzry2gLEZG_ajvl8/edit

I gather what you've got is specific versions on specific dates? That would be better for true traceability, if we can reference the specific script from the dataset metadata.

@craig-willis
Contributor Author

@dlebauer Along with the program descriptions, should we consider linking to the fieldbook spreadsheet (or something that references the fieldbook) from the dataset metadata?

https://docs.google.com/spreadsheets/d/1eQSeVMPfrWS9Li4XlJf3qs2F8txmddbwZhjOfMGAvt8/edit#gid=665425213

@dlebauer
Member

@craig-willis that would seem reasonable. I'm not sure of the best way to do this. Much of this could be inserted into the BETYdb managements table (and some of it already is). The only issue is that the record keeping hasn't been consistent over the years.

@smarshall-bmr what are your thoughts?

@max-zilla
Contributor

max-zilla commented Oct 2, 2017

@craig-willis @robkooper here's an exam question for ya.

right now we trigger the full field extractor by:

  • checking every incoming geotiff file and adding it to the list for that day+sensor
  • if our # of geotiffs for that day+sensor == the number of raw datasets used to generate the geotiffs for that day, trigger
  • (e.g. trigger RGB GeoTIFF full field when RGB Geotiff count == stereoTop raw_data count for that day)

I have added code to account for multiple scans per day in our full field unique key (day+sensor+scan). BUT that breaks our count check. I need a new method to know when to trigger full field without triggering when only half the geotiffs are generated.

Ideas....

  • create a custom rulechecker query so I can get the count of rules for that day+sensor across all scans, then trigger all scans at once if sum(all scans) == total raw datasets. this is the most straightforward option but requires a little coding.

  • bin2tif extractor creates some record in PSQL of scan counts per day that our field mosaic checks against. I don't like this solution much; it requires a lot of centralization (which I guess we've already done for rulechecker) and customization.

  • for this year we have some one-off script create a special file with scan counts per day that rulechecker checks, since our reprocessing is all historical. this seems like the worst solution to me.

  • Rob's idea that others have mentioned - use coverage sum of polygons to cover the full field
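A rough sketch of the first option (the query result is represented as a plain dict here; the actual rulechecker query and schema are stand-ins, not shown in this thread):

```python
def ready_to_trigger(counts_by_scan, total_raw_datasets):
    """Trigger full field for a day+sensor only once every geotiff is in.

    counts_by_scan: {scan_name: geotiff_count}, as would come from a
    rulechecker query across all scans for that day+sensor.
    """
    return sum(counts_by_scan.values()) == total_raw_datasets

# Half the geotiffs done -> don't trigger yet.
assert not ready_to_trigger({"scan1": 50, "scan2": 25}, 150)
# All scans complete -> trigger every scan's fullfield at once.
assert ready_to_trigger({"scan1": 100, "scan2": 50}, 150)
```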

@craig-willis
Contributor Author

@max-zilla As discussed, your first option seems practical for November.

@smarshall-bmr
Collaborator

@max-zilla
Contributor

we are accounting for this now.

4 participants