Subject: Marine Geospatial Ecology Tools (MGET) help
From: Jason Roberts
To: Bryan Costa - NOAA Affiliate
Subject: RE: [mget-help] Problem with Create Climatological Rasters for GHRSST L4 SST
Date: Wed, 6 Aug 2014 17:13:07 +0000

Bryan,

Thanks for your interest in MGET. I looked into this. It appears that the NASA PO.DAAC server is operating, but slowly right now. Looking at the output you sent, it appears that the tool ran successfully for 3 hours 40 minutes and then failed. My guess is that the server is just busy.

A problem with the GHRSST 1 km products (MUR and OUROCEAN) is that they are stored on the server as .bz2 compressed files and, according to the NASA folks, the server must decompress the entire file into its disk cache before it can read any of it. This means that even when you are reading a small spatial extent, the server has to decompress, on the fly, a 230 MB bz2 file that expands to several GB. The server’s response will be slow unless the file happens to be in its cache already, which is unlikely when you are trolling through hundreds of files to build a climatology. Depending on what is happening
on the server, this can make building the climatology almost intolerably slow, e.g. taking an hour or more to process one year of data, regardless of spatial extent.

My recommendations are:

1. Increase the Timeout Value parameter of the tool, perhaps from 60 seconds to 180 seconds, to give the server additional time to respond to each request, in case the on-the-fly decompression is taking an unusually long time.

2. If you wish the tool to keep trying for a long time (for example, if you are leaving the tool running overnight), increase Maximum Retry Time from 300 seconds to something huge, like 86400 (24 hours).

3. Use the Cache Directory parameter to allow the tool to store its own copy of the data. This way, if you have to restart the tool, it can resume where it left off, and not have to re-download the time slices it successfully downloaded before. Note that before resuming the tool, you should carefully check the last climatology raster it was working on. Sometimes, when the tool fails, it leaves behind an unpopulated raster that it cannot delete. I have not been able to figure out all scenarios in which this happens. If you find a raster that cannot be displayed by preview in ArcCatalog, delete it manually. If ArcCatalog fails to delete it, delete the file manually with Windows Explorer.

4. Reduce your spatial extent as much as possible. This will minimize the time required to download. Although the download is often not the bottleneck (often it is the server doing the on-the-fly decompression), it still adds cost to download unnecessary data at 1 km, daily resolution. It also bloats the rasters.

5. Currently the tool does not store data in the Cache Directory in compressed form. If you are downloading a lot of data, enable Windows compression on the directory to minimize the space it uses. The files will often be highly compressible.

If you’re still having trouble, please send me the complete output of the tool so I can see what parameters you used.

Hope that helps,
Jason

From: Bryan Costa - NOAA Affiliate [mailto:]
To Whom it May Concern,

I'm writing because I'm having problems using the "Create Climatological Rasters for GHRSST L4 SST" MGET Tool. Specifically, the server connection keeps on timing out. However, there doesn't seem to be any problem with the URL or connecting to the JPL URL. Also, I get the same error on different machines and using different input parameters. I look forward to hearing from you at your convenience.

Regards,
Bryan Costa

--------------------------------------------------------
Error
timeout: timed out
Traceback (most recent call last):
  File "C:\Program Files\GeoEco\ArcGISToolbox\Scripts\GHRSSTLevel4CreateClimatologicalArcGISRasters.py", line 5, in <module>
    ExecuteMethodFromCommandLineAsArcGISTool('GeoEco.DataProducts.NASA.PODAAC', 'GHRSSTLevel4', 'CreateClimatologicalArcGISRasters')
  File "C:\Python27\ArcGIS10.1\lib\site-packages\GeoEco\ArcGISScripts.py", line 210, in ExecuteMethodFromCommandLineAsArcGISTool
    exec sourceCode in globals(), locals()
  File "<string>", line 1, in <module>
  File "C:\Python27\ArcGIS10.1\lib\site-packages\GeoEco\DataProducts\NASA\PODAAC.py", line 837, in CreateClimatologicalArcGISRasters
    workspace.ImportDatasets(collection.QueryDatasets(), mode, calculateStatistics=calculateStatistics, buildPyramids=buildPyramids)
  File "C:\Python27\ArcGIS10.1\lib\site-packages\GeoEco\Datasets\__init__.py", line 816, in ImportDatasets
    self._ImportDatasets(datasets, mode.lower(), reportProgress, options)
  File "C:\Python27\ArcGIS10.1\lib\site-packages\GeoEco\Datasets\Collections.py", line 694, in _ImportDatasets
    self._ImportDatasetsToPath(pathComponentsForPath[path], datasetsForPath[path], mode, progressReporter, options)
  File "C:\Python27\ArcGIS10.1\lib\site-packages\GeoEco\Datasets\ArcGIS.py", line 607, in _ImportDatasetsToPath
    self.DatasetType._ImportDatasetsToPath(os.path.join(self.Path, *pathComponents), sourceDatasets, mode, progressReporter, options)
  File "C:\Python27\ArcGIS10.1\lib\site-packages\GeoEco\Datasets\ArcGIS.py", line 1119, in _ImportDatasetsToPath
    GDALDataset._ImportDatasetsToPath(path, sourceDatasets, mode, None, {'useArcGISSpatialReference': True, 'useUnscaledData': useUnscaledData, 'calculateStatistics': False, 'blockSize': blockSize})
  File "C:\Python27\ArcGIS10.1\lib\site-packages\GeoEco\Datasets\GDAL.py", line 1073, in _ImportDatasetsToPath
    data = sourceDatasets[i].Data[rowsCopied:rowsCopied+rowsToCopy, :]
  File "C:\Python27\ArcGIS10.1\lib\site-packages\GeoEco\Datasets\__init__.py", line 3666, in __getitem__
    return getattr(self._Grid(), self._GetMethod)(key)
  File "C:\Python27\ArcGIS10.1\lib\site-packages\GeoEco\Datasets\__init__.py", line 3306, in _GetUnscaledDataAsArray
    data, actualNoDataValue = self._ReadNumpyArray(reorderedSliceList)
  File "C:\Python27\ArcGIS10.1\lib\site-packages\GeoEco\Datasets\Virtual.py", line 2076, in _ReadNumpyArray
    data = self._Grids[i].Data.__getitem__(tuple(sliceList))
  File "C:\Python27\ArcGIS10.1\lib\site-packages\GeoEco\Datasets\__init__.py", line 3666, in __getitem__
    return getattr(self._Grid(), self._GetMethod)(key)
  File "C:\Python27\ArcGIS10.1\lib\site-packages\GeoEco\Datasets\Virtual.py", line 562, in _GetUnscaledDataAsArray
    return self._Grid._GetUnscaledDataAsArray(self._AddSlicedDimsToKey(key))
  File "C:\Python27\ArcGIS10.1\lib\site-packages\GeoEco\Datasets\__init__.py", line 3306, in _GetUnscaledDataAsArray
    data, actualNoDataValue = self._ReadNumpyArray(reorderedSliceList)
  File "C:\Python27\ArcGIS10.1\lib\site-packages\GeoEco\Datasets\Virtual.py", line 1475, in _ReadNumpyArray
    data = eval(self._Expression)
  File "<string>", line 1, in <module>
  File "C:\Python27\ArcGIS10.1\lib\site-packages\GeoEco\Datasets\__init__.py", line 3666, in __getitem__
    return getattr(self._Grid(), self._GetMethod)(key)
  File "C:\Python27\ArcGIS10.1\lib\site-packages\GeoEco\Datasets\__init__.py", line 3246, in _GetScaledDataAsArray
    unscaledData = self._GetUnscaledDataAsArray(key)
  File "C:\Python27\ArcGIS10.1\lib\site-packages\GeoEco\Datasets\__init__.py", line 3306, in _GetUnscaledDataAsArray
    data, actualNoDataValue = self._ReadNumpyArray(reorderedSliceList)
  File "C:\Python27\ArcGIS10.1\lib\site-packages\GeoEco\Datasets\Virtual.py", line 342, in _ReadNumpyArray
    data[t] = self._CachedDatasets[i].UnscaledData.__getitem__(tuple([0] + sliceList[1:]))
  File "C:\Python27\ArcGIS10.1\lib\site-packages\GeoEco\Datasets\__init__.py", line 3666, in __getitem__
    return getattr(self._Grid(), self._GetMethod)(key)
  File "C:\Python27\ArcGIS10.1\lib\site-packages\GeoEco\Datasets\__init__.py", line 3306, in _GetUnscaledDataAsArray
    data, actualNoDataValue = self._ReadNumpyArray(reorderedSliceList)
  File "C:\Python27\ArcGIS10.1\lib\site-packages\GeoEco\Datasets\OPeNDAP.py", line 933, in _ReadNumpyArray
    downloadedData = self._GetSliceFromServer(sliceList).reshape(map(lambda s: s.stop-s.start, sliceList))
  File "C:\Python27\ArcGIS10.1\lib\site-packages\GeoEco\Datasets\OPeNDAP.py", line 986, in _GetSliceFromServer
    v = self._GetOPeNDAPVariable()
  File "C:\Python27\ArcGIS10.1\lib\site-packages\GeoEco\Datasets\OPeNDAP.py", line 732, in _GetOPeNDAPVariable
    self.ParentCollection._Open()
  File "C:\Python27\ArcGIS10.1\lib\site-packages\GeoEco\Datasets\OPeNDAP.py", line 613, in _Open
    raise RuntimeError(_(u'Failed to open URL %(url)s with OPeNDAP. The operation was retried for %(retry)i seconds without success. Verify that the URL is correct. If it is, check that the server is operating properly and that your computer can connect to it. If necessary, contact the server\'s operator for assistance. If the server and network are operating properly, this problem could be a programming error in this tool. If you suspect one, contact the author of this tool for assistance. Error details: %(e)s: %(msg)s') % {u'url': self._URL, u'retry': self._MaxRetryTime, u'e': e.__class__.__name__, u'msg': self._Unicode(e)})
RuntimeError: Failed to open URL http://podaac-opendap.jpl.nasa.gov/opendap/hyrax/allData/ghrsst/data/L4/GLOB/JPL/MUR/2005/027/20050127-JPL-L4UHfnd-GLOB-v01-fv03-MUR.nc.bz2 with OPeNDAP. The operation was retried for 300 seconds without success. Verify that the URL is correct. If it is, check that the server is operating properly and that your computer can connect to it. If necessary, contact the server's operator for assistance. If the server and network are operating properly, this problem could be a programming error in this tool. If you suspect one, contact the author of this tool for assistance. Error details: timeout: timed out

Failed to execute (GHRSSTLevel4CreateClimatologicalArcGISRasters).
Failed at Wed Aug 06 12:01:43 2014 (Elapsed Time: 3 hours 40 minutes 41 seconds)

--
Bryan Costa
Geospatial Scientist Ecosystem Modeler
CSS-Dynamac
NOAA|CCMA|Biogeography Branch
1305 East West Highway
N-SCI-1, SSMC 4, 9th Floor, #9232
Silver Spring, MD 20910
Phone: (301) 713-3028 x146
Fax: (301) 713-4384
Email:
The contents of this message are mine and do not necessarily reflect any position of NOAA
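
Recommendations 1 through 3 in Jason's reply (a longer per-request timeout, a longer overall retry window, and a local cache so that a restarted run can resume) can be illustrated with a minimal Python 3 sketch. This is not MGET's actual code: the names fetch_with_retry and cached_fetch, the "wait" parameter, and the idea of passing in a "fetch" callable are all hypothetical, chosen only to show how the Timeout Value, Maximum Retry Time, and Cache Directory settings might interact.

```python
import os
import time


def fetch_with_retry(fetch, timeout=180, max_retry_time=86400, wait=5):
    """Call fetch(timeout) until it succeeds or max_retry_time seconds pass.

    `fetch` is a hypothetical callable that downloads one time slice and
    raises an OSError subclass (e.g. TimeoutError) when the server is slow.
    """
    deadline = time.monotonic() + max_retry_time
    while True:
        try:
            return fetch(timeout)
        except OSError as e:  # socket timeouts are OSError subclasses
            if time.monotonic() >= deadline:
                raise RuntimeError(
                    "The operation was retried for %i seconds without "
                    "success. Error details: %s: %s"
                    % (max_retry_time, e.__class__.__name__, e))
            time.sleep(wait)  # brief pause before asking the server again


def cached_fetch(name, cache_dir, fetch, **retry_options):
    """Return the bytes for `name`, downloading only if not already cached."""
    path = os.path.join(cache_dir, name)
    if os.path.exists(path):
        # A previous run already downloaded this slice; skip the server.
        with open(path, "rb") as f:
            return f.read()
    data = fetch_with_retry(fetch, **retry_options)
    os.makedirs(cache_dir, exist_ok=True)
    tmp = path + ".part"
    with open(tmp, "wb") as f:
        f.write(data)
    # Rename only after the write completes, so an interrupted run does
    # not leave a partial file under the final cache name.
    os.replace(tmp, path)
    return data
```

With settings like these, a slow server costs waiting time rather than a failed run, and a restart re-reads completed slices from disk. Writing to a temporary name first is one possible way to avoid the kind of half-written leftover file described in recommendation 3; whether MGET does anything similar is not stated in the thread.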