TDSCatalog has no datasets #114

Closed
rabernat opened this issue Oct 22, 2016 · 6 comments

@rabernat

I am trying to use siphon for a simple case: seeing all datasets in a THREDDS catalog.
http://oceandata.sci.gsfc.nasa.gov/opendap/SeaWiFS/L3SMI/2000/001/contents.html

But it is not returning any datasets.

Example:

from siphon.catalog import TDSCatalog
catalog = 'http://oceandata.sci.gsfc.nasa.gov/opendap/SeaWiFS/L3SMI/2001/001/catalog.xml'
cat = TDSCatalog(catalog)
print(cat.catalog_refs)
print(cat.datasets)

This gives two empty OrderedDicts:

OrderedDict()
OrderedDict()

What am I doing wrong here?

@lesserwhirls
Collaborator

Hi @rabernat - the server you are hitting is a HYRAX server, which does not advertise its data holdings via THREDDS Catalogs. If you try the URL in a browser, you'll see there is no xml catalog to be parsed. However, if you go to any one of the datasets from the html page, you will see a "Data URL" that you can pass directly to the netCDF4.Dataset class to read the data via OPeNDAP.

@rabernat
Author

Thanks @lesserwhirls for your quick reply.

If you try the URL in a browser, you'll see there is no xml catalog to be parsed.

When I put
http://oceandata.sci.gsfc.nasa.gov/opendap/SeaWiFS/L3SMI/2001/001/catalog.xml
into the browser, I see a long xml file. Here is a bit of it

<thredds:catalog xmlns:thredds="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:bes="http://xml.opendap.org/ns/bes/1.0#">
<thredds:service name="dap" serviceType="OPeNDAP" base="/opendap/hyrax"/>
<thredds:service name="file" serviceType="HTTPServer" base="/opendap/hyrax"/>
<thredds:service name="wms" serviceType="WMS" base="/ncWMS/wms"/>
<thredds:dataset name="/SeaWiFS/L3SMI/2001/001" ID="/opendap/hyrax/SeaWiFS/L3SMI/2001/001/">
<thredds:dataset name="S2001001.L3m_DAY_CHL_chl_ocx_9km.nc" ID="/opendap/hyrax/SeaWiFS/L3SMI/2001/001/S2001001.L3m_DAY_CHL_chl_ocx_9km.nc">
<thredds:dataSize units="bytes">1990904</thredds:dataSize>
<thredds:date type="modified">2015-10-01T21:23:02</thredds:date>
<thredds:access serviceName="dap" urlPath="/SeaWiFS/L3SMI/2001/001/S2001001.L3m_DAY_CHL_chl_ocx_9km.nc"/>
<thredds:access serviceName="wms" urlPath="?DATASET=lds/SeaWiFS/L3SMI/2001/001/S2001001.L3m_DAY_CHL_chl_ocx_9km.nc&SERVICE=WMS&VERSION=1.3.0&REQUEST=GetCapabilities"/>
</thredds:dataset>
<thredds:dataset name="S2001001.L3m_DAY_CHL_chlor_a_9km.nc" ID="/opendap/hyrax/SeaWiFS/L3SMI/2001/001/S2001001.L3m_DAY_CHL_chlor_a_9km.nc">
<thredds:dataSize units="bytes">1973123</thredds:dataSize>
<thredds:date type="modified">2015-10-01T21:23:14</thredds:date>
<thredds:access serviceName="dap" urlPath="/SeaWiFS/L3SMI/2001/001/S2001001.L3m_DAY_CHL_chlor_a_9km.nc"/>
<thredds:access serviceName="wms" urlPath="?DATASET=lds/SeaWiFS/L3SMI/2001/001/S2001001.L3m_DAY_CHL_chlor_a_9km.nc&SERVICE=WMS&VERSION=1.3.0&REQUEST=GetCapabilities"/>
</thredds:dataset>
...

This looks like a THREDDS Catalog to me. I'm not sure I understand what you mean by "does not advertise its data holdings via THREDDS Catalogs". Is this not a THREDDS catalog?

if you go to any one of the datasets from the html page, you will see a "Data URL" that you can pass directly to the netCDF4.Dataset class to read the data via OPeNDAP.

If I wanted to manually follow the links, I would have no need for siphon. I am trying to automate this in a script.

@lesserwhirls
Collaborator

Ok, not sure why I wasn't able to see the xml doc before - I got a generic error page last time I tried. I was unaware that HYRAX servers exposed THREDDS catalogs, so that's my bad.

The issue is that Siphon currently does not use the access elements in the XML document to create the access_urls. None of the catalogs that were used to develop Siphon used the access element, so it was overlooked.

I've opened an issue to address this bug. Thanks!
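
For illustration, the access elements could be resolved into URLs roughly as follows. This is a minimal sketch using only the Python standard library, not Siphon's actual implementation. The function name `access_urls_from_catalog` and the trimmed `SAMPLE` document are hypothetical, and the sketch assumes the usual THREDDS convention that an access URL is the service `base` plus the access `urlPath`, resolved against the catalog URL. Whether the resulting URLs are actually served by this particular server is not verified here.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# Namespace URI copied from the catalog excerpt in this thread.
THREDDS_NS = "http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0"
NS = {"thredds": THREDDS_NS}

CATALOG_URL = (
    "http://oceandata.sci.gsfc.nasa.gov/opendap/SeaWiFS/L3SMI/2001/001/catalog.xml"
)

# Trimmed, hypothetical sample based on the excerpt above (one service, one dataset).
SAMPLE = """\
<thredds:catalog xmlns:thredds="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0">
  <thredds:service name="dap" serviceType="OPeNDAP" base="/opendap/hyrax"/>
  <thredds:dataset name="/SeaWiFS/L3SMI/2001/001" ID="/opendap/hyrax/SeaWiFS/L3SMI/2001/001/">
    <thredds:dataset name="S2001001.L3m_DAY_CHL_chl_ocx_9km.nc" ID="/opendap/hyrax/SeaWiFS/L3SMI/2001/001/S2001001.L3m_DAY_CHL_chl_ocx_9km.nc">
      <thredds:access serviceName="dap" urlPath="/SeaWiFS/L3SMI/2001/001/S2001001.L3m_DAY_CHL_chl_ocx_9km.nc"/>
    </thredds:dataset>
  </thredds:dataset>
</thredds:catalog>
"""


def access_urls_from_catalog(xml_text, catalog_url):
    """Return {dataset name: {service name: absolute URL}} from access elements."""
    root = ET.fromstring(xml_text)

    # Map each service name to its base path.
    bases = {
        svc.get("name"): svc.get("base", "")
        for svc in root.iter("{%s}service" % THREDDS_NS)
    }

    result = {}
    for dataset in root.iter("{%s}dataset" % THREDDS_NS):
        urls = {}
        # findall() without a leading ".//" looks at direct children only,
        # so container datasets with no access elements are skipped.
        for access in dataset.findall("thredds:access", NS):
            service = access.get("serviceName")
            if service not in bases:
                continue  # access refers to a service the catalog does not define
            path = bases[service] + access.get("urlPath", "")
            urls[service] = urljoin(catalog_url, path)
        if urls:
            result[dataset.get("name")] = urls
    return result


for name, urls in access_urls_from_catalog(SAMPLE, CATALOG_URL).items():
    print(name, urls)
```

The `dap` URL produced this way is the kind of address that could then be passed to `netCDF4.Dataset`, which is what makes the listing scriptable rather than a matter of following links by hand.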

@rabernat
Author

Fantastic, thanks!

@dopplershift
Member

@rabernat Just got your feedback from the AOSPy workshop--I'll move this up my priority list, but I'm a bit swamped at the moment, so it might be early December before I can look into this.

@rabernat
Author

@dopplershift: It's not urgent! Work-arounds have been found.

lesserwhirls added a commit to lesserwhirls/siphon that referenced this issue Mar 18, 2017
lesserwhirls added a commit to lesserwhirls/siphon that referenced this issue Mar 18, 2017
lesserwhirls added a commit to lesserwhirls/siphon that referenced this issue Mar 21, 2017
lesserwhirls added a commit to lesserwhirls/siphon that referenced this issue Mar 21, 2017
lesserwhirls added a commit to lesserwhirls/siphon that referenced this issue Mar 21, 2017
dopplershift modified the milestone: 0.4.1 Mar 31, 2017