Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix nc_open_mem #400

Merged
merged 6 commits into from
May 5, 2017
Merged

fix nc_open_mem #400

merged 6 commits into from
May 5, 2017

Conversation

thehesiod
Copy link
Contributor

@thehesiod thehesiod commented Apr 28, 2017

@thehesiod thehesiod mentioned this pull request Apr 28, 2017
@thehesiod
Copy link
Contributor Author

for those who are on OSX I've created a custom brew tap with the patch: https://github.com/thehesiod/homebrew-tap/blob/master/Formula/netcdf.rb

@thehesiod
Copy link
Contributor Author

btw, is there a way to create a diskless netCDF file, then get the bytes back w/o touching the disk? Something like nc_sync_and_return_bytes. Would be really useful for testing

@DennisHeimbigner
Copy link
Collaborator

Not at the moment. And I am not sure it is doable for netcdf-4 because we rely on
the HDF5 in memory capability for netcdf-4.

@DennisHeimbigner
Copy link
Collaborator

After a quick look, it looks like we would
have to write a custom driver; probably a wrapper around
the existing H5FD_CORE driver would work. But sufficiently
challenging so that we are not likely to tackle this any time soon.

@thehesiod
Copy link
Contributor Author

thanks for checking! Hopefully this patch can land soon.

@thehesiod
Copy link
Contributor Author

thehesiod commented May 1, 2017

ack, even with this fix it seems like for non trivial files it's returning corrupt data. With the following file:
foo.zip

and the following code:

#include <iostream>
#include <netcdf.h>
#include <netcdf_mem.h>
#include <iostream>
#include <fstream>
#include <assert.h>
#import <vector>


using namespace std;

#define    HRAPY  200
#define    HRAPX  333

static size_t value_count[] = {HRAPY, HRAPX};
static size_t start[] = {0, 0};

int main(int argc, const char * argv[]) {
    ifstream file("/tmp/foo.nc", ios::in | ios::binary | ios::ate);
    streamsize size = file.tellg();
    file.seekg(0, ios::beg);
    
    std::vector<char> buffer(size);
    int status;
    int ncid;
    
    if (file.read(buffer.data(), size))
    {
        status = nc_open_mem(".nc", 0, size, buffer.data(), &ncid);
        if(status != NC_NOERR) {
            cout << nc_strerror(status);
            assert(false);
        }
        
        int rh_id;
        status = nc_inq_varid(ncid, "amountofprecip", &rh_id);
        if(status != NC_NOERR) {
            cout << nc_strerror(status);
            assert(false);
        }
        
        int rh_vals[HRAPY][HRAPX];
        status = nc_get_vara_int(ncid, rh_id, start, value_count, (int*)rh_vals);
        if(status != NC_NOERR) {
            cout << nc_strerror(status);
            assert(false);
        }
    }

    return 0;
}

note there's no errors, however rows > 100 return garbage, row 100 looks like:

[ -71   76   96   37   20    8  -67  -47  -89   62   52  -52   78   86  -62
   60   46   44  -68  -44  -34   28   30  -30   -2   -1   14  -64  -68  -63
  -88  -86  -95   40  -70  -55    5  -56  -50  -61   88   57   45  -73   -3
  -45   19    5  -73   -6  -

whereas row 101 looks like:

[ 4325376  2818048   393215 -3997697 -4063233 -1114113 -1310720  4915199
 -6488064  5832703 -1114113 -2228225 -2686977 -3932160  6160384  3276799
  -720896  3670015 -4063232  6225919 -5636096  1966080  3014655 -4063233
 -5308416  1114112  3407871 -4653057 

opening the same file with nc_open does not return corrupt data.

@thehesiod
Copy link
Contributor Author

looks like it returns garbage after first iteration of for loop in ncx_getn_int_int

@thehesiod
Copy link
Contributor Author

thehesiod commented May 1, 2017

just logged #401 as I found this happens even with NC_DISKLESS mode.

@thehesiod
Copy link
Contributor Author

ok, have a fix for NC_DISKLESS/nc_open_mem, not sure why it works but so far it seems to work ok. Let me know what you guys think.

@WardF WardF self-assigned this May 2, 2017
@WardF WardF added this to the future milestone May 2, 2017
@WardF
Copy link
Member

WardF commented May 2, 2017

@DennisHeimbigner This looks straightforward to me but since it is your code I thought I'd wait for you to weigh in before doing anything. It's a very small change and at a glance appears to be a good one.

WardF added a commit that referenced this pull request May 2, 2017
@DennisHeimbigner
Copy link
Collaborator

Ok,digging deep,what I am seeing is a swap failure because
of a two byte discrepency in addresses. This is so weird that
I don't entirely believe it.

DennisHeimbigner added a commit that referenced this pull request May 3, 2017
It turns out that the chunksize used in the
ncio-related code must be a multiple of eight in
size.  Both memio.c and mmapio.c were
potentially violating this constraint.

See also pr #400
@DennisHeimbigner
Copy link
Collaborator

Pull request #403 must be applied before applying this pull request

@thehesiod
Copy link
Contributor Author

ok, reverted my hack, @DennisHeimbigner could you comment on the comment

/* Use half the filesize as the blocksize ; why? */

here: https://github.com/Unidata/netcdf-c/blob/master/libsrc/memio.c#L385
perhaps this is not needed anymore?

@thehesiod
Copy link
Contributor Author

@WardF If you wouldn't mind, I think it would be good adding a test for memio so it doesn't break again.

@WardF
Copy link
Member

WardF commented May 3, 2017

No problem, but it appears that Dennis added a test as part of his upull request

@thehesiod
Copy link
Contributor Author

@WardF confused because if he added a test it should fail w/o my fix. Ya I see the test supports opening a file from mem but I think it turns off the mem support by default (undef MEM). Pinging @DennisHeimbigner I'm guessing he added a test that theoretically could be run as DISKLESS or MEM but right now is DISKLESS.

@DennisHeimbigner
Copy link
Collaborator

I added a modified version of your code as our testcase; assume you don't mind
my using your code.

@thehesiod
Copy link
Contributor Author

@DennisHeimbigner np, does that mean your PR fixes the issue this PR attempts to fix, or your testcase is currently failing in your PR?

@DennisHeimbigner
Copy link
Collaborator

I'm guessing he added a test that theoretically could be run as DISKLESS or MEM
but right now is DISKLESS.
Correct, it turns out that using nc_open_mem is irrelevant to the problem. The problem
was with reading a file using NC_DISKLESS. I had that define in the test case as a holdover
from when I was trying to determine the cause of this problem. Shall I remove it?

@thehesiod
Copy link
Contributor Author

@DennisHeimbigner sorry, there were two issues, let me explain:

Issue 1 (nc_open_mem returns error code):
At first I was unable to use nc_open_mem because it returns OSError, so I created this PR (note setting diskless flag), that got rid of the OSError.

Issue 2 (corrupt data):
After the fix from this PR, I noticed that the data was corrupt, and created that second part about DISKLESS mode (which your PR fixes).

So we're wondering if you also want to fix what this fixes in your PR, which would require 2 testcases, or perhaps one uber testcase of nc_open_mem w/ CDF5 data.

@DennisHeimbigner
Copy link
Collaborator

No, we should probably have a separate test case for the issue 1 (nc_open_mem returns error).
the testcase I added is intended soley to address issue 2 (garbage data). I assumed your issue 1
pr had a test case. Is that not so?

@DennisHeimbigner
Copy link
Collaborator

Ok, another problem. When I run my test case using nc_open_mem, it works fine
and the bug fixed here is not occuring?
Ward, were you able to duplicate this bug (nc_open_mem failure)?

@WardF
Copy link
Member

WardF commented May 3, 2017

I will check in a bit; I was focused on diskless before, and not open mem. I'm in meetings currently and for much of the rest of the day but will attempt to duplicate ASAP.

@DennisHeimbigner
Copy link
Collaborator

Alex- can you post the dfile.c that you are using (with or without this pr)?

@thehesiod
Copy link
Contributor Author

ya, working on it now, still fighting MADIS data :)

@thehesiod
Copy link
Contributor Author

ok, just used the latest 4.4.1.1 release with the following code:

#include <iostream>
#include <netcdf.h>
#include <netcdf_mem.h>
#include <iostream>
#include <fstream>
#include <assert.h>
#import <vector>


using namespace std;


int main(int argc, const char * argv[]) {
    ifstream file("/tmp/20160513_1700.nc", ios::in | ios::binary | ios::ate);
    streamsize size = file.tellg();
    file.seekg(0, ios::beg);
    
    std::vector<char> buffer(size);
    int status;
    int ncid;
    
    if (file.read(buffer.data(), size))
    {
        status = nc_open_mem(".nc", 0, size, buffer.data(), &ncid);
        if(status != NC_NOERR) {
            cout << nc_strerror(status);
            assert(false);
        }
        
        std::cout << "Success!\n";
    }
    
    return 0;
}

and this file: https://madis-data.bldr.ncep.noaa.gov/madisPublic1/data/archive/2016/05/13/LDAD/mesonet/netCDF/20160513_1700.gz

and I get No such file or directory

@thehesiod
Copy link
Contributor Author

regarding difile.c, can get it from my branch: https://github.com/thehesiod/netcdf-c/blob/nc_open_mem_fix/libdispatch/dfile.c

@DennisHeimbigner
Copy link
Collaborator

Sorry, I just cannot duplcate the failure.

@thehesiod
Copy link
Contributor Author

ok, let me try on debian

@thehesiod
Copy link
Contributor Author

ok my steps to repro:

docker run --rm -ti debian
apt-get update
apt-get install -y wget clang build-essential libcurl4-gnutls-dev libhdf4-dev libhdf5-dev vim
cd /tmp

# get test file
wget https://madis-data.bldr.ncep.noaa.gov/madisPublic1/data/archive/2016/05/13/LDAD/mesonet/netCDF/20160513_1700.gz
gunzip 20160513_1700.gz
mv 20160513_1700 20160513_1700.nc

# install cmake
wget https://cmake.org/files/v3.8/cmake-3.8.0-Linux-x86_64.sh && \
    chmod u+x cmake-3.8.0-Linux-x86_64.sh && \
    ./cmake-3.8.0-Linux-x86_64.sh --prefix=/usr/local --skip-license

# build netcdf-c
wget https://github.com/Unidata/netcdf-c/archive/v4.4.1.1.tar.gz && \
    tar xvpf v4.4.1.1.tar.gz

cd netcdf-c-4.4.1.1
mkdir build_dir
cd build_dir
cmake .. -DCMAKE_BUILD_TYPE=RELEASE -DHAVE_HDF5_H=/usr/include/hdf5/serial/hdf5.h
make -j$(nproc) install

# build test app
cd /tmp
clang++ test.cpp -lnetcdf -o test
./test

results in:

root@52c457f5b5f5:/tmp# ./test 
test: test.cpp:27: int main(int, const char **): Assertion `false' failed.
No such file or directoryAborted

@WardF
Copy link
Member

WardF commented May 4, 2017

Thanks for providing a docker-based solution to reproduce this! I'll follow up on this.

Copy link
Member

@WardF WardF left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.

@WardF WardF merged commit e29dc26 into Unidata:master May 5, 2017
@thehesiod thehesiod deleted the nc_open_mem_fix branch July 21, 2017 01:47
netbsd-srcmastr pushed a commit to NetBSD/pkgsrc that referenced this pull request May 16, 2018
Upstream changes:
## 4.6.1 - March 15, 2018

* [Bug Fix] Corrected an issue which could result in a dap4 failure. See [Github #888](Unidata/netcdf-c#888) for more information.
* [Bug Fix][Enhancement] Allow `nccopy` to control output filter suppresion.  See [Github #894](Unidata/netcdf-c#894) for more information.
* [Enhancement] Reverted some new behaviors that, while in line with the netCDF specification, broke existing workflows.  See [Github #843](Unidata/netcdf-c#843) for more information.
* [Bug Fix] Improved support for CRT builds with Visual Studio, improves zlib detection in hdf5 library. See [Github #853](Unidata/netcdf-c#853) for more information.
* [Enhancement][Internal] Moved HDF4 into a distinct dispatch layer. See [Github #849](Unidata/netcdf-c#849) for more information.

## 4.6.0 - January 24, 2018
* [Enhancement] Full support for using HDF5 dynamic filters, both for reading and writing. See the file docs/filters.md.
* [Enhancement] Added an option to enable strict null-byte padding for headers; this padding was specified in the spec but was not enforced.  Enabling this option will allow you to check your files, as it will return an E_NULLPAD error.  It is possible for these files to have been written by older versions of libnetcdf.  There is no effective problem caused by this lack of null padding, so enabling these options is informational only.  The options for `configure` and `cmake` are `--enable-strict-null-byte-header-padding` and `-DENABLE_STRICT_NULL_BYTE_HEADER_PADDING`, respectively.  See [Github #657](Unidata/netcdf-c#657) for more information.
* [Enhancement] Reverted behavior/handling of out-of-range attribute values to pre-4.5.0 default. See [Github #512](Unidata/netcdf-c#512) for more information.
* [Bug] Fixed error in tst_parallel2.c. See [Github #545](Unidata/netcdf-c#545) for more information.
* [Bug] Fixed handling of corrupt files + proper offset handling for hdf5 files. See [Github #552](Unidata/netcdf-c#552) for more information.
* [Bug] Corrected a memory overflow in `tst_h_dimscales`, see [Github #511](Unidata/netcdf-c#511), [Github #505](Unidata/netcdf-c#505), [Github #363](Unidata/netcdf-c#363) and [Github #244](Unidata/netcdf-c#244) for more information.

## 4.5.0 - October 20, 2017

* Corrected an issue which could potential result in a hang while using parallel file I/O. See [Github #449](Unidata/netcdf-c#449) for more information.
* Addressed an issue with `ncdump` not properly handling dates on a 366 day calendar. See [GitHub #359](Unidata/netcdf-c#359) for more information.

### 4.5.0-rc3 - September 29, 2017

* [Update] Due to ongoing issues, native CDF5 support has been disabled by **default**.  You can use the options mentioned below (`--enable-cdf5` or `-DENABLE_CDF5=TRUE` for `configure` or `cmake`, respectively).  Just be aware that for the time being, Reading/Writing CDF5 files on 32-bit platforms may result in unexpected behavior when using extremely large variables.  For 32-bit platforms it is best to continue using `NC_FORMAT_64BIT_OFFSET`.
* [Bug] Corrected an issue where older versions of curl might fail. See [GitHub #487](Unidata/netcdf-c#487) for more information.
* [Enhancement] Added options to enable/disable `CDF5` support at configure time for autotools and cmake-based builds.  The options are `--enable/disable-cdf5` and `ENABLE_CDF5`, respectively.  See [Github #484](Unidata/netcdf-c#484) for more information.
* [Bug Fix] Corrected an issue when subsetting a netcdf3 file via `nccopy -v/-V`. See [Github #425](Unidata/netcdf-c#425) and [Github #463](Unidata/netcdf-c#463) for more information.
* [Bug Fix] Corrected `--has-dap` and `--has-dap4` output for cmake-based builds. See [GitHub #473](Unidata/netcdf-c#473) for more information.
* [Bug Fix] Corrected an issue where `NC_64BIT_DATA` files were being read incorrectly by ncdump, despite the data having been written correctly.  See [GitHub #457](Unidata/netcdf-c#457) for more information.
* [Bug Fix] Corrected a potential stack buffer overflow.  See [GitHub #450](Unidata/netcdf-c#450) for more information.

### 4.5.0-rc2 - August 7, 2017

* [Bug Fix] Addressed an issue with how cmake was implementing large file support on 32-bit systems. See [GitHub #385](Unidata/netcdf-c#385) for more information.
* [Bug Fix] Addressed an issue where ncgen would not respect keyword case. See [GitHub #310](Unidata/netcdf-c#310) for more information.

### 4.5.0-rc1 - June 5, 2017

* [Enhancement] DAP4 is now included. Since dap2 is the default for urls, dap4 must be specified by
(1) using "dap4:" as the url protocol, or
(2) appending "#protocol=dap4" to the end of the url, or
(3) appending "#dap4" to the end of the url
Note that dap4 is enabled by default but remote-testing is
disbled until the testserver situation is resolved.
* [Enhancement] The remote testing server can now be specified with the `--with-testserver` option to ./configure.
* [Enhancement] Modified netCDF4 to use ASCII for NC_CHAR.  See [Github Pull request #316](Unidata/netcdf-c#316) for more information.
* [Bug Fix] Corrected an error with how dimsizes might be read. See [Github #410](Unidata/netcdf-c#410) for more information.
* [Bug Fix] Corrected an issue where 'make check' would fail if 'make' or 'make all' had not run first.  See [Github #339](Unidata/netcdf-c#339) for more information.
* [Bug Fix] Corrected an issue on Windows with Large file tests. See [Github #385](Unidata/netcdf-c#385]) for more information.
* [Bug Fix] Corrected an issue with diskless file access, see [Pull Request #400](Unidata/netcdf-c#400) and [Pull Request #403](Unidata/netcdf-c#403) for more information.
* [Upgrade] The bash based test scripts have been upgraded to use a common test_common.sh include file that isolates build specific information.
* [Upgrade] The bash based test scripts have been upgraded to use a common test_common.sh include file that isolates build specific information.
* [Refactor] the oc2 library is no longer independent of the main netcdf-c library. For example, it now uses ncuri, nclist, and ncbytes instead of its homegrown equivalents.
* [Bug Fix] `NC_EGLOBAL` is now properly returned when attempting to set a global `_FillValue` attribute. See [GitHub #388](Unidata/netcdf-c#388) and [GitHub #389](Unidata/netcdf-c#389) for more information.
* [Bug Fix] Corrected an issue where data loss would occur when `_FillValue` was mistakenly allowed to be redefined.  See [Github #390](Unidata/netcdf-c#390), [GitHub #387](Unidata/netcdf-c#387) for more information.
* [Upgrade][Bug] Corrected an issue regarding how "orphaned" DAS attributes were handled. See [GitHub #376](Unidata/netcdf-c#376) for more information.
* [Upgrade] Update utf8proc.[ch] to use the version now maintained by the Julia Language project (https://github.com/JuliaLang/utf8proc/blob/master/LICENSE.md).
* [Bug] Addressed conversion problem with Windows sscanf.  This primarily affected some OPeNDAP URLs on Windows.  See [GitHub #365](Unidata/netcdf-c#365) and [GitHub #366](Unidata/netcdf-c#366) for more information.
* [Enhancement] Added support for HDF5 collective metadata operations when available. Patch submitted by Greg Sjaardema, see [Pull request #335](Unidata/netcdf-c#335) for more information.
* [Bug] Addressed a potential type punning issue. See [GitHub #351](Unidata/netcdf-c#351) for more information.
* [Bug] Addressed an issue where netCDF wouldn't build on Windows systems using MSVC 2012. See [GitHub #304](Unidata/netcdf-c#304) for more information.
* [Bug] Fixed an issue related to potential type punning, see [GitHub #344](Unidata/netcdf-c#344) for more information.
* [Enhancement] Incorporated an enhancement provided by Greg Sjaardema, which may improve read/write times for some complex files.  Basically, linked lists were replaced in some locations where it was safe to use an array/table.  See [Pull request #328](Unidata/netcdf-c#328) for more information.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

nc_open_mem fails
3 participants