Update CICE documentation with Compliance Testing Code from Andrew. #42

Merged
merged 6 commits into from
Dec 2, 2017
75 changes: 74 additions & 1 deletion _sources/cice_3_user_guide.rst.txt
@@ -2262,7 +2262,80 @@ hemispheres, and must exceed a critical value nominally set to
test and the Two-Stage test described in the previous section are
provided in :cite:`Hunke2018`.


In applying equations :eq:`t-distribution` through :eq:`short-means`,
Contributor

I suggest that lines 2265 to 2300 be removed from this, and the corresponding html files also need to be changed.

Contributor

Please note also that, in reviewing this pull request, I have found some issues with the documentation of the documentation that are not actually part of this request but are important for its implementation.

First, about-sphinx-documentation.txt contains a link to https://github.com/NCAR/CICE that is not accessible to the outside world, at least not to me. Please revise this link. In addition, when read in a text editor (vim or emacs), this file contains a number of odd characters that look as though the text has been cut and pasted; characters such as apostrophes are unreadable. The same is true of the file sphinx-documentation-workflow.txt.

Second, the Makefile as it currently exists under /doc does not work, instead giving the error message: "Makefile:12: *** commands commence before first target. Stop.". This is because of the following lines:

# User-friendly check for sphinx-build

ifeq ($(shell which $(SPHINXBUILD) >/dev/null 2>&1; echo $$?), 1)
$(error The '$(SPHINXBUILD)' command was not found. Make sure you have Sphinx installed, then set the SPHINXBUILD environment variable to point to the full path of the '$(SPHINXBUILD)' executable. Alternatively you can add the directory with the executable to your PATH. If you don't have Sphinx installed, grab it from http://sphinx-doc.org/)
endif

Contributor Author

I thought that text was from Matt in section 3? I'll work on the other stuff, but the Makefile works if you have python/2.7.11 installed as well as sphinx. For the citations, you also need sphinxcontrib-bibtex installed. I will fix that link. That is our user's guide for the CESM-CICE. Thanks for the feedback Andrew.

Contributor

Just the lines I've highlighted are from my original write-up and have changed. Matt has added everything after the "Practical Testing Procedure" header. My version of python was 2.7.10.

Contributor Author

Got it.

we are accounting for the fact, however imperfectly, that a
:math:`t`-test should be a comparison of the means from two series of
independent samples. The typical effect of applying these equations to
sea ice model output is that :math:`n' \ll n`. For that reason, we need
a lengthy time series to narrow the range of acceptable values
in :eq:`t-crit`. There is little point in using more frequent output
from CICE than daily instantaneous values, since this would have little
impact on decreasing :math:`r_1` in :eq:`lag-1-auto-correlation`.
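
The effect of serial correlation on the sample size can be illustrated with a short sketch. The equations referenced above are not reproduced in this excerpt, so the forms used here are assumptions: the lag-1 autocorrelation is computed from the two short means (first and last :math:`n-1` samples), and the effective sample size is taken as :math:`n' = n(1-r_1)/(1+r_1)`. The synthetic AR(1) series and the helper names are hypothetical and stand in for daily model output at one grid point.

```python
import math
import random


def lag1_autocorrelation(x):
    """Lag-1 autocorrelation using the short means of x[0:n-1] and x[1:n]."""
    n = len(x)
    head, tail = x[:-1], x[1:]
    mean_head = sum(head) / (n - 1)
    mean_tail = sum(tail) / (n - 1)
    num = sum((a - mean_head) * (b - mean_tail) for a, b in zip(head, tail))
    den = math.sqrt(
        sum((a - mean_head) ** 2 for a in head)
        * sum((b - mean_tail) ** 2 for b in tail)
    )
    return num / den


def effective_sample_size(n, r1):
    """Assumed form of the effective sample size: n' = n (1 - r1) / (1 + r1)."""
    return n * (1.0 - r1) / (1.0 + r1)


def ar1_series(n, phi, seed=0):
    """Hypothetical stand-in for a strongly autocorrelated daily time series."""
    rng = random.Random(seed)
    x = [0.0]
    for _ in range(n - 1):
        x.append(phi * x[-1] + rng.gauss(0.0, 1.0))
    return x


n = 1825  # five years of daily samples
x = ar1_series(n, phi=0.95)
r1 = lag1_autocorrelation(x)
n_eff = effective_sample_size(n, r1)
print("r1 = %.3f, n = %d, n' = %.1f" % (r1, n, n_eff))
```

For a series this strongly autocorrelated, the effective sample size is a small fraction of the nominal one, which is why a multi-year record is needed.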

Using these equations, a standard procedure in testing for
science-changing answers in CICE and Icepack is as follows: First, make
every attempt to obtain bit-for-bit reproducibility in the model code.
Once all available software-testing options have been exhausted, and the
source of the bit-for-bit test failure has been pinpointed, proceed with
the :math:`t`-test documented above if the expectation is that code
alterations should not be science-changing.
Equations :eq:`t-distribution` through :eq:`short-means` are
implemented in the reverse order from which they are presented here, and
applied individually to daily samples of :math:`h`, :math:`c`, :math:`u`
and :math:`v` from 5-year time series at every model grid point: i)
Calculate :math:`\bar{x}_{1:n-1}`, :math:`\bar{x}_{2:n}`, and
:math:`\bar{x}` in :eq:`short-means` for simulations :math:`a` and
:math:`b`; ii) Compute :eq:`lag-1-auto-correlation`,
:eq:`effective-sample-size` and :eq:`unbiased-sigma`, in that order,
for each simulation :math:`a` and :math:`b`; and finally, iii) Determine
whether the null hypothesis can be rejected at each model grid point in
:eq:`t-crit` using equation :eq:`t-distribution` and a lookup
:math:`t`-distribution table. Should :math:`H_0` be retained at each
grid point, and for each variable :math:`h`, :math:`c`, :math:`u` and
:math:`v`, this test contributes to evidence that changes to CICE and
Icepack code are unlikely to alter scientific results. To guard against
the possibility of a Type II error, the test should be performed for
several different confidence intervals, nominally set at 68, 80 and 95%,
the first and last of these values corresponding to :math:`\sigma` and
:math:`2\sigma` tests.
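
The final step of the procedure above can be sketched for a single grid point and a single variable. This is a minimal illustration under stated assumptions, not the code used by CICE: the statistic is assumed to take the two-sample form with each variance scaled by its effective sample size, the summary values are hypothetical, and the critical values come from the standard normal distribution as a large-sample stand-in for the :math:`t`-distribution lookup table.

```python
import math
from statistics import NormalDist


def t_statistic(mean_a, sigma_a, n_eff_a, mean_b, sigma_b, n_eff_b):
    """Assumed two-sample statistic using effective sample sizes."""
    return (mean_a - mean_b) / math.sqrt(
        sigma_a ** 2 / n_eff_a + sigma_b ** 2 / n_eff_b
    )


def critical_value(confidence):
    """Two-sided critical value; normal approximation to the t table."""
    return NormalDist().inv_cdf(0.5 + confidence / 2.0)


def null_retained(t, confidence):
    """True if |t| is below the critical value, so H0 is not rejected."""
    return abs(t) < critical_value(confidence)


levels = (0.68, 0.80, 0.95)

# Hypothetical summaries (mean, sigma, n') for thickness h in runs a and b.
t_small = t_statistic(2.00, 0.30, 45.0, 2.05, 0.32, 50.0)
small = {c: null_retained(t_small, c) for c in levels}

# A larger hypothetical statistic that is retained at 95% but not at 68%.
t_large = 1.1
large = {c: null_retained(t_large, c) for c in levels}

print(small)
print(large)
```

The second case shows why several confidence levels are useful: a difference that passes at 95% can still fail at 68%, flagging a result that deserves a closer look.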

***************************
Practical Testing Procedure
***************************

The CICE code compliance test is performed by running a Python script (cice.t-test.py).
In order to run the script, the following requirements must be met:

* Python v2.7 or later
* netCDF Python package
* numpy Python package

In order to generate the files necessary for the compliance test, test cases should be
created with the ``ttest`` option (i.e., ``-s ttest``) when running create.case. This
option results in daily, non-averaged history files being written for a 5-year simulation.

To run the compliance test:

.. code-block:: bash

   cp configuration/scripts/tests/QC/cice.t-test.py .
   ./cice.t-test.py /path/to/baseline/history /path/to/test/history

The script will produce output similar to:

| \INFO:__main__:Number of files: 1825
| \INFO:__main__:Two-Stage Test Passed
| \INFO:__main__:Quadratic Skill Test Passed for Northern Hemisphere
| \INFO:__main__:Quadratic Skill Test Passed for Southern Hemisphere
| \INFO:__main__:
| \INFO:__main__:Quality Control Test PASSED

Additionally, the exit code from the test (``echo $?``) will be 0 if the test passed,
and 1 if the test failed.
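
The exit code makes it straightforward to call the test from another script. The following is a minimal sketch, assuming Python 3 and a hypothetical helper named ``qc_passed``; the stand-in commands simply exit with 0 or 1 so that the example runs without model output.

```python
import subprocess
import sys


def qc_passed(command):
    """Run a command and report whether it exited with status 0."""
    completed = subprocess.run(command)
    return completed.returncode == 0


# Stand-in commands that mimic a passing and a failing test run.
passing = [sys.executable, "-c", "raise SystemExit(0)"]
failing = [sys.executable, "-c", "raise SystemExit(1)"]

print(qc_passed(passing))
print(qc_passed(failing))

# With real history directories the call would look like:
# qc_passed(["./cice.t-test.py", "/path/to/baseline/history",
#            "/path/to/test/history"])
```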

Implementation notes: 1) Provide a pass/fail on each of the confidence
intervals, 2) Facilitate output of a bitmap for each test so that
locations of failures can be identified.

.. _tabnamelist:

69 changes: 69 additions & 0 deletions cice_3_user_guide.html
@@ -2133,6 +2133,75 @@ <h4>3.6.4.6. Additional Details<a class="headerlink" href="#additional-details"
<span class="math">\(S_{crit}=0.99\)</span> to pass the test. Practical illustrations of this
test and the Two-Stage test described in the previous section are
provided in <a class="reference internal" href="zreferences.html#hunke2018" id="id29">[38]</a>.</p>
<p>In applying equations&nbsp;<a class="reference internal" href="#equation-t-distribution">(1)</a> through&nbsp;<a class="reference internal" href="#equation-short-means">(4)</a>,
we are accounting for the fact, however imperfectly, that a
<span class="math">\(t\)</span>-test should be a comparison of the means from two series of
independent samples. The typical effect of applying these equations to
sea ice model output is that <span class="math">\(n' \ll n\)</span>. For that reason, we need
a lengthy time series to narrow the range of acceptable values
in&nbsp;<code class="xref eq docutils literal"><span class="pre">t-crit</span></code>. There is little point in using more frequent output
from CICE than daily instantaneous values, since this would have little
impact on decreasing <span class="math">\(r_1\)</span> in <code class="xref eq docutils literal"><span class="pre">lag-1-auto-correlation</span></code>.</p>
<p>Using these equations, a standard procedure in testing for
science-changing answers in CICE and Icepack is as follows: First, make
every attempt to obtain bit-for-bit reproducibility in the model code.
Once all available software-testing options have been exhausted, and the
source of the bit-for-bit test failure has been pinpointed, proceed with
the <span class="math">\(t\)</span>-test documented above if the expectation is that code
alterations should not be science-changing.
Equations&nbsp;<a class="reference internal" href="#equation-t-distribution">(1)</a> through&nbsp;<a class="reference internal" href="#equation-short-means">(4)</a> are
implemented in the reverse order from which they are presented here, and
applied individually to daily samples of <span class="math">\(h\)</span>, <span class="math">\(c\)</span>, <span class="math">\(u\)</span>
and <span class="math">\(v\)</span> from 5-year time series at every model grid point: i)
Calculate <span class="math">\(\bar{x}_{1:n-1}\)</span>, <span class="math">\(\bar{x}_{2:n}\)</span>, and
<span class="math">\(\bar{x}\)</span> in <a class="reference internal" href="#equation-short-means">(4)</a> for simulations <span class="math">\(a\)</span> and
<span class="math">\(b\)</span>; ii) Compute <code class="xref eq docutils literal"><span class="pre">lag-1-auto-correlation</span></code>,
<code class="xref eq docutils literal"><span class="pre">effective-sample-size</span></code> and <code class="xref eq docutils literal"><span class="pre">unbiased-sigma</span></code>, in that order,
for each simulation <span class="math">\(a\)</span> and <span class="math">\(b\)</span>; and finally, iii) Determine
whether the null hypothesis can be rejected at each model grid point in
<code class="xref eq docutils literal"><span class="pre">t-crit</span></code> using equation <a class="reference internal" href="#equation-t-distribution">(1)</a> and a lookup
<span class="math">\(t\)</span>-distribution table. Should <span class="math">\(H_0\)</span> be retained at each
grid point, and for each variable <span class="math">\(h\)</span>, <span class="math">\(c\)</span>, <span class="math">\(u\)</span> and
<span class="math">\(v\)</span>, this test contributes to evidence that changes to CICE and
Icepack code are unlikely to alter scientific results. To guard against
the possibility of a Type II error, the test should be performed for
several different confidence intervals, nominally set at 68, 80 and 95%,
the first and last of these values corresponding to <span class="math">\(\sigma\)</span> and
<span class="math">\(2\sigma\)</span> tests.</p>
</div>
<div class="section" id="practical-testing-procedure">
<h4>3.6.5.3. Practical Testing Procedure<a class="headerlink" href="#practical-testing-procedure" title="Permalink to this headline">¶</a></h4>
<p>The CICE code compliance test is performed by running a Python script (cice.t-test.py).
In order to run the script, the following requirements must be met:</p>
<ul class="simple">
<li>Python v2.7 or later</li>
<li>netCDF Python package</li>
<li>numpy Python package</li>
</ul>
<p>In order to generate the files necessary for the compliance test, test cases should be
created with the <code class="docutils literal"><span class="pre">ttest</span></code> option (i.e., <code class="docutils literal"><span class="pre">-s</span> <span class="pre">ttest</span></code>) when running create.case. This
option results in daily, non-averaged history files being written for a 5-year simulation.</p>
<p>To run the compliance test:</p>
<div class="highlight-bash"><div class="highlight"><pre><span></span>cp configuration/scripts/tests/QC/cice.t-test.py .
./cice.t-test.py /path/to/baseline/history /path/to/test/history
</pre></div>
</div>
<p>The script will produce output similar to:</p>
<blockquote>
<div><div class="line-block">
<div class="line">INFO:__main__:Number of files: 1825</div>
<div class="line">INFO:__main__:Two-Stage Test Passed</div>
<div class="line">INFO:__main__:Quadratic Skill Test Passed for Northern Hemisphere</div>
<div class="line">INFO:__main__:Quadratic Skill Test Passed for Southern Hemisphere</div>
<div class="line">INFO:__main__:</div>
<div class="line">INFO:__main__:Quality Control Test PASSED</div>
</div>
</div></blockquote>
<p>Additionally, the exit code from the test (<code class="docutils literal"><span class="pre">echo</span> <span class="pre">$?</span></code>) will be 0 if the test passed,
and 1 if the test failed.</p>
<p>Implementation notes: 1) Provide a pass/fail on each of the confidence
intervals, 2) Facilitate output of a bitmap for each test so that
locations of failures can be identified.</p>
</div>
</div>
</div>
2 changes: 1 addition & 1 deletion cice_4_index.html
@@ -1661,7 +1661,7 @@ <h2>4.1. Comprehensive Alphabetical Index<a class="headerlink" href="#comprehens
<td>&#160;</td>
</tr>
<tr class="row-even"><td>kstrength</td>
<td><span class="math">\(\bullet\)</span> ice strength formulation (1= <a class="reference internal" href="zreferences.html#rothrock75" id="id2">[57]</a>, 0 = <a class="reference internal" href="zreferences.html#hibler79" id="id3">[26]</a>)</td>
<td><span class="math">\(\bullet\)</span> ice strength formulation (1= <a class="reference internal" href="zreferences.html#rothrock75" id="id2">[61]</a>, 0 = <a class="reference internal" href="zreferences.html#hibler79" id="id3">[26]</a>)</td>
<td>1</td>
<td>&#160;</td>
</tr>
2 changes: 1 addition & 1 deletion searchindex.js

Large diffs are not rendered by default.