CHarm 0.4.2:
* Added support for NEON SIMD instructions on ARM64 CPUs (v8 or newer).
* Improved performance of spherical harmonic analysis and synthesis of point
data values (up to ~20 %, depending on the processor). In spherical harmonic
analysis, expensive horizontal sums of SIMD vectors (`SUM_R`) were reduced
(~10 % improvement). In SIMD macros computing Legendre functions, some
unnecessary blends were removed by suitable initializations (~20 %
improvement in analysis and synthesis).
* The internal parameter `SIMD_BLOCK` was split into `SIMD_BLOCK_A` and
`SIMD_BLOCK_S`, which are used with spherical harmonic analysis and synthesis,
respectively. After the improvements from the previous bullet point, the
optimal value of `SIMD_BLOCK_A` seems to be about twice that of
`SIMD_BLOCK_S`. This further improves the performance of spherical harmonic
analysis by about 10 %.
* Added tests of `MASK_TRUE_ALL`, `MASK_TRUE_ANY`, `SUM_R` and `BLEND_R`
macros.
CHarm 0.4.1:
This is a maintenance release enabling the installation of PyHarm using `pip`.
A few minor bugs were additionally fixed and some minor improvements were
applied.
* On Linux (x86_64), macOS (x86_64, ARM64) and Windows (x86_64), PyHarm can now
be installed using `pip install pyharm`.
* In PyHarm, the `pathname` input parameter to the `to_file` method of the
`Shc` class no longer requires the `./` prefix when saving to the current
working directory. `pathname` may now contain only the file name, in which
case the coefficients are saved to the current working directory.
* Fixed an out-of-bounds read bug in the spherical harmonic synthesis at
Driscoll--Healy grids. The outputs of the synthesis were correct, but errors
(e.g., a segmentation fault) could occur in rare cases.
* Removed variable-length arrays from one specific internal routine of CHarm.
CHarm is now completely free from variable-length arrays.
* Functions returning NaN now use the `NAN` macro from `math.h`.
Previously, `0.0 / 0.0` was used to get NaN. Now this is only a fallback if
the `NAN` macro is not found in `math.h`.
* The internal parameter `SIMD_BLOCK` is now set to `4` for all kinds of AVX
instruction sets. This value seems to perform the best overall.
* Changed PyHarm building process. The C-part of PyHarm is now compiled from
within Python as an extension module.
* Library names no longer have the `_omp` suffix, even if CHarm is compiled
with OpenMP support. Depending on the precision, the names are `libcharmf`,
`libcharm` and `libcharmq`. This affects how CHarm is linked.
* Various documentation improvements.
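The `pathname` behaviour described for `to_file` in CHarm 0.4.1 above can be illustrated with a short Python sketch. It is not PyHarm's implementation; the helper name `split_output_path` is hypothetical and only shows one way a bare file name can be mapped to the current working directory.

```python
import os


def split_output_path(pathname):
    """Split `pathname` into (directory, file name).

    A bare file name has an empty directory part, which is mapped here to
    the current working directory (hypothetical helper, for illustration).
    """
    directory, filename = os.path.split(pathname)
    if not filename:
        raise ValueError("pathname must end with a file name")
    return (directory or os.getcwd()), filename


if __name__ == "__main__":
    # Both calls below point to a file in the current working directory.
    print(split_output_path("coeffs.tbl"))
    print(split_output_path(os.path.join(".", "coeffs.tbl")))
```

With this approach, `coeffs.tbl` and `./coeffs.tbl` resolve to the same location, so the `./` prefix becomes optional.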
CHarm 0.4.0:
This release adds new functions to synthesize the full first- and second-order
gradients in the local north-oriented reference frame (LNOF) at evaluation
points. The API is backward compatible except for a minor change in the `misc`
module.
* Added functions to compute point values of the full first- and second-order
gradients in LNOF and of the first- and second-order derivatives with respect
to the spherical coordinates.
In CHarm, the new functions are placed in the `shs` module:
* `charm_shs_point_grad1` (the full first-order gradient in LNOF),
* `charm_shs_point_grad2` (the full second-order gradient in LNOF),
* `charm_shs_point_guru` (guru interface for first- and second-order
derivatives with respect to the spherical
coordinates).
In PyHarm, the new functions are:
* `pyharm.shs.point_grad1`,
* `pyharm.shs.point_grad2`,
* `pyharm.shs.point_guru`.
Example codes are provided in the cookbook.
* Added members `npoint` and `ncell` to the `charm_point` and `charm_cell`
structures, respectively. The new members represent the total number of
points/cells stored by the structures both for scattered points/cells and
grids. They are particularly useful when allocating the memory for the
signal to be synthesized at the points/cells represented by the structures.
* Values
* `CHARM_SHC_NMAX_MODEL`
* `CHARM_SHC_NMAX_ERROR`
are now symbolic constants instead of enumerations. This is because ISO
C restricts enumerator values to the range of `int`, but we need the maximum
values of `unsigned long`. The latter thus do not fall within the range that
can be portably stored by enumerations.
* Renamed
* `charm_misc_print_version`
to
* `charm_misc_print_info`
which better describes its purpose. It now also prints various useful
user-defined compilation flags (`CFLAGS`, `CPPFLAGS`, etc.).
* Added function `charm_misc_get_version` to return a string specifying the
CHarm version number determined at compilation time.
* Added function `charm_misc_buildopt_version_fftw` that returns a string
specifying the FFTW version that was used to compile CHarm.
* Changed value of the internal parameter `SIMD_BLOCK` from `8` to `2`. On
some recent processors, this may improve the performance by up to 40 %, while
no significant deterioration was observed on older CPUs. In the future, it
would be nice to tune this parameter for the host's CPU during compilation.
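The range argument behind turning `CHARM_SHC_NMAX_MODEL` and `CHARM_SHC_NMAX_ERROR` into symbolic constants (CHarm 0.4.0 above) can be checked numerically. The Python sketch below uses `ctypes` to query the sizes of the platform's C `int` and `unsigned long` and compares their maxima. The helper names `c_int_max` and `c_ulong_max` are hypothetical, and the exact values CHarm assigns to the two constants are not stated in this file.

```python
import ctypes


def c_int_max():
    """Largest value of the platform's C `int` (two's complement assumed)."""
    bits = 8 * ctypes.sizeof(ctypes.c_int)
    return 2 ** (bits - 1) - 1


def c_ulong_max():
    """Largest value of the platform's C `unsigned long`."""
    bits = 8 * ctypes.sizeof(ctypes.c_ulong)
    return 2 ** bits - 1


if __name__ == "__main__":
    # An enumerator must be representable as `int`, so a constant close to
    # the maximum of `unsigned long` cannot be portably written as one.
    print("int max:          ", c_int_max())
    print("unsigned long max:", c_ulong_max())
    print("fits into int:    ", c_ulong_max() <= c_int_max())
```

On common platforms, `unsigned long` is at least as wide as `int` and has no sign bit, so its maximum always exceeds that of `int`.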
CHarm 0.3.1:
* Added support for reading ICGEM time-variable gravity field models. Both
the `icgem1.0` and `icgem2.0` formats are supported.
* Added support to get the maximum harmonic degree of coefficients from data
files without the need to initialize a `charm_shc` structure. In CHarm, this
can be done using the functions to read spherical harmonic coefficients from
data files. In PyHarm, a new method `nmax_from_file` was added to the `Shc`
class to this end.
* Added support to read and write spherical harmonic coefficients in the `dov`
text format (degree, order, value).
* PyHarm functions to read spherical harmonic coefficients
* `pyharm.shc.Shc.from_file_gfc`,
* `pyharm.shc.Shc.from_file_tbl`,
* `pyharm.shc.Shc.from_file_bin`,
* `pyharm.shc.Shc.from_file_mtx`
were merged into a single function
* `pyharm.shc.Shc.from_file`.
PyHarm functions to write spherical harmonic coefficients
* `pyharm.shc.Shc.to_file_tbl`,
* `pyharm.shc.Shc.to_file_bin`,
* `pyharm.shc.Shc.to_file_mtx`
were merged into a single function
* `pyharm.shc.Shc.to_file`.
* Renamed symbolic constants
* `CHARM_SHC_WRITE_TBL_N`,
* `CHARM_SHC_WRITE_TBL_M`,
to
* `CHARM_SHC_WRITE_N`,
* `CHARM_SHC_WRITE_M`,
so that both can be used with the `tbl` and `dov` formats.
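To illustrate the idea of getting the maximum harmonic degree without building a coefficient structure (CHarm 0.3.1 above), here is a minimal Python sketch that scans degree, order, value triples and returns the largest degree. It is not PyHarm's `nmax_from_file`. The function name `max_degree_from_dov_lines` is hypothetical, and the layout assumed here (one whitespace-separated triple per line, blank lines and `#` comments skipped) is an assumption, since the actual `dov` file layout is not described in this file.

```python
def max_degree_from_dov_lines(lines):
    """Return the largest degree found among 'degree order value' lines.

    Assumed layout (not taken from the CHarm docs): one whitespace-separated
    triple per line; blank lines and lines starting with '#' are ignored.
    """
    nmax = None
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) != 3:
            raise ValueError(f"expected 3 columns, got {len(fields)}: {line!r}")
        degree = int(fields[0])
        int(fields[1])    # validate the order column
        float(fields[2])  # validate the value column
        nmax = degree if nmax is None else max(nmax, degree)
    if nmax is None:
        raise ValueError("no coefficients found")
    return nmax


if __name__ == "__main__":
    # Made-up coefficient values, for illustration only.
    sample = [
        "# degree order value",
        "0 0 1.0",
        "2 0 -4.8e-4",
        "2 2 2.4e-6",
    ]
    print(max_degree_from_dov_lines(sample))
```

The scan only tracks a running maximum, so no memory for the coefficients themselves has to be allocated.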
CHarm 0.3.0:
This release significantly improves the performance of point synthesis and
analysis, fixes a few bugs and makes a few modifications to the API.
* Fixed memory leak in `charm_integ_pn1m1pn2m2`.
* Renamed symbolic constants
* `CHARM_LEG_PNMJ_ORDER_MNJ`,
* `CHARM_LEG_PNMJ_ORDER_MJN`
to
* `CHARM_LEG_PMNJ`,
* `CHARM_LEG_PMJN`,
respectively.
* Removed the `xnum` module from the public API. It offered little added
value, as all its functions are easy to implement. This makes the API
cleaner, especially in the long term, given that much more interesting modules
are yet to come.
* The function `charm_leg_pnmj_length` was removed from the API. The number of
coefficients stored in the `charm_pnmj` structure is already provided in its
`npnmj` member.
* Added functions to check various features of CHarm that are determined at
compilation time:
* `charm_misc_buildopt_precision`,
* `charm_misc_buildopt_omp_charm`,
* `charm_misc_buildopt_omp_fftw`,
* `charm_misc_buildopt_simd`,
* `charm_misc_buildopt_isfinite`.
* Switched to `sphinx_book_theme` for html docs.
* Added support for polar optimization to increase computational speed
(Eqs. 7 and 8 of Reinecke and Seljebotn 2013). As long as the polar
optimization parameters (see the `glob` module) are chosen appropriately,
this technique improves the performance while maintaining excellent
accuracy. Poorly chosen tuning parameters, however, may degrade the output
accuracy.
By default, the polar optimization is turned off.
References:
* Reinecke, M., Seljebotn, D. S. (2013) Libsharp -- spherical harmonic
transforms revisited. Astronomy and Astrophysics 554, A112, doi:
10.1051/0004-6361/201321494.
* Improved caching for point synthesis and analysis by the blocking technique.
No effort has yet been made to apply the cache blocking also for cell
analysis and synthesis, which therefore offer a sub-optimal implementation.
This might change in the future if the performance of cell transforms becomes
an issue or a limitation.
* Added automatic dynamical switching to the computation of Legendre functions
(Fukushima, 2016).
References:
* Fukushima, T. (2016) Numerical computation of point values, derivatives,
and integrals of associated Legendre function of the first kind and point
values and derivatives of oblate spheroidal harmonics of the second kind
of high degree and order. In: Rizos, C., Willis, P. (eds): IAG 150 Years:
Proceedings of the 2013 IAG Scientific Assembly, Potsdam, Germany, 1--6
September, 2013, 143:193--197. https://doi.org/10.1007/1345_2015_124
* Added dynamic scheduling to parallel for loops in shs. This slightly
improves the performance, because the computation time of Legendre functions
varies across the meridian (mainly due to X-numbers, dynamical switching and,
if applied, polar optimization). In sha, the dynamic scheduling has been in
use right from the beginning.
* Introduced `NEG_R` macro to properly switch the sign of SIMD vectors.
* Improved test suite. The functions are now smaller and cleaner. Some new
tests were also added, mostly to check various custom allocation functions.
* New benchmarks are available at
https://www.charmlib.org/build/html/benchmarks.html.
CHarm 0.2.0:
* Added Python wrapper called PyHarm. This step necessitated several
modifications of the CHarm API, some of which are not backward compatible.
The changes are listed below.
* The `charm_crd` structure was replaced by new structures:
* `charm_point` and
* `charm_cell`,
which distinguish between evaluation points and evaluation cells.
* Functions:
* `charm_shc_init`,
* `charm_crd_init`,
* `charm_leg_pnmj_init`
were replaced by
* `charm_shc_calloc`,
* `charm_crd_calloc`,
* `charm_leg_pnmj_calloc`.
The new functions behave in the same fashion as the old ones, including their
interface.
The following functions have been added:
* `charm_shc_malloc`,
* `charm_crd_malloc`,
* `charm_leg_pnmj_malloc`,
which behave similarly to their `*_calloc` counterparts, but provide
uninitialized memory.
Functions:
* `charm_shc_init`,
* `charm_crd_init`
now have a different meaning. They can be used to create the respective
structures from data arrays provided by the user.
* Changed API of the following functions to read/write spherical harmonic
coefficients from/to files:
* `charm_shc_read_bin`,
* `charm_shc_read_gfc`,
* `charm_shc_read_mtx`,
* `charm_shc_read_tbl`,
* `charm_shc_write_bin`,
* `charm_shc_write_mtx`,
* `charm_shc_write_tbl`.
Previously, it was necessary to open the stream for the input/output file,
call the CHarm function to read/write the coefficients and, finally, close
the stream. All these steps had to be done by the user. Now, the user only
calls the function to read/write the coefficients and specifies the file's
path name as one of the input parameters. Opening and closing the stream is
done by CHarm.
This should considerably simplify making the Python wrapper, given that
wrapping a `TYPE *` pointer with `ctypes` is not trivial.
* Symbolic constants `CHARM_CRD_CELLS_*` and `CHARM_CRD_POINTS_*` were renamed
to `CHARM_CRD_CELL_*` and `CHARM_CRD_POINT_*`.
* The `nmj_order` member of `charm_pnmj` was renamed to `ordering`.
* Bug fix for `integ_pn1m1pn2m2`. The routine now works correctly with both
ordering schemes of Fourier coefficients of Legendre functions.
* Bug fix for `shc_read_bin`. The function no longer throws an error when
reading coefficients up to a maximum degree that is lower than the one in the
binary file.
* Fixed a few `leg_pnmj` tests that did not actually run.
* Fixed use-after-free bug in degree amplitude tests.
* Two members, `nc` and `ns`, were added to the `charm_shc` structure to
represent the total number of spherical harmonic coefficients `Cnm` and
`Snm`, respectively. This does not break the previous API.
* Added support for Fortran's `D` and `d` decimal exponents when reading from
text files.
* Compiling CHarm as a shared library is no longer disabled by default.
* The documentation can now be built with `make html` instead of `make docs`.
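The Fortran `D`/`d` exponent item in CHarm 0.2.0 above concerns numbers such as `1.5D-3`, which standard float parsers typically reject. The Python sketch below shows one way to accept them by rewriting the exponent marker before conversion. It is not CHarm's C implementation, and the name `parse_fortran_float` is hypothetical.

```python
import re

# Matches a Fortran-style exponent marker: preceded by a digit or a dot,
# followed by an optionally signed integer exponent.
_FORTRAN_EXP = re.compile(r"(?<=[0-9.])[dD](?=[+-]?[0-9])")


def parse_fortran_float(text):
    """Convert a string such as '1.5D-3' or '2d+02' to a Python float."""
    return float(_FORTRAN_EXP.sub("e", text.strip()))


if __name__ == "__main__":
    for token in ("1.5D-3", "2d+02", "3.25", "1.0E2"):
        print(token, "->", parse_fortran_float(token))
```

Strings that already use `E`/`e` or have no exponent pass through unchanged, so the same function can be applied to every numeric field of a text file.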
CHarm 0.1.2:
* Fixed a bug in `integ_cc.c`, where a C source file was included instead of
its respective header file. The bug was introduced in CHarm 0.1.0.
CHarm 0.1.1:
* Fixed a few typos in the installation docs.
CHarm 0.1.0:
* Added support for AVX, AVX2 and AVX-512 vector CPU instructions.
* Bug fix in synthesis of point/mean values at grids having only a single
point/cell in the longitudinal direction (`shs_point_grd.c`,
`shs_cell_grd.c`). The same bug fix applies to the analysis of area-mean
values (`sha_cell.c`) and the synthesis of area-mean values on irregular
surfaces (`shs_cell_isurf.c`). Each of the routines returned an error before
the computation could start.
* Removed some functions from the API that bring little added value to users
(`charm_misc_is_nearly_equal`, `charm_misc_arr_min`, `charm_misc_arr_max`,
`charm_misc_arr_mean`, `charm_misc_arr_std`, `charm_misc_arr_rms`,
`charm_misc_arr_chck_lin_incr` and `charm_misc_arr_chck_symm`).
CHarm 0.0.1:
* Fixed data-sharing bug in `shs_cell_isurf_coeffs.c`.
CHarm 0.0.0:
* Initial release