
Reduce segment responses held in memory #85

Closed
j-c-cook opened this issue Apr 10, 2021 · 7 comments · Fixed by #84

j-c-cook (Contributor) commented Apr 10, 2021

The segment response matrix h_ij is of size nq x nq x nt, where nq is the number of sources in the field, and nt is the number of time steps.

Two variants of the segment response matrix, h_dt and dh_ij, are computed. The thermal response factor increment matrix dh_ij is used to calculate the borehole wall temperatures based on all previous responses (equation 18 in Cimmino, 2018). The h_dt matrix sets up the linear interpolation for the current thermal response factor matrix h_ij_dt, which also appears in equation 18 of Cimmino (2018).

With the current methodology, the segment response matrix h_ij is needed to compute both h_dt and dh_ij. Therefore, it is not directly possible to avoid holding three (nq x nq x nt) arrays of segment response factors in memory, each stored in double precision.
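As a rough illustration of the memory footprint (a minimal sketch: nq follows the 400-borehole, 12-segment example discussed further down this thread, and the number of time steps is an assumed value, not taken from the issue):

```python
import numpy as np

# Rough memory estimate: three (nq x nq x nt) arrays in double precision.
# nq follows the 400-borehole / 12-segment example below; nt = 26 is assumed.
nq = 400 * 12
nt = 26
n_arrays = 3                                    # h_ij, dh_ij and h_dt
size_gib = n_arrays * nq**2 * nt * np.dtype(np.double).itemsize / 1024**3
print('{:.1f} GiB'.format(size_gib))            # ~13 GiB with these assumptions
```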

Would it be possible to compute h_ij_dt and dh_ij[:, :, t] directly from h_ij, without computing the full dh_ij and h_dt arrays, and without losing speed?


Massimo Cimmino (2018) Fast calculation of the g-functions of geothermal borehole fields using similarities in the evaluation of the finite line source solution, Journal of Building Performance Simulation, 11:6, 655-668.

MassimoCimmino (Owner) commented

Can you take a look at the new implementation in #33? While working on this, I discovered that interp1d stores a copy of h_ij by default, accessible as h_ij.y. The new implementation uses this and computes h_dt once per time step (its size is now nq x nq).


Similarities.thermal_response_factors() returns an interp1d object with a stored copy of h_ij.

```python
# Interp1d object for thermal response factors
h_ij = interp1d(np.hstack((0., time)),
                np.dstack((np.zeros((self.nSources, self.nSources)), h_ij)),
                kind=kind, copy=True, axis=2)
toc = tim.time()
if self.disp: print('{} sec'.format(toc - tic))
return h_ij
```

```python
h_ij = self.thermal_response_factors(time, alpha, kind=self.kind)
```

The matrix h_dt is now calculated one time step at a time. This could potentially slow down calculations.

```python
# Current thermal response factor matrix
if p > 0:
    dt = self.time[p] - self.time[p-1]
else:
    dt = self.time[p]
# Thermal response factors evaluated at t=dt
h_dt = h_ij(dt)
```
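For reference, a minimal self-contained sketch of this idea (synthetic data and illustrative array sizes, not the pygfunction source): the interp1d object keeps the full response data as h_ij.y, so both h_dt and the per-step increments can be pulled from it one time step at a time, without separate (nq x nq x nt) copies.

```python
import numpy as np
from scipy.interpolate import interp1d

# Synthetic stand-in for the segment response factors (illustrative sizes).
nq, nt = 4, 6                                  # small illustrative field
time = 3600. * np.arange(1, nt + 1)            # hourly time steps
h = np.random.rand(nq, nq, nt)                 # stand-in segment responses

# Interp1d object, built the same way as above; it stores its data as h_ij.y.
h_ij = interp1d(np.hstack((0., time)),
                np.dstack((np.zeros((nq, nq)), h)),
                kind='linear', copy=True, axis=2)

p = 2                                          # current time step index
dt = time[p] - time[p-1] if p > 0 else time[p]
h_dt = h_ij(dt)                                # current (nq x nq) matrix
# Increment between consecutive time steps, taken from the stored data;
# indices are offset by one because a zero slab was prepended at t=0.
dh = h_ij.y[..., p+1] - h_ij.y[..., p]
```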

j-c-cook (Contributor, Author) commented Apr 10, 2021

Really nice discovery about interp1d, that helps.

Issue #33 was initially proposed as a Refactor (at least that's what I was focused on back in December 2020, #56); your additional work could now classify it as an Optimize-Refactor. This will reduce memory consumption by 2 * (nq x nq x nt) * 16 bytes. For a field with 400 boreholes at 12 segments per borehole, that is about 11 GB less memory consumed (I'd say 16 GB of RAM is standard for personal PCs right now, so that is significant).

Acceptance of pull request #84 will merge this into master, and it will be released in v2.0.0.

I'll run memory and time tests on this when I get the chance. I believe the time difference will be negligible.

I think this issue should be linked to pull request #84 and closed upon its acceptance.

MassimoCimmino (Owner) commented

If this is to be tied to #84, we can consider adding a dtype option to the gFunction class. By default, all numpy arrays are initialized as np.float64 (double precision). I have successfully evaluated the g-function of fields of >1000 boreholes by bringing the precision down to np.float16 (half precision). I have not tested the consequences for accuracy.
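One quick way to probe the accuracy question (illustrative only, using synthetic values rather than actual response factors) is to compare a double-precision array against its lower-precision casts:

```python
import numpy as np

# Illustrative only: relative error introduced by storing synthetic
# "response factor" values (not actual g-function data) in lower precision.
rng = np.random.default_rng(0)
h = rng.uniform(0.1, 10., size=(200, 200))      # double-precision reference
for dtype in (np.half, np.single):
    err = np.max(np.abs(h.astype(dtype).astype(np.double) - h) / h)
    print('{}: max relative error {:.1e}'.format(np.dtype(dtype).name, err))
# Expected orders of magnitude: ~5e-4 for half, ~6e-8 for single
```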

Is there a need for this feature?

j-c-cook (Contributor, Author) commented Apr 10, 2021

numpy.half      16-bit precision
numpy.single    32-bit precision
numpy.double    64-bit precision

I think an optional argument for the precision would be a good feature. The lowest precision with acceptable accuracy could be the default. An input argument would allow the consequences for accuracy to be tested more easily.
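For reference, the per-element storage of these dtypes can be checked directly (a small illustrative snippet):

```python
import numpy as np

# Bytes per element for the dtypes listed above.
for dtype in (np.half, np.single, np.double):
    print('{}: {} bytes'.format(np.dtype(dtype).name, np.dtype(dtype).itemsize))
# float16: 2 bytes, float32: 4 bytes, float64: 8 bytes
```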

MassimoCimmino added a commit that referenced this issue Apr 10, 2021
MassimoCimmino (Owner) commented

There is an argument to be made for keeping np.double as the default, in line with numpy conventions, since this is the data type expected by the average user. Acceptable accuracy is subjective and dependent on the application. Decreasing the precision should only be relevant when memory is an issue (calculation time should not be affected).

There is also numpy.longdouble.

j-c-cook (Contributor, Author) commented

I agree that it makes sense for the optional dtype argument to default to np.double in the gFunction class. Thanks for a0f78b1.

MassimoCimmino (Owner) commented

numpy data types are the subject of NEP 40. This may have future repercussions for the dtype feature; something to keep in mind.

MassimoCimmino added a commit that referenced this issue Apr 17, 2021
MassimoCimmino added a commit that referenced this issue Apr 17, 2021
MassimoCimmino self-assigned this May 2, 2021
MassimoCimmino added a commit that referenced this issue May 15, 2021