Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change default for window size in EquivalentSourcesGB #487

Merged
merged 15 commits into from
May 31, 2024

Conversation

indiauppal
Copy link
Member

@indiauppal indiauppal commented Apr 2, 2024

Change the default value for the window_size in the EquivalentSourcesGB constructor to "default": window size will be estimated to locate approximately 5000 data points in each window. If window_size="default" and less than 5000 data points are used to fit the sources, a single window for all data and sources will be used. Check if the passed value of window_size is valid. Add a new window_size_ attribute to the class that is set after fitting the sources. Update docstrings and extend tests.

Relevant issues/PRs:
Fixes #425

…such that approximately 5000 data points are in each window.
@indiauppal
Copy link
Member Author

indiauppal commented Apr 16, 2024

@indiauppal
Copy link
Member Author

indiauppal commented Apr 23, 2024

  • Raise an warning for no. of points < 5e3

Completed with @santisoler

indiauppal added 2 commits May 7, 2024 16:54
Update test to check for data points less than 5e3 and the associated warning.
@santisoler
Copy link
Member

This is starting to look great @indiauppal!

I'm leaving a few ideas after the meeting we had today:

  1. We should probably extend the test function for less than 5000 points and test if the outputs match the ones for a single window. Something like:
    data_windows, source_windows = eqs._create_windows(grid_coords)
    # The output lists should have a single element each (corresponding to the single window)
    assert len(data_windows) == 1
    assert len(source_windows) == 1
    # Check if all sources and data points are inside the window
    for coord in grid_coords:
        npt.assert_allclose(coord, coord[data_windows[0]])
    for coord in eqs.points_:
        npt.assert_allclose(coord, coord[source_windows[0]])
  2. We could explore replacing those np.aranges with slices. We only need those indices to slice the sources and coordinates arrays in the self._gradient_boosting method, so anything we could use to slice them that are more memory efficient should work. This is just a minor optimization, so it's not that critical for this PR.
  3. Since now we have another attribute the fit method assigns (the window_size_), we should add it to the list of Attributes in the class docstring. Something like the think I wrote in Change default value of depth in equivalent sources #491 for the depth_ argument:
    depth_ : float or None
    Estimated depth of the sources calculated as 4.5 times the median
    distance between first neighboring sources. This attribute is set to
    None if ``points`` is passed.
  4. It would be nice to add a check for the window_size argument in the __init__ method. Basically, we should raise an error if the passed value is a str and is not "default". That's the only string value that should be valid. Check this out for inspiration:
    if isinstance(depth, str) and depth != "default":
    raise ValueError(
    f"Found invalid 'depth' value equal to '{depth}'."
    "It should be 'default' or a numeric value."
    )

This is somewhat my personal wishlist for this PR, so feel free to assign me a few of these tasks if you want. As always, feel free to ask for help if you need it 🙂

Looking forward to see this merged!

@santisoler
Copy link
Member

@indiauppal, I'm updating this branch after the fix I made for the failing Mac testing. Remember to run a git pull to sync your local repo with the latest change in this branch.

@santisoler santisoler merged commit c079fb3 into fatiando:main May 31, 2024
19 checks passed
@indiauppal indiauppal deleted the window_size branch August 12, 2024 23:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Better default for window_size in EquivalentSourcesGB
2 participants