Add new level-0 macro layer. #830

Open
devinamatthews wants to merge 11 commits into master

Member

@devinamatthews devinamatthews commented Nov 3, 2024

Details:

  • Developed by @fgvanzee and @devinamatthews.
  • Level-0 scalar macros have moved from a name-based system (e.g. `bli_dcopys( ... )`) to a macro-argument-based system (e.g. `bli_tcopys( d,d, ... )`).
  • All macros are explicitly mixed-type.
  • All input and output operands can have a distinct type (precision and/or domain). Unnecessary computations and spurious NaN/Inf propagation are avoided in mixed-domain cases.
  • All macros that do math (i.e. not copy/set/etc.) take an additional computational precision argument.
  • Tile-level macros, 1m, broadcast-B, and other extensions are also included.
  • All macros should correctly handle aliasing of input and output operands (this needs to be rigorously checked).
  • The macros work generically over the defined types; new types need only limited support (primarily conversion to other types and basic math).
  • Fixes #828 (Cannot use complex `bli_scal2s` with `&x == &y`).
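
To make the list above concrete, here is a minimal sketch in C of two of the ideas: passing type characters as macro arguments, and treating a real scalar specially when it meets a complex operand. Every name in it (`CTYPE`, `my_tcopys`, `my_taxpys`, `my_dcomplex`, `scal_promoted`, `scal_mixed`) is hypothetical and chosen for illustration; this is not the code in the PR, and the NaN example is only one plausible reading of "spurious NaN/Inf propagation".

```c
#include <assert.h>
#include <math.h>

/* Hypothetical names throughout; this is not the BLIS implementation. */

/* Map a type character (s = float, d = double) to a C type by token pasting. */
#define ctype_s float
#define ctype_d double
#define CTYPE(ch) ctype_##ch

/* Name-based style: one hand-written macro per type combination. */
#define my_dcopys(x, y) ((y) = (x))

/* Argument-based style: the type characters are ordinary macro arguments,
   so one definition covers every (source, destination) pair. chx is unused
   here and kept only to mirror the calling convention. */
#define my_tcopys(chx, chy, x, y) ((y) = (CTYPE(chy))(x))

/* Math macros take an extra computational precision chc: operands are
   converted to chc, combined, and the result is converted to chy. */
#define my_taxpys(cha, chx, chy, chc, a, x, y) \
    ((y) = (CTYPE(chy))((CTYPE(chc))(a) * (CTYPE(chc))(x) + (CTYPE(chc))(y)))

typedef struct { double real, imag; } my_dcomplex;

/* Naive mixed-domain scaling: promote real alpha to (alpha + 0i) and do a
   full complex multiply. The 0 * Inf term can produce a spurious NaN. */
static my_dcomplex scal_promoted(double alpha, my_dcomplex x)
{
    const double ar = alpha, ai = 0.0;
    my_dcomplex y = { ar * x.real - ai * x.imag, ar * x.imag + ai * x.real };
    return y;
}

/* Domain-aware scaling: a real alpha only scales each component. */
static my_dcomplex scal_mixed(double alpha, my_dcomplex x)
{
    my_dcomplex y = { alpha * x.real, alpha * x.imag };
    return y;
}

/* Small self-check exercising the sketches above. */
static void check_level0_sketch(void)
{
    float  xs = 1.5f;
    double yd = 0.0;
    my_tcopys(s, d, xs, yd);               /* float -> double copy */
    assert(yd == 1.5);

    float ys = 1.0f;
    my_taxpys(d, s, s, d, 2.0, 3.0f, ys);  /* ys := 2*3 + 1, computed in double */
    assert(ys == 7.0f);

    my_dcomplex x = { 1.0, INFINITY };
    assert(isnan(scal_promoted(2.0, x).real));  /* 2*1 - 0*Inf is NaN */
    assert(scal_mixed(2.0, x).real == 2.0);     /* no NaN when alpha stays real */
}
```

Calling `check_level0_sketch()` from a `main` runs the checks. The sketch assumes IEEE 754 arithmetic and a build without flags such as `-ffast-math`, which would change the NaN behavior.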

@devinamatthews devinamatthews marked this pull request as draft November 3, 2024 22:02
@devinamatthews
Member Author

@fgvanzee I'm going to first rigorously check this fixes #828 and also add some tests.

@devinamatthews devinamatthews marked this pull request as ready for review November 5, 2024 00:10
@devinamatthews
Member Author

@fgvanzee everything works now, with a full level-0 testsuite. "In-place" axpys, axpbys, xpbys, and scal2s also tested and work correctly.
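
For readers unfamiliar with why "in-place" complex operations need care, the following C sketch illustrates the general hazard suggested by the title of #828. It is an assumption-laden illustration, not the BLIS macros or the actual fix: `my_dcomplex`, `scal2_naive`, and `scal2_safe` are hypothetical names. The operation is y := alpha * x on complex values, and the naive version overwrites the real part of y before the imaginary part has been computed.

```c
#include <assert.h>

typedef struct { double real, imag; } my_dcomplex;  /* hypothetical type */

/* Naive y := alpha * x. y->real is written before x->real is read for the
   imaginary part, so the result is wrong when x and y point to the same
   object. */
static void scal2_naive(const my_dcomplex *alpha, const my_dcomplex *x,
                        my_dcomplex *y)
{
    y->real = alpha->real * x->real - alpha->imag * x->imag;
    y->imag = alpha->real * x->imag + alpha->imag * x->real;
}

/* Alias-safe y := alpha * x. Both components are computed into temporaries
   before anything is stored, so x == y is fine. */
static void scal2_safe(const my_dcomplex *alpha, const my_dcomplex *x,
                       my_dcomplex *y)
{
    const double yr = alpha->real * x->real - alpha->imag * x->imag;
    const double yi = alpha->real * x->imag + alpha->imag * x->real;
    y->real = yr;
    y->imag = yi;
}

/* Small self-check: multiply (1 + 2i) by i in place. */
static void check_scal2_aliasing(void)
{
    const my_dcomplex alpha = { 0.0, 1.0 };     /* i */

    my_dcomplex v = { 1.0, 2.0 };
    scal2_safe(&alpha, &v, &v);                 /* i * (1 + 2i) = -2 + i */
    assert(v.real == -2.0 && v.imag == 1.0);

    my_dcomplex w = { 1.0, 2.0 };
    scal2_naive(&alpha, &w, &w);                /* imaginary part comes out wrong */
    assert(w.real == -2.0 && w.imag == -2.0);
}
```

The same pattern applies to the other in-place operations mentioned above (axpys, axpbys, xpbys): read every input component before writing any output component.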

@fgvanzee
Member

fgvanzee commented Nov 9, 2024

Awesome! Many thanks for finishing this up.

@@ -38,7 +38,7 @@ To summarize: In order to observe multithreaded parallelism within a BLIS operat

BLIS disables multithreading by default. In order to allow multithreaded parallelism from BLIS, you must first enable multithreading explicitly at configure-time.

-As of this writing, BLIS optionally supports multithreading via OpenMP or POSIX threads(or both).
+As of this writing, BLIS optionally supports multithreading via OpenMP or POSIX bli_threads(or both).
Member

Seems to be a search-and-replace misfire.

Member

@fgvanzee fgvanzee Nov 15, 2024

Oh wait, or maybe it's not a typo? Is this meant to reference the `bli_pthread_*()` API wrapper?

Member Author

It is a typo.

@devinamatthews
Member Author

devinamatthews commented Nov 15, 2024 via email

devinamatthews and others added 2 commits November 15, 2024 15:00
Revert typo in docs. [ci skip]
Member

@fgvanzee fgvanzee left a comment

I didn't review the C++ files, but everything else looks good.

@devinamatthews
Member Author

OK, I'll want to wait to merge this 'til I clear a few of the other PRs out of the queue.
