Commit

update
wjr-z committed Oct 12, 2024
1 parent 06fca7b commit a9934cf
Showing 11 changed files with 188 additions and 320 deletions.
74 changes: 65 additions & 9 deletions README.md
@@ -2,18 +2,74 @@
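C++17, 64-bit systems. \
Most optimizations target x64 only. Exception handling is currently poorly supported.
1. Non-intrusive containers: bitset, B+-tree, basic_vector...
- bitset : a bitset that supports leaving its values uninitialized. This is common in parsers, where bits only need to be updated according to depth and never need initialization.
- B+-tree : an optimized B+-tree. The lower and upper bounds of each copy length are computed, and copy/copy_backward are customized accordingly.
bset, bmap ...
- basic_vector : a customizable general-purpose vector. It currently supports variants similar to std::vector, std::inplace_vector (C++26), fixed_vector, and so on. \
inplace_vector allocates its memory on the stack. \
fixed_vector allocates memory dynamically only at construction and never grows afterwards, which enables some optimizations. \
In biginteger the size is an int32_t because it must represent both positive and negative numbers; its vector can be customized as well, simply by changing the storage. \
A safe way to modify a vector's storage is provided, that is, a safe way to convert a T* into a vector\<T\> or the reverse.
2. Intrusive containers: list, forward_list, lock-free forward_list...
3. Preprocessor
4. compressed_pair\<T, U\>
Similar to the Linux intrusive containers. \
3. Preprocessor
4. compressed_pair\<T, U\>
An EBO-optimized pair. As trivial as possible.
5. tuple\<Args...\>
As trivial as possible.
6. span
7. inline_key
A C++17 implementation of C++20 span.
7. inline_arg
In function templates it is hard to choose between passing by value and passing by reference. This is a wrapper over const T& and T; the appropriate wrapper can be selected manually or automatically based on triviality, size, and so on.
8. uninitialized : a replacement for aligned_storage_t
As trivial as possible. Construction and destruction are manual.
9. lazy_initialized\<T\> : lazy initialization
10. biginteger
11. math :
12. JSON
13. format : to_chars, from_chars. Improvements to fast_float.
Similar to uninitialized, but the value must be constructed before it is used, and it is destroyed automatically when its lifetime ends.
10. math :
clz(countl_zero), ctz(countr_zero), popcount, uint128_t, mul64x64 ... \
The low-level big-integer implementation. On x64 its performance is roughly on par with GMP's mpn; some routines are slightly faster than mpn, and some algorithmic optimizations make it much faster than mpn. \
Inline assembly is supported on GCC/Clang, and some simple functions can be inlined. Constant lengths may get dedicated optimizations later. \
11. biginteger :
A wrapper class over the big-integer library.
12. JSON :
Combines the performance of simdjson with the ease of use of nlohmann: parsing is accelerated with SIMD, and the document is built with nlohmann-like data structures, \
with a custom relocate optimization. Visitors are customizable, for example to build objects directly from classes via virtual functions, or to produce iterators as simdjson does.
13. format : to_chars, from_chars ...
Optimizes the integer path of fast_float. \
to_chars can write directly to an iterator. For back_inserter, it does not simply define a buffer, write into it, and then copy to the iterator; \
instead it checks whether the container supports direct copying (the default std::char_traits can be copied trivially; other traits must declare whether they support it) and whether it supports resize/append. \
If so, it calls resize/append and then writes in place. \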
```
std::basic_string<char, nodex<char>> str;
auto ptr = to_chars_unchecked(std::back_inserter(str), 123);
```
14. template preprocessor :
15. stack allocator : a dynamically growing memory allocator. It preallocates heap memory to emulate a stack.
16. string
For example:
```
tp_sort_t<tp_list<integral_constant<int, 3>, integral_constant<int, 2>,
integral_constant<int, 1>>>
=> tp_list<integral_constant<int, 1>, integral_constant<int, 2>,
integral_constant<int, 3>>
```
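The comparison parameter can be customized.
15. stack allocator : a dynamically growing memory allocator. It preallocates heap memory to emulate a stack.
16. string :
Implements some functions that C++17 does not provide, such as resize_and_overwrite, starts_with, and ends_with.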
17. switch of tp_list. Example :
```
using type = tp_integers_list_t<char, 'a', 'b', 'c', '3', '4'>;
vswitch<type>('a', [](auto) {
///
});
```
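tp_integers_list_t can be built from a macro string argument or with tp_xxx_t. \
It dispatches through a direct switch and supports up to 256 cases.
18. expected
A C++17 implementation of C++23 expected. \
Adds compressed_unexpected, which currently exists only to reuse code and may later move to another class (it records whether a value is present with an error code rather than a bool).
19. crtp
control_special_members_base makes it easy to generate special member functions that are as trivial as possible. \
enable_special_members_of_args_base makes it easy to disable special member functions.
20. concurrency
pause, a simple lock-free singly linked list, spin_mutex
todo : lock-free data structures, RCU, scheduler, etc.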
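
The strategy README item 13 describes for back_inserter (grow the container first, then format straight into the new tail instead of going through a temporary buffer) can be sketched with standard facilities only. This is a minimal illustration, not wjr's implementation: `append_decimal` is a hypothetical name, it uses `std::to_chars` and `std::string` rather than `to_chars_unchecked`, and it skips the trait checks the README mentions.

```cpp
#include <cassert>
#include <charconv>
#include <cstddef>
#include <limits>
#include <string>
#include <system_error>

// Hypothetical helper, not part of wjr: append the decimal form of `value`
// to `out` by growing the string first and formatting into the new tail.
inline void append_decimal(std::string &out, unsigned long long value) {
    // digits10 + 1 characters always suffice for an unsigned 64-bit value.
    constexpr std::size_t max_digits =
        std::numeric_limits<unsigned long long>::digits10 + 1;

    const std::size_t old_size = out.size();
    out.resize(old_size + max_digits); // worst-case room, no side buffer

    char *first = out.data() + old_size;
    const std::to_chars_result res =
        std::to_chars(first, first + max_digits, value);
    assert(res.ec == std::errc{}); // cannot fail: the tail is worst-case sized

    // Trim the unused tail so the string ends right after the last digit.
    out.resize(static_cast<std::size_t>(res.ptr - out.data()));
}
```

Calling `append_decimal(s, 123)` on a string holding `"n="` leaves `"n=123"`. The cost of this approach is one over-sized resize followed by a shrink, in exchange for avoiding a second copy of the digits.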
1 change: 1 addition & 0 deletions include/wjr/algorithm.hpp
@@ -132,6 +132,7 @@ constexpr void __sort_impl(Iter first, Iter last, Pred pred) {

} // namespace algorithm_detail

/// @todo
template <typename Iter, typename Pred>
constexpr void sort(Iter first, Iter last, Pred pred) {
algorithm_detail::__sort_impl(first, last, pred);
53 changes: 42 additions & 11 deletions include/wjr/assert.hpp
@@ -43,36 +43,67 @@
#error "WJR_DEBUG_LEVEL must be 0 ~ 3"
#endif

#if WJR_DEBUG_LEVEL == 0
#undef WJR_LIGHT_ASSERT
#define WJR_LIGHT_ASSERT
#endif

namespace wjr {

WJR_NORETURN extern void __assert_failed(const char *expr, const char *file,
const char *func, int line) noexcept;
WJR_NORETURN WJR_COLD extern void __assert_failed(const char *expr, const char *file,
const char *func, int line) noexcept;

WJR_NORETURN WJR_COLD extern void __assert_light_failed(const char *expr) noexcept;

// LCOV_EXCL_START

/// @private
template <typename... Args>
void __assert_handler(const char *expr, const char *file, const char *func, int line,
Args &&...args) noexcept {
std::cerr << "Message:";
WJR_NORETURN void __assert_failed_handler(const char *expr, const char *file,
const char *func, int line,
Args &&...args) noexcept {
std::cerr << "Assert message:";
(void)(std::cerr << ... << std::forward<Args>(args));
std::cerr << '\n';
__assert_failed(expr, file, func, line);
}

/// @private
inline void __assert_handler(const char *expr, const char *file, const char *func,
int line) noexcept {
WJR_NORETURN inline void __assert_failed_handler(const char *expr, const char *file,
const char *func, int line) noexcept {
__assert_failed(expr, file, func, line);
}

/// @private
template <typename... Args>
WJR_NORETURN void __assert_light_failed_handler(const char *expr, const char *,
const char *, int,
Args &&...args) noexcept {
std::cerr << "Assert message:";
(void)(std::cerr << ... << std::forward<Args>(args));
std::cerr << '\n';
__assert_light_failed(expr);
}

/// @private
WJR_NORETURN inline void __assert_light_failed_handler(const char *expr, const char *,
const char *, int) noexcept {
__assert_light_failed(expr);
}

#if defined(WJR_LIGHT_ASSERT)
#define WJR_ASSERT_FAILED_HANDLER ::wjr::__assert_light_failed_handler
#else
#define WJR_ASSERT_FAILED_HANDLER ::wjr::__assert_failed_handler
#endif

// LCOV_EXCL_STOP

#define WJR_ASSERT_CHECK_I(expr, ...) \
do { \
if (WJR_UNLIKELY(!(expr))) { \
::wjr::__assert_handler(#expr, WJR_FILE, WJR_CURRENT_FUNCTION, WJR_LINE, \
##__VA_ARGS__); \
WJR_ASSERT_FAILED_HANDLER(#expr, WJR_FILE, WJR_CURRENT_FUNCTION, WJR_LINE, \
##__VA_ARGS__); \
WJR_UNREACHABLE(); \
} \
} while (false)
@@ -83,8 +114,8 @@ inline void __assert_handler(const char *expr, const char *file, const char *fun
#define WJR_ASSERT_ASSUME_CHECK_I(expr, ...) \
do { \
if (WJR_UNLIKELY(!(expr))) { \
::wjr::__assert_handler(#expr, WJR_FILE, WJR_CURRENT_FUNCTION, WJR_LINE, \
##__VA_ARGS__); \
WJR_ASSERT_FAILED_HANDLER(#expr, WJR_FILE, WJR_CURRENT_FUNCTION, WJR_LINE, \
##__VA_ARGS__); \
WJR_UNREACHABLE(); \
} \
WJR_ASSUME(expr); \
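
The hunks above route every check macro through a single `WJR_ASSERT_FAILED_HANDLER` name, and the light handlers accept and ignore the file, function, and line arguments so the call site stays identical in both modes. A reduced sketch of that pattern follows. All `demo_`/`DEMO_` names are hypothetical, and the handlers throw instead of terminating (unlike the `noexcept` originals) purely so the sketch can be exercised.

```cpp
#include <cassert>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the full failure path: reports the location.
[[noreturn]] inline void demo_failed(const char *expr, const char *file, int line) {
    std::ostringstream os;
    os << file << ':' << line << ": assertion failed: " << expr;
    throw std::logic_error(os.str());
}

// Hypothetical stand-in for the light path: same signature, but the
// location arguments are ignored and only the expression is reported.
[[noreturn]] inline void demo_light_failed(const char *expr, const char *, int) {
    throw std::logic_error(std::string("assertion failed: ") + expr);
}

// Pick the handler once; every check macro expands through this name.
#if defined(DEMO_LIGHT_ASSERT)
#define DEMO_ASSERT_FAILED_HANDLER demo_light_failed
#else
#define DEMO_ASSERT_FAILED_HANDLER demo_failed
#endif

#define DEMO_ASSERT(expr)                                                      \
    do {                                                                       \
        if (!(expr)) {                                                         \
            DEMO_ASSERT_FAILED_HANDLER(#expr, __FILE__, __LINE__);             \
        }                                                                      \
    } while (false)

// Returns the failure message for a check, or an empty string if it passed.
inline std::string run_check(bool ok) {
    try {
        DEMO_ASSERT(ok);
    } catch (const std::logic_error &e) {
        return e.what();
    }
    return {};
}
```

Keeping both handlers call-compatible means the check macros never need their own `#if`; defining `DEMO_LIGHT_ASSERT` before this code switches every expansion at once.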
2 changes: 1 addition & 1 deletion include/wjr/concurrency/spin_mutex.hpp
@@ -22,7 +22,7 @@ class spin_mutex {
spin_mutex(const spin_mutex &) = delete;
spin_mutex &operator=(const spin_mutex &) = delete;

#if WJR_DEBUG_LEVEL > 2
#if WJR_DEBUG_LEVEL >= 3
~spin_mutex() { WJR_ASSERT_L0(!m_flag.load(memory_order_acquire)); }
#endif

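
For integer levels, `WJR_DEBUG_LEVEL >= 3` selects the same builds as the previous `> 2`; the rewrite only makes the threshold explicit. The guarded destructor check can be sketched on a minimal spin lock. This is an assumption-laden illustration: `demo_spin_mutex` and `DEMO_DEBUG_LEVEL` are hypothetical, plain `assert` stands in for `WJR_ASSERT_L0`, and the rest of the real class is not shown in the diff.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical debug knob for this sketch; it plays the role of WJR_DEBUG_LEVEL.
#ifndef DEMO_DEBUG_LEVEL
#define DEMO_DEBUG_LEVEL 3
#endif

class demo_spin_mutex {
public:
    demo_spin_mutex() = default;
    demo_spin_mutex(const demo_spin_mutex &) = delete;
    demo_spin_mutex &operator=(const demo_spin_mutex &) = delete;

#if DEMO_DEBUG_LEVEL >= 3
    // Only the most thorough debug level pays for this check: destroying a
    // mutex that is still held is a bug in the caller.
    ~demo_spin_mutex() { assert(!m_flag.load(std::memory_order_acquire)); }
#endif

    // Try once to take the lock; returns true on success.
    bool try_lock() noexcept {
        return !m_flag.exchange(true, std::memory_order_acquire);
    }

    // Spin until the lock is acquired.
    void lock() noexcept {
        while (!try_lock()) {
            // Wait on plain loads so contended waiters do not keep writing.
            while (m_flag.load(std::memory_order_relaxed)) {
            }
        }
    }

    void unlock() noexcept { m_flag.store(false, std::memory_order_release); }

private:
    std::atomic<bool> m_flag{false};
};
```

Omitting the destructor below level 3 keeps the class trivially destructible in lighter builds.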
2 changes: 1 addition & 1 deletion include/wjr/crtp/nonsendable.hpp
@@ -17,7 +17,7 @@

#include <wjr/assert.hpp>

#if WJR_DEBUG_LEVEL > 2
#if WJR_DEBUG_LEVEL >= 3
#define WJR_HAS_DEBUG_NONSENDABLE_CHECKER WJR_HAS_DEF
#endif

2 changes: 0 additions & 2 deletions include/wjr/expected.hpp
@@ -383,8 +383,6 @@ struct expected_storage_base<T, compressed_unexpected<E, init>, false> {
std::is_nothrow_destructible_v<T> &&std::is_nothrow_destructible_v<E>) {
if (this->has_value()) {
std::destroy_at(std::addressof(this->m_val));
} else {
std::destroy_at(std::addressof(this->m_err));
}
}

2 changes: 1 addition & 1 deletion include/wjr/iterator/detail.hpp
@@ -122,7 +122,7 @@ template <typename Iter, WJR_REQUIRES(is_contiguous_iterator_v<Iter>)>
using iterator_contiguous_pointer_t =
std::add_pointer_t<iterator_contiguous_value_t<Iter>>;

#if WJR_DEBUG_LEVEL > 1
#if WJR_DEBUG_LEVEL >= 2
#define WJR_HAS_DEBUG_CONTIGUOUS_ITERATOR_CHECK WJR_HAS_DEF
#endif

41 changes: 18 additions & 23 deletions include/wjr/math/mul.hpp
@@ -11,7 +11,6 @@
#include <wjr/math/bit.hpp>
#include <wjr/math/shift.hpp>
#include <wjr/math/sub.hpp>
#include <wjr/memory/safe_pointer.hpp>

#if defined(_MSC_VER) && defined(WJR_X86)
#define WJR_HAS_BUILTIN_MSVC_MULH64 WJR_HAS_DEF
@@ -553,22 +552,20 @@ using toom_interpolation_high_p_struct = std::array<uint64_t, P - 2>;
stk usage : l * 2
*/
extern void toom22_mul_s(uint64_t *WJR_RESTRICT dst, const uint64_t *src0, size_t n,
const uint64_t *src1, size_t m,
safe_pointer<uint64_t> stk) noexcept;
const uint64_t *src1, size_t m, uint64_t *stk) noexcept;

extern void toom2_sqr(uint64_t *WJR_RESTRICT dst, const uint64_t *src, size_t n,
safe_pointer<uint64_t> stk) noexcept;
uint64_t *stk) noexcept;

/*
l = ceil(n/3)
stk usage : l * 4
*/
extern void toom33_mul_s(uint64_t *WJR_RESTRICT dst, const uint64_t *src0, size_t n,
const uint64_t *src1, size_t m,
safe_pointer<uint64_t> stk) noexcept;
const uint64_t *src1, size_t m, uint64_t *stk) noexcept;

extern void toom3_sqr(uint64_t *WJR_RESTRICT dst, const uint64_t *src, size_t n,
safe_pointer<uint64_t> stk) noexcept;
uint64_t *stk) noexcept;

WJR_CONST WJR_INTRINSIC_CONSTEXPR size_t toom22_s_itch(size_t m) noexcept {
return m * 4 + (m / 2) + 64;
@@ -615,19 +612,17 @@ WJR_INTRINSIC_INLINE void mul_s(uint64_t *WJR_RESTRICT dst, const uint64_t *src0
}

template <typename T>
safe_pointer<uint64_t> __mul_s_allocate(T &al, WJR_MAYBE_UNUSED size_t n) noexcept {
if constexpr (std::is_same_v<T, safe_pointer<uint64_t>>) {
uint64_t *__mul_s_allocate(T &al, WJR_MAYBE_UNUSED size_t n) noexcept {
if constexpr (std::is_same_v<T, uint64_t *>) {
return al;
} else {
return span<uint64_t>(static_cast<uint64_t *>(al.allocate(sizeof(uint64_t) * n)),
n);
return static_cast<uint64_t *>(al.allocate(sizeof(uint64_t) * n));
}
}

template <__mul_mode mode>
void __inline_mul_n_impl(uint64_t *WJR_RESTRICT dst, const uint64_t *src0,
const uint64_t *src1, size_t n,
safe_pointer<uint64_t> mal) noexcept {
const uint64_t *src1, size_t n, uint64_t *mal) noexcept {
WJR_ASSERT_ASSUME(n >= 1);
WJR_ASSERT_L2(WJR_IS_SEPARATE_P(dst, n * 2, src0, n));
WJR_ASSERT_L2(WJR_IS_SEPARATE_P(dst, n * 2, src1, n));
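
With `safe_pointer` removed, `__mul_s_allocate` reduces to a compile-time choice: if the caller already passes a `uint64_t *`, reuse it; otherwise request `n` words from the allocator-like object. A standalone sketch of that dispatch follows. `scratch_for` and `demo_arena` are hypothetical names; the only assumption carried over from the diff is that `allocate` takes a size in bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>
#include <vector>

// Hypothetical arena used only for this sketch; allocate() takes a size in
// bytes, matching the al.allocate(sizeof(uint64_t) * n) call in the diff.
class demo_arena {
public:
    void *allocate(std::size_t bytes) {
        // Round up to whole 64-bit words so every block stays aligned.
        const std::size_t words =
            (bytes + sizeof(std::uint64_t) - 1) / sizeof(std::uint64_t);
        m_blocks.emplace_back(words);
        return m_blocks.back().data();
    }
    std::size_t block_count() const noexcept { return m_blocks.size(); }

private:
    std::vector<std::vector<std::uint64_t>> m_blocks; // freed with the arena
};

// Either reuse scratch the caller already owns, or carve n words from `al`.
template <typename T>
std::uint64_t *scratch_for(T &al, [[maybe_unused]] std::size_t n) {
    if constexpr (std::is_same_v<T, std::uint64_t *>) {
        return al; // caller-provided scratch: no allocation happens
    } else {
        return static_cast<std::uint64_t *>(
            al.allocate(sizeof(std::uint64_t) * n));
    }
}
```

Because `T` is deduced from an lvalue, the pointer branch is taken only for a variable of type exactly `std::uint64_t *`; passing an array name directly would deduce an array type and fall into the allocator branch, so callers hold the scratch in a pointer variable first.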
@@ -637,11 +632,11 @@ void __inline_mul_n_impl(uint64_t *WJR_RESTRICT dst, const uint64_t *src0,
}

if (mode <= __mul_mode::toom22 || n < toom33_mul_threshold) {
safe_pointer<uint64_t> stk = __mul_s_allocate(mal, toom22_n_itch(n));
uint64_t *stk = __mul_s_allocate(mal, toom22_n_itch(n));
return toom22_mul_s(dst, src0, n, src1, n, stk);
}

safe_pointer<uint64_t> stk = __mul_s_allocate(mal, toom33_n_itch(n));
uint64_t *stk = __mul_s_allocate(mal, toom33_n_itch(n));
return toom33_mul_s(dst, src0, n, src1, n, stk);
}

@@ -651,7 +646,7 @@ extern void __noinline_mul_n_impl(uint64_t *WJR_RESTRICT dst, const uint64_t *sr
template <__mul_mode mode>
WJR_INTRINSIC_INLINE void __mul_n(uint64_t *WJR_RESTRICT dst, const uint64_t *src0,
const uint64_t *src1, size_t n,
WJR_MAYBE_UNUSED safe_pointer<uint64_t> stk) noexcept {
WJR_MAYBE_UNUSED uint64_t *stk) noexcept {
if constexpr (mode <= __mul_mode::toom33) {
__inline_mul_n_impl<mode>(dst, src0, src1, n, stk);
} else {
@@ -661,7 +656,7 @@ WJR_INTRINSIC_INLINE void __mul_n(uint64_t *WJR_RESTRICT dst, const uint64_t *sr

template <__mul_mode mode, uint64_t m0 = UINT64_MAX, uint64_t m1 = UINT64_MAX>
void __mul_n(uint64_t *WJR_RESTRICT dst, const uint64_t *src0, const uint64_t *src1,
size_t n, safe_pointer<uint64_t> stk, uint64_t &c_out, uint64_t cf0,
size_t n, uint64_t *stk, uint64_t &c_out, uint64_t cf0,
uint64_t cf1) noexcept {
WJR_ASSERT_ASSUME(cf0 <= m0);
WJR_ASSERT_ASSUME(cf1 <= m1);
@@ -697,7 +692,7 @@ WJR_INTRINSIC_INLINE void mul_n(uint64_t *WJR_RESTRICT dst, const uint64_t *src0

template <__mul_mode mode>
inline void __inline_sqr_impl(uint64_t *WJR_RESTRICT dst, const uint64_t *src, size_t n,
safe_pointer<uint64_t> mal) noexcept {
uint64_t *mal) noexcept {
WJR_ASSERT_ASSUME(n >= 1);
WJR_ASSERT_L2(WJR_IS_SEPARATE_P(dst, n * 2, src, n));

@@ -706,11 +701,11 @@ inline void __inline_sqr_impl(uint64_t *WJR_RESTRICT dst, const uint64_t *src, s
}

if (mode <= __mul_mode::toom22 || n < toom3_sqr_threshold) {
safe_pointer<uint64_t> stk = __mul_s_allocate(mal, toom22_n_itch(n));
uint64_t *stk = __mul_s_allocate(mal, toom22_n_itch(n));
return toom2_sqr(dst, src, n, stk);
}

safe_pointer<uint64_t> stk = __mul_s_allocate(mal, toom33_n_itch(n));
uint64_t *stk = __mul_s_allocate(mal, toom33_n_itch(n));
return toom3_sqr(dst, src, n, stk);
}

@@ -719,7 +714,7 @@ extern void __noinline_sqr_impl(uint64_t *WJR_RESTRICT dst, const uint64_t *src,

template <__mul_mode mode>
void __sqr(uint64_t *WJR_RESTRICT dst, const uint64_t *src, size_t n,
WJR_MAYBE_UNUSED safe_pointer<uint64_t> stk) noexcept {
WJR_MAYBE_UNUSED uint64_t *stk) noexcept {
if constexpr (mode <= __mul_mode ::toom33) {
__inline_sqr_impl<mode>(dst, src, n, stk);
} else {
@@ -728,8 +723,8 @@ void __sqr(uint64_t *WJR_RESTRICT dst, const uint64_t *src, size_t n,
}

template <__mul_mode mode, uint64_t m = UINT64_MAX>
void __sqr(uint64_t *WJR_RESTRICT dst, const uint64_t *src, size_t n,
safe_pointer<uint64_t> stk, uint64_t &c_out, uint64_t cf) noexcept {
void __sqr(uint64_t *WJR_RESTRICT dst, const uint64_t *src, size_t n, uint64_t *stk,
uint64_t &c_out, uint64_t cf) noexcept {
WJR_ASSERT_ASSUME(cf <= m);

__sqr<mode>(dst, src, n, stk);