forked from rapidsai/cudf
test #7
Open
galipremsagar wants to merge 1,699 commits into branch-24.06 from test
galipremsagar pushed a commit that referenced this pull request on Nov 8, 2024
Fixes call to `data_type{}` ctor in `json_test.cpp`. The 2-parameter ctor is for fixed-point-types only and will assert in a debug build if used incorrectly: https://github.com/rapidsai/cudf/blob/2db58d58b4a986c2c6fad457f291afb1609fd458/cpp/include/cudf/types.hpp#L277-L280

Partial stack trace from a gdb run

```
#5 0x000077b1530bc71b in __assert_fail_base (fmt=0x77b153271130 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x58c3e4baaa98 "id == type_id::DECIMAL32 || id == type_id::DECIMAL64 || id == type_id::DECIMAL128", file=0x58c3e4baaa70 "/cudf/cpp/include/cudf/types.hpp", line=279, function=<optimized out>) at ./assert/assert.c:92
#6 0x000077b1530cde96 in __GI___assert_fail ( assertion=0x58c3e4baaa98 "id == type_id::DECIMAL32 || id == type_id::DECIMAL64 || id == type_id::DECIMAL128", file=0x58c3e4baaa70 "/cudf/cpp/include/cudf/types.hpp", line=279, function=0x58c3e4baaa38 "cudf::data_type::data_type(cudf::type_id, int32_t)") at ./assert/assert.c:101
#7 0x000058c3e48ba594 in cudf::data_type::data_type (this=0x7fffdd3f7530, id=cudf::type_id::STRING, scale=0) at /cudf/cpp/include/cudf/types.hpp:279
#8 0x000058c3e49215d9 in JsonReaderTest_MixedTypesWithSchema_Test::TestBody (this=0x58c3e5ea13a0) at /cudf/cpp/tests/io/json/json_test.cpp:2887
```

Authors: - David Wendt (https://github.com/davidwendt) Approvers: - Karthikeyan (https://github.com/karthikeyann) - Muhammad Haseeb (https://github.com/mhaseeb123) URL: rapidsai#17273
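The failure mode in the commit above (a two-parameter constructor that asserts unless the type is fixed-point) can be illustrated with a minimal sketch. The `type_id` and `data_type` definitions below are simplified stand-ins written for this example, not the real cudf declarations, and `make_string_type` / `make_decimal64_type` are hypothetical helpers.

```cpp
#include <cassert>
#include <cstdint>

// Simplified stand-ins for illustration only; not the real cudf definitions.
enum class type_id : std::int32_t { INT32, STRING, DECIMAL32, DECIMAL64, DECIMAL128 };

constexpr bool is_fixed_point(type_id id)
{
  return id == type_id::DECIMAL32 || id == type_id::DECIMAL64 || id == type_id::DECIMAL128;
}

class data_type {
 public:
  // One-parameter constructor: valid for every type id.
  explicit constexpr data_type(type_id id) : _id{id} {}

  // Two-parameter constructor: only meaningful for fixed-point types.
  // The assert fires in debug builds (NDEBUG not defined) when misused,
  // e.g. data_type{type_id::STRING, 0}.
  data_type(type_id id, std::int32_t scale) : _id{id}, _scale{scale}
  {
    assert(is_fixed_point(id));
  }

  constexpr type_id id() const { return _id; }
  constexpr std::int32_t scale() const { return _scale; }

 private:
  type_id _id;
  std::int32_t _scale{0};
};

// Hypothetical helper: a non-decimal type uses the one-parameter constructor.
inline data_type make_string_type() { return data_type{type_id::STRING}; }

// Hypothetical helper: a decimal type may carry an explicit scale.
inline data_type make_decimal64_type(std::int32_t scale)
{
  return data_type{type_id::DECIMAL64, scale};
}
```

Under these assumptions, the fix amounts to calling the one-parameter form for `STRING` rather than passing a scale of `0`.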
Follow up to rapidsai#16715. Now that the usages of the `masked` keyword in RAPIDS have been addressed (rapidsai/cuspatial#1496 is the only one I could find), I think we can remove this keyword altogether in this method. Authors: - Matthew Roeschke (https://github.com/mroeschke) Approvers: - Vyas Ramasubramani (https://github.com/vyasr) URL: rapidsai#17530
Contributes to rapidsai#17317 Authors: - Matthew Roeschke (https://github.com/mroeschke) Approvers: - Matthew Murray (https://github.com/Matt711) URL: rapidsai#17488
Contributes to rapidsai#7795. Also contributes to rapidsai/build-planning#76. Authors: - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Nghia Truong (https://github.com/ttnghia) - Yunsong Wang (https://github.com/PointKernel) - Bradley Dice (https://github.com/bdice) - David Wendt (https://github.com/davidwendt) URL: rapidsai#17545
From an offline conversation, fixes the following discrepancy between cudf and pandas

```python
In [1]: import cudf

In [2]: import numpy as np

In [3]: ser = cudf.Series([np.nan, np.nan, 0.9], nan_as_null=False)

In [4]: ser
Out[4]:
0    NaN
1    NaN
2    0.9
dtype: float64

In [5]: ser.quantile(0.9)
Out[5]: np.float64(nan)

In [6]: import pandas as pd

In [7]: ser = pd.Series([np.nan, np.nan, 0.9])

In [8]: ser.quantile(0.9)
Out[8]: np.float64(0.9)
```

Authors: - Matthew Roeschke (https://github.com/mroeschke) Approvers: - Bradley Dice (https://github.com/bdice) URL: rapidsai#17593
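The pandas result in the session above is what one gets when NaN entries are dropped before the quantile is computed. A minimal sketch of that behaviour, assuming linear interpolation between neighbouring order statistics (`quantile_skipping_nan` is a hypothetical helper written for this example, not a cudf or pandas API):

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Quantile with linear interpolation that ignores NaN entries.
// Returns NaN when no non-NaN values remain.
inline double quantile_skipping_nan(std::vector<double> values, double q)
{
  assert(q >= 0.0 && q <= 1.0);

  // Drop NaNs first so they cannot poison the sort or the interpolation.
  values.erase(std::remove_if(values.begin(), values.end(),
                              [](double v) { return std::isnan(v); }),
               values.end());
  if (values.empty()) { return std::numeric_limits<double>::quiet_NaN(); }

  std::sort(values.begin(), values.end());

  // Fractional position of the requested quantile in the sorted data.
  double const pos  = q * static_cast<double>(values.size() - 1);
  auto const lo     = static_cast<std::size_t>(std::floor(pos));
  auto const hi     = static_cast<std::size_t>(std::ceil(pos));
  double const frac = pos - static_cast<double>(lo);

  return values[lo] + frac * (values[hi] - values[lo]);
}
```

For the input `[NaN, NaN, 0.9]` only one value survives the filter, so every quantile of the remaining data is `0.9`.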
This PR exposes all JSON reader options in pylibcudf and enables them via kwargs in `cudf.read_json`. Since kwargs cannot be used in Cython, they are passed to Cython as a dict. These options are intentionally hidden in the docs, as they are currently used mostly for testing feature requests from the Spark JSON reader. These options are expected to change. Authors: - Karthikeyan (https://github.com/karthikeyann) Approvers: - Matthew Murray (https://github.com/Matt711) URL: rapidsai#17563
Contributes to rapidsai#17317 More can be removed once my other cudf._lib PRs are in Authors: - Matthew Roeschke (https://github.com/mroeschke) Approvers: - Bradley Dice (https://github.com/bdice) URL: rapidsai#17586
Contributes to rapidsai#17317 Also I found that `PackedColumns` was not being used anywhere. It appears it was added back in rapidsai#8153 for dask_cudf, but I cannot see it being used there anymore. Authors: - Matthew Roeschke (https://github.com/mroeschke) Approvers: - Vyas Ramasubramani (https://github.com/vyasr) URL: rapidsai#17548
… `/src`) (rapidsai#17550) Replaced the calls to `cudaMemcpyAsync` with the new `cuda_memcpy`/`cuda_memcpy_async` utility, which optionally avoids using the copy engine. Also took the opportunity to use cudf::detail::host_vector and its factories to enable wider pinned memory use. Remaining instances are either not viable (e.g. copying `h_needs_fallback`, interop) or D2D copies. Authors: - Vukasin Milovanovic (https://github.com/vuule) Approvers: - David Wendt (https://github.com/davidwendt) - Nghia Truong (https://github.com/ttnghia) URL: rapidsai#17550
Resolves rapidsai#17595 Authors: - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Nghia Truong (https://github.com/ttnghia) - David Wendt (https://github.com/davidwendt) URL: rapidsai#17603
…, ...)` type (rapidsai#17604) From an offline discussion, a pandas object with a `category[interval[...]]` type would be incorrectly interpreted as a `category[struct[...]]` type. This can cause further problems with `cudf.pandas`, as a `category[struct[...]]` type cannot be properly interpreted by pandas. Authors: - Matthew Roeschke (https://github.com/mroeschke) Approvers: - Bradley Dice (https://github.com/bdice) URL: rapidsai#17604
) Contributes to rapidsai#17317 Authors: - Matthew Roeschke (https://github.com/mroeschke) Approvers: - Vyas Ramasubramani (https://github.com/vyasr) URL: rapidsai#17456
…sai#17460) Contributes to rapidsai#17317 Authors: - Matthew Roeschke (https://github.com/mroeschke) Approvers: - Vyas Ramasubramani (https://github.com/vyasr) URL: rapidsai#17460
Clang-tidy does not like `[[nodiscard]]` after `__device__` and I don't like red squiggly lines. Authors: - Vukasin Milovanovic (https://github.com/vuule) Approvers: - Yunsong Wang (https://github.com/PointKernel) - David Wendt (https://github.com/davidwendt) URL: rapidsai#17608
Recent changes in dask and dask-expr have broken `dask_cudf.read_csv` (dask/dask-expr#1178, dask/dask#11603). Fortunately, the breaking changes help us avoid legacy CSV code in the long run. Authors: - Richard (Rick) Zamora (https://github.com/rjzamora) Approvers: - Peter Andreas Entschev (https://github.com/pentschev) - Lawrence Mitchell (https://github.com/wence-) URL: rapidsai#17612
A part of rapidsai#17565 Authors: - Matthew Murray (https://github.com/Matt711) Approvers: - Matthew Roeschke (https://github.com/mroeschke) URL: rapidsai#17599
…i#17611) A recent nightly failure discovered by @davidwendt here: https://github.com/rapidsai/cudf/actions/runs/12367794950/job/34543121050 indicates an environment cannot be created with `pytorch>=2.4.0` and `pyarrow==14.0.0 & 14.0.1`. Thus this bump to `14.0.2`. Authors: - GALI PREM SAGAR (https://github.com/galipremsagar) - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Bradley Dice (https://github.com/bdice) URL: rapidsai#17611
This PR has two fixes:
- Since we're pinning to a commit, a shallow clone will start failing as soon as HEAD gets bumped on the main branch (which will happen next when cuml/raft logging features are merged). We need to stop using shallow clones.
- The CMake code for setting the default logging levels was setting the wrong macro name.

Authors: - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Bradley Dice (https://github.com/bdice) URL: rapidsai#17588
…apidsai#17610) Fixes memcheck error found in nightly build checks in the STREAM_REPLACE_TEST's `ReplaceTest.NormalizeNansAndZerosMutable` gtest. The mutable-view passed to the `cudf::normalize_nans_and_zeros` API was pointing to invalidated data. The following line created the invalid view

```
cudf::mutable_column_view mutable_view = cudf::column(input, cudf::test::get_default_stream());
```

The temporary `cudf::column` is destroyed once the `mutable_view` is created so this view would now point to a freed column. The view must be created from a non-temporary column and also must be non-temporary itself so that it is not implicitly converted to a `column_view`. Error introduced by rapidsai#17436

Authors: - David Wendt (https://github.com/davidwendt) Approvers: - Vyas Ramasubramani (https://github.com/vyasr) - Vukasin Milovanovic (https://github.com/vuule) URL: rapidsai#17610
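The lifetime problem described in the commit above is general to any non-owning view built from a temporary owner. A minimal sketch of the safe pattern, using simplified stand-in types (`column`, `mutable_view`, `negate_in_place`, and `negate_copy` are all hypothetical names for this example, not the cudf classes or APIs):

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Non-owning view: valid only while the owning column is alive.
struct mutable_view {
  int* data;
  std::size_t size;
};

// Owning container, standing in for an owning column type.
class column {
 public:
  explicit column(std::vector<int> values) : _values(std::move(values)) {}
  mutable_view view() { return {_values.data(), _values.size()}; }

 private:
  std::vector<int> _values;
};

// Stand-in for an API that mutates data in place through a view.
inline void negate_in_place(mutable_view v)
{
  assert(v.data != nullptr || v.size == 0);
  for (std::size_t i = 0; i < v.size; ++i) { v.data[i] = -v.data[i]; }
}

// BAD (left as a comment on purpose): the temporary column is destroyed at
// the end of the full expression, so the view dangles immediately.
//   mutable_view v = column{std::vector<int>{1, 2, 3}}.view();

// GOOD: the owner is a named object that outlives every use of the view.
inline std::vector<int> negate_copy(std::vector<int> const& input)
{
  column owner{input};             // named owner, lives until end of scope
  mutable_view v = owner.view();   // named view, used while owner is alive
  negate_in_place(v);
  return std::vector<int>(v.data, v.data + v.size);
}
```

Keeping both the owner and the view as named locals makes the lifetime relationship visible at the call site.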
Contributes to rapidsai#17317 Authors: - Matthew Roeschke (https://github.com/mroeschke) Approvers: - Vyas Ramasubramani (https://github.com/vyasr) URL: rapidsai#17555
Forward-merge branch-25.02 into branch-25.04
galipremsagar force-pushed the test branch 2 times, most recently from 2b55b8b to e627f5e on January 31, 2025 21:44
Towards rapidsai#17843 Authors: - Matthew Roeschke (https://github.com/mroeschke) Approvers: - Matthew Murray (https://github.com/Matt711) URL: rapidsai#17863
This migrates amd64 CI jobs (PRs and nightlies) to use L4 GPUs from the NVKS cluster. xref: rapidsai/build-infra#184 Authors: - Bradley Dice (https://github.com/bdice) Approvers: - James Lamb (https://github.com/jameslamb) URL: rapidsai#17877
Moving forward with removal of the (redundant) `gpu` namespace in cuIO. Also moved the entire ORC implementation to `cudf::io::orc::detail`, leaving only the implementation of the public API in `cudf::io::orc`. Also removed a few unused headers, or moved them to be included in the right files. Authors: - Vukasin Milovanovic (https://github.com/vuule) Approvers: - Yunsong Wang (https://github.com/PointKernel) - Muhammad Haseeb (https://github.com/mhaseeb123) URL: rapidsai#17891
## Description
This PR fixes cudf ci nightly test failures: https://github.com/rapidsai/cudf/actions/runs/13097249137/job/36541039646

## Checklist
- [x] I am familiar with the [Contributing Guidelines](https://github.com/rapidsai/cudf/blob/HEAD/CONTRIBUTING.md).
- [x] New or existing tests cover these changes.
- [x] The documentation is up to date with these changes.
Forward-merge branch-25.02 into branch-25.04
`data` attribute of numpy should be marked private as it actually points to the underlying memory and it will be distinct for a cupy array. Authors: - GALI PREM SAGAR (https://github.com/galipremsagar) Approvers: - Bradley Dice (https://github.com/bdice) - Vyas Ramasubramani (https://github.com/vyasr) URL: rapidsai#17890
Forward-merge branch-25.02 into branch-25.04
Issue: rapidsai/pre-commit-hooks#61 Authors: - Kyle Edwards (https://github.com/KyleFromNVIDIA) Approvers: - Bradley Dice (https://github.com/bdice) URL: rapidsai#17887
Toward rapidsai#17843 Authors: - Matthew Roeschke (https://github.com/mroeschke) Approvers: - Vyas Ramasubramani (https://github.com/vyasr) URL: rapidsai#17897
rapidsai#17839) xref rapidsai#12494 and rapidsai#12495

`cudf.dtype` is useful when cudf is passed a `dtype` argument from a user to perform inference on the input to make it cudf-compatible. Internally, we don't need this inference because we know the exact types to be passed & that are supported by cudf (columns), so this PR avoids calling `cudf.dtype` internally.

Generally:
* Define `CUDF_STRING_DTYPE` as a definitive cudf Python string type instead of `cudf/np.dtype("O"/"object", "str")`
* Prefer using `np.<type>` instead of `"<type>"` (using `np.` like an enum namespace)

Authors: - Matthew Roeschke (https://github.com/mroeschke) Approvers: - Vyas Ramasubramani (https://github.com/vyasr) URL: rapidsai#17839
Contributes to rapidsai/build-planning#146

Proposes:
* setting `[tool.scikit-build].ninja.make-fallback = false`, so `scikit-build-core` will not silently fall back to using GNU Make if `ninja` is not available

Authors: - James Lamb (https://github.com/jameslamb) Approvers: - Bradley Dice (https://github.com/bdice) URL: rapidsai#17894
…sistency (rapidsai#17908) Some older code in the ORC reader/writer uses PascalCase, which is not used in the rest of libcudf. This PR renames such functions and types to align the style with the rest of the code base. The types that are based on the ORC specs are kept as PascalCase to make it easy to identify such types. Authors: - Vukasin Milovanovic (https://github.com/vuule) Approvers: - Muhammad Haseeb (https://github.com/mhaseeb123) - Vyas Ramasubramani (https://github.com/vyasr) URL: rapidsai#17908
… labels (rapidsai#17905) closes rapidsai#17902 Authors: - Matthew Roeschke (https://github.com/mroeschke) Approvers: - GALI PREM SAGAR (https://github.com/galipremsagar) - Bradley Dice (https://github.com/bdice) URL: rapidsai#17905
This is a small change adding a script to run pylibcudf tests, like we have for other Python libraries in this repository. Authors: - Bradley Dice (https://github.com/bdice) Approvers: - Matthew Murray (https://github.com/Matt711) - Vyas Ramasubramani (https://github.com/vyasr) URL: rapidsai#17882
…17837) Fixes rapidsai#17836 Authors: - Shruti Shivakumar (https://github.com/shrshi) Approvers: - Vukasin Milovanovic (https://github.com/vuule) - Vyas Ramasubramani (https://github.com/vyasr) URL: rapidsai#17837
Forward-merge branch-25.02 into branch-25.04
Run the CI for `spark-rapids-jni` to ensure that we don't break their build. Resolves rapidsai#17337 Authors: - Kyle Edwards (https://github.com/KyleFromNVIDIA) Approvers: - Gera Shegalov (https://github.com/gerashegalov) - Vyas Ramasubramani (https://github.com/vyasr) - Bradley Dice (https://github.com/bdice) URL: rapidsai#17781
…es in cudf::io (rapidsai#17734) As part of the improvement effort discussed in rapidsai#15907, this merge request removes some of the excessive `std::string` copies and uses `std::string_view` in place of `std::string` when the lifetime semantics are clear. `std::string` is only replaced in this MR in linear functions and constructors, but not in structs as there's no established ownership or lifetime semantics to guarantee the `string_view`s will not outlive their source.

There were also some cases of excessive copies, i.e. consider:

```cpp
struct source_info{
  source_info(std::string const& s) : str{s}{}
 private:
  std::string str;
};
```

In the above example, the string is likely to be allocated twice if a temporary/string-literal is used to construct "s": one for the temporary and one for the copy constructor for `str`

```cpp
struct source_info{
  source_info(std::string s) : str{std::move(s)}{}
 private:
  std::string str;
};
```

The string is only allocated once in all scenarios. This also applies to `std::vector` and is arguably worse as there's no small-vector-optimization (i.e. `std::string`'s small-string-optimization/SSO).

Authors: - Basit Ayantunde (https://github.com/lamarrr) - Muhammad Haseeb (https://github.com/mhaseeb123) - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Tianyu Liu (https://github.com/kingcrimsontianyu) - Muhammad Haseeb (https://github.com/mhaseeb123) - David Wendt (https://github.com/davidwendt) URL: rapidsai#17734
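The copy-versus-move argument in the commit above can be checked by counting copy constructions. The sketch below is an illustration only: `tracked_string` is a hypothetical wrapper that counts copies, and the two `source_info_*` structs mirror the two constructor styles from the commit message rather than the real cudf type.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical string wrapper that counts how often it is copy-constructed.
struct tracked_string {
  static int& copies()
  {
    static int count = 0;
    return count;
  }

  tracked_string(char const* s) : value(s) {}
  tracked_string(tracked_string const& other) : value(other.value) { ++copies(); }
  tracked_string(tracked_string&& other) noexcept : value(std::move(other.value)) {}

  std::string value;
};

// Style 1: take by const reference, then copy into the member.
struct source_info_by_ref {
  explicit source_info_by_ref(tracked_string const& s) : str(s) {}
  tracked_string str;
};

// Style 2: take by value, then move into the member.
struct source_info_by_value {
  explicit source_info_by_value(tracked_string s) : str(std::move(s)) {}
  tracked_string str;
};

// Constructing from a temporary: the by-reference style must copy it.
inline int copies_for_by_ref_temporary()
{
  tracked_string::copies() = 0;
  source_info_by_ref info{tracked_string{"a path long enough to defeat SSO"}};
  assert(!info.str.value.empty());
  return tracked_string::copies();
}

// Constructing from a temporary: the by-value style only moves.
inline int copies_for_by_value_temporary()
{
  tracked_string::copies() = 0;
  source_info_by_value info{tracked_string{"a path long enough to defeat SSO"}};
  assert(!info.str.value.empty());
  return tracked_string::copies();
}

// Constructing from a named object: one copy into the parameter, then a move.
inline int copies_for_by_value_lvalue()
{
  tracked_string::copies() = 0;
  tracked_string path{"a path long enough to defeat SSO"};
  source_info_by_value info{path};
  assert(!info.str.value.empty());
  return tracked_string::copies();
}
```

In this sketch the by-value constructor never performs more copies than the by-reference one, and performs none when handed a temporary.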
…apidsai#17859) The PTX parser replaces PTX code with inline PTX code (using inline ASM blocks). It considers a branch label and the immediate instruction as a single unit to process. During the ASM->CUDA transform step, it searches for the `ret` instruction in the string and replaces the whole statement and not the substring that contains the `ret;` instruction, which means an expression like:

```asm
BB0_1:
ret;
```

is parsed as:

```asm
BB0_1: ret;
```

and then transformed to:

```asm
bra RETTGT
```

instead of:

```asm
BB0_1: bra RETTGT
```

This merge request fixes this bug.

Authors: - Basit Ayantunde (https://github.com/lamarrr) Approvers: - David Wendt (https://github.com/davidwendt) - Shruti Shivakumar (https://github.com/shrshi) URL: rapidsai#17859
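The difference between replacing the whole statement and replacing only the `ret;` substring can be sketched with plain string handling. Both functions below are hypothetical illustrations, not the cudf PTX parser, and the replacement text `bra RETTGT;` (with a trailing semicolon) is an assumption made so the output stays a complete statement.

```cpp
#include <cassert>
#include <string>

// Buggy behaviour as described above: if the statement contains `ret;`,
// the entire statement (including any leading label) is replaced.
inline std::string replace_ret_whole_statement(std::string const& statement,
                                               std::string const& replacement = "bra RETTGT;")
{
  assert(!replacement.empty());
  return statement.find("ret;") != std::string::npos ? replacement : statement;
}

// Intended behaviour: only the `ret;` substring is replaced, so a label
// such as `BB0_1:` in front of it survives.
// Note: this naive search matches the first `ret;` anywhere in the text;
// a real parser would match on instruction tokens instead.
inline std::string replace_ret_keep_label(std::string statement,
                                          std::string const& replacement = "bra RETTGT;")
{
  assert(!replacement.empty());
  std::string const needle = "ret;";
  auto const pos           = statement.find(needle);
  if (pos != std::string::npos) { statement.replace(pos, needle.size(), replacement); }
  return statement;
}
```

With the input `BB0_1: ret;`, the first function drops the label while the second keeps it, which matches the before and after described in the commit message.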
Fixes incorrect pylibcudf/libcudf example created in rapidsai#17803. Authors: - Matthew Murray (https://github.com/Matt711) Approvers: - Vyas Ramasubramani (https://github.com/vyasr) URL: rapidsai#17912
Forward-merge branch-25.02 into branch-25.04
Adds multi-partition `Join` support following the same design as rapidsai#17441 In order to support parallel joins, this PR also introduces a special `Shuffle` node. Authors: - Richard (Rick) Zamora (https://github.com/rjzamora) Approvers: - Matthew Murray (https://github.com/Matt711) - Lawrence Mitchell (https://github.com/wence-) URL: rapidsai#17518
Currently pylibcudf does not export a dependency on libcudf at all, which is incorrect. Authors: - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Peter Andreas Entschev (https://github.com/pentschev) - Bradley Dice (https://github.com/bdice) - James Lamb (https://github.com/jameslamb) URL: rapidsai#17915
Forward-merge branch-25.02 into branch-25.04
No description provided.