use memoryviews instead of ndarrays #22147

jbrockmendel · 2018-07-31T21:27:59Z

For lookups and slicing memoryviews are supposed to be more performant than ndarrays. I haven't been able to detect any difference. @xhochy IIRC you've looked at this in some depth; are memoryviews unambiguously better for our purposes?

Getting rid of from numpy cimport [...] is necessary (but not sufficient) for getting rid of numpy 1.7 deprecation warnings.

…ffers3

xhochy · 2018-08-01T07:57:25Z

I also had the feeling that they are more performant and flexible than using ndarray but they don't support const object[:] at the moment which was problematic in my case. I actually have the feeling that this is a bug in the type parsing of Cython as const object is not possible but const object[:] should be. Asked for clarification at cython/cython#2485 but got no answer yet and I did not fully understand the Cython codebase to answer it myself.

codecov · 2018-08-01T08:26:34Z

Codecov Report

Merging #22147 into master will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##           master   #22147   +/-   ##
=======================================
  Coverage   92.07%   92.07%           
=======================================
  Files         170      170           
  Lines       50694    50694           
=======================================
  Hits        46674    46674           
  Misses       4020     4020

Flag	Coverage Δ
#multiple	`90.47% <ø> (ø)`	⬆️
#single	`42.29% <ø> (ø)`	⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 57c7daa...4cfe9e8. Read the comment docs.

jreback · 2018-08-01T10:39:02Z

so this is not faster, but not slower? is it worth it at all?

jbrockmendel · 2018-08-01T19:13:02Z

so this is not faster, but not slower? is it worth it at all?

Based on %timeit results, the memoryviews do appear to be every-so-slightly faster.

The other upsides I have in mind are:

weens us off of being numpy-specific (think: arrow)
easier to lint (though we're a few steps away from that)
getting rid of from numpy cimport ... is prerequisite to getting rid of deprecation warnings

jreback · 2018-08-01T21:53:27Z

ok thanks

* master: (47 commits) Run tests in conda build [ci skip] (pandas-dev#22190) TST: Check DatetimeIndex.drop on DST boundary (pandas-dev#22165) CI: Fix Travis failures due to lint.sh on pandas/core/strings.py (pandas-dev#22184) Documentation: typo fixes in MultiIndex / Advanced Indexing (pandas-dev#22179) DOC: added .join to 'see also' in Series.str.cat (pandas-dev#22175) DOC: updated Series.str.contains see also section (pandas-dev#22176) 0.23.4 whatsnew (pandas-dev#22177) fix: scalar timestamp assignment (pandas-dev#19843) (pandas-dev#19973) BUG: Fix get dummies unicode error (pandas-dev#22131) Fixed py36-only syntax [ci skip] (pandas-dev#22167) DEPR: pd.read_table (pandas-dev#21954) DEPR: Removing previously deprecated datetools module (pandas-dev#6581) (pandas-dev#19119) BUG: Matplotlib scatter datetime (pandas-dev#22039) CLN: Use public method to capture UTC offsets (pandas-dev#22164) implement tslibs/src to make tslibs self-contained (pandas-dev#22152) Fix categorical from codes nan 21767 (pandas-dev#21775) BUG: Better handling of invalid na_option argument for groupby.rank(pandas-dev#22124) (pandas-dev#22125) use memoryviews instead of ndarrays (pandas-dev#22147) Remove depr. warning in SeriesGroupBy.count (pandas-dev#22155) API: Default to_* methods to compression='infer' (pandas-dev#22011) ...

jbrockmendel added 3 commits July 31, 2018 14:23

use memoryviews instead of ndarrays

59140c3

change more usages of ndarray to memorview

1b6d568

Merge branch 'master' of https://github.com/pandas-dev/pandas into bu…

4cfe9e8

…ffers3

jreback added the Compat pandas objects compatability with Numpy or Python functions label Aug 1, 2018

jreback added this to the 0.24.0 milestone Aug 1, 2018

jreback merged commit d010469 into pandas-dev:master Aug 1, 2018

jbrockmendel deleted the buffers3 branch August 1, 2018 22:16

dberenbaum pushed a commit to dberenbaum/pandas that referenced this pull request Aug 3, 2018

use memoryviews instead of ndarrays (pandas-dev#22147)

3f1f3b2

Sup3rGeo pushed a commit to Sup3rGeo/pandas that referenced this pull request Oct 1, 2018

use memoryviews instead of ndarrays (pandas-dev#22147)

0180e5b

Sup3rGeo pushed a commit to Sup3rGeo/pandas that referenced this pull request Oct 1, 2018

use memoryviews instead of ndarrays (pandas-dev#22147)

f4e7012

WillAyd mentioned this pull request Jul 12, 2019

Fix compilation warnings #27371

Merged

3 tasks

This was referenced Jun 17, 2020

BUG: Series(..) coerces datetime64[non-ns] array differently depending on writable flag #34843

Closed

BUG: read-only buffer failures in datetime parsing #34857

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

use memoryviews instead of ndarrays #22147

use memoryviews instead of ndarrays #22147

jbrockmendel commented Jul 31, 2018

xhochy commented Aug 1, 2018

codecov bot commented Aug 1, 2018 •

edited

Loading

jreback commented Aug 1, 2018

jbrockmendel commented Aug 1, 2018

jreback commented Aug 1, 2018

use memoryviews instead of ndarrays #22147

use memoryviews instead of ndarrays #22147

Conversation

jbrockmendel commented Jul 31, 2018

xhochy commented Aug 1, 2018

codecov bot commented Aug 1, 2018 • edited Loading

Codecov Report

jreback commented Aug 1, 2018

jbrockmendel commented Aug 1, 2018

jreback commented Aug 1, 2018

codecov bot commented Aug 1, 2018 •

edited

Loading