SPARK-1428: MLlib should convert non-float64 NumPy arrays to float64 instead of complaining #356

techaddict · 2014-04-08T06:48:09Z

No description provided.

mengxr · 2014-04-08T08:11:12Z

@techaddict Is it easy to add a test to verify that it works?

techaddict · 2014-04-08T14:50:39Z

@mengxr OK, I'll try adding some tests.

mateiz · 2014-04-08T18:15:53Z

Jenkins, this is ok to test

mateiz · 2014-04-08T18:16:42Z

You should also check that the vector does not contain complex numbers, since that is the one NumPy data type we can't convert to floats. You can do

if numpy.issubdtype(v.dtype, numpy.complex):
    raise TypeError(...)
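
A minimal sketch of that check for readers on current NumPy. One assumption beyond the thread: the `numpy.complex` alias was removed in NumPy 1.24, so the sketch tests against the abstract `numpy.complexfloating` type instead. The helper name `reject_complex` and the error message are hypothetical, not taken from the patch.

```python
import numpy as np

def reject_complex(v):
    """Raise TypeError if v holds complex values (hypothetical helper)."""
    # np.complexfloating is the abstract parent of complex64 and complex128.
    if np.issubdtype(v.dtype, np.complexfloating):
        raise TypeError("cannot convert complex array to float64")
    return v

reject_complex(np.array([1, 2, 3]))          # integer input passes the check
try:
    reject_complex(np.array([1 + 2j, 3j]))   # complex input is rejected
except TypeError as exc:
    print("rejected:", exc)
```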

techaddict · 2014-04-09T04:23:53Z

@mateiz Updated. Should I add tests too? IMHO there is no need, because it's a trivial patch.

mateiz · 2014-04-09T05:42:21Z

Actually, one more thing: can you set copy=True in astype (or just not set it)? I'm not sure that it preserves the right byte order and such, and I know that this works with copy=True. Also, we don't want to mess with the user's input data.

mateiz · 2014-04-09T05:42:43Z

Sorry meant true, not false.

techaddict · 2014-04-09T05:47:43Z

@mateiz You mean
v = v.astype(float64)
or v = v.astype(float64, copy=True)
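
For reference, the two spellings behave the same way, because `copy=True` is the default for `ndarray.astype`. The sketch below illustrates this with made-up arrays (the variable names are illustrative only), and shows the `copy=False` case that the review asks to avoid.

```python
import numpy as np

ints = np.array([1, 2, 3], dtype=np.int32)
as_float = ints.astype(np.float64)               # copy=True is the default
assert as_float.dtype == np.float64
assert ints.dtype == np.int32                    # the caller's array is unchanged

already = np.array([1.0, 2.0], dtype=np.float64)
copied = already.astype(np.float64)              # still a fresh copy
shared = already.astype(np.float64, copy=False)  # may hand back the input itself
copied[0] = 99.0
assert already[0] == 1.0                         # the copy does not alias the input
assert shared is already                         # no copy was made here
```

With `copy=False`, a later in-place change to the converted vector could leak back into the user's array whenever the input is already float64, which is the concern raised above.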

techaddict · 2014-04-10T06:20:36Z

@mateiz Is there any other problem?

mateiz · 2014-04-10T16:29:19Z

This seems to be failing tests unfortunately -- click through to the Jenkins log. This is what it says:

=========================================================================
Running PySpark tests
=========================================================================
**********************************************************************
File "pyspark/mllib/_common.py", line 76, in __main__._deserialize_double_vector
Failed example:
    array_equal(x, _deserialize_double_vector(_serialize_double_vector(x)))
Exception raised:
    Traceback (most recent call last):
      File "/usr/local/lib/python2.7/doctest.py", line 1289, in __run
        compileflags, 1) in test.globs
      File "<doctest __main__._deserialize_double_vector[1]>", line 1, in <module>
        array_equal(x, _deserialize_double_vector(_serialize_double_vector(x)))
      File "pyspark/mllib/_common.py", line 55, in _serialize_double_vector
        if numpy.issubdtype(v.dtype, numpy.complex):
    NameError: global name 'numpy' is not defined
**********************************************************************
   1 of   2 in __main__._deserialize_double_vector
***Test Failed*** 1 failures.
Had test failures; see logs.
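
The NameError points at a missing module-level name rather than at NumPy itself. A likely cause, and this is an assumption since the import lines of `_common.py` are not shown here, is that the file imports individual names with `from numpy import ...`, which does not bind the name `numpy`. The sketch below reproduces that effect in an isolated namespace.

```python
# Run the import in a fresh namespace so only the from-import is visible.
namespace = {}
exec("from numpy import array, array_equal", namespace)

print("array" in namespace)   # the imported names are bound
print("numpy" in namespace)   # the module name itself is not

try:
    # Same shape as the failing line: it refers to the bare name `numpy`.
    eval("numpy.issubdtype(array([1]).dtype, numpy.floating)", namespace)
    raised = False
except NameError as exc:
    raised = True
    print("NameError:", exc)
```

Either adding `import numpy` or importing the needed names directly would resolve a failure of this kind.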

mateiz · 2014-04-10T16:30:39Z

This is why it would be good to add a test, actually. For instance, you can add a test with _deserialize_double_vector(_serialize_double_vector(array([1,2,3]))) and check that it returns array([1.0,2.0,3.0]).
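
A sketch of such a test. The real `_serialize_double_vector` and `_deserialize_double_vector` live in `pyspark/mllib/_common.py` and are not reproduced here. The two helpers below are hypothetical stand-ins that only mimic the convert-then-round-trip behaviour using raw float64 bytes, so that the example runs with NumPy alone.

```python
import numpy as np

def serialize_vector(v):
    """Hypothetical stand-in: convert to float64, then dump raw bytes."""
    if np.issubdtype(v.dtype, np.complexfloating):
        raise TypeError("complex arrays cannot be converted to float64")
    return v.astype(np.float64).tobytes()

def deserialize_vector(data):
    """Hypothetical stand-in: rebuild a float64 array from raw bytes."""
    return np.frombuffer(data, dtype=np.float64)

x = np.array([1, 2, 3])  # integer dtype, not float64
result = deserialize_vector(serialize_vector(x))
assert result.dtype == np.float64
assert np.array_equal(result, np.array([1.0, 2.0, 3.0]))
```

Because the test feeds in an integer array, it exercises the conversion path that the doctest in the Jenkins log tripped over.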


techaddict · 2014-04-10T17:19:22Z

@mateiz Done, working now 👍

mateiz · 2014-04-10T18:18:22Z

Thanks Sandeep. Merged into master and 1.0.

AmplabJenkins · 2014-04-10T18:38:11Z

Can one of the admins verify this patch?

…instead of complaining Author: Sandeep <sandeep@techaddict.me> Closes #356 from techaddict/1428 and squashes the following commits: 3bdf5f6 [Sandeep] SPARK-1428: MLlib should convert non-float64 NumPy arrays to float64 instead of complaining (cherry picked from commit 3bd3129) Signed-off-by: Matei Zaharia <matei@databricks.com>

…instead of complaining Author: Sandeep <sandeep@techaddict.me> Closes apache#356 from techaddict/1428 and squashes the following commits: 3bdf5f6 [Sandeep] SPARK-1428: MLlib should convert non-float64 NumPy arrays to float64 instead of complaining


SPARK-1428: MLlib should convert non-float64 NumPy arrays to float64 …

3bdf5f6

…instead of complaining

techaddict closed this Apr 10, 2014

techaddict reopened this Apr 10, 2014

asfgit closed this in 3bd3129 Apr 10, 2014

techaddict deleted the 1428 branch July 3, 2016 04:59

tangzhankun pushed a commit to tangzhankun/spark that referenced this pull request Jul 21, 2017

Config for hard cpu limit on pods; default unlimited (apache#356)

8b3248f

erikerlandson pushed a commit to erikerlandson/spark that referenced this pull request Jul 28, 2017

Config for hard cpu limit on pods; default unlimited (apache#356)

cdf6c36

mccheah pushed a commit to mccheah/spark that referenced this pull request Oct 3, 2018

Don't consider circle-test-results file if empty (apache#356)

b594dfb

bzhaoopenstack pushed a commit to bzhaoopenstack/spark that referenced this pull request Sep 11, 2019

Merge pull request apache#356 from theopenlab/docker-machine-legacy-e…

2a9908c

…ndpoint Enable legacy endpoint format for docker-machine
