Anatomy of an array introduction. Obvious way is the fastest. #29

ichernob · 2017-01-02T06:43:43Z

Hello,
I've tried this code:

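import numpy as np
# Note: timeit below is the helper function from the book, not the standard-library module.
Z = np.ones(4 * 1000000, np.float32)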
timeit("Z[...] = 0", globals())
timeit("Z.view(np.float16)[...] = 0", globals())
timeit("Z.view(np.int16)[...] = 0", globals())
timeit("Z.view(np.int32)[...] = 0", globals())
timeit("Z.view(np.float32)[...] = 0", globals())
timeit("Z.view(np.int64)[...] = 0", globals())
timeit("Z.view(np.float64)[...] = 0", globals())
timeit("Z.view(np.complex128)[...] = 0", globals())
timeit("Z.view(np.int8)[...] = 0", globals())

And got the following results:
100 loops, best of 3: 905 usec per loop
100 loops, best of 3: 918 usec per loop
100 loops, best of 3: 925 usec per loop
100 loops, best of 3: 915 usec per loop
100 loops, best of 3: 910 usec per loop
100 loops, best of 3: 912 usec per loop
100 loops, best of 3: 902 usec per loop
100 loops, best of 3: 1.9 msec per loop
100 loops, best of 3: 1.91 msec per loop

I don't understand why these results are the opposite of what the book reports. Could you kindly clarify?
Thanks in advance.

P.S. I'm using Python 3.5.2 (64-bit) with Anaconda.
The sysinfo() output:
Date: 01/02/17
Python: 3.5.2
Numpy: 1.11.1
Scipy: 0.17.1
Matplotlib: 1.5.1


rougier · 2017-01-02T07:17:36Z

Thanks for the report. Your results are surprising. Could you also test using IPython and the %timeit magic (just to be sure I did not mess up the timeit function)?

Note: I edited your post because the listing was not displayed properly.

ichernob · 2017-01-02T07:42:53Z

Thanks for the answer. I will try a little later and post the results here.

ichernob · 2017-01-02T17:29:52Z

Well, unfortunately, right now I'm unable to use numpy via IronPython (I've never used it before and really can't work out how to get numpy without pip). But I've run the same code on another computer and got different results:
100 loops, best of 3: 1.21 msec per loop
100 loops, best of 3: 1.21 msec per loop
100 loops, best of 3: 1.26 msec per loop
100 loops, best of 3: 1.22 msec per loop
100 loops, best of 3: 1.21 msec per loop
10 loops, best of 3: 4.3 msec per loop
10 loops, best of 3: 4.22 msec per loop
100 loops, best of 3: 2.21 msec per loop
100 loops, best of 3: 1.01 msec per loop
Also, the results from PTVS show a different trend:

claws · 2017-01-06T06:46:11Z

@ruichernob, I think you are confusing IronPython with IPython. IPython is what you want, not IronPython. You can install IPython into your existing Python using pip:

$ pip install ipython

godaygo · 2018-03-21T14:42:01Z

Hi! To start, thank you for the great tutorial!
I am experiencing the same timing issue as the OP. I've measured the following snippets with your timeit function (I've also tested with %timeit, and the results are very close):

timeit("Z[...] = 0", globals())
timeit("Z.view(np.float64)[...] = 0", globals())
timeit("Z.view(np.float32)[...] = 0", globals())
timeit("Z.view(np.float16)[...] = 0", globals())
timeit("Z.view(np.complex)[...] = 0", globals())
timeit("Z.view(np.int64)[...] = 0", globals())
timeit("Z.view(np.int32)[...] = 0", globals())
timeit("Z.view(np.int16)[...] = 0", globals())
timeit("Z.view(np.int8)[...] = 0", globals())
timeit("Z.fill(0)", globals())

I've measured on two computers, with:

Python 3.6.4
numpy 1.14.2

The specs of the first computer:
Windows 10
CPU: Intel Xeon E5-1650v4 3.60GHz
RAM: 128GB DDR4-2400
Times:

100 loops, best of 3: 750 usec per loop
100 loops, best of 3: 758 usec per loop
100 loops, best of 3: 757 usec per loop
100 loops, best of 3: 760 usec per loop
100 loops, best of 3: 1.06 msec per loop
100 loops, best of 3: 758 usec per loop
100 loops, best of 3: 757 usec per loop
100 loops, best of 3: 760 usec per loop
100 loops, best of 3: 758 usec per loop
100 loops, best of 3: 747 usec per loop

The specs of the second computer:
Windows 7
CPU: Intel Pentium P6100 2.00GHz
RAM: 4GB DDR3-1333
Times:

100 loops, best of 3: 2.59 msec per loop
10 loops, best of 3: 3.38 msec per loop
10 loops, best of 3: 2.59 msec per loop
100 loops, best of 3: 2.62 msec per loop
100 loops, best of 3: 3.26 msec per loop
100 loops, best of 3: 2.69 msec per loop
100 loops, best of 3: 2.62 msec per loop
100 loops, best of 3: 2.63 msec per loop
10 loops, best of 3: 3.32 msec per loop
100 loops, best of 3: 2.55 msec per loop

As you can see, the results are somewhat consistent with each other, but do not match your observations.
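
For anyone who wants to reproduce these measurements without the book's timeit helper, a minimal sketch using the standard-library timeit module might look like the following. The helper name `bench` and the choice of four statements are assumptions made for illustration; they are not taken from the book. The loop count and "best of 3" mirror the output format shown above.

```python
import timeit
import numpy as np

# Same array as in the original post: 4 million float32 values (16 MB).
Z = np.ones(4 * 1000000, np.float32)

def bench(stmt, number=100, repeat=3):
    """Best time per loop, in seconds, for stmt (hypothetical helper, not the book's timeit)."""
    timer = timeit.Timer(stmt, globals={"Z": Z, "np": np})
    return min(timer.repeat(repeat=repeat, number=number)) / number

# A subset of the statements measured in this thread.
statements = [
    "Z[...] = 0",
    "Z.view(np.float64)[...] = 0",
    "Z.view(np.int8)[...] = 0",
    "Z.fill(0)",
]

results = {stmt: bench(stmt) for stmt in statements}
for stmt, seconds in results.items():
    print(f"{stmt:32s} {seconds * 1e6:8.1f} usec per loop")
```

The absolute numbers will depend on the machine, the numpy build, and the memory bandwidth, so only the relative ordering on a single machine is meaningful.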

rougier · 2018-03-22T06:39:36Z

Given the consistent output from you and @ruichernob, it looks like I might be wrong. I don't remember how I came to this conclusion. I'm pretty sure I got the results written in the book, but I might be the only one in the end 😄. Would you mind proposing a PR to fix what's written in the book?

godaygo · 2018-03-22T08:26:57Z

It would be great if you had the opportunity to recheck these results on your computer with the current version of numpy. After all, anything is possible :) And of course the results posted in the book may well have been accurate at the time.

Since the basic idea of this section is that the obvious method is not optimal, merely changing the timings would make this section meaningless. For me, the only obvious way to fill the entire array with some value is the .fill method of ndarray, since this interface was clearly introduced for this purpose.
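
To make the comparison concrete, here is a small sketch (an illustration, not code from the book) showing that the view trick and .fill write to the same buffer, and that the view trick only generalizes for the value 0, whose byte pattern is all zeros for every dtype used here. The variable names are hypothetical.

```python
import numpy as np

Z = np.ones(8, np.float32)

# A view reinterprets the same 32 bytes; no data is copied.
V = Z.view(np.int8)
shares = np.shares_memory(Z, V)
n_items = V.size              # 8 float32 items seen as 32 int8 items

# Zeroing through the view zeroes the original array.
V[...] = 0
zero_via_view = not Z.any()

# The obvious way gives the same result.
Z[...] = 1
Z.fill(0)
zero_via_fill = not Z.any()

# For a non-zero value the trick breaks: four bytes of 0x01 are not float32 1.0.
V[...] = 1
view_gives_one = bool(Z[0] == 1.0)

Z.fill(1)
fill_gives_one = bool(Z[0] == 1.0)
```

So .fill works for any value, while writing through a reinterpreted view is only equivalent when the target bit pattern happens to match, as it does for zero.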

I've tried to come up with a similarly simple example where such tricks beat the obvious way, but unfortunately I have not found one yet :) In addition, "There should be one-- and preferably only one --obvious way to do it." Having said this, if the fresh results you recheck agree with ours, I would just drop this example so as not to mislead readers. I apologize that I cannot offer a replacement example.

rougier · 2018-03-23T05:46:07Z

On OSX 10.13.3, Python 3.6.4, numpy 1.14.2, I got:

>>> Z.view(np.float16)[...] = 0
100 loops, best of 3: 2.85 msec per loop
>>> Z.view(np.int16)[...] = 0
100 loops, best of 3: 2.87 msec per loop
>>> Z.view(np.int32)[...] = 0
100 loops, best of 3: 1.46 msec per loop
>>> Z.view(np.float32)[...] = 0
100 loops, best of 3: 1.58 msec per loop
>>> Z.view(np.int64)[...] = 0
100 loops, best of 3: 1 msec per loop
>>> Z.view(np.float64)[...] = 0
100 loops, best of 3: 1.01 msec per loop
>>> Z.view(np.complex128)[...] = 0
100 loops, best of 3: 918 usec per loop
>>> Z.view(np.int8)[...] = 0
100 loops, best of 3: 614 usec per loop

godaygo · 2018-03-23T07:50:34Z

Thank you, interesting results! Could you also time the array.fill method? And if you do not mind, may I ask a question about this on SO?

rougier · 2018-03-28T12:44:02Z

More or less the same:

>>> Z.view(np.float16).fill(0)
100 loops, best of 3: 2.82 msec per loop
>>> Z.view(np.int16).fill(0)
100 loops, best of 3: 2.82 msec per loop
>>> Z.view(np.int32).fill(0)
100 loops, best of 3: 1.48 msec per loop
>>> Z.view(np.float32).fill(0)
100 loops, best of 3: 1.52 msec per loop
>>> Z.view(np.int64).fill(0)
100 loops, best of 3: 1.05 msec per loop
>>> Z.view(np.float64).fill(0)
100 loops, best of 3: 1.04 msec per loop
>>> Z.view(np.complex128).fill(0)
100 loops, best of 3: 930 usec per loop
>>> Z.view(np.int8).fill(0)
100 loops, best of 3: 601 usec per loop

rougier closed this as completed Feb 28, 2018

rougier reopened this Mar 28, 2018
