Discussion:
[f2py] f2py speed
Gabor Kalman
2013-02-23 17:29:02 UTC
I’m a relatively new user of f2py.
To test what performance advantage I can get with f2py, I have created a (somewhat artificially) simple script.
First I will describe it, and then I will show the source code.

1. Description:

Take 3 integer constants and multiply them together inside a double loop, each level running 10,000 times (i.e. 10**8 computations).
With a PYTHON ONLY script, this took 28 sec. (Windows 7 on a Toshiba C655D, Python 2.7).
When I “buried” the computation in a GFORTRAN subroutine, it required only 0.015 sec.

I can’t find anything wrong with my source code. So are these results plausible?

2. Codes:

module: py_v_p2fy.py

import time
import lib1
#====================== PYTHON ONLY =======================
def main1(z):
    for i in range(10000):
        for j in range(10000):
            zz=2*z

start_time = time.clock()
z=2*3*4
main1(z)
print "done with PYTHON ONLY"
print time.clock() - start_time, "seconds"
print "-----"
#=================== PYTHON WITH F2PY ====================
def main2(x1,x2,x3):
    lib1.f1(x1,x2,x3)

start_time = time.clock()
x1=2
x2=3
x3=4
main2(x1,x2,x3)
print "done with PYTHON WITH F2PY"
print time.clock() - start_time, "seconds"

#========
module: lib1.f90

subroutine f1(x1,x2,x3)
  integer,intent(IN) :: x1,x2,x3

  integer :: z,zz,i,j

  do i=1,10000
    do j=1,10000
      z = x1*x2*x3
      zz=2*z
    end do
  end do
end subroutine f1

#===================
module (batch-file): run.bat


python C:\Python27\Scripts\f2py.py ^
--build-dir .\tmp ^
--fcompiler=gnu95 ^
-c lib1.f90 -m lib1
pause
Pearu Peterson
2013-02-23 19:39:24 UTC
Post by Gabor Kalman
I’m a relatively new user of f2py.
To test what performance advantage I can get with f2py, I have created a
(somewhat artificially) simple script.
First I will describe it, and then I will show the source code.
Take 3 integer constants and multiply them together inside a double loop,
each level running 10,000 times (i.e. 10**8 computations).
With a PYTHON ONLY script, this took 28 sec. (Windows 7 on a Toshiba
C655D, Python 2.7).
When I “buried” the computation in a GFORTRAN subroutine, it required only
0.015 sec.
I can’t find anything wrong with my source code. So are these results
plausible?

Yes, the results are expected.

First, this follows from the simple fact that Python is an interpreted
language (operation types are resolved at run time), while Fortran is a
compiled language (operation types are resolved at compile time).

Second, your benchmark is simply unfair to Python and is not representative
for assessing the performance advantage of f2py. You should use your
application code, written in Python or in f2py-wrapped Fortran, to get the
most appropriate assessment.

In general, when you want to speed up your Python code, first determine
which parts of it take the most runtime, write only those parts in Fortran,
and call them via wrapper generators such as f2py.
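
As a rough illustration of the "find the slow parts first" step, here is a minimal sketch using the standard-library cProfile and pstats modules. The functions cheap_setup, hot_loop and application are hypothetical stand-ins for real application code, not anything from the original post.

```python
import cProfile
import io
import pstats


def cheap_setup(n):
    # Hypothetical inexpensive step: build a list of n integers.
    return list(range(n))


def hot_loop(values):
    # Hypothetical hotspot: a pure-Python double loop, the kind of code
    # that might be worth rewriting in Fortran and wrapping with f2py.
    total = 0
    for a in values:
        for b in values:
            total += a * b
    return total


def application(n=300):
    values = cheap_setup(n)
    return hot_loop(values)


def profile_report(func, *args):
    # Run func under the profiler and return (result, report text).
    # The report is sorted by cumulative time, most expensive calls first.
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(5)
    return result, stream.getvalue()


if __name__ == "__main__":
    result, report = profile_report(application)
    print(report)
```

In the printed report, the function that dominates the cumulative time (here hot_loop) is the candidate for moving into Fortran; the rest can usually stay in Python.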

Some illustrative examples of how to speed up Python follow below.

Note that even a simple Python loop already takes some time because of
object creation:

def main10(z):
    for i in range(10000):
        for j in range(10000):
            pass

%time main10(0)
CPU times: user 2.04 s, sys: 0.00 s, total: 2.04 s
Wall time: 2.04 s

def main10x(z):
    for i in xrange(10000):
        for j in xrange(10000):
            pass

%time main10x(0)
CPU times: user 1.39 s, sys: 0.00 s, total: 1.39 s
Wall time: 1.38 s

r = range(10000)  # assumed: the definition of r did not survive in the archive
def main10c(z):
    for i in r:
        for j in r:
            pass

%time main10c(0)
CPU times: user 1.05 s, sys: 0.00 s, total: 1.05 s
Wall time: 1.05 s

def main10o(z):
    for i in r:
        for j in r:
            zz = 2*z

%time main10o(0)
CPU times: user 3.46 s, sys: 0.00 s, total: 3.46 s
Wall time: 3.45 s

%time main10o(0.0)
CPU times: user 6.16 s, sys: 0.00 s, total: 6.16 s
Wall time: 6.17 s
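
The measurements above rely on IPython's %time and Python 2's xrange. A minimal sketch of the same comparison with the standard-library timeit module, for Python 3 (where range is already lazy, like xrange), might look as follows. The reduced loop size N and the function names are assumptions made to keep the example quick; timings will differ by machine.

```python
import timeit

N = 300  # hypothetical reduced size; the thread uses 10000


def empty_loop(z):
    # Creates a new range object for every pass of the outer loop.
    for i in range(N):
        for j in range(N):
            pass


def cached_range_loop(z, r=range(N)):
    # Reuses a single range object for both loop levels.
    for i in r:
        for j in r:
            pass


def multiply_loop(z, r=range(N)):
    # Same loop with one multiplication per iteration.
    for i in r:
        for j in r:
            zz = 2 * z


def best_time(func, arg, repeat=3):
    # Best-of-repeat wall time for a single call, in seconds.
    return min(timeit.repeat(lambda: func(arg), number=1, repeat=repeat))


if __name__ == "__main__":
    cases = [(empty_loop, 0), (cached_range_loop, 0),
             (multiply_loop, 0), (multiply_loop, 0.0)]
    for func, arg in cases:
        print(func.__name__, repr(arg), best_time(func, arg))
```

Taking the minimum over a few repeats reduces noise from other processes, which matters when the differences between variants are small.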



Pearu
Ethan Gutmann
2013-02-23 20:25:28 UTC
I'm not sure if it is happening in this case, but because you don't use the
results in the Fortran example, it is entirely possible that the compiler
is optimizing out the entire content of that subroutine.
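
A back-of-the-envelope check is consistent with this suspicion. The sketch below divides the reported timings by the 10**8 loop iterations: roughly 280 ns per iteration for pure Python and 0.15 ns for the Fortran routine. The 1.5 GHz clock rate is an assumption (the thread does not state the CPU speed); under it, the Fortran figure works out to well under one clock cycle per iteration, which is hard to reconcile with two multiplications actually being executed each time.

```python
ITERATIONS = 10000 * 10000   # loop trip count from the original post
PYTHON_SECONDS = 28.0        # reported pure-Python time
FORTRAN_SECONDS = 0.015      # reported f2py/gfortran time
ASSUMED_CLOCK_HZ = 1.5e9     # assumption: CPU speed is not given in the thread


def per_iteration_ns(total_seconds, iterations=ITERATIONS):
    # Average time spent per loop iteration, in nanoseconds.
    return total_seconds / iterations * 1e9


def cycles_per_iteration(total_seconds, clock_hz=ASSUMED_CLOCK_HZ,
                         iterations=ITERATIONS):
    # Average number of clock cycles per iteration at the assumed clock rate.
    return total_seconds * clock_hz / iterations


if __name__ == "__main__":
    print("Python  ns/iteration:", per_iteration_ns(PYTHON_SECONDS))
    print("Fortran ns/iteration:", per_iteration_ns(FORTRAN_SECONDS))
    print("Fortran cycles/iteration:", cycles_per_iteration(FORTRAN_SECONDS))
```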

Benchmarks can be tricky to write for this sort of reason, so you really
should try benchmarking your actual code if possible, or looking at other
benchmarks around the web.

Furthermore, if I were going to do something like that in Python, I would
use numpy instead. For something this simple, numpy is likely to be within
a factor of 2-10 of pure Fortran code.
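
To make the numpy suggestion concrete, here is a minimal sketch, assuming numpy is installed. The original benchmark discards its result, so this version sums the products to give both variants an observable output; the function names and the summation are illustrative choices, not part of the original post.

```python
import numpy as np


def loop_version(z, n):
    # Pure-Python reference: n*n scalar multiplications, results summed.
    total = 0
    for i in range(n):
        for j in range(n):
            total += 2 * z
    return total


def numpy_version(z, n):
    # Same arithmetic as one vectorized operation on an n-by-n array.
    grid = np.full((n, n), z, dtype=np.int64)
    return int((2 * grid).sum())


if __name__ == "__main__":
    z = 2 * 3 * 4
    print(loop_version(z, 200), numpy_version(z, 200))
```

Note that at the thread's n = 10000 the int64 array alone takes about 800 MB, so a smaller n or chunked processing may be needed on a modest machine.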

Ethan

