[CMake] Re: [Swig-user] Performance bug in pycontainers.swg

Fri May 18 18:29:26 EDT 2007

Josh Cherry wrote:
> 1. The slowness depends (in part, at least) on the fact that there is a 
> cont.end() call for every iteration.  If the code is changed to look like 
> this:
> 
>      cont_end = cont.end()
>      while it != cont.end():
>          it.next()
> 
> it is much faster.  Of course, calling end() *should* be constant time.
> 

In C++, maybe, where the compiler may be able to inline the call
completely.  When you bind it to a scripting language, end() is always a
function call.  This slowdown is unavoidable.
Even in C++, the recommendation is NOT to call end() repeatedly if you
can avoid it, as inlining is just a hint and the compiler could fail to
do it (for whatever reason).

> Ondrej Marsalek wrote:
>
>         n   pyvector     pylist    pydeque      pyset
>      1024       1.87      25.19       2.40      19.79

Those numbers don't look that right, but they don't look wrong either to
me.

I should also point out that this:

>   cont = pystl.pylist(xrange(n))

will indeed do a dummy check on each element of the range when *FILLING*
the container, not when iterating thru it.  Again, this check cannot be
avoided.
Take this out of the timing and you'll see the numbers perform much better.

Here's for example the test fixed (for ruby):

--
%module li_std_speed

%include <std_list.i>
%include <std_vector.i>
%include <std_deque.i>
%include <std_set.i>

%template(RbList)   std::list<swig::GC_VALUE>;
%template(RbVector) std::vector<swig::GC_VALUE>;
%template(RbDeque)  std::deque<swig::GC_VALUE>;
%template(RbSet)    std::set<swig::GC_VALUE>;

--

require 'li_std_speed'
include Li_std_speed

require 'benchmark'

def iterate(c)
  i = c.begin
  e = c.end
  while i != e
    i.next
  end
end

n = 16384

elems = (0..n).to_a

puts '--- build containers (takes long)'

all = {}
[RbVector, RbList, RbDeque, RbSet].each do |klass|
  puts "build #{klass}"
  all[klass] = klass.new( elems )
end

puts '--- iterate'

GC.disable # to avoid measuring any GC slowdown
all.each do |klass, c|
  Benchmark.benchmark('') do |x|
    name = klass.to_s.sub(/.*::/,'') + ' '*5
    name = name[0,10]
    x.report("#{name} #{n}") { iterate(c) }
  end
end

Iteration speed is linear on all containers:

--- build containers (takes long)
build Li_std_speed::RbVector
build Li_std_speed::RbList
build Li_std_speed::RbDeque
build Li_std_speed::RbSet
--- iterate
RbList     16384  0.083333   0.000000   0.083333 (  0.052038)
RbDeque    16384  0.116667   0.000000   0.116667 (  0.061717)
RbVector   16384  0.100000   0.000000   0.100000 (  0.061583)
RbSet      16384  0.100000   0.000000   0.100000 (  0.063578)

If, however, assignment is taking into account (this is done with the
container's assign if available), each container behaves quite
differently.

require 'li_std_speed'
include Li_std_speed

require 'benchmark'

n = 16384
@elems = (0..n).to_a

def create_and_iterate(klass)
  c = klass.new( @elems )
  i = c.begin
  e = c.end
  while i != e
    i.next
  end
end

GC.disable # to avoid measuring any GC slowdown
[RbVector, RbList, RbDeque, RbSet].each do |klass|
  Benchmark.benchmark('') do |x|
    name = klass.to_s.sub(/.*::/,'') + ' '*5
    name = name[0,10]
    x.report("#{name} #{n}") { create_and_iterate(klass) }
  end
end
---
RbVector   16384  4.183333   0.033333   4.216667 (  2.550497)
RbList     16384 17.566667   0.016667  17.583333 ( 10.562868)
RbDeque    16384  3.533333   0.000000   3.533333 (  2.118050)
RbSet      16384 17.066667   0.016667  17.083333 ( 10.272435)

*NOTE:  std::set is slow as expected.  std::deque and std::vector are
expected to be relatively fast.
The performance of std::list is weird, as you expect insertions on a
list to be constant time.  Problem is, the list copy constructor is
*not* constant, while a std::vector/deque is (usually just a memcpy).
Currently swig is kind of dumb in how copy constructors are handled in
the STL, as the container is actually copied twice when the element
passed in is not another STL container, but the language's native
list/array container.

Summary: no bug.

-- 
Gonzalo Garramuño
ggarra at advancedsl.com.ar

AMD4400 - ASUS48N-E
GeForce7300GT
Kubuntu Edgy