Sunday, June 1, 2014

Bytearray status update

I'm almost done with fixing the complexity issues with bytearrays.

One of the challenges in doing this was that bytearrays support most of the same operations that strings do in Python, but they also accept arbitrary objects that support the buffer protocol as the second (or third) operand.  For example:

>>> 'abc' + memoryview('def')
Traceback (most recent call last):
  File "", line 1, in 
TypeError: cannot concatenate 'str' and 'memoryview' objects
>>> bytearray('abc') + memoryview('def')
bytearray(b'abcdef')
>>> from string import maketrans
>>> t = memoryview(maketrans('abc', '123'))
>>> 'abc'.translate(t)
Traceback (most recent call last):
  File "", line 1, in 
TypeError: expected a character buffer object
>>> bytearray('abc').translate(t)
bytearray(b'123')

To deal with this, I added support for __getitem__ and __len__ (and a couple of other magic methods) to RPython, and did some refactoring so that the code that handles string operands can also handle buffer operands. There are still a few places where the parameters of bytearray methods get copied to strings, but the important thing is that the bytearray itself is no longer copied unnecessarily. I'll clean up those remaining places over the next day or so.
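
To give a flavor of the idea, here is a simplified sketch in plain Python (not the actual RPython/PyPy source; the class and function names are made up for illustration). Once operands are wrapped in adapters that expose __len__ and __getitem__, a single implementation can walk a string operand and a buffer operand through the same interface, without copying either into a new string:

class StringSource(object):
    """Hypothetical adapter presenting a str through __len__/__getitem__."""
    def __init__(self, value):
        self.value = value
    def __len__(self):
        return len(self.value)
    def __getitem__(self, index):
        return self.value[index]

class BufferSource(object):
    """Hypothetical adapter over a buffer object such as a memoryview."""
    def __init__(self, buf):
        self.buf = buf
    def __len__(self):
        return len(self.buf)
    def __getitem__(self, index):
        return self.buf[index]

def contains_char(source, ch):
    # One loop serves both adapters: it relies only on __len__ and
    # __getitem__, so the underlying buffer is never copied.
    for i in range(len(source)):
        if source[i] == ch:
            return True
    return False

print(contains_char(StringSource('abc'), 'b'))             # True
print(contains_char(BufferSource(memoryview('abc')), 'b')) # True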

Here are some examples that demonstrate the performance improvements I have been working on:

% pypy --version
Python 2.7.6 (394146e9bb67, May 08 2014, 16:45:59)
[PyPy 2.3.0 with GCC 4.2.1 Compatible Apple LLVM 5.0 (clang-500.2.79)]
% ./pypy-c --version
Python 2.7.6 (d91c375d8356+, Jun 01 2014, 19:59:07)
[PyPy 2.4.0-alpha0 with GCC 4.2.1 Compatible Apple LLVM 5.1 (clang-503.0.40)]

% pypy -m timeit -s "data = bytearray('1' * 1000000)" "data + data"
100 loops, best of 3: 10.5 msec per loop
% ./pypy-c -m timeit -s "data = bytearray('1' * 1000000)" "data + data"
1000 loops, best of 3: 669 usec per loop
% python -m timeit -s "data = bytearray('1' * 1000000)" "data + data"
10000 loops, best of 3: 102 usec per loop

% pypy -m timeit -s "data = bytearray('1' * 1000000)" "data[0:10]"
1000 loops, best of 3: 827 usec per loop
% ./pypy-c -m timeit -s "data = bytearray('1' * 1000000)" "data[0:10]"
100000000 loops, best of 3: 0.00963 usec per loop
% python -m timeit -s "data = bytearray('1' * 1000000)" "data[0:10]"
10000000 loops, best of 3: 0.13 usec per loop

% pypy -m timeit -s "data = bytearray(''.join(str(i) for i in range(1000)))" "data.index('32')"
1000000 loops, best of 3: 1.57 usec per loop
% ./pypy-c -m timeit -s "data = bytearray(''.join(str(i) for i in range(1000)))" "data.index('32')"
10000000 loops, best of 3: 0.183 usec per loop
% python -m timeit -s "data = bytearray(''.join(str(i) for i in range(1000)))" "data.index('32')"
1000000 loops, best of 3: 0.216 usec per loop

As you can see, the improvement (in these admittedly very artificial examples) is dramatic.  Concatenation is still slower than in CPython; I suspect that is because CPython can work with the underlying byte array directly. I doubt the differences will be nearly so pronounced for real-world workloads.
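
As a side note for anyone chasing slicing costs at the Python level: this is standard bytearray/memoryview behavior rather than anything specific to my branch, but it shows the copy that the slicing benchmark above is paying for. A bytearray slice copies its bytes, while a memoryview slice shares the underlying storage:

data = bytearray('1' * 100)

s = data[0:10]              # a bytearray slice copies the bytes
v = memoryview(data)[0:10]  # a memoryview slice shares the storage

data[0] = ord('9')
print(s)             # 1111111111  -- the copy is unchanged
print(bytearray(v))  # 9111111111  -- the view sees the change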

Tomorrow, I'll look at getting my work merged.  Early next week I'll get started on the unicode implementation.  More about that next time.
