Issues with the built-in function range(), xrange() in Python 2.x/3.x

TL;DR

Jump to second part, the first part illustrate how did I found the problem, which is not interesting at all.

How did I find this problem

Recently, my roommate is practicing for interview with Leetcode, he keeps coming to ask for my help with some problems. The problems in Leetcode are really easy (especially when compared with problems in OJs oriented for ACM), but if you keep sensitive, it could still help you find some corner in the language that you would never notice otherwise.

Here is my log about investigating difference of range() between python2 and python3 - and their different against xrange().

From my point of view, 248. Strobogrammatic Number III has two different ways to solve - one is to use DFS to generate all numbers(which won’t exceed $5^10=9,765,625$ for range within long, enough even for slow language like Python), another way is to use some mathematical method, hard to code but much quicker.

But who cares about time complexity, especially for someone else? “Just use the Brute Force, Luke!” When trying to solve with some brute force DFS, I wrote the following code:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
class Solution(object):
def __init__(self):
self.result = 0
def strobogrammaticInRange(self, low, high):
self.result = 0
for i in list(range(len(low), len(high) + 1)):
self.dfs(low, high, i, "")
return self.result
def dfs(self, low, high, n, str):
if n == 0 and int(str) in range(int(low),int(high)+1):
self.result += 1
return
if n % 2 == 1:
for i in ["0", "1", "8"]:
self.dfs(low, high, n - 1, i)
if n == 0 or n % 2 == 1:
return
if n == 2:
for i in [("1", "1"), ("6", "9"), ("8", "8"), ("9", "6")]:
self.dfs(low, high, n - 2, i[0]+str+i[1])
else:
for i in [("0", "0"),("1", "1"), ("6", "9"), ("8", "8"), ("9", "6")]:
self.dfs(low, high, n - 2, i[0]+str+i[1])

Not surprisingly, it worked! However, it encountered with a MLE on Leetcode, but why?

Ok, it’s time to do some profile on memory, use the following code:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
from memory_profiler import profile
class Solution(object):
def __init__(self):
self.result = 0
def strobogrammaticInRange(self, low, high):
self.result = 0
for i in list(range(len(low), len(high) + 1)):
self.dfs(low, high, i, "")
return self.result
@profile
def dfs(self, low, high, n, str):
if n == 0 and int(str) in range(int(low),int(high)+1):
self.result += 1
return
if n % 2 == 1:
for i in ["0", "1", "8"]:
self.dfs(low, high, n - 1, i)
if n == 0 or n % 2 == 1:
return
if n == 2:
for i in [("1", "1"), ("6", "9"), ("8", "8"), ("9", "6")]:
self.dfs(low, high, n - 2, i[0]+str+i[1])
else:
for i in [("0", "0"),("1", "1"), ("6", "9"), ("8", "8"), ("9", "6")]:
self.dfs(low, high, n - 2, i[0]+str+i[1])
if __name__=="__main__":
s=Solution()
print(s.strobogrammaticInrange("10000001","20000000"))

python3 -m memory_profiler 248.py and there you go!

profile in python3

Here is the problem, when I was investigating this problem I used python3.5.2, and the result is like the following, everything goes as I expected, only 12.6MB is used.

Then maybe it’s caused by the different between my development environment and Leetcode (which is now using 2.7.12) ?

Try python -m memory_profiler 248.py, and here comes the result:

Haha, the evil is hidden in the line 18, it must be the int(str) in range(int(low),int(high)+1)

Not that surprisingly, the range() in python3 is more like xrange() in python2. Just modify the range() to xrange() and there you go!

Wait a second, a Time Limit Exceeded? Another profile on the code shows the problem still lies in xrange(). Wait, it’s just a syntactic sugar to enhance the readability of my code! It just an easy way to write int(str)>=int(low) and int(str)<=int(high). What makes the xrange() consumes much more time? Shouldn’t it be a O(1) operation? Why do my code, which is using range() runs in Python3 like lightning while the xrange() in Python2 is so slow that it even encountered with a Time Limit Exceeded?

Issue solving

Ok, let’s use some code to simplify the problem.

In short, the problem is like the following:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
[cnnblike@localhost Leetcode]$ python
Python 2.7.12 (default, Sep 29 2016, 12:52:02)
[GCC 6.2.1 20160916 (Red Hat 6.2.1-2)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import timeit
>>> timeit.timeit('1000000000 in range(0,1000000000,10)', number=1)
2.7295479774475098
>>> timeit.timeit('1000000000 in xrange(0,1000000000,10)', number=1)
1.7883760929107666
[cnnblike@localhost Leetcode]$ python3
Python 3.5.2 (default, Sep 14 2016, 11:28:32)
[GCC 6.2.1 20160901 (Red Hat 6.2.1-1)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import timeit
>>> timeit.timeit('1000000000 in range(0,1000000000,10)', number=1)
2.9259999791975133e-06

It’s not that shocking to know that the range() in python2 is intolerable slow, after all, it would return a list to memory, and the filling part would really consume a lot of time.

But what the fuck is xrange() that slow? Comparing to range() in python3, they should both return a iterator to the caller and it must be the difference between iterator that cause the different behavior.

How do in behave?

First we need to know how do the in behave and know what method in called. Check Mapping Operators to Functions - Python 2 and Mapping Operators to Functions - Python 3 would indicate what the operator in would call the __contains__(a,b) to check if a is in b.

Next, how do __contains__ behave? Check object.contains(self,item) - Python 2 and object.contains(self,item) - Python 3 - they are almost the same, so the difference of xrange() and range() must hidden in their implementation.

1
2
3
object.__contains__(self, item)
Called to implement membership test operators. Should return true if item is in self, false otherwise. For mapping objects, this should consider the keys of the mapping rather than the values or the key-item pairs.
For objects that don’t define __contains__(), the membership test first tries iteration via __iter__(), then the old sequence iteration protocol via __getitem__(), see this section in the language reference.

Second Step:

Check here Xrange Type, here comes the truth, the reason that xrange() is so slow:

XRange objects have very little behavior: they only support indexing, iteration, and the len() function.

Yes, the xrange() doesn’t support __contains__() so the in use __iter__() to iteration all items in the iterator, and this is why range() is so slow.

But the range() of Python3 support __contains__.

This could also be verified with dir():

1
2
3
4
5
6
7
8
9
10
11
12
13
[cnnblike@localhost Leetcode]$ python
Python 2.7.12 (default, Sep 29 2016, 12:52:02)
[GCC 6.2.1 20160916 (Red Hat 6.2.1-2)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> dir(xrange)
['__class__', '__delattr__', '__doc__', '__format__', '__getattribute__', '__getitem__', '__hash__', '__init__', '__iter__', '__len__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__reversed__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__']
[cnnblike@localhost Leetcode]$ python3
Python 3.5.2 (default, Sep 14 2016, 11:28:32)
[GCC 6.2.1 20160901 (Red Hat 6.2.1-1)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> dir(range)
['__class__', '__contains__', '__delattr__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__gt__', '__hash__', '__init__', '__iter__', '__le__', '__len__', '__lt__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__reversed__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', 'count', 'index', 'start', 'step', 'stop']

The behavior of CPython interpreter

Check the code here range __contains__ source code of Python3 . You could see what Python3 actually do in the contains() method,

  • check if the value is within the whole range start <= ob < stop for positive steps and stop < ob <= start for negative steps
  • use remainder to check if the value is in actually number, result = ((int(ob) - start) % step) == 0
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
static int
range_contains_long(rangeobject *r, PyObject *ob)
{
int cmp1, cmp2, cmp3;
PyObject *tmp1 = NULL;
PyObject *tmp2 = NULL;
PyObject *zero = NULL;
int result = -1;
zero = PyLong_FromLong(0);
if (zero == NULL) /* MemoryError in int(0) */
goto end;
/* Check if the value can possibly be in the range. */
cmp1 = PyObject_RichCompareBool(r->step, zero, Py_GT);
if (cmp1 == -1)
goto end;
if (cmp1 == 1) { /* positive steps: start <= ob < stop */
cmp2 = PyObject_RichCompareBool(r->start, ob, Py_LE);
cmp3 = PyObject_RichCompareBool(ob, r->stop, Py_LT);
}
else { /* negative steps: stop < ob <= start */
cmp2 = PyObject_RichCompareBool(ob, r->start, Py_LE);
cmp3 = PyObject_RichCompareBool(r->stop, ob, Py_LT);
}
if (cmp2 == -1 || cmp3 == -1) /* TypeError */
goto end;
if (cmp2 == 0 || cmp3 == 0) { /* ob outside of range */
result = 0;
goto end;
}
/* Check that the stride does not invalidate ob's membership. */
tmp1 = PyNumber_Subtract(ob, r->start);
if (tmp1 == NULL)
goto end;
tmp2 = PyNumber_Remainder(tmp1, r->step);
if (tmp2 == NULL)
goto end;
/* result = ((int(ob) - start) % step) == 0 */
result = PyObject_RichCompareBool(tmp2, zero, Py_EQ);
end:
Py_XDECREF(tmp1);
Py_XDECREF(tmp2);
Py_XDECREF(zero);
return result;
}

Conclusion

  1. Fuck Leetcode! It’s 2017 now, why do you still insist using Python2?
  2. Never use Python2, many new specification would not be added back.
  3. When you have to use Python2, try to use chained comparation like ``int(low)<=int(str)<=int(high), instead ofxrange()`.