Jump to second part, the first part illustrate how did I found the problem, which is not interesting at all.
How did I find this problem
Recently, my roommate is practicing for interview with Leetcode, he keeps coming to ask for my help with some problems. The problems in Leetcode are really easy (especially when compared with problems in OJs oriented for ACM), but if you keep sensitive, it could still help you find some corner in the language that you would never notice otherwise.
Here is my log about investigating difference of
range() between python2 and python3 - and their different against
From my point of view, 248. Strobogrammatic Number III has two different ways to solve - one is to use DFS to generate all numbers(which won’t exceed $5^10=9,765,625$ for range within long, enough even for slow language like Python), another way is to use some mathematical method, hard to code but much quicker.
But who cares about time complexity, especially for someone else? “Just use the Brute Force, Luke!” When trying to solve with some brute force DFS, I wrote the following code:
Not surprisingly, it worked! However, it encountered with a MLE on Leetcode, but why?
Ok, it’s time to do some profile on memory, use the following code:
python3 -m memory_profiler 248.py and there you go!
Here is the problem, when I was investigating this problem I used python3.5.2, and the result is like the following, everything goes as I expected, only 12.6MB is used.
Then maybe it’s caused by the different between my development environment and Leetcode (which is now using 2.7.12) ?
python -m memory_profiler 248.py, and here comes the result:
Haha, the evil is hidden in the line 18, it must be the
int(str) in range(int(low),int(high)+1)
Not that surprisingly, the
range() in python3 is more like
xrange() in python2. Just modify the
xrange() and there you go!
Wait a second, a
Time Limit Exceeded? Another profile on the code shows the problem still lies in
xrange(). Wait, it’s just a syntactic sugar to enhance the readability of my code! It just an easy way to write
int(str)>=int(low) and int(str)<=int(high). What makes the
xrange() consumes much more time? Shouldn’t it be a
O(1) operation? Why do my code, which is using
range() runs in Python3 like lightning while the
xrange() in Python2 is so slow that it even encountered with a
Time Limit Exceeded?
Ok, let’s use some code to simplify the problem.
In short, the problem is like the following:
It’s not that shocking to know that the
range() in python2 is intolerable slow, after all, it would return a list to memory, and the filling part would really consume a lot of time.
But what the fuck is
xrange() that slow? Comparing to
range() in python3, they should both return a iterator to the caller and it must be the difference between iterator that cause the different behavior.
First we need to know how do the
in behave and know what method
in called. Check Mapping Operators to Functions - Python 2 and Mapping Operators to Functions - Python 3 would indicate what the operator
in would call the
__contains__(a,b) to check if
a is in
Next, how do
__contains__ behave? Check object.contains(self,item) - Python 2 and object.contains(self,item) - Python 3 - they are almost the same, so the difference of
range() must hidden in their implementation.
Check here Xrange Type, here comes the truth, the reason that
xrange() is so slow:
XRange objects have very little behavior: they only support indexing, iteration, and the len() function.
xrange() doesn’t support
__contains__() so the
__iter__() to iteration all items in the iterator, and this is why
range() is so slow.
range() of Python3 support
This could also be verified with
The behavior of CPython interpreter
Check the code here range __contains__ source code of Python3 . You could see what Python3 actually do in the
- check if the value is within the whole range
start <= ob < stopfor positive steps and
stop < ob <= startfor negative steps
- use remainder to check if the value is in actually number,
result = ((int(ob) - start) % step) == 0
- Fuck Leetcode! It’s 2017 now, why do you still insist using Python2?
- Never use Python2, many new specification would not be added back.
- When you have to use Python2, try to use chained comparation like ``int(low)<=int(str)<=int(high)
, instead ofxrange()`.