https://docs.python.org/2/library/traceback.html


traceback — Print or retrieve a stack traceback


This module provides a standard interface to extract, format and print stack traces of Python programs. It exactly mimics the behavior of the Python interpreter when it prints a stack trace. This is useful when you want to print stack traces under program control, such as in a “wrapper” around the interpreter.

The module uses traceback objects — this is the object type that is stored in the variables sys.exc_traceback (deprecated) and sys.last_traceback and returned as the third item from sys.exc_info().

The module defines the following functions:

traceback.print_tb(tb[, limit[, file]])

Print up to limit stack trace entries from the traceback object tb. If limit is omitted or None, all entries are printed. If file is omitted or None, the output goes to sys.stderr; otherwise it should be an open file or file-like object to receive the output.

traceback.print_exception(etype, value, tb[, limit[, file]])

Print exception information and up to limit stack trace entries from the traceback tb to file. This differs from print_tb() in the following ways: (1) if tb is not None, it prints a header Traceback (most recent call last):; (2) it prints the exception etype and value after the stack trace; (3) if etype is SyntaxError and value has the appropriate format, it prints the line where the syntax error occurred with a caret indicating the approximate position of the error.

traceback.print_exc([limit[, file]])

This is a shorthand for print_exception(sys.exc_type, sys.exc_value, sys.exc_traceback, limit, file). (In fact, it uses sys.exc_info() to retrieve the same information in a thread-safe way instead of using the deprecated variables.)

traceback.format_exc([limit])

This is like print_exc(limit) but returns a string instead of printing to a file.

New in version 2.4.

traceback.print_last([limit[, file]])

This is a shorthand for print_exception(sys.last_type, sys.last_value, sys.last_traceback, limit, file). In general it will work only after an exception has reached an interactive prompt (see sys.last_type).

traceback.print_stack([f[, limit[, file]]])

This function prints a stack trace from its invocation point. The optional f argument can be used to specify an alternate stack frame to start. The optional limit and file arguments have the same meaning as for print_exception().

traceback.extract_tb(tb[, limit])

Return a list of up to limit “pre-processed” stack trace entries extracted from the traceback object tb. It is useful for alternate formatting of stack traces. If limit is omitted or None, all entries are extracted. A “pre-processed” stack trace entry is a 4-tuple (filename, line number, function name, text) representing the information that is usually printed for a stack trace. The text is a string with leading and trailing whitespace stripped; if the source is not available it is None.

traceback.extract_stack([f[, limit]])

Extract the raw traceback from the current stack frame. The return value has the same format as for extract_tb(). The optional f and limit arguments have the same meaning as for print_stack().

traceback.format_list(extracted_list)

Given a list of tuples as returned by extract_tb() or extract_stack(), return a list of strings ready for printing. Each string in the resulting list corresponds to the item with the same index in the argument list. Each string ends in a newline; the strings may contain internal newlines as well, for those items whose source text line is not None.

traceback.format_exception_only(etype, value)

Format the exception part of a traceback. The arguments are the exception type, etype, and value such as given by sys.last_type and sys.last_value. The return value is a list of strings, each ending in a newline. Normally, the list contains a single string; however, for SyntaxError exceptions, it contains several lines that (when printed) display detailed information about where the syntax error occurred. The message indicating which exception occurred is always the last string in the list.

traceback.format_exception(etype, value, tb[, limit])

Format a stack trace and the exception information. The arguments have the same meaning as the corresponding arguments to print_exception(). The return value is a list of strings, each ending in a newline and some containing internal newlines. When these lines are concatenated and printed, exactly the same text is printed as does print_exception().

traceback.format_tb(tb[, limit])

A shorthand for format_list(extract_tb(tb, limit)).

traceback.format_stack([f[, limit]])

A shorthand for format_list(extract_stack(f, limit)).

traceback.tb_lineno(tb)

This function returns the current line number set in the traceback object. It was necessary because in versions of Python prior to 2.3, tb.tb_lineno was not updated correctly when the -O flag was passed to Python. This function has no use in versions past 2.3.

Traceback Examples

This simple example implements a basic read-eval-print loop, similar to (but less useful than) the standard Python interactive interpreter loop. For a more complete implementation of the interpreter loop, refer to the code module.

import sys, traceback

def run_user_code(envdir):
    source = raw_input(">>> ")
    try:
        exec source in envdir
    except:
        print "Exception in user code:"
        print '-'*60
        traceback.print_exc(file=sys.stdout)
        print '-'*60

envdir = {}
while 1:
    run_user_code(envdir)

The following example demonstrates the different ways to print and format the exception and traceback:

import sys, traceback

def lumberjack():
    bright_side_of_death()

def bright_side_of_death():
    return tuple()[0]

try:
    lumberjack()
except IndexError:
    exc_type, exc_value, exc_traceback = sys.exc_info()
    print "*** print_tb:"
    traceback.print_tb(exc_traceback, limit=1, file=sys.stdout)
    print "*** print_exception:"
    traceback.print_exception(exc_type, exc_value, exc_traceback,
                              limit=2, file=sys.stdout)
    print "*** print_exc:"
    traceback.print_exc()
    print "*** format_exc, first and last line:"
    formatted_lines = traceback.format_exc().splitlines()
    print formatted_lines[0]
    print formatted_lines[-1]
    print "*** format_exception:"
    print repr(traceback.format_exception(exc_type, exc_value,
                                          exc_traceback))
    print "*** extract_tb:"
    print repr(traceback.extract_tb(exc_traceback))
    print "*** format_tb:"
    print repr(traceback.format_tb(exc_traceback))
    print "*** tb_lineno:", exc_traceback.tb_lineno

The output for the example would look similar to this:

*** print_tb:
  File "<doctest...>", line 10, in <module>
    lumberjack()
*** print_exception:
Traceback (most recent call last):
  File "<doctest...>", line 10, in <module>
    lumberjack()
  File "<doctest...>", line 4, in lumberjack
    bright_side_of_death()
IndexError: tuple index out of range
*** print_exc:
Traceback (most recent call last):
  File "<doctest...>", line 10, in <module>
    lumberjack()
  File "<doctest...>", line 4, in lumberjack
    bright_side_of_death()
IndexError: tuple index out of range
*** format_exc, first and last line:
Traceback (most recent call last):
IndexError: tuple index out of range
*** format_exception:
['Traceback (most recent call last):\n',
 '  File "<doctest...>", line 10, in <module>\n    lumberjack()\n',
 '  File "<doctest...>", line 4, in lumberjack\n    bright_side_of_death()\n',
 '  File "<doctest...>", line 7, in bright_side_of_death\n    return tuple()[0]\n',
 'IndexError: tuple index out of range\n']
*** extract_tb:
[('<doctest...>', 10, '<module>', 'lumberjack()'),
 ('<doctest...>', 4, 'lumberjack', 'bright_side_of_death()'),
 ('<doctest...>', 7, 'bright_side_of_death', 'return tuple()[0]')]
*** format_tb:
['  File "<doctest...>", line 10, in <module>\n    lumberjack()\n',
 '  File "<doctest...>", line 4, in lumberjack\n    bright_side_of_death()\n',
 '  File "<doctest...>", line 7, in bright_side_of_death\n    return tuple()[0]\n']
*** tb_lineno: 10

The following example shows the different ways to print and format the stack:

>>> import traceback
>>> def another_function():
...     lumberstack()
...
>>> def lumberstack():
...     traceback.print_stack()
...     print repr(traceback.extract_stack())
...     print repr(traceback.format_stack())
...
>>> another_function()
  File "<doctest>", line 10, in <module>
    another_function()
  File "<doctest>", line 3, in another_function
    lumberstack()
  File "<doctest>", line 6, in lumberstack
    traceback.print_stack()
[('<doctest>', 10, '<module>', 'another_function()'),
 ('<doctest>', 3, 'another_function', 'lumberstack()'),
 ('<doctest>', 7, 'lumberstack', 'print repr(traceback.extract_stack())')]
['  File "<doctest>", line 10, in <module>\n    another_function()\n',
 '  File "<doctest>", line 3, in another_function\n    lumberstack()\n',
 '  File "<doctest>", line 8, in lumberstack\n    print repr(traceback.format_stack())\n']

This last example demonstrates the final few formatting functions:

>>> import traceback
>>> traceback.format_list([('spam.py', 3, '<module>', 'spam.eggs()'),
...                        ('eggs.py', 42, 'eggs', 'return "bacon"')])
['  File "spam.py", line 3, in <module>\n    spam.eggs()\n',
 '  File "eggs.py", line 42, in eggs\n    return "bacon"\n']
>>> an_error = IndexError('tuple index out of range')
>>> traceback.format_exception_only(type(an_error), an_error)
['IndexError: tuple index out of range\n']



https://realpython.com/introduction-to-python-generators/

Understanding Generators

So far, you’ve learned about the two primary ways of creating generators: by using generator functions and generator expressions. You might even have an intuitive understanding of how generators work. Let’s take a moment to make that knowledge a little more explicit.

Generator functions look and act just like regular functions, but with one defining characteristic. Generator functions use the Python yield keyword instead of return. Recall the generator function you wrote earlier:

def infinite_sequence():
    num = 0
    while True:
        yield num
        num += 1

This looks like a typical function definition, except for the Python yield statement and the code that follows it. yield indicates where a value is sent back to the caller, but unlike return, you don’t exit the function afterward.

Instead, the state of the function is remembered. That way, when next() is called on a generator object (either explicitly or implicitly within a for loop), the previously yielded variable num is incremented, and then yielded again. Since generator functions look like other functions and act very similarly to them, you can assume that generator expressions are very similar to other comprehensions available in Python.
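To make that resumption behavior concrete, here's a quick interactive session with the infinite_sequence() generator defined above; each call to next() resumes the function right where it left off:

>>> gen = infinite_sequence()
>>> next(gen)
0
>>> next(gen)
1
>>> next(gen)
2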

Building Generators With Generator Expressions

Like list comprehensions, generator expressions allow you to quickly create a generator object in just a few lines of code. They’re also useful in the same cases where list comprehensions are used, with an added benefit: you can create them without building and holding the entire object in memory before iteration. In other words, you’ll have no memory penalty when you use generator expressions. Take this example of squaring some numbers:

>>> nums_squared_lc = [num**2 for num in range(5)]
>>> nums_squared_gc = (num**2 for num in range(5))

Both nums_squared_lc and nums_squared_gc look basically the same, but there’s one key difference. Can you spot it? Take a look at what happens when you inspect each of these objects:

>>> nums_squared_lc
[0, 1, 4, 9, 16]
>>> nums_squared_gc
<generator object <genexpr> at 0x107fbbc78>

The first object used brackets to build a list, while the second created a generator expression by using parentheses. The output confirms that you’ve created a generator object and that it is distinct from a list.
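You can see the laziness directly by pulling values out of the generator from the session above; each value is computed only on request:

>>> next(nums_squared_gc)
0
>>> next(nums_squared_gc)
1
>>> list(nums_squared_gc)    # consumes the remaining values
[4, 9, 16]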

Profiling Generator Performance

You learned earlier that generators are a great way to optimize memory. While an infinite sequence generator is an extreme example of this optimization, let’s amp up the number squaring examples you just saw and inspect the size of the resulting objects. You can do this with a call to sys.getsizeof():

>>> import sys
>>> nums_squared_lc = [i ** 2 for i in range(10000)]
>>> sys.getsizeof(nums_squared_lc)
87624
>>> nums_squared_gc = (i ** 2 for i in range(10000))
>>> print(sys.getsizeof(nums_squared_gc))
120

In this case, the list you get from the list comprehension is 87,624 bytes, while the generator object is only 120. This means that the list is over 700 times larger than the generator object!

There is one thing to keep in mind, though. If the list is smaller than the running machine’s available memory, then list comprehensions can be faster to evaluate than the equivalent generator expression. To explore this, let’s sum across the results from the two comprehensions above. You can generate a readout with cProfile.run():

>>> import cProfile
>>> cProfile.run('sum([i * 2 for i in range(10000)])')
         5 function calls in 0.001 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.001    0.001    0.001    0.001 <string>:1(<listcomp>)
        1    0.000    0.000    0.001    0.001 <string>:1(<module>)
        1    0.000    0.000    0.001    0.001 {built-in method builtins.exec}
        1    0.000    0.000    0.000    0.000 {built-in method builtins.sum}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}


>>> cProfile.run('sum((i * 2 for i in range(10000)))')
         10005 function calls in 0.003 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    10001    0.002    0.000    0.002    0.000 <string>:1(<genexpr>)
        1    0.000    0.000    0.003    0.003 <string>:1(<module>)
        1    0.000    0.000    0.003    0.003 {built-in method builtins.exec}
        1    0.001    0.001    0.003    0.003 {built-in method builtins.sum}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

Here, you can see that summing across all values in the list comprehension took about a third of the time as summing across the generator. If speed is an issue and memory isn’t, then a list comprehension is likely a better tool for the job.

Remember, list comprehensions return full lists, while generator expressions return generators. Generators work the same whether they’re built from a function or an expression. Using an expression just allows you to define simple generators in a single line, with an assumed yield at the end of each inner iteration.
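As a small sketch of that equivalence (squares_fn is a name invented here for illustration), the following function and expression produce exactly the same values:

def squares_fn(n):
    for num in range(n):
        yield num ** 2    # the expression form has an implicit yield here

assert list(squares_fn(5)) == [0, 1, 4, 9, 16]
assert list(num ** 2 for num in range(5)) == [0, 1, 4, 9, 16]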

The Python yield statement is certainly the linchpin on which all of the functionality of generators rests, so let’s dive into how yield works in Python.



http://pythonstudy.xyz/python/article/23-Iterator%EC%99%80-Generator

The for Statement and Iterators

In Python there are types with a special property: they are "iterators." Technically, an iterator is simply an object that has a __next__() method. (Generators, which we will meet later, are themselves a kind of iterator, so the two concepts are closely related.) An iterator usually holds internally a series of consecutive values, or the state needed to compute a sequence that follows some rule, and each time __next__() is called it returns the next term of the sequence it manages.

For reference, the help for the relevant built-in function looks like this. It uses the term iterator directly, and says that it fetches and returns the next item from the iterator.

In [1]: next?
Docstring:
next(iterator[, default])
Return the next item from the iterator.
If default is given and the iterator is exhausted,
it is returned instead of raising StopIteration.
Type: builtin_function_or_method

Note that when the iterator has no more items to produce, it raises a StopIteration exception, which signals the end of the iteration.
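For example (a minimal session), the default argument of next() lets you avoid the exception entirely:

>>> it = iter([1, 2])
>>> next(it)
1
>>> next(it)
2
>>> next(it, 'done')    # exhausted: the default is returned instead of raising StopIteration
'done'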

The Iterable Protocol

In Python, saying that something is "iterable" is equivalent to saying that it can be used in a for ... in statement; technically, it means the object can supply an iterator. The built-in function that obtains an iterator from an iterable object is iter(). (As we will see later, just as next(x) calls x.__next__(), iter(x) calls x.__iter__(). This is in fact the heart of the iterable protocol.) To get more hints about iterable objects, let's look at this function's help as well.

In [4]: iter?
Docstring:
iter(iterable) -> iterator
iter(callable, sentinel) -> iterator

Get an iterator from an object.
In the first form, the argument must supply its own iterator, or be a sequence.
In the second form, the callable is called until it returns the sentinel.
Type: builtin_function_or_method

We can learn a lot from this.

  1. The iter() function obtains an iterator from an object and returns it. (If you create a list and call dir() on it, the returned list of attribute names contains an __iter__ method; the iter() function simply returns the result of calling this __iter__() method.)
  2. An iterable object either supplies its own iterator or is itself a sequence.
  3. Besides iterators, a callable that keeps returning values until it returns a particular sentinel value is also treated as iterable.

Collecting the hints so far, we can summarize what we have learned as follows.

  1. With the iter() function, you can obtain an iterator from an iterable object.
  2. With the next() function, you can obtain each successive term from an iterator.
  3. When next() is called and the iterator has nothing left to give, a StopIteration exception is raised.
  4. And since the object returned by range() can be used in a for ... in statement, it is iterable.

The Structure of the for … in Loop

From these facts, we can reconstruct Python's for ... in statement with a while statement.

for i in range(3):
    print(i)

Based on what we have discovered, the loop above can be rewritten as follows.

## range(3) returns an iterable object, and
## iter() can be used to obtain an iterator from it.
x = range(3)

## A range object has the following namespace; note that it carries an __iter__ method.
dir(x)
>>> ['__bool__', '__class__', '__contains__', '__delattr__', '__dir__',
'__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__gt__',
'__hash__', '__init__', '__init_subclass__', '__iter__', '__le__', '__len__', '__lt__',
'__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__reversed__', '__setattr__',
'__sizeof__', '__str__', '__subclasshook__', 'count', 'index', 'start', 'step', 'stop']

def customFor(x):
    ## Obtain an iterator from the iterable object.
    iteratorX = x.__iter__()
    while True:
        try:
            ## Get the next term from the iterator.
            i = next(iteratorX)
            print(i)
        except StopIteration:
            ## When the iterator is exhausted, StopIteration is raised.
            print("iteration finished.")
            break

customFor(x)
# 0
# 1
# 2
# iteration finished.

And Python's for ... in statement actually works this way. For any list, string, tuple, and so on, you can obtain the iterator directly and call next() on it repeatedly, as below.

In [8]: x = iter([1,2,3,4])

In [9]: x
Out[9]: <list_iterator at 0x222958f2b38>

In [10]: next(x)
Out[10]: 1

In [11]: next(x)
Out[11]: 2

In [12]: next(x)
Out[12]: 3

In [13]: next(x)
Out[13]: 4

In [14]: next(x)
---------------------------------------------------------------------------
StopIteration                             Traceback (most recent call last)
<ipython-input-14-5e4e57af3a97> in <module>()
----> 1 next(x)

StopIteration:

Iterators

Now let's look at iterators a little more closely. As mentioned earlier, an iterator is an object that has a __next__() method. The built-in next() function, by a convention agreed in advance, merely calls the __next__() method of the object it receives as an argument and returns the result.
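A minimal sketch makes that convention visible (CountDown is a name invented for this illustration; calling next(c) is the same as calling c.__next__()):

class CountDown:
    def __init__(self, start):
        self.n = start

    def __next__(self):
        ## When nothing is left, signal the end of iteration.
        if self.n <= 0:
            raise StopIteration
        self.n -= 1
        return self.n + 1

c = CountDown(2)
print(next(c))    # 2 -- identical to c.__next__()
print(next(c))    # 1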

A Custom Iterator Class

Then, for example, couldn't we build an iterator that returns the terms of the Fibonacci sequence one after another?

class FibonacciSeq:
    def __init__(self):
        self.a, self.b = 0, 1

    def __next__(self):
        self.a, self.b = self.b, self.a + self.b
        ## The sequence would be infinite, so stop once a value exceeds 200.
        if self.a > 200:
            raise StopIteration
        return self.a

## Test
f = FibonacciSeq()
next(f)   # 1
next(f)   # 1
next(f)   # 2
next(f)   # 3
next(f)   # 5
next(f)   # 8

It works, more or less. But even when we build an iterator this way, the iterable object is a separate matter. So far we have had an iterable object, obtained an iterator from it with iter(), and traversed each term with that iterator. That is, what a for ... in statement needs is not the iterator itself but an object that can produce an iterator, i.e., an iterable object.

However, the FibonacciSeq class we just wrote can itself be both an iterator and an iterable object. Just as next() calls __next__(), the iter() function obtains an iterator by calling the __iter__() method on the object it receives; this, too, is part of the prearranged protocol. Since the object already satisfies every requirement of an iterator, its __iter__() can simply be defined to return the object itself.

So let's revise the Fibonacci class as follows. While revising it, we also change it so that the limit value can be passed in as an argument at construction time.

class FibonacciSeq:
    def __init__(self, upto=200):
        self.limit = upto
        self.a, self.b = 0, 1

    def __iter__(self):
        return self

    def __next__(self):
        self.a, self.b = self.b, self.a + self.b
        if self.a > self.limit:
            raise StopIteration
        return self.a

## Test: it now works in a for ... in statement, too.
for f in FibonacciSeq(200):
    print(f)

Iterable Objects

We said that merely creating an object with __iter__() and __next__() methods makes it, technically, an iterable object. With this you could build tools such as a repeater that yields one particular value over and over, or a chained sequence that traverses several sequences in a single pass, and you could also design objects whose internal attributes can be traversed with a for … in statement. For example, in code that manages students' exam scores, you might create a Student class that stores scores for subjects such as eng, math, and sci; if __next__() returns those attribute values in a predetermined order, an instance of Student can be traversed subject by subject with a for ... in statement, as sketched below.
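A minimal sketch of that idea (the Student class and its subjects are hypothetical, invented for this illustration):

class Student:
    def __init__(self, eng, math, sci):
        ## Scores are stored in the predetermined traversal order.
        self.scores = [eng, math, sci]

    def __iter__(self):
        self._i = 0
        return self

    def __next__(self):
        if self._i >= len(self.scores):
            raise StopIteration
        score = self.scores[self._i]
        self._i += 1
        return score

for score in Student(90, 85, 77):
    print(score)    # 90, 85, 77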

https://medium.com/@charliesharding/learning-python-no-template-literals-da7cbd77e3ba


Learning Python — Template Literals?

Charles Harding
Mar 11, 2017 · 1 min read

Coming from Javascript, python is a fairly approachable language. I’m going to try to keep up these posts as a reference for others learning python as well. One of the first things that I’ve encountered is the absence of template literals in the language.

In Javascript, one can type:

let age = 27;
console.log(`my age is ${age}!`);

And the output will be 'my age is 27!'. This is useful when you have a lot of embedded variables in a string, so that you don't have to keep concatenating with ending quotes and plus signs.

'lol ' + adversary.name + ' you suk at ' + adversary.favoriteThing + '!!!!111!11!'

I found an active proposal for this feature but it was posted two years ago. https://www.python.org/dev/peps/pep-0498/

Turns out they totally do exist, just under a different name and syntax.

x = "There are %d types of people." % 10
binary = "binary"
do_not = "don't"
y = "Those who know %s and those who %s." % (binary, do_not)

The %s and % (varnames) syntax is the way of inserting the values of those variables into the string. %s is used for strings, whereas %d is used for numbers.

As pointed out in the comments by Phil Owens, better functionality has actually been added in Python 3.6.

“As of 3.6 you can use the syntax:

date = '22nd'
txt = f"Today is March {date}."

Thanks Phil!


https://docs.python.org/3/library/itertools.html#itertools.product


itertools — Functions creating iterators for efficient looping


This module implements a number of iterator building blocks inspired by constructs from APL, Haskell, and SML. Each has been recast in a form suitable for Python.

The module standardizes a core set of fast, memory efficient tools that are useful by themselves or in combination. Together, they form an “iterator algebra” making it possible to construct specialized tools succinctly and efficiently in pure Python.

For instance, SML provides a tabulation tool: tabulate(f) which produces a sequence f(0), f(1), .... The same effect can be achieved in Python by combining map() and count() to form map(f, count()).

These tools and their built-in counterparts also work well with the high-speed functions in the operator module. For example, the multiplication operator can be mapped across two vectors to form an efficient dot-product: sum(map(operator.mul, vector1, vector2)).
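For example, a short session combining these pieces (islice() is used here only to truncate the infinite stream):

>>> import operator
>>> from itertools import count, islice
>>> squares = map(lambda n: n * n, count())         # tabulate(f) as map(f, count())
>>> list(islice(squares, 5))
[0, 1, 4, 9, 16]
>>> sum(map(operator.mul, [1, 2, 3], [4, 5, 6]))    # dot product
32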

Infinite iterators:

Iterator    Arguments       Results                                          Example
count()     start, [step]   start, start+step, start+2*step, …               count(10) --> 10 11 12 13 14 ...
cycle()     p               p0, p1, … plast, p0, p1, …                       cycle('ABCD') --> A B C D A B C D ...
repeat()    elem [,n]       elem, elem, elem, … endlessly or up to n times   repeat(10, 3) --> 10 10 10

Iterators terminating on the shortest input sequence:

Iterator                Arguments                     Results                                       Example
accumulate()            p [,func]                     p0, p0+p1, p0+p1+p2, …                        accumulate([1,2,3,4,5]) --> 1 3 6 10 15
chain()                 p, q, …                       p0, p1, … plast, q0, q1, …                    chain('ABC', 'DEF') --> A B C D E F
chain.from_iterable()   iterable                      p0, p1, … plast, q0, q1, …                    chain.from_iterable(['ABC', 'DEF']) --> A B C D E F
compress()              data, selectors               (d[0] if s[0]), (d[1] if s[1]), …             compress('ABCDEF', [1,0,1,0,1,1]) --> A C E F
dropwhile()             pred, seq                     seq[n], seq[n+1], starting when pred fails    dropwhile(lambda x: x<5, [1,4,6,4,1]) --> 6 4 1
filterfalse()           pred, seq                     elements of seq where pred(elem) is false     filterfalse(lambda x: x%2, range(10)) --> 0 2 4 6 8
groupby()               iterable[, key]               sub-iterators grouped by value of key(v)
islice()                seq, [start,] stop [, step]   elements from seq[start:stop:step]            islice('ABCDEFG', 2, None) --> C D E F G
starmap()               func, seq                     func(*seq[0]), func(*seq[1]), …               starmap(pow, [(2,5), (3,2), (10,3)]) --> 32 9 1000
takewhile()             pred, seq                     seq[0], seq[1], until pred fails              takewhile(lambda x: x<5, [1,4,6,4,1]) --> 1 4
tee()                   it, n                         it1, it2, … itn splits one iterator into n
zip_longest()           p, q, …                       (p[0], q[0]), (p[1], q[1]), …                 zip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D-

Combinatoric iterators:

Iterator                          Arguments            Results
product()                         p, q, … [repeat=1]   cartesian product, equivalent to a nested for-loop
permutations()                    p[, r]               r-length tuples, all possible orderings, no repeated elements
combinations()                    p, r                 r-length tuples, in sorted order, no repeated elements
combinations_with_replacement()   p, r                 r-length tuples, in sorted order, with repeated elements

Examples                                     Results
product('ABCD', repeat=2)                    AA AB AC AD BA BB BC BD CA CB CC CD DA DB DC DD
permutations('ABCD', 2)                      AB AC AD BA BC BD CA CB CD DA DB DC
combinations('ABCD', 2)                      AB AC AD BC BD CD
combinations_with_replacement('ABCD', 2)     AA AB AC AD BB BC BD CC CD DD

Itertool functions

The following module functions all construct and return iterators. Some provide streams of infinite length, so they should only be accessed by functions or loops that truncate the stream.

itertools.accumulate(iterable[, func, *, initial=None])

Make an iterator that returns accumulated sums, or accumulated results of other binary functions (specified via the optional func argument).

If func is supplied, it should be a function of two arguments. Elements of the input iterable may be any type that can be accepted as arguments to func. (For example, with the default operation of addition, elements may be any addable type including Decimal or Fraction.)

Usually, the number of elements output matches the input iterable. However, if the keyword argument initial is provided, the accumulation leads off with the initial value so that the output has one more element than the input iterable.

Roughly equivalent to:

def accumulate(iterable, func=operator.add, *, initial=None):
    'Return running totals'
    # accumulate([1,2,3,4,5]) --> 1 3 6 10 15
    # accumulate([1,2,3,4,5], initial=100) --> 100 101 103 106 110 115
    # accumulate([1,2,3,4,5], operator.mul) --> 1 2 6 24 120
    it = iter(iterable)
    total = initial
    if initial is None:
        try:
            total = next(it)
        except StopIteration:
            return
    yield total
    for element in it:
        total = func(total, element)
        yield total

There are a number of uses for the func argument. It can be set to min() for a running minimum, max() for a running maximum, or operator.mul() for a running product. Amortization tables can be built by accumulating interest and applying payments. First-order recurrence relations can be modeled by supplying the initial value in the iterable and using only the accumulated total in the func argument:

>>> data = [3, 4, 6, 2, 1, 9, 0, 7, 5, 8]
>>> list(accumulate(data, operator.mul))     # running product
[3, 12, 72, 144, 144, 1296, 0, 0, 0, 0]
>>> list(accumulate(data, max))              # running maximum
[3, 4, 6, 6, 6, 9, 9, 9, 9, 9]

# Amortize a 5% loan of 1000 with 4 annual payments of 90
>>> cashflows = [1000, -90, -90, -90, -90]
>>> list(accumulate(cashflows, lambda bal, pmt: bal*1.05 + pmt))
[1000, 960.0, 918.0, 873.9000000000001, 827.5950000000001]

# Chaotic recurrence relation https://en.wikipedia.org/wiki/Logistic_map
>>> logistic_map = lambda x, _:  r * x * (1 - x)
>>> r = 3.8
>>> x0 = 0.4
>>> inputs = repeat(x0, 36)     # only the initial value is used
>>> [format(x, '.2f') for x in accumulate(inputs, logistic_map)]
['0.40', '0.91', '0.30', '0.81', '0.60', '0.92', '0.29', '0.79', '0.63',
 '0.88', '0.39', '0.90', '0.33', '0.84', '0.52', '0.95', '0.18', '0.57',
 '0.93', '0.25', '0.71', '0.79', '0.63', '0.88', '0.39', '0.91', '0.32',
 '0.83', '0.54', '0.95', '0.20', '0.60', '0.91', '0.30', '0.80', '0.60']

See functools.reduce() for a similar function that returns only the final accumulated value.

New in version 3.2.

Changed in version 3.3: Added the optional func parameter.

Changed in version 3.8: Added the optional initial parameter.
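A quick session contrasting the two (accumulate() keeps every intermediate total, while reduce() keeps only the last):

>>> import functools, operator
>>> from itertools import accumulate
>>> list(accumulate([1, 2, 3, 4, 5], operator.mul))
[1, 2, 6, 24, 120]
>>> functools.reduce(operator.mul, [1, 2, 3, 4, 5])
120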

itertools.chain(*iterables)

Make an iterator that returns elements from the first iterable until it is exhausted, then proceeds to the next iterable, until all of the iterables are exhausted. Used for treating consecutive sequences as a single sequence. Roughly equivalent to:

def chain(*iterables):
    # chain('ABC', 'DEF') --> A B C D E F
    for it in iterables:
        for element in it:
            yield element
classmethod chain.from_iterable(iterable)

Alternate constructor for chain(). Gets chained inputs from a single iterable argument that is evaluated lazily. Roughly equivalent to:

def from_iterable(iterables):
    # chain.from_iterable(['ABC', 'DEF']) --> A B C D E F
    for it in iterables:
        for element in it:
            yield element
itertools.combinations(iterable, r)

Return r length subsequences of elements from the input iterable.

Combinations are emitted in lexicographic sort order. So, if the input iterable is sorted, the combination tuples will be produced in sorted order.

Elements are treated as unique based on their position, not on their value. So if the input elements are unique, there will be no repeat values in each combination.

Roughly equivalent to:

def combinations(iterable, r):
    # combinations('ABCD', 2) --> AB AC AD BC BD CD
    # combinations(range(4), 3) --> 012 013 023 123
    pool = tuple(iterable)
    n = len(pool)
    if r > n:
        return
    indices = list(range(r))
    yield tuple(pool[i] for i in indices)
    while True:
        for i in reversed(range(r)):
            if indices[i] != i + n - r:
                break
        else:
            return
        indices[i] += 1
        for j in range(i+1, r):
            indices[j] = indices[j-1] + 1
        yield tuple(pool[i] for i in indices)

The code for combinations() can be also expressed as a subsequence of permutations() after filtering entries where the elements are not in sorted order (according to their position in the input pool):

def combinations(iterable, r):
    pool = tuple(iterable)
    n = len(pool)
    for indices in permutations(range(n), r):
        if sorted(indices) == list(indices):
            yield tuple(pool[i] for i in indices)

The number of items returned is n! / r! / (n-r)! when 0 <= r <= n or zero when r > n.

itertools.combinations_with_replacement(iterable, r)

Return r length subsequences of elements from the input iterable allowing individual elements to be repeated more than once.

Combinations are emitted in lexicographic sort order. So, if the input iterable is sorted, the combination tuples will be produced in sorted order.

Elements are treated as unique based on their position, not on their value. So if the input elements are unique, the generated combinations will also be unique.

Roughly equivalent to:

def combinations_with_replacement(iterable, r):
    # combinations_with_replacement('ABC', 2) --> AA AB AC BB BC CC
    pool = tuple(iterable)
    n = len(pool)
    if not n and r:
        return
    indices = [0] * r
    yield tuple(pool[i] for i in indices)
    while True:
        for i in reversed(range(r)):
            if indices[i] != n - 1:
                break
        else:
            return
        indices[i:] = [indices[i] + 1] * (r - i)
        yield tuple(pool[i] for i in indices)

The code for combinations_with_replacement() can be also expressed as a subsequence of product() after filtering entries where the elements are not in sorted order (according to their position in the input pool):

def combinations_with_replacement(iterable, r):
    pool = tuple(iterable)
    n = len(pool)
    for indices in product(range(n), repeat=r):
        if sorted(indices) == list(indices):
            yield tuple(pool[i] for i in indices)

The number of items returned is (n+r-1)! / r! / (n-1)! when n > 0.

New in version 3.1.

itertools.compress(data, selectors)

Make an iterator that filters elements from data returning only those that have a corresponding element in selectors that evaluates to True. Stops when either the data or selectors iterables has been exhausted. Roughly equivalent to:

def compress(data, selectors):
    # compress('ABCDEF', [1,0,1,0,1,1]) --> A C E F
    return (d for d, s in zip(data, selectors) if s)

New in version 3.1.

itertools.count(start=0, step=1)

Make an iterator that returns evenly spaced values starting with number start. Often used as an argument to map() to generate consecutive data points. Also, used with zip() to add sequence numbers. Roughly equivalent to:

def count(start=0, step=1):
    # count(10) --> 10 11 12 13 14 ...
    # count(2.5, 0.5) -> 2.5 3.0 3.5 ...
    n = start
    while True:
        yield n
        n += step

When counting with floating point numbers, better accuracy can sometimes be achieved by substituting multiplicative code such as: (start + step * i for i in count()).

Changed in version 3.1: Added step argument and allowed non-integer arguments.
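A short illustration of that accuracy note (the exact values assume standard binary floating point):

>>> from itertools import count, islice
>>> list(islice(count(0, 0.1), 11))[-1]    # ten repeated additions of 0.1
0.9999999999999999
>>> 0 + 0.1 * 10                           # a single multiplication
1.0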

itertools.cycle(iterable)

Make an iterator returning elements from the iterable and saving a copy of each. When the iterable is exhausted, return elements from the saved copy. Repeats indefinitely. Roughly equivalent to:

def cycle(iterable):
    # cycle('ABCD') --> A B C D A B C D A B C D ...
    saved = []
    for element in iterable:
        yield element
        saved.append(element)
    while saved:
        for element in saved:
              yield element

Note, this member of the toolkit may require significant auxiliary storage (depending on the length of the iterable).

itertools.dropwhile(predicate, iterable)

Make an iterator that drops elements from the iterable as long as the predicate is true; afterwards, returns every element. Note, the iterator does not produce any output until the predicate first becomes false, so it may have a lengthy start-up time. Roughly equivalent to:

def dropwhile(predicate, iterable):
    # dropwhile(lambda x: x<5, [1,4,6,4,1]) --> 6 4 1
    iterable = iter(iterable)
    for x in iterable:
        if not predicate(x):
            yield x
            break
    for x in iterable:
        yield x
itertools.filterfalse(predicate, iterable)

Make an iterator that filters elements from iterable returning only those for which the predicate is False. If predicate is None, return the items that are false. Roughly equivalent to:

def filterfalse(predicate, iterable):
    # filterfalse(lambda x: x%2, range(10)) --> 0 2 4 6 8
    if predicate is None:
        predicate = bool
    for x in iterable:
        if not predicate(x):
            yield x
itertools.groupby(iterable, key=None)

Make an iterator that returns consecutive keys and groups from the iterable. The key is a function computing a key value for each element. If not specified or is None, key defaults to an identity function and returns the element unchanged. Generally, the iterable needs to already be sorted on the same key function.

The operation of groupby() is similar to the uniq filter in Unix. It generates a break or new group every time the value of the key function changes (which is why it is usually necessary to have sorted the data using the same key function). That behavior differs from SQL’s GROUP BY which aggregates common elements regardless of their input order.

The returned group is itself an iterator that shares the underlying iterable with groupby(). Because the source is shared, when the groupby() object is advanced, the previous group is no longer visible. So, if that data is needed later, it should be stored as a list:

groups = []
uniquekeys = []
data = sorted(data, key=keyfunc)
for k, g in groupby(data, keyfunc):
    groups.append(list(g))      # Store group iterator as a list
    uniquekeys.append(k)

groupby() is roughly equivalent to:

class groupby:
    # [k for k, g in groupby('AAAABBBCCDAABBB')] --> A B C D A B
    # [list(g) for k, g in groupby('AAAABBBCCD')] --> AAAA BBB CC D
    def __init__(self, iterable, key=None):
        if key is None:
            key = lambda x: x
        self.keyfunc = key
        self.it = iter(iterable)
        self.tgtkey = self.currkey = self.currvalue = object()
    def __iter__(self):
        return self
    def __next__(self):
        self.id = object()
        while self.currkey == self.tgtkey:
            self.currvalue = next(self.it)    # Exit on StopIteration
            self.currkey = self.keyfunc(self.currvalue)
        self.tgtkey = self.currkey
        return (self.currkey, self._grouper(self.tgtkey, self.id))
    def _grouper(self, tgtkey, id):
        while self.id is id and self.currkey == tgtkey:
            yield self.currvalue
            try:
                self.currvalue = next(self.it)
            except StopIteration:
                return
            self.currkey = self.keyfunc(self.currvalue)
itertools.islice(iterable, stop)
itertools.islice(iterable, start, stop[, step])

Make an iterator that returns selected elements from the iterable. If start is non-zero, then elements from the iterable are skipped until start is reached. Afterward, elements are returned consecutively unless step is set higher than one which results in items being skipped. If stop is None, then iteration continues until the iterator is exhausted, if at all; otherwise, it stops at the specified position. Unlike regular slicing, islice() does not support negative values for start, stop, or step. Can be used to extract related fields from data where the internal structure has been flattened (for example, a multi-line report may list a name field on every third line). Roughly equivalent to:

def islice(iterable, *args):
    # islice('ABCDEFG', 2) --> A B
    # islice('ABCDEFG', 2, 4) --> C D
    # islice('ABCDEFG', 2, None) --> C D E F G
    # islice('ABCDEFG', 0, None, 2) --> A C E G
    s = slice(*args)
    start, stop, step = s.start or 0, s.stop or sys.maxsize, s.step or 1
    it = iter(range(start, stop, step))
    try:
        nexti = next(it)
    except StopIteration:
        # Consume *iterable* up to the *start* position.
        for i, element in zip(range(start), iterable):
            pass
        return
    try:
        for i, element in enumerate(iterable):
            if i == nexti:
                yield element
                nexti = next(it)
    except StopIteration:
        # Consume to *stop*.
        for i, element in zip(range(i + 1, stop), iterable):
            pass

If start is None, then iteration starts at zero. If step is None, then the step defaults to one.

itertools.permutations(iterable, r=None)

Return successive r length permutations of elements in the iterable.

If r is not specified or is None, then r defaults to the length of the iterable and all possible full-length permutations are generated.

Permutations are emitted in lexicographic sort order. So, if the input iterable is sorted, the permutation tuples will be produced in sorted order.

Elements are treated as unique based on their position, not on their value. So if the input elements are unique, there will be no repeat values in each permutation.

Roughly equivalent to:

def permutations(iterable, r=None):
    # permutations('ABCD', 2) --> AB AC AD BA BC BD CA CB CD DA DB DC
    # permutations(range(3)) --> 012 021 102 120 201 210
    pool = tuple(iterable)
    n = len(pool)
    r = n if r is None else r
    if r > n:
        return
    indices = list(range(n))
    cycles = list(range(n, n-r, -1))
    yield tuple(pool[i] for i in indices[:r])
    while n:
        for i in reversed(range(r)):
            cycles[i] -= 1
            if cycles[i] == 0:
                indices[i:] = indices[i+1:] + indices[i:i+1]
                cycles[i] = n - i
            else:
                j = cycles[i]
                indices[i], indices[-j] = indices[-j], indices[i]
                yield tuple(pool[i] for i in indices[:r])
                break
        else:
            return

The code for permutations() can be also expressed as a subsequence of product(), filtered to exclude entries with repeated elements (those from the same position in the input pool):

def permutations(iterable, r=None):
    pool = tuple(iterable)
    n = len(pool)
    r = n if r is None else r
    for indices in product(range(n), repeat=r):
        if len(set(indices)) == r:
            yield tuple(pool[i] for i in indices)

The number of items returned is n! / (n-r)! when 0 <= r <= n or zero when r > n.

itertools.product(*iterables, repeat=1)

Cartesian product of input iterables.

Roughly equivalent to nested for-loops in a generator expression. For example, product(A, B) returns the same as ((x,y) for x in A for y in B).

The nested loops cycle like an odometer with the rightmost element advancing on every iteration. This pattern creates a lexicographic ordering so that if the input’s iterables are sorted, the product tuples are emitted in sorted order.

To compute the product of an iterable with itself, specify the number of repetitions with the optional repeat keyword argument. For example, product(A, repeat=4) means the same as product(A, A, A, A).

This function is roughly equivalent to the following code, except that the actual implementation does not build up intermediate results in memory:

def product(*args, repeat=1):
    # product('ABCD', 'xy') --> Ax Ay Bx By Cx Cy Dx Dy
    # product(range(2), repeat=3) --> 000 001 010 011 100 101 110 111
    pools = [tuple(pool) for pool in args] * repeat
    result = [[]]
    for pool in pools:
        result = [x+[y] for x in result for y in pool]
    for prod in result:
        yield tuple(prod)
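A short session showing the repeat keyword (the product of an iterable with itself):

>>> from itertools import product
>>> list(product('AB', repeat=2))
[('A', 'A'), ('A', 'B'), ('B', 'A'), ('B', 'B')]
>>> list(product('AB', 'AB')) == list(product('AB', repeat=2))
True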
itertools.repeat(object[, times])

Make an iterator that returns object over and over again. Runs indefinitely unless the times argument is specified. Used as argument to map() for invariant parameters to the called function. Also used with zip() to create an invariant part of a tuple record.

Roughly equivalent to:

def repeat(object, times=None):
    # repeat(10, 3) --> 10 10 10
    if times is None:
        while True:
            yield object
    else:
        for i in range(times):
            yield object

A common use for repeat is to supply a stream of constant values to map or zip:

>>> list(map(pow, range(10), repeat(2)))
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
itertools.starmap(function, iterable)

Make an iterator that computes the function using arguments obtained from the iterable. Used instead of map() when argument parameters are already grouped in tuples from a single iterable (the data has been “pre-zipped”). The difference between map() and starmap() parallels the distinction between function(a,b) and function(*c). Roughly equivalent to:

def starmap(function, iterable):
    # starmap(pow, [(2,5), (3,2), (10,3)]) --> 32 9 1000
    for args in iterable:
        yield function(*args)
itertools.takewhile(predicate, iterable)

Make an iterator that returns elements from the iterable as long as the predicate is true. Roughly equivalent to:

def takewhile(predicate, iterable):
    # takewhile(lambda x: x<5, [1,4,6,4,1]) --> 1 4
    for x in iterable:
        if predicate(x):
            yield x
        else:
            break
itertools.tee(iterable, n=2)

Return n independent iterators from a single iterable.

The following Python code helps explain what tee does (although the actual implementation is more complex and uses only a single underlying FIFO queue).

Roughly equivalent to:

def tee(iterable, n=2):
    it = iter(iterable)
    deques = [collections.deque() for i in range(n)]
    def gen(mydeque):
        while True:
            if not mydeque:             # when the local deque is empty
                try:
                    newval = next(it)   # fetch a new value and
                except StopIteration:
                    return
                for d in deques:        # load it to all the deques
                    d.append(newval)
            yield mydeque.popleft()
    return tuple(gen(d) for d in deques)

Once tee() has made a split, the original iterable should not be used anywhere else; otherwise, the iterable could get advanced without the tee objects being informed.

tee iterators are not threadsafe. A RuntimeError may be raised when simultaneously using iterators returned by the same tee() call, even if the original iterable is threadsafe.

This itertool may require significant auxiliary storage (depending on how much temporary data needs to be stored). In general, if one iterator uses most or all of the data before another iterator starts, it is faster to use list() instead of tee().
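A minimal session showing the split (each tee iterator buffers independently, so b still sees values that a has already consumed):

>>> from itertools import tee
>>> a, b = tee([1, 2, 3])
>>> next(a), next(a)
(1, 2)
>>> list(b)
[1, 2, 3]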

itertools.zip_longest(*iterables, fillvalue=None)

Make an iterator that aggregates elements from each of the iterables. If the iterables are of uneven length, missing values are filled-in with fillvalue. Iteration continues until the longest iterable is exhausted. Roughly equivalent to:

def zip_longest(*args, fillvalue=None):
    # zip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D-
    iterators = [iter(it) for it in args]
    num_active = len(iterators)
    if not num_active:
        return
    while True:
        values = []
        for i, it in enumerate(iterators):
            try:
                value = next(it)
            except StopIteration:
                num_active -= 1
                if not num_active:
                    return
                iterators[i] = repeat(fillvalue)
                value = fillvalue
            values.append(value)
        yield tuple(values)

If one of the iterables is potentially infinite, then the zip_longest() function should be wrapped with something that limits the number of calls (for example islice() or takewhile()). If not specified, fillvalue defaults to None.

Itertools Recipes

This section shows recipes for creating an extended toolset using the existing itertools as building blocks.

Substantially all of these recipes and many, many others can be installed from the more-itertools project found on the Python Package Index:

pip install more-itertools

The extended tools offer the same high performance as the underlying toolset. The superior memory performance is kept by processing elements one at a time rather than bringing the whole iterable into memory all at once. Code volume is kept small by linking the tools together in a functional style which helps eliminate temporary variables. High speed is retained by preferring “vectorized” building blocks over the use of for-loops and generators which incur interpreter overhead.

def take(n, iterable):
    "Return first n items of the iterable as a list"
    return list(islice(iterable, n))

def prepend(value, iterator):
    "Prepend a single value in front of an iterator"
    # prepend(1, [2, 3, 4]) -> 1 2 3 4
    return chain([value], iterator)

def tabulate(function, start=0):
    "Return function(0), function(1), ..."
    return map(function, count(start))

def tail(n, iterable):
    "Return an iterator over the last n items"
    # tail(3, 'ABCDEFG') --> E F G
    return iter(collections.deque(iterable, maxlen=n))

def consume(iterator, n=None):
    "Advance the iterator n-steps ahead. If n is None, consume entirely."
    # Use functions that consume iterators at C speed.
    if n is None:
        # feed the entire iterator into a zero-length deque
        collections.deque(iterator, maxlen=0)
    else:
        # advance to the empty slice starting at position n
        next(islice(iterator, n, n), None)

def nth(iterable, n, default=None):
    "Returns the nth item or a default value"
    return next(islice(iterable, n, None), default)

def all_equal(iterable):
    "Returns True if all the elements are equal to each other"
    g = groupby(iterable)
    return next(g, True) and not next(g, False)

def quantify(iterable, pred=bool):
    "Count how many times the predicate is true"
    return sum(map(pred, iterable))

def padnone(iterable):
    """Returns the sequence elements and then returns None indefinitely.

    Useful for emulating the behavior of the built-in map() function.
    """
    return chain(iterable, repeat(None))

def ncycles(iterable, n):
    "Returns the sequence elements n times"
    return chain.from_iterable(repeat(tuple(iterable), n))

def dotproduct(vec1, vec2):
    return sum(map(operator.mul, vec1, vec2))

def flatten(listOfLists):
    "Flatten one level of nesting"
    return chain.from_iterable(listOfLists)

def repeatfunc(func, times=None, *args):
    """Repeat calls to func with specified arguments.

    Example:  repeatfunc(random.random)
    """
    if times is None:
        return starmap(func, repeat(args))
    return starmap(func, repeat(args, times))

def pairwise(iterable):
    "s -> (s0,s1), (s1,s2), (s2, s3), ..."
    a, b = tee(iterable)
    next(b, None)
    return zip(a, b)

def grouper(iterable, n, fillvalue=None):
    "Collect data into fixed-length chunks or blocks"
    # grouper('ABCDEFG', 3, 'x') --> ABC DEF Gxx"
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)

def roundrobin(*iterables):
    "roundrobin('ABC', 'D', 'EF') --> A D E B F C"
    # Recipe credited to George Sakkis
    num_active = len(iterables)
    nexts = cycle(iter(it).__next__ for it in iterables)
    while num_active:
        try:
            for next in nexts:
                yield next()
        except StopIteration:
            # Remove the iterator we just exhausted from the cycle.
            num_active -= 1
            nexts = cycle(islice(nexts, num_active))

def partition(pred, iterable):
    'Use a predicate to partition entries into false entries and true entries'
    # partition(is_odd, range(10)) --> 0 2 4 6 8   and  1 3 5 7 9
    t1, t2 = tee(iterable)
    return filterfalse(pred, t1), filter(pred, t2)

def powerset(iterable):
    "powerset([1,2,3]) --> () (1,) (2,) (3,) (1,2) (1,3) (2,3) (1,2,3)"
    s = list(iterable)
    return chain.from_iterable(combinations(s, r) for r in range(len(s)+1))

def unique_everseen(iterable, key=None):
    "List unique elements, preserving order. Remember all elements ever seen."
    # unique_everseen('AAAABBBCCDAABBB') --> A B C D
    # unique_everseen('ABBCcAD', str.lower) --> A B C D
    seen = set()
    seen_add = seen.add
    if key is None:
        for element in filterfalse(seen.__contains__, iterable):
            seen_add(element)
            yield element
    else:
        for element in iterable:
            k = key(element)
            if k not in seen:
                seen_add(k)
                yield element

def unique_justseen(iterable, key=None):
    "List unique elements, preserving order. Remember only the element just seen."
    # unique_justseen('AAAABBBCCDAABBB') --> A B C D A B
    # unique_justseen('ABBCcAD', str.lower) --> A B C A D
    return map(next, map(operator.itemgetter(1), groupby(iterable, key)))

def iter_except(func, exception, first=None):
    """ Call a function repeatedly until an exception is raised.

    Converts a call-until-exception interface to an iterator interface.
    Like builtins.iter(func, sentinel) but uses an exception instead
    of a sentinel to end the loop.

    Examples:
        iter_except(functools.partial(heappop, h), IndexError)   # priority queue iterator
        iter_except(d.popitem, KeyError)                         # non-blocking dict iterator
        iter_except(d.popleft, IndexError)                       # non-blocking deque iterator
        iter_except(q.get_nowait, Queue.Empty)                   # loop over a producer Queue
        iter_except(s.pop, KeyError)                             # non-blocking set iterator

    """
    try:
        if first is not None:
            yield first()            # For database APIs needing an initial cast to db.first()
        while True:
            yield func()
    except exception:
        pass

def first_true(iterable, default=False, pred=None):
    """Returns the first true value in the iterable.

    If no true value is found, returns *default*

    If *pred* is not None, returns the first item
    for which pred(item) is true.

    """
    # first_true([a,b,c], x) --> a or b or c or x
    # first_true([a,b], x, f) --> a if f(a) else b if f(b) else x
    return next(filter(pred, iterable), default)

def random_product(*args, repeat=1):
    "Random selection from itertools.product(*args, **kwds)"
    pools = [tuple(pool) for pool in args] * repeat
    return tuple(random.choice(pool) for pool in pools)

def random_permutation(iterable, r=None):
    "Random selection from itertools.permutations(iterable, r)"
    pool = tuple(iterable)
    r = len(pool) if r is None else r
    return tuple(random.sample(pool, r))

def random_combination(iterable, r):
    "Random selection from itertools.combinations(iterable, r)"
    pool = tuple(iterable)
    n = len(pool)
    indices = sorted(random.sample(range(n), r))
    return tuple(pool[i] for i in indices)

def random_combination_with_replacement(iterable, r):
    "Random selection from itertools.combinations_with_replacement(iterable, r)"
    pool = tuple(iterable)
    n = len(pool)
    indices = sorted(random.randrange(n) for i in range(r))
    return tuple(pool[i] for i in indices)

def nth_combination(iterable, r, index):
    'Equivalent to list(combinations(iterable, r))[index]'
    pool = tuple(iterable)
    n = len(pool)
    if r < 0 or r > n:
        raise ValueError
    c = 1
    k = min(r, n-r)
    for i in range(1, k+1):
        c = c * (n - k + i) // i
    if index < 0:
        index += c
    if index < 0 or index >= c:
        raise IndexError
    result = []
    while r:
        c, n, r = c*r//n, n-1, r-1
        while index >= c:
            index -= c
            c, n = c*(n-r)//n, n-1
        result.append(pool[-1-n])
    return tuple(result)


https://docs.python.org/3/library/typing.html



typing — Support for type hints

New in version 3.5.

Source code: Lib/typing.py

Note

The Python runtime does not enforce function and variable type annotations. They can be used by third party tools such as type checkers, IDEs, linters, etc.


This module provides runtime support for type hints as specified by PEP 484, PEP 526, PEP 544, PEP 586, PEP 589, and PEP 591. The most fundamental support consists of the types Any, Union, Tuple, Callable, TypeVar, and Generic. For full specification please see PEP 484. For a simplified introduction to type hints see PEP 483.

The function below takes and returns a string and is annotated as follows:

def greeting(name: str) -> str:
    return 'Hello ' + name

In the function greeting, the argument name is expected to be of type str and the return type str. Subtypes are accepted as arguments.

Type aliases

A type alias is defined by assigning the type to the alias. In this example, Vector and List[float] will be treated as interchangeable synonyms:

from typing import List
Vector = List[float]

def scale(scalar: float, vector: Vector) -> Vector:
    return [scalar * num for num in vector]

# typechecks; a list of floats qualifies as a Vector.
new_vector = scale(2.0, [1.0, -4.2, 5.4])

Type aliases are useful for simplifying complex type signatures. For example:

from typing import Dict, Tuple, Sequence

ConnectionOptions = Dict[str, str]
Address = Tuple[str, int]
Server = Tuple[Address, ConnectionOptions]

def broadcast_message(message: str, servers: Sequence[Server]) -> None:
    ...

# The static type checker will treat the previous type signature as
# being exactly equivalent to this one.
def broadcast_message(
        message: str,
        servers: Sequence[Tuple[Tuple[str, int], Dict[str, str]]]) -> None:
    ...

Note that None as a type hint is a special case and is replaced by type(None).

NewType

Use the NewType() helper function to create distinct types:

from typing import NewType

UserId = NewType('UserId', int)
some_id = UserId(524313)

The static type checker will treat the new type as if it were a subclass of the original type. This is useful in helping catch logical errors:

def get_user_name(user_id: UserId) -> str:
    ...

# typechecks
user_a = get_user_name(UserId(42351))

# does not typecheck; an int is not a UserId
user_b = get_user_name(-1)

You may still perform all int operations on a variable of type UserId, but the result will always be of type int. This lets you pass in a UserId wherever an int might be expected, but will prevent you from accidentally creating a UserId in an invalid way:

# 'output' is of type 'int', not 'UserId'
output = UserId(23413) + UserId(54341)

Note that these checks are enforced only by the static type checker. At runtime, the statement Derived = NewType('Derived', Base) will make Derived a function that immediately returns whatever parameter you pass it. That means the expression Derived(some_value) does not create a new class or introduce any overhead beyond that of a regular function call.

More precisely, the expression some_value is Derived(some_value) is always true at runtime.
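
That identity is easy to verify directly; a minimal sketch reusing the UserId example above:

from typing import NewType

UserId = NewType('UserId', int)

value = 524313
assert UserId(value) is value       # Derived(some_value) returns its argument
assert type(UserId(value)) is int   # no new class exists at runtime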

This also means that it is not possible to create a subtype of Derived since it is an identity function at runtime, not an actual type:

from typing import NewType

UserId = NewType('UserId', int)

# Fails at runtime and does not typecheck
class AdminUserId(UserId): pass

However, it is possible to create a NewType() based on a ‘derived’ NewType:

from typing import NewType

UserId = NewType('UserId', int)

ProUserId = NewType('ProUserId', UserId)

and typechecking for ProUserId will work as expected.

See PEP 484 for more details.

Note

Recall that the use of a type alias declares two types to be equivalent to one another. Doing Alias = Original will make the static type checker treat Alias as being exactly equivalent to Original in all cases. This is useful when you want to simplify complex type signatures.

In contrast, NewType declares one type to be a subtype of another. Doing Derived = NewType('Derived', Original) will make the static type checker treat Derived as a subclass of Original, which means a value of type Original cannot be used in places where a value of type Derived is expected. This is useful when you want to prevent logic errors with minimal runtime cost.

New in version 3.5.2.

Callable

Frameworks expecting callback functions of specific signatures might be type hinted using Callable[[Arg1Type, Arg2Type], ReturnType].

For example:

from typing import Callable

def feeder(get_next_item: Callable[[], str]) -> None:
    ...  # Body

def async_query(on_success: Callable[[int], None],
                on_error: Callable[[int, Exception], None]) -> None:
    ...  # Body

It is possible to declare the return type of a callable without specifying the call signature by substituting a literal ellipsis for the list of arguments in the type hint: Callable[..., ReturnType].
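
For instance, a framework that only cares about a callback's return type might be hinted as in this sketch (the function names are illustrative):

from typing import Callable

def run_now(callback: Callable[..., int]) -> int:
    # Any argument list is accepted; only the return type is declared.
    return callback()

run_now(lambda: 42)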

Generics

Since type information about objects kept in containers cannot be statically inferred in a generic way, abstract base classes have been extended to support subscription to denote expected types for container elements.

from typing import Mapping, Sequence

def notify_by_email(employees: Sequence[Employee],
                    overrides: Mapping[str, str]) -> None: ...

Generics can be parameterized by using a new factory available in typing called TypeVar.

from typing import Sequence, TypeVar

T = TypeVar('T')      # Declare type variable

def first(l: Sequence[T]) -> T:   # Generic function
    return l[0]

User-defined generic types

A user-defined class can be defined as a generic class.

from typing import TypeVar, Generic
from logging import Logger

T = TypeVar('T')

class LoggedVar(Generic[T]):
    def __init__(self, value: T, name: str, logger: Logger) -> None:
        self.name = name
        self.logger = logger
        self.value = value

    def set(self, new: T) -> None:
        self.log('Set ' + repr(self.value))
        self.value = new

    def get(self) -> T:
        self.log('Get ' + repr(self.value))
        return self.value

    def log(self, message: str) -> None:
        self.logger.info('%s: %s', self.name, message)

Generic[T] as a base class defines that the class LoggedVar takes a single type parameter T. This also makes T valid as a type within the class body.

The Generic base class defines __class_getitem__() so that LoggedVar[t] is valid as a type:

from typing import Iterable

def zero_all_vars(vars: Iterable[LoggedVar[int]]) -> None:
    for var in vars:
        var.set(0)

A generic type can have any number of type variables, and type variables may be constrained:

from typing import TypeVar, Generic
...

T = TypeVar('T')
S = TypeVar('S', int, str)

class StrangePair(Generic[T, S]):
    ...

Each type variable argument to Generic must be distinct. This is thus invalid:

from typing import TypeVar, Generic
...

T = TypeVar('T')

class Pair(Generic[T, T]):   # INVALID
    ...

You can use multiple inheritance with Generic:

from typing import TypeVar, Generic, Sized

T = TypeVar('T')

class LinkedList(Sized, Generic[T]):
    ...

When inheriting from generic classes, some type variables could be fixed:

from typing import TypeVar, Mapping

T = TypeVar('T')

class MyDict(Mapping[str, T]):
    ...

In this case MyDict has a single parameter, T.

Using a generic class without specifying type parameters assumes Any for each position. In the following example, MyIterable is not generic but implicitly inherits from Iterable[Any]:

from typing import Iterable

class MyIterable(Iterable):  # Same as Iterable[Any]
    ...

User-defined generic type aliases are also supported. Examples:

from typing import TypeVar, Iterable, Tuple, Union
S = TypeVar('S')
Response = Union[Iterable[S], int]

# Return type here is same as Union[Iterable[str], int]
def response(query: str) -> Response[str]:
    ...

T = TypeVar('T', int, float, complex)
Vec = Iterable[Tuple[T, T]]

def inproduct(v: Vec[T]) -> T: # Same as Iterable[Tuple[T, T]]
    return sum(x*y for x, y in v)

Changed in version 3.7: Generic no longer has a custom metaclass.

A user-defined generic class can have ABCs as base classes without a metaclass conflict. Generic metaclasses are not supported. The outcome of parameterizing generics is cached, and most types in the typing module are hashable and comparable for equality.

The Any type

A special kind of type is Any. A static type checker will treat every type as being compatible with Any and Any as being compatible with every type.

This means that it is possible to perform any operation or method call on a value of type Any and assign it to any variable:

from typing import Any

a = None    # type: Any
a = []      # OK
a = 2       # OK

s = ''      # type: str
s = a       # OK

def foo(item: Any) -> int:
    # Typechecks; 'item' could be any type,
    # and that type might have a 'bar' method
    item.bar()
    ...

Notice that no typechecking is performed when assigning a value of type Any to a more precise type. For example, the static type checker did not report an error when assigning a to s even though s was declared to be of type str and receives an int value at runtime!

Furthermore, all functions without a return type or parameter types will implicitly default to using Any:

def legacy_parser(text):
    ...
    return data

# A static type checker will treat the above
# as having the same signature as:
def legacy_parser(text: Any) -> Any:
    ...
    return data

This behavior allows Any to be used as an escape hatch when you need to mix dynamically and statically typed code.

Contrast the behavior of Any with the behavior of object. Similar to Any, every type is a subtype of object. However, unlike Any, the reverse is not true: object is not a subtype of every other type.

That means when the type of a value is object, a type checker will reject almost all operations on it, and assigning it to a variable (or using it as a return value) of a more specialized type is a type error. For example:

def hash_a(item: object) -> int:
    # Fails; an object does not have a 'magic' method.
    item.magic()
    ...

def hash_b(item: Any) -> int:
    # Typechecks
    item.magic()
    ...

# Typechecks, since ints and strs are subclasses of object
hash_a(42)
hash_a("foo")

# Typechecks, since Any is compatible with all types
hash_b(42)
hash_b("foo")

Use object to indicate that a value could be any type in a typesafe manner. Use Any to indicate that a value is dynamically typed.

Nominal vs structural subtyping

Initially PEP 484 defined the Python static type system as using nominal subtyping. This means that a class A is allowed where a class B is expected if and only if A is a subclass of B.

This requirement previously also applied to abstract base classes, such as Iterable. The problem with this approach is that a class had to be explicitly marked to support them, which is unpythonic and unlike what one would normally do in idiomatic dynamically typed Python code. For example, this conforms to PEP 484:

from typing import Sized, Iterable, Iterator

class Bucket(Sized, Iterable[int]):
    ...
    def __len__(self) -> int: ...
    def __iter__(self) -> Iterator[int]: ...

PEP 544 solves this problem by allowing users to write the above code without explicit base classes in the class definition, letting Bucket be implicitly considered a subtype of both Sized and Iterable[int] by static type checkers. This is known as structural subtyping (or static duck-typing):

from typing import Iterator, Iterable

class Bucket:  # Note: no base classes
    ...
    def __len__(self) -> int: ...
    def __iter__(self) -> Iterator[int]: ...

def collect(items: Iterable[int]) -> int: ...
result = collect(Bucket())  # Passes type check

Moreover, by subclassing a special class Protocol, a user can define new custom protocols to fully enjoy structural subtyping (see examples below).

Classes, functions, and decorators

The module defines the following classes, functions and decorators:

class typing.TypeVar

Type variable.

Usage:

T = TypeVar('T')  # Can be anything
A = TypeVar('A', str, bytes)  # Must be str or bytes

Type variables exist primarily for the benefit of static type checkers. They serve as the parameters for generic types as well as for generic function definitions. See class Generic for more information on generic types. Generic functions work as follows:

def repeat(x: T, n: int) -> Sequence[T]:
    """Return a list containing n references to x."""
    return [x]*n

def longest(x: A, y: A) -> A:
    """Return the longest of two strings."""
    return x if len(x) >= len(y) else y

The latter example’s signature is essentially the overloading of (str, str) -> str and (bytes, bytes) -> bytes. Also note that if the arguments are instances of some subclass of str, the return type is still plain str.

At runtime, isinstance(x, T) will raise TypeError. In general, isinstance() and issubclass() should not be used with types.
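
A minimal sketch of that runtime failure:

from typing import TypeVar

T = TypeVar('T')

try:
    isinstance(3, T)    # a TypeVar is not a runtime class
except TypeError:
    print('isinstance() does not accept a TypeVar')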

Type variables may be marked covariant or contravariant by passing covariant=True or contravariant=True. See PEP 484 for more details. By default type variables are invariant. Alternatively, a type variable may specify an upper bound using bound=<type>. This means that an actual type substituted (explicitly or implicitly) for the type variable must be a subclass of the boundary type, see PEP 484.

class typing.Generic

Abstract base class for generic types.

A generic type is typically declared by inheriting from an instantiation of this class with one or more type variables. For example, a generic mapping type might be defined as:

class Mapping(Generic[KT, VT]):
    def __getitem__(self, key: KT) -> VT:
        ...
        # Etc.

This class can then be used as follows:

X = TypeVar('X')
Y = TypeVar('Y')

def lookup_name(mapping: Mapping[X, Y], key: X, default: Y) -> Y:
    try:
        return mapping[key]
    except KeyError:
        return default

class typing.Protocol(Generic)

Base class for protocol classes. Protocol classes are defined like this:

class Proto(Protocol):
    def meth(self) -> int:
        ...

Such classes are primarily used with static type checkers that recognize structural subtyping (static duck-typing), for example:

class C:
    def meth(self) -> int:
        return 0

def func(x: Proto) -> int:
    return x.meth()

func(C())  # Passes static type check

See PEP 544 for details. Protocol classes decorated with runtime_checkable() (described later) act as simple-minded runtime protocols that check only the presence of given attributes, ignoring their type signatures.

Protocol classes can be generic, for example:

class GenProto(Protocol[T]):
    def meth(self) -> T:
        ...

New in version 3.8.

class typing.Type(Generic[CT_co])

A variable annotated with C may accept a value of type C. In contrast, a variable annotated with Type[C] may accept values that are classes themselves – specifically, it will accept the class object of C. For example:

a = 3         # Has type 'int'
b = int       # Has type 'Type[int]'
c = type(a)   # Also has type 'Type[int]'

Note that Type[C] is covariant:

class User: ...
class BasicUser(User): ...
class ProUser(User): ...
class TeamUser(User): ...

# Accepts User, BasicUser, ProUser, TeamUser, ...
def make_new_user(user_class: Type[User]) -> User:
    # ...
    return user_class()

The fact that Type[C] is covariant implies that all subclasses of C should implement the same constructor signature and class method signatures as C. The type checker should flag violations of this, but should also allow constructor calls in subclasses that match the constructor calls in the indicated base class. How the type checker is required to handle this particular case may change in future revisions of PEP 484.

The only legal parameters for Type are classes, Any, type variables, and unions of any of these types. For example:

def new_non_team_user(user_class: Type[Union[BasicUser, ProUser]]): ...

Type[Any] is equivalent to Type which in turn is equivalent to type, which is the root of Python’s metaclass hierarchy.

New in version 3.5.2.

class typing.Iterable(Generic[T_co])

A generic version of collections.abc.Iterable.

class typing.Iterator(Iterable[T_co])

A generic version of collections.abc.Iterator.

class typing.Reversible(Iterable[T_co])

A generic version of collections.abc.Reversible.

class typing.SupportsInt

An ABC with one abstract method __int__.

class typing.SupportsFloat

An ABC with one abstract method __float__.

class typing.SupportsComplex

An ABC with one abstract method __complex__.

class typing.SupportsBytes

An ABC with one abstract method __bytes__.

class typing.SupportsIndex

An ABC with one abstract method __index__.

New in version 3.8.

class typing.SupportsAbs

An ABC with one abstract method __abs__ that is covariant in its return type.

class typing.SupportsRound

An ABC with one abstract method __round__ that is covariant in its return type.

class typing.Container(Generic[T_co])

A generic version of collections.abc.Container.

class typing.Hashable

An alias to collections.abc.Hashable

class typing.Sized

An alias to collections.abc.Sized

class typing.Collection(Sized, Iterable[T_co], Container[T_co])

A generic version of collections.abc.Collection

New in version 3.6.0.

class typing.AbstractSet(Sized, Collection[T_co])

A generic version of collections.abc.Set.

class typing.MutableSet(AbstractSet[T])

A generic version of collections.abc.MutableSet.

class typing.Mapping(Sized, Collection[KT], Generic[VT_co])

A generic version of collections.abc.Mapping. This type can be used as follows:

def get_position_in_index(word_list: Mapping[str, int], word: str) -> int:
    return word_list[word]

class typing.MutableMapping(Mapping[KT, VT])

A generic version of collections.abc.MutableMapping.

class typing.Sequence(Reversible[T_co], Collection[T_co])

A generic version of collections.abc.Sequence.

class typing.MutableSequence(Sequence[T])

A generic version of collections.abc.MutableSequence.

class typing.ByteString(Sequence[int])

A generic version of collections.abc.ByteString.

This type represents the types bytes, bytearray, and memoryview.

As a shorthand for this type, bytes can be used to annotate arguments of any of the types mentioned above.

class typing.Deque(deque, MutableSequence[T])

A generic version of collections.deque.

New in version 3.5.4.

New in version 3.6.1.

class typing.List(list, MutableSequence[T])

Generic version of list. Useful for annotating return types. To annotate arguments it is preferred to use an abstract collection type such as Sequence or Iterable.

This type may be used as follows:

T = TypeVar('T', int, float)

def vec2(x: T, y: T) -> List[T]:
    return [x, y]

def keep_positives(vector: Sequence[T]) -> List[T]:
    return [item for item in vector if item > 0]

class typing.Set(set, MutableSet[T])

A generic version of builtins.set. Useful for annotating return types. To annotate arguments it is preferred to use an abstract collection type such as AbstractSet.

class typing.FrozenSet(frozenset, AbstractSet[T_co])

A generic version of builtins.frozenset.

class typing.MappingView(Sized, Iterable[T_co])

A generic version of collections.abc.MappingView.

class typing.KeysView(MappingView[KT_co], AbstractSet[KT_co])

A generic version of collections.abc.KeysView.

class typing.ItemsView(MappingView, Generic[KT_co, VT_co])

A generic version of collections.abc.ItemsView.

class typing.ValuesView(MappingView[VT_co])

A generic version of collections.abc.ValuesView.

class typing.Awaitable(Generic[T_co])

A generic version of collections.abc.Awaitable.

New in version 3.5.2.

class typing.Coroutine(Awaitable[V_co], Generic[T_co, T_contra, V_co])

A generic version of collections.abc.Coroutine. The variance and order of type variables correspond to those of Generator, for example:

from typing import List, Coroutine
c = None # type: Coroutine[List[str], str, int]
...
x = c.send('hi') # type: List[str]
async def bar() -> None:
    x = await c # type: int

New in version 3.5.3.

class typing.AsyncIterable(Generic[T_co])

A generic version of collections.abc.AsyncIterable.

New in version 3.5.2.

class typing.AsyncIterator(AsyncIterable[T_co])

A generic version of collections.abc.AsyncIterator.

New in version 3.5.2.

class typing.ContextManager(Generic[T_co])

A generic version of contextlib.AbstractContextManager.

New in version 3.5.4.

New in version 3.6.0.

class typing.AsyncContextManager(Generic[T_co])

A generic version of contextlib.AbstractAsyncContextManager.

New in version 3.5.4.

New in version 3.6.2.

class typing.Dict(dict, MutableMapping[KT, VT])

A generic version of dict. Useful for annotating return types. To annotate arguments it is preferred to use an abstract collection type such as Mapping.

This type can be used as follows:

def count_words(text: str) -> Dict[str, int]:
    ...

class typing.DefaultDict(collections.defaultdict, MutableMapping[KT, VT])

A generic version of collections.defaultdict.

New in version 3.5.2.

class typing.OrderedDict(collections.OrderedDict, MutableMapping[KT, VT])

A generic version of collections.OrderedDict.

New in version 3.7.2.

class typing.Counter(collections.Counter, Dict[T, int])

A generic version of collections.Counter.

New in version 3.5.4.

New in version 3.6.1.

class typing.ChainMap(collections.ChainMap, MutableMapping[KT, VT])

A generic version of collections.ChainMap.

New in version 3.5.4.

New in version 3.6.1.

class typing.Generator(Iterator[T_co], Generic[T_co, T_contra, V_co])

A generator can be annotated by the generic type Generator[YieldType, SendType, ReturnType]. For example:

def echo_round() -> Generator[int, float, str]:
    sent = yield 0
    while sent >= 0:
        sent = yield round(sent)
    return 'Done'

Note that unlike many other generics in the typing module, the SendType of Generator behaves contravariantly, not covariantly or invariantly.

If your generator will only yield values, set the SendType and ReturnType to None:

def infinite_stream(start: int) -> Generator[int, None, None]:
    while True:
        yield start
        start += 1

Alternatively, annotate your generator as having a return type of either Iterable[YieldType] or Iterator[YieldType]:

def infinite_stream(start: int) -> Iterator[int]:
    while True:
        yield start
        start += 1

class typing.AsyncGenerator(AsyncIterator[T_co], Generic[T_co, T_contra])

An async generator can be annotated by the generic type AsyncGenerator[YieldType, SendType]. For example:

async def echo_round() -> AsyncGenerator[int, float]:
    sent = yield 0
    while sent >= 0.0:
        rounded = await round(sent)
        sent = yield rounded

Unlike normal generators, async generators cannot return a value, so there is no ReturnType type parameter. As with Generator, the SendType behaves contravariantly.

If your generator will only yield values, set the SendType to None:

async def infinite_stream(start: int) -> AsyncGenerator[int, None]:
    while True:
        yield start
        start = await increment(start)

Alternatively, annotate your generator as having a return type of either AsyncIterable[YieldType] or AsyncIterator[YieldType]:

async def infinite_stream(start: int) -> AsyncIterator[int]:
    while True:
        yield start
        start = await increment(start)

New in version 3.6.1.

class typing.Text

Text is an alias for str. It is provided to supply a forward compatible path for Python 2 code: in Python 2, Text is an alias for unicode.

Use Text to indicate that a value must contain a unicode string in a manner that is compatible with both Python 2 and Python 3:

def add_unicode_checkmark(text: Text) -> Text:
    return text + u' \u2713'

New in version 3.5.2.

class typing.IO
class typing.TextIO
class typing.BinaryIO

Generic type IO[AnyStr] and its subclasses TextIO(IO[str]) and BinaryIO(IO[bytes]) represent the types of I/O streams such as those returned by open().

class typing.Pattern
class typing.Match

These type aliases correspond to the return types from re.compile() and re.match(). These types (and the corresponding functions) are generic in AnyStr and can be made specific by writing Pattern[str], Pattern[bytes], Match[str], or Match[bytes].
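
A short sketch of how these aliases are typically used (the pattern and function here are illustrative):

import re
from typing import Match, Optional, Pattern

WORD: Pattern[str] = re.compile(r'\w+')

def first_word(text: str) -> Optional[Match[str]]:
    # Pattern.search() returns None when nothing matches.
    return WORD.search(text)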

class typing.NamedTuple

Typed version of collections.namedtuple().

Usage:

class Employee(NamedTuple):
    name: str
    id: int

This is equivalent to:

Employee = collections.namedtuple('Employee', ['name', 'id'])

To give a field a default value, you can assign to it in the class body:

class Employee(NamedTuple):
    name: str
    id: int = 3

employee = Employee('Guido')
assert employee.id == 3

Fields with a default value must come after any fields without a default.

The resulting class has an extra attribute __annotations__ giving a dict that maps the field names to the field types. (The field names are in the _fields attribute and the default values are in the _field_defaults attribute both of which are part of the namedtuple API.)
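
Restating the Employee example above, these attributes can be inspected directly:

from typing import NamedTuple

class Employee(NamedTuple):
    name: str
    id: int = 3

assert Employee._fields == ('name', 'id')
assert Employee._field_defaults == {'id': 3}
assert Employee.__annotations__ == {'name': str, 'id': int}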

NamedTuple subclasses can also have docstrings and methods:

class Employee(NamedTuple):
    """Represents an employee."""
    name: str
    id: int = 3

    def __repr__(self) -> str:
        return f'<Employee {self.name}, id={self.id}>'

Backward-compatible usage:

Employee = NamedTuple('Employee', [('name', str), ('id', int)])

Changed in version 3.6: Added support for PEP 526 variable annotation syntax.

Changed in version 3.6.1: Added support for default values, methods, and docstrings.

Changed in version 3.8: Deprecated the _field_types attribute in favor of the more standard __annotations__ attribute which has the same information.

Changed in version 3.8: The _field_types and __annotations__ attributes are now regular dictionaries instead of instances of OrderedDict.

class typing.TypedDict(dict)

A simple typed namespace. At runtime it is equivalent to a plain dict.

TypedDict creates a dictionary type that expects all of its instances to have a certain set of keys, where each key is associated with a value of a consistent type. This expectation is not checked at runtime but is only enforced by type checkers. Usage:

class Point2D(TypedDict):
    x: int
    y: int
    label: str

a: Point2D = {'x': 1, 'y': 2, 'label': 'good'}  # OK
b: Point2D = {'z': 3, 'label': 'bad'}           # Fails type check

assert Point2D(x=1, y=2, label='first') == dict(x=1, y=2, label='first')

The type info for introspection can be accessed via Point2D.__annotations__ and Point2D.__total__. To allow using this feature with older versions of Python that do not support PEP 526, TypedDict supports two additional equivalent syntactic forms:

Point2D = TypedDict('Point2D', x=int, y=int, label=str)
Point2D = TypedDict('Point2D', {'x': int, 'y': int, 'label': str})

See PEP 589 for more examples and detailed rules of using TypedDict with type checkers.

New in version 3.8.

class typing.ForwardRef

A class used for internal typing representation of string forward references. For example, List["SomeClass"] is implicitly transformed into List[ForwardRef("SomeClass")]. This class should not be instantiated by a user, but may be used by introspection tools.

typing.NewType(name, tp)

A helper function to indicate a distinct type to a typechecker, see NewType. At runtime it returns a function that returns its argument. Usage:

UserId = NewType('UserId', int)
first_user = UserId(1)

New in version 3.5.2.

typing.cast(typ, val)

Cast a value to a type.

This returns the value unchanged. To the type checker this signals that the return value has the designated type, but at runtime we intentionally don’t check anything (we want this to be as fast as possible).
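
A minimal sketch of that behavior (the data here is illustrative): cast() performs no conversion or validation, so the "cast" value is the very same object.

from typing import List, cast

raw = ['1', '2', '3']

# Tell the checker to treat raw as List[int]; nothing happens at runtime,
# so the list still contains the original strings.
numbers = cast(List[int], raw)
assert numbers is raw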

typing.get_type_hints(obj[, globalns[, localns]])

Return a dictionary containing type hints for a function, method, module or class object.

This is often the same as obj.__annotations__. In addition, forward references encoded as string literals are handled by evaluating them in globals and locals namespaces. If necessary, Optional[t] is added for function and method annotations if a default value equal to None is set. For a class C, return a dictionary constructed by merging all the __annotations__ along C.__mro__ in reverse order.
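
A small sketch of the None-default rule (the function is illustrative):

from typing import Optional, get_type_hints

def fetch(url: str, timeout: int = None) -> bytes:
    ...

# Because timeout defaults to None, Optional[int] is added automatically.
assert get_type_hints(fetch) == {
    'url': str, 'timeout': Optional[int], 'return': bytes}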

typing.get_origin(tp)
typing.get_args(tp)

Provide basic introspection for generic types and special typing forms.

For a typing object of the form X[Y, Z, ...] these functions return X and (Y, Z, ...). If X is a generic alias for a builtin or collections class, it gets normalized to the original class. For unsupported objects return None and () correspondingly. Examples:

assert get_origin(Dict[str, int]) is dict
assert get_args(Dict[int, str]) == (int, str)

assert get_origin(Union[int, str]) is Union
assert get_args(Union[int, str]) == (int, str)

New in version 3.8.

@typing.overload

The @overload decorator allows describing functions and methods that support multiple different combinations of argument types. A series of @overload-decorated definitions must be followed by exactly one non-@overload-decorated definition (for the same function/method). The @overload-decorated definitions are for the benefit of the type checker only, since they will be overwritten by the non-@overload-decorated definition, while the latter is used at runtime but should be ignored by a type checker. At runtime, calling an @overload-decorated function directly will raise NotImplementedError. An example of overload that gives a more precise type than can be expressed using a union or a type variable:

@overload
def process(response: None) -> None:
    ...
@overload
def process(response: int) -> Tuple[int, str]:
    ...
@overload
def process(response: bytes) -> str:
    ...
def process(response):
    <actual implementation>

See PEP 484 for details and comparison with other typing semantics.

@typing.final

A decorator to indicate to type checkers that the decorated method cannot be overridden, and the decorated class cannot be subclassed. For example:

class Base:
    @final
    def done(self) -> None:
        ...
class Sub(Base):
    def done(self) -> None:  # Error reported by type checker
          ...

@final
class Leaf:
    ...
class Other(Leaf):  # Error reported by type checker
    ...

There is no runtime checking of these properties. See PEP 591 for more details.

New in version 3.8.

@typing.no_type_check

Decorator to indicate that annotations are not type hints.

This works as class or function decorator. With a class, it applies recursively to all methods defined in that class (but not to methods defined in its superclasses or subclasses).

This mutates the function(s) in place.

@typing.no_type_check_decorator

Decorator to give another decorator the no_type_check() effect.

This wraps the decorator with something that wraps the decorated function in no_type_check().

@typing.type_check_only

Decorator to mark a class or function to be unavailable at runtime.

This decorator is itself not available at runtime. It is mainly intended to mark classes that are defined in type stub files if an implementation returns an instance of a private class:

@type_check_only
class Response:  # private or not available at runtime
    code: int
    def get_header(self, name: str) -> str: ...

def fetch_response() -> Response: ...

Note that returning instances of private classes is not recommended. It is usually preferable to make such classes public.

@typing.runtime_checkable

Mark a protocol class as a runtime protocol.

Such a protocol can be used with isinstance() and issubclass(). This raises TypeError when applied to a non-protocol class. This allows a simple-minded structural check, very similar to “one trick ponies” in collections.abc such as Iterable. For example:

@runtime_checkable
class Closable(Protocol):
    def close(self): ...

assert isinstance(open('/some/file'), Closable)

Warning: this will check only the presence of the required methods, not their type signatures!

New in version 3.8.

typing.Any

Special type indicating an unconstrained type.

  • Every type is compatible with Any.

  • Any is compatible with every type.

typing.NoReturn

Special type indicating that a function never returns. For example:

from typing import NoReturn

def stop() -> NoReturn:
    raise RuntimeError('no way')

New in version 3.5.4.

New in version 3.6.2.

typing.Union

Union type; Union[X, Y] means either X or Y.

To define a union, use e.g. Union[int, str]. Details:

  • The arguments must be types and there must be at least one.

  • Unions of unions are flattened, e.g.:

    Union[Union[int, str], float] == Union[int, str, float]
    
  • Unions of a single argument vanish, e.g.:

    Union[int] == int  # The constructor actually returns int
    
  • Redundant arguments are skipped, e.g.:

    Union[int, str, int] == Union[int, str]
    
  • When comparing unions, the argument order is ignored, e.g.:

    Union[int, str] == Union[str, int]
    
  • You cannot subclass or instantiate a union.

  • You cannot write Union[X][Y].

  • You can use Optional[X] as a shorthand for Union[X, None].

Changed in version 3.7: Don’t remove explicit subclasses from unions at runtime.

typing.Optional

Optional type.

Optional[X] is equivalent to Union[X, None].

Note that this is not the same concept as an optional argument, which is one that has a default. An optional argument with a default does not require the Optional qualifier on its type annotation just because it is optional. For example:

def foo(arg: int = 0) -> None:
    ...

On the other hand, if an explicit value of None is allowed, the use of Optional is appropriate, whether the argument is optional or not. For example:

def foo(arg: Optional[int] = None) -> None:
    ...

typing.Tuple

Tuple type; Tuple[X, Y] is the type of a tuple of two items with the first item of type X and the second of type Y. The type of the empty tuple can be written as Tuple[()].

Example: Tuple[T1, T2] is a tuple of two elements corresponding to type variables T1 and T2. Tuple[int, float, str] is a tuple of an int, a float and a string.

To specify a variable-length tuple of homogeneous type, use literal ellipsis, e.g. Tuple[int, ...]. A plain Tuple is equivalent to Tuple[Any, ...], and in turn to tuple.
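
A few illustrative annotations covering these cases:

from typing import Tuple

point: Tuple[float, float] = (1.0, 2.0)   # fixed length, fixed types
scores: Tuple[int, ...] = (98, 85, 72)    # homogeneous, variable length
empty: Tuple[()] = ()                     # the empty tuple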

typing.Callable

Callable type; Callable[[int], str] is a function of (int) -> str.

The subscription syntax must always be used with exactly two values: the argument list and the return type. The argument list must be a list of types or an ellipsis; the return type must be a single type.

There is no syntax to indicate optional or keyword arguments; such function types are rarely used as callback types. Callable[..., ReturnType] (literal ellipsis) can be used to type hint a callable taking any number of arguments and returning ReturnType. A plain Callable is equivalent to Callable[..., Any], and in turn to collections.abc.Callable.

typing.Literal

A type that can be used to indicate to type checkers that the corresponding variable or function parameter has a value equivalent to the provided literal (or one of several literals). For example:

def validate_simple(data: Any) -> Literal[True]:  # always returns True
    ...

MODE = Literal['r', 'rb', 'w', 'wb']
def open_helper(file: str, mode: MODE) -> str:
    ...

open_helper('/some/path', 'r')  # Passes type check
open_helper('/other/path', 'typo')  # Error in type checker

Literal[...] cannot be subclassed. At runtime, an arbitrary value is allowed as type argument to Literal[...], but type checkers may impose restrictions. See PEP 586 for more details about literal types.

New in version 3.8.

typing.ClassVar

Special type construct to mark class variables.

As introduced in PEP 526, a variable annotation wrapped in ClassVar indicates that a given attribute is intended to be used as a class variable and should not be set on instances of that class. Usage:

class Starship:
    stats: ClassVar[Dict[str, int]] = {} # class variable
    damage: int = 10                     # instance variable

ClassVar accepts only types and cannot be further subscribed.

ClassVar is not a class itself, and should not be used with isinstance() or issubclass(). ClassVar does not change Python runtime behavior, but it can be used by third-party type checkers. For example, a type checker might flag the following code as an error:

enterprise_d = Starship(3000)
enterprise_d.stats = {} # Error, setting class variable on instance
Starship.stats = {}     # This is OK

New in version 3.5.3.

typing.Final

A special typing construct to indicate to type checkers that a name cannot be re-assigned or overridden in a subclass. For example:

MAX_SIZE: Final = 9000
MAX_SIZE += 1  # Error reported by type checker

class Connection:
    TIMEOUT: Final[int] = 10

class FastConnector(Connection):
    TIMEOUT = 1  # Error reported by type checker

There is no runtime checking of these properties. See PEP 591 for more details.

New in version 3.8.

typing.AnyStr

AnyStr is a type variable defined as AnyStr = TypeVar('AnyStr', str, bytes).

It is meant to be used for functions that may accept any kind of string without allowing different kinds of strings to mix. For example:

def concat(a: AnyStr, b: AnyStr) -> AnyStr:
    return a + b

concat(u"foo", u"bar")  # Ok, output has type 'unicode'
concat(b"foo", b"bar")  # Ok, output has type 'bytes'
concat(u"foo", b"bar")  # Error, cannot mix unicode and bytes

typing.TYPE_CHECKING

A special constant that is assumed to be True by 3rd party static type checkers. It is False at runtime. Usage:

if TYPE_CHECKING:
    import expensive_mod

def fun(arg: 'expensive_mod.SomeType') -> None:
    local_var: expensive_mod.AnotherType = other_fun()

Note that the first type annotation must be enclosed in quotes, making it a “forward reference”, to hide the expensive_mod reference from the interpreter runtime. Type annotations for local variables are not evaluated, so the second annotation does not need to be enclosed in quotes.

New in version 3.5.2.


https://spoqa.github.io/2019/02/15/python-timezone.html


Hello, I'm Duri Kim, a creator at Spoqa.

Spoqa provides internationalized services in many of its products, so handling time zones and times correctly and precisely matters to us. However, Python's datetime.datetime holds date (datetime.date) and time (datetime.time) information, and may or may not hold time zone (datetime.timezone) information, which leaves room for confusion.

  • Why do time zones matter when handling time? A time without an explicit time zone does not carry enough information. A while ago I ran into a headache while working with the Google Calendar API. I wanted to fetch today's events, so I requested the range from 00:00 to 24:00 today, but the response kept including the next day's events as well.
  • Why were the next day's events included? I had sent the request to the Google Calendar API with code like the following.
today = datetime.date.today()
from_ = datetime.datetime(today.year, today.month, today.day, 0, 0, 0)
to = datetime.datetime(today.year, today.month, today.day, 23, 59, 59)
events = get_events_from_google_calendar(from_, to)

After several hours of poring over the code line by line, I finally found the cause: because the times in my request had no time zone attached, from_ and to were interpreted inside get_events_from_google_calendar() in a time zone I had not intended.

# The intended meaning: today at 00:00:00 in the South Korean time zone (KST)
KST = datetime.timezone(datetime.timedelta(hours=9))
from1 = datetime.datetime(today.year, today.month, today.day, 0, 0, 0,
                          tzinfo=KST)
# What get_events_from_google_calendar() assumed: today at 00:00:00 in UTC
from2 = datetime.datetime(today.year, today.month, today.day, 0, 0, 0,
                          tzinfo=datetime.timezone.utc)

In the example above, from2 - from1 evaluates to timedelta(hours=9). What we wanted were the events from midnight today in KST, but the Google Calendar API treated the times as UTC and therefore fetched the events from 9:00 today to 9:00 the next day, KST.
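
The nine-hour gap is easy to verify directly; a minimal sketch of the snippet above:

import datetime

today = datetime.date.today()
KST = datetime.timezone(datetime.timedelta(hours=9))

from1 = datetime.datetime(today.year, today.month, today.day, tzinfo=KST)
from2 = datetime.datetime(today.year, today.month, today.day,
                          tzinfo=datetime.timezone.utc)

# Midnight UTC falls nine hours after midnight KST on the same calendar day.
assert from2 - from1 == datetime.timedelta(hours=9)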

As this shows, when working with time, not properly understanding time zones can cost you a lot of time in unintended ways.

Today I'd like to share the notes I have collected while handling time zones in Python.

Time zones

Different countries and regions live on different clocks, so offsets between time zones exist. Even if that difference is rarely felt in daily life, you will face time zone problems as soon as you work on anything time-related, such as a calendar API or preparing an internationalized service.

A time zone is a division of time devised to artificially adjust for the regional differences in day and night caused by the Earth's rotation, measured from the Royal Greenwich Observatory in England (the prime meridian, longitude 0°). A time zone is expressed as an offset relative to Coordinated Universal Time (UTC).

  • For more details on UTC, see here.
  • For more details on time zones, see here.

Python's datetime.datetime.now() displays the time according to the time zone of the environment it runs in.

At 2019-01-01 00:00:00 +09:00, fetching the current time on my laptop, whose time zone is set to Asia/Seoul, displays the following:

>>> print(datetime.datetime.now())
2019-01-01 00:00:00.000000

Yet at the same moment, a laptop set to Asia/Taipei displays the current time like this:

>>> print(datetime.datetime.now())
2018-12-31 23:00:00.000000

As the example above shows, the displayed time can differ depending on the time zone.

Comparing time zones by country

Relative to UTC, a zone that is ahead is written with a positive offset and one that is behind with a negative offset.

Time zone   Country                   Code
UTC-5       United States (Eastern)   EST
UTC         United Kingdom            GMT
UTC+8       Taiwan                    TW
UTC+9       South Korea               KST
UTC+9       Japan                     JST
UTC+10      Australia (Eastern)       AEST

  • For more details on time zone differences by country, see here.

A time whose time zone is not clearly stated can cause confusion. For example, suppose a store owner living in Seoul wants to know which customers visited at 0:00 on January 1, 2019. In Python, that data can be written as follows:

KST = datetime.timezone(datetime.timedelta(hours=9))
korea_1_1 = datetime.datetime(2019, 1, 1, 0, 0, 0, tzinfo=KST)

If a store owner living in Taiwan requested the same thing, it could be written like this:

TW = datetime.timezone(datetime.timedelta(hours=8))
taipei_1_1 = datetime.datetime(2019, 1, 1, 0, 0, 0, tzinfo=TW)

As the example shows, even though the owners in South Korea and Taiwan asked about the same wall-clock time, the requests must be handled separately according to their time zones (KST/TW).

assert korea_1_1 != taipei_1_1
assert taipei_1_1 - korea_1_1 == datetime.timedelta(hours=1)  # same clock reading, but one hour apart between the zones

That is why the bare information "2019-01-01", without a time zone, cannot identify an exact moment.

naive_1_1 = datetime.datetime(2019, 1, 1, 0, 0, 0)
assert korea_1_1 != naive_1_1
assert taipei_1_1 != naive_1_1

To resolve this, a time should be expressed as a difference from a single agreed-upon reference, and that reference is UTC. South Korea is nine hours ahead of UTC, so korea_1_1 expressed in UTC is 2018-12-31 15:00:00+00:00. Taiwan is eight hours ahead of UTC, so taipei_1_1 expressed in UTC is 2018-12-31 16:00:00+00:00. Those moments can in turn be written as 2019-01-01 00:00:00+09:00 (South Korea) and 2019-01-01 00:00:00+08:00 (Taiwan). Displayed together with their time zones like this, the times can be handled without any confusion.
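
These conversions can be checked with astimezone(); a minimal sketch reusing the objects above:

import datetime

KST = datetime.timezone(datetime.timedelta(hours=9))
TW = datetime.timezone(datetime.timedelta(hours=8))
UTC = datetime.timezone.utc

korea_1_1 = datetime.datetime(2019, 1, 1, tzinfo=KST)
taipei_1_1 = datetime.datetime(2019, 1, 1, tzinfo=TW)

assert korea_1_1.astimezone(UTC) == datetime.datetime(2018, 12, 31, 15, tzinfo=UTC)
assert taipei_1_1.astimezone(UTC) == datetime.datetime(2018, 12, 31, 16, tzinfo=UTC)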

datetime

datetime is a standard library that ships with Python, providing classes for manipulating dates and times in both simple and complex ways.

The datetime module supplies classes for manipulating dates and times in both simple and complex ways.

datetime objects come in two kinds, depending on whether they carry a time zone: naive and aware.

naive datetime / aware datetime

Let's look at the types of datetime. When doing time arithmetic in Python, you will occasionally run into an error message like this:

>>> a = datetime.datetime.now()
>>> b = datetime.datetime.now(datetime.timezone.utc)
>>> a - b
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: can't subtract offset-naive and offset-aware datetimes

  • naive datetime: a naive datetime object does not contain enough information to determine its own time zone. (e.g. datetime.datetime(2019, 2, 15, 4, 58, 4, 114979))

  • aware datetime (timezone-aware): includes time zone information. (e.g. datetime.datetime(2019, 2, 15, 4, 58, 4, 114979, tzinfo=<UTC>)) An aware datetime object carries time zone, daylight saving time policy, or other applicable algorithmic information, so it can position itself relative to other aware datetime objects.

tzinfo describes how to obtain the local time's offset from UTC, the time zone name, and the DST offset. For more details, see the official documentation.

Because a naive datetime is ambiguous about which time zone it is based on, using aware datetimes is recommended.

Trying it out

Let's go through a few prepared snippets. We'll confirm the difference between naive and aware datetimes, then cover how to specify time zones.

Environment

Here we use the pytz library to make datetime easier to work with. pytz has the following advantages:

  1. You can specify a time zone by a human-readable region name instead of a numeric offset.
  2. It provides a localize() method that converts a naive datetime into an aware datetime in the desired time zone.

Before using pytz, to see the time zone identifiers it provides, run the following:

import pytz

for tz in pytz.all_timezones:
    print(tz)

Alternatively, you can refer to the list here.

naive datetime

A naive datetime holds only a date and a time.

import datetime

datetime.datetime.utcnow()
# naive datetime in UTC : datetime.datetime(2019, 2, 15, 4, 54, 29, 281594)

datetime.datetime.now()
# naive datetime in the environment's local time zone : datetime.datetime(2019, 2, 15, 13, 54, 32, 939155)

aware datetime

Unlike a naive datetime, an aware datetime also carries time zone information (tzinfo).

import datetime
from pytz import utc

utc.localize(datetime.datetime.utcnow())
# aware datetime in UTC : datetime.datetime(2019, 2, 15, 4, 55, 3, 310474, tzinfo=<UTC>)

The now below holds the current time in UTC, but it is still a naive time:

now = datetime.datetime.utcnow()

Because this time is naive, it is not equal to the timezone-aware time produced from it by pytz.timezone.localize():

assert now != utc.localize(now)

Specifying the time zone properly

Now that you know what a time zone is and why stating it matters, let's look at how to set the time zone you actually intend.

import datetime
from pytz import timezone, utc

KST = timezone('Asia/Seoul')

now = datetime.datetime.utcnow()
# naive datetime in UTC : datetime.datetime(2019, 2, 15, 4, 18, 28, 805879)

utc.localize(now)
# aware datetime in UTC : datetime.datetime(2019, 2, 15, 4, 18, 28, 805879, tzinfo=<UTC>)

KST.localize(now)
# the UTC clock reading, merely labeled KST : datetime.datetime(2019, 2, 15, 4, 18, 28, 805879, tzinfo=<DstTzInfo 'Asia/Seoul' KST+9:00:00 STD>)

utc.localize(now).astimezone(KST)
# aware datetime in KST : datetime.datetime(2019, 2, 15, 13, 18, 28, 805879, tzinfo=<DstTzInfo 'Asia/Seoul' KST+9:00:00 STD>)

The replace() method lets you change a date/time field or the tzinfo.

KST = timezone('Asia/Seoul')
TW = timezone('Asia/Taipei')

date = datetime.datetime.now()
# datetime.datetime(2019, 2, 15, 13, 59, 44, 872224)

date.replace(hour=10)  # change only the hour
# datetime.datetime(2019, 2, 15, 10, 59, 44, 872224)

date.replace(tzinfo=KST)  # change only tzinfo
# datetime.datetime(2019, 2, 15, 13, 59, 44, 872224, tzinfo=<DstTzInfo 'Asia/Seoul' LMT+8:28:00 STD>)

date.replace(tzinfo=TW)  # change only tzinfo
# datetime.datetime(2019, 2, 15, 13, 59, 44, 872224, tzinfo=<DstTzInfo 'Asia/Taipei' LMT+8:06:00 STD>)

However, replace() swaps just that one attribute and nothing else, so it must be used with care.

now = datetime.datetime.utcnow()

assert utc.localize(now) == now.replace(tzinfo=utc)
assert KST.localize(now) != now.replace(tzinfo=KST)
assert TW.localize(now) != now.replace(tzinfo=TW)

Moreover, using replace() can leave you with an unintended time zone. Here is why:

  • Time zones change more often than you might think (see Spoqa's rule 2 below for details). These changes are recorded in the tz database, on which pytz is based. pytz versions are dated, e.g. 2018.9, meaning that version builds time zones from the time zone tables as of September 2018; in that version the time zone of Asia/Seoul is UTC+9.
  • For whatever reason, when pytz zones are used through datetime.replace() or datetime.astimezone(), pytz converts using the very first (oldest) entry in that tz database table. For Seoul that was initially UTC+8:28, so the conversion is based on that offset.

So when using pytz, always go through pytz.timezone.localize(); and if you want to use Python's standard methods such as .astimezone(), use datetime.timezone instead.
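
In short, a safe pattern looks like this sketch (the fixed +9:00 offset stands in for Asia/Seoul, which has been UTC+9 since 1961):

import datetime
from pytz import timezone

seoul = timezone('Asia/Seoul')
naive = datetime.datetime(2019, 2, 15, 13, 0, 0)  # a Seoul wall-clock time

# With pytz, attach the zone via localize(), never via replace():
aware_pytz = seoul.localize(naive)   # correct: +09:00
wrong = naive.replace(tzinfo=seoul)  # wrong: uses the oldest tz database
                                     # entry for Seoul, LMT+8:28

# With the standard library, a fixed-offset datetime.timezone is safe
# to use with replace() and astimezone():
KST = datetime.timezone(datetime.timedelta(hours=9))
aware_std = naive.replace(tzinfo=KST)
assert aware_pytz == aware_std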

Spoqa's rules

At Spoqa there are two main rules we commonly follow when handling datetime.

1. Never use naive datetimes.

The biggest reason is that naive and aware datetimes cannot be mixed with each other:

>>> from datetime import datetime, timezone
>>> datetime.utcnow() - datetime.now(tz=timezone.utc)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: can't subtract offset-naive and offset-aware datetimes

Since even isinstance(), the simplest type-checking tool available in a dynamically typed language, cannot tell the two apart, once naive datetimes start leaking into some part of the code, the odds of bugs surfacing at unexpected points rise sharply. Think of it as similar to why str and unicode must not be mixed in Python 2.

2. Always store long-lived datetimes in UTC.

Local time zones change more often than you might think, for geographic or political reasons. For example, Korea used UTC+08:30 as its local time zone until 1961, and observed daylight saving time around the 1988 Olympics. The time zone database (the tz database) records these changes, and the behavior of the time zone objects pytz provides reflects them. Consequently, if the time zone database is not updated in time, or a sudden time zone change reaches the database late, time calculations can drift. Arithmetic between aware datetimes in different time zones complicates matters further, and when talking to a DB or another service's API, the other side may use only fixed UTC-offset time zones rather than take on the complexity of full time zone handling, so the two sides' levels of support may not match, causing trouble.

One good rule for reducing confusion, much like the discipline around str and unicode, is to use only UTC-based aware datetimes for all internal computation and to convert to the required time zone only when presenting times to the user.
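
A sketch of that rule (the names are illustrative):

import datetime

UTC = datetime.timezone.utc
KST = datetime.timezone(datetime.timedelta(hours=9))

# Compute and store in UTC...
created_at = datetime.datetime.now(UTC)

# ...and convert only at the presentation boundary.
print(created_at.astimezone(KST).strftime('%Y-%m-%d %H:%M:%S%z'))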

At Spoqa, the main server's dodo.datetime utility module follows these rules, and most SQLAlchemy DB model objects enable the timezone=True option on their DateTime columns.

Summary

If you work with time, please keep the following in mind:

  1. Always specify the time zone.
  2. Store times in application logic and in the database as UTC, and convert to the user's time zone only when displaying them.
    • If backend servers also assume UTC when talking to each other, handling stays robust even when a time zone is omitted.




Adding Dates and Times in Python

Mon, Oct 19, 2009

Tech Tips

Using the built-in datetime module and its timedelta class, you can perform date and time addition and subtraction in Python:

from datetime import datetime
from datetime import timedelta

# Add 1 day
print(datetime.now() + timedelta(days=1))

# Subtract 60 seconds
print(datetime.now() - timedelta(seconds=60))

# Add 2 years (timedelta has no years parameter; 730 days is an
# approximation that ignores leap days)
print(datetime.now() + timedelta(days=730))

# Other parameters you can pass to timedelta:
# days, seconds, microseconds,
# milliseconds, minutes, hours, weeks

# Pass multiple parameters (1 day and 5 minutes)
print(datetime.now() + timedelta(days=1, minutes=5))

Here is a Python reference that gives more examples and advanced features:
http://docs.python.org/library/datetime.html

If you are coming from a .NET or SQL environment, here are the above examples in C# and SQL (Microsoft) for comparison:
C#

DateTime myTime = new DateTime();

// Add 1 day (DateTime is immutable, so the result must be assigned)
myTime = myTime.AddDays(1);

// Subtract 60 seconds
myTime = myTime.AddSeconds(-60);

// Add 2 years
myTime = myTime.AddYears(2);

SQL

--Add 1 day
select DATEADD(day, 1, getdate())

--Subtract 60 seconds
select DATEADD(second, -60, getdate())

--Add 2 years
select DATEADD(Year, 2, getdate())

