Effective Python #1 | Pythonic Thinking


Introduction

  • The Pythonic style isn’t regimented or enforced by the compiler. It has emerged over time through experience using the language and working with others.

    • Python programmers prefer to be explicit, to choose simple over complex, and to maximize readability.


Item 1. Know Which Version of Python You’re Using

  1. Throughout this book, the majority of example code is in the syntax of Python 3.7 (released in June 2018).

  2. This book also provides some examples in the syntax of Python 3.8 (released in October 2019).
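
To confirm which interpreter is actually running, the sys module can be inspected at runtime. A minimal sketch; the `supports_walrus` name and the 3.8 threshold are illustrative assumptions, chosen because assignment expressions (Item 10) need Python 3.8 or later:

```python
import sys

# Human-readable version string and a structured tuple for comparisons.
print(sys.version)
print(sys.version_info)

# Hypothetical guard: tuple comparison works element by element.
supports_walrus = sys.version_info >= (3, 8)
print(supports_walrus)
```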



Item 2. Follow the PEP 8 Style Guide

  1. Always follow the Python Enhancement Proposal #8 (PEP 8) style guide when writing Python code.

  2. Sharing a common style with the larger Python community facilitates collaboration with others.

  3. Using a consistent style makes it easier to modify your own code later.

  • Whitespace

    • In a file, functions and classes should be separated by two blank lines.

    • In a class, methods should be separated by one blank line.

  • Naming

    • Protected instance attributes should be in _leading_underscore format.

    • Private instance attributes should be in __double_leading_underscore format.

  • Imports

    • Imports should be in sections in the following order: standard library modules, third-party modules, your own modules. Each subsection should have imports in alphabetical order.
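
The whitespace, naming, and import rules above can be seen together in one small module. This is only a sketch: the `Thermostat` class and `describe` function are hypothetical names invented for illustration, and only standard library imports are used, so the third-party and local import sections are empty here.

```python
# Standard library imports come first, in alphabetical order.
import json
import math


class Thermostat:
    """Hypothetical class used only to illustrate PEP 8 layout."""

    def __init__(self, target):
        self._target = target   # protected: single leading underscore
        self.__offset = 0.5     # private: double leading underscore

    def reading(self):
        return self._target + self.__offset

    def rounded(self):
        return math.floor(self.reading())


def describe(thermostat):
    return json.dumps({'rounded': thermostat.rounded()})


print(describe(Thermostat(20)))  # {"rounded": 20}
```

Methods inside the class are separated by one blank line, while the class and the top-level function are separated by two.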


Item 3. Know the Differences Between bytes and str

  1. bytes contains sequences of 8-bit values, and str contains sequences of Unicode code points.

    • str.encode() : convert Unicode data to binary data

    • bytes.decode() : convert binary data to Unicode data

a = b'h\x65llo'
print(list(a)) # [104, 101, 108, 108, 111]
print(a) # b'hello'

a = 'a\u0300 propos'
print(list(a)) # ['a', '̀', ' ', 'p', 'r', 'o', 'p', 'o', 's']; the second item is U+0300 (combining grave accent)
print(a) # à propos
  1. Use helper functions to ensure that the inputs you operate on are the type of character sequence that you expect (8-bit values, UTF-8-encoded strings, Unicode code points, etc.).
# Always returns a str instance
def to_str(bytes_or_str):
    if isinstance(bytes_or_str, bytes):
        value = bytes_or_str.decode('utf-8')
    else:
        value = bytes_or_str
    return value  # Instance of str

# Always returns a bytes instance
def to_bytes(bytes_or_str):
    if isinstance(bytes_or_str, str):
        value = bytes_or_str.encode('utf-8')
    else:
        value = bytes_or_str
    return value  # Instance of bytes
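  1. bytes and str instances can’t be used together with operators (like >, ==, +, and %). Mixing them with >, +, or % raises TypeError, while == never raises but always evaluates to False, even when the characters look identical.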

  2. If you want to read or write binary data to/from a file, always open the file using a binary mode (like 'rb' or 'wb').

  3. If you want to read or write Unicode data to/from a file, be careful about your system’s default text encoding. Explicitly pass the encoding parameter to open if you want to avoid surprises.

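    • UnicodeDecodeError: 'utf-8' codec can't decode byte ... : this error appears when a file holding binary data is opened in text read mode ('r') instead of binary read mode ('rb'). In text mode, the handle decodes the bytes with the system’s default text encoding (UTF-8 in this case, but it varies by platform), which fails on byte sequences that aren’t valid in that encoding.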
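
The operator and file-mode points above can be checked with a short experiment. This is a sketch under assumptions: the scratch file `data.bin` in a temporary directory and the cp1252 encoding are picked only for illustration.

```python
import os
import tempfile

# Mixing bytes and str with + raises TypeError.
mixed_error = None
try:
    b'one' + 'two'
except TypeError as e:
    mixed_error = e

# == does not raise; it simply evaluates to False.
same = (b'foo' == 'foo')

# Hypothetical scratch file used for the demonstration.
path = os.path.join(tempfile.mkdtemp(), 'data.bin')

# Binary data: write with 'wb' and read back with 'rb'.
with open(path, 'wb') as f:
    f.write(b'\xf1\xf2\xf3\xf4\xf5')
with open(path, 'rb') as f:
    raw = f.read()

# Text data: pass encoding explicitly instead of relying on the default.
with open(path, 'r', encoding='cp1252') as f:
    text = f.read()

print(type(mixed_error).__name__, same, raw, text)
```

Reading the same file with `open(path, 'r', encoding='utf-8')` would raise the UnicodeDecodeError described above, because these bytes are not valid UTF-8.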


Item 4. Prefer Interpolated F-Strings Over C-style Format Strings and str.format

  1. C-style format strings that use the % operator suffer from a variety of gotchas and verbosity problems.

  2. The str.format method introduces some useful concepts in its formatting specifiers mini language, but it otherwise repeats the mistakes of C-style format strings and should be avoided.

  3. F-strings (added in Python 3.6) are a newer syntax for formatting values into strings that solves the biggest problems with C-style format strings.

  4. F-strings are succinct yet powerful because they allow for arbitrary Python expressions to be directly embedded within format specifiers.
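
A side-by-side comparison of the three approaches, as a small sketch. The variable names `key`, `value`, and `places` are made up for the example.

```python
key = 'my_var'
value = 1.234

# The same output produced three ways.
c_style = '%-10s = %.2f' % (key, value)
with_format = '{:<10} = {:.2f}'.format(key, value)
f_string = f'{key:<10} = {value:.2f}'

# Expressions can appear inside the braces, including in the format spec.
places = 3
nested = f'{key.upper()!r} has value {value * 2:.{places}f}'

print(f_string)
print(nested)
```

The f-string names each value exactly where it is used, so there is no separate argument list to keep in sync with the placeholders.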



Item 5. Write Helper Functions Instead of Complex Expressions

  1. Python’s syntax makes it easy to write single-line expressions that are overly complicated and difficult to read.

  2. Move complex expressions into helper functions, especially if you need to use the same logic repeatedly.

  3. An if/else expression provides a more readable alternative to using the Boolean operators 'or' and 'and' in expressions.

# before
red = my_values.get('red', [''])[0] or 0
green = my_values.get('green', [''])[0] or 0
opacity = my_values.get('opacity', [''])[0] or 0

# after
def get_first_int(values, key, default=0):
    found = values.get(key, [''])
    if found[0]:
        return int(found[0])
    return default
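
A usage sketch for the helper, assuming `my_values` comes from parsing a URL query string with `urllib.parse.parse_qs` (the query string itself is invented for illustration). The last two lines show the if/else expression form from point 3 for a one-off case.

```python
from urllib.parse import parse_qs


def get_first_int(values, key, default=0):
    found = values.get(key, [''])
    if found[0]:
        return int(found[0])
    return default


# Hypothetical query string; keep_blank_values keeps 'green' with an empty value.
my_values = parse_qs('red=5&blue=0&green=', keep_blank_values=True)

red = get_first_int(my_values, 'red')          # key present with a value
green = get_first_int(my_values, 'green')      # key present but blank
opacity = get_first_int(my_values, 'opacity')  # key missing

# One-off alternative: an if/else expression instead of the `or` trick.
blue_str = my_values.get('blue', [''])
blue = int(blue_str[0]) if blue_str[0] else 0

print(red, green, opacity, blue)
```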


Item 6. Prefer Multiple Assignment Unpacking Over Indexing

  1. Python has special syntax called unpacking for assigning multiple values in a single statement.

  2. Unpacking is generalized in Python and can be applied to any iterable, including many levels of iterables within iterables.

  3. Reduce visual noise and increase code clarity by using unpacking to avoid explicitly indexing into sequences.

item = ('Peanut butter', 'Jelly')
first, second = item  # Unpacking
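
Unpacking also covers swapping, nested structures, and loop targets. A brief sketch; the `favorite_snacks` data is invented for the example, and the dict ordering relies on Python 3.7+ insertion order.

```python
# Swap two values without a temporary variable.
a, b = 1, 2
a, b = b, a

# Nested unpacking mirrors the shape of the data.
favorite_snacks = {
    'salty': ('pretzels', 100),
    'sweet': ('cookies', 180),
}
((type1, (name1, cals1)),
 (type2, (name2, cals2))) = favorite_snacks.items()

# Unpacking in a for loop target, combined with enumerate.
lines = []
for rank, (name, calories) in enumerate(favorite_snacks.values(), 1):
    lines.append(f'#{rank}: {name} has {calories} calories')

print(a, b, type1, name1, cals1)
print(lines)
```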


Item 7. Prefer enumerate Over range

  1. enumerate provides concise syntax for looping over an iterator and getting the index of each item from the iterator as you go.

  2. Prefer enumerate instead of looping over a range and indexing into a sequence.

  3. You can supply a second parameter to enumerate to specify the number from which to begin counting (zero is the default).

# before
for i in range(len(flavor_list)):
    flavor = flavor_list[i]
    print(f'{i + 1}: {flavor}')

# after
for i, flavor in enumerate(flavor_list, 1):
    print(f'{i}: {flavor}')
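
Because enumerate wraps its input lazily, the pairs can also be pulled one at a time with next. A short sketch; the contents of `flavor_list` are assumed here for illustration.

```python
flavor_list = ['vanilla', 'chocolate', 'pecan', 'strawberry']

# enumerate returns an iterator of (index, item) pairs.
it = enumerate(flavor_list)
first = next(it)
second = next(it)

# The optional second argument sets the starting number.
numbered = list(enumerate(flavor_list, 1))

print(first, second)
print(numbered)
```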


Item 8. Use zip to Process Iterators in Parallel

  1. The zip built-in function can be used to iterate over multiple iterators in parallel.

  2. zip creates a lazy generator that produces tuples, so it can be used on infinitely long inputs.

  3. zip truncates its output silently to the shortest iterator if you supply it with iterators of different lengths.

  4. Use the zip_longest function from the itertools built-in module if you want to use zip on iterators of unequal lengths without truncation.
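
The truncation behavior and the zip_longest alternative can be compared directly. A minimal sketch with made-up names in `names`:

```python
import itertools

names = ['Cecilia', 'Lise', 'Marie']
counts = [len(n) for n in names]

# Iterate over both lists in parallel.
longest_name, max_count = '', 0
for name, count in zip(names, counts):
    if count > max_count:
        longest_name, max_count = name, count

# The lists now differ in length: zip silently stops at the shorter one.
names.append('Rosalind')
truncated = list(zip(names, counts))

# zip_longest pads the missing values with None (the default fillvalue).
padded = list(itertools.zip_longest(names, counts))

print(longest_name, max_count)
print(truncated)
print(padded)
```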



Item 9. Avoid else Blocks After for and while Loops

  1. Python has special syntax that allows else blocks to immediately follow for and while loop interior blocks.

  2. The else block after a loop runs only if the loop body did not encounter a break statement.

  3. Avoid using else blocks after loops because their behavior isn’t intuitive and can be confusing.
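
To see why the construct is confusing, compare a search loop written with else against the same logic in a helper function that returns early. Both function names are hypothetical and the coprimality check is only an illustration.

```python
def coprime_with_else(a, b):
    for i in range(2, min(a, b) + 1):
        if a % i == 0 and b % i == 0:
            break  # common divisor found: the else block is skipped
    else:
        return True  # runs only because the loop never hit break
    return False


def coprime(a, b):
    # Same logic with early returns and no else block.
    for i in range(2, min(a, b) + 1):
        if a % i == 0 and b % i == 0:
            return False
    return True


print(coprime_with_else(4, 9), coprime(4, 9))
print(coprime_with_else(3, 6), coprime(3, 6))
```

The second version reads top to bottom without requiring the reader to remember that else here means "no break happened".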



Item 10. Prevent Repetition with Assignment Expressions

  1. Assignment expressions use the walrus operator (:=) to both assign and evaluate variable names in a single expression, thus reducing repetition.

  2. When an assignment expression is a subexpression of a larger expression, it must be surrounded with parentheses.

  3. Although switch/case statements and do/while loops are not available in Python, their functionality can be emulated much more clearly by using assignment expressions.

# before
while True:
    fresh_fruit = pick_fruit()
    if not fresh_fruit:
        break
    ...

# after
while fresh_fruit := pick_fruit():
    ...
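
The parentheses rule and the switch/case emulation from the list above can be sketched as follows (Python 3.8+ only). The `make_order` function, the stock dictionary, and the thresholds are assumptions made up for this example.

```python
def make_order(stock):
    # Parentheses are required when the walrus is part of a comparison.
    if (count := stock.get('banana', 0)) >= 2:
        return f'smoothie from {count} bananas'
    elif (count := stock.get('apple', 0)) >= 4:
        return f'cider from {count} apples'
    # No parentheses needed when the assignment is the whole condition.
    elif count := stock.get('lemon', 0):
        return f'lemonade from {count} lemons'
    return 'nothing'


fresh_fruit = {'apple': 10, 'banana': 8, 'lemon': 5}
print(make_order(fresh_fruit))
print(make_order({'apple': 10}))
print(make_order({'lemon': 5}))
print(make_order({}))
```

Each branch assigns and tests `count` in one place, which keeps the chain flat instead of nesting an if statement inside every else.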