Getting to Know Python 3.7: Data Classes, async/await and More!

If you're like me, or like many other Python developers, you've probably lived (and maybe migrated) through a few version releases. Python 3.7(.3), one of the latest releases, includes some impressive new language features that help keep Python one of the easiest and most powerful languages out there. If you're already using a Python 3.x version, you should consider upgrading to Python 3.7. Read on to learn more about some of the exciting features and improvements.

Data Classes

One of the most tedious parts of object-oriented Python prior to 3.7 was creating classes to represent data in your application.

Prior to Python 3.7, you would have to declare each attribute as a named parameter of your __init__ method and then assign it to the instance by hand. With applications that had complex data models, this invariably led to a large amount of boilerplate model and data contract code that had to be maintained.
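To make the boilerplate concrete, here is a minimal sketch of what a hand-written data class might have looked like before 3.7. The class name Foo and its fields mirror the dataclass example in this post; the __repr__ and __eq__ bodies are just one reasonable way to write them, not a canonical form.

```python
class Foo:
    """Hand-written data holder, as you might have written it before Python 3.7."""

    def __init__(self, name, id, bars=None):
        self.name = name
        self.id = id
        # Avoid a shared mutable default by creating a fresh list per instance.
        self.bars = bars if bars is not None else []

    def __repr__(self):
        return "Foo(name={!r}, id={!r}, bars={!r})".format(self.name, self.id, self.bars)

    def __eq__(self, other):
        if not isinstance(other, Foo):
            return NotImplemented
        return (self.name, self.id, self.bars) == (other.name, other.id, other.bars)


a_foo = Foo("My foo", "Foo-ID-1", ["1", "2"])
print(a_foo)  # Foo(name='My foo', id='Foo-ID-1', bars=['1', '2'])
```

Every new field has to be repeated in three places here, which is exactly the maintenance cost that grows with the size of the data model.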

With Python 3.7, thanks to PEP-557, you now have access to a decorator called @dataclass, which automatically generates an __init__ method (along with __repr__ and __eq__) when you add type annotations to your class variables. When the decorator is applied, Python inspects the annotated attributes of the class and generates an __init__ method with parameters in the order the fields are declared.

from typing import List
from dataclasses import dataclass, field

@dataclass
class Foo:
    name: str
    id: str
    bars: List[str] = field(default_factory=list)


# usage

a_foo = Foo("My foo’s name", "Foo-ID-1", ["1","2"])

You can still add methods to your data class, and use it like you would any other class. For JSON support, see the library dataclasses-json on PyPI.
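As a rough illustration of both points, the sketch below adds an ordinary method to the data class and serializes an instance using only the standard library (dataclasses.asdict plus json.dumps) rather than dataclasses-json. The bar_count method is a hypothetical addition for this example.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class Foo:
    name: str
    id: str
    bars: List[str] = field(default_factory=list)

    def bar_count(self) -> int:
        # A regular method; data classes are ordinary classes underneath.
        return len(self.bars)


a_foo = Foo("My foo", "Foo-ID-1", ["1", "2"])

print(a_foo)              # Foo(name='My foo', id='Foo-ID-1', bars=['1', '2'])
print(a_foo.bar_count())  # 2
print(a_foo == Foo("My foo", "Foo-ID-1", ["1", "2"]))  # True, via the generated __eq__

# asdict() converts the instance (recursively) into plain dicts and lists.
as_json = json.dumps(asdict(a_foo))
print(as_json)  # {"name": "My foo", "id": "Foo-ID-1", "bars": ["1", "2"]}
```

For simple, flat models this is often enough; a dedicated library becomes more useful once you need to load JSON back into nested data classes.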

Asyncio and the async/await Keywords

The most obvious change here is that async and await are now reserved keywords in Python. This goes hand in hand with some improvements to asyncio, Python's concurrency library. Notably, this includes high-level API improvements which make it easier to run asynchronous functions. Take the following as an example of what was required to run an asynchronous function prior to Python 3.7:

import asyncio
loop = asyncio.get_event_loop()
loop.run_until_complete(some_async_task())
loop.close()

Now in Python 3.7:

import asyncio
asyncio.run(some_async_task())
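The snippets above leave some_async_task undefined. A minimal runnable sketch, assuming a placeholder coroutine that just sleeps briefly and returns a string, could look like this:

```python
import asyncio


async def some_async_task():
    # Placeholder body: stands in for real I/O such as a network call.
    await asyncio.sleep(0.01)
    return "done"


# asyncio.run() creates a new event loop, runs the coroutine to completion,
# closes the loop, and returns the coroutine's result.
result = asyncio.run(some_async_task())
print(result)  # done
```

Note that asyncio.run() is meant to be the main entry point of a program; it cannot be called while another event loop is already running in the same thread.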

breakpoint()

In previous versions of Python, adding a breakpoint to use the built-in Python debugger (pdb) would require import pdb; pdb.set_trace().

PEP-553 adds a new built-in function, called breakpoint (it is not a keyword), used like below:

do_something()
breakpoint()
do_something_else()

When running from a console, this will enter straight away into pdb and allow the user to enter debug statements, evaluate variables, and step through program execution. See the pdb documentation for more information on how to use it.
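PEP-553 also makes the behavior of breakpoint() configurable: the call is routed through sys.breakpointhook, and the default hook consults the PYTHONBREAKPOINT environment variable (setting it to 0 disables breakpoints entirely). The sketch below swaps in a hypothetical recording_hook so the mechanism can be observed without dropping into pdb.

```python
import sys

calls = []


def recording_hook(*args, **kwargs):
    # Hypothetical stand-in for a debugger: just record that we were called.
    calls.append("breakpoint hit")


original_hook = sys.breakpointhook
sys.breakpointhook = recording_hook
try:
    breakpoint()  # dispatches to sys.breakpointhook instead of pdb.set_trace
finally:
    # Restore the original hook so later breakpoint() calls behave normally.
    sys.breakpointhook = original_hook

print(calls)  # ['breakpoint hit']
```

The same indirection is what lets you point breakpoint() at a different debugger without touching the code that calls it.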

Lazy Loading via Module Attributes

Some experienced Python users might be familiar with __getattr__ and __dir__ for classes and objects. PEP-562 exposes __getattr__ (and __dir__) for modules as well.

Without diving into the realm of technical possibilities that this exposes, one of its clearest and most obvious use cases is that it now allows for modules to lazy load. Consider the example below, modified from PEP-562, and its usage.

/mymodule/__init__.py

import importlib

__all__ = ['mysubmodule', ...]

def __getattr__(name):
    if name in __all__:
        return importlib.import_module("." + name, __name__)
    raise AttributeError(f"module {__name__!r} has no attribute {name!r}")

/mymodule/mysubmodule.py

print("Submodule loaded")

class BigClass:
    pass

/main.py

import mymodule
mymodule.mysubmodule.BigClass # prints Submodule loaded

Notice that although we imported mymodule in this example, the submodule containing BigClass didn't load until we first accessed it as an attribute.
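If you want to try the mechanism without creating a package on disk, here is a single-file sketch. It builds an in-memory module with types.ModuleType as a stand-in for mymodule and attaches a module-level __getattr__ to it. The names lazy_demo, _LAZY_NAMES, and lookups are hypothetical, and the sketch additionally caches the loaded module so that __getattr__ only runs on first access.

```python
import importlib
import types

# In-memory stand-in for a package's __init__.py.
lazy = types.ModuleType("lazy_demo")

_LAZY_NAMES = {"json", "decimal"}  # attributes resolved on first access
lookups = []  # records each time the module-level __getattr__ does real work


def _module_getattr(name):
    # Called only when normal attribute lookup on the module fails (PEP 562).
    if name in _LAZY_NAMES:
        lookups.append(name)
        module = importlib.import_module(name)
        setattr(lazy, name, module)  # cache, so the hook is skipped next time
        return module
    raise AttributeError(f"module {lazy.__name__!r} has no attribute {name!r}")


# PEP 562 looks for __getattr__ in the module's namespace.
lazy.__getattr__ = _module_getattr

print(lazy.json.dumps([1, 2]))  # [1, 2]
print(lookups)                  # ['json']
```

In a real package the function would simply be defined at the top level of __init__.py, as in the example above.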

Context Variables

When using async/await functions in the Python event loop prior to 3.7, context managers that used thread-local variables could bleed values from one task into another, because all tasks on an event loop share a single thread. This could create bugs that are difficult to find.

Python 3.7 introduces the concept of context variables, which are variables that have different values depending on their context. They're similar to thread locals in that there are potentially different values, but instead of differing across execution threads, they differ across execution contexts and are thus compatible with async and await functions.

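Here's a quick example of how to set and use a context variable in Python 3.7. Notice that when you run this, the second async call prints the default value: each asyncio.run() call runs its coroutine in its own copy of the current context, so the value set by the first call is not visible to the second.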

import contextvars
import asyncio

val = contextvars.ContextVar("val", default="0")

async def setval():
    val.set("1")

async def printval():
    print(val.get())

asyncio.run(setval())  # sets the value to "1" in that call's context only
asyncio.run(printval())  # prints the default value "0", as it's a different context
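The same isolation applies to tasks running concurrently on one event loop, which is the case thread locals could not handle. In this sketch, the request_id variable and the handle coroutine are hypothetical names chosen for illustration.

```python
import asyncio
import contextvars

request_id = contextvars.ContextVar("request_id", default="none")


async def handle(new_id):
    request_id.set(new_id)
    # Yield to the event loop so the other task gets a chance to run and set
    # its own value before we read ours back.
    await asyncio.sleep(0)
    return request_id.get()


async def main():
    # gather() wraps each coroutine in a task, and each task runs in its own
    # copy of the current context.
    results = await asyncio.gather(handle("a"), handle("b"))
    return results, request_id.get()


results, outer = asyncio.run(main())
print(results)  # ['a', 'b']
print(outer)    # none
```

Each task sees only the value it set, and the value in main() is untouched, even though everything runs on a single thread.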

Order of Dictionaries Preserved

Python dictionaries were considered unordered for many versions, which meant that you could write the following in Python 3.5 and earlier, and get an out-of-order result when iterating over the keys.

>>> x = {'first': 1, 'second': 2, 'third': 3}
>>> print([k for k in x])
['second', 'third', 'first']

In those earlier versions, OrderedDict from the collections module came to the rescue, providing the strong ordering guarantees needed for certain use cases.

In CPython 3.6, dictionaries were reimplemented to preserve insertion order as an implementation detail, and in Python 3.7 that behavior is officially part of the language specification. This means that dictionary order can now be relied on, but it also must be accounted for when considering backwards compatibility.

Don't expect usage of OrderedDict to go away anytime soon, though; it is still in Python 3.7, and has more advanced operations and different equality comparisons than the standard dict.
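A short sketch of those differences: a plain dict ignores order when comparing for equality, while two OrderedDicts compare equal only if their keys are in the same order. OrderedDict also offers reordering operations such as move_to_end that dict lacks.

```python
from collections import OrderedDict

a = {"first": 1, "second": 2}
b = {"second": 2, "first": 1}

# Plain dicts keep insertion order in 3.7, but equality ignores it.
print(list(a))  # ['first', 'second']
print(a == b)   # True

# OrderedDict equality is order-sensitive.
print(OrderedDict(a) == OrderedDict(b))  # False

# OrderedDict can also reorder existing keys in place.
od = OrderedDict(a)
od.move_to_end("first")
print(list(od))  # ['second', 'first']
```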

Also, this has proven to be one of the less popular changes in Python 3.7. Because every dict is now ordered, code can come to depend on dictionary order without the developer ever intending it to.

Optimizations to Python 3.7

Still not sure if you should check out Python 3.7? You should know that Python 3.7 has numerous performance improvements, notably:

  • Python startup time has been reduced by 10% on Linux and by up to 30% on macOS.
  • Typing operations are faster.
  • list.sort() and sorted() are up to 40-75% faster for common cases.
  • dict.copy() is now up to 5.5 times faster.
  • namedtuple creation via collections.namedtuple() is 4-6 times faster.

For a complete list, check out the official release notes.
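Numbers like these depend heavily on the machine and workload. If you want to compare two interpreters yourself, one rough approach is to run a small timeit script under each version. The dictionary size and iteration counts below are arbitrary choices for illustration, not the benchmarks used in the release notes.

```python
import timeit
from collections import namedtuple

data = {i: i for i in range(1000)}

# Time 10,000 shallow copies of a 1,000-item dict.
copy_seconds = timeit.timeit(data.copy, number=10_000)

# Time 1,000 namedtuple class creations.
namedtuple_seconds = timeit.timeit(
    lambda: namedtuple("Point", ["x", "y"]), number=1_000
)

print(f"dict.copy():  {copy_seconds:.4f}s")
print(f"namedtuple(): {namedtuple_seconds:.4f}s")
```

Run the same file under Python 3.6 and 3.7 and compare the printed timings.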

If you want a deeper dive into some of the Python 3.7 language features, check out this lightning talk I gave at the PyCascades conference.

Or try it out by deploying a Python app to Heroku. As of April 2019, Python 3.6.8 is the default version installed if you don’t explicitly specify a version in a runtime.txt file. Put python-3.7.3 in it to try out all these new Python features.

👋 Heroku is a Diamond Sponsor of PyCon 2019, May 1-9. If you'll be there, please come say hi to the team at the Heroku booth. 🐍
