Python dataclass

[CG]Maxime
1,335 views

Introduction

One of the new features in Python 3.7 is the @dataclass decorator, which simplifies the creation of data classes by auto-generating special methods such as __init__() and __repr__().

A data class is a class whose main purpose is to store data, with little or no behavior of its own. This kind of class, also known as a data structure, is very common. For example, a class used to store the coordinates of a point is simply a class with three fields (x, y, z).

However, we often need to add a constructor, a representation method, a comparison function, etc. Writing these methods by hand is tedious, and this is precisely the kind of boilerplate that the language should handle transparently.

As a matter of fact, some languages, such as Kotlin, already offer an easy way to create data classes. In Java, this can be done using the Lombok library and its @Data annotation.

Example

Here's an example of using @dataclass:

from dataclasses import dataclass

@dataclass
class Point:
    x: float
    y: float
    z: float = 0.0

p = Point(1.5, 2.5)
print(p)  # Point(x=1.5, y=2.5, z=0.0)

By default, this auto-generates the methods needed to instantiate, print, and compare (for equality) the data class instances.

In other words, this is equivalent to:

class Point:
    def __init__(self, x: float, y: float, z: float = 0.0):
        self.x = x
        self.y = y
        self.z = z

    def __repr__(self):
        return f"Point(x={self.x}, y={self.y}, z={self.z})"

    def __eq__(self, other):
        if other.__class__ is self.__class__:
            return (self.x, self.y, self.z) == (other.x, other.y, other.z)
        return NotImplemented

    # __ne__ is not generated: Python derives it from __eq__.
    # __lt__, __le__, __gt__ and __ge__ are generated only with order=True.

p = Point(1.5, 2.5)
print(p)

Note that this particular example could also be written with namedtuple. The syntax is shorter, but arguably harder to read:

from collections import namedtuple

Point = namedtuple('Point', ['x', 'y', 'z'], defaults=(0.0,))
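
The two approaches are not interchangeable, though. The following is a minimal sketch of some behavioral differences; the names PointNT and PointDC are hypothetical and used only for this comparison:

```python
from collections import namedtuple
from dataclasses import dataclass

PointNT = namedtuple("PointNT", ["x", "y", "z"], defaults=(0.0,))


@dataclass
class PointDC:
    x: float
    y: float
    z: float = 0.0


nt = PointNT(1.5, 2.5)
dc = PointDC(1.5, 2.5)

# A namedtuple is a tuple: it compares equal to a plain tuple and unpacks.
nt_equals_tuple = nt == (1.5, 2.5, 0.0)
x, y, z = nt

# A data class only compares equal to instances of the same class.
dc_equals_tuple = dc == (1.5, 2.5, 0.0)

# Data class fields are mutable by default; namedtuple fields are not.
dc.x = 10.0
try:
    nt.x = 10.0
    nt_is_mutable = True
except AttributeError:
    nt_is_mutable = False

print(nt_equals_tuple, dc_equals_tuple, nt_is_mutable)  # True False False
```

If tuple behavior (unpacking, indexing, immutability) is what you want, namedtuple may be the better fit; otherwise a data class is usually easier to extend.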

dataclass Parameters

The @dataclass decorator accepts keyword-only parameters that control which methods are generated:

@dataclasses.dataclass(*, init=True, repr=True, eq=True, order=False, unsafe_hash=False, frozen=False)
  • init: if True, generates the __init__ method.
  • repr: if True, generates the __repr__ method.
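  • eq: if True, generates the __eq__ method, which compares the fields as if they were tuples.
  • order: if True, generates the __lt__, __le__, __gt__, and __ge__ methods. This requires eq=True; otherwise a ValueError is raised.
  • unsafe_hash: if False (the default), __hash__ is handled according to the values of eq and frozen: it is generated when both are True, set to None (unhashable) when eq is True and frozen is False, and left untouched when eq is False. If True, a __hash__ method is generated regardless.
  • frozen: if True, the instances are read-only: assigning to a field raises an exception.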
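
As an illustration of how these parameters combine, here is a minimal sketch. The Version and Plain classes are hypothetical examples, not part of the dataclasses module:

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(order=True, frozen=True)
class Version:
    major: int
    minor: int = 0


v1 = Version(1, 2)
v2 = Version(1, 10)

# order=True: instances compare like the tuples (major, minor).
is_older = v1 < v2
newest = max([v1, v2])

# eq=True with frozen=True: __hash__ is generated, so instances can be
# used as set members or dict keys.
unique = {Version(1, 2), Version(1, 2), Version(2)}

# frozen=True: assigning to a field raises FrozenInstanceError.
try:
    v1.major = 3
    was_mutated = True
except FrozenInstanceError:
    was_mutated = False


@dataclass  # defaults: eq=True, order=False, frozen=False
class Plain:
    value: int


# With the defaults, __hash__ is set to None, so instances are unhashable.
try:
    hash(Plain(1))
    plain_is_hashable = True
except TypeError:
    plain_is_hashable = False

print(is_older, newest, len(unique), was_mutated, plain_is_hashable)
```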

See the documentation for more information.

Field-specific configuration

The dataclasses module also has a field function that lets you provide field-specific configuration:

from typing import List
from dataclasses import dataclass, field

@dataclass
class C:
    x: int
    y: int = field(repr=False)
    mylist: List[int] = field(default_factory=list)

c = C(7, 42)
c.mylist += [1, 2, 3]
print(c)  # C(x=7, mylist=[1, 2, 3])

This lets you control a field's default value, whether it is displayed by the __repr__ method, ignored by the comparison functions, included in the __hash__ method, etc.

def field(*, default=MISSING, default_factory=MISSING, repr=True,
          hash=None, init=True, compare=True, metadata=None)
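
To make a few of these options concrete, here is a small sketch using compare, default_factory and metadata. The Measurement class and its field names are hypothetical:

```python
from dataclasses import dataclass, field, fields
from typing import List


@dataclass
class Measurement:
    value: float
    # compare=False: excluded from __eq__ (and from ordering, if enabled).
    note: str = field(default="", compare=False)
    # default_factory builds a fresh list for every instance.
    tags: List[str] = field(default_factory=list)
    # metadata is not used by dataclasses itself; it is for your own code.
    unit: str = field(default="m", metadata={"help": "unit of value"})


a = Measurement(1.0, note="first run")
b = Measurement(1.0, note="second run")

# The differing notes are ignored by the comparison.
same = a == b

# Each instance owns its list, so appending to a.tags leaves b.tags empty.
a.tags.append("x")
shared = a.tags is b.tags

# Field metadata can be read back through fields().
unit_help = {f.name: f.metadata for f in fields(Measurement)}["unit"]["help"]

print(same, shared, b.tags, unit_help)  # True False [] unit of value
```

Using default_factory=list rather than a literal [] default matters here: dataclasses rejects a mutable list default with a ValueError precisely to avoid sharing one list across instances.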

See the documentation for more information.

Post-init processing

The generated __init__() method calls a method named __post_init__(), if the class defines one. This is useful for initializing a field based on the values of other fields. Note that if no __init__ method is generated (init=False), __post_init__ is not called automatically.
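
Here is a minimal sketch of a derived field computed in __post_init__. The Vector class is a hypothetical example; field(init=False) keeps length out of the generated __init__ parameters:

```python
import math
from dataclasses import dataclass, field


@dataclass
class Vector:
    x: float
    y: float
    # init=False: not an __init__ parameter; filled in by __post_init__.
    length: float = field(init=False)

    def __post_init__(self):
        # Runs at the end of the generated __init__, once x and y are set.
        self.length = math.hypot(self.x, self.y)


v = Vector(3.0, 4.0)
print(v)  # Vector(x=3.0, y=4.0, length=5.0)
```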

Other Dataclasses Functions

The dataclasses module also provides several useful functions:

  • fields: returns a tuple of Field objects. A Field object contains the configuration of a field.
  • asdict: converts a data class instance to a dict of its fields.
  • astuple: converts a data class instance to a tuple of its fields.
  • make_dataclass: creates a new data class dynamically.
  • replace: creates a new instance of the same type as the given data class instance, with some fields replaced.
  • is_dataclass: tells whether the given object is a data class or an instance of one.
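
The following sketch exercises each of these functions on the Point class from the earlier example. Pixel and its fields are hypothetical names chosen for illustration:

```python
from dataclasses import (
    asdict, astuple, dataclass, field, fields,
    is_dataclass, make_dataclass, replace,
)


@dataclass
class Point:
    x: float
    y: float
    z: float = 0.0


p = Point(1.5, 2.5)

names = [f.name for f in fields(p)]   # ['x', 'y', 'z']
as_dict = asdict(p)                   # {'x': 1.5, 'y': 2.5, 'z': 0.0}
as_tuple = astuple(p)                 # (1.5, 2.5, 0.0)

# replace() returns a new instance; p itself is left unchanged.
q = replace(p, z=9.0)

# make_dataclass() builds a class at runtime from (name, type[, Field]) specs.
Pixel = make_dataclass("Pixel", [("col", int), ("row", int, field(default=0))])
px = Pixel(3)

# is_dataclass() accepts both data classes and their instances.
checks = (is_dataclass(Point), is_dataclass(p), is_dataclass(42))

print(names, as_dict, as_tuple, q, px, checks)
```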
