Ducks, the Python object indexer

ducks 🦆

Index your Python objects for fast lookup by their attributes.

License: MIT. Requires Python 3.7+.

Install

pip install ducks

Usage

The main container in ducks is called Dex.

from ducks import Dex

# make some objects
objects = [
    {'x': 3, 'y': 'a'},
    {'x': 6, 'y': 'b'},
    {'x': 9, 'y': 'c'}
]

# Create a Dex containing the objects.
# Index on x and y.
dex = Dex(objects, ['x', 'y'])

# match objects
dex[{
    'x': {'>': 5, '<': 10},  # where 5 < x < 10
    'y': {'in': ['a', 'b']}  # and y is 'a' or 'b'
}]
# result: [{'x': 6, 'y': 'b'}]

This is a Dex of dicts, but the objects can be any type, even primitives like strings.

Dex supports ==, !=, in, not in, <, <=, >, >=.
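
To make the meaning of these operators concrete, here is a plain-Python sketch that evaluates the same style of query with a linear scan. It is not how ducks is implemented, and the names OPS and matches are hypothetical, made up for this illustration. It only shows what result a query over dicts is expected to produce.

```python
import operator

# Hypothetical reference filter: a slow linear scan, NOT the ducks implementation.
# Maps each operator string to a function of (attribute value, operand).
OPS = {
    '==': operator.eq,
    '!=': operator.ne,
    '<': operator.lt,
    '<=': operator.le,
    '>': operator.gt,
    '>=': operator.ge,
    'in': lambda value, choices: value in choices,
    'not in': lambda value, choices: value not in choices,
}

def matches(obj, query):
    """Return True if the dict `obj` satisfies every constraint in `query`."""
    return all(
        OPS[op](obj[key], operand)
        for key, constraints in query.items()
        for op, operand in constraints.items()
    )

objects = [
    {'x': 3, 'y': 'a'},
    {'x': 6, 'y': 'b'},
    {'x': 9, 'y': 'c'},
]
query = {
    'x': {'>': 5, '<': 10},   # where 5 < x < 10
    'y': {'in': ['a', 'b']},  # and y is 'a' or 'b'
}
result = [obj for obj in objects if matches(obj, query)]
# result: [{'x': 6, 'y': 'b'}]
```

A scan like this checks every object on every query. An index exists to avoid that cost on large collections.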

The indexes can be dict keys, object attributes, or custom functions.
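
As a rough illustration of those three kinds of index, the sketch below resolves a value from an object in each case. The helper get_value is hypothetical and written only for this example. It is an assumption about how such a lookup could work, not code taken from ducks.

```python
from types import SimpleNamespace

def get_value(obj, index):
    """Hypothetical helper: look up the value of one index on one object."""
    if callable(index):
        return index(obj)        # custom function: call it on the object
    if isinstance(obj, dict):
        return obj[index]        # dict key
    return getattr(obj, index)   # object attribute

from_dict = get_value({'x': 3, 'y': 'a'}, 'x')        # dict key
from_attr = get_value(SimpleNamespace(x=6), 'x')      # object attribute
from_func = get_value('three', len)                   # custom function on a string
```

Here from_dict is 3, from_attr is 6, and from_func is 5. The last case shows why primitives such as strings can be indexed: a function can compute the value even when the object has no named fields.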

See Quick Start for more examples of all of these.

Is ducks fast?

Yes. Here's how the ducks containers compare to other datastores on an example task.

https://raw.githubusercontent.com/manimino/ducks/main/docs/img/perf_bench.png

In this benchmark, two million objects are generated. Each datastore is used to find the subset of 200 of them that match four constraints. The ducks containers Dex and FrozenDex are very efficient at this, outperforming the other datastores by 5x and 10x respectively.

Benchmark code is in the Jupyter notebook.

Docs

Quick Start covers all the features you need, like pickling, nested attribute handling, and thread concurrency.

How It Works is a deep dive on the implementation details.

Demos has short scripts showing example uses.

Contents