StoqBloomFilter

Overview

Native support for bloom filters.

Examples

Create a new bloom filter with a maximum of 5000 items and a false positive rate of 0.001 (0.1%):

from stoq.filters import StoqBloomFilter

bloomfilter = StoqBloomFilter()
bloomfilter.create_filter("/tmp/stoq.bloom", 5000, 0.001)

Open a previously created bloom filter:

from stoq.filters import StoqBloomFilter

bloomfilter = StoqBloomFilter()
bloomfilter.import_filter("/tmp/stoq.bloom")

Save the bloom filter to disk every 60 seconds:

bloomfilter.backup_scheduler(60)

Check whether a string is in the bloom filter and, if it is not, add it:

bloomfilter.query_filter("google.com", add_missing=True)

class stoq.filters.StoqBloomFilter
backup_scheduler(interval)

Set a syncing schedule for the persistent bloom filter

Parameters: interval (int) – Interval, in seconds, between syncs of the bloom filter to disk
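
The idea of a syncing schedule can be illustrated with a generic periodic-call pattern. This is only a sketch using the standard library, not a description of how backup_scheduler is implemented; the helper name start_periodic_sync and the sync callback are hypothetical.

```python
import threading


def start_periodic_sync(sync, interval):
    """Call sync() every `interval` seconds on a daemon thread.

    Hypothetical helper for illustration; not part of stoq.
    Returns a threading.Event; set it to stop the loop.
    """
    stop = threading.Event()

    def loop():
        # Event.wait() returns False on timeout and True once stop is set,
        # so the loop runs sync() after each full interval until stopped.
        while not stop.wait(interval):
            sync()

    threading.Thread(target=loop, daemon=True).start()
    return stop
```

Calling stop.set() on the returned event ends the loop after the current wait, which keeps shutdown prompt even with long intervals.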
create_filter(filepath, size, falsepos_rate)

Create new bloom filter

Parameters:
  • filepath (bytes) – Path to persistent bloom filter on disk
  • size (int) – Maximum number of elements in bloom filter
  • falsepos_rate (float) – Maximum false positive probability
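
To get a feel for how size and falsepos_rate trade off against memory, the textbook Bloom filter sizing formulas can be evaluated directly. This is a sketch under the assumption of a classic Bloom filter; the backend used by stoq may size its filter differently, and bloom_parameters is a hypothetical helper.

```python
import math


def bloom_parameters(size, falsepos_rate):
    """Return (bits, hashes) for a classic Bloom filter.

    Uses the standard formulas m = -n * ln(p) / (ln 2)^2 and k = (m / n) * ln 2.
    Illustrative only; stoq's backend may choose different values.
    """
    if size <= 0 or not 0.0 < falsepos_rate < 1.0:
        raise ValueError("size must be > 0 and 0 < falsepos_rate < 1")
    bits = math.ceil(-size * math.log(falsepos_rate) / (math.log(2) ** 2))
    hashes = max(1, round(bits / size * math.log(2)))
    return bits, hashes


bits, hashes = bloom_parameters(5000, 0.001)
print(f"{bits} bits (~{bits / 8 / 1024:.1f} KiB), {hashes} hash functions")
```

Under these formulas, lowering falsepos_rate by a factor of ten adds roughly 4.8 bits per element, so tight false positive targets are fairly cheap in memory.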
import_filter(filepath)

Load a previously created persistent bloom filter

Parameters: filepath (bytes) – Path to persistent bloom filter on disk
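
A common pattern is to load the filter if its file already exists and create it otherwise. The helper below is a hypothetical convenience wrapper, not part of stoq; it relies only on create_filter and import_filter as documented here, and its default values are taken from the example at the top of this page.

```python
import os


def open_or_create(bloomfilter, filepath, size=5000, falsepos_rate=0.001):
    """Load the filter at `filepath` if it exists, otherwise create it.

    `bloomfilter` is expected to be a StoqBloomFilter instance.
    Hypothetical helper; returns True if an existing file was loaded.
    """
    if os.path.exists(filepath):
        bloomfilter.import_filter(filepath)
        return True
    bloomfilter.create_filter(filepath, size, falsepos_rate)
    return False
```

With stoq installed, this could be called as open_or_create(StoqBloomFilter(), "/tmp/stoq.bloom").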
query_filter(item, add_missing=False)

Identify whether an item exists within the filter

Parameters:
  • item (bytes) – Item to query the bloom filter with
  • add_missing (bool) – If set to True, the item will be added to the bloom filter if it doesn’t exist
Returns: True if item exists, False if not.
Return type: bool
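
The query-then-add behavior can be illustrated with a small in-memory stand-in. This is a sketch, not stoq's implementation: the TinyBloom class, its SHA-256 double hashing, and its default sizes are all assumptions, as is the choice to return the membership result from before the item is added.

```python
import hashlib


class TinyBloom:
    """In-memory Bloom filter for illustration only (not stoq's implementation)."""

    def __init__(self, bits=71888, hashes=10):
        self.bits = bits
        self.hashes = hashes
        self.array = bytearray((bits + 7) // 8)

    def _positions(self, item):
        # Derive the bit positions from two 64-bit slices of one SHA-256
        # digest (double hashing); h2 is forced odd so it is never zero.
        digest = hashlib.sha256(item).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.bits for i in range(self.hashes)]

    def query_filter(self, item, add_missing=False):
        if isinstance(item, str):
            item = item.encode("utf-8")
        positions = self._positions(item)
        present = all(self.array[p // 8] & (1 << (p % 8)) for p in positions)
        if not present and add_missing:
            for p in positions:
                self.array[p // 8] |= 1 << (p % 8)
        # Assumption: report membership as it was before any insertion.
        return present


bloom = TinyBloom()
first = bloom.query_filter("google.com", add_missing=True)  # item not seen yet
second = bloom.query_filter("google.com")  # item was added by the first call
```

In this sketch the first call reports the item as missing and inserts it, so a repeated query for the same item reports it as present.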