In this article, we will discuss Bloom filters. An HBase Bloom filter is an efficient mechanism to test whether a StoreFile contains a specific row or row-column cell.

Without Bloom filters, the only way to decide whether a StoreFile might contain a row key is to check the StoreFile’s block index, which stores the start row key of each block in the StoreFile. Because the index records only where each block begins, a key that falls inside a block’s range may still be absent, and the block has to be read from disk to find out. A Bloom filter provides an in-memory structure that reduces disk reads to only the files likely to contain that row. In short, it acts as an in-memory index for finding a row in a StoreFile.
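To make the idea concrete, here is a minimal Bloom filter sketch in Java. It is illustrative only and is not HBase's implementation: the class name `SimpleBloomFilter`, the FNV-1a style hash, and the sizes used are all assumptions chosen for the example. The point it shows is the asymmetry that makes the structure useful: a "no" answer is definite, while a "yes" answer only means "possibly present".

```java
import java.nio.charset.StandardCharsets;
import java.util.BitSet;

// Illustrative Bloom filter (hypothetical class, not HBase code).
class SimpleBloomFilter {
    private final BitSet bits;
    private final int numBits;
    private final int numHashes;

    SimpleBloomFilter(int numBits, int numHashes) {
        this.bits = new BitSet(numBits);
        this.numBits = numBits;
        this.numHashes = numHashes;
    }

    // Seeded FNV-1a style hash; any reasonable hash function would do here.
    private static int hash(byte[] data, int seed) {
        int h = 0x811C9DC5 ^ seed;
        for (byte b : data) {
            h ^= (b & 0xff);
            h *= 16777619;
        }
        return h;
    }

    // Derive the i-th bit position from two base hashes (double hashing).
    private int index(byte[] key, int i) {
        int h1 = hash(key, 0);
        int h2 = hash(key, 0x9747b28c);
        return Math.floorMod(h1 + i * h2, numBits);
    }

    // Record a row key by setting its bit positions.
    void add(String rowKey) {
        byte[] key = rowKey.getBytes(StandardCharsets.UTF_8);
        for (int i = 0; i < numHashes; i++) {
            bits.set(index(key, i));
        }
    }

    // false means "definitely absent"; true means "possibly present".
    boolean mightContain(String rowKey) {
        byte[] key = rowKey.getBytes(StandardCharsets.UTF_8);
        for (int i = 0; i < numHashes; i++) {
            if (!bits.get(index(key, i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        SimpleBloomFilter filter = new SimpleBloomFilter(1024, 3);
        filter.add("row-0001");
        // An inserted key is always reported as possibly present.
        System.out.println(filter.mightContain("row-0001"));
        // An absent key is usually, but not always, reported as absent.
        System.out.println(filter.mightContain("row-9999"));
    }
}
```

Because the filter never produces a false negative, a reader can safely skip any file whose filter answers `false`; the cost of a false positive is only an unnecessary block read.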

If your application regularly modifies all or most of the rows in HBase, most StoreFiles will contain a piece of the row you are searching for, so Bloom filters may help only a little. With time-series data, where only a few records are updated at a time or updates arrive in batches, each row is written to a separate StoreFile.

In this case, the Bloom filter greatly improves HBase read performance by discarding the StoreFiles that do not contain the row being searched for.
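As a hedged sketch of how a Bloom filter is selected per column family, the snippet below uses the HBase 2.x Java client API to build a table descriptor with a row-level filter. The table name `sensor_readings` and the column family `d` are hypothetical, and this is not necessarily the configuration tested in this article. It needs the `hbase-client` library on the classpath, but it only builds and prints descriptors, so it does not contact a cluster.

```java
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.ColumnFamilyDescriptor;
import org.apache.hadoop.hbase.client.ColumnFamilyDescriptorBuilder;
import org.apache.hadoop.hbase.client.TableDescriptor;
import org.apache.hadoop.hbase.client.TableDescriptorBuilder;
import org.apache.hadoop.hbase.regionserver.BloomType;
import org.apache.hadoop.hbase.util.Bytes;

// Hypothetical example class; table and family names are assumptions.
class BloomFilterConfigSketch {
    public static void main(String[] args) {
        // BloomType.ROW tests for a row key; BloomType.ROWCOL tests for a
        // row key plus column qualifier; BloomType.NONE disables the filter.
        ColumnFamilyDescriptor family = ColumnFamilyDescriptorBuilder
                .newBuilder(Bytes.toBytes("d"))
                .setBloomFilterType(BloomType.ROW)
                .build();

        TableDescriptor table = TableDescriptorBuilder
                .newBuilder(TableName.valueOf("sensor_readings"))
                .setColumnFamily(family)
                .build();

        // Print the chosen setting; passing the descriptor to
        // Admin.createTable(...) would apply it on a real cluster.
        System.out.println(table.getTableName() + " / "
                + family.getNameAsString() + " -> "
                + family.getBloomFilterType());
    }
}
```

A row-level filter fits lookups by row key alone, while a row-column filter is the one to consider when reads usually name a specific column as well.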

After testing the above settings on about 10 GB of test data, we applied the same settings to our streaming-data HBase database and observed a performance gain in line with the above experimental results.

Emergys Blog
