Data Sampling

Working with data happens in one of two modes: exploration or recurring monitoring. For monitoring, data precision is of utmost importance and speed is less of an issue, as reports usually load periodically. For exploration, speed matters most: you want quick results when applying new filters, changing date ranges or choosing integrations, and precision is secondary.

Understanding these two modes leads us to…

🧪
…the release of our new data sampling feature, which significantly decreases the load times of your dashboards!

👩🏿‍🏭 How does it work? 🧙🏿‍♀️ Why is it awesome?

Automatically! You do not need to turn anything on or change any settings.

When you edit a report in the Monitoring module - for example, by applying a new filter, changing a date range or changing the integration selection - the data analysis will be carried out over a smaller, sampled data set. This ensures the data query is significantly faster, making data exploration more pleasant, as you no longer need to wait for the whole data set to load after applying the specified conditions.

Once you are satisfied with the conditions applied to your data set, just hit the Proceed button on the purple bottom bar to apply the filters to the complete data set and receive the precise, non-sampled data.

🤖 What are the technical specifications behind this?

The system generates sample data from customers' users only if the total number of members in a specific report exceeds 20,000. With fewer members, sampling is not meaningful.

When the number of members exceeds 20,000, we randomly select 20,000 members and copy their data fields and activities to a designated area where sample data is stored. Once the sample data has been copied successfully, we record on the account that sampling is enabled. From then on, whenever you modify a filter in a report, the query runs against the sample data instead of the full production data.
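
For readers who like to see the logic spelled out, here is a minimal Python sketch of the behaviour described above. It is an illustration only, not our production code: the names `build_sample` and `data_for_query` are hypothetical, and members are modelled as a plain in-memory list, which is an assumption made for the example.

```python
import random

# Threshold and sample size taken from the description above.
SAMPLE_SIZE = 20_000


def build_sample(members, sample_size=SAMPLE_SIZE, rng=None):
    """Return a random sample of members, or None when sampling is skipped.

    Hypothetical helper: sampling only happens when the member count
    exceeds the threshold.
    """
    if len(members) <= sample_size:
        return None  # too few members, sampling is not meaningful
    rng = rng or random.Random()
    # random.sample picks without replacement, so no member appears twice.
    return rng.sample(members, sample_size)


def data_for_query(members, sample, editing):
    """Pick the data set a report query should run against.

    While filters are being edited and a sample exists, use the sample;
    otherwise (for example after Proceed) use the complete data set.
    """
    if editing and sample is not None:
        return sample
    return members
```

In this sketch, editing a filter on a report with 25,000 members would query the 20,000-member sample, while a report with 20,000 members or fewer would always query the complete data set.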