SQL Profiling
SQL Profiling collects table-level and column-level statistics. The SQL-based profiler does not run on its own; instead, it is enabled as part of other SQL-based sources. Enabling profiling slows down ingestion runs.
Profiling many tables or a large number of rows can incur significant costs. Although we've done our best to limit the cost of the queries the profiler runs, you should be prudent about which tables profiling is enabled on and how often profiling runs.
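One way to be prudent about scope is to decide up front which tables are worth profiling. The sketch below is illustrative only and is not the profiler's own configuration or API: `select_tables_to_profile`, its parameters, and the table names are hypothetical. It assumes you already know approximate row counts and keeps only tables that match an allow pattern and stay under a row limit.

```python
import re


def select_tables_to_profile(row_counts, allow_patterns, max_rows):
    """Return the sorted names of tables that are worth profiling.

    row_counts: dict mapping a table name to its approximate row count.
    allow_patterns: regular expressions; a table name must fully match one.
    max_rows: tables with more rows than this are skipped to limit cost.
    (All three parameters are hypothetical, chosen for illustration.)
    """
    compiled = [re.compile(pattern) for pattern in allow_patterns]
    return sorted(
        name
        for name, rows in row_counts.items()
        if rows <= max_rows and any(p.fullmatch(name) for p in compiled)
    )


if __name__ == "__main__":
    # Made-up table names and row counts, for demonstration only.
    counts = {
        "sales.orders": 1_000,
        "sales.events": 10_000_000,
        "hr.salaries": 50,
    }
    print(select_tables_to_profile(counts, [r"sales\..*"], max_rows=1_000_000))
```

The same idea applies to frequency: running a selection like this on a schedule that is less frequent than regular ingestion keeps the number of profiling queries down.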
Capabilities
Extracts:
- Row and column counts for each table
- For each column, if applicable:
  - null counts and proportions
  - distinct counts and proportions
  - minimum, maximum, mean, median, standard deviation, some quantile values
  - histograms or frequencies of unique values
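To give a sense of what these statistics look like in practice, here is a minimal sketch that computes a few of them with plain SQL against an in-memory SQLite database. It is not the profiler's implementation: the function names `profile_column` and `value_frequencies` are hypothetical, and the choice to compute the distinct proportion over non-null values is an assumption made for this example.

```python
import sqlite3


def profile_column(conn, table, column):
    """Compute basic statistics for one column using a single aggregate query.

    Identifiers are interpolated into the SQL text, so only pass trusted
    table and column names. This is a sketch, not the profiler's own code.
    """
    query = f"""
        SELECT COUNT(*),
               SUM(CASE WHEN "{column}" IS NULL THEN 1 ELSE 0 END),
               COUNT(DISTINCT "{column}"),
               MIN("{column}"),
               MAX("{column}"),
               AVG("{column}")
        FROM "{table}"
    """
    row_count, null_count, distinct_count, min_v, max_v, mean_v = (
        conn.execute(query).fetchone()
    )
    null_count = null_count or 0  # SUM returns NULL on an empty table
    non_null = row_count - null_count
    return {
        "row_count": row_count,
        "null_count": null_count,
        "null_proportion": null_count / row_count if row_count else None,
        "distinct_count": distinct_count,
        # Assumption: proportion is taken over non-null values.
        "distinct_proportion": distinct_count / non_null if non_null else None,
        "min": min_v,
        "max": max_v,
        "mean": mean_v,
    }


def value_frequencies(conn, table, column):
    """Return {value: count} for the non-null values of a column."""
    query = f"""
        SELECT "{column}", COUNT(*)
        FROM "{table}"
        WHERE "{column}" IS NOT NULL
        GROUP BY "{column}"
    """
    return dict(conn.execute(query).fetchall())


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (x INTEGER)")
    conn.executemany("INSERT INTO t VALUES (?)", [(1,), (2,), (2,), (None,), (5,)])
    print(profile_column(conn, "t", "x"))
    print(value_frequencies(conn, "t", "x"))
```

Each statistic here requires scanning the column, which is why profiling cost grows with both the number of tables and the number of rows.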
Supported Sources
SQL profiling is intended to work with all SQL sources. Check the individual source page to confirm that a given source supports profiling.
Questions
If you've got any questions on configuring profiling, feel free to ping us on our Slack!