DataHub Releases
Summary
Version | Release Date | Links |
---|---|---|
v0.10.5 | 2023-08-02 | Release Notes, View on GitHub |
v0.10.4 | 2023-06-09 | Release Notes, View on GitHub |
v0.10.3 | 2023-05-25 | View on GitHub |
v0.10.2 | 2023-04-13 | View on GitHub |
v0.10.1 | 2023-03-23 | View on GitHub |
v0.10.0 | 2023-02-07 | View on GitHub |
v0.9.6.1 | 2023-01-31 | View on GitHub |
v0.9.6 | 2023-01-13 | View on GitHub |
v0.9.5 | 2022-12-23 | View on GitHub |
v0.9.4 | 2022-12-20 | View on GitHub |
v0.9.3 | 2022-11-30 | View on GitHub |
v0.9.2 | 2022-11-04 | View on GitHub |
v0.9.1 | 2022-10-31 | View on GitHub |
v0.9.0 | 2022-10-11 | View on GitHub |
v0.8.45 | 2022-09-23 | View on GitHub |
v0.8.44 | 2022-09-01 | View on GitHub |
v0.8.43 | 2022-08-09 | View on GitHub |
v0.8.42 | 2022-08-03 | View on GitHub |
v0.8.41 | 2022-07-15 | View on GitHub |
v0.8.40 | 2022-06-30 | View on GitHub |
v0.8.39 | 2022-06-24 | View on GitHub |
v0.8.38 | 2022-06-09 | View on GitHub |
v0.8.37 | 2022-06-09 | View on GitHub |
v0.8.36 | 2022-06-02 | View on GitHub |
v0.8.35 | 2022-05-18 | View on GitHub |
v0.8.34 | 2022-05-04 | View on GitHub |
v0.8.33 | 2022-04-15 | View on GitHub |
v0.8.32 | 2022-04-04 | View on GitHub |
v0.8.31 | 2022-03-17 | View on GitHub |
v0.8.30 | 2022-03-17 | View on GitHub |
v0.10.5
Released on 2023-08-02 by @david-leifker.
Release Highlights
NEW: Unified Search and Browse Experience
It’s here, it’s here! We are incredibly excited to roll out our redesigned, streamlined Search and Browse experience. End users now have a one-stop shop to search for specific data entities and browse across systems, making it easier than ever to find the most relevant and meaningful resources within DataHub.
Check out the screenshot below and get a full walk-through in this video!
<img width="1041" alt="CleanShot 2023-08-03 at 14 47 55@2x" src="https://github.com/datahub-project/datahub/assets/15873986/2f47d033-6c2b-483a-951d-e6d6b807f0d0">
User Experience
- Column-Level Lineage (CLL) visualization update: You can now visualize CLL relationships through DataJobs (e.g. Airflow DAGs)
- Unique Glossary Terms: We now prevent creating duplicate Glossary Term names within a Term Group
- Domains: You can now configure the Documentation tab to be the default landing page within a Domain
- Row Count: Large numbers are now formatted to be more human-readable (e.g. 3283337 is shown as 3.2M)
- Stats Tab: Y-axis scale now dynamically set to reflect the minimum & maximum values, improving readability
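As a rough illustration of the Row Count formatting described above, here is a minimal Python sketch. It is not DataHub's actual UI code: the function name `format_row_count`, the K/M/B/T suffixes, and the choice to truncate (rather than round) to one decimal place are all assumptions, picked so that the example from the notes (3283337 shown as 3.2M) is reproduced.

```python
def format_row_count(n: int) -> str:
    """Abbreviate a non-negative count with K/M/B/T suffixes.

    Hypothetical helper for illustration only. Values are truncated
    (not rounded) to one decimal place, which is an assumption.
    """
    if n < 0:
        raise ValueError("row count must be non-negative")
    units = [(10**12, "T"), (10**9, "B"), (10**6, "M"), (10**3, "K")]
    for threshold, suffix in units:
        if n >= threshold:
            # Integer arithmetic avoids float rounding: keep tenths, drop the rest.
            tenths = n * 10 // threshold
            whole, frac = divmod(tenths, 10)
            # Omit the decimal when it is zero, e.g. "1K" instead of "1.0K".
            return f"{whole}.{frac}{suffix}" if frac else f"{whole}{suffix}"
    # Counts below one thousand are shown unchanged.
    return str(n)


if __name__ == "__main__":
    for value in (999, 1500, 3283337, 2_000_000_000):
        print(value, "->", format_row_count(value))
```

Truncating keeps the abbreviated value from overstating the real count; a rounding variant would show the same input as 3.3M instead.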
Metadata ingestion
Ingestion Enhancements:
- BigQuery: Set `platform_instance` using project_id
- PowerBI: Ingest datasets not used in visualizations (tiles/pages)
- Kafka Connect: Ability to set `platform_instance`
- Nifi: Support for basic auth
- Presto on Hive: Extract all table properties from Hive Metastore
- Elasticsearch: Support for basic profiling
- Add advanced configuration for LDAP manager ingestion
Lineage Improvements:
- Schema-aware SQL parsing to derive column-level lineage
- Column-level lineage support for BigQuery, Tableau, and Snowflake View definitions
- Snowflake: Extract Snowpipe S3 lineage
Developer Experience
- Fine-grained ownership policies
- PATCH support for DataJob Inputs/Outputs
- New endpoints to extract size of time-series indices and truncate/cleanup time-series indices in Elasticsearch; support for bulk-deletes
- Initial support for exception reporting via Sentry
- New OpenAPI endpoint to get Task Status
- SDK: Easily generate container URNs
Docs
- Improvements to our File-Based Lineage doc, specifically focused on Fine-Grained Lineage config components (link)
- Code examples of how to manage Posts within DataHub (link)
- Guide to generating custom browse paths for the new search experience (link)
What's Changed
- refractor(classification): datahub classifier init by @mayurinehate in https://github.com/datahub-project/datahub/pull/8193
- fix(glue): fix typo in reported warning, report with flow_urn by @mayurinehate in https://github.com/datahub-project/datahub/pull/8138
- fix(ingest/delta-lake): fix CI issues due to delta lake version bump by @mayurinehate in https://github.com/datahub-project/datahub/pull/8215
- Upgrade kafka and its dependencies to 3.4 in docker compose by @jinlintt in https://github.com/datahub-project/datahub/pull/8161
- chore(release): update default cli for managed ingestion by @pedro93 in https://github.com/datahub-project/datahub/pull/8226
- fix(ownership): Corrects graphQL resolver for entity operations by @pedro93 in https://github.com/datahub-project/datahub/pull/8219
- fix(cli/quickstart): handle docker hangs gracefully by @hsheth2 in https://github.com/datahub-project/datahub/pull/8211
- fix(cli): make quickstart robust to docker race conditions by @hsheth2 in https://github.com/datahub-project/datahub/pull/8233
- fix(search): tag/term should filter for both entity and field level by @anshbansal in https://github.com/datahub-project/datahub/pull/7881
- docs(tests): document test eval endpoint by @anshbansal in https://github.com/datahub-project/datahub/pull/8227
- feat(ingest/bigquery_v2): enable platform instance using project id by @asikowitz in https://github.com/datahub-project/datahub/pull/8216
- feat(stats): make rowcount more human readable by @joshuaeilers in https://github.com/datahub-project/datahub/pull/8232
- docs(es): Update aws deploy docs to correct ElasticSearch version by @iprentic in https://github.com/datahub-project/datahub/pull/8240
- feat(sdk): support patches as MCPs in file source by @hsheth2 in https://github.com/datahub-project/datahub/pull/8220
- fix(apiAuth): add resources where applicable and update docs by @RyanHolstien in https://github.com/datahub-project/datahub/pull/8234
- feat(patch): support datajob input output by @RyanHolstien in https://github.com/datahub-project/datahub/pull/8190
- feat(ingest/unity): Set external url for containers and datasets by @asikowitz in https://github.com/datahub-project/datahub/pull/8238
- docs(airflow): add docs on custom operators by @matthew-coudert-cko in https://github.com/datahub-project/datahub/pull/7913
- chore(release): update datahub upgrade docs by @pedro93 in https://github.com/datahub-project/datahub/pull/8228
- fix(ingestion/tableau): Remove unused field documentViewId by @mohdsiddique in https://github.com/datahub-project/datahub/pull/8225
- feat(ui): create fast path for immediate processing of ui sourced changes by @RyanHolstien in https://github.com/datahub-project/datahub/pull/8200
- fix(ingest/druid) Handling gracefully if no table returned in a schema by @treff7es in https://github.com/datahub-project/datahub/pull/8203
- fix(kafka-setup): bump kafka version by @david-leifker in https://github.com/datahub-project/datahub/pull/8245
- feat(ingestion/powerbi): Ingest datasets not used in PowerBI visualization(tiles/pages) by @mohdsiddique in https://github.com/datahub-project/datahub/pull/8212
- fix(sdk/dataflow): deprecate cluster and use env and platform_instance instead by @shubhamjagtap639 in https://github.com/datahub-project/datahub/pull/8201
- fix(ingest): pass platform correctly to browse path v2 helper by @asikowitz in https://github.com/datahub-project/datahub/pull/8244
- feat(search): Supporting Aggregations for hasX fields by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/8241
- fix(ingest): Call validator on the base urn as well as aspect components when ingesting by @iprentic in https://github.com/datahub-project/datahub/pull/8250
- docs(website): adjust markprompt z-index so it's not covered by nav by @jeffmerrick in https://github.com/datahub-project/datahub/pull/8255
- fix(patch): Fix exception when using default patch for patching missing aspects by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/8221
- fix(custom-search): revert underscore as quoted by @david-leifker in https://github.com/datahub-project/datahub/pull/8163
- chore(ci): add back optional static sleep for tests by @anshbansal in https://github.com/datahub-project/datahub/pull/8258
- chore(checkbox): darken all checkboxes by @joshuaeilers in https://github.com/datahub-project/datahub/pull/8248
- chore(assertions): catch any exception on assertion delete by @joshuaeilers in https://github.com/datahub-project/datahub/pull/8247
- feat(opensearch): Rollover usage events at a file size rather than time-based manner by @iprentic in https://github.com/datahub-project/datahub/pull/8182
- fix(ingest/okta): Set default of okta_profile_to_username_attr to email by @asikowitz in https://github.com/datahub-project/datahub/pull/8263
- feat(ui) Update Search & Browse to be a unified experience by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/8235
- fix(ingest/tableau): split table columns query from datasources query by @mayurinehate in https://github.com/datahub-project/datahub/pull/8217
- fix(ingest/okta): Set default of okta connector to match OIDC defaults by @anshbansal in https://github.com/datahub-project/datahub/pull/8272
- feat(elasticsearch): Add endpoint for getting the size of timeseries indices by @iprentic in https://github.com/datahub-project/datahub/pull/8265
- feat(ingest/delete-cli): Add configurable batch size; update docs by @asikowitz in https://github.com/datahub-project/datahub/pull/8274
- fix aggregation sorting in browsev2 sidebar by @joshuaeilers in https://github.com/datahub-project/datahub/pull/8276
- Support de-selecting browse paths by @joshuaeilers in https://github.com/datahub-project/datahub/pull/8242
- feat(cli): Initial support for sending exceptions to Sentry by @treff7es in https://github.com/datahub-project/datahub/pull/7172
- fix(ingestion/powerbi): use admin api resolver to fetch modified workspaces by @mohdsiddique in https://github.com/datahub-project/datahub/pull/8273
- fix: dbt-athena types mapping for complex types by @svdimchenko in https://github.com/datahub-project/datahub/pull/8264
- feat(graphql) Prevent duplicate glossary term names within a group by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/8187
- Add retries to JavaEntityClient:deleteReferencesTo by @joshuaeilers in https://github.com/datahub-project/datahub/pull/8268
- feat(ingest): Create zero usage aspects by @asikowitz in https://github.com/datahub-project/datahub/pull/8205
- fix(docs) Update Chrome extension docs to reflect current reality by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/8284
- refactor(validations): Add URL-based Routing to Dataset Validations Tab by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/8254
- fix(metadata-io): retry transactions on serialization errors when using a PostgreSQL database by @Masterchen09 in https://github.com/datahub-project/datahub/pull/8278
- docs(ingest/lineage): Update fine grained file lineage docs by @eboneil in https://github.com/datahub-project/datahub/pull/8283
- docs(posts): add examples by @abiwill in https://github.com/datahub-project/datahub/pull/7688
- chore(deprecate): remove legacy sql table by @david-leifker in https://github.com/datahub-project/datahub/pull/8253
- fix(ingest/csv-enricher): Adding extra check in csv enricher to ignore non-urn urns by @treff7es in https://github.com/datahub-project/datahub/pull/8169
- tests(urn): Add tests for more cases of invalid urns by @iprentic in https://github.com/datahub-project/datahub/pull/8285
- feat(search): add search annotations for profile aspect by @anshbansal in https://github.com/datahub-project/datahub/pull/8282
- fix(ingest/snowflake): snowflake profiling geometry type by @mayurinehate in https://github.com/datahub-project/datahub/pull/8279
- refactor(unity): Remove databricks_cli and cleanup by @asikowitz in https://github.com/datahub-project/datahub/pull/8249
- Sidebar local storage setting + toggle tooltip by @joshuaeilers in https://github.com/datahub-project/datahub/pull/8288
- fix(ui) Fix UI issues with self-referencing column level lineage by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/8296
- feat(ui) Add ability to view CLL through DataJobs in lineage visualization by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/8281
- docs(business glossary) Update business glossary docs by @eboneil in https://github.com/datahub-project/datahub/pull/8287
- docs(graphql): add developer guide for adding a new graphql endpoint by @iprentic in https://github.com/datahub-project/datahub/pull/8297
- fix(test): consolidate mae-consumer test entity registry by @david-leifker in https://github.com/datahub-project/datahub/pull/8309
- fix(ingestion) Fixes producing MAE events with browsePathsV2 aspect by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/8304
- fix(embed): set embed url to false for tableau config by @gabe-lyons in https://github.com/datahub-project/datahub/pull/8308
- fix(embed): hide chart & dashboard previews if not for looker by @gabe-lyons in https://github.com/datahub-project/datahub/pull/8307
- fix(ingest/unity): Pin databricks-sdk and update docs by @asikowitz in https://github.com/datahub-project/datahub/pull/8293
- fix(ui) Only show search and browse V2 onboarding steps if flag is on by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/8315
- fix(ingest/looker): Fix typo on ViewField creation for measures by @asikowitz in https://github.com/datahub-project/datahub/pull/8318
- docs(managed datahub): docs for v0.2.9 by @anshbansal in https://github.com/datahub-project/datahub/pull/8323
- feat(ingest/snowflake): snowpipe s3 lineage by @mayurinehate in https://github.com/datahub-project/datahub/pull/8262
- fix(ingest/postgres): fix profiling errors, skip json type column by @mayurinehate in https://github.com/datahub-project/datahub/pull/8291
- tests(elasticsearch): Add fixture test for basic scroll functionality by @iprentic in https://github.com/datahub-project/datahub/pull/8321
- feat(tableau): add config knobs for excluding external links from tableau by @gabe-lyons in https://github.com/datahub-project/datahub/pull/8314
- fix(documentation): remove links from associatedUrn by @joshuaeilers in https://github.com/datahub-project/datahub/pull/8319
- fix(browsev2): improved error handling by @joshuaeilers in https://github.com/datahub-project/datahub/pull/8326
- fix(search) Add facets list to our cache key to avoid cache collisions by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/8327
- feat(elasticsearch): Add rest.li endpoint that does truncation cleanup of a timeseries index by @iprentic in https://github.com/datahub-project/datahub/pull/8277
- Container link in browse v2 sidebar by @joshuaeilers in https://github.com/datahub-project/datahub/pull/8305
- fix(browse): try to prevent overlapping pagination calls by @joshuaeilers in https://github.com/datahub-project/datahub/pull/8329
- feat(usage): add max width to users tooltip by @gabe-lyons in https://github.com/datahub-project/datahub/pull/8335
- feat(usagestats): Optimize elasticsearch query for usage stats aggregations by @iprentic in https://github.com/datahub-project/datahub/pull/8333
- feat(ingest): add YamlFileUpdater utility by @hsheth2 in https://github.com/datahub-project/datahub/pull/8266
- feat(ui) Show Acryl information with button and banner behind flag by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/8330
- test(ingest/trino): xfail test to unblock CI by @asikowitz in https://github.com/datahub-project/datahub/pull/8340
- fix(restli): Add docs for get task status, and fix hostname regex by @iprentic in https://github.com/datahub-project/datahub/pull/8341
- docs(lineage): add read lineage example by @eboneil in https://github.com/datahub-project/datahub/pull/8322
- fix(async): submit additional default aspects only when not in async mode by @RyanHolstien in https://github.com/datahub-project/datahub/pull/8320
- feat(auth): Fine grained ownership policies by @skrydal in https://github.com/datahub-project/datahub/pull/7499
- fix(ingest/s3): Fix for flaky s3 test - uploading s3 files in consistent order by @treff7es in https://github.com/datahub-project/datahub/pull/8367
- fix(ingest/airflow): Remove info log on import by @fjmacagno in https://github.com/datahub-project/datahub/pull/8246
- fix(ui) Update copy of the demo site acryl banner by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/8370
- test(ingest/mysql): Configure sql_server tests for arm64 by @asikowitz in https://github.com/datahub-project/datahub/pull/8360
- fix(browse): filter entities by whether they might exist in the instance by @joshuaeilers in https://github.com/datahub-project/datahub/pull/8355
- ci(docs): add missing deps for lxml package for vercel by @hsheth2 in https://github.com/datahub-project/datahub/pull/8372
- feat(browsepathv2): enable incremental update browsepath by @david-leifker in https://github.com/datahub-project/datahub/pull/8354
- chore(smoke-test): use a more recent ingestion cli version in tests by @david-leifker in https://github.com/datahub-project/datahub/pull/8374
- feat(stats): show size in bytes and scale at y=min by @joshuaeilers in https://github.com/datahub-project/datahub/pull/8375
- fix(schema-registry): fix internal schema reg with custom duhe topic … by @david-leifker in https://github.com/datahub-project/datahub/pull/8371
- fix(java) Add try catch block when backfilling browse v2 by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/8377
- feat(ingest): Add advanced configuration for LDAP manager ingestion by @bda618 in https://github.com/datahub-project/datahub/pull/7784
- fix(ingest): update pydantic helpers to address unique name issue by @mayurinehate in https://github.com/datahub-project/datahub/pull/8324
- fix(cli): local variable reference before assignment by @segun-s in https://github.com/datahub-project/datahub/pull/8222
- feat(ingest): Turn on browse path v2 creation by @asikowitz in https://github.com/datahub-project/datahub/pull/8342
- chore(ingest/delta-lake): cleanup import error handling by @hsheth2 in https://github.com/datahub-project/datahub/pull/8230
- test(ingest/nifi): Configure nifi tests for arm64 by @asikowitz in https://github.com/datahub-project/datahub/pull/8363
- build(ingest): Pin pydeequ to unblock CI by @asikowitz in https://github.com/datahub-project/datahub/pull/8381
- fix(ingest/sql-common): Fix profile_table_level_only by @asikowitz in https://github.com/datahub-project/datahub/pull/8331
- feat(ingest): schema-aware SQL parsing for column-level lineage by @hsheth2 in https://github.com/datahub-project/datahub/pull/8334
- fix(config) Set search and browse flags default off by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/8378
- test(ingest/kafka): Configure kafka connect tests for arm64 by @asikowitz in https://github.com/datahub-project/datahub/pull/8362
- fix(ui): fix a too much recursion error when column lineage is highlighted by @Masterchen09 in https://github.com/datahub-project/datahub/pull/8207
- fix(ingest/s3): Deequ import rearragement by @treff7es in https://github.com/datahub-project/datahub/pull/8389
- feat(ingest): Add disable flag for TopicRecordNameStrategy by @segun-s in https://github.com/datahub-project/datahub/pull/8224
- refactor(graphql): make graphql engine extensible by @shirshanka in https://github.com/datahub-project/datahub/pull/8394
- feat(ui) Allow a configurable default tab for domain entity profile page by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/8316
- test(ingest): Aspect level golden file comparison by @asikowitz in https://github.com/datahub-project/datahub/pull/8310
- test(ingest/airflow): Fix test for airflow 2.6.3 by @asikowitz in https://github.com/datahub-project/datahub/pull/8393
- feat(ingest/bigquery): support column-level lineage by @hsheth2 in https://github.com/datahub-project/datahub/pull/8382
- build(ingest): Inline import testing utils for check cli by @asikowitz in https://github.com/datahub-project/datahub/pull/8400
- refactor(ui): uniform ordering of items on the entities sidebar section by @sudhakarast in https://github.com/datahub-project/datahub/pull/8365
- test(ingest/testing-utils): Add back delta info ignore path by @asikowitz in https://github.com/datahub-project/datahub/pull/8402
- fix(ingest/bigquery): skip self-references when generating lineage by @hsheth2 in https://github.com/datahub-project/datahub/pull/8403
- feat(ingest): datamodel to ingest organisation role metadata for a dataset by @sheeru in https://github.com/datahub-project/datahub/pull/8267
- test(ingest/kafka-connect): Attempt to fix flaky test by @asikowitz in https://github.com/datahub-project/datahub/pull/8404
- feat(ingest/dbt-cloud): reduce graphql query complexity by @hsheth2 in https://github.com/datahub-project/datahub/pull/8390
- fix(ingest/snowflake): fix azure cloud region ids in external url by @mayurinehate in https://github.com/datahub-project/datahub/pull/8376
- feat(elasticsearch): Implement optimization to use reindexing instead… by @iprentic in https://github.com/datahub-project/datahub/pull/8352
- feat(ingest/presto-on-hive): Extracting all the table properties from Hive Metastore by @treff7es in https://github.com/datahub-project/datahub/pull/8348
- feat(openapi): Add openapi endpoint for getting task status by @iprentic in https://github.com/datahub-project/datahub/pull/8391
- feat(ingest/airflow): able to set `platform_instance` in `Dataset` by @dungdm93 in https://github.com/datahub-project/datahub/pull/8313
- test(ingest/minio): Configure delta lake minio tests for arm64 by @asikowitz in https://github.com/datahub-project/datahub/pull/8364
- docs(ingest): Add warning for Python 3.7 deprecation by @asikowitz in https://github.com/datahub-project/datahub/pull/8411
- fix(ingest/tableau): graceful handling of get all datasources failure… by @mayurinehate in https://github.com/datahub-project/datahub/pull/8406
- fix(owner): Corrects ownership aspect generation during update operations by @pedro93 in https://github.com/datahub-project/datahub/pull/8399
- chore(stats): change default stats lookback by @anshbansal in https://github.com/datahub-project/datahub/pull/8408
- feat(ingest/kafka-connect): allow setting platform_instance for kafka… by @mayurinehate in https://github.com/datahub-project/datahub/pull/8299
- fix(ingestion/powerbi): increment msal version by @mohdsiddique in https://github.com/datahub-project/datahub/pull/8385
- docs(perf-test) Update README by @eboneil in https://github.com/datahub-project/datahub/pull/8410
- fix(ingest/s3): fix test flakiness by @treff7es in https://github.com/datahub-project/datahub/pull/8416
- fix(ingest): tweak ingestion exit codes by @hsheth2 in https://github.com/datahub-project/datahub/pull/8418
- build(ingest/boto3): Update boto3-stubs to fix CI by @asikowitz in https://github.com/datahub-project/datahub/pull/8425
- feat(ingest/snowflake): View CLL from sql parsing of view definition by @asikowitz in https://github.com/datahub-project/datahub/pull/8419
- fix(ingest/snowflake): Add sqlglot as snowflake dependency by @asikowitz in https://github.com/datahub-project/datahub/pull/8427
- fix(schema-reg): allow other response codes from schema registry check by @david-leifker in https://github.com/datahub-project/datahub/pull/8302
- fix: add docs on update description via graphQL by @yoonhyejin in https://github.com/datahub-project/datahub/pull/8289
- docs(databricks/spark-lineage): Fix incorrect statement by @asikowitz in https://github.com/datahub-project/datahub/pull/8423
- feat(browsev2): styling updates and select platform by @joshuaeilers in https://github.com/datahub-project/datahub/pull/8428
- fix(ui ingestion): fixing issue where stale fields could stick around when changing recipes by @gabe-lyons in https://github.com/datahub-project/datahub/pull/8421
- ci: workarounds for pyyaml installation by @hsheth2 in https://github.com/datahub-project/datahub/pull/8435
- build(ingest/boto3): Update boto3-stubs to fix CI by @asikowitz in https://github.com/datahub-project/datahub/pull/8452
- fix(ingestion-redshift): Fix Redshift ingestion logs by @arunvasudevan in https://github.com/datahub-project/datahub/pull/8454
- fix(ingest/bigquery): make sql parsing more robust by @hsheth2 in https://github.com/datahub-project/datahub/pull/8450
- fix(GreatExpections): AssertionRunEventClass does not match the examp… by @JifeiMei in https://github.com/datahub-project/datahub/pull/8243
- chore(ingest): hide ignore old/new state options by @hsheth2 in https://github.com/datahub-project/datahub/pull/8438
- docs(env): add env vars authentication by @david-leifker in https://github.com/datahub-project/datahub/pull/8436
- feat(graphql-plugins): add ability for plugins to call back to core e… by @shirshanka in https://github.com/datahub-project/datahub/pull/8449
- feat(io): refactor metadata-io module by @RyanHolstien in https://github.com/datahub-project/datahub/pull/8306
- feat(ingest/mysql): Add estimate row count for mysql by @eboneil in https://github.com/datahub-project/datahub/pull/8420
- ingest(elasticsearch): add basic profiling by @anshbansal in https://github.com/datahub-project/datahub/pull/8351
- feat(ingest/lookml): fail when nothing was produced by @hsheth2 in https://github.com/datahub-project/datahub/pull/8464
- chore(ingest): drop bigquery-beta and snowflake-beta aliases by @hsheth2 in https://github.com/datahub-project/datahub/pull/8451
- feat(ingest/nifi): add support for basic auth in nifi by @mayurinehate in https://github.com/datahub-project/datahub/pull/8457
- Fix query_tab test that was failing on CI run by @kkorchak in https://github.com/datahub-project/datahub/pull/8463
- ingest(mysql): add storage bytes information by @anshbansal in https://github.com/datahub-project/datahub/pull/8294
- fix(cache) Fix caching bug with new search filters by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/8434
- fix(browseV2) Escape forward slashes in browse v2 query by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/8446
- fix(ingestion/powerbi-report-srever): handle requests.exceptions.JSONDecodeError by @mohdsiddique in https://github.com/datahub-project/datahub/pull/8442
- feat(sdk): easily generate container urns by @hsheth2 in https://github.com/datahub-project/datahub/pull/8198
- Update presto-on-hive URN in data_platforms.json by @gabe-lyons in https://github.com/datahub-project/datahub/pull/8484
- fix(mysql): getting table name correctly by @anshbansal in https://github.com/datahub-project/datahub/pull/8476
- feat(ingest/elastic): reduce number of calls made by @anshbansal in https://github.com/datahub-project/datahub/pull/8477
- refactor(search): Support searching multiple entities in search() as in scroll() by @iprentic in https://github.com/datahub-project/datahub/pull/8461
- fix(ingest): Generate browse paths v2 for more sources; properly pass platform_instance by @asikowitz in https://github.com/datahub-project/datahub/pull/8501
- chore(ingest): add example of training metric/hyper parameters by @anshbansal in https://github.com/datahub-project/datahub/pull/8491
- feat(ingest): enable pipeline reporting by default by @hsheth2 in https://github.com/datahub-project/datahub/pull/8472
- feat(docs) Add guide for generating browsePathsV2 aspects by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/8448
- fix(browsepathv2): default browse path with empty space by @anshbansal in https://github.com/datahub-project/datahub/pull/8503
- docs: add docs on sqlglot lineage by @hsheth2 in https://github.com/datahub-project/datahub/pull/8482
- feat(search ui): Adding support for pluggable filter rendering by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/8455
- fix(ingest): hint at --update-golden-files option when tests fail by @hsheth2 in https://github.com/datahub-project/datahub/pull/8507
- ci: fix commandLine usage in build.gradle by @hsheth2 in https://github.com/datahub-project/datahub/pull/8510
- fix(ui) Fix broken dataPlatformInstance references in browseV2 by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/8485
- fix(dataProduct) Show entity count excluding soft deleted entities by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/8444
- feat(ui): Adding support for rendering assertion health status in Dataset Search Card, Search Preview, etc. by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/8460
- docs(ingest/bigquery): add permissions to profile google drive backed… by @mayurinehate in https://github.com/datahub-project/datahub/pull/8490
- chore(ingest/tableau): miscellaneous cleanup refractor by @mayurinehate in https://github.com/datahub-project/datahub/pull/8417
- docs(ingest/lookml): clarify connection map config by @hsheth2 in https://github.com/datahub-project/datahub/pull/8508
- config(ebean): add ebean retry configuration by @david-leifker in https://github.com/datahub-project/datahub/pull/8500
- fix(ingest): respect max_threads for ingestion reporter by @hsheth2 in https://github.com/datahub-project/datahub/pull/8521
- chore(ingest): bump sqllineage and sqlparse by @hsheth2 in https://github.com/datahub-project/datahub/pull/8481
- fix(search): fix lightning cache enable logic by @david-leifker in https://github.com/datahub-project/datahub/pull/8522
- docs(docker): document docker container dependency tree by @david-leifker in https://github.com/datahub-project/datahub/pull/8496
- feat(lineage): Apply search flags to scroll query in LineageSearchService by @iprentic in https://github.com/datahub-project/datahub/pull/8518
- feat(search): Throw exception instead of returning an empty response from scroll in an error case by @iprentic in https://github.com/datahub-project/datahub/pull/8517
- fix(gms): GMS hang when upgrade image #8270 by @yangjiandan in https://github.com/datahub-project/datahub/pull/8271
- fix(ui): Allows deselection of members in add members modal for a group by @Sukeerthi31 in https://github.com/datahub-project/datahub/pull/8349
- fix(ui) Remove initial redirect logic from frontend by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/8401
- fix(sso) - Add redirect_uri to authenticate route on 401 error by @mkamalas in https://github.com/datahub-project/datahub/pull/8346
- fix(auth): ignore case when comparing http headers by @lix-mms in https://github.com/datahub-project/datahub/pull/8356
- fix(ui): use locale lowercase when filtering columns of an entity in the lineage by @Masterchen09 in https://github.com/datahub-project/datahub/pull/8213
- feat(elasticsearch): allow bulk delete by @david-leifker in https://github.com/datahub-project/datahub/pull/8424
- feat(metrics): add metrics for aspect write and bytes by @david-leifker in https://github.com/datahub-project/datahub/pull/8526
- fix(ingest/build): Fix sagemaker mypy and flake8 issues by @treff7es in https://github.com/datahub-project/datahub/pull/8530
- feat(siblings): hiding non-existant siblings in FE by @gabe-lyons in https://github.com/datahub-project/datahub/pull/8528
- fix(ingest): pin boto3-stubs in CI by @hsheth2 in https://github.com/datahub-project/datahub/pull/8527
- docs: small update to homepage by @shirshanka in https://github.com/datahub-project/datahub/pull/8483
- fix(ingest): remove duplication of tags by @anshbansal in https://github.com/datahub-project/datahub/pull/8532
- ci: reduce git fetch depth by @hsheth2 in https://github.com/datahub-project/datahub/pull/8473
- feat(ingest/vertica): performance improvement and bug fixes by @vishalkSimplify in https://github.com/datahub-project/datahub/pull/8328
- test(ingest): test case statements with sql parser by @hsheth2 in https://github.com/datahub-project/datahub/pull/8437
- feat(ingestion/tableau): support column level lineage for custom sql by @mohdsiddique in https://github.com/datahub-project/datahub/pull/8466
- fix(ingest/json-schema): convert non-string enums to strings by @benjamin-awd in https://github.com/datahub-project/datahub/pull/8479
- feat(browseV2): add browseV2 logic to system update by @RyanHolstien in https://github.com/datahub-project/datahub/pull/8506
- feat(cli): Adds ability to upload recipes to DataHub's UI by @pedro93 in https://github.com/datahub-project/datahub/pull/8317
- feat(presto-on-hive): allow v1 fieldpaths in the presto-on-hive source by @gabe-lyons in https://github.com/datahub-project/datahub/pull/8474
- fix(ui) Make multiple small updates to new search and browse by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/8524
- feat(search): Allow aggregating on facets that are not explicitly part of default filter set by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/8540
- fix(test): increase siblings.js test stability by @david-leifker in https://github.com/datahub-project/datahub/pull/8542
New Contributors
- @matthew-coudert-cko made their first contribution in https://github.com/datahub-project/datahub/pull/7913
- @eboneil made their first contribution in https://github.com/datahub-project/datahub/pull/8283
- @fjmacagno made their first contribution in https://github.com/datahub-project/datahub/pull/8246
- @segun-s made their first contribution in https://github.com/datahub-project/datahub/pull/8222
- @sudhakarast made their first contribution in https://github.com/datahub-project/datahub/pull/8365
- @sheeru made their first contribution in https://github.com/datahub-project/datahub/pull/8267
- @dungdm93 made their first contribution in https://github.com/datahub-project/datahub/pull/8313
- @JifeiMei made their first contribution in https://github.com/datahub-project/datahub/pull/8243
- @kkorchak made their first contribution in https://github.com/datahub-project/datahub/pull/8463
- @Sukeerthi31 made their first contribution in https://github.com/datahub-project/datahub/pull/8349
- @lix-mms made their first contribution in https://github.com/datahub-project/datahub/pull/8356
- @benjamin-awd made their first contribution in https://github.com/datahub-project/datahub/pull/8479
Full Changelog: https://github.com/datahub-project/datahub/compare/v0.10.4...v0.10.5
v0.10.4
Released on 2023-06-09 by @pedro93.
Release Highlights
User Experience
- You can now create and assign Custom Ownership types within DataHub; plus, we now display the owner type on an Entity Page <img width="336" alt="ownershiptype-displayed" src="https://github.com/datahub-project/datahub/assets/15873986/3bd84ef5-0860-4dfb-8670-b23857c6d6e0">
- Various bug fixes to Column Level Lineage visualization
Metadata ingestion
- You can now define column-level lineage (aka fine-grained lineage) via our file-based lineage source
- Looker: Ingest Looks that are not part of a Dashboard
- Glue: Error reporting now includes lineage failures
- BigQuery: Now supports deduplicating LogEntries based on insertId, timestamp, and logName
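The BigQuery deduplication item above can be pictured with a minimal sketch. This is not DataHub's implementation: it assumes each log entry is available as a plain dictionary carrying `insertId`, `timestamp`, and `logName` keys, and the function name `dedupe_log_entries` is hypothetical.

```python
from typing import Dict, Iterable, Iterator, Set, Tuple


def dedupe_log_entries(entries: Iterable[Dict[str, str]]) -> Iterator[Dict[str, str]]:
    """Yield each entry once, keyed on (insertId, timestamp, logName).

    Hypothetical helper for illustration only; the dict-based entry shape
    is an assumption, not the structure DataHub uses internally.
    """
    seen: Set[Tuple[str, str, str]] = set()
    for entry in entries:
        key = (entry["insertId"], entry["timestamp"], entry["logName"])
        if key in seen:
            # Same composite key already emitted: treat as a duplicate.
            continue
        seen.add(key)
        yield entry


sample = [
    {"insertId": "a1", "timestamp": "2023-06-01T00:00:00Z", "logName": "audit"},
    {"insertId": "a1", "timestamp": "2023-06-01T00:00:00Z", "logName": "audit"},
    {"insertId": "a1", "timestamp": "2023-06-01T00:00:05Z", "logName": "audit"},
]
unique = list(dedupe_log_entries(sample))
```

Keying on all three fields together, rather than on `insertId` alone, means entries that share an `insertId` but differ in timestamp or log name are kept as distinct.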
Docs
- CSV Enricher: improvements to sample CSV and recipe
- Guide for changing default DataHub credentials
- Updated guide to apply time-based filters on Lineage
What's Changed
- ci(ingest/kafka): improve kafka integration test reliability by @hsheth2 in https://github.com/datahub-project/datahub/pull/8085
- fix(ingest/bigquery): Deduplicate LogEntries based on insertId, timestamp, logName by @asikowitz in https://github.com/datahub-project/datahub/pull/8132
- feat(ingest/glue): report glue job lineage failures, update doc by @mayurinehate in https://github.com/datahub-project/datahub/pull/8126
- feat(lineage source): add fine grained lineage support by @anshbansal in https://github.com/datahub-project/datahub/pull/7904
- docs(glue): fix broken link by @mayurinehate in https://github.com/datahub-project/datahub/pull/8135
- feat(custom ownership): Adds Custom ownership types as a top level entity by @pedro93 in https://github.com/datahub-project/datahub/pull/8045
- Update updating-datahub.md for v0.10.3 release by @iprentic in https://github.com/datahub-project/datahub/pull/8139
- feat: add dbt-athena adapter support for column types mapping by @svdimchenko in https://github.com/datahub-project/datahub/pull/8116
- docs(csv-enricher): add example csv file & recipe by @gabe-lyons in https://github.com/datahub-project/datahub/pull/8141
- chore(ci): update base requirements file by @anshbansal in https://github.com/datahub-project/datahub/pull/8144
- fix(ingest/s3): Path spec aware folder traversal by @treff7es in https://github.com/datahub-project/datahub/pull/8095
- fix(ui) Fix selecting columns in Lineage tab for CLL by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/8129
- feat(search): adding support for `_entityType` filter in the application layer + frontend by @gabe-lyons in https://github.com/datahub-project/datahub/pull/8102
- docs(ingest/nifi): fix broken links by @mayurinehate in https://github.com/datahub-project/datahub/pull/8143
- fix(scroll): fix scroll cache key for hazelcast by @RyanHolstien in https://github.com/datahub-project/datahub/pull/8149
- chore(json): fix json vulnerability by @RyanHolstien in https://github.com/datahub-project/datahub/pull/8150
- fix(ingest/json-schema): handle property inheritance in unions by @hsheth2 in https://github.com/datahub-project/datahub/pull/8121
- chore(log): fix log as error instead of info by @anshbansal in https://github.com/datahub-project/datahub/pull/8146
- fix(lineagecounts) Include entities that are filtered out due to sibling logic in the filtered count of lineage counts by @iprentic in https://github.com/datahub-project/datahub/pull/8152
- fix(stats): display consistent query count on stats tab by @joshuaeilers in https://github.com/datahub-project/datahub/pull/8151
- fix(ingest): remove `original_table_name` logic in sql source by @hsheth2 in https://github.com/datahub-project/datahub/pull/8130
- feat(ingest): add more fail-safes to stateful ingestion by @hsheth2 in https://github.com/datahub-project/datahub/pull/8111
- feat(ingest/snowflake): support for more operation types by @mayurinehate in https://github.com/datahub-project/datahub/pull/8158
- fix(ui) Show Entities first on Domain pages again by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/8159
- fix(ingest/nifi): allow nifi site url with context path by @mayurinehate in https://github.com/datahub-project/datahub/pull/8156
- feat(ingest): Create Browse Paths V2 under flag by @asikowitz in https://github.com/datahub-project/datahub/pull/8120
- fix(ingestion/looker): set project-name for imported_projects views by @mohdsiddique in https://github.com/datahub-project/datahub/pull/8086
- fix(docs): Fix ownership type typos by @pedro93 in https://github.com/datahub-project/datahub/pull/8155
- docs(townhall) feb and march town hall agenda and recording by @maggiehays in https://github.com/datahub-project/datahub/pull/7676
- feat(ingest/unity): Add qualified name to dataset properties by @asikowitz in https://github.com/datahub-project/datahub/pull/8164
- feat(ingest/bigquery_v2): enable platform instance using project id by @Khurzak in https://github.com/datahub-project/datahub/pull/8142
- feat(ingest/snowflake): Deprecate legacy lineage and optimize query history joins by @asikowitz in https://github.com/datahub-project/datahub/pull/8176
- fix(ingest/kafka): Fixing error printing in Kafka properties get call by @treff7es in https://github.com/datahub-project/datahub/pull/8145
- fix(ingest/snowflake): set use_quoted_name to profile lowercase tables by @mayurinehate in https://github.com/datahub-project/datahub/pull/8168
- feat(classification): support for regex based custom infotypes by @mayurinehate in https://github.com/datahub-project/datahub/pull/8177
- fix(restli): update base client retry logic by @david-leifker in https://github.com/datahub-project/datahub/pull/8172
- fix(ingest): Fix modeldocgen; bump feast to relax pyarrow constraint by @asikowitz in https://github.com/datahub-project/datahub/pull/8178
- refactor(ci): move from sleep to kafka lag based testing by @shirshanka in https://github.com/datahub-project/datahub/pull/8094
- docs(lineage): document timestamp filtering in lineage feature by @iprentic in https://github.com/datahub-project/datahub/pull/8174
- build(ingest/feast): Pin feast to minor version by @asikowitz in https://github.com/datahub-project/datahub/pull/8180
- feat(ingest/snowflake): Okta OAuth support; update docs by @asikowitz in https://github.com/datahub-project/datahub/pull/8157
- feat(ingest/presto-on-hive): add support for extra properties and merge property capabilities by @treff7es in https://github.com/datahub-project/datahub/pull/8147
- docs(managed datahub): release notes for v0.2.8 by @anshbansal in https://github.com/datahub-project/datahub/pull/8185
- fix(nocode): fix DeleteLegacyGraphRelationshipsStep for Elasticsearch by @david-leifker in https://github.com/datahub-project/datahub/pull/8181
- feat(docker): Add the jattach tool to the docker container (#7538) by @yangjiandan in https://github.com/datahub-project/datahub/pull/8040
- refactor: Return original exception as caused by by @Jorricks in https://github.com/datahub-project/datahub/pull/7722
- docs(ingest) Add MetadataChangeProposalWrapper import to example code by @iprentic in https://github.com/datahub-project/datahub/pull/8175
- fix(ingest/kafka): Better error handling around topic and topic description extraction by @asikowitz in https://github.com/datahub-project/datahub/pull/8183
- fix(vulnerabilities)/vulnerabilities_fixes_datahub (#8075) by @david-leifker in https://github.com/datahub-project/datahub/pull/8189
- fix: add dedicated guide on changing default credentials by @yoonhyejin in https://github.com/datahub-project/datahub/pull/8153
- feat(classification): configurable minimum values threshold by @mayurinehate in https://github.com/datahub-project/datahub/pull/8186
- fix(ingestion/looker): ingest looks not part of dashboard by @mohdsiddique in https://github.com/datahub-project/datahub/pull/8140
- fix(ingest/profiling): only apply monkeypatches once when profiling by @hsheth2 in https://github.com/datahub-project/datahub/pull/8160
- docs(tableau): site config is required for tableau cloud / tableau online by @mohdsiddique in https://github.com/datahub-project/datahub/pull/8041
- fix(ingest/bigquery): Swap log order to avoid confusion by @asikowitz in https://github.com/datahub-project/datahub/pull/8197
- fix(ingest/redshift): Adding env parameter where it was missing for urn generation by @treff7es in https://github.com/datahub-project/datahub/pull/8199
- revert(ingest/bigquery): Do not emit DataPlatformInstance; remove references to platform_instance by @asikowitz in https://github.com/datahub-project/datahub/pull/8196
- docs(managed datahub): add docs link to v0.2.8 by @anshbansal in https://github.com/datahub-project/datahub/pull/8202
- Add combined health check endpoint which can check multiple components by @iprentic in https://github.com/datahub-project/datahub/pull/8191
- chore(cp-schema-registry): bump minor version by @david-leifker in https://github.com/datahub-project/datahub/pull/8192
- feat(ingest): Produce browse paths v2 on demand and with platform instance by @asikowitz in https://github.com/datahub-project/datahub/pull/8173
New Contributors
- @svdimchenko made their first contribution in https://github.com/datahub-project/datahub/pull/8116
- @Khurzak made their first contribution in https://github.com/datahub-project/datahub/pull/8142
- @Jorricks made their first contribution in https://github.com/datahub-project/datahub/pull/7722
Full Changelog: https://github.com/datahub-project/datahub/compare/v0.10.3...v0.10.4
v0.10.3
Released on 2023-05-25 by @iprentic.
View the release notes for v0.10.3 on GitHub.
DataHub v0.10.2
Released on 2023-04-13 by @iprentic.
View the release notes for DataHub v0.10.2 on GitHub.
DataHub v0.10.1
Released on 2023-03-23 by @aditya-radhakrishnan.
View the release notes for DataHub v0.10.1 on GitHub.
DataHub v0.10.0
Released on 2023-02-07 by @david-leifker.
View the release notes for DataHub v0.10.0 on GitHub.
DataHub v0.9.6.1
Released on 2023-01-31 by @david-leifker.
View the release notes for DataHub v0.9.6.1 on GitHub.
DataHub v0.9.6
Released on 2023-01-13 by @maggiehays.
View the release notes for DataHub v0.9.6 on GitHub.
DataHub v0.9.5
Released on 2022-12-23 by @jjoyce0510.
View the release notes for DataHub v0.9.5 on GitHub.
[Known Issues] DataHub v0.9.4
Released on 2022-12-20 by @maggiehays.
View the release notes for [Known Issues] DataHub v0.9.4 on GitHub.
DataHub v0.9.3
Released on 2022-11-30 by @maggiehays.
View the release notes for DataHub v0.9.3 on GitHub.
DataHub v0.9.2
Released on 2022-11-04 by @maggiehays.
View the release notes for DataHub v0.9.2 on GitHub.
DataHub v0.9.1
Released on 2022-10-31 by @maggiehays.
View the release notes for DataHub v0.9.1 on GitHub.
DataHub v0.9.0
Released on 2022-10-11 by @szalai1.
View the release notes for DataHub v0.9.0 on GitHub.
DataHub v0.8.45
Released on 2022-09-23 by @gabe-lyons.
View the release notes for DataHub v0.8.45 on GitHub.
DataHub v0.8.44
Released on 2022-09-01 by @jjoyce0510.
View the release notes for DataHub v0.8.44 on GitHub.
DataHub v0.8.43
Released on 2022-08-09 by @maggiehays.
View the release notes for DataHub v0.8.43 on GitHub.
v0.8.42
Released on 2022-08-03 by @gabe-lyons.
View the release notes for v0.8.42 on GitHub.
v0.8.41
Released on 2022-07-15 by @anshbansal.
View the release notes for v0.8.41 on GitHub.
v0.8.40
Released on 2022-06-30 by @gabe-lyons.
View the release notes for v0.8.40 on GitHub.
v0.8.39
Released on 2022-06-24 by @maggiehays.
View the release notes for v0.8.39 on GitHub.
[!] DataHub v0.8.38
Released on 2022-06-09 by @jjoyce0510.
View the release notes for [!] DataHub v0.8.38 on GitHub.
[!] DataHub v0.8.37
Released on 2022-06-09 by @jjoyce0510.
View the release notes for [!] DataHub v0.8.37 on GitHub.
DataHub V0.8.36
Released on 2022-06-02 by @treff7es.
View the release notes for DataHub V0.8.36 on GitHub.
[!] DataHub v0.8.35
Released on 2022-05-18 by @dexter-mh-lee.
View the release notes for [!] DataHub v0.8.35 on GitHub.
v0.8.34
Released on 2022-05-04 by @maggiehays.
View the release notes for v0.8.34 on GitHub.
DataHub v0.8.33
Released on 2022-04-15 by @dexter-mh-lee.
View the release notes for DataHub v0.8.33 on GitHub.
DataHub v0.8.32
Released on 2022-04-04 by @dexter-mh-lee.
View the release notes for DataHub v0.8.32 on GitHub.
DataHub v0.8.31
Released on 2022-03-17 by @dexter-mh-lee.
View the release notes for DataHub v0.8.31 on GitHub.
Datahub v0.8.30
Released on 2022-03-17 by @rslanka.
View the release notes for Datahub v0.8.30 on GitHub.