William Lo [Fri, 19 Aug 2022 17:47:31 +0000 (10:47 -0700)]
Guard against exists fs call as well (#3538)
William Lo [Thu, 18 Aug 2022 22:20:03 +0000 (15:20 -0700)]
Add error handling for timeaware finder to handle scenarios where fil… (#3537)
* Add error handling for timeaware finder to handle scenarios where files do not exist or folders not matching date format
* Check path exists before attempting ls
Andy Jiang [Thu, 18 Aug 2022 20:28:53 +0000 (13:28 -0700)]
[GOBBLIN-1675] Add pagination for GaaS on server side (#3533)
* added mysql statement and take in pageContext
* Working functionality for count and start in api layer
* Revert unnecessary changes
* Revert unncessary changes
* Add tests for pagination with filterFlow
* Handle start and count for getall call
* Handle start and count for getAll and getFilterFlows call in API layer of GaaS
* Add tests and fix for pagination for getAll
* Refactor conditions
* Fixed tests
* Add in condition to separate user defined count and start from default values
* Fix checkstyle
* Handle get all with pagination case separately from filter
* Completed ServiceManagerTest for getAll
* remove modified_time field from Get all statement
* Completed GobblinServiceManager tests for filtered flows with pagination
* Remove separate call for tests
* Add context for debug logs and documenation
* Update user.name and user.email on commits
* Add in second ordering field
* Fix testGitCreate test by utilizing json
Co-authored-by: Andy Jiang <andjiang@andjiang-mn1.linkedin.biz>
William Lo [Tue, 9 Aug 2022 22:02:31 +0000 (15:02 -0700)]
[GOBBLIN-1672] Refactor metrics from DagManager into its own class, add metrics per … (#3532)
* Refactor metrics from DagManager into its own class, add metrics per executor for SUCCESS, FAILED, SLA_EXCEEDED, START_SLA_EXCEEDED
* Address review, fix flow gauge for failed flows
William Lo [Mon, 8 Aug 2022 23:29:14 +0000 (16:29 -0700)]
[GOBBLIN-1677] Fix timezone property to read from key correctly (#3535)
* Fix timezone property to read from key correctly
* Code cleanup and additional test around invalid timezones
Bharath Krishna [Mon, 8 Aug 2022 20:23:01 +0000 (13:23 -0700)]
[Gobblin-931] Fix typo in gobblin CLI usage (#3530)
* Correct typoe "gobblon" to "gobblin"
* Fix some documentation wording
Co-authored-by: Bharath Krishna <bmurali@roku.com>
Bharath Krishna [Mon, 8 Aug 2022 20:22:27 +0000 (13:22 -0700)]
[GOBBLIN-1671] : Fix gobblin.sh script to add external jars as colon separated to HADOOP_CLASSPATH (#3531)
When using `mapreduce` mode in gobblin.sh, the additional jars passed to
gobblin.sh through --jars are comma separated. They are incorrectly
added to HADOOP_CLASSPATH that takes colon (:) separated jars.
William Lo [Mon, 25 Jul 2022 20:49:01 +0000 (13:49 -0700)]
[GOBBLIN-1656] Return a http status 503 on GaaS when quota is exceeded for user or flowgroup (#3516)
* Add e2e tests and set http response code for quota exceeded
* cleanup
* Fix checkstyle test
* Improve guard against schedule change if quota is exceeded
* Fix bug relating to exception propagation and scheduler not checking quota due to current attempt number
* Address review comments
* Refactor based on review feedback
* Fix test
* Cleanup around handling responses from callbacks in GaaS API
* Fix checkstyle
* catch quotaexceededexception instead of checking type explicitly
* Log other errors and throw 500
* Fix checkstyle dead store
* Fix checkstyle again
William Lo [Thu, 21 Jul 2022 20:15:54 +0000 (13:15 -0700)]
[GOBBLIN-1669] Clean up TimeAwareRecursiveCopyableDataset to support seconds in time… (#3528)
* Clean up TimeAwareRecursiveCopyableDataset to support seconds in timestamp, squash logs for timestamp paths that do not exist, and improve efficiency on search for non-nested timestamps
* Fix checkstyle
* Address review
* Refactor based on reviews to handle every date pattern format recursively without the iterator
* Calculate start and end date in helper
William Lo [Thu, 21 Jul 2022 16:41:09 +0000 (09:41 -0700)]
[GOBBLIN-1670] Remove rat tasks and unneeded checkstyles blocking build pipeline (#3529)
* Remove rat task on ci-cd
* Disable checkstyle for generated code
* Remove checkstyle on generated rest files as well
* Fix checkstyle for generated java code
* Remove checkstyle for gobblin-metrics-base generated code
* Remove checkstyle for generated code in test utils and http
Jack Moseley [Mon, 18 Jul 2022 19:04:55 +0000 (12:04 -0700)]
[GOBBLIN-1668] Add audit counts for iceberg registration (#3527)
* Add audit counts for iceberg registration
* Address comments
* Address comments
Christopher Harris [Mon, 11 Jul 2022 21:48:43 +0000 (16:48 -0500)]
[GOBBLIN-1667] Create new predicate - ExistingPartitionSkipPredicate (#3526)
Currently the hive.dataset.existing.entity.policy.ABORT will not
abort if there is an existing partition. One option to resolve this
is to support the ABORT configuration but that might be backwards
incompatible, so introducing a new skip predicate called
ExistingPartitionSkipPredicate that will skip any partition that
already exists in the target table
Hanghang Nate Liu [Tue, 21 Jun 2022 20:10:49 +0000 (13:10 -0700)]
Calculate requested container count based on adding allocated count and outstanding ContainerRequests in Yarn (#3524)
Hanghang Nate Liu [Fri, 17 Jun 2022 18:28:59 +0000 (11:28 -0700)]
make the requestedContainerCountMap correctly update the container count (#3523)
update the place to decrease the requestedContainerCountMap
William Lo [Wed, 15 Jun 2022 22:04:14 +0000 (15:04 -0700)]
Fix running counts for retried flows (#3520)
Jack Moseley [Wed, 15 Jun 2022 19:01:02 +0000 (12:01 -0700)]
Allow table to flush after write failure (#3522)
Zihan Li [Tue, 14 Jun 2022 20:21:01 +0000 (13:21 -0700)]
[GOBBLIN-1652]Add more log in the KafkaJobStatusMonitor in case it fails to process one GobblinTrackingEvent (#3513)
* address comments
* use connectionmanager when httpclient is not cloesable
* [GOBBLIN-1652]Add more log in the KafkaJobStatusMonitor in case it fails to process one GobblinTrackingEvent
* add test
* fix test
Co-authored-by: Zihan Li <zihli@zihli-mn2.linkedin.biz>
Hanghang Nate Liu [Thu, 9 Jun 2022 19:27:42 +0000 (12:27 -0700)]
Make Yarn container and helix instance allocation group by tag (#3519)
vbohra [Tue, 7 Jun 2022 21:48:27 +0000 (14:48 -0700)]
[GOBBLIN-1657] Update completion watermark on change_property in IcebergMetadataWriter (#3517)
* [GOBBLIN-1655] Update completion watermark for quiet tables during iceberg registration
* [GOBBLIN-1657] Update completion watermark on change_proerty GMCE
* Added test case to check watermark update on change_property
Co-authored-by: Vikram Bohra <vbohra@vbohra-mn1.linkedin.biz>
Zihan Li [Wed, 1 Jun 2022 21:39:47 +0000 (14:39 -0700)]
[GOBBLIN-1654] Add capacity floor to avoid aggressively requesting resource and small files. (#3515)
* address comments
* use connectionmanager when httpclient is not cloesable
* [GOBBLIN-1654] Add capacity floor to avoid aggressively requesting resource and small files.
* address comments
Co-authored-by: Zihan Li <zihli@zihli-mn2.linkedin.biz>
William Lo [Wed, 1 Jun 2022 18:31:32 +0000 (11:31 -0700)]
[GOBBLIN-1653] Shorten job name length if it exceeds 255 characters (#3514)
* Shorten job name length if it exceeds 255 characters (max size for a directory component)
* Address review to account for flow name in hash
William Lo [Fri, 27 May 2022 20:15:30 +0000 (13:15 -0700)]
[GOBBLIN-1650] Implement flowGroup quotas for the DagManager (#3511)
* Implement flowGroup quotas for the DagManager
* Address review and add comments to tests
* Add guard for double increments on already tracked dags
* Fix tests
Kip Kohn [Thu, 26 May 2022 17:11:00 +0000 (10:11 -0700)]
[GOBBLIN-1648] Complete use of JDBC `DataSource` 'read-only' validation query by incorporating where previously omitted (#3509)
* Complete use of JDBC `DataSource` 'read-only' validation query by incorporating where previously omitted
* Add logging to demonstrate validation query successfully configured
Jack Moseley [Wed, 25 May 2022 18:20:38 +0000 (11:20 -0700)]
Add config to set close timeout in HiveRegister (#3512)
Arjun Singh Bora [Mon, 23 May 2022 17:57:47 +0000 (10:57 -0700)]
add an API in AbstractBaseKafkaConsumerClient to list selected topics (#3501)
Matthew Ho [Wed, 18 May 2022 22:59:57 +0000 (15:59 -0700)]
[GOBBLIN-1649] Revert gobblin-1633 (#3510)
William Lo [Wed, 18 May 2022 22:34:54 +0000 (15:34 -0700)]
[GOBBLIN-1639] Prevent metrics reporting if configured, clean up workunit count metric (#3500)
* Move workunit count metrics emitting to Gobblin pipeline, add configuration to prevent metrics reporting if configured
* rename config key
* fix test
* Fix checkstyle and other tests
* Create a custom extensible hook for GaaS metrics on JobLauncher
* Add tests
* Fix failing tests
* Address review
* Address review comment
Jack Moseley [Tue, 17 May 2022 21:02:43 +0000 (14:02 -0700)]
[GOBBLIN-1647] Add hive commit GTE to HiveMetadataWriter (#3508)
* Add hive commit GTE to HiveMetadataWriter
* Address comment
Matthew Ho [Fri, 13 May 2022 16:32:54 +0000 (09:32 -0700)]
[GOBBLIN-1633] Fix compaction actions on job failure not retried if compaction succeeds (#3494)
* GOBBLIN-1633
Fix compaction on job failure not retried if compaction succeeds
* Fix typos
Matthew Ho [Thu, 12 May 2022 22:13:06 +0000 (15:13 -0700)]
[GOBBLIN-1646] Revert yarn container / helix tag group changes (#3507)
Revert "Fix bug when shrinking the container in Yarn service (#3504)"
This reverts commit
dd6d910a7e7a90d15258c6c77ebe626ae6d573f9.
Revert "[GOBBLIN-1620]Make yarn container allocation group by helix tag (#3487)"
This reverts commit
3e877951c284ccd68be3634522f9fc2c3d39f81a.
William Lo [Wed, 11 May 2022 20:48:40 +0000 (13:48 -0700)]
[GOBBLIN-1641] Add meter for sla exceeded flows (#3502)
* Add meter for sla exceeded flows
* fix tests
* Fix test nullpointer
* Address review + augment tests
Matthew Ho [Wed, 11 May 2022 18:36:12 +0000 (11:36 -0700)]
GOBBLIN-1644 (#3506)
Log assigned participant when helix participant check fails
Zihan Li [Wed, 11 May 2022 17:45:27 +0000 (10:45 -0700)]
[GOBBLIN-1645]Change the prefix of dagManager heartbeat to make it consistent with other metrics (#3505)
* address comments
* use connectionmanager when httpclient is not cloesable
* [GOBBLIN-1645]Change the prefix of dagManager heartbeat to make it consistent with other metrics
Co-authored-by: Zihan Li <zihli@zihli-mn2.linkedin.biz>
Hanghang Nate Liu [Tue, 10 May 2022 00:12:41 +0000 (17:12 -0700)]
Fix bug when shrinking the container in Yarn service (#3504)
* Fix bug when shrinking the container in Yarn service
update unit test
update requestedContainerCountMap
* address comment
Jack Moseley [Thu, 5 May 2022 17:27:42 +0000 (10:27 -0700)]
[GOBBLIN-1637] Add writer, operation, and partition info to failed metadata writer events (#3498)
* Add writer, operation, and partition info to failed metadata writer events
* Add partitionKeys to failure event
* Add all failed writers to event
William Lo [Thu, 5 May 2022 00:30:19 +0000 (17:30 -0700)]
[GOBBLIN-1638] Fix unbalanced running count metrics due to Azkaban failures (#3499)
* Fix unbalanced running count metrics due to Azkaban failures
* Fix tests
* Address comments
* Rename test
* Update comment with review
William Lo [Thu, 28 Apr 2022 22:24:10 +0000 (15:24 -0700)]
[GOBBLIN-1634] Add retries on flow sla kills (#3495)
* Add retries on flow sla kills
* Address review
* Address review comment
Hanghang Nate Liu [Thu, 28 Apr 2022 19:46:35 +0000 (12:46 -0700)]
[GOBBLIN-1620]Make yarn container allocation group by helix tag (#3487)
* make yarn service aware of helix tag and resource requirment for each workflow so that containers will be assigned to correct task
update test cases
update helix instance tag during task runner initiation
update logs
update test case
* remove lib not used, add test case
address comments
* update test cases
* remove container min and max config
Zihan Li [Thu, 28 Apr 2022 00:49:55 +0000 (17:49 -0700)]
[GOBBLIN-1636] Close DatasetCleaner after clean task (#3497)
* address comments
* use connectionmanager when httpclient is not cloesable
* [GOBBLIN-1636] Close DatasetCleaner after clean task
Co-authored-by: Zihan Li <zihli@zihli-mn2.linkedin.biz>
Zihan Li [Wed, 27 Apr 2022 21:28:43 +0000 (14:28 -0700)]
[GOBBLIN-1635] Avoid loading env configuration when using config store to improve the performance (#3496)
* address comments
* use connectionmanager when httpclient is not cloesable
* [GOBBLIN-1631]Emit heartbeat for dagManagerThread
* [GOBBLIN-1635] Avoid loading env configuration when using config store to improve the performance
* [GOBBLIN-1630] Remove flow level metrics for adhoc flows (#3491)
* Remove emitting metrics for adhoc flows in dagmanager and orchestrator
* Add tests
* Fix tests
* Address comments
* Improve test by validating gauge value
* use data node aliases to figure out data node names before using DMAS (#3493)
* [GOBBLIN-1619] WriterUtils.mkdirsWithRecursivePermission contains race condition and puts unnecessary load on filesystem (#3477)
* [GOBBLIN-1619] Fix race cond. in writerutil mkdirs
* writer util mkdirs previously had race condition when multiple processes
try to create the same parent directory. This causes incorrect
FileNotFoundException
* new implementation does not change the behavior
* Test coverage for retry config
* Wait for file to exist via retry cfg before setting perms
* use user supplied props to create FileSystem in DatasetCleanerTask (#3483)
Co-authored-by: Zihan Li <zihli@zihli-mn2.linkedin.biz>
Co-authored-by: William Lo <lo.william97@gmail.com>
Co-authored-by: Arjun Singh Bora <abora@linkedin.com>
Co-authored-by: Matthew Ho <homatt999@gmail.com>
Arjun Singh Bora [Tue, 26 Apr 2022 18:00:36 +0000 (11:00 -0700)]
use user supplied props to create FileSystem in DatasetCleanerTask (#3483)
Matthew Ho [Thu, 21 Apr 2022 19:21:02 +0000 (12:21 -0700)]
[GOBBLIN-1619] WriterUtils.mkdirsWithRecursivePermission contains race condition and puts unnecessary load on filesystem (#3477)
* [GOBBLIN-1619] Fix race cond. in writerutil mkdirs
* writer util mkdirs previously had race condition when multiple processes
try to create the same parent directory. This causes incorrect
FileNotFoundException
* new implementation does not change the behavior
* Test coverage for retry config
* Wait for file to exist via retry cfg before setting perms
Arjun Singh Bora [Tue, 19 Apr 2022 22:02:11 +0000 (15:02 -0700)]
use data node aliases to figure out data node names before using DMAS (#3493)
William Lo [Mon, 18 Apr 2022 23:37:30 +0000 (16:37 -0700)]
[GOBBLIN-1630] Remove flow level metrics for adhoc flows (#3491)
* Remove emitting metrics for adhoc flows in dagmanager and orchestrator
* Add tests
* Fix tests
* Address comments
* Improve test by validating gauge value
Zihan Li [Wed, 13 Apr 2022 22:17:00 +0000 (15:17 -0700)]
[GOBBLIN-1631]Emit heartbeat for dagManagerThread (#3492)
* address comments
* use connectionmanager when httpclient is not cloesable
* [GOBBLIN-1631]Emit heartbeat for dagManagerThread
Co-authored-by: Zihan Li <zihli@zihli-mn2.linkedin.biz>
William Lo [Mon, 11 Apr 2022 21:42:12 +0000 (14:42 -0700)]
[GOBBLIN-1624] Refactor quota management, fix various bugs in accounting of running … (#3481)
* Refactor quota management, fix various bugs in accounting of running jobs
* Add javadocs
* Address comments, add metric counts to tests
* Address scenario on startup where quota is decreased
* rename onstartup to onInit
Matthew Ho [Mon, 11 Apr 2022 20:53:58 +0000 (13:53 -0700)]
[GOBBLIN-1613] Add metadata writers field to GMCE schema (#3490)
* [GOBBLIN-1613] Add metadata writers field to GMCE schema
* generalize dataset, platform, and table naming
* more test coverage for GMCE writer
* Reverting data.json syntax change.
- Avro doesn't follow regular json syntax
* Clean up random semi colon
* Improve naming
Zihan Li [Thu, 7 Apr 2022 21:17:42 +0000 (14:17 -0700)]
Update README.md
Zihan Li [Tue, 29 Mar 2022 22:39:03 +0000 (15:39 -0700)]
[GOBBLIN-1629] Make GobblinMCEWriter be able to catch error when calculating hive specs (#3489)
* address comments
* use connectionmanager when httpclient is not cloesable
* [GOBBLIN-1629] Make GobblinMCEWriter be able to catch error when calculating hive specs
Co-authored-by: Zihan Li <zihli@zihli-mn2.linkedin.biz>
Jack Moseley [Tue, 29 Mar 2022 21:16:32 +0000 (14:16 -0700)]
Add/fix some fields of MetadataWriterFailureEvent (#3485)
Arjun Singh Bora [Mon, 28 Mar 2022 17:43:00 +0000 (10:43 -0700)]
[GOBBLIN-1627] provide option to convert datanodes names (#3484)
* provide option to convert datanodes names
address review comments
* address review comments
* address review comments
* address review comment
William Lo [Mon, 28 Mar 2022 17:42:23 +0000 (10:42 -0700)]
Add coverage for edge cases when table paths do not exist, check parents (#3482)
Zihan Li [Mon, 28 Mar 2022 17:39:44 +0000 (10:39 -0700)]
[GOBBLIN-1616] Add close connection logic in salseforceSource (#3486)
* [GOBBLIN-1616] Add close connection logic in salseforceSource
* remove unused import
* address comments
* use connectionmanager when httpclient is not cloesable
Co-authored-by: Zihan Li <zihli@zihli-mn2.linkedin.biz>
Zihan Li [Thu, 17 Mar 2022 21:45:24 +0000 (14:45 -0700)]
[GOBBLIN-1621] Make HelixRetriggeringJobCallable emit job skip event when job is dropped due to previous job is running (#3478)
* [GOBBLIN-1621] Make HelixRetriggeringJobCallable emit job skip event when job is dropped due to previous job is running
* address typo
* address comments
* fix checkStyle
* address comments
Zihan Li [Wed, 16 Mar 2022 20:36:26 +0000 (13:36 -0700)]
[GOBBLIN-1623] Fix NPE when try to close RestApiConnector (#3480)
* Fix NPE when try to close RestApiConnector
* fix typo
William Lo [Tue, 15 Mar 2022 23:24:38 +0000 (16:24 -0700)]
Clear bad mysql packages from cache in CI/CD machines (#3479)
Arjun Singh Bora [Mon, 7 Mar 2022 19:49:29 +0000 (11:49 -0800)]
[GOBBLIN-1617] pass configurations to some HadoopUtils APIs (#3475)
* pass configurations to some HadoopUtils APIs
* address review comments
Zihan Li [Thu, 3 Mar 2022 02:06:33 +0000 (18:06 -0800)]
[GOBBLIN-1616] Make RestApiConnector be able to close the connection finally (#3474)
* [GOBBLIN-1616] Make RestliAPIConnector be able to close the connection finally
* address comments
Arjun Singh Bora [Wed, 2 Mar 2022 05:37:22 +0000 (21:37 -0800)]
add config to set log level for any class (#3473)
William Lo [Thu, 24 Feb 2022 22:35:51 +0000 (14:35 -0800)]
Fix bug where partitioned tables would always return the wrong equality in paths (#3472)
William Lo [Thu, 17 Feb 2022 18:42:26 +0000 (10:42 -0800)]
[GOBBLIN-1602] Change hive table location and partition check to validate using FS r… (#3459)
* Change hive table location and partition check to validate using FS resolvePath to resolve logical paths
* Add tests for Unpartitioned file set
* Address review, add additional throw if locations mismatch for partition location validation
* Fix checkstyles again
* allow partial success policy for workunits
Jack Moseley [Tue, 15 Feb 2022 03:59:37 +0000 (19:59 -0800)]
Don't flush on change_property operation (#3467)
Jack Moseley [Sat, 12 Feb 2022 01:35:01 +0000 (17:35 -0800)]
Fix case where error GTE is incorrectly sent from MCE writer (#3466)
Arjun Singh Bora [Fri, 11 Feb 2022 16:15:13 +0000 (08:15 -0800)]
partial rollback of PR 3464 (#3465)
William Lo [Thu, 10 Feb 2022 19:03:08 +0000 (11:03 -0800)]
[GOBBLIN-1604] Throw exception if there are no allocated requests due to lack of res… (#3461)
* Throw exception if there are no allocated requests due to lack of resources
* Fix typo
William Lo [Tue, 8 Feb 2022 21:22:34 +0000 (13:22 -0800)]
[GOBBLIN-1603] Throws error if configured when encountering an IO exception while co… (#3460)
* Throws error if configured when encountering an IO exception while collecting copy entities
* Fix checkstyle
Arjun Singh Bora [Tue, 8 Feb 2022 19:11:14 +0000 (11:11 -0800)]
[GOBBLIN-1606] change DEFAULT_GOBBLIN_COPY_CHECK_FILESIZE value (#3464)
* change DEFAULT_GOBBLIN_COPY_CHECK_FILESIZE value
do not reuse the metric
* fix unit tests
* address review comments
Abhishek Nath [Mon, 7 Feb 2022 23:36:45 +0000 (15:36 -0800)]
Upgraded dropwizard metrics library version from 3.2.3 -> 4.1.2 and added a new wrapper class on dropwizard Timer.Context class to handle the code compatibility as the newer version of this class implements AutoClosable instead of Closable. (#3463)
William Lo [Mon, 7 Feb 2022 18:51:21 +0000 (10:51 -0800)]
[GOBBLIN-1605] Fix mysql ubuntu download 404 not found for Github Actions CI/CD (#3462)
* fix mysql ubuntu download 404 not found for Github Actions CI/CD
* use fix missing flag instead of updating every package
* use apt-get update
Arjun Singh Bora [Mon, 31 Jan 2022 18:42:01 +0000 (10:42 -0800)]
[GOBBLIN-1601] implement ChangePermissionCommitStep (#3457)
* implement ChangePermissionCommitStep
add a configuration to match permission of ancestor directories permissions in source and destination
* address review comments
Zihan Li [Tue, 25 Jan 2022 02:58:27 +0000 (18:58 -0800)]
[GOBBLIN-1598]Fix metrics already exist issue in dag manager (#3454)
* [GOBBLIN-1598]Fix metrics already exist issue in dag manager
* fix typo
* address comments
William Lo [Wed, 19 Jan 2022 19:51:25 +0000 (11:51 -0800)]
[GOBBLIN-1597] Add error handling in dagmanager to continue if dag fails to process,… (#3452)
* Add error handling in dagmanager to continue if dag fails to process, make Azkaban client retry on timeouts
* Addressed comments
vgnanasekaran [Wed, 19 Jan 2022 19:46:00 +0000 (11:46 -0800)]
GOBBLIN-1579 Fail job on hive existing target table location mismatch (#3433)
Co-authored-by: Gnanasekaran <vgnanasekaran@paypal.com>
William Lo [Fri, 14 Jan 2022 01:05:48 +0000 (17:05 -0800)]
[GOBBLIN-1596] Ignore already exists exception if the table has already been created… (#3451)
* Ignore already exists exception if the table has already been created by another thread or job entirely
* Address review + add concurrency fix
Zihan Li [Thu, 13 Jan 2022 19:42:02 +0000 (11:42 -0800)]
[GOBBLIn-1595]Fix the dead lock during hive registration (#3450)
William Lo [Fri, 7 Jan 2022 23:50:57 +0000 (15:50 -0800)]
Add guard in DagManager for improperly formed SLA (#3449)
Jack Moseley [Thu, 6 Jan 2022 18:22:17 +0000 (10:22 -0800)]
[GOBBLIN-1588] Send failure events for write failures when watermark is advanced in MCE writer (#3441)
* Send failure events for write failures when watermark is advanced in MCE writer
* Address comments
Zihan Li [Wed, 5 Jan 2022 23:16:41 +0000 (15:16 -0800)]
[GOBBLIN-1593] Fix bugs in dag manager about metric reporting and job status monitor (#3448)
* [GOBBLIN-1593] Fix bugs in dag manager about metric reporting and job status monitor
* address comments
* fix typo
Kip Kohn [Wed, 22 Dec 2021 00:10:56 +0000 (16:10 -0800)]
Fix bug in `JobSpecSerializer` of inadequately preventing access errors (within `MysqlJobCatalog`) (#3447)
William Lo [Tue, 21 Dec 2021 19:28:24 +0000 (11:28 -0800)]
[GOBBLIN-1583] Add System level job start SLA (#3437)
* Add System level job start SLA
* Address review and make time unit configurable, improve naming, add comments to tests
* Address review
Zihan Li [Mon, 20 Dec 2021 20:39:22 +0000 (12:39 -0800)]
[GOBBLIN-1592] Make hive copy be able to apply filter on directory (#3446)
Zihan Li [Fri, 17 Dec 2021 19:46:30 +0000 (11:46 -0800)]
[GOBBLIN-1585]GaaS (DagManager) keep retrying a failed job beyond max attempt number (#3439)
* [GOBBLIN-1585]GaaS (DagManager) keep retrying a failed job beyond max attempt number
* address comments
* adding current attempts in job config and cluster events
* add generation into job status
* address comments
* change comments
* address comments
Zihan Li [Tue, 14 Dec 2021 23:21:17 +0000 (15:21 -0800)]
[GOBBLIN-1590] Add low/high watermark information in event emitted by Gobblin cluster (#3443)
Zihan Li [Tue, 14 Dec 2021 21:40:10 +0000 (13:40 -0800)]
[HotFix]Try to fix the mysql dependency issue in Github action (#3445)
* try to fix the dependency issue
* test2
* test3
* test4
* test5
Arjun Singh Bora [Tue, 14 Dec 2021 17:31:48 +0000 (09:31 -0800)]
Lazily initialize FileContext and do not store a handle of it so it can be GC'ed when required (#3444)
umustafi [Thu, 9 Dec 2021 22:37:12 +0000 (14:37 -0800)]
[GOBBLIN-1584] Add replace record logic for Mysql writer (#3438)
* [GOBBLIN-1584] Add replace record logic for Mysql writer
* Allows user to configure a mysql ingestion job to allow replacement of the record's value for an existing record in the table.
* Also replaces a backward iteration of ResultSet() with forward iteration in MySqlWriterCommands
* remove extra line
* refactor to improve usage of overwrite data field, adjust tests accordingly
* improve documentation and throwing errors
* modify exception for TeradataWriterCommands
Co-authored-by: Urmi Mustafi <umustafi@umustafi-mn1.linkedin.biz>
William Lo [Mon, 6 Dec 2021 22:27:43 +0000 (14:27 -0800)]
Bump up code cov version (#3440)
umustafi [Tue, 30 Nov 2021 00:56:58 +0000 (16:56 -0800)]
[GOBBLIN-1581] Iterate over Sql ResultSet in Only the Forward Direction (#3435)
* after sql-connector was bumped, exception results from ResultSet.first() being called because it iterates in backward direction of set
Co-authored-by: Urmi Mustafi <umustafi@umustafi-mn1.linkedin.biz>
Arjun Singh Bora [Mon, 29 Nov 2021 22:00:21 +0000 (14:00 -0800)]
[GOBBLIN-1575] use reference count in helix manager, so that connect/disconnect are called once and at the right time (#3427)
* make a copy of helix manager, so that each job can keep using it without worrying about disconnect
* address review comments
* address review comments
* use reference count to connect/disconnect only once
* address review comments
Zihan Li [Mon, 29 Nov 2021 18:49:53 +0000 (10:49 -0800)]
[GOBBLIN-1582] Fill low/high watermark info in SourceState for QueryBasedSource (#3436)
* [GOBBLIN-1582] Fill low/high watermark info in SourceState for QueryBasedSource
* add unit test
* address comments to make high/low watermark optional
* Refactor `MysqlSpecStore` into a generalization, `MysqlNonFlowSpecStore` (not limited to `FlowSpec`s), also useable for `TopologySpec`s (#3414)
* Refactor `MysqlSpecStore` into a generalization, `MysqlNonFlowSpecStore` (not limited to `FlowSpec`s), also useable for `TopologySpec`s
* Add missing file, `MysqlNonFlowSpecStoreTest`
* Fixup `MysqlNonFlowSpecStoreTest`
* Simplify implementaiton of `MysqlSpecStore.getSpecsImpl`.
* Rename `MysqlNonFlowSpecStore` to `MysqlBaseFlowSpecStore`.
* Aid maintainers with additional code comments
* [GOBBLIN-1557] Make KafkaSource getFilteredTopics method protected (#3408)
The method was originally private, and it is useful to be able to
override it in subclasses, to redefine how to get topics to be processed.
Change-Id: If94cda2f7a5e65e52e2453427c60f4abb932b3f8
* [GOBBLIN-1567] do not set a custom maxConnLifetime for sql connection (#3418)
* do not set a custom maxConnLifetime for sql connection
* address review comment
* [GOBBLIN-1568] Exponential backoff for Salesforce bulk api polling (#3420)
* Exponential backoff for Salesforce bulk api polling
* Read min and max wait time from prop with default
* set RETENTION_DATASET_ROOT in CleanableIcebergDataset so that any retention job can use this information (#3422)
* [GOBBLIN-1569] Add RDBMS-backed `MysqlJobCatalog`, as alternative to file system storage (#3421)
* Add RDBMS-backed `MysqlJobCatalog`, as alternative to file system storage
* Streamline `JobSpecDeserializer` error handling, on review feedback.
* Refactor `GsonJobSpecSerDe` into a reusable `GenericGsonSpecSerDe`.
* Fix javadoc slipup
* Tag metrics with proxy url if available (#3423)
* remove use of deprecated helix class (#3424)
codestyle changes
* [GOBBLIN-1565]Make GMCEWriter fault tolerant so that one topic failure will not affect other topics in the same container (#3419)
* [GOBBLIN-1565]Make GMCEWriter fault tolerant so that one topic failure will not affect other topics in the same container
* address comments
* change the way we set low watermark to have a better indicate for the watermark range of the snapshot
* address comments
* fix test error
* [GOBBLIN-1552] determine flow status correctly when dag manager is disabled (#3403)
* determine flow status based on the fact if dag manager is enabled
this is needed because when dag manager is not enabled, flow level events are not emitted
and cannot be used to determine flow status. in that case flow status has to be determined
by using job statuses.
store flow status in the FlowStatus
* address review comments
* address review comments
* removed a commented line
* [GOBBLIN-1564] codestyle changes, typo corrections, improved javadoc and fix a sync… (#3415)
* codestyle changes, typo corrections, improved javadoc and fix a synchronization issue
* address review comments
* add review comments
* address review comments
* address review comments
* fix bugsFixMain
* do not delete data while dropping a hive table because data is deleted, if needed, separately (#3431)
* [GOBBLIN-1574] Added whitelist for iceberg tables to add new partitio… (#3426)
* [GOBBLIN-1574] Added whitelist for iceberg tables to add new partition column
* fix to failing test case
* Updated IncebergMetadataWriterTest to blacklist the test table from non-completeness tests
* moved dataset name update in tablemetadata
* Added newPartition checks in Table Metadata
* Fixed test case to include new_parition_enabled
Co-authored-by: Vikram Bohra <vbohra@vbohra-mn1.linkedin.biz>
* [GOBBLIN-1577] change the multiplier used in ExponentialWaitStrategy to a reasonable… (#3430)
* change the multiplier used in ExponentialWaitStrategy to 1 second. old multiplier 2ms was retrying too fast for some use cases
* .
* [GOBBLIN-1580] Check table exists instead of call create table directly to make sure table exists (#3432)
* [hotfix] workaround to catch exception when iceberg does not support get metrics for non-union type
* address comments
* [GOBBLIN-1580]Check table exists instead of call create table directly to make sure table exists
* [GOBBLIN-1573]Fix the ClassNotFoundException in streaming test pipeline (#3425)
* [GOBBLIN-1565]Make GMCEWriter fault tolerant so that one topic failure will not affect other topics in the same container
* address comments
* address comments
* [GOBBLIN-1573]Fix the ClassNotFoundException in streaming test pipeline
* [GOBBLIN-1576] skip appending record count to staging file if present… (#3429)
* [GOBBLIN-1576] skip appending record count to staging file if present already
* fixed checkstyle
* fixed method
Co-authored-by: Vikram Bohra <vbohra@vbohra-mn1.linkedin.biz>
* fix the NPE in dagManager
* fix quota check issue in dagManager
* address comments
Co-authored-by: Kip Kohn <ckohn@linkedin.com>
Co-authored-by: Joseph Allemandou <joseph.allemandou@gmail.com>
Co-authored-by: Arjun Singh Bora <abora@linkedin.com>
Co-authored-by: Jiashuo Wang <willwjs@umich.edu>
Co-authored-by: William Lo <lo.william97@gmail.com>
Co-authored-by: vbohra <vbohra@linkedin.com>
Co-authored-by: Vikram Bohra <vbohra@vbohra-mn1.linkedin.biz>
vbohra [Tue, 23 Nov 2021 17:56:03 +0000 (09:56 -0800)]
[GOBBLIN-1576] skip appending record count to staging file if present… (#3429)
* [GOBBLIN-1576] skip appending record count to staging file if present already
* fixed checkstyle
* fixed method
Co-authored-by: Vikram Bohra <vbohra@vbohra-mn1.linkedin.biz>
Zihan Li [Tue, 23 Nov 2021 00:43:34 +0000 (16:43 -0800)]
[GOBBLIN-1573]Fix the ClassNotFoundException in streaming test pipeline (#3425)
* [GOBBLIN-1565]Make GMCEWriter fault tolerant so that one topic failure will not affect other topics in the same container
* address comments
* address comments
* [GOBBLIN-1573]Fix the ClassNotFoundException in streaming test pipeline
Zihan Li [Tue, 23 Nov 2021 00:43:17 +0000 (16:43 -0800)]
[GOBBLIN-1580] Check table exists instead of call create table directly to make sure table exists (#3432)
* [hotfix] workaround to catch exception when iceberg does not support get metrics for non-union type
* address comments
* [GOBBLIN-1580]Check table exists instead of call create table directly to make sure table exists
Arjun Singh Bora [Fri, 19 Nov 2021 01:30:12 +0000 (17:30 -0800)]
[GOBBLIN-1577] change the multiplier used in ExponentialWaitStrategy to a reasonable… (#3430)
* change the multiplier used in ExponentialWaitStrategy to 1 second. old multiplier 2ms was retrying too fast for some use cases
* .
vbohra [Thu, 18 Nov 2021 23:54:14 +0000 (15:54 -0800)]
[GOBBLIN-1574] Added whitelist for iceberg tables to add new partitio… (#3426)
* [GOBBLIN-1574] Added whitelist for iceberg tables to add new partition column
* fix to failing test case
* Updated IncebergMetadataWriterTest to blacklist the test table from non-completeness tests
* moved dataset name update in tablemetadata
* Added newPartition checks in Table Metadata
* Fixed test case to include new_parition_enabled
Co-authored-by: Vikram Bohra <vbohra@vbohra-mn1.linkedin.biz>
Arjun Singh Bora [Thu, 18 Nov 2021 19:17:28 +0000 (11:17 -0800)]
do not delete data while dropping a hive table because data is deleted, if needed, separately (#3431)
Arjun Singh Bora [Wed, 10 Nov 2021 23:09:20 +0000 (15:09 -0800)]
[GOBBLIN-1564] codestyle changes, typo corrections, improved javadoc and fix a sync… (#3415)
* codestyle changes, typo corrections, improved javadoc and fix a synchronization issue
* address review comments
* add review comments
* address review comments
* address review comments
* fix bugsFixMain
Arjun Singh Bora [Wed, 10 Nov 2021 23:07:39 +0000 (15:07 -0800)]
[GOBBLIN-1552] determine flow status correctly when dag manager is disabled (#3403)
* determine flow status based on the fact if dag manager is enabled
this is needed because when dag manager is not enabled, flow level events are not emitted
and cannot be used to determine flow status. in that case flow status has to be determined
by using job statuses.
store flow status in the FlowStatus
* address review comments
* address review comments
* removed a commented line
Zihan Li [Wed, 10 Nov 2021 01:24:28 +0000 (17:24 -0800)]
[GOBBLIN-1565]Make GMCEWriter fault tolerant so that one topic failure will not affect other topics in the same container (#3419)
* [GOBBLIN-1565]Make GMCEWriter fault tolerant so that one topic failure will not affect other topics in the same container
* address comments
* change the way we set low watermark to have a better indicate for the watermark range of the snapshot
* address comments
* fix test error
Arjun Singh Bora [Mon, 8 Nov 2021 22:58:23 +0000 (14:58 -0800)]
remove use of deprecated helix class (#3424)
codestyle changes