gobblin.git
vbohra [Thu, 4 Jun 2020 00:42:36 +0000 (17:42 -0700)] 
[GOBBLIN-1180] Removed dependencies on gobblin-parquet

Closes #3026 from vikrambohra/gobblinParquet

autumnust [Tue, 2 Jun 2020 22:56:59 +0000 (15:56 -0700)] 
[GOBBLIN-1078][RETRY TASK INITIALIZATION] Adding condition to ensure cancellation happened after run

Add unit test for normal sequence

Fix cancellation unit test with increased sleeping
time as temp solution

Add missing constant variables after rebasing

Bring back the piece that is lost from rebasing
for fixing travis

Address reviewer's comments

Update gitignore files for generated stuff

Closes #2919 from autumnust/taskCancelLock

sv2000 [Tue, 2 Jun 2020 22:46:24 +0000 (15:46 -0700)] 
[GOBBLIN-1177] Provide a config for overprovisioning Gobblin Yarn containers by a configurable amount

Closes #3023 from sv2000/containerOverProvision

vbohra [Tue, 2 Jun 2020 22:03:04 +0000 (15:03 -0700)] 
[GOBBLIN-1176] create gobblin-all module resolving full dependency tree

Closes #3021 from vikrambohra/gobblinAll

sv2000 [Tue, 2 Jun 2020 15:08:22 +0000 (08:08 -0700)] 
[GOBBLIN-1175] Provide an option to allow GobblinYarnAppLauncher to detach from Yarn application

Closes #3020 from sv2000/azkabanIgnoreKill

Jack Moseley [Tue, 2 Jun 2020 01:27:07 +0000 (18:27 -0700)] 
[GOBBLIN-1154] Improve gaas error messages

Closes #2993 from jack-moseley/gaas-improved-errors

Zihan Li [Tue, 2 Jun 2020 00:06:36 +0000 (17:06 -0700)] 
[GOBBLIN-1165] Add config to enable user to set additional yarn classpaths

Closes #3017 from ZihanLi58/GOBBLIN-1165-n

Lei Sun [Mon, 1 Jun 2020 21:58:21 +0000 (14:58 -0700)] 
[GOBBLIN-1170] Add missing booleanWritable type

Closes #3013 from autumnust/addmissingtype

aprokofi [Mon, 1 Jun 2020 21:56:54 +0000 (14:56 -0700)] 
[GOBBLIN-1172] Migrate to Ubuntu 18 with openjdk8 for Travis

Closes #3016 from aplex/buildenv-jdk

sv2000 [Mon, 1 Jun 2020 21:49:37 +0000 (14:49 -0700)] 
[GOBBLIN-1173] Prevent propagating exceptions from the finally block in KafkaSource#getWorkUnits

Closes #3018 from sv2000/preserveExceptionKafkaSource

Arjun [Mon, 1 Jun 2020 21:44:20 +0000 (14:44 -0700)] 
[GOBBLIN-1160] No spec delete on gobblin service start

Closes #3011 from arjun4084346/noSpecRemoveOnStart

Arjun [Mon, 1 Jun 2020 20:12:35 +0000 (13:12 -0700)] 
[GOBBLIN-1168] add metrics in all SpecStore implementations

Closes #3001 from arjun4084346/flowSpecFields2

sv2000 [Mon, 1 Jun 2020 18:29:14 +0000 (11:29 -0700)] 
[GOBBLIN-1162] Provide an option to allow slow containers to commit su…

Closes #3002 from sv2000/containerSuicide

vbohra [Sat, 30 May 2020 20:33:46 +0000 (13:33 -0700)] 
This is an empty commit

Closes #3014 from vikrambohra/emptyCommit

vbohra [Fri, 29 May 2020 23:26:35 +0000 (16:26 -0700)] 
[Gobblin 1169][GOBBLIN-1169][GOBBLIN-1167][GOBBLIN-1163][GOBBLIN-1159] Undo Reverted code for ELR + new changes

Closes #3012 from vikrambohra/GOBBLIN-1169

vbohra [Fri, 29 May 2020 17:20:32 +0000 (10:20 -0700)] 
[GOBBLIN-1167][GOBBLIN-1163][GOBBLIN-1159] Undo elr changes

Closes #3010 from vikrambohra/undoELR

vbohra [Thu, 28 May 2020 20:26:20 +0000 (13:26 -0700)] 
[GOBBLIN-1163] Fix travis formatting error

Closes #3003 from vikrambohra/travisAlive

Zihan Li [Wed, 27 May 2020 19:37:47 +0000 (12:37 -0700)] 
[GOBBLIN-1158] Use input dir to document old files instead of file paths to reduce …

Closes #2997 from ZihanLi58/COMPACTIONACTION

vbohra [Wed, 27 May 2020 17:47:26 +0000 (10:47 -0700)] 
[GOBBLIN-1159] Added code to publish gobblin artifacts to bintray

Closes #2998 from vikrambohra/bintrayFinal

Arjun [Fri, 22 May 2020 04:39:52 +0000 (21:39 -0700)] 
[GOBBLIN-1157] get a json representation of object if config type is different

Closes #2996 from arjun4084346/configList

Jack Moseley [Thu, 21 May 2020 22:34:35 +0000 (15:34 -0700)] 
[GOBBLIN-1155] Make socket connect timeout configurable for couchbase writer

Closes #2994 from jack-moseley/couchbase-socket-timeout

Arjun [Wed, 20 May 2020 22:37:12 +0000 (15:37 -0700)] 
[GOBBLIN-1150] spec catalog table schema change

Closes #2988 from arjun4084346/jsonConfig

sv2000 [Wed, 20 May 2020 18:57:35 +0000 (11:57 -0700)] 
[GOBBLIN-1127][GOBBLIN-1153] Revert "Provide an option to make metric reporting instantiatio…"

This reverts commit
239115778a08590d7ab5dfa32334efabe0e4fb49.

Closes #2992 from sv2000/revertMetricReportFailure

Jack Moseley [Tue, 19 May 2020 15:29:15 +0000 (08:29 -0700)] 
[GOBBLIN-1149] Abstract out method for constructing descriptor from config

Closes #2987 from jack-moseley/construct-descriptor

Arjun [Mon, 18 May 2020 19:26:05 +0000 (12:26 -0700)] 
[GOBBLIN-1152] enable helix instance only if it is a participant

Closes #2985 from arjun4084346/fixTaskRunner

Zihan Li [Mon, 18 May 2020 17:21:36 +0000 (10:21 -0700)] 
[GOBBLIN-1147] Use one dfsClient in FsDataWriter to rename and exists check to av…

Closes #2986 from ZihanLi58/APA-22444-new

Arjun [Fri, 15 May 2020 19:00:22 +0000 (12:00 -0700)] 
[GOBBLIN-1151] use gson in place of jackson for serialize/deserialize

use gson in place of jackson for
serialize/deserialize
enhance ConfigUtils::configToProperties() to
handle non-string type config value

unit test for backward compatibility

Trigger notification

Closes #2983 from arjun4084346/readableRequesterList

Arjun [Thu, 14 May 2020 17:18:36 +0000 (10:18 -0700)] 
[GOBBLIN-1144] remove specs from gobblin service job scheduler

Dear Gobblin maintainers,

Please accept this PR. I understand that it will
not be reviewed until I have checked off all the
steps below!

### JIRA
- [x] My PR addresses the following [Gobblin JIRA](https://issues.apache.org/jira/browse/GOBBLIN/) issues and references them in the PR title. For example, "[GOBBLIN-XXX] My Gobblin PR"
    - https://issues.apache.org/jira/browse/GOBBLIN-1144

### Description
- [x] Here are some details about my PR, including
screenshots (if applicable):
implement option 4 mentioned in the doc https://docs.google.com/document/d/1OsImllAZRnJIp2NWEOdlfw0XtqY1b-ysyKEZYLHwVbQ/edit

### Tests
- [x] My PR adds the following unit tests __OR__
does not need testing for this extremely good
reason:
trivial changes

### Commits
- [x] My commits all reference JIRA issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

Closes #2981 from arjun4084346/flowCatalogRaceCondition

Lei Sun [Wed, 13 May 2020 05:43:19 +0000 (22:43 -0700)] 
[GOBBLIN-1126] Make ORC compaction shuffle key configurable

Create Orc-Schema auto-filler;
Adding unit test for that;
Trying to re-use the object when upconvert is
required, and trying to reuse that for Orc record
projection, need to finish unit test for reuse
part first

Refactoring many pieces and pass all unit tests
for nested schema up-conversion

Remove Junit library in unit tests

Reorder the method in OrcUtils to improve
readability

Fix unit tests

Fix union and add tests for column projection

Add reducer-side OrcStruct comparator

Add unit tests for Reducer side of dedup for ORC

Make unit test check record content after
compaction

Edit gitignore file to ignore VS Code related
configuration files

Fix unit tests

Add test for multi-key on reducer side

Remove excessive log in upConvertOrcStruct

Add helper to reflect problematic file during
compaction to help debug

Catch all types of exception in map method

Address reviewer's comments

Address reviewer's comments

Enhance unit test: Add union into reducer-side
dedup's schema

Add one more test for OrcUtils and separate the
testing workload in travis for compaction job

Closes #2966 from autumnust/orc-compaction-compare-key-configurable

zhchen [Tue, 12 May 2020 23:52:51 +0000 (16:52 -0700)] 
[GOBBLIN-1146] Allow configuring autocommit in JDBCWriters

Closes #2984 from zxcware/jdbc

sv2000 [Mon, 11 May 2020 17:04:28 +0000 (10:04 -0700)] 
[GOBBLIN-1127] Provide an option to make metric reporting instantiatio…

Closes #2967 from sv2000/metricReportFatal

Arjun [Sat, 9 May 2020 00:31:47 +0000 (17:31 -0700)] 
[GOBBLIN-1145] add path in serde props

Closes #2982 from arjun4084346/distcpSdParams

zhchen [Thu, 7 May 2020 21:35:15 +0000 (14:35 -0700)] 
[GOBBLIN-1142] Hive Distcp support filter on partitioned or snapshot tables

Closes #2979 from zxcware/hive-distcp

Arjun [Mon, 4 May 2020 22:59:18 +0000 (15:59 -0700)] 
[GOBBLIN-1141] add support for common job properties in helix job scheduler

add support for common job properties in helix job
scheduler

address review comments

address review comments

Closes #2977 from arjun4084346/useClusterConfigs

Zihan Li [Sat, 2 May 2020 06:30:14 +0000 (23:30 -0700)] 
[GOBBLIN-1133] Add CompactionSuiteBaseWithConfigurableCompleteAction to make complete action configurable

Fix issue where YarnService uses the old token to
acquire a new container

pull origin master to change the test

remove unintentional change

address comments

refactor class name

[GOBBLIN-1133]Add
CompactionSuiteBaseWithConfigurableCompleteAction
to make complete action configurable

reformat code

address comments

Closes #2973 from ZihanLi58/GOBBLIN-1133

Jack Moseley [Fri, 1 May 2020 22:08:08 +0000 (15:08 -0700)] 
[GOBBLIN-1137] Add API for getting list of proxy users from an azkaban project

Closes #2976 from jack-moseley/get-proxy-users

Zihan Li [Fri, 1 May 2020 20:23:52 +0000 (13:23 -0700)] 
[GOBBLIN-1136] Make LogCopier be able to refresh FileSystem for long running job use cases

Closes #2975 from ZihanLi58/GOBBLIN-1136

Arjun [Thu, 30 Apr 2020 23:55:11 +0000 (16:55 -0700)] 
[GOBBLIN-1135] added back flow remove feature for spec executors when dag manager is not enabled
codestyle changes

Closes #2974 from arjun4084346/deleteClusterJobs

Jack Moseley [Thu, 30 Apr 2020 06:39:30 +0000 (23:39 -0700)] 
[GOBBLIN-1130] Add API for adding proxy user to azkaban project

Closes #2971 from jack-moseley/add-proxy

Arjun [Wed, 29 Apr 2020 00:00:20 +0000 (17:00 -0700)] 
[GOBBLIN-1132] move the logic of requester list verification to RequesterService implementation

Closes #2969 from arjun4084346/requesterListFix

Lei Sun [Mon, 27 Apr 2020 21:44:14 +0000 (14:44 -0700)] 
[GOBBLIN-1131] Bump up ORC deps to 1.6.3 to pick ORC-616

Closes #2972 from autumnust/master

sv2000 [Thu, 23 Apr 2020 23:29:18 +0000 (16:29 -0700)] 
[GOBBLIN-1125] Add metrics to measure job status state store performan…

Closes #2965 from sv2000/jobStatusMetrics

sv2000 [Thu, 23 Apr 2020 20:14:31 +0000 (13:14 -0700)] 
[GOBBLIN-1123][GOBBLIN-1124][GOBBLIN-1115] Report orchestration delay for Gobblin Service flows

Closes #2963 from sv2000/gaasOrchestrationDelay

Jack Moseley [Thu, 23 Apr 2020 16:15:27 +0000 (09:15 -0700)] 
[GOBBLIN-1115] Add flow level data movement authorization in gaas

Closes #2955 from jack-moseley/data-authorization

Kuai Yu [Wed, 22 Apr 2020 20:56:06 +0000 (13:56 -0700)] 
[GOBBLIN-1124] Add exception error message.

Closes #2964 from yukuai518/error

Zihan Li [Tue, 21 Apr 2020 22:23:37 +0000 (15:23 -0700)] 
[GOBBLIN-1121] Fix issue where YarnService uses the old token to acquire a new container

Closes #2961 from ZihanLi58/GOBBLIN-1121-new

sv2000 [Tue, 21 Apr 2020 18:17:31 +0000 (11:17 -0700)] 
[GOBBLIN-1122] Bump up helix-lib version

Closes #2962 from sv2000/helix09

sv2000 [Sat, 18 Apr 2020 00:07:35 +0000 (17:07 -0700)] 
[GOBBLIN-1120] Reinitialize HelixManager when Helix participant check throws an exception

Closes #2960 from sv2000/helixAssignedParticipantCheck

sv2000 [Fri, 17 Apr 2020 21:36:36 +0000 (14:36 -0700)] 
[GOBBLIN-1116] Avoid registering schema with schema registry during Me…

Closes #2956 from sv2000/metricsReporterFailure

Lei Sun [Thu, 16 Apr 2020 23:05:57 +0000 (16:05 -0700)] 
[GOBBLIN-1117] Enable record count verification for ORC format

Closes #2957 from autumnust/orc-recompact-fix

Lei Sun [Thu, 16 Apr 2020 20:59:57 +0000 (13:59 -0700)] 
[GOBBLIN-1119] Enable close-on-flush for quality-checker's err-file

Enable close-on-flush for quality-checker's err-
file

Address comments

Closes #2959 from autumnust/qualityCheckererrFileFlush

Lei Sun [Wed, 15 Apr 2020 21:19:55 +0000 (14:19 -0700)] 
[GOBBLIN-1118] Bump up ORC version to 1.6.2 to pick up ORC-569

Closes #2958 from autumnust/orc1.6.2

sv2000 [Tue, 14 Apr 2020 15:44:19 +0000 (08:44 -0700)] 
[GOBBLIN-1112] Implement a new HttpMethodRetryHandler that allows retr…

Closes #2951 from sv2000/httpRetryHandler

Jack Moseley [Mon, 13 Apr 2020 03:07:16 +0000 (20:07 -0700)] 
[GOBBLIN-1113] Carry forward requester list property when updating flowconfig

Closes #2952 from jack-moseley/update-requester

Lei Sun [Sun, 12 Apr 2020 16:50:11 +0000 (09:50 -0700)] 
[GOBBLIN-1114] OrcValueMapper schema evolution up-conversion recursive

OrcValueMapper schema evolution up-conversion
recursive

Fix findBugsMain

Address comments

Address comments

Closes #2954 from autumnust/master

Haoji Liu [Fri, 10 Apr 2020 18:30:42 +0000 (11:30 -0700)] 
[GOBBLIN-1111] CsvToJsonConverterV2 should not print out raw data in the log

Jira
https://issues.apache.org/jira/browse/GOBBLIN-1111

Test Done: None, deleting log lines.

Closes #2953 from haojiliu/master

Arjun [Wed, 8 Apr 2020 18:53:04 +0000 (11:53 -0700)] 
[GOBBLIN-1105] some refactoring and make MysqlJobStatusStateStore implement DatasetStateStore

Closes #2945 from arjun4084346/PR-3

Arjun [Tue, 7 Apr 2020 22:23:25 +0000 (15:23 -0700)] 
[GOBBLIN-1110] fix deadlock in job cancellation
replacing deprecated class MessageHandlerFactory with MultiTypeMessageHandlerFactory

Closes #2950 from arjun4084346/taskDriverStop

Alex Li [Tue, 7 Apr 2020 15:12:45 +0000 (08:12 -0700)] 
[GOBBLIN-1101] (DSS-25241): Enhance bulk api retry for ExceedQuota

GOBBLIN-1101(DSS-25241): Enhance bulk api retry
for ExceedQuota

throw out runtime exception for
InterruptedException

fix typo

restore DEFAULT_FETCH_RETRY_LIMIT

fix find root cause exception

trigger build again

add key for sleep duration

trigger again

fix format

Closes #2942 from arekusuri/GOBBLIN-1101

Arjun [Mon, 6 Apr 2020 19:52:55 +0000 (12:52 -0700)] 
[GOBBLIN-1109] partial rollback of PR#2836

Dear Gobblin maintainers,

Please accept this PR. I understand that it will
not be reviewed until I have checked off all the
steps below! jack-moseley please review

### JIRA
- [x] My PR addresses the following [Gobblin JIRA](https://issues.apache.org/jira/browse/GOBBLIN/) issues and references them in the PR title. For example, "[GOBBLIN-XXX] My Gobblin PR"
    - https://issues.apache.org/jira/browse/GOBBLIN-1109

### Description
- [x] Here are some details about my PR, including
screenshots (if applicable):
partial rollback of PR#2836, because we want to
keep the behavior of flowConfigV1 API unchanged

### Tests
- [x] My PR adds the following unit tests __OR__
does not need testing for this extremely good
reason:
NA

### Commits
- [x] My commits all reference JIRA issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

Closes #2949 from arjun4084346/rollbackV1Changes

Arjun [Sat, 4 Apr 2020 00:20:15 +0000 (17:20 -0700)] 
[GOBBLIN-1108] bump up mysql-connector

Closes #2948 from arjun4084346/mysqlBumpUp

sv2000 [Fri, 3 Apr 2020 16:29:02 +0000 (09:29 -0700)] 
[GOBBLIN-1107] Lazily initialize Helix TaskStateModelFactory in Gobbli…

Closes #2947 from sv2000/taskRunner

Arjun [Wed, 1 Apr 2020 20:47:13 +0000 (13:47 -0700)] 
[GOBBLIN-1106] do not remove requester list

Closes #2946 from arjun4084346/requesterFix

Lei Sun [Tue, 31 Mar 2020 20:08:34 +0000 (13:08 -0700)] 
[GOBBLIN-1100][GOBBLIN-1104] Change access modifier for generateTagsForPartitions to accommodate with []

Closes #2944 from autumnust/accessLevel

vbohra [Tue, 31 Mar 2020 17:05:40 +0000 (10:05 -0700)] 
[GOBBLIN-1102] Add link to GIP

Closes #2943 from vikrambohra/GOBBLIN-1102

sv2000 [Sat, 28 Mar 2020 04:26:00 +0000 (21:26 -0700)] 
[GOBBLIN-1099] Handle orphaned Yarn containers in Gobblin-on-Yarn clus…

Closes #2940 from sv2000/yarnOrphans

vbohra [Fri, 27 Mar 2020 22:26:42 +0000 (15:26 -0700)] 
[GOBBLIN-1094] Added documentation of High level consumer

Closes #2935 from vikrambohra/GOBBLIN-1094

Hung Tran [Fri, 27 Mar 2020 18:34:20 +0000 (11:34 -0700)] 
[GOBBLIN-1100] Set average fetch time in the KafkaExtractor even when…

Closes #2941 from htran1/kafka-missing-avg-fetch-time

Hung Tran [Thu, 26 Mar 2020 21:17:22 +0000 (14:17 -0700)] 
[GOBBLIN-1098] Remove commons-lang and slf4j from the orc-dep fat jar

Closes #2939 from htran1/orc-dep-fat-remove-deps

zhchen [Wed, 25 Mar 2020 20:38:20 +0000 (13:38 -0700)] 
[GOBBLIN-1096] Work with DST change in compaction watermark

Closes #2937 from zxcware/dst

Alex Li [Wed, 25 Mar 2020 01:01:30 +0000 (18:01 -0700)] 
[GOBBLIN-1097] ResultChainingIterator.add should check if the argument iterator is null

Closes #2938 from arekusuri/GOBBLIN-1097-fix-iterator-null

sv2000 [Fri, 20 Mar 2020 16:34:13 +0000 (09:34 -0700)] 
[GOBBLIN-1091] Pass Yarn application id as part of AppMaster and YarnTaskRunner's start up command

Closes #2933 from sv2000/yarnApplicationId

Arjun [Fri, 20 Mar 2020 03:32:25 +0000 (20:32 -0700)] 
[GOBBLIN-1092][Gobblin 1092] added some logs, fix checkstyle, removed some redundant code

Closes #2932 from arjun4084346/debug

Lei Sun [Thu, 19 Mar 2020 21:23:07 +0000 (14:23 -0700)] 
[GOBBLIN-1089] Refactor policyChecker for extensibility

Refactor policyChecker for extensibility

Make RowLevelPolicyChecker configurable

Make policy list visible to derived class

Address the comments

Closes #2930 from autumnust/policyCheckerRefactor

Arjun [Thu, 19 Mar 2020 03:58:13 +0000 (20:58 -0700)] 
[GOBBLIN-1090] send compiled_skip metrics

Closes #2931 from arjun4084346/compiledMetrics

Jack Moseley [Wed, 18 Mar 2020 16:22:10 +0000 (09:22 -0700)] 
[GOBBLIN-1088] Don't lowercase partition pattern config

Closes #2929 from jack-moseley/descriptor-lowercase

sv2000 [Wed, 18 Mar 2020 04:37:13 +0000 (21:37 -0700)] 
[GOBBLIN-1087] Track and report histogram of observed lag from Gobblin…

Closes #2928 from sv2000/observedLag

Zihan Li [Wed, 18 Mar 2020 04:04:23 +0000 (21:04 -0700)] 
[GOBBLIN-1080] Add configuration to add schema creation time in converter

Closes #2925 from ZihanLi58/GOBBLIN-1080-new

Jack Moseley [Wed, 18 Mar 2020 01:18:34 +0000 (18:18 -0700)] 
[GOBBLIN-1084] Refresh flowgraph when templates are modified

Closes #2924 from jack-moseley/refresh-flowgraph

Arjun [Wed, 18 Mar 2020 00:42:43 +0000 (17:42 -0700)] 
[GOBBLIN-1086] Add job orchestrated time, use job start/prepare time to set job start time in GaaS jobs

Closes #2927 from arjun4084346/fixStart

Jack Moseley [Mon, 16 Mar 2020 23:32:29 +0000 (16:32 -0700)] 
[GOBBLIN-1075] Add option to return latest failed flows

Closes #2915 from jack-moseley/failed-flows

Arjun [Fri, 13 Mar 2020 21:02:26 +0000 (14:02 -0700)] 
[GOBBLIN-1085][Gobblin-1085] fix compaction initialization

fix compaction initialization

address review comments

Closes #2926 from arjun4084346/compactionFix

Jack Moseley [Fri, 13 Mar 2020 04:26:12 +0000 (21:26 -0700)] 
[GOBBLIN-1073] Add proxy user and requester quota to GaaS

Closes #2913 from jack-moseley/gaas-quota

Arjun [Thu, 12 Mar 2020 18:01:09 +0000 (11:01 -0700)] 
[GOBBLIN-1082] compile a flow before storing it in spec catalog

Closes #2921 from arjun4084346/storeSpecAfterCompile

Haoji Liu [Thu, 12 Mar 2020 17:29:00 +0000 (10:29 -0700)] 
[GOBBLIN-1081] Adding support of timestamp data type for CsvToJsonConverter

Currently CsvToJsonConverterV2 only supports
string, bool, double and int.
Need to support the Avro logical type timestamp.

Test Done:
1. unit test
2. integration test by running a Gobblin job with
converter CsvToJsonConverterV2, and set one column
to timestamp

Closes #2920 from haojiliu/gaas

vbohra [Wed, 11 Mar 2020 23:14:19 +0000 (16:14 -0700)] 
[GOBBLIN-1040] HighLevelConsumer re-design by removing references to …

Closes #2900 from vikrambohra/GOBBLIN-1040

Arjun [Wed, 11 Mar 2020 20:31:24 +0000 (13:31 -0700)] 
[GOBBLIN-1079][Gobblin-1079] set extract.is.full property

Dear Gobblin maintainers,

Please accept this PR. I understand that it will
not be reviewed until I have checked off all the
steps below!
yukuai518  please review

### JIRA
- [x] My PR addresses the following [Gobblin-1079](https://issues.apache.org/jira/browse/GOBBLIN/) issues and references them in the PR title.

### Description
- [x] Here are some details about my PR, including
screenshots (if applicable):
A new pull job pulls from the file-based source S3 with the snapshot_only extract type,
but the job still writes its output as _append instead of _full.
This PR uses the appropriate configs so that the output path is calculated correctly.

### Tests
- [x] My PR adds the following unit tests __OR__
does not need testing for this extremely good
reason:
trivial changes

### Commits
- [x] My commits all reference JIRA issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

Closes #2918 from arjun4084346/extractIsFullFix

Zihan Li [Wed, 11 Mar 2020 01:22:26 +0000 (18:22 -0700)] 
[Gobblin-1077][GOBBLIN-1077] Fix bug in HiveDataset.resolveConfig

add writer schema to workUnitState

directly use writer.latest.schema

address comments

fix the bug in HiveDataset.resolveConfig

remove duplicated code in YarnService

reformat test code

address comments

Closes #2917 from ZihanLi58/GOBBLIN-1077

sv2000 [Wed, 11 Mar 2020 00:23:14 +0000 (17:23 -0700)] 
[GOBBLIN-1076] Make Gobblin cluster working directories configurable

Closes #2916 from sv2000/clusterWorkDir

autumnust [Mon, 9 Mar 2020 16:42:14 +0000 (09:42 -0700)] 
[GOBBLIN-1072] Being more conservative on releasing containers

Add sliding window to protect AutoScaling from
fluctuation of number of active Helix Partitions

Less conservative on tagging an instance as unused

Add a unit test

Address comments

Closes #2912 from autumnust/conservativeReleaseYarnContainers

autumnust [Sun, 8 Mar 2020 20:22:17 +0000 (13:22 -0700)] 
[GOBBLIN-1071] Retry task initialization

Retry task initialization

Removing task-cancel blocking code (and will
create in another PR)

Address comments

Fix travis failure

Fix travis failure II

Fix travis failure III

Closes #2909 from autumnust/retryTaskCreation

Jack Moseley [Sat, 7 Mar 2020 00:47:52 +0000 (16:47 -0800)] 
[GOBBLIN-1074] Sort job status array when returning flow status

Closes #2914 from jack-moseley/sort-job-status

Zihan Li [Sat, 7 Mar 2020 00:46:17 +0000 (16:46 -0800)] 
[Gobblin-1069][GOBBLIN-1069] Add NPE check in handleContainerCompletion method

Closes #2911 from ZihanLi58/GOBBLIN-1069-new

Zihan Li [Thu, 5 Mar 2020 20:08:23 +0000 (12:08 -0800)] 
[GOBBLIN-1069] Add NPE check in handleContainerCompletion method

add writer schema to workUnitState

directly use writer.latest.schema

Add NPE check in handleContainerCompletion method

change code style

Closes #2908 from ZihanLi58/GOBBLIN-1069

autumnust [Wed, 4 Mar 2020 18:01:27 +0000 (10:01 -0800)] 
[GOBBLIN-1065] Fix SSL verification issue for macOS

Closes #2903 from autumnust/master

sv2000 [Tue, 3 Mar 2020 21:09:41 +0000 (13:09 -0800)] 
[GOBBLIN-1067] Add SFTP DataNode type in Gobblin-as-a-Service (GaaS) FlowGraph

Closes #2906 from sv2000/sftpNode

Zihan Li [Tue, 3 Mar 2020 16:18:15 +0000 (08:18 -0800)] 
[GOBBLIN-1064] Make KafkaAvroSchemaRegistry extendable

add writer schema to workUnitState

directly use writer.latest.schema

Closes #2905 from ZihanLi58/GOBBLIN-1064

zhchen [Mon, 2 Mar 2020 17:33:39 +0000 (09:33 -0800)] 
[GOBBLIN-1066] field projection with namespace

Closes #2904 from zxcware/hsec

Jack Moseley [Fri, 28 Feb 2020 19:22:08 +0000 (11:22 -0800)] 
[GOBBLIN-1050] Verify requester when updating/deleting FlowConfig

Closes #2890 from jack-moseley/check-requester

Arjun [Fri, 28 Feb 2020 05:52:31 +0000 (21:52 -0800)] 
[GOBBLIN-1063] add log

Closes #2902 from arjun4084346/log