Add initial version of automatic troubleshooter (#3296)
author Alex Prokofiev <aprokofiev@linkedin.com>
Thu, 1 Jul 2021 19:14:17 +0000 (12:14 -0700)
committer GitHub <noreply@github.com>
Thu, 1 Jul 2021 19:14:17 +0000 (12:14 -0700)
commit d5a85e35cff176d9bfa833c668a395785357e38c
tree 1c2aa1d9a2c02211891a4d2d5ffb34b9df241bff
parent 4b1d57ff82783eab04ab19fe183086f0154365e3

* GOBBLIN-1457 Add initial version of automatic troubleshooter

Gobblin users, developers and SREs spend a significant amount of time
troubleshooting issues with their jobs and flows. Previously,
troubleshooting required looking into logs across multiple systems,
such as Hadoop mappers, Azkaban jobs and the Gobblin service.
Understanding the log messages required knowledge of how Gobblin works
internally, and users frequently had to involve Gobblin developers to
interpret them.

This is the initial commit for the automatic Gobblin troubleshooter. The
current implementation intercepts logs from mappers and Azkaban jobs
and compiles a list of issues based on them. This list is then filtered
and prioritized, displayed to the user in the logs, and forwarded as an
event for consumption by the Gobblin service.
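
The flow described above (intercept log events, collect issues, filter
and prioritize, report) can be illustrated with a minimal sketch. Every
name below (TroubleshooterSketch, Repository, onLogEvent, refine) is
hypothetical and chosen for illustration only; it does not mirror the
classes added in this commit, and the de-duplication key and the
"WARN and above" threshold are assumptions, not the actual rules.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not the Gobblin implementation.
class TroubleshooterSketch {
    // Declared from least to most severe so natural ordering works for sorting.
    enum Severity { DEBUG, INFO, WARN, ERROR, FATAL }

    static final class Issue {
        final Severity severity;
        final String code;     // assumed de-duplication key
        final String summary;

        Issue(Severity severity, String code, String summary) {
            this.severity = severity;
            this.code = code;
            this.summary = summary;
        }
    }

    // Collects issues in insertion order; the first issue per code wins.
    static final class Repository {
        private final Map<String, Issue> issues = new LinkedHashMap<>();

        void put(Issue issue) {
            issues.putIfAbsent(issue.code, issue);
        }

        List<Issue> getAll() {
            return new ArrayList<>(issues.values());
        }
    }

    // Interception step: a log event at WARN or above becomes an issue.
    static void onLogEvent(Repository repo, Severity level, String logger, String message) {
        if (level.compareTo(Severity.WARN) >= 0) {
            repo.put(new Issue(level, logger + "|" + message, message));
        }
    }

    // Filtering and prioritization step: most severe first, capped at maxIssues.
    static List<Issue> refine(List<Issue> issues, int maxIssues) {
        List<Issue> sorted = new ArrayList<>(issues);
        sorted.sort(Comparator.comparing((Issue i) -> i.severity).reversed());
        return new ArrayList<>(sorted.subList(0, Math.min(maxIssues, sorted.size())));
    }

    public static void main(String[] args) {
        Repository repo = new Repository();
        onLogEvent(repo, Severity.INFO, "launcher", "Job started");
        onLogEvent(repo, Severity.WARN, "mapper", "Slow record fetch");
        onLogEvent(repo, Severity.ERROR, "mapper", "Cannot connect to source");
        onLogEvent(repo, Severity.ERROR, "mapper", "Cannot connect to source");

        // Reporting step: display the refined list to the user.
        for (Issue issue : refine(repo.getAll(), 10)) {
            System.out.println(issue.severity + " " + issue.summary);
        }
    }
}
```

In this sketch the repeated ERROR event is collapsed into one issue and
the INFO event is dropped, so the user sees a short list ordered by
severity rather than the raw log stream.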

Future commits will add issue reporting in the Gobblin service and
improve issue filtering and refinement.

For more information, see the docs on the AutomaticTroubleshooter class
and review the design doc:
https://docs.google.com/document/d/1BAYr-dHtdauX6Uf13VP3-IlHkyUQ5V64PjFIg9MtpaA/edit#

* Refactored troubleshooter to remove log4j dependency from runtime

* Addressed code review comments
39 files changed:
gobblin-all/build.gradle
gobblin-api/src/main/java/org/apache/gobblin/configuration/ConfigurationKeys.java
gobblin-cluster/src/main/java/org/apache/gobblin/cluster/GobblinHelixJobLauncher.java
gobblin-modules/gobblin-troubleshooter/build.gradle [new file with mode: 0644]
gobblin-modules/gobblin-troubleshooter/src/main/java/org/apache/gobblin/troubleshooter/AutoTroubleshooterLogAppender.java [new file with mode: 0644]
gobblin-modules/gobblin-troubleshooter/src/main/java/org/apache/gobblin/troubleshooter/AutomaticTroubleshooterImpl.java [new file with mode: 0644]
gobblin-modules/gobblin-troubleshooter/src/test/java/org/apache/gobblin/troubleshooter/AutoTroubleshooterLogAppenderTest.java [new file with mode: 0644]
gobblin-modules/gobblin-troubleshooter/src/test/java/org/apache/gobblin/troubleshooter/AutomaticTroubleshooterTest.java [new file with mode: 0644]
gobblin-runtime/build.gradle
gobblin-runtime/src/main/java/org/apache/gobblin/runtime/AbstractJobLauncher.java
gobblin-runtime/src/main/java/org/apache/gobblin/runtime/GobblinMultiTaskAttempt.java
gobblin-runtime/src/main/java/org/apache/gobblin/runtime/JobContext.java
gobblin-runtime/src/main/java/org/apache/gobblin/runtime/TaskState.java
gobblin-runtime/src/main/java/org/apache/gobblin/runtime/TaskStateCollectorService.java
gobblin-runtime/src/main/java/org/apache/gobblin/runtime/ThrowableWithErrorCode.java [new file with mode: 0644]
gobblin-runtime/src/main/java/org/apache/gobblin/runtime/mapreduce/MRJobLauncher.java
gobblin-runtime/src/main/java/org/apache/gobblin/runtime/troubleshooter/AutomaticTroubleshooter.java [new file with mode: 0644]
gobblin-runtime/src/main/java/org/apache/gobblin/runtime/troubleshooter/AutomaticTroubleshooterConfig.java [new file with mode: 0644]
gobblin-runtime/src/main/java/org/apache/gobblin/runtime/troubleshooter/AutomaticTroubleshooterFactory.java [new file with mode: 0644]
gobblin-runtime/src/main/java/org/apache/gobblin/runtime/troubleshooter/DefaultIssueRefinery.java [new file with mode: 0644]
gobblin-runtime/src/main/java/org/apache/gobblin/runtime/troubleshooter/InMemoryIssueRepository.java [new file with mode: 0644]
gobblin-runtime/src/main/java/org/apache/gobblin/runtime/troubleshooter/Issue.java [new file with mode: 0644]
gobblin-runtime/src/main/java/org/apache/gobblin/runtime/troubleshooter/IssueEventBuilder.java [new file with mode: 0644]
gobblin-runtime/src/main/java/org/apache/gobblin/runtime/troubleshooter/IssueRefinery.java [new file with mode: 0644]
gobblin-runtime/src/main/java/org/apache/gobblin/runtime/troubleshooter/IssueRepository.java [new file with mode: 0644]
gobblin-runtime/src/main/java/org/apache/gobblin/runtime/troubleshooter/IssueSeverity.java [new file with mode: 0644]
gobblin-runtime/src/main/java/org/apache/gobblin/runtime/troubleshooter/NoopAutomaticTroubleshooter.java [new file with mode: 0644]
gobblin-runtime/src/main/java/org/apache/gobblin/runtime/troubleshooter/NoopIssueRefinery.java [new file with mode: 0644]
gobblin-runtime/src/main/java/org/apache/gobblin/runtime/troubleshooter/NoopIssueRepository.java [new file with mode: 0644]
gobblin-runtime/src/main/java/org/apache/gobblin/runtime/troubleshooter/TroubleshooterException.java [new file with mode: 0644]
gobblin-runtime/src/main/java/org/apache/gobblin/runtime/util/GsonUtils.java [new file with mode: 0644]
gobblin-runtime/src/test/java/org/apache/gobblin/runtime/DummyJobContext.java
gobblin-runtime/src/test/java/org/apache/gobblin/runtime/TaskStateCollectorServiceTest.java
gobblin-runtime/src/test/java/org/apache/gobblin/runtime/TaskStateTest.java
gobblin-runtime/src/test/java/org/apache/gobblin/runtime/troubleshooter/AutomaticTroubleshooterConfigTest.java [new file with mode: 0644]
gobblin-runtime/src/test/java/org/apache/gobblin/runtime/troubleshooter/AutomaticTroubleshooterFactoryTest.java [new file with mode: 0644]
gobblin-runtime/src/test/java/org/apache/gobblin/runtime/troubleshooter/InMemoryIssueRepositoryTest.java [new file with mode: 0644]
gobblin-runtime/src/test/java/org/apache/gobblin/runtime/troubleshooter/IssueEventBuilderTest.java [new file with mode: 0644]
gradle/scripts/dependencyDefinitions.gradle