Junkai Xue [Mon, 12 Jun 2017 18:51:38 +0000 (11:51 -0700)]
[maven-release-plugin] prepare for next development iteration
Junkai Xue [Mon, 12 Jun 2017 18:27:39 +0000 (11:27 -0700)]
[maven-release-plugin] prepare release helix-0.6.8
Lei Xia [Sat, 10 Jun 2017 00:20:22 +0000 (17:20 -0700)]
[HELIX-660] Configurable operation timeout for Helix ZKClient.
Junkai Xue [Wed, 24 May 2017 19:13:25 +0000 (12:13 -0700)]
Minor test stabilization
Adjust test sleep times to stabilize tests.
Jiajun Wang [Wed, 24 May 2017 01:02:24 +0000 (18:02 -0700)]
[HELIX-657] Fix unexpected idealstate overwrite when persist assignment is on.
1. Change persist method from set to update in PersistAssignmentStage.
The new updater only overwrites map and list fields that the controller will update during PersistAssignmentStage.
Updates from other sources that are made between the controller's read and write are kept, as long as those fields are not purposely updated by the controller.
If the current node does not exist, the new updater returns null.
2. Update the accessors that rely on the updater to check the new data before applying the change. If the returned data is null, they skip the update or creation.
Also, add a test case for PersistAssignmentStage to cover the change.
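The update-instead-of-set behavior described above can be sketched as follows. This is a minimal, language-neutral illustration in Python, not the Helix implementation: the record is modeled as a plain dict, and the names make_selective_updater, apply_update, mapFields, listFields and simpleFields are assumptions made for this example.

```python
from typing import Callable, Optional


def make_selective_updater(new_map_fields: dict, new_list_fields: dict) -> Callable:
    """Build an updater that overwrites only the given map and list fields.

    All names here are hypothetical; the record is a plain dict.
    """
    def updater(current: Optional[dict]) -> Optional[dict]:
        if current is None:
            # The node does not exist: return None so the caller writes nothing.
            return None
        merged = dict(current)
        # Overwrite only the keys the controller owns; keep everything else.
        merged["mapFields"] = {**current.get("mapFields", {}), **new_map_fields}
        merged["listFields"] = {**current.get("listFields", {}), **new_list_fields}
        return merged

    return updater


def apply_update(store: dict, path: str, updater: Callable) -> bool:
    """Apply the updater; skip both update and create when it returns None."""
    new_data = updater(store.get(path))
    if new_data is None:
        return False
    store[path] = new_data
    return True
```

With this shape, a field written by another source (for example a simple field) survives the controller's write, and a missing node is never created as a side effect.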
Lei Xia [Tue, 23 May 2017 20:58:24 +0000 (13:58 -0700)]
Allow user to enable persisting preference list and best possible state map into IdealState in full-auto mode.
Lei Xia [Tue, 23 May 2017 19:27:23 +0000 (12:27 -0700)]
Add support for setting/updating Cluster/Resource/Instance configs in ConfigAccessor.
Junkai Xue [Tue, 16 May 2017 23:39:21 +0000 (16:39 -0700)]
Test fixes for release
Junkai Xue [Tue, 16 May 2017 22:42:59 +0000 (15:42 -0700)]
Minor improvement for batch message test
Weihan Kong [Mon, 22 May 2017 22:30:15 +0000 (15:30 -0700)]
Fix TestClusterVerifier
Full-Auto verifier requires BestPossibleState to be persisted but this
change hasn’t been synced into the code base from LinkedIn Helix yet.
Apply an ad-hoc fix to TestClusterVerifier in order to publish the new
release quickly. This verifier will be synced with LinkedIn Helix later.
Junkai Xue [Tue, 16 May 2017 22:39:27 +0000 (15:39 -0700)]
Revert "[maven-release-plugin] prepare release helix-0.6.8"
This reverts commit ebad4c310414c7cc7b331fb662ac94c97bf02398.
Junkai Xue [Tue, 16 May 2017 22:39:11 +0000 (15:39 -0700)]
Revert "[maven-release-plugin] prepare for next development iteration"
This reverts commit 90781bc64bc6078bb0cf552aa1d9016392be4fb0.
Junkai Xue [Tue, 16 May 2017 20:58:32 +0000 (13:58 -0700)]
[maven-release-plugin] prepare for next development iteration
Junkai Xue [Tue, 16 May 2017 19:19:56 +0000 (12:19 -0700)]
[maven-release-plugin] prepare release helix-0.6.8
Junkai Xue [Fri, 12 May 2017 18:49:01 +0000 (11:49 -0700)]
Make map in NotificationContext synchronized
One issue we observed is that when batch messages are enabled, an NPE occurs while ZNRecord merges records.
A race condition could be the root cause. The only place that can have a race condition is the current state update map in NotificationContext, which is passed as input to multiple subtasks in BatchMessageHandler.
Junkai Xue [Wed, 10 May 2017 19:18:41 +0000 (12:18 -0700)]
[HELIX-656] Support customizing the batch state transition thread pool
To better support batch message handling, make the batch state transition thread pool configurable.
Junkai Xue [Tue, 9 May 2017 18:38:36 +0000 (11:38 -0700)]
[HELIX-631] Fix AutoRebalanceStrategy replica not assigned
In the current AutoRebalanceStrategy, Helix uses a greedy algorithm to assign replicas, under the constraints that two replicas of the same partition must not be assigned to the same node and that each node's capacity is computed from an even distribution. As a result, some replicas may be left unassigned.
With this fix, Helix will try to force-assign the orphaned replicas to the node with the minimum overload. This may cause an imbalanced assignment.
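To illustrate how orphans arise and how the fallback handles them, here is a small sketch in Python. It is not the AutoRebalanceStrategy code: the first-fit node order, the ceiling-based capacity and the function name assign_replicas are assumptions chosen to keep the example short.

```python
import math


def assign_replicas(partitions, replica_count, nodes):
    """Greedy first-fit assignment with a fallback for orphaned replicas.

    Hypothetical sketch: capacity is the even share per node, rounded up.
    """
    capacity = math.ceil(len(partitions) * replica_count / len(nodes))
    assignment = {node: [] for node in nodes}
    orphans = []

    for partition in partitions:
        for _ in range(replica_count):
            # First node with spare capacity that does not hold this partition.
            target = next(
                (n for n in nodes
                 if len(assignment[n]) < capacity and partition not in assignment[n]),
                None,
            )
            if target is None:
                orphans.append(partition)
            else:
                assignment[target].append(partition)

    for partition in orphans:
        # Still never place two replicas of one partition on the same node.
        candidates = [n for n in nodes if partition not in assignment[n]]
        if not candidates:
            continue  # more replicas than nodes; no valid placement exists
        # Force-assign to the node with the minimum overload.
        target = min(candidates, key=lambda n: len(assignment[n]) - capacity)
        assignment[target].append(partition)

    return assignment
```

With three partitions, two replicas each and three nodes, the greedy pass fills the first two nodes and leaves one replica of the last partition orphaned; the fallback then places it on an already full node, which is the imbalance the commit message warns about.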
Lei Xia [Fri, 7 Apr 2017 21:08:54 +0000 (14:08 -0700)]
Bump to JDK 1.7.
Junkai Xue [Thu, 27 Apr 2017 00:43:30 +0000 (17:43 -0700)]
Expose Callbacks so that the async operations of ZkClient can be used
The current async operations in ZkClient cannot be used because their input arguments are Callbacks hidden in the ZkAsyncCallbacks class.
Junkai Xue [Tue, 11 Apr 2017 01:23:24 +0000 (18:23 -0700)]
Add test for the case where a submessage fails but the status is still updated.
Junkai Xue [Tue, 4 Apr 2017 00:25:33 +0000 (17:25 -0700)]
Add Test for Batch Message ThreadPool
Lei Xia [Wed, 5 Apr 2017 16:34:13 +0000 (09:34 -0700)]
Update ivy files with new version name.
kishoreg [Mon, 3 Apr 2017 07:10:20 +0000 (00:10 -0700)]
Creating a separate threadpool to handle batchMessages
Lei Xia [Thu, 30 Mar 2017 23:12:07 +0000 (16:12 -0700)]
Minor fix on the asynchronous group callbacks.
Junkai Xue [Thu, 30 Mar 2017 22:45:26 +0000 (15:45 -0700)]
Make the API more user-friendly
The current API provides a map from each resource to its disabled partitions combined in one String. Change it to return a map from each resource to a list of disabled partitions.
kishoreg [Tue, 28 Mar 2017 21:22:14 +0000 (14:22 -0700)]
Auto-compress ZNodes that are larger than 1MB
kishoreg [Fri, 24 Mar 2017 17:48:05 +0000 (10:48 -0700)]
Add optional support for batching ZK callbacks, enabled by setting the system variable asyncBatchModeEnabled=true
Lei Xia [Sun, 12 Mar 2017 23:33:06 +0000 (16:33 -0700)]
Add deprecated clusterStateVerifiers classes back to their original packages for backward compatibility, and mark them all as deprecated.
Lei Xia [Wed, 15 Feb 2017 16:17:58 +0000 (08:17 -0800)]
Add PropertyPathConfig back to the code base for backward compatibility of API dependencies; the class will be removed in the next major release.
Junkai Xue [Wed, 8 Mar 2017 23:41:26 +0000 (15:41 -0800)]
Add back old getInstanceEnabledForPartition function
Add back the old getInstanceEnabledForPartition to keep backward compatibility.
Junkai Xue [Wed, 8 Mar 2017 23:39:21 +0000 (15:39 -0800)]
Refactor test base and add new synchronized tests
Because Helix runs many asynchronous tests, the current tests run slowly. Introduce these new synchronized tests to improve test efficiency.
1. Split out TaskSynchronizedTestBase.
2. Add synchronized test for delay jobs.
Junkai Xue [Sat, 18 Feb 2017 00:51:40 +0000 (16:51 -0800)]
Add methods for creating WorkflowContext and JobContext for integration test
Jean-Francois Im [Wed, 8 Mar 2017 22:21:45 +0000 (14:21 -0800)]
Ignore instances with no instance configuration
Ignore instances with no instance configuration when fetching the list
of instances that have a specific tag.
The deletion order in ZKHelixAdmin#dropInstance deletes the instance
configuration before deleting the instance itself. If this is
interrupted midway, the instance configuration is deleted but the
instance is present in the list of instances.
When fetching the list of instances with a given tag, this means that
if an instance has its configuration missing, the instance
configuration will be null and the loop will exit with NPE. This patch
adds a null check to avoid aborting the loop.
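The null check described above can be sketched as follows. This is a simplified Python model rather than the ZKHelixAdmin code: the function name instances_with_tag and the dict-based config with a "tags" entry are assumptions for illustration.

```python
def instances_with_tag(instance_names, configs, tag):
    """Return the instances whose configuration carries the given tag.

    Hypothetical model: configs maps an instance name to a dict with a
    "tags" list; a missing or None entry stands for a deleted configuration.
    """
    tagged = []
    for name in instance_names:
        config = configs.get(name)
        if config is None:
            # Configuration already deleted by an interrupted drop:
            # skip this instance instead of aborting the whole loop.
            continue
        if tag in config.get("tags", []):
            tagged.append(name)
    return tagged
```

An instance that is still listed but has lost its configuration is skipped, and the remaining instances are still evaluated.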
Junkai Xue [Thu, 9 Feb 2017 22:06:57 +0000 (14:06 -0800)]
Support canceling tasks with a synchronized task status check
Currently, in Helix, canceling or stopping a job does not check the status of its subtasks. In this rb:
1. Add a new API to support synchronously stopping a workflow/queue.
2. On the controller side, check that subtasks are stopped before marking the job status.
Junkai Xue [Thu, 9 Feb 2017 22:01:16 +0000 (14:01 -0800)]
Fix ResourceConfig validation
Fix validation of resource config for RebalanceConfig check
Lei Xia [Thu, 9 Feb 2017 21:26:18 +0000 (13:26 -0800)]
Fix the Java 8 issue.
Junkai Xue [Thu, 9 Feb 2017 19:36:32 +0000 (11:36 -0800)]
[HELIX-653] Make enabling/disabling a partition in instances resource-specific
Helix currently enables/disables a partition in instances across all resources if the partition name is the same. Fix this by associating the partition enable/disable with a specific resource.
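A small sketch of the resource-scoped bookkeeping, written in Python for brevity. The class DisabledPartitions and its methods are hypothetical and do not mirror the InstanceConfig API; the point is only that the disabled set is keyed by resource, so the same partition name under another resource is unaffected.

```python
from collections import defaultdict


class DisabledPartitions:
    """Track disabled partitions per resource (hypothetical illustration)."""

    def __init__(self):
        self._by_resource = defaultdict(set)

    def set_enabled(self, resource, partition, enabled):
        if enabled:
            self._by_resource[resource].discard(partition)
        else:
            self._by_resource[resource].add(partition)

    def is_enabled(self, resource, partition):
        # Only the named resource is consulted, not every resource.
        return partition not in self._by_resource.get(resource, set())

    def as_map(self):
        # Map from resource to a sorted list of its disabled partitions.
        return {r: sorted(p) for r, p in self._by_resource.items() if p}
```

Disabling partition "p0" for one resource leaves "p0" of every other resource enabled.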
Junkai Xue [Thu, 9 Feb 2017 19:29:52 +0000 (11:29 -0800)]
Fix build after package change
The verifier package was changed and compilation failed.
Yinan Li [Thu, 9 Feb 2017 04:22:54 +0000 (20:22 -0800)]
Import EvaluateCriteria change from master branch
Import part of the code change from:
https://github.com/apache/helix/commit/2a2908ac3d536cf3595bb2eb23d49c8153c51d5e
Yinan Li [Mon, 6 Feb 2017 22:29:38 +0000 (14:29 -0800)]
Added new DataSource values LIVEINSTANCES and INSTANCES and made CriteriaEvaluator support them
Lei Xia [Fri, 18 Nov 2016 02:15:50 +0000 (18:15 -0800)]
Avoid moving partitions unnecessarily when auto-rebalancing using default AutoRebalanceStrategy.
Lei Xia [Wed, 10 Aug 2016 15:40:15 +0000 (08:40 -0700)]
Fix bug in AutoRebalanceStrategy to try to assign orphan replicas to their preferred nodes instead of random nodes.
Lei Xia [Thu, 3 Nov 2016 21:13:42 +0000 (14:13 -0700)]
Fix a bug in BestPossibleExternalViewVerifier.
Lei Xia [Mon, 31 Oct 2016 18:06:00 +0000 (11:06 -0700)]
Move all options from IdealState to ResourceConfig, add Builder for building ResourceConfig, and a new RebalanceConfig to hold all rebalance options for a resource.
Lei Xia [Thu, 13 Oct 2016 01:15:10 +0000 (18:15 -0700)]
Persist controller leader change history with timestamp for each leader controller.
Lei Xia [Tue, 30 Aug 2016 20:26:52 +0000 (13:26 -0700)]
Persist the timestamp along with controller leader change history.
Lei Xia [Fri, 23 Sep 2016 15:26:50 +0000 (08:26 -0700)]
Refactor/rename the instance-related methods in ClusterDataCache.
Lei Xia [Wed, 21 Sep 2016 17:53:31 +0000 (10:53 -0700)]
Add cluster-level and resource-level config options to allow disabling delayed rebalance for the entire cluster or an individual resource.
Boyan Li [Mon, 19 Sep 2016 18:42:11 +0000 (11:42 -0700)]
Add more messaging metrics to participant.
Lei Xia [Mon, 19 Sep 2016 17:20:01 +0000 (10:20 -0700)]
Log improvement: add resource name to logs in different places.
Lei Xia [Mon, 12 Sep 2016 23:42:17 +0000 (16:42 -0700)]
New DelayedAutoRebalancer that supports delayed partition movements during rebalancing.
Lei Xia [Fri, 9 Sep 2016 20:31:14 +0000 (13:31 -0700)]
Persist assignment map using a specific format for MasterSlave resources (this is a short-term solution; we should get rid of it as soon as possible).
Boyan Li [Wed, 14 Sep 2016 22:29:54 +0000 (15:29 -0700)]
Add a messaging monitoring mbean to ParticipantStatusMonitor object.
Lei Xia [Thu, 15 Sep 2016 18:56:10 +0000 (11:56 -0700)]
More refactoring of rebalancer-related pipelines. 1) Move the get-rebalancer and mappingCalculator logic out to separate methods to make the main flow clearer. 2) Move the definition of ANY_LIVEINSTANCE from HelixConstants to IdealState.
Boyan Li [Wed, 14 Sep 2016 17:48:23 +0000 (10:48 -0700)]
Rename ParticipantMonitor class
Boyan Li [Thu, 8 Sep 2016 23:36:25 +0000 (16:36 -0700)]
Revert "add a mbean for participant and emit received msgs"
This reverts commit ccff605eb311072644f355686853351a10ff5b95.
Lei Xia [Fri, 2 Sep 2016 01:21:36 +0000 (18:21 -0700)]
Persist participant's offline timestamp in ParticipantHistory.
This is to persist the timestamp when a participant is going offline.
1) If a participant goes offline gracefully (by calling disconnect()), the participant will write a timestamp to its history record.
2) If a participant goes offline without calling disconnect() (e.g., GC, machine crashes), the controller will try to set the timestamp in its pipeline triggered by liveInstanceChanges.
Lei Xia [Mon, 12 Sep 2016 17:59:40 +0000 (10:59 -0700)]
Refactor: add AbstractRebalancer as an abstract class, which contains the default implementations as well as utility functions that will be used by all specific rebalancers.
Lei Xia [Fri, 9 Sep 2016 22:58:46 +0000 (15:58 -0700)]
Add reset() method to MockParticipantManager to allow reuse of the participant.
Lei Xia [Fri, 9 Sep 2016 18:07:37 +0000 (11:07 -0700)]
Refactor: rename ParticipantManagerHelper to ParticipantManager and move all logic to handle new participant session into the ParticipantManager.
Lei Xia [Tue, 30 Aug 2016 20:29:08 +0000 (13:29 -0700)]
Persist session change history with timestamp for each participant.
Boyan Li [Mon, 29 Aug 2016 16:47:27 +0000 (09:47 -0700)]
Add a mbean for participant and emit received msgs
Boyan Li [Tue, 23 Aug 2016 04:24:43 +0000 (21:24 -0700)]
Add logs for session sync messages
Boyan Li [Fri, 5 Aug 2016 18:07:15 +0000 (11:07 -0700)]
participant syncs session id to controller
Junkai Xue [Fri, 29 Jul 2016 23:25:01 +0000 (16:25 -0700)]
Fix BestPossibleExternalViewVerifier toString NPE
Lei Xia [Fri, 1 Jul 2016 17:20:50 +0000 (10:20 -0700)]
Add support for flexible hierarchy representation of a cluster topology.
Lei Xia [Thu, 21 Jul 2016 18:29:02 +0000 (11:29 -0700)]
Add StrictMatchExternalViewVerifier that verifies whether the ExternalViews of given resources (or all resources in the cluster) exactly match their ideal mapping (in IdealState).
Lei Xia [Wed, 20 Jul 2016 01:17:19 +0000 (18:17 -0700)]
Restructure ClusterVerifiers. Add HelixClusterVerifier interface, add abstract class ZkHelixClusterVerifier, and a BestPossibleExternalViewVerifier implementation.
Lei Xia [Mon, 18 Jul 2016 23:06:22 +0000 (16:06 -0700)]
Refactor: put all cluster verifiers into a sub-module of tools.
Lei Xia [Mon, 27 Jun 2016 22:46:13 +0000 (15:46 -0700)]
Add Multi-round CRUSH rebalance strategy.
Lei Xia [Fri, 1 Jul 2016 23:27:53 +0000 (16:27 -0700)]
Add option to allow persisting best possible partition assignment in IdealState for semi-auto and full-auto modes.
Lei Xia [Tue, 7 Jun 2016 23:25:36 +0000 (16:25 -0700)]
Deprecate AutoModeISBuilder and AutoRebalanceModeISBuilder and create SemiAutoISBuilder and FullAutoISBuilder instead.
Lei Xia [Fri, 15 Apr 2016 22:38:15 +0000 (15:38 -0700)]
Support client-customized thread pools for state-transition message handling.
Lei Xia [Mon, 6 Feb 2017 00:37:29 +0000 (16:37 -0800)]
Minor fix: Do not set MaxPartitionPerNode in IdealState if it is not greater than 0.
Junkai Xue [Sun, 29 Jan 2017 01:38:41 +0000 (17:38 -0800)]
Refactor isClusterSetup log structure for debugging purposes
Junkai Xue [Sun, 29 Jan 2017 01:20:55 +0000 (17:20 -0800)]
Replace HelixDataAccessor.createProperty with more specific methods
Junkai Xue [Sun, 29 Jan 2017 01:06:46 +0000 (17:06 -0800)]
Move HelixUtil.get*Path to PropertyPathBuilder
Junkai Xue [Sun, 29 Jan 2017 00:55:24 +0000 (16:55 -0800)]
Rename PropertyPathConfig to PropertyPathBuilder
Junkai Xue [Sun, 29 Jan 2017 00:46:01 +0000 (16:46 -0800)]
Assign orders for tests in TestSchedulerMessage
Junkai Xue [Sun, 29 Jan 2017 00:34:53 +0000 (16:34 -0800)]
More specific log message when verification is in progress
Junkai Xue [Sun, 29 Jan 2017 00:33:17 +0000 (16:33 -0800)]
Add test for SEMI_AUTO
Junkai Xue [Sun, 29 Jan 2017 00:31:55 +0000 (16:31 -0800)]
Make AsyncCallback.startTimer synchronized to avoid a race condition
Junkai Xue [Sat, 17 Dec 2016 01:21:43 +0000 (17:21 -0800)]
Support delayed job scheduling with configurable delay time and start time
Lei Xia [Thu, 19 Jan 2017 23:11:20 +0000 (15:11 -0800)]
[maven-release-plugin] prepare for next development iteration
Lei Xia [Thu, 19 Jan 2017 23:11:05 +0000 (15:11 -0800)]
[maven-release-plugin] prepare release helix-0.6.7
Lei Xia [Thu, 19 Jan 2017 22:32:30 +0000 (14:32 -0800)]
Fix Java 6 compilation error.
Priyesh Narayanan [Tue, 10 Jan 2017 00:45:46 +0000 (16:45 -0800)]
[HELIX-651] Add a method in HelixAdmin to set the InstanceConfig of an existing instance
- Add a setInstanceConfig() method in HelixAdmin interface
- Add an implementation for the same in ZkHelixAdmin
- Add a test in TestZkHelixAdmin
Junkai Xue [Fri, 16 Dec 2016 23:39:57 +0000 (15:39 -0800)]
[HELIX-650] Add StateTransitionConfig and expose API to set state transition timeout
1. Add StateTransitionConfig for state transition properties, such as timeout.
2. Add the new API for setting state transition timeout.
3. Add logic in message generation to set the timeout in messages in a backward-compatible way.
Junkai Xue [Fri, 16 Dec 2016 23:26:22 +0000 (15:26 -0800)]
[HELIX-649] Fix inconsistent StateModelDef name
The StateModelDef name is not consistent with the one the user provided, since it uses the ZNRecord id of the StateModel.
Junkai Xue [Fri, 16 Dec 2016 02:20:03 +0000 (18:20 -0800)]
[HELIX-648] Extend WorkflowConfig and JobConfig to ResourceConfig
WorkflowConfig and JobConfig are stored as ResourceConfig but still need extra conversion in the current codebase. Thus, make these two configs extend ResourceConfig.
Junkai Xue [Fri, 16 Dec 2016 00:57:19 +0000 (16:57 -0800)]
[HELIX-646] DeleteJob from a recurrent job queue should not throw an exception if the last scheduled queue does not exist
When a job is deleted from a recurrent job queue, it is deleted from both the last scheduled queue and the recurrent template. But the last scheduled queue may not have started or may have expired, so we should ignore the deletion failure.
Junkai Xue [Fri, 16 Dec 2016 00:49:25 +0000 (16:49 -0800)]
[HELIX-645] Fix Task State Model INIT priority number
Junkai Xue [Fri, 16 Dec 2016 00:48:38 +0000 (16:48 -0800)]
[HELIX-644] Add accessor method for getting zkSerializer from ZkClient
Junkai Xue [Fri, 16 Dec 2016 00:47:27 +0000 (16:47 -0800)]
[HELIX-643] Make instance variables in DistClusterControllerStateModel protected fields to make them visible to its subclasses.
Junkai Xue [Fri, 16 Dec 2016 00:46:11 +0000 (16:46 -0800)]
[HELIX-642] Disable the participant instance once it is disconnected due to unstable ZK
When ZK is not stable, the instance connection will be disconnected. Before disconnecting from ZK, disable the instance.
Junkai Xue [Fri, 16 Dec 2016 00:45:14 +0000 (16:45 -0800)]
[HELIX-641] Add total messages received for each instance
Junkai Xue [Mon, 31 Oct 2016 18:48:17 +0000 (11:48 -0700)]
[HELIX-637] Populate the StateModelDefinition once it is updated
Add a check for StateModelDefinition. It will be rewritten once it is updated.
Lei Xia [Mon, 31 Oct 2016 23:25:40 +0000 (16:25 -0700)]
[maven-release-plugin] prepare for next development iteration
Lei Xia [Mon, 31 Oct 2016 23:23:03 +0000 (16:23 -0700)]
[maven-release-plugin] prepare release helix-0.6.6
Lei Xia [Mon, 31 Oct 2016 23:17:19 +0000 (16:17 -0700)]
Revert "[maven-release-plugin] prepare release helix-0.6.6"
This reverts commit 58a98373a214883c13ef0313c70110b481a5c2dd.