couchdb-mem3.git
2 years ago Merge remote branch 'cloudant:79066-port-chunkified-replicate_batch' master
Eric Avdey [Thu, 24 Nov 2016 18:00:17 +0000 (14:00 -0400)] 
Merge remote branch 'cloudant:79066-port-chunkified-replicate_batch'

This closes #26

Signed-off-by: Eric Avdey <eiri@eiri.ca>
2 years ago Chunk missing revisions before attempting to save on target 26/head
Benjamin Anderson [Wed, 29 Oct 2014 19:52:30 +0000 (12:52 -0700)] 
Chunk missing revisions before attempting to save on target

In cases with pathological document revision patterns (e.g., 10000 open
conflicts and tree depth of 300000 on a single document), attempting to
replicate the full revision tree in one batch causes the system to crash by
attempting to send an oversized message. We've observed messages of > 4GB in the
wild.

This patch divides the set of revisions-to-replicate for a single document into
chunks of a configurable size, thereby allowing operators to keep the system
stable when attempting to replicate these troublesome documents.

BugzID: 37676
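
The chunking described in this commit can be illustrated with a minimal Python sketch. It is not the Erlang code from mem3; the function name `chunk_revs`, the parameter `batch_size`, and the revision identifiers are all hypothetical and stand in for the configurable chunk size mentioned above.

```python
from typing import Iterator, List, Sequence


def chunk_revs(revs: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield successive slices of at most batch_size revisions."""
    if batch_size < 1:
        raise ValueError("batch_size must be a positive integer")
    for start in range(0, len(revs), batch_size):
        yield list(revs[start:start + batch_size])


# Hypothetical revision identifiers for a single document.
missing = [f"{n}-rev" for n in range(1, 8)]
batches = list(chunk_revs(missing, 3))
```

Each batch can then be sent to the target on its own, so no single message has to carry the full revision tree.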

2 years ago Merge remote branch 'cloudant:3102-fix-config_subscription'
ILYA Khlopotov [Tue, 23 Aug 2016 21:59:49 +0000 (14:59 -0700)] 
Merge remote branch 'cloudant:3102-fix-config_subscription'

This closes #25

Signed-off-by: ILYA Khlopotov <iilyak@ca.ibm.com>
2 years ago Update handle_config_terminate API 25/head
ILYA Khlopotov [Wed, 17 Aug 2016 18:57:54 +0000 (11:57 -0700)] 
Update handle_config_terminate API

COUCHDB-3102

2 years ago Merge remote branch 'cloudant:fix-type-spec'
ILYA Khlopotov [Wed, 25 May 2016 02:16:56 +0000 (19:16 -0700)] 
Merge remote branch 'cloudant:fix-type-spec'

This closes #24

Signed-off-by: ILYA Khlopotov <iilyak@ca.ibm.com>
2 years ago Fix type-spec of mem3_sync:next_replication/3 24/head
ILYA Khlopotov [Wed, 25 May 2016 02:10:50 +0000 (19:10 -0700)] 
Fix type-spec of mem3_sync:next_replication/3

2 years ago Merge remote branch 'cloudant:fix-type-spec-for-range-field'
ILYA Khlopotov [Thu, 19 May 2016 12:02:09 +0000 (05:02 -0700)] 
Merge remote branch 'cloudant:fix-type-spec-for-range-field'

This closes #23

Signed-off-by: ILYA Khlopotov <iilyak@ca.ibm.com>
2 years ago Fix type spec for range field in #shard{} 23/head
ILYA Khlopotov [Wed, 18 May 2016 23:49:35 +0000 (16:49 -0700)] 
Fix type spec for range field in #shard{}

Add '_' as a valid value for the range field to silence the complaints raised
when we create an ets match spec in mem3_shards:for_shard_name.

2 years ago Merge remote branch 'cloudant:43260-create-target-shard-if-missing'
ILYA Khlopotov [Wed, 18 May 2016 14:51:51 +0000 (07:51 -0700)] 
Merge remote branch 'cloudant:43260-create-target-shard-if-missing'

This closes #21

Signed-off-by: ILYA Khlopotov <iilyak@ca.ibm.com>
2 years ago Make sure mem3_rep autocreates target shards 21/head
Paul J. Davis [Thu, 8 Jan 2015 19:08:03 +0000 (13:08 -0600)] 
Make sure mem3_rep autocreates target shards

The change to our fancier history entries introduced a regression:
internal replication no longer created the target shards automatically.
This fixes the issue by adding get_or_create_db/2 to mem3_rep and
switching the use of couch_db:open_int/2 to that function.

2 years ago Pass ADMIN_CTX when opening dbs 22/head
Robert Newson [Tue, 10 May 2016 20:02:53 +0000 (21:02 +0100)] 
Pass ADMIN_CTX when opening dbs

COUCHDB-3016

2 years ago Add read_concurrency option to mem3_shards table 19/head
Benjamin Anderson [Sun, 10 Apr 2016 06:21:58 +0000 (23:21 -0700)] 
Add read_concurrency option to mem3_shards table

This table sees a great deal of activity from various subsystems -
turning on read_concurrency should be a win.

COUCHDB-2984

2 years ago Use ets:select/2 to retrieve shards by name
Benjamin Anderson [Sun, 10 Apr 2016 06:08:39 +0000 (23:08 -0700)] 
Use ets:select/2 to retrieve shards by name

The result of mem3_shards:for_db/1 on databases with high q values can
be very large, resulting in suboptimal performance for high-volume
callers.

mem3_sync_event_listener is only interested in a small subset of the
result of mem3_shards:for_db/1; moving this filter into an ets:select/2
call improves performance significantly.

COUCHDB-2984

2 years ago Reduce frequency of mem3_sync:push/2 calls
Benjamin Anderson [Sun, 10 Apr 2016 05:44:58 +0000 (22:44 -0700)] 
Reduce frequency of mem3_sync:push/2 calls

In high-throughput scenarios on databases with large q values the
mem3_sync event listener becomes overloaded with messages due to the
poor performance of the shard selection logic.

It's not strictly necessary to sync on every update, but we do need to
be careful not to lose updates by keeping history too naively. This
patch adds a configurable delay and push frequency to reduce pressure on
the mem3_sync event listener.

COUCHDB-2984

2 years ago Refactor mem3_sync events to dedicated module
Benjamin Anderson [Sun, 10 Apr 2016 03:55:58 +0000 (20:55 -0700)] 
Refactor mem3_sync events to dedicated module

COUCHDB-2984

2 years ago Revert "Remove maintenace modes from ushards" 2953-revert 18/head
Tony Sun [Fri, 26 Feb 2016 17:04:42 +0000 (09:04 -0800)] 
Revert "Remove maintenace modes from ushards"

This reverts commit deed2f0eb15d634a643312e71e343c1e19e1b07e.

COUCHDB-2953

2 years ago Remove maintenace modes from ushards 2953-ushards-mm 17/head
Tony Sun [Wed, 24 Feb 2016 20:38:25 +0000 (12:38 -0800)] 
Remove maintenace modes from ushards

Maintenance mode nodes were being served for ushards and this led to
nodedown errors. We now only serve non-maintenance mode nodes.

COUCHDB-2953

3 years ago Don't start couch_log app if we don't use it
Alexander Shorin [Tue, 1 Dec 2015 14:25:05 +0000 (17:25 +0300)] 
Don't start couch_log app if we don't use it

This makes no sense since we mock it, and it causes unwanted trouble
with the config startup dependency.

3 years ago Return HTTP 405 for unsupported request method 16/head
Alexander Shorin [Mon, 12 Oct 2015 17:31:00 +0000 (20:31 +0300)] 
Return HTTP 405 for unsupported request method

3 years ago Merge remote-tracking branch 'cloudant/fix-eunit-couch-log'
Robert Newson [Tue, 6 Oct 2015 16:57:03 +0000 (17:57 +0100)] 
Merge remote-tracking branch 'cloudant/fix-eunit-couch-log'

3 years ago Fix EUnit tests. 15/head
Nick Vatamaniuc [Tue, 6 Oct 2015 14:55:19 +0000 (10:55 -0400)] 
Fix EUnit tests.

 Need couch_log. In one case mock it because no other apps are started.

 In another use application:ensure_all_started() -- R16B02+ feature.

3 years ago Pass supervisor's children to couch_epi 14/head
ILYA Khlopotov [Tue, 29 Sep 2015 20:04:43 +0000 (13:04 -0700)] 
Pass supervisor's children to couch_epi

3 years ago Fix code formating
ILYA Khlopotov [Tue, 29 Sep 2015 16:09:25 +0000 (09:09 -0700)] 
Fix code formating

3 years ago Fix typo in behaviour name
ILYA Khlopotov [Mon, 28 Sep 2015 20:23:38 +0000 (13:23 -0700)] 
Fix typo in behaviour name

3 years ago Update to new couch_epi API
ILYA Khlopotov [Mon, 28 Sep 2015 17:28:45 +0000 (10:28 -0700)] 
Update to new couch_epi API

3 years ago Fix crypto deprecations 13/head
Robert Newson [Wed, 23 Sep 2015 18:24:12 +0000 (19:24 +0100)] 
Fix crypto deprecations

COUCHDB-2825

3 years ago Use dynamic http handlers 10/head
ILYA Khlopotov [Wed, 15 Jul 2015 15:37:47 +0000 (08:37 -0700)] 
Use dynamic http handlers

We use dynamic http handlers for:

  - `_membership`
  - `_shards`

3 years ago reformat to 80 chars/line 8/head
Robert Kowalski [Mon, 16 Mar 2015 22:26:09 +0000 (23:26 +0100)] 
reformat to 80 chars/line

3 years ago change readme for the couchdb project
Robert Kowalski [Sun, 15 Mar 2015 22:27:26 +0000 (23:27 +0100)] 
change readme for the couchdb project

3 years ago add license file
Robert Kowalski [Sun, 15 Mar 2015 22:22:44 +0000 (23:22 +0100)] 
add license file

3 years ago Merge remote-tracking branch 'kxepal/rename-system-databases'
Alexander Shorin [Thu, 26 Feb 2015 20:27:20 +0000 (23:27 +0300)] 
Merge remote-tracking branch 'kxepal/rename-system-databases'

This closes #6

3 years ago Add underscore prefix for nodes and dbs database names 6/head
Alexander Shorin [Thu, 26 Feb 2015 18:46:56 +0000 (21:46 +0300)] 
Add underscore prefix for nodes and dbs database names

That's how we name system databases and there should be no exceptions.

COUCHDB-2619

3 years ago Rename "shard_db" option to "shards_db" and "node_db" one to "nodes_db"
Alexander Shorin [Thu, 26 Feb 2015 18:18:30 +0000 (21:18 +0300)] 
Rename "shard_db" option to "shards_db" and "node_db" one to "nodes_db"

COUCHDB-2628

3 years ago Merge remote-tracking branch 'iilyak/2561-make-config-API-consistent'
Alexander Shorin [Wed, 4 Feb 2015 15:43:18 +0000 (18:43 +0300)] 
Merge remote-tracking branch 'iilyak/2561-make-config-API-consistent'

This closes #5

COUCHDB-2561

3 years ago Don't restart event handler on termination 5/head
ILYA Khlopotov [Fri, 30 Jan 2015 19:21:35 +0000 (11:21 -0800)] 
Don't restart event handler on termination

COUCHDB-2561

3 years ago Update config_listener behaviuor
ILYA Khlopotov [Thu, 29 Jan 2015 21:55:48 +0000 (13:55 -0800)] 
Update config_listener behaviuor

COUCHDB-2561

3 years ago Use ADMIN_CTX macro from couch_db.hrl
Alexander Shorin [Mon, 26 Jan 2015 04:29:37 +0000 (07:29 +0300)] 
Use ADMIN_CTX macro from couch_db.hrl

4 years ago Update mem3 for new changes API 2/head
Benjamin Bastian [Fri, 22 Aug 2014 11:39:43 +0000 (18:39 +0700)] 
Update mem3 for new changes API

4 years ago Delete mem3_rebalance for now, currently useless
Robert Newson [Wed, 24 Sep 2014 20:12:36 +0000 (21:12 +0100)] 
Delete mem3_rebalance for now, currently useless

4 years ago Add and export n/2
Robert Newson [Mon, 15 Sep 2014 23:19:52 +0000 (00:19 +0100)] 
Add and export n/2

4 years ago fix stats paths
Robert Newson [Fri, 29 Aug 2014 18:12:54 +0000 (19:12 +0100)] 
fix stats paths

4 years ago open dbs, nodes as sys dbs
Robert Newson [Fri, 29 Aug 2014 16:23:07 +0000 (17:23 +0100)] 
open dbs, nodes as sys dbs

4 years ago Update to use couch_stats windsor-merge
Paul J. Davis [Thu, 21 Aug 2014 06:29:28 +0000 (01:29 -0500)] 
Update to use couch_stats

4 years ago Replace twig with couch_log
Paul J. Davis [Sun, 17 Aug 2014 21:18:30 +0000 (16:18 -0500)] 
Replace twig with couch_log

4 years ago Update mem3_rebalance to work with couch_mrview
Paul J. Davis [Sun, 17 Aug 2014 18:54:51 +0000 (13:54 -0500)] 
Update mem3_rebalance to work with couch_mrview

4 years ago Remove mem3_util:owner
Robert Newson [Mon, 16 Jun 2014 12:04:36 +0000 (13:04 +0100)] 
Remove mem3_util:owner

4 years ago Get the shard suffix for a given database
Russell Branca [Tue, 29 Apr 2014 23:36:06 +0000 (16:36 -0700)] 
Get the shard suffix for a given database

This grabs the shards for the given database name, and then pulls out
the first shard and extracts out the suffix. mem3:shards is ets
backed, so in the general case this should be fast.

BugzId: 29571

4 years ago Allow mem3_shards:local to take a list or binary
Russell Branca [Tue, 29 Apr 2014 22:56:59 +0000 (15:56 -0700)] 
Allow mem3_shards:local to take a list or binary

BugzId: 29571

4 years ago Fast forward internal repl. between file copies
Adam Kocoloski [Tue, 4 Feb 2014 20:44:14 +0000 (15:44 -0500)] 
Fast forward internal repl. between file copies

In the case where two files have the same UUID we can analyze epoch
information to determine the safe start sequence.

BugzID: 27753

4 years ago Avoid decom:true nodes when fixing zoning
Mike Wallace [Fri, 20 Dec 2013 14:15:43 +0000 (14:15 +0000)] 
Avoid decom:true nodes when fixing zoning

This patch prevents mem3_rebalance:fix_zoning from suggesting moves
onto nodes that are flagged with "decom":true.

BugzID: 26362

4 years ago Refactor mem3_rpc:add_checkpoint/2
Paul J. Davis [Mon, 9 Dec 2013 20:04:39 +0000 (14:04 -0600)] 
Refactor mem3_rpc:add_checkpoint/2

This is based on Adam Kocoloski's original add_checkpoint/2 but uses a
body recursive function to avoid the final reverse/filter steps.

BugzId: 21973

4 years ago Write plan to /tmp/rebalance_plan.txt
Adam Kocoloski [Thu, 31 Oct 2013 16:23:49 +0000 (12:23 -0400)] 
Write plan to /tmp/rebalance_plan.txt

Was a request by @mattwhite to help with automation.  I was fairly
sloppy in the implementation here, could leave this off and do a better
job next time.

4 years ago Allow target_uuid prefixes in find_source_seq
Paul J. Davis [Fri, 6 Dec 2013 19:52:28 +0000 (13:52 -0600)] 
Allow target_uuid prefixes in find_source_seq

Sequence values only contain UUID prefixes, so we need to account
for that when locating the replication checkpoints.

BugId: 21973

4 years ago Include replication history on checkpoint docs
Paul J. Davis [Fri, 6 Dec 2013 18:08:07 +0000 (12:08 -0600)] 
Include replication history on checkpoint docs

This changes how and what we store on internal replication checkpoint
documents. The two major changes are that we are now identifying
checkpoint documents by the database UUIDs (instead of the node that
hosted them) and we're storing a history of checkpoint information to
allow us to be able to replace dead shards.

The history is a list of checkpoint entries stored with exponentially
decreasing granularity. This allows us to store ~30 checkpoints covering
ranges into the billions of update sequences which means we won't need
to worry about truncations or other issues for the time being.

There's also a new mem3_rep:find_source_seq/4 helper function that will
find a local update_seq replacement provided information for a remote
shard copy. This logic is a bit subtle and should be reused rather than
reimplemented.

BugzId: 21973
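
One way to picture a history with exponentially decreasing granularity is the Python sketch below. It is an illustration under stated assumptions, not the mem3 code: the function name `thin_history` and the bucketing rule (keep the newest checkpoint for each power-of-two distance from the current sequence) are hypothetical. With this rule, distances below 2**30 (about 1.07 billion updates) fall into at most 31 buckets, which is in line with the "~30 checkpoints" figure above.

```python
from typing import Dict, List


def thin_history(seqs: List[int], current_seq: int) -> List[int]:
    """Keep at most one checkpoint per power-of-two distance from current_seq."""
    kept: Dict[int, int] = {}
    for seq in sorted(seqs, reverse=True):          # visit newest entries first
        if seq > current_seq:
            continue                                # ignore entries ahead of us
        bucket = (current_seq - seq).bit_length()   # 0, 1, 2, ... grows like log2
        kept.setdefault(bucket, seq)                # first (newest) entry wins
    return sorted(kept.values(), reverse=True)


# Hypothetical input: a checkpoint was written at every sequence from 0 to 1000.
history = thin_history(list(range(0, 1001)), current_seq=1000)
```

Recent checkpoints stay dense while older ones become progressively sparser, so the stored list grows with the logarithm of the update sequence rather than with the sequence itself.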

4 years ago Inline open_doc_revs into open_docs
Paul J. Davis [Fri, 6 Dec 2013 17:59:45 +0000 (11:59 -0600)] 
Inline open_doc_revs into open_docs

This function was trivial and never reused. It was more confusing to
have it as a separate function rather than just inlining into where it's
used.

4 years ago Add a new mem3_rpc module for replication RPCs
Paul J. Davis [Fri, 6 Dec 2013 17:55:47 +0000 (11:55 -0600)] 
Add a new mem3_rpc module for replication RPCs

This is intended to make the local/remote code execution contexts a lot
more clear.

4 years ago Reorder functions into a logical progression
Paul J. Davis [Fri, 6 Dec 2013 17:42:49 +0000 (11:42 -0600)] 
Reorder functions into a logical progression

This just moves functions around in the mem3_rep module to give a better
logical progression. Purely stylistic but it should make things easier
to read and find.

4 years ago Update whitespace and exports formatting
Paul J. Davis [Fri, 6 Dec 2013 17:36:47 +0000 (11:36 -0600)] 
Update whitespace and exports formatting

4 years ago Remove old code_change, set module version to 1
Robert Newson [Fri, 22 Nov 2013 16:50:15 +0000 (16:50 +0000)] 
Remove old code_change, set module version to 1

4 years ago Allow for rebalancing "special" DBs
Adam Kocoloski [Wed, 30 Oct 2013 21:43:40 +0000 (17:43 -0400)] 
Allow for rebalancing "special" DBs

For example, _replicator or _users.

BugzID: 24612

4 years ago Suggest moves from all donor nodes in parallel
Adam Kocoloski [Wed, 30 Oct 2013 21:40:20 +0000 (17:40 -0400)] 
Suggest moves from all donor nodes in parallel

Previously the generator would suggest all moves from the first node
before moving onto the second one.  In the case where the quantum of
jobs is much smaller than the number of moves per node this results in
the other donors being neglected for long periods.

BugzID: 24612
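
The effect of suggesting moves from all donors in parallel can be sketched as a round-robin interleave. The Python below is only an illustration: `interleave_moves`, the node names, and the shard names are hypothetical, and `limit` stands in for the quantum of jobs mentioned above.

```python
from itertools import zip_longest
from typing import Dict, List, Tuple

Move = Tuple[str, str]  # (donor node, shard name), both hypothetical


def interleave_moves(moves_by_donor: Dict[str, List[str]], limit: int) -> List[Move]:
    """Take one move from each donor in turn until `limit` moves are chosen."""
    per_donor = [[(donor, shard) for shard in shards]
                 for donor, shards in moves_by_donor.items()]
    missing = object()  # padding marker for donors that run out of moves
    ordered = [move
               for round_ in zip_longest(*per_donor, fillvalue=missing)
               for move in round_
               if move is not missing]
    return ordered[:limit]


plan = interleave_moves(
    {"node1": ["a", "b", "c"], "node2": ["d", "e"], "node3": ["f"]},
    limit=4,
)
```

With a limit of 4, every donor contributes at least one move before node1 gets its second, instead of node1 consuming most of the quantum on its own.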

4 years ago Allow targets to exceed floor, add another check
Adam Kocoloski [Wed, 30 Oct 2013 18:17:03 +0000 (14:17 -0400)] 
Allow targets to exceed floor, add another check

Sometimes we want to transfer a shard to a target even though it's
already at the floor.  We add another check to make sure we're not
wasting effort -- the difference in shard counts between the source and
the target must be 2 or greater.

We also refactor the global shard count code to avoid future atom /
binary problems.

BugzID: 24466

4 years ago Refactor global candidate selection
Adam Kocoloski [Wed, 30 Oct 2013 17:10:42 +0000 (13:10 -0400)] 
Refactor global candidate selection

The old approach was getting unwieldy; hopefully this makes the tests
more explicit.

BugzID: 24466

4 years ago Stop donating once the target level is achieved
Adam Kocoloski [Wed, 30 Oct 2013 15:53:22 +0000 (11:53 -0400)] 
Stop donating once the target level is achieved

Also switched to a record accumulator for clarity.

BugzID: 24466

4 years ago Fix two bugs in the global balancing phase
Adam Kocoloski [Wed, 30 Oct 2013 15:05:41 +0000 (11:05 -0400)] 
Fix two bugs in the global balancing phase

* Nodes with 0 shards were being ignored because wrong datatype.
* The limit was being ignored because max not min.

BugzID: 24680

4 years ago Allow skip straight to global phase
Adam Kocoloski [Wed, 30 Oct 2013 14:00:08 +0000 (10:00 -0400)] 
Allow skip straight to global phase

When rebalancing a DB-per-user cluster with small Q values it's typical
that

a) the local phase takes a loooong time, and
b) the local phase doesn't suggest any moves

While the local phase should still run at least once, we'll expose a
flag to skip straight to the global phase since we'll need to run the
plan generator many many times and we can't afford to wait.

BugzID: 24680

4 years ago Refuse to place shards on decom:true nodes
Adam Kocoloski [Wed, 23 Oct 2013 14:08:51 +0000 (10:08 -0400)] 
Refuse to place shards on decom:true nodes

BugzID: 24420

4 years ago Rely on decom:true attribute to filter decom nodes
Adam Kocoloski [Wed, 23 Oct 2013 14:02:33 +0000 (10:02 -0400)] 
Rely on decom:true attribute to filter decom nodes

BugzID: 24420

4 years ago Ensure that the owner of a doc is also a host
Adam Kocoloski [Tue, 22 Oct 2013 19:29:00 +0000 (15:29 -0400)] 
Ensure that the owner of a doc is also a host

BugzID: 24395

4 years ago Rewrite rebalancing plan generator
Adam Kocoloski [Wed, 25 Sep 2013 18:25:44 +0000 (14:25 -0400)] 
Rewrite rebalancing plan generator

This patch splits the functionality of the module out into three
classes of work:

* Fixing zoning and replica level violations
* Contracting a cluster
* Rebalancing shards across a cluster

The implementations of the first two features are pretty similar - find
the shards that need to be moved, then choose an optimal home for each
of them.  By default the contraction code will remove shards from nodes
in the "decom" zone, and the rebalancing code will ignore that zone
entirely. An optimal home is a node that

a) is in the correct zone, and
b) has the fewest # of shards for the DB among nodes in the zone, and
c) has the fewest total # of shards among nodes satisfying a) and b)
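
The three criteria above amount to a filter followed by a lexicographic minimum. The Python sketch below shows that reading; the `Node` record, its field names, and `optimal_home` are hypothetical and are not taken from the mem3_rebalance module.

```python
from typing import List, NamedTuple


class Node(NamedTuple):
    name: str
    zone: str
    db_shards: int      # shards of the database being placed on this node
    total_shards: int   # shards of all databases on this node


def optimal_home(nodes: List[Node], zone: str) -> Node:
    """Pick a node by criteria a), b) and c), applied in that order."""
    candidates = [n for n in nodes if n.zone == zone]           # a) correct zone
    if not candidates:
        raise ValueError(f"no nodes in zone {zone!r}")
    # b) fewest shards for this DB, then c) fewest total shards as tie-breaker
    return min(candidates, key=lambda n: (n.db_shards, n.total_shards))


nodes = [
    Node("n1", "z1", db_shards=2, total_shards=10),
    Node("n2", "z1", db_shards=1, total_shards=30),
    Node("n3", "z1", db_shards=1, total_shards=20),
    Node("n4", "z2", db_shards=0, total_shards=0),
]
home = optimal_home(nodes, "z1")
```

Here n4 is excluded by zone, n1 loses on the per-database count, and n3 beats n2 on the total count.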

The implementation of rebalancing is a bit more complicated.  The
rebalancing algorithm looks roughly like this

For DB in all_dbs:
    Ensure all nodes have at least (N*Q) div length(Nodes) shards
    Ensure no node has more than (N*Q) div length(Nodes) + 1 shards
For node in nodes:
    If node has more than TotalShards div length(Nodes) + 1 shards:
        Donate shard to another node

The net result is that each database is balanced across the cluster and
the cluster as a whole is globally balanced.
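
The bounds used in the pseudocode above can be checked with a short Python sketch. It only detects violations and does not generate moves, and it is not the mem3_rebalance implementation: the function names and the example shard counts are hypothetical.

```python
from typing import Dict, List, Tuple


def db_bounds(n: int, q: int, node_count: int) -> Tuple[int, int]:
    """Per-database floor and ceiling: (N*Q) div Nodes and that value plus 1."""
    floor = (n * q) // node_count
    return floor, floor + 1


def out_of_bounds(counts: Dict[str, int], n: int, q: int) -> Tuple[List[str], List[str]]:
    """Return (nodes below the floor, nodes above the ceiling) for one database."""
    floor, ceiling = db_bounds(n, q, len(counts))
    under = sorted(node for node, c in counts.items() if c < floor)
    over = sorted(node for node, c in counts.items() if c > ceiling)
    return under, over


def global_donors(totals: Dict[str, int]) -> List[str]:
    """Nodes holding more than TotalShards div Nodes + 1 shards overall."""
    limit = sum(totals.values()) // len(totals) + 1
    return sorted(node for node, c in totals.items() if c > limit)


# N=3 copies of Q=8 ranges spread over 5 nodes: floor 4, ceiling 5.
under, over = out_of_bounds({"n1": 8, "n2": 6, "n3": 5, "n4": 3, "n5": 2}, n=3, q=8)
donors = global_donors({"n1": 40, "n2": 12, "n3": 8})
```

In the per-database example, n4 and n5 need more shards while n1 and n2 hold too many; in the global example only n1 exceeds the limit of 21.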

The current version of the module prints out shard move and copy
operations in a clou-friendly format via io:format.  It also returns a
list of {Op, #shard{}, node()} tuples representing the operations.

The rebalancer will stop after generating 1000 operations by default.
The limit can be customized by using the 1-arity versions of expand,
contract and fix_zoning, but note that the performance of the rebalancer
degrades as the number of pending operations increases.

BugzID: 23690
BugzID: 20770

4 years ago Fix latent single-shard range hack
Paul J. Davis [Thu, 5 Sep 2013 19:08:48 +0000 (14:08 -0500)] 
Fix latent single-shard range hack

We had an off-by-one error when we fake #shard{} records for node local
databases. This fixes the issue. The bug was noticeable when attempting
to pass these shards to `fabric_view:is_progress_possible/1`.

BugzId: 22809

4 years ago Use a consistent commenting syntax
Adam Kocoloski [Mon, 19 Aug 2013 13:43:08 +0000 (09:43 -0400)] 
Use a consistent commenting syntax

4 years ago Address comments from PR
Adam Kocoloski [Mon, 19 Aug 2013 13:40:38 +0000 (09:40 -0400)] 
Address comments from PR

4 years ago Ensure all shards are moved off non-target nodes
Adam Kocoloski [Fri, 16 Aug 2013 17:40:50 +0000 (13:40 -0400)] 
Ensure all shards are moved off non-target nodes

BugzID: 20742

4 years ago Stabilize mem3_util:owner/2
Robert Newson [Wed, 31 Jul 2013 10:14:30 +0000 (11:14 +0100)] 
Stabilize mem3_util:owner/2

BugzID: 21413

4 years ago Move rotate_list to mem3_util
Robert Newson [Wed, 31 Jul 2013 10:14:15 +0000 (11:14 +0100)] 
Move rotate_list to mem3_util

4 years ago Support balancing across a subset of nodes
Adam Kocoloski [Tue, 2 Jul 2013 18:30:09 +0000 (14:30 -0400)] 
Support balancing across a subset of nodes

mem3_balance implicitly assumed the set of nodes over which the DB is
hosted is expanding.  We need to make a couple of small changes in the
case of cluster contraction.

BugzID: 20742

4 years ago Fix load_shards_from_disk/2
Robert Newson [Thu, 27 Jun 2013 18:19:24 +0000 (19:19 +0100)] 
Fix load_shards_from_disk/2

load_shards_from_disk/2 did not expect #ordered_shards to be returned
from load_shards_from_disk/1. Since it uses a list comprehension the
mistake is silently squashed, resulting in an empty list.

In production this manifests as the occasional failure, where 'n' is
calculated as 0, causing quorum reads to fail. The very next call
succeeds as it reads the cached versions and correctly downcasts.

BugzID: 20629

4 years ago Preserve key and incorporate range into rotation key
Robert Newson [Tue, 25 Jun 2013 11:37:33 +0000 (12:37 +0100)] 
Preserve key and incorporate range into rotation key

4 years ago we're not rotating by DbName any more
Robert Newson [Tue, 25 Jun 2013 11:36:59 +0000 (12:36 +0100)] 
we're not rotating by DbName any more

4 years ago refactor choose_ushards
Robert Newson [Tue, 25 Jun 2013 11:20:59 +0000 (12:20 +0100)] 
refactor choose_ushards

4 years ago Zero out shard caches on upgrade
Adam Kocoloski [Fri, 21 Jun 2013 04:14:42 +0000 (00:14 -0400)] 
Zero out shard caches on upgrade

The mix of #shard and #ordered_shard records breaks ushards.  Different
nodes can start returning different results.

4 years ago Add function to assist with rebalancing
Robert Newson [Wed, 3 Apr 2013 19:13:35 +0000 (20:13 +0100)] 
Add function to assist with rebalancing

This function takes either a database name or a list of shards and a
list of target nodes to balance the shards across. Every node with
less than a fair share of shards will steal shards from the node with
the most shards as long as both shards are in the same zone.

BugzID: 18638

4 years ago Use a private record for event listener state
Adam Kocoloski [Fri, 24 May 2013 19:03:54 +0000 (15:03 -0400)] 
Use a private record for event listener state

4 years ago Fix trivial typo
Adam Kocoloski [Fri, 24 May 2013 19:00:32 +0000 (15:00 -0400)] 
Fix trivial typo

4 years ago Balance replication ownership across nodes
Adam Kocoloski [Thu, 23 May 2013 14:32:58 +0000 (10:32 -0400)] 
Balance replication ownership across nodes

The previous algorithm was biased towards low-numbered nodes, and in the
case of a 3 node cluster would declare db1 to be the owner of all
replications.  We can do better just by leveraging the existing
ushards code.

There's a possibility to refactor this as a new ushards/2 function if
that's perceived as useful.

BugzID: 19870

4 years ago Update to use the new couch_event application
Paul J. Davis [Tue, 23 Apr 2013 22:26:30 +0000 (17:26 -0500)] 
Update to use the new couch_event application

4 years ago Choose ushards according to persistent record
Robert Newson [Tue, 23 Apr 2013 21:54:48 +0000 (22:54 +0100)] 
Choose ushards according to persistent record

The order of nodes in the by_range section of "dbs" documents is now
promoted to the principal order for ushards. Ushards still accounts
for Liveness, selecting the first live replica and still supports
Spread by rotating this list using the CRC32 of the database name
(since many databases will have the same layout).

If by_range and by_node are not symmetrical then by_node is used and
order is undefined to match existing behavior.
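
The selection rule in this commit (rotate the stored node order by the CRC32 of the database name, then take the first live replica) can be sketched in Python. This is an illustration rather than the mem3 code: `rotate`, `choose_replica`, and the node names are hypothetical, and Python's `zlib.crc32` stands in for the CRC32 used by the Erlang implementation.

```python
import zlib
from typing import List, Optional, Set


def rotate(nodes: List[str], key: bytes) -> List[str]:
    """Rotate the list left by CRC32(key) modulo its length."""
    if not nodes:
        return []
    offset = zlib.crc32(key) % len(nodes)
    return nodes[offset:] + nodes[:offset]


def choose_replica(nodes: List[str], db_name: str, live: Set[str]) -> Optional[str]:
    """Return the first live node in the rotated order, or None if all are down."""
    for node in rotate(nodes, db_name.encode("utf-8")):
        if node in live:
            return node
    return None


# Hypothetical by_range order for one shard range.
by_range_order = ["node1", "node2", "node3"]
chosen = choose_replica(by_range_order, "mydb", live={"node2"})
```

Because the rotation depends only on the database name and the stored order, every node computes the same preference list, and databases with identical layouts still spread their load across different replicas.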

4 years ago If two shards differ we need to sync
Paul J. Davis [Tue, 16 Apr 2013 22:12:38 +0000 (17:12 -0500)] 
If two shards differ we need to sync

There's no security if two shards return different answers but it gives
us enough of a signal to know that we need to trigger a full on
synchronization.

BugzId: 18955

4 years ago Moving shard maps _membership endpoint to _shards db handler
Russell Branca [Fri, 12 Apr 2013 20:48:14 +0000 (16:48 -0400)] 
Moving shard maps _membership endpoint to _shards db handler

4 years ago Add doc shard info endpoint
Russell Branca [Fri, 12 Apr 2013 19:06:58 +0000 (15:06 -0400)] 
Add doc shard info endpoint

4 years ago Fix _membership/$DBNAME api endpoint
Russell Branca [Thu, 11 Apr 2013 18:18:12 +0000 (14:18 -0400)] 
Fix _membership/$DBNAME api endpoint

This switches the JSON key to be a binary, as required by jiffy.

Also, remove extraneous <<"parts">> path from the url.

Show full shard range.

4 years ago Update to use new multi rexi_server protocol
Paul J. Davis [Tue, 19 Mar 2013 03:57:46 +0000 (22:57 -0500)] 
Update to use new multi rexi_server protocol

4 years ago Handle the #doc_info case in changes_enumerator
Russell Branca [Wed, 18 Jun 2014 22:04:35 +0000 (15:04 -0700)] 
Handle the #doc_info case in changes_enumerator

This is to handle the special case where the user migrates a CouchDB
database to BigCouch and they have not yet compacted the
database. Once the database has been compacted, this #doc_info clause
should never be encountered.

4 years ago Don't log when ensuring dbs exist
Robert Newson [Tue, 3 Jun 2014 10:21:02 +0000 (11:21 +0100)] 
Don't log when ensuring dbs exist

4 years ago Add function to determine shard membership locally 1843-feature-bigcouch
Robert Newson [Wed, 7 May 2014 13:48:25 +0000 (14:48 +0100)] 
Add function to determine shard membership locally

mem3:belongs/2 allows you to determine if a given doc id belongs to a
given shard (whether a #shard{} record or just the filename of a
shard) without looking up the shard map or making any remote
calls.
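
A local membership test of this kind needs only the shard's range and a hash of the doc id. The Python sketch below shows the idea under explicit assumptions that the commit message does not confirm: that doc ids are hashed with CRC32, that ranges are inclusive, and that a shard filename has the form `shards/<begin>-<end>/<dbname>` with 8-digit hex bounds. The names `range_from_name` and `belongs` are hypothetical.

```python
import re
import zlib
from typing import Tuple

# Assumed filename layout: shards/<8 hex digits>-<8 hex digits>/<dbname>
_RANGE = re.compile(r"shards/([0-9a-f]{8})-([0-9a-f]{8})/")


def range_from_name(shard_name: str) -> Tuple[int, int]:
    """Extract the inclusive hash range from a shard filename."""
    match = _RANGE.match(shard_name)
    if match is None:
        raise ValueError(f"not a shard name: {shard_name!r}")
    return int(match.group(1), 16), int(match.group(2), 16)


def belongs(shard_name: str, doc_id: str) -> bool:
    """True if the CRC32 of doc_id falls inside the shard's range."""
    begin, end = range_from_name(shard_name)
    return begin <= zlib.crc32(doc_id.encode("utf-8")) <= end


lower = "shards/00000000-7fffffff/mydb"
upper = "shards/80000000-ffffffff/mydb"
in_lower = belongs(lower, "doc1")
in_upper = belongs(upper, "doc1")
```

Since the two ranges partition the 32-bit hash space, exactly one of `in_lower` and `in_upper` is true, and no shard map lookup or remote call is involved.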

4 years ago Change API to function per level
Robert Newson [Wed, 12 Feb 2014 23:23:47 +0000 (23:23 +0000)] 
Change API to function per level

4 years ago Switch to couch_log
Robert Newson [Wed, 12 Feb 2014 20:11:56 +0000 (20:11 +0000)] 
Switch to couch_log

4 years ago Add license headers
Paul J. Davis [Tue, 11 Feb 2014 07:54:37 +0000 (01:54 -0600)] 
Add license headers

4 years ago Add ejson_body to all mem3 open_doc attempts that need it
Robert Newson [Mon, 23 Dec 2013 16:55:10 +0000 (16:55 +0000)] 
Add ejson_body to all mem3 open_doc attempts that need it