pulsar.git
17 hours agoa simple debezium integration test (#3154) master
Jia Zhai [Tue, 11 Dec 2018 00:08:37 +0000 (08:08 +0800)] 
a simple debezium integration test (#3154)

### Motivation

create a simple debezium integration test.
In this test, it start a Debezium MySQL Container based on "debezium/example-mysql:0.8", then use debezium source connector to binlog from MySQL, and store the debezium output into Pulsar.
It verify the consumer readout message number is as expected.

### Modifications

 create a simple debezium integration test.

### Result

this test passed.

22 hours agoIntroduce Source and Sink exceptions in Status (#3142)
Sanjeev Kulkarni [Mon, 10 Dec 2018 19:48:44 +0000 (11:48 -0800)] 
Introduce Source and Sink exceptions in Status (#3142)

* Introduce Source and Sink exceptions in Status

* Added the counter at the right place

22 hours agoUse raw message when manually parsing messages from topic storage (#3146)
Matteo Merli [Mon, 10 Dec 2018 19:09:12 +0000 (11:09 -0800)] 
Use raw message when manually parsing messages from topic storage (#3146)

* Use raw message when manually parsing messages from topic storage

* Added missing headers

* Fixed copy to byte[]

24 hours ago[website][doc] Update the deployment documentation on how to deploy Pulsar to one...
Sijie Guo [Mon, 10 Dec 2018 17:21:50 +0000 (01:21 +0800)] 
[website][doc] Update the deployment documentation on how to deploy Pulsar to one node (#3152)

*Motivation*

Sometimes, people might try out Pulsar by using one machine and expand to multiple nodes
later. The steps to deploy Pulsar to one node is same as deploying to multiple nodes.
However there are some places required attentions. This PR is to update the deployment doc
with instructions on how to deploy Pulsar to one node

24 hours agoadd_proxy_daemon_command (#3151)
Samuel [Mon, 10 Dec 2018 17:21:12 +0000 (01:21 +0800)] 
add_proxy_daemon_command (#3151)

35 hours agoIn managed ledger ReadOnlyCursor, optimize for when there's a new ledger created...
Matteo Merli [Mon, 10 Dec 2018 06:54:44 +0000 (22:54 -0800)] 
In managed ledger ReadOnlyCursor, optimize for when there's a new ledger created (#3148)

### Motivation

In managed ledger, the check `Cursor.hasMoreEntries()` is very cheap when the last ledger has at least some entries (eg: just longs comparison). If it's empty, we need to do a more expensive check that involve looking at the ledgers list to make sure the answer is always correct.

When creating a read-only cursor, we get a snapshot of the current state. If the topic was left with a new ledger created and no entry into that, the `hasMoreEntries()` will be always expensive.

To avoid that, if the last ledger is empty, initialize the last written position on the last entry of previous ledger.

35 hours ago[Message Routing] Support default MessageRouter initialization as RoundRobinPartition...
Eren Avsarogullari [Mon, 10 Dec 2018 06:51:54 +0000 (06:51 +0000)] 
[Message Routing] Support default MessageRouter initialization as RoundRobinPartition (#3150)

### Motivation
This PR aims to initialize **default** `MessageRouter` as `RoundRobinPartition` instead of `SinglePartition` by aligning `MessageRoutingMode` default settings under `ProducerBuilderImpl.setMessageRoutingMode`. In general, current implementation does not cause any problem due to `RoundRobin` coming as default and `MessageRoutingMode` is type-safe. However, it is useful for alignment both consistency of set/get logics.

Following PR with #3126

### Modifications
1- Patch is applied at `RoundRobinPartitionMessageRouterImpl`
2- New UTs are added by creating `PartitionedProducerImplTest`
3- Javadoc fix

2 days agofix rpmVersoin (#3149)
legendtkl [Sun, 9 Dec 2018 07:09:17 +0000 (15:09 +0800)] 
fix rpmVersoin (#3149)

2 days agoUpdate pulsar.properties (#3147)
Matteo Merli [Sun, 9 Dec 2018 05:53:44 +0000 (21:53 -0800)] 
Update pulsar.properties (#3147)

Use IPv4 ZK address to avoid initial connection to IPv6

2 days ago[Pulsar Broker] CleanUp and Javadocs fix (#3145)
Eren Avsarogullari [Sat, 8 Dec 2018 23:03:31 +0000 (23:03 +0000)] 
[Pulsar Broker] CleanUp and Javadocs fix (#3145)

2 days ago[Message Routing] Set CustomPartition implicitly when messageRouter is set (#3126)
Eren Avsarogullari [Sat, 8 Dec 2018 22:06:23 +0000 (22:06 +0000)] 
[Message Routing] Set CustomPartition implicitly when messageRouter is set (#3126)

### Motivation
This PR proposes some refactoring and fixes on message routing logic.

### Modifications
- `getMessageRouter` logic is refactored on`PartitionedProducerImpl`.
- If `messageRouter` is set through `ProducerBuilderImpl`, `messageRoutingMode` needs to be set as `MessageRoutingMode.CustomPartition` implicitly. Default is `MessageRoutingMode.RoundRobinPartition`
- Javadoc is fixed
- Custom Message Router Documentation is fixed
- UT coverage is added (`ProducerBuilderImplTest` and `PartitionedProducerConsumerTest`)

2 days agoRecycle objects when parsing single message metadata (#3143)
Matteo Merli [Sat, 8 Dec 2018 22:05:02 +0000 (14:05 -0800)] 
Recycle objects when parsing single message metadata (#3143)

### Motivation

Reduce objects created when parsing the messages inside a batch.

2 days ago optimizing json deserialization in sql (#3144)
Boyang Jerry Peng [Sat, 8 Dec 2018 22:04:39 +0000 (14:04 -0800)] 
 optimizing json deserialization in sql (#3144)

### Motivation

Use Dsl json for  json deserialization is much faster

3 days agomore optimizations for sql (#3139)
Boyang Jerry Peng [Fri, 7 Dec 2018 21:41:45 +0000 (13:41 -0800)] 
more optimizations for sql (#3139)

* more optimizations for sql

* cleaning up

* adding jctools dependency

* add jctools to license

3 days agofix CompressionCodecLZ4 (#3141)
Boyang Jerry Peng [Fri, 7 Dec 2018 21:14:10 +0000 (13:14 -0800)] 
fix CompressionCodecLZ4 (#3141)

3 days agoMake Source/Sink status Source/Sink specific (#3137)
Sanjeev Kulkarni [Fri, 7 Dec 2018 18:39:41 +0000 (10:39 -0800)] 
Make Source/Sink status Source/Sink specific (#3137)

* Make Source/Sink status Source/Sink specific

* Fix unittest

4 days agoImprove user prompts on get topic reference (#3111)
penghui [Fri, 7 Dec 2018 04:12:23 +0000 (12:12 +0800)] 
Improve  user prompts on get topic reference (#3111)

Motivation
Create a partitioned topic, before client use this topic or create a subscription on this topic, when get stats of internal topic, will get Topic not found error. But actually partitioned topic is already created,
internal topic have not been generated yet.

Modifications
Add following check:

getTopicReference() on a partitioned internal topic and partitioned topic is not exist will get Partitioned Topic not found
getTopicReference() on a partitioned internal topic, partitioned topic is exist but internal topics not exist will get Internal topics have not been generated yet

* Separate topic not found exception and topic owner not found exception(issue-3042)
* Add partitioned topic check(issue-3042)
* Use getPartitionedTopicMetadata(issue-3042)
* Add PersistentTopicsTest UT (issue-3042)

4 days agoEnhanced Pulsar Message API as well as functions context to get more details about...
Sanjeev Kulkarni [Fri, 7 Dec 2018 02:37:06 +0000 (18:37 -0800)] 
Enhanced Pulsar Message API as well as functions context to get more details about messages (#3136)

4 days agoUse BK v2 protocol from Presto connector (#3135)
Matteo Merli [Fri, 7 Dec 2018 00:59:49 +0000 (16:59 -0800)] 
Use BK v2 protocol from Presto connector (#3135)

4 days agoForce to load LZ4 JNI wrapper (#3134)
Matteo Merli [Thu, 6 Dec 2018 21:59:11 +0000 (13:59 -0800)] 
Force to load LZ4 JNI wrapper (#3134)

### Motivation

In newer `lz4-java` JNI wrapper, the JNI library is not automatically loaded (or attempted to be loaded): calling explicitly so that we use the more performant C library instead of the Java based fallback code.

4 days agoCleanup Exception thrown during error (#3133)
Sanjeev Kulkarni [Thu, 6 Dec 2018 20:27:38 +0000 (12:27 -0800)] 
Cleanup Exception thrown during error (#3133)

* Cleanup Exception thrown during error

* Added more validation

4 days agoFix calendar link (#3120)
Sijie Guo [Thu, 6 Dec 2018 18:50:18 +0000 (10:50 -0800)] 
Fix calendar link (#3120)

5 days agoEnsure that the input topic exists before doing trigger (#3130)
Sanjeev Kulkarni [Thu, 6 Dec 2018 15:28:45 +0000 (07:28 -0800)] 
Ensure that the input topic exists before doing trigger (#3130)

5 days agoFew optimizations for pulsar sql (#3128)
Boyang Jerry Peng [Thu, 6 Dec 2018 07:11:39 +0000 (23:11 -0800)] 
Few optimizations for pulsar sql (#3128)

* optimizing pulsar sql

* optimizing

* cleaning up

6 days agoCorrect instances of 'availble' found (#3124)
Grant Wu [Wed, 5 Dec 2018 01:04:38 +0000 (20:04 -0500)] 
Correct instances of 'availble' found (#3124)

6 days agoFixed Kubernetes runtime bug (#3123)
Sanjeev Kulkarni [Wed, 5 Dec 2018 00:31:46 +0000 (16:31 -0800)] 
Fixed Kubernetes runtime bug (#3123)

6 days agoignore unknown properties in worker config (#3125)
Boyang Jerry Peng [Wed, 5 Dec 2018 00:31:04 +0000 (16:31 -0800)] 
ignore unknown properties in worker config (#3125)

6 days agoAdd log dir creation in ProcessRuntime (#3114)
Ali Ahmed [Tue, 4 Dec 2018 18:38:09 +0000 (10:38 -0800)] 
Add log dir creation in ProcessRuntime (#3114)

7 days agoException Handling for Null Pointer Exception in BrokerService (#3108)
Richard Yu [Tue, 4 Dec 2018 01:59:01 +0000 (17:59 -0800)] 
Exception Handling for Null Pointer Exception in BrokerService (#3108)

### Motivation
In a unit test under pulsar-broker, I have found that some NullPointerExceptions were thrown, although the tests had passed. I decided it will be best if we minimized the number of such exceptions as they are usually not intentionally thrown.

### Result
The NullPointerException will no longer occur.

7 days agoJSONSchema fails to serialise fields on objects that are nested in a collection ...
chrismiller [Tue, 4 Dec 2018 01:48:12 +0000 (01:48 +0000)] 
JSONSchema fails to serialise fields on objects that are nested in a collection (#2969)

* JSONSchema doesn't serialise fields of objects that are in nested collections

* fix json schema

* Fix a couple of typos

7 days agoConsolidate timer threads in functions (#3109)
Boyang Jerry Peng [Tue, 4 Dec 2018 01:47:13 +0000 (17:47 -0800)] 
Consolidate timer threads in functions (#3109)

* Use instanceCache schedulerExecutorService for timer

* fixing process runtime

* addressing comments

7 days ago[conf] Add annotations for documenting proxy configuration settings (#3106)
Sijie Guo [Tue, 4 Dec 2018 00:21:49 +0000 (16:21 -0800)] 
[conf] Add annotations for documenting proxy configuration settings (#3106)

* [conf] Add annotations for documenting proxy configuration settings

*Motivation*

It is non-trivial to keep configuration in-sync between code and configuration file.
The change is introducing documentation related annotations. So the annotations can be used
for generating proxy configuration file.

7 days agoAllow subscribers to access subscription admin-api (#2981)
Rajan Dhabalia [Mon, 3 Dec 2018 22:17:50 +0000 (14:17 -0800)] 
Allow subscribers to access subscription admin-api (#2981)

* Allow subscribers to access subscription admin-api

* add sub-authorization check on canConsume

* make per-sub optional

* fix: allowe super-admin to consume

7 days agoAdd pull-mode on Websockets (#3058)
Christophe Bornet [Mon, 3 Dec 2018 22:11:38 +0000 (23:11 +0100)] 
Add pull-mode on Websockets (#3058)

* Add pull-mode on Websockets

Fix #3052

* Add WebSocket pull-mode test

* Add doc on WebSocket pull mode

7 days agoChange un-ack messages start tracking behavior (#3079)
penghui [Mon, 3 Dec 2018 19:47:33 +0000 (03:47 +0800)] 
Change un-ack messages start tracking behavior (#3079)

### Motivation
#### Expected behavior

User process the same message many times but failed, if user set a dead letter policy, when message process times exceed the max redelivery count in dead letter policy, message will send to the dead letter topic.

#### Actual behavior

When a consumer subscribe a topic, but wait a while then start receive messages, but messages already send to dead letter topic.

#### Steps to reproduce

Here is the code to reproduce

```java
public class RedeliveryIssue {

    public static void main(String[] args) throws PulsarClientException, InterruptedException {

        final String topic = "my-topic";

        PulsarClient client = PulsarClient.builder()
                .serviceUrl("pulsar://localhost:6650")
                .build();

        Consumer<byte[]> consumer = client.newConsumer()
                .topic(topic)
                .subscriptionType(SubscriptionType.Shared)
                .subscriptionName(UUID.randomUUID().toString())
                .ackTimeout(3, TimeUnit.SECONDS)
                .deadLetterPolicy(DeadLetterPolicy.builder().maxRedeliverCount(2).build())
                .subscribe();

        Producer<byte[]> producer = client.newProducer()
                .topic(topic)
                .create();

        producer.send(("a message").getBytes());

        // wait a while, message will send to dead letter topic
        Thread.sleep(10000L);

        do {
            // can't receive message
            Message<byte[]> msg = consumer.receive();
            System.out.println(new String(msg.getValue()));
        } while (true);
    }
}
```

#### System configuration
**Pulsar version**: 2.2.0

### Modifications

Remove un-ack message tracking on message received.
Add un-ack message tracking on consumer call receive

### Result

UT passed

7 days agoreport function exceptions via prometheus (#3107)
Boyang Jerry Peng [Mon, 3 Dec 2018 19:46:45 +0000 (11:46 -0800)] 
report function exceptions via prometheus (#3107)

### Motivation

Allow exceptions in functions to be reported via prometheus

example output:
```
# TYPE pulsar_function_user_exception gauge
pulsar_function_user_exception{cluster="standalone",error="error val: 0",fqfn="public/default/py-test",function="py-test",instance_id="0",namespace="public/default",tenant="public",ts="1543786615954"} 1.0
pulsar_function_user_exception{tenant="public",namespace="public/default",function="test",instance_id="0",cluster="standalone",fqfn="public/default/test",error="error val: 5c53460e-03cf-4368-88d4-f23aeb3adf84",ts="1543787031371",} 1.0
```

7 days agorevert #2590 (#3110)
Jia Zhai [Mon, 3 Dec 2018 19:44:28 +0000 (03:44 +0800)] 
revert #2590 (#3110)

### Motivation

In PR #2590, we made a wrong change, to make redelivery happened more early.
This PR Try to revert change of commit 0ab2325fa33231f1c69782e081483012467dfca.

### Modifications

 revert #2590

### Result

Worked as before 2590, Ut pass.

8 days ago fixing and refactoring function status (#3102)
Boyang Jerry Peng [Mon, 3 Dec 2018 04:06:26 +0000 (20:06 -0800)] 
 fixing and refactoring function status (#3102)

* fixing and refactoring function status

* further refactoring

* cleaning up

8 days agoadd default impl for getKey() (#3105)
Boyang Jerry Peng [Sun, 2 Dec 2018 20:15:13 +0000 (12:15 -0800)] 
add default impl for getKey() (#3105)

10 days ago[conf] clean up proxy configuration to use lombok setter and getter (#3103)
Sijie Guo [Sat, 1 Dec 2018 12:38:30 +0000 (04:38 -0800)] 
[conf] clean up proxy configuration to use lombok setter and getter (#3103)

10 days ago[build] generate connector yaml config only when building a release (#3100)
Sijie Guo [Sat, 1 Dec 2018 10:00:41 +0000 (02:00 -0800)] 
[build] generate connector yaml config only when building a release (#3100)

10 days ago[conf] Clean up ServiceConfiguration to use lombok getter and setter (#3101)
Sijie Guo [Sat, 1 Dec 2018 05:50:37 +0000 (21:50 -0800)] 
[conf] Clean up ServiceConfiguration to use lombok getter and setter (#3101)

*Motivation*

Each time we add a new setting, we need to add one setter and one getter method.
This change is to remove those setters and getters, to use lombok @Setter and @Getter.
It will make life easier.

10 days ago[website] Fix apache pulsar events calendar (#3099)
Sijie Guo [Fri, 30 Nov 2018 22:43:27 +0000 (14:43 -0800)] 
[website] Fix apache pulsar events calendar (#3099)

*Motivation*

Fix apache pulsar events calendar link

10 days agoImprove and correct status for function, sources, sinks (#3088)
Boyang Jerry Peng [Fri, 30 Nov 2018 20:09:10 +0000 (12:09 -0800)] 
Improve and correct status for function, sources, sinks (#3088)

* Improve and correct status for function, sources, sinks

* cleaning up unused protobuf

* cleaning up

* fixing unit tests

* change getstatus -> status

* fix integration tests

* fix integration tests for sources and sinks

11 days agoRedeliver messages that can't be decrypted. (#3097)
Jai Asher [Fri, 30 Nov 2018 09:33:19 +0000 (01:33 -0800)] 
Redeliver messages that can't be decrypted. (#3097)

11 days agoOption to not to use rbac in helm deployment (#3082)
Yifan Zhang [Fri, 30 Nov 2018 04:39:47 +0000 (07:39 +0300)] 
Option to not to use rbac in helm deployment (#3082)

* option to not to use rbac

* default value to match previous settings

11 days agoFixed protobuf-shaded/pom.xml with correct version (#3096)
Matteo Merli [Fri, 30 Nov 2018 01:16:21 +0000 (17:16 -0800)] 
Fixed protobuf-shaded/pom.xml with correct version (#3096)

11 days agoPython Client: fix `Consumer.unsubscribe` (#3093)
Fran├žois Laignel [Fri, 30 Nov 2018 00:42:32 +0000 (01:42 +0100)] 
Python Client: fix `Consumer.unsubscribe` (#3093)

* Python Client: fix `Consumer.unsubscribe`

Fix a typo in `self._consumer.unsubscribe`

* Python Client: test `Consumer.unsubscribe`

12 days agoFix process of calculating msgBacklog included in stats (#3092)
massakam [Thu, 29 Nov 2018 17:35:20 +0000 (02:35 +0900)] 
Fix process of calculating msgBacklog included in stats (#3092)

12 days agoFixed C++ CMake dependency and enabled parallel build (#3084)
Matteo Merli [Thu, 29 Nov 2018 05:53:08 +0000 (21:53 -0800)] 
Fixed C++ CMake dependency and enabled parallel build (#3084)

* Fixed C++ CMake dependency and enabled parallel build

* Fixed also for examples

12 days agoAdd pulsar-build:manylinux-cp37-cp37m to pushed docker images (#3091)
massakam [Thu, 29 Nov 2018 05:29:36 +0000 (14:29 +0900)] 
Add pulsar-build:manylinux-cp37-cp37m to pushed docker images (#3091)

When we run `pulsar-client-cpp/docker/create-images.sh`, the following six docker images are created.
```
pulsar-build:manylinux-cp27-cp27mu
pulsar-build:manylinux-cp27-cp27m
pulsar-build:manylinux-cp34-cp34m
pulsar-build:manylinux-cp35-cp35m
pulsar-build:manylinux-cp36-cp36m
pulsar-build:manylinux-cp37-cp37m
```
These images are pushed to Docker Hub when `pulsar-client-cpp/docker/push-images.sh` is executed. But `pulsar-build:manylinux-cp37-cp37m` not included in the target.

https://github.com/apache/pulsar/blob/7bbcc72f5a2690da701c299856ff0d2b62eed5d9/pulsar-client-cpp/docker/create-images.sh#L26-L33
https://github.com/apache/pulsar/blob/7bbcc72f5a2690da701c299856ff0d2b62eed5d9/pulsar-client-cpp/docker/push-images.sh#L28-L34

12 days agoRemoved AspectJ build plugin from pulsar-broker (#3085)
Matteo Merli [Thu, 29 Nov 2018 05:28:32 +0000 (21:28 -0800)] 
Removed AspectJ build plugin from pulsar-broker (#3085)

### Motivation

In #3023 I had moved AspectJ to `pulsar-zookeeper-utils` module. Now the build plugin is not needed anymore in `pulsar-broker`.

12 days agoPIP-25: Added token authentication docs (#3089)
Matteo Merli [Thu, 29 Nov 2018 05:27:43 +0000 (21:27 -0800)] 
PIP-25: Added token authentication docs (#3089)

### Motivation

Added client and admin documentation token based authentication.

12 days agoPIP-25: C++ / Python / Go implementation for token authentication (#3067)
Matteo Merli [Thu, 29 Nov 2018 04:42:43 +0000 (20:42 -0800)] 
PIP-25: C++ / Python / Go implementation for token authentication (#3067)

* PIP-25: C++ implementation for token authentication

* PIP-25: C and Go implementation for token authentication

* PIP-25: Python implementation

* Addressed comments

* Added missing <sstream> include

* Fixed argument name that was changed earlier

* Fixed secret key path

12 days agoSet not just resource requests but also limits (#3043)
Sanjeev Kulkarni [Wed, 28 Nov 2018 22:52:15 +0000 (14:52 -0800)] 
Set not just resource requests but also limits (#3043)

12 days agoIssue #2582: Schema javadoc missing from client api docs (#3086)
Sijie Guo [Wed, 28 Nov 2018 22:51:28 +0000 (14:51 -0800)] 
Issue #2582: Schema javadoc missing from client api docs (#3086)

*Motivation*

Fixes #2582

Schema javadoc is missing from client api docs

12 days agoQueryState now spits out FunctionState (#3076)
Sanjeev Kulkarni [Wed, 28 Nov 2018 21:06:52 +0000 (13:06 -0800)] 
QueryState now spits out FunctionState (#3076)

### Motivation
Instead of spitting some unformatted stuff, made querystate print out a proper json object.

12 days agoPIP-25: Token based authentication (#2888)
Matteo Merli [Wed, 28 Nov 2018 20:42:05 +0000 (12:42 -0800)] 
PIP-25: Token based authentication (#2888)

* PIP-25: Token based authentication

* Addressed comments

* Use Authorization header

* Update to support env: data: and file: as sources for keys and tokens

* Fixed cli description

* Updated broker.conf

* Improved consistency in reading keys and CLI tools

* Fixed check for http headers

* Accept rel time with no specified unit

* Fixed reading data: URL

* Addressed comments

* Added integration tests

* Addressed comments

* Added CLI command to validate token against key

* Fixed integration tests

* Removed env:

* Fixed rel time parsing

12 days agoDon't overwrite integration tests longs if container restarted (#3080)
Ivan Kelly [Wed, 28 Nov 2018 18:57:09 +0000 (19:57 +0100)] 
Don't overwrite integration tests longs if container restarted (#3080)

In the arquillian tests, processes were restarted using supervisorctl,
which preserved the contents of the filesystem. This was changed in
the testcontainers change to restart the whole container, which
obliterates the filesystem.

When a container is stopped, the logs are copied to the target
directory. In the case of multiple restarts the logs from the later
restarts were overwriting the logs from earlier restarts so the older
logs were lost.

This PR adds a check to ensure that we don't overwrite old logs and
appends a number to the log if a old log exists.

13 days agoDisable retry on integration tests (#3081)
Ivan Kelly [Wed, 28 Nov 2018 18:01:30 +0000 (19:01 +0100)] 
Disable retry on integration tests (#3081)

In the original arquillian tests, retry was disabled because tests are
run against a cluster that's brought up at class level, so any retry
would be against a dirty environment. This seems to have been
accidently changed when moving to testcontainers, thought the same
condition holds true.

This patch disabled the retries. These are new tests. They shouldn't
be flaking.

13 days agoExtra logging for offloaders (#2733)
Ivan Kelly [Wed, 28 Nov 2018 16:05:04 +0000 (17:05 +0100)] 
Extra logging for offloaders (#2733)

Some extra logging for offloader loading and initialization which
would have been useful when debugging #2697.

13 days agocache StorageClient (#3078)
Jia Zhai [Wed, 28 Nov 2018 14:45:59 +0000 (22:45 +0800)] 
cache StorageClient (#3078)

13 days agoFixed wordcount example to actually extract words from sentence (#3075)
Sanjeev Kulkarni [Wed, 28 Nov 2018 06:47:23 +0000 (22:47 -0800)] 
Fixed wordcount example to actually extract words from sentence (#3075)

13 days agoDelete extraneous punctuation (#3072)
Grant Wu [Wed, 28 Nov 2018 04:10:37 +0000 (23:10 -0500)] 
Delete extraneous punctuation (#3072)

13 days ago[Pulsar-Flink] Add Scala Examples (#3071)
Eren Avsarogullari [Wed, 28 Nov 2018 01:46:11 +0000 (01:46 +0000)] 
[Pulsar-Flink] Add Scala Examples (#3071)

* [Pulsar-Flink] Add Scala Examples

* Line break is added for input text.

* Adding ASF Header.

* Fix License format

13 days agoprometheus metrics for functions served via brokers or function instances should...
Boyang Jerry Peng [Tue, 27 Nov 2018 22:49:01 +0000 (14:49 -0800)] 
prometheus metrics for functions served via brokers or function instances should match (#3066)

* prometheus metrics for functions served via brokers or instances themselves should match

* add additional testing

13 days agoMake sure to properly count number of processed messages in python (#3060)
Sanjeev Kulkarni [Tue, 27 Nov 2018 18:51:29 +0000 (10:51 -0800)] 
Make sure to properly count number of processed messages in python (#3060)

* Make sure to properly count number of processed messages in python

* Removed total processed

* Fixed build

* Fixed buil

* Address feedback

* Fixed unittest

* Removed unused value

* Added licence headers

* Removed unnecessary changes

* Fix integration tests

* Added numReceived as part of function status

* Unnecessary change revert

* Enhance test

13 days ago[doc] Generate connector yaml files for connectors (#3065)
Sijie Guo [Tue, 27 Nov 2018 18:08:32 +0000 (10:08 -0800)] 
[doc] Generate connector yaml files for connectors (#3065)

*Motivation*

Include the example yaml files in the io distribution package

2 weeks agoCheck for nullness of SchemaInfo (#3064)
Sanjeev Kulkarni [Tue, 27 Nov 2018 02:31:31 +0000 (18:31 -0800)] 
Check for nullness of SchemaInfo (#3064)

### Motivation

Functions using CustomSerde depend on SerdeSchema which returns a null SchemaInfo. The ConsumerImpl needs to check if its null before accessing it.

2 weeks ago[Pulsar-Flink] Extends Validations (#3063)
Eren Avsarogullari [Tue, 27 Nov 2018 02:30:30 +0000 (02:30 +0000)] 
[Pulsar-Flink] Extends Validations (#3063)

### Motivation
This PR aims to extend validations on Flink Connector.

### Modifications
1- `FlinkPulsarProducer` constructor needs to be robust for **blank** values.
2- `PulsarSourceBuilder` needs to be robust for **blank** values.
3- `totalMessageCount` variable looks redundant so can be removed on `PulsarConsumerSource`
4- `PulsarSourceBuilder` UT coverage is added.

### Test Coverage
New UT coverage is added.

2 weeks agoDocumentation for TLS protocol version and ciphers (#3057)
Ivan Kelly [Mon, 26 Nov 2018 18:06:07 +0000 (19:06 +0100)] 
Documentation for TLS protocol version and ciphers (#3057)

PR #1225 added the ability to configure the TLS protocol version and
cipher on the server side, but this was never added to the
documentation. This patch fixes that omission.

Issue: #2402

2 weeks agoMove maven-dependency-plugin in distribution/server/pom.xml (#3053)
Matteo Merli [Mon, 26 Nov 2018 18:05:36 +0000 (10:05 -0800)] 
Move maven-dependency-plugin in distribution/server/pom.xml (#3053)

* Move maven-dependency-plugin in distribution/server/pom.xml

Motivation

maven-dependency-plugin is used to generate a classpath.txt
file with the list of jars for runtime. This is only used
by `bin/pulsar` command and it picks up the dependencies of
the distribution-server module.

There's no need to generate this classpath.txt file for
every module.

* Fixed pom

2 weeks ago Correcting metrics and adding tests (#3050)
Boyang Jerry Peng [Mon, 26 Nov 2018 14:45:10 +0000 (09:45 -0500)] 
 Correcting metrics and adding tests (#3050)

* Correcting metrics and adding tests

* remove commmented out code

* remove space

* fix integration test

* improving impl

* fix ObjectMapper

2 weeks ago[Pulsar-Flink] Add Batch Json Sink Support (#3046)
Eren Avsarogullari [Sun, 25 Nov 2018 19:42:58 +0000 (19:42 +0000)] 
[Pulsar-Flink] Add Batch Json Sink Support (#3046)

### Motivation
This PR aims to add Flink - Pulsar Batch Json Sink Support. If user works with Flink DataSet API and would like to write the DataSets to Pulsar in Json format, this sink can help.

Ref: [Flink Batch Sink API](https://ci.apache.org/projects/flink/flink-docs-stable/dev/batch/#data-sinks)

### Modifications
Please find the change-set as follows:

**1-** Defines `PulsarJsonOutputFormat` to write Flink Batch `DataSets` into Pulsar by providing ready `JsonSerializationSchema`.
**2-** UT Coverages
**3-** `FlinkPulsarBatchJsonSinkExample` to show how to be used by users.
**4-** `README.md `documentation

2 weeks agoAdd configs for broker stats update frequency (#3047)
Ali Ahmed [Sun, 25 Nov 2018 19:42:36 +0000 (11:42 -0800)] 
Add configs for broker stats update frequency (#3047)

This allows to configure the initial delay and update frequency for broker stats. This is useful for dev and test clusters.

Fixes #687

2 weeks ago[state] make setting and opening state table more robust (#3029)
Sijie Guo [Sun, 25 Nov 2018 19:42:00 +0000 (11:42 -0800)] 
[state] make setting and opening state table more robust (#3029)

2 weeks ago[io][docs] introduce annotations for generating connector yaml config files (#2936)
Sijie Guo [Sun, 25 Nov 2018 19:41:26 +0000 (11:41 -0800)] 
[io][docs] introduce annotations for generating connector yaml config files (#2936)

*Motivation*

Currently all io connectors lack example yaml files. Manually write those files is error-prone.
We need a programmable way that automatically generate example connector yaml files.

*Changes*

- introduce annotations for documenting connector yaml files.
- provide a generator to generate yaml files
- provide a shell script to run generator
- when building io package, generate yaml configs

2 weeks agosupport mysql binlog sync to pulsar by canal (#2998)
tuteng [Sun, 25 Nov 2018 19:40:44 +0000 (03:40 +0800)] 
support mysql binlog sync to pulsar by canal (#2998)

### Motivation

support alibaba canal
https://github.com/alibaba/canal/wiki

### Modifications

Integrated canal client

### Result

support binlog sync to pulsar
use python pulsar-client to consume

```
import pulsar

client = pulsar.Client('pulsar://localhost:6650')
consumer = client.subscribe('my-topic',
                            subscription_name='my-sub')

while True:
    msg = consumer.receive()
    print("Received message: '%s'" % msg.data())
    consumer.acknowledge(msg)

client.close()
```

```
output:
Received message: '[{"data":null,"database":"testdb","es":1542446501000,"id":44,"isDdl":true,"mysqlType":null,"old":null,"sql":"CREATE TABLE `users320` (   `id` int(11) NOT NULL AUTO_INCREMENT,   `name` varchar(50) DEFAULT NULL,   `extra` varchar(50) DEFAULT NULL,   PRIMARY KEY (`id`),   KEY `ix_users_name` (`name`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8","sqlType":null,"table":"users320","ts":1542446501114,"type":"CREATE"}]'
```

2 weeks agoFixes for Java 11 (#3006)
Matteo Merli [Sun, 25 Nov 2018 19:25:37 +0000 (11:25 -0800)] 
Fixes for Java 11 (#3006)

* [java-11] Compilation fixes

* Runtime fixes

* Fix dependencies for discovery service

* Fixed json mapper provider in new jersey

* Fixed license files

* Added hk2 dep to proxy

* Added hk2 deps to pulsar-client-admin as well

* added missing jar in license file

* More license fixes

2 weeks agoExpose Redelivery Count By Message Properties(issue-3030) (#3033)
penghui [Sun, 25 Nov 2018 15:39:56 +0000 (23:39 +0800)] 
Expose Redelivery Count By Message Properties(issue-3030) (#3033)

* Expose Redelivery Count By Message Properties(issue-3030)

* Add getRedeliveryCount in Message interface(issue-3030)

* Change redeliveryCount to final.

2 weeks agoUpdate libcurl version included in docker images (#3035)
massakam [Sun, 25 Nov 2018 15:37:04 +0000 (00:37 +0900)] 
Update libcurl version included in docker images (#3035)

* Update libcurl version included in docker images

* Drop python 3.3 support

2 weeks agoIssue #2751: Add check to fix NPE (#3034)
Jia Zhai [Sun, 25 Nov 2018 05:23:31 +0000 (13:23 +0800)] 
Issue #2751: Add check to fix NPE (#3034)

Motivation
We may meet NPE like this:
```
java.lang.NullPointerException: null
at org.apache.pulsar.common.api.proto.PulsarApi$KeyValue$Builder.setValue(PulsarApi.java:1923) ~[org.apache.pulsar-pulsar-common-2.1.1-incubating.jar:2.1.1-incubating]
```
This is related to protobuf, it does not support null-able field directly.
protocolbuffers/protobuf#1606

In this fix we try to avoid this by add checking before this method is called.

2 weeks agoTypo in Auth TLS docs (#3049)
Cristian [Sun, 25 Nov 2018 04:52:47 +0000 (20:52 -0800)] 
Typo in Auth TLS docs (#3049)

2 weeks agoUpdate dockerfile-maven plugin (#3040)
Matteo Merli [Sun, 25 Nov 2018 04:24:27 +0000 (20:24 -0800)] 
Update dockerfile-maven plugin (#3040)

Motivation
dockerfile-maven 1.3.7 has issues connecting to newer Docker daemons. Latest version works fine.

2 weeks ago[website] Add cdc connector doc to sidebar (#3045)
Sijie Guo [Sun, 25 Nov 2018 04:23:54 +0000 (20:23 -0800)] 
[website] Add cdc connector doc to sidebar (#3045)

Add cdc connector doc to sidebar

2 weeks agoDocumentation: impove the doc 'Deploying and managing Pulsar Functions' (#3048)
legendtkl [Sun, 25 Nov 2018 04:23:06 +0000 (12:23 +0800)] 
Documentation: impove the doc 'Deploying and managing Pulsar Functions' (#3048)

when follow the doc Deploying and managing Pulsar Functions to do python udf function create and deploy, we will encounter an exception as follow.
```
$ bin/pulsar-admin functions trigger --tenant public --namespace default --name myfunc --trigger-value "hello world"
HTTP 408 Request Timeout
```
The root cause is the python environment is not ready. To be specific, we should install pulsar python client before do python udf function creating and deploy.

Maybe we should point it out clearly in the doc as following, and that is what the PR is doing.

If you're going to deploy and trigger python user-defined functions, you should install [the pulsar python client](http://pulsar.apache.org/docs/en/client-libraries-python/) first.

2 weeks agoUse topic unload instead of broker restarts in TestS3Offload (#3036)
Matteo Merli [Sun, 25 Nov 2018 02:48:58 +0000 (18:48 -0800)] 
Use topic unload instead of broker restarts in TestS3Offload (#3036)

### Motivation

Topic unload is a lighter way to clean up all the broker state for a topic compared to restarting the broker, though it achieves the same result.

2 weeks ago[Pulsar-Flink] Add Batch Csv Sink Support (#3039)
Eren Avsarogullari [Thu, 22 Nov 2018 19:32:45 +0000 (19:32 +0000)] 
[Pulsar-Flink] Add Batch Csv Sink Support (#3039)

### Motivation
This PR aims to add Flink - Pulsar Batch Csv Sink Support. If user works with Flink DataSet API and would like to write the DataSets to Pulsar in Csv format, this sink can help.

This is also similar approach what Flink currently offers:
```
DataSet<Tuple3<String, Integer, Double>> values = // [...]
values.writeAsCsv(filepath) // writing Datasets to FileSystem
```
Ref: [Flink Batch Sink API](https://ci.apache.org/projects/flink/flink-docs-stable/dev/batch/#data-sinks)

### Modifications
Please find the change-set as follows:

**1-** Defines `PulsarCsvOutputFormat` to write Flink Batch `DataSets` into Pulsar by providing ready `CsvSerializationSchema`.
**2-** Abstracts current implementation to support both `PulsarOutputFormat` (which supports for user-defined serialization schema) and `PulsarCsvOutputFormat`
**3-** UT Coverages
**4-** `FlinkPulsarBatchCsvSinkExample` to show how to be used by users.
**5-** `README.md `documentation

2 weeks agoDisable ZK & BK fsync calls in integration tests (#3037)
Matteo Merli [Thu, 22 Nov 2018 16:42:53 +0000 (08:42 -0800)] 
Disable ZK & BK fsync calls in integration tests (#3037)

* Disable ZK & BK fsync calls in integration tests

* Added warning in zk conf file

2 weeks agoIssue #2657: change function cli getstate to use REST endpoint (#2943)
Jia Zhai [Thu, 22 Nov 2018 12:05:24 +0000 (20:05 +0800)] 
Issue #2657: change function cli getstate to use REST endpoint (#2943)

# Motivation
change StateGetter to use REST endpoint

# Modifications
change StateGetter to use REST endpoint
change related unit test

# Result
unit tests pass.

2 weeks agoMoved broker aspectj to zk-utils (#3023)
Matteo Merli [Thu, 22 Nov 2018 03:47:43 +0000 (19:47 -0800)] 
Moved broker aspectj to zk-utils (#3023)

2 weeks agoFixed lookup of boost_python with boost-1.68 (#3041)
Matteo Merli [Thu, 22 Nov 2018 01:48:21 +0000 (17:48 -0800)] 
Fixed lookup of boost_python with boost-1.68 (#3041)

2 weeks agoAdd default ackTimeout(30s) for dead letter policy (#3014)
penghui [Thu, 22 Nov 2018 01:35:01 +0000 (09:35 +0800)] 
Add default ackTimeout(30s) for dead letter policy (#3014)

Fix issue #2987
### Motivation

In version 2.2.0, support DeadLetterTopic feature. This feature based on message redelivery. So ack timeout is necessary.

### Modifications

Set ackTimeout(30s) when enable the dead letter policy but not set the ackTimeout;

2 weeks agoAdd bookkeeper service and other fixes (#3026)
Byron Ruth [Wed, 21 Nov 2018 21:51:21 +0000 (16:51 -0500)] 
Add bookkeeper service and other fixes (#3026)

* Add bookkeeper service and other fixes

- Add the topologyKey for the pod affinity for bookies
- Fixed a misspelling in the name of the ledgers disk
- Fixed the selector in the proxy service to select proxy pods

Signed-off-by: Byron Ruth <b@devel.io>
* Update bookkeeper service comment to note DNS

Signed-off-by: Byron Ruth <b@devel.io>
2 weeks ago[state] use closeAsync to close state storage client (#3028)
Sijie Guo [Wed, 21 Nov 2018 19:50:48 +0000 (11:50 -0800)] 
[state] use closeAsync to close state storage client (#3028)

2 weeks ago adding windowed metrics for functions (#3021)
Boyang Jerry Peng [Wed, 21 Nov 2018 19:08:09 +0000 (11:08 -0800)] 
 adding windowed metrics for functions (#3021)

* adding windowed metrics for functions

* adding license headers and cleaning up

* remove unnecessary import

* add RestException

* fixing bugs and refactoring code

* fix bug in instanceCache

* fix bug

* add test for stats and fix minor bug

2 weeks agoadd debezium source documentation (#2976)
Jia Zhai [Wed, 21 Nov 2018 08:18:02 +0000 (16:18 +0800)] 
add debezium source documentation (#2976)

* add debezium source documentation

* change following Sanjeev's comments

2 weeks agoFix C++ tests after merge (#3027)
Matteo Merli [Wed, 21 Nov 2018 07:39:10 +0000 (23:39 -0800)] 
Fix C++ tests after merge (#3027)

Motivation
There was a conflict between #3003 and #3020 that made C++ tests to fail after merging because of the namespaces changes.

* Fix C++ tests after merge
* Fix topic name comparison
* Increase tests timeout from 2 to 5min

2 weeks agoEnable SSL of LibCurl linked to C++ library (#3024)
massakam [Wed, 21 Nov 2018 06:12:55 +0000 (15:12 +0900)] 
Enable SSL of LibCurl linked to C++ library (#3024)

* Enable SSL of LibCurl linked to C++ library

* Update openssl versions

* Update openssl to 1.1.0j

2 weeks agoUpdate Function Semantics (#2985)
Sanjeev Kulkarni [Wed, 21 Nov 2018 03:09:34 +0000 (22:09 -0500)] 
Update Function Semantics (#2985)

* Make update functions better

* Compiled

* more checks

* bug fix

* Added tests

* Tests pass

* Fixed tests

* Fixed tests

* Added tests

* Added unittests

* Fixed unittest

* Fixed unittest

* Fixed unittest

* Timeout fix

* Fixed unittest

* Fix unittest

* Addressed feedback