kibble-scanners.git
6 days ago we can't work on a jira ticket without fields data [master]
Daniel Gruno [Mon, 15 Jan 2018 16:19:17 +0000 (17:19 +0100)] 
we can't work on a jira ticket without fields data

12 days ago If not key phrases, put _NULL_ to avoid breaking ES
Daniel Gruno [Tue, 9 Jan 2018 01:37:27 +0000 (02:37 +0100)] 
If not key phrases, put _NULL_ to avoid breaking ES

ES does not seem to like empty sets here, so we'll
put _NULL_ in there when no phrases were found,
and ignore that in the UI.
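
The placeholder approach described above could look roughly like the following. This is a minimal sketch, not the scanner's actual code: the function names and the `NULL_PLACEHOLDER` constant are hypothetical, and only the `_NULL_` value itself comes from the commit message.

```python
# Hypothetical helpers; only the "_NULL_" value is taken from the commit message.
NULL_PLACEHOLDER = "_NULL_"


def phrases_for_index(key_phrases):
    """Return a non-empty list of phrases, using the placeholder if none were found."""
    cleaned = [p.strip() for p in (key_phrases or []) if p and p.strip()]
    return cleaned if cleaned else [NULL_PLACEHOLDER]


def phrases_for_display(stored):
    """Drop the placeholder again before phrases are shown in a UI."""
    return [p for p in stored if p != NULL_PLACEHOLDER]
```

The indexing side never stores an empty collection, and the display side filters the marker back out.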

12 days ago better quote removal
Daniel Gruno [Tue, 9 Jan 2018 00:55:07 +0000 (01:55 +0100)] 
better quote removal

12 days ago Better trimming of unnecessary text elements
Daniel Gruno [Tue, 9 Jan 2018 00:48:41 +0000 (01:48 +0100)] 
Better trimming of unnecessary text elements

We don't want to be analysing:
- quotes
- "on $date, bla bla wrote" sort of sentences
- URLs, email addresses
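
One way to drop the three kinds of text listed above is a handful of regular expressions. The sketch below is an assumption about how such trimming might be done, not the code from this commit; the patterns and the `trim_body` name are illustrative and deliberately simple.

```python
import re

# Illustrative patterns only; real mail bodies will need more care.
QUOTE_LINE = re.compile(r"^\s*>")
ATTRIBUTION = re.compile(r"^\s*on\s.+\bwrote:?\s*$", re.IGNORECASE)
URL = re.compile(r"https?://\S+")
EMAIL = re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")


def trim_body(body):
    """Remove quoted lines, 'On <date>, X wrote:' lines, URLs and email addresses."""
    kept = []
    for line in body.splitlines():
        if QUOTE_LINE.match(line) or ATTRIBUTION.match(line):
            continue
        line = URL.sub("", line)
        line = EMAIL.sub("", line)
        line = " ".join(line.split())  # collapse whitespace left behind by removals
        if line:
            kept.append(line)
    return "\n".join(kept)
```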

12 days ago forgot to add kpe to init.py
Daniel Gruno [Tue, 9 Jan 2018 00:32:31 +0000 (01:32 +0100)] 
forgot to add kpe to init.py

12 days ago Initial stab at KPE for Kibble
Daniel Gruno [Tue, 9 Jan 2018 00:29:09 +0000 (01:29 +0100)] 
Initial stab at KPE for Kibble

This only supports pony mail so far.
We'll have to work on support for Pipermail etc

3 weeks ago additional emotional weighting available
Daniel Gruno [Fri, 29 Dec 2017 11:02:44 +0000 (12:02 +0100)] 
additional emotional weighting available

6 weeks ago there's a value for this too.
Daniel Gruno [Fri, 8 Dec 2017 18:39:14 +0000 (19:39 +0100)] 
there's a value for this too.

6 weeks ago picoAPI has scores for positivity/negativity, let's use those
Daniel Gruno [Fri, 8 Dec 2017 12:48:37 +0000 (13:48 +0100)] 
picoAPI has scores for positivity/negativity, let's use those

6 weeks ago weave picoAPI into pm-tone
Daniel Gruno [Fri, 8 Dec 2017 12:12:02 +0000 (13:12 +0100)] 
weave picoAPI into pm-tone

6 weeks ago add support for picoAPI sentiment analysis
Daniel Gruno [Fri, 8 Dec 2017 12:11:05 +0000 (13:11 +0100)] 
add support for picoAPI sentiment analysis

The more the merrier

6 weeks ago bump the limit from 1 email to 100 per scan at max
Daniel Gruno [Thu, 7 Dec 2017 10:36:10 +0000 (11:36 +0100)] 
bump the limit from 1 email to 100 per scan at max

6 weeks ago get ponymail-tone scanner to work with lib changes
Daniel Gruno [Thu, 7 Dec 2017 10:35:50 +0000 (11:35 +0100)] 
get ponymail-tone scanner to work with lib changes

grab all bodies, array them up, then scan them all at once

6 weeks ago rework tone lib to accept an array of bodies
Daniel Gruno [Thu, 7 Dec 2017 10:35:24 +0000 (11:35 +0100)] 
rework tone lib to accept an array of bodies

Azure accepts up to 1000 bodies at the same time to
speed up and prevent rate limits, so let's make use of that.
also rework the watson to accept an array, even though
it's still one call per body.
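
Splitting a list of bodies into batches of at most 1000 might be sketched as below. The `chunked` helper is hypothetical and no Azure call is made here; only the 1000-per-request figure comes from the commit message.

```python
def chunked(bodies, size=1000):
    """Yield consecutive slices of `bodies`, each holding at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(bodies), size):
        yield bodies[start:start + size]


# Example: 2500 bodies become three batches of 1000, 1000 and 500.
batch_sizes = [len(batch) for batch in chunked(["body"] * 2500)]
```

Each batch can then be sent as one request, which keeps the number of calls (and the chance of hitting a rate limit) low.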

6 weeks ago account for azure rate limiting
Daniel Gruno [Thu, 7 Dec 2017 10:13:59 +0000 (11:13 +0100)] 
account for azure rate limiting

6 weeks ago ensure that azure returns a valid response
Daniel Gruno [Thu, 7 Dec 2017 09:58:15 +0000 (10:58 +0100)] 
ensure that azure returns a valid response

6 weeks ago this needs to be a string representation
Daniel Gruno [Thu, 7 Dec 2017 09:56:05 +0000 (10:56 +0100)] 
this needs to be a string representation

6 weeks ago add commented out example watson/azure creds
Daniel Gruno [Wed, 6 Dec 2017 22:52:01 +0000 (23:52 +0100)] 
add commented out example watson/azure creds

6 weeks ago also add azure text analysis option
Daniel Gruno [Wed, 6 Dec 2017 22:49:02 +0000 (23:49 +0100)] 
also add azure text analysis option

rename watson's to watsonTone.

6 weeks ago catch exception and store in db if we fail to scan
Daniel Gruno [Wed, 6 Dec 2017 11:13:25 +0000 (12:13 +0100)] 
catch exception and store in db if we fail to scan

6 weeks ago report when we're done scanning
Daniel Gruno [Wed, 6 Dec 2017 11:12:02 +0000 (12:12 +0100)] 
report when we're done scanning

6 weeks ago begin work on a PoC twitter scanner
Daniel Gruno [Wed, 6 Dec 2017 11:11:02 +0000 (12:11 +0100)] 
begin work on a PoC twitter scanner

might be replaced with a streams container, but for now we'll
have something to work with, data-wise.

7 weeks ago override alerts should cause a rewind attempt as well
Daniel Gruno [Mon, 27 Nov 2017 15:05:40 +0000 (16:05 +0100)] 
override alerts should cause a rewind attempt as well

When git says 'Your local changes to the following files would be
overwritten by checkout' we should probably try a rewind as well.
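
Detecting that particular git complaint could be as simple as a substring check on the captured output. This is only a sketch: the `should_rewind` name is hypothetical, and what the rewind itself does is not described in the commit message, so it is left out.

```python
# The marker text is the git message quoted in the commit message above.
OVERWRITE_MARKER = (
    "Your local changes to the following files would be overwritten by checkout"
)


def should_rewind(git_output):
    """Return True if git's output contains the 'would be overwritten' complaint."""
    return OVERWRITE_MARKER.lower() in git_output.lower()
```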

2 months ago fail gracefully if watson breaks
Daniel Gruno [Tue, 24 Oct 2017 20:42:04 +0000 (22:42 +0200)] 
fail gracefully if watson breaks

let's do some better debug of this later on.

2 months ago need to import the exceptions module
Daniel Gruno [Tue, 24 Oct 2017 17:55:14 +0000 (19:55 +0200)] 
need to import the exceptions module

2 months ago catch connection aborts
Daniel Gruno [Tue, 24 Oct 2017 17:48:33 +0000 (19:48 +0200)] 
catch connection aborts

2 months ago this is sometimes all caps, need to lowercase it all
Daniel Gruno [Tue, 24 Oct 2017 17:37:08 +0000 (19:37 +0200)] 
this is sometimes all caps, need to lowercase it all

2 months ago bugzilla is also a known robit
Daniel Gruno [Tue, 24 Oct 2017 12:07:36 +0000 (14:07 +0200)] 
bugzilla is also a known robit

2 months ago some lists require credentials
Daniel Gruno [Tue, 24 Oct 2017 12:07:24 +0000 (14:07 +0200)] 
some lists require credentials

also switch to the json plugin for fetching data, since that
handles cookies already

2 months ago compact, add doc_as_upsert for updates which 404s
Daniel Gruno [Mon, 23 Oct 2017 21:47:43 +0000 (23:47 +0200)] 
compact, add doc_as_upsert for updates which 404s

2 months ago upsert isn't working
Daniel Gruno [Mon, 23 Oct 2017 21:39:08 +0000 (23:39 +0200)] 
upsert isn't working

gotta set the optype instead for bulks
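
In Elasticsearch's bulk format, each operation is an action line naming the operation type, followed by a payload line. The sketch below builds such a body for `update` operations by hand. The `bulk_update_lines` helper and the index name are hypothetical; `doc_as_upsert` is Elasticsearch's flag for creating the document when the update target does not exist yet.

```python
import json


def bulk_update_lines(index, docs):
    """Build a newline-delimited bulk body of update operations.

    `docs` maps document IDs to the (partial) documents to store.
    """
    lines = []
    for doc_id, doc in docs.items():
        # Action line: the operation type is the top-level key.
        lines.append(json.dumps({"update": {"_index": index, "_id": doc_id}}))
        # Payload line: doc_as_upsert creates the document if it is missing.
        lines.append(json.dumps({"doc": doc, "doc_as_upsert": True}))
    return "\n".join(lines) + "\n"  # a bulk body must end with a newline


body = bulk_update_lines("example-index", {"abc": {"mood": "calm"}})
```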

2 months ago refactor logic
Daniel Gruno [Mon, 23 Oct 2017 21:26:19 +0000 (23:26 +0200)] 
refactor logic

- we want to stop after 50 emails
- we don't want robits
- emails already analysed count towards the 50!
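
The three rules above might be combined as in the following sketch. Everything here is illustrative: the `pick_emails` name, the shape of an email record, and the `is_robot` callback are assumptions, with only the limit of 50 and the counting rules taken from the commit message.

```python
def pick_emails(emails, already_analysed, is_robot, limit=50):
    """Return the emails that still need analysis.

    Robots are skipped and do not count. Emails that were analysed earlier
    count towards `limit` but are not returned for another scan.
    """
    counted = 0
    to_scan = []
    for email in emails:
        if counted >= limit:
            break
        if is_robot(email):
            continue
        counted += 1
        if email["id"] not in already_analysed:
            to_scan.append(email)
    return to_scan
```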

2 months ago grab a lot of email, bail at 50 legit ones
Daniel Gruno [Mon, 23 Oct 2017 18:42:45 +0000 (20:42 +0200)] 
grab a lot of email, bail at 50 legit ones

2 months ago these should be upserts
Daniel Gruno [Mon, 23 Oct 2017 18:38:24 +0000 (20:38 +0200)] 
these should be upserts

2 months ago crop out quotes from email
Daniel Gruno [Mon, 23 Oct 2017 17:54:18 +0000 (19:54 +0200)] 
crop out quotes from email

we're only interested in the actual reply, not what was quoted

2 months ago if watson borks, we bork silently
Daniel Gruno [Mon, 23 Oct 2017 17:47:19 +0000 (19:47 +0200)] 
if watson borks, we bork silently

2 months ago less confusion if we don't print the ignored emails
Daniel Gruno [Mon, 23 Oct 2017 17:09:08 +0000 (19:09 +0200)] 
less confusion if we don't print the ignored emails

let's only print debug data WHEN we find an email to analyse

2 months ago ignore robits
Daniel Gruno [Mon, 23 Oct 2017 17:04:31 +0000 (19:04 +0200)] 
ignore robits

2 months ago add ponymail tone analysis
Daniel Gruno [Mon, 23 Oct 2017 17:03:35 +0000 (19:03 +0200)] 
add ponymail tone analysis

Only enabled if watson is configured.
set to only grab the last 50 emails, so as to not
abuse the service. WIP!

2 months ago add a small plugin for tone analysis
Daniel Gruno [Mon, 23 Oct 2017 17:02:23 +0000 (19:02 +0200)] 
add a small plugin for tone analysis

2 months ago kibblebit shortcut for dbname, notify when pushing stragglers
Daniel Gruno [Mon, 23 Oct 2017 17:02:02 +0000 (19:02 +0200)] 
kibblebit shortcut for dbname, notify when pushing stragglers

2 months ago note that ForkManager is required here
Daniel Gruno [Mon, 23 Oct 2017 10:20:48 +0000 (12:20 +0200)] 
note that ForkManager is required here

2 months ago note that a threaded cloc exists
Daniel Gruno [Mon, 23 Oct 2017 10:18:56 +0000 (12:18 +0200)] 
note that a threaded cloc exists

2 months ago default to 'issue' issue type
Daniel Gruno [Sun, 22 Oct 2017 16:55:09 +0000 (18:55 +0200)] 
default to 'issue' issue type

This is so we can easily distinguish between pull/merge requests
and plain issues

3 months ago use kibblebit.append here, properly upsert docs
Daniel Gruno [Sun, 22 Oct 2017 16:27:28 +0000 (18:27 +0200)] 
use kibblebit.append here, properly upsert docs

3 months ago keep it quiet unless there is an error
Daniel Gruno [Sat, 21 Oct 2017 19:41:27 +0000 (21:41 +0200)] 
keep it quiet unless there is an error

git keeps 'odd' output for stderr, so pipe that
to stdout and only print if return code isn't 0
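
With Python's `subprocess` module, merging stderr into stdout and printing only on failure might look like the sketch below. The `run_quietly` name is hypothetical, and a Python one-liner stands in for the git command so the example runs anywhere.

```python
import subprocess
import sys


def run_quietly(cmd, cwd=None):
    """Run `cmd`, folding stderr into stdout; print the output only on failure."""
    proc = subprocess.run(
        cmd,
        cwd=cwd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # stderr goes to the same pipe as stdout
        text=True,
    )
    if proc.returncode != 0:
        print(proc.stdout, end="")
    return proc.returncode, proc.stdout


# A command that writes to stderr but succeeds stays silent.
ok_code, ok_output = run_quietly(
    [sys.executable, "-c", "import sys; sys.stderr.write('noise')"]
)
```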

3 months ago fix path, output brokenness
Daniel Gruno [Sat, 21 Oct 2017 19:27:42 +0000 (21:27 +0200)] 
fix path, output brokenness

- fix the path (cd to full path!)
- on error, spit out what went wrong

3 months ago only display count on every 10th ticket
Daniel Gruno [Sat, 21 Oct 2017 19:09:39 +0000 (21:09 +0200)] 
only display count on every 10th ticket

we don't wanna flood when we have thousands of tickets.

3 months ago needs to be forced UTC here
Daniel Gruno [Sat, 21 Oct 2017 09:59:38 +0000 (11:59 +0200)] 
needs to be forced UTC here

3 months ago we can safely push larger bulk objects
Daniel Gruno [Sat, 21 Oct 2017 09:44:10 +0000 (11:44 +0200)] 
we can safely push larger bulk objects

3 months ago we also accept trunk as a default, if no master
Daniel Gruno [Sat, 21 Oct 2017 09:40:09 +0000 (11:40 +0200)] 
we also accept trunk as a default, if no master

3 months ago rename coc document
Daniel Gruno [Sat, 21 Oct 2017 09:19:32 +0000 (11:19 +0200)] 
rename coc document

3 months ago start working on a contributing doc
Daniel Gruno [Sat, 21 Oct 2017 09:18:17 +0000 (11:18 +0200)] 
start working on a contributing doc

3 months ago Add the ASF Code of Conduct
Daniel Gruno [Sat, 21 Oct 2017 09:16:25 +0000 (11:16 +0200)] 
Add the ASF Code of Conduct

3 months ago use KibbleBit.pprint here
Daniel Gruno [Sat, 21 Oct 2017 08:53:32 +0000 (10:53 +0200)] 
use KibbleBit.pprint here

3 months ago update source status
Daniel Gruno [Sat, 21 Oct 2017 08:53:16 +0000 (10:53 +0200)] 
update source status

3 months ago github issues now have labels, let's store those
Daniel Gruno [Fri, 20 Oct 2017 19:54:52 +0000 (21:54 +0200)] 
github issues now have labels, let's store those

3 months ago return plugins in correct run-order
Daniel Gruno [Fri, 20 Oct 2017 12:24:32 +0000 (14:24 +0200)] 
return plugins in correct run-order

we need sync to happen first, then the rest..
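
Putting sync plugins ahead of the rest while keeping the remaining order intact can be done with a stable sort. This is a sketch under an assumption: it identifies sync plugins by a `-sync` name suffix, which is a guess and not necessarily how the scanner decides the order.

```python
def in_run_order(plugin_names):
    """Return plugin names with '*-sync' plugins first, others in original order."""
    # sorted() is stable, so plugins with the same key keep their relative order.
    return sorted(plugin_names, key=lambda name: not name.endswith("-sync"))


ordered = in_run_order(["git-census", "git-sync", "git-evolution"])
```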

3 months ago get involved!
Daniel Gruno [Thu, 19 Oct 2017 15:08:02 +0000 (17:08 +0200)] 
get involved!

tbd!

3 months ago add requirements for running scanners
Daniel Gruno [Thu, 19 Oct 2017 15:03:10 +0000 (17:03 +0200)] 
add requirements for running scanners

3 months ago distinguish between issue and PR
Daniel Gruno [Wed, 18 Oct 2017 15:54:52 +0000 (17:54 +0200)] 
distinguish between issue and PR

3 months ago add gh traffic stats scanner
Daniel Gruno [Wed, 18 Oct 2017 12:56:50 +0000 (14:56 +0200)] 
add gh traffic stats scanner

3 months ago fix var name
Daniel Gruno [Mon, 16 Oct 2017 19:17:20 +0000 (21:17 +0200)] 
fix var name

3 months ago ensure db versions match
Daniel Gruno [Mon, 16 Oct 2017 19:12:19 +0000 (21:12 +0200)] 
ensure db versions match

3 months ago don't print gigantic diffs!
Daniel Gruno [Thu, 12 Oct 2017 22:21:35 +0000 (00:21 +0200)] 
don't print gigantic diffs!

3 months ago debug print
Daniel Gruno [Thu, 12 Oct 2017 15:44:05 +0000 (17:44 +0200)] 
debug print

3 months ago add per-view scans
Daniel Gruno [Thu, 12 Oct 2017 15:43:59 +0000 (17:43 +0200)] 
add per-view scans

3 months ago properly add/update docs
Daniel Gruno [Wed, 11 Oct 2017 21:30:24 +0000 (23:30 +0200)] 
properly add/update docs

3 months ago work with upserts, complain if no ID present
Daniel Gruno [Wed, 11 Oct 2017 21:29:53 +0000 (23:29 +0200)] 
work with upserts, complain if no ID present

3 months ago scan for and save reply-to values and subjects
Daniel Gruno [Wed, 11 Oct 2017 15:11:40 +0000 (17:11 +0200)] 
scan for and save reply-to values and subjects

3 months ago properly allow filtering by source ID
Daniel Gruno [Wed, 11 Oct 2017 15:11:19 +0000 (17:11 +0200)] 
properly allow filtering by source ID

3 months ago this needs to be a UTC time stamp
Daniel Gruno [Sun, 24 Sep 2017 23:31:50 +0000 (01:31 +0200)] 
this needs to be a UTC time stamp

3 months ago add a simple text version of get
Daniel Gruno [Fri, 22 Sep 2017 16:10:35 +0000 (18:10 +0200)] 
add a simple text version of get

3 months ago Add load balancing
Daniel Gruno [Fri, 22 Sep 2017 13:09:56 +0000 (15:09 +0200)] 
Add load balancing

If working with multiple nodes, allow them to determine
which sources to work with and which to pass to other nodes.
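
The commit message does not say how nodes divide the work. One common scheme, shown here purely as an illustration, is to hash each source ID and take it modulo the node count, so every node can decide independently which sources are its own. All names below are hypothetical.

```python
import hashlib


def owner_of(source_id, total_nodes):
    """Map a source ID to a node index in [0, total_nodes) via a stable hash."""
    digest = hashlib.sha1(source_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % total_nodes


def sources_for_node(source_ids, node_index, total_nodes):
    """Return the sources this node should handle; the rest belong to other nodes."""
    return [s for s in source_ids if owner_of(s, total_nodes) == node_index]
```

A stable hash is used instead of Python's built-in `hash()`, which is randomized per process for strings and would give each node a different answer.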

3 months ago Add evolution scanner
Daniel Gruno [Fri, 22 Sep 2017 11:45:35 +0000 (13:45 +0200)] 
Add evolution scanner

3 months ago replace with ints inside sloc util
Daniel Gruno [Fri, 22 Sep 2017 11:45:06 +0000 (13:45 +0200)] 
replace with ints inside sloc util

3 months ago Add SLoC counter(s)
Daniel Gruno [Fri, 22 Sep 2017 11:13:14 +0000 (13:13 +0200)] 
Add SLoC counter(s)

3 months ago Add git census scanner
Daniel Gruno [Fri, 22 Sep 2017 10:55:54 +0000 (12:55 +0200)] 
Add git census scanner

3 months ago print when we force a bulk push
Daniel Gruno [Fri, 22 Sep 2017 09:58:11 +0000 (11:58 +0200)] 
print when we force a bulk push

3 months ago Add Gerrit Code Review scanner
Daniel Gruno [Fri, 22 Sep 2017 09:58:02 +0000 (11:58 +0200)] 
Add Gerrit Code Review scanner

3 months ago ditch this name for now, premature!
Daniel Gruno [Fri, 22 Sep 2017 09:20:11 +0000 (11:20 +0200)] 
ditch this name for now, premature!

3 months ago update readme
Daniel Gruno [Fri, 22 Sep 2017 09:18:08 +0000 (11:18 +0200)] 
update readme

3 months ago Add git-sync plugin and git utility lib
Daniel Gruno [Fri, 22 Sep 2017 09:16:42 +0000 (11:16 +0200)] 
Add git-sync plugin and git utility lib

4 months ago Add BugZilla scanner
Daniel Gruno [Thu, 21 Sep 2017 19:05:08 +0000 (21:05 +0200)] 
Add BugZilla scanner

4 months ago forgot the string to check
Daniel Gruno [Thu, 21 Sep 2017 18:59:04 +0000 (20:59 +0200)] 
forgot the string to check

4 months ago Add GitHub Issues scanner
Daniel Gruno [Thu, 21 Sep 2017 17:52:55 +0000 (19:52 +0200)] 
Add GitHub Issues scanner

4 months ago this is defined inside scan()
Daniel Gruno [Thu, 21 Sep 2017 17:21:24 +0000 (19:21 +0200)] 
this is defined inside scan()

4 months ago note which scanners we have available atm
Daniel Gruno [Thu, 21 Sep 2017 17:16:21 +0000 (19:16 +0200)] 
note which scanners we have available atm

4 months ago Add JIRA scanner
Daniel Gruno [Thu, 21 Sep 2017 16:53:04 +0000 (18:53 +0200)] 
Add JIRA scanner

4 months ago Updates
Daniel Gruno [Thu, 21 Sep 2017 16:52:03 +0000 (18:52 +0200)] 
Updates

- allow fetching a document
- pprint bulk error
- sort sources

4 months ago allow running only a specific scanner
Daniel Gruno [Thu, 21 Sep 2017 16:51:21 +0000 (18:51 +0200)] 
allow running only a specific scanner

4 months ago PM requires cookies
Daniel Gruno [Thu, 21 Sep 2017 16:50:47 +0000 (18:50 +0200)] 
PM requires cookies

4 months ago fixups for basic auth
Daniel Gruno [Thu, 21 Sep 2017 16:48:59 +0000 (18:48 +0200)] 
fixups for basic auth

4 months ago pprint this
Daniel Gruno [Thu, 21 Sep 2017 12:51:16 +0000 (14:51 +0200)] 
pprint this

4 months ago add initial pipermail scanner and urlmisc lib for it
Daniel Gruno [Thu, 21 Sep 2017 12:49:37 +0000 (14:49 +0200)] 
add initial pipermail scanner and urlmisc lib for it

4 months ago remove this, it's in the scanners dir as __init__.py now
Daniel Gruno [Thu, 21 Sep 2017 12:08:44 +0000 (14:08 +0200)] 
remove this, it's in the scanners dir as __init__.py now

4 months ago update readme to reflect new structure
Daniel Gruno [Thu, 21 Sep 2017 12:07:57 +0000 (14:07 +0200)] 
update readme to reflect new structure

4 months ago change plugin structure
Daniel Gruno [Thu, 21 Sep 2017 12:07:02 +0000 (14:07 +0200)] 
change plugin structure

4 months ago add jsonapi.post, rename getJSON to get
Daniel Gruno [Thu, 21 Sep 2017 11:54:35 +0000 (13:54 +0200)] 
add jsonapi.post, rename getJSON to get

4 months ago Add age limitation arg, expand help
Daniel Gruno [Thu, 21 Sep 2017 11:18:50 +0000 (13:18 +0200)] 
Add age limitation arg, expand help

--age N will set a rule that sources will not be scanned
if they have been scanned less than N hours ago by any scanner.
New sources that have never been scanned will of course
be scanned regardless.
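
The rule behind `--age N` can be written as a small predicate. The sketch below assumes the last scan time is kept as a Unix timestamp (or `None` for a source that was never scanned); the `should_scan` name and its parameters are hypothetical.

```python
import time


def should_scan(last_scanned, max_age_hours, now=None):
    """Decide whether a source is due for a scan.

    last_scanned: Unix timestamp of the most recent scan by any scanner,
    or None if the source has never been scanned.
    """
    if last_scanned is None:
        return True  # new sources are always scanned
    if max_age_hours is None:
        return True  # no --age rule given
    current = time.time() if now is None else now
    return (current - last_scanned) >= max_age_hours * 3600
```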