Better README and out-of-box example #7

Open
wants to merge 1 commit into
base: master
91 changes: 64 additions & 27 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,23 +1,36 @@
Kohlrabi is a mini webapp, based off of Tornado, for viewing tabular report
data. You can try running it like this:
Kohlrabi is a [Tornado](https://github.com/facebook/tornado)-based webapp for viewing tabular report data.

python kohlrabi/main.py -c config.yaml.example
Out of the Box Example
======================

You can try running kohlrabi immediately like this:

example/launch_server_example.sh
python example/pusher_example.py

This will start a Kohlrabi server instance at http://localhost:8888/kohlrabi/.
Then pusher_example.py will push some fake data to the server in JSON form to
http://localhost:8888/kohlrabi/upload/. You can then browse two different data
reports for two different days each: 'Daily Signups' and 'MySQL Query Report' for
the days of 2011-02-14 and 2011-02-13.

Customizing Kohlrabi
====================

Out of the box, Kohlrabi includes a module `kohlrabi.modules.example` that
demonstrates some example reports. These are meant to be an inspiration for the
types of reports you might want to put into Kohlrabi, and how to create a new
class. However, they're probably not very useful to most people; most people
will need to customize what data they store in Kohlrabi to be slightly or
entirely different.
Out of the box, Kohlrabi includes two sample module definitions in `kohlrabi/modules/example.py`.
The two modules are `Daily Signups` and `MySQL Query Report`. These are meant to be
templates for the types of reports you might want to put into
Kohlrabi, and they show how to create and add new modules. However, they're probably not immediately
useful to most people; you will need to customize what data you store in Kohlrabi
to satisfy your needs.

The way customization works in Kohlrabi is to create a new Python module in the
same format as the one in `kohlrabi.modules.example` (look at the source
The way customization works in Kohlrabi is to create a new Python file in the
same format as the one in `kohlrabi/modules/example.py` (look at the source
code). In the configuration file, you'll specify this as your `module`; this
module should be something available in `sys.path` that can be imported using
Python's `__import__` directive. Any SQLAlchemy tables in this module with the
Python's `__import__` function. (The example script works because the current
directory is automatically added to `sys.path`; you will certainly want to set it
explicitly in a production environment.) Any SQLAlchemy tables in this module with the
metaclass `ReportMeta` will be detected by Kohlrabi as a potential data source,
which you can upload data for.
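
As a rough illustration of the import-and-detect step described above, here is a minimal Python 3 sketch. It is not Kohlrabi's actual code: the `ReportMeta` class below is a stand-in, `load_reports` and `demo_reports` are hypothetical names, and the real detection logic may work differently.

```python
import importlib
import sys
import types


class ReportMeta(type):
    """Stand-in for Kohlrabi's metaclass (an assumption, not the real one)."""


def load_reports(module_name):
    """Import a dotted module name and list the classes built with ReportMeta."""
    module = importlib.import_module(module_name)  # resolved via sys.path
    return sorted(name for name, obj in vars(module).items()
                  if isinstance(obj, ReportMeta))


# Register a throwaway module so the sketch runs without Kohlrabi installed.
demo = types.ModuleType('demo_reports')
demo.DailySignups = ReportMeta('DailySignups', (object,), {})
demo.NotAReport = type('NotAReport', (object,), {})
sys.modules['demo_reports'] = demo

print(load_reports('demo_reports'))
```

Only the class created through the metaclass is reported; ordinary classes in the same module are ignored.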

@@ -27,10 +40,14 @@ server, indicating the date, the data for the report, and the data source.
The next section will cover this in more detail.

Adding New Reports
------------------
==================

Setting up a Module
-------------------

It's easiest to explain this with an example. Suppose the report module
specified by the config `module` variable has the following code in it:
(This code is available in `kohlrabi/modules/example.py`)

from sqlalchemy import *
from kohlrabi.db import *
@@ -54,14 +71,17 @@ specified by the config `module` variable has the following code in it:

@classmethod
def report_data(cls, date):
return session.query(cls).filter(cls.date == date).order_by(cls.signups.id)
return session.query(cls).filter(cls.date == date).order_by(cls.signups)

This is a data source that will track users who sign up on your site, based on
the HTTP `Referer` header. The table has three columns: `referrer` will track
the domain that referred the initial visit to your site, `clickthroughs` will
track how many people came to the site from that referrer, and `signups` will
track how many of those people actually signed up.

Setting up the Database
-----------------------

The next step is to create the table in your Kohlrabi SQLite database. If you
don't do this, Kohlrabi will automatically create the table, but the table won't
have any indexes. In most cases you should probably at least add an index on the
@@ -80,37 +100,54 @@ querying from the `report_data` method:

OK, that's all the setup you need to do on the Kohlrabi side of things: create a
Python SQLAlchemy class, and create a table in your SQLite database. The second
step is to write a report that generates data to store in Kohlrabi. You can do
this however you want, in any language you want. This report should finish by
step is to write a report that generates data to store in Kohlrabi.
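
The table setup described above can be sketched with Python's built-in `sqlite3` module. This is only an illustration under assumptions: the column names come from the `DailySignups` class in this pull request, the index is assumed to be on `date` (the column `report_data` filters on), the index name `daily_signups_date_idx` is made up, and an in-memory database stands in for the real database file.

```python
import sqlite3

# In-memory database for illustration; a real setup would open the SQLite
# file named by the `database` setting in the config file.
conn = sqlite3.connect(':memory:')
conn.execute("""
    CREATE TABLE daily_signups (
        id INTEGER PRIMARY KEY,
        date DATE NOT NULL,
        referrer VARCHAR NOT NULL,
        clickthroughs INTEGER NOT NULL DEFAULT 0,
        signups INTEGER NOT NULL DEFAULT 0
    )
""")
# Hypothetical index name; it speeds up the per-date lookup in report_data.
conn.execute("CREATE INDEX daily_signups_date_idx ON daily_signups (date)")

index_names = [row[1] for row in
               conn.execute("PRAGMA index_list('daily_signups')")]
print(index_names)
```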

Sending data to Kohlrabi
-------------------------

You can do this however you want, in any language you want. This report should finish by
making a normal HTTP POST request to your Kohlrabi instance, with URL `/upload`,
and the following POST parameters:

* `date` -- the date for this data, in the format YYYY-MM-DD
* `table` -- the name of the Python class you defined earlier (in this example, `DailySignups`)
* `module` -- the name of the Python class you defined earlier (in this example, `DailySignups`)
* `data` -- A JSON list of dictionaries mapping column names (excluding `id` and `date`) to their respective values
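
The POST parameters listed above can be assembled as follows. This is a Python 3 sketch (the scripts in this pull request target Python 2), and `build_upload_body` is a hypothetical helper name rather than part of Kohlrabi.

```python
import json
from urllib.parse import parse_qs, urlencode


def build_upload_body(date, module, rows):
    """Encode the POST parameters as a form-encoded request body."""
    return urlencode({
        'date': date,              # YYYY-MM-DD
        'module': module,          # report class name, e.g. 'DailySignups'
        'data': json.dumps(rows),  # JSON list of column-name -> value dicts
    }).encode('ascii')


body = build_upload_body('2011-02-13', 'DailySignups',
                         [{'referrer': 'www.google.com',
                           'clickthroughs': 23452,
                           'signups': 432}])

# Decode the body again to show what the server would receive.
fields = parse_qs(body.decode('ascii'))
print(fields['module'][0], json.loads(fields['data'][0])[0]['signups'])
```

To actually send the request, the resulting bytes could be passed as the `data` argument of `urllib.request.urlopen` together with the upload URL of a running instance.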

For instance, if we were running Kohlrabi on `http://localhost:8888` with the path prefix `/kohlrabi/`, then the
following Python code would generate a sample report for 2011-02-13
(this code is available in `example/pusher_example.py`):

import json
import urllib

urllib.urlopen('http://localhost:8888/upload',
urllib.urlencode({'date': '2010-01-01',
'data': json.dumps([{'referrer': 'www.yahoo.com',
'clickthroughs': 100,
'signups': 7},
{'referrer': 'www.google.com',
'clickthroughs': 500,
'signups': 42}]),
'table': 'DailySignups'}))
urllib.urlopen('http://localhost:8888/kohlrabi/upload',
urllib.urlencode({'date': '2011-02-13',
'data': json.dumps([{'referrer': 'www.yahoo.com',
'clickthroughs': 32984,
'signups': 123},
{'referrer': 'www.google.com',
'clickthroughs': 23452,
'signups': 432},
{'referrer': 'www.excite.com',
'clickthroughs': 82,
'signups': 0},
{'referrer': 'www.ask.com',
'clickthroughs': 31,
'signups': 0},
{'referrer': 'www.cuil.com',
'clickthroughs': 4,
'signups': 0},
{'referrer': 'www.bing.com',
'clickthroughs': 21032,
'signups': 98}]),
'module': 'DailySignups'}))

Just to reiterate: because the interface to Kohlrabi is a normal HTTP request
using JSON, you can use any language to send data to Kohlrabi. You can use Java,
Ruby, a bash script, etc. Whatever works for you.

Configuration
-------------
=============

This section describes the parameters that can be placed in the config file. The
config file should be in YAML format. You can specify the path to the
2 changes: 1 addition & 1 deletion config.yaml.example → example/config.yaml.example
@@ -1,4 +1,4 @@
database: sqlite:///kohlrabi.sqlite
database: sqlite:///example/example.db
debug: true
module: kohlrabi.modules.example
path_prefix: /kohlrabi/
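
With `path_prefix` set to `/kohlrabi/`, the upload endpoint used by the example scripts is `http://localhost:8888/kohlrabi/upload`. A small Python 3 sketch of how a client might derive that URL from the host and the prefix; `upload_url` is a hypothetical helper, not part of Kohlrabi.

```python
from urllib.parse import urljoin


def upload_url(host, path_prefix):
    """Join host, path prefix and the 'upload' path into one URL."""
    base = urljoin(host, path_prefix)
    if not base.endswith('/'):
        base += '/'  # urljoin would otherwise replace the last path segment
    return urljoin(base, 'upload')


print(upload_url('http://localhost:8888', '/kohlrabi/'))
```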
Binary file added example/example.db
Binary file not shown.
11 changes: 11 additions & 0 deletions example/launch_server_example.sh
@@ -0,0 +1,11 @@
#!/usr/bin/env bash

export PYTHONPATH=kohlrabi/:example/:$PYTHONPATH
echo
echo Starting Kohlrabi Server using config.yaml.example
echo Try visiting http://localhost:8888/kohlrabi/
echo run: \'python example/pusher_example.py\' to push some sample data to the server
echo Press C-c to stop
echo

python -m main -c example/config.yaml.example
96 changes: 96 additions & 0 deletions example/pusher_example.py
@@ -0,0 +1,96 @@
#!/usr/bin/env python

import json
import urllib

# Daily Signups for 2011-02-13
urllib.urlopen('http://localhost:8888/kohlrabi/upload',
urllib.urlencode({'date': '2011-02-13',
'data': json.dumps([{'referrer': 'www.yahoo.com',
'clickthroughs': 32984,
'signups': 123},
{'referrer': 'www.google.com',
'clickthroughs': 23452,
'signups': 432},
{'referrer': 'www.excite.com',
'clickthroughs': 82,
'signups': 0},
{'referrer': 'www.ask.com',
'clickthroughs': 31,
'signups': 0},
{'referrer': 'www.cuil.com',
'clickthroughs': 4,
'signups': 0},
{'referrer': 'www.bing.com',
'clickthroughs': 21032,
'signups': 98}]),
'module': 'DailySignups'}))

# Daily Signups for 2011-02-14
urllib.urlopen('http://localhost:8888/kohlrabi/upload',
urllib.urlencode({'date': '2011-02-14',
'data': json.dumps([{'referrer': 'www.yahoo.com',
'clickthroughs': 32234,
'signups': 87},
{'referrer': 'www.google.com',
'clickthroughs': 21103,
'signups': 499},
{'referrer': 'www.excite.com',
'clickthroughs': 65,
'signups': 0},
{'referrer': 'www.ask.com',
'clickthroughs': 45,
'signups': 1},
{'referrer': 'www.cuil.com',
'clickthroughs': 1,
'signups': 0},
{'referrer': 'www.bing.com',
'clickthroughs': 29238,
'signups': 121}]),
'module': 'DailySignups'}))

# MySQL Query Report for 2011-02-13
urllib.urlopen('http://localhost:8888/kohlrabi/upload',
urllib.urlencode({'date': '2011-02-13',
'data': json.dumps([{'servlet': 'api.web_cmds',
'servlet_count': 3,
'query_text': 'SELECT * FROM some_silly_table',
'query_count': 343,
'query_mean': 83.2,
'query_median': 89.0,
'query_total': 233442.2,
'query_95': 213212.34,
'query_stddev': 12.3},
{'servlet': 'api.something',
'servlet_count': 42,
'query_text': 'SELECT * FROM not_a_table WHERE id IN (0,1,2,3,4,5)',
'query_count': 435243,
'query_mean': 823234.4232,
'query_median': 83232.23,
'query_total': 233.221,
'query_95': 213.232,
'query_stddev': 321.23}]),
'module': 'MySQLQueryReport'}))

# MySQL Query Report for 2011-02-14
urllib.urlopen('http://localhost:8888/kohlrabi/upload',
urllib.urlencode({'date': '2011-02-14',
'data': json.dumps([{'servlet': 'api.web_cmds',
'servlet_count': 4,
'query_text': 'SELECT * FROM some_silly_table',
'query_count': 3221,
'query_mean': 83.1,
'query_median': 83.0,
'query_total': 233922.1,
'query_95': 2132323.2,
'query_stddev': 11.1},
{'servlet': 'api.something',
'servlet_count': 43,
'query_text': 'SELECT * FROM not_a_table WHERE id IN (0,1,2,3,4,5)',
'query_count': 435213,
'query_mean': 823233.4232,
'query_median': 83322.23,
'query_total': 23932.21,
'query_95': 223.233,
'query_stddev': 322.33}]),
'module': 'MySQLQueryReport'}))
2 changes: 1 addition & 1 deletion kohlrabi/handlers.py
@@ -113,7 +113,7 @@ class Uploader(RequestHandler):
path = '/upload'

def post(self):
table = self.get_argument('table')
table = self.get_argument('module')
data = str(self.get_argument('data'))
data = json.loads(data)
date = self.parse_date(self.get_argument('date', None))
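
The handler's `parse_date` method is not shown in this diff. As a rough idea of what such a helper might do with the `YYYY-MM-DD` strings the README describes, here is a standalone Python 3 sketch; the signature and the fallback behaviour for a missing date are assumptions.

```python
import datetime


def parse_date(value, default=None):
    """Parse a YYYY-MM-DD string into a date; return `default` for None."""
    if value is None:
        return default
    return datetime.datetime.strptime(value, '%Y-%m-%d').date()


print(parse_date('2011-02-13'))
```

A malformed string raises `ValueError`, which a real handler would presumably turn into an HTTP error response.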
5 changes: 4 additions & 1 deletion kohlrabi/main.py
@@ -97,4 +97,7 @@ def run_application():
stream_handler = logging.StreamHandler()
stream_handler.setLevel(logging.DEBUG if debug else logging.INFO)
log.addHandler(stream_handler)
run_application()
try:
run_application()
except KeyboardInterrupt:
pass
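
The change above makes Ctrl-C stop the server without printing a traceback. The same pattern in isolation, as a Python 3 sketch; `run_quietly` and `fake_server` are illustrative names, with `fake_server` standing in for `run_application()`.

```python
def run_quietly(run):
    """Call `run` and treat Ctrl-C as a normal shutdown, not an error."""
    try:
        run()
    except KeyboardInterrupt:
        return 'stopped'
    return 'finished'


def fake_server():
    # Stand-in for run_application(); simulates the user pressing Ctrl-C.
    raise KeyboardInterrupt


print(run_quietly(fake_server))
```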
76 changes: 12 additions & 64 deletions kohlrabi/modules/example.py
@@ -1,5 +1,5 @@
"""This is an example database module. Using some metaclass magic, the kohlrabi
instance that imports this file will learn about the query report.
instance that imports this file will learn about the report.
"""
from sqlalchemy import *
from kohlrabi.db import *
@@ -38,78 +38,26 @@ class MySQLQueryReport(Base):
def report_data(cls, date):
return session.query(cls).filter(cls.date == date).order_by(cls.servlet).order_by(desc(cls.query_total))

class MemcacheReport(Base):

__tablename__ = 'memcache_report'
__metaclass__ = ReportMeta
class DailySignups(Base):

id = Column(Integer, primary_key=True)
date = Column(Date, nullable=False)
servlet = Column(String, nullable=False)
servlet_count = Column(Integer, nullable=False)
cache_name = Column(String, nullable=False)
hits = Column(Integer, default=0, nullable=False)
misses = Column(Integer, default=0, nullable=False)
hit_rate = Column(Float, default=0, nullable=False)
frequency = Column(Float, default=0, nullable=False)
latency_mean = Column(Float, default=0, nullable=False)
latency_stddev = Column(Float, default=0, nullable=False)
bytes_mean = Column(Float, default=0, nullable=False)
bytes_stddev = Column(Float, default=0, nullable=False)
miss_latency_mean = Column(Float, default=0, nullable=False)
miss_latency_stddev = Column(Float, default=0, nullable=False)
time_saved = Column(Float, default=0, nullable=False)

display_name = 'Memcache Report'
html_table = [
ReportColumn('Servlet', 'servlet'),
ReportColumn('Cache Name', 'cache_name'),
ReportColumn('Frequency', 'frequency'),
ReportColumn('Time Saved', 'time_saved'),
ReportColumn('Servlet Count', 'servlet_count'),
ReportColumn('Hits', 'hits'),
ReportColumn('Misses', 'misses'),
ReportColumn('Hit Rate', 'hit_rate', format=format_percentage),
ReportColumn('Latency Mean', 'latency_mean'),
ReportColumn('Latency Stddev', 'latency_stddev'),
ReportColumn('Miss Latency Mean', 'miss_latency_mean'),
ReportColumn('Miss Latency Stddev', 'miss_latency_stddev'),
ReportColumn('Kb Mean', 'bytes_mean', format=format_kb),
ReportColumn('Kb Stddev', 'bytes_stddev', format=format_kb),
]

@classmethod
def report_data(cls, date):
return session.query(cls).filter(cls.date == date).order_by(cls.servlet).order_by(desc(cls.frequency)).order_by(desc(cls.hits))

class ServletBreakdownReport(Base):

__tablename__ = 'servlet_breakdown_report'
__tablename__ = 'daily_signups'
__metaclass__ = ReportMeta

id = Column(Integer, primary_key=True)
date = Column(Date, nullable=False)
servlet = Column(String, nullable=False)
servlet_count = Column(Integer, nullable=False)
logged_in = Column(Boolean, nullable=False)
db_mean = Column(Float, default=0, nullable=False)
memcache_mean = Column(Float, default=0, nullable=False)
template_mean = Column(Float, default=0, nullable=False)
other_mean = Column(Float, default=0, nullable=False)
total_mean = Column(Float, default=0, nullable=False)
referrer = Column(String, nullable=False)
clickthroughs = Column(Integer, nullable=False, default=0)
signups = Column(Integer, nullable=False, default=0)

display_name = 'Servlet Timing Breakdown'
display_name = 'Daily Signups'
html_table = [
ReportColumn('Servlet', 'servlet'),
ReportColumn('Count', 'servlet_count'),
ReportColumn('Logged In?', 'logged_in'),
ReportColumn('DB Mean', 'db_mean'),
ReportColumn('Memcache Mean', 'memcache_mean'),
ReportColumn('Template Mean', 'template_mean'),
ReportColumn('Other Mean', 'other_mean'),
ReportColumn('Total Mean', 'total_mean'),
ReportColumn('Referrer', 'referrer'),
ReportColumn('Click-Throughs', 'clickthroughs'),
ReportColumn('Signups', 'signups'),
]

@classmethod
def report_data(cls, date):
return session.query(cls).filter(cls.date == date).order_by(cls.servlet).order_by(cls.logged_in)
return session.query(cls).filter(cls.date == date).order_by(desc(cls.signups))