2014 - 275, 2015 - ?
An overview of the different Linux performance tools and the subsystems they work on.
Scraping w/ Celery in 6 mins
A short demo screencast I made about how to use Python’s Celery library in a basic way
I’ll show you how to make a database-backed dashboard in under 3 minutes. First, install the following:
With the prerequisite packages installed, we can populate a pandas DataFrame with read_sql:
import MySQLdb
from pandas.io.sql import read_sql
import pandas as pd
db_connection = MySQLdb.connect(read_default_file='~/.my.cnf')
query = """\
SELECT
date(created_at) as date,
count(*) as count
FROM events
GROUP BY 1"""
df = read_sql(query, db_connection)
df.head() # taking a peek at the data
Sweet, we have some data, whoa!
df.plot()
Whoops, we aren’t making use of a datetime index, and days without any events are missing from the data instead of showing up as zero.
So let’s solve both of those:
df.date = pd.to_datetime(df.date)
df.set_index('date', inplace=True)
df = df.reindex(pd.date_range(min(df.index), max(df.index)), fill_value=0)
df.plot()
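To see what the reindex step does in isolation, here is a minimal sketch on a tiny hand-made DataFrame. The dates and counts are made up for illustration and stand in for the query result; only `pd.to_datetime`, `set_index`, `pd.date_range`, and `reindex` come from the code above.

```python
import pandas as pd

# Hypothetical stand-in for the query result: Jan 3 had no events,
# so the GROUP BY returns no row for it.
df = pd.DataFrame({
    "date": ["2015-01-01", "2015-01-02", "2015-01-04"],
    "count": [5, 3, 7],
})

# Convert the date strings to real timestamps and use them as the index.
df["date"] = pd.to_datetime(df["date"])
df = df.set_index("date")

# Build the full daily range and fill the days that had no row with 0.
full_range = pd.date_range(df.index.min(), df.index.max())
df = df.reindex(full_range, fill_value=0)

print(df)
```

After the reindex there is one row per calendar day, so the plot drops to zero on quiet days instead of drawing a straight line across the gap.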
Wow. We’re pretty much done. Oh yeah, let’s make this dashboard available—1) via web app, 2) via email.
We’ll use Flask—this is what there is to it:
#!/usr/bin/env python
from flask import Flask, make_response
from cStringIO import StringIO
from matplotlib.backends.backend_agg import FigureCanvasAgg as FigureCanvas
import matplotlib.pyplot as plt
import MySQLdb
from pandas.io.sql import read_sql
import pandas as pd
app = Flask(__name__)
db_connection = MySQLdb.connect(read_default_file='/Users/max/.my.cnf.snailbo')
@app.route('/')
def index():
    return """\
<html>
<body>
<img src="/plot.png">
</body>
</html>"""

@app.route('/plot.png')
def plot():
    query = """\
SELECT
date(created_at) as date,
count(*) as count
FROM events
GROUP BY 1"""
    df = read_sql(query, db_connection)
    df.date = pd.to_datetime(df.date)
    df.set_index('date', inplace=True)
    df = df.reindex(pd.date_range(min(df.index), max(df.index)), fill_value=0)
    df.plot()
    canvas = FigureCanvas(plt.gcf())
    output = StringIO()
    canvas.print_png(output)
    response = make_response(output.getvalue())
    response.mimetype = 'image/png'
    return response

if __name__ == '__main__':
    app.run(debug=True)
If you run this as a script and navigate to http://localhost:5000/ in your browser, you should see this:

Awesome.
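The script above uses cStringIO, which only exists on Python 2. If you are on Python 3, a rough equivalent of the "render the figure to PNG bytes" step might look like the sketch below. The helper name `figure_to_png_bytes` and the sample data are my own placeholders, not part of the original app.

```python
import io

import matplotlib
matplotlib.use("Agg")  # headless backend, no display required
import matplotlib.pyplot as plt


def figure_to_png_bytes(fig):
    """Render a matplotlib figure into an in-memory PNG and return the bytes."""
    buf = io.BytesIO()  # PNG data is binary, so BytesIO rather than StringIO
    fig.savefig(buf, format="png")
    plt.close(fig)  # release the figure so repeated requests don't pile up
    return buf.getvalue()


# Made-up data just to have something to draw.
fig, ax = plt.subplots()
ax.plot([1, 2, 3], [5, 3, 7])
png = figure_to_png_bytes(fig)
```

In the Flask view, the returned bytes would go straight into `make_response(...)` with the `image/png` mimetype, as in the original.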
We could also embed this chart in an email—perhaps using cron to send it to ourselves every X hours?
Here’s how:
#!/usr/bin/env python
# http://stackoverflow.com/a/920928
import smtplib
from email.MIMEMultipart import MIMEMultipart
from email.MIMEText import MIMEText
from email.MIMEImage import MIMEImage
from cStringIO import StringIO
from matplotlib.backends.backend_agg import FigureCanvasAgg as FigureCanvas
import matplotlib.pyplot as plt
import MySQLdb
from pandas.io.sql import read_sql
import pandas as pd
sender = 'bob@example.com'
recipient = 'bob@example.com'
# Create the root message and fill in the from, to, and subject headers
msg_root = MIMEMultipart('related')
msg_root['Subject'] = 'test message'
msg_root['From'] = sender
msg_root['To'] = recipient
msg_root.preamble = 'This is a multi-part message in MIME format.'
# Encapsulate the plain and HTML versions of the message body in an
# 'alternative' part, so message agents can decide which they want to display.
msg_alt = MIMEMultipart('alternative')
msg_root.attach(msg_alt)
msg_text = MIMEText('This is the alternative plain text message.')
msg_alt.attach(msg_text)
# We reference the image in the IMG SRC attribute by the ID we give it below
msg_text = MIMEText("""\
<html>
<body>
<h1>awesome sauce</h1>
<img src="cid:image1">
<br>Nifty!
</body>
</html>""", 'html')
msg_alt.attach(msg_text)
# Create the chart
query = """\
SELECT
date(created_at) as date,
count(*) as count
FROM events
GROUP BY 1"""
db_connection = MySQLdb.connect(read_default_file='/Users/max/.my.cnf.snailbo')
df = read_sql(query, db_connection)
df.date = pd.to_datetime(df.date)
df.set_index('date', inplace=True)
df = df.reindex(pd.date_range(min(df.index), max(df.index)), fill_value=0)
df.plot()
canvas = FigureCanvas(plt.gcf())
output = StringIO()
canvas.print_png(output)
msg_image = MIMEImage(output.getvalue())
# Define the image's ID as referenced above
msg_image.add_header('Content-ID', '<image1>')
msg_root.attach(msg_image)
server = smtplib.SMTP('smtp.example.com')
server.ehlo()
server.starttls()
server.login(sender, 'mypassword')
server.sendmail(sender, recipient, msg_root.as_string())
server.quit()
Lookin’ good:
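One caveat: the `email.MIMEMultipart`-style imports above are Python 2 module paths. On Python 3 the same message structure can be built from `email.mime.*`. This is a minimal sketch of just the message-building part. The function name `build_chart_email` is hypothetical, and the PNG bytes here are a placeholder rather than a real chart.

```python
from email.mime.image import MIMEImage
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_chart_email(sender, recipient, png_bytes, subject="test message"):
    """Build a multipart/related email whose HTML body shows an inline image."""
    root = MIMEMultipart("related")
    root["Subject"] = subject
    root["From"] = sender
    root["To"] = recipient

    # Plain-text and HTML bodies live in an 'alternative' sub-part.
    alt = MIMEMultipart("alternative")
    root.attach(alt)
    alt.attach(MIMEText("This is the alternative plain text message."))
    alt.attach(MIMEText('<html><body><img src="cid:image1"></body></html>', "html"))

    # The Content-ID must match the cid: reference in the HTML above.
    image = MIMEImage(png_bytes, _subtype="png")
    image.add_header("Content-ID", "<image1>")
    root.attach(image)
    return root


# Placeholder bytes (only the PNG signature), standing in for the rendered chart.
msg = build_chart_email("bob@example.com", "bob@example.com", b"\x89PNG\r\n\x1a\n")
```

Sending works the same way as before: pass `msg.as_string()` to `server.sendmail(...)`.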

There are a few things we didn’t do that would’ve been extra-nifty:
Drop me a line if you’ve got suggestions for other posts!
I heard about a business named RebelMail a couple weeks back that offers a product to eCommerce stores: email templates that contain forms for customers to complete abandoned purchases from w/in their email clients(!!!). How the hell do they do it?
Well it turns out that modern email clients (e.g. www.gmail.com, your iPhone’s email program, etc.) are browser-ish HTML-rendering environments—why wouldn’t they render a <form> tag?
Well, these browser-ish environments have a lot of security concerns. Neither you nor your email provider wants your data stolen, and there are a lot of possible ways for bad guys to accomplish that. Most have to do with injecting content into your browser-ish environment from remote resources by embedding it in the body of an HTML email.
But <form> tags? They’re pretty harmless—there’s nothing inherently “dynamic” or unsafe about them besides the submit button, which directs the user away from the current page and to a URL designated by the form. Here’s an example I sent to my Gmail account:

On clicking the “Go!” button, the values of the form fields are populated in the URL as GET query parameters. I haven’t rigorously tested support for forms across email clients or if the experience can be “gracefully degraded” when they’re not supported (please let me know if you do!).
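To make the "form fields become GET query parameters" part concrete, here is a small sketch using only the standard library. The action URL and field names are invented for illustration; they are not the ones from my test email.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def form_submission_url(action, fields):
    """Return the URL a browser would request for a method="get" form."""
    return action + "?" + urlencode(fields)


# Hypothetical form: action URL and fields are made up for this example.
url = form_submission_url(
    "http://example.com/checkout",
    {"email": "bob@example.com", "qty": "2"},
)
print(url)

# On the receiving end, the server can recover the submitted values.
params = parse_qs(urlsplit(url).query)
print(params)
```

Since everything arrives as an ordinary GET request, the receiving web app needs nothing email-specific to handle the submission.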
You can check out the Python code I used to send these form emails on GitHub.
http://nbviewer.ipython.org/github/mmautner/email_classifier/blob/master/gmail_importance.ipynb
"Building your own Priority Inbox"
A talk I gave in March about demoing an email “priority” classification model and putting it to use.
Every week, according to Peter Handsman, the former CTO, Reese would come up with an idea for something new to peddle. They would draft a business plan, launch a website, and measure consumers’ subsequent interest in a product. Efforts to sell coins and watches failed. At one point, Reese tried manufacturing family portraiture using inexpensive subcontractor artists in places such as Russia. The concept wasn’t easy to expand. “A lot of people have ideas,” says Handsman. “Byron has the discipline to actually measure them. He was willing to come up with a ridiculous number of ideas, but he was also willing to abandon them if they were proven not to work.”
Amazing weekend in Montreal—hanging out with cool folks from a bunch of different walks of life, got to see/meet famous Python folks and am excited by what I see being accomplished: “combinatorial innovation”
Really cool stuff :)