Welcome to Event Notes’s documentation!

Contents:

PyCodeConf

The Future Is Bright

Author

Talk by Jesse Noller

  • PSF Member
  • PyCon Chair

Intro

  • Aimed at all communities.
  • “Kids are okay”.

What is Python?

Who uses Python

  • Everyone.
  • Scales from a script that calculates a budget up to a service with 100k users.
  • Easy to teach. Fits in head. Scales up well, and scales down.

Where is the language

  • ~123 accepted PEPs.
  • ~80 builtin functions
  • ~285 documented modules.

What’s the Future?

  • 2.7 is the last release of 2.x
  • Py3k
  • Proposals for coroutines, async I/O, cofunctions, daemon, async I/O for subprocess, and the OS and exception hierarchy.

What should Python be?

  • Should be the Borg, by borrowing the good ideas from other languages.
  • Ease of use, simplicity.
  • Adopt, but make it Pythonic.
  • Must continue to look outside the language

Things Jesse Wants

  • Communication (messaging)
  • Lightweight processes
  • Actors.
  • Gevent, libevent

What we need

  • Cleaner, more Pythonic APIs.
  • Don’t leak.
  • Balance advanced users versus keeping it simple.
  • Modernize standard library.
  • Python is a conservative language. Doesn’t blaze the path.
  • Core language must be able to fit in one’s head.

Interpreters

  • Pypy
    • It’s fast. Blazingly fast.
    • Magic in the RPython.
  • CPython
    • It’s the cockroach.
    • Simple, uncomplicated C code.
    • Battle tested.

Predictions

  • Pypy will become the dominant interpreter. CPython won’t die, and it will be reliable.
  • PyPy, CPython, Jython, and IronPython are BFFs.

Python3

  • Keep Calm, and Carry On.
  • Python is over 21 years old.
  • 5 years is nothing for migrating a community so big.

Community

  • Cannot afford to be idle.
  • Cannot be hostile, but welcoming.
  • Get involved locally.
  • Be open to criticism, especially constructive.
  • Look through the pile of vitriol to find the core of truth.

Embracing The GIL

Author

  • David Beazley (@dabeaz)

Intro

  • Railed on GIL at PyCon, thought it deserved some Love
  • Godwin’s Law of Python

Interest

  • Fun hard systems problem
  • Likes to break GILs as a hobby

Threads are Useful

  • People love to hate on threads...
  • Because they are being used.
  • They solve tricky problems.

In A Nutshell

  • Python code -> VM instructions
  • Can’t execute VM instructions concurrently, therefore locking
  • Keep things safe
    • Ref counts
    • Mutable types
    • Internal bookkeeping
    • Thread safety
  • All low level.

An Experiment in Messaging

  • Comes up in a lot of contexts

  • Involves IO

  • Foundation for working around the GIL

  • Shows an experiment in messaging using 5 implementations
    • C + ZeroMQ
    • Python + ZeroMQ (C extensions)
    • Python + multiprocessing
    • Python + blocking sockets
    • Python + nonblocking sockets
  • Tested on xlarge EC2 instance.

Scenario 1

  • Unloaded server
  • Expect ~10 seconds (10 seconds of sleep in there)
  • All roughly the same (~13 seconds for each)
  • Shows a real time example

Scenario 2

  • Add a thread that calculates Fib(200) (a reference to “Node.js is Cancer”)
  • C version is barely affected.
  • Python blocking goes to 142 seconds.
  • Real time example takes a long time.

Commentary

  • This aggression will not stand.

Thoughts

  • Try Pypy
    • Test on Pypy (6k seconds)
    • fixed in trunk
  • Try 2.7
    • Within 2x

GUI

  • Uses Idle with threads (esp CPU bound)
  • Kills performance to the point of completely unusable
  • Can barely type into Idle

Thread Switching

  • GIL acquisition is based on a timeout
  • Threads that want the GIL must wait 5ms
  • Causes a problem on release
  • 5ms delays build up

What’s Really Happening

  • Before send and recv, acquire GIL
  • After release

How to Fix

  • Thread priorities
  • Was in the original “New GIL” patch
  • Should be revisited

Experiment

  • Has an experimental python 3.2 with priorities
  • Really minimal
  • Threads can set their priority
  • Performance that is comparable to version without threads.
  • Makes GUI completely usable.
  • Tried with 1.4k threads.

More Thoughts

  • Huge boost in performance with few modifications
  • Not the only way to improve the GIL
  • Example: Should the GIL release on nonblocking IO?
  • Currently releases on every IO
  • If you are doing nonblocking IO, you aren’t blocking.

Wrapping Up

Questions

  • Did you do an academic paper on this?
    • No, but I think there is room for it.
    • The interesting question is if the OS thread library gives enough help to languages with a GIL.
    • Could it cooperate to tell the thread that it will be context switched.
    • At Pycon, OS kernel hackers came to talk about this.
    • Should say “Fixing the GIL is impossible”.
  • Is priority code production runnable?
    • No
    • Threads cannot quit.

What Makes Python AWESOME?

Author

  • Raymond Hettinger (@raymondh)

Context for Success

  • OS License

  • Commercial distributions
    • Sponsor advancements
  • Zen
    • Guides the language and community
  • Community
    • Killer feature
  • Repositories (Pypi)
    • Solved problems are a pip install away.

High level qualities

  • Ease of learning
    • Can build a Python programmer in a week and a half
  • Rapid Dev Cycle
    • Used in a high frequency trading company
    • More important to react to market
  • Economy of Expression

  • Readability and Beauty
    • Makes it easy to work in, and less tiring
  • One way to do it
    • Once you learn an aspect, you can apply it somewhere else

A bit of Awesomeness

  • Five minutes to write code to find duplicate files.

  • Can throw away.

  • How long to write in C?
    • Infinite
    • You won’t write it
    • Python programmers write things C programmers won’t.
  • Just the same as any other scripting language?
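
The duplicate-file finder mentioned above could look roughly like this. It is a sketch, not the code shown in the talk: the function name `find_duplicates` is made up here, and reading each file fully into memory is a simplification.

```python
import hashlib
import os
from collections import defaultdict


def find_duplicates(root):
    """Group files under `root` by the SHA-256 digest of their contents."""
    by_digest = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            with open(path, "rb") as f:
                digest = hashlib.sha256(f.read()).hexdigest()
            by_digest[digest].append(path)
    # Keep only digests shared by more than one file.
    return [sorted(paths) for paths in by_digest.values() if len(paths) > 1]
```

Calling `find_duplicates(".")` returns a list of groups, each group being the paths that share identical contents.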

Why is Python Awesome?

Indentation

  • How we write pseudocode
  • Contributes to readability
  • Shows an example of indentation in C lying

Iterator protocol

  • Lots of stuff is iterable
  • Holds the language together
  • sets, lists, dicts, files
  • shows sorted(set('abracadabra'))
  • sorted(set(open(filename)))
  • Like legos: fit together perfectly
  • Shows an analogy between that and Unix pipes.
  • Not enough, GOF pattern

List Comprehension

  • More flexible than functional style

Generators

  • Easiest way to write an iterator
  • Adds one keyword (yield)
  • Makes tricky iterators easy

Generator Expressions

  • Produce values just-in-time
  • sum(x**3 for x in xrange(1000000))
  • In Pypy, roughly C speed
  • setcomps and dictcomps

Generators that accept Input

  • generators support send(), throw(), and close()
  • Unique to Python
  • Can make Twisted’s inline deferreds using this
  • A state machine with callbacks.
  • Write code that looks procedural, but uses callbacks
  • Monocle (https://github.com/saucelabs/monocle), Twisted inline deferred
  • Fantastic improvement of callback code.
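
The `send()` mechanism that these libraries build on can be shown with a small example. This is a generic sketch (the `running_average` name is made up), not Twisted or Monocle code:

```python
def running_average():
    """Generator that receives numbers via send() and yields the mean so far."""
    total = 0.0
    count = 0
    average = None
    while True:
        value = yield average   # pauses here until send() delivers a value
        total += value
        count += 1
        average = total / count


avg = running_average()
next(avg)             # prime the generator: run up to the first yield
print(avg.send(10))   # 10.0
print(avg.send(20))   # 15.0
avg.close()           # raises GeneratorExit inside the generator
```

An event loop can drive such a generator by sending each callback result into it, which is why the code reads as procedural.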

Decorators

  • Expressive
  • Always worked for function
  • Initial response: Syntactic sugar
  • Community rose up and demanded it from Guido.
  • Easy on the eyes
  • Shows example using itty (https://github.com/toastdriven/itty) using decorators for routing.
  • Ping into another machine using curl to lookup environment variables in 3 lines.
  • Web service in 20 lines, made possible by decorators
  • Thanks Django!
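
Decorator-based routing can be sketched in a few lines. This is not itty's API: the `routes` registry, `get`, and `dispatch` are hypothetical names used only to show the idea.

```python
routes = {}  # hypothetical registry mapping URL paths to handler functions


def get(path):
    """Decorator factory: register the decorated function under `path`."""
    def register(func):
        routes[path] = func
        return func  # the function itself is returned unchanged
    return register


@get("/hello")
def hello():
    return "Hello, world!"


def dispatch(path):
    """Look up the handler for `path` and call it."""
    handler = routes.get(path)
    return handler() if handler else "404 Not Found"
```

The decorator keeps the URL next to the code that serves it, which is the "easy on the eyes" quality the talk points at.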

With Statement

  • Clean, elegant
  • Profoundly important
  • Sandwich analogy
  • Subroutines factor out the ‘meat’ of the code
  • With statements factor out the ‘bread’ of the code
  • Factors out common setup and teardown methods.
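
One way to picture the sandwich analogy is a small context manager. In this sketch `timed` and `timings` are illustrative names, not something shown in the talk:

```python
import time
from contextlib import contextmanager

timings = []  # collected (label, seconds) pairs


@contextmanager
def timed(label):
    start = time.perf_counter()      # top slice of bread: setup
    try:
        yield                        # the 'meat' of the with block runs here
    finally:
        # bottom slice of bread: teardown, runs even if the block raises
        timings.append((label, time.perf_counter() - start))


with timed("sum"):
    total = sum(range(1000))
```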

Abstract Base Classes

  • Uniform definition of what it means to be a sequence, mapping, etc.

  • Ability to override isinstance() and issubclass()

  • Duck-typing says: “If it says it’s a duck...”

  • Mixin capability (DictMixin)

  • Can provide the base of a class
    • shows using a list-based set with __iter__, __contains__, and something else
    • Mixin provides the rest
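
A list-based set along these lines can be written against `collections.abc.Set`. The notes do not record the third method from the slide; in the standard library's ABC the required abstract methods are `__contains__`, `__iter__`, and `__len__`, which is what this sketch assumes.

```python
from collections.abc import Set


class ListBasedSet(Set):
    """A set that stores its elements in a list (slow, but keeps order)."""

    def __init__(self, iterable=()):
        self.elements = []
        for value in iterable:
            if value not in self.elements:
                self.elements.append(value)

    def __iter__(self):
        return iter(self.elements)

    def __contains__(self, value):
        return value in self.elements

    def __len__(self):
        return len(self.elements)


a = ListBasedSet("abracadabra")
b = ListBasedSet("alacazam")
common = a & b   # __and__ is supplied by the Set mixin
```

Operators such as `&`, `|`, `<=`, and `==` come from the ABC and do not have to be written by hand.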

Backbone.js and Django for a Faster WebUI

Author

Leafy Chat

  • Web frontend for IRC
  • Done in Django Dash (2008?)

Grove

Chat Systems

  • Built a lot of them
  • Leafy chat - only used jQuery, lots of javascript
  • Using Backbone in Grove

Examples

  • Show an example of using jQuery to build UI.
  • Embedded HTML in javascript.

Backbone and Grove

  • The UI looks the same
  • Backbone gives MVC style, in a single file.
  • You can roll it yourself, making it easy to get started.
  • Not actually MVC, actually Models, Templates, and Views

Models

Collections

Views

  • Highlight the Backbone Views on the Grove app page.

  • Demonstrates Backbone Event binding
    • Creates the view from the model data
    • Bind updating view when the model changes

Templates

Additional Goodies

Sync

Events

  • Can update multiple views for a single model.
  • App.trigger('messageAdded', ...)

Router

  • Will trigger Events based upon the hash

Questions

  • Do you feel bad that your Django app is now Javascript?
    • No, this is how apps are going.

PyPy is your Past, Present, and Future

Author

Intro

  • There are 2 things faster than C
    • Neutrinos
    • Pypy

Story

  • Armin wanted to write a JIT for Python (Psyco)
  • Psyco was written by Armin.
  • Kind of messy.
  • Generators came along, and not supported
  • 64-bit computers weren’t supported either
  • Started writing Python in Python
  • About 2000x slower than CPython
  • Some things in the standard library were in Python
  • Copied some optimizations over (TimSort)
  • Writing JITs sucked.
  • Writing a JIT generator for arbitrary languages is much simpler than writing a JIT for Python
  • ~2-3 years ago Alex got into Pypy
  • Beat C in str_cmp ~1 month ago
  • http://speed.pypy.org
  • Tries to show example of real time video analysis, mplayer broke.

Numpy

  • Science likes big datasets, use Numpy
  • Numpy is in C
  • Numpy likes speed, so does pypy
  • Started reimplementing Numpy in Pypy

Hotspot Detection

  • Humans are bad at detecting slowdowns

  • Pypy has a JITViewer
  • Shows demo of JITViewer

  • Look into code
    • “I think that’s too many instructions”
    • Optimize code!
  • Shows example of sum(x**3 for x in xrange(10000))

  • JVM Community has good tooling

  • Python could use that too.

Current

  • Usually benchmark against C
  • Experimenting with using C extensions.

Where we’re going

  • Many projects are being migrated
    • Django
  • Porting to Python3

Architecture

  • Because they use a JIT Generator, can improve constantly
  • Speedups in Python3 will improve Python2

GIL

What People Are Doing with Pypy

  • Researchers getting results over lunch, instead of over night.
  • Financial company for market analysis
  • Engineers at CERN

What Pypy Needs from the Community

  • Encourages use of pypy if you are CPU bound
  • Requests for slow code, and they’ll use it in benchmarks
  • Want to make Python the right tool for the job in more places
  • Work on the ecosystem and tools

Processing Firefox Crash Reports With Python

Author

Socorro

Architecture

  • Collector
    • web.py app behind apache
    • Puts on disk
  • Store in HBase (crashmover)

  • Write to Postgres by monitor

  • Webapp and API

  • All Python

Lifetime of a Crash

  • Raw dump submitted by POST, JSON + minidump
  • Stored
  • Processed

Processing

  • Processing spins off minidump_stackwalk (MDSW)
  • Tries to regenerate the stack
  • Processor generates a signature
  • Tries to avoid things like malloc
  • Writes to Postgres, which acts like a large, relational cache.

Backend Processing

  • Cron

  • Calculate aggregates
    • Top crashers by signature
    • URL
    • Domain (hates Farmville)
  • Process incoming builds

  • Match known crashes to mozilla bugs

  • Dupe detection

  • Match up crash pairs, e.g. plugin containers and browsers

  • Generate CSV extracts for engineers for analysis

Middleware

  • Move data access to REST API
    • Allow engineers to build apps against the data
  • Enables rewriting the app in Django in 2012

Webapp

  • How to visualize?
    • Many builds: release channel, nightly, hourly
  • Reporting in build time
    • Rebuilding in Django in 2012, because it’s crufty
    • Maybe Flask
  • Almost all new Mozilla apps are Django

  • Don’t need models, though

Implementation

  • Use Python 2.6

  • Postgres 9.1, some stored procedures

  • memcached

  • Thrift for HBase access
    • HBase written in Java
    • Thought about rewriting HBase parts on JVM
    • Decided not to, Clojure not common, Jython for various reasons

Scaling

  • Different
    • Usually scale to millions of users.
    • Crash Center has terabytes of data, ~100 users.
  • 2300 crashes per minute
    • Going down
  • 2.5 million per day

  • Median size 150k

  • Max 20MB

  • Reject anything bigger; it is probably just a memory dump and not useful

  • ~110TB in HDFS (3x replication)

What Can We Do?

  • Compare beta null signature crashes.

  • Analyze Flash versions crashes

  • Detect duplicate crashes

  • Detect explosive crashes

  • Find “frankeninstalls”
    • Some Windows updaters don’t work properly
    • Keep duplicate but out-of-date DLLs

Implementation Scale

  • >115 Physical Boxes
    • About to rollout Elastic Search
  • 8 Devs, sysadmins, qa, hadoop ops, analysts
    • Hiring

Managing Complexity

  • Fork
    • Hard to install
    • Use version-controlled VMs
    • Found to help with complex dev environments
  • Pull requests with bugfix features

  • Jenkins polls master on github
    • Runs tests
    • Build package
    • Push out to dev environment
    • builds release branch
    • manual push staging
    • missed rest of this

Continuous Deployment

  • Critical
  • Build the machinery for continuous deployment, even if you don’t deploy continuously
  • Can deploy at 10 a.m.
  • Everyone relaxed
  • Deployment is not a big deal

Config Management

  • Automate configs
    • Managed through Puppet

Virtualization

  • Don’t want to build HBase
  • Use Vagrant (http://vagrantup.com/)
  • Jenkins builds Vagrant VMs
  • Puppet configures VMs.
  • Tricky to get data
  • This + Github increased community activity

Upcoming

  • ElasticSearch
    • Lucene, distributed flexible search engine
    • Don’t know how to tune
  • Analytics
    • Detect explosive crashes
    • Detect malware
  • Better queueing

Open Source

  • Almost everything is open

The Future of Collaboration in the Python community and beyond

Author

Mark Pilgrim is Gone

  • He did feedparser, httplib2
    • aside: httplib2 was actually by Joe Gregorio
  • Dive into Python

  • Dive into HTML5

  • Other things

  • We lost a lot with him leaving, sad to see him go.

What Happened to his Projects?

  • What is the copyright?
    • A: CC-SA
  • What about his code?
    • httplib2 is a big dependency of many projects
    • That’s how people found out he was gone
    • Pypi didn’t host it
    • Google Code didn’t host it anymore.

PyPI issues

  • Too easy to delete a package
    • Dependency checks for that package
    • Request a project handoff
    • Other projects need to be notified
    • RSS feeds
  • Human moderation
    • Some can be automated
    • Burdens PyPI team

Repeating History

  • Django-lint

  • Django-Piston
    • social factors caused no release in years
  • python.org

  • opencomparison.org
    • Host djangopackages.com
    • How does this get maintained?

Dark Future

  • Critical Packages Breakdown
  • Python packages vanish
  • Build scripts fail
  • Replace from caches/backups

Repercussions

  • Lose domain knowledge
  • Python can’t move forward.
  • Social Issues
  • 3rd Party Community is just as critical as Python core

Not the Future

  • It’s today

  • Legacy code with legacy packages

  • Build scripts fail

  • Example of NASA issue
    • caused project to go to ColdFusion
  • We have lost works of antiquity
    • Blame is moot
  • Stuff we make today is legacy in 5 years

Trust Issues

  • This causes a lack of trust in Python

  • Without trust, we can’t collaborate as well
    • The disease that will trigger the zombie apocalypse

Solutions

  • Money!
    • Sponsorships
    • Problems getting money
    • Applications
    • Focus on sprints
    • Code quality issue from sprints
    • Ongoing maintenance

Future is still dark

  • Community Managers

  • Ticket triage, etc.

  • Needs core/senior developers

  • They are already busy
    • Examples pay people to do this
    • Volunteers may have life get in the way
  • Determining authority

PSF Paid Community Manager

  • Proposed solution
  • Paid via the PSF

Repercussions

  • Fixes some problems
  • Mitigate social issues
  • Can still lose domain knowledge

Precedents

  • Ubuntu
  • Fedora
  • Twilio
  • Github

Wants

  • More reasons to trust
  • More reasons to contribute
  • Keep projects operating

Call to Action

  • This is a proposal

  • Wants to see PSF project incubation

  • PSF provides seed funding for OS projects
    • Should return on investment
    • Preferably to Python community
    • Needs a viable business model
    • PSF is an investor
  • Choose from participants in Django Dash & coding contests

Return

  • Gives OS code
  • Gives money back to the PSF

What this isn’t

  • Covering < $100 for hosting
  • Things without a self-supporting business model

Examples Projects

  • djangolint.com
    • Little setup required

    • Uses github

    • Wants for all Python

    • Wants syndication

    • How does it make money?
      • Pay to analyze privately?
    • Easy linting increases trust

  • readthedocs.org
    • Placed in the 2010 Django Dash

    • Documentation increases trust

    • Business model?
      • Pay for private doc hosting would be good.
      • Clients don’t want to host docs.
  • depot.io
    • Freeze your python dependencies

    • Doesn’t replace PyPI

    • Provides additional security

    • Possible Advantages
      • Archive legacy packages
      • Leave PyPI as the canonical source
      • Adds dependability, trust
  • PyPI
    • Pay for a PyPI Appliance?
    • Github makes “giant” profits on Enterprise Appliance
  • djangopackages.com
    • Just launched pyramid version

    • Plone?

    • Python?

    • http://bit.ly/django-reg

    • Compare and contrast packages

    • Helped determine a package to use

    • Gives metrics

    • Metrics give trust

    • As opencomparison, support more things
      • Languages
      • Syndication
      • OAuth
      • What’s the business model?

Results

  • Don’t have packages vanish.
  • Let Python move forward
  • Have new social issues.

Project Incubations

  • Already exists, just not with PSF
  • How much code comes out of these?
  • Energy of startup giving back?

The State of Packaging & Dependency Management

Author

  • Craig Kerstiens (http://twitter.com/craigkerstiens)
  • Works at Heroku

Packaging

  • Need to release it

Where To Release

Your Server

  • Full flexibility
  • People rely on you being up
  • Breaks deploys
  • Don’t do this, unless you want to provide better uptime than PyPI

Github

  • Awesome for dev
  • Not for release
  • Not meant for packages, but for source code

PyPI

  • Please release it here
  • Complain about it being down
  • 5 mirrors that are well updated

Managing Dependencies

  • Use pip
    • Supports uninstalling
    • Lots of small improvements
    • Supports version control
    • Don’t use this in production
  • Use virtualenv
    • Great for sandboxing
    • Destroy and recreate it often
    • Pin your dependencies

Pinning

  • Only deploy specific versions
  • pip freeze > requirements.txt
  • It’s explicit (see the Zen)
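
A frozen requirements file can be checked for entries that are not pinned. The helper below is a hypothetical sketch: `unpinned` is not a pip feature, the package versions are illustrative, and option lines such as `-e` or URLs are not handled.

```python
def unpinned(requirements_text):
    """Return requirement lines that are not pinned with '=='."""
    loose = []
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments and whitespace
        if line and "==" not in line:
            loose.append(line)
    return loose


example = """\
Django==1.3.1
requests>=0.6     # not pinned
simplejson
"""
print(unpinned(example))  # ['requests>=0.6', 'simplejson']
```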

Version Control

  • Having a github/bitbucket source is good for dev...
  • Not for prod.
  • Put tarballs on internal servers.

PyPI is Down

  • pip install --use-mirrors, problem solved

What’s Missing

  • Not as good as Bundler from Ruby community
  • Pip upgrade needs to be better

Recap

  • Use PyPI
  • Explicit versions
  • Use mirrors
  • Need to use the tools more effectively

Questions

  • A frozen requirement may have unfrozen dependencies
  • May need to tweak requirements.txt

Python For Humans

Author

  • Kenneth Reitz (http://twitter.com/Kennethreitz)
  • Works for Readability
  • Works on the Github Reflog
  • Used to be part of the Changelog
  • Authored Requests, Tablib, Legit, OSX-GCC-Installer, Clint, Envoy, Httpbin
  • Makes software for humans

Philosophy

  • What people like about Python
    • Simplicity
    • Speed to develop
    • Pypy
  • import this

  • The Zen of Python, our manifesto

  • Beautiful is better than ugly
    • Syntax
  • Explicit is better than implicit
    • Compared to Ruby
  • If the implementation is hard to explain, it’s a bad idea
    • Unless you’re pypy
  • This talk will focus on “There should be one obvious way to do it.”

Messing Around

  • Using Github API

  • Shows Ruby code, not beautiful but straightforward

  • When trying it in Python we get confused about what library to use
    • Python 3 helps this naming issue
  • Shows code using urllib2
    • Too many actions to just use basic auth
    • And there’s more!
    • Github API uses 404 instead of 401, need to write our own BasicAuthHandler
    • Need to force it to send basic auth, took 3 hours
  • This would prevent people from using Python.
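
The "force it to send basic auth" step can be sketched with the Python 3 standard library, where `urllib.request` replaces `urllib2`. The stock basic-auth handler only sends credentials after a 401 challenge, so the header is attached up front here. The URL and credentials are placeholders, and `basic_auth_request` is a made-up helper name.

```python
import base64
import urllib.request


def basic_auth_request(url, username, password):
    """Build a request that sends Basic credentials on the first attempt."""
    token = base64.b64encode(("%s:%s" % (username, password)).encode("utf-8"))
    request = urllib.request.Request(url)
    request.add_header("Authorization", "Basic " + token.decode("ascii"))
    return request


# Placeholder URL and credentials; nothing is sent until urlopen() is called.
req = basic_auth_request("https://api.example.com/user", "user", "secret")
print(req.get_header("Authorization"))
```

With Requests, the library the talk goes on to present, the same intent is expressed by passing an `auth=(username, password)` tuple to `requests.get`.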

Problems

  • Unclear on what module to use
  • “HTTP should be as simple as the print statement”

Solution

  • We need pragmatic, elegant tools.

HTTP

  • Has methods
  • Very simple
  • urllib2 is very complex, and therefore toxic

Requests

  • For humans
  • Simple solution for a simple problem

Litmus Test

  • You should not have to refer to the docs every time you want to do something simple
  • API is the most important thing
  • Handle the 95% case elegantly

Building

  • Requests was very simple at first, but it resonated with people
  • Grew to handle more stuff
  • 17th most watched project on Github

Subprocesses

  • Powerful, effective, second worst API
  • Docs lacking
  • Follows C API
  • Mostly docs that are lacking

Proposed Solution

  • Envoy
  • Mostly the same API as Requests
  • Pipe, read stdout, etc.
  • Get it done quickly and effectively

File and System Ops

  • Surveyed dev ops
  • Shutil, sys, etc. are confusing
  • Limits adoption by dev ops guys

Install Python

  • Surveying room on installation methods on OSX
  • Many chosen
  • “What happened to one obvious way to do it?”

XML

  • etree is terrible
  • lxml is awesome
  • We need to adopt a better standard

Packaging and Dependencies

  • pip or easy_install

  • setuptools?

  • Distribute
    • How is it better than setuptools?
  • We need simple instructions on how to install, and release packages

Dates

  • Some good 3rd parties
  • Stdlib not good enough

Unicode

  • It’s a simple problem
    • Room erupts in “No it’s not!”
  • Should be easy

Testing

  • Unittests
  • Didn’t get the downside

Installing Dependencies

  • Asked room about difficulties
  • Almost everyone had difficulties

Hitchhiker’s Guide to Python

  • http://python-guide.org

  • Teach the best practices

  • “There should be one-- and preferably only one --obvious way to do it”

  • Brief overview
    • Idioms
    • Freezing Code
    • Installing code
  • Up for debate, collaboration

  • Aimed to be a reference guide, and to lower the barrier of entry

Manifesto

  • Simplify APIs
  • Document Best Practices

Python is Only Slow If You Use it Wrong

Author

Bup

  • Written in Python
  • Backup software
  • Uses Git as a data store
  • 80 megs/second

sshuttle

  • VPN that handles wireless speeds
  • Also in Python

How to Use Python Wrong

  • Tight Inner Loops
  • In compiled languages, you have these often
  • Really bad in Python
  • Line of code in Python is 80-100x slower than C
  • Keep it in a higher level

Ways to Make it Fast

  • Use Regex and C modules
    • Word based instead of char based ~5x faster
    • Will run it in C
    • Most of bup is Python, small bit in C to speed it up
  • 100% Pure is not pragmatic

  • CPython has a really good C API
    • Java doesn’t, it’s super painful
  • Python + C is winning so far
    • C is for tight inner loops
    • Python for the higher level
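
The "regex instead of a character loop" advice can be illustrated with two word counters that give the same answer on ASCII text. This is a sketch of the idea, not code from bup, and no particular speedup is claimed for it:

```python
import re


def count_words_by_char(text):
    """Pure-Python inner loop: one interpreter iteration per character."""
    count = 0
    in_word = False
    for ch in text:
        if ch.isalnum() or ch == "_":
            if not in_word:
                count += 1
                in_word = True
        else:
            in_word = False
    return count


WORD = re.compile(r"\w+")


def count_words_by_regex(text):
    """The per-character loop runs inside the C regex engine instead."""
    return len(WORD.findall(text))
```

Both functions can be timed with the `timeit` module on your own data to see how much the tight loop costs.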

Threads

  • Computation threads are useless, because of GIL
    • Sometimes worse than single threaded
  • Okay for I/O
    • GIL will release for I/O
  • fork() works great for both
    • Recommend to use it all the time
    • No GIL
    • Trick is getting info from process to process
    • Bup uses this
    • No weird locking interactions
  • C modules can use threads
    • Can release GIL when you get objects
    • Run threads
    • Get GIL when computations are done
    • Can get high performance
  • CPU-bound threads in Python are doing it wrong

  • Question from audience: Scipy has Weave, which will allow you to inline C code.
    • Dynamic compilation

  • There are workarounds for the GIL
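
The fork() approach, including the "getting info from process to process" part, can be sketched with a pipe and pickle. This assumes a Unix system; `run_in_child` and `fib` are illustrative names and this is not how bup itself is written.

```python
import os
import pickle


def run_in_child(func, *args):
    """Fork, run func(*args) in the child, and return its result to the parent."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()                          # Unix only
    if pid == 0:                             # child process
        status = 1
        try:
            os.close(read_fd)
            with os.fdopen(write_fd, "wb") as w:
                pickle.dump(func(*args), w)
            status = 0
        finally:
            os._exit(status)                 # never fall back into parent code
    os.close(write_fd)                       # parent: read until the child closes
    with os.fdopen(read_fd, "rb") as r:
        result = pickle.load(r)
    os.waitpid(pid, 0)                       # reap the child
    return result


def fib(n):
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


print(run_in_child(fib, 30))  # 832040
```

The child has its own interpreter and its own GIL, so several such children can compute in parallel. The standard `multiprocessing` module wraps the same idea in a higher-level API.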

Garbage Collection

  • Python is both refcounting and gc

  • Refcounting
    • Whenever you use a variable, increase reference count
    • Whenever you stop, decrease the reference count
    • Terrible, terrible thing with threads
    • Need to lock on refcounts
    • GIL solves this problem
  • Shows graphs of programs memory and time
    • Allocates 10k of space a lot
    • Refcounting semantics allow Python lower mem usage than Java
  • Testing Java
    • 3 different tests
    • Shows one where it allocates as much memory as possible
  • Sometimes Python is Garbage Collected
    • Mutual referencing objects that have ref count of one
    • Backup GC finds this, and collects them
    • Shows example on how to do this
    • Pretty complicated in order to get across the GC
    • Then it relies on sucking up tons of memory, and getting it later

Advice: Stay away from GC

  • Break circular references

  • Most common, trees with reference to parents
    • Full tree needs to be GC’ed
  • Better: use the weakref module

Deterministic Destructors

  • Win32 example of two writers to a file

  • Win32 doesn’t allow two writers

  • CPython allows it because it closes the writer because of refcounting

  • This causes deterministic behavior, unlike ‘real’ gc
    • In Python you don’t need to manage many resources
    • Files, database handles, etc.
  • Some people are trying to take this away
    • Pypy?
    • with statement isn’t a desirable alternative

HelloMark

  • Fork and exec “Hello World” 20x

  • Demonstrates startup times

  • Jython takes 15 seconds, slower than C+valgrind

  • Shows what you want to write command line tools in

  • CPython + pyc files are awesome for this
    • Django and Tornado can reload really quickly
  • Pypy loses in this regard

Summary

Amazing Things In Open Source

Author

Overview

  • Community recognizes work you do (meritocracy)

Meritocracy

  • People will use your work if it has merit

  • Anyone can build or be a leader
    • If they put in the work
  • Permission isn’t (usually) needed
    • We allow experiments

Open Comparison

  • Writing Comparison Grids for sub communities
  • Compare packages for Django, Pyramid, etc.

Call to Action

  • Build it!
  • Be Nice
  • Others probably won’t build it, so you should

Early Decisions

  • Django Packages

  • Made during Django Dash

  • Decided to only manually add packages

  • Good decision?
    • Doesn’t matter
    • 900 packages right now
  • Action is better than having something get debated

  • Probably better in the hands of the core devs

  • Gut instinct is often right
    • Can always change it later

Ecosystem Patterns

  • Mostly from Django experience
    • Django has many 3rd party packages
    • Compared to Legos
  • Django Core vs. Apps
    • Many batteries included
    • This approach is good and bad
    • Can get stuck with a heavy core
    • Promotes “one obvious way”
  • Django has well defined patterns for apps
    • App structure
    • App settings
    • Overridable templates
  • Reuse encourages innovations as 3rd party packages

  • Core is conservative

  • Best 3rd party apps get added to core

  • Grow fastest when there is a pattern for extensions
    • jQuery
    • CPAN
  • Pyramid
    • Smaller core
    • Core functionality as add-ons
    • Endorsed add-ons
    • Potential for rapid growth
    • Can deprecate, and allow add-ons to evolve
    • Don’t need to wait on core
  • Pyramid’s Ecosystem developed over time
    • Came from Pylons, Repoze, Turbogears

How to Grow an Ecosystem

  • Write “Best Practices” on how to write 3rd party packages
    • There is a big gap in this
  • Well-defined specs
    • Allow others to write upon a base
  • Sample code

  • Active community

  • Mailing list/ IRC

  • Docs

  • 3rd-Party packages catalog

Too Many Options?

  • “There should be one-- and preferably only one --obvious way to do it.”

  • There can be many web frameworks

  • But there is often too much clutter

  • Document the differences

  • Deprecate bad packages
    • Hard to do in some cases
    • Recommend replacements

Fragmentation

  • Not all web

  • Science, games, etc.

  • Can’t have too many interest groups
    • Diversity of ideas

3rd Party Packages

  • Best: Do one thing well

  • Usability
    • Good docs
    • Easy to install
  • Reliability
    • Tests
    • Help
  • Antipatterns are viral

  • Snippets is the biggest anti-pattern
    • Copy and paste code
  • Don’t over-engineer though

  • Don’t make the “kitchen-sink” package
    • Utility functions
    • Unrelated problems
    • More visible in HTML/CSS world
  • Do Be Pythonic
    • Elegance
    • Ease of use
    • Explicitness
    • Simplicity is why we use Python

Mentorship

  • Provide positive encouragement
  • Put yourself out there

Diversity of Ideas

  • Differ from country to country
  • Other types of diversity
  • PyLadies vs. SoCal Python Interest Group

The Prejudgement of Programming Languages

Author

Intro

  • 10 Years of Failures and Bad Ideas

  • Pre-2001: Ignorant of Software

  • ~2001: C is the best thing, Java sucks

  • ~2003: Learned Lisp

  • Designed a “more modern” C

  • Had curly braces, static types, but basically Python

  • ~2006 Built BitBacker in ~98% Python

  • Arc: C -> Lisp -> Python

  • ~2009: Ruby and Python 50/50

  • Tweet about frustration of integrating libraries in Ruby + Javascript

  • Frustrated by Python’s lack of blocks

  • Shows a conversation between _why and Ryan

  • “Ruby isn’t serious”

  • Frustrated with programming

  • q2 2010: Writing Tests

  • Show TDD using Ruby
    • Crazy Vim action

Testing

  • Claim: RSpec is confusing
  • Never had this problem
  • Python based on SUnit from 1994
  • Thought Django views are not as advanced as Rails
  • Ruby is the serious one?
  • “A Python programmer rejects a new idea without considering its value. A Ruby programmer accepts a new idea without considering its value.”

Choose Ruby or Python

  • Ruby community more willing to pay

  • Move to that full time

  • Shows examples of ugliness in Ruby
    • @foo ||= bar
    • realization, it’s how you do memoization
  • Maybe Ruby is well designed?

  • Generators, Comprehensions, Decorators, and Context Managers are easy to implement with blocks

  • Which language is complicated?

Empirically

  • Realized back to ignorance
  • Judged languages before he should
  • Ruby’s community is serious about testing
  • Rare opportunity to work with both

Cherry-picking for Huge Success

Author

Preface

  • Framework/Language fights are boring. Just use the best tool for the job.

Twitter

  • 2006: Rails, XML API
  • Now: JS Frontend, Erlang/Java

Does Ruby Suck?

  • No, and neither does Python
  • Both are great for prototyping
  • Application changes over time
  • Will rewrite

Solution

  • Build small applications

  • Combine into a larger one

  • Builds foundation to experiment
    • Move dbs, etc.

  • Crossing language boundaries
    • Rewrite
    • Use a different library
    • Implement a service

Agnostic Code

  • Example of depending on Django too much

  • Instead of importing from Django, pass it in
    • Class instance, parameter
    • Make it specific, but not more
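
The idea of passing a dependency in, rather than importing it from the framework, can be sketched as follows. This is only an illustration: `render_link`, `fake_url_for`, and the endpoint name are hypothetical and do not come from the talk.

```python
from typing import Callable


def render_link(label: str, endpoint: str, url_for: Callable[[str], str]) -> str:
    """Build an HTML link without importing any web framework.

    `url_for` is whatever URL resolver the caller already has; a Django
    project could pass a thin wrapper around its own URL reversing.
    """
    return '<a href="{0}">{1}</a>'.format(url_for(endpoint), label)


def fake_url_for(endpoint: str) -> str:
    # Stand-in resolver (hypothetical) for tests or framework-free scripts.
    return "/" + endpoint.replace(".", "/")


link = render_link("Home", "site.index", fake_url_for)
```

The parameter is specific (a callable that maps an endpoint to a URL) but no more specific than the function needs, so the same code can be driven by any framework or by a test.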

Example

  • Drop down to WSGI
  • Usually too specific, if you only need just the url

Protocol Example

  • Compared to Python iterables
  • Flask views return wsgi apps
  • Can dispatch to a Django application, for example

Difflib

  • Compares any iterable that is hashable and comparable
  • Overly specific would be strings, though that’s the main use case
  • Real world use to diff HTML docs
  • Plugin Genshi to difflib to accomplish this
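
A small sketch of diffing something other than strings with the standard library. The event tuples below are a made-up stand-in for a markup stream (they are not Genshi's actual event format); the only requirement is that each item is hashable.

```python
import difflib

# Hypothetical markup events: (kind, data) tuples, all hashable.
old = [("START", "p"), ("TEXT", "Hello"), ("TEXT", "world"), ("END", "p")]
new = [("START", "p"), ("TEXT", "Hello"), ("TEXT", "brave"),
       ("TEXT", "new"), ("TEXT", "world"), ("END", "p")]

matcher = difflib.SequenceMatcher(a=old, b=new)

# Keep only the opcodes that describe a difference.
changes = [
    (tag, old[i1:i2], new[j1:j2])
    for tag, i1, i2, j1, j2 in matcher.get_opcodes()
    if tag != "equal"
]
```

Here `changes` holds a single insert opcode covering the two added text events, which is the kind of structure-aware result a plain string diff would not give.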

Interface Examples

  • Serializers
    • Missed examples

Mergepoint

  • To build apps we need merge points for smaller apps

WSGI

  • Used with most Python web frameworks

  • Often not enough

  • Provides a framework independent environment

  • Middleware can be useful mergepoints, though overused

  • Cannot consume form data in WSGI, inject uniform html, etc.

  • Libraries that help with this
    • Werkzeug
    • WebOb
    • Paste
  • Can write short helpers to dispatch from e.g. Django to WSGI
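
As a rough illustration of WSGI as a merge point, the sketch below mounts a small application under a URL prefix. The names `hello_app`, `not_found`, `dispatch_by_prefix`, and `call` are hypothetical; libraries such as Werkzeug ship more complete dispatchers.

```python
def hello_app(environ, start_response):
    """A complete WSGI application: a callable taking environ and start_response."""
    body = b"hello from the small app"
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]


def not_found(environ, start_response):
    body = b"not found"
    start_response("404 Not Found", [("Content-Type", "text/plain"),
                                     ("Content-Length", str(len(body)))])
    return [body]


def dispatch_by_prefix(mounts, default):
    """Return a WSGI app that forwards to the app mounted at the longest matching prefix."""
    def app(environ, start_response):
        path = environ.get("PATH_INFO", "")
        for prefix in sorted(mounts, key=len, reverse=True):
            if path == prefix or path.startswith(prefix + "/"):
                # Shift the prefix from PATH_INFO to SCRIPT_NAME for the mounted app.
                environ = dict(environ)
                environ["SCRIPT_NAME"] = environ.get("SCRIPT_NAME", "") + prefix
                environ["PATH_INFO"] = path[len(prefix):]
                return mounts[prefix](environ, start_response)
        return default(environ, start_response)
    return app


def call(app, path):
    """Tiny helper: invoke a WSGI app and return (status, body)."""
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    body = b"".join(app({"PATH_INFO": path, "SCRIPT_NAME": ""}, start_response))
    return captured["status"], body


site = dispatch_by_prefix({"/hello": hello_app}, default=not_found)
```

Any WSGI callable can be mounted this way, regardless of which framework produced it.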

HTTP

  • Language independent
  • Cacheable
  • Harder to work with, complex
  • Can do proxying, nginx
  • Caching layers for scalability
  • Problem: Need to keep them running
  • Language independent library
  • cURL

ZeroMQ

  • More modern TCP Socket

  • Language independent

  • Different topologies
    • push/pull
    • pub/sub
  • Easier than HTTP

  • No caching

  • Does not die gracefully

  • No broker infrastructure

Message Queues

  • Similar to ZeroMQ

  • In reality, a different problem

  • Can run tasks outside request/response

  • Different codes, languages to run code

  • Accessor Library: Celery

  • Don’t assume code to be nonblocking

  • Greatly simplifies testing

  • Redis queues are a good start
    • ~20 lines of code to build your own
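
A minimal sketch of such a queue, assuming a client object that exposes Redis-style `lpush` and `brpop` methods (the Redis commands of the same names). `ListQueue` and `FakeListClient` are hypothetical names; the fake client is an in-memory stand-in so the example runs without a server.

```python
import json


class ListQueue:
    """A tiny work queue over any client exposing Redis-style lpush/brpop."""

    def __init__(self, client, name):
        self.client = client
        self.name = name

    def put(self, task):
        # Producers push JSON onto the head of the list.
        self.client.lpush(self.name, json.dumps(task))

    def get(self, timeout=1):
        # Workers pop from the tail, so tasks come out in FIFO order.
        item = self.client.brpop(self.name, timeout=timeout)
        if item is None:
            return None
        _key, payload = item
        return json.loads(payload)


class FakeListClient:
    """In-memory stand-in so the sketch runs without a Redis server."""

    def __init__(self):
        self.lists = {}

    def lpush(self, name, value):
        self.lists.setdefault(name, []).insert(0, value)

    def brpop(self, name, timeout=0):
        values = self.lists.get(name)
        if not values:
            return None  # a real client would block for up to `timeout` seconds
        return name, values.pop()


queue = ListQueue(FakeListClient(), "tasks")
queue.put({"job": "resize", "id": 1})
queue.put({"job": "resize", "id": 2})
first = queue.get()
second = queue.get()
empty = queue.get()
```

Because the client is passed in, the worker logic can be tested with the fake and later pointed at a real server.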

Data Store

  • Using the same db for different apps
  • Works well as long as everyone plays nice

Redis

  • Remote datastructures
  • Shows bash example of a queue worker

Javascript

  • It’s awesome
  • Geeks hate it
  • ugly, can be abused
  • Use Coffeescript
  • Decouples frontend by using different services
  • Examples: xbox.com, Battlefield 3 game lobby
  • Can efficiently transform the DOM
  • Backbone.js
  • Testing sucks for others

Processes

  • Daemons can be annoying to run
  • Processes can have different privileges
  • Tune individual processes
  • Upgrade parts to python3
  • ZeroMQ/HTTP to operate together

Breakdancer

Author

Testing

  • Few constructs not mentioned in the past day
  • Someone submitted a bug
  • “I have tests”
  • Straightforward bug that wasn’t tested
  • All the individual items work, but sequences can fail.
  • Testing all sequences is a large number of combinations

Breakdancer Overview

  • Conditions, Actions, Effects
  • Driver to run things
  • Shows how add command can be decomposed into conditions
  • All Conditions, Actions, and Effects are composable
  • Driver holds the boilerplate
  • Python makes boilerplate minimal
  • itertools makes combinations simple.
  • Generate test case combinations automatically
  • Do preconditions, postconditions.
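
The conditions/actions/effects idea can be sketched in a few lines. This is not Breakdancer's actual code: the toy `Store`, the three actions, and the driver `run_sequence` are all assumptions made for illustration.

```python
import itertools

# Each action: (name, precondition(state) -> bool, effect(state) -> new state).
# The model state is simply "does the key exist?".
ACTIONS = [
    ("set", lambda exists: True, lambda exists: True),
    ("add", lambda exists: not exists, lambda exists: True),
    ("delete", lambda exists: exists, lambda exists: False),
]


class Store:
    """The system under test: a toy key-value store."""

    def __init__(self):
        self.data = {}

    def set(self, key):
        self.data[key] = 1
        return True

    def add(self, key):
        if key in self.data:
            return False
        self.data[key] = 1
        return True

    def delete(self, key):
        return self.data.pop(key, None) is not None


def run_sequence(sequence, key="k"):
    """Driver: run one action sequence and compare the store against the model."""
    store, exists = Store(), False
    for name, precondition, effect in sequence:
        expected_ok = precondition(exists)
        actual_ok = getattr(store, name)(key)
        if actual_ok != expected_ok:
            return False
        if expected_ok:
            exists = effect(exists)
        if (key in store.data) != exists:
            return False
    return True


# Every ordered sequence of three actions, generated automatically.
sequences = list(itertools.product(ACTIONS, repeat=3))
failures = [tuple(name for name, _, _ in seq)
            for seq in sequences if not run_sequence(seq)]
```

Adding a fourth action or lengthening the sequences only changes the data and the `repeat` argument; the driver stays the same.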

The Many Hats of Building and Launching a Web Startup

Author

Overview

  • Quit job as designer
  • Failed to find a co-founder
  • Learned Python

Start Out

  • Have good runway
    • 1 year+
  • Health and relationships

  • Quit your job

What is Success?

  • Don’t want to build Google
  • Just build something that makes you some money
  • Take a step back
  • Love your job
  • Concentrate on small successes

Background

  • Knew HTML
  • Hated CS courses
  • Got a job at a startup
  • Got bored
  • Started freelancing

Entrepreneur

  • No cofounder is better than a bad cofounder
  • Applied to YC
  • Things didn’t go well
  • Used Learn Python the Hard Way (http://learnpythonthehardway.org/)
  • Used Django
  • Six weeks later, launched

Launch as Fast as Possible

  • You need customers
  • It helps morale
  • Allows you to iterate
  • “Good enough”
  • You can add features later
  • Work on the hard parts first
  • For her, programming part was hard
  • It was okay to launch with bad code.
  • Violates DRY.
  • Got picked up by Swiss Miss with MVP

Monetization

  • Have a plan.
  • Don’t think about it later or rely on funding

Don’t Be Alone

  • Surround yourself in a community
  • Find people who are smarter than you to help you out
  • No NDAs
  • Inhibits advice
  • People stealing your ideas is a good thing
  • Use Twitter/HN to talk
  • Attend Hacker Events, SuperHappyDevHouse, PyLadies

Take Shortcuts

  • Django ecosystem is awesome
  • Doesn’t know databases at all, South makes it easy
  • Dotcloud makes servers easy
  • Themeforest for design
  • Design for Non Designers
  • You can always iterate later
  • Launchrock.com

Future of Python and NumPy for array-oriented computing

Author

Why Python?

  • Fits your brain
  • Doesn’t get in your way
  • Software engineering is more about neuroscience than code.
  • Fibonacci is just an Unstable Infinite Impulse Response linear filter
  • Shows numpy example, which is fast, but wraps hardware integer
  • Wants to make Python faster than C, as in a GPU or FPGA
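
To make the filter remark concrete: a recursive (IIR) filter with feedback coefficients 1 and 1, fed a single impulse, produces the Fibonacci numbers, and its output grows without bound, which is why it is unstable. The sketch below is plain Python rather than the NumPy example from the talk; `iir_filter` and `wrap_int64` are hypothetical helpers.

```python
def iir_filter(feedback, signal):
    """Compute y[n] = x[n] + sum(feedback[k] * y[n-1-k]) over the input signal."""
    output = []
    for n, x in enumerate(signal):
        y = x
        for k, coeff in enumerate(feedback):
            if n - 1 - k >= 0:
                y += coeff * output[n - 1 - k]
        output.append(y)
    return output


# A unit impulse: 1 followed by zeros.
impulse = [1] + [0] * 9
fib = iir_filter([1, 1], impulse)


def wrap_int64(value):
    """Emulate how a signed 64-bit hardware integer wraps around on overflow."""
    value &= (1 << 64) - 1
    return value - (1 << 64) if value >= (1 << 63) else value
```

Python integers never overflow, so `wrap_int64` is only there to show what a fixed-width array element would store once the sequence outgrows 64 bits.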

Conway’s Game of Life

  • Interesting exercise

  • Shows an example of it

  • Array oriented

  • APL
    • Grandfather of most array oriented languages
    • J, K, Matlab are descendants
    • Numpy is a descendant
    • Unicode glyphs
  • Game of Life is one line in APL

  • Array-oriented programming deals with arrays as a block

  • Shows numpy example

Numpy/Scipy History

  • Numeric around ~1994

  • More features for array oriented computing
    • a[0,1], a[::2]
    • Ellipsis object
    • Complex numbers
  • Syntax matters

  • Aside: We need more numpy/scipy and core collaboration

  • Derivative Calculations in 1997

  • Came from MATLAB, but it wasn’t memory efficient enough

  • Iterative update loop made Python nice

  • 1999 Scipy emerges

  • Python was better language than MATLAB, but lacked scientific libraries

  • Community Effort
    • Mostly from academics
  • Numpy emerged from Numeric in 2005

Numpy

  • Data types
    • Collections of objects
    • Arrays
  • Statistics functions

  • Arbitrary Arrays
    • Column oriented calculations

Scipy

  • Stats
  • Data fitting
  • Interpolation
  • Brownian Motion

Pypy

  • Let’s not chase C, let’s chase Fortran 90.
  • Example where Fortran 90 is 7 times faster than Numpy and Pypy

Question

  • Coolest thing seen with NumPy?
    • Implant surgery planning tool
    • CT Scans, 3d vis

Lightning Talks

Vagrant

  • Vagrant loves Python
  • Building and distributing VMs
  • Gives isolation, repeatability, and verification
  • Move dev to virtual machines
  • Move production ops scripts to setup environment
  • Vagrant command line, to manage life cycle
  • Designers can use it too
  • http://vagrantup.com

Testing CSS

Pyparsing

  • Time trial using Pypy
  • Search for integers in a string of random alphas and numbers
  • Pypy ~10x faster
  • Verilog parser (~16k lines)
  • Cpython (500 lines/sec)
  • Pypy (1131 lines/sec)

Pandas

  • @wesmckinn
  • Agile Tooling for Small Data
  • First need to solve the small data problem before big data
  • DBs, Flat files, time series, mean you may want it
  • Indexed data structures for relational data
  • Fast manipulation tool
  • Data alignment
  • Join merge
  • group by
  • Reshaping/pivot
  • In memory and fast
  • Meant for quant finance application backbone
  • ~26k loc
  • In production since 2008
  • Data Analysis is dominated by things like SAS
  • Lots of people want to expand in these areas
  • Operations to naturally select portions of data
  • Can plot data
  • Would love collaborators

DSLs

  • Peter Wang (@pwang)
  • Crazy crazy ideas
  • Would like Python to ignore some syntax where we can do whatever the hell we want
  • It might be awesome
  • Calling it extern
  • Just syntactic sugar
  • Hacking import hooks to make it work
  • .pydsl file
  • uses pyparsing under the hood to transform the dsl
  • Aimed at scientists
  • People want it: weave, numexpr
  • Everyone needs it
  • Lets Python assimilate into existing systems

stackful

  • @erikrose
  • This is a hack
  • Wish things weren’t global
  • Dynamic variables like in Perl
  • Perl has local variables which leak onto things it calls
  • stackful implemented as with statement
  • Thread safe
  • Implementation is funny
  • No hook in Python for reference
  • Just override every single magic method in Python
  • Should be able to be used
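
A rough sketch of dynamic variables behind a with statement. This is not stackful's API or implementation (the notes say it proxies every magic method); it is a simpler variant with an explicit `lookup` call, and all names here are hypothetical.

```python
import threading
from contextlib import contextmanager

_state = threading.local()


def _frames():
    # One stack of binding dicts per thread keeps the bindings thread safe.
    if not hasattr(_state, "frames"):
        _state.frames = [{}]
    return _state.frames


@contextmanager
def dynamic(**bindings):
    """Bind names for the duration of the with block, including in callees."""
    frames = _frames()
    frames.append(bindings)
    try:
        yield
    finally:
        frames.pop()


def lookup(name):
    """Return the innermost binding of `name`, searching outward."""
    for frame in reversed(_frames()):
        if name in frame:
            return frame[name]
    raise NameError(name)


def report():
    # Reads whatever its caller bound; nothing is passed explicitly.
    return "verbosity=%s" % lookup("verbosity")


with dynamic(verbosity=1):
    outer = report()
    with dynamic(verbosity=3):
        inner = report()
    restored = report()
```

The binding is visible to everything called inside the block and disappears when the block exits, which is the behavior Perl's `local` provides.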

PyCon 2012

Keynote with Stormy Peters

Author

  • Stormy Peters

Web

  • We should make people aware of how their info is being used

Growing a Community

  • As companies get involved we wonder about the direction of the community
  • Reach out to new people, because it can be intimidating
  • When you meet someone, you have 3 seconds to make an impression
    • Based on your hair
    • And then shoes
  • When you respond to a bug report, or mailing list post, this is their first impression
    • Make it a good one
  • Python groups are great for this outreach
  • Study says learning something new is worth a 20% raise
    • old job needs 20% more money vs. new job with new tech
  • Some like to be famous (cue chuckles)
  • Some get involved because they are paid to
  • Some for ideals of free and available
  • Stay because of the community
  • Community is better when you can measure the impact of members

Open Web

  • Believes in an open web
    • Shows phone that boots to Gecko
    • Someone in Mongolia wrote about how excited they were for access to books
    • Could send html books instead of text messages
  • People made huge sacrifices to make ease of use with open and free software
    • Stay up all night to get a modem working
  • Free != open
    • Just because it’s free, doesn’t mean it has the ideals of open software
  • We haven’t defined what it means to have an open web service
  • I want you to host my data, but what kind of access do I need to make it open?
  • You may create a web service that puts you into a position you don’t want to be in
    • Give users tools along the way so that they don’t feel disempowered
  • We need to help change the world so we get fewer phone calls
  • Things to help this (Mozilla examples)
    • Do Not Track movement
    • Browserid (now Persona)
  • Backup is important, as well as delete
  • “Are you sober enough to publish this picture?”

Paul Graham Keynote

Author

  • Paul Graham
  • YCombinator

Silicon Valley

  • The center of SV moves around the people who make the next generation of stuff
    • So, this room is right now
  • The frightening-ness of big startup ideas
  • List of 7 gigantic startup ideas
  • Scary, maybe I should do that recipe site instead

Next Google

  • Start next Google
  • Microsoft lost their way when they got into the search business
  • Google has been getting into the social network business
  • Nostalgic for the right answer from google
    • Seems based on Scientologist: “What’s true is what’s true for you”
  • Find tiny idea that turns into a big idea
    • Dinosaur egg
  • Search engine for top 10k hackers
  • Make the search engine the one you want
  • Don’t worry about something that constrains you in the long term

Replace Email

  • Any big idea has a bunch of people nibbling around it
  • Not designed to be used the way it is now
    • Bell labs “Want to go to lunch?”
  • Now a shitty todo list
  • Tweaking the inbox is not enough
  • Todo list protocol instead of messaging protocol
  • Sending emails to yourself
  • Want to know what they want you to do
  • When does it need to be done?
  • Whenever powerful people are in pain, that is the way to make lots of money
  • Gmail has gotten painfully slow
  • People will pay for faster email

Replace Universities

  • claps
  • Last couple of decades, universities seem to have gone down the wrong path
  • Expensive country clubs

Kill Hollywood

  • Hollywood was slow to embrace the internet
  • Internet beat cable
  • Bolted an iMac to the wall, found it better than a TV
  • TV seemed like it was designed by the same people who designed the thermostat
  • How do you deliver drama via the internet?
  • You kind of want to know what you’re going to get with a show

A New Apple

  • If Apple won’t make the next iPad, who will?
    • Empirically, it’s none of the incumbents
  • It will be a startup
    • Not crazy, Apple did it
  • Steve Jobs showed us what one person can do
  • “Steve Jobs unrolled the future like a carpet”
  • The next CEO might not live up to Steve Jobs, but doesn’t need to
    • Just needs to be better than HP, Samsung, Motorola

Bring Back Moore’s Law

  • Circuits are going to get twice as dense, not twice as fast
  • Hardware would just solve software’s problems
  • Need to rewrite it to be parallel
  • It would be really great by making a lot of CPUs look like one
  • The most ambitious is to do it automatically via a compiler
    • “Sufficiently smart compiler”
  • If not impossible, expected value is really high
  • Less ambitious is to start from the bottom
    • Build programs out of more parallelizable lego blocks
    • Programmer still does a lot of the work
  • Middle ground is a semi automatic weapon
    • Looks like a sufficiently smart compiler, but there are humans in there
  • Make a market place, let people do it
    • Maybe make bots that will do it

Ongoing Diagnosis

  • Imagine the ways we will seem backwards to people in the future
  • Seem barbaric to wait for symptoms to be diagnosed
  • Bill Clinton had to wait for arteries to be 90% blocked to find out
  • Launch fast and iterate may not work for medical.
    • Work on pigs first
    • Sausage company on the side
  • The medical profession will be an obstacle to this
  • Doctors are alarmed to look for problems that aren’t there
  • If you start testing people all the time, you may get a lot of terrifying false alarms
  • Think this is an artifact of current limitations
  • Going against medical tradition

Tactical Advice

  • For big problems, don’t make a frontal attack
  • “Are we there yet?”, Haters
  • Notice that you replaced email when it’s done
  • Start with small things, let them get big
    • Facebook
  • Maybe big ambitions are a bad thing
    • The bigger they are, more likely to be wrong
    • Don’t identify, just think there is something out there
    • When the opportunity comes to move, move there
  • Blurry vision may be better

Graph Processing in Python

Author

  • Van Lindberg

Graphs

  • Universal datatype
  • Probably not the best fit if you don’t have a relationship

Python-Dev

  • “Who talks to whom?”
  • Nodes are people
  • Edges are “responded to on Python-dev”
  • Centrality
    • Intuitively, the more central people tend to connect others
    • Dict to map person to how central they are
    • There’s a fairly tight-knit community, with smaller groups around the edge
    • Antoine Pitrou was the most likely to respond
  • Topics
  • Nodes are people and topics
  • Edges are “commented on”
  • Filter out too-common topics
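
A small sketch of the "who talks to whom" graph and a dict mapping each person to a centrality score. The reply pairs are invented, and degree centrality is used here only because it is the simplest measure; the talk does not say which measure or library was used.

```python
from collections import defaultdict

# Hypothetical (responder, original_poster) pairs, not real mailing-list data.
replies = [
    ("alice", "bob"), ("alice", "carol"), ("alice", "dave"),
    ("bob", "alice"), ("carol", "alice"), ("erin", "alice"),
]

neighbors = defaultdict(set)
for responder, poster in replies:
    # Treat the graph as undirected: an edge means the two people interacted.
    neighbors[responder].add(poster)
    neighbors[poster].add(responder)

# Degree centrality: fraction of the other people each person is connected to.
others = len(neighbors) - 1
centrality = {person: len(adjacent) / others
              for person, adjacent in neighbors.items()}
most_central = max(centrality, key=centrality.get)
```

For the people-and-topics graph, the same structure works with topic names added as nodes and "commented on" pairs as edges.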

Fast Test, Slow Test

Author

  • Gary Bernhardt

Suites

  • Prevent Regression
    • Weakest, doesn’t change how you build
  • Prevent Fear
    • Being able to change things minute to minute, and have test verify
    • Where speed comes in
  • Prevent Bad Design
    • Holy Grail of Testing

Stop Writing Classes

Author

  • Jack Diederich

When Should I refactor

  • When there are two methods, and one is __init__
  • When you write functions around classes
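
The first rule can be illustrated with a before/after pair. The `Greeting` class and `greet` function are illustrative names, not a transcript of the slides.

```python
import functools


# Before: a class with two methods, one of which is __init__.
class Greeting:
    def __init__(self, greeting="hello"):
        self.greeting = greeting

    def greet(self, name):
        return "%s! %s" % (self.greeting, name)


# After: a plain function does the same job.
def greet(name, greeting="hello"):
    return "%s! %s" % (greeting, name)


# If a preconfigured callable is wanted, functools.partial supplies it.
greet_hola = functools.partial(greet, greeting="hola")

before = Greeting("hola").greet("bob")
after = greet_hola("bob")
```

Both spellings produce the same string; the function version has less to read and nothing to instantiate.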

Evolution of an API

  • MuffinHash replaces a dict
  • Was two lines, and obfuscated the code
  • 1 package, 20 modules

Version II

  • Easy to read
  • Two methods, __init__ and call

Version III

  • stdlib parts, 6 lines
  • 1 function

Namespaces

  • Prevent collisions
  • Not taxonomies
  • Otherwise extra things to type, remember
  • Anytime you make a class, ask “What am I using it for?”
  • Reuse stdlib exceptions
  • Don’t complicate the names of your exceptions

stdlib

  • 200k sloc
  • avg 10 files per package
  • 165 exceptions

Classes

  • great for containers
  • heapq doesn’t use a class
  • Probably should be a class, since functions look like methods
    • first param is data

Game of Life

  • Cell and Board classes
  • Board has two methods
  • Refactor to dictionary and function
  • Well, cell can be refactored to the key of the dict
  • Two functions and a dict
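
One way the end state might look, as a sketch rather than the code shown in the talk. It assumes the live cells are kept in a set and the neighbor counts in a dict (a `Counter`); the exact data layout on the slides is not recorded in these notes.

```python
import itertools
from collections import Counter


def neighbors(cell):
    """Yield the eight cells surrounding (x, y)."""
    x, y = cell
    for dx, dy in itertools.product((-1, 0, 1), repeat=2):
        if (dx, dy) != (0, 0):
            yield x + dx, y + dy


def advance(board):
    """Return the next generation of a set of live (x, y) cells."""
    # Counter is a dict subclass mapping each cell to its live-neighbor count.
    counts = Counter(n for cell in board for n in neighbors(cell))
    return {cell for cell, count in counts.items()
            if count == 3 or (count == 2 and cell in board)}


blinker = {(0, 1), (1, 1), (2, 1)}
next_gen = advance(blinker)
```

The cell is just the key, so neither a `Cell` nor a `Board` class is needed.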

Code Generation in Python: Dismantling Jinja

Author

  • Armin Ronacher

Why?

  • Isn’t it evil?
  • A security problem?
  • Bad for performance?
  • Not if you do it right.

Security

  • Code Injection

  • Pollute namespace
    • Change local variables
    • Can evaluate code in different namespace

Performance

  • Alternative: Write an interpreter
  • Too slow
  • Not suitable

Eval 101

  • Compile function to make code objects
  • eval can work on a namespace
  • Using ast module, can alter underlying structure
  • Can use ast to add in line numbers to nodes
  • Don’t pass strings to eval/exec, but use code objects
  • Explicit compilation and namespaces, to fix problems
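
The points above can be sketched with the standard library alone. The expression, the `<template>` filename, and the line number 12 are made up for illustration; this is not Jinja's code.

```python
import ast

source = "price * quantity"

# Parse once into an AST instead of handing a string straight to eval.
tree = ast.parse(source, filename="<template>", mode="eval")

# Pretend the expression sat on line 12 of a template, so that tracebacks
# would point at the template line rather than line 1.
ast.increment_lineno(tree, 11)

# Explicit compilation: the result is a reusable code object.
code = compile(tree, filename="<template>", mode="eval")

# Explicit namespace: the code reads its variables from this dict only
# (eval adds __builtins__ to it; this is not a sandbox).
namespace = {"price": 3, "quantity": 4}
total = eval(code, namespace)

# The same code object can be evaluated again against other data.
other_total = eval(code, {"price": 10, "quantity": 2})
```

Compiling once and evaluating many times is also where the performance benefit comes from, since parsing is not repeated per render.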

Jinja

  • Jinja and Django have C inspired scoping rules

  • Pipeline
    • Lexer
    • Parser
    • Identifier analyzer
    • Code generator
    • Python source
    • Bytecode
    • Runtime
  • Only runtime is necessary

Scoping

  • Context objects are dict-alike
  • Slow
  • Resolve in context ahead of time

Code Generation

  • Low level
  • Target byte-code
  • High level
  • AST generation
  • Bytecode doesn’t work on appengine, and is implementation specific
  • Would be nice to map jinja to bytecode
  • Ast is limited, easier to debug, and doesn’t segfault

Tale of Two Pieces of Code

  • scope in a function is faster than global scope
  • lookup via index instead of name
  • local dictionary isn’t generally used
  • semantics can be mapped to fast execution environment
  • Jinja context is data source
  • Django context is data store
  • You cannot modify context in Jinja
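
The difference between function-local and global lookup can be observed in CPython's bytecode. This sketch is CPython specific, and opcode names vary between versions, so it only collects the names rather than printing a listing.

```python
import dis


def uses_local(value):
    # `value` is a function local, stored in an indexed slot.
    return value + 1


GLOBAL_VALUE = 1


def uses_global():
    # `GLOBAL_VALUE` is looked up by name in the module dictionary.
    return GLOBAL_VALUE + 1


local_ops = [ins.opname for ins in dis.get_instructions(uses_local)]
global_ops = [ins.opname for ins in dis.get_instructions(uses_global)]
```

On CPython the first list contains a `LOAD_FAST`-style opcode (an index into the frame's locals) and the second contains `LOAD_GLOBAL` (a dictionary lookup by name), which is why generated template code benefits from living inside a function.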

jsonjinja

Q&A

  • If you had the chance to redo would you use ast?
    • Yes, there are utility libraries that help this
  • ctypes for line numbers?
    • put special line number variables, monkey patch traceback
    • works in everything tested, including pypy
    • Some problems on some architectures.

Putting Python in PostgreSQL

Author

  • Frank Wiles

Why

  • Usually you want pl/pgsql
  • Sometimes you want a scripting language, with libraries, etc.

Installing

  • Aptitude: postgresql-plpython
  • homebrew

Setting up the database

  • createlang plpythonu <databasename>
  • Check with SELECT * FROM pg_language
  • Python is untrusted
  • Can set this up in templates

Writing your first function

  • CREATE OR REPLACE FUNCTION

Debugging

  • plpy.notice, debug, error, and fatal
  • Will access the log file directly
  • Can use logging

Problems

  • Pain to maintain and debug
  • Can confuse the dba
  • Not free, cached

When

  • Rolling up/aggregating data
    • Remove network, sql parsing to keep runtime low
  • Enforce new constraints that aren’t in SQL
  • Protect data integrity

Triggers

  • CREATE TRIGGER...
  • Throw a Python exception
  • The TD variable has a lot of stuff in it
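
One way to keep trigger logic testable is to write it as a plain function over a TD-like dictionary and call it from the trigger body. The sketch below assumes a BEFORE row-level trigger and a table with hypothetical `unread_count` and `label` columns; the `event` and `new` keys and the `"OK"`/`"MODIFY"` return values follow PL/Python's trigger conventions, while `check_unread_row` itself is an invented name.

```python
def check_unread_row(td):
    """Trigger logic written as a plain function over a TD-like dict.

    Inside PL/Python the trigger body would pass the real TD dictionary;
    here the dict is built by hand so the logic can run outside the database.
    """
    if td["event"] not in ("INSERT", "UPDATE"):
        return "OK"
    row = td["new"]
    if row["unread_count"] < 0:
        # Raising aborts the statement, which is how the trigger rejects a row.
        raise ValueError("unread_count must not be negative")
    if row.get("label") is None:
        row["label"] = "inbox"
        return "MODIFY"  # signals that the row in td["new"] was changed
    return "OK"


accepted = {"event": "INSERT", "new": {"unread_count": 2, "label": None}}
result = check_unread_row(accepted)

try:
    check_unread_row({"event": "UPDATE",
                      "new": {"unread_count": -1, "label": "x"}})
    rejected = False
except ValueError:
    rejected = True
```

Keeping the logic in an ordinary function means most of it can be exercised with hand-built dictionaries before it is ever loaded into the database.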

Redis

  • Can use system libraries
  • Update Redis unread count automatically

What can you do?

  • Executing other sql, create materialized views
  • plpy namespace has execute

Ideas

  • Lots of them
  • Celery tasks, caches, backups, apis, zeromq
  • Emails, inserts into another system, send an sms

Q&A

  • Limit the runtime of the procedure?
    • Don’t think so
  • Test Python Code?
    • Fake it outside
  • Automatically cache?
    • Have to say it’s immutable
  • How easy is it to specify a python binary?
    • Can specify per Postgres cluster
  • Run postgres queries inside query, infinite loop?
    • Will time out eventually?
    • Not yet, will be in 9.2
  • Interpreter external or internal?
    • Didn’t hear
  • Timeout kill trigger?
    • Could have connection timeout in code
  • PGSQL v. Python was a magnitude difference
    • Not surprisingly
  • Pypy or Jython?
    • Probably not
    • Not yet
  • Table functions?
    • Haven’t done much with that, mostly just materialized views

pandas: Powerful data analysis tools for Python

Author

  • Wes McKinney
  • Recovering mathematician
  • 3 years quant
  • Building Lambda Foundry
  • writing “Python for Data Analysis”
    • coming out later this year

Pandas

  • pandas.pydata.org
  • rich relational data on numpy
  • high performance tools
  • consistent api

Data Wrangling

  • Simplify the tools on processing the data
  • Don’t transfer from R to Python

Testing

  • >98% coverage
  • Battle tested

Demos

  • IPython transformed development
  • Good outside of science

Table

  • DataFrame is the core structure
  • Axis indexing allows rich data alignment
  • Alignment free programming
    • Often does munging for you

Lightning Talks

Numba

  • Travis Oliphant

  • Python compiler

  • For numpy and C extensions
    • Pypy not good enough
  • Dynamic compilation

  • Scipy needs a python compiler
    • Allows higher level SciPy
  • Numba

  • Replaces byte-code with type inference

  • Uses LLVM

  • Does codegen

  • Uses C function pointers

  • LLVM works with everything

  • Uses a decorator to compile

  • High bandwidth communication to llvm

  • Python for high level, LLVM for low level

  • DSLs based upon these

  • https://github.com/numba/numba

I has a money

  • Chad Whittaker
  • Mint stores passwords in cleartext.
  • ihasamoney.com
  • Personal finance for geeks
  • j/k to navigate, no mouse

Brain Hacking

  • Talks are bad (but not here)
  • Code for brain
  • No spec for the brain
  • Tell a story
  • Implausible story better than plausible story
  • Make them care
    • Babies are better than code
  • Show puzzles not solutions
    • If you show the solution, they won’t care
  • Have to practice in order to get good

Python 3 on PyPI

  • Brett Cannon
  • “Pie-pee-eye”
  • 54-58% of the top projects support py3k
  • Some are under dev, like Django
  • The goal was 5 years
  • 3 years was the stretch
  • Update your metadata, e.g. “Programming Language :: Python :: 3.2”
  • Public shame
  • pyporting guide
  • added u’’ prefix to make it easier

Python on iBooks

  • Luke Gotzling
  • Can run interpreter in an ebook
  • Embed an interpreter in javascript in an html widget
  • 4.8 MB overhead
  • Runs on vanilla ipads

Keynote: David Beazley

Author

  • David Beazley

Let’s Talk About (something diabolical)

  • Let’s talk about Pypy
  • Python implemented in Python
  • Quite a bit faster because of magic
  • Mandelbrot runs 34x faster
  • Which one can you adjust with a pocketknife?

Thinking about Tinkering

  • CPython has patches, extensions, ideas
  • Talking about GIL, etc, wouldn’t be possible without tinkering
  • IPython notebook is an example of this
  • Is it just “evil geniuses”?
  • Can you tinker with PyPy?
  • Can I teach myself to tinker with it using just resources available, part-time?
  • Building PyPy is challenging
  • Takes hours, > 4 GB of memory, might break C compiler
  • RPython is a restricted subset of the language, but can run as valid Python
  • RPython is defined by the translation toolchain
  • If you love Python, you will hate RPython
  • Uses type inference
  • Lists need to be of a single type
  • Pypy uses the bytecode interpreter and an abstract runtime to compile to C code

Why PyPy By Example

Authors

  • Maciej Fijalkowski
  • Alex Gaynor
  • Armin Rigo

What is PyPy

  • Can’t convince that they are not crazy
  • Python in Python
  • No longer speed of interpreter, speed of running program
  • Measuring memory is important

Edge Detection

  • Use dynamic objects with __get__ overridden to act like a list
  • Do edge detection on a web cam in real time
  • Implemented in Python
  • In cPython, ~7 seconds per frame

Tracebin

  • Successor to JITViewer
  • Expose performance information without understanding how PyPy works

Numpy

  • Believe easier to add numpy to JIT than a JIT to numpy
  • Some good initial results, but not complete

Garbage Collection

  • Don’t have to call free
  • History of talk for Pascal
  • Everywhere now

Transactional Memory

  • How do we use multiple cores?
    • Semaphores, events, etc.
  • Multicore usage
  • Two times the execution time
    • Where we were with GC years ago
  • Hard work

Sprints

  • Come sprint on PyPy
  • We’ll help with getting projects working on PyPy

Flexing SQLAlchemy’s Relational Power

Author

  • Brandon Rhodes

Denormalization

  • Quick to render, hard to update
    • e.g. IMDB updating an actor where it’s stored with movies

Normalization

  • Only store data once
  • Easier update
  • Need to pull data from multiple places

SQL

  • Need to model relationships through intermediary table

  • No composite data types
    • If you see fields like actor_1, actor_2, etc. something is wrong
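
A minimal sketch of a many-to-many relationship modeled through an intermediary table, using the standard library's sqlite3 module rather than SQLAlchemy. The table names and rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE movie (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    CREATE TABLE actor (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    -- The intermediary table: one row per (movie, actor) pairing.
    CREATE TABLE movie_actor (
        movie_id INTEGER NOT NULL REFERENCES movie(id),
        actor_id INTEGER NOT NULL REFERENCES actor(id),
        PRIMARY KEY (movie_id, actor_id)
    );
    INSERT INTO movie VALUES (1, 'Movie A'), (2, 'Movie B');
    INSERT INTO actor VALUES (1, 'Actor X'), (2, 'Actor Y');
    INSERT INTO movie_actor VALUES (1, 1), (1, 2), (2, 1);
""")

# Renaming an actor touches exactly one row, wherever they appear.
conn.execute("UPDATE actor SET name = 'Actor X. Renamed' WHERE id = 1")

cast = conn.execute("""
    SELECT movie.title, actor.name
    FROM movie
    JOIN movie_actor ON movie_actor.movie_id = movie.id
    JOIN actor ON actor.id = movie_actor.actor_id
    ORDER BY movie.title, actor.name
""").fetchall()
```

The update is cheap because the name is stored once; the cost moves to the read side, which now needs two joins.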

Storage is Slow

  • Indexes let us jump to right part faster
  • Keeping records sorted on disk is slow
  • Indexes make this faster

How to make it fast?

  • Ask one question
  • Use explain and indexes
  • Domain knowledge can tell us how we can optimize a query
    • Postgres has an analyzer that does this well
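
The effect of an index on a query plan can be seen even in SQLite, used here only because it ships with Python; PostgreSQL's EXPLAIN output looks different. The table, the index name `idx_actor_name`, and the helper `plan` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE actor (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.executemany("INSERT INTO actor (name) VALUES (?)",
                 [("actor-%d" % i,) for i in range(1000)])

query = "SELECT id FROM actor WHERE name = ?"


def plan(sql, params):
    """Return the human-readable detail column of SQLite's query plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]


before = plan(query, ("actor-500",))
conn.execute("CREATE INDEX idx_actor_name ON actor (name)")
after = plan(query, ("actor-500",))
```

Before the index exists the plan is a full table scan; afterwards it is a search that names the index, which is the kind of change to look for when tuning a slow query.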

The O Error

  • misconception: An ORM just deals with objects, and hides the relational
  • You need to know relational

Hand Coded Applications with SQLAlchemy

Author

  • Michael Bayer

What’s a Database

  • We can put data in and get it out
  • Can do queries that allow us to find records with attributes

Relational Database

  • Can create derived tables with subqueries
  • Set operations
  • ACID

How Talk to DB

  • DBAPI
  • Abstraction layers

ORM

  • Maps to relations
  • Can map to multiple relations
  • Can map object hierarchies to tables
  • How abstract should these be?
    • Should document stores work?
  • Relational features are under/misused which causes the mismatch
  • Best to not hide, but to automate
  • Explicit decisions and automation is “hand-coded”

Hand-Coded

  • Make decisions about everything
  • Automate these decisions for ease
  • Opposite of “wizards”, “plugins”, and APIs that make implementation decisions
  • Can still use libraries and frameworks

Polymorphic Association

  • Map multiple classes to something using GenericReferences
  • Does magic for us
  • Sometimes called GenericForeignKey
  • This breaks the C in ACID
    • Can generate FK that doesn’t point to anything
  • Implicit design decisions
    • Magic tables
    • source code stored as data, which is coupling
    • Application layer responsible for consistency

SQLAlchemy’s Response

  • Declarative Base
    • Composable patterns
  • HasOwner, and PortfolioAssets defaults
  • Define convention for polymorphic association

Advanced Celery

Author

  • Ask Solem
  • Work at VMware, on RabbitMQ team

Overview

  • Flexible and Reliable message queue system
  • Granularity: the less computation, the more fine grained the task is
    • Can reuse connections, etc
  • Chunking
    • Grouping fine-grained tasks to reuse resources

Chords

  • Sync primitive
  • Known as a barrier
  • Callback the body with the results of the headers
  • Native support in Redis, with good enough fallbacks for others
  • Demo of parallel summarization using chords
  • Can use this to implement MapReduce
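
The barrier idea behind a chord can be sketched with the standard library: run the header tasks in parallel, wait for all of them, then call the body with the collected results. This is not Celery's API; `chord`, `count_words`, and the sample documents are hypothetical, and threads stand in for distributed workers.

```python
from concurrent.futures import ThreadPoolExecutor


def chord(header, body, max_workers=4):
    """Run every header task in parallel, then call body with all the results."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(task) for task in header]
        # Collecting every result is the barrier: body cannot start earlier.
        results = [future.result() for future in futures]
    return body(results)


def count_words(text):
    return len(text.split())


documents = ["a b c", "d e", "f"]

# Map step: one zero-argument task per document.
header = [lambda text=text: count_words(text) for text in documents]

# Reduce step: the body receives the list of per-document counts.
total_words = chord(header, body=sum)
```

The header is the map step and the body is the reduce step, which is how a chord can serve as a building block for MapReduce-style jobs.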

Blocking

  • Is bad
  • Timeouts

Routing

  • Smart routing
  • CPU based routers would be nice

Cyme
