Long ago I realized that it’s often better not to quit a suspected bad idea, but to explore it and learn from it. @substack just gave me more insight into this:

Don’t unpublish your bad ideas. The best argument against a bad idea is to implement it well.

James Halliday (@substack)

The last sentence is key, and applies to more than published bad ideas. In web development, if it is indeed a bad idea, you’re more likely to learn from it, than to cause harm. If it’s a good idea that seems like it’s a bad idea, which turns out to be a good idea, it might be a really good idea.

There are a lot of rails 2.3.x apps that have slipped out of maintenance but are still used. When it comes time to add a feature to them, the original developer may be unavailable or may not have the same configuration for running it in development mode. I found myself in that situation and here’s what I did to run it:

  1. Start a new vagrant instance with Ubuntu. This makes things easy to install.
  2. Install rvm and the same ruby version. The project I received had a .rvmrc which specified a version of 1.8.7. The first time I told rvm to install the old version of ruby, I exited after rvm game me a list of packages to install, and installed those packages. Then I told rvm to install ruby again and it worked.
  3. Install the version of rails. It’s in config/environment.rb. Mine was 2.3.5. I ran “gem install rails -v 2.3.5”.
  4. Install dependencies specified in config/environment.rb. On this project it was just prawn, the pdf generator.
  5. Set up the database. This was the usual drill on this project. Nothing different from a new version of rails, except it was mysql instead of mysql2. I installed the Ubuntu database packages (libmysqlclient-devel and mysql-server), set up the database according to config/database.yml, and tried running rake db:migrate and after that failed ran rake db:schema:load.
  6. Go back to an old version of rubygems. I ran “gem update –system 1.5.3”
  7. Go back to an old version of rake. The breaking changes in rake were pretty much the worst thing ever. Switch to the global gemset and uninstall rake, and install rake again with “gem install rake –version=0.8.7”. If you’re having trouble with rake, 0.8.7 is probably the version you’ll want.

After that you’ll hopefully be able to run commands in script/ as well as rake commands.

In my experience it’s enough of a pain to go from 2.3.5 to 2.3.14 that it might be better to just go from 2.3.5 to the latest 3.x version. This will require a lot of changes but it will make the app ready to be worked on by most rails developers again.

I’m a bit of a software terminology buff. One of the favorite terms that I’ve heard over the years is Plain Old Java Objects, or POJO. It rhymes with pogo, as in pogo stick.

After thinking about a case where a Backbone.js Model might not make the most sense, due to overhead for communicating with servers and publish/subscribe, I came up with the term PODO, or Plain Old Domain Object. I see that a few people have already thought of it.

To me a Plain Old Domain Object is an object which doesn’t also have the responsibility of persisting itself, which Backbone Model objects and ActiveRecord objects are designed to do.

There are reasons why I find myself wanting to use a model object with persistence built in for complex models, but I think if I had the right storage middleware object I’d be comfortable without them. I’d want it to be based on patching rather than full updates. That way I could trigger only the behavior necessary in my PODOs for a given state change.

Here’s a copy of the email I sent to the CouchDB users mailing list:

I made a couple of npm (node.js) modules for editing CouchDB design documents that involves fewer files than python CouchApp, but like python CouchApp supports two-way sync. The function source is left intact when converting between JavaScript source and JSON. The JavaScript source version just shows it in an embedded function expression, which makes it syntax highlightable.

http://benatkin.com/2012/09/04/js2json-and-json2js/
https://github.com/benatkin/js2json
https://github.com/benatkin/json2js

Here’s a quick example. If I stick this in books.js:

module.exports = {
  "views": {
    "author": {
      "map": function(doc) {
        if (doc.type == 'book')
          emit(doc.author, null);
      }
    },
    "title": {
      "map": function(doc) {
        if (doc.type == 'book')
          emit(doc.title, null);
      }
    }
  }
}

…and run this (after npm install json2js js2json):

var js2json = require('js2json');
var json2js = require('json2js');
var fs = require('fs');

var jsSource = fs.readFileSync('books.js', 'utf8');
var jsonValue = js2json.convert(jsSource);
fs.writeFileSync('books.json', JSON.stringify(jsonValue, null, 2) + "\n");
var jsSourceFromJson = json2js.convert(jsonValue);
fs.writeFileSync('books-from-json.js', jsSourceFromJson + "\n");

…I get the following in books.json:

{
  "views": {
    "author": {
      "map": "function(doc) {\n  if (doc.type == 'book')\n
emit(doc.author, null);\n}"
    },
    "title": {
      "map": "function(doc) {\n  if (doc.type == 'book')\n
emit(doc.title, null);\n}"
    }
  }
}

…and books-from-json.js is exactly the same as books.js.

I explain it more in my blog post (linked at the top of this message). I need to add a cli tool that syncs, a way to handle attachments, and a way to handle embedded multiline strings for it to be a full-featured design doc editor. I have much bigger plans for this, though: I want to break up CouchApps into a bunch of smaller documents! The source and tests for these two modules is programmed in the same style. I think storing functions in JSON makes CouchDB just a little bit like Smalltalk, with a much more familiar language.

Thanks for reading. Feedback welcome and appreciated.

I made a couple of small node.js modules to help me develop couchapps.

CouchApp is two things: a concept and a tool. The concept is an application that runs on CouchDB, using its API. The API has three basic components for presenting data (it has others for editing data but that isn’t covered in this post):

  • views: A view in CouchDB is a Map and optionally a Reduce function, which enables indexing JSON documents using the powerful MapReduce programming model.
  • lists: A list is a function that renders documents from a view, into an any format (commonly HTML).
  • shows: A show function renders a single document, into any format.

With these simple functions, a blog or a photo gallery can be made viewable. The URLs can get a bit awkward, but CouchDB provides a rewrites feature to solve it.

These functions go into a CouchDB design document. CouchDB is a JSON database, and a design document is a JSON document that contains functions as strings. A single design document, with attachments for CSS, which can be served directly to the web browser, can provide access to data without relying on client-side JavaScript (instead using server-side JavaScript).

The tool is a python command-line utility that maps a CouchDB design document to a directory structure. It can clone a design document directly from the database. It breaks it into a lot of little files, including a separate JavaScript file for each function.

This can be a lot of files, when there is one file for each function. Here is the CouchOne pages wiki CouchApp. It has a lot of views.

Mikeal Rogers had perhaps the first solution to this problem: node.couchapp.js. It runs on node.js and allows programmatic building of a design document, and puts functions in strings to prepare the JSON using Function.toString(). This loses the python CouchApp’s advantage of being able to go back and forth from file to CouchDB without losing anything.

I wanted something that could sync back and forth and is more like the original JSON than python’s CouchApp. To make it usable, I need functions shown across multiple lines with syntax highlighting. My hack was to convert the JavaScript to JSON. With CommonJS I can convert JSON to JavaScript easily, just by adding module.exports = to the front of the file. I can then replace the function strings with the direct value of the strings. To go from JavaScript to JSON I need to get the original function source. With JavaScript’s fn.toString(), where fn is a function, I get the source without the comments. I want the comments. It also isn’t indented properly. With Esprima I wrote code to extract the raw function source and fix up the indentation. Examples of this are in the READMEs of the two projects, json2js and js2json.

The projects below are still in early development and barely tested, but I’m putting them to use right now, by finally putting one of my fancy domain names to good use. ;) In the meantime, if this interests you, please check them out and give me feedback.

Repositoryjson2js
Repositoryjs2json

Avoid describing your code as automatically doing something. If you need to do multiple things at once, separate it out into a convenience function or method. Finally, make it easy to call each step explicitly. Please don’t put the orchestration code, that calls code that interacts with the outside world, into a constructor. This makes the code less flexible and harder to test. Instead, use the constructor to set up the internal state.

Connect has a session middleware that has a pluggable API for session storage. There is a session store for redis, written by TJ Holowaychuk, who maintains both connect and express. There are also session stores for CouchDB, MongoDB, and postgresql that look to be well-maintained and ready for production use.

These are great, but I wanted to store my session data in cookies, because the amount of session data I plan to use is tiny, and because my app is designed to handle high-latency CouchDB database connections gracefully.

I had a hard time finding a session store that stores session data in the cookies. The session middleware uses cookies, but it uses them to store the keys to access the session data, not the session data itself. I found an example but no actively maintained session store for cookies.

After some more searching, I found that the way to store sessions in cookies is to use a whole different middleware that comes with connect! It’s called cookieSession. To use it all I have to do is add this code snippet, and ensure that I have session_secret set in my app settings:

app.use(connect.cookieParser(app.set('session_secret')));
app.use(connect.cookieSession());

When using cookie sessions it’s important that the cookie data is small and that the cookie is signed using a session secret, to prevent session fixation. This is documented in the excellent Ruby On Rails Security Guide. Even if you aren’t using RoR I recommend reading it.

I’ve been reading quite a bit about parsing and templating in ruby as I attempt to port a templating engine from JavaScript to Ruby. Here are some scattered links:

Jison

Jison is a parser generator for JavaScript that has separate lex and bnf definitions. It’s used by Handlebars.js.

Repositoryjison
Ownerzaach

Treetop

Treetop is a parser generator for Ruby. It’s installed with Ruby On Rails, through the mail gem which is installed by ActionMailer.

Repositorytreetop

Citrus

Citrus is another promising parsing gem for Ruby. It seems to be very easy to get started with, and I like many of the design decisions.

Repositorycitrus

Temple

Temple is a templating-specific library that helps with a lot of the AST transformation. It doesn’t seem to have a CFG syntax so it seems that using treetop or citrus would make sense for complex grammars. Otherwise, strscan could be used.

Repositorytemple
Ownerjudofyr

temple-mustache

This is an implementation of a mustache renderer in . It uses strscan to generate the initial parse tree, and Temple to generate the ruby code. It supports mustache sections.

Repositorytemple-mustache
Ownerminad

Slim

Slim is a Haml-like templating library for Ruby that’s used in production by many. It uses Temple, with a line-based parser, which works well because it uses significant indentation for nesting.

Repositoryslim
Ownerstonean

For the November meeting of Front Range Pythoneers we did a bowling code kata. We worked as a group on the projector, but I also worked on my own version on my laptop. Here’s my code, which was fun to write. It’s a single file which contains its test cases and can be run on the command line or imported:

import unittest

class Frame(object):
    def __init__(self, tenth=False):
        self.rolls = []
        self.tenth = tenth

    def full(self):
        if len(self.rolls) >= self.max_rolls():
            return True
        if self.tenth:
            has_special = all([roll in ('X', '/') for roll in self.rolls])
            return len(self.rolls) == 2 and not has_special
        else:
            return self.strike()

    def roll(self, score):
        if self.full():
            raise RuntimeError('attempted to record a roll on a full frame')
        self.rolls.append(score)

    def pins(self):
        if len(self.rolls) == 0:
            return 0
        if self.strike() or self.spare():
            return 10
        else:
            return sum([int(roll) for roll in self.rolls])

    def first_roll_pins(self):
        if len(self.rolls) == 0:
            return 0
        elif self.rolls[0] == 'X':
            return 10
        else:
            return int(self.rolls[0])

    def score(self, subsequent_frames):
        if self.tenth:
            return self.tenth_frame_score()

        score = self.pins()
        if self.strike():
            score += sum([frame.pins() for frame in subsequent_frames])
        elif self.spare():
            if len(subsequent_frames) > 0:
                score += subsequent_frames[0].first_roll_pins()
        return score

    def tenth_frame_score(self):
        return min(Game(''.join(self.rolls)).score(), 40)

    def strike(self):
        return len(self.rolls) > 0 and self.rolls[0] == 'X'

    def spare(self):
        return len(self.rolls) > 0 and self.rolls[-1] == '/'

    def max_rolls(self):
        return 3 if self.tenth else 2

class Game(object):
    def __init__(self, roll_scores=''):
        self.frames = []
        for score in roll_scores:
            self.roll(score)

    def roll(self, score):
        if len(self.frames) == 0 or self.frames[-1].full():
            tenth = len(self.frames) == 9
            self.frames.append(Frame(tenth))
        self.frames[-1].roll(score)

    def score(self):
        return sum(self.frame_scores())

    def frame_scores(self):
        frame_scores = []
        for frame_index in xrange(len(self.frames)):
            frame = self.frames[frame_index]
            subsequent_frames = []
            if frame.strike() or frame.spare():
                subsequent_frames = self.frames[frame_index+1:]
                added_frames = 1
                if frame.strike() and len(subsequent_frames) > 0:
                    added_frames = 2 if subsequent_frames[0].strike() else 1
                subsequent_frames = subsequent_frames[:added_frames]
            frame_scores.append(frame.score(subsequent_frames))
        return frame_scores

class GameTest(unittest.TestCase):
    def test_initial_strike(self):
        self.assertEqual(Game('X').score(), 10)

    def test_two_strikes(self):
        self.assertEqual(Game('XX').score(), 20+10)

    def test_three_strikes(self):
        self.assertEqual(Game('XXX').score(), 30+20+10)

    def test_strike_spare_strike(self):
        self.assertEqual(Game('X9/X').score(), 20+20+10)

    def test_strike_strike_spare(self):
        self.assertEqual(Game('XX9/').score(), 30+20+10)
        self.assertEqual(Game('XX9/71').score(), 30+20+17+8)

    def test_perfect_game(self):
        game = Game('X'*12)
        self.assertEqual(len(game.frames), 10)
        self.assertEqual(game.score(), 300)

    def test_made_up_game(self):
        game = Game('X907/818/X70070/72')
        self.assertEqual(len(game.frames), 10)
        self.assertEqual(game.score(), 132)

if __name__ == '__main__':
    unittest.main()

I think that the tenth frame calculation is incorrect. My limited understanding of bowling slowed me down a fair bit. I got an object system that I’m fairly happy with, though!

I’m working on a project where I think I’ll be using a parser library so I’ve been looking at the options. One thing I’ve noticed is that treetop is installed when Rails is installed. I didn’t know why, though, so I looked around.

First I looked at the Gemfile.lock. Had I known the format I would have found out my answer more quickly. I didn’t, though, and so when I found my first result for treetop, I stopped. It showed treetop below specs in the hierarchy.

GEM
  remote: http://rubygems.org/
  specs:
    # ...snip...
    thor (0.14.6)
    tilt (1.3.3)
    treetop (1.4.10)
      polyglot
      polyglot (>= 0.3.1)
    tzinfo (0.3.30)
    uglifier (1.0.3)

The next thing I did was run find . -iname '*.treetop' in ~/.rbenv. It found the following results:

(mbp) ~/.rbenv/versions/1.9.2-p290 $ find . -iname '*.treetop'
./lib/ruby/gems/1.9.1/gems/erector-0.8.3/lib/erector/erect/rhtml.treetop
./lib/ruby/gems/1.9.1/gems/mail-2.3.0/lib/mail/parsers/address_lists.treetop
./lib/ruby/gems/1.9.1/gems/mail-2.3.0/lib/mail/parsers/content_disposition.treetop
./lib/ruby/gems/1.9.1/gems/mail-2.3.0/lib/mail/parsers/content_location.treetop
./lib/ruby/gems/1.9.1/gems/mail-2.3.0/lib/mail/parsers/content_transfer_encoding.treetop
./lib/ruby/gems/1.9.1/gems/mail-2.3.0/lib/mail/parsers/content_type.treetop
./lib/ruby/gems/1.9.1/gems/mail-2.3.0/lib/mail/parsers/date_time.treetop
./lib/ruby/gems/1.9.1/gems/mail-2.3.0/lib/mail/parsers/envelope_from.treetop
./lib/ruby/gems/1.9.1/gems/mail-2.3.0/lib/mail/parsers/message_ids.treetop
./lib/ruby/gems/1.9.1/gems/mail-2.3.0/lib/mail/parsers/mime_version.treetop
./lib/ruby/gems/1.9.1/gems/mail-2.3.0/lib/mail/parsers/phrase_lists.treetop
./lib/ruby/gems/1.9.1/gems/mail-2.3.0/lib/mail/parsers/received.treetop
./lib/ruby/gems/1.9.1/gems/mail-2.3.0/lib/mail/parsers/rfc2045.treetop
./lib/ruby/gems/1.9.1/gems/mail-2.3.0/lib/mail/parsers/rfc2822.treetop
./lib/ruby/gems/1.9.1/gems/mail-2.3.0/lib/mail/parsers/rfc2822_obsolete.treetop
./lib/ruby/gems/1.9.1/gems/treetop-1.4.10/examples/lambda_calculus/arithmetic.treetop
./lib/ruby/gems/1.9.1/gems/treetop-1.4.10/examples/lambda_calculus/lambda_calculus.treetop
./lib/ruby/gems/1.9.1/gems/treetop-1.4.10/lib/treetop/compiler/metagrammar.treetop
./lib/ruby/gems/1.9.1/gems/treetop-1.4.10/spec/compiler/test_grammar.treetop
./lib/ruby/gems/1.9.1/gems/treetop-1.4.10/spec/compiler/test_grammar_do.treetop
./lib/ruby/gems/1.9.1/gems/treetop-1.4.10/spec/composition/a.treetop
./lib/ruby/gems/1.9.1/gems/treetop-1.4.10/spec/composition/b.treetop
./lib/ruby/gems/1.9.1/gems/treetop-1.4.10/spec/composition/c.treetop
./lib/ruby/gems/1.9.1/gems/treetop-1.4.10/spec/composition/d.treetop
./lib/ruby/gems/1.9.1/gems/treetop-1.4.10/spec/composition/f.treetop
./lib/ruby/gems/1.9.1/gems/treetop-1.4.10/spec/composition/subfolder/e_includes_c.treetop
(mbp) ~/.rbenv/versions/1.9.2-p290 $

Aha, so there are numerous treetop files in actionmailer! I have my answer. Seems like a good use of a parser, plus those may be worth using as examples.

Then I took another look at a Gemfile.lock from a rails project, and saw that it was plainly listed there. I just didn’t see it and didn’t keep looking after I found one.

GEM
  remote: http://rubygems.org/
  specs:
    XMLCanonicalizer (1.0.1)
      log4r (>= 1.0.4)
    actionmailer (3.1.1)
      actionpack (= 3.1.1)
      mail (~> 2.3.0)
  # ...snip...

I noticed something: Gemfile.lock doesn’t show an arbitrarily nested hierarchy; instead it shows a list of gems and their dependencies, where the list of gems includes all gems. Then, separately at the end of the file, it shows the top-level gems from the Gemfile.

To see a deeply nested, a graph could be constructed from the Gemfile.lock, using the two nesting levels under specs as an adjacency list.

In the last couple of years I’ve witnessed a disturbing trend: developers adopting free Heroku as their only means of hosting side projects. More disturbingly, I operated this way myself for a couple of years. (Yes, freemium can be a trap for customers just like it can be a trap for businesses.)

Heroku has five megabytes for database space, which often sounds like it ought to be enough when it isn’t. Want auditing and comments? Nah, that’ll take up too much space. Its single dyno free plan serves one request at a time. The next steps up are twenty dollars a month and five cents an hour for databases and dynos, respectively. These aren’t that expensive for a major project, but for several side projects it quickly adds up.

I realized this and switched back to running a VPS, this time on Zerigo, which I pay for annually. There is no limit to the number of apps I have. Concurrent requests are supported. They can use the same databases. Database backups are free and uncomplicated. I’m also happy to be outside of the cloud oligopoly that seems to be forming.

Besides that, it’s fun! I get to try niche language platforms. Node.js was building steam long before Heroku supported it, and it still doesn’t support websockets. It’s not hard to find, with a little thinking, other interesting platforms to try. How about Racket or Factor? Or setting up your own Lucene server, or a web server that uses the git command line tool? Those can’t (easily) be run with Heroku.

I find anecdotally that most developers don’t have their own websites or non-trivial side projects. I only have the first, but I can sense that my personal website is helping me prepare to launch non-trivial side projects. I’ve done very little work to set up this server, yet despite tweeting about it and having visitors and occasional commenters, it stays up. That gives me the confidence I need to launch something bigger.

My plea to other developers (and aspiring developers) out there is to draw parallels between programming and other creative works and find out how much you could responsibly be spending for hosting side projects, and then realize that there’s no reason you shouldn’t have at least a VPS.