Archive for May, 2008

The First Commitment

Sunday, May 4th, 2008

In the past few months, I’ve allowed myself to slip. I haven’t been making many public commits, nor discussing much where others can see. It has me feeling like a bodybuilder who hasn’t touched a set of weights in the same amount of time. My work and my writing have atrophied. My ability to maintain code that other people depend upon has suffered, and my ego has, as well. Time to sharpen up.

The first part of this new commitment is that I’ll be making a minimum of 3 commits a week to rFeedParser, no matter how small. This one is a stepping stone to taking on more of a workout, and it gives me time to reacquaint myself with the code base. rFP has weird and hairy parts in it because the problem it was solving was weird and hairy. However, there are a good number of ugly parts that were created because a) I wrote it with the Python version in the next window over, causing me to write with a strong Pythonic accent; and b) I wasn’t as skilled in Ruby as I am now.

The module hierarchy alone proves I was diving in and not giving a fuck. At a certain point, I was just trying to get it to goddamn work and not caring what kind of hack-and-slash maneuvers I had to pull off to make it happen. With the distance from the problem and the clearer head I have now, I can piece together how it should be done.

The second part is a commitment to one commit a week to one of my public side projects. Right now, this consists mainly of the strictly-for-fun-and-I’m-keeping-it-that-way-fuckers framework I’m writing called Recess. Everyone writes a web framework, and I’m going to be That Guy, too.

I’ll try not to be too snooty about it, but if the framework turns out well (or, at all, really), I probably will be. Like I’ve said before, my ego knows no bounds. But, remember! It’s just for fun. Really. Really.

As my plans and projects grow and adapt and interests wax and wane, there will, of course, be a call to change this commitment. This two-part commitment is only the first of what will be a series of changing, and, likely, growing vows to myself. Look to see a lot more work from me.

rFeedParser on GitHub

Friday, May 2nd, 2008

Alright, it’s done. I’ve moved rFeedParser and rchardet to GitHub. Check out the rFeedParser and rchardet pages at GitHub and clone them with these URLs:

git://github.com/jmhodges/rfeedparser.git
git://github.com/jmhodges/rchardet.git

rFeedParser, of course, is a Ruby translation of the Universal Feed Parser in Python and passes 98.8% of its 3000+ unit tests. rchardet is a Ruby translation of chardet in Python and is used quite a bit in rFeedParser.

There are, of course, some things left to be done in both of these projects.

Off the top of my head, rFeedParser needs:

  • to be able to use libxml if the user prefers, instead of the Expat binding
  • to use version 0.4.1 of the character-encodings gem
  • someone to ask People Who Know if the way rfp strips out the bad stuff in the *_crazy.xml tests is acceptable
  • to set up a git submodule for the tests in order to ease the merging in of tests from the feedparser repository
  • a fix up to some of the regexes and lame matching code in it, especially the time parsing code
  • a re-sorting of the incredibly ugly object hierarchy
  • other things I’ve forgotten and am too lazy to look up

rchardet needs:

  • some information on whether using some gem-provided Tuple object instead of the giant Arrays would help the memory usage
  • fixes for the other encoding bugs that Mark fixed when he released the version of chardet that cleared up the little-endian UTF-16 bug I reported

There’s still a lot of work to do, and I’m listening to your concerns and taking your patches. Hit the mailing list and we can all make this better.

Special Note for People Who Want to Help: Run rake setup in your branch to install all the gems you need to run it.

Moving From Bzr to Git (or “Tailor is So Awesome I Cream My Pants”)

Thursday, May 1st, 2008

rFeedParser obviously has not gotten enough love from me. I intend to correct that.

The first order of business was to stop hosting its branches in bzr on this server. No one knew the repositories existed, they were sucking up tons of hard drive space, and, dammit, I’ve been digging git ever since Garry turned me on to it. Oh, and getting rFeedParser into a svn repository on rubyforge required bzr svn, which required me patching svn on my MacBook Pro. Too much damn work.

But I also don’t want to lose all of those commit logs. I’d feel guilty pretending that rFeedParser just magically appeared in its current state without showing off how many times it looked even worse. I decided that I needed just the main branch turned into the master branch of a new git repository and that I’d host it up on GitHub.

Enter tailor. tailor is an amazing bit of Python that can translate from most any version control system to most any other version control system. Trying to describe all of the other crazy source control backflips it can do would take up too much space here, but, I assure you, it’s worth checking out. Go do your googlings.

tailor made it drop dead simple to move rfeedparser over, but I had some significant help making that happen. Bryan Murdock’s post on this exact same topic was a great boon, but not perfect. In the time since he posted, either the config file for tailor changed, or some kind of bizarro bit rot occurred.

In any case, I munged up the config file, looked up some more docs and, now, I present you with a simple way of moving your current bzr branch into a brand spanking new git repository. (How lucky you are!)

First things first, you’ll need bzr, git and tailor. The first two I had installed via MacPorts (versions 1.3.1 and 1.5.5.1, respectively) while the latter was a bit of a pain. You can install it via MacPorts, but by default it tries to run in the Python 2.4 environment when bzr is installed over in the 2.5 one. Bleh.

What I had to do was grab the tailor code myself (version 0.9.30), unpack it (say, to ~/src) and run it with python2.5 explicitly. For those of you on Macs, substitute python2.5 ~/src/tailor-0.9.30/tailor for tailor when I write it in the commands below. The rest of you can be blissfully unaware that there is any problem at all because your package management system probably doesn’t suck as much as MacPorts.
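If typing that out every time sounds tedious, a small shell function can stand in for the tailor command. This is only a sketch: TAILOR_SRC is a name I made up for this example, and it assumes you unpacked tailor 0.9.30 under ~/src and have a python2.5 on your $PATH.

```shell
# Where the tailor source was unpacked (TAILOR_SRC is a made-up variable name).
TAILOR_SRC="${TAILOR_SRC:-$HOME/src/tailor-0.9.30}"

# Run the unpacked tailor script with python2.5, passing all arguments through.
tailor() {
    python2.5 "$TAILOR_SRC/tailor" "$@"
}
```

With that defined in your shell session, the tailor command further down should work as written.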

With the tools in place, find your bzr branch, which I’ll pretend, for the sake of this post, is at /path/to/bzrbranch. Next, decide where you want the git branch to exist which, again for the sake of this post, I’ll pretend is /path/to/gitrepo.

Now, anywhere at all, write a file (which we’ll call bzr2git.conf) containing:

[DEFAULT]
verbose = True
patch-name-format = ""

[project]
source = bzr:source
target = git:target
start-revision = INITIAL
root-directory = /path/to/gitrepo
state-file = tailor.state

[bzr:source]
repository = /path/to/bzrbranch


[git:target]
git-command=/opt/local/bin/git

Notice the git-command line at the end. That’s only for lame-o MacPorts users because tailor doesn’t seem to understand $PATH or something, freaks out about not being able to find the git command and leaves us questioning our ability to manage our own systems. Leave it out, or change the path to the right one if you’re on another system.
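If you’d rather not hand-edit the paths, the same file can be written from the shell. Treat this as a rough sketch and not the one true way: BZR_BRANCH, GIT_REPO, CONF and GIT_COMMAND are names I invented for the example, and I’m assuming tailor is happy with a [git:target] section that is empty when no git path turns up.

```shell
# Hypothetical variable names; point them at your own paths.
BZR_BRANCH="${BZR_BRANCH:-/path/to/bzrbranch}"
GIT_REPO="${GIT_REPO:-/path/to/gitrepo}"
CONF="${CONF:-bzr2git.conf}"

# Ask the shell where git lives, since tailor may not search $PATH itself.
GIT_COMMAND="${GIT_COMMAND:-$(command -v git || true)}"

# Write the same sections as the config in this post, with the paths filled in.
cat > "$CONF" <<EOF
[DEFAULT]
verbose = True
patch-name-format = ""

[project]
source = bzr:source
target = git:target
start-revision = INITIAL
root-directory = $GIT_REPO
state-file = tailor.state

[bzr:source]
repository = $BZR_BRANCH

[git:target]
EOF

# Only add git-command when a path to git was actually found.
if [ -n "$GIT_COMMAND" ]; then
    echo "git-command = $GIT_COMMAND" >> "$CONF"
fi
```

Letting the shell look up git means the MacPorts path doesn’t have to be hard-coded, which should keep the file usable on other systems.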

Finally, run

tailor -D -c bzr2git.conf

and you’ll have a happy new git repository at /path/to/gitrepo with your history intact. Oh! And it’ll have the .bzr directory in it but you can feel free to clear it out. The old bzr branch will still have all that info in place.
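Once tailor finishes, a quick sanity check doesn’t hurt. The sketch below is my own guess at a reasonable cleanup, not something tailor provides: finish_conversion is a made-up helper that looks for a .git directory in the directory named by root-directory, deletes the leftover .bzr directory and prints how many commits git can see.

```shell
# Hypothetical helper: tidy up and sanity-check the converted repository.
finish_conversion() {
    repo=$1

    # tailor should have created a git repository here.
    if [ ! -d "$repo/.git" ]; then
        echo "no git repository found at $repo" >&2
        return 1
    fi

    # Remove the leftover bzr metadata; the original bzr branch keeps its own copy.
    rm -rf "$repo/.bzr"

    # Print the number of commits reachable from the current branch.
    (cd "$repo" && git log --pretty=oneline | wc -l | tr -d ' ')
}
```

Call it with the new repository’s path, for example finish_conversion /path/to/gitrepo. Comparing the number it prints against bzr revno in the old branch is only a rough check, since the two tools may not count history the same way.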

And you’re done. I’ll be following up with information on what’s up with rFeedParser in the next post.

Update: Now see this: rFeedParser on GitHub.