robkinyon on Perl: 2010

Tuesday, March 2, 2010

DBM::Deep 1.0020 released

1.0020 Feb 16 22:00:00 2010 EST

(This version is compatible with 1.0016)

- Fixed t/43_transaction_maximum.t so that it doesn't error out on systems which cannot fork > 255 children at one time.

- Improved code coverage

- Added t/96_virtual_functions.t which helps describe what actually needs to be overridden in a new plugin.

So, the major change between 1.0020 and 1.0016 is the support for different backends. This was one of the biggest changes I've made, even considering adding support for transactions. While transactions were a big change mentally, adding backends required a lot more physical work. It meant moving all sorts of code around and actually building proper APIs between the various subsystems. This exposed a ton of coupling that had to be teased apart very gently.

I also learned a ton about just how UN-transactional the various "transactional" datastores really are. SQLite does full DB-locks to implement transactions. InnoDB (MySQL's transactional engine) takes locks on AUTO_INCREMENT keys. So, in both cases, concurrent transactions just don't work. DBM::Deep's transaction support wins. Huh.

But, unlike DBM::Deep's native file backend, using MySQL as a backend allows you to use DBM::Deep across different servers. So, DBM::Deep can start to scale horizontally where, before, it couldn't because of problems with flock over NFS. And, for obvious reasons, running DBM::Deep using MySQL as a backend is faster - much faster.

I have an idea for how to improve transactions over InnoDB to avoid AUTO_INCREMENT keys by using UUIDs. But, until I see someone actually using this in prod, it'll remain just an idea.

Saturday, February 20, 2010

Software as complexity management (part 2)

(q.v. part 1)

All pieces of software have one thing in common - they change over time. Whether it's a new feature, a bug report, or a feature that's implemented over 3 releases, these changes affect our baby in so many ways. Managing those changes is probably the most important thing a developer does.

Change is a problem because it introduces risk - risk of a bug. We all know what a bug is, and every change could introduce one. So, managing change is best done by reducing the risk of every change.

The only way to reduce risk is to reduce the scope of the risk. The biggest risk any change has is in how many workflows the change can potentially affect. So, you want to keep the scope of every line of code as tight as possible. Any change to any line in a scope (block, function, etc) has the same area of effect as every other line in the same scope. So, if one line of code in a function references a global variable, then any change to any line in that function has the potential to affect that global variable.

This is sometimes called coupling, but I think that coupling isn't a strong enough concept. The best analogy is taint. Let's say we have global variable %BAD_IDEA. It's referenced in functions bad_foo() and bad_bar(). If any line in bad_foo() is changed, then every workflow that uses bad_foo() must be tested (duh!). But, every workflow that uses bad_bar() must also be tested. And, if bad_bar() uses another global variable, then every workflow that uses any function that uses that second global variable must also be tested. All from changing a line in bad_foo().
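Here's a minimal sketch of that taint. The names %BAD_IDEA, bad_foo() and bad_bar() come from the paragraph above; the function bodies, the "limit" key, and good_bar() are made up purely for illustration.

```perl
use strict;
use warnings;

# The shared global from the paragraph above (the 'limit' key is an assumption).
our %BAD_IDEA = ( limit => 10 );

# bad_foo() writes to the global.
sub bad_foo {
    my ($n) = @_;
    $BAD_IDEA{limit} = $n * 2;
    return $BAD_IDEA{limit};
}

# bad_bar() never calls bad_foo(), yet its answer depends on whatever
# bad_foo() last left in %BAD_IDEA.
sub bad_bar {
    my ($n) = @_;
    return $n > $BAD_IDEA{limit} ? 'over' : 'ok';
}

# Hypothetical tighter version: the limit is an argument, so a change
# here can only affect the callers of this one function.
sub good_bar {
    my ($n, $limit) = @_;
    return $n > $limit ? 'over' : 'ok';
}

print bad_bar(15), "\n";    # limit is 10, prints "over"
bad_foo(20);                # limit is now 40
print bad_bar(15), "\n";    # same call, now prints "ok"
```

The two bad_bar(15) calls are identical, but the answer changes because an unrelated function ran in between. That's why every workflow through bad_bar() needs retesting when bad_foo() changes.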

It becomes pretty obvious pretty quickly that global variables end up tainting whole swathes of code (and why global variables should be avoided at all costs). Next part is about objects and how they encourage the use of global variables.

Thursday, February 18, 2010

Software as complexity management (part 1)

Software exists to solve problems, primarily problems in business. Solutions to those problems pay for the people who write the software that solves them. Even free software runs on corporate money. Especially free software.

If we want to write better software, we need to have a solid understanding of what a problem is. Instead of problems, it's useful to talk about the problemspace. I often find it useful to imagine a problem as a surface in N-dimensional space, each dimension representing a competing concern. For me, the surface describes the complexity floor of the solutionspace. It's the theoretically ideal solution - the absolute zero of software perfection.

So, the ideal software is the implementation that is closest to the complexity floor. Can we describe the complexity axes? Can we describe some rules for how to minimize them? What do you think?

Tuesday, February 16, 2010

How useful is a debugger, really?

(Inspired by this and this and informed by Linus.)

From the get-go, there's been a Perl debugger and lots of people think that it's a "Good Thing"™. Me, I've never seen the point. Yeah, I've used it. I know how to set breakpoints and step into functions and all that. I've used perl -d, gdb, the VB debugger, and the debuggers in both Firefox and IE (versions 5-8). And I hate them all.

Well, hate is a strong word. Frankly, I don't see the point. There has never been a programming problem I've faced where using the debugger helped me. At all. In any language. Ever.

In 2000, I was lucky to be mentored by Ted Vessenes when we were working at Motorola. Ted is one of the most phenomenal programmers I've ever met and I know a bunch of brilliant people. Just to give you an idea of how freaking awesome he is, this is the guy that played so much Unreal Tournament that he got pissed at the AI and built a new one. Not a perfect player, but a perfect human player. His AI ended up being so good that he had to mess up his algorithms so that other people knew they were playing an AI. He literally passed the UT3 Turing Test.

Ted once told me "the most powerful debugging tool is print." (He actually told me this a bunch of times.) Or, restated - "If you need anything more than the print statement to debug a problem, then you don't understand the problem enough to solve it."

That means you, Mr. debugger. If I have to reach for you, that means I don't have enough context, knowledge, or whatever else I might need to solve the problem. In fact, I use my desire for a debugger as a litmus test. Anytime I want a debugger, I know I'm too stupid to fix it.
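For what it's worth, the kind of print-based debugging Ted was talking about might look something like this. It's only a sketch: order_total() and its data layout are hypothetical, invented for the example. warn and Data::Dumper are both core Perl.

```perl
use strict;
use warnings;
use Data::Dumper;

# Sorted keys keep the dumps stable from run to run.
$Data::Dumper::Sortkeys = 1;

# Hypothetical function under suspicion: totals the lines of an order.
sub order_total {
    my ($order) = @_;
    my $total = 0;
    for my $line (@{ $order->{lines} }) {
        # Print exactly the state you have a question about, to STDERR.
        warn sprintf("line: sku=%s qty=%s price=%s\n",
            @{$line}{qw(sku qty price)});
        $total += $line->{qty} * $line->{price};
    }
    # Dump the whole structure once, where you suspect it gets mangled.
    warn 'order after totalling: ', Dumper($order);
    return $total;
}

my $order = {
    lines => [
        { sku => 'A1', qty => 2, price => 5 },
        { sku => 'B2', qty => 1, price => 3 },
    ],
};
print order_total($order), "\n";    # prints 13
```

The warn lines go to STDERR, so they don't pollute the program's real output and are easy to delete once the question is answered.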

(Frankly, this goes for IDEs, too. If I need anything but vim to manage my project, then my project isn't organized properly.)

Monday, February 15, 2010

Why choose Java over Perl?

I was talking about Perl and Java in the Enterprise with frew on IRC tonight. We never actually discussed the topic at hand. Instead, we ended up talking about szabgab's comment.

I think the primary reason is that Perl is too individualizable, by far. Yes, I said too individualizable. That strength of Perl - TMTOWTDI - is its biggest weakness when it comes to working in groups.

When it comes to programming applications, the key feature of a good workflow is managing change. Every application will have change come down the pike. A new feature, an overeager salesman, a bug found, or an idiotic CTO - these all bring about changes. A good workflow will build an application that manages these changes without compromising the application's ability to manage subsequent changes. A team of decent Perl developers will easily be able to build an application that does this. But, how long is the ramp-up for a new developer? What happens when one of the initial developers leaves?

Java, on the other hand, does not give you all that freedom in how you build your application. There is no TMTOWTDI Ave. in Java City. This inability to express yourself is a strength when it comes to managing turnover. Every Java application in existence looks a lot like every other Java application.

Furthermore, Java offers a set of guarantees that Perl simply cannot offer because of the design tradeoffs inherent in the language. For example, there is absolutely no way to declare something invariant in Perl. In Java, invariants are a normal part of the language. Perl also has no way of guaranteeing the type of a given variable. Java and Haskell both have this feature. You cannot turn off monkey-patching or any of the other runtime poking that is one of Perl's greatest strengths, not even lexically.
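To make the runtime poking concrete, here's a small sketch. The Account class is hypothetical, made up for this example. It shows a method being replaced at runtime from outside the class, and a variable changing from number to string to array reference with nothing to stop it.

```perl
use strict;
use warnings;

package Account;    # hypothetical class, for illustration only

sub new {
    my ($class, %args) = @_;
    return bless { balance => $args{balance} }, $class;
}

sub balance { return $_[0]{balance} }

package main;

my $acct = Account->new(balance => 100);
print $acct->balance, "\n";    # prints 100, the method as written

# Any code, anywhere, can swap the method out at runtime.
{
    no warnings 'redefine';
    *Account::balance = sub { return 'one million' };
}
print $acct->balance, "\n";    # same object, now prints "one million"

# Nothing pins a variable to a type, either.
my $count = 42;
$count = 'forty-two';
$count = [ 4, 2 ];
print ref($count), "\n";       # prints ARRAY
```

The author of Account can do nothing inside the class to prevent the glob assignment. That is great for testing and hot-fixing, and it is exactly the guarantee a large, high-turnover team can't get.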

And, to top it all off, there's such a huge variety in feature strengths that a Perl novice simply cannot comprehend a Perl master's code, let alone a guru's. Every Java developer can read every other Java developer's code.

Given that some programming teams turn over a developer every week, which language should they choose?

Sunday, February 14, 2010

Perl and Java in the Enterprise

(Via Google Buzz)

I'm glad that Perl job openings in the enterprise are trending upwards. But, what does the number of openings really tell us about the comparative penetration of each language? Well, a lot less than you might think.

The big difference is how many developers of language X are needed to support a given set of features. And, as every good Perl developer knows, Perl requires a lot fewer developers than Java.

A LOT FEWER.

In the past, I was a Perl contractor for some of the largest Fortune 500 companies in the world. Every one of these companies had significant investment in both Perl and Java. A few times, I saw a given application developed in Perl, then reimplemented in Java. I distinctly remember two specific cases where this happened.

Both cases followed the same pattern. First, the Perl app was developed part-time by 2-3 developers over a couple weeks and supported by them in their "spare time" - in addition to their fulltime responsibilities. Management found out about the app and, in both cases, ordered it reimplemented in Java. Six months later, the group of 20+ developers and 2-3 project managers released their first version. Unusably buggy, it had fewer than half the features of the prototype app in Perl. After another year (roughly 40 man-years sunk), the users in both cases got the original app up and running via back-channels and went about their work. In one case, management supported the Perl app. In the other, I think the Java app is still up and running. Maybe (6+ years later), it's finally where the original prototype was. Probably not.

The point is that a team of 3-5 Perl developers can do more than 30 Java developers. 10:1 sounds about right. Perl has already penetrated the enterprise. Frankly, the enterprise cannot exist without Perl. The enterprise just doesn't need as many Perl developers to do the same work. Think about that.

Wednesday, February 10, 2010

What is "Modern Perl 5"?

So, the theme for YAPC::NA::2010 is going to be "Modern Perl 5". Other than a cool marketing buzzword, what exactly does that mean?

Sometimes, it's easier to start with what this doesn't mean.

There's a quasi-truism that says something like "You can always recognize someone's original language in Perl." C programmers write C in Perl. Java programmers do the same thing. (And, wow, can you tell a sysadmin who's only ever used awk and sed!) It's easy to recognize "good Perl," but can we describe it?

Perl is also oft-maligned as "a bunch of line-noise." While it's easy to dismiss those complaints as from "sigil haters" or "regex newbies," there is definitely some truth that Perl tends to be much denser than other languages. And, that's a strength of our language - one we get from our FP roots. It's quite common to translate a program from C to Perl and get LOC ratios of 20-1 or higher. I once saw higher than 100-1 and it was well-written C.
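As a rough illustration of that density (the function name and input are made up for this sketch), here's a word-frequency ranking whose real work is two lines. The C equivalent would need its own hash table, tokenizer, and comparison function.

```perl
use strict;
use warnings;

# Return the distinct words in $text, most frequent first,
# with ties broken alphabetically.
sub top_words {
    my ($text) = @_;
    my %count;
    $count{ lc($_) }++ for $text =~ /\w+/g;
    return sort { $count{$b} <=> $count{$a} || $a cmp $b } keys %count;
}

print join(', ', top_words('the cat and the hat and the bat')), "\n";
# prints "the, and, bat, cat, hat"
```

The sort block is the FP influence at work: the comparison is just a function handed to sort.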

Lastly, one of the great promises of Perl 6 is that it will allow sane grammar modification. We don't have that yet. But, Devel::Declare does get us quite close. Syntactic sugar isn't about fewer keystrokes - it's about being clear about the intent of the code. In my opinion, C's typedef is one of its most powerful features.

Putting it all together, this is what Modern Perl 5 is all about:

A Perl "style"
Readably dense code
Literate code

The best example of this is Moose. Everything in Moose can be implemented in pure Perl. (Some is in XS for performance, but it doesn't have to be.) Code written using Moose is unarguably in a Perl style. The code is very dense (easily 20-1 to C or Java), but it's clear at the same time. There are several kinds of subroutine declarations, each making it clear what its intention is.

Moose will have at least one track devoted to it and there's going to be at least one class on Moose during the after-YAPC. What else is there that epitomizes those "Modern Perl 5" features? How did you bring those features to your pre-modern codebases? What advice can you give someone on how to make their code modern? Did I even get all the features of "Modern Perl 5" correct?

Why am I asking all these questions? Because I want your talk submissions!

Social media setup

We have set up the various social media connections you want to get your YAPC::NA::2010 on.

And we're still on the ones you already know and love:

#yapc on irc.perl.org
Mailing list - http://mail.pm.org/mailman/listinfo/yapc

Tuesday, February 9, 2010

Call for Presentations for YAPC::NA::2010

YAPC::NA::2010 is just around the corner. (Well, inasmuch as something 18 weeks away is "around the corner.") Last week, I met with Heath Bairh and the other organizers and took the position of speaker liaison. So, I should probably talk some about what we're looking for.

YAPC::NA::2010 is going to be about "Modern Perl 5". We believe that Perl 5 is a vibrant and living language with many uncharted places it can go. Perl 6 is going to be great, but we can't wait until Christmas for Perl 6.

So, we're looking for presentations about Perl 5 in all of its modern glory. Whatcha got?

You'll need to register on (or log in to) the YAPC::NA::2010 website. Then, after registering for the conference, you can submit your talk proposal. The window closes March 31, 2010 at midnight GMT.
