Saturday, September 21, 2013

On Schwern

For those who have not heard, Schwern was arrested recently for domestic violence. I have no comment on the charges - I don't know what happened. AND NEITHER DO YOU.

My heart goes out to the victim (whom I will not name). Her experiences must have been terrifying and I'm glad that she is safe now. She has a huge amount of support, support many victims of domestic violence never receive.

What I do know is this:

  1. Domestic violence is horrific. It's something that should never happen and must be stamped out.
  2. If Schwern did do the crime, he should do the time. Hard time.
  3. The Perl community, at least the several hundred people I've known over the past 10+ years, does not condone this sort of behavior. And, yes, I feel very comfortable speaking for all of them.
  4. Schwern, for all of his faults, has done many good things to increase gender diversity in the many OSS communities he has been a part of.

Like all of us, he is not black or white. He is a three-dimensional person, with greatness and horrible flaws. Every single person who reads these words has done things they are deeply ashamed of. Knowing him, I'm sure Schwern, a decade from now, will deeply regret his actions and choices of the past few days.

THIS DOES NOT EXCUSE HIS ACTIONS, NOR AM I APOLOGIZING FOR HIM.

I only hope and pray that people do not throw the baby out with the bathwater. Schwern is a brilliant developer whose contributions to Perl were a small, but key, part of how the Web was won. He's also a troubled man who, like all of us, occasionally makes horrible decisions. And before you say "Well, I'd never do that!", you probably have done something else equally horrible in another arena. Imagine how you'd feel if all your secrets were out.

So, if this episode leads you to change your behavior in any way, let it be one of these:
  1. Give support to someone you know who's dealing with shit. You never know when you might save a life.
  2. If you know a batterer, report them. You might save a life.
  3. Public shaming never helped anyone. Ever. Don't do it. It's a form of abuse.

Friday, September 20, 2013

Release of DBIx::Class::Sims

(xpost from the DBIx::Class mailing list, with formatting)

Announcing the initial release of DBIx::Class::Sims. This is a schema extension that allows you to sanely and easily generate good-looking and legal test data, only specifying the actual rows you care about in your test and letting DBIC figure out the rest. (Repo is https://github.com/robkinyon/dbix-class-sims)

Let's assume you have a standard Artist->Album->Track schema where Artist->has_many(Albums) and Album->has_many(Tracks).
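For concreteness, the Result classes behind such a schema might look something like this (a minimal sketch; the class, table, and column names are illustrative assumptions, not taken from the release):

```perl
# Illustrative DBIx::Class Result class for the Artist side of the
# Artist->Album->Track schema; Album and Track would follow the same pattern.
package MyApp::Schema::Result::Artist;
use base 'DBIx::Class::Core';

__PACKAGE__->table('artists');
__PACKAGE__->add_columns(
    id   => { data_type => 'int',     is_auto_increment => 1 },
    name => { data_type => 'varchar', size => 128 },
);
__PACKAGE__->set_primary_key('id');
__PACKAGE__->has_many(albums => 'MyApp::Schema::Result::Album', 'artist_id');

1;
```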

my $ids = $schema->load_sims({
    Track => [
        { name => 'Hello, Dolly' },
        { name => 'Moonlight Sonata', 'album.artist.name' => 'Beethoven' },
    ],
});

That will do exactly what you think it should do. It will generate a randomized artist with a randomized album that has a single track called "Hello, Dolly". It will also generate another artist with a name of Beethoven who has a randomized album that has a single track called "Moonlight Sonata". All other columns will be randomized.

The return value $ids will be a hashref that will look something like:

$ids = {
  Track => [
    { id => 1 }, { id => 2 },
  ],
};

These IDs correspond to the rows in the tracks table that you requested to be created.

Now, if someone goes ahead and adds a Studio table and Studio->has_many(Albums) (because artists create albums in studios, but don't always stick to the same studio), then your test doesn't break and it doesn't change. At all. A randomized studio will be generated and, because you didn't specify anything about studios, that same studio will be used for every album. (Remember - your test is about tracks, so you don't care whether the same studio is used.)

If, for some reason, your application requires that all artists have at least 2 albums to be legal, that's easily specified as well.

my $ids = $schema->load_sims(
    { Track => [ {} ] },
    { Artist => { albums => 2 } },
);

The randomization part comes from the sim types. You can add the type to each column and a reasonable-looking value will be generated instead of the default_value. Right now, the only type that has been written is us_zipcode. More will be forthcoming and you can release your own CPAN distribution with types if I'm too slow in adding them.
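If memory serves (treat the exact key name as an assumption), a sim type is attached in the column info when the column is declared, something like:

```perl
# Hypothetical column declaration tagging a column with a sim type;
# DBIx::Class::Sims looks for the 'sim' key in the column info.
__PACKAGE__->add_columns(
    zipcode => {
        data_type => 'char',
        size      => 9,
        sim       => { type => 'us_zipcode' },
    },
);
```

Columns without a sim type are randomized with the generic behavior described above.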

As with everything else DBIC-related, patches are welcome (as pull-requests on github) and discussion is on this mailing list or in #dbix-class on IRC.

Tuesday, March 2, 2010

DBM::Deep 1.0020 released

1.0020 Feb 16 22:00:00 2010 EST
(This version is compatible with 1.0016)
- Fixed t/43_transaction_maximum.t so that it doesn't error out on systems
  which cannot fork > 255 children at one time.
- Improved code coverage
- Added t/96_virtual_functions.t which helps describe what actually
  needs to be overridden in a new plugin.

So, the major change between 1.0016 and 1.0020 is the support for different backends. This was one of the biggest changes I've made, even counting the addition of transactions. While transactions were a big change mentally, adding backends required a lot more physical work: moving all sorts of code around and building proper APIs between the various subsystems. This exposed a ton of coupling that had to be teased apart very gently.

I also learned a ton about just how UN-transactional the various "transactional" datastores really are. SQLite does full DB-locks to implement transactions. InnoDB (MySQL's transactional engine) takes locks on AUTO_INCREMENT keys. So, in both cases, concurrent transactions just don't work. DBM::Deep's transaction support wins. Huh.

But, unlike DBM::Deep's native file backend, using MySQL as a backend allows you to use DBM::Deep across different servers. So, DBM::Deep can start to scale horizontally where, before, it couldn't because of problems with flock over NFS. And, for obvious reasons, running DBM::Deep using MySQL as a backend is faster - much faster.

I have an idea for how to improve transactions over InnoDB to avoid AUTO_INCREMENT keys by using UUIDs. But, until I see someone actually using this in prod, it'll remain just an idea.

Saturday, February 20, 2010

Software as complexity management (part 2)

(q.v. part 1)

All pieces of software have one thing in common - they change over time. Whether it's a new feature, a bug report, or a feature that's implemented over 3 releases, these changes affect our baby in so many ways. Managing those changes is probably the most important thing a developer does.

Change is a problem because it introduces risk - the risk of a bug. Every change could introduce one, so managing change is best done by reducing the risk of every change.

The only way to reduce risk is to reduce the scope of the risk. The biggest risk any change has is in how many workflows the change can potentially affect. So, you want to keep the scope of every line of code as tight as possible. Any change to any line in a scope (block, function, etc) has the same area of effect as every other line in the same scope. So, if one line of code in a function references a global variable, then any change to any line in that function has the potential to affect that global variable.

This is sometimes called coupling, but I think that coupling isn't a strong enough concept. The best analogy is taint. Let's say we have global variable %BAD_IDEA. It's referenced in functions bad_foo() and bad_bar(). If any line in bad_foo() is changed, then every workflow that uses bad_foo() must be tested (duh!). But, every workflow that uses bad_bar() must also be tested. And, if bad_bar() uses another global variable, then every workflow that uses any function that uses that second global variable must also be tested. All from changing a line in bad_foo().
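The taint analogy is easy to demonstrate in a few lines (a contrived sketch; %BAD_IDEA and the function names come from the paragraph above):

```perl
use strict;
use warnings;

# One shared global couples two otherwise-unrelated functions: changing
# what bad_foo() writes silently changes what bad_bar() returns.
our %BAD_IDEA = ( count => 0 );

sub bad_foo {
    $BAD_IDEA{count} += 10;    # a "local" change here...
    return $BAD_IDEA{count};
}

sub bad_bar {
    return $BAD_IDEA{count} * 2;    # ...alters this result too
}

bad_foo();
print bad_bar(), "\n";    # prints 20, but only because bad_foo() ran first
```

Every caller of bad_bar() now depends on the edit history of bad_foo() - exactly the area-of-effect problem described above.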

It becomes pretty obvious pretty quickly that global variables end up tainting whole swathes of code - which is why global variables should be avoided at all costs. The next part is about objects and how they encourage the use of global variables.

Thursday, February 18, 2010

Software as complexity management (part 1)

Software exists to solve problems, primarily problems in business. Solutions to those problems pay for the people who write the software that solves them. Even free software runs on corporate money. Especially free software.

If we want to write better software, we need to have a solid understanding of what a problem is. Instead of problems, it's useful to talk about the problemspace. I often find it useful to imagine a problem as a surface in N-dimensional space, each dimension representing a competing concern. For me, the surface describes the complexity floor of the solutionspace. It's the theoretically ideal solution - the absolute zero of software perfection.

So, the ideal software is the implementation that is closest to the complexity floor. Can we describe the complexity axes? Can we describe some rules for how to minimize them? What do you think?

Tuesday, February 16, 2010

How useful is a debugger, really?

(Inspired by this and this and informed by Linus.)

From the get-go, there's been a Perl debugger and lots of people think that it's a "Good Thing"™. Me, I've never seen the point. Yeah, I've used it. I know how to set breakpoints and step into functions and all that. I've used perl -d, gdb, the VB debugger, and the debuggers in both Firefox and IE (version 5-8). And I hate them all.

Well, hate is a strong word. Frankly, I don't see the point. There has never been a programming problem I've faced where using the debugger helped me. At all. In any language. Ever.

In 2000, I was lucky to be mentored by Ted Vessenes when we were working at Motorola. Ted is one of the most phenomenal programmers I've ever met and I know a bunch of brilliant people. Just to give you an idea of how freaking awesome he is, this is the guy that played so much Unreal Tournament that he got pissed at the AI and built a new one. Not a perfect player, but a perfect human player. His AI ended up being so good that he had to mess up his algorithms so that other people knew they were playing an AI. He literally passed the UT3 Turing Test.

Ted once told me "the most powerful debugging tool is print." (He actually told me this a bunch of times.) Or, restated - "If you need anything more than the print statement to debug a problem, then you don't understand the problem enough to solve it."

That means you, Mr. debugger. If I have to reach for you, that means I don't have enough context, knowledge, or whatever else I might need to solve the problem. In fact, I use my desire for a debugger as a litmus test. Anytime I want a debugger, I know I'm too stupid to fix it.

(Frankly, this goes for IDEs, too. If I need anything but vim to manage my project, then my project isn't organized properly.)

Monday, February 15, 2010

Why choose Java over Perl?

I was talking about Perl and Java in the Enterprise with frew on IRC tonight. We never actually discussed the topic at hand. Instead, we ended up talking about szabgab's comment.

I think the primary reason is that Perl is too individualizable, by far. Yes, I said too individualizable. That strength of Perl - TMTOWTDI - is its biggest weakness when it comes to working in groups.

When it comes to programming applications, the key feature of a good workflow is managing change. Every application will have change come down the pike. A new feature, an overeager salesman, a bug found, or an idiotic CTO - these all bring about changes. A good workflow will build an application that manages these changes without compromising the application's ability to manage subsequent changes. A team of decent Perl developers will easily be able to build an application that does this. But, how long is the ramp-up for a new developer? What happens when one of the initial developers leaves?

Java, on the other hand, does not give you all that freedom in how you build your application. There is no TMTOWTDI Ave. in Java City. This inability to express yourself is a strength when it comes to managing turnover. Every Java application in existence looks a lot like every other Java application.

Furthermore, Java offers a set of guarantees that Perl simply cannot do because of the design tradeoffs inherent in the language. For example, there is absolutely no way to declare something invariant in Perl. In Java, invariants are a normal part of the language. Perl also has no way of guaranteeing the type of a given variable. Java and Haskell both have this feature. You cannot turn off monkey-patching or any of the other runtime poking that is one of Perl's greatest strengths, not even lexically.
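As a concrete illustration of that last point (a minimal sketch, not a recommendation), this is the kind of runtime redefinition that Perl permits everywhere and Java permits nowhere:

```perl
use strict;
use warnings;
no warnings 'redefine';    # silence the warning for the deliberate patch below

sub greet { return "hello" }

# Any code, in any package, can replace the sub at runtime by assigning
# to the typeglob - and nothing in the language can forbid it, even lexically.
*main::greet = sub { return "patched" };

print greet(), "\n";    # prints "patched"
```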

And, to top it all off, there's such a huge range in which language features developers know and use that a Perl novice simply cannot comprehend a Perl master's code, let alone a guru's. Every Java developer can read every other Java developer's code.

Given that some programming teams turn over a developer every week, which language should they choose?