Sunday, July 07, 2013

On contributing to MySQL


Dave Stokes has just written that MySQL is Looking for External Contributions. The first comments on that were negative, saying that forcing developers to sign a contributor agreement is not friendly, and that MySQL developers don't play well with external contributors.

To be fair, it is not Oracle that has an unfriendly policy about contributions. It was already like this with MySQL AB, and the reason is simply that the company wants to maintain ownership of the code, so that it will be able to sell dual licensing agreements.
I may not like it, but licensing was still a substantial part of the business when I left Oracle, and I assume it still is. Since this “feature” helps pay the developers who create open source software, I believe it is a reasonable trade-off.
Besides, also MontyProgram asks the same thing (https://kb.askmonty.org/en/mca/), and so do Apache (http://www.apache.org/licenses/icla.txt) and Canonical (http://www.canonical.com/contributors).
It is a reasonable request. Not only does it allow the main project to retain ownership of the code, which is often a requirement for generating further revenue, but it also gives the project some legal protection when a contributor submits code that was created by others.

All in all, it’s a good thing that Oracle is encouraging contributions!

About the MySQL developers not playing well with external contributors: this is also nothing new. To my chagrin, when I was working with MySQL I realized early on that not all developers have a mindset that plays well with community participation. That is only to be expected. I know brilliant developers who can deliver exceptional code, but who don't understand the concept of coding in an open environment, or lack the ability to cooperate with developers who are not colleagues. Is that a bad thing? For some, yes, but I don't think so. If the final result is that the community gets great code through the effort of developers who are not socially oriented, I still take it and say "thank you!"

As a community manager at MySQL, I tried to improve MySQL's attitude towards external contributors, but I had to acknowledge that we could not force the developers to work against their inclination. There were (and still are!) some developers who play well with external contributors, but forcing everyone to work with contributors outside the company might have endangered productivity. The developers at MySQL were hired for their technical skills, and with few exceptions the choice has proven right, as they are developers capable of delivering first-class software, which is used in mission-critical applications by countless businesses.

So, there are proposals that are rejected by the developers. This is a bad sign for the contributor, but a good sign for the users. A rejected proposal means that someone took the time to examine it and weighed the options: accepting the contribution, writing the same feature from scratch, or rejecting it outright because it is not worth it.

Let me clarify, because this point is not widely known. This description refers to the process as I knew it 3 years ago. It might have changed since, but I doubt it has changed for the worse.
For a feature to get into the MySQL code, whoever wrote it, whether an employee of MySQL or an external contributor, it must pass three obstacles:
  1. There must be tests for the code provided. A feature that does not include tests is rejected.
  2. The code must be reviewed by one or two senior developers. The first thing the reviewers check (besides whether tests are included) is whether the code duplicates something that was already done. If it does, they suggest rewriting the code so that it uses existing features rather than creating new ones.
  3. The code must pass the test suites on all platforms supported by MySQL. This is more easily said than done. External contributors test on their preferred operating system, and then they think they are done. But MySQL has a testing farm that submits every proposed commit to the full range of operating systems that are actively supported. So chances are that a proposed contribution breaks the tests on one or two operating systems. At that point, the employees who have been reviewing the code can either ask the original developer to change the code so that it is multi-platform friendly, or they can do it themselves.

Having supervised many such exchanges between employees and external contributors, I know for a fact that contributing is hard work for both sides. Since MySQL has a policy of high quality (mandatory tests, code review, tests passing on all platforms), every piece of code submitted to trunk is the result of a long process, which every would-be contributor must be aware of and willing to endure.

One more thing comes to mind in this context. An additional difficulty of contributing to MySQL is that the MySQL code is quite intricate. It was initially created when resources were scarce, and thus it includes design decisions that made sense 15 years ago, but that would be made quite differently nowadays. The Drizzle project exposed most of these issues when it stripped out most of the legacy code and implemented the database server in a more agile way, at the same time making it easier for contributors. But it did that at the expense of breaking compatibility with the past. Since MySQL gets most of its popularity from being already widely adopted and available, breaking compatibility is not a path that the project wants to walk now. I know that the developers are refactoring the code to make it more modern and manageable, but they are doing that while trying not to break anything that works well today. It's a slow process, and some may not like it. But it's a sign of continuous progress, for which the community should be grateful. As I am.

9 comments:

Hartmut said...

Have to confess that i'm not done reading yet, just a quick comment though:

"Besides, also MontyProgram asks the same thing ..."

The key difference here is that this is only one of two options.
The other one, and this is the key difference as Antony Curtis also already mentioned, is to provide code under BSD-new License terms on a case by case basis instead of committing to a general contributors agreement:

https://kb.askmonty.org/en/community-contributing-to-the-mariadb-project/

"Code Licensing

Similar to other open source projects, Monty Program Ab needs to have a shared ownership of the code that is included in the MariaDB distribution. This can be done by submitting your code under the BSD-new license. The only currently known exceptions to this rule are storage engines and code that is loadable through a plugin. For these, it's enough that the code is GPL.

If you want to submit code under a license other than BSD-new, sign and email the Monty Program Contributor Agreement."

Hartmut said...

"I know that the developers are refactoring the code to make it more modern and manageable, but they are doing that while trying not to break anything that works well today."

This part would be more compelling if this would happen in the open, with up-to-date public development trees on launchpad or elsewhere.

In the current situation it is rather an argument against outside contributions as you'll never know in what interesting ways the code you have been working on may change in the next release ...

Giuseppe Maxia said...

@Hartmut,
The point of complaint was "Allow me to contribute code with a BSD (3-clause) license without having to sign anything new and then we can talk."
Thus I mentioned that a contributor agreement is an established basis of many open source projects.
I am not a lawyer, but I see that the Oracle Contributor Agreement asks for joint ownership, meaning that you can keep your code under whatever license you want.
The OCA says "you agree that each of us can do all things in relation to your contribution as if each of us were the sole owners."
I haven't seen any mention of GPL or BSD or any other license being requested or forbidden. Unless Oracle states explicitly that they accept contributions only under a specific license, I don't see the reason for complaint.

Giuseppe Maxia said...

@Hartmut,
I would also like development and refactoring to happen in the open, but let me remind you that the MySQL team has never been a good social player. So, realistically, I take what I get. The main point for me is that MySQL is provided as open source and its quality is overall increasing (I have my reservations on some features, but that's beside the point). It is open enough to allow forks to flourish.

Hartmut said...

This is not about which license to use, this is about "do i have to commit to an extra agreement" vs. "is a suitable license choice on my side sufficient" ...

This is not about the content of the contributor agreement, it is about the extra legal hurdles this can cause ... especially if you need your employer to sign off on it

(which i don't think i personally do ... but that is a different story)

Giuseppe Maxia said...

@Hartmut,
I understood the point, and that's why I mentioned that other open source projects require the same amount of legal paperwork, under the "better safe than sorry" flag.
I personally don't like it, because I hate bureaucracy, but I understand why it is necessary.

Hartmut said...

"So, realistically, I take what I get."

a) things got even worse in that respect under Oracle

b) there is a less bad alternative to work with (or actually several if you don't strive for full compatibility)

And even when only considering the compatible options: as far as i can tell all of them but one would be fine with BSD-new licensed patches ... why would i want to jump through extra hoops for just that one?

Unlike in the older days there *are* viable alternatives now ...

Giuseppe Maxia said...

@Hartmut,
You seem to believe that I have some power to convince Oracle to do The Right Thing. Sadly, I don't.

But I like to understand how things work. And I know that if there is not a business or a legal reason for a change, Oracle won't do it.

Hartmut said...

I don't understand how you jump to that conclusion?

I was just pointing out that, no, Monty Program does not require a contribution agreement in the same way ... and that yes, BSD-new without additional agreement required *does* make a difference ... and one other minor problem ...

So as far as contributor agreements go I think I was trying to contribute to your understanding of how things work ...

As for Oracle not changing their ways we can probably agree on that, and that renders that whole effort of asking for more contributions sort of moot ...

As for "business / legal reason": as Antony pointed out there are other projects where Oracle seems to be fine with BSD-new contributions without extra contribution agreements. So yes, business or legal reasons may be the only lever that may cause change, but one could at least wish for an otherwise very much central policy driven company like that to come up with more consistent policies in this respect, too ...

So while i'm wishing them all good luck with this "more contributions" effort and i don't see anything specifically evil in it (like some may do) I'm in Goethe's "Faust" mode once again:

"The message well I hear, my faith alone is weak"