Hope for ownCloud?

ownCloud is giving us headaches. The package got removed from Debian due to hostile upstream behavior. We rely heavily on stable, packaged software; we don’t have the resources to deal with new upstream versions of this and that software trickling in every month. Many people, including me, have debated ownCloud’s behavior at length. But now, maybe something is happening: Frank Karlitschek, initiator and project leader of ownCloud, has left ownCloud Inc., where he worked as CTO. Maybe this is not unrelated to what I was discussing before:

I thought a lot about this situation. Without sharing too much, there are some moral questions popping up for me. Who owns the community? Who owns ownCloud itself? And what matters more, short term money or long term responsibility and growth? Is ownCloud just another company or do we also have to answer to the hundreds of volunteers who contribute and make it what it is today?
These questions brought me to the very tough decisions: I have decided to leave my own company today. Yes, I handed in my resignation and will no longer work for ownCloud, Inc.

And maybe there is hope:

There is tremendous potential in ownCloud and it is an open source product protected by the AGPL license. […] Stay tuned, as more news is coming!

Apt Purge os-prober!

If you are getting kernel errors like

EXT4-fs (sda2): unable to read superblock
EXT4-fs (sda2): unable to read superblock
EXT4-fs (sda2): unable to read superblock
FAT-fs (sda2): bogus number of reserved sectors
FAT-fs (sda2): bogus number of reserved sectors
qnx4: no qnx4 filesystem (no root dir).

for almost every storage-like device in /dev/, you know something has to be wrong! However, we ignored those messages for months since everything seemed to be fine. Still, I was concerned.

Yesterday I decided to look into this again. Between the errors quoted above, there was a kernel warning saying

>>>WARNING<<< Wrong ufstype may corrupt your filesystem,

which didn’t sound any better. Finally I found Debian Bug #788062, “os-prober corrupts LVs/partitions while being mounted inside a VM”. And indeed, our suspicious log entries start with

"debug: running /usr/lib/os-probes/50mounted-tests on /dev/sda2"

On kernel updates or manual update-grub runs this bug might corrupt your storage or put your file systems into read-only mode, without giving you any idea where to look for the problem.

And the moral of this story: apt purge os-prober on your servers and don’t expect debian-boot to react to such a bug within 6 months.

Update: I have noticed that the general behavior was reported to the Debian BTS back in Dec 2014, but it was considered “entirely cosmetic”. It has also been on Launchpad (Ub***u) since 2014. However, back then no severe impact had been reported.

Web Encryption Starts Moving

HTTPS, X.509, SSL, TLS, STARTTLS, SNI, OpenSSL, DNSSEC: Web encryption is (still) painful. Web encryption does not have a single problem; rather, there is not a single thing about it that has been solved properly. But maybe we are currently reaching the state where enough wrappers cover the horrors of the past. At least as long as nobody looks beneath the shell again.

Doing web encryption is still almost a synonym for using OpenSSL, at least on the server side. This also holds for the task of managing keys and certificates. There are Python, PHP and even PostgreSQL wrappers for certain OpenSSL tasks, but in practice those tools turn out to be quite incomplete and clumsy. To ensure a decent SSL setup one has to rely on services like SSL Labs. Maybe as a consequence, (Open)SSL integration in software is bad. For me, Apache2’s mod_ssl is a good bad example: if your configuration contains a certificate which is not valid for a configured key, the complete web server will crash on a graceful reload. Now, Let’s Encrypt provides a Python client which also serves as a wrapper around the mod_ssl configuration.
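This is not what the Let’s Encrypt client does, just a minimal sketch of the kind of sanity check that would avoid the mod_ssl failure mode described above: verify that certificate and private key actually belong together before triggering a reload. It assumes the Python “cryptography” package; the file paths are placeholders.

from cryptography import x509
from cryptography.hazmat.backends import default_backend
from cryptography.hazmat.primitives.serialization import load_pem_private_key

def cert_matches_key(cert_path, key_path):
    # Load the certificate and the (unencrypted) private key from PEM files.
    with open(cert_path, "rb") as f:
        cert = x509.load_pem_x509_certificate(f.read(), default_backend())
    with open(key_path, "rb") as f:
        key = load_pem_private_key(f.read(), password=None, backend=default_backend())
    # Certificate and key belong together iff their public parts are identical.
    return cert.public_key().public_numbers() == key.public_key().public_numbers()

if not cert_matches_key("/etc/ssl/example.crt", "/etc/ssl/example.key"):
    raise SystemExit("certificate does not match key, refusing to reload Apache")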

MySQL’s SSL support is a nightmare. Our MySQL SSL auth setup just stopped working overnight with little chance to debug anything. This was one of many reasons to migrate our setup to PostgreSQL. PostgreSQL provides a fine-grained and transparent (Open)SSL feature set with reasonable error messages, giving a decent example of how SSL can work. However, even when the server is working, the client is there to play up. For example, Roundcube webmail strips all the advanced options from database connections. If you configure SSL security measures for your database connection, they will be silently ignored.
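For contrast, here is a minimal sketch, assuming psycopg2 and placeholder host, database and CA path, of a client that actually passes the SSL options through to libpq instead of dropping them:

import psycopg2

conn = psycopg2.connect(
    host="db.example.org",
    dbname="mail",
    user="roundcube",
    sslmode="verify-full",                    # require TLS and verify the server's host name
    sslrootcert="/etc/ssl/certs/db-ca.pem",   # CA certificate used to validate the server
)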

Last but not least, certificate issuance is broken, obviously. Let’s Encrypt is doing many things right and the ecosystem may improve. Transparency logs and the observation of certificate issuance are also a big step. However, the next building blocks for web encryption, like DNSSEC, currently set even higher hurdles for system administrators. While hackers never tire of promoting encryption for everyone, the tools are just not there. Proper encryption has a history of complicated and time-consuming solutions, reserving it for organizations and companies with the manpower to work around this pile of shards and to keep up with the evolution of encryption methods. Hopefully, the new momentum in web encryption, especially around Let’s Encrypt, will make a wider adoption feasible, some day.

GitLab

We have been running GitLab CE via the “omnibus” (whatever that is) Debian package since its availability in May 2015. Due to GitLab’s version policy we are constantly upgrading our installation. However, we have only run into minor problems with this approach. Recent examples are:

  • Backup broke (workaround available, fixed after two days)
  • Admin page broke (workaround available)

What gives me confidence in our setup are the very short reaction times on the GitLab bug tracker. This includes fast fixes via new versions and the availability of workarounds. However, for more critical infrastructure it would be wise to delay non-critical updates for some weeks.

A tale of bytes and strings in python3’s smtplib

The one feature that made me move almost all my projects from python2 to python3 is the vastly improved handling of encoding stuff. In python2, I was never sure whether I needed to throw in a .decode or an .encode, and with which arguments, to make things work. All my üs and äs would end up as weird characters, so I would try an .encode, which sometimes solved it and sometimes made it weirder yet. So I would try .decode instead, which then sometimes solved it and sometimes didn’t. It was not fun.

Now, for python3, the story is much better and cleaner, since I either have unicode strings, which I can print and everything, or I have bytes, which are just bytes and need to be decoded before they can be treated as strings. Standard library functions in python3 take and return either strings or bytes. Take, for example, the open() call: depending on the mode, it returns bytes or strings. If I try to write bytes to a file opened in string mode, I get a TypeError. So everything is warm and nice and I get type errors if I do stupid things, and then I immediately know whether I have to decode or encode.
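A tiny illustration of that split, with a throwaway file name:

# Text mode expects str and refuses bytes with a TypeError.
with open("example.txt", "w") as f:
    f.write("Grüße\n")                      # fine, encoded for me
    try:
        f.write("Grüße\n".encode("utf-8"))  # bytes into a text-mode file ...
    except TypeError as err:
        print(err)                          # ... fails loudly instead of corrupting anything

# Binary mode returns bytes, which have to be decoded explicitly.
with open("example.txt", "rb") as f:
    print(f.read().decode("utf-8"))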

So I wrote a small program which takes a mail on stdin and passes it via LMTP to dovecot, using python3’s smtplib. Everything worked, no type errors anywhere, and I even tested it by sending some weird characters in an email. It worked. I deployed to the hemio mail server. A few days later, I get an SMS in the morning: We are losing mails! Just silently dropping them. WHAT? That’s of course the worst possible thing you can do as a mail server. After shutting down the mail server to prevent further breakage, I check the logs to see what was happening. The traceback I see gives me flashbacks to python2:

Traceback (most recent call last):
  File "/usr/local/lib/lda-lmtp.py", line 163, in <module>
    exitcode = main(args)
  File "/usr/local/lib/lda-lmtp.py", line 57, in main
    msg = sys.stdin.read()
  File "/usr/lib/python3.4/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 13114: ordinal not in range(128)

What? How do I get a UnicodeDecodeError? I thought I was passing unicode strings around all the time, why decode? Checking the documentation of smtplib.SMTP.sendmail, it says:

…msg may be a string containing characters in the ASCII range, or a byte string. A string is encoded to bytes using the ascii codec…

So: smtplib.SMTP.sendmail wants bytes. However, if you pass a string instead, it will silently .encode it using the ‘ascii’ codec. WHY? One of the features of python3 is that you have to consciously decide if you want to encode or decode, instead of the willy-nilly casting/one-type-fits-all of python2. But, helpful as ever, smtplib just ascii-encodes your msg for you. Which will barf on interesting characters. Which ended up just dropping mails. Not nice.
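A quick demonstration of that behavior (server, port and addresses are placeholders): handing sendmail() a str with anything outside ASCII blows up in the implicit .encode('ascii').

import smtplib

msg = "Subject: test\n\nGrüße aus dem Mailversand\n"   # a str containing non-ASCII characters

with smtplib.SMTP("localhost", 25) as smtp:
    smtp.sendmail("sender@example.org", ["rcpt@example.org"], msg)
    # -> UnicodeEncodeError: 'ascii' codec can't encode character '\xfc' ...
    # smtp.sendmail(..., msg.encode("utf-8")) would hand over the raw bytes instead.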

The fix was easy: I re-opened stdin in binary mode and just read in the mails as binary directly, so that my program never has to think about encodings and strings. But I am very confused why smtplib is going out of its way to confuse python3 developers. If you can only deal with bytes, just accept bytes. Throw a TypeError if you are given strings. Don’t silently ascii-encode a given string. It hurts, it loses mails and virtual kittens die!
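Roughly what the fix boils down to, as a sketch: read the raw message from the binary view of stdin and hand bytes to smtplib, so no implicit ascii step can ever happen. Host, port and addresses are placeholders.

import sys
import smtplib

raw_msg = sys.stdin.buffer.read()   # bytes, untouched by any codec

# Deliver via LMTP; the bytes go out exactly as they came in.
with smtplib.LMTP("localhost", 24) as lmtp:
    lmtp.sendmail("sender@example.org", ["rcpt@example.org"], raw_msg)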

And why didn’t my testing catch that earlier? Well, my mail program, claws-mail, automatically encodes all outgoing mails with a 7-bit transfer encoding, so I never actually tested 8BITMIME. -_-

PostgreSQL Arrays

Before I start the week, let’s wrap up the weekend. I was hacking on HamSql and got into trouble with PostgreSQL arrays again. Recently, I stumbled over misleading documentation for array operators. I wanted to report this issue and remembered that the PostgreSQL community works without a bug tracker. But don’t be scared: I got a really fast and kind reaction on the mailing list, and as a result the 9.5 documentation now covers those pitfalls explicitly.

This weekend I hit the PostgreSQL “array lower bound feature”. The index of PostgreSQL arrays starts at 1, but that’s only a default; you can set the lower bound of an array to any number you like. I forgot this feature immediately after reading the docs and would have ignored it forever, if PostgreSQL internals did not make use of it themselves sometimes. It comes as no surprise that many client libraries don’t know how to handle arbitrary lower bounds for arrays. Unfortunately, the postgresql-simple Haskell library is no exception. It took me some time to realize the problem, as the error message it issued was not that helpful.

While trying to work around this bug I hit another array function corner case. The documentation states for array concatenation that “the result retains the lower bound subscript of the left-hand operand’s outer dimension”. Hence, I should be able to fix the problem using something like '{0}'::int[] || '[0:1]={1,2}', and this works just fine. So let’s just take an empty array on the left side: '{}'::int[] || '[0:1]={1,2}'. Booom! This acts as identity and leaves the bounds untouched. I am using ARRAY(SELECT UNNEST(...)) to reset the lower bound now. Not sure if I should report this || operator issue too.
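A small sketch of the whole affair, assuming psycopg2 and a reachable placeholder database; only the lower bounds are queried, so the client never has to parse the odd arrays itself:

import psycopg2

conn = psycopg2.connect("dbname=test")    # placeholder connection string
cur = conn.cursor()

# '[0:1]={1,2}' is an int[] whose subscripts run from 0 to 1 instead of 1 to 2.
cur.execute("SELECT array_lower('[0:1]={1,2}'::int[], 1)")
print(cur.fetchone())   # (0,)

# A non-empty left operand donates its lower bound, as documented ...
cur.execute("SELECT array_lower('{0}'::int[] || '[0:1]={1,2}'::int[], 1)")
print(cur.fetchone())   # (1,)

# ... but an empty left operand acts as identity and keeps the odd bound.
cur.execute("SELECT array_lower('{}'::int[] || '[0:1]={1,2}'::int[], 1)")
print(cur.fetchone())   # (0,)

# Rebuilding the array resets the subscripts to start at 1 again.
cur.execute("SELECT array_lower(ARRAY(SELECT unnest('[0:1]={1,2}'::int[])), 1)")
print(cur.fetchone())   # (1,)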