Hope for ownCloud?

ownCloud is giving us headaches. The package got removed from Debian due to hostile upstream behavior, and we rely heavily on stable, packaged software: we don’t have the resources to deal with new versions of this and that software trickling in every month. Many people, including me, have debated ownCloud’s behavior at length. But now, maybe something is happening: Frank Karlitschek, initiator and project leader of ownCloud, has left ownCloud Inc., where he worked as CTO. Maybe this is not unrelated to what I was discussing before:

I thought a lot about this situation. Without sharing too much, there are some moral questions popping up for me. Who owns the community? Who owns ownCloud itself? And what matters more, short term money or long term responsibility and growth? Is ownCloud just another company or do we also have to answer to the hundreds of volunteers who contribute and make it what it is today?
These questions brought me to the very tough decisions: I have decided to leave my own company today. Yes, I handed in my resignation and will no longer work for ownCloud, Inc.

And maybe there is hope:

There is tremendous potential in ownCloud and it is an open source product protected by the AGPL license. […] Stay tuned, as more news is coming!

Apt Purge os-prober!

If you are getting kernel errors like

EXT4-fs (sda2): unable to read superblock
EXT4-fs (sda2): unable to read superblock
EXT4-fs (sda2): unable to read superblock
FAT-fs (sda2): bogus number of reserved sectors
FAT-fs (sda2): bogus number of reserved sectors
qnx4: no qnx4 filesystem (no root dir).

for almost every storage-like device in /dev/, you know something has to be wrong! However, we ignored those messages for months since everything seemed to be fine. Still, I was concerned.

Yesterday I decided to look into this again. Between the errors quoted above, there was a kernel warning saying

>>>WARNING<<< Wrong ufstype may corrupt your filesystem,

which didn’t sound any better. Finally I found Debian Bug #788062, “os-prober corrupts LVs/partitions while being mounted inside a VM”. And indeed, our suspicious log entries start with

"debug: running /usr/lib/os-probes/50mounted-tests on /dev/sda2"

On kernel updates or manual update-grub runs, this bug might corrupt your storage or remount your file systems read-only, leaving you with no idea where to look for the problem.

And the moral of this story: apt purge os-prober on your servers and don’t expect debian-boot to react to such a bug within 6 months.

Update: I have noticed that the general behavior was reported to the Debian BTS back in December 2014, but it was considered “entirely cosmetic”. It has also been on Launchpad (Ub***u) since 2014; however, back then no severe impact had been reported.

Web Encryption Starts Moving

HTTPS, X.509, SSL, TLS, STARTTLS, SNI, OpenSSL, DNSSEC: Web encryption is (still) painful. It is not that there is a single problem with web encryption; rather, not a single thing has been solved properly. But maybe we are reaching the state where enough wrappers cover the horrors of the past. At least as long as nobody looks beneath the shell again.

Doing web encryption is still almost a synonym for using OpenSSL, at least on the server side. This also holds for the task of managing keys and certificates. There are Python, PHP and even PostgreSQL wrappers for certain OpenSSL tasks, but in practice those tools turn out to be quite incomplete and clumsy. To verify a decent SSL setup one has to rely on services like SSL Labs. Maybe as a consequence, (Open)SSL integration in software is bad. For me, Apache2’s mod_ssl is a good bad example: if your configuration contains a certificate which is not valid for a configured key, the whole web server will crash on a graceful reload. Now, Let’s Encrypt provides a Python client which also serves as a wrapper around the mod_ssl configuration.

MySQL’s SSL support is a nightmare. Our MySQL SSL auth setup just stopped working overnight with little chance to debug anything. This was one of many reasons to migrate our setup to PostgreSQL. PostgreSQL provides a fine-grained and transparent (Open)SSL feature set with reasonable error messages, giving a decent example of how SSL can work. However, where the server works, the clients start to act up. Roundcube webmail, for example, strips all the advanced options from database connections: if you configure SSL security measures for your database connection, they will be silently ignored.
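
As a small illustration of that transparency, the server itself can tell you whether connections are actually encrypted. A minimal sketch, assuming PostgreSQL 9.5 or later, where the pg_stat_ssl view is available:

-- One row per client connection; "ssl" tells whether the connection is
-- encrypted, "cipher" and "bits" show what was negotiated.
SELECT pid, ssl, version, cipher, bits FROM pg_stat_ssl;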

Last but not least, certificate issuance is broken, obviously. Let’s Encrypt is doing many things right and the ecosystem may improve. Transparency logs and the observation of certificate issuance are also a big step. However, the next building blocks for web encryption, like DNSSEC, currently set even higher hurdles for system administrators. While hackers never tire of promoting encryption for everyone, the tools are just not there. Proper encryption has a history of complicated and time-consuming solutions, reserving it for organizations and companies with the manpower to work around this pile of shards and to keep up with the evolution of encryption methods. Hopefully, the new momentum in web encryption, especially around Let’s Encrypt, will make wider adoption feasible some day.

GitLab

We have been running GitLab CE via the “omnibus” (whatever that is) Debian package since its availability in May 2015. Due to GitLab’s version policy we are constantly upgrading our installation, but we have only run into minor problems with this approach. Recent examples are:

  • Backup broke (workaround available, fixed after two days)
  • Admin page broke (workaround available)

What gives me confidence in our setup are the very short reaction times on the GitLab bug tracker. This includes fast fixes via new versions and the availability of workarounds. However, for more critical infrastructure it would be wise to delay non-critical updates for some weeks.

PostgreSQL Arrays

Before I start the week, let’s wrap up the weekend. I was hacking on HamSql and got into trouble with PostgreSQL arrays again. Recently, I stumbled over misleading documentation for array operators. I wanted to report this issue and remembered that the PostgreSQL community works without a bug tracker. But don’t be scared: I got a really fast and kind reaction on the mailing list, and as a result the 9.5 documentation covers those pitfalls explicitly. This weekend I hit the PostgreSQL “array lower bound feature”. Subscripts of PostgreSQL arrays start at 1, but that is only a default; you can set the lower bound to any number you like. I forgot this feature immediately after reading the docs and would have ignored it forever if PostgreSQL internals did not use it themselves from time to time. It comes as no surprise that many client libraries don’t know how to handle arbitrary lower bounds for arrays. Unfortunately, the postgresql-simple Haskell library is no exception. It took me some time to realize the problem, as the error message it issued was not that helpful.
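
A minimal sketch of the feature itself (plain SQL, the element values are just placeholders):

-- The default lower bound is 1, but an array literal can carry its own bounds.
SELECT array_lower('{1,2}'::int[], 1);        -- 1 (the default)
SELECT array_lower('[0:1]={1,2}'::int[], 1);  -- 0: same elements, different subscripts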

While trying to work around this bug I hit another array function corner case. The documentation states for array concatenation that “the result retains the lower bound subscript of the left-hand operand’s outer dimension”. Hence, I should be able to fix the problem using something like '{0}'::int[] || '[0:1]={1,2}'. And this works just fine. So let’s just take an empty array on the left side: '{}'::int[] || '[0:1]={1,2}'. Booom! This acts as the identity and leaves the bounds untouched. I am using ARRAY(SELECT UNNEST(...)) to reset the lower bound now. Not sure if I should report this || operator issue too.
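
For the record, a sketch of the documented behaviour, the corner case, and the workaround (again with placeholder element values):

-- Non-empty left operand: the result takes its lower bound, as documented.
SELECT array_lower('{0}'::int[] || '[0:1]={1,2}'::int[], 1);        -- 1
-- Empty left operand: || acts as the identity and keeps the original bound.
SELECT array_lower('{}'::int[] || '[0:1]={1,2}'::int[], 1);         -- still 0
-- Workaround: re-packing via UNNEST resets the lower bound to 1.
SELECT array_lower(ARRAY(SELECT UNNEST('[0:1]={1,2}'::int[])), 1);  -- 1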

Status: Mailserver (SMTP) offline

13:44h: Our mail server is occasionally failing to deliver mails. The server has been taken offline for investigation.

13:48h: The server should be working again without problems. Updates on the exact impact will follow.

13:55h: Between yesterday (17 Oct) 00:10 and today 13:48, some incoming e-mails from other servers were rejected with an error message, and some outgoing e-mails were not delivered without any error message. We will send out the error notifications retroactively in the near future.

20:40h: Both recipients and senders have been specifically informed about the lost e-mails.