Tag Archives: Linux

Lessons learned with Docker, Nodejs apps and volumes

Context

I have kept playing with Docker recently, just for fun and to learn.

It is very powerful, but still young. It quickly shows its limits when it comes to security or persistence. There are workarounds, but they are more or less complex and more or less hacky.

Case in point: I ran into some issues integrating Etherpad, a Nodejs application, into Docker.

Initially, I made something quite simple, so my Dockerfile ended like this:

USER etherpad
CMD ["node","/opt/etherpad-lite/node_modules/ep_etherpad-lite/node/server.js"]

Thus, I simply start the app with a low-privileged user.

It worked, but I had two issues:

  1. Docker was not able to stop it nicely. Instead, it timed out after 10 sec and finally killed the app and the container altogether.
  2. No persistence of any kind, of course.

I decided to tackle these two issues to understand what was going on behind.

The PID 1 issue

I could not understand immediately the first issue: why was Docker unable to terminate the container properly?

After wandering down the wrong paths for a few hours (trying to get somewhere with Nodejs nodemon or supervisor), I finally found some good articles explaining that Docker lacks an init system to catch signals. This causes issues with applications running as PID 1, which cannot be killed, or with applications started through Bash (the shell does not forward signals to its children).

I am not going to repeat poorly what has already been explained very well, so I encourage you to read these two excellent posts:

You will also find a lot of bug reports about this issue on the Docker GitHub, along with plenty of hacky or overkill solutions.

In my opinion, the most elegant solution among them is to use a small launcher program dedicated to catching and handling signals.

I chose to use Dumb-init, as it is well packaged (there are plenty of options) and seems to be well maintained.
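
For reference, here is a minimal sketch of how dumb-init can be pulled into the image before the final CMD (the release version and URL are assumptions to adapt, and wget or curl must be available in the base image):

# hypothetical release; adapt version and architecture to your setup
RUN wget -O /usr/local/bin/dumb-init \
      https://github.com/Yelp/dumb-init/releases/download/v1.2.5/dumb-init_1.2.5_x86_64 \
 && chmod +x /usr/local/bin/dumb-init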

So, after installing dumb-init in the Dockerfile, the CMD line should now look like this:

USER etherpad
CMD ["dumb-init","node","/opt/etherpad-lite/node_modules/ep_etherpad-lite/node/server.js"]

And indeed, as expected, docker stop now works flawlessly.

Volume permissions

This is where I had the toughest issue, although it is supposed to be straightforward with volumes.

Volumes make it possible to share files or folders between the host and containers, or between containers only. There are plenty of possibilities, nicely illustrated on this blog:

And it works very well… as long as your application runs as root.

In my case, for instance, Etherpad runs as a low-privileged user, which is highly recommended. At startup, it creates a SQLite database, etherpad.db, in its ./var folder.

Mounting a volume of any kind over the ./var folder results in a folder with root-only permissions. Consequently, of course, launching Etherpad from the CMD command fails miserably.

Simple solutions like chown in the Dockerfile don’t work, because they apply before the mount. The mount occurs at runtime and works like a standard Linux mount: it is created by the docker daemon, with root permissions, over possibly existing data.

My solution was to completely change the way Etherpad is started. I now use an external script which is started at runtime:

  1. First, it applies the appropriate permissions to the mounted volume with chown,
  2. Then, it starts Etherpad with a low-privileged user thanks to a su hack.

So now the Dockerfile ends with:

VOLUME /opt/etherpad-lite/var
ADD run-docker.sh ./bin/
CMD ["./bin/run-docker.sh"]

And here is the script:

#!/bin/bash

chown -R etherpad:etherpad /opt/etherpad-lite/var
su etherpad -s /bin/bash -c "dumb-init node /opt/etherpad-lite/node_modules/ep_etherpad-lite/node/server.js"

I use a data volume for persistence, so the run command looks like this:

docker run -d --name etherpad -p 80:9001 -v etherpad:/opt/etherpad-lite/var -t debian-etherpad
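
If you want to check where that named data volume actually lives on the host, or remove it, the docker volume subcommands do the job (assuming a Docker version recent enough to support named volumes, which the run command above already requires):

docker volume inspect etherpad   # shows the mountpoint on the host
docker volume rm etherpad        # deletes the data once the container is removed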

It is far from ideal, but it works. I really hope upcoming features will bring more options in this area, especially in the Dockerfile.

Some final thoughts

Overall, we can still hope for a lot of improvements on the security side, because when I look at the many Dockerfiles around, I see two behaviors:

  • A lot of people don’t care and everything is happily running as root, from unauthenticated third-party images or binaries…
  • Some people do care, but end up with dirty hacks because there is no other way.

It is scary and so far from the Linux philosophy. Let’s wait for the enhancements to come.

You can find the complete updated Dockerfile on this github page.

While we are on this topic, have a look at this nice post with some tips and tricks for Docker.

A journey with Btrfs

Why Btrfs?

I have recently tested Btrfs as the file system for my /home partition (which was previously on ext4).

I have been impressed by what this file system makes possible, but I also came to the conclusion that it is not for me.

As a quick reminder, the goal of this file system is to bring to Linux a fully featured file system similar to ZFS. Some of these features promise a lot of awesomeness: snapshots, native RAID, automatic defragmentation and repairs, etc.

Wouldn’t it be cool to have such a file system for your data? Among these features, snapshotting really is the killer one. See it as a global git for all your data: you can track the history of any file, diff versions, and revert to a chosen one, anytime and online.

Btrfs has been under development for a while and still is. However, the first stable version was finally released last year.

Many people warn that it is not production ready yet. It seems obvious for critical production systems, under heavy load or using the most advanced features (e.g. RAID). But what about a simple /home, mainly using snapshots (which have been around for a while)?

You will see that there are still some issues with virtualization.

Disclaimer 1: this is in no way a review or a benchmark of Btrfs. Consider it simply as some feedback for my specific use case.

Getting ready

This chapter is a summary of procedures found in various resources, along with my feedback.

Disclaimer 2: First of all, make several backups of your entire /home, and make sure they are complete and usable. Beware that there is obviously some inherent risk to your data in manipulating your home partition. So do not come back to insult me if you lose any data.

First, note that there is a conversion utility, btrfs-convert, to convert an existing ext4 partition to Btrfs in place. While this sounds cool, it did not work well with my partition and left me with many corrupted inodes.
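
For the record, the conversion itself is a one-liner run on the unmounted partition, and it can be rolled back as long as the saved ext4 image is kept; given my experience, keep a real backup anyway:

# umount /home
# btrfs-convert /dev/mapper/system-home
# and to roll back to ext4 if things go wrong:
# btrfs-convert -r /dev/mapper/system-home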

So my advice is to just make a good backup of your home:

% rsync -av /home /your/backup/

Then, log out and format the partition as root:

# mount | grep home
/dev/mapper/system-home on /home type ext4 (rw,noatime,data=ordered)
# umount /home
# mkfs.btrfs /dev/mapper/system-home

Change the file system and its options in /etc/fstab. For example:

/dev/system/home     /home     ext4     defaults,noatime     1 1

should become (also note that the last field changes to 0: Btrfs does not need an fsck pass at boot):

/dev/system/home   /home    btrfs  defaults,noatime,ssd,space_cache,compress=lzo    1 0

Re-mount /home and you are done!
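
A quick way to check that the new options are active after re-mounting:

# mount /home
# mount | grep home     # should now report "type btrfs" with the options set in fstab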

Snapper

My main purpose in testing Btrfs was the snapshot feature, in the hope of keeping a version history of each file and protecting against accidental deletions and changes.

Of course, one could use the Btrfs commands and implement snapshots manually. But why reinvent the wheel?

The guys behind snapper already made a service especially for that. It is basically a wrapper around Btrfs that takes automatic snapshots in the background, based on your frequency settings, and makes them easy to handle.

Once installed, it can be enabled with the following command:

# snapper -c home create-config /home

This creates a configuration file, where you can adjust the number of snapshots you want to keep per hour, day, week, month, etc. Of course, don't keep too many of them, as they will eat up free space, especially if you happen to move large amounts of data around. Hourly and daily snapshots are fine, as they are cleaned up quickly. But monthly or yearly snapshots would consume a lot of space and be pretty useless for a /home.

Here is what I used; it never consumed much more than 10 GB:

# subvolume to snapshot
SUBVOLUME="/home"

# filesystem type
FSTYPE="btrfs"

# users and groups allowed to work with config
ALLOW_USERS=""
ALLOW_GROUPS=""

# sync users and groups from ALLOW_USERS and ALLOW_GROUPS to .snapshots
# directory
SYNC_ACL="no"

# start comparing pre- and post-snapshot in background after creating
# post-snapshot
BACKGROUND_COMPARISON="yes"

# run daily number cleanup
NUMBER_CLEANUP="yes"

# limit for number cleanup
NUMBER_MIN_AGE="1800"
NUMBER_LIMIT="10"
NUMBER_LIMIT_IMPORTANT="5"

# create hourly snapshots
TIMELINE_CREATE="yes"

# cleanup hourly snapshots after some time
TIMELINE_CLEANUP="yes"

# limits for timeline cleanup
TIMELINE_MIN_AGE="1800"
TIMELINE_LIMIT_HOURLY="10"
TIMELINE_LIMIT_DAILY="7"
TIMELINE_LIMIT_WEEKLY="2"
TIMELINE_LIMIT_MONTHLY="0"
TIMELINE_LIMIT_YEARLY="0"

# cleanup empty pre-post-pairs
EMPTY_PRE_POST_CLEANUP="yes"

# limits for empty pre-post-pair cleanup
EMPTY_PRE_POST_MIN_AGE="1800"
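
One thing to double-check: timeline snapshots only appear if snapper's periodic job is actually running. Depending on the distribution, it is driven by a cron job shipped with the package or by systemd timers; here is a sketch for the systemd case, assuming your snapper package provides these units:

# systemctl enable --now snapper-timeline.timer snapper-cleanup.timer
# systemctl list-timers | grep snapper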

Now, let’s play a little. In the following sequence, we create a file containing “Hello World!”, we then create a manual snapshot, change the file and display the differences:

# vim test.txt
# snapper -c home create --description "before test"
# vim test.txt
# sudo snapper -c home list
Type   | # | Pre # | Date                     | User | Cleanup  | Description  | Userdata
-------+---+-------+--------------------------+------+----------+--------------+---------
single | 0 |       |                          | root |          | current      | 
single | 1 |       | Sun Mar 13 19:44:21 2016 | root |          | before test  | 
single | 2 |       | Sun Mar 13 19:45:12 2016 | root |          | created test | 
single | 3 |       | Sun Mar 13 19:52:39 2016 | root |          | update test  | 
single | 4 |       | Sun Mar 13 20:00:01 2016 | root | timeline | timeline     | 
single | 5 |       | Sun Mar 13 21:00:01 2016 | root | timeline | timeline     | 
single | 6 |       | Sun Mar 13 22:00:01 2016 | root | timeline | timeline     | 
# snapper -c home status 1..0
--- "/home/.snapshots/2/snapshot/phocean/test.txt" 2016-03-13 19:44:53.370641373 +0100
+++ "/home/phocean/test.txt" 2016-03-13 19:45:27.226586459 +0100
@@ -1 +1,2 @@
Hell World!
+Good bye.
@@ -0,0 +1,2 @@
+Hell World!
+Good bye

Neat, isn’t it? Now, what if we decide to restore the file to this snapshot:

snapper -c home undochange 1..0 /home/phocean/test.txt

That’s it!

Note that all these operations can be done against the entire partition (no argument needed), a folder or a file.
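
For example, the same undochange call without a path reverts everything between the two snapshots, and diff works just as well on a directory (a quick sketch):

# revert every change made since snapshot 1 (use with care)
snapper -c home undochange 1..0
# show the differences for a whole directory between two snapshots
snapper -c home diff 1..2 /home/phocean/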

Pros

Regarding regular files, I had no issue at all. After a week of intensive use, I already had the occasion to enjoy the benefits of having snapshots and being able to restore a file.

On the performance side, even though I haven't done any benchmark, it feels at least as fast as ext4. It is said that under some conditions, compression can be a big boost to read rates.

On the compression side, on my 400 GB partition, it allowed me to reclaim around 20 GB of space. Of course, the gain you can expect depends entirely on the kind of files you have (you won't gain much on files that are already compressed or encrypted).
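
Note that compression only applies to data written after the option is enabled; existing files get compressed as they are rewritten. One way to force it on data that is already there is a recursive defragmentation with the compression flag (a sketch; adjust the path and algorithm to your mount options):

# recompress existing data in place, can take a while and generate a lot of I/O
btrfs filesystem defragment -r -v -clzo /home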

Cons

As warned on the official wiki itself, you should not use Btrfs as-is with databases or virtualization solutions.

Quoting the official wiki:

Files with a lot of random writes can become heavily fragmented (10000+ extents), causing thrashing on HDDs and excessive multi-second spikes of CPU load on systems with an SSD or a large amount of RAM.

Indeed, I quickly ran into issues with VirtualBox. Under heavy I/O, with several machines running at a time, the guest file systems got corrupted more than once, and so badly that the guest machine was unrecoverable (even with snapshots). Sometimes I got plenty of ext4 errors, sometimes it just froze while copying a bunch of files or doing an apt-get upgrade…

The usual workarounds did not cut it for me:

  1. I did not even test disabling CoW for the whole partition, as it kills one of the main advantages of using Btrfs.
  2. I tried disabling CoW for the whole VM folder (see the sketch below). While corruption became less frequent, it still occurred after a while.
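
For reference, here is roughly how I disabled CoW on the VM folder. The C attribute only affects files created after it is set, so existing disk images have to be re-created as new files (the path and file name below are just an example):

# disable copy-on-write for anything created below this directory
chattr +C ~/VirtualBox\ VMs
# existing images do not inherit the attribute, so re-create them
cd ~/VirtualBox\ VMs
mv debian.vdi debian.vdi.orig
cp --reflink=never debian.vdi.orig debian.vdi && rm debian.vdi.orig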

So I would simply advise against putting any virtual machine on a Btrfs partition until this gets sorted out for good. I use virtual machines intensively at work and need them to be reliable.

Conclusion

Btrfs is awesome and pretty stable at this point, unless you need to host virtual machines. You could still have a dedicated ext4 partition for your VMs and enjoy Btrfs for the rest of your home.

To be honest, I did not bother (not wanting to manage several partitions) and switched back to ext4 for everything, while waiting for better days. I am not sure whether this should be addressed on the Btrfs side, the VirtualBox side, or both.


Installation of Metasploit on Fedora 21 / 22

Update 2015/08/04: Works on Fedora 22 too. I recently applied the exact same procedure with success.

A quick update of a previous post, for setting up Metasploit on Fedora 21, the latest version.

It is mainly a copy and paste, except for a few typo fixes and some changes to the Ruby part. The good news is that Metasploit was recently ported to Ruby 2.x, so we no longer need the rvm stuff, which makes the process much simpler.

Preparing Postgresql

Install:

 yum -y install postgresql-server postgresql-devel

Initialize a new "cluster", then connect to the SQL client as the postgres user:

# as root:
postgresql-setup initdb
systemctl start postgresql.service
su postgres
psql

Inside the psql console, create the new Metasploit user and its database:

create user msf;
alter user msf with encrypted password 'super password';
create database msfdb;
grant all privileges on database msfdb to msf;
\q

Then, we will tell Postgres how to accept local connections: ident requires a system account, trust means no password for any local account, and md5 is classic password authentication, which is what we want here.
Back in a root terminal, add this line to /var/lib/pgsql/data/pg_hba.conf, and beware that the order of the lines matters:

# IPv4 local connections:
host msfdb msf 127.0.0.1/32 md5
host all all 127.0.0.1/32 ident

Then we can restart the service and check with psql that the credentials are working:

systemctl restart postgresql.service
psql -U msf msfdb -h localhost
\q

Setting Ruby

Metasploit now runs fine with Ruby 2.x, so there is no need for rbenv or rvm to juggle Ruby versions anymore: the packages shipped with Fedora are enough.

# as root:
yum install ruby rubygems ruby-devel rubygem-bundler

Getting and running Metasploit

Install:

# as root in e.g. /opt
git clone https://github.com/rapid7/metasploit-framework.git msf
cd msf
yum -y install libpcap-devel sqlite-devel
./msfupdate

The installation of the Ruby modules will take a while. Then, configure the database connection by creating config/database.yml:

production:
    adapter: postgresql
    database: msfdb
    username: msf
    password: 
    host: 127.0.0.1
    port: 5432
    pool: 75
    timeout: 5
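
If msfconsole does not pick up this file on its own, one option (assuming your Metasploit version honors it) is to point it at the configuration explicitly through the MSF_DATABASE_CONFIG environment variable:

# as root, from /opt/msf
export MSF_DATABASE_CONFIG=/opt/msf/config/database.yml
./msfconsole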

Launch it and have fun:

# as root
./msfconsole
# check connection to the database
db_status

You may want to add a cron entry in /etc/crontab to get regular updates (though it may break from time to time due to broken dependencies, so check on it once in a while):

# msfupdate every 2 hours
0 */2 * * * root /opt/msf/msfupdate 2>&1

The joy of dependencies: Metasploit on Fedora 20

UPDATE 02/2015: see there for the procedure on Fedora 21

As I started to use Fedora 20 at work (by the way, a solid distro with all the security features enabled), I had the bad surprise of running into issues similar to those on OS X.
Again, we will have to face the joy of dependencies! Fedora provides Ruby 2.0 by default, so firing msfconsole would fail with many openssl warnings, ending with:

Continue reading

Misc rants on Linux desktop, Mac OS and Antivirus

The Linux desktop is in bad shape…

The culprits? Unity and Gnome 3. I am not talking about KDE, as I never felt at home with it; I had tried KDE 4 and it did not change my opinion, not to mention that I ran into several bugs.

Unity? Like many people, I just don't get it. It is pretty clumsy and feels unfinished. I also suffered from a lot of performance issues like this one that never get fixed and make it a pain to use daily.
Gnome 3? Actually, I liked it. It looks nice, and it is pretty fast and smooth. What I like the most is the workflow: it makes the use of workspaces logical and efficient. But… it did not work for me! Instability, again and again.
You will tell me that I should have stayed with Gnome 2, or gone to XFCE / Openbox / etc. I have used all of them. They have their qualities, sure, but we are in 2012 and I want something with more features.

Conclusion: it is sad that after so many years, Linux is still not ready for the desktop, because some guys decided to break everything again instead of making incremental enhancements. Why so suddenly break things that work? I don't get it. I felt really frustrated, with the feeling that I was at the same point as 5 years ago, dealing with the same kind of bugs. I have long been a Linux advocate, and I believed I was right a few years back when I told people it was promising and superior to the competition (Windows XP at the time). Now years have passed, and I started to feel I was lying, or hiding the truth: the Linux desktop has failed and gone nowhere.
Yes, I just got tired of fighting with the computer to get basic things done. And judging by the Linus post and several reactions in the comments, I am not alone in this.

… so I gave Apple a try…

I recently got a MacBook Pro. The main reason is that I wanted a very stable workstation so I could focus on my work. It was hard to admit after so many years of using it, but I came to the conclusion that a Linux desktop could not meet this requirement anymore.

So I am going to be on Mac OS Lion for a while (though I am certainly not closing the door on the Linux desktop forever). I have to say that it is a nice OS and it is damn stable. It is good to have something that works out of the box, without any frustration or need to customize things to make them usable.

And what about the stability of Mac OS? It is full of eye candy, but is it stable?

At first, I actually had some serious trouble. It was freezing almost every day, forcing me into a cold reboot. I was starting to seriously doubt the stability of Mac OS, when I found by chance that the freeze occurred every time Sophos Antivirus started an update…

Antivirus and Mac OS…

Wait, what? Antivirus? On Mac OS? I know that will be the reaction of many Mac users. I also think that it is useless, but for a different reason than most of them.
Of course, I don't buy the "Mac OS is secure" marketing. Actually, it has the least secure kernel around, even though it benefits from a robust Unix architecture.
No, my point is that antivirus products all fail anyway. In forensic analysis, we cannot even trust an antivirus scan to decide whether a machine is clean or not; instead, we have to use specific tools and memory acquisition to make sure.
This is simply because signature-based detection can always be worked around by malware. There are hundreds of ways to do it successfully: changing binary headers, code obfuscation, encryption, hooking (see rootkits and bootkits).
OK, antivirus vendors claim that they also offer behavioral detection, sandboxes, etc. Yes, that's a good move, but they can't check all of the system's activity, and again there are many ways to bypass it. So why bother?

I mean, I still think it matters to have an antivirus on Windows, especially for people who are not too techy. At least it will detect the most basic threats and raise alarms. There are thousands of such threats on Windows, and on this point antivirus offers a simple way to defeat them (though awareness and education are certainly more important).

But on Mac OS, and on Linux as well, there are very few threats. Once again, it is not that they are so much more secure, but at the time I am writing this, it is a fact.

So to summarize:

  • very few threats on Mac OS and Linux
  • antivirus still massively rely on signature-based detection

You see: if there is not much to detect, an antivirus is pure overhead. It will only eat resources and fail anyway against the threats to come.
Just keeping the system up to date is certainly the best thing to do for now.

Well, so why did I install an antivirus? I was actually using it for my forensic analysis of Windows machines. It was a convenient way to have a local scanner that I could run on dumped suspicious processes, without having to connect to Viruscan. It used to be handy when I was traveling without a connection, but I can live without it.

About Sophos for Mac OS

So, on top of everything, this piece of software was crashing my laptop. The update part seems to run with root privileges, and for some reason it locks up the system (not only mine, look at the forums). Not to mention that having such a component may offer malicious code more room to exploit the kernel…

A shame, a pure piece of crap. Now that I removed it, I am enjoying an uptime of about 30 days!

Conclusion

Sophos Antivirus for Mac OS is pure crap, run to remove it if it happens to be on your computer.

Anyway, you don't need an antivirus on Mac OS. Moreover, it seems that several vendors offer solutions that lack maturity and testing on this platform, so you would actually degrade your system's stability and security by installing one of them.

And Mac OS is a nice Unix-based desktop alternative for getting work done, even though sadly it is not open source.

How to physically identify a software RAID disk member

What you need:

  • a good ear
  • smartmontools

Indeed, so far I haven't found anything better than launching a process that generates a lot of disk activity.

This command does just that:

% sudo smartctl -t short /dev/sda

The "short" test gives you a few minutes to listen carefully and pick out the right disk.

Well, it sure is pretty primitive! But do you know anything better?
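
One complementary trick, a bit less primitive: read the drive's serial number with smartctl and match it against the sticker on the physical disk, if you can reach it:

% sudo smartctl -i /dev/sda | grep -i serial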

By the way, there is a good article for the recovery procedure.