Posts for Thursday, September 24, 2015

Me and you without us

When thinking about the problems and debates of our digital age and lifestyle, I keep coming back to a movie made in 1979: Monty Python’s Life of Brian. In one key scene, Brian – the proposed messiah – speaks to the masses gathering underneath his window. Trying to convince them to basically leave him be, he appeals to people’s sense of individuality:

[Embedded video: the scene from Monty Python’s Life of Brian]

While Monty Python is obviously making a point about the way individualism and difference have been co-opted by advertisers and marketing people to separate people into distinct, directly addressable groups who use the products sold to them as means of distinction, I think there is a lot to learn from this small piece about the Internet and the way we live and interact here. In fact I believe it outlines the key question or unsolved problem that is the foundation for most of the other issues we are talking and writing and drawing pictures about: the relationship of the individual to the community.

Here in the so-called “western world” we pray at the altar of individualism. We tell history, for example, framed as hero narratives around geniuses or important individuals: Steve Jobs invented the iPhone, Helmut Kohl united the two Germanies, the Cold War is often illustrated as a kind of ongoing chess game between changing Russian and US presidents. We buy products because they allow us to express our “true selves”, whatever the hell that is. “I’m a Mac.”

Our digital rights are under immense strain because they were defined at a time when networked computing was a party only a handful of organizations and companies could attend. Yes, e-mail sounds kinda like mail, but maybe the way we treat letters doesn’t really work when trying to provide an open but still usable communication network for potentially everyone on this planet (including spammers and scammers). While individual privacy is still being paraded around like the digital golden calf, even its strongest supporters do admit that our conception of what privacy is or can be or should be maybe shouldn’t have stopped developing after Brandeis’ “Right to be let alone” and Westin’s “Right to control information about oneself”1.

One of the main arguments against large-scale government surveillance is that it changes people’s behavior, that they no longer behave naturally. And even if we are not cynical enough to answer “Yeah, so what you are saying is that surveillance works?” – which is probably maximally far away from the argument people are trying to make – there is a weird dominance of the individual at work here: Government action (as a representation of an action of society/community) is presented as oppressive and dangerous to the natural order of things. Or maybe better: the lack thereof.

Everybody probably agrees that we have seen and are seeing many oppressive regimes all around us. And it would be too simple to just cast a stone at some of the nastier dictatorships of our common history and present: The democratically elected German government for example – usually not seen as oppressive by many, in fact Germany and especially its capital is the place to be right now for digital activists looking for a good place to live relatively cheaply – still doesn’t grant the same rights to its hetero and its LGBT citizens when it comes to marriage, adoption and certain medical aspects. The structures we build for our communities to live in do by definition come with some sort of rule set and – usually – with ways to enforce these rules. Enforcement that can easily, and oftentimes quite reasonably, be perceived as oppressive.

The Internet as a sociotechnical system is defined by the people using it and the ways they use it. But since it is based on a technological foundation of hard- and, more importantly, software, certain people and groups have a lot more influence and power to design, define and destroy structures than me and most others: the companies and organisations building the tools, platforms and technologies our digital lives run on, which grant or deny, support or discourage certain modes and patterns of behavior. And in that group a certain reading of the world is very popular: libertarianism.

The libertarian view pumps individualism up to eleven: Whether it’s about free markets or the fight against government regulation of cryptography or drugs, in the libertarian mindset all these infringements on individual freedom are evil and need to stop. The market will decide, and the individual will protect itself against other entities through its power of informed decision and reasoning. I could go on quite the rant about the ideology here (in fact I did a few months ago), but while I personally find it morally appalling it is a position people can have and defend if they want to.

With our digital platforms being so deeply infused with those ideas of individuality and with metrics to help us be successful on the global market for human attention and money, it cannot come as a surprise that it has become harder and harder to discuss questions of the digital in terms outside of that very capitalist, very individualist framework. The Internet has made certain words like “platform” very popular, but neither are they usually used in a well-defined way nor do the sloppy definitions usually transcend economic perspectives. And it’s weird, because the Internet has actually shown us how little individualism manages to capture what it is to be human.

Of course we are all different in many, many ways. It’s what makes communication and friendships and love and feuds and arguments and the human condition so rich. But – apart from constructing extreme cases – we cannot really grasp people as just themselves. People are not just their minds or – if you like the term better – souls. They are just as much their position in different social structures and networks in society. Talking about the genius white man without talking about his privileged position, the networks and therefore resources he was born into or grew into is not just meaningless: it’s actually harmful, making other people – the people with a different social status and different influence – invisible.

It’s tempting to think of ourselves as these brilliant, clearly delimited beings: It allows us to ignore all the sometimes messy details of our position within the social. Details Sartre summarized in his influential play No Exit by having a character state: “Hell is other people“. Other people who do, by their very existence, limit what libertarians might call freedom. Other people who do rightfully claim unquantifiable parts of our work and fame. It’s convenient and understandable, but also quite lazy.

Because if the Internet has shown us anything it is how integral to our being remixing culture, sharing, exchanging, collaborating, the commons are. And how broken the laws governing these things are.

I believe that we keep forgetting that we are networked beings, homo interconnecticus (please forgive me for that horrible beating of the Latin language) – networked not because of technology, unless you consider language a technology, but because of the way we are. Alone we are nothing. But that perspective doesn’t seem to influence our debates much.

We live by individualistic concepts and words, and we as social beings, as structures of more than just one, are suffering for it. The real questions of privacy and copyright and adblocking are questions of how we position the I and the you in relationship to the us. What do we as individuals owe to the us that allows us to be who we are and do what we do? And where does the reach of the us need to stop?

Writing this while listening to music, I came to remember the lyrics of an Enter Shikari song I like:


When I was little…
I dressed up as an astronaut and explored outer space
I dressed up as a superhero and ran about the place
I dressed up as a fireman and rescued those in need
I dressed up as a doctor and cured every disease

It was crystal clear to me back then
That the only problems that I could face
Would be the same problems that affect us all
But of course this sense of common existence
Was sucked out of me in an instance
As though from birth I could walk but I was forced to crawl

I believe that we need to stop thinking that crawling around as singular individuals is the way to go. We should walk and find a better way to use technology to enrich all our lives. Then we will live in truly exciting times.

Photo by Me and my tripod

  1. There are obviously developments and researchers and thinkers working on those questions; those results just haven’t really hit the mainstream or are basically remixes of the existing models


Posts for Sunday, September 20, 2015


Switching focus at work

Since 2010, I was responsible at work for the infrastructure architecture of a couple of technology domains, namely databases and scheduling/workload automation. It brought me in contact with many vendors, many technologies and, most importantly, many teams within the organization. The focus domain was challenging, as I had to deal with the strategy on how the organization, which is a financial institution, will deal with databases and scheduling in the long term.

This means looking at the investments related to those domains, implementation details, standards of use, features that we will or will not use, positioning of products and so forth. To do this from an architecture point of view means that I not only had to focus on the details of the technology and understand all their uses, but also become a sort of subject matter expert on those topics. Luckily, I had (well, still have) great teams of DBAs (for the databases) and batch teams (for the scheduling/workload automation) to keep things in the right direction.

I helped them with a (hopefully sufficiently) clear roadmap, investment track, procurement, contract/terms and conditions for use, architectural decisions and positioning and what not. And they helped me with understanding the various components, learn about the best use of these, and of course implement the improvements that we collaboratively put on the roadmap.

Times, they are changing

Last week, I flipped over a page at work. Although I remain an IT architect within the same architecture team, my focus shifts entirely. Instead of a fixed domain, my focus is now more volatile. I leave behind the stability of organizationally anchored technology domains and go forward in a more tense environment.

Instead of looking at just two technology domains, I need to look at all of them, and find the right balance between high flexibility demands (which might not want to use the current "standard" offerings) that come up from a very agile context, and the almost non-negotiable requirements that are typical for financial institutions.

The focus is also not primarily technology oriented anymore. I'll be part of an enterprise architecture team with direct business involvement and although my main focus will be on the technology side, it'll also involve information management, business processes and applications.

The end goal is to set up a future-proof architecture in an agile, fast-moving environment (contradictio in terminis?) with a main focus on data analytics and information gathering/management. Yes, "big data", but more applied than what some of the vendors try to sell us ;-)

I'm currently finishing off the high-level design and use of a Hadoop platform, and the next focus will be on a possible micro-service architecture using Docker. I've been working on this Hadoop design for a while now (back then still as part of my previous function at work) and given the evolving nature of Hadoop (and the various services that surround it) I'm confident that it will not be the last time I'm looking at it.

Now let me hope I can keep things manageable ;-)

Posts for Monday, September 14, 2015


Getting su to work in init scripts

While developing an init script which has to switch user, I got a couple of errors from SELinux and the system itself:

~# rc-service hadoop-namenode format
Authenticating root.
 * Formatting HDFS ...
su: Authentication service cannot retrieve authentication info

The authentication log shows entries such as the following:

Sep 14 20:20:05 localhost unix_chkpwd[5522]: could not obtain user info (hdfs)

I've always had issues with getting su to work properly again. Now that I have what I think is a working set of rules, let me document it for later (as I still need to review why exactly they are needed):

# Allow initrc_t to use unix_chkpwd to check entries
# Without it gives the retrieval failure

# Allow initrc_t to query selinux access, otherwise avc assertion
allow initrc_t self:netlink_selinux_socket { bind create read };

# Allow initrc_t to honor the pam_rootok setting
allow initrc_t self:passwd { passwd rootok };

With these SELinux rules, switching the user works as expected from within an init script.
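
For reference, here is a minimal sketch of how such rules can be packaged as a local policy module and loaded. The module name is arbitrary, and only the two rules quoted above are included:

# local_initrc_su.te
module local_initrc_su 1.0;

require {
        type initrc_t;
        class netlink_selinux_socket { bind create read };
        class passwd { passwd rootok };
}

allow initrc_t self:netlink_selinux_socket { bind create read };
allow initrc_t self:passwd { passwd rootok };

# build and load the module
~# checkmodule -M -m -o local_initrc_su.mod local_initrc_su.te
~# semodule_package -o local_initrc_su.pp -m local_initrc_su.mod
~# semodule -i local_initrc_su.pp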

Posts for Thursday, September 10, 2015


Custom CIL SELinux policies in Gentoo

In Gentoo, we have been supporting custom policy packages for a while now. Unlike most other distributions, which focus on binary packages, Gentoo has always supported source-based packages by default (although binary packages are supported as well).

A recent commit now also allows CIL files to be used.

Policy ebuilds, how they work

Gentoo provides its own SELinux policy, based on the reference policy, and provides per-module ebuilds (packages). For instance, the SELinux policy for the screen package is provided by the sec-policy/selinux-screen package.

The package itself is pretty straightforward:

# Copyright 1999-2015 Gentoo Foundation
# Distributed under the terms of the GNU General Public License v2
# $Id$

EAPI="5"
IUSE=""
MODS="screen"

inherit selinux-policy-2

DESCRIPTION="SELinux policy for screen"

if [[ $PV == 9999* ]] ; then
        KEYWORDS=""
else
        KEYWORDS="~amd64 ~x86"
fi

The real workhorse lies within a Gentoo eclass, something that can be seen as a library for ebuilds. It allows consolidation of functions and activities so that a large set of ebuilds can be simplified. The more ebuilds are standardized, the more development can be put inside an eclass instead of in the ebuilds. As a result, some ebuilds are extremely simple, and the SELinux policy ebuilds are a good example of this.

The eclass for SELinux policy ebuilds is called selinux-policy-2.eclass and holds a number of functionalities. One of these (the one we focus on right now) is to support custom SELinux policy modules.

Custom SELinux policy ebuilds

Whenever a user has a SELinux policy that is not part of the Gentoo policy repository, the user might still want to provide these policies through packages. This has the advantage that Portage (or whatever package manager is used) is aware of the policies on the system, and proper dependencies can be built in.

To use a custom policy, the user needs to create an ebuild which informs the eclass not only about the module name (through the MODS variable) but also about the policy files themselves. These files are put in the files/ location of the ebuild, and referred to through the POLICY_FILES variable:

# Copyright 1999-2015 Gentoo Foundation
# Distributed under the terms of the GNU General Public License v2
# $Id$

EAPI="5"
IUSE=""
MODS="oracle"
POLICY_FILES="oracle.te oracle.if oracle.fc"

inherit selinux-policy-2

DESCRIPTION="SELinux policy for oracle"

if [[ $PV == 9999* ]] ; then
        KEYWORDS=""
else
        KEYWORDS="~amd64 ~x86"
fi

The eclass will generally try to build the policies, converting them into .pp files. With CIL, this is no longer needed. Instead, what we do is copy the .cil files straight into the location where we place the .pp files.

From that point onwards, managing the .cil files is similar to .pp files. They are loaded with semodule -i and unloaded with semodule -r when needed.
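
For example, for a custom oracle.cil module:

~# semodule -i oracle.cil    # load the CIL module
~# semodule -r oracle        # remove it again when no longer needed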

Enabling CIL in our ebuilds is a small improvement (after the heavy workload to support the 2.4 userspace) which allows Gentoo to stay ahead in the SELinux world.

Posts for Tuesday, September 8, 2015

Why we can’t have nice things, #lessig2016 edition

Writing about other countries’ elections or candidates can easily turn into a vehicle for celebrating one’s own political system or the people and achievements thereof. So when I write about Lawrence Lessig’s candidacy for US President, don’t take it as another one of the plethora of articles by Europeans joking about what a mess the list of candidates is right now. We here in Germany have our own share of shady people running for (or being in) office, so we should mostly just shut the hell up when it comes to candidates in other countries (our Minister of Finance, for example, is well known for having taken large cash donations for his party from arms dealers, hiding them in the books as “other income”). Politics is, or at least can be, a dirty game of power, and obviously that influences which kind of people with which kind of thinking and ethos it attracts.

Maybe this is what made so many people in the tech bubble embrace Lessig’s proposal. The feeling of a weird and broken system full of corrupt people governing us. Funding the NSA. Being clueless about technology and the Internet. Yadda yadda yadda. Lessig’s plan provides a way out. He’s the knight in shining armor promising to slay the dragon of campaign finance with his sword of … I might have overstretched my metaphor here.

Lawrence Lessig is more than just “not a nobody”. He’s gotten a lot of credit for creating Creative Commons, a set of licenses hijacking existing copyright law to establish better, fairer, more explicit rules governing how things we share online can be used and built upon by others. He also coined the phrase “Code is law”, underlining how immensely powerful the – largely unregulated – algorithms companies hack together in their labs and offices are in our digitalized lives. The code that runs our world sorta acts as if it was law (just without any form of democratic process for changing and challenging it). He’s also a professor of law at Harvard’s prestigious law school.

Now Lessig could just sit back and enjoy his life, but he is obviously driven and unhappy with the state of affairs. In recent years he has focused on tackling an issue he sees as one of the biggest, if not the biggest, problems for the US democratic system: the way campaigns are financed. And he obviously has a point: In the 2012 presidential elections Barack Obama and Mitt Romney alone burned through more than 2 billion dollars. Raising that amount of money isn’t easy and will at some point push you to hunt down bigger and bigger donors: Why invest days if not weeks to get 1 million bucks from 100,000 individuals when you could just talk to Goldman Sachs real quick instead? And we have seen that happening, with the 2012 elections being largely financed by a very limited number of people and groups, each giving more than a few million to the cause through SuperPACs etc.

In Lessig’s own words: The system is rigged. If you need metric fucktons of money and you have limited time and resources to collect it, you will turn to people who can give you a lot of it. Which they usually don’t do out of the goodness of their hearts but due to promises for upcoming legislation (or non-legislation). You can’t buy a president, but to become one you have to take care of the desires of the 1%, those with the petty cash laying around to finance your campaign. The desires of the rest? Well … if there is time, sure.

So Lessig’s plan goes as follows: He runs for president and if elected will do one thing and one thing only: Reform campaign finance to reduce the influence of the big donors in politics. When the job is done and the pictures for the history books are taken, he will step back and let his vice president do the actual presidenting. That’s what he is now asking for support for: Elect him to achieve that one single, clear task.

I don’t want to dive into the actual details of his proposal here; they don’t matter for the argument I am about to make (finally!), and I also recognize that Lessig wrote about this plan in his 2011 book “Republic, Lost”, giving it a 2% chance of success: He knows he’s not going to become president, he just tries to make campaign finance reform an issue that especially the Democratic candidates cannot ignore. Let’s also cast aside any conspiracy theory of him not doing as promised when elected, keeping his position, revealing himself to be a reptiloid and enslaving mankind. I believe that his plans are earnest and that people who follow him do that for honest reasons. And that’s possibly the problem.

Lessig’s idea of politics is a mechanical one: The political system is there; it’s a machine with rules. But the machine is broken, so new rules need to fix it in order for it to properly work again. He finds that campaign finance creates “inequality” and that the system needs to be reformed to create balance. But what kind of balance does he mean? What kind of inequality is he talking about?

In a text on Medium he addresses his critics, but a small sentence, a little group of words, made me write this lengthy text. Here’s the quote:

I believe, moreover, that our “inability to govern” is tied fundamentally to the way we’ve permitted our representative democracy to be corrupted. It is, in my view, the vast political inequality that we have allowed to creep within our system that produces the systemic failure of our government to be responsive to Americans. […] That inequality (political inequality — I’m not talking here about wealth inequality) manifests itself along many dimensions. The one I’ve worked on most is the way we fund campaigns. Whatever else a system in which 400 families contribute 1/2 the money to political campaigns in this election cycle so far is, it is unequal. And it should surprise no one that such a system produces politicians focused on their funders first

Lessig’s plan is based on the strange idea that you can separate “political inequality” from other factors such as “wealth inequality”. And that is a dangerous misunderstanding.

At first glance his argument makes sense: People can buy too much political capital so we need to restrict that influence. Now everybody has the same political influence and all is fair. But … is it? What’s with all the personal connections between people from Ivy League universities, some going into business and some into politics? What’s with the different level of access to information, to education and the free time to invest into researching different proposals and deciding which one to support? What’s with social capital and the difference in habitus that the ruling class (oh that’s one of those commie words, right?) and the lower classes have?

When thinking about Lessig’s campaign and the way it has been – mostly enthusiastically – received within the tech circles that are my “peer group” I kept coming back to Max Cohen. Max Cohen is the protagonist in the ultimate, the definitive movie characterizing modern hacker and tech culture: Pi.

Pi’s Max Cohen is a number theorist and mathematician obsessed with the structure of the universe and his quest to understand it, a quest that drives him insane. His obsession can be summarized in his own words by his “assumptions”:

[Embedded video: The Assumptions (Pi the Movie)]

One: Mathematics is the language of nature.
Two: Everything around us can be represented and understood through numbers.
Three: If you graph the numbers of any system, patterns emerge.
Therefore, there are patterns everywhere in nature.

Lessig doesn’t (from what I know) see patterns in the stock market based on math. But – just as Cohen – he projects a very limited abstract model of the world onto reality and actively ignores everything outside of his model. Where Cohen believed that math was the perfect, the ideal structure of reality, Lessig puts the code of law, the rules of politics, on a similar pedestal, cutting away all of the nuance and grey and strangeness and brilliance that make the code of law, that make the abstract rules, our political system.

Because a political system is more than just the way money can flow into it. It’s more than just where explicit power flows and acts. A political system is also heavily influenced by the groups of people in power, their habitus, their shared world view and ideals. By how the media works and what kind of access to time, money and other resources people within society have. Many political decisions happen or don’t happen due to personal relationships, friendships and rivalries. The system of politics is more complex than just law, just as the world is more complex than math.1

Even if laws were the language of politics (which they are just a part of), his approach would be too short-sighted because – as he will have to admit – there are many other factors that people in power have access to in order to be heard, to get their wishes solidified into law. And many of those he cannot just “outlaw”, because people talking to their friends probably shouldn’t be illegal.

Don’t get me wrong, I do believe that a campaign finance reform could have positive effects on US politics (as much as an outsider on the other side of the ocean can reasonably form any kind of opinion on a system he’s just observing from the outside) but it’s just so little. It might be a first step but it’s not a solution for the many issues at hand.

But that’s not the reception, the way his campaign is being read: Lessig is saving the world or at least the US democracy (for some that might even be the same). And that has quite dangerous side effects.

Not only does it mean that people buy into his reductionist view on politics and power, making all the other, way more hidden, way less explicit forms of exercising power invisible and therefore no longer a problem. He also explicitly and actively locks people who have a different perspective out of the debate by pulling the magical Postponed card.

The Postponed card is one that especially feminist activists of color are only too familiar with: When white feminists get criticised for not including the specific struggles of women of color, the typical reaction tends to be to postpone them. Let’s first solve this issue that helps us few to rise up to a higher level. It will trickle down to you as well and we’ll tackle your problems later(tm).

Lessig’s plan – if it worked – might help a few middle-class people to feel more powerful (they won’t get to talk to the candidates then either, which is something their Stanford and Harvard buddies will be able to do), but the amount of influence that other people have (and that is what money is solely being cast as in Lessig’s view) will not feel as big. Epic win!

It makes sense that Lessig targets the tech audience to jumpstart his campaign: He has accumulated (through his talks and his work on Creative Commons) a lot of capital in that demographic that he’s now playing up to. His writing includes memes and references to pop culture artifacts like Game of Thrones; it picks up on the whole “the government doesn’t work and needs fixing” narrative that is so super popular in the often quite libertarian internet/hacker/tech circles. And it also unconsciously follows the main fallacy that is sadly so prominent and popular in tech circles, one I call the Alpha-Nerd fallacy: the idea that just because you can build some form of abstract model of a system that kinda mighta work, you understand said system and can fix and improve it.

Yes, IT people know how to model and analyze systems; that’s one of the core skills that that domain requires. But their concept of what a system is comes from a background of engineering and mechanics, not of complex social systems with implicit and hidden connections and opaque structures. I can build a system of “the economy” that can explain a simple issue or phenomenon quite well. That doesn’t mean I modeled “the economy”. (That should really be taught way more in computer science programs.)

Lessig is the IT bubble’s president. He truly is. Not just because of his credentials in talking about code and regulation and copyright, but because he shares one of the IT bubble’s most toxic and widespread fallacies. And that’s really not good enough. Not due to some holiness and dignity of the office he wants to run for, but because it ignores and actively sidelines the needs of those with very little access to power.

And that is a mechanic, a course of action we see so often coming from the tech bubble, which is shaping up not only to define the code (and therefore, quoting Lessig, the “laws”) that our digital lives run on but also to gather up immense wealth in our ongoing quest to pump up the IT bubble more and more. The way Lessig’s plan, his proposal, is embraced as a way to solve the issues plaguing the political system of the US is a strong indicator for how little the tech/hacker/internet bubble has learned when it comes to politics, the social and the complexities of the world with all its opaque interdependent structures. I can only read it as a warning of things to come. A warning of a technocratic elite that’s incapable of admitting that social problems are more complex than something you can simulate with 100 lines of code. And that is a deeply troubling thought.

Photo by darkday.

  1. In fact I do believe that math is us analysing the structure of our brains and thinking much more than the “real world”, but that is a topic for another time. Let it just be said: If you sucked at math, that’s cool, cause it’s not really “real”.


Posts for Sunday, September 6, 2015


Using multiple OpenSSH daemons

I administer a couple of systems which provide interactive access by end users, and for this interactive access I position OpenSSH. However, I also use this for administrative access to the system, and I tend to have harder security requirements for OpenSSH than most users do.

For instance, on one system, end users with a userid + password use the sFTP server for publishing static websites. Other access is prohibited, so I really want this OpenSSH configuration to use chrooted users and internal sftp support, whereas a different OpenSSH instance is used for administrative access (which is only accessible by myself and some trusted parties).
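
For the user-facing instance, that roughly translates into an sshd_config fragment like the following (group name and chroot path are placeholders, not my actual setup):

Subsystem sftp internal-sftp

Match Group sftpusers
        ChrootDirectory /var/www/%u
        ForceCommand internal-sftp
        AllowTcpForwarding no
        X11Forwarding no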

Running multiple instances

Although I might get a similar result with a single OpenSSH instance, I prefer to have multiple instances for this. The default OpenSSH port is used for the non-administrative access whereas administrative access is on a non-default port. This has a number of advantages...

First of all, the SSH configurations are simple and clean. No complex configurations, and more importantly: easy to manage through configuration management tools like SaltStack, my current favorite orchestration/automation tool.

Different instances also allow for different operational support services. There is different monitoring for end-user SSH access versus administrative SSH access. Also the fail2ban configuration is different for these instances.

I can also easily shut down the non-administrative service while ensuring that administrative access remains operational - something important in case of changes and maintenance.
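
Practically, the second instance is just the same sshd binary pointed at a dedicated configuration file (path and port here are only illustrative):

/usr/sbin/sshd -f /etc/ssh/sshd_admin_config

where /etc/ssh/sshd_admin_config sets, among other things, Port 2222 instead of the default port 22.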

Dealing with multiple instances and SELinux

Beyond enabling a non-default port for SSH (i.e. by marking it as ssh_port_t as well) there is little additional tuning necessary, but that doesn't mean that there is no additional tuning possible.
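
For example, if the administrative instance listens on port 2222, the port can be marked as follows:

~# semanage port -a -t ssh_port_t -p tcp 2222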

For instance, we could leverage MCS' categories to only allow users (and thus the SSH daemon) access to the files assigned only that category (and not the rest) whereas the administrative SSH daemon can access all categories.

On an MLS enabled system we could even use different sensitivity levels, allowing the administrative SSH to access the full range whereas the user-facing SSH can only access the lowest sensitivity level. But as I don't use MLS myself, I won't go into detail on this.

A third possibility would be to fine-tune the permissions of the SSH daemons. However, that would require different types for the daemon, which requires the daemons to be started through different scripts (so that we first transition to dedicated types) before they execute the SSHd binary (which has the sshd_exec_t type assigned).

Requiring pubkey and password authentication

Recent OpenSSH daemons allow chaining multiple authentication methods before access is granted. This allows the system to force SSH key authentication first, and then - after successful authentication - require the password to be passed on as well. Or a second step such as Google Authenticator.

AuthenticationMethods publickey,password
PasswordAuthentication yes

I don't use the Google Authenticator, but the Yubico PAM module to require additional authentication through my U2F dongle (so ssh key, password and u2f key). Don't consider this three-factor authentication: one thing I know (password) and two things I have (U2F and ssh key). It's more that I have a couple of devices with a valid SSH key (laptop, tablet, mobile) which are of course targets for theft.

The chance that one of those devices is stolen together with the U2F dongle (which I don't keep attached to those devices, of course) is somewhat smaller.

Posts for Wednesday, September 2, 2015


Maintaining packages and backporting

A few days ago I committed a small update to policycoreutils, a SELinux related package that provides most of the management utilities for SELinux systems. The fix was to get two patches (which are committed upstream) into the existing release so that our users can benefit from the fixed issues without having to wait for a new release.

Getting the patches

To capture the patches, I used git together with the commit id:

~$ git format-patch -n -1 73b7ff41
~$ git format-patch -n -1 4fbc6623

The two generated patch files contain all information about the commit. Thanks to the epatch support in the eutils.eclass, these patch files are immediately usable within Gentoo's ebuilds.

Updating the ebuilds

The SELinux userspace ebuilds in Gentoo all have live ebuilds available which are immediately usable for releases. The idea with those live ebuilds is that we can simply copy them and commit in order to make a new release.

So, in case of the patch backporting, the necessary patch files are first moved into the files/ subdirectory of the package. Then, the live ebuild is updated to use the new patches:

@@ -88,6 +85,8 @@ src_prepare() {
                epatch "${FILESDIR}/0070-remove-symlink-attempt-fails-with-gentoo-sandbox-approach.patch"
                epatch "${FILESDIR}/0110-build-mcstrans-bug-472912.patch"
                epatch "${FILESDIR}/0120-build-failure-for-mcscolor-for-CONTEXT__CONTAINS.patch"
+               epatch "${FILESDIR}/0130-Only-invoke-RPM-on-RPM-enabled-Linux-distributions-bug-534682.patch"
+               epatch "${FILESDIR}/0140-Set-self.sename-to-sename-after-calling-semanage-bug-557370.patch"

        # rlpkg is more useful than fixfiles

The patches themselves are not applied to the live ebuilds (they are ignored there), as we want the live ebuilds to be as close to the upstream project as possible. But because the ebuilds are immediately usable for releases, we add the necessary information there first.

Next, the new release is created:

~$ cp policycoreutils-9999.ebuild policycoreutils-2.4-r2.ebuild

Testing the changes

The new release is then tested. I have a couple of scripts that I use for automated testing. So first I update these scripts to also try out the functionality that was failing before. On existing systems, these tests should fail:

Running task semanage (Various semanage related operations).
    Executing step "perm_port_on   : Marking portage_t as a permissive domain                              " -> ok
    Executing step "perm_port_off  : Removing permissive mark from portage_t                               " -> ok
    Executing step "selogin_modify : Modifying a SELinux login definition                                  " -> failed

Then, on a test system where the new package has been installed, the same testset is executed (together with all other tests) to validate if the problem is fixed.

Pushing out the new release

Finally, with the fixes in and validated, the new release is pushed out (into ~arch first of course) and the bugs are marked as RESOLVED:TEST-REQUEST. Users can confirm that it works (which would move it to VERIFIED:TEST-REQUEST) or we stabilize it after the regular testing period is over (which moves it to RESOLVED:FIXED or VERIFIED:FIXED).

I do still have to get used to Gentoo using git as its repository now. Luckily, the workflow to use is documented, because I often find that the git push fails (due to changes to the tree since my last pull). So I need to run git pull --rebase=preserve followed by repoman full and then the push again, sufficiently quickly after each other.
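
In practice that boils down to a quick sequence like this:

~$ git pull --rebase=preserve
~$ repoman full
~$ git push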

This simple flow is easy to get used to. Thanks to the existing foundation for package maintenance (such as epatch for patching, live ebuilds that can be immediately used for releases and the ability to just cherry pick patches towards our repository) we can serve updates with just a few minutes of work.


Transplanting Go packages for fun and profit

crazy Gopher scientist

A while back I read Coders at Work, which is a book of interviews with some great computer scientists who earned their stripes, the questions just as thoughtful as the answers. For one thing, it re-ignited my interest in functional programming, for another I got interested in literate programming, but most of all, it struck me how common of a recommendation it was to read other people’s code as a means to become a better programmer. (It also has a good section of Brad Fitzpatrick describing his dislike for programming languages, and dreaming about his ideal language. This must have been shortly before Go came about and he became a maintainer.)

I hadn’t been doing a good job reading/studying other code out of fear that inferior patterns/style would rub off on me. But I soon realized that was an irrational, perhaps slightly absurd excuse. So I made the decision to change. Contrary to my presumption I found that by reading code that looks bad you can challenge and re-evaluate your mindset and get out with a more nuanced understanding and awareness of the pros and cons of various approaches.

I also realized if code is proving too hard to get into or is of too low quality, you can switch to another code base with negligible effort and end up spending almost all of your time reading code that is worthwhile and has plenty of learnings to offer. There is a lot of high quality Go code, easy to find through sites like Github or Golang weekly, just follow your interests and pick a project to start reading.

It gets really interesting though once you find bodies of code that are not only a nice learning resource, but can be transplanted into your code with minimal work to solve a problem you’re having, but in a different context than the one the author of the code originally designed it for. Components often grow and mature in the context of an application without being promoted as reusable libraries, but you can often use them as if they were. I would like to share two such success cases below.

Nsq’s diskqueue code

I’ve always had an interest in code that manages the same binary data both in memory and on a block device. Think filesystems, databases, etc. There’s some interesting concerns like robustness in light of failures combined with optimizing for performance (infrequent syncs to disk, maintaining the hot subset of data in memory, etc), combined with optimizing for various access patterns, this can be a daunting topic to get into.

Luckily there’s a use case that I see all the time in my domain (telemetry systems) and that covers just enough of the problems to be interesting and fun, but not enough to be overwhelming. And that is: for each step in a monitoring data pipeline, you want to be able to buffer data if the endpoint goes down, in memory and to disk if the amount of data gets too much. Especially to disk if you’re also concerned with your software crashing or the machine power cycling.

This is such a common problem that applies to all metrics agents, relays, etc that I was longing for a library that just takes care of spooling data to disk for you without really affecting much of the rest of your software. All it needs to do is sequentially write pieces of data to disk and have a sequential reader catching up and read newer data as it finishes processing the older.

NSQ is a messaging platform from bitly, and it has diskqueue code that does exactly that. And it does so oh so elegantly. I had previously found a beautiful pattern in bitly’s go code that I blogged about and again I found a nice and elegant design that builds further on this pattern, with concurrent access to data protected via a single instance of a for loop running a select block which assures only one piece of code can make changes to data at the same time (see bottom of the file), not unlike ioloops in other languages. And method calls such as Put() provide a clean external interface, though their implementation simply hooks into the internal select loop that runs the code that does the bulk of the work. Genius.

func (d *diskQueue) Put(data []byte) error {
  // some details
  d.writeChan <- data
  return <-d.writeResponseChan
}
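
To illustrate the pattern, here is a small self-contained sketch (with made-up names, not the actual NSQ code) of a queue whose state is owned entirely by the goroutine running the select loop:

package main

import "fmt"

// queue is a stripped-down illustration of the pattern: all state is owned
// by the single goroutine running ioLoop, so no locks are needed.
type queue struct {
  writeChan         chan []byte
  writeResponseChan chan error
  items             [][]byte // only ever touched from within ioLoop
}

func newQueue() *queue {
  q := &queue{
    writeChan:         make(chan []byte),
    writeResponseChan: make(chan error),
  }
  go q.ioLoop()
  return q
}

// Put is the clean external interface; it merely hands the work to ioLoop.
func (q *queue) Put(data []byte) error {
  q.writeChan <- data
  return <-q.writeResponseChan
}

// ioLoop is the only place where q.items is modified; a real queue would add
// further cases here for reads, periodic syncs to disk and a clean exit.
func (q *queue) ioLoop() {
  for {
    select {
    case data := <-q.writeChan:
      q.items = append(q.items, data)
      q.writeResponseChan <- nil
    }
  }
}

func main() {
  q := newQueue()
  fmt.Println(q.Put([]byte("hello"))) // prints <nil>
}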

In addition the package came with extensive tests and benchmarks out of the box.

After finding and familiarizing myself with this diskqueue code about a year ago I had an easy time introducing disk spooling to Carbon-relay-ng, by transplanting the code into it. The only change I had to make was capitalizing the Diskqueue type to export it outside of the package. It has proven a great fit, enabling a critical feature through little work of transplanting mature, battle-tested code into a context that original authors probably never thought of.

Note also how the data unit here is the []byte, the queue does not deal with the higher level nsq.Message (!). The authors had the foresight of keeping this generic, enabling code reuse, and rightfully shot down a PR of mine that had a side effect of making the queue aware of the Message type. In NSQ you’ll find thoughtful and deliberate API design and pretty sound code all around. Also, they went pretty far in detailing some lessons learned and providing concrete advice, a very interesting read, especially around managing goroutines & synchronizing their exits, and performance optimizations. At Raintank, we had a need for a messaging solution for metrics so we will soon be rolling out NSQ as part of the raintank stack. This is an interesting case where my past experience with the NSQ code and ideas helped to adopt the full solution.

Bosun expression package

I’m a fan of the bosun alerting system which came out of Stack Exchange. It’s a full-featured alerting system that solves a few problems like no other tool I’ve seen does (see my linked post), and timeseries data storage aside, comes with basically everything built in to the one program. I’ve used it with success. However, for litmus I needed an alerting handler that integrated well into the Grafana backend. I needed the ability to do arbitrarily complex computations. Graphite’s api only takes you so far. We also needed (desired) reduction functions, boolean logic, etc. This is where bosun’s expression language is really strong. I found the expression package quite interesting, they basically built their own DSL for metrics processing. so it deals with expression parsing, constructing AST’s, executing them, dealing with types (potentially mixed types in the same expression), etc.

But bosun also has incident management, contacts, escalations, etc. Stuff that we either already had in place, or didn’t want to worry about just yet. So we could run bosun standalone and talk to it as a service via its API - which I found too loosely coupled and risky - hook all its code into our binary at once - which seemed overkill - or the strategy I chose: gradually familiarize ourselves with and adopt pieces of Bosun on a case-by-case basis, making sure there’s a tight fit and without ever building up so much technical debt that it would become a pain to move away from the transplanted code if it becomes clear it’s not/no longer well suited. For the foreseeable future we only need one piece, the expression package. Potentially ultimately we’ll adopt the entire thing, but without the upfront commitment and investment.

So practically, our code now simply has one line where we create a bosun expression object from a string, and another where we ask bosun to execute the expression for us, which takes care of parsing the expression, querying for the data, evaluating and processing the results and distilling everything down into a final result. We get all the language features (reduction functions, boolean logic, nested expressions, …) for free.

This transplantation was again probably not something the bosun authors expected, but for us it was tremendously liberating. We got a lot of power for free. The only thing I had to do was spend some time reading code, and learning in the process. And I knew the code was well tested so we had zero issues using it.

Much akin to the NSQ example above, there was another reason the transplantation went so smoothly: the expression package is not tangled into other stuff. It just needs a string expression and a graphite instance. To be precise, any struct instance that satisfies the graphiteContext interface that is handily defined in the bosun code. While the bosun design aims to make its various clients (graphite, opentsdb, …) applicable for other projects, it also happens to let us do the opposite: reuse some of its core code - the expression package - and pass in a custom graphite Context, such as our implementation, which has extensive instrumentation. This lets us use the bosun expression package as a “black box” and still inject our own custom logic into the part that queries data from graphite. Of course, once we want to change the logic of anything else in the black box, we will need to come up with something else, perhaps fork the package, but it doesn’t seem like we’ll need that any time soon.


If you want to become a better programmer I highly recommend you go read some code. There’s plenty of good code out there. Pick something that deals with a topic that is of interest to you and looks mature. You typically won’t know if code is good before you start reading but you’ll find out really fast, and you might be pleasantly surprised, as was I, several times. You will learn a bunch, possibly pretty fast. However, don’t go for the most advanced, complex code straight away. Pick projects and topics that are out of your comfort zone and do things that are new to you, but nothing too crazy. Once you truly grok those, proceed to other, possibly more advanced stuff.

Often you’ll read reusable libraries that are built to be reused, or you might find ways to transplant smaller portions of code into your own projects. Either way is a great way to tinker and learn, and solve real problems. Just make sure the code actually fits in so you don’t end up with the software version of Frankenstein’s monster. It is also helpful to have the authors available to chat if you need help or have issues understanding something, though they might be surprised if you’re using their code in a way they didn’t envision and might not be very inclined to provide support to what they consider internal implementation details. So that could be a hit or miss. Luckily the people behind both nsq and bosun were supportive of my endeavors but I also made sure to try to figure out things by myself before bothering them. Another reason why it’s good to pick mature, documented projects.

Gopher frankenstein

Part of the original meaning of hacking, extended into open source, is a mindset and practice of seeing how others solve a problem, discussing it and building on top of it. We’ve gotten used to - and fairly good at - doing this on a project and library level but forgot about it on the level of code, code patterns and ideas. I want to see these practices come back to life.

We also apply this at Raintank: not only are we trying to build the best open source monitoring platform by reusing (and often contributing to) existing open source tools and working with different communities, we realize it’s vital to work on a more granular level, get to know the people and practice cross-pollination of ideas and code.

Next stuff I want to read and possibly implement or transplant parts of: dgryski/go-trigram, armon/go-radix, especially as used in the dgryski/carbonmem server to search through Graphite metrics. Other fun stuff by dgryski: an implementation of the ARC caching algorithm and bloom filters. (you might want to get used to reading Wikipedia pages also). And mreiferson/wal, a write ahead log by one of the nsqd authors, which looks like it’ll become the successor of the beloved diskqueue code.

Go forth and transplant!

Also posted on the Raintank blog

Posts for Saturday, August 29, 2015


Doing away with interfaces

CIL is SELinux' Common Intermediate Language, which brings on a whole new set of possibilities with policy development. I hardly know CIL but am (slowly) learning. Of course, the best way to learn is to try and do lots of things with it, but real-life work and time-to-market for now force me to stick with the M4-based refpolicy one.

Still, I do try out some things here and there, and one of the things I wanted to look into was how CIL policies would deal with interfaces.

Recap on interfaces

With the M4 based reference policy, interfaces are M4 macros that expand into the standard SELinux rules. They are used by the reference policy to provide a way to isolate module-specific code and to have "public" calls.

Policy modules are not allowed (by convention) to call types or domains that are not defined by the same module. If they want to interact with those modules, then they need to call the interface(s):

# module "ntp"
# domtrans: when executing an ntpd_exec_t binary, the resulting process 
#           runs in ntpd_t
  domtrans_pattern($1, ntpd_exec_t, ntpd_t)

# module "hal"

In the above example, the purpose is to have hald_t be able to execute binaries labeled as ntpd_exec_t and have the resulting process run as the ntpd_t domain.

The following would not be allowed inside the hal module:

domtrans_pattern(hald_t, ntpd_exec_t, ntpd_t)

This would imply that both hald_t, ntpd_exec_t and ntpd_t are defined by the same module, which is not the case.

Interfaces in CIL

It seems that CIL will not use interface files. Perhaps some convention surrounding it will be created - to know this, we'll have to wait until a "cilrefpolicy" is created. However, functionally, this is no longer necessary.

Consider the myhttp_client_packet_t declaration from a previous post. In it, we wanted to allow mozilla_t to send and receive these packets. The example didn't use an interface-like construction for this, so let's see how this would be dealt with.

First, the module is slightly adjusted to create a macro called myhttp_sendrecv_client_packet:

(macro myhttp_sendrecv_client_packet ((type domain))
  (typeattributeset cil_gen_require domain)
  (allow domain myhttp_client_packet_t (packet (send recv)))
)

Another module would then call this:

(call myhttp_sendrecv_client_packet (mozilla_t))

That's it. When the policy modules are both loaded, then the mozilla_t domain is able to send and receive myhttp_client_packet_t labeled packets.
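
Loading both modules (assuming the snippets are saved as myhttp.cil and mozilla.cil) works just like with .pp files:

~# semodule -i myhttp.cil
~# semodule -i mozilla.cil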

There's more: namespaces

But it doesn't end there. Whereas the reference policy had a single namespace for the interfaces, CIL is able to use namespaces. It allows for an almost object-like approach to policy development.

The above myhttp_client_packet_t definition could be written as follows:

(block myhttp
  ; MyHTTP client packet
  (type client_packet_t)
  (roletype object_r client_packet_t)
  (typeattributeset client_packet_type (client_packet_t))
  (typeattributeset packet_type (client_packet_t))

  (macro sendrecv_client_packet ((type domain))
    (typeattributeset cil_gen_require domain)
    (allow domain client_packet_t (packet (send recv)))
  )
)

The other module looks as follows:

(block mozilla
  (typeattributeset cil_gen_require mozilla_t)
  (call myhttp.sendrecv_client_packet (mozilla_t))
)

The result is similar, but not fully the same. The packet is no longer called myhttp_client_packet_t but myhttp.client_packet_t. In other words, a period (.) is used to separate the object name (myhttp) and the object/type (client_packet_t) as well as interface/macro (sendrecv_client_packet):

~$ sesearch -s mozilla_t -c packet -p send -Ad
  allow mozilla_t myhttp.client_packet_t : packet { send recv };

And it looks like namespace support goes even further than that, but I still need to learn more about it first.

Still, I find this a good evolution. With CIL, interfaces are no longer separate from the module definition: everything is inside the CIL file. I secretly hope that tools such as seinfo will support querying macros as well.

Posts for Tuesday, August 25, 2015


Slowly converting from GuideXML to HTML

Gentoo has removed its support of the older GuideXML format in favor of using the Gentoo Wiki and a new content management system for the main site (or is it static pages, I don't have the faintest idea to be honest). I do still have a few GuideXML pages in my development space, which I am going to move to HTML pretty soon.

In order to do so, I make use of the guidexml2wiki stylesheet I developed. But instead of migrating it to wiki syntax, I want to end with HTML.

So what I do is first convert the file from GuideXML to MediaWiki with xsltproc.

Next, I use pandoc to convert this to restructured text. The idea is that the main pages on my devpage are now restructured text based. I was hoping to use markdown, but the conversion from markdown to HTML is not what I hoped it was.

The restructured text is then converted to HTML using rst2html (part of docutils). In the end, I use the following function (for the one-time conversion):

# Convert GuideXML to RestructuredText and to HTML
gxml2html() {
  basefile=${1%%.xml};

  # Convert to Mediawiki syntax
  xsltproc ~/dev-cvs/gentoo/xml/htdocs/xsl/guidexml2wiki.xsl $1 > ${basefile}.mediawiki

  if [ -f ${basefile}.mediawiki ] ; then
    # Convert to restructured text
    pandoc -f mediawiki -t rst -s -S -o ${basefile}.rst ${basefile}.mediawiki;
  fi

  if [ -f ${basefile}.rst ] ; then
    # Convert to HTML with rst2html.py (docutils)
    # Use your own stylesheet links (use full https URLs for this)
    rst2html.py --stylesheet=link-to-bootstrap.min.css,link-to-tyrian.min.css --link-stylesheet ${basefile}.rst ${basefile}.html;
  fi
}

Is it perfect? No, but it works.

Posts for Saturday, August 22, 2015


Making the case for multi-instance support

With the high attention that technologies such as Docker, Rocket and the like get (I recommend looking at Bocker by Peter Wilmott as well ;-), I still find it important that technologies are well capable of supporting a multi-instance environment.

Being able to run multiple instances makes for great consolidation. The system can be optimized for the technology, access to the system limited to the admins of said technology while still providing isolation between instances. For some technologies, running on commodity hardware just doesn't cut it (not all software is written for such hardware platforms) and consolidation allows for reducing (hardware/licensing) costs.

Examples of multi-instance technologies

A first example that I'm pretty familiar with is multi-instance database deployments: Oracle DBs, SQL Servers, PostgreSQLs, etc. The consolidation of databases while still keeping multiple instances around (instead of consolidating into a single instance itself) is mainly done for operational reasons (changes should not influence other databases/schemas) or technical reasons (different requirements in parameters, locales, etc.)

Other examples are web servers (for web hosting companies), which next to virtual host support (which is still part of a single instance) could benefit from multi-instance deployments for security reasons (vulnerabilities might be better contained that way) as well as performance tuning. The same goes for web application servers (such as Tomcat deployments).

But even other technologies like mail servers can benefit from multiple instance deployments. Postfix has a nice guide on multi-instance deployments and also covers some of the use cases for it.

Advantages of multi-instance setups

The primary objective that most organizations have when dealing with multiple instances is consolidation to reduce cost. Especially expensive, proprietary software which is CPU-licensed gains a lot from consolidation (and don't think a CPU is a CPU: each company has its own core weight table to get the most money out of its customers).

But beyond cost savings, using multi-instance deployments also provides for resource sharing. A high-end server can be used to host the multiple instances, with for instance SSD disks (or even flash cards), more memory, high-end CPUs, high-speed network connectivity and more. This improves performance considerably, because most multi-instance technologies don't need all resources continuously.

Another advantage, if properly designed, is that multi-instance capable software can often leverage the multi-instance deployment for fast changes. A database might be easily patched (to remove vulnerabilities) by creating a second codebase deployment, patching that codebase, and then migrating the database from one instance to another. Although this often still requires downtime, the downtime can be made considerably shorter, and rolling back such changes is very easy.

A last advantage that I see is security. Instances can run as different runtime accounts, in different SELinux contexts, bound to different interfaces or chrooted into different locations. This is not an advantage compared to dedicated systems of course, but more an advantage compared to full consolidation (everything in a single instance).

Don't always focus on multi-instance setups though

Multiple instances aren't a silver bullet. Some technologies are generally much better when there is a single instance on a single operating system. Personally, I find that such technologies should know better. If they are really designed to be suboptimal in case of multi-instance deployments, then there is a design error.

But when the advantages of multiple instances do not exist (no license cost, hardware cost is low, etc.) then organizations might focus on single-instance deployments, because

  • multi-instance deployments might require more users to access the system (especially when it is multi-tenant)
  • operational activities might impact other instances (for instance updating kernel parameters for one instance requires a reboot which affects other instances)
  • the software might not be properly "multi-instance aware" and as such starts fighting for resources with its own sibling instances

Given that properly designed architectures are well capable of using virtualization (and in the future containerization), moving towards single-instance deployments becomes more and more interesting.

What should multi-instance software consider?

Software should, imo, always consider multi-instance deployments. Even when the administrator decides to stick with a single instance, all it takes is for the software to end up in a "single instance" setup (it is much easier to support multiple instances and deploy a single one than to support single instances and deploy multiple ones).

The first thing software should take into account is that it might (and will) run with different runtime accounts - service accounts if you wish. That means that the software should be well aware that file locations are separate, and that these locations will have different access control settings on them (if not just a different owner).

So instead of using /etc/foo as the mandatory location, consider supporting /etc/foo/instance1, /etc/foo/instance2 if full directories are needed, or just have /etc/foo1.conf and /etc/foo2.conf. I prefer the directory approach, because it makes management much easier. It then also makes sense that the log location is /var/log/foo/instance1, the data files are at /var/lib/foo/instance1, etc.

The second is that, if a service is network-facing (which most of them are), it must either be able to use multihomed systems easily (bind to different interfaces) or use different ports. The latter is a challenge I often come across with software - figuring out how to configure the software to deal with multiple deployments and multiple ports is often a lengthy trial-and-error exercise.

What's so difficult about using a base port setting and documenting how the other ports are derived from it? Neo4J needs 3 ports for its enterprise services (transactions, cluster management and online backup), but they all need to be explicitly configured if you want a multi-instance deployment. What if one could just set baseport = 5001, with the software automatically selecting 5002 and 5003 as the other ports (or 6001 and 7001)? If the software needs another port in the future, there is no need to update the configuration (assuming the administrator leaves sufficient room).
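
As a sketch of the idea (the port assignments below are hypothetical and not tied to Neo4J or any other product), deriving the ports could be as simple as:

# Hypothetical derivation of instance ports from a single base port
BASEPORT=5001
TX_PORT=${BASEPORT}               # transactions
CLUSTER_PORT=$((BASEPORT + 1))    # cluster management
BACKUP_PORT=$((BASEPORT + 2))     # online backup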

Also consider the service scripts (/etc/init.d) or similar (depending on the init system used). Don't provide a single script which only deals with one instance. Instead, consider supporting symlinked service scripts which automatically derive the right configuration from their name.

For instance, a service script called pgsql-inst1 which is a symlink to /etc/init.d/postgresql could then look for its configuration in /var/lib/postgresql/pgsql-inst1 (or /etc/postgresql/pgsql-inst1).
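
A rough sketch of what such a script could start with (this is not an actual Gentoo init script, just the name-derivation idea):

#!/bin/sh
# Derive the instance name from how the script was invoked,
# e.g. through the symlink /etc/init.d/pgsql-inst1 -> /etc/init.d/postgresql
INSTANCE="$(basename "$0")"                  # pgsql-inst1
CONFDIR="/etc/postgresql/${INSTANCE}"        # instance-specific configuration
DATADIR="/var/lib/postgresql/${INSTANCE}"    # instance-specific data location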

Just like supporting .d directories, I consider multi-instance support an important non-functional requirement for software.

Posts for Wednesday, August 19, 2015


Switching OpenSSH to ed25519 keys

With Mike's news item on OpenSSH's deprecation of the DSA algorithm for public key authentication, I started switching the few keys I still had using DSA to the suggested Ed25519 algorithm. Of course, I wouldn't be a security-interested party if I did not do some additional investigation into the DSA versus Ed25519 discussion.

The issue with DSA

You might find DSA a bit slower than RSA:

~$ openssl speed rsa1024 rsa2048 dsa1024 dsa2048
                  sign    verify    sign/s verify/s
rsa 1024 bits 0.000127s 0.000009s   7874.0 111147.6
rsa 2048 bits 0.000959s 0.000029s   1042.9  33956.0
                  sign    verify    sign/s verify/s
dsa 1024 bits 0.000098s 0.000103s  10213.9   9702.8
dsa 2048 bits 0.000293s 0.000339s   3407.9   2947.0

As you can see, RSA outperforms DSA in verification, while DSA outperforms RSA in signing. But as far as OpenSSH is concerned, this speed difference should not be noticeable on the vast majority of OpenSSH servers.

So no, it is not the speed, but the secure state of the DSS standard.

The OpenSSH developers find that ssh-dss (DSA) is too weak, a position backed by various sources. Considering the impact of these keys, it is important that they follow the state of the art in cryptography.

Instead, they suggest switching to elliptic curve cryptography based algorithms, with Ed25519 and Curve25519 coming out on top.

Switch to RSA or ED25519?

Given that RSA is still considered very secure, one of the questions is of course whether Ed25519 is the right choice here or not. I don't consider myself an expert in cryptography, but I do like to validate things through academic and (hopefully) reputable sources of information (not that I don't trust the OpenSSH and OpenSSL folks, but more out of a broader interest in the subject).

Ed25519, written in full as Ed25519-SHA-512, is a signature algorithm. It uses elliptic curve cryptography, as explained on the EdDSA Wikipedia page. An often cited paper is Fast and compact elliptic-curve cryptography by Mike Hamburg, which talks about the performance improvements, but the main paper is High-speed high-security signatures, which introduces the Ed25519 implementation.

Of the references I was able to (quickly) go through (not all papers are publicly accessible), none showed any concerns about the security of the algorithm.

The (simple) process of switching

Switching to Ed25519 is simple. First, generate the (new) SSH key (below just an example run):

~$ ssh-keygen -t ed25519
Generating public/private ed25519 key pair.
Enter file in which to save the key (/home/testuser/.ssh/id_ed25519): 
Enter passphrase (empty for no passphrase): 
Enter same passphrase again: 
Your identification has been saved in /home/testuser/.ssh/id_ed25519.
Your public key has been saved in /home/testuser/.ssh/
The key fingerprint is:
SHA256:RDaEw3tNAKBGMJ2S4wmN+6P3yDYIE+v90Hfzz/0r73M testuser@testserver
The key's randomart image is:
+--[ED25519 256]--+
|o*...o.+*.       |
|*o+.  +o ..      |
|o++    o.o       |
|o+    ... .      |
| +     .S        |
|+ o .            |
|o+.o . . o       |
|oo+o. . . o ....E|
| oooo.     ..o+=*|

Then, make sure that the ~/.ssh/authorized_keys file on the server contains the newly generated public key. Don't remove the other keys yet until the communication is validated. For me, all I had to do was update the file in the Salt repository and have the master push the changes to all nodes (starting with non-production first of course).

Next, try to log on to the system using the Ed25519 key:

~$ ssh -i ~/.ssh/id_ed25519 testuser@testserver

Make sure that your SSH agent is not running, as it might otherwise fall back to another key if the Ed25519 one does not work. You can validate that the connection used Ed25519 through the auth.log file:

~$ sudo tail -f auth.log
Aug 17 21:20:48 localhost sshd[13962]: Accepted publickey for root from \ port 43152 ssh2: ED25519 SHA256:-------redacted----------------

If this communication succeeds, then you can remove the old key from the ~/.ssh/authorized_keys files.

On the client level, you might want to hide ~/.ssh/id_dsa from the SSH agent:

# Obsolete - keychain ~/.ssh/id_dsa
keychain ~/.ssh/id_ed25519

If a server update was forgotten, then the authentication will fail and, depending on the configuration, either fall back to the regular authentication methods or fail immediately. This is a nice heads-up to update that server, while keeping the old key handy just in case. Just refer to the old id_dsa key during the authentication and fix up the server.
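
In that case, explicitly pointing the client to the old key still gets you in so that the server can be updated (user and host names as in the earlier examples):

~$ ssh -i ~/.ssh/id_dsa testuser@testserver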

Posts for Sunday, August 16, 2015


Updates on my Pelican adventure

It's been a few weeks since I switched my blog to Pelican, a static site generator built with Python. A number of adjustments have been made since, which I'll happily talk about.

The full article view on index page

One of the features I wanted was to have my latest blog post fully readable from the front page (called the index page within Pelican). Sadly, I could not find a plugin or setting that would do this, but I did find a plugin that I can use to work around it: the summary plugin.

Enabling the plugin was a breeze. Extract the plugin sources in the plugin/ folder, and enable it in pelicanconf.py:

PLUGINS = [..., 'summary']

With this plug-in, articles can use inline comments to tell the system at which point the summary of the article stops. Usually, the summary (which is displayed on index pages) is the first paragraph (or set of paragraphs). What I do now is manually set the summary to the entire blog post for the latest post, and adjust it later when a new post comes up.

It might be some manual labour, but it fits nicely and doesn't hack around in the code too much.

Commenting with Disqus

I had some remarks that the Disqus integration is not as intuitive as expected. Some readers had difficulties finding out how to comment as a guest (without the need to log on through popular social media or through Disqus itself).

Agreed, it is not easy to see at first sight that people need to start typing their name in the "Or sign up with Disqus" field before they can select "I'd rather post as a guest". As I don't have any way of controlling the format and rendered code of Disqus, I updated the theme a bit to add two paragraphs on commenting. The first paragraph tells how to comment as a guest.

The second paragraph for now informs readers that non-verified comments are put in the moderation queue. Once I get a feeling of how the spam and bots act on the commenting system, I will adjust the filters and also allow guest comments to be readily accessible (no moderation queue). Give it a few more weeks to get myself settled and I'll adjust it.

As for the site being slowed down by the Disqus JavaScript: both Firefox (excuse me, Aurora) and Chromium only show this at the initial load. Later, the scripts are properly cached and load relatively fast (a quick test shows that all pages I tried load in less than 2 seconds - WordPress was at 4). And if you're not interested in commenting, you can even use NoScript or a similar plugin to disallow any remote JavaScript.

Still, I will continue to look at how to make commenting easier. I recently allowed unmoderated comments (unless a number of keywords are used; comments with links are also put in the moderation queue). If someone knows of another comment-like system that I could integrate, I'm happy to hear about it as well.


Tipue Search

My issue with Tipue Search has been fixed by reverting a change in the plugin where the URL was assigned to the loc key instead of the url key. It is probably a mismatch between the plugin and the theme (the change of the key was done in May in Tipue Search itself).

With this minor issue fixed, the search capabilities are back on track on my blog. Enabling it was a matter of:

PLUGINS = [..., 'tipue_search']
DIRECT_TEMPLATES = (..., 'search')

Tags and categories

WordPress supports multiple categories, but Pelican does not. So I went through the various posts that had multiple categories and decided on a single one. While doing so, I also reduced the categories to a small set:

  • Databases
  • Documentation
  • Free Software
  • Gentoo
  • Misc
  • Security
  • SELinux

I will try to properly tag all posts so that, if someone is interested in a very particular topic, such as PostgreSQL, he can reach those posts through the tag.

Posts for Thursday, August 13, 2015


Finding a good compression utility

I recently came across a wiki page written by Herman Brule which gives a quick benchmark of a couple of compression methods / algorithms. It gave me the idea of writing a quick script that tests a wide number of compression utilities available in Gentoo (usually through the app-arch category), with a number of options as well (in case multiple options are possible).

The currently supported packages are:

app-arch/bloscpack      app-arch/bzip2          app-arch/freeze
app-arch/gzip           app-arch/lha            app-arch/lrzip
app-arch/lz4            app-arch/lzip           app-arch/lzma
app-arch/lzop           app-arch/mscompress     app-arch/p7zip
app-arch/pigz           app-arch/pixz           app-arch/plzip
app-arch/pxz            app-arch/rar            app-arch/rzip
app-arch/xar            app-arch/xz-utils       app-arch/zopfli

The script should keep the best compression information: duration, compression ratio, compression command, as well as the compressed file itself.

Finding the "best" compression

It is not my intention to find the most optimal compression, as that would require heuristic optimizations (which has triggered my interest in seeking such software, or writing it myself) while trying out various optimization parameters.

No, what I want is to find the "best" compression for a given file, with "best" being either

  • most reduced size (which I call compression delta in my script)
  • best reduction obtained per time unit (which I call the efficiency)

For me personally, I think I would use it for the various raw image files that I have through my photography hobby. Those image files are difficult to compress (the Nikon D3200 I use is an entry-level camera which already applies lossy compression for its raw files), but their total size is considerable, and it would allow me to better use the storage I have available both on my laptop (which is SSD-only) and on my backup server.

But next to the best compression ratio, the efficiency is also an important metric, as it shows how efficiently the algorithm works within a certain time frame. If one compression method yields 80% reduction in 5 minutes, and another one yields 80,5% in 45 minutes, then I might prefer the first one even though it does not give the best compression overall.

Although the script could be used to get the most compression (without resorting to an optimization algorithm for the compression commands) for each file, this is definitely not the main use case. A single run can take hours for files that are compressed in a handful of seconds. But it can show the best algorithms for a particular file type (for instance, do a few runs on a couple of raw image files and see which method is most successful).

Another use case I'm currently looking into is how much improvement I can get when multiple files (all raw image files) are first grouped in a single archive (.tar). Theoretically, this should improve the compression, but by how much?
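
Testing that is simply a matter of creating the archive first and feeding it to the script (the archive name is just an example):

~$ tar -cf rawimages.tar *.nef
~$ sw_comprbest -i rawimages.tar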

How the script works

The script does not contain much intelligence. It iterates over a wide set of compression commands that I tested out, checks the final compressed file size, and if it is better than a previous one it keeps this compressed file (and its statistics).

I tried to group some of the compressions together based on the algorithm used, but as I don't really know the details of the algorithms (it's based on manual pages and internet sites) and some of them combine multiple algorithms, it is more of a high-level selection than anything else.

The script can also run the compressions of just a single application (which I use when I'm fine-tuning the parameter runs).

A run shows something like the following:

Original file (test.nef) size 20958430 bytes
      package name                                                 command      duration                   size compr.Δ effic.:
      ------------                                                 -------      --------                   ---- ------- -------
app-arch/bloscpack                                               blpk -n 4           0.1               20947097 0.00054 0.00416
app-arch/bloscpack                                               blpk -n 8           0.1               20947097 0.00054 0.00492
app-arch/bloscpack                                              blpk -n 16           0.1               20947097 0.00054 0.00492
    app-arch/bzip2                                                   bzip2           2.0               19285616 0.07982 0.03991
    app-arch/bzip2                                                bzip2 -1           2.0               19881886 0.05137 0.02543
    app-arch/bzip2                                                bzip2 -2           1.9               19673083 0.06133 0.03211
    app-arch/p7zip                                      7za -tzip -mm=PPMd           5.9               19002882 0.09331 0.01592
    app-arch/p7zip                             7za -tzip -mm=PPMd -mmem=24           5.7               19002882 0.09331 0.01640
    app-arch/p7zip                             7za -tzip -mm=PPMd -mmem=25           6.4               18871933 0.09955 0.01551
    app-arch/p7zip                             7za -tzip -mm=PPMd -mmem=26           7.7               18771632 0.10434 0.01364
    app-arch/p7zip                             7za -tzip -mm=PPMd -mmem=27           9.0               18652402 0.11003 0.01224
    app-arch/p7zip                             7za -tzip -mm=PPMd -mmem=28          10.0               18521291 0.11628 0.01161
    app-arch/p7zip                                       7za -t7z -m0=PPMd           5.7               18999088 0.09349 0.01634
    app-arch/p7zip                                7za -t7z -m0=PPMd:mem=24           5.8               18999088 0.09349 0.01617
    app-arch/p7zip                                7za -t7z -m0=PPMd:mem=25           6.5               18868478 0.09972 0.01534
    app-arch/p7zip                                7za -t7z -m0=PPMd:mem=26           7.5               18770031 0.10442 0.01387
    app-arch/p7zip                                7za -t7z -m0=PPMd:mem=27           8.6               18651294 0.11008 0.01282
    app-arch/p7zip                                7za -t7z -m0=PPMd:mem=28          10.6               18518330 0.11643 0.01100
      app-arch/rar                                                     rar           0.9               20249470 0.03383 0.03980
      app-arch/rar                                                 rar -m0           0.0               20958497 -0.00000        -0.00008
      app-arch/rar                                                 rar -m1           0.2               20243598 0.03411 0.14829
      app-arch/rar                                                 rar -m2           0.8               20252266 0.03369 0.04433
      app-arch/rar                                                 rar -m3           0.8               20249470 0.03383 0.04027
      app-arch/rar                                                 rar -m4           0.9               20248859 0.03386 0.03983
      app-arch/rar                                                 rar -m5           0.8               20248577 0.03387 0.04181
    app-arch/lrzip                                                lrzip -z          13.1               19769417 0.05673 0.00432
     app-arch/zpaq                                                    zpaq           0.2               20970029 -0.00055        -0.00252
The best compression was found with 7za -t7z -m0=PPMd:mem=28.
The compression delta obtained was 0.11643 within 10.58 seconds.
This file is now available as test.nef.7z.

In the above example, the test file was around 20 MByte. The best compression command that the script found was:

~$ 7za -t7z -m0=PPMd:mem=28 a test.nef.7z test.nef

The resulting file (test.nef.7z) is 18 MByte, a reduction of 11,64%. The compression command took almost 11 seconds to do its thing, which gave an efficiency rating of 0,011, which is definitely not a fast one.

Some other algorithms don't do badly either, with a better efficiency. For instance:

   app-arch/pbzip2                                                  pbzip2           0.6               19287402 0.07973 0.13071

In this case, the pbzip2 command got almost 8% reduction in less than a second, which is considerably more efficient than the 11-seconds long 7za run.

Want to try it out yourself?

I've pushed the script to my GitHub location. Do a quick review of the code first (to check that I did not include anything malicious) and then execute it to see how it works:

~$ sw_comprbest -h
Usage: sw_comprbest --infile=<inputfile> [--family=<family>[,...]] [--command=<cmd>]
       sw_comprbest -i <inputfile> [-f <family>[,...]] [-c <cmd>]

Supported families: blosc bwt deflate lzma ppmd zpaq. These can be provided comma-separated.
Command is an additional filter - only the tests that use this base command are run.

The output shows
  - The package (in Gentoo) that the command belongs to
  - The command run
  - The duration (in seconds)
  - The size (in bytes) of the resulting file
  - The compression delta (percentage) showing how much is reduced (higher is better)
  - The efficiency ratio showing how much reduction (percentage) per second (higher is better)

When the command supports multithreading, we use the number of available cores on the system (as told by /proc/cpuinfo).
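
Counting those cores boils down to something like the following (this is the general approach, not necessarily the exact invocation used in the script):

~$ grep -c '^processor' /proc/cpuinfo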

For instance, to try it out against a PDF file:

~$ sw_comprbest -i MEA6-Sven_Vermeulen-Research_Summary.pdf
Original file (MEA6-Sven_Vermeulen-Research_Summary.pdf) size 117763 bytes
The best compression was found with zopfli --deflate.
The compression delta obtained was 0.00982 within 0.19 seconds.
This file is now available as MEA6-Sven_Vermeulen-Research_Summary.pdf.deflate.

So in this case, the resulting file is hardly better compressed - the PDF itself is already compressed. Let's try it against the uncompressed PDF:

~$ pdftk MEA6-Sven_Vermeulen-Research_Summary.pdf output test.pdf uncompress
~$ sw_comprbest -i test.pdf
Original file (test.pdf) size 144670 bytes
The best compression was found with lrzip -z.
The compression delta obtained was 0.27739 within 0.18 seconds.
This file is now available as test.pdf.lrz.

This is somewhat better:

~$ ls -l MEA6-Sven_Vermeulen-Research_Summary.pdf* test.pdf*
-rw-r--r--. 1 swift swift 117763 Aug  7 14:32 MEA6-Sven_Vermeulen-Research_Summary.pdf
-rw-r--r--. 1 swift swift 116606 Aug  7 14:32 MEA6-Sven_Vermeulen-Research_Summary.pdf.deflate
-rw-r--r--. 1 swift swift 144670 Aug  7 14:34 test.pdf
-rw-r--r--. 1 swift swift 104540 Aug  7 14:35 test.pdf.lrz

The resulting file is 11,22% smaller than the original (already compressed) PDF.

Posts for Tuesday, August 11, 2015


Why we do confine Firefox

If you follow the SELinux development community a bit, you will know Dan Walsh, a Red Hat security engineer. Today he blogged about CVE-2015-4495 and SELinux, or why doesn't SELinux confine Firefox. He should've asked why the reference policy or the Red Hat/Fedora policy does not confine Firefox, because SELinux is, as I've mentioned before, not the same as its policy.

In effect, Gentoo's SELinux policy does confine Firefox by default. One of the principles we focus on in Gentoo Hardened is to develop desktop policies in order to reduce exposure and information leakage of user documents. We might not have the manpower to confine all desktop applications, but I do think it is worthwhile to at least attempt to do this, even though what Dan Walsh mentioned is also correct: desktops are notoriously difficult to use a mandatory access control system on.

How Gentoo wants to support more confined desktop applications

What Gentoo Hardened tries to do is to support the XDG Base Directory Specification for several documentation types. Downloads are marked as xdg_downloads_home_t, pictures are marked as xdg_pictures_home_t, etc.

With those types defined, we grant the regular user domains full access to those types, but start removing access to user content from applications. Rules such as the following are commented out or removed from the policies:

# userdom_manage_user_home_content_dirs(mozilla_t)
# userdom_manage_user_home_content_files(mozilla_t)

Instead, we add in a call to a template we have defined ourselves:

userdom_user_content_access_template(mozilla, { mozilla_t mozilla_plugin_t })

This call makes access to user content optional through SELinux booleans. For instance, for the mozilla_t domain (which is used for Firefox), the following booleans are created:

# Read generic (user_home_t) user content
mozilla_read_generic_user_content       ->      true

# Read all user content
mozilla_read_all_user_content           ->      false

# Manage generic (user_home_t) user content
mozilla_manage_generic_user_content     ->      false

# Manage all user content
mozilla_manage_all_user_content         ->      false

As you can see, the default setting is that Firefox can read user content, but only non-specific types. So ssh_home_t, which is used for the SSH related files, is not readable by Firefox with our policy by default.

By changing these booleans, the policy is fine-tuned to the requirements of the administrator. On my systems, mozilla_read_generic_user_content is switched off.
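
For those who want the same on their system, toggling such a boolean persistently is a one-liner (the boolean name is the one defined by our policy):

~# setsebool -P mozilla_read_generic_user_content off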

You might ask how we can then still support a browser if it cannot access user content to upload or download. Well, as mentioned before, we support the XDG types. The browser is allowed to manage xdg_download_home_t files and directories. For the majority of cases, this is sufficient. I also don't mind copying over files to the ~/Downloads directory just for uploading files. But I am well aware that this is not what the majority of users would want, which is why the default is as it is.

There is much more work to be done sadly

As said earlier, the default policy will allow reading of user files if those files are not typed specifically. Types that are protected by our policy (but not by the reference policy standard) include SSH related files at ~/.ssh and GnuPG files at ~/.gnupg. Even other configuration files, such as my Mutt configuration (~/.muttrc), which contains a password for an IMAP server I connect to, are not reachable.

However, it is still far from perfect. One of the reasons is that many desktop applications have not been "converted" yet to our desktop policy approach. Chromium is already converted, and policies we've added, such as for Skype, also do not allow direct access unless the user explicitly enables it. But Evolution, for instance, isn't yet.

Converting desktop policies to a more strict setup requires lots of testing, which translates to many human resources. Within Gentoo, only a few developers and contributors are working on policies, and considering that this is not a change that is already part of the (upstream) reference policy, some contributors do not want to put a lot of focus on it either. But without having done the work, it will not be easy (nor probably acceptable) to upstream this (the XDG patch has been submitted a few times already but wasn't deemed ready yet at the time).

Having a more restrictive policy isn't the end

As the blog post of Dan rightly mentions, there are still quite some other ways of accessing information that we might want to protect. An application might not have access to user files, but might be able to communicate (for instance through DBus) with an application that does, and through that instruct it to pass on the data.

Plugins might require permissions which do not match the principles set up earlier. When we tried out Google Talk (needed for proper Google Hangouts support), we noticed that it requires many, many more privileges. Luckily, we were able to write and develop a policy for the Google Talk plugin (googletalk_plugin_t) so it is still properly confined. But this is just a single plugin, and I'm sure more plugins exist which will have similar requirements. Which leads to more policy development.

But having workarounds does not make our effort worthless. Being able to work around a firewall through application data does not make the firewall useless; it is just one of many security layers. The same is true for SELinux policies.

I am glad that we at least try to confine desktop applications more, and that Gentoo Hardened users who use SELinux are at least somewhat more protected from the vulnerability (even with the default case) and that our investment for this is sound.

Posts for Sunday, August 9, 2015


Can SELinux substitute DAC?

A nice twitter discussion with Erling Hellenäs caught my full attention later when I was heading home: Can SELinux substitute DAC? I know it can't and doesn't in the current implementation, but why not and what would be needed?

SELinux is implemented through the Linux Security Modules framework, which allows different security systems to be implemented and integrated in the Linux kernel. Through LSM, various security-sensitive operations can be secured further through additional access checks. This approach was chosen to keep LSM as minimally invasive as possible.

The LSM design

The basic LSM design paper, called Linux Security Modules: General Security Support for the Linux Kernel and presented in 2002, is still one of the better references for learning and understanding LSM. It shows that there was a wish from the community that LSM hooks could override DAC checks, and that this has been partially implemented through permissive hooks (not to be mistaken with SELinux' permissive mode).

However, this is only partially implemented, because there are quite a few restrictions. One of them is that, if a request is made towards a resource and the UIDs match (see page 3, figure 2 of the paper), then the LSM hook is not consulted. When they don't match, a permissive LSM hook can be implemented. Support for permissive hooks is implemented for capabilities, a powerful DAC control that Linux supports and which is implemented through LSM as well. I have blogged about this nice feature a while ago.

These restrictions are also why some other security-conscious developers, such as grsecurity's team and RSBAC, do not use the LSM system. Well, it's not only because of these restrictions of course - other reasons play a role as well. But knowing what LSM can (and cannot) do also shows what SELinux can and cannot do.

The LSM design itself is already a reason why SELinux cannot substitute DAC controls. But perhaps we could disable DAC completely and thus only rely on SELinux?

Disabling DAC in Linux would be an excessive workload

The discretionary access controls in the Linux kernel are not easy to remove. They are often part of the code itself (just grep through the source code for -EPERM). Some subsystems which use a common standard approach (such as VFS operations) can rely on well-integrated security controls, but these too often allow the operation if DAC allows it, and will only consult the LSM hooks otherwise.

VFS operations are the best known ones, but DAC controls go beyond file access. They also cover reading program memory, sending signals to applications, accessing hardware and more. But let's focus on the easier controls (as in, easier to use as examples), such as sharing files between users, restricting access to personal documents and authorizing operations in applications based on the user id (for instance, the owner can modify a file while other users can only read it).

We could "work around" the Linux DAC controls by running everything as a single user (the root user) and having all files and resources be fully accessible by this user. But the problem with that is that SELinux would not be able to take over controls either, because you will need some user-based access controls, and within SELinux this implies that a mapping is done from a user to a SELinux user. Also, access controls based on the user id would no longer work, and unless the application is made SELinux-aware it would lack any authorization system (or would need to implement it itself).

With DAC, Linux also provides quite some "freedom" which is well established in the Linux (and Unix) world: a simple security model where user and group membership are validated against the owner, group and "other" privileges. Note that SELinux does not really know what a "group" is. It knows SELinux users, roles, types and sensitivities.

So, suppose we keep multi-user support in Linux but completely remove the DAC controls and rely solely on LSM (and SELinux). Is this something workable?

Using SELinux for DAC-alike rules

Consider the use case of two users. One user wants another user to read a few of his files. With DAC controls, he can "open up" the necessary resources (files and directories) through extended access control lists so that the other user can access them. No need to involve administrators.
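
As a quick sketch of that DAC-only workflow (user and file names are made up), granting read access through POSIX ACLs looks like this:

~$ setfacl -m u:alice:rx ~/Documents/shared
~$ setfacl -m u:alice:r ~/Documents/shared/report.odt
# (alice also needs search (x) access on the parent directories to reach these paths)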

With a MAC(-only) system, updates on the MAC policy usually require the security administrator to write additional policy rules to allow something. With SELinux (and without DAC) it would require the users to be somewhat isolated from each other (otherwise the users can just access everything from each other), which SELinux can do through User Based Access Control, but the target resource itself should be labeled with a type that is not managed through the UBAC control. Which means that the users will need the privilege to change labels to this type (which is possible!), assuming such a type is already made available for them. Users can't create new types themselves.

UBAC is by default disabled in many distributions because it has some nasty side-effects that need to be taken into consideration. Just recently one of these came up on the refpolicy mailing list. But even with UBAC enabled (I have it enabled on most of my systems, but then I only have a couple of users to manage and am the administrator on these systems, so I can quickly "update" rules when necessary) it does not provide functionality equal to DAC controls.

As mentioned before, SELinux does not know group membership. In order to create something group-like, we would probably need to consider roles. But in SELinux, roles are used to define what types can be transitioned to - it is not a membership approach. A type which is usable by two roles (for instance, the mozilla_t type, which is allowed for staff_r and user_r) does not care about the role. This is unlike group membership.

Also, roles only focus on transitionable types (known as domains). They do not care about accessible resources (regular file types, for instance). In order to allow one person to read a certain file type but not another person, SELinux needs to ensure that the one person can read this file through a particular domain while the other user can't. And given that domains are part of the SELinux policy, any situation that the policy has not thought about before will not be easily adaptable.

So, we can't do it?

Well, I'm pretty sure that a very extensive policy and set of rules can be made for SELinux which would make a number of DAC permissions obsolete, and that we could theoretically remove DAC from the Linux kernel.

End users would require a huge training to work with this system, and it would not be reusable across other systems in different environments, because the policy will be too specific to the system (unlike the current reference policy based ones, which are quite reusable across many distributions).

Furthermore, the effort to create these policies would be extremely high, whereas the DAC permissions are very simple to implement, and have been proven to be well suitable for many secured systems.

So no, unless you do massive engineering, I do not believe it is possible to substitute DAC with SELinux-only controls.

Posts for Friday, August 7, 2015


Filtering network access per application

Iptables (and its successor nftables) is a powerful packet filtering system in the Linux kernel, able to create advanced firewall capabilities. One of the features that it cannot provide is per-application filtering. Together with SELinux, however, it is possible to implement this on a per-domain basis.

SELinux does not know applications, but it knows domains. If we ensure that each application runs in its own domain, then we can leverage the firewall capabilities together with SELinux to only grant those domains the access they need.

SELinux network control: packet types

The basic network control we need to enable is SELinux' packet types. Most default policies will grant application domains the right set of packet types:

~# sesearch -s mozilla_t -c packet -A
Found 13 semantic av rules:
   allow mozilla_t ipp_client_packet_t : packet { send recv } ; 
   allow mozilla_t soundd_client_packet_t : packet { send recv } ; 
   allow nsswitch_domain dns_client_packet_t : packet { send recv } ; 
   allow mozilla_t speech_client_packet_t : packet { send recv } ; 
   allow mozilla_t ftp_client_packet_t : packet { send recv } ; 
   allow mozilla_t http_client_packet_t : packet { send recv } ; 
   allow mozilla_t tor_client_packet_t : packet { send recv } ; 
   allow mozilla_t squid_client_packet_t : packet { send recv } ; 
   allow mozilla_t http_cache_client_packet_t : packet { send recv } ; 
 DT allow mozilla_t server_packet_type : packet recv ; [ mozilla_bind_all_unreserved_ports ]
 DT allow mozilla_t server_packet_type : packet send ; [ mozilla_bind_all_unreserved_ports ]
 DT allow nsswitch_domain ldap_client_packet_t : packet recv ; [ authlogin_nsswitch_use_ldap ]
 DT allow nsswitch_domain ldap_client_packet_t : packet send ; [ authlogin_nsswitch_use_ldap ]

As we can see, the mozilla_t domain is able to send and receive packets of type ipp_client_packet_t, soundd_client_packet_t, dns_client_packet_t, speech_client_packet_t, ftp_client_packet_t, http_client_packet_t, tor_client_packet_t, squid_client_packet_t and http_cache_client_packet_t. If the SELinux booleans mentioned at the end are enabled, additional packet types are allowed to be used as well.

But even with this default policy in place, SELinux is not being consulted for filtering. To accomplish this, iptables will need to be told to label the incoming and outgoing packets. This is the SECMARK functionality that I've blogged about earlier.

Enabling SECMARK filtering through iptables

To enable SECMARK filtering, we use the iptables command and tell it to label SSH incoming and outgoing packets as ssh_server_packet_t:

~# iptables -t mangle -A INPUT -m state --state ESTABLISHED,RELATED -j CONNSECMARK --restore
~# iptables -t mangle -A INPUT -p tcp --dport 22 -j SECMARK --selctx system_u:object_r:ssh_server_packet_t:s0
~# iptables -t mangle -A OUTPUT -m state --state ESTABLISHED,RELATED -j CONNSECMARK --restore
~# iptables -t mangle -A OUTPUT -p tcp --sport 22 -j SECMARK --selctx system_u:object_r:ssh_server_packet_t:s0

But be warned: the moment iptables starts with its SECMARK support, all packets will be labeled. Those that are not explicitly labeled through one of the above commands will be labeled with the unlabeled_t type, and most domains are not allowed any access to unlabeled_t.

There are two things we can do to improve this situation:

  1. Define the necessary SECMARK rules for all supported ports (which is something that secmarkgen does), and/or
  2. Allow unlabeled_t for all domains.

To allow the latter, we can load a SELinux rule like the following:

(allow domain unlabeled_t (packet (send recv)))

This will allow all domains to send and receive packets of the unlabeled_t type. Although this might be security-sensitive, it can be a good idea to allow it at the start, together with proper auditing (you can use (auditallow ...) to audit all granted packet communication), so that the right set of packet types can be identified. This way, administrators can iteratively improve the SECMARK rules and finally remove the unlabeled_t privilege from the domain attribute.
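
Such a rule can be saved in a CIL file (the file name below is just an example) and loaded the same way as any other CIL module:

~# semodule -i allowunlabeled.cil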

To list the current SECMARK rules, list the firewall rules for the mangle table:

~# iptables -t mangle -nvL

Only granting one application network access

These two together allow for creating a firewall that only allows a single domain access to a particular target.

For instance, suppose that we only want the mozilla_t domain to connect to the company proxy. We can't enable the http_client_packet_t type for this connection, as all other web browsers and other HTTP-aware applications will have policy rules enabled to send and receive that packet type. Instead, we are going to create a new packet type to use.

;; Definition of myhttp_client_packet_t
(type myhttp_client_packet_t)
(roletype object_r myhttp_client_packet_t)
(typeattributeset client_packet_type (myhttp_client_packet_t))
(typeattributeset packet_type (myhttp_client_packet_t))

;; Grant the use to mozilla_t
(typeattributeset cil_gen_require mozilla_t)
(allow mozilla_t myhttp_client_packet_t (packet (send recv)))

Putting the above in a myhttppacket.cil file and loading it allows the type to be used:

~# semodule -i myhttppacket.cil

Now, the myhttp_client_packet_t type can be used in iptables rules. Also, only the mozilla_t domain is allowed to send and receive these packets, effectively creating an application-based firewall. All we now need to do is mark the outgoing packets towards the proxy as myhttp_client_packet_t:

~# iptables -t mangle -A OUTPUT -p tcp --dport 80 -d -j SECMARK --selctx system_u:object_r:myhttp_client_packet_t:s0

This shows that it is possible to create such firewall rules with SELinux. It is however not an out-of-the-box solution, requiring thought and development of both firewall rules and SELinux code constructions. Still, with some advanced scripting experience this will lead to a powerful addition to a hardened system.

Posts for Wednesday, August 5, 2015


My application base: Obnam

It is often said, yet too often forgotten: taking backups (and verifying that they work). Taking backups is not purely for companies and organizations. Individuals should also take backups to ensure that, in case of errors or calamities, the all-important files are readily recoverable.

For backing up files and directories, I personally use obnam, after playing around with Bacula and attic. Bacula is more meant for large distributed environments (although I also tend to use obnam for my server infrastructure) and was too complex for my taste. The choice between obnam and attic is even more personally-oriented.

I found attic to be faster, but with a small supporting community. Obnam was slower, but seems to have a more active community which I find important for infrastructure that is meant to live quite long (you don't want to switch backup solutions every year). I also found it pretty easy to work with, and to restore files back, and Gentoo provides the app-backup/obnam package.

I think both are decent solutions, so I had to make one choice and ended up with obnam. So, how does it work?

Configuring what to backup

The basic configuration file for obnam is /etc/obnam.conf. In this file, I declare which directories need to be backed up, as well as which subdirectories or files (through expressions) can be left alone. For instance, I don't want obnam to back up ISO files, as those can simply be downloaded again.

repository = /srv/backup
root = /root, /etc, /var/lib/portage, /srv/virt/gentoo, /home
exclude = \.img$, \.iso$, /home/[^/]*/Development/Centralized/.*
exclude-caches = yes

keep = 8h,14d,10w,12m,10y

The root parameter tells obnam which directories (and subdirectories) to back up. With exclude a particular set of files or directories can be excluded, for instance because these contain downloaded resources (and as such do not need to be inside the backup archives).

Obnam also supports the CACHEDIR.TAG specification, which I use for the various cache directories. With the use of these cache tag files I do not need to update the obnam.conf file with every new cache directory (or software build directory).

The last parameter in the configuration that I want to focus on is the keep parameter. Every time obnam takes a backup, it creates what it calls a new generation. When the backup storage becomes too big, administrators can run obnam forget to drop generations. The keep parameter informs obnam which generations can be removed and which ones can be kept.

In my case, I want to keep one backup per hour for the last 8 hours (I normally take one backup per day, but during some development sprees or photo manipulations I back up multiple times), one per day for the last two weeks, one per week for the last 10 weeks, one per month for the last 12 months and one per year for the last 10 years.

Obnam only cleans up when obnam forget is executed. As storage is cheap and the performance of obnam is sufficient for me, I do not need to run this very often.
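
When I do, the keep policy from the configuration file is picked up automatically, so the invocation is simply:

~# obnam forget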

Backing up and restoring files

My backup strategy is to back up to an external disk, and then synchronize this disk with a personal backup server somewhere else. This backup server runs no other software beyond OpenSSH (to allow secure transfer of the backups), and both the backup server disks and the external disk are LUKS encrypted. Considering that I don't have government secrets, I opted not to encrypt the backup files themselves, but Obnam does support that (through GnuPG).
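
In practice, the whole flow is driven by a couple of cron jobs per system; a minimal sketch of such a schedule (times and host name are made up, the repository path matches the configuration above):

# Nightly backup, followed a bit later by synchronization to the backup server
30 1 * * *  obnam backup
30 3 * * *  rsync -a --delete /srv/backup/ backup@backupserver:/srv/backup/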

As the sketch above suggests, all backup-enabled systems use cron jobs which execute obnam backup to take the backup, and rsync to synchronize the finished backup with the backup server. If I need to restore a file, I use obnam ls to see which file(s) I need to restore (add a --generation= option to list the files of a different backup generation than the last one).

Then, the command to restore is:

~# obnam restore --to=/var/restore /home/swift/Images/Processing/*.NCF

Or I can restore immediately to the directory again:

~# obnam restore --to=/home/swift/Images/Processing /home/swift/Images/Processing/*.NCF

To support multiple clients, obnam by default identifies each client through the hostname. It is possible to use different names, but hostnames tend to be a common best practice which I don't deviate from either. Obnam is able to share blocks between clients (it is not mandatory, but supported nonetheless).

Posts for Sunday, August 2, 2015


Don't confuse SELinux with its policy

With the increased attention that SELinux is getting thanks to its inclusion in recent Android releases, more and more people are understanding that SELinux is not a singular security solution. Many administrators are still disabling SELinux on their servers because it does not play well with their day-to-day operations. But the Android inclusion shows that SELinux itself is not the culprit for this: it is the policy.

Policy versus enforcement

SELinux has conceptually segregated the enforcement from the rules/policy. There is an in-kernel enforcement (the SELinux subsystem) which is configured through an administrator-provided policy (the SELinux rules). As long as SELinux was being used on servers, chances are very high that the policy that is being used is based on the SELinux Reference Policy as this is, as far as I know, the only policy implementation for Linux systems that is widely usable.

The reference policy project aims to provide a well designed, broadly usable yet still secure set of rules. And through this goal, it has to play ball with all possible use cases that the various software titles require. Given the open ecosystem of the free software world, and the Linux based ones in particular, managing such a policy is not for beginners. New policy development requires insight in the technology for which the policy is created, as well as knowledge of how the reference policy works.

Compare this to the Android environment. Applications have to follow more rigid guidelines before they are accepted on Android systems. Communication between applications and services is governed through Intents and Activities which are managed by the Binder mechanism. Interactions with the user are based on well defined interfaces. Heck, the Android OS even holds a number of permissions that applications have to subscribe to before they can use them.

Such an environment is much easier to create policies for, because it allows policies to be created almost on-the-fly, with the application permissions being mapped to predefined SELinux rules. Because the freedom of implementations is limited (in order to create a manageable environment which is used by millions of devices over the world) policies can be made more strictly and yet enjoy the static nature of the environment: no continuous updates on existing policies, something that Linux distributions have to do on an almost daily basis.

Aiming for a policy development ecosystem

Having SELinux active on Android shows that one should not confuse SELinux with its policies. SELinux is a nice security subsystem in the Linux kernel, and can be used and tuned to cover whatever use case is given to it. The slow adoption of SELinux by Linux distributions might be attributed to its lack of policy diversification, which results in few ecosystems where additional (and perhaps innovative) policies could be developed.

It is however a huge advantage that a reference policy exists, so that distributions can enjoy a working policy without having to put resources into their own policy development and maintenance. Perhaps we should try to further enhance the existing policies while supporting new policy ecosystems and development initiatives.

The maturation of the CIL language in the SELinux userland libraries and tools might be a good catalyst for this. At some point, policies will need to be migrated to CIL (although this can happen gradually, as the userland utilities can deal with CIL and other languages, such as the legacy .pp files, simultaneously) and there are a few developers considering a renewal of the reference policy. This would make use of the new benefits of the CIL language and implementation: some restrictions that were applicable to the legacy format no longer hold with CIL, such as rules which previously were only allowed in the base policy and which can now be made part of the modules as well.

But next to renewing existing policies, there is plenty of room left for innovative policy ideas and developments. The SELinux language is very versatile, and just like with programming languages we notice that only a small set of constructs is actually used. Some applications might even benefit from using SELinux as their decision and enforcement system (something that SEPostgreSQL has tried).

The SELinux Notebook by Richard Haines is an excellent resource for developers who want to work more closely with the SELinux language constructs. Just skimming through this resource also shows how open SELinux itself is, and that most users' experience with SELinux is based on a single policy implementation. This is a prime reason why having a more open policy ecosystem makes perfect sense.

If you don't like a particular car, do you ditch driving at all? No, you try out another car. Let's create other cars in the SELinux world as well.


Switching to Pelican

Nothing beats a few hours of flying to get things moving. Being offline for a few hours with a good workstation helps to not be disturbed by external distractions (air pockets notwithstanding).

Early this year, I expressed my intention to move from WordPress to Pelican. I wasn't actually unhappy with WordPress, but the security concerns I had were a bit too much for a blog as simple as mine. Running a PHP-enabled site with a database for something that I can easily handle through a static site - well, I had to try.

Today I finally moved the blog, and imported all past articles as well as the comments. For the commenting, I now use Disqus, which integrates nicely with Pelican and has a fluid feel to it. I wanted to use the Tipue Search plug-in as well for searching through the blog, but I had to put that on hold as I couldn't get the results of a search to display nicely (all I got were links to "undefined"). But I'll work on this.

Configuring Pelican

Pelican configuration is done through pelicanconf.py and publishconf.py. The former contains all definitions and settings for the site which are also useful when previewing changes. The latter contains additional (or overruled) settings related to publication.

In order to keep the same links as before (to keep web crawlers happy, as well as links to the blog from other sites and even the comments themselves), I did have to update some variables, but the Internet was strong on this one and I had few problems finding the right settings:

# Link structure of the site
ARTICLE_URL = u'{date:%Y}/{date:%m}/{slug}/'
ARTICLE_SAVE_AS = u'{date:%Y}/{date:%m}/{slug}/index.html'
CATEGORY_URL = u'category/{slug}'
CATEGORY_SAVE_AS = u'category/{slug}/index.html'
TAG_URL = u'tag/{slug}/'
TAG_SAVE_AS = u'tag/{slug}/index.html'

The next challenge was (and still is; I will have to verify soon whether this is working by checking the blog aggregation sites I am usually aggregated on) the RSS and Atom feeds. From the access logs of my previous blog, I believe that most of the aggregation sites are using the /feed/, /feed/atom and /category/*/feed links.

Now, I would like to move the aggregations to XML files, so that the RSS feed is available at /feed/rss.xml and the Atom feed at /feed/atom.xml, but then the existing aggregations would most likely fail because they currently don't use these URLs. To fix this, I am now trying to generate the XML files as I would like them to be, and create symbolic links afterwards from index.html to the right XML file.
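
A minimal sketch of what I have in mind for the main feed (assuming Pelican's default output directory; the category feeds would get the same treatment):

~$ cd output/feed
~$ ln -sf rss.xml index.html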

The RSS/ATOM settings I am currently using are as follows:

CATEGORY_FEED_ATOM = 'category/%s/feed/atom.xml'
CATEGORY_FEED_RSS = 'category/%s/feed/rss.xml'
FEED_ATOM = 'feed/atom.xml'
FEED_ALL_ATOM = 'feed/all.atom.xml'
FEED_RSS = 'feed/rss.xml'
FEED_ALL_RSS = 'feed/all.rss.xml'
TAG_FEED_ATOM = 'tag/%s/feed/atom.xml'
TAG_FEED_RSS = 'tag/%s/feed/rss.xml'

Hopefully, the existing aggregations still work, and I can then start asking the planets to move to the XML URL itself. Some tracking on the access logs should allow me to see how well this is going.

Next steps

The first things to make sure are working correctly are the blog aggregation and the comment system. Then, a few tweaks are still in the pipeline.

One is to optimize the front page a bit. Right now, all articles are summarized, and I would like to have the last (or last few) article(s) fully expanded whereas the rest is summarized. If that isn't possible, I'll probably switch to fully expanded articles (which is a matter of setting a single variable).

Next, I really want the search functionality to work again. Enabling the Tipue search worked almost flawlessly - search worked as it should, and the resulting search entries are all correct. The problem is that the URLs that the entries point to (which is what users will click on) all point to an invalid ("undefined") URL.

Finally, I want the printer-friendly version to come without the social links in the top right. This is theme-oriented, and I'm happily using pelican-bootstrap3 right now, so I don't expect this to be much of a hassle. But considering that my blog is mainly technology-oriented for now (although I am planning on expanding that), being able to save articles as PDF or print them in a nice format is an important use case for me.

Posts for Wednesday, July 15, 2015

The Web We Should Transcend

There’s a very popular article being shared right now titled “The Web We Have To Save” written by Hossein Derakhshan who was incarcerated for 6 years by the Iranian Government for his writing, his blog. His well-written and beautifully illustrated article is a warning, a call to arms for each and every Internet user to bring back that old web. The web before the “Stream”, before social media, images and videos. The web in its best form according to Derakhshan. But I fear it’s not my web.

Derakhshan’s article rubs me the wrong way for a bunch of different reasons which I’ll try to outline later but maybe this sentence sums it up best. He writes:

Everybody I linked to would face a sudden and serious jump in traffic: I could empower or embarrass anyone I wanted.

Maybe it’s just a clunky way to phrase things, maybe it’s not his intention to sound like one of the people that communities and platforms such as Reddit suffer from. But he does. I’m all for using one’s standing and position to empower others, especially those with maybe less access, people with less socioeconomic standing, people who usually do not get a voice to speak to the public and be heard. But considering the opportunity to embarrass anyone at whim to be one of the glorious things about the Internet is seriously tone-deaf given the amount of aggressive threats and harassment against a multitude of people.

It’s that libertarian, that sadly very male understanding of the web: The Wild West of ideas and trolls and technology where people don’t shoot with a revolver but duel each other with words and send SWAT teams each other’s way. A world where the old gatekeepers and authorities made of flesh and iron have been replaced by new gatekeepers and authorities sitting in their own homes while trying to become just as powerful as the old entities of power.

Because that’s his actual complaint.

People used to carefully read my posts and leave lots of relevant comments, and even many of those who strongly disagreed with me still came to read. Other blogs linked to mine to discuss what I was saying. I felt like a king.

It is about power and reach. About being heard within the small elite of people who did invest time and money to access the Internet. To create reputation and relevance in that community. In a way to “win”. But when you obey that very market/capitalist mindset, you can’t plateau, you need to grow. You need to become a gatekeeper.

Those days, I used to keep a list of all blogs in Persian and, for a while, I was the first person any new blogger in Iran would contact, so they could get on the list. That’s why they called me “the blogfather” in my mid-twenties — it was a silly nickname, but at least it hinted at how much I cared.

I remember the time of these lists of links, carefully curated by individuals with power in small or sometimes bigger communities. While Derakhshan very vocally complains about algorithms forming and directing attention, him or people like him doing the same is apparently the utopia we lost.

If you print the article and hold it to your ear you can hear a silent reading of the Declaration of the Independence of Cyberspace. The so-called home of the mind, where it was supposed to be all about ideas and text and high-level philosophical debate. Again a starkly male point of view. But the web got richer. Video is important these days and many creative people experiment with how to not just put forward ideas there but represent themselves and their lives. Instagram is huge and awesome because it (or platforms like it) can depict life in forms that were hidden in the past. Our public common consciousness gets richer with every how-to-do-your-nails tutorial, every kid’s pictures of hanging out with their friends. But that’s not “right”, we learn:

Nearly every social network now treats a link as just the same as it treats any other object — the same as a photo, or a piece of text — instead of seeing it as a way to make that text richer.

Yes, links are treated as objects people share and comment on, just like videos, pictures, podcasts, soundtracks, GIFs and whatever. Why should that one way of representing culture be the best one? The one we should prefer over everything else? Is it because that’s the way men used to communicate before kids and women entered, sometimes not liking that made-up world of supposedly objective and true words and links?

Hm. So what is the world today like online? Channeling the recently deceased German publicist Frank Schirrmacher, Derakhshan takes all his old-man-yelling-at-clouds anger about the modern world and hides it behind a mostly undefined term, in this case: The Stream.

The Stream now dominates the way people receive information on the web. Fewer users are directly checking dedicated webpages, instead getting fed by a never-ending flow of information that’s picked for them by complex — and secretive — algorithms.

Obviously not every stream is algorithmically filtered: Twitter and Instagram for example just show you the stuff of the people you manually chose to follow. But let’s not nitpick here. The complaint is that we are no longer doing it right. We are not reading things or liking them for their inherent value, their objective content, but:

In many apps, the votes we cast — the likes, the plusses, the stars, the hearts — are actually more related to cute avatars and celebrity status than to the substance of what’s posted. A most brilliant paragraph by some ordinary-looking person can be left outside the Stream, while the silly ramblings of a celebrity gain instant Internet presence.

But is an Internet that’s not showing me cute cat GIFs any better? What kind of weird chauvinism towards any culture he doesn’t like or understand or value is that? My web has twerking videos, cat GIFs, philosophical essays and funny rants as well as whatever the hell people like, whatever makes them happy to make or read or view or share (unless it harasses people, in which case GTFO with your crap).


There is some meat to what Derakhshan writes and I encourage you to read his essay. He makes some valid points about centralization in the way certain Platforms like Facebook try to AOLify the Internet, how certain platforms and services use data about their readers or consumers unfairly and unethically and how it’s getting harder and harder to port data from one platform to another. He has some very very good arguments. But sadly he doesn’t get to the point and instead focuses on attacking things that “kids these days” do that do not support his understanding and vision of the web as a bunch of small kingdoms of digital quasi-warlords.

Dear Hossein Derakhshan.
Thanks for your article, I did enjoy reading it even if I don't agree with everything. The club of people who don't enjoy how our mostly unleashed and rabid form of capitalism ruins everything from human relationships to technology to funny cat videos meets every day down at the docks in the old factory building with the huge red star painted at its wall. Hope to see you there, soon.

The Web Derakhshan wants to save is not the digital Garden of Eden. It sounds like what Ayn Rand and some technocrats would come up with for a western movie. And the days of John Wayne are past. Get over it.

Title Image by sciencefreak (Pixabay)



Loading CIL modules directly

In a previous post I used the secilc binary to load an additional test policy. Little did I know (and that's actually embarrassing, because it was one of the things I complained about) that you can just use a CIL policy as a module directly.

With this I mean that a CIL policy as mentioned in the previous post can be loaded like a prebuilt .pp module:

~# semodule -i test.cil
~# semodule -l | grep test

That's all there is to it. Loading the module resulted in the test port being immediately declared and available:

~# semanage port -l | grep test
test_port_t                    tcp      1440

In hindsight, it makes sense that it is this easy. After all, support for the old-style policy language is implemented by converting it into CIL when calling semodule, so it makes sense that a module already written in CIL can be picked up directly.
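
For completeness, a minimal CIL module declaring such a port type could look roughly like this (reconstructed from memory of the CIL syntax rather than copied from the earlier post, so double-check it against the CIL reference before relying on it):

; declare the type and tie it to the object role and the port_type attribute
(type test_port_t)
(roletype object_r test_port_t)
(typeattributeset port_type (test_port_t))
; label TCP port 1440 with the new type
(portcon tcp 1440 (system_u object_r test_port_t ((s0) (s0))))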

Posts for Saturday, July 11, 2015

Protecting the Cause

People ain’t perfect. We forget and neglect, actively ignore, don’t care enough, don’t live up to what we’d like to be or know we could be or should be. Now don’t get me wrong, I’m no misanthropist. I believe that people in general are nice and try their best given their own perception of the world, even if things sometimes end badly.

In the last weeks I’ve had a bunch of very different reasons to think about a general problem that has infected many human rights/civil liberties activist circles, organizations, subcultures or structures. Maybe even infection isn’t the right word. Maybe it’s more of a problem of our individual views on the world which are … well … too individualistic. But I’m getting ahead of myself.1

I kept on coming back to a song by the British HipHop artist Scroobius Pip called “Waiting for the beat to kick in”2. The song describes the artist’s dream of walking through New York (“but not New York in real life the New York you see in old films”) and meeting different characters telling him their idea of how to live your life.


So after meeting a bunch of positive, inspiring people the protagonist’s walk ends with meeting someone named “Walter Nepp” who wants to “bring him back down to earth“:

You may think yourself in general to be a nice guy,
But I’m telling you now – that right there is a lie,
Even the nicest of guys has some nasty within ’em,
You don’t have to be backlit to be the villain,
Whether it be greed lust or just plain vindictiveness,
There’s a level of malevolence inside all of us

I’ve been coming back to those lines over and over again in the last weeks and not just for personal reasons thinking about my own actions. Which is something I need to do. A lot. Sometimes I talk and write faster than I think, which lets me end up in situations where I am not just wrong due to ignorance but actually harmful to people, causes or values I claim to support. These things don’t just happen. I cause them through my ignorance, stupidity, carelessness or just not giving enough of a shit. I might behave racist, sexist or any other number of -ists in spite of me being an OK guy3. Everybody does these things because nobody is perfect – though some do it way more frequently or more hurtfully than others. Which is no and can be no excuse: The fact that we all behave hurtfully doesn’t make that behavior OK or normal, just frequent. A chronic head- and heartache in our collective consciousness.

But I can consider myself lucky. My friends, peers and contacts keep me honest by calling me out on my bullshit. Which is more of a gift than many probably realize. I am living a privileged life and as much as you try, sometimes it is hard to break the veil that position casts on the world. To make a long story short: I need people telling me when I mess up to comprehend not just the situation at hand but also to fix my thinking and perception as well as possible. Rinse and repeat. It’s a process.

I am not especially smart, so many people do a lot better than I do. And many of them invest tons of their energy, time and life into making the world a better place. They work – often even without payment – in NGOs and activist structures fixing the world. Changing public perception and/or policy. Fighting the good fights on more fronts than I can count (and my Elementary School report card said that I wasn’t half bad at counting!).

As humans relying on the rights or legislation these groups and individuals fight for, we are very lucky to have all that on our side. Networks of networks of activists helping each other (and us) out. Cool. But sadly, very rarely are things as simple as they might seem.

In the last months I keep hearing very similar stories from inside many of the (digital) human rights/civil liberties NGOs. Stories contradicting the proposed mission statement. Stories of discrimination, sexism, harassment and communicative violence. Structural failures in organisations that completely screw up providing a safe environment for the many trying to contribute. These stories usually have very visible, prominent men within those organisations at their core. But nothing changes.

Obviously there is not one singular cause for the deafening silence the stories drown in.

People might not want to speak up against their (past) employer to protect their present or future, which is very understandable. And without witnesses and proof there is no story. It would be easy for me to ask for victims and witnesses to blow the whistle and speak up, but what would I do in a similar situation? How can we expect people to destroy their actual lives for some lofty idea of truth, transparency and ethical righteousness?

But I believe there is something that we can change, even from the fringes or even from the outside: We can stop letting people use – explicitly or implicitly – the cause as their shield.

I have seen this mechanic numerous times: Some civil rights dude fucks up, but instead of calling him out people keep shtum so as not to hurt the cause. Few have mastered that mechanic as well as Julian Assange, who basically finds new ways to break with the norms of almost everything in any given week, but he’s not alone, just probably more intense than the rest. You can see these things in probably any one of the heroes of the scene and, as I showed earlier, we all fuck up – why shouldn’t they? But with the cause so intimately intertwined with their personas, their public identity, any of their personal lapses or transgressions could cause damage to that cause.

Sadly people don’t just shut up but actually target those who don’t. When questions came up about Assange’s antisemitic friends and peers it was quickly labeled a smear campaign against whatever Wikileaks’ cause is4. And it’s not just a Wikileaks thing.

There is a lot to learn from this issue, about the lack of sustainability in chaining a cause or structure to one or very few public superstars. But mostly it’s about understanding that – especially as an NGO/structure fighting for human rights or similar things – you have to be better than most of the people and organisations you stand against.

Recently Wikileaks published a bunch of data from an Italian cyber(war|security)5 company that seemingly even sold 0-days and exploits to oppressive regimes. Very cool, let’s see what journalists can dig up as stories about who builds these tools used to take other human beings’ rights and sometimes lives. Who sells them. And what we can do. But then I see people – fierce and vocal supporters of privacy as a human right – just randomly digging out personal information about people working for the German secret service from some list published without context (a list that also included all kinds of random addresses which did not all appear in the leaked emails). When the data of millions of US government employees leaked, people from similar groups were just too happy and quick to drag out information about individuals and contextualize them – often without proper research.

Sure. You can use any data you can get your hands on. But you are damaging your own cause, creating contradictions and becoming less credible. And as an activist/NGO credibility is your main asset. So you need to be called out on it. Even if your opponents might pick up that criticism and use it against you. I know it sucks, but as an NGO/activist you don’t have the luxury of not caring how you do what you want to do. For example: If you claim privacy to be a human right (which is a popular argument in the civil rights scene) you just can’t republish (and thereby recontextualize) individuals’ data on your Twitter feed or wherever. Because that would mean either that you don’t actually believe privacy is a human right, since you don’t grant it to people in certain jobs, or that you just don’t respect their rights – which would lead to the question why people should listen to you arguing for them.

Fighting for ethics and standards means that you have to live them as well: You can’t fight for transparency while being intransparent, can’t fight for the right to privacy while just throwing “evil people”‘s data out there just ’cause they deserve it for some reason.

And that is where we as a networked subculture fail so often. We don’t call people out for that kind of crap. Because we like them. Because they were or are our friends or peers. Because we are afraid of how it might endanger the cause or the campaign or whatever. Sadly two failures (by the original fucker-up and by us not calling them out on it) don’t make one success. But helping people see their mistakes can make future successes possible.

Maybe it’s just one of these nights, sitting alone in front of a computer thinking, writing instead of anything else but I am just soo tired of hearing digital martyrs and saviors being glorified while seeing them discriminating against people for their gender or religion or acting according to different rules than what they demand for everyone else. I’m tired of all the secret stories being exchanged in the dark between so many members of our communities that they have basically become public knowledge and still nothing getting better.

‘Ethical behavior is doing the right thing when no one else is watching – even when doing the wrong thing is legal.’ – Aldo Leopold

Photo by CJS*64 A man with a camera

  1. Excuse my strange stream of consciousness kind of writing here, I’m basically writing while thinking here.
  2. which incidentally shares some sort of stream of consciousness lyrics style with this way clunkier text
  3. Obviously my own tainted opinion 😉
  4. and let’s not even start with the accusations against Assange in Sweden
  5. as if there was a difference really



Restricting even root access to a folder

In a comment Robert asked how to use SELinux to prevent even root access to a directory. The trivial solution would be to not assign an administrative role to the root account (which is definitely possible, but then you need some other way to gain administrative access ;-) ).

Restricting root is one of the commonly cited features of a MAC (Mandatory Access Control) system. With a well designed user management and sudo environment, it is fairly trivial - but if you need to start from the premise that a user has direct root access, it requires some thought to implement correctly. The main "issue" is not that it is difficult to implement policy-wise, but that most users will start from a pre-existing policy (such as the reference policy) and build on top of that.

The use of a pre-existing policy means that some roles are already identified and privileges are already granted to users - often these higher privileged roles are assigned to the Linux root user so as not to confuse users. That does mean that restricting root's access to a folder requires some additional countermeasures.

The policy

But first things first. Let's look at a simple policy for restricting access to /etc/private:

policy_module(myprivate, 1.0)

type etc_private_t;
# allow the type to be associated with a file system so it can be used for files
fs_associate(etc_private_t)

This simple policy introduces a type (etc_private_t) which is allowed to be used for files (it associates with a file system). Do not use the files_type() interface, as that would assign a set of attributes that many user roles get read access to.

Now, it is not sufficient to have the type available. If we want to assign it to a resource, someone or something needs to have the privilege to change the security context of a file or directory to this type. If we just loaded this policy and tried to do this from a privileged account, it would fail:

~# chcon -t etc_private_t /etc/private
chcon: failed to change context of '/etc/private' to 'system_u:object_r:etc_private_t:s0': Permission denied

With the following rule, the sysadm_t domain (which I use for system administration) is allowed to change the context to etc_private_t:

allow sysadm_t etc_private_t:{dir file} relabelto;

With this in place, the administrator can label resources as etc_private_t without having read access to these resources afterwards. Also, as long as there are no relabelfrom privileges assigned, the administrator cannot revert the context back to a type that he has read access to.

The countermeasures

But this policy is not sufficient. One way that administrators can easily access the resources is to disable SELinux controls (as in, put the system in permissive mode):

~# cat /etc/private/README
cat: /etc/private/README: Permission denied
~# setenforce 0
~# cat /etc/private/README
Hello World!

To prevent this, enable the secure_mode_policyload SELinux boolean:

~# setsebool secure_mode_policyload on

This will prevent any policy and SELinux state manipulation... including switching to permissive mode, but also loading additional SELinux policies or changing booleans. Definitely experiment with this setting without persisting it (i.e. do not use -P in the above command yet) to make sure it is manageable for you.
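
Once you are comfortable with the behavior, the boolean can be persisted across reboots with the same command plus -P:

~# setsebool -P secure_mode_policyload on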

Still, this isn't sufficient. Don't forget that the administrator is otherwise a full administrator - if he cannot access the /etc/private location directly, then he might be able to access it indirectly:

  • If the resource is on a non-critical file system, he can unmount the file system and remount it with a context= mount option. This will override the file-level contexts. Bind-mounting does not seem to allow overriding the context.
  • If the resource is on a file system that cannot be unmounted, the administrator can still reboot the system in a mode where he can access the file system regardless of SELinux controls (either through editing /etc/selinux/config or by booting with enforcing=0, etc.).
  • The administrator can still directly access the block device files on which the resources reside. Specialized tools can extract files and directories without actually (re)mounting the device.

A more extensive list of methods to potentially gain access to such resources is given in Limiting file access with SELinux alone.

This set of methods for gaining access is due to the administrative role already assigned by the existing policy. To further mitigate these risks with SELinux (although SELinux will never completely mitigate all risks) the roles assigned to the users need to be carefully revisited. If you grant people administrative access, but you don't want them to be able to reboot the system, (re)mount file systems, access block devices, etc. then create a user role that does not have these privileges at all.

Creating such user roles does not require leaving behind the policy that is already active. Additional user domains can be created and granted to Linux accounts (including root). But in my experience, when you need to allow a user to log on as the "root" account directly, you probably need him to have true administrative privileges. Otherwise you'd work with personal accounts and a well-designed /etc/sudoers file.
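
As a rough sketch of that last approach (the limadm_u SELinux user and the alice account are made up for the example, and the user domain behind the assigned role still has to exist in the loaded policy):

# create an SELinux user that only gets a restricted role
~# semanage user -a -R "staff_r" limadm_u
# map a personal Linux account to that SELinux user
~# semanage login -a -s limadm_u -r s0 alice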

Posts for Thursday, July 9, 2015

Let’s kill “cyborg”

I love a good definition: Very few things in life wield extreme power as elegantly as definitions do. Because every teeny, tiny definition worth its salt grabs the whole universe, all things, all ideas, all stories, feelings, objects and laws and separates them into distinct pieces, into two different sets: The set of things conforming to the definition and the set of things that don’t. If you’ve read enough definitions, super hero comics are basically boring.

Definitions structure our world. They establish borders between different things and allow us to communicate more precisely and clearly. They help us to understand each other better. Obviously many of them are in constant development, evolving and changing, adapting to our always fluid experience of our shared world.

And just as they change, sometimes definitions and the concepts they describe just stop being useful. Today I want to encourage you and me to put one definition and concept to rest that has outlived its usefulness: Let’s kill the term “cyborg”.

The term cyborg has been with us since the 1960s and has influenced more than just cheesy science fiction movies: Talking about cyborgs was a way to describe a future where human beings and machines would meld – and, increasingly, a way to describe our actual current lifestyles. Because for better or worse: We have been hybrid beings, part nature, part technology, for many many decades now.

Still, the idea of “natural humans” put in contrast to machine-augmented humans was useful to elaborate the requirements that we as a society would need to postulate in order to integrate technology into ourselves, our bodies and our mental exoskeletons in a humane way. It has been a good conversation. But lately it really hasn’t been.

“Cyborg” has mostly been a word to single out and alienate “freaks”. It refers to body hackers who do things to their bodies that the mainstream doesn’t really understand but just loves to watch, like one of those strangely popular torture porn movies such as Saw. It refers to people with disabilities not in a way that includes them in whatever the mainstream view of society is, or that helps make the case for designing and engineering things in a more accessible way, but as these weird “others”. I can’t count the amount of times that for example Neil Harbisson has been talking at conferences about his perception augmentation allowing him to hear colors, with the gist of the media reception mostly being: Look how weird!

Instead of helping us to understand ourselves and our decision to intertwine our lives, worlds and bodies with all kinds of different technologies, “cyborg” just creates distance these days. It doesn’t build bridges for fruitful debates but in fact tears them down, leaving us to look at the freaks on the other side of the river without them coming closer.

We are all cyborgs and we have been for quite a while. But when everybody is a cyborg really nobody is. The distinction from whatever “norm” that the word, idea and definition provided is no longer helpful but actually hurtful for so many debates that we have to have in the next few years.

Thank you “cyborg”, it’s been an interesting ride. Rest in peace.

Photo by JD Hancock


Planet Larry is not officially affiliated with Gentoo Linux. Original artwork and logos copyright Gentoo Foundation. Yadda, yadda, yadda.