Posts for Thursday, June 10, 2010

Posts drafting away …

A few days ago I went meta again and wrote about writing. I mean, if you’ve read more than 3 posts here you know my spiel: I’m basically mutating the same 3 or 4 ideas around new contexts. In that post I talked about how some posts are hard to write. Let me be all pretentious and quote myself:

On the other hand there are posts that usually have meandered around in my head for weeks or months, ideas that have kept my mind occupied for a long time and that I try to meld into a post at some point in time. And usually those posts are really hard to write, I trash many of them, rewrite them, trash the rewrite.

This led to Andy C adding to the comments:

Interesting. Your blogging workflow is very different from mine. I get an idea, I type the words, I fix typos and then I just hit ‘Publish’.

I never have any Drafts sitting there. In fact, the very thought just irritates me.

I replied quickly and thought it was done but on my way home tonight I came back to this, to the process of writing and stuff.

Many people do not blog because they say: “But I don’t have any ideas!” I rarely hear that people don’t have time or skills, it’s usually about “not having anything to write about” or “not having ideas”. Which is something I just don’t believe but I know how that perception might emerge because I know it from myself.

I have ideas for short and long articles all day (which sucks when you are concentrating on writing something and another thing wants your mind’s attention) but obviously I don’t write them whenever they come to my head. So I put them on a mental “yeah write about that” list. That list has a problem, it disappears all the time.

An idea might bounce around in my head for 10 hours throughout the day but when I have time to write recreationally it’s empty, all the cool and awesome posts are gone. A few days later they might pop back up, they might not. I lose many many ideas that way every month and that sucks. Not cause all of them are so bloody brilliant but because if I had more than just one idea to write about I could choose the best one and wouldn’t have to run with the one I got at a particular moment.

For a while I tried writing “Drafts”. I would add some sort of heading and a few very rough bullet points and come back later going through the drafts to decide which one to work on for a post. Basically none of them ended up being written; when the number of drafts reached a certain point I would just go draft-berserk and kill them all. I am not a draft person, it’s just not a way that triggers my brain.

I also tried paper for a while but my pants rarely have enough room as it is, and that makes it hard to carry even more stuff, so that never worked out. Another thing that kills that kind of process is my lack of proper handwriting: I usually cannot read my handwritten notes after three days.

I have not given up looking for solutions to this because losing ideas hurts. As I said, not cause they are all awesome, but cause I hate losing something without having had the chance to work on it. But right now I probably have to accept that I lose ideas every day. Writing, as any creative task, is very personal I guess. Things that work for someone else might absolutely kill your workflow and inspiration, on the other hand a pointer from someone else might make you realize how much you’ve been fighting your own brain.

Understanding how one’s own brain ticks, what kind of things capture its attention and what slows it down is probably one of the most important tasks for anyone who wants to be (or has to be) creative. It’s about finding your way of doing things because most ways won’t work. All those shiny “10 rules to follow for blogging” howtos, all those gurus that tell you about structure or simplicity or whatever: Read it, but read it like someone describing their own favorite dish. You might like it, you might like parts of it, but chances are that you will hate it.

kde activities on multiscreens

I don't know in which release activities became part of KDE. Maybe they were already introduced with the first KDE 4 release. Anyway, on multi-screen setups they have worked really well since the 4.4 release series. Before 4.4, activities were a typical crash scenario. Now, especially for multi-head setups, a lot of improvements and bug fixes have been made. This includes the activities, but it's still not perfect (at least for me).

Usually, activities are a nice feature. In my case I use an extra activity for showing all my system information like CPU temperature, CPU usage and network usage. Additionally I also have a weather forecast plasmoid and one for the latest xkcd comic :)

The problem now is that the activity manager shows 3 different activities. Since every screen counts as a "full" activity, I could also just swap the left and the right screen (in the past Plasma would have crashed).

Here you can see what I mean:

But that's something I don't want at all. Especially when I change activities via a shortcut, it can happen quite easily that I just swap the screens. Well, it's not a big problem (maybe it's a feature :D), but I guess this could be implemented better...

Posts for Wednesday, June 9, 2010

It’s not you, it’s me

Dear FSF.

I’ve been using FLOSS (free/libre/open source software) on my computers pretty much exclusively for about 5 or 6 years now and I don’t see me going back to the limited lands of Windows or OSX anytime soon, but I have started to feel some sort of disconnect to one of the leading institutions of FLOSS lately: The Free Software Foundation (FSF).

The FSF has achieved a lot, has funded and supported a lot of the GNU projects and software developers, has defended the GPL (my preferred license for software) in court and in general has had a huge part in bringing free software to where it is today. I am not ungrateful, it’s just … it’s not you, it’s me.

When I look at some of the recent FSF campaigns I feel ashamed for being seen as associated with them: Windows7Sins is just the prime example of the kind of attitude I can no longer stand (DefectiveByDesign isn’t much better). What I can no longer stand is the vocabulary of hate and judgement, the negativity of it all.

Look at the campaign name: Windows7Sins. Sins are a religious concept, one that doesn’t apply to software but one that shows what kind of broken self perception the FSF has: That of a missionary, of someone who can, from the moral high ground, cast judgement down onto everybody else with a holier-than-thou attitude.

Those campaigns are full of negativity, of nagging and most importantly talking about issues that do not even reach their target audience. If you want to make people use your software or support your cause, it’s not enough to just run around screaming how “evil” the others are, that makes you look like a raving lunatic. And that is not my way.

While I do have ideals and while I do consider Freedom Zero to be important, I don’t preach because in the end that is my decision. It’s not my place to tell people what they are not allowed to do because of my rules. What I can do is to lead by example. To be positive.

I am a very positive person, even if I might look ranty or grumpy from time to time, and these campaigns that focus on whatever other people might have done wrong just bore me. No. That’s not enough. They suck the life out of me.

If you want people to adopt your ideals or products you gotta show them why they are better than what they have been using: Tell them about the brilliant things they get when they use your stuff, tell them about new possibilities. The FSF should focus on outlining what positive things a new user gets from FLOSS: Tell people about VLC that allows them to play basically every type of media without hassle. Don’t talk about why IE is evil but show them what an awesome browser Firefox is. Just drop the hate.

In my life my message has never been one of hate. And I feel that lately our goals and approaches no longer line up the right way. You probably didn’t change, you were always like that, maybe I changed or got tired of that attitude, I don’t know. It’s not you, it’s me, but I think we can’t really let you talk for me anymore.

Thanks for all you have done and good luck with your future projects and campaigns,

tante

Clojure, from a Ruby perspective

Fogus' recent article "clojure.rb" speculates about why there seem to be so many Ruby users adopting Clojure. As a Ruby user who adopted Clojure, I figured I'd write about my experiences.

What do Ruby and Clojure have in common, that would attract a Rubyist to Clojure? A lot. Obviously, this is somewhat subjective and I don't expect anyone else to agree, but this is what did it for me.

Semantic consistency

In Ruby, everything is an object. It makes it simple to write code without worrying much about what kind of thing you have. foo.some_method(1,2,3) will generally work for any foo.
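A minimal sketch of what that uniformity looks like in practice. The helper name `describe` and the sample values are made up for illustration; the point is only that integers, strings, symbols, nil and even classes all answer the same messages.

```ruby
# Hypothetical helper: works for any value, because every value is an object
# that responds to #inspect and #class.
def describe(thing)
  "#{thing.inspect} is a #{thing.class}"
end

# Literals, nil, a collection and a class itself, all treated the same way.
SAMPLES = [1, "two", :three, nil, [4, 5], Integer]

SAMPLES.each { |thing| puts describe(thing) }
```

Nothing here needs a special case for "primitive" values, which is the contrast with Java-flavored languages.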

In Clojure, not everything is an object, partly because it inherits primitives from Java land (though this doesn't hurt much in the kind of everyday use I put Clojure to), but also because Clojure by design doesn't even attempt to be object-oriented.

Clojure does have abstractions though. For example there's an abstraction that says "this thing can be called like a function". And then you can treat any callable-thing as a function without worrying about what it is.

In Ruby, a lot of things are Enumerable, which means you can do foo.each{} and other similar things for a lot of different types of foo. Clojure has something similar with its seq abstraction. Most Clojure data types and many Java ones are seq-able, and most built-in core functions can iterate over the guts of various things using seqs. This includes regex matches, strings, directories of files, and so on.
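To make the Ruby half of that concrete, here is a small sketch. The `Countdown` class is a made-up example, not anything from a library; it only defines `each`, and mixing in Enumerable supplies the rest.

```ruby
# Hypothetical class for illustration: counts down from a start value to 1.
class Countdown
  include Enumerable

  def initialize(start)
    @start = start
  end

  # Enumerable only requires #each; map, select, to_a and friends build on it.
  def each
    @start.downto(1) { |n| yield n }
  end
end

countdown = Countdown.new(3)
p countdown.to_a
p countdown.map { |n| n * 2 }
p countdown.select(&:even?)
```

This is roughly the role the seq abstraction plays in Clojure: implement one small interface and the whole standard toolbox works on your type.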

Ruby-style OOP brings a lot of complexity and baggage which Clojure avoids by not being OOP. For example (ignoring Java for the moment), in Clojure land you don't have to worry about a member being public/private/protected, and there are few times when you have to worry about inheritance and class hierarchies. (And there's the whole thread-safety thing.) In Clojure data is stupid and immutable, and functions are just things that take input and give output, usually side-effect free. The separation is clean and this results in programs that are very easy to reason about.

Another example of consistency: Expressions. In both Ruby and Clojure, everything has a value. Things that are "statements" in other languages are instead expressions that return something. This alone makes a lot of programs just a little bit better/easier to write.
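A quick Ruby sketch of the "everything has a value" point. The method name `size_label` and the threshold are invented for the example; what matters is that `if` and `case` sit on the right-hand side of an assignment, and the last expression becomes the return value.

```ruby
# Hypothetical helper for illustration.
def size_label(n)
  # `if` is an expression: its value is the value of the branch that ran.
  label = if n > 100 then "big" else "small" end

  # `case` is an expression too.
  suffix = case n
           when 0 then " (empty)"
           else ""
           end

  # No explicit `return`: the last expression is the method's value.
  label + suffix
end

puts size_label(500)
puts size_label(0)
```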

Aesthetics

Like Ruby, Clojure code tends to be terse and expressive.

Ruby reads like poetry, because it's mostly words and not so much punctuation. In Ruby, you have none of Perl's sigils, very little of C's semi-colon line-endings and curly-delimited blocks. Especially when you start omitting optional parentheses, it has a very minimalistic vibe which is appealing to many.

    def foo(bar)
      bar.each do |x|
        puts x
      end
    end

Clojure's s-expressions are another story, of course. Some love them, some hate them. Personally, I love them. The tired old trope about Lispers not paying attention to parentheses is true; after a while they blend into the background.

    (defn foo [bar]
      (doseq [x bar]
        (prn x)))

What I see:

    defn foo [bar]
      doseq [x bar]
        prn x

Clojure has the same minimalistic feel to it, in my eyes. It also helps that so many Clojure function names are short and concise. And being able to use punctuation like ? and ! in variable/function names (true? and false? are function names), hyphens instead of underscores... these things help Clojure read smoothly.

But along with aesthetics, in Clojure you get the benefits of s-exps: no order of operations to deal with, and absolute consistency. Everything is (function param1 param2 param3). Look at how many syntax rules you have to memorize for the Ruby code above to make sense. Dot means method call, blocks are do/end delimited and have those weird pipes in there, etc.

Literals and syntax sugar

Ruby has literal syntax for many types. This includes:

  • :symbols
  • [arrays]
  • {:hash => :maps}
  • /regex/
  • do block end and {block}

This really is a big deal. I'm spoiled and I can't use a language without these things nowadays. It saves a ton of typing and it makes those things stand out in the code, making it both easier to write and to read. I use those structures all the time in every program I write, so they should have a terse representation.

Clojure has literal support for the same types as Ruby, and they even look mostly the same (with #"regex" being a change I can live with). And then it also has (among others):

  • #{sets}
  • #(function-literals %)
  • '(quoted forms)
  • `(quasi-quoted ~forms)

Is this a contradiction of my last point? What happened to s-expressions and consistency? Well, in Clojure, the reader shortcuts are just sugar that reduce to s-exps. You can avoid all use of that sugar if you hate it. (hash-map :key "val"), (vector 1 2 3), (quote foo) etc.

But more importantly, let's draw a(n arbitrary) distinction between "good" syntax and "bad" syntax.

Clojure's reader-macro sugar makes your code shorter, but doesn't change the structure of your code. Take a function call (f x y z), and you can always substitute a vector or regex literal or quoted form into it. (f [vector] #{set} #"regex"). The syntax sugar is very local, very self-contained. It doesn't leak into the surrounding code. And of course you can combine them in nearly arbitrary ways: '[quoted vector], {:hash-map-containing-a #{:set 'of #(functions)}}. This is "good" syntax.

Compare this to things like the x ? y : z construct, or heredocs. These things are not as orthogonal. They not only mean something on the "inside", they also influence and interact with the code before and after them, thanks to precedence rules and special parsing rules. Can you stick a heredoc in the middle of a function call? Maybe (I don't even know), but have fun with the indentation and line-breaks if so. When should you use do/end and when should you use {} for blocks? When do you need parens around your ternary if-then-else construct and when don't you? When do you need to use and and when &&? That's the "bad" kind of syntax. Sometimes, maybe even most of the time, it makes your code shorter, but there are a lot of rules to memorize and you never know when you'll be bitten.
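Two of those Ruby questions can be shown in a few lines. This is only a sketch with made-up method names (`show`, `assign_with_and` and so on), but the parsing behavior it demonstrates is standard Ruby: `and` binds more loosely than `=`, and a `do ... end` block attaches to the outermost method call while `{ }` attaches to the nearest one.

```ruby
# `and` has lower precedence than assignment, `&&` has higher.
def assign_with_and
  x = true and false   # parsed as (x = true) and false
  x
end

def assign_with_double_ampersand
  x = true && false    # parsed as x = (true && false)
  x
end

# Hypothetical pass-through method, used to show which call gets the block.
def show(value)
  value
end

def braces_result
  list = [1, 2]
  show list.map { |n| n * 2 }   # block goes to map
end

def do_end_result
  list = [1, 2]
  show list.map do |n|          # block goes to show, map gets no block
    n * 2
  end
end

p assign_with_and
p assign_with_double_ampersand
p braces_result
p do_end_result.class
```

The two assignment methods differ by one token and return different values, and the `do ... end` version hands back an Enumerator instead of a doubled array.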

Clojure largely avoids the "bad" syntax while taking advantage of the "good". Reader macros make your code shorter and visually easier to scan, but they rarely require you to do backflips to get your code to compile or run properly.

First-class functions

Ruby's blocks and yield and friends let you deal with first-class functions. This is a huge step in the Lisp direction already, and it's one of the things that makes Ruby great. But there are limits. Blocks use funky, special syntax, and in idiomatic Ruby, you will pass around only one block per method. There's the whole lambda and proc mess, and then there are methods-as-objects which are different still. And a lot of Ruby just calls .send on an object and passes in a method name as a symbol.

Being a Lisp, Clojure takes this a bit further. First-class functions are ingrained in nearly everything you do in Clojure. And they are easy to define and easy to call. Define f via defn (to make it top-level), fn (for a local function), or #() (sugar for fn), and then call it like (f).
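For comparison, here is a rough sketch of the Ruby side, showing the several different "function-like" things mentioned above. All the constant and method names are invented for the example.

```ruby
# A method that takes a block and calls it with yield.
def call_block
  yield 2
end

DOUBLE_LAMBDA = lambda { |n| n * 2 }    # strict about argument count
DOUBLE_PROC   = Proc.new { |n| n * 2 }  # lenient: extra arguments are dropped
TIMES_TWO     = 2.method(:*)            # a method pulled out as an object

p call_block { |n| n * 2 }              # literal block
p call_block(&DOUBLE_LAMBDA)            # lambda converted to a block with &
p DOUBLE_PROC.call(2, 99)               # the 99 is silently ignored
p TIMES_TWO.call(3)
p 5.send(:+, 1)                         # method name passed as a symbol

begin
  DOUBLE_LAMBDA.call(2, 99)
rescue ArgumentError => e
  puts "lambda complained: #{e.message}"
end
```

Four ways to say "a function", each with slightly different calling and arity rules, versus a single (f) in Clojure.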

Clojure also takes advantage of some functional-programming mainstays like partial and complement and comp(osition). We're not in full-blown Haskell territory, but it's a lot more FP than idiomatic Ruby.

And hash-maps, vectors, sets, keywords, and symbols are also callable as functions in Clojure. ({:foo 1} :foo) => 1. Many things can be treated as functions.

Metaprogramming

In Ruby you can mess with the innards of any class you want. There are facilities for defining methods dynamically, opening and inspecting classes at runtime, catch-all handlers for undefined methods, and all kinds of other dark magic. But again there are limits... Ruby needs to make use of eval to get certain things done. And monkey-patching is a shotgun aimed at your foot.
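A small sketch of the facilities listed above, using a made-up `Greeter` class: methods defined dynamically with `define_method`, a catch-all via `method_missing`, and a class reopened after the fact.

```ruby
# Hypothetical class for illustration.
class Greeter
  # Define say_hello and say_goodbye at runtime from a list of words.
  %w[hello goodbye].each do |word|
    define_method("say_#{word}") { |name| "#{word}, #{name}" }
  end

  # Catch-all handler: any undefined shout_* method is answered here.
  def method_missing(name, *args)
    if name.to_s.start_with?("shout_")
      "#{name.to_s.sub('shout_', '').upcase}!"
    else
      super
    end
  end

  # Keep respond_to? consistent with method_missing.
  def respond_to_missing?(name, include_private = false)
    name.to_s.start_with?("shout_") || super
  end
end

# Reopening the class later adds another method to it. Doing the same thing
# to a class you don't own is the monkey-patching mentioned above.
class Greeter
  def say_nothing
    ""
  end
end

greeter = Greeter.new
puts greeter.say_hello("world")
puts greeter.shout_hi
p Greeter.instance_methods(false).sort
```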

Well, if you like metaprogramming, Lisp macros are top of the line. You can abstract away boilerplate with a vengeance. Macros are the ultimate application of DRY.

Clojure doesn't deal much with classes, so there isn't much of that kind of introspection, but the Lisp principle of code-as-data enables a kind of introspection that you won't find in Ruby. The line between compile-time and run-time is very blurry, which enables all kinds of magic.

Java itself does offer Ruby-style reflection and such, if you need it, but you won't often, while in Clojure land.

Multimethods (and soon, protocols and defrecord) let you avoid monkey-patching and get some of the same kinds of "extend a class" things done in a saner and safer way.

So?

Fogus suspects:

Ruby programmers being the adventurous lot to begin with, are not satisfied with “halfway to Lisp”. Instead, they want it all.

This is true in my case. I like Ruby largely insofar as it borrowed and adapted many great features of Lisp. It only makes sense that I would like Clojure, which takes most of those things one step further. Clojure in particular, as a "modern" Lisp with vaguely Ruby-like syntax in certain places, is an obvious choice.

On top of that, Clojure is fast, thanks to the JVM. Ruby has JRuby too, but vanilla Ruby is not known for its speed. Clojure integrates with a REPL in a way that Ruby really doesn't, making interactive development enjoyable. Clojure is a compiled language, which has benefits for deployment. And again, there's the whole thread-safety thing. Clojure is awesome for writing sane, safe multi-threaded programs. These things are rather appealing.

I do still use Ruby though. Ruby is great for scripting, Clojure not so much, thanks to the JVM startup time, among other things. Ruby can be banged out quickly in any editor, but Clojure isn't much fun to edit in any editor that lacks good paren-matching support and REPL integration.

Rubygems offers dead-simple install of a ton of libraries, whereas Clojure is still working out the details of a standard build tool and install tool. Ruby has a library for anything, and while Clojure can use Java libraries, Java libraries tend to be huge and feature-rich, sometimes too huge for one-off tasks where a small Ruby library is a perfect fit.

gentoo cube

Ever since compositing became popular, the desktop cube has been one of the top showcase effects.

Right now there are a few popular compositing managers out there. One of them is kwin from the KDE desktop environment. kwin also supports the cube effect, along with a lot of other effects.

Today I'll show you how to personalize this cube a bit, so that you have your own image at the top and bottom of the cube. Usually there is a nice KDE logo, but you'll see how it would look with a Gentoo logo :) To make that possible on your system you first need the Gentoo PNG image, which you can get from here: image

Afterwards I suggest you make a backup of the original KDE image:

    cp /usr/share/apps/kwin/cubecap.png /usr/share/apps/kwin/cubecap.png.bak

Then you can copy/move the new image into the above folder. Be sure that you change the filename to "cubecap.png".

    cp ~/gentoo_logo.png /usr/share/apps/kwin/cubecap.png
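If you swap images more than once, running the backup command a second time would overwrite the original KDE image with whatever is currently installed. A small Ruby sketch that does the two copies above but only backs up once might look like this. The helper name `replace_cubecap` is made up, and writing to the kwin directory will most likely need root.

```ruby
require "fileutils"

# Hypothetical helper: install new_image as the cube cap, keeping a one-time
# backup of the original next to it as <cubecap>.bak.
def replace_cubecap(new_image, cubecap)
  backup = "#{cubecap}.bak"
  # Only back up when no backup exists yet, so the original KDE image survives
  # repeated runs.
  FileUtils.cp(cubecap, backup) unless File.exist?(backup)
  FileUtils.cp(new_image, cubecap)
  backup
end

# Usage, with the paths from the commands above:
# replace_cubecap(File.expand_path("~/gentoo_logo.png"),
#                 "/usr/share/apps/kwin/cubecap.png")
```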

After a restart of kwin the result would look like this:

Posts for Tuesday, June 8, 2010

Just a test

This is just a test to see if the feeds are working correctly. Nothing to see here, move on.

Screenshot June 2010

I haven't posted one of these in a while. I've been in an 8-bit kind of mindset for a while:

Screenshot

What I actually stare at for 8 hours every day:

Screenshot

KDE4, Buuf icons, QtCurve, wallpaper is from somewhere on the internets.

Emacs isn't for everyone

Chas Emerick recently posted the results of his State of Clojure survey. It turns out that the (self-selected) group of Clojure-using respondents prefers Emacs as its IDE of choice, eclipsing all other editors by a large margin.

Chas then has this to say:

I continue to maintain that broad acceptance and usage of Clojure will require that there be top-notch development environments for it that mere mortals can use and not be intimidated by...and IMO, while emacs is hugely capable, I think it falls down badly on a number of counts related to usability, community/ecosystem, and interoperability.

As an avid, die-hard Vim and Emacs user for life, I'm going to agree.

Mere mortals?

Emacs isn't difficult to learn. Not in the sense of requiring skill or cleverness. It is however extremely painful to learn. I think there's a difference.

The key word is tedium. Learning Emacs is a long process of rote memorization and repetition of commands until they become muscle memory. If you're smart enough to write programs, you can learn Emacs. You just have to keep dumping time into the task until you become comfortable.

Until you're comfortable, you face the unpleasant task of un-learning all of your habits and forming new ones. And you're trying to do this at the same time you're undertaking another, even harder task: writing programs. And if you're a new Clojurist, and you're learning Emacs and Clojure from scratch at the same time, well, get the headache medication ready.

As a programmer and someone who sits in front of a computer 12+ hours a day, I consider myself pretty flexible and capable of picking up a new user interface. As someone who had been using Vim for years prior to trying Emacs, I considered myself more than capable of learning even a strange and foreign interface. I'd done it once before.

But learning Emacs still hurt. Oh how it hurt. I blogged while I was learning it, and you can see my pain firsthand. I sometimes hear people say "I tried Emacs for a whole month and I still couldn't get it". Well, it took me over a year to be able to sit down at Emacs and use it fluidly for long periods of time without tripping over the editor.

To be fair, I'm talking here about using Emacs as a programming environment. Using Emacs as a Notepad replacement could be learned in short order. C-x C-f, C-x C-s, or use the menus, there you go. Using it comfortably as a full-fledged IDE is significantly harder and requires you to touch (and master) many more features. Syntax highlighting, tab-completion, directory traversal and cwd issues, enabling line numbers, version-control integration, build tool integration, Emacs' funky regex syntax for search/replace, Emacs' bizarre kill rings and undo rings, the list goes on. These things are very flexible in Emacs, which is a great thing, but it's also an impediment to learning how to configure and use them. There's no getting around the time investment.

And it's not just a matter of learning some new keyboard shortcuts. There's a new vocabulary to learn. You don't open files, you visit them. What's a buffer? What's a window? (Not what you think it is.) What's a point? What's a mark? Kill? Yank? "Apropos"? Huh? C-c M-o means what exactly? My keyboard doesn't have a Meta key. Yeah, you can use CUA mode and get your modernized Copy/Cut/Paste shortcuts back, but that's the tip of the iceberg. It's hard even to know where to begin looking for help.

Yeah, Emacs came first, before our more common and more modern conventions were established, and that explains why it's so different. That doesn't change the fact that Emacs today is a strange beast.

Community and ecosystem

Personally I find the Emacs community to be a pretty nice bunch. In the highest tradition of hackerdom and open source software, Emacs users seem to be eager and willing to share their elisp snippets and bend over backwards to help other people learn the editor. I got lots of help when I was struggling and learning Emacs.

The Emacs wiki is an awesome resource. The official documentation is so complete (and so long) that it leaves me speechless sometimes. And there are a million 3rd-party scripts for it. Whatever you want Emacs to do is generally a short google away.

If there's anything wrong with the Emacs community, it'd be people who take Emacs evangelism overboard. The answer to "I don't want to have to use Emacs to use your language" can't be "Be quiet and learn more Emacs", or "If you're too dumb to learn Emacs, go away". In some communities there is certainly some of that. But thankfully I don't see it much in the Clojure community. Let's hope it stays that way.

Interoperability

Once someone spends the time to write a suitable amount of elisp, Emacs can interoperate with anything. I think so many people use SLIME for Clojure development precisely because it interoperates so darned well with Lisps. SLIME is amazing. You probably can't beat Paredit either, and Emacs' flexibility is precisely what makes things like Paredit possible.

The problem is the amount of time you have to spend to get that interoperability set up and to learn how to use it. After two years of using Emacs and Clojure together, every once in a while I still find myself bashing my face on my desk trying to get the latest SLIME or swank to work just right, or trying to get a broken key binding fixed, or tweaking some other aspect of Emacs that's driving me crazy. One day, curly braces stopped being recognized as matched pairs by Paredit. Why? No idea; I fixed it, but it was a half hour of wasted time.

Emacs is good at integrating with Git too. So good that there are four or five different Emacs-Git libraries, each with a different interface and feature set. I gave up eventually and went back to using the command line. (You can embed a shell / command line right in Emacs. There are three or four different libraries to do that too.)

The wealth of options for doing things in Emacs is simultaneously a good thing, overwhelming, and confusing. If all you want is something that works and gets out of your way, too many options can be worse than one option, even if that one option isn't entirely ideal.

Emacs' Java interop, I know nothing about. Almost certainly, Emacs can come close to a modern Java IDE for fancy features like tab-completion and document lookups and project management. But how long is it going to take you to figure out that tab-completion is called hippie-expand in Emacs? That and a million other surprises await you.

What's my point?

There was a pithy quote floating around on Twitter a while back (I think quoting Rich Hickey):

One possible way to deal with being unfamiliar with something is to become familiar with it.

That's true, and you could say that of Emacs. I strongly believe that when it comes to computers, there's no such thing as "intuitive". There's stuff you've already spent a lot of time getting used to, and there's stuff you haven't.

But certain things require more of a time investment than others. Could I learn Clojure if all the keywords were in Russian or Chinese instead of my native English? Sure, but it'd take me a long time. I'd certainly have to have a good reason to attempt it.

I learned Emacs partly because it was hard. I saw it as a challenge. It was fun, yet painful, but more pain, more glory. Mastering it makes me feel like I've accomplished something. I'd encourage other people to learn Emacs and Vim too. I think the benefits of knowing them outweigh the cost and time investment of learning them.

But I didn't learn Emacs with the goal of being productive. I learned it for the same reason some people build cars in their garages, while most people just buy one and drive it to and from work every day. I learned Emacs because I love programming and I love playing with toys, and Vim or Emacs are as nice a toy as I could ask for. (I love programming enough to form strong opinions and write huge blog posts about text editors.) For me, productivity was a beneficial side-effect.

There are only so many hours in a day. There are a lot of other challenges to conquer, some of which offer more tangible benefits than Emacs mastery would get you. Mastering an arcane text editor isn't necessarily going to be on the top of the list of everyone's goals in life, especially when there are other editors that are easier to use and give you a significant subset of what Emacs would give you. We have to pick our battles.

So I understand when people say they don't want to learn Emacs. I think maybe so many Clojurists use Emacs right now because we're still in the early adopter stage. If you're using Clojure today, you're probably pretty enthusiastic about programming. You're likely invested enough to be willing to burn the required time to learn Emacs.

If Clojure becomes "big", there are going to be a lot of casual users. A casual user of Clojure isn't going to learn Emacs. They're going to silently move on to another language. And I really think that new blood is vital to the strength of a community and necessary for the continued healthy existence of a programming language.

So Clojure does need alternatives. I'll stick with Emacs myself, but there should be practical alternatives. I'd encourage the Clojure community to continue to support and enjoy Emacs, but don't push it too hard.

Posts for Monday, June 7, 2010

Portage Hooks

Now that school is done for the 2009-2010 year, I’m back at it in Neuvoo again. I’m finishing off a long-planned and fairly major addition to portage I call “portage hooks.” The fun thing is I’ve submitted some patches to zmedico and the response has actually been more positive than previous experiences. solar seemed to be (tentatively?) liking the idea as well.

So, here’s what portage hooks are all about. If you have portage-utils installed, you will have an /etc/portage/postsync.d/ directory. Scripts in this directory are executed after portage syncs the tree. I thought this was a great idea, and I thought it should be expanded so there are other opportunities for unofficial extensions.

So, hopefully soon, there will be /etc/portage/hooks/{pre,post}-{run,sync,ebuild}.d directories, and inside hook scripts can be installed by ebuilds or users. The run and sync hooks are self-explanatory. The ebuild hooks will be executed within the ebuild environment before or after each phase, and can modify an ebuild’s environment, similar to /etc/portage/bashrc.
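Spelled out, that brace pattern covers six directories. A tiny Ruby sketch to list them, assuming the proposed layout above (the constant and method names are made up, and the layout itself is still subject to the patches being accepted):

```ruby
# Root of the proposed hook layout, as described above.
HOOK_ROOT = "/etc/portage/hooks"

# Expand {pre,post}-{run,sync,ebuild}.d into full directory paths.
def hook_dirs(root = HOOK_ROOT)
  %w[pre post].product(%w[run sync ebuild]).map do |timing, event|
    File.join(root, "#{timing}-#{event}.d")
  end
end

puts hook_dirs
```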

I’ve written documentation for the portage DocBook (enable the doc USE flag on portage), which will be available once the patches get accepted.

Neuvoo has already utilized these hooks to add transparent and stable support for squashfs portage trees. The “emerge --sync” command is hijacked by a pre-sync hook to download the latest squashfs tree, and every time “emerge” is run, there is a hook that checks to be sure the squashfs tree is mounted.

I hope to use hooks for the following additional unofficial features:

  • When a package fails to merge, a hook would inform the user of known open bugs for that package, and possibly even suggest a fix. Still thinking over this one.
  • Since we’re working in embedded environments, we’re going to be primarily using binary packages on production systems. This means we’ll need a pretty solid binary repository format to keep everything stable. I’m hoping to use our beagleboards to keep these binaries up-to-date. A post-run hook would detect that a new binary package was built and port the binary into our new binary repository format (which will consist of ebuilds, since ebuilds have much better flexibility than the “Packages” file will ever have). In addition, Neuvoo users, and anyone else with our binary system, can submit new binaries to our server for automated review, which will probably be some kind of fancy voting system.

An unofficial version of portage and the squashfs hooks are both available right now in the neuvoo overlay if you want to try them out. Run “layman -a neuvoo && emerge -av squashfs-portage” to get them. Be warned: the squashfs-portage package requires the latest neuvoo portage git. I don’t know if you can revert to an old version of portage once you have the new one.


Posts for Saturday, June 5, 2010

It’s not just *what* you write, it’s *when* you write it

I’ve had this blog for quite a while now and there are a few things that I still haven’t really figured out. One of the things that always irritates me most is which posts get recognition (as in them being linked, twittered about or commented on).

There are posts that come quickly. I open the text editor thingy and just type, a few minutes later a post has emerged. Those are usually me venting or poking something retarded with a stick. Those posts (even if they end up being rather long at times) just come to me. They do not require a lot of work and while venting always relieves me, those posts are nothing I do feel particularly good about.

On the other hand there are posts that usually have meandered around in my head for weeks or months, ideas that have kept my mind occupied for a long time and that I try to meld into a post at some point in time. And usually those posts are really hard to write, I trash many of them, rewrite them, trash the rewrite. Those posts are hard work and when I manage to boil my thoughts down to something my limited writing skills are able to bring to virtual paper it feels different. It’s probably more like what a mother feels after having given birth (without the physical pain obviously). Those posts I wrestled with usually are something really important to me, some idea or comment I think is really really important. Also, those posts hardly ever gain any traction, but that’s probably cause they tend to be basically unreadable without the weeks of background thinking I did (I try to be better with that but it does not work so great so far ;) ).

But recently I realized that something else is very important for posts as well: Timing. With this I do not mean that a comment on some political issue has to be made directly after the event happened so people still have the event in mind, I am talking about the time of the day.

While some people still surf to sites directly to check out what’s new, a few years ago, many people switched to RSS readers. Personally I couldn’t live without one (I use the “evil” Google Reader [which is an absolutely brilliant RSS reader btw. you should really try it!]), but lately Twitter has turned out to be the thing driving traffic. I do what many others do, I have a service poll my RSS feed and post notices alerting people on my Twitter and Identi.ca feeds of the new post on the blog. Sometimes people retweet those notices to their followers and a post gets spread around.

But (and this is quite a serious but!) this has led to many people heavily using twitter to find new and interesting articles and twitter is “real-time-ish” meaning: Something written yesterday evening is basically gone today. Microblogs are a stream of information. You dive in, you get out and the stream continues flowing. It’s basically impossible to read all messages passing through so you only check them out when you have the time.

I am in Germany and I usually write in the evening when I have some time off. So if I write something in German (which does not happen all that often though a few recent articles in German were fun to write) and a few minutes later the post is picked up and posted to twitter many people might have already gone off to bed. In the morning they open their microblogging application and maybe even check the backlog of messages but during those 8 or more hours so many messages will have gone through that the blogpost has basically never existed.

Now, if you know me somewhat, I am stubborn. I don’t want to repeat posts notifying people of my blog, it’s just more noise I would add to the world. But it makes you wonder how many articles never get any traction just because somebody wrote them after work instead of in his or her lunch break.


OVAL, SCAP, CVE, CPE, …

For a personal POC I wanted to see if it is possible to generate, based on the collection of CVE entries publicly available, a report informing a system administrator about possible vulnerabilities. Nothing fancy, just based upon versions.

A simple example: tool detects Perl, acquires installed Perl version, then matches the collection of CVE entries against this Perl version. If at least one CVE is found, report it. The idea is then to make this as generic as possible (not specific for an operating system or Linux distribution), so not use a package version but really the tool version (or library version).
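
The version-matching step could be sketched roughly like this in C++. This is only an illustration of the idea, not the POC itself: the Advisory struct, the function names and the placeholder IDs are all made up, no real CVE data is involved, and it assumes purely numeric, dot-separated version strings.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record for one advisory; the fields are assumptions made for
// this sketch and do not mirror the real CVE data format.
struct Advisory
{
    std::string id;             // placeholder identifier, not a real CVE number
    std::string product;        // tool or library name, e.g. "perl"
    std::string first_affected; // first vulnerable version (inclusive)
    std::string first_fixed;    // first fixed version (exclusive)
};

// Split "5.10.1" into {5, 10, 1}. Trailing zeros are dropped so that
// "5.10" and "5.10.0" compare equal. Assumes numeric components only;
// std::stoi throws on anything else.
std::vector<int> parse_version(const std::string & s)
{
    std::vector<int> parts;
    std::istringstream in(s);
    std::string piece;
    while (std::getline(in, piece, '.'))
        parts.push_back(std::stoi(piece));
    while (! parts.empty() && 0 == parts.back())
        parts.pop_back();
    return parts;
}

// Return the IDs of all advisories whose [first_affected, first_fixed)
// range contains the installed version of the given product.
std::vector<std::string> matching_advisories(
        const std::string & product,
        const std::string & installed,
        const std::vector<Advisory> & advisories)
{
    std::vector<std::string> hits;
    const std::vector<int> version(parse_version(installed));
    for (const Advisory & a : advisories)
    {
        if (a.product != product)
            continue;
        // std::vector compares lexicographically, which matches the usual
        // ordering of dotted numeric versions.
        if (! (version < parse_version(a.first_affected))
                && version < parse_version(a.first_fixed))
            hits.push_back(a.id);
    }
    return hits;
}
```

A report would then simply list whatever matching_advisories() returns for each detected tool and its installed version.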

Of course, whenever I am planning such minor POCs, I search the Internet for possible existing tools (just like kev009 describes – “But First, Write No Code”). And I found out that there are already quite some “foundation components” available…

  • CPE is a structured way of naming software (vendor, title, version …)
  • OVAL is a method for performing structured tests (like regular expression matches in text) for reporting purposes

Many more of these efforts are linked through the Mitre sites. The above two are the most important ones though – it seems that it might be possible to use OVAL to describe the tests I wanted for the POC.

To be continued…


Listing files of (not) installed software

Everyone that has been using Gentoo for a while now knows about tools such as qlist that show you the list of files installed by an (installed) package, or qfile that allows you to find which package provided a particular file on your system.

One thing lacking is to be able to find out which package would provide a file. Unlike the previous tools, this tool cannot rely on the information found on your system as the package isn’t installed yet.

There have been projects in the past that attempted to provide such functionality, almost always through an online queryable database. Many haven’t survived, due to too high expectations or too few server resources. But it seems like PortageFileList is here to stay for a while.

The project not only offers an online interface for querying information, it also provides a package (app-portage/pfl) that allows you to query their infrastructure from the command line. The package provides a tool called e-file which supports SQL-like syntax for the queries.


~$ e-file '%bin/xdm'

The above command will then display, using the well-known emerge/Portage output, which package provides the file (as well as which file was matched by the query).

Definitely a nice tool to have around. Thanks guys of PortageFileList!

Posts for Friday, June 4, 2010

Configuring Repositories Automatically via RepositoryRepository

Paludis is aware of packages that are in repositories you don’t have configured thanks to the unavailable repository. However, once Paludis has shown you that the package you want is in a repository you don’t have configured, you need to set up a configuration file for that repository (and any repositories it requires) and then sync. This is more work than is really necessary.

Enter RepositoryRepository, also known as r^2. Conceptually, it works as follows:

As well as providing special packages for packages in unavailable repositories, the unavailable repository also now provides packages named ‘repository/blah’ for repositories you don’t have configured. The metadata for these packages includes dependency information etc, along with useful things like the repository’s sync URI.

A new repository, using format = repository, provides special packages for repositories you do have configured.

Repository packages in unavailable repositories can be ‘installed’ to repository repositories. ‘Installing’ a repository creates a configuration file for it, and then syncs the newly created repository.

The configuration file it creates is controlled by a simple template, so it can contain anything you want it to contain.

Exherbo users can follow the setup instructions to start using this. On Gentoo this functionality is not yet available, since we won’t be switching the generated unavailable data to the new format until we’re reasonably sure that everyone is using a Paludis release that supports it.



Can't win here

In the recent election, some of the parties put charts into their literature. In this post I analyse their accuracy.

Last month in the UK we had a general election, in which we elect a Member of Parliament to represent our little area (called a 'constituency'). During the election period, I received a large amount of A4 paper from each of the various parties. I threw out hundreds of them, but I still managed to find a representative sample of them lying in my hall.

A couple of them are leaflets for the neighbouring constituency! These are a complete waste of time.

Many of the leaflets share an interesting feature, little bar graphs or pie charts.

This is a commonly used tactic of the Liberal Democrat Party, to depict themselves as the second candidate in the local area, as opposed to their national profile as the third party. Here are some leaflets from the Lib Dem candidate in my area, showing these pictures.

http://commandline.org.uk/images/posts/hallgreen/numbers_libdems.jpg

The bar charts are a not very subtle appeal to people who would otherwise vote for the Conservative party. The argument is churlish and not in the British sporting tradition of fair play, candidates should campaign on their own merits.

However, it is not inaccurate from a demographic viewpoint. My area mainly consists of white working class workers, public sector workers and a growing Asian population. None of these groups are disposed to vote Conservative, and in the vote, the Conservatives came a pathetic fourth.

Not that it stopped the Conservative candidate making her own graph in her literature. I found it hilarious so I will zoom in and show it in its full glory. It is dodgy (even dishonest) on several levels.

http://commandline.org.uk/images/posts/hallgreen/numbers_tories.jpg

Firstly, it is not even a general election result, it is showing a local council by-election. Councillors have an important role collecting rubbish and other local issues, but it is not the same thing at all.

Secondly, as a council by-election, it covers a much smaller area than the general election constituency. So there is no way to tell how the larger area will vote based on the east end (Sparkbrook) of the constituency. This is especially the case because Sparkbrook, as the Balti capital of the world, is a majority Asian area, whereas the other parts of the constituency have a more dispersed ethnic population. Anyway, here are the results in that 2009 council by-election:

Name               Party               Votes
Ali Shokat         Respect              2495
Mohammed Azim      Labour               2228
Abdul Kadir        Conservative          799
Naeem Qureshi      Liberal Democrats     506
Charles Alldrick   Green                 213
Sakander Mahmood   Independent            55

So we can see here that the Conservatives came third, not even getting a third of the votes of the winning candidate. Therefore the most dishonest feature of the graph is showing the change in vote without supplying the total vote. When armed with the figures above, we can see that the Conservative vote went up by a few dozen votes and that they started from a very low base indeed.

The dishonesty is compounded by the annotations on the chart. It says "Can't win here!" pointing to Green, Labour and Respect. Even if we ignore my caveats about this data being irrelevant for the general election, as you can see from the absolute numbers from the by-election result, Respect and Labour were the top two results, so they could in fact win here.

This is further proved by looking at the actual vote last month:

Party               Name              Votes
Labour              Roger Godsiff     16,039
Respect             Salma Yaqoob      12,240
Liberal Democrats   Jerry Evans       11,988
Conservative        Jo Barker          7,320
UKIP                Alan Blumenthal      950
Independent         Andrew Gardner       190

So the Conservative chart looks extremely selective and misleading as Labour did in fact win the seat and Respect came second. Many media commentators were predicting a possible Respect victory.

The Respect party also had a chart. It is a well presented chart, showing the results of the previous general election in 2005, so it is not a complete fiction like the Conservative chart.

http://commandline.org.uk/images/posts/hallgreen/numbers_respect.jpg

The annotation is also much more positive, instead of "someone else can't win here", it is the rousing "She can do it!". Unlike the Conservative annotation, it was not a lie; Yaqoob had a real chance.

The Respect party was actually founded by George Galloway, a Scottish Catholic. However, locally in the campaign, Respect was perceived as the Muslim party, which limited its appeal among the indigenous population, and probably cost Respect the seat. Demographic changes are on Respect's side, though, so they may well get over the finishing line next time when many new Asian voters have come of age, assuming that the party does not run out of steam and cause Asian voters to return to the Labour party.

The chart is not without a problem. It is not the correct constituency. It is actually a neighbouring constituency. This is perhaps unavoidable because the constituency that Yaqoob ran for in 2005 was abolished due to boundary changes, and this current constituency did not have a Respect representative in 2005. However, I think the chart should have put a little note pointing out that the boundaries have changed and this is the nearest relevant result.

Lastly we are onto Labour's literature:

http://commandline.org.uk/images/posts/hallgreen/numbers_labour.jpg

As you can see, no chart. (The independent candidate's leaflet does not have a chart either.) Instead, on the Labour leaflet, there are pictures of the incumbent meeting people and a record of how he voted. For the incumbent candidate, especially when his/her party is doing worse nationally than the candidate is locally, there is less point to a chart like the type above. However, an accurate and honest chart would help potential voters to understand the context behind any dodgy charts, such as the one in the Conservative literature.


Posts for Tuesday, June 1, 2010

Discussions

Whenever people are asked why they blog, most people will answer that it is for the comments and discussions that emerge from well-written blog posts. This idea has been picked up by people who write “guides” on how to write for a successful blog and is usually phrased something like this:

Finish your post with a direct question to your readers to get the discussion started.

If you look at this blog you will see that most posts do not generate any comments at all. Some people might argue that this stems from me writing mostly crappy posts and for some they might actually be right, but I’m not always that bad so there’s gotta be a different reason for this.

This actually ties in with a complaint my girlfriend sometimes faced me with: The fact that I don’t ask a lot.

Not asking many questions might sound weird, especially since I do in fact consider myself to be a rather curious person interested in basically everything, but it is true to a certain extent. Especially when I talk or write about something I do not ask for other people’s opinion which is often perceived as lack of interest when it is something slightly different.

I want to hear your input. Almost every human being is interesting and has something interesting to add and I hate to miss out on that kind of stuff. But on the other hand I hate stupidity.

There’s the saying that there are no stupid questions which is absolute bullshit. Yes there are. Asking for something the speaker explained just 10 seconds ago is stupid (and shows lack of interest) for example. And just as there are stupid questions there are many stupid answers and comments, comments which I just don’t care about. I don’t want this blog to be full of “Yes, me, too” or “I dunno” comments, I don’t actually want to make commenting too “easy”.

The ease of commenting has something to do with the software used here (which makes commenting really easy if you know how to type a name and an email address) and with a … let’s call it a social barrier.

If I ended every post asking for everybody’s input I’d make commenting really easy. I’d roll out the red carpet and ask you to please come in. That’s not the modus operandi here.

I am thankful for everybody who comes here and reads my stuff (or who reads this via the feed), I appreciate the time you take out of your busy and short lives to spend it on this. That is the reason I put all this stuff out here under free licenses. What I don’t care about is a random count of comments.

When it comes to comments I care about quality, I care about people seriously challenging me and the ideas or thoughts I put forward. I’d rather have 3 posts without any comments and one with somebody seriously taking the time to challenge me than having 20 comments on every post.

I mean, it’s not like everything here is pure and utter intellectual brilliance, some posts are just retarded fun, some are just a quick linkdump or an embedded video. Why should you comment here on a video I embedded from youtube? Why not go to the youtube video instead? But there are some articles here that I did put some thought and some time in. Posts that are important to me. And I’d hate to have a possible discussion watered down.

I am similar in real life, I don’t ask for your input when I talk a lot. That does not mean that I don’t care it’s just that I don’t make it ridiculously easy for you. I put thoughts into what I write and say. Sometimes my opinions and ideas might even be controversial, might even get me into trouble. That’s a risk I take. So when you comment here I make you take some risk as well. I don’t prestructure the discussion with questions. I don’t open the door and roll out the red carpet.

But if you come and just get into the discussion (in real life just as virtually) I enjoy every minute of it. Yes, there is no red carpet, but if you manage to find the courage to come and open the door, you’ll find a seat at the table reserved just for you.

gallium3D on gentoo

OK, today I'll give a short howto about Gallium3D.

Gallium3D is a new 3D API under Linux. It not only supports OpenGL, but also OpenVG, GPGPU and Direct3D. Right now Gallium3D is still under heavy development, but it's already quite stable. For a few days now it has also been very easy to get this new API. We only have to unmask mesa-9999 and keyword both mesa and eselect-mesa (which is needed by mesa-9999).

This is easily done by doing (as root) :

echo "media-libs/mesa" >> /etc/portage/package.unmask

echo "media-libs/mesa" >> /etc/portage/package.keywords

echo "app-admin/eselect-mesa" >> /etc/portage/package.keywords

Afterwards we can set the gallium API via eselect:

eselect mesa set r300 gallium

eselect mesa set sw gallium

Now everything runs through Gallium3D, but how much does it bring? Here is a short glxgears test:

gallium: 12124 frames in 5.0 seconds = 2424.745 FPS

classic: 11817 frames in 5.0 seconds = 2363.365 FPS

This is on my Radeon X1900 XT. Not much in glxgears (about 2.6% faster), but ut2004 has become noticeably more playable. :)

Posts for Friday, May 28, 2010

But First, Write No Code

Something I see often in person and online are programmers constantly implementing common solutions, reinventing wheels, or embracing NIH.

Before you do this, please consider Kev009’s Oath: “But First, Write No Code”. This is a solution to a variety of problems in software development, but today’s article is specifically about using external code.

I’ve found that programmers who follow a system similar to mine (detailed below) develop systems that are more stable, maintainable, and sane.  They likely write better code because it means they understand their tools and also read others’ code.  They examine the problem first rather than going in guns blazing.

Steps to decide whether to use an existing solution or write your own implementation:

  1. Scan the area. Google, Freshmeat, SourceForge, standard library, OS libraries, etc. are your friend. See if the problem you are trying to solve has been solved. I don’t care how long you’ve been programming or how much you think you know. The ecosystem of a language is constantly changing.

    Make a list of hits that look similar to the problem you are trying to solve. Try and get a quick sense of the idiomatic methods of using your language, OS, etc.

  2. Do research. Are the solutions you found in step 1 suitable to the problem at hand? Consider the pros and cons of each item. Now, carefully evaluate how idiomatic the items are to your language and environment.

    If the item is open source, does the community seem active? If it doesn’t fully map to your problem, does it look like you can modify it to do so?

    Even if you end up developing a solution from scratch, you should at least now have some good references.  Keep in mind, extending an existing project may be considerably less work.  You might even be able to offload maintenance of that component.

  3. Consider the license. This isn’t just for the legal department. What kind of project you are working on will weigh in heavily. Commercial or open source? As a software professional, you need to be abreast of the various licenses in the wild. As an open source developer, you need to consider how licenses will affect your work being packaged by distributions.

    An open source library licensed under the GPL is not acceptable for static linking to commercial software. However, you can link to an operating system provided copy or bundle the dynamic library with your application. LGPL does not have this restriction. With both of these, you must supply your changes upon request from end users among other things.

    BSD, MIT, and Apache style licenses allow you to make changes and redistribute under completely different licenses.  Some just want credit in your documentation.  These are very compelling even in commercial development.

    Commercial components may have a per-copy fee associated which may dissuade their use by your organization.  If you don’t get the source, you won’t be able to effectively change or maintain it so you will also be at mercy of that developer.

  4. Make a decision. By now, your list should have been pared down based on licenses and research. Perform extensive evaluations of the remainder and eventually home in on the one you think fits best. You’re going to have to rely on your experience and intuition while making the critical decision. Perhaps the hardest part: weighing it against a mythical home-grown solution in your mind.

  5. Implement the decision. Self explanatory. This either means bootstrapping your own project or fully integrating the external one. If you are extending an open source solution, consider submitting the patches back to the community for feedback and perhaps integration. If you are bootstrapping your own solution, you’ve got your work cut out. Is this only suitable for an internal project, or perhaps it would have its own merit as a new open source project?

    Be sure to reevaluate early and often. That library you chose might turn out to be a can of worms, just as the “easy” new solution you had in your head might require years of development.

  6. Subscribe to the announce mailing list. Only if you used an external solution. Does the project have an RSS feed for releases or a low volume announcement list? Don’t be like Adobe. Avoid embarrassing security problems. Also consider how enhancements and bug fixes to the external project might make your own project better, more stable, and more efficient. This is where the real lasting dividends of using an external solution come from.

This list is widely applicable.  You’ve got a seriously high bar to reach if you are developing containers of <T>, sorting methods, GUI frameworks, parsers, text and binary file formats, and much more so try and follow it the next time you code.


Posts for Monday, May 24, 2010


scanning for base64_decode references

A friend’s site was recently hit by the massive infections/hacks on Dreamhost’s servers, so I decided to do some scanning on some servers that I administrate for base64_decode references.

The simple command I used to find suspect files was:
# find . -name \*.php -exec grep -l "eval(base64_decode" {} \;

The results could be sorted into just two categories: malware and stupidity. There was no base64_decode reference that did something useful in any possible way.

The best malware I found was a slightly modified version of the c99 php shell on a hacked joomla installation (the site has been hacked multiple times but the client insists on just re-installing the same joomla installation over and over and always wonders how the hell do they find him and hack him…oh well). c99 is impressive though…excellent work. I won’t post the c99 shell here…google it, you can even find infected sites running it and you can “play” with them if you like…

And now comes the good part, stupidity.
My favorite php code containing a base64_decode reference that I found:

$hash  = 'aW5jbHVkZSgnLi4vLi';
$hash .= '4vaW5jX2NvbmYvY29u';
$hash .= 'Zi5pbmMucGhwJyk7aW';
$hash .= '5jbHVkZSgnLi4vLi4v';
$hash .= 'aW5jX2xpYi9kZWZhdW';
$hash .= 'x0LmluYy5waHAnKTtl';
$hash .= 'Y2hvICRwaHB3Y21zWy';
$hash .= 'd2ZXJzaW9uJ107';
eval(base64_decode($hash));

Let’s see what this little diamond does:

% base64 -d
aW5jbHVkZSgnLi4vLi4vaW5jX2NvbmYvY29uZi5pbmMucGhwJyk7aW5jbHVkZSgnLi4vLi4vaW5jX2xpYi9kZWZhdWx0LmluYy5waHAnKTtlY2hvICRwaHB3Y21zWyd2ZXJzaW9uJ107
include('../../inc_conf/conf.inc.php');include('../../inc_lib/default.inc.php');echo $phpwcms['version'];
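
The same unpacking can be reproduced in a few lines of C++ if you want to inspect such blobs without a shell. This is only a minimal sketch: decode_base64 and decode_chunks are names made up for illustration, and the decoder simply stops at the first padding or non-alphabet character instead of validating its input.

```cpp
#include <string>
#include <vector>

// Decode standard base64 (RFC 4648 alphabet). Decoding stops at the first
// '=' padding character or at any character outside the alphabet.
std::string decode_base64(const std::string & in)
{
    static const std::string alphabet(
            "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
            "abcdefghijklmnopqrstuvwxyz"
            "0123456789+/");

    std::string out;
    unsigned buffer(0); // accumulates 6 bits per input character
    int bits(0);        // number of not yet emitted bits in buffer
    for (char c : in)
    {
        const std::string::size_type pos(alphabet.find(c));
        if (pos == std::string::npos)
            break;
        buffer = (buffer << 6) | static_cast<unsigned>(pos);
        bits += 6;
        if (bits >= 8)
        {
            bits -= 8;
            out.push_back(static_cast<char>((buffer >> bits) & 0xFF));
        }
    }
    return out;
}

// Join the pieces in order, then decode the result in one go, which is what
// the repeated ".=" followed by base64_decode() in the PHP snippet amounts to.
std::string decode_chunks(const std::vector<std::string> & chunks)
{
    std::string joined;
    for (const std::string & chunk : chunks)
        joined += chunk;
    return decode_base64(joined);
}
```

Feeding the eight string pieces from the PHP snippet to decode_chunks() should give back the same include/echo statements that base64 -d printed.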

So this guy built a base64 encoded string out of a series of smaller strings in order to prevent someone from changing the version tag of his software. That’s not software, that’s crapware. Hiding the code where the version string appears? That’s how you protect your software? COME OOOOON….

Runtime Type Checking in C++ without RTTI

A technique I always seem to forget is how to map C++ types to an integer without relying upon RTTI. A variation on this is used in <locale> in the standard library, for std::use_facet<>. But let’s take a much simpler, and highly contrived, example.

Let’s say we’ve got some values of different types, and we want to give those types to a library to store somewhere, and then we later want to get them back again. Crucially, the library itself doesn’t know anything about the types in question. So, for a very simple case:

#include <vector>
#include <iostream>
#include <string>

int main(int, char *[])
{
    std::vector<Something> things = { std::string("foo"), 123 };
    /* ... */
    std::cout << things[0].as<std::string>() << " " << things[1].as<int>() << std::endl;
}

Note the gratuitous use of C++0x initialiser lists, just because we can.

Those familiar with Boost might think that Something is like boost::any. However, boost::any uses RTTI, which is slow and completely unnecessary.

A first implementation of Something might look like this:

#include <memory>

class Something
{
    private:
        struct SomethingValueBase
        {
            virtual ~SomethingValueBase()
            {
            }
        };

        template <typename T_>
        struct SomethingValue :
            SomethingValueBase
        {
            T_ value;

            SomethingValue(const T_ & v) :
                value(v)
            {
            }
        };

        std::shared_ptr<SomethingValueBase> _value;

    public:
        template <typename T_>
        Something(const T_ & t) :
            _value(new SomethingValue<T_>(t))
        {
        }

        template <typename T_>
        const T_ & as() const
        {
            return static_cast<const SomethingValue<T_> &>(*_value).value;
        }
};

This works, but has a major flaw: if you get the types wrong when calling Something::as<>, the behaviour is undefined, and you’ll get a segfault or something similarly horrible. We’d like to replace that with something safer.

One way to do it is to use runtime type information. The simplest variation on this is to replace the static_cast with a dynamic_cast. We can only do this if SomethingValueBase is a polymorphic type; its virtual destructor is what makes it one:

#include <memory>

class Something
{
    private:
        struct SomethingValueBase
        {
            virtual ~SomethingValueBase()
            {
            }
        };

        template <typename T_>
        struct SomethingValue :
            SomethingValueBase
        {
            T_ value;

            SomethingValue(const T_ & v) :
                value(v)
            {
            }
        };

        std::shared_ptr<SomethingValueBase> _value;

    public:
        template <typename T_>
        Something(const T_ & t) :
            _value(new SomethingValue<T_>(t))
        {
        }

        template <typename T_>
        const T_ & as() const
        {
            return dynamic_cast<const SomethingValue<T_> &>(*_value).value;
        }
};

Now, if we get the types wrong, a std::bad_cast will be thrown. Alternatively, we can use our own exception type:

class SomethingIsSomethingElse
{
};

class Something
{
    /* snip */

    public:
        template <typename T_>
        const T_ & as() const
        {
            auto value_casted(dynamic_cast<const SomethingValue<T_> *>(_value.get()));
            if (! value_casted)
                throw SomethingIsSomethingElse();
            return value_casted->value;
        }
};

We can also make use of std::dynamic_pointer_cast, which is possibly slightly less ugly syntactically:

class Something
{
    /* snip */

    public:
        template <typename T_>
        const T_ & as() const
        {
            auto value_casted(std::dynamic_pointer_cast<const SomethingValue<T_> >(_value));
            if (! value_casted)
                throw SomethingIsSomethingElse();
            return value_casted->value;
        }
};
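
To see the checked version in action, here is a minimal self-contained sketch that repeats the pointer dynamic_cast variant of Something from above and adds a small holds<> helper. The helper is hypothetical and exists only for this illustration; it is not part of the class as described.

```cpp
#include <memory>
#include <string>

class SomethingIsSomethingElse
{
};

class Something
{
    private:
        struct SomethingValueBase
        {
            // The virtual destructor makes the base polymorphic, which
            // dynamic_cast requires.
            virtual ~SomethingValueBase()
            {
            }
        };

        template <typename T_>
        struct SomethingValue :
            SomethingValueBase
        {
            T_ value;

            SomethingValue(const T_ & v) :
                value(v)
            {
            }
        };

        std::shared_ptr<SomethingValueBase> _value;

    public:
        template <typename T_>
        Something(const T_ & t) :
            _value(new SomethingValue<T_>(t))
        {
        }

        template <typename T_>
        const T_ & as() const
        {
            // A failed pointer dynamic_cast yields a null pointer rather
            // than throwing, so we can throw our own exception type.
            auto value_casted(dynamic_cast<const SomethingValue<T_> *>(_value.get()));
            if (! value_casted)
                throw SomethingIsSomethingElse();
            return value_casted->value;
        }
};

// Hypothetical helper: true if s holds a T_, false otherwise.
template <typename T_>
bool holds(const Something & s)
{
    try
    {
        s.as<T_>();
        return true;
    }
    catch (const SomethingIsSomethingElse &)
    {
        return false;
    }
}
```

With this, holds<std::string>(thing) is true for a Something built from a std::string and false for one built from an int, and a mismatched as<>() throws instead of running into undefined behaviour.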

All of this is using RTTI, though, and RTTI is a huge amount of overkill for what we need. Before eliminating the RTTI, though, we’ll switch to using it in a different way:

#include <memory>
#include <string>
#include <typeinfo>

class Something
{
    private:
        // Empty tag type, used only to get a distinct typeid per T_.
        template <typename T_>
        struct SomethingValueType
        {
        };

        struct SomethingValueBase
        {
            std::string type_info_name;

            SomethingValueBase(const std::string & t) :
                type_info_name(t)
            {
            }

            virtual ~SomethingValueBase()
            {
            }
        };

        template <typename T_>
        struct SomethingValue :
            SomethingValueBase
        {
            T_ value;

            SomethingValue(const T_ & v) :
                SomethingValueBase(typeid(SomethingValueType<T_>()).name()),
                value(v)
            {
            }
        };

        std::shared_ptr<SomethingValueBase> _value;

    public:
        template <typename T_>
        Something(const T_ & t) :
            _value(new SomethingValue<T_>(t))
        {
        }

        template <typename T_>
        const T_ & as() const
        {
            if (typeid(SomethingValueType<T_>()).name() != _value->type_info_name)
                throw SomethingIsSomethingElse();
            return std::static_pointer_cast<const SomethingValue<T_> >(_value)->value;
        }
};

Here we make use of typeid explicitly, which is widely considered to be about on par with use of goto. However, it paves the way for our next step. Can we replace typeid(SomethingValueType<T_>()).name() with a different, non-evil expression? Let’s think about what properties the result of that expression must have:

  • We must be able to store it, so it needs to be a regular type.
  • We must be able to compare values of it, and be guaranteed true if and only if the two types used to create the value are the same, and false if and only if they are different. (Note that RTTI doesn’t even provide this guarantee.)

Let’s try this:

#include <memory>
#include <string>

class SomethingIsSomethingElse
{
};

template <typename T_>
struct SomethingTypeTraits;

class Something
{
    private:
        struct SomethingValueBase
        {
            int magic_number;

            SomethingValueBase(const int m) :
                magic_number(m)
            {
            }

            virtual ~SomethingValueBase()
            {
            }
        };

        template <typename T_>
        struct SomethingValue :
            SomethingValueBase
        {
            T_ value;

            SomethingValue(const T_ & v) :
                SomethingValueBase(SomethingTypeTraits<T_>::magic_number),
                value(v)
            {
            }
        };

        std::shared_ptr<SomethingValueBase> _value;

    public:
        template <typename T_>
        Something(const T_ & t) :
            _value(new SomethingValue<T_>(t))
        {
        }

        template <typename T_>
        const T_ & as() const
        {
            if (SomethingTypeTraits<T_>::magic_number != _value->magic_number)
                throw SomethingIsSomethingElse();
            return std::static_pointer_cast<const SomethingValue<T_> >(_value)->value;
        }
};

Now, our library user has to provide specialisations of SomethingTypeTraits for every type they wish to use:

#include <string>
#include <iostream>
#include <vector>

template <>
struct SomethingTypeTraits<int>
{
    enum { magic_number = 1 };
};

template <>
struct SomethingTypeTraits<std::string>
{
    enum { magic_number = 2 };
};

int main(int, char *[])
{
    std::vector<Something> things = { std::string("foo"), 123 };
    std::cout << things[0].as<std::string>() << " " << things[1].as<int>() << std::endl;
}
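
To see how easily hand-written numbers can go wrong, here is a hedged sketch in which a second specialisation is given a number that is already taken. The long specialisation and the magic_numbers_collide helper are made up for illustration and are not part of the code above.

```cpp
template <typename T_>
struct SomethingTypeTraits;

template <>
struct SomethingTypeTraits<int>
{
    enum { magic_number = 1 };
};

// Hypothetical copy-and-paste mistake: 1 is already used for int.
template <>
struct SomethingTypeTraits<long>
{
    enum { magic_number = 1 };
};

// Illustrative helper: true if as<>() could not tell the two types apart.
template <typename A_, typename B_>
constexpr bool magic_numbers_collide()
{
    return static_cast<int>(SomethingTypeTraits<A_>::magic_number)
        == static_cast<int>(SomethingTypeTraits<B_>::magic_number);
}
```

With these traits, a Something holding a long would pass the check in as<int>(), and the static_pointer_cast would then treat the stored value as the wrong type, which is undefined behaviour. The compiler accepts all of it without complaint.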

No RTTI at all there, and it is type safe, but it relies upon a lot of boilerplate from the library user, and that boilerplate is very easy to screw up. So, we’ll allocate magic numbers automatically instead:

#include <memory>

class SomethingIsSomethingElse
{
};

class Something
{
    private:
        static int next_magic_number()
        {
            static int magic(0);
            return magic++;
        }

        template <typename T_>
        static int magic_number_for()
        {
            static int result(next_magic_number());
            return result;
        }

        struct SomethingValueBase
        {
            int magic_number;

            SomethingValueBase(const int m) :
                magic_number(m)
            {
            }

            virtual ~SomethingValueBase()
            {
            }
        };

        template <typename T_>
        struct SomethingValue :
            SomethingValueBase
        {
            T_ value;

            SomethingValue(const T_ & v) :
                SomethingValueBase(magic_number_for<T_>()),
                value(v)
            {
            }
        };

        std::shared_ptr<SomethingValueBase> _value;

    public:
        template <typename T_>
        Something(const T_ & t) :
            _value(new SomethingValue<T_>(t))
        {
        }

        template <typename T_>
        const T_ & as() const
        {
            if (magic_number_for<T_>() != _value->magic_number)
                throw SomethingIsSomethingElse();
            return std::static_pointer_cast<const SomethingValue<T_> >(_value)->value;
        }
};

How does this work? Each instantiation of the magic_number_for<T_> function needs to return the same magic number every time it is called. The first time any particular instantiation is called, its static int result requests the next magic number. On subsequent calls, the allocated number is remembered. (Note that static values inside a template are not shared between different instantiations of that template.) Finally, next_magic_number just returns a new magic number every time it is called.
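
The allocation trick can be pulled out of the class and tried on its own. This is a minimal sketch rather than the code above: the functions are free functions instead of private statics, and the counter is a std::atomic<int>, which is an addition made here on the assumption that different instantiations might be initialised from different threads at the same time.

```cpp
#include <atomic>
#include <string>

// Hands out a fresh number on every call. The std::atomic is an addition
// in this sketch; the original uses a plain static int.
inline int next_magic_number()
{
    static std::atomic<int> magic(0);
    return magic++;
}

// One 'result' per instantiation, initialised on the first call only.
template <typename T_>
int magic_number_for()
{
    static const int result(next_magic_number());
    return result;
}
```

Calling magic_number_for<int>() twice gives the same number both times, and magic_number_for<std::string>() gives a different one. Which number a type ends up with depends on the order of the first calls, so the values only mean something within a single run of the program.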

And there we have it: fast runtime type checking with no boilerplate and no RTTI. What we’ve done here is more or less useless, but the techniques do have other applications. For the curious, std::use_facet<> is probably the most common, and anyone brave enough to delve into its design will eventually see why this isn’t either pointless wankery or reinventing the wheel. For the rest, if you think that using RTTI can solve your problem adequately, then it probably can, and you don’t need to go into the kind of devious trickery the standard library uses internally.
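
For anyone who has not met std::use_facet before, here is a minimal sketch of what a call looks like. The upper_via_facet wrapper is a made-up name for illustration; std::use_facet, std::ctype and std::locale::classic are standard.

```cpp
#include <locale>

// Look up the ctype<char> facet of the classic "C" locale and use it
// to convert one character to upper case.
char upper_via_facet(char c)
{
    return std::use_facet<std::ctype<char> >(std::locale::classic()).toupper(c);
}
```

Each standard facet class carries a static member called id, of type std::locale::id, and use_facet relies on it to find the right facet inside the locale. How that id gets its value is left to the implementation.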


Filed under: c++ Tagged: c++, rtti

Posts for Sunday, May 23, 2010

set up swappiness on boot

Swap is space on the hard drive that the system uses when it runs short of memory. The swappiness variable, which you can find under /proc/sys/vm/, controls how readily that happens. A higher value means the system would rather use the swap space than try to free some memory. A lower value means, of course, the opposite.

Well, on my server the system runs on SSDs (to be exact: two SSDs in RAID 0). The bad thing about these drives is that they have limited write cycles, so fewer writes should keep the drives alive longer.

For that reason I changed the swappiness on my server to 30. To change it from the default setting, type:

sysctl vm.swappiness=30

But the problem is that the system loses this setting on every reboot. A short boot script could set it on every boot, but there is actually an easier way.

In /etc there is a file called sysctl.conf. Here we can set things like ip_forward, and we can also set the swappiness. Just edit the file and add a new line:

vm.swappiness = 30

That's all. Now the system sets the swappiness value to 30 on every boot.

Let us contemplate existence

In my last few posts (1, 2), I followed my readers' advice and have been reviewing the book "Programming the Semantic Web" published by O'Reilly. The full reference is below:

"Programming the Semantic Web by Toby Segaran, Colin Evans, and Jamie Taylor. Copyright 2009 Toby Segaran, Colin Evans, and Jamie Taylor, 978-0-596-15381-6."

Now we pick up at chapter 6, which deals with ontologies, the reason I started working through this book. So without further ado, let's jump back in:

"The Web Ontology Language (OWL) is an RDF language developed by the W3C for defining classes and properties, and also for enabling more powerful reasoning and inference over relationships." (page 135)

The chapter explains the main classes in OWL. owl:Thing is a superclass of every other class (page 136), like 'object' in Python's new-style classes. The chapter also outlines owl:Class, owl:DatatypeProperty, owl:ObjectProperty and rdf:XMLLiteral, which you might be able to figure out from the names.

The chapter then outlines the following properties: rdf:type, rdfs:subClassOf, rdfs:domain and rdfs:range. Then on pages 137-140, the chapter defines a schema for films in OWL using RDFLib. The book is worth it just for this.

http://commandline.org.uk/images/posts/semantic/bladerunner2.jpg

Using this schema we record the information that Harrison Ford played Rick Deckard in Blade Runner, directed by Ridley Scott. I have made a visualisation of that using GraphViz (as introduced in a previous chapter); you can download the image here and examine it in whatever image viewer your computer has. What a lot of noise for those few simple facts! But that is what it takes to get the applications to understand the semantics.

The chapter then moves on to look at the GUI programme Protégé. I have already been introduced to this by Peter, who is a big fan. Protege (I am bored with the accents already) is a Java program, so it will run fine on any system with Java (i.e. almost all of them). The chapter works through the GUI features of Protege in a matter-of-fact way, building up an ontology.

The approach the chapter outlines is to develop your ontology using Protege and then load the data using scripts and programs rather than using the GUI. On page 145, the chapter loads the ontology created using Protege into RDFLib by creating an instance of the ConjunctiveGraph class and then using the 'load' method.

Going back to the post before I started working through this book, namely Headfirst into the Semantic Web, this seemed to be a simple approach to the work I need to do. However, there are many other packages outlined later in the book, so I may change my mind.

On pages 146-147, the chapter goes back to some OWL theory, looking at 'Functional and Inverse Functional Properties', 'Inverse Properties' and 'Disjoint Classes'. The chapter then points out some ontologies available online that are worth examining (pages 148-149). Lastly, the chapter works through an ontology for beer (pages 149-151), an appropriate place to end. I might grab a cold ale myself.


Still swimming in the Semantic Web

In my last post, I followed my readers' advice and checked out the book "Programming the Semantic Web" published by O'Reilly. The full reference is below:

"Programming the Semantic Web by Toby Segaran, Colin Evans, and Jamie Taylor. Copyright 2009 Toby Segaran, Colin Evans, and Jamie Taylor, 978-0-596-15381-6."

I stopped in the middle of chapter 3; in this post we keep going with the review. The book tells us that:

"Inference is the process of deriving new information from information you already have." (page 43)

For example, you might have one piece of information, then download a second from the web, and then from these two pieces of information, derive a third. One of the examples given in the chapter is "If I know a restaurant's address, I can use a geocoder to find its coordinates on a map" (page 43).

The chapter goes on to work out which restaurants in Washington DC are likely to be touristy. It does that by working out which restaurants are near a tourist attraction and are at the same time cheap. It uses this example to explain how inference rules can chain together to generate new information:

"What's important to realize here is that the rules exist totally independently. Although we ran the three rules in sequence, they weren't aware of each other - they just looked to see if there were any triples that they knew how to deal with and then created new ones based on those. These rules can be run continuously—even from different machines that have access to the same triplestore - and still work properly, and new rules can be added at any time." (page 49)

The chapter then looks at merging graphs together, allowing queries across data from different sources. Then the chapter ends with some fun: we get to generate graphic visualisations with the program graphviz (which I discovered that I already had on my system).

http://commandline.org.uk/images/posts/semantic/swimming.jpg

Image by eteela used with permission.

Chapter 4 dives straight into RDF. In RDF, everything is a resource, identified by a URI (page 65). A URI does not have to be retrievable as a URL, though to aid uniqueness, it is a convention to use a hostname that you control as the first part of the URI. RDF allows the use of a blank node for situations where you do not know the URI (page 67); these are given an arbitrary ID starting with an underscore and a colon (_:).

RDF can be expressed in different serialization formats (page 69); the chapter demonstrates these using an established RDF vocabulary, Friend of a Friend (FOAF), as the primary example.

The first RDF serialization format covered is N-Triples, a series of statements, each one "containing a subject, predicate, and object followed by a dot" (page 71). N3 is very similar to N-Triples but various shorthands are introduced to remove redundancies (page 72).

The XML representation of RDF, which is perhaps what most people think of as RDF, is covered next (pages 73-76). Lastly comes RDFa, where XML attributes are added to XHTML tags, allowing one document to carry both the human-readable and the machine-readable content (page 76). The extra XML attributes "specify the semantics behind the information that is displayed" (page 76).

Chapter 4 leaves behind the simplified tools from the previous chapters and breaks out RDFlib and SPARQL. SPARQL is a query language for RDF graphs. It is read-only, and if you (dis)like SQL then you will equally (dis)like SPARQL. The chapter uses RDFlib to demonstrate this. I might cover RDFlib myself in a future post, so I will skip talking about it for now. Briefly, it seems a really useful library and the first port of call for dealing with all things RDF in Python.

Chapter 5 is "Sources of Semantic Data", which seems to have extensive examples of the work covered so far. I skipped this chapter for now so that I could press straight on to chapter 6 about ontologies.

As I have been going along, I have been trying out all the examples, and there have been several small errors in the code, especially inconsistently named references and files. This is not supposed to be a book aimed just at Python users, this is a general book for anyone, with the examples just happening to be in Python. Therefore there could be a little more proofreading, i.e. user testing, of the examples.

This did not spoil it for me; as a more experienced Python programmer I could just fix the examples as I went. So far I have found the book really engaging and useful, and I am very keen to read the rest.


Posts for Friday, May 21, 2010

Emacs: Yank lines as lines

One nice thing about Vim is manipulating whole lines at a time. dd deletes a line (including trailing newline), regardless of where the cursor is on the line. Then, p puts that line (with its newline) as a new line after the current line, and P puts it above the current line, again regardless of where your cursor is at the moment. (It also jumps the cursor to the beginning of the text you just inserted, which is nice.)

Emacs has kill-whole-line (C-S-Backspace) which is like Vim's dd. But I didn't find an equivalent of p and P. So here's my version:

(defun yank-with-newline ()
  "Yank, appending a newline if the yanked text doesn't end with one."
  (yank)
  ;; \\' anchors at the end of the string, so only a trailing newline counts.
  (unless (string-match "\n\\'" (current-kill 0))
    (newline-and-indent)))

(defun yank-as-line-above ()
  "Yank text as a new line above the current line.

Also moves point to the beginning of the text you just yanked."
  (interactive)
  (let ((lnum (line-number-at-pos (point))))
    (beginning-of-line)
    (yank-with-newline)
    ;; goto-line is meant for interactive use; move by lines instead.
    (goto-char (point-min))
    (forward-line (1- lnum))))

(defun yank-as-line-below ()
  "Yank text as a new line below the current line.

Also moves point to the beginning of the text you just yanked."
  (interactive)
  (let* ((lnum (line-number-at-pos (point)))
         (lnum (if (eobp) lnum (1+ lnum))))
    (if (and (eobp) (not (bolp)))
        (newline-and-indent)
      (forward-line 1))
    (yank-with-newline)
    ;; goto-line is meant for interactive use; move by lines instead.
    (goto-char (point-min))
    (forward-line (1- lnum))))

(global-set-key "\M-P" 'yank-as-line-above)
(global-set-key "\M-p" 'yank-as-line-below)

Just one more step along the path to Vimmify my Emacs setup. Emacs has some weird edge cases because you can move the cursor one "line" past the last real line in the file. But I think I worked out something comfortable for myself.

PS: I've written about this before, but if you use C-S-Backspace a lot in Emacs on Linux, I highly recommend putting this into your X11 config:

Option "DontZap" "True"

It's really easy to mix up C-S-Backspace and C-M-Backspace (the latter of which kills your X server). It's not fun to mix those up. Not fun at all.

PPS: This thread on Stack Overflow has some Emacs equivalents of Vim's o and O which are pretty nice too.

xorg-server 1.8 without hal

Last weekend I finally updated xorg on my main PC to the latest xorg-server 1.8. The update itself went very smoothly: I had no compile errors or package conflicts. After installing, I immediately stopped the hal daemon and removed it from the default runlevel. I had to change a few things in xorg.conf, but after a few minutes the new xorg-server worked perfectly well for me.

Well, that's not the end of the story.

After a few hours, when I wanted to play some music, I ran into another problem: my soundcard was missing. But not really, because smplayer, for example, still played videos, while kmplayer didn't.

Phonon, the KDE multimedia framework, couldn't find the soundcard anymore. smplayer doesn't use Phonon; it uses the soundcard directly. My first thought was that I had accidentally changed some sound settings. But after an hour of looking, I finally found out that the problem was the hal daemon. Sadly, Phonon still needs the hal daemon; without it, it doesn't find the soundcard. After starting the hal daemon again, everything worked fine.

I hope the KDE guys will change that soon, so that I can finally unmerge hal from my system.


Playing a song as a background process in Windows

Sometimes you ask yourself how to do cool things like playing a song in the background (i.e. with no visible interface or application) upon login on a Windows box. Being completely unfamiliar with DOS, I wasn’t quite sure how to go about doing this, but apparently it was quite easy. So here I am documenting it for future "reference". This marks my very first time touching the DOS prompt, or indeed any sort of commands in Windows, so please excuse the newbie format of this post.

Everything is done from the command line for obvious reasons – we don’t want any interface for them to turn off our song. So we need a command line music player. mplayer is also available as a command line player on Windows, and so it was my first choice. A quick download of a build without an interface and we were ready to play any song with a *.bat file containing `mplayer "music.mp3"`

The next step is to make it run without the prompt opening up. This is again easily done by executing the bat file via a vbs file with the following content. Creating a shortcut to this vbs file and dumping it in your startup folder is the simplest and most obvious way to make it play on login. Here’s the code:

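' Run the batch file with window style 0 (hidden), so no prompt appears.
' Chr(34) is a double quote, which keeps paths with spaces intact.
Set WshShell = CreateObject("WScript.Shell")
WshShell.Run Chr(34) & "C:\path\to\my\bat\file.bat" & Chr(34), 0
Set WshShell = Nothing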

Now I wanted to be able to change this song whenever I wanted from a central server. Basically it would check whether or not it needs to update the song, and if it does, delete the existing song and download the new song. This is useful to give a little variety in our fun little player. Some things didn’t work quite as I wanted them to, so I have probably used the most horrendous of hacks based on what I could garner from various online references.

First I needed a way to download files akin to wget. I found a small program called url2file which did just the thing. I wanted it to check whether or not a song existed on the server, and if it did, download it. However the url2file program didn’t quite play nice with that idea (it would download a 404 page instead of allowing me to tell it not to do anything), and I didn’t know how to check whether or not a file existed on a remote server. So instead I had to make do with a second "notifier" file which, if it contained a certain string, would mean that a new song was available to be downloaded.

It would download that plaintext file’s contents to a tmp file, search in that tmp file for the string we were looking for, and if successful, would delete the existing music file and download the new one to take its place. Unfortunately doing a simple `if %getnew%==yes` didn’t work (explanations welcome!), so I made do with checking the first 3 characters, which did work. Here’s the final code, with the getnew.txt file including just the single word "yes".

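rem Fetch the notifier file and read its first line into getnew
del tmp
URL2FILE.EXE http://foobar.com/getnew.txt > tmp
set /p getnew= < tmp
rem Compare only the first 3 characters; the quotes keep an empty value
rem from turning the if into a syntax error
set _part_name=%getnew:~0,3%
if "%_part_name%"=="yes" del music.mp3
if "%_part_name%"=="yes" URL2FILE.EXE http://foobar.com/music.mp3 music.mp3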

Tada, it worked flawlessly. Not bad for a couple of hours' work from scratch and not knowing anything about DOS at all.

In unrelated news, I’m looking for good bagpipe music.


Planet Larry is not officially affiliated with Gentoo Linux. Original artwork and logos copyright Gentoo Foundation. Yadda, yadda, yadda.