Leif Biberg Kristensen
Norway
Just a test
This is just a test to see if the feeds are working correctly. Nothing to see here, move on.
gentoo users, compiled
Posts for Tuesday, June 8, 2010

USA
I haven't posted one of these in a while. I've been in an 8-bit kind of mindset for a while:
What I actually stare at for 8 hours every day:
KDE4, Buuf icons, QtCurve, wallpaper is from somewhere on the internets.

USA
Chas Emerick recently posted the results of his State of Clojure survey. It turns out that the (self-selected) group of Clojure-using respondents prefers Emacs as its IDE, eclipsing all other editors by a large margin.
Chas then has this to say:
I continue to maintain that broad acceptance and usage of Clojure will require that there be top-notch development environments for it that mere mortals can use and not be intimidated by...and IMO, while emacs is hugely capable, I think it falls down badly on a number of counts related to usability, community/ecosystem, and interoperability.
As an avid, die-hard Vim and Emacs user for life, I'm going to agree.
Emacs isn't difficult to learn. Not in the sense of requiring skill or cleverness. It is however extremely painful to learn. I think there's a difference.
The key word is tedium. Learning Emacs is a long process of rote memorization and repetition of commands until they become muscle memory. If you're smart enough to write programs, you can learn Emacs. You just have to keep dumping time into the task until you become comfortable.
Until you're comfortable, you face the unpleasant task of un-learning all of your habits and forming new ones. And you're trying to do this at the same time you're undertaking another, even harder task: writing programs. And if you're a new Clojurist, and you're learning Emacs and Clojure from scratch at the same time, well, get the headache medication ready.
As a programmer and someone who sits in front of a computer 12+ hours a day, I consider myself pretty flexible and capable of picking up a new user interface. As someone who had been using Vim for years prior to trying Emacs, I considered myself more than capable of learning even a strange and foreign interface. I'd done it once before.
But learning Emacs still hurt. Oh how it hurt. I blogged while I was learning it, and you can see my pain firsthand. I sometimes hear people say "I tried Emacs for a whole month and I still couldn't get it". Well, it took me over a year to be able to sit down at Emacs and use it fluidly for long periods of time without tripping over the editor.
To be fair, I'm talking here about using Emacs as a programming environment. Using Emacs as a Notepad replacement could be learned in short order. C-x C-f, C-x C-s, or use the menus, there you go. Using it comfortably as a full-fledged IDE is significantly harder and requires you to touch (and master) many more features. Syntax highlighting, tab-completion, directory traversal and cwd issues, enabling line numbers, version-control integration, build tool integration, Emacs' funky regex syntax for search/replace, Emacs' bizarre kill rings and undo rings, the list goes on. These things are very flexible in Emacs, which is a great thing, but it's also an impediment to learning how to configure and use them. There's no getting around the time investment.
And it's not just a matter of learning some new keyboard shortcuts. There's a new vocabulary to learn. You don't open files, you visit them. What's a buffer? What's a window? (Not what you think it is.) What's a point? What's a mark? Kill? Yank? "Apropos"? Huh? C-c M-o means what exactly? My keyboard doesn't have a Meta key. Yeah, you can use CUA mode and get your modernized Copy/Cut/Paste shortcuts back, but that's the tip of the iceberg. It's hard even to know where to begin looking for help.
Yeah, Emacs came first, before our more common and more modern conventions were established, and that explains why it's so different. That doesn't change the fact that Emacs today is a strange beast.
Personally I find the Emacs community to be a pretty nice bunch. In the highest tradition of hackerdom and open source software, Emacs users seem to be eager and willing to share their elisp snippets and bend over backwards to help other people learn the editor. I got lots of help when I was struggling and learning Emacs.
The Emacs wiki is an awesome resource. The official documentation is so complete (and so long) that it leaves me speechless sometimes. And there are a million 3rd-party scripts for it. Whatever you want Emacs to do is generally a short google away.
If there's anything wrong with the Emacs community, it'd be people who take Emacs evangelism overboard. The answer to "I don't want to have to use Emacs to use your language" can't be "Be quiet and learn more Emacs", or "If you're too dumb to learn Emacs, go away". In some communities there is certainly some of that. But thankfully I don't see it much in the Clojure community. Let's hope it stays that way.
Once someone spends the time to write a suitable amount of elisp, Emacs can interoperate with anything. I think so many people use SLIME for Clojure development precisely because it interoperates so darned well with Lisps. SLIME is amazing. You probably can't beat Paredit either, and Emacs' flexibility is precisely what makes things like Paredit possible.
The problem is the amount of time you have to spend to get that interoperability set up and to learn how to use it. After two years of using Emacs and Clojure together, every once in a while I still find myself bashing my face on my desk trying to get the latest SLIME or swank to work just right, or trying to get a broken key binding fixed, or tweaking some other aspect of Emacs that's driving me crazy. One day, curly braces stopped being recognized as matched pairs by Paredit. Why? No idea; I fixed it, but it was a half hour of wasted time.
Emacs is good at integrating with Git too. So good that there are four or five different Emacs-Git libraries, each with a different interface and feature set. I gave up eventually and went back to using the command line. (You can embed a shell / command line right in Emacs. There are three or four different libraries to do that too.)
The wealth of ways to do things in Emacs is at once a good thing, overwhelming, and confusing. If all you want is something that works and gets out of your way, too many options can be worse than one option, even if that one option isn't entirely ideal.
Emacs' Java interop, I know nothing about. Almost certainly, Emacs can come close to a modern Java IDE for fancy features like tab-completion and document lookups and project management. But how long is it going to take you to figure out that tab-completion is called hippie-expand in Emacs? That and a million other surprises await you.
There was a pithy quote floating around on Twitter a while back (I think quoting Rich Hickey):
One possible way to deal with being unfamiliar with something is to become familiar with it.
That's true, and you could say that of Emacs. I strongly believe that when it comes to computers, there's no such thing as "intuitive". There's stuff you've already spent a lot of time getting used to, and there's stuff you haven't.
But certain things require more of a time investment than others. Could I learn Clojure if all the keywords were in Russian or Chinese instead of my native English? Sure, but it'd take me a long time. I'd certainly have to have a good reason to attempt it.
I learned Emacs partly because it was hard. I saw it as a challenge. It was fun, yet painful, but more pain, more glory. Mastering it makes me feel like I've accomplished something. I'd encourage other people to learn Emacs and Vim too. I think the benefits of knowing them outweigh the cost and time investment of learning them.
But I didn't learn Emacs with the goal of being productive. I learned it for the same reason some people build cars in their garages, while most people just buy one and drive it to and from work every day. I learned Emacs because I love programming and I love playing with toys, and Vim or Emacs are as nice a toy as I could ask for. (I love programming enough to form strong opinions and write huge blog posts about text editors.) For me, productivity was a beneficial side-effect.
There are only so many hours in a day. There are a lot of other challenges to conquer, some of which offer more tangible benefits than Emacs mastery would get you. Mastering an arcane text editor isn't necessarily going to be on the top of the list of everyone's goals in life, especially when there are other editors that are easier to use and give you a significant subset of what Emacs would give you. We have to pick our battles.
So I understand when people say they don't want to learn Emacs. I think maybe so many Clojurists use Emacs right now because we're still in the early adopter stage. If you're using Clojure today, you're probably pretty enthusiastic about programming. You're likely invested enough to be willing to burn the required time to learn Emacs.
If Clojure becomes "big", there are going to be a lot of casual users. A casual user of Clojure isn't going to learn Emacs. They're going to silently move on to another language. And I really think that new blood is vital to the strength of a community and necessary for the continued healthy existence of a programming language.
So Clojure does need alternatives. I'll stick with Emacs myself, but there should be practical alternatives. I'd encourage the Clojure community to continue to support and enjoy Emacs, but don't push it too hard.
Posts for Monday, June 7, 2010
USA
Now that school is done for the 2009-2010 year, I’m back at it in Neuvoo again. I’m finishing off a long-planned and fairly major addition to portage I call “portage hooks.” The fun thing is I’ve submitted some patches to zmedico and the response has actually been more positive than previous experiences. solar seemed to be (tentatively?) liking the idea as well.
So, here’s what portage hooks are all about. If you have portage-utils installed, you will have an /etc/portage/postsync.d/ directory. Scripts in this directory are executed after portage syncs the tree. I thought this was a great idea, and I thought it should be expanded so there are other opportunities for unofficial extensions.
So, hopefully soon, there will be /etc/portage/hooks/{pre,post}-{run,sync,ebuild}.d directories, and inside hook scripts can be installed by ebuilds or users. The run and sync hooks are self-explanatory. The ebuild hooks will be executed within the ebuild environment before or after each phase, and can modify an ebuild’s environment, similar to /etc/portage/bashrc.
I’ve written documentation for the portage DocBook (enable the doc USE flag on portage), which should become available when the patches get accepted.
Neuvoo has already utilized these hooks to add transparent and stable support for squashfs portage trees. The “emerge --sync” command is hijacked by a pre-sync hook to download the latest squashfs tree, and every time “emerge” is run, there is a hook that checks to be sure the squashfs tree is mounted.
I hope to use hooks for the following additional unofficial features:
An unofficial version of portage, and the squashfs hooks, are all available right now if you want to try it out in the neuvoo overlay. Run “layman -a neuvoo && emerge -av squashfs-portage” to get it. Be warned: the squashfs-portage package requires the latest neuvoo portage git. I don’t know if you can revert back to an old version of portage once you have the new one.
Posts for Saturday, June 5, 2010

Germany
I’ve had this blog for quite a while now and there are a few things that I still haven’t really figured out. One of the things that always irritates me most is which posts get recognition (as in them being linked, twittered about or commented on).
There are posts that come quickly. I open the text editor thingy and just type, a few minutes later a post has emerged. Those are usually me venting or poking something retarded with a stick. Those posts (even if they end up being rather long at times) just come to me. They do not require a lot of work and while venting always relieves me, those posts are nothing I do feel particularly good about.
On the other hand there are posts that usually have meandered around in my head for weeks or months, ideas that have kept my mind occupied for a long time and that I try to meld into a post at some point in time. And usually those posts are really hard to write, I trash many of them, rewrite them, trash the rewrite. Those posts are hard work and when I manage to boil my thoughts down to something my limited writing skills are able to bring to virtual paper it feels different. It’s probably more like what a mother feels after having given birth (without the physical pain obviously). Those posts I wrestled with usually are something really important to me, some idea or comment I think is really really important. Also, those posts hardly ever gain any traction, but that’s probably cause they tend to be basically unreadable without the weeks of background thinking I did (I try to be better with that but it does not work so great so far).
But recently I realized that something else is very important for posts as well: Timing. With this I do not mean that a comment on some political issue has to be made directly after the event happened so people still have the event in mind, I am talking about the time of the day.
While some people still surf to sites directly to check out what’s new, a few years ago, many people switched to RSS readers. Personally I couldn’t live without one (I use the “evil” Google Reader [which is an absolutely brilliant RSS reader btw. you should really try it!]), but lately Twitter has turned out to be the thing driving traffic. I do what many others do, I have a service poll my RSS feed and post notices alerting people on my Twitter and Identi.ca feeds of the new post on the blog. Sometimes people retweet those notices to their followers and a post gets spread around.
But (and this is quite a serious but!) this has led to many people heavily using twitter to find new and interesting articles and twitter is “real-time-ish” meaning: Something written yesterday evening is basically gone today. Microblogs are a stream of information. You dive in, you get out and the stream continues flowing. It’s basically impossible to read all messages passing through so you only check them out when you have the time.
I am in Germany and I usually write in the evening when I have some time off. So if I write something in German (which does not happen all that often though a few recent articles in German were fun to write) and a few minutes later the post is picked up and posted to twitter many people might have already gone off to bed. In the morning they open their microblogging application and maybe even check the backlog of messages but during those 8 or more hours so many messages will have gone through that the blogpost has basically never existed.
Now, if you know me somewhat, I am stubborn. I don’t want to repeat posts notifying people of my blog, it’s just more noise I would add to the world. But it makes you wonder how many articles never get any traction just because somebody wrote them after work instead of in his or her lunchbreak.
Belgium
For a personal POC I wanted to see if it is possible to generate, based on the collection of CVE entries publicly available, a report informing a system administrator about possible vulnerabilities. Nothing fancy, just based upon versions.
A simple example: the tool detects Perl, acquires the installed Perl version, then matches the collection of CVE entries against this Perl version. If at least one CVE is found, report it. The idea is then to make this as generic as possible (not specific to an operating system or Linux distribution), so it would not use a package version but the actual tool version (or library version).
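To make the version-matching idea a bit more concrete, here is a minimal Python sketch. It is only an illustration under loud assumptions: the `Advisory` record, the `matching_advisories` helper and the `EXAMPLE-…` identifiers are all made up for this example, and real CVE entries describe affected versions in a far richer way than a simple first/last range.

```python
from dataclasses import dataclass


def parse_version(text):
    """Turn a dotted version such as '5.8.8' into a comparable tuple (5, 8, 8)."""
    return tuple(int(part) for part in text.split("."))


@dataclass
class Advisory:
    # Hypothetical record; real CVE entries carry far more detail.
    identifier: str
    tool: str
    first_affected: str
    last_affected: str


def matching_advisories(tool, installed_version, advisories):
    """Return the advisories whose affected range contains the installed version."""
    installed = parse_version(installed_version)
    return [
        adv
        for adv in advisories
        if adv.tool == tool
        and parse_version(adv.first_affected) <= installed <= parse_version(adv.last_affected)
    ]


# Placeholder data, invented purely for illustration (not real CVE entries).
ADVISORIES = [
    Advisory("EXAMPLE-0001", "perl", "5.8.0", "5.8.8"),
    Advisory("EXAMPLE-0002", "perl", "5.10.0", "5.10.1"),
    Advisory("EXAMPLE-0003", "python", "2.5.0", "2.6.4"),
]

if __name__ == "__main__":
    for adv in matching_advisories("perl", "5.8.8", ADVISORIES):
        print(adv.identifier)
```

Comparing versions as integer tuples avoids the classic trap of string comparison, where "5.10.1" would sort before "5.8.8". Versions with letters or suffixes would need a more careful parser than this one.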
Of course, whenever I am planning such minor POCs, I search the Internet for possible existing tools (just like kev009 describes – “But First, Write No Code”). And I found out that there are already quite some “foundation components” available…
Many more of these efforts are linked through the Mitre sites. The above two are the most important ones though – it seems that it might be possible to use OVAL to describe the tests I wanted for the POC.
To be continued…
Belgium
Everyone that has been using Gentoo for a while now knows about tools such as qlist that show you the list of files installed by an (installed) package, or qfile that allows you to find which package provided a particular file on your system.
One thing lacking is to be able to find out which package would provide a file. Unlike the previous tools, this tool cannot rely on the information found on your system as the package isn’t installed yet.
There have been projects in the past that attempted to provide such functionality, almost always through an online queryable database. Many haven’t survived, due to overly high expectations or too few server infrastructure resources. But it seems like PortageFileList is here to stay for a while.
The project not only offers an online interface for querying information, it also provides a package (app-portage/pfl) that allows you to query their infrastructure from the command line. The package provides a tool called e-file which supports SQL-like syntax for the queries.
~$ e-file '%bin/xdm'
The above command will then display, using the well-known emerge/Portage output, which package provides the file (as well as which file was matched by the query).
Definitely a nice tool to have around. Thanks guys of PortageFileList!
Posts for Friday, June 4, 2010
Paludis is aware of packages that are in repositories you don’t have configured thanks to the unavailable repository. However, once Paludis has shown you that the package you want is in a repository you don’t have configured, you need to set up a configuration file for that repository (and any repositories it requires) and then sync. This is more work than is really necessary.
Enter RepositoryRepository, also known as r^2. Conceptually, it works as follows:
As well as providing special packages for packages in unavailable repositories, the unavailable repository also now provides packages named ‘repository/blah’ for repositories you don’t have configured. The metadata for these packages includes dependency information etc, along with useful things like the repository’s sync URI.
A new repository, using format = repository, provides special packages for repositories you do have configured.
Repository packages in unavailable repositories can be ‘installed’ to repository repositories. ‘Installing’ a repository creates a configuration file for it, and then syncs the newly created repository.
The configuration file it creates is controlled by a simple template, so it can contain anything you want it to contain.
Exherbo users can follow the setup instructions to start using this. On Gentoo this functionality is not yet available, since we won’t be switching the generated unavailable data to the new format until we’re reasonably sure that everyone is using a Paludis release that supports it.

England
In the recent election, some of the parties put charts into their literature. In this post I analyse their accuracy.
Last month in the UK we had a general election, in which we elect a Member of Parliament to represent our little area (called a 'constituency'). During the election period, I received a large amount of A4 paper from each of the various parties. I threw out hundreds of them, but I still managed to find a representative sample of them lying in my hall.
A couple of them are leaflets for the neighbouring constituency! These are a complete waste of time.
Many of the leaflets share an interesting feature, little bar graphs or pie charts.
This is a commonly used tactic of the Liberal Democrat Party, to depict themselves as the second candidate in the local area, as opposed to their national profile as the third party. Here are some leaflets from the Lib Dem candidate in my area, showing these pictures.
The bar charts are a not very subtle appeal to people who would otherwise vote for the Conservative party. The argument is churlish and not in the British sporting tradition of fair play, candidates should campaign on their own merits.
However, it is not inaccurate from a demographic viewpoint. My area mainly consists of white working class workers, public sector workers and a growing Asian population. None of these groups are disposed to vote Conservative, and in the vote, the Conservatives came a pathetic fourth.
Not that it stopped the Conservative candidate making her own graph in her literature. I found it hilarious so I will zoom in and show it in its full glory. It is dodgy (even dishonest) on several levels.
Firstly, it is not even a general election result, it is showing a local council by-election. Councillors have an important role collecting rubbish and other local issues, but it is not the same thing at all.
Secondly, as a council by-election, it is a much smaller area than the general election constituency. So there is no way to tell how the larger area will vote based on the east end (Sparkbrook) of the constituency. This is especially the case because Sparkbrook, as the Balti capital of the world, is a majority Asian area, whereas the other parts of the constituency have a more dispersed ethnic population. Anyway, here are the results in that 2009 council by-election:
| Name | Party | Votes |
|---|---|---|
| Ali Shokat | Respect | 2495 |
| Mohammed Azim | Labour | 2228 |
| Abdul Kadir | Conservative | 799 |
| Naeem Qureshi | Liberal Democrats | 506 |
| Charles Alldrick | Green | 213 |
| Sakander Mahmood | Independent | 55 |
So we can see here that the Conservatives came third, not even getting a third of the votes of the winning candidate. Therefore the most dishonest feature of the graph is showing the change in vote without supplying the total vote. When armed with the figures above, we can see that the Conservative vote went up by a few dozen votes and that they started from a very low base indeed.
The dishonesty is compounded by the annotations on the chart. It says "Can't win here!" pointing to Green, Labour and Respect. Even if we ignore my caveats about this data being irrelevant for the general election, as you can see from the absolute numbers from the by-election result, Respect and Labour were the top two results, so they could in fact win here.
This is further proved by looking at the actual vote last month:
| Party | Name | Votes |
|---|---|---|
| Labour | Roger Godsiff | 16,039 |
| Respect | Salma Yaqoob | 12,240 |
| Liberal Democrats | Jerry Evans | 11,988 |
| Conservative | Jo Barker | 7,320 |
| UKIP | Alan Blumenthal | 950 |
| Independent | Andrew Gardner | 190 |
So the Conservative chart looks extremely selective and misleading as Labour did in fact win the seat and Respect came second. Many media commentators were predicting a possible Respect victory.
The Respect party also had a chart. It is a well presented chart, showing the different results in 2005. It is the previous general election in 2005, so it is not a complete fiction like the Conservative chart.
The annotation is also much more positive, instead of "someone else can't win here", it is the rousing "She can do it!". Unlike the Conservative annotation, it was not a lie; Yaqoob had a real chance.
The Respect party was actually founded by George Galloway, a Scottish Catholic, however locally in the campaign, Respect was perceived as the Muslim party, which limited its appeal among the indigenous population, and probably cost Respect the seat. However, demographic changes are on Respect's side, so they may well get over the finishing line next time when many new Asian voters have come of age, assuming that the party does not run out of steam, causing Asian voters to return to the Labour party.
The chart is not without a problem. It is not the correct constituency. It is actually a neighbouring constituency. This is perhaps unavoidable because the constituency that Yaqoob ran for in 2005 was abolished due to boundary changes, and this current constituency did not have a Respect representative in 2005. However, I think the chart should have put a little note pointing out that the boundaries have changed and this is the nearest relevant result.
Lastly we are onto Labour's literature:
As you can see, no chart. (The independent candidate's leaflet does not have a chart either.) Instead, on the Labour leaflet, there are pictures of the incumbent meeting people and a record of how he voted. For the incumbent candidate, especially when his/her party is doing worse nationally than the candidate is locally, there is less point to a chart like the type above. However, an accurate and honest chart would help potential voters to understand the context behind any dodgy charts, such as the one in the Conservative literature.
Posts for Tuesday, June 1, 2010

Germany
Whenever people are asked why they blog, most people will answer that it is for the comments and discussions that emerge from well-written blog posts. This idea has been picked up by people who write “guides” on how to write for a successful blog and is usually phrased something like this:
Finish your post with a direct question to your readers to get the discussion started.
If you look at this blog you will see that most posts do not generate any comments at all. Some people might argue that this stems from me writing mostly crappy posts and for some they might actually be right, but I’m not always that bad so there’s gotta be a different reason for this.
This actually ties in with a complaint my girlfriend sometimes confronted me with: the fact that I don’t ask a lot.
Not asking many questions might sound weird, especially since I do in fact consider myself to be a rather curious person interested in basically everything, but it is true to a certain extent. Especially when I talk or write about something I do not ask for other people’s opinion which is often perceived as lack of interest when it is something slightly different.
I want to hear your input. Almost every human being is interesting and has something interesting to add and I hate to miss out on that kind of stuff. But on the other hand I hate stupidity.
There’s the saying that there are no stupid questions which is absolute bullshit. Yes there are. Asking for something the speaker explained just 10 seconds ago is stupid (and shows lack of interest) for example. And just as there are stupid questions there are many stupid answers and comments, comments which I just don’t care about. I don’t want this blog to be full of “Yes, me, too” or “I dunno” comments, I don’t actually want to make commenting too “easy”.
The ease of commenting has something to do with the software used here (which makes commenting really easy if you know how to type a name and an email address) and with a … let’s call it a social barrier.
If I ended every post asking for everybody’s input I’d make commenting really easy. I’d roll out the red carpet and ask you to please come in. That’s not the modus operandi here.
I am thankful for everybody who comes here and reads my stuff (or who reads this via the feed), I appreciate the time you take out of your busy and short lives to spend it on this. That is the reason I put all this stuff out here under free licenses. What I don’t care about is a random count of comments.
When it comes to comments I care about quality, I care about people seriously challenging me and the ideas or thoughts I put forward. I’d rather have 3 posts without any comments and one with somebody seriously taking the time to challenge me than having 20 comments on every post.
I mean, it’s not like everything here is pure and utter intellectual brilliance, some posts are just retarded fun, some are just a quick linkdump or an embedded video. Why should you comment here on a video I embedded from youtube? Why not go to the youtube video instead? But there are some articles here that I did put some thought and some time in. Posts that are important to me. And I’d hate to have a possible discussion watered down.
I am similar in real life, I don’t ask for your input when I talk a lot. That does not mean that I don’t care it’s just that I don’t make it ridiculously easy for you. I put thoughts into what I write and say. Sometimes my opinions and ideas might even be controversial, might even get me into trouble. That’s a risk I take. So when you comment here I make you take some risk as well. I don’t prestructure the discussion with questions. I don’t open the door and roll out the red carpet.
But if you come and just get into the discussion (in real life just as virtually) I enjoy every minute of it. Yes, there is no red carpet, but if you manage to find the courage to come and open the door, you’ll find a seat at the table reserved just for you.
Posts for Friday, May 28, 2010
Something I often see in person and online is programmers constantly implementing common solutions, reinventing wheels, or embracing NIH.
Before you do this, please consider Kev009’s Oath – “But First, Write No Code”. This is a solution to a variety of problems in software development, but today’s article is specifically on using external code.
I’ve found that programmers who follow a system similar to mine (detailed below) develop systems that are more stable, maintainable, and sane. They likely write better code because it means they understand their tools and also read others’ code. They examine the problem first rather than going in guns blazing.
Steps to decide whether to use an existing solution or write your own implementation:
Even if you end up developing a solution from scratch, you should at least now have some good references. Keep in mind, extending an existing project may be considerably less work. You might even be able to offload maintenance of that component.
BSD, MIT, and Apache style licenses allow you to make changes and redistribute under completely different licenses. Some just want credit in your documentation. These are very compelling even in commercial development.
Commercial components may have a per-copy fee associated which may dissuade their use by your organization. If you don’t get the source, you won’t be able to effectively change or maintain it so you will also be at mercy of that developer.
This list is widely applicable. You’ve got a seriously high bar to reach if you are developing containers of <T>, sorting methods, GUI frameworks, parsers, text and binary file formats, and much more so try and follow it the next time you code.
Posts for Monday, May 24, 2010
Greece
A friend’s site was recently hit by the massive infections/hacks on Dreamhost’s servers, so I decided to do some scanning on some servers that I administrate for base64_decode references.
The simple command I used to find suspect files was:
# find . -name \*.php -exec grep -l "eval(base64_decode" {} \;
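For anyone who prefers a script over the one-liner, roughly the same search can be sketched in Python. This is just an approximation of the find/grep command above, not a replacement for it; the function name `find_suspect_php` is invented for this example, and like the original it only catches the literal string `eval(base64_decode`, so anything with extra whitespace or further obfuscation slips through.

```python
import os


def find_suspect_php(root, needle="eval(base64_decode"):
    """Return sorted paths of .php files under root that contain needle."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(".php"):
                continue
            path = os.path.join(dirpath, name)
            try:
                # errors="replace" keeps odd encodings from aborting the scan.
                with open(path, "r", errors="replace") as handle:
                    if needle in handle.read():
                        hits.append(path)
            except OSError:
                # Unreadable files are skipped rather than stopping the scan.
                continue
    return sorted(hits)


if __name__ == "__main__":
    for path in find_suspect_php("."):
        print(path)
```

Run from a web root, it prints one suspect path per line, much like `grep -l` does.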
The results could be sorted in just 2 categories. Malware and stupidity. There was no base64_decode reference that did something useful in any possible way.
The best malware I found was a slightly modified version of the c99 php shell on a hacked joomla installation (the site has been hacked multiple times but the client insists on just re-installing the same joomla installation over and over and always wonders how the hell do they find him and hack him…oh well). c99 is impressive though…excellent work. I won’t post the c99 shell here…google it, you can even find infected sites running it and you can “play” with them if you like…
And now comes the good part, stupidity.
My favorite php code containing a base64_decode reference that I found:
<code2>
$hash = 'aW5jbHVkZSgnLi4vLi';
$hash .= '4vaW5jX2NvbmYvY29u';
$hash .= 'Zi5pbmMucGhwJyk7aW';
$hash .= '5jbHVkZSgnLi4vLi4v';
$hash .= 'aW5jX2xpYi9kZWZhdW';
$hash .= 'x0LmluYy5waHAnKTtl';
$hash .= 'Y2hvICRwaHB3Y21zWy';
$hash .= 'd2ZXJzaW9uJ107';
eval(base64_decode($hash));
</code2>
Let’s see what this little diamond does:
<code2>
% base64 -d
aW5jbHVkZSgnLi4vLi4vaW5jX2NvbmYvY29uZi5pbmMucGhwJyk7aW5jbHVkZSgnLi4vLi4vaW5jX2xpYi9kZWZhdWx0LmluYy5waHAnKTtlY2hvICRwaHB3Y21zWyd2ZXJzaW9uJ107
include('../../inc_conf/conf.inc.php');include('../../inc_lib/default.inc.php');echo $phpwcms['version'];
</code2>
So this guy split a base64-encoded string into a series of fragments, just to stop someone from changing the version tag of his software. That’s not software, that’s crapware. Hiding the code where the version string appears? That’s how you protect your software? COME OOOOON….
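The decoding step can also be checked without the shell. Here is a small Python sketch, using only the standard `base64` module, that glues the fragments from the PHP snippet back together and decodes them; it should print the same three statements that `base64 -d` produced above.

```python
import base64

# Fragments copied from the PHP snippet above, in the order they are concatenated.
fragments = [
    "aW5jbHVkZSgnLi4vLi",
    "4vaW5jX2NvbmYvY29u",
    "Zi5pbmMucGhwJyk7aW",
    "5jbHVkZSgnLi4vLi4v",
    "aW5jX2xpYi9kZWZhdW",
    "x0LmluYy5waHAnKTtl",
    "Y2hvICRwaHB3Y21zWy",
    "d2ZXJzaW9uJ107",
]

# Join first, decode second: the individual fragments are not valid base64 on their own.
decoded = base64.b64decode("".join(fragments)).decode("ascii")
print(decoded)
```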
A technique I always seem to forget is how to map C++ types to an integer without relying upon RTTI. A variation on this is used in <locale> in the standard library, for std::use_facet<>. But let’s take a much simpler, and highly contrived, example.
Let’s say we’ve got some values of different types, and we want to give those types to a library to store somewhere, and then we later want to get them back again. Crucially, the library itself doesn’t know anything about the types in question. So, for a very simple case:
#include <vector>
#include <iostream>
#include <string>
int main(int, char *[])
{
std::vector<Something> things = { std::string("foo"), 123 };
/* ... */
std::cout << things[0].as<std::string>() << " " << things[1].as<int>() << std::endl;
}
Note the gratuitous use of c++0x initialiser lists, just because we can.
Those familiar with Boost might think that Something is like boost::any. However, boost::any uses RTTI, which is slow and completely unnecessary.
A first implementation of Something might look like this:
#include <memory>
class Something
{
private:
struct SomethingValueBase
{
};
template <typename T_>
struct SomethingValue :
SomethingValueBase
{
T_ value;
SomethingValue(const T_ & v) :
value(v)
{
}
};
std::shared_ptr<SomethingValueBase> _value;
public:
template <typename T_>
Something(const T_ & t) :
_value(new SomethingValue<T_>(t))
{
}
template <typename T_>
const T_ & as() const
{
return static_cast<const SomethingValue<T_> &>(*_value).value;
}
};
This works, but has a major flaw: if you get the types wrong when calling Something::as<>, you’ll get a segfault or something similarly horrible. We’d like to replace that with something safer.
One way to do it is to use runtime type information. The simplest variation on this is to replace the static_cast with a dynamic_cast. However, we can only do this if SomethingValueBase is a polymorphic type, which it isn’t. We can make it so by adding in a virtual destructor:
#include <memory>
class Something
{
private:
struct SomethingValueBase
{
virtual ~SomethingValueBase()
{
}
};
template <typename T_>
struct SomethingValue :
SomethingValueBase
{
T_ value;
SomethingValue(const T_ & v) :
value(v)
{
}
};
std::shared_ptr<SomethingValueBase> _value;
public:
template <typename T_>
Something(const T_ & t) :
_value(new SomethingValue<T_>(t))
{
}
template <typename T_>
const T_ & as() const
{
return dynamic_cast<const SomethingValue<T_> &>(*_value).value;
}
};
Now, if we get the types wrong, a std::bad_cast will be thrown. Alternatively, we can use our own exception type:
class SomethingIsSomethingElse
{
};
class Something
{
/* snip */
public:
template <typename T_>
const T_ & as() const
{
auto value_casted(dynamic_cast<const SomethingValue<T_> *>(_value.get()));
if (! value_casted)
throw SomethingIsSomethingElse();
return value_casted->value;
}
};
We can also make use of std::dynamic_pointer_cast, which is possibly slightly less ugly syntactically:
class Something
{
/* snip */
public:
template <typename T_>
const T_ & as() const
{
auto value_casted(std::dynamic_pointer_cast<const SomethingValue<T_> >(_value));
if (! value_casted)
throw SomethingIsSomethingElse();
return value_casted->value;
}
};
All of this is using RTTI, though, and RTTI is a huge amount of overkill for what we need. Before eliminating the RTTI, though, we’ll switch to using it in a different way:
#include <memory>
#include <string>
#include <typeinfo>
class Something
{
private:
template <typename T_>
struct SomethingValueType
{
};
struct SomethingValueBase
{
std::string type_info_name;
SomethingValueBase(const std::string & t) :
type_info_name(t)
{
}
virtual ~SomethingValueBase()
{
}
};
template <typename T_>
struct SomethingValue :
SomethingValueBase
{
T_ value;
SomethingValue(const T_ & v) :
SomethingValueBase(typeid(SomethingValueType<T_>()).name()),
value(v)
{
}
};
std::shared_ptr<SomethingValueBase> _value;
public:
template <typename T_>
Something(const T_ & t) :
_value(new SomethingValue<T_>(t))
{
}
template <typename T_>
const T_ & as() const
{
if (typeid(SomethingValueType<T_>()).name() != _value->type_info_name)
throw SomethingIsSomethingElse();
return std::static_pointer_cast<const SomethingValue<T_> >(_value)->value;
}
};
Here we make use of typeid explicitly, which is widely considered to be about on par with use of goto. However, it paves the way for our next step. Can we replace typeid(SomethingValueType<T_>()).name() with a different, non-evil expression? Let’s think about what properties the result of that expression must have:
Let’s try this:
#include <memory>
#include <string>
class SomethingIsSomethingElse
{
};
template <typename T_>
struct SomethingTypeTraits;
class Something
{
private:
struct SomethingValueBase
{
int magic_number;
SomethingValueBase(const int m) :
magic_number(m)
{
}
virtual ~SomethingValueBase()
{
}
};
template <typename T_>
struct SomethingValue :
SomethingValueBase
{
T_ value;
SomethingValue(const T_ & v) :
SomethingValueBase(SomethingTypeTraits<T_>::magic_number),
value(v)
{
}
};
std::shared_ptr<SomethingValueBase> _value;
public:
template <typename T_>
Something(const T_ & t) :
_value(new SomethingValue<T_>(t))
{
}
template <typename T_>
const T_ & as() const
{
if (SomethingTypeTraits<T_>::magic_number != _value->magic_number)
throw SomethingIsSomethingElse();
return std::static_pointer_cast<const SomethingValue<T_> >(_value)->value;
}
};
Now, our library user has to provide specialisations of SomethingTypeTraits for every type they wish to use:
#include <string>
#include <iostream>
#include <vector>
template <>
struct SomethingTypeTraits<int>
{
enum { magic_number = 1 };
};
template <>
struct SomethingTypeTraits<std::string>
{
enum { magic_number = 2 };
};
int main(int, char *[])
{
std::vector<Something> things = { std::string("foo"), 123 };
std::cout << things[0].as<std::string>() << " " << things[1].as<int>() << std::endl;
}
No RTTI at all there, and it is type safe, but it relies upon a lot of boilerplate from the library user, and that boilerplate is very easy to screw up. So, we’ll allocate magic numbers automatically instead:
#include <memory>
class Something
{
private:
static int next_magic_number()
{
static int magic(0);
return magic++;
}
template <typename T_>
static int magic_number_for()
{
static int result(next_magic_number());
return result;
}
struct SomethingValueBase
{
int magic_number;
SomethingValueBase(const int m) :
magic_number(m)
{
}
virtual ~SomethingValueBase()
{
}
};
template <typename T_>
struct SomethingValue :
SomethingValueBase
{
T_ value;
SomethingValue(const T_ & v) :
SomethingValueBase(magic_number_for<T_>()),
value(v)
{
}
};
std::shared_ptr<SomethingValueBase> _value;
public:
template <typename T_>
Something(const T_ & t) :
_value(new SomethingValue<T_>(t))
{
}
template <typename T_>
const T_ & as() const
{
if (magic_number_for<T_>() != _value->magic_number)
throw SomethingIsSomethingElse();
return std::static_pointer_cast<const SomethingValue<T_> >(_value)->value;
}
};
How does this work? Each instantiation of the magic_number_for<T_> function needs to return the same magic number every time it is called. The first time any particular instantiation is called, its static int result requests the next magic number. On subsequent calls, the allocated number is remembered. (Note that static values inside a template are not shared between different instantiations of that template.) Finally, next_magic_number just returns a new magic number every time it is called.
And there we have it: fast runtime type checking with no boilerplate and no RTTI. What we’ve done here is more or less useless, but the techniques do have other applications. For the curious, std::use_facet<> is probably the most common, and anyone brave enough to delve into its design will eventually see why this isn’t either pointless wankery or reinventing the wheel. For the rest, if you think that using RTTI can solve your problem adequately, then it probably can, and you don’t need to go into the kind of devious trickery the standard library uses internally.
Posts for Sunday, May 23, 2010

England
In my last few posts (1, 2), I followed my readers' advice and have been reviewing the book "Programming the Semantic Web" published by O'Reilly. The full reference is below:
"Programming the Semantic Web by Toby Segaran, Colin Evans, and Jamie Taylor. Copyright 2009 Toby Segaran, Colin Evans, and Jamie Taylor, 978-0-596-15381-6."
Now we pick up at chapter 6, which deals with ontologies and is the reason I started working through this book. So without further ado, let's jump back in:
"The Web Ontology Language (OWL) is an RDF language developed by the W3C for defining classes and properties, and also for enabling more powerful reasoning and inference over relationships." (page 135)
The chapter explains the main classes in OWL. owl:Thing is a superclass of every other class (page 136), like 'object' in Python's new-style classes. The chapter also outlines owl:Class, owl:DatatypeProperty, owl:ObjectProperty and rdf:XMLLiteral, whose purposes you might be able to figure out from the names.
The chapter then outlines the following properties: rdf:type, rdfs:subClassOf, rdfs:domain and rdfs:range. Then on pages 137-140, the chapter defines a schema for films in OWL using RDFLib. The book is worth it just for this.
Using this schema we record the information that Harrison Ford played Rick Deckard in Blade Runner, directed by Ridley Scott. I have made a visualisation of that using GraphViz (as introduced in a previous chapter); you can download the image here and examine it in whatever image viewer your computer has. What a lot of noise for those few simple facts! But that is what it takes to get the applications to understand the semantics.
The chapter then moves to look at the GUI programme Protégé. I have already been introduced to this by Peter who is a big fan. Protege (I am bored with the accents already) is a Java program, so it will run fine on any system with Java (i.e. almost all of them). The chapter works through the GUI features of Protege in a matter of fact way, building up an ontology.
The approach the chapter outlines is to develop your ontology using Protege and then load the data using scripts and programs rather than using the GUI. On page 145, the chapter loads the ontology created using Protege into RDFLib by creating an instance of the ConjunctiveGraph class and then using the 'load' method.
Going back to the post before I started working through this book, namely Headfirst into the Semantic Web, this seemed to be a simple approach to the work I need to do. However, there are many other packages outlined later in the book, so I may change my mind.
On pages 146-147, the chapter goes back to some OWL theory, looking at 'Functional and Inverse Functional Properties', 'Inverse Properties' and 'Disjoint Classes'. The chapter then points out some ontologies available online that are worth examining (pages 148-149). Lastly, the chapter works through an ontology for beer (pages 149-151). That is an appropriate place to end; I might grab a cold ale myself.
England
In my last post, I followed my readers' advice and checked out the book "Programming the Semantic Web" published by O'Reilly. The full reference is below:
"Programming the Semantic Web by Toby Segaran, Colin Evans, and Jamie Taylor. Copyright 2009 Toby Segaran, Colin Evans, and Jamie Taylor, 978-0-596-15381-6."
I stopped in the middle of chapter 3; in this post we keep going with the review. The book tells us that:
"Inference is the process of deriving new information from information you already have." (page 43)
For example, you might have one piece of information, then download a second from the web, and then from these two pieces of information, derive a third. One of the examples given in the chapter is "If I know a restaurant's address, I can use a geocoder to find its coordinates on a map" (page 43).
The chapter goes on to work out which restaurants in Washington DC are likely to be touristy. It does that by working out which restaurants are near a tourist attraction and are at the same time cheap. It uses this example to explain how inference rules can chain together to generate new information:
"What's important to realize here is that the rules exist totally independently. Although we ran the three rules in sequence, they weren't aware of each other - they just looked to see if there were any triples that they knew how to deal with and then created new ones based on those. These rules can be run continuously—even from different machines that have access to the same triplestore - and still work properly, and new rules can be added at any time." (page 49)
The chapter then looks at merging graphs together, allowing queries across data from different sources. Then the chapter ends with some fun: we get to generate graphic visualisations with the program graphviz (which I discovered that I already had on my system).
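Going back to the chained inference rules for a moment, here is how I picture them in plain Python. This is my own rough sketch, not the book's code: the restaurant, the museum, the addresses, the coordinates and the lookup table standing in for a geocoder are all invented for illustration. Each rule only looks at the triples it understands and returns new ones, and a small loop keeps applying the rules until nothing new appears.

```python
# Hypothetical data: every name, address and coordinate here is made up.
triples = {
    ("cheap_eats", "address", "10 Example St"),
    ("cheap_eats", "price", "cheap"),
    ("big_museum", "is_a", "tourist_attraction"),
    ("big_museum", "address", "12 Example St"),
}

# Stand-in for a real geocoder: a fixed lookup table.
FAKE_GEOCODER = {"10 Example St": (0.0, 0.0), "12 Example St": (0.0, 0.001)}

def geocode_rule(facts):
    """address -> location"""
    return {(s, "location", FAKE_GEOCODER[o])
            for (s, p, o) in facts if p == "address" and o in FAKE_GEOCODER}

def close_to_rule(facts):
    """two located entities a short distance apart -> close_to"""
    located = [(s, o) for (s, p, o) in facts if p == "location"]
    new = set()
    for a, (ax, ay) in located:
        for b, (bx, by) in located:
            if a != b and abs(ax - bx) + abs(ay - by) < 0.01:
                new.add((a, "close_to", b))
    return new

def touristy_rule(facts):
    """cheap and close to a tourist attraction -> touristy"""
    attractions = {s for (s, p, o) in facts
                   if p == "is_a" and o == "tourist_attraction"}
    cheap = {s for (s, p, o) in facts if p == "price" and o == "cheap"}
    return {(s, "is_a", "touristy") for (s, p, o) in facts
            if p == "close_to" and s in cheap and o in attractions}

def run_rules(facts, rules):
    """Apply every rule repeatedly until no rule produces a new triple."""
    facts = set(facts)
    while True:
        new = set()
        for rule in rules:
            new |= rule(facts) - facts
        if not new:
            return facts
        facts |= new

inferred = run_rules(triples, [geocode_rule, close_to_rule, touristy_rule])
```

Because the rules know nothing about each other, passing them to `run_rules` in a different order ends up with the same set of triples, which is the point the quote above is making.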
Image by eteela used with permission.
Chapter 4 dives straight into RDF. In RDF, everything is a resource, identified by a URI (page 65). A URI does not have to be retrievable as a URL, though to aid uniqueness, it is a convention to use a hostname that you control as the first part of the URI. RDF allows the use of a blank node for situations where you do not know the URI (page 67); these are given an arbitrary ID starting with an underscore and a colon (_:).
RDF can be expressed in different serialization formats (page 69), the chapter demonstrates these using a set RDF format, the Friend of a Friend (FOAF) vocabulary, as the primary example.
The first RDF serialization format covered is N-Triples, a series of statements, each one "containing a subject, predicate, and object followed by a dot" (page 71). N3 is very similar to N-Triples but various shorthands are introduced to remove redundancies (page 72).
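To make the shape of an N-Triples statement concrete, here is a tiny Python sketch that formats two FOAF-style statements by hand. It is a simplification under stated assumptions: the `example.org` URIs are placeholders, the helper names are made up, and the escaping handles only backslashes and double quotes. For real work a library such as RDFLib is the better choice.

```python
FOAF = "http://xmlns.com/foaf/0.1/"

def uri(value):
    """URIs are written between angle brackets."""
    return "<" + value + ">"

def literal(text):
    """Plain string literal in double quotes (simplified escaping only)."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'

def statement(subject, predicate, obj):
    """One line: subject, predicate, object, then a dot."""
    return " ".join([subject, predicate, obj, "."])

lines = [
    statement(uri("http://example.org/zeth"), uri(FOAF + "name"), literal("Zeth")),
    statement(uri("http://example.org/zeth"), uri(FOAF + "homepage"),
              uri("http://example.org/")),
]
print("\n".join(lines))
```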
The XML representation of RDF, which is perhaps what most people think of as RDF, is covered next (pages 73-76). Lastly comes RDFa, where XML attributes are added to XHTML tags, allowing one document to hold both the human-readable and the machine-readable content (page 76). The extra XML attributes "specify the semantics behind the information that is displayed" (page 76).
Chapter 4 leaves behind the simplified tools from the previous chapters and breaks out RDFLib and SPARQL. SPARQL is a query language for RDF graphs. It is read-only, and if you (dis)like SQL then you will equally (dis)like SPARQL. The chapter uses RDFLib to demonstrate this. I might cover RDFLib myself in a future post, so I will skip talking about it for now. Briefly, it seems a really useful library and the first port of call for dealing with all things RDF in Python.
Chapter 5 is "Sources of Semantic Data", which seems to have extensive examples of the work covered so far. I skipped this chapter for now so I could press straight on to chapter 6 about ontologies.
As I have been going along, I have been trying out all the examples, and there have been several small errors in the code, especially inconsistently named references and files. This is not supposed to be a book aimed just at Python users, this is a general book for anyone, with the examples just happening to be in Python. Therefore there could be a little more proofreading, i.e. user testing, of the examples.
This did not spoil it for me, as a more experienced Python programmer I could just fix the examples as I went. So far I have found the book really engaging and useful, and I am very keen to read the rest.
Posts for Friday, May 21, 2010

USA
One thing nice about Vim is manipulating whole lines at a time. dd deletes a line (including trailing newline), regardless of where the cursor is on the line. Then, p puts that line (with its newline) as a new line after the current line, and P puts it above the current line, again regardless of where your cursor is at the moment. (It also jumps the cursor to the beginning of the text you just inserted, which is nice.)
Emacs has kill-whole-line (C-S-Backspace) which is like Vim's dd. But I didn't find an equivalent of p and P. So here's my version:
(defun yank-with-newline ()
"Yank, appending a newline if the yanked text doesn't end with one."
(yank)
(when (not (string-match "\n$" (current-kill 0)))
(newline-and-indent)))
(defun yank-as-line-above ()
"Yank text as a new line above the current line.
Also moves point to the beginning of the text you just yanked."
(interactive)
(let ((lnum (line-number-at-pos (point))))
(beginning-of-line)
(yank-with-newline)
(goto-line lnum)))
(defun yank-as-line-below ()
"Yank text as a new line below the current line.
Also moves point to the beginning of the text you just yanked."
(interactive)
(let* ((lnum (line-number-at-pos (point)))
(lnum (if (eobp) lnum (1+ lnum))))
(if (and (eobp) (not (bolp)))
(newline-and-indent)
(forward-line 1))
(yank-with-newline)
(goto-line lnum)))
(global-set-key "\M-P" 'yank-as-line-above)
(global-set-key "\M-p" 'yank-as-line-below)
Just one more step along the path to Vimmify my Emacs setup. Emacs has some weird edge cases because you can move the cursor one "line" past the last real line in the file. But I think I worked out something comfortable for myself.
PS: I've written about this before, but if you use C-S-Backspace a lot in Emacs on Linux, I highly recommend putting this into your X11 config:
Option "DontZap" "True"
It's really easy to mix up C-S-Backspace and C-M-Backspace (the latter of which kills your X server). It's not fun to mix those up. Not fun at all.
PPS: This thread on Stack Overflow has some Emacs equivalents of Vim's o and O which are pretty nice too.
Malaysia
Sometimes you ask yourself how to do cool things like playing a song in the background (ie. no visible interface or application) upon login on a Windows box. Being completely unfamiliar with using DOS I wasn’t quite sure how to go about doing this, but apparently it was quite easy. So here I am documenting it for future "reference". This marks my very first time touching the DOS prompt and indeed any sort of commands in Windows, so please excuse the newbie-format of this post.
Everything is done CLI for obvious reasons – we don’t want any interface for them to turn off our song. So we need a command line music player. mplayer is also available as a command line player on Windows, and so it was my first choice. A quick download of a build without an interface and we were ready to play any song with a *.bat file containing `mplayer "music.mp3"`
The next step is to make it run without the prompt opening up. This is again easily done by executing the bat file via a vbs file with the following content. Creating a shortcut to this vbs file and dumping it in your startup folder is the simplest and most obvious way to make it play on login. Here’s the code:
Set WshShell = CreateObject("WScript.Shell")
WshShell.Run chr(34) & "C:\path\to\my\bat\file.bat" & Chr(34), 0
Set WshShell = Nothing
Now I wanted to be able to change this song whenever I wanted from a central server. Basically it would check whether or not it needs to update the song, and if it does, delete the existing song and download the new song. This is useful to give a little variety in our fun little player. Some things didn’t work quite as I wanted them to, so I have probably used the most horrendous of hacks based on what I could garner from various online references.
First I needed a way to download files akin to wget. I found a small program called url2file which did just the thing. I wanted it to check whether or not a song existed on the server, and if it did, download it. However the url2file program didn’t quite play nice with that idea (it would download a 404 page instead of allowing me to tell it not to do anything), and I didn’t know how to check whether or not a file existed on a remote server. So instead I had to make do with a second "notifier" file which, if it contained a certain string, would mean that a new song was available to be downloaded.
It would download that plaintext file’s contents to a tmp file, search in that tmp file for the string we were looking for, and if successful, would delete the existing music file and download the new one to take its place. Unfortunately doing a simple `if %getnew%==yes` didn’t work (explanations welcome!), so I made do with checking the first 3 characters, which did work. Here’s the final code, with the getnew.txt file including just the single word "yes".
del tmp
URL2FILE.EXE http://foobar.com/getnew.txt > tmp
set /p getnew= < tmp
set _part_name=%getnew:~0,3%
if %_part_name%==yes del music.mp3
if %_part_name%==yes URL2FILE.EXE http://foobar.com/music.mp3 music.mp3
Tada, it worked flawlessly. Not bad for a couple of hours’ work from scratch, knowing nothing about DOS at all.
In unrelated news, I’m looking for good bagpipe music.
Posts for Thursday, May 20, 2010
C++ doesn’t have named function parameters. In some ways this isn’t a huge deal, since the compiler will usually catch when you screw up the ordering of arguments to a function. But if you’ve got a function accepting multiple arguments of the same type, the compiler isn’t going to save you. So we want to allow something like the following:
shop.populate(
param::number_of_cheeses() = 0,
param::number_of_parrots() = 1,
param::parrot_variety() = "Norwegian Blue"
);
We also want:
It would be nice to allow arguments to be specified in any order, and there is a way of doing that using C++0x, but it’s rather convoluted, so we’ll stick with the requirement that arguments be in the right order for now.
First, we want to work out the type of those param::foo() things. Since we’re using operator=, they need to be structs or constants of some kind (since operator= can only be overloaded as a member function). Since we want lots of them of different types, and since we don’t want to have to worry about declaring the same name multiple times (which means we’d start hitting the ODR), a typedef of a template seems in order. Thus, we’d like to do:
namespace param
{
typedef Name</* something */> number_of_cheeses;
typedef Name</* something */> number_of_parrots;
typedef Name</* something */> parrot_variety;
}
As for the something, the best I’ve been able to come up with is an inline forward declaration of a meaningless struct:
namespace param
{
typedef Name<struct N_number_of_cheeses> number_of_cheeses;
typedef Name<struct N_number_of_parrots> number_of_parrots;
typedef Name<struct N_parrot_variety> parrot_variety;
}
What about the function parameters?
void Shop::populate(
const NamedValue<param::number_of_cheeses, int> & number_of_cheeses,
const NamedValue<param::number_of_parrots, int> & number_of_parrots,
const NamedValue<param::parrot_variety, std::string> & parrot_variety)
{
/* ... */
}
There’s a small amount of duplication there, but that’s a necessity: it’s considered a useful feature of C and C++ that declarations and implementations of functions can use different names for parameters.
As for using the parameters, we’ve got two options. We could add a super magic cast operator to NamedValue, or we could make it explicit. Since super magic casts have a nasty habit of doing really weird things, we’ll make it explicit using operator():
void Shop::populate(
const NamedValue<param::number_of_cheeses, int> & number_of_cheeses,
const NamedValue<param::number_of_parrots, int> & number_of_parrots,
const NamedValue<param::parrot_variety, std::string> & parrot_variety)
{
cheeses.resize(number_of_cheeses());
cage.insert(number_of_parrots(), parrot_variety());
}
Now we just have to make it work. First, NamedValue, remembering to provide const and non-const versions of our operator:
template <typename T_, typename V_>
class NamedValue
{
private:
V_ _value;
public:
explicit NamedValue(const V_ & v) :
_value(v)
{
}
V_ & operator() ()
{
return _value;
}
const V_ & operator() () const
{
return _value;
}
};
Then Name. Our first attempt might look like this:
template <typename T_>
struct Name
{
template <typename V_>
NamedValue<Name<T_>, V_> operator= (const V_ & v) const
{
return NamedValue<Name<T_>, V_>(v);
}
};
But there’s a problem: whilst this works for int and most classes, it does something immensely stupid when fed a string literal. We could require users to write out parameters like:
param::parrot_variety() = std::string("Norwegian Blue")
but that’s rather silly. So instead we’ll add in a way of overriding types for NamedValue, keeping it nice and generic in case any similar situations crop up elsewhere:
template <typename T_>
struct NamedValueType
{
typedef T_ Type;
};
template <int n>
struct NamedValueType<char [n]>
{
typedef std::string Type;
};
template <typename T_>
struct Name
{
template <typename V_>
NamedValue<Name<T_>, typename NamedValueType<V_>::Type> operator= (const V_ & v) const
{
return NamedValue<Name<T_>, typename NamedValueType<V_>::Type>(v);
}
};
Fortunately, g++ is smart enough to compile all of this into exactly the same code as it would if named parameters weren’t used.
And there we have it: very low boilerplate type safe named parameters with no icky macros.
USA
So, as you can see, besides pommed, a fan script, and the webcam, there’s really very little tweaking required. Everything more or less works.
Edit: The kernel configuration for this machine was requested in the comments below, so I’ve posted it here.
USA
This is a short blog post to say that I’ve found an excellent must-have for Firefox users. It’s called Mozilla Weave. It keeps history, bookmarks, passwords, Firefox settings, and soon add-ons in sync. My experience of any kind of automated syncing software has been poor, but this one seems to be flawless. Minus connectivity issues when I’m offline, it’s never reported an error or bugged me about some conflict or other. It just sits there and uploads data to Weave servers (or your own if you so choose) so other computers may stay in sync. Even if you don’t have multiple computers, this is great for just backing up, especially considering the Firefox data folder is a pain to back up.
Privacy concerns are non-existent here, because everything is encrypted with a pass-phrase, on top of your Weave account username and password.
The only bummer about Weave is it’s limited to just Firefox, and Firefox’s competitors have no equivalent yet. It looks like I won’t be switching to Chrome any time soon.
Posts for Wednesday, May 19, 2010

Slovenia
Wow!
It's pretty rare that my heart goes pounding when I'm playing a game, but this one did it. Digital: a Love Story is quite a unique little adventure game.
It has it all: oldskoolness, hacking, phreaking, BBS, AI ...everything. And although it's pretty linear, at least to me it brought the genuine feeling of being on the edge, sometimes with tough decisions to make, as well as pretty realistically reenacting the feeling of getting access codes blocked.
Sure, it's all done in a very simple way, but it's just a game. And as such a very cool one :]
hook out >> just finished the game and feeling good and his alter ego a bit brokenhearted

Slovenia
However we spin it, it can't be denied that the status quo of our (western) society is not perfect. With the massive recession, pollution (and global warming), general unhappiness and stress leading to depression, cancer, low birth rates etc. etc. etc. I think it's safe to say we're sinking ever deeper into the cacky as a society.
I've just stumbled upon this great 10-minute animated presentation on how studies show that money is a good motivator only for purely physical work. For anything demanding even a slight cognitive effort, on the other hand, huge monetary rewards are counter-productive, and the true motivations are the working/thinking person's autonomy, their striving for mastery and the feeling of fulfilling a purpose. Which is a good explanation of why FOSS exists in the first place and why it continues to grow.
In connection with that, Simon Phipps has blogged about why the freedoms FOSS carries are more important, even to businesses, than the open source development model.
Currently, probably the most brilliant project I've seen in years is Design for the First World [Dx1W], which invites all who were born and live in the so-called 3rd world to help find solutions to problems that bother the developed world. They name e.g. obesity, consumerism, integration of immigration, low birth rate and aging population, but there's more. You could look at that sarcastically, but IMHO these are real problems we're facing, to which our society has so far failed to find working solutions.
The cruel reality remains that the current state of the 1st world is troubled by many things, doubts arise in both capitalism and individualism and elsewhere as well. It's time to stop, smell the roses and rethink our strategy. And in that, I think, we have to learn from both the FOSS community and the rest of the world.
hook out >> halfway comprehensive news brought to you by the guy who should be studying his arse off right now, but is not :P
Belgium
Another update to Quizzer, now at version 3. But more importantly, updates to the Linux Sea related chapters are made available online – get a taste for it at the online quizzer set.
Feedback is, as always, very much appreciated.

England
In my last post, I talked about taking some first tentative steps into the semantic web. Two of the commenters suggested that I should check out the book "Programming the Semantic Web" published by O'Reilly. The full reference is below:
"Programming the Semantic Web by Toby Segaran, Colin Evans, and Jamie Taylor. Copyright 2009 Toby Segaran, Colin Evans, and Jamie Taylor, 978-0-596-15381-6."
In this post I review what I have read so far, making notes as I go along.
Chapter 1 is called "Why Semantics?" The book explains the core idea of the semantic web:
"it's about using semantics to represent, combine, and share knowledge between communities of machines, and how to write systems that act on that knowledge." (Page 2)
In particular:
"With a little work you can make the semantic relationships in your data explicit, and program in a way that allows the behavior of your systems to change based on the meaning of the data. With the semantics made explicit, other programs, even those not written by you, can seamlessly use your data. Similarly, when you write programs that understand semantic data, your programs can operate on datasets that you didn't anticipate when you designed your system." (page 3)
Chapter 2 talks about the 'triple', which is "the fundamental building block of semantic representations" (page 19). Here are two triples:
Zeth writes 'Command Line Warriors'
Zeth's address is 'Buckingham Palace' (not yet at least)
The first part of the triple is the subject (e.g. Zeth). The second part is the 'predicate', which is "a property of the entity to which they are attached" (page 19), so my birthday or my address would be a predicate. The last part is the object, which can be another entity "that can be the subject in other triples" (page 19) or a literal value "such as strings or numbers." (page 19).
Different triples are linked together by sharing objects or subjects. The book starts off with an example spreadsheet of restaurant listings which it turns into a relational database, which it then turns into these triples. So the links are analogous to links between tables in a relational database.
Next the book moves on to building graphs of triples by using shared ids:
zeth first_name "Zeth"
zeth address royal_residence
royal_residence street_address "Buckingham Palace"
royal_residence post_code "SW1A 1AA"
commandline_warriors written_by zeth
commandline_warriors name 'Command Line Warriors'
So here there are three entities: the person Zeth represented by the ID 'zeth', a house represented by the ID 'royal_residence' and a website represented by the ID 'commandline_warriors'. The post code "SW1A 1AA" is just a literal value at this point, but it could later be turned into an entity also. The first two triples have a shared ID, meaning both statements are about the entity 'zeth'. 'zeth' is not the name of the entity, it is an arbitrarily-chosen ID, the name is provided by the first_name predicate. The ID could have been a hash value or a sequential number.
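To make the idea of shared IDs concrete, here is a minimal sketch of my own (not code from the book) that stores the triples as plain Python tuples and follows the link from 'zeth' to 'royal_residence'. The helper names `objects` and `street_address_of`, and the underscore spelling of the predicates, are assumptions made for illustration.

```python
# Each triple is a (subject, predicate, object) tuple; IDs are plain strings.
triples = [
    ("zeth", "first_name", "Zeth"),
    ("zeth", "address", "royal_residence"),
    ("royal_residence", "street_address", "Buckingham Palace"),
    ("royal_residence", "post_code", "SW1A 1AA"),
    ("commandline_warriors", "written_by", "zeth"),
    ("commandline_warriors", "name", "Command Line Warriors"),
]


def objects(subject, predicate):
    """Return every object stored for the given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def street_address_of(person_id):
    """Follow the 'address' link to another entity, then read its literal."""
    for place in objects(person_id, "address"):
        for street in objects(place, "street_address"):
            return street
    return None


print(street_address_of("zeth"))
```

The object of the second triple is itself the subject of the third and fourth, which is all that "linking" means here.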
The book then works through its first code example, which is available online here: simpletriple.py. I recommend that you download it now and have a look over it.
The code sample has a class called SimpleGraph which is a simple example 'triplestore'. In the __init__ method, 's' stands for subject, 'p' for predicate, and 'o' for object. So the triples are stored in three different combinations. The book then explains the various methods, which may be evident to you from the code and docstrings.
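To illustrate why storing each triple in three combinations is useful, here is a rough sketch of the idea. It is my own simplified stand-in, not the book's SimpleGraph: the class name `TinyGraph` and its method names are hypothetical.

```python
from collections import defaultdict


class TinyGraph(object):
    """Hypothetical three-index triple store; not the book's SimpleGraph."""

    def __init__(self):
        # The same triple is filed three times, each keyed in a different
        # order, so a lookup can start from whichever parts are known.
        self._spo = defaultdict(lambda: defaultdict(set))
        self._pos = defaultdict(lambda: defaultdict(set))
        self._osp = defaultdict(lambda: defaultdict(set))

    def add(self, s, p, o):
        """Store one triple in all three indexes."""
        self._spo[s][p].add(o)
        self._pos[p][o].add(s)
        self._osp[o][s].add(p)

    def subjects(self, p, o):
        """All subjects with predicate p and object o (uses the pos index)."""
        if p in self._pos and o in self._pos[p]:
            return set(self._pos[p][o])
        return set()

    def objects(self, s, p):
        """All objects for subject s and predicate p (uses the spo index)."""
        if s in self._spo and p in self._spo[s]:
            return set(self._spo[s][p])
        return set()


graph = TinyGraph()
graph.add("zeth", "first_name", "Zeth")
graph.add("commandline_warriors", "written_by", "zeth")
```

With this in place, `graph.subjects("written_by", "zeth")` can answer "what did zeth write?" without scanning every triple, at the cost of storing everything three times.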
Next we are shown in pictures and code that if there are two graphs with consistent identifiers, they can be merged together. Then we are to download a CSV file of triples and load them using the load method of the SimpleGraph object. Then we perform queries upon this data.
from simpletriple import SimpleGraph
# Make an instance of the class
film_graph = SimpleGraph()
# Load the CSV from the book's website
film_graph.load("movies.csv")
# Now let's find Julie Walters' id
julie_id = film_graph.value(None, "name", "Julie Walters")
print(julie_id)
# Now let's find out all the films Julie has been in:
julie_films = film_graph.triples((None, "starring", julie_id))
for film in julie_films:
    print(film_graph.value(film[0], "name", None))
One of the results is the classic film 'Educating Rita'.
educating_rita = film_graph.value(None, "name", "Educating Rita")
Now let's find another actor in Educating Rita:
# next() takes the first matching triple; index 2 is its object (the actor's id)
actor = next(film_graph.triples((educating_rita, "starring", None)))[2]
print(film_graph.value(actor, 'name', None))
Sadly, there are no film dates in the CSV file. If there were, we could sort an actor's films by year and thus generate a filmography.
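As a rough sketch of what that could look like, suppose each film also carried a 'year' triple. The predicate name 'year' and the IDs below are invented for illustration and are not in the book's movies.csv; the sketch uses plain tuples rather than SimpleGraph so it runs on its own.

```python
# Hypothetical triples: the IDs and the 'year' predicate are made up here.
triples = [
    ("film_1", "name", "Calendar Girls"),
    ("film_1", "year", 2003),
    ("film_1", "starring", "actor_1"),
    ("film_2", "name", "Educating Rita"),
    ("film_2", "year", 1983),
    ("film_2", "starring", "actor_1"),
    ("film_3", "name", "Billy Elliot"),
    ("film_3", "year", 2000),
    ("film_3", "starring", "actor_1"),
]


def value(subject, predicate):
    """Return the first object found for subject and predicate, or None."""
    for s, p, o in triples:
        if s == subject and p == predicate:
            return o
    return None


def filmography(actor_id):
    """Return (year, name) pairs for an actor's films, oldest first."""
    films = [s for s, p, o in triples if p == "starring" and o == actor_id]
    return sorted((value(film, "year"), value(film, "name")) for film in films)


for year, name in filmography("actor_1"):
    print("%d %s" % (year, name))
```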
Let's instead find the director:
director = film_graph.value(educating_rita, 'directed_by', None)
print(film_graph.value(director, 'name', None))
What other films has he directed?
directed_films = film_graph.triples((None, "directed_by", director))
for film in directed_films:
    print(film_graph.value(film[0], "name", None))
If you want to play along, use simpletriple.py to find out which other film this director made that also stars the actor we found above. The answer is in the comments.
The rest of chapter two gives a few more examples that can be played with.
Chapter three gives a new query syntax that works by defining various constraints and binding the results to set references. This is most easily demonstrated by an example. Start by downloading an upgraded version of the triples module called simplegraph.py.
We load the data in the same way as before:
from simplegraph import SimpleGraph
film_graph = SimpleGraph()
film_graph.load('movies.csv')
You might still have 'actor' and 'director' etc in memory. Assuming that you do not, we can repeat what we did above:
julie_id = film_graph.value(None, "name", "Julie Walters")
educating_rita = film_graph.value(None, "name", "Educating Rita")
actor = next(film_graph.triples((educating_rita, "starring", None)))[2]
director = film_graph.value(educating_rita, 'directed_by', None)
Now we can answer the above quiz question in a far more efficient manner. The question was to find out what other film the 'director' made that also starred the 'actor'.
film_graph.query([('?film', 'starring', actor),
                  ('?film', 'directed_by', director)])
You can see instantly that this query is far shorter than the previous attempt which involved manually iterating our way to the correct result. How the query method is implemented can be seen by reading the Python file linked to above. What happens in this case is that for each possible result matching these constraints, a dictionary is returned binding the key 'film' to the ID of the film that has been found.
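To show the binding idea without the book's module, here is a simplified sketch of such a query function over a plain list of tuples. It is my own illustration and not the implementation in simplegraph.py; the film, actor and director IDs are made up.

```python
def query(triples, clauses):
    """Return a list of dicts binding '?variable' names to matching values."""
    bindings = [{}]
    for clause in clauses:
        next_bindings = []
        for binding in bindings:
            for triple in triples:
                candidate = dict(binding)
                for term, found in zip(clause, triple):
                    if isinstance(term, str) and term.startswith("?"):
                        # A variable: bind it, or check an earlier binding.
                        if candidate.setdefault(term[1:], found) != found:
                            break
                    elif term != found:
                        # A constant that does not match this triple.
                        break
                else:
                    next_bindings.append(candidate)
        bindings = next_bindings
    return bindings


# Hypothetical data, invented for this example.
sample = [
    ("film_a", "starring", "actor_x"),
    ("film_a", "directed_by", "director_y"),
    ("film_b", "starring", "actor_x"),
    ("film_b", "directed_by", "director_z"),
    ("film_c", "starring", "actor_w"),
    ("film_c", "directed_by", "director_y"),
]

result = query(sample, [("?film", "starring", "actor_x"),
                        ("?film", "directed_by", "director_y")])
print(result)
```

Each clause narrows the list of candidate bindings, so only films that satisfy both constraints survive. This version scans every triple for every clause, which is fine for a toy but is exactly what an indexed store is meant to avoid.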
So far we are part way through chapter 3. Join us next time when we continue working through the book.
Posts for Tuesday, May 18, 2010
i’m not a big fan of updates but sometimes i have to do them. this time i would like to blog the steps so that other gentoo users can review what i did. this posting lists all commands i’ve issued in order to update from kde-4.3.3 to kde-4.4.2 (later kde-4.4.3).
WARNING: it is very important to understand what stable means in portage. if a package called mypackage is marked stable in portage, it can be installed with ‘emerge mypackage’. this does not necessarily mean that the software itself is stable, although most likely it is. the concept is different from the releases made by software vendors, who have their own idea of stable, testing and unstable. in portage a package is marked stable once its integration into the gentoo linux system has proved to work well. new packages might be marked unstable because they have not been tested enough, even though many users would think they should be marked stable. if in doubt: do not install software which is marked unstable by portage. this posting is all about installing a ‘kde release’ which is, at the time of writing, marked unstable in portage. in contrast, the kde developers have released the software i’m going to install as stable.
see [6] for the official kde & gentoo guide.
i’ve had lots of problems with kde 4.x so i basically removed all my daily kde 4.x dependencies and replaced them with non-kde programs such as:
i will probably change back to kde 4.x if:
first let’s update portage with:
# eix-sync
since gentoo installations of kde take very long i’ve decided to install xfce4, which is a very nice and tiny desktop environment:
# emerge xfce4-meta
afterwards i’ve logged out and logged into xfce4. i’m using the gnome terminal for the update.
using emerge this message shows up every time:
WARNING: One or more repositories have missing repo_name entries:
/usr/local/portage/profiles/repo_name
NOTE: Each repo_name entry should be a plain text file containing a
unique name for the repository on the first line.
i’ve seen this warning all over the place here and i usually look the error up in google, fix the problem and forget about it. as it still seems to be there on some machines it is probably a good idea to document the fix in this blog. so here we go:
echo "invalidmagic's local repository" > /usr/local/portage/profiles/repo_name
and finally the warning is gone. see also [1], where this issue is discussed.
the first thing i try is to test if the update does work out of the box, i’m doing this with:
# autounmask =kde-base/kde-meta-4.4.2
next i check whether portage can perform the update with:
# emerge --color n =kde-base/kde-meta-4.4.2
usually this looks like this (only relevant lines shown):
[blocks B ] <x11-libs/qt-xmlpatterns-4.6.2 ("<x11-libs/qt-xmlpatterns-4.6.2" is blocking x11-libs/qt-webkit-4.6.2-r1, x11-libs/qt-sql-4.6.2, x11-libs/qt-qt3support-4.6.2, x11-libs/qt-core-4.6.2-r1, x11-libs/qt-svg-4.6.2, x11-libs/qt-test-4.6.2, x11-libs/qt-opengl-4.6.2, x11-libs/qt-script-4.6.2, x11-libs/qt-gui-4.6.2)
[blocks B ] <x11-libs/qt-test-4.6.2 ("<x11-libs/qt-test-4.6.2" is blocking x11-libs/qt-webkit-4.6.2-r1, x11-libs/qt-sql-4.6.2, x11-libs/qt-xmlpatterns-4.6.2, x11-libs/qt-core-4.6.2-r1, x11-libs/qt-svg-4.6.2, x11-libs/qt-gui-4.6.2, x11-libs/qt-opengl-4.6.2, x11-libs/qt-qt3support-4.6.2, x11-libs/qt-script-4.6.2)
[blocks B ] kde-base/libknotificationitem:4.3[-kdeprefix] ("kde-base/libknotificationitem:4.3[-kdeprefix]" is blocking kde-base/kdelibs-4.4.2)
[blocks B ] <x11-libs/qt-script-4.6.2 ("<x11-libs/qt-script-4.6.2" is blocking x11-libs/qt-webkit-4.6.2-r1, x11-libs/qt-sql-4.6.2, x11-libs/qt-xmlpatterns-4.6.2, x11-libs/qt-core-4.6.2-r1, x11-libs/qt-svg-4.6.2, x11-libs/qt-test-4.6.2, x11-libs/qt-opengl-4.6.2, x11-libs/qt-qt3support-4.6.2, x11-libs/qt-gui-4.6.2)
(see full list at [2]; i’ve used ‘emerge --color n =kde-base/kde-meta-4.4.2 -a > portage_log 2>&1’ to create a file with the output)
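with the output saved to a file, the block lines can be pulled out by a small script. this is only a sketch: the regular expression is my guess at the line layout shown above, not something portage documents, and the function name parse_blocks is made up.

```python
import re

# Assumed layout: "[blocks B ] ATOM (... is blocking PKG, PKG, ...)"
BLOCK_LINE = re.compile(
    r"^\[blocks\s+B\s*\]\s+(\S+)\s+\(.*?\bis blocking\s+(.*)\)\s*$"
)


def parse_blocks(lines):
    """Map each blocking atom to the list of packages it blocks."""
    blocks = {}
    for line in lines:
        match = BLOCK_LINE.match(line.strip())
        if match:
            blocker, blocked = match.groups()
            blocks[blocker] = [atom.strip() for atom in blocked.split(",")]
    return blocks


sample = [
    '[blocks B ] kde-base/libknotificationitem:4.3[-kdeprefix] '
    '("kde-base/libknotificationitem:4.3[-kdeprefix]" is blocking '
    'kde-base/kdelibs-4.4.2)',
    "this line is not a block and gets skipped",
]

for blocker, blocked in parse_blocks(sample).items():
    print("%s blocks %d package(s)" % (blocker, len(blocked)))
```

to run it on the real log, pass open('portage_log') instead of the sample list.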
so what i do instead, is to remove all kde components from the system
using qlist (app-portage/portage-utils-0.2.1) we need to find all kde components. we need to use -I to find installed packages. we also disable the usage of color, with -C, to make the output usable for script processing.
qlist -IC kde
there are some applications, k3b for instance, which use kde-base/kdelibs but which are NOT included in this list. most of the time this can be ignored since a later ‘revdep-rebuild’ will fix those programs. however, once kdelibs is removed k3b can’t be started anymore. removing kdelibs while k3b is already running will probably not crash k3b, and it might keep working. so let’s remove all kde components (also erasing all SLOTS, aka different versions):
emerge -C $(qlist -IC kde)
depending on your installation and harddrive speed this might take a while (545.09 seconds here).
right now i realized that there are old ‘autounmasks’ in /etc/portage, i’m going to clean that up first:
cd /etc/portage/
- grep kde * -R
- grep qt * -R
- grep avahi * -R
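the three greps above can also be done in one pass. this is just a sketch of the same search in python, not a portage tool; the function name find_mentions is made up, and it only does plain substring matching like the greps do.

```python
import os


def find_mentions(root, keywords):
    """Return (path, line_number, line) for lines containing any keyword."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in sorted(filenames):
            path = os.path.join(dirpath, filename)
            try:
                with open(path) as handle:
                    for number, line in enumerate(handle, 1):
                        if any(word in line for word in keywords):
                            hits.append((path, number, line.rstrip("\n")))
            except (IOError, OSError, UnicodeDecodeError):
                # unreadable or binary file: skip it and carry on
                continue
    return hits


if __name__ == "__main__":
    for path, number, line in find_mentions("/etc/portage",
                                            ["kde", "qt", "avahi"]):
        print("%s:%d: %s" % (path, number, line))
```

like ‘grep qt’, this also matches unrelated words which merely contain ‘qt’, so the hits still need a look by hand.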
i removed most files which had something to do with kde, qt, amarok and some with avahi. so have a look at all categories (directories):
especially look for unmasks of 9999 packages, which refer to svn/cvs/git versions that have not been released yet (but might still be tagged). this means: if such a package refers to a developer’s version without a tag, the package might change without warning. therefore the installation could break with different error messages on different checkouts. usually developers want this in order to test their software. users usually don’t, but it’s a nice way to experiment with recent software while still having a package manager for safe removal.
is the system clean now? we’ll see. we probably have to set the correct use flags again. WARNING: be aware that use flags can also be set globally in /etc/make.conf. some useflags can be shown with:
equery u kde-meta
the semantic-desktop useflag might be of interest. i’m not sure, but i think that using the kdeprefix useflag resulted in having ~/.kde3.5, ~/.kde4.2, ~/.kde4.4 and others. so once you want to use kde 4.4 instead of kde 4.2 (you can select this on login, using kdm for instance), all your settings for kaddressbook, knotes, autostarters, desktop configuration and others have to be migrated manually by copying the files from ~/.kde4.2 to ~/.kde4.4 prior to your login. this is just a guess, but it would explain the issues i had during the time i used +kdeprefix (around kde 3.5 / kde 4.2). so let’s do the autounmask again:
# autounmask =kde-base/kde-meta-4.4.2
this time it took really long (36 minutes). htop showed that portage used only one core, while iotop showed that there was no disk access at the same time, so this is probably the result of complex dependency graph calculations. autounmask came up with these blocks:
[blocks B ] >x11-libs/qt-opengl-4.5.3-r9999 (">x11-libs/qt-opengl-4.5.3-r9999" is blocking x11-libs/qt-assistant-4.5.3, x11-libs/qt-test-4.5.3-r1, x11-libs/qt-dbus-4.5.3-r1, x11-libs/qt-xmlpatterns-4.5.3-r1, x11-libs/qt-core-4.5.3-r2, x11-libs/qt-gui-4.5.3-r2, x11-libs/qt-qt3support-4.5.3, x11-libs/qt-svg-4.5.3-r1, x11-libs/qt-script-4.5.3-r1, x11-libs/qt-demo-4.5.3, x11-libs/qt-webkit-4.5.3, x11-libs/qt-sql-4.5.3)
[blocks B ] <x11-libs/qt-svg-4.6.2 ("<x11-libs/qt-svg-4.6.2" is blocking x11-libs/qt-webkit-4.6.2-r1, x11-libs/qt-sql-4.6.2, x11-libs/qt-xmlpatterns-4.6.2, x11-libs/qt-core-4.6.2-r1, x11-libs/qt-test-4.6.2, x11-libs/qt-opengl-4.6.2, x11-libs/qt-qt3support-4.6.2, x11-libs/qt-script-4.6.2, x11-libs/qt-dbus-4.6.2, x11-libs/qt-gui-4.6.2)
and many more…
so it’s time to check the qt-* packages. interesting: there is x11-libs/qt installed (a qt-3.x version), while the new qt-4.x packages use a split package naming scheme.
# equery d x11-libs/qt
- app-crypt/qca-1.0-r3 (x11-libs/qt:3)
- dev-libs/dbus-qt3-old-0.70 (=x11-libs/qt-3*)
- media-sound/hydrogen-0.9.3-r4 (=x11-libs/qt-3*)
qlist -Iv x11-libs | grep "qt-.*"
- x11-libs/qt-3.3.8b-r2 (WARNING: you can leave this installed as it is not a conflict candidate for a kde 4.x installation despite some avahi issues)
- x11-libs/qt-assistant-4.5.3
- x11-libs/qt-core-4.5.3-r2
- x11-libs/qt-dbus-4.5.3-r1
- x11-libs/qt-demo-4.5.3
- x11-libs/qt-gui-4.5.3-r2
- x11-libs/qt-opengl-4.5.3-r1
- x11-libs/qt-qt3support-4.5.3
- x11-libs/qt-script-4.5.3-r1
- x11-libs/qt-sql-4.5.3
- x11-libs/qt-svg-4.5.3-r1
- x11-libs/qt-test-4.5.3-r1
- x11-libs/qt-webkit-4.5.3
- x11-libs/qt-xmlpatterns-4.5.3-r1
# qlist -IC x11-libs | grep "qt-.*"
# emerge -C $(qlist -IC x11-libs | grep "qt-.*")
cd /etc/portage
grep kde *
rm package.unmask/autounmask-kde-meta
rm package.use/autounmask-kde-meta
rm package.keywords/autounmask-kde-meta
# autounmask =kde-base/kde-meta-4.4.2 (see [3] for a complete list)
*smile* only one final ‘block’ left!
[blocks B ] <app-emulation/emul-linux-x86-xlibs-20100409 ("<app-emulation/emul-linux-x86-xlibs-20100409" is blocking app-emulation/emul-linux-x86-opengl-20100410_pre)
* Error: The above package list contains packages which cannot be installed at the same time on the same system.
('installed', '/', 'app-emulation/emul-linux-x86-xlibs-20091231', 'nomerge') pulled in by
~app-emulation/emul-linux-x86-xlibs-20091231 required by ('installed', '/', 'app-emulation/emul-linux-x86-gtklibs-20091231', 'nomerge')
~app-emulation/emul-linux-x86-xlibs-20091231 required by ('installed', '/', 'app-emulation/emul-linux-x86-medialibs-20091231', 'nomerge')
app-emulation/emul-linux-x86-xlibs required by ('ebuild', '/', 'x11-drivers/nvidia-drivers-195.36.24', 'merge')
(and 2 more)
('ebuild', '/', 'app-emulation/emul-linux-x86-opengl-20100410_pre', 'merge') pulled in by
app-emulation/emul-linux-x86-opengl required by ('ebuild', '/', 'app-emulation/emul-linux-x86-xlibs-20100409-r1', 'merge')
so let’s try point two (2):
# autounmask =app-emulation/emul-linux-x86-gtklibs-20100409-r1
# autounmask =app-emulation/emul-linux-x86-medialibs-20100409
and finally let’s try it again
# autounmask =kde-base/kde-meta-4.4.2
oh we got a “!done”. that is great news as it seems to work so far!
however, i don’t plan to remove them. with some luck they might be updated automagically.
# emerge -uDN world --keep-going -a
(see [4] for the complete output of the command above)
…
Use emerge @preserved-rebuild to rebuild packages using these libraries
emerge -uDN world --keep-going -a 16972.19s user 5351.49s system 97% cpu 6:21:36.73 total
i usually use “--keep-going”; please see the documentation for what is cool about it. in general it shortens installation time, because a failure in the middle of a 200 package installation won’t stop everything and wait for manual maintenance. with some luck nearly all packages get installed this way, even when there are several critical compile or linker errors.
since we removed x11-libs/qt-* basically every program which links against any of these libraries MUST be broken. with one exception: programs which are linked statically. however most programs on linux are linked dynamically so we have to check for broken programs with:
# revdep-rebuild
revdep-rebuild 802.74s user 271.12s system 95% cpu 18:49.88 total
merging the updated configuration files is really important. there are other ways to do it, anyway:
# etc-update
now the final step! after one day we are finally there! yepeee.
emerge kde-meta -a (see [5] for a complete list of packages and use flags)
so all i did was to add: lzma and semantic-desktop useflag
it seems that while i wrote this blog entry a new version of kde was released (might be my late eix-sync as well). so i’m going to install ‘kde 4.4.3’ instead of ‘kde 4.4.2’. so what i do is basically start all over again:
surprise! we got new blocks:
# emerge kde-meta -a
('ebuild', '/', 'kde-base/kdelibs-4.4.3', 'merge') pulled in by
>=kde-base/kdelibs-4.3.5[-kdeprefix,-aqua] required by ('ebuild', '/', 'kde-base/solid-4.3.5', 'merge')
>=kde-base/kdelibs-4.3.5[-kdeprefix,-aqua] required by ('ebuild', '/', 'kde-base/krosspython-4.3.5', 'merge')
>=kde-base/kdelibs-4.3 required by ('ebuild', '/', 'net-p2p/ktorrent-3.3.4', 'merge')
(and 6 more)
('ebuild', '/', 'kde-base/libknotificationitem-4.3.5', 'merge') pulled in by
>=kde-base/libknotificationitem-4.3.5[-kdeprefix,-aqua] required by ('ebuild', '/', 'kde-base/krosspython-4.3.5', 'merge')
>=kde-base/libknotificationitem-4.3.5[-kdeprefix,-aqua] required by ('ebuild', '/', 'kde-base/solid-4.3.5', 'merge')
>=kde-base/libknotificationitem-4.3.5[-kdeprefix,-aqua] required by ('ebuild', '/', 'kde-base/kdialog-4.3.5', 'merge')
(and 1 more)
so what can we do about this one? the first thing is to look whether there is a more recent version of ktorrent which would use kdelibs-4.4.3 instead of kdelibs-4.3. there is none, so ktorrent can’t be installed with ‘kde 4.4.3’.
but that autounmask also shows a lot of blocks, basically those from above. however there is an additional one now:
('ebuild', '/', 'kde-base/libkworkspace-4.3.5', 'merge') pulled in by
>=kde-base/libkworkspace-4.3 required by ('ebuild', '/', 'net-wireless/kbluetooth-0.4.2', 'merge')
there is nothing we can do about it right now. we have to remove kbluetooth. this stupid apple magic mouse didn’t work well anyway so who cares?
emerge -C kbluetooth
let’s try to mask ‘<kde-base/kdelibs-4.4’ versions, that means all versions which were released before 4.4:
echo "=kde-base/kdelibs-4.3.5" >> /etc/portage/package.mask/kde
echo "=kde-base/kdelibs-4.3.3-r1" >> /etc/portage/package.mask/kde
that worked partially. ‘emerge -uDN world’ still has blocks but ‘emerge kde-meta’ would work well.
so let’s take care of those blocks first:
emerge -C ktorrent kile
and i’m done. those two applications don’t seem to work with kdelibs-4.4.3, so i will reinstall them when i see a new version of these two programs in the ‘eix-sync’ log. currently i don’t need either of them. installing kde with the ‘kdeprefix’ useflag could probably have worked as well but i did not want to do that.
so now the final step
it seems we got all dependencies resolved!
emerge kde-meta --keep-going -a
it seems some use flags which i did not set result in a dependency issue:
emerge: there are no ebuilds built with USE flags to satisfy ">=x11-libs/qt-qt3support-4.6.0:4[kde]".
!!! One of the following packages is required to complete your request:
- x11-libs/qt-qt3support-4.6.2 (Change USE: +kde)
(dependency required by "kde-base/libkcompactdisc-4.4.3" [ebuild])
(dependency required by "kde-base/kdemultimedia-meta-4.4.3" [ebuild])
(dependency required by "kde-base/kde-meta-4.4.3" [ebuild])
(dependency required by "kde-meta" [argument])
let’s fix that with:
echo "x11-libs/qt-qt3support kde" >> /etc/portage/package.use/qt-qt3support
and then we should restart the emerge, but this time we add -N for ‘new use’:
emerge kde-meta -N --keep-going -a
now all problems are resolved and the installation (compilation&linking) is running. x11-libs/qt-qt3support-4.6.2 is the first package which is installed as we used -N.
it might be a good idea to check for broken programs once again, just to be sure. use ‘revdep-rebuild’ for that.
next time i update i can have a look at this posting. maybe it is of help for other gentoo users as well. i would be delighted.
[1] http://bugs.gentoo.org/show_bug.cgi?id=248603
[2] http://lastlog.de/misc/wordpress/portage_kde_meta_blocks.txt
[3] http://lastlog.de/misc/wordpress/autounmask.txt
[4] http://lastlog.de/misc/wordpress/emerge_world.txt
[5] http://lastlog.de/misc/wordpress/emerge_kde.txt
[6] http://www.gentoo.org/proj/de/desktop/kde/kde-config.xml
Planet Larry is not officially affiliated with Gentoo Linux. Original artwork and logos copyright Gentoo Foundation. Yadda, yadda, yadda.