Posts for Monday, June 7, 2010

Portage Hooks

Now that school is done for the 2009-2010 year, I’m back at it in Neuvoo again. I’m finishing off a long-planned and fairly major addition to portage I call “portage hooks.” The fun thing is I’ve submitted some patches to zmedico and the response has actually been more positive than previous experiences. solar seemed to be (tentatively?) liking the idea as well.

So, here’s what portage hooks are all about. If you have portage-utils installed, you will have an /etc/portage/postsync.d/ directory. Scripts in this directory are executed after portage syncs the tree. I thought this was a great idea, and I thought it should be expanded so there are other opportunities for unofficial extensions.

So, hopefully soon, there will be /etc/portage/hooks/{pre,post}-{run,sync,ebuild}.d directories, in which hook scripts can be installed by ebuilds or users. The run and sync hooks are self-explanatory. The ebuild hooks will be executed within the ebuild environment before or after each phase, and can modify an ebuild’s environment, similar to /etc/portage/bashrc.
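To make the idea concrete, here is a minimal Python sketch of how a tool could discover and run the scripts in one of these hook directories. It is only an illustration, not the code from the patches: the function names, the lexical ordering and the rule "every executable regular file is a hook" are all assumptions.

```python
import os
import stat
import subprocess


def find_hooks(hook_dir):
    """Return the executable regular files in hook_dir, sorted by name.

    Assumption: hooks run in lexical order, the way many *.d directories work.
    """
    if not os.path.isdir(hook_dir):
        return []  # a missing directory simply means "no hooks"
    hooks = []
    for name in sorted(os.listdir(hook_dir)):
        path = os.path.join(hook_dir, name)
        mode = os.stat(path).st_mode
        # Keep regular files that have at least one execute bit set.
        if stat.S_ISREG(mode) and mode & 0o111:
            hooks.append(path)
    return hooks


def run_hooks(hook_dir, env=None):
    """Run each hook in turn and return a list of (path, exit status) pairs."""
    results = []
    for path in find_hooks(hook_dir):
        completed = subprocess.run([path], env=env, check=False)
        results.append((path, completed.returncode))
    return results
```

Calling run_hooks("/etc/portage/hooks/pre-sync.d") would then run whatever scripts are installed there. What a non-zero exit status should mean is a policy decision this sketch leaves open.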

I’ve written documentation for the portage DocBook (enable the doc USE flag on portage), which will be available once the patches get accepted.

Neuvoo has already used these hooks to add transparent and stable support for squashfs portage trees. The “emerge --sync” command is hijacked by a pre-sync hook to download the latest squashfs tree, and every time “emerge” is run, there is a hook that checks to be sure the squashfs tree is mounted.
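The "is the tree mounted?" check can be sketched in a few lines of Python. This is not the actual Neuvoo hook (which is not shown here). It only assumes the standard /proc/mounts layout of device, mount point and filesystem type, and it uses /usr/portage as an example tree location.

```python
def is_squashfs_mounted(mountpoint, mounts_text):
    """Return True if mounts_text lists a squashfs filesystem at mountpoint.

    mounts_text is expected in /proc/mounts format:
    device mountpoint fstype options dump pass
    """
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[1] == mountpoint and fields[2] == "squashfs":
            return True
    return False


if __name__ == "__main__":
    # /usr/portage is an assumed location for the tree.
    with open("/proc/mounts") as handle:
        print(is_squashfs_mounted("/usr/portage", handle.read()))
```

A real hook would go on to mount the image when the check fails. That step is left out because it depends on where the image is stored.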

I hope to use hooks for the following additional unofficial features:

  • When a package fails to merge, a hook would inform the user of known open bugs for that package, and possibly even suggest a fix. Still thinking over this one.
  • Since we’re working in embedded environments, we’re going to be primarily using binary packages on production systems. This means we’ll need a pretty solid binary repository format to keep everything stable. I’m hoping to use our beagleboards to keep these binaries up-to-date. A post-run hook would detect that a new binary package was built and port the binary into our new binary repository format (which will consist of ebuilds, since ebuilds have much better flexibility than the “Packages” file will ever have). In addition, Neuvoo users, and anyone else with our binary system, can submit new binaries to our server for automated review, which will probably be some kind of fancy voting system.

An unofficial version of portage, and the squashfs hooks, are all available right now if you want to try it out in the neuvoo overlay. Run “layman -a neuvoo && emerge -av squashfs-portage” to get it. Be warned: the squashfs-portage package requires the latest neuvoo portage git. I don’t know if you can revert back to an old version of portage once you have the new one.


Posts for Saturday, June 5, 2010

It’s not just *what* you write, it’s *when* you write it

I’ve had this blog for quite a while now and there are a few things that I still haven’t really figured out. One of the things that always irritates me most is which posts get recognition (as in being linked, twittered about or commented on).

There are posts that come quickly. I open the text editor thingy and just type, a few minutes later a post has emerged. Those are usually me venting or poking something retarded with a stick. Those posts (even if they end up being rather long at times) just come to me. They do not require a lot of work and while venting always relieves me, those posts are nothing I do feel particularly good about.

On the other hand there are posts that usually have meandered around in my head for weeks or months, ideas that have kept my mind occupied for a long time and that I try to meld into a post at some point in time. And usually those posts are really hard to write: I trash many of them, rewrite them, trash the rewrite. Those posts are hard work and when I manage to boil my thoughts down to something my limited writing skills are able to bring to virtual paper it feels different. It’s probably more like what a mother feels after having given birth (without the physical pain obviously). Those posts I wrestled with usually are something really important to me, some idea or comment I think is really really important. Also, those posts hardly ever gain any traction, but that’s probably because they tend to be basically unreadable without the weeks of background thinking I did (I try to be better with that but it does not work so great so far ;) ).

But recently I realized that something else is very important for posts as well: Timing. With this I do not mean that a comment on some political issue has to be made directly after the event happened so people still have the event in mind, I am talking about the time of the day.

While some people still surf to sites directly to check out what’s new, a few years ago, many people switched to RSS readers. Personally I couldn’t live without one (I use the “evil” Google Reader [which is an absolutely brilliant RSS reader btw. you should really try it!]), but lately Twitter has turned out to be the thing driving traffic. I do what many others do, I have a service poll my RSS feed and post notices alerting people on my Twitter and Identi.ca feeds of the new post on the blog. Sometimes people retweet those notices to their followers and a post gets spread around.

But (and this is quite a serious but!) this has led to many people heavily using twitter to find new and interesting articles and twitter is “real-time-ish” meaning: Something written yesterday evening is basically gone today. Microblogs are a stream of information. You dive in, you get out and the stream continues flowing. It’s basically impossible to read all messages passing through so you only check them out when you have the time.

I am in Germany and I usually write in the evening when I have some time off. So if I write something in German (which does not happen all that often though a few recent articles in German were fun to write) and a few minutes later the post is picked up and posted to twitter many people might have already gone off to bed. In the morning they open their microblogging application and maybe even check the backlog of messages but during those 8 or more hours so many messages will have gone through that the blogpost has basically never existed.

Now, if you know me somewhat, I am stubborn. I don’t want to repeat posts notifying people of my blog, it’s just more noise I would add to the world. But it makes you wonder how many articles never get any traction just because somebody wrote them after work instead of in his or her lunch break.

OVAL, SCAP, CVE, CPE, …

For a personal POC I wanted to see if it is possible to generate, based on the collection of CVE entries publicly available, a report informing a system administrator about possible vulnerabilities. Nothing fancy, just based upon versions.

A simple example: tool detects Perl, acquires installed Perl version, then matches the collection of CVE entries against this Perl version. If at least one CVE is found, report it. The idea is then to make this as generic as possible (not specific for an operating system or Linux distribution), so not use a package version but really the tool version (or library version).
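The matching step described above could look roughly like the following Python sketch. Everything in it is an assumption made for illustration: the Advisory record, the plain dotted-number version format and the sample entry (the CVE identifier and version range are made-up placeholders, not real advisory data).

```python
from typing import NamedTuple, Tuple


class Advisory(NamedTuple):
    """One vulnerability entry reduced to a product and an affected version range."""
    cve_id: str
    product: str
    first_affected: Tuple[int, ...]  # inclusive lower bound
    last_affected: Tuple[int, ...]   # inclusive upper bound


def parse_version(text):
    """Turn '5.8.4' into (5, 8, 4); only plain dotted numbers are handled."""
    return tuple(int(part) for part in text.split("."))


def matching_advisories(product, installed_version, advisories):
    """Return the ids of all advisories whose range contains the installed version."""
    installed = parse_version(installed_version)
    return [
        advisory.cve_id
        for advisory in advisories
        if advisory.product == product
        and advisory.first_affected <= installed <= advisory.last_affected
    ]


# Placeholder data, NOT a real CVE entry.
SAMPLE_ADVISORIES = [Advisory("CVE-0000-0001", "perl", (5, 8, 0), (5, 8, 8))]

if __name__ == "__main__":
    print(matching_advisories("perl", "5.8.4", SAMPLE_ADVISORIES))
```

Real version strings are messier than this (suffixes, release candidates, vendor patches), which is one reason to look for an existing naming scheme first.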

Of course, whenever I am planning such minor POCs, I search the Internet for possible existing tools (just like kev009 describes – “But First, Write No Code”). And I found out that there are already quite some “foundation components” available…

  • CPE is a structured way of naming software (vendor, title, version …)
  • OVAL is a method for performing structured tests (like regular expression matches in text) for reporting purposes

Many more of these efforts are linked through the Mitre sites. The above two are the most important ones though – it seems that it might be possible to use OVAL to describe the tests I wanted for the POC.
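As a small illustration of what "structured naming" means in practice, the sketch below splits a CPE name in the 2.2 URI form (cpe:/part:vendor:product:version:update:edition:language) into its fields. The helper name and the example value are assumptions chosen for illustration, and percent-encoding of special characters is ignored.

```python
CPE_FIELDS = ("part", "vendor", "product", "version", "update", "edition", "language")


def parse_cpe(name):
    """Split a CPE 2.2 URI such as 'cpe:/a:vendor:product:version' into a dict.

    Trailing fields that are not present are returned as empty strings.
    """
    prefix = "cpe:/"
    if not name.startswith(prefix):
        raise ValueError("not a CPE 2.2 URI: %r" % name)
    values = name[len(prefix):].split(":")
    if len(values) > len(CPE_FIELDS):
        raise ValueError("too many components in %r" % name)
    values += [""] * (len(CPE_FIELDS) - len(values))
    return dict(zip(CPE_FIELDS, values))


if __name__ == "__main__":
    # Illustrative name only; check the official dictionary for real entries.
    print(parse_cpe("cpe:/a:perl:perl:5.8.8"))
```

With names in this shape, "find the advisories for what I have installed" becomes a comparison of fields instead of guesswork on free-form package names.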

To be continued…

Listing files of (not) installed software

Everyone that has been using Gentoo for a while now knows about tools such as qlist that show you the list of files installed by an (installed) package, or qfile that allows you to find which package provided a particular file on your system.

One thing lacking is to be able to find out which package would provide a file. Unlike the previous tools, this tool cannot rely on the information found on your system as the package isn’t installed yet.

There have been projects in the past that attempted to provide such functionality, almost always through an online queryable database. Many haven’t survived, due to overly high expectations or limited server infrastructure resources. But it seems like PortageFileList is here to stay for a while.

The project not only offers an online interface for querying information, it also provides a package (app-portage/pfl) that allows you to query their infrastructure from the command line. The package provides a tool called e-file which supports SQL-like syntax for the queries.


~$ e-file '%bin/xdm'

The above command will then display, using the well-known emerge/Portage output, which package provides the file (as well as which file was matched by the query).
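For readers who have not met SQL-style wildcards: in the query above, % stands for "any run of characters", so '%bin/xdm' matches every path that ends in bin/xdm. The Python sketch below mimics that matching locally. It is not how e-file works (the real lookup happens on the PortageFileList servers), and it handles only the % wildcard.

```python
import re


def like_to_regex(pattern):
    """Compile a pattern where % means 'any run of characters' into a regex."""
    literal_chunks = (re.escape(chunk) for chunk in pattern.split("%"))
    return re.compile("^" + ".*".join(literal_chunks) + "$")


if __name__ == "__main__":
    matcher = like_to_regex("%bin/xdm")
    for path in ("/usr/bin/xdm", "/usr/bin/xdm-config", "/etc/X11/xdm/Xsession"):
        print(path, bool(matcher.match(path)))
```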

Definitely a nice tool to have around. Thanks guys of PortageFileList!

Posts for Friday, June 4, 2010

Configuring Repositories Automatically via RepositoryRepository

Paludis is aware of packages that are in repositories you don’t have configured thanks to the unavailable repository. However, once Paludis has shown you that the package you want is in a repository you don’t have configured, you need to set up a configuration file for that repository (and any repositories it requires) and then sync. This is more work than is really necessary.

Enter RepositoryRepository, also known as r^2. Conceptually, it works as follows:

As well as providing special packages for packages in unavailable repositories, the unavailable repository also now provides packages named ‘repository/blah’ for repositories you don’t have configured. The metadata for these packages includes dependency information etc, along with useful things like the repository’s sync URI.

A new repository, using format = repository, provides special packages for repositories you do have configured.

Repository packages in unavailable repositories can be ‘installed’ to repository repositories. ‘Installing’ a repository creates a configuration file for it, and then syncs the newly created repository.

The configuration file it creates is controlled by a simple template, so it can contain anything you want it to contain.
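The "install a repository" step amounts to filling in a template from the package's metadata and writing the result where the package manager looks for repository configuration. The Python sketch below shows only that idea. It uses Python's own string.Template syntax and not Paludis's template format, and the function name, the metadata keys, the paths and the example sync URI are all hypothetical.

```python
import os
from string import Template

# Hypothetical template; a real one can contain whatever the user wants.
REPO_TEMPLATE = Template(
    "format = $format\n"
    "location = $location\n"
    "sync = $sync\n"
)


def render_repository_config(name, metadata, repo_root="/var/db/paludis/repositories"):
    """Return (config file name, config text) for the repository called name."""
    text = REPO_TEMPLATE.substitute(
        format=metadata["format"],
        location=os.path.join(repo_root, name),
        sync=metadata["sync"],
    )
    return name + ".conf", text


if __name__ == "__main__":
    filename, text = render_repository_config(
        "blah", {"format": "e", "sync": "git://example.org/blah.git"}
    )
    print(filename)
    print(text)
```

After writing the file, the remaining step is an ordinary sync of the newly configured repository.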

Exherbo users can follow the setup instructions to start using this. On Gentoo this functionality is not yet available, since we won’t be switching the generated unavailable data to the new format until we’re reasonably sure that everyone is using a Paludis release that supports it.



Can't win here

In the recent election, some of the parties put charts into their literature. In this post I analyse their accuracy.

Last month in the UK we had a general election, in which each little area (called a 'constituency') elects a Member of Parliament to represent it. During the election period, I received a large amount of A4 paper from each of the various parties. I threw out hundreds of them, but I still managed to find a representative sample of them lying in my hall.

A couple of them are leaflets for the neighbouring constituency! These are a complete waste of time.

Many of the leaflets share an interesting feature, little bar graphs or pie charts.

This is a commonly used tactic of the Liberal Democrat Party, to depict themselves as the second candidate in the local area, as opposed to their national profile as the third party. Here are some leaflets from the Lib Dem candidate in my area, showing these pictures.

http://commandline.org.uk/images/posts/hallgreen/numbers_libdems.jpg

The bar charts are a not very subtle appeal to people who would otherwise vote for the Conservative party. The argument is churlish and not in the British sporting tradition of fair play; candidates should campaign on their own merits.

However, it is not inaccurate from a demographic viewpoint. My area mainly consists of white working class workers, public sector workers and a growing Asian population. None of these groups are disposed to vote Conservative, and in the vote, the Conservatives came a pathetic fourth.

Not that it stopped the Conservative candidate making her own graph in her literature. I found it hilarious so I will zoom in and show it in its full glory. It is dodgy (even dishonest) on several levels.

http://commandline.org.uk/images/posts/hallgreen/numbers_tories.jpg

Firstly, it is not even a general election result, it is showing a local council by-election. Councillors have an important role collecting rubbish and other local issues, but it is not the same thing at all.

Secondly, as a council by-election, it covers a much smaller area than the general election constituency. So there is no way to tell how the larger area will vote based on the east end (Sparkbrook) of the constituency. This is especially the case because Sparkbrook, as the Balti capital of the world, is a majority Asian area, whereas the other parts of the constituency have a more dispersed ethnic population. Anyway, here are the results in that 2009 council by-election:

Name              Party              Votes
Ali Shokat        Respect             2495
Mohammed Azim     Labour              2228
Abdul Kadir       Conservative         799
Naeem Qureshi     Liberal Democrats    506
Charles Alldrick  Green                213
Sakander Mahmood  Independent           55

So we can see here that the Conservatives came third, not even getting a third of the votes of the winning candidate. Therefore the most dishonest feature of the graph is showing the change in vote without supplying the total vote. When armed with the figures above, we can see that the Conservative vote went up by a few dozen votes and that they started from a very low base indeed.
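The absolute numbers are easy to check. The short Python sketch below takes the by-election figures from the table above and works out each party's share of the total vote, plus the Conservative vote as a fraction of the winner's. The variable names are mine, and nothing beyond the six vote counts is assumed.

```python
# 2009 council by-election results, copied from the table above.
by_election_2009 = {
    "Respect": 2495,
    "Labour": 2228,
    "Conservative": 799,
    "Liberal Democrats": 506,
    "Green": 213,
    "Independent": 55,
}

total_votes = sum(by_election_2009.values())
winning_votes = max(by_election_2009.values())
conservative_vs_winner = by_election_2009["Conservative"] / winning_votes

# Parties ordered from most to fewest votes.
ranking = sorted(by_election_2009, key=by_election_2009.get, reverse=True)

for party in ranking:
    votes = by_election_2009[party]
    print(f"{party:<18} {votes:>5} {votes / total_votes:6.1%}")
print(f"Conservative vote as a fraction of the winner's: {conservative_vs_winner:.2f}")
```

A chart that showed these shares, and not just the change since the previous vote, would have told a very different story.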

The dishonesty is compounded by the annotations on the chart. It says "Can't win here!" pointing to Green, Labour and Respect. Even if we ignore my caveats about this data being irrelevant for the general election, as you can see from the absolute numbers from the by-election result, Respect and Labour were the top two results, so they could in fact win here.

This is further proved by looking at the actual vote last month:

Name             Party              Votes
Roger Godsiff    Labour             16,039
Salma Yaqoob     Respect            12,240
Jerry Evans      Liberal Democrats  11,988
Jo Barker        Conservative        7,320
Alan Blumenthal  UKIP                  950
Andrew Gardner   Independent           190

So the Conservative chart looks extremely selective and misleading as Labour did in fact win the seat and Respect came second. Many media commentators were predicting a possible Respect victory.

The Respect party also had a chart. It is a well presented chart, showing the results of the previous general election in 2005, so it is not a complete fiction like the Conservative chart.

http://commandline.org.uk/images/posts/hallgreen/numbers_respect.jpg

The annotation is also much more positive, instead of "someone else can't win here", it is the rousing "She can do it!". Unlike the Conservative annotation, it was not a lie; Yaqoob had a real chance.

The Respect party was actually founded by George Galloway, a Scottish Catholic; however, locally in the campaign, Respect was perceived as the Muslim party, which limited its appeal among the indigenous population, and probably cost Respect the seat. However, demographic changes are on Respect's side, so they may well get over the finishing line next time when many new Asian voters have come of age, assuming that the party does not run out of steam, causing Asian voters to return to the Labour party.

The chart is not without a problem. It is not the correct constituency. It is actually a neighbouring constituency. This is perhaps unavoidable because the constituency that Yaqoob ran for in 2005 was abolished due to boundary changes, and this current constituency did not have a Respect representative in 2005. However, I think the chart should have put a little note pointing out that the boundaries have changed and this is the nearest relevant result.

Lastly we are onto Labour's literature:

http://commandline.org.uk/images/posts/hallgreen/numbers_labour.jpg

As you can see, no chart. (The independent candidate's leaflet does not have a chart either.) Instead, on the Labour leaflet, there are pictures of the incumbent meeting people and a record of how he voted. For the incumbent candidate, especially when his/her party is doing worse nationally than the candidate is locally, there is less point to a chart like the type above. However, an accurate and honest chart would help potential voters to understand the context behind any dodgy charts, such as the one in the Conservative literature.

Posts for Tuesday, June 1, 2010

Discussions

Whenever people are asked why they blog, most people will answer that it is for the comments and discussions that emerge from well-written blog posts. This idea has been picked up by people who write “guides” on how to write for a successful blog and is usually phrased something like this:

Finish your post with a direct question to your readers to get the discussion started.

If you look at this blog you will see that most posts do not generate any comments at all. Some people might argue that this stems from me writing mostly crappy posts and for some they might actually be right, but I’m not always that bad so there’s gotta be a different reason for this.

This actually ties in with a complaint my girlfriend sometimes confronted me with: The fact that I don’t ask a lot.

Not asking many questions might sound weird, especially since I do in fact consider myself to be a rather curious person interested in basically everything, but it is true to a certain extent. Especially when I talk or write about something I do not ask for other people’s opinion, which is often perceived as lack of interest when it is something slightly different.

I want to hear your input. Almost every human being is interesting and has something interesting to add and I hate to miss out on that kind of stuff. But on the other hand I hate stupidity.

There’s the saying that there are no stupid questions which is absolute bullshit. Yes there are. Asking for something the speaker explained just 10 seconds ago is stupid (and shows lack of interest) for example. And just as there are stupid questions there are many stupid answers and comments, comments which I just don’t care about. I don’t want this blog to be full of “Yes, me, too” or “I dunno” comments, I don’t actually want to make commenting too “easy”.

The ease of commenting has something to do with the software used here (which makes commenting really easy if you know how to type a name and an email address) and with a … let’s call it a social barrier.

If I ended every post asking for everybody’s input I’d make commenting really easy. I’d roll out the red carpet and ask you to please come in. That’s not the modus operandi here.

I am thankful for everybody who comes here and reads my stuff (or who reads this via the feed), I appreciate the time you take out of your busy and short lives to spend it on this. That is the reason I put all this stuff out here under free licenses. What I don’t care about is a random count of comments.

When it comes to comments I care about quality, I care about people seriously challenging me and the ideas or thoughts I put forward. I’d rather have 3 posts without any comments and one with somebody seriously taking the time to challenge me than having 20 comments on every post.

I mean, it’s not like everything here is pure and utter intellectual brilliance, some posts are just retarded fun, some are just a quick linkdump or an embedded video. Why should you comment here on a video I embedded from youtube? Why not go to the youtube video instead? But there are some articles here that I did put some thought and some time in. Posts that are important to me. And I’d hate to have a possible discussion watered down.

I am similar in real life, I don’t ask for your input when I talk a lot. That does not mean that I don’t care, it’s just that I don’t make it ridiculously easy for you. I put thoughts into what I write and say. Sometimes my opinions and ideas might even be controversial, might even get me into trouble. That’s a risk I take. So when you comment here I make you take some risk as well. I don’t prestructure the discussion with questions. I don’t open the door and roll out the red carpet.

But if you come and just get into the discussion (in real life just as virtually) I enjoy every minute of it. Yes, there is no red carpet, but if you manage to find the courage to come and open the door, you’ll find a seat at the table reserved just for you.

Posts for Friday, May 28, 2010

But First, Write No Code

Something I see often in person and online is programmers constantly implementing common solutions, reinventing wheels, or embracing NIH.

Before you do this, please consider Kev009’s Oath: “But First, Write No Code”.  This is a solution to a variety of problems in software development, but today’s article is specifically on using external code.

I’ve found that programmers who follow a system similar to mine (detailed below) develop systems that are more stable, maintainable, and sane.  They likely write better code because it means they understand their tools and also read others’ code.  They examine the problem first rather than going in guns blazing.

Steps to decide whether to use an existing solution or write your own implementation:

  1. Scan the area. Google, Freshmeat, SourceForge, standard library, OS libraries, etc. are your friend.  See if the problem you are trying to solve has been solved.  I don’t care how long you’ve been programming or how much you think you know. The ecosystem of a language is constantly changing.  Make a list of hits that look similar to the problem you are trying to solve.  Try and get a quick sense of the idiomatic methods of using your language, OS, etc.
  2. Do research. Are the solutions you found in step 1 suitable to the problem at hand?  Consider the pros and cons of each item.  Now, carefully evaluate how idiomatic the items are to your language and environment.  If the item is open source, does the community seem active?  If it doesn’t fully map to your problem, does it look like you can modify it to do so?

    Even if you end up developing a solution from scratch, you should at least now have some good references.  Keep in mind, extending an existing project may be considerably less work.  You might even be able to offload maintenance of that component.

  3. Consider the license. This isn’t just for the legal department.  What kind of project you are working on will weigh in heavily.  Commercial or open source?  As a software professional, you need to keep abreast of the various licenses in the wild.  As an open source developer, you need to consider how licenses will affect your work being packaged by distributions.  An open source library licensed under the GPL is not acceptable for static linking to commercial software.  However, you may be able to link to an operating system provided copy or bundle the dynamic library with your application, although the FSF considers dynamic linking to be covered as well.  The LGPL does not have this restriction.  With both of these, you must supply your changes upon request from end users among other things.

    BSD, MIT, and Apache style licenses allow you to make changes and redistribute under completely different licenses.  Some just want credit in your documentation.  These are very compelling even in commercial development.

    Commercial components may have a per-copy fee associated which may dissuade their use by your organization.  If you don’t get the source, you won’t be able to effectively change or maintain it so you will also be at mercy of that developer.

  4. Make a decision. By now, your list should have been pared down based on licenses and research.  Perform extensive evaluations of the remainder and eventually home in on the one you think fits best.  You’re going to have to rely on your experience and intuition while making the critical decision.  Perhaps the hardest part:  weighing it against a mythical home-grown solution in your mind.
  5. Implement the decision. Self explanatory.  This either means bootstrapping your own project or fully integrating the external one.  If you are extending an open source solution, consider submitting the patches back to the community for feedback and perhaps integration.  If you are bootstrapping your own solution, you’ve got your work cut out.  Is this only suitable for an internal project, or perhaps it would have its own merit as a new open source project?  Be sure to reevaluate early and often.  That library you chose might turn out to be a can of worms, just as the “easy” new solution you had in your head might require years of development.
  6. Subscribe to the announce mailing list. Only if you used an external solution. Does the project have an RSS feed for releases or a low volume announcement list?  Don’t be like Adobe.  Avoid embarrassing security problems.  Also consider how enhancements and bug fixes to the external project might make your own project better, more stable, and more efficient.  This is where the real lasting dividends of using an external solution come from.

This list is widely applicable.  You’ve got a seriously high bar to reach if you are developing containers of <T>, sorting methods, GUI frameworks, parsers, text and binary file formats, and much more so try and follow it the next time you code.


Posts for Monday, May 24, 2010

scanning for base64_decode references

A friend’s site was recently hit by the massive infections/hacks on Dreamhost’s servers, so I decided to do some scanning on some servers that I administrate for base64_decode references.

The simple command I used to find suspect files was:
# find . -name \*.php -exec grep -l "eval(base64_decode" {} \;

The results could be sorted in just 2 categories. Malware and stupidity. There was no base64_decode reference that did something useful in any possible way.
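For anyone who prefers to run the same scan from a script (for example to post-process the hits), here is a rough Python equivalent of the find/grep one-liner above. The function name is mine, and like the one-liner it only looks for the literal string in files ending in .php.

```python
import os

NEEDLE = b"eval(base64_decode"


def find_suspect_files(root):
    """Return sorted paths of *.php files under root that contain NEEDLE."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            if not filename.endswith(".php"):
                continue
            path = os.path.join(dirpath, filename)
            try:
                with open(path, "rb") as handle:
                    if NEEDLE in handle.read():
                        hits.append(path)
            except OSError:
                # Unreadable file: skip it and keep scanning.
                continue
    return sorted(hits)
```

Calling find_suspect_files(".") from the web root gives the same list of files as the shell command. Obfuscated variants (extra whitespace, string concatenation of the function name) slip past both.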

The best malware I found was a slightly modified version of the c99 php shell on a hacked joomla installation (the site has been hacked multiple times but the client insists on just re-installing the same joomla installation over and over and always wonders how the hell do they find him and hack him…oh well). c99 is impressive though…excellent work. I won’t post the c99 shell here…google it, you can even find infected sites running it and you can “play” with them if you like…

And now comes the good part, stupidity.
My favorite php code containing a base64_decode reference that I found:

<code2>$hash  = 'aW5jbHVkZSgnLi4vLi';
$hash .= '4vaW5jX2NvbmYvY29u';
$hash .= 'Zi5pbmMucGhwJyk7aW';
$hash .= '5jbHVkZSgnLi4vLi4v';
$hash .= 'aW5jX2xpYi9kZWZhdW';
$hash .= 'x0LmluYy5waHAnKTtl';
$hash .= 'Y2hvICRwaHB3Y21zWy';
$hash .= 'd2ZXJzaW9uJ107';
eval(base64_decode($hash));
</code2>

Let’s see what this little diamond does:

<code2>
% base64 -d 
aW5jbHVkZSgnLi4vLi4vaW5jX2NvbmYvY29uZi5pbmMucGhwJyk7aW5jbHVkZSgnLi4vLi4vaW5jX2xpYi9kZWZhdWx0LmluYy5waHAnKTtlY2hvICRwaHB3Y21zWyd2ZXJzaW9uJ107
include('../../inc_conf/conf.inc.php');include('../../inc_lib/default.inc.php');echo $phpwcms['version'];
</code2>

So this guy used a series of strings which together form a base64 encoded string, in order to prevent someone from changing the version tag of his software. That’s not software, that’s crapware. Hiding the code where the version string appears? That’s how you protect your software? COME OOOOON….
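When you find something like this, decode it without ever executing it. The Python sketch below joins the string fragments from the PHP snippet above and prints the decoded payload as plain text. Only the standard library is used, and the variable names are mine.

```python
import base64

# The fragments exactly as they appear in the PHP snippet above.
chunks = [
    "aW5jbHVkZSgnLi4vLi",
    "4vaW5jX2NvbmYvY29u",
    "Zi5pbmMucGhwJyk7aW",
    "5jbHVkZSgnLi4vLi4v",
    "aW5jX2xpYi9kZWZhdW",
    "x0LmluYy5waHAnKTtl",
    "Y2hvICRwaHB3Y21zWy",
    "d2ZXJzaW9uJ107",
]

payload = "".join(chunks)
# Decode to text only; nothing here evaluates the result.
decoded = base64.b64decode(payload).decode("ascii")
print(decoded)
```

This gives the same result as the base64 -d call, but it works directly on the fragments, so nothing has to be pasted together by hand.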

Runtime Type Checking in C++ without RTTI

A technique I always seem to forget is how to map C++ types to an integer without relying upon RTTI. A variation on this is used in <locale> in the standard library, for std::use_facet<>. But let’s take a much simpler, and highly contrived, example.

Let’s say we’ve got some values of different types, and we want to give those types to a library to store somewhere, and then we later want to get them back again. Crucially, the library itself doesn’t know anything about the types in question. So, for a very simple case:

#include <vector>
#include <iostream>
#include <string>

int main(int, char *[])
{
    std::vector<Something> things = { std::string("foo"), 123 };
    /* ... */
    std::cout << things[0].as<std::string>() << " " << things[1].as<int>() << std::endl;
}

Note the gratuitous use of C++0x initialiser lists, just because we can.

Those familiar with Boost might think that Something is like boost::any. However, boost::any uses RTTI, which is slow and completely unnecessary.

A first implementation of Something might look like this:

#include <memory>

class Something
{
    private:
        struct SomethingValueBase
        {
        };

        template <typename T_>
        struct SomethingValue :
            SomethingValueBase
        {
            T_ value;

            SomethingValue(const T_ & v) :
                value(v)
            {
            }
        };

        std::shared_ptr<SomethingValueBase> _value;

    public:
        template <typename T_>
        Something(const T_ & t) :
            _value(new SomethingValue<T_>(t))
        {
        }

        template <typename T_>
        const T_ & as() const
        {
            return static_cast<const SomethingValue<T_> &>(*_value).value;
        }
};

This works, but has a major flaw: if you get the types wrong when calling Something.as<>, you’ll get a segfault or something similarly horrible. We’d like to replace that with something safer.

One way to do it is to use runtime type information. The simplest variation on this is to replace the static_cast with a dynamic_cast. However, we can only do this if SomethingValueBase is a polymorphic type, which it isn’t. We can make it so by adding in a virtual destructor:

#include <memory>

class Something
{
    private:
        struct SomethingValueBase
        {
            virtual ~SomethingValueBase()
            {
            }
        };

        template <typename T_>
        struct SomethingValue :
            SomethingValueBase
        {
            T_ value;

            SomethingValue(const T_ & v) :
                value(v)
            {
            }
        };

        std::shared_ptr<SomethingValueBase> _value;

    public:
        template <typename T_>
        Something(const T_ & t) :
            _value(new SomethingValue<T_>(t))
        {
        }

        template <typename T_>
        const T_ & as() const
        {
            return dynamic_cast<const SomethingValue<T_> &>(*_value).value;
        }
};

Now, if we get the types wrong, a std::bad_cast will be thrown. Alternatively, we can use our own exception type:

class SomethingIsSomethingElse
{
};

class Something
{
    /* snip */

    public:
        template <typename T_>
        const T_ & as() const
        {
            auto value_casted(dynamic_cast<const SomethingValue<T_> *>(_value.get()));
            if (! value_casted)
                throw SomethingIsSomethingElse();
            return value_casted->value;
        }
};

We can also make use of std::dynamic_pointer_cast, which is possibly slightly less ugly syntactically:

class Something
{
    /* snip */

    public:
        template <typename T_>
        const T_ & as() const
        {
            auto value_casted(std::dynamic_pointer_cast<const SomethingValue<T_> >(_value));
            if (! value_casted)
                throw SomethingIsSomethingElse();
            return value_casted->value;
        }
};

All of this is using RTTI, though, and RTTI is a huge amount of overkill for what we need. Before eliminating the RTTI, though, we’ll switch to using it in a different way:

#include <memory>
#include <string>
#include <typeinfo>

class Something
{
    private:
        template <typename T_>
        struct SomethingValueType
        {
            virtual ~SomethingValueType()
            {
            }
        };

        struct SomethingValueBase
        {
            std::string type_info_name;

            SomethingValueBase(const std::string & t) :
                type_info_name(t)
            {
            }
        };

        template <typename T_>
        struct SomethingValue :
            SomethingValueBase
        {
            T_ value;

            SomethingValue(const T_ & v) :
                SomethingValueBase(typeid(SomethingValueType<T_>()).name()),
                value(v)
            {
            }
        };

        std::shared_ptr<SomethingValueBase> _value;

    public:
        template <typename T_>
        Something(const T_ & t) :
            _value(new SomethingValue<T_>(t))
        {
        }

        template <typename T_>
        const T_ & as() const
        {
            if (typeid(SomethingValueType<T_>()).name() != _value->type_info_name)
                throw SomethingIsSomethingElse();
            return std::static_pointer_cast<const SomethingValue<T_> >(_value)->value;
        }
};

Here we make use of typeid explicitly, which is widely considered to be about on par with use of goto. However, it paves the way for our next step. Can we replace typeid(SomethingValueType<T_>()).name() with a different, non-evil expression? Let’s think about what properties the result of that expression must have:

  • We must be able to store it, so it needs to be a regular type.
  • We must be able to compare values of it, and be guaranteed true if and only if the two types used to create the value are the same, and false if and only if they are different. (Note that RTTI doesn’t even provide this guarantee.)

Let’s try this:

#include <memory>
#include <string>

class SomethingIsSomethingElse
{
};

template <typename T_>
struct SomethingTypeTraits;

class Something
{
    private:
        struct SomethingValueBase
        {
            int magic_number;

            SomethingValueBase(const int m) :
                magic_number(m)
            {
            }

            virtual ~SomethingValueBase()
            {
            }
        };

        template <typename T_>
        struct SomethingValue :
            SomethingValueBase
        {
            T_ value;

            SomethingValue(const T_ & v) :
                SomethingValueBase(SomethingTypeTraits<T_>::magic_number),
                value(v)
            {
            }
        };

        std::shared_ptr<SomethingValueBase> _value;

    public:
        template <typename T_>
        Something(const T_ & t) :
            _value(new SomethingValue<T_>(t))
        {
        }

        template <typename T_>
        const T_ & as() const
        {
            if (SomethingTypeTraits<T_>::magic_number != _value->magic_number)
                throw SomethingIsSomethingElse();
            return std::static_pointer_cast<const SomethingValue<T_> >(_value)->value;
        }
};

Now, our library user has to provide specialisations of SomethingTypeTraits for every type they wish to use:

#include <string>
#include <iostream>
#include <vector>

template <>
struct SomethingTypeTraits<int>
{
    enum { magic_number = 1 };
};

template <>
struct SomethingTypeTraits<std::string>
{
    enum { magic_number = 2 };
};

int main(int, char *[])
{
    std::vector<Something> things = { std::string("foo"), 123 };
    std::cout << things[0].as<std::string>() << " " << things[1].as<int>() << std::endl;
}

No RTTI at all there, and it is type safe, but it relies upon a lot of boilerplate from the library user, and that boilerplate is very easy to screw up. So, we’ll allocate magic numbers automatically instead:

#include <memory>

class SomethingIsSomethingElse
{
};

class Something
{
    private:
        static int next_magic_number()
        {
            static int magic(0);
            return magic++;
        }

        template <typename T_>
        static int magic_number_for()
        {
            static int result(next_magic_number());
            return result;
        }

        struct SomethingValueBase
        {
            int magic_number;

            SomethingValueBase(const int m) :
                magic_number(m)
            {
            }

            virtual ~SomethingValueBase()
            {
            }
        };

        template <typename T_>
        struct SomethingValue :
            SomethingValueBase
        {
            T_ value;

            SomethingValue(const T_ & v) :
                SomethingValueBase(magic_number_for<T_>()),
                value(v)
            {
            }
        };

        std::shared_ptr<SomethingValueBase> _value;

    public:
        template <typename T_>
        Something(const T_ & t) :
            _value(new SomethingValue<T_>(t))
        {
        }

        template <typename T_>
        const T_ & as() const
        {
            if (magic_number_for<T_>() != _value->magic_number)
                throw SomethingIsSomethingElse();
            return std::static_pointer_cast<const SomethingValue<T_> >(_value)->value;
        }
};

How does this work? Each instantiation of the magic_number_for<T_> function needs to return the same magic number every time it is called. The first time any particular instantiation is called, its static int result requests the next magic number. On subsequent calls, the allocated number is remembered. (Note that static values inside a template are not shared between different instantiations of that template.) Finally, next_magic_number just returns a new magic number every time it is called.

And there we have it: fast runtime type checking with no boilerplate and no RTTI. What we’ve done here is more or less useless, but the techniques do have other applications. For the curious, std::use_facet<> is probably the most common, and anyone brave enough to delve into its design will eventually see why this isn’t either pointless wankery or reinventing the wheel. For the rest, if you think that using RTTI can solve your problem adequately, then it probably can, and you don’t need to go into the kind of devious trickery the standard library uses internally.


Filed under: c++ Tagged: c++, rtti

Posts for Sunday, May 23, 2010

Let us contemplate existence

In my last few posts (1, 2), I followed my readers' advice and have been reviewing the book "Programming the Semantic Web" published by O'Reilly. The full reference is below:

"Programming the Semantic Web by Toby Segaran, Colin Evans, and Jamie Taylor. Copyright 2009 Toby Segaran, Colin Evans, and Jamie Taylor, 978-0-596-15381-6."

Now we pick up at chapter 6, which deals with ontologies and is the reason I started working through this book. So without further ado, let's jump back in:

"The Web Ontology Language (OWL) is an RDF language developed by the W3C for defining classes and properties, and also for enabling more powerful reasoning and inference over relationships." (page 135)

The chapter explains the main classes in OWL. owl:Thing is a superclass for every other class (page 136), like 'object' in Python's new-style class syntax. The chapter also outlines owl:Class, owl:DatatypeProperty, owl:ObjectProperty and rdf:XMLLiteral, which you might be able to figure out from the names.

The chapter then outlines the following properties: rdf:type, rdfs:subClassOf, rdfs:domain and rdfs:range. Then on pages 137-140, the chapter defines a schema for films in OWL using RDFLib. The book is worth it just for this.

http://commandline.org.uk/images/posts/semantic/bladerunner2.jpg

Using this schema we record the information that Harrison Ford played Rick Deckard in Blade Runner, directed by Ridley Scott. I have made a visualisation of that using GraphViz (as introduced in a previous chapter); you can download the image here and examine it in whatever image viewer your computer has. What a lot of noise for those few simple facts! But that is what it takes to get the applications to understand the semantics.

The chapter then moves on to look at the GUI programme Protégé. I have already been introduced to this by Peter, who is a big fan. Protege (I am bored with the accents already) is a Java program, so it will run fine on any system with Java (i.e. almost all of them). The chapter works through the GUI features of Protege in a matter-of-fact way, building up an ontology.

The approach the chapter outlines is to develop your ontology using Protege and then load the data using scripts and programs rather than using the GUI. On page 145, the chapter loads the ontology created using Protege into RDFLib by creating an instance of the ConjunctiveGraph class and then using the 'load' method.

Going back to the post before I started working through this book, namely Headfirst into the Semantic Web, this seemed to be a simple approach to the work I need to do. However, there are many other packages outlined later in the book, so I may change my mind.

On pages 146-147, the chapter goes back to some OWL theory, looking at 'Functional and Inverse Functional Properties', 'Inverse Properties' and 'Disjoint Classes'. The chapter then points out some ontologies available online that are worth examining (pages 148-149). Lastly, the chapter works through an ontology for beer (pages 149-151). That is an appropriate place to end; I might grab a cold ale myself.


Still swimming in the Semantic Web

In my last post, I followed my readers' advice and checked out the book "Programming the Semantic Web" published by O'Reilly. The full reference is below:

"Programming the Semantic Web by Toby Segaran, Colin Evans, and Jamie Taylor. Copyright 2009 Toby Segaran, Colin Evans, and Jamie Taylor, 978-0-596-15381-6."

I stopped in the middle of chapter 3; in this post we keep going with the review. The book tells us that:

"Inference is the process of deriving new information from information you already have." (page 43)

For example, you might have one piece of information, then download a second from the web, and then from these two pieces of information, derive a third. One of the examples given in the chapter is "If I know a restaurant's address, I can use a geocoder to find its coordinates on a map" (page 43).

The chapter goes on to work out which restaurants in Washington DC are likely to be touristy. It does that by working out which restaurants are near a tourist attraction and are at the same time cheap. It uses this example to explain how inference rules can chain together to generate new information:

"What's important to realize here is that the rules exist totally independently. Although we ran the three rules in sequence, they weren't aware of each other - they just looked to see if there were any triples that they knew how to deal with and then created new ones based on those. These rules can be run continuously—even from different machines that have access to the same triplestore - and still work properly, and new rules can be added at any time." (page 49)

The chapter then looks at merging graphs together, allowing queries across data from different sources. The chapter ends with some fun: we get to generate graphic visualisations with the program graphviz (which I discovered I already had on my system).

http://commandline.org.uk/images/posts/semantic/swimming.jpg

Image by eteela used with permission.

Chapter 4 dives straight into RDF. In RDF, everything is a resource, identified by a URI (page 65). A URI does not have to be retrievable as a URL, though to aid uniqueness, it is a convention to use a hostname that you control as the first part of the URI. RDF allows the use of a blank node for situations where you do not know the URI (page 67); these are given an arbitrary ID starting with an underscore and a colon (_:).

RDF can be expressed in different serialization formats (page 69). The chapter demonstrates these using an established RDF vocabulary, Friend of a Friend (FOAF), as the primary example.

The first RDF serialization format covered is N-Triples, a series of statements, each one "containing a subject, predicate, and object followed by a dot" (page 71). N3 is very similar to N-Triples but various shorthands are introduced to remove redundancies (page 72).

The XML representation of RDF, which is perhaps what most people think of as RDF, is covered next (pages 73-76). Lastly comes RDFa, where XML attributes are added to XHTML tags, allowing one document to hold both the human-readable and the machine-readable content (page 76). The extra XML attributes "specify the semantics behind the information that is displayed" (page 76).

Chapter 4 then leaves behind the simplified tools from the previous chapters and breaks out RDFLib and SPARQL. SPARQL is a query language for RDF graphs. It is read-only, and if you (dis)like SQL then you will equally (dis)like SPARQL. The chapter uses RDFLib to demonstrate this. I might cover RDFLib myself in a future post, so I will skip talking about it for now. Briefly, it seems a really useful library and the first port of call for dealing with all things RDF in Python.

Chapter 5 is "source of semantic data", which seems to have extensive examples of the work covered so far. I skipped this chapter for now so that I could press straight on to chapter 6 about ontologies.

As I have been going along, I have been trying out all the examples, and there have been several small errors in the code, especially inconsistently named references and files. This is not supposed to be a book aimed just at Python users, this is a general book for anyone, with the examples just happening to be in Python. Therefore there could be a little more proofreading, i.e. user testing, of the examples.

This did not spoil it for me; as a more experienced Python programmer, I could just fix the examples as I went. So far I have found the book really engaging and useful, and I am very keen to read the rest.


Posts for Friday, May 21, 2010

Emacs: Yank lines as lines

One thing nice about Vim is manipulating whole lines at a time. dd deletes a line (including trailing newline), regardless of where the cursor is on the line. Then, p puts that line (with its newline) as a new line after the current line, and P puts it above the current line, again regardless of where your cursor is at the moment. (It also jumps the cursor to the beginning of the text you just inserted, which is nice.)

Emacs has kill-whole-line (C-S-Backspace) which is like Vim's dd. But I didn't find an equivalent of p and P. So here's my version:

(defun yank-with-newline ()
  "Yank, appending a newline if the yanked text doesn't end with one."
  (yank)
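  ;; "\n\\'" matches a newline only at the very end of the string, whereas
  ;; "\n$" also matches a newline that is followed by another newline.
  (unless (string-match "\n\\'" (current-kill 0))
    (newline-and-indent)))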

(defun yank-as-line-above ()
  "Yank text as a new line above the current line.

Also moves point to the beginning of the text you just yanked."
  (interactive)
  (let ((lnum (line-number-at-pos (point))))
    (beginning-of-line)
    (yank-with-newline)
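    ;; goto-line is meant for interactive use; this is the Lisp equivalent.
    (goto-char (point-min))
    (forward-line (1- lnum))))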

(defun yank-as-line-below ()
  "Yank text as a new line below the current line.

Also moves point to the beginning of the text you just yanked."
  (interactive)
  (let* ((lnum (line-number-at-pos (point)))
         (lnum (if (eobp) lnum (1+ lnum))))
    (if (and (eobp) (not (bolp)))
        (newline-and-indent)
      (forward-line 1))
    (yank-with-newline)
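    ;; goto-line is meant for interactive use; this is the Lisp equivalent.
    (goto-char (point-min))
    (forward-line (1- lnum))))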

(global-set-key "\M-P" 'yank-as-line-above)
(global-set-key "\M-p" 'yank-as-line-below)

Just one more step along the path to Vimmify my Emacs setup. Emacs has some weird edge cases because you can move the cursor one "line" past the last real line in the file. But I think I worked out something comfortable for myself.

PS: I've written about this before, but if you use C-S-Backspace a lot in Emacs on Linux, I highly recommend putting this into your X11 config:

Option "DontZap" "True"

It's really easy to mix up C-S-Backspace and C-M-Backspace (the latter of which kills your X server). It's not fun to mix those up. Not fun at all.

PPS: This thread on Stack Overflow has some Emacs equivalents of Vim's o and O which are pretty nice too.


Playing a song as a background process in Windows

Sometimes you ask yourself how to do cool things like playing a song in the background (i.e. no visible interface or application) upon login on a Windows box. Being completely unfamiliar with DOS, I wasn’t quite sure how to go about doing this, but apparently it was quite easy. So here I am documenting it for future "reference". This marks my very first time touching the DOS prompt and indeed any sort of commands in Windows, so please excuse the newbie-format of this post.

Everything is done via the CLI for obvious reasons – we don’t want any interface for them to turn off our song. So we need a command line music player. mplayer is also available as a command line player on Windows, and so it was my first choice. A quick download of a build without an interface and we were ready to play any song with a *.bat file containing `mplayer "music.mp3"`

The next step is to make it run without the prompt opening up. This is again easily done by executing the bat file via a vbs file with the following content. Creating a shortcut to this vbs file and dumping it in your startup folder is the simplest and most obvious way to make it play on login. Here’s the code:

Set WshShell = CreateObject("WScript.Shell")
WshShell.Run chr(34) & "C:\path\to\my\bat\file.bat" & Chr(34), 0
Set WshShell = Nothing

Now I wanted to be able to change this song whenever I wanted from a central server. Basically it would check whether or not it needs to update the song, and if it does, delete the existing song and download the new song. This is useful to give a little variety in our fun little player. Some things didn’t work quite as I wanted them to, so I have probably used the most horrendous of hacks based on what I could garner from various online references.

First I needed a way to download files akin to wget. I found a small program called url2file which did just the thing. I wanted it to check whether or not a song existed on the server, and if it did, download it. However the url2file program didn’t quite play nice with that idea (it would download a 404 page instead of allowing me to tell it not to do anything), and I didn’t know how to check whether or not a file existed on a remote server. So instead I had to make do with a second "notifier" file which, if it contained a certain string, would mean that a new song was available to be downloaded.

It would download that plaintext file’s contents to a tmp file, search in that tmp file for the string we were looking for, and if successful, would delete the existing music file and download the new one to take its place. Unfortunately doing a simple `if %getnew%==yes` didn’t work (explanations welcome!), so I made do with checking the first 3 characters, which did work. Here’s the final code, with the getnew.txt file including just the single word "yes".

del tmp
URL2FILE.EXE http://foobar.com/getnew.txt > tmp
set /p getnew= < tmp
set _part_name=%getnew:~0,3%
if %_part_name%==yes del music.mp3
if %_part_name%==yes URL2FILE.EXE http://foobar.com/music.mp3 music.mp3

Tada, it worked flawlessly. Not bad for a couple of hours’ work from scratch, not knowing anything about DOS at all.

In unrelated news, I’m looking for good bagpipe music.


Posts for Thursday, May 20, 2010

C++ Named Function Parameters

C++ doesn’t have named function parameters. In some ways this isn’t a huge deal, since the compiler will usually catch when you screw up the ordering of arguments to a function. But if you’ve got a function accepting multiple arguments of the same type, the compiler isn’t going to save you. So we want to allow something like the following:

shop.populate(
    param::number_of_cheeses() = 0,
    param::number_of_parrots() = 1,
    param::parrot_variety() = "Norwegian Blue"
    );

We also want:

  • As little boilerplate as possible from the programmer.
  • Type safety. It shouldn’t compile if the arguments are wrong.
  • Zero overhead.

It would be nice to allow arguments to be specified in any order, and there is a way of doing that using C++0x, but it’s rather convoluted, so we’ll stick with the requirement that arguments be in the right order for now.

First, we want to work out the type of those param::foo() things. Since we’re using operator=, they need to be structs or constants of some kind (since operator= can only be overloaded as a member function). Since we want lots of them of different types, and since we don’t want to have to worry about declaring the same name multiple times (which means we’d start hitting the ODR), a typedef of a template seems in order. Thus, we’d like to do:

namespace param
{
    typedef Name</* something */> number_of_cheeses;
    typedef Name</* something */> number_of_parrots;
    typedef Name</* something */> parrot_variety;
}

As for the something, the best I’ve been able to come up with is an inline forward declaration of a meaningless struct:

namespace param
{
    typedef Name<struct N_number_of_cheeses> number_of_cheeses;
    typedef Name<struct N_number_of_parrots> number_of_parrots;
    typedef Name<struct N_parrot_variety> parrot_variety;
}

What about the function parameters?

void Shop::populate(
    const NamedValue<param::number_of_cheeses, int> & number_of_cheeses,
    const NamedValue<param::number_of_parrots, int> & number_of_parrots,
    const NamedValue<param::parrot_variety, std::string> & parrot_variety)
{
    /* ... */
}

There’s a small amount of duplication there, but that’s a necessity: it’s considered a useful feature of C and C++ that declarations and implementations of functions can use different names for parameters.

As for using the parameters, we’ve got two options. We could add a super magic cast operator to NamedValue, or we could make it explicit. Since super magic casts have a nasty habit of doing really weird things, we’ll make it explicit using operator():

void Shop::populate(
    const NamedValue<param::number_of_cheeses, int> & number_of_cheeses,
    const NamedValue<param::number_of_parrots, int> & number_of_parrots,
    const NamedValue<param::parrot_variety, std::string> & parrot_variety)
{
    cheeses.resize(number_of_cheeses());
    cage.insert(number_of_parrots(), parrot_variety());
}

Now we just have to make it work. First, NamedValue, remembering to provide const and non-const versions of our operator:

template <typename T_, typename V_>
class NamedValue
{
    private:
        V_ _value;

    public:
        explicit NamedValue(const V_ & v) :
            _value(v)
        {
        }

        V_ & operator() ()
        {
            return _value;
        }

        const V_ & operator() () const
        {
            return _value;
        }
};

Then Name. Our first attempt might look like this:

template <typename T_>
struct Name
{
    template <typename V_>
    NamedValue<Name<T_>, V_> operator= (const V_ & v) const
    {
        return NamedValue<Name<T_>, V_>(v);
    }
};

But there’s a problem: whilst this works for int and most classes, it does something immensely stupid when fed a string literal. V_ is then deduced as a character array type such as char [15], not std::string. We could require users to write out parameters like:

    param::parrot_variety() = std::string("Norwegian Blue")

but that’s rather silly. So instead we’ll add in a way of overriding types for NamedValue, keeping it nice and generic in case any similar situations crop up elsewhere:

template <typename T_>
struct NamedValueType
{
    typedef T_ Type;
};

template <int n>
struct NamedValueType<char [n]>
{
    typedef std::string Type;
};

template <typename T_>
struct Name
{
    template <typename V_>
    NamedValue<Name<T_>, typename NamedValueType<V_>::Type> operator= (const V_ & v) const
    {
        return NamedValue<Name<T_>, typename NamedValueType<V_>::Type>(v);
    }
};

Fortunately, g++ is smart enough to compile all of this into exactly the same code as it would if named parameters weren’t used.

And there we have it: very low-boilerplate, type-safe named parameters with no icky macros.


Filed under: c++ Tagged: c++

Status of Gentoo on MacBook Pro (5,3)

  • ALSA: It supports all the inputs and outputs on the computer. The headphones and speakers get two different volume levels. I find setting headphones to 30% and speakers to 100% works perfectly for me, but every pair of headphones acts differently.
  • Graphics: NVIDIA drivers have been available from the start. No fuss or mess here, especially now that distributions seem to have finally found a way to package them that doesn’t obliterate important X11 libraries.
  • Screen and keyboard brightness work if you install pommed. Since GNOME already recognizes the volume keys, I turned that off in pommed. The Banshee music player (my new favorite) understands the media controls, to my surprise! The eject button is also supported by pommed. As far as standard keyboard buttons go, the Home, End, Page Up, Page Down, Windows, Delete, and Function keys are all accessible via the Fn key, in combination with the left, right, up, down, Command, “delete”/Backspace, and Function keys respectively. (Why OSX doesn’t understand half these keys, and some of them only half the time, I still don’t understand.)
  • The large trackpad works very well. Single-finger click is a left click, two-finger click is a right click, and three-finger click is a middle click. Two-finger scrolling works very well, and can even be turned on and off in the Mouse settings in GNOME. Four-finger scrolling appears to be interpreted as a single finger, but that may be adjustable.
  • Within the last week or so, a new release of the isight-firmware-tools package (1.5.92) just added support for the iSight camera built into this MacBook. I am very happy about that, since that’s one less thing to reboot into OSX for. It still has some small setup required, but it’s a one-liner, so it barely registers on my “todo” list.
  • The wireless card works very well, and works with NetworkManager. Bluetooth works.
  • The battery is reported correctly in GNOME.
  • The SD card slot works. I’ve used it several times.
  • The fans require some doing. They don’t actually turn on automatically (scarily enough), so I had to hack up a script someone wrote to get the fan to react to temperatures reported by sysfs.
  • And the hard-drive, um, spins and stuff.

So, as you can see, besides pommed, a fan script, and the webcam, there’s really very little tweaking required. Everything more or less works.

Edit: The kernel configuration for this machine was requested in the comments below, so I’ve posted it here.



Mozilla Weave

This is a short blog post to say that I’ve found an excellent must-have for Firefox users. It’s called Mozilla Weave. It keeps history, bookmarks, passwords, Firefox settings, and soon add-ons in sync. My experience of any kind of automated syncing software has been poor, but this one seems to be flawless. Minus connectivity issues when I’m offline, it’s never reported an error or bugged me about some conflict or other. It just sits there and uploads data to Weave servers (or your own if you so choose) so other computers may stay in sync. Even if you don’t have multiple computers, this is great for just backing up, especially considering the Firefox data folder is a pain to back up.

Privacy concerns are non-existent here, because everything is encrypted with a pass-phrase, on top of your Weave account username and password.

The only bummer about Weave is it’s limited to just Firefox, and Firefox’s competitors have no equivalent yet. It looks like I won’t be switching to Chrome any time soon.


Posts for Wednesday, May 19, 2010

In love with “Digital: a Love Story”

Wow!

It's pretty rare that my heart goes pounding when I'm playing a game, but this one did it. Digital: a Love Story is quite a unique little adventure game.

It has it all: oldskoolness, hacking, phreaking, BBS, AI... everything. And although it's pretty linear, at least to me it brought the genuine feeling of being on the edge, and sometimes of having tough decisions to make, as well as pretty realistically reenacting the feeling of getting access codes blocked.

Sure, it's all done in a very simple way, but it's just a game. And as such a very cool one :]

hook out >> just finished the game and feeling good and his alter ego a bit brokenhearted


P.S. On Gentoo you can obtain it from the sunrise repository.

Much to learn from Free software and the third world as well

However we spin it, it can't be denied that the status quo of our (western) society is not perfect. With the massive recession, pollution (and global warming), general unhappiness and stress leading to depression, cancer, low birth rates etc. etc. etc. I think it's safe to say we're sinking ever deeper into the cacky as a society.

I've just stumbled upon this great 10-minute animated presentation on how studies show that money is only a good motivator for purely physical work. On the other hand, for anything demanding even a slight cognitive effort, huge monetary rewards are counter-productive, and the true motivations are the working/thinking person's autonomy, their striving to achieve mastery and the feeling of fulfilling a purpose. Which is a good explanation of why FOSS exists in the first place and why it continues to grow.

In connection to that, Simon Phipps has blogged about why the freedoms FOSS carries are more important, even to businesses, than the open source development model.

Probably the most brilliant project I've seen in years is Design for the First World [Dx1W], which invites all who are born and living in the so-called 3rd world to help find solutions for problems that bother the developed world. They name e.g. obesity, consumerism, integration of immigration, low birth rate and aging population, but there's more. You could look at that sarcastically, but IMHO these are real problems we're facing, to which our society has so far failed to find working solutions.

The cruel reality remains that the current state of the 1st world is troubled by many things, doubts arise in both capitalism and individualism and elsewhere as well. It's time to stop, smell the roses and rethink our strategy. And in that, I think, we have to learn from both the FOSS community and the rest of the world.

hook out >> halfway comprehensive news brought to you by the guy who should be studying his arse off right now, but is not :P


Question yourself v3

Another update to Quizzer, now at version 3. But more importantly, updates to the Linux Sea related chapters are made available online – get a taste for it at the online quizzer set.

Feedback is, as always, very much appreciated.

Metadata is our data

In my last post, I talked about taking some first tentative steps into the semantic web. Two of the commenters suggested that I should check out the book "Programming the Semantic Web" published by O'Reilly. The full reference is below:

"Programming the Semantic Web by Toby Segaran, Colin Evans, and Jamie Taylor. Copyright 2009 Toby Segaran, Colin Evans, and Jamie Taylor, 978-0-596-15381-6."

In this post I review what I have read so far, making notes as I go along.

Chapter 1 is called "Why Semantics?" The book explains the core idea of the semantic web:

"it's about using semantics to represent, combine, and share knowledge between communities of machines, and how to write systems that act on that knowledge." (Page 2)

In particular:

"With a little work you can make the semantic relationships in your data explicit, and program in a way that allows the behavior of your systems to change based on the meaning of the data. With the semantics made explicit, other programs, even those not written by you, can seamlessly use your data. Similarly, when you write programs that understand semantic data, your programs can operate on datasets that you didn't anticipate when you designed your system." (page 3)

Chapter 2 talks about the 'triple', which is "the fundamental building block of semantic representations" (page 19). Here are two triples:

Zeth writes 'Command Line Warriors'
Zeth's address is 'Buckingham Palace'  (not yet at least)

The first part of the triple is the subject (e.g. Zeth). The second part is the 'predicate', which is "a property of the entity to which they are attached" (page 19), so my birthday or my address would be a predicate. The last part is the object, which can be another entity "that can be the subject in other triples" (page 19) or a literal value "such as strings or numbers." (page 19).

Different triples are linked together by sharing objects or subjects. The book starts off with an example spreadsheet of restaurant listings which it turns into a relational database, which it then turns into these triples. So the links are analogous to links between tables in a relational database.

Next the book moves on to building graphs of triples by using shared ids:

zeth first_name "Zeth"
zeth address royal_residence
royal_residence street_address "Buckingham Palace"
royal_residence post_code "SW1A 1AA"
commandline_warriors written_by zeth
commandline_warriors name "Command Line Warriors"

So here there are three entities: the person Zeth represented by the ID 'zeth', a house represented by the ID 'royal_residence' and a website represented by the ID 'commandline_warriors'. The post code "SW1A 1AA" is just a literal value at this point, but it could later be turned into an entity also. The first two triples have a shared ID, meaning both statements are about the entity 'zeth'. 'zeth' is not the name of the entity, it is an arbitrarily-chosen ID, the name is provided by the first_name predicate. The ID could have been a hash value or a sequential number.
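To make the shared-ID idea concrete, here is a minimal sketch in plain Python. It is not the book's code: each triple is simply a tuple, the predicate names are written with underscores, and the helper names `about` and `entities` are my own inventions for illustration.

```python
# Each triple is a (subject, predicate, object) tuple.
triples = [
    ("zeth", "first_name", "Zeth"),
    ("zeth", "address", "royal_residence"),
    ("royal_residence", "street_address", "Buckingham Palace"),
    ("royal_residence", "post_code", "SW1A 1AA"),
    ("commandline_warriors", "written_by", "zeth"),
    ("commandline_warriors", "name", "Command Line Warriors"),
]


def about(entity_id):
    """Return every (predicate, object) pair whose subject is entity_id."""
    return [(p, o) for (s, p, o) in triples if s == entity_id]


def entities():
    """Return the set of IDs that appear as the subject of some triple."""
    return {s for (s, _, _) in triples}


print(about("zeth"))
print(sorted(entities()))
```

Because 'royal_residence' appears as an object in one triple and as a subject in two others, following it from 'zeth' is what links the statements into a graph.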

The book then works through its first code example, which is available online here: simpletriple.py. I recommend you download it now and have a look through it.

The code sample has a class called SimpleGraph which is a simple example 'triplestore'. In the __init__ method, 's' stands for subject, 'p' for predicate, and 'o' for object. So the triples are stored in three different combinations. The book then explains the various methods, which may be evident to you from the code and docstrings.
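As a rough illustration of what "stored in three different combinations" can mean, here is a small sketch of my own. It is not the book's SimpleGraph: the class name `TinyGraph`, its attribute names and its keyword-argument interface are all assumptions made for this example. The point is only that keeping the same triples under three orderings lets a lookup start from whichever term is known.

```python
from collections import defaultdict


class TinyGraph:
    """Illustrative triple store; not the book's SimpleGraph."""

    def __init__(self):
        # The same triples, indexed three ways: subject-first,
        # predicate-first and object-first.
        self._spo = defaultdict(lambda: defaultdict(set))
        self._pos = defaultdict(lambda: defaultdict(set))
        self._osp = defaultdict(lambda: defaultdict(set))

    def add(self, s, p, o):
        self._spo[s][p].add(o)
        self._pos[p][o].add(s)
        self._osp[o][s].add(p)

    def triples(self, s=None, p=None, o=None):
        """Yield all triples matching the pattern; None is a wildcard."""
        if s is not None:
            # Subject known: walk the subject-first index.
            for pred, objs in self._spo.get(s, {}).items():
                if p is None or pred == p:
                    for obj in objs:
                        if o is None or obj == o:
                            yield (s, pred, obj)
        elif p is not None:
            # Only predicate (and maybe object) known.
            for obj, subjs in self._pos.get(p, {}).items():
                if o is None or obj == o:
                    for subj in subjs:
                        yield (subj, p, obj)
        elif o is not None:
            # Only the object is known.
            for subj, preds in self._osp.get(o, {}).items():
                for pred in preds:
                    yield (subj, pred, o)
        else:
            # Nothing known: return everything.
            for subj, preds in self._spo.items():
                for pred, objs in preds.items():
                    for obj in objs:
                        yield (subj, pred, obj)


g = TinyGraph()
g.add("zeth", "first_name", "Zeth")
g.add("commandline_warriors", "written_by", "zeth")
print(list(g.triples(p="written_by")))
print(list(g.triples(o="Zeth")))
```

The cost of this design is that every triple is stored three times; the benefit is that no query pattern requires scanning the whole store unless all three terms are wildcards.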

http://commandline.org.uk/images/posts/semantic/educating_rita.jpg

Next we are shown in pictures and code that if there are two graphs with consistent identifiers, they can be merged together. Then we are to download a csv file of triples and load them using the load method of the SimpleGraph object. Then we perform queries upon this data.

from simpletriple import SimpleGraph

# Make an instance of the class
film_graph = SimpleGraph()

# Load the CSV from the book's website
film_graph.load("movies.csv")

# Now let's find Julie Walters' id
julie_id = film_graph.value(None, "name", "Julie Walters")
print(julie_id)

# Now let's find out all the films Julie has been in:
julie_films = film_graph.triples((None, "starring", julie_id))
for film in julie_films:
    print(film_graph.value(film[0], "name", None))

One of the results is the classic film 'Educating Rita'.

educating_rita = film_graph.value(None, "name", "Educating Rita")

Now let's find another actor in Educating Rita:

actor = next(film_graph.triples((educating_rita, "starring", None)))[2]
print(film_graph.value(actor, 'name', None))

Sadly there are no dates for the films in the csv file. If there were, we could sort an actor's films by year and thus generate a filmography.

Let's find the director instead:

director = film_graph.value(educating_rita, 'directed_by', None)
print(film_graph.value(director, 'name', None))

directed_films = film_graph.triples((None, "directed_by", director))

What other films has he directed?

for film in directed_films:
    print(film_graph.value(film[0], "name", None))

If you want to play along, use simpletriple.py to find out which other film by this director also stars the actor we found above. The answer is in the comments.

http://commandline.org.uk/images/posts/semantic/alfie.jpg

The rest of chapter two gives a few more examples that can be played with.

Chapter three gives a new query syntax that works by defining various constraints and binding the results to set references. This is most easily demonstrated by an example. Start by downloading an upgraded version of the triples module called simplegraph.py.

We load the data in the same way as before:

from simplegraph import SimpleGraph
film_graph = SimpleGraph()
film_graph.load('movies.csv')

You might still have 'actor' and 'director' etc in memory. Assuming that you do not, we can repeat what we did above:

julie_id = film_graph.value(None, "name", "Julie Walters")
educating_rita = film_graph.value(None, "name", "Educating Rita")
actor = next(film_graph.triples((educating_rita, "starring", None)))[2]
director = film_graph.value(educating_rita, 'directed_by', None)

Now we can answer the above quiz question in a far more efficient manner. The question was to find out what other film the 'director' made that also starred the 'actor'.

film_graph.query([('?film', 'starring', actor),
                  ('?film', 'directed_by', director)])

You can see instantly that this query is far shorter than the previous attempt which involved manually iterating our way to the correct result. How the query method is implemented can be seen by reading the Python file linked to above. What happens in this case is that for each possible result matching these constraints, a dictionary is returned binding the key 'film' to the ID of the film that has been found.
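For readers who would rather see the idea than read the module, here is a simplified sketch of constraint matching with variable binding. It is my own illustration and not the implementation in simplegraph.py: the function `query` below works on a plain list of tuples, and the film, actor and director IDs are made-up placeholders rather than values from movies.csv.

```python
def query(triples, clauses):
    """Return one dict of variable bindings per way of satisfying all clauses.

    Terms that start with '?' are variables; everything else must match exactly.
    """
    results = [{}]
    for clause in clauses:
        next_results = []
        for binding in results:
            for triple in triples:
                candidate = dict(binding)
                for term, value in zip(clause, triple):
                    if isinstance(term, str) and term.startswith("?"):
                        name = term[1:]
                        # Bind the variable, or check an earlier binding.
                        if candidate.setdefault(name, value) != value:
                            break
                    elif term != value:
                        break  # a constant term does not match
                else:
                    next_results.append(candidate)
        results = next_results
    return results


# Placeholder data, not taken from the book's CSV file.
triples = [
    ("film_1", "starring", "actor_a"),
    ("film_1", "directed_by", "director_x"),
    ("film_2", "starring", "actor_a"),
    ("film_2", "directed_by", "director_y"),
    ("film_3", "starring", "actor_b"),
    ("film_3", "directed_by", "director_x"),
]

print(query(triples, [("?film", "starring", "actor_a"),
                      ("?film", "directed_by", "director_x")]))
```

Each clause narrows the list of candidate bindings, so only films that satisfy both constraints with the same value of the variable survive to the end.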

So far we are part way through chapter 3. Join us next time when we continue working through the book.


Posts for Tuesday, May 18, 2010

updating to kde-4.4.2 from kde-4.3.3

motivation

gentoo linux logo (copied from commons.wikipedia.org)

gentoo

i’m not a big fan of updates but sometimes i have to do them. this time i would like to blog the steps so that other gentoo users can review what i did. this posting lists all commands i’ve issued in order to update from kde-4.3.3 to kde-4.4.2 (later kde-4.4.3).

gentoo concepts

WARNING: it is very important to understand what stable means in portage. if a package called mypackage is marked stable in portage, it can be installed with ‘emerge mypackage’. this does not guarantee that the software itself is stable, although most likely it is. the concept differs from the releases made by software vendors, who have their own idea of stable, testing and unstable. in portage a package is marked stable once its integration into the gentoo linux system has proved to work well. new packages might also be marked unstable because they are not tested enough, even though many users would think they should be marked stable. if in doubt: do not install software which is marked unstable by portage. this posting is all about installing a ‘kde release’ which is, at the time of writing, marked unstable in portage. in contrast, the kde developers release the software i’m going to install as stable.

see [6] for the official kde & gentoo guide.

what software do i use right now

  • portage 2.2 rc67
  • kde 4.3.3 (various packages, i’ve not used kde-meta)

i’ve had lots of problems with kde 4.x so i basically removed all my daily kde 4.x dependencies and replaced them with non-kde programs such as:

  • kate -> thunderbird
  • kontact -> psi
  • konqueror ->chromium & firefox

i will probably change back to kde 4.x once:

  • kde 4.x itself is stable enough for a day by day usage
  • good integration of kde 4.x in portage is finally there

preparations

first let’s update portage with:

# eix-sync

since gentoo installations of kde take very long i’ve decided to install xfce4, which is a very nice and tiny desktop environment:

# emerge xfce4-meta

afterwards i’ve logged out and logged into xfce4. i’m using the gnome terminal for the update.

first: fixing the emerge WARNING

using emerge this message shows up every time:

WARNING: One or more repositories have missing repo_name entries:
/usr/local/portage/profiles/repo_name
NOTE: Each repo_name entry should be a plain text file containing a
unique name for the repository on the first line.

i’ve seen this warning all over the place here and i usually look the error up in google, fix the problem and forget about it. as it still seems to be there on some machines it is probably a good idea to document the fix in this blog. so here we go:

echo "invalidmagic's local repository" > /usr/local/portage/profiles/repo_name

and finally the warning is gone. see also [1], where this issue is discussed.

kde update, oh wait…

the first thing i try is to test if the update does work out of the box, i’m doing this with:

# autounmask =kde-base/kde-meta-4.4.2

next i check whether portage can perform the update with:

# emerge --color n =kde-base/kde-meta-4.4.2

usually this looks like this (only relevant lines shown):

[blocks B     ] <x11-libs/qt-xmlpatterns-4.6.2 ("<x11-libs/qt-xmlpatterns-4.6.2" is blocking x11-libs/qt-webkit-4.6.2-r1, x11-libs/qt-sql-4.6.2, x11-libs/qt-qt3support-4.6.2, x11-libs/qt-core-4.6.2-r1, x11-libs/qt-svg-4.6.2, x11-libs/qt-test-4.6.2, x11-libs/qt-opengl-4.6.2, x11-libs/qt-script-4.6.2, x11-libs/qt-gui-4.6.2)

[blocks B     ] <x11-libs/qt-test-4.6.2 ("<x11-libs/qt-test-4.6.2" is blocking x11-libs/qt-webkit-4.6.2-r1, x11-libs/qt-sql-4.6.2, x11-libs/qt-xmlpatterns-4.6.2, x11-libs/qt-core-4.6.2-r1, x11-libs/qt-svg-4.6.2, x11-libs/qt-gui-4.6.2, x11-libs/qt-opengl-4.6.2, x11-libs/qt-qt3support-4.6.2, x11-libs/qt-script-4.6.2)

[blocks B     ] kde-base/libknotificationitem:4.3[-kdeprefix] ("kde-base/libknotificationitem:4.3[-kdeprefix]" is blocking kde-base/kdelibs-4.4.2)

[blocks B     ] <x11-libs/qt-script-4.6.2 ("<x11-libs/qt-script-4.6.2" is blocking x11-libs/qt-webkit-4.6.2-r1, x11-libs/qt-sql-4.6.2, x11-libs/qt-xmlpatterns-4.6.2, x11-libs/qt-core-4.6.2-r1, x11-libs/qt-svg-4.6.2, x11-libs/qt-test-4.6.2, x11-libs/qt-opengl-4.6.2, x11-libs/qt-qt3support-4.6.2, x11-libs/qt-gui-4.6.2)

(see the full list at [2]. i’ve used ‘emerge --color n =kde-base/kde-meta-4.4.2 -a > portage_log 2>&1’ to create a file with the output)

so what i do instead, is to remove all kde components from the system

remove old kde components

using qlist (app-portage/portage-utils-0.2.1) we need to find all kde components. we need to use -I to find installed packages. we also disable the usage of color, with -C, to make the output usable for script processing.

qlist -IC kde

there are some applications, k3b for instance, which use kde-base/kdelibs but which are NOT included in this list. most of the time this can be ignored since a later ‘revdep-rebuild’ will fix those programs. however, once kdelibs is removed k3b can’t be started anymore. removing kdelibs after k3b has been started will probably not crash k3b, and k3b might still work. so let’s remove all kde components (also erasing all SLOTS, aka different versions):

emerge -C $(qlist -IC kde)

depending on your installation and harddrive speed this might take a while (545.09 seconds here).

unwanted unmasks

right now i realized that there are old ‘autounmasks’ in /etc/portage, i’m going to clean that up first:

cd /etc/portage/

  • grep kde * -R
  • grep qt * -R
  • grep avahi * -R

i removed most files which had something to do with kde, qt, amarok and some with avahi. so have a look at all categories (directories):

  • package.use/
  • package.mask/
  • package.unmask/
  • package.keywords/

especially look for unmasks of 9999 packages. these refer to svn/cvs/git versions which have not been released yet (though they might still be tagged). if such a package refers to a developer’s version without a tag, it might change without warning, so the installation could break with different error messages on different checkouts. developers usually want this in order to test their software. users usually don’t, although it is a nice way to experiment with recent software while still having a package manager for safe removal.

is the system now clean? we’ll see. we probably have to set the correct use flags again. WARNING: be aware that use flags can also be set globally in /etc/make.conf. some useflags can be shown with:
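if you would rather not eyeball the grep output, the hunt for 9999 entries can be scripted. the following is only a sketch of mine, not a portage tool: the function name find_live_entries is made up, and it assumes (as on my system) that package.unmask and package.keywords are directories of plain text files below /etc/portage.

```python
import os


def find_live_entries(root, subdirs=("package.unmask", "package.keywords")):
    """Return (path, line_number, line) for each non-comment line mentioning 9999."""
    hits = []
    for subdir in subdirs:
        top = os.path.join(root, subdir)
        # os.walk yields nothing if the directory does not exist.
        for dirpath, _dirnames, filenames in os.walk(top):
            for filename in sorted(filenames):
                path = os.path.join(dirpath, filename)
                with open(path, encoding="utf-8", errors="replace") as handle:
                    for number, line in enumerate(handle, start=1):
                        entry = line.strip()
                        if entry and not entry.startswith("#") and "9999" in entry:
                            hits.append((path, number, entry))
    return hits


if __name__ == "__main__":
    for path, number, entry in find_live_entries("/etc/portage"):
        print("%s:%d: %s" % (path, number, entry))
```

the script only reports candidates. whether an entry should be deleted is still a decision to make by hand.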

equery u kde-meta

the semantic-desktop useflag might be of interest. i’m not sure, but i think that using the kdeprefix useflag resulted in having ~/.kde3.5, ~/.kde4.2, ~/.kde4.4 and others. so once you want to use kde 4.4 instead of kde 4.2 (you can select this on login using kdm for instance), all your settings, such as kaddressbook, knotes, autostarters, desktop configuration and others, have to be migrated manually by copying the files from ~/.kde4.2 to ~/.kde4.4 prior to your login. this is just a guess, but it would explain the issues i had during the time i used +kdeprefix (around kde 3.5 / kde 4.2). so let’s do the autounmask again:

# autounmask =kde-base/kde-meta-4.4.2

this time it took really long (36 minutes). htop showed that portage only uses one core, while iotop showed that there was no disk access at the same time, so this is probably a result of complex dependency-graph calculations. autounmask came up with these blocks:

[blocks B     ] >x11-libs/qt-opengl-4.5.3-r9999 (">x11-libs/qt-opengl-4.5.3-r9999" is blocking x11-libs/qt-assistant-4.5.3, x11-libs/qt-test-4.5.3-r1, x11-libs/qt-dbus-4.5.3-r1, x11-libs/qt-xmlpatterns-4.5.3-r1, x11-libs/qt-core-4.5.3-r2, x11-libs/qt-gui-4.5.3-r2, x11-libs/qt-qt3support-4.5.3, x11-libs/qt-svg-4.5.3-r1, x11-libs/qt-script-4.5.3-r1, x11-libs/qt-demo-4.5.3, x11-libs/qt-webkit-4.5.3, x11-libs/qt-sql-4.5.3)

[blocks B     ] <x11-libs/qt-svg-4.6.2 ("<x11-libs/qt-svg-4.6.2" is blocking x11-libs/qt-webkit-4.6.2-r1, x11-libs/qt-sql-4.6.2, x11-libs/qt-xmlpatterns-4.6.2, x11-libs/qt-core-4.6.2-r1, x11-libs/qt-test-4.6.2, x11-libs/qt-opengl-4.6.2, x11-libs/qt-qt3support-4.6.2, x11-libs/qt-script-4.6.2, x11-libs/qt-dbus-4.6.2, x11-libs/qt-gui-4.6.2)

and many more…

so it’s time to check the qt-* packages. interesting, there is x11-libs/qt installed (a qt-3.x version), the new qt-4.x have a split package naming scheme.

# equery d x11-libs/qt

  • app-crypt/qca-1.0-r3 (x11-libs/qt:3)
  • dev-libs/dbus-qt3-old-0.70 (=x11-libs/qt-3*)
  • media-sound/hydrogen-0.9.3-r4 (=x11-libs/qt-3*)

i do not want to change anything related to the qt-3 package sets. let’s have a look at the x11-libs/qt-* stuff. this wildcard indicates it has something to do with the qt4 library.

qlist -Iv x11-libs | grep "qt-.*"

  • x11-libs/qt-3.3.8b-r2    (WARNING: you can leave this installed as it is not a conflict candidate for a kde 4.x installation despite some avahi issues)
  • x11-libs/qt-assistant-4.5.3
  • x11-libs/qt-core-4.5.3-r2
  • x11-libs/qt-dbus-4.5.3-r1
  • x11-libs/qt-demo-4.5.3
  • x11-libs/qt-gui-4.5.3-r2
  • x11-libs/qt-opengl-4.5.3-r1
  • x11-libs/qt-qt3support-4.5.3
  • x11-libs/qt-script-4.5.3-r1
  • x11-libs/qt-sql-4.5.3
  • x11-libs/qt-svg-4.5.3-r1
  • x11-libs/qt-test-4.5.3-r1
  • x11-libs/qt-webkit-4.5.3
  • x11-libs/qt-xmlpatterns-4.5.3-r1

so there are many x11-libs/qt-* packages around.

WARNING: if i removed these packages, some software components might stop working, so it would be wise to see which packages depend on them. my worst case minimal requirement is a console (no X or desktop) environment, so i skip this test.

NOTE: qt4 can be used for console-only programs (no gui) as well, but i don’t know of any relevant low level system dependencies which might break my minimal system requirements.

therefore i think it is safe to remove these packages:

# qlist -IC x11-libs | grep "qt-.*"
# emerge -C $(qlist -IC x11-libs | grep "qt-.*")

so now that all qt4 packages are removed, let’s try autounmask again. but before that we need to remove the previous failed attempts with:

cd /etc/portage
grep kde *

then review the grep output. in my case i needed to remove these 3 files:

rm package.unmask/autounmask-kde-meta
rm package.use/autounmask-kde-meta
rm package.keywords/autounmask-kde-meta

so again:

# autounmask =kde-base/kde-meta-4.4.2 (see [3] for a complete list)

*smile* only one final ‘block’ left!

[blocks B     ] <app-emulation/emul-linux-x86-xlibs-20100409 ("<app-emulation/emul-linux-x86-xlibs-20100409" is blocking app-emulation/emul-linux-x86-opengl-20100410_pre)

* Error: The above package list contains packages which cannot be installed at the same time on the same system.

('installed', '/', 'app-emulation/emul-linux-x86-xlibs-20091231', 'nomerge') pulled in by

~app-emulation/emul-linux-x86-xlibs-20091231 required by ('installed', '/', 'app-emulation/emul-linux-x86-gtklibs-20091231', 'nomerge')

~app-emulation/emul-linux-x86-xlibs-20091231 required by ('installed', '/', 'app-emulation/emul-linux-x86-medialibs-20091231', 'nomerge')

app-emulation/emul-linux-x86-xlibs required by ('ebuild', '/', 'x11-drivers/nvidia-drivers-195.36.24', 'merge')

(and 2 more)

('ebuild', '/', 'app-emulation/emul-linux-x86-opengl-20100410_pre', 'merge') pulled in by

app-emulation/emul-linux-x86-opengl required by ('ebuild', '/', 'app-emulation/emul-linux-x86-xlibs-20100409-r1', 'merge')

so how to deal with that?
  1. i guess i could remove nvidia-drivers and replace it by the new nouveau driver (yes i’m on gentoo-sources-2.6.33 now)
  2. i could try to update various components in random order and try again

so let’s try point two (2):

# autounmask =app-emulation/emul-linux-x86-gtklibs-20100409-r1

# autounmask =app-emulation/emul-linux-x86-medialibs-20100409

and finally let’s try it again

# autounmask =kde-base/kde-meta-4.4.2

oh we got a “!done”. that is great news as it seems to work so far!

i just found out that i missed two more kde packages (qlist -IC kde) :
  • kde-base/kdebase-pam
  • kde-base/kde-env

however, i don’t plan to remove them. with some luck they might be updated automagically.

update the system, before installing kde 4.x

# emerge -uDN world --keep-going -a

(see [4] for the complete output of the command above)

…
Use emerge @preserved-rebuild to rebuild packages using these libraries
emerge -uDN world --keep-going -a  16972.19s user 5351.49s system 97% cpu 6:21:36.73 total

i usually use "--keep-going"; please see the documentation for what is cool about doing so. in general it helps to shorten installation time, as a failure in the middle of a 200 package installation won’t stop everything and wait for manual maintenance. with some luck nearly all packages get installed using this feature, even when there are several critical compile or linker errors.

check for broken programs

since we removed x11-libs/qt-* basically every program which links against any of these libraries MUST be broken. with one exception: programs which are linked statically. however most programs on linux are linked dynamically so we have to check for broken programs with:

# revdep-rebuild

revdep-rebuild  802.74s user 271.12s system 95% cpu 18:49.88 total

  • Tue May 18 11:15:12 2010 >>> x11-libs/libXxf86dga-1.1.1
  • Tue May 18 11:15:52 2010 >>> kde-base/libkcddb-4.4.3
  • Tue May 18 11:16:30 2010 >>> net-wireless/kbluetooth-0.4.2
  • Tue May 18 11:17:17 2010 >>> app-arch/libarchive-2.7.1-r1
  • Tue May 18 11:21:07 2010 >>> app-cdr/k3b-1.91.0_rc2
  • Tue May 18 11:29:08 2010 >>> media-video/vlc-1.0.6
  • Tue May 18 11:29:36 2010 >>> x11-apps/xf86dga-1.0.2

check the list of programs and libraries and if there is a program you would like to get rid of first, do so! i removed mumble, amarok and amarok-utils.

updating configuration files in /etc/

this is really important and there are other ways to do it, anyway:

# etc-update

looking at the use flags of kde 4.4.2

now the final step! after one day we are finally there! yepeee.

emerge kde-meta -a (see [5] for a complete list of packages and use flags)

so all i did was to add the lzma and semantic-desktop useflags

oh there is a kde 4.4.3 now, so i install this instead

it seems that while i wrote this blog entry a new version of kde was released (might be my late eix-sync as well). so i’m going to install ‘kde 4.4.3’ instead of ‘kde 4.4.2’. what i do is basically start all over again:

  • removing old autounmasks in /etc/portage
  • autounmask =kde-base/kde-meta-4.4.3

finally emerge kde-meta -a

surprise! we got new blocks:

# emerge kde-meta -a

('ebuild', '/', 'kde-base/kdelibs-4.4.3', 'merge') pulled in by

>=kde-base/kdelibs-4.3.5[-kdeprefix,-aqua] required by ('ebuild', '/', 'kde-base/solid-4.3.5', 'merge')

>=kde-base/kdelibs-4.3.5[-kdeprefix,-aqua] required by ('ebuild', '/', 'kde-base/krosspython-4.3.5', 'merge')

>=kde-base/kdelibs-4.3 required by ('ebuild', '/', 'net-p2p/ktorrent-3.3.4', 'merge')

(and 6 more)

('ebuild', '/', 'kde-base/libknotificationitem-4.3.5', 'merge') pulled in by

>=kde-base/libknotificationitem-4.3.5[-kdeprefix,-aqua] required by ('ebuild', '/', 'kde-base/krosspython-4.3.5', 'merge')

>=kde-base/libknotificationitem-4.3.5[-kdeprefix,-aqua] required by ('ebuild', '/', 'kde-base/solid-4.3.5', 'merge')

>=kde-base/libknotificationitem-4.3.5[-kdeprefix,-aqua] required by ('ebuild', '/', 'kde-base/kdialog-4.3.5', 'merge')

(and 1 more)

so what can we do about this one? first thing is to look if there is a more recent version of ktorrent which would use kdelibs-4.4.3 instead of kdelibs-4.3 and there is none, so ktorrent can’t be installed with ‘kde 4.4.3′.

but that autounmask also shows a lot of blocks, basically those from above. however there is an additional one now:

('ebuild', '/', 'kde-base/libkworkspace-4.3.5', 'merge') pulled in by

>=kde-base/libkworkspace-4.3 required by ('ebuild', '/', 'net-wireless/kbluetooth-0.4.2', 'merge')

there is nothing we can do about it right now. we have to remove kbluetooth. this stupid apple magic mouse didn’t work well anyway so who cares?

emerge -C kbluetooth

let’s try to mask ‘<kde-base/kdelibs-4.4’ versions, that means all versions which were released before 4.4

echo "=kde-base/kdelibs-4.3.5" >> /etc/portage/package.mask/kde

echo "=kde-base/kdelibs-4.3.3-r1" >> /etc/portage/package.mask/kde

that worked partially. ‘emerge -uDN world’ still has blocks but ‘emerge kde-meta’ would work well.

so let’s take care of those blocks first:

emerge -C ktorrent kile

and i’m done. those two applications don’t seem to work with kdelibs-4.4.3, so i will reinstall them when i see a new version of these two programs in the ‘eix-sync’ log. currently i don’t need either of them. installing kde with the ‘kdeprefix’ useflag could probably have worked as well but i did not want to do that.

so now the final step

installing kde-meta-4.4.3

it seems we got all dependencies resolved!

emerge kde-meta --keep-going -a

it seems some use flags which i did not set result in a dependency issue:

emerge: there are no ebuilds built with USE flags to satisfy ">=x11-libs/qt-qt3support-4.6.0:4[kde]".

!!! One of the following packages is required to complete your request:

- x11-libs/qt-qt3support-4.6.2 (Change USE: +kde)

(dependency required by "kde-base/libkcompactdisc-4.4.3" [ebuild])

(dependency required by "kde-base/kdemultimedia-meta-4.4.3" [ebuild])

(dependency required by "kde-base/kde-meta-4.4.3" [ebuild])

(dependency required by "kde-meta" [argument])

let’s fix that with:

echo "x11-libs/qt-qt3support kde" >> /etc/portage/package.use/qt-qt3support

and then we should restart the emerge but this time we add -N for ‘new use’

emerge kde-meta -N --keep-going -a

now all problems are resolved and the installation (compilation&linking) is running. x11-libs/qt-qt3support-4.6.2 is the first package which is installed as we used -N.

it might be a good idea to check for broken programs once again, just to be sure. use ‘revdep-rebuild’ for that.

summary

next time i update i can have a look at this posting. maybe it is of help for other gentoo users as well. i would be delighted.

links

[1] http://bugs.gentoo.org/show_bug.cgi?id=248603

[2] http://lastlog.de/misc/wordpress/portage_kde_meta_blocks.txt

[3] http://lastlog.de/misc/wordpress/autounmask.txt

[4] http://lastlog.de/misc/wordpress/emerge_world.txt

[5] http://lastlog.de/misc/wordpress/emerge_kde.txt

[6] http://www.gentoo.org/proj/de/desktop/kde/kde-config.xml


Posts for Monday, May 17, 2010

The conference farce

If you work in science at some point you’ll be faced with conferences because those are a way to get your research published which leads to money and your research being validated (because other science people thought it was interesting). This is all by itself not completely dumb and takes the publishing load away from magazines (since most of the scientific community is not interested in publishing itself but in publishing to one of the walled gardens that are “respectable sources” [because only those lead to getting money]).

If you have an open source background you might know conferences because the free software and free culture scene has basically millions of them: Linux Conferences, Python Conferences, General Free * conferences, if there is a group with more than 5 people they will at some point organize a conference. Those conferences do not just serve a networking and social function but also help people get together in a room and come to new ideas, discuss existing ideas and technologies as well as providing the framework for spreading knowledge around through talks. Conferences rock.

Now comes the bitter part: Scientific conferences often don’t seem to work like that. Why? Money.

Getting a conference going is expensive (in time and money!), apart from rent for the venue you need people doing the work to organize it, to keep the technology working and to properly introduce the speakers. And where free culture conferences can rely on many people donating their time (and maybe even money) to get it going, other conferences need to pay their workers which leads to the pathetic state many conferences are in today.

If you want to visit a conference there is a fee to help cover the costs and those fees can be quite substantial (500 Euro is not really a lot as conference fee). Considering that you will need a flight and hotel in addition to the conference fee you are facing a few thousand bucks a pop and it’s not like universities and scientists are swimming in money (it’s basically the opposite).

But everybody understands because somehow the bills need to be paid and the conference is interesting or people wouldn’t go for that amount of money, right?

Wrong. People visit conferences because they need to present the paper they handed in (because basically every conference requires you to do so) and here comes the funny part: Most conferences charge speakers as well as people who just visit.

If you think that that sounds fucked up, you are absolutely right: The speakers and presented topics are what makes a conference relevant. They provide the content that all the fluff around is just decoration for. But that leaves a few funny little facts out of the equation:

  • people need to publish (either because their job requires them to do so or because they want to show that their dissertation is relevant), which [how convenient] works by getting accepted to conferences
  • the people you publish have to come to your gig
  • you charge people that present stuff at your conference

The problem is simple: Most conferences have to rely on boring presentations. Why? Cause a lot of research is not interesting, cause it’s just rehashing of old and deprecated ideas with a few new name tags (seriously look at what people try to sell you based on “web services”!). And this makes conferences a problem.

People like me who try to get their research accepted as relevant have to subsidize conferences. Conferences most people don’t care too much about. Apart from being a huge waste of natural and financial resources as well as time it’s just a disgrace for everybody involved. And it’s not like there is a lot of choice: You either pay (and sign the horrible agreements about rights that most conferences shove down your throat [but that is the topic for another conference]) or you’re out.

Our whole system of doing science, of getting information published and into the hands of people interested needs to change: There are more gatekeepers and people draining money from the equation than can be good for our scientific community.

And yes, this was a rant. And yes, I am annoyed.

My New Best Friend – Introducing Spiceworks

So for the last few months I’ve been looking for ways to improve our ability to monitor our network (both the servers and desktops). We already have external monitors for the really important, business-critical things like our website and email, but for the day-to-day tracking of our desktops and less important servers

Marketing noise or marketing contribution?

I follow a few feed planets here and there. For the uninitiated (few), a planet is an aggregation of blog feeds, normally filtered by topic so that interested readers can get the scoop from multiple sources in one central location. An interesting point which crops up once in a while is what determines whether or not a blog post is appropriate for the planet. Often this is because a few posts get through that don’t really have anything to do with the topic but instead talk about the author’s personal life.

The main argument used against this is that people join the feed planet in order to read about their favourite software or development, and that "nobody gives a rat’s ass about your personal life", thus diluting the rest of the content. While I agree with this in essence, I would like to ask people to reconsider what they consider as inappropriate.

I am here referring specifically to open-source projects’ planets. In my opinion the number one difference between the user-facing open-source concept and the user-facing company concept is that whilst a company seeks to inform its users, open-source projects should seek to engage their users. The difference is simply because an open-source community survives on the interest of the community. Now when people refer to the "community" of an open-source project, I believe a common misunderstanding is that it only refers to the group of people connected to the product. No. What it really should refer to is the group of people … connected to the rest of the group.

So when an open-source project engages its users, it shouldn’t do so on a community-project level, but instead on a community-community level. People should respect that the rest of the community has interests outside the product, and should take this not as noise but instead as a contribution to the community. After all, the lifeblood of any open-source project is the community. I believe that developing a relationship on the "people" level rather than the "product" level is vital for the long-run sustainability of a project. It’s a protective measure against elitism, to which larger projects are prone, and it promotes feedback and empathetic development and guards against bureaucracy.

Of course, this only applies to some planets depending on their purpose.

So, the next time you decide to tell your planet something totally irrelevant, don’t apologise.


Planet Larry is not officially affiliated with Gentoo Linux. Original artwork and logos copyright Gentoo Foundation. Yadda, yadda, yadda.