Posts for Saturday, July 17, 2010

ANTLR via Clojure

ANTLR is a parser-generator for Java. Can you use it from Clojure? Sure. Would you want to? Maybe.

Here's how to do it, start to finish.

For the impatient among you, all of the code below is on github.

git clone git://github.com/briancarper/clojure-antlr-example.git

Setup

I'm going to use leiningen for this project. Let's make a new project called antlr-example.

$ lein new antlr-example

Now edit project.clj to tell lein to fetch ANTLR (and Swank, since I use Emacs). ANTLR is available in Maven Central, so leiningen can grab it for us.

(defproject antlr-example "1.0.0-SNAPSHOT"
  :description "ANTLR Clojure example"
  :dependencies [[org.clojure/clojure "1.2.0-beta1"]
                 [org.clojure/clojure-contrib "1.2.0-beta1"]
                 [org.antlr/antlr "3.2"]]
  :dev-dependencies [[swank-clojure "1.3.0-SNAPSHOT"]])

Then a simple lein deps will download all of the jars and all of ANTLR's dependencies for you. Handy. You end up with this:

antlr-example
├── classes
├── lib
│   ├── antlr-2.7.7.jar
│   ├── antlr-3.2.jar
│   ├── antlr-runtime-3.2.jar
│   ├── clojure-1.2.0-beta1.jar
│   ├── clojure-contrib-1.2.0-beta1.jar
│   ├── stringtemplate-3.2.jar
│   └── swank-clojure-1.3.0-20100502.112537-1.jar
├── project.clj
├── README
├── src
│   └── antlr_example
│       └── core.clj
└── test
    └── antlr_example
        └── core_test.clj

Interactive development? Kind of...

One weakness of ANTLR via Clojure is that development is no longer REPL-based. The ANTLR workflow is essentially a Java workflow:

  1. Write a grammar
  2. Compile the grammar into Java source code
  3. Compile the Java source into Java .class files

Until and unless someone writes a Clojure backend for ANTLR, so that ANTLR can directly produce Clojure source code (and it seems like it should be possible to do so), you're back to a write/compile/debug cycle. Joy!

This also means restarting your REPL every time you alter and recompile your grammar.

But the good thing is that there's ANTLR Works, a free GUI for writing ANTLR grammars. ANTLR Works has an interactive interpreter, which is nice, and it has a compiler/debugger, which is better. These let you test your grammar as you write it, by trying inputs and seeing the resulting AST in graphical form, which is as good as you could hope for. This is actually even a bit nicer than a plaintext REPL.

Plus, it has a nice editor for ANTLR code. Emacs was freaking out over the indentation on a few grammars I tried.

So a decent workflow might be to write and debug your grammar in ANTLR Works, then fire up Clojure afterwards to consume the parser.

A simple grammar

We're going to use a classic textbook example, simple arithmetic expressions, as a proof of concept.

For Java (hence Clojure) to find this code, it has to be named properly and it has to be put into the proper directory on CLASSPATH. I'm going to create the grammar file src/antlr_example/Expr.g. (My grammar file, Java source, and Clojure source will all be jumbled together in src. You can easily do it differently if this offends your sensibilities.)

The first line in the grammar names the grammar (and the classes we'll consume from Clojure):

grammar Expr;

Clojure being a s-exp based language, it might be nice to have ANTLR generate an AST. ANTLR has good support for this:

options {
    output=AST;
    ASTLabelType=CommonTree;
}

Helper tokens for the grammar:

tokens {
    PLUS = '+';
    MINUS = '-';
    MULT = '*';
    DIV = '/';
}

These next lines are important. We want to generate classes antlr_example.ExprLexer and antlr_example.ExprParser, so we have to set our code up in the proper package, antlr_example.

This code includes both a parser and lexer, so you have to set the package for both in ANTLR.

@header        {package antlr_example;}
@lexer::header {package antlr_example;}

And then the grammar itself, which is simple:

expr: term ((PLUS | MINUS)^ term)* ;
term: factor ((MULT | DIV)^ factor)* ;
factor: INT ;

INT :   '0'..'9'+ ;
WS  :   ( ' ' | '\t' | '\r' | '\n')+ {$channel=HIDDEN;} ;

One thing to note is (PLUS | MINUS)^; the ^ here tells ANTLR to treat (PLUS | MINUS) as the parent of the node created for this rule. This means the term on either side will be children of this node. We do the same for MULT and DIV. This will give us a nice tree. Otherwise you end up with a flat list of tokens.

{$channel=HIDDEN;} for the whitespace rule tells ANTLR to ignore whitespace, also useful.

If you use ANTLR Works with this grammar, on the input 1 + 2 * 3 + 4, you can see that it works OK.

ANTLR

That was an awful lot of boilerplate and setup and ugly syntax though. See how much Clojure spoils you? But for a simple grammar, it's not that much code.

For a longer or more complex grammar, you might end up having to embed inline Java code into your ANTLR grammar. But again, that's not the end of the world.

Compile everything

Once you write your grammar (presumably in ANTLR Works), you can generate the Java code and compile it thusly (run from your project's base directory):

$ java -cp 'lib/*' org.antlr.Tool src/antlr_example/Expr.g
$ javac -cp 'lib/*' -d classes src/antlr_example/Expr{Lexer,Parser}.java

The first command should generate src/antlr_example/ExprLexer.java and src/antlr_example/ExprParser.java.

The second command should spew tons of class files into classes/. Leiningen and other tools generally include classes/ on your CLASSPATH, so that's a good place for them. Again putting them here is a fairly arbitrary decision on my part. You can put the source and class files anywhere, as long as the class files end up on your CLASSPATH when you start Clojure.

If you're really so lazy that you can't run these two commands, you could use ANTLR Works' built-in commands for compiling your grammar. The only way I could find to invoke it was by starting a Debug session, which compiled my code as a side-effect. And it spews files into places I don't want them, so I like the command line version better.

You could probably also set up a leiningen task for this if you're re-running these commands a lot, or configure your build tool of choice to do the same. I didn't bother.

Now (re)start your REPL and let's write some Clojure.

Clojure code

It's pretty easy to consume an ANTLR parser from Clojure. I'm putting the code below into src/antlr_example/core.clj

First import the classes. You probably need some ANTLR helper classes to set up the parser; you also need the classes you just generated.

(ns antlr-example.core
  (:import (org.antlr.runtime ANTLRStringStream
                              CommonTokenStream)
           (antlr_example ExprLexer ExprParser)))

Here's a function to parse a string using our new parser class:

(defn parse-expr [s]
  (let [lexer (ExprLexer. (ANTLRStringStream. s))
        tokens (CommonTokenStream. lexer)
        parser (ExprParser. tokens)]
    (.expr parser)))

.expr is the name of the top-level rule we want to invoke from our grammar. The rest is boilerplate; just plug in the proper class names.

So let's see what we've got.

user> (in-ns 'antlr-example.core)
#<Namespace antlr-example.core>
antlr-example.core> (parse-expr "1 + 2 * 3 + 4")
#<expr_return antlr_example.ExprParser$expr_return@3755e508>

Er, OK. A good way to inspect Java objects is via bean, so let's view the guts of this object:

antlr-example.core> (bean *1)
{:tree #<CommonTree +>,
 :template nil,
 :stop #<CommonToken [@12,12:12='4',<8>,1:12]>,
 :start #<CommonToken [@0,0:0='1',<8>,1:0]>,
 :class antlr_example.ExprParser$expr_return}

I see. We set up our grammar above to generate an AST, so :tree on the bean will give us that. This translates to .getTree on the Java object. So we can edit our function to call this for us, since we always want to have the tree.

(defn parse-expr [s]
  (let [lexer (ExprLexer. (ANTLRStringStream. s))
        tokens (CommonTokenStream. lexer)
        parser (ExprParser. tokens)]
    (.getTree (.expr parser))))

Now:

antlr-example.core> (parse-expr "1 + 2 * 3 + 4")
#<CommonTree +>

OK, it appears we have one node of the tree, the root node. Let's peak inside:

antlr-example.core> (bean *1)
{:children #<ArrayList [+, 4]>,
 :childIndex -1,
 :parent nil,
 :text "+",
 :nil false,
 :token #<CommonToken [@10,10:10='+',<4>,1:10]>,
 :class org.antlr.runtime.tree.CommonTree,
 :ancestors nil,
 :tokenStartIndex 0,
 :type 4,
 :childCount 2,
 :charPositionInLine 10,
 :tokenStopIndex 12,
 :line 1}

Looks like our children are in :children on the bean, which translates to .getChildren on the Java object. And we have .getChildCount to count the children, and .getText to get the text of the current node. (We could've learned all of this by reading the javadocs for ANTLR too, but what fun is that?)

Since we're dealing with a tree, we can use tree-seq in Clojure to get a flat list of all the tokens in our text.

tree-seq takes three arguments. First a function that returns true if the node has children, false otherwise. Second a function that returns the children for a node that has children. Third the root node of our tree.

That'll give us a seq of tree nodes. So finally we call .getText on all the resulting nodes in the list, to turn node objects into strings.

Easy enough:

(defn node-seq [x]
  (map #(.getText %)
   (tree-seq #(not (zero? (.getChildCount %)))
             #(.getChildren %)
             x)))

antlr-example.core> (node-seq (parse-expr "1 + 2 * 3 + 4"))
("+" "+" "1" "*" "2" "3" "4")

But that's no good. We lost our tree structure.

We'd rather have something like nested vectors or lists. It's easy enough to roll something by hand. This should do it:

(defn AST [node]
  (if (zero? (.getChildCount node))
    (.getText node)
    (let [children (map AST (.getChildren node))
          txt (.getText node)]
      (if txt
        (cons txt children)
        children))))

antlr-example.core> (AST (parse-expr "1 + 2 * 3 + 4"))
("+" ("+" "1" ("*" "2" "3")) "4")

That's better, but it's a list of strings. These strings all happen to be literal representations of Clojure objects, so a call to read-string in the proper places should give us something we can work with:

(defn AST [node]
  (if (zero? (.getChildCount node))
    (read-string (.getText node))
    (let [children (map AST (.getChildren node))
          txt (read-string (.getText node))]
      (if txt
        (apply list txt children)
        children))))

antlr-example.core> (AST (parse-expr "1 + 2 * 3 + 4"))
(+ (+ 1 (* 2 3)) 4)

Hey look, now we have something we can evaluate.

antlr-example.core> (eval *1)
11

Yeah I kind of planned that ahead. If you had a more complex grammar, you might not get away with something quite this easy.

Conclusion

The advantage of ANTLR is that it's mature, widely used, actively developed, and well-documented. (There's a whole book about ANTLR, The Definitive ANTLR Reference by Terence Parr.) There's also a lot of tooling available for it, not just ANTLR Works, but plugins for other Java IDEs.

The disadvantage is that it's a Java library, and as always, there will be some friction when consuming it from Clojure. But it's not that bad.

For pure-Clojure parser generator alternatives, there are fnparse and clojure-pg, neither of which I've tried much.

Posts for Thursday, July 15, 2010

Generic Lambda Visitors, or Writing Haskell in C++0x (Part 4)

In part three, we built a generic alternative to visitors that used lambdas. Now we’ll do a few largely pointless but potentially interesting bits of trickery.

Previously, we worked out the return type of our when function by looking at the return type of the first lambda. Let’s get a bit more adventurous. We’ll make a class that works out what the return type should be.

template <typename... Funcs_>
struct WhenReturnType;

template <typename Val_, typename... Funcs_>
typename WhenReturnType<Funcs_...>::Type
when(Val_ && val, Funcs_ && ... funcs)
{
    LambdaVisitor<typename WhenReturnType<Funcs_...>::Type, Funcs_...> visitor(funcs...);
    return accept_returning<typename WhenReturnType<Funcs_...>::Type>(val, visitor);
}

For starters, we’ll say that if the first function returns void, our return type is void; otherwise, it’s an error:

template <typename... Funcs_>
struct WhenReturnType;

template <typename FirstFunc_, typename... Funcs_>
struct WhenReturnType<FirstFunc_, Funcs_...>
{
    typedef typename std::conditional<
        std::is_same<typename LambdaParameterTypes<FirstFunc_>::ReturnType, void>::value,
        void,
        UnknownTypeForOneOf
        >::type Type;
};

Let’s get a bit more adventurous. What if one lambda returns an int, and another returns a long? We could return a long. Similarly, if one returns a char, and another returns a double, we could return a double. And that’s what std::common_type does:

template <typename... Funcs_>
struct WhenReturnType;

template <typename FirstFunc_, typename... Funcs_>
struct WhenReturnType<FirstFunc_, Funcs_...>
{
    typedef typename std::conditional<
        std::is_same<typename LambdaParameterTypes<FirstFunc_>::ReturnType, void>::value,
        void,
        typename std::common_type<typename LambdaParameterTypes<Funcs_>::ReturnType ...>::type
        >::type Type;
};

Except, std::common_type isn’t useful for user defined types. What we really want to do is say “return a type that holds a value of any of the return types we could get”. If only we’d just spent the past few pages developing such a type…

First, though, let’s work out how to just return a value of a given type, if all the return types are the same:

template <typename... Funcs_>
struct AllReturnSame;

template <typename Func_>
struct AllReturnSame<Func_>
{
    enum { value = true };
};

template <typename A_, typename B_, typename... Funcs_>
struct AllReturnSame<A_, B_, Funcs_...>
{
    enum { value = std::is_same<typename LambdaParameterTypes<A_>::ReturnType, typename LambdaParameterTypes<B_>::ReturnType>::value &&
        AllReturnSame<B_, Funcs_...>::value };
};

template <typename... Funcs_>
struct WhenReturnType;

template <typename FirstFunc_, typename... Funcs_>
struct WhenReturnType<FirstFunc_, Funcs_...>
{
    typedef typename std::conditional<
        AllReturnSame<FirstFunc_, Funcs_...>::value,
        typename LambdaParameterTypes<FirstFunc_>::ReturnType,
        UnknownTypeForOneOf
        >::type Type;
};

Note that we no longer have to handle void specially here.

Next, we’ll add in handling for the easy case where all the lambdas return different things:

template <typename... Funcs_>
struct WhenReturnType;

template <typename FirstFunc_, typename... Funcs_>
struct WhenReturnType<FirstFunc_, Funcs_...>
{
    typedef typename std::conditional<
        AllReturnSame<FirstFunc_, Funcs_...>::value,
        typename LambdaParameterTypes<FirstFunc_>::ReturnType,
        OneOf<typename LambdaParameterTypes<FirstFunc_>::ReturnType, typename LambdaParameterTypes<Funcs_>::ReturnType...>
        >::type Type;
};

Again, note the complex expression used to the left of a ... operator.

Unfortunately, this barfs spectacularly if, say, two lambdas return an int and one returns a string. We need a way of removing the duplicates from a type list. It’s not possible to pass around two type lists directly, so we’ll first need a helper struct that stores all the types we’ve seen so far:

template <typename...>
struct SeenSoFar
{
};

Next, let’s have a helper that tells us whether a particular type is already in a SeenSoFar list:

template <typename...>
struct AlreadySeen;

template <typename Query_>
struct AlreadySeen<SeenSoFar<>, Query_>
{
    enum { value = false };
};

template <typename Query_, typename A_, typename... Rest_>
struct AlreadySeen<SeenSoFar<A_, Rest_...>, Query_>
{
    enum { value = std::is_same<Query_, A_>::value || AlreadySeen<SeenSoFar<Rest_...>, Query_>::value };
};

We also need to be able to create a new SeenSoFar with an extra type:

template <typename...>
struct ExtendSeenSoFar;

template <typename New_, typename... Current_>
struct ExtendSeenSoFar<New_, SeenSoFar<Current_...> >
{
    typedef SeenSoFar<Current_..., New_> Type;
};

Now we can put all that together:

template <typename...>
struct OneOfDeduplicatorBuilder;

template <typename... Values_>
struct OneOfDeduplicatorBuilder<SeenSoFar<Values_...> >
{
    typedef OneOf<Values_...> Type;
};

template <typename SeenSoFar_, typename Next_, typename... Funcs_>
struct OneOfDeduplicatorBuilder<SeenSoFar_, Next_, Funcs_...>
{
    typedef typename std::conditional<
        AlreadySeen<SeenSoFar_, Next_>::value,
        typename OneOfDeduplicatorBuilder<SeenSoFar_, Funcs_...>::Type,
        typename OneOfDeduplicatorBuilder<typename ExtendSeenSoFar<Next_, SeenSoFar_>::Type, Funcs_...>::Type
            >::type Type;
};

template <typename... Funcs_>
struct OneOfDeduplicator
{
    typedef typename OneOfDeduplicatorBuilder<SeenSoFar<>, Funcs_...>::Type Type;
};

And we can use it:

template <typename... Funcs_>
struct WhenReturnType;

template <typename FirstFunc_, typename... Funcs_>
struct WhenReturnType<FirstFunc_, Funcs_...>
{
    typedef typename std::conditional<
        AllReturnSame<FirstFunc_, Funcs_...>::value,
        typename LambdaParameterTypes<FirstFunc_>::ReturnType,
        typename OneOfDeduplicator<
            typename LambdaParameterTypes<FirstFunc_>::ReturnType,
            typename LambdaParameterTypes<Funcs_>::ReturnType ...>::Type
        >::type Type;
};

Letting us do horrible things like:

struct SomeType
{
};

int main(int, char *[])
{
    OneOf<int, std::string, SomeType> s(123);

    std::cout << when(
            when(s,
                [] (int & x)               -> std::string { return std::string(++x, 'x'); },
                [] (std::string & x)       -> int         { x.append(" spanker"); return x.length(); },
                [] (const SomeType &)      -> int         { return 42; }
                ),
            [] (const int x)           -> int { return x; },
            [] (const std::string & x) -> int { return x.length(); }
            ) << std::endl;

    return EXIT_SUCCESS;
}

You’ll note we’re explicitly specifying the return types. This is because std::string::length() probably doesn’t return an int, and our code does overload resolution on references, which allows conversions for classes but not integral types. Fixing this in a sane way is left as an easy exercise for the reader — one possibility is to see if the return types of all the lambdas have a std::common_type, and using that rather than a OneOf if they do. Implementing this is left to the reader; in the mean time, our code in full looks like:

#include <memory>
#include <type_traits>
#include <utility>

struct UnknownTypeForOneOf;

template <typename Want_, typename... Types_>
struct SelectOneOfType;

template <typename Want_>
struct SelectOneOfType<Want_>
{
    typedef UnknownTypeForOneOf Type;
};

template <typename Want_, typename Try_, typename... Rest_>
struct SelectOneOfType<Want_, Try_, Rest_...>
{
    typedef typename std::conditional<
        std::is_same<Want_, Try_>::value,
        Try_,
        typename SelectOneOfType<Want_, Rest_...>::Type
            >::type Type;
};

template <typename Type_>
struct ParameterTypes;

template <typename C_, typename R_, typename P_>
struct ParameterTypes<R_ (C_::*)(P_)>
{
    typedef P_ FirstParameterType;
    typedef R_ ReturnType;
};

template <typename C_, typename R_, typename P_>
struct ParameterTypes<R_ (C_::*)(P_) const>
{
    typedef P_ FirstParameterType;
    typedef R_ ReturnType;
};

template <typename Lambda_>
struct LambdaParameterTypes
{
    typedef typename ParameterTypes<decltype(&Lambda_::operator())>::FirstParameterType FirstParameterType;
    typedef typename ParameterTypes<decltype(&Lambda_::operator())>::ReturnType ReturnType;
};

template <typename Type_>
struct OneOfVisitorVisit
{
    virtual void visit(Type_ &) = 0;
};

template <typename... Types_>
struct OneOfVisitor :
    OneOfVisitorVisit<Types_>...
{
};

template <typename Visitor_, typename Underlying_, typename Result_, typename... Types_>
struct OneOfVisitorWrapperVisit;

template <typename Visitor_, typename Underlying_, typename Result_>
struct OneOfVisitorWrapperVisit<Visitor_, Underlying_, Result_> :
    Visitor_
{
    Underlying_ & underlying;
    std::function<Result_ ()> execute;

    OneOfVisitorWrapperVisit(Underlying_ & u) :
        underlying(u)
    {
    }
};

template <typename Visitor_, typename Underlying_, typename Result_, typename Type_, typename... Rest_>
struct OneOfVisitorWrapperVisit<Visitor_, Underlying_, Result_, Type_, Rest_...> :
    OneOfVisitorWrapperVisit<Visitor_, Underlying_, Result_, Rest_...>
{
    OneOfVisitorWrapperVisit(Underlying_ & u) :
        OneOfVisitorWrapperVisit<Visitor_, Underlying_, Result_, Rest_...>(u)
    {
    }

    Result_ visit_returning(Type_ & t)
    {
        return this->underlying.visit(t);
    }

    virtual void visit(Type_ & t)
    {
        this->execute = std::bind(&OneOfVisitorWrapperVisit::visit_returning, this, std::ref(t));
    }
};

template <typename Underlying_, typename Result_, typename... Types_>
struct OneOfVisitorWrapper :
    OneOfVisitorWrapperVisit<OneOfVisitor<Types_...>, Underlying_, Result_, Types_...>
{
    OneOfVisitorWrapper(Underlying_ & u) :
        OneOfVisitorWrapperVisit<OneOfVisitor<Types_...>, Underlying_, Result_, Types_...>(u)
    {
    }
};

template <typename... Types_>
struct OneOfValueBase
{
    virtual ~OneOfValueBase() = 0;

    virtual void accept(OneOfVisitor<Types_...> &) = 0;
    virtual void accept(OneOfVisitor<const Types_...> &) const = 0;
};

template <typename... Types_>
OneOfValueBase<Types_...>::~OneOfValueBase() = default;

template <typename Type_, typename... Types_>
struct OneOfValue :
    OneOfValueBase<Types_...>
{
    Type_ value;

    OneOfValue(const Type_ & type) :
        value(type)
    {
    }

    virtual void accept(OneOfVisitor<Types_...> & visitor)
    {
        static_cast<OneOfVisitorVisit<Type_> &>(visitor).visit(value);
    }

    virtual void accept(OneOfVisitor<const Types_...> & visitor) const
    {
        static_cast<OneOfVisitorVisit<const Type_> &>(visitor).visit(value);
    }
};

template <typename... Types_>
class OneOf
{
    private:
        std::unique_ptr<OneOfValueBase<Types_...> > _value;

    public:
        template <typename Type_>
        OneOf(const Type_ & value) :
            _value(new OneOfValue<typename SelectOneOfType<Type_, Types_...>::Type, Types_...>{value})
        {
        }

        OneOf(const OneOf & other) = delete;

        OneOf(OneOf && other) :
            _value(std::move(other._value))
        {
        }

        template <typename Type_>
        OneOf & operator= (const Type_ & value)
        {
            _value.reset(new OneOfValue<typename SelectOneOfType<Type_, Types_...>::Type, Types_...>{value});
            return *this;
        }

        OneOf & operator= (const OneOf & other) = delete;

        OneOf & operator= (OneOf && other)
        {
            _value = std::move(other._value);
            return *this;
        }

        OneOfValueBase<Types_...> & value()
        {
            return *_value;
        }

        const OneOfValueBase<Types_...> & value() const
        {
            return *_value;
        }
};

template <typename Visitor_, typename Result_, typename OneOf_>
struct OneOfVisitorWrapperTypeFinder;

template <typename Visitor_, typename Result_, typename... Types_>
struct OneOfVisitorWrapperTypeFinder<Visitor_, Result_, const OneOf<Types_...> &>
{
    typedef OneOfVisitorWrapper<Visitor_, Result_, const Types_...> Type;
};

template <typename Visitor_, typename Result_, typename... Types_>
struct OneOfVisitorWrapperTypeFinder<Visitor_, Result_, OneOf<Types_...> &>
{
    typedef OneOfVisitorWrapper<Visitor_, Result_, Types_...> Type;
};

template <typename Result_, typename OneOf_, typename Visitor_>
Result_
accept_returning(OneOf_ && one_of, Visitor_ && visitor)
{
    typename OneOfVisitorWrapperTypeFinder<Visitor_, Result_, OneOf_>::Type visitor_wrapper(visitor);
    one_of.value().accept(visitor_wrapper);
    return visitor_wrapper.execute();
}

template <typename OneOf_, typename Visitor_>
void accept(OneOf_ && one_of, Visitor_ && visitor)
{
    accept_returning<void>(one_of, visitor);
}

template <typename Result_, typename... Funcs_>
struct LambdaVisitor;

template <typename Result_>
struct LambdaVisitor<Result_>
{
    void visit(struct NotReallyAType);
};

template <typename Result_, typename Func_, typename... Rest_>
struct LambdaVisitor<Result_, Func_, Rest_...> :
    LambdaVisitor<Result_, Rest_...>
{
    Func_ & func;

    LambdaVisitor(Func_ & f, Rest_ & ... rest) :
        LambdaVisitor<Result_, Rest_...>(rest...),
        func(f)
    {
    }

    Result_ visit(typename LambdaParameterTypes<Func_>::FirstParameterType & v)
    {
        return func(v);
    }

    using LambdaVisitor<Result_, Rest_...>::visit;
};

template <typename... Funcs_>
struct AllReturnSame;

template <typename Func_>
struct AllReturnSame<Func_>
{
    enum { value = true };
};

template <typename A_, typename B_, typename... Funcs_>
struct AllReturnSame<A_, B_, Funcs_...>
{
    enum { value = std::is_same<typename LambdaParameterTypes<A_>::ReturnType, typename LambdaParameterTypes<B_>::ReturnType>::value &&
        AllReturnSame<B_, Funcs_...>::value };
};

template <typename...>
struct SeenSoFar
{
};

template <typename...>
struct ExtendSeenSoFar;

template <typename New_, typename... Current_>
struct ExtendSeenSoFar<New_, SeenSoFar<Current_...> >
{
    typedef SeenSoFar<Current_..., New_> Type;
};

template <typename...>
struct AlreadySeen;

template <typename Query_>
struct AlreadySeen<SeenSoFar<>, Query_>
{
    enum { value = false };
};

template <typename Query_, typename A_, typename... Rest_>
struct AlreadySeen<SeenSoFar<A_, Rest_...>, Query_>
{
    enum { value = std::is_same<Query_, A_>::value || AlreadySeen<SeenSoFar<Rest_...>, Query_>::value };
};

template <typename...>
struct OneOfDeduplicatorBuilder;

template <typename... Values_>
struct OneOfDeduplicatorBuilder<SeenSoFar<Values_...> >
{
    typedef OneOf<Values_...> Type;
};

template <typename SeenSoFar_, typename Next_, typename... Funcs_>
struct OneOfDeduplicatorBuilder<SeenSoFar_, Next_, Funcs_...>
{
    typedef typename std::conditional<
        AlreadySeen<SeenSoFar_, Next_>::value,
        typename OneOfDeduplicatorBuilder<SeenSoFar_, Funcs_...>::Type,
        typename OneOfDeduplicatorBuilder<typename ExtendSeenSoFar<Next_, SeenSoFar_>::Type, Funcs_...>::Type
            >::type Type;
};

template <typename... Funcs_>
struct OneOfDeduplicator
{
    typedef typename OneOfDeduplicatorBuilder<SeenSoFar<>, Funcs_...>::Type Type;
};

template <typename... Funcs_>
struct WhenReturnType;

template <typename FirstFunc_, typename... Funcs_>
struct WhenReturnType<FirstFunc_, Funcs_...>
{
    typedef typename std::conditional<
        AllReturnSame<FirstFunc_, Funcs_...>::value,
        typename LambdaParameterTypes<FirstFunc_>::ReturnType,
        typename OneOfDeduplicator<
            typename LambdaParameterTypes<FirstFunc_>::ReturnType,
            typename LambdaParameterTypes<Funcs_>::ReturnType ...>::Type
        >::type Type;
};

template <typename Val_, typename... Funcs_>
typename WhenReturnType<Funcs_...>::Type
when(Val_ && val, Funcs_ && ... funcs)
{
    LambdaVisitor<typename WhenReturnType<Funcs_...>::Type, Funcs_...> visitor(funcs...);
    return accept_returning<typename WhenReturnType<Funcs_...>::Type>(val, visitor);
}

Filed under: OneOf Tagged: c++, c++0x, lambdas, metaprogramming, templates, variadic templates, visitor pattern
avatar

Reviewing the statistics for WIPUP 27.06.10a.

I decided to delay the statistics-review post for WIPUP 27.06.10a because a recent update to the dashboard now shows view statistics on a daily basis instead of a weekly basis – the results were quite surprising:

As you can see WIPUP is clearly one of our most active projects, with quite a decent kudos:subscriptions:updates ratio compared to the others. However when looking at the activity, we notice something rather interesting – the increase in views is not a sustained increase. It’s a spike whenever there is an update. Looking at my personal WIPSpace we can attribute the initial spike up to almost 200 to the 27.06.10a release itself. The very next day, with no updates, views returned to a pathetic zero.

The even larger spike was quite an oddity. On the 1st of June, one of my less relevant posts (about cooking) was aggregated onto Planet Larry, which sort of explains it (despite a 1 day time lag occuring before the spike) – but further investigation shows that my update about buying the C++ Qt book received a uniqely larger amount of views. I conclude that where the Planet Larry aggregation helped spark some interest, another equally important factor to the spike was that planet readers decided to read what else was on the blog, which was a post which linked to the programming book update – which was obviously a lot more relevant. All the same, very interesting stuff.

An obvious reason behind the non-sustained views is that WIPUP is largly an unknown entity on the web. Hopefully with more updates (which will come!) I can change this behavior into a steady stream.

In relevant news, the upcoming beta of WIPUP is making rapid progress and should be quite a sweet release.

Related posts:

  1. The WIPUP 21.02.10 stats are out.
  2. After the WIPUP release, the stats are in.
  3. WIPUP 19.03.10a – under or overcooked?

Posts for Wednesday, July 14, 2010

Paludis 0.48.3 Released

Paludis 0.48.3 has been released:

  • ‘cave display-resolution’ now indicates when changing the origin repository for an upgrade.
  • ‘cave display-resolution’ now shows dependency reasons correctly.
  • ‘cave resolve !foo’ now applies to all slots of foo, rather than using the –target-slots option.
  • ‘cave uninstall’ will now only update world if every slot has been removed.
  • ‘cave execute-resolution’ will now show pkg_nofetch information where appropriate.
  • ‘cave resume’ on a resume file that includes already-completed uninstalls no longer errors.
  • Uninstalls are now considered for ‘cave execute-resolution –continue-on-failure’ (previously a failed uninstall would not count as a failure).

Filed under: paludis releases Tagged: paludis

Programming with Falcon: Regex

While programming with Falcon your almost always going to want to do something fun. I personally get tired of the stupid posts over and again about how to use strings

// Standard string
ex_str = "this is a string"
// String addition (concat)
ex_str + " ...and still is" = "this is a string ...and still is"
// Literal strings
lit_str = 'Hell I''m a literal string'  // notice the two single quotes to produce one single quote when outputed
// And even funny things like getting "d" from "a"
letter_from_letter = "a" / 3 = "d"

But personally, that kind of stuff bores me. Everybody can do that with just about any language. What about something more interesting? Like regex?

Regex

Let’s say you want to use Regex in your code. Why? I don’t know why you want regex. I don’t really care to be totally honest. And, of course, I’m really bad at coming up with examples. So, let’s use some regular expressions


#!/usr/bin/env falcon
load regex

string = "1234"

if string == Regex("/\d+/")
> "Found it!"
end

The above does basically what you’d expect. If the string contains a series of integers your good to go! But Falcon can do more and better. You can actually create regular expression objects:


#!/usr/bin/env falcon
load regex

string = "1234"
re = Regex("/\d+/")

if re.match( string )
> "Found it!"
// Note here I use printl so I don't have to worry about
// the type returned by re.captured
printl( "I found it, here: ", re.captured(1))
end

Now that is where it starts to become interesting for me. The examples are obviously trivial, but the you can begin the see the power of language forming. Now you can reuse and abuse not to mention your code becomes much cleaner. You might gain a line or two but the power and beauty you get from it are totally worth it.

Enjoy the Penguins!


Posts for Tuesday, July 13, 2010

Clementine: A triumph of Free Software

Ages ago, in the long-forgotten days of 2008, there was Amarok 1.4. And it was good. Then KDE4 came along and Amarok was rewritten, reshaped, becoming something... different. Something unsettling. Something not altogether pleasant.

Fear not. Today we have Clementine.

Clementine

I consider Clementine a triumph of Free Software. A great project fell off the rails, so someone else picked up the pieces, forked it and kept the spirit alive.

Features present

Clementine embodies everything good about Amarok 1.4, in a shiny Qt4 package. The layout is eminently pleasant to use. It uses the classic "spreadsheet" playlist view that saw so much success in Amarok 1.4. If you care about cramming as much information about your music as possible onto the screen, this is as good as it gets. It's boring, and that's a good thing. It gets the job done.

Like Amaork, 1.4, in Clementine you can very quickly drill into your music collection, filter it, view recently added tracks, group songs by artist or album or year or genre or a combination of those things. Clementine also handles all of the edge cases correctly: it lists albums with Various Artists exactly how I'd want (exactly like Amarok 1.4). It correctly handles songs with non-Latin tag text.

Clementine detects additions and changes to my music collection instantly, without the massive scan-lags on startup that plague some other music players. Clementine doesn't bat an eye at my 7,000 song collection. There's no MySQL integration, but I don't need it. Clementine's SQLite backend supposedly handles 300k songs without much problem, which is good enough for me.

Clementine has Last.FM integration. It has three different styles of desktop notification. It has visualizations. It handles USB devices. It understands reply gain. It has cross-fading. It has an equalizer. It has a transcoder. It has a cover manager.

I'm tired of listing features. Let's just say it has every useful feature you'd ever want. And if you don't need a feature, it stays out of your way.

And for a program under such active development, it's rock solid. I have yet to see a crash. And speaking of active development, if you follow the activity in Clementine's SVN repo, you will find that this program is updated almost daily. How the devs find the time, I don't know, but I'm grateful. This program has gone from non-existent to awesome in record time.

Clementine can use gstreamer, so it even works cross-platform. I fired it up on Windows 7 the other day and I was amazed at how good it looked and felt. It supposedly also works on OS X.

Clementine doesn't cook your breakfast for you, but that might be in the works.

How to make a good UI

A perfect example of the polish of Clementine's UI: Tagging. How do you tag a whole album worth of music at once? You can select some songs and right click and go into a dialog, like most music players allow.

Or:

  1. Edit a tag for a single song (inline) by clicking the field. Let's say you edit Artist.
  2. Select multiple songs in your playlist. (Click and drag, CTRL-click, Shift-click, CTRL-A, whatever.)
  3. Right click the Artist tag in the song you edited, select Set Artist to "XXXXX", and now all the songs you selected will be updated.

Clementine

This is the kind of UI innovation that I like. It's simple, it's useful, and it's predictable. You can get things done without going through dialog windows, without a million clicks, without spending a minute scratching your head figuring things out.

(Meanwhile Amarok 2 is busy getting rid of the Stop button and making the volume control circular.)

Features missing

Admittedly, Clementine is missing a couple of features I wouldn't mind having. You can't skin or theme Clementine. You can't rate songs. You can't display song lyrics. You can't "queue" songs. But oh well. I can live without these features because the rest of the program is so darned good. For all I know, these features might pop up next week. I wouldn't be surprised.

The Clementine devs seem to be very friendly and responsive to feature requests and feedback, which is also great.

Clementine is also missing a few features/bloat that I'm glad to see NOT ported from Amarok. Wikipedia integration? Good riddance.

I would pay money for this program.

In November 2009 I had this to say:

(Anyone out there reading this, if you port Amarok 1.4 to Qt4 intact, I will pay you. Seriously. I will pay you money.)

The offer still stands. I will pay money for Clementine. I'm still waiting for a Donate link so I can do so. (Clementine devs, are you reading this?)

Why do I care about this so much? Because I have music playing whenever I'm using this computer, and when you add up work plus free time, I'm at this computer 8-10 hours per day. Music keeps me sane during multi-hour debug sessions. Music is an integral part of my life, and a music app is an integral part of playing music.

It's very important to me that the programs and tools I use all day are comfortable. Otherwise I become cranky. If you were a carpenter, would you want to use a hammer with a wobbly handle all day? I'm a programmer, and I want to use comfortable computer programs.

Clementine is very comfortable.

avatar

Do (Computer) Geeks Really Rule the (Computer) World?

I just read about an interesting development with the latest iPhone. It all started when we found out the iPhone’s antenna is bad. Now we’re finding out AppleCare has informed customers it won’t be fixed (really), Consumer Reports specifically does not recommend the iPhone 4, and Apple forum moderators are deleting threads right and left, which apparently isn’t out of the ordinary, nor will that likely change in the future. (Whew, that was a mouthful!)

If that weren’t enough, Megan McArdle reported about a growing aggravation with Apple’s policies among the geek crowd (surprise, surprise).

Now, what I have to say here is largely speculation. It’s one of those thoughts where you think about it, and it interests you, but you have no real way or need to prove it at the time, but, as time goes on, you start to notice patterns that add leverage to the idea.

It started with Firefox’s growth, which is almost a miracle. I was trying to figure out, during Firefox 2, how in the world it got so popular. Internet Explorer is still the default on every single Windows computer out there, (and that’s quite a few). People have to actually go out there and get Firefox. How in the world did they decide to do that?

In my corner of the universe, there’s only one way Firefox spread around: word of mouth. Word of mouth has to begin somewhere, though. As I read the Wikipedia article on the “History of Mozilla Firefox,” it became clear that the initial audience was the computer geek, until later versions. If they knew about it first, and they told all their friends, and the word kept spreading because the geeks kept supporting it, then that might explain Firefox’s popularity.

And then there’s the growing hatred of Windows. I’m sure many of you who are reading this blog are Linux users and have spouted your hatred of Windows, like I have, at every polite and appropriate moment possible. That, in combination with Apple’s increased focus (at least in the past) on pleasing developers (us geeks again!), can mostly explain Apple’s success.

There’s probably a few other examples I’m missing that I’ve discovered over the years. Nothing here is conclusive, and with something so socially oriented, I’m not sure there’s ever going to be proofs. Nevertheless, the patterns are definitely there. Besides, people tend to look up to the folks who deal with technology, when it comes to tech advice. It only makes sense, and that alone adds to the power of the computer geek

My answer to the “Do Geeks Really Rule the World?” question, then, is “mostly”.

It almost doesn’t surprise me, then, that, if Apple is getting slammed by geeks for their restrictive policies against developers and geeks, their popularity would begin to fall among everyone else.

What do you think? Am I going a bit overboard?


Generic Lambda Visitors, or Writing Haskell in C++0x (Part 3)

In parts one and two, we built up a OneOf holder that could contain one of a number of different types of values, and developed a flexible variation on the visitor pattern to do things with the underlying values. Now we’ll adapt that to use lambdas.

We’ll need some helper classes so we can figure out the parameter types and return type of a lambda. A lambda itself is just fancy syntax for a struct with an operator(), so combined with the new decltype operator we can do this:

template <typename Type_>
struct ParameterTypes;

template <typename C_, typename R_, typename P_>
struct ParameterTypes<R_ (C_::*)(P_)>
{
    typedef P_ FirstParameterType;
    typedef R_ ReturnType;
};

template <typename C_, typename R_, typename P_>
struct ParameterTypes<R_ (C_::*)(P_) const>
{
    typedef P_ FirstParameterType;
    typedef R_ ReturnType;
};

template <typename Lambda_>
struct LambdaParameterTypes
{
    typedef typename ParameterTypes<decltype(&Lambda_::operator())>::FirstParameterType FirstParameterType;
    typedef typename ParameterTypes<decltype(&Lambda_::operator())>::ReturnType ReturnType;
};

Then we can use this to implement an adapter that turns a number of lambda functions into a visitable class. For starters we’ll ignore the return type:

template <typename... Funcs_>
struct LambdaVisitor;

template <>
struct LambdaVisitor<>
{
    void visit(struct NotReallyAType);
};

template <typename Func_, typename... Rest_>
struct LambdaVisitor<Func_, Rest_...> :
    LambdaVisitor<Rest_...>
{
    Func_ & func;

    LambdaVisitor(Func_ & f, Rest_ & ... rest) :
        LambdaVisitor<Rest_...>(rest...),
        func(f)
    {
    }

    void visit(typename LambdaParameterTypes<Func_>::FirstParameterType & v)
    {
        func(v);
    }

    using LambdaVisitor<Rest_...>::visit;
};

Note the using statement, so all the visit methods defined have equal visibility for overload resolution. Also note how we define a dummy visit method in the base class, since it’s illegal to using something that doesn’t exist.

Incidentally, so far as I can see we have to use the “lots of inheritance” technique here. We can’t inherit using the ... operator, since there doesn’t seem to be a way of applying the ... operator to a using statement. Thus we can’t quite do something like:

template <typename Func_>
struct LambdaVisitorVisit
{
    Func_ func;

    LambdaVisitorVisit(Func_ & f) :
        func(f)
    {
    }

    void visit(typename LambdaParameterTypes<Func_>::FirstParameterType & v)
    {
        func(v);
    }
};

template <typename... Funcs_>
struct LambdaVisitor :
    LambdaVisitorVisit<Funcs_>...
{
    LambdaVisitor(Funcs_ & ... funcs) :
        LambdaVisitorVisit<Funcs_>(funcs)...
    {
    }

    /* Can't do this: */
    using LambdaVisitorVisit<Funcs_>::visit...;
};

I’m not sure whether this is an intentional omission from the standard, or an oversight, or whether it’s just not considered useful.

That aside, now we just have to concoct the when function:

template <typename Val_, typename... Funcs_>
void when(Val_ && val, Funcs_ && ... funcs)
{
    accept(val, LambdaVisitor<Funcs_...>(funcs...));
}

And here we go:

int main(int, char *[])
{
    OneOf<int, std::string> s(123);

    when(s,
            [] (int & x)               { std::cout << "int " << ++x << std::endl; },
            [] (const std::string & x) { std::cout << "string " << x << std::endl; }
        );

    const OneOf<int, std::string> t(std::string("monkey"));

    when(t,
            [] (const int & x)         { std::cout << "int " << x << std::endl; },
            [] (const std::string & x) { std::cout << "string " << x << std::endl; }
        );

    return EXIT_SUCCESS;
}

What about return types? We could switch to using when_returning, but there’s another option: we could use the return type of the first lambda as our return type instead.

template <typename Result_, typename... Funcs_>
struct LambdaVisitor;

template <typename Result_>
struct LambdaVisitor<Result_>
{
    void visit(struct NotReallyAType);
};

template <typename Result_, typename Func_, typename... Rest_>
struct LambdaVisitor<Result_, Func_, Rest_...> :
    LambdaVisitor<Result_, Rest_...>
{
    Func_ & func;

    LambdaVisitor(Func_ & f, Rest_ & ... rest) :
        LambdaVisitor<Result_, Rest_...>(rest...),
        func(f)
    {
    }

    Result_ visit(typename LambdaParameterTypes<Func_>::FirstParameterType & v)
    {
        return func(v);
    }

    using LambdaVisitor<Result_, Rest_...>::visit;
};

template <typename Val_, typename FirstFunc_, typename... Rest_>
typename LambdaParameterTypes<FirstFunc_>::ReturnType
when(Val_ && val, FirstFunc_ && first_func, Rest_ && ... rest)
{
    return accept_returning<typename LambdaParameterTypes<FirstFunc_>::ReturnType>(
            val,
            LambdaVisitor<typename LambdaParameterTypes<FirstFunc_>::ReturnType, FirstFunc_, Rest_...>(first_func, rest...));
}

And using it:

int main(int, char *[])
{
    OneOf<int, std::string> s(123);

    std::cout << when(s,
            [] (const int x)           { return x; },
            [] (const std::string & x) { return x.length(); }
        ) << std::endl;

    return EXIT_SUCCESS;
}

There we have it: a useful, highly generic alternative to visitors that makes use of lambdas. The code in full:

#include <string>
#include <memory>
#include <type_traits>
#include <utility>
#include <cstdlib>
#include <iostream>

struct UnknownTypeForOneOf;

template <typename Want_, typename... Types_>
struct SelectOneOfType;

template <typename Want_>
struct SelectOneOfType<Want_>
{
    typedef UnknownTypeForOneOf Type;
};

template <typename Want_, typename Try_, typename... Rest_>
struct SelectOneOfType<Want_, Try_, Rest_...>
{
    typedef typename std::conditional<
        std::is_same<Want_, Try_>::value,
        Try_,
        typename SelectOneOfType<Want_, Rest_...>::Type
            >::type Type;
};

template <typename Type_>
struct ParameterTypes;

template <typename C_, typename R_, typename P_>
struct ParameterTypes<R_ (C_::*)(P_)>
{
    typedef P_ FirstParameterType;
    typedef R_ ReturnType;
};

template <typename C_, typename R_, typename P_>
struct ParameterTypes<R_ (C_::*)(P_) const>
{
    typedef P_ FirstParameterType;
    typedef R_ ReturnType;
};

template <typename Lambda_>
struct LambdaParameterTypes
{
    typedef typename ParameterTypes<decltype(&Lambda_::operator())>::FirstParameterType FirstParameterType;
    typedef typename ParameterTypes<decltype(&Lambda_::operator())>::ReturnType ReturnType;
};

template <typename Type_>
struct OneOfVisitorVisit
{
    virtual void visit(Type_ &) = 0;
};

template <typename... Types_>
struct OneOfVisitor :
    OneOfVisitorVisit<Types_>...
{
};

template <typename Visitor_, typename Underlying_, typename Result_, typename... Types_>
struct OneOfVisitorWrapperVisit;

template <typename Visitor_, typename Underlying_, typename Result_>
struct OneOfVisitorWrapperVisit<Visitor_, Underlying_, Result_> :
    Visitor_
{
    Underlying_ & underlying;
    std::function<Result_ ()> execute;

    OneOfVisitorWrapperVisit(Underlying_ & u) :
        underlying(u)
    {
    }
};

template <typename Visitor_, typename Underlying_, typename Result_, typename Type_, typename... Rest_>
struct OneOfVisitorWrapperVisit<Visitor_, Underlying_, Result_, Type_, Rest_...> :
    OneOfVisitorWrapperVisit<Visitor_, Underlying_, Result_, Rest_...>
{
    OneOfVisitorWrapperVisit(Underlying_ & u) :
        OneOfVisitorWrapperVisit<Visitor_, Underlying_, Result_, Rest_...>(u)
    {
    }

    Result_ visit_returning(Type_ & t)
    {
        return this->underlying.visit(t);
    }

    virtual void visit(Type_ & t)
    {
        this->execute = std::bind(&OneOfVisitorWrapperVisit::visit_returning, this, std::ref(t));
    }
};

template <typename Underlying_, typename Result_, typename... Types_>
struct OneOfVisitorWrapper :
    OneOfVisitorWrapperVisit<OneOfVisitor<Types_...>, Underlying_, Result_, Types_...>
{
    OneOfVisitorWrapper(Underlying_ & u) :
        OneOfVisitorWrapperVisit<OneOfVisitor<Types_...>, Underlying_, Result_, Types_...>(u)
    {
    }
};

template <typename... Types_>
struct OneOfValueBase
{
    virtual ~OneOfValueBase() = 0;

    virtual void accept(OneOfVisitor<Types_...> &) = 0;
    virtual void accept(OneOfVisitor<const Types_...> &) const = 0;
};

template <typename... Types_>
OneOfValueBase<Types_...>::~OneOfValueBase() = default;

template <typename Type_, typename... Types_>
struct OneOfValue :
    OneOfValueBase<Types_...>
{
    Type_ value;

    OneOfValue(const Type_ & type) :
        value(type)
    {
    }

    virtual void accept(OneOfVisitor<Types_...> & visitor)
    {
        static_cast<OneOfVisitorVisit<Type_> &>(visitor).visit(value);
    }

    virtual void accept(OneOfVisitor<const Types_...> & visitor) const
    {
        static_cast<OneOfVisitorVisit<const Type_> &>(visitor).visit(value);
    }
};

template <typename... Types_>
class OneOf
{
    private:
        std::unique_ptr<OneOfValueBase<Types_...> > _value;

    public:
        template <typename Type_>
        OneOf(const Type_ & value) :
            _value(new OneOfValue<typename SelectOneOfType<Type_, Types_...>::Type, Types_...>{value})
        {
        }

        OneOf(const OneOf & other) = delete;

        OneOf(OneOf && other) :
            _value(std::move(other._value))
        {
        }

        template <typename Type_>
        OneOf & operator= (const Type_ & value)
        {
            _value.reset(new OneOfValue<typename SelectOneOfType<Type_, Types_...>::Type, Types_...>{value});
            return *this;
        }

        OneOf & operator= (const OneOf & other) = delete;

        OneOf & operator= (OneOf && other)
        {
            _value = std::move(other._value);
            return *this;
        }

        OneOfValueBase<Types_...> & value()
        {
            return *_value;
        }

        const OneOfValueBase<Types_...> & value() const
        {
            return *_value;
        }
};

template <typename Visitor_, typename Result_, typename OneOf_>
struct OneOfVisitorWrapperTypeFinder;

template <typename Visitor_, typename Result_, typename... Types_>
struct OneOfVisitorWrapperTypeFinder<Visitor_, Result_, const OneOf<Types_...> &>
{
    typedef OneOfVisitorWrapper<Visitor_, Result_, const Types_...> Type;
};

template <typename Visitor_, typename Result_, typename... Types_>
struct OneOfVisitorWrapperTypeFinder<Visitor_, Result_, OneOf<Types_...> &>
{
    typedef OneOfVisitorWrapper<Visitor_, Result_, Types_...> Type;
};

template <typename Result_, typename OneOf_, typename Visitor_>
Result_
accept_returning(OneOf_ && one_of, Visitor_ && visitor)
{
    typename OneOfVisitorWrapperTypeFinder<Visitor_, Result_, OneOf_>::Type visitor_wrapper(visitor);
    one_of.value().accept(visitor_wrapper);
    return visitor_wrapper.execute();
}

template <typename OneOf_, typename Visitor_>
void accept(OneOf_ && one_of, Visitor_ && visitor)
{
    accept_returning<void>(one_of, visitor);
}

template <typename Result_, typename... Funcs_>
struct LambdaVisitor;

template <typename Result_>
struct LambdaVisitor<Result_>
{
    void visit(struct NotReallyAType);
};

template <typename Result_, typename Func_, typename... Rest_>
struct LambdaVisitor<Result_, Func_, Rest_...> :
    LambdaVisitor<Result_, Rest_...>
{
    Func_ & func;

    LambdaVisitor(Func_ & f, Rest_ & ... rest) :
        LambdaVisitor<Result_, Rest_...>(rest...),
        func(f)
    {
    }

    Result_ visit(typename LambdaParameterTypes<Func_>::FirstParameterType & v)
    {
        return func(v);
    }

    using LambdaVisitor<Result_, Rest_...>::visit;
};

template <typename Val_, typename FirstFunc_, typename... Rest_>
typename LambdaParameterTypes<FirstFunc_>::ReturnType
when(Val_ && val, FirstFunc_ && first_func, Rest_ && ... rest)
{
    return accept_returning<typename LambdaParameterTypes<FirstFunc_>::ReturnType>(
            val,
            LambdaVisitor<typename LambdaParameterTypes<FirstFunc_>::ReturnType, FirstFunc_, Rest_...>(first_func, rest...));
}

In part four we’ll start do some clever things with the return type, mostly because we can, rather than because it’s definitely useful.


Filed under: OneOf Tagged: c++, c++0x, lambdas, metaprogramming, templates, variadic templates, visitor pattern

iSSH – A Great iPhone App

iSSH (App Store link) is a Secured SHell client for iPhone that works over any data network, cellular or WiFi. It is easily one of my favorite apps for a number of reasons. The biggest reason is obvious: I can remotely manage my server(s) from literally anywhere. Other reasons include a very intuitive user interface and the ability to use pretty much any meta key you would ever need.

Here is me watching someone’s NetHack game in iSSH:

The first thing you will notice about iSSH is that the keyboard is transparent so that you can see more lines of terminal and still be able to type.

Here is a similar screen of a server running htop:

Another great feature of iSSH is multitasking. iSSH can stay logged in for up to ten minutes after you leave the application (10 minutes is Apple’s limit). This is great for quickly switching between other applications that you’re using. iSSH will send you a message alerting you that your connections will be closed if you don’t reopen the application within a few minutes of the ten minute limit.

SSH isn’t the only thing that iSSH can do. It can handle VNC connections for viewing servers and desktops (of any OS) remotely. The new version of iSSH can even set up tunnels through SSH so that you can take ports from the connected host and forward them to your iPhone locally.

For instance, I have forwarded port 80 from my server to my iPhone on port 8080. Note that it looks like I’m running that server locally:

iSSH is undoubtedly one of the most feature-filled SSH/VNC/Telnet client for iPhone. It is definitely one of my favorite apps. I highly recommend it. It is updated on what seems to be a bi-weekly basis with new features and bug fixes. iSSH is easily the most frequently updated iPhone app that I own.

The only drawback is the price: $9.99 USD. If you ask me, iSSH is worth $50.

The full feature list:

• Multitasking support – can hold connections open for up to 10 minutes
• Arbitrary port tunneling through SSH
• Tunneled VNC client (or unencrypted using a “raw” connection)
• Tunneled X server
• Support for any non-standard ssh or vnc port
• Multiple simultaneous connections
• iPhone 3.0 cut and paste (double tap in console)
• Nearly every encoding supported
• Reachability notification
• Features a VGA BBS ANSI compatible font
• Portrait and landscape mode
• 53×24 or 80×24 (user configurable on iPad) with scroll-back buffer
• Arrow keys (by pop-up or by toolbar). Ctrl, alt, esc, tab, shift, Fn keys (1-10), ` key, all in combination.
• Keys are highlighted to confirm combination.
• Store any number of connections and configurations
• Command execution on connection
• Transparent keyboard
• Four (six on iPad) selectable monospaced fonts
• Via EDGE, WiFi or 3G
• RSA and DSA key generation and exchange via email, password-connected SSH or pasteboard
• Can support “uninterrupted” connections via GNU Screen and the command option.
• Transfers public keys automatically, without needing to email first.
• Pinch zooming in VNC/X11
• As good or better bluetooth/dock keyboard support than other VNC/RDP and console apps in the App Store! (Please follow iTunes support link for details.)

Posts for Monday, July 12, 2010

Vim and plaintext data files

I use Vim to work with plaintext datasets. Here are some habits and code snippets I've picked up which make data files a bit easier to deal with.

Tab-delimited files

If you have a file full of tab-separated values, the columns may not line up very well due to the variable display-length of tabs. You'll often end up with something that looks like this:

foo bar
longvaluehere    quux

But things will line up if you set tabstop to a value greater than the longest column.

:setlocal tabstop=16

Then it'll look like this:

foo             bar
longvaluehere   quux

Now you can use visual-block mode to add and remove columns easily.

If you're working with TSV files, you'll want to set the listchars option in such a way that tabs are displayed specially. I use this, stolen/adapted from here:

:set listchars=eol:\ ,tab:»-,trail:·,precedes:…,extends:…,nbsp:‗

Remember also that you can insert a literal tab into your file no matter what you have expandtab set to, by hitting CTRL-v first.

Those things are just about all I need to work with TSV files comfortably.

Dealing with long lines

Sometimes I'll run some data through a statistics program and it'll tell me "Invalid character at line 576, column 9438". How do you jump to that location in your data? Easily enough in normal mode:

576G9438|

G jumps you to a line, | jumps you to a column. A good mnemonic to remember these: "g = goto line" and "| looks like a column".

Suppose (like me) you turn scrollbars off in your GVim window, or you work from a terminal. How do you scroll horizontally? With zL and zH. But those are hard to type and harder to remember, so I map them to CTRL+Shift+Right and Left.

nnoremap <C-S-Right> zL
nnoremap <C-S-Left> zH

If you have a lot of long lines, it behooves you to set up a mapping to turn soft line-wrapping on and off quickly:

nnoremap <Leader>w :setlocal nowrap!<CR>

Now you can jump back and forth between wrapped and unwrapped views of your data via \w.

Diffs on long lines

When using vimdiff, ]c and [c will jump you to lines that contain differences. But if your lines are thousands of characters long, it doesn't always help to know that two lines are different: you want to know where they differ. ]c puts you at the first column in the line, which isn't helpful.

This function will jump you to the column where the difference between two lines starts:

function! IsDiff(col)
    let hlID = diff_hlID(".", a:col)
    return hlID == 24
endfunction

function! FindDiffOnLine()
    let c = 1
    while c < col("$")
        if IsDiff(c)
            call cursor(".", c)
            return
        endif
        let c += 1
    endwhile
endfunction

This seems fragile, so if you know a better way please leave a comment and let me know.

Diff a buffer against itself

Sometimes I have files with lots of lines of data, and some lines might be duplicate. It's really easy to get rid of the duplicate lines:

:sort u

But what if I want to see what lines were just removed? I don't know of a built-in way to do it, so I use this simple function:

function! MarkDuplicateLines()
    let x = {}
    let count_dupes = 0
    for lnum in range(1, line('$'))
        let line = getline(lnum)
        if has_key(x, line)
            exe lnum . 'norm I *****'
            let count_dupes += 1
        else
            let x[line] = 1
        endif
    endfor
    echomsg count_dupes . " dupe(s) found"
endfunction

That'll put a bunch of asterisks at the beginning of every line that's a duplicate of a previous line.

Fix broken punctuation

Ever had to work with textual data that someone else sent you in MS Word or Excel? I have. It hurts. If you copy/paste from an MS document into a plaintext file, you'll probably end up with a bunch of funky unreadable or undisplayable characters.

This function fixes the most common Word-spawned "smart" punctuation characters that you're likely to run across:

function! FixInvisiblePunctuation()
    silent! %s/\%u2018/'/g
    silent! %s/\%u2019/'/g
    silent! %s/\%u2026/.../g
    silent! %s/\%uf0e0/->/g
    silent! %s/\%u0092/'/g
    silent! %s/\%u2013/--/g
    silent! %s/\%u2014/--/g
    silent! %s/\%u201C/"/g
    silent! %s/\%u201D/"/g
    silent! %s/\%u0052\%u20ac\%u2122/'/g
    silent! %s/\%ua0/ /g
    retab
endfunction

This function was built up during years of choking on special characters that crept up in my data, so I'm sure I'll be adding more to it in the future.

\%u#### here is a regex escape to represent a character as a hexidecimal Unicode codepoint.

Many of these characters show up as blank (whitespace) if you lack the font to display them. If you ever run across a character you can't see and you want to inspect it, put the cursor on it and do

:ascii

That's a bad name for the command, since it works on Unicode characters too. That'll give you the numeric code you can use in a regex to replace them all with something sane.

avatar

“I like bug 327809″: Let’s make revdep-rebuild obsolete

That’s a quote from solar. I also like this bug. I really like it. I think it’s one of the best bug reports I’ve ever seen.

(First, a disclaimer: I’m not a Gentoo developer, I’ve only just started any kind of serious portage work, and have had quite a few stupid ideas at this point. Take what I have to say with a healthy pinch of salt.)

First of all, here’s the bug link: https://bugs.gentoo.org/327809

Here’s a highlight of what I think makes this bug so impressive, and my comments on it:

  • “A first step would of course be to implement and have a working –as-needed build environment. This is mostly a matter of developer buy-in rather than anything else.” That’s a quote from spider in the beginning of the bug report. Not only is this the best bug report ever, but it depends on what is possibly the most needed fix in Gentoo. (It’s caused at least the libpng fiasco.) I’ve yet to hear any argument against this fix, and it’s wonderful that there’s finally something convincing to push this fix forward.
  • “First, we build a pair list of the form ‘file dependency_file’ for our package, this list gets shipped inside the installation path list. … second, we also generate a backwards-resolved list of ‘Dependend files => currently installed package owning it’ to be stored for QA reasons.” Basically, this means that any files that are required by a binary to operate will be scanned and recorded into the first list, and the packages that own that file will be recorded into the second list. The first list essentially eliminates revdep-rebuild’s first function, which is to scan all binaries for their dependencies. The second list could be used later by portage for an extra feature, which would warn about packages that use other libraries but never mention it in their DEPEND variables. Brilliant!
  • “for both binary and source installation, we can at a pre_install step check if our package (binary or source) is currently unsupportable by the current system….” This check also occurs before removing files, spider said. This no doubt also applies to upgrades, where binaries are replaced by newer ones. This check, as far as I can tell, will eliminate almost all breaks that revdep-rebuild is used to catch. That’s what this is all about.

CC yourself. Add votes. Let’s get this thing in portage!


Posts for Sunday, July 11, 2010

Generic Lambda Visitors, or Writing Haskell in C++0x (Part 2)

In part one, we created a basic OneOf type which allowed us to store one of a specific list of types, and we created a simple visitor framework for it. It allowed us to write a visitor that looks like:

struct IntStringVisitor :
    OneOfVisitor<int, std::string>
{
    virtual void visit(const int & x)
    {
        std::cout << "int " << x << std::endl;
    }

    virtual void visit(const std::string & x)
    {
        std::cout << "string " << x << std::endl;
    }
};

This isn’t very flexible. What if we want to visit by non-const reference, or by value? What if we want to make our visit methods const? What if we want to return something?

Because the visitor pattern relies upon virtual methods, and because virtual template methods don’t make sense, we need to keep the same basic underlying pattern. But how we present this to the programmer can change: rather than requiring visitors to be derived from OneOfVisitor, we can accept visitors of an arbitrary class, and create a OneOfVisitor implementation that, for each visit method, calls the appropriate visit method in the wrapper.

Here’s what our wrapper looks like:

template <typename Visitor_, typename Underlying_, typename... Types_>
struct OneOfVisitorWrapperVisit;

template <typename Visitor_, typename Underlying_>
struct OneOfVisitorWrapperVisit<Visitor_, Underlying_> :
    Visitor_
{
    Underlying_ & underlying;

    OneOfVisitorWrapperVisit(Underlying_ & u) :
        underlying(u)
    {
    }
};

template <typename Visitor_, typename Underlying_, typename Type_, typename... Rest_>
struct OneOfVisitorWrapperVisit<Visitor_, Underlying_, Type_, Rest_...> :
    OneOfVisitorWrapperVisit<Visitor_, Underlying_, Rest_...>
{
    OneOfVisitorWrapperVisit(Underlying_ & u) :
        OneOfVisitorWrapperVisit<Visitor_, Underlying_, Rest_...>(u)
    {
    }

    virtual void visit(Type_ & t)
    {
        this->underlying.visit(t);
    }
};

template <typename Underlying_, typename... Types_>
struct OneOfVisitorWrapper :
    OneOfVisitorWrapperVisit<OneOfVisitor<Types_...>, Underlying_, Types_...>
{
    OneOfVisitorWrapper(Underlying_ & u) :
        OneOfVisitorWrapperVisit<OneOfVisitor<Types_...>, Underlying_, Types_...>(u)
    {
    }
};

Something to note here: we’re saying “make me a visit method that takes the first item in our type list as a parameter, and also inherit from OneOfVisitorWrapperVisit with the remainder of our type list”. Then we use the leaf class to inherit from the original abstract visitor class we’re implementing, and also to handle holding the underlying type.

This is slightly more convoluted than how we handled type lists for OneOfVisitor, where we used expansion instead:

template <typename... Types_>
struct OneOfVisitor :
    OneOfVisitorVisit<Types_>...
{
};

There we start to see what the ... operator really means: it says “expand the pattern to our left once for each item in the specified type list, replacing the type list name with the value of the expansion each time, and then join the whole lot together with commas”.

Which technique is better? It depends. The expansion form is more compact. However, if we used that for our visitor wrapper, we’d then have to have some additional way of inheriting from the abstract visitor, and of handling the underlying type. This would involve either virtual base classes or lots of extremely icky casting.

We also need to modify our accept method to create the wrapper:

template <typename... Types_>
class OneOf
{
    public:
        template <typename Visitor_>
        void accept(Visitor_ & visitor) const
        {
            OneOfVisitorWrapper<Visitor_, Types_...> visitor_wrapper(visitor);
            _value->accept(visitor_wrapper);
        }
};

Now we can make our visitor more flexible:

struct IntStringVisitor
{
    void visit(const int x) const
    {
        std::cout << "int " << x << std::endl;
    }

    void visit(std::string & x)
    {
        x.append("-spanker");
        std::cout << "string " << x << std::endl;
    }
};

int main(int, char *[])
{
    OneOf<int, std::string> s(std::string("monkey"));

    IntStringVisitor visitor;
    s.accept(visitor);
    s.accept(visitor);

    return EXIT_SUCCESS;
}

This has another neat implication:

struct Base { };
struct FirstDerived : Base { };
struct SecondDerived : Base { };
struct ThirdDerived : Base { };

struct DerivedVisitor
{
    void visit(const int x)
    {
        std::cout << "int " << x << std::endl;
    }

    void visit(const Base &)
    {
        std::cout << "derived from base" << std::endl;
    }

    void visit(const ThirdDerived &)
    {
        std::cout << "third derived" << std::endl;
    }
};

int main(int, char *[])
{
    OneOf<int, FirstDerived, SecondDerived, ThirdDerived> s(123);
    DerivedVisitor visitor;

    s.accept(visitor);

    s = FirstDerived();
    s.accept(visitor);

    s = SecondDerived();
    s.accept(visitor);

    s = ThirdDerived();
    s.accept(visitor);

    return EXIT_SUCCESS;
}

If some of the visitable items are derived from the same base class, we can just visit that base class rather than each of the derived items. We can also implement specific visit methods for some of the derived classes, whilst falling back to visiting the base class for any in which we don’t have a specific interest. This uses the normal overload selection rules, except decided at runtime rather than compile time, giving us a much more flexible kind of dynamic dispatch than conventional visitors.

What about return types? Well, we could store the result in our visitor wrapper:

template <typename Visitor_, typename Underlying_, typename Result_, typename... Types_>
struct OneOfVisitorWrapperVisit;

template <typename Visitor_, typename Underlying_, typename Result_>
struct OneOfVisitorWrapperVisit<Visitor_, Underlying_, Result_> :
    Visitor_
{
    Underlying_ & underlying;
    Result_ result;

    OneOfVisitorWrapperVisit(Underlying_ & u) :
        underlying(u)
    {
    }
};

template <typename Visitor_, typename Underlying_, typename Result_, typename Type_, typename... Rest_>
struct OneOfVisitorWrapperVisit<Visitor_, Underlying_, Result_, Type_, Rest_...> :
    OneOfVisitorWrapperVisit<Visitor_, Underlying_, Result_, Rest_...>
{
    OneOfVisitorWrapperVisit(Underlying_ & u) :
        OneOfVisitorWrapperVisit<Visitor_, Underlying_, Result_, Rest_...>(u)
    {
    }

    virtual void visit(Type_ & t)
    {
        this->result = this->underlying.visit(t);
    }
};

template <typename Underlying_, typename Result_, typename... Types_>
struct OneOfVisitorWrapper :
    OneOfVisitorWrapperVisit<OneOfVisitor<Types_...>, Underlying_, Result_, Types_...>
{
    OneOfVisitorWrapper(Underlying_ & u) :
        OneOfVisitorWrapperVisit<OneOfVisitor<Types_...>, Underlying_, Result_, Types_...>(u)
    {
    }
};

Then we could add a new accept_returning method:

template <typename... Types_>
class OneOf
{
    public:
        template <typename Result_, typename Visitor_>
        Result_ accept_returning(Visitor_ & visitor) const
        {
            OneOfVisitorWrapper<Visitor_, Result_, Types_...> visitor_wrapper(visitor);
            _value->accept(visitor_wrapper);
            return visitor_wrapper.result;
        }
};

And use it thusly:

struct IntStringVisitor
{
    int visit(const int x) const
    {
        return x;
    }

    int visit(const std::string & s) const
    {
        return s.length();
    }
};

int main(int, char *[])
{
    OneOf<int, std::string> s(123);
    IntStringVisitor visitor;
    std::cout << s.accept_returning<int>(visitor) << std::endl;

    return EXIT_SUCCESS;
}

Except now we’ve broken the void case, so we need to add some specialisations:

template <typename Visitor_, typename Underlying_>
struct OneOfVisitorWrapperVisit<Visitor_, Underlying_, void> :
    Visitor_
{
    Underlying_ & underlying;

    OneOfVisitorWrapperVisit(Underlying_ & u) :
        underlying(u)
    {
    }
};

template <typename Visitor_, typename Underlying_, typename Type_, typename... Rest_>
struct OneOfVisitorWrapperVisit<Visitor_, Underlying_, void, Type_, Rest_...> :
    OneOfVisitorWrapperVisit<Visitor_, Underlying_, void, Rest_...>
{
    OneOfVisitorWrapperVisit(Underlying_ & u) :
        OneOfVisitorWrapperVisit<Visitor_, Underlying_, void, Rest_...>(u)
    {
    }

    virtual void visit(Type_ & t)
    {
        this->underlying.visit(t);
    }
};

And we need to fix the old accept method:

template <typename... Types_>
class OneOf
{
    public:
        template <typename Visitor_>
        void accept(Visitor_ & visitor) const
        {
            OneOfVisitorWrapper<Visitor_, void, Types_...> visitor_wrapper(visitor);
            _value->accept(visitor_wrapper);
        }
};

This isn’t pretty. It also assumes that our result type is regular (specifically, default constructible and copyable), which is much too tight. We can approach this another way: rather than having our visitor wrapper directly call the underlying method and store the result, we can have it create a bound function.

template <typename Visitor_, typename Underlying_, typename Result_, typename... Types_>
struct OneOfVisitorWrapperVisit;

template <typename Visitor_, typename Underlying_, typename Result_>
struct OneOfVisitorWrapperVisit<Visitor_, Underlying_, Result_> :
    Visitor_
{
    Underlying_ & underlying;
    std::function<Result_ ()> execute;

    OneOfVisitorWrapperVisit(Underlying_ & u) :
        underlying(u)
    {
    }
};

template <typename Visitor_, typename Underlying_, typename Result_, typename Type_, typename... Rest_>
struct OneOfVisitorWrapperVisit<Visitor_, Underlying_, Result_, Type_, Rest_...> :
    OneOfVisitorWrapperVisit<Visitor_, Underlying_, Result_, Rest_...>
{
    OneOfVisitorWrapperVisit(Underlying_ & u) :
        OneOfVisitorWrapperVisit<Visitor_, Underlying_, Result_, Rest_...>(u)
    {
    }

    Result_ visit_returning(Type_ & t)
    {
        return this->underlying.visit(t);
    }

    virtual void visit(Type_ & t)
    {
        this->execute = std::bind(&OneOfVisitorWrapperVisit::visit_returning, this, std::ref(t));
    }
};

template <typename Underlying_, typename Result_, typename... Types_>
struct OneOfVisitorWrapper :
    OneOfVisitorWrapperVisit<OneOfVisitor<Types_...>, Underlying_, Result_, Types_...>
{
    OneOfVisitorWrapper(Underlying_ & u) :
        OneOfVisitorWrapperVisit<OneOfVisitor<Types_...>, Underlying_, Result_, Types_...>(u)
    {
    }
};

Then we can modify our accept methods:

template <typename... Types_>
class OneOf
{
    public:
        void accept(Visitor_ & visitor) const
        {
            OneOfVisitorWrapper<Visitor_, void, Types_...> visitor_wrapper(visitor);
            _value->accept(visitor_wrapper);
            visitor_wrapper.execute();
        }

        template <typename Result_, typename Visitor_>
        Result_ accept_returning(Visitor_ & visitor) const
        {
            OneOfVisitorWrapper<Visitor_, Result_, Types_...> visitor_wrapper(visitor);
            _value->accept(visitor_wrapper);
            return visitor_wrapper.execute();
        }
};

But wait, there’s more! In C++0x an expression of the form “return func();” is legal inside a template function if func and our function both ‘return’ void. So we can rewrite accept as:

template <typename... Types_>
class OneOf
{
    public:
        template <typename Visitor_>
        void accept(Visitor_ & visitor) const
        {
            return accept_returning<void>(visitor);
        }
};

We can now deal with return types that are only moveable:

struct IckyType
{
    int value;

    explicit IckyType(const int x) :
        value(x)
    {
    }

    IckyType(IckyType && other) :
        value(other.value)
    {
    }

    IckyType(const IckyType &) = delete;
    IckyType & operator= (const IckyType &) = delete;
};

struct IckyVisitor
{
    IckyType visit(const int i)
    {
        return IckyType{i};
    }

    IckyType visit(const std::string & s)
    {
        return IckyType{s.length()};
    }
};

int main(int, char *[])
{
    OneOf<int, std::string> s(123);
    IckyVisitor visitor;

    std::cout << s.accept_returning<IckyType>(visitor).value << std::endl;

    return EXIT_SUCCESS;
}

Up until now we’ve mostly ignored constness. A visitor must be non-const, even if all its visit methods are const, which is an arbitrary restriction. More importantly, a visitor can currently modify the value stored in a const OneOf, which shouldn’t be allowed.

First up, let’s change how we get at the value for OneOf:

template <typename... Types_>
class OneOf
{
    private:
        std::unique_ptr<OneOfValueBase<Types_...> > _value;

    public:
        template <typename Type_>
        OneOf(const Type_ & value) :
            _value(new OneOfValue<typename SelectOneOfType<Type_, Types_...>::Type, Types_...>{value})
        {
        }

        OneOf(const OneOf & other) = delete;

        OneOf(OneOf && other) :
            _value(std::move(other._value))
        {
        }

        template <typename Type_>
        OneOf & operator= (const Type_ & value)
        {
            _value.reset(new OneOfValue<typename SelectOneOfType<Type_, Types_...>::Type, Types_...>{value});
            return *this;
        }

        OneOf & operator= (const OneOf & other) = delete;

        OneOf & operator= (OneOf && other)
        {
            _value = std::move(other._value);
            return *this;
        }

        OneOfValueBase<Types_...> & value()
        {
            return *_value;
        }

        const OneOfValueBase<Types_...> & value() const
        {
            return *_value;
        }
};

That closes up the hole introduced by using std::unique_ptr, which behaves like a pointer for const rules rather than behaving like a transparent holder. Next, we need both const and non-const accept methods in our values. However, the const variant must accept a visitor that visits const versions of our types. Again, we can make use of the fact that ... works by expanding a possibly non-trivial pattern into a list:

template <typename... Types_>
struct OneOfValueBase
{
    virtual ~OneOfValueBase() = 0;

    virtual void accept(OneOfVisitor<Types_...> &) = 0;
    virtual void accept(OneOfVisitor<const Types_...> &) const = 0;
};

template <typename... Types_>
OneOfValueBase<Types_...>::~OneOfValueBase() = default;

template <typename Type_, typename... Types_>
struct OneOfValue :
    OneOfValueBase<Types_...>
{
    Type_ value;

    OneOfValue(const Type_ & type) :
        value(type)
    {
    }

    virtual void accept(OneOfVisitor<Types_...> & visitor)
    {
        static_cast<OneOfVisitorVisit<Type_> &>(visitor).visit(value);
    }

    virtual void accept(OneOfVisitor<const Types_...> & visitor) const
    {
        static_cast<OneOfVisitorVisit<const Type_> &>(visitor).visit(value);
    }
};

With current C++, what would follow would be even more overloading for the accept method in OneOf. With C++0x we have another option: we can move to using free template functions, and taking rvalue references:

template <typename Result_, typename OneOf_, typename Visitor_>
Result_
accept_returning(OneOf_ && one_of, Visitor_ && visitor)
{
    typename OneOfVisitorWrapperTypeFinder<Visitor_, Result_, OneOf_>::Type visitor_wrapper(visitor);
    one_of.value().accept(visitor_wrapper);
    return visitor_wrapper.execute();
}

template <typename OneOf_, typename Visitor_>
void accept(OneOf_ && one_of, Visitor_ && visitor)
{
    accept_returning<void>(one_of, visitor);
}

But what is OneOfVisitorWrapperTypeFinder<Visitor, Result_, OneOf_>? We need it to be OneOfVisitorWrapper<Visitor_, Result, OneOf Types...> if one_of is non-const, but OneOfVisitorWrapper<Visitor_, Result, OneOf Types made const...> if it is const. This is where the rvalue reference comes in. In the form specified above, the type value of OneOf_ will be an lvalue reference if the parameter is non-const, and a const lvalue reference if the parameter is const. Thus:

template <typename Visitor_, typename Result_, typename OneOf_>
struct OneOfVisitorWrapperTypeFinder;

template <typename Visitor_, typename Result_, typename... Types_>
struct OneOfVisitorWrapperTypeFinder<Visitor_, Result_, const OneOf<Types_...> &>
{
    typedef OneOfVisitorWrapper<Visitor_, Result_, const Types_...> Type;
};

template <typename Visitor_, typename Result_, typename... Types_>
struct OneOfVisitorWrapperTypeFinder<Visitor_, Result_, OneOf<Types_...> &>
{
    typedef OneOfVisitorWrapper<Visitor_, Result_, Types_...> Type;
};

Now we can do this:

struct ChangeVisitor
{
    void visit(int & x)
    {
        ++x;
    }

    void visit(std::string & x)
    {
        x.append(" spanker");
    }
};

struct NoChangeVisitor
{
    void visit(int x)
    {
        std::cout << "int " << x << std::endl;
    }

    void visit(const std::string & x)
    {
        std::cout << "string " << x << std::endl;
    }
};

struct ConstChangeVisitor
{
    void visit(int & x) const
    {
        ++x;
    }

    void visit(std::string & x) const
    {
        x.append(" spanker");
    }
};

struct ConstNoChangeVisitor
{
    void visit(int x) const
    {
        std::cout << "int " << x << std::endl;
    }

    void visit(const std::string & x) const
    {
        std::cout << "string " << x << std::endl;
    }
};

int main(int, char *[])
{
    OneOf<int, std::string> s(123);

    accept(s, ChangeVisitor());
    accept(s, NoChangeVisitor());
    accept(s, ConstChangeVisitor());
    accept(s, ConstNoChangeVisitor());

    const ConstChangeVisitor const_change_visitor{};
    const ConstNoChangeVisitor const_no_change_visitor{};

    accept(s, const_change_visitor);
    accept(s, const_no_change_visitor);

    const OneOf<int, std::string> t(std::string("monkey"));

    // accept(t, ChangeVisitor());
    accept(t, NoChangeVisitor());
    // accept(t, ConstChangeVisitor());
    accept(t, ConstNoChangeVisitor());

    // accept(t, const_change_visitor);
    accept(t, const_no_change_visitor);

    return EXIT_SUCCESS;
}

In part three we’ll start using lambdas to avoid the need for all those ugly visitor classes.


Filed under: OneOf Tagged: c++, c++0x, metaprogramming, templates, variadic templates, visitor pattern
avatar

xorg 1.8

Upgrading to xorg-server with USE=-hal appeared to make things run a tad faster.
However, some really strange behavior with keypress events started to occur. I tried several different variants for keyboard layout, setting special keys etc but I still got stuff like "right ctrl is return" or "arrow down inserts a space and line down". Adding to xorg.conf:

Section "InputClass"
        Identifier "keyboard-all"
        Driver "evdev"
        Option "XkbLayout" "se"
        Option "XkbVariant" ",qwerty"
        MatchIsKeyboard "on"
EndSection
Section "InputClass"
        Identifier "mouse-all"
        Driver "evdev"
        MatchIsPointer "on"
EndSection
Section "Module"
        Load           "dbe"
        Load           "extmod"
        Load           "glx"
        Load           "evdev"
EndSection

Made it work again. I was under the impression that you would have to configure less in xorg.conf after upgrading. But hey X works all dandy now.

Posts for Saturday, July 10, 2010

Paludis 0.48.2 Released

Paludis 0.48.2 has been released:

  • The ‘cave’ client is now enabled by default. ‘cave’ is a modular console client that will eventually replace ‘paludis’. It is currently reasonably functional and well tested, but does not yet have all the features present in ‘paludis’, and is thus not yet considered a complete replacement.

See this post for more on cave.


Filed under: paludis releases Tagged: paludis

fixing volume change in linux

motivation

windows xp changes volumes no matter what the user is doing. in linux this is often not possible because things are done differently (read: “wrong”). however, this guide will help you to fix it or at least tells you why it is done wrong. i’m experimenting with my ‘hp elitebook 8530w‘ using gentoo linux. in linux the volume keys are most often bound to X, as: XF86AudioRaiseVolume and XF86AudioLowerVolume, see this example ~/.Xmodmap (not mine)

keycode 176 = XF86AudioRaiseVolume
keycode 174 = XF86AudioLowerVolume

this is a bad design because things don’t work well. a different concept was issued by my thinkpad t43: mute/increase/decrease of volume was done using a hardware mixer. this means alsa wasn’t used at all.

this posting addresses alternative approaches on how to use the volume up/down/mute buttons in a global X independent way.

common linux audio setup

currently linux features two desktop audio systems:

  • alsa
  • pulseaudio
  • (jack audio connection kit)

most often a weird combination of both simultaneously, with confusing results. point is: changing volumes.

volume settings issues

volume levels can be changed for both: alsa and pulseaudio. the difference is that if you change the ‘master’ volume of alsa, you also change it for all pulseaudio clients. while changing an individual volume in a ‘pulseaudio client’ will only change the volume for that specific client.

most of the time volume changes are redirected to alsa, for example ‘kmix‘ does catch events like XF86AudioRaiseVolume and changes the volume of the ‘selected master channel’. one can select ‘master’ or ‘pcm’ as ‘master channel’ (others as well, depends on the audio setup). using this architecture has pitfalls, as you CAN NOT change volumes, when:

  • playing fullscreen-games (using opengl or sdl)
  • embedded adobe flash animations (skips fullscreen on XF86AudioRaiseVolume)
  • X is NOT running (or has crashed)
  • system load is too high (X input handling is delayed)
  • kmix is not running (or kde is not running yet)

xorg input handling

the linux kernel adds input devices in ‘/dev/input/‘. xorg will use only devices configured in ‘/etc/X11/xorg.conf’. Let’s see ‘/var/log/Xorg.0.log‘:

  • /dev/input/mice (mouse)
  • /dev/input/event3 (keyboard)
    (II) config/hal: Adding input device AT Translated Set 2 keyboard
    (**) AT Translated Set 2 keyboard: always reports core events
    (**) AT Translated Set 2 keyboard: Device: “/dev/input/event3
    (II) AT Translated Set 2 keyboard: Found keys
    (II) AT Translated Set 2 keyboard: Configuring as keyboard
    (II) XINPUT: Adding extended input device “AT Translated Set 2 keyboard” (type: KEYBOARD)

  • /dev/input/event7 (HP multimedia keys)
    (II) config/hal: Adding input device HP WMI hotkeys
    (**) HP WMI hotkeys: always reports core events
    (**) HP WMI hotkeys: Device: “/dev/input/event7
    (II) HP WMI hotkeys: Found keys
    (II) HP WMI hotkeys: Configuring as keyboard
    (II) XINPUT: Adding extended input device “HP WMI hotkeys” (type: KEYBOARD)

one can get a complete list by typing “xinput list” as a normal user in a xterm. the default setup of X will use ‘hal’ to manage hardware.

monitoring for acpi keys

let’s see if one receives any of the acpi key-events using acpi_listen. i pressfn+f8‘, which is a special key with a battery symbol on it.

xev reports (run as normal user):

KeyPress event, serial 45, synthetic NO, window 0×3200001, root 0x27a, subw 0×0, time 26309626, (51,-20), root:(55,930), state 0×0, keycode 244 (keysym 0x1008ff93, XF86Battery), same_screen YES, XLookupString gives 0 bytes: XmbLookupString gives 0 bytes: XFilterEvent returns: False KeyRelease event, serial 45, synthetic NO, window 0×3200001, root 0x27a, subw 0×0, time 26309686, (51,-20), root:(55,930), state 0×0, keycode 244 (keysym 0x1008ff93, XF86Battery), same_screen YES, XLookupString gives 0 bytes: XFilterEvent returns: False

acpi_listen reports (run as root or normal user):
button/battery BAT 00000080 00000000
repeating this experiment with a different keycombination as ‘fn+f1′ resulted only in a xev event but NO acpi event was reported by acpi_listen.
this means we can NOT just use any ‘fn+whatever’ combination as a acpi shortcut. we are probably limited to what is provided by the hp acpi firmware.
to sum up: this means some keys or key combinations provide an acpi event while also a normal key event (which is delivered to X and monitored using xev). acpi is independent of X so if the multimedia keys (volume up/down/mute) produce acpi events it is good. if not we still can do what we want but with a little more effort as it will require to monitor /dev/input/eventX for volume keys.

syncing hardware-mute with software-mute

my ‘hp elitebook 8530w’ does mute the soundcard directly (no operating system operation needed). this is helpful since instant muting sometimes is VERY IMPORTANT. however, if no kmix is running, then the XF86AudioMute isn’t processed and alsa still thinks the devices is unmuted (with an ‘out of the box setup’).

the truth is that the audio-signal flow can be interrupted serially, in two ways:

  • first by alsa, which does software mute
  • but the ‘hp elitebook 8630w’ also does hardware muting

if both are muted, using alsamixer to unmute won’t restore sound since the hardware mute still is set. this can be very confusing.

my old thinkpad t43 was able to mask the hardware mute (on an acpi basis, see ibm-acpi [2]), in order to make ‘software’ mute the only option.

  • this is nice as it is more intuitive
  • this is bad as all the issues which this posting talks about were very problematic. most users would therefore NOT mask the hardware mixer mute as this could mean ‘not to be able to quickly mute’ in certain situations.

looking at the hp-wmi.c code

the handling of acpi can either be done using the generic acpi subsystem or with some special care as for example ibm-acpi [2]. since i can’t use the ibm-acpi with my hp laptop, let’s have a look what that hp-wmi thing does.

/usr/src/linux/drivers/platform/x86/hp-wmi.c‘ shows that this code is meant to:

  • rfkill wireless lan
  • rfkill bluetooth dongle
  • wwan? (whatever that is)
  • register a sysfs interface

on hp-wmi module load, ‘/sys/devices/platform/hp-wmi‘ is created. this is the interface used by ‘net-wireless/rfkill‘.

# rfkill list

7: hci0: Bluetooth

Soft blocked: no

Hard blocked: no

8: phy2: Wireless LAN

Soft blocked: no

Hard blocked: no

15: hp-wifi: Wireless LAN

Soft blocked: no

Hard blocked: no

16: hp-bluetooth: Bluetooth

Soft blocked: no

Hard blocked: no

this interface can be used to disable the devices. so this is not what we want. maybe the hp can’t mask the hardware mute at all.

but how is the mute button connected?

since i’m using a laptop, the mute button might just be a hard wired toggle for the soundcards mute state. let’s check that using an external multimedia usb keyboard. lsusb:

Bus 008 Device 003: ID 04b3:3003 IBM Corp. Rapid Access III Keyboard

using the mute button there doesn’t change anything until i start ‘kmix’. so my first assumption might be correct.
using ‘acpi-listen’ as root:
button/mute MUTE 00000080 00000000
but looking at /etc/acpi there does not seem to be any script caring about acpi event. looking at /var/log/messages:
tail -f /var/log/messages
Jul  5 16:24:21 ebooK logger: ACPI event unhandled: button/mute MUTE 00000080 00000000
so it’s probably a hard wired button toggle.
my previous laptop, ‘thinkpad t43′, did handle the volume up/down and mute directly (similar of how the mute button is implemented on the elitebook 8530w). i quote from the ibm-acpi implementation documentation [2]:

0×1017 0×16 MUTE Mute internal mixer. This key is always handled by the firmware, even when unmasked.

so this might be handled by the hp-wmi firmware as well.

/etc/acpi/default.sh script (adapted)

#!/bin/sh
# /etc/acpi/default.sh
# Default acpi script that takes an entry for all actions

set $*

group=${1%%/*}
action=${1#*/}
device=$2
id=$3
value=$4

log_unhandled() {
        logger "ACPI event unhandled: $*"
}

case "$group" in
        video)
                case "$action" in
                        brightnessup)
                        /etc/acpi/brightness.sh up
                        ;;

                        brightnessdown)
                        /etc/acpi/brightness.sh down
                        ;;
                esac
                ;;
        button)
                case "$action" in
                        volumedown)
                        /etc/acpi/volume.sh down
                        ;;
                        volumeup)
                        /etc/acpi/volume.sh up
                        ;;
                        mute)
                        /etc/acpi/volume.sh mute
                        ;;

                        power)
                                #/sbin/init 0
                                ;;

                        # if your laptop doesnt turn on/off the display via hardware
                        # switch and instead just generates an acpi event, you can force
                        # X to turn off the display via dpms.  note you will have to run
                        # 'xhost +local:0' so root can access the X DISPLAY.
                        #lid)
                        #  xset dpms force off
                        #  ;;

                        sleep)
                                /usr/sbin/pm-suspend
                                ;;
                        *)  log_unhandled $* ;;
                esac
                ;;

        ac_adapter)
                case "$value" in
                        # Add code here to handle when the system is unplugged
                        # (maybe change cpu scaling to powersave mode).  For
                        # multicore systems, make sure you set powersave mode
                        # for each core!
                        #*0)
                        #  cpufreq-set -g powersave
                        #  ;;

                        # Add code here to handle when the system is plugged in
                        # (maybe change cpu scaling to performance mode).  For
                        # multicore systems, make sure you set performance mode
                        # for each core!
                        #*1)
                        #  cpufreq-set -g performance
                        #  ;;

                        *)  log_unhandled $* ;;
                esac
                ;;

        *)  log_unhandled $* ;;
esac

/etc/acpi/volume.sh

#!/bin/bash
# script to change the volume via acpi calls by joachim schiele 2010-06-30

if [ "$1"x == "up"x ]; then
        volume=$(amixer get Master | grep "Front Left: Playback" | awk '{print $4}')
        Y=$(echo  "$volume+1" | bc)
        amixer set Master $Y
        P=$(echo "$Y/30*100" | bc -l)
fi

if [ "$1"x == "down"x ]; then
        volume=$(amixer get Master | grep "Front Left: Playback" | awk '{print $4}')
        Y=$(echo  "$volume-1" | bc)
        amixer set Master $Y
        P=$(echo "$Y/30*100" | bc -l)
fi

if [ "$1"x == "mute"x ]; then
        $(amixer get Master | grep "Front Left: Playback" | grep '\[off\]')
        state=$?
        volume=$(amixer get Master | grep "Front Left: Playback" | awk '{print $4}')
        Y=$(echo  "$volume")
        amixer set Master $Y
        P=$(echo "$Y/30*100" | bc -l)

        echo "state=$state"

        if [ $state -ne 1 ]; then
                amixer set Master unmute
        else
                amixer set Master mute
        fi
fi

changing volumes not using kmix (using acpi)

see my last posting [1] on how to use acpi to change the display brightness. this time we are going to change the volume level using acpi. volume script

# cat /etc/acpi/events/volume

this script ignores stereo balance (left speaker louder than right speaker). but it works great for my laptop.
# /etc/acpi/default.sh
one can find out the group and action with (run as root):
acpi_listen
and then create an event:
button/volumedown VOLDN 00000080 00000000
button/volumedown VOLDN 00000080 00000000
button/volumeup VOLUP 00000080 00000000
button/volumeup VOLUP 00000080 00000000

now extract the needed information: ‘button/volumedown’ and ‘button/volumeup‘. put these into the /etc/acpi/default.sh script as i already did above. you can experiment with the script using the command line:

./default.sh button/volumedown

changing volumes not using kmix (using /dev/input/eventX)

my laptop produces an acpi event (even for an externally attached usb keyboads) but i guess most computers won’t. this is why i’ve adapted a c program and a script [3] to parse /dev/input/eventX.

using this program is very easy, just follow the steps in the README.

# ./mediakeys-controller /dev/input/event3

calling script for volume down

calling script for volume down

calling script for volume up

calling script to toggle mute state

works perfectly here (but since the acpi stuff works also i’m using acpi). the reason is that [3] does not support hotplugging of new devices and it will only open one device at program start. this needs to be fixed but i think of it more as a prove of concept code.

fixing the X-bindings

there are still the bindings for the volume keys in X, we need to remove bindings for XF86AudioRaiseVolume, XF86AudioLowerVolume and XF86AudioMute. i simply removed the kmix key bindings for these keys. doing so works but one would have to remove keybindings from many applications.

a far better approach is to remove the keymappings so that no such events are produced, done using: keycodes via ‘~/.Xmodmap’.

let’s find the key bindings first:

# xmodmap -diplay :0.0 -pke | grep -i vol

keycode 122 = XF86AudioLowerVolume NoSymbol XF86AudioLowerVolume

keycode 123 = XF86AudioRaiseVolume NoSymbol XF86AudioRaiseVolume

now let’s kill those bindings with empty assignments:

xmodmap -e “keycode 122 =”

xmodmap -e “keycode 123 =”

do the same for:

keycode 121 = XF86AudioMute NoSymbol XF86AudioMute

to make these changes permanent, we need to create a ~/.Xmodmap, mine looks like this:

# cat ~/.Xmodmap

keycode 121 =

keycode 122 =

keycode 123 =

installing an OSD

x11-libs/xosd is a very nice OSD. one can indicate volume change. but it somehow does look buggy from time to time:

  • sometimes using osd_cat won’t show any osd (only X restart makes it work again)
  • osd_cat does not work very good with mplayer
  • it is not very fast and it does not look nice

i would like to use the osd of kmix. as kmix already changes volumes (see the sliders when using alsamixer from the console) one would simply have to write an option in kmix settings to enable the osd also on passive changes (where kmix didn’t change the volume). i’ve not done this so far.

links

[1] http://invalidmagic.wordpress.com/2010/06/30/no-more-crashes-after-resuming-from-pm-suspend/

[2] http://ibm-acpi.sourceforge.net/README

[3] http://github.com/qknight/mediakeys-controller

errata

update: fixed the /etc/acpi/default.sh script /etc/acpi/events/volume is now /etc/acpi/volume.sh for mute as well


Posts for Friday, July 9, 2010

Generic Lambda Visitors, or Writing Haskell in C++0x (Part 1)

A good FORTRAN programmer can write FORTRAN in any language. Now, I present Haskell written in C++0x!

struct Base { };
struct FirstDerived : Base { };
struct SecondDerived : Base { };
struct ThirdDerived : Base { };

int main(int, char *[])
{
    OneOf<int, std::string, FirstDerived, SecondDerived, ThirdDerived> s(123), t(std::string("monkey"));

    when(t,
            [] (int x)                   { std::cout << "t is int " << x << std::endl; },
            [] (const std::string & x)   { std::cout << "t is string " << x << std::endl; },
            [] (const Base &)            { std::cout << "t is a Base subclass" << std::endl; },
            [] (const FirstDerived &)    { std::cout << "t is Base subclass FirstDerived" << std::endl; }
        );

    s = SecondDerived();

    std::cout << when(
            when(s,
                [] (int & x)               -> std::string { return std::string(++x, 'x'); },
                [] (std::string & x)       -> int         { x.append(" spanker"); return x.length(); },
                [] (const Base &)          -> int         { return 42; },
                [] (const ThirdDerived &)  -> double      { return 3.14; }
                ),
            [] (const int x)           -> int { return x; },
            [] (const std::string & x) -> int { return x.length(); },
            [] (const double x)        -> int { return x > 3 ? 4 : 3; }
            ) << std::endl;

    return EXIT_SUCCESS;
}

Ok, that one’s rather silly. But there is a point to this: it’s a variation on the visitor pattern, with the visiting externalised rather than being a part of the classes itself. That in turn means classes can be part of different visitable hierarchies in different places — one example would be in handling Gentoo style dependency-like structures, where certain constructs make sense in certain contexts but not others (e.g. || ( ) blocks are legal in dependencies but not SRC_URI, and blockers are legal in dependencies but not provides). Rather than forcing all visitors to implement all methods, or having entirely independent hierarchies for similar things, or losing type safety, you can externalise the visiting parts.

As an added bonus, it’s a good way of getting to grips with variadic templates and type lists. The excessive return type cleverness for the second when above is of questionable real world use, but the techniques required to get it working may be of interest.

Let’s start with the bare minimum:

#include <string>
#include <memory>
#include <utility>
#include <cstdlib>

template <typename... Types_>
struct OneOfValueBase
{
    virtual ~OneOfValueBase() = 0;
};

template <typename... Types_>
OneOfValueBase<Types_...>::~OneOfValueBase() = default;

template <typename Type_, typename... Types_>
struct OneOfValue :
    OneOfValueBase<Types_...>
{
    Type_ value;

    OneOfValue(const Type_ & type) :
        value(type)
    {
    }
};

template <typename... Types_>
class OneOf
{
    private:
        std::unique_ptr<OneOfValueBase<Types_...> > _value;

    public:
        template <typename Type_>
        OneOf(const Type_ & value) :
            _value(new OneOfValue<Type_, Types_...>{value})
        {
        }

        OneOf(const OneOf & other) = delete;

        OneOf(OneOf && other) :
            _value(std::move(other._value))
        {
        }

        template <typename Type_>
        OneOf & operator= (const Type_ & value)
        {
            _value.reset(new OneOfValue<Type_, Types_...>{value});
            return *this;
        }

        OneOf & operator= (const OneOf & other) = delete;

        OneOf & operator= (OneOf && other)
        {
            _value = std::move(other._value);
            return *this;
        }
};

int main(int, char *[])
{
    OneOf<int, std::string> s(123);
    s = std::string("monkey");

    return EXIT_SUCCESS;
}

Note that we’ve made the type moveable but not copyable or default constructible; extending the type to clone its value and to support an uninitialised state isn’t particularly difficult, but isn’t necessary for what we’re after.

We have a problem, though:

    OneOf<int, std::string> s(123);
    s = 1.23;

Oops. So we need a way of checking whether a particular type is in a typelist:

struct UnknownTypeForOneOf;

template <typename Want_, typename... Types_>
struct SelectOneOfType;

template <typename Want_>
struct SelectOneOfType<Want_>
{
    typedef UnknownTypeForOneOf Type;
};

template <typename Want_, typename Try_, typename... Rest_>
struct SelectOneOfType<Want_, Try_, Rest_...>
{
    typedef typename std::conditional<
        std::is_same<Want_, Try_>::value,
        Try_,
        typename SelectOneOfType<Want_, Rest_...>::Type
            >::type Type;
};

Then we can make sure we’re only creating valid children:

template <typename... Types_>
class OneOf
{
    public:
        template <typename Type_>
        OneOf(const Type_ & value) :
            _value(new OneOfValue<typename SelectOneOfType<Type_, Types_...>::Type, Types_...>{value})
        {
        }

        template <typename Type_>
        OneOf & operator= (const Type_ & value)
        {
            _value.reset(new OneOfValue<typename SelectOneOfType<Type_, Types_...>::Type, Types_...>{value});
            return *this;
        }
};

Yay. But none of this is much use if all we can do is create and assign legal things to our values. We need to be able to do things to the underlying value too. Let’s start with a very simple visitor pattern. We need a visitor base class:

template <typename Type_>
struct OneOfVisitorVisit
{
    virtual void visit(const Type_ &) = 0;
};

template <typename... Types_>
struct OneOfVisitor :
    OneOfVisitorVisit<Types_>...
{
};

We need to make our value holders able to accept a visitor:

template <typename... Types_>
struct OneOfValueBase
{
    virtual ~OneOfValueBase() = 0;

    virtual void accept(OneOfVisitor<Types_...> &) = 0;
};

template <typename... Types_>
OneOfValueBase<Types_...>::~OneOfValueBase() = default;

template <typename Type_, typename... Types_>
struct OneOfValue :
    OneOfValueBase<Types_...>
{
    Type_ value;

    OneOfValue(const Type_ & type) :
        value(type)
    {
    }

    virtual void accept(OneOfVisitor<Types_...> & visitor)
    {
        static_cast<OneOfVisitorVisit<Type_> &>(visitor).visit(value);
    }
};

And for convenience we’ll put an accept method in OneOf too:

template <typename... Types_>
class OneOf
{
    public:
        void accept(OneOfVisitor<Types_...> & visitor) const
        {
            _value->accept(visitor);
        }
};

Now we can try it out:

struct IntStringVisitor :
    OneOfVisitor<int, std::string>
{
    virtual void visit(const int & x)
    {
        std::cout << "int " << x << std::endl;
    }

    virtual void visit(const std::string & x)
    {
        std::cout << "string " << x << std::endl;
    }
};

int main(int, char *[])
{
    OneOf<int, std::string> s(123);
    s = std::string("monkey");

    IntStringVisitor visitor;
    s.accept(visitor);

    return EXIT_SUCCESS;
}

In part two, we’ll make visitors more flexible.


Filed under: OneOf Tagged: c++, c++0x, haskell, lambdas, metaprogramming, templates, variadic templates, visitor pattern

Posts for Thursday, July 8, 2010

Portage post-GLEP-55 - Moving Forward

GLEP 55 support has now officially been removed from Portage-2.2_rc, and now I'm going to focus on working with Zac Medico to improve the much-in-need-of-love Portage codebase.

Zac and I already have some projects we are working on that should reduce the size of the on-disk Portage tree by an order of magnitude and be a great benefit for many users. Other spiffy improvements are also in the works.

I realize that there is at least one extremely vocal proponent of GLEP 55 who thinks that it is the best and only reasonable solution for the EAPI problem. I respect your opinion but think you are wrong in this case. Rather than engage in marginally productive debate, I'd rather just move forward with improvements to Portage.

If we can implement a solution to the EAPI issue, we can also implement solutions for other related areas of Portage as well. I don't see any problem overcoming the logistical and technical hurdles you have raised thus far.

Dell: the aftermath

In a previous post I outlined the ways in which Dell's customer service sucks. I finally got my computer yesterday, a Studio XPS 9000. Here are my first impressions.

The bad

  1. This computer weighs so much I almost hurt my back lifting it. I thought computers were supposed to be getting smaller and lighter?

  2. The HD indicator light is tiny and on the top of the case. I can't see it with my computer under my desk.

    The optical drive is behind one of those stupid plastic flap door things. So there isn't even an indicator light for the DVD drive. I'm seriously considering taking a screwdriver the case to fix this.

  3. It didn't come with a Windows install disk or a driver disk. It only has a recovery partition on the HD.

    I found an order form which I think will get me my disks. In the mail. Seriously, Dell? Seriously? Why not come to my house and kick me in the balls while you're at it?

    The recovery partition doesn't help you worth a crap if you want to do things like repartition your drive to put Linux on it. Windows7's sucky partition shrinking app wouldn't shrink it lower than 500GB.

  4. The Dell recovery program is called DataSafe or something, and when you use it, it tries to upsell you like crazy to get a "pro" version that has a bunch of useless backup features. Uggggghhhhh.

  5. The side of the case is white. In 10 years it's either going to be yellow with age, or scuffed up beyond hope. Kind of ugly, but I don't care much. The front of the case looks OK though. Black with red highlights. About as good as I could expect.

  6. It came pre-installed with some crappy "Dell Dock" knockoff of Apple's Dock. Worthless and instantly uninstalled.

    This thing caused the desktop icons to be hidden by default. Who would possibly want to do this in Windows? I can image everyone and their grandmother being awfully confused by the missing icons.

    When quitting this dock, it said "Undo is not possible". I love a program that has no going back once you quit.

  7. I wanted to find drivers for the wireless card that came with the Dell. So I went to Dell's support site and typed in the tag number on my computer. It gave me a link to drivers for the wrong card. I had to google all over the place to find the right ones. Way to go.

    Dell's website is a labyrinth full of outdated information and dead pages in general.

  8. The instructions I got with the computer reference Vista. I don't have Vista.

  9. There's a "Windows inside" logo on the case. It will be removed shortly. They leave an awful lot of glue behind.

The good?

  1. The i7 is about as fast as I had hoped. It only took a half-dozen cores and 12 GB of RAM to let me watch full-screen flash videos on Youtube. I feel so modern.

  2. The inside of the case is OK. There are a lot of hard drive bays and lots of extra screws. It should be easy to expand if I need to.

  3. It came with bloatware and crapware, but actually far less than I was dreading. And most of it was trying to sell you Dell crap.

    In the olden days you'd get a hundred links to AOL and other 3rd-party crap. I saw a link to Skype and the obligatory nag to buy an anti-virus subscription (fat chance), but not much else.

  4. Dell delivered the computer 2 days past the original estimated delivery date. So in spite of all the bullcrap and phone-jockeying I had to go through for billing, I can't complain about how fast it got here. Two days late isn't bad.

    I've heard rumors that these computers are built in Malaysia, and mine was definitely shipped from the US (per the Purolator tracking site). So I'm surprised they can get these things delivered as fast as they can given that it was shipped halfway around the world first, and had to go through Customs at least once coming from the US to Canada.

    Purolator was the only shipping option for Canada. I would've preferred to rush it. But maybe that's not possible given that it's coming from the US. Oh well.

  5. It runs pretty quiet, given how huge the fans are. We'll see how hot it gets once I start putting some load on it.

  6. It came with a DVI to VGA adapater and a DVI to HDMI adapter. I thought that was a nice touch, though it could be that they come standard with any nVidia card nowadays.

  7. Works OK with Linux. It took 20 minutes to set up. (Not counting wiping the Windows partition and re-installing on a smaller partition from my own copy, minus the crapware. That took over an hour.) Sound, video and wireless work out of the box in Linux. All 12 GB of RAM are usable, given a 64 bit OS. (I discovered this the fun way, by unthinkingly trying a 32 bit OS first.)

  8. It didn't burst into flames (yet).

  9. It has a peanut tray on the top. Or MP3-player tray, I guess. But I really want to put peanuts in there.

Brian, you're stupid!

So why did I get a Dell? Because I had good experience with them in the past, at home and at work. Given, that experience was 5 years in the past, and a lot can change. And I'm new to Canada, and relatively unaware of what options exist here.

The other (main) reason was that they were far, far cheaper than going through newegg.ca to get the same hardware. But I guess you get what you pay for. Caveat emptor.

I wouldn't recommend Dell to anyone else, given how chaotic the whole buying process was. Too much uncertainty, too much room for mistakes.

Dave asked in my previous post why I didn't just a computer myself, like I had in the past. I said I didn't have time, but what I meant wasn't build time, which should be an hour or two max. I meant research time. Trying to match up compatible hardware, trying to find the best prices on all the components, checking for Linux compatibility, this takes forever and a half. I don't have hours / days to dork around with this any more.

On the other hand I can just google "xps 9000 linux" and see instantly what problems people had. I can be semi-confident that the hardware would all be compatible. And that did work out OK.

And the last reason I got Dell is that unfortunately I need Windows for work and gaming. Blarg. Paying the Windows tax to Dell is bad enough, let alone buying one off the shelf for $6,000 or however much they cost nowadays.

Sorry for the downtime

This isn't the kind of email I like to see when I wake up:

Our backend monitoring system has detected an error on the host where your Linode resides which could lead to a failure condition. In order to protect your Linode, we have scheduled an emergency migration to a different host which will commence shortly. Please note that there is currently no issue with your Linode - this is a proactive measure we are taking to avoid an issue in the future.

We apologize for any inconvenience this may cause. Should you have any questions or concerns, please do not hesitate to reply to this ticket.

The server rebooted and my sites didn't come back up, so there was some downtime. Sorry about that.

Paludis 0.48.1 Released

Paludis 0.48.1 has been released:

  • We now work around broken TLS support in certain monkey-patched GCC versions.
  • Various minor bug fixes.
  • Working compiler support for ‘extern template’ is now required.
  • The PALUDIS_IGNORE_HOOKS_NAMED environment variable can be used to skip executing hooks from specific files.

Filed under: paludis releases Tagged: paludis

Why GLEP 55 is a Good Idea and thus a Bad Thing

As it seems that ill-thought-out GLEP 55 bashing is back in fashion again, I thought I’d explain why GLEP 55 is a good idea, and thus a bad thing for Gentoo.

In short, GLEP 55 proposes moving the EAPI from being ordinary ebuild metadata to being encoded as part of the filename (e.g. .ebuild-4). The advantages of this are:

  • It allows global scope changes that affect metadata generation to be made. Right now, it is impossible for EAPIs to add new global scope functions, and it is impossible for EAPIs to change the behaviour of existing functions, since all ebuilds must be able to be sourced and have at least the EAPI part of their metadata generated by a package manager that does not support newer EAPIs.
  • It allows changes to the versioning rules. New EAPIs cannot introduce new package version formats or fix arbitrary limitations with existing formats, since as soon as an older package manager encounters what it thinks is an ill-formatted version, it will produce a noisy, user-visible warning. With GLEP 55, instead these new versions will be invisible.
  • It allows ebuilds to use newer bash features in ebuilds without breaking older package managers.
  • It makes a package’s EAPI consistently defined throughout the ebuild. Right now it’s legal to set EAPI deep inside a nested chain of eclasses (so long as doing so doesn’t break metadata invariance rules), which means any code encountered before EAPI is set will have a false impression of what the eventual EAPI will be.

Because it’s currently impossible to add per-package eclasses (due to the first item), ebuilds are doing all sorts of stupid things to get the same effect. Because the second item means it’s impossible to add sane versioning for SCM packages, maintainers are making do with lots of 1.2.9999 versions that don’t quite work with dependencies properly. Due to a combination of the first two, I had to write the hideous abomination that is the versionator eclass because it’s impossible to let Portage support versions like 1.23-alpha1 directly, and impossible to add package manager provided version parsing functions that can be used in global scope.

The third is a continual source of problems, as developers frequently accidentally use newer bash features, thus screwing up the upgrade path for anyone who hasn’t updated their box for a few months.

As for the fourth, it’s highly discouraged from a QA perspective these days, but people have set EAPI via an eclass in the past. This one’s more an illustration of icky design than anything else.

Because GLEP 55′s author once tried using a package manager that isn’t Portage, and is thus irrevocably tainted, some alternative ways of handling these deficiencies being proposed.

Alternative One: Repository Capabilities

First, it is suggested that repositories specify, at the repository level, their requirements (presumably via layout.conf). The implications of this are:

  • We can’t start using anything new until we’re sure that everyone is using a compliant package manager. That means yet another two year wait before any of this goes anywhere.
  • New capabilities can’t be used in the main tree for at least a year after their introduction, because users must be able to upgrade their package manager on a box they haven’t touched for a while.

This means that the cost of introducing a change is extremely high, and so developers will be encouraged to continue using bad solutions rather than fixing things properly.

Also, if repository capabilities replace rather than supplement EAPIs, as some have suggested, then it becomes impossible to change a repository’s capabilities without rewriting every single ebuild to the new standard. For example, one could not turn on a strict-s-checking feature without first checking every single existing package and fixing any that rely upon the older “S doesn’t have to exist” behaviour. Certain developers have proposed branching the tree once a year and rewriting everything to do this; quite how they plan to find the manpower to pull this off has yet to be answered.

Thus, assuming a mass rewrite is out of the question, repository capabilities must be combined with another solution to be of practical use, which brings us to:

Alternative Two: Magic Markers

Second, it is suggested that ebuilds include some kind of magic marker to allow their EAPI to be determined without going through the usual sourcing process. This does absolutely nothing to fix the version formats problem, and so must be combined with repository capabilities anyway. On top of that, none of the proposed magic marker methods are particularly pleasant.

Magic via Fixed EAPI Strings and Parsing Not Using Bash

The first kind of magic marker proposal is to require that ebuilds specify the EAPI string in a particular, fixed format. Proposals are typically along the lines of “it must be specified within the first N lines, and it must be exactly in the form EAPI=X”. So, depending upon the exact wording chosen, none of the following would be legal:

# Copyright 1999-2010 Gentoo Foundation
# Distributed under the terms of the GNU General Public License v2
# $Header: $

EAPI='4'

(Single quotes might not be legal)

# Copyright 1999-2010 Gentoo Foundation
# Distributed under the terms of the GNU General Public License v2
# $Header: $

EAPI="4"

(Nor double quotes)

# Copyright 2010 Joe Bloggs
# Derived from foo-1.23.ebuild from Gentoo, which is:
#     Copyright 1999-2010 Gentoo Foundation
# Distributed under the terms of the GNU General Public License v2
# $Header: $

EAPI=4

(Too many comment lines at the start)

# Copyright 1999-2010 Gentoo Foundation
# Distributed under the terms of the GNU General Public License v2
# $Header: $

    EAPI=4

(Indenting isn’t allowed)

# Copyright 1999-2010 Gentoo Foundation
# Distributed under the terms of the GNU General Public License v2
# $Header: $

inherit eclass-that-sets-eapi

(Eclasses can set every metadata variable except EAPI)

# Copyright 1999-2010 Gentoo Foundation
# Distributed under the terms of the GNU General Public License v2
# $Header: $

SLOT=1
EAPI=4

(EAPI must be first)

To make matters worse, this exception would only apply to the EAPI variable, even though in every other way EAPI would remain a normal metadata variable. This restriction would not necessarily be enforceable by the package manager either, so sometimes screwups would lead to weird results rather than errors.

Also, existing ebuilds don’t follow this format — all of the above examples have occurred in real code. Thus, before switching to this, the entire tree and every single overlay everywhere would have to be fixed.

Magic via Special Comment Strings

The second magic marker variant is to shove a special comment line at the start of every single ebuild containing EAPI information. Again, this would require updating every single existing ebuild, and would also require yet another huge wait before it becomes usable.

This sort of resembles what’s done in certain fixed binary formats. However, ebuilds aren’t fixed binary formats; ebuilds are scripts. In addition, fixed binary formats that do this were designed from the start with different versions in mind; the ebuild format has instead been gradually grown over time, often by people who actively oppose any kind of standardisation.

Magic via Extended Attributes

The third magic marker variant is to store the marker in extended attributes, rather than in the file itself. This means:

  • It becomes impossible to store ebuilds anywhere that doesn’t support extended attributes, such as in most version control systems, on pastebins, on bugzilla and on pastebins.
  • The EAPI is effectively invisible except to people using obscure tools.

And, again, a huge wait.

Alternative Three: eapi Function

Third, make eapi a function. This will allow early exit for unsupported EAPIs, so global scope changes can be made so long as they occur after the eapi function is called. Again, this means a huge wait, and again, this requires an additional proposal to fix version formats.

Why none of this matters anyway

In other words, the choices come down to:

  • Put the EAPI in the filename. Be able to use new features straight away, including in the main repository.
  • Introduce repository capability support to package managers. Wait a year or more. Every time you want to use a new capability, release package manager support for it, then wait a year before you can use the new feature. In addition, either rewrite the entire tree every time you change capabilities, or also introduce a second solution.
  • Introduce some kind of magic marker. Rewrite every existing ebuild, including the ones in overlays, to use this marker. Wait a year. Still be unable to change version format rules, unless a second solution is also introduced.
  • Move eapi to be a function. Wait a year. Still be unable to change version format rules, etc.

Or, of course, the usual Gentoo solution can be applied: do nothing. Continue to be unable to support per-package eclasses, and continue adding extensive convoluted hacks into ebuilds to try sourcing things in FILESDIR to get the same result. Continue to use silly 9999 version numbers, which can’t be used in dependencies correctly, to support version branches of scm packages. Continue to arbitrarily not support certain upstream version formats just because they use -alpha instead of _alpha, and require ebuilds to carry on doing transformations to deal with this. Continue to reject proposals based upon the author rather than technical merit.

It’s obvious that doing nothing is the way best suited to Gentoo’s needs. Gentoo’s main priority is to avoid making any change that is in any way controversial or that could in any way be associated with the wrong people. Delivering a better product to users or making things easier for developers is irrelevant. Thus, maintaining the status quo whilst vaguely mentioning alternatives that can’t and won’t ever happen is the way forward. If things get desperate, and it starts to become impossible to fire any developer who asks for better tools, then at that point one of the “wait several years” solutions can be pulled out as proof that “something is being done”.

At this point, all that switching to a solution that would enable changes to be made would do is illustrate that Gentoo can’t deliver even the simplest of the changes developers need. By repeatedly hovering around proposals that will require long waits before their introduction, Gentoo has a plausible-looking excuse to mask an almost complete lack of progress in Portage; if GLEP 55 were to be introduced, developers might start asking why they still don’t have per-package eclasses, less arbitrarily restricted version numbers, package manager provided version parsing helpers or the ability to use newer bash features in ebuilds.


Filed under: gentoo Tagged: gentoo, glep 55

Posts for Tuesday, July 6, 2010

Hello again... and Why GLEP 55 Is A Bad Idea

Uh... hello there. I haven't posted to my blog lately. So what has been up in Funtoo land? Not much? Actually, quite a lot - tons - I've just been neglecting my blog. I'll try to catch you up in upcoming blog posts.

One very recent thing that I wanted to blog about is that I am going to be more active in helping to define the future of Portage. So I thought I'd start by telling you why GLEP 55 is a bad idea.

GLEP 55, currently in draft form, is somewhat supported in the current Portage 2.2_rc source code. It specifies a new ebuild file name extension that isn't just ".ebuild" - but ".ebuild-[number]<number>, where [number] <number>specifies the EAPI version, which is the revision of the ebuild API that is supported by the ebuild.</number></number>

The rationale is that Portage currently can't figure out the EAPI of the ebuild until it sources the ebuild using bash, and it needs to know the EAPI of the ebuild *before* it sources the ebuild, otherwise problems might occur when it sources the ebuild. So it can't use the current mechanism of sourcing the ebuild to find out what EAPI it uses. It can lead to problems. So far, this is true.

Another rationale for GLEP 55 is GLEP 54, which proposes the "-scm" extension to ebuilds. This extension is presumably going to be phased in in a future EAPI. The GLEP 55 proposal claims that Portage is going to need to figure out the EAPI of the ebuild even before it can try to grok the *version* specified in the ebuild filename, and therefore, the EAPI must be in the filename too...! Wow! Really? This certainly isn't true.

But before I go into more detail, let me digress for a moment and speak about a general observation I have regarding file formats. Did you know that there are actually many different versions of .jpg and .png files, but they all use the same file extension? This is true of many other different file types as well. Why is this? It appears that people have figured out a way to solve this problem in a more elegant way than GLEP 55 proposes. I'd like to also have Portage apply an elegant solution to this problem so that we do not need to uglify .ebuild filenames.

Here's another observation I have. UNIX already has a solution to this problem - used for shell scripts. You've probably noticed that the beginning of a shell script looks like this:

#!/bin/sh

...and likewise, the beginning of a Python script will often look like this:

#!/usr/bin/python

The GLEP 55 proposal suggests that using a mechanism like this to identify the proper interpreter for a file actually "hurts performance." Really? Well, in a general sense it does not, but I suppose if you think that you need to figure out the EAPI before you can properly parse the version string contained in the filename (which isn't true,) then maybe you might convince yourself that there are performance problems.

However, you don't need to figure out the EAPI before you parse the version string.

Let's take a look at a more elegant solution, shall we?

First, GLEP 54. The -scm extension should not be phased in via EAPI but as a Portage Repository-specific capability. In other words, a Portage repository-specific configuration file can define what capabilities are enabled in that particular Portage repository, and if GLEP 54 is used in the repository, a flag will be switched (in a file such as layout.conf, which contains repo-specific settings) to indicate that ebuilds can contain -scm extensions. Using this approach, we view the versions found in ebuild filenames, and the format that they use, as a *repository-specific* capability. GLEP 55 should really have focused on adding Portage Repository capabilities, which is a needed feature in Portage, and this would have paved the way for GLEP 54 without adding wacky ebuild filenames.

If we move the -scm extension in GLEP 54 so that it is a repository-specific capability, we can now figure out how to grok the version of the ebuild without having to know its EAPI. This solves the potential performance problem.

Now we can revisit the time-tested approach of storing the EAPI inside the filename, using the first few bytes of the file, something like:

#?ebb4?

This could mean "*EB*uild in *B*ash format, EAPI *4*." I am using a "?" instead of the more common "!" so that the following characters are not misinterpreted to be the path to an interpreter. The initial "#" is pretty much universally supported as a comment string. I bet you could probably think up an even classier-looking initial few bytes, or even a better, more feature-rich format for the first few bytes of the file that could be used for other things.

So there we go. No need to add the EAPI extension to each ".ebuild" file.

The only reason to add any additional data to a ".ebuild" filename is so that two files can exist independently on disk, next to each other. Since we're never going to need to have EAPI 3 and EAPI 4 versions of the same ebuild sitting alongside one another on disk, or in a git repo, there is no need to put EAPI data in the filename.

So that's why GLEP 55 is a bad idea. Portage Repository capabilities should be implemented instead.

Posts for Monday, July 5, 2010

In which border-radius is abused

I threw together a new blog layout today. I scaled back the level of cows a bit (just a bit, don't worry!) Criticism / feedback welcome. (IE-related feedback should be dropped off in the circular file by my desk.)

In what is surely a prelude to the future of the internets, I'm abusing border-radius pretty heavily in this layout. border-radius is likely to become the new marquee HTML tag or text-shadow CSS attribute: Maybe an OK idea at first, but then everyone uses it so much it makes your eyes bleed.

So I figured I'd best get my border-radiusing in early while it's still cool. IE8 users, you still get pointy corners. Sucks to be you.

Also, if you have any ideas for features I should implement for cow-blog, please let me know. I've been crawling the internet looking at blogs for ideas of things to implement and features to steal, but I'm running out of ideas. It does everything I want now, but I'm not a reader.

Posts for Sunday, July 4, 2010

Neglecting my blog

For a variety of reasons, I haven't blogged in a loooong time. So what have I been doing?

Firstly, Abbey gave birth to Theo Ray Marples on 3rd April 2010. Secondly, I have joined a  World of Warcraft  guild that raids  Icecrown Citadel on a regular basis (8/12 progression). These two facts have taken up a lot of my free time I normally devote to working on my OSS projects.

But do not fret - the few bug reports I've received have all been addressed and I have been releasing new versions when required. However, since I no longer use Gentoo I will not be maintaining  OpenRC anymore. That will need a new home. A big thanks to all the people who helped me work on it over the years :)

I may start blogging about  Twisted Fire, my Warlock which I play in Warcraft. I may also find the time to work on IPv6 support for  dhcpcd, which is hopefully being updated in  Debian from the ancient  dhcpcd-3 it currently uses.

avatar

Off to Toronto July 14-28, Archcon

As mentioned earlier, I'll be at Archcon in Toronto in a few weeks.
It's a very small conference, and the first of its kind. At the last FrOSCon we have been playing with the idea to hold an informal Arch conference in Europe, but those were just ideas. Dusty and Ricardo beat us with an actual implementation.
This is great, and one of the milestones in Arch Linux history. Which is why I want to be there and help making it better.

Archcon schedule. I'll be talking about AIF and uzbl.

My schedule:

  • wed. July 14: departure
  • July 14 - 21: hanging around in downtown Toronto. Not sure yet where I'll stay. probably some hostel.
  • July 21 - 28: bed and breakfast pretty close to the conference location, outside the city center
  • July 22 - 23 are the conference talks days, July 24-25 are the conference tourist days, where we'll do some touristy stuff with the conference participants
  • wed. July 28: flying back

There are many things to do in and around Toronto. We still need to make the planning, but the Niagara Falls are definitely on the list.

Finally, big thanks to Elysium Digital, a technology oriented consulting company for legal matters. Apparently they use Arch Linux for many things: on the desktop, in the server room, and in the forensic lab. They emailed me to propose sponsoring my plane tickets, and they did.

Planet Larry is not officially affiliated with Gentoo Linux. Original artwork and logos copyright Gentoo Foundation. Yadda, yadda, yadda.