Planet Larry

May 07, 2008

Brian S. Stephan

A Tour of the Worm

http://world.std.com/~franl/worm.html

A Slashdot article reminded me of one of my favorite technical articles on the Internet, entitled “A Tour of the Worm”, an in-depth historical and technical look at the Morris worm. The Morris worm, mistakenly unleashed in 1988, was one of the first significant worms to strike the Internet, and relative to the size of the network at the time it has arguably done more damage than any worm since.

Check it out if you haven’t read it, or even if you have; it’s a fascinating look at the early days of the Internet. http://world.std.com/~franl/worm.html

May 07, 2008 04:07 AM :: Wisconsin, USA  

May 06, 2008

Jason Jones

07 Mustang GT Upgrade Round 2

Okay...

Those who have followed my journal / blog know that I have a 2007 Mustang GT, which I recently upgraded with Flowmaster exhaust and a Magnaflow X-Pipe.

Well, I was expecting it tomorrow, so when my wife called me at work, right after lunch to give me the news, I was beside myself with excitement.

Delivered to our door was my new Steeda cold air intake along with the Steeda inlet elbow.  Also delivered was a SCT X3 power flash tuner for my car's computer.

Well, needless to say, I couldn't wait to get it installed, so I hopped in my car, and went back home.

About 2 hours later, I was driving back to work with a dumbfounded grin on my face.

People had told me that a tune with a cold-air intake upgrade would seriously change / improve the sound of the car, but I was way more interested in the power upgrade.  Well, I now can totally understand why people emphasize the change in sound just as much, if not more, than the increase in power.

I cannot believe the way my mustang sounds now!

It's like the whole car has been in a groggy state of being half-awake since birth, and this upgrade has basically woken it up and given it a shot of caffeine!

The sound .... well...  The best I could do to explain it to you is to record it and attach it to this entry.  So, you can click the play button top-right to hear for yourself.

The first thing I noticed was the decrease in time for the car to rev up.  It now revs much quicker than before.

There also is a noticeable *pop* along with a sucking sound when accelerating quickly.  I just love it.

I've only driven it to work so far, so, I haven't really opened it up yet, but that is sure to come.

So, if you're thinking of upgrading to a cold-air intake, or considering tuning your car's computer, with what I know so far, I highly recommend it.

I'm sure I won't be changing my opinion when I can get a few minutes to actually drive it, either.

Goooood stuff!

May 06, 2008 05:20 PM :: Utah, USA  

Jürgen Geuter

Renaming files based on EXIF data

As I mentioned before, my girlfriend and I attended some family thingy this weekend. She took a bunch of photos but when transferring the files to her computer something weird happened.

Like many cameras (if not all) hers numbers the files in ascending order. When reading the files from the camera you can give a prefix that all files get (it was "Taufe" in this case) which is suffixed by the number (001, 002 ...). The problem was: The files were not in order.

That is not a biggie if you use some sort of photo management tool but she was planning to send them away burned to a CD so she wanted the files to be in proper order when looking at them with a default file manager, so here's how to do that.

The first thing you need is the exif utility which should be in your package manager. Now it's pretty easy actually:

#!/bin/bash
# Rename each JPEG to "Taufe <EXIF date>.jpg", with every ":" replaced by "-".
for file in *.jpg ; do
    mv "$file" "$(exif -t 0x0132 -m "$file" | sed -e 's/:/-/ig' -e 's/^/Taufe /ig' -e 's/$/.jpg/ig')"
done
Some explanations:

exif -t 0x0132 makes sure that I only read the "tag" 0x0132 which in this case is the date. You can use the exif command to get a list of all tags a file supports by calling it like this: exif -l filename. The -m switch makes the output "machine readable" by cutting away all the crap you don't need.

The date that came out was in the form YYYY:mm:dd HH:MM:SS which is (because of the ":") unsuitable as a file name (Windows users need to be able to use the files) so I used sed to do a few translations:
s/:/-/ig replaces all ":" with "-", s/^/Taufe /ig replaces the beginning of the string with "Taufe " (as in prefixing) and s/$/.jpg/ig replaces the end of the string with ".jpg" (as in suffixing). The i (ignore case) flag is not actually needed in any of the three, and g only matters for the first one, but they do no harm. So we have transformed the EXIF date to the new file name and we can just call mv to rename the files.
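
To see what the sed translation does without touching any real photos, the same steps can be tried on a plain date string. This is only a sketch: the function name exif_date_to_name and the sample date are made up for illustration, and the prefix and suffix are added with printf instead of sed.

```shell
#!/bin/bash
# Hypothetical helper (not part of the exif package): turn an EXIF date
# string into the target file name used in the rename loop.
exif_date_to_name() {
    # $1: date as printed by "exif -t 0x0132 -m", e.g. "2008:05:04 14:23:10"
    # $2: file name prefix, defaults to "Taufe" as in the post
    local prefix="${2:-Taufe}"
    local stamp
    # Replace every ":" with "-" so the name is valid on Windows too.
    stamp=$(printf '%s\n' "$1" | sed -e 's/:/-/g')
    # Add the prefix and the ".jpg" suffix.
    printf '%s %s.jpg\n' "$prefix" "$stamp"
}

exif_date_to_name "2008:05:04 14:23:10"
# prints: Taufe 2008-05-04 14-23-10.jpg
```

One thing to keep in mind: two photos taken within the same second get the same name, so using mv -i in the loop (which asks before overwriting) is probably a good idea.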

This shows us once again that expensive utilities are not necessary and the built-in unix tools (except for the exif dependency) are absolutely sufficient. So next time you wanna do some renaming based on metadata you know how to do it ;-)

Have fun.

May 06, 2008 12:01 PM :: Germany  

Martin Matusiak

why you’ll never have security with Microsoft

Here’s the thing. I hate stating the obvious. It really annoys me. On the other hand, obvious things are sometimes things that most need to be repeated. So I wrestle with myself and I finally decide that I should, because there is a shockingly large number of people out there who don’t realize how obvious this is. See if you can learn something from this mock dialog.

Vendor: Good morning, is this Harry, the CTO*, I’m speaking to?
Client: Yes, how may I help you?
Vendor: Hey Harry, this is Steve from Microsoft. I would like to talk to you about Windows Vista.
Client: What’s that?
Vendor: Why, it’s the brand new version of our Windows operating system.
Client: Oh, that.
Vendor: I was wondering if I could interest you in our product.
Client: You know what, I don’t think so, we are a very security sensitive company, and..
Vendor: But that’s precisely the reason I’m calling, I would like to tell you how you can enhance your security with Windows Vista. You see, we’ve built the operating system with security in mind and it’s the state of the art in operating systems.
Client: Hey, that sounds pretty exciting. So how does this work now, you ship us the source code and…
Vendor: No no, we don’t distribute the source code.
Client: You don’t?!?
Vendor: No, you see it’s a trade secret. (my precious etc)
Client: You’re kidding, right?
Vendor: No, really.
Client: So how do we know that it’s actually secure if we can’t see for ourselves? How do we know there isn’t anything malicious in it?
Vendor: Well you’ll just have to trust us.
*Harry hangs up*
Vendor: Hello? Harry?
*CTO - the highest placed person who makes technical decisions in a company.

How did it go? Did you get it? It was kind of a long thing, huh? Ok, stop racking your brains, I’ll give you the answer: no source code, no security.

Here’s how that works. It’s simple economics, so try to keep up. If they give you the source code, then they put their cards on the table. You can see what the code does, and if it’s doing something stupid (security hole) or nasty (like sending your data back to the vendor), then you’ll be able to check for this. Now you may say “I don’t know how to check”, and that’s okay. But just by giving you the source code the vendor knows that you can see everything the code is doing. And if you find something nasty in there, they know you’ll never trust them again. So it doesn’t really matter if *you* don’t know how to check, because there are others who do, and sooner or later someone will find the nasty code if it’s in there. Thus, if the vendor gives you the source code, then he’ll be a lot more careful about what’s in there, because he’s risking losing your trust and your business forever. That will keep him honest.

Is there then anything surprising about finding out that Microsoft is putting in backdoors in Windows? No, because how would you know it’s there? You don’t have the source code! In case you were wondering, the words “security” and “backdoor” are mutually exclusive.

So what have we learned today? Is there some way we could summarize all this in just one sentence? There is: If you want security, ask for the source code. If you can’t get the source code, you know that the vendor isn’t taking security seriously.

May 06, 2008 11:41 AM :: Utrecht, Netherlands  

Jürgen Geuter

Package distribution

"Ruby has a distribution problem" is a nice article dealing with Ruby's problems with package distribution. The problem basically is that the different ways of supplying Ruby libraries are not compatible so you cannot just say "you need ModuleX installed" but you would have to say "you have to have ModuleX installed via InstallerY".

The problem is mostly triggered by the fact that Ruby nowadays is mostly Rails. But Rails applications are usually written for one client, for one specific installation, not for widespread personal use. This means that the developer usually controls or at least knows the environment completely which makes installing required packages properly easy. It becomes a nasty problem when distributing your software on a large scale (think of WordPress scale).

The usual Ruby workaround is to bundle everything. And that approach is not just present for Ruby: Java programs often come with pretty much every Java package under the sun included (mostly cause they only work with one specific version of the package used), and even I have recently packaged a whole GTK installation with some software for a client because without it the Windows version had problems over problems.

Packaging things with your application can be right: When I did it, I packaged all kinds of stuff because I knew that the target system would not have Python or anything else installed so bundling stuff would be the way to go. But most of the time bundling libraries with your software is just a bad idea.

If you use a modern operating system with decent package management the libraries you bundle will not be updated by the normal process. That means that security flaws or functional bugs will not be automatically corrected, making your customers/users targets for all kinds of attack vectors (the binary JDK used to bundle all kinds of libs with many, many known vulnerabilities for example). In this context bundling is bad. As in really bad. On the other hand that point of view is too simplistic.

A big bunch of the operating systems used today don't have sane package management (most importantly MS Windows and Apple's OSX) so your users don't have the advantages of that anyways. Also for those platforms it's usually a big pain in the ass to install libraries and packages you need. You have to visit buttloads of websites, download and install packages just to see that you still need to get another package to fulfill all requirements. In those cases bundling working libraries might make sense.

Many languages don't offer really good ways to install extensions/modules/packages/however you wanna call them. Perl has its famous CPAN which is pretty much the best example we have considering ease of use, functionality and quality. Python's Cheeseshop is quite good and easy but not up to Perl. Ruby, Java and other mainstream languages really do not have a lot to compete with in this area.

It's somewhat of a chicken-and-egg problem: Since there's no decent way to manage extension modules everybody bundles stuff and since everybody bundles stuff no one creates a decent extension installing mechanism.

We're talking about a language's culture here, about a mindset. Right now both mindsets work: The install mechanism and the bundling but from a logical standpoint the bundling really should be the rare exception.

Bundling libraries or packages creates a whole new can of worms for you: You'll not only have to manage your own software but also other packages from other vendors, you have to keep track of their bugs, issues and might even have to manually patch them. Then you have to get your users to pull your updated version. All in all it's a huge pain in the ass for developers and (on systems with no sane package management) for the users that have to get used to 30 different ways to update software.

Bundling won't die easily but developers' mindsets can be changed more easily. Stop thinking that packaging of your software is unimportant. Stop relying on things to work or the admin to figure the issues out. Exactly knowing your dependencies is not just important for your documentation but also for your own development. Find dependency management for your platform/language and use it so you encounter the same problems your users might have. Don't bundle stuff until you really must; your software will work better and you will get fewer cryptic bug reports that you just cannot seem to debug.

May 06, 2008 10:15 AM :: Germany  

Portage 2.2 and FEATURES="stricter"

Portage 2.2 is still masked but I'm using it to test it and it's already working great but there seems to be a change that makes compiling many things a lot more difficult: FEATURES="stricter" seems to be the default now.

This will show when random packages won't install anymore because of install_qa_check problems with certain ... not so helpful error messages like
  * ERROR: net-fs/nfs-utils-1.1.2-r1 failed.
 * Call stack:
 *       misc-functions.sh, line 652:  Called install_qa_check
 *       misc-functions.sh, line 360:  Called die
 * The specific snippet of code:
 *   		[[ ${abort} == "yes" ]] && hasq stricter ${FEATURES} && die "poor code kills airplanes"
 *  The die message:
 *   poor code kills airplanes


Until the packages are fixed (and there's quite a bunch of them) you can unset that new behavior by adding "-stricter" to your FEATURES in /etc/make.conf.
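
For illustration, the entry could look like the following. This is a sketch that assumes your /etc/make.conf has no FEATURES line yet; if it already has one, just add -stricter inside the existing quotes instead of adding a second line.

```shell
# /etc/make.conf (fragment)
# FEATURES is an incremental variable in Portage, so a leading "-" only
# removes this one value from the defaults and leaves the rest alone.
FEATURES="-stricter"
```

Remember to take it out again once the packages you need have been fixed, otherwise you lose the QA checks for good.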

Just a little heads-up in case you forgot you unmasked portage and have problems installing stuff.

May 06, 2008 09:50 AM :: Germany  

If you're scared of that you probably shouldn't tinker with it anyways

I stumbled on an article today called "Ubuntu Nuggets - it’s the little things that count" that rubbed me the wrong way.

The article gave a list of a few GUI applications that the author felt made life in Ubuntu (and probably other linux distributions) easier for people and in general I have no problem with that kind of list: From time to time they even show me a nifty little application that I had not known about before which is always neat. And even if not those articles might help others so it's all good. Well not all.

Let's look at a few of the items the author presents: Under the heading "Simplifying GRUB" he gives a selection of two graphical editors for grub.conf/menu.lst (the file that tells your bootloader which kernels to load).

The grub.conf file is not really complex, it's actually really really straightforward and simple. If you are scared to touch that very simple file to get some change done you want, you probably will do even more damage messing around with a graphical tool. Changing the default kernel, for example, is just editing these lines:

# By default, boot the first entry.
default 0

People feel very comfortable with graphical tools and checkboxes and dropdown menus, so they start fiddling around in it like they are used to with their usual applications. The problem is that they are in a place where little changes can really mess the system up and since they were too scared to edit a little number in a text file they probably don't really know what they are doing.

I can't count the times I had to rescue some Windows installation after the owner had found "Power tools" or however all those applications are called that allow you to set all kinds of internal Windows settings. Those settings often do not have a GUI because the developer didn't want you to click that checkbox and make your system unbootable. Whowouddathunk?

This is not saying that the linked article does not offer anything useful, for example the "Ink Management" part or the "Virtualization made easier" part might really help a few people. But if you're too scared to edit grub.conf or the fstab you probably shouldn't mess with them via GUI either. Only click when you know what you're doing there.

Just think of this: You're too scared to touch those files by hand but use a GUI. You don't know what it does but it changes the files somehow. It might work for a while but then you encounter problems. You post your file to the forums of your distro and the other users have to deal with the carnage that the GUI tool might have left (because even in simple files like fstab every distro seems to have their own "style" of doing things).

Better read a short howto and just do it in a text editor, especially when it comes to your boot configuration. I personally wouldn't set all those gnome gconf settings via the commandline either, but some files are better just worked on directly. Oh and while we're at it:

When you are unsure of whether your changes are right, copy the original line, comment out the old one, and modify the copy. Now add a comment above the edited line to remind you what you were trying to do and maybe even add the link to the howto you used. Will make your life (and the lives of the people that will try to help you in case things go wrong) a lot easier.
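
Applied to the default entry in menu.lst, that habit could leave the file looking like this. It is a made-up example: the choice of entry 1 and the placeholder for the howto link are assumptions, not something from a real setup.

```
# By default, boot the first entry.
# default 0
# Changed to boot the second entry (entries count from 0); see <link to the howto you used>
default 1
```

If the change turns out to be wrong, going back is just a matter of deleting the new line and removing the "#" in front of the old one.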

May 06, 2008 09:06 AM :: Germany  

Brian Carper

FAT

I had to undelete someone's files from a FAT partition today. My first thought was to use good ol' Windows to do so, given that Windows is the unholy ground which spawned FAT to begin with. I remember there used to be an UNDELETE command of some sort in some old version of DOS. But this doesn't seem to exist in XP any longer.

There are however lots and lots of third-party "shareware" programs which can do this kind of thing, as Google reveals. There is in fact an overwhelming number of such shareware programs. Most of these programs are total crap and cost around $30. One program required me to burn a CD and reboot my computer from the CD before I could run it. Many of the programs "intelligently" scan a partition looking for chunks of things that look like JPEGS or WMVs. I tried a few "demos" before I gave up, not having an hour to waste finding the one program that would work. Thus bringing the current score to Windows: 948, Brian: 0.

Instead I brought the drive home and plugged it into Gentoo and used this post as a guide. I dd'ed the partition to a file, fscked around with it a bit, mounted it via loopback, and had my files back. Took 10 minutes, and worked as expected. And it didn't cost me $30.
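
For the curious, the three steps look roughly like this. It is a sketch only: the device, image and mount point names are hypothetical placeholders, the exact fsck options from the guide are not reproduced here (-n merely checks without writing), and by default the script only prints the commands instead of running them.

```shell
#!/bin/bash
# Sketch only: DEVICE, IMAGE and MOUNTPOINT are hypothetical placeholders.
DEVICE="${DEVICE:-/dev/sdX1}"
IMAGE="${IMAGE:-fat.img}"
MOUNTPOINT="${MOUNTPOINT:-/mnt/fat}"
DRY_RUN="${DRY_RUN:-1}"    # 1 = only print the commands, 0 = run them (needs root)

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

recover_fat() {
    # 1. Copy the partition to a file so the original is never modified.
    run dd if="$DEVICE" of="$IMAGE" bs=1M
    # 2. Check the copy; -n reports problems without writing anything.
    run fsck.vfat -n "$IMAGE"
    # 3. Mount the copy read-only via loopback and look for the files.
    run mount -o loop,ro "$IMAGE" "$MOUNTPOINT"
}

recover_fat
```

Working on an image rather than the drive itself means a botched repair attempt costs nothing; you can always dd a fresh copy and try again.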

The moral of this story: I need to burn a Knoppix disk to take to work with me.

My only quibble is that I can never ever remember what Gentoo package contains fsck.vfat. Note to self, it's dosfstools. I can never think of the search terms even to locate that package. I had to google it.

May 06, 2008 01:13 AM :: Pennsylvania, USA  

Westinghouse: FAIL

My ninth call to Westinghouse today, about my Westinghouse L2410NM 24" LCD monitor which I RMA'ed back in March, revealed that they did in fact ship my monitor, supposedly to my house, on April 4th or so. A UPS tracking number confirms it. There are a few things wrong with this.

  1. In spite of the fact that I asked for a phone call to be updated on the status of my monitor whenever it was shipped, I received no such phone call.
  2. During the four phone calls (or was it five?) I made to Westinghouse in April, AFTER my monitor was supposedly shipped to my house, no one at the company had any record that it shipped. I was told by multiple representatives over the past four weeks that my monitor was "in processing".
  3. I asked for my monitor to be shipped to my workplace, not my house. My nice, safe, cozy workplace with human beings who can sign for large expensive packages. Not my empty house in a neighborhood full of drug addicts, in the property theft capital of the west. In addition to telling the phone representative this, I actually taped an 8.5 x 11 inch sheet of paper directly to the monitor itself (as well as the outside of the box) specifying SHIP TO: and my work address. Even such drastic measures were not enough to catch the attention of whatever magical monitor-repair fairies work at Westinghouse, apparently. Perhaps I should've carved that information directly into the monitor screen.
  4. I could possibly overlook the above, except that, as you may have surmised, at the present time, I do not, in fact, have my monitor.

After calling up UPS to ask why their driver left a $450 computer monitor, in a shiny bright blue and white box with pictures of a computer monitor all over it, sitting on my front porch while I was at work without getting my signature, I placed call number ten (yes, I've finally hit double digits!) to Westinghouse, and managed to escalate my issue to the Westinghouse corporate office. Supposedly in 7-10 business days they will send me a brand new monitor.

Oh how I wish I had any confidence that I'm ever going to see that monitor.

In the meantime, this guy was on sale at the local store, so I bought one. Time will tell whether LG brand is any better than Westinghouse. This time, I also bought the extended warranty, having learned my lesson that it can, indeed, be worth an extra $60 to save myself some pain and aggravation later. I'm also going to think twice about buying things like this over the internet in the future. There is something to be said about being able to drive 10 minutes down the road to have your property serviced or replaced by real-life human beings, rather than paying to have things shipped around the world for a month.

May 06, 2008 01:00 AM :: Pennsylvania, USA  

May 05, 2008

Dirk R. Gently

Have fingers, will post...

Hello blogosphere! Dirk has bumped into a mild setback as beasty laptop seems to have failed on him. Actually the laptop is perfectly fine but the network (both ethernet and wifi) has completely gone. Currently obligations hold me liable, obscuring a new beasty, so I reach out to the net to see if any [...]

May 05, 2008 10:15 PM :: WI, USA  

Steve Dibb

planet larry policy update

I’ve made a difficult decision that I hope doesn’t make anyone feel too bad: effective immediately I’ve removed any feeds from Planet Larry / Larry the Universe that were from developers who have retired from the Gentoo project.

I feel like ex-developers carry a lot of weight with their posts and opinions, and I created the planet feeds mainly for users, by users (speaking of which, don’t be shy, and sign up).

That’s all for now.

May 05, 2008 08:41 PM :: Utah, USA  

Jürgen Geuter

Unholy

Unholy is a project that can compile ruby scripts to python bytecode and actually turn it into python source from the bytecode later.

It's a fun idea that would finally make ruby useful ;-)

Neat little project, of course not all that useful: Python's interpreter is a lot faster than ruby's but Sun is throwing so much money at JRuby and the likes that not using the Java VM with ruby sounds rather dumb. On the other hand having Java stuff always sucks so I can understand the creator ;-)

Well, it's basically one of those projects I love: Kinda pointless but still a lot of fun.

May 05, 2008 08:35 PM :: Germany  

Intellectual property is not oil

You might not know who Mark Getty is, but you probably have heard one of his more famous quotes:
"Intellectual Property is the oil of the 21st century"


Just to save you some time reading Wikipedia and getting distracted from this post (it would be interesting to see how many people leave a blog post they are reading to check a random Wikipedia link and never come back ;-) ), Mark Getty is a member of a rich family that made its money from oil, he owns Getty Images, a commercial archive of images of various kinds.

Considering that history, his comparison is not surprising and it feels kinda right: Everyone and their mom has heard how important intellectual property is and the way that certain sectors of the content industry fight for their monopolies or at least market does somewhat compare to the oil business. Let's have a look whether the comparison of intellectual property and oil is a suitable one.

Disclaimer: Just for the sake of the argument we assume here that such a thing as "intellectual property" exists. Personally I think that that concept is flawed and does not really work but that would be another blog post (if interested, I might write that one later). So, to summarize: "Intellectual property" exists for the duration of this article. So back to the argument.

Mr Getty's quote obviously hints at the importance that "intellectual property" will gain in the next few years, an importance that he thinks is comparable to the importance of oil nowadays.

Oil fuels our western economies, it allows us to drive cars, to have cheap transportation, to create various plastic crap. Oil is everywhere and we feel how important it is every day since the price of oil is rising every day. The importance of oil is only surpassed by its scarcity: There's only so much oil on this planet, it takes really long to get new oil and more and more people and other entities need oil. The reserves are running low and that process is speeding up. How does that compare to information?

"Intellectual property" is basically just the right to some piece of information, the right to say what happens with that information, who may do what with it.

Information is not what we'd call "scarce", actually we often hear people claiming that there is information overload, that people have the feeling of being bombarded with too much information. But maybe he's not really talking about information in general but about structured information: Searching Google Image search for a random word gives you buttloads of results, but are they good? Are they structured and reviewed? No. You have to find out about the license, about the quality all by yourself. So is structured information like oil? Or is the structure itself the oil?

The amount of structured information is conceivably lower than the amount of information in general but there is one significant difference, if not the most significant: If more people use the unstructured information we actually create more order, be it by offering the users a way to tag and structure content or by them just putting images in more contexts (which makes it easier for automated classifications [and that is basically what a search engine does]). So in contrast to oil, which runs out faster the more people use it, information actually gets better by more people using it. As does the structure that that information is embedded in. It's a resource that gets better and more plentiful the more people participate. And that is why his comparison (whether he meant structured or all information) is wrong.

Information is not the oil of the 21st century.

Mr Getty is stuck in the old way of thinking about property, where a resource that's scarce is getting more and more valuable, especially when it is important for society. But information (and therefore intellectual property) is fundamentally different: It's cheap and easy to produce, it's subjective (you cannot just put a meter in it and say whether the quality is high or not), it's getting better and richer the more it is used and the more people participate (which in Mr. Getty's world would mean competition).

The only way to make intellectual property the oil of the 21st century is the way of the Apples, the MPAA and the RIAAs: Restrict people from participating in culture, in the creation of information. Restrict your right to say what you think, to modify and enrich culture. Block you from entering the content creation world cause you'd just create more of that precious "oil", maybe even give it away for free.

As long as there are Creative Commons, Copyleft and other free subcommunities creating content and information, it will never be new "oil". So whenever someone wants it to be like oil, you know what that person is fighting for and against: Your rights, your rights to say and think and create.

Intellectual property and information were never oil, are not oil and must never be.

May 05, 2008 07:36 PM :: Germany  

Py2exe and (Py)GTK

Just as a hint to save you the pain: Never assume that a normally installed Windows GTK works, it might cough up errors left and right with your Python and (Py)GTK that you have no clue how to debug. Rely on the hint given here and package your version of GTK (by copying the etc, share, bin and lib directories into your dist folder). You really save buttloads of time tracing bugs that you did not create and that waste your time.
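
The copying step is simple enough to script. A minimal sketch, assuming you have a POSIX shell on the Windows box (MSYS or Cygwin, for example); the function name bundle_gtk and both paths in the usage line are made up and need to be adapted to your setup.

```shell
#!/bin/bash
# Hypothetical helper: copy the GTK runtime directories next to the
# py2exe output so the application uses its own GTK.
bundle_gtk() {
    # $1: root of the GTK runtime, $2: py2exe dist folder
    local gtk_root="$1" dist="$2" dir
    mkdir -p "$dist"
    for dir in etc share bin lib; do
        if [ -d "$gtk_root/$dir" ]; then
            # Copies the directory as $dist/$dir.
            cp -R "$gtk_root/$dir" "$dist/"
        else
            printf 'warning: %s is missing\n' "$gtk_root/$dir" >&2
        fi
    done
}

# Usage (paths are placeholders):
# bundle_gtk "/c/GTK" "./dist"
```

Run it after py2exe has built the dist folder, and test the result on a machine that has no GTK installed at all.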

GTK might be ABI stable, but the Windows Version is really weird sometimes.

May 05, 2008 07:15 PM :: Germany  

Brian Carper

Hello again, world

Computers are a love/hate thing for me. I love all things digital, but I desperately need to get away from it sometimes too. So I had a nice vacation away from my computer last week. I couldn't keep myself from reading some mailing lists and hitting Slashdot once a day, but I didn't write a single line of code and didn't give my websites or work projects or anything much thought.

But now my vacation is over, and it's so easy to fall back into old habits, endlessly looking at webcomics and reading articles about Common Lisp unit testing suites and cringing at the latest drama amongst Gentoo devs and minding my message board like a crusty old beat cop making his rounds. It's the life I've chosen, and I do like it, but I do like getting away sometimes too.

I fulfilled one of my dreams last week when I finally caved and ordered a solid glass mousepad. They're pretty cheap on newegg.com, depending on the color you want. I happened to want green, and it happened to be the cheapest, so all is well. It looks very nice, and it's big and hopefully the surface won't degrade over time; I tend to eat through mousepads via a slow yet inexorable process of erosion.

Unfortunately my laser mouse doesn't work on it. However, I have learned that if I upgrade my mouse's firmware, it will magically be able to work on a solid glass mousepad. Who would've thought my mouse had updateable firmware, let alone that updating the firmware would allow it to work on new surfaces? Not I.

The bad thing is that I need freaking Windows XP to upgrade the firmware on my mouse. I don't have any computer that has XP on it and I'm afraid to try anything in a virtual machine that involves something as dangerous as fiddling with the innards of connected peripherals. So I tried to install XP on my laptop, desperate times calling for desperate measures. But of course the install failed because my XP install CD is so old (pre-SP1, received free from my college 7 years ago) that it didn't recognize most of my hardware. In fact, the XP install CD blue-screened, which set a new record for how low Windows could sink in my opinion.

So I tried slipstreaming SP2 into my install CD. But it failed because, get this, the filenames of some drivers on the CD, namely usbehci.sys, ended up in lower case rather than uppercase and the CD's install program couldn't locate them. I kid you not. Since when is anything in Windows case-sensitive? Is it running Linux? I had to burn another CD after renaming all the files into uppercase. Then the CD worked, but it couldn't find my hard drive, probably due to missing SATA drivers. At that point I gave up, and plan to take my mouse to work tomorrow to upgrade the firmware on a work machine that has XP on it.

And so the score up to this point in my life is Windows: 947, Brian: 0. Windows remains undefeated.

Thanks go out to Logitech for not letting me use Vista (or, say, LINUX) to upgrade my mouse's firmware, and of course to Microsoft, for yet another gloriously broken and frustrating computing experience.

May 05, 2008 04:04 AM :: Pennsylvania, USA  

Clete Blackwell

Google SketchUp

Google’s SketchUp is an incredibly easy-to-use architectural and 3D-modeling tool. The Computer Science Club here at Mansfield has been talking about modeling the university for a 3D perspective in Google Earth, similar to what Google Earth provides for New York City and other metropolitan areas. Recently, I have been experimenting with SketchUp. A free version is available from the website and a professional version can be purchased for about $500. People enrolled in universities can obtain the professional licenses for $50 a year. The $50 counts towards purchasing a full license, so after 10 years, it’s yours forever. Or, you can pay for the educational license for 3 years and pay the rest of the money up front.

At first glance, SketchUp seems too simple to be worth anything. On further investigation, the simplicity turns out to come from Google’s innovative approach. Google has outdone itself with SketchUp. It is amazingly easy to pick up and create simple objects, and more complicated objects can be created with some practice. I have spent about an hour and a half working with the program. First, I watched the beginning tutorials. Then, I went straight into making objects and refining them. Just make a shape, pull it up to give it depth, draw other objects on it, and manipulate them. It’s amazingly simple.

In about twenty minutes, I was able to make the desk that I use here at school. Keep in mind that it isn’t 100% perfect, nor is it 100% to scale. I have never seen a program this simple. With an hour and a half of experience, I was easily able to make this:

Here is the front of it:

From an angle:

From the side (notice the arches in the drawers):

And here it is next to a person:

May 05, 2008 03:39 AM

May 04, 2008

Martin Matusiak

OLPC about to self destruct?

I consider OLPC to be one of the most exciting initiatives of the last few years. When the idea was first circulated it was such an exciting call to arms to do something about the lack of education in poor regions of the world. And the project has produced what appears to be a pretty incredible product, the research of which is now recycled back into the general hardware industry, so it has brought advances that wouldn’t otherwise have happened (now).

I recall pondering the real purpose of the project, asking what is going to be achieved with these laptops. The OLPC project had a very good answer to this. They said the laptops will promote learning in areas where school books are a luxury. Furthermore, the laptop itself is completely tweakable, you press a special key and the source code of the current program pops up. This will promote learning through tweaking and experimentation, so that eventually an industry can be built on these foundations, in regions where little industry exists today and where perhaps the potential for one (in terms of natural resources) is bleak. A beautiful dream, one that could change the world in big ways.

Now Negroponte has changed his tune. Visionary that he is, he failed to convince the clients of the value of free software. So now he’s humming “forget open source, it’s all about the kids!” while preparing to run Windows on the laptop. There is a new smoke screen being constructed:

Negroponte says that the organization is working to ensure that Sugar can run smoothly on Windows.

Riiiight, running Sugar on Windows. Tell me, what exactly is the value of running Windows with an all free software stack? It’s completely useless, that’s what. The whole value of Windows is as a platform, not merely as an operating system. People buy Windows to run Windows applications, not for Windows itself. Or are we actually buying that Egyptian officials are eager to purchase Windows licenses in order to run the free software suite?

Congratulations, Negroponte, you’ve just become a licensed Windows vendor. The kids will no doubt have fun clicking on the Start menu and playing Solitaire. There is a great deal to learn from that, just nothing about the operating system or the applications, you know, actual learning.

OLPC in its original form was about empowering the users; with Windows that capability is entirely destroyed. The fact that you cannot mix learning with trade secrets should be blindingly obvious to anyone. Open source is important, but it’s especially important when you want people to learn something.

Furthermore, learning doesn’t happen in isolation. It’s accelerated when it happens in a community of ideas and impulses that flow freely. The resigning OLPC president gets it when he says:

“What comes part and parcel with open source is a culture, and it’s the culture that I’m interested in,” he says. “It’s a culture of expression and critique, sharing, collaboration, appropriation.” And this culture can and should spill into classrooms, he says.

May 04, 2008 09:08 PM :: Utrecht, Netherlands  

Jan Tönjes

bash commands

A quick note for the record:

Simple loop
for i in * ; do echo "$i" ; done

Search and replace in files
sed 's/search/replace/g' filename > filename-new

Rename files and strip something from the end
mv "$i" "${i%-new}"
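
For comparison, the same three operations can be sketched in Python 3. The function names below are made up for this illustration, and the replacement works on fixed strings only, so treat it as a rough equivalent of the shell one-liners rather than a drop-in substitute.

```python
import os


def list_entries(directory="."):
    """Like the simple loop: names in a directory, hidden files skipped."""
    return [name for name in sorted(os.listdir(directory))
            if not name.startswith(".")]


def search_replace(text, old, new):
    """Like sed 's/old/new/g', but for a fixed string instead of a regex."""
    return text.replace(old, new)


def strip_suffix(name, suffix):
    """Like ${i%-new}: drop suffix from the end of name if it is there."""
    if suffix and name.endswith(suffix):
        return name[:-len(suffix)]
    return name


if __name__ == "__main__":
    print(search_replace("foo bar foo", "foo", "baz"))
    print(strip_suffix("filename-new", "-new"))
```

Passing the result of strip_suffix to os.rename would complete the rename step.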

May 04, 2008 06:57 PM :: Lower Saxony, Germany  

Alex Bogak

Running tests on Windows.

Hi all

I need a free tool for testing a GUI application, something along the lines of Mercury's WinRunner.
Does anyone know of something like that?

Thanks

May 04, 2008 01:49 PM :: Israel  

May 03, 2008

Andreas Aronsson

Doubleclick links in terminal

For a very long while now, several years actually, I've been a bit annoyed by the behaviour of terminals under X when you double-click links. Whatever the UI considers a word is selected: the selection starts at the point that is double-clicked and spreads in each direction, stopping at a character it considers a word delimiter. A space is probably always considered a delimiter, sometimes a '?' too, and often ',' as well. This has been very annoying for me as I spend quite a lot of time in IRC (irssi in a screen; it's lovely in combination with bitlbee: ICQ, IRC and so on in the same screen) and every now and then someone pastes a URL which I want to double-click and then paste into my browser. The selection stops prematurely, as it's not uncommon for a hyperlink to contain one of the characters in the word delimiter list. I've thought of this as a limitation of the system I use and, although annoyed, never gave it much thought. Maybe I've been too susceptible to propagandists telling me GNU/Linux is user-unfriendly.

A couple of weeks ago I started to think a little about it and the solution is rather simple. I use Eterm more or less exclusively.

In ~/.Eterm/user.cfg I've put this:

<eterm-0.9>
begin misc
cut_chars "\t\\\`\\\"\'() *,;<>[]{|}"
end misc

This might not be perfect, but it has certainly served its purpose so far: I have been able to double-click on links and then middle-click in the address field (or open a new tab) in the browser. My OS is even better =)
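
The selection behaviour described above can be sketched in a few lines of Python 3. This is only a model of the idea, not Eterm's actual code: the function name select_word is made up, the delimiter set is my reading of the escapes in the cut_chars line, and returning nothing when the click lands on a delimiter is an assumption.

```python
# Delimiters as read from the cut_chars line: tab, backslash, backtick,
# double quote, single quote, then ( ) space * , ; < > [ ] { | }
CUT_CHARS = set("\t\\`\"'() *,;<>[]{|}")


def select_word(line, click, cut_chars=CUT_CHARS):
    """Return the text a double-click at index `click` would select."""
    if not 0 <= click < len(line) or line[click] in cut_chars:
        return ""  # assumption: clicking on a delimiter selects nothing
    start = click
    while start > 0 and line[start - 1] not in cut_chars:
        start -= 1  # spread to the left until a delimiter is hit
    end = click
    while end < len(line) - 1 and line[end + 1] not in cut_chars:
        end += 1    # spread to the right until a delimiter is hit
    return line[start:end + 1]


if __name__ == "__main__":
    line = "see http://example.com/page?id=1&x=2 now"
    print(select_word(line, 10))
    # With '?' added to the delimiters the link gets cut short.
    print(select_word(line, 10, CUT_CHARS | set("?")))
```

Because '?' and '&' are not in the list, the whole link survives the first call, while the second call stops at the question mark.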

May 03, 2008 01:06 PM :: Sweden

Michael Klier

What The Frack, Truecrypt?

Yesterday I bought myself a new 250GB USB bus-powered external hard disk for my NSLU2. It replaces my bigger 3.5" disk, which lives in a case and requires a separate AC adapter. I hope to save some energy with this and get rid of the noise the bigger one made during the night.

Because the main purpose of the disk is to keep my backups and my digital audio library, I went on to encrypt the disk using truecrypt, just like I did for the one before. I've never used truecrypt for anything other than mounting my old HD on my NSLU2, on which I run a self-compiled 4.1 version of the software (at the time I encrypted my old disk there was no truecrypt package for the Debian arm port), so I had never experienced the changes they'd made in the 5.1 version.

What should I say, IMHO the new version is a nightmare in terms of usability. It has a new Tcl/Tk GUI which should ease the management of encrypted devices. That might be the case for the Windows port but on Linux it results in just the opposite, especially on a headless machine.

If you want to use the text mode interface you have to explicitly force it on the command line by adding the -t option. Creating new encrypted devices also requires -t, if you omit it you'll get an error :-S. C'mon that really sounds like bad programming to me. The former cli interface of truecrypt was perfect IMO. I really don't know what has caused them to change it so dramatically. OK, there's a trick by putting an alias into your $shellrc to save you from future surprises and the ugly help window.

alias truecrypt='truecrypt -t'

Anyway, assuming that every user wants to use the crappy GUI by default is just plain wrong. Another example, in the old version you could mount the encrypted device without mounting its filesystem, for example if you wanted to format it with a different filesystem than FAT32 by omitting the destination mount point.

% truecrypt /dev/sdb1
Enter password:
% ls -1 /dev/mapper/
/dev/mapper/truecrypt0
% mkfs.ext3 /dev/mapper/truecrypt0

The new version isn't as clever. You have to omit the destination mount point and also tell truecrypt not to mount it, or in other words tell it not to ask for a destination mount point because you didn't give one. Did they think: OK, most users are not that smart, so if they forget to provide a destination mount point we just keep asking them until we get one? Oh, and for those who know what they're doing, well, let's add another cli switch so they can tell us that they really, really don't want to mount the volume? The question about the missing destination mount point is not the only one. You're also asked for an optional key file and whether or not you'd like to protect the hidden volume.

To get the same effect as the above example you now have to use this easy to remember combination:

% truecrypt -t --keyfiles="" --protect-hidden=no --filesystem=none /dev/sdb1

As a side note: Since truecrypt uses FUSE now the devices don't appear in /dev/mapper anymore. You can use the following to list them.

% truecrypt -t -l
1: /dev/sdb1 /dev/loop0 - 
% mkfs.ext3 /dev/loop0

But the story doesn't end here, truecrypt now ships with another nifty gimmick.

I also wanted to create a hidden volume on the new hard-disk, the possibility to have hidden containers is what IMHO makes truecrypt a good choice for encryption. I know there are some controversial opinions on this matter, but anyway, check this out:

% truecrypt -t -c /dev/sdb1 
Volume type:
 1) Normal
 2) Hidden
Select [1]: 2
Error: The selected feature is currently not supported on your platform.

Erm what? I mean WHAT!? Are they kidding me? This worked versions ago. Although I bet this has something to do with the switch to FUSE, it seems that they decided to get the new fancy GUI version out in time (along with the added support for bootable encrypted devices on Windows) but also decided to ship the yet obviously unfinished rewrite of the Linux version.

I am sorry, but this just sucks!


May 03, 2008 12:13 PM :: Germany  

Muhammad Najmi Ahmad Zabidi

Random Heart

Graduate School Information Request


Your request for information has been processed.

Thank you for your interest in the University of Massachusetts Amherst.

Applications directed to addresses within the United States are mailed twice weekly with first class postage; allow 10 days for delivery. Applications directed to non-U.S. addresses are mailed weekly and sent air mail; allow at least 3 weeks for receipt. There is no charge for any of these materials.

The following materials will be sent to you here:

Department of Computer Science
KICT, IIUM
Gombak, 53100 MYS

  • CMPSC program materials

May 03, 2008 10:40 AM :: Kuala Lumpur, Malaysia  

Thomas Keller

Windows Media player doesn’t play all files

I have had some strange issues recently with my Windows Media Player - it does not play all video files any more (I have no problems on KDE with kmplayer, though). Turns out that this seems to be quite a common error… The web help of Microsoft points to this page, and it says something about ID3 tags [...]

May 03, 2008 07:55 AM

Zeth

Email Syntax Check in Python

Sometimes you may want to check that an email address is not syntactically invalid, i.e. it looks like a recognisable email address. I use this approach in my zetact contact form processor.

Of course, it does not mean the address actually leads anywhere, but at least you know you are dealing with an email address that could exist.

This is the code I have been using, albeit I have changed it from a class method to a simple function to make this post simpler.

"""Email check using regex."""

def invalidreg(emailkey):
    """Email validation, checks for syntactically invalid email             
    courtesy of Mark Nenadov.                                               
    See http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/65215"""
    import re
    emailregex = r"^.+@(\[?)[a-zA-Z0-9\-.]+\.([a-zA-Z]{2,3}|[0-9]{1,3})(\]?)$"
    if len(emailkey) > 7:
        if re.match(emailregex, emailkey) is not None:
            return False
        return True
    else:
        return True

I decided it would be more Pythonic to try to do this using the built-in string methods, rather than importing the re module and using a monster regular expression. Here was my first attempt.

"""Email checks using string methods - simple version."""
def invalidemail(emailaddress):
    """Checks for a syntactically invalid email address."""
    try:
        emailitems = emailaddress.rsplit('@', 1)
        emailitems.extend(emailitems[1].rsplit('.', 1))
    except IndexError:
        return True

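    # Invalid if any part has characters other than letters, digits and dots,
    # or if the whole address is too short to be plausible.
    if [x for x in emailitems if not x.replace(".","").isalnum()] \
            or len(emailaddress) < 7:
        return True
    else:
        return False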

After a bit of testing and playing with this, a friend pointed me towards the relevant RFC on restrictions of email addresses. While the standard allows the use of many different special characters, in practice email addresses have to be much stricter if you actually want people in the real world to be able to send email to you.

For example, if we allow the email address []@commandline.org.uk, will whatever receives the output of this function be able to use it? As pointed out by Jan Goyvaerts, most software won't actually be able to handle obscure special characters.

We also don't want to water down the syntax check and allow junk for the sake of theoretical but non-existent addresses.

My compromise is to allow these special symbols -_.%+. in the local-part of the email address, and -_. in the domain name. I also do sanity checking on the top-level domain: it needs to be either a generic name or two characters long (country codes are all two letters).

So below is my current version. I added lots of comments and white space to make it easy to read.

"""Ditch nonsense email addresses."""

GENERIC_DOMAINS = "aero", "asia", "biz", "cat", "com", "coop", \
    "edu", "gov", "info", "int", "jobs", "mil", "mobi", "museum", \
    "name", "net", "org", "pro", "tel", "travel"

def invalid(emailaddress, domains = GENERIC_DOMAINS):
    """Checks for a syntactically invalid email address."""

    # Email address must be at least 7 characters in total.
    if len(emailaddress) < 7:
        return True # Address too short.

    # Split up email address into parts.
    try:
        localpart, domainname = emailaddress.rsplit('@', 1)
        host, toplevel = domainname.rsplit('.', 1)
    except ValueError:
        return True # Address does not have enough parts. 
    
    # Check for Country code or Generic Domain.
    if len(toplevel) != 2 and toplevel not in domains:
        return True # Not a domain name.
    
    for i in '-_.%+.':
        localpart = localpart.replace(i, "")
    for i in '-_.':
        host = host.replace(i, "")
    
    if localpart.isalnum() and host.isalnum():
        return False # Email address is fine.
    else:
        return True # Email address has funny characters.
    
# Start the ball rolling.
if __name__ == "__main__":
    print(invalid("warrior@example.com"))
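
The split step in the final version leans on tuple unpacking: when rsplit returns fewer than two pieces, the assignment raises ValueError, which is why the except clause catches ValueError here while the first attempt caught IndexError. A small standalone Python 3 sketch of just that behaviour follows; the helper name split_address is made up for this illustration and is not part of the code above.

```python
def split_address(emailaddress):
    """Return (localpart, host, toplevel), or None if a part is missing."""
    try:
        localpart, domainname = emailaddress.rsplit('@', 1)
        host, toplevel = domainname.rsplit('.', 1)
    except ValueError:
        # rsplit gave back a single item, so unpacking into two names failed.
        return None
    return localpart, host, toplevel


if __name__ == "__main__":
    for address in ("warrior@example.com", "no-at-sign.example.com",
                    "warrior@localhost", "a@b@example.co.uk"):
        print(address, "->", split_address(address))
```

Splitting from the right means an address with two @ signs still splits cleanly; the stray @ ends up in the local part, where the later isalnum check rejects it.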


May 03, 2008 02:00 AM :: West Midlands, England  

May 02, 2008

Thomas Keller

Installing my new Brother MFC-7820N

I bought a Brother MFC-7820N; although I do not currently intend to use the fax capabilities, the device is good in all aspects, I assume. What especially impressed me is the Linux support Brother gives. The device is not attached to my server - instead, it is directly attached to my network via LAN (it comes [...]

May 02, 2008 06:37 AM

May 01, 2008

Nicolas Trangez

Python ‘all’ oddity

[update] Question solved, see bottom of post.

Since Python 2.5 the language has a new built-in function ‘all’ (and its nephew ‘any’). I wanted to play around with this a little, combined with generators, so I created a little test case to measure performance.

Here’s the test-case: take a list L of X random numbers in a given range [A, B], and check whether

  • all elements in L are >= A
  • all elements in L are >= (A + Z) where Z is a number in [0, (B - A)]

The first test should always result in True; the second test could result in False.

Here’s the output of a test-run:

In [1]: import random, sys

In [2]: a = [random.randint(100, sys.maxint) for i in xrange(2000000)]

In [3]: len(a)
Out[3]: 2000000

In [4]: #Check whether all elements are >= 100 

In [5]: %timeit all(i >= 100 for i in a)
10 loops, best of 3: 515 ms per loop

In [6]: %timeit any(i < 100 for i in a)
10 loops, best of 3: 454 ms per loop

In [7]: def f(l):
   ...:     for i in l:
   ...:         if i < 100:
   ...:             return False
   ...:     return True
   ...: 

In [8]: %timeit f(a)
10 loops, best of 3: 292 ms per loop

In [9]: #Same thing for 100000, since now the list shouldn't be completely iterated

In [10]: %timeit all(i >= 100000 for i in a)
100 loops, best of 3: 4.73 ms per loop

In [11]: %timeit any(i < 100000 for i in a)
100 loops, best of 3: 4.29 ms per loop

In [12]: def g(l):
   ....:     for i in l:
   ....:         if i < 100000:
   ....:             return False
   ....:     return True
   ....: 

In [13]: %timeit g(a)
100 loops, best of 3: 2.82 ms per loop

In [14]: #For reference

In [15]: %timeit False in (i >= 100 for i in a)
10 loops, best of 3: 531 ms per loop

In [16]: %timeit False in (i >= 100000 for i in a)
100 loops, best of 3: 5.03 ms per loop

It’s as if ‘all’, ‘any’ or ‘in’ don’t break/return when the first occurrence of False (or True, obviously) is found. Is this the desired behaviour, and if it is, why? The calculation time difference between using all/any/in or a custom-made function (which is, unlike all etc, not written in C) which breaks whenever it can, is pretty astonishing.

[update] Question solved. It’s pretty normal the function-based approach performs better, since it combines what ‘all’ and the generator provided to ‘all’ do, taking away the generator function-call overhead. Damn :-)
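
Both halves of that conclusion can be checked with a short Python 3 sketch: a counting wrapper shows that all() really does stop at the first failing element, and a timeit comparison shows the generator overhead on a list where nothing fails. The helper names counting and all_at_least are made up for this illustration, and the timings will differ from machine to machine.

```python
import timeit


def counting(values, counter):
    """Yield values unchanged while counting how many were consumed."""
    for value in values:
        counter[0] += 1
        yield value


def all_at_least(values, limit):
    """Hand-written loop equivalent to all(v >= limit for v in values)."""
    for value in values:
        if value < limit:
            return False
    return True


if __name__ == "__main__":
    data = [500, 700, 50, 900, 1000]   # the third element fails the test

    consumed = [0]
    result = all(v >= 100 for v in counting(data, consumed))
    print(result, consumed[0])          # prints: False 3

    big = list(range(100, 200100))      # every element passes
    t_all = timeit.timeit(lambda: all(v >= 100 for v in big), number=10)
    t_loop = timeit.timeit(lambda: all_at_least(big, 100), number=10)
    print("all(genexpr): %.3fs  loop: %.3fs" % (t_all, t_loop))
```

Only three elements are consumed before all() returns, so the short-circuit works; any remaining gap in the timings comes from resuming the generator once per element.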

May 01, 2008 01:57 PM

Martin Matusiak

renaming sequentially

If you’ve been dealing with files for a while you will have noticed that there is a slight semantic gap between how humans see files and how computers do. If you’ve ever seen a file list like this you know what I mean:

Lecture10.pdf
Lecture11.pdf
Lecture12.pdf
Lecture1.pdf
Lecture2.pdf

Numbering these files was done in good faith, and a user understands what it means, but the computer doesn’t get it. Sorting in dictionary order produces the wrong order as far as the user is concerned. The reason is that the digits in these filenames are not treated and compared as integers, merely as strings. (Actually, . comes before 0 in ASCII, what’s going on here?)
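
A likely answer to the question in parentheses: the listing above looks like locale-aware collation, which tends to ignore punctuation on the first pass, rather than plain ASCII order. Python compares strings by code point, so it can show what pure ASCII ordering does with the same names, next to a sort key that compares digit runs as integers. This is a Python 3 sketch, and natural_key is a made-up helper name.

```python
import re

names = ["Lecture10.pdf", "Lecture11.pdf", "Lecture12.pdf",
         "Lecture1.pdf", "Lecture2.pdf"]


def natural_key(name):
    """Split a name into text and integer chunks so 2 sorts before 10."""
    chunks = re.split(r"([0-9]+)", name)
    # With a capturing group, the digit runs land on the odd indices.
    return [int(chunk) if index % 2 else chunk
            for index, chunk in enumerate(chunks)]


if __name__ == "__main__":
    print(sorted(names))                   # plain code-point order
    print(sorted(names, key=natural_key))  # digit runs compared as integers
```

In code-point order Lecture1.pdf does come first, because '.' sorts before '0', but Lecture2.pdf still lands after Lecture12.pdf.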

While we’re not expecting our computers to wise up about this anytime soon, there is the obvious fix:

Lecture01.pdf
Lecture02.pdf

Lecture10.pdf
Lecture11.pdf
Lecture12.pdf

You’ve probably done this by hand once or twice, while cursing.

On the upside, this is very easy to fix with a few lines of code:

#!/usr/bin/env python
#
# Author: Martin Matusiak <numerodix@gmail.com>
# Licensed under the GNU Public License, version 3.
#
# revision 1 - support multiple digit runs in filenames
 
import os, glob, re, sys

def renseq():
    if (len(sys.argv) != 2):
        print("Usage:\t" + sys.argv[0] + " <num_digits>")
    else:
        ren_seq_files(sys.argv[1])


def ren_seq_files(num_digits):
    files = glob.glob("*")
    for filename in files:
        # split off the extension so digits in it (e.g. mp3) are left alone
        m = re.search(r"(.*)(\..*)", filename)
        ext = ""
        if m: (filename, ext) = m.groups()

        # collect the (start, end) span of every run of digits
        spans = [m.span() for m in re.finditer("([0-9]+)", filename)]
        if spans:
            # work from the right so earlier spans keep their offsets
            spans.reverse()
            arr = list(filename)
            for (s, e) in spans:
                arr[s:e] = str(int(filename[s:e])).zfill(int(num_digits))
            os.rename(filename+ext, "".join(arr)+ext)

 
 
if __name__ == "__main__":
    renseq()

Download this code: renseq.py

This works on all the files in the current directory. Pass an integer to renseq.py and it will change all the numbers in a filename (if there are any) to the same numbers, padded with zeros if they have fewer digits than the amount you want. So on the example

renseq.py 2

will turn the first list into the second list.

If say, there are filenames with numbers of three digits and you pass 2 to renseq.py, the numbers will be preserved (so it’s not a destructive rename), you’ll just revert to your incorrect ordering as it was in the beginning.

renseq.py will rewrite all the numbers in a filename, but not the extension. So mp3 won’t become mp03. ;)
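
To preview what a given width would do without renaming anything, the padding rule can be pulled out into a pure function. This is a Python 3 sketch under the same rules as described above, not part of renseq.py itself; the name pad_numbers is made up for this illustration.

```python
import re


def pad_numbers(name, num_digits):
    """Zero-pad every digit run in name, leaving the extension untouched."""
    stem, dot, ext = name.rpartition(".")
    if not dot:                 # no extension at all
        stem, ext = name, ""
    padded = re.sub(r"[0-9]+",
                    lambda m: str(int(m.group())).zfill(num_digits),
                    stem)
    return padded + dot + ext


if __name__ == "__main__":
    for name in ("Lecture1.pdf", "Lecture12.pdf", "Lecture123.pdf",
                 "track7.mp3"):
        print(name, "->", pad_numbers(name, 2))
```

Numbers that already have more digits than requested come back unchanged, which matches the non-destructive behaviour described above.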

May 01, 2008 01:32 PM :: Utrecht, Netherlands  

Dieter Plaetinck

Windows sucks

I had to fix a problem at my dad's company...
"The network was broken."

It was a NetBEUI network connecting some Windows stations - it had been running for years - and now suddenly the nodes couldn't find each other.
One of the boxes (windows 2000 iirc) had 2 network cards, one for the network, the other not used for anything (not even connected). Disabling the latter - not even touching the former - fixed half of the network.


May 01, 2008 12:22 PM :: Belgium  

Brian S. Stephan

Sender Policy Framework

Someone in #lh today told me about Sender Policy Framework, which sounds like a badly-needed enhancement to the Internet’s email protocols. Basically, the idea is to provide a DNS record that informs MTAs “don’t trust emails claiming to be from this domain unless they’re coming from one of my actual servers”.

In DNS, this looks like (in my case):

emptymatter.org. IN TXT "v=spf1 a mx ~all"

Some MTAs support SPF but need to be configured; I believe Gentoo’s postfix is one of them. If I’m going to expect other mail servers to support it, I probably should myself. I’ll have to tackle that another day…
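
To make the record above easier to read: each term after v=spf1 is a mechanism with an optional qualifier in front (+ pass, - fail, ~ softfail, ? neutral), and + is assumed when none is given. The Python 3 sketch below only splits a record into those pairs. It is a reading aid rather than an SPF implementation: it does no DNS lookups, the name parse_spf is made up, and modifiers such as redirect= are not treated specially.

```python
QUALIFIERS = {"+": "pass", "-": "fail", "~": "softfail", "?": "neutral"}


def parse_spf(record):
    """Split an SPF TXT value into (qualifier result, mechanism) pairs."""
    terms = record.split()
    if not terms or terms[0].lower() != "v=spf1":
        raise ValueError("not an SPF version 1 record")
    parsed = []
    for term in terms[1:]:
        if term[0] in QUALIFIERS:
            qualifier, mechanism = term[0], term[1:]
        else:
            qualifier, mechanism = "+", term   # "+" is the default
        parsed.append((QUALIFIERS[qualifier], mechanism))
    return parsed


if __name__ == "__main__":
    for result, mechanism in parse_spf("v=spf1 a mx ~all"):
        print(mechanism, "->", result)
```

Read this way, the record says: hosts matching the domain's A or MX records pass, and everything else gets a softfail.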

May 01, 2008 12:20 AM :: Wisconsin, USA  

April 30, 2008

Steven Oliver

steveno


Do you remember the last time I posted on here? I don’t LOL.

Regardless, this weekend I am planning a Gentoo install party. Sadly I will be the only one attending but since Paludis 0.26.1 is out I see no reason to delay the return of the King. Naturally I am the aforementioned king. No I’m really not that arrogant. I just play that way on the internet.

Enjoy the Penguins!

April 30, 2008 11:58 PM :: West Virginia, USA  

Christoph Bauer

Microsoft Delays Windows XP Service Pack 3

Since Heise announced that Microsoft would release Windows XP Service Pack 3 on the 29th, I haven’t slept too well, as I really want to grab it as soon as possible. Sure, I use Linux, but that doesn’t mean I don’t have to deal with Windows boxes from time to time, and I am fed up with applying about 100 patches before I can even think about security.

But I rejoiced too soon: just one day after the 29th (today), I saw a post on the Washington Post blog saying that Microsoft has delayed the rollout of the service pack again. In a written statement they say:

“In order to make sure customers have the best possible experience we have decided to delay releasing Windows XP SP3 to Windows Update and Microsoft Download Center.”

In other words, there seems to be no release date yet. Well - in the meantime I’m rolling my own update pack using the Heise Offline Update. Thanks a ton, guys.



April 30, 2008 06:52 AM :: Vorarlberg, Austria  

Brian S. Stephan

Yet another lazy post

Nothing exciting here. Got my tax refunds. Might build a home-made NAS with a couple terabytes of disk and put it in the basement.

On the DS, I’ve been playing Rondo of Swords and The World Ends With You. Rondo is a pleasant find, a difficult but still reasonable strategy RPG that makes one think and plan ahead, unlike games such as Revenant Wings which are much more “bring a healer and just mob everyone at the thing they’re strong against!” Also, I have a crush on Atlus by this point. There’s no denying it now. I draw their name with little hearts all around when I’m in meetings.

The World Ends With You is refreshingly original, one of those games where, even with it being Square Enix, it is a bit surprising that it made it to the States. Very Japanese, and the game makes few concessions to the English audience. Sure, long gone are the times of gratuitous name changes, but even the j-pop/j-rock soundtrack remains intact, and that is, to my slightly jaded mind, a bit commendable. Now, if only the main character didn’t suffer from two vile Square Enix staples: unimaginable thinness and nearly sickening teenage angst. Neku is supposed to get better with the latter; I hope it is soon.

My games to beat are now Etrian Odyssey and Rondo of Swords, one I must beat before Etrian Odyssey II (guess which one) is released here, and the other before the Final Fantasy IV remake reaches the States. I’m excited. If I have time before those, Final Fantasy III and The World Ends With You are my RPGs to beat. FF3 is a cakewalk thus far, but its ease and its crude mechanics compared to Final Fantasy V make it hard to stay with for long.

I didn’t really intend this to become all about video games. I’ve been working on a Gentoo Wiki page for the HP 2133 which has kind of slowed down as most of the parts I’m interested in are supported as best they can be without new versions of drivers, I think. There’s some other hardware that I need to try out (the webcam, for example), but I don’t really care that much, so it’s low priority. Notebooky stuff works.

I have a Waterfield Designs bag coming soon, which I’m excited about. Don’t think it will be suitable for gaming books, but I still have that backpack which is going on 5+ years. The little trooper.

I’ve been meaning to survey the gaming group and associated friends to see what they’re using for IM these days. I think the answer for some is “nothing”, with a couple saying “AIM on occasion” or “I idle on Google Talk”, so I’ve not really been motivated to test those waters. I want to get a private Jabber conference room running for the group, since the IRC thing kind of sputtered off and died (I still idle there!), but I know it means getting people to switch to Jabber (or at least Google Talk) and then getting them to use a non-Google Talk client (Pidgin, I bet, but maybe Trillian would work). Sigh. If anyone has interest in switching to one network (I highly suggest something Jabber-like [“XMPP” for the techies]), or trying out conferences, or whatever, email/IM me and we’ll play around.

This really is getting rambly, and people might expect me to write long posts all the time. So I’m wrapping this up by saying that spring is finally here, and that’s why it snowed yesterday.

April 30, 2008 03:31 AM :: Wisconsin, USA  

April 29, 2008

Zeth

Three more tips - use keybindings, scripts and SSH without passwords

Use Readline shortcuts

At the bash prompt, you can use the default readline keybindings, these are similar to Emacs ones. Many of these are also available within other programs that use readline, such as the Python interpreter.

Here are some useful ones:

Ctrl-A Beginning of Line
Ctrl-E End of Line

Ctrl-U Kill (cut) everything left of cursor
Ctrl-K Kill (cut) everything right of cursor
Ctrl-W Kill (cut) the single word before the cursor
Ctrl-Y Yank (paste) the text back

Ctrl-L Clear Screen
Ctrl-D Exit
Ctrl-R Reverse interactive-search, (attempt to complete what is currently being typed using the history file)

SSH without Passwords

If you login to a remote machine often and you get bored of typing the password, then you can use public key cryptography instead.

The way it works is that the remote machine has a copy of your local machine's public key, it can then use that to check that your local machine is really your machine, and so let you in.

To start with, on the local machine, see if you already have a key pair:

ls ~/.ssh/id_?sa.pub

If not, then make one:

ssh-keygen -t rsa

Now you need to copy your public key to the remote host. On the local machine run:

scp ~/.ssh/id_?sa.pub remotehost:

Now we login to the remote server:

ssh remotehost

Append the public key to your authorized keys file

cat id_?sa.pub >> ~/.ssh/authorized_keys

Now you can log in without passwords. Make sure the security of your machines is well thought out. Use disk encryption if possible.
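
Running the append step twice leaves duplicate lines in authorized_keys. A small Python 3 sketch of a repeatable version is below, assuming sshd's usual strict permission checks, which is why the directory and file are kept private to the owner. The function name add_public_key is made up for this illustration.

```python
import os


def add_public_key(pubkey_line, ssh_dir):
    """Append one public key line to ssh_dir/authorized_keys, once only."""
    key = pubkey_line.strip()
    os.makedirs(ssh_dir, mode=0o700, exist_ok=True)
    path = os.path.join(ssh_dir, "authorized_keys")

    existing = ""
    if os.path.exists(path):
        with open(path) as handle:
            existing = handle.read()

    if key in [line.strip() for line in existing.splitlines()]:
        return False                      # already authorised, nothing to do

    with open(path, "a") as handle:
        if existing and not existing.endswith("\n"):
            handle.write("\n")            # avoid gluing onto the last key
        handle.write(key + "\n")
    os.chmod(path, 0o600)                 # keep the file private to the owner
    return True
```

On the remote host it could be called with the contents of the copied .pub file and os.path.expanduser("~/.ssh") as the directory.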

Create a script directory in home directory

I often talk about random Python or bash scripts. The easy way to use them on Linux is to make a dedicated script directory for these.

mkdir ~/bin

Add it to your shell's path. Edit ~/.bashrc and add:

export PATH=$HOME/bin:$PATH

Now all the scripts that you add to ~/bin are always available. This makes things a lot more flexible and fun as you can try out various scripts by dropping them in ~/bin and then deleting them when you are bored of them.

Discuss this post - Leave a comment

April 29, 2008 09:00 PM :: West Midlands, England  

Jürgen Geuter

Firefox3 beta4 bug that annoys me

This Firefox3 beta4 bug is really annoying. Whenever I click a link on my email workspace, the browser window is pulled to that workspace. It really needs fixing (yeah, I could write devilspie rules, but I shouldn't have to).

April 29, 2008 12:10 PM :: Germany  

April 28, 2008

Jürgen Geuter

Bugmenot Extension for Firefox3 beta

I don't know when it was updated, but the bugmenot extension works with the current betas.

Bugmenot collects logins for those sites that require you to log in to read their content (like for example nytimes.com) and allows you to use those "throw-away" logins when you stumble on a page like that: You just rightclick the form and select "Log in with Bugmenot" and the extension will try the logins bugmenot has to get you into the page.

Terribly useful and didn't use to work with the firefox betas but now it does, which really rocks. If you have not used it so far, do it now.

April 28, 2008 10:08 PM :: Germany  

April 27, 2008

Alex Bogak

Cellular Video Calls: reality that never happened?

Hi all

I recently started working for Comverse - the company supplies solutions for telephony providers, mainly cellular ones. Our product lies in the core of the operator's network and manages all or some of the services provided by the operator, such as Voice Mail, SMS, MMS, Video Calls, etc. Our system can provide a complete solution or integrate its parts with other available solutions in the market.

As I'm going through training at the moment, an interesting thought occurred to me during the studies. In recent years I got the impression from some of the cellular operators that video calling was the major driver behind the transition to fast networks such as 3G, 3.5G and the next generations. While that is true in some cases, I am not sure the reasoning really holds up.

Just think about it: would you make a video call using a modern handset that has a video camera? Of course not - you'd have privacy issues right away. Do you really want the whole world to hear what you are saying? So what is the point of having a fast network without providing any kind of service on it? This is probably one of the reasons cellular providers have a problem: they have the infrastructure, but no services to monetize it. So everything else costs more to cover the losses. And this is something that I as a consumer do not like.

I wonder why we do not have unlimited cellular data plans where I live. We do have various packages, but they are all paid per minute or per MB of data - just like dial-up used to be ages ago. It really would be great to have internet everywhere, and I think that cellular companies are missing something here.

It's not that they make more money on a pay-per-minute/byte basis. I'm simply not buying the service at all while this is the payment scheme. So the typical users are business folks who have to have access to their email at all times. And even then, better options exist (we have WiFi hotspots almost everywhere now).

Just wonders of the world I guess.

April 27, 2008 03:16 PM :: Israel  

Nirbheek Chauhan

<3 X, PulseAudio, and DAAP

So, right now, I'm sitting at my comp listening to Norah Jones. But this isn't like any other music-listening time. Right now, I'm:


  1. Logged into a lab computer via XDMCP: I could've used VNC, but that would've required someone to be logged-in on the lab comp.

  2. Using my laptop's PulseAudio as the lab computer's default PulseAudio sink: This makes the lab computer's PulseAudio send all sound to my laptop's PulseAudio by default.

  3. Connected to my laptop's DAAP share from the lab computer's Rhythmbox: The music on my laptop becomes accessible from the lab computer's Rhythmbox.



This setup results in me playing Norah Jones on the lab computer, and listening to it here :)

April 27, 2008 01:28 AM :: Uttar Pradesh, India  

April 26, 2008

Martin Matusiak

download all media links on a webpage

This has probably happened to you. You come to a web page that has links to a bunch of pictures, or videos, or documents that you want to download. Not one or two, but all. How do you go about it? Personally, I use wget for anything that will take a while to download. It’s wonderful, accepts http, https, ftp etc, has options to resume and retry, it never fails. I could just use Firefox, and if it’s small files then I do just that, and click all the links in one fell swoop, then let them all download on their own. But if it’s larger files then it’s not practical. You don’t want to download 20 videos of 200mb each in parallel, that’s no good. If Firefox crashes within the next few hours (which it probably will) then you’ll likely end up with not even one file successfully downloaded. And Firefox doesn’t have a resume function (there is a button but it doesn’t do anything :rolleyes: ).

So there is a fallback option: copy all the links from Firefox and queue them up for wget: right click in document, Copy Link Location, right click in terminal window. This is painful and I last about 4-5 links before I get sick of it, download the web page and start parsing it instead. That always works, but I have to rig up a new chain of grep, sed, tr and xargs wget (or a for loop) for every page, I can never reuse that and so the effort doesn’t go a long way.

There is another option. I could use a Firefox extension for this, there are some of them for this purpose. But that too is fraught with pain. Some of them don’t work, some only work for some types of files, some still require some amount of manual effort to pick the right urls and so on, some of them don’t support resuming a download after Firefox crashes. Not to mention that every new extension slows down Firefox and adds another upgrade cycle you have to worry about. Want to run Firefox 3? Oh sorry, your download extension isn’t compatible. wget, in contrast, never stops working. Most limiting of all, these extensions aren’t Unix-y. They assume they know what you want, and they take you from start to end. There’s no way you can plug in grep somewhere in the chain to filter out things you don’t want, for example.

So the problem is eventually reduced to: how can I still use wget? Well, browsers being as lenient as they are, it’s difficult to guarantee that you can parse every page, but you can at least try. spiderfetch, whose name describes its function: spider a page for links and then fetch them, attacks the common scenario. You find a page that links to a bunch of media files. So you feed the url to spiderfetch. It will download the page and find all the links (as best it can). It will then download the files one by one. Internally, it uses wget, so you still get the desired functionality and the familiar output.

If the urls on the page require additional post-processing, say they are .asx files you have to download one by one, grab the mms:// url inside, and mplayer -dumpstream, you at least get the first half of the chain. (Unlikely scenario? If you wanted to download these freely available lectures on compilers from the University of Washington, you have little choice. You could even chain spiderfetch to do both: first spider the index page, download all the .asx files, then spider each .asx file for the mms:// url, print it to the screen and let mplayer take it from there. No more grep or sed. :) )

Features

  • Spiders the page for anything that looks like a url.
  • Ability to filter urls for a regular expression (keep in mind this is still Ruby’s regex, so .* to match any run of characters, not * as in file globbing, (true|false) for choice and so on.)
  • Downloads all the urls serially, or just outputs to screen (with --dump) if you want to filter/sort/etc.
  • Can use an existing index file (with --useindex), but then if there are relative links among the urls, they will need post-processing, because the path of the index page on the server is not known after it has been stored locally.
  • Uses wget internally and relays its output as well. Supports http, https and ftp urls.
  • Semantics consistent with "for url in urls; do wget $url…": does not re-download completed files, resumes downloads, retries interrupted transfers.
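
To make the spider-then-filter idea a bit more tangible, here is a minimal Python sketch. It is not spiderfetch's code (that is Ruby, and more lenient about what counts as a url); the helper name find_urls, the sample page and the example.org urls are all made up for illustration, and it only looks at href/src attributes. It also shows why a stored index page is a problem: relative links can only be resolved when the url of the page is known.

```python
import re
from urllib.parse import urljoin

# Hypothetical helper, not part of spiderfetch: pull out href/src values.
LINK_RE = re.compile(r"""(?:href|src)\s*=\s*["']([^"']+)["']""", re.IGNORECASE)

def find_urls(html, base_url, pattern=".*"):
    """Return absolute urls found in html that match the regex pattern."""
    wanted = re.compile(pattern)
    urls = []
    for link in LINK_RE.findall(html):
        url = urljoin(base_url, link)  # relative links need the page's url
        if wanted.search(url) and url not in urls:
            urls.append(url)
    return urls

# Made-up page content and urls, just to exercise the function.
page = '<a href="talks/2008-intro.ogg">intro</a> <a href="/about.html">about</a>'
print(find_urls(page, "http://example.org/media/", r"2008.*ogg"))
```

From here the resulting list could be handed to wget one url at a time, which is the part spiderfetch does for you.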

Limitations

  • Not guaranteed to find every last url, although the matching is pretty lenient. If you can’t match a certain url you’re still stuck with grep and sed.
  • If you have to authenticate yourself somehow in the browser to be able to download your media files, spiderfetch won’t be able to download them (as with wget in general). However, all is not lost. If the urls are ftp or the web server uses simple authentication, you can still post-process them to: ftp://username:password@the.rest.of.the.url, same for http.

Download spiderfetch:

Recipes

To make the use a bit clearer, let’s see some concrete examples.

Recipe: Download the 2008 lectures from Fosdem:

spiderfetch.rb http://www.fosdem.org/2008/media/video 2008.*ogg

Here we use the pattern 2008.*ogg. If you first run spiderfetch with --dump, you’ll see that all the urls for the lectures in 2008 contain the string 2008. Further, all the video files have the extension ogg. And whatever characters come in between those two things, we don’t care.
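
If the difference between a regex and a file glob is hazy, this small Python sketch may help. It uses Python's re and fnmatch modules rather than Ruby's regex engine (for a pattern this simple they behave the same), and the url is invented for the example.

```python
import re
from fnmatch import fnmatch

url = "http://example.org/2008/media/talk.ogg"  # made-up url

# Regex: "." is any character and "*" repeats it, so ".*" spans the gap.
regex_hit = bool(re.search(r"2008.*ogg", url))

# Glob: "*" on its own already stands for any run of characters.
glob_hit = fnmatch(url, "*2008*ogg")

# A lone "*" in a regex repeats the preceding "8" instead, so this misses.
wrong_hit = bool(re.search(r"2008*ogg", url))

print(regex_hit, glob_hit, wrong_hit)
```

The last line is the classic mistake: writing a glob where a regex is expected matches far less than you think.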

Recipe: Download .asx => mms videos

Like it or not, sometimes you have to deal with ugly proprietary protocols. Video files exposed as .asx files are typically pointers to urls of the mms:// protocol. Microsoft calls them metafiles. This snippet illustrates how you can download them. First you spider for all the .asx urls, using the pattern \.asx$, which means “match on strings containing .asx as the last characters of the string”. Then we spider each of those urls for actual urls to video files, which begin with mms. And for each one we use mplayer -dumpstream to actually download the video.

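#!/bin/bash

# Directory this script lives in; spiderfetch.rb is expected alongside it.
mypath=$(cd "$(dirname "$0")" && pwd)
webpage="$1"

# First pass: every .asx url on the page. Second pass: the mms:// url inside each.
for url in $("$mypath/spiderfetch.rb" "$webpage" "\\.asx$" --dump); do
	video=$("$mypath/spiderfetch.rb" "$url" "^mms" --dump)
	mplayer -dumpstream "$video" -dumpfile "$(basename "$video")"
done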
 

Download this code: asx_spiderfetch.sh

April 26, 2008 07:44 PM :: Utrecht, Netherlands  

April 25, 2008

Zeth

Twelve commandments for Beautiful Python code

Living Code

David Parker famously said that texts are living, once they leave the pen of the author then they have a life of their own, you never know where the text will end up or how it will be modified. For Python code that is even more true.

The beauty of Python is that you can write code fast, share code and modify code. For this to work, your code needs to be readable. Writing code is easy; reading other people's code is much harder, as is reading your own code after a few months or years have passed.

Therefore the aim is to make code as readable as possible, even if it causes a little more work when you write it. The way to make your Python code most readable is to keep to the Style Guide for Python Code, also known as PEP8.

Pylint for the Win

It is far easier to keep your code valid to PEP8 as you go along, than to try to move a large codebase to PEP8 at the end. I recommend the use of a tool called pylint.

Pylint is available from all Linux distributions' package managers (e.g. apt-get install pylint or emerge pylint). Here are some instructions for Windows.

If you have ever made a webpage you probably know about HTML-tidy or the online W3C Validator tool. These tell you everything wrong with your HTML.

Pylint is similar, it goes through and tells you both syntax errors and also how your code differs from the PEP8 standard.

There are some corner cases in which you will need to give pylint the finger, but doing it consciously for good reason is better than because you are sloppy.

PEP8 is better than your crappy style

People often don't use PEP8. This is for a variety of (bad) reasons.

Firstly, sometimes people are tourists from another programming language, they do not know any better so they write their Python code like it was Java or C code.

Secondly, sometimes people think their (cl)own style is better than PEP8 in some technical way. Well that does not matter. I might have a better way to design a plug socket, but if I implemented my better plug socket, I would not be able to buy any electrical devices.

There can only be one standard, and PEP8 is that standard. If you want to change that standard then bribe, sleep with or kill Guido van Rossum.

Not following the standard makes your code less readable to others, this prevents the quick reuse that Python is designed for (see above).

If you are a free-software/open-source project, then you particularly should be ashamed if you write hard to read code, because allowing other people to read, understand and modify your code is the whole point.

Lastly, some people don't use PEP8 because the document is too circular and verbose for them to remember. I feel your pain, below are the main points in 12 easy rules.

The 12 commandments

Guido, who brought you out of the land of Visual Basic, out of the land of slavery, spake all these words to thee:

  • Module names should be in all lowercase - hello.py.
  • Class names should be in CamelCase.
  • Methods and functions should be in lower_with_underscores
  • Implementation-specific 'private' methods _single_underscore_prefix
  • Especially private non-subclassable methods __double_underscore_prefix
  • Top level constants (i.e. those that are not in a function or class) should be in BLOCKCAPITALS. Overuse of these constants may make your code less reusable.
  • If a variable inside a function or method is so temporary and disposable that you cannot give it a name, then use i for the first one, j for the second and k for third.
  • Indentation is four spaces per level. No tabs. If you break this rule then you must be stoned in the village square.
  • Lines are never more than 80 characters wide. Tip: break lines with a backward slash \. You do not need to do this if there are parentheses, brackets or braces. Don't add extra parentheses just to break lines, use \ instead.
  • Spaces after commas, (green, eggs, and, ham)
  • Spaces around operators i = i + 1
  • Write docstrings for all public modules, functions, classes, and methods. Python is an international community, so use English for docstrings, object names and comments. If you want to provide local translations then use a proper localisation library.
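
To see several of these rules side by side, here is a small illustrative module. Everything in it (the breakfast theme, MAX_EGGS, EggBasket, count_ham) is invented for the example; treat it as one possible reading of the rules above, not as an excerpt from PEP8 itself.

```python
"""Breakfast helpers (an illustrative module, e.g. saved as breakfast.py)."""

MAX_EGGS = 12  # top-level constant in BLOCKCAPITALS


class EggBasket:
    """Hold a number of eggs, never more than MAX_EGGS."""

    def __init__(self, eggs=0):
        self._eggs = eggs  # single underscore: implementation detail

    def add_eggs(self, count):
        """Add count eggs and return the new total, capped at MAX_EGGS."""
        self._eggs = min(self._eggs + count, MAX_EGGS)
        return self._eggs


def count_ham(items):
    """Return how many of the items are 'ham'."""
    total = 0
    for i in items:  # i: a throwaway name for a disposable variable
        if i == "ham":
            total = total + 1  # spaces around operators
    return total


basket = EggBasket(10)
print(basket.add_eggs(5), count_ham(("green", "eggs", "and", "ham")))
```

Running pylint over a file like this is a quick way to check how close your own reading of the rules is to the tool's.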


April 25, 2008 06:00 PM :: West Midlands, England  

April 24, 2008

Michael Klier

The Twitter Blacklist And Another Greasemonkey Script

If you, like me, use twitter on a regular basis, you may like this one.

There's a great new site around called The Twitter Blacklist. It was created by Earle Martin and intends to gather a list of all the spammers and morons who either try to use the service to promote their nonsense products/websites or are simply attention addicts. In both cases, these people blindly follow as many other people as they can. The best indicator of whether someone is a twitter spammer is the ratio between how many people they follow and how many follow them.

1:5 = twittercaster, 1:2 = notable, 1:1 socially healthy, 2:1 newbie or social climber, 5:1 twitter spammer. - Evan Prodromou
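
As a rough sketch of how that rule of thumb could be applied, here is a few lines of Python. The quote only gives five reference points, so the choice to pick the nearest one on a log scale is my assumption, as are the names LABELS and classify; this has nothing to do with how the Twitter Blacklist itself decides who gets listed.

```python
import math

# Reference points from the quote, as following:followers ratios.
LABELS = [
    (1 / 5, "twittercaster"),
    (1 / 2, "notable"),
    (1 / 1, "socially healthy"),
    (2 / 1, "newbie or social climber"),
    (5 / 1, "twitter spammer"),
]


def classify(following, followers):
    """Return the label whose ratio is nearest on a log scale (an assumption)."""
    ratio = max(following, 1) / max(followers, 1)  # avoid dividing by zero
    return min(LABELS, key=lambda item: abs(math.log(ratio / item[0])))[1]


print(classify(500, 100))
print(classify(100, 100))
```

Someone following 500 people with 100 followers lands on the 5:1 point, and an even 100/100 lands on 1:1.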

For a couple of days now, the twitterblacklist has had a simple yet nice API which allows you to check whether a certain user is listed. This is where Greasemonkey enters the game :-).

I wrote a tiny Greasemonkey script which looks up the username of the visited twitter profile and displays a nice warning message at the top of the page if it's listed at the twitter blacklist.

Blocking the user then, is just one click away 8-).

I made the script available at the userscript website, you can fetch it here.

I hope this also finds its way into some of the available twitter clients. If the twitter blacklist grows (which it does almost daily) it will make twitter an even nicer place to stay.

And last but not least: if you know other twitter spammers who aren't listed at the twitter blacklist yet, remember to report them (details about how to report a spammer can be found at http://twitterblacklist.com).


April 24, 2008 08:01 PM :: Germany  

Nirbheek Chauhan

Google Summer of Code, Gentoo

Right after the GSoC results were announced, Anant Narayanan sent an email to the gentoo-soc ML welcoming the students with lots of good advice about how to proceed, what all they can expect, and what all they're expected to do. Thanks Anant!

The only thing about that email that irked me was that third party source code management systems such as code.google.com, sf.net, and repo.or.cz were recommended for hosting the source code. Now, for a small project that does not have much in the name of Infra, this would be acceptable, but for a full-fledged organisation with a dedicated infra team, this looks quite shoddy (this probably happened due to insufficient communication between gentoo-soc and gentoo-infra). And on top of that, projects getting distributed across several repositories makes it impossible to find the code during and after SoC is finished. For instance, I am completely unable to find the code for a lot of the SoC 2007 projects.

Now, I understand that Gentoo Infra is very short-staffed and overworked at the moment, and hosting dedicated Trac setups for all the students is not an easy task. So I poked my mentor Patrick Lauer and asked him if he could host Redmine at gentooexperimental which could then be used as a central place for tracking/hosting all the Gentoo SoC projects. He agreed, but his dislike of Rails meant that I would have to do the setup and manage it.

And so it was done, and an email sent to the list. soc.gentooexperimental.org now hosts Redmine for project management.

After a small chat with Donnie Berkholz on IRC, we agreed that hosting the source code under Gentoo Infra and using Redmine for the rest of the stuff would be best. OTOH, Alec Warner was in favour of giving the students full freedom with hosting their projects as long as the place of their choice was usable. I replied to his email suggesting that in the interest of keeping the projects accessible from one place, people who want to do their development somewhere else be asked to create a dummy project at soc.ge.o which points to the place where the actual development is taking place.

Let's see how things turn out.

April 24, 2008 11:39 AM :: Uttar Pradesh, India  

April 23, 2008

Alex Bogak

Oh Gentoo, what had become of thee?

Dear friends

Yesterday was an important day for me. I stumbled into an issue, small but significant, which made me come to the following decision: I am leaving Gentoo as a desktop platform.

It does not come as an easy decision. I've been using Gentoo and quasi-actively participating in the community for about 5 years. I have it installed currently on 3 out of 4 computers I have (the last one being mac mini, which I keep with Mac OS X). So why would I take this decision?

It all began with one simple thing. You may have read my previous posts on various WINE installations, and I use some Windows applications with WINE. But recently Internet Explorer stopped working. I tried to reinstall it (and that is easy in Gentoo, just as in any other Linux distribution with a decent package manager), but to no avail.

The next step was slightly more complicated, but still quite simple: I used VMWare to install a complete Windows XP environment. It worked fine for a while, until I found I couldn't use VM images between the different computers I have. It just stopped working. Besides that, the performance of VMWare on my AMD Athlon 1.8 with 1G of memory was, to say the least, appalling. Next came Innotek (now Sun) VirtualBox. This is the best emulation environment I could find to work on my computer. It works fine, and I use it for all my Windows-related projects.

But as a side effect of all these installations, the system began breaking. I started noticing various weird things, such as applications suddenly freezing at times. A couple of days ago, when there were no applications running, I saw CPU usage at ~80%, so I did what most Windows users do: I rebooted the machine.

And then the system just broke. System utilities seemed nowhere to be found. Some init scripts seemed to be incorrect, etc. I somehow fixed the situation by copying old versions from other projects, and updating the system. But now GNOME has problems with graphics and themes, and most applets do not work or do not even exist. It just never ends, does it?

So, as a normal user of Gentoo, I went to emerge my world. I hadn't done that for a couple of months, so there was almost 1G of updates waiting for me. I downloaded all the packages, and began the emerge.

The last straw was a simple apache update. The system update failed because I had an old version. Not because the compile didn't work. Just because it needed me to manually do something!! It redirected me to a Gentoo doc site, which had 2 lines of code that fixed the problem, and emerge now runs again.

Why in heaven's name wasn't this done automatically? Why did I lose half a day, during which my system could have been updating? I lost this time because the update procedure stopped. I had to fix the Apache configuration so my GNOME desktop could continue updating. I understand that this specific issue with Apache may be serious, and that not many ordinary people run it on their computers, but it still bugs me. I don't like it when I have to do this sort of manual intervention in an update procedure.

So what is the problem here? Daniel Robbins once created a Gentoo motto: The goal of Gentoo is to design tools and systems that allow a user to do his or her work as pleasantly and efficiently as possible, as they see fit....If the tool forces the user to do things a particular way, then the tool is working against, rather than for, the user. (cited from Gentoo Philosophy)

The problem is that I spend too much time caring for the computer with Gentoo. I don't have that luxury anymore. There was a time when geeking with the machine and fixing problems was cool. Today, it's a burden. I value time, and I only have 24 hours a day of it.

I believe that this may be one of the general problems with Gentoo. When it began, most folks using Linux were techies who cared about all the bits on their computers. Gentoo fit very well in this community, so it flourished and became very popular. It provided tools that no one else had (people used to compile everything manually anyway), and a community of good will and a lot of friendship. It had the best documentation among its brothers (and maybe still does), and the best team of engineers.

But nowadays, many users want a word processor, web browser, email program and video player. They want it now, and not to wait 20 minutes for a compilation to finish. They don't care about technicalities. And as Gentoo hasn't changed its nature, it doesn't fit the majority anymore. Sabayon, anyone?

The Gentoo distro has proven over the years that it will stay the way it is. And that's why it won't be back on my desktop soon.

So, Gentoo, stay on server.

Ubuntu, CentOS - my desktop is waiting.

April 23, 2008 02:19 PM :: Israel  

April 22, 2008

Clete Blackwell

On the Ignorance of People

The USA’s 2008 election is the first governmental election of any kind that I will have an opportunity to vote in. This is a very important election for everyone. We face major decisions about the war in Iraq and on many major issues. A lot is at stake in this year’s general election. I have followed this election more than any other. I have been involved in a lot of conversations with people who come from very different cultures and backgrounds. Many of these people support my viewpoints, many of them disagree, and many of them could not care less about politics.

I have been appalled at how little it takes for a certain candidate to obtain an undecided and unopinionated person’s vote. Those of us who are strong conservatives or liberals will not be swayed to vote the opposite way for anything in the world (although this is debatable). However, undecided people are swayed way too easily. I have heard at least ten people tell me that they will vote for Obama because he is “young” and relates to “the younger generation” more than any of the “old” people such as Clinton and McCain. I have had one person tell me that, although they are a liberal, they would not vote for McCain if they were conservative because he is “old” and that he will probably “die in office.”

People disregard the candidate’s actual views and opinions and focus in on the unimportant things. They decide to vote for candidates because they “seem like nice people.” I am not sure if it is the college students’ way or if the entire population is this way, but it’s atrocious. If I wanted to, I could run for President, take all of the “correct” views on all of the issues; that is, the ones that would get me elected. I could completely disagree with all of these views, but I could stand up there, smile, act friendly, cry when I hear a sad story, etc. I could be a complete phony and get some of these idiots to follow me just because I am a down-to-earth and likable character.

What has our country come to? Elect someone because they are honest and they agree with your issues. Elect them because they will lead our country in the right (or should I say correct?) direction. But do NOT elect them because they are nice people and they seem friendly. This is ridiculous. Does anyone want someone in office who is nice but leads our country to demise? I don’t think so.

Amendment: It seems that my post has sparked a large debate over at the Gentoo Forums. Check it out here. It looks like most people agree with my basic argument that people are ignorant. However, many people seem to have some insane (read: extreme socialistic / communistic) ideas on the Gentoo Forums.

April 22, 2008 10:39 PM

Nirbheek Chauhan

Wheeeeeeeeeee

So today was the day.
An insane night, on an insane channel.

So we were promised Cake.
Which got a bit delayed,
but in the end we got a plate
Which was truly worth the wait

Translation: I've gotten accepted into GSoC, and the community bonding period has begun!

A couple of people I know got accepted as well -- Satya, Ramnik, and Siddarth. This will be a fun summer *grin*.

I was going through the abstracts of the accepted applications in orgs that interest me, and I found the following to be *very* interesting (in no specific order):

April 22, 2008 01:30 AM :: Uttar Pradesh, India  

April 21, 2008

Martin Matusiak

clocking jruby1.1

Did you hear the exciting news? JRuby 1.1 is out! For real, you can call your grandma with the great news. :party: Wow, that was quick.

Okay, so the big new thing in JRuby is a bytecode compiler. As you may know, up to 1.0 it was just a Ruby interpreter in Java. Now you can actually compile Ruby modules to Java classes and no one will know the difference, very devious. :cool: Sounds like Robin Hood in a way, doesn’t it?

The JRuby guys are claiming that this makes JRuby on par with “regular Ruby” on performance, if not better. Hmm. Just to be on the safe side, what size shoes do you wear? Oh ouch, those are going to be tricky to fit in your mouth. :/ And Freud will say you’re stuck in the oral stage. Too much? Okay.

So here is my completely unvetted, dirty, real world test. No laboratory conditions here, you’re in the ghetto. First we need something *to* test. I don’t have a great deal of Ruby code at my disposal, but this should do the trick. How does scanning the raw filesystem for urls sound? The old harvest script actually does a half decent job of turning up a bunch of findings.

Now introducing the contenders. First up, his name is JRuby, you know him from occasional mentions on obscure blogs and the programming reddit past the top 500 entries. He promises to free all Java slaves by giving away free Rubies to everyone!

Aaand the incumbent, the famous… Ruby! You know him, your parents know him, every family would adopt him as their own child if they could. He’s the destroyer of kingdoms and the creator of empires, he’s bigger than Moses himself!

Our two drivers will be racing across a hostile territory. Your track is a 25gb ext3 live file system. During this time, I can promise you that only Firefox is likely to be writing new urls to disk, but I could be lying eheheh. Due to the unpredictable nature of this rally track, regulations allow only one racer at a time, but you will be clocked.

First up is the new kid on the block Jay….Ruby. The Ruby code will not be compiled before execution, we’ll let the just-in-time compiler do its thing.

$ time ( sudo cat /dev/sda5 | bin/jruby harvest.rb --url > /tmp/fsurls.jruby )
real 39m26.547s
user 37m19.072s
sys 1m28.406s

Not too shabby for a first run, but since this is a brand new venue, we have no frame of reference yet. Let’s see how Ruby will do here.

$ time ( sudo cat /dev/sda5 | harvest.rb --url > /tmp/fsurls.ruby )
real 78m42.186s
user 62m12.537s
sys 2m18.721s

Well, look at that! The new kid is pretty slick, isn’t he? Sure is giving the old man a run for his money. Let’s see how they answered the questions.

$ lh
-rw-r--r-- 1 alex alex 86M 2008-04-21 18:29 fsurls.jruby
-rw-r--r-- 1 alex alex 8.6G 2008-04-21 20:58 fsurls.ruby

Yowza! No less than a hundred times more matches with Ruby. What is going on here? Did Jay just race to the finish line, dropping the vast majority of his parcels? Or did father Ruby see double and triple and quadruple, ending up with lots and lots of duplicates? Well, we don’t really *know* how many urls exist in those 25gb of data, but it seems a little bit suspect that there would be in excess of 8gb of them.

One way or the other, it’s pretty clear that the regular expression semantics are not entirely identical. In fact, you might be sweating a little right now if your code uses them heavily.

UPDATE: Squashing duplicates in both files actually produces two files of very similar size (13mb), in which the disparity of unique entries is only a very reasonable 4% (considering the file system was being written to in the process). The question still remains how did Ruby produce 8gb of output.
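
For the curious, squashing duplicates and comparing what is left can be sketched in a few lines of Python. This is not the command I actually ran; the helper names, the sample data and the particular definition of "disparity" (entries found in only one of the two sets, as a fraction of all distinct entries) are assumptions made for the example.

```python
def unique_lines(lines):
    """Return the set of distinct, non-empty lines (whitespace stripped)."""
    return {line.strip() for line in lines if line.strip()}


def disparity(a, b):
    """Fraction of all distinct entries that appear in only one of the sets."""
    return len(a ^ b) / max(len(a | b), 1)


# Made-up sample data standing in for the two output files.
jruby_out = ["http://a.example", "http://b.example", "http://b.example"]
ruby_out = ["http://a.example"] * 50 + ["http://b.example", "http://c.example"]

left, right = unique_lines(jruby_out), unique_lines(ruby_out)
print(len(left), len(right), disparity(left, right))
```

For the real files you would pass an open file object instead of a list, since iterating over a file yields its lines one at a time and only the unique entries are kept in memory.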

April 21, 2008 06:50 PM :: Utrecht, Netherlands  

Nick Cunningham

Possibly the best use for the eee-pc yet

I couldn't resist posting this. Every time I think xkcd couldn't get any better, they go and post something like this!

Of course, now that xkcd has posted this, how long till someone goes off and actually makes it :D

(in case the image doesn't show, here's the original xkcd post)

April 21, 2008 12:16 PM :: Portsmouth, England  

Alex Bogak

Linux on the desktop now!

Hello all

I just read an article where Novell's CEO says that Linux will not be on the consumer desktop for at least another 3 years. And that made me think.

We, users of Linux and open source software, would be happy to see everyone using Linux. We use it every day ourselves. And we're happy with it. Dell is installing Ubuntu Linux on various models, and people are buying them, preferring this to installing it by themselves. IBM, Sun and other vendors provide Linux systems just as they do Windows-based ones. Isn't this a nice trend that shows readiness of an operating system and its acceptance by vendors?

With this trend, how can it be that Linux on the desktop will take another 3-4 years? And what does that mean exactly? Linux desktop share currently stands at about 3-4% of total desktop installations. Another 3-4% goes to Apple Mac OS X installations, and a similar share to other alternative operating systems (such as Free/Net/OpenBSD, BeOS, Haiku, OpenSolaris, etc).

But Windows spans over 90% of all desktop computers. So my guess is that in 3-4 years Linux installations could get to, say, 10%. Will this mean that it is "on the desktop"? What numbers should it show, compared to Windows, for CEOs and other similarly hierarchically placed people to consider it "there"?

I personally believe that any tool you use should serve its purpose and serve it well. If it does not do what it is supposed to do, choose another tool. I recently began to believe that there's a place for Windows systems as well as for Linux systems, but I am still open-source minded. Choosing Linux or Windows, or Mac or Solaris, is purely a business decision in many cases. If choosing Linux on the desktop provides me with the tool to do my job (or work, or have fun and procrastinate) - that's fine. If Windows does the same - that's fine too; I'll just go with the cheaper solution in the long run.

All the tools I use in Windows (those that are not forced on me anyway) are open source - VirtualWin, vi, GIMP, Open Office, Firefox, Innotek VirtualBox, 7zip, and many more - and if I go to Linux I will use the same tools, so I don't have to re-teach myself each time I switch platform.

So for me Linux has really been on the desktop for about 4 and a half years already. I don't even use Windows at home anymore. And yet, Novell's CEO thinks that it will take another 3-4. If that's what a CEO thinks, then no wonder that it is all about Novell Linux. Maybe they are hibernating, with an alarm clock set for a perfect future 3 years away.

I wonder where RedHat and Ubuntu will be by then.

Cheers.

Update: it seems I'm not the only one

April 21, 2008 12:01 PM :: Israel  

April 20, 2008

Martin Matusiak

what the heck is a closure?

That’s a question that’s been bugging me for months now. It’s so vexing to try to find something out and not getting it. All the more so when you look it up in a couple of different places and the answers don’t seem to have much to do with each other. Obviously, once you have the big picture, all those answers intersect in a meaningful place, but while you’re still hunting for it, that’s not helpful at all.

I put this question to a wizard and the answer was (not an exact quote):

A function whose free variables have been bound.

Don’t you love to get a definition in terms of other terms you’re not particularly comfortable with? Just like a math textbook. This answer confused me, because I couldn’t think of a case that I had seen where that wasn’t the case, so I thought I must be missing something. The Python answer is very simple:

A nested function.

It’s sad, but one good answer is enough. When you can’t get that, sometimes you end up stacking up several unclear answers and hoping you can piece it all together. And that can very well fail.

I read a definition today that finally made it clear to me. It’s not the simplest and far from the most intuitive description. In fact, it too reads like a math textbook. But it’s simply what I needed to hear in words that would speak to me.

A lexical closure, often referred to just as a closure, is a function that can refer to and alter the values of bindings established by binding forms that textually include the function definition.

I read it about 3 times, forwards and backwards, carefully making sure that as I was lining up all the pieces in my mind, they were all in agreement with each other. And once I verified that, and double checked it, I felt so relieved. Finally!

I can’t follow the Common Lisp example that follows on that page, but scroll down and you find a piece of code that is much simpler.

(define (foo x)
	(define (bar y)
		(+ x y))
	bar)
 
((foo 1) 5) ; => 6
((foo 2) 5) ; => 7

What’s going on here? First there is a function being defined. Its name is foo and it takes a parameter x. Now, once we enter the body of this function foo, straight away we have another function definition - a nested function. This inner function is called bar and takes a parameter y. Then comes the body of the function bar, which says “add variables x and y”. And then? Follow the indentation (or the parentheses). We have now exited the function definition of bar and we’re back in the body of foo, which says “the value bar”, so that’s the return value of foo: the function bar.

In this example, bar is the closure. Just for a second, look back at how bar is defined in isolation, don’t look at the other code. It adds two variables: y, which is the formal parameter to bar, and x. How does x receive its value? It doesn’t. Not inside of bar! But if you look at foo in its entirety, you see that x is the formal parameter to foo. Aha! So the value of x, which is set inside of foo, carries through to the inner function bar.

Can we square this code with the answers quoted earlier? Let’s try.

A function whose free variables have been bound. - A function, in this case bar. Free variables, in this case x. Bound, in this case defined as the formal parameter x to the function foo.

A nested function. - The function bar.

A lexical closure, often referred to just as a closure, is a function that can refer to and alter the values of bindings established by binding forms that textually include the function definition. - A function, in this case bar. That can refer to and alter, in this case bar refers to the variable x. values of bindings, in this case the value of the bound variable x. established by binding forms, in this case the body of the function foo. that textually include the function definition, in this case foo includes the function definition of bar.

So yes, they all make sense. If you understand what it’s all about. :/

Let’s return to the code example. We now call the function foo with argument 1. As we enter foo, x is bound to 1. We now define the function bar and return it, because that is the return value of foo. So now we have the function bar, which takes one argument. We give it the argument 5. As we enter bar, y is bound to 5. And x? Is it an undefined argument, since it’s not defined inside bar? No, it’s bound *from before*, from when foo was called. So now we add x and y.

In the second call, we call foo with a different argument, thus x inside of bar receives a different value, and once the call to bar is made, this is reflected in the return value.
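
For readers more at home in Python, the same walkthrough can be sketched as follows. This is my own translation of the Scheme example, not code from the page quoted above; the names foo and bar are kept to match, while add_one and add_two are made up for illustration.

```python
def foo(x):
    # x is bound here, at the moment foo is called
    def bar(y):
        # x is free inside bar; its value comes from the enclosing call to foo
        return x + y
    return bar  # foo returns the inner function itself, not a number


add_one = foo(1)  # a version of bar with x bound to 1
add_two = foo(2)  # a separate version of bar with x bound to 2

print(add_one(5))  # prints 6
print(add_two(5))  # prints 7
```

In CPython the captured value can be inspected with add_one.__closure__[0].cell_contents, which makes the "bound from before" part visible.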

Well, that was easy. And to think I had to wait so long to clarify such a simple idiom. So what is all the noise about anyway? Think of it as a way to split up the assignment of variables. Suppose you don’t want to assign x and y at the same time, because y is a “more dynamic” variable whose value will be determined later. Meanwhile, x is a variable you can assign early, because you know it’s not going to need to be changed.

So each time you call foo, you get a version of bar that has a value of x already set. In fact, from this point on, for as long as you use this version of bar, you can think of x as a constant that has the value that it was assigned when foo was called. You can now give this version of bar to someone and they can use it by passing in any value for y that they want. But x is already determined and can’t be changed.

April 20, 2008 12:11 AM :: Utrecht, Netherlands  

April 18, 2008

David Grant

Firefox 3.0 Memory Consumption Greatly Improved

Last time I blogged about Firefox (2.x), I was complaining about how much memory it was sucking up. I have read about how many memory leaks were supposed to have been fixed in Firefox 3 but I had to see it to believe it. So after suffering again from low memory while running VMware and Firefox (whose memory consumption regularly climbs to 500MB) at the same time, I decided to upgrade and leave Firefox 2.0 for good. Firefox 3.0beta5 is blazingly fast and I have yet to see it use more than 170MB of memory. Thanks to all the Firefox developers for finally fixing what was probably the biggest problem with Firefox 2.0.

Update: I also installed it on my Linux machine at home, and memory usage and performance with many tabs open are far better than with Firefox 2.

April 18, 2008 09:04 PM :: British Columbia, Canada  

Zeth

Filesharing is the democratic choice

Nonsense laws are socially divisive

In Saudi Arabia, ownership of a Bible is illegal. If you are found in possession of a Bible, at the very least you will have it confiscated, and you may be given corporal punishment. If you are found in possession of many Bibles, then you can be executed (source).

We do not have that law in the UK, or any laws like it. This is not only because the UK is a Christian country (in name at least), but also because the majority of people in the UK think that laws against owning books are stupid; only barbarians have such laws. In Britain you can own the Bible, the Koran, Harry Potter, whatever the heck book you want.

No matter what legal arguments you make for a law, if the majority of the public think it is stupid then it won't work.

A lot of countries used to have laws against "Nightwalking", i.e. walking around at night, because people out at night are obviously up to no good. Most countries have abolished such laws. Make laws about soliciting maybe, but wandering around at night? That was just silly.

Silly laws should be abolished. If you just leave silly laws on the statute book, expecting everyone to just ignore these laws, then you are undermining the law itself and making a mockery of the institutions charged with enforcing the law.

How many million people file share?

A study in 2005 claimed that 9.2 million people in Britain were involved in filesharing, which represented an annual 50% increase from the 4.3 million people that were involved in filesharing in 2003. Source 1, Source 2.

What is the number now? If a 50% annual increase was maintained, then it would have been 13.8 million people in 2006, 20.7 million people in 2007, and 31 million by the end of this year, 2008.

By the end of next year, it would be 46.6 million, and by 2010, it would be 70 million. That can't be possible, of course, as the population of the UK is only 60.6 million.

So we can be pretty sure the number of people file-sharing is within the range of 10 million to 60.6 million. If you have more accurate figures please let me know. For the sake of discussion, let's choose an arbitrary number - 20 million.
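
The projection above is plain compound growth, so it is easy to reproduce. Here is a small Python sketch, assuming the study's 9.2 million figure for 2005 and a constant 50% yearly increase; the function name and its parameters are only illustrative.

```python
def project(start_millions, annual_growth, first_year, last_year):
    """Return {year: projected millions} under constant compound growth."""
    figures = {}
    value = start_millions
    for year in range(first_year, last_year + 1):
        figures[year] = value
        value *= 1 + annual_growth
    return figures


UK_POPULATION_MILLIONS = 60.6  # population figure quoted in the post

for year, millions in project(9.2, 0.5, 2005, 2010).items():
    # Flag the years where the projection overtakes the whole population.
    note = " (more than the UK population)" if millions > UK_POPULATION_MILLIONS else ""
    print(f"{year}: {millions:.1f} million{note}")
```

Only the final year, 2010, crosses the population ceiling, which is why the estimate has to be capped somewhere below it.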

Filesharing is normal behaviour

Let's say that 20 million people in the UK have been or are involved in filesharing. With 20 million people, filesharing is not a crime, it is a mandate.

Labour achieved 9.5 million votes at the 2005 general election, for this it received 356 out of 646 possible seats, and became the government.

If 20 million people in the UK are filesharing then it cannot be considered a crime, or a bad act, it is the democratic will of the nation.

The majority of people have decided that previewing a song by downloading it is fair, the majority of people have decided that sharing a file with your friends is not the same as stealing a car.

Therefore, the government must stop trying to make 20 million British people into criminals, but instead should try to understand the cultural changes happening here and then frame the policy agenda accordingly.

The public are not interested in the police spending time on a futile mission to stop kids sharing music with each other in order to prop up dying foreign companies. Instead, spend the scarce resources on stopping organised drug crime, or on solving murders, or on confiscating illegal guns from urban street gangs.

If the government cannot see this, and wants to waste our money on misadventures, then we will throw you lot out and get a government that does represent our values. That is democracy.

The old music companies are not important to the economic future of Britain

Likewise, the music and film industries also need to wake up and smell the coffee, their potential customers like sharing music and film on the web with each other. It is too late, they need to just get over it.

Suing their own customers is not going to help them manage decline. It is not going to turn back the clock. The old companies that represent yesterday's music industry cannot burn down the Internet, however hard they try. The Internet has become far bigger and far more important economically and socially than the old music or film companies.

Google is one of the biggest web companies; alone it makes $10 billion in annual profit, which is double the profit of the whole music industry. To put it another way, all of these old music companies are worth, economically speaking, half a Google.

Things change, that is life, get on with it. The arrival of electric refrigeration killed off the ice storage companies. Tough luck, no one made a law protecting the old ice storage companies against people using freezers in their homes to make ice.

These old music and film companies of yesterday must be told to just live with it, get on with it, embrace it. Make services that appeal to these people, or cease to exist. Governments must not kill the Internet golden goose for some old dying companies.

The old ice storage companies going to the wall did not end cold drinks in the summer; in fact electric refrigeration led to a massive increase in the number of cold drinks available.

Likewise if the old music companies are too slow to adapt and go to the wall, it will not be the end of music, or the beginning of the end of music, it will be, as Churchill famously said, the end of the beginning.

Politicians, this is your final warning

If you are a politician, be aware, we are watching you like never before. In the past you might have taken party contributions from special interests and then given them special treatment.

Now we, the public, have our own communication channels and this time, we will punish you for it. You will represent us, the people, or we will remove you.

April 18, 2008 02:00 PM :: West Midlands, England  

April 17, 2008

Roderick B. Greening

Open Source Census

Here is another great initiative:

"The Open Source Census is the first collaborative, global project to count the number of installations for each open source software package. We realize that’s pretty ambitious, but we figure you have to think big. Of course, we know we can’t count every single installation of open source software in the world, but we believe it’s possible to obtain a sample large enough to be representative."

Wow!

Ok, so, what do you do? Head on over to osscensus.org and sign up. Once registered, you need to follow the steps to download a discovery tool and run it on your system.

If you are using Kubuntu, you already have Java and likely Ruby, so you can choose the smallest download (~5MB). Otherwise, you can download the normal one (~40MB).

Once downloaded, you need to extract the archive somewhere. Then from the command line, run the discovery tool, providing the correct options and your assigned census code.

Pretty simple.

April 17, 2008 10:14 PM :: NL, Canada  

How to Recover from System Crashes or Sluggishness

Most of you have come to realize that Linux is really stable, and rarely suffers from a complete system crash. However, on those rare occasions that it does, what should you do? What if you have a runaway process that is eating up all your CPU or memory, and the system has become non-responsive? Would you simply power-cycle or give it the three-fingered salute?

Well, if you are not a total Linux geek, you probably will resort to using the CTRL+ALT+DEL method, and failing that a hard power off. In the Windows world, this was acceptable, as these really were the only options you had available. However, for Linux, this is not the case.

How many times under Windows did you have a crash and then have to hard reset, only to find your file system was corrupted and often unbootable? The same can happen under Linux (though it is generally less likely) if you simply power off.

So, you ask, what should I do? Well, since you asked, here are some steps you can try to safely and cleanly recover from a non-responsive system/desktop under Kubuntu (some of the suggestions may be specific to KDE/Kubuntu - ymmv).

1) CTRL+ALT+ESC - Kill Window

If your system is sluggish due to a hanging application, you can hit the CTRL+ALT+ESC key sequence and the next window you left-click will be killed. Be warned, you can click the background/desktop and kill it using this method as it is treated just like any other application/window.

2) CTRL+ESC - ksysguard

If you cannot kill the offending application using method 1) above (e.g. there is no GUI, or killing the app you thought was the problem did not resolve the issue), then you can bring up the process table (similar to the Windows task manager) and look for the application sucking up all the memory or CPU time. Select it from the list, and hit Kill to terminate it.

If you are familiar with the Linux command line, you can achieve the same by running Konsole and using the command 'ps -A' or 'top' to examine the same process list. To kill the offending application, you need to issue the 'kill' command followed by the process ID (or PID) (e.g. kill 9999).

Instead of using kill, you may need to use the killall command, which can be passed an application name, like 'killall konqueror', which will kill all instances of konqueror (this is not the same as killing one instance of konqueror via a single PID).
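
What kill does under the hood is send a signal to a process ID. Here is a minimal Python sketch of that, assuming a Unix-like system with the sleep command available; the child process is just a stand-in for a runaway application.

```python
import os
import signal
import subprocess

# Start a harmless long-running child to stand in for the hung application.
child = subprocess.Popen(["sleep", "60"])
print("started PID", child.pid)

# Equivalent of `kill <PID>`: send SIGTERM, which asks the process to exit.
os.kill(child.pid, signal.SIGTERM)

# Reap the child; a negative return code means it was ended by that signal.
return_code = child.wait()
print("exit status:", return_code)
```

If a process ignores SIGTERM, signal.SIGKILL is the counterpart of 'kill -9' and cannot be caught or ignored by the target.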

3) CTRL+ALT+F1 - Switch to first text console

Your system has six virtual terminals predefined, which can be accessed via CTRL+ALT+F1 through CTRL+ALT+F6 respectively. If the desktop is frozen/hung, and the first two options cannot be performed, then you can use this method to bring up a text-based console. Log in with your usual name and password, and using step 2 above, you should be able to find the offending application (assuming there is one that stands out - i.e. has all your memory tied up or is using 99% CPU).

4) CTRL+ALT+BACKSPACE - Restart X server (Desktop/GUI)

If that doesn’t work, you might want to restart your desktop using the CTRL+ALT+BACKSPACE combo. Beware that this will kill all the desktop apps currently running, and you may lose any changes to files not recently saved or auto-backed up. This should kick you back to the login manager. If it does not, the X server may have failed to re-initialize; try the next option.

5) CTRL+ALT+DEL - Reboot System

You can attempt to use CTRL+ALT+DEL from the Desktop/GUI or one of the Virtual terminals (CTRL+ALT+F1). If you do it via the Desktop, you may be given the Shutdown Dialog with options to Reboot or Shutdown or the system may just silently reboot. Sometimes this will not work, and you must invoke the CTRL+ALT+DEL via one of the Virtual terminals, which will perform a full reboot.

6) ALT+SysRq - Magic SysRq (System Request) Key

If none of the above work, you can try this last option before resorting to a hard power-cycle. This method has sometimes been called "Skinny Elephants", "Raise the Elephant" or "Raising Skinny Elephants". Not sure where the phrase originated, but here's what it refers to: "Raising Skinny Elephants Is Utterly Boring"

Take the first letter from each word in the phrase, and you have the key sequence you need to hit to safely sync the disks, terminate running processes, unmount file systems and finally reboot.

r - take the keyboard out of raw mode (takes control back from X)
s - sync the disks
e - send SIGTERM to all processes except init, so they can exit cleanly
i - send SIGKILL to all processes except init
u - remount all filesystems read-only
b - reboot the system immediately

Now you see why someone came up with the silly phrase you will now never forget (just like elephants never forget).

Here is the full key sequence (remember to use the left ALT key and the SysRq key is the PrtSc key if not labeled on your keyboard):

Alt+SysRq+r
Alt+SysRq+s
Alt+SysRq+e
Alt+SysRq+i
Alt+SysRq+u
Alt+SysRq+b

Please wait 2 or 3 seconds between hitting each key sequence to allow each step to complete, especially if you have a lot of running services/processes.
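
If the mnemonic ever slips your mind, the key sequence can be rebuilt from the phrase itself. This tiny Python sketch only prints the combinations; it does not trigger anything on the system.

```python
PHRASE = "Raising Skinny Elephants Is Utterly Boring"

# The first letter of each word, lower-cased, gives the SysRq keys in order.
keys = [word[0].lower() for word in PHRASE.split()]

for key in keys:
    print(f"Alt+SysRq+{key}")
```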

When your system boots, you may be prompted for a file system check. If you are, please ensure you let the system check and repair if necessary.

7) Power-cycle - Power off or Reset

You should never do this if you can avoid it. The system will hate you for it, and it will eventually lead to some sort of file corruption or lost data. Use it only as an absolute last resort (i.e. the keyboard is not responding to any of the key sequences above).

----

I hope this is useful to someone :)

April 17, 2008 08:15 PM :: NL, Canada  

Nick Cunningham

Living in CSV Hell

As part of my new job I'm left dealing with a whole series of databases, and one of the regular jobs is to update them from various sources. Normally that involves extracting data as a CSV file from the main database, which then needs to be edited and checked in Excel, because all the table names need changing so that when you import it into the database system used within the office it matches the fields used there. Also, the conversion to CSV seems to introduce random characters in fields, so as part of the process you also have to check all the data and make sure it doesn't contain any of these random characters before finally importing it and hoping everything works.

Now, in case this didn't sound painful enough: while most of the time there are only a few new records, I'm about to get to the one time of year where we receive in excess of 2500 new records. And while in previous years it may have been accepted that you need to manually check them, I *really* don't want to have to waste a day (and my sanity!) checking all this data.

So we get to my problem(s): not only am I stuck using Windows (bad!), but I've also got a crappy CRT monitor (my eyes!), and finally I have to spend half my time working with a custom frontend to an Oracle database which is written in Java, helpfully launches from inside IE, and is so obtuse and badly designed I feel sick every time I use it!

However, getting away from my severe Java allergies :p I'm hitting a roadblock in terms of automating at least some of this. Originally I was going to try out my Python skills, till I realised I really don't know anything, so I resorted to using PHP. In both cases though, despite some googling, there seem to be very few, if any, decent tutorials/guides to working with CSV files. So my plea is for any guides/tutorials or any decent documentation for working with CSV files in PHP. I'm open to pretty much anything, as the best I've found is for pretty much just opening the file and reading it.

Now all I have to do is dust off my PHP skills and hope I can throw something together that works!
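
For anyone facing the same chore, here is a rough sketch (in Python rather than PHP) of the two steps described above: renaming the column headers to match the target database, and stripping stray characters from the fields. The column names, the FIELD_MAP mapping and the rule for what counts as a "random character" (anything outside printable ASCII) are all assumptions made for illustration; a real export will differ.

```python
import csv
import io
import string

# Hypothetical mapping from exported column names to the office database's names.
FIELD_MAP = {"Surname": "last_name", "Forename": "first_name", "Email Addr": "email"}

# Assumption: only printable ASCII is wanted; anything else is a stray character.
ALLOWED = set(string.printable) - set("\x0b\x0c")


def clean(value):
    """Drop characters outside ALLOWED and trim surrounding whitespace."""
    return "".join(ch for ch in value if ch in ALLOWED).strip()


def convert(source, target):
    """Copy CSV rows from source to target, renaming columns and cleaning fields.

    Assumes every row has the same number of fields as the header.
    """
    reader = csv.DictReader(source)
    new_names = [FIELD_MAP.get(name, name) for name in reader.fieldnames]
    writer = csv.DictWriter(target, fieldnames=new_names)
    writer.writeheader()
    count = 0
    for row in reader:
        writer.writerow({FIELD_MAP.get(k, k): clean(v) for k, v in row.items()})
        count += 1
    return count


# Small in-memory demonstration; real use would pass open file objects instead.
sample = io.StringIO("Surname,Forename,Email Addr\nSm\ufffdith,Jane,jane@example.org\x1a\n")
result = io.StringIO()
rows = convert(sample, result)
print(rows, "row(s) converted")
print(result.getvalue())
```

With real files, open both with newline="" as the csv module documentation recommends, so that line endings inside quoted fields survive the round trip.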

April 17, 2008 12:36 PM :: Portsmouth, England  

April 16, 2008

Zeth

Linus Torvalds on ...

Linus Torvalds writes the Linux kernel. He also likes a good mailing list flamewar, not least because he has a very sarcastic wit. Here he is, writing about various topics.

On fair use:

When you start thinking that you have absolute control over the content or programs you produce, and that the rest of the worlds opinions doesn't matter, you're just _wrong_.

Me, personally, I think the RIAA and the MPAA is a shithouse. They are immoral.

On virtualization:

I think what you're seeing is virtualization proponents being absolutely _desperate_ for any reason to use virtualization.

On userspace binary drivers:

No user-space ass-hattery here.

On turning off interrupt requests:

You cannot have a generic kernel driver that doesn't know about the low-level hardware (not with current hardware - you could make the "shut the f*ck up" a generic thing if you designed hardware properly, but that simply does not exist in general right now).

On those arguing for userspace interrupt request handlers:

You may be a bit simple. But I think it's more polite to call you "special". Or maybe just not very used to how hardware works.

On C++ :

In fact, in Linux we did try C++ once already, back in 1992. It sucks. Trust me...

C++ is a horrible language. It's made more horrible by the fact that a lot of substandard programmers use it, to the point where it's much much easier to generate total and utter crap with it. Quite frankly, even if the choice of C were to do *nothing* but keep the C++ programmers out, that in itself would be a huge reason to use C.

So I'm sorry, but for something like git, where efficiency was a primary objective, the "advantages" of C++ is just a huge mistake. The fact that we also piss off people who cannot see that is just a big additional advantage.

On Linux Kernel version 2.6.19:

It's one of those rare "perfect" kernels. So if it doesn't happen to compile with your config, you can rest easy knowing that it's all your own d*mn fault, and you should just fix your evil ways.

On Intel's inventions:

The fact that ACPI was designed by a group of monkeys high on LSD, and is some of the worst designs in the industry obviously makes running it at _any_ point pretty damn ugly. And the fact that MB vendors don't test it with anything else than Windows (and sometimes you wonder whether they do even that) doesn't help.

EFI is this other Intel brain-damage (the first one being ACPI). It's totally different from a normal BIOS, and was brought on by ia64, which never had a BIOS, of course. Sadly, Apple bought into the whole "BIOS bad, EFI good" hype, so we now have x86 machines with EFI as the native boot protocol.

On Apple OS X:

OS X in some ways is actually worse than Windows to program for. Their file system is complete and utter crap, which is scary.

April 16, 2008 08:20 PM :: West Midlands, England  

Brian S. Stephan

Neglect

I always seem to update my blog in bursts. So, while I suggest that no one get hasty and expect large posts the next couple days, I thought I would mention a little something to keep me from feeling totally neglectful.

I bought an HP 2133 ("Mini-note") to play with, the middle configuration with SuSE, to be exact. So that should be exciting. Won’t be here for another couple weeks, though. :( The plan is to use it as essentially a quick companion for work and such, with enough of a desktop environment to get work done and/or SSH into my various sites of interest. The MacBook is still around, too, for any hypothetical bigger or more involved tasks. Not that the HP 2133 is a toy, but a lot of it is about playing around and being a bit less tied to my desktop.

Looking forward to when it gets here, since it would seem that it will also become a minor task to get everything supported in Gentoo… looks like I get to contribute to another page on the Gentoo wiki.

April 16, 2008 01:28 AM :: Wisconsin, USA  

April 15, 2008

Nirbheek Chauhan

portage/pkgcore rhyme

Please don't kill me for the bad rhyming :)

I sit down to emerge -av
It churns through the dependencies
I get frustrated in 2 minutes
pmerge -a is what was amiss

I alias `emerge` to spit out a warning
that a crappy PM is what I'm invoking
Now I'll remember to always use;
pmerge so my time's not abused!


Comments on the crappiness of these lines are welcome.
More such rhymes revolving around Gentoo are even more welcome ;)

April 15, 2008 04:58 AM :: Uttar Pradesh, India  

April 14, 2008

Zeth

Sharing our scripts together

Diverse technical interests

My technical focus has shifted quite a lot over the years.

I was really into computers as a small child, but took my teenage years off it to play guitar, bike outside and meet girls. Then I started to get more heavily into computers (again) in 1998 when I went to University and had a broadband connection.

I started by copy-editing for the web and writing bucket loads of HTML. This led to ASP and then a bit of PHP. I then, around 2002, moved to open source software and got deeper and deeper into Linux/Unix system administration, as well as, of course, making websites and setting up CMSes and so on.

Over the years I had picked up the very basics in many different programming languages, but I decided in 2006 that if I was going to develop my programming (while working and studying part time), then I should concentrate on one language for the moment.

I wanted a language that I could use to make interactive websites, as well as use for system administration, but it also needed to be good for programming for the Linux desktop.

So in the Summer of 2006, I sat in a relative's garden with a beatup ancient laptop and some printed out Python documentation, and never looked back.

These days my favourite computing activity is programming, especially with Python, as regular readers might have noticed. I am churning out Python at a good rate, some of it for paid work, some of it for my own private learning, and a tiny bit ends up in my public Python script directory.

In the coming months the amount of programming will rise still further, as I am likely to take on some new roles from May; more on that will be revealed later when/if that happens.

Let's all share our scripts together

So I was asked:

zeth have you got a bzr repo of your python scripts? so people using your little apps could give something back?

An interesting idea. The whole idea of free software/open source is that people can modify the code and share the changes. It might be that no one ever uses it, if so I have lost nothing, it is worth a try.

I only wanted to include the ones I thought were both useful and relatively self-contained. The focus is more on developing useful scripts that can be used by others than on old scripts that only fulfilled a need of mine. So I had a bit of a clear out, and uploaded nine fairly self-contained Python modules to Launchpad. The idea is that I will publish new ones there in the months ahead.

Here is the site. I was going to call it "Zeth's scripts" but since the idea is that it will be a collaborative affair, i.e. you can get involved, it is called Eden.

It is a garden because it is small now but might develop and change over time. Some scripts will grow, some will die.

For example, in the last post here, I made a module which took updates from Twitter and put them in pop-up notifications in the GNOME-desktop. Say someone takes the program and changes it so that Facebook updates are published instead, or makes it so it makes pop-ups for KDE or the Mac instead of for GNOME.

That person could then push their version ('branch' in bzr terminology) back up to the site. They do not need permission from me or anyone else. Launchpad can contain as many branches as you like.

Even more importantly, other people can also submit their own scripts and we can discuss them and modify them. You don't even have to use Python (shock, horror).

Making your own branch

So, to get your own branch of the programs:

bzr branch lp:eden

Now let's say you have edited one of the scripts, or you have put new scripts into the directory. You can then share them back to Launchpad like this:

bzr push sftp://zeth0@bazaar.launchpad.net/~zeth0/eden/devel

Replace zeth0 both times with whatever your Launchpad username is.

If you don't have a Launchpad username click here and enter an email address.

As that bloke with the beard says, "Happy Hacking".

April 14, 2008 11:01 PM :: West Midlands, England  

April 13, 2008

Alex Bogak

My new computer

Hello folks!

It's been a bit more than 2 years since I got Serenity - an AMD Athlon machine. And it's time to grow further. So I've done my research and planned a machine that will serve my modest needs for another 2-3 years.

As prices have plummeted seriously lately, and since the dollar is not what it used to be, I can get a pretty decent machine for the money. I don't play games; my main needs are running VMs (even a couple of machines at the same time) and Photoshop/GIMP rendering (for pics like these). Here's the configuration I'm thinking of:

CPU:  Intel  Q9300
Motherboard: I'd like to have an Abit IP35 Pro, but its availability seems limited in my locale. So I'd be happy for other suggestions.
Memory: Mushkin or Corsair, 4GB, CL4
Graphics: any NVidia 256MB PCIe will do.
HD: 7200rpm, 250GB or any other that gives good price/size ratio.
DVD burner: we have LGs and NECs lying around here for ~$30, so it's easy.
PSU: Zalman, Thermaltake or Antec. These are the decent ones we have in local market.
Case: something simple, but that can sustain my system

Please let me know what you think about it, and I'd love an MB suggestion that plays nicely with Linux. My main intention is to run Xen or another VM solution, and run Linux and Windows under it.

Another thing that some folks may not understand is my intention to run it with Ubuntu. To tell you the truth, I'm still a Gentoo person, but it takes more and more time just to maintain my Gentoo-based Serenity, and that's only the updates. My rsync doesn't work, updates are slow, and I get many errors while updating, which require attention as they render the system unusable.

I understand that those may be very easy to fix, but as I've said - I don't have time to deal with it as I had before, so I'm going to try my luck with a customized Ubuntu for a while. Besides, I like learning new things, with Gentoo I feel like I don't know what's up there. And I've always wanted to learn Ubuntu.

Gentoo's star seems to have been eclipsed lately; I think I might get to fixing it when I have time later on...

Cheers.

April 13, 2008 11:30 AM :: Israel  

April 12, 2008

Michael Klier

Creating Backups With Dar

Everybody knows that backups are essential and good to have. On Linux there's a plethora of possibilities to create them, ranging from simple scripts that use tar or rsync to full-blown and feature-rich backup tools like bacula. One not very well known tool which IMHO does the job quite well for normal desktop machines is dar. One of the features that makes dar special is its ability to split big data sets into so-called “slices” of a defined size. This makes it very easy to split big backups into smaller chunks, which can then be burned to DVDs, for example. Another nice thing about dar is that it also supports differential backups.

The Basics

Like tar, dar comes with a large number of command line options. The basic usage, though, is very simple. The following example will create a backup named “backup-home” of the directory /home/user/foo in the current directory and start a new slice every 600M.

You can perform a dry-run of the operation by adding the -e option!
% dar -s 600M -c backup-home -R /home/user/foo
-- output stripped --
% ls
backup-home.1.dar backup-home.2.dar
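
For a rough idea of how many slices (and DVDs) to expect, you can divide the size of the data by the slice size and round up. The sketch below is plain Python, not something dar provides: slice_count is a made-up helper, it treats "600M" as 600 × 1024 × 1024 bytes (an assumption about how dar reads the suffix), and it ignores compression and dar's own bookkeeping overhead.

```python
MIB = 1024 * 1024  # assumption: dar's "M" suffix is read as 2**20 bytes


def slice_count(total_bytes, slice_bytes):
    """Estimate the number of slices by ceiling division.

    Even an empty archive produces one slice, hence the max().
    """
    if slice_bytes <= 0:
        raise ValueError("slice size must be positive")
    return max(1, -(-total_bytes // slice_bytes))


# Hypothetical sizes: 1000 MiB of data fits in two 600 MiB slices
print(slice_count(1000 * MIB, 600 * MIB))
```

With compression enabled the real number will usually be lower, so treat this as an upper estimate.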

Excluding Files/Directories

You often don't want to back up all files in the backup directory. dar offers the possibility to exclude either certain files (-X) or whole subdirectories (-P) relative to the backup directory. Both options support glob-like wildcards (* and ?) and can be specified multiple times. The following example will exclude all files that contain “thumb” in their filename and the directories /home/user/foo/media/audio and /home/user/foo/public_html.

% dar -s 600M -c backup-home -R /home/user/foo -X "*thumb*" -P "media/audio" -P "public_html"

Compression

dar can compress the slices and supports the gzip (-z) and bzip2 (-y) compression algorithms. You can specify a compression level from 1 to 9; the default is 9. For some file types, though, compression is not desirable, for example already compressed files like .zip, .tar.gz or .png. You can explicitly exclude such files from compression (-Z), and you can also define a minimum file size (-m) below which files are not compressed. Like -X and -P, -Z can be specified multiple times.

% dar -s 600M -m 256 -y -c backup-home -R /home/user/foo -X "*thumb*" -P "media/audio" -P "public_html" -Z "*.zip" -Z "*.gz" -Z "*.bz2" -Z "*.png"
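
If you end up calling dar from a script, it can help to assemble the command from lists rather than one long string. The following is only a sketch of that idea in Python: dar_create_args and its parameter names are made up for this example, and it uses nothing but the options already shown above (-s, -m, -y, -c, -R, -X, -P, -Z).

```python
def dar_create_args(name, root, slice_size=None, bzip2=False, min_compress=None,
                    exclude_files=(), exclude_dirs=(), no_compress=()):
    """Return the argument list for a dar archive-creation run.

    Hypothetical helper: it only arranges options covered in the text.
    """
    args = ["dar"]
    if slice_size:
        args += ["-s", slice_size]
    if min_compress is not None:
        args += ["-m", str(min_compress)]
    if bzip2:
        args.append("-y")
    args += ["-c", name, "-R", root]
    for mask in exclude_files:      # file masks to leave out of the backup
        args += ["-X", mask]
    for path in exclude_dirs:       # subdirectories relative to the root
        args += ["-P", path]
    for mask in no_compress:        # file masks stored without compression
        args += ["-Z", mask]
    return args


args = dar_create_args(
    "backup-home", "/home/user/foo",
    slice_size="600M", bzip2=True, min_compress=256,
    exclude_files=["*thumb*"],
    exclude_dirs=["media/audio", "public_html"],
    no_compress=["*.zip", "*.gz", "*.bz2", "*.png"],
)
# For display only; the masks are not quoted here.
print(" ".join(args))
```

Passing the list itself to subprocess.run(args) avoids a shell, so the wildcards reach dar unexpanded without any extra quoting.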

Batch Operations

As you can see, the command has already become quite long, and we're excluding only two directories and a couple of files. To make it easier to perform complex backups, dar supports make-like configuration files. In these files you can specify your desired options for all main operations of dar, like archive creation, extraction, listing and so on. The corresponding configuration file for our example would look like the following (I only cover the archive creation here).

/home/user/backupcfg

## dar config for backup-home

# affects all commands
all:
  # be verbose
  -v

# affects only -c (create archive)
create:
  # don't compress the following files
  -Z "*.gz" -Z "*.bz2" -Z "*.zip" -Z "*.png"
  # only compress files bigger than 256 bytes
  -m 256
  # create slices of 600M size
  -s 600M
  # use bzip2 compression algorithm
  -y
  # directory to backup
  -R /home/user/foo
  # exclude directories
  -P media/audio
  -P public_html
  # exclude files
  -X "*thumb*"
  # run some commands once a slice was created
  -E "echo created slice %b.%n"

I added another option to the configuration file that we haven't covered yet: -E. This option runs a command each time a slice has been created. It could be used, for example, to automatically burn the slice to a DVD or to encrypt the backup.
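
To make the placeholders in that -E line a bit more concrete: %b stands for the basename of the archive and %n for the number of the slice that was just written. The little Python function below merely imitates that substitution for illustration. It is not how dar implements it, expand_slice_macros is a made-up name, and it only handles the two placeholders used above.

```python
def expand_slice_macros(command, basename, number):
    """Replace %b with the archive basename and %n with the slice number."""
    return command.replace("%b", basename).replace("%n", str(number))


# What the -E command from the config file would look like for two slices
for n in (1, 2):
    print(expand_slice_macros("echo created slice %b.%n", "backup-home", n))
```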

To create the backup from the configuration file in the current directory simply issue:

% dar -c backup-home -B /home/user/backupcfg

Differential Backups

As I wrote earlier, dar also supports differential backups. You simply pass the full backup as the reference backup (-A) to your dar backup command, and dar will then check the differences for you and create the differential backup. To make that work you have to be in the directory where the full backup resides.

The -A option only takes the basename of the backup! Note the absence of “.1.dar” in the command!
% dar -c backup-home-diff -B /home/user/backupcfg -A backup-home
-- output stripped --
% ls
backup-home.1.dar backup-home.2.dar backup-home-diff.1.dar

For bigger backups, analyzing the archive will take quite some time. Also, the archive might not be on the same hard disk or even on the same machine. To make it possible to generate differential backups anyway, you can isolate a so-called “catalogue” from your full backup and pass that to the -A option instead. The catalogue is isolated using the -C option.

% dar -A backup-home -C backup-home-catalogue
% ls
backup-home.1.dar backup-home.2.dar backup-home-catalogue.1.dar

Then to create the differential backup simply use:

% dar -c backup-home-diff -B /home/user/backupcfg -A backup-home-catalogue
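
Since -A wants the basename and not a slice file, a wrapper script can guard against accidentally passing something like backup-home.1.dar. Here is a minimal Python sketch of that; archive_basename and differential_args are names invented for this example, and the pattern assumes slices are always named basename.number.dar as in the listings above.

```python
import re

# Matches a trailing ".<slice number>.dar", e.g. ".1.dar"
_SLICE_SUFFIX = re.compile(r"\.\d+\.dar$")


def archive_basename(path):
    """Strip the slice suffix if there is one; otherwise return path as is."""
    return _SLICE_SUFFIX.sub("", path)


def differential_args(name, config, reference):
    """Build the argument list for a differential backup against reference."""
    return ["dar", "-c", name, "-B", config, "-A", archive_basename(reference)]


print(differential_args("backup-home-diff", "/home/user/backupcfg",
                        "backup-home-catalogue.1.dar"))
```

The same helper works whether the reference is the full backup or an isolated catalogue.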

Restoring Backups

Restoring a backup is pretty straightforward. However, you should be root; otherwise dar will warn you that file ownerships will not be restored. To extract a full backup and one or more differential backups into a given directory, use the following (note that we only use the basename of the backup files again):

% cd /home/chi/restore
% dar -x /home/chi/backups/backup-home
-- output stripped --
% dar -x /home/chi/backups/backup-home-diff
-- output stripped --

The extraction process is quite verbose and requires user interaction. For information about how this can be changed please refer to the dar manual.

That's basically it :-). dar of course offers many more options, for example to perform archive integrity checks or list archive contents, which weren't covered in this article. The next step from here would be to combine everything into a nice script, automate the whole backup process with cron jobs (or custom udev rules for external hard disks) and maybe secure the backups using encryption.

Finally: What do you use to create backups?


April 12, 2008 03:24 PM :: Germany  

April 10, 2008

Zeth

Twitter and GNOME integration

This is part two of our look at using Python with the API of the Twitter social networking website. (Part one here)

I am assuming you are using a GNOME-based Unix/Linux system. If you use something else then sorry, you might want to read one of my other posts.

In Twitter, updates of what your friends are doing appear on the Twitter website; however, it is more fun to make these updates come to you.

So I wrote a little program in Python that makes your Twitter updates pop up on the desktop.

This uses a feature of the GNOME desktop called 'notify', as explained by Andy W, and then followed up by me two posts ago.

Without further ado, let's get started.

1. Get python-notify

There are three very small but important dependencies. Firstly, you need the Python bindings for libnotify. These are available from your Linux distribution (you might even have them installed already).

You will need setuptools as well. I am sure you already have that installed, but it cannot hurt to double-check.

The two computers I have here run Ubuntu and Gentoo:

On Ubuntu or Debian:

sudo apt-get install python-notify python-setuptools

On Gentoo:

sudo emerge notify-python setuptools

On other Linux/Unix systems, alter the commands slightly to match your package management system.

2. Get json and Twitter bindings

Now we need two more: simplejson and python-twitter.

From any Linux distribution, run:

sudo easy_install simplejson

sudo easy_install python-twitter

3. Twitter account

Next you will need to sign up for a Twitter account and add at least one friend (e.g. zeth0). In the spooky terminology of Twitter, to add a friend you "follow them".

4. Get my script

Next you need to download a copy of my script and make the file extension .py. Get it from here, or type:

wget http://zeth.me.uk/python/twitnotify.txt -O twitnotify.py

chmod +x twitnotify.py

Now you need to edit the top of the script so that your username and password are correct. For historical reasons, I'm an Emacs man, but if you don't have a favourite editor, then gedit is pretty easy to use:

gedit twitnotify.py

5. Engage!

Now we are finally ready to run the script:

./twitnotify.py

If you have a lot of friends then there will be lots of updates the first time. The updates and pictures are all cached, so you don't have to download those again.

Six degrees of separation

After I made that script, I tried to be more adventurous and wrote a new script that found other people that shared your interests.

It would spider recursively through your friends, followers and their friends and followers, and their friends and followers, and their friends and followers (and so on), and return a set of updates based on whatever keywords you specified.

It worked pretty well and the results were really fun. However, what I didn't know was that Twitter has quite sharp rate limits, so when this script stopped working I was combing through the code to find out what had gone wrong. It turned out that my code was fine but that my username had just been suspended from Twitter.
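
To show the kind of walk I mean, and how a request budget could have kept it in check, here is a small sketch. It is not my original script: it walks breadth-first with a queue instead of recursing, it leaves out the keyword filtering, the network dictionary is a made-up stand-in for the Twitter API, and counting one request per user looked up is an assumption.

```python
from collections import deque


def spider(start, get_contacts, max_requests):
    """Walk friends/followers breadth-first, stopping after max_requests lookups.

    get_contacts(user) stands in for an API call returning that user's
    friends and followers; each call is counted as one request.
    """
    seen = {start}
    queue = deque([start])
    requests = 0
    while queue and requests < max_requests:
        user = queue.popleft()
        requests += 1
        for contact in get_contacts(user):
            if contact not in seen:
                seen.add(contact)
                queue.append(contact)
    return seen, requests


# Hypothetical in-memory network instead of real Twitter accounts
network = {
    "you": ["ann", "bob"],
    "ann": ["you", "cat", "dan"],
    "bob": ["eve"],
    "cat": ["fay"],
    "dan": [], "eve": [], "fay": [],
}

found, used = spider("you", lambda user: network.get(user, []), max_requests=3)
print(sorted(found), used)
```

With a budget of 3 the walk stops before it ever looks up "cat", so "fay" is never discovered; without a budget the number of requests grows with every new person found.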

Twitter rate usage

Your username is allowed 70 requests per sixty-minute period, starting from the first request.

Using my original twitnotify.py script posed no problem, since the script usually uses three requests each time (one for normal updates, one for replies and one for personal "direct" messages).

So if you update every 10 minutes, then that will be about 18 requests an hour, easily under your 70 maximum.
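
The same arithmetic as a few lines of Python, in case you want to try other intervals. The function names are made up for this sketch, and it only counts the three regular requests per update, not the occasional extra request for a new photo.

```python
import math

HOURLY_LIMIT = 70        # requests per sixty-minute period
REQUESTS_PER_UPDATE = 3  # updates, replies and direct messages


def requests_per_hour(interval_minutes, per_update=REQUESTS_PER_UPDATE):
    """Requests used in an hour when updating every interval_minutes."""
    return (60 / interval_minutes) * per_update


def shortest_safe_interval(limit=HOURLY_LIMIT, per_update=REQUESTS_PER_UPDATE):
    """Smallest whole number of minutes between updates that fits the limit."""
    max_updates = limit // per_update  # complete updates that fit in the limit
    return math.ceil(60 / max_updates)


print(requests_per_hour(10))
print(shortest_safe_interval())
```

Updating every 10 minutes gives the 18 requests mentioned above, and by this estimate you could go down to about every 3 minutes before hitting the limit.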

Getting the photos of your friends is included in the above, but the first time you get a private message from a new person who is not a friend, it will take one request to download the photo.

My second script, the one to find new friends according to your interests, easily ate up 70 requests in one pass. So I had to give that up.

Fortunately, it seems that you are automatically un-suspended from Twitter after a couple of hours, and having your username temporarily suspended from the API does not affect logging into the Twitter website.


April 10, 2008 09:00 PM :: West Midlands, England  

Get Added

If you are a Gentoo user and have a blog, then you can be added. :) It doesn't matter how frequently you post, or what topics you cover.

We'll even take feeds in different languages, and set up feeds for each one.

Send your submissions to beandog at gentoo dot org or djay-il at gentoo-userreps dot org in this format:

[http://my.blog.com/rss-feed.xml]
name = First Last Name
email = foo@bar.com
province* = Utah
country* = USA

* Optional

Your email will be kept private; it's only so we can contact you if necessary.

Please send in a hackergotchi or an avatar for your feed as well.

If you don't know the URL for your subscription feed, just send us the WWW address of your blog and we'll figure it out for you.

About

Planet Larry is an aggregation of blogs from Gentoo users worldwide.

The Planet feed is updated every 30 minutes.

This project is not officially affiliated with Gentoo. We're just a bunch of weirdo users with too much free time.