Electricmonk

Ferry Boender

Programmer, DevOpper, Open Source enthusiast.

Netfilter (iptables) performance tests

Friday, July 29th, 2011

Here’s a nice study on the performance of Linux’s network firewalling/packet mangling layer:

Netfilter Performance Testing

Conclusions (mine, based on the study):

  • Netfilter/iptables is up to par with other filter solutions when it comes to plain routing.
  • Netfilter/iptables is not up to par with other filter solutions when it comes to connection tracking (basically just getting all the traffic through netfilter and keeping track of it) and filtering.
  • When a chain has many rules, netfilter/iptables filtering performance drops significantly, and the performance of chain modifications (adding rules) also degrades significantly. The drop starts at around 256 rules per chain, so don’t use more than that.

The problems seem to stem from the way Netfilter stores and processes the rules:

It is well known that netfilter/iptables does not scale well if one wants to use large number of rules in a single chain. The reason of the problem lies in the fact that the rules are processed in netfilter/iptables one after another, linearly.

Minecraft Server optimization

Friday, July 22nd, 2011

The Minecraft server was running very slowly and gobbling up a significant amount of memory. Game-play was laggy, chunk loading was stuttery, and I saw a LOT of the following message in the server.log:

[WARNING] Can't keep up! Did the system time change, or is the server overloaded?

I decided to unleash my google-fu, dug through the Java manuals and found some optimizations that:

  • Significantly reduced the CPU load.
  • Reduced the memory usage of Minecraft from about 80% to 20–30% (of 1 GB). That means you can run a Minecraft server in as little as 200 MB.
  • Got rid of the “Can’t keep up!” messages.
  • Reduced lag on the server to almost nothing.
  • Reduced chunk loading stuttering.

It does seem there has been a slight increase in chunks not loading properly, but that might be my imagination.

So what did I do? It’s really simple:

  • Run the Minecraft server and world from a RAM disk. This will greatly enhance performance of loading chunks and at the same time reduce CPU load. Memory usage in total will go up (because the entire world is now in RAM), but since the new Minecraft McRegion storage format (introduced in Minecraft v1.3 Beta) uses a lot less disk space, it’s no big deal.
  • Provide the -Xincgc option to Java. This enables the incremental garbage collector, which basically means that Java won’t pause for a couple of seconds to clean up old unused stuff (unloaded chunks). This reduces lag and choppiness in the loading of chunks/movement of mobs and destruction/placement of blocks.

I’ll discuss here how to set up a RAM disk and how to make persistent copies of your live Minecraft map (or you’d lose everything on reboot).

Setting up a RAM disk

This section assumes that you’re running GNU/Linux and that you’ve got Minecraft in the directory /home/minecraft/minecraft/minecraft_server/ (the directory where minecraft_server.jar lives). Commands are prefixed by either a ‘$‘ or a ‘#‘ prompt, meaning you should run the command as the minecraft user (or whichever user has read/write access to the Minecraft server) or as the root user, respectively.

Let’s get started! First, check how much space you will need for your minecraft world:

$ du -hs /home/minecraft/minecraft/minecraft_server
50M	/home/minecraft/minecraft/minecraft_server

The minecraft directory is currently using 50 MB. It’s already a fairly large world, so we’ll give it double that: 100 MB.

Now, move the minecraft_server directory to a different name, because we need to create an empty RAM disk in its place:

$ mv /home/minecraft/minecraft/minecraft_server /home/minecraft/minecraft/minecraft_server.persistent
$ mkdir /home/minecraft/minecraft/minecraft_server

Next, add an entry for the RAM disk to /etc/fstab. This will make sure it is automatically mounted again when your system restarts. Note that this needs to be done from a root shell: ‘sudo echo … >> /etc/fstab’ does not work, because the output redirection is performed by your own, unprivileged shell:

# echo "tmpfs    /home/minecraft/minecraft/minecraft_server     tmpfs    rw,size=100M    0    0" >> /etc/fstab

Mount it, and give the minecraft user ownership of the (initially root-owned) mount point so it can write to it:

# mount /home/minecraft/minecraft/minecraft_server
# chown minecraft: /home/minecraft/minecraft/minecraft_server

Copy the contents of the backed-up minecraft_server.persistent directory over to the RAM disk:

$ cp -ar /home/minecraft/minecraft/minecraft_server.persistent/* /home/minecraft/minecraft/minecraft_server/

You can now start Minecraft in a detached tmux session named “minecraft” (the session name that the persistent-copy script below targets):

$ cd /home/minecraft/minecraft/minecraft_server
$ tmux new -d -s minecraft -n minecraft "java -Xincgc -Xmx1G -jar minecraft_server.jar nogui"

Making persistent copies

It is imperative that you regularly create a persistent copy of the RAM disk! If the power on your server ever fails (or if you reboot it manually), your world is LOST! If you’re running the Minecraft server in a tmux session named ‘minecraft’ (as shown above), you can create a shell script and call it from a cron job, say, every hour. You can also get my Minecraft Server run script, which can also do backups and start/stop the Minecraft server without having to attach to the console. But for those who just want the persistent-copy script, here you go:

#!/bin/sh

PATH_MC="/home/minecraft/minecraft/minecraft_server"

# Count how many saves the server has completed so far, so we can
# detect the new one we are about to trigger.
SAVE_COMPLETE=`grep -c "Saved the world" "$PATH_MC/server.log"`

# Temporarily turn off MC saving so we don't get a corrupt backup
tmux send -t "minecraft" "save-off" C-m
tmux send -t "minecraft" "save-all" C-m
# Wait until the MC server log indicates the save is complete
while true; do
    sleep 0.2
    TMP=`grep -c "Saved the world" "$PATH_MC/server.log"`
    if [ "$TMP" -gt "$SAVE_COMPLETE" ]; then
        break
    fi
done
# Create persistent copy from RAM to disk
mv "$PATH_MC.persistent" "$PATH_MC.persistent.bak" 2>/dev/null
cp -ar "$PATH_MC" "$PATH_MC.persistent"
if [ $? -ne 0 ]; then
    echo "Something went wrong while backing up."
    echo "An older copy can be found in $PATH_MC.persistent.bak"
    exit 1
else
    rm -rf "$PATH_MC.persistent.bak"
fi
# Turn world saving back on
tmux send -t "minecraft" "save-on" C-m

Save it as a file called /home/minecraft/mc_persistent.sh and make it executable:

chmod 750 /home/minecraft/mc_persistent.sh

Create a new cronjob (for example with ‘crontab -e’ as the minecraft user) and have it call the script every hour:

0 * * * * /home/minecraft/mc_persistent.sh

That’s it! Happy lag-free mining.

Updated June 2, 2013: Thanks to Dan Hull for mentioning that the save message has changed in newer Minecraft server versions. The article has been updated to reflect this change.

Updated Nov 29, 2016: Thanks to Richard S for tips and fixes that increase robustness.

MCPlayerEdit v0.19 released

Thursday, July 14th, 2011

I’ve released a new version of my Minecraft Player/World Editor MCPlayerEdit v0.19. This release features the following modifications and additions:

  • Added a `health` command which lets you set the player’s health (includes a god mode).
  • Fixed a bug where supplying only an item ID to give/remove would create a stack whose size equalled the item ID instead of the maximum stack size.
  • Fixed a bug where adding non-safe items to the inventory would give a stack size of 0.

You can get the new version here.

MCPlayerEdit 0.18 released

Tuesday, July 5th, 2011

I’ve released a new version of my Minecraft Player/World Editor MCPlayerEdit v0.18. This release features the following modifications and additions:

  • Made leaves available in safe-mode.
  • Added a `remove` command which allows the user to remove items from the inventory by name/id instead of slots. (Suggestion by rowanxim)
  • Extended the `list` command so the user can now list items by slot id or item name. (Suggestion by rowanxim)

You can get the new version here.

MCPlayerEdit 0.17 released

Friday, July 1st, 2011

I’ve released a new version of my Minecraft Player/World Editor MCPlayerEdit v0.17. This release features the following modifications and additions:

  • Lava and Water buckets can now be added to the inventory in safe-mode.
  • Added Pistons, Sticky Pistons and Shears.
  • Fixed a bug in the nbtdump command.

You can get the new version here.

I’m on vacation till June 29

Friday, June 17th, 2011

I’ll be on vacation from now till June 29. Depending on WiFi-availability in the area I’m visiting, I may or may not respond to anything until that time.

Filesystem Latency

Tuesday, May 31st, 2011

There’s an interesting on-going series of articles on file system latency over at Brendan’s Blog. Usually when system administrators look into I/O performance, we look at the I/O of the disks. This is usually fine for a rough estimate of raw disk performance, but there’s a lot more going on between the actual application and the disk: buffers, cache, the file system, etc. Brendan goes into detail regarding these matters by examining the I/O performance of a MySQL database at both the disk and the file system level.

MCPlayerEdit v0.16 released

Thursday, May 26th, 2011

I’ve released a new version of my Minecraft Player/World Editor MCPlayerEdit v0.16. This release features the following modifications and additions:

  • Update for Minecraft Beta 1.6
  • Added trapdoor inventory item.
  • Added map inventory item.
  • Added dead desert shrub (unsafe) inventory item. Can only be placed on sand blocks.
  • Added dead grass shrub (unsafe) inventory item. Can only be placed on grass blocks.
  • Added locked chests (automatically disappear a random time after being placed).
  • Fishing rods can no longer be stacked in safe mode.
  • Added the ‘loseme’ command. It clears the inventory and transports the player in a random direction for a given distance. The objective is to find your way back.
  • Added the ‘restore’ command which restores the last (automatic) created backup of the player data.
  • Improved the startup message.
  • Allow for multiple commands on a single line, separated by a semi-colon. Example: > load World1; give 1 diamond pickaxe; save; quit

It does not seem possible to add Tall Grass to the inventory, even as an unsafe item. :(

You can get the new version here.

Closures, and when they’re useful.

Friday, May 20th, 2011

When is a closure useful?

Before we start with why a closure is useful, we might first need to understand what exactly a closure is.

First-class functions

In order to understand what a closure is, we must realize that in many, if not most, languages we cannot just call functions; we can also pass references to functions around in variables. If a language supports that, it is said to have first-class functions. This can be used, amongst other things, to implement callbacks: you pass a reference to a function to a part of the program, which can then later call the function and obtain the results.
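
For instance, here is a minimal sketch of what first-class functions look like in Python: we assign the function itself to a variable (no parentheses, so it is not called) and call it through that variable later:

def greet(name):
   return('Hello, ' + name)

f = greet         # a reference to greet; note: no parentheses, we're not calling it here
print f('world')  # output: Hello, world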

A common example of something that uses callback functions is a sorting routine that takes a comparison function. A function that takes (or returns) another function like this is called a higher-order function. For instance, Python’s sorted function:

sorted(iterable, cmp=None, key=None, reverse=False) --> new sorted list

The cmp parameter is a callback function. If we have a list of custom objects:

class MyPerson():
   def __init__(self, name, age):
      self.name = name
      self.age = age

people = [
   MyPerson('john', 24),
   MyPerson('santa', 100),
   MyPerson('pete', 30),
]

and we want to sort people by age, we can do so by defining our own custom comparison function and passing it to sorted:

def my_cmp(a, b):
   return(cmp(a.age, b.age))

sorted(people, my_cmp)

The sorted function will now loop through the items in people and call the callback function my_cmp with two items from the list at a time. Based on which of the two is considered smaller or larger, it determines the order of the items in the new list it returns. Note that we are not calling my_cmp ourselves! We’re simply passing a reference to the function to sorted.
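
For example, with the people list defined above, sorting by age puts john (24) first and santa (100) last:

sorted_people = sorted(people, my_cmp)
print [p.name for p in sorted_people] # output: ['john', 'pete', 'santa']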

Nested functions

Okay, so that covers first-class functions. Many languages also support nested functions. Example:

def get_cmp_func(key='age'):

   def my_cmp_name(a, b):
      return(cmp(a.name, b.name))

   def my_cmp_age(a, b):
      return(cmp(a.age, b.age))
      
   if key == 'name':
      return my_cmp_name
   elif key == 'age':
      return my_cmp_age

get_cmp_func returns a function that can be used to compare things by either name or age, depending on what you pass as the key parameter. get_cmp_func is also a higher-order function, because it returns a reference to a function. Of course in this use-case there are better ways of sorting the list, but it’s just an example.
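
To make the example concrete, the function returned by get_cmp_func can be passed straight to sorted, just like my_cmp was:

sorted(people, get_cmp_func('name')) # sort the people list by name
sorted(people, get_cmp_func('age'))  # sort the people list by age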

Anonymous functions

Anonymous functions are not a requirement for closures, but it may be a good idea to explain what they are nonetheless, as there’s a lot of confusion over when exactly something is an anonymous function.

Anonymous functions, sometimes also called lambdas, are simply that: anonymous, they have no name. Looking at the previous examples in this post, we see function names such as my_cmp, get_cmp_func and even nested functions with names: my_cmp_age. An anonymous function has no such name, but that doesn’t mean it can’t be passed around as a reference! Example:

sorted(people, lambda a, b: cmp(a.age, b.age))

The anonymous function here is: lambda a, b: cmp(a.age, b.age). As you can see, it looks a lot like our first my_cmp function, except that it has no name and doesn’t seem to return anything. That’s because an anonymous (lambda) function in Python always implicitly returns the value of its body. In fact, a lambda in Python may only contain a single expression, not statements. (Other languages allow for more advanced anonymous functions; Python likes to keep it simple.)

Okay, so why exactly would you need anonymous functions? Well, if your language already supports first-class functions (passing around references to a function), there really isn’t a need for anonymous functions, except that they save some typing. Lambda functions are little more than syntactic sugar for defining a small function and passing a reference to it.
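
To illustrate that a lambda really is just an ordinary function object, it can also be bound to a name first and then passed along, with exactly the same result:

by_age = lambda a, b: cmp(a.age, b.age)
sorted(people, by_age) # same result as passing the lambda directly to sorted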

Scope

So.. a closure, what is it? Again, before we can understand closures, we need to understand scope. Scope determines which variables and functions we can access at a certain location in our code. When a function is called, the programming language allocates a piece of memory in which the function’s parameters and local variables are stored. This piece of memory (a stack frame) is automatically cleared when the function returns. This is called the local scope.

Functions can usually also reference variables from the parent scope. For example:

a = 10

def print_a():
   print a

print_a() # output: 10

The print_a function has access to the a variable in the parent scope. But if we define a in a function’s local scope, we’ll get an error:

def define_a():
   a = 10

def print_a():
   print a

define_a()
print_a() # NameError: global name 'a' is not defined

We get a NameError when we try to print a’s value, because it is defined in define_a‘s local scope, which will be destroyed as soon as define_a stops running. This is called going out of scope. Anything a piece of code can access (local scope, parent scope) is defined as being within scope.

Closures

Now, finally, closures!

A closure is a special way in which scopes are handled. Instead of a function going out of scope and all the variables/functions in its scope (local, parent, grand-parent, etc.) being destroyed, the scope is kept around for later use. Let’s look at an example:

def define_a():
   a = 10

   def print_a():
      print a

   return(print_a)

var_print_a = define_a()
var_print_a() # output: 10

This outputs 10. Let’s take a look at what’s happening. We define a function define_a and set a = 10 in its local scope. We then define a nested function that prints a from the parent scope. The define_a function then returns a reference to that function.

Next, we call define_a, which returns a reference to print_a and assigns it to variable var_print_a. Then we call var_print_a as a function (this is called dereferencing). By all accounts it shouldn’t work, because define_a has already stopped running. It has gone out of scope and its scope (containing a) should have been destroyed. But it’s not, because Python kept its scope around. This is a closure. The variables that were in scope at the time the closure was generated are still accessible for the function, and are now known as free variables.
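
You can actually see that Python kept the free variable around by peeking at the function object’s attributes (shown here for Python 2; in Python 3 these are called __code__ and __closure__):

print var_print_a.func_code.co_freevars          # output: ('a',)
print var_print_a.func_closure[0].cell_contents  # output: 10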

The use-case

So, when are closures useful? Why not just use an Object and store the value in the object, along with a method that uses the object?

Let’s say we have a multithreaded program that handles requests. Data is stored in a database. The request handlers need to access the data in the database, but each thread has to have its own connection to the database, or the threads might accidentally overwrite each other’s data. So our multithreaded program allows us to register a callback function which will be called when a new thread starts. The callback function should return a new database connection for use in that thread.

def make_db_connection():
   return(db.conn(host='localhost', username='john', passwd='f00b4r'))

app = MyMultiThreadedApp(on_new_thread_cb = make_db_connection)
app.serve()

MyMultiThreadedApp will call make_db_connection for each new thread it starts, and the thread can then use the database connection returned by make_db_connection. But there is a problem! The database connection information (host, username, passwd) is hard-coded, but we want to get it from a configuration file instead!

So? We just pass some parameters to make_db_connection, right? Wrong!

def make_db_connection(host, username, passwd):
   return(db.conn(host=host, username=username, passwd=passwd))

app = MyMultiThreadedApp(on_new_thread_cb = make_db_connection)
app.serve()

This example won’t work! Why not? Because MyMultiThreadedApp has absolutely no idea it should pass parameters to make_db_connection. Remember that we’re not calling the function ourselves; we’re just passing a reference to MyMultiThreadedApp, which will call it eventually. There’s no way for it to know which parameters it should pass, because that depends on how your database needs to be set up. SQLite only needs a path parameter, but MySQL also needs a username, password and host.

This is where closures step in:

def gen_db_connector(host, username, passwd):
   def make_db_connection():
      return(db.conn(host=host, username=username, passwd=passwd))
   return(make_db_connection)

callback_func = gen_db_connector('localhost', 'john', 'f00b4r')
app = MyMultiThreadedApp(on_new_thread_cb = callback_func)
app.serve()

The gen_db_connector function generates a closure (make_db_connection) which has access to host, username and passwd. We then get a reference to the closure, put it in callback_func and pass that to MyMultiThreadedApp. Now when a new thread is created, and the callback function is called, it will have access to the host, username and passwd information, without MyMultiThreadedApp needing to know which params it should pass on.

An alternative to closures

There’s a different way of accomplishing this though. By using objects:

class DBConnector():
   def __init__(self, host, username, passwd):
      self.host = host
      self.username = username
      self.passwd = passwd

   def connect(self):
      return(db.conn(
         host=self.host, 
         username=self.username,
         passwd=self.passwd)
      )

db_conn = DBConnector('localhost', 'john', 'f00b4r')
app = MyMultiThreadedApp(on_new_thread_cb = db_conn.connect)
app.serve()

However, this is a lot more lines, and whether it works depends on whether your programming language supports first-class methods. That is, passing around references to methods on an object instance, while also allowing you to call them as instance methods (instead of just as static methods).

I’d personally argue for the Object way. Closures are a concept which is very hard to understand for less experienced programmers. It is a matter of debate whether closures hide state in an unpredictable way. I tend to think they do, and I’m not much of a fan of free variables, since it is hard to tell where they came from. At any rate, objects are easier to understand than closures, so if at all possible, go for the object way.

This is why I don’t use Apple products or DRM media

Wednesday, May 11th, 2011

This company is going out of business because they put all their eggs in a very delicate and quite frankly evil basket:

BeamItDown Software and the iFlow Reader will cease operations as of May 31, 2011. We absolutely do not want to do this, but Apple has made it completely impossible for anyone but Apple to make a profit selling contemporary ebooks on any iOS device.

If you’re a company, and you do this:

We bet everything on Apple and iOS and then Apple killed us by changing the rules in the middle of the game.

you need to have your head examined :-) This is not the first time this has happened, and it will most certainly not be the last time. Apple will do anything it can to make a buck over other companies’ backs!

And it’s not just the company that is being royally screwed over by Apple:

Many of you have purchased books and would like to keep them. You may still be able to read them using iFlow Reader although we cannot guarantee that it will work beyond May 31, 2011 […] your computer which will let you access them with Adobe Digital Editions or any other ebook application that is compatible with Adobe DRM protected epubs.

So iFlow Reader’s users have probably also lost all their ebooks, because the books had DRM on them. DRM (Digital Rights Management) is a technology which restricts media to a certain application or device; opening such media in third-party applications is usually impossible.

And that’s why I have never and will never buy an Apple product, or use any media that is DRM protected.

The text of all posts on this blog, unless specifically mentioned otherwise, is licensed under this license.