Lesson 7: Modules coming and going.

So what's on tap for today?

Today's lesson will involve the intricacies of what happens when you load and unload your module, and what should happen in the entry and exit routines of every loadable module. And we'll throw in how you can print simple information messages from your module as well. So, away we go.

Your new crash2 module

For the rest of this lesson, we'll be using a slightly extended version of the crash1 module you were playing with in earlier lessons, so create a new crash2 directory and add the following files. Here's your source file crash2.c:

/* Module source file 'crash2.c'. */
#include <linux/module.h>
#include <linux/init.h>
#include <linux/kernel.h>

static int __init hi(void)
{
     printk(KERN_INFO "crash2 module being loaded.\n");
     return 0;
}

static void __exit bye(void)
{
     printk(KERN_INFO "crash2 module being unloaded.\n");
}

module_init(hi);
module_exit(bye);

MODULE_AUTHOR("Robert P. J. Day, http://crashcourse.ca");
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("Doing a whole lot of nothing.");

and the Makefile:

ifeq ($(KERNELRELEASE),)

KERNELDIR ?= /lib/modules/$(shell uname -r)/build
PWD := $(shell pwd)

.PHONY: build clean

build:
        $(MAKE) -C $(KERNELDIR) M=$(PWD) modules

clean:
        rm -rf *.o *~ core .depend .*.cmd *.ko *.mod.c
        rm -f modules.order Module.symvers

else

$(info Building with KERNELRELEASE = ${KERNELRELEASE})
obj-m :=    crash2.o

endif

Once again, note that each recipe line in your Makefile (the lines under the build and clean targets) must be indented with a single TAB character and not spaces.

At this point, feel free to once again compile this module, load it, verify that it was loaded, then unload it. And once that's done, now we can talk about it.

Aside: For predictability, I'm once again doing all of this under the default Ubuntu 10.04 kernel:

$ uname -r
2.6.32-22-generic
$

However, if you're feeling ambitious and you've built a newer kernel and are currently running under it, everything in this lesson should still work equally well.

Printing "Hello, world"

Most budding kernel programmers want to know how to print from their module, and you can see in the source above the standard technique:

...
printk(KERN_INFO "crash2 module being loaded.\n");
...
printk(KERN_INFO "crash2 module being unloaded.\n");
...

except that it's obvious that, when you load and unload your module, nothing shows up on the screen, so let's emphasize a fundamental principle of modules and kernel code one more time.

Once your module is loaded, it's running in kernel space and has very little association with user space, so anything printed is not going to show up at your terminal, for the simple reason that your module has no association with your terminal anymore.

Instead, anything you print with the above format is going to, by default, show up in the system log file, which is almost assuredly the file /var/log/messages. Once you've loaded and unloaded your module, you can search that file (root privilege required) for your module's messages to see what happened:

Jun 20 07:55:10 ... crash2 module being loaded.
Jun 20 07:55:14 ... crash2 module being unloaded.

and there's the proof.

As an alternative, if you want to see those output messages in real time as they're being printed (and a lot of other kernel log messages as well), you can follow the tail end of that log file thusly:

$ sudo tail -f /var/log/messages
...
Jun 20 07:55:10 ... crash2 module being loaded.
Jun 20 07:55:14 ... crash2 module being unloaded.

and simply break out of the tail command when you're done with it.

We'll have more about printing diagnostics from your modules toward the end of this lesson. For now, however, we need to cover the primary reason for this lesson -- the proper coding of your module's entry and exit routines.

Module entry and exit code

What's happening in your module code here should be fairly obvious:

static int __init hi(void)
{
     printk(KERN_INFO "crash2 module being loaded.\n");
     return 0;
}

static void __exit bye(void)
{
     printk(KERN_INFO "crash2 module being unloaded.\n");
}

module_init(hi);
module_exit(bye);

Every module should define both an init and an exit routine, representing the code that should be executed at module load and unload time, respectively. In the above, we're obviously not doing anything sophisticated but we still need those routines. And what exactly are they supposed to do? That depends.

Based on how complicated your module is, it might need to do a fair bit of initialization including things like allocating memory, registering device nodes, initializing hardware and so on.

Conversely, your module's exit routine is responsible for undoing all of that work -- that is, unregistering devices, freeing allocated memory and so on, which brings us to the cardinal rule of module init and exit routines:

Your exit routine is responsible for undoing everything done by the init routine.

I can't stress the above enough. Developers working in user space are used to being somewhat sloppy in their cleanup routines, particularly when it comes to deallocating memory they dynamically allocated earlier, since they simply assume that, once their program ends, the operating system will take care of all of that.

This is not true for kernel space programming -- if you do something in your entry code, it's your responsibility to undo it on your way out. No one is going to be checking your code and cleaning up after you. If you leave devices running or memory allocated, all of that stays there until the system is rebooted. In short, you must be scrupulous about cleaning up after yourself when your module is unloaded.

But wait ... there's more.

Detecting errors in the entry code

Given how much your module is trying to do in its entry code, it's entirely possible that something might go wrong. Perhaps you can't initialize a device properly. Perhaps a call to kmalloc() to dynamically allocate some kernel memory fails. Any number of things might go wrong and it's your responsibility to check every return code and deal with failure accordingly.

Ignoring the details (which are coming shortly), if everything in your entry routine appears to work, you signal that by ending that routine with:

  return 0;

In other words, returning a value of zero from your entry code is your way of saying that everything appears to have worked and the module is ready to be fully loaded.

On the other hand, if you detect that anything has gone wrong, it's your job to return something other than zero. That not only alerts the programmer that something failed, but it kills the loading process.

Exercise for the student: Typically, one returns a negative integer value from the init routine to show that something has gone wrong and that the module can't be loaded. Modify your crash2.c source file to return a value of -7, then rebuild it and try to load it. What happens? Check what's happening in /var/log/messages as you're trying this.

The sordid details of module load failure

At this point, we can cover the final details of what happens when your module detects that something has failed during its init code. Whatever the module has managed to do up to the point of apparent failure, it's the module's responsibility to undo all of that (typically in reverse order) before returning an appropriate negative error code. Again (as we explained above), when you're working in kernel space, no one is going to clean up after you, so if you simply exit your module init code with a negative value to signify failure, everything you did up to that point will remain. Let's demonstrate this with some sample code.

If we ignore that anything can go wrong during module loading, you can imagine some fairly simple init and exit routines looking something like:

static int __init hi(void)
{
    do_this();
    do_that();
    do_something_else();
    return 0;  // success!
}

static void __exit bye(void)
{
    undo_something_else();
    undo_that();
    undo_this();
}

Note how the code above assumes everything works successfully, no error checking is done, and the init routine always returns zero to show success.

However, more realistic code would look more, perhaps, like this:

static int __init hi(void)
{
    int err;
    /* registration takes a pointer and a name */
    err = register_this(ptr1, "rday");
    if (err) goto fail_this;
    err = register_that(ptr2, "rday");
    if (err) goto fail_that;
    err = register_those(ptr3, "rday");
    if (err) goto fail_those;
    return 0; /* success */

    fail_those: unregister_that(ptr2, "rday");
    fail_that: unregister_this(ptr1, "rday");
    fail_this: return err; /* propagate the error */
}

static void __exit bye(void)
{
    unregister_those(ptr3, "rday");
    unregister_that(ptr2, "rday");
    unregister_this(ptr1, "rday");
    return;
}

It should be obvious what those routines are doing. While the exit routine (correctly) assumes that it must undo everything that a successful load would have done, the init routine now has to check every single step and, upon detecting an error, has to know what's already been done and carefully back out each of those steps -- typically in reverse order -- then return a negative value to identify the error.

The above is one variation of how some programmers do that. Another variation is to put all the exit code in a separate routine and have both the entry and exit routines call that. Obviously, you have some flexibility in how you want to do the above. You just have to do it.
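To make the "shared cleanup routine" variation concrete, here's a minimal sketch of it written as ordinary user-space C so you can experiment with it safely. Everything in it is an assumption for illustration: the register/unregister functions are hypothetical stand-ins that only flip flags, and fail_step is a made-up hook for forcing a failure. A real module would call actual kernel registration APIs instead.

```c
#include <errno.h>
#include <stdbool.h>

/* User-space model only: these flags stand in for real kernel resources. */
static bool this_done, that_done, those_done;

/* Hypothetical hook: set to 1, 2 or 3 to make that step fail (0 = no failure). */
static int fail_step;

static int register_this(void)
{
    if (fail_step == 1)
        return -ENOMEM;
    this_done = true;
    return 0;
}

static int register_that(void)
{
    if (fail_step == 2)
        return -EIO;
    that_done = true;
    return 0;
}

static int register_those(void)
{
    if (fail_step == 3)
        return -ENODEV;
    those_done = true;
    return 0;
}

static void unregister_this(void)  { this_done = false; }
static void unregister_that(void)  { that_done = false; }
static void unregister_those(void) { those_done = false; }

/* Shared teardown: undo, in reverse order, only the steps that succeeded. */
static void cleanup(void)
{
    if (those_done)
        unregister_those();
    if (that_done)
        unregister_that();
    if (this_done)
        unregister_this();
}

/* Models the module init routine. */
static int hi(void)
{
    int err = register_this();

    if (!err)
        err = register_that();
    if (!err)
        err = register_those();
    if (err)
        cleanup();   /* back out whatever succeeded before the failure */
    return err;      /* 0 on success, negative error code on failure */
}

/* Models the module exit routine. */
static void bye(void)
{
    cleanup();
}
```

The trade-off with this variation is that the cleanup routine has to know which steps actually completed (tracked here with simple flags), whereas the goto-based version encodes that knowledge in the position of each label.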

Exercise for the student: Feel free to wander through your git kernel source tree and check out any driver code that could be built as a module, and peruse its entry and exit code (typically found at the bottom of the main source file for that feature).

For example, look at the main source file for the Btrfs filesystem, fs/btrfs/super.c, for a realistic example of entry and exit code. And leave a comment of any other examples you find that represent a good demonstration of this.

And what about those negative return codes?

As the final topic in this lesson, let's clarify what you should be returning from the module init routine if something goes wrong. In order to prevent any attempt to load the module, you have to return a negative integer value. But rather than writing out the value as a number, you typically return the negative value of one of the error macros found in either of these files in the kernel source tree:

$ cat include/asm-generic/errno-base.h
#define EPERM            1      /* Operation not permitted */
#define ENOENT           2      /* No such file or directory */
#define ESRCH            3      /* No such process */
#define EINTR            4      /* Interrupted system call */
#define EIO              5      /* I/O error */
#define ENXIO            6      /* No such device or address */
#define E2BIG            7      /* Argument list too long */
#define ENOEXEC          8      /* Exec format error */
#define EBADF            9      /* Bad file number */
#define ECHILD          10      /* No child processes */
#define EAGAIN          11      /* Try again */
#define ENOMEM          12      /* Out of memory */
#define EACCES          13      /* Permission denied */
... snip ...
$ cat include/asm-generic/errno.h
#include <asm-generic/errno-base.h>

#define EDEADLK         35      /* Resource deadlock would occur */
#define ENAMETOOLONG    36      /* File name too long */
#define ENOLCK          37      /* No record locks available */
#define ENOSYS          38      /* Function not implemented */
#define ENOTEMPTY       39      /* Directory not empty */
... snip ...

The above should be self-explanatory. If your module init routine fails because of a memory allocation error, you would most likely clean up after yourself, then:

return -ENOMEM;   // Don't forget to return a *negative* value.

The rest should be fairly obvious. Once again, feel free to wander around the kernel source tree and see how the existing code does all this.
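If you'd like to see how a negative return value maps back to a readable message, here's a small user-space sketch. The function names failing_init and describe are hypothetical, made up for this illustration. The error macros themselves come from the standard errno.h header, which on Linux uses the same numbering as the kernel headers shown above.

```c
#include <errno.h>
#include <string.h>

/* Hypothetical stand-in for an init routine whose memory allocation failed. */
static int failing_init(void)
{
    /* ... undo any earlier successful steps here ... */
    return -ENOMEM;   /* negative, by convention */
}

/* Map a negative kernel-style return value back to its message text. */
static const char *describe(int ret)
{
    return strerror(-ret);
}
```

For instance, describe(-7) looks up the message for E2BIG, the same "Argument list too long" text you'd expect to see reported if your init routine returned -7. The exact wording comes from your C library, so treat it as typical rather than guaranteed.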

Exercise for the student: What happens if you modify your module source to (mistakenly) return a positive value from its init routine? Does the module still load? What's different about what happens here?

What have we forgotten?

We've deliberately ignored the new "__init" and "__exit" annotations that you can see in your module source file. We'll be covering those next lesson. And a lot more, of course.

Until next time.

Comments

Exercise for students

I made the following change in the init routine:
static int __init hi(void)
{
printk(KERN_INFO "crash2 module being loaded.\n");
return -7;
}

Then I ran make and tried to load the module:

$ sudo insmod crash2.ko

But I got this error:
insmod: error inserting 'crash2.ko': -1 Argument list too long

I checked the log messages:
$ sudo tail -f /var/log/messages

kernel: [ 6390.209195] crash2 module being loaded.

Then I checked whether the module had been loaded; it hadn't:
$ lsmod | grep crash2
$

I tried to remove it too even though I knew it would fail (just for the sake of it)
$ sudo rmmod crash2.ko
ERROR: Module crash2 does not exist in /proc/modules

What is going on here?

That's exactly what you should expect.

The point of that exercise was to see what happens when you return something other than zero from your module entry code. In cases like that, you'll get an error message reflecting the negative value you returned -- in this case, -7 represents argument list too long (just a random error code I chose as part of the example).

In short, what you saw was exactly what you should have expected.

Compile errors

When old eyes get an error compiling it's time to increase the font size.

Interesting output when the return code is a positive number, not what I was expecting at all.

inode.c in filesystem type code feels interesting

Following your suggestion to browse the source tree, I found

/ubuntu-lucid/fs/fat/inode.c

interesting.
I could not understand much of the code, but at the end of the file I saw

module_init(init_fat_fs)
module_exit(exit_fat_fs)

which is exactly what I had learned up to this lesson :)

So thought of sharing.

Semicolon is missing

Just a syntactical error, but the last line, namely MODULE_DESCRIPTION("Doing a whole lot of nothing.") does not end with a semicolon; maybe the CMS removed it.

Quite right, my copy and paste must have lost the semi-colon. It's fixed now.

