As none of you have noticed (as I have no readers), I've given my blog a major overhaul. Along with a different look, I'm now using SyntaxHighlighter, which allows me to make some really pretty posts.
I've also renamed my blog from "Russell Harmon's Blog" to "Page Fault", in reference to what occurs when you attempt to access a virtual address which is not resident in memory.
EDIT 2010-10-13: I appear to be working on yet another blog overhaul. This one is far from done. There must be something wrong with me.
Page Fault
A chronicle of the tech stuff I do.
Tuesday, October 12, 2010
Friday, October 1, 2010
Object Oriented C
So, thanks to blocks, Apple's new extension to C, you can now do basic object orientation in C. Have a look over at GitHub for a short example of how to do it.
To break it down, an object is a struct, which contains both fields and blocks which act as the object's methods.
    typedef struct {
        Object super;
        char *_value;
        const char *(^getValue)();
        void (^setValue)( const char * );
    } String;
This creates a String object which inherits from Object and has a field _value, and methods getValue and setValue.
Because this String's first field is an Object, it can be safely cast to one (upcasting).
More to come... maybe.
Monday, July 5, 2010
Bash Scripters: Stop using subshells to call functions.
When writing in bash, zsh, sh, etc... stop using subshells to call functions. There is a significant speed overhead to using a subshell and there is a much better alternative. Instead, you should just have a convention where a particular variable is always the return value of the function (I use retval). This has the added benefit of also allowing you to return arrays from your functions.
If you don't know what a subshell is, a subshell is another bash shell which is spawned whenever you use $() or `` and is used to execute the code you put inside.
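A quick way to see the subshell at work is to change a variable inside $() and look at it afterwards. This is only a minimal sketch; the names counter and bump are made up for illustration.

```bash
#!/bin/bash
counter=0

# Hypothetical helper: increments a global and prints the new value.
bump() {
    counter=$(( counter + 1 ))
    echo "$counter"
}

# Command substitution runs bump in a subshell, so the increment
# happens in a copy of the shell and is lost when the subshell exits.
out="$(bump)"
after_subshell=$counter

# Calling bump directly runs it in the current shell, so the change sticks.
bump > /dev/null
after_direct=$counter

echo "captured=$out after_subshell=$after_subshell after_direct=$after_direct"
```

The captured output is 1, but the parent's counter only changes after the direct call, which is the same isolation that makes every $() call pay for a new shell.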
I did some simple testing to allow you to observe the overhead. For two functionally equivalent scripts:
This one uses a subshell:
    #!/bin/bash
    function a() {
        echo hello
    }
    for (( i = 0; i < 10000; i++ )); do
        echo "$(a)"
    done
This one uses a variable:
    #!/bin/bash
    function a() {
        retval="hello"
    }
    for (( i = 0; i < 10000; i++ )); do
        a
        echo "$retval"
    done
The speed difference between these two is noticeable and significant.
    $ for i in variable subshell; do
    > echo -e "\n$i"; time ./$i > /dev/null
    > done

    variable

    real 0m0.367s
    user 0m0.346s
    sys 0m0.015s

    subshell

    real 0m11.937s
    user 0m3.121s
    sys 0m0.359s
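The array benefit of the retval convention can be sketched as follows. The function name list_colors and its contents are hypothetical; the only thing taken from the convention above is the retval variable itself.

```bash
#!/bin/bash

# Hypothetical function: "returns" an array by assigning it to retval.
list_colors() {
    retval=( "red" "dark green" "blue" )
}

list_colors
# Copy the result right away, before another call overwrites retval.
colors=( "${retval[@]}" )

echo "count: ${#colors[@]}"
for c in "${colors[@]}"; do
    echo "color: $c"
done
```

A command substitution can only hand back a single string, so an element containing a space, like "dark green" here, would need extra care to survive the trip. With retval the array arrives intact.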
Monday, May 24, 2010
Reclaimable Userspace Cache Memory
Caches are used all over your computer and for a huge variety of purposes. From Apache to your physical CPU, cache is everywhere. Normally, when you want to cache something in memory, you malloc(3) a chunk of memory and store data in that. This works well on a small scale, but when you and 30+ others want to cache some information, that can quickly turn into a large amount of memory taken up by information which can be (easily, or not so easily) regenerated, and there is no way for the operating system to reclaim that memory when it really needs it.
In Java, that's not the case. In Java, you can create SoftReference objects which are collected by the garbage collector when the VM is running out of memory. This exact idea is what I'd like to see in an operating system.
I propose a system whereby you can allocate memory which the operating system can reclaim at its own discretion. This would work by using malloc(3) to get some memory, then using madvise(2) to advise the kernel that this is reclaimable memory. Then, before you read or write to the memory, you lock the memory (for read or write) using reclock, during which time the kernel guarantees not to reclaim the memory. Then, when you are done reading or writing to that memory, you recunlock it.
The function prototypes for the reclock and recunlock functions (which don't exist) would be:
    // Returns 0 on success, -1 if the memory
    // is no longer available
    int reclock( const void *addr, int perms );
    void recunlock( const void *addr );
Under the hood, what would happen is that when you madvise(2) the kernel that a particular space is reclaimable, it would add it to a list of reclaimable addresses. Then, when the system is low on memory, it would scan the list for a chunk of memory large enough, check that the memory isn't locked (see the next paragraph), mark that element in the list as reclaimed, record the pid it was taken from, and give it to someone else.
Before simply giving a chunk of memory to someone else, however, the kernel has to check to see if the memory is in use. In order to do that, there has to be a lock bit somewhere. I had originally thought to put it in the kernel's memory, but Clockfort noted that locking and unlocking would then require a system call, which would be quite slow. Therefore, the bit can be kept in the process's memory space, and simply read by the kernel before reclaiming memory. That way, reclock and recunlock can be implemented entirely without syscalls.
Thursday, May 28, 2009
Pure bash cat
So just to see if I could, I wrote a version of cat using pure bash. Pure bash is a bash script which uses nothing but bash builtins to accomplish its goal. To determine if a particular command is a builtin, you can use the command type -t "command" (the command type is itself a builtin). Some notable commands which are builtins include echo, read, exec, and return. Some notable commands which are not builtins include cat and grep. What follows is my implementation of cat in pure bash.
    #!/bin/bash
    INPUTS=( "${@:-"-"}" )
    for i in "${INPUTS[@]}"; do
        if [[ "$i" != "-" ]]; then
            exec 3< "$i" || exit 1
        else
            exec 3<&0
        fi
        while read -ru 3; do
            echo -E "$REPLY"
        done
    done
Now, keep reading if you want a small lesson in advanced bash. I'll go line by line to explain what this is doing.
    #!/bin/bash
    INPUTS=( "${@:-"-"}" )
Line 1 is the shebang.
    #!/bin/bash
    INPUTS=( "${@:-"-"}" )
    for i in "${INPUTS[@]}"; do
Line 2 assigns the array variable INPUTS either the arguments provided on the command line, if there are any, or the single character "-". The way this happens is as follows: $@ is the variable that references the positional parameters (the arguments to your program). If you have not heard of $*, read this. I reference the positional parameters as ${@} because the braces allow me to add a "default value" to the variable. A default value is the value that the variable will seem to have if the variable is not set. The way to use a default value is with :-, like so: ${@:-"hello"}. So if $@ is not set, it will seem to have the value "hello". You will then notice that it is all enclosed in (). That makes an array out of the positional parameters (the first argument to the program becomes the first element in the array, the second argument becomes the second element, etc.).
    INPUTS=( "${@:-"-"}" )
    for i in "${INPUTS[@]}"; do
        if [[ "$i" != "-" ]]; then
Line 3 begins a for loop which will assign to i each value stored in the array INPUTS, which was discussed earlier. The @ index works for arrays the same way $@ works for the positional parameters.
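The default-value and array expansions from lines 2 and 3 can be tried in isolation. This is a minimal sketch: the function collect_inputs, the variables no_args and two_args, and the file names are all made up for illustration, and the function's own arguments stand in for the script's positional parameters.

```bash
#!/bin/bash

# Hypothetical helper: stores its arguments in the array retval,
# or the single element "-" when called with no arguments.
collect_inputs() {
    retval=( "${@:-"-"}" )
}

# No arguments: the default value kicks in.
collect_inputs
no_args=( "${retval[@]}" )

# Two arguments, one with a space: each stays a separate element.
collect_inputs "a.txt" "my file.txt"
two_args=( "${retval[@]}" )

echo "without arguments: ${no_args[0]}"
for i in "${two_args[@]}"; do
    echo "input: $i"
done
```

Quoting "${two_args[@]}" matters just as it does for "$@": without the quotes, "my file.txt" would be split into two words.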
Maybe I'll explain more when I'm less lazy.
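Going back to the builtin check from the top of this post, here is a minimal sketch of how type -t could be wrapped up for checking a script's commands. The helper name is_builtin is hypothetical, and it uses a command substitution purely for brevity.

```bash
#!/bin/bash

# Hypothetical helper: succeeds only if the named command is a builtin.
# `type -t` prints one word describing the command (for example
# "builtin", "keyword", "function" or "file") and nothing if it is unknown.
is_builtin() {
    [[ "$(type -t "$1")" == "builtin" ]]
}

for cmd in echo read exec return cat grep; do
    if is_builtin "$cmd"; then
        echo "$cmd: builtin"
    else
        echo "$cmd: not a builtin"
    fi
done
```

Anything reported as "not a builtin" would have to go before a script could be called pure bash.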
Sunday, March 22, 2009
Chromium on Linux
So I decided today to take a shot at compiling Google Chrome on Linux... aaand after a number of compile errors, it works! Here's a screenshot of what you see when you start it up:
Some things I noted about it:
- It took a long time to connect to many web sites.
- It crashed a lot.
- There was no tab interface... opening a new tab worked, but you couldn't close it or get back to any old tabs.
- It caused Google to block me.
- No dialog boxes worked... I couldn't open the options pane, the about pane, etc.
In short, the browser is not usable yet.
P.S. As you can see in my screenshot, there is a big disclaimer that this browser is NOT READY YET, so DON'T judge the quality of the Linux port of Chromium using any information you can get about it today!
Friday, January 23, 2009
Fixed lighttpd