Ok, so I didn't get my file system completed on my trip a few months ago. In fact, I didn't work on it at all during the trip. (I don't know why I thought I would)
I am now, however, working in earnest on getting the file system to work. This involves removing a lot of old code and rewriting some of the rest. I'm basing my file system interfaces on Linux, which stores pointers to functions for open, close, read, write, lseek, and other system calls in the file and inode structures. This way each type of file system (e.g., FAT32, NTFS, or ext3) can supply its own set of methods for each system call. I'm simplifying the interfaces here and there, but they should be recognizable as being derived from Linux.
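To illustrate the function-pointer idea, here is a minimal sketch loosely modelled on that Linux design. It is not the Punix code: every name in it (struct fileops, memfs_read, vfs_read, and so on) is hypothetical, and the "file system" is just a string in memory.

```c
#include <stddef.h>
#include <string.h>

struct file;

/* Per-file-system method table (hypothetical and heavily simplified). */
struct fileops {
    long (*read)(struct file *fp, void *buf, size_t count);
    long (*lseek)(struct file *fp, long offset);
};

struct file {
    const struct fileops *f_ops; /* methods supplied by the file system */
    long f_offset;               /* current position in the file */
    const char *f_data;          /* toy backing store for this sketch */
    size_t f_size;               /* size of the backing store in bytes */
};

/* Toy memory-backed read: copies from f_data and advances the offset. */
static long memfs_read(struct file *fp, void *buf, size_t count)
{
    size_t left = fp->f_size - (size_t)fp->f_offset;

    if (count > left)
        count = left;
    memcpy(buf, fp->f_data + fp->f_offset, count);
    fp->f_offset += (long)count;
    return (long)count;
}

/* Toy absolute seek: rejects positions outside the file. */
static long memfs_lseek(struct file *fp, long offset)
{
    if (offset < 0 || (size_t)offset > fp->f_size)
        return -1;
    fp->f_offset = offset;
    return offset;
}

static const struct fileops memfs_fileops = { memfs_read, memfs_lseek };

/* Generic layer: dispatches without knowing which file system it talks to. */
static long vfs_read(struct file *fp, void *buf, size_t count)
{
    return fp->f_ops->read(fp, buf, count);
}

static long vfs_lseek(struct file *fp, long offset)
{
    return fp->f_ops->lseek(fp, offset);
}
```

A second file system would only need to provide its own struct fileops; vfs_read() and vfs_lseek() stay unchanged.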
At first I am not going to write the Punix File System (which is stored on FlashROM), since that involves debugging PFS itself, the FlashROM block device driver, the block buffer system, and the VFS interfaces all at once. Instead I am writing a memory-based file system, tmpfs, as the first file system. This takes device drivers and block buffers out of the equation, so I only need to get VFS and tmpfs working correctly. Once tmpfs works, I'll start testing the flash driver and block buffer, and then write PFS.
On an unrelated note, I made exceptions work correctly now. They used to panic regardless of the source (user or kernel). Now they panic only if they come from the kernel. Otherwise they send the appropriate signal to the offending process, which kills the process by default. I don't think signals are working completely yet, but they work well enough to cause a process to terminate.
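The decision described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the function and type names are made up, and the mapping from exception vectors to signals is my guess at a reasonable choice, not what Punix actually does. The two hardware facts it relies on are that bit 13 of the 68000 status register is the supervisor bit, and that vectors 3, 4, and 5 are address error, illegal instruction, and zero divide.

```c
#include <signal.h>

#define SR_SUPERVISOR 0x2000 /* S bit of the 68000 status register */

enum exc_action { EXC_PANIC, EXC_SIGNAL };

struct exc_result {
    enum exc_action action; /* what the kernel should do */
    int signo;              /* signal to send when action is EXC_SIGNAL */
};

/*
 * Decide how to handle an exception, given the status register saved in
 * the exception frame and the vector number. Hypothetical helper.
 */
static struct exc_result handle_exception(unsigned short saved_sr, int vector)
{
    struct exc_result r = { EXC_PANIC, 0 };

    if (saved_sr & SR_SUPERVISOR)
        return r; /* the fault happened in kernel mode: panic */

    r.action = EXC_SIGNAL;
    switch (vector) {
    case 3:  r.signo = SIGBUS;  break; /* address error (assumed mapping) */
    case 4:  r.signo = SIGILL;  break; /* illegal instruction */
    case 5:  r.signo = SIGFPE;  break; /* zero divide */
    default: r.signo = SIGSEGV; break; /* catch-all, chosen for this sketch */
    }
    return r;
}
```

Since the default action for these signals is to terminate the process, an unhandled fault in user code kills only that process.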
While I was working on the link port driver, I uncovered a dormant bug in the "tests" applet in the shell which resulted in an Address Error exception:
When the applet timed out reading a packet, it attempted to longjmp() to an invalid context (without a matching setjmp()). I never saw this bug before because I always had "Listen for files" enabled in TiEmu, so TiEmu always replied immediately.
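One common way to guard against this class of bug is to track whether a setjmp() context is currently live before jumping to it. The sketch below is not the actual "tests" applet code; read_packet(), on_timeout(), and the flag are hypothetical names, and the timeout is simulated by a parameter instead of a real timer.

```c
#include <setjmp.h>

static jmp_buf timeout_ctx;
static int timeout_ctx_valid; /* nonzero only while a setjmp() frame is live */

/* Called when a read times out (in real code, perhaps from a signal handler). */
static void on_timeout(void)
{
    if (timeout_ctx_valid)
        longjmp(timeout_ctx, 1);
    /* No matching setjmp(): do nothing instead of jumping into garbage. */
}

/* Returns 0 on success, -1 if the (simulated) read timed out. */
static int read_packet(int will_time_out)
{
    if (setjmp(timeout_ctx) != 0) {
        /* We arrive here via longjmp() from on_timeout(). */
        timeout_ctx_valid = 0;
        return -1;
    }
    timeout_ctx_valid = 1;

    if (will_time_out)
        on_timeout(); /* does not return; control resumes at setjmp() */

    timeout_ctx_valid = 0;
    return 0;
}
```

Calling on_timeout() outside of read_packet() is then harmless, because the flag is clear whenever no setjmp() frame is active.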
Regardless of what caused the address error, the kernel successfully caught and handled it. I ran the applet in a subshell, so only the subshell process died, and the parent shell took over and reported that the subshell terminated and how. The system just kept running as if nothing happened. TI-AMS would barf all over itself in this situation. :)
Of course... Punix will send a signal like this only when an exception occurs. It can do nothing about a process overwriting memory that belongs to the kernel or to another process. I mean "nothing" as in "no practical way to do it". There are ways to detect when it happens, but every way to do so would add significant overhead in time or memory. One example would be to dedicate all of RAM to a single process and use FlashROM as swap space. This would be a significant overhead in time and memory (though this method would have a few advantages).
July 2, 2011
March 27, 2011
Road trip!
I will be leaving the state for a while in a couple of days, so I will probably not be updating this blog much during the trip. I will be bringing my dino-laptop with me, however, so I can continue developing Punix as time permits.
Right now I am trying to compile and install GCC4TI on it. It's taking a long time because the CPU runs at only 300 MHz... I already have TiEmu installed, which is really good, since I really don't want to compile that too.
Hopefully when I get back I can check in a working file system for Punix. :)
Update: I ended up compiling the latest version (3.03) of TiEmu anyway, since I had an older version (3.02) installed. It took about two hours to complete this (as I also had to download and compile the libti* packages). This turned out to be a very good thing though, since TiEmu now uses about 45% CPU on my dino-laptop, whereas before it pegged the CPU and ran at about 40% of real speed. That's about a 450% increase in performance!
March 17, 2011
Real top is coming
Up until now I have been either faking the statistics in top or pulling them directly from the kernel (by reading kernel variables). Since top runs in user space, it can't really read kernel memory directly, so it needs to get the information another way. In Linux (but not in FreeBSD), top uses the /proc filesystem. Punix doesn't have so much as a filesystem working, let alone a /proc filesystem, so I followed 4.4BSD's and FreeBSD's model: use the sysctl() system call. (On Linux, sysctl values are exposed under /proc/sys, though as far as I know that tree does not carry per-process information, so top there reads the per-process entries in /proc instead.)
The BSD sysctl() syscall has a top-level "directory" called "kern", and one of the second-level directories under that is "proc". This is used for getting process information and has several third-level directories for filtering which processes are included. In my version of top I use the "all" value for retrieving information for all processes.
Here is what "top" currently looks like (r377 in the trunk):
(sh is the top process because top runs in the same process as the shell, à la BusyBox)
As you can see, some values are -1. These are statistics that I have not gathered yet, either in the kernel or in top. I still have to add memory statistics in the kernel, for example.
Also, the CPU usage reported for top itself is much lower than before (0.5 vs 2.5). This is because the older top code used getrusage() to calculate CPU usage (as change in CPU time divided by change in real time), while this latest version uses the kp_pcpu field in struct kinfo_proc from sysctl(). This field is maintained in the kernel as an exponential moving average of CPU usage rather than a linear moving average, so it will naturally be different from the old method.
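The difference between the two methods can be sketched like this. It is only an illustration: the function names and the decay parameter are made up, and a real kernel would most likely use fixed-point arithmetic and its own decay constant instead of doubles.

```c
/*
 * Exponential moving average: each update keeps a fraction `decay` of the
 * old average and mixes in (1 - decay) of the newest sample.
 */
static double ema_update(double avg, double sample, double decay)
{
    return avg * decay + sample * (1.0 - decay);
}

/*
 * Interval method (what the older top did with getrusage()):
 * change in CPU time divided by change in real time, as a percentage.
 */
static double interval_usage(double cpu_delta, double real_delta)
{
    return real_delta > 0.0 ? 100.0 * cpu_delta / real_delta : 0.0;
}
```

With a decay of 0.5, a process that suddenly starts using 100% CPU is reported at 50, then 75, then 87.5 on successive updates, whereas the interval method reports 100 as soon as a full interval has passed. The two numbers agree only once usage has been steady for a while.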
March 11, 2011
Beta 4
I just released Beta 4 on SourceForge. Read the release notes here.
This release includes initial support for the TI-89 (as I posted about previously), remote terminal (using the "uterm" program), and a user login. Check out the screenshot below to see the user login in action:
March 10, 2011
Port to the TI-89!
I fixed the display and keyboard input to work on the TI-89. There wasn't much to change since I already had key translation tables in place. I basically had to add one more translation table for the Alpha key, and change the existing translation table for the 2nd key so it works for both the 92+ and the 89.
For the display I removed the separator line at the bottom, changed the status line to 4 pixels high, and added one more line to the terminal (40 columns by 16 lines).
Here's a second screenshot showing all of the smaller status indicators (except for "bell" and "busy"):
March 9, 2011
First TI-89 compile
Here's my first compile for the TI-89:
I had to fix a few screen alignment issues, but there wasn't too much work to make it look ok. The key scanning needs some work to work with the 89's different layout, though.
I plan to work almost exclusively with the TI-92+ since it has a larger screen. Most of the pieces I need to write from now on will work the same on both calculators. Once I have a more complete system I can work on the 89 keyboard layout issues.
vfork()!
I got vfork() working well now, and exit() and execve() play nicely too. There were initially some memory leaks when doing a vfork() and then execve(), but those are all plugged now. I also fixed a few other bugs (one in heap.c, the memory manager) along the way.
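For readers who haven't used it, the usual vfork()/execve()/wait pattern looks roughly like this on a POSIX system. This is a generic user-space sketch, not Punix code; run_command() is a hypothetical helper, and it assumes /bin/sh exists.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs `sh -c cmd` in a child process; returns its exit status or -1. */
static int run_command(const char *cmd)
{
    char *const argv[] = { "sh", "-c", (char *)cmd, NULL };
    char *const envp[] = { NULL };
    int status;
    pid_t pid = vfork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        /*
         * Child: it borrows the parent's memory until it calls exec or
         * _exit, so it must do nothing else here.
         */
        execve("/bin/sh", argv, envp);
        _exit(127); /* reached only if execve() failed */
    }

    /* Parent: resumes once the child has exec'd or exited. */
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

On a system without an MMU, vfork() is attractive precisely because the child never needs its own copy of the parent's address space.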
I just released beta 3 demonstrating these awesome (in my mind at least) features.
Download it now. Also read the release notes.
March 7, 2011
Busy indicator
I just added a busy indicator to the status line at the bottom. This indicator shows when the CPU is running a user program (or running in kernel mode on behalf of a user program).
I shifted the other status indicators (shift, 2nd, diamond, etc) to put it next to the battery indicator:
It's not the best-looking status icon, but can you do better? (Really, if you can draw a better-looking one, send it my way!)
Update: I changed the icon to an hourglass which I think looks better. Still, if you want to draw a better one, let me see it! I also could use a better-looking icon for the "hand" status. Mine looks like a claw. :)
Here comes vfork()!
I finally got vfork() working. It's about time...
There is no screenshot this time, since it wouldn't show much, but I know it's working: when I first got it running, init (process 1) was taking up about 100% of the CPU time while the shell (process 2) waited for user input. I could tell init was running in the background because the activity indicators were running full steam. What this means is that we have full multitasking!! The only major component still missing from a complete system is the file system.
I haven't merged the code into the trunk yet, as I still want to fix a few related system calls before doing so (such as wait()). If you're interested in seeing the code before then, check out the "vfork" branch in the SVN repository:
svn co https://punix.svn.sourceforge.net/svnroot/punix/branches/vfork
March 5, 2011
Serious bug. Seriously.
A couple days ago I discovered a somewhat serious bug in the kernel in which it throws an address error exception. The top of the stack frame at the beginning of the exception handler looked like this:
0015
0000
002f
The stack frames for most of the exception and trap handlers on the M68K contain the saved Status Register and saved Program Counter at the top. If this were the case with the address error exception, the Status Register would be 0x0015 (a reasonable value) and the Program Counter would be 0x0000002f (an invalid value).
I spent the longest time trying to figure out how the kernel was jumping to the address 0x0000002f. I should have read the M68000 Programmer's Reference Manual instead. On page 630 it describes each of the exception stack frames. Guess what? The Address Error Exception stack frame contains the Access Address where the Program Counter would otherwise be. This means some code in the kernel was merely accessing the address 0x0000002f, not jumping to it. (The 68000 can access only even addresses for word and long accesses.)
I finally narrowed it down to a function call to sched_run. This function takes a pointer to a process structure. The sched_run function accesses the "p_deadline" member of the "proc" structure, which resides at offset 0x2e. In one place I called the function without an argument (a major reason to use function prototypes!), so sched_run took whatever value was on the stack at the time. That value on the stack always happened to be 0x00000001, so sched_run tried to access a long-word at address 0x00000001+0x2e, or... 0x0000002f. There you go!
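The arithmetic behind that address, and the prototype that would have caught the bug at compile time, can be sketched as follows. The offset 0x2e and the name sched_run come from the description above; the helper functions and the exact prototype are assumptions for illustration.

```c
#include <stdint.h>

struct proc; /* opaque here; the real structure lives in the kernel */

/*
 * With a prototype like this in scope, a call written as `sched_run();`
 * is rejected by the compiler ("too few arguments") instead of silently
 * picking up whatever happens to be on the stack.
 */
void sched_run(struct proc *p);

#define P_DEADLINE_OFFSET 0x2e /* offset of p_deadline, per the post */

/* Address that a long-word access to p->p_deadline would touch. */
static uint32_t deadline_address(uint32_t proc_ptr)
{
    return proc_ptr + P_DEADLINE_OFFSET;
}

/* The 68000 raises an address error on word/long access to odd addresses. */
static int causes_address_error(uint32_t addr)
{
    return (addr & 1u) != 0;
}
```

With the garbage "pointer" 0x00000001, deadline_address() yields 0x0000002f, which is odd and therefore faults; any properly aligned proc pointer gives an even address.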
It turns out that that offending call to sched_run was a bug in itself. At the time it was being called, the process was already scheduled to run, so it didn't need to be scheduled to run again. I ended up removing the line completely, which killed two birds with one stone (or with one "dd" command in vim :D).
During this whole ordeal I trekked across many lines of code and found a small handful of other bugs as well. Some were potential synchronization issues, and others were just code cleanups and simplifications, particularly with the timer and system clock routines. For example, I changed the process interval timers (the ITIMER_REAL timer in setitimer(2)) to rely on a separate monotonic clock instead of the system clock. The result is that, even if you change the system clock forward or backward (or use adjtime() to speed up/slow down the system clock temporarily), the real-time interval timers will now run at exactly the same rate and phase. They are not affected by any changes to the system clock.
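Here is a toy simulation of why tying interval timers to a monotonic clock makes them immune to clock changes. None of these names come from the Punix source; the "clocks" are plain tick counters advanced by hand.

```c
/* Two clocks driven by the same tick source (simulated). */
struct clocks {
    long realtime;  /* wall clock: may be set or adjusted at any time */
    long monotonic; /* only ever advances, one step per tick */
};

/* An interval timer whose deadline is expressed in monotonic ticks. */
struct itimer {
    long expires;
};

/* One timer interrupt: both clocks advance together. */
static void clock_tick(struct clocks *c)
{
    c->realtime++;
    c->monotonic++;
}

/* Setting the wall clock touches only the realtime value. */
static void clock_set_realtime(struct clocks *c, long now)
{
    c->realtime = now;
}

static void itimer_arm(struct itimer *t, const struct clocks *c, long delay)
{
    t->expires = c->monotonic + delay;
}

static int itimer_expired(const struct itimer *t, const struct clocks *c)
{
    return c->monotonic >= t->expires;
}
```

If the deadline were stored in realtime ticks instead, jumping the wall clock forward would fire the timer early and jumping it backward would delay it.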
All in all, I believe the scheduler and timer code is now quite robust and stable. Onward to finishing vfork()...
March 3, 2011
New developments
I know what it looks like: Punix is dead. I swear it's not! In fact I have just released the first Punix beta! You can download it at SourceForge.
Recently I started work on a floating-point unit (FPU) emulator to add support for floating-point instructions, such as fadd, fneg, fsqrt, and more. So far instruction decoding (to get the destination, the source, and the operation) works for the most part, most number format conversions work (but only converting to extended-precision floating point and not the other way around) and a small handful of simple instructions work completely.
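As a rough idea of what the decoding step involves, here is a sketch that pulls the source, destination, and operation out of the second (extension) word of a general FPU instruction. The field layout and opmode values are from my recollection of the 68881/68882 general instruction format and should be checked against the manual; the struct and function names are hypothetical and this is not the emulator's actual code.

```c
#include <stdint.h>

/* Opmode values as I recall them for the three instructions named above. */
enum {
    FPU_FSQRT = 0x04,
    FPU_FNEG  = 0x1a,
    FPU_FADD  = 0x22
};

struct fpu_op {
    int from_ea; /* R/M bit: 1 = source comes from the effective address */
    int src;     /* source FP register, or source data format if from_ea */
    int dst;     /* destination FP register */
    int opmode;  /* operation to perform */
};

/* Split the extension word into its fields (assumed layout, see above). */
static struct fpu_op fpu_decode(uint16_t ext)
{
    struct fpu_op op;

    op.from_ea = (ext >> 14) & 1;    /* bit 14 */
    op.src     = (ext >> 10) & 7;    /* bits 12-10 */
    op.dst     = (ext >> 7) & 7;     /* bits 9-7 */
    op.opmode  = ext & 0x7f;         /* bits 6-0 */
    return op;
}
```

Under this layout, the extension word 0x00a2 would decode as an fadd from FP0 into FP1.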
Another recent change is in userspace. For more than a year the system dropped first to a series of tests, one of them being a simple shell. Now the init process attempts to vfork() in order to spawn additional processes (normally it would run getty for each console), but since vfork() doesn't work yet, it drops straight to the shell. The shell is still very simple, but it can now run about a dozen different commands as built-in applets. Among these applets are the *nix utilities "top", "cat", "true", "false", "clear", "uname", and "date"; however, most of these are severely simplified as this is just demo code. Most of the tests that were run in the previous versions are still available by running "tests" from the shell.
Have fun playing with Punix! If you find any bugs or have any suggestions for improvements, please let me know.