Only a few of these projects are "work projects"; most are related to hobbies or studies. This is due to the fact that most of what I have been working on is not really suited for "publication" on this kind of list.
The following are some of the more interesting projects. It is a mix of ordinary PC software and embedded software. On this page you will find a brief description of each project and a link to more detailed info. Most of the older projects no longer have "more details", since I had a hard disk crash some years back, and the source code and more detailed descriptions were lost. I am leaving the index as it is. It still serves as a display of some of the things I have worked on, and perhaps you will be inspired to play around with something along the same lines.
The programs were made to see if I could implement various effects and features, and to gain knowledge. The software is almost never complete; it is left in an intermediate state, at the point where I had done what I wanted to do with it.
Most of the projects are somewhat ancient. It is not that I have done nothing the last couple of years, but rather that I have not taken the time to describe it. I will try to get something recent up sometime soon.
For further information contact Thomas Greenleaf
Particle based snow simulation in an Eulerian wind field
This is my master's thesis, which received the second highest grade of 10.
The key topics are Smoothed Particle Hydrodynamics (SPH) and the Finite Volume Method (FVM), both for general discretization and for specific numerical modelling of snow and air. Fluid dynamics, a rheological material model and material strength mechanics were introduced before implementing a simulation of wind-blown and resting snow on a GPU using CUDA.
The abstract is presented below.
This is a graduate project at university. It received the highest possible grade of 12.
This project is a part of a larger research project dealing with automated analysis and feedback to patients undergoing physical therapy. I deal specifically with how a previously recorded correct exercise can be matched with a current recording of a patient attempting to perform that same exercise.
For Danish Broadcasting (DR) I implemented a displayserver. It is a computer program which takes commands over a TCP/IP connection from other computers. It allows for easy generation and control of interactive on-screen graphics. This has been used in a number of game shows for both DR and other TV stations. The last use of the system, to my knowledge (early 2011), was for the X-Factor finale 2011. In the image above, the graphics at the bottom are displayed by the server and faded in and out. Nothing fancy really.
The basic design is simple. Commands are received over the network as human-readable text, matched by regular expressions, and turned into method calls. The methods are then called through reflection. The commands can either create new graphics objects in 3D or control various properties of those objects in order to generate graphics with videos playing, still images or text. The generated graphics are then transferred to a dedicated SDI video card which outputs a key and fill signal. This is in turn sent to an external graphics mixer that controls how the studio cameras' signal should be blended with that of the displayserver.
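The parse-then-reflect idea can be sketched in a few lines. This is only an illustration: the command syntax, the `DisplayServer` class and its method names are all hypothetical, since the real command set is not published.

```python
import re


class DisplayServer:
    """Toy stand-in for the server; class and method names are hypothetical."""

    def __init__(self):
        self.objects = {}

    def create_text(self, name, content):
        self.objects[name] = {"type": "text", "content": content, "alpha": 1.0}

    def set_alpha(self, name, value):
        self.objects[name]["alpha"] = float(value)


# Each pattern is paired with the name of the method it should trigger.
# Named groups in the pattern become keyword arguments of that method.
COMMANDS = [
    (re.compile(r'^create text (?P<name>\w+) "(?P<content>[^"]*)"$'), "create_text"),
    (re.compile(r'^set alpha (?P<name>\w+) (?P<value>[0-9.]+)$'), "set_alpha"),
]


def dispatch(server, line):
    """Match one text command and call the corresponding method by name."""
    for pattern, method_name in COMMANDS:
        match = pattern.match(line.strip())
        if match:
            method = getattr(server, method_name)  # reflection: look up by name
            method(**match.groupdict())
            return True
    return False  # unknown command
```

With this layout, adding a new command means adding one method and one pattern, without touching the network or dispatch code.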
There are currently no more details since this is a proprietary piece of software.
This is a graduate project at university. It received the second highest grade of 10.
In this project I am investigating N-body SPH modelling in astronomy in general terms. A few simulations are implemented for real-time demonstration. They are the collapse of interstellar gas clouds, the destruction of a soft moon inside the Roche limit and the creation of a moon by a giant impact between two planets.
It has a lot about CUDA in it, though CUDA was not the topic but more the tool.
I looked at a new way of simulating large crowds as a fluid-like particle system. Most other crowd simulations treat each individual agent as a separate entity, while I tried to consider the crowd more as a whole, where each agent in the crowd would be controlled by forces such as steering, avoidance and coherence as well as a general goal destination. Those forces would be calculated in a way similar to the forces at work inside a fluid.
The possibly novel element in this project was how I calculated those forces by 2D convolution over the entire domain rather than by calculating nearest neighbors for each individual agent.
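To make the convolution idea concrete, here is a minimal sketch of one such force. It assumes agents are splatted into a density grid, the grid is blurred with a small kernel, and the avoidance force is the negative gradient of the blurred density. The kernel, the grid resolution and all function names are assumptions for illustration, not taken from the paper.

```python
# 3x3 blur kernel (an assumed choice; any smoothing kernel would do).
BLUR = [[1 / 16, 2 / 16, 1 / 16],
        [2 / 16, 4 / 16, 2 / 16],
        [1 / 16, 2 / 16, 1 / 16]]


def density_grid(agents, width, height):
    """Count agents per grid cell; agents are (x, y) positions inside the grid."""
    grid = [[0.0] * width for _ in range(height)]
    for ax, ay in agents:
        grid[int(ay)][int(ax)] += 1.0
    return grid


def convolve2d(grid, kernel):
    """Same-size 2D convolution with zero padding (odd-sized kernel)."""
    h, w = len(grid), len(grid[0])
    kh, kw = len(kernel), len(kernel[0])
    oy, ox = kh // 2, kw // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0.0
            for j in range(kh):
                for i in range(kw):
                    yy, xx = y + oy - j, x + ox - i
                    if 0 <= yy < h and 0 <= xx < w:
                        total += grid[yy][xx] * kernel[j][i]
            out[y][x] = total
    return out


def avoidance_field(density):
    """Per-cell force (fx, fy) pointing from crowded cells towards emptier ones."""
    smooth = convolve2d(density, BLUR)
    h, w = len(smooth), len(smooth[0])

    def at(x, y):  # clamp lookups at the border
        return smooth[min(max(y, 0), h - 1)][min(max(x, 0), w - 1)]

    return [[(-(at(x + 1, y) - at(x - 1, y)) / 2,
              -(at(x, y + 1) - at(x, y - 1)) / 2) for x in range(w)]
            for y in range(h)]
```

The cost of this approach depends on the grid size rather than on the number of agents, and each agent simply samples the field at its own cell instead of searching for neighbors.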
More details can be read in the paper.
It is a simulation of water of different viscosities and with and without uniform gravity.
More details can be read in the paper.
This is my bachelor project, which was written together with Axel E Jensen. It received the highest possible grade of 12.
We presented a real-time method for procedurally generating huge planetary landscapes with continuous level of detail. This approach enabled us to produce interesting planets with a small or nonexistent pre-generated dataset, which in turn could be used to visualize an endless number of different planets. Where previous work in landscape generation has generally been either purely procedural or purely designed, we devised a method which allows for a seamless integration of design into the computer generated world.
Another novelty is the decoupling of the mesh optimization from the rendering. While a high frame-rate is a requirement for fast and smooth animation, the mesh optimization can run in the background at a slower pace. We implemented a system with different update frequencies for rendering and mesh optimization, to let us prioritize the different tasks, and to distribute the workload on multiple processors.
A method to generate natural-looking river systems in the procedurally generated terrain is explored and implemented. While we found that actual real-time procedural river generation was very difficult, one could use a fast preprocessing step with correct river flow calculations to produce rivers which could later be placed inside the terrain.
A quite fast sudoku solver for 16x16 grids. It has a nice GUI which lets you enter your own problems and solve them with a simple depth-first search or with a heuristic-assisted depth-first search.
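A heuristic-assisted depth-first search can be sketched as follows. The heuristic used here, always filling the empty cell with the fewest remaining candidates, is an assumption; the original solver's heuristic is not described. The function works for any box size, so `box=4` gives the 16x16 case.

```python
def solve(grid, box):
    """Solve a (box*box) x (box*box) sudoku in place; 0 marks an empty cell."""
    n = box * box

    def candidates(r, c):
        # Values already used in the row, the column and the box of (r, c).
        used = set(grid[r]) | {grid[i][c] for i in range(n)}
        br, bc = r - r % box, c - c % box
        used |= {grid[i][j] for i in range(br, br + box)
                 for j in range(bc, bc + box)}
        return [v for v in range(1, n + 1) if v not in used]

    # Heuristic: pick the empty cell with the fewest candidates.
    best = None
    for r in range(n):
        for c in range(n):
            if grid[r][c] == 0:
                opts = candidates(r, c)
                if best is None or len(opts) < len(best[2]):
                    best = (r, c, opts)
    if best is None:
        return True  # no empty cells left: solved

    r, c, opts = best
    for v in opts:  # depth-first: try each candidate and recurse
        grid[r][c] = v
        if solve(grid, box):
            return True
    grid[r][c] = 0  # undo and backtrack
    return False
```

Replacing the cell selection with "first empty cell" turns this back into the plain depth-first search, which makes the two modes easy to compare.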
A raytracer was implemented as a demo. It supports textures for both diffuse color and reflectivity as well as various mapping methods. Constructive Solid Geometry, as described in more detail here, is implemented.
Most compression schemes deal with isolated data, while differential compression is about compressing data which relates to some other known data. This can be used to compress a newer version of a document by basing the compression on an older version of the same document. This is also known as incremental compression and, depending on its use, as incremental backup.
I was hired by a company providing backups, to come up with a usable method, which would allow a user to execute incremental backups, even though that user no longer had the older version of the data. That data was stored on a remote server, and it should not have to be transmitted to the user in order for him to calculate the difference. Instead a hashing method was used.
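One simple way hashing can replace the old data is sketched below. It assumes fixed-size blocks and SHA-256, where the user only needs the list of block hashes of the old version and uploads just the blocks whose hash differs. This is an illustrative scheme, not necessarily the method that was delivered; the block size and function names are assumptions.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size


def block_hashes(data, block_size=BLOCK_SIZE):
    """Hash every fixed-size block; this list stands in for the old data."""
    return [hashlib.sha256(data[i:i + block_size]).hexdigest()
            for i in range(0, len(data), block_size)]


def changed_blocks(new_data, old_hashes, block_size=BLOCK_SIZE):
    """Return {block_index: bytes} for blocks that differ from the old version."""
    delta = {}
    for index, digest in enumerate(block_hashes(new_data, block_size)):
        if index >= len(old_hashes) or old_hashes[index] != digest:
            start = index * block_size
            delta[index] = new_data[start:start + block_size]
    return delta
```

A fixed block grid copes poorly with insertions, because every block after the insertion point shifts and appears changed. Schemes in the style of rsync address that with a rolling checksum.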
A flock of birds, a herd of bison or a school of fish all display flocking behavior. Even though there is no designated leader, they all seem to want to go to the same place and they act as one unit. Boids is an example of flocking defined by three basic desires. I implemented a demo which would let me play around with the different defining parameters for a flock of boids.
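In Reynolds' classic boids model the three rules are separation, alignment and cohesion. The sketch below computes the combined steering vector for one boid; the neighborhood radius and the three weights are the kind of "defining parameters" one could play with, and their default values here are arbitrary assumptions.

```python
def boid_steering(index, positions, velocities, radius=5.0,
                  w_sep=1.5, w_align=1.0, w_coh=1.0):
    """Steering vector for one boid from its neighbors within `radius`."""
    px, py = positions[index]
    vx, vy = velocities[index]
    sep_x = sep_y = 0.0      # separation: push away from close neighbors
    sum_vx = sum_vy = 0.0    # alignment: match the average heading
    sum_px = sum_py = 0.0    # cohesion: move towards the local center
    count = 0
    for j, (qx, qy) in enumerate(positions):
        if j == index:
            continue
        dx, dy = qx - px, qy - py
        dist2 = dx * dx + dy * dy
        if dist2 == 0.0 or dist2 > radius * radius:
            continue
        count += 1
        sep_x -= dx / dist2  # stronger push the closer the neighbor is
        sep_y -= dy / dist2
        sum_vx += velocities[j][0]
        sum_vy += velocities[j][1]
        sum_px += qx
        sum_py += qy
    if count == 0:
        return (0.0, 0.0)    # no neighbors: keep flying straight
    align = (sum_vx / count - vx, sum_vy / count - vy)
    coh = (sum_px / count - px, sum_py / count - py)
    return (w_sep * sep_x + w_align * align[0] + w_coh * coh[0],
            w_sep * sep_y + w_align * align[1] + w_coh * coh[1])
```

Adding the returned vector, scaled by a time step, to each boid's velocity once per frame is enough to get recognizable flocking.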
A simple pressure wave simulator which I made to see if my idea for a basic simulation would work in a way that looked realistic. Later comparisons with more "correct" simulations have shown that the simple method performs quite well actually.
The principal idea was to represent a 2D scene as a grid of point objects, each with a certain mass. Those objects were linked to their neighbors by virtual springs. The pressure at any given point would be represented by the point mass' position along the third dimension. That means the points could be moved perpendicular to the 2D plane, and their distance from the plane would represent the pressure. If a point was moved out, then it would be pulled back in by its neighbors, while they in return would be pulled out towards it. This would generate a wave-like motion as seen in the videos.
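A minimal sketch of one time step of such a spring grid, assuming unit masses, four neighbors per point, a fixed border and a small damping factor. The stiffness and damping values are assumptions chosen to keep the update stable.

```python
def step(height, velocity, stiffness=0.25, damping=0.999):
    """Advance the grid one time step in place (border points stay fixed)."""
    h, w = len(height), len(height[0])
    # First pass: spring forces from the four neighbors update the velocities.
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            neighbours = (height[y - 1][x] + height[y + 1][x] +
                          height[y][x - 1] + height[y][x + 1])
            force = stiffness * (neighbours - 4 * height[y][x])
            velocity[y][x] = (velocity[y][x] + force) * damping
    # Second pass: move every interior point along the third dimension.
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            height[y][x] += velocity[y][x]
```

Displacing a single point and calling `step` repeatedly produces a ring that spreads outwards, which is the behavior described above: the displaced point is pulled back while its neighbors are pulled out.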
I wanted to try making an agent which would be able to take a safe path through a threat landscape. The initial threat would be defined as greater the more visible a certain position on a 2D map was. If the position could be seen from many other locations then it would be less safe than a position which could only be seen from a few locations.
Later I will add extra threat for positions which are actually visible from the point of view of "The Enemy". It will in fact be less heavy on the calculations than this global visibility testing is.
In short, I test visibility for every position in the 2D map against every other position on the map. If the point can be seen, then the threat increases by one. A 100 by 100 map has 10,000 positions, so that is 10,000*9,999 visibility tests. This can be cut in half if we assume that if A can see B, then B can see A. What I actually do is test every position against a number of other positions which I pick at random, a so-called Monte Carlo method. For the 100 by 100 map I could decide to test each position against only 10 other positions instead of all 9,999 others.
The maps are in fact quite a bit larger than 100 by 100, so picking some lower number can speed things up by an enormous amount, and the fact that the threat assessment is not 100% correct just adds a little realism to the picture. It is worth noting that this global visibility testing can be precalculated and stored along with the map.
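The sampling scheme can be sketched like this. It assumes the map is a grid of wall/open cells and that visibility is tested by walking a Bresenham line between two cells; the actual visibility test used in the project is not described, so both are assumptions.

```python
import random


def line_of_sight(walls, a, b):
    """True if no wall cell lies strictly between a and b on a Bresenham line."""
    (x0, y0), (x1, y1) = a, b
    dx, dy = abs(x1 - x0), abs(y1 - y0)
    sx = 1 if x0 < x1 else -1
    sy = 1 if y0 < y1 else -1
    err = dx - dy
    x, y = x0, y0
    while (x, y) != (x1, y1):
        if (x, y) != (x0, y0) and walls[y][x]:
            return False
        e2 = 2 * err
        if e2 > -dy:
            err -= dy
            x += sx
        if e2 < dx:
            err += dx
            y += sy
    return True


def threat_map(walls, samples, rng):
    """Estimate, per open cell, the fraction of sampled cells that can see it."""
    h, w = len(walls), len(walls[0])
    open_cells = [(x, y) for y in range(h) for x in range(w) if not walls[y][x]]
    threat = {}
    for cell in open_cells:
        seen = 0
        for _ in range(samples):  # Monte Carlo: random subset, not all cells
            other = rng.choice(open_cells)
            if other != cell and line_of_sight(walls, cell, other):
                seen += 1
        threat[cell] = seen / samples
    return threat
```

Because the result depends only on the map, it can be computed once, for example with `threat_map(walls, 10, random.Random(0))`, and stored alongside the map as noted above.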
At one time, a friend of mine was building a small trebuchet (medieval catapult), and I wanted to see if I could calculate the optimal position at which to release the ball, and the range potential, based on arm mass and length, counterweight mass, ball mass and the length of the string carrying the ball. Having recently done some cloth simulation using constraints, I wanted to try this without using ordinary rigid body physics. Instead I wanted to build the contraption using constraints and see what it would do. The idea worked OK, and we had some input for the actual model. When taking air resistance into account, the simulated range was not too far off. I do not remember exactly how precise it was, since I am writing this some time later, but it was in the neighborhood.
This is a constraint-based simulation of a piece of cloth affected by various forces. The underlying code is a hierarchy of objects, which implement force, weight and constraint. A force can be a simple force such as gravity or it can be a more complex pseudo random force like the wind. The fabric of the cloth consisted of weights joined by constraints. The constraints could be position or distance based, so they could keep a weight at an exact position or at a certain distance from related weights. See these still images or download samples at the "more detail" link below.
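The interplay of forces, weights and constraints can be shown on a single thread of cloth. The sketch below assumes Verlet integration and iterative constraint relaxation, a common combination for this kind of simulation; the original code's integrator is not stated. The first weight is held by a position constraint and the rest are joined by distance constraints.

```python
import math


def simulate_chain(n=5, rest=1.0, steps=200, dt=0.02, gravity=-9.81,
                   iterations=10):
    """Simulate a horizontal chain of weights, pinned at one end, under gravity."""
    pos = [[i * rest, 0.0] for i in range(n)]
    prev = [p[:] for p in pos]
    for _ in range(steps):
        # Verlet step: velocity is implied by the previous position.
        for i in range(1, n):  # weight 0 is pinned (position constraint)
            x, y = pos[i]
            vx, vy = x - prev[i][0], y - prev[i][1]
            prev[i] = [x, y]
            pos[i] = [x + vx, y + vy + gravity * dt * dt]
        # Relax the distance constraints a few times per step.
        for _ in range(iterations):
            for i in range(n - 1):
                ax, ay = pos[i]
                bx, by = pos[i + 1]
                dx, dy = bx - ax, by - ay
                dist = math.hypot(dx, dy)
                if dist == 0.0:
                    continue
                corr = (dist - rest) / dist
                if i == 0:  # pinned end: only the free weight moves
                    pos[i + 1] = [bx - dx * corr, by - dy * corr]
                else:       # both weights move half of the correction
                    pos[i] = [ax + dx * corr / 2, ay + dy * corr / 2]
                    pos[i + 1] = [bx - dx * corr / 2, by - dy * corr / 2]
    return pos
```

A sheet of cloth uses the same loop with the weights arranged in a grid and distance constraints to the horizontal and vertical neighbors. Wind would be one more term added next to gravity.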
An Intel 386 real-mode emulator, complete except for floating point support, which will be added later. It can load code compiled with any 16-bit compiler for DOS: Pascal, C, C++, assembler or anything else. It runs the machine code directly, so as long as a program consists of 16-bit real-mode instructions, it will run. The CPU is a C++ object, so you can add any number of CPUs to a system and interconnect them on a common bus. On an AMD XP 1600+ I have been running the current not-so-optimized version at 50 MHz.
The CPU object can disassemble and display the code that is loaded into it, along with its registers and memory. Breakpoints can be set, and you can step over functions instead of following CALL instructions into them. The project which used the CPU object can load both plain COM files and EXE files with relocations. It cannot, however, display the source code (COFF and OMF don't look easy to implement), so what you see when observing the code is disassembled machine code. If you are an expert in COFF symbolic debug info, then I would really like to hear from you. I would like to implement symbolic debugging in the emulator, but I haven't been able to get started on COFF yet.
In the test project the CPU is connected to a bus object, which in turn is connected to hardware objects. When the CPU executes an instruction that reads or writes a port, it calls the port read/write method on the bus object, which passes the request on to the connected hardware. The hardware which matches the given port address receives data from or returns data to the bus, depending on whether it was a write or a read. This way an entire computer can be constructed from various components. The emulator is used in my robot project.
Simulated robots in a simulated world doing physically simulated things. This is a current and ongoing project, which is all about making a framework for testing different robot designs and controller code. Robots consist of various hardware components such as a body, engines, a ranger, a compass etc., and one or more processors to control all this. The CPU loads code designed to control the robot. The code reads and writes various ports to communicate with the hardware, which in turn communicates with the world. An example could be reading port 0x12, which could be mapped to a register on the sonar containing the latest distance measurement to objects in front of the sonar. After reading this value, your code could write to port 0x44, which could be mapped to the engine's speed register. Writing a value of 0 could turn the engine off and stop the robot before a collision. The physical world is simulated using a Runge-Kutta 4 (RK4) integrator and constraints for rigid body objects. The world runs at 100 Hz and the CPU currently runs at 1 MHz. The world and the objects in it are rendered using OpenGL without too many fancy effects.
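The port example can be sketched with a small bus and two hardware objects. Everything here is illustrative: the class names, the `handles`/`read`/`write` interface, the value returned for unmapped ports and the stop distance are assumptions, and the controller is written in Python rather than as 16-bit machine code running on the emulated CPU.

```python
class Bus:
    """Routes port reads and writes to whichever device claims the port."""

    def __init__(self):
        self.devices = []

    def attach(self, device):
        self.devices.append(device)

    def read_port(self, port):
        for device in self.devices:
            if device.handles(port):
                return device.read(port)
        return 0xFF  # assumed value for a port nobody answers

    def write_port(self, port, value):
        for device in self.devices:
            if device.handles(port):
                device.write(port, value)


class Sonar:
    PORT = 0x12  # distance register, as in the example above

    def __init__(self, distance):
        self.distance = distance

    def handles(self, port):
        return port == self.PORT

    def read(self, port):
        return self.distance

    def write(self, port, value):
        pass  # read-only register


class Engine:
    PORT = 0x44  # speed register, as in the example above

    def __init__(self, speed):
        self.speed = speed

    def handles(self, port):
        return port == self.PORT

    def read(self, port):
        return self.speed

    def write(self, port, value):
        self.speed = value


def controller_step(bus, stop_distance=10):
    """One pass of a hypothetical controller: stop if an obstacle is close."""
    if bus.read_port(Sonar.PORT) < stop_distance:
        bus.write_port(Engine.PORT, 0)
```

Because the controller only sees port numbers, the same control code would work unchanged if the sonar were swapped for a different ranger mapped to the same port.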
For the Atmel 8515 MCU I made embedded software which could generate a simple static test image on a standard VGA monitor. The resolution was limited to 64*48 pixels in 8 colors, due to the lack of a DAC and the relatively slow speed of the available MCU, which ran at 8 MHz.
Encapsulation of WinSock sockets (before 2004)
A group of classes which encapsulate much of the functionality of Winsock sockets. For ease of use, classes were defined for UDP and TCP sockets based on an abstract socket. Class methods for events were defined as virtual, so it was easy to override them in concrete descendant classes and simply implement the needed behavior in the methods themselves. This gave a cleaner design than plain callback functions.
An FTP server with a virtual file system stored in a relational Access database. In testing it has served a thousand simultaneous clients with good speed. The file system included a security scheme that defined which folders and files any given user could see, enter, delete and write.
A simulator of objects moving in a gravitational field. Any number of objects with varying mass and velocity can be simulated as they orbit their common center of mass. On collision, objects are joined to form a single object with a mass equal to the sum of the original objects' masses and a new motion vector calculated as a weighted average of those of the previous objects. Unlike the constraint-based simulation in the cloth project, this simulation used the Runge-Kutta 4 (RK4) integrator to solve the equations. I started using Euler, moved on to midpoint (RK2) and ended up with RK4 and Verlet. RK4 is more accurate than Verlet, but it is also more computationally costly, so I implemented both in this system to make a "real world" comparison.
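The accuracy difference between the two integrators is easy to reproduce on a toy problem. The sketch below swaps gravity for a unit-mass spring (acceleration `-x`), purely because that system has the exact solution `cos(t)` to compare against. It uses the velocity form of Verlet; which Verlet variant the simulator used is not stated.

```python
import math


def accel(x):
    """Acceleration of a unit mass on a unit spring (stand-in for gravity)."""
    return -x


def rk4_step(x, v, dt):
    """One Runge-Kutta 4 step: four acceleration evaluations per step."""
    k1x, k1v = v, accel(x)
    k2x, k2v = v + 0.5 * dt * k1v, accel(x + 0.5 * dt * k1x)
    k3x, k3v = v + 0.5 * dt * k2v, accel(x + 0.5 * dt * k2x)
    k4x, k4v = v + dt * k3v, accel(x + dt * k3x)
    return (x + dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6,
            v + dt * (k1v + 2 * k2v + 2 * k3v + k4v) / 6)


def verlet_step(x, v, dt):
    """One velocity Verlet step: two acceleration evaluations as written."""
    a = accel(x)
    x_new = x + v * dt + 0.5 * a * dt * dt
    v_new = v + 0.5 * (a + accel(x_new)) * dt
    return x_new, v_new


def position_error(step, dt=0.1, steps=100):
    """Integrate from x=1, v=0 and compare with the exact solution cos(t)."""
    x, v = 1.0, 0.0
    for _ in range(steps):
        x, v = step(x, v, dt)
    return abs(x - math.cos(dt * steps))
```

Calling `position_error(rk4_step)` and `position_error(verlet_step)` gives a direct comparison at equal step size. For a fair cost comparison one would also count acceleration evaluations, since RK4 needs about four per step where Verlet can get by with one when the previous evaluation is reused.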
Another embedded software project. Here I connected an ordinary IDE hard disk from a PC to an Atmel 8515 MCU. The MCU had almost all its ports connected to the IDE cable, so register and data reads and writes could be done with ease. A file system like FAT was not implemented, but the software could start the hard disk, select the slave or master drive (when two disks were connected to the same cable), get status, format tracks, select a sector and read and write its data. The disks could be given a spin-down command to reduce noise and save power.
A standard PC keyboard was connected to an Atmel 8515 MCU and controlled by it. The keyboard communicates over two lines: one is the clock and the other is data. The software in the MCU basically responded to a falling edge on the clock line and sampled the bit on the data line. 11 clocks were received per byte: a start bit, eight data bits, a parity bit and a stop bit. Using a simple command interpreter, the MCU could be controlled from the keyboard to turn various pins on its ports on and off. This project, like the hard disk one, was meant to be part of a larger homebuilt computer.
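Decoding one keyboard frame can be sketched as follows, assuming the standard PS/2 (AT) layout of 11 bits: a start bit (0), eight data bits with the least significant bit first, an odd parity bit and a stop bit (1). The function takes the bits already sampled on the falling clock edges; on the MCU the sampling itself would happen in an interrupt routine.

```python
def decode_frame(bits):
    """Decode 11 sampled bits into one byte, checking framing and parity."""
    if len(bits) != 11:
        raise ValueError("a frame is exactly 11 bits")
    start, data_bits, parity, stop = bits[0], bits[1:9], bits[9], bits[10]
    if start != 0 or stop != 1:
        raise ValueError("framing error")
    # Odd parity: data bits plus parity bit must contain an odd number of ones.
    if (sum(data_bits) + parity) % 2 != 1:
        raise ValueError("parity error")
    # Data arrives least significant bit first.
    return sum(bit << i for i, bit in enumerate(data_bits))
```

For example, the frame `[0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 1]` carries the data bits of `0x1C` with a parity bit of 0, since the data already contains an odd number of ones.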
A standard PCI NE2000 compatible ethernet card was given a basic DOS driver. Using interrupt vector number 0xAF, the driver gave access to a few functions for the card. With different values in register AH it was possible to initialize the card (start it, set up buffers etc.), get its MAC address and send ethernet packets over the LAN. No protocol stack was placed on top of ethernet. Only raw ethernet packets could be transmitted and received. Software was later made which used this driver to send UDP packets.
Following my experience with the card from "NE2k Ethernet interface driver for DOS" I acquired an ISA NE2K ethernet card and connected it to an Atmel 8515 MCU. The card was hardwired to IO address 0x300, and the read/write pins and the address and data lines were connected directly to the ports on the MCU. The software could now control the card entirely through reads and writes to its various internal registers. The first version of the software was ported directly from the DOS driver code. Next came an implementation of ARP to exchange IP information on the LAN. IP packets were also implemented, with UDP and ICMP on top.
This was all very simple compared to implementing TCP on IP. I did do it, but my limited stack only supported one TCP connection at a time, and it was not perfectly stable. It did however support both server and client mode. The server mode was used for an absolutely basic HTTP server. This server simply responded to any data from a client by returning an HTTP answer with a single HTML formatted text, which reported the status of the 8 input bits on port D of the MCU. Getting this up and running was a complex task, especially considering I had only 512 bytes of RAM to hold the variable data of the TCP stack. The code itself was of course in flash.
The end result was that I could connect my little embedded project to my home LAN, and after opening the router and pointing a port towards the IP I had given my project, you could connect to it from anywhere in the world using an ordinary web browser. Imagine what this would have grown into had I been blessed with more than merely 512 bytes of RAM. I will add a latch and some external RAM to the MCU some day and work a little more on this project.
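The kind of fixed answer such a server returns can be sketched as a single function that turns the port value into a complete HTTP response. The page layout and header choice here are assumptions; the original page is not shown.

```python
def status_page(port_d):
    """Build a complete HTTP/1.0 response reporting 8 input bits."""
    rows = "".join(f"bit {i}: {(port_d >> i) & 1}<br>" for i in range(8))
    body = f"<html><body><h1>Port D</h1>{rows}</body></html>".encode("ascii")
    header = ("HTTP/1.0 200 OK\r\n"
              "Content-Type: text/html\r\n"
              f"Content-Length: {len(body)}\r\n"
              "Connection: close\r\n"
              "\r\n").encode("ascii")
    return header + body
```

Since the response does not depend on what the client sent, the server never has to parse the request, which is a useful property when only a few hundred bytes of RAM are available.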
A 16-bit real-mode DOS program was made to implement reading FAT12 and FAT16 disks using BIOS routines for reading and writing absolute sectors. To learn about the FAT system, I made this program, which read the boot sector to get disk information, read the root directory, and through this could look up files and directories in the FAT data and read files into memory one cluster at a time. A simple API was made to duplicate the behavior of FindFirst and FindNext as they exist in the Windows API. I didn't consider file size or attributes in my functions; I only looked at the filename and whether the entry was a directory or a real file. When writing files I simply looked through the FAT to find vacant space to write into. If I ran out of space before the file was written entirely, then I ran back through the FAT and marked the areas I had used as free again.
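Reading a file one cluster at a time comes down to following its chain through the FAT. The sketch below shows this for FAT12, where each entry is 12 bits, so two entries share three bytes. It assumes the FAT has already been read into memory as a byte string; the function names are illustrative.

```python
def fat12_entry(fat, cluster):
    """Return the 12-bit FAT entry for `cluster`."""
    offset = cluster + cluster // 2           # 1.5 bytes per entry
    pair = fat[offset] | (fat[offset + 1] << 8)
    # Odd clusters use the upper 12 bits, even clusters the lower 12 bits.
    return pair >> 4 if cluster & 1 else pair & 0x0FFF


def cluster_chain(fat, first_cluster):
    """List the clusters of a file, starting at its first cluster."""
    chain = []
    cluster = first_cluster
    # Values from 0xFF0 up are reserved, bad-cluster or end-of-chain markers.
    # The length check guards against a corrupt FAT that loops.
    while 2 <= cluster < 0xFF0 and len(chain) < 4096:
        chain.append(cluster)
        cluster = fat12_entry(fat, cluster)
    return chain
```

The starting cluster comes from the file's directory entry. FAT16 is simpler, since every entry is a plain 16-bit little-endian word.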
Building on my experience with FAT12, I made a boot sector in assembler which could recognize that it was loaded at 07C0:0000 and run from there to load another program located on disk. The program (the command interpreter) could be located in any sectors on the floppy, so it could only be found by reading the root directory and then reading data into memory cluster by cluster. After loading, the boot code made a jump to LoadSegment+LoadOffset/0x10:0 so the program could run at offset zero. The command interpreter was coded in C and linked with a small ASM initializer, which in this case simply called main(). The interpreter supported only the DIR and CD commands, whose names and parameters I duplicated from DOS.
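The jump target follows from how real-mode addresses work: the physical address is the segment times 16 plus the offset, so the same byte can be reached at offset zero by folding the offset into the segment. A small sketch of that arithmetic, with illustrative function names:

```python
def physical(segment, offset):
    """20-bit physical address of segment:offset in real mode."""
    return ((segment << 4) + offset) & 0xFFFFF


def normalise(segment, offset):
    """Return (segment, 0) addressing the same byte as segment:offset.

    Only possible when the offset is a multiple of 16 (one paragraph).
    """
    if offset % 0x10:
        raise ValueError("offset must be paragraph-aligned")
    return ((segment + (offset >> 4)) & 0xFFFF, 0)
```

For example, `physical(0x07C0, 0x0000)` and `physical(0x0000, 0x7C00)` both give `0x7C00`, the address where the BIOS places the boot sector.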
Using the boot code and command interpreter from "Boot sector and command interpreter" I made a multiprocess operating system. More or less. Given the command RUN [filename] [option], the command interpreter would load the given file into memory and execute it. If the option was set to multi, then the interpreter didn't start the loaded program. Instead it added it to a list of running programs. Interrupt 0x1C (the timer) started a routine which stored all registers, looked at the list of programs, found the next program to run, read its registers back in and then did an interrupt return, which effectively made the program run again as if it had not been interrupted. Up to four programs could run like this. Naturally they needed to behave nicely and not mess up the display too much, since only the registers were saved when programs were switched. I did make tests where three programs were running at "the same" time, each printing counting numbers at different positions on screen.
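The timer routine amounts to a round-robin scheduler over saved register sets. Here is a minimal model of that logic, with registers represented as dictionaries; the register names and the `Scheduler` class are illustrative, and the real routine would of course be a few dozen lines of assembler.

```python
class Scheduler:
    MAX_TASKS = 4  # the limit mentioned above

    def __init__(self):
        self.tasks = []     # one saved register set per program
        self.current = -1   # index of the running program, -1 before the first

    def add(self, entry_ip):
        """Register a loaded program with its entry point."""
        if len(self.tasks) >= self.MAX_TASKS:
            raise RuntimeError("task table full")
        self.tasks.append({"ip": entry_ip, "ax": 0})

    def timer_tick(self, live_registers):
        """Save the interrupted program, return the next program's registers."""
        if not self.tasks:
            return live_registers
        if self.current >= 0:
            self.tasks[self.current] = dict(live_registers)
        self.current = (self.current + 1) % len(self.tasks)
        return dict(self.tasks[self.current])
```

Returning the saved set corresponds to reloading the registers before the interrupt return, so each program resumes exactly where it was interrupted.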
This was a test of dynamic landscape generation based on random numbers. The general idea is to subdivide triangles into smaller triangles and add extra detail on the smaller triangles based on random numbers and the details on the higher level. I rendered the triangle mesh in OpenGL and color coded each vertex based on its "altitude" above "sea level". The mesh was calculated to a certain detail level. I didn't calculate a viewer-dependent level of detail, but gave each part of the terrain the same priority. This meant that a lot of time was wasted calculating detail on the far side of mountains. An image of a calculated terrain can be seen here.
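One subdivision level can be sketched as follows, assuming each triangle is split into four at its edge midpoints and each new midpoint gets the average height of the edge's endpoints plus a random offset that shrinks with every level. The roughness factor and the data layout are assumptions.

```python
import random


def subdivide(triangles, heights, level, rng, roughness=0.5):
    """Split every triangle into four; `heights` maps (x, y) to altitude."""
    scale = roughness ** level  # finer levels add smaller details
    result = []

    def midpoint(a, b):
        m = ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2)
        if m not in heights:  # an edge shared by two triangles is displaced once
            heights[m] = (heights[a] + heights[b]) / 2 + rng.uniform(-scale, scale)
        return m

    for a, b, c in triangles:
        ab, bc, ca = midpoint(a, b), midpoint(b, c), midpoint(c, a)
        result += [(a, ab, ca), (ab, b, bc), (ca, bc, c), (ab, bc, ca)]
    return result
```

Calling this repeatedly with `level` set to 1, 2, 3 and so on refines the whole mesh uniformly, which is exactly the behavior described above: every part of the terrain gets the same detail whether or not it is visible.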