Thanks to the folks from #cyberhack, especially shadowdaemon, for
getting this project started.

CYBERHACK DRAFT 0x00000003

For the time being, this document is Copyright (C) Tuomo Petteri
Venäläinen 2011. Fear not though, we started this project for
educational and recreational purposes, so the related code will have an
open license of some sort. :)

Personally, I hope this will teach people some programming and machine
basics and, perhaps most importantly, be fun. The embedded programming
language attempts to be a bit easier to grasp than that of Core Wars;
this should make it more suitable for the younger generations among us.

NOTE: this is kind of formal language; if I say 'we use', it doesn't
mean we have to. Comments and suggestions are more than welcome and this
text should be considered an early draft.

Cyberhack Programming Language (CHPL)
-------------------------------------

This language is a tool for program-on-program combat in the game
Cyberhack. The basic idea is that when two players start a battle, each
can choose a program, i.e. a set of instructions with which to
investigate and manipulate memory. The goal is to get more of your
numerical ID written to memory than the opponent gets of hers; the
memory size is fixed and known to the player in the beginning. The
string length can be variable.

The game executes competing programs - there can be more than two - one
instruction per turn or, in the case of multicore systems when you have
more than one thread running, as many threads as there are cores get a
turn. The threads are scheduled round-robin, i.e. by default in the
order indicated by their creation time. This order of scheduling can be
modified with priorities by the program.

Pseudo-Computers
----------------

The code executed by the machines may remind some, even many, of you of
games such as Core Wars. The language allows writing self-modifying
code that can clone itself by means of fork() and vfork(), familiar to
Unix programmers.
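The turn order described earlier (one instruction per turn, round-robin
in creation order, as many threads per turn as there are cores) could be
sketched roughly as below. This is only an illustration in Python, not
game code: the function name run_turns and the thread names are made up,
priorities are left out, and it assumes the rotation simply continues
where the previous turn stopped when there are more threads than cores.

```python
from collections import deque


def run_turns(thread_ids, cores, turns):
    """Return, for each turn, the threads that execute one instruction.

    thread_ids is assumed to be ordered by creation time (oldest first).
    Priorities, which the draft allows, are not modelled here.
    """
    queue = deque(thread_ids)
    schedule = []
    for _ in range(turns):
        # At most `cores` threads get to run one instruction this turn.
        slots = min(cores, len(queue))
        running = [queue.popleft() for _ in range(slots)]
        schedule.append(running)
        # ASSUMPTION: threads that ran rejoin the back of the queue.
        queue.extend(running)
    return schedule


if __name__ == "__main__":
    # Three threads on a dual-core host, four turns.
    for turn, running in enumerate(run_turns(["t0", "t1", "t2"], 2, 4)):
        print(turn, running)
```

On a single-core host this degenerates to plain round-robin; a thread
created by fork() would, under the same assumption, be appended to the
end of the queue.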
As an extension, we shall do basic networking. Owned memory cells carry
not only the owner's process ID but also a NET address (32-bit, 4 octets
for topological hierarchy); after you capture such addresses, you can
attack those remote hosts with your code. All pseudo-computers in the
cyberspace have the same instruction set, so you can distribute your
code with ease to remote locations. Multicore systems let you run
several threads at once.

Competing programs may run on the same host or, in the case of network
attacks, on different hosts; if the attacker code wins, it gains access
to that port on the remote computer.

The NET addresses you encounter will be added to your address book; you
can add comments to it.

Software Environment
--------------------

I/O and Storage
---------------

Storage access, except when storing analysis data of your processes,
requires that you have gained access to the host system's storage
facilities; there is a set of ports you need to gain control of before
you can read and write disks and other storage devices. Remote storage
requires control of the network device, and so forth.

Storage mechanisms include:
- floppy disk; only for physical access to the host
- tape drives; remote
- hard disks; only for your own systems
- optical media; only for your own systems
- network; needs control of the network adapter

Signal interface
----------------

Failure signals indicate losing the program-on-program battle. Some
signals can be used as a form of communication between the threads you
own. Host systems may set some signals to trigger shutdown,
self-destruction, or other extreme forms of failure such as compromised
system security.

Actions
-------
I - activate 'ICE'; execute a countermeasure such as EMP.
T - terminate process
D - dump 'core' (memory image of running process)

SIGSTKFLT - stack segment overrun
SIGSEGV   - illegal memory reference
SIGBUS    - illegal memory access
SIGFPE    - floating-point exception; zero-division
SIGILL    - illegal machine instruction encountered
SIGTERM   - terminate signal received
SIGKILL   - kill signal received
SIGQUIT
SIGINT
SIGUSR1
SIGUSR2

Networks
--------

Our virtual world consists of hosts on different networks. The items of
interest at the start are the public hosts; once you conquer some of
these, you may get access to private ones.

Network servers run services such as Domain Name Service (DNS). As an
example, this service translates alphabetical addresses convenient for
human beings to the underlying numerical addresses. We use IPv4-style
32-bit addresses consisting of 4 8-bit octets, e.g. 128.0.0.36.

The more networked hosts you conquer without being intercepted, the
bigger the distributed attacks you can initiate. The goal is to create a
multiplayer game, so people could team up and form 'botnets' to attack
powerful servers run by interesting entities such as banks.

Network Services
----------------

Port    Service   Explanation
----    -------   -----------
        ECHO      Sends messages to users on the host. Mostly for fun.
        SYSLOG    Logs system messages; disable to wipe traces of
                  attack attempts.
        ROUTE     Routes network packets; may be intercepted by
                  middleware to capture traffic and collect NET
                  addresses.
        DNS       Translates network addresses; may be forged to
                  redirect traffic.
        FTP       File transfers

Virtual Machine Environment
---------------------------

Registers
---------

pc
--
- Program Counter; the address of the current instruction; knowing the
  instruction length - let's say fixed 32 bits - you can use this to
  access other instructions in your code. This makes programming more
  interesting as you can do things such as write self-modifying code.

sp
--
- Stack Pointer for push and pop. The address of the 'topmost' item on
  the stack.

fp
--
- Frame Pointer.
  Can be used to access the return address of function calls.

ar
--
- Accumulator Register for arithmetic operations

pr
--
- Pointer Register for memory addressing

nr
--
- iNdex Register for memory addressing

Global Variables
----------------

myid
----
- the number (32-bit) you are trying to write; does not share words with
  opponent strings

progsz
------
- the size of your program in [compiled] bytes

stksz
-----
- the size of your private 'work' memory (stack) in 16-bit words

memsz
-----
- the size of memory in 16-bit words

Example instructions

    *%pr = w;        // store word W into memory location pointed to by PR
    %ar[%pr] = w;    // store word W into memory location *(%ar + %pr)
    %ar = *%pr;      // load word from memory location pointed to by PR into AR
    %ar = %ar[%pr];  // load word from memory location *(%ar + %pr) into AR

Library Functions
-----------------

rnd(x)  // compute random number between 0 and x-1
frz(a)  // store frz instruction to memory address A; an opponent read
        // or write of that address causes a 1-cycle freeze
psh(x)  // push x to memory location pointed to by SP, then decrement SP
pop()   // increment SP, then fetch the word it points to

Default Setup
-------------

memsz - 65536
stksz - 8192

Example Program
---------------

CODE
start()
{
    main(rnd(memsz));       // begin at a random memory address
}

CODE
main(UNSIGNED a)
{
    start[a++] = myid;      // write our ID into four consecutive words
    start[a++] = myid;
    start[a++] = myid;
    start[a++] = myid;
    main(a);                // repeat from the next address
}
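
To get a feel for what the example program does to memory, here is a
rough Python model of it. It is a sketch under several assumptions that
this draft does not settle: memory is treated as a flat list indexed
from 0 (rather than relative to the start label), addresses wrap around
at the end of memory, the recursion in main() is modelled as a loop, and
turn-by-turn interleaving with an opponent is ignored. The names bomber
and score and the ID 0xC0DE are made up for illustration.

```python
import random

MEMSZ = 65536  # default memsz from the draft, in words


def bomber(memory, myid, start, rounds):
    """Mimic main(): write myid into four consecutive cells per round.

    Returns the next address main() would continue from.
    """
    a = start
    for _ in range(rounds):
        for _ in range(4):
            # ASSUMPTION: addresses wrap around at the end of memory.
            memory[a % len(memory)] = myid
            a += 1
    return a


def score(memory, myid):
    """Count the cells that currently hold myid."""
    return sum(1 for word in memory if word == myid)


if __name__ == "__main__":
    memory = [0] * MEMSZ                 # 0 stands for an unowned cell here
    start = random.randrange(MEMSZ)      # rnd(memsz): 0 .. memsz-1
    bomber(memory, myid=0xC0DE, start=start, rounds=10)
    print(score(memory, 0xC0DE))         # prints 40
```

Comparing score() for two IDs after a number of rounds gives the winner
in the sense of the goal stated at the beginning: whoever has more of
their ID written to memory.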