05 July 2004 @ 08:45 pm
Warning: computer geekery ahead  
Back in college, when I was a Comp Sci major, I started fooling around with designing my own programming language, which would "improve" on C++ and Java. I've fiddled with it off and on ever since, adopting features and concepts from other languages (mostly C++ and Java, although bits of Perl have found their way in). Since I have no experience with compiler design, it'll probably never get implemented, but it's still fun for me to tinker with. I've named it T, because it's derived from C and isn't Java. ;)

The basic design is like C++: a strongly-typed, mixed procedural/object-oriented language with (mostly) C-like syntax. However, it isn't an extension to C like C++ is; valid C code is not necessarily valid T code. T has some features that C++ doesn't have, and does some of the same things differently.

Functions and classes, as well as primitive types and objects, are first-class types. They can be passed around and manipulated (unspeakablevorn has convinced me to make methods—pairings of member functions with individual objects—first-class types as well).
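For a rough feel of what a first-class "method" could look like, here is a C++ sketch that pairs a member function with one particular object and hands the pair around as a value. `Counter` and `bind_method` are made-up names for illustration, and C++ needs a library wrapper where T would presumably have this built in.

```cpp
#include <functional>

// Hypothetical example class; not part of T.
struct Counter {
    int count = 0;
    int add(int n) { count += n; return count; }
};

// A "method" in the sense above: a member function paired with one object.
// Assumption: the object outlives the returned callable.
std::function<int(int)> bind_method(Counter& obj, int (Counter::*fn)(int)) {
    return [&obj, fn](int n) { return (obj.*fn)(n); };
}
```

Calling the result of `bind_method(c, &Counter::add)` with an argument updates `c` itself, since the callable carries a reference to the object rather than a copy.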

Primitive types (int, char, etc.) are considered objects and are not separate from the classification hierarchy. T also has per-object inner classes, like Java. Like C++, and unlike Java, T supports multiple inheritance.

I'm of two minds about pointers. My original idea was to have three different kinds of pointers for three different purposes: handles, allocators, and iterators. Handles could point to anything. Allocators would be returned by new and required by delete, and would necessarily always point to the heap. Iterators would be accessed from arrays, and would be capable of pointer arithmetic. Nowadays, I've been thinking of replacing pointers entirely with "floating references" which would be like C++ references (in other words, implicit dereferencing) but with a way to change where they point (I'm thinking an @= operator). Since I decided that T would be garbage-collected, there's less to be gained from giving pointers to the heap special treatment (I think), but that still leaves the problem of pointer arithmetic. Should I allow the same sort of hijinks that C/C++ does? Should I keep array iterators separate? I dunno.
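To make the "floating reference" idea a bit more concrete, here is a minimal C++ sketch. `FloatingRef` and `rebind` are hypothetical names; `rebind()` stands in for the proposed `@=` operator, which C++ cannot express. It is only an approximation: real implicit dereferencing would also cover member access, which this wrapper does not.

```cpp
// FloatingRef is a hypothetical name; rebind() stands in for T's proposed @=.
template <typename T>
class FloatingRef {
public:
    explicit FloatingRef(T& target) : ptr_(&target) {}
    FloatingRef(const FloatingRef&) = default;

    // Plain assignment writes through to the current target,
    // the way assignment to a C++ reference does.
    FloatingRef& operator=(const T& value) { *ptr_ = value; return *this; }
    FloatingRef& operator=(const FloatingRef& other) {
        *ptr_ = *other.ptr_;
        return *this;
    }

    // Implicit dereference when the reference is read as a T.
    operator T&() const { return *ptr_; }

    // Change where the reference points, leaving the old target untouched.
    void rebind(T& target) { ptr_ = &target; }

private:
    T* ptr_;
};
```

The design choice worth noticing is that ordinary `=` never moves the reference; only the separate rebinding operation does, which is the distinction the `@=` operator is meant to draw.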

And on the subject of arrays, T has true multidimensional arrays, using an array[x,y,z] sort of syntax.
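As a point of comparison, this is roughly what a true three-dimensional array looks like when built by hand in C++. `Grid3` is a hypothetical name, and `operator()` stands in for the `array[x,y,z]` subscript, which C++ before C++23 cannot overload. The row-major layout is my assumption; nothing above fixes a storage order for T.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Grid3 is a hypothetical name; g(x, y, z) stands in for T's array[x,y,z].
template <typename T>
class Grid3 {
public:
    Grid3(std::size_t nx, std::size_t ny, std::size_t nz)
        : nx_(nx), ny_(ny), nz_(nz), data_(nx * ny * nz) {}

    // One contiguous block, indexed in row-major order (an assumption).
    T& operator()(std::size_t x, std::size_t y, std::size_t z) {
        assert(x < nx_ && y < ny_ && z < nz_);
        return data_[(x * ny_ + y) * nz_ + z];
    }

private:
    std::size_t nx_, ny_, nz_;
    std::vector<T> data_;
};
```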

In C++, stack variables are non-polymorphic, and pointers to objects polymorphic, by definition. T has no such restrictions: stack variables can be declared as "polymorphs", and pointers/references/whatever can be strictly typed. The tilde (~) is used to signal polymorphism in variable declarations. Polymorphs on the stack would probably be implemented as a layer of indirection, but would still act like regular stack variables except for the fact that they can be assigned values of types that are subclasses of the given type.
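The "layer of indirection" for a stack polymorph might look something like the following C++ sketch. `Shape`, `Circle`, and `ShapePolymorph` are hypothetical names, and the `clone()` approach is just one way to get value semantics over a hierarchy; it is not a claim about how T would actually do it.

```cpp
#include <memory>
#include <string>

// Hypothetical example hierarchy.
struct Shape {
    virtual ~Shape() = default;
    virtual std::string name() const { return "shape"; }
    virtual std::unique_ptr<Shape> clone() const {
        return std::unique_ptr<Shape>(new Shape(*this));
    }
};

struct Circle : Shape {
    std::string name() const override { return "circle"; }
    std::unique_ptr<Shape> clone() const override {
        return std::unique_ptr<Shape>(new Circle(*this));
    }
};

// Stand-in for a polymorph variable of type Shape: it lives on the stack and
// copies like a value, but holds whichever subclass was last assigned to it.
class ShapePolymorph {
public:
    ShapePolymorph(const Shape& value) : held_(value.clone()) {}
    ShapePolymorph(const ShapePolymorph& other) : held_(other.held_->clone()) {}
    ShapePolymorph& operator=(const ShapePolymorph& other) {
        held_ = other.held_->clone();  // clone first, so self-assignment is safe
        return *this;
    }
    const Shape* operator->() const { return held_.get(); }

private:
    std::unique_ptr<Shape> held_;  // the hidden layer of indirection
};
```

Assigning a `Circle` to a `ShapePolymorph` keeps the whole `Circle` instead of slicing it down to a `Shape`, which is the behavior a plain C++ stack variable cannot give you.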

Classification and inheritance are considered separate but related concepts in T. Classification is more general. Simply having a superset of public members of another class is enough to be considered a subclass. It's even possible to declare a superclass later than its subclasses.
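C++ templates can approximate this kind of structural classification, so here is a small sketch of the idea. The names `Dog`, `Rock`, `is_speaker`, and `greet` are made up, and the check covers a single member rather than a full "superset of public members" test. Note that `Dog` is written first and never mentions the "superclass" that is declared after it.

```cpp
#include <string>
#include <type_traits>
#include <utility>

// Two unrelated classes, written with no knowledge of any "speaker" type.
struct Dog {
    std::string speak() const { return "woof"; }
    void fetch() {}
};
struct Rock {
    int weight = 3;
};

// The "superclass", declared later: anything with a public
// std::string speak() const counts as a speaker.
template <typename T, typename = void>
struct is_speaker : std::false_type {};

template <typename T>
struct is_speaker<T, typename std::enable_if<std::is_same<
    decltype(std::declval<const T&>().speak()), std::string>::value>::type>
    : std::true_type {};

// Accepts any structurally matching type; no inheritance involved.
template <typename T>
std::string greet(const T& thing) {
    static_assert(is_speaker<T>::value, "T needs a public speak() member");
    return thing.speak();
}
```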

The different kinds of loops are: do, while, for, and forever (and whenever, but that's special). Do isn't really a loop (it only executes once, normally), but it allows labels and flow-of-control statements like break or continue to be used with a block. Forever is an endless loop, the same as while(true). Else blocks can be used with conditional loops, and are run if the condition is not met (but are skipped if a break statement is used). Loops can be "daisy-chained", so conditions can appear pretty much anywhere in the loop: do { Foo(); } while (i < 5) { Bar(); } would run Foo(), then test, then if the test was successful run Bar() and loop back to the beginning, or if unsuccessful end the loop. A single loop can even have multiple tests at various points. Loops can be labelled, and flow-of-control statements can take labels to break out of or otherwise alter loops other than the innermost.
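For comparison, the daisy-chained loop above has to be spelled in C++ as an endless loop with a break in the middle. The sketch below records the order of calls as a string ('F' for Foo, 'B' for Bar) so the control flow is easy to check. `daisy_chain` is a made-up name, and having the Bar() step advance `i` is my assumption, since the example doesn't say what changes it.

```cpp
#include <string>

// Traces the call order of: do { Foo(); } while (i < limit) { Bar(); }
// Assumption: the Bar() step is what advances i.
std::string daisy_chain(int limit) {
    std::string trace;
    int i = 0;
    for (;;) {
        trace += 'F';              // Foo();
        if (!(i < limit)) break;   // the test in the middle of the loop
        trace += 'B';              // Bar();
        ++i;
    }
    return trace;
}
```

With a limit of 2 the trace is "FBFBF": Foo() always runs one more time than Bar(), which is exactly the "loop and a half" pattern that otherwise needs duplicated code or a break.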

T has language support for threading and synchronization (the name originally stood for "threaded"). The type system enforces some aspects of thread-safety: a thread object cannot be instantiated with a function that uses static data that is not of a threadsafe type (e.g. semaphores, or variables locked with a mutex monitor) or declared volatile. Locking of monitors is done with lock blocks of the form lock(var1, var2, ...) statements;, which locks the variables given in parentheses for the duration of the statement and unlocks them afterwards (this ensures that any mutex that is locked gets unlocked). There is also the when statement for synchronized conditionals. The when statement has the same syntax as an if statement (when (condition) {statements}), but when encountered blocks until it can acquire a lock for all monitors in the condition, then tests—if the condition is true, it executes the conditional block; if not, it executes the else block (if present), releases the locks, and tries again. The locks are held for the duration of the when block. The whenever loop statement is similar, but releases the locks and tries again after executing the conditional code for a success as well as for a failure—the only way to exit the loop is with a break statement (it's considered a loop, so flow-of-control statements can be used with it).
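The nearest C++ equivalents are sketched below, with hypothetical `Account` and `Mailbox` types. `transfer` plays the role of a lock block over two monitors, and `take_message` plays the role of a when statement with no else block. One difference to keep in mind: a condition variable sleeps until it is notified, rather than releasing the locks and retrying on its own as described above.

```cpp
#include <condition_variable>
#include <mutex>

// Hypothetical monitor: a balance guarded by its own mutex.
struct Account {
    std::mutex m;
    int balance = 0;
};

// Roughly: lock(from, to) { from.balance -= amount; to.balance += amount; }
// Assumption: from and to are distinct accounts.
void transfer(Account& from, Account& to, int amount) {
    std::lock(from.m, to.m);  // acquires both without risking deadlock
    std::lock_guard<std::mutex> hold_from(from.m, std::adopt_lock);
    std::lock_guard<std::mutex> hold_to(to.m, std::adopt_lock);
    from.balance -= amount;
    to.balance += amount;
}  // both mutexes are released here, even if an exception is thrown

// Hypothetical monitor with a condition to wait on.
struct Mailbox {
    std::mutex m;
    std::condition_variable changed;
    int messages = 0;
};

// Roughly: when (box.messages > 0) { --box.messages; }
void take_message(Mailbox& box) {
    std::unique_lock<std::mutex> hold(box.m);
    // Blocks until the condition holds; the lock is released while waiting.
    box.changed.wait(hold, [&box] { return box.messages > 0; });
    --box.messages;  // runs with the lock still held
}

// The other side has to announce changes explicitly in C++.
void post_message(Mailbox& box) {
    {
        std::lock_guard<std::mutex> hold(box.m);
        ++box.messages;
    }
    box.changed.notify_one();
}
```

Everything that the C++ version does by convention (remembering to notify, remembering to take the lock before touching the data) is what the lock and when statements would move into the language itself.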
Current Mood: geeky
Current Music: Oingo Boingo - Dead Man's Party
Twisted, but strangely fluffy: inappropriatemoltare on July 6th, 2004 01:02 pm (UTC)
And here's the scary part, lads: I actually understood some of that.
The All-Purpose Guru (allpurposeguru) on July 6th, 2004 08:29 pm (UTC)

I like some of the things you have here. I especially like your looping construct; I can think of several places in my current code where it would be very nice to have. I wonder how it would be formatted in a block-structure?

I wonder about the different pointer types-- I have always used pointers at the concrete level, to write registers and such (remember, I'm a firmware engineer), so I like being able to explicitly point pointers (and declare them, also) at a specific, physical memory location so I get a lexical representation of the register in the language. Especially when you can specify a volatile to ensure register writes and reads, it makes hardware control very easy.

Now, I know that yours, like most languages, wasn't designed for these hijinks-- but you'd be amazed how powerful some of these things can become when the code allows for them.

Have you put together a spec with a more complete definition of the language? If you wouldn't mind my looking at it, drop a copy into my email-- it's my livejournal name at yahoo.com.