Rewriting Rust: A Response

Assumed Audience: Hackers, Rustaceans, and anyone interested in programming language design.

Epistemic Status: Quite confident. The ideas are partially implemented and working.

This post is partially an ad!

Introduction

josephg wrote a post about what he’d like to see in future Rust, or another programming language following Rust.

Unbeknownst to josephg, that language is coming into existence right now! It is called Yao.

I already wrote about what else Rust got wrong, but let’s go over the features josephg wants and see how Yao stacks up.

Before I do, in order to avoid misunderstandings, I must admit that I am not completely certain that these ideas are possible. More on that later.

Function Traits (Effects)

I briefly mentioned function traits here, but I should expand on them.

First of all, function traits require Restricted Structured Concurrency (RSC), so if you’re looking for a catch, that is one of them.

The second catch is that every function trait needs to be restrictive. For example, pure is a restrictive trait because it restricts the function from having side effects. On the other hand, async is not restrictive; it actually allows the function to do more.

Why does this matter? Because a function without pure will always be able to call a function with it, while a function without async cannot just call a function with it.

This is actually the root of the function color problem: async is exactly backwards! Instead of marking asynchronous functions, languages should have had us mark synchronous functions, and then async should have been the default!

Although that wouldn’t remove the possibility of bugs when calling a blocking function in async code, which is the real reason I hate async.

Anyway, Yao’s function traits will always be restrictive. That is what will make them scale better than Rust’s while avoiding many of the problems: by default, functions will be able to do anything.
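The calling rule can be modeled in a few lines. This is a minimal Python sketch of the rule only, not Yao's actual checker: `may_call` and the trait sets are hypothetical names, and `sync` is an assumed restrictive trait standing in for the "mark synchronous functions instead" idea. Because every trait is a restriction, a caller may call a callee only when the callee carries every restriction the caller does.

```python
def may_call(caller_traits, callee_traits):
    """Return True if a function with caller_traits may call one with callee_traits.

    Every trait is a restriction, so the callee must be at least as
    restricted as the caller: the caller's traits must be a subset.
    """
    return frozenset(caller_traits) <= frozenset(callee_traits)


UNRESTRICTED = frozenset()      # the default: may do anything
PURE = frozenset({"pure"})      # no side effects
SYNC = frozenset({"sync"})      # hypothetical: never suspends
PURE_SYNC = PURE | SYNC

if __name__ == "__main__":
    print(may_call(UNRESTRICTED, PURE))   # an ordinary function calling a pure one
    print(may_call(PURE, UNRESTRICTED))   # a pure function calling an ordinary one
```

Under this model an unmarked function can call anything, which is why restrictive traits do not split code into colors the way a permissive marker such as async does.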

Global Context

But RSC gives so much more than just function traits: it can make it possible to safely handle global context.

An example: on both POSIX and Windows, there is a piece of global context called the “current working directory.” It is shared by the whole process, and it can be changed at any time. This can easily lead to problems, especially in multi-threaded code.

However, Yao code can have multiple current working directories. In fact, each thread can have its own!

To see this in action, run:

$ git clone https://git.yzena.com/Yzena/Yc.git
$ cd Yc
$ make
<...>
BOOTSTRAP OK
$ ./release/yc yao tests/yao/cd.yao
/home/gavin/Downloads/Yc
test.txt DOES NOT EXIST
/home/gavin/Downloads/Yc/test1
test.txt DOES NOT EXIST
/home/gavin/Downloads/Yc/test1/test2
test.txt EXISTS
/home/gavin/Downloads/Yc/test1
test.txt DOES NOT EXIST
/home/gavin/Downloads/Yc

That clones the repo, bootstraps Yao and its build system, then runs Yao on tests/yao/cd.yao, which uses the pwd command to demonstrate that the current working directory is, in fact, being changed. Also, notice that the file existence check is relative to the CWD: it only succeeds when the CWD is test1/test2, so path operations are relative to the CWD Yao sets.

And in none of this is Yao setting the global CWD; in fact, none of my code changes that global CWD. Yao is written to use its own CWD rather than the global CWD.

Now, this doesn’t demonstrate separate threads having the same capability, but that’s only because I haven’t implemented thread spawning in Yao yet. The same capability exists in my C code, which can spawn threads.

This uses something called “context stacks,” a concept I stole from Jonathan Blow and his language, Jai. The “current working directory” is just the directory on the top of the CWD context stack for the thread.

But what if the thread doesn’t have one? Well, it is the directory that was on top of the parent thread’s CWD context stack when the child thread was created, and so on, recursively to the root thread.

And the root thread always has a CWD.

When combined with RSC, context stacks are enormously powerful and incredibly safe! I know that a CWD that is pushed onto a parent’s context stack before a child thread is created will always be on the stack for the entire lifetime of the child.
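The inheritance rule can be sketched as follows. This is a minimal Python model under stated assumptions, not the C implementation in the repo: `ContextStack`, `pushed`, `child`, and `demo` are hypothetical names, and the paths are made up. A child stack starts empty and falls back to whatever was on top of the parent's stack when the child was created; joining the thread before the parent's scope ends plays the role of RSC.

```python
import threading
from contextlib import contextmanager


class ContextStack:
    """A per-thread stack that falls back to a value inherited at creation."""

    def __init__(self, inherited):
        self._inherited = inherited   # parent's top when this stack was made
        self._items = []

    def top(self):
        return self._items[-1] if self._items else self._inherited

    @contextmanager
    def pushed(self, value):
        # Scope-based push: the value is popped when the block exits.
        self._items.append(value)
        try:
            yield
        finally:
            self._items.pop()

    def child(self):
        # A child starts empty and inherits whatever is on top right now.
        return ContextStack(self.top())


def demo():
    seen = {}
    root = ContextStack("/home/user")   # the root always has a CWD
    with root.pushed("/home/user/project"):
        child_stack = root.child()

        def worker():
            seen["inherited"] = child_stack.top()
            with child_stack.pushed("/tmp/scratch"):
                seen["child_pushed"] = child_stack.top()
            seen["child_after"] = child_stack.top()

        thread = threading.Thread(target=worker)
        thread.start()
        thread.join()   # structured: the child ends before the parent pops
        seen["parent_during"] = root.top()
    seen["parent_after"] = root.top()
    return seen
```

The child's own push never leaks into the parent, and the parent cannot pop the inherited directory while the child is still running, because the join happens inside the `with` block.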

Another example: environments. The environ variable, as well as getenv and setenv, are the source of many footguns.

In Yao, environments are just another context stack:

$ ./release/yc yao tests/yao/env.yao
------------------
------------------
BC_ENV_ARGS=-l
BC_LINE_LENGTH=64
------------------
BC_ENV_ARGS=-l
++++++++++++++++++
BC_ENV_ARGS: -l
------------------
BC_ENV_ARGS=-l
BC_LINE_LENGTH=64
------------------
------------------
BC_ENV_ARGS=-l
BC_LINE_LENGTH=64
------------------
BC_ENV_ARGS: -l
BC_LINE_LENGTH: 64

That uses Yao to run tests/yao/env.yao, which uses env to demonstrate that the environment is changed for child processes. It also grabs an environment variable with the equivalent of getenv() (env.env in the script) to demonstrate that yes, you can still grab individual environment variables.

You can set whatever you want in the environment before you run that, and it doesn’t matter: the output will be exactly the same (barring any bugs).

Automatic Memory Management Without GC

Another thing that I think RSC gives us, when combined with scope-based resource management, is fully automatic memory management without a garbage collector.

Rust has already proven that this is possible, and I believe, but I’m not completely sure, that RSC would make it easier to reason about.

Regardless, Yao as currently implemented has no leaks. You can run Yao on both of the scripts above under Valgrind to check:

$ ./release/yc rig -Dvalgrind=1
$ valgrind ./build/yc yao tests/yao/env.yao
$ valgrind ./build/yc yao tests/yao/cd.yao

Compile-Time Capabilities

Next, josephg wants compile-time capabilities.

Yao already has them. In fact, it not only has them for functions, but it has them for keywords, which are Yao’s equivalent of macros.

To demonstrate this, run this:

$ ./release/yc yao tools/rig_slam.yao
Panic: Unimplemented
    Source:    /home/gavin/Downloads/Yc/src/yao/keywords.c:2105
    Function:  yao_parse_while()

Illegal instruction

That panic happens because it is trying to parse the while loop in tools/rig_slam.yao, and I haven’t implemented parsing while loops yet.

However, say you want to restrict a Yao script; you want to make sure it can’t be fully Turing-complete. And you don’t trust the author because he’s in the cubicle next to yours, and you know he’s dumb.

Yao has a builtin mode that makes the language non-Turing-complete by removing a lot of stuff. One of the things it removes is while, an unrestricted loop that would make the language Turing-complete.

So you could tell Yao to use the iterative language mode, which enables restricted loops, but not unrestricted loops like while:

$ ./release/yc yao --lang-mode=iterative tools/rig_slam.yao
yc: tools/rig_slam.yao[146:2]
    Parse error: Incomplete variable declaration for name: while

yc: tools/rig_slam.yao[146:8]
    Invalid token: Expected semicolon (';')


Panic: Unimplemented
    Source:    /home/gavin/Downloads/Yc/src/yao/keywords.c:1753
    Function:  yao_parse_if()

Illegal instruction

Another unimplemented panic, but notice that it’s different: Yao tried to parse while as a plain name! It’s as though it didn’t even exist!

So yeah, Yao has compile-time capabilities right now! It does it by simply not even importing the definition of things that are restricted, which means that the restricted code can’t access it, no matter what.
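The "not even importing the definition" approach can be illustrated with a toy keyword table. This is a minimal Python sketch, not Yao's parser: `ALL_KEYWORDS`, `WITHHELD`, `import_keywords`, and `classify` are hypothetical, the `for` entry and the mode name `full` are assumptions, and only `if`, `while`, and the `iterative` mode come from the demonstration above.

```python
# Hypothetical keyword table; the values are placeholder parser names.
ALL_KEYWORDS = {
    "if": "parse_if",
    "for": "parse_for",       # assumed example of a restricted loop
    "while": "parse_while",
}

# Keywords that each language mode refuses to import.
WITHHELD = {
    "full": set(),            # hypothetical name for the unrestricted mode
    "iterative": {"while"},   # restricted loops only
}


def import_keywords(mode):
    """Build the keyword table for a mode by leaving out withheld entries."""
    return {k: v for k, v in ALL_KEYWORDS.items() if k not in WITHHELD[mode]}


def classify(token, keywords):
    """A token is a keyword only if its definition was imported."""
    return "keyword" if token in keywords else "name"


if __name__ == "__main__":
    print(classify("while", import_keywords("full")))
    print(classify("while", import_keywords("iterative")))
```

Nothing checks permissions at the point of use here: in the restricted table, `while` simply is not there, so it falls through to being treated as a plain name.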

It’s fine-grained too; while is gone, but if is not; that’s where the panic happened! And yes, this shows that it applies to “builtin” keywords too, not just user-defined ones. It can also apply to functions, packages, types, and context stacks!

Although it is only implemented on keywords and packages at the moment.

Runtime Capabilities

But if Yao has compile-time capabilities on context stacks, it can prevent code from accessing certain context stacks. What if we used that to go a step further and use context stacks to implement runtime capabilities?

Take the CWD context stack for example. We could use something like it, along with RESOLVE_BENEATH (an openat2() flag) on Linux and O_RESOLVE_BENEATH (an open()/openat() flag) on FreeBSD, to implement filesystem capabilities at runtime.

What if your program foo started with / as an open directory on a context stack? Then, your setup code could use that directory with openat() to open /etc/foo and /home/$USER/.config/foo, and push both of those open directories onto the context stack.

Then you call a dependency that you don’t trust. You didn’t give it access to that directory context stack, so it can’t change the stack. Perhaps it does open files, but it has to use the standard library to do that. The standard library does have access to the context stack, so it can read the top of the stack and try to open the file in either /etc/foo or /home/$USER/.config/foo. Since it is using {O_}RESOLVE_BENEATH, it can only open files in those two locations.

This means that the untrusted dependency can only open files in the config directories of your foo program; it can’t steal your SSH keys or your crypto wallet. And it certainly can’t encrypt your whole drive and demand a ransom.

And the best part? Because those directories are open directories, foo avoids TOCTTOU bugs on directory operations.
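Here is a rough sketch of the idea of opening files only relative to an already-open directory. It is a minimal Python model for POSIX systems, not Yao's standard library: `DirCapability` and `demo` are hypothetical names. It also does not use {O_}RESOLVE_BENEATH. Instead it rejects absolute paths and `..` components in userland and passes O_NOFOLLOW, which is a much weaker stand-in: O_NOFOLLOW only refuses a symlink in the final path component, so symlinks in intermediate components could still escape, which the kernel flags are meant to prevent.

```python
import os
import tempfile
from pathlib import PurePosixPath


class DirCapability:
    """Grants read access to files beneath one already-open directory."""

    def __init__(self, path):
        self._fd = os.open(path, os.O_RDONLY | os.O_DIRECTORY)

    def read_text(self, relative):
        path = PurePosixPath(relative)
        # Userland stand-in for RESOLVE_BENEATH: refuse obvious escapes.
        if path.is_absolute() or ".." in path.parts:
            raise PermissionError(f"{relative!r} escapes the capability")
        # Resolve relative to the open directory, not the process CWD.
        fd = os.open(relative, os.O_RDONLY | os.O_NOFOLLOW, dir_fd=self._fd)
        with os.fdopen(fd, "r") as handle:
            return handle.read()

    def close(self):
        os.close(self._fd)


def demo():
    with tempfile.TemporaryDirectory() as base:
        config_dir = os.path.join(base, "config")
        os.mkdir(config_dir)
        with open(os.path.join(config_dir, "foo.conf"), "w") as f:
            f.write("ok\n")
        with open(os.path.join(base, "secret"), "w") as f:
            f.write("keys\n")

        cap = DirCapability(config_dir)
        try:
            inside = cap.read_text("foo.conf")
            try:
                cap.read_text("../secret")
                escaped = True
            except PermissionError:
                escaped = False
        finally:
            cap.close()
        return inside, escaped
```

Because the directory is held open as a file descriptor, later lookups go through that descriptor rather than re-resolving the directory's path, which is where the protection against directory-level TOCTTOU races comes from.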

These RSC- and context stack-based runtime capabilities are enormously powerful! You can imagine one for restricting what external commands a dependency can run (only a set of C compilers, for example), or what domain names can be resolved. Or even what IP addresses to connect to.

Distribution

Unfortunately, there is a cost to such awesomeness: Yao can’t be compiled in the traditional way.

Instead, Yao will be compiled to an LLVM-like IR (which already exists), and that’s how it will be distributed. Users will have to do the final compile step on their local machines.

However, there is a nice side effect: Yao will avoid the stupid-long link times that traditional object files suffer from.

Pin, Move, and Struct Borrows

I agree with josephg that Pin is complicated.

Yao will have the same problems, but unlike Rust, there will be another way to avoid Box: since Yao uses RSC, anything allocated on the stack before child threads are created can be safely passed to those child threads.

And lest you think that that could easily blow out the stack, Yao actually has a shadow stack on the heap that is capable of allocations of any size.

In like manner, struct fields can be borrowed by child threads with smaller lifetimes. Easy as pie.

Okay, not completely easy, but easier than Rust!

Comptime

I wish I had code to demonstrate this, but I haven’t gotten there yet.

But Yao will have comptime!

Unlike Zig, its comptime will not be lazy.

First, some background: one of my first design decisions is that Yao will use currying. It will look like this:

fn curried(n: num) -> (s: str) -> bool
{
    // stuff
}

This is actually how Yao will implement closures because explicit is better than implicit:

n: num = 1;
curried(n);

Anyway, I have always loved the idea of Zig’s comptime, but hated the execution! I have written Haskell, and I can think in lazy languages, but I don’t like it.

So what if I just used currying to have comptime? Comptime callable code can be marked with a # (like C’s preprocessor), so it would look like this:

fn comp#(n: num) -> (s: str) -> bool
{
    // stuff
}

Then you can call comp#(n) at compile time and get back a compile-time constant function. This is how Yao will implement generic functions.
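The staging idea can be imitated with an ordinary closure. This is a minimal Python sketch, and it runs entirely at runtime, so it only illustrates the shape of the two stages, not actual compile-time evaluation. The body is invented for illustration (the Yao snippet above leaves it as "stuff"): the first stage precomputes a set of allowed lengths, and the returned function checks a string against it.

```python
def comp(n):
    """First stage: runs once per n and can do the expensive work up front."""
    allowed_lengths = frozenset(range(n + 1))   # precomputed in stage one

    def specialized(s):
        """Second stage: the function specialized for this particular n."""
        return len(s) in allowed_lengths

    return specialized


# Stage one runs here, standing in for a compile-time call to comp#(3).
at_most_three = comp(3)

if __name__ == "__main__":
    print(at_most_three("abc"))
    print(at_most_three("abcd"))
```

In the comptime version the first call would be evaluated by the compiler, so only the specialized inner function would exist in the final program.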

With the addition that comptime functions can take and return types, Yao will get generic types too:

fn vec#(t: type) -> type
{
    return struct {
        array: ^t;
        len: usize;
    };
}
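A function from types to types can be imitated in Python, with the same caveat that everything here happens at runtime. This is a minimal sketch, not Yao semantics: the `push` method and the `element_type` attribute are additions for illustration, and `lru_cache` is used on the assumption that calling the function twice with the same type should yield the same generated type.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def vec(t):
    """Return a vector type specialized for element type t (one per t)."""

    class Vec:
        element_type = t

        def __init__(self):
            self.array = []   # stands in for the `array: ^t` field
            self.len = 0

        def push(self, item):
            # Hypothetical helper: enforce the element type on insertion.
            if not isinstance(item, t):
                raise TypeError(
                    f"expected {t.__name__}, got {type(item).__name__}"
                )
            self.array.append(item)
            self.len += 1

    Vec.__name__ = f"vec_{t.__name__}"
    return Vec


IntVec = vec(int)

if __name__ == "__main__":
    numbers = IntVec()
    numbers.push(1)
    numbers.push(2)
    print(numbers.len, vec(int) is IntVec)
```

Memoizing the call is what makes `vec(int)` behave like a single named type rather than a fresh class on every use.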

I will also add pure as a function trait for zero side effects, as well as total for functions that are guaranteed to halt (and so are not Turing-complete). If comptime functions are required to be pure and total, then even though Yao’s type system will be powerful, it won’t be Turing-complete!

Conclusion

If all of that is sounding too good to be true, it kind of is.

See, while I have succeeded in implementing some of Yao’s design, I haven’t implemented all of it. So unlike the V creator, I’m going to be very honest, clear, and precise: I do not know if Yao’s design is even possible to implement!

I do have some evidence that it is possible, and I do have some demos, but I do not know for sure!

In addition, while there is some good about RSC and Yao, there is some bad and ugly to them as well.

If Yao works, will it even be worth it with those disadvantages?

Of course, I do think so, but every programmer needs to make that decision personally. It could be that most, or all, just don’t think Yao is worth it.

And I need to be fine with that.

Oh, and if you’re worried about the license on the repo, I expressly give any reader permission to use that code to run the demonstrations in this post. Also, I would like to remove the SSPL and move to an AGPL-like license, but I need lawyer money first.
