Continuations in C++ with fork

While reading ”
Continuations in C
” I came across an intriguing idea:

It is possible to simulate
call/cc
, or something like it, on Unix systems with system calls like
fork()
that literally duplicate the running process.

The author sets this idea aside, and instead discusses some code that uses

setjmp

/

longjmp

and stack copying. And there are several other
continuation
-like constructs available for C, such as POSIX

getcontext

. But the idea of implementing

call/cc

with

fork

stuck with me, if only for its amusement value. I’d seen
fork
used for
computing with probability distributions
, but I couldn’t find an implementation of
call/cc
itself. So I decided to give it a shot, using my favorite
esolang
, C++.

Continuations are a famously mind-bending idea, and this article doesn’t totally explain what they are or what they’re good for. If you aren’t familiar with continuations, you might catch on from the examples, or you might want to consult another source first (
1
,
2
,
3
,
4
,
5
,
6
).

Small examples

I’ll get to the implementation later, but right now let’s see what these
fork
-based continuations can do. The interface looks like this.

template <typename T>
class cont {
public:
void operator()(const T &x);
};

template <typename T>
T call_cc( std::function< T (cont) > f );


std::function

is a wrapper that can hold function-like values, such as function objects or C-style function pointers. So
call_cc
will accept any function-like value that takes an argument of type
cont
and returns a value of type
T
. This wrapper is the first of several
C++11
features we’ll use.

call_cc
stands for “call with current continuation”, and that’s exactly what it does.
call_cc(f)
will call
f
, and return whatever
f
returns. The interesting part is that it passes to
f
an instance of our
cont
class, which represents all the stuff that’s going to happen in the program after
f
returns. That
cont
object overloads
operator()
and so can be called like a function. If it’s called with some argument
x
, the program behaves as though
f
had returned
x
.

The types reflect this usage. The type parameter
T
in
cont
is the return type of the function passed to
call_cc
. It’s also the type of values accepted by
cont::operator()
.

Here’s a small example.

int f(cont<int> k) {
std::cout << "f called" << std::endl;
k(1);
std::cout << "k returns" << std::endl;
return 0;
}

int main() {
std::cout << "f returns " << call_cc<int>(f) << std::endl;
}

When we run this code we get:

f called
f returns 1

We don’t see the ”
k returns
” message. Instead, calling
k(1)
bails out of
f
early, and forces it to return 1. This would happen even if we passed
k
to some deeply nested function call, and invoked it there.

This nonlocal return is kind of like throwing an exception, and is not that surprising. More exciting things happen if a continuation outlives the function call it came from.

boost::optional< cont<int> > global_k;

int g(cont<int> k) {
std::cout << "g called" << std::endl;
global_k = k;
return 0;
}

int main() {
std::cout << "g returns " << call_cc<int>(g) << std::endl;

if (global_k)
(*global_k)(1);
}

When we run this, we get:

g called
g returns 0
g returns 1

g
is called once, and returns twice! When called,
g
saves the current continuation in a global variable. After
g
returns,
main
calls that continuation, and
g
returns again with a different value.

What value should
global_k
have before
g
is called? There’s no such thing as a “default” or “uninitialized”
cont
. We solve this problem by wrapping it with

boost::optional

. We use the resulting object much like a pointer, checking for “null” and then dereferencing. The difference is that
boost::optional
manages storage for the underlying value, if any.

Why isn’t this code an infinite loop? Because

invoking a
cont
also resets global state

to the values it had when the continuation was captured. The second time
g
returns,
global_k
has been reset to the “null”
optional
value. This is

unlike Scheme’s
call/cc
and most other continuation systems

. It turns out to be a serious limitation, though it’s sometimes convenient. The reason for this behavior is that invoking a continuation is implemented as a transfer of control to another process. More on that later.

Backtracking

We can use continuations to implement backtracking, as found in
logic programming
languages. Here is a suitable interface.

bool guess();
void fail();

We will use
guess
as though it has a magical ability to predict the future. We assume it will only return
true
if doing so results in a program that never calls
fail
. Here is the implementation.

boost::optional< cont<bool> > checkpoint;

bool guess() {
return call_cc<bool>( [](cont<bool> k) {
checkpoint = k;
return true;
} );
}

void fail() {
if (checkpoint) {
(*checkpoint)(false);
} else {
std::cerr << "Nothing to be done." << std::endl;
exit(1);
}
}

guess
invokes
call_cc
on a
lambda expression
, which saves the current continuation and returns
true
. A subsequent call to
fail
will invoke this continuation, retrying execution in a world where
guess
had returned
false
instead. In Scheme et al, we would store a whole stack of continuations. But invoking our
cont
resets global state, including the
checkpoint
variable itself, so we only need to explicitly track the most recent continuation.

Now we can implement the integer factoring example from ”
Continuations in C
“.

int integer(int m, int n) {
for (int i=m; i<=n; i++) {
if (guess())
return i;
}
fail();
}

void factor(int n) {
const int i = integer(2, 100);
const int j = integer(2, 100);

if (i*j != n)
fail();

std::cout << i << " * " << j << " = " << n << std::endl;
}

factor(n)
will guess two integers, and fail if their product is not
n
. Calling
factor(391)
will produce the output

17 * 23 = 391

after a moment’s delay. In fact, you might see this
after
your shell prompt has returned, because the output is produced by a thousand-generation descendant of the process your shell created.

Solving a maze

For a more substantial use of backtracking, let’s solve a maze.

const int maze_size = 15;
char maze[] =
"X-------------+n"
" | |n"
"|--+ | | | |n"
"| | | | --+ |n"
"| | | |n"
"|-+---+--+- | |n"
"| | | |n"
"| | | ---+-+- |n"
"| | | |n"
"| +-+-+--| |n"
"| | | |--- |n"
"| | |n"
"|--- -+-------|n"
"| n"
"+------------- n";

void solve_maze() {
int x=0, y=0;

while ((x != maze_size-1)
|| (y != maze_size-1)) {

if (guess()) x++;
else if (guess()) x--;
else if (guess()) y++;
else y--;

if ( (x < 0) || (x >= maze_size) ||
(y < 0) || (y >= maze_size) )
fail();

const int i = y*(maze_size+1) + x;
if (maze[i] != ' ')
fail();
maze[i] = 'X';
}

for (char c : maze) {
if (c == 'X')
std::cout << "e[1;32mXe[0m";
else
std::cout << c;
}
}

Whether code or prose, the algorithm is pretty simple. Start at the upper-left corner. As long as we haven’t reached the lower-right corner, guess a direction to move. Fail if we go off the edge, run into a wall, or find ourselves on a square we already visited.

Once we’ve reached the goal, we
iterate
over the
char
array and print it out with some rad
ANSI color codes
.

Once again, we’re making good use of the fact that our continuations reset global state. That’s why we see
'X'
marks not on the failed detours, but only on a successful path through the maze. Here’s what it looks like.


X-------------+
XXXXXXXX|     |
|--+  |X|   | |
|  |  |X| --+ |
|     |XXXXX| |
|-+---+--+-X| |
| |XXX   | XXX|
| |X|X---+-+-X|
|XXX|XXXXXX|XX|
|X+-+-+--|XXX |
|X|   |  |--- |
|XXXX |       |
|---X-+-------|
|   XXXXXXXXXXX
+-------------X

Excess backtracking

We can run both examples in a single program.

int main() {
factor(391);
solve_maze();
}

If we change the maze to be unsolvable, we’ll get:

17 * 23 = 391
23 * 17 = 391
Nothing to be done.

Factoring 391 a different way won’t change the maze layout, but the program doesn’t know that. We can add a
cut
primitive to eliminate unwanted backtracking.

void cut() {
checkpoint = boost::none;
}

int main() {
factor(391);
cut();
solve_maze();
}

The implementation

For such a crazy idea, the code to implement
call_cc
with
fork
is actually pretty reasonable. Here’s the core of it.

template <typename T>
// static
T cont::call_cc(call_cc_arg f) {
int fd[2];
pipe(fd);
int read_fd = fd[0];
int write_fd = fd[1];

if (fork()) {
// parent
close(read_fd);
return f( cont(write_fd) );
} else {
// child
close(write_fd);
char buf[sizeof(T)];
if (read(read_fd, buf, sizeof(T)) < ssize_t(sizeof(T)))
exit(0);
close(read_fd);
return *reinterpret_cast(buf);
}
}

template <typename T>
void cont::impl::invoke(const T &x) {
write(m_pipe, &x, sizeof(T));
exit(0);
}

To capture a continuation, we fork the process. The resulting processes share a
pipe
which was created before the fork. The parent process will call
f
immediately, passing a
cont
object that holds onto the write end of this pipe. If that continuation is invoked with some argument
x
, the parent process will send
x
down the pipe and then exit. The child process wakes up from its
read
call, and returns
x
from
call_cc
.

There are a few more implementation details.

  • If the parent process exits, it will close the write end of the pipe, and the child’s
    read
    will return 0, i.e. end-of-file. This prevents a buildup of unused continuation processes. But what if the parent deletes the last copy of some
    cont
    , yet keeps running? We’d like to kill the corresponding child process immediately.

    This sounds like a use for a reference-counted smart pointer, but we want to hide this detail from the user. So we split off a private implementation class,
    cont::impl
    , with a destructor that calls
    close
    . The user-facing class
    cont
    holds a

    std::shared_ptr

    to a
    cont::impl
    . And
    cont::operator()
    simply calls
    cont::impl::invoke
    through this pointer.

  • It would be nice to tell the compiler that
    cont::operator()
    won’t return, to avoid warnings like “control reaches end of non-void function”. GCC provides the

    noreturn
    attribute

    for this purpose.

  • We want the
    cont
    constructor to be private, so we had to make
    call_cc
    a static member function of that class. But the examples above use a free function
    call_cc
    . It’s easiest to implement the latter as a 1-line function that calls the former. The alternative is to make it a
    friend
    function of
    cont
    , which requires some forward declarations and other noise.

There are a number of limitations too.

  • As noted, the forked child process doesn’t see changes to the parent’s global state. This precludes some interesting uses of continuations, like implementing
    coroutines
    . In fact, I had trouble coming up with any application other than backtracking. You could work around this limitation with
    shared memory
    , but it seemed like too much hassle.

  • Each captured continuation can only be invoked once. This is easiest to observe if the code using continuations also invokes
    fork
    directly. It could possibly be fixed with additional
    fork
    ing inside
    call_cc
    .

  • Calling a continuation sends the argument through a pipe using a naive byte-for-byte copy. So the argument needs to be
    Plain Old Data
    , and had better not contain pointers to anything not shared by the two processes. This means we can’t send continuations through other continuations, sad to say.

  • I left out the error handling you would expect in serious code, because this is anything but.

  • Likewise, I’m assuming that a single
    write
    and
    read
    will suffice to send the value. Robust code will need to loop until completion, handle
    EINTR
    , etc. Or use some higher-level IPC mechanism.

  • At some size, stack-allocating the receive buffer will become a problem.

  • It’s slow. Well, actually, I’m impressed with the speed of
    fork
    on Linux. My machine solves both backtracking problems in about a second,
    fork
    ing about 2000 processes along the way. You can speed it up more with
    static linking
    . But it’s still far more overhead than the alternatives.

As usual, you can get the code from
GitHub
.

稿源:main is usually a function (源链) | 关于 | 阅读提示

本站遵循[CC BY-NC-SA 4.0]。如您有版权、意见投诉等问题,请通过eMail联系我们处理。
酷辣虫 » 综合技术 » Continuations in C++ with fork

喜欢 (0)or分享给?

专业 x 专注 x 聚合 x 分享 CC BY-NC-SA 4.0

使用声明 | 英豪名录