Monday, May 28, 2007

Erlang: The Pros

I had meant to post this much sooner, but there's been this thing called work. It gets in the way. Finally, here is the second half of my overview of Erlang. Concurrency Given it's background, it's no surprise that concurrency is easy and cheap in Erlang. Here's a contrived example:
12> PrintRandom = fun() ->              
        {A, B, C} = erlang:now(),           
        random:seed(A, B, C),
        io:format("~p~n", [random:uniform()])
        end.
#Fun

13> lists:foreach(fun(_) -> spawn(PrintRandom) end, lists:seq(1, 10)).
0.664105                                                             
0.664137
0.664170
0.664203
0.993887
0.993920
0.993953
0.993986
0.994019
0.994052
ok
First, the state for the random number generator is stored in each process, and it's initialized from a standard value. (Actually, maybe the random module should be on my Erlang: Cons list, although having the random module not produce random numbers probably makes debugging easier.) Line 13 spawns 10 processes, each of which prints a random number. Not very useful, but it should illustrate how easy it is to create processes. Also, you can easily spawn a lot processes. This code generates a random number and throws it away a million times in a million processes, and it executes in about 3.5 seconds on my machine:
-module(timespawn).

-export([make_random/0, spawn_random/1, ts/1]).

make_random() ->
    {A, B, C} = erlang:now(),
    random:seed(A, B, C),
    random:uniform().

spawn_random(0) ->
    ok;
spawn_random(X) ->
    spawn(fun make_random/0),
    spawn_random(X-1).

ts(X) ->
    timer:tc(?MODULE, spawn_random, [X]).
Distribution Another nice aspect Erlang is how easy it is to distribute processing. This is partially due to its open security model, which I listed as a negative. For example, in two separate console windows, I can start two different instances of Erlang. As long as I pass both the -sname argument (with different values) and the -setcookie argument (with the same value), they can talk to each other, even if they're on different computers in the same network. Functional I've worked with functional languages in the past, but I haven't really drunk the functional Kool Aid until now. I'm enjoying it this time, though, and want to start working more with one of the Lisps (probably Scheme), Haskell, or ML. When I do start on one (or more likely, on all) of these, I'm sure I'll talk about it here. OTP The OTP (Open Telecom Platform) is the standard Erlang library, and it contains modules that make writing fault tolerant, supervised, client-server, and distributed applications easy and fast. Conclusion I'm glad I've gotten to do a lot of Erlang recently. It's changed the way I think about concurrency and distribution, and it's raised my expectations of other systems that tackle the same problem. Even if I do most of my concurrent programming primarily in other languages, this will give me a good basis. And really, what more can I ask for from a programming language?

Sunday, May 13, 2007

Erlang: The Cons

Today, I'm going to talk about the warts I've seen in Erlang so far. I'm sure I'll find other things I don't like, and I'm also sure that I'll make peace with some of the things I've listed here. Nevertheless, here is my list of the bad and the ugly so far. (This isn't a tutorial on Erlang, so I'm not going to explain the code snippets below. Sorry.) Strings The one thing that really made me go "Yuck" as I was learning Erlang is strings. Strings are just lists of integers, which are treated as strings for printing. For example, if you crank up the read-eval print loop (REPL) and type in a list of integers, it gets printed as a string if there are no control characters in the string:
Eshell V5.5.4 (abort with ^G) 1> [69, 114, 105, 99]. "Eric"
Likewise, if you type in a string, it's interpreted as a list of integers:
4> lists:foreach(fun(X) -> io:format("~p ", [X]) end, "Rochester"). 82 111 99 104 101 115 116 101 114 ok
Initially, this just felt wrong to me. It still does in some respects, but I've also learned to appreciate how this makes string processing easy most of the time, particularly when it's coupled with Erlang's fantastic pattern matching features. For example, the function get_ints in the module below walks through a string and returns a list of all the integers in the string:
-module(get_ints). -export([get_ints/1]). get_ints(String) -> get_ints(String, [], []). get_ints([], [], Ints) -> lists:reverse(Ints); get_ints([], Current, Ints) -> CurrentInt = list_to_integer(lists:reverse(Current)), lists:reverse([CurrentInt|Ints]); get_ints([Chr|Data], Current, Ints) when Chr >= $0, Chr =< $9 -> get_ints(Data, [Chr|Current], Ints); get_ints([_Chr|Data], [], Ints) -> get_ints(Data, [], Ints); get_ints([_Chr|Data], Current, Ints) -> CurrentInt = list_to_integer(lists:reverse(Current)), get_ints(Data, [], [CurrentInt|Ints]).
To use this, save it in a file named "get_ints.erl", compile it, and call the function:
8> c(get_ints). {ok,get_ints} 9> get_ints:get_ints("This is the answer: 42"). "*" 10> get_ints:get_ints("This is the answer: 0, 42"). [0,42] 11> get_ints:get_ints("This is the answer: 0, 42, 23"). [0,42,23]
Notice that the first list of one integer ([42]) is interpreted as a string. I included the number zero (a control character) in the next test to force Erlang to print the results as a list of integers. Another small complaint with strings involves Unicode. I've done enough processing XML and processing phonetic data that good Unicode handling is important to me, no matter whether I'm using it much at the time or not. In one sense, Erlang handles Unicode just fine. A string containing a schwa character is just [601]. Unfortunately, this is the depth of its Unicode handling. It doesn't give you any information about the Unicode data points or a function to change a character from upper-case to lower- or vice versa. Security Another complaint is Erlang's security model. On the one hand, it has the virtue of being easy to set up, but if two nodes can connect and communicate, there's nothing one can't do on the other node. Having more fine-grained control over things would be nice. Speed Also, there's the issue of speed. Erlang is generally fast enough, and where it's not, you can easily set up a node written in C, Java, or C#. Still, being able to deal with everything in Erlang would be more convenient. REPL Finally, there are restrictions in working from the REPL that I could do without. To create a function like get_ints above, I more or less have to create a new module in a file and put the function there. There are ways to work around this, but they seem unnatural. I'd rather not use them. Nothing on this list is a deal-killer for me. I've been doing a lot of Erlang the past few weeks, and I've really enjoyed it. It's been productive and interesting. Still, I can always dream about a better world.

Sunday, May 6, 2007

Erlang: An Overview

Recently I've been programming in Erlang. It's a new computer language for me, and I thought I would talk about it. This will take several posts. Today I'll just provide an overview of the language and environment. Erlang was developed by Ericsson telecommunications for developing soft real-time concurrent systems. I've never developed real-time systems before, soft or hard, so I won't talk about that much. I haven't done a lot with concurrency, either, except a touch of threading here and there. Nevertheless, the concurrency and distributed features of this language are what really attracted me to it. Originally, Ericsson implemented Erlang in PROLOG, and this still shows in some parts of Erlang, such as the way it uses pattern-matching. As an undergraduate, I went through a period where I programmed more or less only in PROLOG for about two years, so a lot of this feels like a stroll down memory lane. Other parts are very new. It's primarily a functional language. It has a lot of features designed for doing telecom programming, such as processing binary data. Erlang code is organized into modules, which are compiled and run from a read-eval-print-loop (REPL). A short "hello world" style module would look like this:
-module(hello). -exports([hello/1]). hello(Name) -> io:format("Hello, ~p.~n", [Name]).
The -module line identifies the module, which is also the name of the file containing the module. The -exports line identifies the functions from this module that are exposed to the outside world. The hello function is the rest of the module. io:format is the standard way to print to the console. Next, I want to look at some of the parts of Erlang that I'm less fond of.