Sunday, May 13, 2007

Erlang: The Cons

Today, I'm going to talk about the warts I've seen in Erlang so far. I'm sure I'll find other things I don't like, and I'm also sure that I'll make peace with some of the things I've listed here. Nevertheless, here is my list of the bad and the ugly so far. (This isn't a tutorial on Erlang, so I'm not going to explain the code snippets below. Sorry.)

Strings

The one thing that really made me go "Yuck" as I was learning Erlang is strings. Strings are just lists of integers, which are treated as strings for printing. For example, if you crank up the read-eval-print loop (REPL) and type in a list of integers, it gets printed as a string if every integer in the list is a printable character:

Eshell V5.5.4 (abort with ^G)
1> [69, 114, 105, 99].
"Eric"

Likewise, if you type in a string, it's interpreted as a list of integers:

4> lists:foreach(fun(X) -> io:format("~p ", [X]) end, "Rochester").
82 111 99 104 101 115 116 101 114 ok

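The two behaviors above can be poked at directly. The following is only a minimal sketch: the module name show_list and the function describe/1 are made up for illustration. It relies on io_lib:printable_list/1 to ask whether a list counts as printable text, and on the ~w format directive to write a list as raw integers regardless of its contents.

```erlang
-module(show_list).
-export([describe/1, demo/0]).

%% Hypothetical helper: tag a list by whether Erlang considers
%% every element a printable character.
describe(List) ->
    case io_lib:printable_list(List) of
        true  -> {string, List};
        false -> {integers, List}
    end.

demo() ->
    Name = [69, 114, 105, 99],
    %% ~p pretty-prints, so a printable list is shown as "Eric".
    io:format("~p~n", [Name]),
    %% ~w writes the raw term, so the same list is shown as [69,114,105,99].
    io:format("~w~n", [Name]),
    ok.
```

Calling show_list:describe([0, 42]) should give {integers,[0,42]}, because 0 is not a printable character, while show_list:describe("Eric") gives {string,"Eric"}.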
Initially, this just felt wrong to me. It still does in some respects, but I've also learned to appreciate how this makes string processing easy most of the time, particularly when it's coupled with Erlang's fantastic pattern matching features. For example, the function get_ints in the module below walks through a string and returns a list of all the integers in the string:

-module(get_ints).
-export([get_ints/1]).

get_ints(String) ->
    get_ints(String, [], []).

get_ints([], [], Ints) ->
    lists:reverse(Ints);
get_ints([], Current, Ints) ->
    CurrentInt = list_to_integer(lists:reverse(Current)),
    lists:reverse([CurrentInt|Ints]);
get_ints([Chr|Data], Current, Ints) when Chr >= $0, Chr =< $9 ->
    get_ints(Data, [Chr|Current], Ints);
get_ints([_Chr|Data], [], Ints) ->
    get_ints(Data, [], Ints);
get_ints([_Chr|Data], Current, Ints) ->
    CurrentInt = list_to_integer(lists:reverse(Current)),
    get_ints(Data, [], [CurrentInt|Ints]).

To use this, save it in a file named "get_ints.erl", compile it, and call the function:

8> c(get_ints).
{ok,get_ints}
9> get_ints:get_ints("This is the answer: 42").
"*"
10> get_ints:get_ints("This is the answer: 0, 42").
[0,42]
11> get_ints:get_ints("This is the answer: 0, 42, 23").
[0,42,23]

Notice that the first list of one integer ([42]) is interpreted as a string. I included the number zero (a control character) in the next test to force Erlang to print the results as a list of integers.

Another small complaint with strings involves Unicode. I've done enough XML and phonetic-data processing that good Unicode handling is important to me, whether or not I'm using it much at the time. In one sense, Erlang handles Unicode just fine. A string containing a schwa character is just [601]. Unfortunately, this is the depth of its Unicode handling. It doesn't give you any information about the Unicode code points or a function to change a character from upper case to lower case or vice versa.

Security

Another complaint is Erlang's security model. On the one hand, it has the virtue of being easy to set up, but if two nodes can connect and communicate, there's nothing one node can't do on the other. Having more fine-grained control over things would be nice.

Speed

Also, there's the issue of speed. Erlang is generally fast enough, and where it's not, you can easily set up a node written in C, Java, or C#. Still, being able to deal with everything in Erlang would be more convenient.

REPL

Finally, there are restrictions in working from the REPL that I could do without. To create a function like get_ints above, I more or less have to create a new module in a file and put the function there. There are ways to work around this, but they seem unnatural. I'd rather not use them.

Nothing on this list is a deal-killer for me. I've been doing a lot of Erlang the past few weeks, and I've really enjoyed it. It's been productive and interesting. Still, I can always dream about a better world.
