Python is annoying

2019-Feb-07, Thursday 09:38
mindstalk: (Default)
Our code works with binary data (hashes/digests) and hexstring representations of such data, a lot. It was written in Python 2, when everything was a string, but some strings were "beef" and some were "\xbe\xef".

Then we converted to Python 3, which introduced the 'bytes' type for binary data, and Unicode strings everywhere, which led to some type problems I had figured out, but a recent debugging session revealed I had to think about it some more. Basically we can now have a hexstring "beef", the bytes object b'\xbe\xef' described by that hexstring... and the bytes b"beef" which is the UTF-8 encoding of the string.
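To make it concrete (variable names mine):

```python
import binascii

hexstring = "beef"                    # a str: the hex representation
raw = binascii.a2b_hex(hexstring)     # b'\xbe\xef': the actual binary data
utf8 = hexstring.encode("utf-8")      # b'beef': the UTF-8 bytes of the hexstring

assert raw == b"\xbe\xef"
assert utf8 == b"beef"
assert raw != utf8                    # same "beef", two very different bytes objects
assert binascii.hexlify(raw) == utf8  # and hexlify now returns bytes, not str
```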

In particular, the function binascii.hexlify (aka binascii.b2a_hex) which we used a lot, changed what it returned.

Python 2:
>>> binascii.a2b_hex("beef")
'\xbe\xef'
>>> binascii.hexlify(_)
'beef'

Python 3:
>>> binascii.a2b_hex("beef")
b'\xbe\xef'
>>> binascii.hexlify(_)
b'beef'

>>> binascii.a2b_hex("beef")
b'\xbe\xef'
>>> _.hex()
'beef'

I found it easy to assume that if one of our functions was returning b"beef" and the other "beef" that they were on the same page, when really, not.

Bunch of examples in the cut.


logging thoughts

2018-May-16, Wednesday 16:09
mindstalk: (robot)
One of the first things I did for work was implementing a better logging system, based around the Python logging library, which introduced me to the world of DEBUG, WARNING, INFO, ERROR. Useful, but something felt off from my expectations. I think I've finally pinned that down.

At my old job, which also involved multiple processes talking to each other over the network, I created my own logging system, though I'm not sure I called it that. Really it was an evolution of printf debugging, with a way to control how much output there was. I had DEBUG macros, controlled by a buglevel variable, which more precisely was a verbosity level. 0 meant nothing, 1 meant a small amount of high level output like entering functions, 4 printed from inside nested loops. I don't remember what 2 and 3 were, and it was more ad hoc than formally designed, but it seems reasonable that 2 would be top level statements after the initial entry statement, while 3 would be from within loops. E.g.

int some_function(type param) {
  DEBUG(1, "Entering some_function");
  do_something();
  DEBUG(2, "I got something %s", something);
  while (condition) {
    DEBUG(3, "loop variables");
    for (some other condition) {
      DEBUG(4, "really verbose");
    }
  }
}
What about INFO, ERROR, etc. levels? I'm guessing those were just printfs -- I had no concept that you'd ever not want to see such output, nor did I have anything like the Python logging flexibility where you can send logging to multiple outputs. So my homegrown macros were entirely for levels of DEBUG output, and really provided a trace record with varying levels of resolution (and performance hit[1].)

And I feel weird because what's become common, in Python and Apache and such, is just a single level of DEBUG.

I've already tried to address that; our logging module contains some custom methods and levels. HITRACE for my old DEBUG 1, function entry (and sometimes exit). And TRACE for "even more verbose than DEBUG", but I hadn't pinned down what that meant; now it seems like I'd have to add one more level to really match up to my old system. And then teach other people how to use it. Feh...
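For the record, wiring such custom levels into the stock logging module is not much code. A minimal sketch; the numeric values (15 and 5, either side of DEBUG's 10) are illustrative guesses, not what our module actually uses:

```python
import logging

# Assumed numeric slots: DEBUG is 10, so HITRACE sits above it (less verbose,
# function entry only) and TRACE below it (more verbose than DEBUG).
HITRACE = 15  # function entry/exit, like my old DEBUG 1
TRACE = 5     # "even more verbose than DEBUG"

logging.addLevelName(HITRACE, "HITRACE")
logging.addLevelName(TRACE, "TRACE")

def hitrace(self, msg, *args, **kwargs):
    if self.isEnabledFor(HITRACE):
        self._log(HITRACE, msg, args, **kwargs)

def trace(self, msg, *args, **kwargs):
    if self.isEnabledFor(TRACE):
        self._log(TRACE, msg, args, **kwargs)

# Graft the methods onto Logger so every logger gets them.
logging.Logger.hitrace = hitrace
logging.Logger.trace = trace

log = logging.getLogger("demo")
log.setLevel(TRACE)
log.hitrace("Entering some_function")
```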

I've also wondered if there's a useful distinction between "statements for tracing in general" and "statements for debugging this problem in particular."

Then of course there's exactly how to split WARNING, ERROR, and CRITICAL, and how to report user errors. Currently I mostly use ERROR for "there's a problem with our code" and CRITICAL for "there's a problem with the environment" like disk full or network connection failure. ERROR gets programmers out of bed, CRITICAL gets Ops out of bed. FATAL is CRITICAL plus a built-in "we die right now." WARNING is "I'm not sure if this is a problem" or "a human should look at this soon". Errors because of bad *user* input, I'm not sure, maybe just DEBUG? Don't want bad users filling up production logs.

[1] Once Ops came to me and said "the new release is really slow!" It turned out they'd pushed the code to production with a configured buglevel=4, not 1 or 0 as I instructed. (The macro was such that all that code could have been compiled away, too, but we never bothered; I think we also liked the idea that we could get more output if we needed to.)
mindstalk: (Default)
Once again, into the Pyth circle of hell. This time trying to convert our code from Python 2 to 3. 2to3 does much of the grunt work of print() and 'from . import', though it didn't always get the latter right. But instead of string and unicode, we now have string and bytes value types, and a strong barrier between them. And of course no static compiler to find up front when types might be mixed up. And yes, we're weak on unit tests, especially tests that exercise all possible code paths. Things seem to mostly work now, but will they under all conditions? Who knows?
mindstalk: (Default)
So, programming languages. Years ago I'd run across D, aiming to be a better C++, or even a better Python -- enabling high level coding without giving up type safety or speed. It had lots of features, including contracts from Eiffel, and being functional friendly; overall it seemed like a language I'd design. It was cool, but I didn't code actively enough to bother learning it.

Last weekend I did start, but also started on the new cult kid, Rust. I went in parallel a bit, but Rust has pulled ahead.

D is not very exciting or revolutionary; it's like a better done kitchen sink. There's a lot of value in that, and in what it's trying to be, and AFAICT the language itself is pretty good. I've seen people criticize the toolchain, and uptake has been rather modest -- though the forums do see daily activity, at least.

Rust, OTOH, is trying to be revolutionary: compiler-enforced memory safety, meaning not just no memory leaks but "fearless concurrency", where the compiler would enforce no data races in multithread code unless you did something specifically unsafe. "No leaks" doesn't sound exciting unless you're an engineer who's had to worry about them; "safe concurrency" is potentially sexy to lots of people.

On diving in, I noticed something else sexy to me: it's like the unholy love child of C and ML and other functional languages; one blog post even called it a functional language in C clothing. Enums/sum types/algebraic data types/tagged unions, which I quickly fell in love with while playing with Ocaml; 'traits' or type classes a la Haskell, which serve for generics, dynamic dispatch, and overloading, all with one coherent mechanism; hygienic macros a la Scheme, something I thought I'd never get to play with seriously unless I got into Clojure. Also, supposedly, an easy and powerful package system, and a minor taste of mine, nested comments.

aliasing fi

2017-May-13, Saturday 07:12
mindstalk: (Default)
I think I mentioned not long ago that I found I'd been aliasing fi=finger which breaks if loops in my shell, and marveled that it took so long to find that. It makes more sense to me now.

1) Yeah, I didn't script much.
2) When I did do an ad hoc script at the prompt, it was a for loop.
3) Scripts you get are mostly bash scripts.
4) Even an explicitly written zsh script wouldn't have a problem: my aliases are loaded by .zshrc, which is loaded by interactive shells, i.e. not script shells[1].
5) Only when I tried pasting an if loop into a *function*, also loaded by .zshrc after my aliases, did a problem occur. Possibly it had occurred before and I simply gave up on some unnecessary function that mysteriously didn't work.

[1] This also sheds light on past failures to ssh in somewhere and invoke a function directly: not an interactive shell, so no functions loaded. When I try 'ssh ... "zsh -i script_invoking_function"', it works. So if I want remote function invocation, I'll need to use -i or to load functions outside of .zshrc.

why zsh?

2017-May-11, Thursday 21:21
mindstalk: (Default)
When I got to Caltech and discovered Unix, the default shell on the cluster was csh, with more user features than the sh at the time, but not a lot. If you got the lowdown, you could switch to the far more useful tcsh, but the sysadmin refused to make that the default for resource reasons. There was also ksh, but I never heard people talking about it.

A few years later zsh came along, and the more techie undergraduate cluster largely switched to it en masse. It was even made the default shell there.

Out in the greater world, and in the era of Linux, bash seems the default shell, pretty much incorporating much of what was good about tcsh and ksh, and also displacing any more primitive sh. zsh still is an exotic thing even Linux people may not have heard of... which is a shame, because it's so much better.

Granted, it's also way more complicated, and a lot of its cooler features have to be turned on. If you want a shell that's full-featured out of the box, there's the even more obscure 'fish'.

And bash can approach, though not catch up to zsh, with the "bash-completion" package.

But what's so cool? Well, tab-completion can be far more powerful, working not just on filenames, but environment or shell variables, command options, man pages, process numbers, and git branches. It can also go to a menu mode, for scrolling around lots of options.

(But fish will do the magic of parsing man pages on the fly to display command options. :O )

It's easy to have your prompt display the exit code of the last command, something I find pretty useful; doing that in bash requires writing your own functions.

Likewise, you can easily have sophisticated right-hand prompts.

**/ recursive directory listing, though that is something you can turn on in bash. (shopt -s globstar)

Even more extended globbing, including excluding patterns, or selecting files based on modification time within a window and other criteria.

Redirection tricks, some of which reduce the need for tee. |& pipes stdout and stderr to a program such as less. >! can clobber files even when you have noclobber on.

I'd anticipated sticking to bash for scripting, for better standards compliance/portability, but I realized that I'm not writing a package script, just in-house tools. And zsh scripting has a lot going for it. Arrays just work, while bash arrays were described Sunday as the worst of any language. I'm using the mod time glob mentioned above.

zsh can share history between shells. I find this both useful and annoying -- useful for storing and reusing commands, but it also destroys the individual history of a particular window. Oh well. An impressive application was when I found myself reusing history across *machines*, where my home dir was NFS mounted.

"Named directories" mean I can collapse long pathnames in my prompt, e.g. Main/wsgi-scripts becomes just ~WS

Probably a lot more, but those come to mind.

That said, there is one odd lacuna in zsh. bash has --rcfile, to tell it to read in a custom rc (like bashrc) file after everything else. zsh... doesn't. And sometimes I would like to start a shell with a custom additional environment, e.g. from ssh.
mindstalk: (Default)
The work system has two kinds of debug/logging statements.

One is a class with the standard methods (trace through critical) which are just print statements that add "DEBUG" or "INFO" as appropriate. There's not even a hint of output throttling. But it is a class, and so I can rip its guts out and replace them with calls to Python's logging module, and it works.

Then there's the 500+ naked print statements, with "ERROR" or such in their constructed output strings. I can search for them easily -- I can just search for 'print'; I think these are the only uses -- but I don't see any programmatic way of converting them, especially as the logging output formatting needs type codes (%s, %d) which are totally absent from the existing statements. (And it's Python 2, so they are statements.)

I see a day of boring editing in my future.
mindstalk: (Default)
(Definition: solving a problem or writing code in an interview, in front of people, under time pressure. Video/online counts as well.)

I'm not sure I've gotten much better at this in the past year. It's one thing if the problem is one I already know and I just have to write code; I think I regurgitate under pressure fairly well. But if I have to really think about the problem then it feels like my IQ drops 20+ points under stress and being stared at. And when I come up with one idea for a solution, it's hard to try to think of others that might be better in some sense -- after all, the clock is ticking, and I have to start writing code! Not to mention the fun of having to write correct code without backup documentation or a compiler -- my memory prioritizes the stuff that's hard to look up, like What You Shouldn't Do, or Where Information Is, over the stuff that's trivial to look up at need.

As for actually being creative, that goes a lot better when I have time to relax, or step away from the problem and not consciously think about it. A lot of my best solutions just come to me when doing something else, or musing in bed or the shower, or walking.

Post prompted by Thursday's experience, where I was asked to construct a relative path between two directories, and I saw it as a tree problem and hared down making C++ trees with nodes and everything. At the end I asked how I'd done, and was told "well, this works, or would work with a bit more polishing of the code, but there's a simpler way with just lists." One minute after leaving the building I saw what that would be, and at home it took me 18 minutes to code and debug, which I e-mailed in, but apparently got no credit for that.
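For the record, here's roughly what the list solution looks like, reconstructed in Python rather than the C++ I actually mailed in (and Python's os.path.relpath does this for you anyway). A sketch: it assumes absolute, normalized, non-root paths:

```python
def relative_path(src, dst):
    """Relative path from directory src to directory dst, via path-component lists."""
    a = src.strip("/").split("/")
    b = dst.strip("/").split("/")
    # Drop the common prefix...
    i = 0
    while i < len(a) and i < len(b) and a[i] == b[i]:
        i += 1
    # ...then climb out of what's left of src, and descend into the rest of dst.
    return "/".join([".."] * (len(a) - i) + b[i:]) or "."

print(relative_path("/home/me/projects", "/home/me/docs/notes"))  # ../docs/notes
```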

I did better Monday, with some basic linked list questions; that rejection was "you did well and seem a fine technologist, but not commercially experienced enough". Which is back to the Catch-22 of not being able to get experience because I'm not experienced enough.

On the flip side, Wednesday had a video interview where I had no idea how I did, but they want me to go to NYC for an onsite next week. So yay, progress... of course, that'll probably be more whiteboarding.
mindstalk: (angry sky)
I've started studying JavaScript a bit, as you might guess from recent posts. It's struck me as a mix of Perl, Python, and crack. It's got some neat things, especially in comparison to one language or the other. And it's got lots of... wtf things.

+: exponentiation operator; nested arrays and objects (dictionaries) (without Perl's references); first class functions and lambdas and closures, including nested functions (unlike Perl); Perl-like push/pop/shift/unshift array operators (but what's the performance?); consistent 'valueOf' and 'toString' methods; JSON; multiple kinds of for loops; Perl style labeled break and continue; some convenient conversions (but see below); nice Date methods.

-: oh boy.

* JSON stringifies nested structures nicely, but simple output doesn't: [1, [2,3]] outputs as [1, 2, 3].
* (object1 == object2) is always false, no matter the underlying values. This holds for arrays and Date objects too. Nothing like Python's structural equality, or even that of STL containers.
** But you can do *inequality* comparison: ([1,2,3] < [2,2,3]) == true.
* strings take negative indices a la Python, but arrays don't.
* there's a typeof operator, but it just says 'object' for arrays.
* "5"+2 == "52" (convert 2 to "2", concatenate), but "5"-2 == 3 (convert "5" to 5, subtract.) And no Python string multiplier like "a"*2 == "aa".
** As Avi noted, it gets even weirder given that the values could be hidden in a variable. a+2=="52", a-2==3
* [1,2]+[3,4] doesn't concatenate arrays, doesn't add by element, doesn't give a type error, but gives... "1,23,4" (turn arrays into strings, concatenate without delimiter.)

My friend Mark linked me to a page which gives some more:
* []+{} == [object Object]
Ok, addition is commutative, right?
{}+[] == 0
And for luck: {}+{} == NaN
As above, []+[] == ""
** Actually, on playing with typeof, I think those are actually all strings. "[object Object]", "0", "NaN". OTOH, {}+[4]+5 == 9 (but typeof string)
>>> 5+{}+[4]
5[object Object]4 // because of course it does


* All numbers are 64-bit floats; you can still bitshift them, but as integers, so 5.2 << 2 == 20. This makes more sense when I remembered that floats are weird, not integers+fractions, so a simple bitshift of the fraction wouldn't make sense.

Hoisting Shadows

2017-Feb-05, Sunday 11:35
mindstalk: (Default)
A bit after writing the previous post on shadowing variables in JavaScript, I came across this page on hoisting. JavaScript re-writes your code so that declarations, but not initializations, move to the top of the current scope, meaning the script or function[1]. So

var a=5;

turns into

var a;   // declaration hoisted to the top of the scope
a = 5;   // the assignment stays where it was

Put that way, it's clear why the problem happens, if not why the language was designed this way.

Does the same thing happen in Python? Searching did not get clear results. I saw pages contrasting JavaScript with Python, which doesn't even *have* declarations. OTOH, the same behavior occurs. So... I dunno.

[1] JavaScript does not have block scoping the way C and Perl[2] do; blocks are delimited by curly braces, but random curly braces do nothing to isolate variables.

{ a=5; }

will work just fine. :(

[2] As I was verifying this in Perl, I ran into behavior I'd forgotten. I'd thrown in "use strict;" but wasn't getting the errors I expected. Eventually I recalled that $a and $b have special meaning in Perl (I think for comparison functions), and I guess are pre-declared, and I was using $a a la the above code, so strict didn't complain about trying to access $a before assigning to it. Sigh.
mindstalk: (Default)
Months ago, Robbie had found this scoping problem in Python, which I reduced to essentials.

I've started finally learning JavaScript, and it has nicer lambdas than Python, and proper hiding of nested functions unlike Perl. But it has the same scope problem:

g1 = 12;
function func() {
  document.getElementById("demo").innerHTML = g1;

  var g1 = 5;
}


(I'm not including the HTML framework because DW/LJ would yell at me if I did.)

Output is 'undefined', rather than 12. As in Python, the local variable further down shadows the outer scope variable (doesn't matter if the "g1=12" has a 'var' before it) even for lines before the local variable.
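The Python version of the same trap, for comparison; Python at least raises instead of handing you undefined:

```python
g1 = 12

def func():
    print(g1)  # raises UnboundLocalError: the assignment below makes g1 local
    g1 = 5

try:
    func()
except UnboundLocalError as e:
    print("as expected:", e)
```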

As mentioned before, Perl has proper lexical scoping here (though not for nested functions.) I don't think I can even create similar code in Scheme/Lisp, where the scoping is explicit with parentheses. (There's 'define' but I think that makes a new global, and it didn't work.) In Ocaml I have

let g1="10";;

let func () =
  print_endline g1;
  let g1="cat" in
  print_endline g1;;


Which I suspect is as explicit as Lisp parentheses, in its own way; the print line is obviously outside the following "let ... in...".

fast array rotation

2017-Jan-28, Saturday 13:17
mindstalk: (Default)
A simple problem I'd never had occasion to think much about, before I saw a sample coding problem.

How to rotate the elements of an N-element array by k spaces? An obvious way is to shuffle it one space, k times, but that's slow, O(N*k). Faster, which I saw about as soon as I thought about performance, is to use a second array B, where B[(i+k)%N] = A[i]. But that takes O(N) extra space. Can you do better?

Yes, as I realized at 5am. For each element A[i], move it to A[(i+k)%N]. O(N) time, O(1) extra space. Can't beat that!

Except some mildly tricky math intervenes: the natural approach only works if N and k are relatively prime. A more general solution is

let g = gcd(N,k)
for i in [0,g)
  for j in [0, N/g)
    shuffle element k spaces

Despite looking like a double loop, it's still O(N), with g*(N/g) = N iterations.
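In Python, the cycle-following version looks something like this (a sketch of the idea, not production code):

```python
from math import gcd

def rotate(a, k):
    """Rotate list a right by k places in place: O(n) time, O(1) extra space."""
    n = len(a)
    if n == 0:
        return a
    k %= n
    # Follow each of gcd(n, k) cycles, carrying one displaced element along.
    for start in range(gcd(n, k)):
        prev, i = a[start], (start + k) % n
        while i != start:
            a[i], prev = prev, a[i]
            i = (i + k) % n
        a[start] = prev
    return a

print(rotate([0, 1, 2, 3, 4, 5], 2))  # [4, 5, 0, 1, 2, 3]
```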

I've also learned that C++ has a gcd function now. std::experimental::gcd, in the header experimental/numeric. C++17 moves it out of experimental but even g++ isn't doing that by default.

The really annoying thing is that this is the sort of solution that comes naturally to me lying in bed, with little conscious effort, but that I'd likely fail to get in a whiteboard session or timed coding test, due to stress dropping my working IQ.

Applying A*

2017-Jan-17, Tuesday 12:55
mindstalk: (Default)
I've played a lot of freeciv and freecol over recent years. Both games let you order a unit to go to some square, and hopefully it takes the fastest route there. The 'world' is a square grid, of various terrain types and associated movement costs -- e.g. plains or desert take 1 move, but mountains take 3; roads and rivers take 1/3 no matter what terrain type they're laid over.

A* is this sweet magic algorithm for finding shortest paths in some graphs efficiently, vs. doing breadth-first search in all directions, but I was having trouble applying it mentally. I was using the common Manhattan distance heuristic, h((0,0),(x,y)) = x+y, and I wasn't getting good results: the algorithm would cheerfully march down a straight plains path to the goal, while ignoring a path that might step away and into mountains, but then ride a river to the goal much faster.

So I backed off, and thought about BFS. I realized that would work better if instead of naive BFS, enqueuing grid squares as you found them, instead you ranked them by total travel time so far. This is basically A* without a heuristic. Instead of exploring all paths N squares away, you'd explore all paths N moves away; it would still be radiating in all sorts of directions, but at least you'd find the shortest path to the goal.

Then I realized I'd been using the wrong heuristic; the right one should be the shortest possible journey. Or as WP says, "it never overestimates the actual cost to get to the nearest goal node." So the heuristic in this application has to consider rivers and roads, such that h() = (x+y)/3, not (x+y). This works much better: the plains march looks less attractive as it advances, converting cheap heuristic moves into actual plains moves, and the "mountain and away" move gets a chance to be considered.

Actually, units and roads can go diagonally, though rivers are rectilinear, so the proper heuristic is h((0,0),(x,y)) = max(x,y)/3.

Actually actually, infantry units only have one move, but are still guaranteed one square of movement per turn, so can march across mountains as easily as across plains; it's mounted units, with e.g. 4 move in freecol, that really care about base terrain type. Also fractional movement can be useful, e.g. I think a unit with 2/3 move left (after moving on river) can still move onto an adjacent plains.
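Putting it together, a toy Python version; the grid, the movement costs, and the 4-connected movement are my simplifications for illustration, since the real freeciv/freecol rules are messier, per the caveats above:

```python
import heapq

# Toy terrain costs: plains, mountains, river.
COST = {'.': 1, '^': 3, '~': 1/3}

def astar(grid, start, goal):
    """Cheapest travel cost on a 4-connected grid of terrain characters."""
    def h(p):
        # Admissible heuristic: the cheapest conceivable step costs 1/3,
        # so divide the Manhattan distance by 3.
        return (abs(p[0] - goal[0]) + abs(p[1] - goal[1])) / 3

    frontier = [(h(start), 0, start)]   # (estimated total, cost so far, cell)
    best = {start: 0}
    while frontier:
        _, g, pos = heapq.heappop(frontier)
        if pos == goal:
            return g
        if g > best.get(pos, float('inf')):
            continue                    # stale entry
        r, c = pos
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < len(grid) and 0 <= nc < len(grid[0]):
                ng = g + COST[grid[nr][nc]]
                if ng < best.get((nr, nc), float('inf')):
                    best[(nr, nc)] = ng
                    heapq.heappush(frontier, (ng + h((nr, nc)), ng, (nr, nc)))
    return None

grid = ["..^~",
        "..^~",
        "..^~",
        "...."]
# Straight across costs 1 + 3 + 1/3; the march through row 3 costs 7.
print(astar(grid, (0, 0), (0, 3)))
```

With h = Manhattan distance (not divided by 3), the same code would still terminate but could return a non-optimal path cost, which is exactly the trap described above.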

booyeah, I'm good!

2016-Dec-31, Saturday 15:01
mindstalk: (squeee)
So some months ago I worked on an AVL tree in pure C. My main goal was to get self-balancing happening at all, to demystify that process, and I did that; getting the code to be robust or even asymptotically appropriate (I'm still calculating heights every time I need to, not caching and updating them) was far down the priority list. So, while I tried to be correct in handling memory, it would not have surprised me if subtle bugs still existed.

My past work experience didn't emphasize use of static analysis tools. I think I played with lint a bit on my own, but relying on it or valgrind wasn't a thing. I finally investigated them today, ending up with cppcheck for source code check, and valgrind for what it does. And ran them on my directory of C/C++ exercises, which as small exercises, could easily have had lots of problems.

...not so much! cppcheck found a few problems, most notably a missing return value I wasn't testing properly in linked list, and a realloc bug[1] in my memoized Fibonacci table, and an unset ID variable in a threading program, but nothing else, including in the AVL tree or the new C++ hash table.

valgrind found some memory leaks in linked list and AVL tree... only because the testing main() for each was creating structures and then not freeing them before exit, because who cares at that point? With a manual destroy added, both check out clean. Which the list really should, it's pretty simple, but I'm stoked about the AVL tree doing so, since it's complex and memory correctness wasn't a top priority. Seems like I do have good habits.

There's also clang's static analyzer, supposedly deeper than cppcheck, though much slower to use. It's happy with my codes too.

Of course, valgrind can only test what's executed, so more use might uncover more bugs, but still, I feel good.

[1] ptr = realloc (ptr, ...) sets ptr to NULL if the realloc fails, which means a memory leak if nothing else is pointing to what ptr had been pointing to. Not having used realloc much, plus also not worrying about errors in a tiny program, this didn't stand out to me.
mindstalk: (Default)
So, I've been working today on cloning my Ocaml chat program in C++. It's going slowly as I work on infrastructure, like exception-throwing wrappers to libc socket functions, and a RAII class for file descriptors.

In the process, I found something. I'd been using inet_pton() on the user-provided server name, forgetting that I have to look it up first. I was getting away with it because testing has been local, a client connecting to localhost. My first code wasn't error-checking inet_pton, but I got around to making the Inet_pton wrapper, which started throwing exceptions. inet_pton is supposed to return 1 on success (with the IP address written into a parameter), and 0 on invalid input. Turns out, given the string "localhost", it returns 0 but *also* writes in the local IP address.

That, or some weirder default was happening... yeah, on going the other way, I find the address (written, or more likely just left over from the usual memset/bzero initialization) is 0.0.0.0, which connects to localhost, just like the well-known 127.0.0.1. Huh. Guess I should look up 0.0.0.0, which looks messy and multi-purpose, but yeah, this seems one potential behavior of it, especially on a DHCP machine. OTOH, neither of the remote servers I have access to will connect to themselves via that address, so it's not reliable.
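The same confusion is easy to demonstrate from Python, which wraps the same calls (a sketch; hostname resolution of course varies by machine):

```python
import socket

# inet_pton only parses numeric addresses; a hostname needs a lookup first.
try:
    socket.inet_pton(socket.AF_INET, "localhost")
    print("parsed?!")
except OSError:
    print("inet_pton rejects the name, matching the C version's return of 0")

# getaddrinfo is the name-to-address step I'd forgotten.
try:
    info = socket.getaddrinfo("localhost", None, socket.AF_INET)
    print(info[0][4][0])  # typically 127.0.0.1
except socket.gaierror:
    print("no resolver available")
```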
mindstalk: (Default)
One thing easy to miss about Python is how it can easily print anything, including lists of dictionaries of lists of objects.  C++, not so much.  So I wrote a simple template function to print a vector of printable types, and then one for deques.

template<class T>
void pr_vector(const vector<T> &v) {
    for (auto i : v)
        cout << i;
    cout << "\n";
}

template<class T>
void pr_deque(const deque<T> &v) {
    for (auto i : v)
        cout << i;
    cout << "\n";
}

Well, lots of duplication there.  I tried templating on the container type too, template <class C, class T>, but the resulting C <T> didn't work; it probably should with some syntax, but I don't know what, yet.

But I figured I'd try auto:

void pr_cont(const auto &v) {
    for (auto i : v)
        cout << i;
    cout << "\n";
}

...that subsumes both templates.  And lots of others.  vector, deque, list...  Of course, the values themselves have to be printable, it can't handle a vector<pair<int, string>> because there's no printing on pairs.  But still, for C, this is amazing!
mindstalk: (Default)
At my last onsite interview I got asked -- out of curiosity, not as interview question -- what was so great about dynamically typed languages. I struggled to explain -- everyone knows why they're cool, right? Code is fast and compact to write! It is known!

But I've been thinking about the question some more. Right now I'm thinking there's one thing they're genuinely better at that doesn't come up that often; for the rest, it's more a historical accident of feature packaging, i.e. what else has traditionally come along with dynamic typing that made coding convenient.

The one thing: dealing with some variable length list of arbitrary values that you deal with at run time. Stuff like printf(), or message-passing into a closure-based object. I think I mentioned building a closure object in ocaml, and you can do it, but you need a sum type for various method inputs, and another for output, and it just seems like more work. Yeah, you get type safety for the work, but it's still more work. Likewise C has long had an ability to make things like printf(), but it feels clunky.
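By "closure-based object" I mean something like this, trivial in Python precisely because the message and its argument list can be anything at run time (a sketch):

```python
def make_account(balance):
    """An 'object' as a closure: one dispatch function, arbitrary messages."""
    def dispatch(msg, *args):
        nonlocal balance
        if msg == "deposit":
            balance += args[0]
            return balance
        elif msg == "balance":
            return balance
        raise ValueError("unknown message: %r" % (msg,))
    return dispatch

acct = make_account(100)
acct("deposit", 50)
print(acct("balance"))  # 150
```

In OCaml the equivalent needs one sum type for the possible messages and another for the possible returns: the extra (if type-safe) work complained about above.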

Also, based on language features, it seems likely that Lisp-style macros work better with dynamic typing, but I can't defend the claim. eval() seems to go with dynamic typing too, not that it's used that much IME.

The other stuff that's tended to come with dynamic typing:

* direct running of code, without a manual compile step. So faster development.
* (Lisp and Python, but not Perl or Ruby AFAIK): a REPL so you don't have to even write code into a file to run it, but can play at a prompt.
* easily making product types (tuples, lists of arbitrary types.) In C you have to declare and define a struct.
* garbage collection or at least reference counting, so little need to worry about memory management. Corollary: if you stick with small or at least short-lived "scripts", then resources like file descriptors or file handles don't have to be worried about much either; that's a domain thing rather than a language thing.
* And what might be the biggest one, a convenient environment:

In C you need to define a main function and include at least one header (like stdio.h) to do anything. Java's entry point is even more verbose. By contrast, a one line print statement is a valid scripting language or Lisp program. Put another way, in C++ terms, simply starting Perl gets you roughly the equivalent of

#include cstdio
#include string
#include regex
#include deque
#include unordered_map
#include unistd
#include sys/socket.h
#include cstdlib

...and probably more. (I left out the angle brackets because Dreamwidth is annoying.)

plus any statements are assumed to be contents of main() and thus executed. Python makes you import a few of the above as modules, though adds , bigints, and complex numbers.

Now, none of this depends on dynamic typing! ocaml and ghci provide REPLs and ocaml can directly run source files. Type inference and compact notation makes creating typed tuples on the fly trivial (on the fly coding, at least; not at runtime). Java's GC feels like half the reason programmers originally ooh-ed over it. And while Perl's default environment is unmatched AFAIK, having strings and some sort of container types and IO (more than C/C++) without import or use statements is common to Java and the ML-ish languages.

But if your experience of languages is something like C++, Java, and Perl/Python, as was not uncommon, then there's a clear bias in code convenience and compactness. And even with the ML-ish languages, with type inference and REPL, there's still nothing that matches Perl for simply doing useful system stuff right away without imports. I see no reason one couldn't exist, it just doesn't. So it's easy to associate dynamic typing with "lets me get stuff done quickly and concisely".

mysteries of TCO

2016-Dec-11, Sunday 14:44
mindstalk: (Default)
So I've learned that gcc/g++ do tail call optimization with the -O2 option; supposedly MS Visual Studio and clang do it when they can as well. I have tested this:


ocaml threads grrr

2016-Dec-07, Wednesday 22:19
mindstalk: (Default)
So for a coding test I had to make a chat client-server pair in ocaml. Done, based on a select loop. Well, mostly done, I didn't address blocking writes. I decided to redo the thing with threads, for the practice. Getting the basic functionality wasn't hard: synchronized queue for communication, simpler buffered socket readers than the first version had, and such.

Getting the threads to quit gracefully, though, that took a lot longer. I think part is due to ocaml's POSIX threads implementation: I'm told that "no one" uses it, they all use LWT (lightweight threads) instead. I can confirm some deficiencies: no Thread.detach functionality, for example, which was a big roadblock. And Thread.kill is in the API documentation, but when I tried to use it, got

Thread 2 killed on uncaught exception Invalid_argument("Thread.kill: not implemented")

Throwing an exception from the worker threads to notify the main program to join() them didn't work either, though I'm told I shouldn't expect that to work anyway. I think it works now, with a fair bit of polling -- not busy wait spinning, but reads timing out every 0.1 seconds to check for changes of state -- but that feels messy. As for whether it's simpler... well, it is fewer lines than the select() version, and should handle large (potentially blocking) messages better, so I guess so. Hmm, well, the core chat module is slightly shorter, but main is longer, and there's an extra SQueue module. Still, being more correct is good.

C++ lambdas

2016-Nov-30, Wednesday 12:42
mindstalk: (Default)
I was catching up on C++11 last night. Learned about lambdas, and started playing around with them. A neat addition. But did they provide full closure capability, like creating a snapshot of state that the function can privately modify? Looked like no, and I was all set to be disappointed.

But at a job interview yesterday, I learned about the mutable keyword, which was described as letting you modify const things, like deep in a const object hierarchy. "Why would you even do that?" was my first reaction, though I can vaguely imagine why, "for the same reason that Haskell's purity annoyed me."

So I remembered that, and figured I would try adding mutable to the lambda. Ka-ching!

#include <vector>
#include <iostream>
#include <algorithm>
using namespace std;

int main() {
    vector<int> v;
    for (int i=1; i<=10; i++)
        v.push_back(i);
    int sum=0;
    for_each(begin(v), end(v), [sum](int n) mutable {sum+=n; cout << sum << endl;});
    cout << sum << endl;
}

Not that you need the external sum here, you could drop it and have [sum=0] for the lambda, but it illustrates the idea. Which isn't obscure, I saw it in the docs I was reading shortly after returning to them, but still, I found this on my own.

I've been applying to a bunch of C++ jobs more because of experience than because of any deep love for the language, but features like this and auto (type inference) and others from C++11/14 make it a lot more appealing.
