kivikakk.ee

Inefficiences and insufficiencies: C++, others

Lately I’ve been game programming. A few things have been coming to mind:

  • The Sapir–Whorf hypothesis is totally true when it comes to programming languages.
  • Dynamically typed languages encourage a whole lot of waste.
  • There’s some sweet spot of expressivity, low level and non-shoot-yourself-in-the-footness in language that’s missing.

I will attempt to expound.

The languages I end up using on a day-to-day basis tend to be higher level. Non-scientific demonstration by way of my GitHub account’s contents at time of writing:

15 Ruby
9 OCaml
6 Go
6 Javascript
4 C
3 C++
2 Clojure
3 Perl
2 Haskell
...

In my professional career, I’ve concentrated on JavaScript, PHP, Erlang, Python and C#. The lowest level of these is, by far, Erlang. Perhaps it’s fairer to say that Erlang keeps me in check, perf-wise, more than any of the others.

So my mind has been a fairly high-level sort of mindset. I’ve made occasional trips to lower-level code, but there hasn’t been much of that lately, particularly as I’ve changed jobs and needed to concentrate solely on work stuff for a while.

Choosing C++ to write a game wasn’t too hard; it’s fairly standard in the industry, and I know it quite well. Bindings to libraries are plentiful, and getting the same codebase compiling on Windows, OS X and Linux is a well-understood problem that’s known to be solvable.

The thing is, C++ makes it abundantly clear when you’re doing something costly. This is something that occurs particularly to me now as I’ve not done this lower-level stuff in a while.

You wrote the copy constructor yourself, so you know exactly how expensive pushing a bunch of objects into a vector is. You chose a vector, and not a list, so you know exactly why you don’t want to call push_front so many times. You’re creating a ostringstream to turn all this junk into a string: it has a cost. Are we putting this on the stack or in the heap? Shall we use reference counting?

You make hundreds of tiny decisions all the time you’re using it; ones which are usually being abstracted away from you in higher level languages. It’s why they’re higher level.

And that’s basically all I have to say on that: the language makes you feel the cost of what you choose to do. Going to use a pointer? Better be sure the object doesn’t get moved around. Maybe I’ll just store an ID to that thing and store lookups in a map. How costly is the hashing function on the map key? You add such a lookup table in Ruby without a moment’s notice; here, you’re forced to consider your decision a moment. Every time you access the data, you’re reminded as well; it’s not something that ever goes out of mind.

Moreover, the ahead-of-time compilation means you can’t do costly runtime checks or casts unless you really want to (e.g. dynamic_cast), but again, the cost of doing so means you’ll never be caught unaware by slowing performance. In many (dynamic) higher level languages, basically every operation is laced with these.

So it’s suited for games programming, because performance is usually pretty important, and the language keeping performance on your mind means it’s not hard to consistently achieve high performance.

But C++’s deficiencies are also well-documented. It’s awful. It’s waiting to trip you up at every turn. After re-reading those talk slides, I figured I’d just port the code to C – until I remembered how much I used std::string, std::vector, std::list, and moreover, enjoyed the type- and memory-safety they all bring. I’m not particularly fond of giving that up and implementing a bunch of containers myself, or using generic containers and throwing away my type checks.

I think I’m after C with templates for structs (and associated functions), but I’m not sure yet. If you think I want C++, you probably need to re-read those notes.

The other solution is to only use as much of C++ as I like, and that’s basically what I do – but the language is still waiting to trip me up, no matter how much I try not to use it.

Time to think a bit about what the issues at hand really are.

Snapchat: not for state secrets

I use Snapchat. It’s an app where you can take a photo or short (< 10 second) video and send it to your friends who use the service; they’ll then be able to see it, once, before it disappears forever.

Ostensibly, the app is for sexting, because there’s no fear that your photo will get spread around (no forwarding/etc.) or retained for longer than you’d like, but it seems like it’s not as much a sexter’s hangout as the media might want you to think.

My circle of friends use it basically as an extension of weird Twitter – most snaps I send and receive are strange angles of weird objects; the completely mundane but somehow therapeutic (7 seconds of the camera pointed outside the window of a tram, pointed at the ground moving below); or just closeups of Curtis Stone’s face, wherever we see him.

Of course, the promise that they won’t get retained is just that: a promise. Since your phone receives this image and shows it to you at some point, it must be downloaded by your phone. If it can be downladed by the phone, it can be downloaded by something else. We decided to find out how.

Read more

Contrigraph: drawing in GitHub

Here’s Contrigraph, a “data visualisation” (?) created by generating commits to match the Contribution Graph Shirt from GitHub.

It was pretty much a hack; first, I read the colours straight off the shirt, producing a big block of data like

0002342
2223322
2323241
2224333
3322122
2242231
...

Then we read that in one digit at a time, work out what day to start on so everything aligns correctly, and how many commits on a given day produce each gradient of colour. The result isn’t pretty:

start = Time.new 2012, 4, 23, 12, 0, 0, 0
tbl = {"0" => 0, "1" => 0, "2" => 1, "3" => 9, "4" => 14, "5" => 23}
3.times do
dbd.each do |n|
tbl[n].times do
`echo 1 >> graafik`
`git commit -a -m 'contrigraph' --date='#{start.to_s[0..9]}T12:00:00'`
end
start += 86400
end
end

Three times so this thing will scroll on for the next few years. Other values for tbl would work too; I just didn’t bother to do anything better. I’ve written clearer code, but that wasn’t really the point either.

I actually screwed this up twice: first I didn’t remember to treat the 0 entries correctly (i.e. I should have skipped those days, whereas I ignored them entirely); second, it seemed like I was getting hit by timezone issues where everything was shifted up a block.

In retrospect, I should have first produced a mini-contributions graph generator (i.e. one that takes a Git repository and produces what GitHub would do), validate that against an existing user/repo, then use that to ensure it’d work the first time. I did a similar thing to ensure I had the data correct, by producing a graph directly from the data:

Newlines and regular expressions

This has come up a few times recently, so a little note:

/./ will match any character—except a newline.

/[\s\S]/, /[\w\W]/ and so forth are the only way to portably match every character including newlines.

what about /./m?

/./m does not match a newline (usually—see below).

The m modifier does not alter . in any way (usually—see below).

m changes what ^ and $ match—normally they’ll match only the start and end of the string, but m alters them to also match the start and end of any line, i.e. immediately after and prior a newline.

Mnemonic: m for “multiline”.

what about /./s?

Yes, that will do what you want—unless you’re using Ruby or JavaScript.

In Ruby, the s modifier is inexplicably unused—in 1.8, it stays on the Regexp object, but in no way affects matching, and in 1.9, it doesn’t even stay on the object, it just disappears. Note that these behaviours are different to using a complete nonsense modifier, like f, which causes a SyntaxError1.

Even more inexplicable, this feature has been rolled into m instead! So in Ruby, you can use /./m to match a newline, and you can also use /^a/m to match an a at the beginning of the string, or after any newline in the string.

In JavaScript, the s modifier is absent entirely, and it’s not rolled into anything else. Use /[\s\S]/.

Mnemonic: s for “single line” (the string is treated as one line, in a twisted sense).

conclusion

Yay, regular expressions. Be sure what you mean.

To match any character including newlines:

  • In JavaScript or any other language lacking the s modifier, use /[\s\S]/.
  • In Ruby, use /./m, but be aware that this also modifies ^ and $. If unsure, use /[\s\S]/.
  • If you have true PCREs, you may safely use /./s.

Note that this is often what you really want—rarely do you want to explicitly match every character except newlines. If you do that on purpose, at least leave a comment to that effect, otherwise your coworkers will just assume you didn’t know what you were doing.

  1. Or just gets thrown away if you use Regexp.new directly.

On real relationships

Writers of all stripes enjoy engaging in the most cynical readings of human behavior because they think it makes them appear hyper-rational. But in fact here is a perfect example of how trying to achieve that makes you irrational. Human emotion is real. It is an observable phenomenon. It observably influences behavior. Therefore to fail to account for it when discussing coupling and relationships is the opposite of cold rationality; it is in fact a failure of empiricism.

L’Hote on Kate Bolick’s “All the Single Ladies”.