So you want to learn to program

Last week my friend asked me how she could learn to program. She’s a lawyer now, but she took one C++ programming class in college years ago.

I started down some long path: “Python is a nice beginner language. But you’ll want the most current Python version. You use a Mac, so you should get Homebrew, so you can download the latest Python. Then you’ll need to install pip so you can get new Python packages. And you’ll do better in an IDE, so you should probably get PyCharm from JetBrains.”

What, am I insane? She doesn’t want to do all this. So I backed up: “What are you interested in about learning to program?”

Continue reading “So you want to learn to program”

The perils of `rm -rf`

I came across this question on Stack Exchange about how to recover from an accidental `rm -rf /*`.

The questioner says he runs a small web-hosting service and accidentally deleted all the files on his servers, along with all the files on his backup drives. He asks how to recover. (The answer: it’s difficult.)

This made me smile, because I, like everyone who programs for a living, have an `rm -rf` story too.

Continue reading “The perils of `rm -rf`”

Writing Solid Code

I came across Writing Solid Code by Steve Maguire (Microsoft Press, 1993) at work. My boss had a shelf full of ’90s software texts, and this one caught my eye while I was waiting for a meeting to start.

It’s fantastic, and all programmers should read it.

It’s written by an early Microsoft programmer, with experience writing for Excel. The focus of the book is C, but the ideas are applicable to almost any language. Here are some of them.

Procedures/methods/functions should do one thing well.

They should avoid special cases.

“If you pass in a pointer as the third parameter, the function will fill in the data structure it points to. But if you pass in 0 as the third parameter, the function will allocate memory and then fill in the data structure.” This type of intricate behavior allows subtle bugs or memory leaks to sneak in.
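To make the point concrete, here is a minimal sketch in Python rather than the book's C. The function and field names are hypothetical, and `None` stands in for the null pointer in the quoted example. The first function hides a special case behind a sentinel value; the two after it each do one thing.

```python
from typing import Optional


def fill_record_special(name: str, record: Optional[dict] = None) -> dict:
    """The style being warned about: passing None silently switches
    the function from "fill in" to "allocate, then fill in"."""
    if record is None:  # special case hidden behind a sentinel value
        record = {}
    record["name"] = name
    return record


def new_record() -> dict:
    """Does one thing: create an empty record."""
    return {}


def fill_record(record: dict, name: str) -> None:
    """Does one thing: fill in a record the caller already owns."""
    record["name"] = name
```

With the split version, the caller always knows who created the record, so there is no second code path to forget about.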

Continue reading “Writing Solid Code”

Entropy at Black Hat 2015

At Black Hat USA 2015, I gave a briefing on entropy use and management in the Linux kernel, along with Bruce Potter, CTO of KEYW Corporation.

You need random numbers to do many things on modern computers. For instance, all the cryptography that secures your web transactions is powered by random numbers. This cryptography means that only Amazon can use your credit card details and that only you can tell your bank to transfer money out of your account. But random numbers are hard to come by on a computer. Computers are, by nature, pretty deterministic machines.

You can generate mostly random numbers by measuring temperature very accurately and looking only at the last decimal place, which fluctuates a lot. Or by measuring the RF radiation passing by and, again, looking only at the last decimal place. Your computer generates mostly random numbers by looking at the time at which various things happen – you press a key on the keyboard, a packet arrives on the network – and looking only at the last decimal place of the time at which that event happened.
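As a rough illustration of the "last decimal place" idea, here is a small Python sketch. It is not how the Linux kernel gathers entropy, and it times loop iterations rather than key presses or network packets. The function name is my own. On a clock with coarse resolution the last digit may barely vary, and nothing here is suitable for cryptography; for real work use `os.urandom` or the `secrets` module.

```python
import time


def timestamp_last_digits(count: int) -> list:
    """Collect the last decimal digit of `count` successive
    nanosecond timestamps. Illustration only, not secure."""
    digits = []
    for _ in range(count):
        now_ns = time.perf_counter_ns()  # high-resolution clock, in nanoseconds
        digits.append(now_ns % 10)       # keep only the last decimal digit
    return digits


if __name__ == "__main__":
    print(timestamp_last_digits(20))
```

The high-order digits of the timestamp are entirely predictable. Only the lowest digit has any chance of looking random, which is why the sketch throws everything else away.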

Continue reading “Entropy at Black Hat 2015”

Understanding Hard Links

First signs of a problem

I copied my Movies directory from my old Mac to my new Mac, using target disk mode. On my old Mac, the Movies directory took up 84G. But on my new Mac, it took up 149G. What was going on?

My Movies directory contained hard links, which I wrote about last time.

The Movies directory contained 65G of files that were hard-linked to other files also within the Movies directory. When I copied them the usual way (by drag and drop, or cp), the hard linked files were copied one time for each hard link. So tons of duplication, tons of wasted space.
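The gap between the two totals can be sketched in Python by adding up a directory twice: once per file name, which is roughly what a naive copy produces, and once per inode, which is what the original disk actually stores. This is a sketch under assumptions: the function name is hypothetical, and it uses `st_size` (apparent size) rather than allocated blocks, so the numbers will not match `du` exactly.

```python
import os
import stat
from typing import Tuple


def directory_sizes(root: str) -> Tuple[int, int]:
    """Return (bytes counted once per name, bytes counted once per inode)."""
    per_name = 0
    per_inode = 0
    seen = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            st = os.lstat(os.path.join(dirpath, name))
            if not stat.S_ISREG(st.st_mode):
                continue  # skip symlinks and other non-regular files
            per_name += st.st_size
            key = (st.st_dev, st.st_ino)  # identifies the underlying file
            if key not in seen:
                seen.add(key)
                per_inode += st.st_size
    return per_name, per_inode
```

If the two numbers differ, the difference is the space that a name-by-name copy would duplicate.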

Investigating hard links

The command du, which calculates disk usage of a directory, is useful for understanding what hard links you have and where.
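For finding out which names are hard-linked to each other, a short Python sketch can complement `du`. It groups paths by device and inode number and reports only groups with more than one name inside the directory. The function name is my own invention, and the approach assumes a filesystem that reports meaningful inode numbers.

```python
import os
from collections import defaultdict


def hard_link_groups(root: str) -> list:
    """Return lists of paths under `root` that share an inode."""
    groups = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)
            if st.st_nlink > 1:  # more than one name points at this inode
                groups[(st.st_dev, st.st_ino)].append(path)
    # Keep only inodes with at least two names inside `root` itself.
    return [sorted(paths) for paths in groups.values() if len(paths) > 1]


if __name__ == "__main__":
    for group in hard_link_groups("."):
        print(group)
```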

Continue reading “Understanding Hard Links”

Hard Links on Mac OS X in Movies and Pictures

Mac OS X Background

Early versions of Mac OS X did not have hard links, but Apple added them in Leopard 10.5 for use in Time Machine. Initially, the functionality was crippled, but as of Mavericks 10.9 (and perhaps earlier), full hard link functionality is available.

What is a hard link?

In the file system, a filename points to a particular inode, which contains the metadata about the file:

file1 ---> [Inode 317905] ---> [Address, Ref Count = 1]

When you create a hard link, two separate file names point to the same inode:

file1 ---
        |---> [Inode 317905] ---> [Address, Ref Count = 2]
file2 ---

file1 and file2 are completely symmetric. If you delete file1, file2 will still exist and still point to inode 317905. And if you delete file2, file1 will still exist and still point to inode 317905.

The file system maintains a reference count within each inode. If you delete both file1 and file2, the reference count for the inode drops to 0, and the space used by the file is marked as available for reuse.
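The symmetry and the reference count are easy to watch from Python. This is a minimal sketch that works in a temporary directory and assumes a filesystem that supports hard links; `hard_link_demo` is a made-up name. `os.link` creates the second name, and `st_nlink` is the inode's reference count.

```python
import os
import tempfile


def hard_link_demo() -> tuple:
    """Return (link counts at each step, same-inode flag, data read via file2)."""
    counts = []
    with tempfile.TemporaryDirectory() as tmp:
        file1 = os.path.join(tmp, "file1")
        file2 = os.path.join(tmp, "file2")
        with open(file1, "w") as f:
            f.write("hello")
        counts.append(os.stat(file1).st_nlink)  # one name so far

        os.link(file1, file2)                   # second name, same inode
        counts.append(os.stat(file1).st_nlink)
        same_inode = os.stat(file1).st_ino == os.stat(file2).st_ino

        os.remove(file1)                        # drop one name
        with open(file2) as f:
            data = f.read()                     # contents still reachable
        counts.append(os.stat(file2).st_nlink)
    return counts, same_inode, data


if __name__ == "__main__":
    print(hard_link_demo())
```

On a typical Unix filesystem, the count should go from 1 to 2 and back to 1, with the data still readable through file2 after file1 is gone.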

Continue reading “Hard Links on Mac OS X in Movies and Pictures”