Hard Links on Mac OS X in Movies and Pictures

Mac OS X Background

Early versions of Mac OS X did not have hard links, but Apple added them in Leopard (10.5) for use by Time Machine. Initially the functionality was limited, but as of Mavericks (10.9), and perhaps earlier, full hard link functionality is available.

What is a hard link?

In the file system, a filename points to a particular inode, which contains the metadata about the file:

file1 ---> [Inode 317905] ---> [Address, Ref Count = 1]

When you create a hard link, two separate filenames point to the same inode:

file1 ---
        |---> [Inode 317905] ---> [Address, Ref Count = 2]
file2 ---

file1 and file2 are completely symmetric. If you delete file1, file2 will still exist and still point to inode 317905. And if you delete file2, file1 will still exist and still point to inode 317905.

The file system maintains a reference count within each inode. If you delete both file1 and file2, the reference count for the inode drops to 0, and the file system is free to reclaim the space used by the file.

Try the following experiment:

$ echo 'Hello World' > file1     # Create file1

$ ln file1 file2                 # Hard link file2 to file1

$ cat file1                      # Check contents of file1
Hello World

$ cat file2                      # Check contents of file2
Hello World

$ echo 'Goodbye World' > file3

$ ls -li                         # List long format and inodes
total 24
3557449 -rw-r--r--  2 sasha  staff  12 Feb 24 14:20 file1
3557449 -rw-r--r--  2 sasha  staff  12 Feb 24 14:20 file2
3557462 -rw-r--r--  1 sasha  staff  14 Feb 24 14:20 file3

The first number on each line in the directory listing is the inode number. file1 and file2 point to the same inode (3557449), but file3 points to a different inode (3557462). The third column contains the reference count for the inode. Since file1 and file2 point to the same inode, that inode has a reference count of 2. The inode for file3 is pointed to by only one filename, so its reference count is 1.
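You can also check the symmetry claim by continuing the experiment. This is a minimal sketch that works in a scratch directory (the mktemp directory and the variable names are my own choices, not part of the experiment above). It removes one of the two names and confirms that the data is still reachable through the other:

```shell
# Work in a throwaway directory so nothing real is touched.
demo=$(mktemp -d)

echo 'Hello World' > "$demo/file1"
ln "$demo/file1" "$demo/file2"          # two names, one inode

# The link count is the second field of `ls -l` output.
links_before=$(ls -l "$demo/file2" | awk '{print $2}')

rm "$demo/file1"                        # remove one of the two names
links_after=$(ls -l "$demo/file2" | awk '{print $2}')
contents=$(cat "$demo/file2")           # data is still reachable via file2

echo "links before: $links_before, after: $links_after"
echo "file2 still contains: $contents"
```

The link count should drop from 2 to 1, and file2 should still print Hello World.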


Why would you use hard links?

If two different locations need access to the exact same file, you can create a hard link to that file. Both locations can use it. The file contents will always stay in sync between both locations. If the file entry in one location is deleted, the other location still has full access to it. If the file entry in both locations is deleted, the file will be deleted and the space reclaimed.

This is different from symbolic links, which behave much like aliases. In a symbolic link, a filename points to another filename, which then points to an inode:

file2 ---> file1 ---> [Inode 317905] ---> [Address, Ref Count = 1]

In a symbolic link, file1 and file2 are not symmetric. You can delete file2 without causing any problems. But if you delete file1, file2 is still there, pointing to the now missing file1. If you try to access file2, you will get an error: file2: No such file or directory, even though file2 looks like it is there in your directory.
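The dangling link described above is easy to reproduce. This sketch uses a scratch directory (the directory and the state variable are illustrative assumptions) and relies on the standard test operators -L (is a symbolic link) and -e (target exists):

```shell
demo=$(mktemp -d)

echo 'Hello World' > "$demo/file1"
ln -s file1 "$demo/file2"               # file2 stores the name "file1", not an inode

rm "$demo/file1"                        # remove the target

# -L checks the link itself; -e follows the link and checks the target.
if [ -L "$demo/file2" ] && [ ! -e "$demo/file2" ]; then
    state="dangling"
else
    state="ok"
fi
echo "file2 is $state"

cat "$demo/file2" 2>/dev/null || echo "cat failed: target is gone"
```

Try the same steps with ln instead of ln -s, and file2 stays readable after file1 is removed.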

There are two important limitations on hard links:

  • Hard links can only be created within the same file system. For a Mac, that means that you cannot create a hard link between different volumes.
  • Hard links cannot be created to a directory. This prevents loops in the file system structure. Strictly speaking, directories do have hard links: under the hood, the . and .. file system entries are hard links. But users, even superusers, can’t create hard links to a directory.
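The directory limitation can be seen directly from the command line. This is a small sketch in a scratch directory (names are illustrative); it only assumes that ln exits with a non-zero status when it refuses, and does not depend on the exact error text, which differs between systems. The cross-volume limitation is not shown, because it needs a second mounted volume.

```shell
demo=$(mktemp -d)
mkdir "$demo/dir1"

# Attempt to hard link a directory; ln is expected to refuse.
if ln "$demo/dir1" "$demo/dir2" 2>/dev/null; then
    result="created"
else
    result="refused"
fi
echo "hard link to directory: $result"

# A symbolic link to a directory is allowed.
ln -s dir1 "$demo/dir3"
[ -d "$demo/dir3" ] && echo "symbolic link to directory: ok"
```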

Why does the Mac use hard links?

Time Machine

Hard links are perfect for a backup system.

I take a snapshot of your current drive, and save it as Backup-2016-02-24. Then tomorrow, I walk your drive, and check whether each file is new or has changed since yesterday. If it is new, or has changed, I copy it into Backup-2016-02-25. If it has not, I create a hard link in Backup-2016-02-25 to the copy of the file in the previous day’s snapshot. This saves a ton of space.

A year from now, when I’ve run out of room on my backup drive, I delete Backup-2016-02-24. This is completely safe, because if Backup-2016-02-25 has a hard link to a file in Backup-2016-02-24, that file will still be there even after Backup-2016-02-24 is deleted.
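Here is a rough sketch of that idea as a shell function. It is not how Time Machine is actually implemented: the snapshot function, its arguments, and the use of cmp to decide whether a file has changed are all my own assumptions, chosen to keep the example short.

```shell
# snapshot SRC PREV NEW  (hypothetical helper)
# Copy files that are new or changed since PREV; hard link the rest.
# PREV may name a directory that does not exist (first backup).
snapshot() {
    src=$1 prev=$2 new=$3
    find "$src" -type f | while IFS= read -r f; do
        rel=${f#"$src"/}                      # path relative to SRC
        mkdir -p "$(dirname "$new/$rel")"
        if [ -f "$prev/$rel" ] && cmp -s "$f" "$prev/$rel"; then
            ln "$prev/$rel" "$new/$rel"       # unchanged: share the inode
        else
            cp -p "$f" "$new/$rel"            # new or changed: real copy
        fi
    done
}

# Demo in a scratch directory.
demo=$(mktemp -d)
mkdir "$demo/src"
echo 'unchanged' > "$demo/src/a.txt"
echo 'version 1' > "$demo/src/b.txt"

snapshot "$demo/src" "$demo/none" "$demo/Backup-2016-02-24"

echo 'version 2' > "$demo/src/b.txt"          # b.txt changes overnight
snapshot "$demo/src" "$demo/Backup-2016-02-24" "$demo/Backup-2016-02-25"

ls -li "$demo"/Backup-*/*.txt                 # a.txt shares an inode; b.txt does not

rm -rf "$demo/Backup-2016-02-24"              # expire the oldest snapshot
cat "$demo/Backup-2016-02-25/a.txt"           # still readable
```

In the listing, both copies of a.txt should show the same inode number and a link count of 2, while the two versions of b.txt have separate inodes. After the oldest snapshot is removed, a.txt in the newer snapshot is still intact.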

~/Pictures

When Apple replaced iPhoto with Photos in 2015, it changed the library format. But Apple did not want to copy all the media files, because that would consume huge amounts of space. So it created two new directories, Photos Library.photoslibrary and iPhoto Library.migratedphotolibrary, and created hard links between them for all the media files. Every media file can be seen from both directories, every media file is a first-class citizen in both directories, and there is no duplication of media files.

~/Movies

As you create and edit clips in iMovie, hard links are created between files in iMovie Library.imovielibrary and iMovie Events.localized. Sometimes, they are also created from the directories in ~/Movies to the directories in ~/Pictures. Again, it is a space-saving maneuver that allows unnecessary files to be deleted without concern.

Why would I care about these hard links?

In general, you don’t. But if you are migrating your ~/Pictures or ~/Movies directory to a new drive, you may need to think about hard links.
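Before migrating, it can help to know whether a directory contains hard-linked files at all. A minimal sketch: the count_multilinked function is a hypothetical helper of mine, built on the standard find test -links +1 (link count greater than 1), and the demo runs on a scratch directory rather than a real library.

```shell
# count_multilinked DIR  (hypothetical helper)
# Print how many regular files under DIR have a link count greater than 1.
count_multilinked() {
    find "$1" -type f -links +1 | wc -l | tr -d ' '
}

# Demo: two names for one file, plus one ordinary file.
demo=$(mktemp -d)
echo 'clip' > "$demo/original.mov"
ln "$demo/original.mov" "$demo/linked.mov"
echo 'other' > "$demo/single.mov"

count_multilinked "$demo"               # prints 2: both names of the shared inode
```

To check a real library, you would run something like count_multilinked ~/Pictures. A result of 0 means there are no hard links to worry about.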

Apple’s Migration Assistant handles hard links correctly without you having to do anything special. However, if you are copying the pictures yourself by dragging and dropping or using cp, hard links will not be maintained across the copy.

This means that the directories will take up substantially more space after the copy than before and that media files will no longer be linked (changes to one media file will not affect the other). There is a way around this, but it requires the use of rsync.
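The difference is easy to see on a small scale. This sketch compares a plain cp -R with rsync -aH, where -H is rsync's option for preserving hard links. It assumes rsync is installed and supports -H, and skips that step otherwise; the directory and variable names are illustrative.

```shell
demo=$(mktemp -d)
mkdir "$demo/src"
echo 'photo' > "$demo/src/a.jpg"
ln "$demo/src/a.jpg" "$demo/src/b.jpg"        # a.jpg and b.jpg share one inode

# Plain recursive copy: each name becomes an independent file.
cp -R "$demo/src" "$demo/copy-cp"
cp_linked=$(find "$demo/copy-cp" -type f -links +1 | wc -l | tr -d ' ')
echo "names still hard linked after cp -R: $cp_linked"

# rsync with -H (--hard-links) recreates the links at the destination.
if command -v rsync >/dev/null 2>&1; then
    rsync -aH "$demo/src/" "$demo/copy-rsync/"
    rsync_linked=$(find "$demo/copy-rsync" -type f -links +1 | wc -l | tr -d ' ')
    echo "names still hard linked after rsync -aH: $rsync_linked"
fi
```

After cp -R, the two names are separate files taking twice the space. After rsync -aH, they should still share a single inode in the copy.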

More on this next time.
