Created and maintained by Jordy in collaboration with Connect Magazine

Topic: unix

March 5, 2009
» Using Grep and Find

One of my favorite tools is "grep." That gives away the fact that I spend more time on the command line than many. One of the things I originally loved about OS X was that I could fire up a terminal and use the machine just like Unix (yeah, Linux was a newfangled thing for me).

Recently I complained about always having to look up a certain switch for grep and Weldon Dodd tweeted "if you write a blog post about grep, maybe others will commit the switch to memory too, and when I say others, I mean me." So, here it is.

Grep is an acronym for "global, regular expression, print": three commands you often ran in a row inside ed, a primitive Unix editor (vi is slightly better, emacs is the best, of course). As an aside, knowing some rudimentary ed commands is a good thing for any sysadmin because it's always available, even in single-user mode.

Grep is used to search files. I was using it this morning to search for strings in old email mailboxes to find a Quicktime registration. The fact that grep accepts regular expressions makes it a very powerful tool for finding data in files.

There are only a few flags I commonly use:

  • -H - print the filename of any matching files. If you grep a file glob and get a match, you want to know which file in the glob the match was in
  • -P - use Perl-style regular expressions

There are others; just run grep --help to see them.
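
To make the flags concrete, here is a small sketch that runs grep over a glob of two made-up mailbox files in a scratch directory. The file names and contents are invented for the demonstration. It uses -E (extended regular expressions) rather than -P, since -P is not available in every grep build, and adds -i to ignore case.

```shell
#!/bin/bash
# Build a scratch directory with two hypothetical mailbox files.
workdir=$(mktemp -d)
printf 'From: store@example.com\nQuicktime registration code enclosed\n' > "$workdir/inbox.mbox"
printf 'From: friend@example.com\nLunch on Friday?\n' > "$workdir/sent.mbox"

# -H prefixes each matching line with the name of the file it came from,
# which matters as soon as the shell expands the glob to several files.
matches=$(grep -H "Quicktime" "$workdir"/*.mbox)
echo "$matches"

# A regular expression: -E enables extended syntax, -i ignores case,
# so "Quicktime", "QuickTime" and "Quick time" would all match.
grep -H -i -E "quick ?time" "$workdir"/*.mbox

# Clean up the scratch directory.
rm -rf "$workdir"
```

Only inbox.mbox contains the string, so only that file name shows up in front of the matching line.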

One thing you will commonly want to do is grep through directories. Grep allows you to recurse, but I am in the habit of using it within find to do the same thing:

find . -exec grep -H "Quicktime" {} \;

Why do this instead of using the built-in recursive features? Simply because I know find well, and when I want its powerful filtering features, this makes them readily available. For example, if I knew I wanted to search for "Quicktime" in files in my Documents folder that were created or changed within the last week:

find ~/Documents -ctime -7 -exec grep -H "Quicktime" {} \;
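
As a rough illustration of how find's filters stack up in front of grep, the sketch below builds a scratch tree standing in for a Documents folder (all file names are invented) and restricts the search by type, name, and age. The -type f and -name tests, the use of -mtime, and the "{} +" form of -exec are my additions, not part of the commands above.

```shell
#!/bin/bash
# Scratch tree standing in for ~/Documents (hypothetical file names).
docs=$(mktemp -d)
mkdir -p "$docs/mail/archive"
echo "Quicktime registration: see attached" > "$docs/mail/archive/2008.mbox"
echo "Quicktime mentioned in a note" > "$docs/notes.txt"

# Only regular files named *.mbox that were modified within the last 7 days.
# "{} +" hands many file names to a single grep process instead of
# starting one grep per file, as "{} \;" does.
found=$(find "$docs" -type f -name '*.mbox' -mtime -7 -exec grep -H "Quicktime" {} +)
echo "$found"

# grep's built-in recursion, for comparison: it searches every file
# under the directory and has no way to filter by age.
recursive=$(grep -r "Quicktime" "$docs")
echo "$recursive"

# Clean up the scratch tree.
rm -rf "$docs"
```

The find version reports only the mailbox file, while the recursive grep also reports the note.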

So, why not use Spotlight? I do. But sometimes I want a scalpel rather than a chainsaw, and grep combined with find gives me that.

Tags: unix osx

November 19, 2008
» OS X Leopard Technical Details

Jordan Hubbard, Apple's Director of Engineering of Unix Technologies, spoke at LISA '08 last week. Most people are commenting on the date he gave for the release of Snow Leopard (10.6), the newest version of OS X. I have to admit, I'm ready for some stability improvements, but I was much more intrigued by the details of his talk (PDF).

He spent the bulk of his talk on technical features in Leopard (10.5) that many aren't aware of. He starts with a number of security improvements in Leopard: file quarantine, sandbox, package and code signing, application firewall, parental controls, non-executable (NX) data, and address space layout randomization. I was completely unaware of most of these improvements.

Jordan also talks about the Unix improvements in Leopard. Leopard is fully Unix compliant, but more than that, it includes a number of additions like DTrace, Launchd (complete), ASL (a replacement for syslog), and a read-only version of ZFS (for future compatibility), with a read/write version available. He also talked about Apple's evolving open source strategy.

Last, he talks about improvements coming in OS X that will help developers take better advantage of the multicore chips and sophisticated GPUs that already ship with most Macs. Future kernels will provide better facilities, along with APIs, for managing multi-threaded programs. He says:

Forget everything you thought you knew about multi-threaded programming (and, as it turns out, most developers didn't know much anyway). The kernel is the only one who really knows the right mix of cores and power states to use at any given time - this can't be a pure app-driven decision

I don't know if there's audio or video of the talk available, but it would be very good to hear firsthand.

BTW, anyone know what "LWFLAF" stands for? Jordan uses it as some kind of metric in discussing the various versions of OS X, but I couldn't figure out what it meant.

Tags: osx apple unix security

September 3, 2008
» How to browse securely with SSH and a SOCKS proxy

I was in Moab this weekend with my family and our motel had free wireless Internet. I used SSH and a SOCKS proxy to create a secure tunnel to my iMac at work. This allowed me to browse Gmail and Facebook securely.

Here's a screencast on how to create an SSH tunnel and browse securely in Safari and Firefox:

Here's a full-size video:
How to browse securely with SSH and a SOCKS proxy (full size video)

These are the basic steps on a Mac:
1. Open Terminal. (In your Applications/Utilities folder.)
2. Type "ssh -D 9999 username@example.com", replacing "username" and "example.com" with the actual username and address of your remote machine. The remote machine will need the SSH service, or Remote Login service, turned on.
3. Open System Preferences -> Network -> Advanced tab -> Proxies.
4. Turn on the "SOCKS Proxy" and enter "127.0.0.1" and "9999" in the fields. Click OK and Apply.

Now your Internet connection will be tunneled through a secure connection to your remote machine -- a poor man's VPN.
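
The steps above boil down to one ssh command plus a proxy setting. As a rough sketch, the tunnel can also be opened and checked entirely from Terminal. The REMOTE and PORT values are the placeholders from step 2, the function names are made up for illustration, and the -N and -f options and the curl check are additions of mine.

```shell
#!/bin/bash
# Placeholder values from the steps above; override them in the environment.
REMOTE="${REMOTE:-username@example.com}"
PORT="${PORT:-9999}"

# Open the tunnel: -D starts a local SOCKS proxy on $PORT,
# -N skips running a remote command, and -f sends ssh to the
# background once authentication has succeeded.
open_tunnel() {
  ssh -D "$PORT" -N -f "$REMOTE"
}

# Fetch a URL through the proxy. --socks5-hostname also sends DNS
# lookups through the tunnel instead of resolving names locally.
fetch_via_tunnel() {
  curl --silent --socks5-hostname "127.0.0.1:$PORT" "$1"
}
```

After sourcing the file, something like "open_tunnel && fetch_via_tunnel http://example.com/" would confirm that traffic flows through the remote machine before you point a browser at the proxy.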

July 19, 2007
» The Patriot Act and Customer Service

I. Mac and Linux computers come with a command called “rsync” that makes backup and synchronization easy. Every morning before work I synchronize my dying four-year-old PowerBook to my iMac at work. When I get home, I synchronize back. This way, I get the same mail, documents, and music wherever I am, and if something were to happen to one computer, I’d have a backup. I synchronize over the Internet, but I know a local guy who synchronizes to his iPod so he can physically carry his updates in and out of the office.
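
For anyone who hasn't used rsync, a minimal sketch of that kind of one-way synchronization might look like the following. The function name and the example host are hypothetical, and the option choice is an assumption on my part rather than a record of the exact command I run.

```shell
#!/bin/bash
# sync_dir SRC DEST: make DEST an exact copy of SRC.
#   -a        archive mode (recurse, keep timestamps and permissions)
#   --delete  remove files from DEST that no longer exist in SRC
# DEST may be a local path or a remote one such as
# user@example.com:Documents (hypothetical host); rsync uses SSH
# for remote destinations by default.
sync_dir() {
  rsync -a --delete "$1/" "$2/"
}
```

In the morning that would be something like "sync_dir ~/Documents user@example.com:Documents", with the arguments swapped in the evening. Because --delete removes files on the receiving side, adding -n (dry run) to the rsync command the first time is a sensible precaution.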

II. At work, we’ve begun using a service called rsync.net for backup. We synchronize our files to their service and pay them $1.60 per gigabyte per month. It’s a pretty inexpensive way to do backup, and it’s nice to have the backup offsite. The rsync.net engineers with whom I’ve spoken have been top notch.

For privacy, we actually use a derivative of rsync called “duplicity”, which encrypts our data before storing them at rsync.net. Their website explains how to use duplicity and other encryption techniques, but I thought it was particularly interesting to find they publish a “warrant canary”. Because the Patriot Act allows the service of secret warrants for the search and seizure of data, and criminal penalties for failing to maintain secrecy, rsync.net publishes a weekly declaration that they haven’t been served a warrant:

rsync.net will also make available, weekly, a “warrant canary” in the form of a cryptographically signed message containing the following:

- a declaration that, up to that point, no warrants have been served, nor have any searches or seizures taken place

- a cut and paste headline from a major news source, establishing date

Special note should be taken if these messages ever cease being updated, or are removed from this page.

Source: rsync.net Warrant Canary

If the “canary” dies, you’re supposed to close shop and get out.

I don’t know the legal implications of a warrant canary, but it seems like a particularly unique example of putting the customer first!

April 4, 2007
» Cleaning Up Unwanted Files in Linux

One of my grad students just went to remove some unwanted, automatically created files in his directory and accidentally deleted some things he wanted. I use a script to do clean ups to prevent these kinds of silly errors (which we're all prone to). Here's the script:

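#!/bin/bash
# Move editor backups and other clutter into a holding directory,
# then purge anything that has sat there for more than five days.

trash="$HOME/.rmd"

# Create the holding directory if it doesn't exist.
mkdir -p "$trash"

# Skip the holding directory itself (-prune), then print and move every
# match into it. Note that "-atime +5" applies only to '*.backup'.
find "$HOME" \( -name '.rmd' -prune \) -o \
  \( -name '*~' \
     -o -name ',*' \
     -o -name '#*#' \
     -o -name '*.bak' \
     -o -name '*.backup' -atime +5 \
     -o -name 'core' \
  \) \
  -print -exec mv -f {} "$trash" \;

# Purge old files. On most systems a move updates a file's ctime but not
# its atime, so -ctime +5 approximates "moved here more than five days ago".
find "$trash" -type f -ctime +5 -exec rm -f {} \;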

The script creates a directory called .rmd if it doesn't exist, moves files matching a certain set of patterns to that directory, and finally removes things in that directory that were moved there more than five days ago. It's not perfect--files with the same name are just moved over the top of each other.

I name it "clean" and put it in my personal bin directory. You might add or delete individual line items depending on what kinds of files your programs create. When I was a grad student, disks were expensive and I worked on a system that enforced quotas, so I ran it as a cron job once a day. Now I just run it whenever things look ugly--the same approach I have to dusting.

Building or modifying a script like this can be dangerous since a bug could cause things you care about to be systematically removed. I recommend testing it on an account that doesn't have anything you care about in it before you blindly trust it.
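
One low-risk way to do that testing, sketched here with invented file names in a throwaway directory: run the same find expression with only -print, so it lists what would be moved without touching anything.

```shell
#!/bin/bash
# Dry run: list what the cleanup patterns would match, moving nothing.
# The scratch directory and file names are made up for the demonstration.
scratch=$(mktemp -d)
touch "$scratch/thesis.tex" "$scratch/thesis.tex~" "$scratch/notes.bak" "$scratch/#draft#"
mkdir "$scratch/.rmd"
touch "$scratch/.rmd/old.bak"

# Same expression as the real script, minus "-exec mv": only -print remains.
would_move=$(find "$scratch" \( -name '.rmd' -prune \) -o \
  \( -name '*~' \
     -o -name ',*' \
     -o -name '#*#' \
     -o -name '*.bak' \
     -o -name '*.backup' -atime +5 \
     -o -name 'core' \
  \) \
  -print | sort)
echo "$would_move"

# Clean up the scratch directory.
rm -rf "$scratch"
```

The listing should contain the three clutter files but neither thesis.tex nor anything already inside .rmd. Once the output looks right, put the -exec clause back.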

One last thing: I used Linux in the title, but this will obviously work in anything with bash and find including varieties of Unix and OS X. These days I'm running it on OS X rather than Ultrix or 4.3BSD. Not all versions of find have a "prune" option.

Tags: unix linux sysadmin programming osx

January 19, 2007
» Building Living Software

Steve Yegge rants, in reference to software design, that crap is still crap, no matter how many rubies you swallowed. If software design interests you, then you'll enjoy this--even if you don't agree.

As I was reading this, I was reminded several times of Scott Rosenberg's article on Charles Simonyi, Anything You Can Do, I Can Do Meta. Simonyi, who was the force behind Office at Microsoft and arguably the richest programmer in the world, is hot on the trail of a programming methodology he calls "intentional programming" and has a company, Intentional Software, to develop it.

The basic point is described in the article by means of a fable. The bottom line: don't build systems, rather build systems that build systems. Steve's not saying the same thing exactly, but there's a similarity of purpose, if not execution.

Tags: software programming design unix emacs