2010/11/12

Apple to drop Xserve

(For reference, Apple's Xserve Transition Guide.)

This has been written about by some good folks, and they cover the downsides well. However:

Apple's target market is the end user. Overwhelmingly.

As such, the server market has always forced Apple to straddle the world of the end user and that of the IT department - though clearly there are still some markets where an Apple server is a great fit, such as organizations with no IT department.

However, the target for the Xserve in particular has moved - in large part, to the cloud, such as AWS (Amazon Web Services).

And the cloud is built, in large part, with commodity hardware - not Apple's business at all. The cloud is not so much about serious, high-quality equipment as it is about lots of cheap gear configured so that failure is fully expected: just toss the failed box (while something else automatically takes over its load) and slot in another.

That's the hardware side, which is all Apple plans to drop (in January).

On to software:

Apple explicitly will continue to develop Mac OS X Server, and will continue to offer server configurations based on its other machines; a Mac mini with Mac OS X Server is a very nice server - and fits Apple's target market much better. It's no Xserve - but many of us have no need for all the Xserve does, nor any need to pay for it.

Though this transition will cause some pain for those of us who find great utility in the Xserve, this is a great move for Apple and aligns its resources where they fit best.

Oh - and by the way, here's a left-field idea:

Apple is in a far stronger position now than when it first started selling Mac OS X Server - back when tightly tying it to Apple's hardware made great sense. And Apple's very firm (full-price) policy on licensing even virtualized instances of Mac OS X Server, while rather painful, made a certain amount of sense too.

Now? The landscape is significantly different. A much more solid business case can be made that Apple could license Mac OS X Server on commodity hardware - including in the cloud.

A stretch, yes. But within reach.

2010/09/06

Strong passwords

Microsoft has a page to check your password, which rates its strength.

Good idea - to help people choose stronger passwords - however:

  • I certainly wouldn't want to tell anyone my password, especially over the Internet; I'll keep it between me and whatever service I need a password for. (Yes, the bottom of the page notes: "The password you enter is checked and validated on your computer. It is not sent over the Internet." However, I'd have to read the code to confirm that - and to confirm that it hasn't changed since the last time I read it. An unlikely exercise for the intended audience.)
  • I certainly wouldn't want to tell Microsoft in particular; regardless of whether you trust them, they're a big target - so why take the chance of becoming "collateral damage"?
  • The page has little intelligence behind it; here are some example strength ratings:
    • Medium: qwertyuiopas (12 characters - straight across the keyboard)
    • Strong: abcdefghijklmn (the first 14 letters of the alphabet)
    • Strong: 12345678912345678912 (20 digits, in order)
    • BEST: 28 of any letter.
Whoops.
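
If you want an automated opinion anyway, you can get one locally, so the password never leaves your machine. A minimal sketch, assuming the cracklib package (and its cracklib-check tool) is installed:

    # cracklib-check reads one candidate password per line on stdin, and
    # prints "OK" or the reason it considers the candidate weak.
    # Nothing is sent anywhere.
    echo 'qwertyuiopas' | cracklib-check

No guarantee it catches everything either - but it is dictionary- and pattern-aware, and the code is yours to inspect.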

2010/06/28

Recipes for tm-diff (Time Machine difference)

A few recipes for using the tm-diff.sh script that I posted earlier (see link for details):

  • Ignore errors/warnings (ex: "Permission denied"):
    tm-diff.sh folder1 folder2 2> /dev/null

  • Find all files, regardless of their privileges/permissions:
    sudo tm-diff.sh folder1 folder2

  • Display disk usage, in "human-readable" sizes (ex: KB, MB, GB...):
    tm-diff.sh folder1 folder2 | while IFS= read -r f; do du -h "$f"; done

  • Display disk usage, with a grand total at the end (sizes in KB):
    tm-diff.sh folder1 folder2 | while IFS= read -r f; do du -k "$f"; done | awk '{ sum += $1; print; } END { print sum; }'

  • Display the largest 50 files (sizes in MB):
    tm-diff.sh folder1 folder2 | while IFS= read -r f; do du -m "$f"; done | sort -rn | head -50

  • Display the smallest 100 files (sizes in KB):
    tm-diff.sh folder1 folder2 | while IFS= read -r f; do du -k "$f"; done | sort -n | head -100

  • Display a full listing (with mode/privileges, owner, group...):
    tm-diff.sh folder1 folder2 | while IFS= read -r f; do ls -lh "$f"; done


Notes:
  1. This presumes you've set your PATH so it'll find the tm-diff.sh script. (Or use its full path.)

  2. In the commands above, "folder1" and "folder2" of course represent the paths to the two Time Machine backup folders to compare; they'll look something like:
    /Volumes/VOLUMENAME/Backups.backupdb/COMPUTERNAME/2009-07-20-091153

  3. The script explicitly searches for regular files only; no directories or other special files.

  4. I use a read loop (with IFS= and the -r flag), so each full path comes through intact - including "special" characters like spaces, double-quotes, and backslashes.

  5. Sizes are in the classic style ("as God intended"): KB = 1024 bytes; MB = 1024 KB; GB = 1024 MB ...

  6. Most Time Machine backups contain tens of thousands of files, so these will take a while. The recipes that sort display nothing until every file has been found and sorted - so they will seem to take even longer.
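
Bonus recipe: the ten largest new files, with sizes converted from KB to MB. (A sketch using nothing beyond the tools already used above; "folder1" and "folder2" are placeholders, as usual.)

    tm-diff.sh folder1 folder2 2> /dev/null \
      | while IFS= read -r f; do du -k "$f"; done \
      | sort -rn \
      | head -10 \
      | awk '{ sz = $1; sub(/^[0-9]+[ \t]+/, ""); printf "%10.1f MB  %s\n", sz/1024, $0 }'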

2010/06/27

Show differences between two Time Machine backups

Below is a quick Bash script to display the difference between two Time Machine backups.

The output is simply the pathnames of the files that were added to Time Machine in that time period. This may of course be piped to another command/script to provide further info - say, the amount of space used.

Cheat: Provide a simple date (formatted the same way that Time Machine does) instead of the "older" backup pathname, and it'll work whether or not there was a backup then. (The order of the arguments is unimportant; the script will figure it out.)
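
For example (hypothetical volume and machine names):

    tm-diff.sh /Volumes/Backup/Backups.backupdb/MyMac/2010-05-30-103000 /Volumes/Backup/Backups.backupdb/MyMac/2010-06-27-085512

Or, using the cheat, with a bare date (no backup need exist then) as the "older" argument:

    tm-diff.sh 2010-06-01-000000 /Volumes/Backup/Backups.backupdb/MyMac/2010-06-27-085512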

#!/bin/bash

LF=$'\n'
usage="Usage: $(basename $0) Time-Machine-folder-1 Time-Machine-folder-2"
usage=$usage$LF"(Compare the two Time Machine backup folders;"
usage=$usage"find and display files that were backed up in that time period.)"

function tweakDate { # convert a TM folder name (YYYY-MM-DD-HHMMSS) to "YYYY-MM-DD HH:MM:SS", a format find's -newermt understands
echo "$1" | awk -v FS="" '{print $1$2$3$4$5$6$7$8$9$10" "$12$13":"$14$15":"$16$17}'
}

if [ $# -ne 2 ] ; then
{
echo "$0 requires two arguments; the Time Machine folders to compare." >&2
echo "$usage" >&2
exit 1
}
fi

# quote throughout: Time Machine volume names often contain spaces
baseName1=$(basename "$1")
baseName2=$(basename "$2")

# NB: operator must be escaped or will signify redirection
if [ "$baseName1" \< "$baseName2" ] ; then
{
olderFldr=$1
recentFldr=$2
olderFldrName=$baseName1
recentFldrName=$baseName2
}
else
{
reverse=TRUE
olderFldr=$2
recentFldr=$1
olderFldrName=$baseName2
recentFldrName=$baseName1
}
fi

if [ ! -d "$olderFldr" ] ; then
{
echo -n "Warning: $olderFldr doesn't seem to exist;" >&2
echo " maybe you're just using it to specify a date?..." >&2
}
fi

olderFldrParent=$(dirname "$olderFldr")
recentFldrParent=$(dirname "$recentFldr")

if [ "$olderFldrParent" != "$recentFldrParent" ] ; then
{
echo -n "Warning: These folders are not in the same parent folder;" >&2
echo " this may not be what you want..." >&2
}
fi

olderFldrDate=$(tweakDate "$olderFldrName")
recentFldrDate=$(tweakDate "$recentFldrName")

if [ ! -d "$recentFldr" ] ; then
{
# tail -2 | head -1 skips the trailing "Latest" symlink, taking the newest dated backup
latestBackup="${recentFldrParent}/$(ls -1 "$recentFldrParent" | tail -2 | head -1)"

echo "Can't find $recentFldr; searching the latest backup" >&2
echo " ($latestBackup)" >&2
echo " for files modified between $olderFldrDate and $recentFldrDate." >&2
echo " This may take awhile..." >&2
echo $LF >&2

find -x "$latestBackup" -type f -newermt "$olderFldrDate" \! -newermt "$recentFldrDate" -print
}
else
{
echo "Now searching for backups in $recentFldrName" >&2
echo " that are more recent than $olderFldrDate." >&2
echo " This may take awhile..." >&2
echo $LF >&2

find -x "$recentFldr" -type f -newermt "$olderFldrDate" -print
}
fi

# eof

2010/02/23

Cloudera Desktop setup tip: DNS, DNS, DNS

Yes, DNS is important with Cloudera Desktop, just like with everything else - try to fudge domain names and you'll get bitten.

Strange errors like:

"An unknown error occurred: hdfs put returned bad code: 255 stderr: 10/02/18 18:41:52 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:8020... Bad connection to FS. command aborted."

That one appeared when attempting to upload a file - even though I knew localhost was indeed responding properly on that port, because it worked fine from the command line.

Solution: Spend the few minutes to determine what the real domain names are (*1) and set up the config files (*2) properly.

*1: ifconfig will tell you the IP address of the node you're on; host will tell you the domain name. If there is none, see *3 below.
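
A minimal sketch (interface name, address, and hostname are placeholders):

    ifconfig eth0            # note the inet (IP) address of this node
    host 10.1.2.3            # reverse lookup: prints the node's real domain name
    host node1.example.com   # the forward lookup should give the same address back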

*2: config files of potential interest:
/usr/share/cloudera-desktop/conf/cloudera-desktop.ini
/etc/hadoop/conf/masters
/etc/hadoop/conf/slaves
/etc/hadoop/conf/core-site.xml
/etc/hadoop/conf/hdfs-site.xml
/etc/hadoop/conf/mapred-site.xml
/etc/hadoop/conf/hadoop-metrics.properties
/etc/hadoop/conf/hadoop-env.sh
/etc/hadoop/conf/configuration.xsl
/etc/hadoop/conf/fair-scheduler.xml
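
For example, the error above points straight at core-site.xml: the filesystem URI must name the real NameNode host, not localhost. A sketch, with a hypothetical hostname (port 8020 is the HDFS default, matching the error above):

    <!-- /etc/hadoop/conf/core-site.xml -->
    <property>
      <name>fs.default.name</name>
      <value>hdfs://namenode.example.com:8020</value>
    </property>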

*3: No domain name? See whoever's in charge of DNS, to fix it. That you? Well, you can either do it right (a little effort up front, pays big in the long run...) or you can try to handle it via /etc/hosts - good luck with that though.
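
For the record, the /etc/hosts route looks like this (hypothetical address and names) - and every node in the cluster needs matching entries, which is exactly why it doesn't scale:

    10.1.2.3    node1.example.com    node1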

Note to self: Don't shortcut DNS!