I've been an EMC Celerra guy ever since I cut my teeth on it many years ago, and its support for de-duplication (single instancing) left a lot to be desired for me, mainly because it could not work across file systems. When I first started investigating Isilon I had high hopes for de-dupe across the entire array because it didn't have separate file systems. New announcements today finally bring that idea to life. It is certainly not a surprise that OneFS will support de-dupe, but the fact that it allows it across the entire IFS is a huge benefit. Data written to the cluster will be written full size, but in post-processing, objects and files will be de-duped using an 8k block size. EMC is suggesting you'll see a 30% reduction in storage consumption, but your mileage will vary. This is great news for the Isilon TCO – it's already low overhead in raw

We recently turned on FAST on one of our EMC VMAX arrays and I was curious how much data was moving. I didn't have an easy way to see what percentage of data changes from day to day, let alone hour to hour. Getting the current data is actually really easy using symtier -sid <sid> list -vp, but I wanted something to view historical information. So I whipped up a quick Perl script which parses the output of symtier and records it in CSV format for importing into Excel. You can download it here: It's really simple, just run the script and pass it a SID and the path/name of a file to record to, and it does the rest. Like so: [box]mark@shell# ./record.pl 1234 ./tier_usage.csv[/box] A few seconds later you'll see the information in the file you specify. Now for the historical part you'll need to set up a crontab entry which runs
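
For anyone who would rather see the idea than download the script, here is a rough Python sketch of the recording step. It is not the Perl script from this post. The function names (`parse_rows`, `append_csv`, `record`) are made up for illustration, and since the exact column layout of the symtier output isn't shown here, the sketch assumes nothing about it: it splits each non-blank line on whitespace and prefixes a UTC timestamp so repeated runs can be compared later.

```python
import csv
import subprocess
from datetime import datetime, timezone


def parse_rows(output, timestamp):
    """Split each non-blank line on whitespace and prefix the timestamp.

    Assumption: no knowledge of the real symtier column layout is used;
    every whitespace-separated token simply becomes one CSV field.
    """
    rows = []
    for line in output.splitlines():
        fields = line.split()
        if fields:  # skip blank lines
            rows.append([timestamp] + fields)
    return rows


def append_csv(path, rows):
    """Append rows to the CSV file, creating it if it does not exist."""
    with open(path, "a", newline="") as handle:
        csv.writer(handle).writerows(rows)


def record(sid, path):
    """Run the symtier command quoted in the post and log its output."""
    # Assumes the symtier binary is on PATH on the host running this.
    result = subprocess.run(
        ["symtier", "-sid", sid, "list", "-vp"],
        capture_output=True,
        text=True,
        check=True,
    )
    stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
    append_csv(path, parse_rows(result.stdout, stamp))
```

Calling `record("1234", "./tier_usage.csv")` mirrors the example invocation above. Because the file is opened in append mode, each run adds a new timestamped batch of rows, which is what makes the history usable once the call is scheduled.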