Wednesday, October 26, 2011

Data-mining in the Uncanny Valley: An e-Commerce Puzzle

Here's a "psychology of data mining puzzle" for you.

I received the following email from an e-commerce site. (Names expunged to protect the innocent and guilty.)


Dear [Customer],

One year ago, you ordered the following product from [e-commerce site].

[So-and-So] Salad Bowl & Servers - Stone - One Size/One Size

We wanted to let you know that right now, your size is still available
from [e-commerce site]. You can order the same product again by visiting:

http://www.example.com/bin/z/ref/oneyear/...html

Or, if you'd like to view our entire [So-and-So] collection, please visit:

http://www.example.com/bin/z/ref/oneyear/...html

If there's anything we can do to improve our service, please don't hesitate to let us know!



Why would the e-commerce site think I'd want to buy the same product again one year later?

What do you think was the rationale for this email campaign?

Here is an important hint, if you want one. Stop reading, if you don't want a hint. (The e-commerce vendor in question has a name that starts with zap and ends with pos, and has a particular area of focus.)

Monday, October 17, 2011

Privacy Bugs in Picasa's Batch Upload

 

Privacy Vulnerability in Batch Upload

I have been using Picasa/PicasaWeb  to share photos. Here is my workflow:

  1. Load all photos from my camera onto my computer, creating a folder for each event/day.
  2. In Picasa, "Star" my favorite photos.
  3. Sync the folder to web, "Starred photos only", and share that folder publicly.

Today, I discovered the Batch Upload feature, and thought it would be a good way to create a private online backup of all my photos. Mistake!

When you use the "Batch Upload" feature in Picasa, you choose Online settings, like Photo Size, Visibility, and Sync On/Off. If you are Uploading a folder that is already partially synced to Web, these settings are ignored.

If you have already chosen to Sync the Folder to Web, the new upload is merged with the old upload. Previous settings are maintained, including visibility. Picasa's Batch Upload will set the entire folder to Public, even if you choose "Private" in the Upload Options!

There is a separate radio button for "Change options". If you are using the Upload to create a private backup, use the "Change options" button to make the album private.

However, doing so will make ALL the photos private, including the Starred Photos that were previously public. So, you will then need to create new albums for photos you want to share.

If you have already shared a folder in the past, you'll need to go back and update blog posts / web links to point to the new albums. 

Effectively Batch Upload gives you two options:

  1. Publicize ALL your photos in a folder,
  2. or Privatize ALL your photos.

There is no way to create an online folder (album) that is part public and part private.

 

If you want to use "Batch Upload" as a Backup feature, you can't use Folder Sync (with Starred Photos only) as a "sharing/publishing" feature.

Pooh!

Tuesday, October 11, 2011

Connecting to Postgres from R on Windows 7 64bit



Here is some quick advice on how to get up and running with R + Postgres on Windows 7 64bit. There are several possibilities, but most have fatal flaws. Here is what worked for me:

Add jvm.dll to your PATH

rJava, the R<->Java bridge, will need jvm.dll, but R will have trouble finding that DLL. It resides in a folder like

C:\Program Files\Java\jdk1.6.0_25\jre\bin\server

or 

C:\Program Files\Java\jre6\jre\bin\client

Wherever yours is, add that directory to your Windows PATH variable. (Windows -> "Path" -> "Edit environment variables for your account" -> PATH -> edit the value.)

You may already have Java on your PATH. If so you should find the client/server directory in the same Java "home" dir as the one already on your PATH.
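
As a quick sanity check, the following minimal sketch lists which directories on a path list actually contain jvm.dll. It assumes you run it from Cygwin bash (where the Windows PATH appears as a colon-separated list of POSIX-style directories) in a shell opened after you edited PATH; the function name find_on_path is only illustrative.

```shell
#!/bin/bash
# Print every directory on a colon-separated path list that contains the
# named file. Returns non-zero if no directory contains it.
# The second argument defaults to the current PATH.
find_on_path() {
  local file=$1 path_list=${2:-$PATH}
  local dir found=1
  local IFS=:   # split the list on colons only, so spaces in names survive
  for dir in $path_list; do
    if [ -n "$dir" ] && [ -e "$dir/$file" ]; then
      echo "$dir"
      found=0
    fi
  done
  return $found
}

# Usage: find_on_path jvm.dll || echo "jvm.dll is not on PATH"
```

If nothing is printed, R will most likely not find the DLL either.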


To be safe, make sure your architectures match. If you have Java in "Program Files", it is 64-bit, so you ought to run R64. If you have Java in "Program Files (x86)", that's 32-bit, so you should use plain 32-bit R.

Re-launch R from the Start Menu


If R is running, quit.

From the Start Menu, start R (RGui or RStudio). This is very important: it makes R pick up your PATH changes.

 

Install rJava 0.9.2.

Earlier versions do not work! Mirrors are not up-to-date, so go to the source at www.rforge.net: http://www.rforge.net/rJava/files/. Note the advice there:
“Please use
install.packages('rJava',,'http://www.rforge.net/')
to install.”
That is almost correct. This actually works:

install.packages('rJava', .libPaths()[1], 'http://www.rforge.net/')

Watch the punctuation! The mysterious “.libPaths()[1],” just tells R to install the package in the primary library directory. For some reason, leaving the value blank doesn’t work, even though it should default.

Install RpgSQL.

install.packages('RpgSQL')

 

Set Classpath. 

Set a classpath to use for pgSQL:

jdbcClasspath = "C:\\Program Files (x86)\\PostgreSQL\\pgJDBC\\postgresql-8.4-702.jdbc4.jar";
Do NOT use .jaddClassPath(). The pgSQL method ignores .jclassPath().

 

Create a DBI driver.

The Examples code in help("pgSQL") has errors (as does most R documentation, sadly). Here is how to set up your driver instance:
myPgSQL = pgSQL(driverClass='org.postgresql.Driver', 
                jdbcClasspath, 
                identifier.quote="\"");

Make a Connection.

connection <- dbConnect(myPgSQL,
  user = "your_username",  
  password = "your_password", 
  dbname = "your_db_name", 
  host = "your_db_host", 
  port = 5432
);

Query!

dbGetQuery(connection, "select datname from pg_database");
dbGetQuery(connection, 
  "SELECT table_name FROM information_schema.tables WHERE table_schema = 'public'");
# dbGetQuery returns a Data Frame, very nice.  
result = dbGetQuery(connection, "select * from markets"); 
View(result);  
head(result, n=1)  # Show just 1 row of data
head(result$name, n=1)  # Show just the first value of the name column
help("pgSQL")    ; # To see more example queries
dbDisconnect(connection); # When you are done.
 
 
Let me know how it goes for you!
 

Sunday, October 9, 2011

Linux virtual desktop: “Remote” audio-video access

This post documents how I set up my Linux desktop system. My particular system runs inside VirtualBox on Windows, but most of the discussion here applies to any Linux productivity desktop.
You may wonder why I am running on VirtualBox. There are a few reasons:
  1. With VirtualBox, I can run a Windows system and a Linux system on the same piece of hardware at the same time.
  2. Windows has much better hardware driver support than Linux, especially around power management.
  3. VirtualBox provides virtual hardware to the Guest Linux OS, and this hardware is much better supported by Linux than real hardware.

Headless Linux

Linux has a client-server architecture that allows us to run a program on one machine that sends its output to another machine. “Output” includes text, and also graphics and sound!

When we run a program without using the video/audio/text console of the machine where the program is running, we say the program is running “headless”. To make use of such a program, we run a video/audio/text server on a different machine (usually the one where we are sitting, which is not where the program is running).

This section explains how to run audio and video servers on Windows, for use with programs on remote Linux systems.

Video Setup: XWin video server

You’ll need Cygwin/X to make graphics work.
Run XWin Server on your local/host machine, using the Shortcut or the command:
C:\Program-Files\Cygwin\bin\run.exe /usr/bin/bash.exe -l -c /usr/bin/startxwin.exe
Set and export DISPLAY (it’s best to do this in ~/.bashrc, so it runs automatically in every shell session):
export DISPLAY=:0

eSound audio server
We need to install a sound server on Windows, to play sounds sent back from the remote Linux system. Esound and Cygwin come to the rescue:

Run Cygwin setup.exe and choose to install “eSound”.

Then, run eSound (in a loop, in case it dies):
while true ; do esd.exe -tcp -public ; sleep 5; done
You’ll probably want to stow this command somewhere that will make it run exactly once each time you boot your Windows machine.
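
One way to get the "exactly once" behavior is a simple lock, sketched below under some assumptions: the lock location and the function names are hypothetical, and the sketch relies only on the fact that mkdir fails when the directory already exists. It wraps the same restart loop shown above.

```shell
#!/bin/bash
# Hypothetical lock location; any writable directory works.
LOCK_DIR="${TMPDIR:-/tmp}/esd-loop.lock"

# mkdir is atomic: it succeeds for exactly one caller and fails if the
# directory already exists, so it can serve as a simple lock.
acquire_lock() {
  mkdir "$1" 2>/dev/null
}

release_lock() {
  rmdir "$1" 2>/dev/null
}

# Start the restart loop only if no other copy holds the lock.
start_esd_loop() {
  if ! acquire_lock "$LOCK_DIR"; then
    echo "esd loop already running" >&2
    return 1
  fi
  trap 'release_lock "$LOCK_DIR"' EXIT
  while true; do
    esd.exe -tcp -public
    sleep 5
  done
}

# Usage: start_esd_loop
```

A lock left behind by a crash or a hard power-off would block the next start, so you may need to remove the lock directory by hand in that case.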

Connecting your Headless system
Use ssh to connect to the Linux machine.
  • Add the -Y option to enable X forwarding, so you can see the GUI of guest OS programs on your local desktop.
  • Add the -R option to set up a tunnel for eSound

ssh -Y -C -R 33001:localhost:16001 linux_user@linux_machine

When you log in to the Linux machine, instruct your session to send eSound output over the network through your tunnel.

export ESPEAKER=127.0.0.1:33001
Run your favorite programs with graphics and audio, like xterm or even firefox. You’ll need to instruct any program you run to send its sound to eSound. Consult the documentation for each individual program.
For example:
ESPEAKER=127.0.0.1:33001 FIREFOX_DSP="esddsp" firefox

(This works using Cygwin/POSIX. You might need to tweak things to run from a Windows command shell.)
Reference: http://www.gbar.dtu.dk/wiki/Remote_Sound#1._Manually_by_SSH

Friday, October 7, 2011

Setting up Linux on Windows with VirtualBox

 

After trying and failing to run Ubuntu as my primary OS (for reasons I am not in the mood to discuss), I relented and installed Windows 7 Professional. (I bought a heavily discounted copy through the Friends and Family program at the Microsoft company store. Never pay retail price.)
 
I use VirtualBox to host a virtual Linux machine inside my Windows Desktop. This setup lets me have a (vaguely) usable Desktop console experience, with sleep/wake and power management, wireless, stable desktop graphics, and all those nice things.

This article describes how I set up VirtualBox to host Ubuntu as a Guest OS running inside Windows. A separate article will discuss how I set up the purely-Linux bits. This article discusses how to install and configure VirtualBox, and how to configure the Linux<->Windows connection.

Install VirtualBox and create a virtual machine.

Install VirtualBox 4.1.2

Create a new Virtual Machine, following the prompts.

I like to separate my large “special” files from the rest of my home directory. This makes backups easier – I can frequently backup my small set of personal files, and backup large media/system files on a different schedule.

So, change the default locations for the Virtual Disk Image (VDI) and VM Snapshot, since these are very large files. I use C:\data.

You’ll need to install an OS on your virtual machine. I use Ubuntu. In VM Settings –> Storage –> IDE Controller, click on the CD icon on the right and choose a virtual CD/DVD (ISO image file) to “insert” into the virtual optical disk drive.

Start your VM, and it will boot from that “virtual CD” and let you install an operating system (called the Guest OS on your PC, distinguished from the Host OS, which is Windows).
Once the Guest OS is installed, I strongly recommend that you install an SSH server as soon as possible, so that you can connect to your virtual machine without using the virtual display, which is memory-intensive and CPU-intensive.

Guest Extensions

Guest Extensions is a package of software that runs inside the Guest OS, to enable special Virtual Box features. Generally, these are “virtual drivers” that let the Guest OS communicate with VirtualBox, enabling great features like fancier mouse/keyboard support, higher/variable “virtual display” resolution, and exposing host system directories as virtual hard drive partitions in the guest.

After installing and booting the OS, install “Guest Extensions”. To do this, pick the option from the Menu at the top of the window in which the Guest OS is running. That will mount a virtual CD. Run the “autorun.sh” shell script (as root) from that virtual CD, and Guest Extensions will install. Guest Extensions fixes GUI resolution, making the Guest OS display size match the window size, which you can change by clicking and dragging on the corner of the window.

After installing Guest Extensions, shut down the Guest OS, so we can finish setting it up.

Extension Pack

Extension Pack is different from Guest Extensions.  Extension Pack is a package of Host software that extends the functionality of VirtualBox. Extension Pack has the non-Open Source code of Virtual Box, which is why it is packaged as an extension.

The most important Extension Pack feature is USB 2.0 filter support. This feature lets your guest OS interact directly with your Host machine’s USB 2.0 devices. Download and install Extension Pack, and then visit Settings –> USB in VirtualBox Manager. Here you can check the Enable USB 2.0 box, and then click on the “USB-plus” icon to choose devices you want to provide to the Guest OS.

One very important USB 2.0 device is our external hard drive, which has a Linux formatted (ext-format) partition that Windows cannot see.

Networking

VirtualBox uses NAT (network address translation) networking by default, which is (probably) the same setup that your home PC uses to connect to the Internet. This is a one-way sort of networking: It’s easy to reach the Internet from the machine, but hard to reach the machine from the Internet. In the case of a virtual machine, your local network is considered part of the Internet. This is probably not what you want.

Most likely, you’ll want to connect to the Guest OS via ssh, or a webserver, or some other network protocols, besides a direct virtual console (keyboard/mouse/display) connection. To do this, you’ll need to set up either Bridged Networking or  Port Forwarding.

If you want to treat your VM like a real machine on your local network, so that you connect to services (SSH, HTTP) running on the machine in the “usual” way, then you’ll want to set up Bridged Networking, which basically instructs VirtualBox to run a virtual switch to direct traffic between the virtual machine (on one side) and the Host machine and the rest of the local network (on the other side).

Bridged Networking
While our virtual machine is powered off, open VirtualBox Manager and navigate to Settings –> Network –> Adapter 1. Change the “Attached to:” option from NAT to Bridged Adapter. Make sure Cable Connected is checked.

Now you are set up for full bidirectional network connectivity on your virtual machine. Have fun!
Also, beware: Your virtual machine is now equally exposed to the Internet as your host machine, so be conscious of security.

In particular, if you follow the (very wise) practice of running a web browser inside a virtual machine, in order to isolate your PC from malicious web sites, then you need to turn off Bridged Networking on any virtual machine that you want to quarantine from the Internet.

Port forwarding
Skip this section if you have chosen Bridged Networking.

I wrote a simple shell script (requires Cygwin to run) to set up Port Forwarding from the Windows desktop to the VirtualBox guest Linux system. This script sets up forwarding from Windows:2222 –> Linux:22 for ssh, Windows:8787 –> Linux:8787 for RStudio, and Windows:8000 –> Linux:80 for http. I hope it’s obvious how you can modify it to set up the services/ports you need.

I named this script virtualbox-port-setup. Run it like so:
  virtualbox-port-setup VM_Name

#!/bin/bash
test "$1" == "" && echo "USAGE: $0 guest_name" && exit 1
guestname=$1
net_path="VBoxInternal/Devices/pcnet/0/LUN#0/Config"

# Print a command, then run it.
function run() {
  echo "$@"
  "$@"
}

# Set (or, when called with only a service name, clear) one forwarding rule.
function forward_port() {
  cmd="VBoxManage setextradata $guestname"
  local service=$1
  local host_port=$2
  local guest_port=$3
  local protocol=$4
  run $cmd "$net_path/$service/HostPort" $host_port
  run $cmd "$net_path/$service/GuestPort" $guest_port
  run $cmd "$net_path/$service/Protocol" $protocol
}

case $2 in
  undo)
    # unset settings (every service set below must be cleared here)
    forward_port ssh
    forward_port rstudio
    forward_port http
    ;;
  *)
    forward_port ssh 2222 22 TCP
    forward_port rstudio 8787 8787 TCP
    forward_port http 8000 80 TCP
    ;;
esac

echo
echo "Updated NAT config:"
run VBoxManage getextradata $guestname enumerate | grep "$net_path"


(Side note: I am using Code Formatter Plugin for Windows Live Writer to insert code snippets into this article.)

You can run this script at any time, but you’ll need to completely close down your Guest OS (I think saving state is fine) in order to complete the setup and activate port forwarding.

Important! Activating port forwarding breaks your virtual machine. (There is a fix.) After closing the virtual machine, open VirtualBox Manager, edit Settings for the VM, navigate to the “Network” section, and change “Adapter Type” from Intel to PCnet-FAST III Am79C973. This change will fix the error “Configuration error: Failed to get the "MAC" value. VBox status code: -2103 (VERR_CFGM_VALUE_NOT_FOUND).”



If you ever decide to remove the NAT, run this script with the “undo” argument, (or else your virtual machine will fail to start up, giving an error about “unknown configuration node”):

   virtualbox-port-setup VM_Name undo

Running Headless

You could launch the  virtual machine now, but here’s a better idea: Run the VM headless. Headless VM has several benefits:


  1. You can save the memory and CPU cost of running a graphical desktop environment inside the Guest OS.
  2. You can use a host-side (Windows) X Server that will let you run each of your graphical guest/Linux programs “rootless” on the Windows desktop, instead of all locked inside a single Linux container window.

To manage your headless guest OS, use these commands:


# start
"C:\Program Files\Oracle\VirtualBox\VBoxHeadless.exe" --comment "<VM name>" --startvm "00000000-0000-0000-0000-000000000000"

# suspend
"C:\Program Files\Oracle\VirtualBox\VBoxManage.exe" controlvm "<VM name>" savestate


You’ll need to change the <VM name> and GUID (long string of digits) to match your VM. An easy way to get this information is to enter VirtualBox Manager, right-click on your VM entry, and choose “Create a Shortcut on Desktop”. Then right-click on the shortcut, choose Properties, and copy-paste the VM specs.

 

Connecting your Headless VM

Remember when I told you to install an ssh server? I hope you followed my advice. We are going to connect to the virtual machine over ssh, and forward graphics and sound back to the local (Host) machine.

Before we connect, we need to prepare the Host system to play audio and video from the guest system.


Simple: Audio Setup: Use your head

The simplest way to play sound is through the “Head” console.

To do this, start your virtual machine in the “normal” way (non-headless), and login to the desktop session.

Then, any programs you run (on the head console or on a headless connection) will send audio to the head console for VirtualBox to play for you.

The idea here is that you can still run rootless GUI windows on your Host system, while playing audio through VirtualBox.

Running the head console loses some of the RAM-savings and CPU-savings of pure headless mode, so you’ll want to configure your virtual machine’s desktop session to be a minimalist lightweight desktop that can still play audio.

If you want to run totally headless, follow the advice for “Headless Linux” in the Linux article in this series.

Now we are ready to connect!

If you are using NAT Port-Forwarding: To connect to your VM, ssh to the forwarded port on the host machine, using the username and password of an account on the guest OS.

If you are using Bridged Networking: To connect to your VM, ssh to the guest machine hostname/IP, using the username and password of an account on the guest OS.


  • Add the -Y option to enable X forwarding, so you can see the GUI of guest OS programs on your local desktop.
  • Add the -R option to set up a tunnel for eSound


Port-forwarded NAT networking (2222 is the ssh port forwarded by the script above):

    ssh -Y -C -R 33001:localhost:16001 guest_os_user@host_machine -p 2222


Bridged networking:

    ssh -Y -C -R 33001:localhost:16001 guest_os_user@guest_machine



(This works using Cygwin/POSIX. You might need to tweak things to run from a Windows command shell.)
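
To keep the two networking cases straight, here is a small sketch that prints the ssh command for either mode. The function name vm_ssh_command and its argument order are hypothetical; the options and tunnel ports are the ones used in this article, and the default port 2222 is the ssh port that the port-forwarding script maps to the guest.

```shell
#!/bin/bash
# Print (do not run) the ssh command for a given networking mode.
# Arguments: mode (nat|bridged), guest user, host to connect to,
# and, for nat only, the forwarded ssh port (default 2222).
vm_ssh_command() {
  local mode=$1 user=$2 host=$3 port=${4:-2222}
  local base="ssh -Y -C -R 33001:localhost:16001 ${user}@${host}"
  case "$mode" in
    nat)     echo "$base -p $port" ;;
    bridged) echo "$base" ;;
    *)       echo "unknown mode: $mode" >&2; return 1 ;;
  esac
}

# Examples with placeholder names:
vm_ssh_command nat guest_os_user host_machine
vm_ssh_command bridged guest_os_user guest_machine
```

In NAT mode you connect to the Host machine on the forwarded port; in bridged mode you connect to the guest directly on the default ssh port.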



Raw hard-drive access


I have a Linux-formatted hard disk from my machine’s brief previous life running Linux. I would like to access that disk directly from Linux. Here is how we do it:

WARNING: Do not attempt to mount any disk/partition that is also mounted by the Host system, especially not the Host boot disk!

WARNING: Some programs, like Nautilus, will auto-detect all partitions on a hard drive, and attempt to mount them if you click on an icon. Clicking the icon for any partition mounted by the Host OS will crash the Guest OS!

Create VMDK files. Here is a Cygwin/bash script that can help with that.

# Assumption: vmdk_dir is where the VMDK files go; set it to suit your system.
vmdk_dir=${vmdk_dir:-.}
# Print a command, then run it.
function echo_run() {
  echo "$@"
  "$@"
}
drive_infos=$@
for drive_info in $drive_infos ; do
  drive_number=$(echo -n "$drive_info" | sed -e 's/:.*$//')
  # Empty unless the argument contains a colon.
  partitions=$(echo -n "$drive_info" | sed -n -e 's/^[^:]*://p')
  if [ "$partitions" == "" ] ; then
    partitions_flag=""
  else
    partitions_flag="-partitions $partitions"
  fi
  vmdk_file=$vmdk_dir/RawDisk-PhysicalDrive${drive_number}-$partitions.vmdk
  echo_run VBoxManage internalcommands createrawvmdk -filename $vmdk_file $partitions_flag -rawdisk '\\.\PhysicalDrive'$drive_number
done

In VirtualBox Manager, Settings –> Storage, add the newly created VMDK files as virtual disks. (Check “Solid State” if appropriate.)



Use “listpartitions” to find the partition numbers on your disks, if you want to expose only non-Windows partitions to Linux. Then you can pass drive specs like “1:3,4” to the script above (for drive 1, partitions 3 and 4):


VBoxManage.exe internalcommands  listpartitions -rawdisk '\\.\PhysicalDrive0'

VBoxManage.exe internalcommands  listpartitions -rawdisk '\\.\PhysicalDrive1'

VBoxManage.exe internalcommands  listpartitions -rawdisk '\\.\PhysicalDrive2'
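
To make the “drive:partitions” format concrete, here is a minimal sketch that splits such a spec with bash parameter expansion. The function names are illustrative and are not part of the script above.

```shell
#!/bin/bash
# Split a "drive[:partitions]" spec into its two parts.

# Everything before the first colon (or the whole spec if there is none).
spec_drive() {
  echo "${1%%:*}"
}

# Everything after the first colon; empty when the spec has no colon,
# which means "expose the whole drive".
spec_partitions() {
  case "$1" in
    *:*) echo "${1#*:}" ;;
    *)   echo "" ;;
  esac
}

for spec in "1:3,4" "2"; do
  echo "drive=$(spec_drive "$spec") partitions=$(spec_partitions "$spec")"
done
```

The first spec names drive 1 with partitions 3 and 4; the second names all of drive 2.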





Guest OS setup

Login to the Guest OS for the rest of this section.

Create mount-point in Guest OS, and set its group to vboxsf, like other shared folder mountpoints:

   mkdir /media/internal-storage-drive

   chgrp vboxsf /media/internal-storage-drive

Run blkid to find the UUIDs for all the drives of interest, which we will need shortly: (Credit: liquidat)

  blkid /dev/sd*

Create entry in /etc/fstab in Guest. Set options appropriately based on Filesystem type. Study this example, and then write your own version to suit your system:


##### /dev/sdb – Internal storage drive

#/dev/sdb1 – Internal SSD

UUID=4957fb72-... /media/internal-storage-drive ext4    user,group,rw              0       2

#/dev/sdb2 – Swap partition we’ll ignore for now.

##### /dev/sdc: External USB drive

# /dev/sdc1: MBR or something

# /dev/sdc1: LABEL="EFI" UUID="70D6-1701" TYPE="vfat"

#noauto, at least until we install hfsplus drivers and everything looks stable
# /dev/sdc2: UUID="aa810eb7-4048-3b0f-9ecf-8a2d6841337b" LABEL="Elements-Mac" TYPE="hfsplus"
LABEL=Elements-Mac /media/external-elements-mac  hfsplus    rw,noauto              0       2


# NOT SAFE to mount ntfs drive that is also mounted by Host OS!
#/dev/sdc3: LABEL="Elements-NT" UUID="52FB51860ACFBBED" TYPE="ntfs"
#LABEL=Elements-NT  /media/external-elements-nt  ntfs    user,group,rw              0       2


# /dev/sdc4: LABEL="Linux" UUID="59a3928e-d38f-4855-837a-31d32562fac9" TYPE="ext4"
UUID=59a3928e-...   /media/external-elements-linux  ext4    user,group,rw              0       2


# /dev/sdb5: UUID="08045a62-85bc-4028-b23e-c05670d6cca0" TYPE="swap"



Make your user a member of the “vboxsf” group, which can mount and modify the mountpoint. (vboxsf is overloaded – I’m using the Shared Folder group for Host disks.) Replace your_username with your Guest OS login:


sudo usermod --append --groups vboxsf your_username

In gparted in the Guest, remove the “Boot” flag (for safety, for now):


sudo gparted

Mount!

 mount /dev/sdb1    or    mount -a

Note: If the user IDs on the Guest OS do not match the user IDs on the hard drive, you’ll have permissions problems. If you have problems, you’ll need to re-number users on the Guest OS (consult OS documentation, /etc/passwd), or change owners of files on the hard drive (chown -R).
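
A quick way to spot such a mismatch is to compare the numeric owner of a path with your own user ID. This is a minimal sketch, assuming GNU stat (as shipped with Ubuntu); the function name owner_matches_me is illustrative.

```shell
#!/bin/bash
# Report whether the numeric owner of a path matches the current user.
# Returns 0 on match, 1 on mismatch, 2 if the path cannot be read.
owner_matches_me() {
  local path=$1
  local owner_uid my_uid
  owner_uid=$(stat -c %u "$path") || return 2
  my_uid=$(id -u)
  if [ "$owner_uid" = "$my_uid" ]; then
    echo "OK: $path is owned by uid $my_uid"
  else
    echo "MISMATCH: $path is owned by uid $owner_uid, you are uid $my_uid"
    return 1
  fi
}

# Usage, with the mount point created earlier:
#   owner_matches_me /media/internal-storage-drive
```

Run it against a few files on the mounted drive before deciding whether to re-number users or change file owners.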

WARNING: Do not attempt to mount any disk/partition that is also mounted by the Host system, especially not the Host boot disk!

WARNING: Some programs, like Nautilus, will auto-detect all partitions on a hard drive, and attempt to mount them if you click on an icon. Clicking the icon for any partition mounted by the Host OS will crash the Guest OS!

References: http://www.sysprobs.com/access-physical-disk-virtualbox-desktop-virtualization-software



USB devices


I thought I needed to set this up, but I was able to access my external USB hard drive using the “raw hard drive access” method above.

Audio

Should work automatically. Audio works for me when I connect to the “head” console of the VM (non-headless mode). When I connect to the VM over ssh and run a program there, sound does not transmit back to my client.

This is a general challenge of running a program on a remote machine, not VirtualBox specific.






To Be Continued…


There is a lot more Linux setup to do, but it’s not VirtualBox specific. We’ll cover that material in another article.

Wednesday, October 5, 2011

Managing Home Videos

We have a Kodak “Flip”-esque video camera for capturing “video snapshots” at home.
The camera comes preloaded with ArcSoft MediaImpression for Kodak. I use MediaImpression’s Import functionality to import videos (and a few stills) from the camera to %USER_PROFILE%/Videos/YYYYMMDD.

I don’t know what to do with them next, but for now, they sit there. (From my cell phone, I upload to my Youtube account. I could do the same for video files on my desktop.)

I delete files from the mobile device every time I import. This is slightly risky, but it avoids the hassle of juggling lots of old content mixed with new content when I use the device. Also, this is a clear strategy that lets me know the status of each item I’ve captured: not yet imported, imported, or deleted.

It is very important to have a backup strategy in place before we start importing media.
MediaImpression Quirk: If a media file is open in another application (maybe you are watching it in QuickTime?), MediaImpression will silently fail to delete the file on import, leaving you perhaps confused why one or two files remain on the camera after import. Be confused no longer!

Monday, October 3, 2011

JPEG quality in Nikon D60 camera


I recently migrated my photos from my old Mac to my new Windows desktop. The migration was a bit of a pain, in part because I had 30GB of photo files to migrate. Most of those files were photos from the past two years.
Looking ahead, this pattern is unsustainable. I need a smaller photo library.

There are a few general approaches to shrinking the library:
  1. Fewer photos
  2. Smaller photo files.
    1. Fewer megapixels
    2. Tighter (JPEG) compression.

I am now employing a mix of methods. This post discusses an approach to Smaller photo files through tighter compression. Ken Rockwell has an excellent article on his excellent site, that should convince you that you do not need "RAW" or "JPEG Fine" or "JPEG Super-fine" files from your DSLR camera. I'll just add that lenses and lighting will do far more to your photo quality than (non-)compression.

All you really need to know is, as Ken says: "For Nikon cameras without the Optimal Image Quality JPG mode (D1X, D70, D50, D100) I use NORMAL JPG."

I have a Nikon D60. (Thanks, Dan and Lucy!)  The Nikon D60 offers four photo storage quality modes. I took some test photos to decode their meaning. Here is what they are called and what they mean:
Name                   | Quality | Bits per pixel | Relative file size   | Subjective quality loss
NEF (RAW)              | 100     |                | 200% – 400% (10 MB)  |
JPEG Fine              | 98      | 4              | 100% (2-4 MB)        | None
JPEG Normal            | 97      | 2              | 70% (1-2 MB)         | None
JPEG Basic             | 75      | 1              | 20% (0.4 - 0.8 MB)   | Trivial
NEF (RAW) + JPEG Basic |         |                | stores two files     |

Conclusion: Unless you are taking photos under perfectly lit conditions with perfect focus, optical error is going to dominate picture quality, not JPEG compression.

Recommendation: Use JPEG Basic (or Normal if you are paranoid) for day-to-day shooting outside of a studio.
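
To see what the choice means for total library size, here is a rough back-of-the-envelope sketch. It uses the midpoints of the per-file size ranges listed above (3 MB, 1.5 MB, 0.6 MB, expressed in KB to stay in integer arithmetic); the photo count of 10000 is a hypothetical figure, not a measurement of my library.

```shell
#!/bin/bash
# Estimate library size in MB from a photo count and a per-photo size in KB
# (1 MB is treated as 1000 KB for this rough estimate).
estimate_mb() {
  local photo_count=$1 kb_per_photo=$2
  echo $(( photo_count * kb_per_photo / 1000 ))
}

photo_count=10000   # hypothetical library size
echo "JPEG Fine:   $(estimate_mb "$photo_count" 3000) MB"
echo "JPEG Normal: $(estimate_mb "$photo_count" 1500) MB"
echo "JPEG Basic:  $(estimate_mb "$photo_count" 600) MB"
```

With these assumed numbers, the same library would take about 30 GB in Fine, 15 GB in Normal, and 6 GB in Basic.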

I will update this post with more sample photos and analysis, some time….

Saturday, October 1, 2011

My Picasa Workflow

Here is how I manage my photos in Picasa.
  • Import photos from camera over USB using Windows Photo Gallery –> Import
  • Choose “Review, organize”
  • Adjust time-slider to auto-split events appropriately.
  • Open Picasa.
  • Select all photos in new folder.
  • Picture –> Batch Edit –> I’m Feeling Lucky
  • Review photos individually.
    • Star photos I want to Sync to Web
    • Sync photos to web.
    • (My Sync settings: 1600px, which is big enough to fill most computer monitors, and saves storage space / bandwidth)

TODO:
Permanently resize photos down from 8 Megapixel “superfine” (98% JPEG ) to a more practical (smaller) file size. I don’t have a good plan for this yet.