Thursday, December 20, 2012

Transferring Data Between R and Excel

For a long time, I assumed that the only way to transfer data between R and Excel was to do something like the following:

write.csv(x=some.data.frame, file="some_file.csv")

and then to open the resulting CSV file in Excel.  I didn't like doing this, because by the end of an analysis I would have a bunch of temporary "deleteme.csv" files cluttering my Desktop.

It turns out there is a better way to move data between R and Excel:

Move Data from R to Excel

First, from R:
write.table(x=some.data.frame, file="clipboard", sep="\t")

This command copies the specified data frame (not sure how well it works with other object types) to the system clipboard.

Then all one needs to do is open a new workbook in Excel and Ctrl-V (paste) the data into the workbook!  Easy as pie!

Move Data from Excel to R

First, in Excel, select the data you would like to copy, then copy it (Ctrl-C).

Then, in R:
data.from.excel <- read.table(file="clipboard", sep="\t")

That's all there is to it.  This will allow you to quickly and easily transfer data between R and Excel, or more generally, between R and any program that can read data from the clipboard.

Wednesday, November 30, 2011

Excel Changes Values and Formatting when Importing CSV

When opening a text-based data format with Excel (CSV, tab separated, etc.) you will find that Excel has a nasty habit of automatically interpreting the data types of the cells.  Once it (thinks it) has determined the proper data type, it will also reformat the data to its default formats for numbers, currency, dates, times, etc.
For example, I work with data that involves a lot of timestamps.  I frequently have timestamp data in a MySQL database that looks like this:
   2011-11-30 13:56:02
This is a typical timestamp format that is recognized by many different programs and programming languages.
Usually I will export data from the database into CSV format and load it into Excel to do some tweaking before I turn it into a visualization.  However, when reading the data from the CSV file, Excel, detecting it as a date/time value, will reformat it according to its default formatting rules, like this:
   11/30/2011 13:02
If I then save the CSV file, Excel will save it out using its default format (MM/dd/yyyy hh:mm) instead of my original format (yyyy-MM-dd hh:mm:ss).
Yes, ladies and gentlemen, Excel just changed my values and corrupted my data.
Here is how to prevent this from happening:
  1. In Windows Explorer rename the file, changing the extension from ".csv" to ".txt".  (If you don't see file extensions, do the following:
    1. Press the Alt key in the Explorer window.  (This will make the menu bar visible.)
    2. In the newly visible menu bar, select Tools -> Folder Options ...
    3. Click on the "View" tab.
    4. UNCHECK the box that says "Hide extensions for known file types."
    5. Click OK.
    6. You should now be able to see the ".csv", ".txt", ".whatever" extensions on the file names.
  2. In Excel, open the newly renamed file.  (You will probably need to change the file type selector to "All Files (*.*)" in order to find your .txt file.)  After clicking "Open", Excel will present you with a window titled "Text Import Wizard - Step 1 of 3".
  3. In Step 1, choose the "Delimited" radio button.  Click "Next".
  4. In Step 2, in the "Delimiters" section, check the "Comma" box (or whatever delimiter your file is using) and make sure that all other delimiter boxes are UNCHECKED.  You should now see your data divided into the appropriate columns at the bottom of this window.  Click "Next".
  5. In Step 3, click on the first column.  Use the scrollbar to scroll over to your rightmost column.  Hold down the Shift key and click on the final column.  All columns should now be selected.
  6. Click on the "Text" radio button in the "column data format" section.  Now click "Finish."
Following these steps, Excel will NOT try to automatically determine a data type, nor will it reformat or change your data in any way.

Wednesday, December 15, 2010

Continuously monitoring open files in real-time

Recently, I wanted to be able to get a list of files that were being opened by a running process.  Searching all over the web, I found a number of solutions, but they all involved using the lsof command.

The lsof command has many, many options, and it allows you to see which files have been opened by a given process. Coming the other direction, it also allows you to see which process has opened a given file. I can see how it would be extremely useful in many different circumstances.

However, my problem was that my process was opening files and closing them almost immediately. In other words, I had no hope of using the lsof command to view open files, because lsof only shows files that are currently open, and by the time lsof would run, the files were already closed again!

I discovered a different way to continuously monitor, in real-time, all of the files that were being opened by a process, regardless of how quickly they were closed:

strace -tt myprog 2> system_calls.txt
grep 'open(' system_calls.txt > opened_files.txt


strace is a command that logs all of the system calls for myprog.  The -tt option includes a timestamp (with microseconds) at the beginning of each line. Each file is opened with a call to "open(", so grepping for this string should give you a list of all files that were opened.

Tuesday, August 31, 2010

Installing Ubuntu 10.04 (Lynx) x86_64 Server on Dell XPS 630i

Recently, I needed to install Ubuntu on a Dell XPS 630i.  There was one irritating problem: the installation cd would consistently freeze just after selecting "Install Ubuntu" from the main menu, leaving me with a blinking white cursor and the inner turmoil that can only be experienced while wondering whether your computer is actually doing anything

I've never had these kinds of problems installing Ubuntu before, and I wasn't really sure where to start troubleshooting.  A number of forum websites with postings similar to my own situation recommended changing some of the install parameters, such as noapic, nolapic, noacpi, etc.

None of this worked.

I finally found this post on a Dell community forum, which ingeniously suggested to:

  1. Install Ubuntu 8.04 (Heron) x86_64 Server

  2. Check for updates in the package manager.  Install all UPDATES (NOT distribution upgrade)

  3. Restart

  4. In a terminal, sudo update-manager --devel-release

  5. Check for updates one more time.  THEN click on the button at the top of the package manager window to install the distribution upgrade to arrive at 10.04 (Lynx).


I never would have thought of that.  I followed the post instructions exactly.  Success!  Everything appears to be working just fine.  Here's to you, jakeman66.

Tuesday, October 13, 2009

Keeping a remote process running after terminal disconnect

Quoting TheOneKEA at http://www.linuxquestions.org/questions/linux-general-1/keeping-a-process-running-after-disconnect-150235/:

nohup is what you want - it's a wrapper that blocks the SIGHUP signal sent to all applications connected to a terminal when that terminal is closed by the shell.

Just ssh into the box and start the command using this syntax:

[user@remoteboxen user]$ nohup /path/to/command arguments &

The man page explains it better.

Friday, September 11, 2009

Rebuilding the VirtualBox Kernel Modules (Ubuntu 9.04)

Any time there is a kernel update, you would do well to rebuild the VirtualBox kernel module to ensure compatibility with your new kernel version. This can be done by executing the following command from the terminal:

sudo /etc/init.d/vboxdrv setup

Thursday, September 10, 2009

Installing Fonts in Linux (Ubuntu 9.04)

First, you can find some good free font downloads at http://www.sostars.com.  I downloaded a stencil font called "Ver Army." I unzipped the file, and found a .ttf font file.

I learned how it install it from this page. Here's a summary:

To install Microsoft Windows fonts: sudo apt-get install ttf-mscorefonts-installer
To install Red Hat Liberation fonts: sudo apt-get install ttf-liberation

To install any other kind of font (including the one I downloaded from sostars.com):

  1. mkdir ~/.fonts (make a font directory in your home directory if one doesn't exist already)

  2. mv ver-army.ttf ~/.fonts (move your ttf file into the .fonts folder)

  3. Restart the computer