knowledge stockpile: 2011

Saturday, December 31, 2011

Drawing a straight line with D3.js

In this post you will see how to draw lines using D3.js. For those who have not heard of D3.js, it is a powerful JavaScript library for creating beautiful, scalable, static or interactive drawings, figures, plots, graphs, diagrams, etc. in the browser. I have just started learning this library, and I also have little experience with JavaScript, so I thought writing such tutorials would make for good notes and a good opportunity to learn both. The code below may not be the best way to achieve the results; nevertheless, if you are new to these things, it should help you get started. I will start with the very basics.

We will first draw a simple line using D3 and then modify some of its properties, such as width and opacity. Note that we are actually using D3.js to generate the SVG code that draws the line. If you don't know what SVG is, don't worry. We will pick it up as we go along.

Create the basic HTML file that will contain the code. Note the DIV container with id="D3line"; we will add the line to this DIV container. Also, look at the head of the file: it links to the D3 repository to load the D3.js script, so you will need to be connected to the internet for the following to work.
```
 <!DOCTYPE html>
 <html>
 <head>
     <title>Lines using D3.js</title>
     <script type="text/javascript" src="http://mbostock.github.com/d3/d3.js"></script>
 </head> 
 <body>   
     <div id="D3line"></div>
 </body>
 </html>
```
Now add the following D3 script after the <div id="D3line"></div> line. Here we select the DIV container D3line, add an SVG element to it, and set its height and width attributes. We assign the SVG element to a variable that we name lineGraph.
```
 <script type="text/javascript">
 /*Select the DIV container "D3line" and add an SVG element to it*/
 var lineGraph = d3.select("#D3line")
     .append("svg:svg")
     .attr("width", 500)   
     .attr("height", 200); 
 </script>
```
Now add an SVG line element to the SVG element that we added above, by adding the following script after the .attr("height", 200); line. As you can see, you need to specify four attributes to add a line: the x and y coordinates of the starting point (x1, y1) and of the end point (x2, y2). You also need to specify the color of the line, which is the value of the stroke style property.
```
 // To draw a line use the "svg:line" element.
 // "svg:line" element requires 4 attributes (x1, y1, x2, and y2)
 // (x1,y1) are coordinates of the starting point. 
 // (x2,y2) are coordinates of the end point.
 // You also need to specify the stroke color.
 var myLine = lineGraph.append("svg:line")
     .attr("x1", 40)
     .attr("y1", 50)
     .attr("x2", 450)
     .attr("y2", 150)
     .style("stroke", "rgb(6,120,155)");
 
```
Open the HTML file that you have so far in the browser and you should see a line such as in the following figure. Note that the origin in SVG is at the top left hand corner.

There, you are done! Before I wrap up this post, I will show how to change a couple of properties of this line. We will first increase the thickness of the line (using the stroke-width property) and then make it a little transparent (reduce its opacity) using the stroke-opacity property.

To increase the line width add:

  // Increase the line thickness/width
  myLine.style("stroke-width", 24);

This gives:

Finally, to make the line transparent add the following lines to the script.

 // Make the line a little transparent (decrease opacity)
 myLine.style("stroke-opacity", 0.6);

This gives:

Wednesday, December 28, 2011

Creating a sequence of dates in R

To create a sequence of dates in R, you can use the as.Date() function along with the seq() function. The following are some examples. While your needs may be different, these should help you get started.

1. Creating a sequence of dates by specifying the start date and number of days

2. Creating a sequence of dates by specifying the start date and an end date

Thursday, December 8, 2011

Changing the Python version used by TextMate

The Python version that TextMate uses when you press Command-R to run a Python program may be different from the version that is used when you run the program from the command line. This post explains how to make TextMate use the latter Python version, if you prefer to do so.

1. Go to the terminal and, at the prompt, type which python and press Enter.
2. Copy the path that you get as a result.
3. Go to TextMate and go to Preferences → Advanced → Shell variables.
4. See if a variable TM_PYTHON is listed here. If so, change its value to the path you copied in Step 2. If not, press + to add a variable. Type TM_PYTHON for the variable name and the copied path for Value.

Command-R in TextMate should now use the same Python version that is used when you run the program from the command line.
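
To confirm that the change took effect, a small script such as the following sketch can be run both with Command-R and from the command line; the two runs should report the same interpreter. The function name interpreter_info is only illustrative, and the print() calls are written with a single argument so that they should behave the same under Python 2 and Python 3.

```python
import sys


def interpreter_info():
    """Return the path and the version number of the running interpreter."""
    return sys.executable, sys.version.split()[0]


path, version = interpreter_info()
print("Interpreter: {0}".format(path))
print("Version:     {0}".format(version))
```

If the two runs print different paths, the TM_PYTHON variable is probably not being picked up.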

Friday, November 18, 2011

How to include HTML code snippets in an HTML file?

To have a quick reference at hand, today I was writing a note on how to include software code in my blog posts. I was listing things such as which HTML tags to use and how to use them. To make myself comfortable with HTML, I thought of writing the note as an HTML file. HTML provides the <pre>. . .</pre> tags for including verbatim text in your HTML file. However, I quickly realized that any HTML code placed inside <pre>. . .</pre> is not treated as verbatim text; the browser still interprets it as HTML markup. I searched online and, after some time, found the following solution.

Instead of using < and > to enclose HTML tags in the code snippet that you want to display, use the character entities &lt; and &gt;. For example, to display the <b>. . .</b> tags that you would use to get bold text, type:

    &lt;b&gt; . . .&lt;/b&gt;
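
If you have many snippets to convert, the replacement can be automated. The following is a minimal sketch, assuming Python 3 and the escape function from its standard html module; it is only one possible way to do the conversion, and the snippet variable is just an example input.

```python
from html import escape

snippet = "<b> . . .</b>"

# escape() replaces &, < and > (and, by default, quotes) with character entities
escaped = escape(snippet)
print(escaped)  # &lt;b&gt; . . .&lt;/b&gt;
```

The printed text can then be pasted between the <pre>. . .</pre> tags.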

Wednesday, November 16, 2011

Look up the LaTeX Command for a Symbol

Today I came across a post on John D. Cook's blog that mentions a really neat online utility called Detexify. You can use the utility to find the LaTeX command of a symbol by drawing the symbol with your mouse.

The following shows a screenshot from the Detexify website. I drew the symbol in the box on the left; you can see the results of that drawing on the right.

Wednesday, January 12, 2011

Keep a "Comment and Markup" tool selected in Adobe Acrobat

Every once in a while I forget how to keep a commenting tool selected in Adobe Acrobat 9.0. And, when I do, I cannot quickly find the instructions on the web. Here are the instructions (I got them from the Adobe website):

1. Select the tool you want to use (but don’t use it yet).
2. Choose View > Toolbars > Properties Bar.
3. Select Keep Tool Selected.

Sunday, January 9, 2011

String formatting in Python

You have written a piece of code in your favorite language, Python. The code does some fancy (or simple) calculations. Now you want to output the results of these calculations in a format that is easier and more pleasant for humans to read. Keep reading to find out about some options available in Python (version 2.6 and above) for basic formatting of the output produced using print statements. Specifically, below I discuss:

how to set the width of the column in which the output is printed;
how to align the output (left, center or right alignment, and aligning digits so that positive and negative numbers start in the same column);
how to set the precision of floating-point numbers;
some other examples that illustrate output of numbers with thousands-separator commas, output of numbers as percentages, etc.

Not all possibilities are discussed below. For more information, see http://www.python.org/dev/peps/pep-3101/

Suppose you have a list of tuples called lang_info where each tuple has the name of the language, the year it was developed (taken from this Wikipedia article) and TIOBE rating. You want simple text output of this information in the form of a table.


lang_info =  [('Fortran', 1954, 0.435), ('Cobol', 1959, 0.391),
              ('C', 1972, 16.076), ('C++', 1980, 9.014), 
              ('Python', 1991, 6.482), 
              ('Java', 1995, 17.99), ('C#', 2001, 6.687)]

If you use the code:

print "Language Year Developed TIOBE rating"
print "--------------------------------------"
for element in lang_info:
    print element[0], element[1], element[2]

you will get the output (call this output_1):

Language Year Developed TIOBE rating
--------------------------------------
Fortran 1954 0.435
Cobol 1959 0.391
C 1972 16.076
C++ 1980 9.014
Python 1991 6.482
Java 1995 17.99
C# 2001 6.687

This is not very easy or pleasant to read. Let us format it so that we get the output:

Language      Year Developed      TIOBE rating
----------------------------------------------
Fortran            1954                   0.43
Cobol              1959                   0.39
C                  1972                  16.08
C++                1980                   9.01
Python             1991                   6.48
Java               1995                  17.99
C#                 2001                   6.69

The first step is to replace the words and numbers (such as "Language", "Year Developed"), and the variables (element[0], element[1]) that we want to format by what are known as replacement fields.

So instead of:

print "Language Year Developed TIOBE rating"
print "--------------------------------------"
for element in lang_info:
    print element[0], element[1], element[2]

we have,

print "{0} {1} {2}".format("Language", "Year Developed",
                            "TIOBE rating")
print "-"*46
for element in lang_info:
    print "{0} {1} {2}".format(element[0], element[1], element[2])

The curly braces mark the replacement fields. The numbers within them are called field names and specify the position of the argument to the .format method that will replace that field. For example, {1} will be replaced by the argument at position 1 (positions are counted from 0). Running the above code gives essentially the same output as output_1 (only the line of dashes is longer), because we have not applied any formatting yet. Next we type what are known as 'format specifiers' in the replacement fields. These are separated from the field name by a colon (:). The general form of the format specifier is:

[[fill]align][sign][#][0][minimumwidth][.precision][type]

We will discuss only some of these flags (specifically align, sign, minimumwidth, .precision and type), and only some of the possible values for each.

First, note that all flags are optional. The following are some caveats:

If you specify two or more flags, they must appear in the same order as in the general form of the format specifier.
If you want to use fill, you must also give a value for the align flag.
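
A short sketch of these points, using made-up values: field names select arguments by position, a fill character is accepted only together with an align flag, and several flags appear in the order given by the general form. The print() calls use a single argument so that the code should run unchanged under Python 2.6+ and Python 3.

```python
# Field names pick arguments by position, so they can be reordered.
swapped = "{1} was developed in {0}".format(1991, "Python")

# A fill character ('*') must be followed by an align flag ('^').
padded = "{0:*^13}".format("C++")

# Flags in order: align '>', sign '+', minimumwidth 8, precision .2, type f.
ordered = "{0:>+8.2f}".format(3.14159)

print(swapped)  # Python was developed in 1991
print(padded)   # *****C++*****
print(ordered)  #    +3.14
```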

Let us first set the column width. The minimumwidth flag allows us to do this. It is an integer which specifies the width of the column. The following code:

print "{0:12} {1:16} {2:16}".format("Language", "Year Developed", 
                                     "TIOBE rating")
print "-"*46
for element in lang_info:
    print "{0:12} {1:16} {2:16}".format(element[0], element[1], 
                                       element[2])

gives the following output:

Language     Year Developed   TIOBE rating    
----------------------------------------------
Fortran                  1954            0.435
Cobol                    1959            0.391
C                        1972           16.076
C++                      1980            9.014
Python                   1991            6.482
Java                     1995            17.99
C#                       2001            6.687

This is already looking better, but there is still room for improvement. The numbers are right aligned (the default for numbers), whereas the column titles are left aligned (the default for strings), so the titles of the second and third columns do not line up with their contents. We will use the align flag to fix this. Some possible values for the align flag are:

'^' -  for center alignment
'<' -  for left alignment 
'>' -  for right alignment
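
As a quick illustration of the three values (the word Java and the width of 10 are arbitrary choices), the brackets in this sketch make the padding visible:

```python
aligned = []
for flag in ("<", "^", ">"):
    # Build a specifier such as "[{0:<10}]" for each align flag
    spec = "[{0:" + flag + "10}]"
    aligned.append(spec.format("Java"))

for line in aligned:
    print(line)
# [Java      ]
# [   Java   ]
# [      Java]
```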

Applying the left alignment flag to the year column:

print "{0:12} {1:16} {2:16}".format("Language", "Year Developed", 
                                    "TIOBE rating")
print "-"*46
for element in lang_info:
    print "{0:12} {1:<16} {2:16}".format(element[0], element[1], 
                                         element[2])

gives the following output.

Language     Year Developed   TIOBE rating    
----------------------------------------------
Fortran      1954                        0.435
Cobol        1959                        0.391
C            1972                       16.076
C++          1980                        9.014
Python       1991                        6.482
Java         1995                        17.99
C#           2001                        6.687

Still not great. Let us center align the header of the second column, right align the header of the third column and center align the contents of the second column (the year column).

print "{0:<12} {1:^16} {2:>16}".format("Language", 
                                       "Year Developed", 
                                       "TIOBE rating")
print "-"*46
for element in lang_info:
    print "{0:12} {1:^16} {2:16}".format(element[0], element[1], 
                                         element[2])

This piece of code gives:

Language      Year Developed      TIOBE rating
----------------------------------------------
Fortran            1954                  0.435
Cobol              1959                  0.391
C                  1972                 16.076
C++                1980                  9.014
Python             1991                  6.482
Java               1995                  17.99
C#                 2001                  6.687

The last column does not look great because the numbers have different numbers of digits after the decimal point. We will use the .precision flag and the type flag to fix this:

precision is a whole number that specifies the number of digits you want to display after the decimal point.

type takes many different values (see http://www.python.org/dev/peps/pep-3101/ for all different values that this flag takes). Here, we use f for fixed point number.

Hence, using

print "{0:<12} {1:^16} {2:>16}".format("Language", 
                                        "Year Developed", 
                                        "TIOBE rating")
print "-"*46
for element in lang_info:
    print "{0:12} {1:^16} {2:16.2f}".format(element[0], 
                                            element[1], 
                                            element[2])

we have,

Language      Year Developed      TIOBE rating
----------------------------------------------
Fortran            1954                   0.43
Cobol              1959                   0.39
C                  1972                  16.08
C++                1980                   9.01
Python             1991                   6.48
Java               1995                  17.99
C#                 2001                   6.69

That looks much nicer and is much easier to understand.
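
The sign flag from the list at the top of this post is not needed for the table above, since all ratings are positive. As a minimal sketch with made-up values: a space as the sign flag leaves a blank where the minus sign would go, so that the digits of positive and negative numbers start in the same column, while + always prints the sign.

```python
values = [3.5, -3.5, 12.25]

# ' ' reserves a leading blank for positive numbers; '+' always shows the sign
with_space = ["{0: .2f}".format(v) for v in values]
with_plus = ["{0:+.2f}".format(v) for v in values]

for s, p in zip(with_space, with_plus):
    print("{0}   {1}".format(s, p))
```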

Here is another example, this time using the value n for the type flag, which formats a number with the separators of the current locale.

import locale

# Setting the locale to US English
locale.setlocale(locale.LC_ALL, 'en_US.UTF-8')

print "Number without formatting applied: 65739838"
print "Number with formatting applied: {0:n}".format(65739838)

The output of this code is:

Number without formatting applied: 65739838
Number with formatting applied: 65,739,838

I think the second is far easier to read than the first.
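
For the thousands separators and percentages mentioned at the top of this post, two more options are worth a brief sketch. Note the assumptions: the , flag is available only from Python 2.7 (and 3.1) onwards, so it goes beyond the 2.6 baseline of this post, and unlike n it does not depend on the locale settings of your machine. The numbers are made up for illustration.

```python
# ',' inserts a comma as thousands separator, independent of the locale
population = "{0:,}".format(65739838)

# '%' multiplies by 100, formats as fixed point and appends a percent sign
share = "{0:.1%}".format(0.256)

print(population)  # 65,739,838
print(share)       # 25.6%
```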

Changing the Color Scheme in Emacs: the color-theme extension

If you are new to Emacs, as I am, you are probably a little bothered by how it looks out of the box. Those of us who are just learning to use Emacs in this day and age have most probably worked with other editors (such as TextMate) that provide more eye candy right out of the box. There is no reason to be disappointed. As I have mentioned in earlier posts, Emacs is infinitely extensible, and there are great color themes available to make Emacs more beautiful and more pleasing to your eyes.

For modifying the color scheme of Emacs you can use the wonderful color-theme extension. I recently installed it to work with Emacs 23.2 on Mac OS X. I have explained the steps below. These may work for other versions of Emacs.

Go to http://download.savannah.gnu.org/releases/color-theme/ and download color-theme-6.6.0.zip to your computer.

For the purpose of discussion, suppose you save this zip file in your ~/Downloads folder.
Unzip the downloaded zip file. This will result in a folder named color-theme-6.6.0 within your ~/Downloads folder (or wherever you saved the color-theme-6.6.0.zip file).

You may prefer to keep the color-theme-6.6.0 extension folder where you saved the color-theme-6.6.0.zip file, or you may prefer to move it to some other place on your computer where you keep downloaded Emacs extensions. If you want to move it to another place on your computer, ensure that you move the entire color-theme-6.6.0 folder. (I read this tip on some forum and had planned to link from here. Unfortunately, I deleted it and cannot find it again.)
Add the following lines to your Emacs init file (~/.emacs file).
```
(add-to-list 'load-path "~/Downloads/color-theme-6.6.0")
(require 'color-theme)
(eval-after-load "color-theme"
  '(progn
     (color-theme-initialize)
     (color-theme-dark-blue2)))
```
Note that ~/Downloads/color-theme-6.6.0 is the location of the color-theme-6.6.0 folder; if you moved the folder elsewhere, use that path instead. Also, dark-blue2 in the last line of the above code snippet is the name of one of the many available color themes. You can replace it with the name of the theme you like (see below for how to find theme names).
Close Emacs and start it again and enjoy the beautiful colors.

If you want to see the other themes that came with the extension that you downloaded, you can use M-x color-theme-select. This will open a new buffer with a list of available color themes. You can change to a theme in this list by moving the cursor to that theme in the buffer and pressing Enter.

If you want to see what different themes look like before changing yours, you can also go here: http://color-theme-select.heroku.com/#color-theme-light