Getting Started

Install and set up software

To work through the techniques in this book, you will need to download and install some freely available software. As much as possible, we've tried to make everything compatible with Linux, Mac and Windows PCs. We assume that the majority of our readers will be using Windows, so we got a Windows XP version working first, then a Mac version and finally a Linux version. We'd be happy to include instructions for specific platforms, especially if you want to send them to us. We've also included peer feedback and commentary on the discussion page. If you run into trouble with our instructions or find something that doesn't work on your platform, please let us know. Since this is very much a work-in-progress, we will occasionally add comments and mark provisional material in purple.

Linux instructions

Mac instructions

Windows instructions

"Hello world" in Python

It is traditional to begin programming in a new environment by trying to create a program that says "hello world" and terminates. In keeping with our polyglot approach, we will do this in a number of different ways using a few different programming languages.

The languages that we will be using are all interpreted. This means that there is a special computer program (known as an interpreter) that knows how to follow instructions written in the language. One way to use the interpreter is to store all of your instructions in a file, and then run the interpreter on the file. A file that contains programming language instructions is known as a program. The interpreter will execute each of the instructions that you gave it in your program and then stop. Let's try this.

In Komodo, create a new file, enter the following two-line program, and save it as hello-world.py.

# hello-world.py
print 'hello world'

You should then be able to double-click the "Run Python" button that you created in the previous step to execute your program. If all went well, it should look something like this:

"Hello World" in Python on Windows

Notice that the output of your program was printed to the "Command Output" pane.

Interacting with a Python shell

Another way to interact with an interpreter is to use what is known as a shell. You can type in a statement and press the Enter key, and the interpreter will respond to your command. Using a shell is a great way to test statements to make sure that they do what you think they should.

Linux instructions

The Linux instructions are much the same as for the Mac. Go to Applications (at the upper left of the toolbar) -> Accessories -> Terminal, then type "python" into the window that opens.

Terminal in Linux

Mac instructions

You can run a Python interpreter by opening the Finder, double-clicking Applications->Utilities->Terminal, and then typing "python" into the window that opens on your screen. At the Python interpreter prompt, type

print 'hello world'

and press Enter. The computer will respond with

hello world

When we want to represent an interaction with the shell, we will use -> to indicate the shell's response to your command, as shown below:

print 'hello world'

-> hello world

On your screen, it will look more like this:

Python Shell in Mac Terminal

Windows instructions

You can get access to a Python shell by double-clicking on C:\Python25\python.exe. A new window will open on your screen. In the shell window, type

print 'hello world'

and press Enter. The computer will respond with

hello world

When we want to represent an interaction with the shell, we will use -> to indicate the shell's response to your command, as shown below:

print 'hello world'

-> hello world

On your screen, it will look like this:

Python Shell in Windows

The reason that we will be using Python for many of our programming tasks is that it is very high-level. It is possible, in other words, to write short programs that accomplish a lot. The shorter the program, the more likely it is for the whole thing to fit on one screen, and the easier it is to keep track of all of it in your mind.
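
As a rough illustration of what "high-level" means in practice, the sketch below counts how often each word occurs in a short passage. The file name word-count.py and the sample sentence are made up for this example, and the code uses the parenthesized form of print so that it should run under both older and newer versions of Python.

```python
# word-count.py (hypothetical example, not part of the lesson's own code)
# Count how often each word occurs in a short, made-up passage.
text = 'the cat saw the dog and the dog saw the cat'

counts = {}
for word in text.split():
    # add one to the running total for this word, starting from zero
    counts[word] = counts.get(word, 0) + 1

# list the words from most to least frequent, breaking ties alphabetically
ranked = sorted(counts, key=lambda w: (-counts[w], w))
for word in ranked:
    print(word + ' ' + str(counts[word]))
```

The whole program fits comfortably on one screen, yet it splits the text into words, tallies them and prints a ranked list.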

"Hello world" in JavaScript

A second programming language that we will be using is JavaScript. Like Python, JavaScript is an interpreted language. One of the things that makes JavaScript special is that the browser is a JavaScript interpreter. So it is possible to write programs that control the behavior of your browser. In fact, that is what Zotero is, a program written (mostly) in JavaScript that adds some powerful functionality to Firefox.

Being able to program the browser makes it possible to do many interesting things, but it also introduces some important limitations. Imagine if someone else were able to use JavaScript to program your browser so that it erased all of the files on your hard drive. Not good. For this reason, JavaScript running in the browser has no mechanisms for creating, opening, or deleting files on your computer. The language also prevents information from being exchanged outside of well-defined and fairly limited boundaries.

Hence our polyglot approach. For some tasks, we will want to use Python, for others, JavaScript. Sometimes we will mix code from both languages to get the best results. Most of the work that we do at the beginning will be in Python, however.
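
To make the contrast concrete, here is a minimal sketch of the kind of file handling that Python allows and that JavaScript in the browser does not. The file name notes.txt is hypothetical, and the file is created in a temporary directory so that nothing else on your disk is touched.

```python
# file-demo.py (hypothetical example)
import os
import tempfile

# work inside a fresh temporary directory; notes.txt is a made-up name
folder = tempfile.mkdtemp()
path = os.path.join(folder, 'notes.txt')

# create the file and write one line to it
f = open(path, 'w')
f.write('hello world\n')
f.close()

# open the file again and read the line back
f = open(path)
contents = f.read()
f.close()
print(contents.strip())

# delete the file and the temporary directory, then confirm the file is gone
os.remove(path)
os.rmdir(folder)
still_there = os.path.exists(path)
print(still_there)
```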

In Firefox, choose Tools->Extension Developer->Javascript Shell. A window should open on your screen. In that window, type the following statement and press Enter.

print("hello world");

If all went well, it should look something like this:

JavaScript Shell

Viewing HTML files

When you are working with online sources, much of the time you will be using files that have been marked up with HTML (HyperText Markup Language). Your browser already knows how to interpret HTML, which is handy for human readers. Most browsers also let you see the HTML source for any page that you visit. The two images below show a typical web page (the History News Network) and the HTML source used to generate that page, which you can see with the View->Page Source command in Firefox.

When you're working in the browser, you typically don't want or need to see the source for a web page. If you are writing a page of your own, however, it can be very useful to see how other people accomplished a particular effect. You will also study HTML source as you write programs to manipulate web pages or automatically extract information from them.

History News Network Web Page
HTML Source for HNN Web Page

(To learn more about HTML, you may find it useful at this point to work through the W3 Schools HTML tutorial. Detailed knowledge of HTML isn't necessary to continue reading, but any time that you spend learning HTML will be amply rewarded in your work as a digital historian.)

"Hello World" in HTML

HTML consists of text and tags, which typically indicate the beginning and ending of particular elements. Suppose you are formatting a bibliographic entry and you want to indicate the title of a work by italicizing it. In HTML you use em tags ("em" stands for emphasis). So part of your HTML file might look like this:

... in Cohen and Rosenzweig's <em>Digital History</em>, for example ...
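
Because tags mark where an element begins and ends, a program can use them to find things. As a small preview, the sketch below pulls the emphasized title out of the sample line with a regular expression. The file name find-titles.py is made up, and a simple pattern like this is only an assumption that works for plain, un-nested em tags; it is not a general way to read HTML.

```python
# find-titles.py (hypothetical example)
import re

# the sample line from the text above
html = "... in Cohen and Rosenzweig's <em>Digital History</em>, for example ..."

# capture whatever sits between an opening <em> tag and the next closing </em> tag
titles = re.findall(r'<em>(.*?)</em>', html)
print(titles)
```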

The simplest HTML file consists of tags which indicate the beginning and end of the whole document, and tags which identify a head and a body within that document. Information about the file usually goes into the head, whereas information that will be displayed on the screen usually goes into the body.

<html>
<head></head>
<body>Hello World!</body>
</html>

You can try creating some HTML code. Go to Komodo, and choose File->New. Copy the code below into the editor. The first line tells the browser what kind of file it is. The html tag has the lang attribute (for language) set to en (for English). The title tag in the head of the HTML document contains material that is usually displayed in the top bar of a window when the page is being viewed, and in Firefox tabs.

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html lang="en">
<head>
<title><!-- Insert your title here --></title>
</head>
<body>
<!-- Insert your content here -->
</body>
</html>

Change both

<!-- Insert your title here -->

and

<!-- Insert your content here -->

to

Hello World!

Save the file as hello-world.html. Now go to Firefox and choose File->New Tab and then File->Open File. Choose hello-world.html. Your message should appear in the browser.
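
The same page could also be produced by a program rather than typed by hand. The sketch below is one possible way to do that in Python. The function name make_page and the file name make-page.py are hypothetical, and writing the result to a temporary directory is an assumption of this example; you would normally save it wherever you keep your own work.

```python
# make-page.py (hypothetical example)
import os
import tempfile

# the HTML skeleton from the text, with %s marking the two places to fill in
TEMPLATE = """<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html lang="en">
<head>
<title>%s</title>
</head>
<body>
%s
</body>
</html>
"""

def make_page(title, content):
    # fill the two placeholders in the template with a title and body content
    return TEMPLATE % (title, content)

page = make_page('Hello World!', 'Hello World!')

# save the page under the file name used in the text
path = os.path.join(tempfile.mkdtemp(), 'hello-world.html')
f = open(path, 'w')
f.write(page)
f.close()
print(path)
```

Opening the file at the printed path with File->Open File in Firefox should show the same message as the hand-written version.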

"Hello World" in embedded JavaScript

Remember that we said that your browser already knows how to interpret both HTML and JavaScript. In fact, it also understands when you mix the two, as long as you tell it what you are doing. We are going to make extensive use of this capability later on, so let's see how it works.

If you want to include JavaScript within HTML, you use the script tag to tell the browser that you are doing so. You can then embed the script right in the body of your HTML file like this:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html lang="en">
<head>
<title>Hello World! Script</title>
</head>
<body>
<script type="text/javascript">
document.write("Hello World!");
</script>
</body>
</html>

Create a new empty HTML file in Komodo and modify the title and body to match the example above. Save it as hello-world-js.html. When you open it with Firefox, your message should appear as before.

We've now gotten the same result using HTML in two very different ways, so we should be clear about the difference. In the first case we created a very basic static web page using pure HTML. The body of the page says "Hello World!" and nothing else. In the second case, we created a blank HTML page and then ran a short JavaScript program to print "Hello World!" onto that blank page. From the point of view of the person reading the page, they look the same and it may not matter to them how the page was created. From our perspective, however, the difference is crucial, because the second method allows us to embed our JavaScript programs in HTML files which can be viewed in the browser. Anything that can be viewed in the browser can be indexed and annotated with Zotero. This means that you can keep track of the programs that you write and their output using the same system that you use to keep track of the rest of your research.

Back up your work

Once you begin to program, it is crucial that you make backups of your work regularly. Each day before you do any programming, make sure to back up your Zotero database. At the end of a day's work, make another backup of the Zotero database and of any programs that you've written that day. You should back up your whole computer at least weekly, and preferably more frequently.
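
The end-of-day copy of your programs can itself be scripted. The following is a minimal sketch, assuming that your programs live in a single folder; the function name backup_folder and the folder names in the comments are hypothetical, and it does not attempt to back up the Zotero database, whose location depends on your setup.

```python
# backup-programs.py (hypothetical example)
import datetime
import os
import shutil

def backup_folder(source, backup_root):
    # name the copy after the source folder plus today's date,
    # for instance programs-2008-05-01 for a folder called programs
    stamp = datetime.date.today().isoformat()
    name = os.path.basename(os.path.normpath(source)) + '-' + stamp
    target = os.path.join(backup_root, name)
    # copytree raises an error if the target already exists,
    # so an earlier backup from the same day is never overwritten
    shutil.copytree(source, target)
    return target
```

Calling backup_folder('programs', 'backups') would, under these assumptions, copy the programs folder into backups with the current date appended to its name.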

Keep in touch with us

As you work through the examples in this book you will, no doubt, want to apply similar techniques to your own sources. If you come up with a variation or generalization, e-mail us to let us know about it. Likewise, if you run into trouble or can't figure out how to modify one of our programs so it applies to your situation, we'd like to hear from you. We can try to help you get something running, or try to add some new material to The Programming Historian to cover situations like yours.

Other resources

As you're working through the tutorials here, you will want to have a few key resources open in your browser. Until you become familiar with the programming languages that we're using, it is nice to have a few different introductory treatments to look at. There are many good online resources like

As you proceed (or if you already have some programming experience) you'll probably prefer more general references like:

We also like to have a few printed books ready-to-hand, especially

Other references will be cited as we make use of them.

Suggested readings

Some of our readers have expressed an interest in using The Programming Historian for formal or informal coursework. To get a solid foundation in Python programming, it is probably best to pair these exercises with some additional readings. We like Mark Lutz's Learning Python, 3rd ed. Sebastopol, CA: O'Reilly, 2008.