Programming
3 February 2011 1 Comment

Python 2.x: Parsing an HTML Page From a String With html5lib

For Python 2.x there is a well-known library for parsing HTML pages (html5lib). This library requires a File Object as the parsing source, but sometimes the raw HTML of a page is held in a string variable. So how do we read a string through a File Object? Use StringIO!

When you create a StringIO object, you can treat that object exactly like a File Object: writing, seeking and reading with all the standard functions.
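As a minimal sketch of that file-like behaviour (the variable names are only illustrative, and the try/except import is an assumption added so the snippet also runs on Python 3), note that the stream keeps a current position just like a real file:

```python
# Works on Python 2 (StringIO module) and Python 3 (io module).
try:
    from StringIO import StringIO
except ImportError:
    from io import StringIO

stream = StringIO()
stream.write("first line\n")
stream.write("second line\n")

# After writing, the position sits at the end of the buffer,
# so an immediate read() returns an empty string.
at_end = stream.read()

# Rewind to the start, just as you would with a real file.
stream.seek(0)
first = stream.readline()  # reads up to and including the newline
rest = stream.read()       # reads everything that is left
```

Forgetting to rewind after writing is the usual reason a reader appears to receive an empty stream.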

 
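from StringIO import StringIO

data = "A whole bunch of information"

# Create a stream on the string called 'data'.
dataStream = StringIO()
dataStream.write(data)

# Rewind the stream; after write() the position is at the end,
# so a reader would otherwise see no data.
dataStream.seek(0)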

Now you can pass dataStream to any function expecting a File Object!
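To illustrate, here is a small sketch that hands a string-backed stream to two consumers that expect a File Object. The helper count_lines and the sample data are hypothetical and exist only for this example; csv.reader is the standard library function. Passing the string to the StringIO constructor leaves the position at the start, so no rewind is needed.

```python
import csv

# Works on Python 2 (StringIO module) and Python 3 (io module).
try:
    from StringIO import StringIO
except ImportError:
    from io import StringIO


def count_lines(fileobj):
    """Hypothetical helper: count the lines of any file-like object."""
    return sum(1 for _ in fileobj)


data = "name,lang\nhtml5lib,python\nOpenFileDialog,c#\n"

# Our own function only iterates over lines, so a StringIO works fine.
line_total = count_lines(StringIO(data))

# csv.reader also just iterates over its input line by line.
rows = list(csv.reader(StringIO(data)))
```

Neither consumer can tell that the data came from a string rather than a file on disk.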

Combined with html5lib we can parse an HTML page like this:

from html5lib import html5parser, treebuilders
 
treebuilder = treebuilders.getTreeBuilder("simpleTree")
parser = html5parser.HTMLParser(tree=treebuilder)
document = parser.parse(dataStream)

Now the variable document contains the tree representation of the HTML contained in dataStream.…

Tags: file object, html5lib, string, stringio
Programming
26 January 2011 0 Comments

Selecting a File in C# Using the OpenFileDialog Class

This C# code snippet allows you to pick a file using the OpenFileDialog dialog box.

using System.IO;
using System.Text;
using System.Windows.Forms;

// Shows the Open File dialog and returns the selected path,
// or null if the user cancels.
private string SelectTextFile(string initialDirectory)
{
    // Dispose the dialog once we are done with it.
    using (OpenFileDialog dialog = new OpenFileDialog())
    {
        dialog.Filter = "txt files (*.txt)|*.txt|All files (*.*)|*.*";
        dialog.InitialDirectory = initialDirectory;
        dialog.Title = "Select a text file";

        return (dialog.ShowDialog() == DialogResult.OK) ? dialog.FileName : null;
    }
}

Let’s say we have a form with:

  • a Button object named fileSelectButton
  • an OpenFileDialog object named openFileDialog
  • a TextBox object named filePathText

When a file is selected, its path should appear in the text box. To do this, we attach the following handler to the button's Click event.

// Click handler: show the dialog and copy the chosen path into the text box.
private void fileSelectButton_Click(object sender, EventArgs e)
{
    if (openFileDialog.ShowDialog() == DialogResult.OK)
    {
        filePathText.Text = openFileDialog.FileName;
    }
}

Tags: dialogresult.ok, open file, openfiledialog