Atom feed for keyword "iterator"

Iterating Over Database Results in C#
Dec 17, 2007


At my job, we use what is essentially a custom C# ORM. Soon after I arrived, we adopted this idiom to pull a DB connection from the pool and run a query:

 class Something
{
    public int ourIdiom(long identifier)
    {
        using (DBConnection db = new DBConnection())
        {
            using (IDataReader rec = db.execQuery(@"
                SELECT some, rows
                FROM  Table1 t1, Table2 t2
                WHERE t1.t2id = t2.id
                AND   t1.id = ?",
                identifier))
            {
                while (rec.Read())
                    doSomething();
                return 1;
            }
        }
    }
}

Which is nice because it disposes of both the connection and the dataReader properly on failure, but ugly because it uses a whole bunch of boilerplate. Unfortunately, because doSomething() needs to execute inside both using statements, and we weren't using C# 2.0 until a few months ago, the only way to avoid the boilerplate would have been to pass in delegates.

This would have been just as ugly as keeping the boilerplate in the code, so we stuck with the using2 idiom.

Since our switchover to 2.0 a few months ago, I've had an idea that I could make this idiom more concise and remove some boilerplate that I didn't get a chance to try until this week. Using 2.0's iterators, which seem awfully familiar from my python work, that can be reduced to:

class Something
{
    public int ourIdiom(long identifier)
    {
        foreach (IDataReader rec in new SQL().Query(@"
            SELECT some, rows
            FROM  Table1 t1, Table2 t2
            WHERE t1.t2id = t2.id
            AND   t1.id = ?",
            identifier))
        {
            doSomething();
        }
        return 1;
    }
}

With some fairly simple code:

public class SQL : 
    System.Collections.Generic.IEnumerable<IDataReader>
    {
        string m_sql;
        ArrayList m_sqlparams;

        public SQL()
        {
        }

        public SQL query(string sql, params object[] sqlparams)
        {
            m_sql = sql;
            m_sqlparams = new ArrayList(sqlparams);
            return this;
        }

        IEnumerator<IDataReader> IEnumerable<IDataReader>.GetEnumerator()
        {
            using (clsDBConnection db = new clsDBConnection(m_dbname))
            {
                using (IDataReader rec = db.execQuery(m_sql, m_sqlparams))
                {
                    while (rec.Read())
                        yield return rec;
                }
            }
        }

        //this is required because IEnumerable<> inherits IEnumerable. bleh.
        IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator() }
    }
}

This code should come in handy for more reasons than just its cleanliness. First off, if you ever need to change the idiom for accessing the DB, it's stored conveniently in one place.

Second, It makes a prepared SQL call into something like a thunk: a bit of code representing a future computation that can then be passed around to be performed later. An example:

void setUpQueries(nameId, addressId, custId)
{
    List<SQL> stringsWeNeed = new List(new object[]{
        new SQL().query("SELECT name FROM names WHERE id=?", nameId),
        new SQL().query("SELECT addy FROM addresses WHERE id=?", addressId),
        new SQL().query("SELECT zip FROM customers WHERE id=?", custId)});

    List<string> strings = getOurStrings(stringsWeNeed);

    mangleStrings(strings);
}

List<string> getOurStrings(List<SQL> queries)
{
    List<string> strings = new List<string>();
    foreach (SQL query in queries)
    {
        foreach (IDataReader rec in query)
            strings.Add(rec.GetString(0));
    }
    return strings;
}

While the example is silly, this property can be exploited to seperate declaration from execution, which I often find to be a useful pattern.

Though I am far from the first person to discover this simplification, I did figure it out on my own and I thought it was neat enough to share.

A quick disclaimer: the code here is modified from the original to remove some details, and is completely untested, and probably doesn't compile. You may, however, use it or distribute it as you see fit; it's licensed under the wtfpl.

UPDATE: added the while(rec.Read()), which I'd forgotten

[# ] database, c#, programming, computer, python, iterator

Reverse File Iterator and Premature Optimization
Oct 03, 2007


This afternoon, I had a need for a reverse file iterator in a script I was writing to process a quite large log file. After some unsuccessful googling followed by some quick hacking, I was left with a frustratingly close, but nonfunctional, iterator. Instead of fixing it while I was at work, I hacked up a solution calling "tail" and left it for a while.

When I came back to finish the job this evening, I figured that I ought to go look at the source of tail for some inspiration. Surely the unix hackers had figured this problem out long ago? In reverse.c, I found the inspiration I needed:

    for (; pos >= start; pos--) {
        /* A seek per char isn't a problem with a smart stdio */
        if (fseeko(fp, pos, SEEK_SET) != 0) {
        //snip
            if ((ch = getc(fp)) == '\n')

I had been reading the file in chunks, splitting the chunk into lines, handling a cache of the lines, and compensating for unfinished lines, all because I had buried deep down in my lizard brain the idea that an fseek per character was "slow". In pure python, for chrissake!

Properly reminded of the fact that premature optimization can sneak up anywhere, I ended up with this code:

import os

class reversefile(object):
    """Iterate backwards through a file. f should be an open file handle"""
    def __init__(self, f):
        self._f = f
        self.end = os.stat(f.name).st_size

    def __iter__(self): return self

    def next(self):
        if self.end == 0:
            raise StopIteration

        pos = self.end-2
        while pos >= 0:
            self._f.seek(pos)
            if self._f.read(1) == '\n':
                end = self.end
                self.end = pos
                return self._f.read(end - pos - 1)
            pos -= 1

        end = self.end
        self.end = 0
        self._f.seek(0)
        return self._f.read(end).strip("\n")

You can see the whole thing, with source and some very brief tests I hacked up, here.

update:

Reddit user Brian posts a solution that goes much faster on his platform (I've done no performance testing of my code, it wasn't important for me) and handles platform linebreaks better. In other words, it's better for all purposes. Thanks Brian, I wish I'd found that when I was googling!

[# ] python, file, iterator, optimization, openbsd

I'm a programmer for a small company in Baltimore, Maryland, USA. Besides programming, I play competitive ultimate. I blog at irregular intervals about various programming topics, but mostly Python.