2008-11-22

Two interesting add-ins for MS Office

If you use Microsoft Office, there are a couple of add-ins that I find interesting.
Until next time, may you enjoy adding things.

2008-04-16

Hyper/Net: MDSoC support for .NET

Earlier this year, I attended the MSc thesis defense of Tiago Dias, entitled "Hyper/Net: MDSoC support for .NET". The thesis explored Multi-Dimensional Separation of Concerns (MDSoC) for the .NET platform, specifically using the C# programming language. The idea is based on Hyper/J, a research project abandoned by IBM a couple of years ago. While this may seem like a bad start, Tiago's presentation was very good and convincing. Essentially, the MDSoC support is built on .NET's partial classes and some generative programming techniques. This work led to a research paper, with the same name as the dissertation, published in 2006. Also, some of the dissertation's chapters were removed, as they belong in a possible future PhD thesis. These chapters dealt with the cognitive aspects of multidimensional separation of concerns.

Until next time, may you enjoy separating concerns, one way or the other.

2008-02-01

SQL Server BCP

A couple of weeks ago I was writing a small application to perform some ETL (Extract, Transform, Load) operations. While researching an efficient way to execute a large number of SQL insert statements, I found out about the BCP (Bulk CoPy) utility for MS SQL Server. This tool performs efficient data insertion (on the order of thousands of row inserts per second), using CSV (Comma Separated Values) files as the source. The mapping between file fields and table columns is described in a format file, written either in a semi-structured text format or in XML (the latter is strongly recommended). Basically, the trick is that, unlike with regular inserts, database constraints (primary keys, foreign keys, unique constraints, null values, etc.) are not enforced for each row, but only at the end of the load, or once per batch.
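To make the invocation a bit more concrete, here is a minimal Python sketch that assembles the command line for a bcp import. The function name, table name, file names and batch size are made up for illustration; the switches used (`in`, `-S`, `-T`, `-f`, `-c`, `-t`, `-b`) are the standard bcp ones as far as I know, but check them against the documentation for your version before relying on this.

```python
from typing import List, Optional


def build_bcp_import_command(
    table: str,
    data_file: str,
    server: str,
    format_file: Optional[str] = None,
    batch_size: int = 5000,
) -> List[str]:
    """Build the argument list for a `bcp <table> in <file>` import.

    Hypothetical helper: it only assembles the arguments, it does not run bcp.
    """
    # -S: server to connect to; -T: trusted (Windows) authentication
    args = ["bcp", table, "in", data_file, "-S", server, "-T"]
    if format_file is not None:
        # -f: format file mapping file fields to table columns
        args += ["-f", format_file]
    else:
        # -c: character data; -t,: use a comma as the field terminator
        args += ["-c", "-t,"]
    # -b: number of rows sent per batch
    args += ["-b", str(batch_size)]
    return args


if __name__ == "__main__":
    cmd = build_bcp_import_command(
        table="StagingDb.dbo.Readings",   # assumed table name
        data_file="readings.csv",         # assumed input file
        server="localhost",
        format_file="readings.xml",       # assumed XML format file
        batch_size=1000,
    )
    print(" ".join(cmd))
    # To actually run it (needs bcp on the PATH and a reachable server):
    # import subprocess
    # subprocess.run(cmd, check=True)
```

Building the arguments as a list, rather than as one string, avoids quoting problems if the command is later passed to `subprocess.run`.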

This was a good solution for my needs at the time: loading a large amount of data from CSV files into a small number of tables, in a database secondary to the system in question. Of course, the mapping is limited: you cannot perform data normalization, only add or skip columns, or change their order. This means that either the input data is already normalized, or you end up with a denormalized database. In sum, BCP does not replace an ETL tool, but provides an interesting complement for the "Load" part.
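Since the mapping cannot normalize anything, one option is to normalize the CSV before handing it to the loader. The sketch below is only an illustration of that idea, not what I actually used: the function name, the column names and the sample data are all invented. It splits a denormalized file into a lookup table and a fact table that refers to it by a generated id, so that each resulting file can be bulk loaded into its own table.

```python
import csv
import io
from typing import Dict, List, Tuple

Row = Dict[str, str]


def split_denormalized(rows: List[Row], key_column: str) -> Tuple[List[Row], List[Row]]:
    """Split rows into a lookup table for `key_column` and a fact table.

    Hypothetical helper: each distinct value of `key_column` gets an id
    (1, 2, ...) in order of first appearance, and the fact rows carry
    `<key_column>_id` instead of the repeated value.
    """
    ids: Dict[str, int] = {}
    facts: List[Row] = []
    for row in rows:
        value = row[key_column]
        if value not in ids:
            ids[value] = len(ids) + 1
        fact = {k: v for k, v in row.items() if k != key_column}
        fact[key_column + "_id"] = str(ids[value])
        facts.append(fact)
    lookup = [{"id": str(i), key_column: v} for v, i in ids.items()]
    return lookup, facts


def to_csv(rows: List[Row]) -> str:
    """Serialize rows to CSV text, with a header taken from the first row."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=list(rows[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


if __name__ == "__main__":
    # Invented sample data standing in for a denormalized input file.
    source = "city,temp\nLisbon,18\nPorto,15\nLisbon,20\n"
    rows = list(csv.DictReader(io.StringIO(source)))
    cities, readings = split_denormalized(rows, "city")
    print(to_csv(cities))    # one file per target table
    print(to_csv(readings))
```

The lookup file has to be loaded before the fact file if the foreign key between the two tables is enforced.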

By the way, a very nice book on the subject is Professional SQL Server 2005 Programming, by Robert Vieira, published by Wrox.



Until next time, may you enjoy bulk copying.