Tag Archives: ETL

Remove CSV text qualifiers within a field using Python
Have you ever tried to import a CSV into a database and found that it won’t load because one of your CSV fields has a text qualifier inside the field itself? In this post I look at how to resolve this issue using a few lines of Python code.
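As a rough illustration of the idea (a minimal sketch, not necessarily the approach taken in the full post), the snippet below assumes a comma-delimited file that uses the double quote as its text qualifier. It treats a quote as legitimate only when it touches a field boundary, and removes any other quote. The names `EMBEDDED_QUOTE` and `strip_embedded_qualifiers` are hypothetical. An embedded quote that sits directly next to a comma is ambiguous and will not be caught, and correctly escaped doubled quotes inside a field are stripped as well.

```python
import re

# Assumption: comma delimiter, double quote as the text qualifier.
# Matches a quote that is NOT at the start of a field (line start or
# right after a comma) and NOT at the end of a field (right before a
# comma or the end of the line).
EMBEDDED_QUOTE = re.compile(r'(?<!^)(?<!,)"(?!,|$)')


def strip_embedded_qualifiers(line: str) -> str:
    """Remove qualifiers inside a field, keeping the ones that wrap it."""
    return EMBEDDED_QUOTE.sub("", line)


def clean_lines(lines):
    """Apply the fix to each line after dropping its line ending."""
    return [strip_embedded_qualifiers(line.rstrip("\r\n")) for line in lines]


if __name__ == "__main__":
    print(strip_embedded_qualifiers('1,"John "Johnny" Smith",London'))
```

Working line by line like this assumes no field contains an embedded line break; files with multi-line fields would need a real parser instead.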

My Top 10 ETL Best Practices
It’s been over ten years since I coded my first ETL routine. Since then I’ve collected a number of important lessons and best practices, presented here as my very own ‘Ten Commandments’ of ETL. 1. Know your data: Whether you’re defining an ETL strategy, designing a set of data flows or writing the code, the single […]

BI 101: What is a data warehouse?
It is fair to say that the data warehouse is the foundation stone upon which nearly every BI solution is built. So what is it, and what makes it different from a conventional database? The first thing to stress is that a data warehouse is still a type of database. What makes it different is […]