
What is ETL system testing?

This week, let’s talk about ETL system testing and what exactly we test when we perform this type of testing.
But as usual, let’s start from the very beginning and understand what an ETL system is.

What Is an ETL System?

ETL stands for Extract, Transform, Load.
This type of system has a special structure and is usually used when we want to load or extract data in amounts too big to be handled by a regular API. In other words, we are talking about components whose main purpose is to move large amounts of data around and manipulate it along the way based on specific logic. ETL implementations differ: they involve different data sources on different platforms, and each system can be broken down into a set of interfaces that work together to move the data. Despite the differences, we can group these interfaces into two groups based on the direction in which the data flows from the data source's perspective:
  • Inbound interfaces (data is loaded into the data source)
  • Outbound interfaces (data is extracted from the data source)

It is important to understand that an ETL system can include very complex logic and calculations that happen during the data movement process. This is what brings both the value and the complexity of such systems, and it makes the testing effort especially challenging.

What Do We Test in an ETL System?

When we talk about ETL testing, we talk about testing a process that moves data around, either extracting data from a data source or loading data into a data source.
While the specifics of what is tested change with the system and implementation, we can still group all the tests into 3 main areas:
  • Verifying the source data structure and content (file, database, etc.) before load/extract
  • Verifying the target data structure and content (file, database, etc.) after load/extract
  • Verifying the data manipulation logic during load/extract
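To make the three areas more concrete, here is a minimal sketch in Python using an in-memory SQLite database. Everything in it is an assumption made for illustration: the table names `source_orders` and `target_orders`, their columns, and the cents-to-dollars rule standing in for the "data manipulation logic". A real ETL system will have its own schemas and rules.

```python
import sqlite3


def build_demo_db():
    """Create hypothetical source and target tables in an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE source_orders (order_id INTEGER, amount_cents INTEGER)")
    conn.execute("CREATE TABLE target_orders (order_id INTEGER, amount_dollars REAL)")
    conn.executemany(
        "INSERT INTO source_orders VALUES (?, ?)",
        [(1, 1050), (2, 200), (3, 99)],
    )
    return conn


def run_etl(conn):
    """Stand-in for the process under test: load rows, converting cents to dollars."""
    conn.execute(
        "INSERT INTO target_orders (order_id, amount_dollars) "
        "SELECT order_id, amount_cents / 100.0 FROM source_orders"
    )


def check_source(conn):
    """Area 1: source content before the load (no NULL keys, no duplicate keys)."""
    nulls = conn.execute(
        "SELECT COUNT(*) FROM source_orders WHERE order_id IS NULL"
    ).fetchone()[0]
    dupes = conn.execute(
        "SELECT COUNT(*) FROM (SELECT order_id FROM source_orders "
        "GROUP BY order_id HAVING COUNT(*) > 1)"
    ).fetchone()[0]
    return nulls == 0 and dupes == 0


def check_target(conn):
    """Area 2: target content after the load (row counts match the source)."""
    src = conn.execute("SELECT COUNT(*) FROM source_orders").fetchone()[0]
    tgt = conn.execute("SELECT COUNT(*) FROM target_orders").fetchone()[0]
    return src == tgt


def check_logic(conn):
    """Area 3: every source row has a target row holding the expected converted amount."""
    mismatches = conn.execute(
        "SELECT COUNT(*) FROM source_orders s "
        "LEFT JOIN target_orders t ON s.order_id = t.order_id "
        "WHERE t.order_id IS NULL "
        "OR ABS(t.amount_dollars - s.amount_cents / 100.0) > 1e-9"
    ).fetchone()[0]
    return mismatches == 0


conn = build_demo_db()
source_ok = check_source(conn)   # verify before the load
run_etl(conn)
target_ok = check_target(conn)   # verify after the load
logic_ok = check_logic(conn)     # verify the transformation itself
print(source_ok, target_ok, logic_ok)
```

Note that the logic check here recomputes the expected value with the same formula the load uses, which is fine for a sketch; in practice the expected values would more likely come from the requirements or from an independent calculation.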

What Tools Do We Use?

This type of system lacks proper automation tools for automating the testing process, beyond dedicated unit-testing code written for the specifics of each implementation. Hence, a lot of testers find themselves using half-automated, half-manual methods for verification. The most common are SQL tools for data verification (e.g. SQL Developer, SQL Server Management Studio) and powerful file manipulation tools (e.g. Notepad++, Beyond Compare).
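As an illustration of how part of the manual file comparison could be scripted, here is a small Python sketch that compares an expected extract with an actual one, record by record. The function name `diff_csv`, the CSV layout, and the `id` key column are all hypothetical; this is not a replacement for the tools above, just one possible way to automate a repetitive check.

```python
import csv
import io


def diff_csv(expected_text, actual_text, key):
    """Compare two CSV extracts by a key column and report the differences."""
    expected = {row[key]: row for row in csv.DictReader(io.StringIO(expected_text))}
    actual = {row[key]: row for row in csv.DictReader(io.StringIO(actual_text))}
    return {
        # keys present in the expected file but absent from the actual file
        "missing": sorted(expected.keys() - actual.keys()),
        # keys present in the actual file but not expected
        "unexpected": sorted(actual.keys() - expected.keys()),
        # keys present in both files whose rows differ in at least one column
        "changed": sorted(
            k for k in expected.keys() & actual.keys() if expected[k] != actual[k]
        ),
    }


# Hypothetical extracts; in practice these would be read from the real files.
expected_file = "id,name\n1,Ann\n2,Bob\n3,Cy\n"
actual_file = "id,name\n1,Ann\n2,Bobby\n4,Dee\n"

report = diff_csv(expected_file, actual_file, key="id")
print(report)
```

Keying the comparison on a record identifier rather than on line position means that a reordered file does not show up as a wall of differences.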

