Working towards becoming a Data Engineer

Today the term Data Engineer is becoming just as popular as Data Scientist. Everyone is looking for one, and everyone wants to learn the skills needed to become one.

I think one of the best ways to become one is to look at the skills and the mindset that a successful Data Engineer needs. A good article that I came across describing the difference between the two roles can be found on O'Reilly.

Personally, I came to be a Data Engineer through an unconventional path. I was first an ETL developer using SAP Business Objects Data Services (BODS). I was involved in a number of ETL projects where I extracted, transformed, and loaded data into a target system. That was my whole job: find the best way to get and transform incoming data, then load it into a target system without data defects.

The tool was GUI driven, so I did not have to rely on coding to do most of those ETL tasks. Today, however, I find that most tools in the marketplace fall somewhere between being accessible only through code (ex: Python, Scala) and being entirely GUI driven (ex: SAP BODS, Informatica). I don't view one as better than the other, but it was during that time that I started learning more about cloud technologies (AWS, Azure, GCP). From there I saw the possibilities open to an ETL developer who migrates their skills to the newer platforms and slowly starts learning to program ETL type tasks.
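
To give a rough idea of what an ETL type task can look like in code rather than in a GUI, here is a minimal sketch in Python using only the standard library. Everything in it is an assumption made for illustration: the sample CSV, the `payments` table, and the function names are hypothetical, and an in-memory SQLite database stands in for a real target system. It is not how BODS or any other tool works internally.

```python
import csv
import io
import sqlite3

# Hypothetical sample data standing in for an extract from a source system.
RAW_CSV = """id,name,amount
1, Alice ,10.50
2,Bob,not_a_number
3,Carol,7
"""


def extract(text):
    """Extract: read CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: trim and cast each row; set aside rows that fail validation."""
    clean, rejected = [], []
    for row in rows:
        try:
            record = (int(row["id"]), row["name"].strip(), float(row["amount"]))
            clean.append(record)
        except (ValueError, KeyError, TypeError, AttributeError):
            # Keep defective rows out of the target instead of loading bad data.
            rejected.append(row)
    return clean, rejected


def load(conn, records):
    """Load: write the clean records into the (hypothetical) payments table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS payments "
        "(id INTEGER PRIMARY KEY, name TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO payments VALUES (?, ?, ?)", records)
    conn.commit()


def run_pipeline(text, conn):
    """Run the three steps in order and report how many rows were loaded and rejected."""
    clean, rejected = transform(extract(text))
    load(conn, clean)
    return len(clean), len(rejected)


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # in-memory database as a stand-in target
    loaded, rejected = run_pipeline(RAW_CSV, conn)
    print(f"loaded={loaded} rejected={rejected}")
```

The row with a non-numeric amount is rejected during the transform step rather than loaded, which is the code equivalent of loading into a target system without data defects.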

That is the purpose of this blog. I am hoping to document some of the work that I have been doing and hopefully help someone who is stuck while trying to get into Data Engineering. I am not an expert by any stretch of the imagination, but I have made enough mistakes to at least show some of the things that work and don't work. I will be adding posts about the things I am working on, including tutorials for others who want to replicate them.

Best of luck and have fun learning.