Ronald van Loon has written an interesting article on the 9 Skills You Need to Become a Data Modeler. In this post I want to share my thoughts on that article, because I don’t agree with all the statements it makes.
I certainly agree that data modeling is one of the best skills to have in the current information-driven industries, but I would like to add that it hasn’t emerged recently. Data modeling has been around for decades (in fact, more than half a century), but it seems to have been buried under a lot of misconceptions. It is finally, slowly, being recognized again.
Data modeling indeed helps in understanding how the data neurons connect with each other, which is crucial. It doesn’t define per se how the data is generated, nor does it need to be in a computer system. Data modeling mostly determines the definitions, structure and relations of data, but not the processes. It should be based on facts that can and need to be verified.
Stepping into a career as a modeler, you’ll have to work with data analysts and architects to identify key dimensions and facts to support the system requirements of your client or company. […]
As long as dimensions and facts don’t refer to the concepts of dimensional / star modeling as popularized and extensively taught by Ralph Kimball, I agree with this part. It also implicitly contains the number one skill you need as a data modeler: communication skills.
The career path for becoming a data modeler starts with specific education in the data science field […]
I really don’t agree on this point. Data science as we know it today was never part of my education. In fact, I am not really good at some of the underlying mathematical aspects that are involved in data science. Still, there are lots of “colleagues” that acknowledge I am good at data modeling.
In general, my opinion is that the article mixes up a few things and focuses too much on technology. The “definition” of data modeling in the article is too narrow, because data modeling is not just focused on database management systems.
Data modeling serves as a means to complement business modeling and to work towards generating a sufficient database.
Again, I don’t agree that it serves to work towards generating a sufficient database. Data doesn’t need to reside in a database. In my opinion, data modeling serves as a way of communicating, structuring, interpreting and understanding data. That structuring can go down to the implementation level, where databases, file systems or other forms are used to store (and retrieve) the data.
The process for designing a database includes the production of three major schemas: conceptual, logical and physical. […] A Data Definition Language is used to convert these schemas into an active database. A data model that is fully attributed and covers all major aspects includes detailed descriptions for every entity contained within it.
I guess the author meant three major models, not schemas. Schemas are a technical way of separating data structures on particular database implementations. A Data Definition Language (DDL) is generated from a physical data model. The physical model however no longer talks about entities and attributes, but about tables and columns (assuming that we are talking about a relational database management system as the target).
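To make the difference between the levels concrete, here is a minimal sketch in Python. It assumes a logical model expressed as entities and attributes, derives table and column names from them, and renders DDL for a relational target. All names in it (`Entity`, `Attribute`, `TYPE_MAP`, `physical_name`, `to_ddl`), the naming convention, the type mapping and the blanket `NOT NULL` are assumptions made for illustration only. Real modeling tools apply far richer rules at this step.

```python
from dataclasses import dataclass, field


@dataclass
class Attribute:
    """Logical-level attribute: a business name and an abstract data type."""
    name: str
    datatype: str  # abstract type such as "text", "integer" or "date"
    identifying: bool = False


@dataclass
class Entity:
    """Logical-level entity, described in business terms."""
    name: str
    attributes: list[Attribute] = field(default_factory=list)


# Hypothetical type mapping for one relational target; every RDBMS differs.
TYPE_MAP = {"text": "VARCHAR(255)", "integer": "INTEGER", "date": "DATE"}


def physical_name(logical_name: str) -> str:
    """Turn a business name into a table/column name (assumed convention)."""
    return logical_name.strip().lower().replace(" ", "_")


def to_ddl(entity: Entity) -> str:
    """Derive a table with columns from an entity and render it as DDL."""
    # Assumption for this sketch: every column is mandatory (NOT NULL).
    columns = [
        f"    {physical_name(a.name)} {TYPE_MAP[a.datatype]} NOT NULL"
        for a in entity.attributes
    ]
    key = [physical_name(a.name) for a in entity.attributes if a.identifying]
    if key:
        columns.append(f"    PRIMARY KEY ({', '.join(key)})")
    body = ",\n".join(columns)
    return f"CREATE TABLE {physical_name(entity.name)} (\n{body}\n);"


# The logical model talks about an entity with attributes ...
customer = Entity(
    "Customer",
    [
        Attribute("Customer Number", "integer", identifying=True),
        Attribute("Full Name", "text"),
        Attribute("Date Of Birth", "date"),
    ],
)

# ... while the generated DDL talks about a table with columns.
ddl = to_ddl(customer)
print(ddl)
```

The point of the sketch is only that the DDL is an output of the physical level: the entity “Customer” with its attributes stays technology-neutral, and the target-specific decisions (names, data types, keys) are made in the translation step.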
This part is basically the essence of the article and exactly the part that I think is the ugliest of all.
You must exhibit the following skills before pursuing a career in data modeling:
1. Digital logic
2. Computer architecture and organization
3. Data representation
4. Memory architecture
5. Familiarity with numerous modeling tools that are currently in place within organizations
6. Directions in computing
7. SQL language and its implementing
8. Exemplary communication skills that will help you in making your way around organizations with an intricate hierarchy
9. Sufficient experience using Teradata or Oracle database systems
The most important skill you need to possess is number 8. Steve Hoberman ranks communication skills as the number one skill, and even as the ones that follow it.
Skills one to four on the list are irrelevant until you get to physical data modeling, and even then they are questionable to a certain degree. Skill number 5 is something you can learn on the job. And beware: lots of data modeling tools in the form of software tend to focus only on certain aspects of data modeling, not all of them. A data model can be as simple as a set of Post-its on a whiteboard with lines between them. In fact, that is most likely the data model that is best understood by people without the technical background that most of the listed skills refer to. Skill number 7 comes in handy at some point, when the physical data model is implemented on a platform that can actually handle SQL. What about graph databases that don’t support it?
I can’t comment on skill number 6 as I don’t understand what is meant by it.
And skill number 9… well, sorry Microsoft and all other database vendors. Seems like you have all just been wiped out of business…
Training and certification
Getting sufficient data modeling training and staying up-to-date with the evolution of the industry is indeed very important.
Certifications are crucial when it comes to data modeling in the formal setting. Companies agree it’s important for their data modelers to obtain reputable certifications that prove their expertise and also enhances their skills. These certifications include Big Data and Data Science courses, Big Data Architect Master’s Programs, Big Data Hadoop Training, and Data Science with R, among others.
I really fail to see the importance of these certifications for data modeling. I’m pretty sure they have their value, but in an entirely different area.
- If the DAMA organization doesn’t want me to include the picture of the DAMA Wheel in this post, please let me know and I will remove it.