Resources to learn about engineering aspects of data analytics (OLAP, warehousing, ETL, etc.)

I'm a math/stats guy, interested in learning more about the engineering aspects of "data analytics" (probably an overly broad term, but this is definitely a case of "I don't know what I don't know", so I'm not sure how to be more specific).

I'm fine with manipulating and analyzing the data once it's already stored somewhere and I can access it, and I'm fine with writing scripts and SQL queries (and have a general knowledge of things like normalization). What I don't know is the whole engineering process of capturing and storing the data. For example, terms I've heard thrown about that I only vaguely understand the meaning of include:

  • OLAP, OLTP
  • Data warehousing
  • ETL
  • ???

What's a good book (or any other resource) to learn about these kinds of things? What are things I should know about database design? (Normalization seems kinda "obvious" to me, something I would have done even before I knew the term -- is there anything else?)

In other words, for jobs falling under the umbrella term of "analytics engineer", what kinds of things should I know and what's a good way to learn about them?

Answers

You might start with the Ralph Kimball books - these seem to be the starting point for most people.
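
To give a rough flavor of what those books cover, here is a minimal sketch of an ETL step that loads a small star schema (the dimensional-modeling style Kimball is known for) and then runs an OLAP-style aggregate over it. Everything in it is an assumption made up for illustration: the sample rows, the table names (`dim_customer`, `dim_product`, `fact_sales`), and the helper functions. A real warehouse would use a proper date dimension and a dedicated engine rather than in-memory SQLite.

```python
import sqlite3

# Hypothetical rows as they might come out of an OLTP system:
# (order_id, customer, product, order_date, amount)
RAW_ORDERS = [
    (1, "alice", "widget", "2024-01-05", 20.0),
    (2, "bob", "widget", "2024-01-07", 20.0),
    (3, "alice", "gadget", "2024-02-01", 35.5),
]


def run_etl(conn, rows):
    """Extract raw rows, transform names into surrogate keys, load a fact table."""
    cur = conn.cursor()
    cur.executescript(
        """
        CREATE TABLE dim_customer (customer_key INTEGER PRIMARY KEY, name TEXT UNIQUE);
        CREATE TABLE dim_product  (product_key  INTEGER PRIMARY KEY, name TEXT UNIQUE);
        CREATE TABLE fact_sales (
            order_id     INTEGER PRIMARY KEY,
            customer_key INTEGER REFERENCES dim_customer(customer_key),
            product_key  INTEGER REFERENCES dim_product(product_key),
            order_month  TEXT,
            amount       REAL
        );
        """
    )
    for order_id, customer, product, order_date, amount in rows:
        # Load each dimension value once; repeats are ignored via the UNIQUE constraint.
        cur.execute("INSERT OR IGNORE INTO dim_customer (name) VALUES (?)", (customer,))
        cur.execute("INSERT OR IGNORE INTO dim_product (name) VALUES (?)", (product,))
        customer_key = cur.execute(
            "SELECT customer_key FROM dim_customer WHERE name = ?", (customer,)
        ).fetchone()[0]
        product_key = cur.execute(
            "SELECT product_key FROM dim_product WHERE name = ?", (product,)
        ).fetchone()[0]
        # Transform: keep only the YYYY-MM part of the date as a coarse time attribute.
        cur.execute(
            "INSERT INTO fact_sales VALUES (?, ?, ?, ?, ?)",
            (order_id, customer_key, product_key, order_date[:7], amount),
        )
    conn.commit()


def sales_by_month_and_product(conn):
    """OLAP-style query: aggregate the fact table grouped by two dimensions."""
    return conn.execute(
        """
        SELECT f.order_month, p.name, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_key = f.product_key
        GROUP BY f.order_month, p.name
        ORDER BY f.order_month, p.name
        """
    ).fetchall()


conn = sqlite3.connect(":memory:")
run_etl(conn, RAW_ORDERS)
report = sales_by_month_and_product(conn)
print(report)
```

The point of the split is that the fact table holds the measurements (amounts) while the dimension tables hold the things you slice by, which is a deliberate departure from the fully normalized design you would use for an OLTP system.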




