Straight Java/Groovy versus ETL tool (Talend/etc) - what libraries would you use?

Assume you have a small project which on the surface looks like a good match for an ETL tool like Talend.

But assume further that you have never used Talend, that you do not trust "visual programming" tools in general, and that you would rather code everything the old-fashioned way (text in a nice IDE!) with the help of an appropriate language and support libraries.

What are some language patterns & support libraries that could help you stay away from the ETL tool temptation/trap?

Answers

It depends on whether the deliverable is the processor or the output itself. If you just need to deliver the output, you don't need to maintain the code. If the code does need to be maintained, will you be maintaining it, or somebody else?

If somebody else needs to maintain it, I'd use Java or give them Talend.

If it's throwaway code, I'd use whatever is easier or more fun to program with.

If you need to maintain it and the processing is complex, I'd use Scala. It has:

  • some libraries to interact with databases
  • XML literals
  • parser combinators
  • interesting features in its collection packages (map, filter, groupBy, partition, ...)
  • and of course any other existing Java libraries.
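
For comparison, the same collection-style operations (map, filter, groupBy, partition) are also available in plain Java through the Stream API. The following is only a minimal sketch under stated assumptions: the `OrderEtl` class, the `Order` type, and the three-field CSV layout are all hypothetical and exist purely for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class OrderEtl {

    // Hypothetical row type standing in for one extracted source record.
    public static final class Order {
        public final String customer;
        public final String region;
        public final long amount;

        public Order(String customer, String region, long amount) {
            this.customer = customer;
            this.region = region;
            this.amount = amount;
        }
    }

    // Extract: parse one line, assuming exactly three comma-separated fields.
    public static Order parse(String line) {
        String[] fields = line.split(",");
        return new Order(fields[0].trim(), fields[1].trim(), Long.parseLong(fields[2].trim()));
    }

    // Transform: map lines to orders, filter out non-positive amounts,
    // then group by region and sum the amounts in each group.
    public static Map<String, Long> totalByRegion(List<String> lines) {
        return lines.stream()
                .map(OrderEtl::parse)
                .filter(order -> order.amount > 0)
                .collect(Collectors.groupingBy(
                        order -> order.region,
                        Collectors.summingLong(order -> order.amount)));
    }

    // Partition: split orders into those at or above a threshold (key true)
    // and those below it (key false).
    public static Map<Boolean, List<Order>> splitByThreshold(List<String> lines, long threshold) {
        return lines.stream()
                .map(OrderEtl::parse)
                .collect(Collectors.partitioningBy(order -> order.amount >= threshold));
    }

    public static void main(String[] args) {
        // Made-up sample rows; a real job would read these from a file or a database.
        List<String> lines = Arrays.asList("alice, EU, 120", "bob, US, 80", "carol, EU, 30", "dave, US, -5");
        System.out.println(totalByRegion(lines));
        System.out.println(splitByThreshold(lines, 100).get(true).size());
    }
}
```

The load step is left out on purpose; in a small project it could be a plain JDBC batch insert or a file writer, depending on the target.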

I used to think that "visual programming" was something for people who can't program. Then I was exposed to Talend in a project, and I realized that this type of tool is exactly right for the job when it comes to moving data from A to B and transforming it in the process. By a more academic label, it's component-oriented software design.

I still consider myself a decent programmer who can do anything, and then some, with a text editor and a shell prompt. But I've become a big fan of Talend as well.

Full disclosure: I now work for the company :-)

Check out DataExpress. It's a Scala-based, cross-database ETL toolkit.

I think this is a pretty good match for Rails-inspired frameworks, such as Grails on Groovy or Lift on Scala.

Depending on the size of the DB schema, you could map everything quickly in Hibernate and just use the resulting object model to do your work (depending on what you want the ETL tool for anyway).




