Tuesday, August 4, 2015
How to move data out of Informatica File Archive to Hadoop using sqoop
Although it seems hard and complicated at first, once you discover the correct connection string and driver name, getting data out of Informatica File Archive with Sqoop is pretty simple and straightforward.
Unfortunately, due to the way the archive is structured, there is no way to specify a schema name other than pulling the data out with the --query option:
sqoop import \
  --driver com.informatica.fas.jdbc.Driver \
  --connect jdbc:infafas://<server_name>:<port>/<database_name> \
  --username xxxx --password xxxxx \
  -m 1 \
  --delete-target-dir --target-dir <target_dir> \
  --query "SELECT * FROM <schema_name>.<table_name> WHERE \$CONDITIONS" \
  --fields-terminated-by '|' --lines-terminated-by '\n' \
  --hive-drop-import-delims

(The default port is 8500.)
You have to keep the "WHERE \$CONDITIONS" clause even if you do not add any conditions of your own; Sqoop requires it as a placeholder it can substitute into.
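If you later want to speed the import up with more than one mapper, Sqoop fills \$CONDITIONS with a range predicate per split, but free-form queries then also require --split-by. A sketch, where <numeric_column> is a placeholder for any evenly distributed numeric column in your table:

sqoop import \
  --driver com.informatica.fas.jdbc.Driver \
  --connect jdbc:infafas://<server_name>:<port>/<database_name> \
  --username xxxx --password xxxxx \
  -m 4 --split-by <numeric_column> \
  --delete-target-dir --target-dir <target_dir> \
  --query "SELECT * FROM <schema_name>.<table_name> WHERE \$CONDITIONS"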
You also need to copy "infafas.jar" into Sqoop's shared library directory so the driver class can be found.
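The exact location of that directory varies by distribution, so the path below is an assumption to adjust for your install. As an alternative, you can ship the jar per job with Hadoop's generic -libjars option:

# Assumed Sqoop lib directory; adjust for your distribution.
cp infafas.jar /usr/lib/sqoop/lib/

# Or pass the jar for a single job (generic options go right after "import"):
sqoop import -libjars /path/to/infafas.jar --driver com.informatica.fas.jdbc.Driver ...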
Please feel free to ask any questions.