Sqoop Parquet Snappy test

For the current project I have to choose a compression codec and a storage format for Sqoop imports, so I ran a quick test.


The test imports Oracle data into Hadoop through Sqoop and varies two things:

Compression: Snappy / uncompressed

File format: plain text / Parquet

Two options on each of the two axes give 4 test cases in total.
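The 2 x 2 test matrix can be sketched as follows. This is only an illustration: the flags are the ones that appear in the commands later in this post, the connection options are left out, and the combined Parquet + Snappy command line and its target directory name (data_sn_pq) are assumptions, since that command is not shown here.

```python
from itertools import product

# Flags copied from the commands used in this post.
SNAPPY_FLAGS = ["--compress", "--compression-codec",
                "org.apache.hadoop.io.compress.SnappyCodec"]
PARQUET_FLAGS = ["--as-parquetfile"]


def build_import_args(file_format, compressed, table="HDFS_TEST5"):
    """Build the format/compression part of a sqoop import command.

    Connection options (--connect, --username, ...) are intentionally omitted.
    """
    # Directory naming follows the post's pattern: nc/sn + txt/pq.
    # "data_sn_pq" is a hypothetical name for the fourth case.
    suffix = ("sn" if compressed else "nc") + "_" + \
             ("pq" if file_format == "parquet" else "txt")
    args = ["sqoop", "import",
            f"--target-dir=/dev/test2/data_{suffix}",
            "--table", table]
    if compressed:
        args += SNAPPY_FLAGS
    if file_format == "parquet":
        args += PARQUET_FLAGS
    return args


# text/uncompressed, text/snappy, parquet/uncompressed, parquet/snappy
cases = [build_import_args(fmt, comp)
         for fmt, comp in product(["text", "parquet"], [False, True])]

for case in cases:
    print(" ".join(case))
```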




Brief specs of the test hardware


Host machine:

CPU: 2 x Xeon E5-2620 v3 with Hyper-Threading (24 logical cores in total)

RAM: 256 GB

Storage: PCI-E (NVMe) for the VM OS, SATA for the Hadoop and Oracle data




Guest OS specs


HADOOP ECOSYSTEM


VM node spec

Cores: 16 (4 sockets x 4 cores)

RAM: 16 GB

Layout: 1 name node, 4 data nodes, 1 Ambari utility server



ORACLE


VM node spec

Cores: 8 (4 sockets x 2 cores)

RAM: 8 GB

Layout: 1 single node running Oracle DB 12c Enterprise Edition (single instance)



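Source data description

As the earlier failed test showed, a table filled with random text compressed by 0%. The evidence is that

the total amount transferred was essentially the same with and without compression (481.6055 MB uncompressed vs. 481.6605 MB with Snappy).

The results of that test are included below.


Because of that, I decided to build more typical data, that is, data containing real sentences and numbers, and generated the test data from the comments in the Oracle data dictionary.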


Table structure: an ordinary heap table, not a partitioned table.


CREATE TABLE HDFS_TEST5
(
  TABLE_NAME VARCHAR2(128),
  LVL        NUMBER,
  COMMENTS   VARCHAR(4000),
  REG_DT     VARCHAR2(19),
  SEQ        NUMBER
)


Looking at the bulk-loaded data,

it looks reasonably like real data.


Row count: 11,620,800 rows




Test 1.

========================= Test showing a 0% compression ratio ============================


CASE 1. File format: plain text, uncompressed


Command

sqoop import --target-dir=/dev/test/data_nc_txt --table HDFS_3_SUB -direct --connect jdbc:oracle:thin:@192.168.0.117:1521:ORCL --username flashone --password 1234 --split-by ACOL1


Execution result

[hdfs@amb2 ~]$ sqoop import --target-dir=/dev/test/data_nc_txt --table HDFS_3_SUB -direct --connect jdbc:oracle:thin:@192.168.0.117:1521:ORCL --username flashone --password 1234 --split-by ACOL1

16/08/10 17:46:04 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6.2.4.2.0-258

16/08/10 17:46:04 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.

16/08/10 17:46:04 INFO manager.SqlManager: Using default fetchSize of 1000

SLF4J: Class path contains multiple SLF4J bindings.

SLF4J: Found binding in [jar:file:/usr/hdp/2.4.2.0-258/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]

SLF4J: Found binding in [jar:file:/usr/hdp/2.4.2.0-258/zookeeper/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]

SLF4J: Found binding in [jar:file:/usr/hdp/2.4.2.0-258/accumulo/lib/slf4j-log4j12.jar!/org/slf4j/impl/StaticLoggerBinder.class]

SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.

SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]

16/08/10 17:46:06 INFO oracle.OraOopManagerFactory: 

**************************************************

*** Using Data Connector for Oracle and Hadoop ***

**************************************************

16/08/10 17:46:06 INFO oracle.OraOopManagerFactory: Oracle Database version: Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production

16/08/10 17:46:06 INFO oracle.OraOopManagerFactory: This Oracle database is not a RAC.

16/08/10 17:46:06 INFO tool.CodeGenTool: Beginning code generation

16/08/10 17:46:06 INFO manager.SqlManager: Executing SQL statement: SELECT "ACOL1","ACOL2","ACOL3","ACOL4","ACOL5","ACOL6","ACOL7","ACOL8","ACOL9","ACOL10" FROM HDFS_3_SUB WHERE 0=1

16/08/10 17:46:06 INFO manager.SqlManager: Executing SQL statement: SELECT "ACOL1","ACOL2","ACOL3","ACOL4","ACOL5","ACOL6","ACOL7","ACOL8","ACOL9","ACOL10" FROM "HDFS_3_SUB" WHERE 1=0

16/08/10 17:46:06 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/hdp/2.4.2.0-258/hadoop-mapreduce

Note: /tmp/sqoop-hdfs/compile/7e534716bf2036f166e1b14257055d00/HDFS_3_SUB.java uses or overrides a deprecated API.

Note: Recompile with -Xlint:deprecation for details.

16/08/10 17:46:09 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-hdfs/compile/7e534716bf2036f166e1b14257055d00/HDFS_3_SUB.jar

16/08/10 17:46:09 INFO mapreduce.ImportJobBase: Beginning import of HDFS_3_SUB

16/08/10 17:46:11 INFO manager.SqlManager: Executing SQL statement: SELECT "ACOL1","ACOL2","ACOL3","ACOL4","ACOL5","ACOL6","ACOL7","ACOL8","ACOL9","ACOL10" FROM "HDFS_3_SUB" WHERE 1=0

16/08/10 17:46:12 INFO impl.TimelineClientImpl: Timeline service address: http://amb3.local:8188/ws/v1/timeline/

16/08/10 17:46:12 INFO client.RMProxy: Connecting to ResourceManager at amb3.local/192.168.0.143:8050

16/08/10 17:46:15 WARN oracle.OraOopUtilities: System property java.security.egd is not set to file:///dev/urandom - Oracle connections may time out.

16/08/10 17:46:15 INFO db.DBInputFormat: Using read commited transaction isolation

16/08/10 17:46:15 INFO oracle.OraOopOracleQueries: Session Time Zone set to GMT

16/08/10 17:46:15 INFO oracle.OracleConnectionFactory: Initializing Oracle session with SQL :

begin 

  dbms_application_info.set_module(module_name => 'Data Connector for Oracle and Hadoop', action_name => 'import 20160810174606KST'); 

end;

16/08/10 17:46:15 INFO oracle.OracleConnectionFactory: Initializing Oracle session with SQL : alter session disable parallel query

16/08/10 17:46:15 INFO oracle.OracleConnectionFactory: Initializing Oracle session with SQL : alter session set "_serial_direct_read"=true

16/08/10 17:46:15 INFO oracle.OracleConnectionFactory: Initializing Oracle session with SQL : alter session set tracefile_identifier=oraoop

16/08/10 17:46:17 INFO oracle.OraOopDataDrivenDBInputFormat: The table being imported by sqoop has 72704 blocks that have been divided into 128 chunks which will be processed in 4 splits. The chunks will be allocated to the splits using the method : ROUNDROBIN

16/08/10 17:46:17 INFO mapreduce.JobSubmitter: number of splits:4

16/08/10 17:46:17 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1470815003334_0001

16/08/10 17:46:18 INFO impl.YarnClientImpl: Submitted application application_1470815003334_0001

16/08/10 17:46:18 INFO mapreduce.Job: The url to track the job: http://amb3.local:8088/proxy/application_1470815003334_0001/

16/08/10 17:46:18 INFO mapreduce.Job: Running job: job_1470815003334_0001

16/08/10 17:46:26 INFO mapreduce.Job: Job job_1470815003334_0001 running in uber mode : false

16/08/10 17:46:26 INFO mapreduce.Job:  map 0% reduce 0%

16/08/10 17:46:40 INFO mapreduce.Job:  map 7% reduce 0%

16/08/10 17:46:41 INFO mapreduce.Job:  map 32% reduce 0%

16/08/10 17:46:42 INFO mapreduce.Job:  map 57% reduce 0%

16/08/10 17:46:43 INFO mapreduce.Job:  map 80% reduce 0%

16/08/10 17:46:44 INFO mapreduce.Job:  map 87% reduce 0%

16/08/10 17:46:46 INFO mapreduce.Job:  map 100% reduce 0%

16/08/10 17:46:46 INFO mapreduce.Job: Job job_1470815003334_0001 completed successfully

16/08/10 17:46:46 INFO mapreduce.Job: Counters: 30

File System Counters

FILE: Number of bytes read=0

FILE: Number of bytes written=614156

FILE: Number of read operations=0

FILE: Number of large read operations=0

FILE: Number of write operations=0

HDFS: Number of bytes read=11629

HDFS: Number of bytes written=505000000

HDFS: Number of read operations=16

HDFS: Number of large read operations=0

HDFS: Number of write operations=8

Job Counters 

Launched map tasks=4

Other local map tasks=4

Total time spent by all maps in occupied slots (ms)=116770

Total time spent by all reduces in occupied slots (ms)=0

Total time spent by all map tasks (ms)=58385

Total vcore-seconds taken by all map tasks=58385

Total megabyte-seconds taken by all map tasks=89679360

Map-Reduce Framework

Map input records=500000

Map output records=500000

Input split bytes=11629

Spilled Records=0

Failed Shuffles=0

Merged Map outputs=0

GC time elapsed (ms)=1801

CPU time spent (ms)=75560

Physical memory (bytes) snapshot=1357881344

Virtual memory (bytes) snapshot=13260427264

Total committed heap usage (bytes)=709361664

File Input Format Counters 

Bytes Read=0

File Output Format Counters 

Bytes Written=505000000

16/08/10 17:46:46 INFO mapreduce.ImportJobBase: Transferred 481.6055 MB in 34.0606 seconds (14.1397 MB/sec)

16/08/10 17:46:46 INFO mapreduce.ImportJobBase: Retrieved 500000 records.

[hdfs@amb2 ~]$ 
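As a sanity check on how to read the summary line, the following sketch recomputes the "Transferred" figures from the counters in the log above. It assumes that Sqoop's "MB" means 1024 x 1024 bytes; the logged numbers are consistent with that, but the post itself does not state it.

```python
def to_mb(num_bytes):
    """Convert bytes to megabytes, assuming 1 MB = 1024 * 1024 bytes."""
    return num_bytes / (1024 * 1024)


# Values taken from the log above.
bytes_written = 505_000_000   # "HDFS: Number of bytes written"
elapsed_sec = 34.0606         # from the "Transferred ... in ... seconds" line

transferred_mb = to_mb(bytes_written)
rate_mb_per_sec = transferred_mb / elapsed_sec

print(f"{transferred_mb:.4f} MB in {elapsed_sec} seconds "
      f"({rate_mb_per_sec:.4f} MB/sec)")
```

The result should land very close to the 481.6055 MB and 14.1397 MB/sec that Sqoop reported, which suggests the throughput figure is simply output bytes divided by wall-clock time.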



CASE 2. File format: plain text, compression: Snappy


Command

sqoop import --target-dir=/dev/test/data_sn_txt --compress --compression-codec org.apache.hadoop.io.compress.SnappyCodec --table HDFS_3_SUB -direct --connect jdbc:oracle:thin:@192.168.0.117:1521:ORCL --username flashone --password 1234 --split-by ACOL1


Execution result

[hdfs@amb2 ~]$ sqoop import --target-dir=/dev/test/data_sn_txt --compress --compression-codec org.apache.hadoop.io.compress.SnappyCodec --table HDFS_3_SUB -direct --connect jdbc:oracle:thin:@192.168.0.117:1521:ORCL --username flashone --password 1234 --split-by ACOL1

16/08/10 17:50:13 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6.2.4.2.0-258

16/08/10 17:50:13 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.

16/08/10 17:50:13 INFO manager.SqlManager: Using default fetchSize of 1000

SLF4J: Class path contains multiple SLF4J bindings.

SLF4J: Found binding in [jar:file:/usr/hdp/2.4.2.0-258/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]

SLF4J: Found binding in [jar:file:/usr/hdp/2.4.2.0-258/zookeeper/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]

SLF4J: Found binding in [jar:file:/usr/hdp/2.4.2.0-258/accumulo/lib/slf4j-log4j12.jar!/org/slf4j/impl/StaticLoggerBinder.class]

SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.

SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]

16/08/10 17:50:14 INFO oracle.OraOopManagerFactory: 

**************************************************

*** Using Data Connector for Oracle and Hadoop ***

**************************************************

16/08/10 17:50:14 INFO oracle.OraOopManagerFactory: Oracle Database version: Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production

16/08/10 17:50:14 INFO oracle.OraOopManagerFactory: This Oracle database is not a RAC.

16/08/10 17:50:14 INFO tool.CodeGenTool: Beginning code generation

16/08/10 17:50:14 INFO manager.SqlManager: Executing SQL statement: SELECT "ACOL1","ACOL2","ACOL3","ACOL4","ACOL5","ACOL6","ACOL7","ACOL8","ACOL9","ACOL10" FROM HDFS_3_SUB WHERE 0=1

16/08/10 17:50:14 INFO manager.SqlManager: Executing SQL statement: SELECT "ACOL1","ACOL2","ACOL3","ACOL4","ACOL5","ACOL6","ACOL7","ACOL8","ACOL9","ACOL10" FROM "HDFS_3_SUB" WHERE 1=0

16/08/10 17:50:14 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/hdp/2.4.2.0-258/hadoop-mapreduce

Note: /tmp/sqoop-hdfs/compile/175537b93a775793afb75735d65b176f/HDFS_3_SUB.java uses or overrides a deprecated API.

Note: Recompile with -Xlint:deprecation for details.

16/08/10 17:50:16 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-hdfs/compile/175537b93a775793afb75735d65b176f/HDFS_3_SUB.jar

16/08/10 17:50:16 INFO mapreduce.ImportJobBase: Beginning import of HDFS_3_SUB

16/08/10 17:50:17 INFO manager.SqlManager: Executing SQL statement: SELECT "ACOL1","ACOL2","ACOL3","ACOL4","ACOL5","ACOL6","ACOL7","ACOL8","ACOL9","ACOL10" FROM "HDFS_3_SUB" WHERE 1=0

16/08/10 17:50:18 INFO impl.TimelineClientImpl: Timeline service address: http://amb3.local:8188/ws/v1/timeline/

16/08/10 17:50:18 INFO client.RMProxy: Connecting to ResourceManager at amb3.local/192.168.0.143:8050

16/08/10 17:50:21 WARN oracle.OraOopUtilities: System property java.security.egd is not set to file:///dev/urandom - Oracle connections may time out.

16/08/10 17:50:21 INFO db.DBInputFormat: Using read commited transaction isolation

16/08/10 17:50:21 INFO oracle.OraOopOracleQueries: Session Time Zone set to GMT

16/08/10 17:50:21 INFO oracle.OracleConnectionFactory: Initializing Oracle session with SQL :

begin 

  dbms_application_info.set_module(module_name => 'Data Connector for Oracle and Hadoop', action_name => 'import 20160810175014KST'); 

end;

16/08/10 17:50:21 INFO oracle.OracleConnectionFactory: Initializing Oracle session with SQL : alter session disable parallel query

16/08/10 17:50:21 INFO oracle.OracleConnectionFactory: Initializing Oracle session with SQL : alter session set "_serial_direct_read"=true

16/08/10 17:50:21 INFO oracle.OracleConnectionFactory: Initializing Oracle session with SQL : alter session set tracefile_identifier=oraoop

16/08/10 17:50:22 INFO oracle.OraOopDataDrivenDBInputFormat: The table being imported by sqoop has 72704 blocks that have been divided into 128 chunks which will be processed in 4 splits. The chunks will be allocated to the splits using the method : ROUNDROBIN

16/08/10 17:50:22 INFO mapreduce.JobSubmitter: number of splits:4

16/08/10 17:50:22 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1470815003334_0002

16/08/10 17:50:23 INFO impl.YarnClientImpl: Submitted application application_1470815003334_0002

16/08/10 17:50:23 INFO mapreduce.Job: The url to track the job: http://amb3.local:8088/proxy/application_1470815003334_0002/

16/08/10 17:50:23 INFO mapreduce.Job: Running job: job_1470815003334_0002

16/08/10 17:50:31 INFO mapreduce.Job: Job job_1470815003334_0002 running in uber mode : false

16/08/10 17:50:31 INFO mapreduce.Job:  map 0% reduce 0%

16/08/10 17:50:41 INFO mapreduce.Job:  map 25% reduce 0%

16/08/10 17:50:42 INFO mapreduce.Job:  map 50% reduce 0%

16/08/10 17:50:44 INFO mapreduce.Job:  map 80% reduce 0%

16/08/10 17:50:46 INFO mapreduce.Job:  map 89% reduce 0%

16/08/10 17:50:47 INFO mapreduce.Job:  map 100% reduce 0%

16/08/10 17:50:47 INFO mapreduce.Job: Job job_1470815003334_0002 completed successfully

16/08/10 17:50:47 INFO mapreduce.Job: Counters: 30

File System Counters

FILE: Number of bytes read=0

FILE: Number of bytes written=614136

FILE: Number of read operations=0

FILE: Number of large read operations=0

FILE: Number of write operations=0

HDFS: Number of bytes read=11629

HDFS: Number of bytes written=505057648

HDFS: Number of read operations=16

HDFS: Number of large read operations=0

HDFS: Number of write operations=8

Job Counters 

Launched map tasks=4

Other local map tasks=4

Total time spent by all maps in occupied slots (ms)=86936

Total time spent by all reduces in occupied slots (ms)=0

Total time spent by all map tasks (ms)=43468

Total vcore-seconds taken by all map tasks=43468

Total megabyte-seconds taken by all map tasks=66766848

Map-Reduce Framework

Map input records=500000

Map output records=500000

Input split bytes=11629

Spilled Records=0

Failed Shuffles=0

Merged Map outputs=0

GC time elapsed (ms)=1239

CPU time spent (ms)=67760

Physical memory (bytes) snapshot=1410674688

Virtual memory (bytes) snapshot=13316395008

Total committed heap usage (bytes)=737673216

File Input Format Counters 

Bytes Read=0

File Output Format Counters 

Bytes Written=505057648

16/08/10 17:50:47 INFO mapreduce.ImportJobBase: Transferred 481.6605 MB in 29.1766 seconds (16.5084 MB/sec)

16/08/10 17:50:47 INFO mapreduce.ImportJobBase: Retrieved 500000 records.

[hdfs@amb2 ~]$
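The "0% compression" conclusion can be checked directly from the two "HDFS: Number of bytes written" counters in the logs above. This is a minimal sketch; the function name space_saving is just an illustrative helper, not something from Sqoop or Hadoop.

```python
def space_saving(uncompressed_bytes, compressed_bytes):
    """Fraction of space saved by compression (negative means it grew)."""
    return 1.0 - compressed_bytes / uncompressed_bytes


# "HDFS: Number of bytes written" for the random-text table HDFS_3_SUB.
plain_bytes = 505_000_000    # CASE 1: plain text, uncompressed
snappy_bytes = 505_057_648   # CASE 2: plain text, Snappy

saving = space_saving(plain_bytes, snappy_bytes)
print(f"space saving with Snappy: {saving:.4%}")
print(f"extra bytes written with Snappy: {snappy_bytes - plain_bytes}")
```

With random text the Snappy output is actually slightly larger than the uncompressed output, so the saving is marginally below zero rather than exactly 0%.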





The real tests start below.


========================= Four-case test ============================




CASE 1. Text, uncompressed


Command

sqoop import --target-dir=/dev/test2/data_nc_txt --table HDFS_TEST5 -direct --connect jdbc:oracle:thin:@192.168.0.117:1521:ORCL --username flashone --password 1234 --split-by ACOL1


Result

[hdfs@amb2 ~]$ sqoop import --target-dir=/dev/test2/data_nc_txt --table HDFS_TEST5 -direct --connect jdbc:oracle:thin:@192.168.0.117:1521:ORCL --username flashone --password 1234 --split-by ACOL1

16/08/10 18:09:15 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6.2.4.2.0-258

16/08/10 18:09:15 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.

16/08/10 18:09:15 INFO manager.SqlManager: Using default fetchSize of 1000

SLF4J: Class path contains multiple SLF4J bindings.

SLF4J: Found binding in [jar:file:/usr/hdp/2.4.2.0-258/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]

SLF4J: Found binding in [jar:file:/usr/hdp/2.4.2.0-258/zookeeper/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]

SLF4J: Found binding in [jar:file:/usr/hdp/2.4.2.0-258/accumulo/lib/slf4j-log4j12.jar!/org/slf4j/impl/StaticLoggerBinder.class]

SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.

SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]

16/08/10 18:09:17 INFO oracle.OraOopManagerFactory: 

**************************************************

*** Using Data Connector for Oracle and Hadoop ***

**************************************************

16/08/10 18:09:17 INFO oracle.OraOopManagerFactory: Oracle Database version: Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production

16/08/10 18:09:17 INFO oracle.OraOopManagerFactory: This Oracle database is not a RAC.

16/08/10 18:09:17 INFO tool.CodeGenTool: Beginning code generation

16/08/10 18:09:17 INFO manager.SqlManager: Executing SQL statement: SELECT "TABLE_NAME","LVL","COMMENTS","REG_DT","SEQ" FROM HDFS_TEST5 WHERE 0=1

16/08/10 18:09:17 INFO manager.SqlManager: Executing SQL statement: SELECT "TABLE_NAME","LVL","COMMENTS","REG_DT","SEQ" FROM "HDFS_TEST5" WHERE 1=0

16/08/10 18:09:17 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/hdp/2.4.2.0-258/hadoop-mapreduce

Note: /tmp/sqoop-hdfs/compile/83d82e240522d651aa49f619fb1c723b/HDFS_TEST5.java uses or overrides a deprecated API.

Note: Recompile with -Xlint:deprecation for details.

16/08/10 18:09:20 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-hdfs/compile/83d82e240522d651aa49f619fb1c723b/HDFS_TEST5.jar

16/08/10 18:09:20 INFO mapreduce.ImportJobBase: Beginning import of HDFS_TEST5

16/08/10 18:09:22 INFO manager.SqlManager: Executing SQL statement: SELECT "TABLE_NAME","LVL","COMMENTS","REG_DT","SEQ" FROM "HDFS_TEST5" WHERE 1=0

16/08/10 18:09:23 INFO impl.TimelineClientImpl: Timeline service address: http://amb3.local:8188/ws/v1/timeline/

16/08/10 18:09:23 INFO client.RMProxy: Connecting to ResourceManager at amb3.local/192.168.0.143:8050

16/08/10 18:09:26 WARN oracle.OraOopUtilities: System property java.security.egd is not set to file:///dev/urandom - Oracle connections may time out.

16/08/10 18:09:26 INFO db.DBInputFormat: Using read commited transaction isolation

16/08/10 18:09:26 INFO oracle.OraOopOracleQueries: Session Time Zone set to GMT

16/08/10 18:09:26 INFO oracle.OracleConnectionFactory: Initializing Oracle session with SQL :

begin 

  dbms_application_info.set_module(module_name => 'Data Connector for Oracle and Hadoop', action_name => 'import 20160810180917KST'); 

end;

16/08/10 18:09:26 INFO oracle.OracleConnectionFactory: Initializing Oracle session with SQL : alter session disable parallel query

16/08/10 18:09:26 INFO oracle.OracleConnectionFactory: Initializing Oracle session with SQL : alter session set "_serial_direct_read"=true

16/08/10 18:09:26 INFO oracle.OracleConnectionFactory: Initializing Oracle session with SQL : alter session set tracefile_identifier=oraoop

16/08/10 18:09:27 INFO oracle.OraOopDataDrivenDBInputFormat: The table being imported by sqoop has 180224 blocks that have been divided into 185 chunks which will be processed in 4 splits. The chunks will be allocated to the splits using the method : ROUNDROBIN

16/08/10 18:09:27 INFO mapreduce.JobSubmitter: number of splits:4

16/08/10 18:09:27 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1470815003334_0004

16/08/10 18:09:28 INFO impl.YarnClientImpl: Submitted application application_1470815003334_0004

16/08/10 18:09:28 INFO mapreduce.Job: The url to track the job: http://amb3.local:8088/proxy/application_1470815003334_0004/

16/08/10 18:09:28 INFO mapreduce.Job: Running job: job_1470815003334_0004

16/08/10 18:09:36 INFO mapreduce.Job: Job job_1470815003334_0004 running in uber mode : false

16/08/10 18:09:36 INFO mapreduce.Job:  map 0% reduce 0%

16/08/10 18:09:47 INFO mapreduce.Job:  map 3% reduce 0%

16/08/10 18:09:49 INFO mapreduce.Job:  map 10% reduce 0%

16/08/10 18:09:50 INFO mapreduce.Job:  map 13% reduce 0%

16/08/10 18:09:52 INFO mapreduce.Job:  map 23% reduce 0%

16/08/10 18:09:54 INFO mapreduce.Job:  map 27% reduce 0%

16/08/10 18:09:55 INFO mapreduce.Job:  map 37% reduce 0%

16/08/10 18:09:57 INFO mapreduce.Job:  map 44% reduce 0%

16/08/10 18:09:58 INFO mapreduce.Job:  map 59% reduce 0%

16/08/10 18:10:00 INFO mapreduce.Job:  map 60% reduce 0%

16/08/10 18:10:01 INFO mapreduce.Job:  map 73% reduce 0%

16/08/10 18:10:03 INFO mapreduce.Job:  map 77% reduce 0%

16/08/10 18:10:04 INFO mapreduce.Job:  map 82% reduce 0%

16/08/10 18:10:06 INFO mapreduce.Job:  map 83% reduce 0%

16/08/10 18:10:08 INFO mapreduce.Job:  map 91% reduce 0%

16/08/10 18:10:10 INFO mapreduce.Job:  map 95% reduce 0%

16/08/10 18:10:11 INFO mapreduce.Job:  map 100% reduce 0%

16/08/10 18:10:12 INFO mapreduce.Job: Job job_1470815003334_0004 completed successfully

16/08/10 18:10:12 INFO mapreduce.Job: Counters: 30

File System Counters

FILE: Number of bytes read=0

FILE: Number of bytes written=614016

FILE: Number of read operations=0

FILE: Number of large read operations=0

FILE: Number of write operations=0

HDFS: Number of bytes read=16707

HDFS: Number of bytes written=1422465312

HDFS: Number of read operations=16

HDFS: Number of large read operations=0

HDFS: Number of write operations=8

Job Counters 

Launched map tasks=4

Other local map tasks=4

Total time spent by all maps in occupied slots (ms)=241392

Total time spent by all reduces in occupied slots (ms)=0

Total time spent by all map tasks (ms)=120696

Total vcore-seconds taken by all map tasks=120696

Total megabyte-seconds taken by all map tasks=185389056

Map-Reduce Framework

Map input records=11620800

Map output records=11620800

Input split bytes=16707

Spilled Records=0

Failed Shuffles=0

Merged Map outputs=0

GC time elapsed (ms)=2308

CPU time spent (ms)=178250

Physical memory (bytes) snapshot=1499193344

Virtual memory (bytes) snapshot=13390106624

Total committed heap usage (bytes)=746586112

File Input Format Counters 

Bytes Read=0

File Output Format Counters 

Bytes Written=1422465312

16/08/10 18:10:12 INFO mapreduce.ImportJobBase: Transferred 1.3248 GB in 48.8625 seconds (27.763 MB/sec)

16/08/10 18:10:12 INFO mapreduce.ImportJobBase: Retrieved 11620800 records.

[hdfs@amb2 ~]$ 
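The log above reports that the table's blocks were divided into 185 chunks, processed in 4 splits using the ROUNDROBIN method. The sketch below shows what a plain round-robin assignment of 185 chunks to 4 splits looks like. It illustrates the general technique only; how the Data Connector for Oracle and Hadoop actually implements it is not shown in this post, so treat the details as an assumption.

```python
def round_robin(num_chunks, num_splits):
    """Assign chunk ids 0..num_chunks-1 to splits in round-robin order."""
    splits = [[] for _ in range(num_splits)]
    for chunk_id in range(num_chunks):
        splits[chunk_id % num_splits].append(chunk_id)
    return splits


# Numbers from the log: 185 chunks, 4 splits (one per map task).
splits = round_robin(185, 4)

for i, chunks in enumerate(splits):
    print(f"split {i}: {len(chunks)} chunks, first few = {chunks[:3]}")
```

Because consecutive chunks go to different splits, each map task ends up with a similar amount of work even if the data is unevenly distributed across the table's blocks.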




CASE 2. Text, Snappy compression


Command

sqoop import --target-dir=/dev/test2/data_sn_txt --compress --compression-codec org.apache.hadoop.io.compress.SnappyCodec --table HDFS_TEST5 -direct --connect jdbc:oracle:thin:@192.168.0.117:1521:ORCL --username flashone --password 1234 --split-by ACOL1


Result

[hdfs@amb2 ~]$ sqoop import --target-dir=/dev/test2/data_sn_txt --compress --compression-codec org.apache.hadoop.io.compress.SnappyCodec --table HDFS_TEST5 -direct --connect jdbc:oracle:thin:@192.168.0.117:1521:ORCL --username flashone --password 1234 --split-by ACOL1

16/08/10 18:12:53 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6.2.4.2.0-258

16/08/10 18:12:53 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.

16/08/10 18:12:53 INFO manager.SqlManager: Using default fetchSize of 1000

SLF4J: Class path contains multiple SLF4J bindings.

SLF4J: Found binding in [jar:file:/usr/hdp/2.4.2.0-258/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]

SLF4J: Found binding in [jar:file:/usr/hdp/2.4.2.0-258/zookeeper/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]

SLF4J: Found binding in [jar:file:/usr/hdp/2.4.2.0-258/accumulo/lib/slf4j-log4j12.jar!/org/slf4j/impl/StaticLoggerBinder.class]

SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.

SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]

16/08/10 18:12:54 INFO oracle.OraOopManagerFactory: 

**************************************************

*** Using Data Connector for Oracle and Hadoop ***

**************************************************

16/08/10 18:12:54 INFO oracle.OraOopManagerFactory: Oracle Database version: Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production

16/08/10 18:12:54 INFO oracle.OraOopManagerFactory: This Oracle database is not a RAC.

16/08/10 18:12:54 INFO tool.CodeGenTool: Beginning code generation

16/08/10 18:12:54 INFO manager.SqlManager: Executing SQL statement: SELECT "TABLE_NAME","LVL","COMMENTS","REG_DT","SEQ" FROM HDFS_TEST5 WHERE 0=1

16/08/10 18:12:54 INFO manager.SqlManager: Executing SQL statement: SELECT "TABLE_NAME","LVL","COMMENTS","REG_DT","SEQ" FROM "HDFS_TEST5" WHERE 1=0

16/08/10 18:12:54 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/hdp/2.4.2.0-258/hadoop-mapreduce

Note: /tmp/sqoop-hdfs/compile/a40bc11d71b280f6f6f0be86d8987524/HDFS_TEST5.java uses or overrides a deprecated API.

Note: Recompile with -Xlint:deprecation for details.

16/08/10 18:12:56 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-hdfs/compile/a40bc11d71b280f6f6f0be86d8987524/HDFS_TEST5.jar

16/08/10 18:12:56 INFO mapreduce.ImportJobBase: Beginning import of HDFS_TEST5

16/08/10 18:12:57 INFO manager.SqlManager: Executing SQL statement: SELECT "TABLE_NAME","LVL","COMMENTS","REG_DT","SEQ" FROM "HDFS_TEST5" WHERE 1=0

16/08/10 18:12:58 INFO impl.TimelineClientImpl: Timeline service address: http://amb3.local:8188/ws/v1/timeline/

16/08/10 18:12:58 INFO client.RMProxy: Connecting to ResourceManager at amb3.local/192.168.0.143:8050

16/08/10 18:13:01 WARN oracle.OraOopUtilities: System property java.security.egd is not set to file:///dev/urandom - Oracle connections may time out.

16/08/10 18:13:01 INFO db.DBInputFormat: Using read commited transaction isolation

16/08/10 18:13:01 INFO oracle.OraOopOracleQueries: Session Time Zone set to GMT

16/08/10 18:13:01 INFO oracle.OracleConnectionFactory: Initializing Oracle session with SQL :

begin 

  dbms_application_info.set_module(module_name => 'Data Connector for Oracle and Hadoop', action_name => 'import 20160810181254KST'); 

end;

16/08/10 18:13:01 INFO oracle.OracleConnectionFactory: Initializing Oracle session with SQL : alter session disable parallel query

16/08/10 18:13:01 INFO oracle.OracleConnectionFactory: Initializing Oracle session with SQL : alter session set "_serial_direct_read"=true

16/08/10 18:13:01 INFO oracle.OracleConnectionFactory: Initializing Oracle session with SQL : alter session set tracefile_identifier=oraoop

16/08/10 18:13:02 INFO oracle.OraOopDataDrivenDBInputFormat: The table being imported by sqoop has 180224 blocks that have been divided into 185 chunks which will be processed in 4 splits. The chunks will be allocated to the splits using the method : ROUNDROBIN

16/08/10 18:13:02 INFO mapreduce.JobSubmitter: number of splits:4

16/08/10 18:13:02 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1470815003334_0005

16/08/10 18:13:02 INFO impl.YarnClientImpl: Submitted application application_1470815003334_0005

16/08/10 18:13:02 INFO mapreduce.Job: The url to track the job: http://amb3.local:8088/proxy/application_1470815003334_0005/

16/08/10 18:13:02 INFO mapreduce.Job: Running job: job_1470815003334_0005

16/08/10 18:13:10 INFO mapreduce.Job: Job job_1470815003334_0005 running in uber mode : false

16/08/10 18:13:10 INFO mapreduce.Job:  map 0% reduce 0%

16/08/10 18:13:22 INFO mapreduce.Job:  map 3% reduce 0%

16/08/10 18:13:23 INFO mapreduce.Job:  map 10% reduce 0%

16/08/10 18:13:25 INFO mapreduce.Job:  map 15% reduce 0%

16/08/10 18:13:26 INFO mapreduce.Job:  map 24% reduce 0%

16/08/10 18:13:28 INFO mapreduce.Job:  map 27% reduce 0%

16/08/10 18:13:29 INFO mapreduce.Job:  map 36% reduce 0%

16/08/10 18:13:31 INFO mapreduce.Job:  map 43% reduce 0%

16/08/10 18:13:32 INFO mapreduce.Job:  map 55% reduce 0%

16/08/10 18:13:34 INFO mapreduce.Job:  map 56% reduce 0%

16/08/10 18:13:35 INFO mapreduce.Job:  map 67% reduce 0%

16/08/10 18:13:37 INFO mapreduce.Job:  map 70% reduce 0%

16/08/10 18:13:38 INFO mapreduce.Job:  map 81% reduce 0%

16/08/10 18:13:40 INFO mapreduce.Job:  map 85% reduce 0%

16/08/10 18:13:41 INFO mapreduce.Job:  map 90% reduce 0%

16/08/10 18:13:43 INFO mapreduce.Job:  map 94% reduce 0%

16/08/10 18:13:44 INFO mapreduce.Job:  map 97% reduce 0%

16/08/10 18:13:47 INFO mapreduce.Job:  map 100% reduce 0%

16/08/10 18:13:47 INFO mapreduce.Job: Job job_1470815003334_0005 completed successfully

16/08/10 18:13:47 INFO mapreduce.Job: Counters: 30

File System Counters

FILE: Number of bytes read=0

FILE: Number of bytes written=613996

FILE: Number of read operations=0

FILE: Number of large read operations=0

FILE: Number of write operations=0

HDFS: Number of bytes read=16707

HDFS: Number of bytes written=809177318

HDFS: Number of read operations=16

HDFS: Number of large read operations=0

HDFS: Number of write operations=8

Job Counters 

Launched map tasks=4

Other local map tasks=4

Total time spent by all maps in occupied slots (ms)=244686

Total time spent by all reduces in occupied slots (ms)=0

Total time spent by all map tasks (ms)=122343

Total vcore-seconds taken by all map tasks=122343

Total megabyte-seconds taken by all map tasks=187918848

Map-Reduce Framework

Map input records=11620800

Map output records=11620800

Input split bytes=16707

Spilled Records=0

Failed Shuffles=0

Merged Map outputs=0

GC time elapsed (ms)=2268

CPU time spent (ms)=166060

Physical memory (bytes) snapshot=1489506304

Virtual memory (bytes) snapshot=13366169600

Total committed heap usage (bytes)=789577728

File Input Format Counters 

Bytes Read=0

File Output Format Counters 

Bytes Written=809177318

16/08/10 18:13:47 INFO mapreduce.ImportJobBase: Transferred 771.6916 MB in 49.3399 seconds (15.6403 MB/sec)

16/08/10 18:13:47 INFO mapreduce.ImportJobBase: Retrieved 11620800 records.

[hdfs@amb2 ~]$ 
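When comparing CASE 1 and CASE 2, note that the MB/sec figure in Sqoop's summary appears to be based on the bytes written to HDFS, so a compressed run looks slower even when it takes about the same time. The sketch below puts the two runs side by side using rows per second as a format-independent measure. All numbers come from the two logs above; the Run class is only an illustrative helper.

```python
from dataclasses import dataclass


@dataclass
class Run:
    name: str
    rows: int            # "Retrieved ... records"
    bytes_written: int   # "HDFS: Number of bytes written"
    elapsed_sec: float   # from the "Transferred ..." line

    @property
    def rows_per_sec(self):
        return self.rows / self.elapsed_sec

    @property
    def output_mb_per_sec(self):
        # Assumes 1 MB = 1024 * 1024 bytes, which matches the logged figures.
        return self.bytes_written / (1024 * 1024) / self.elapsed_sec


runs = [
    Run("text, uncompressed", 11_620_800, 1_422_465_312, 48.8625),
    Run("text, snappy", 11_620_800, 809_177_318, 49.3399),
]

for run in runs:
    print(f"{run.name:20s} {run.rows_per_sec:12.0f} rows/sec "
          f"{run.output_mb_per_sec:8.3f} MB/sec")

# Space saved by Snappy on the realistic data set.
saving = 1 - runs[1].bytes_written / runs[0].bytes_written
print(f"space saving with Snappy: {saving:.1%}")
```

On this data Snappy cuts the stored size by roughly 43% while the rows-per-second rate stays within about 1% of the uncompressed run.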





CASE 3. Parquet format, uncompressed


Command

sqoop import --target-dir=/dev/test2/data_nc_pq --table HDFS_TEST5 --as-parquetfile -direct --connect jdbc:oracle:thin:@192.168.0.117:1521:ORCL --username flashone --password 1234 --split-by ACOL1


Result

[hdfs@amb2 ~]$ sqoop import --target-dir=/dev/test2/data_nc_pq --table HDFS_TEST5 --as-parquetfile -direct --connect jdbc:oracle:thin:@192.168.0.117:1521:ORCL --username flashone --password 1234 --split-by ACOL1

16/08/10 18:16:27 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6.2.4.2.0-258

16/08/10 18:16:27 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.

16/08/10 18:16:27 INFO manager.SqlManager: Using default fetchSize of 1000

SLF4J: Class path contains multiple SLF4J bindings.

SLF4J: Found binding in [jar:file:/usr/hdp/2.4.2.0-258/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]

SLF4J: Found binding in [jar:file:/usr/hdp/2.4.2.0-258/zookeeper/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]

SLF4J: Found binding in [jar:file:/usr/hdp/2.4.2.0-258/accumulo/lib/slf4j-log4j12.jar!/org/slf4j/impl/StaticLoggerBinder.class]

SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.

SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]

16/08/10 18:16:28 INFO oracle.OraOopManagerFactory: 

**************************************************

*** Using Data Connector for Oracle and Hadoop ***

**************************************************

16/08/10 18:16:28 INFO oracle.OraOopManagerFactory: Oracle Database version: Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production

16/08/10 18:16:28 INFO oracle.OraOopManagerFactory: This Oracle database is not a RAC.

16/08/10 18:16:28 INFO tool.CodeGenTool: Beginning code generation

16/08/10 18:16:28 INFO tool.CodeGenTool: Will generate java class as codegen_HDFS_TEST5

16/08/10 18:16:28 INFO manager.SqlManager: Executing SQL statement: SELECT "TABLE_NAME","LVL","COMMENTS","REG_DT","SEQ" FROM HDFS_TEST5 WHERE 0=1

16/08/10 18:16:28 INFO manager.SqlManager: Executing SQL statement: SELECT "TABLE_NAME","LVL","COMMENTS","REG_DT","SEQ" FROM "HDFS_TEST5" WHERE 1=0

16/08/10 18:16:28 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/hdp/2.4.2.0-258/hadoop-mapreduce

Note: /tmp/sqoop-hdfs/compile/cc39417a2380979acedad21e69766d18/codegen_HDFS_TEST5.java uses or overrides a deprecated API.

Note: Recompile with -Xlint:deprecation for details.

16/08/10 18:16:31 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-hdfs/compile/cc39417a2380979acedad21e69766d18/codegen_HDFS_TEST5.jar

16/08/10 18:16:31 INFO mapreduce.ImportJobBase: Beginning import of HDFS_TEST5

16/08/10 18:16:31 INFO manager.SqlManager: Executing SQL statement: SELECT "TABLE_NAME","LVL","COMMENTS","REG_DT","SEQ" FROM "HDFS_TEST5" WHERE 1=0

16/08/10 18:16:32 INFO manager.SqlManager: Executing SQL statement: SELECT "TABLE_NAME","LVL","COMMENTS","REG_DT","SEQ" FROM "HDFS_TEST5" WHERE 1=0

16/08/10 18:16:34 INFO impl.TimelineClientImpl: Timeline service address: http://amb3.local:8188/ws/v1/timeline/

16/08/10 18:16:34 INFO client.RMProxy: Connecting to ResourceManager at amb3.local/192.168.0.143:8050

16/08/10 18:16:36 WARN oracle.OraOopUtilities: System property java.security.egd is not set to file:///dev/urandom - Oracle connections may time out.

16/08/10 18:16:36 INFO db.DBInputFormat: Using read commited transaction isolation

16/08/10 18:16:36 INFO oracle.OraOopOracleQueries: Session Time Zone set to GMT

16/08/10 18:16:36 INFO oracle.OracleConnectionFactory: Initializing Oracle session with SQL :

begin 

  dbms_application_info.set_module(module_name => 'Data Connector for Oracle and Hadoop', action_name => 'import 20160810181628KST'); 

end;

16/08/10 18:16:36 INFO oracle.OracleConnectionFactory: Initializing Oracle session with SQL : alter session disable parallel query

16/08/10 18:16:36 INFO oracle.OracleConnectionFactory: Initializing Oracle session with SQL : alter session set "_serial_direct_read"=true

16/08/10 18:16:36 INFO oracle.OracleConnectionFactory: Initializing Oracle session with SQL : alter session set tracefile_identifier=oraoop

16/08/10 18:16:37 INFO oracle.OraOopDataDrivenDBInputFormat: The table being imported by sqoop has 180224 blocks that have been divided into 185 chunks which will be processed in 4 splits. The chunks will be allocated to the splits using the method : ROUNDROBIN

16/08/10 18:16:37 INFO mapreduce.JobSubmitter: number of splits:4

16/08/10 18:16:37 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1470815003334_0006

16/08/10 18:16:38 INFO impl.YarnClientImpl: Submitted application application_1470815003334_0006

16/08/10 18:16:38 INFO mapreduce.Job: The url to track the job: http://amb3.local:8088/proxy/application_1470815003334_0006/

16/08/10 18:16:38 INFO mapreduce.Job: Running job: job_1470815003334_0006

16/08/10 18:16:46 INFO mapreduce.Job: Job job_1470815003334_0006 running in uber mode : false

16/08/10 18:16:46 INFO mapreduce.Job:  map 0% reduce 0%

16/08/10 18:17:10 INFO mapreduce.Job:  map 2% reduce 0%

16/08/10 18:17:22 INFO mapreduce.Job:  map 9% reduce 0%

16/08/10 18:17:25 INFO mapreduce.Job:  map 16% reduce 0%

16/08/10 18:17:28 INFO mapreduce.Job:  map 21% reduce 0%

16/08/10 18:17:31 INFO mapreduce.Job:  map 28% reduce 0%

16/08/10 18:17:34 INFO mapreduce.Job:  map 36% reduce 0%

16/08/10 18:17:37 INFO mapreduce.Job:  map 40% reduce 0%

16/08/10 18:17:40 INFO mapreduce.Job:  map 49% reduce 0%

16/08/10 18:17:43 INFO mapreduce.Job:  map 60% reduce 0%

16/08/10 18:17:46 INFO mapreduce.Job:  map 65% reduce 0%

16/08/10 18:17:49 INFO mapreduce.Job:  map 72% reduce 0%

16/08/10 18:17:52 INFO mapreduce.Job:  map 78% reduce 0%

16/08/10 18:17:55 INFO mapreduce.Job:  map 82% reduce 0%

16/08/10 18:17:58 INFO mapreduce.Job:  map 87% reduce 0%

16/08/10 18:18:01 INFO mapreduce.Job:  map 93% reduce 0%

16/08/10 18:18:03 INFO mapreduce.Job:  map 94% reduce 0%

16/08/10 18:18:04 INFO mapreduce.Job:  map 97% reduce 0%

16/08/10 18:18:07 INFO mapreduce.Job:  map 99% reduce 0%

16/08/10 18:18:10 INFO mapreduce.Job:  map 100% reduce 0%

16/08/10 18:18:10 INFO mapreduce.Job: Job job_1470815003334_0006 completed successfully

16/08/10 18:18:10 INFO mapreduce.Job: Counters: 30

File System Counters

FILE: Number of bytes read=0

FILE: Number of bytes written=618184

FILE: Number of read operations=0

FILE: Number of large read operations=0

FILE: Number of write operations=0

HDFS: Number of bytes read=49495

HDFS: Number of bytes written=456758842

HDFS: Number of read operations=200

HDFS: Number of large read operations=0

HDFS: Number of write operations=36

Job Counters 

Launched map tasks=4

Other local map tasks=4

Total time spent by all maps in occupied slots (ms)=612860

Total time spent by all reduces in occupied slots (ms)=0

Total time spent by all map tasks (ms)=306430

Total vcore-seconds taken by all map tasks=306430

Total megabyte-seconds taken by all map tasks=470676480

Map-Reduce Framework

Map input records=11620800

Map output records=11620800

Input split bytes=16707

Spilled Records=0

Failed Shuffles=0

Merged Map outputs=0

GC time elapsed (ms)=31375

CPU time spent (ms)=363220

Physical memory (bytes) snapshot=2344189952

Virtual memory (bytes) snapshot=13274750976

Total committed heap usage (bytes)=1465909248

File Input Format Counters 

Bytes Read=0

File Output Format Counters 

Bytes Written=0

16/08/10 18:18:10 INFO mapreduce.ImportJobBase: Transferred 435.5992 MB in 97.1124 seconds (4.4855 MB/sec)

16/08/10 18:18:10 INFO mapreduce.ImportJobBase: Retrieved 11620800 records.

[hdfs@amb2 ~]$ 
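
The summary line at the end of the log can be cross-checked against the HDFS counter. The following is a minimal sketch, not part of the original test: it parses two lines copied from the CASE 3 log above and recomputes the transferred size and rate. It assumes that "MB" in the Sqoop summary means 2**20 bytes; the variable names are only illustrative.

```python
import re

# Two lines copied verbatim from the CASE 3 log above.
log = """\
HDFS: Number of bytes written=456758842
16/08/10 18:18:10 INFO mapreduce.ImportJobBase: Transferred 435.5992 MB in 97.1124 seconds (4.4855 MB/sec)
"""

# Pull the raw byte counter and the reported summary values out of the log text.
bytes_written = int(re.search(r"HDFS: Number of bytes written=(\d+)", log).group(1))
summary = re.search(r"Transferred ([\d.]+) MB in ([\d.]+) seconds \(([\d.]+) MB/sec\)", log)
reported_mb = float(summary.group(1))
seconds = float(summary.group(2))
reported_rate = float(summary.group(3))

# Assumption: the summary reports mebibytes (1 MB = 2**20 bytes).
computed_mb = bytes_written / 2**20
computed_rate = computed_mb / seconds

print(f"computed: {computed_mb:.4f} MB at {computed_rate:.4f} MB/sec")
print(f"reported: {reported_mb:.4f} MB at {reported_rate:.4f} MB/sec")
```

If the two lines agree, the "Transferred" figure is simply the HDFS bytes-written counter divided by the elapsed job time, so it measures the size of the stored output rather than the volume read from Oracle.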




CASE 4. File format: Parquet, Snappy compression


Command

sqoop import --target-dir=/dev/test2/data_sn_pq --compress --compression-codec org.apache.hadoop.io.compress.SnappyCodec --table HDFS_TEST5 -direct --as-parquetfile --connect jdbc:oracle:thin:@192.168.0.117:1521:ORCL --username flashone --password 1234 --split-by ACOL1


Execution result

[hdfs@amb2 ~]$ sqoop import --target-dir=/dev/test2/data_sn_pq --compress --compression-codec org.apache.hadoop.io.compress.SnappyCodec --table HDFS_TEST5 -direct --as-parquetfile --connect jdbc:oracle:thin:@192.168.0.117:1521:ORCL --username flashone --password 1234 --split-by ACOL1

16/08/10 18:21:21 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6.2.4.2.0-258

16/08/10 18:21:21 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.

16/08/10 18:21:21 INFO manager.SqlManager: Using default fetchSize of 1000

SLF4J: Class path contains multiple SLF4J bindings.

SLF4J: Found binding in [jar:file:/usr/hdp/2.4.2.0-258/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]

SLF4J: Found binding in [jar:file:/usr/hdp/2.4.2.0-258/zookeeper/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]

SLF4J: Found binding in [jar:file:/usr/hdp/2.4.2.0-258/accumulo/lib/slf4j-log4j12.jar!/org/slf4j/impl/StaticLoggerBinder.class]

SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.

SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]

16/08/10 18:21:22 INFO oracle.OraOopManagerFactory: 

**************************************************

*** Using Data Connector for Oracle and Hadoop ***

**************************************************

16/08/10 18:21:22 INFO oracle.OraOopManagerFactory: Oracle Database version: Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production

16/08/10 18:21:22 INFO oracle.OraOopManagerFactory: This Oracle database is not a RAC.

16/08/10 18:21:22 INFO tool.CodeGenTool: Beginning code generation

16/08/10 18:21:22 INFO tool.CodeGenTool: Will generate java class as codegen_HDFS_TEST5

16/08/10 18:21:22 INFO manager.SqlManager: Executing SQL statement: SELECT "TABLE_NAME","LVL","COMMENTS","REG_DT","SEQ" FROM HDFS_TEST5 WHERE 0=1

16/08/10 18:21:22 INFO manager.SqlManager: Executing SQL statement: SELECT "TABLE_NAME","LVL","COMMENTS","REG_DT","SEQ" FROM "HDFS_TEST5" WHERE 1=0

16/08/10 18:21:22 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/hdp/2.4.2.0-258/hadoop-mapreduce

Note: /tmp/sqoop-hdfs/compile/ffc7dd24a3b45d0a3b7dad6697d1826d/codegen_HDFS_TEST5.java uses or overrides a deprecated API.

Note: Recompile with -Xlint:deprecation for details.

16/08/10 18:21:24 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-hdfs/compile/ffc7dd24a3b45d0a3b7dad6697d1826d/codegen_HDFS_TEST5.jar

16/08/10 18:21:24 INFO mapreduce.ImportJobBase: Beginning import of HDFS_TEST5

16/08/10 18:21:25 INFO manager.SqlManager: Executing SQL statement: SELECT "TABLE_NAME","LVL","COMMENTS","REG_DT","SEQ" FROM "HDFS_TEST5" WHERE 1=0

16/08/10 18:21:27 INFO manager.SqlManager: Executing SQL statement: SELECT "TABLE_NAME","LVL","COMMENTS","REG_DT","SEQ" FROM "HDFS_TEST5" WHERE 1=0

16/08/10 18:21:29 INFO impl.TimelineClientImpl: Timeline service address: http://amb3.local:8188/ws/v1/timeline/

16/08/10 18:21:29 INFO client.RMProxy: Connecting to ResourceManager at amb3.local/192.168.0.143:8050

16/08/10 18:21:32 WARN oracle.OraOopUtilities: System property java.security.egd is not set to file:///dev/urandom - Oracle connections may time out.

16/08/10 18:21:32 INFO db.DBInputFormat: Using read commited transaction isolation

16/08/10 18:21:32 INFO oracle.OraOopOracleQueries: Session Time Zone set to GMT

16/08/10 18:21:32 INFO oracle.OracleConnectionFactory: Initializing Oracle session with SQL :

begin 

  dbms_application_info.set_module(module_name => 'Data Connector for Oracle and Hadoop', action_name => 'import 20160810182122KST'); 

end;

16/08/10 18:21:32 INFO oracle.OracleConnectionFactory: Initializing Oracle session with SQL : alter session disable parallel query

16/08/10 18:21:32 INFO oracle.OracleConnectionFactory: Initializing Oracle session with SQL : alter session set "_serial_direct_read"=true

16/08/10 18:21:32 INFO oracle.OracleConnectionFactory: Initializing Oracle session with SQL : alter session set tracefile_identifier=oraoop

16/08/10 18:21:32 INFO oracle.OraOopDataDrivenDBInputFormat: The table being imported by sqoop has 180224 blocks that have been divided into 185 chunks which will be processed in 4 splits. The chunks will be allocated to the splits using the method : ROUNDROBIN

16/08/10 18:21:33 INFO mapreduce.JobSubmitter: number of splits:4

16/08/10 18:21:33 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1470815003334_0007

16/08/10 18:21:33 INFO impl.YarnClientImpl: Submitted application application_1470815003334_0007

16/08/10 18:21:33 INFO mapreduce.Job: The url to track the job: http://amb3.local:8088/proxy/application_1470815003334_0007/

16/08/10 18:21:33 INFO mapreduce.Job: Running job: job_1470815003334_0007

16/08/10 18:21:41 INFO mapreduce.Job: Job job_1470815003334_0007 running in uber mode : false

16/08/10 18:21:41 INFO mapreduce.Job:  map 0% reduce 0%

16/08/10 18:21:57 INFO mapreduce.Job:  map 2% reduce 0%

16/08/10 18:21:58 INFO mapreduce.Job:  map 8% reduce 0%

16/08/10 18:22:00 INFO mapreduce.Job:  map 9% reduce 0%

16/08/10 18:22:01 INFO mapreduce.Job:  map 14% reduce 0%

16/08/10 18:22:03 INFO mapreduce.Job:  map 15% reduce 0%

16/08/10 18:22:04 INFO mapreduce.Job:  map 20% reduce 0%

16/08/10 18:22:06 INFO mapreduce.Job:  map 22% reduce 0%

16/08/10 18:22:07 INFO mapreduce.Job:  map 27% reduce 0%

16/08/10 18:22:09 INFO mapreduce.Job:  map 28% reduce 0%

16/08/10 18:22:10 INFO mapreduce.Job:  map 33% reduce 0%

16/08/10 18:22:12 INFO mapreduce.Job:  map 34% reduce 0%

16/08/10 18:22:13 INFO mapreduce.Job:  map 40% reduce 0%

16/08/10 18:22:15 INFO mapreduce.Job:  map 44% reduce 0%

16/08/10 18:22:16 INFO mapreduce.Job:  map 48% reduce 0%

16/08/10 18:22:19 INFO mapreduce.Job:  map 54% reduce 0%

16/08/10 18:22:20 INFO mapreduce.Job:  map 59% reduce 0%

16/08/10 18:22:22 INFO mapreduce.Job:  map 65% reduce 0%

16/08/10 18:22:24 INFO mapreduce.Job:  map 71% reduce 0%

16/08/10 18:22:25 INFO mapreduce.Job:  map 76% reduce 0%

16/08/10 18:22:26 INFO mapreduce.Job:  map 77% reduce 0%

16/08/10 18:22:29 INFO mapreduce.Job:  map 78% reduce 0%

16/08/10 18:22:32 INFO mapreduce.Job:  map 81% reduce 0%

16/08/10 18:22:33 INFO mapreduce.Job:  map 82% reduce 0%

16/08/10 18:22:34 INFO mapreduce.Job:  map 83% reduce 0%

16/08/10 18:22:35 INFO mapreduce.Job:  map 86% reduce 0%

16/08/10 18:22:36 INFO mapreduce.Job:  map 87% reduce 0%

16/08/10 18:22:38 INFO mapreduce.Job:  map 90% reduce 0%

16/08/10 18:22:39 INFO mapreduce.Job:  map 92% reduce 0%

16/08/10 18:22:40 INFO mapreduce.Job:  map 93% reduce 0%

16/08/10 18:22:41 INFO mapreduce.Job:  map 96% reduce 0%

16/08/10 18:22:42 INFO mapreduce.Job:  map 97% reduce 0%

16/08/10 18:22:44 INFO mapreduce.Job:  map 100% reduce 0%

16/08/10 18:22:46 INFO mapreduce.Job: Job job_1470815003334_0007 completed successfully

16/08/10 18:22:46 INFO mapreduce.Job: Counters: 30

File System Counters

FILE: Number of bytes read=0

FILE: Number of bytes written=618704

FILE: Number of read operations=0

FILE: Number of large read operations=0

FILE: Number of write operations=0

HDFS: Number of bytes read=49495

HDFS: Number of bytes written=456758940

HDFS: Number of read operations=200

HDFS: Number of large read operations=0

HDFS: Number of write operations=36

Job Counters 

Launched map tasks=4

Other local map tasks=4

Total time spent by all maps in occupied slots (ms)=466256

Total time spent by all reduces in occupied slots (ms)=0

Total time spent by all map tasks (ms)=233128

Total vcore-seconds taken by all map tasks=233128

Total megabyte-seconds taken by all map tasks=358084608

Map-Reduce Framework

Map input records=11620800

Map output records=11620800

Input split bytes=16707

Spilled Records=0

Failed Shuffles=0

Merged Map outputs=0

GC time elapsed (ms)=9158

CPU time spent (ms)=325900

Physical memory (bytes) snapshot=2222469120

Virtual memory (bytes) snapshot=13310775296

Total committed heap usage (bytes)=1477443584

File Input Format Counters 

Bytes Read=0

File Output Format Counters 

Bytes Written=0

16/08/10 18:22:46 INFO mapreduce.ImportJobBase: Transferred 435.5993 MB in 78.3224 seconds (5.5616 MB/sec)

16/08/10 18:22:46 INFO mapreduce.ImportJobBase: Retrieved 11620800 records.

[hdfs@amb2 ~]$ 
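
Comparing the counters of the two Parquet runs makes the effect of the compression options easier to see. The sketch below uses only values copied from the two logs above (HDFS bytes written and total map-task time); the dictionary layout and names are illustrative. The logs do not say why the output sizes are so close; checking the codec recorded in the Parquet file footers would be one way to find out.

```python
# Counters copied from the two Parquet logs above.
runs = {
    "parquet, no --compress": {"hdfs_bytes": 456758842, "map_ms": 306430},
    "parquet, --compress (SnappyCodec)": {"hdfs_bytes": 456758940, "map_ms": 233128},
}

plain = runs["parquet, no --compress"]
snappy = runs["parquet, --compress (SnappyCodec)"]

# Absolute and relative difference in stored output size.
diff_bytes = snappy["hdfs_bytes"] - plain["hdfs_bytes"]
rel_diff = abs(diff_bytes) / plain["hdfs_bytes"]

# Ratio of total map-task time between the two runs.
time_ratio = snappy["map_ms"] / plain["map_ms"]

print(f"size difference: {diff_bytes} bytes ({rel_diff:.2e} of the output)")
print(f"map-task time ratio (compress / no compress): {time_ratio:.3f}")
```

A size difference this small suggests that, in this run, the `--compress` options did not change how the Parquet files were encoded, so the shorter run time cannot be attributed to smaller output on the basis of these counters alone.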




Results of the tests above.


                                    Text, uncompressed   Text, compressed   Parquet, uncompressed   Parquet, compressed

Total job and transfer time (sec)   48.8625              49.3399            97.1124                 78.3224

Total data transferred              1.3248 GB            771.6916 MB        435.5992 MB             435.5993 MB

Transfer rate (MB/sec)              27.763               15.6403            4.4855                  5.5616
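
The figures in the result table can be recomputed and put on a common scale. This is a minimal sketch using only the numbers reported above; it assumes 1 GB = 1024 MB when converting the first case, and the row labels are just illustrative names.

```python
# (label, transferred MB, elapsed seconds, reported MB/sec) from the result table.
GB = 1024  # assumption: 1 GB = 1024 MB
rows = [
    ("text, uncompressed", 1.3248 * GB, 48.8625, 27.763),
    ("text, snappy", 771.6916, 49.3399, 15.6403),
    ("parquet, uncompressed", 435.5992, 97.1124, 4.4855),
    ("parquet, snappy", 435.5993, 78.3224, 5.5616),
]

baseline_mb = rows[0][1]  # uncompressed text is the reference size
results = {}
for label, mb, seconds, reported_rate in rows:
    rate = mb / seconds            # MB/sec recomputed from size and time
    relative = mb / baseline_mb    # stored size relative to uncompressed text
    results[label] = (rate, relative)
    print(f"{label:<24} {rate:8.4f} MB/sec  {relative:6.1%} of baseline size")
```

Because the rate is the stored output size divided by the elapsed time, a format that stores less data reports a lower MB/sec even when it imports the same 11,620,800 rows. The elapsed time and the relative size are therefore the more useful columns for comparing the four cases.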