- Inserted 100001 rows in 7.28s
Table format:
(key_ID, int)
For example:
insert into a values (0,126),(1,173),(2,568),(3,164),(4,593),(5,788),(6,924),(7,206),(8,359),(9,690),(10,987),(11,231),(12,817),(13,122),(14,373),(15,177),(16,156),(17,256),(18,203),(19,38);
When the row count reaches 1,000,000, parsing the statement becomes very slow.
- Inserted 10000 rows in 14.11s
Table format: (11 columns)
(string, boolean, double, float, bigint, int, smallint, tinyint, string, string, timestamp)
For example:
insert into hbasestringids values ('0',true,0.8,1.5,123456789101112,12345678,12345,12,'aaa','abc','1985-09-25 17:45:30.005'),('1',false,6.77777,1.1111,7654321121212,87654321,21345,123,'bbb','dcba','1986-10-25 17:45:30.005');
When the row count reaches 100,000, the program crashes.
Splitting the statement into multiple smaller queries works.
- Insert with subselect:
Inserted 10000 rows in 1.47s
Inserted 20000 rows in 2.26s
This method handles 100,000 rows without problems.
Query:
insert into default.b (id, string_col, int_col, double_col, bool_col, timestamp_col) select id, string_col, int_col, double_col, bool_col, timestamp_col from default.hbasestringids limit 20000;
Tables default.b and default.hbasestringids have the same schema:
+-----------------+-----------+---------+
| name            | type      | comment |
+-----------------+-----------+---------+
| id              | string    |         |
| bool_col        | boolean   |         |
| double_col      | double    |         |
| float_col       | float     |         |
| bigint_col      | bigint    |         |
| int_col         | int       |         |
| smallint_col    | smallint  |         |
| tinyint_col     | tinyint   |         |
| date_string_col | string    |         |
| string_col      | string    |         |
| timestamp_col   | timestamp |         |
+-----------------+-----------+---------+
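The workaround noted for the 11-column table, splitting one large INSERT ... VALUES into several smaller statements, could be sketched roughly as follows. This is only an illustration in Python: the helper names sql_literal and batched_inserts are hypothetical, the default batch size of 1000 is an arbitrary assumption rather than a measured optimum, and the literal formatting is simplified (strings are single-quoted with backslash escapes, and timestamps are passed as strings as in the example above).

```python
from typing import Iterator, Sequence


def sql_literal(value) -> str:
    """Render a Python value as a SQL literal (simplified, for illustration)."""
    if value is None:
        return "NULL"
    # Check bool before int/float: bool is a subclass of int in Python.
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return repr(value)
    # Everything else is treated as a string; escape backslashes and quotes.
    text = str(value).replace("\\", "\\\\").replace("'", "\\'")
    return "'" + text + "'"


def batched_inserts(table: str, rows: Sequence[Sequence],
                    batch_size: int = 1000) -> Iterator[str]:
    """Yield one multi-row INSERT statement per batch of at most batch_size rows."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(rows), batch_size):
        chunk = rows[start:start + batch_size]
        values = ",".join(
            "(" + ",".join(sql_literal(v) for v in row) + ")" for row in chunk
        )
        yield f"insert into {table} values {values};"


if __name__ == "__main__":
    # Hypothetical sample data in the (key_ID, int) format of table a.
    sample = [(i, i * 7 % 1000) for i in range(10)]
    for statement in batched_inserts("a", sample, batch_size=4):
        print(statement)
```

Each generated statement would then be submitted separately through whatever client is in use. The sketch only builds the SQL text and does not execute anything, so a suitable batch size would still need to be found by measurement.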
System Configuration:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 8
On-line CPU(s) list: 0-7
Thread(s) per core: 2
Core(s) per socket: 4
Socket(s): 1
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 42
Stepping: 7
CPU MHz: 1600.000
BogoMIPS: 6785.08
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 8192K
NUMA node0 CPU(s): 0-7
Physical Memory:
16G
Disk:
ATA device, with non-removable media
Model Number: ST31500341AS
Serial Number: 9VS551EJ
Firmware Revision: CC4G
Transport: Serial
Standards:
Used: unknown (minor revision code 0x0029)
Supported: 8 7 6 5
Likely used: 8
Configuration:
Logical max current
cylinders 16383 16383
heads 16 16
sectors/track 63 63
--
CHS current addressable sectors: 16514064
LBA user addressable sectors: 268435455
LBA48 user addressable sectors: 2930277168
Logical/Physical Sector size: 512 bytes
device size with M = 1024*1024: 1430799 MBytes
device size with M = 1000*1000: 1500301 MBytes (1500 GB)
cache/buffer size = unknown
Nominal Media Rotation Rate: 7200
OS Version:
Linux version 2.6.32-431.el6.x86_64 (mockbuild@c6b8.bsys.dev.centos.org) (gcc version 4.4.7 20120313 (Red Hat 4.4.7-4) (GCC) ) #1 SMP Fri Nov 22 03:15:09 UTC 2013