You might encounter situations where you need to pre-populate test data for performance testing
but you need huge records of data which is taking ages if runnig with normal insert query
.
OK, I got you. The following is the steps on how to update MySQL configuration and loading fast using CSV
file.
Optional Config for Better Performance Before Loading
The following configurations are good to set for better performance before loading.
SET GLOBAL max_allowed_packet = 1G;
SET GLOBAL sql_mode = '';
SET GLOBAL innodb_buffer_pool_size = 1G;
SET FOREIGN_KEY_CHECKS = 0;
SET UNIQUE_CHECKS = 0;
Create the Table
For this demo, user
table will be created. This can be any table of your choice.
CREATE TABLE user (
id INT PRIMARY KEY,
name VARCHAR(255),
age INT NULL,
email VARCHAR(255),
created_at TIMESTAMP,
updated_at TIMESTAMP
);
Prepare the CSV File
For this demo, the following file will be saved as data.csv
. This can be any name.
id,name,age,email,created_at,updated_at
1,John Doe,30,johndoe@example.com,2024-06-25 10:00:00,2024-06-25 10:00:00
2,Jane Smith,25,janesmith@example.com,2024-06-25 11:00:00,2024-06-25 11:00:00
Load it
LOAD DATA INFILE 'C:/path/to/your/data.csv'
INTO TABLE user
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES
(id, name, age, email, created_at, updated_at);
PS secure-file-priv
issues can be occurred when loading.
This is the security feature which restricts the location from which files can be loaded.
If that the case, you have two options:
- Option 1. Put the
data.csv
file in the configured directory, if there is one. - Option 2. Turn this feature off temporarily, which you should not do in production.
To check the configured directory:
SHOW VARIABLES LIKE 'secure-file-priv';
To turn off the settings. Open my.ini
or my.cnf
of your MySQL configuration.
[mysqld]
secure-file-priv=""
Restart the MySQL server to apply the changes.
Handling Null Values
You need to specify how MySQL should interpret missing or blank values in the CSV file to handle null values.
The following example shows that @age
is a user variable that temporarily stores the value from the CSV file.
The NULLIF
function replaces empty strings with NULL
.
LOAD DATA INFILE 'C:/path/to/your/data.csv'
INTO TABLE big_dataset
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES
(id, name, @age, email, created_at, updated_at)
SET age = NULLIF(@age, '');
By now your data should be imported as expected.
Post-Load Steps
If you make changes for the Step 1, now, it is time to revert back the changes.
SET GLOBAL max_allowed_packet = 64M;
SET GLOBAL sql_mode = 'STRICT_TRANS_TABLES,NO_ENGINE_SUBSTITUTION';
SET GLOBAL innodb_buffer_pool_size = 128M;
SET FOREIGN_KEY_CHECKS = 1;
SET UNIQUE_CHECKS = 1;
Conclusion
This is all you need but properly configuring your MySQL server and handling null
values correctly are the minimum requirements to load a smooth and error-free
data.
Happy Data Importing!