When you have to import files as text, csv, etc ... into an SQL database, non-desired line breaks (often from notes fields, text boxes, <BR>
tags, tabs, or other misinterpreted HTML ...) are very painful. It may fail your import, or be a waste of time. Exploring one's file with Notepad is sometimes necessary.
Complete 4-step correction
How to effectively manage these line breaks?
In this standard example, we want to identify and fix lines that do not start with a double quote (caused by line breaks) within a file where healthy lines start with a double quote.
We will proceed in 4 steps thanks to regular expressions (in Find mode, Ctrl+f, Replace tab):
1) Replacing all the correctly enclosed line breaks (\r\n
) and the first double quote (^["]
) with a complex string, voluntarily chosen to be totally absent from the file (followed by a double quote):
Find what: \r\n^["]
Replace with: xxxxxyzf123456789fzyxxxxx"
The number of lines in your file will decrease significantly, and you will easily see any errors. Do not touch them.
2) Replacing the remaining line breaks (the corrupt ones) with a space:
Find what: \r\n
Replace with:
(a space)
The space is necessary to not concatenate strings separated by line breaks.
The number of lines keep to decrease visibly. If you look at the Notepad horizontal scrollbar, you will see that its length has also increased.
3) Replacing of the last remaining line breaks by a space:
Find what: \n
Replace with:
(a space)
Now you get only one loooonnng line. It was the aim.
4) Perfect, you just have to replace in this unique line our famous string by true line breaks (followed by a double quote).
When the unwanted line breaks have been removed, you'll be sure to restore only the necessary line breaks at the beginning of each record:
Find what: xxxxxyzf123456789fzyxxxxx"
Replace with: \r\n"
The extended previous line gets the shape of a standard file, but without the unwanted line breaks. Good for import!