If your file loads in looking like one long line with “^M” characters
scattered through it, it is because you’ve saved it using a MAC formatted file
structure, probably using MAC Excel, even though MACOS 10.x itself uses UNIX
style line endings. If you then open this file
under VIM running in Linux and VIM is not set to automatically detect it, you
get this mess. This happens because
old-school MACs used CR line endings, and Unix expects LF type line endings.
Most UNIX based utilities and scripting languages do not
know how to interpret a CR type line ending.
If you try to parse the CSV file with any UNIX based utility or
scripting language, you are going to have trouble because its line-reading
functions never see an LF and so treat the entire file as a single line. Most
PHP file handling functions also have trouble with it.
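To see the problem concretely, here is a minimal shell sketch that classifies a file's line endings by counting CR and LF bytes. The helper name detect_eol and the sample rows are made up for illustration; it relies only on tr and wc, and its "dos" guess simply assumes every CR is paired with an LF.

```shell
#!/bin/sh
# detect_eol is a hypothetical helper, not a standard utility.
# It prints mac, unix, dos, or unknown for the file given as $1.
detect_eol() {
    # $(( ... )) strips the padding some wc implementations add
    cr=$(( $(tr -cd '\r' < "$1" | wc -c) ))
    lf=$(( $(tr -cd '\n' < "$1" | wc -c) ))
    if [ "$cr" -gt 0 ] && [ "$lf" -eq 0 ]; then
        echo mac        # only CR present
    elif [ "$cr" -eq 0 ] && [ "$lf" -gt 0 ]; then
        echo unix       # only LF present
    elif [ "$cr" -gt 0 ] && [ "$cr" -eq "$lf" ]; then
        echo dos        # as many CRs as LFs, assumed to be CRLF pairs
    else
        echo unknown    # empty, single unterminated line, or mixed
    fi
}

# Build a small CR-terminated sample (made-up data) and inspect it.
sample=$(mktemp "${TMPDIR:-/tmp}/eol.XXXXXX")
printf 'id,name\r1,alice\r2,bob\r' > "$sample"
detect_eol "$sample"    # prints: mac
wc -l < "$sample"       # reports 0, because wc -l counts LF characters
rm -f "$sample"
```

A zero line count from wc -l on a file that clearly has several rows is usually the quickest hint that you are looking at CR line endings.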
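So you need to convert the line endings. There are numerous ways to do this such as
using TR or Perl:
tr '\r' '\n' < mac_formatted_file.csv
perl -ne 's/([^\r])\r/$1\n/g; s/\r//g; print;' mac_format.csv
Both print the converted text to standard output, so redirect it to a new file
if you want to keep it.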
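Since tr only writes to standard output, converting a file in place takes a temporary file. The following is a minimal sketch, assuming the file uses pure CR endings (a DOS file run through it would end up with doubled line breaks). The function name mac_to_unix is hypothetical.

```shell
#!/bin/sh
# mac_to_unix is a hypothetical helper: rewrite the file named in $1
# so that every CR becomes an LF. Intended for CR-only files.
mac_to_unix() {
    tmp=$(mktemp "${TMPDIR:-/tmp}/mac2unix.XXXXXX") || return 1
    if tr '\r' '\n' < "$1" > "$tmp"; then
        # Copy the contents back rather than moving the temp file,
        # so the original file keeps its permissions.
        cat "$tmp" > "$1"
        status=$?
    else
        status=1
    fi
    rm -f "$tmp"
    return "$status"
}

# Usage (file name is only an example):
#   mac_to_unix mac_format.csv
```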
But the easiest way I have found is to just use the VIM
editor.
First, check your current file format settings with a:
:set ffs?
If you are having trouble reading MAC formatted files, you
will likely find this produces a:
fileformats=unix,dos
A simple execution of:
:set ffs=unix,dos,mac
will generally fix this. Exit the file and open it again, and you will now be
able to see your text correctly. However, you haven’t actually
converted it. You’ve just taught VIM how
to automatically understand the CR type line endings.
VIM can be in any one of three file editing modes: dos,
unix, or mac. VIM attempts to figure out
the correct mode when it loads the file.
But if the file format settings (ffs) do not contain the mac entry, VIM
does not know how to read MAC files.
You can convert the line endings by first switching the
editor into MAC mode so it can understand the file, switching the file format over
to UNIX, and then writing out the converted file:
- :e ++ff=mac
- :setlocal ff=unix
- :w
The VIM “:e ++ff=mac” command switches VIM over to look for
‘\r’ (Carriage Return or CR) characters as line endings. These characters show up as “^M” if the
file is being interpreted as a UNIX file.
Unix uses a pure ‘\n’ (Line Feed or LF) as the only line-ending
indicator. The UNIX LF character shows
up as a “^J” if you switch the editor over to MAC mode. If you switch a UNIX file to DOS mode, it
will display correctly, but it will show a message at the bottom saying “[CR
missing][dos]” right after you execute the “:e ++ff=dos” command.
You can switch back and forth into any mode you want by
executing:
- :e ++ff=current_format
- :setlocal ff=target_format
- :w
where “current_format” and “target_format” are: dos, unix, or mac
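The same three steps can also be run without opening the editor interactively. This is a hedged sketch, assuming vim is on your PATH and using its silent Ex mode (-es); vim_convert is a hypothetical wrapper name, not a VIM command.

```shell
#!/bin/sh
# vim_convert is a hypothetical wrapper around the three VIM commands.
#   $1 = current format, $2 = target format (dos, unix, or mac)
#   $3 = file to convert in place
vim_convert() {
    # -es runs VIM in silent Ex mode; each -c is one ":" command.
    # stdin comes from /dev/null so VIM never waits for typed input.
    vim -es -c "e ++ff=$1" -c "setlocal ff=$2" -c "wq" "$3" < /dev/null
}

# Usage (file name is only an example):
#   vim_convert mac unix mac_format.csv
```

This overwrites the file, so keep a copy if you are not sure which format it is in.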