asciimap - byte-level analysis of files

I was recently presented with the challenge of dumping the contents of one database into an ASCII file that could be imported into a second, different database. It sounded easy, but I ran into unexpected problems with certain characters, field delimiters, and so on. Looking at the source file in a text editor didn't help, because it contained several naughty invisible characters that eventually had to be stripped out. I needed a way to analyze the file byte by byte and find out exactly what was inside it.

Hence, asciimap.

Download

Grab the Perl source here.

Usage

You use it like this:

          asciimap.pl filename

asciimap then reads a configurable amount of data from the file (15 bytes at a time by default) and prints the ASCII code of each character it finds. For invisible characters (vertical tabs, bells, etc.), it also prints an English name for the code. Here's sample output for a small file containing the word 'whirlycott'.

$ asciimap.pl myfile.txt
-------------- ROW: 0 --- [whirlycott
]
0:      w       -> 119 
1:      h       -> 104 
2:      i       -> 105 
3:      r       -> 114 
4:      l       -> 108 
5:      y       -> 121 
6:      c       -> 99 
7:      o       -> 111 
8:      t       -> 116 
9:      t       -> 116 
10:             -> 10 (Newline)
--------------
TOTAL BYTES: 11         UNIQUE CHARS: 10
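
The mapping in the sample output above can be approximated in a few lines. What follows is a rough Python sketch of the idea, not the Perl source: the names map_bytes, map_file and describe_byte, the table of control-character names, and the choice to restart the position counter on each row are all assumptions made for illustration. It also reads the whole file at once and then slices it into rows, rather than reading 15 bytes at a time.

```python
# Names for a few non-printing bytes. This table is an assumption,
# not the list the original Perl script uses.
CONTROL_NAMES = {
    0: "Null",
    7: "Bell",
    8: "Backspace",
    9: "Tab",
    10: "Newline",
    11: "Vertical tab",
    12: "Form feed",
    13: "Carriage return",
    27: "Escape",
    32: "Space",
    127: "Delete",
}


def describe_byte(value):
    """Return (display, label) for one byte value (0-255)."""
    if 33 <= value <= 126:
        # Visible ASCII: show the character itself, no label needed.
        return chr(value), ""
    fallback = "Non-ASCII" if value > 127 else "Control"
    return "", CONTROL_NAMES.get(value, fallback)


def map_bytes(data, chunk_size=15):
    """Yield report lines for a bytes object, one row per chunk."""
    seen = set()
    for row, start in enumerate(range(0, len(data), chunk_size)):
        chunk = data[start:start + chunk_size]
        yield f"-------------- ROW: {row} ---"
        # Positions are counted within the row (an assumption).
        for offset, value in enumerate(chunk):
            display, label = describe_byte(value)
            suffix = f" ({label})" if label else ""
            yield f"{offset}:\t{display}\t-> {value}{suffix}"
        seen.update(chunk)
    yield "--------------"
    yield f"TOTAL BYTES: {len(data)}\tUNIQUE CHARS: {len(seen)}"


def map_file(path, chunk_size=15):
    """Print the report for a file, opened in binary mode."""
    with open(path, "rb") as handle:
        for line in map_bytes(handle.read(), chunk_size):
            print(line)
```

Calling map_file("myfile.txt") on the sample file should print a report similar to the one above, minus the raw preview of the row in square brackets. Opening the file in binary mode matters here: it keeps Python from decoding the data or translating line endings, so every byte shows up in the report.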

For larger files, you'll see the same kind of output repeated for each row of data read.

Bugs / Future Development

I don't know of any obvious bugs. I'm not sure how this will work with files encoded in Unicode or EBCDIC; if you happen to find out, please let me know. I have no plans for further development, but I may work on it if someone needs something fixed or added.

License

This is released under the GNU General Public License.