Chronosbox

Change charset and line breaks recursively in Linux

by Handrus Nogueira on Wednesday, 14 April/2010, under Linux, Tips

Some time ago I had to change the charset and line breaks of a web application: from ISO-8859-1 (Latin-1) to UTF-8, and from Linux line breaks (LF) to Windows line breaks (CRLF). At that time chronos helped me create this small script, which I’m sharing now:

#!/bin/sh
# change the extension filter as you need ;)
# the pattern is anchored to the end of the name; anything under a .svn directory is skipped
find . -type f | grep -E '\.(php|css|js|html?)$' | grep -v '/\.svn/' > ./charset_list.txt
while IFS= read -r line; do
	# charset change: replace the original only if iconv succeeded
	if iconv -f LATIN1 -t UTF-8 "$line" > "${line}2"; then
		mv "${line}2" "$line"
	else
		rm -f "${line}2"
	fi
	# we adopted dos format for line breaks
	unix2dos "$line"
done < ./charset_list.txt > ./charset_after.log 2>&1 # read the file list and send all output to the log

Explanation: the script generates a list of the files in the current folder and its subfolders (according to the extension filter on the find line near the top of the script).
It then reads that list and, for each file, creates a UTF-8 version in a second file, moves it over the original, and finally converts the file's line breaks. Very simple, I LOVE BASH!
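One thing to watch out for: running the conversion a second time on the same tree would treat the UTF-8 bytes as Latin-1 and encode them again. Below is a minimal sketch of a per-file helper that tries to avoid this. The name to_utf8_crlf is made up for illustration, the "already valid UTF-8 means already converted" check is an assumption (a Latin-1 file could pass it by coincidence), and awk stands in for unix2dos so the sketch does not depend on that package.

```shell
#!/bin/sh
# Sketch only: to_utf8_crlf is a hypothetical helper, not part of the original script.
to_utf8_crlf() {
	f=$1
	# Assumption: a file that already decodes as UTF-8 was converted earlier.
	# Pure ASCII also passes this check, and converting it would change nothing.
	if ! iconv -f UTF-8 -t UTF-8 "$f" > /dev/null 2>&1; then
		if iconv -f LATIN1 -t UTF-8 "$f" > "${f}.tmp"; then
			mv "${f}.tmp" "$f"
		else
			rm -f "${f}.tmp"
			return 1
		fi
	fi
	# Stand-in for unix2dos: drop any existing CR, then end every line with CRLF.
	# Unlike unix2dos, this also terminates a last line that had no line break.
	LC_ALL=C awk '{ sub(/\r$/, ""); printf "%s\r\n", $0 }' "$f" > "${f}.tmp" &&
		mv "${f}.tmp" "$f"
}
```

Inside the loop of the script this would be called as to_utf8_crlf "$line" in place of the iconv, mv and unix2dos steps. Calling it twice on the same file should leave the file unchanged the second time.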

