Find UTF-8 byte order marks

In a templating application I just ran into ugly “” characters a the beginning of the text. This is caused by the byte order mark with the hex characters 0xEFBBBF. As it was not the only one file that contained the BOM I ran a search:

find . -iname '*.css' -o -iname '*.html' -o -iname '*.js' -o -iname '*.pm' -o -iname '*.pl' -o -iname '*.xml' | xargs grep -rl $'xEFxBBxBF'

To remove the BOM I followed the suggested way by using sed:

find . -iname '*.css' -o -iname '*.html' -o -iname '*.js' -o -iname '*.pm' -o -iname '*.pl' -o -iname '*.xml' -exec sed 's/^xEFxBBxBF//' -i.bak {} ; -exec rm {}.bak ;

Tada, no more ugly BOMs!


2 thoughts on “Find UTF-8 byte order marks

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s