Showing posts with label editor. Show all posts

Sunday, July 15, 2012

Save Your PHP in UTF-8 Format Without BOM

A BOM (byte order mark) is a marker at the very start of a file that indicates its byte order; in a UTF-8 file it is the three bytes EF BB BF. If a PHP file saved as UTF-8 with a BOM contains non-English characters, those characters are displayed as question marks (?) when the script runs on a Linux/Apache web server. The problem does not occur on Windows/IIS web servers. You can run the following PHP code to check whether your PHP file has a BOM:


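<?php

// Read the first three bytes of the file to look for a UTF-8 BOM.
$handle = fopen('yourfilename.php', 'rb');
if ($handle === false) {
 // The file could not be opened
 exit("Error");
}
$bom = fread($handle, 3);
fclose($handle);

if ($bom === false || $bom === '') {
 // Nothing could be read (read failure or empty file)
 echo("Error");
}
else if ($bom === chr(0xef).chr(0xbb).chr(0xbf)) {
 // UTF8 Byte Order Mark present
 echo("BOM");
}
else {
 echo("No BOM");
}

?>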


If the script prints BOM for your UTF-8 PHP file, you need to convert the file to UTF-8 without a BOM. This is easy to fix with Notepad++. Open the PHP file, go to the "Encoding" menu and choose "Encode in UTF-8 without BOM". Then save the file.
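If Notepad++ is not at hand, the same conversion can be sketched in PHP itself. This is a minimal sketch, not part of the original post: the function names strip_utf8_bom and remove_bom_from_file are made up for illustration, and the type declarations assume PHP 7 or later. It removes the three BOM bytes only when they are actually present, and rewrites the file in place.

```php
<?php

// Hypothetical helper (not from the original post): returns $contents
// without a leading UTF-8 BOM, or unchanged if there is none.
function strip_utf8_bom(string $contents): string
{
    $bom = "\xEF\xBB\xBF";
    if (strncmp($contents, $bom, 3) === 0) {
        return (string) substr($contents, 3);
    }
    return $contents;
}

// Hypothetical helper: rewrites the file at $path without its BOM.
// Returns true if a BOM was found and removed, false if there was none.
function remove_bom_from_file(string $path): bool
{
    $contents = file_get_contents($path);
    if ($contents === false) {
        throw new RuntimeException("Cannot read $path");
    }

    $stripped = strip_utf8_bom($contents);
    if ($stripped === $contents) {
        return false; // no BOM, leave the file untouched
    }

    if (file_put_contents($path, $stripped) === false) {
        throw new RuntimeException("Cannot write $path");
    }
    return true;
}
```

Calling remove_bom_from_file('yourfilename.php') would then fix the file checked above. Keep a backup copy first, since the file is overwritten.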



Your PHP can now output non-English texts without any problem!

P.S.: (update on 18th of July 2012) You need to upload the file to the server as an ASCII-formatted file. I left out this important detail earlier. Sorry.


Wednesday, March 10, 2010

Fixing a Perl Script Saved in UTF-8 Format Using Notepad

If your Perl script needs to display non-English text, say Japanese or Chinese, you will usually save the script in UTF-8 format. A problem arises if you use Notepad to save it as UTF-8, because Notepad adds a BOM (Byte Order Mark) header to the file.

The problem is that Perl won't run a UTF-8 file with a BOM header once it is uploaded to the server. It will run fine, however, if you run it locally on a Windows-based PC with a later version of Apache.

To solve this problem (if you are not running locally with the latest Apache), the following Perl script gets rid of the BOM header (the first three bytes) of a UTF-8 file saved with Notepad:

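#!c:\perl\bin\perl.exe
# Change the line above to the path of your Perl executable.
use strict;
use warnings;

my $input  = "myfile.pl";
my $output = "myfile_utf8.pl";

print "Content-type: text/html\n\n";
binmode(STDOUT, ":utf8");

# Read the whole input file as raw bytes.
open(my $in, '<', $input) or die "Cannot open $input: $!";
binmode($in);
my $all = do { local $/; <$in> };
close($in);

# Remove the first three bytes only if they are the UTF-8 BOM.
$all =~ s/^\xEF\xBB\xBF//;

# Write the result as raw bytes.
open(my $out, '>', $output) or die "Cannot open $output: $!";
binmode($out);
print $out $all;
close($out);

print "Conversion done! File is output to $output.";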

You can set the $input and $output variables to whatever file names you want. The conversion script itself can be saved in plain ANSI format.

After the conversion, if the output file doesn't contain any non-English characters, Notepad will assume it is an ANSI file. Don't worry. Put some non-English characters in your $input file and run the conversion again; Notepad will then show the $output file as UTF-8 (you can see the format when you do a Save As…).

Upload the $output file to your server using ASCII transfer mode (not Binary mode), then chmod (change file permission) the $output file to 755. It will then run without any Internal Server Error message.

Please also make sure your $input/$output Perl script prints the "Content-type: text/html; charset=utf-8\n\n" HTTP header so that its UTF-8 content is output properly.


Note: Do not save the $output file again with Notepad as UTF-8. Notepad will add the BOM header back and the file will lose its "charm". :)