PHP – MySQL: Unicode solution to Chinese, Russian or any language

I am a freelance web developer, and my main tools are PHP and MySQL. A few days ago I got a project to build a real estate site in Chinese. We usually build websites in English, with the database content in English too, so the default MySQL configuration works fine every time.

But when it comes to a language other than English, many people do not know what to do. When I started the project, I did not even know that the default MySQL settings would not work for Chinese. So I started searching for a stable solution that would let my program add, update and search data in any language in the MySQL database.

And Yeah.
I found it!

OK.

Let us see the solution now.
It is very very simple.

Step One: SET THE CHARSET TO UTF-8 IN THE HEAD SECTION

First of all, the browser needs to know that this page uses Unicode. So, go to your <HEAD></HEAD> section and set the charset to utf-8, and the browser will be able to show the Unicode text correctly. You can copy and paste the line below:

<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />

Step Two: CREATING THE DATABASE

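When you create (a) your database and (b) any table in it, set the collation of both to utf8_unicode_ci. This is very easy to do if you are using phpMyAdmin. Note that MySQL's utf8 character set stores at most three bytes per character; characters outside the Basic Multilingual Plane (some rare Chinese characters, emoji) need utf8mb4 with a collation such as utf8mb4_unicode_ci.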
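
If you prefer to create the database and table from a script instead of phpMyAdmin, the statements might look like the ones built below. This is only a sketch: the helper names, the realestate and listings names, and the columns are made up for illustration, and the generated SQL still has to be run against your own server.

```php
<?php
// Hypothetical helpers (not from the original post): they only build SQL strings,
// so nothing here touches a real server.

// Build a CREATE DATABASE statement with a UTF-8 charset and collation.
function buildCreateDatabaseSql(string $database): string
{
    return "CREATE DATABASE IF NOT EXISTS `$database` "
         . "CHARACTER SET utf8 COLLATE utf8_unicode_ci";
}

// Build a CREATE TABLE statement whose default charset/collation is UTF-8.
// $columns maps a column name to its type, e.g. ['title' => 'VARCHAR(255)'].
function buildCreateTableSql(string $table, array $columns): string
{
    $definitions = [];
    foreach ($columns as $name => $type) {
        $definitions[] = "`$name` $type";
    }
    return "CREATE TABLE IF NOT EXISTS `$table` (" . implode(', ', $definitions) . ") "
         . "ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci";
}

echo buildCreateDatabaseSql('realestate'), ";\n";
echo buildCreateTableSql('listings', [
    'id'    => 'INT AUTO_INCREMENT PRIMARY KEY',
    'title' => 'VARCHAR(255)',
]), ";\n";
```

Setting the charset at the table level means every text column inherits it unless a column says otherwise.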

Step Three: DATABASE INITIALIZATION

When you initialize the database connection, add the "extra lines" (the SET queries and the mb_* calls):

<?php

	define('HOSTNAME', 'localhost');
	define('USERNAME', 'database_user_name');
	define('PASSWORD', 'database_password');
	define('DATABASE', 'database_name');

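	// Note: the mysql_* functions were removed in PHP 7.0; this code needs PHP 5.x.
	$dbLink = mysql_connect(HOSTNAME, USERNAME, PASSWORD);
	mysql_query("SET character_set_results=utf8", $dbLink); // encoding of results sent back to PHP
	mb_language('uni');             // mbstring language setting: Unicode (UTF-8)
	mb_internal_encoding('UTF-8');  // internal encoding used by the mb_* functions
	mysql_select_db(DATABASE, $dbLink);
	mysql_query("SET NAMES 'utf8'", $dbLink); // sets client, connection and results charsets together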

?>

But why are you adding the extra lines? Because they tell the database which character encoding your connection will send and expect to receive.
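
The mysql_* functions were removed in PHP 7.0, so on a current PHP version the same initialization has to go through mysqli or PDO. A minimal sketch with mysqli follows. The function name openUtf8Connection is hypothetical, the credentials are the placeholders from this step, and it assumes the mysqli extension is loaded.

```php
<?php
// Hypothetical helper, not part of the original post: opens a mysqli connection
// and switches the connection charset to UTF-8.
function openUtf8Connection(string $host, string $user, string $password, string $database): mysqli
{
    // Make mysqli throw exceptions instead of failing silently.
    mysqli_report(MYSQLI_REPORT_ERROR | MYSQLI_REPORT_STRICT);

    $db = new mysqli($host, $user, $password, $database);

    // set_charset() plays the role of "SET NAMES utf8" and also informs the
    // client library, which real_escape_string() relies on.
    $db->set_charset('utf8');

    return $db;
}

// Usage (needs a reachable MySQL server, so it is left commented out):
// $db = openUtf8Connection('localhost', 'database_user_name', 'database_password', 'database_name');
```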

Step Four: INSERTING INPUTS/DATA IN THE DATABASE

<?php

	mysql_query("SET character_set_client=utf8", $dbLink);
	mysql_query("SET character_set_connection=utf8", $dbLink);

	$sql_query = "INSERT INTO
	TABLE_NAME(field_name_one, field_name_two)
	VALUES('field_value_one', 'field_value_two')";
	mysql_query($sql_query, $dbLink);

?>

Why add the first two lines? They tell the database which encoding the incoming data uses. If SET NAMES 'utf8' was already run on this connection, as in Step Three, they repeat what it set and are harmless.
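
As a variation on this insert, here is a sketch using a mysqli prepared statement, which sends the values separately from the SQL text so that quotes inside Unicode input cannot break the query. It assumes a mysqli connection $db whose charset was already set to utf8 (for example with $db->set_charset('utf8')). The function name is hypothetical, and TABLE_NAME and the field names are the placeholders from this step.

```php
<?php
// Hypothetical helper: inserts one row through a prepared statement.
// Assumes $db is an open mysqli connection whose charset is already utf8.
function insertUnicodeRow(mysqli $db, string $valueOne, string $valueTwo): bool
{
    $stmt = $db->prepare(
        'INSERT INTO TABLE_NAME (field_name_one, field_name_two) VALUES (?, ?)'
    );
    // "ss" = both parameters are bound as strings.
    $stmt->bind_param('ss', $valueOne, $valueTwo);
    $ok = $stmt->execute();
    $stmt->close();

    return $ok;
}

// Usage (needs a live connection, so it is left commented out):
// insertUnicodeRow($db, '北京市朝阳区', 'Москва');
```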

Step Five: UPDATING INPUTS/DATA IN THE DATABASE

<?php

	mysql_query("SET character_set_client=utf8", $dbLink);
	mysql_query("SET character_set_connection=utf8", $dbLink);

	$sql_query = "UPDATE TABLE_NAME
	SET field_name_one='field_value_one', field_name_two='field_value_two'
	WHERE id='$id'; ";
	mysql_query($sql_query, $dbLink);

?>

So, since you are working with Unicode, you add the same two lines before you run your query string.

Step Six: SEARCHING DATA FROM THE DATABASE

<?php

	mysql_query("SET character_set_results=utf8", $dbLink);

	$sql_query = "SELECT * FROM TABLE_NAME WHERE id='$id'; ";
	$dbResult = mysql_query( $sql_query, $dbLink);

?>

Adding this one extra line before you search your Unicode data is enough.
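
Once a row comes back, the text usually ends up in HTML, and the escaping step should name the encoding as well. A small sketch, assuming the fetched value is a UTF-8 string; the helper name escapeForHtml is hypothetical.

```php
<?php
// Hypothetical helper: escape a UTF-8 string for safe output inside HTML.
function escapeForHtml(string $text): string
{
    // Naming the encoding explicitly avoids depending on the
    // default_charset setting of the server.
    return htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
}

echo escapeForHtml('北京 <b>出租</b>'), "\n";
// prints: 北京 &lt;b&gt;出租&lt;/b&gt;
```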

OK.
You are done. This should work smoothly for handling your data in any language, whether it is Bangla (my mother tongue), Hindi, Chinese, French, German, Spanish, Russian, Arabic, Urdu, or any other language.

And do not forget to leave a comment if you have one, because I may need to update the post.

Thanks for reading, and please check if it works for you.


Try to set the character encoding after the mysql_connect function, like this:

mysql_query("SET character_set_client='utf8'");
mysql_query("SET character_set_results='utf8'");

mysql_query("SET collation_connection='utf8_general_ci'");

Try to make sure the browser recognizes the page as Unicode.

Generally, this can be done by having your server send the right Content-Type HTTP header, including the charset you’re using.
For instance, something like this should work:

header('Content-type: text/html; charset=UTF-8');
echo "வெள்ளிக்கிழமை ஐ";

If this works, and your dynamically generated page still doesn’t:

  • make sure the data in your MySQL database is in UTF-8 too
    • this can be set for each table, or even for individual columns, in MySQL
  • and make sure you are connecting to it using UTF-8.

Basically, every part of your application should use the same encoding:

  • PHP files,
  • Database
  • HTTP server
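
To check this rule in practice, it can help to test whether a given string (or the contents of a PHP file) is valid UTF-8 at all. A minimal sketch follows; the function name isValidUtf8 is hypothetical, and it relies on PCRE's u modifier rejecting subjects that are not valid UTF-8.

```php
<?php
// Hypothetical helper: true when $bytes is a well-formed UTF-8 string.
function isValidUtf8(string $bytes): bool
{
    // With the /u modifier, preg_match() returns false for a malformed
    // subject and 1 when the (empty) pattern matches a valid one.
    return preg_match('//u', $bytes) === 1;
}

var_dump(isValidUtf8('வெள்ளிக்கிழமை')); // bool(true)
var_dump(isValidUtf8("\xE8"));          // bool(false): a lone Latin-1 "è" byte
```

Running the same check on file_get_contents() of a source file is one way to spot a PHP file that was saved in a different encoding.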

After that, make sure that your page’s encoding type is utf-8:

<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />


<html>
<head>
<meta http-equiv="Content-Type" content="text/html;charset=UTF-8">
</head>
<body>

<?php 
$con = mysql_connect("localhost","root","");
if (!$con)
  {
  die('Could not connect: ' . mysql_error());
  }

// SET NAMES sets character_set_client, character_set_connection and
// character_set_results in one statement.
mysql_query('SET NAMES utf8');
mysql_query('SET collation_connection=utf8_general_ci');

mysql_select_db('onlinetest',$con);

$nith = "CREATE TABLE IF NOT EXISTS `TAMIL` (
  `data` varchar(1000) character set utf8 collate utf8_bin default NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8";

if (!mysql_query($nith,$con))
{
  die('Error: ' . mysql_error());
}

$nithi = "INSERT INTO `TAMIL` VALUES ('இந்தியா நாட்டின் பக்கங்கள்')";

if (!mysql_query($nithi,$con))
{
  die('Error: ' . mysql_error());
}

$result = mysql_query("SET NAMES utf8");//the main trick
$cmd = "select * from TAMIL";
$result = mysql_query($cmd);
while($myrow = mysql_fetch_row($result))
{
    echo ($myrow[0]);
}
?>
</body>
</html>

Characters (Unicode) Not Displayed Properly

I’m generating a Word doc from PHP. The problem is that the Unicode characters I enter in the input text field are not displayed properly in the generated Word doc. My input is:…. , but the Word doc displays:…. There are a lot of differences in the Word doc, like “è” became “č” and “øþë” became “ųžė”.

My PHP code is $text = utf8_decode(htmlspecialchars(stripslashes(trim($_POST['imageDesc'.$i])))) and I store $text in the Word doc. Note: I have placed a meta tag in the PHP page. When I check in the PHP page (print($text)) it works fine; it is wrong only in the Word doc.
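
One possible reading of this symptom, not confirmed by anything in the question: utf8_decode() converts the text to ISO-8859-1, and Word then interprets those single bytes using a Baltic code page such as Windows-1257, where the same byte values stand for different letters. The sketch below reproduces the reported substitutions under that assumption. The function name is hypothetical, the four-entry table covers only the characters mentioned in the question, and iconv() stands in for utf8_decode() because the latter is deprecated as of PHP 8.2.

```php
<?php
// Hypothetical helper: simulate "UTF-8 -> ISO-8859-1, then misread as Windows-1257".
function simulateBalticMisread(string $utf8Text): string
{
    // Step 1: UTF-8 -> ISO-8859-1, which is what utf8_decode() does.
    $latin1 = iconv('UTF-8', 'ISO-8859-1', $utf8Text);

    // Step 2: read those single bytes as Windows-1257 instead of ISO-8859-1.
    // Only the four byte values from the question are listed here.
    $windows1257 = [
        "\xE8" => 'č', // è in ISO-8859-1
        "\xF8" => 'ų', // ø in ISO-8859-1
        "\xFE" => 'ž', // þ in ISO-8859-1
        "\xEB" => 'ė', // ë in ISO-8859-1
    ];

    return strtr($latin1, $windows1257);
}

echo simulateBalticMisread('è'), "\n";   // prints: č
echo simulateBalticMisread('øþë'), "\n"; // prints: ųžė
```

If this is what is happening, dropping utf8_decode() and writing the document as UTF-8 with an explicit charset declaration would be the first thing to try.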

By Rz Rasel Posted in Php
