I've encountered a difficult Unicode problem with MySQL. My configuration is:

OS: Windows XP
MySQL: version 5.0.41-community-nt running
Web server: Apache 2.2
Programming language: Python 2.3

All subsystems are configured to use utf8.

I have a development system and a server. The server runs Windows XP server edition. Both systems are configured with exactly the same software and exactly the same configuration files (to the best of my abilities).

My app runs perfectly on my development machine but crashes on the server. The problem appears to be with Unicode and MySQL (but I could be mistaken). Here's the error trace:

UnicodeEncodeError: 'ascii' codec can't encode character u'\xf1' in position 487: ordinal not in range(128)
    args = ('ascii', u'\n\t\t\tINSERT INTO sources (\n\t\t\t\tuID, lID, tID, cyp...\t\t\t\t0, 0, 0, 0, 0, "2007-12-16 19:06:27"\n\t\t\t)\n\t\t\t', 487, 488, 'ordinal not in range(128)')
    encoding = 'ascii'
    end = 488
    object = u'\n\t\t\tINSERT INTO sources (\n\t\t\t\tuID, lID, tID, cyp...\t\t\t\t0, 0, 0, 0, 0, "2007-12-16 19:06:27"\n\t\t\t)\n\t\t\t'
    reason = 'ordinal not in range(128)'
    start = 487

I am trying to insert the Spanish word "año" into the database. As I said, the app works perfectly on the development machine but not on the server. I've looked at the MySQL "my.ini" file and it has the character encoding set to utf8.

Anybody have any ideas?
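The exception in the trace can be reproduced without a database, which may help narrow down where it is raised. The following is a minimal sketch, not the poster's code: it assumes only that the offending character is the u'\xf1' named in the trace, and the variable names are made up for illustration.

```python
# The word from the post; u"\xf1" is the character named in the traceback.
word = u"a\xf1o"

# Encoding to ASCII fails, because ASCII only covers code points 0-127.
try:
    word.encode("ascii")
    ascii_failed = False
except UnicodeEncodeError as exc:
    ascii_failed = True
    print("ascii failed:", exc.reason)

# Encoding to UTF-8 succeeds; the n-with-tilde becomes two bytes.
utf8_bytes = word.encode("utf-8")
print(repr(utf8_bytes))
```

Because the error comes from Python's own ASCII codec, it is raised on the Python side when the unicode query is converted to bytes, before the statement reaches the MySQL server.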
| |||
> I've encountered a difficult unicode problem with MYSQL. My configuration
> is:
<snip>

Do you send a "SET NAMES utf8;" command directly after connecting? Otherwise, the MySQL server may still default to latin-1. This is somewhat server-dependent, so you may want to take a second look at your settings file. The [mysql] section applies only to the command-line client, not to other clients.

Hope this helps,
-- 
Willem Bogaerts

Application smith
Kratz B.V.
http://www.kratz.nl/
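One way to make sure the statement is always sent first is to wrap cursor creation in a small helper. This is only a sketch: `open_utf8_cursor` is a hypothetical name, and the recording classes are stand-ins so the example runs without a server. In real code, `connection` would be whatever the database driver's connect call returns.

```python
class RecordingCursor:
    """Stand-in cursor that only records the statements it is given."""

    def __init__(self, log):
        self.log = log

    def execute(self, statement):
        self.log.append(statement)


class RecordingConnection:
    """Stand-in connection; a real one would come from the database driver."""

    def __init__(self):
        self.statements = []

    def cursor(self):
        return RecordingCursor(self.statements)


def open_utf8_cursor(connection):
    """Hypothetical helper: open a cursor and send SET NAMES before anything else."""
    cursor = connection.cursor()
    cursor.execute("SET NAMES utf8")
    return cursor


connection = RecordingConnection()
cursor = open_utf8_cursor(connection)
cursor.execute("SELECT 1")  # placeholder for the application's own statements
print(connection.statements)
```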
| |||
Thanks Willem,

I think so. But I'm not exactly sure. My python code looks like this:

    db = MySQLdb.connect(host='localhost',user='root',passwd='somepassword',db='somedb')
    c = db.cursor(MySQLdb.cursors.DictCursor)
    c.execute("SET NAMES utf8")
    c.execute("SET CHARACTER SET utf8")
    c.execute("SET character_set_client=utf8")
    c.execute("SET character_set_connection=utf8")
    bigLongQuery = "INSERT INTO mytable ..... "
    c.execute(bigLongQuery)
    db.commit()
    data = c.fetchall()
    c.close()
    db.close()

Can you tell if the above is correct?

As for the [mysql] section of my.ini, it sets "default-character-set=utf8". Same goes for [mysqld].

"Willem Bogaerts" <w.bogaerts@kratz.maardanzonderditstuk.nl> wrote in message news:47663cf2$0$85790$e4fe514c@news.xs4all.nl...
>> I've encountered a difficult unicode problem with MYSQL. My configuration
>> is:
> <snip>
>
> Do you send a "SET NAMES utf8;" command directly after connection?
> Otherwise, the mysql server may still default to latin-1. This is a bit
> server-dependent and you may take a second look at your settings file.
> the [mysql] section is just for the command-line client and not for
> others.
>
> Hope this helps,
> --
> Willem Bogaerts
>
> Application smith
> Kratz B.V.
> http://www.kratz.nl/
| ||||
> I think so. But I'm not exactly sure. My python code looks like this:
>
> db =
> MySQLdb.connect(host='localhost',user='root',passwd='somepassword',db='somedb')
> c = db.cursor(MySQLdb.cursors.DictCursor)
> c.execute("SET NAMES utf8")
> c.execute("SET CHARACTER SET utf8")
> c.execute("SET character_set_client=utf8")
> c.execute("SET character_set_connection=utf8")
> bigLongQuery = "INSERT INTO mytable ..... "
> c.execute(bigLongQuery)
> db.commit()
> data = c.fetchall()
> c.close()
> db.close()
>
> Can you tell if the above is correct?

Of the four SET commands, only the first one is needed. Whether the rest is correct depends on the encoding of the strings used in the big query. A string is just a sequence of bytes. Valid utf-8 also decodes as latin-1 (and, for most byte values, as cp1252), though the reverse does not necessarily hold. Which symbols appear on your screen depends on how you render the bytes. You have probably seen capital A's with a tilde, followed by another character. That is usually utf-8 rendered as latin-1.

> As for the [mysql] section of my.ini, it sets "default-character-set=utf8".
> Same goes for [mysqld].

I think your code uses neither. My my.ini also has a [client] section, which may work. But that is why the "SET NAMES utf8" is sent: it sets the connection encoding to utf-8 regardless of which setting is in effect, whether chosen deliberately or by default.

If you want to see how the texts are stored in the database, use the mysqldump utility to dump a table and take a good look at its output (using an encoding-aware editor or a hexdump viewer, for instance).

Good luck,
-- 
Willem Bogaerts

Application smith
Kratz B.V.
http://www.kratz.nl/
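The point about bytes and rendering can be checked in a few lines. This is an illustrative sketch that uses the word from the original post; the variable names are invented for the example, and the hex string imitates roughly what a hexdump viewer would show for the stored bytes.

```python
word = u"a\xf1o"  # the Spanish word from the original post

# The same text as utf-8 bytes, plus a hexdump-style view of those bytes.
utf8_bytes = word.encode("utf-8")
hex_view = " ".join("%02x" % b for b in bytearray(utf8_bytes))
print(hex_view)

# Reading utf-8 bytes as latin-1 never fails, but gives the wrong symbols:
# a capital A with a tilde followed by another character.
misread = utf8_bytes.decode("latin-1")
print(repr(misread))

# The reverse direction does not necessarily work: these latin-1 bytes
# are not valid utf-8.
latin1_bytes = word.encode("latin-1")
try:
    latin1_bytes.decode("utf-8")
    reverse_ok = True
except UnicodeDecodeError:
    reverse_ok = False
print(reverse_ok)
```

If a dump of the table shows the two bytes c3 b1 where the ñ should be, the text was stored as utf-8. A single f1 byte would point to latin-1 or cp1252.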