The difference between URIEncoding and useBodyEncodingForURI

Date: 2019-12-08

Reposted from: http://blog.itpub.NET/29254281/viewspace-1073278/

Tomcat offers two Connector attributes for fixing garbled request text: URIEncoding and useBodyEncodingForURI. Their descriptions below are taken from the official Apache Tomcat manual.

URIEncoding	This specifies the character encoding used to decode the URI bytes, after %xx decoding the URL. If not specified, ISO-8859-1 will be used.
useBodyEncodingForURI	This specifies if the encoding specified in contentType should be used for URI query parameters, instead of using the URIEncoding. This setting is present for compatibility with Tomcat 4.1.x, where the encoding specified in the contentType, or explicitly set using Request.setCharacterEncoding method was also used for the parameters from the URL. The default value is false.

http://tomcat.apache.org/tomcat-7.0-doc/config/http.html

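As the figure above shows, garbled Chinese text tends to appear in two places: when the name of the requested resource is Chinese, and when the value of a query parameter contains Chinese.
To complicate matters, a browser may use two different encodings, one for the URL path and one for the query parameters.
useBodyEncodingForURI applies only to the query string (the "author=君山" part in the figure); it has no effect on how the path portion of the URL/URI is decoded.
The tests below, run on Windows, check Chrome, Firefox, and IE for garbled output when requesting a Chinese-named resource with a Chinese parameter.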

The table below lists the encodings the three browsers use. IE's URI encoding can be switched to UTF-8.

	Default URI encoding	Default query parameter encoding
Chrome	UTF-8	UTF-8
Firefox	UTF-8	GBK
IE	GBK	GBK
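
To make the table concrete, here is a minimal sketch of what the same query value looks like on the wire under each encoding. It uses only `java.net.URLEncoder` as a stand-in for the browser; the class name `BrowserEncodingDemo` and the helper `percentEncode` are made up for this illustration, and it assumes the JVM ships the GBK charset (standard JDK builds do).

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class BrowserEncodingDemo {
    // Percent-encode text the way a browser would if it used the given charset.
    // Hypothetical helper for illustration only.
    public static String percentEncode(String text, String charset) {
        try {
            return URLEncoder.encode(text, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        // The two-character author name from the example query, written as
        // Unicode escapes so the source file's own encoding cannot interfere.
        String author = "\u541B\u5C71";
        // UTF-8 uses three bytes per character here, GBK uses two.
        System.out.println("UTF-8: author=" + percentEncode(author, "UTF-8"));
        System.out.println("GBK:   author=" + percentEncode(author, "GBK"));
    }
}
```

With UTF-8 the value becomes `%E5%90%9B%E5%B1%B1` (six escaped bytes); with GBK it is four escaped bytes. The server receives only these bytes and has to guess which charset produced them, which is where URIEncoding comes in.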

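1. Tomcat's URIEncoding set to UTF-8
Chrome works correctly.
Firefox reaches the resource, but the Chinese in the query parameter is garbled.
IE cannot reach the resource.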
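
The mismatch behind these results can be reproduced without a server. The sketch below is not the test code used in the original post; it only imitates the decoding step with `java.net.URLDecoder`, treating its charset argument as a stand-in for URIEncoding. The class and method names are hypothetical, and the GBK charset is assumed to be available in the JVM.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

class UriEncodingMismatchDemo {
    // The author name from the example query, as Unicode escapes.
    public static final String AUTHOR = "\u541B\u5C71";

    // Stand-in for the browser: escape the text using the browser's charset.
    public static String browserEncode(String text, String browserCharset) {
        try {
            return URLEncoder.encode(text, browserCharset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + browserCharset, e);
        }
    }

    // Stand-in for the server: undo %xx escaping, then read the bytes
    // in the charset the server was configured with.
    public static String serverDecode(String escaped, String serverCharset) {
        try {
            return URLDecoder.decode(escaped, serverCharset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + serverCharset, e);
        }
    }

    public static void main(String[] args) {
        String utf8Query = browserEncode(AUTHOR, "UTF-8"); // Chrome-like sender
        String gbkQuery = browserEncode(AUTHOR, "GBK");    // Firefox/IE-like sender

        // Server decodes everything as UTF-8, as in test 1.
        System.out.println("UTF-8 sender intact: " + AUTHOR.equals(serverDecode(utf8Query, "UTF-8")));
        System.out.println("GBK sender intact:   " + AUTHOR.equals(serverDecode(gbkQuery, "UTF-8")));
    }
}
```

The first comparison succeeds and the second fails: GBK bytes are not valid UTF-8 for the same characters, so the decoded string no longer matches the original. The same reasoning applies to IE's GBK-encoded path, which the server cannot map back to the resource name.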

The test code is as follows.

The test results are as follows:

2. Set IE's URI encoding to UTF-8, enable useBodyEncodingForURI, and set the request character set to GBK.

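When the URI and the query parameters use two different encodings, useBodyEncodingForURI can help: it makes Tomcat decode the query string with the character set configured for the HTTP body.
Once IE is set to "Send UTF-8 URLs", all three browsers use UTF-8 as the URI encoding, but IE and Firefox still encode query parameters in GBK while Chrome encodes them in UTF-8. In this configuration IE and Firefox therefore both work correctly, whereas Chrome can reach the resource but its Chinese query parameter is garbled.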
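
The rule described here can be sketched in a few lines. This is a simplified model, not Tomcat's implementation: `queryCharset` and `decodeQueryValue` are hypothetical helpers, the request character set is assumed to have been set explicitly (for example via `setCharacterEncoding`), and only the query value is modeled, since the path is always decoded with URIEncoding.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

class BodyEncodingForUriDemo {
    // The author name from the example query, as Unicode escapes.
    public static final String AUTHOR = "\u541B\u5C71";

    // Simplified selection rule (an assumption of this sketch): with the flag on,
    // the query string follows the request/body charset; otherwise URIEncoding.
    public static String queryCharset(String uriEncoding, String requestEncoding,
                                      boolean useBodyEncodingForURI) {
        return useBodyEncodingForURI ? requestEncoding : uriEncoding;
    }

    // Decode one escaped query value under the selected charset.
    public static String decodeQueryValue(String escaped, String uriEncoding,
                                          String requestEncoding, boolean useBodyEncodingForURI) {
        String charset = queryCharset(uriEncoding, requestEncoding, useBodyEncodingForURI);
        try {
            return URLDecoder.decode(escaped, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    // Stand-in for the browser: escape the value using the browser's query charset.
    public static String browserEncode(String text, String charset) {
        try {
            return URLEncoder.encode(text, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        // Test 2 setup: URIEncoding=UTF-8, request charset GBK, flag enabled.
        String fromFirefoxOrIe = browserEncode(AUTHOR, "GBK");
        String fromChrome = browserEncode(AUTHOR, "UTF-8");
        System.out.println("GBK query intact:   "
                + AUTHOR.equals(decodeQueryValue(fromFirefoxOrIe, "UTF-8", "GBK", true)));
        System.out.println("UTF-8 query intact: "
                + AUTHOR.equals(decodeQueryValue(fromChrome, "UTF-8", "GBK", true)));
    }
}
```

Under these assumptions the GBK-encoded value survives and the UTF-8-encoded one does not, which matches the outcome reported for Firefox/IE versus Chrome. Switching the flag off reverses the two results, because the query then falls back to URIEncoding.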

Test results:

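The experiments lead to these conclusions:
1. Both URIEncoding and useBodyEncodingForURI can be used to address garbled Chinese text.
2. A browser may use two different encodings for the URI and the query parameters; in that case, useBodyEncodingForURI can be used to adjust how the query parameters are decoded.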

Reference:
http://www.ibm.com/developerworks/cn/java/j-lo-chinesecoding/