request.setCharacterEncoding and character encoding


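Overview
Starting with Servlet 2.3, client-side content negotiation is supported. Server-side content negotiation has existed for a long time: the server uses the Content-Type header of the response to declare the type of the data it returns. With REST now all the rage, the client needs to negotiate too, for example when it submits a piece of XML or JSON with the PUT method to update an object on the server. The client can tell the server what type of data it is submitting through a URL suffix such as .xml or .json, or through the Content-Type HTTP header.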
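As a rough illustration of what "declaring the encoding through Content-Type" means, the sketch below pulls the charset parameter out of a Content-Type header value. It is not the servlet container's implementation; the class and method names (ContentTypeCharset, charsetOf) are made up for this example.

```java
import java.util.Locale;

final class ContentTypeCharset {

    /** Returns the charset parameter of a Content-Type value, or null if none is present. */
    static String charsetOf(String contentType) {
        if (contentType == null) {
            return null;
        }
        // A Content-Type value is a media type followed by ';'-separated parameters.
        for (String part : contentType.split(";")) {
            String p = part.trim();
            if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String value = p.substring("charset=".length()).trim();
                // Strip optional surrounding double quotes.
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                return value.isEmpty() ? null : value;
            }
        }
        return null;
    }
}
```

When the header carries no charset parameter, the method returns null, which corresponds to the case where the server has to decide the encoding some other way.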

 

Blog posts on this issue

【1】 Two approaches are offered

http://forum.springsource.org/showthread.php?t=14063

Hi,
I am working on a site that receives input in CJK.
This may be a naive question:
I am using org.springframework.web.servlet.DispatcherServlet as my servlet and I need to set CharacterEncoding on the HttpServletRequest.
I looked into the source code and I realized there is no code that calls setCharacterEncoding

I dug into the forum and found 2 solutions:

1. Use the CharacterEncodingFilter

2. Override DispatcherServlet.doService  as:
public class MyServlet extends DispatcherServlet {
    protected void doService(HttpServletRequest request, HttpServletResponse response) throws Exception {
        // Must run before anything reads request parameters.
        request.setCharacterEncoding("UTF-8");
        super.doService(request, response);
    }
}

I tried to do #1 for it seems to be more desirable (cleaner) solution.
in web.xml I added:
<filter>
<filter-name>CharacterEncodingFilter</filter-name>
<filter-class>org.springframework.web.filter.CharacterEncodingFilter</filter-class>
<init-param>
<param-name>encoding</param-name>
<param-value>UTF-8</param-value>
</init-param>
</filter>
<filter-mapping>
<filter-name>CharacterEncodingFilter</filter-name>
<url-pattern>/*</url-pattern>
</filter-mapping>

But it doesn't seem to work.
Could someone tell me how to do that? Use CharacterEncodingFilter?
Thanks in advance.

 

【2】 Caveats

http://www.junlu.com/msg/125726.html

With the 2.3 servlet API, there is a new method:
 request.setCharacterEncoding(String encoding)
This lets you tell the server a request's character encoding.
The client itself declares the encoding of the current request through Content-Type
(for example, the HTTP headers: Content-Type: text/html; charset=UTF-8
Content-Type: application/x-www-form-urlencoded; charset=UTF-8)

It is critical that  setCharacterEncoding is called BEFORE any
request.getParameter is called (or getReader). Otherwise, you are at the
mercy of the appserver for what you get back on the getParameter call.

For example, if setCharacterEncoding is not called, you could get a null
value back on getParameter("foo").
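To see why the encoding has to be settled before any parameter is parsed, the sketch below decodes the same percent-encoded bytes under two charsets with java.net.URLDecoder. It runs outside any servlet container and only imitates the decoding step; the names DecodeDemo and decode are made up for this example, and it assumes the JDK in use ships the GBK charset.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

final class DecodeDemo {

    /** Decodes a percent-encoded value, interpreting the raw bytes in the given charset. */
    static String decode(String percentEncoded, String charset) {
        try {
            return URLDecoder.decode(percentEncoded, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unsupported charset: " + charset, e);
        }
    }
}
```

With this helper, decode("%E6%B6%9B", "UTF-8") and decode("%CC%CE", "GBK") both yield the character U+6D9B, while decode("%E6%B6%9B", "GBK") does not: the bytes are the same, only the assumed charset differs. Once the container has parsed parameters under the wrong charset, a later setCharacterEncoding call cannot undo that.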

 

 

Solving the problem: negotiating the encoding with the client

private String negotiateCharacterEncoding(HttpServletRequest request, Map<String, String> outParam) {
	// HTTP standard: the encoding declared by the client (most browsers do not currently implement this).
	String clientEncoding = request.getCharacterEncoding();
	outParam.put("point", "HTTP standard");
	// Negotiation order:
	// 1. The HTTP standard way (set the HTTP header Content-Type = [text/html; charset=UTF-8]).
	// 2. A custom HTTP header (Client-Charset).
	// 3. A custom query parameter (Client-Charset), HTTP GET only. This covers requests issued by
	//    dynamic script tags, which cannot set HTTP headers. Charset names are plain ASCII, so
	//    extracting the value does not depend on the request encoding.
	// 4. If none of the above is present, the server falls back to its configured defaultEncoding.
	// 5. If the server has no defaultEncoding configured, the container default ISO-8... is used
	//    (the container default is also used if the encoding chosen above is not supported).
	if (clientEncoding == null || clientEncoding.trim().equals("")) {
		clientEncoding = request.getHeader("Client-Charset");
		outParam.put("point", "custom HTTP header");
		if (clientEncoding == null || clientEncoding.trim().equals("")) {
			// Do NOT read the parameter with request.getParameter("Client-Charset"):
			// request.setCharacterEncoding(encoding) only takes effect if no
			// request.getParameter call has been made before it.
			if ("GET".equalsIgnoreCase(request.getMethod())) {
				String queryString = request.getQueryString();
				if (queryString != null && !queryString.equals("")) {
					// Locate the start and end of the value of [Client-Charset].
					int startIndex = queryString.indexOf("Client-Charset=");
					int endIndex = -1;
					if (startIndex != -1) {
						startIndex = startIndex + "Client-Charset=".length();
						endIndex = queryString.indexOf("&", startIndex);
						if (endIndex == -1) { // Client-Charset is the last parameter
							// Strip a URL-based session ID, if any.
							int sessionidIndex = queryString.indexOf(";", startIndex);
							if (sessionidIndex != -1) {
								endIndex = sessionidIndex;
							} else {
								endIndex = queryString.length();
							}
						}
					}
					if (startIndex < endIndex) {
						clientEncoding = queryString.substring(startIndex, endIndex);
						outParam.put("point", "custom HTTP query parameter");
					}
				}
			}
		}
		if (clientEncoding == null || clientEncoding.trim().equals("")) {
			// defaultEncoding is the server-side configured value (a field declared elsewhere).
			clientEncoding = defaultEncoding;
			outParam.put("point", "server configuration");
		}
	}
	return clientEncoding;
}
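The query-string step above can also be tried outside a servlet container. The following is a sketch under stated assumptions, not the code used above: it splits on & and matches the parameter name exactly (the indexOf search above would also match a name that merely ends in Client-Charset), and it always cuts a trailing ;sessionid off the value. QueryCharset and clientCharset are hypothetical names.

```java
final class QueryCharset {

    static final String PARAM = "Client-Charset";

    /** Returns the raw Client-Charset value from a query string, or null if it is absent or empty. */
    static String clientCharset(String queryString) {
        if (queryString == null || queryString.isEmpty()) {
            return null;
        }
        for (String pair : queryString.split("&")) {
            int eq = pair.indexOf('=');
            if (eq < 0 || !pair.substring(0, eq).equals(PARAM)) {
                continue; // not the parameter we are looking for
            }
            String value = pair.substring(eq + 1);
            // Drop a URL-based session ID such as ";12345" appended to the value.
            int semi = value.indexOf(';');
            if (semi >= 0) {
                value = value.substring(0, semi);
            }
            value = value.trim();
            return value.isEmpty() ? null : value;
        }
        return null;
    }
}
```

Because the value is read from the raw query string rather than through request.getParameter, it can be extracted before setCharacterEncoding is called, which is the point of the commented-out line in the method above.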

 

if (encoding != null) {
	try {
		// Note: when the request is forced to GBK, the upside is that clients no longer need to
		// URL-encode their GET requests. The downside is that a client submitting UTF-8 gets
		// its data decoded incorrectly.
		// See http://www.junlu.com/msg/125726.html
		request.setCharacterEncoding(encoding);
	} catch (Exception e) {
		log.error("Error setting character encoding to '" + encoding
				+ "' - ignoring.", e);
	}
}
 

Test cases

/modifyListener_test.htm?nick=繁體昵稱衝頂&mobile=13812345678
/modifyListener_test.htm?nick=%B7%B1%F3%77%EA%C7%B7%51%D0%6E%ED%94&mobile=13812345678&Client-Charset=GBK
/modifyListener_test.htm?nick=%E7%B9%81%E9%AB%94%E6%98%B5%E7%A8%B1%E8%A1%9D%E9%A0%82&mobile=13812345678&Client-Charset=UTF-8


/modifyListener_test.htm?nick=涛&mobile=13812345678
/modifyListener_test.htm?nick=%CC%CE&mobile=13812345678&Client-Charset=GBK
/modifyListener_test.htm?nick=%CC%CE&mobile=13812345678
/modifyListener_test.htm?nick=%E6%B6%9B&mobile=13812345678&Client-Charset=UTF-8
/modifyListener_test.htm?nick=%E6%B6%9B&mobile=13812345678  // Error: the input is UTF-8 but the server forces GBK (###nick=娑?,mobile=13812345678)
/modifyListener_test.htm?nick=%E6%B6%9B&mobile=13812345678&Client-Charset=UTF-8;12345
/modifyListener_test.htm?Client-Charset=UTF-8&nick=%E6%B6%9B&mobile=13812345678;12345 //nick=涛,mobile=13812345678;12345


/modifyListener_test.htm?nick=%E6%B6%9B&mobile=13812345678
and set the header:
Content-Type:    text/html; charset=UTF-8

Content-Type:    charset=UTF-8
// Negotiated or configured encoding: UTF-8; negotiation source: HTTP standard

 

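Reply to the question:

Passport has a global Filter that forces the encoding of every HTTP request to GBK, so URLEncode(UTF-8) cannot be supported. I considered the following approaches, and neither works:

1. Remove the global Filter and stop forcing GBK. Existing production traffic that is not encoded could then no longer be handled. With that setting, for unencoded data such as http://localhost/modifyListener_test.htm?nick=繁體昵稱衝頂&mobile=13812345678 the extracted nick would be wrong, so other parts of the production site would be affected.

2. Manually convert from GBK back to UTF-8. This works for some data but not for all of it (GBK and UTF-8 are not in a subset/superset relationship, and parts of them conflict):
new String(nick.getBytes("GBK"), "UTF-8")
For http://localhost/modifyListener_test.htm?nick=繁體昵稱衝頂&mobile=13812345678 this recovers "繁體昵稱衝頂", but for http://localhost/modifyListener_test.htm?nick=涛&mobile=13812345678 the character "涛" (as in "波涛") fails to convert.

A workable solution for now is to negotiate through HTTP headers, which requires you to add one parameter to your requests:

1. Add the Content-Type HTTP header and set its value to charset=UTF-8. (Note: setting a Client-Charset HTTP header with the value UTF-8 also works, provided Content-Type has not been set to some other value.) With this in place, nick supports URLEncode(UTF-8).

2. For requests issued through dynamic script tags, which cannot set HTTP headers, the backend supports the query parameter Client-Charset. For example, submit the HTTP request like this:
nick=%E6%B6%9B&mobile=13812345678&Client-Charset=UTF-8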
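The failure of the manual GBK-to-UTF-8 conversion can be reproduced with a small sketch. It simulates a client sending UTF-8 bytes, the server decoding them as GBK, and application code then applying the new String(nick.getBytes("GBK"), "UTF-8") repair. RoundTripDemo and its method names are made up for this example, and it assumes the JDK in use ships the GBK charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class RoundTripDemo {

    static final Charset GBK = Charset.forName("GBK");

    /** UTF-8 bytes are wrongly decoded as GBK, then "repaired" by re-encoding as GBK and decoding as UTF-8. */
    static String decodeAsGbkThenRepair(String original) {
        byte[] sent = original.getBytes(StandardCharsets.UTF_8); // what the client put on the wire
        String misdecoded = new String(sent, GBK);               // what the forced-GBK filter produces
        return new String(misdecoded.getBytes(GBK), StandardCharsets.UTF_8);
    }

    /** True if the repair gives back exactly the original text. */
    static boolean survives(String original) {
        return original.equals(decodeAsGbkThenRepair(original));
    }
}
```

The character U+6D9B is three bytes in UTF-8 (E6 B6 9B). Decoded as GBK, the first two bytes form one character and the third is left dangling, so it is replaced during decoding and the original byte is lost; survives("\u6d9b") is therefore false. Text whose UTF-8 form happens to split into valid GBK byte pairs can come back intact, which is why the conversion appears to work for some inputs only.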