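A string object is a "variable-length object".
In Python, the most important function for creating string (str) objects is PyUnicode_DecodeUTF8Stateful. The following Python statements eventually call PyUnicode_DecodeUTF8Stateful: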
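A quick way to observe the "variable-length" property from Python is to compare the sizes of strings of different lengths. This is only a small sketch that assumes CPython, where sys.getsizeof reports the object's own memory footprint; the exact byte counts depend on the interpreter version and platform, so none are shown here.

```python
import sys

# Three ASCII-only strings of increasing length.
samples = ["", "a" * 10, "a" * 100]

# sys.getsizeof reports the memory footprint of each object in bytes.
sizes = [sys.getsizeof(s) for s in samples]

for s, n in zip(samples, sizes):
    print(f"len={len(s):3d}  sizeof={n} bytes")
```

The reported size grows with the number of characters, because the character data is stored together with the object header rather than in a fixed-size structure.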
a = 'hello'
b = str('world')
While the source is being parsed, the string literal eventually reaches PyUnicode_DecodeUTF8Stateful. The call sequence is as follows:
// ast.c
ast_for_expr
=> ast_for_power
=> ast_for_atom_expr
=> ast_for_atom (case STRING)
=> parsestrplus
=> parsestr
// unicodeobject.c
=> PyUnicode_DecodeUTF8Stateful
// unicodeobject.c
PyObject *
PyUnicode_DecodeUTF8Stateful(const char *s, Py_ssize_t size,
                             const char *errors, Py_ssize_t *consumed)
{
    _PyUnicodeWriter writer;
    const char *starts = s;
    const char *end = s + size;
    Py_ssize_t startinpos;
    Py_ssize_t endinpos;
    const char *errmsg = "";
    PyObject *error_handler_obj = NULL;
    PyObject *exc = NULL;
    _Py_error_handler error_handler = _Py_ERROR_UNKNOWN;

    if (size == 0) {
        if (consumed)
            *consumed = 0;
        _Py_RETURN_UNICODE_EMPTY();
    }

    /* ASCII is equivalent to the first 128 ordinals in Unicode. */
    if (size == 1 && (unsigned char)s[0] < 128) {
        if (consumed)
            *consumed = 1;
        return get_latin1_char((unsigned char)s[0]);
    }

    _PyUnicodeWriter_Init(&writer);
    writer.min_length = size;
    if (_PyUnicodeWriter_Prepare(&writer, writer.min_length, 127) == -1)
        goto onError;

    writer.pos = ascii_decode(s, end, writer.data);
    s += writer.pos;
    while (s < end) {
        // Entered only when ASCII decoding consumed fewer bytes than the size passed in
    }

End:
    if (consumed)
        *consumed = s - starts;
    Py_XDECREF(error_handler_obj);
    Py_XDECREF(exc);
    return _PyUnicodeWriter_Finish(&writer);

onError:
    Py_XDECREF(error_handler_obj);
    Py_XDECREF(exc);
    _PyUnicodeWriter_Dealloc(&writer);
    return NULL;
}
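The role of the consumed parameter can be observed from Python through codecs.utf_8_decode. The sketch below assumes CPython, where this function is, to the best of my knowledge, a thin wrapper around PyUnicode_DecodeUTF8Stateful that passes a consumed pointer when final is False. The sample byte strings are made up for illustration.

```python
import codecs

# "中" encodes to three bytes in UTF-8: e4 b8 ad.
complete = "hello中".encode("utf-8")
truncated = complete[:-1]  # drop the last byte of the 3-byte sequence

# final=False asks the decoder to stop before an incomplete trailing
# sequence and to report how many bytes it actually consumed.
text_full, used_full = codecs.utf_8_decode(complete, "strict", False)
text_part, used_part = codecs.utf_8_decode(truncated, "strict", False)

# The two early-return cases in the C code: empty input and one ASCII byte.
text_empty, used_empty = codecs.utf_8_decode(b"", "strict", False)
text_one, used_one = codecs.utf_8_decode(b"a", "strict", False)

print(repr(text_full), used_full)    # 'hello中' 8
print(repr(text_part), used_part)    # 'hello' 5
print(repr(text_empty), used_empty)  # '' 0
print(repr(text_one), used_one)      # 'a' 1
```

With the truncated input, only the five ASCII bytes are consumed; the two leftover bytes are not treated as an error, so a caller can retry once more data arrives. This matches the `*consumed = s - starts` assignment at the End label.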
From the C source of PyUnicode_DecodeUTF8Stateful, we can see that: