[寒江孤叶丶的Cocos2d-x之旅_36]用LUA实现UTF8的字符串基本操作 UTF8字符串长度,UTF8字符串剪裁等

原创文章,欢迎转载,转载请注明:文章来自[寒江孤叶丶的Cocos2d-x之旅系列]

博客地址:http://blog.csdn.net/qq446569365

一个用于UTF8字符串操作的类,功能比较齐全,包括:

string.utf8len UTF8字符串长度

string.utf8sub 对UTF8字符串进行切割

string.utf8upper 大写转换

string.utf8lower 小写转换

string.utf8reverse 字符串逆转


Github下载地址:https://github.com/ArcherPeng/utf8String4LUA


--===========================--
--===========================--

-- $Id: utf8.lua 147 2007-01-04 00:57:00Z pasta $
--
-- Provides UTF-8 aware string functions implemented in pure lua:
-- * string.utf8len(s)
-- * string.utf8sub(s,i,j)
-- * string.utf8reverse(s)
--
-- If utf8data.lua (containing the lower<->upper case mappings) is loaded,these
-- additional functions are available:
-- * string.utf8upper(s)
-- * string.utf8lower(s)
--
-- All functions behave as their non UTF-8 aware counterparts with the exception
-- that UTF-8 characters are used instead of bytes for all units.

--[[
Copyright (c) 2006-2007,Kyle Smith
All rights reserved.

Redistribution and use in source and binary forms,with or without
modification,are permitted provided that the following conditions are met:

    * Redistributions of source code must retain the above copyright notice,this list of conditions and the following disclaimer.
    * Redistributions in binary form must reproduce the above copyright
      notice,this list of conditions and the following disclaimer in the
      documentation and/or other materials provided with the distribution.
    * Neither the name of the author nor the names of its contributors may be
      used to endorse or promote products derived from this software without
      specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES,INCLUDING,BUT NOT LIMITED TO,THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT,INDIRECT,INCIDENTAL,SPECIAL,EXEMPLARY,OR CONSEQUENTIAL
DAMAGES (INCLUDING,PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE,DATA,OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY,WHETHER IN CONTRACT,STRICT LIABILITY,OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE,EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
--]]

-- ABNF from RFC 3629
-- 
-- UTF8-octets = *( UTF8-char )
-- UTF8-char   = UTF8-1 / UTF8-2 / UTF8-3 / UTF8-4
-- UTF8-1      = %x00-7F
-- UTF8-2      = %xC2-DF UTF8-tail
-- UTF8-3      = %xE0 %xA0-BF UTF8-tail / %xE1-EC 2( UTF8-tail ) /
--               %xED %x80-9F UTF8-tail / %xEE-EF 2( UTF8-tail )
-- UTF8-4      = %xF0 %x90-BF 2( UTF8-tail ) / %xF1-F3 3( UTF8-tail ) /
--               %xF4 %x80-8F 2( UTF8-tail )
-- UTF8-tail   = %x80-BF
-- 

-- returns the number of bytes used by the UTF-8 character at byte i in s
-- also doubles as a UTF-8 character validator
local function utf8charbytes (s,i)
	-- argument defaults
	i = i or 1

	-- argument checking
	if type(s) ~= "string" then
		error("bad argument #1 to 'utf8charbytes' (string expected,got ".. type(s).. ")")
	end
	if type(i) ~= "number" then
		error("bad argument #2 to 'utf8charbytes' (number expected,got ".. type(i).. ")")
	end

	local c = s:byte(i)

	-- determine bytes needed for character,based on RFC 3629
	-- validate byte 1
	if c > 0 and c <= 127 then
		-- UTF8-1
		return 1

	elseif c >= 194 and c <= 223 then
		-- UTF8-2
		local c2 = s:byte(i + 1)

		if not c2 then
			error("UTF-8 string terminated early")
		end

		-- validate byte 2
		if c2 < 128 or c2 > 191 then
			error("Invalid UTF-8 character")
		end

		return 2

	elseif c >= 224 and c <= 239 then
		-- UTF8-3
		local c2 = s:byte(i + 1)
		local c3 = s:byte(i + 2)

		if not c2 or not c3 then
			error("UTF-8 string terminated early")
		end

		-- validate byte 2
		if c == 224 and (c2 < 160 or c2 > 191) then
			error("Invalid UTF-8 character")
		elseif c == 237 and (c2 < 128 or c2 > 159) then
			error("Invalid UTF-8 character")
		elseif c2 < 128 or c2 > 191 then
			error("Invalid UTF-8 character")
		end

		-- validate byte 3
		if c3 < 128 or c3 > 191 then
			error("Invalid UTF-8 character")
		end

		return 3

	elseif c >= 240 and c <= 244 then
		-- UTF8-4
		local c2 = s:byte(i + 1)
		local c3 = s:byte(i + 2)
		local c4 = s:byte(i + 3)

		if not c2 or not c3 or not c4 then
			error("UTF-8 string terminated early")
		end

		-- validate byte 2
		if c == 240 and (c2 < 144 or c2 > 191) then
			error("Invalid UTF-8 character")
		elseif c == 244 and (c2 < 128 or c2 > 143) then
			error("Invalid UTF-8 character")
		elseif c2 < 128 or c2 > 191 then
			error("Invalid UTF-8 character")
		end
		
		-- validate byte 3
		if c3 < 128 or c3 > 191 then
			error("Invalid UTF-8 character")
		end

		-- validate byte 4
		if c4 < 128 or c4 > 191 then
			error("Invalid UTF-8 character")
		end

		return 4

	else
		error("Invalid UTF-8 character")
	end
end


-- returns the number of characters in a UTF-8 string
local function utf8len (s)
	-- argument checking
	if type(s) ~= "string" then
		error("bad argument #1 to 'utf8len' (string expected,got ".. type(s).. ")")
	end

	local pos = 1
	local bytes = s:len()
	local len = 0

	while pos <= bytes and len ~= chars do
		local c = s:byte(pos)
		len = len + 1

		pos = pos + utf8charbytes(s,pos)
	end

	if chars ~= nil then
		return pos - 1
	end

	return len
end

-- install in the string library
if not string.utf8len then
	string.utf8len = utf8len
end


-- functions identically to string.sub except that i and j are UTF-8 characters
-- instead of bytes
local function utf8sub (s,j)
	-- argument defaults
	j = j or -1

	-- argument checking
	if type(s) ~= "string" then
		error("bad argument #1 to 'utf8sub' (string expected,got ".. type(s).. ")")
	end
	if type(i) ~= "number" then
		error("bad argument #2 to 'utf8sub' (number expected,got ".. type(i).. ")")
	end
	if type(j) ~= "number" then
		error("bad argument #3 to 'utf8sub' (number expected,got ".. type(j).. ")")
	end

	local pos = 1
	local bytes = s:len()
	local len = 0

	-- only set l if i or j is negative
	local l = (i >= 0 and j >= 0) or s:utf8len()
	local startChar = (i >= 0) and i or l + i + 1
	local endChar   = (j >= 0) and j or l + j + 1

	-- can't have start before end!
	if startChar > endChar then
		return ""
	end

	-- byte offsets to pass to string.sub
	local startByte,endByte = 1,bytes

	while pos <= bytes do
		len = len + 1

		if len == startChar then
			startByte = pos
		end

		pos = pos + utf8charbytes(s,pos)

		if len == endChar then
			endByte = pos - 1
			break
		end
	end

	return s:sub(startByte,endByte)
end

-- install in the string library
if not string.utf8sub then
	string.utf8sub = utf8sub
end


-- replace UTF-8 characters based on a mapping table
local function utf8replace (s,mapping)
	-- argument checking
	if type(s) ~= "string" then
		error("bad argument #1 to 'utf8replace' (string expected,got ".. type(s).. ")")
	end
	if type(mapping) ~= "table" then
		error("bad argument #2 to 'utf8replace' (table expected,got ".. type(mapping).. ")")
	end

	local pos = 1
	local bytes = s:len()
	local charbytes
	local newstr = ""

	while pos <= bytes do
		charbytes = utf8charbytes(s,pos)
		local c = s:sub(pos,pos + charbytes - 1)

		newstr = newstr .. (mapping[c] or c)

		pos = pos + charbytes
	end

	return newstr
end


-- identical to string.upper except it knows about unicode simple case conversions
local function utf8upper (s)
	return utf8replace(s,utf8_lc_uc)
end

-- install in the string library
if not string.utf8upper and utf8_lc_uc then
	string.utf8upper = utf8upper
end


-- identical to string.lower except it knows about unicode simple case conversions
local function utf8lower (s)
	return utf8replace(s,utf8_uc_lc)
end

-- install in the string library
if not string.utf8lower and utf8_uc_lc then
	string.utf8lower = utf8lower
end


-- identical to string.reverse except that it supports UTF-8
local function utf8reverse (s)
	-- argument checking
	if type(s) ~= "string" then
		error("bad argument #1 to 'utf8reverse' (string expected,got ".. type(s).. ")")
	end

	local bytes = s:len()
	local pos = bytes
	local charbytes
	local newstr = ""

	while pos > 0 do
		c = s:byte(pos)
		while c >= 128 and c <= 191 do
			pos = pos - 1
			c = s:byte(pos)
		end

		charbytes = utf8charbytes(s,pos)

		newstr = newstr .. s:sub(pos,pos + charbytes - 1)

		pos = pos - 1
	end

	return newstr
end

-- install in the string library
if not string.utf8reverse then
	string.utf8reverse = utf8reverse
end

版权声明:本文内容由互联网用户自发贡献,该文观点与技术仅代表作者本人。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如发现本站有涉嫌侵权/违法违规的内容, 请发送邮件至 dio@foxmail.com 举报,一经查实,本站将立刻删除。

相关推荐


    本文实践自 RayWenderlich、Ali Hafizji 的文章《How To Create Dynamic Textures with CCRenderTexture in Cocos2D 2.X》,文中使用Cocos2D,我在这里使用Cocos2D-x 2.1.4进行学习和移植。在这篇文章,将会学习到如何创建实时纹理、如何用Gimp创建无缝拼接纹
Cocos-code-ide使用入门学习地点:杭州滨江邮箱:appdevzw@163.com微信公众号:HopToad 欢迎转载,转载标注出处:http://blog.csdn.netotbaron/article/details/424343991.  软件准备 下载地址:http://cn.cocos2d-x.org/download 2.  简介2.1         引用C
第一次開始用手游引擎挺激动!!!进入正题。下载资源1:从Cocos2D-x官网上下载,进入网页http://www.cocos2d-x.org/download,点击Cocos2d-x以下的Download  v3.0,保存到自定义的文件夹2:从python官网上下载。进入网页https://www.python.org/downloads/,我当前下载的是3.4.0(当前最新
    Cocos2d-x是一款强大的基于OpenGLES的跨平台游戏开发引擎,易学易用,支持多种智能移动平台。官网地址:http://cocos2d-x.org/当前版本:2.0    有很多的学习资料,在这里我只做为自己的笔记记录下来,错误之处还请指出。在VisualStudio2008平台的编译:1.下载当前稳
1.  来源 QuickV3sample项目中的2048样例游戏,以及最近《最强大脑》娱乐节目。将2048改造成一款挑战玩家对数字记忆的小游戏。邮箱:appdevzw@163.com微信公众号:HopToadAPK下载地址:http://download.csdn.net/detailotbaron/8446223源码下载地址:http://download.csdn.net/
   Cocos2d-x3.x已经支持使用CMake来进行构建了,这里尝试以QtCreatorIDE来进行CMake构建。Cocos2d-x3.X地址:https://github.com/cocos2d/cocos2d-x1.打开QtCreator,菜单栏→"打开文件或项目...",打开cocos2d-x目录下的CMakeLists.txt文件;2.弹出CMake向导,如下图所示:设置
 下载地址:链接:https://pan.baidu.com/s/1IkQsMU6NoERAAQLcCUMcXQ提取码:p1pb下载完成后,解压进入build目录使用vs2013打开工程设置平台工具集,打开设置界面设置: 点击开始编译等待编译结束编译成功在build文件下会出现一个新文件夹Debug.win32,里面就是编译
分享一下我老师大神的人工智能教程吧。零基础!通俗易懂!风趣幽默!还带黄段子!希望你也加入到我们人工智能的队伍中来!http://www.captainbed.net前言上次用象棋演示了cocos2dx的基本用法,但是对cocos2dx并没有作深入的讨论,这次以超级马里奥的源代码为线索,我们一起来学习超级马里奥的实
1. 圆形音量button事实上作者的本意应该是叫做“电位计button”。可是我觉得它和我们的圆形音量button非常像,所以就这么叫它吧~先看效果:好了,不多解释,本篇到此为止。(旁白: 噗。就这样结束了?)啊才怪~我们来看看代码:[cpp] viewplaincopyprint?CCContro
原文链接:http://www.cnblogs.com/physwf/archive/2013/04/26/3043912.html为了进一步深入学习贯彻Cocos2d,我们将自己写一个场景类,但我们不会走的太远,凡是都要循序渐进,哪怕只前进一点点,那也至少是前进了,总比贪多嚼不烂一头雾水的好。在上一节中我们建
2019独角兽企业重金招聘Python工程师标准>>>cocos2d2.0之后加入了一种九宫格的实现,主要作用是用来拉伸图片,这样的好处在于保留图片四个角不变形的同时,对图片中间部分进行拉伸,来满足一些控件的自适应(PS: 比如包括按钮,对话框,最直观的形象就是ios里的短信气泡了),这就要求图
原文链接:http://www.cnblogs.com/linji/p/3599478.html1.环境和工具准备Win7VS2010/2012,至于2008v2版本之后似乎就不支持了。 2.安装pythonv.2.0版本之前是用vs模板创建工程的,到vs2.2之后就改用python创建了。到python官网下载版本2.7.5的,然后
环境:ubuntu14.04adt-bundle-linux-x86_64android-ndk-r9d-linux-x86_64cocos2d-x-3.0正式版apache-ant1.9.3python2.7(ubuntu自带)加入环境变量exportANDROID_SDK_ROOT=/home/yangming/adt-bundle-linux/sdkexportPATH=${PATH}:/$ANDROID_SDK_ROOTools/export
1开发背景游戏程序设计涉及了学科中的各个方面,鉴于目的在于学习与进步,本游戏《FlappyBird》采用了两个不同的开发方式来开发本款游戏,一类直接采用win32底层API来实现,另一类采用当前火热的cocos2d-x游戏引擎来开发本游戏。2需求分析2.1数据分析本项目要开发的是一款游
原文链接:http://www.cnblogs.com/linji/p/3599912.html//纯色色块控件(锚点默认左下角)CCLayerColor*ccc=CCLayerColor::create(ccc4(255,0,0,128),200,100);//渐变色块控件CCLayerGradient*ccc=CCLayerGradient::create(ccc4(255,0,0,
原文链接:http://www.cnblogs.com/linji/p/3599488.html//载入一张图片CCSprite*leftDoor=CCSprite::create("loading/door.png");leftDoor->setAnchorPoint(ccp(1,0.5));//设置锚点为右边中心点leftDoor->setPosition(ccp(240,160));/
为了答谢广大学员对智捷课堂以及关老师的支持,现购买51CTO学院关老师的Cocos2d-x课程之一可以送智捷课堂编写图书一本(专题可以送3本)。一、Cocos2d-x课程列表:1、Cocos2d-x入门与提高视频教程__Part22、Cocos2d-x数据持久化与网络通信__Part33、Cocos2d-x架构设计与性能优化内存优
Spawn让多个action同时执行。Spawn有多种不同的create方法,最终都调用了createWithTwoActions(FiniteTimeAction*action1,FiniteTimeAction*action2)方法。createWithTwoActions调用initWithTwoActions方法:对两个action变量初始化:_one=action1;_two=action2;如果两个a
需要环境:php,luajit.昨天在cygwin上安装php和luajit环境,这真特么是一个坑。建议不要用虚拟环境安装打包环境,否则可能会出现各种莫名问题。折腾了一下午,最终将环境转向linux。其中,luajit的安装脚本已经在quick-cocos2d-x-develop/bin/中,直接luajit_install.sh即可。我的lin
v3.0相对v2.2来说,最引人注意的。应该是对触摸层级的优化。和lambda回调函数的引入(嗯嗯,不枉我改了那么多类名。话说,每次cocos2dx大更新。总要改掉一堆类名函数名)。这些特性应该有不少人研究了,所以今天说点跟图片有关的东西。v3.0在载入图片方面也有了非常大改变,仅仅只是