TA的每日心情 | 开心 2021-3-12 23:18 |
---|
签到天数: 2 天 [LV.1]初来乍到
|
jdk中的Integer类是int对象的包装类,正常的Integer占用内存开销要比int大,比例大概是1:4 。
今天分享的代码是Integer类中的静态方法parseInt(String, int)。这个方法众所周知, 甚至在我们一开始学习编程时就尝试的写过这样的代码,一个正常的思路:
遍历输入的字符数组(java的字符串就是一个字符数组), 然后parse每个char,依据参数给定的进制,判断每个char是否满足,满足则继续, 否则抛出异常或中断,直到处理完毕所有字符,返回结果。
那么我们看看jdk给出的实现:
public static int parseInt(String s, int radix) throws NumberFormatException
{
if (s == null) {
throw new NumberFormatException("null");
}
if (radix < Character.MIN_RADIX) {
throw new NumberFormatException("radix " + radix +
" less than Character.MIN_RADIX");
}
if (radix > Character.MAX_RADIX) {
throw new NumberFormatException("radix " + radix +
" greater than Character.MAX_RADIX");
}
int result = 0;
boolean negative = false;
int i = 0, max = s.length();
int limit;
int multmin;
int digit;
if (max > 0) {
if (s.charAt(0) == "-") {
negative = true;
limit = Integer.MIN_VALUE;
i++;
} else {
limit = -Integer.MAX_VALUE;
}
multmin = limit / radix;
if (i < max) {
digit = Character.digit(s.charAt(i++),radix);
if (digit < 0) {
throw NumberFormatException.forInputString(s);
} else {
result = -digit;
}
}
while (i < max) {
// Accumulating negatively avoids surprises near MAX_VALUE
digit = Character.digit(s.charAt(i++),radix);
if (digit < 0) {
throw NumberFormatException.forInputString(s);
}
if (result < multmin) {
throw NumberFormatException.forInputString(s);
}
result *= radix;
if (result < limit + digit) {
throw NumberFormatException.forInputString(s);
}
result -= digit;
}
} else {
throw NumberFormatException.forInputString(s);
}
if (negative) {
if (i > 1) {
return result;
} else { /* Only got "-" */
throw NumberFormatException.forInputString(s);
}
} else {
return -result;
}
}
过程就是按照思路来的,但是更全面一些,首先做一些参数检查,然后定义了局部变量用于计算:
result是对应的int结果,negative对应是否是负数的判断,i是遍历用的索引指针,max代表字符串的长度,
limit是合法数字的上限(下限),digit是当前扫描到的字符对应的数字,multmin是在做乘法计算时能走到的合法下限。
严谨是这段程序最大的特点,因为有符号int的上下限是-2147483648~2147483647,可见负数表达的范围比正数多一个,
这样就好理解为什么在开头要把limit全部表达为负数(下限),这样的操作减少了后续的判断,可以一步到位,
相当于二者选择取其大一样,大的包含了小的。同理,那么multmin也就是负数了,
而且可以认为是只和进制参数radix有关系。接着每个char的扫描计算digit利用到了Character.digit(char,int) 方法,
这个方法就是在调用CharacterDataLatin1.digit(codePoint, radix) 方法,
而这个新的方法其实只是去静态数组中取个映射而已。最后当顺利的执行完while循环后,result结果也就计算好了。
作为程序设计人员,我最初接触的语言是C++,当初用到的库函数是atoi,那么我们看看atoi的库标准实现:
- int
- atoi(str)
- const char *str;
- {
- _DIAGASSERT(str != NULL);
- return((int)strtol(str, (char **)NULL, 10));
- }
复制代码
- 其中调用了strtol方法,参数传递的radix是10,也就是说我们常用的atoi是默认转化字符串到10进制的。
- 其中开始时还进行了一个trim的操作,而且支持16进制的0x开头,可谓完全的尽善尽美啊。
复制代码
- #define _FUNCNAME strtol
- #define __INT long
- #define __INT_MIN LONG_MIN
- #define __INT_MAX LONG_MAX
复制代码
- __INT
- _FUNCNAME(const char *nptr, char **endptr, int base)
- {
- const char *s;
- __INT acc, cutoff;
- char c;
- int i, neg, any, cutlim;
- _DIAGASSERT(nptr != NULL);
- /* endptr may be NULL */
- /* check base value */
- if (base && (base < 2 || base > 36)) {
- errno = EINVAL;
- return(0);
- }
- /*
- * Skip white space and pick up leading +/- sign if any.
- * If base is 0, allow 0x for hex and 0 for octal, else
- * assume decimal; if base is already 16, allow 0x.
- */
- s = nptr;
- do {
- c = *s++;
- } while (isspace(c));
- if (c == "-") {
- neg = 1;
- c = *s++;
- } else {
- neg = 0;
- if (c == "+")
- c = *s++;
- }
- if ((base == 0 || base == 16) &&
- c == "0" && (*s == "x" || *s == "X")) {
- c = s[1];
- s += 2;
- base = 16;
- }
- if (base == 0)
- base = c == "0" ? 8 : 10;
- /*
- * Compute the cutoff value between legal numbers and illegal
- * numbers. That is the largest legal value, divided by the
- * base. An input number that is greater than this value, if
- * followed by a legal input character, is too big. One that
- * is equal to this value may be valid or not; the limit
- * between valid and invalid numbers is then based on the last
- * digit. For instance, if the range for longs is
- * [-2147483648..2147483647] and the input base is 10,
- * cutoff will be set to 214748364 and cutlim to either
- * 7 (neg==0) or 8 (neg==1), meaning that if we have accumulated
- * a value > 214748364, or equal but the next digit is > 7 (or 8),
- * the number is too big, and we will return a range error.
- *
- * Set any if any `digits" consumed; make it negative to indicate
- * overflow.
- */
- cutoff = neg ? __INT_MIN : __INT_MAX;
- cutlim = (int)(cutoff % base);
- cutoff /= base;
- if (neg) {
- if (cutlim > 0) {
- cutlim -= base;
- cutoff += 1;
- }
- cutlim = -cutlim;
- }
- for (acc = 0, any = 0;; c = *s++) {
- if (!isascii(c))
- break;
- if (isdigit(c))
- i = c - "0";
- else if (isalpha(c))
- i = c - (isupper(c) ? "A" - 10 : "a" - 10);
- else
- break;
- if (i >= base)
- break;
- if (any < 0)
- continue;
- if (neg) {
- if (acc < cutoff || (acc == cutoff && i > cutlim)) {
- any = -1;
- acc = __INT_MIN;
- errno = ERANGE;
- } else {
- any = 1;
- acc *= base;
- acc -= i;
- }
- } else {
- if (acc > cutoff || (acc == cutoff && i > cutlim)) {
- any = -1;
- acc = __INT_MAX;
- errno = ERANGE;
- } else {
- any = 1;
- acc *= base;
- acc += i;
- }
- }
- }
- if (endptr != 0)
- /* LINTED interface specification */
- *endptr = __DECONST(char *, (any ? s - 1 : nptr));
- return(acc);
- }
复制代码
- 当然,类似的代码还有很多,这里只列出了两大语言的库实现,总体思路是一致的,
- 当我们设计api时,这种编程思路和风格以及功能的考虑是我们需要学习的。
复制代码
- 下面这两篇stackoverflow的问答给出了一些比较全面的c风格代码,可以参考,这里不贴全文只给link:
复制代码
- [url=http://stackoverflow.com/questions/194465/how-to-parse-a-string-to-an-int-in-c]http://stackoverflow.com/questions/194465/how-to-parse-a-string-to-an-int-in-c[/url]
复制代码
- [url=http://stackoverflow.com/questions/4442658/c-parse-int-from-string]http://stackoverflow.com/questions/4442658/c-parse-int-from-string[/url]
复制代码
|
|