Which Locale should I specify when calling String#toLowerCase?

Question

In Java, the String#toLowerCase method uses the default system Locale to determine how to handle lowercasing. If I am lowercasing some ASCII text and want to be sure it is processed as expected, which Locale should I use?

EDIT: I'm mainly concerned about programming identifiers such as table and column names in a schema. As such, I want English lowercasing to apply.

The documentation for Locale.ROOT states that it is the language/country neutral locale for locale-sensitive operations.

Locale.ENGLISH would presumably also be a safe choice.

Solution

Yes, Locale.ENGLISH is a safe choice for case operations on things like programming language identifiers and URL parts. It doesn't involve any special casing rules, and in the ENGLISH locale every 7-bit ASCII character case-converts to a 7-bit ASCII character.

That is not true of every locale. In Turkish, the 'I' and 'i' characters are not case-converted to one another.
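The difference can be sketched in a few lines of Java. The class name IdentifierCase and the method normalize are hypothetical names chosen for illustration; only String#toLowerCase(Locale), Locale.ENGLISH and Locale.forLanguageTag come from the JDK.

```java
import java.util.Locale;

// Hypothetical helper for illustration; not part of any library.
final class IdentifierCase {
    // Turkish locale, built from its BCP 47 language tag.
    static final Locale TURKISH = Locale.forLanguageTag("tr");

    private IdentifierCase() {}

    // Lowercases an identifier with a fixed locale, so the result
    // does not depend on the JVM's default locale.
    static String normalize(String identifier) {
        return identifier.toLowerCase(Locale.ENGLISH);
    }

    public static void main(String[] args) {
        String column = "TITLE";
        // English rules: ASCII 'I' becomes ASCII 'i'.
        System.out.println(normalize(column).equals("title"));  // true
        // Turkish rules: 'I' becomes dotless i (U+0131), so the
        // result is no longer the ASCII string "title".
        System.out.println(column.toLowerCase(TURKISH).equals("title"));  // false
    }
}
```

A bare toLowerCase() call behaves like the second case whenever the JVM's default locale happens to be Turkish, which is why passing the locale explicitly matters for identifiers.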

"Dotted and dotless I" explains:

The Turkish alphabet, which is a variant of the Latin alphabet, includes two distinct versions of the letter I, one dotted and the other dotless.

In Unicode, U+0131 is a lower case letter dotless i (ı). U+0130 (İ) is capital i with dot. ISO-8859-9 has them at positions 0xFD and 0xDD respectively. In normal typography, when lower case i is combined with other diacritics, the dot is generally removed before the diacritic is added; however, Unicode still lists the equivalent combining sequences as including the dotted i, since logically it is the normal dotted i character that is being modified.

Most Unicode software uppercases ı to I and lowercases İ to i, but, unless specifically set up for Turkish, it lowercases I to i and uppercases i to I. Thus uppercasing then lowercasing, or vice versa, changes the letters.
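The round-trip problem described in the quote can be reproduced in Java by mixing locales between the two conversions. This is only a sketch: RoundTripDemo and roundTrip are hypothetical names, and the mixed-locale calls stand in for code that runs under different default locales on different machines.

```java
import java.util.Locale;

// Hypothetical demo class for illustration.
final class RoundTripDemo {
    static final Locale TURKISH = Locale.forLanguageTag("tr");

    private RoundTripDemo() {}

    // Uppercases with one locale, then lowercases with another.
    static String roundTrip(String s, Locale upper, Locale lower) {
        return s.toUpperCase(upper).toLowerCase(lower);
    }

    public static void main(String[] args) {
        String original = "title";
        // Same locale in both directions: the text survives.
        System.out.println(original.equals(
                roundTrip(original, Locale.ENGLISH, Locale.ENGLISH)));  // true
        // English uppercasing, Turkish lowercasing: 'I' becomes
        // dotless i (U+0131), so the letters change.
        System.out.println(original.equals(
                roundTrip(original, Locale.ENGLISH, TURKISH)));  // false
        // Turkish uppercasing, English lowercasing: 'i' becomes
        // capital I with dot (U+0130), which does not come back as plain 'i'.
        System.out.println(original.equals(
                roundTrip(original, TURKISH, Locale.ENGLISH)));  // false
    }
}
```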

The list of special exceptions is maintained at http://unicode.org/Public/UNIDATA/SpecialCasing.txt

# ================================================================================

# Turkish and Azeri

# I and i-dotless; I-dot and i are case pairs in Turkish and Azeri
# The following rules handle those cases.

0130; 0069; 0130; 0130; tr; # LATIN CAPITAL LETTER I WITH DOT ABOVE
0130; 0069; 0130; 0130; az; # LATIN CAPITAL LETTER I WITH DOT ABOVE

# When lowercasing, remove dot_above in the sequence I + dot_above, which will turn into i.
# This matches the behavior of the canonically equivalent I-dot_above

0307; ; 0307; 0307; tr After_I; # COMBINING DOT ABOVE
0307; ; 0307; 0307; az After_I; # COMBINING DOT ABOVE

...
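To see the rules quoted above in action, the following sketch lowercases the affected sequences under the Turkish locale. It assumes the JDK applies these SpecialCasing entries in String#toLowerCase(Locale), which matches my understanding of its behavior; SpecialCasingDemo and lower are hypothetical names.

```java
import java.util.Locale;

// Hypothetical demo class for illustration.
final class SpecialCasingDemo {
    static final Locale TURKISH = Locale.forLanguageTag("tr");

    private SpecialCasingDemo() {}

    static String lower(String s) {
        return s.toLowerCase(TURKISH);
    }

    public static void main(String[] args) {
        // Rule "0130; 0069; ... tr": capital I with dot above lowercases to plain 'i'.
        System.out.println(lower("\u0130").equals("i"));  // true
        // Rule "0307; ; ... tr After_I": the combining dot after 'I' is dropped,
        // so I + dot_above also lowercases to plain 'i'.
        System.out.println(lower("I\u0307").equals("i"));  // true
        // The After_I condition is specific to 'I': after 'J' the dot is kept.
        System.out.println(lower("J\u0307").equals("j\u0307"));  // true
    }
}
```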
