Class RuleBasedNumberFormat
- All Implemented Interfaces: Serializable, Cloneable
A class that formats numbers according to a set of rules. This number formatter is typically used for spelling out numeric values in words (e.g., 25,376 as "twenty-five thousand three hundred seventy-six" or "vingt-cinq mille trois cents soixante-seize" or "fünfundzwanzigtausenddreihundertsechsundsiebzig"), but can also be used for other complicated formatting tasks, such as formatting a number of seconds as hours, minutes and seconds (e.g., 3,730 as "1:02:10").
The resources contain three predefined formatters for each locale: spellout, which spells out a value in words (123 is "one hundred twenty-three"); ordinal, which appends an ordinal suffix to the end of a numeral (123 is "123rd"); and duration, which shows a duration in seconds as hours, minutes, and seconds (123 is "2:03"). The client can also define more specialized RuleBasedNumberFormats by supplying programmer-defined rule sets.
The behavior of a RuleBasedNumberFormat is specified by a textual description that is either passed to the constructor as a String or loaded from a resource bundle. In its simplest form, the description consists of a semicolon-delimited list of rules. Each rule has a string of output text and a value or range of values it is applicable to. In a typical spellout rule set, the first twenty rules are the words for the numbers from 0 to 19:
zero; one; two; three; four; five; six; seven; eight; nine; ten; eleven; twelve; thirteen; fourteen; fifteen; sixteen; seventeen; eighteen; nineteen;
For larger numbers, we can use the preceding set of rules to format the ones place, and we only have to supply the words for the multiples of 10:
20: twenty[->>]; 30: thirty[->>]; 40: forty[->>]; 50: fifty[->>]; 60: sixty[->>]; 70: seventy[->>]; 80: eighty[->>]; 90: ninety[->>];
In these rules, the base value is spelled out explicitly and set off from the rule's output text with a colon. The rules are in a sorted list, and a rule is applicable to all numbers from its own base value to one less than the next rule's base value. The ">>" token is called a substitution and tells the formatter to isolate the number's ones digit, format it using this same set of rules, and place the result at the position of the ">>" token. Text in brackets is omitted if the number being formatted is an even multiple of 10 (the hyphen is a literal hyphen; 24 is "twenty-four," not "twenty four").
For even larger numbers, we can actually look up several parts of the number in the list:
100: << hundred[ >>];
The "<<" represents a new kind of substitution. The << isolates the hundreds digit (and any digits to its left), formats it using this same rule set, and places the result where the "<<" was. Notice also that the meaning of >> has changed: it now refers to both the tens and the ones digits. The meaning of both substitutions depends on the rule's base value. The base value determines the rule's divisor, which is the highest power of 10 that is less than or equal to the base value (the user can change this). To fill in the substitutions, the formatter divides the number being formatted by the divisor. The integral quotient is used to fill in the << substitution, and the remainder is used to fill in the >> substitution. The meaning of the brackets changes similarly: text in brackets is omitted if the value being formatted is an even multiple of the rule's divisor. The rules are applied recursively, so if a substitution is filled in with text that includes another substitution, that substitution is also filled in.
This rule covers values up to 999, at which point we add another rule:
1000: << thousand[ >>];
Again, the meanings of the brackets and substitution tokens shift because the rule's base value is a higher power of 10, changing the rule's divisor. This rule can actually be used all the way up to 999,999. This allows us to finish out the rules as follows:
1,000,000: << million[ >>]; 1,000,000,000: << billion[ >>]; 1,000,000,000,000: << trillion[ >>]; 1,000,000,000,000,000: OUT OF RANGE!;
Commas, periods, and spaces can be used in the base values to improve legibility and are ignored by the rule parser. The last rule in the list is customarily treated as an "overflow rule," applying to everything from its base value on up, and often (as in this example) being used to print out an error message or default representation. Notice also that the size of the major groupings in large numbers is controlled by the spacing of the rules: because in English we group numbers by thousand, the higher rules are separated from each other by a factor of 1,000.
To see how these rules actually work in practice, consider the following example: Formatting 25,340 with this rule set would work like this:
<< thousand >> | [the rule whose base value is 1,000 is applicable to 25,340] | |
twenty->> thousand >> | [25,340 over 1,000 is 25. The rule for 20 applies.] | |
twenty-five thousand >> | [25 mod 10 is 5. The rule for 5 is "five."] | |
twenty-five thousand << hundred >> | [25,340 mod 1,000 is 340. The rule for 100 applies.] | |
twenty-five thousand three hundred >> | [340 over 100 is 3. The rule for 3 is "three."] | |
twenty-five thousand three hundred forty | [340 mod 100 is 40. The rule for 40 applies. Since 40 divides evenly by 10, the hyphen and substitution in the brackets are omitted.] |
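The walkthrough above can be reproduced with a small plain-Java sketch. This is not ICU's implementation: it hard-codes the English rules from this section, and the class name `SpelloutSketch` is hypothetical. It only illustrates how the quotient fills "<<", how the remainder fills ">>", and how bracketed text is dropped on exact multiples of the divisor.

```java
class SpelloutSketch {
    // Rules 0..19: one word per value.
    private static final String[] ONES = {
        "zero", "one", "two", "three", "four", "five", "six", "seven", "eight", "nine",
        "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen", "sixteen",
        "seventeen", "eighteen", "nineteen"
    };
    // Rules 20, 30, ..., 90 ("20: twenty[->>];" and so on), indexed by tens digit.
    private static final String[] TENS = {
        "", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy", "eighty", "ninety"
    };
    // Base values of the "<< word[ >>]" rules, highest first.
    private static final long[] BASES = {
        1_000_000_000_000L, 1_000_000_000L, 1_000_000L, 1_000L, 100L
    };
    private static final String[] WORDS = {
        "trillion", "billion", "million", "thousand", "hundred"
    };

    static String format(long n) {
        if (n < 0) {
            throw new IllegalArgumentException("this rule set covers non-negative integers only");
        }
        if (n >= 1_000_000_000_000_000L) {
            return "OUT OF RANGE!"; // the overflow rule
        }
        if (n < 20) {
            return ONES[(int) n];
        }
        if (n < 100) {
            // Divisor is 10; the bracketed "->>" is omitted on exact multiples of 10.
            long rem = n % 10;
            return TENS[(int) (n / 10)] + (rem == 0 ? "" : "-" + format(rem));
        }
        for (int i = 0; i < BASES.length; i++) {
            if (n >= BASES[i]) {
                long quotient = n / BASES[i]; // fills "<<"
                long rem = n % BASES[i];      // fills ">>"
                return format(quotient) + " " + WORDS[i]
                        + (rem == 0 ? "" : " " + format(rem));
            }
        }
        throw new AssertionError("unreachable: n >= 100 always matches a rule");
    }

    public static void main(String[] args) {
        // prints "twenty-five thousand three hundred forty"
        System.out.println(format(25_340));
    }
}
```

The recursion mirrors the table: each call picks the applicable rule, then formats the quotient and remainder with the same rules.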
The above syntax suffices only to format positive integers. To format negative numbers, we add a special rule:
-x: minus >>;
This is called a negative-number rule, and is identified by "-x" where the base value would be. This rule is used to format all negative numbers. The >> token here means "find the number's absolute value, format it with these rules, and put the result here."
We also add a special rule called a fraction rule for numbers with fractional parts:
x.x: << point >>;
This rule is used for all positive non-integers (negative non-integers pass through the negative-number rule first and then through this rule). Here, the << token refers to the number's integral part, and the >> to the number's fractional part. The fractional part is formatted as a series of single-digit numbers (e.g., 123.456 would be formatted as "one hundred twenty-three point four five six").
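As a rough illustration of how the negative-number rule and the fraction rule combine, here is a plain-Java sketch. It is an assumption-laden stand-in rather than ICU code: the class `FractionRuleSketch` is hypothetical, the integer spellout is limited to 0..999 to stay short, and input is taken as a `BigDecimal` so the fractional digits are exact.

```java
import java.math.BigDecimal;

class FractionRuleSketch {
    private static final String[] SMALL = {
        "zero", "one", "two", "three", "four", "five", "six", "seven", "eight", "nine",
        "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen", "sixteen",
        "seventeen", "eighteen", "nineteen"
    };
    private static final String[] TENS = {
        "", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy", "eighty", "ninety"
    };

    // Integer spellout, limited to 0..999 for brevity.
    static String spellInt(long n) {
        if (n < 0 || n > 999) {
            throw new IllegalArgumentException("sketch only covers 0..999");
        }
        if (n < 20) return SMALL[(int) n];
        if (n < 100) {
            return TENS[(int) (n / 10)] + (n % 10 == 0 ? "" : "-" + spellInt(n % 10));
        }
        return spellInt(n / 100) + " hundred" + (n % 100 == 0 ? "" : " " + spellInt(n % 100));
    }

    static String spell(BigDecimal value) {
        if (value.signum() < 0) {
            // "-x: minus >>;" formats the absolute value with the same rules.
            return "minus " + spell(value.negate());
        }
        BigDecimal stripped = value.stripTrailingZeros();
        if (stripped.scale() <= 0) {
            return spellInt(stripped.longValueExact()); // no fractional part
        }
        // "x.x: << point >>;" : integral part, then the fraction one digit at a time.
        String plain = stripped.toPlainString();
        int dot = plain.indexOf('.');
        StringBuilder out = new StringBuilder(spellInt(Long.parseLong(plain.substring(0, dot))));
        out.append(" point");
        for (char c : plain.substring(dot + 1).toCharArray()) {
            out.append(' ').append(SMALL[c - '0']);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // prints "one hundred twenty-three point four five six"
        System.out.println(spell(new BigDecimal("123.456")));
    }
}
```

A negative non-integer such as -2.5 goes through the "minus" branch first and then through the fraction branch, matching the order described above.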
To see how this rule syntax is applied to various languages, examine the resource data.
There is actually much more flexibility built into the rule language than the description above shows. A formatter may own multiple rule sets, which can be selected by the caller, and which can use each other to fill in their substitutions. Substitutions can also be filled in with digits, using a DecimalFormat object. There is syntax that can be used to alter a rule's divisor in various ways. And there is provision for much more flexible fraction handling. A complete description of the rule syntax follows:
The description of a RuleBasedNumberFormat's behavior consists of one or more rule sets. Each rule set consists of a name, a colon, and a list of rules. A rule set name must begin with a % sign. Rule sets with names that begin with a single % sign are public: the caller can specify that they be used to format and parse numbers. Rule sets with names that begin with %% are private: they exist only for the use of other rule sets. If a formatter only has one rule set, the name may be omitted.
The user can also specify a special "rule set" named %%lenient-parse. The body of %%lenient-parse isn't a set of number-formatting rules, but a RuleBasedCollator description which is used to define equivalences for lenient parsing. For more information on the syntax, see RuleBasedCollator. For more information on lenient parsing, see setLenientParseMode(). Note: symbols that have syntactic meaning in collation rules, such as '&', have no particular meaning when appearing outside of the lenient-parse rule set.
The body of a rule set consists of an ordered, semicolon-delimited list of rules. Internally, every rule has a base value, a divisor, rule text, and zero, one, or two substitutions. These parameters are controlled by the description syntax, which consists of a rule descriptor, a colon, and a rule body.
A rule descriptor can take one of the following forms (text in italics is the name of a token):
bv: | bv specifies the rule's base value. bv is a decimal number expressed using ASCII digits. bv may contain spaces, periods, and commas, which are ignored. The rule's divisor is the highest power of 10 less than or equal to the base value. | |
bv/rad: | bv specifies the rule's base value. The rule's divisor is the highest power of rad less than or equal to the base value. | |
bv>: | bv specifies the rule's base value. To calculate the divisor, let the radix be 10, and the exponent be the highest exponent of the radix that yields a result less than or equal to the base value. Every > character after the base value decreases the exponent by 1. If the exponent is positive or 0, the divisor is the radix raised to the power of the exponent; otherwise, the divisor is 1. | |
bv/rad>: | bv specifies the rule's base value. To calculate the divisor, let the radix be rad, and the exponent be the highest exponent of the radix that yields a result less than or equal to the base value. Every > character after the radix decreases the exponent by 1. If the exponent is positive or 0, the divisor is the radix raised to the power of the exponent; otherwise, the divisor is 1. | |
-x: | The rule is a negative-number rule. | |
x.x: | The rule is an improper fraction rule. If the full stop in the middle of the rule name is replaced with the decimal point that is used in the language or DecimalFormatSymbols, then that rule will have precedence when formatting and parsing this rule. For example, some languages use the comma, and can thus be written as x,x instead. For example, you can use "x.x: << point >>;x,x: << comma >>;" to handle the decimal point that matches the language's natural spelling of the punctuation of either the full stop or comma. | |
0.x: | The rule is a proper fraction rule. If the full stop in the middle of the rule name is replaced with the decimal point that is used in the language or DecimalFormatSymbols, then that rule will have precedence when formatting and parsing this rule. For example, some languages use the comma, and can thus be written as 0,x instead. For example, you can use "0.x: point >>;0,x: comma >>;" to handle the decimal point that matches the language's natural spelling of the punctuation of either the full stop or comma. | |
x.0: | The rule is a default rule. If the full stop in the middle of the rule name is replaced with the decimal point that is used in the language or DecimalFormatSymbols, then that rule will have precedence when formatting and parsing this rule. For example, some languages use the comma, and can thus be written as x,0 instead. For example, you can use "x.0: << point;x,0: << comma;" to handle the decimal point that matches the language's natural spelling of the punctuation of either the full stop or comma. | |
Inf: | The rule for infinity. | |
NaN: | The rule for an IEEE 754 NaN (not a number). | |
nothing | If the rule's rule descriptor is left out, the base value is one plus the preceding rule's base value (or zero if this is the first rule in the list) in a normal rule set. In a fraction rule set, the base value is the same as the preceding rule's base value. |
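The divisor arithmetic for the four numeric descriptor forms can be sketched in plain Java as follows. This is only an illustration of the table above, not ICU's parser: the class `RuleDescriptorSketch` is hypothetical, the descriptor is passed without its trailing colon, and the special descriptors (-x, x.x, 0.x, x.0, Inf, NaN) are not handled.

```java
class RuleDescriptorSketch {
    final long baseValue;
    final int radix;
    final long divisor;

    private RuleDescriptorSketch(long baseValue, int radix, long divisor) {
        this.baseValue = baseValue;
        this.radix = radix;
        this.divisor = divisor;
    }

    /** Parses "bv", "bv/rad", "bv>" or "bv/rad>" (any number of trailing '>'). */
    static RuleDescriptorSketch parse(String descriptor) {
        // Count and strip the trailing '>' characters; each one lowers the exponent by 1.
        int end = descriptor.length();
        int shrink = 0;
        while (end > 0 && descriptor.charAt(end - 1) == '>') {
            end--;
            shrink++;
        }
        String[] parts = descriptor.substring(0, end).split("/", 2);
        // Spaces, periods, and commas in the base value are ignored.
        long baseValue = Long.parseLong(parts[0].replaceAll("[ .,]", ""));
        int radix = parts.length == 2 ? Integer.parseInt(parts[1].trim()) : 10;
        if (radix < 2) {
            throw new IllegalArgumentException("radix must be at least 2");
        }
        // Find the highest exponent with radix^exponent <= baseValue.
        // Comparing against baseValue / radix avoids overflowing 'power'.
        int exponent = 0;
        long power = 1;
        while (baseValue / radix >= power) {
            power *= radix;
            exponent++;
        }
        exponent -= shrink;
        // A negative exponent leaves the loop unexecuted, so the divisor stays 1.
        long divisor = 1;
        for (int i = 0; i < exponent; i++) {
            divisor *= radix;
        }
        return new RuleDescriptorSketch(baseValue, radix, divisor);
    }

    public static void main(String[] args) {
        RuleDescriptorSketch hours = parse("3600/60");
        // 3,730 seconds: quotient 1 (hours), remainder 130 (seconds left over)
        System.out.println(3730 / hours.divisor + " remainder " + 3730 % hours.divisor);
    }
}
```

With radix 60, the same quotient-and-remainder step that isolates decimal digits isolates hours, minutes, and seconds instead.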
A rule set may be either a regular rule set or a fraction rule set, depending on whether it is used to format a number's integral part (or the whole number) or a number's fractional part. Using a rule set to format a rule's fractional part makes it a fraction rule set.
Which rule is used to format a number is defined according to one of the following algorithms: If the rule set is a regular rule set, do the following:
- If the rule set includes a default rule (and the number was passed in as a double), use the default rule. (If the number being formatted was passed in as a long, the default rule is ignored.)
- If the number is negative, use the negative-number rule.
- If the number has a fractional part and is greater than 1, use the improper fraction rule.
- If the number has a fractional part and is between 0 and 1, use the proper fraction rule.
- Binary-search the rule list for the rule with the highest base value less than or equal to the number. If that rule has two substitutions, its base value is not an even multiple of its divisor, and the number is an even multiple of the rule's divisor, use the rule that precedes it in the rule list. Otherwise, use the rule itself.
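The last step of the list above (binary search plus the roll-back check) can be sketched like this. The `Rule` holder and the class name are hypothetical, and the earlier steps (default, negative-number, and fraction rules) are left out; this is a reading of the description, not ICU's source.

```java
import java.util.Arrays;
import java.util.List;

class RuleSelectionSketch {
    static final class Rule {
        final long baseValue;
        final long divisor;
        final int substitutions; // 0, 1, or 2

        Rule(long baseValue, long divisor, int substitutions) {
            this.baseValue = baseValue;
            this.divisor = divisor;
            this.substitutions = substitutions;
        }
    }

    /** Expects rules sorted by ascending base value. */
    static Rule findNormalRule(List<Rule> rules, long number) {
        // Binary search for the first rule whose base value is greater than the number.
        int lo = 0;
        int hi = rules.size();
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (rules.get(mid).baseValue <= number) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        if (lo == 0) {
            throw new IllegalArgumentException("no rule applies to " + number);
        }
        // The rule just before it has the highest base value <= number.
        int index = lo - 1;
        Rule candidate = rules.get(index);
        // Roll back when the rule has two substitutions, its base value is not a
        // multiple of its divisor, and the number is a multiple of the divisor.
        boolean rollBack = candidate.substitutions == 2
                && candidate.baseValue % candidate.divisor != 0
                && number % candidate.divisor == 0;
        return (rollBack && index > 0) ? rules.get(index - 1) : candidate;
    }

    public static void main(String[] args) {
        // Synthetic rule list chosen only to exercise the roll-back branch.
        List<Rule> rules = Arrays.asList(
                new Rule(0, 1, 0), new Rule(100, 100, 2), new Rule(110, 100, 2));
        System.out.println(findNormalRule(rules, 200).baseValue); // prints 100
        System.out.println(findNormalRule(rules, 250).baseValue); // prints 110
    }
}
```

In the synthetic list, 200 first lands on the rule with base value 110, then rolls back to the base-100 rule because 200 is an exact multiple of the divisor 100 while 110 is not.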
If the rule set is a fraction rule set, do the following:
- Ignore negative-number and fraction rules.
- For each rule in the list, multiply the number being formatted (which will always be between 0 and 1) by the rule's base value. Keep track of the distance between the result and the nearest integer.
- Use the rule that produced the result closest to zero in the above calculation. In the event of a tie or a direct hit, use the first matching rule encountered. (The idea here is to try each rule's base value as a possible denominator of a fraction. Whichever denominator produces the fraction closest in value to the number being formatted wins.) If the rule following the matching rule has the same base value, use it if the numerator of the fraction is anything other than 1; if the numerator is 1, use the original matching rule. (This is to allow singular and plural forms of the rule text without a lot of extra hassle.)
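The fraction-rule-set selection described above can be sketched as a loop over candidate denominators. This follows the prose description only; ICU's internal implementation may compute the same choice differently. The class name and the sample base values are hypothetical.

```java
class FractionRuleSetSketch {
    /**
     * Returns the index of the selected rule, given each rule's base value
     * (tried as a denominator) and a number between 0 and 1.
     */
    static int selectRule(long[] baseValues, double number) {
        int winner = 0;
        double best = Double.MAX_VALUE;
        for (int i = 0; i < baseValues.length; i++) {
            double scaled = number * baseValues[i];
            double distance = Math.abs(scaled - Math.rint(scaled));
            // Strict '<' keeps the first rule encountered when there is a tie.
            if (distance < best) {
                best = distance;
                winner = i;
                if (distance == 0) {
                    break; // direct hit
                }
            }
        }
        // Singular/plural pair: if the next rule has the same base value,
        // use it unless the numerator is 1.
        if (winner + 1 < baseValues.length && baseValues[winner + 1] == baseValues[winner]) {
            long numerator = Math.round(number * baseValues[winner]);
            if (numerator != 1) {
                winner++;
            }
        }
        return winner;
    }

    public static void main(String[] args) {
        // Pairs of rules for halves, thirds, and quarters (singular, then plural).
        long[] bases = {2, 2, 3, 3, 4, 4};
        System.out.println(selectRule(bases, 0.5));  // prints 0 (singular "half" rule)
        System.out.println(selectRule(bases, 0.75)); // prints 5 (plural "quarters" rule)
    }
}
```

For 0.75, the denominators 2 and 3 leave distances of 0.5 and 0.25, while 4 gives exactly 3, so the quarters pair wins and the numerator 3 selects its second (plural) rule.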
A rule's body consists of a string of characters terminated by a semicolon. The rule may include zero, one, or two substitution tokens, and a range of text in brackets. The brackets denote optional text (and may also include one or both substitutions). The exact meanings of the substitution tokens, and under what conditions optional text is omitted, depend on the syntax of the substitution token and the context. The rest of the text in a rule body is literal text that is output when the rule matches the number being formatted.
A substitution token begins and ends with a token character. The token character and the context together specify a mathematical operation to be performed on the number being formatted. An optional substitution descriptor specifies how the value resulting from that operation is used to fill in the substitution. The position of the substitution token in the rule body specifies the location of the resultant text in the original rule text.
The meanings of the substitution token characters are as follows:
>> | in normal rule | Divide the number by the rule's divisor and format the remainder | |
in negative-number rule | Find the absolute value of the number and format the result | ||
in fraction or default rule | Isolate the number's fractional part and format it. | ||
in rule in fraction rule set | Not allowed. | ||
>>> | in normal rule | Divide the number by the rule's divisor and format the remainder, but bypass the normal rule-selection process and just use the rule that precedes this one in this rule list. | |
in all other rules | Not allowed. | ||
<< | in normal rule | Divide the number by the rule's divisor and format the quotient | |
in negative-number rule | Not allowed. | ||
in fraction or default rule | Isolate the number's integral part and format it. | ||
in rule in fraction rule set | Multiply the number by the rule's base value and format the result. | ||
== | in all rule sets | Format the number unchanged | |
[] | in normal rule | Omit the optional text if the number is an even multiple of the rule's divisor | |
in negative-number rule | Not allowed. | ||
in improper-fraction rule | Omit the optional text if the number is between 0 and 1 (same as specifying both an x.x rule and a 0.x rule) | ||
in default rule | Omit the optional text if the number is an integer (same as specifying both an x.x rule and an x.0 rule) | ||
in proper-fraction rule | Not allowed. | ||
in rule in fraction rule set | Omit the optional text if multiplying the number by the rule's base value yields 1. | ||
$(cardinal,plural syntax)$ | in all rule sets | This provides the ability to choose a word based on the number divided by the radix to the power of the exponent of the base value for the specified locale, which is normally equivalent to the << value. This uses the cardinal plural rules from PluralFormat. All strings used in the plural format are treated as the same base value for parsing. | |
$(ordinal,plural syntax)$ | in all rule sets | This provides the ability to choose a word based on the number divided by the radix to the power of the exponent of the base value for the specified locale, which is normally equivalent to the << value. This uses the ordinal plural rules from PluralFormat. All strings used in the plural format are treated as the same base value for parsing. |
The substitution descriptor (i.e., the text between the token characters) may take one of three forms:
a rule set name | Perform the mathematical operation on the number, and format the result using the named rule set. | |
a DecimalFormat pattern | Perform the mathematical operation on the number, and format the result using a DecimalFormat with the specified pattern. The pattern must begin with 0 or #. | |
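nothing | Perform the mathematical operation on the number, and format the result using the rule set containing the current rule, except: you can't use an empty substitution descriptor with a == substitution; if you use an empty substitution descriptor with a >> substitution in a fraction rule, format the result one digit at a time using the rule set containing the current rule; and if you use an empty substitution descriptor with a << substitution in a rule in a fraction rule set, format the result using the default rule set for this formatter. |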
Whitespace is ignored between a rule set name and a rule set body, between a rule descriptor and a rule body, or between rules. If a rule body begins with an apostrophe, the apostrophe is ignored, but all text after it becomes significant (this is how you can have a rule's rule text begin with whitespace). There is no escape function: the semicolon is not allowed in rule set names or in rule text, and the colon is not allowed in rule set names. The characters beginning a substitution token are always treated as the beginning of a substitution token.
See the resource data and the demo program for annotated examples of real rule sets using these features.
-
Nested Class Summary
Nested classes/interfaces inherited from class com.ibm.icu.text.NumberFormat
NumberFormat.Field, NumberFormat.NumberFormatFactory, NumberFormat.NumberFormatShim, NumberFormat.SimpleNumberFormatFactory
Nested classes/interfaces inherited from class com.ibm.icu.text.UFormat
UFormat.SpanField
-
Field Summary
Fields
Modifier and Type | Field | Description
private BreakIterator | capitalizationBrkIter |
private boolean | capitalizationForListOrMenu |
private boolean | capitalizationForStandAlone |
private boolean | capitalizationInfoIsSet | Data for handling context-based capitalization
private static final boolean | DEBUG |
private DecimalFormat | decimalFormat | The NumberFormat used when lenient parsing numbers.
private DecimalFormatSymbols | decimalFormatSymbols | The DecimalFormatSymbols object that any DecimalFormat objects this formatter uses should use.
private NFRule | defaultInfinityRule | The rule used when dealing with infinity.
private NFRule | defaultNaNRule | The rule used when dealing with IEEE 754 NaN.
private NFRuleSet | defaultRuleSet | A pointer to the formatter's default rule set.
static final int | DURATION | Deprecated. ICU 74 Use MeasureFormat instead.
private boolean | lenientParse | Flag specifying whether lenient parse mode is on or off.
private String | lenientParseRules | If the description specifies lenient-parse rules, they're stored here until the collator is created.
private ULocale | locale | The formatter's locale.
private static final String[] | locnames |
private boolean | lookedForScanner |
private static final BigDecimal | MAX_VALUE |
private static final BigDecimal | MIN_VALUE |
static final int | NUMBERING_SYSTEM | Selector code that tells the constructor to create a numbering system formatter
static final int | ORDINAL | Selector code that tells the constructor to create an ordinal formatter
private RBNFPostProcessor | postProcessor | Post processor lazily constructed from the postProcessRules.
private String | postProcessRules | If the description specifies post-process rules, they're stored here until post-processing is required.
private String[] | publicRuleSetNames | The public rule set names.
private int | roundingMode | The formatter's rounding mode.
private static final String[] | rulenames |
 | ruleSetDisplayNames | Localizations for rule set names.
private NFRuleSet[] | ruleSets | The formatter's rule sets.
 | ruleSetsMap | The formatter's rule names mapped to rule sets.
private RbnfLenientScannerProvider | scannerProvider | Collator to be used in lenient parsing.
(package private) static final long | serialVersionUID |
static final int | SPELLOUT | Selector code that tells the constructor to create a spellout formatter
Fields inherited from class com.ibm.icu.text.NumberFormat
ACCOUNTINGCURRENCYSTYLE, CASHCURRENCYSTYLE, CURRENCYSTYLE, currentSerialVersion, FRACTION_FIELD, INTEGER_FIELD, INTEGERSTYLE, ISOCURRENCYSTYLE, NUMBERSTYLE, PERCENTSTYLE, PLURALCURRENCYSTYLE, SCIENTIFICSTYLE, STANDARDCURRENCYSTYLE
-
Constructor Summary
Constructors
Constructor | Description
RuleBasedNumberFormat(int format) | Creates a RuleBasedNumberFormat from a predefined description.
RuleBasedNumberFormat(ULocale locale, int format) | Creates a RuleBasedNumberFormat from a predefined description.
RuleBasedNumberFormat(String description) | Creates a RuleBasedNumberFormat that behaves according to the description passed in.
RuleBasedNumberFormat(String description, ULocale locale) | Creates a RuleBasedNumberFormat that behaves according to the description passed in.
RuleBasedNumberFormat(String description, String[][] localizations) | Creates a RuleBasedNumberFormat that behaves according to the description passed in.
RuleBasedNumberFormat(String description, String[][] localizations, ULocale locale) | Creates a RuleBasedNumberFormat that behaves according to the description passed in.
RuleBasedNumberFormat(String description, Locale locale) | Creates a RuleBasedNumberFormat that behaves according to the description passed in.
RuleBasedNumberFormat(Locale locale, int format) | Creates a RuleBasedNumberFormat from a predefined description.
-
Method Summary
Modifier and Type | Method | Description
private String | adjustForContext(String result) | Adjust capitalization of formatted result for display context
 | clone() | Duplicates this formatter.
(package private) PluralFormat | createPluralFormat(PluralRules.PluralType pluralType, String pattern) |
boolean | | Tests two RuleBasedNumberFormats for equality.
private String | extractSpecial(StringBuilder description, String specialName) | This extracts the special information from the rule sets before the main parsing starts.
(package private) NFRuleSet | findRuleSet(String name) | Returns the named rule set.
private String | | Bottleneck through which all the public format() methods that take a double pass.
 | | Formats the specified number according to the specified rule set.
 | format(double number, StringBuffer toAppendTo, FieldPosition ignore) | Formats the specified number using the formatter's default rule set.
private String | | Bottleneck through which all the public format() methods that take a long pass.
 | | Formats the specified number according to the specified rule set.
 | format(long number, StringBuffer toAppendTo, FieldPosition ignore) | Formats the specified number using the formatter's default rule set.
 | format(BigDecimal number, StringBuffer toAppendTo, FieldPosition pos) | NEW Implement com.ibm.icu.text.NumberFormat: Format a BigDecimal.
 | format(BigDecimal number, StringBuffer toAppendTo, FieldPosition pos) | NEW Implement com.ibm.icu.text.NumberFormat: Format a BigDecimal.
 | format(BigInteger number, StringBuffer toAppendTo, FieldPosition pos) | NEW Implement com.ibm.icu.text.NumberFormat: Format a BigInteger.
(package private) DecimalFormat | |
(package private) DecimalFormatSymbols | | Returns the DecimalFormatSymbols object that should be used by all DecimalFormat instances owned by this formatter.
(package private) NFRule | | Returns the default rule for infinity.
(package private) NFRule | | Returns the default rule for NaN.
(package private) NFRuleSet | | Returns a reference to the formatter's default rule set.
 | | Return the name of the current default rule set.
(package private) RbnfLenientScanner | | Returns the scanner to use for lenient parsing.
 | | Returns the lenient scanner provider.
private String[] | |
int | | Returns the rounding mode.
 | getRuleSetDisplayName(String ruleSetName) | Return the rule set display name for the provided rule set in the current default DISPLAY locale.
 | getRuleSetDisplayName(String ruleSetName, ULocale loc) | Return the rule set display name for the provided rule set and locale.
ULocale[] | | Return a list of locales for which there are locale-specific display names for the rule sets in this formatter.
String[] | | Return the rule set display names for the current default DISPLAY locale.
String[] | | Return the rule set display names for the provided locale.
String[] | | Returns a list of the names of all of this formatter's public rule sets.
int | hashCode() |
private void | | This function parses the description and uses it to build all of internal data structures that the formatter uses to do formatting
private void | initCapitalizationContextInfo(ULocale theLocale) | Set capitalizationForListOrMenu, capitalizationForStandAlone
private void | initLocalizations(String[][] localizations) | Take the localizations array and create a Map from the locale strings to the localization arrays.
boolean | | Returns true if lenient-parse mode is turned on.
 | parse(String text, ParsePosition parsePosition) | Parses the specified string, beginning at the specified position, according to this formatter's rules.
private void | postProcess(StringBuilder result, NFRuleSet ruleSet) | Post-process the rules if we have a post-processor.
private void | | Reads this object in from a stream.
void | setContext(DisplayContext context) | Set a particular DisplayContext value in the formatter, such as CAPITALIZATION_FOR_STANDALONE.
void | setDecimalFormatSymbols(DecimalFormatSymbols newSymbols) | Sets the decimal format symbols used by this formatter.
void | setDefaultRuleSet(String ruleSetName) | Override the default rule set to use.
void | setLenientParseMode(boolean enabled) | Turns lenient parse mode on and off.
void | setLenientScannerProvider(RbnfLenientScannerProvider scannerProvider) | Sets the provider for the lenient scanner.
void | setRoundingMode(int roundingMode) | Sets the rounding mode.
private StringBuilder | stripWhitespace(String description) | This function is used by init() to strip whitespace between rules (i.e., after semicolons).
 | toString() | Generates a textual description of this formatter.
private void | | Writes this object to a stream.
Methods inherited from class com.ibm.icu.text.NumberFormat
createInstance, format, format, format, format, format, format, format, format, getAvailableLocales, getAvailableULocales, getContext, getCurrency, getCurrencyInstance, getCurrencyInstance, getCurrencyInstance, getEffectiveCurrency, getInstance, getInstance, getInstance, getInstance, getInstance, getInstance, getIntegerInstance, getIntegerInstance, getIntegerInstance, getMaximumFractionDigits, getMaximumIntegerDigits, getMinimumFractionDigits, getMinimumIntegerDigits, getNumberInstance, getNumberInstance, getNumberInstance, getPattern, getPattern, getPatternForStyle, getPatternForStyleAndNumberingSystem, getPercentInstance, getPercentInstance, getPercentInstance, getScientificInstance, getScientificInstance, getScientificInstance, isGroupingUsed, isParseIntegerOnly, isParseStrict, parse, parseCurrency, parseObject, registerFactory, setCurrency, setGroupingUsed, setMaximumFractionDigits, setMaximumIntegerDigits, setMinimumFractionDigits, setMinimumIntegerDigits, setParseIntegerOnly, setParseStrict, unregister
Methods inherited from class java.text.Format
format, formatToCharacterIterator, parseObject
-
Field Details
-
serialVersionUID
static final long serialVersionUID
-
SPELLOUT
public static final int SPELLOUT
Selector code that tells the constructor to create a spellout formatter
-
ORDINAL
public static final int ORDINAL
Selector code that tells the constructor to create an ordinal formatter
-
DURATION
Deprecated. ICU 74 Use MeasureFormat instead.
Selector code that tells the constructor to create a duration formatter
-
NUMBERING_SYSTEM
public static final int NUMBERING_SYSTEM
Selector code that tells the constructor to create a numbering system formatter
-
ruleSets
The formatter's rule sets.
-
ruleSetsMap
The formatter's rule names mapped to rule sets.
-
defaultRuleSet
A pointer to the formatter's default rule set. This is always included in ruleSets.
-
locale
The formatter's locale. This is used to create DecimalFormatSymbols and Collator objects.
-
roundingMode
private int roundingMode
The formatter's rounding mode.
-
scannerProvider
Collator to be used in lenient parsing. This variable is lazy-evaluated: the collator is actually created the first time the client does a parse with lenient-parse mode turned on.
-
lookedForScanner
private transient boolean lookedForScanner
-
decimalFormatSymbols
The DecimalFormatSymbols object that any DecimalFormat objects this formatter uses should use. This variable is lazy-evaluated: it isn't filled in if the rule set never uses a DecimalFormat pattern.
-
decimalFormat
The NumberFormat used when lenient parsing numbers. This needs to reflect the locale. This is lazy-evaluated, like decimalFormatSymbols. It is here so it can be shared by different NFSubstitutions.
-
defaultInfinityRule
The rule used when dealing with infinity. This is lazy-evaluated, and derived from decimalFormat. It is here so it can be shared by different NFRuleSets.
-
defaultNaNRule
The rule used when dealing with IEEE 754 NaN. This is lazy-evaluated, and derived from decimalFormat. It is here so it can be shared by different NFRuleSets.
-
lenientParse
private boolean lenientParse
Flag specifying whether lenient parse mode is on or off. Off by default.
-
lenientParseRules
If the description specifies lenient-parse rules, they're stored here until the collator is created.
-
postProcessRules
If the description specifies post-process rules, they're stored here until post-processing is required.
-
postProcessor
Post processor lazily constructed from the postProcessRules.
-
ruleSetDisplayNames
Localizations for rule set names.
-
publicRuleSetNames
The public rule set names.
-
capitalizationInfoIsSet
private boolean capitalizationInfoIsSet
Data for handling context-based capitalization
-
capitalizationForListOrMenu
private boolean capitalizationForListOrMenu
-
capitalizationForStandAlone
private boolean capitalizationForStandAlone
-
capitalizationBrkIter
-
DEBUG
private static final boolean DEBUG
-
rulenames
-
locnames
-
MAX_VALUE
-
MIN_VALUE
-
Constructor Details
-
RuleBasedNumberFormat
Creates a RuleBasedNumberFormat that behaves according to the description passed in. The formatter uses the default FORMAT locale.
- Parameters:
description - A description of the formatter's desired behavior. See the class documentation for a complete explanation of the description syntax.
-
RuleBasedNumberFormat
Creates a RuleBasedNumberFormat that behaves according to the description passed in. The formatter uses the default FORMAT locale.
The localizations data provides information about the public rule sets and their localized display names for different locales. The first element in the list is an array of the names of the public rule sets. The first element in this array is the initial default ruleset. The remaining elements in the list are arrays of localizations of the names of the public rule sets. Each of these is one longer than the initial array, with the first String being the ULocale ID, and the remaining Strings being the localizations of the rule set names, in the same order as the initial array.
- Parameters:
description - A description of the formatter's desired behavior. See the class documentation for a complete explanation of the description syntax.
localizations - a list of localizations for the rule set names in the description.
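The layout of the localizations array can be easier to see in code. The sketch below builds one such array and checks the documented shape; the rule set names, locale IDs, display names, and the class `LocalizationsSketch` are all hypothetical, and the check is an illustration rather than the validation the constructor itself performs.

```java
class LocalizationsSketch {
    // Hypothetical data laid out as described above.
    static final String[][] EXAMPLE = {
        {"%main", "%verbose"},           // public rule set names; "%main" is the initial default
        {"en", "Main", "Verbose"},       // ULocale ID, then display names in the same order
        {"fr", "Principal", "Complet"}
    };

    /** Returns true if every localization row is one longer than the names row. */
    static boolean hasDocumentedShape(String[][] localizations) {
        if (localizations == null || localizations.length == 0) {
            return false;
        }
        int nameCount = localizations[0].length;
        for (int i = 1; i < localizations.length; i++) {
            if (localizations[i].length != nameCount + 1) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(hasDocumentedShape(EXAMPLE)); // prints true
    }
}
```

An array shaped like `EXAMPLE` is what the `localizations` parameter expects, alongside a description that defines the same public rule sets.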
-
RuleBasedNumberFormat
Creates a RuleBasedNumberFormat that behaves according to the description passed in. The formatter uses the specified locale to determine the characters to use when formatting in numerals, and to define equivalences for lenient parsing.
- Parameters:
description - A description of the formatter's desired behavior. See the class documentation for a complete explanation of the description syntax.
locale - A locale, which governs which characters are used for formatting values in numerals, and which characters are equivalent in lenient parsing.
-
RuleBasedNumberFormat
Creates a RuleBasedNumberFormat that behaves according to the description passed in. The formatter uses the specified locale to determine the characters to use when formatting in numerals, and to define equivalences for lenient parsing.- Parameters:
description
- A description of the formatter's desired behavior. See the class documentation for a complete explanation of the description syntax.locale
- A locale, which governs which characters are used for formatting values in numerals, and which characters are equivalent in lenient parsing.
-
RuleBasedNumberFormat
Creates a RuleBasedNumberFormat that behaves according to the description passed in. The formatter uses the specified locale to determine the characters to use when formatting in numerals, and to define equivalences for lenient parsing.The localizations data provides information about the public rule sets and their localized display names for different locales. The first element in the list is an array of the names of the public rule sets. The first element in this array is the initial default ruleset. The remaining elements in the list are arrays of localizations of the names of the public rule sets. Each of these is one longer than the initial array, with the first String being the ULocale ID, and the remaining Strings being the localizations of the rule set names, in the same order as the initial array.
- Parameters:
description
- A description of the formatter's desired behavior. See the class documentation for a complete explanation of the description syntax.localizations
- a list of localizations for the rule set names in the description.locale
- A ULocale that governs which characters are used for formatting values in numerals, and determines which characters are equivalent in lenient parsing.
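The layout of the localizations data described above can be sketched with plain arrays. This is only an illustration: the class name, the helper methods, and the sample rule set names and translations are hypothetical, and the code does not call ICU.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch only: models the documented layout of the localizations
// data. All names and sample values here are hypothetical.
class LocalizationsLayoutSketch {

    // Row 0: public rule set names (the first one is the initial default).
    // Rows 1..n: a locale ID followed by one display name per rule set.
    static final String[][] SAMPLE = {
        {"%main", "%alternate"},
        {"en", "Main", "Alternate"},
        {"fr", "Principal", "Alternatif"},
    };

    // Checks that every localization row is exactly one longer than row 0.
    static boolean isWellFormed(String[][] localizations) {
        if (localizations.length == 0) {
            return false;
        }
        int expected = localizations[0].length + 1;
        for (int i = 1; i < localizations.length; i++) {
            if (localizations[i].length != expected) {
                return false;
            }
        }
        return true;
    }

    // Maps each locale ID to its display names, kept in rule-set order.
    static Map<String, String[]> byLocale(String[][] localizations) {
        Map<String, String[]> result = new LinkedHashMap<>();
        for (int i = 1; i < localizations.length; i++) {
            String[] row = localizations[i];
            String[] names = new String[row.length - 1];
            System.arraycopy(row, 1, names, 0, names.length);
            result.put(row[0], names);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed(SAMPLE));
        System.out.println(String.join(", ", byLocale(SAMPLE).get("fr")));
    }
}
```

Keeping the rule set names in row 0 means a display name can always be paired with its rule set by index.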
-
RuleBasedNumberFormat
Creates a RuleBasedNumberFormat from a predefined description. The selector code chooses among three possible predefined formats: spellout, ordinal, and duration.- Parameters:
locale
- The locale for the formatter.format
- A selector code specifying which kind of formatter to create for that locale. There are three legal values: SPELLOUT, which creates a formatter that spells out a value in words in the desired language, ORDINAL, which attaches an ordinal suffix from the desired language to the end of a number (e.g. "123rd"), and DURATION, which formats a duration in seconds as hours, minutes, and seconds.
-
RuleBasedNumberFormat
Creates a RuleBasedNumberFormat from a predefined description. The selector code chooses among four possible predefined formats: spellout, ordinal, duration, and numbering system.- Parameters:
locale
- The locale for the formatter.format
- A selector code specifying which kind of formatter to create for that locale. There are four legal values: SPELLOUT, which creates a formatter that spells out a value in words in the desired language; ORDINAL, which attaches an ordinal suffix from the desired language to the end of a number (e.g. "123rd"); DURATION, which formats a duration in seconds as hours, minutes, and seconds; and NUMBERING_SYSTEM, which is used to invoke rules for alternate numbering systems such as the Hebrew numbering system, or for Roman numerals.
-
RuleBasedNumberFormat
public RuleBasedNumberFormat(int format) Creates a RuleBasedNumberFormat from a predefined description. Uses the default FORMAT locale.
- Parameters:
format
- A selector code specifying which kind of formatter to create. There are four legal values: SPELLOUT, which creates a formatter that spells out a value in words in the default locale's language; ORDINAL, which attaches an ordinal suffix from the default locale's language to a numeral; DURATION, which formats a duration in seconds as hours, minutes, and seconds, always rounding down; and NUMBERING_SYSTEM, which is used for alternate numbering systems such as Hebrew.- See Also:
-
-
Method Details
-
clone
Duplicates this formatter.- Overrides:
clone
in class NumberFormat
- Returns:
- A RuleBasedNumberFormat that is equal to this one.
-
equals
Tests two RuleBasedNumberFormats for equality.- Overrides:
equals
in class NumberFormat
- Parameters:
that
- The formatter to compare against this one.- Returns:
- true if the two formatters have identical behavior.
-
hashCode
public int hashCode()- Overrides:
hashCode
in class NumberFormat
-
toString
Generates a textual description of this formatter. -
writeObject
Writes this object to a stream.- Parameters:
out
- The stream to write to.- Throws:
IOException
-
readObject
Reads this object in from a stream.- Parameters:
in
- The stream to read from.- Throws:
IOException
-
getRuleSetNames
Returns a list of the names of all of this formatter's public rule sets.- Returns:
- A list of the names of all of this formatter's public rule sets.
-
getRuleSetDisplayNameLocales
Return a list of locales for which there are locale-specific display names for the rule sets in this formatter. If there are no localized display names, return null.- Returns:
- an array of the ULocales for which there is rule set display name information
-
getNameListForLocale
-
getRuleSetDisplayNames
Return the rule set display names for the provided locale. These are in the same order as those returned by getRuleSetNames. The locale is matched against the locales for which there is display name data, using normal fallback rules. If no locale matches, the default display names are returned. (These are the internal rule set names minus the leading '%'.)- Returns:
- an array of the display names for the rule sets
- See Also:
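The lookup order described for getRuleSetDisplayNames can be sketched as follows. This is a simplified model, not ICU's implementation. It assumes that "normal fallback rules" can be approximated by trimming the locale ID at its last underscore, and the class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the documented lookup order; not ICU's implementation.
class DisplayNameFallbackSketch {

    // Default display name: the internal rule set name minus its leading '%'.
    static String defaultDisplayName(String ruleSetName) {
        return ruleSetName.startsWith("%") ? ruleSetName.substring(1) : ruleSetName;
    }

    // Tries the full locale ID, then each shorter parent (assumed to be found
    // by trimming at the last '_'), then falls back to the default names.
    static String[] displayNames(String localeId, String[] ruleSetNames,
                                 Map<String, String[]> localized) {
        String id = localeId;
        while (!id.isEmpty()) {
            String[] names = localized.get(id);
            if (names != null) {
                return names;
            }
            int cut = id.lastIndexOf('_');
            id = cut < 0 ? "" : id.substring(0, cut);
        }
        String[] defaults = new String[ruleSetNames.length];
        for (int i = 0; i < defaults.length; i++) {
            defaults[i] = defaultDisplayName(ruleSetNames[i]);
        }
        return defaults;
    }

    public static void main(String[] args) {
        Map<String, String[]> localized = new HashMap<>();
        localized.put("en", new String[] {"Main"});
        String[] ruleSets = {"%main"};
        System.out.println(displayNames("en_US", ruleSets, localized)[0]);
        System.out.println(displayNames("de", ruleSets, localized)[0]);
    }
}
```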
-
getRuleSetDisplayNames
Return the rule set display names for the current default DISPLAY locale.
- Returns:
- an array of the display names
- See Also:
-
getRuleSetDisplayName
Return the rule set display name for the provided rule set and locale. The locale is matched against the locales for which there is display name data, using normal fallback rules. If no locale matches, the default display name is returned.- Returns:
- the display name for the rule set
- Throws:
IllegalArgumentException
- if ruleSetName is not a valid rule set name for this format- See Also:
-
getRuleSetDisplayName
Return the rule set display name for the provided rule set in the current default DISPLAY locale.
- Returns:
- the display name for the rule set
- See Also:
-
format
Formats the specified number according to the specified rule set.- Parameters:
number
- The number to format.ruleSet
- The name of the rule set to format the number with. This must be the name of a valid public rule set for this formatter.- Returns:
- A textual representation of the number.
- Throws:
IllegalArgumentException
-
format
Formats the specified number according to the specified rule set. (If the specified rule set specifies a default ["x.0"] rule, this function ignores it. Convert the number to a double first if you need it.) This function preserves all the precision in the long; it doesn't convert it to a double.- Parameters:
number
- The number to format.ruleSet
- The name of the rule set to format the number with. This must be the name of a valid public rule set for this formatter.- Returns:
- A textual representation of the number.
- Throws:
IllegalArgumentException
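The precision point matters because a double cannot represent every long exactly. The following stdlib-only sketch shows the loss that a double-based path would introduce. The class and method names are hypothetical and the code does not call ICU.

```java
// Sketch: why a long-based format path avoids going through double.
class LongPrecisionSketch {

    // Round-trips a long through double, as a double-based path would.
    static long viaDouble(long value) {
        return (long) (double) value;
    }

    public static void main(String[] args) {
        long value = 9007199254740993L;       // 2^53 + 1
        System.out.println(value);            // 9007199254740993
        System.out.println(viaDouble(value)); // 9007199254740992
    }
}
```

Values up to 2^53 in magnitude survive the round trip, so the difference only shows up for large longs.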
-
format
Formats the specified number using the formatter's default rule set. (The default rule set is the last public rule set defined in the description.)- Specified by:
format
in class NumberFormat
- Parameters:
number
- The number to format.toAppendTo
- A StringBuffer that the result should be appended to.ignore
- This function doesn't examine or update the field position.- Returns:
- toAppendTo
- See Also:
-
format
Formats the specified number using the formatter's default rule set. (The default rule set is the last public rule set defined in the description.) (If the specified rule set specifies a default ["x.0"] rule, this function ignores it. Convert the number to a double first if you need it.) This function preserves all the precision in the long; it doesn't convert it to a double.- Specified by:
format
in class NumberFormat
- Parameters:
number
- The number to format.toAppendTo
- A StringBuffer that the result should be appended to.ignore
- This function doesn't examine or update the field position.- Returns:
- toAppendTo
- See Also:
-
format
Implements com.ibm.icu.text.NumberFormat: formats a BigInteger.- Specified by:
format
in class NumberFormat
- See Also:
-
format
Implements com.ibm.icu.text.NumberFormat: formats a BigDecimal.- Specified by:
format
in class NumberFormat
- See Also:
-
format
Implements com.ibm.icu.text.NumberFormat: formats a BigDecimal.- Specified by:
format
in class NumberFormat
- See Also:
-
parse
Parses the specified string, beginning at the specified position, according to this formatter's rules. This will match the string against all of the formatter's public rule sets and return the value corresponding to the longest parseable substring. This function's behavior is affected by the lenient parse mode.- Specified by:
parse
in class NumberFormat
- Parameters:
text
- The string to parseparsePosition
- On entry, contains the position of the first character in "text" to examine. On exit, has been updated to contain the position of the first character in "text" that wasn't consumed by the parse.- Returns:
- The number that corresponds to the parsed text. This will be an instance of either Long or Double, depending on whether the result has a fractional part.
- See Also:
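The ParsePosition contract and the Long-or-Double result can be tried out with the JDK's own java.text.NumberFormat, which follows the same convention. The sketch below deliberately uses that JDK class as a stand-in, not RuleBasedNumberFormat, so it shows only the calling pattern and not rule-based parsing. The class and method names are hypothetical.

```java
import java.text.NumberFormat;
import java.text.ParsePosition;
import java.util.Locale;

// Sketch using the JDK's NumberFormat as a stand-in for the calling pattern.
class ParsePositionSketch {

    // Parses the longest numeric prefix starting at pos and advances pos.
    static Number parsePrefix(String text, ParsePosition pos) {
        return NumberFormat.getInstance(Locale.US).parse(text, pos);
    }

    public static void main(String[] args) {
        ParsePosition pos = new ParsePosition(0);
        Number whole = parsePrefix("42 apples", pos);
        // The index now points at the first character that was not consumed.
        System.out.println(whole + " " + whole.getClass().getSimpleName() + " " + pos.getIndex());

        Number fractional = parsePrefix("3.5", new ParsePosition(0));
        System.out.println(fractional + " " + fractional.getClass().getSimpleName());
    }
}
```

Callers that need to detect trailing garbage can compare the updated index with the length of the input.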
-
setLenientParseMode
public void setLenientParseMode(boolean enabled) Turns lenient parse mode on and off. When in lenient parse mode, the formatter uses an RbnfLenientScanner for parsing the text. Lenient parsing is only in effect if a scanner is set. If no provider is set when the formatter is used for parsing, a default provider, RbnfLenientScannerProviderImpl, is set if it is available on the classpath. Otherwise this setting has no effect.
- Parameters:
enabled
- If true, turns lenient-parse mode on; if false, turns it off.- See Also:
-
lenientParseEnabled
public boolean lenientParseEnabled()Returns true if lenient-parse mode is turned on. Lenient parsing is off by default.- Returns:
- true if lenient-parse mode is turned on.
- See Also:
-
setLenientScannerProvider
Sets the provider for the lenient scanner. If this has not been set, setLenientParseMode(boolean) has no effect. This is necessary to decouple collation from format code.
- Parameters:
scannerProvider
- the provider- See Also:
-
getLenientScannerProvider
Returns the lenient scanner provider. If none was set, and lenient parse is enabled, this will attempt to instantiate a default scanner provider, setting it if it was successful. Otherwise this returns null.- See Also:
-
setDefaultRuleSet
Override the default rule set to use. If ruleSetName is null, reset to the initial default rule set.- Parameters:
ruleSetName
- the name of the rule set, or null to reset the initial default.- Throws:
IllegalArgumentException
- if ruleSetName is not the name of a public ruleset.
-
getDefaultRuleSetName
Return the name of the current default rule set.- Returns:
- the name of the current default rule set, if it is public, else the empty string.
-
setDecimalFormatSymbols
Sets the decimal format symbols used by this formatter. The formatter uses a copy of the provided symbols.- Parameters:
newSymbols
- desired DecimalFormatSymbols- See Also:
-
setContext
Set a particular DisplayContext value in the formatter, such as CAPITALIZATION_FOR_STANDALONE. Note: For getContext, see NumberFormat.- Overrides:
setContext
in class NumberFormat
- Parameters:
context
- The DisplayContext value to set.
-
getRoundingMode
public int getRoundingMode()Returns the rounding mode.- Overrides:
getRoundingMode
in class NumberFormat
- Returns:
- A rounding mode, between BigDecimal.ROUND_UP and BigDecimal.ROUND_UNNECESSARY.
- See Also:
-
setRoundingMode
public void setRoundingMode(int roundingMode) Sets the rounding mode. This has no effect unless the rounding increment is greater than zero.- Overrides:
setRoundingMode
in class NumberFormat
- Parameters:
roundingMode
- A rounding mode, between BigDecimal.ROUND_UP and BigDecimal.ROUND_UNNECESSARY.
- Throws:
IllegalArgumentException
- if roundingMode is unrecognized.
- See Also:
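The accepted range corresponds to the legacy integer constants on java.math.BigDecimal, which run from ROUND_UP (0) to ROUND_UNNECESSARY (7). A caller could pre-check a value with the JDK's RoundingMode.valueOf(int), as in this sketch. The class and method names are hypothetical, and nothing here is claimed about how ICU performs its own check.

```java
import java.math.RoundingMode;

// Sketch: checking an int against the legacy BigDecimal rounding-mode range.
class RoundingModeRangeSketch {

    // True when the value maps to a RoundingMode, i.e. lies in 0..7.
    static boolean isRecognized(int roundingMode) {
        try {
            RoundingMode.valueOf(roundingMode);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isRecognized(0)); // ROUND_UP
        System.out.println(isRecognized(7)); // ROUND_UNNECESSARY
        System.out.println(isRecognized(8)); // outside the range
    }
}
```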
-
getDefaultRuleSet
NFRuleSet getDefaultRuleSet()Returns a reference to the formatter's default rule set. The default rule set is the last public rule set in the description, or the one most recently set by setDefaultRuleSet.- Returns:
- The formatter's default rule set.
-
getLenientScanner
RbnfLenientScanner getLenientScanner()Returns the scanner to use for lenient parsing. The scanner is provided by the provider.- Returns:
- The scanner to use for lenient parsing, or null if lenient parsing is turned off.
-
getDecimalFormatSymbols
DecimalFormatSymbols getDecimalFormatSymbols()Returns the DecimalFormatSymbols object that should be used by all DecimalFormat instances owned by this formatter. This object is lazily created: this function creates it the first time it's called.- Returns:
- The DecimalFormatSymbols object that should be used by all DecimalFormat instances owned by this formatter.
-
getDecimalFormat
DecimalFormat getDecimalFormat() -
createPluralFormat
-
getDefaultInfinityRule
NFRule getDefaultInfinityRule()Returns the default rule for infinity. This object is lazily created: this function creates it the first time it's called. -
getDefaultNaNRule
NFRule getDefaultNaNRule()Returns the default rule for NaN. This object is lazily created: this function creates it the first time it's called. -
extractSpecial
This extracts the special information from the rule sets before the main parsing starts. Extra whitespace must have already been removed from the description. If found, the special information is removed from the description and returned, otherwise the description is unchanged and null is returned. Note: the trailing semicolon at the end of the special rules is stripped.- Parameters:
description
- the rbnf description with extra whitespace removedspecialName
- the name of the special rule text to extract- Returns:
- the special rule text, or null if the rule was not found
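The contract above can be sketched as follows. This is a minimal model written from the description, not the library's code. It assumes that a special section begins at the start of the description or right after a semicolon, and that it ends at the semicolon preceding the next '%' name. The class name and the sample description are hypothetical.

```java
// Sketch of the documented contract for extracting a special section.
class ExtractSpecialSketch {

    // Removes "<specialName>...;" from the description and returns the text
    // between the name and its closing semicolon, or null if it is absent.
    static String extractSpecial(StringBuilder description, String specialName) {
        int start = description.indexOf(specialName);
        // Assumption: only a match at a rule-set boundary counts.
        if (start < 0 || (start > 0 && description.charAt(start - 1) != ';')) {
            return null;
        }
        int bodyStart = start + specialName.length();
        // Assumption: the section ends at the ';' before the next '%' name.
        int end = description.indexOf(";%", bodyStart);
        if (end < 0) {
            end = description.length();
            if (end > bodyStart && description.charAt(end - 1) == ';') {
                end--; // strip the trailing semicolon
            }
        }
        String result = description.substring(bodyStart, end).trim();
        description.delete(start, Math.min(end + 1, description.length()));
        return result;
    }

    public static void main(String[] args) {
        StringBuilder description =
            new StringBuilder("%main:0: zero;%%lenient-parse:& a < b;%other:1: one;");
        System.out.println(extractSpecial(description, "%%lenient-parse:"));
        System.out.println(description);
    }
}
```

Passing a StringBuilder lets the method both return the extracted text and leave the caller with the shortened description.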
-
init
This function parses the description and uses it to build all of the internal data structures that the formatter uses to do formatting.- Parameters:
description
- The description of the formatter's desired behavior. This is either passed in by the caller or loaded out of a resource by one of the constructors, and is in the description format specified in the class docs.
-
initLocalizations
Take the localizations array and create a Map from the locale strings to the localization arrays. -
initCapitalizationContextInfo
Set capitalizationForListOrMenu, capitalizationForStandAlone -
stripWhitespace
This function is used by init() to strip whitespace between rules (i.e., after semicolons).- Parameters:
description
- The formatter description- Returns:
- The description with all the whitespace that follows semicolons taken out.
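A minimal sketch of this step, written only from the description above: whitespace is dropped when it directly follows a semicolon and kept everywhere else. The class name is hypothetical and the actual method may differ in details such as which characters count as whitespace.

```java
// Sketch: drop whitespace that follows a semicolon, keep all other text.
class StripWhitespaceSketch {

    static String stripWhitespace(String description) {
        StringBuilder out = new StringBuilder(description.length());
        boolean afterSemicolon = false;
        for (int i = 0; i < description.length(); i++) {
            char c = description.charAt(i);
            if (afterSemicolon && Character.isWhitespace(c)) {
                continue; // still in the gap between two rules
            }
            out.append(c);
            afterSemicolon = (c == ';');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(stripWhitespace("zero;  one;\n\ttwo;"));
        System.out.println(stripWhitespace("20: twenty[->>];  30: thirty[->>];"));
    }
}
```

Whitespace inside a rule, such as the space after a base value's colon, is left alone because it is part of the rule text.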
-
format
Bottleneck through which all the public format() methods that take a double pass. By the time we get here, we know which rule set we're using to do the formatting.- Parameters:
number
- The number to formatruleSet
- The rule set to use to format the number- Returns:
- The text that resulted from formatting the number
-
format
Bottleneck through which all the public format() methods that take a long pass. By the time we get here, we know which rule set we're using to do the formatting.- Parameters:
number
- The number to formatruleSet
- The rule set to use to format the number- Returns:
- The text that resulted from formatting the number
-
postProcess
Post-process the rules if we have a post-processor. -
adjustForContext
Adjust capitalization of formatted result for display context -
findRuleSet
Returns the named rule set. Throws an IllegalArgumentException if this formatter doesn't have a rule set with that name.- Parameters:
name
- The name of the desired rule set- Returns:
- The rule set with that name
- Throws:
IllegalArgumentException
-