
bool Falcon::Format::parse ( const String fmt  ) 

Parses a format string. Transforms a format string into a setup for this format object.

The format is a sequence of commands that are parsed independently of their position. Commands are usually one or two characters long, sometimes more.

Formats are meant to deal with different item types. A format designed for a certain kind of item, for example a number, may be applied to something different, for example a string, or the other way around.

For this reason, Falcon formats also specify what to do if the given item is not of the type the format was designed for.

Format elements:

  • Size: The minimum field length; it is expressed simply by a number. If the formatted output is as wide as or wider than the allocated size, the output will NOT be truncated, and the resulting string may be too wide to be displayed where it was intended to be. The size can be made mandatory by adding '*' after it. In this case, the function will return false (and possibly raise an error) if the conversion caused the output to be wider than allowed.

  • Padding: the padding character is appended after the formatted output, or prepended before it if alignment is to the right. To define the padding character, use 'p' followed by the character; for example, p0 fills the field with zeroes. The character may be any Unicode character (the format string accepts standard Falcon character escapes). In the special case of p0, front sign indicators are placed at the beginning of the field; for example, "4p0+" will produce "+001", "-002" and so on, while "4px+" will produce "xx+1", "xx-2" etc.

  • Numeric base: the way an integer should be rendered. It may be:
    • Decimal: as it is the default conversion, no command is needed; an 'N' character may be added to the format to specify that a number is actually expected.
    • Hexadecimal: Command may be 'x' (lowercase hex), 'X' (uppercase Hex), 'c' (0x prefixed lowercase hex) or 'C' (0x prefixed uppercase hex).
    • Binary: 'b' to convert to binary, and 'B' to convert to binary and add a "b" after the number.
    • Octal: 'o' to display an octal number, or '0' to display an octal with "0" prefix.
    • Scientific: 'e' to display a number in scientific notation W.D+/-eM. The format of numbers in scientific notation is fixed, so the thousands separator and the decimal separator cannot be set, but the decimals setting still applies.

  • Decimals: '.' followed by a number indicates the number of decimals to be displayed. If no decimal count is specified, floating point numbers will be displayed with all significant digits, while if it is set to zero, decimal numbers will be rounded.

  • Decimal separator: a 'd' followed by any non-digit character will be interpreted as the decimal separator setting. For example, to use the central European standard for decimal numbers and limit the output to 3 decimals, write ".3d," or "d,.3". The default value is '.'.

  • (Thousand) Grouping: this is actually the integer part group separator, as it is also displayed for hexadecimal, octal and binary conversions. It is set using 'g' followed by the separator character, and it defaults to ','. Normally it is not displayed; to activate it, also set the integer grouping digit count. This is normally 3, but it is 4 in Japanese and Chinese locales, and it may be useful to set it to 2 or 4 for hexadecimal, 3 for octal and 4 or 8 for binary. For example, 'g4-' would group digits 4 by 4, separating the groups with a "-". Zero disables grouping.

  • Grouping character: to change only the grouping character and not the default grouping count, use 'G' followed by the character.

  • Alignment: by default the field is aligned to the left; to align the field to the right use 'r'.

  • Negative display format: By default, a '-' sign is placed in front of the number if it is negative. If the '+' character is added to the format, a '+' is placed in front of positive numbers as well. '--' appends a '-' after the number if it is negative, while '++' appends either '+' or '-' depending on the sign of the number. To display parentheses around negative numbers, use '[', or use ']' to display parentheses around negative numbers and the padding character before and after positive numbers. Using parentheses prevents using the '+', '++' or '--' formats. Format '-^' adds a '-' in front of the padding space if the number is negative, while '+^' adds plus or minus depending on the sign of the number. For example, "5+" would render -12 as " -12", while "5+^" would render it as "- 12". If alignment is to the right, the sign is added at the other side of the padding: "5+^r" would render -12 as "12 -". If size is not mandatory, parentheses are wrapped around the formatted field, while if size is mandatory they are wrapped around the whole field, padding included. For example, "5[r" on -4 would render as " (4)", while "5*[r" would render as "( 4)".

  • Object specific format: Objects may accept an object-specific format as a parameter of the standard "toString" method. A pipe separator '|' causes the rest of the format string to be passed unparsed to the toString method of the object being formatted. If the object does not provide a toString method, or if the item is not an object at all, an error is raised. The object is solely responsible for parsing and applying its specific format.

  • Nil format: How to represent a nil. It may be one of the following:
    • 'nn': nil is not represented (mute).
    • 'nN': nil is represented by "N"
    • 'nl': nil is rendered with "nil"
    • 'nL': nil is rendered with "Nil". This is also the default.
    • 'nu': nil is rendered with "Null"
    • 'nU': nil is rendered with "NULL"
    • 'no': nil is rendered with "None"
    • 'nA': nil is rendered with "NA"

  • Action on error: Normally, when trying to format something different from what is expected, the format() method will simply return false. This happens, for example, when formatting a string as a number, a string using the date formatter, a number with a simple pad-and-size formatter, etc. To change this behavior, use '/' followed by one of the following:
    • 'n': act as if the wrong item was nil (and use the defined nil formatter).
    • '0': act as if the given item was 0, the empty string, an invalid date, or in general the neutral element of the expected type.
    • 'r': raise a type error. A 'c' letter may be added after the '/' and before the specifier to try a basic conversion into the expected type before triggering the requested effect. This will, for example, cause the toString() method of objects to be called if the format is detected to be a string-type format.
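The interaction of size, padding and alignment described in the list above can be sketched in standalone C++. This is a minimal illustration, not Falcon's implementation: padField and its parameters are hypothetical names, and the sketch assumes C++17 for std::optional and single-byte characters.

```cpp
#include <cstddef>
#include <optional>
#include <string>

// Hypothetical helper, not part of Falcon: pads `text` up to `width` characters.
// Returns std::nullopt when `mandatory` is set and the text does not fit,
// which mirrors the failure described for a size followed by '*'.
std::optional<std::string> padField(const std::string& text, std::size_t width,
                                    char pad = ' ', bool rightAlign = false,
                                    bool mandatory = false)
{
    if (text.size() >= width) {
        // Output as wide as or wider than the field: never truncate,
        // but fail if the size is mandatory and was exceeded.
        if (mandatory && text.size() > width)
            return std::nullopt;
        return text;
    }
    const std::string fill(width - text.size(), pad);
    // Default (left) alignment pads after the text; right alignment pads before.
    return rightAlign ? fill + text : text + fill;
}
```

With these assumptions, padField("abc", 5, 'x', true) yields "xxabc", an over-wide value is returned untouched when the size is not mandatory, and the same value produces an empty optional when it is.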

If the pattern is invalid, a parameter error will be raised. Examples:

  • "8*Xg2-": Mandatory 8 characters, hexadecimal uppercase, grouped 2 by 2 with '-' characters. A result may be "0A-F1-DA".
  • "12.3p0r+/r": 12 characters, of which 3 are fixed decimals, 0 padded, right aligned. '+' is always added in front of positive numbers. If the formatted item is not a number, a type error is raised.
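The digit grouping used in the first example (digits grouped 2 by 2 with '-') can be sketched as follows. This is only an illustration under stated assumptions: groupDigits is a hypothetical helper, it works on an already rendered digit string, and it counts groups from the right as is usual for thousands separators.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not part of Falcon: inserts `sep` between groups of
// `group` digits, counting from the right. A group count of zero disables grouping.
std::string groupDigits(const std::string& digits, std::size_t group, char sep)
{
    if (group == 0)
        return digits;

    std::string out;
    for (std::size_t i = 0; i < digits.size(); ++i) {
        // Emit a separator whenever the number of digits still to be
        // written is an exact multiple of the group size.
        if (i != 0 && (digits.size() - i) % group == 0)
            out += sep;
        out += digits[i];
    }
    return out;
}
```

For instance, groupDigits("0AF1DA", 2, '-') gives "0A-F1-DA", and groupDigits("1234567", 3, ',') gives "1,234,567".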

Note:
Ranges will be represented as [n1:n2], or as [n1:] if they are open. Size, alignment and padding work on the whole range, while numeric formatting is applied to each end of the range.
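A minimal sketch of the range representation described in this note, assuming C++17 and plain decimal rendering of each end. The name rangeToString is hypothetical and does not exist in Falcon.

```cpp
#include <optional>
#include <string>

// Hypothetical helper, not part of Falcon: renders a range as "[n1:n2]",
// or as "[n1:]" when the range is open (no upper end).
std::string rangeToString(long long first,
                          std::optional<long long> last = std::nullopt)
{
    std::string out = "[" + std::to_string(first) + ":";
    if (last)
        out += std::to_string(*last);
    out += "]";
    return out;
}
```

Size, alignment and padding would then be applied to the returned string as a whole.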
Returns:
true if parsing is successful, false on a format parse error.

Definition at line 53 of file format.cpp.

References e_actConvertNil, e_actConvertRaise, e_actConvertZero, e_actNil, e_actRaise, e_actZero, e_binary, e_binaryB, e_cHexLower, e_cHexUpper, e_decimal, e_hexLower, e_hexUpper, e_minusBack, e_minusEnd, e_minusFront, e_nilEmpty, e_nilN, e_nilNA, e_nilNil, e_nilnil, e_nilNone, e_nilNULL, e_nilNull, e_octal, e_octalZero, e_parenthesis, e_parpad, e_plusMinusBack, e_plusMinusEnd, e_plusMinusFront, e_scientific, e_tError, e_tNum, e_tStr, Falcon::String::getCharAt(), Falcon::String::length(), Falcon::String::parseInt(), and Falcon::String::size().

{

   String tmp;
   uint32 pos = 0;
   uint32 len = fmt.length();

   typedef enum {
      e_sInitial,
      e_sSize,
      e_sDecimals,
      e_sPadding,
      e_sDecSep,
      e_sGroupSep,
      e_sGroupSep2,
      e_sErrorEffect,
      e_sErrorEffect2,
      e_sNilMode,
      e_sNegFmt,
      e_sNegFmt2
   }
   t_state;
   t_state state = e_sInitial;


   while( pos < len )
   {
      uint32 chr = fmt.getCharAt( pos );

      switch( state )
      {

         //=============================
         // Basic state.
         //
         case e_sInitial:
            if( chr >= '1' && chr <= '9' )
            {
               // size already given
               if ( m_size != 0 )
                  return false;

               state = e_sSize;
               tmp.size(0);
               tmp += chr;
               break;
            }

            // else:
            switch( chr )
            {
               case 'N':
                  // it should be a plain decimal number.
                  m_convType = e_tNum;
                  numFormat( e_decimal );
               break;

               case '.':
                  // the number of decimals follows.
                  m_convType = e_tNum;
                  state = e_sDecimals;
                  tmp.size(0);
               break;

               case 'b':
                  // it should be a binary.
                  m_convType = e_tNum;
                  numFormat( e_binary );
               break;

               case 'B':
                  // it should be a binary with a trailing "b".
                  m_convType = e_tNum;
                  numFormat( e_binaryB );
               break;

               case 'd':
                  m_convType = e_tNum;
                  state = e_sDecSep;
               break;

               case 'p':
                  state = e_sPadding;
               break;

               case 'g':
                  m_convType = e_tNum;
                  state = e_sGroupSep;
               break;

               case 'G':
                  m_convType = e_tNum;
                  state = e_sGroupSep2;
               break;

               case '0':
                  // it should be an octal with "0" prefix.
                  m_convType = e_tNum;
                  numFormat( e_octalZero );
               break;

               case 'o':
                  // it should be an octal.
                  m_convType = e_tNum;
                  numFormat( e_octal );
               break;

               case 'x':
                  // it should be a lowercase hex.
                  m_convType = e_tNum;
                  numFormat( e_hexLower );
               break;

               case 'X':
                  // it should be an uppercase hex.
                  m_convType = e_tNum;
                  numFormat( e_hexUpper );
               break;

               case 'c':
                  // it should be a 0x prefixed lowercase hex.
                  m_convType = e_tNum;
                  numFormat( e_cHexLower );
               break;

               case 'C':
                  // it should be a 0x prefixed uppercase hex.
                  m_convType = e_tNum;
                  numFormat( e_cHexUpper );
               break;

               case 'e':
                  // it should be in scientific format
                  m_convType = e_tNum;
                  numFormat( e_scientific );
               break;

               case '/':
                  // the action on type mismatch follows.
                  state = e_sErrorEffect;
               break;

               case 'n':
                  state = e_sNilMode;
               break;

               case '|':
                  m_posOfObjectFmt = pos;
                  m_convType = e_tStr;
                  // complete parsing
                  pos = len;
               break;

               case '+':
                  m_negFormat = e_plusMinusFront;
                  state = e_sNegFmt;
               break;

               case '-':
                  m_negFormat = e_minusFront;
                  state = e_sNegFmt2;
               break;

               case '[':
                  m_negFormat = e_parenthesis;
               break;

               case ']':
                  m_negFormat = e_parpad;
               break;

               case 'r':
                  m_rightAlign = true;
               break;

               default:
                  // unrecognized character
                  m_convType = e_tError;
                  return false;
            }
         break;

         //=============================
         // Parse single-character arguments (decimal separator, padding, grouping)
         //
         case e_sDecSep:
            m_decimalSep = chr;
            state = e_sInitial;
         break;

         case e_sPadding:
            m_paddingChr = chr;
            state = e_sInitial;
         break;

         case e_sGroupSep:
            if( chr >= '0' && chr <='9' )
            {
               m_grouping = chr - '0';
               state = e_sGroupSep2;
            }
            else {
               m_thousandSep = chr;
               state = e_sInitial;
            }
         break;

         case e_sGroupSep2:
            m_thousandSep = chr;
            state = e_sInitial;
         break;

         //=============================
         // Size parsing state
         //
         case e_sSize:
            if( chr >= '0' && chr <= '9' )
            {
               tmp += chr;

               // size too wide
               if ( tmp.length() > 4 ) {
                  m_convType = e_tError;
                  return false;
               }
            }
            else
            {
               int64 tgt;
               tmp.parseInt( tgt );
               fieldSize( (uint16) tgt );

               if( chr == '*' )
               {
                  fixedSize( true );
               }
               else {
                  // reparse current char
                  --pos;
               }

               state = e_sInitial;
            }
         break;

         //=============================
         // Decimals parsing state
         //
         case e_sDecimals:
            if( chr >= '0' && chr <= '9' )
            {
               tmp += chr;

               // size too wide
               if ( tmp.length() > 2 ) {
                  m_convType = e_tError;
                  return false;
               }
            }
            else
            {
               int64 tgt;
               tmp.parseInt( tgt );
               decimals( (uint8) tgt );
               // reparse current char
               --pos;
               state = e_sInitial;
            }
         break;

         //===============================================
         // Parsing what should be done in case of error.
         //
         case e_sErrorEffect:
            if ( chr == 'c' )
            {
               state = e_sErrorEffect2;
               break;
            }

            // else
            switch( chr )
            {
               case 'n': mismatchAction( e_actNil ); break;
               case '0': mismatchAction( e_actZero ); break;
               case 'r': mismatchAction( e_actRaise ); break;

               default:
                  // invalid choice
                  m_convType = e_tError;
                  return false;
            }

            state = e_sInitial;
         break;

         case e_sErrorEffect2:
            switch( chr )
            {
               case 'n': mismatchAction( e_actConvertNil ); break;
               case '0': mismatchAction( e_actConvertZero ); break;
               case 'r': mismatchAction( e_actConvertRaise ); break;

               default:
                  // invalid choice
                  m_convType = e_tError;
                  return false;
            }
            state = e_sInitial;
         break;

         //=================================
         // parsing what to do with a Nil
         //
         case e_sNilMode:
            switch( chr )
            {
               case 'n': m_nilFormat = e_nilEmpty; break;
               case 'N': m_nilFormat = e_nilN; break;
               case 'l': m_nilFormat = e_nilnil; break;
               case 'L': m_nilFormat = e_nilNil; break;
               case 'u': m_nilFormat = e_nilNull; break;
               case 'U': m_nilFormat = e_nilNULL; break;
               case 'o': m_nilFormat = e_nilNone; break;
               case 'A': m_nilFormat = e_nilNA; break;

               default:
                  m_convType = e_tError;
                  return false;
            }
            state = e_sInitial;
         break;

         //=================================
         // Parsing neg format
         case e_sNegFmt:
            switch( chr ) {
               case '+': m_negFormat = e_plusMinusBack; break;
               case '^': m_negFormat = e_plusMinusEnd; break;
               default:
                  pos--;
            }
            state = e_sInitial;
         break;

         //=================================
         // Parsing neg format 2
         case e_sNegFmt2:
            switch( chr ) {
               case '-': m_negFormat = e_minusBack; break;
               case '^': m_negFormat = e_minusEnd; break;
               default:
                  pos--;
            }
            state = e_sInitial;
         break;

      }

      ++pos;
   } // end main loop


   // verify output status
   switch( state )
   {
      case e_sInitial: // ok
      case e_sNegFmt:  // a trailing '+' is a complete command
      case e_sNegFmt2: // a trailing '-' is a complete command
      break;

      case e_sSize:
      {
         int64 tgt;
         tmp.parseInt( tgt );
         fieldSize( (uint16) tgt ); // same width as in the main loop; uint8 would truncate sizes above 255
      }
      break;

      case e_sDecimals:
      {
         int64 tgt;
         tmp.parseInt( tgt );
         decimals( (uint8) tgt );
      }
      break;

      // any other state means we're left in the middle of something
      default:
         m_convType = e_tError;
         return false;
   }

   // if everything goes fine...
   m_originalFormat = fmt;
   return true;
}
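The parser above accumulates digits in a temporary string and, when a non-digit arrives, hands that character back to the initial state ("reparse current char"). The standalone sketch below isolates that technique on a reduced grammar (size, optional '*', decimals, 'r'). It is not Falcon code: MiniFormat and parseMini are hypothetical names, and instead of decrementing the position it simply does not advance it, which has the same effect.

```cpp
#include <cstddef>
#include <string>

// Hypothetical result type for this sketch; not a Falcon class.
struct MiniFormat {
    unsigned size = 0;
    bool fixedSize = false;
    unsigned decimals = 0;
    bool rightAlign = false;
};

// Parses a reduced grammar: <size>['*'], '.'<decimals> and 'r', in any order.
// Returns false on an unrecognized character or an over-long number.
bool parseMini(const std::string& fmt, MiniFormat& out)
{
    enum State { Initial, Size, Decimals };
    State state = Initial;
    std::string tmp;      // digits accumulated for the current number
    std::size_t pos = 0;

    while (pos < fmt.size()) {
        const char chr = fmt[pos];
        switch (state) {
        case Initial:
            if (chr >= '1' && chr <= '9') {
                if (out.size != 0) return false;   // size already given
                tmp.assign(1, chr);
                state = Size;
            } else if (chr == '.') {
                tmp.clear();
                state = Decimals;
            } else if (chr == 'r') {
                out.rightAlign = true;
            } else {
                return false;                      // unrecognized character
            }
            ++pos;
            break;

        case Size:
            if (chr >= '0' && chr <= '9') {
                tmp += chr;
                if (tmp.size() > 4) return false;  // size too wide
                ++pos;
            } else {
                out.size = static_cast<unsigned>(std::stoul(tmp));
                if (chr == '*') {
                    out.fixedSize = true;
                    ++pos;                         // '*' belongs to the size command
                }
                // Otherwise pos is left untouched: chr is reparsed in Initial.
                state = Initial;
            }
            break;

        case Decimals:
            if (chr >= '0' && chr <= '9') {
                tmp += chr;
                if (tmp.size() > 2) return false;  // too many digits
                ++pos;
            } else {
                out.decimals = tmp.empty() ? 0u : static_cast<unsigned>(std::stoul(tmp));
                state = Initial;                   // reparse chr in Initial
            }
            break;
        }
    }

    // Flush a number that was still being accumulated when the input ended.
    if (state == Size)
        out.size = static_cast<unsigned>(std::stoul(tmp));
    else if (state == Decimals)
        out.decimals = tmp.empty() ? 0u : static_cast<unsigned>(std::stoul(tmp));
    return true;
}
```

Calling parseMini("12*.3r", f) on a default-constructed MiniFormat sets size to 12, marks it as fixed, sets decimals to 3 and enables right alignment. Note the final flush after the loop, which plays the same role as the "verify output status" switch in the listing.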

