edu.umd.cfar.lamp.viper.util
Class StringHelp

java.lang.Object
  extended by edu.umd.cfar.lamp.viper.util.StringHelp

public class StringHelp
extends java.lang.Object

This class contains generic static string manipulation and generation functions.


Field Summary
static int REL_EQ
          Equality relation
static int REL_GT
          Greater than relation
static int REL_GTEQ
          Greater than or equals relation
static int REL_LT
          Less than relation
static int REL_LTEQ
          Less than or equals relation
static int REL_NEQ
          Inequality relation
 
Constructor Summary
StringHelp()
           
 
Method Summary
static java.lang.String backslashify(java.lang.String str)
          Converts special characters to escape sequences.
static java.lang.String banner(java.lang.String text, int width)
          Generates an asterisk-surrounded banner containing the given text.
static boolean containsCommaSeparatedList(java.lang.String s)
          Checks if there are commas that aren't within quote marks.
static java.util.Iterator debackslashedTokenizer(java.lang.String str)
          Tokenizes like the bash command line.
static java.lang.String debackslashify(java.lang.String str)
          Converts escape sequences to their real characters, like new lines.
static java.lang.String encodeAsAnAcceptableFileName(java.lang.String s)
          Encode a string to the "x-www-form-urlencoded" form, enhanced with the UTF-8-in-URL proposal.
static java.lang.String getExtraTextOutsideBrackets(java.lang.String line)
          Get the characters from a bracketed list that are not in brackets.
static java.util.List getQuotedList(java.lang.String str)
          Converts a comma-separated list of quoted strings to a list of debackslashified strings.
static java.lang.String getQuotedListString(java.util.List list)
          Returns the list as a comma-separated list of quoted and escaped strings.
static int getRelationalOperatorEnum(java.lang.String rel_op)
          Converts the string, e.g. == or <=, to the constant.
static java.lang.String getRelationalOperatorString(int rel_op_enum)
          Gets the string for the relational operator.
static void handleExtraTextOutsideBrackets(java.lang.String line, ErrorWriter err)
          Prints a warning for text that should all be inside brackets but has some outside.
static int hasUnmatchedBrackets(java.lang.String str)
          Checks to see if the String has matching '[' and ']' characters.
static boolean isLegalIdentifier(java.lang.String str)
          Checks to see if a String is a legal identifier.
static boolean isLogicalOperator(java.lang.String log_op)
          Will return true if the string passed to it is one of the defined logical operators.
static boolean isRelationalOperator(java.lang.String rel_op)
          Will return true if the string passed to it is one of the defined relational operators.
static boolean isXMLFormat(java.io.File f)
          Checks to see if the file begins with an XML processing directive, e.g. <?xml?>.
static boolean isXMLFormat(java.io.InputStream f)
          Checks to see if the stream begins with an XML processing directive, e.g. <?xml?>.
static boolean isXMLFormat(java.lang.String fileName)
          Checks to see if the file begins with an XML processing directive, e.g. <?xml?>.
static boolean isXMLFormat(java.lang.String configFileName, java.lang.String dataFileName)
          Determines by file content if the data is in XML format.
static java.lang.String padLeft(int amount, java.lang.String S)
          Returns the string S with amount space characters added to the front.
static java.lang.String removeSpacesFrom(java.lang.String candidate)
          Will remove whitespace chars from the string.
static java.lang.String shorten(java.lang.String line)
          Removes the first and last character of the string.
static java.lang.String[] splitByBrackets(java.lang.String line)
          Extracts a list of all strings contained within brackets.
static java.lang.String[] splitByCommas(java.lang.String line)
          Deprecated. If you must, use this line: splitBySeparator (removeBrackets (line), ",")
static java.lang.String[] splitBySeparator(java.lang.String line, char sep)
          Split using a separator.
static java.lang.String[] splitBySeparatorAndParen(java.lang.String line, char sep)
          Split using a separator, but allow for the separator to occur in nested parentheses without splitting.
static java.lang.String[] splitBySeparatorQuoteAndParen(java.lang.String line, char sep)
          Split using a separator, but allow for the separator to occur in quoted strings or nested parentheses without splitting.
static java.lang.String[] splitSpaces(java.lang.String line)
          Divides a String by its whitespace.
static java.lang.String underliner(boolean starts, boolean ends, int start, int end)
          This is useful for generating jikes-style underlines for compiler errors and warnings.
static java.lang.String unescapeAnAcceptableFileName(java.lang.String s)
          Decodes a string encoded with encodeAsAnAcceptableFileName(String), which is to say it decodes any URL encoded string.
static java.lang.String webify(java.lang.String str)
          Convert the plain text string into something that is a valid HTML/XML text string (i.e. escaping angle brackets, etc.).
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

REL_EQ

public static final int REL_EQ
Equality relation

See Also:
Constant Field Values

REL_LT

public static final int REL_LT
Less than relation

See Also:
Constant Field Values

REL_GT

public static final int REL_GT
Greater than relation

See Also:
Constant Field Values

REL_LTEQ

public static final int REL_LTEQ
Less than or equals relation

See Also:
Constant Field Values

REL_GTEQ

public static final int REL_GTEQ
Greater than or equals relation

See Also:
Constant Field Values

REL_NEQ

public static final int REL_NEQ
Inequality relation

See Also:
Constant Field Values
Constructor Detail

StringHelp

public StringHelp()
Method Detail

underliner

public static java.lang.String underliner(boolean starts,
                                          boolean ends,
                                          int start,
                                          int end)
This is useful for generating jikes-style underlines for compiler errors and warnings. For example, if there is an unknown identifier foo in a statement, call underliner(true, true, first char index, last char index), and then print the underline after printing the offending line, as in:
    i += foo;
         <->
 
The first two arguments allow for multiline underlines. eg:
        <-----------------------------------------
    i = ((Descriptor) vec.elementAt (g).sameTypeAs
          (other);
          ------^
 

Parameters:
starts - Does the thing to be underlined start on this line?
ends - Does it end on this line?
start - The character offset of the first offending mark on this line
end - The last character in the error on this line.
Returns:
a string, such as "<-->", to be used to underline errors

padLeft

public static java.lang.String padLeft(int amount,
                                       java.lang.String S)
Returns the string S with amount space characters added to the front.

Parameters:
amount - Number of spaces to pad.
S - The string that requires padding.
Returns:
" "*amount + S

splitSpaces

public static java.lang.String[] splitSpaces(java.lang.String line)
Divides a String by its whitespace.

Parameters:
line - The String to divide.
Returns:
An Array of Strings taken from between the whitespace characters.

shorten

public static java.lang.String shorten(java.lang.String line)
Removes the first and last character of the string.

Parameters:
line - The String to be shortened
Returns:
The substring.

getExtraTextOutsideBrackets

public static java.lang.String getExtraTextOutsideBrackets(java.lang.String line)
Get the characters from a bracketed list that are not in brackets. Returns a string if the input contains characters outside the brackets, such as the ... in the following:
   ... [ foo ] ... [1.2 1.2] ... [hi] ...
 
If there isn't anything outside the brackets, it returns null.

Parameters:
line - The String to check.
Returns:
The text outside brackets, or null if there is no such text.

handleExtraTextOutsideBrackets

public static void handleExtraTextOutsideBrackets(java.lang.String line,
                                                  ErrorWriter err)
Prints a warning for text that should all be inside brackets but has some outside. It gets the text outside the brackets from line and, if there is any, prints it out as a warning.

Parameters:
line - The line that might have a problem.
err - The ErrorWriter that gets the warning, if there is one.
See Also:
splitByBrackets(String line)

splitByBrackets

public static java.lang.String[] splitByBrackets(java.lang.String line)
Extracts a list of all strings contained within brackets. Assumes list contains bracketed elements, as in:
   [ foo ]  [1.2 1.2] [hi]
 
This will return {" foo ", "1.2 1.2", "hi"}.

Parameters:
line - The data to be split.
Returns:
An Array containing the Strings inside the brackets.
Throws:
java.lang.IllegalArgumentException - if brackets are unbalanced
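
The following is a minimal sketch of the behavior described above, not the ViPER source. The class name BracketSplitSketch is hypothetical, and rejecting nested brackets as unbalanced is an assumption:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class BracketSplitSketch {
    /** Collects the text found between each '[' and its closing ']'. */
    static String[] splitByBrackets(String line) {
        List<String> parts = new ArrayList<>();
        int start = -1; // index just after the open '[', or -1 when outside brackets
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '[') {
                if (start >= 0) { // assumption: nesting counts as unbalanced
                    throw new IllegalArgumentException("unexpected '[' at index " + i);
                }
                start = i + 1;
            } else if (c == ']') {
                if (start < 0) {
                    throw new IllegalArgumentException("unmatched ']' at index " + i);
                }
                parts.add(line.substring(start, i));
                start = -1;
            }
        }
        if (start >= 0) {
            throw new IllegalArgumentException("missing ']'");
        }
        return parts.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // prints "[ foo , 1.2 1.2, hi]"
        System.out.println(Arrays.toString(splitByBrackets("[ foo ]  [1.2 1.2] [hi]")));
    }
}
```

Text outside the brackets is silently skipped in this sketch, which matches the documented return value for the sample input.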

hasUnmatchedBrackets

public static int hasUnmatchedBrackets(java.lang.String str)
Checks to see if the String has matching '[' and ']' characters.

Parameters:
str - The String in which to check the brackets.
Returns:
  • 0: Brackets match up.
  • -1: Too many left brackets.
  • 1: Missing right brackets

splitByCommas

public static java.lang.String[] splitByCommas(java.lang.String line)
Deprecated. If you must, use this line: splitBySeparator (removeBrackets (line), ",")

Converts a parenthesized list of comma-separated items into an array of Strings.

Parameters:
line - A String of the format (alpha, beta, gamma)
Returns:
An array {"alpha", "beta", "gamma"}

splitBySeparator

public static java.lang.String[] splitBySeparator(java.lang.String line,
                                                  char sep)
Split using a separator.

Parameters:
line - The String to be separated.
sep - The separator character, e.g. a comma
Returns:
An array of Strings containing the separated data.
See Also:
splitBySeparatorAndParen(String line, char sep)

splitBySeparatorAndParen

public static java.lang.String[] splitBySeparatorAndParen(java.lang.String line,
                                                          char sep)
Split using a separator, but allow for the separator to occur in nested parentheses without splitting. E.g.
     1, (2,3), 4
 
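would split into three elements, 1, (2,3) and 4, because the comma inside the parentheses is not treated as a separator.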

Parameters:
line - The String to be separated.
sep - The separator character, e.g. a comma
Returns:
An array of Strings containing the separated data.
See Also:
splitBySeparator(String line, char c)
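
A parenthesis-aware split can be sketched with a depth counter, as below. This is an illustration of the idea, not the ViPER implementation; the class name ParenSplitSketch is hypothetical, and trimming whitespace around each piece is an assumption:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ParenSplitSketch {
    /** Splits on sep, except where sep appears inside parentheses. */
    static String[] splitBySeparatorAndParen(String line, char sep) {
        List<String> parts = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int depth = 0; // number of currently open '(' characters
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '(') {
                depth++;
            } else if (c == ')' && depth > 0) {
                depth--;
            }
            if (c == sep && depth == 0) {
                parts.add(current.toString().trim()); // assumption: pieces are trimmed
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        parts.add(current.toString().trim());
        return parts.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // prints "[1, (2,3), 4]"
        System.out.println(Arrays.toString(splitBySeparatorAndParen("1, (2,3), 4", ',')));
    }
}
```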

splitBySeparatorQuoteAndParen

public static java.lang.String[] splitBySeparatorQuoteAndParen(java.lang.String line,
                                                               char sep)
Split using a separator, but allow for the separator to occur in quoted strings or nested parentheses without splitting.
   E.g.   "1", 2*("2,3"), "4"
      would split into
        -- "1"
        -- 2*("2,3")
        -- "4"
 
If the data has an odd number of " characters, a " character is appended to the end. To include a quote character without delimiting a string, use \". For a backslash, use \\.

Parameters:
line - the string to split
sep - the separator character, e.g. a comma
Returns:
the split string

isLegalIdentifier

public static boolean isLegalIdentifier(java.lang.String str)
Checks to see if a String is a legal identifier. Uses Java/C++ rules, except that it cannot start with an underscore.

Parameters:
str - The String to check.
Returns:
true if the String uses valid characters.

containsCommaSeparatedList

public static boolean containsCommaSeparatedList(java.lang.String s)
Checks if there are commas that aren't within quote marks. Thus, the following is a comma-separated list:
     HELLO, THERE, FINE
 
But the following isn't
   "hello, world"  "fine, thank you"
 
since the commas are between quotes. Note: This doesn't check for escape sequences, so it won't work on data that contains escaped quotes, e.g. "Say \"Hello,\" World."

Parameters:
s - The String to check.
Returns:
true if this looks like a comma-separated list.
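
The check described above can be sketched as a single pass that toggles an "inside quotes" flag. This is an assumed reconstruction for illustration only; the class name CommaListSketch is hypothetical:

```java
final class CommaListSketch {
    /** Returns true if a comma occurs outside double quotes. */
    static boolean containsCommaSeparatedList(String s) {
        boolean inQuotes = false;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '"') {
                inQuotes = !inQuotes; // escaped quotes are not recognized
            } else if (c == ',' && !inQuotes) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(containsCommaSeparatedList("HELLO, THERE, FINE")); // prints "true"
        System.out.println(containsCommaSeparatedList(
                "\"hello, world\"  \"fine, thank you\"")); // prints "false"
        // The escaped-quote input from the note above is misread by this sketch:
        System.out.println(containsCommaSeparatedList(
                "\"Say \\\"Hello,\\\" World.\"")); // prints "true"
    }
}
```

The last call shows the documented limitation: because the backslash is ignored, the escaped quote closes the string early and the comma is seen as a separator.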

isRelationalOperator

public static boolean isRelationalOperator(java.lang.String rel_op)
Will return true if the string passed to it is one of the defined relational operators. Right now those operators are <, >, <=, >=, ==, and !=.

Parameters:
rel_op - the string to test
Returns:
true if rel_op was a relational operator.

getRelationalOperatorEnum

public static int getRelationalOperatorEnum(java.lang.String rel_op)
Converts the string, e.g. == or <=, to the constant.

Parameters:
rel_op - the relational operator
Returns:
the static final int for the operator, e.g. REL_EQ

getRelationalOperatorString

public static java.lang.String getRelationalOperatorString(int rel_op_enum)
Gets the string for the relational operator.

Parameters:
rel_op_enum - the constant int for the operator
Returns:
the corresponding operator string
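
The mapping between operator strings and the REL_* constants can be pictured with a lookup table, as in the sketch below. The numeric values are hypothetical (the real ones are listed under Constant Field Values), the class name RelOpSketch is made up, and throwing on an unknown operator is an assumption about error handling:

```java
final class RelOpSketch {
    // Hypothetical values chosen so that each constant indexes into OPS.
    static final int REL_EQ = 0;
    static final int REL_LT = 1;
    static final int REL_GT = 2;
    static final int REL_LTEQ = 3;
    static final int REL_GTEQ = 4;
    static final int REL_NEQ = 5;

    private static final String[] OPS = {"==", "<", ">", "<=", ">=", "!="};

    private static int indexOf(String op) {
        for (int i = 0; i < OPS.length; i++) {
            if (OPS[i].equals(op)) {
                return i;
            }
        }
        return -1;
    }

    static boolean isRelationalOperator(String op) {
        return indexOf(op) >= 0;
    }

    static int getRelationalOperatorEnum(String op) {
        int i = indexOf(op);
        if (i < 0) { // assumption: unknown operators are rejected
            throw new IllegalArgumentException("not a relational operator: " + op);
        }
        return i;
    }

    static String getRelationalOperatorString(int relOpEnum) {
        return OPS[relOpEnum];
    }

    public static void main(String[] args) {
        int code = getRelationalOperatorEnum("<=");
        System.out.println(code == REL_LTEQ);                  // prints "true"
        System.out.println(getRelationalOperatorString(code)); // prints "<="
    }
}
```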

isLogicalOperator

public static boolean isLogicalOperator(java.lang.String log_op)
Will return true if the string passed to it is one of the defined logical operators. So far, these are && and ||.

Parameters:
log_op - the string to test
Returns:
true if log_op was a logical operator.

removeSpacesFrom

public static java.lang.String removeSpacesFrom(java.lang.String candidate)
Will remove whitespace chars from the string.

Parameters:
candidate - String with whitespace characters.
Returns:
The String passed in, with its whitespace characters removed.

backslashify

public static java.lang.String backslashify(java.lang.String str)
Converts special characters to escape sequences, e.g. a newline becomes \n.

Parameters:
str - The string containing control characters that should be expanded
Returns:
The expanded String.
See Also:
debackslashify(String str)

debackslashedTokenizer

public static java.util.Iterator debackslashedTokenizer(java.lang.String str)
Tokenizes like the bash command line.

Parameters:
str - the string to tokenize
Returns:
an iterator over all tokens in the string

getQuotedListString

public static java.lang.String getQuotedListString(java.util.List list)
Returns the list as a comma-separated list of quoted and escaped strings.

Parameters:
list - the list to print as a string
Returns:
a string of quoted, escaped, comma delimited items

getQuotedList

public static java.util.List getQuotedList(java.lang.String str)
                                    throws BadDataException
Converts a comma-separated list of quoted strings to a list of debackslashified strings. For example: "one", "two", "three" will convert to a list containing three strings, each stripped of quotes.

Parameters:
str - The string containing a list.
Returns:
A list of strings
Throws:
BadDataException
See Also:
debackslashify(String str)

debackslashify

public static java.lang.String debackslashify(java.lang.String str)
                                       throws java.lang.IllegalArgumentException
Converts escape sequences to their real characters, like new lines. For example, the string:
 I went to \"http:\\\\java.sun.com\\\",\nbut they didn't sell coffee.
 
becomes:
 I went to "http:\\java.sun.com\",
 but they didn't sell coffee.
 

Parameters:
str - The string containing C-style control characters.
Returns:
The string with the control characters turned into actual characters.
Throws:
java.lang.IllegalArgumentException - if the last character is a backslash (e.g. "some\")
See Also:
backslashify(String str)
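
The unescaping step can be sketched as follows. This is not the ViPER source: the class name DebackslashSketch is hypothetical, and the set of recognized escapes (\n, \t, \r, with any other escaped character passed through unchanged) is an assumption:

```java
final class DebackslashSketch {
    /** Replaces backslash escapes with the characters they stand for. */
    static String debackslashify(String str) {
        StringBuilder out = new StringBuilder(str.length());
        for (int i = 0; i < str.length(); i++) {
            char c = str.charAt(i);
            if (c != '\\') {
                out.append(c);
                continue;
            }
            if (++i == str.length()) {
                throw new IllegalArgumentException("string ends with a lone backslash");
            }
            char escaped = str.charAt(i);
            switch (escaped) {
                case 'n': out.append('\n'); break;
                case 't': out.append('\t'); break;
                case 'r': out.append('\r'); break;
                default:  out.append(escaped); // covers \" and \\ (others: assumption)
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The Java literal below is the text: say \"hi\"\nbye
        System.out.println(debackslashify("say \\\"hi\\\"\\nbye"));
        // prints:
        // say "hi"
        // bye
    }
}
```

Applied to the sample input shown above, this sketch produces the documented result: each \\ collapses to a single backslash, \" becomes a quote, and \n becomes a line break.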

webify

public static java.lang.String webify(java.lang.String str)
Convert the plain text string into something that is a valid HTML/XML text string (i.e. escaping angle brackets, etc.). This owes a lot to the Apache project, whose source code was a great help. It is annoying that they don't have this as a public method somewhere, though.

Parameters:
str - the string to escape
Returns:
the string, with some characters converted to XML character entity references
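
A minimal sketch of this kind of escaping is shown below. Which characters the real method converts is not stated here, so the set used (&, <, > and the double quote) is an assumption, and the class name WebifySketch is hypothetical:

```java
final class WebifySketch {
    /** Replaces characters that are special in HTML/XML text with entity references. */
    static String webify(String str) {
        StringBuilder out = new StringBuilder(str.length() + 16);
        for (int i = 0; i < str.length(); i++) {
            char c = str.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                default:  out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // prints "a &lt; b &amp;&amp; c &gt; &quot;d&quot;"
        System.out.println(webify("a < b && c > \"d\""));
    }
}
```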

banner

public static java.lang.String banner(java.lang.String text,
                                      int width)
Generates an asterisk-surrounded banner containing the given text. For example, banner("Hello, world!",12) generates:
 ***********
 *  Hello, *
 *  world! *
 ***********
 

Parameters:
text - the text to display in the banner
width - the target width of the banner, in characters
Returns:
a banner describing the text, with a newline at the end

isXMLFormat

public static boolean isXMLFormat(java.lang.String configFileName,
                                  java.lang.String dataFileName)
                           throws java.io.IOException
Determines by file content if the data is in XML format. Specialized version for gtf data. It looks for the processing directive.

Parameters:
configFileName - The config file name. XML files only have the data file, so if this is not null, this method returns false.
dataFileName - The name of the data file, the file that is checked for the processing directive
Returns:
true if the data is XML. This does not validate the data.
Throws:
java.io.IOException - if there was a problem with the data file.

isXMLFormat

public static boolean isXMLFormat(java.lang.String fileName)
                           throws java.io.IOException
Checks to see if the file begins with an xml processing directive, eg <?xml?>. This method does not check to see that the file is well-formed, or even if the processing directive is good, just that the first non-whitespace characters are "<?xml".

Parameters:
fileName - The file to check for xml processing directive
Returns:
true if the directive was found.
Throws:
java.io.IOException - if there is an error while reading the file, eg FileNotFoundException

isXMLFormat

public static boolean isXMLFormat(java.io.File f)
                           throws java.io.IOException
Checks to see if the file begins with an xml processing directive, eg <?xml?>. This method does not check to see that the file is well-formed, or even if the processing directive is good, just that the first non-whitespace characters are "<?xml".

Parameters:
f - The file to check for xml processing directive
Returns:
true if the directive was found.
Throws:
java.io.IOException - if there is an error while reading the file, eg FileNotFoundException

isXMLFormat

public static boolean isXMLFormat(java.io.InputStream f)
                           throws java.io.IOException
Checks to see if the stream begins with an xml processing directive, eg <?xml?>. This method does not check to see that the stream is well-formed, or even if the processing directive is good, just that the first non-whitespace characters are "<?xml".

Parameters:
f - The stream to check for the XML processing directive
Returns:
true if the directive was found.
Throws:
java.io.IOException - if there is an error while reading the stream
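
The check described above, skipping leading whitespace and then comparing against "<?xml", can be sketched as follows. This is an illustration rather than the ViPER code; the class name XmlSniffSketch is hypothetical, and the sketch assumes an ASCII-compatible encoding with no byte order mark:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

final class XmlSniffSketch {
    private static final String MARKER = "<?xml";

    /** Returns true if the first non-whitespace bytes of the stream are "<?xml". */
    static boolean isXMLFormat(InputStream in) throws IOException {
        int b = in.read();
        while (b != -1 && Character.isWhitespace(b)) {
            b = in.read(); // skip leading whitespace
        }
        for (int i = 0; i < MARKER.length(); i++) {
            if (b != MARKER.charAt(i)) {
                return false; // mismatch or end of stream
            }
            if (i + 1 < MARKER.length()) {
                b = in.read();
            }
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        byte[] xml = "\n  <?xml version=\"1.0\"?><viper/>".getBytes(StandardCharsets.US_ASCII);
        byte[] other = "#BEGIN_CONFIG".getBytes(StandardCharsets.US_ASCII);
        System.out.println(isXMLFormat(new ByteArrayInputStream(xml)));   // prints "true"
        System.out.println(isXMLFormat(new ByteArrayInputStream(other))); // prints "false"
    }
}
```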

encodeAsAnAcceptableFileName

public static java.lang.String encodeAsAnAcceptableFileName(java.lang.String s)
Encode a string to the "x-www-form-urlencoded" form, enhanced with the UTF-8-in-URL proposal. This is modified from the W3C's code, at http://www.w3.org/International/O-URL-code.html, in that it is more conservative in its encoding: it also encodes path separators, tildes, parens, bangs and single quotes.

Parameters:
s - The string to be encoded
Returns:
The encoded string

unescapeAnAcceptableFileName

public static java.lang.String unescapeAnAcceptableFileName(java.lang.String s)
Decodes a string encoded with encodeAsAnAcceptableFileName(String), which is to say it decodes any URL encoded string.

Parameters:
s - a URL-encoded string
Returns:
the decoded string
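
For comparison, the round trip can be approximated with the standard java.net.URLEncoder and java.net.URLDecoder classes, which also percent-encode path separators, tildes, parentheses, exclamation marks and single quotes. This is a stand-in, not the ViPER encoder, and its output may differ from encodeAsAnAcceptableFileName; the class name FileNameCodecSketch is hypothetical:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

final class FileNameCodecSketch {
    /** URL-encodes s as UTF-8 using the standard library. */
    static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 should always be available", e);
        }
    }

    /** Reverses encode(String). */
    static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 should always be available", e);
        }
    }

    public static void main(String[] args) {
        String name = "runs/2004 (final)!.xml";
        String encoded = encode(name);
        System.out.println(encoded);
        System.out.println(decode(encoded).equals(name)); // prints "true"
    }
}
```

Note that URLEncoder leaves the characters '.', '-', '*' and '_' unencoded, so a stricter scheme is needed where '*' is not allowed in file names.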