Class DB

java.lang.Object
com.articulate.sigma.DB

public class DB extends Object
A class to interface with databases and database-like formats, such as spreadsheets.
  • Field Details

  • Constructor Details

    • DB

      public DB()
  • Method Details

    • printTPTPDataInCSV

      public Map printTPTPDataInCSV(Map byProver) throws IOException
      Print statistics in a summary form for TPTP test run data
      Throws:
      IOException
    • resortTPTPData

      public Map resortTPTPData(Map stats) throws IOException
      Reorganize statistics in a summary form for TPTP test run data
      Throws:
      IOException
    • processTPTPData

      public Map processTPTPData() throws IOException
      Read statistics for TPTP test run data
      Throws:
      IOException
    • generateDB

      public void generateDB(KB kb)
      Generate an SQL database from the knowledge base. Tables must be defined as instances of &%DatabaseTable and must have &%localDocumentation and &%HasDatabaseColumn relations.
    • readSpreadsheet

      public static List<List<String>> readSpreadsheet(Reader inReader, List<String> lineStartTokens, boolean quote, char delimiter)
      Parse the input from a Reader for a CSV file into an ArrayList of ArrayLists. If lineStartTokens is a non-empty list, all lines not starting with one of the String tokens it contains will be concatenated. ';' denotes a comment line, which is skipped.
      Parameters:
      inReader - A reader for the file to be processed
      lineStartTokens - If a List containing String tokens, all lines not starting with one of the tokens will be concatenated
      quote - signifies whether to retain quotes in elements
      Returns:
      An ArrayList of ArrayLists
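      The line-joining and comment rules described above can be illustrated with a minimal stand-alone sketch. It is not the Sigma implementation: the class CsvReadSketch and its helper methods are hypothetical, and two details are assumptions, namely that a double quote toggles a quoted region in which the delimiter is literal, and that a continuation line is appended to the previous line without any separator.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-alone sketch; not the Sigma implementation. */
class CsvReadSketch {

    /** Split one logical line on the delimiter, treating quoted delimiters as literal. */
    static List<String> splitLine(String line, boolean keepQuotes, char delimiter) {
        List<String> cells = new ArrayList<>();
        StringBuilder cell = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '"') {
                inQuotes = !inQuotes;              // assumption: quotes simply toggle
                if (keepQuotes) cell.append(c);
            } else if (c == delimiter && !inQuotes) {
                cells.add(cell.toString());
                cell.setLength(0);
            } else {
                cell.append(c);
            }
        }
        cells.add(cell.toString());
        return cells;
    }

    static boolean startsWithToken(String line, List<String> tokens) {
        for (String t : tokens)
            if (line.startsWith(t)) return true;
        return false;
    }

    /** Read rows, skipping ';' comment lines and joining continuation lines. */
    static List<List<String>> read(Reader in, List<String> lineStartTokens,
                                   boolean keepQuotes, char delimiter) {
        boolean joining = lineStartTokens != null && !lineStartTokens.isEmpty();
        List<String> logical = new ArrayList<>();
        try (BufferedReader br = new BufferedReader(in)) {
            String line;
            while ((line = br.readLine()) != null) {
                if (line.startsWith(";")) continue;            // comment line
                if (joining && !logical.isEmpty()
                        && !startsWithToken(line, lineStartTokens)) {
                    int last = logical.size() - 1;             // continuation line
                    logical.set(last, logical.get(last) + line);
                } else {
                    logical.add(line);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        List<List<String>> rows = new ArrayList<>();
        for (String l : logical) rows.add(splitLine(l, keepQuotes, delimiter));
        return rows;
    }

    public static void main(String[] args) {
        String csv = "; a comment line\nR1,a,\"b,c\"\nR2,d\n,e\n";
        System.out.println(read(new StringReader(csv), List.of("R"), false, ','));
    }
}
```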
    • readSpreadsheet

      public static List<List<String>> readSpreadsheet(String fname, List lineStartTokens, boolean quote, char delimiter)
      Parse a CSV file into an ArrayList of ArrayLists. If lineStartTokens is a non-empty list, all lines not starting with one of the String tokens it contains will be concatenated.
      Parameters:
      fname - The pathname of the CSV file to be processed
      lineStartTokens - If a List containing String tokens, all lines not starting with one of the tokens will be concatenated
      quote - signifies whether to retain quotes in elements
      Returns:
      An ArrayList of ArrayLists
    • readSpreadsheet

      public static List<List<String>> readSpreadsheet(String fname, List lineStartTokens, boolean quote)
    • writeSpreadsheetLine

      public static String writeSpreadsheetLine(List<String> al, boolean quote)
      Parameters:
      quote - signifies whether to quote entries from the spreadsheet
    • writeSpreadsheet

      public static String writeSpreadsheet(List<List<String>> values, boolean quote)
      Parameters:
      quote - signifies whether to quote entries from the spreadsheet
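      As a rough illustration of the quote flag in the two write methods, the sketch below joins cells with commas and optionally wraps each cell in double quotes. The class CsvWriteSketch is hypothetical; the comma delimiter, the doubling of embedded quotes, and the trailing newline per row are assumptions, not documented behavior of DB.

```java
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical stand-alone sketch; not the Sigma implementation. */
class CsvWriteSketch {

    /** Join one row; when quote is true, wrap each cell in double quotes. */
    static String writeLine(List<String> cells, boolean quote) {
        return cells.stream()
                .map(c -> quote ? "\"" + c.replace("\"", "\"\"") + "\"" : c)
                .collect(Collectors.joining(","));
    }

    /** Join all rows, one per line (trailing newline is an assumption). */
    static String write(List<List<String>> rows, boolean quote) {
        StringBuilder sb = new StringBuilder();
        for (List<String> row : rows)
            sb.append(writeLine(row, quote)).append('\n');
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(write(List.of(List.of("name", "city"),
                                       List.of("Ann", "San Jose")), true));
    }
}
```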
    • readDataInterchangeFormatFile

      public static List<List> readDataInterchangeFormatFile(Reader inReader)
      Parse an input stream Reader from a Data Interchange Format (.dif) file into an ArrayList of ArrayLists.
      Parameters:
      inReader - A reader created from the .dif file to be processed
      Returns:
      An ArrayList of ArrayLists
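      For orientation, the following sketch shows one way a .dif data section can be turned into rows. It assumes the commonly described DIF layout: a header of three-line items ending with a DATA item, followed by two-line data items in which "-1,0" plus BOT starts a row, "-1,0" plus EOD ends the data, "1,0" is followed by a quoted string, and "0,n" carries a number n. The class DifSketch is hypothetical, takes the file content as a String rather than a Reader, and is not the Sigma implementation.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-alone sketch; not the Sigma implementation. */
class DifSketch {

    static String stripQuotes(String s) {
        if (s.length() >= 2 && s.startsWith("\"") && s.endsWith("\""))
            return s.substring(1, s.length() - 1);
        return s;
    }

    static List<List<String>> parse(String dif) {
        String[] lines = dif.split("\\R");
        int i = 0;
        // Skip the header up to and including the three-line DATA item.
        while (i < lines.length && !lines[i].trim().equals("DATA")) i++;
        i += 3;
        List<List<String>> rows = new ArrayList<>();
        List<String> row = null;
        while (i + 1 < lines.length) {
            String typeLine = lines[i].trim();       // e.g. "1,0" or "0,42"
            String valueLine = lines[i + 1].trim();  // e.g. "\"text\"", "V", "BOT"
            i += 2;
            if (typeLine.startsWith("-1,")) {
                if (valueLine.equals("EOD")) break;  // end of data
                if (valueLine.equals("BOT")) {       // beginning of a tuple (row)
                    row = new ArrayList<>();
                    rows.add(row);
                }
            } else if (row != null && typeLine.startsWith("1,")) {
                row.add(stripQuotes(valueLine));     // string cell
            } else if (row != null && typeLine.startsWith("0,")) {
                row.add(typeLine.substring(2));      // numeric cell, kept as text
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        String dif = "TABLE\n0,1\n\"t\"\nDATA\n0,0\n\"\"\n"
                + "-1,0\nBOT\n1,0\n\"a\"\n0,10\nV\n-1,0\nEOD\n";
        System.out.println(parse(dif));
    }
}
```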
    • readDataInterchangeFormatFile

      public static List<List> readDataInterchangeFormatFile(String fname)
      Parse and load a Data Interchange Format (.dif) file into an ArrayList of ArrayLists.
      Parameters:
      fname - The pathname of the file to be processed
      Returns:
      An ArrayList of ArrayLists
    • writeSuoKifStatements

      public static int writeSuoKifStatements(Set statements, PrintWriter pw)
    • writeSuoKifStatements

      public static int writeSuoKifStatements(KB kb, String sourceFilePath)
      Writes to sourceFilePath all Formulae in kb that have sourceFilePath as source file.
      Parameters:
      kb - The KB from which Formulae will be written
      sourceFilePath - The canonical pathname of the file to which Formulae will be written
      Returns:
      An int denoting the number of expressions saved to the file named by sourceFilePath
    • printSpreadsheet

      public void printSpreadsheet(Map rows, List relations)
      Print a comma-delimited matrix. The values of the rows are TreeMaps, whose values in turn are Strings. The ArrayList of relations forms the column headers, which are Strings.
      Parameters:
      rows - the matrix
      relations - the relations that form the column header
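      The matrix layout described above might be rendered along the following lines. This is only a sketch: the class MatrixSketch is hypothetical, it returns a String instead of printing, and the placement of the row key in the first column (with an empty header cell above it) and the empty cell for a missing relation are assumptions.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical stand-alone sketch; not the Sigma implementation. */
class MatrixSketch {

    static String format(Map<String, TreeMap<String, String>> rows,
                         List<String> relations) {
        StringBuilder sb = new StringBuilder();
        // Header: empty cell over the row-name column, then one cell per relation.
        sb.append(',').append(String.join(",", relations)).append('\n');
        for (Map.Entry<String, TreeMap<String, String>> e : rows.entrySet()) {
            sb.append(e.getKey());
            for (String rel : relations)
                sb.append(',').append(e.getValue().getOrDefault(rel, ""));
            sb.append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, TreeMap<String, String>> rows = new TreeMap<>();
        rows.computeIfAbsent("Cat", k -> new TreeMap<>()).put("subclass", "Feline");
        rows.computeIfAbsent("Dog", k -> new TreeMap<>()).put("subclass", "Canine");
        System.out.print(format(rows, List.of("subclass", "color")));
    }
}
```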
    • exportTable

      public void exportTable(KB kb)
      Export a comma-delimited table of all the ground binary statements in the knowledge base. Only the relations that are actually used are included in the header.
      Parameters:
      kb - The knowledge base.
    • wordWrap

      public static String wordWrap(String input, int length)
    • emptyString

      public static boolean emptyString(String input)
    • RearDBtoKIF

      public static void RearDBtoKIF()
    • parseCuisines

      public static String parseCuisines(String cuisine, String RST_RESTAURANTNAME, String RST_RESTAURANTID)
    • topSUMOInReviews

      public static List<com.articulate.sigma.utils.AVPair> topSUMOInReviews(List<Hotel> reviews)
      Excludes cases where the mapping is to multiple SUMO terms.
      Returns:
      a list of attribute value pairs where the count is in the attribute and the SUMO term is the value
    • wordSensesInReviews

      public static Map<String,Integer> wordSensesInReviews(List<Hotel> reviews)
      Returns:
      a map of all the word senses used in the reviews and a count of their appearances
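      The counting described for wordSensesInReviews, and the ranking implied by topSUMOInReviews, can be sketched as below. The class SenseCountSketch is hypothetical and not the Sigma implementation. Because the Hotel and AVPair classes are not documented here, the sketch assumes each review has already been reduced to a list of term strings, and it returns map entries (term as key, count as value) rather than AVPair objects.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical stand-alone sketch; not the Sigma implementation. */
class SenseCountSketch {

    /** Count how often each term appears across all reviews. */
    static Map<String, Integer> count(List<List<String>> termsPerReview) {
        Map<String, Integer> counts = new TreeMap<>();
        for (List<String> terms : termsPerReview)
            for (String t : terms)
                counts.merge(t, 1, Integer::sum);
        return counts;
    }

    /** Order the counted terms from most to least frequent. */
    static List<Map.Entry<String, Integer>> ranked(Map<String, Integer> counts) {
        List<Map.Entry<String, Integer>> list = new ArrayList<>(counts.entrySet());
        list.sort(Map.Entry.<String, Integer>comparingByValue().reversed());
        return list;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts =
                count(List.of(List.of("Hotel", "Food"), List.of("Food")));
        System.out.println(ranked(counts));
    }
}
```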
    • SUMOReviews

      public static void SUMOReviews(List<Hotel> reviews)
      Process the reviews, setting the SUMO list of each review as a side effect.
    • disambigReviews

      public static void disambigReviews(List<Hotel> hotels)
      Parameters:
      hotels - an ArrayList of Hotel with reviews as text. Synset values are set as a side effect.
    • processTimeDate

      public static String processTimeDate(String timeDate)
    • readStateAbbrevs

      public static Map<String,String> readStateAbbrevs()
    • fill

      public static List<String> fill(String value, int count)
    • DiningDBImport

      public static void DiningDBImport()
    • getWordSenses

      public static List<String> getWordSenses(List<String> al)
      Returns:
      a list of SUMO terms that are the best guess at classes for each word
    • getFoodWordSenses

      public static List<String> getFoodWordSenses(List<String> al)
      Returns:
      a list of SUMO terms that are the best guess at classes for each word
    • parseRest

      public static Set<String> parseRest(String menu, String placename, String price, String address, String latitude, String longitude, String phone)
    • getAllRest

      public static Set<String> getAllRest()
    • geocode

      public static List<String> geocode(String address)
      Call Google's geocode API to convert an address string into a lat/lon, which is returned as an ArrayList of two String elements containing a real-number format latitude and longitude.
    • printTopSUMOInReviews

      public static String printTopSUMOInReviews(List<com.articulate.sigma.utils.AVPair> topSUMO)
    • parseOneRestFile

      public static Set<String> parseOneRestFile(String fname)
      A test method.
      Parameters:
      fname - has no file extension or directory
    • readStopConceptArray

      public static void readStopConceptArray()
      Fill out from a CSV file a set of concepts that should be ignored during content extraction. Has a side effect on the static variable "stopConcept".
    • readSentimentArray

      public static void readSentimentArray()
      Fill out from a CSV file a map of word keys, and values broken down by POS, listing whether it's a positive or negative word. The interior hash map keys are type, POS, stemmed, polarity.
    • computeSentiment

      public static int computeSentiment(String input)
      Calculate an integer sentiment value for a string of words.
    • computeSentimentForWord

      public static int computeSentimentForWord(String word)
      Find the sentiment value for a given word, after finding the root form of the word.
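      A lexicon-based reading of computeSentiment and computeSentimentForWord could look like the sketch below: each word is normalized, looked up in a polarity table, and the per-word values are summed. Everything here is illustrative. The class SentimentSketch, the four-word lexicon, and its scores are hypothetical, and the "root form" step is stubbed as lower-casing plus punctuation removal, whereas the real method presumably uses a proper morphological lookup.

```java
import java.util.Locale;
import java.util.Map;

/** Hypothetical stand-alone sketch; not the Sigma implementation. */
class SentimentSketch {

    // Illustrative lexicon only; real polarity data comes from a CSV file.
    static final Map<String, Integer> LEXICON =
            Map.of("good", 1, "great", 2, "bad", -1, "awful", -2);

    /** Stand-in for root-form lookup: lower-case and drop non-letters. */
    static String rootForm(String word) {
        return word.toLowerCase(Locale.ROOT).replaceAll("[^a-z]", "");
    }

    /** Sentiment of a single word; unknown words score 0. */
    static int forWord(String word) {
        return LEXICON.getOrDefault(rootForm(word), 0);
    }

    /** Sentiment of a string of words: the sum of the per-word values. */
    static int forText(String input) {
        int total = 0;
        for (String w : input.trim().split("\\s+"))
            total += forWord(w);
        return total;
    }

    public static void main(String[] args) {
        System.out.println(forText("Great food, awful service, good view!"));
    }
}
```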
    • addConceptSentimentScores

      public static Map<String,Integer> addConceptSentimentScores(Map<String,Integer> scores, String SUMOs, int total)
      Add new scores to existing scores. Note the side effect on scores.
      Returns:
      a map of concept keys and integer sentiment score values
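      The side effect noted above can be shown with a small sketch that accumulates a score into an existing map and returns that same map. The class ScoreMergeSketch is hypothetical and not the Sigma implementation. In particular, the format of the SUMOs argument is not documented here, so the sketch assumes it holds whitespace-separated concept names and that total is added to each of them.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-alone sketch; not the Sigma implementation. */
class ScoreMergeSketch {

    /** Add total to the score of every concept named in sumos; mutates scores. */
    static Map<String, Integer> add(Map<String, Integer> scores, String sumos, int total) {
        if (sumos == null || sumos.trim().isEmpty())
            return scores;
        for (String concept : sumos.trim().split("\\s+"))   // assumed separator
            scores.merge(concept, total, Integer::sum);
        return scores;                                       // same instance as the argument
    }

    public static void main(String[] args) {
        Map<String, Integer> scores = new HashMap<>();
        scores.put("Hotel", 1);
        add(scores, "Hotel Food", 2);
        System.out.println(scores.get("Hotel") + " " + scores.get("Food"));
    }
}
```

      Returning the mutated argument keeps call sites short, but callers that need the original scores should copy the map first.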
    • computeConceptSentimentFromFile

      public static Map<String,Integer> computeConceptSentimentFromFile(String filename)
      Associate individual concepts with a sentiment score
      Returns:
      a map of concept keys and integer sentiment score values
    • computeConceptSentiment

      public static Map<String,Integer> computeConceptSentiment(String input)
      Associate individual concepts with a sentiment score
      Returns:
      a map of concept keys and integer sentiment score values
    • readAmenities

      public static void readAmenities()
    • textSentimentByPeriod

      public static void textSentimentByPeriod()
    • textSentiment

      public static void textSentiment()
    • textFileSentiment

      public static void textFileSentiment(String fname, boolean neg)
      Compute sentiment for each line of a text file and output as CSV.
    • testSentiment

      public static void testSentiment()
    • testSentimentCorpus

      public static void testSentimentCorpus()
    • guessGender

      public static void guessGender(String fname)
    • main

      public static void main(String[] args)
      A test method.