Package com.articulate.sigma
Class DB
java.lang.Object
com.articulate.sigma.DB
A class to interface with databases and database-like formats,
such as spreadsheets.
-
Field Summary
Fields -
Constructor Summary
Constructors -
Method Summary
Modifier and Type | Method | Description
static Map<String,Integer> | addConceptSentimentScores(Map<String, Integer> scores, String SUMOs, int total) | Add new scores to existing scores.
 | computeConceptSentiment(String input) | Associate individual concepts with a sentiment score.
 | computeConceptSentimentFromFile(String filename) | Associate individual concepts with a sentiment score.
static int | computeSentiment(String input) | Calculate an integer sentiment value for a string of words.
static int | computeSentimentForWord | Find the sentiment value for a given word, after finding the root form of the word.
static void | DiningDBImport() |
static void | disambigReviews(List<Hotel> hotels) |
static boolean | emptyString(String input) |
void | exportTable(KB kb) | Export a comma-delimited table of all the ground binary statements in the knowledge base.
 | fill |
void | generateDB(KB kb) | Generate an SQL database from the knowledge base. Tables must be defined as instances of &%DatabaseTable and must have &%localDocumentation and &%HasDatabaseColumn relations.
 | geocode | Call Google's geocode API to convert an address string into a lat/lon, which is returned as an ArrayList of two String elements containing a real-number format latitude and longitude.
 | getAllRest |
 | getFoodWordSenses(List<String> al) |
 | getWordSenses(List<String> al) |
static void | guessGender(String fname) |
static void | main | A test method.
static String | parseCuisines(String cuisine, String RST_RESTAURANTNAME, String RST_RESTAURANTID) |
 | parseOneRestFile(String fname) | A test method.
 | parseRest(String menu, String placename, String price, String address, String latitude, String longitude, String phone) |
void | printSpreadsheet(Map rows, List relations) | Print a comma-delimited matrix.
static String | printTopSUMOInReviews(List<com.articulate.sigma.utils.AVPair> topSUMO) |
 | printTPTPDataInCSV(Map byProver) | Print statistics in a summary form for TPTP test run data.
static String | processTimeDate(String timeDate) |
 | processTPTPData | Read statistics for TPTP test run data.
static void | readAmenities() |
 | readDataInterchangeFormatFile(Reader inReader) | Parse an input stream Reader from a Data Interchange Format (.dif) file into an ArrayList of ArrayLists.
 | readDataInterchangeFormatFile(String fname) | Parse and load a Data Interchange Format (.dif) file into an ArrayList of ArrayLists.
static void | readSentimentArray() | Fill out from a CSV file a map of word keys, and values broken down by POS, listing whether it's a positive or negative word. Interior hash map keys are type, POS, stemmed, polarity.
static List<List<String>> | readSpreadsheet(Reader inReader, List<String> lineStartTokens, boolean quote, char delimiter) | Parse the input from a Reader for a CSV file into an ArrayList of ArrayLists.
 | readSpreadsheet(String fname, List lineStartTokens, boolean quote) |
static List<List<String>> | readSpreadsheet(String fname, List lineStartTokens, boolean quote, char delimiter) | Parse a CSV file into an ArrayList of ArrayLists.
 | readStateAbbrevs |
static void | readStopConceptArray() | Fill out from a CSV file a set of concepts that should be ignored during the content extraction process. Side effect on static variable "stopConcept".
static void | RearDBtoKIF() |
 | resortTPTPData(Map stats) | Reorganize statistics in a summary form for TPTP test run data.
static void | SUMOReviews(List<Hotel> reviews) | Process reviews, with a side effect on reviews of setting the SUMO list.
static void | testSentiment() |
static void | testSentimentCorpus() |
static void | textFileSentiment(String fname, boolean neg) | Compute sentiment for each line of a text file and output as CSV.
static void | textSentiment() |
static void | textSentimentByPeriod() |
static List<com.articulate.sigma.utils.AVPair> | topSUMOInReviews(List<Hotel> reviews) | Excludes cases where the mapping is to multiple SUMO terms.
 | wordSensesInReviews(List<Hotel> reviews) |
static String | wordWrap |
static String | writeSpreadsheet(List<List<String>> values, boolean quote) |
static String | writeSpreadsheetLine(List<String> al, boolean quote) |
static int | writeSuoKifStatements(KB kb, String sourceFilePath) | Writes to sourceFilePath all Formulae in kb that have sourceFilePath as source file.
static int | writeSuoKifStatements(Set statements, PrintWriter pw) |
-
Field Details
-
sentiment
-
amenityTerms
-
stopConcepts
-
-
Constructor Details
-
DB
public DB()
-
-
Method Details
-
printTPTPDataInCSV
Print statistics in a summary form for TPTP test run data.
- Throws:
IOException
-
resortTPTPData
Reorganize statistics in a summary form for TPTP test run data.
- Throws:
IOException
-
processTPTPData
Read statistics for TPTP test run data.
- Throws:
IOException
-
generateDB
Generate an SQL database from the knowledge base. Tables must be defined as instances of &%DatabaseTable and must have &%localDocumentation and &%HasDatabaseColumn relations.
-
readSpreadsheet
public static List<List<String>> readSpreadsheet(Reader inReader, List<String> lineStartTokens, boolean quote, char delimiter)
Parse the input from a Reader for a CSV file into an ArrayList of ArrayLists. If lineStartTokens is a non-empty list, all lines not starting with one of the String tokens it contains will be concatenated. ';' denotes a comment line, which will be skipped.
- Parameters:
inReader- A reader for the file to be processed
lineStartTokens- If a List containing String tokens, all lines not starting with one of the tokens will be concatenated
quote- signifies whether to retain quotes in elements
- Returns:
- An ArrayList of ArrayLists
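The line-joining rule described for readSpreadsheet can be illustrated with a small standalone sketch. It is not the Sigma implementation: the class name `SpreadsheetReadSketch` and method `read` are hypothetical, a non-matching line is assumed to be appended to the previous record, and quote handling is left out entirely.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical stand-in illustrating the documented behavior; not Sigma code. */
class SpreadsheetReadSketch {

    /** True if the line begins with any of the given tokens. */
    private static boolean startsWithToken(String line, List<String> tokens) {
        for (String t : tokens) {
            if (line.startsWith(t)) return true;
        }
        return false;
    }

    static List<List<String>> read(Reader in, List<String> lineStartTokens, char delimiter) {
        boolean join = lineStartTokens != null && !lineStartTokens.isEmpty();
        List<String> records = new ArrayList<>();
        try (BufferedReader br = new BufferedReader(in)) {
            String line;
            while ((line = br.readLine()) != null) {
                if (line.startsWith(";")) continue; // comment line, skipped
                if (join && !records.isEmpty() && !startsWithToken(line, lineStartTokens)) {
                    // Assumption: a continuation line is appended to the previous record.
                    int last = records.size() - 1;
                    records.set(last, records.get(last) + line);
                } else {
                    records.add(line);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Naive split on the delimiter; quoted cells containing the delimiter are not handled.
        String regex = Pattern.quote(String.valueOf(delimiter));
        List<List<String>> rows = new ArrayList<>();
        for (String r : records) {
            rows.add(new ArrayList<>(Arrays.asList(r.split(regex, -1))));
        }
        return rows;
    }
}
```

With tokens "A" and "B", a line such as ",cont" that follows "B,2" would be folded into that record before splitting.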
-
readSpreadsheet
public static List<List<String>> readSpreadsheet(String fname, List lineStartTokens, boolean quote, char delimiter)
Parse a CSV file into an ArrayList of ArrayLists. If lineStartTokens is a non-empty list, all lines not starting with one of the String tokens it contains will be concatenated.
- Parameters:
fname- The pathname of the CSV file to be processed
lineStartTokens- If a List containing String tokens, all lines not starting with one of the tokens will be concatenated
quote- signifies whether to retain quotes in elements
- Returns:
- An ArrayList of ArrayLists
-
readSpreadsheet
-
writeSpreadsheetLine
- Parameters:
quote- signifies whether to quote entries from the spreadsheet
-
writeSpreadsheet
- Parameters:
quote- signifies whether to quote entries from the spreadsheet
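As a rough illustration of what the quote flag on writeSpreadsheetLine and writeSpreadsheet might control, the sketch below joins cells with commas and optionally wraps each one in double quotes. The class and method names are hypothetical, and the comma delimiter, the doubling of embedded quotes, and the newline terminator are assumptions rather than documented Sigma behavior.

```java
import java.util.List;
import java.util.StringJoiner;

/** Hypothetical sketch of quoted versus unquoted CSV output; not Sigma code. */
class SpreadsheetWriteSketch {

    /** Format one row as a single comma-delimited line without a line terminator. */
    static String writeLine(List<String> cells, boolean quote) {
        StringJoiner sj = new StringJoiner(",");
        for (String c : cells) {
            String v = (c == null) ? "" : c;
            // Assumption: embedded double quotes are doubled when quoting.
            sj.add(quote ? "\"" + v.replace("\"", "\"\"") + "\"" : v);
        }
        return sj.toString();
    }

    /** Format all rows, one line per row, each terminated by a newline. */
    static String write(List<List<String>> rows, boolean quote) {
        StringBuilder sb = new StringBuilder();
        for (List<String> row : rows) {
            sb.append(writeLine(row, quote)).append('\n');
        }
        return sb.toString();
    }
}
```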
-
readDataInterchangeFormatFile
Parse an input stream Reader from a Data Interchange Format (.dif) file into an ArrayList of ArrayLists.
- Parameters:
inReader- A reader created from the .dif file to be processed
- Returns:
- An ArrayList of ArrayLists
-
readDataInterchangeFormatFile
Parse and load a Data Interchange Format (.dif) file into an ArrayList of ArrayLists.
- Parameters:
fname- The pathname of the file to be processed
- Returns:
- An ArrayList of ArrayLists
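For readers unfamiliar with the .dif layout, the following sketch shows one way the data section could be turned into a list of lists. It is an assumption-laden illustration, not the Sigma parser: the class name `DifReadSketch` is hypothetical, the header is skipped rather than validated, and malformed type lines are not handled. It relies on the usual DIF structure in which each value occupies two lines, a "type,number" line followed by a string line, with "-1,0" / "BOT" starting a row and "-1,0" / "EOD" ending the data.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of reading the DATA section of a .dif file; not Sigma code. */
class DifReadSketch {

    static List<List<String>> read(Reader in) {
        List<List<String>> rows = new ArrayList<>();
        try (BufferedReader br = new BufferedReader(in)) {
            // Skip header items until the DATA topic, then its two companion lines.
            String line = br.readLine();
            while (line != null && !line.trim().equals("DATA")) line = br.readLine();
            br.readLine();
            br.readLine();

            List<String> current = null;
            String typeLine;
            while ((typeLine = br.readLine()) != null) {
                String valueLine = br.readLine();
                if (valueLine == null) break;
                String[] parts = typeLine.trim().split(",", 2);
                int type = Integer.parseInt(parts[0].trim());
                valueLine = valueLine.trim();
                if (type == -1) {
                    if (valueLine.equals("EOD")) break;      // end of data
                    if (valueLine.equals("BOT")) {           // beginning of tuple (row)
                        current = new ArrayList<>();
                        rows.add(current);
                    }
                } else if (current != null) {
                    if (type == 0) {
                        // Numeric cell: the number follows the comma on the type line.
                        current.add(parts.length > 1 ? parts[1].trim() : "");
                    } else {
                        // String cell: the value line holds the quoted text.
                        current.add(stripQuotes(valueLine));
                    }
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return rows;
    }

    private static String stripQuotes(String s) {
        if (s.length() >= 2 && s.startsWith("\"") && s.endsWith("\"")) {
            return s.substring(1, s.length() - 1);
        }
        return s;
    }
}
```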
-
writeSuoKifStatements
-
writeSuoKifStatements
Writes to sourceFilePath all Formulae in kb that have sourceFilePath as source file.
- Parameters:
kb- The KB from which Formulae will be written
sourceFilePath- The canonical pathname of the file to which Formulae will be written
- Returns:
- An int denoting the number of expressions saved to the file named by sourceFilePath
-
printSpreadsheet
Print a comma-delimited matrix. The values of the rows are TreeMaps, whose values in turn are Strings. The ArrayList of relations forms the column headers, which are Strings.
- Parameters:
rows- the matrix
relations- the relations that form the column header
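The matrix shape described for printSpreadsheet can be sketched as follows. This is only an illustration under stated assumptions: the names `MatrixPrintSketch` and `format` are hypothetical, the outer map key is assumed to label the row, the inner TreeMap is assumed to be keyed by relation name, and the result is returned as a String instead of being printed.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch of a comma-delimited matrix with relation headers; not Sigma code. */
class MatrixPrintSketch {

    static String format(Map<String, TreeMap<String, String>> rows, List<String> relations) {
        StringBuilder sb = new StringBuilder();
        // Header: an empty cell above the row labels, then one column per relation.
        sb.append(',').append(String.join(",", relations)).append('\n');
        for (Map.Entry<String, TreeMap<String, String>> e : rows.entrySet()) {
            sb.append(e.getKey());
            for (String rel : relations) {
                String v = e.getValue().get(rel);
                sb.append(',').append(v == null ? "" : v); // blank cell if no value
            }
            sb.append('\n');
        }
        return sb.toString();
    }
}
```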
-
exportTable
Export a comma-delimited table of all the ground binary statements in the knowledge base. Only the relations that are actually used are included in the header.
- Parameters:
kb- The knowledge base.
-
wordWrap
-
emptyString
-
RearDBtoKIF
public static void RearDBtoKIF()
-
parseCuisines
-
topSUMOInReviews
Excludes cases where the mapping is to multiple SUMO terms.
- Returns:
- a list of attribute value pairs where the count is in the attribute and the SUMO term is the value
-
wordSensesInReviews
- Returns:
- a map of all the word senses used in the reviews and a count of their appearances
-
SUMOReviews
Process reviews, with a side effect on reviews of setting the SUMO list.
-
disambigReviews
- Parameters:
hotels- an ArrayList of Hotel with reviews as text. Synset values are processed as a side effect.
-
processTimeDate
-
readStateAbbrevs
-
fill
-
DiningDBImport
public static void DiningDBImport()
-
getWordSenses
- Returns:
- a list of SUMO terms that are the best guess at classes for each word
-
getFoodWordSenses
- Returns:
- a list of SUMO terms that are the best guess at classes for each word
-
parseRest
-
getAllRest
-
geocode
Call Google's geocode API to convert an address string into a lat/lon, which is returned as an ArrayList of two String elements containing a real-number format latitude and longitude.
-
printTopSUMOInReviews
-
parseOneRestFile
A test method.
- Parameters:
fname- has no file extension or directory
-
readStopConceptArray
public static void readStopConceptArray()
Fill out from a CSV file a set of concepts that should be ignored during the content extraction process. Side effect on static variable "stopConcept".
-
readSentimentArray
public static void readSentimentArray()
Fill out from a CSV file a map of word keys, and values broken down by POS, listing whether it's a positive or negative word. Interior hash map keys are type, POS, stemmed, polarity.
-
computeSentiment
Calculate an integer sentiment value for a string of words.
-
computeSentimentForWord
Find the sentiment value for a given word, after finding the root form of the word.
-
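The relationship between computeSentiment and computeSentimentForWord could look roughly like the sketch below: look up each word's root form in a lexicon and sum the per-word values. Everything here is an assumption for illustration. The class `SentimentSketch`, the "positive"/"negative" polarity strings, the +1/-1 scoring, and the crude plural-stripping `rootForm` helper are hypothetical; only the interior map keys (type, POS, stemmed, polarity) come from the readSentimentArray description.

```java
import java.util.Map;

/** Hypothetical sketch of lexicon-based sentiment scoring; not Sigma code. */
class SentimentSketch {

    /** Crude root-form guess: lower-case, and drop a final "s" if that form is in the lexicon. */
    static String rootForm(String word, Map<String, Map<String, String>> lexicon) {
        String w = word.toLowerCase();
        if (lexicon.containsKey(w)) return w;
        if (w.endsWith("s")) {
            String stem = w.substring(0, w.length() - 1);
            if (lexicon.containsKey(stem)) return stem;
        }
        return w;
    }

    /** Assumed scoring: +1 for "positive", -1 for "negative", 0 otherwise. */
    static int forWord(String word, Map<String, Map<String, String>> lexicon) {
        Map<String, String> entry = lexicon.get(rootForm(word, lexicon));
        if (entry == null) return 0;
        String polarity = entry.get("polarity");
        if ("positive".equals(polarity)) return 1;
        if ("negative".equals(polarity)) return -1;
        return 0;
    }

    /** Sum the per-word values over all word tokens in the input. */
    static int forText(String input, Map<String, Map<String, String>> lexicon) {
        int total = 0;
        for (String tok : input.trim().split("\\W+")) {
            if (!tok.isEmpty()) total += forWord(tok, lexicon);
        }
        return total;
    }
}
```

A real implementation would likely weight entries by the type and POS fields and use proper morphological analysis to find root forms.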
addConceptSentimentScores
public static Map<String,Integer> addConceptSentimentScores(Map<String, Integer> scores, String SUMOs, int total)
Add new scores to existing scores. Note the side effect on scores.
- Returns:
- a map of concept keys and integer sentiment score values
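The side effect noted for addConceptSentimentScores can be made concrete with a short sketch. It assumes, without confirmation from the source, that SUMOs is a whitespace-separated string of SUMO term names and that total is added to the running score of each term; the class and method names are hypothetical.

```java
import java.util.Map;

/** Hypothetical sketch of accumulating per-concept scores in place; not Sigma code. */
class ConceptScoreSketch {

    static Map<String, Integer> addScores(Map<String, Integer> scores, String sumos, int total) {
        // Assumption: terms are separated by whitespace.
        for (String term : sumos.trim().split("\\s+")) {
            if (!term.isEmpty()) {
                // Mutates the caller's map: this is the side effect on scores.
                scores.merge(term, total, Integer::sum);
            }
        }
        return scores; // the same map instance that was passed in
    }
}
```

Because the input map is mutated and also returned, callers that need the original scores would have to copy the map before calling.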
-
computeConceptSentimentFromFile
Associate individual concepts with a sentiment score.
- Returns:
- a map of concept keys and integer sentiment score values
-
computeConceptSentiment
Associate individual concepts with a sentiment score.
- Returns:
- a map of concept keys and integer sentiment score values
-
readAmenities
public static void readAmenities()
-
textSentimentByPeriod
public static void textSentimentByPeriod()
-
textSentiment
public static void textSentiment()
-
textFileSentiment
Compute sentiment for each line of a text file and output as CSV.
-
testSentiment
public static void testSentiment()
-
testSentimentCorpus
public static void testSentimentCorpus()
-
guessGender
-
main
A test method
-