Package com.articulate.sigma
Class Mapping
java.lang.Object
com.articulate.sigma.Mapping
This code is copyright Articulate Software (c) 2004.
This software is released under the GNU Public License <http://www.gnu.org/copyleft/gpl.html>.
Users of this code also consent, by use of this code, to credit Articulate Software
in any writings, briefings, publications, presentations, or
other representations of any software which incorporates, builds on, or uses this
code. Please cite the following article in any publication with references:
Pease, A., (2003). The Sigma Ontology Development Environment,
in Working Notes of the IJCAI-2003 Workshop on Ontology and Distributed Systems,
August 9, Acapulco, Mexico. See also http://sigmakee.sourceforge.net
This class maps ontologies. It includes embedded subclasses that
implement specific mapping heuristics.
This class also includes utilities for converting other
ad-hoc formats to KIF.
-
Field Summary
Fields -
Constructor Summary
Constructors -
Method Summary
Modifier and Type | Method | Description
static void | convertYAGO(String file, String relName) | Convert a YAGO file into KIF.
static int | getJaroWinklerDistance(String s1, String s2) | Jaro-Winkler Mapping Method, implemented by Gerard de Melo.
static int | getLevenshteinDistance | Levenshtein distance, based on the pseudocode int LevenshteinDistance(char s[1..m], char t[1..n]), courtesy of Wikipedia: http://en.wikipedia.org/wiki/Levenshtein_distance
static int | getSubstringDistance(String term1, String term2) | Substring Mapping Method: returns 1 if the two strings are identical, a score >1 if one string is a substring of the other, and Integer.MAX_VALUE if there is no substring match. This approach is based on: John Li, "LOM: A Lexicon-based Ontology Mapping Tool", Proceedings of the Performance Metrics for Intelligent Systems (PerMIS '04), 2004.
static String | getTermFormat(KB kb, String term) | Get the termFormat label for a term.
static boolean | isValidTerm(String term) | Check whether a term is valid (worthy of being compared).
static void | main | A test method.
static void | mapOntologies(String kbName1, String kbName2, int threshold, String matchMethod) | Map ontologies through four methods: (1) identical term names, (2) equal substrings of term names, (3) terms that align to words in the same WordNet synset, and (4) extra "points" for terms that align with the same structural arrangement.
static String | merge | Rename terms in KB kbname2 to conform to the names in kbname1.
static String | normalize | Normalize a string by replacing all non-letter, non-digit characters with spaces, adding spaces on capitalization boundaries, and then converting to lower case.
static String | writeEquivalences(Set<String> cbset, String kbname1, String kbname2) | Write synonymousExternalConcept expressions for term pairs given in cbset.
-
Field Details
-
mappings
-
termSeparator
public static char termSeparator
-
-
Constructor Details
-
Mapping
public Mapping()
-
-
Method Details
-
writeEquivalences
public static String writeEquivalences(Set<String> cbset, String kbname1, String kbname2) throws IOException
Write synonymousExternalConcept expressions for the term pairs given in cbset. Each entry is a string of the form [checkbox|subcheckbox]_[T_]name1-name2. There is a known bug when ontology terms contain dashes.
- Returns:
- error messages if necessary
- Throws:
IOException
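The entry format and the dash problem noted for writeEquivalences can be illustrated with a small sketch. It is not the code used by this class: the class name PairKeySketch, the method parse, and the choice to split at the first dash are all assumptions made for illustration.

```java
/** Hypothetical helper, not part of Mapping: reads "[checkbox|subcheckbox]_[T_]name1-name2". */
final class PairKeySketch {

    /**
     * Returns {name1, name2}, or null if the entry does not fit the form.
     * Assumption: the two names are split at the FIRST dash.
     */
    static String[] parse(String entry) {
        String rest;
        if (entry.startsWith("checkbox_")) {
            rest = entry.substring("checkbox_".length());
        } else if (entry.startsWith("subcheckbox_")) {
            rest = entry.substring("subcheckbox_".length());
        } else {
            return null; // unknown prefix
        }
        // Optional "T_" marker; a term that really starts with "T_" would be misread here.
        if (rest.startsWith("T_")) {
            rest = rest.substring(2);
        }
        int dash = rest.indexOf('-');
        if (dash <= 0 || dash == rest.length() - 1) {
            return null; // no separator, or an empty name on one side
        }
        return new String[] { rest.substring(0, dash), rest.substring(dash + 1) };
    }

    public static void main(String[] args) {
        String[] ok = parse("checkbox_T_Dog-Canine");
        System.out.println(ok[0] + " / " + ok[1]);   // Dog / Canine

        // A term that itself contains dashes is split in the wrong place.
        String[] bad = parse("subcheckbox_Mother-In-Law-MotherInLaw");
        System.out.println(bad[0] + " / " + bad[1]); // Mother / In-Law-MotherInLaw
    }
}
```

Because the dash serves as the separator between the two names, an entry whose terms contain dashes cannot be split unambiguously from the string alone.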
-
merge
Rename terms in KB kbname2 to conform to the names in kbname1.
- Returns:
- error messages if necessary
-
convertYAGO
Convert a YAGO file into KIF.
- Throws:
IOException
-
getTermFormat
Get the termFormat label for a term. Return only the first such label. Return null if there is no label.
mapOntologies
Map ontologies through four methods:
(1) identical term names
(2) equal substrings of term names
(3) terms that align to words in the same WordNet synset
(4) extra "points" for terms that align with the same structural arrangement
isValidTerm
Check whether a term is valid (worthy of being compared).
normalize
Normalize a string by replacing all non-letter, non-digit characters with spaces, adding spaces on capitalization boundaries, and then converting to lower case.
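The three normalization steps can be sketched as follows. This is an illustration only, not the method's actual source: the class name NormalizeSketch is hypothetical, and the rule used for a "capitalization boundary" (an upper-case letter that follows a lower-case letter or a digit) is an assumption. Runs of spaces are left as they are, since the description does not say whether they are collapsed.

```java
/** Hypothetical sketch of the normalization described above; not the Mapping source. */
final class NormalizeSketch {

    static String normalize(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            // Step 1: every non-letter, non-digit character becomes a space.
            if (!Character.isLetterOrDigit(c)) {
                sb.append(' ');
                continue;
            }
            // Step 2 (assumed rule): insert a space before an upper-case letter
            // that follows a lower-case letter or a digit.
            if (i > 0 && Character.isUpperCase(c)) {
                char prev = s.charAt(i - 1);
                if (Character.isLowerCase(prev) || Character.isDigit(prev)) {
                    sb.append(' ');
                }
            }
            // Step 3: lower-case the character.
            sb.append(Character.toLowerCase(c));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(normalize("FinancialAsset")); // financial asset
        System.out.println(normalize("part_of"));        // part of
    }
}
```

Under this boundary rule an all-capitals run such as "XMLFile" is not split, which may or may not match the behavior of the real method.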
getSubstringDistance
Substring Mapping Method: returns 1 if the two strings are identical, a score >1 if one string is a substring of the other, and Integer.MAX_VALUE if there is no substring match. This approach is based on: John Li, "LOM: A Lexicon-based Ontology Mapping Tool", Proceedings of the Performance Metrics for Intelligent Systems (PerMIS '04), 2004.
*** This is not yet fully implemented here ***
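The return contract above can be sketched as follows. The description does not say how the score for a substring match is computed, so the formula used here (1 plus the difference in length) is purely an assumption, as are the class name SubstringDistanceSketch and the decision to treat an empty string as no match.

```java
/** Hypothetical sketch of the documented return contract; the scoring formula is assumed. */
final class SubstringDistanceSketch {

    static int substringDistance(String term1, String term2) {
        if (term1.equals(term2)) {
            return 1; // identical strings
        }
        String shorter = term1;
        String longer = term2;
        if (term1.length() > term2.length()) {
            shorter = term2;
            longer = term1;
        }
        // Assumption: an empty string never counts as a substring match.
        if (!shorter.isEmpty() && longer.contains(shorter)) {
            // Assumed score: grows with the number of unmatched characters, always > 1.
            return 1 + (longer.length() - shorter.length());
        }
        return Integer.MAX_VALUE; // no substring match
    }

    public static void main(String[] args) {
        System.out.println(substringDistance("Dog", "Dog"));         // 1
        System.out.println(substringDistance("Dog", "DomesticDog")); // 9
        System.out.println(substringDistance("Dog", "Cat") == Integer.MAX_VALUE); // true
    }
}
```

With this scoring, lower values mean closer matches, which is consistent with Integer.MAX_VALUE standing for no match.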
getLevenshteinDistance
Levenshtein distance, based on the pseudocode int LevenshteinDistance(char s[1..m], char t[1..n]), courtesy of Wikipedia: http://en.wikipedia.org/wiki/Levenshtein_distance
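For reference, the standard dynamic-programming form of Levenshtein distance can be written in Java as below. This is a minimal sketch of the well-known algorithm, not the source of getLevenshteinDistance. The class name LevenshteinSketch and the String parameters are assumptions, and it keeps only two rows of the matrix rather than the full table.

```java
/** Hypothetical sketch of the standard algorithm; not the Mapping source. */
final class LevenshteinSketch {

    /** Minimum number of single-character insertions, deletions, and substitutions. */
    static int levenshteinDistance(String s, String t) {
        int m = s.length();
        int n = t.length();
        int[] prev = new int[n + 1]; // distances for the previous prefix of s
        int[] curr = new int[n + 1]; // distances for the current prefix of s
        for (int j = 0; j <= n; j++) {
            prev[j] = j; // turning "" into t[0..j) takes j insertions
        }
        for (int i = 1; i <= m; i++) {
            curr[0] = i; // turning s[0..i) into "" takes i deletions
            for (int j = 1; j <= n; j++) {
                int cost = (s.charAt(i - 1) == t.charAt(j - 1)) ? 0 : 1;
                int deletion = prev[j] + 1;
                int insertion = curr[j - 1] + 1;
                int substitution = prev[j - 1] + cost;
                curr[j] = Math.min(Math.min(deletion, insertion), substitution);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[n];
    }

    public static void main(String[] args) {
        System.out.println(levenshteinDistance("kitten", "sitting")); // 3
    }
}
```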
getJaroWinklerDistance
Jaro-Winkler Mapping Method, implemented by Gerard de Melo.
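As background, the textbook Jaro-Winkler similarity can be sketched as below. This is not Gerard de Melo's implementation. The method documented here returns an int, and how it derives that value from the similarity is not stated, so the sketch returns the usual similarity in the range 0 to 1 instead. The class name JaroWinklerSketch, the prefix scaling factor 0.1, and the four-character prefix limit are the conventional choices, assumed here.

```java
/** Hypothetical sketch of textbook Jaro-Winkler similarity; not the Mapping source. */
final class JaroWinklerSketch {

    static double jaro(String s1, String s2) {
        int len1 = s1.length();
        int len2 = s2.length();
        if (len1 == 0 && len2 == 0) return 1.0;
        if (len1 == 0 || len2 == 0) return 0.0;

        // Characters match only if they are equal and within this many positions.
        int window = Math.max(0, Math.max(len1, len2) / 2 - 1);
        boolean[] matched1 = new boolean[len1];
        boolean[] matched2 = new boolean[len2];
        int matches = 0;
        for (int i = 0; i < len1; i++) {
            int lo = Math.max(0, i - window);
            int hi = Math.min(len2 - 1, i + window);
            for (int j = lo; j <= hi; j++) {
                if (!matched2[j] && s1.charAt(i) == s2.charAt(j)) {
                    matched1[i] = true;
                    matched2[j] = true;
                    matches++;
                    break;
                }
            }
        }
        if (matches == 0) return 0.0;

        // Walk both strings' matched characters in order and count mismatches.
        int outOfOrder = 0;
        int k = 0;
        for (int i = 0; i < len1; i++) {
            if (!matched1[i]) continue;
            while (!matched2[k]) k++;
            if (s1.charAt(i) != s2.charAt(k)) outOfOrder++;
            k++;
        }
        double m = matches;
        double transpositions = outOfOrder / 2.0;
        return (m / len1 + m / len2 + (m - transpositions) / m) / 3.0;
    }

    static double jaroWinkler(String s1, String s2) {
        double jaro = jaro(s1, s2);
        // Boost for a shared prefix of up to four characters.
        int limit = Math.min(4, Math.min(s1.length(), s2.length()));
        int prefix = 0;
        while (prefix < limit && s1.charAt(prefix) == s2.charAt(prefix)) {
            prefix++;
        }
        return jaro + prefix * 0.1 * (1.0 - jaro);
    }

    public static void main(String[] args) {
        System.out.printf("%.4f%n", jaroWinkler("MARTHA", "MARHTA"));
    }
}
```

Note that this is a similarity (higher means closer), whereas the Levenshtein and substring scores above are distances (lower means closer).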
main
A test method.
-