URL: https://service.hucompute.org/

ARCHITECTURE


GETTING STARTED

 * generate Java classes from this WSDL file
 * import the generated classes into your project


CITATION

When using TextImager, please cite the following:
 * A. Mehler, T. Uslu, and W. Hemati, “An Image-driven Approach to Differential
   Diagnosis,” in Proceedings of the 5th Workshop on Vision and Language (VL’16)
   hosted by the 54th Annual Meeting of the Association for Computational
   Linguistics (ACL), Berlin, 2016.
 * A. Mehler, R. Gleim, T. vor der Brück, W. Hemati, T. Uslu, and S. Eger,
   “Wikidition: Automatic Lexiconization and Linkification of Text Corpora,”
   Information Technology, pp. 70-79, 2016.


RUNNING A PROCESS

 * create an interface object
 * add options (a list of all options can be found below)
 * start a process, passing either direct text input or files for the NLP tools
   to process

Here is an example of what a minimal class using the web service could look
like:


package example;

import java.io.IOException;
import java.net.URISyntaxException;

import org.hucompute.services.webservices.TextImagerInterface;
import org.hucompute.services.webservices.TextImagerService;

import net.java.dev.jaxb.array.StringArray;

public class Example {

	public static void main(String[] args) throws IOException, URISyntaxException {
		TextImagerService service = new TextImagerService();
		TextImagerInterface serviceInterface = service.getTextImagerPort();


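		StringArray options = new StringArray();

		// Options are key-value pairs, added one item after the other.
		options.getItem().add("de");                         // key: the language
		options.getItem().add("MarMoTTagger, MarMoTLemma");  // value: the NLP tools for that language
		options.getItem().add("outputFormat");               // key: output format
		options.getItem().add("tcf");                        // value: tcf
		options.getItem().add("inputFormat");                // key: input format
		options.getItem().add("plain");                      // value: plain text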


		System.out.println(serviceInterface.process("Das ist ein simpler Test.", options));

	}

}
	

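The default output format is xmi, but it can be changed with the outputFormat
option. For the example above, the tcf output should contain the following
text, tokens, lemmas and POS tags (XML markup not shown):

Text: Das ist ein simpler Test.

Token    Lemma   POS
Das      der     PDS
ist      sein    VAFIN
ein      ein     ART
simpler  simpel  ADJA
Test     Test    NN
.        --      $.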
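
Because tcf is an XML format, the returned string can be read with the standard
Java DOM parser. The following is a minimal sketch, not part of the TextImager
client: the class TcfReader is hypothetical, and SAMPLE is a hand-written
document in the general shape of a TCF file (element names token, lemma and
tag are assumed from the TCF format), not a captured response from the service.
The actual response may carry additional layers or attributes.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

public class TcfReader {

    // Hand-written sample (an assumption about the shape of the output,
    // not a real response from the web service).
    public static final String SAMPLE =
          "<D-Spin xmlns=\"http://www.dspin.de/data\" version=\"0.4\">"
        + "<TextCorpus xmlns=\"http://www.dspin.de/data/textcorpus\" lang=\"de\">"
        + "<text>Das ist ein simpler Test.</text>"
        + "<tokens>"
        + "<token ID=\"t_0\">Das</token><token ID=\"t_1\">ist</token>"
        + "<token ID=\"t_2\">ein</token><token ID=\"t_3\">simpler</token>"
        + "<token ID=\"t_4\">Test</token><token ID=\"t_5\">.</token>"
        + "</tokens>"
        + "<lemmas>"
        + "<lemma tokenIDs=\"t_0\">der</lemma><lemma tokenIDs=\"t_1\">sein</lemma>"
        + "<lemma tokenIDs=\"t_2\">ein</lemma><lemma tokenIDs=\"t_3\">simpel</lemma>"
        + "<lemma tokenIDs=\"t_4\">Test</lemma><lemma tokenIDs=\"t_5\">--</lemma>"
        + "</lemmas>"
        + "<POStags tagset=\"stts\">"
        + "<tag tokenIDs=\"t_0\">PDS</tag><tag tokenIDs=\"t_1\">VAFIN</tag>"
        + "<tag tokenIDs=\"t_2\">ART</tag><tag tokenIDs=\"t_3\">ADJA</tag>"
        + "<tag tokenIDs=\"t_4\">NN</tag><tag tokenIDs=\"t_5\">$.</tag>"
        + "</POStags>"
        + "</TextCorpus></D-Spin>";

    // Parses an XML string into a namespace-aware DOM document.
    public static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse TCF document", e);
        }
    }

    // Collects the text content of all elements with the given local name,
    // in document order and regardless of their namespace.
    public static List<String> texts(Document doc, String localName) {
        NodeList nodes = doc.getElementsByTagNameNS("*", localName);
        List<String> result = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            result.add(nodes.item(i).getTextContent());
        }
        return result;
    }

    public static void main(String[] args) {
        Document doc = parse(SAMPLE);
        List<String> tokens = texts(doc, "token");
        List<String> lemmas = texts(doc, "lemma");
        List<String> tags = texts(doc, "tag");
        // The sample has exactly one lemma and one tag per token.
        for (int i = 0; i < tokens.size(); i++) {
            System.out.println(tokens.get(i) + "\t" + lemmas.get(i) + "\t" + tags.get(i));
        }
    }
}
```

In a real client, the string returned by serviceInterface.process(...) would
take the place of SAMPLE. Matching lemmas and tags to tokens by position is a
simplification; a more careful reader would follow the tokenIDs attributes.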


OPTIONS

 * Languages: see available services
 * Tools: see available services
 * Input Format (inputFormat): tei, json, xmi, plain
 * Output Format (outputFormat): tei, tcf, xmi, webanno

Options are passed as key-value pairs: each key is added to the list first,
followed directly by its value. For the tools, the key is the language and the
value is the list of tools to run for that language.
You can use multiple language-tools pairs in your options, but at most one
input format and one output format.
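
The key-value layout described above can be sketched as a small helper that
builds the flat option list. This is only an illustration: the class
TextImagerOptions and its methods are hypothetical and not part of the
generated web service classes. It assumes that tool names are joined with
", " as in the example class, and it emits the entries in the same order as
that example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the TextImager client.
public class TextImagerOptions {

    // Language -> tools, kept in insertion order.
    private final Map<String, List<String>> toolsByLanguage = new LinkedHashMap<>();
    // Single fields, so at most one input and one output format can be set.
    private String inputFormat;
    private String outputFormat;

    // Registers one or more tools for a language; may be called repeatedly.
    public TextImagerOptions tools(String language, String... tools) {
        toolsByLanguage.computeIfAbsent(language, k -> new ArrayList<>())
                .addAll(Arrays.asList(tools));
        return this;
    }

    public TextImagerOptions inputFormat(String format) {
        this.inputFormat = format;
        return this;
    }

    public TextImagerOptions outputFormat(String format) {
        this.outputFormat = format;
        return this;
    }

    // Flattens everything into alternating key and value entries.
    public List<String> build() {
        List<String> options = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : toolsByLanguage.entrySet()) {
            options.add(entry.getKey());                        // key: language
            options.add(String.join(", ", entry.getValue()));   // value: tools
        }
        if (outputFormat != null) {
            options.add("outputFormat");
            options.add(outputFormat);
        }
        if (inputFormat != null) {
            options.add("inputFormat");
            options.add(inputFormat);
        }
        return options;
    }

    public static void main(String[] args) {
        List<String> options = new TextImagerOptions()
                .tools("de", "MarMoTTagger", "MarMoTLemma")
                .outputFormat("tcf")
                .inputFormat("plain")
                .build();
        System.out.println(options);
    }
}
```

Assuming the generated StringArray exposes its items as a list of strings, as
the add calls in the example class suggest, the result could be handed over
with options.getItem().addAll(...).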


AVAILABLE SERVICES

Group                Name                           Language  Model
parser               StanfordParser                 ar        factored
                                                    de        sr, factored, pcfg
                                                    en        rnn, sr, sr-beam, wsj-factored, wsj-pcfg,
                                                              wsj-rnn, factored, pcfg
                                                    es        pcfg, sr, sr-beam
                                                    fr        factored, sr-beam
                                                    zh        xinhua-factored, xinhua-pcfg, factored,
                                                              pcfg, sr
                     ClearNlpDependencyParser       en        mayo, ontonotes
ner                  StanfordNamedEntityRecognizer  de        dewac_175m_600.crf, hgc_175m_600.crf
                                                    en        all.3class.caseless.distsim.crf,
                                                              all.3class.distsim.crf,
                                                              conll.4class.caseless.distsim.crf,
                                                              conll.4class.distsim.crf,
                                                              muc.7class.caseless.distsim.crf,
                                                              muc.7class.distsim.crf,
                                                              nowiki.3class.caseless.distsim.crf
                                                    es        ancora.distsim.s512.crf
                     NERAnnotator                   de        ?
                     OpenNlpNameFinder              en        ?
time                 HeidelTime                     en        ?
                                                    de        ?
sentiment            Sentiws                        en        date, location, money, organization,
                                                              percentage, person, time
                                                    es        locatin, misc, organization, person
                                                    nl        location, misc, organization
lemmatizer           LanguageToolLemmatizer         en        ?
                                                    de        ?
                     StanfordLemmatizer             en        ?
                     MarMoTLemma                    la        ?
                                                    de        ?
pos                  StanfordPosTagger              ar        accurate
                                                    de        fast-caseless, dewac, fast, hgc
                                                    en        bidirectional-distsim,
                                                              caseless-left3words-distsim, fast.41,
                                                              left3words-distsim, twitter, twitter-fast,
                                                              wsj-0-18-bidirectional-nodistsim,
                                                              wsj-0-18-caseless-left3words-distsim,
                                                              wsj-0-18-left3words-distsim,
                                                              wsj-0-18-left3words-nodistsim
                                                    es        default, distsim
                     MarMoTTagger                   la        ?
                                                    de        ?
tokenizer            OpenNlpSegmenter               da        maxent
                                                    de        maxent
                                                    en        maxent
                                                    it        maxent
                                                    nb        maxent
                                                    nl        maxent
                                                    pt        maxent
                                                    sv        maxent
                     LanguageToolSegmenter          en        ?
                                                    de        ?
                     BreakIteratorSegmenter         en        ?
                                                    de        ?
                                                    la        ?
                     StanfordTokenizer              ar        ?
                                                    en        ?
                                                    es        ?
                                                    fr        ?
                     OpenNLPTokenizer               en        ?
                                                    de        ?
                     ClearNlpSegmenter              en        ?
paragraphSplitter    ParagraphSplitter              en        ?
                                                    de        ?
                                                    la        ?
similarity           CosineSimilarity               de        ?
                                                    en        ?
                                                    la        ?
                     Biemann                        de        ?
                                                    en        ?
                                                    la        ?
                     WordNGramJaccardMeasure        de        ?
                                                    en        ?
                                                    la        ?
grammaticalcategory  TreeTaggerGrammaticalCategory  de        ?
disambiguation       HUComputeWSD                   en        ?
