Java Client

Creating a Client

//Create the cluster configuration by adding cluster servers
ZuliaPoolConfig zuliaPoolConfig = new ZuliaPoolConfig();
zuliaPoolCondig.addNode("localhost", 32191, 32192) // serverAddress, servicePort (opt), REST port (opt); servicePort and REST port default to 32191 and 32192, only need to be provided if multiple nodes are running.

//optional settings (default values shown)
zuliaPoolConfig.setDefaultRetries(0);//Number of attempts to try before throwing an exception
zuliaPoolConfig.setMaxConnections(8); //Maximum connections per server
zuliaPoolConfig.setMaxIdle(8); //Maximum idle connections per server
zuliaPoolConfig.setCompressedConnection(false); //Use this for WAN client connections
zuliaPoolConfig.setPoolName(null); //For logging purposes only, null gives default of zuliaPool-n
zuliaPoolConfig.setNodeUpdateEnabled(true); //Periodicly update the nodes of the cluster
zuliaPoolConfig.setNodeUpdateInterval(10000); //Interval to update the nodes in ms
zuliaPoolConfig.setRoutingEnabled(true); //enable routing indexing to the correct server, this only works if automatic node updating is enabled or it is periodically called manually.

//create the connection pool
ZuliaWorkPool zuliaWorkPool = new ZuliaWorkPool(zuliaPoolConfig);

Updating Current Nodes (Optional)

zuliaWorkPool.updateNodes();

Creating an Index

String defaultSearchField = "title";
int numberOfShards = 16;
int ramBufferMB = 128;
int indexWeight = 8; // helps store the index in the appropriate node

ClientIndexConfig indexConfig = new ClientIndexConfig();
indexConfig.setIndexName(indexName);
indexConfig.setNumberOfShards(numberOfShards);
indexConfig.setShardCommitInterval(64000);
indexConfig.setIndexWeight(indexWeight);
indexConfig.setRamBufferMB(ramBufferMB);

indexConfig.addFieldConfig(FieldConfigBuilder.create("title", FieldType.STRING).indexAs(DefaultAnalyzers.STANDARD));
indexConfig.addFieldConfig(FieldConfigBuilder.create("issn", FieldType.STRING).indexAs(DefaultAnalyzers.LC_KEYWORD).facet());
indexConfig.addFieldConfig(FieldConfigBuilder.create("an", FieldType.NUMERIC_INT).index());

CreateIndex createIndex = new CreateIndex(indexConfig);
zuliaWorkPool.createIndex(createIndex);

The number of shards and unique id field cannot be changed for the index once the index is created. Set number of shards to greater than or equal to the maximum number of nodes possible in the cluster. In the future, changing the number of shards will be possible through a separate process.

Zulia supports indexes created from object annotations. For more info see section on Object Persistence.

Changing or adding analyzers for fields that are already indexed may require re-indexing for desired results.

Note that an update cannot change the number of shards or the unique id field. and that changing or adding analyzers for fields that are already indexed may require re-indexing for desired results

Index Config Details

The individual settings on IndexConfig are explained below:

defaultSearchField - The field that is searched if no field is given to a lucene query
defaultAnalyzer - The default analyzer for all fields not specified by a field config
fieldConfig - Overrides the default analyzer for a field
shardCommitInterval - Indexes or deletes to shard before a commit is forced (default 3200)
idleTimeWithoutCommit - Time without indexing before commit is force in seconds (0 disables) (default 30)
applyUncommitedDeletes - Apply all deletes before search (default true)
shardQueryCacheSize - Number of queries cached at the shard level
shardQueryCacheMaxAmount - Queries with more than this amount of documents returned are not cached

//The following are used in optimizing federation of shards when more than one shard is used. 
//The amount requested from each shard on a query is (((amountRequestedByQuery / numberOfShards) + minShardRequest) * requestFactor).
requestFactor - Used in calculation of request size for a shard (default 2.0)
minShardRequest - Added to the calculated request for a shard (default 2)
shardTolerance - Difference in scores between shards tolerated before requesting full results (query request amount) from the shard (default 0.05)

These Field Types are Available

STRING
NUMERIC_INT
NUMERIC_LONG
NUMERIC_FLOAT
NUMERIC_DOUBLE
DATE
BOOL

These built-in Analyzers are available (DefaultAnalyzers)

KEYWORD - Field is searched as one token
LC_KEYWORD - Field is searched as one token in lowercase (case insenstive, use for wildcard searches)
LC_CONCAT_ALL
STANDARD - Standard lucene analyzer (good for general full text)
MIN_STEM - Minimal English Stemmer
KSTEMMED - K Stemmer
LSH - Locality Sensitive Hash
TWO_TWO_SHINGLE - (n-grams)
THREE_THREE_SHINGLE - (n-grams)

Custom Analyzer

indexConfig.addAnalyzerSetting("myAnalyzer", Tokenizer.WHITESPACE, Arrays.asList(Filter.ASCII_FOLDING, Filter.LOWERCASE), Similarity.BM25);
indexConfig.addFieldConfig(FieldConfigBuilder.create("abstract", FieldType.STRING).indexAs("myAnalyzer"));

Delete Index

zuliaWorkPool.deleteIndex(indexName);

Storing / Indexing Documents

Zulia supports indexing and storing from object annotations. For more info see section on Object Persistence.

BSON Document (org.mongodb.bson)

Document document = new Document();
document.put("title", "Magic Java Beans");
document.put("issn", "4321-4321");

Store store = new Store("myid222", "myIndexName");

ResultDocBuilder resultDocumentBuilder = new ResultDocBuilder();
resultDocumentBuilder.setDocument(document);

//optional meta
resultDocumentBuilder.addMetaData("test1", "val1");
resultDocumentBuilder.addMetaData("test2", "val2");

store.setResultDocument(resultDocumentBuilder);

zuliaWorkPool.store(s);

Storing Associated Documents

String uniqueId = "myid123";
String indexName = MY_INDEX_NAME;
String filename = "myfile2";
		
AssociatedBuilder associatedBuilder = new AssociatedBuilder();
associatedBuilder.setFilename(filename);
associatedBuilder.setCompressed(false);
associatedBuilder.setDocument("Some Text3");
associatedBuilder.addMetaData("mydata", "myvalue2");
associatedBuilder.addMetaData("sometypeinfo", "text file2");
		
//can be part of the same store request as the document
Store store = new Store(uniqueId, indexName);

//multiple associated documented can be added at once
store.addAssociatedDocument(associatedBuilder);

zuliaWorkPool.store(s);

Storing Large Associated Documents (Streaming)

String uniqueId = "myid333";
String filename = "myfilename";
String indexName = "myIndexName";
		
StoreLargeAssociated storeLargeAssociated = new StoreLargeAssociated(uniqueId, indexName, filename, new File("/tmp/myFile"));
		
zuliaWorkPool.storeLargeAssociated(storeLargeAssociated);

Fetching Documents

Fetch Document

FetchDocument fetchDocument = new FetchDocument("myid222", MY_INDEX_NAME);
		
FetchResult fetchResult = zuliaWorkPool.fetch(fetchDocument);

if (fetchResult.hasResultDocument()) {
	Document document = fetchResult.getDocument();
	
	//Get optional Meta
	Map<String, String> meta = fetchResult.getMeta();
}

Fetch All Associated


FetchAllAssociated fetchAssociated = new FetchAllAssociated("myid123", "myIndexName");

FetchResult fetchResult = zuliaWorkPool.fetch(fetchAssociated);

if (fetchResult.hasResultDocument()) {
	Document object = fetchResult.getDocument();
	
	//Get optional Meta
	Map<String, String> meta = fetchResult.getMeta();
}

for (AssociatedResult ad : fetchResult.getAssociatedDocuments()) {
    //use correct function for document type
    String text = ad.getDocumentAsUtf8();
}

Fetch Associated

FetchAssociated fetchAssociated = new FetchAssociated("myid123", "myIndexName",  "myfile2");

FetchResult fetchResult = zuliaWorkPool.fetch(fetchAssociated);

if (fetchResult.getAssociatedDocumentCount() != 0) {
	AssociatedResult ad = fetchResult.getAssociatedDocument(0);
        //use correct function for document type
	String text = ad.getDocumentAsUtf8();
}

Fetch Large Associated (Streaming)

FetchLargeAssociated fetchLargeAssociated = new FetchLargeAssociated("myid333", "myIndexName", "myfilename", new File("/tmp/myFetchedFile"));
zuliaWorkPool.fetchLargeAssociated(fetchLargeAssociated);

Querying

Simple Query

int numberOfResults = 10;

String normalLuceneQuery = "issn:1234-1234 AND title:special";
Query query = new Query("myIndexName", normalLuceneQuery, numberOfResults);

//optionally set realtime to false for better performance under high indexing load
//this will prevent flushing shards become searching
//query.setRealTime(false);

QueryResult queryResult = zuliaWorkPool.query(query);

long totalHits = queryResult.getTotalHits();

System.out.println("Found <" + totalHits + "> hits");
for (ScoredResult sr : queryResult.getResults()) {
	System.out.println("Matching document <" + sr.getUniqueId() + "> with score <" + sr.getScore() + ">");
}

Search Multiple Indexes

int numberOfResults = 10;

String normalLuceneQuery = "issn:4321-1234 AND title:java";
Query query = new Query(Arrays.asList("myIndexName", "myIndexName2"), normalLuceneQuery, numberOfResults);

//optionally set realtime to false for better performance under high indexing load
//this will prevent flushing segments become searching
//query.setRealTime(false);

QueryResult queryResult = zuliaWorkPool.query(query);

long totalHits = queryResult.getTotalHits();

System.out.println("Found <" + totalHits + "> hits");
for (ScoredResult sr : queryResult.getResults()) {
	System.out.println("Matching document <" + sr.getUniqueId() + "> with score <" + sr.getScore() + ">");
}

Paging Query Results

Query query = new Query("myIndexName", "issn:1234-1234 AND title:special", 10);
		
QueryResult firstResult = zuliaWorkPool.query(query);
		
query.setLastResult(firstResult);
		
QueryResult secondResult = zuliaWorkPool.query(query);

Sorting

Query query = new Query("myIndexName", "title:special", 10);
query.addFieldSort("issn", Direction.ASCENDING); //Field must be KEYWORD, LC_KEYWORD, or NUMERIC
QueryResult queryResult = zuliaWorkPool.query(query);

Filter Queries (fq) and Query Fields (qf)

Query query = new Query("myIndexName", "cancer cure", numberOfResults);
query.addQueryField("abstract");
query.addQueryField("title");
query.addFilterQuery("title:special");
query.addFilterQuery("issn:1234-1234");
QueryResult queryResult = zuliaWorkPool.query(query);

Requesting Facets

// Can set number of documents to return to 0 unless you want the documents
// at the same time

Query query = new Query(Arrays.asList("myIndexName", "myIndexName2"), "title:special", 0);
int maxFacets = 30;
query.addCountRequest("issn", maxFacets);

QueryResult queryResult = zuliaWorkPool.query(query);
for (FacetCount fc : queryResult.getFacetCounts("issn")) {
	System.out.println("Facet <" + fc.getFacet() + "> with count <" + fc.getCount() + ">");
}

Drilling Down Facets

Query query = new Query("myIndexName", "title:special", 0);
query.addDrillDown("issn", "1111-1111");
QueryResult queryResult = zuliaWorkPool.query(query);
for (FacetCount fc : queryResult.getFacetCounts("issn")) {
   System.out.println("Facet <" + fc.getFacet() + "> with count <" + fc.getCount() + ">");
}

Deleting

Delete From Index

//Deletes the document from the index but not any associated documents
DeleteFromIndex deleteFromIndex = new DeleteFromIndex("myid111", "myIndexName");
zuliaWorkPool.delete(deleteFromIndex);

Delete Completely

//Deletes the result document, the index documents and all associated documents associated with an id
DeleteFull deleteFull = new DeleteFull("myid123", MY_INDEX_NAME);
zuliaWorkPool.delete(deleteFull);

Delete Single Associated

//Removes a single associated document with the unique id and filename given
DeleteAssociated deleteAssociated = new DeleteAssociated("myid123", "myIndexName", "myfile2");
zuliaWorkPool.delete(deleteAssociated);

Delete All Associated

DeleteAllAssociated deleteAllAssociated = new DeleteAllAssociated("myid123", "myIndexName");
zuliaWorkPool.delete(deleteAllAssociated);

Other Operations

Get Current Document Count for Index

GetNumberOfDocsResult result = zuliaWorkPool.getNumberOfDocs("myIndexName");
System.out.println(result.getNumberOfDocs());

Get Fields for Index

GetFieldsResult result = zuliaWorkPool.getFields(new GetFields("myIndexName"));
System.out.println(result.getFieldNames());

#Get Terms for Field

GetTermsResult getTermsResult = zuliaWorkPool.getAllTerms(new GetAllTerms("myIndexName", "title"));
for (Term term : getTermsResult.getTerms()) {
   System.out.println(term.getValue() + ": " + term.getDocFreq());
}

#Get Cluster Nodes

GetNodesResult getNodesResult = zuliaWorkPool.getNodes();
for (Node node : getNodesResult.getNodes()) {
    System.out.println(node);
}

#Async API Every Function has a Corresponding Async Version

Query query = new Query(MY_INDEX_NAME, "issn:1234-1234 AND title:special", 10);

ListenableFuture<QueryResult> resultFuture = zuliaWorkPool.queryAsync(query);

Futures.addCallback(resultFuture, new FutureCallback<QueryResult>() {

    @Override
    public void onSuccess(QueryResult explosion) {
	
    }

    @Override
    public void onFailure(Throwable thrown) {
	
    }
});

Object Persistence / Mapping

Annotated Object Example

@Settings(
    indexName = "wikipedia",
    numberOfSegments = 16,
    segmentCommitInterval = 6000
    )
public class Article {

    public Article() {

    }

    @UniqueId
    private String id;

    @Indexed(analyzerName = DefaultAnalyzers.STANDARD)
    private String title;

    @Indexed
    private Integer namespace;

    @DefaultSearch
    @Indexed(analyzerName = DefaultAnalyzers.STANDARD)
    private String text;

    private Long revision;

    @Indexed
    private Integer userId;

    @Indexed(analyzerName = DefaultAnalyzers.STANDARD)
    private String user;

    @Indexed
    private Date revisionDate;

    //Getters and Setters
    //....
}

Creating Index for Annotated Class Example

Mapper<Article> mapper = new Mapper<>(Article.class);
zuliaWorkPool.createOrUpdateIndex(mapper.createOrUpdateIndex());

Storing an Object

Article article = new Article();
...
Store store = mapper.createStore(article);
zuliaWorkPool.store(store);   

Querying and Fetching

Query query = new Query("wikipedia", "title:a*", 10);
QueryResult queryResult = zuliaWorkPool.query(query);

BatchFetch batchFetch = new BatchFetch().addFetchDocumentsFromResults(queryResult);
BatchFetchResult bfr = zuliaWorkPool.batchFetch(batchFetch);

List<Article> articles = mapper.fromBatchFetchResult(bfr);

Address

Maryland USA