Java Client
Zulia Java Client
Gradle
repositories {
mavenCentral()
maven {
url "https://maven.ascend-tech.us/repo/"
}
}
dependencies {
implementation 'io.zulia:zulia-client:3.4.6'
implementation 'org.mongodb:mongodb-driver-sync:4.9.1'
}
Maven
<repository>
<id>astMaven</id>
<name>AST Maven</name>
<url>https://maven.ascend-tech.us/repo</url>
</repository>
<dependencies>
<dependency>
<groupId>io.zulia</groupId>
<artifactId>zulia-client</artifactId>
<version>3.4.6</version>
</dependency>
<dependency>
<groupId>org.mongodb</groupId>
<artifactId>mongodb-driver-sync</artifactId>
<version>4.9.1</version>
</dependency>
</dependencies>
Creating a Client
The Zulia Java client is named ZuliaWorkPool. ZuliaWorkPool is a thread-safe connection pool that uses a gRPC connection to Zulia on the service port. Every method has an async version that returns a ListenableFuture<> of the result.
Simple Client creation
ZuliaWorkPool zuliaWorkPool = new ZuliaWorkPool(new ZuliaPoolConfig().addNode("someIp"));
Full Client Configuration
ZuliaPoolConfig zuliaPoolConfig = new ZuliaPoolConfig();
zuliaPoolConfig.addNode("someIp");
//optionally give ports if not default values
//zuliaPoolConfig.addNode("localhost", 32191, 32192);
//optional settings (default values shown)
zuliaPoolConfig.setDefaultRetries(0);//Number of attempts to try before throwing an exception
zuliaPoolConfig.setMaxConnections(10); //Maximum connections per server
zuliaPoolConfig.setMaxIdle(10); //Maximum idle connections per server
zuliaPoolConfig.setCompressedConnection(false); //Use this for WAN client connections
zuliaPoolConfig.setPoolName(null); //For logging purposes only, null gives default of zuliaPool-n
zuliaPoolConfig.setNodeUpdateEnabled(true); //Periodically refresh the list of nodes in the cluster, which enables smart routing to the correct node. Do not use this with SSH port forwarding. The refresh can also be done manually with zuliaWorkPool.updateNodes();
zuliaPoolConfig.setNodeUpdateInterval(10000); //Interval to update the nodes in ms
zuliaPoolConfig.setRoutingEnabled(true); //Route indexing requests to the correct server. This only works if automatic node updating is enabled or zuliaWorkPool.updateNodes() is called manually on a regular basis.
//create the connection pool
ZuliaWorkPool zuliaWorkPool = new ZuliaWorkPool(zuliaPoolConfig);
Creating an Index
Basic Creation
ClientIndexConfig indexConfig = new ClientIndexConfig().setIndexName("test").addDefaultSearchField("test");
indexConfig.addFieldConfig(FieldConfigBuilder.createString("title").indexAs(DefaultAnalyzers.STANDARD));
indexConfig.addFieldConfig(FieldConfigBuilder.createString("issn").indexAs(DefaultAnalyzers.LC_KEYWORD).facet());
indexConfig.addFieldConfig(FieldConfigBuilder.createInt("an").index().sort());
// createLong, createFloat, createDouble, createBool, createDate, createVector, and createUnitVector are also available
// or create(storedFieldName, fieldType)
CreateIndex createIndex = new CreateIndex(indexConfig);
zuliaWorkPool.createIndex(createIndex);
- Calling create index again will update the index settings. However, the number of shards cannot be changed once the index is created. When sharding, the number of shards can be greater than the number of nodes to future-proof the index. Also see UpdateIndex for partial index setting updates.
- Changing or adding analyzers for fields that are already indexed may require re-indexing for desired results.
- Zulia supports indexes created from object annotations. For more info see section on Object Persistence.
Index Config Details
Full ClientIndexConfig
settings are explained below:
defaultSearchField - The field that is searched if no field is given to a query (missing query fields or direct fielded search)
defaultAnalyzer - The default analyzer for all fields not specified by a field config
fieldConfig - Overrides the default analyzer for a field
shardCommitInterval - Number of indexes or deletes to a shard before a commit is forced (default 3200)
idleTimeWithoutCommit - Time without indexing before commit is forced in seconds (0 disables) (default 30)
applyUncommitedDeletes - Apply all deletes before search (default true)
shardQueryCacheSize - Number of queries cached at the shard level
shardQueryCacheMaxAmount - Queries with more than this amount of documents returned are not cached
//The following are used in optimizing federation of shards when more than one shard is used.
//The amount requested from each shard on a query is (((amountRequestedByQuery / numberOfShards) + minShardRequest) * requestFactor).
requestFactor - Used in calculation of request size for a shard (default 2.0)
minShardRequest - Added to the calculated request for a shard (default 2)
shardTolerance - Difference in scores between shards tolerated before requesting full results (query request amount) from the shard (default 0.05)
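The per-shard request formula above can be worked through with a few lines of plain Java. This is only an arithmetic sketch, not Zulia's internal code: the class and method names are hypothetical, and the document does not say how fractions are handled, so floating-point division followed by rounding up is an assumption.

```java
/** Illustrative arithmetic only; the names here are hypothetical and not part of the Zulia client. */
final class ShardRequestSketch {

    // Applies the formula quoted above:
    // ((amountRequestedByQuery / numberOfShards) + minShardRequest) * requestFactor
    // Assumption: floating-point division, rounded up to a whole number of documents.
    static int shardRequest(int amountRequestedByQuery, int numberOfShards, int minShardRequest, double requestFactor) {
        double perShard = (double) amountRequestedByQuery / numberOfShards;
        return (int) Math.ceil((perShard + minShardRequest) * requestFactor);
    }

    public static void main(String[] args) {
        // 100 results over 4 shards with the documented defaults (minShardRequest 2, requestFactor 2.0)
        System.out.println(shardRequest(100, 4, 2, 2.0)); // (25 + 2) * 2.0 = 54
        // 10 results over 4 shards with the same defaults
        System.out.println(shardRequest(10, 4, 2, 2.0)); // (2.5 + 2) * 2.0 = 9
    }
}
```

With the defaults, each shard is asked for noticeably more than an even split of the requested amount, which lowers the chance that a second round of requests is needed when results are unevenly distributed across shards.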
These Field Types are Available
STRING
NUMERIC_INT
NUMERIC_LONG
NUMERIC_FLOAT
NUMERIC_DOUBLE
DATE
BOOL
UNIT_VECTOR
VECTOR
These built-in Analyzers are available (DefaultAnalyzers)
KEYWORD - Field is searched as one token
LC_KEYWORD - Field is searched as one token in lowercase (case insensitive, use for wildcard searches)
LC_CONCAT_ALL
STANDARD - Standard lucene analyzer (good for general full text)
MIN_STEM - Minimal English Stemmer
KSTEMMED - K Stemmer
LSH - Locality Sensitive Hash
TWO_TWO_SHINGLE - (n-grams)
THREE_THREE_SHINGLE - (n-grams)
Custom Analyzer
Custom analyzers are a combination of a tokenizer and an ordered list of filters
// define a custom analyzer for the index
clientIndexConfig.addAnalyzerSetting("myAnalyzer", Tokenizer.WHITESPACE, Arrays.asList(Filter.ASCII_FOLDING, Filter.LOWERCASE), Similarity.BM25);
// reference a custom analyzer for a field
clientIndexConfig.addFieldConfig(FieldConfigBuilder.create("abstract", FieldType.STRING).indexAs("myAnalyzer"));
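To make the "tokenizer plus ordered list of filters" idea concrete, here is a minimal standard-library sketch of the same pipeline shape as the myAnalyzer example above (whitespace tokenizer, then ASCII folding, then lowercase). It does not use Lucene or Zulia: every name in it is hypothetical, and the folding step is a rough approximation that only strips combining accents, which is far less than Lucene's ASCIIFoldingFilter covers.

```java
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.function.UnaryOperator;

/** Conceptual sketch only; not how Zulia or Lucene implement analyzers. */
final class AnalyzerPipelineSketch {

    // Rough stand-in for a whitespace tokenizer: split on runs of whitespace.
    static List<String> whitespaceTokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String token : text.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    // Rough stand-in for ASCII folding: decompose accented letters and drop the combining marks.
    static String asciiFold(String token) {
        return Normalizer.normalize(token, Normalizer.Form.NFD).replaceAll("\\p{M}+", "");
    }

    // Rough stand-in for a lowercase filter.
    static String lowercase(String token) {
        return token.toLowerCase(Locale.ROOT);
    }

    // Tokenize once, then apply every filter to each token in the order given.
    static List<String> analyze(String text, List<UnaryOperator<String>> filters) {
        List<String> result = new ArrayList<>();
        for (String token : whitespaceTokenize(text)) {
            for (UnaryOperator<String> filter : filters) {
                token = filter.apply(token);
            }
            result.add(token);
        }
        return result;
    }

    public static void main(String[] args) {
        List<UnaryOperator<String>> filters = List.of(AnalyzerPipelineSketch::asciiFold, AnalyzerPipelineSketch::lowercase);
        // "Caf\u00e9" is "Cafe" with an acute accent, "Z\u00fcrich" is "Zurich" with an umlaut
        System.out.println(analyze("Caf\u00e9  Z\u00fcrich wi-fi", filters)); // [cafe, zurich, wi-fi]
    }
}
```

Filter order matters in general: a filter that relies on case or on specific characters sees whatever the filters before it produced.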
These tokenizers are available:
STANDARD - Word Break rules from the Unicode Text Segmentation algorithm as specified in Unicode Standard Annex #29.
KEYWORD - Treat entire field as a single token
WHITESPACE - A tokenizer that divides text at whitespace characters as defined by Character.isWhitespace(int).
The corresponding Lucene classes are StandardTokenizer, KeywordTokenizer, and WhitespaceTokenizer.
These filters are available
LOWERCASE - Lowercase text
UPPERCASE - Uppercase text
STOPWORDS - Removes stop words based on defaults below or loads from a file %user.home%/.zulia/stopwords.txt
- Default stopwords - a, an, and, are, as, at, be, but, by, for, if, in, into, is,
it, no, not, of, on, or, such, that, the, their, then, there,
these, they, this, to, was, will, with
ASCII_FOLDING - Converts alphabetic, numeric, and symbolic Unicode characters which are not in the first 127 ASCII characters (the "Basic Latin" Unicode block) into their ASCII equivalents, if one exists.
KSTEM - high-performance kstem filter for english
ENGLISH_MIN_STEM - Minimal plural stemmer for English
SNOWBALL_STEM - english snowball from org.tartarus.snowball.ext.EnglishStemmer
ENGLISH_POSSESSIVE - removes possessives (trailing 's) from words.
MINHASH - Generate min hash tokens from an incoming stream of tokens. The incoming tokens would typically be 5 word shingles.
TWO_TWO_SHINGLE - Creates 2 word shingles
THREE_THREE_SHINGLE - Creates 3 word shingles
FOUR_FOUR_SHINGLE - Creates 4 word shingles
FIVE_FIVE_SHINGLE - Creates 5 word shingles
BRITISH_US - Normalizes British spellings to their US English equivalent (e.g. colour to color)
CONCAT_ALL - Can be used with the whitespace tokenizer to combine words like wi-fi into wifi, see Lucene's WordDelimiterGraphFilter
CASE_PROTECTED_WORDS - Protects the case of certain words to make them case sensitive (work in progress)
GERMAN_NORMALIZATION - Normalizes German characters according to the heuristics of the German2 snowball algorithm
Notes:
- See Lucene’s ASCIIFoldingFilter, EnglishMinimalStemFilter, EnglishPossessiveFilter, GermanNormalizationFilter, KStemFilter, MinHashFilterFactory, LowerCaseFilter, ShingleFilter, SnowballFilter, StopFilter, UpperCaseFilter, and WordDelimiterGraphFilter for more information
- Submit an issue for additional filters needed
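For readers unfamiliar with shingles, the sketch below shows what the word shingle filters listed above conceptually produce from a token stream. It is a plain-Java illustration with hypothetical names, not Lucene's ShingleFilter, and it assumes that only shingles of exactly the given size are emitted (no single-word tokens alongside them).

```java
import java.util.ArrayList;
import java.util.List;

/** Conceptual sketch of fixed-size word shingles; not Lucene's ShingleFilter. */
final class ShingleSketch {

    // Joins every run of `size` consecutive tokens with a single space.
    static List<String> shingles(List<String> tokens, int size) {
        List<String> result = new ArrayList<>();
        for (int start = 0; start + size <= tokens.size(); start++) {
            result.add(String.join(" ", tokens.subList(start, start + size)));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> tokens = List.of("the", "quick", "brown", "fox");
        System.out.println(shingles(tokens, 2)); // [the quick, quick brown, brown fox]
        System.out.println(shingles(tokens, 3)); // [the quick brown, quick brown fox]
    }
}
```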
Index Metadata
clientIndexConfig.setMeta(new Document("category", "special").append("otherKey", 10));
Warmed Searches
Search search1 = new Search("someIndex").addQuery(new FilterQuery("the best query")).setSearchLabel("custom");
Search search2 = new Search("someIndex").addQuery(new FilterQuery("the worst query")).setSearchLabel("mine");
clientIndexConfig.addWarmingSearch(search1);
clientIndexConfig.addWarmingSearch(search2);
Field Mapping
clientIndexConfig.addFieldMapping(new FieldMapping("title").addMappedFields("longTitle","shortTitle"));
clientIndexConfig.addFieldMapping(new FieldMapping("category").addMappedFields("category-*"));
clientIndexConfig.addFieldMapping(new FieldMapping("rating").addMappedFields("otherRating").includeSelf());
Update Index
To replace the entire index config, use the CreateIndex command. For partial updates, use UpdateIndex.
Basic Usage
UpdateIndex updateIndex = new UpdateIndex("someIndex");
// ... any set of changes listed below (can change multiple things at once)
UpdateIndexResult updateIndexResult = zuliaWorkPool.updateIndex(updateIndex);
// full index settings are returned after the change that can be accessed if needed
IndexSettings fullIndexSettings = updateIndexResult.getFullIndexSettings();
Numeric Settings
UpdateIndex updateIndex = new UpdateIndex("someIndex");
// selectively call setXXX on the settings that you want to change
// if set is not called there will be no changes to that setting
updateIndex.setIndexWeight(10);
// ...
zuliaWorkPool.updateIndex(updateIndex);
Add/Change Field(s)
UpdateIndex updateIndex = new UpdateIndex("someIndex");
// if a field myField or otherField exists, it will be updated with these settings
FieldConfigBuilder myField = FieldConfigBuilder.createString("myField").indexAs(DefaultAnalyzers.STANDARD).sort();
FieldConfigBuilder otherField = FieldConfigBuilder.createString("otherField").indexAs(DefaultAnalyzers.LC_KEYWORD).sort();
updateIndex.mergeFieldConfig(myField, otherField);
zuliaWorkPool.updateIndex(updateIndex);
Replace Fields
UpdateIndex updateIndex = new UpdateIndex("someIndex");
// replaces all fields with the two fields given
FieldConfigBuilder myField = FieldConfigBuilder.createString("myField").indexAs(DefaultAnalyzers.STANDARD).sort();
FieldConfigBuilder otherField = FieldConfigBuilder.createString("otherField").indexAs(DefaultAnalyzers.LC_KEYWORD).sort();
updateIndex.replaceFieldConfig(myField, otherField);
zuliaWorkPool.updateIndex(updateIndex);
Remove Fields
UpdateIndex updateIndex = new UpdateIndex("someIndex");
// removes the stored field with name myField if it exists
updateIndex.removeFieldConfigByStoredName(List.of("myField"));
zuliaWorkPool.updateIndex(updateIndex);
Add/Change Custom Analyzers
UpdateIndex updateIndex = new UpdateIndex("someIndex");
// if an analyzer custom or mine exists, it will be updated with these settings, otherwise they are added
ZuliaIndex.AnalyzerSettings custom = ZuliaIndex.AnalyzerSettings.newBuilder().setName("custom").addFilter(Filter.LOWERCASE).build();
ZuliaIndex.AnalyzerSettings mine = ZuliaIndex.AnalyzerSettings.newBuilder().setName("mine").addFilter(Filter.LOWERCASE).addFilter(Filter.BRITISH_US)
.build();
updateIndex.mergeAnalyzerSettings(custom, mine);
Replace Custom Analyzers
UpdateIndex updateIndex = new UpdateIndex("someIndex");
// replaces all analyzers with the two custom analyzers given
ZuliaIndex.AnalyzerSettings custom = ZuliaIndex.AnalyzerSettings.newBuilder().setName("custom").addFilter(Filter.LOWERCASE).build();
ZuliaIndex.AnalyzerSettings mine = ZuliaIndex.AnalyzerSettings.newBuilder().setName("mine").addFilter(Filter.LOWERCASE).addFilter(Filter.BRITISH_US)
.build();
updateIndex.replaceAnalyzerSettings(custom, mine);
Remove Custom Analyzer
UpdateIndex updateIndex = new UpdateIndex("someIndex");
// removes the custom analyzer with name myCustomOne if it exists
updateIndex.removeAnalyzerSettingsByName(List.of("myCustomOne"));
zuliaWorkPool.updateIndex(updateIndex);
Add/Change Warmed Searches
UpdateIndex updateIndex = new UpdateIndex("someIndex");
// if a warmed search with search label custom or mine exists, it will be updated with these settings, otherwise they are added
Search search1 = new Search("someIndex").addQuery(new FilterQuery("the best query")).setSearchLabel("custom");
Search search2 = new Search("someIndex").addQuery(new FilterQuery("the worst query")).setSearchLabel("mine");
updateIndex.mergeWarmingSearches(search1, search2);
Replace Warmed Searches
UpdateIndex updateIndex = new UpdateIndex("someIndex");
// replaces all warmed searches with the given warmed searches
Search search1 = new Search("someIndex").addQuery(new FilterQuery("some stuff")).setSearchLabel("the best label");
Search search2 = new Search("someIndex").addQuery(new FilterQuery("more stuff")).setSearchLabel("the good label");
updateIndex.replaceWarmingSearches(search1, search2);
Remove Warmed Search
UpdateIndex updateIndex = new UpdateIndex("someIndex");
// removes the warmed search with search label myCustomOne if it exists
updateIndex.removeWarmingSearchesByLabel(List.of("myCustomOne"));
zuliaWorkPool.updateIndex(updateIndex);
Add/Change Metadata
UpdateIndex updateIndex = new UpdateIndex("someIndex");
// replaces key someKey with value 5 and otherKey with value "a string" if they exist, otherwise adds them to the metadata (putAll with new metadata)
updateIndex.mergeMetadata(new Document().append("someKey", 5).append("otherKey", "a string"));
zuliaWorkPool.updateIndex(updateIndex);
Replace Metadata
UpdateIndex updateIndex = new UpdateIndex("someIndex");
// replaces metadata document with the document below
updateIndex.replaceMetadata(new Document().append("stuff", "for free"));
zuliaWorkPool.updateIndex(updateIndex);
Remove Metadata
UpdateIndex updateIndex = new UpdateIndex("someIndex");
// removes the keys below from the metadata object if they exist
updateIndex.removeMetadataByKey(List.of("oneKey", "twoKey", "redKey", "blueKey"));
zuliaWorkPool.updateIndex(updateIndex);
Add/Change Field Mapping
UpdateIndex updateIndex = new UpdateIndex("someIndex");
// if a field mapping with alias test1 or test2 exists, it will be updated with these mappings, otherwise they are added
FieldMappingBuilder test1 = new FieldMapping("test1").addMappedFields("field1", "field2");
FieldMappingBuilder test2 = new FieldMapping("test2").addMappedFields("field3", "fieldPattern4*").includeSelf();
updateIndex.mergeFieldMapping(test1, test2);
Replace Field Mapping
UpdateIndex updateIndex = new UpdateIndex("someIndex");
// replaces all field mappings with the two field mappings given
FieldMappingBuilder special = new FieldMapping("special").addMappedFields("specialist", "specialThings").includeSelf();
FieldMappingBuilder custom = new FieldMapping("custom").addMappedFields("custom*");
updateIndex.replaceFieldMapping(special, custom);
Remove Field Mapping
UpdateIndex updateIndex = new UpdateIndex("someIndex");
// removes the field mapping with alias special if it exists
updateIndex.removeFieldMappingByAlias("special");
zuliaWorkPool.updateIndex(updateIndex);
Delete Index
Basic Delete
zuliaWorkPool.deleteIndex("myIndex");
Delete Index and Associated Files
DeleteIndex deleteIndex = new DeleteIndex("myIndex").setDeleteAssociated(true);
zuliaWorkPool.deleteIndex(deleteIndex);
Storing / Indexing Documents
Zulia supports indexing and storing from object annotations. For more info see section on Object Persistence
Result Document Storage
Simple Store
Document document = new Document();
document.put("id", "myid222");
document.put("title", "Magic Java Beans");
document.put("issn", "4321-4321");
Store store = new Store("myid222", "myIndexName").setResultDocument(document);
zuliaWorkPool.store(store);
Simple Store Json
String json = """
{
"documentId": "someId",
"docType": "pdf",
"docAuthor": "Java Developer Zone",
"docTitle": "Elastic Search Blog",
"isParent": false,
"parentDocId": 1,
"docLanguage": [
"en",
"czech"
]
}""";
Store store = new Store("someId", "myIndexName").setResultDocument(json);
zuliaWorkPool.store(store);
Store with Metadata
Document document = new Document();
document.put("id", "myid222");
document.put("title", "Magic Java Beans");
document.put("issn", "4321-4321");
Store store = new Store("myid222", "myIndexName");
ResultDocBuilder resultDocumentBuilder = new ResultDocBuilder().setDocument(document);
//optional metadata document
resultDocumentBuilder.setMetadata(new Document().append("test1", "val1").append("test2", "val2"));
store.setResultDocument(resultDocumentBuilder);
zuliaWorkPool.store(store);
Storing Associated Documents
AssociatedBuilder associatedBuilder = new AssociatedBuilder();
associatedBuilder.setFilename("myfile2.txt");
// either set as text
associatedBuilder.setDocument("Some Text3");
// or as bytes
associatedBuilder.setDocument(new byte[]{0, 1, 2, 3});
associatedBuilder.setMetadata(new Document().append("mydata", "myvalue2").append("sometypeinfo", "text file2"));
//can be part of the same store request as the document
Store store = new Store("myid123", "someIndex");
//multiple associated documents can be added at once
store.addAssociatedDocument(associatedBuilder);
zuliaWorkPool.store(store);
Storing Large Associated Documents (Streaming)
StoreLargeAssociated storeLargeAssociated = new StoreLargeAssociated("myid333", "myIndexName", "myfilename", new File("/tmp/myFile"));
zuliaWorkPool.storeLargeAssociated(storeLargeAssociated);
Fetching Documents
Fetch Document
FetchDocument fetchDocument = new FetchDocument("myid222", "myIndex");
FetchResult fetchResult = zuliaWorkPool.fetch(fetchDocument);
if (fetchResult.hasResultDocument()) {
Document document = fetchResult.getDocument();
//Get optional Meta
Document meta = fetchResult.getMeta();
}
Fetch All Associated
FetchAllAssociated fetchAssociated = new FetchAllAssociated("myid123", "myIndexName");
FetchResult fetchResult = zuliaWorkPool.fetch(fetchAssociated);
if (fetchResult.hasResultDocument()) {
Document object = fetchResult.getDocument();
//Get optional metadata
Document meta = fetchResult.getMeta();
}
for (AssociatedResult ad : fetchResult.getAssociatedDocuments()) {
//use correct function for document type
String text = ad.getDocumentAsUtf8();
// OR
byte[] documentAsBytes = ad.getDocumentAsBytes();
//get optional metadata
Document meta = ad.getMeta();
String filename = ad.getFilename();
}
Fetch Associated
FetchAssociated fetchAssociated = new FetchAssociated("myid123", "myIndexName", "myfile2");
FetchResult fetchResult = zuliaWorkPool.fetch(fetchAssociated);
AssociatedResult ad = fetchResult.getFirstAssociatedDocument();
//use correct function for document type
String text = ad.getDocumentAsUtf8();
// OR
byte[] documentAsBytes = ad.getDocumentAsBytes();
//get optional metadata
Document meta = ad.getMeta();
String filename = ad.getFilename();
Fetch Large Associated (Streaming)
FetchLargeAssociated fetchLargeAssociated = new FetchLargeAssociated("myid333", "myIndexName", "myfilename", new File("/tmp/myFetchedFile"));
zuliaWorkPool.fetchLargeAssociated(fetchLargeAssociated);
Querying
Simple Query with only ids returned
Search search = new Search("myIndexName").setAmount(10);
search.addQuery(new ScoredQuery("issn:1234-1234 AND title:special"));
search.setResultFetchType(ZuliaQuery.FetchType.NONE); // just return the score and unique id
SearchResult searchResult = zuliaWorkPool.search(search);
long totalHits = searchResult.getTotalHits();
System.out.println("Found <" + totalHits + "> hits");
for (CompleteResult completeResult : searchResult.getCompleteResults()) {
System.out.println("Matching document <" + completeResult.getUniqueId() + "> with score <" + completeResult.getScore() + ">");
}
Simple Query with full documents returned
Search search = new Search("myIndexName").setAmount(10);
search.addQuery(new ScoredQuery("issn:1234-1234 AND title:special"));
search.setResultFetchType(ZuliaQuery.FetchType.FULL); //return the full bson document that was stored
SearchResult searchResult = zuliaWorkPool.search(search);
long totalHits = searchResult.getTotalHits();
System.out.println("Found <" + totalHits + "> hits");
for (Document document : searchResult.getDocuments()) {
System.out.println("Matching document <" + document + ">");
}
Caching
// make sure this search stays in the query cache until the index is changed or zulia is restarted
Search search = new Search("myIndexName").setAmount(10);
search.addQuery(new ScoredQuery("issn:1234-1234 AND title:special"));
search.setPinToCache(true);
// Alternatively can force search to not be cached. Searches that return more results than shardQueryCacheMaxAmount are not cached regardless
search.setDontCache(true);
Search Multiple Indexes
Search search = new Search("myIndexName", "myOtherIndex").setAmount(10);
search.addQuery(new ScoredQuery("issn:1234-1234 AND title:special"));
SearchResult searchResult = zuliaWorkPool.search(search);
long totalHits = searchResult.getTotalHits();
System.out.println("Found <" + totalHits + "> hits");
for (CompleteResult completeResult : searchResult.getCompleteResults()) {
Document doc = completeResult.getDocument();
System.out.println("Matching document <" + completeResult.getUniqueId() + "> with score <" + completeResult.getScore() + "> from index <" + completeResult.getIndexName() + ">");
System.out.println(" full document <" + doc + ">");
}
Sorting
Search search = new Search("myIndexName").setAmount(100);
search.addQuery(new FilterQuery("title:(brown AND bear)"));
// can add multiple sorts with ascending or descending (default ascending)
// can also specify whether missing values are returned first or last (default missing first)
search.addSort(new Sort("year").descending());
search.addSort(new Sort("journal").ascending().missingLast());
SearchResult searchResult = zuliaWorkPool.search(search);
Query Fields
Query fields set the search fields used when a term in the query does not name a field. If query fields are not set on the query and a term is not qualified, the default search fields on the index are used.
Query Fields Given
Search search = new Search("myIndexName").setAmount(100);
// search for lung in title,abstract AND cancer in title,abstract AND treatment in title
search.addQuery(new ScoredQuery("lung cancer title:treatment").addQueryFields("title", "abstract").setDefaultOperator(ZuliaQuery.Query.Operator.AND));
Default Query Fields
// search for lung in default index fields OR cancer in default index fields
// OR is the default operator unless set
Search search = new Search("myIndexName").setAmount(100);
search.addQuery(new ScoredQuery("lung cancer"));
Wildcard Query Fields
Search search = new Search("myIndexName").setAmount(100);
// search for lung in any field starting with title and abstract AND cancer in any field starting with title and abstract
// can also use title*:someTerm in a query, see Query Syntax Documentation
search.addQuery(new ScoredQuery("lung cancer").addQueryFields("title*", "abstract").setDefaultOperator(ZuliaQuery.Query.Operator.AND));
Highlighting
Search search = new Search("myIndexName").setAmount(100);
search.addQuery(new ScoredQuery("lung cancer").addQueryFields("title").setDefaultOperator(ZuliaQuery.Query.Operator.AND));
//can optionally set the pre and post tags for the highlight and the number of fragments on the Highlight object
search.addHighlight(new Highlight("title"));
SearchResult searchResult = zuliaWorkPool.search(search);
for (CompleteResult completeResult : searchResult.getCompleteResults()) {
Document document = completeResult.getDocument();
List<String> titleHighlightsForDoc = completeResult.getHighlightsForField("title");
}
Filter Queries
Filter queries are the same as scored queries except they do not require the search engine to compute a score. They should be used in cases where a sort is being applied and a score is not needed or when a filter should not influence the relevance score. Filter queries and scored queries can be combined together.
Search search = new Search("myIndexName").setAmount(100);
// include only years 2020 forward
search.addQuery(new FilterQuery("year:[2020 TO *]"));
// require both terms to be matched in either the title or abstract
search.addQuery(new FilterQuery("cheetah cub").setDefaultOperator(ZuliaQuery.Query.Operator.AND).addQueryFields("title", "abstract"));
// require two out of the three terms in the abstract
search.addQuery(new FilterQuery("sleep play run").setMinShouldMatch(2).addQueryField("abstract"));
// exclude the journal nature
search.addQuery(new FilterQuery("journal:Nature").exclude());
SearchResult searchResult = zuliaWorkPool.search(search);
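The "two out of the three terms" condition set through setMinShouldMatch(2) above can be pictured with a small standard-library sketch. This is only a model of the matching rule on already-tokenized text, with hypothetical names; the actual matching is done by the search engine against the analyzed field.

```java
import java.util.List;
import java.util.Set;

/** Conceptual model of a minimum-should-match rule; not Zulia or Lucene code. */
final class MinShouldMatchSketch {

    // True when at least `minShouldMatch` of the query terms occur in the document's tokens.
    static boolean matches(Set<String> documentTokens, List<String> queryTerms, int minShouldMatch) {
        long matched = queryTerms.stream().filter(documentTokens::contains).count();
        return matched >= minShouldMatch;
    }

    public static void main(String[] args) {
        List<String> terms = List.of("sleep", "play", "run");
        // contains sleep and play: two of three terms
        System.out.println(matches(Set.of("cubs", "sleep", "and", "play"), terms, 2)); // true
        // contains only run: one of three terms
        System.out.println(matches(Set.of("cubs", "run", "fast"), terms, 2)); // false
    }
}
```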
Query Helpers
FilterFactory for numerics
Search search = new Search("myIndexName");
// Search for pub years in range [2015, 2020]
search.addQuery(FilterFactory.rangeInt("pubYear").setRange(2015, 2020));
search = new Search("myIndexName");
// Search for pubs for any year before 2020
search.addQuery(FilterFactory.rangeInt("pubYear").setMaxValue(2020).setEndpointBehavior(RangeBehavior.EXCLUSIVE));
Values for tokens
String query;
// (a OR b)
query = Values.any().of("a", "b").asString();
// ("slow cat" AND "pink shirt")
query = Values.all().valueHandlerChain(List.of(String::toLowerCase, Values.VALUE_QUOTER)).of("slow cat", "Pink Shirt").asString();
// ("slow cat" AND "Pink Shirt")
Function<String, String> quoteAndTrim = s -> Values.VALUE_QUOTER.apply(s).trim(); // Values.VALUE_QUOTER is default value handler
query = Values.all().valueHandler(quoteAndTrim).of(" slow cat ", " Pink Shirt ").asString();
// title,abstract:(a OR b)
query = Values.any().of("a", "b").withFields("title", "abstract").asString();
// -title,abstract:(a OR b OR c)
query = Values.any().of("a", "b", "c").withFields("title", "abstract").exclude().asString();
// title,abstract:(\"fast dog\" OR b OR c)~2
query = Values.atLeast(2).of("fast dog", "b", "c").withFields("title", "abstract").asString();
// -title,abstract:(a OR b OR c)~2
query = Values.atLeast(2).of("a", "b", "c").withFields("title", "abstract").exclude().asString();
FilterQuery fq;
// fq = new FilterQuery("\"fast dog\" b c").setDefaultOperator(ZuliaQuery.Query.Operator.OR).exclude().addQueryFields("title", "abstract").setMinShouldMatch(2)
fq = Values.atLeast(2).of("fast dog", "b", "c").withFields("title", "abstract").exclude().asFilterQuery();
ScoredQuery sq;
// sq = new ScoredQuery("\"slow cat\" b c").setDefaultOperator(ZuliaQuery.Query.Operator.OR).addQueryFields("title", "abstract").setMinShouldMatch(2);
sq = Values.atLeast(2).of("slow cat", "b", "c").withFields("title", "abstract").asScoredQuery();
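The query strings shown in the comments above follow a regular shape: an optional leading minus for exclusion, an optional comma-separated field list, the values joined by an operator inside parentheses, and an optional ~N suffix for the minimum number of matches. The sketch below rebuilds that shape with plain string handling so the format is easier to read. It is not the Values implementation: all names are hypothetical, and quoting only values that contain whitespace is an assumption inferred from the example comments.

```java
import java.util.List;
import java.util.stream.Collectors;

/** Illustrates the query string shape from the comments above; not Zulia's Values class. */
final class ValuesFormatSketch {

    // Assumption: a value is wrapped in double quotes only when it contains whitespace.
    static String quoteIfNeeded(String value) {
        return value.chars().anyMatch(Character::isWhitespace) ? "\"" + value + "\"" : value;
    }

    // operator is "OR" or "AND"; minShouldMatch of 0 means no ~N suffix.
    static String render(List<String> fields, List<String> values, String operator, int minShouldMatch, boolean exclude) {
        String body = values.stream()
                .map(ValuesFormatSketch::quoteIfNeeded)
                .collect(Collectors.joining(" " + operator + " ", "(", ")"));
        StringBuilder query = new StringBuilder();
        if (exclude) {
            query.append('-');
        }
        if (!fields.isEmpty()) {
            query.append(String.join(",", fields)).append(':');
        }
        query.append(body);
        if (minShouldMatch > 0) {
            query.append('~').append(minShouldMatch);
        }
        return query.toString();
    }

    public static void main(String[] args) {
        System.out.println(render(List.of(), List.of("a", "b"), "OR", 0, false));
        // (a OR b)
        System.out.println(render(List.of("title", "abstract"), List.of("fast dog", "b", "c"), "OR", 2, false));
        // title,abstract:("fast dog" OR b OR c)~2
        System.out.println(render(List.of("title", "abstract"), List.of("a", "b", "c"), "OR", 2, true));
        // -title,abstract:(a OR b OR c)~2
    }
}
```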
Term Queries
Optimized search for many terms. The terms given are not analyzed, so they must exactly match the terms stored in the index. This is most useful for fields such as ids that are indexed without analysis using KEYWORD or lightly analyzed with something like LC_KEYWORD (lowercase keyword).
Search search = new Search("myIndexName").setAmount(100);
// search for the terms 1,2,3,4 in the field id
search.addQuery(new TermQuery("id").addTerms("1", "2", "3", "4"));
SearchResult searchResult = zuliaWorkPool.search(search);
Numeric Set Queries
Optimized search for many numeric terms
Search search = new Search("myIndexName").setAmount(100);
//search for values 1, 5, 7, 9 in the field intField
search.addQuery(new NumericSetQuery("intField").addValues(1, 5, 7, 9));
Vector Queries
Vector Indexing and Basic Queries
// create an index config and add a vector field config
ClientIndexConfig indexConfig = new ClientIndexConfig();
// call createVector or createUnitVector depending on if the vector is unit normalized
indexConfig.addFieldConfig(FieldConfigBuilder.createUnitVector("v").index());
// ...
indexConfig.setIndexName("vectorTestIndex");
// could also use updateIndex with mergeFieldConfig to add a vector field to an existing index
zuliaWorkPool.createIndex(indexConfig);
// store some documents with a vector field
Document mongoDocument = new Document();
float[] vector = new float[]{ 0, 0, 0.70710678f, 0.70710678f };
mongoDocument.put("v", Floats.asList(vector));
Store s = new Store("someId", "vectorTestIndex").setResultDocument(mongoDocument);
zuliaWorkPool.store(s);
Search search = new Search("vectorTestIndex").setAmount(100);
// returns the top 3 documents closest to [1.0,0,0,0] in the field v
search.addQuery(new VectorTopNQuery(new float[] { 1.0f, 0.0f, 0.0f, 0.0f }, 3, "v"));
SearchResult searchResult = zuliaWorkPool.search(search);
Pre Filters with Vector Queries
Search search = new Search("vectorTestIndex").setAmount(100);
// filters for blue in the description then returns the top 3 documents closest to [1.0,0,0,0] in the field v
StandardQuery descriptionQuery = new FilterQuery("blue").addQueryField("description");
search.addQuery(new VectorTopNQuery(new float[] { 1.0f, 0.0f, 0.0f, 0.0f }, 3, "v").addPreFilterQuery(descriptionQuery));
Post Filters with Vector Queries
Search search = new Search("vectorTestIndex").setAmount(100);
// returns the top 3 documents closest to [1.0,0,0,0] in the field v, then filters for red in the description (possibly fewer than 3 results remain)
search.addQuery(new VectorTopNQuery(new float[] { 1.0f, 0.0f, 0.0f, 0.0f }, 3, "v"));
search.addQuery(new FilterQuery("red").addQueryField("description"));
Count Facets
// Can set number of documents to return to 0 or omit setAmount unless you want the documents at the same time
// normally is combined with a FilterQuery or ScoredQuery to count a set of results
Search search = new Search("myIndexName").setAmount(0);
search.addCountFacet(new CountFacet("issn").setTopN(20));
SearchResult searchResult = zuliaWorkPool.search(search);
for (ZuliaQuery.FacetCount fc : searchResult.getFacetCounts("issn")) {
System.out.println("Facet <" + fc.getFacet() + "> with count <" + fc.getCount() + ">");
}
Numeric Stat
// show number of values, number of documents, min, max, and sum for field pubYear
// normally is combined with a FilterQuery or ScoredQuery to count a set of results
Search search = new Search("myIndexName").setAmount(100);
search.addStat(new NumericStat("pubYear"));
SearchResult searchResult = zuliaWorkPool.search(search);
ZuliaQuery.FacetStats pyFieldStat = searchResult.getNumericFieldStat("pubYear");
System.out.println(pyFieldStat.getMin()); // minimum value for the field
System.out.println(pyFieldStat.getMax()); // maximum value for the field
System.out.println(pyFieldStat.getSum()); // sum of the values for the field, use one of the counts below for the average/mean
System.out.println(pyFieldStat.getDocCount()); // count of documents with the field not null
System.out.println(pyFieldStat.getAllDocCount()); // count of documents matched by the query
System.out.println(pyFieldStat.getValueCount()); // count of total number of values in the field (equal to document count except for multivalued fields)
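As the getSum() comment above notes, the mean is not returned directly and has to be derived from the sum and one of the counts. The following standard-library sketch shows the two natural choices, using plain numbers in place of a real FacetStats object. The class and method names are hypothetical, and returning NaN for an empty count is a choice made for this sketch.

```java
/** Arithmetic sketch for deriving a mean from a sum and a count; names are hypothetical. */
final class FieldMeanSketch {

    // Divides the sum by the chosen count; returns NaN when there is nothing to average.
    static double mean(double sum, long count) {
        return count == 0 ? Double.NaN : sum / count;
    }

    public static void main(String[] args) {
        // Single-valued field: 3 documents with values 2014, 2015, 2016
        System.out.println(mean(6045, 3)); // 2015.0

        // Multivalued field: 3 documents holding 4 values in total that sum to 12
        System.out.println(mean(12, 4)); // mean per value: 3.0
        System.out.println(mean(12, 3)); // mean per document: 4.0
    }
}
```

Dividing by the value count gives the average value, while dividing by the document count gives the average total per document. The two only differ for multivalued fields.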
Numeric Stat with Percentiles
List<Double> percentiles = List.of(
0.0, // 0th percentile (min) - can be retrieved without percentiles
0.25, // 25th percentile
0.50, // median
0.75, // 75th percentile
1.0 // 100th percentile (max) - can be retrieved without percentiles
);
Search search = new Search("myIndexName");
// Get the requested percentiles within 1% of their true value
search.addStat(new NumericStat("pubYear").setPercentiles(percentiles).setPercentilePrecision(0.01));
SearchResult searchResult = zuliaWorkPool.search(search);
for (ZuliaQuery.Percentile percentile : searchResult.getNumericFieldStat("pubYear").getPercentilesList()) {
System.out.println(percentile.getPoint() + " -> " + percentile.getValue());
}
Stat Facet
// return the highest sum on author count for each journal name
Search search = new Search("myIndexName").setAmount(100);
search.addStat(new StatFacet("authorCount", "journalName"));
SearchResult searchResult = zuliaWorkPool.search(search);
// journals ordered by the sum of author count
List<ZuliaQuery.FacetStats> authorCountForJournalName = searchResult.getFacetFieldStat("authorCount", "journalName");
for (ZuliaQuery.FacetStats journalStats : authorCountForJournalName) {
System.out.println(journalStats.getFacet()); // the journal
System.out.println(journalStats.getMin()); // minimum value of author count for journal
System.out.println(journalStats.getMax()); // maximum value of author count for journal
System.out.println(journalStats.getSum()); // sum of the values of author count for journal, use counts below for average/mean
System.out.println(journalStats.getDocCount()); // count of documents for the journal where the author count is not null
System.out.println(journalStats.getAllDocCount()); // count of documents for the journal
System.out.println(journalStats.getValueCount()); // count of total number of values of author count for the journal (equal to document count except for multivalued fields)
}
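To make the shape of a stat facet result concrete, the sketch below computes the same kind of per-facet numbers (min, max, sum, count) locally over a handful of in-memory documents and orders the facets by sum. It assumes descending order by sum and is a local illustration, not Zulia's implementation. `LocalStatFacet` and `Doc` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.IntSummaryStatistics;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Local illustration only; not how the Zulia server computes stat facets.
final class LocalStatFacet {
    private LocalStatFacet() {
    }

    /** One document reduced to the two fields of interest (hypothetical type). */
    static final class Doc {
        final String journalName;
        final int authorCount;

        Doc(String journalName, int authorCount) {
            this.journalName = journalName;
            this.authorCount = authorCount;
        }
    }

    /** Groups author counts by journal and orders journals by sum, highest first. */
    static List<Map.Entry<String, IntSummaryStatistics>> statsByJournal(List<Doc> docs) {
        Map<String, IntSummaryStatistics> byJournal = new TreeMap<>();
        for (Doc doc : docs) {
            byJournal.computeIfAbsent(doc.journalName, k -> new IntSummaryStatistics())
                    .accept(doc.authorCount);
        }
        List<Map.Entry<String, IntSummaryStatistics>> ordered = new ArrayList<>(byJournal.entrySet());
        // stable sort: journals with equal sums stay in alphabetical order
        ordered.sort((a, b) -> Long.compare(b.getValue().getSum(), a.getValue().getSum()));
        return ordered;
    }

    public static void main(String[] args) {
        List<Doc> docs = List.of(new Doc("Nature", 3), new Doc("Science", 10),
                new Doc("Nature", 5), new Doc("Cell", 2));
        for (Map.Entry<String, IntSummaryStatistics> entry : statsByJournal(docs)) {
            IntSummaryStatistics s = entry.getValue();
            System.out.println(entry.getKey() + " min=" + s.getMin() + " max=" + s.getMax()
                    + " sum=" + s.getSum() + " count=" + s.getCount());
        }
    }
}
```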
Stat Facet Percentiles
//get the 25th percentile, median, and 75th percentile of author count for the journal names
Search search = new Search("myIndexName").setAmount(100);
search.addStat(new StatFacet("authorCount", "journalName").setPercentiles(List.of(0.25, 0.5, 0.75)).setPercentilePrecision(0.01));
SearchResult searchResult = zuliaWorkPool.search(search);
// journals ordered by the sum of author count
List<ZuliaQuery.FacetStats> authorCountForJournalName = searchResult.getFacetFieldStat("authorCount", "journalName");
for (ZuliaQuery.FacetStats journalStats : authorCountForJournalName) {
for (ZuliaQuery.Percentile percentile : journalStats.getPercentilesList()) {
System.out.println(percentile.getPoint() + " -> " + percentile.getValue());
}
// journalStats also has facet, min, max, sum, and counts, as in the previous example
}
Drilling Down Facets
Search search = new Search("myIndexName").setAmount(100);
search.addFacetDrillDown("issn", "1111-1111");
SearchResult searchResult = zuliaWorkPool.search(search);
Getting the second page of results with a cursor
Search search = new Search("myIndexName");
search.setAmount(100);
search.addQuery(new ScoredQuery("issn:1234-1234 AND title:special"));
// on a changing index a sort is necessary
// it can be a sort on another field AND id as well
search.addSort(new Sort("id"));
SearchResult firstResult = zuliaWorkPool.search(search);
search.setLastResult(firstResult);
SearchResult secondResult = zuliaWorkPool.search(search);
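The comments above note that a sort is needed when the index is changing. One way to see why is the general keyset ("search after") idea: the next page is defined by the last sort value seen rather than by an offset, so documents added between requests do not cause duplicates. The sketch below is a local illustration over a sorted list of ids and makes no claim about Zulia's internals. `KeysetPaging` and `nextPage` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Local illustration of cursor (keyset) paging; not the Zulia implementation.
final class KeysetPaging {
    private KeysetPaging() {
    }

    /** Returns up to pageSize ids that sort strictly after lastId (null starts from the beginning). */
    static List<String> nextPage(List<String> sortedIds, String lastId, int pageSize) {
        List<String> page = new ArrayList<>();
        for (String id : sortedIds) {
            if (lastId != null && id.compareTo(lastId) <= 0) {
                continue; // already returned on an earlier page
            }
            page.add(id);
            if (page.size() == pageSize) {
                break;
            }
        }
        return page;
    }

    public static void main(String[] args) {
        List<String> ids = new ArrayList<>(List.of("a", "b", "c", "d", "e"));
        List<String> first = nextPage(ids, null, 2); // [a, b]
        ids.add(1, "aa"); // the index changes between the two requests
        List<String> second = nextPage(ids, first.get(first.size() - 1), 2); // [c, d]
        System.out.println(first + " then " + second);
    }
}
```

With offset-based paging the second request in this example would start at position 2 and return `b` again, because the new id shifted everything after it by one.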
Getting all results with a cursor
Search search = new Search("myIndexName");
search.setAmount(100); //this will be the page size
search.addQuery(new ScoredQuery("issn:1234-1234 AND title:special"));
// on a changing index a sort is necessary
// it can be a sort on another field AND id as well
search.addSort(new Sort("id"));
//variation 1 - requires fetch type full (default)
zuliaWorkPool.searchAllAsDocument(search, document -> {
// do something with mongo bson document
});
//variation 2 - when score is needed, searching multiple indexes and index name is needed, or fetch type is NONE/META
zuliaWorkPool.searchAllAsScoredResult(search, scoredResult -> {
System.out.println(scoredResult.getUniqueId() + " has score " + scoredResult.getScore() + " for index " + scoredResult.getIndexName());
// if result fetch type is full (default)
Document document = ResultHelper.getDocumentFromScoredResult(scoredResult);
});
//variation 3 - each page is returned as a search result. less convenient but gives access to total hits
zuliaWorkPool.searchAll(search, searchResult -> {
System.out.println("There are " + searchResult.getTotalHits());
// variation 3a - requires fetch type full (default)
for (Document document : searchResult.getDocuments()) {
// do something with mongo bson document
}
// variation 3b - when score is needed, searching multiple indexes and index name is needed, or fetch type is NONE/META
for (CompleteResult result : searchResult.getCompleteResults()) {
System.out.println("Result for <" + result.getIndexName() + "> with score <" + result.getScore() + ">");
//if fetch type is FULL
Document document = result.getDocument();
}
});
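Conceptually, the callback variants above save the caller from writing the paging loop by hand. A rough sketch of such a loop, written against a generic page-fetching function rather than the Zulia API, might look like this. `PageDrainer`, `forEachItem`, and the in-memory `fetchPage` are hypothetical, and using the last item of a page as the cursor is an assumption made for the illustration.

```java
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.stream.Collectors;

// Generic paging loop; a sketch of the idea, not the Zulia implementation.
final class PageDrainer {
    private PageDrainer() {
    }

    /**
     * Calls fetchPage repeatedly, passing the last item of the previous page as the
     * cursor (null on the first call), until an empty page comes back.
     * Returns the number of items handed to the handler.
     */
    static <T> long forEachItem(Function<T, List<T>> fetchPage, Consumer<T> handler) {
        long seen = 0;
        T cursor = null;
        while (true) {
            List<T> page = fetchPage.apply(cursor);
            if (page.isEmpty()) {
                return seen;
            }
            for (T item : page) {
                handler.accept(item);
                seen++;
            }
            cursor = page.get(page.size() - 1);
        }
    }

    public static void main(String[] args) {
        List<Integer> data = List.of(1, 2, 3, 4, 5, 6, 7);
        int pageSize = 3;
        // in-memory stand-in for a search that returns the next page after the cursor
        Function<Integer, List<Integer>> fetchPage = cursor -> data.stream()
                .filter(x -> cursor == null || x > cursor)
                .limit(pageSize)
                .collect(Collectors.toList());
        long total = forEachItem(fetchPage, item -> System.out.println("got " + item));
        System.out.println("handled " + total + " items");
    }
}
```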
Deleting
Delete From Index
//Deletes the document from the index but not any associated documents
DeleteFromIndex deleteFromIndex = new DeleteFromIndex("myid111", "myIndexName");
zuliaWorkPool.delete(deleteFromIndex);
Delete Completely
//Deletes the result document, the index documents, and all associated documents for an id
DeleteFull deleteFull = new DeleteFull("myid123", "myIndexName");
zuliaWorkPool.delete(deleteFull);
Delete Single Associated
//Removes a single associated document with the unique id and filename given
DeleteAssociated deleteAssociated = new DeleteAssociated("myid123", "myIndexName", "myfile2");
zuliaWorkPool.delete(deleteAssociated);
Delete All Associated
//Removes all associated documents for the unique id given
DeleteAllAssociated deleteAllAssociated = new DeleteAllAssociated("myid123", "myIndexName");
zuliaWorkPool.delete(deleteAllAssociated);
Other Operations
Get Current Document Count for Index
GetNumberOfDocsResult result = zuliaWorkPool.getNumberOfDocs("myIndexName");
System.out.println(result.getNumberOfDocs());
Get Fields for Index
GetFieldsResult result = zuliaWorkPool.getFields(new GetFields("myIndexName"));
System.out.println(result.getFieldNames());
Get Terms for Field
GetTermsResult getTermsResult = zuliaWorkPool.getTerms(new GetTerms("myIndexName", "title"));
for (ZuliaBase.Term term : getTermsResult.getTerms()) {
System.out.println(term.getValue() + ": " + term.getDocFreq());
}
Get Cluster Nodes
GetNodesResult getNodesResult = zuliaWorkPool.getNodes();
for (Node node : getNodesResult.getNodes()) {
System.out.println(node);
}
Async API
Every Function has a Corresponding Async Version
Executor executor = Executors.newCachedThreadPool();
Search search = new Search("myIndexName").setAmount(10);
ListenableFuture<SearchResult> resultFuture = zuliaWorkPool.searchAsync(search);
Futures.addCallback(resultFuture, new FutureCallback<>() {
@Override
public void onSuccess(SearchResult result) {
// handle the search result
}
@Override
public void onFailure(Throwable t) {
// handle the failure
}
}, executor);
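Guava's `ListenableFuture` extends `java.util.concurrent.Future`, so a caller that would rather block than register a callback can wait on the future with a timeout. The helper below is a minimal sketch written against the plain `Future` interface. `FutureWait` and `getOrFallback` are hypothetical names, and returning a fallback on failure is just one possible policy.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper; works with any java.util.concurrent.Future.
final class FutureWait {
    private FutureWait() {
    }

    /** Waits up to timeoutMillis for the result; returns fallback on timeout or failure. */
    static <T> T getOrFallback(Future<T> future, long timeoutMillis, T fallback) {
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // stop waiting for a result nobody will read
            return fallback;
        } catch (ExecutionException e) {
            return fallback; // the task itself failed
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return fallback;
        }
    }

    public static void main(String[] args) {
        // CompletableFuture stands in for the future returned by an async call
        Future<String> done = CompletableFuture.completedFuture("result");
        Future<String> never = new CompletableFuture<>();
        System.out.println(getOrFallback(done, 100, "fallback"));
        System.out.println(getOrFallback(never, 10, "fallback"));
    }
}
```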
Object Persistence / Mapping
Annotated Object Example
@Settings(indexName = "wikipedia", numberOfShards = 16, shardCommitInterval = 6000)
public class Article {
public Article() {
}
@UniqueId
private String id;
@Indexed(analyzerName = DefaultAnalyzers.STANDARD)
private String title;
@Indexed
private Integer namespace;
@DefaultSearch
@Indexed(analyzerName = DefaultAnalyzers.STANDARD)
private String text;
private Long revision;
@Indexed
private Integer userId;
@Indexed(analyzerName = DefaultAnalyzers.STANDARD)
private String user;
@Indexed
private Date revisionDate;
//Getters and Setters
//....
}
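For readers unfamiliar with annotation-driven mapping, the sketch below shows the general mechanism such a mapper can rely on: reading runtime annotations from a class's fields through reflection. It uses a made-up `@Marked` annotation and does not reproduce the Zulia `Mapper` or its annotations. `AnnotationScan`, `Marked`, and `Sample` are all hypothetical.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Illustration of reflection over field annotations; not the Zulia Mapper.
final class AnnotationScan {
    private AnnotationScan() {
    }

    /** Hypothetical marker annotation standing in for a field-level mapping annotation. */
    @Retention(RetentionPolicy.RUNTIME) // must be retained at runtime to be visible to reflection
    @Target(ElementType.FIELD)
    @interface Marked {
    }

    /** Sample class with two marked fields and one unmarked field. */
    static final class Sample {
        @Marked
        private String title;
        private Long revision;
        @Marked
        private Integer userId;
    }

    /** Returns the names of fields declared on the type that carry @Marked, sorted alphabetically. */
    static List<String> markedFieldNames(Class<?> type) {
        List<String> names = new ArrayList<>();
        for (Field field : type.getDeclaredFields()) {
            if (field.isAnnotationPresent(Marked.class)) {
                names.add(field.getName());
            }
        }
        Collections.sort(names); // getDeclaredFields() does not guarantee an order
        return names;
    }

    public static void main(String[] args) {
        System.out.println(markedFieldNames(Sample.class));
    }
}
```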
Creating Index for Annotated Class Example
Mapper<Article> mapper = new Mapper<>(Article.class);
zuliaWorkPool.createIndex(mapper.createOrUpdateIndex());
Storing an Object with Mapper
Article article = new Article();
//...
Store store = mapper.createStore(article);
zuliaWorkPool.store(store);
Querying with Mapper
Search search = new Search("wikipedia").setAmount(10);
search.addQuery(new ScoredQuery("title:technology"));
SearchResult searchResult = zuliaWorkPool.search(search);
List<Article> articles = searchResult.getMappedDocuments(mapper);