Java Client

Zulia Java Client

Gradle


repositories {
    mavenCentral()
    maven {
        url "https://maven.ascend-tech.us/repo/"
    }
}


dependencies {
    implementation 'io.zulia:zulia-client:4.10.0'
    implementation 'org.mongodb:mongodb-driver-sync:5.6.0'
}

Maven

<repository>
   <id>astMaven</id>
   <name>AST Maven</name>
   <url>https://maven.ascend-tech.us/repo</url>
</repository>
<dependencies>
  <dependency>
      <groupId>io.zulia</groupId>
      <artifactId>zulia-client</artifactId>
      <version>4.10.0</version>
  </dependency>
  <dependency>
      <groupId>org.mongodb</groupId>
      <artifactId>mongodb-driver-sync</artifactId>
      <version>5.6.0</version>
  </dependency>
</dependencies>

Creating a Client

The Zulia Java client is named ZuliaWorkPool. ZuliaWorkPool is a thread-safe connection pool that communicates with Zulia over gRPC on the service port. Async versions of all methods are also available and return a ListenableFuture<> of the result.

Simple Client creation

ZuliaWorkPool zuliaWorkPool = new ZuliaWorkPool(new ZuliaPoolConfig().addNode("someIp"));
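The async variants mentioned above return Guava ListenableFutures. A sketch of the callback pattern, assuming the xxxAsync naming convention and a reachable cluster:

```java
// Assumes zuliaWorkPool was created as shown above
ListenableFuture<SearchResult> future = zuliaWorkPool.searchAsync(new Search("myIndexName").setAmount(10));
Futures.addCallback(future, new FutureCallback<SearchResult>() {
    @Override
    public void onSuccess(SearchResult result) {
        System.out.println("Found <" + result.getTotalHits() + "> hits");
    }

    @Override
    public void onFailure(Throwable t) {
        t.printStackTrace();
    }
}, MoreExecutors.directExecutor());
```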

Full Client Configuration

ZuliaPoolConfig zuliaPoolConfig = new ZuliaPoolConfig();
zuliaPoolConfig.addNode("someIp");
//optionally give ports if not default values
//zuliaPoolConfig.addNode("localhost", 32191, 32192);

//optional settings (default values shown)
zuliaPoolConfig.setDefaultRetries(0);//Number of attempts to try before throwing an exception
zuliaPoolConfig.setMaxConnections(10); //Maximum connections per server
zuliaPoolConfig.setMaxIdle(10); //Maximum idle connections per server
zuliaPoolConfig.setCompressedConnection(false); //Use this for WAN client connections
zuliaPoolConfig.setPoolName(null); //For logging purposes only, null gives default of zuliaPool-n
zuliaPoolConfig.setNodeUpdateEnabled(true); //Periodically updates the nodes of the cluster to enable smart routing to the correct node. Do not use this with SSH port forwarding. This can also be done manually with zuliaWorkPool.updateNodes();
zuliaPoolConfig.setNodeUpdateInterval(10000); //Interval to update the nodes in ms
zuliaPoolConfig.setRoutingEnabled(true); //enable routing of indexing to the correct server; this only works if automatic node updating is enabled or updateNodes is called periodically by hand

//create the connection pool
ZuliaWorkPool zuliaWorkPool = new ZuliaWorkPool(zuliaPoolConfig);

Creating an Index

Basic Creation

ClientIndexConfig indexConfig = new ClientIndexConfig().setIndexName("test").addDefaultSearchField("test");
indexConfig.addFieldConfig(FieldConfigBuilder.createString("title").indexAs(DefaultAnalyzers.STANDARD));
indexConfig.addFieldConfig(FieldConfigBuilder.createString("issn").indexAs(DefaultAnalyzers.LC_KEYWORD).facet());
indexConfig.addFieldConfig(FieldConfigBuilder.createInt("an").index().sort());
// createLong, createFloat, createDouble, createBool, createDate, createVector, createUnitVector is also available
// or create(storedFieldName, fieldType)

// hierarchical facet on a string field (values use "/" as the path delimiter, e.g. "USA/California/LA")
indexConfig.addFieldConfig(FieldConfigBuilder.createString("path").indexAs(DefaultAnalyzers.LC_KEYWORD).facetHierarchical().sort());
// hierarchical facet on a date field (automatically splits into year/month/day levels)
indexConfig.addFieldConfig(FieldConfigBuilder.createDate("publishDate").index().facetHierarchical().sort());

CreateIndex createIndex = new CreateIndex(indexConfig);
zuliaWorkPool.createIndex(createIndex);
  • Calling create index again will update the index settings. However, the number of shards cannot be changed once the index is created. The number of shards can be greater than the number of nodes to future-proof the index when using sharding. Also see UpdateIndex for partial index setting updates.
  • Changing or adding analyzers for fields that are already indexed may require re-indexing for desired results.
  • Zulia supports indexes created from object annotations. For more info see section on Object Persistence.

Index Config Details

Full ClientIndexConfig settings are explained below:

defaultSearchField - The field that is searched if no field is given to a query (missing query fields or direct fielded search)
defaultAnalyzer - The default analyzer for all fields not specified by a field config
fieldConfig - Overrides the default analyzer for a field
shardCommitInterval - Indexes or deletes to shard before a commit is forced (default 3200)
idleTimeWithoutCommit - Time without indexing before commit is forced in seconds (0 disables) (default 30)
applyUncommittedDeletes - Apply all deletes before search (default true)
shardQueryCacheSize - Number of queries cached at the shard level
shardQueryCacheMaxAmount - Queries with more than this amount of documents returned are not cached

//The following are used in optimizing federation of shards when more than one shard is used.
//The amount requested from each shard on a query is (((amountRequestedByQuery / numberOfShards) + minShardRequest) * requestFactor).
requestFactor - Used in calculation of request size for a shard (default 2.0)
minShardRequest - Added to the calculated request for a shard (default 2)
shardTolerance - Difference in scores between shards tolerated before requesting full results (query request amount) from the shard (default 0.05)
defaultConcurrency - Number of virtual threads used for parallel Lucene segment search and aggregation within each shard (default 1, i.e. single-threaded). Can be overridden per query. Higher values improve latency on large shards at the cost of more CPU.
ramBufferMB - RAM buffer size in MB used by each shard's Lucene IndexWriter before flushing to disk (default 128). Increasing this value can improve indexing throughput at the cost of higher memory usage. Each shard maintains its own buffer, so total memory usage is approximately ramBufferMB × numberOfShards.
disableCompression - When true, disables compression of stored documents (default false, i.e. compression is enabled). Disabling compression reduces CPU usage during store and fetch at the cost of increased storage size.
indexWeight - Relative weight of the index used for cluster shard distribution and node load balancing (default 1, must be positive). Indexes with higher weights are treated as heavier when the cluster decides how to distribute shards across nodes. This does not affect query scoring or result ranking.
// Increase the RAM buffer for high-throughput indexing workloads
clientIndexConfig.setRamBufferMB(256);

// Disable compression for faster store/fetch when storage space is not a concern
clientIndexConfig.setDisableCompression(true);

// Give this index higher weight for shard distribution (e.g. a large, frequently queried index)
clientIndexConfig.setIndexWeight(4);

These can also be changed on an existing index using UpdateIndex:

UpdateIndex updateIndex = new UpdateIndex("myIndexName");
updateIndex.setRamBufferMB(256);
zuliaWorkPool.updateIndex(updateIndex);
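The per-shard request formula above can be checked with a quick calculation (the values are illustrative):

```java
public class ShardRequestSize {

    // Per-shard request = (((amountRequestedByQuery / numberOfShards) + minShardRequest) * requestFactor)
    static int perShard(int amountRequested, int numberOfShards, int minShardRequest, double requestFactor) {
        return (int) (((amountRequested / (double) numberOfShards) + minShardRequest) * requestFactor);
    }

    public static void main(String[] args) {
        // a query asking for 100 results on a 4-shard index with the defaults (minShardRequest 2, requestFactor 2.0)
        System.out.println(perShard(100, 4, 2, 2.0)); // prints 54
    }
}
```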

These Field Types are Available

STRING
NUMERIC_INT
NUMERIC_LONG
NUMERIC_FLOAT
NUMERIC_DOUBLE
DATE
BOOL
UNIT_VECTOR
VECTOR

These built-in Analyzers are available (DefaultAnalyzers)

KEYWORD - Field is searched as one token
LC_KEYWORD - Field is searched as one token in lowercase (case insensitive, use for wildcard searches)
LC_CONCAT_ALL
STANDARD - Standard lucene analyzer (good for general full text)
MIN_STEM - Minimal English Stemmer
KSTEMMED - K Stemmer
LSH - Locality Sensitive Hash
TWO_TWO_SHINGLE - (n-grams)
THREE_THREE_SHINGLE - (n-grams)

Field Metadata

Fields can be annotated with a display name and description for documentation and UI purposes:

indexConfig.addFieldConfig(
    FieldConfigBuilder.createDouble("rating")
        .index()
        .sort()
        .displayName("Product Rating")
        .description("Customer product rating from 1-5 stars")
);

Field metadata can be retrieved via GetIndexConfig:

ClientIndexConfig config = zuliaWorkPool.getIndexConfig("myIndexName").getIndexConfig();
ZuliaIndex.FieldConfig fieldConfig = config.getFieldConfig("rating");
String displayName = fieldConfig.getDisplayName();  // "Product Rating"
String description = fieldConfig.getDescription();  // "Customer product rating from 1-5 stars"

Field metadata is also returned by the REST endpoint GET /indexes/{indexName}.

Sort String Handling

When configuring a field as sortable, you can specify how string values are normalized for sorting using StringHandling:

STANDARD - Default. Case-sensitive, no transformation.
LOWERCASE - Converts to lowercase for case-insensitive sorting.
FOLDING - Applies ASCII folding to remove accents and diacritics (e.g. "Blāh" becomes "Blah").
LOWERCASE_FOLDING - Combines lowercase and ASCII folding for case-insensitive, accent-insensitive sorting.

// Default sort (STANDARD — case-sensitive)
FieldConfigBuilder.createString("title").indexAs(DefaultAnalyzers.STANDARD).sort();

// Case-insensitive sort
FieldConfigBuilder.createString("title").indexAs(DefaultAnalyzers.STANDARD)
    .sortAs(ZuliaIndex.SortAs.StringHandling.LOWERCASE, "titleLower");

// Accent-insensitive sort (ASCII folding)
FieldConfigBuilder.createString("category").indexAs(DefaultAnalyzers.STANDARD)
    .sortAs(ZuliaIndex.SortAs.StringHandling.FOLDING, "categoryFolded");

// Case and accent insensitive sort
FieldConfigBuilder.createString("author").indexAs(DefaultAnalyzers.STANDARD)
    .sortAs(ZuliaIndex.SortAs.StringHandling.LOWERCASE_FOLDING, "authorNormalized");

A single field can have multiple sort configurations with different field names:

// Both case-sensitive and case-insensitive sort on the same field
FieldConfigBuilder.createString("otherTitle").indexAs(DefaultAnalyzers.STANDARD)
    .sort()  // STANDARD sort as "otherTitle"
    .sortAs(ZuliaIndex.SortAs.StringHandling.LOWERCASE_FOLDING, "otherTitleFolding");  // Normalized sort

// Query using either sort field name
search.addSort(new Sort("otherTitle").ascending());          // Case-sensitive
search.addSort(new Sort("otherTitleFolding").ascending());   // Case/accent insensitive

Custom Analyzer

Custom analyzers are a combination of a tokenizer and an ordered list of filters

// define a custom analyzer for the index
clientIndexConfig.addAnalyzerSetting("myAnalyzer", Tokenizer.WHITESPACE, Arrays.asList(Filter.ASCII_FOLDING, Filter.LOWERCASE), Similarity.BM25);
// reference a custom analyzer for a field
clientIndexConfig.addFieldConfig(FieldConfigBuilder.create("abstract", FieldType.STRING).indexAs("myAnalyzer"));

These tokenizers are available:

STANDARD   - Word Break rules from the Unicode Text Segmentation algorithm as specified in Unicode Standard Annex #29.
KEYWORD    - Treat entire field as a single token
WHITESPACE - A tokenizer that divides text at whitespace characters as defined by Character.isWhitespace(int).

The corresponding Lucene classes are StandardTokenizer, KeywordTokenizer, and WhitespaceTokenizer.

These filters are available

LOWERCASE - Lowercase text
UPPERCASE - Uppercase text
STOPWORDS - Removes stop words based on defaults below or loads from a file %user.home%/.zulia/stopwords.txt
          - Default stopwords - a, an, and, are, as, at, be, but, by, for, if, in, into, is,
                                it, no, not, of, on, or, such, that, the, their, then, there,
                                these, they, this, to, was, will, with
ASCII_FOLDING - Converts alphabetic, numeric, and symbolic Unicode characters which are not
                in the first 127 ASCII characters (the "Basic Latin" Unicode block) into their ASCII equivalents, if one exists.
KSTEM - High-performance KStem filter for English
ENGLISH_MIN_STEM - Minimal plural stemmer for English
SNOWBALL_STEM - English Snowball stemmer from org.tartarus.snowball.ext.EnglishStemmer
ENGLISH_POSSESSIVE - Removes possessives (trailing 's) from words
MINHASH - Generate min hash tokens from an incoming stream of tokens. The incoming tokens would typically be 5 word shingles.
TWO_TWO_SHINGLE - Creates 2 word shingles
THREE_THREE_SHINGLE - Creates 3 word shingles
FOUR_FOUR_SHINGLE - Creates 4 word shingles
FIVE_FIVE_SHINGLE - Creates 5 word shingles
BRITISH_US - Normalizes British spellings to US English equivalents (i.e. colour to color)
CONCAT_ALL - Can be used with the whitespace tokenizer to combine words like wi-fi into wifi, see Lucene's WordDelimiterGraphFilter
CASE_PROTECTED_WORDS - Protects the case of certain words to make them case sensitive (work in progress)
GERMAN_NORMALIZATION - Normalizes German characters according to the heuristics of the German2 snowball algorithm

Notes:

  • See Lucene’s ASCIIFoldingFilter, EnglishMinimalStemFilter, EnglishPossessiveFilter, GermanNormalizationFilter, KStemFilter, MinHashFilterFactory, LowerCaseFilter, ShingleFilter, SnowballFilter, StopFilter, UpperCaseFilter, and WordDelimiterGraphFilter for more information
  • Submit an issue for additional filters needed

HTML Stripping

Fields containing HTML can be analyzed with HTML tags and entities stripped before tokenization. Enable stripHTML on the analyzer settings for the field:

// Build a custom analyzer with HTML stripping enabled
clientIndexConfig.addAnalyzerSetting(AnalyzerSettings.newBuilder()
    .setName("htmlAnalyzer")
    .setTokenizer(AnalyzerSettings.Tokenizer.STANDARD)
    .addFilter(AnalyzerSettings.Filter.LOWERCASE)
    .addFilter(AnalyzerSettings.Filter.ENGLISH_MIN_STEM)
    .setStripHTML(true)
    .build());
// Reference the analyzer for a field
clientIndexConfig.addFieldConfig(FieldConfigBuilder.create("htmlContent", FieldType.STRING).indexAs("htmlAnalyzer"));

This wraps the configured analyzer with Lucene’s HTMLStripCharFilter, which removes HTML/XML tags and decodes HTML entities (e.g. &amp; becomes &) before the text reaches the tokenizer. The stored document retains the original HTML; only the indexed form is stripped.

There is also a built-in default analyzer STANDARD_HTML that uses the standard tokenizer with lowercase, stopwords, and HTML stripping:

clientIndexConfig.addFieldConfig(FieldConfigBuilder.create("htmlContent", FieldType.STRING).indexAs(DefaultAnalyzers.STANDARD_HTML));

Index Metadata

clientIndexConfig.setMeta(new Document("category", "special").append("otherKey", 10));

Warmed Searches

Warming searches are pre-defined queries that are automatically executed after each shard commit to keep results in the query cache. Each warming search requires a unique search label (set via setSearchLabel) to identify it.

Search search1 = new Search("someIndex").addQuery(new FilterQuery("the best query")).setSearchLabel("custom");
Search search2 = new Search("someIndex").addQuery(new FilterQuery("the worst query")).setSearchLabel("mine");
clientIndexConfig.addWarmingSearch(search1);
clientIndexConfig.addWarmingSearch(search2);

The search label is required and must be unique across all warming searches for an index. It is also logged on the server when the warming search executes, making it useful for monitoring cache warm-up behavior.

Field Mapping

A field mapping defines an alias that can be used in queries in place of one or more mapped fields. Wildcard patterns are supported, and includeSelf() includes the field matching the alias name itself.

clientIndexConfig.addFieldMapping(new FieldMapping("title").addMappedFields("longTitle","shortTitle"));
clientIndexConfig.addFieldMapping(new FieldMapping("category").addMappedFields("category-*"));
clientIndexConfig.addFieldMapping(new FieldMapping("rating").addMappedFields("otherRating").includeSelf());
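With the mappings above, a query can reference the alias instead of the underlying fields. A sketch (the expansion behavior is assumed from the mapping definitions):

```java
// "title" expands to longTitle and shortTitle; "category" expands to any field matching category-*
Search search = new Search("myIndexName").setAmount(10);
search.addQuery(new ScoredQuery("title:beans"));
SearchResult searchResult = zuliaWorkPool.search(search);
```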

Update Index

To replace the entire index config, use the CreateIndex command. For partial updates, use UpdateIndex.

Basic Usage

UpdateIndex updateIndex = new UpdateIndex("someIndex");
// ... any set of changes listed below (can change multiple things at once)
UpdateIndexResult updateIndexResult = zuliaWorkPool.updateIndex(updateIndex);
// the full index settings after the change are returned and can be accessed if needed
IndexSettings fullIndexSettings = updateIndexResult.getFullIndexSettings();

Numeric Settings

UpdateIndex updateIndex = new UpdateIndex("someIndex");

// selectively call setXXX on the settings that you want to change
// if set is not called, that setting is left unchanged
updateIndex.setIndexWeight(10);
// ...
zuliaWorkPool.updateIndex(updateIndex);

Add/Change Field(s)

UpdateIndex updateIndex = new UpdateIndex("someIndex");
// if a field myField or otherField exists, it will be updated with these settings
FieldConfigBuilder myField = FieldConfigBuilder.createString("myField").indexAs(DefaultAnalyzers.STANDARD).sort();
FieldConfigBuilder otherField = FieldConfigBuilder.createString("otherField").indexAs(DefaultAnalyzers.LC_KEYWORD).sort();
updateIndex.mergeFieldConfig(myField, otherField);
zuliaWorkPool.updateIndex(updateIndex);

Replace Fields

UpdateIndex updateIndex = new UpdateIndex("someIndex");
// replaces all fields with the two fields given
FieldConfigBuilder myField = FieldConfigBuilder.createString("myField").indexAs(DefaultAnalyzers.STANDARD).sort();
FieldConfigBuilder otherField = FieldConfigBuilder.createString("otherField").indexAs(DefaultAnalyzers.LC_KEYWORD).sort();
updateIndex.replaceFieldConfig(myField, otherField);
zuliaWorkPool.updateIndex(updateIndex);

Remove Fields

UpdateIndex updateIndex = new UpdateIndex("someIndex");
// removes the stored field with name myField if it exists
updateIndex.removeFieldConfigByStoredName(List.of("myField"));
zuliaWorkPool.updateIndex(updateIndex);

Add/Change Custom Analyzers

UpdateIndex updateIndex = new UpdateIndex("someIndex");
// if an analyzer custom or mine exists, it will be updated with these settings, otherwise they are added
ZuliaIndex.AnalyzerSettings custom = ZuliaIndex.AnalyzerSettings.newBuilder().setName("custom").addFilter(Filter.LOWERCASE).build();
ZuliaIndex.AnalyzerSettings mine = ZuliaIndex.AnalyzerSettings.newBuilder().setName("mine").addFilter(Filter.LOWERCASE).addFilter(Filter.BRITISH_US)
        .build();
updateIndex.mergeAnalyzerSettings(custom, mine);

Replace Custom Analyzers

UpdateIndex updateIndex = new UpdateIndex("someIndex");
// replaces all analyzers with the two custom analyzers given
ZuliaIndex.AnalyzerSettings custom = ZuliaIndex.AnalyzerSettings.newBuilder().setName("custom").addFilter(Filter.LOWERCASE).build();
ZuliaIndex.AnalyzerSettings mine = ZuliaIndex.AnalyzerSettings.newBuilder().setName("mine").addFilter(Filter.LOWERCASE).addFilter(Filter.BRITISH_US)
        .build();
updateIndex.replaceAnalyzerSettings(custom, mine);

Remove Custom Analyzer

UpdateIndex updateIndex = new UpdateIndex("someIndex");
// removes the analyzer field with name myCustomOne if it exists
updateIndex.removeAnalyzerSettingsByName(List.of("myCustomOne"));
zuliaWorkPool.updateIndex(updateIndex);

Add/Change Warmed Searches

UpdateIndex updateIndex = new UpdateIndex("someIndex");
// if a warmed search with search label custom or mine exists, it will be updated with these settings, otherwise they are added
Search search1 = new Search("someIndex").addQuery(new FilterQuery("the best query")).setSearchLabel("custom");
Search search2 = new Search("someIndex").addQuery(new FilterQuery("the worst query")).setSearchLabel("mine");
updateIndex.mergeWarmingSearches(search1, search2);

Replace Warmed Searches

UpdateIndex updateIndex = new UpdateIndex("someIndex");
// replaces all warmed searches with the given warmed searches
Search search1 = new Search("someIndex").addQuery(new FilterQuery("some stuff")).setSearchLabel("the best label");
Search search2 = new Search("someIndex").addQuery(new FilterQuery("more stuff")).setSearchLabel("the good label");
updateIndex.replaceWarmingSearches(search1, search2);

Remove Warmed Search

UpdateIndex updateIndex = new UpdateIndex("someIndex");
// removes the warmed search with search label myCustomOne if it exists
updateIndex.removeWarmingSearchesByLabel(List.of("myCustomOne"));
zuliaWorkPool.updateIndex(updateIndex);

Add/Change Metadata

UpdateIndex updateIndex = new UpdateIndex("someIndex");
// replaces key someKey with value 5 and otherKey with value "a string" if they exist, otherwise adds them to the metadata (putAll with new metadata)
updateIndex.mergeMetadata(new Document().append("someKey", 5).append("otherKey", "a string"));
zuliaWorkPool.updateIndex(updateIndex);

Replace Metadata

UpdateIndex updateIndex = new UpdateIndex("someIndex");
// replaces metadata document with the document below
updateIndex.replaceMetadata(new Document().append("stuff", "for free"));
zuliaWorkPool.updateIndex(updateIndex);

Remove Metadata

UpdateIndex updateIndex = new UpdateIndex("someIndex");
// removes the keys below from the metadata object if they exist
updateIndex.removeMetadataByKey(List.of("oneKey", "twoKey", "redKey", "blueKey"));
zuliaWorkPool.updateIndex(updateIndex);

Add/Change Field Mapping

UpdateIndex updateIndex = new UpdateIndex("someIndex");
// if a field mapping with alias test1 or test2 exists, it will be updated with these mappings, otherwise they are added
FieldMapping test1 = new FieldMapping("test1").addMappedFields("field1", "field2");
FieldMapping test2 = new FieldMapping("test2").addMappedFields("field3", "fieldPattern4*").includeSelf();
updateIndex.mergeFieldMapping(test1, test2);

Replace Field Mapping

UpdateIndex updateIndex = new UpdateIndex("someIndex");
// replaces all field mappings with the two field mappings given
FieldMapping special = new FieldMapping("special").addMappedFields("specialist", "specialThings").includeSelf();
FieldMapping custom = new FieldMapping("custom").addMappedFields("custom*");
updateIndex.replaceFieldMapping(special, custom);

Remove Field Mapping

UpdateIndex updateIndex = new UpdateIndex("someIndex");
// removes the field mapping with alias special if it exists
updateIndex.removeFieldMappingByAlias("special");
zuliaWorkPool.updateIndex(updateIndex);

Delete Index

Basic Delete

zuliaWorkPool.deleteIndex("myIndex");

Delete Index and Associated Files

DeleteIndex deleteIndex = new DeleteIndex("myIndex").setDeleteAssociated(true);
zuliaWorkPool.deleteIndex(deleteIndex);

Index Aliases

An alias provides an alternative name for an index. Aliases are transparent — searching, storing, fetching, and deleting all work identically whether you use the index name or an alias. This is useful for zero-downtime reindexing (blue/green deployments) where you can build a new index and swap the alias to point to it.

Create an Alias

// Create an alias pointing to an index
zuliaWorkPool.createIndexAlias("myAlias", "myIndexName");

Use an Alias

// Store documents using the alias
Store store = new Store("myid123", "myAlias");
store.setResultDocument(ResultDocBuilder.newBuilder().setDocument(document));
zuliaWorkPool.store(store);

// Search using the alias — returns the same results as searching by index name
Search search = new Search("myAlias");
SearchResult searchResult = zuliaWorkPool.search(search);

// Fetch, getNumberOfDocs, getFields, getTerms all work with aliases
FetchResult fetch = zuliaWorkPool.fetch(new Fetch("myid123", "myAlias"));
GetNumberOfDocsResult count = zuliaWorkPool.getNumberOfDocs("myAlias");

Update an Alias

Calling createIndexAlias with an existing alias name updates it to point to the new index:

// Repoint the alias to a different index
zuliaWorkPool.createIndexAlias("myAlias", "myNewIndexName");
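The alias swap above is what enables the zero-downtime reindexing mentioned earlier. A sketch, with illustrative index names:

```java
// 1. Build the replacement index under a new name
ClientIndexConfig v2Config = new ClientIndexConfig().setIndexName("myIndexV2").addDefaultSearchField("title");
// ... add field configs ...
zuliaWorkPool.createIndex(new CreateIndex(v2Config));

// 2. Index all documents into myIndexV2 while clients keep querying through the alias

// 3. Repoint the alias; clients see the new index on their next request
zuliaWorkPool.createIndexAlias("myAlias", "myIndexV2");

// 4. Delete the old index once nothing references it
zuliaWorkPool.deleteIndex("myIndexV1");
```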

Delete an Alias

zuliaWorkPool.deleteIndexAlias("myAlias");

List Aliases

Aliases are included in the nodes response:

GetNodesResult nodes = zuliaWorkPool.getNodes();
List<IndexAlias> aliases = nodes.getIndexAliases();

Note: getIndexes() returns actual index names only, not aliases. getIndexConfig() called with an alias returns the configuration of the underlying index.

Storing / Indexing Documents

Zulia supports indexing and storing from object annotations. For more info see the section on Object Persistence.

Result Document Storage

Simple Store

Document document = new Document();
document.put("id", "myid222");
document.put("title", "Magic Java Beans");
document.put("issn", "4321-4321");

Store store = new Store("myid222", "myIndexName").setResultDocument(document);
zuliaWorkPool.store(store);

Simple Store Json

String json = """
        {
          "documentId": "someId",
          "docType": "pdf",
          "docAuthor": "Java Developer Zone",
          "docTitle": "Elastic Search Blog",
          "isParent": false,
          "parentDocId": 1,
          "docLanguage": [
            "en",
            "czech"
          ]
        }""";

Store store = new Store("someId", "myIndexName").setResultDocument(json);
zuliaWorkPool.store(store);

Store with Metadata

Document document = new Document();
document.put("id", "myid222");
document.put("title", "Magic Java Beans");
document.put("issn", "4321-4321");

Store store = new Store("myid222", "myIndexName");

ResultDocBuilder resultDocumentBuilder = new ResultDocBuilder().setDocument(document);
//optional metadata document
resultDocumentBuilder.setMetadata(new Document().append("test1", "val1").append("test2", "val2"));
store.setResultDocument(resultDocumentBuilder);

zuliaWorkPool.store(store);

Storing Associated Documents

AssociatedBuilder associatedBuilder = new AssociatedBuilder();
associatedBuilder.setFilename("myfile2.txt");
// either set as text
associatedBuilder.setDocument("Some Text3");
// or as bytes
associatedBuilder.setDocument(new byte[]{0, 1, 2, 3});
associatedBuilder.setMetadata(new Document().append("mydata", "myvalue2").append("sometypeinfo", "text file2"));

//can be part of the same store request as the document
Store store = new Store("myid123", "someIndex");

//multiple associated documents can be added at once
store.addAssociatedDocument(associatedBuilder);

zuliaWorkPool.store(store);

Storing Large Associated Documents (Streaming)

StoreLargeAssociated storeLargeAssociated = new StoreLargeAssociated("myid333", "myIndexName", "myfilename", new File("/tmp/myFile"));
zuliaWorkPool.storeLargeAssociated(storeLargeAssociated);

Fetching Documents

Fetch Document

FetchDocument fetchDocument = new FetchDocument("myid222", "myIndex");

FetchResult fetchResult = zuliaWorkPool.fetch(fetchDocument);

if (fetchResult.hasResultDocument()) {
    Document document = fetchResult.getDocument();

    //Get optional Meta
    Document meta = fetchResult.getMeta();
}

Fetch All Associated

FetchAllAssociated fetchAssociated = new FetchAllAssociated("myid123", "myIndexName");

FetchResult fetchResult = zuliaWorkPool.fetch(fetchAssociated);

if (fetchResult.hasResultDocument()) {
    Document object = fetchResult.getDocument();

    //Get optional metadata
    Document meta = fetchResult.getMeta();
}

for (AssociatedResult ad : fetchResult.getAssociatedDocuments()) {
    //use correct function for document type
    String text = ad.getDocumentAsUtf8();
    // OR
    byte[] documentAsBytes = ad.getDocumentAsBytes();

    //get optional metadata
    Document meta = ad.getMeta();

    String filename = ad.getFilename();

}

Fetch Associated

FetchAssociated fetchAssociated = new FetchAssociated("myid123", "myIndexName", "myfile2");

FetchResult fetchResult = zuliaWorkPool.fetch(fetchAssociated);


AssociatedResult ad = fetchResult.getFirstAssociatedDocument();
//use correct function for document type
String text = ad.getDocumentAsUtf8();
// OR
byte[] documentAsBytes = ad.getDocumentAsBytes();

//get optional metadata
Document meta = ad.getMeta();

String filename = ad.getFilename();

Fetch Large Associated (Streaming)

FetchLargeAssociated fetchLargeAssociated = new FetchLargeAssociated("myid333", "myIndexName", "myfilename", new File("/tmp/myFetchedFile"));
zuliaWorkPool.fetchLargeAssociated(fetchLargeAssociated);

Document Field Filtering

By default, fetching and searching return the full stored document. You can control which fields are returned using a whitelist (include only) or blacklist (exclude) approach. This reduces bandwidth and improves performance when you only need specific fields.

Whitelist Fields on Fetch

// Return only the title and author fields
Fetch fetch = new Fetch("myid222", "myIndexName");
fetch.addDocumentField("title");
fetch.addDocumentField("author");
FetchResult fetchResult = zuliaWorkPool.fetch(fetch);

Mask Fields on Fetch

// Return all fields except the large content field
Fetch fetch = new Fetch("myid222", "myIndexName");
fetch.addDocumentMaskedField("content");
fetch.addDocumentMaskedField("internalNotes");
FetchResult fetchResult = zuliaWorkPool.fetch(fetch);

Whitelist and Mask Fields on Search

// Return only specific fields in search results
Search search = new Search("myIndexName").setAmount(10);
search.addQuery(new ScoredQuery("title:special"));
search.addDocumentFields("title", "author", "date");
SearchResult searchResult = zuliaWorkPool.search(search);

// Or mask specific fields from search results
Search search2 = new Search("myIndexName").setAmount(10);
search2.addQuery(new ScoredQuery("title:special"));
search2.addDocumentMaskedField("largeTextField");
SearchResult searchResult2 = zuliaWorkPool.search(search2);

Nested fields are supported with dot notation (e.g. addDocumentField("user.name")).

Via REST, use the fl parameter with a - prefix for masking:

GET /query?index=myIndex&q=*:*&fl=title&fl=author
GET /query?index=myIndex&q=*:*&fl=-content&fl=-internalNotes

Batch Fetch

Fetch multiple documents in a single gRPC call with streaming responses:

// Batch fetch by unique IDs
BatchFetch batchFetch = new BatchFetch();
batchFetch.addFetchDocumentsFromUniqueIds(List.of("id1", "id2", "id3"), "myIndexName");
BatchFetchResult batchResult = zuliaWorkPool.batchFetch(batchFetch);

// Process results as a list
List<FetchResult> results = batchResult.getFetchResults();

// Or stream results for memory efficiency
batchResult.getFetchResults(fetchResult -> {
    Document doc = fetchResult.getDocument();
    // process each document
});

Batch fetch from search results:

SearchResult searchResult = zuliaWorkPool.search(search);
BatchFetch batchFetch = new BatchFetch().addFetchDocumentsFromResults(searchResult);
BatchFetchResult batchResult = zuliaWorkPool.batchFetch(batchFetch);

Querying

Simple Query with only ids returned

Search search = new Search("myIndexName").setAmount(10);
search.addQuery(new ScoredQuery("issn:1234-1234 AND title:special"));
search.setResultFetchType(ZuliaQuery.FetchType.NONE); // just return the score and unique id

SearchResult searchResult = zuliaWorkPool.search(search);

long totalHits = searchResult.getTotalHits();

System.out.println("Found <" + totalHits + "> hits");
for (CompleteResult completeResult : searchResult.getCompleteResults()) {
    System.out.println("Matching document <" + completeResult.getUniqueId() + "> with score <" + completeResult.getScore() + ">");
}

Simple Query with full documents returned

Search search = new Search("myIndexName").setAmount(10);
search.addQuery(new ScoredQuery("issn:1234-1234 AND title:special"));
search.setResultFetchType(ZuliaQuery.FetchType.FULL); //return the full bson document that was stored

SearchResult searchResult = zuliaWorkPool.search(search);

long totalHits = searchResult.getTotalHits();

System.out.println("Found <" + totalHits + "> hits");
for (Document document : searchResult.getDocuments()) {
    System.out.println("Matching document <" + document + ">");
}

Caching

// make sure this search stays in the query cache until the index is changed or zulia is restarted
Search search = new Search("myIndexName").setAmount(10);
search.addQuery(new ScoredQuery("issn:1234-1234 AND title:special"));
search.setPinToCache(true);

// Alternatively, force the search to not be cached. Searches that return more results than shardQueryCacheMaxAmount are not cached regardless
search.setDontCache(true);

Realtime

By default, queries run against the last committed index state. Enabling realtime triggers an NRT (Near Real-Time) refresh on each shard before searching, making recently stored but uncommitted documents visible without forcing a disk commit.

Search search = new Search("myIndexName").setAmount(10);
search.addQuery(new ScoredQuery("title:latest"));
search.setRealtime(true);

SearchResult searchResult = zuliaWorkPool.search(search);

Realtime mode is also supported on fetch, get number of docs, get fields, and get terms:

FetchDocument fetchDocument = new FetchDocument("myid222", "myIndex");
fetchDocument.setRealtime(true);
FetchResult fetchResult = zuliaWorkPool.fetch(fetchDocument);

Notes:

  • Realtime does not force a commit to disk. It refreshes the Lucene reader to include buffered (uncommitted) documents.
  • There is a small overhead per request to check for and open new index segments. Use selectively when visibility of recent writes is important rather than on every query.
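The fetch example above generalizes to the other operations. As a sketch, assuming GetTerms exposes the same setRealtime(boolean) setter shown for FetchDocument (verify against the client javadoc), a realtime terms listing might look like:

```java
// sketch — assumes GetTerms has the same setRealtime(boolean) setter
// shown above for FetchDocument
GetTerms getTerms = new GetTerms("myIndexName", "title");
getTerms.setRealtime(true); // include terms from buffered (uncommitted) documents
GetTermsResult getTermsResult = zuliaWorkPool.getTerms(getTerms);
for (ZuliaBase.Term term : getTermsResult.getTerms()) {
    System.out.println(term.getValue() + ": " + term.getDocFreq());
}
```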

Concurrency

Concurrency controls the number of virtual threads used for parallel Lucene segment search and facet aggregation within each shard. Higher values can reduce latency for queries over large shards at the cost of additional CPU.

Concurrency is resolved in the following order:

  1. Per-query setting (if set, takes priority)
  2. Index-level defaultConcurrency (see Index Config Details)
  3. Node-level defaultConcurrency (from server zulia.yaml)
  4. Falls back to 1 (single-threaded)

Search search = new Search("myIndexName").setAmount(100);
search.addQuery(new ScoredQuery("title:gene"));
search.setConcurrency(4); // use 4 virtual threads per shard

SearchResult searchResult = zuliaWorkPool.search(search);

Notes:

  • For facet/stat aggregation, the effective concurrency is automatically capped to at most 1 thread per 10,000 matching documents and cannot exceed the number of Lucene segments.
  • Concurrency is independent from requestFactor and minShardRequest, which control how many results are requested from each shard during federation.
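For contrast, a sketch showing both kinds of knobs side by side. The setRequestFactor and setMinShardRequest setter names are assumed from the settings described above; check the Search javadoc for the exact signatures:

```java
// concurrency vs. federation tuning (federation setter names assumed; see javadoc)
Search search = new Search("myIndexName").setAmount(100);
search.addQuery(new ScoredQuery("title:gene"));
search.setConcurrency(4);      // parallel segment search within each shard
search.setRequestFactor(2.0);  // request amount * 2.0 results from each shard
search.setMinShardRequest(20); // but never fewer than 20 per shard
```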

Debug

Debug mode logs the parsed Lucene query and the rewritten (optimized) query for each shard on the server. This is useful for understanding how Zulia translates your query string, troubleshooting unexpected results, or diagnosing performance issues.

Search search = new Search("myIndexName").setAmount(10);
search.addQuery(new ScoredQuery("title:gene"));
search.setDebug(true);

SearchResult searchResult = zuliaWorkPool.search(search);

When debug is enabled, the server logs two lines per shard:

Lucene Query for index myIndexName:s0: +title:gene
Rewritten Query for index myIndexName:s0: +title:gene

The rewritten query shows the result after Lucene’s query optimization (e.g. wildcard expansion, synonym expansion, multi-term rewrites).

Note: Debug output appears only in the server logs, not in the query response. The client receives the normal search result.

Search Labels

Search labels are optional string identifiers attached to a query. They appear in the server logs alongside the query, making it easy to identify and track specific queries in production.

Search search = new Search("myIndexName").setAmount(10);
search.addQuery(new ScoredQuery("title:gene"));
search.setSearchLabel("gene-title-lookup");

SearchResult searchResult = zuliaWorkPool.search(search);

Server log output:

Running id 42 with label gene-title-lookup query { ... }
Finished query id 42 with label gene-title-lookup with result size 12.50KB in 23ms

Search labels are also required for warming searches, where they must be unique and serve as the identifier for adding, updating, and removing warming searches.

Search Multiple Indexes

Search search = new Search("myIndexName", "myOtherIndex").setAmount(10);
search.addQuery(new ScoredQuery("issn:1234-1234 AND title:special"));


SearchResult searchResult = zuliaWorkPool.search(search);

long totalHits = searchResult.getTotalHits();

System.out.println("Found <" + totalHits + "> hits");
for (CompleteResult completeResult : searchResult.getCompleteResults()) {
    Document doc = completeResult.getDocument();
    System.out.println("Matching document <" + completeResult.getUniqueId() + "> with score <" + completeResult.getScore() + "> from index <" + completeResult.getIndexName() + ">");
    System.out.println(" full document <" + doc + ">");
}

Sorting

Search search = new Search("myIndexName").setAmount(100);
search.addQuery(new FilterQuery("title:(brown AND bear)"));
// can add multiple sorts with ascending or descending (default ascending)
// can also specify whether missing values are returned first or last (default missing first)
search.addSort(new Sort("year").descending());
search.addSort(new Sort("journal").ascending().missingLast());
SearchResult searchResult = zuliaWorkPool.search(search);

Internal Fields

Zulia exposes several internal fields that can be used in queries and sorting. These are accessible via ZuliaConstants (which extends ZuliaFieldConstants).

Constant                         Field Value   Description
ZuliaConstants.ID_FIELD          zuliaId       The unique document identifier. Queryable as a term field and sortable.
ZuliaConstants.SCORE_FIELD       zuliaScore    The query relevance score. Useful as a tie-breaker sort.
ZuliaConstants.TIMESTAMP_FIELD   _ztsf_        Automatically set when a document is stored or updated. Range-queryable as a date field.

Querying by Document ID

The zuliaId field is indexed as a keyword and can be queried with term queries:

// Find specific documents by their unique IDs
search.addQuery(new TermQuery("zuliaId").addTerms("doc1", "doc2", "doc3"));

Sorting by Internal Fields

// Sort by year, then use relevance score as a tie-breaker for documents in the same year
search.addSort(new Sort("year").descending());
search.addSort(new Sort(ZuliaConstants.SCORE_FIELD).descending());

// Sort by document ID
search.addSort(new Sort(ZuliaConstants.ID_FIELD));

Timestamp Range Queries

The timestamp field is automatically maintained and can be used to find recently updated documents:

// Find documents updated in a date range
search.addQuery(new FilterQuery(ZuliaConstants.TIMESTAMP_FIELD + ":[2024-01-01 TO 2024-12-31]"));

Length-Based Sorting

Zulia supports sorting by string character length and list element count using special syntax:

// Sort by character length of the "title" field
search.addSort(new Sort("|title|"));

// Sort by number of elements in the "authors" list field
search.addSort(new Sort("|||authors|||").descending());

Note: |field| only works on string fields. |||field||| works on any list field.

Query Fields

Query fields set the search fields used when a term is not qualified with a field. If query fields are not set on the query and a term is not qualified, the default search fields configured on the index are used.

Query Fields Given

Search search = new Search("myIndexName").setAmount(100);

// search for lung in title,abstract AND cancer in title,abstract AND treatment in title
search.addQuery(new ScoredQuery("lung cancer title:treatment").addQueryFields("title", "abstract").setDefaultOperator(ZuliaQuery.Query.Operator.AND));

Default Query Fields

// search for lung in default index fields OR cancer in default index fields
// OR is the default operator unless set
Search search = new Search("myIndexName").setAmount(100);
search.addQuery(new ScoredQuery("lung cancer"));

Wildcard Query Fields

Search search = new Search("myIndexName").setAmount(100);

// search for lung in any field starting with title and abstract AND cancer in any field starting with title and abstract
// can also use title*:someTerm in a query, see Query Syntax Documentation
search.addQuery(new ScoredQuery("lung cancer").addQueryFields("title*", "abstract").setDefaultOperator(ZuliaQuery.Query.Operator.AND));

Highlighting

Search search = new Search("myIndexName").setAmount(100);
search.addQuery(new ScoredQuery("lung cancer").addQueryFields("title").setDefaultOperator(ZuliaQuery.Query.Operator.AND));

//can optionally set pre and post tags for the highlight and set the number of fragments on the Highlight object
search.addHighlight(new Highlight("title"));

SearchResult searchResult = zuliaWorkPool.search(search);

for (CompleteResult completeResult : searchResult.getCompleteResults()) {
    Document document = completeResult.getDocument();
    List<String> titleHighlightsForDoc = completeResult.getHighlightsForField("title");
}

Filter Queries

Filter queries are the same as scored queries except they do not require the search engine to compute a score. They should be used in cases where a sort is being applied and a score is not needed or when a filter should not influence the relevance score. Filter queries and scored queries can be combined together.

Search search = new Search("myIndexName").setAmount(100);
// include only years 2020 forward
search.addQuery(new FilterQuery("year:[2020 TO *]"));
// require both terms to be matched in either the title or abstract
search.addQuery(new FilterQuery("cheetah cub").setDefaultOperator(Operator.AND).addQueryFields("title", "abstract"));
// require two out of the three terms in the abstract
search.addQuery(new FilterQuery("sleep play run").setMinShouldMatch(2).addQueryField("abstract"));
// exclude the journal nature
search.addQuery(new FilterQuery("journal:Nature").exclude());
SearchResult searchResult = zuliaWorkPool.search(search);

Excluding Queries

FilterQuery, TermQuery, and NumericSetQuery all support .exclude() to negate the query (equivalent to Lucene’s MUST_NOT). When all queries in a search are excluded, you must add a MatchAllQuery to provide the base document set to exclude from:

Search search = new Search("myIndexName").setAmount(100);
// exclude status:draft AND exclude year 2019 — MatchAllQuery provides the base set
search.addQuery(new FilterQuery("status:draft").exclude());
search.addQuery(new NumericSetQuery("year").addValues(2019).exclude());
search.addQuery(new MatchAllQuery());
SearchResult searchResult = zuliaWorkPool.search(search);

All three query types also support .include() to toggle back to positive matching after .exclude() has been called.
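As a small illustration, the same query object can be flipped between the two modes:

```java
// start excluded, then toggle back to a positive match
FilterQuery draftFilter = new FilterQuery("status:draft").exclude();
draftFilter.include(); // now matches status:draft instead of excluding it

Search search = new Search("myIndexName").setAmount(100);
search.addQuery(draftFilter);
```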

Query Helpers

FilterFactory for numerics


search = new Search("myIndexName");
// Search for pub years in range [2015, 2020]
search.addQuery(FilterFactory.rangeInt("pubYear").setRange(2015, 2020));

search = new Search("myIndexName");
// Search for pubs for any year before 2020
search.addQuery(FilterFactory.rangeInt("pubYear").setMaxValue(2020).setEndpointBehavior(RangeBehavior.EXCLUSIVE));

Values for tokens

String query;

// (a OR b)
query = Values.any().of("a", "b").asString();

" slow cat" AND "pink shirt"
query = Values.all().valueHandlerChain(List.of(String::toLowerCase, Values.VALUE_QUOTER)).of("slow cat", "Pink Shirt").asString();

// ("slow cat" OR "Pink Shirt")
Function<String, String> quoteAndTrim = s -> Values.VALUE_QUOTER.apply(s).trim(); // Values.VALUE_QUOTER is default value handler
query = Values.all().valueHandler(quoteAndTrim).of("   slow cat   ", "   Pink Shirt ").asString();

// title,abstract:(a OR b)
query = Values.any().of("a", "b").withFields("title", "abstract").asString();

// -title,abstract:(a OR b OR c)
query = Values.any().of("a", "b", "c").withFields("title", "abstract").exclude().asString();

// title,abstract:(\"fast dog\" OR b OR c)~2
query = Values.atLeast(2).of("fast dog", "b", "c").withFields("title", "abstract").asString();



// -title,abstract:(a OR b OR c)~2
query = Values.atLeast(2).of("a", "b", "c").withFields("title", "abstract").exclude().asString();

FilterQuery fq;
// fq = new FilterQuery("\"fast dog\" b c").setDefaultOperator(ZuliaQuery.Query.Operator.OR).exclude().addQueryFields("title", "abstract").setMinShouldMatch(2)
fq = Values.atLeast(2).of("fast dog", "b", "c").withFields("title", "abstract").exclude().asFilterQuery();

ScoredQuery sq;
// sq = new ScoredQuery("\"slow cat\" b c").setDefaultOperator(ZuliaQuery.Query.Operator.OR).addQueryFields("title", "abstract").setMinShouldMatch(2);
sq = Values.atLeast(2).of("slow cat", "b", "c").withFields("title", "abstract").asScoredQuery();


Term Queries

Optimized search for many terms. The given terms are not analyzed, so they must exactly match what is stored in the search engine. This is most useful for fields such as ids that are indexed with KEYWORD (not analyzed) or lightly analyzed with something like LC_KEYWORD (lowercase keyword).

Search search = new Search("myIndexName").setAmount(100);

// search for the terms 1,2,3,4 in the field id
search.addQuery(new TermQuery("id").addTerms("1", "2", "3", "4"));

SearchResult searchResult = zuliaWorkPool.search(search);

Use .exclude() to match documents that do not contain any of the given terms:

// Exclude documents where status is "deleted" or "archived"
Search search = new Search("myIndexName").setAmount(100);
search.addQuery(new TermQuery("status").addTerms("deleted", "archived").exclude());

// Combine with a positive query: documents with tag "news" but NOT id "1" or "2"
search.addQuery(new TermQuery("tag").addTerm("news"));
search.addQuery(new TermQuery("id").addTerms("1", "2").exclude());

Score Functions

Score functions modify the relevance score using mathematical expressions that can reference zuliaScore (the baseline text relevance score) and any sortable numeric or date field. See [[Query-Syntax#Score-Functions]] for the full list of available functions.

// Boost results by a popularity field
Search search = new Search("myIndexName").setAmount(100);
search.addQuery(new ScoredQuery("title:cats").setScoreFunction("zuliaScore * popularity"));
SearchResult searchResult = zuliaWorkPool.search(search);
// Incorporate PageRank with logarithmic dampening
Search search = new Search("myIndexName").setAmount(100);
search.addQuery(new ScoredQuery("title:cats").setScoreFunction("zuliaScore * (1 + ln(pageRank))"));
// Complex expression combining multiple fields
Search search = new Search("myIndexName").setAmount(100);
search.addQuery(new ScoredQuery("title:cats").setScoreFunction("zuliaScore * (sqrt(popularity) + pageRank)"));
// Score function with a SHOULD query (boosts matching docs without requiring a match)
Search search = new Search("myIndexName").setAmount(100);
search.addQuery(new ScoredQuery("*:*")); // match all
search.addQuery(new ScoredQuery("title:cats", false).setScoreFunction("zuliaScore * popularity"));

Match All Query

MatchAllQuery matches every document in the index with a score of 1.0. It is a convenience wrapper around ScoredQuery with a null query. This is useful for returning all documents with sorting, faceting, or when combined with a score function to rank by a field value.

Search search = new Search("myIndexName").setAmount(100);
search.addQuery(new MatchAllQuery());
search.addSort(new Sort("year").descending());
SearchResult searchResult = zuliaWorkPool.search(search);
// Rank all documents by a field value using a score function
Search search = new Search("myIndexName").setAmount(100);
search.addQuery(new MatchAllQuery().setScoreFunction("popularity"));
SearchResult searchResult = zuliaWorkPool.search(search);

Numeric Set Queries

Optimized search for many numeric terms.

Search search = new Search("myIndexName").setAmount(100);
//search for values 1, 5, 7, 9 in the field intField
search.addQuery(new NumericSetQuery("intField").addValues(1, 5, 7, 9));

Use .exclude() to match documents that do not have any of the given numeric values:

// Exclude documents where year is 2020 or 2021
Search search = new Search("myIndexName").setAmount(100);
search.addQuery(new NumericSetQuery("year").addValues(2020, 2021).exclude());

NumericSetQuery supports int, long, float, and double values through type-specific addValues overloads.
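For example, passing double or long literals selects the corresponding overload (the field names here are illustrative):

```java
// double values
Search search = new Search("myIndexName").setAmount(100);
search.addQuery(new NumericSetQuery("price").addValues(9.99, 19.99, 29.99));

// long values
search.addQuery(new NumericSetQuery("citationCount").addValues(10L, 100L, 1000L));
```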

Vector Queries

Vector Indexing and Basic Queries

// create an index with add field config
ClientIndexConfig indexConfig = new ClientIndexConfig();

// call createVector or createUnitVector depending on whether the vector is unit normalized
indexConfig.addFieldConfig(FieldConfigBuilder.createUnitVector("v").index());
// ...
indexConfig.setIndexName("vectorTestIndex");
// alternatively, use updateIndex with mergeFieldConfig to add a vector field to an existing index
zuliaWorkPool.createIndex(indexConfig);


// store some documents with a vector field
Document mongoDocument = new Document();
float[] vector = new float[]{ 0, 0, 0.70710678f, 0.70710678f };
mongoDocument.put("v", Floats.asList(vector));
Store s = new Store("someId", "vectorTestIndex").setResultDocument(mongoDocument);
zuliaWorkPool.store(s);

Search search = new Search("vectorTestIndex").setAmount(100);
// returns the top 3 documents closest to [1.0,0,0,0] in the field v
search.addQuery(new VectorTopNQuery(new float[] { 1.0f, 0.0f, 0.0f, 0.0f }, 3, "v"));

SearchResult searchResult = zuliaWorkPool.search(search);

Pre Filters with Vector Queries

Search search = new Search("vectorTestIndex").setAmount(100);
// filters for blue in the description then returns the top 3 documents closest to [1.0,0,0,0] in the field v
StandardQuery descriptionQuery = new FilterQuery("blue").addQueryField("description");
search.addQuery(new VectorTopNQuery(new float[] { 1.0f, 0.0f, 0.0f, 0.0f }, 3, "v").addPreFilterQuery(descriptionQuery));

Post Filters with Vector Queries

Search search = new Search("vectorTestIndex").setAmount(100);
// returns the top 3 documents closest to [1.0,0,0,0] in the field v, then filters for red in the description (possibly fewer than 3 results now)
search.addQuery(new VectorTopNQuery(new float[] { 1.0f, 0.0f, 0.0f, 0.0f }, 3, "v"));
search.addQuery(new FilterQuery("red").addQueryField("description"));

Count Facets

// Set the amount to 0 (or omit setAmount) unless you also want the documents returned along with the facet counts
// normally combined with a FilterQuery or ScoredQuery to count facets over a set of results
Search search = new Search("myIndexName").setAmount(0);

search.addCountFacet(new CountFacet("issn").setTopN(20));

SearchResult searchResult = zuliaWorkPool.search(search);
for (ZuliaQuery.FacetCount fc : searchResult.getFacetCounts("issn")) {
    System.out.println("Facet <" + fc.getFacet() + "> with count <" + fc.getCount() + ">");
}

Numeric Stat

// show number of values, number of documents, min, max, and sum for field pubYear
// normally combined with a FilterQuery or ScoredQuery to compute stats over a set of results
Search search = new Search("myIndexName").setAmount(100);
search.addStat(new NumericStat("pubYear"));
SearchResult searchResult = zuliaWorkPool.search(search);
ZuliaQuery.FacetStats pyFieldStat = searchResult.getNumericFieldStat("pubYear");
System.out.println(pyFieldStat.getMin()); // minimum value for the field
System.out.println(pyFieldStat.getMax()); // maximum value for the field
System.out.println(pyFieldStat.getSum()); // sum of the values for the field, use one of the counts below for the average/mean
System.out.println(pyFieldStat.getDocCount()); // count of documents with the field not null
System.out.println(pyFieldStat.getAllDocCount()); // count of documents matched by the query
System.out.println(pyFieldStat.getValueCount()); // count of total number of values in the field (equal to document count except for multivalued fields)

Numeric Stat with Percentiles

List<Double> percentiles = List.of(
	0.0,  // 0th percentile (min) - can be retrieved without percentiles
	0.25, // 25th percentile
	0.50, // median
	0.75, // 75th percentile
	1.0   // 100th percentile (max) - can be retrieved without percentiles
);

Search search = new Search("myIndexName");
// Get the requested percentiles within 1% of their true value
search.addStat(new NumericStat("pubYear").setPercentiles(percentiles).setPercentilePrecision(0.01));
SearchResult searchResult = zuliaWorkPool.search(search);
for (ZuliaQuery.Percentile percentile : searchResult.getNumericFieldStat("pubYear").getPercentilesList()) {
	System.out.println(percentile.getPoint() + " -> " + percentile.getValue());
}

Stat Facet

// return the highest sum on author count for each journal name
Search search = new Search("myIndexName").setAmount(100);
search.addStat(new StatFacet("authorCount", "journalName"));
SearchResult searchResult = zuliaWorkPool.search(search);

// journals ordered by the sum of author count
List<ZuliaQuery.FacetStats> authorCountForJournalName = searchResult.getFacetFieldStat("authorCount", "journalName");
for (ZuliaQuery.FacetStats journalStats : authorCountForJournalName) {
    System.out.println(journalStats.getFacet()); // the journal
    System.out.println(journalStats.getMin()); // minimum value of author count for journal
    System.out.println(journalStats.getMax()); // maximum value of author count for journal
    System.out.println(journalStats.getSum()); // sum of the values of author count for journal, use counts below for average/mean
    System.out.println(journalStats.getDocCount()); // count of documents for the journal where the author count not null
    System.out.println(journalStats.getAllDocCount()); // count of documents for the journal
    System.out.println(journalStats.getValueCount()); // count of total number of values of author count for the journal (equal to document count except for multivalued fields)
}

Stat Facet Percentiles

//get the 25th percentile, median, and 75th percentile of author count for the journal names
Search search = new Search("myIndexName").setAmount(100);
search.addStat(new StatFacet("authorCount", "journalName").setPercentiles(List.of(0.25, 0.5, 0.75)).setPercentilePrecision(0.01));
SearchResult searchResult = zuliaWorkPool.search(search);

// journals ordered by the sum of author count
List<ZuliaQuery.FacetStats> authorCountForJournalName = searchResult.getFacetFieldStat("authorCount", "journalName");
for (ZuliaQuery.FacetStats journalStats : authorCountForJournalName) {
    for (ZuliaQuery.Percentile percentile : journalStats.getPercentilesList()) {
        System.out.println(percentile.getPoint() + " -> " + percentile.getValue());
    }
    // journalStats also will have facet, min, max, sum, and counts as other example
}

Drilling Down Facets

Search search = new Search("myIndexName").setAmount(100);
search.addFacetDrillDown("issn", "1111-1111");
SearchResult searchResult = zuliaWorkPool.search(search);

Hierarchical Facets

Hierarchical faceting enables counting and drill-down across multi-level taxonomies. String fields use / as the path delimiter (e.g. "USA/California/LA"), while date fields automatically split into year / month / day levels.

Configuring Hierarchical Facets

// String field — values stored with "/" delimiter are split into hierarchy levels
indexConfig.addFieldConfig(FieldConfigBuilder.createString("path").indexAs(DefaultAnalyzers.LC_KEYWORD).facetHierarchical().sort());

// Date field — automatically creates a year/month/day hierarchy
indexConfig.addFieldConfig(FieldConfigBuilder.createDate("publishDate").index().facetHierarchical().sort());

// Hierarchical facet with a custom facet name
indexConfig.addFieldConfig(FieldConfigBuilder.createString("category").indexAs(DefaultAnalyzers.LC_KEYWORD).facetAsHierarchical("categoryFacet"));

Storing Hierarchical Data

For string fields, store the full path using / as the delimiter:

Document mongoDocument = new Document();
mongoDocument.put("path", "USA/California/LA");
// Indexed as three levels: "USA" → "California" → "LA"

Date fields require no special formatting — dates are automatically split into year, month, and day:

mongoDocument.put("publishDate", Date.from(LocalDate.of(2024, 3, 15).atStartOfDay(ZoneId.of("UTC")).toInstant()));
// Indexed as: "2024" → "3" → "15"

Querying Hierarchical Facets

Use path parameters on CountFacet to query at different levels of the hierarchy:

Search search = new Search("myIndexName").setAmount(0);

// Top level — returns counts for the first path component (e.g. "USA": 500, "UK": 200)
search.addCountFacet(new CountFacet("path"));

// Second level — returns children of "USA" (e.g. "California": 300, "Texas": 100)
search.addCountFacet(new CountFacet("path", "USA"));

// Third level — returns children of "USA/California" (e.g. "LA": 150, "SF": 80)
search.addCountFacet(new CountFacet("path", "USA", "California"));

SearchResult searchResult = zuliaWorkPool.search(search);

// Retrieve results for each level
for (ZuliaQuery.FacetCount fc : searchResult.getFacetCounts("path")) {
    System.out.println(fc.getFacet() + ": " + fc.getCount()); // top level
}
for (ZuliaQuery.FacetCount fc : searchResult.getFacetCountsForPath("path", "USA")) {
    System.out.println(fc.getFacet() + ": " + fc.getCount()); // second level
}
for (ZuliaQuery.FacetCount fc : searchResult.getFacetCountsForPath("path", "USA", "California")) {
    System.out.println(fc.getFacet() + ": " + fc.getCount()); // third level
}

Date Hierarchical Facets

Search search = new Search("myIndexName").setAmount(0);

// Top level — returns year counts (e.g. "2023": 1000, "2024": 500)
search.addCountFacet(new CountFacet("publishDate"));

// Second level — returns month counts within a year (e.g. "1": 80, "2": 95, ...)
search.addCountFacet(new CountFacet("publishDate", "2024"));

// Third level — returns day counts within a year and month
search.addCountFacet(new CountFacet("publishDate", "2024", "3"));

SearchResult searchResult = zuliaWorkPool.search(search);
for (ZuliaQuery.FacetCount fc : searchResult.getFacetCountsForPath("publishDate", "2024")) {
    System.out.println("Month " + fc.getFacet() + ": " + fc.getCount());
}

Hierarchical Drill-Down

Use DrillDown with path components to filter documents at a specific level. The first argument to addValue is the top-level component, and subsequent arguments are deeper components:

Search search = new Search("myIndexName").setAmount(100);

// Drill down to all documents under "USA" (matches USA/California/LA, USA/Texas, etc.)
search.addFacetDrillDown(new DrillDown("path").addValue("USA"));

// Drill down to "USA/California" (matches USA/California, USA/California/LA, etc.)
search.addFacetDrillDown(new DrillDown("path").addValue("USA", "California"));

// Drill down to exact path "USA/California/LA"
search.addFacetDrillDown(new DrillDown("path").addValue("USA", "California", "LA"));

// Multiple values with OR (documents matching either path)
search.addFacetDrillDown(new DrillDown("path").any().addValue("USA", "California").addValue("USA", "Texas"));

// Multiple values with AND (documents matching both paths — useful for multi-valued fields)
search.addFacetDrillDown(new DrillDown("path").all().addValue("USA", "California").addValue("USA", "Texas"));

SearchResult searchResult = zuliaWorkPool.search(search);

Hierarchical Stat Facets

StatFacet also supports path parameters for hierarchical fields:

Search search = new Search("myIndexName").setAmount(0);

// Stats for the top level of the path hierarchy
search.addStat(new StatFacet("rating", "path"));

// Stats for children of "USA" in the path hierarchy
search.addStat(new StatFacet("rating", "path", "USA"));

SearchResult searchResult = zuliaWorkPool.search(search);
List<ZuliaQuery.FacetStats> statsByPath = searchResult.getFacetFieldStat("rating", "path");
for (ZuliaQuery.FacetStats fs : statsByPath) {
    System.out.println(fs.getFacet() + " — min: " + fs.getMin() + ", max: " + fs.getMax() + ", sum: " + fs.getSum());
}

Pagination

Zulia supports two pagination strategies: offset-based (using start) and cursor-based (using lastResult).

Offset Pagination

Offset pagination skips the first N results. It is simple but becomes slower with deeper pages because all skipped documents must still be processed internally.

Search search = new Search("myIndexName").setAmount(10);
search.addQuery(new ScoredQuery("title:gene"));
search.setStart(20); // skip the first 20 results, return results 21-30

SearchResult searchResult = zuliaWorkPool.search(search);

Use offset pagination for small result sets or when you only need the first few pages.

Cursor Pagination

Cursor pagination uses Lucene’s keyset (search-after) mechanism. Each response includes a LastResult containing per-shard sort values that tell the next query exactly where to resume. This makes every page equally fast regardless of depth.

Requirements:

  • A sort must be specified. This is especially important on a changing index where documents may be added or updated between pages. The sort field(s) should form a unique value or unique combination (e.g. id alone, or title,id). Without uniqueness, documents with identical sort values may be skipped or duplicated across pages.

Search search = new Search("myIndexName");
search.setAmount(100);
search.addQuery(new ScoredQuery("issn:1234-1234 AND title:special"));

// sort is required for cursor pagination
// use a unique field or add id as a tiebreaker
search.addSort(new Sort("id"));

SearchResult firstResult = zuliaWorkPool.search(search);

// pass the last result as the cursor for the next page
search.setLastResult(firstResult);

SearchResult secondResult = zuliaWorkPool.search(search);

Getting all results with a cursor

The searchAll* convenience methods handle cursor iteration automatically. Set amount to the desired page size.

Search search = new Search("myIndexName");
search.setAmount(100); // page size
search.addQuery(new ScoredQuery("issn:1234-1234 AND title:special"));
search.addSort(new Sort("id"));

// option 1 - requires fetch type full (default)
zuliaWorkPool.searchAllAsDocument(search, document -> {
    // do something with mongo bson document
});

// option 2 - when the score is needed, when searching multiple indexes and the index name is needed, or when fetch type is NONE/META
zuliaWorkPool.searchAllAsScoredResult(search, scoredResult -> {
    System.out.println(scoredResult.getUniqueId() + " has score " + scoredResult.getScore() + " for index " + scoredResult.getIndexName());
    // if result fetch type is full (default)
    Document document = ResultHelper.getDocumentFromScoredResult(scoredResult);
});

// option 3 - each page is returned as a search result, giving access to total hits
zuliaWorkPool.searchAll(search, searchResult -> {
    System.out.println("There are " + searchResult.getTotalHits());

    // option 3a - requires fetch type full (default)
    for (Document document : searchResult.getDocuments()) {

    }

    // option 3b - when the score is needed, when searching multiple indexes and the index name is needed, or when fetch type is NONE/META
    for (CompleteResult result : searchResult.getCompleteResults()) {
        System.out.println("Result for <" + result.getIndexName() + "> with score <" + result.getScore() + ">");
        // if fetch type is FULL
        Document document = result.getDocument();
    }
});

When to use which

                     Offset (setStart)          Cursor (setLastResult)
Sort required        No                         Yes (unique or unique combination)
Deep pagination      Slower with depth          Constant time per page
Random page access   Yes (jump to any page)     No (must iterate sequentially)
Best for             Small results, few pages   Large exports, streaming all results

Deleting

Delete From Index

//Deletes the document from the index but not any associated documents
DeleteFromIndex deleteFromIndex = new DeleteFromIndex("myid111", "myIndexName");
zuliaWorkPool.delete(deleteFromIndex);

Delete Completely

//Deletes the result document, the index document, and all associated documents for the given id
DeleteFull deleteFull = new DeleteFull("myid123", "myIndexName");
zuliaWorkPool.delete(deleteFull);

Delete Single Associated

//Removes a single associated document with the unique id and filename given
DeleteAssociated deleteAssociated = new DeleteAssociated("myid123", "myIndexName", "myfile2");
zuliaWorkPool.delete(deleteAssociated);

Delete All Associated

DeleteAllAssociated deleteAllAssociated = new DeleteAllAssociated("myid123", "myIndexName");
zuliaWorkPool.delete(deleteAllAssociated);

Batch Delete

Delete multiple documents in a single gRPC call:

BatchDelete batchDelete = new BatchDelete();
batchDelete.addDelete(new DeleteFromIndex("id1", "myIndexName"));
batchDelete.addDelete(new DeleteFromIndex("id2", "myIndexName"));
batchDelete.addDelete(new DeleteFromIndex("id3", "myIndexName"));
zuliaWorkPool.batchDelete(batchDelete);

Delete all documents from a search result:

SearchResult searchResult = zuliaWorkPool.search(search);
BatchDelete batchDelete = new BatchDelete().deleteDocumentFromQueryResult(searchResult);
zuliaWorkPool.batchDelete(batchDelete);

Other Operations

Clear Index

Removes all documents from an index while keeping the index structure and configuration:

zuliaWorkPool.clearIndex("myIndexName");

Optimize Index

Merges Lucene index segments into fewer, larger segments for improved search performance. Best used on indexes that are no longer being actively written to:

zuliaWorkPool.optimizeIndex("myIndexName");

Reindex

Re-reads all stored documents and re-indexes them with the current schema. Useful after changing field analyzers, adding sort or facet configurations, or other schema changes:

// Update schema first
UpdateIndex updateIndex = new UpdateIndex("myIndexName");
updateIndex.mergeFieldConfig(FieldConfigBuilder.createString("title").indexAs(DefaultAnalyzers.STANDARD).sort());
zuliaWorkPool.updateIndex(updateIndex);

// Then reindex to apply the new schema to existing documents
zuliaWorkPool.reindex("myIndexName");

All three operations are also available via CLI:

zuliaadmin clearIndex --index myIndexName
zuliaadmin optimizeIndex --index myIndexName
zuliaadmin reindex --index myIndexName

Get Current Document Count for Index

GetNumberOfDocsResult result = zuliaWorkPool.getNumberOfDocs("myIndexName");
System.out.println(result.getNumberOfDocs());

Get Fields for Index

GetFieldsResult result = zuliaWorkPool.getFields(new GetFields("myIndexName"));
System.out.println(result.getFieldNames());

Get Terms for Field

GetTermsResult getTermsResult = zuliaWorkPool.getTerms(new GetTerms("myIndexName", "title"));
for (ZuliaBase.Term term : getTermsResult.getTerms()) {
    System.out.println(term.getValue() + ": " + term.getDocFreq());
}

Get Cluster Nodes

GetNodesResult getNodesResult = zuliaWorkPool.getNodes();
for (Node node : getNodesResult.getNodes()) {
    System.out.println(node);
}

Async API

Every Function has a Corresponding Async Version

Executor executor = Executors.newCachedThreadPool();

Search search = new Search("myIndexName").setAmount(10);

ListenableFuture<SearchResult> resultFuture = zuliaWorkPool.searchAsync(search);

Futures.addCallback(resultFuture, new FutureCallback<>() {
    @Override
    public void onSuccess(SearchResult result) {
        //runs on the supplied executor when the search completes
        System.out.println("Found " + result.getTotalHits() + " total hits");
    }

    @Override
    public void onFailure(Throwable t) {
        //handle connection or query failures here
        t.printStackTrace();
    }
}, executor);
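Because ListenableFuture extends java.util.concurrent.Future, the result can also be awaited directly when blocking the calling thread is acceptable:

SearchResult searchResult = resultFuture.get(); //blocks until the search completes or throws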

Object Persistence / Mapping

Annotated Object Example

@Settings(indexName = "wikipedia", numberOfShards = 16, shardCommitInterval = 6000)
public class Article {

	public Article() {

	}

	@UniqueId
	private String id;

	@Indexed(analyzerName = DefaultAnalyzers.STANDARD)
	private String title;

	@Indexed
	private Integer namespace;

	@DefaultSearch
	@Indexed(analyzerName = DefaultAnalyzers.STANDARD)
	private String text;

	private Long revision;

	@Indexed
	private Integer userId;

	@Indexed(analyzerName = DefaultAnalyzers.STANDARD)
	private String user;

	@Indexed
	private Date revisionDate;

	//Getters and Setters
	//....
}

Creating Index for Annotated Class Example

Mapper<Article> mapper = new Mapper<>(Article.class);
zuliaWorkPool.createIndex(mapper.createOrUpdateIndex());

Storing an Object with Mapper

Article article = new Article();
//...
Store store = mapper.createStore(article);
zuliaWorkPool.store(store);

Querying with Mapper

Search search = new Search("wikipedia").setAmount(10);
search.addQuery(new ScoredQuery("title:technology"));

SearchResult searchResult = zuliaWorkPool.search(search);
List<Article> articles = searchResult.getMappedDocuments(mapper);
