Package org.apache.lucene.document.column
Column-oriented batch indexing API.
IndexWriter.addBatch(org.apache.lucene.document.column.ColumnBatch)
accepts a ColumnBatch: a fixed number of documents
presented field-by-field rather than document-by-document. Each field is a Column that exposes its values via cursor iterators rather
than concrete IndexableField instances per document.
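The difference between the two layouts can be illustrated with a minimal, self-contained sketch. It does not use the Lucene API: `ColumnarLayoutSketch`, `SketchColumn`, and `pivot` are hypothetical names invented for this illustration, and the real `ColumnBatch` and `Column` types may be shaped quite differently. The sketch only shows what "field-by-field rather than document-by-document" means for the data.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustration only: hypothetical types, not the Lucene ColumnBatch / Column API. */
final class ColumnarLayoutSketch {

  /** Stand-in for a column: parallel lists of batch doc IDs and their values. */
  static final class SketchColumn {
    final List<Integer> docIds = new ArrayList<>();
    final List<Object> values = new ArrayList<>();
  }

  /**
   * Pivots row-oriented documents (field name to value) into one column per field.
   * The batch doc ID is the document's position in the input list.
   */
  static Map<String, SketchColumn> pivot(List<Map<String, Object>> docs) {
    Map<String, SketchColumn> columns = new LinkedHashMap<>();
    for (int docId = 0; docId < docs.size(); docId++) {
      for (Map.Entry<String, Object> field : docs.get(docId).entrySet()) {
        SketchColumn column =
            columns.computeIfAbsent(field.getKey(), name -> new SketchColumn());
        // Doc IDs are appended in increasing order because docs are visited in order.
        column.docIds.add(docId);
        column.values.add(field.getValue());
      }
    }
    return columns;
  }
}
```

In this sketch a field that is absent from some documents simply has fewer entries in its column, which is the situation the sparse case of a column has to represent.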
Column subtypes
- LongColumn: single- or multi-valued long values for NUMERIC / SORTED_NUMERIC doc values, 1-D numeric points (int / long / float / double), and stored numeric fields.
- BinaryColumn: variable-length binary values for BINARY, SORTED, and SORTED_SET doc values, term inversion, multi-dimensional or arbitrary-width points, and stored binary or string fields.
- DictionaryColumn: pre-defined term dictionary plus per-doc ordinals for SORTED and SORTED_SET doc values, term inversion, and stored binary or string fields.
- VectorColumn: KNN vectors (FLOAT32 or BYTE encoding); vector-only field type.
- TokenStreamColumn: caller-supplied TokenStreams for term inversion (the columnar analogue of a custom token stream on a Field); inverted-index-only field type.
Cursors
A Column declares its Column.Density (DENSE or SPARSE) and exposes its values via
cursors:
- A tuple cursor (e.g. LongTupleCursor, ObjectTupleCursor) yields (batchDocID, value) pairs in non-decreasing doc-id order. Always available.
- A bulk values cursor (e.g. LongValuesCursor) feeds dense data directly into the underlying writer. Required when Column.density() is DENSE and consulted only in that case.
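The tuple-cursor contract described above can be sketched without depending on Lucene. Everything in the block below is an assumption made for illustration: `SketchLongTupleCursor`, its `nextDoc()` / `longValue()` methods, the `NO_MORE_DOCS` sentinel, and the `coversEveryDoc` helper are hypothetical and are not the signatures of LongTupleCursor or Column.density(). The sketch shows pairs arriving in non-decreasing doc-id order (a repeated doc ID stands for a multi-valued document) and one plausible way to decide whether every document in the batch has a value.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustration only: hypothetical stand-ins, not the Lucene cursor interfaces. */
final class TupleCursorSketch {

  /** Sentinel returned once the cursor is exhausted (not a Lucene constant). */
  static final int NO_MORE_DOCS = -1;

  /** Stand-in for a tuple cursor over long values. */
  interface SketchLongTupleCursor {
    /** Advances to the next pair and returns its batch doc ID, or NO_MORE_DOCS. */
    int nextDoc();

    /** Value of the current pair; valid only after nextDoc() returned a doc ID. */
    long longValue();
  }

  /** Wraps parallel arrays; rejects doc IDs that decrease. */
  static SketchLongTupleCursor cursor(int[] docIds, long[] values) {
    if (docIds.length != values.length) {
      throw new IllegalArgumentException("docIds and values must have equal length");
    }
    for (int i = 1; i < docIds.length; i++) {
      if (docIds[i] < docIds[i - 1]) {
        throw new IllegalArgumentException("doc IDs must be non-decreasing");
      }
    }
    return new SketchLongTupleCursor() {
      private int pos = -1;

      @Override
      public int nextDoc() {
        if (pos + 1 >= docIds.length) {
          pos = docIds.length; // stay exhausted on repeated calls
          return NO_MORE_DOCS;
        }
        pos++;
        return docIds[pos];
      }

      @Override
      public long longValue() {
        return values[pos];
      }
    };
  }

  /** Drains a cursor into "doc=value" strings, in cursor order. */
  static List<String> drain(SketchLongTupleCursor cursor) {
    List<String> out = new ArrayList<>();
    for (int doc = cursor.nextDoc(); doc != NO_MORE_DOCS; doc = cursor.nextDoc()) {
      out.add(doc + "=" + cursor.longValue());
    }
    return out;
  }

  /** True if every doc ID in [0, docCount) occurs at least once in the sorted docIds. */
  static boolean coversEveryDoc(int[] docIds, int docCount) {
    int expected = 0;
    for (int doc : docIds) {
      if (doc == expected) {
        expected++;
      } else if (doc > expected) {
        return false; // gap: some document has no value
      }
      // doc < expected: repeated doc ID (multi-valued), nothing to check
    }
    return expected == docCount;
  }
}
```

In this sketch, a caller that finds `coversEveryDoc` true could hand the values array over in one step instead of pulling pairs one at a time, which is the role the bulk values cursor plays for dense columns.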
Each call that requests a cursor returns a fresh cursor positioned at the first value, so a column can be consumed more than once: first in the row-oriented pass for stored fields and term inversion, and again in the column-oriented pass for doc values, points, and vectors.
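The "fresh cursor per request" behaviour is what makes two passes over the same column possible. The following sketch illustrates the idea with a plain `java.util.Iterator`; `ReplayableColumnSketch` and its methods are hypothetical names for this illustration, and the two passes merely stand in for the row-oriented and column-oriented passes rather than reproducing what the indexing code does.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

/** Illustration only: a hypothetical column that hands out a new cursor on every request. */
final class ReplayableColumnSketch {
  private final List<Long> values;

  ReplayableColumnSketch(List<Long> values) {
    // Defensive copy so later changes to the caller's list do not affect the column.
    this.values = Collections.unmodifiableList(new ArrayList<>(values));
  }

  /** Each call returns a new iterator positioned at the first value. */
  Iterator<Long> cursor() {
    return values.iterator();
  }

  /** First pass (stand-in for the row-oriented pass): counts the values. */
  static int countPass(ReplayableColumnSketch column) {
    int count = 0;
    for (Iterator<Long> it = column.cursor(); it.hasNext(); it.next()) {
      count++;
    }
    return count;
  }

  /** Second pass (stand-in for the column-oriented pass): sums the values. */
  static long sumPass(ReplayableColumnSketch column) {
    long sum = 0;
    for (Iterator<Long> it = column.cursor(); it.hasNext(); ) {
      sum += it.next();
    }
    return sum;
  }
}
```

Because neither pass shares cursor state with the other, running `countPass` first does not leave `sumPass` with an exhausted cursor. A design that returned one shared cursor would need an explicit reset between passes.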
WARNING: This API is experimental and might change in incompatible ways in the next release.
Class Description
- A Column that provides variable-size binary values via a tuple cursor.
- A values cursor over a dense BinaryColumn.
- A single field's values across multiple documents in a ColumnBatch.
- Whether a column has a value for every document in the batch.
- A column-oriented batch of documents for indexing.
- Lightweight adapter that presents a Column's current cursor value as an IndexableField so it can be fed through the row-oriented indexing pass (stored fields and term inversion).
- Static validation and bounds-checking helpers for the columnar indexing path.
- A Column that provides string or binary values via a pre-defined term dictionary plus per-doc ordinals into that dictionary.
- A Column that provides long values.
- The numeric interpretation of the column's long values.
- A tuple cursor over a LongColumn.
- A values cursor over a dense LongColumn.
- A tuple cursor over a Column whose values are objects.
- A dense values cursor over a DictionaryColumn.
- A tuple cursor over a DictionaryColumn.
- A Column that provides caller-supplied TokenStreams for term inversion.
- VectorColumn<T>: A Column that provides KNN vector values via a tuple cursor.