Date: Thu, 9 Mar 2006 16:04:25 +0100
On Thursday 09 March 2006 15:54, Yonik Seeley wrote:
> On 3/9/06, Øyvind Stegard <[email protected]> wrote:
> > - How do many stored fields affect indexing/query performance
> > compared to storing no fields at all (indexed only)?
> Additional stored fields should have no effect on querying (the
> internal information about a field is looked up in a hashmap).
> Additional stored fields that are used have an impact on indexing since
> that data must be copied every time segments are merged.
> Additional stored fields that are not used in most documents (sparse)
> should have very little performance impact on indexing.  The field
> list is walked a few times linearly (in-memory) during a segment
> merge, which should be very fast, but it's still O(n), so don't go
> crazy and have a million stored field types.
> > - Are there any known scalability issues with a large number of distinct
> > fields in an index (not necessarily the same set of fields for every doc)?
> If they are indexed fields, yes.
> Each indexed field has a 1 byte norm *per document*, regardless of whether
> the document contains that field.  In the current version of Lucene,
> there is a way to omit these norms on a per field basis (see
> Field.setOmitNorms()) if you don't need length normalization or
> index-time field boosting.
Thanks for the quick and informative reply! I will look further into the
possibility of omitting norm data from fields (we typically do very
exact searches, and don't make much use of score data yet).
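
For a rough sense of scale, the one-byte-per-document norm cost described
above can be estimated with simple arithmetic. The sketch below is only an
illustration: the class and method names are hypothetical, and the document
and field counts are made-up numbers, not measurements from any real index.

```java
// Back-of-the-envelope estimate of norm data size, assuming 1 byte per
// document for every indexed field that keeps norms (as described above).
// NormCost and normBytes are hypothetical names used only for illustration.
final class NormCost {

    // Total norm bytes = documents * indexed fields with norms enabled.
    static long normBytes(long numDocs, int fieldsWithNorms) {
        if (numDocs < 0 || fieldsWithNorms < 0) {
            throw new IllegalArgumentException("counts must be non-negative");
        }
        return Math.multiplyExact(numDocs, (long) fieldsWithNorms);
    }

    public static void main(String[] args) {
        long numDocs = 1_000_000L;   // made-up document count
        int indexedFields = 100;     // made-up number of distinct indexed fields
        int omitted = 90;            // assumed number of fields with norms omitted

        System.out.println("all norms kept:    "
                + normBytes(numDocs, indexedFields) + " bytes");
        System.out.println("90 fields omitted: "
                + normBytes(numDocs, indexedFields - omitted) + " bytes");
    }
}
```

With these made-up numbers, 100 indexed fields cost 100,000,000 bytes of
norms, and omitting norms on 90 of them leaves 10,000,000 bytes.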

Øyvind Stegard

