Search & query

Last updated 3 years ago

Out of the box, Quilt provides support for queries in the ElasticSearch DSL, as well as SQL queries in Athena (details forthcoming).

ElasticSearch

The objects in S3 buckets connected to Quilt are synchronized to an ElasticSearch cluster, which powers Quilt's search features. For custom queries, you can use the Queries tab in the Quilt catalog to query the ElasticSearch cluster directly.

Quilt uses ElasticSearch 6.7.

Indexing

Quilt maintains a near-realtime index of the objects in your S3 bucket in ElasticSearch. Each bucket corresponds to one or more ElasticSearch indexes. As objects are mutated in S3, Quilt uses an event-driven system (via SNS and SQS) to update ElasticSearch.

There are two types of indexing in Quilt:

  • shallow indexing includes object metadata (such as the file name and size)

  • deep indexing includes object contents. Quilt supports deep indexing for the following file extensions:

    • .fcs (FlowJo)

    • .ipynb (Jupyter notebooks)

    • .parquet

    • .pdf

    • .html, .txt, .tsv, .csv, .md (plus many other plain-text formats)

    • .xls, .xlsx
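To make the distinction between the two indexing types concrete, here is a minimal Python sketch that classifies an S3 object key by its extension. It is not Quilt's implementation: the helper `indexing_type` and the set `DEEP_INDEXED_EXTENSIONS` are hypothetical names, the set contains only the extensions named in the list above (other plain-text formats are omitted), and case-insensitive matching is an assumption.

```python
from pathlib import PurePosixPath

# Only the extensions named in the list above. The list also mentions
# "many other plain-text formats", so this set is illustrative, not exhaustive.
DEEP_INDEXED_EXTENSIONS = {
    ".fcs", ".ipynb", ".parquet", ".pdf",
    ".html", ".txt", ".tsv", ".csv", ".md",
    ".xls", ".xlsx",
}


def indexing_type(key: str) -> str:
    """Return "deep" if the key's extension is in the set above, else "shallow".

    Hypothetical helper for illustration; matching the extension
    case-insensitively is an assumption, not documented Quilt behavior.
    """
    ext = PurePosixPath(key).suffix.lower()
    return "deep" if ext in DEEP_INDEXED_EXTENSIONS else "shallow"


for example_key in ["notebooks/analysis.ipynb", "images/cell.png", "README"]:
    print(example_key, "->", indexing_type(example_key))
```

Under these assumptions, an object that is only shallow-indexed would be searchable by metadata such as its name and size, but not by its contents.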

Queries

Quilt ElasticSearch queries support the following keys:

  • index: comma-separated list of indexes to search

  • filter_path: filters the response to reduce nesting

  • _source: boolean that adds or removes the _source field, or a list of fields to return

  • size: limits the number of hits

  • from: starting offset for pagination

  • body: the search query body as a JSON dictionary

Saved queries

You can offer pre-canned queries to your users by placing a configuration file at s3://YOUR_BUCKET/.quilt/queries/config.yaml:

version: "1"
queries:
  query-1:
    name: My first query
    description: Optional description
    url: s3://BUCKET/.quilt/queries/query-1.json
  query-2:
    name: Second query
    url: s3://BUCKET/.quilt/queries/query-2.json

The Quilt catalog displays your saved queries in a drop-down for your users to select, edit, and execute.
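To illustrate how the query keys described under Queries fit together, here is a minimal Python sketch that assembles a query document and serializes it to JSON. It assumes the JSON uses those keys as top-level fields. The index name `example-bucket`, the returned field names `key` and `size`, and the helper `build_query` are hypothetical and not taken from Quilt.

```python
import json


def build_query(index, text, size=10, offset=0):
    """Assemble a query document from the keys listed under Queries.

    Hypothetical helper for illustration only.
    """
    return {
        "index": index,                        # comma-separated list of indexes
        "filter_path": "hits.hits._source",    # trim the response to the hit sources
        "_source": ["key", "size"],            # assumed field names to return
        "size": size,                          # maximum number of hits
        "from": offset,                        # starting offset for pagination
        "body": {                              # ElasticSearch DSL query body
            "query": {"query_string": {"query": text}}
        },
    }


# "example-bucket" is a placeholder index name.
query = build_query("example-bucket", "parquet", size=5)
query_json = json.dumps(query, indent=2)
print(query_json)
```

The printed JSON is the kind of document you could try in the Queries tab. Adjust the index name and the body to match your own bucket and the fields in its index.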
