Frequently Asked Questions

Last updated 1 month ago

How do I sync my notebook and all of its data and models to S3 as a package?

import quilt3

p = quilt3.Package()
p.set_dir(".", ".")  # add everything under the current directory to the package root
p.push("USR/PKG", message="MSG", registry="s3://BUCKET")

Use a .quiltignore file for more control over which files set_dir() includes.

How does Quilt versioning relate to S3 object versioning?

Quilt packages are one level of abstraction above S3 object versions. Object versions track mutations to a single file, whereas a Quilt package references a collection of files and assigns this collection a unique version.

It is strongly recommended that you enable object versioning on the S3 buckets that you push Quilt packages to. Object versioning ensures that mutations to every object are tracked, and provides some protection against deletion.
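As a minimal sketch of that recommendation, the helper below turns on object versioning for a bucket if it is not already enabled. The function name `ensure_versioning` is hypothetical; it assumes a client object that behaves like boto3.client("s3") and uses that client's get_bucket_versioning and put_bucket_versioning calls.

```python
def ensure_versioning(s3_client, bucket):
    """Enable S3 object versioning on `bucket` unless it is already enabled.

    `s3_client` is expected to behave like boto3.client("s3").
    Returns True if this call changed the bucket's configuration.
    """
    # get_bucket_versioning omits "Status" when versioning was never configured
    status = s3_client.get_bucket_versioning(Bucket=bucket).get("Status")
    if status == "Enabled":
        return False
    s3_client.put_bucket_versioning(
        Bucket=bucket,
        VersioningConfiguration={"Status": "Enabled"},
    )
    return True


# Usage (requires boto3 and AWS credentials; "BUCKET" is a placeholder):
#   import boto3
#   ensure_versioning(boto3.client("s3"), "BUCKET")
```

Checking the current status first keeps the helper safe to run repeatedly, for example at the start of a packaging job.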

Where are the Quilt 2 packages?

Visit legacy.quiltdata.com and use quilt on PyPI.

Does quilt3 collect anonymous usage statistics?

Yes, to find bugs and prioritize features.

You can disable anonymous usage collection with an environment variable:

export QUILT_DISABLE_USAGE_METRICS=true

Or call quilt3.disable_telemetry() to persistently disable anonymous usage statistics.

Can I turn off TQDM progress bars for log files?

Yes:

export QUILT_MINIMIZE_STDOUT=true
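For scripted or scheduled jobs, the same two environment variables can be set from Python when launching quilt3 as a child process. This is a sketch only; the helper name `quiet_quilt_env` is hypothetical, and the variable names are the ones given above.

```python
import os


def quiet_quilt_env(base=None):
    """Return a copy of an environment mapping with both variables set."""
    env = dict(os.environ if base is None else base)
    env["QUILT_DISABLE_USAGE_METRICS"] = "true"  # no anonymous usage statistics
    env["QUILT_MINIMIZE_STDOUT"] = "true"        # no TQDM progress bars
    return env


# Usage: pass the mapping to a child process that runs quilt3, for example
#   subprocess.run(["quilt3", "--version"], env=quiet_quilt_env(), check=True)
```

Returning a copy leaves the parent process's environment untouched, so interactive sessions keep their progress bars.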

Which version of Quilt are you on?

Python client

quilt3 --version
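From inside a Python session, the installed client version can also be read from the package metadata. The sketch below uses only the standard library (Python 3.8 or later); the helper name `installed_version` is hypothetical.

```python
from importlib import metadata


def installed_version(dist="quilt3"):
    """Return the installed version string of `dist`, or None if it is absent."""
    try:
        return metadata.version(dist)
    except metadata.PackageNotFoundError:
        return None


print(installed_version())
```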

CloudFormation application

  1. Go to CloudFormation > Stacks > YourQuiltStack > Outputs

  2. Copy the row labeled TemplateBuildMetadata

  3. "git_revision" is your template version

This information is also available in the footer of the main page of the Catalog.

Hashing during push takes a long time. Can I speed it up?

Yes. Follow these steps:

  1. Run your compute in the same region as your S3 bucket (as opposed to a local machine or foreign region)—I/O is much faster.

  2. Use a larger instance with more vCPUs.

  3. Increase QUILT_TRANSFER_MAX_CONCURRENCY above its default to match your available vCPUs.

  4. If you are using Quilt Catalog 1.51 (released Feb 2024), you can enable the ChunkedChecksums CloudFormation parameter so that checksums are calculated in parallel, or reused if they already exist in S3. Parallel checksums are also available by default in quilt3 v6 or later (pre-released Feb 2024).
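One way to apply the concurrency step from a Python script is sketched below. It assumes that quilt3 reads QUILT_TRANSFER_MAX_CONCURRENCY when it is imported, so the variable is set first; the helper name `transfer_concurrency` and the `minimum=10` floor are illustrative assumptions, not documented values.

```python
import os


def transfer_concurrency(vcpus=None, minimum=10):
    """Pick a value for QUILT_TRANSFER_MAX_CONCURRENCY from the vCPU count."""
    if vcpus is None:
        vcpus = os.cpu_count() or 1  # cpu_count() can return None
    # `minimum` is an assumed floor; adjust it to your client's actual default
    return max(minimum, vcpus)


# Set the variable before quilt3 is imported so the new value is picked up.
os.environ["QUILT_TRANSFER_MAX_CONCURRENCY"] = str(transfer_concurrency())
```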

Does Quilt work with R?

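In the scientific computing community, the R Project is commonly used as an alternative, or companion, to Python. It is a language and environment for statistical computing and graphics, and is available as Free Software under the GNU General Public License.

Currently there are no plans to release a Quilt package for distribution through the CRAN package repository. However, you can still use Quilt with R, using either:

  1. The Command Line Interface (CLI) API

  2. Reticulate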

Using the Quilt CLI API with R

You can script the Quilt CLI directly from your shell environment and chain it with your R scripts to create a unified workflow:

quilt3 install USR/PKG --registry s3://BUCKET --dest path/to/local-dir  # download the Quilt data package
# [Run R commands or scripts here to modify the data in path/to/local-dir]
quilt3 push USR/PKG --dir path/to/local-dir --registry s3://BUCKET  # upload the modified package to the remote registry

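Using Quilt with Reticulate

The Reticulate package provides a set of tools for interoperability between Python and R by embedding a Python session within your R session.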

How do I delete a data package and all of the objects in the data package?

You may have a test data package that you wish to delete at some point to keep your data repository clean and organized. Please do this very carefully! In favor of immutability, Quilt makes deletion a bit tricky. First, note that quilt3.delete_package only deletes the package manifest, not the underlying objects. If you wish to delete the entire package and its objects, delete the objects first.

Warning: the objects you delete will be lost forever. Ditto for the package revision.

To delete, first browse the package then walk it, deleting its entry objects as follows:

import boto3
import quilt3 as q3

s3 = boto3.client("s3")

reg = "s3://quilt-bio-staging"
pname = "akarve/delete-object"
# Load the package manifest without downloading the data
p = q3.Package.browse(pname, registry=reg)

# Permanently delete the exact object version behind each package entry
for (k, e) in p.walk():
    pk = e.physical_key
    s3.delete_object(Bucket=pk.bucket, Key=pk.path, VersionId=pk.version_id)

You can then follow the above with q3.delete_package(pname, registry=reg, top_hash=p.top_hash).
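Because deletion is permanent, it can help to review what would be removed before calling delete_object. The following is a dry-run sketch, not part of quilt3: `plan_deletions` is a hypothetical helper that builds the keyword arguments for each call, and the stand-in entries exist only so the example runs without AWS access.

```python
from types import SimpleNamespace


def plan_deletions(entries):
    """Turn (logical_key, entry) pairs into keyword arguments for delete_object.

    Each entry needs a `physical_key` with `bucket`, `path`, and `version_id`
    attributes, as used in the loop above.
    """
    plan = []
    for logical_key, entry in entries:
        pk = entry.physical_key
        kwargs = {"Bucket": pk.bucket, "Key": pk.path}
        if pk.version_id is not None:
            # Only target a specific version when the manifest recorded one
            kwargs["VersionId"] = pk.version_id
        plan.append((logical_key, kwargs))
    return plan


# Stand-in entries so the sketch runs without AWS; with quilt3, pass p.walk().
fake = [("a.csv", SimpleNamespace(physical_key=SimpleNamespace(
    bucket="BUCKET", path="data/a.csv", version_id="v1")))]
for logical_key, kwargs in plan_deletions(fake):
    print(logical_key, kwargs)
```

Once the printed plan looks right, each `kwargs` dictionary can be passed on as s3.delete_object(**kwargs).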

Do I have to login via quilt3 to use the Quilt APIs?

How do I push to Quilt from a headless environment like a Docker container?

Configure AWS CLI credentials and quilt3 will use the same for its API calls.

Be sure to run quilt3 logout if you've previously logged in.

Select among multiple profiles in your shell as follows:

export AWS_PROFILE=your_profile

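The S3 permissions needed by quilt3 are similar to this bucket policy, but quilt3 does not need either s3:GetBucketNotification or s3:PutBucketNotification.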

How complex can my Athena queries be?

Amazon Athena supports a subset of Data Definition Language (DDL) and Data Manipulation Language (DML) statements, functions, operators, and data types, based on Presto and Trino. This allows for extremely granular querying of your data package names, metadata, and contents, and includes logical operators, comparison functions, conditional expressions, mathematical functions, bitwise functions, date and time functions and operators, regular expression functions, and aggregate functions. Please review the references linked below to learn more.

Helpful examples

regexp_extract_all(string, pattern)

Return the substring(s) matched by the regular expression pattern in string

SELECT regexp_extract_all('1a 2b 14m', '\d+');
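To prototype a pattern locally before running it in Athena, Python's standard re module offers a close analogue for this simple case. This is a sketch for comparison only; regular expression dialects differ, so more complex patterns may not behave identically in both engines.

```python
import re

# Same pattern and input string as the Athena query above
matches = re.findall(r"\d+", "1a 2b 14m")
print(matches)  # ['1', '2', '14']
```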

Considerations and limitations

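There are many considerations and limitations (https://docs.aws.amazon.com/athena/latest/ug/other-notable-limitations.html) when writing Amazon Athena queries.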

References

  • SQL reference for Amazon Athena

  • Functions in Amazon Athena

Are there any limitations on characters in Quilt filenames?

Yes. Quilt is built on top of Amazon S3 and has the same character limitations. Although any UTF-8 character is supported in an object key name (filename), using certain characters can result in problems with some applications and protocols. The following guidelines will help you maximize compliance. For a comprehensive list of safe characters, characters that might require special handling, and characters to avoid, please review the official Amazon S3 documentation linked below.

List of safe characters

  • Alphanumeric characters:

    • 0-9

    • a-z

    • A-Z

  • Special characters:

    • Exclamation point (!)

    • Hyphen (-)

    • Underscore (_)

    • Period (.)

    • Asterisk (*)

    • Single quote (')

    • Open parenthesis (()

    • Close parenthesis ())

For more details, see Creating object key names in the Amazon S3 documentation.
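The list of safe characters can be turned into a quick pre-flight check before files are added to a package. This is a minimal sketch: `is_safe_key` is a hypothetical helper, and treating "/" purely as a separator between path segments is an assumption made here, not something stated in the list.

```python
import re

# Alphanumerics plus the special characters listed above: ! - _ . * ' ( )
_SAFE_SEGMENT = re.compile(r"[0-9A-Za-z!\-_.*'()]+")


def is_safe_key(key):
    """Return True if every "/"-separated segment uses only safe characters."""
    segments = key.split("/")
    # Empty segments (leading, trailing, or doubled "/") are reported as unsafe
    return all(_SAFE_SEGMENT.fullmatch(s) for s in segments)


for candidate in ["data/run_01/results(1).csv", "data/my file.csv"]:
    print(candidate, is_safe_key(candidate))
```

Keys that fail the check are not necessarily invalid in S3; they only fall outside the conservative set and may need special handling.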

How many IPs does a standard Quilt stack require?

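Currently, a full-size, multi-Availability Zone deployment (without Voila) requires at least 256 IPs. This means a minimum CIDR block of /24. Optional additional features (such as automated data packaging) require additional IPs.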
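The relationship between an IP count and a CIDR block size can be checked with the standard library. In this sketch, `max_prefix_length` is a hypothetical helper and 10.0.0.0/24 is an example address range only; the calculation counts every address in the block and is useful when estimating how much larger the block must be for optional features.

```python
import ipaddress
import math


def max_prefix_length(required_ips):
    """Largest IPv4 prefix length whose block holds at least `required_ips`."""
    return 32 - math.ceil(math.log2(required_ips))


network = ipaddress.ip_network("10.0.0.0/24")  # example address range only
print(network.num_addresses)   # 256
print(max_prefix_length(256))  # 24
```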

The "Last Modified" column in the Quilt catalog is empty

Amazon S3 is a key-value store with prefixes but no true "folders". In the Quilt Catalog Bucket view, as in AWS Console, only objects have a "Last modified" value, whereas package entries and prefixes do not.
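The behavior follows from the shape of an S3 listing. The sketch below assumes a response shaped like the one boto3's list_objects_v2 returns when called with Delimiter="/": objects appear under "Contents" with a "LastModified" value, while prefixes appear under "CommonPrefixes" with only a "Prefix". The helper name `listing_rows` and the sample data are hypothetical.

```python
from datetime import datetime, timezone


def listing_rows(response):
    """Build (name, last_modified) rows from a delimiter-based S3 listing."""
    # Prefixes carry no timestamp, so their "Last modified" cell stays empty
    rows = [(p["Prefix"], None) for p in response.get("CommonPrefixes", [])]
    rows += [(o["Key"], o["LastModified"]) for o in response.get("Contents", [])]
    return rows


# Hand-written sample in the assumed response shape
sample = {
    "CommonPrefixes": [{"Prefix": "raw/"}],
    "Contents": [{"Key": "README.md",
                  "LastModified": datetime(2024, 2, 1, tzinfo=timezone.utc)}],
}
for name, last_modified in listing_rows(sample):
    print(name, last_modified or "(empty)")
```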
