Getting Data from a Package

Last updated 5 years ago

The examples in this section use the aleksey/hurdat demo package:

import quilt3
p = quilt3.Package.browse('aleksey/hurdat', 's3://quilt-example')
p  # display the package tree
(remote Package)
 └─.gitignore
 └─.quiltignore
 └─notebooks/
   └─QuickStart.ipynb
 └─quilt_summarize.json
 └─requirements.txt
 └─scripts/
   └─build.py

Slicing through a package

Use dict key selection to slice into a package tree:

p["requirements.txt"]
# returns PackageEntry("requirements.txt")

p["notebooks"]
# returns:
# (remote Package)
# └─QuickStart.ipynb

Slicing into a Package directory returns another Package rooted at that subdirectory. Slicing into a package entry returns an individual PackageEntry.
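Chained key selection can be wrapped in a small helper that follows a slash-separated path one key at a time. This is a minimal sketch, not part of quilt3: slice_path is a hypothetical name, and the nested dict below is only a stand-in for a package tree so that the example runs without network access. It assumes nothing beyond the dict-style key selection described above.

```python
from functools import reduce
from operator import getitem


def slice_path(pkg, path):
    """Apply pkg[part] for each "/"-separated part of path, left to right."""
    parts = [part for part in path.split("/") if part]
    return reduce(getitem, parts, pkg)


# Stand-in for a package tree (hypothetical data, not a real quilt3.Package).
tree = {
    "requirements.txt": "entry: requirements.txt",
    "notebooks": {"QuickStart.ipynb": "entry: QuickStart.ipynb"},
}

print(slice_path(tree, "notebooks/QuickStart.ipynb"))
```

With a real package, slice_path(p, "notebooks/QuickStart.ipynb") would perform the same two lookups as p["notebooks"]["QuickStart.ipynb"].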

Downloading package data to disk

To download a package, or a subset of its files, to a destination directory (dest), use fetch:

# download a subfolder
p["notebooks"].fetch()

# download a single file
p["notebooks"]["QuickStart.ipynb"].fetch()

# download everything
p.fetch()

By default, fetch downloads files to the current directory, but you can also specify an alternative path:

p["notebooks"]["QuickStart.ipynb"].fetch("./references/")

Downloading package data into memory

Alternatively, you can download data directly into memory:

p["quilt_summarize.json"]()
# returns a dict

To apply a custom deserializer to your data, pass the deserializer function as an argument when calling the entry. For example, to load a hypothetical yaml file using yaml.safe_load:

p["symbols.yaml"](yaml.safe_load)
# returns a dict

The deserializer should accept the entry's raw contents, as bytes, as its input.
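As an illustration of what such a function can look like, here is a minimal sketch of a deserializer for newline-delimited JSON. It assumes the function is handed the entry's contents as a bytes object. The name load_json_lines and the file records.jsonl are hypothetical and do not exist in the demo package.

```python
import json


def load_json_lines(data: bytes) -> list:
    """Parse newline-delimited JSON from raw bytes into a list of objects."""
    text = data.decode("utf-8")
    # Skip blank lines so a trailing newline does not raise a parse error.
    return [json.loads(line) for line in text.splitlines() if line.strip()]


# Hypothetical usage with a package entry:
# records = p["records.jsonl"](load_json_lines)
```

Because the function takes bytes and returns plain Python objects, it can be tested without a package: load_json_lines(b'{"a": 1}\n{"a": 2}\n') yields a two-element list.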

Getting entry locations

You can get the path to a package entry or directory using get:

p["notebooks"]["QuickStart.ipynb"].get()
# returns /path/to/pkg/root/notebooks/QuickStart.ipynb

p.get()
# returns /path/to/pkg/root/

Getting metadata

Metadata is available using the meta property.

# get entry metadata
p["notebooks"]["QuickStart.ipynb"].meta

# get directory metadata
p["notebooks"].meta

# get package metadata
p.meta