🌎 Adding a New Fetch Module

The most common contribution is adding support for a new data source. Because Fetchez discovers modules dynamically, you do not need to register your module in any central file.
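
To give a feel for why no central registration is needed, here is a minimal sketch of one common discovery pattern: walking the subclasses of a base class and indexing them by name. It is not Fetchez's actual implementation. `FetchModule` here is a stand-in class, and `build_registry` and `OtherData` are hypothetical names used only for illustration.

```python
class FetchModule:
    """Stand-in base class for this sketch (not the real fetchez class)."""

    name = None


class MyData(FetchModule):
    name = "mydata"


class OtherData(FetchModule):
    name = "otherdata"


def build_registry(base):
    """Index every direct or indirect subclass of `base` by its own `name`."""
    registry = {}
    stack = list(base.__subclasses__())
    while stack:
        cls = stack.pop()
        stack.extend(cls.__subclasses__())
        # Only look at attributes defined on the class itself, so a subclass
        # that forgets to set `name` does not shadow its parent's entry.
        own_name = cls.__dict__.get("name")
        if own_name:
            registry[own_name] = cls
    return registry


registry = build_registry(FetchModule)
print(sorted(registry))  # ['mydata', 'otherdata']
```

In a pattern like this, the subclass only has to be imported for it to show up in the registry, which is why dropping a file into the modules directory can be enough.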

  1. Create the Module File: Create a new Python file in src/fetchez/modules/ (e.g., mydata.py).

  2. Inherit from FetchModule: Your class must inherit from FetchModule (imported from fetchez.core in the example below). You must define the meta_* class attributes so the CLI and API can discover and index your module. Be sure to include all the relevant metadata for posterity and discoverability.

    from fetchez.core import FetchModule
    from fetchez import cli
    
    @cli.cli_opts(help_text="Fetch data from MyData Source")
    class MyData(FetchModule):
    
        name = "mydata"
        meta_category = "Topography"
        meta_desc = "Short summary of the dataset (e.g., Global Lidar Synthesis)"
        meta_agency = "Provider Name (e.g., USGS, NOAA)"
        meta_tags = ["lidar", "elevation", "high-res"]
        meta_coverage = "Coverage Area (e.g., CONUS, Global)"
        meta_resolution = "Nominal Resolution (e.g., 1m)"
        meta_license = "License Type (e.g., Public Domain, CC-BY)"
        meta_urls = {
            "home": "https://provider.gov/data",
            "docs": "https://provider.gov/docs",
        }
    
        def __init__(self, **kwargs):
            super().__init__(name='mydata', **kwargs)
            # Initialize your specific headers or API endpoints here
    
        def run(self):
            # 1. Construct the download URL based on self.region
            # 2. Use core.Fetch(url).fetch_req(...) to query the API for download urls
            # 3. Add successful download URLs to the results with `self.add_entry_to_results`
            pass
    
  3. Test It: Run fetchez mydata --help to ensure it loads correctly.
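
Before opening a pull request, it can help to confirm that none of the metadata fields were forgotten. The sketch below is one way to do that, assuming the eight meta_ attributes from the example above are the ones you want to enforce. `REQUIRED_META`, `missing_meta`, and the stand-in `FetchModule` class are hypothetical and not part of Fetchez.

```python
# Attribute names taken from the example module above; whether Fetchez
# itself requires all of them is an assumption of this sketch.
REQUIRED_META = (
    "meta_category",
    "meta_desc",
    "meta_agency",
    "meta_tags",
    "meta_coverage",
    "meta_resolution",
    "meta_license",
    "meta_urls",
)


def missing_meta(cls, required=REQUIRED_META):
    """Return the required meta_ attributes that are absent or empty on `cls`."""
    return [attr for attr in required if not getattr(cls, attr, None)]


class FetchModule:
    """Stand-in base class so the sketch runs without fetchez installed."""


class Complete(FetchModule):
    meta_category = "Topography"
    meta_desc = "Global Lidar Synthesis"
    meta_agency = "USGS"
    meta_tags = ["lidar", "elevation"]
    meta_coverage = "Global"
    meta_resolution = "1m"
    meta_license = "Public Domain"
    meta_urls = {"home": "https://provider.gov/data"}


class Incomplete(FetchModule):
    meta_category = "Topography"


print(missing_meta(Complete))    # []
print(missing_meta(Incomplete))  # every required name except meta_category
```

A check like this could live in a unit test next to your module, with `Complete` replaced by your own class.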

Handling Dependencies & Imports

Fetchez aims to keep its core footprint small. If your new module or plugin requires a non-standard library (e.g., boto3, pyshp, netCDF4):

  1. Do Not Add to Core Requirements: Do NOT add the library to the main dependencies list in pyproject.toml.

  2. Add to Optional Dependencies: Open pyproject.toml and add your library to a relevant group under [project.optional-dependencies]. If no group fits, create a new one (e.g. netcdf = ["netCDF4"]).

  3. Soft Imports: Wrap your imports in a try/except ImportError block so the module does not crash the CLI for users who don’t use that specific data source.

  4. Document It: Clearly list the required packages (and the install command) in the class docstring.

    Example:

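    # src/fetchez/modules/mys3.py
    import logging

    from fetchez import cli
    from fetchez.core import FetchModule

    # Soft import: record whether boto3 is available instead of failing at import time.
    try:
        import boto3

        HAS_BOTO = True
    except ImportError:
        HAS_BOTO = False

    # Standard-library logger; swap in the project's logger if it provides one.
    logger = logging.getLogger(__name__)


    @cli.cli_opts(help_text="Fetch data from AWS")
    class MyS3Fetcher(FetchModule):
        """Fetches data from private S3 buckets.

        **Dependencies:**
        This module requires `boto3`.
        Install via: `pip install "fetchez[aws]"`
        """

        def run(self):
            # Fail gracefully, with an install hint, when the optional dependency is missing.
            if not HAS_BOTO:
                logger.error("Missing dependency 'boto3'. Please run: pip install 'fetchez[aws]'")
                return

            # Proceed with fetching...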
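
If several modules need the same soft-import logic, the try/except can be factored into a small helper. This is only a sketch of that idea: `soft_import` and `install_hint` are hypothetical helpers, not functions provided by Fetchez, and the package name used to simulate a missing dependency is made up.

```python
import importlib


def soft_import(module_name):
    """Return the imported module, or None if it is not installed."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None


def install_hint(package, extra):
    """Build the message shown when an optional dependency is missing."""
    return f"Missing dependency '{package}'. Please run: pip install 'fetchez[{extra}]'"


# `json` ships with Python, so this import always succeeds.
json_mod = soft_import("json")

# A made-up package name, used here to simulate a missing dependency.
missing = soft_import("fetchez_hypothetical_missing_dep")

if missing is None:
    print(install_hint("boto3", "aws"))
```

Returning None rather than raising keeps the CLI importable; the module's run() method can then check for None and log the hint before returning.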