image_target_paths, data_manifest = aics_pipeline(1, "../_data/aics")
Loading manifest: 100%|██████████| 77165/77165 [00:01<00:00, 44.1k/s]
download_medmnist (dataset:str, output_dir:str='.', download_only:bool=False, save_images:bool=True)
*Downloads the specified MedMNIST dataset and saves the training, validation, and test datasets into the specified output directory. Images are saved as .png for 2D data and multi-page .tiff for 3D data, organized into folders named after their labels.
Args:
- dataset: The MedMNIST dataset name (e.g., ‘pathmnist’, ‘bloodmnist’).
- output_dir: Path where the images will be saved.
- download_only: If True, only download the dataset, without processing it or saving images.
- save_images: If True, save the images in the specified output directory.

Returns:
- None. Images are saved in the specified output directory if save_images is True.*
| | Type | Default | Details |
|---|---|---|---|
| dataset | str | | The name of the MedMNIST dataset (e.g., ‘pathmnist’, ‘bloodmnist’, etc.). |
| output_dir | str | . | The path to the directory where the datasets will be saved. |
| download_only | bool | False | If True, only download the dataset into the output directory without processing. |
| save_images | bool | True | If True, save the images into the output directory as .png (2D datasets) or multipage .tiff (3D datasets) files. |
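The folder layout described above (one folder per label, .png for 2D images, .tiff for 3D volumes) can be pictured with a minimal sketch. The helper `image_save_path`, the per-split subfolder, and the index-based file names are assumptions made for illustration; the source does not specify how download_medmnist names its files.

```python
from pathlib import Path


def image_save_path(output_dir, split, label, index, is_3d=False):
    """Build a hypothetical target path for one image.

    Assumed layout: <output_dir>/<split>/<label>/<index>.<ext>, where the
    extension is .tiff for 3D volumes and .png for 2D images.
    """
    suffix = ".tiff" if is_3d else ".png"
    return Path(output_dir) / split / str(label) / f"{index}{suffix}"


# A 2D training image with label 3, and a 3D test volume with label 0.
path_2d = image_save_path("data", "train", 3, 7)
path_3d = image_save_path("data", "test", 0, 12, is_3d=True)
```

Grouping files by label folder keeps the output compatible with loaders that infer the class from the parent directory name.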
download_dataset (base_url, expected_checksums, file_names, output_dir, processor=None)
*Download a dataset using Pooch and save it to the specified output directory.
Parameters:
- base_url (str): The base URL from which the files will be downloaded.
- expected_checksums (dict): A dictionary mapping file names to their expected checksums.
- file_names (dict): A dictionary mapping task identifiers to file names.
- output_dir (str): The directory where the downloaded files will be saved.
- processor (callable, optional): A function to process the downloaded data. Defaults to None.*
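To make the role of expected_checksums concrete, here is a minimal sketch of checking downloaded files against a dictionary of expected digests. It is not Pooch's implementation: the helpers `sha256_of` and `verify_downloads` are hypothetical, and SHA-256 is an assumption, since the source does not say which hash algorithm the checksums use.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(output_dir, expected_checksums):
    """Map each file name to True if its digest matches, else False.

    Hypothetical helper: files missing from output_dir are reported as False.
    """
    results = {}
    for file_name, expected in expected_checksums.items():
        path = Path(output_dir) / file_name
        results[file_name] = path.is_file() and sha256_of(path) == expected
    return results
```

Reading in chunks keeps memory use flat for large archives, which matters for multi-gigabyte image datasets.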
Allen Institute for Cell Science (AICS)
aics_pipeline (n_images_to_download=40, image_save_dir=None)
| | ProteinDisplayName | StructureSegmentationAlgorithmVersion | WorkflowId | NucMembSegmentationAlgorithm | CellIndex | Gene | WellId | StructureShortName | NucMembSegmentationAlgorithmVersion | WellName | ... | Clone | Col | StructureDisplayName | DataSetId | ChannelNumber638 | ChannelNumberBrightfield | PlateId | StructEducationName | SourceReadPath | FeatureExplorerURL |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 4131 | Tom20 | 51 | 1 | Matlab nucleus/membrane segmentation | 1 | TOMM20 | 24822 | Mitochondria | 1.3.0 | E6 | ... | 27 | 5 | Mitochondria | 3 | 1 | 6 | 3500001004 | NaN | fovs/6677e50c_3500001004_100X_20170623_5-Scene... | https://cfe.allencell.org/?selectedPoint[0]=18... |
1 rows × 47 columns
Create a CSV manifest of all the files
manifest2csv (paths, data_manifest, signal, target, train_fraction=0.8, data_save_path_train='./train.csv', data_save_path_test='./test.csv')
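The train_fraction and the two save-path arguments suggest a train/test split written to two CSV files. A minimal sketch of that idea follows; it does not reproduce manifest2csv itself. The helpers `split_manifest` and `write_manifest`, the fixed shuffle seed, and the column names are all assumptions made for illustration.

```python
import csv
import random


def split_manifest(rows, train_fraction=0.8, seed=0):
    """Shuffle rows reproducibly and split them into (train, test) lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_train = int(round(train_fraction * len(shuffled)))
    return shuffled[:n_train], shuffled[n_train:]


def write_manifest(rows, save_path, fieldnames):
    """Write a list of dict rows to a CSV file with a header line."""
    with open(save_path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)


# Hypothetical rows: one image path plus signal and target channel indices.
columns = ["path", "signal", "target"]
rows = [{"path": f"img_{i}.tiff", "signal": 0, "target": 1} for i in range(10)]
train_rows, test_rows = split_manifest(rows, train_fraction=0.8)
```

Calling `write_manifest(train_rows, "./train.csv", columns)` and `write_manifest(test_rows, "./test.csv", columns)` would then produce the two files. Seeding the shuffle keeps the split stable across runs, so the test set does not leak into training when the manifest is regenerated.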