API Reference
- class GenomeSpy(height=600, server_port=18089)
Bases: object
A Python wrapper for the GenomeSpy visualization library.
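- Parameters:
height (int, optional) – The height of the visualization in pixels, by default 600
server_port (int, optional) – The port number for the GenomeSpy server, by default 18089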
Notes
GenomeSpy is a toolkit for interactive visualization of genomic and other data. It enables tailored visualizations through a declarative grammar inspired by Vega-Lite, allowing mapping of data to visual channels (position, color, etc.) and composing complex visualizations from primitive graphical marks (points, rectangles, etc.).
Key Features:
- GPU-accelerated rendering for fluid interaction with large datasets
- Support for specialized genomic file formats (BigWig, BigBed, Indexed FASTA)
- Built-in genomic coordinate handling and transformations
- Interactive zooming and navigation
- Composable visualization grammar
- load_spec(spec, is_url=False)
Load a GenomeSpy specification.
GenomeSpy specifications define how data should be visualized, including data sources, transformations, and visual encodings. Specifications can be loaded from a JSON file or directly as a dictionary.
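As a rough illustration of the two forms mentioned above, the sketch below builds a small specification as a Python dictionary and round-trips it through JSON text. The field names follow the Vega-Lite-style grammar described in this reference; the inline `values` data and the `point` mark are assumptions chosen for the example, and nothing here calls the wrapper itself.

```python
import json

# A minimal specification as a plain dictionary (contents are illustrative).
spec = {
    "data": {"values": [{"x": 1, "y": 4}, {"x": 2, "y": 5}]},
    "mark": "point",
    "encoding": {
        "x": {"field": "x", "type": "quantitative"},
        "y": {"field": "y", "type": "quantitative"},
    },
}

# The same specification as JSON text, e.g. the contents of a .json file.
text = json.dumps(spec, indent=2)

# Parsing the text gives back an equivalent dictionary.
restored = json.loads(text)
```

Either form describes the same visualization; which one load_spec expects for a given is_url value is not stated in this reference.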
- save_html(filename)
Save the visualization as a standalone HTML file.
- Parameters:
filename (str) – Output HTML file path.
- show(filename=None)
Display the visualization in a browser or Jupyter notebook.
- Parameters:
filename (str, optional) – Optional filename to save the HTML file. If None, creates a temporary file.
Notes
When running in a Jupyter notebook, the visualization will be displayed inline. Otherwise, it will open in the default web browser.
Examples
>>> plot = GenomeSpy()
>>> # Configure visualization...
>>> plot.show()  # Display inline in notebook
>>>
>>> # Save to specific file
>>> plot.show("visualization.html")
- close()
Close the server if it’s running and clean up temporary files.
Notes
This method should be called when you’re done with the visualization to:
- Stop the local HTTP server if running
- Remove any temporary files created during visualization
- Free up system resources
Examples
>>> plot = GenomeSpy()
>>> # Create visualization...
>>> plot.show()
>>> plot.close()  # Clean up when done
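One way to make sure close() always runs is the standard library's contextlib.closing, which calls close() on any object when the with-block exits. The sketch below uses a hypothetical stand-in class (FakePlot) instead of a real GenomeSpy instance so that it runs without a server; only the presence of a close() method matters.

```python
from contextlib import closing


class FakePlot:
    """Hypothetical stand-in for a GenomeSpy instance; only close() matters here."""

    def __init__(self):
        self.closed = False

    def show(self):
        # A real instance would open a browser or render inline.
        return "shown"

    def close(self):
        # A real instance would stop its server and remove temporary files.
        self.closed = True


plot = FakePlot()
with closing(plot) as p:
    result = p.show()
# close() has been called here, even if show() had raised an exception.
```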
- data(data, format='json')
Set the data for the visualization.
- Parameters:
data (Union[pd.DataFrame, np.ndarray, str]) – The data to visualize. Can be:
- pandas DataFrame: Converted to records format
- numpy array: Converted to list format
- str: URL or path to data file
format (str, optional) – The format of the data file if using a URL or path, by default “json”. Options include:
- “json”: JSON data
- “csv”: Comma-separated values
- “tsv”: Tab-separated values
- “bigwig”: BigWig genomic data
- “bigbed”: BigBed genomic data
- “fasta”: FASTA sequence data
- “gff3”: GFF3 genomic features
- Returns:
GenomeSpy – The current instance for method chaining.
Notes
GenomeSpy utilizes a tabular data structure as its fundamental data model, similar to a spreadsheet or database table. Each dataset consists of records containing named data fields.
Data Sources:
- Eager data: Fully loaded during initialization (CSV, TSV, JSON)
- Lazy data: Loaded on-demand (BigWig, BigBed, Indexed FASTA)
- Named data: Can be dynamically updated using the API
Examples
>>> import pandas as pd
>>> from genomespy import GenomeSpy
>>>
>>> # Using pandas DataFrame
>>> df = pd.DataFrame({'x': [1, 2, 3], 'y': [4, 5, 6]})
>>> plot = GenomeSpy()
>>> plot.data(df)
>>>
>>> # Using file path
>>> plot.data("data.bigwig", format="bigwig")
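To make "records format" concrete, the sketch below converts column-oriented data into a list of per-row dictionaries using only the standard library. The helper name columns_to_records is hypothetical; with pandas, `DataFrame.to_dict(orient="records")` produces the same shape, though whether the wrapper uses that call internally is an assumption.

```python
def columns_to_records(columns):
    """Convert {"x": [1, 2], "y": [3, 4]} into one dictionary per row."""
    names = list(columns)
    lengths = {len(columns[name]) for name in names}
    if len(lengths) > 1:
        raise ValueError("all columns must have the same length")
    # zip(*...) walks the columns in parallel, yielding one tuple per row.
    rows = zip(*(columns[name] for name in names))
    return [dict(zip(names, row)) for row in rows]


records = columns_to_records({"x": [1, 2, 3], "y": [4, 5, 6]})
```

Each record carries named fields, matching the tabular data model described in the notes above.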
- transform(transform)
Add transformations to the visualization specification.
- Parameters:
transform (List[Dict[str, Any]]) – A list of transformation specifications. Each transformation is a dictionary with at least a “type” field and transformation-specific parameters.
- Returns:
GenomeSpy – The current instance for method chaining.
Notes
Transformations allow data manipulation before visualization. GenomeSpy provides specialized transformations for genomic data visualization and analysis tasks.
Common Transformations:
- formula: Calculate new fields using expressions
- filter: Filter data based on conditions
- flatten: Flatten nested data structures
- coverage: Calculate coverage from interval data
- pileup: Create piled-up layout for overlapping features
- flattenSequence: Split sequences into individual bases
- collect: Group and sort data
- project: Select and rename fields
Examples
>>> plot = GenomeSpy()
>>> plot.transform([
...     {
...         "type": "formula",
...         "expr": "datum.end - datum.start",
...         "as": "length"
...     },
...     {
...         "type": "filter",
...         "expr": "datum.length > 1000"
...     }
... ])
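The effect of the formula and filter steps in the example above can be imitated in plain Python. This is only a sketch of the semantics, assuming each transform receives the records produced by the previous one; the helper names and sample rows are hypothetical, and the real transforms run inside GenomeSpy, not in Python.

```python
rows = [
    {"start": 100, "end": 2500},
    {"start": 400, "end": 900},
    {"start": 0, "end": 1001},
]


def apply_formula(data, func, as_field):
    """Return copies of the records with a computed field added."""
    return [{**datum, as_field: func(datum)} for datum in data]


def apply_filter(data, predicate):
    """Keep only the records for which the predicate holds."""
    return [datum for datum in data if predicate(datum)]


# Equivalent of {"type": "formula", "expr": "datum.end - datum.start", "as": "length"}
with_length = apply_formula(rows, lambda d: d["end"] - d["start"], "length")

# Equivalent of {"type": "filter", "expr": "datum.length > 1000"}
long_rows = apply_filter(with_length, lambda d: d["length"] > 1000)
```

Order matters here: the filter refers to the length field, so the formula that creates it has to come first in the list.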
- mark(mark_type, **kwargs)
Set the mark type for the visualization.
- Parameters:
- Returns:
GenomeSpy – The current instance for method chaining.
Notes
Marks are the basic graphical elements used to represent data. GenomeSpy provides various mark types suitable for genomic data visualization.
Mark Types:
- rect: Rectangles (good for intervals, exons)
- point: Points (good for variants, peaks)
- line: Lines (good for continuous data)
- rule: Rules (good for boundaries)
- text: Text labels
- area: Filled areas
Mark Properties:
- size: Size of the mark
- color: Color of the mark
- opacity: Transparency
- strokeWidth: Width of stroke
- tooltip: Tooltip configuration
- minWidth: Minimum width for visibility
- minOpacity: Minimum opacity for visibility
Examples
>>> plot = GenomeSpy()
>>> plot.mark("rect",
...     size=5,
...     minWidth=0.5,
...     tooltip={"content": "data"}
... )
- encode(**kwargs)
Set the encoding for the visualization.
Encodings map data fields to visual properties. GenomeSpy supports various encoding types and provides special support for genomic coordinates.
- Parameters:
**kwargs (dict) – Encoding specifications for different channels. Each specification should be a dictionary defining the encoding properties.
- Returns:
GenomeSpy – The current instance for method chaining.
Supported Channels
- x, y (Position encoding)
- x2, y2 (Secondary position for intervals)
- color (Color encoding)
- opacity (Transparency)
- size (Size of marks)
- text (Text content)
- tooltip (Tooltip content)
- sample (Sample ID for multi-sample visualizations)
Data Types
- quantitative (Numerical values)
- nominal (Categorical values)
- ordinal (Ordered categories)
- locus (Genomic coordinates; requires chrom and pos fields)
Examples
>>> plot = GenomeSpy()
>>> plot.encode(
...     x={"chrom": "chr", "pos": "start", "type": "locus"},
...     y={"field": "value", "type": "quantitative"},
...     color={"field": "category", "type": "nominal"}
... )
- scale(**kwargs)
Set the scales for the visualization.
Scales are functions that map abstract data values (e.g., a type of mutation) to visual values (e.g., colors). GenomeSpy implements most of Vega-Lite’s scale types and adds specialized scales for genomic data.
- Parameters:
**kwargs (dict) – Scale specifications for different channels. Each specification can include:
- type: The type of scale to use
- domain: Input domain range
- range: Output range values
- nice: Whether to extend domain to nice round numbers
- padding: Padding to add around domain
- scheme: Color scheme for color scales
- Returns:
GenomeSpy – The current instance for method chaining.
Supported Scale Types
- linear (Linear mapping for quantitative data)
- pow (Power scale for quantitative data)
- sqrt (Square root scale for quantitative data)
- symlog (Symmetric log scale)
- log (Logarithmic scale)
- ordinal (Discrete mapping for categorical data)
- band (Special scale for discrete ranges)
- point (Position-based scale)
- quantize (Binning for continuous data)
- threshold (Threshold-based binning)
Examples
>>> plot = GenomeSpy()
>>> plot.scale(
...     y={
...         "type": "linear",
...         "domain": [0, 1],
...         "range": [0, 100],
...         "nice": True
...     },
...     color={
...         "type": "ordinal",
...         "domain": ["A", "C", "G", "T"],
...         "range": ["red", "blue", "green", "yellow"]
...     }
... )
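To show what the linear and ordinal scales in the example above compute, here is a plain-Python sketch of the two mappings. It ignores options such as nice and padding, and the function names are hypothetical; the actual scales are evaluated by GenomeSpy in the browser.

```python
def linear_scale(domain, range_):
    """Return a function that maps the domain interval linearly onto the range."""
    d0, d1 = domain
    r0, r1 = range_
    if d0 == d1:
        raise ValueError("domain must span a non-zero interval")

    def scale(value):
        # Normalize to [0, 1] within the domain, then stretch onto the range.
        t = (value - d0) / (d1 - d0)
        return r0 + t * (r1 - r0)

    return scale


def ordinal_scale(domain, range_):
    """Return a function that looks up the range value paired with a category."""
    mapping = dict(zip(domain, range_))
    return mapping.__getitem__


y = linear_scale([0, 1], [0, 100])
color = ordinal_scale(["A", "C", "G", "T"], ["red", "blue", "green", "yellow"])

y_mid = y(0.25)          # a quarter of the way through the domain
g_color = color("G")     # the third category maps to the third color
```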
- view(view_spec)
Add a view to the visualization.
Views in GenomeSpy allow for hierarchical composition of visualizations. Views can be concatenated, layered, or arranged in other ways. Each view inherits data and encoding from its parent but can override them with its own specifications.
- Parameters:
view_spec (Dict[str, Any]) – The view specification defining the visualization properties, data, marks, and encodings for this view.
- Returns:
GenomeSpy – The current instance for method chaining.
View Properties
- data (Data source for the view)
- transform (Data transformations)
- mark (Visual marks to represent data)
- encoding (Visual encodings)
- height (View height)
- width (View width)
- name (Unique identifier for the view)
- title (View title)
- description (View description)
- padding (Space around the view)
- opacity (View opacity)
- configurableVisibility (Whether view can be toggled)
Examples
>>> plot = GenomeSpy()
>>> plot.view({
...     "name": "genes",
...     "height": 120,
...     "data": {"url": "genes.bed"},
...     "mark": "rect",
...     "encoding": {
...         "x": {"chrom": "chr", "pos": "start", "type": "locus"},
...         "x2": {"chrom": "chr", "pos": "end"}
...     }
... })
- import_view(url)
Import a view from a URL.
This function allows importing external view specifications, enabling reuse and sharing of visualization components. Common uses include importing standard genomic tracks such as:
- Chromosome ideograms
- Gene annotation tracks
- Reference genome sequences
- Parameters:
url (str) – The URL or path to the view specification to import. Can be an absolute URL or a path relative to the base URL.
- Returns:
GenomeSpy – The current instance for method chaining.
Built-in Views
The following views are available in the .genomespy_shared/ directory:
- cytobands.json (Chromosome ideogram track)
- genes.json (Gene annotation track)
- hg38.json (Reference genome sequence)
Examples
>>> plot = GenomeSpy()
>>> # Import chromosome ideogram
>>> plot.import_view(".genomespy_shared/cytobands.json")
>>>
>>> # Import gene annotations
>>> plot.import_view(".genomespy_shared/genes.json")
>>>
>>> # Import reference genome
>>> plot.import_view(".genomespy_shared/hg38.json")
- expression(name, expr)
Add an expression to the visualization.
Expressions in GenomeSpy allow for computing new data fields or modifying existing ones. They use a JavaScript-like syntax and can access the current data object using ‘datum’. Expressions can be used in transforms, encodings, and other places where dynamic computation is needed.
- Parameters:
- Returns:
GenomeSpy – The current instance for method chaining.
Common Uses
- Computing derived values
- Conditional logic
- String manipulation
- Mathematical calculations
- Accessing parameters
Examples
>>> plot = GenomeSpy()
>>> # Calculate length of genomic interval
>>> plot.expression("length", "datum.end - datum.start")
>>>
>>> # Compute log ratio
>>> plot.expression("logRatio", "log2(datum.value / datum.control)")
>>>
>>> # Create conditional label
>>> plot.expression(
...     "label",
...     "datum.score > 0.05 ? 'High impact' : 'Low impact'"
... )
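For readers less familiar with the JavaScript-like expression syntax, the three expressions in the example above correspond roughly to the Python functions below, where datum is a single record. This is a sketch for comparison only: the sample record is made up, and the real expressions are evaluated by GenomeSpy, not by Python.

```python
import math


def length(datum):
    # "datum.end - datum.start"
    return datum["end"] - datum["start"]


def log_ratio(datum):
    # "log2(datum.value / datum.control)"
    return math.log2(datum["value"] / datum["control"])


def label(datum):
    # "datum.score > 0.05 ? 'High impact' : 'Low impact'"
    return "High impact" if datum["score"] > 0.05 else "Low impact"


# A hypothetical record with the fields the expressions refer to.
datum = {"start": 100, "end": 350, "value": 8.0, "control": 2.0, "score": 0.01}
derived = {
    "length": length(datum),
    "logRatio": log_ratio(datum),
    "label": label(datum),
}
```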
- parameter(name, value)
Add a parameter to the visualization.
Parameters enable dynamic behaviors and interactions in GenomeSpy visualizations. They can be used for interactive selections, conditional encoding, data filtering, and parameterizing imported specifications.
- Parameters:
name (str) – The name of the parameter to be referenced in expressions and conditions.
value (Any) – The parameter value or configuration. Can be a simple value or a parameter definition object.
- Returns:
GenomeSpy – The current instance for method chaining.
Parameter Types
- Selection parameters (Enable interactive data selection)
- Value parameters (Store single values)
- Range parameters (Store numeric ranges)
- Vector parameters (Store arrays of values)
Common Uses
- Interactive filtering
- Conditional encoding
- Dynamic thresholds
- Coordinated selections
- View parameterization
Examples
>>> plot = GenomeSpy()
>>> # Selection parameter for interactive highlighting
>>> plot.parameter("highlight", {
...     "select": {"type": "point", "on": "pointerover"}
... })
>>>
>>> # Value parameter for filtering
>>> plot.parameter("threshold", 0.05)
>>>
>>> # Use in encoding
>>> plot.encode(
...     opacity={
...         "condition": {"param": "highlight", "value": 1.0},
...         "value": 0.3
...     }
... )
- to_json()
Convert the specification to a JSON string.
This function serializes the current GenomeSpy specification into a JSON string, which can be used for saving or sharing the visualization configuration.
- Returns:
str – The JSON string representation of the specification.
Examples
>>> plot = GenomeSpy()
>>> plot.encode(x={"field": "value", "type": "quantitative"})
>>> json_spec = plot.to_json()
- heatmap(data, x_label='x', y_label='y')
Create a heatmap from a pandas DataFrame.
Heatmaps are a common way to visualize matrix-like data, where values are represented by colors. This function prepares the data and sets up the GenomeSpy specification for rendering a heatmap.
- Parameters:
- Returns:
GenomeSpy – The current instance for method chaining.
Examples
>>> import pandas as pd
>>> plot = GenomeSpy()
>>> data = pd.DataFrame({
...     'A': [1, 2, 3],
...     'B': [4, 5, 6],
...     'C': [7, 8, 9]
... })
>>> plot.heatmap(data, x_label="Samples", y_label="Features")
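A heatmap drawn with rectangular marks typically needs one record per cell, so a wide matrix is usually reshaped into long form first. The sketch below shows that reshaping in plain Python; the helper name and the output field names (x, y, value) are assumptions for illustration and may differ from what heatmap() produces internally.

```python
def wide_to_long(columns, row_labels):
    """Turn a column-oriented matrix into one record per (column, row) cell."""
    records = []
    for col_name, values in columns.items():
        if len(values) != len(row_labels):
            raise ValueError(f"column {col_name!r} does not match the row labels")
        for row_label, value in zip(row_labels, values):
            records.append({"x": col_name, "y": row_label, "value": value})
    return records


cells = wide_to_long(
    {"A": [1, 2, 3], "B": [4, 5, 6], "C": [7, 8, 9]},
    row_labels=[0, 1, 2],
)
```

With records in this shape, x and y can be encoded as categorical positions and value as color.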
- clustermap(data, x_label='x', y_label='y', method='ward', metric='euclidean', z_score=None, standard_scale=None, row_cluster=True, col_cluster=True, vmax=None, vmin=None, center=None, cmap='viridis')
Create a clustermap from a pandas DataFrame.
A clustermap combines a heatmap with hierarchical clustering dendrograms on both axes. The clustering helps reveal patterns and relationships in the data by grouping similar rows and columns together.
- Parameters:
data (pd.DataFrame) – Input data matrix to be clustered and visualized
x_label (str, optional) – Label for x-axis, by default “x”
y_label (str, optional) – Label for y-axis, by default “y”
method (str, optional) – Linkage method for hierarchical clustering, by default “ward”
metric (str, optional) – Distance metric for clustering, by default “euclidean”
z_score (int, optional) – Standardize the data along rows (0) or columns (1), by default None
standard_scale (int, optional) – Scale data along rows (0) or columns (1), by default None
row_cluster (bool, optional) – Whether to cluster rows, by default True
col_cluster (bool, optional) – Whether to cluster columns, by default True
vmax (float, optional) – Maximum value for color scaling, by default None
vmin (float, optional) – Minimum value for color scaling, by default None
center (float, optional) – Center value for diverging colormaps, by default None
cmap (str, optional) – Colormap name, either “viridis” or “blues”, by default “viridis”
- Returns:
GenomeSpy – The current instance for method chaining.
Examples
>>> import pandas as pd
>>> from genomespy import GenomeSpy
>>>
>>> # Create sample data
>>> data = pd.DataFrame({
...     'A': [1, 2, 3],
...     'B': [2, 4, 6],
...     'C': [3, 6, 9]
... })
>>>
>>> # Create and display clustermap
>>> plot = GenomeSpy()
>>> plot.clustermap(
...     data,
...     x_label="Samples",
...     y_label="Features",
...     z_score=1,
...     method="ward"
... )
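The z_score option standardizes values before clustering. The sketch below shows one plausible reading, using the axis convention from the parameter list (0 for rows, 1 for columns) and the population standard deviation; whether clustermap() uses the population or sample standard deviation is not stated here, so treat that choice as an assumption.

```python
from statistics import mean, pstdev


def _standardize_rows(matrix):
    """Subtract each row's mean and divide by its standard deviation."""
    result = []
    for row in matrix:
        mu = mean(row)
        sigma = pstdev(row)  # population standard deviation (assumed)
        if sigma == 0:
            raise ValueError("cannot standardize a constant row")
        result.append([(value - mu) / sigma for value in row])
    return result


def z_score(matrix, axis):
    """Standardize along rows (axis=0) or columns (axis=1)."""
    if axis == 0:
        return _standardize_rows(matrix)
    if axis == 1:
        # Transpose, standardize the former columns as rows, transpose back.
        transposed = [list(column) for column in zip(*matrix)]
        return [list(row) for row in zip(*_standardize_rows(transposed))]
    raise ValueError("axis must be 0 (rows) or 1 (columns)")


matrix = [[1, 2, 3], [2, 4, 6], [3, 6, 9]]
by_rows = z_score(matrix, axis=0)
by_columns = z_score(matrix, axis=1)
```

After standardization every row (or column) has mean zero and unit spread, so clustering is driven by the shape of a profile instead of its magnitude.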
- dendrogram(data, method='ward', metric='euclidean')
Create a dendrogram using GenomeSpy.
Dendrograms are tree-like diagrams used to visualize the arrangement of clusters produced by hierarchical clustering.
- Parameters:
- Returns:
GenomeSpy – The current instance for method chaining.
Examples
>>> import pandas as pd
>>> plot = GenomeSpy()
>>> data = pd.DataFrame({
...     'A': [1, 2, 3],
...     'B': [4, 5, 6]
... })
>>> plot.dendrogram(data, method="ward", metric="euclidean")
- igv(file_dict, region=None, height=600, server_port=18089, gs=None)
Create a GenomeSpy visualization with custom tracks in IGV style.
This function creates a genome browser visualization similar to IGV (Integrative Genomics Viewer), with support for various genomic data formats and customizable tracks.
- Parameters:
file_dict (Dict[str, Dict[str, Any]]) – A dictionary mapping track names to their configurations. Each track configuration should specify:
- url or path : Path to the data file
- type : Data format (e.g., “bigwig”, “bigbed”)
- height : Track height in pixels
region (Optional[Dict[str, Any]], optional) – The genomic region to display, by default None. Should contain:
- chrom : Chromosome name
- start : Start position
- end : End position
height (int, optional) – The height of the visualization in pixels, by default 600
server_port (int, optional) – The port number for the GenomeSpy server, by default 18089
gs (GenomeSpy, optional) – An existing GenomeSpy instance to reuse, by default None
- Returns:
GenomeSpy – The configured GenomeSpy instance ready for display.
Examples
>>> from genomespy import igv
>>> # Configure tracks
>>> tracks = {
...     "ZBTB7A": {
...         "url": "https://chip-atlas.dbcls.jp/data/hg38/eachData/bw/SRX3161009.bw",
...         "height": 40,
...         "type": "bigwig"
...     }
... }
>>> # Create visualization
>>> plot = igv(
...     tracks,
...     region={"chrom": "chr7", "start": 66600000, "end": 66800000}
... )
>>> plot.show()
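Because the region argument is a plain dictionary, it can be worth checking it before building a plot. The helper below is a hypothetical sketch, not part of the package: it verifies that the three documented keys are present, that start precedes end, and returns the conventional chrom:start-end label.

```python
REQUIRED_REGION_KEYS = ("chrom", "start", "end")


def validate_region(region):
    """Check a region dictionary and return it as a 'chrom:start-end' label."""
    missing = [key for key in REQUIRED_REGION_KEYS if key not in region]
    if missing:
        raise KeyError("region is missing: " + ", ".join(missing))
    if region["start"] >= region["end"]:
        raise ValueError("start must be smaller than end")
    return f'{region["chrom"]}:{region["start"]}-{region["end"]}'


region_label = validate_region({"chrom": "chr7", "start": 66600000, "end": 66800000})
```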
Core Functionality
- class RangeRequestHandler(*args, directory=None, **kwargs)
Bases: SimpleHTTPRequestHandler
HTTP handler that supports range requests for bigwig/bigbed files.
This handler extends the SimpleHTTPRequestHandler to support HTTP range requests, which are necessary for serving large genomic data files like bigwig and bigbed.
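A range request carries a header such as `Range: bytes=0-99`, and the server answers with only those bytes (status 206 and a matching Content-Range header). The sketch below shows how such a header can be parsed for a file of known size. It handles a single range only, and the function name is hypothetical; it is an independent illustration of the mechanism, not the code this handler uses.

```python
import re

_RANGE_RE = re.compile(r"^bytes=(\d*)-(\d*)$")


def parse_range(header, file_size):
    """Return inclusive (start, end) byte offsets, or None if not satisfiable."""
    match = _RANGE_RE.match(header.strip())
    if not match:
        return None
    first, last = match.groups()
    if first == "" and last == "":
        return None
    if first == "":
        # Suffix form "bytes=-N": the final N bytes of the file.
        suffix_length = int(last)
        if suffix_length == 0:
            return None
        start = max(file_size - suffix_length, 0)
        end = file_size - 1
    else:
        start = int(first)
        # An open-ended or oversized range is clamped to the last byte.
        end = min(int(last), file_size - 1) if last else file_size - 1
    if start > end or start >= file_size:
        return None
    return start, end


first_hundred = parse_range("bytes=0-99", 1000)
tail = parse_range("bytes=-100", 1000)
```

Clients reading BigWig or BigBed files rely on this to fetch the index and individual data blocks without downloading the whole file.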
- BINARY_EXTENSIONS = ['.bw', '.bigwig']
- log_message(format, *args)
Log an arbitrary message.
This is used by all other logging functions. Override it if you have specific logging wishes.
The first argument, FORMAT, is a format string for the message to be logged. If the format string contains any % escapes requiring parameters, they should be specified as subsequent arguments (it’s just like printf!).
The client ip and current date/time are prefixed to every message.
Unicode control characters are replaced with escaped hex before writing the output to stderr.
- class GenomeSpy(height=600, server_port=18089)[source]
Bases:
objectA Python wrapper for GenomeSpy visualization library.
- Parameters:
Notes
GenomeSpy is a toolkit for interactive visualization of genomic and other data. It enables tailored visualizations through a declarative grammar inspired by Vega-Lite, allowing mapping of data to visual channels (position, color, etc.) and composing complex visualizations from primitive graphical marks (points, rectangles, etc.).
Key Features: - GPU-accelerated rendering for fluid interaction with large datasets - Support for specialized genomic file formats (BigWig, BigBed, Indexed FASTA) - Built-in genomic coordinate handling and transformations - Interactive zooming and navigation - Composable visualization grammar
- load_spec(spec, is_url=False)[source]
Load a GenomeSpy specification.
GenomeSpy specifications define how data should be visualized, including data sources, transformations, and visual encodings. Specifications can be loaded from a JSON file or directly as a dictionary.
- save_html(filename)[source]
Save the visualization as a standalone HTML file.
- Parameters:
filename (str) – Output HTML file path.
- show(filename=None)[source]
Display the visualization in a browser or Jupyter notebook.
- Parameters:
filename (str, optional) – Optional filename to save the HTML file. If None, creates a temporary file.
Notes
When running in a Jupyter notebook, the visualization will be displayed inline. Otherwise, it will open in the default web browser.
Examples
>>> plot = GenomeSpy() >>> # Configure visualization... >>> plot.show() # Display inline in notebook >>> >>> # Save to specific file >>> plot.show("visualization.html")
- close()[source]
Close the server if it’s running and cleanup temporary files.
Notes
This method should be called when you’re done with the visualization to: - Stop the local HTTP server if running - Remove any temporary files created during visualization - Free up system resources
Examples
>>> plot = GenomeSpy() >>> # Create visualization... >>> plot.show() >>> plot.close() # Cleanup when done
- data(data, format='json')[source]
Set the data for the visualization.
- Parameters:
data (Union[pd.DataFrame, np.ndarray, str]) – The data to visualize. Can be: - pandas DataFrame: Converted to records format - numpy array: Converted to list format - str: URL or path to data file
format (str, optional) – The format of the data file if using URL/path, by default “json” Options include: - “json”: JSON data - “csv”: Comma-separated values - “tsv”: Tab-separated values - “bigwig”: BigWig genomic data - “bigbed”: BigBed genomic data - “fasta”: FASTA sequence data - “gff3”: GFF3 genomic features
- Returns:
The current instance for method chaining
- Return type:
Notes
GenomeSpy utilizes a tabular data structure as its fundamental data model, similar to a spreadsheet or database table. Each dataset consists of records containing named data fields.
Data Sources: - Eager data: Fully loaded during initialization (CSV, TSV, JSON) - Lazy data: Loaded on-demand (BigWig, BigBed, Indexed FASTA) - Named data: Can be dynamically updated using the API
Examples
>>> import pandas as pd >>> from genomespy import GenomeSpy >>> >>> # Using pandas DataFrame >>> df = pd.DataFrame({'x': [1, 2, 3], 'y': [4, 5, 6]}) >>> plot = GenomeSpy() >>> plot.data(df) >>> >>> # Using file path >>> plot.data("data.bigwig", format="bigwig")
- transform(transform)[source]
Add transformations to the visualization specification.
- Parameters:
transform (List[Dict[str, Any]]) – A list of transformation specifications. Each transformation is a dictionary with at least a “type” field and transformation-specific parameters.
- Returns:
The current instance for method chaining
- Return type:
Notes
Transformations allow data manipulation before visualization. GenomeSpy provides specialized transformations for genomic data visualization and analysis tasks.
Common Transformations: - formula: Calculate new fields using expressions - filter: Filter data based on conditions - flatten: Flatten nested data structures - coverage: Calculate coverage from interval data - pileup: Create piled-up layout for overlapping features - flattenSequence: Split sequences into individual bases - collect: Group and sort data - project: Select and rename fields
Examples
>>> plot = GenomeSpy() >>> plot.transform([ ... { ... "type": "formula", ... "expr": "datum.end - datum.start", ... "as": "length" ... }, ... { ... "type": "filter", ... "expr": "datum.length > 1000" ... } ... ])
- mark(mark_type, **kwargs)[source]
Set the mark type for the visualization.
- Parameters:
- Returns:
The current instance for method chaining
- Return type:
Notes
Marks are the basic graphical elements used to represent data. GenomeSpy provides various mark types suitable for genomic data visualization.
Mark Types: - rect: Rectangles (good for intervals, exons) - point: Points (good for variants, peaks) - line: Lines (good for continuous data) - rule: Rules (good for boundaries) - text: Text labels - area: Filled areas
Mark Properties: - size: Size of the mark - color: Color of the mark - opacity: Transparency - strokeWidth: Width of stroke - tooltip: Tooltip configuration - minWidth: Minimum width for visibility - minOpacity: Minimum opacity for visibility
Examples
>>> plot = GenomeSpy() >>> plot.mark("rect", ... size=5, ... minWidth=0.5, ... tooltip={"content": "data"} ... )
- encode(**kwargs)[source]
Set the encoding for the visualization.
Encodings map data fields to visual properties. GenomeSpy supports various encoding types and provides special support for genomic coordinates.
- Parameters:
**kwargs (dict) – Encoding specifications for different channels. Each specification should be a dictionary defining the encoding properties.
- Returns:
GenomeSpy – The current instance for method chaining.
Supported Channels
—————-
- x, y (Position encoding)
- x2, y2 (Secondary position for intervals)
- color (Color encoding)
- opacity (Transparency)
- size (Size of marks)
- text (Text content)
- tooltip (Tooltip content)
- sample (Sample ID for multi-sample visualizations)
Data Types
———
- quantitative (Numerical values)
- nominal (Categorical values)
- ordinal (Ordered categories)
- locus (Genomic coordinates (requires chrom and pos fields))
Examples
>>> plot = GenomeSpy() >>> plot.encode( ... x={"chrom": "chr", "pos": "start", "type": "locus"}, ... y={"field": "value", "type": "quantitative"}, ... color={"field": "category", "type": "nominal"} ... )
- scale(**kwargs)[source]
Set the scales for the visualization.
Scales are functions that map abstract data values (e.g., a type of mutation) to visual values (e.g., colors). GenomeSpy implements most of Vega-Lite’s scale types and adds specialized scales for genomic data.
- Parameters:
**kwargs (dict) – Scale specifications for different channels. Each specification can include: - type: The type of scale to use - domain: Input domain range - range: Output range values - nice: Whether to extend domain to nice round numbers - padding: Padding to add around domain - scheme: Color scheme for color scales
- Returns:
GenomeSpy – The current instance for method chaining.
Supported Scale Types
——————-
- linear (Linear mapping for quantitative data)
- pow (Power scale for quantitative data)
- sqrt (Square root scale for quantitative data)
- symlog (Symmetric log scale)
- log (Logarithmic scale)
- ordinal (Discrete mapping for categorical data)
- band (Special scale for discrete ranges)
- point (Position-based scale)
- quantize (Binning for continuous data)
- threshold (Threshold-based binning)
Examples
>>> plot = GenomeSpy() >>> plot.scale( ... y={ ... "type": "linear", ... "domain": [0, 1], ... "range": [0, 100], ... "nice": True ... }, ... color={ ... "type": "ordinal", ... "domain": ["A", "C", "G", "T"], ... "range": ["red", "blue", "green", "yellow"] ... } ... )
- view(view_spec)[source]
Add a view to the visualization.
Views in GenomeSpy allow for hierarchical composition of visualizations. Views can be concatenated, layered, or arranged in other ways. Each view inherits data and encoding from its parent but can override them with its own specifications.
- Parameters:
view_spec (Dict[str, Any]) – The view specification defining the visualization properties, data, marks, and encodings for this view.
- Returns:
GenomeSpy – The current instance for method chaining.
View Properties
————–
- data (Data source for the view)
- transform (Data transformations)
- mark (Visual marks to represent data)
- encoding (Visual encodings)
- height (View height)
- width (View width)
- name (Unique identifier for the view)
- title (View title)
- description (View description)
- padding (Space around the view)
- opacity (View opacity)
- configurableVisibility (Whether view can be toggled)
Examples
>>> plot = GenomeSpy() >>> plot.view({ ... "name": "genes", ... "height": 120, ... "data": {"url": "genes.bed"}, ... "mark": "rect", ... "encoding": { ... "x": {"chrom": "chr", "pos": "start", "type": "locus"}, ... "x2": {"chrom": "chr", "pos": "end"} ... } ... })
- import_view(url)[source]
Import a view from a URL.
This function allows importing external view specifications, enabling reuse and sharing of visualization components. Common uses include importing standard genomic tracks like: - Chromosome ideograms - Gene annotation tracks - Reference genome sequences
- Parameters:
url (str) – The URL or path to the view specification to import. Can be absolute URL or relative to the base URL.
- Returns:
GenomeSpy – The current instance for method chaining.
Built-in Views
————-
The following views are available in the .genomespy_shared/ directory
- cytobands.json (Chromosome ideogram track)
- genes.json (Gene annotation track)
- hg38.json (Reference genome sequence)
Examples
>>> plot = GenomeSpy() >>> # Import chromosome ideogram >>> plot.import_view(".genomespy_shared/cytobands.json") >>> >>> # Import gene annotations >>> plot.import_view(".genomespy_shared/genes.json") >>> >>> # Import reference genome >>> plot.import_view(".genomespy_shared/hg38.json")
- expression(name, expr)[source]
Add an expression to the visualization.
Expressions in GenomeSpy allow for computing new data fields or modifying existing ones. They use a JavaScript-like syntax and can access the current data object using ‘datum’. Expressions can be used in transforms, encodings, and other places where dynamic computation is needed.
- Parameters:
- Returns:
GenomeSpy – The current instance for method chaining.
Common Uses
———-
- Computing derived values
- Conditional logic
- String manipulation
- Mathematical calculations
- Accessing parameters
Examples
>>> plot = GenomeSpy() >>> # Calculate length of genomic interval >>> plot.expression("length", "datum.end - datum.start") >>> >>> # Compute log ratio >>> plot.expression("logRatio", "log2(datum.value / datum.control)") >>> >>> # Create conditional label >>> plot.expression( ... "label", ... "datum.score > 0.05 ? 'High impact' : 'Low impact'" ... )
- parameter(name, value)[source]
Add a parameter to the visualization.
Parameters enable dynamic behaviors and interactions in GenomeSpy visualizations. They can be used for interactive selections, conditional encoding, data filtering, and parameterizing imported specifications.
- Parameters:
name (str) – The name of the parameter to be referenced in expressions and conditions.
value (Any) – The parameter value or configuration. Can be a simple value or a parameter definition object.
- Returns:
GenomeSpy – The current instance for method chaining.
Parameter Types
————–
- Selection parameters (Enable interactive data selection)
- Value parameters (Store single values)
- Range parameters (Store numeric ranges)
- Vector parameters (Store arrays of values)
Common Uses
———-
- Interactive filtering
- Conditional encoding
- Dynamic thresholds
- Coordinated selections
- View parameterization
Examples
>>> plot = GenomeSpy() >>> # Selection parameter for interactive highlighting >>> plot.parameter("highlight", { ... "select": {"type": "point", "on": "pointerover"} ... }) >>> >>> # Value parameter for filtering >>> plot.parameter("threshold", 0.05) >>> >>> # Use in encoding >>> plot.encode( ... opacity={ ... "condition": {"param": "highlight", "value": 1.0}, ... "value": 0.3 ... } ... )
- to_json()[source]
Convert the specification to a JSON string.
This function serializes the current GenomeSpy specification into a JSON string, which can be used for saving or sharing the visualization configuration.
- Returns:
The JSON string representation of the specification.
- Return type:
Examples
>>> plot = GenomeSpy() >>> plot.encode(x={"field": "value", "type": "quantitative"}) >>> json_spec = plot.to_json()
- heatmap(data, x_label='x', y_label='y')[source]
Create a heatmap from a pandas DataFrame.
Heatmaps are a common way to visualize matrix-like data, where values are represented by colors. This function prepares the data and sets up the GenomeSpy specification for rendering a heatmap.
- Parameters:
- Returns:
The current instance for method chaining.
- Return type:
Examples
>>> import pandas as pd >>> plot = GenomeSpy() >>> data = pd.DataFrame({ ... 'A': [1, 2, 3], ... 'B': [4, 5, 6], ... 'C': [7, 8, 9] ... }) >>> plot.heatmap(data, x_label="Samples", y_label="Features")
- clustermap(data, x_label='x', y_label='y', method='ward', metric='euclidean', z_score=None, standard_scale=None, row_cluster=True, col_cluster=True, vmax=None, vmin=None, center=None, cmap='viridis')[source]
Create a clustermap from a pandas DataFrame.
A clustermap combines a heatmap with hierarchical clustering dendrograms on both axes. The clustering helps reveal patterns and relationships in the data by grouping similar rows and columns together.
- Parameters:
data (pd.DataFrame) – Input data matrix to be clustered and visualized
x_label (str, optional) – Label for x-axis, by default “x”
y_label (str, optional) – Label for y-axis, by default “y”
method (str, optional) – Linkage method for hierarchical clustering, by default “ward”
metric (str, optional) – Distance metric for clustering, by default “euclidean”
z_score (int, optional) – Standardize the data along rows (0) or columns (1), by default None
standard_scale (int, optional) – Scale data along rows (0) or columns (1), by default None
row_cluster (bool, optional) – Whether to cluster rows, by default True
col_cluster (bool, optional) – Whether to cluster columns, by default True
vmax (float, optional) – Maximum value for color scaling, by default None
vmin (float, optional) – Minimum value for color scaling, by default None
center (float, optional) – Center value for diverging colormaps, by default None
cmap (str, optional) – Colormap name, either “viridis” or “blues”, by default “viridis”
- Returns:
The current instance for method chaining
- Return type:
GenomeSpy
Examples
>>> import pandas as pd
>>> from genomespy import GenomeSpy
>>>
>>> # Create sample data
>>> data = pd.DataFrame({
...     'A': [1, 2, 3],
...     'B': [2, 4, 6],
...     'C': [3, 6, 9]
... })
>>>
>>> # Create and display clustermap
>>> plot = GenomeSpy()
>>> plot.clustermap(
...     data,
...     x_label="Samples",
...     y_label="Features",
...     z_score=1,
...     method="ward"
... )
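To make the `z_score` and `standard_scale` options concrete, the sketch below applies both to the columns of the example matrix (the `z_score=1` case). It is a plain-Python illustration of the usual definitions, not the library's code; in particular, the use of the sample standard deviation is an assumption, and constant columns would divide by zero.

```python
from statistics import mean, stdev


def z_score(values):
    """Center on the mean and divide by the sample standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def standard_scale(values):
    """Subtract the minimum, then divide by the range (max - min)."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


columns = {"A": [1, 2, 3], "B": [2, 4, 6], "C": [3, 6, 9]}
z_cols = {name: z_score(vals) for name, vals in columns.items()}
s_cols = {name: standard_scale(vals) for name, vals in columns.items()}

print(z_cols)
print(s_cols)
```

Note that the three example columns are multiples of one another, so they become identical after column-wise standardization.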
- dendrogram(data, method='ward', metric='euclidean')[source]
Create a dendrogram using GenomeSpy.
Dendrograms are tree-like diagrams used to visualize the arrangement of clusters produced by hierarchical clustering.
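- Parameters:
data (pd.DataFrame) – Input data to be clustered
method (str, optional) – Linkage method for hierarchical clustering, by default “ward”
metric (str, optional) – Distance metric for clustering, by default “euclidean”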
- Returns:
The current instance for method chaining
- Return type:
GenomeSpy
Examples
>>> import pandas as pd
>>> plot = GenomeSpy()
>>> data = pd.DataFrame({
...     'A': [1, 2, 3],
...     'B': [4, 5, 6]
... })
>>> plot.dendrogram(data, method="ward", metric="euclidean")
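The tree that a dendrogram draws is the sequence of merges produced by agglomerative clustering. The sketch below computes such a sequence in plain Python. For brevity it uses single linkage with Euclidean distance, not the default "ward" method, and it is not the routine this package uses; `single_linkage` and the sample rows are illustrative assumptions.

```python
from itertools import combinations
from math import dist


def cluster_distance(points, a, b):
    """Single-linkage distance: the closest pair of members across clusters."""
    return min(dist(points[i], points[j]) for i in a for j in b)


def single_linkage(points):
    """Agglomerative clustering with single linkage and Euclidean distance.

    Returns a list of merges (cluster_a, cluster_b, distance), where each
    cluster is a sorted tuple of original row indices.
    """
    clusters = [(i,) for i in range(len(points))]
    merges = []
    while len(clusters) > 1:
        # Pick the two clusters that are currently closest to each other.
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: cluster_distance(points, *pair),
        )
        merges.append((a, b, cluster_distance(points, a, b)))
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(tuple(sorted(a + b)))
    return merges


rows = [(1.0, 4.0), (2.0, 5.0), (6.0, 9.0)]
merges = single_linkage(rows)
for a, b, d in merges:
    print(a, b, round(d, 3))
```

Each merge corresponds to one junction in the dendrogram, drawn at a height equal to the merge distance.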
- create_track_spec(track_name, track_config, region)[source]
Create a track specification for GenomeSpy.
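- Parameters:
track_name (str) – The name of the track
track_config (Dict[str, Any]) – The track configuration (for example type, url, and height)
region (Dict[str, Any]) – The genomic region for the visualization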
- Returns:
The complete track specification
- Return type:
Dict[str, Any]
Examples
>>> region = {"chrom": "chr1", "start": 1000, "end": 2000}
>>> config = {
...     "type": "bigwig",
...     "url": "data.bw",
...     "height": 100
... }
>>> spec = create_track_spec("Coverage", config, region)
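The structure of the returned dictionary is not documented here. As a rough illustration only, the sketch below builds the kind of specification a bigwig track might need. The function `sketch_bigwig_track` is hypothetical, and the key names (`lazy` data source, `chrom`/`pos` locus encoding, `score` field) are assumptions that should be checked against the GenomeSpy grammar documentation.

```python
def sketch_bigwig_track(track_name, track_config, region):
    """Hypothetical stand-in for create_track_spec, limited to bigwig tracks."""
    return {
        "name": track_name,
        "height": track_config.get("height", 100),
        # Assumed layout of a lazily loaded bigwig data source.
        "data": {"lazy": {"type": "bigwig", "url": track_config["url"]}},
        "mark": "rect",
        "encoding": {
            "x": {
                "chrom": "chrom",
                "pos": "start",
                "type": "locus",
                # Restrict the initial view to the requested region.
                "scale": {
                    "domain": [
                        {"chrom": region["chrom"], "pos": region["start"]},
                        {"chrom": region["chrom"], "pos": region["end"]},
                    ]
                },
            },
            "x2": {"chrom": "chrom", "pos": "end"},
            "y": {"field": "score", "type": "quantitative"},
        },
    }


region = {"chrom": "chr1", "start": 1000, "end": 2000}
config = {"type": "bigwig", "url": "data.bw", "height": 100}
track = sketch_bigwig_track("Coverage", config, region)
print(track["data"])
```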
- create_base_spec(region)[source]
Create the base specification for GenomeSpy visualization.
- Parameters:
region (Dict[str, Any]) – The genomic region for the visualization
- Returns:
The base specification including schema and default tracks
- Return type:
Dict[str, Any]
Examples
>>> region = {"chrom": "chr1", "start": 1000, "end": 2000}
>>> base_spec = create_base_spec(region)
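Since several functions take the same region dictionary, it can help to check it before building a specification. The sketch below validates the three documented keys and formats the region as a conventional `chrom:start-end` string. The helper `format_region` is hypothetical, and the `0 <= start < end` rule is an assumption about what counts as a valid region.

```python
def format_region(region):
    """Validate a region dictionary and return it as 'chrom:start-end'."""
    missing = {"chrom", "start", "end"} - region.keys()
    if missing:
        raise ValueError(f"region is missing keys: {sorted(missing)}")
    if not 0 <= region["start"] < region["end"]:
        raise ValueError("region requires 0 <= start < end")
    return f"{region['chrom']}:{region['start']}-{region['end']}"


label = format_region({"chrom": "chr1", "start": 1000, "end": 2000})
print(label)  # chr1:1000-2000
```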
- igv(file_dict, region=None, height=600, server_port=18089, gs=None)[source]
Create a GenomeSpy visualization with custom tracks in IGV style.
This function creates a genome browser visualization similar to IGV (Integrative Genomics Viewer), with support for various genomic data formats and customizable tracks.
- Parameters:
file_dict (Dict[str, Dict[str, Any]]) – A dictionary mapping track names to their configurations. Each track configuration should specify:
- url or path : Path to the data file
- type : Data format (e.g., “bigwig”, “bigbed”)
- height : Track height in pixels
region (Optional[Dict[str, Any]], optional) – The genomic region to display, by default None. Should contain:
- chrom : Chromosome name
- start : Start position
- end : End position
height (int, optional) – The height of the visualization in pixels, by default 600
server_port (int, optional) – The port number for the GenomeSpy server, by default 18089
gs (GenomeSpy, optional) – An existing GenomeSpy instance to reuse, by default None
- Returns:
The configured GenomeSpy instance ready for display
- Return type:
GenomeSpy
Examples
>>> from genomespy import igv
>>> # Configure tracks
>>> tracks = {
...     "ZBTB7A": {
...         "url": "https://chip-atlas.dbcls.jp/data/hg38/eachData/bw/SRX3161009.bw",
...         "height": 40,
...         "type": "bigwig"
...     }
... }
>>> # Create visualization
>>> plot = igv(
...     tracks,
...     region={"chrom": "chr7", "start": 66600000, "end": 66800000}
... )
>>> plot.show()
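A track dictionary can be checked against the documented requirements (a `url` or `path`, a `type`, and a `height`) before it is passed to `igv`. The sketch below does this with the standard library only. The helper `check_tracks` is hypothetical; whether `igv` itself rejects incomplete configurations is not stated here.

```python
def check_tracks(file_dict):
    """Return a list of problems found in a track dictionary (empty if none)."""
    problems = []
    for name, config in file_dict.items():
        if "url" not in config and "path" not in config:
            problems.append(f"{name}: needs 'url' or 'path'")
        for key in ("type", "height"):
            if key not in config:
                problems.append(f"{name}: missing '{key}'")
    return problems


tracks = {
    "ZBTB7A": {
        "url": "https://chip-atlas.dbcls.jp/data/hg38/eachData/bw/SRX3161009.bw",
        "height": 40,
        "type": "bigwig",
    },
    "Incomplete": {"type": "bigbed"},  # deliberately missing url/path and height
}
problems = check_tracks(tracks)
print(problems)
```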
- create_gencode_layers()[source]
Create the layer specifications for the Gencode track.
- Returns:
The list of layer specifications.
- Return type:
List[Dict[str, Any]]
- create_gencode_exons_layer()[source]
Create the exons layer specification for the Gencode track.
- Returns:
The exons layer specification.
- Return type:
Dict[str, Any]
- create_exon_layer()[source]
Create the exon sublayer specification.
- Returns:
The exon sublayer specification.
- Return type:
Dict[str, Any]
- create_feature_layer()[source]
Create the feature sublayer specification.
- Returns:
The feature sublayer specification.
- Return type:
Dict[str, Any]