Clustering TCRαβ and GEX ======================================== This example shows how to cluster the joint information from TCRαβ and GEX data. We use the human huARdb v2 reference dataset as an example. Load the reference dataset -------------------------- .. code-block:: python :linenos: import tcr_deep_insight as tdi import torch gex_reference_adata = tdi.data.human_gex_reference_v2() Searching for cGxTr clusters without any constraints ----------------------------------------------------- Search for TCR clusters without considering disease type or HLA information. The clustering will be performed based on the similarity of TCR sequences and the GEX data. .. code-block:: python :linenos: tdi_cluster_result = tdi.tl.cluster_tcr( tcr_reference_adata, max_distance=4., max_cluster_size=100, n_jobs=16 ) Searching for cGxTr clusters with at least one common HLA alleles ----------------------------------------------------------------- If the HLA information is available in the `tcr_adata.obs` object, we can include the HLA information in the clustering. .. code-block:: python :linenos: # if HLA information is available in the tcr_adata object include_hla_keys = { 'A': ['A_1','A_2'], 'B': ['B_1','B_2'], 'C': ['C_1','C_2'] } hla_map = { key: dict( zip( range(len(tcr_adata.obs)), tcr_adata.obs.loc[:,val].to_numpy() ) ) for key,val in include_hla_keys.items() } tdi_cluster_result_hla = tdi.tl.cluster_tcr( tcr_reference_adata, max_distance=4., max_cluster_size=100, n_jobs=16, include_hla_keys=include_hla_keys ) Searching for cGxTr clusters with unique disease type ------------------------------------------------------ if `label_key` is provided in the `tcr_adata.obs` object, we can include the disease type information in the clustering. We constrain the TCRs in the same cluster to have the same disease type. See more details in the :func:`tcr_deep_insight.tl.cluster_tcr` function documentation. .. code-block:: python :linenos: tdi_cluster_result_disease = tdi.tl.cluster_tcr( tcr_reference_adata, label_key='disease_type', max_distance=4., max_cluster_size=100, n_jobs=16 ) Searching for cGxTr clusters with constrains on TCRs ---------------------------------------------------- We can constrain that the TCRs in the same cluster have the same TRBV gene segment or the same CDR3β length. .. code-block:: python :linenos: tdi_cluster_result_disease = tdi.tl.cluster_tcr( tcr_reference_adata, label_key='disease_type', max_distance=4., max_cluster_size=100, same_trbv=True, same_cdr3b_length=True, n_jobs=16 ) We also provide the constrain on the alpha chain, including arguments `same_trav`, `same_cdr3a_length`. Combining the clustering constrains ------------------------------------ The constrains can be combined together. .. code-block:: python :linenos: include_hla_keys = { 'A': ['A_1','A_2'], 'B': ['B_1','B_2'], 'C': ['C_1','C_2'] } tdi_cluster_result_disease = tdi.tl.cluster_tcr( tcr_reference_adata, label_key='disease_type', max_distance=4., max_cluster_size=100, same_trbv=True, same_cdr3b_length=True, same_trav=True, same_cdr3a_length=True, include_hla_keys=include_hla_keys, n_jobs=16 ) For more information, please refer to the :func:`tcr_deep_insight.tl.cluster_tcr` function documentation.