Plotly

The plotly submodule contains methods of producing visualizations using the Plotly graphing library.

Important

There is currently a problem displaying Plotly plots in Jupyter notebooks (see issue #23). The workaround is to save the plot and open it in a different browser window.

The plotly submodule contains components for generating word clouds and cluster analysis plots.

Plotly Word Clouds

lexos.visualization.plotly.cloud.wordcloud.make_wordcloud(data, opts=None, round=None)

Make a word cloud.

Accepts data from a string, list of lists or tuples, a dict with terms as keys and counts/frequencies as values, or a dataframe.
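
The accepted non-string shapes all reduce to a term-to-count mapping before the cloud is generated. A minimal sketch of that normalization, using only the standard library; `normalize_term_counts` is a hypothetical helper, and a list of record dicts stands in for the rows of a dataframe:

```python
from typing import Dict, Mapping, Union

Number = Union[int, float]


def normalize_term_counts(data) -> Dict[str, Number]:
    """Reduce the supported inputs to a {term: count} dict (hypothetical helper)."""
    # A dict is already in the target shape.
    if isinstance(data, Mapping):
        return dict(data)
    rows = list(data)
    # Record dicts stand in for dataframe rows:
    # {"terms": ..., "count": ...} or {"terms": ..., "frequency": ...}.
    if rows and isinstance(rows[0], Mapping):
        value_key = "count" if "count" in rows[0] else "frequency"
        return {row["terms"]: row[value_key] for row in rows}
    # Otherwise assume a list of (term, count) lists or tuples.
    return {term: count for term, count in rows}


pairs = [("whale", 12), ["sea", 7]]
records = [{"terms": "whale", "frequency": 0.6}, {"terms": "sea", "frequency": 0.4}]

print(normalize_term_counts(pairs))
print(normalize_term_counts(records))
```

Whichever shape is passed in, the result is the kind of dict that a frequency-based word cloud generator expects.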

Parameters:

Name Type Description Default
data Union[dict, list, object, str, tuple]

The data. Accepts a text string, a list of lists or tuples, a dict with the terms as keys and the counts/frequencies as values, or a dataframe with "terms" and "count" or "frequency" columns.

required
opts dict

The WordCloud() options. For testing, try {"background_color": "white", "max_words": 2000, "contour_width": 3, "contour_color": "steelblue"}

None
round int

An integer (generally between 100-300) to apply a mask that rounds the word cloud.

None
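
The source listed on this page accepts `round` but does not show how the mask is built. One common approach, sketched here as an assumption rather than as the Lexos implementation, is a square grid in which cells outside a circle of the given radius are set to 255 (white), which `WordCloud` treats as masked out. `circular_mask` and the default 300-pixel size are hypothetical; in practice the nested list would be converted to a NumPy array and passed as `opts["mask"]`.

```python
from typing import List


def circular_mask(radius: int, size: int = 300) -> List[List[int]]:
    """Return a size x size grid: 0 inside the circle, 255 (masked out) outside."""
    center = size // 2
    return [
        [
            255 if (row - center) ** 2 + (col - center) ** 2 > radius ** 2 else 0
            for col in range(size)
        ]
        for row in range(size)
    ]


mask = circular_mask(radius=130)
print(mask[150][150], mask[0][0])  # centre cell, then a corner cell
```

This also suggests why values between 100 and 300 are recommended: on a grid of this size, a much smaller radius leaves little room for words.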

Returns:

Type Description
object

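A WordCloud object.

Notes

For a full list of options, see https://amueller.github.io/word_cloud/generated/wordcloud.WordCloud.html#wordcloud-wordcloud. The returned WordCloud object can be manipulated by any of its methods, such as to_file(); see the WordCloud documentation for a list of methods.
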
Source code in lexos\visualization\plotly\cloud\wordcloud.py
def make_wordcloud(
    data: Union[dict, list, object, str, tuple], opts: dict = None, round: int = None
) -> object:
    """Make a word cloud.

    Accepts data from a string, list of lists or tuples, a dict with
    terms as keys and counts/frequencies as values, or a dataframe.

    Args:
        data (Union[dict, list, object, str, tuple]): The data. Accepts a text string, a list of lists or tuples,
            a dict with the terms as keys and the counts/frequencies as values, or a dataframe with "terms" and
            "count" or "frequency" columns.
        opts (dict): The WordCloud() options.
            For testing, try {"background_color": "white", "max_words": 2000, "contour_width": 3, "contour_color": "steelblue"}
        round (int): An integer (generally between 100-300) to apply a mask that rounds the word cloud.

    Returns:
        word cloud (object): A WordCloud object.

    Notes:
        - For a full list of options, see https://amueller.github.io/word_cloud/generated/wordcloud.WordCloud.html#wordcloud-wordcloud.
        - The returned WordCloud object can be manipulated by any of its methods, such as `to_file()`. See the
            WordCloud documentation for a list of methods.
    """
    # Fall back to WordCloud's defaults when no options are given;
    # unpacking None with ** would raise a TypeError.
    if opts is None:
        opts = {}
    # Note: `round` is accepted but is not used anywhere in this body.
    if isinstance(data, str):
        wordcloud = WordCloud(**opts).generate_from_text(data)
    else:
        # Convert a list of (term, count) pairs to a dict
        if isinstance(data, list):
            data = {x[0]: x[1] for x in data}
        # Convert dataframe rows to a dict, preferring "count" over "frequency"
        elif isinstance(data, pd.DataFrame):
            term_counts = data.to_dict(orient="records")
            try:
                data = {x["terms"]: x["count"] for x in term_counts}
            except KeyError:
                data = {x["terms"]: x["frequency"] for x in term_counts}
        wordcloud = WordCloud(**opts).generate_from_frequencies(data)
    return wordcloud

lexos.visualization.plotly.cloud.wordcloud.plot(dtm, docs=None, opts=None, layout=None, show=True)

Convert a Python word cloud to a Plotly word cloud.

This is some prototype code for generating word clouds in Plotly. Based on https://github.com/PrashantSaikia/Wordcloud-in-Plotly.

This is really a case study, because Plotly is not well suited to word clouds. One limitation is that WordCloud.layout_ always returns None for orientation and returns relative frequencies rather than counts, which limits the options for replicating its output.

Run with:

from lexos.visualization.plotly.cloud.wordcloud import plot
plot(dtm)

or

wc = plot(dtm, show=False)
wc.show()
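
Each entry of `WordCloud.layout_` is a tuple of `((word, frequency), font_size, position, orientation, color)`. The sketch below mirrors how `plot` turns those entries into coordinates and hover labels; the two sample entries are made up for illustration, and no wordcloud or Plotly import is needed to follow it.

```python
# Made-up entries in the shape of WordCloud.layout_.
layout_entries = [
    (("whale", 1.0), 48, (10, 20), None, "rgb(68, 1, 84)"),
    (("sea", 0.25), 24, (60, 15), None, "rgb(33, 145, 140)"),
]

words, hover, x, y = [], [], [], []
for (word, freq), font_size, position, orientation, color in layout_entries:
    words.append(word)
    # As in the source, the first position element is used as x.
    x.append(position[0])
    y.append(position[1])
    # The relative frequency is shown as a percentage in the hover label.
    hover.append(f"{word}: {round(freq * 100, 2)}%")

print(words, x, y)
print(hover)
```

Because orientation is always `None` here, only position, font size, and color carry over to the Plotly text trace.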

Parameters:

Name Type Description Default
dtm object

A lexos.DTM object.

required
docs List[str]

A list of document names to use.

None
opts dict

A dict of options to pass to WordCloud.

None
layout dict

A dict of options to pass to Plotly.

None
show bool

Whether to show the plot.

True
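
Keys in `layout` override the function's built-in layout defaults one top-level key at a time, so passing a nested key such as `margin` replaces the whole default margin dict rather than merging into it. A small sketch of that behaviour, with a subset of the defaults copied from the source listing:

```python
defaults = {
    "autosize": False,
    "width": 750,
    "height": 750,
    "margin": {"l": 50, "r": 50, "b": 100, "t": 100, "pad": 4},
}
overrides = {"width": 500, "margin": {"t": 20}}

# Equivalent to the key-by-key loop in plot(): later keys win, top level only.
merged = {**defaults, **overrides}

print(merged["width"])
print(merged["margin"])
```

To change a single margin value, pass the full margin dict with the one value adjusted.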

Returns:

Type Description
go.Figure

A Plotly figure.

Source code in lexos\visualization\plotly\cloud\wordcloud.py
def plot(
    dtm: object,
    docs: List[str] = None,
    opts: dict = None,
    layout: dict = None,
    show: bool = True,
) -> go.Figure:
    """Convert a Python word cloud to a Plotly word cloud.

    This is some prototype code for generating word clouds in Plotly.
    Based on https://github.com/PrashantSaikia/Wordcloud-in-Plotly.

    This is really a case study because Plotly does not do good
    word clouds. One of the limitations is that `WordCloud.layout_`
    always returns `None` for orientation and frequencies for
    counts. That limits the options for replicating its output.

    Run with:

    ```python
    from lexos.visualization.plotly.cloud.wordcloud import plot
    plot(dtm)

    # or

    wc = plot(dtm, show=False)
    wc.show()
    ```

    Args:
        dtm (object): A lexos.DTM object.
        docs (List[str]): A list of document names to use.
        opts (dict): A dict of options to pass to WordCloud.
        layout (dict): A dict of options to pass to Plotly.
        show (bool): Whether to show the plot.

    Returns:
        go.Figure: A Plotly figure.
    """
    word_list = []
    freq_list = []
    fontsize_list = []
    position_list = []
    orientation_list = []
    color_list = []
    layout_opts = {
        "xaxis": {"showgrid": False, "showticklabels": False, "zeroline": False},
        "yaxis": {"showgrid": False, "showticklabels": False, "zeroline": False},
        "autosize": False,
        "width": 750,
        "height": 750,
        "margin": {"l": 50, "r": 50, "b": 100, "t": 100, "pad": 4},
    }

    if layout:
        for k, v in layout.items():
            layout_opts[k] = v

    # Get the dtm table
    data = dtm.get_table()

    # Get the counts for the desired documents
    if docs:
        docs = ["terms"] + docs
        data = data[docs].copy()
        # Create a new column with the total for each row,
        # skipping the non-numeric "terms" column
        data["count"] = data.sum(axis=1, numeric_only=True)
    # Get the dtm sums
    else:
        data["count"] = data.sum(axis=1, numeric_only=True)

    # Ensure that the table only has terms and counts
    data = data[["terms", "count"]].copy()

    # Create the word cloud
    if opts is None:
        opts = {}
    wc = make_wordcloud(data, opts)

    # Plot the word cloud
    for (word, freq), fontsize, position, orientation, color in wc.layout_:
        word_list.append(word)
        freq_list.append(freq)
        fontsize_list.append(fontsize)
        position_list.append(position)
        orientation_list.append(orientation)
        color_list.append(color)

    # Get the positions
    x = []
    y = []
    for i in position_list:
        x.append(i[0])
        y.append(i[1])

    # Get the relative occurrence frequencies as percentages
    new_freq_list = [f"{round(i * 100, 2)}%" for i in freq_list]
    trace = go.Scatter(
        x=x,
        y=y,
        textfont=dict(size=fontsize_list, color=color_list),
        hoverinfo="text",
        hovertext=[f"{w}: {f}" for w, f in zip(word_list, new_freq_list)],
        mode="text",
        text=word_list,
    )

    # Set the layout and create the figure
    layout = go.Layout(layout_opts)
    fig = go.Figure(data=[trace], layout=layout)

    # Show the plot if requested; the figure is returned either way
    if show:
        fig.show()
    return fig

Plotly Clustermaps

lexos.visualization.plotly.cluster.clustermap.PlotlyClustermap

PlotlyClustermap.

Source code in lexos\visualization\plotly\cluster\clustermap.py
class PlotlyClustermap():
    """PlotlyClustermap."""

    def __init__(self,
                 dtm: Any,
                 metric: str = "euclidean",
                 method: str = "average",
                 hide_upper: bool = False,
                 hide_side: bool = False,
                 colorscale: str = "Viridis",
                 width: int = 600,
                 height: int = 600,
                 title: str = None,
                 config: dict = dict(
                     displaylogo=False,
                     modeBarButtonsToRemove=[
                        "toImage",
                        "toggleSpikelines"
                    ],
                    scrollZoom=True
                 ),
                 show: bool = False):
        """Initialise the Clustermap.

        Args:
            dtm (Any): The document-term-matrix
        """
        self.dtm = dtm
        table = dtm.get_table()
        self.labels = table.columns.values.tolist()[1:]
        self.df = table.set_index("terms").T
        self.metric = metric
        self.method = method
        self.hide_upper = hide_upper
        self.hide_side = hide_side
        self.colorscale = colorscale
        self.width = width
        self.height = height
        self.config = config
        self.title = title
        self.show = show
        self.build()

    def build(self) -> Any:
        """Build a clustermap."""
        # Set the distance and linkage metrics
        def distfun(x):
            """Get the pairwise distance matrix.

            Args:
                x (Any): The distance matrix.

            Returns:
                Any: The pairwise distance matrix.
            """
            return pdist(x, metric=self.metric)

        def linkagefun(x):
            """Get the hierarchical clustering encoded as a linkage matrix.

            Args:
                x (Any): The pairwise distance matrix.

            Returns:
                Any: The linkage matrix.
            """
            return sch.linkage(x, self.method)

        # Initialize figure by creating upper dendrogram
        fig = create_dendrogram(self.df,
                                distfun=distfun,
                                linkagefun=linkagefun,
                                orientation="bottom",
                                labels=self.labels,
                                colorscale=self._get_colorscale(),
                                color_threshold=None)
        for i in range(len(fig["data"])):
            fig["data"][i]["yaxis"] = "y2"

        # Renders the upper dendrogram invisible
        # Also removes the labels, so you have to rely on hovertext
        if self.hide_upper:
            fig.for_each_trace(lambda trace: trace.update(visible=False))

        # Create Side Dendrogram
        dendro_side = create_dendrogram(self.df,
                                        distfun=distfun,
                                        linkagefun=linkagefun,
                                        orientation="right",
                                        colorscale=self._get_colorscale(),
                                        color_threshold=None)
        for i in range(len(dendro_side["data"])):
            dendro_side["data"][i]["xaxis"] = "x2"

        # Add Side Dendrogram Data to Figure
        if not self.hide_side:
            for data in dendro_side["data"]:
                fig.add_trace(data)

        # Create Heatmap
        dendro_leaves = dendro_side["layout"]["yaxis"]["ticktext"]
        dendro_leaves = list(map(int, dendro_leaves))
        data_dist = pdist(self.df)
        heat_data = squareform(data_dist)
        heat_data = heat_data[dendro_leaves, :]
        heat_data = heat_data[:, dendro_leaves]

        num = len(self.labels)
        heatmap = [
            go.Heatmap(
                x=dendro_leaves,
                y=dendro_leaves,
                z=heat_data,
                colorscale=self.colorscale,
                hovertemplate="X: %{x}<br>Y: %{customdata}<br>Z: %{z}<extra></extra>",
                customdata=[[label for x in range(num)] for label in self.labels]
            )
        ]

        heatmap[0]["x"] = fig["layout"]["xaxis"]["tickvals"]
        heatmap[0]["y"] = dendro_side["layout"]["yaxis"]["tickvals"]

        # Add Heatmap Data to Figure
        for data in heatmap:
            fig.add_trace(data)

        # Edit Layout
        fig.update_layout({"width": self.width, "height": self.height,
                           "showlegend": False, "hovermode": "closest",
                           })

        # Edit xaxis (dendrogram)
        if not self.hide_side:
            x = .15
        else:
            x = 0
        fig.update_layout(xaxis={"domain": [x, 1],
                                 "mirror": False,
                                 "showgrid": False,
                                 "showline": False,
                                 "zeroline": False,
                                 "ticks": ""})
        # Edit xaxis2 (heatmap)
        fig.update_layout(xaxis2={"domain": [0, .15],
                                  "mirror": False,
                                  "showgrid": False,
                                  "showline": False,
                                  "zeroline": False,
                                  "showticklabels": False,
                                  "ticks": ""})

        # Edit yaxis (heatmap)
        fig.update_layout(yaxis={"domain": [0, .85],
                                 "mirror": False,
                                 "showgrid": False,
                                 "showline": False,
                                 "zeroline": False,
                                 "showticklabels": False,
                                 "ticks": "",
                                 })
        # Edit yaxis2 (dendrogram)
        fig.update_layout(yaxis2={"domain": [.840, .975],
                                  "mirror": False,
                                  "showgrid": False,
                                  "showline": False,
                                  "zeroline": False,
                                  "showticklabels": False,
                                  "ticks": ""})

        fig.update_layout(margin=dict(l=0),
                          paper_bgcolor="rgba(0,0,0,0)",
                          plot_bgcolor="rgba(0,0,0,0)",
                          xaxis_tickfont=dict(color="rgba(0,0,0,0)"))

        # Set the title
        if self.title:
            title = dict(
                text=self.title,
                x=0.5,
                y=0.95,
                xanchor="center",
                yanchor="top"
            )
            fig.update_layout(
                title=title,
                margin=dict(t=40)
            )

        # Save the figure variable
        self.fig = fig

        # Show the plot
        if self.show:
            self.fig.show(config=self.config)

    def _get_colorscale(self) -> list:
        """Get the colorscale as a list.

        Plotly continuous colorscales assign colors to the range [0, 1]. This function
        computes the intermediate color for any value in that range.

        Plotly doesn't make the colorscales directly accessible in a common format.
        Some are ready to use, and others are just swatches that need to be constructed
        into a colorscale.
        """
        try:
            colorscale = plotly.colors.PLOTLY_SCALES[self.colorscale]
        # A name missing from the PLOTLY_SCALES dict raises KeyError
        except KeyError:
            swatch = getattr(plotly.colors.sequential, self.colorscale)
            colors, scale = plotly.colors.convert_colors_to_same_type(swatch)
            colorscale = plotly.colors.make_colorscale(colors, scale=scale)
        return colorscale

    def savefig(self, filename: str):
        """Save the figure.

        Args:
            filename (str): The name of the file to save.
        """
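        # Plotly figures have no savefig() method; write_image() needs an
        # image export engine such as kaleido to be installed
        self.fig.write_image(filename)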

    def showfig(self):
        """Show the figure."""
        self.fig.show(config=self.config)

    def to_html(self,
                show_link: bool = False,
                output_type: str = "div",
                include_plotlyjs: bool = False,
                filename: str = None,
                auto_open: bool = False,
                config: dict = None):
        """Convert the figure to HTML.

        Args:
            show_link (bool): For exporting to Plotly cloud services. Default is `False`.
            output_type (str): If `file`, then the graph is saved as a standalone HTML
                file and plot returns None. If `div`, then plot returns a string that
                just contains the HTML <div> that contains the graph and the script to
                generate the graph. Use `file` if you want to save and view a single
                graph at a time in a standalone HTML file. Use `div` if you are embedding
                these graphs in an existing HTML file. Default is `div`.
            include_plotlyjs (bool): If True, include the plotly.js source code in the
                output file or string, which is good for standalone web pages but makes
                for very large files. If you are embedding the graph in a webpage, it
                is better to import the plotly.js library and use a `div`. Default is `False`.
            filename (str): The local filename to save the outputted chart to. If the
                filename already exists, it will be overwritten. This argument only applies
                if output_type is `file`. The default is `temp-plot.html`.
            auto_open (bool): If True, open the saved file in a web browser after saving.
                This argument only applies if output_type is `file`. Default is `False`.
            config (dict): A dict of parameters in the object's configuration.

        Note:
            This method uses `plotly.offline.plot`, which no longer appears to be documented.
            It has been replaced by renderers: https://plotly.com/python/renderers/. However,
            there does not appear to be an HTML renderer, so no attempt has been made to
            use the new functionality.
        """
        # Fall back to the instance configuration only when none is passed
        if config is None:
            config = self.config

        if filename and output_type == "file":
            return _plot(
                self.fig,
                show_link=show_link,
                output_type="file",
                include_plotlyjs=include_plotlyjs,
                filename=filename,
                auto_open=auto_open,
                config=config
            )
        elif filename and output_type == "div":
            pl = _plot(
                self.fig,
                show_link=show_link,
                output_type="div",
                include_plotlyjs=include_plotlyjs,
                auto_open=auto_open,
                config=config
            )
            with open(filename, "w") as f:
                f.write(pl)
            return pl
        else:
            return _plot(
                self.fig,
                show_link=show_link,
                output_type="div",
                include_plotlyjs=include_plotlyjs,
                auto_open=auto_open,
                config=config
            )

__init__(dtm, metric='euclidean', method='average', hide_upper=False, hide_side=False, colorscale='Viridis', width=600, height=600, title=None, config=dict(displaylogo=False, modeBarButtonsToRemove=['toImage', 'toggleSpikelines'], scrollZoom=True), show=False)

Initialise the Clustermap.
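
The constructor reads the DTM table, takes every column after `terms` as a document label, and transposes the table so that each document becomes one row of term counts. A standard-library sketch of that reshaping, with a made-up three-term table standing in for `dtm.get_table()`:

```python
# Made-up table: one row per term, one column per document.
columns = ["terms", "doc_a", "doc_b"]
rows = [
    ["whale", 4, 0],
    ["sea", 2, 3],
    ["ship", 1, 5],
]

# Document labels are all columns except the first ("terms").
labels = columns[1:]

# Transpose: one vector of term counts per document.
vectors = {
    label: [row[i] for row in rows]
    for i, label in enumerate(columns)
    if label != "terms"
}

print(labels)
print(vectors)
```

These per-document vectors are what the distance and linkage functions operate on when the clustermap is built.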

Parameters:

Name Type Description Default
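dtm Any

The document-term matrix.

required
metric str

The distance metric passed to pdist.

'euclidean'
method str

The linkage method passed to sch.linkage.

'average'
hide_upper bool

Whether to hide the upper dendrogram.

False
hide_side bool

Whether to hide the side dendrogram.

False
colorscale str

The name of the Plotly colorscale to use.

'Viridis'
width int

The width of the figure in pixels.

600
height int

The height of the figure in pixels.

600
title str

The title of the figure.

None
config dict

The configuration passed to the figure's show() method.

dict(displaylogo=False, modeBarButtonsToRemove=['toImage', 'toggleSpikelines'], scrollZoom=True)
show bool

Whether to show the figure after it is built.

False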
Source code in lexos\visualization\plotly\cluster\clustermap.py
def __init__(self,
             dtm: Any,
             metric: str = "euclidean",
             method: str = "average",
             hide_upper: bool = False,
             hide_side: bool = False,
             colorscale: str = "Viridis",
             width: int = 600,
             height: int = 600,
             title: str = None,
             config: dict = dict(
                 displaylogo=False,
                 modeBarButtonsToRemove=[
                    "toImage",
                    "toggleSpikelines"
                ],
                scrollZoom=True
             ),
             show: bool = False):
    """Initialise the Clustermap.

    Args:
        dtm (Any): The document-term-matrix
    """
    self.dtm = dtm
    table = dtm.get_table()
    self.labels = table.columns.values.tolist()[1:]
    self.df = table.set_index("terms").T
    self.metric = metric
    self.method = method
    self.hide_upper = hide_upper
    self.hide_side = hide_side
    self.colorscale = colorscale
    self.width = width
    self.height = height
    self.config = config
    self.title = title
    self.show = show
    self.build()

build()

Build a clustermap.
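
The heatmap in `build` is a square matrix of pairwise distances between documents, with rows and columns reordered to follow the leaf order of the dendrogram. A minimal pure-Python sketch of those two steps; the vectors and the leaf order are made up, and `math.dist` stands in for SciPy's `pdist` with the Euclidean metric:

```python
import math

vectors = [[4, 2, 1], [0, 3, 5], [4, 2, 2]]  # one row per document

# Square pairwise distance matrix (the shape that squareform(pdist(...)) yields).
dist = [[math.dist(a, b) for b in vectors] for a in vectors]

# Hypothetical leaf order from a dendrogram: documents 0 and 2 are closest.
leaves = [1, 0, 2]

# Reorder rows, then columns, to match the dendrogram.
heat = [[dist[i][j] for j in leaves] for i in leaves]

print(heat[1][2])
```

Reordering both axes with the same leaf list keeps the matrix symmetric, so similar documents end up in adjacent cells near the diagonal.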

Source code in lexos\visualization\plotly\cluster\clustermap.py
def build(self) -> Any:
    """Build a clustermap."""
    # Set the distance and linkage metrics
    def distfun(x):
        """Get the pairwise distance matrix.

        Args:
            x (Any): The distance matrix.

        Returns:
            Any: The pairwise distance matrix.
        """
        return pdist(x, metric=self.metric)

    def linkagefun(x):
        """Get the hierarchical clustering encoded as a linkage matrix.

        Args:
            x (Any): The pairwise distance matrix.

        Returns:
            Any: The linkage matrix.
        """
        return sch.linkage(x, self.method)

    # Initialize figure by creating upper dendrogram
    fig = create_dendrogram(self.df,
                            distfun=distfun,
                            linkagefun=linkagefun,
                            orientation="bottom",
                            labels=self.labels,
                            colorscale=self._get_colorscale(),
                            color_threshold=None)
    for i in range(len(fig["data"])):
        fig["data"][i]["yaxis"] = "y2"

    # Renders the upper dendrogram invisible
    # Also removes the labels, so you have to rely on hovertext
    if self.hide_upper:
        fig.for_each_trace(lambda trace: trace.update(visible=False))

    # Create Side Dendrogram
    dendro_side = create_dendrogram(self.df,
                                    distfun=distfun,
                                    linkagefun=linkagefun,
                                    orientation="right",
                                    colorscale=self._get_colorscale(),
                                    color_threshold=None)
    for i in range(len(dendro_side["data"])):
        dendro_side["data"][i]["xaxis"] = "x2"

    # Add Side Dendrogram Data to Figure
    if not self.hide_side:
        for data in dendro_side["data"]:
            fig.add_trace(data)

    # Create Heatmap
    dendro_leaves = dendro_side["layout"]["yaxis"]["ticktext"]
    dendro_leaves = list(map(int, dendro_leaves))
    data_dist = pdist(self.df)
    heat_data = squareform(data_dist)
    heat_data = heat_data[dendro_leaves, :]
    heat_data = heat_data[:, dendro_leaves]

    num = len(self.labels)
    heatmap = [
        go.Heatmap(
            x=dendro_leaves,
            y=dendro_leaves,
            z=heat_data,
            colorscale=self.colorscale,
            hovertemplate="X: %{x}<br>Y: %{customdata}<br>Z: %{z}<extra></extra>",
            customdata=[[label for x in range(num)] for label in self.labels]
        )
    ]

    heatmap[0]["x"] = fig["layout"]["xaxis"]["tickvals"]
    heatmap[0]["y"] = dendro_side["layout"]["yaxis"]["tickvals"]

    # Add Heatmap Data to Figure
    for data in heatmap:
        fig.add_trace(data)

    # Edit Layout
    fig.update_layout({"width": self.width, "height": self.height,
                       "showlegend": False, "hovermode": "closest",
                       })

    # Edit xaxis (dendrogram)
    if not self.hide_side:
        x = .15
    else:
        x = 0
    fig.update_layout(xaxis={"domain": [x, 1],
                             "mirror": False,
                             "showgrid": False,
                             "showline": False,
                             "zeroline": False,
                             "ticks": ""})
    # Edit xaxis2 (heatmap)
    fig.update_layout(xaxis2={"domain": [0, .15],
                              "mirror": False,
                              "showgrid": False,
                              "showline": False,
                              "zeroline": False,
                              "showticklabels": False,
                              "ticks": ""})

    # Edit yaxis (heatmap)
    fig.update_layout(yaxis={"domain": [0, .85],
                             "mirror": False,
                             "showgrid": False,
                             "showline": False,
                             "zeroline": False,
                             "showticklabels": False,
                             "ticks": "",
                             })
    # Edit yaxis2 (dendrogram)
    fig.update_layout(yaxis2={"domain": [.840, .975],
                              "mirror": False,
                              "showgrid": False,
                              "showline": False,
                              "zeroline": False,
                              "showticklabels": False,
                              "ticks": ""})

    fig.update_layout(margin=dict(l=0),
                      paper_bgcolor="rgba(0,0,0,0)",
                      plot_bgcolor="rgba(0,0,0,0)",
                      xaxis_tickfont=dict(color="rgba(0,0,0,0)"))

    # Set the title
    if self.title:
        title = dict(
            text=self.title,
            x=0.5,
            y=0.95,
            xanchor="center",
            yanchor="top"
        )
        fig.update_layout(
            title=title,
            margin=dict(t=40)
        )

    # Save the figure variable
    self.fig = fig

    # Show the plot
    if self.show:
        self.fig.show(config=self.config)

savefig(filename)

Save the figure.

Parameters:

Name Type Description Default
filename str

The name of the file to save.

required
Source code in lexos\visualization\plotly\cluster\clustermap.py
def savefig(self, filename: str):
    """Save the figure.

    Args:
        filename (str): The name of the file to save.
    """
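    # Plotly figures have no savefig() method; write_image() needs an
    # image export engine such as kaleido to be installed
    self.fig.write_image(filename)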

showfig()

Show the figure.

Source code in lexos\visualization\plotly\cluster\clustermap.py
def showfig(self):
    """Show the figure."""
    self.fig.show(config=self.config)

to_html(show_link=False, output_type='div', include_plotlyjs=False, filename=None, auto_open=False, config=None)

Convert the figure to HTML.

Parameters:

Name Type Description Default
show_link bool

For exporting to Plotly cloud services. Default is False.

False
output_type str

If file, then the graph is saved as a standalone HTML file and plot returns None. If div, then plot returns a string that just contains the HTML <div> that contains the graph and the script to generate the graph. Use file if you want to save and view a single graph at a time in a standalone HTML file. Use div if you are embedding these graphs in an existing HTML file. Default is div.

'div'
include_plotlyjs bool

If True, include the plotly.js source code in the output file or string, which is good for standalone web pages but makes for very large files. If you are embedding the graph in a webpage, it is better to import the plotly.js library and use a div. Default is False.

False
filename str

The local filename to save the outputted chart to. If the filename already exists, it will be overwritten. This argument only applies if output_type is file. The default is temp-plot.html.

None
auto_open bool

If True, open the saved file in a web browser after saving. This argument only applies if output_type is file. Default is False.

False
config dict

A dict of parameters in the object's configuration.

None
Note

This method uses plotly.offline.plot, which no longer appears to be documented. It has been replaced by renderers: https://plotly.com/python/renderers/. However, there does not appear to be an HTML renderer, so no attempt has been made to use the new functionality.
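
With the default `output_type` of `div`, the method returns an HTML fragment rather than a full page, so the host page must load plotly.js itself. A sketch of that embedding using only string templating; `embed_div`, the sample fragment, and the script path are hypothetical placeholders:

```python
from string import Template

PAGE = Template(
    "<!DOCTYPE html>\n"
    "<html>\n"
    "<head><script src=\"$plotlyjs_src\"></script></head>\n"
    "<body>\n$div\n</body>\n"
    "</html>\n"
)


def embed_div(div_html: str, plotlyjs_src: str) -> str:
    """Wrap a Plotly <div> fragment in a page that loads plotly.js separately."""
    return PAGE.substitute(div=div_html, plotlyjs_src=plotlyjs_src)


fragment = '<div id="clustermap"></div>'  # stand-in for clustermap.to_html()
page = embed_div(fragment, "js/plotly.min.js")
print(page)
```

Loading the library once per page keeps each embedded fragment small, which is the trade-off the `include_plotlyjs` description refers to.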

Source code in lexos\visualization\plotly\cluster\clustermap.py
def to_html(self,
            show_link: bool = False,
            output_type: str = "div",
            include_plotlyjs: bool = False,
            filename: str = None,
            auto_open: bool = False,
            config: dict = None):
    """Convert the figure to HTML.

    Args:
        show_link (bool): For exporting to Plotly cloud services. Default is `False`.
        output_type (str): If `file`, then the graph is saved as a standalone HTML
            file and plot returns None. If `div`, then plot returns a string that
            just contains the HTML <div> that contains the graph and the script to
            generate the graph. Use `file` if you want to save and view a single
            graph at a time in a standalone HTML file. Use `div` if you are embedding
            these graphs in an existing HTML file. Default is `div`.
        include_plotlyjs (bool): If True, include the plotly.js source code in the
            output file or string, which is good for standalone web pages but makes
            for very large files. If you are embedding the graph in a webpage, it
            is better to import the plotly.js library and use a `div`. Default is `False`.
        filename (str): The local filename to save the outputted chart to. If the
            filename already exists, it will be overwritten. This argument only applies
            if output_type is `file`. The default is `temp-plot.html`.
        auto_open (bool): If True, open the saved file in a web browser after saving.
            This argument only applies if output_type is `file`. Default is `False`.
        config (dict): A dict of parameters in the object's configuration.

    Note:
        This method uses `plotly.offline.plot`, which no longer appears to be documented.
        It has been replaced by renderers: https://plotly.com/python/renderers/. However,
        there does not appear to be an HTML renderer, so no attempt has been made to
        use the new functionality.
    """
    if self.config:
        config = self.config

    if filename and output_type == "file":
        return _plot(
            self.fig,
            show_link=show_link,
            output_type="file",
            include_plotlyjs=include_plotlyjs,
            filename=filename,
            auto_open=auto_open,
            config=config
        )
    elif filename and output_type == "div":
        pl = _plot(
            self.fig,
            show_link=show_link,
            output_type="div",
            include_plotlyjs=include_plotlyjs,
            auto_open=auto_open,
            config=config
        )
        with open(filename, "w") as f:
            f.write(pl)
        return pl
    else:
        return _plot(
            self.fig,
            show_link=show_link,
            output_type="div",
            include_plotlyjs=include_plotlyjs,
            auto_open=auto_open,
            config=config
        )
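
The three branches above can be hard to read at a glance, so here is a minimal standard-library sketch of the same dispatch. `fake_plot` and `to_html_dispatch` are hypothetical stand-ins, not part of Lexos or Plotly. The `None` returned for the file case follows the docstring's description and may differ from what the real plotting function returns. The point it illustrates is that, as the source is written, a call without a `filename` falls through to the final branch and yields a div string even when `output_type` is `file`.

```python
from pathlib import Path
import tempfile


def fake_plot(output_type, filename=None):
    """Hypothetical stand-in for `_plot`: write a page or return a div string."""
    if output_type == "file":
        Path(filename).write_text("<html>standalone page</html>", encoding="utf-8")
        return None  # modelled on the docstring, not verified against Plotly
    return "<div>graph</div>"


def to_html_dispatch(output_type="div", filename=None):
    """Mirror the three branches of `to_html` shown above."""
    if filename and output_type == "file":
        return fake_plot("file", filename)
    if filename and output_type == "div":
        div = fake_plot("div")
        # The div string is both written to disk and returned.
        Path(filename).write_text(div, encoding="utf-8")
        return div
    # No filename: always a div, whatever output_type says.
    return fake_plot("div")


with tempfile.TemporaryDirectory() as tmp:
    page = str(Path(tmp) / "page.html")
    snippet = str(Path(tmp) / "snippet.html")
    results = {
        "file + filename": to_html_dispatch("file", page),
        "div + filename": to_html_dispatch("div", snippet),
        "file, no filename": to_html_dispatch("file"),
    }
    snippet_on_disk = Path(snippet).read_text(encoding="utf-8")
```

In this sketch, `results["file, no filename"]` is the div string, and `snippet_on_disk` holds the same string that the `div` call returned.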

lexos.visualization.plotly.cluster.dendrogram.PlotlyDendrogram ¤

PlotlyDendrogram.

Typical usage:

from lexos.visualization.plotly.cluster.dendrogram import PlotlyDendrogram

dendrogram = PlotlyDendrogram(dtm, show=True)

or

dendrogram = PlotlyDendrogram(dtm)
dendrogram.fig


Returning the figure as a figure, as HTML, and as an HTML div
still needs some work.
Source code in lexos\visualization\plotly\cluster\dendrogram.py
class PlotlyDendrogram():
    """PlotlyDendrogram.

    Typical usage:

    ```python
    from lexos.visualization.plotly.cluster.dendrogram import PlotlyDendrogram

    dendrogram = PlotlyDendrogram(dtm, show=True)

    # or

    dendrogram = PlotlyDendrogram(dtm)
    dendrogram.fig
    ```

    Needs some work in returning the figure as a figure
    and html and html div.
    """

    def __init__(
        self,
        dtm: Any,
        labels: List[str] = None,
        metric: str = "euclidean",
        method: str = "average",
        truncate_mode: str = None,
        get_leaves: bool = True,
        orientation: str = "bottom",
        title: str = None,
        figsize: tuple = (10, 10),
        show: bool = False,
        colorscale: List = None,
        hovertext: List = None,
        color_threshold: float = None,
        config: dict = dict(
            displaylogo=False,
            modeBarButtonsToRemove=[
                "toImage",
                "toggleSpikelines"
            ],
            scrollZoom=True
        ),
        x_tickangle: int = 0,
        y_tickangle: int = 0,
        **layout
    ) -> None:
        """Initialise the Dendrogram."""
        # Store the constructor arguments
        self.dtm = dtm
        self.labels = labels
        self.metric = metric
        self.method = method
        self.truncate_mode = truncate_mode
        self.get_leaves = get_leaves
        self.orientation = orientation
        self.title = title
        self.figsize = figsize
        self.show = show
        self.colorscale = colorscale
        self.hovertext = hovertext
        self.color_threshold = color_threshold
        self.config = config
        self.x_tickangle = x_tickangle
        self.y_tickangle = y_tickangle
        self.layout = layout

        # Get the dtm table
        self.df = self.dtm.get_table()

        # Use default labels from the DTM table
        if self.labels is None:
            self.labels = self.df.columns.values.tolist()[1:]

        # Set "terms" as the index and transpose the table
        self.df = self.df.set_index("terms").T

        # Build the dendrogram
        self.build()

    def build(self):
        """Build a dendrogram."""
        # Set the distance and linkage metrics
        def distfun(x):
            """Get the pairwise distance matrix.

            Args:
                x (Any): The matrix of observations, one row per document.

            Returns:
                Any: The pairwise distance matrix.
            """
            return pdist(x, metric=self.metric)

        def linkagefun(x):
            """Get the hierarchical clustering encoded as a linkage matrix.

            Args:
                x (Any): The pairwise distance matrix.

            Returns:
                Any: The linkage matrix.
            """
            return sch.linkage(x, self.method)

        # Create the figure
        self.fig = create_dendrogram(self.df,
                                labels=self.labels,
                                distfun=distfun,
                                linkagefun=linkagefun,
                                orientation=self.orientation,
                                colorscale=self.colorscale,
                                hovertext=self.hovertext,
                                color_threshold=self.color_threshold
                                )

        # Set the standard layout
        self.fig.update_layout(
            margin=dict(l=0, r=0, b=0, t=0, pad=10),
            hovermode='x',
            paper_bgcolor="rgba(0, 0, 0, 0)",
            plot_bgcolor="rgba(0, 0, 0, 0)",
            xaxis=dict(showline=False, ticks="", tickangle=self.x_tickangle),
            yaxis=dict(showline=False, ticks="", tickangle=self.y_tickangle)
        )

        # Set the title
        if self.title is not None:
            title = dict(
                text=self.title,
                x=0.5,
                y=0.95,
                xanchor="center",
                yanchor="top"
            )
            self.fig.update_layout(
                title=title,
                margin=dict(t=40)
            )

        # Add user-configured layout
        self.fig.update_layout(**self.layout)

        # Extend figure hack
        max_label_len = len(max(self.labels, key=len))
        self.fig = _extend_figure(
            self.fig,
            self.orientation,
            max_label_len
        )

        if self.show:
            self.fig.show(config=self.config)

    def showfig(self):
        """Show the figure.

        Accessing `PlotlyDendrogram.fig` when `show` is set to
        `False` does not apply the config (there is no way to
        do this in Plotly). Calling `PlotlyDendrogram.showfig()`
        rebuilds the figure with the config applied.
        """
        self.show = True
        self.build()

    def to_html(self,
               show_link: bool = False,
               output_type: str = "div",
               include_plotlyjs: bool = False,
               filename: str = None,
               auto_open: bool = False,
               config: dict = None):
        """Convert the figure to HTML.
        Args:
            show_link (bool): For exporting to Plotly cloud services. Default is `False`.
            output_type (str): If `file`, then the graph is saved as a standalone HTML
                file and plot returns None. If `div`, then plot returns a string that
                just contains the HTML <div> that contains the graph and the script to
                generate the graph. Use `file` if you want to save and view a single
                graph at a time in a standalone HTML file. Use `div` if you are embedding
                these graphs in an existing HTML file. Default is `div`.
            include_plotlyjs (bool): If True, include the plotly.js source code in the
                output file or string, which is good for standalone web pages but makes
                for very large files. If you are embedding the graph in a webpage, it
                is better to import the plotly.js library and use a `div`. Default is `False`.
            filename (str): The local filename to save the outputted chart to. If the
                filename already exists, it will be overwritten. This argument only applies
                if output_type is `file`. The default is `temp-plot.html`.
            auto_open (bool): If True, open the saved file in a web browser after saving.
                This argument only applies if output_type is `file`. Default is `False`.
            config (dict): A dict of parameters in the object's configuration.

        Note:
            This method uses `plotly.offline.plot`, which no longer appears to be documented.
            It has been replaced by renderers: https://plotly.com/python/renderers/. However,
            there does not appear to be an HTML renderer, so no attempt has been made to
            use the new functionality.
        """
        if self.config:
            config = self.config

        if filename and output_type == "file":
            return _plot(
                self.fig,
                show_link=show_link,
                output_type="file",
                include_plotlyjs=include_plotlyjs,
                filename=filename,
                auto_open=auto_open,
                config=config
            )
        elif filename and output_type == "div":
            pl = _plot(
                self.fig,
                show_link=show_link,
                output_type="div",
                include_plotlyjs=include_plotlyjs,
                auto_open=auto_open,
                config=config
            )
            with open(filename, "w") as f:
                f.write(pl)
            return pl
        else:
            return _plot(
                self.fig,
                show_link=show_link,
                output_type="div",
                include_plotlyjs=include_plotlyjs,
                auto_open=auto_open,
                config=config
            )

__init__(dtm, labels=None, metric='euclidean', method='average', truncate_mode=None, get_leaves=True, orientation='bottom', title=None, figsize=(10, 10), show=False, colorscale=None, hovertext=None, color_threshold=None, config=dict(displaylogo=False, modeBarButtonsToRemove=['toImage', 'toggleSpikelines'], scrollZoom=True), x_tickangle=0, y_tickangle=0, **layout) ¤

Initialise the Dendrogram.

Source code in lexos\visualization\plotly\cluster\dendrogram.py
def __init__(
    self,
    dtm: Any,
    labels: List[str] = None,
    metric: str = "euclidean",
    method: str = "average",
    truncate_mode: str = None,
    get_leaves: bool = True,
    orientation: str = "bottom",
    title: str = None,
    figsize: tuple = (10, 10),
    show: bool = False,
    colorscale: List = None,
    hovertext: List = None,
    color_threshold: float = None,
    config: dict = dict(
        displaylogo=False,
        modeBarButtonsToRemove=[
            "toImage",
            "toggleSpikelines"
        ],
        scrollZoom=True
    ),
    x_tickangle: int = 0,
    y_tickangle: int = 0,
    **layout
) -> None:
    """Initialise the Dendrogram."""
    # Store the constructor arguments
    self.dtm = dtm
    self.labels = labels
    self.metric = metric
    self.method = method
    self.truncate_mode = truncate_mode
    self.get_leaves = get_leaves
    self.orientation = orientation
    self.title = title
    self.figsize = figsize
    self.show = show
    self.colorscale = colorscale
    self.hovertext = hovertext
    self.color_threshold = color_threshold
    self.config = config
    self.x_tickangle = x_tickangle
    self.y_tickangle = y_tickangle
    self.layout = layout

    # Get the dtm table
    self.df = self.dtm.get_table()

    # Use default labels from the DTM table
    if self.labels is None:
        self.labels = self.df.columns.values.tolist()[1:]

    # Set "terms" as the index and transpose the table
    self.df = self.df.set_index("terms").T

    # Build the dendrogram
    self.build()
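
The reshaping step at the end of `__init__` is easier to follow with concrete data. The sketch below uses plain Python lists rather than a dataframe. The table layout (a leading "terms" column followed by one column per document) is an assumption inferred from the `columns[1:]` slice and the `set_index("terms")` call, and the sample values are invented for illustration.

```python
# Hypothetical term-by-document table: a "terms" column, then one column per document.
columns = ["terms", "doc1", "doc2", "doc3"]
rows = [
    ["apple", 2, 0, 1],
    ["pear", 0, 3, 1],
]

# Default labels: every column name except the leading "terms" column.
labels = columns[1:]

# Transpose so that each document becomes one vector of term counts,
# analogous to `df.set_index("terms").T`.
terms = [row[0] for row in rows]
doc_vectors = {
    label: [row[i + 1] for row in rows]
    for i, label in enumerate(labels)
}

print(labels)       # ['doc1', 'doc2', 'doc3']
print(doc_vectors)  # {'doc1': [2, 0], 'doc2': [0, 3], 'doc3': [1, 1]}
```

After the transpose, each label lines up with exactly one row of counts, which is the shape a clustering routine needs when documents are the items being clustered.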

build() ¤

Build a dendrogram.

Source code in lexos\visualization\plotly\cluster\dendrogram.py
def build(self):
    """Build a dendrogram."""
    # Set the distance and linkage metrics
    def distfun(x):
        """Get the pairwise distance matrix.

        Args:
            x (Any): The matrix of observations, one row per document.

        Returns:
            Any: The pairwise distance matrix.
        """
        return pdist(x, metric=self.metric)

    def linkagefun(x):
        """Get the hierarchical clustering encoded as a linkage matrix.

        Args:
            x (Any): The pairwise distance matrix.

        Returns:
            Any: The linkage matrix.
        """
        return sch.linkage(x, self.method)

    # Create the figure
    self.fig = create_dendrogram(self.df,
                            labels=self.labels,
                            distfun=distfun,
                            linkagefun=linkagefun,
                            orientation=self.orientation,
                            colorscale=self.colorscale,
                            hovertext=self.hovertext,
                            color_threshold=self.color_threshold
                            )

    # Set the standard layout
    self.fig.update_layout(
        margin=dict(l=0, r=0, b=0, t=0, pad=10),
        hovermode='x',
        paper_bgcolor="rgba(0, 0, 0, 0)",
        plot_bgcolor="rgba(0, 0, 0, 0)",
        xaxis=dict(showline=False, ticks="", tickangle=self.x_tickangle),
        yaxis=dict(showline=False, ticks="", tickangle=self.y_tickangle)
    )

    # Set the title
    if self.title is not None:
        title = dict(
            text=self.title,
            x=0.5,
            y=0.95,
            xanchor="center",
            yanchor="top"
        )
        self.fig.update_layout(
            title=title,
            margin=dict(t=40)
        )

    # Add user-configured layout
    self.fig.update_layout(**self.layout)

    # Extend figure hack
    max_label_len = len(max(self.labels, key=len))
    self.fig = _extend_figure(
        self.fig,
        self.orientation,
        max_label_len
    )

    if self.show:
        self.fig.show(config=self.config)
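
To make the role of `distfun` concrete, the following standard-library sketch computes what a pairwise-distance function returns for the default `euclidean` metric: one distance per pair of rows, listed in condensed order. `condensed_euclidean` is a hypothetical helper written for illustration and is not the SciPy implementation. The sample vectors are invented.

```python
from itertools import combinations
from math import dist


def condensed_euclidean(rows):
    """Return pairwise Euclidean distances in condensed order:
    (0, 1), (0, 2), ..., (n-2, n-1)."""
    return [dist(a, b) for a, b in combinations(rows, 2)]


doc_vectors = [
    [0.0, 0.0],  # document 0
    [3.0, 4.0],  # document 1
    [6.0, 8.0],  # document 2
]
distances = condensed_euclidean(doc_vectors)
```

For these three documents the distances are 5, 10, and 5, for the pairs (0, 1), (0, 2), and (1, 2). The linkage function then consumes this condensed vector to decide which documents or clusters to merge first.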

showfig() ¤

Show the figure.

Accessing PlotlyDendrogram.fig when show is set to False does not apply the config (there is no way to do this in Plotly). Calling PlotlyDendrogram.showfig() rebuilds the figure with the config applied.

Source code in lexos\visualization\plotly\cluster\dendrogram.py
def showfig(self):
    """Show the figure.

    Accessing `PlotlyDendrogram.fig` when `show` is set to
    `False` does not apply the config (there is no way to
    do this in Plotly). Calling `PlotlyDendrogram.showfig()`
    rebuilds the figure with the config applied.
    """
    self.show = True
    self.build()

to_html(show_link=False, output_type='div', include_plotlyjs=False, filename=None, auto_open=False, config=None) ¤

Convert the figure to HTML.

Parameters:

Name Type Description Default
show_link bool

For exporting to Plotly cloud services. Default is False.

False
output_type str

If file, then the graph is saved as a standalone HTML file and plot returns None. If div, then plot returns a string that just contains the HTML <div> that contains the graph and the script to generate the graph. Use file if you want to save and view a single graph at a time in a standalone HTML file. Use div if you are embedding these graphs in an existing HTML file. Default is div.

'div'
include_plotlyjs bool

If True, include the plotly.js source code in the output file or string, which is good for standalone web pages but makes for very large files. If you are embedding the graph in a webpage, it is better to import the plotly.js library and use a div. Default is False.

False
filename str

The local filename to save the outputted chart to. If the filename already exists, it will be overwritten. This argument only applies if output_type is file. The default is temp-plot.html.

None
auto_open bool

If True, open the saved file in a web browser after saving. This argument only applies if output_type is file. Default is False.

False
config dict

A dict of parameters in the object's configuration.

None
Note

This method uses plotly.offline.plot, which no longer appears to be documented. It has been replaced by renderers: https://plotly.com/python/renderers/. However, there does not appear to be an HTML renderer, so no attempt has been made to use the new functionality.

Source code in lexos\visualization\plotly\cluster\dendrogram.py
def to_html(self,
           show_link: bool = False,
           output_type: str = "div",
           include_plotlyjs: bool = False,
           filename: str = None,
           auto_open: bool = False,
           config: dict = None):
    """Convert the figure to HTML.
    Args:
        show_link (bool): For exporting to Plotly cloud services. Default is `False`.
        output_type (str): If `file`, then the graph is saved as a standalone HTML
            file and plot returns None. If `div`, then plot returns a string that
            just contains the HTML <div> that contains the graph and the script to
            generate the graph. Use `file` if you want to save and view a single
            graph at a time in a standalone HTML file. Use `div` if you are embedding
            these graphs in an existing HTML file. Default is `div`.
        include_plotlyjs (bool): If True, include the plotly.js source code in the
            output file or string, which is good for standalone web pages but makes
            for very large files. If you are embedding the graph in a webpage, it
            is better to import the plotly.js library and use a `div`. Default is `False`.
        filename (str): The local filename to save the outputted chart to. If the
            filename already exists, it will be overwritten. This argument only applies
            if output_type is `file`. The default is `temp-plot.html`.
        auto_open (bool): If True, open the saved file in a web browser after saving.
            This argument only applies if output_type is `file`. Default is `False`.
        config (dict): A dict of parameters in the object's configuration.

    Note:
        This method uses `plotly.offline.plot`, which no longer appears to be documented.
        It has been replaced by renderers: https://plotly.com/python/renderers/. However,
        there does not appear to be an HTML renderer, so no attempt has been made to
        use the new functionality.
    """
    if self.config:
        config = self.config

    if filename and output_type == "file":
        return _plot(
            self.fig,
            show_link=show_link,
            output_type="file",
            include_plotlyjs=include_plotlyjs,
            filename=filename,
            auto_open=auto_open,
            config=config
        )
    elif filename and output_type == "div":
        pl = _plot(
            self.fig,
            show_link=show_link,
            output_type="div",
            include_plotlyjs=include_plotlyjs,
            auto_open=auto_open,
            config=config
        )
        with open(filename, "w") as f:
            f.write(pl)
        return pl
    else:
        return _plot(
            self.fig,
            show_link=show_link,
            output_type="div",
            include_plotlyjs=include_plotlyjs,
            auto_open=auto_open,
            config=config
        )
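
One detail of the source above is worth spelling out: the `if self.config:` check runs before anything else, so a non-empty config stored on the instance replaces whatever is passed as the `config` argument. The sketch below isolates that rule. `resolve_config` is a hypothetical helper, and the dictionaries are shortened versions of the constructor default, used only for illustration.

```python
def resolve_config(instance_config, call_config=None):
    """Mirror `if self.config: config = self.config` from `to_html`."""
    if instance_config:
        return instance_config
    return call_config


# Shortened stand-in for the constructor's default config.
default_config = {"displaylogo": False, "scrollZoom": True}

# A truthy instance config wins over the argument.
used = resolve_config(default_config, {"scrollZoom": False})

# Only an empty (or None) instance config lets the argument through.
used_when_empty = resolve_config({}, {"scrollZoom": False})

print(used)             # {'displaylogo': False, 'scrollZoom': True}
print(used_when_empty)  # {'scrollZoom': False}
```

Because the constructor supplies a non-empty default, the argument passed to `to_html` appears to take effect only if the instance was created with an empty or `None` config. This reading is based solely on the source shown here.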

lexos.visualization.plotly.cluster.dendrogram._get_dummy_scatter(x_value) ¤

Create an invisible scatter point at (x_value, 0).

Use this function to help extend the margin of the dendrogram plot.

Parameters:

Name Type Description Default
x_value float

The desired x value we want to extend the margin to.

required

Returns:

Type Description
Scatter

An invisible scatter point at (x_value, 0).

Source code in lexos\visualization\plotly\cluster\dendrogram.py
def _get_dummy_scatter(x_value: float) -> Scatter:
    """Create a invisible scatter point at (x_value, 0).

    Use this function to help extend the margin of the dendrogram plot.

    Args:
        x_value (float): The desired x value we want to extend the margin to.

    Returns:
        Scatter: An invisible scatter point at (x_value, 0).
    """
    return Scatter(
        x=[x_value],
        y=[0],
        mode="markers",
        opacity=0,
        hoverinfo="skip"
    )