Commit 7fea2e0

readme updates
1 parent 4c84a4b commit 7fea2e0

2 files changed (+13, -3 lines changed)

README.md

+12 -1
@@ -10,7 +10,7 @@ the sentences that are closest to the cluster's centroids. This library also use
 https://github.com/huggingface/neuralcoref library to resolve words in summaries that need more context. The greedyness of
 the neuralcoref library can be tweaked in the CoreferenceHandler class.
 
-As of version 0.4.2, by default, CUDA is used if a gpu is available.
+As of the most recent version of bert-extractive-summarizer, by default, CUDA is used if a gpu is available.
 
 Paper: https://arxiv.org/abs/1906.04165
 
@@ -61,6 +61,17 @@ result = model(body, ratio=0.2) # Specified with ratio
 result = model(body, num_sentences=3) # Will return 3 sentences
 ```
 
+#### Using multiple hidden layers as the embedding output
+
+You can also concat the summarizer embeddings for clustering. A simple example is below.
+
+```python
+from summarizer import Summarizer
+body = 'Text body that you want to summarize with BERT'
+model = Summarizer('distilbert-base-uncased', hidden=[-1,-2], hidden_concat=True)
+result = model(body, num_sentences=3)
+```
+
 ### Use SBert
 One can use Sentence Bert with bert-extractive-summarizer with the newest version. It is based off the paper here:
 https://arxiv.org/abs/1908.10084, and the library here: https://www.sbert.net/. To get started,

summarizer/transformer_embeddings/bert_embedding.py

+1 -2
@@ -100,7 +100,6 @@ def extract_embeddings(
 
         :param text: The text to extract embeddings for.
         :param hidden: The hidden layer(s) to use for a readout handler.
-        :param squeeze: If we should squeeze the outputs (required for some layers).
         :param reduce_option: How we should reduce the items.
         :param hidden_concat: Whether or not to concat multiple hidden layers.
         :return: A torch vector.
@@ -158,7 +157,7 @@ def create_matrix(
     def __call__(
         self,
         content: List[str],
-        hidden: int = -2,
+        hidden: Union[List[int], int] = -2,
         reduce_option: str = 'mean',
         hidden_concat: bool = False,
     ) -> ndarray:
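
The widened `hidden: Union[List[int], int]` hint means callers may pass either one layer index or several. The sketch below illustrates that idea with plain Python lists standing in for per-layer vectors. It is not the library's implementation: `select_hidden` is a hypothetical helper, and averaging the selected layers when `hidden_concat` is false is an assumption made for illustration.

```python
from typing import List, Sequence, Union


def select_hidden(
    layers: Sequence[Sequence[float]],
    hidden: Union[List[int], int] = -2,
    hidden_concat: bool = False,
) -> List[float]:
    """Hypothetical helper: combine one or more per-layer vectors."""
    # Accept a single index or a list of indices, mirroring the new type hint.
    indices = [hidden] if isinstance(hidden, int) else list(hidden)
    chosen = [list(layers[i]) for i in indices]

    if hidden_concat:
        # Concatenation: the output width grows with the number of layers.
        return [value for vector in chosen for value in vector]

    # Assumed fallback: average the chosen layers element-wise.
    return [sum(column) / len(chosen) for column in zip(*chosen)]


# Three toy "layers", each a 2-dimensional vector.
layers = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
single = select_hidden(layers)  # default: second-to-last layer
concat = select_hidden(layers, [-1, -2], hidden_concat=True)
averaged = select_hidden(layers, [-1, -2])
```

With `hidden=[-1, -2]` and `hidden_concat=True`, the result is twice as wide as a single layer, which is worth keeping in mind if downstream clustering is sensitive to embedding dimensionality.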
