Pipeline
The pipeline component of Scrubber is used to manage an ordered application of Scrubber component functions to text.
lexos.scrubber.pipeline.make_pipeline(*funcs)
Make a callable pipeline.
Make a callable pipeline that passes a text through a series of functions in sequential order, then outputs a (scrubbed) text string.
This function is intended as a lightweight convenience for users, allowing them to flexibly specify which scrubbing functions to apply (and in which order) while treating the whole thing as a single callable.
Installing cytoolz (python -m pip install cytoolz) is required for this function to work.
Use pipe (a thin wrapper around functools.partial) to pass arguments to preprocessors.
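from lexos import scrubber
from lexos.scrubber.pipeline import make_pipeline, pipe

# Build the pipeline; the functions run in the order listed.
scrub = make_pipeline(
    scrubber.replace.hashtags,
    scrubber.replace.emojis,
    # pipe binds keyword arguments to a component function.
    pipe(scrubber.remove.punctuation, only=[".", "?", "!"]),
)
scrub("@spacy_io is OSS for industrial-strength NLP in Python developed by @explosion_ai 💥")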
'_USER_ is OSS for industrial-strength NLP in Python developed by _USER_ _EMOJI_'
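The sequential behavior described above can be sketched in plain Python. This is only an illustration under stated assumptions, not the Lexos implementation (which relies on cytoolz): make_pipeline_sketch, lower, and strip_exclamations are hypothetical names invented for this example.

```python
from functools import reduce
from typing import Callable


def make_pipeline_sketch(*funcs: Callable[[str], str]) -> Callable[[str], str]:
    """Compose funcs left to right into one callable (illustrative only)."""

    def pipeline(text: str) -> str:
        # Feed the output of each function into the next one, in order.
        return reduce(lambda acc, func: func(acc), funcs, text)

    return pipeline


# Hypothetical stand-ins for Scrubber component functions.
def lower(text: str) -> str:
    return text.lower()


def strip_exclamations(text: str) -> str:
    return text.replace("!", "")


scrub = make_pipeline_sketch(lower, strip_exclamations, str.strip)
print(scrub("  Hello, World!  "))  # hello, world
```

Because each function receives the previous function's output, reordering the arguments can change the result, which is why the order of the functions matters when building a pipeline.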
Parameters:
Name | Type | Description | Default |
---|---|---|---|
*funcs | Callable[[str], str] | A series of functions to be applied to the text. | () |
Returns:
Type | Description |
---|---|
Callable[[str], str] | Pipeline composed of the functions in *funcs, applied in sequential order. |
Source code in lexos\scrubber\pipeline.py
lexos.scrubber.pipeline.make_pipeline_from_tuple(funcs)
Return a pipeline from a tuple.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
funcs | tuple | A tuple containing callables or string names of functions. | required |
Returns a tuple of functions.
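As a rough illustration of turning a mixed tuple of callables and string names into a tuple of functions, the sketch below looks names up in a plain dictionary. Everything here is an assumption made for the example: KNOWN_FUNCS, resolve_funcs, lower, and strip_whitespace are hypothetical, and the actual Lexos function may resolve names differently.

```python
from typing import Callable, Dict, Tuple, Union


# Hypothetical component functions.
def lower(text: str) -> str:
    return text.lower()


def strip_whitespace(text: str) -> str:
    return text.strip()


# Hypothetical name-to-function table standing in for a registry.
KNOWN_FUNCS: Dict[str, Callable[[str], str]] = {
    "lower": lower,
    "strip_whitespace": strip_whitespace,
}


def resolve_funcs(
    funcs: Tuple[Union[str, Callable[[str], str]], ...]
) -> Tuple[Callable[[str], str], ...]:
    """Replace string names with functions; pass callables through as-is."""
    return tuple(KNOWN_FUNCS[f] if isinstance(f, str) else f for f in funcs)


resolved = resolve_funcs(("lower", strip_whitespace))
print([f.__name__ for f in resolved])  # ['lower', 'strip_whitespace']
```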
Source code in lexos\scrubber\pipeline.py
Note
lexos.scrubber.pipeline.make_pipeline_from_tuple is deprecated. It should not be necessary if you are using lexos.scrubber.registry.
lexos.scrubber.pipeline.pipe(func, *args, **kwargs)
Apply functools.partial and add __name__ to the partial function.
This allows the function to be passed to the pipeline along with keyword arguments.
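A minimal sketch of the idea, assuming the wrapper simply copies the wrapped function's name onto the partial object: pipe_sketch and remove_chars are hypothetical names used only for illustration and are not the Lexos source.

```python
from functools import partial
from typing import Any, Callable


def pipe_sketch(func: Callable, *args: Any, **kwargs: Any) -> Callable:
    """Bind arguments to func and copy its __name__ onto the partial."""
    bound = partial(func, *args, **kwargs)
    # functools.partial objects have no __name__ by default, so set one.
    bound.__name__ = func.__name__
    return bound


# Hypothetical preprocessor that takes a keyword argument.
def remove_chars(text: str, only: tuple = ()) -> str:
    for char in only:
        text = text.replace(char, "")
    return text


remove_marks = pipe_sketch(remove_chars, only=(".", "?", "!"))
print(remove_marks.__name__)       # remove_chars
print(remove_marks("Ready? Go!"))  # Ready Go
```

Keeping a __name__ on the partial means code that inspects or logs function names can treat it like an ordinary function.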
Parameters:
Name | Type | Description | Default |
---|---|---|---|
func | Callable | A callable. | required |
Returns:
Type | Description |
---|---|
Callable | A partial function with a __name__ attribute. |
Source code in lexos\scrubber\pipeline.py