
API Reference

pysatl_tsp

The PySATL Time Series Processing subproject (pysatl-tsp) is a module for adaptive processing of time series data, built around a streaming architecture. It implements the Chain of Responsibility pattern, which lets you build complex data processing pipelines with minimal boilerplate code. This makes it suitable for real-time applications and for analysis of large datasets.

core

This module provides the core functionality for the pysatl_tsp package.

Handler

Bases: ABC, Generic[T, U]

Abstract base class for time series processing handlers.

This class implements a Chain of Responsibility pattern for processing time series data. Each Handler can be connected to a source handler and process its output data. Handlers can be combined using the pipe operator (|) to create processing pipelines.

Parameters:

Name Type Description Default
source Handler[Any, T] | None

The handler to use as a data source, defaults to None

None
Source code in pysatl_tsp/core/handler.py
class Handler(ABC, Generic[T, U]):
    """Abstract base class for time series processing handlers.

    This class implements a Chain of Responsibility pattern for processing time series data.
    Each Handler can be connected to a source handler and process its output data.
    Handlers can be combined using the pipe operator (|) to create processing pipelines.

    :param source: The handler to use as a data source, defaults to None
    """

    def __init__(self, source: Handler[Any, T] | None = None):
        """Initialize a handler with an optional source.

        :param source: The handler to use as a data source, defaults to None
        """
        self._source = source

    @property
    def source(self) -> Handler[Any, T] | None:
        """Get the source handler that provides input data to this handler.

        :return: The source handler or None if this is a root handler
        """
        return self._source

    @source.setter
    def source(self, value: Handler[Any, T]) -> None:
        """Set the source handler for this handler.

        :param value: The handler to use as a data source
        :raises RuntimeError: If the source has already been set
        """
        if self._source is not None:
            raise RuntimeError("Cannot change an already set source")
        self._source = value

    @abstractmethod
    def __iter__(self) -> Iterator[U]:
        """Create an iterator over the output data produced by this handler.

        Each subclass must implement this method to define how data is processed.

        :return: An iterator yielding processed data items
        """
        pass

    def __or__(self, other: Handler[U, V]) -> Pipeline[T, V]:
        """Combine this handler with another handler using the pipe operator.

        This allows for the creation of processing pipelines using syntax like:
        handler1 | handler2 | handler3

        :param other: The next handler in the pipeline
        :return: A Pipeline object connecting this handler to the other handler
        """
        return Pipeline(self, other)
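
As a minimal sketch of how a concrete handler might look, the block below subclasses a stand-in `Handler` that mirrors only the constructor, the `source` getter, and the abstract `__iter__` from the listing above, so it runs without the package installed. `Numbers` and `Doubler` are hypothetical names, and consuming input by iterating over `self.source` is an assumption about how subclasses typically read their data.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from collections.abc import Iterator
from typing import Any, Generic, TypeVar

T = TypeVar("T")
U = TypeVar("U")


class Handler(ABC, Generic[T, U]):
    """Stand-in mirroring the listing above (setter and pipe operator omitted)."""

    def __init__(self, source: Handler[Any, T] | None = None) -> None:
        self._source = source

    @property
    def source(self) -> Handler[Any, T] | None:
        return self._source

    @abstractmethod
    def __iter__(self) -> Iterator[U]: ...


class Numbers(Handler[None, int]):
    """Hypothetical root handler: has no source and yields fixed values."""

    def __iter__(self) -> Iterator[int]:
        yield from (1, 2, 3)


class Doubler(Handler[int, int]):
    """Hypothetical processing handler: doubles every item from its source."""

    def __iter__(self) -> Iterator[int]:
        if self.source is None:
            raise ValueError("Doubler needs a source handler")
        for item in self.source:
            yield item * 2


doubled = list(Doubler(source=Numbers()))
print(doubled)
```

Because `__iter__` is a generator, nothing is computed until the result is iterated, which is what keeps the chain streaming rather than batch-oriented.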
source property writable
source: Handler[Any, T] | None

Get the source handler that provides input data to this handler.

Returns:

Type Description
Handler[Any, T] | None

The source handler or None if this is a root handler
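
The setter shown in the class listing allows the source to be assigned only once. The sketch below reproduces just that getter and setter on a hypothetical `Node` class to show the behavior; the error message text here is illustrative and not taken from the library.

```python
from __future__ import annotations


class Node:
    """Stand-in reproducing only the source getter and setter logic."""

    def __init__(self, source: Node | None = None) -> None:
        self._source = source

    @property
    def source(self) -> Node | None:
        return self._source

    @source.setter
    def source(self, value: Node) -> None:
        # Same rule as in the listing: an existing source cannot be replaced.
        if self._source is not None:
            raise RuntimeError("source is already set")
        self._source = value


root, other, stage = Node(), Node(), Node()
stage.source = root  # first assignment succeeds because no source was set

try:
    stage.source = other  # second assignment is rejected
    second_assignment_failed = False
except RuntimeError:
    second_assignment_failed = True

print(stage.source is root, second_assignment_failed)
```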

__init__
__init__(source: Handler[Any, T] | None = None)

Initialize a handler with an optional source.

Parameters:

Name Type Description Default
source Handler[Any, T] | None

The handler to use as a data source, defaults to None

None
Source code in pysatl_tsp/core/handler.py
def __init__(self, source: Handler[Any, T] | None = None):
    """Initialize a handler with an optional source.

    :param source: The handler to use as a data source, defaults to None
    """
    self._source = source
__iter__ abstractmethod
__iter__() -> Iterator[U]

Create an iterator over the output data produced by this handler.

Each subclass must implement this method to define how data is processed.

Returns:

Type Description
Iterator[U]

An iterator yielding processed data items

Source code in pysatl_tsp/core/handler.py
@abstractmethod
def __iter__(self) -> Iterator[U]:
    """Create an iterator over the output data produced by this handler.

    Each subclass must implement this method to define how data is processed.

    :return: An iterator yielding processed data items
    """
    pass
__or__
__or__(other: Handler[U, V]) -> Pipeline[T, V]

Combine this handler with another handler using the pipe operator.

This allows for the creation of processing pipelines using syntax like: handler1 | handler2 | handler3

Parameters:

Name Type Description Default
other Handler[U, V]

The next handler in the pipeline

required

Returns:

Type Description
Pipeline[T, V]

A Pipeline object connecting this handler to the other handler

Source code in pysatl_tsp/core/handler.py
def __or__(self, other: Handler[U, V]) -> Pipeline[T, V]:
    """Combine this handler with another handler using the pipe operator.

    This allows for the creation of processing pipelines using syntax like:
    handler1 | handler2 | handler3

    :param other: The next handler in the pipeline
    :return: A Pipeline object connecting this handler to the other handler
    """
    return Pipeline(self, other)
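
The `Pipeline` class itself is not shown in this section, so the following is only a sketch of how the pipe operator could chain handlers. The `Pipeline` stand-in below, which attaches the left handler as the source of the right one, is an assumption and not the library's implementation; `Numbers`, `AddOne`, and `Square` are hypothetical handlers.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from collections.abc import Iterator
from typing import Any


class Handler(ABC):
    """Untyped stand-in for the Handler listing above."""

    def __init__(self, source: Handler | None = None) -> None:
        self.source = source

    @abstractmethod
    def __iter__(self) -> Iterator[Any]: ...

    def __or__(self, other: Handler) -> Pipeline:
        return Pipeline(self, other)


class Pipeline(Handler):
    """Hypothetical pipeline: feeds `first` into `second` and yields the result."""

    def __init__(self, first: Handler, second: Handler) -> None:
        super().__init__()
        second.source = first  # assumption: piping wires up the source
        self._second = second

    def __iter__(self) -> Iterator[Any]:
        yield from self._second


class Numbers(Handler):
    def __iter__(self) -> Iterator[int]:
        yield from (1, 2, 3)


class AddOne(Handler):
    def __iter__(self) -> Iterator[int]:
        for item in self.source:
            yield item + 1


class Square(Handler):
    def __iter__(self) -> Iterator[int]:
        for item in self.source:
            yield item * item


# Evaluated left to right: (Numbers | AddOne) | Square
pipeline = Numbers() | AddOne() | Square()
result = list(pipeline)
print(result)
```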

data_providers

This module provides various data providers for the pysatl_tsp package.

DataBaseDataProvider

Bases: DataProvider[T], Generic[T]

A data provider that sources time series data from a database.

This class provides a way to query time series data from any database system using adapters that implement the DatabaseAdapter interface. It handles the connection lifecycle and streaming of data from database queries.

Parameters:

Name Type Description Default
connection_params dict[str, Any]

Dictionary containing connection parameters

required
query str

SQL query string to execute

required
adapter DatabaseAdapter[T]

Database adapter implementation

required
params tuple[Any, ...]

Query parameters, defaults to ()

()

Example:

# Using the SQLiteAdapter from the DatabaseAdapter example
import sqlite3

class SQLiteAdapter(DatabaseAdapter[dict[str, Any]]):
    ...  # adapter implementation as shown in the DatabaseAdapter example

# Create a data provider for SQLite database
provider = DataBaseDataProvider(
    connection_params={"database": "sensors.db"},
    query="SELECT timestamp, value FROM temperature WHERE location_id = ? AND timestamp > ?",
    adapter=SQLiteAdapter(),
    params=("zone-1", "2023-01-01")
)

# Iterate over the provider directly
for record in provider:
    print(f"Time: {record['timestamp']}, Value: {record['value']}")

# Or connect to a processing pipeline
pipeline = provider | WindowHandler(60) | AverageHandler()

Source code in pysatl_tsp/core/data_providers/database_data_provider.py
class DataBaseDataProvider(DataProvider[T], Generic[T]):
    """A data provider that sources time series data from a database.

    This class provides a way to query time series data from any database system
    using adapters that implement the DatabaseAdapter interface. It handles the
    connection lifecycle and streaming of data from database queries.

    :param connection_params: Dictionary containing connection parameters
    :param query: SQL query string to execute
    :param adapter: Database adapter implementation
    :param params: Query parameters, defaults to ()

    Example:
        ```python
        # Using the SQLiteAdapter from the example above
        import sqlite3

        class SQLiteAdapter(DatabaseAdapter[dict[str, Any]]):
            ...  # adapter implementation as shown above

        # Create a data provider for SQLite database
        provider = DataBaseDataProvider(
            connection_params={"database": "sensors.db"},
            query="SELECT timestamp, value FROM temperature WHERE location_id = ? AND timestamp > ?",
            adapter=SQLiteAdapter(),
            params=("zone-1", "2023-01-01")
        )

        # Iterate over the provider directly
        for record in provider:
            print(f"Time: {record['timestamp']}, Value: {record['value']}")

        # Or connect to a processing pipeline
        pipeline = provider | WindowHandler(60) | AverageHandler()
        ```
    """

    def __init__(
        self,
        connection_params: dict[str, Any],
        query: str,
        adapter: DatabaseAdapter[T],
        params: tuple[Any, ...] = (),
    ) -> None:
        """Initialize a database data provider.

        :param connection_params: Dictionary containing connection parameters
        :param query: SQL query string to execute
        :param adapter: Database adapter implementation
        :param params: Query parameters, defaults to ()
        """
        super().__init__()
        self._connection_params = connection_params
        self._query = query
        self._params = params
        self._adapter = adapter

    @contextmanager
    def _connection_context(self) -> Any:
        """Context manager for database connection lifecycle.

        This method handles the proper setup and teardown of database connections,
        ensuring that resources are properly released even in the case of errors.

        :return: Database cursor or similar query result object
        :raises Exception: If database operations fail
        """
        connection = None
        cursor = None
        try:
            connection = self._adapter.connect(self._connection_params)
            cursor = self._adapter.execute_query(connection, self._query, self._params)
            yield cursor
        finally:
            if cursor is not None:
                self._adapter.close_cursor(cursor)
            if connection is not None:
                self._adapter.close_connection(connection)

    def __iter__(self) -> Iterator[T]:
        """Create an iterator over the query results from the database.

        This method executes the query and yields data items by delegating
        to the adapter's fetch_data method.

        :return: An iterator yielding data items from the database query
        :raises Exception: If database operations fail
        """
        with self._connection_context() as cursor:
            yield from self._adapter.fetch_data(cursor)
__init__
__init__(
    connection_params: dict[str, Any],
    query: str,
    adapter: DatabaseAdapter[T],
    params: tuple[Any, ...] = (),
) -> None

Initialize a database data provider.

Parameters:

Name Type Description Default
connection_params dict[str, Any]

Dictionary containing connection parameters

required
query str

SQL query string to execute

required
adapter DatabaseAdapter[T]

Database adapter implementation

required
params tuple[Any, ...]

Query parameters, defaults to ()

()
Source code in pysatl_tsp/core/data_providers/database_data_provider.py
def __init__(
    self,
    connection_params: dict[str, Any],
    query: str,
    adapter: DatabaseAdapter[T],
    params: tuple[Any, ...] = (),
) -> None:
    """Initialize a database data provider.

    :param connection_params: Dictionary containing connection parameters
    :param query: SQL query string to execute
    :param adapter: Database adapter implementation
    :param params: Query parameters, defaults to ()
    """
    super().__init__()
    self._connection_params = connection_params
    self._query = query
    self._params = params
    self._adapter = adapter
__iter__
__iter__() -> Iterator[T]

Create an iterator over the query results from the database.

This method executes the query and yields data items by delegating to the adapter's fetch_data method.

Returns:

Type Description
Iterator[T]

An iterator yielding data items from the database query

Raises:

Type Description
Exception

If database operations fail

Source code in pysatl_tsp/core/data_providers/database_data_provider.py
def __iter__(self) -> Iterator[T]:
    """Create an iterator over the query results from the database.

    This method executes the query and yields data items by delegating
    to the adapter's fetch_data method.

    :return: An iterator yielding data items from the database query
    :raises Exception: If database operations fail
    """
    with self._connection_context() as cursor:
        yield from self._adapter.fetch_data(cursor)
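
The connection lifecycle described for this provider combines a context manager with a generator. The sketch below shows the same try/finally pattern using only the standard library's `sqlite3` module; `cursor_context`, `stream_rows`, and the `closed` list are hypothetical names introduced for the demonstration and are not part of pysatl_tsp.

```python
from __future__ import annotations

import sqlite3
from collections.abc import Iterator
from contextlib import contextmanager
from typing import Any

closed: list[str] = []  # records the cleanup order for the demonstration


@contextmanager
def cursor_context(
    database: str, query: str, params: tuple[Any, ...] = ()
) -> Iterator[sqlite3.Cursor]:
    connection = None
    cursor = None
    try:
        connection = sqlite3.connect(database)
        connection.row_factory = sqlite3.Row
        cursor = connection.cursor()
        cursor.execute(query, params)
        yield cursor
    finally:
        # Release resources in reverse order of acquisition.
        if cursor is not None:
            cursor.close()
            closed.append("cursor")
        if connection is not None:
            connection.close()
            closed.append("connection")


def stream_rows(
    database: str, query: str, params: tuple[Any, ...] = ()
) -> Iterator[dict[str, Any]]:
    with cursor_context(database, query, params) as cursor:
        for row in cursor:
            yield dict(row)


rows = stream_rows(":memory:", "SELECT ? AS value UNION ALL SELECT ?", (1, 2))
first = next(rows)  # the connection is opened only when the first item is requested
rows.close()        # stopping early still runs the finally block
print(first, closed)
```

One consequence of this design is that the connection stays open for as long as the consumer keeps the iterator alive, so a slow downstream handler holds the database connection for the full duration of the stream.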
DataProvider

Bases: Handler[None, T]

Abstract base class for time series data providers.

DataProvider serves as a root handler in a processing pipeline, responsible for sourcing the initial time series data. As the first element in the chain, it doesn't receive input from any preceding handler and acts as the data origin.

This class is designed to be subclassed with specific implementations for different data sources such as files, databases, APIs, or generated data.

Source code in pysatl_tsp/core/data_providers/abstract.py
class DataProvider(Handler[None, T]):
    """Abstract base class for time series data providers.

    DataProvider serves as a root handler in a processing pipeline, responsible for
    sourcing the initial time series data. As the first element in the chain, it doesn't
    receive input from any preceding handler and acts as the data origin.

    This class is designed to be subclassed with specific implementations for different
    data sources such as files, databases, APIs, or generated data.
    """

    def __init__(self) -> None:
        """Initialize a data provider with no source.

        Data providers are always root handlers and cannot have a source.
        """
        super().__init__(source=None)

    @abstractmethod
    def __iter__(self) -> Iterator[T]:
        """Create an iterator over the time series data provided by this data source.

        Each subclass must implement this method to define how data is sourced and
        potentially pre-processed before being passed to subsequent handlers.

        :return: An iterator yielding time series data items
        """
        pass
__init__
__init__() -> None

Initialize a data provider with no source.

Data providers are always root handlers and cannot have a source.

Source code in pysatl_tsp/core/data_providers/abstract.py
def __init__(self) -> None:
    """Initialize a data provider with no source.

    Data providers are always root handlers and cannot have a source.
    """
    super().__init__(source=None)
__iter__ abstractmethod
__iter__() -> Iterator[T]

Create an iterator over the time series data provided by this data source.

Each subclass must implement this method to define how data is sourced and potentially pre-processed before being passed to subsequent handlers.

Returns:

Type Description
Iterator[T]

An iterator yielding time series data items

Source code in pysatl_tsp/core/data_providers/abstract.py
@abstractmethod
def __iter__(self) -> Iterator[T]:
    """Create an iterator over the time series data provided by this data source.

    Each subclass must implement this method to define how data is sourced and
    potentially pre-processed before being passed to subsequent handlers.

    :return: An iterator yielding time series data items
    """
    pass
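
To illustrate the "generated data" case mentioned above, here is a rough sketch of a custom provider. It uses a simplified stand-in `DataProvider` (without the `Handler` base) so that it runs on its own; `SineWaveProvider` and its `count` and `step` parameters are hypothetical.

```python
from __future__ import annotations

import math
from abc import ABC, abstractmethod
from collections.abc import Iterator


class DataProvider(ABC):
    """Stand-in for the abstract DataProvider shown above (no Handler base)."""

    def __init__(self) -> None:
        self.source = None  # data providers are root handlers

    @abstractmethod
    def __iter__(self) -> Iterator[float]: ...


class SineWaveProvider(DataProvider):
    """Hypothetical provider that generates `count` samples of a sine wave."""

    def __init__(self, count: int, step: float = math.pi / 2) -> None:
        super().__init__()
        self.count = count
        self.step = step

    def __iter__(self) -> Iterator[float]:
        for index in range(self.count):
            yield math.sin(index * self.step)


samples = list(SineWaveProvider(count=4))
print(samples)
```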
DatabaseAdapter

Bases: ABC, Generic[T]

Abstract base class for database adapter implementations.

This class defines the interface for adapters that connect to different database systems. Concrete implementations of this class handle the specific details of connecting to databases, executing queries, and transforming results into a consistent format for the data processing pipeline.

Example:

import sqlite3


class SQLiteAdapter(DatabaseAdapter[dict[str, Any]]):
    def connect(self, connection_params: dict[str, Any]) -> sqlite3.Connection:
        connection = sqlite3.connect(connection_params["database"])
        connection.row_factory = sqlite3.Row
        return connection

    def execute_query(
        self, connection: sqlite3.Connection, query: str, params: tuple[Any, ...] = ()
    ) -> sqlite3.Cursor:
        cursor = connection.cursor()
        cursor.execute(query, params)
        return cursor

    def fetch_data(self, cursor: sqlite3.Cursor) -> Iterator[dict[str, Any]]:
        for row in cursor:
            yield dict(row)

    def close_cursor(self, cursor: sqlite3.Cursor) -> None:
        cursor.close()

    def close_connection(self, connection: sqlite3.Connection) -> None:
        connection.close()


# Usage:
adapter = SQLiteAdapter()
provider = DataBaseDataProvider(
    connection_params={"database": "time_series.db"},
    query="SELECT timestamp, value FROM measurements WHERE sensor_id = ?",
    adapter=adapter,
    params=(42,),
)

Source code in pysatl_tsp/core/data_providers/database_data_provider.py
class DatabaseAdapter(ABC, Generic[T]):
    """Abstract base class for database adapter implementations.

    This class defines the interface for adapters that connect to different database
    systems. Concrete implementations of this class handle the specific details of
    connecting to databases, executing queries, and transforming results into a
    consistent format for the data processing pipeline.

    Example:
        ```python
        import sqlite3


        class SQLiteAdapter(DatabaseAdapter[dict[str, Any]]):
            def connect(self, connection_params: dict[str, Any]) -> sqlite3.Connection:
                connection = sqlite3.connect(connection_params["database"])
                connection.row_factory = sqlite3.Row
                return connection

            def execute_query(
                self, connection: sqlite3.Connection, query: str, params: tuple[Any, ...] = ()
            ) -> sqlite3.Cursor:
                cursor = connection.cursor()
                cursor.execute(query, params)
                return cursor

            def fetch_data(self, cursor: sqlite3.Cursor) -> Iterator[dict[str, Any]]:
                for row in cursor:
                    yield dict(row)

            def close_cursor(self, cursor: sqlite3.Cursor) -> None:
                cursor.close()

            def close_connection(self, connection: sqlite3.Connection) -> None:
                connection.close()


        # Usage:
        adapter = SQLiteAdapter()
        provider = DataBaseDataProvider(
            connection_params={"database": "time_series.db"},
            query="SELECT timestamp, value FROM measurements WHERE sensor_id = ?",
            adapter=adapter,
            params=(42,),
        )
        ```
    """

    @abstractmethod
    def connect(self, connection_params: dict[str, Any]) -> Any:
        """Establish a connection to the database.

        :param connection_params: Dictionary containing connection parameters
                                 (e.g., host, port, username, password, etc.)
        :return: Database connection object
        :raises Exception: If connection fails
        """
        pass

    @abstractmethod
    def execute_query(self, connection: Any, query: str, params: tuple[Any, ...] = ()) -> Any:
        """Execute a query on the given database connection.

        :param connection: Database connection object
        :param query: SQL query string
        :param params: Query parameters, defaults to ()
        :return: Query result cursor or similar object
        :raises Exception: If query execution fails
        """
        pass

    @abstractmethod
    def fetch_data(self, cursor: Any) -> Iterator[T]:
        """Extract and transform data from the query result.

        :param cursor: Query result cursor
        :return: Iterator yielding data items of type T
        """
        pass

    @abstractmethod
    def close_cursor(self, cursor: Any) -> None:
        """Close the query cursor.

        :param cursor: Query result cursor
        """
        pass

    @abstractmethod
    def close_connection(self, connection: Any) -> None:
        """Close the database connection.

        :param connection: Database connection object
        """
        pass
close_connection abstractmethod
close_connection(connection: Any) -> None

Close the database connection.

Parameters:

Name Type Description Default
connection Any

Database connection object

required
Source code in pysatl_tsp/core/data_providers/database_data_provider.py
@abstractmethod
def close_connection(self, connection: Any) -> None:
    """Close the database connection.

    :param connection: Database connection object
    """
    pass
close_cursor abstractmethod
close_cursor(cursor: Any) -> None

Close the query cursor.

Parameters:

Name Type Description Default
cursor Any

Query result cursor

required
Source code in pysatl_tsp/core/data_providers/database_data_provider.py
@abstractmethod
def close_cursor(self, cursor: Any) -> None:
    """Close the query cursor.

    :param cursor: Query result cursor
    """
    pass
connect abstractmethod
connect(connection_params: dict[str, Any]) -> Any

Establish a connection to the database.

Parameters:

Name Type Description Default
connection_params dict[str, Any]

Dictionary containing connection parameters (e.g., host, port, username, password, etc.)

required

Returns:

Type Description
Any

Database connection object

Raises:

Type Description
Exception

If connection fails

Source code in pysatl_tsp/core/data_providers/database_data_provider.py
@abstractmethod
def connect(self, connection_params: dict[str, Any]) -> Any:
    """Establish a connection to the database.

    :param connection_params: Dictionary containing connection parameters
                             (e.g., host, port, username, password, etc.)
    :return: Database connection object
    :raises Exception: If connection fails
    """
    pass
execute_query abstractmethod
execute_query(
    connection: Any,
    query: str,
    params: tuple[Any, ...] = (),
) -> Any

Execute a query on the given database connection.

Parameters:

Name Type Description Default
connection Any

Database connection object

required
query str

SQL query string

required
params tuple[Any, ...]

Query parameters, defaults to ()

()

Returns:

Type Description
Any

Query result cursor or similar object

Raises:

Type Description
Exception

If query execution fails

Source code in pysatl_tsp/core/data_providers/database_data_provider.py
@abstractmethod
def execute_query(self, connection: Any, query: str, params: tuple[Any, ...] = ()) -> Any:
    """Execute a query on the given database connection.

    :param connection: Database connection object
    :param query: SQL query string
    :param params: Query parameters, defaults to ()
    :return: Query result cursor or similar object
    :raises Exception: If query execution fails
    """
    pass
fetch_data abstractmethod
fetch_data(cursor: Any) -> Iterator[T]

Extract and transform data from the query result.

Parameters:

Name Type Description Default
cursor Any

Query result cursor

required

Returns:

Type Description
Iterator[T]

Iterator yielding data items of type T

Source code in pysatl_tsp/core/data_providers/database_data_provider.py
@abstractmethod
def fetch_data(self, cursor: Any) -> Iterator[T]:
    """Extract and transform data from the query result.

    :param cursor: Query result cursor
    :return: Iterator yielding data items of type T
    """
    pass
FileDataProvider

Bases: DataProvider[X]

A data provider that reads time series data from a text file.

This class implements a file-based data source that reads a file line by line and transforms each line into a data item using a specified handler function. It is useful for processing time series data stored in text files with various formats (CSV, JSON per line, custom formats, etc.).

Parameters:

Name Type Description Default
filename str

Path to the file containing time series data

required
handler Callable[[str], X]

A function that converts each line of text to a data item

required

Raises:

Type Description
FileNotFoundError

If the specified file does not exist

PermissionError

If the file cannot be read due to permission issues

Source code in pysatl_tsp/core/data_providers/file_data_provider.py
class FileDataProvider(DataProvider[X]):
    """A data provider that reads time series data from a text file.

    This class implements a file-based data source that reads a file line by line
    and transforms each line into a data item using a specified handler function.
    It is useful for processing time series data stored in text files with various
    formats (CSV, JSON per line, custom formats, etc.).

    :param filename: Path to the file containing time series data
    :param handler: A function that converts each line of text to a data item

    :raises FileNotFoundError: If the specified file does not exist
    :raises PermissionError: If the file cannot be read due to permission issues
    """

    def __init__(self, filename: str, handler: Callable[[str], X]) -> None:
        """Initialize a file data provider.

        :param filename: Path to the file containing time series data
        :param handler: A function that converts each line of text to a data item
        """
        super().__init__()
        self.filename = filename
        self.handler = handler

    def __iter__(self) -> Iterator[X]:
        """Create an iterator that reads the file line by line and yields processed data items.

        The method opens the specified file, reads it line by line, and applies the handler
        function to each line to transform it into a data item.

        :return: An iterator yielding processed data items from the file
        :raises FileNotFoundError: If the specified file does not exist
        :raises PermissionError: If the file cannot be read due to permission issues
        """
        with open(self.filename) as f:
            for line in f:
                yield self.handler(line)
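
The `handler` argument is any callable that turns one line of text into a data item. A possible handler for a file with one `timestamp,value` pair per line might look like the sketch below; the file format, the `parse_sample` name, and the `temperature.csv` filename are assumptions for illustration.

```python
from __future__ import annotations


def parse_sample(line: str) -> tuple[float, float]:
    """Convert a 'timestamp,value' line into a (timestamp, value) pair."""
    timestamp_text, value_text = line.strip().split(",")
    return float(timestamp_text), float(value_text)


# Lines read from a file keep their trailing newline, so the handler strips it.
sample = parse_sample("1700000000.0,21.5\n")
print(sample)

# Hypothetical usage with the provider documented above:
# provider = FileDataProvider("temperature.csv", parse_sample)
```

Since the file is opened inside `__iter__`, a missing file raises `FileNotFoundError` when iteration starts, not when the provider is constructed.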
__init__
__init__(
    filename: str, handler: Callable[[str], X]
) -> None

Initialize a file data provider.

Parameters:

Name Type Description Default
filename str

Path to the file containing time series data

required
handler Callable[[str], X]

A function that converts each line of text to a data item

required
Source code in pysatl_tsp/core/data_providers/file_data_provider.py
def __init__(self, filename: str, handler: Callable[[str], X]) -> None:
    """Initialize a file data provider.

    :param filename: Path to the file containing time series data
    :param handler: A function that converts each line of text to a data item
    """
    super().__init__()
    self.filename = filename
    self.handler = handler
__iter__
__iter__() -> Iterator[X]

Create an iterator that reads the file line by line and yields processed data items.

The method opens the specified file, reads it line by line, and applies the handler function to each line to transform it into a data item.

Returns:

Type Description
Iterator[X]

An iterator yielding processed data items from the file

Raises:

Type Description
FileNotFoundError

If the specified file does not exist

PermissionError

If the file cannot be read due to permission issues

Source code in pysatl_tsp/core/data_providers/file_data_provider.py
def __iter__(self) -> Iterator[X]:
    """Create an iterator that reads the file line by line and yields processed data items.

    The method opens the specified file, reads it line by line, and applies the handler
    function to each line to transform it into a data item.

    :return: An iterator yielding processed data items from the file
    :raises FileNotFoundError: If the specified file does not exist
    :raises PermissionError: If the file cannot be read due to permission issues
    """
    with open(self.filename) as f:
        for line in f:
            yield self.handler(line)
SimpleDataProvider

Bases: DataProvider[T]

A data provider that serves data from an in-memory iterable collection.

This class implements a simple data provider that wraps around any iterable collection and makes it available as a data source in a processing pipeline. It's useful for testing, working with pre-loaded data, or creating pipelines that process data already in memory.

Parameters:

Name Type Description Default
data Iterable[T]

An iterable collection containing the time series data

required
Source code in pysatl_tsp/core/data_providers/simple_data_provider.py
class SimpleDataProvider(DataProvider[T]):
    """A data provider that serves data from an in-memory iterable collection.

    This class implements a simple data provider that wraps around any iterable
    collection and makes it available as a data source in a processing pipeline.
    It's useful for testing, working with pre-loaded data, or creating pipelines
    that process data already in memory.

    :param data: An iterable collection containing the time series data
    """

    def __init__(self, data: Iterable[T]) -> None:
        """Initialize a simple data provider with an iterable data source.

        :param data: An iterable collection containing the time series data
        """
        super().__init__()
        self.data = data

    def __iter__(self) -> Iterator[T]:
        """Create an iterator over the provided data collection.

        This method simply yields items from the data collection that was passed
        during initialization, making them available to subsequent handlers in
        the processing pipeline.

        :return: An iterator yielding items from the data collection
        """
        yield from self.data
__init__
__init__(data: Iterable[T]) -> None

Initialize a simple data provider with an iterable data source.

Parameters:

Name Type Description Default
data Iterable[T]

An iterable collection containing the time series data

required
Source code in pysatl_tsp/core/data_providers/simple_data_provider.py
def __init__(self, data: Iterable[T]) -> None:
    """Initialize a simple data provider with an iterable data source.

    :param data: An iterable collection containing the time series data
    """
    super().__init__()
    self.data = data
__iter__
__iter__() -> Iterator[T]

Create an iterator over the provided data collection.

This method simply yields items from the data collection that was passed during initialization, making them available to subsequent handlers in the processing pipeline.

Returns:

Type Description
Iterator[T]

An iterator yielding items from the data collection

Source code in pysatl_tsp/core/data_providers/simple_data_provider.py
def __iter__(self) -> Iterator[T]:
    """Create an iterator over the provided data collection.

    This method simply yields items from the data collection that was passed
    during initialization, making them available to subsequent handlers in
    the processing pipeline.

    :return: An iterator yielding items from the data collection
    """
    yield from self.data
WebSocketDataProvider

Bases: DataProvider[str]

A data provider that streams time series data from a WebSocket connection.

This class establishes a WebSocket connection to a specified URI and streams received messages into the processing pipeline. It handles the WebSocket connection in a separate thread to avoid blocking the main processing flow and provides a clean streaming interface through the standard iterator protocol.

Parameters:

Name Type Description Default
uri str

WebSocket endpoint URI

required
subscribe_message dict[str, Any] | None

Optional message to send after connection to subscribe to specific data streams

None

Example:

# Example: Connecting to Bybit WebSocket API for Bitcoin price data
import json

bybit_provider = WebSocketDataProvider(
    uri="wss://stream.bybit.com/v5/public/spot",
    subscribe_message={"op": "subscribe", "args": ["tickers.BTCUSDT"]},
)

try:
    for message in bybit_provider:
        data = json.loads(message)
        if "data" in data and data.get("topic") == "tickers.BTCUSDT":
            price_data = data["data"]
            print(f"BTC/USDT: {price_data['lastPrice']} (Time: {price_data['timestamp']})")
except KeyboardInterrupt:
    bybit_provider.close()

Source code in pysatl_tsp/core/data_providers/websocket_data_provider.py
class WebSocketDataProvider(DataProvider[str]):
    """A data provider that streams time series data from a WebSocket connection.

    This class establishes a WebSocket connection to a specified URI and streams
    received messages into the processing pipeline. It handles the WebSocket connection
    in a separate thread to avoid blocking the main processing flow and provides
    a clean streaming interface through the standard iterator protocol.

    :param uri: WebSocket endpoint URI
    :param subscribe_message: Optional message to send after connection to subscribe to specific data streams

    Example:
        ```python
        # Example: Connecting to Bybit WebSocket API for Bitcoin price data
        import json

        bybit_provider = WebSocketDataProvider(
            uri="wss://stream.bybit.com/v5/public/spot",
            subscribe_message={"op": "subscribe", "args": ["tickers.BTCUSDT"]},
        )

        try:
            for message in bybit_provider:
                data = json.loads(message)
                if "data" in data and data.get("topic") == "tickers.BTCUSDT":
                    price_data = data["data"]
                    print(f"BTC/USDT: {price_data['lastPrice']} (Time: {price_data['timestamp']})")
        except KeyboardInterrupt:
            bybit_provider.close()
        ```
    """

    def __init__(self, uri: str, subscribe_message: dict[str, Any] | None = None) -> None:
        """Initialize a WebSocket data provider.

        :param uri: WebSocket endpoint URI
        :param subscribe_message: Optional message to send after connection to subscribe to specific data streams
        """
        super().__init__()
        self._uri = uri
        self._subscribe_message = subscribe_message
        self._iterator_queue: queue.Queue[str] = queue.Queue()
        self._stop_event = threading.Event()
        self._thread: threading.Thread | None = None

    def __iter__(self) -> Iterator[str]:
        """Create an iterator over the messages received from the WebSocket.

        This method starts a background thread to handle the WebSocket connection
        if it's not already running, and yields messages as they are received.

        :return: An iterator yielding message strings from the WebSocket
        """
        if self._thread is None or not self._thread.is_alive():
            self._thread = threading.Thread(target=self._thread_main, daemon=True)
            self._thread.start()
        while not self._stop_event.is_set():
            try:
                item = self._iterator_queue.get(timeout=1)
                yield item
            except queue.Empty:
                continue

    def _thread_main(self) -> None:
        """Entry point for the background thread.

        This method runs the asyncio event loop that handles the WebSocket connection.
        """
        asyncio.run(self._receiver())

    async def _receiver(self) -> None:
        """Asynchronous method to handle WebSocket communication.

        This method establishes the WebSocket connection, sends the subscription message
        if provided, and places received messages into the queue for the iterator.
        """
        try:
            async with websockets.connect(self._uri) as ws:
                if self._subscribe_message is not None:
                    await ws.send(json.dumps(self._subscribe_message))
                async for msg in ws:
                    try:
                        if msg is not None:
                            self._iterator_queue.put(str(msg))
                    except Exception:
                        continue
        except Exception:
            pass

    def close(self) -> None:
        """Close the WebSocket connection and stop the background thread.

        This method should be called when the provider is no longer needed
        to release resources properly.
        """
        self._stop_event.set()
        if self._thread is not None:
            self._thread.join(timeout=2)
__init__
__init__(
    uri: str,
    subscribe_message: dict[str, Any] | None = None,
) -> None

Initialize a WebSocket data provider.

Parameters:

Name Type Description Default
uri str

WebSocket endpoint URI

required
subscribe_message dict[str, Any] | None

Optional message to send after connection to subscribe to specific data streams

None
Source code in pysatl_tsp/core/data_providers/websocket_data_provider.py
def __init__(self, uri: str, subscribe_message: dict[str, Any] | None = None) -> None:
    """Initialize a WebSocket data provider.

    :param uri: WebSocket endpoint URI
    :param subscribe_message: Optional message to send after connection to subscribe to specific data streams
    """
    super().__init__()
    self._uri = uri
    self._subscribe_message = subscribe_message
    self._iterator_queue: queue.Queue[str] = queue.Queue()
    self._stop_event = threading.Event()
    self._thread: threading.Thread | None = None
__iter__
__iter__() -> Iterator[str]

Create an iterator over the messages received from the WebSocket.

This method starts a background thread to handle the WebSocket connection if it's not already running, and yields messages as they are received.

Returns:

Type Description
Iterator[str]

An iterator yielding message strings from the WebSocket
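
The hand-off between the background thread and the iterator can be tried without a network connection. The sketch below is a simplified stand-in: `QueueBridgeSketch` and its `_produce` method are hypothetical names, and a plain list of strings plays the role of the WebSocket messages. It keeps the same shape as the listing: a daemon thread fills a `queue.Queue`, and the iterator polls it with a timeout so that it can notice the stop event.

```python
import queue
import threading
from collections.abc import Iterator


class QueueBridgeSketch:
    """Stand-in for the thread/queue hand-off; no network involved."""

    def __init__(self, messages: list[str]) -> None:
        self._messages = messages
        self._queue: queue.Queue[str] = queue.Queue()
        self._stop_event = threading.Event()
        self._thread: threading.Thread | None = None

    def _produce(self) -> None:
        # Plays the role of the WebSocket receiver: push each message to the queue.
        for message in self._messages:
            self._queue.put(message)

    def __iter__(self) -> Iterator[str]:
        if self._thread is None or not self._thread.is_alive():
            self._thread = threading.Thread(target=self._produce, daemon=True)
            self._thread.start()
        while not self._stop_event.is_set():
            try:
                # The timeout lets the loop re-check the stop event regularly.
                yield self._queue.get(timeout=0.1)
            except queue.Empty:
                continue

    def close(self) -> None:
        self._stop_event.set()
        if self._thread is not None:
            self._thread.join(timeout=2)


bridge = QueueBridgeSketch(["tick-1", "tick-2", "tick-3"])
received = []
for message in bridge:
    received.append(message)
    if len(received) == 3:
        bridge.close()  # sets the stop event, so the loop ends
```

One consequence of this design is that the iterator never finishes on its own: without a call to `close()`, the loop keeps polling an empty queue. A consumer that is waiting inside `get` notices the stop event only after the timeout expires.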

Source code in pysatl_tsp/core/data_providers/websocket_data_provider.py
def __iter__(self) -> Iterator[str]:
    """Create an iterator over the messages received from the WebSocket.

    This method starts a background thread to handle the WebSocket connection
    if it's not already running, and yields messages as they are received.

    :return: An iterator yielding message strings from the WebSocket
    """
    if self._thread is None or not self._thread.is_alive():
        self._thread = threading.Thread(target=self._thread_main, daemon=True)
        self._thread.start()
    while not self._stop_event.is_set():
        try:
            item = self._iterator_queue.get(timeout=1)
            yield item
        except queue.Empty:
            continue
close
close() -> None

Close the WebSocket connection and stop the background thread.

This method should be called when the provider is no longer needed to release resources properly.
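
One way to make sure `close()` is always called is `contextlib.closing` from the standard library, which works with any object that has a `close()` method. The sketch below uses a hypothetical stand-in, `ClosableProviderSketch`, in place of a real WebSocket provider, and simulates a failure in the consuming code.

```python
from contextlib import closing


class ClosableProviderSketch:
    """Hypothetical stand-in exposing the same close() method as the provider."""

    def __init__(self) -> None:
        self.closed = False

    def __iter__(self):
        yield from ("msg-1", "msg-2", "msg-3")

    def close(self) -> None:
        self.closed = True


seen = []
provider = ClosableProviderSketch()
try:
    # closing() calls provider.close() on exit, even if the body raises.
    with closing(provider) as stream:
        for message in stream:
            seen.append(message)
            if message == "msg-2":
                raise RuntimeError("simulated downstream failure")
except RuntimeError:
    pass
```

A plain `try` / `finally` block achieves the same result; `closing` only saves the boilerplate.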

Source code in pysatl_tsp/core/data_providers/websocket_data_provider.py
def close(self) -> None:
    """Close the WebSocket connection and stop the background thread.

    This method should be called when the provider is no longer needed
    to release resources properly.
    """
    self._stop_event.set()
    if self._thread is not None:
        self._thread.join(timeout=2)
abstract
DataProvider

Bases: Handler[None, T]

Abstract base class for time series data providers.

DataProvider serves as a root handler in a processing pipeline, responsible for sourcing the initial time series data. As the first element in the chain, it doesn't receive input from any preceding handler and acts as the data origin.

This class is designed to be subclassed with specific implementations for different data sources such as files, databases, APIs, or generated data.
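
As an illustration of the subclassing contract, the sketch below defines a provider for generated data. `DataProviderSketch` is a stand-in that reproduces only the abstract `__iter__` requirement, and `SineWaveProvider` with its `n_samples` and `period` parameters is a hypothetical example, not part of the package.

```python
import math
from abc import ABC, abstractmethod
from collections.abc import Iterator
from typing import Generic, TypeVar

T = TypeVar("T")


class DataProviderSketch(ABC, Generic[T]):
    """Stand-in for DataProvider: a root handler with an abstract __iter__."""

    @abstractmethod
    def __iter__(self) -> Iterator[T]: ...


class SineWaveProvider(DataProviderSketch[float]):
    """Hypothetical provider that generates samples instead of reading them."""

    def __init__(self, n_samples: int, period: int) -> None:
        self.n_samples = n_samples
        self.period = period

    def __iter__(self) -> Iterator[float]:
        for i in range(self.n_samples):
            yield math.sin(2 * math.pi * i / self.period)


samples = list(SineWaveProvider(n_samples=4, period=4))
```

Because `__iter__` is abstract, instantiating the base class directly raises `TypeError`; only subclasses that implement it can be created.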

Source code in pysatl_tsp/core/data_providers/abstract.py
class DataProvider(Handler[None, T]):
    """Abstract base class for time series data providers.

    DataProvider serves as a root handler in a processing pipeline, responsible for
    sourcing the initial time series data. As the first element in the chain, it doesn't
    receive input from any preceding handler and acts as the data origin.

    This class is designed to be subclassed with specific implementations for different
    data sources such as files, databases, APIs, or generated data.
    """

    def __init__(self) -> None:
        """Initialize a data provider with no source.

        Data providers are always root handlers and cannot have a source.
        """
        super().__init__(source=None)

    @abstractmethod
    def __iter__(self) -> Iterator[T]:
        """Create an iterator over the time series data provided by this data source.

        Each subclass must implement this method to define how data is sourced and
        potentially pre-processed before being passed to subsequent handlers.

        :return: An iterator yielding time series data items
        """
        pass
__init__
__init__() -> None

Initialize a data provider with no source.

Data providers are always root handlers and cannot have a source.

Source code in pysatl_tsp/core/data_providers/abstract.py
def __init__(self) -> None:
    """Initialize a data provider with no source.

    Data providers are always root handlers and cannot have a source.
    """
    super().__init__(source=None)
__iter__ abstractmethod
__iter__() -> Iterator[T]

Create an iterator over the time series data provided by this data source.

Each subclass must implement this method to define how data is sourced and potentially pre-processed before being passed to subsequent handlers.

Returns:

Type Description
Iterator[T]

An iterator yielding time series data items

Source code in pysatl_tsp/core/data_providers/abstract.py
@abstractmethod
def __iter__(self) -> Iterator[T]:
    """Create an iterator over the time series data provided by this data source.

    Each subclass must implement this method to define how data is sourced and
    potentially pre-processed before being passed to subsequent handlers.

    :return: An iterator yielding time series data items
    """
    pass
database_data_provider
DataBaseDataProvider

Bases: DataProvider[T], Generic[T]

A data provider that sources time series data from a database.

This class provides a way to query time series data from any database system using adapters that implement the DatabaseAdapter interface. It handles the connection lifecycle and streaming of data from database queries.

Parameters:

Name Type Description Default
connection_params dict[str, Any]

Dictionary containing connection parameters

required
query str

SQL query string to execute

required
adapter DatabaseAdapter[T]

Database adapter implementation

required
params tuple[Any, ...]

Query parameters, defaults to ()

()

Example:

# Using the SQLiteAdapter defined in the DatabaseAdapter example

# Create a data provider for SQLite database
provider = DataBaseDataProvider(
    connection_params={"database": "sensors.db"},
    query="SELECT timestamp, value FROM temperature WHERE location_id = ? AND timestamp > ?",
    adapter=SQLiteAdapter(),
    params=("zone-1", "2023-01-01")
)

# Iterate over the provider directly
for record in provider:
    print(f"Time: {record['timestamp']}, Value: {record['value']}")

# Or connect to a processing pipeline
pipeline = provider | WindowHandler(60) | AverageHandler()

Source code in pysatl_tsp/core/data_providers/database_data_provider.py
class DataBaseDataProvider(DataProvider[T], Generic[T]):
    """A data provider that sources time series data from a database.

    This class provides a way to query time series data from any database system
    using adapters that implement the DatabaseAdapter interface. It handles the
    connection lifecycle and streaming of data from database queries.

    :param connection_params: Dictionary containing connection parameters
    :param query: SQL query string to execute
    :param adapter: Database adapter implementation
    :param params: Query parameters, defaults to ()

    Example:
        ```python
        # Using the SQLiteAdapter defined in the DatabaseAdapter example above

        # Create a data provider for SQLite database
        provider = DataBaseDataProvider(
            connection_params={"database": "sensors.db"},
            query="SELECT timestamp, value FROM temperature WHERE location_id = ? AND timestamp > ?",
            adapter=SQLiteAdapter(),
            params=("zone-1", "2023-01-01")
        )

        # Iterate over the provider directly
        for record in provider:
            print(f"Time: {record['timestamp']}, Value: {record['value']}")

        # Or connect to a processing pipeline
        pipeline = provider | WindowHandler(60) | AverageHandler()
        ```
    """

    def __init__(
        self,
        connection_params: dict[str, Any],
        query: str,
        adapter: DatabaseAdapter[T],
        params: tuple[Any, ...] = (),
    ) -> None:
        """Initialize a database data provider.

        :param connection_params: Dictionary containing connection parameters
        :param query: SQL query string to execute
        :param adapter: Database adapter implementation
        :param params: Query parameters, defaults to ()
        """
        super().__init__()
        self._connection_params = connection_params
        self._query = query
        self._params = params
        self._adapter = adapter

    @contextmanager
    def _connection_context(self) -> Any:
        """Context manager for database connection lifecycle.

        This method handles the proper setup and teardown of database connections,
        ensuring that resources are properly released even in the case of errors.

        :return: Database cursor or similar query result object
        :raises Exception: If database operations fail
        """
        connection = None
        cursor = None
        try:
            connection = self._adapter.connect(self._connection_params)
            cursor = self._adapter.execute_query(connection, self._query, self._params)
            yield cursor
        finally:
            if cursor is not None:
                self._adapter.close_cursor(cursor)
            if connection is not None:
                self._adapter.close_connection(connection)

    def __iter__(self) -> Iterator[T]:
        """Create an iterator over the query results from the database.

        This method executes the query and yields data items by delegating
        to the adapter's fetch_data method.

        :return: An iterator yielding data items from the database query
        :raises Exception: If database operations fail
        """
        with self._connection_context() as cursor:
            yield from self._adapter.fetch_data(cursor)
__init__
__init__(
    connection_params: dict[str, Any],
    query: str,
    adapter: DatabaseAdapter[T],
    params: tuple[Any, ...] = (),
) -> None

Initialize a database data provider.

Parameters:

Name Type Description Default
connection_params dict[str, Any]

Dictionary containing connection parameters

required
query str

SQL query string to execute

required
adapter DatabaseAdapter[T]

Database adapter implementation

required
params tuple[Any, ...]

Query parameters, defaults to ()

()
Source code in pysatl_tsp/core/data_providers/database_data_provider.py
def __init__(
    self,
    connection_params: dict[str, Any],
    query: str,
    adapter: DatabaseAdapter[T],
    params: tuple[Any, ...] = (),
) -> None:
    """Initialize a database data provider.

    :param connection_params: Dictionary containing connection parameters
    :param query: SQL query string to execute
    :param adapter: Database adapter implementation
    :param params: Query parameters, defaults to ()
    """
    super().__init__()
    self._connection_params = connection_params
    self._query = query
    self._params = params
    self._adapter = adapter
__iter__
__iter__() -> Iterator[T]

Create an iterator over the query results from the database.

This method executes the query and yields data items by delegating to the adapter's fetch_data method.

Returns:

Type Description
Iterator[T]

An iterator yielding data items from the database query

Raises:

Type Description
Exception

If database operations fail
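
Since `__iter__` is a generator that enters the connection context in its body, the connection is opened when the first item is requested and released when iteration ends or the generator is closed. The sketch below reproduces that lifecycle with the standard `sqlite3` module and an in-memory database. The helper names `connection_context` and `rows`, the `events` log, and the table contents are assumptions made for illustration.

```python
import sqlite3
from collections.abc import Iterator
from contextlib import contextmanager
from typing import Any

events: list[str] = []


@contextmanager
def connection_context(query: str, params: tuple[Any, ...] = ()) -> Iterator[sqlite3.Cursor]:
    """Open an in-memory SQLite database, run the query, and always clean up."""
    connection = sqlite3.connect(":memory:")
    connection.execute("CREATE TABLE temperature (ts INTEGER, value REAL)")
    connection.executemany(
        "INSERT INTO temperature VALUES (?, ?)", [(1, 20.5), (2, 21.0), (3, 21.7)]
    )
    cursor = connection.cursor()
    events.append("opened")
    try:
        cursor.execute(query, params)
        yield cursor
    finally:
        cursor.close()
        connection.close()
        events.append("closed")


def rows(query: str, params: tuple[Any, ...] = ()) -> Iterator[tuple[Any, ...]]:
    # Generator body: nothing runs until the first item is requested.
    with connection_context(query, params) as cursor:
        yield from cursor


stream = rows("SELECT ts, value FROM temperature WHERE ts > ? ORDER BY ts", (1,))
events_before_iteration = list(events)  # still empty: no connection yet
first_row = next(stream)
stream.close()  # abandoning the stream early still runs the cleanup
```

The same reasoning suggests that connection or query errors surface at iteration time rather than when the provider is constructed.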

Source code in pysatl_tsp/core/data_providers/database_data_provider.py
def __iter__(self) -> Iterator[T]:
    """Create an iterator over the query results from the database.

    This method executes the query and yields data items by delegating
    to the adapter's fetch_data method.

    :return: An iterator yielding data items from the database query
    :raises Exception: If database operations fail
    """
    with self._connection_context() as cursor:
        yield from self._adapter.fetch_data(cursor)
DatabaseAdapter

Bases: ABC, Generic[T]

Abstract base class for database adapter implementations.

This class defines the interface for adapters that connect to different database systems. Concrete implementations of this class handle the specific details of connecting to databases, executing queries, and transforming results into a consistent format for the data processing pipeline.

Example:

import sqlite3


class SQLiteAdapter(DatabaseAdapter[dict[str, Any]]):
    def connect(self, connection_params: dict[str, Any]) -> sqlite3.Connection:
        connection = sqlite3.connect(connection_params["database"])
        connection.row_factory = sqlite3.Row
        return connection

    def execute_query(
        self, connection: sqlite3.Connection, query: str, params: tuple[Any, ...] = ()
    ) -> sqlite3.Cursor:
        cursor = connection.cursor()
        cursor.execute(query, params)
        return cursor

    def fetch_data(self, cursor: sqlite3.Cursor) -> Iterator[dict[str, Any]]:
        for row in cursor:
            yield dict(row)

    def close_cursor(self, cursor: sqlite3.Cursor) -> None:
        cursor.close()

    def close_connection(self, connection: sqlite3.Connection) -> None:
        connection.close()


# Usage:
adapter = SQLiteAdapter()
provider = DataBaseDataProvider(
    connection_params={"database": "time_series.db"},
    query="SELECT timestamp, value FROM measurements WHERE sensor_id = ?",
    adapter=adapter,
    params=(42,),
)

Source code in pysatl_tsp/core/data_providers/database_data_provider.py
class DatabaseAdapter(ABC, Generic[T]):
    """Abstract base class for database adapter implementations.

    This class defines the interface for adapters that connect to different database
    systems. Concrete implementations of this class handle the specific details of
    connecting to databases, executing queries, and transforming results into a
    consistent format for the data processing pipeline.

    Example:
        ```python
        import sqlite3


        class SQLiteAdapter(DatabaseAdapter[dict[str, Any]]):
            def connect(self, connection_params: dict[str, Any]) -> sqlite3.Connection:
                connection = sqlite3.connect(connection_params["database"])
                connection.row_factory = sqlite3.Row
                return connection

            def execute_query(
                self, connection: sqlite3.Connection, query: str, params: tuple[Any, ...] = ()
            ) -> sqlite3.Cursor:
                cursor = connection.cursor()
                cursor.execute(query, params)
                return cursor

            def fetch_data(self, cursor: sqlite3.Cursor) -> Iterator[dict[str, Any]]:
                for row in cursor:
                    yield dict(row)

            def close_cursor(self, cursor: sqlite3.Cursor) -> None:
                cursor.close()

            def close_connection(self, connection: sqlite3.Connection) -> None:
                connection.close()


        # Usage:
        adapter = SQLiteAdapter()
        provider = DataBaseDataProvider(
            connection_params={"database": "time_series.db"},
            query="SELECT timestamp, value FROM measurements WHERE sensor_id = ?",
            adapter=adapter,
            params=(42,),
        )
        ```
    """

    @abstractmethod
    def connect(self, connection_params: dict[str, Any]) -> Any:
        """Establish a connection to the database.

        :param connection_params: Dictionary containing connection parameters
                                 (e.g., host, port, username, password, etc.)
        :return: Database connection object
        :raises Exception: If connection fails
        """
        pass

    @abstractmethod
    def execute_query(self, connection: Any, query: str, params: tuple[Any, ...] = ()) -> Any:
        """Execute a query on the given database connection.

        :param connection: Database connection object
        :param query: SQL query string
        :param params: Query parameters, defaults to ()
        :return: Query result cursor or similar object
        :raises Exception: If query execution fails
        """
        pass

    @abstractmethod
    def fetch_data(self, cursor: Any) -> Iterator[T]:
        """Extract and transform data from the query result.

        :param cursor: Query result cursor
        :return: Iterator yielding data items of type T
        """
        pass

    @abstractmethod
    def close_cursor(self, cursor: Any) -> None:
        """Close the query cursor.

        :param cursor: Query result cursor
        """
        pass

    @abstractmethod
    def close_connection(self, connection: Any) -> None:
        """Close the database connection.

        :param connection: Database connection object
        """
        pass
close_connection abstractmethod
close_connection(connection: Any) -> None

Close the database connection.

Parameters:

Name Type Description Default
connection Any

Database connection object

required
Source code in pysatl_tsp/core/data_providers/database_data_provider.py
@abstractmethod
def close_connection(self, connection: Any) -> None:
    """Close the database connection.

    :param connection: Database connection object
    """
    pass
close_cursor abstractmethod
close_cursor(cursor: Any) -> None

Close the query cursor.

Parameters:

Name Type Description Default
cursor Any

Query result cursor

required
Source code in pysatl_tsp/core/data_providers/database_data_provider.py
@abstractmethod
def close_cursor(self, cursor: Any) -> None:
    """Close the query cursor.

    :param cursor: Query result cursor
    """
    pass
connect abstractmethod
connect(connection_params: dict[str, Any]) -> Any

Establish a connection to the database.

Parameters:

Name Type Description Default
connection_params dict[str, Any]

Dictionary containing connection parameters (e.g., host, port, username, password, etc.)

required

Returns:

Type Description
Any

Database connection object

Raises:

Type Description
Exception

If connection fails

Source code in pysatl_tsp/core/data_providers/database_data_provider.py
@abstractmethod
def connect(self, connection_params: dict[str, Any]) -> Any:
    """Establish a connection to the database.

    :param connection_params: Dictionary containing connection parameters
                             (e.g., host, port, username, password, etc.)
    :return: Database connection object
    :raises Exception: If connection fails
    """
    pass
execute_query abstractmethod
execute_query(
    connection: Any,
    query: str,
    params: tuple[Any, ...] = (),
) -> Any

Execute a query on the given database connection.

Parameters:

Name Type Description Default
connection Any

Database connection object

required
query str

SQL query string

required
params tuple[Any, ...]

Query parameters, defaults to ()

()

Returns:

Type Description
Any

Query result cursor or similar object

Raises:

Type Description
Exception

If query execution fails

Source code in pysatl_tsp/core/data_providers/database_data_provider.py
@abstractmethod
def execute_query(self, connection: Any, query: str, params: tuple[Any, ...] = ()) -> Any:
    """Execute a query on the given database connection.

    :param connection: Database connection object
    :param query: SQL query string
    :param params: Query parameters, defaults to ()
    :return: Query result cursor or similar object
    :raises Exception: If query execution fails
    """
    pass
fetch_data abstractmethod
fetch_data(cursor: Any) -> Iterator[T]

Extract and transform data from the query result.

Parameters:

Name Type Description Default
cursor Any

Query result cursor

required

Returns:

Type Description
Iterator[T]

Iterator yielding data items of type T
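
An adapter is free to choose how it pulls rows from the cursor. As one possible variant, the sketch below fetches rows in batches with `sqlite3.Cursor.fetchmany` and converts each row to a dictionary using the column names in `cursor.description`. The function name `fetch_in_batches`, the `batch_size` parameter, and the sample table are assumptions for illustration; in an adapter the body would live in `fetch_data`.

```python
import sqlite3
from collections.abc import Iterator
from typing import Any


def fetch_in_batches(cursor: sqlite3.Cursor, batch_size: int = 2) -> Iterator[dict[str, Any]]:
    """Yield rows as dictionaries, pulling batch_size rows per fetchmany call."""
    columns = [description[0] for description in cursor.description]
    while True:
        batch = cursor.fetchmany(batch_size)
        if not batch:
            break
        for row in batch:
            yield dict(zip(columns, row))


connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE measurements (timestamp INTEGER, value REAL)")
connection.executemany(
    "INSERT INTO measurements VALUES (?, ?)", [(1, 0.5), (2, 0.7), (3, 0.9)]
)
cursor = connection.execute("SELECT timestamp, value FROM measurements ORDER BY timestamp")
records = list(fetch_in_batches(cursor))
cursor.close()
connection.close()
```

Yielding rows one at a time keeps the adapter compatible with the streaming style of the pipeline, whatever batch size is used internally.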

Source code in pysatl_tsp/core/data_providers/database_data_provider.py
@abstractmethod
def fetch_data(self, cursor: Any) -> Iterator[T]:
    """Extract and transform data from the query result.

    :param cursor: Query result cursor
    :return: Iterator yielding data items of type T
    """
    pass
file_data_provider
FileDataProvider

Bases: DataProvider[X]

A data provider that reads time series data from a text file.

This class implements a file-based data source that reads a file line by line and transforms each line into a data item using a specified handler function. It is useful for processing time series data stored in text files with various formats (CSV, JSON per line, custom formats, etc.).

Parameters:

Name Type Description Default
filename str

Path to the file containing time series data

required
handler Callable[[str], X]

A function that converts each line of text to a data item

required

Raises:

Type Description
FileNotFoundError

If the specified file does not exist

PermissionError

If the file cannot be read due to permission issues
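
A handler for comma-separated lines might look like the sketch below. `read_lines` is a stand-in with the same body as `FileDataProvider.__iter__`, and `parse_reading` and the `timestamp,value` file layout are hypothetical. Each line is passed to the handler exactly as read, including its trailing newline, so the handler strips it before parsing.

```python
import os
import tempfile
from collections.abc import Callable, Iterator
from typing import TypeVar

X = TypeVar("X")


def read_lines(filename: str, handler: Callable[[str], X]) -> Iterator[X]:
    """Stand-in for FileDataProvider.__iter__: apply handler to every line."""
    with open(filename) as f:
        for line in f:
            yield handler(line)


def parse_reading(line: str) -> tuple[str, float]:
    """Hypothetical handler for 'timestamp,value' lines."""
    # Each line still carries its trailing newline, so strip before splitting.
    timestamp, value = line.strip().split(",")
    return timestamp, float(value)


# Write a small sample file so the sketch runs on its own.
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False) as tmp:
    tmp.write("2023-01-01T00:00:00,20.5\n2023-01-01T00:01:00,21.0\n")
    path = tmp.name

try:
    readings = list(read_lines(path, parse_reading))
finally:
    os.remove(path)
```

For a file with one JSON document per line, `json.loads` could be passed as the handler directly.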

Source code in pysatl_tsp/core/data_providers/file_data_provider.py
class FileDataProvider(DataProvider[X]):
    """A data provider that reads time series data from a text file.

    This class implements a file-based data source that reads a file line by line
    and transforms each line into a data item using a specified handler function.
    It is useful for processing time series data stored in text files with various
    formats (CSV, JSON per line, custom formats, etc.).

    :param filename: Path to the file containing time series data
    :param handler: A function that converts each line of text to a data item

    :raises FileNotFoundError: If the specified file does not exist
    :raises PermissionError: If the file cannot be read due to permission issues
    """

    def __init__(self, filename: str, handler: Callable[[str], X]) -> None:
        """Initialize a file data provider.

        :param filename: Path to the file containing time series data
        :param handler: A function that converts each line of text to a data item
        """
        super().__init__()
        self.filename = filename
        self.handler = handler

    def __iter__(self) -> Iterator[X]:
        """Create an iterator that reads the file line by line and yields processed data items.

        The method opens the specified file, reads it line by line, and applies the handler
        function to each line to transform it into a data item.

        :return: An iterator yielding processed data items from the file
        :raises FileNotFoundError: If the specified file does not exist
        :raises PermissionError: If the file cannot be read due to permission issues
        """
        with open(self.filename) as f:
            for line in f:
                yield self.handler(line)
__init__
__init__(
    filename: str, handler: Callable[[str], X]
) -> None

Initialize a file data provider.

Parameters:

Name Type Description Default
filename str

Path to the file containing time series data

required
handler Callable[[str], X]

A function that converts each line of text to a data item

required
Source code in pysatl_tsp/core/data_providers/file_data_provider.py
def __init__(self, filename: str, handler: Callable[[str], X]) -> None:
    """Initialize a file data provider.

    :param filename: Path to the file containing time series data
    :param handler: A function that converts each line of text to a data item
    """
    super().__init__()
    self.filename = filename
    self.handler = handler
__iter__
__iter__() -> Iterator[X]

Create an iterator that reads the file line by line and yields processed data items.

The method opens the specified file, reads it line by line, and applies the handler function to each line to transform it into a data item.

Returns:

Type Description
Iterator[X]

An iterator yielding processed data items from the file

Raises:

Type Description
FileNotFoundError

If the specified file does not exist

PermissionError

If the file cannot be read due to permission issues
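
Because the file is opened inside a generator, these exceptions are raised when the first item is requested, not when the provider is created. The sketch below demonstrates this timing with a stand-in generator function, `read_lines`, and a path that is known not to exist; both are assumptions for illustration.

```python
import os
import tempfile
from collections.abc import Iterator


def read_lines(filename: str) -> Iterator[str]:
    """Stand-in generator with the same open-inside-the-generator shape."""
    with open(filename) as f:
        yield from f


with tempfile.TemporaryDirectory() as directory:
    missing_path = os.path.join(directory, "missing.txt")
    stream = read_lines(missing_path)  # no error yet: the body has not started
    try:
        next(stream)
        raised_on = "nothing"
    except FileNotFoundError:
        raised_on = "first item"
```

Code that needs to fail early on a bad path can check the file before building the pipeline.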

Source code in pysatl_tsp/core/data_providers/file_data_provider.py
def __iter__(self) -> Iterator[X]:
    """Create an iterator that reads the file line by line and yields processed data items.

    The method opens the specified file, reads it line by line, and applies the handler
    function to each line to transform it into a data item.

    :return: An iterator yielding processed data items from the file
    :raises FileNotFoundError: If the specified file does not exist
    :raises PermissionError: If the file cannot be read due to permission issues
    """
    with open(self.filename) as f:
        for line in f:
            yield self.handler(line)
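
A minimal usage sketch of the file provider documented above. So that it runs without the package installed, `FileProviderSketch` is a local stand-in that copies the `__init__` and `__iter__` bodies from the listing and omits the `DataProvider` base class. The temporary file and the `parse_line` helper are hypothetical and exist only for illustration.

```python
import os
import tempfile
from collections.abc import Callable, Iterator
from typing import Generic, TypeVar

X = TypeVar("X")


class FileProviderSketch(Generic[X]):
    """Local stand-in mirroring the __init__/__iter__ bodies listed above."""

    def __init__(self, filename: str, handler: Callable[[str], X]) -> None:
        self.filename = filename
        self.handler = handler

    def __iter__(self) -> Iterator[X]:
        # The file is opened only once iteration starts, so a missing file
        # surfaces at the first next() call rather than at construction.
        with open(self.filename) as f:
            for line in f:
                yield self.handler(line)


def parse_line(line: str) -> float:
    # Hypothetical handler: each line still carries its trailing newline.
    return float(line.strip())


# Write three sample values to a temporary file.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("1.5\n2.5\n4.0\n")
    path = tmp.name

provider = FileProviderSketch(path, parse_line)
values = list(provider)
os.remove(path)
print(values)
```

Because `__iter__` is a generator function, constructing the provider never touches the file system; `FileNotFoundError` and `PermissionError` can only appear once a consumer starts pulling items.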
simple_data_provider
SimpleDataProvider

Bases: DataProvider[T]

A data provider that serves data from an in-memory iterable collection.

This class implements a simple data provider that wraps around any iterable collection and makes it available as a data source in a processing pipeline. It's useful for testing, working with pre-loaded data, or creating pipelines that process data already in memory.

Parameters:

Name Type Description Default
data Iterable[T]

An iterable collection containing the time series data

required
Source code in pysatl_tsp/core/data_providers/simple_data_provider.py
class SimpleDataProvider(DataProvider[T]):
    """A data provider that serves data from an in-memory iterable collection.

    This class implements a simple data provider that wraps around any iterable
    collection and makes it available as a data source in a processing pipeline.
    It's useful for testing, working with pre-loaded data, or creating pipelines
    that process data already in memory.

    :param data: An iterable collection containing the time series data
    """

    def __init__(self, data: Iterable[T]) -> None:
        """Initialize a simple data provider with an iterable data source.

        :param data: An iterable collection containing the time series data
        """
        super().__init__()
        self.data = data

    def __iter__(self) -> Iterator[T]:
        """Create an iterator over the provided data collection.

        This method simply yields items from the data collection that was passed
        during initialization, making them available to subsequent handlers in
        the processing pipeline.

        :return: An iterator yielding items from the data collection
        """
        yield from self.data
__init__
__init__(data: Iterable[T]) -> None

Initialize a simple data provider with an iterable data source.

Parameters:

Name Type Description Default
data Iterable[T]

An iterable collection containing the time series data

required
Source code in pysatl_tsp/core/data_providers/simple_data_provider.py
def __init__(self, data: Iterable[T]) -> None:
    """Initialize a simple data provider with an iterable data source.

    :param data: An iterable collection containing the time series data
    """
    super().__init__()
    self.data = data
__iter__
__iter__() -> Iterator[T]

Create an iterator over the provided data collection.

This method simply yields items from the data collection that was passed during initialization, making them available to subsequent handlers in the processing pipeline.

Returns:

Type Description
Iterator[T]

An iterator yielding items from the data collection

Source code in pysatl_tsp/core/data_providers/simple_data_provider.py
def __iter__(self) -> Iterator[T]:
    """Create an iterator over the provided data collection.

    This method simply yields items from the data collection that was passed
    during initialization, making them available to subsequent handlers in
    the processing pipeline.

    :return: An iterator yielding items from the data collection
    """
    yield from self.data
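
Since `__iter__` simply delegates to the wrapped iterable, how often the provider can be iterated depends on what was passed in. The sketch below illustrates this with `SimpleProviderSketch`, a local stand-in that repeats the listed `__init__` and `__iter__` bodies without the `DataProvider` base class; it is an illustration, not the packaged class.

```python
from collections.abc import Iterable, Iterator
from typing import Generic, TypeVar

T = TypeVar("T")


class SimpleProviderSketch(Generic[T]):
    """Local stand-in mirroring the listed SimpleDataProvider behaviour."""

    def __init__(self, data: Iterable[T]) -> None:
        self.data = data

    def __iter__(self) -> Iterator[T]:
        yield from self.data


# A list can be iterated repeatedly, so the provider can be too.
from_list = SimpleProviderSketch([1, 2, 3])
first_pass = list(from_list)
second_pass = list(from_list)

# A generator is exhausted after one pass; the second pass is empty.
from_generator = SimpleProviderSketch(x * 10 for x in range(3))
gen_first = list(from_generator)
gen_second = list(from_generator)

print(first_pass, second_pass)
print(gen_first, gen_second)
```

If a pipeline needs to be run more than once, wrapping a list or tuple rather than a one-shot generator avoids a silently empty second run.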
websocket_data_provider
WebSocketDataProvider

Bases: DataProvider[str]

A data provider that streams time series data from a WebSocket connection.

This class establishes a WebSocket connection to a specified URI and streams received messages into the processing pipeline. It handles the WebSocket connection in a separate thread to avoid blocking the main processing flow and provides a clean streaming interface through the standard iterator protocol.

Parameters:

Name Type Description Default
uri str

WebSocket endpoint URI

required
subscribe_message dict[str, Any] | None

Optional message to send after connection to subscribe to specific data streams

None
Source code in pysatl_tsp/core/data_providers/websocket_data_provider.py
class WebSocketDataProvider(DataProvider[str]):
    """A data provider that streams time series data from a WebSocket connection.

    This class establishes a WebSocket connection to a specified URI and streams
    received messages into the processing pipeline. It handles the WebSocket connection
    in a separate thread to avoid blocking the main processing flow and provides
    a clean streaming interface through the standard iterator protocol.

    :param uri: WebSocket endpoint URI
    :param subscribe_message: Optional message to send after connection to subscribe to specific data streams

    Example:
        ```python
        # Example: Connecting to Bybit WebSocket API for Bitcoin price data
        import json

        bybit_provider = WebSocketDataProvider(
            uri="wss://stream.bybit.com/v5/public/spot",
            subscribe_message={"op": "subscribe", "args": ["tickers.BTCUSDT"]},
        )

        try:
            for message in bybit_provider:
                data = json.loads(message)
                if "data" in data and data.get("topic") == "tickers.BTCUSDT":
                    price_data = data["data"]
                    print(f"BTC/USDT: {price_data['lastPrice']} (Time: {price_data['timestamp']})")
        except KeyboardInterrupt:
            bybit_provider.close()
        ```
    """

    def __init__(self, uri: str, subscribe_message: dict[str, Any] | None = None) -> None:
        """Initialize a WebSocket data provider.

        :param uri: WebSocket endpoint URI
        :param subscribe_message: Optional message to send after connection to subscribe to specific data streams
        """
        super().__init__()
        self._uri = uri
        self._subscribe_message = subscribe_message
        self._iterator_queue: queue.Queue[str] = queue.Queue()
        self._stop_event = threading.Event()
        self._thread: threading.Thread | None = None

    def __iter__(self) -> Iterator[str]:
        """Create an iterator over the messages received from the WebSocket.

        This method starts a background thread to handle the WebSocket connection
        if it's not already running, and yields messages as they are received.

        :return: An iterator yielding message strings from the WebSocket
        """
        if self._thread is None or not self._thread.is_alive():
            self._thread = threading.Thread(target=self._thread_main, daemon=True)
            self._thread.start()
        while not self._stop_event.is_set():
            try:
                item = self._iterator_queue.get(timeout=1)
                yield item
            except queue.Empty:
                continue

    def _thread_main(self) -> None:
        """Entry point for the background thread.

        This method runs the asyncio event loop that handles the WebSocket connection.
        """
        asyncio.run(self._receiver())

    async def _receiver(self) -> None:
        """Asynchronous method to handle WebSocket communication.

        This method establishes the WebSocket connection, sends the subscription message
        if provided, and places received messages into the queue for the iterator.
        """
        try:
            async with websockets.connect(self._uri) as ws:
                if self._subscribe_message is not None:
                    await ws.send(json.dumps(self._subscribe_message))
                async for msg in ws:
                    try:
                        if msg is not None:
                            self._iterator_queue.put(str(msg))
                    except Exception:
                        continue
        except Exception:
            pass

    def close(self) -> None:
        """Close the WebSocket connection and stop the background thread.

        This method should be called when the provider is no longer needed
        to release resources properly.
        """
        self._stop_event.set()
        if self._thread is not None:
            self._thread.join(timeout=2)
__init__
__init__(
    uri: str,
    subscribe_message: dict[str, Any] | None = None,
) -> None

Initialize a WebSocket data provider.

Parameters:

Name Type Description Default
uri str

WebSocket endpoint URI

required
subscribe_message dict[str, Any] | None

Optional message to send after connection to subscribe to specific data streams

None
Source code in pysatl_tsp/core/data_providers/websocket_data_provider.py
def __init__(self, uri: str, subscribe_message: dict[str, Any] | None = None) -> None:
    """Initialize a WebSocket data provider.

    :param uri: WebSocket endpoint URI
    :param subscribe_message: Optional message to send after connection to subscribe to specific data streams
    """
    super().__init__()
    self._uri = uri
    self._subscribe_message = subscribe_message
    self._iterator_queue: queue.Queue[str] = queue.Queue()
    self._stop_event = threading.Event()
    self._thread: threading.Thread | None = None
__iter__
__iter__() -> Iterator[str]

Create an iterator over the messages received from the WebSocket.

This method starts a background thread to handle the WebSocket connection if it's not already running, and yields messages as they are received.

Returns:

Type Description
Iterator[str]

An iterator yielding message strings from the WebSocket

Source code in pysatl_tsp/core/data_providers/websocket_data_provider.py
def __iter__(self) -> Iterator[str]:
    """Create an iterator over the messages received from the WebSocket.

    This method starts a background thread to handle the WebSocket connection
    if it's not already running, and yields messages as they are received.

    :return: An iterator yielding message strings from the WebSocket
    """
    if self._thread is None or not self._thread.is_alive():
        self._thread = threading.Thread(target=self._thread_main, daemon=True)
        self._thread.start()
    while not self._stop_event.is_set():
        try:
            item = self._iterator_queue.get(timeout=1)
            yield item
        except queue.Empty:
            continue
close
close() -> None

Close the WebSocket connection and stop the background thread.

Call this method when the provider is no longer needed so that its resources are released properly.

Source code in pysatl_tsp/core/data_providers/websocket_data_provider.py
def close(self) -> None:
    """Close the WebSocket connection and stop the background thread.

    This method should be called when the provider is no longer needed
    to release resources properly.
    """
    self._stop_event.set()
    if self._thread is not None:
        self._thread.join(timeout=2)
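
The interplay between the background thread, the queue, and the stop event can be sketched without a network connection. In the example below, `QueueBackedProviderSketch` is a hypothetical stand-in: a plain producer thread replaces the `asyncio`/`websockets` receiver, and the polling timeout is shortened. The iterator loop and `close()` follow the listing above.

```python
import itertools
import queue
import threading
from collections.abc import Iterator
from typing import Optional


class QueueBackedProviderSketch:
    """Hypothetical stand-in: a background thread feeds a queue, __iter__ drains it."""

    def __init__(self, messages: list[str]) -> None:
        self._messages = messages
        self._queue: queue.Queue[str] = queue.Queue()
        self._stop_event = threading.Event()
        self._thread: Optional[threading.Thread] = None

    def _produce(self) -> None:
        # Stands in for the WebSocket receiver: push each message to the queue.
        for msg in self._messages:
            self._queue.put(msg)

    def __iter__(self) -> Iterator[str]:
        if self._thread is None or not self._thread.is_alive():
            self._thread = threading.Thread(target=self._produce, daemon=True)
            self._thread.start()
        # Poll with a timeout so the loop can notice the stop event.
        while not self._stop_event.is_set():
            try:
                yield self._queue.get(timeout=0.1)
            except queue.Empty:
                continue

    def close(self) -> None:
        self._stop_event.set()
        if self._thread is not None:
            self._thread.join(timeout=2)


provider = QueueBackedProviderSketch(["tick-1", "tick-2", "tick-3"])

# islice stops after three items, so the loop never waits for a fourth.
received = list(itertools.islice(provider, 3))
provider.close()

# The stop event is never cleared, so iterating again yields nothing.
after_close = list(provider)
print(received, after_close)
```

The iterator itself never ends while the stop event is unset: an idle stream just keeps polling. Consumers therefore need their own exit condition (a bounded `islice`, a `break`, or a `close()` call from elsewhere), and a provider that has been closed appears, from the listing, not to be reusable.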

handler

Handler

Bases: ABC, Generic[T, U]

Abstract base class for time series processing handlers.

This class implements a Chain of Responsibility pattern for processing time series data. Each Handler can be connected to a source handler and process its output data. Handlers can be combined using the pipe operator (|) to create processing pipelines.

Parameters:

Name Type Description Default
source Handler[Any, T] | None

The handler to use as a data source, defaults to None

None
Source code in pysatl_tsp/core/handler.py
class Handler(ABC, Generic[T, U]):
    """Abstract base class for time series processing handlers.

    This class implements a Chain of Responsibility pattern for processing time series data.
    Each Handler can be connected to a source handler and process its output data.
    Handlers can be combined using the pipe operator (|) to create processing pipelines.

    :param source: The handler to use as a data source, defaults to None
    """

    def __init__(self, source: Handler[Any, T] | None = None):
        """Initialize a handler with an optional source.

        :param source: The handler to use as a data source, defaults to None
        """
        self._source = source

    @property
    def source(self) -> Handler[Any, T] | None:
        """Get the source handler that provides input data to this handler.

        :return: The source handler or None if this is a root handler
        """
        return self._source

    @source.setter
    def source(self, value: Handler[Any, T]) -> None:
        """Set the source handler for this handler.

        :param value: The handler to use as a data source
        :raises RuntimeError: If the source has already been set
        """
        if self._source is not None:
            raise RuntimeError("Cannot change already setted source")
        self._source = value

    @abstractmethod
    def __iter__(self) -> Iterator[U]:
        """Create an iterator over the output data produced by this handler.

        Each subclass must implement this method to define how data is processed.

        :return: An iterator yielding processed data items
        """
        pass

    def __or__(self, other: Handler[U, V]) -> Pipeline[T, V]:
        """Combine this handler with another handler using the pipe operator.

        This allows for the creation of processing pipelines using syntax like:
        handler1 | handler2 | handler3

        :param other: The next handler in the pipeline
        :return: A Pipeline object connecting this handler to the other handler
        """
        return Pipeline(self, other)
source property writable
source: Handler[Any, T] | None

Get the source handler that provides input data to this handler.

Returns:

Type Description
Handler[Any, T] | None

The source handler or None if this is a root handler

__init__
__init__(source: Handler[Any, T] | None = None)

Initialize a handler with an optional source.

Parameters:

Name Type Description Default
source Handler[Any, T] | None

The handler to use as a data source, defaults to None

None
Source code in pysatl_tsp/core/handler.py
def __init__(self, source: Handler[Any, T] | None = None):
    """Initialize a handler with an optional source.

    :param source: The handler to use as a data source, defaults to None
    """
    self._source = source
__iter__ abstractmethod
__iter__() -> Iterator[U]

Create an iterator over the output data produced by this handler.

Each subclass must implement this method to define how data is processed.

Returns:

Type Description
Iterator[U]

An iterator yielding processed data items

Source code in pysatl_tsp/core/handler.py
@abstractmethod
def __iter__(self) -> Iterator[U]:
    """Create an iterator over the output data produced by this handler.

    Each subclass must implement this method to define how data is processed.

    :return: An iterator yielding processed data items
    """
    pass
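
As an illustration of what implementing `__iter__` in a subclass might look like, the sketch below defines two hypothetical handlers on top of `HandlerSketch`, a local stand-in that repeats only the constructor, the read-only part of the `source` property, and the abstract `__iter__` from the listing. `RangeSource` and `RunningSum` are invented for this example and are not part of the package.

```python
from abc import ABC, abstractmethod
from collections.abc import Iterator
from typing import Any, Generic, Optional, TypeVar

T = TypeVar("T")
U = TypeVar("U")


class HandlerSketch(ABC, Generic[T, U]):
    """Local stand-in with the same constructor and abstract __iter__ as above."""

    def __init__(self, source: Optional["HandlerSketch[Any, T]"] = None):
        self._source = source

    @property
    def source(self) -> Optional["HandlerSketch[Any, T]"]:
        return self._source

    @abstractmethod
    def __iter__(self) -> Iterator[U]:
        ...


class RangeSource(HandlerSketch[None, int]):
    """Hypothetical root handler: has no source and emits 0..n-1."""

    def __init__(self, n: int) -> None:
        super().__init__()
        self.n = n

    def __iter__(self) -> Iterator[int]:
        yield from range(self.n)


class RunningSum(HandlerSketch[int, int]):
    """Hypothetical handler that emits the cumulative sum of its source."""

    def __iter__(self) -> Iterator[int]:
        if self.source is None:
            raise ValueError("Source is not set")
        total = 0
        for item in self.source:
            total += item
            yield total


sums = list(RunningSum(source=RangeSource(5)))
print(sums)  # [0, 1, 3, 6, 10]

# The abstract base itself cannot be instantiated.
try:
    HandlerSketch()
    instantiated = True
except TypeError:
    instantiated = False
print(instantiated)  # False
```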
__or__
__or__(other: Handler[U, V]) -> Pipeline[T, V]

Combine this handler with another handler using the pipe operator.

This enables building processing pipelines with syntax such as: handler1 | handler2 | handler3

Parameters:

Name Type Description Default
other Handler[U, V]

The next handler in the pipeline

required

Returns:

Type Description
Pipeline[T, V]

A Pipeline object connecting this handler to the other handler

Source code in pysatl_tsp/core/handler.py
def __or__(self, other: Handler[U, V]) -> Pipeline[T, V]:
    """Combine this handler with another handler using the pipe operator.

    This allows for the creation of processing pipelines using syntax like:
    handler1 | handler2 | handler3

    :param other: The next handler in the pipeline
    :return: A Pipeline object connecting this handler to the other handler
    """
    return Pipeline(self, other)
Pipeline

Bases: Handler[T, V]

A composite handler that connects two handlers in sequence.

The Pipeline takes output from the first handler and feeds it as input to the second handler. This class enables the creation of data processing chains, where each handler in the chain performs a specific transformation on the data.

Parameters:

Name Type Description Default
first Handler[T, U]

The first handler in the pipeline

required
second Handler[U, V]

The second handler in the pipeline

required

Raises:

Type Description
ValueError

If the second handler already has a source configured

Source code in pysatl_tsp/core/handler.py
class Pipeline(Handler[T, V]):
    """A composite handler that connects two handlers in sequence.

    The Pipeline takes output from the first handler and feeds it as input to the second handler.
    This class enables the creation of data processing chains, where each handler in the chain
    performs a specific transformation on the data.

    :param first: The first handler in the pipeline
    :param second: The second handler in the pipeline
    :raises ValueError: If the second handler already has a source configured
    """

    def __init__(self, first: Handler[T, U], second: Handler[U, V]):
        """Initialize a pipeline with two handlers.

        :param first: The first handler in the pipeline
        :param second: The second handler in the pipeline
        :raises ValueError: If the second handler already has a source configured
        """
        super().__init__()
        self.first = first

        if second.source is not None:
            raise ValueError(
                f"Cannot create Pipeline: second handler {type(second).__name__} "
                f"already has a source {type(second.source).__name__}. "
                "Use explicit set_source() or rebuild pipeline chain."
            )

        self.second = second
        self.second.source = self.first

    def __iter__(self) -> Iterator[V]:
        """Create an iterator that processes data through both handlers in sequence.

        :return: An iterator yielding data processed through both handlers
        """
        self.second_iterator = iter(self.second)
        return self.second_iterator
__init__
__init__(first: Handler[T, U], second: Handler[U, V])

Initialize a pipeline with two handlers.

Parameters:

Name Type Description Default
first Handler[T, U]

The first handler in the pipeline

required
second Handler[U, V]

The second handler in the pipeline

required

Raises:

Type Description
ValueError

If the second handler already has a source configured

Source code in pysatl_tsp/core/handler.py
def __init__(self, first: Handler[T, U], second: Handler[U, V]):
    """Initialize a pipeline with two handlers.

    :param first: The first handler in the pipeline
    :param second: The second handler in the pipeline
    :raises ValueError: If the second handler already has a source configured
    """
    super().__init__()
    self.first = first

    if second.source is not None:
        raise ValueError(
            f"Cannot create Pipeline: second handler {type(second).__name__} "
            f"already has a source {type(second.source).__name__}. "
            "Use explicit set_source() or rebuild pipeline chain."
        )

    self.second = second
    self.second.source = self.first
__iter__
__iter__() -> Iterator[V]

Create an iterator that processes data through both handlers in sequence.

Returns:

Type Description
Iterator[V]

An iterator yielding data processed through both handlers

Source code in pysatl_tsp/core/handler.py
def __iter__(self) -> Iterator[V]:
    """Create an iterator that processes data through both handlers in sequence.

    :return: An iterator yielding data processed through both handlers
    """
    self.second_iterator = iter(self.second)
    return self.second_iterator
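
The following sketch shows how chaining with `|` nests pipelines and why a handler instance cannot be piped twice. `HandlerSketch` and `PipelineSketch` are untyped local stand-ins that follow the listed logic (error messages shortened); `ListSource` and `Apply` are hypothetical handlers invented for the example.

```python
from abc import ABC, abstractmethod
from collections.abc import Callable, Iterable, Iterator
from typing import Any, Optional


class HandlerSketch(ABC):
    """Untyped local stand-in for Handler (generics omitted for brevity)."""

    def __init__(self, source: Optional["HandlerSketch"] = None) -> None:
        self._source = source

    @property
    def source(self) -> Optional["HandlerSketch"]:
        return self._source

    @source.setter
    def source(self, value: "HandlerSketch") -> None:
        if self._source is not None:
            raise RuntimeError("source is already set")
        self._source = value

    @abstractmethod
    def __iter__(self) -> Iterator[Any]:
        ...

    def __or__(self, other: "HandlerSketch") -> "PipelineSketch":
        return PipelineSketch(self, other)


class PipelineSketch(HandlerSketch):
    """Stand-in for Pipeline: wires first into second, then iterates second."""

    def __init__(self, first: HandlerSketch, second: HandlerSketch) -> None:
        super().__init__()
        self.first = first
        if second.source is not None:
            raise ValueError(f"{type(second).__name__} already has a source")
        self.second = second
        self.second.source = self.first

    def __iter__(self) -> Iterator[Any]:
        return iter(self.second)


class ListSource(HandlerSketch):
    """Hypothetical root handler wrapping an in-memory iterable."""

    def __init__(self, data: Iterable[Any]) -> None:
        super().__init__()
        self.data = data

    def __iter__(self) -> Iterator[Any]:
        yield from self.data


class Apply(HandlerSketch):
    """Hypothetical handler applying a function to each item of its source."""

    def __init__(self, func: Callable[[Any], Any]) -> None:
        super().__init__()
        self.func = func

    def __iter__(self) -> Iterator[Any]:
        if self.source is None:
            raise ValueError("Source is not set")
        for item in self.source:
            yield self.func(item)


double = Apply(lambda x: x * 2)

# (ListSource | double) becomes the source of the final Apply.
pipeline = ListSource([1, 2, 3]) | double | Apply(lambda x: x + 1)
result = list(pipeline)
print(result)  # [3, 5, 7]

# `double` is already wired to a source, so piping into it again fails.
try:
    ListSource([9]) | double
    reuse_error = None
except ValueError as exc:
    reuse_error = str(exc)
print(reuse_error)  # Apply already has a source
```

Each `|` produces a two-element pipeline whose `first` may itself be a pipeline, so a chain of three handlers is two nested `Pipeline` objects. Since wiring mutates `second.source`, building two different chains requires separate handler instances.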

processor

This module provides the processor functionality for the pysatl_tsp package.

MappingHandler

Bases: Handler[T, U]

A handler that transforms time series data by applying a mapping function to each item.

This handler applies a user-defined transformation function to each data point in the input stream, producing a new stream of transformed values. It's useful for simple point-by-point transformations such as scaling, type conversion, feature extraction, or any operation that processes one input item at a time.

Parameters:

Name Type Description Default
map_func Callable[[T], U]

Function that transforms each input item to an output item

required
source Handler[Any, T] | None

The handler providing input data, defaults to None

None
Source code in pysatl_tsp/core/processor/mapping_handler.py
class MappingHandler(Handler[T, U]):
    """A handler that transforms time series data by applying a mapping function to each item.

    This handler applies a user-defined transformation function to each data point
    in the input stream, producing a new stream of transformed values. It's useful for
    simple point-by-point transformations such as scaling, type conversion, feature
    extraction, or any operation that processes one input item at a time.

    :param map_func: Function that transforms each input item to an output item
    :param source: The handler providing input data, defaults to None

    Example:
        ```python
        # Create a data source with numeric values
        data_source = SimpleDataProvider([1, 2, 3, 4, 5])


        # Simple mapping function to square each value
        def square(x: int) -> int:
            return x * x


        # Create a mapping handler
        mapper = MappingHandler(map_func=square, source=data_source)

        # Process the data
        for transformed in mapper:
            print(transformed)

        # Output:
        # 1
        # 4
        # 9
        # 16
        # 25

        # Example with a more complex transformation
        import json

        # Data source with JSON strings
        json_data = [
            '{"timestamp": "2023-09-01T10:00:00", "value": 42.5}',
            '{"timestamp": "2023-09-01T10:01:00", "value": 43.2}',
            '{"timestamp": "2023-09-01T10:02:00", "value": 41.8}',
        ]
        json_source = SimpleDataProvider(json_data)


        # Function to extract timestamp and value from JSON
        def parse_json(json_str: str) -> tuple[str, float]:
            data = json.loads(json_str)
            return (data["timestamp"], data["value"])


        # Create a mapping handler for JSON parsing
        json_mapper = MappingHandler(map_func=parse_json, source=json_source)

        # Process JSON data
        for timestamp, value in json_mapper:
            print(f"Time: {timestamp}, Value: {value}")

        # Output:
        # Time: 2023-09-01T10:00:00, Value: 42.5
        # Time: 2023-09-01T10:01:00, Value: 43.2
        # Time: 2023-09-01T10:02:00, Value: 41.8
        ```
    """

    def __init__(self, map_func: Callable[[T], U], source: Handler[Any, T] | None = None):
        """Initialize a mapping handler.

        :param map_func: Function that transforms each input item to an output item
        :param source: The handler providing input data, defaults to None
        """
        super().__init__(source)
        self.map_func = map_func

    def __iter__(self) -> Iterator[U]:
        """Create an iterator that yields transformed items.

        This method iterates through the source data and applies the mapping function
        to each item, yielding the transformed results.

        :return: Iterator yielding transformed items
        :raises ValueError: If no source has been set
        """
        if self.source is None:
            raise ValueError("Source is not set")

        for segment in self.source:
            yield self.map_func(segment)
__init__
__init__(
    map_func: Callable[[T], U],
    source: Handler[Any, T] | None = None,
)

Initialize a mapping handler.

Parameters:

Name Type Description Default
map_func Callable[[T], U]

Function that transforms each input item to an output item

required
source Handler[Any, T] | None

The handler providing input data, defaults to None

None
Source code in pysatl_tsp/core/processor/mapping_handler.py
def __init__(self, map_func: Callable[[T], U], source: Handler[Any, T] | None = None):
    """Initialize a mapping handler.

    :param map_func: Function that transforms each input item to an output item
    :param source: The handler providing input data, defaults to None
    """
    super().__init__(source)
    self.map_func = map_func
__iter__
__iter__() -> Iterator[U]

Create an iterator that yields transformed items.

This method iterates through the source data and applies the mapping function to each item, yielding the transformed results.

Returns:

Type Description
Iterator[U]

Iterator yielding transformed items

Raises:

Type Description
ValueError

If no source has been set

Source code in pysatl_tsp/core/processor/mapping_handler.py
def __iter__(self) -> Iterator[U]:
    """Create an iterator that yields transformed items.

    This method iterates through the source data and applies the mapping function
    to each item, yielding the transformed results.

    :return: Iterator yielding transformed items
    :raises ValueError: If no source has been set
    """
    if self.source is None:
        raise ValueError("Source is not set")

    for segment in self.source:
        yield self.map_func(segment)
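
One consequence of `__iter__` being a generator function is that the missing-source check runs lazily. The sketch below demonstrates this with `MappingSketch`, a local stand-in that repeats the listed `__iter__` body but stores `source` as a plain attribute instead of inheriting from `Handler`.

```python
from collections.abc import Callable, Iterable, Iterator
from typing import Any, Optional


class MappingSketch:
    """Local stand-in mirroring MappingHandler.__iter__ from the listing above."""

    def __init__(
        self,
        map_func: Callable[[Any], Any],
        source: Optional[Iterable[Any]] = None,
    ) -> None:
        self.map_func = map_func
        self.source = source

    def __iter__(self) -> Iterator[Any]:
        if self.source is None:
            raise ValueError("Source is not set")
        for segment in self.source:
            yield self.map_func(segment)


unbound = MappingSketch(map_func=str.upper)

# Calling iter() only creates the generator; none of the body has run yet.
iterator = iter(unbound)

# The source check fires when the first item is requested.
try:
    next(iterator)
    error = None
except ValueError as exc:
    error = str(exc)
print(error)  # Source is not set

bound = MappingSketch(map_func=str.upper, source=["a", "b"])
upper = list(bound)
print(upper)  # ['A', 'B']
```

In practice this means a mapping handler created without a source can be constructed and passed around freely; the `ValueError` only appears at the point where something starts consuming it.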
OfflineFilterHandler

Bases: Handler[T, U]

A handler that applies a filter function to the entire time series data in batch mode.

This handler collects all data from the source before applying the filter function to the complete series at once. It's suitable for implementing filters that require the entire context of the time series, such as spectral filters, Savitzky-Golay filters, or other techniques that need to process the data as a whole.

Parameters:

Name Type Description Default
filter_func Callable[[ScrubberWindow[T], Any], list[U]]

Function that processes the entire series and returns filtered values

required
filter_config Any

Configuration parameters for the filter function, defaults to None

None
source Handler[Any, T] | None

The handler providing input data, defaults to None

None
Source code in pysatl_tsp/core/processor/filter_handler.py
class OfflineFilterHandler(Handler[T, U]):
    """A handler that applies a filter function to the entire time series data in batch mode.

    This handler collects all data from the source before applying the filter function
    to the complete series at once. It's suitable for implementing filters that require
    the entire context of the time series, such as spectral filters, Savitzky-Golay filters,
    or other techniques that need to process the data as a whole.

    :param filter_func: Function that processes the entire series and returns filtered values
    :param filter_config: Configuration parameters for the filter function, defaults to None
    :param source: The handler providing input data, defaults to None

    Example:
        ```python
        # Create a data source
        import numpy as np
        from scipy import signal

        # Generate a noisy signal
        t = np.linspace(0, 1, 100)
        clean_signal = np.sin(2 * np.pi * 5 * t)
        noise = np.random.normal(0, 0.2, 100)
        noisy_signal = clean_signal + noise

        data_source = SimpleDataProvider(noisy_signal)


        # Define a Savitzky-Golay filter function
        def savgol_filter(window: ScrubberWindow[float], config: dict) -> list[float]:
            data = np.array(window.values)
            window_length = config.get("window_length", 11)
            polyorder = config.get("polyorder", 3)

            filtered = signal.savgol_filter(data, window_length, polyorder)
            return filtered.tolist()


        # Create the offline filter
        filter_handler = OfflineFilterHandler(
            filter_func=savgol_filter, filter_config={"window_length": 11, "polyorder": 3}, source=data_source
        )

        # Process and visualize the results
        filtered_signal = list(filter_handler)

        import matplotlib.pyplot as plt

        plt.figure(figsize=(10, 6))
        plt.plot(t, noisy_signal, "b", label="Noisy signal")
        plt.plot(t, filtered_signal, "r", label="Filtered signal")
        plt.plot(t, clean_signal, "g", label="Original clean signal")
        plt.legend()
        plt.show()
        ```
    """

    def __init__(
        self,
        filter_func: Callable[[ScrubberWindow[T], Any], list[U]],
        filter_config: Any = None,
        source: Handler[Any, T] | None = None,
    ):
        """Initialize an offline filter handler.

        :param filter_func: Function that processes the entire series and returns filtered values
        :param filter_config: Configuration parameters for the filter function, defaults to None
        :param source: The handler providing input data, defaults to None
        """
        super().__init__(source)
        self.filter_func = filter_func
        self.filter_config = filter_config

    def __iter__(self) -> Iterator[U]:
        """Create an iterator that yields filtered values after processing the entire series.

        This method collects all data from the source, applies the filter function
        to the complete series, and then yields the resulting filtered values.

        :return: Iterator yielding filtered values
        :raises ValueError: If no source has been set
        """
        if self.source is None:
            raise ValueError("Source is not set")

        full_series = ScrubberWindow(deque(self.source))
        filtered_series = self.filter_func(full_series, self.filter_config)

        yield from filtered_series
__init__
__init__(
    filter_func: Callable[
        [ScrubberWindow[T], Any], list[U]
    ],
    filter_config: Any = None,
    source: Handler[Any, T] | None = None,
)

Initialize an offline filter handler.

Parameters:

Name Type Description Default
filter_func Callable[[ScrubberWindow[T], Any], list[U]]

Function that processes the entire series and returns filtered values

required
filter_config Any

Configuration parameters for the filter function, defaults to None

None
source Handler[Any, T] | None

The handler providing input data, defaults to None

None
Source code in pysatl_tsp/core/processor/filter_handler.py
def __init__(
    self,
    filter_func: Callable[[ScrubberWindow[T], Any], list[U]],
    filter_config: Any = None,
    source: Handler[Any, T] | None = None,
):
    """Initialize an offline filter handler.

    :param filter_func: Function that processes the entire series and returns filtered values
    :param filter_config: Configuration parameters for the filter function, defaults to None
    :param source: The handler providing input data, defaults to None
    """
    super().__init__(source)
    self.filter_func = filter_func
    self.filter_config = filter_config
__iter__
__iter__() -> Iterator[U]

Create an iterator that yields filtered values after processing the entire series.

This method collects all data from the source, applies the filter function to the complete series, and then yields the resulting filtered values.
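The collect-then-filter flow described above can be sketched in plain Python. This is an illustration only, not the library's implementation: `offline_filter` and `center` are hypothetical names, and a plain list stands in for the `ScrubberWindow` that the real handler passes to `filter_func`.

```python
from typing import Any, Callable, Iterable, Iterator


def offline_filter(
    source: Iterable[float],
    filter_func: Callable[[list[float], Any], list[float]],
    config: Any = None,
) -> Iterator[float]:
    """Hypothetical stand-in for the batch flow: drain, filter once, yield."""
    full_series = list(source)  # the whole source is consumed before filtering
    yield from filter_func(full_series, config)  # one call over the full series


def center(series: list[float], config: Any) -> list[float]:
    """Subtract the global mean, which is only known once all data has arrived."""
    mean = sum(series) / len(series)
    return [value - mean for value in series]


centered = list(offline_filter(iter([1.0, 2.0, 3.0, 6.0]), center))
print(centered)  # [-2.0, -1.0, 0.0, 3.0]
```

Because nothing is yielded until the source is exhausted, this pattern does not suit unbounded streams.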

Returns:

Type Description
Iterator[U]

Iterator yielding filtered values

Raises:

Type Description
ValueError

If no source has been set
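Since `__iter__` is written with `yield from`, it is a generator function, so the source check runs on the first `next()` call and not when `iter()` is called. A minimal sketch of that timing, using a hypothetical `SketchHandler` class that mirrors only the check shown in the source listing:

```python
class SketchHandler:
    """Hypothetical stand-in that copies only the source check."""

    def __init__(self, source=None):
        self.source = source

    def __iter__(self):
        if self.source is None:
            raise ValueError("Source is not set")
        yield from self.source


handler = SketchHandler()
iterator = iter(handler)  # no exception yet: the generator body has not started

try:
    next(iterator)  # the body runs up to the raise on the first next()
except ValueError as exc:
    message = str(exc)

print(message)  # Source is not set
```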

Source code in pysatl_tsp/core/processor/filter_handler.py
def __iter__(self) -> Iterator[U]:
    """Create an iterator that yields filtered values after processing the entire series.

    This method collects all data from the source, applies the filter function
    to the complete series, and then yields the resulting filtered values.

    :return: Iterator yielding filtered values
    :raises ValueError: If no source has been set
    """
    if self.source is None:
        raise ValueError("Source is not set")

    full_series = ScrubberWindow(deque(self.source))
    filtered_series = self.filter_func(full_series, self.filter_config)

    yield from filtered_series
OfflineSamplingHandler

Bases: Handler[T, T]

A handler that samples time series data in batch mode based on identified indices.

This handler processes the entire dataset to identify sampling points before extracting the samples. It's suitable for global sampling strategies that consider the entire time series context, such as selecting representative points or key points that preserve the overall shape of the data.
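A sampling rule is a function from the complete series to a list of indices. The sketch below shows one simple rule, written against a plain list so it runs without the library. `every_nth` and its `step` parameter are hypothetical; with the real handler the argument would be a `ScrubberWindow`, and extra parameters would need to be bound beforehand (for example with `functools.partial`), since the rule receives only the window.

```python
def every_nth(values: list[float], step: int = 4) -> list[int]:
    """Return indices 0, step, 2*step, ... and always the final index."""
    if not values:
        return []
    indices = list(range(0, len(values), step))
    if indices[-1] != len(values) - 1:
        indices.append(len(values) - 1)  # keep the endpoint so the tail is not lost
    return indices


series = [float(i * i) for i in range(10)]
picked = every_nth(series, step=4)
print(picked)  # [0, 4, 8, 9]
print([series[i] for i in picked])  # [0.0, 16.0, 64.0, 81.0]
```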

Parameters:

Name Type Description Default
sampling_rule Callable[[ScrubberWindow[T]], list[int]]

Function that analyzes the entire series and returns indices of points to sample

required
source Handler[Any, T] | None

The handler providing input data, defaults to None

None
Source code in pysatl_tsp/core/processor/sampling_handler.py
class OfflineSamplingHandler(Handler[T, T]):
    """A handler that samples time series data in batch mode based on identified indices.

    This handler processes the entire dataset to identify sampling points before
    extracting the samples. It's suitable for global sampling strategies that consider
    the entire time series context, such as selecting representative points or
    key points that preserve the overall shape of the data.

    :param sampling_rule: Function that analyzes the entire series and returns indices of points to sample
    :param source: The handler providing input data, defaults to None

    Example:
        ```python
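        import numpy as np
        import matplotlib.pyplot as plt
        from typing import List

        from pysatl_tsp.core.data_providers import SimpleDataProvider
        from pysatl_tsp.core.processor import OfflineSamplingHandler
        from pysatl_tsp.core.scrubber import ScrubberWindow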

        # Create a data source with a sinusoidal signal
        x = np.linspace(0, 4*np.pi, 1000)
        y = np.sin(x)
        data_source = SimpleDataProvider(y)

        # Define an offline sampling rule that selects local extrema
        def find_extrema(window: ScrubberWindow[float]) -> List[int]:
            data = np.array(window.values)
            # Find local maxima and minima
            extrema_indices = []

            # First point is always included
            extrema_indices.append(0)

            # Find local maxima and minima (simplified)
            for i in range(1, len(data)-1):
                if (data[i] > data[i-1] and data[i] > data[i+1]) or \
                   (data[i] < data[i-1] and data[i] < data[i+1]):
                    extrema_indices.append(i)

            # Last point is always included
            extrema_indices.append(len(data)-1)

            return extrema_indices

        # Create a sampling handler
        sampler = OfflineSamplingHandler(
            sampling_rule=find_extrema,
            source=data_source
        )

        # Process and collect sampled points
        original_values = list(y)
        sampled_values = list(sampler)

        # The handler yields values only, so reuse the rule's own indices as
        # approximate x-positions for plotting (assumes one sample per index)
        from collections import deque

        rule_indices = find_extrema(ScrubberWindow(deque(original_values)))
        n_points = min(len(rule_indices), len(sampled_values))
        sampled_indices = rule_indices[:n_points]

        # Visualize the results
        plt.figure(figsize=(12, 6))
        plt.plot(x, y, 'b-', label='Original signal')
        plt.plot(x[sampled_indices], sampled_values[:n_points], 'ro', label='Sampled points')
        plt.legend()
        plt.title('Sinusoidal Signal with Extrema Sampling')
        plt.xlabel('x')
        plt.ylabel('sin(x)')
        plt.grid(True)
        plt.show()

        print(f"Original data points: {len(original_values)}")
        print(f"Sampled data points: {len(sampled_values)}")
        ```
    """

    def __init__(self, sampling_rule: Callable[[ScrubberWindow[T]], list[int]], source: Handler[Any, T] | None = None):
        """Initialize an offline sampling handler.

        :param sampling_rule: Function that analyzes the entire series and returns indices of points to sample
        :param source: The handler providing input data, defaults to None
        """
        super().__init__(source)
        self.sampling_rule = sampling_rule

    def __iter__(self) -> Iterator[T]:
        """Create an iterator that yields sampled values based on the indices identified by the sampling rule.

        This method uses OfflineSegmentationScrubber to segment the data at the specified indices
        and a MappingHandler to extract the last item from each segment.

        :return: Iterator yielding sampled values
        :raises ValueError: If no source has been set (propagated from segmentation scrubber)
        """
        mapping_handler: MappingHandler[ScrubberWindow[T], T] = MappingHandler(map_func=lambda window: window[-1])
        pipeline = (
            OfflineSegmentationScrubber(segmentation_rule=self.sampling_rule, source=self.source) | mapping_handler
        )

        yield from pipeline
__init__
__init__(
    sampling_rule: Callable[[ScrubberWindow[T]], list[int]],
    source: Handler[Any, T] | None = None,
)

Initialize an offline sampling handler.

Parameters:

Name Type Description Default
sampling_rule Callable[[ScrubberWindow[T]], list[int]]

Function that analyzes the entire series and returns indices of points to sample

required
source Handler[Any, T] | None

The handler providing input data, defaults to None

None
Source code in pysatl_tsp/core/processor/sampling_handler.py
def __init__(self, sampling_rule: Callable[[ScrubberWindow[T]], list[int]], source: Handler[Any, T] | None = None):
    """Initialize an offline sampling handler.

    :param sampling_rule: Function that analyzes the entire series and returns indices of points to sample
    :param source: The handler providing input data, defaults to None
    """
    super().__init__(source)
    self.sampling_rule = sampling_rule
__iter__
__iter__() -> Iterator[T]

Create an iterator that yields sampled values based on the indices identified by the sampling rule.

This method uses OfflineSegmentationScrubber to segment the data at the specified indices and a MappingHandler to extract the last item from each segment.

Returns:

Type Description
Iterator[T]

Iterator yielding sampled values

Raises:

Type Description
ValueError

If no source has been set (propagated from segmentation scrubber)

Source code in pysatl_tsp/core/processor/sampling_handler.py
def __iter__(self) -> Iterator[T]:
    """Create an iterator that yields sampled values based on the indices identified by the sampling rule.

    This method uses OfflineSegmentationScrubber to segment the data at the specified indices
    and a MappingHandler to extract the last item from each segment.

    :return: Iterator yielding sampled values
    :raises ValueError: If no source has been set (propagated from segmentation scrubber)
    """
    mapping_handler: MappingHandler[ScrubberWindow[T], T] = MappingHandler(map_func=lambda window: window[-1])
    pipeline = (
        OfflineSegmentationScrubber(segmentation_rule=self.sampling_rule, source=self.source) | mapping_handler
    )

    yield from pipeline
OnlineFilterHandler

Bases: Handler[T, U]

A handler that applies a filter function to time series data in real-time.

This handler processes data points one by one as they arrive and applies a filter function to the accumulated history. It's suitable for implementing online filters such as moving averages, exponential smoothing, or real-time anomaly detection.

The filter function receives the current history window and configuration parameters, and produces a filtered value for each input value.
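The per-item flow can be sketched without the library as follows. `online_filter` and `rolling_median` are hypothetical names, and a plain list stands in for the `ScrubberWindow` history; the sketch only mirrors the append-then-filter loop in the source listing.

```python
from statistics import median
from typing import Any, Callable, Iterable, Iterator


def online_filter(
    source: Iterable[float],
    filter_func: Callable[[list[float], Any], float],
    config: Any = None,
) -> Iterator[float]:
    """Hypothetical stand-in: one filtered value per incoming item."""
    history: list[float] = []
    for item in source:
        history.append(item)  # the history grows with every item
        yield filter_func(history, config)


def rolling_median(history: list[float], width: int) -> float:
    """Median of the most recent `width` items (fewer while warming up)."""
    return median(history[-width:])


smoothed = list(online_filter([1.0, 9.0, 2.0, 3.0, 100.0, 4.0], rolling_median, 3))
print(smoothed)  # [1.0, 5.0, 2.0, 3.0, 3.0, 4.0]
```

Note that the accumulated history is never trimmed in this loop, so a filter that only needs a short lookback should slice the tail, as `rolling_median` does.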

Parameters:

Name Type Description Default
filter_func Callable[[ScrubberWindow[T], Any], U]

Function that applies filtering on the history window

required
filter_config Any

Configuration parameters for the filter function, defaults to None

None
source Handler[Any, T] | None

The handler providing input data, defaults to None

None
Source code in pysatl_tsp/core/processor/filter_handler.py
class OnlineFilterHandler(Handler[T, U]):
    """A handler that applies a filter function to time series data in real-time.

    This handler processes data points one by one as they arrive and applies a filter
    function to the accumulated history. It's suitable for implementing online filters
    such as moving averages, exponential smoothing, or real-time anomaly detection.

    The filter function receives the current history window and configuration parameters,
    and produces a filtered value for each input value.

    :param filter_func: Function that applies filtering on the history window
    :param filter_config: Configuration parameters for the filter function, defaults to None
    :param source: The handler providing input data, defaults to None

    Example:
        ```python
        from pysatl_tsp.core.data_providers import SimpleDataProvider
        from pysatl_tsp.core.scrubber import ScrubberWindow
        from pysatl_tsp.core.processor import OnlineFilterHandler
        import random

        random.seed(42)
        data = [10 + i + random.uniform(-2, 2) for i in range(20)]
        data_source = SimpleDataProvider(data)


        # Define a simple moving average filter
        def moving_avg(window: ScrubberWindow[float], config: int) -> float:
            # Use only the last 'config' elements or all if less available
            lookback = min(len(window), config)
            if lookback == 0:
                return 0
            return sum(window[-lookback:].values) / lookback


        # Create the online filter with a window size of 5
        filter_handler = OnlineFilterHandler(filter_func=moving_avg, filter_config=5, source=data_source)

        # Process the data
        original_values = []
        filtered_values = []

        for i, filtered_value in enumerate(filter_handler):
            original_values.append(data[i])
            filtered_values.append(filtered_value)

        print("Original vs Filtered:")
        for orig, filt in zip(original_values[:10], filtered_values[:10]):
            print(f"{orig:.2f} -> {filt:.2f}")

        # Output might look like:
        # Original vs Filtered:
        # 9.67 -> 9.67
        # 11.79 -> 10.73
        # 11.56 -> 11.01
        # 12.89 -> 11.48
        # 13.89 -> 11.96
        # ...
        ```
    """

    def __init__(
        self,
        filter_func: Callable[[ScrubberWindow[T], Any], U],
        filter_config: Any = None,
        source: Handler[Any, T] | None = None,
    ):
        """Initialize an online filter handler.

        :param filter_func: Function that applies filtering on the history window
        :param filter_config: Configuration parameters for the filter function, defaults to None
        :param source: The handler providing input data, defaults to None
        """
        super().__init__(source)
        self.filter_func = filter_func
        self.filter_config = filter_config

    def __iter__(self) -> Iterator[U]:
        """Create an iterator that yields filtered values in real-time.

        This method processes data points one by one, accumulates them in a history window,
        and applies the filter function to produce filtered values.

        :return: Iterator yielding filtered values
        :raises ValueError: If no source has been set
        """
        if self.source is None:
            raise ValueError("Source is not set")

        self._history: ScrubberWindow[T] = ScrubberWindow()

        for item in self.source:
            self._history.append(item)
            yield self.filter_func(self._history, self.filter_config)
__init__
__init__(
    filter_func: Callable[[ScrubberWindow[T], Any], U],
    filter_config: Any = None,
    source: Handler[Any, T] | None = None,
)

Initialize an online filter handler.

Parameters:

Name Type Description Default
filter_func Callable[[ScrubberWindow[T], Any], U]

Function that applies filtering on the history window

required
filter_config Any

Configuration parameters for the filter function, defaults to None

None
source Handler[Any, T] | None

The handler providing input data, defaults to None

None
Source code in pysatl_tsp/core/processor/filter_handler.py
def __init__(
    self,
    filter_func: Callable[[ScrubberWindow[T], Any], U],
    filter_config: Any = None,
    source: Handler[Any, T] | None = None,
):
    """Initialize an online filter handler.

    :param filter_func: Function that applies filtering on the history window
    :param filter_config: Configuration parameters for the filter function, defaults to None
    :param source: The handler providing input data, defaults to None
    """
    super().__init__(source)
    self.filter_func = filter_func
    self.filter_config = filter_config
__iter__
__iter__() -> Iterator[U]

Create an iterator that yields filtered values in real-time.

This method processes data points one by one, accumulates them in a history window, and applies the filter function to produce filtered values.

Returns:

Type Description
Iterator[U]

Iterator yielding filtered values

Raises:

Type Description
ValueError

If no source has been set

Source code in pysatl_tsp/core/processor/filter_handler.py
def __iter__(self) -> Iterator[U]:
    """Create an iterator that yields filtered values in real-time.

    This method processes data points one by one, accumulates them in a history window,
    and applies the filter function to produce filtered values.

    :return: Iterator yielding filtered values
    :raises ValueError: If no source has been set
    """
    if self.source is None:
        raise ValueError("Source is not set")

    self._history: ScrubberWindow[T] = ScrubberWindow()

    for item in self.source:
        self._history.append(item)
        yield self.filter_func(self._history, self.filter_config)
OnlineSamplingHandler

Bases: Handler[T, T]

A handler that samples time series data in real-time based on a condition.

This handler uses segmentation to identify points where sampling should occur and extracts the last item from each segment. It processes data in real-time and is suitable for adaptive sampling strategies, where sampling decisions are made based on the recent history of the time series.
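The idea of rule-driven sampling can be sketched in plain Python. This section does not specify whether the triggering item closes the current segment or opens the next one, or whether a trailing partial segment is emitted, so the sketch picks one convention as an assumption: the triggering item is the sample, and it also seeds the next window. `online_sample` and `moved_by_five` are hypothetical names, and the real `OnlineSegmentationScrubber` may behave differently.

```python
from typing import Callable, Iterable, Iterator


def online_sample(
    source: Iterable[int], rule: Callable[[list[int]], bool]
) -> Iterator[int]:
    """Hypothetical stand-in: emit the newest item whenever the rule fires."""
    window: list[int] = []
    for item in source:
        window.append(item)
        if rule(window):
            yield window[-1]
            window = [window[-1]]  # assumption: the sample starts the next window


def moved_by_five(window: list[int]) -> bool:
    """Fire once the newest value differs from the window start by at least 5."""
    return len(window) >= 2 and abs(window[-1] - window[0]) >= 5


samples = list(online_sample(range(20), moved_by_five))
print(samples)  # [5, 10, 15]
```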

Parameters:

Name Type Description Default
sampling_rule Callable[[ScrubberWindow[T]], bool]

Function that decides when to take a sample

required
source Handler[Any, T] | None

The handler providing input data, defaults to None

None
Source code in pysatl_tsp/core/processor/sampling_handler.py
class OnlineSamplingHandler(Handler[T, T]):
    """A handler that samples time series data in real-time based on a condition.

    This handler uses segmentation to identify points where sampling should occur
    and extracts the last item from each segment. It processes data in real-time
    and is suitable for adaptive sampling strategies, where sampling decisions
    are made based on the recent history of the time series.

    :param sampling_rule: Function that decides when to take a sample
    :param source: The handler providing input data, defaults to None

    Example:
        ```python
        # Create a data source with steadily increasing values
        data = list(range(100))
        data_source = SimpleDataProvider(data)


        # Define a sampling rule that samples when the value has changed by at least 5
        def significant_change(window: ScrubberWindow[int]) -> bool:
            if len(window) < 2:
                return False

            # Get last sample taken (first item in window) and current value
            last_sampled = window[0]
            current = window[-1]

            # Sample if change is significant
            return abs(current - last_sampled) >= 5


        # Create a sampling handler
        sampler = OnlineSamplingHandler(sampling_rule=significant_change, source=data_source)

        # Process and collect sampled points
        sampled_points = list(sampler)

        print(f"Original data points: {len(data)}")
        print(f"Sampled data points: {len(sampled_points)}")
        print(f"Sampled values: {sampled_points[:10]}...")

        # Output might look like:
        # Original data points: 100
        # Sampled data points: 20
        # Sampled values: [0, 5, 10, 15, 20, 25, 30, 35, 40, 45]...
        ```
    """

    def __init__(self, sampling_rule: Callable[[ScrubberWindow[T]], bool], source: Handler[Any, T] | None = None):
        """Initialize an online sampling handler.

        :param sampling_rule: Function that decides when to take a sample
        :param source: The handler providing input data, defaults to None
        """
        super().__init__(source)
        self.sampling_rule = sampling_rule

    def __iter__(self) -> Iterator[T]:
        """Create an iterator that yields sampled values based on the sampling rule.

        This method uses OnlineSegmentationScrubber to segment the data and a
        MappingHandler to extract the last item from each segment.

        :return: Iterator yielding sampled values
        :raises ValueError: If no source has been set (propagated from segmentation scrubber)
        """
        mapping_handler: MappingHandler[ScrubberWindow[T], T] = MappingHandler(map_func=lambda window: window[-1])
        pipeline = (
            OnlineSegmentationScrubber(segmentation_rule=self.sampling_rule, source=self.source) | mapping_handler
        )

        yield from pipeline
__init__
__init__(
    sampling_rule: Callable[[ScrubberWindow[T]], bool],
    source: Handler[Any, T] | None = None,
)

Initialize an online sampling handler.

Parameters:

Name Type Description Default
sampling_rule Callable[[ScrubberWindow[T]], bool]

Function that decides when to take a sample

required
source Handler[Any, T] | None

The handler providing input data, defaults to None

None
Source code in pysatl_tsp/core/processor/sampling_handler.py
def __init__(self, sampling_rule: Callable[[ScrubberWindow[T]], bool], source: Handler[Any, T] | None = None):
    """Initialize an online sampling handler.

    :param sampling_rule: Function that decides when to take a sample
    :param source: The handler providing input data, defaults to None
    """
    super().__init__(source)
    self.sampling_rule = sampling_rule
__iter__
__iter__() -> Iterator[T]

Create an iterator that yields sampled values based on the sampling rule.

This method uses OnlineSegmentationScrubber to segment the data and a MappingHandler to extract the last item from each segment.

Returns:

Type Description
Iterator[T]

Iterator yielding sampled values

Raises:

Type Description
ValueError

If no source has been set (propagated from segmentation scrubber)

Source code in pysatl_tsp/core/processor/sampling_handler.py
def __iter__(self) -> Iterator[T]:
    """Create an iterator that yields sampled values based on the sampling rule.

    This method uses OnlineSegmentationScrubber to segment the data and a
    MappingHandler to extract the last item from each segment.

    :return: Iterator yielding sampled values
    :raises ValueError: If no source has been set (propagated from segmentation scrubber)
    """
    mapping_handler: MappingHandler[ScrubberWindow[T], T] = MappingHandler(map_func=lambda window: window[-1])
    pipeline = (
        OnlineSegmentationScrubber(segmentation_rule=self.sampling_rule, source=self.source) | mapping_handler
    )

    yield from pipeline
combine_handler
CombineHandler

Bases: Handler[T, U]

A handler that combines outputs from multiple handlers processing the same data.

This handler feeds the same source data to multiple handlers in parallel and combines their outputs using a user-provided function. It's useful for scenarios where you need to process the same data in different ways and then merge the results, such as feature extraction, multi-model predictions, or parallel transformations.
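The fan-out-and-merge idea can be sketched with the standard library alone. This is a hedged illustration, not the handler's implementation: `combine` is a hypothetical function, plain callables stand in for handlers, and the sketch stops at the shortest branch, so it does not model `continue_on_partial`.

```python
from itertools import tee
from typing import Any, Callable, Iterable, Iterator


def combine(
    source: Iterable[Any],
    combine_func: Callable[[list[Any]], Any],
    *transforms: Callable[[Any], Any],
) -> Iterator[Any]:
    """Hypothetical stand-in: feed one source to several branches, then merge."""
    branches = tee(source, len(transforms))  # one independent iterator per branch
    streams = [map(fn, branch) for fn, branch in zip(transforms, branches)]
    for values in zip(*streams):  # advance all branches in lockstep
        yield combine_func(list(values))


rows = list(
    combine(
        [1, 2, 3],
        lambda values: {"square": values[0], "double": values[1]},
        lambda x: x * x,
        lambda x: 2 * x,
    )
)
print(rows)
# [{'square': 1, 'double': 2}, {'square': 4, 'double': 4}, {'square': 9, 'double': 6}]
```

`itertools.tee` buffers items until every branch has consumed them, which stays cheap here because the branches advance together.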

Parameters:

Name Type Description Default
combine_func Callable[[list[Any]], U]

Function that combines values from all handlers into a single output

required
handlers Handler[T, Any]

Variable number of handlers whose outputs will be combined

()
continue_on_partial bool

Whether to continue when some handlers are exhausted, defaults to True

True
Source code in pysatl_tsp/core/processor/combine_handler.py
class CombineHandler(Handler[T, U]):
    """A handler that combines outputs from multiple handlers processing the same data.

    This handler feeds the same source data to multiple handlers in parallel and combines
    their outputs using a user-provided function. It's useful for scenarios where you need
    to process the same data in different ways and then merge the results, such as feature
    extraction, multi-model predictions, or parallel transformations.

    :param combine_func: Function that combines values from all handlers into a single output
    :param handlers: Variable number of handlers whose outputs will be combined
    :param continue_on_partial: Whether to continue when some handlers are exhausted, defaults to True

    Example:
        ```python
        # Create a data source
        data_source = SimpleDataProvider([1, 2, 3, 4, 5])

        # Define handlers for different transformations
        square_handler = MappingHandler(map_func=lambda x: x * x)
        double_handler = MappingHandler(map_func=lambda x: 2 * x)
        str_handler = MappingHandler(map_func=lambda x: f"Value: {x}")

        # Function to combine outputs from all handlers
        def combine_results(values):
            return {
                "original^2": values[0],
                "original*2": values[1],
                "string": values[2]
            }

        # Create and use the combine handler
        combine = CombineHandler(
            combine_results,  # passed positionally so the handlers can follow as *handlers
            square_handler, double_handler, str_handler
        )
        combine.set_source(data_source)

        # Process the data
        for result in combine:
            print(result)

        # Output:
        # {'original^2': 1, 'original*2': 2, 'string': 'Value: 1'}
        # {'original^2': 4, 'original*2': 4, 'string': 'Value: 2'}
        # {'original^2': 9, 'original*2': 6, 'string': 'Value: 3'}
        # {'original^2': 16, 'original*2': 8, 'string': 'Value: 4'}
        # {'original^2': 25, 'original*2': 10, 'string': 'Value: 5'}
        ```
    """

    def __init__(
        self, combine_func: Callable[[list[Any]], U], *handlers: Handler[T, Any], continue_on_partial: bool = True
    ):
        """Initialize a combine handler.

        :param combine_func: Function that combines values from all handlers into a single output
        :param handlers: Variable number of handlers whose outputs will be combined
        :param continue_on_partial: Whether to continue when some handlers are exhausted, defaults to True
        """
        super().__init__()
        self.handlers = handlers
        self.combine_func = combine_func
        self.continue_on_partial = continue_on_partial

    def __iter__(self) -> Iterator[U]:
        """Create an iterator that yields combined results from multiple handlers.

        This method processes the source data through each handler in parallel and
        combines their outputs using the combine function. If continue_on_partial is True,
        it will continue producing outputs even after some handlers are exhausted (using None
        for exhausted handlers). If False, it will stop when any handler is exhausted.

        :return: Iterator yielding combined results
        :raises ValueError: If no source has been set
        """
        if self.source is None:
            raise ValueError("Source is not set")

        source_iterators = itertools.tee(iter(self.source), len(self.handlers))

        pipelines = []
        for i, handler in enumerate(self.handlers):
            data_provider = SimpleDataProvider(source_iterators[i])
            pipelines.append(data_provider | handler)

        iterators = [iter(pipeline) for pipeline in pipelines]

        active_iterators = [True] * len(iterators)

        while any(active_iterators):
            values = [None] * len(iterators)

            for i, iterator in enumerate(iterators):
                if active_iterators[i]:
                    try:
                        values[i] = next(iterator)
                    except StopIteration:
                        active_iterators[i] = False

            if not self.continue_on_partial and not all(active_iterators):
                break

            if not any(active_iterators):
                break

            yield self.combine_func(values)
__init__
__init__(
    combine_func: Callable[[list[Any]], U],
    *handlers: Handler[T, Any],
    continue_on_partial: bool = True,
)

Initialize a combine handler.

Parameters:

Name Type Description Default
combine_func Callable[[list[Any]], U]

Function that combines values from all handlers into a single output

required
handlers Handler[T, Any]

Variable number of handlers whose outputs will be combined

()
continue_on_partial bool

Whether to continue when some handlers are exhausted, defaults to True

True
Source code in pysatl_tsp/core/processor/combine_handler.py
def __init__(
    self, combine_func: Callable[[list[Any]], U], *handlers: Handler[T, Any], continue_on_partial: bool = True
):
    """Initialize a combine handler.

    :param combine_func: Function that combines values from all handlers into a single output
    :param handlers: Variable number of handlers whose outputs will be combined
    :param continue_on_partial: Whether to continue when some handlers are exhausted, defaults to True
    """
    super().__init__()
    self.handlers = handlers
    self.combine_func = combine_func
    self.continue_on_partial = continue_on_partial
__iter__
__iter__() -> Iterator[U]

Create an iterator that yields combined results from multiple handlers.

This method feeds the source data to every handler and combines their outputs with the combine function, advancing all handlers one step at a time. If continue_on_partial is True, it keeps producing outputs after some handlers are exhausted, passing None in place of each exhausted handler's value, and stops once every handler is exhausted. If False, it stops as soon as any handler is exhausted.
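The stepping behaviour described above can be illustrated without the library. The following is a minimal stand-alone sketch in plain Python, not the pysatl_tsp implementation: `combine_streams`, `squares`, and `first_two` are hypothetical names, and ordinary iterator-to-iterator functions stand in for handlers.

```python
import itertools
from collections.abc import Callable, Iterable, Iterator
from typing import Any


def combine_streams(
    source: Iterable[Any],
    transforms: list[Callable[[Iterator[Any]], Iterator[Any]]],
    combine_func: Callable[[list[Any]], Any],
    continue_on_partial: bool = True,
) -> Iterator[Any]:
    """Feed one source to several transforms and merge their outputs step by step."""
    # Give every branch its own independent copy of the source stream.
    copies = itertools.tee(iter(source), len(transforms))
    branches = [transform(copy) for transform, copy in zip(transforms, copies)]
    active = [True] * len(branches)

    while any(active):
        values: list[Any] = [None] * len(branches)
        for i, branch in enumerate(branches):
            if active[i]:
                try:
                    values[i] = next(branch)
                except StopIteration:
                    active[i] = False  # an exhausted branch keeps its None slot

        if not continue_on_partial and not all(active):
            break  # strict mode: stop as soon as any branch runs dry
        if not any(active):
            break  # every branch is exhausted, nothing left to combine
        yield combine_func(values)


def squares(items: Iterator[int]) -> Iterator[int]:
    """Hypothetical branch: square every item."""
    return (x * x for x in items)


def first_two(items: Iterator[int]) -> Iterator[int]:
    """Hypothetical branch that ends early, after two items."""
    return itertools.islice(items, 2)


lenient = list(combine_streams([1, 2, 3], [squares, first_two], tuple))
strict = list(combine_streams([1, 2, 3], [squares, first_two], tuple, continue_on_partial=False))
print(lenient)  # [(1, 1), (4, 2), (9, None)]
print(strict)   # [(1, 1), (4, 2)]
```

In the lenient run the shorter branch contributes None on the third step, so a combine function used with continue_on_partial=True should be prepared to receive None values.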

Returns:

Type Description
Iterator[U]

Iterator yielding combined results

Raises:

Type Description
ValueError

If no source has been set

Source code in pysatl_tsp/core/processor/combine_handler.py
def __iter__(self) -> Iterator[U]:
    """Create an iterator that yields combined results from multiple handlers.

    This method processes the source data through each handler in parallel and
    combines their outputs using the combine function. If continue_on_partial is True,
    it will continue producing outputs even after some handlers are exhausted (using None
    for exhausted handlers). If False, it will stop when any handler is exhausted.

    :return: Iterator yielding combined results
    :raises ValueError: If no source has been set
    """
    if self.source is None:
        raise ValueError("Source is not set")

    source_iterators = itertools.tee(iter(self.source), len(self.handlers))

    pipelines = []
    for i, handler in enumerate(self.handlers):
        data_provider = SimpleDataProvider(source_iterators[i])
        pipelines.append(data_provider | handler)

    iterators = [iter(pipeline) for pipeline in pipelines]

    active_iterators = [True] * len(iterators)

    while any(active_iterators):
        values = [None] * len(iterators)

        for i, iterator in enumerate(iterators):
            if active_iterators[i]:
                try:
                    values[i] = next(iterator)
                except StopIteration:
                    active_iterators[i] = False

        if not self.continue_on_partial and not all(active_iterators):
            break

        if not any(active_iterators):
            break

        yield self.combine_func(values)
filter_handler
OfflineFilterHandler

Bases: Handler[T, U]

A handler that applies a filter function to the entire time series data in batch mode.

This handler collects all data from the source before applying the filter function to the complete series at once. It's suitable for implementing filters that require the entire context of the time series, such as spectral filters, Savitzky-Golay filters, or other techniques that need to process the data as a whole.
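To make the batch behaviour concrete, here is a small stand-alone sketch in plain Python. It does not use pysatl_tsp: `offline_filter` and `centered_mean` are hypothetical names, and a plain list stands in for ScrubberWindow. A centered moving average is used as the example because it needs samples on both sides of each point and therefore needs the whole series up front.

```python
from collections.abc import Callable, Iterable, Iterator


def offline_filter(
    source: Iterable[float],
    filter_func: Callable[[list[float], int], list[float]],
    config: int,
) -> Iterator[float]:
    """Collect the whole series first, then filter it in a single call."""
    full_series = list(source)  # nothing is yielded until the source is exhausted
    yield from filter_func(full_series, config)


def centered_mean(series: list[float], half_width: int) -> list[float]:
    """Centered moving average; the window is clipped at both ends of the series."""
    smoothed = []
    for i in range(len(series)):
        lo = max(0, i - half_width)
        hi = min(len(series), i + half_width + 1)
        smoothed.append(sum(series[lo:hi]) / (hi - lo))
    return smoothed


result = list(offline_filter(iter([1.0, 2.0, 6.0, 4.0, 2.0]), centered_mean, 1))
print(result)  # [1.5, 3.0, 4.0, 4.0, 3.0]
```

Because the full series is materialised before any output appears, this approach fits finite datasets and does not suit unbounded streams.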

Parameters:

Name Type Description Default
filter_func Callable[[ScrubberWindow[T], Any], list[U]]

Function that processes the entire series and returns filtered values

required
filter_config Any

Configuration parameters for the filter function, defaults to None

None
source Handler[Any, T] | None

The handler providing input data, defaults to None

None
Source code in pysatl_tsp/core/processor/filter_handler.py
class OfflineFilterHandler(Handler[T, U]):
    """A handler that applies a filter function to the entire time series data in batch mode.

    This handler collects all data from the source before applying the filter function
    to the complete series at once. It's suitable for implementing filters that require
    the entire context of the time series, such as spectral filters, Savitzky-Golay filters,
    or other techniques that need to process the data as a whole.

    :param filter_func: Function that processes the entire series and returns filtered values
    :param filter_config: Configuration parameters for the filter function, defaults to None
    :param source: The handler providing input data, defaults to None

    Example:
        ```python
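        # Imports (the OfflineFilterHandler path is assumed to match OnlineFilterHandler's)
        import numpy as np
        from scipy import signal

        from pysatl_tsp.core.data_providers import SimpleDataProvider
        from pysatl_tsp.core.processor import OfflineFilterHandler
        from pysatl_tsp.core.scrubber import ScrubberWindow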

        # Generate a noisy signal
        t = np.linspace(0, 1, 100)
        clean_signal = np.sin(2 * np.pi * 5 * t)
        noise = np.random.normal(0, 0.2, 100)
        noisy_signal = clean_signal + noise

        data_source = SimpleDataProvider(noisy_signal)


        # Define a Savitzky-Golay filter function
        def savgol_filter(window: ScrubberWindow[float], config: dict) -> list[float]:
            data = np.array(window.values)
            window_length = config.get("window_length", 11)
            polyorder = config.get("polyorder", 3)

            filtered = signal.savgol_filter(data, window_length, polyorder)
            return filtered.tolist()


        # Create the offline filter
        filter_handler = OfflineFilterHandler(
            filter_func=savgol_filter, filter_config={"window_length": 11, "polyorder": 3}, source=data_source
        )

        # Process and visualize the results
        filtered_signal = list(filter_handler)

        import matplotlib.pyplot as plt

        plt.figure(figsize=(10, 6))
        plt.plot(t, noisy_signal, "b", label="Noisy signal")
        plt.plot(t, filtered_signal, "r", label="Filtered signal")
        plt.plot(t, clean_signal, "g", label="Original clean signal")
        plt.legend()
        plt.show()
        ```
    """

    def __init__(
        self,
        filter_func: Callable[[ScrubberWindow[T], Any], list[U]],
        filter_config: Any = None,
        source: Handler[Any, T] | None = None,
    ):
        """Initialize an offline filter handler.

        :param filter_func: Function that processes the entire series and returns filtered values
        :param filter_config: Configuration parameters for the filter function, defaults to None
        :param source: The handler providing input data, defaults to None
        """
        super().__init__(source)
        self.filter_func = filter_func
        self.filter_config = filter_config

    def __iter__(self) -> Iterator[U]:
        """Create an iterator that yields filtered values after processing the entire series.

        This method collects all data from the source, applies the filter function
        to the complete series, and then yields the resulting filtered values.

        :return: Iterator yielding filtered values
        :raises ValueError: If no source has been set
        """
        if self.source is None:
            raise ValueError("Source is not set")

        full_series = ScrubberWindow(deque(self.source))
        filtered_series = self.filter_func(full_series, self.filter_config)

        yield from filtered_series
__init__
__init__(
    filter_func: Callable[
        [ScrubberWindow[T], Any], list[U]
    ],
    filter_config: Any = None,
    source: Handler[Any, T] | None = None,
)

Initialize an offline filter handler.

Parameters:

Name Type Description Default
filter_func Callable[[ScrubberWindow[T], Any], list[U]]

Function that processes the entire series and returns filtered values

required
filter_config Any

Configuration parameters for the filter function, defaults to None

None
source Handler[Any, T] | None

The handler providing input data, defaults to None

None
Source code in pysatl_tsp/core/processor/filter_handler.py
def __init__(
    self,
    filter_func: Callable[[ScrubberWindow[T], Any], list[U]],
    filter_config: Any = None,
    source: Handler[Any, T] | None = None,
):
    """Initialize an offline filter handler.

    :param filter_func: Function that processes the entire series and returns filtered values
    :param filter_config: Configuration parameters for the filter function, defaults to None
    :param source: The handler providing input data, defaults to None
    """
    super().__init__(source)
    self.filter_func = filter_func
    self.filter_config = filter_config
__iter__
__iter__() -> Iterator[U]

Create an iterator that yields filtered values after processing the entire series.

This method collects all data from the source, applies the filter function to the complete series, and then yields the resulting filtered values.

Returns:

Type Description
Iterator[U]

Iterator yielding filtered values

Raises:

Type Description
ValueError

If no source has been set

Source code in pysatl_tsp/core/processor/filter_handler.py
def __iter__(self) -> Iterator[U]:
    """Create an iterator that yields filtered values after processing the entire series.

    This method collects all data from the source, applies the filter function
    to the complete series, and then yields the resulting filtered values.

    :return: Iterator yielding filtered values
    :raises ValueError: If no source has been set
    """
    if self.source is None:
        raise ValueError("Source is not set")

    full_series = ScrubberWindow(deque(self.source))
    filtered_series = self.filter_func(full_series, self.filter_config)

    yield from filtered_series
OnlineFilterHandler

Bases: Handler[T, U]

A handler that applies a filter function to time series data in real-time.

This handler processes data points one by one as they arrive and applies a filter function to the accumulated history. It's suitable for implementing online filters such as moving averages, exponential smoothing, or real-time anomaly detection.

The filter function receives the current history window and configuration parameters, and produces a filtered value for each input value.
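The per-item flow can be sketched in plain Python as follows. This is an illustration under stated assumptions rather than the library code: `online_filter` and `trailing_mean` are hypothetical names, and a plain list stands in for the ScrubberWindow history.

```python
from collections.abc import Callable, Iterable, Iterator


def online_filter(
    source: Iterable[float],
    filter_func: Callable[[list[float], int], float],
    config: int,
) -> Iterator[float]:
    """Yield one filtered value per input, using only the data seen so far."""
    history: list[float] = []
    for item in source:
        history.append(item)  # the history grows by one element per input
        yield filter_func(history, config)


def trailing_mean(history: list[float], width: int) -> float:
    """Mean of the last `width` values (fewer while the history is still short)."""
    recent = history[-width:]
    return sum(recent) / len(recent)


smoothed = list(online_filter([2.0, 4.0, 6.0, 8.0], trailing_mean, 2))
print(smoothed)  # [2.0, 3.0, 5.0, 7.0]
```

Each output depends only on past and current inputs, which is what allows the handler to emit results while data is still arriving. In this sketch the history is never trimmed, so memory use grows with the length of the stream.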

Parameters:

Name Type Description Default
filter_func Callable[[ScrubberWindow[T], Any], U]

Function that applies filtering on the history window

required
filter_config Any

Configuration parameters for the filter function, defaults to None

None
source Handler[Any, T] | None

The handler providing input data, defaults to None

None
Source code in pysatl_tsp/core/processor/filter_handler.py
class OnlineFilterHandler(Handler[T, U]):
    """A handler that applies a filter function to time series data in real-time.

    This handler processes data points one by one as they arrive and applies a filter
    function to the accumulated history. It's suitable for implementing online filters
    such as moving averages, exponential smoothing, or real-time anomaly detection.

    The filter function receives the current history window and configuration parameters,
    and produces a filtered value for each input value.

    :param filter_func: Function that applies filtering on the history window
    :param filter_config: Configuration parameters for the filter function, defaults to None
    :param source: The handler providing input data, defaults to None

    Example:
        ```python
        from pysatl_tsp.core.data_providers import SimpleDataProvider
        from pysatl_tsp.core.scrubber import ScrubberWindow
        from pysatl_tsp.core.processor import OnlineFilterHandler
        import random

        random.seed(42)
        data = [10 + i + random.uniform(-2, 2) for i in range(20)]
        data_source = SimpleDataProvider(data)


        # Define a simple moving average filter
        def moving_avg(window: ScrubberWindow[float], config: int) -> float:
            # Use only the last 'config' elements or all if less available
            lookback = min(len(window), config)
            if lookback == 0:
                return 0
            return sum(window[-lookback:].values) / lookback


        # Create the online filter with a window size of 5
        filter_handler = OnlineFilterHandler(filter_func=moving_avg, filter_config=5, source=data_source)

        # Process the data
        original_values = []
        filtered_values = []

        for i, filtered_value in enumerate(filter_handler):
            original_values.append(data[i])
            filtered_values.append(filtered_value)

        print("Original vs Filtered:")
        for orig, filt in zip(original_values[:10], filtered_values[:10]):
            print(f"{orig:.2f} -> {filt:.2f}")

        # Output has this form (the values shown are illustrative):
        # Original vs Filtered:
        # 9.67 -> 9.67
        # 11.79 -> 10.73
        # 11.56 -> 11.01
        # 12.89 -> 11.48
        # 13.89 -> 11.96
        # ...
        ```
    """

    def __init__(
        self,
        filter_func: Callable[[ScrubberWindow[T], Any], U],
        filter_config: Any = None,
        source: Handler[Any, T] | None = None,
    ):
        """Initialize an online filter handler.

        :param filter_func: Function that applies filtering on the history window
        :param filter_config: Configuration parameters for the filter function, defaults to None
        :param source: The handler providing input data, defaults to None
        """
        super().__init__(source)
        self.filter_func = filter_func
        self.filter_config = filter_config

    def __iter__(self) -> Iterator[U]:
        """Create an iterator that yields filtered values in real-time.

        This method processes data points one by one, accumulates them in a history window,
        and applies the filter function to produce filtered values.

        :return: Iterator yielding filtered values
        :raises ValueError: If no source has been set
        """
        if self.source is None:
            raise ValueError("Source is not set")

        self._history: ScrubberWindow[T] = ScrubberWindow()

        for item in self.source:
            self._history.append(item)
            yield self.filter_func(self._history, self.filter_config)
__init__
__init__(
    filter_func: Callable[[ScrubberWindow[T], Any], U],
    filter_config: Any = None,
    source: Handler[Any, T] | None = None,
)

Initialize an online filter handler.

Parameters:

Name Type Description Default
filter_func Callable[[ScrubberWindow[T], Any], U]

Function that applies filtering on the history window

required
filter_config Any

Configuration parameters for the filter function, defaults to None

None
source Handler[Any, T] | None

The handler providing input data, defaults to None

None
Source code in pysatl_tsp/core/processor/filter_handler.py
def __init__(
    self,
    filter_func: Callable[[ScrubberWindow[T], Any], U],
    filter_config: Any = None,
    source: Handler[Any, T] | None = None,
):
    """Initialize an online filter handler.

    :param filter_func: Function that applies filtering on the history window
    :param filter_config: Configuration parameters for the filter function, defaults to None
    :param source: The handler providing input data, defaults to None
    """
    super().__init__(source)
    self.filter_func = filter_func
    self.filter_config = filter_config
__iter__
__iter__() -> Iterator[U]

Create an iterator that yields filtered values in real-time.

This method processes data points one by one, accumulates them in a history window, and applies the filter function to produce filtered values.

Returns:

Type Description
Iterator[U]

Iterator yielding filtered values

Raises:

Type Description
ValueError

If no source has been set

Source code in pysatl_tsp/core/processor/filter_handler.py
def __iter__(self) -> Iterator[U]:
    """Create an iterator that yields filtered values in real-time.

    This method processes data points one by one, accumulates them in a history window,
    and applies the filter function to produce filtered values.

    :return: Iterator yielding filtered values
    :raises ValueError: If no source has been set
    """
    if self.source is None:
        raise ValueError("Source is not set")

    self._history: ScrubberWindow[T] = ScrubberWindow()

    for item in self.source:
        self._history.append(item)
        yield self.filter_func(self._history, self.filter_config)
lag_handler
LagHandler

Bases: Handler[float | None, float | None]

A handler that applies a lag-based transformation to time series data.

This handler computes 2 * current_value - lagged_value, where lagged_value is the value observed 'lag' time steps earlier. For the first 'lag' inputs no lagged value exists yet, so the handler outputs None for each of them.

The transformation can be useful for detecting changes or trends in time series by comparing current values with historical ones.
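The formula, the warm-up period, and the handling of missing values can be reproduced in a few lines of plain Python. This is a stand-alone sketch, not the library code: `lag_transform` is a hypothetical name, and it applies the same 2 * current_value - lagged_value rule to any iterable.

```python
from collections import deque
from collections.abc import Iterable, Iterator
from typing import Optional


def lag_transform(source: Iterable[Optional[float]], lag: int) -> Iterator[Optional[float]]:
    """Yield 2 * current - lagged, or None while warming up or when a value is missing."""
    buffer: deque[Optional[float]] = deque()
    for current in source:
        buffer.append(current)
        if len(buffer) <= lag:
            yield None  # warm-up: no value from `lag` steps back exists yet
            continue
        lagged = buffer.popleft()  # the value seen `lag` steps before `current`
        if current is None or lagged is None:
            yield None  # a missing input on either side gives a missing output
        else:
            yield 2 * current - lagged


basic = list(lag_transform([1.0, 2.0, 3.0, 4.0, 5.0], lag=2))
with_gap = list(lag_transform([1.0, None, 3.0, 4.0], lag=1))
print(basic)     # [None, None, 5.0, 6.0, 7.0]
print(with_gap)  # [None, None, None, 5.0]
```

The second run shows that a single None in the input affects two outputs: once as the current value and once, 'lag' steps later, as the lagged value.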

Parameters:

Name Type Description Default
lag int

Number of time steps to look back for the lagged value

required
Source code in pysatl_tsp/core/processor/lag_handler.py
class LagHandler(Handler[float | None, float | None]):
    """A handler that applies a lag-based transformation to time series data.

    This handler applies a formula (2 * current_value - lagged_value) that compares
    the current value with a value from 'lag' time steps in the past. For the first
    'lag' values where no lagged value is available, it outputs None.

    The transformation can be useful for detecting changes or trends in time series
    by comparing current values with historical ones.

    :param lag: Number of time steps to look back for the lagged value

    Example:
        ```python
        # Create a data source with numeric values
        data_source = SimpleDataProvider([1.0, 2.0, 3.0, 4.0, 5.0])

        # Create a lag handler with lag of 2
        lag_handler = LagHandler(lag=2)
        lag_handler.set_source(data_source)

        # Process the data
        results = list(lag_handler)
        print(results)

        # Output:
        # [None, None, 5.0, 6.0, 7.0]
        #
        # Explanation:
        # - First two values: None (not enough history)
        # - Third value: 2*3.0-1.0 = 5.0 (current=3.0, lagged=1.0)
        # - Fourth value: 2*4.0-2.0 = 6.0 (current=4.0, lagged=2.0)
        # - Fifth value: 2*5.0-3.0 = 7.0 (current=5.0, lagged=3.0)
        ```
    """

    def __init__(self, lag: int):
        """Initialize a lag handler.

        :param lag: Number of time steps to look back for the lagged value
        """
        super().__init__()
        self.lag = lag

    def __iter__(self) -> Iterator[float | None]:
        """Create an iterator that yields transformed values based on lag comparison.

        This method outputs None for the first 'lag' values, then applies the formula
        2 * current_value - lagged_value for subsequent values. If either the current
        value or the lagged value is None, the result will be None.

        :return: Iterator yielding transformed values or None
        :raises ValueError: If no source has been set
        """
        if self.source is None:
            raise ValueError("LagHandler requires a data source")

        source_iter = iter(self.source)
        buffer: deque[float | None] = deque(maxlen=self.lag + 1)

        try:
            for _ in range(self.lag):
                buffer.append(next(source_iter))
                yield None  # First 'lag' values have no result
        except StopIteration:
            return

        # Apply formula for each new value
        for current_value in source_iter:
            buffer.append(current_value)  # Add new value
            lagged_value = buffer.popleft()

            if current_value is None or lagged_value is None:
                yield None
            else:
                yield 2 * current_value - lagged_value
__init__
__init__(lag: int)

Initialize a lag handler.

Parameters:

Name Type Description Default
lag int

Number of time steps to look back for the lagged value

required
Source code in pysatl_tsp/core/processor/lag_handler.py
def __init__(self, lag: int):
    """Initialize a lag handler.

    :param lag: Number of time steps to look back for the lagged value
    """
    super().__init__()
    self.lag = lag
__iter__
__iter__() -> Iterator[float | None]

Create an iterator that yields transformed values based on lag comparison.

This method outputs None for the first 'lag' values, then applies the formula 2 * current_value - lagged_value for subsequent values. If either the current value or the lagged value is None, the result will be None.

Returns:

Type Description
Iterator[float | None]

Iterator yielding transformed values or None

Raises:

Type Description
ValueError

If no source has been set

Source code in pysatl_tsp/core/processor/lag_handler.py
def __iter__(self) -> Iterator[float | None]:
    """Create an iterator that yields transformed values based on lag comparison.

    This method outputs None for the first 'lag' values, then applies the formula
    2 * current_value - lagged_value for subsequent values. If either the current
    value or the lagged value is None, the result will be None.

    :return: Iterator yielding transformed values or None
    :raises ValueError: If no source has been set
    """
    if self.source is None:
        raise ValueError("LagHandler requires a data source")

    source_iter = iter(self.source)
    buffer: deque[float | None] = deque(maxlen=self.lag + 1)

    try:
        for _ in range(self.lag):
            buffer.append(next(source_iter))
            yield None  # First 'lag' values have no result
    except StopIteration:
        return

    # Apply formula for each new value
    for current_value in source_iter:
        buffer.append(current_value)  # Add new value
        lagged_value = buffer.popleft()

        if current_value is None or lagged_value is None:
            yield None
        else:
            yield 2 * current_value - lagged_value
mapping_handler
MappingHandler

Bases: Handler[T, U]

A handler that transforms time series data by applying a mapping function to each item.

This handler applies a user-defined transformation function to each data point in the input stream, producing a new stream of transformed values. It's useful for simple point-by-point transformations such as scaling, type conversion, feature extraction, or any operation that processes one input item at a time.
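As a rough illustration of point-by-point mapping, the sketch below uses a plain Python generator in place of the handler. The name `mapped` is hypothetical, and the sketch assumes only what the description states: the function is applied to each item, one at a time and in order.

```python
from collections.abc import Callable, Iterable, Iterator
from typing import TypeVar

T = TypeVar("T")
U = TypeVar("U")


def mapped(source: Iterable[T], map_func: Callable[[T], U]) -> Iterator[U]:
    """Lazily apply map_func to each item, preserving order and length."""
    for item in source:
        yield map_func(item)


celsius = [20.0, 21.5, 19.0]
fahrenheit = mapped(celsius, lambda c: c * 9 / 5 + 32)  # scaling step
labels = list(mapped(fahrenheit, lambda f: f"{f:.1f} F"))  # type-conversion step
print(labels)  # ['68.0 F', '70.7 F', '66.2 F']
```

Two mapping stages are chained here to mirror how several point-wise transformations can be stacked in a pipeline. Each stage sees exactly one item at a time, so no history is kept.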

Parameters:

Name Type Description Default
map_func Callable[[T], U]

Function that transforms each input item to an output item

required
source Handler[Any, T] | None

The handler providing input data, defaults to None

None
Source code in pysatl_tsp/core/processor/mapping_handler.py
class MappingHandler(Handler[T, U]):
    """A handler that transforms time series data by applying a mapping function to each item.

    This handler applies a user-defined transformation function to each data point
    in the input stream, producing a new stream of transformed values. It's useful for
    simple point-by-point transformations such as scaling, type conversion, feature
    extraction, or any operation that processes one input item at a time.

    :param map_func: Function that transforms each input item to an output item
    :param source: The handler providing input data, defaults to None

    Example:
        ```python
        # Create a data source with numeric values
        data_source = SimpleDataProvider([1, 2, 3, 4, 5])


        # Simple mapping function to square each value
        def square(x: int) -> int:
            return x * x


        # Create a mapping handler
        mapper = MappingHandler(map_func=square, source=data_source)

        # Process the data
        for transformed in mapper:
            print(transformed)

        # Output:
        # 1
        # 4
        # 9
        # 16
        # 25

        # Example with a more complex transformation
        import json

        # Data source with JSON strings
        json_data = [
            '{"timestamp": "2023-09-01T10:00:00", "value": 42.5}',
            '{"timestamp": "2023-09-01T10:01:00", "value": 43.2}',
            '{"timestamp": "2023-09-01T10:02:00", "value": 41.8}',
        ]
        json_source = SimpleDataProvider(json_data)


        # Function to extract timestamp and value from JSON
        def parse_json(json_str: str) -> tuple[str, float]:
            data = json.loads(json_str)
            return (data["timestamp"], data["value"])


        # Create a mapping handler for JSON parsing
        json_mapper = MappingHandler(map_func=parse_json, source=json_source)

        # Process JSON data
        for timestamp, value in json_mapper:
            print(f"Time: {timestamp}, Value: {value}")

        # Output:
        # Time: 2023-09-01T10:00:00, Value: 42.5
        # Time: 2023-09-01T10:01:00, Value: 43.2
        # Time: 2023-09-01T10:02:00, Value: 41.8
        ```
    """

    def __init__(self, map_func: Callable[[T], U], source: Handler[Any, T] | None = None):
        """Initialize a mapping handler.

        :param map_func: Function that transforms each input item to an output item
        :param source: The handler providing input data, defaults to None
        """
        super().__init__(source)
        self.map_func = map_func

    def __iter__(self) -> Iterator[U]:
        """Create an iterator that yields transformed items.

        This method iterates through the source data and applies the mapping function
        to each item, yielding the transformed results.

        :return: Iterator yielding transformed items
        :raises ValueError: If no source has been set
        """
        if self.source is None:
            raise ValueError("Source is not set")

        for segment in self.source:
            yield self.map_func(segment)
__init__
__init__(
    map_func: Callable[[T], U],
    source: Handler[Any, T] | None = None,
)

Initialize a mapping handler.

Parameters:

Name Type Description Default
map_func Callable[[T], U]

Function that transforms each input item to an output item

required
source Handler[Any, T] | None

The handler providing input data, defaults to None

None
Source code in pysatl_tsp/core/processor/mapping_handler.py
def __init__(self, map_func: Callable[[T], U], source: Handler[Any, T] | None = None):
    """Initialize a mapping handler.

    :param map_func: Function that transforms each input item to an output item
    :param source: The handler providing input data, defaults to None
    """
    super().__init__(source)
    self.map_func = map_func
__iter__
__iter__() -> Iterator[U]

Create an iterator that yields transformed items.

This method iterates through the source data and applies the mapping function to each item, yielding the transformed results.

Returns:

Type Description
Iterator[U]

Iterator yielding transformed items

Raises:

Type Description
ValueError

If no source has been set
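
The iteration described above amounts to a lazy, one-in-one-out map. The following is a minimal sketch in plain Python, independent of pysatl_tsp; the helper name `lazy_map` is hypothetical and not part of the library. It illustrates that items are transformed only as the iterator is consumed.

```python
from collections.abc import Callable, Iterable, Iterator
from typing import TypeVar

T = TypeVar("T")
U = TypeVar("U")


def lazy_map(map_func: Callable[[T], U], source: Iterable[T]) -> Iterator[U]:
    """Hypothetical stand-in for the loop in __iter__: apply map_func item by item."""
    for item in source:
        yield map_func(item)


calls: list[int] = []


def square(x: int) -> int:
    calls.append(x)  # record each invocation to make the laziness visible
    return x * x


stream = lazy_map(square, [1, 2, 3, 4, 5])
first = next(stream)             # only the first item has been transformed so far
calls_after_first = list(calls)  # snapshot: [1]
rest = list(stream)              # consuming the iterator transforms the remaining items
print(first, calls_after_first, rest)  # 1 [1] [4, 9, 16, 25]
```

Because nothing is computed ahead of consumption, a mapping stage of this kind adds no buffering of its own to a streaming pipeline.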

Source code in pysatl_tsp/core/processor/mapping_handler.py
def __iter__(self) -> Iterator[U]:
    """Create an iterator that yields transformed items.

    This method iterates through the source data and applies the mapping function
    to each item, yielding the transformed results.

    :return: Iterator yielding transformed items
    :raises ValueError: If no source has been set
    """
    if self.source is None:
        raise ValueError("Source is not set")

    for segment in self.source:
        yield self.map_func(segment)
sampling_handler
OfflineSamplingHandler

Bases: Handler[T, T]

A handler that samples time series data in batch mode based on identified indices.

This handler processes the entire dataset to identify sampling points before extracting the samples. It's suitable for global sampling strategies that consider the entire time series context, such as selecting representative points or key points that preserve the overall shape of the data.

Parameters:

Name Type Description Default
sampling_rule Callable[[ScrubberWindow[T]], list[int]]

Function that analyzes the entire series and returns indices of points to sample

required
source Handler[Any, T] | None

The handler providing input data, defaults to None

None
Source code in pysatl_tsp/core/processor/sampling_handler.py
class OfflineSamplingHandler(Handler[T, T]):
    """A handler that samples time series data in batch mode based on identified indices.

    This handler processes the entire dataset to identify sampling points before
    extracting the samples. It's suitable for global sampling strategies that consider
    the entire time series context, such as selecting representative points or
    key points that preserve the overall shape of the data.

    :param sampling_rule: Function that analyzes the entire series and returns indices of points to sample
    :param source: The handler providing input data, defaults to None

    Example:
        ```python
        import numpy as np
        import matplotlib.pyplot as plt
        from typing import List

        # Create a data source with a sinusoidal signal
        x = np.linspace(0, 4*np.pi, 1000)
        y = np.sin(x)
        data_source = SimpleDataProvider(y)

        # Define an offline sampling rule that selects local extrema
        def find_extrema(window: ScrubberWindow[float]) -> List[int]:
            data = np.array(window.values)
            # Find local maxima and minima
            extrema_indices = []

            # First point is always included
            extrema_indices.append(0)

            # Find local maxima and minima (simplified)
            for i in range(1, len(data)-1):
                if (data[i] > data[i-1] and data[i] > data[i+1]) or \
                   (data[i] < data[i-1] and data[i] < data[i+1]):
                    extrema_indices.append(i)

            # Last point is always included
            extrema_indices.append(len(data)-1)

            return extrema_indices

        # Create a sampling handler
        sampler = OfflineSamplingHandler(
            sampling_rule=find_extrema,
            source=data_source
        )

        # Process and collect sampled points
        sampled_indices = []
        sampled_values = []
        original_values = list(y)

        for i, value in enumerate(sampler):
            sampled_values.append(value)
            # Approximate index (not exact)
            sampled_indices.append(i * len(original_values) // len(sampled_values))

        # Visualize the results
        plt.figure(figsize=(12, 6))
        plt.plot(x, y, 'b-', label='Original signal')
        plt.plot(x[sampled_indices], sampled_values, 'ro', label='Sampled points')
        plt.legend()
        plt.title('Sinusoidal Signal with Extrema Sampling')
        plt.xlabel('x')
        plt.ylabel('sin(x)')
        plt.grid(True)
        plt.show()

        print(f"Original data points: {len(original_values)}")
        print(f"Sampled data points: {len(sampled_values)}")
        ```
    """

    def __init__(self, sampling_rule: Callable[[ScrubberWindow[T]], list[int]], source: Handler[Any, T] | None = None):
        """Initialize an offline sampling handler.

        :param sampling_rule: Function that analyzes the entire series and returns indices of points to sample
        :param source: The handler providing input data, defaults to None
        """
        super().__init__(source)
        self.sampling_rule = sampling_rule

    def __iter__(self) -> Iterator[T]:
        """Create an iterator that yields sampled values based on the indices identified by the sampling rule.

        This method uses OfflineSegmentationScrubber to segment the data at the specified indices
        and a MappingHandler to extract the last item from each segment.

        :return: Iterator yielding sampled values
        :raises ValueError: If no source has been set (propagated from segmentation scrubber)
        """
        mapping_handler: MappingHandler[ScrubberWindow[T], T] = MappingHandler(map_func=lambda window: window[-1])
        pipeline = (
            OfflineSegmentationScrubber(segmentation_rule=self.sampling_rule, source=self.source) | mapping_handler
        )

        yield from pipeline
__init__
__init__(
    sampling_rule: Callable[[ScrubberWindow[T]], list[int]],
    source: Handler[Any, T] | None = None,
)

Initialize an offline sampling handler.

Parameters:

Name Type Description Default
sampling_rule Callable[[ScrubberWindow[T]], list[int]]

Function that analyzes the entire series and returns indices of points to sample

required
source Handler[Any, T] | None

The handler providing input data, defaults to None

None
Source code in pysatl_tsp/core/processor/sampling_handler.py
def __init__(self, sampling_rule: Callable[[ScrubberWindow[T]], list[int]], source: Handler[Any, T] | None = None):
    """Initialize an offline sampling handler.

    :param sampling_rule: Function that analyzes the entire series and returns indices of points to sample
    :param source: The handler providing input data, defaults to None
    """
    super().__init__(source)
    self.sampling_rule = sampling_rule
__iter__
__iter__() -> Iterator[T]

Create an iterator that yields sampled values based on the indices identified by the sampling rule.

This method uses OfflineSegmentationScrubber to segment the data at the specified indices and a MappingHandler to extract the last item from each segment.

Returns:

Type Description
Iterator[T]

Iterator yielding sampled values

Raises:

Type Description
ValueError

If no source has been set (propagated from segmentation scrubber)
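
To see which values "the last item from each segment" selects, consider this plain-Python sketch. It assumes, based on the OfflineSegmentationScrubber docstring example, that each returned index starts a new segment. The helper `sample_last_of_segments` is hypothetical. Its treatment of index 0, out-of-range indices, and duplicates (they are ignored) is an assumption, not documented library behaviour.

```python
from collections.abc import Sequence
from typing import TypeVar

T = TypeVar("T")


def sample_last_of_segments(data: Sequence[T], indices: list[int]) -> list[T]:
    """Return the last item of every segment of data.

    Assumption: segments are cut so that each index in `indices` starts a new one.
    """
    # Segment boundaries: start of data, every interior index, end of data.
    bounds = sorted({0, len(data), *(i for i in indices if 0 < i < len(data))})
    return [data[end - 1] for _start, end in zip(bounds, bounds[1:])]


data = [10, 11, 12, 50, 51, 52, 90, 91]
sampled = sample_last_of_segments(data, [3, 6])
print(sampled)  # [12, 52, 91]
```

Under this assumption the sampled values are the items immediately before each returned index, plus the final item of the series, rather than the items at the returned indices themselves.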

Source code in pysatl_tsp/core/processor/sampling_handler.py
def __iter__(self) -> Iterator[T]:
    """Create an iterator that yields sampled values based on the indices identified by the sampling rule.

    This method uses OfflineSegmentationScrubber to segment the data at the specified indices
    and a MappingHandler to extract the last item from each segment.

    :return: Iterator yielding sampled values
    :raises ValueError: If no source has been set (propagated from segmentation scrubber)
    """
    mapping_handler: MappingHandler[ScrubberWindow[T], T] = MappingHandler(map_func=lambda window: window[-1])
    pipeline = (
        OfflineSegmentationScrubber(segmentation_rule=self.sampling_rule, source=self.source) | mapping_handler
    )

    yield from pipeline
OnlineSamplingHandler

Bases: Handler[T, T]

A handler that samples time series data in real-time based on a condition.

This handler uses segmentation to identify points where sampling should occur and extracts the last item from each segment. It processes data in real-time and is suitable for adaptive sampling strategies, where sampling decisions are made based on the recent history of the time series.

Parameters:

Name Type Description Default
sampling_rule Callable[[ScrubberWindow[T]], bool]

Function that decides when to take a sample

required
source Handler[Any, T] | None

The handler providing input data, defaults to None

None
Source code in pysatl_tsp/core/processor/sampling_handler.py
class OnlineSamplingHandler(Handler[T, T]):
    """A handler that samples time series data in real-time based on a condition.

    This handler uses segmentation to identify points where sampling should occur
    and extracts the last item from each segment. It processes data in real-time
    and is suitable for adaptive sampling strategies, where sampling decisions
    are made based on the recent history of the time series.

    :param sampling_rule: Function that decides when to take a sample
    :param source: The handler providing input data, defaults to None

    Example:
        ```python
        # Create a data source with steadily increasing values
        data = list(range(100))
        data_source = SimpleDataProvider(data)


        # Define a sampling rule that samples when the value changes by at least 5
        def significant_change(window: ScrubberWindow[int]) -> bool:
            if len(window) < 2:
                return False

            # Get last sample taken (first item in window) and current value
            last_sampled = window[0]
            current = window[-1]

            # Sample if change is significant
            return abs(current - last_sampled) >= 5


        # Create a sampling handler
        sampler = OnlineSamplingHandler(sampling_rule=significant_change, source=data_source)

        # Process and collect sampled points
        sampled_points = list(sampler)

        print(f"Original data points: {len(data)}")
        print(f"Sampled data points: {len(sampled_points)}")
        print(f"Sampled values: {sampled_points[:10]}...")

        # Output might look like:
        # Original data points: 100
        # Sampled data points: 20
        # Sampled values: [0, 5, 10, 15, 20, 25, 30, 35, 40, 45]...
        ```
    """

    def __init__(self, sampling_rule: Callable[[ScrubberWindow[T]], bool], source: Handler[Any, T] | None = None):
        """Initialize an online sampling handler.

        :param sampling_rule: Function that decides when to take a sample
        :param source: The handler providing input data, defaults to None
        """
        super().__init__(source)
        self.sampling_rule = sampling_rule

    def __iter__(self) -> Iterator[T]:
        """Create an iterator that yields sampled values based on the sampling rule.

        This method uses OnlineSegmentationScrubber to segment the data and a
        MappingHandler to extract the last item from each segment.

        :return: Iterator yielding sampled values
        :raises ValueError: If no source has been set (propagated from segmentation scrubber)
        """
        mapping_handler: MappingHandler[ScrubberWindow[T], T] = MappingHandler(map_func=lambda window: window[-1])
        pipeline = (
            OnlineSegmentationScrubber(segmentation_rule=self.sampling_rule, source=self.source) | mapping_handler
        )

        yield from pipeline
__init__
__init__(
    sampling_rule: Callable[[ScrubberWindow[T]], bool],
    source: Handler[Any, T] | None = None,
)

Initialize an online sampling handler.

Parameters:

Name Type Description Default
sampling_rule Callable[[ScrubberWindow[T]], bool]

Function that decides when to take a sample

required
source Handler[Any, T] | None

The handler providing input data, defaults to None

None
Source code in pysatl_tsp/core/processor/sampling_handler.py
def __init__(self, sampling_rule: Callable[[ScrubberWindow[T]], bool], source: Handler[Any, T] | None = None):
    """Initialize an online sampling handler.

    :param sampling_rule: Function that decides when to take a sample
    :param source: The handler providing input data, defaults to None
    """
    super().__init__(source)
    self.sampling_rule = sampling_rule
__iter__
__iter__() -> Iterator[T]

Create an iterator that yields sampled values based on the sampling rule.

This method uses OnlineSegmentationScrubber to segment the data and a MappingHandler to extract the last item from each segment.

Returns:

Type Description
Iterator[T]

Iterator yielding sampled values

Raises:

Type Description
ValueError

If no source has been set (propagated from segmentation scrubber)
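
The sample-on-condition idea can be sketched without the library. The helper `online_sample` below is hypothetical. It assumes that each new buffer is seeded with the item just sampled and that a trailing buffer which never satisfies the rule is dropped. OnlineSegmentationScrubber's actual boundary handling is not shown in this section and may differ, so the library's output need not match this sketch.

```python
from collections.abc import Callable, Iterable, Iterator
from typing import TypeVar

T = TypeVar("T")


def online_sample(rule: Callable[[list[T]], bool], source: Iterable[T]) -> Iterator[T]:
    """Yield the newest item whenever rule(buffer) fires, then start a new buffer.

    Assumptions: the new buffer starts with the sampled item, and leftover
    items at the end of the stream are not emitted.
    """
    buffer: list[T] = []
    for item in source:
        buffer.append(item)
        if rule(buffer):
            yield item
            buffer = [item]


def significant_change(window: list[int]) -> bool:
    # Fire once the newest value differs from the oldest buffered value by at least 5.
    return len(window) >= 2 and abs(window[-1] - window[0]) >= 5


samples = list(online_sample(significant_change, range(100)))
print(len(samples), samples[:5])  # 19 [5, 10, 15, 20, 25]
```

The decision is made one item at a time from the recent history only, so the full series is never needed in memory.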

Source code in pysatl_tsp/core/processor/sampling_handler.py
def __iter__(self) -> Iterator[T]:
    """Create an iterator that yields sampled values based on the sampling rule.

    This method uses OnlineSegmentationScrubber to segment the data and a
    MappingHandler to extract the last item from each segment.

    :return: Iterator yielding sampled values
    :raises ValueError: If no source has been set (propagated from segmentation scrubber)
    """
    mapping_handler: MappingHandler[ScrubberWindow[T], T] = MappingHandler(map_func=lambda window: window[-1])
    pipeline = (
        OnlineSegmentationScrubber(segmentation_rule=self.sampling_rule, source=self.source) | mapping_handler
    )

    yield from pipeline
tee_handler
TeeHandler

Bases: Handler[T, U]

A handler that processes data through two parallel paths and combines the results.

This handler takes input data from a source, sends it through both the original path and a processing path simultaneously, and then combines each pair of outputs using a provided function. It's useful for operations where you need to preserve the original data while also using a transformed version of it.

Parameters:

Name Type Description Default
processor Handler[T, S]

Handler that processes the tee'd data stream

required
combine_func Callable[[T, S], U]

Function that combines original and processed values

required
Source code in pysatl_tsp/core/processor/tee_handler.py
class TeeHandler(Handler[T, U]):
    """A handler that processes data through two parallel paths and combines the results.

    This handler takes input data from a source, sends it through both the original path
    and a processing path simultaneously, and then combines each pair of outputs using
    a provided function. It's useful for operations where you need to preserve the original
    data while also using a transformed version of it.

    :param processor: Handler that processes the tee'd data stream
    :param combine_func: Function that combines original and processed values

    Example:
        ```python
        # Create a data source
        data_source = SimpleDataProvider([1, 2, 3, 4, 5])

        # Define a processor that squares the values
        square_processor = MappingHandler(map_func=lambda x: x * x)


        # Define a function to combine original and processed values
        def combine(original, processed):
            return f"{original} squared is {processed}"


        # Create and use the tee handler
        tee_handler = TeeHandler(processor=square_processor, combine_func=combine)
        tee_handler.set_source(data_source)

        # Process the data
        for result in tee_handler:
            print(result)

        # Output:
        # 1 squared is 1
        # 2 squared is 4
        # 3 squared is 9
        # 4 squared is 16
        # 5 squared is 25
        ```
    """

    def __init__(self, processor: Handler[T, S], combine_func: Callable[[T, S], U]):
        """Initialize a tee handler.

        :param processor: Handler that processes the tee'd data stream
        :param combine_func: Function that combines original and processed values
        """
        super().__init__()
        self.processor = processor
        self.combine_func = combine_func

    def __iter__(self) -> Iterator[U]:
        """Create an iterator that yields combined results from original and processed data.

        This method creates two identical iterators from the source, processes one through
        the processor, and then combines corresponding items from both streams using the
        combine function.

        :return: Iterator yielding combined results
        :raises ValueError: If no source has been set
        """
        if self.source is None:
            raise ValueError("TeeHandler requires a data source")

        source_iterator = iter(self.source)

        original_iter, process_iter = itertools.tee(source_iterator)

        process_provider = SimpleDataProvider(process_iter)
        processed_pipeline = process_provider | self.processor
        processed_iter = iter(processed_pipeline)

        try:
            while True:
                original_value = next(original_iter)
                processed_value = next(processed_iter)
                yield self.combine_func(original_value, processed_value)
        except StopIteration:
            pass
__init__
__init__(
    processor: Handler[T, S],
    combine_func: Callable[[T, S], U],
)

Initialize a tee handler.

Parameters:

Name Type Description Default
processor Handler[T, S]

Handler that processes the tee'd data stream

required
combine_func Callable[[T, S], U]

Function that combines original and processed values

required
Source code in pysatl_tsp/core/processor/tee_handler.py
def __init__(self, processor: Handler[T, S], combine_func: Callable[[T, S], U]):
    """Initialize a tee handler.

    :param processor: Handler that processes the tee'd data stream
    :param combine_func: Function that combines original and processed values
    """
    super().__init__()
    self.processor = processor
    self.combine_func = combine_func
__iter__
__iter__() -> Iterator[U]

Create an iterator that yields combined results from original and processed data.

This method creates two identical iterators from the source, processes one through the processor, and then combines corresponding items from both streams using the combine function.

Returns:

Type Description
Iterator[U]

Iterator yielding combined results

Raises:

Type Description
ValueError

If no source has been set
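
The duplicate-process-combine flow can be illustrated with the standard library alone. In this sketch the helper `tee_and_combine` is hypothetical, and the processing branch is a plain function over an iterator rather than a Handler.

```python
import itertools
from collections.abc import Callable, Iterable, Iterator
from typing import TypeVar

T = TypeVar("T")
S = TypeVar("S")
U = TypeVar("U")


def tee_and_combine(
    source: Iterable[T],
    process: Callable[[Iterator[T]], Iterator[S]],
    combine: Callable[[T, S], U],
) -> Iterator[U]:
    """Hypothetical helper: duplicate the stream, process one copy, pair the results."""
    original_iter, process_iter = itertools.tee(iter(source))
    # zip stops as soon as either branch is exhausted.
    for original, processed in zip(original_iter, process(process_iter)):
        yield combine(original, processed)


results = list(
    tee_and_combine(
        [1, 2, 3],
        lambda items: (x * x for x in items),
        lambda original, processed: f"{original} squared is {processed}",
    )
)
print(results)  # ['1 squared is 1', '2 squared is 4', '3 squared is 9']
```

Pairing is positional, so the combined values line up only when the processing branch emits exactly one output per input item.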

Source code in pysatl_tsp/core/processor/tee_handler.py
def __iter__(self) -> Iterator[U]:
    """Create an iterator that yields combined results from original and processed data.

    This method creates two identical iterators from the source, processes one through
    the processor, and then combines corresponding items from both streams using the
    combine function.

    :return: Iterator yielding combined results
    :raises ValueError: If no source has been set
    """
    if self.source is None:
        raise ValueError("TeeHandler requires a data source")

    source_iterator = iter(self.source)

    original_iter, process_iter = itertools.tee(source_iterator)

    process_provider = SimpleDataProvider(process_iter)
    processed_pipeline = process_provider | self.processor
    processed_iter = iter(processed_pipeline)

    try:
        while True:
            original_value = next(original_iter)
            processed_value = next(processed_iter)
            yield self.combine_func(original_value, processed_value)
    except StopIteration:
        pass

scrubber

This module provides various scrubber implementations for handling and processing data streams.

LinearScrubber

Bases: SlidingScrubber[T]

A scrubber that creates fixed-size sliding windows with configurable overlap.

This is a specialized sliding scrubber that emits windows of a fixed size and allows controlling the overlap between consecutive windows through a shift factor.

Parameters:

Name Type Description Default
window_length int

Number of points in each window, defaults to 100

100
shift_factor float

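Fraction of the window length to shift after each emission, defaults to 1/3. Consecutive windows overlap by roughly 1 - shift_factor of their length (e.g., 0.5 gives 50% overlap, and the default 1/3 gives about two-thirds overlap)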

1.0 / 3.0
source Handler[Any, T] | None

The handler providing input data, defaults to None

None
Source code in pysatl_tsp/core/scrubber/linear_scrubber.py
class LinearScrubber(SlidingScrubber[T]):
    """A scrubber that creates fixed-size sliding windows with configurable overlap.

    This is a specialized sliding scrubber that emits windows of a fixed size and
    allows controlling the overlap between consecutive windows through a shift factor.

    :param window_length: Number of points in each window, defaults to 100
    :param shift_factor: Fraction of window to shift after each emission, defaults to 1/3
                        (e.g., 0.5 means 50% overlap between consecutive windows)
    :param source: The handler providing input data, defaults to None

    Example:
        ```python
        # Create a data source with a sequence of numbers
        data_source = SimpleDataProvider(range(10))

        # Create a linear scrubber with window size 4 and 50% overlap
        scrubber = LinearScrubber(window_length=4, shift_factor=0.5, source=data_source)

        # Process the windows
        for window in scrubber:
            print(f"Window values: {list(window.values)}")

        # Output:
        # Window values: [0, 1, 2, 3]
        # Window values: [2, 3, 4, 5]
        # Window values: [4, 5, 6, 7]
        # Window values: [6, 7, 8, 9]
        ```
    """

    def __init__(
        self, window_length: int = 100, shift_factor: float = 1.0 / 3.0, source: Handler[Any, T] | None = None
    ) -> None:
        """Initialize a linear scrubber with fixed window size and overlap.

        :param window_length: Number of points in each window, defaults to 100
        :param shift_factor: Fraction of window to shift after each emission, defaults to 1/3
        :param source: The handler providing input data, defaults to None
        """
        shift = max(1, int(shift_factor * window_length))
        super().__init__(take_condition=lambda buffer: len(buffer) >= window_length, shift=shift, source=source)
__init__
__init__(
    window_length: int = 100,
    shift_factor: float = 1.0 / 3.0,
    source: Handler[Any, T] | None = None,
) -> None

Initialize a linear scrubber with fixed window size and overlap.

Parameters:

Name Type Description Default
window_length int

Number of points in each window, defaults to 100

100
shift_factor float

Fraction of window to shift after each emission, defaults to 1/3

1.0 / 3.0
source Handler[Any, T] | None

The handler providing input data, defaults to None

None
Source code in pysatl_tsp/core/scrubber/linear_scrubber.py
def __init__(
    self, window_length: int = 100, shift_factor: float = 1.0 / 3.0, source: Handler[Any, T] | None = None
) -> None:
    """Initialize a linear scrubber with fixed window size and overlap.

    :param window_length: Number of points in each window, defaults to 100
    :param shift_factor: Fraction of window to shift after each emission, defaults to 1/3
    :param source: The handler providing input data, defaults to None
    """
    shift = max(1, int(shift_factor * window_length))
    super().__init__(take_condition=lambda buffer: len(buffer) >= window_length, shift=shift, source=source)
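The shift computation in the listing above can be checked in isolation. The following is a minimal sketch that only mirrors the `max(1, int(shift_factor * window_length))` expression; the helper name `compute_shift` is hypothetical and is not part of pysatl_tsp.

```python
def compute_shift(window_length: int, shift_factor: float) -> int:
    """Mirror the shift expression used in LinearScrubber.__init__ (illustrative helper)."""
    # int() truncates toward zero; max(1, ...) guarantees the window always advances.
    return max(1, int(shift_factor * window_length))


print(compute_shift(4, 0.5))        # 2  -> consecutive windows share 2 points
print(compute_shift(100, 1.0 / 3))  # 33 -> the default configuration
print(compute_shift(2, 0.1))        # 1  -> tiny factors are clamped to one point
```

Because the product is truncated, the effective shift can be slightly smaller than `shift_factor * window_length`, and the overlap between consecutive windows is `window_length - shift` points.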
OfflineSegmentationScrubber

Bases: Scrubber[T]

A scrubber that segments time series data based on changepoints in batch mode.

This scrubber processes the entire input data in a batch (offline) mode and segments it according to a provided segmentation rule. The rule identifies changepoints in the data, which are then used to create non-overlapping segments.

This approach is suitable for scenarios where the entire dataset is available upfront and the segmentation logic requires global context or multiple passes over the data.

Parameters:

Name Type Description Default
segmentation_rule Callable[[ScrubberWindow[T]], list[int]]

Function that analyzes the complete series and returns a list of changepoint indices

required
source Handler[Any, T] | None

The handler providing input data, defaults to None

None
Source code in pysatl_tsp/core/scrubber/segmentation_scrubber.py
class OfflineSegmentationScrubber(Scrubber[T]):
    """A scrubber that segments time series data based on changepoints in batch mode.

    This scrubber processes the entire input data in a batch (offline) mode and
    segments it according to a provided segmentation rule. The rule identifies
    changepoints in the data, which are then used to create non-overlapping segments.

    This approach is suitable for scenarios where the entire dataset is available upfront
    and the segmentation logic requires global context or multiple passes over the data.

    :param segmentation_rule: Function that analyzes the complete series and returns a list of changepoint indices
    :param source: The handler providing input data, defaults to None

    Example:
        ```python
        # Create a data source with synthetic pattern
        data = [1, 1, 2, 2, 5, 5, 5, 1, 1, 1, 6, 6, 6, 6]
        data_source = SimpleDataProvider(data)


        # Define a simple jump-threshold segmentation rule
        def find_changepoints(window: ScrubberWindow[int]) -> list[int]:
            changepoints = []
            # Simple detection of value changes
            for i in range(1, len(window)):
                if abs(window[i] - window[i - 1]) > 2:  # Threshold for change
                    changepoints.append(i)
            return changepoints


        # Create the segmentation scrubber
        segmenter = OfflineSegmentationScrubber(segmentation_rule=find_changepoints, source=data_source)

        # Process the segments
        for segment in segmenter:
            print(f"Segment values: {list(segment.values)}")

        # Output:
        # Segment values: [1, 1, 2, 2]
        # Segment values: [5, 5, 5]
        # Segment values: [1, 1, 1]
        # Segment values: [6, 6, 6, 6]
        ```
    """

    def __init__(
        self, segmentation_rule: Callable[[ScrubberWindow[T]], list[int]], source: Handler[Any, T] | None = None
    ):
        """Initialize an offline segmentation scrubber.

        :param segmentation_rule: Function that analyzes the complete series and returns a list of changepoint indices
        :param source: The handler providing input data, defaults to None
        """
        super().__init__(source)
        self.segmentation_rule = segmentation_rule

    def __iter__(self) -> Iterator[ScrubberWindow[T]]:
        """Create an iterator that yields segments based on detected changepoints.

        This method collects all data from the source, applies the segmentation rule
        to identify changepoints, and then yields segments between the detected changepoints.

        :return: Iterator yielding ScrubberWindow instances for each segment
        :raises ValueError: If no source has been set
        """
        if self.source is None:
            raise ValueError("Source is not set")

        full_series_list = list(iter(self.source))
        full_series_deque = deque(full_series_list)
        series_window = ScrubberWindow(full_series_deque)
        change_points = self.segmentation_rule(series_window)
        segments = [0, *change_points, len(full_series_deque)]
        for start, end in zip(segments[:-1], segments[1:]):
            yield series_window[start:end]
__init__
__init__(
    segmentation_rule: Callable[
        [ScrubberWindow[T]], list[int]
    ],
    source: Handler[Any, T] | None = None,
)

Initialize an offline segmentation scrubber.

Parameters:

Name Type Description Default
segmentation_rule Callable[[ScrubberWindow[T]], list[int]]

Function that analyzes the complete series and returns a list of changepoint indices

required
source Handler[Any, T] | None

The handler providing input data, defaults to None

None
Source code in pysatl_tsp/core/scrubber/segmentation_scrubber.py
def __init__(
    self, segmentation_rule: Callable[[ScrubberWindow[T]], list[int]], source: Handler[Any, T] | None = None
):
    """Initialize an offline segmentation scrubber.

    :param segmentation_rule: Function that analyzes the complete series and returns a list of changepoint indices
    :param source: The handler providing input data, defaults to None
    """
    super().__init__(source)
    self.segmentation_rule = segmentation_rule
__iter__
__iter__() -> Iterator[ScrubberWindow[T]]

Create an iterator that yields segments based on detected changepoints.

This method collects all data from the source, applies the segmentation rule to identify changepoints, and then yields segments between the detected changepoints.

Returns:

Type Description
Iterator[ScrubberWindow[T]]

Iterator yielding ScrubberWindow instances for each segment

Raises:

Type Description
ValueError

If no source has been set

Source code in pysatl_tsp/core/scrubber/segmentation_scrubber.py
def __iter__(self) -> Iterator[ScrubberWindow[T]]:
    """Create an iterator that yields segments based on detected changepoints.

    This method collects all data from the source, applies the segmentation rule
    to identify changepoints, and then yields segments between the detected changepoints.

    :return: Iterator yielding ScrubberWindow instances for each segment
    :raises ValueError: If no source has been set
    """
    if self.source is None:
        raise ValueError("Source is not set")

    full_series_list = list(iter(self.source))
    full_series_deque = deque(full_series_list)
    series_window = ScrubberWindow(full_series_deque)
    change_points = self.segmentation_rule(series_window)
    segments = [0, *change_points, len(full_series_deque)]
    for start, end in zip(segments[:-1], segments[1:]):
        yield series_window[start:end]
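The boundary arithmetic in `__iter__` can be illustrated without the library. This is a minimal sketch on plain lists; the function name `split_at_changepoints` is hypothetical and stands in for the `[0, *change_points, len(...)]` and `zip` steps of the listing above.

```python
def split_at_changepoints(series: list[int], change_points: list[int]) -> list[list[int]]:
    """Cut a series into consecutive, non-overlapping segments at the given indices."""
    # Add the implicit boundaries at the start and end of the series.
    boundaries = [0, *change_points, len(series)]
    # Pair each boundary with its successor: (0, c1), (c1, c2), ..., (ck, n).
    return [series[start:end] for start, end in zip(boundaries[:-1], boundaries[1:])]


data = [1, 1, 2, 2, 5, 5, 5, 1, 1, 1, 6, 6, 6, 6]
print(split_at_changepoints(data, [4, 7, 10]))
# [[1, 1, 2, 2], [5, 5, 5], [1, 1, 1], [6, 6, 6, 6]]
```

Under this scheme each changepoint index is the first position of a new segment. A rule that returns `0` or `len(series)` as a changepoint produces an empty segment, so a segmentation rule would usually report only interior, strictly increasing indices.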
OnlineSegmentationScrubber

Bases: Scrubber[T]

A scrubber that segments time series data in real-time based on a condition.

This scrubber processes data points sequentially (online mode) and segments the time series whenever a specified condition is met or a maximum segment size is reached. It's designed for streaming data where segments need to be identified in real-time without waiting for the complete dataset.

Parameters:

Name Type Description Default
segmentation_rule Callable[[ScrubberWindow[T]], bool]

Function that evaluates the current window and returns True when a segment should end

required
max_segment_size int

Maximum number of points in a segment before forcing a split, defaults to 2^64

2 ** 64
source Handler[Any, T] | None

The handler providing input data, defaults to None

None
Source code in pysatl_tsp/core/scrubber/segmentation_scrubber.py
class OnlineSegmentationScrubber(Scrubber[T]):
    """A scrubber that segments time series data in real-time based on a condition.

    This scrubber processes data points sequentially (online mode) and segments
    the time series whenever a specified condition is met or a maximum segment size
    is reached. It's designed for streaming data where segments need to be identified
    in real-time without waiting for the complete dataset.

    :param segmentation_rule: Function that evaluates the current window and returns True when a segment should end
    :param max_segment_size: Maximum number of points in a segment before forcing a split, defaults to 2^64
    :param source: The handler providing input data, defaults to None

    Example:
        ```python
        # Create a data source with streaming values
        data = [1, 1, 2, 3, 8, 9, 8, 2, 2, 3, 10, 10, 9, 9]
        data_source = SimpleDataProvider(data)


        # Define a threshold-based segmentation rule
        def detect_jump(window: ScrubberWindow[int]) -> bool:
            if len(window) < 2:
                return False

            # Detect a large jump in values
            last_value = window[-1]
            prev_value = window[-2]
            return abs(last_value - prev_value) > 3


        # Create the online segmentation scrubber
        segmenter = OnlineSegmentationScrubber(
            segmentation_rule=detect_jump,
            max_segment_size=5,  # Force segmentation after 5 points if no jump detected
            source=data_source,
        )

        # Process the segments as they're detected
        for segment in segmenter:
            print(f"Segment values: {list(segment.values)}")

        # Output:
        # Segment values: [1, 1, 2, 3, 8]  # Split due to jump from 3 to 8 and max size
        # Segment values: [9, 8, 2]        # Split due to jump from 8 to 2
        # Segment values: [2, 3, 10]       # Split due to jump from 3 to 10
        # Segment values: [10, 9, 9]       # Remaining data
        ```
    """

    def __init__(
        self,
        segmentation_rule: Callable[[ScrubberWindow[T]], bool],
        max_segment_size: int = 2**64,
        source: Handler[Any, T] | None = None,
    ):
        """Initialize an online segmentation scrubber.

        :param segmentation_rule: Function that evaluates the current window and returns True when a segment should end
        :param max_segment_size: Maximum number of points in a segment before forcing a split, defaults to 2^64
        :param source: The handler providing input data, defaults to None
        """
        super().__init__(source)
        self.segmentation_rule = segmentation_rule
        self.max_segment_size = max_segment_size

    def __iter__(self) -> Iterator[ScrubberWindow[T]]:
        """Create an iterator that yields segments as they're detected in real-time.

        This method processes data points one by one, accumulating them in a buffer
        and checking after each addition whether the segmentation condition is met
        or the maximum segment size is reached.

        :return: Iterator yielding ScrubberWindow instances for each detected segment
        :raises ValueError: If no source has been set
        """
        if self.source is None:
            raise ValueError("Source is not set")
        current_window: ScrubberWindow[T] = ScrubberWindow(deque())
        for index, item in enumerate(self.source):
            current_window.append(item, index)

            if self.segmentation_rule(current_window) or len(current_window) >= self.max_segment_size:
                yield current_window.copy()
                current_window.clear()

        if current_window:
            yield current_window.copy()
__init__
__init__(
    segmentation_rule: Callable[[ScrubberWindow[T]], bool],
    max_segment_size: int = 2**64,
    source: Handler[Any, T] | None = None,
)

Initialize an online segmentation scrubber.

Parameters:

Name Type Description Default
segmentation_rule Callable[[ScrubberWindow[T]], bool]

Function that evaluates the current window and returns True when a segment should end

required
max_segment_size int

Maximum number of points in a segment before forcing a split, defaults to 2^64

2 ** 64
source Handler[Any, T] | None

The handler providing input data, defaults to None

None
Source code in pysatl_tsp/core/scrubber/segmentation_scrubber.py
def __init__(
    self,
    segmentation_rule: Callable[[ScrubberWindow[T]], bool],
    max_segment_size: int = 2**64,
    source: Handler[Any, T] | None = None,
):
    """Initialize an online segmentation scrubber.

    :param segmentation_rule: Function that evaluates the current window and returns True when a segment should end
    :param max_segment_size: Maximum number of points in a segment before forcing a split, defaults to 2^64
    :param source: The handler providing input data, defaults to None
    """
    super().__init__(source)
    self.segmentation_rule = segmentation_rule
    self.max_segment_size = max_segment_size
__iter__
__iter__() -> Iterator[ScrubberWindow[T]]

Create an iterator that yields segments as they're detected in real-time.

This method processes data points one by one, accumulating them in a buffer and checking after each addition whether the segmentation condition is met or the maximum segment size is reached.

Returns:

Type Description
Iterator[ScrubberWindow[T]]

Iterator yielding ScrubberWindow instances for each detected segment

Raises:

Type Description
ValueError

If no source has been set

Source code in pysatl_tsp/core/scrubber/segmentation_scrubber.py
def __iter__(self) -> Iterator[ScrubberWindow[T]]:
    """Create an iterator that yields segments as they're detected in real-time.

    This method processes data points one by one, accumulating them in a buffer
    and checking after each addition whether the segmentation condition is met
    or the maximum segment size is reached.

    :return: Iterator yielding ScrubberWindow instances for each detected segment
    :raises ValueError: If no source has been set
    """
    if self.source is None:
        raise ValueError("Source is not set")
    current_window: ScrubberWindow[T] = ScrubberWindow(deque())
    for index, item in enumerate(self.source):
        current_window.append(item, index)

        if self.segmentation_rule(current_window) or len(current_window) >= self.max_segment_size:
            yield current_window.copy()
            current_window.clear()

    if current_window:
        yield current_window.copy()
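The control flow of the online loop can be traced with a library-free sketch. The generator `online_segments` below is a hypothetical stand-in that uses a plain list instead of `ScrubberWindow`; it follows the same accumulate, test, emit, clear sequence as the listing above.

```python
from collections.abc import Callable, Iterable, Iterator


def online_segments(
    data: Iterable[int], rule: Callable[[list[int]], bool], max_size: int
) -> Iterator[list[int]]:
    """Emit a segment whenever the rule fires or the buffer reaches max_size."""
    buffer: list[int] = []
    for item in data:
        buffer.append(item)
        if rule(buffer) or len(buffer) >= max_size:
            yield list(buffer)  # emit a snapshot, then start a fresh segment
            buffer.clear()
    if buffer:  # flush whatever is left when the source is exhausted
        yield list(buffer)


def detect_jump(buffer: list[int]) -> bool:
    """Fire when the two most recent points differ by more than 3."""
    return len(buffer) >= 2 and abs(buffer[-1] - buffer[-2]) > 3


data = [1, 1, 2, 3, 8, 9, 8, 2, 2, 3, 10, 10, 9, 9]
print(list(online_segments(data, detect_jump, max_size=5)))
# [[1, 1, 2, 3, 8], [9, 8, 2], [2, 3, 10], [10, 9, 9]]
```

Two properties follow from this structure: the point that triggers the rule is the last element of the emitted segment, and the rule only ever sees the current buffer, so it cannot compare the first point of a new segment with the last point of the previous one.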
Scrubber

Bases: Handler[T, ScrubberWindow[T]]

Abstract base class for handlers that produce window views of time series data.

Scrubbers consume individual data points and produce windows (sections) of the data stream. They are essential components for algorithms that need to analyze multiple data points together, such as moving averages, pattern detection, or feature extraction.

Concrete implementations of this class define specific windowing strategies such as fixed-size sliding windows, tumbling windows, or context-based windows.

Parameters:

Name Type Description Default
source Handler[Any, T] | None

The handler providing input data, defaults to None

None
Source code in pysatl_tsp/core/scrubber/abstract.py
class Scrubber(Handler[T, ScrubberWindow[T]]):
    """Abstract base class for handlers that produce window views of time series data.

    Scrubbers consume individual data points and produce windows (sections) of the
    data stream. They are essential components for algorithms that need to analyze
    multiple data points together, such as moving averages, pattern detection,
    or feature extraction.

    Concrete implementations of this class define specific windowing strategies such
    as fixed-size sliding windows, tumbling windows, or context-based windows.

    :param source: The handler providing input data, defaults to None

    Example:
        ```python
        # Example with a fixed-size sliding window scrubber (implementation not shown)

        # Create a data source
        data_source = SimpleDataProvider([10, 20, 30, 40, 50, 60, 70, 80])

        # Create a sliding window scrubber with window size 3
        window_scrubber = SlidingWindowScrubber(window_size=3, source=data_source)

        # Process windows
        for window in window_scrubber:
            # Each window is a ScrubberWindow instance
            print(f"Window values: {list(window.values)}")
            print(f"Window indices: {list(window.indices)}")

            # Calculate window statistics
            avg = sum(window.values) / len(window)
            print(f"Window average: {avg}")

        # Output:
        # Window values: [10, 20, 30]
        # Window indices: [0, 1, 2]
        # Window average: 20.0
        #
        # Window values: [20, 30, 40]
        # Window indices: [1, 2, 3]
        # Window average: 30.0
        #
        # ... and so on
        ```
    """

    def __init__(self, source: Handler[Any, T] | None = None) -> None:
        """Initialize a scrubber.

        :param source: The handler providing input data, defaults to None
        """
        super().__init__(source)

    @abstractmethod
    def __iter__(self) -> Iterator[ScrubberWindow[T]]:
        """Create an iterator that yields window views of the input data.

        Concrete implementations define specific windowing strategies.

        :return: Iterator yielding ScrubberWindow instances
        """
        pass
__init__
__init__(source: Handler[Any, T] | None = None) -> None

Initialize a scrubber.

Parameters:

Name Type Description Default
source Handler[Any, T] | None

The handler providing input data, defaults to None

None
Source code in pysatl_tsp/core/scrubber/abstract.py
def __init__(self, source: Handler[Any, T] | None = None) -> None:
    """Initialize a scrubber.

    :param source: The handler providing input data, defaults to None
    """
    super().__init__(source)
__iter__ abstractmethod
__iter__() -> Iterator[ScrubberWindow[T]]

Create an iterator that yields window views of the input data.

Concrete implementations define specific windowing strategies.

Returns:

Type Description
Iterator[ScrubberWindow[T]]

Iterator yielding ScrubberWindow instances

Source code in pysatl_tsp/core/scrubber/abstract.py
@abstractmethod
def __iter__(self) -> Iterator[ScrubberWindow[T]]:
    """Create an iterator that yields window views of the input data.

    Concrete implementations define specific windowing strategies.

    :return: Iterator yielding ScrubberWindow instances
    """
    pass
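To make "windowing strategy" concrete, here is a minimal, library-free sketch of one such strategy, a tumbling (non-overlapping) window. The generator `tumbling_windows` is hypothetical; it yields `(indices, values)` pairs in place of `ScrubberWindow` instances, and dropping a trailing incomplete window is a choice made for this sketch, not documented pysatl_tsp behavior.

```python
from collections import deque
from collections.abc import Iterable, Iterator


def tumbling_windows(data: Iterable[int], size: int) -> Iterator[tuple[list[int], list[int]]]:
    """Yield non-overlapping windows of exactly `size` points with their stream indices."""
    indices: deque[int] = deque()
    values: deque[int] = deque()
    for index, value in enumerate(data):
        indices.append(index)
        values.append(value)
        if len(values) == size:
            yield list(indices), list(values)
            # Tumbling windows never overlap, so start from an empty buffer.
            indices.clear()
            values.clear()


for window_indices, window_values in tumbling_windows([10, 20, 30, 40, 50], size=2):
    print(window_indices, window_values)
# [0, 1] [10, 20]
# [2, 3] [30, 40]
```

A concrete `Scrubber` subclass would place equivalent logic in its `__iter__` method, iterate over `self.source`, and yield `ScrubberWindow` objects.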
ScrubberWindow

Bases: Generic[T]

A sliding window container for time series data processing.

ScrubberWindow provides a specialized container for holding a window of time series data values along with their corresponding indices. It's optimized for efficient append and remove operations at the ends of the window, making it suitable for sliding window algorithms in time series processing.

This class manages two parallel deques: one for the actual data values and another for their corresponding indices or positions in the original data stream.

Parameters:

Name Type Description Default
values deque[T] | None

Deque containing the data values, defaults to None (empty deque)

None
indices deque[int] | None

Deque containing the indices corresponding to values, defaults to None (if not provided, sequential indices starting from 0 are used)

None

Raises:

Type Description
ValueError

If the lengths of values and indices don't match

Source code in pysatl_tsp/core/scrubber/abstract.py
class ScrubberWindow(Generic[T]):
    """A sliding window container for time series data processing.

    ScrubberWindow provides a specialized container for holding a window of time series
    data values along with their corresponding indices. It's optimized for efficient
    append and remove operations at the ends of the window, making it suitable for
    sliding window algorithms in time series processing.

    This class manages two parallel deques: one for the actual data values and another
    for their corresponding indices or positions in the original data stream.

    :param values: Deque containing the data values, defaults to None (empty deque)
    :param indices: Deque containing the indices corresponding to values, defaults to None
                   (if not provided, sequential indices starting from 0 are used)
    :raises ValueError: If the lengths of values and indices don't match

    Example:
        ```python
        # Create an empty window
        window = ScrubberWindow()

        # Add values with automatic indices
        window.append(10.5)
        window.append(11.2)
        window.append(9.8)

        # Add value with explicit index
        window.append(12.1, index=100)

        # Get value by position in window
        first_value = window[0]  # 10.5

        # Get a slice of the window
        sub_window = window[1:3]  # Contains 11.2 and 9.8

        # Iterate through values
        for value in window:
            print(value)

        # Get original position of a value
        third_value_index = window.indices[2]  # 2
        fourth_value_index = window.indices[3]  # 100
        ```
    """

    def __init__(self, values: deque[T] | None = None, indices: deque[int] | None = None) -> None:
        """Initialize a scrubber window.

        :param values: Deque containing the data values, defaults to None (empty deque)
        :param indices: Deque containing the indices corresponding to values, defaults to None
                       (if not provided, sequential indices starting from 0 are used)
        :raises ValueError: If the lengths of values and indices don't match
        """
        if values is None:
            values = deque()
        if indices is None:
            indices = deque(range(len(values)))
        if len(values) != len(indices):
            raise ValueError("Values and indices of ScrubberWindow must be same length")
        self.values = values
        self.indices = indices

    def append(self, value: T, index: int | None = None) -> None:
        """Add a new value to the end of the window.

        :param value: The data value to append
        :param index: The index/position of the value in the original data stream,
                     defaults to None (auto-assigned as len(self))
        """
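        if index is None:
            # Use the length before appending so auto-assigned indices start at 0,
            # matching the documented example (the third appended value gets index 2).
            index = len(self)
        self.values.append(value)
        self.indices.append(index)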

    def popleft(self) -> None:
        """Remove the oldest (leftmost) value from the window."""
        self.values.popleft()
        self.indices.popleft()

    def clear(self) -> None:
        """Remove all values from the window."""
        self.values.clear()
        self.indices.clear()

    def copy(self) -> ScrubberWindow[T]:
        """Create a copy of the window backed by new deques (the elements themselves are not deep-copied).

        :return: A new ScrubberWindow with copies of the values and indices
        """
        return ScrubberWindow(self.values.copy(), self.indices.copy())

    @overload
    def __getitem__(self, key: int) -> T: ...

    @overload
    def __getitem__(self, key: slice) -> ScrubberWindow[T]: ...

    def __getitem__(self, key: int | slice) -> T | ScrubberWindow[T]:
        """Get a value or sub-window by index or slice.

        :param key: Integer index or slice to retrieve
        :return: Single value (if key is int) or sub-window (if key is slice)
        :raises TypeError: If key is not an int or slice
        """
        match key:
            case int() as idx:
                return self.values[idx]

            case slice() as s:
                start, stop = s.start, s.stop
                if start and start < 0:
                    start = len(self) + start
                if stop and stop < 0:
                    stop = len(self) + stop
                return ScrubberWindow(
                    values=deque(islice(self.values, start, stop)),
                    indices=deque(islice(self.indices, start, stop)),
                )

            case _:
                raise TypeError(f"Unsupported key type: {type(key).__name__}")

    def __len__(self) -> int:
        """Get the number of values in the window.

        :return: The window size
        """
        return len(self.values)

    def __eq__(self, other: object) -> bool:
        """Check if this window equals another window.

        :param other: Another object to compare with
        :return: True if other is a ScrubberWindow with equal values and indices
        """
        if not isinstance(other, ScrubberWindow):
            return NotImplemented
        return self.values == other.values and self.indices == other.indices

    def __hash__(self) -> int:
        """Get window's hash code.

        :return: The hash value of the object
        """
        # deques are unhashable, so hash immutable snapshots of the contents instead
        return hash((tuple(self.values), tuple(self.indices)))

    def __repr__(self) -> str:
        """Get a string representation of the window.

        :return: String representation showing values and indices
        """
        return f"ScrubberWindow(values: {self.values}, indices: {self.indices})"

    def __iter__(self) -> Iterator[T]:
        """Create an iterator over the values in the window.

        :return: Iterator yielding window values
        """
        return iter(self.values)
__eq__
__eq__(other: object) -> bool

Check if this window equals another window.

Parameters:

Name Type Description Default
other object

Another object to compare with

required

Returns:

Type Description
bool

True if other is a ScrubberWindow with equal values and indices

Source code in pysatl_tsp/core/scrubber/abstract.py
def __eq__(self, other: object) -> bool:
    """Check if this window equals another window.

    :param other: Another object to compare with
    :return: True if other is a ScrubberWindow with equal values and indices
    """
    if not isinstance(other, ScrubberWindow):
        return NotImplemented
    return self.values == other.values and self.indices == other.indices
__getitem__
__getitem__(key: int) -> T
__getitem__(key: slice) -> ScrubberWindow[T]
__getitem__(key: int | slice) -> T | ScrubberWindow[T]

Get a value or sub-window by index or slice.

Parameters:

Name Type Description Default
key int | slice

Integer index or slice to retrieve

required

Returns:

Type Description
T | ScrubberWindow[T]

Single value (if key is int) or sub-window (if key is slice)

Raises:

Type Description
TypeError

If key is not an int or slice

Source code in pysatl_tsp/core/scrubber/abstract.py
def __getitem__(self, key: int | slice) -> T | ScrubberWindow[T]:
    """Get a value or sub-window by index or slice.

    :param key: Integer index or slice to retrieve
    :return: Single value (if key is int) or sub-window (if key is slice)
    :raises TypeError: If key is not an int or slice
    """
    match key:
        case int() as idx:
            return self.values[idx]

        case slice() as s:
            start, stop = s.start, s.stop
            if start and start < 0:
                start = len(self) + start
            if stop and stop < 0:
                stop = len(self) + stop
            return ScrubberWindow(
                values=deque(islice(self.values, start, stop)),
                indices=deque(islice(self.indices, start, stop)),
            )

        case _:
            raise TypeError(f"Unsupported key type: {type(key).__name__}")
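The slice branch can be illustrated in isolation. The following is a minimal sketch, not the library's code: `slice_window` is a hypothetical helper that applies the same bound normalization to a pair of plain deques. As in the listing, the step of the slice is ignored.

```python
from collections import deque
from itertools import islice


def slice_window(values: deque, indices: deque, key: slice) -> tuple[deque, deque]:
    """Hypothetical helper mirroring the slice branch of __getitem__."""
    start, stop = key.start, key.stop
    # islice rejects negative bounds, so translate them to offsets from the left.
    if start and start < 0:
        start = len(values) + start
    if stop and stop < 0:
        stop = len(values) + stop
    return deque(islice(values, start, stop)), deque(islice(indices, start, stop))


values = deque([10.5, 11.2, 9.8, 12.1])
indices = deque([0, 1, 2, 100])

# Take the last two entries; values and indices are sliced in lockstep.
tail_values, tail_indices = slice_window(values, indices, slice(-2, None))
print(list(tail_values))   # [9.8, 12.1]
print(list(tail_indices))  # [2, 100]
```

Slicing both deques with the same bounds is what keeps each value paired with its original stream position in the sub-window.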
__hash__
__hash__() -> int

Get the window's hash code.

Returns:

Type Description
int

The hash value of the object

Source code in pysatl_tsp/core/scrubber/abstract.py
def __hash__(self) -> int:
    """Get window's hash code.

    :return: The hash value of the object
    """
    return hash((self.values, self.indices))
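One caveat is worth checking before relying on this method: `collections.deque` is a mutable container and is not hashable, so hashing a tuple that contains deques raises `TypeError`. The sketch below demonstrates this with plain deques and shows one possible alternative that hashes immutable tuple snapshots. `snapshot_hash` is a hypothetical helper, not part of pysatl_tsp.

```python
from collections import deque

values = deque([1.0, 2.0, 3.0])
indices = deque([0, 1, 2])

# Same expression as in the listing, applied to plain deques.
try:
    hash((values, indices))
    hashing_failed = False
except TypeError:
    hashing_failed = True

print(hashing_failed)  # True: deque objects are unhashable


def snapshot_hash(values: deque, indices: deque) -> int:
    """Hypothetical alternative: hash immutable copies of both deques."""
    return hash((tuple(values), tuple(indices)))


# Equal contents give equal hashes, which keeps the hash consistent with __eq__.
same = snapshot_hash(values, indices) == snapshot_hash(deque([1.0, 2.0, 3.0]), deque([0, 1, 2]))
print(same)  # True
```

Even with a working hash, a window that is mutated after being placed in a set or dict can no longer be found under its old hash, so hashing suits windows that are treated as read-only.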
__init__
__init__(
    values: deque[T] | None = None,
    indices: deque[int] | None = None,
) -> None

Initialize a scrubber window.

Parameters:

Name Type Description Default
values deque[T] | None

Deque containing the data values, defaults to None (empty deque)

None
indices deque[int] | None

Deque containing the indices corresponding to values, defaults to None (if not provided, sequential indices starting from 0 are used)

None

Raises:

Type Description
ValueError

If the lengths of values and indices don't match

Source code in pysatl_tsp/core/scrubber/abstract.py
def __init__(self, values: deque[T] | None = None, indices: deque[int] | None = None) -> None:
    """Initialize a scrubber window.

    :param values: Deque containing the data values, defaults to None (empty deque)
    :param indices: Deque containing the indices corresponding to values, defaults to None
                   (if not provided, sequential indices starting from 0 are used)
    :raises ValueError: If the lengths of values and indices don't match
    """
    if values is None:
        values = deque()
    if indices is None:
        indices = deque(range(len(values)))
    if len(values) != len(indices):
        raise ValueError("Values and indices of ScrubberWindow must be same length")
    self.values = values
    self.indices = indices
__iter__
__iter__() -> Iterator[T]

Create an iterator over the values in the window.

Returns:

Type Description
Iterator[T]

Iterator yielding window values

Source code in pysatl_tsp/core/scrubber/abstract.py
def __iter__(self) -> Iterator[T]:
    """Create an iterator over the values in the window.

    :return: Iterator yielding window values
    """
    return iter(self.values)
__len__
__len__() -> int

Get the number of values in the window.

Returns:

Type Description
int

The window size

Source code in pysatl_tsp/core/scrubber/abstract.py
def __len__(self) -> int:
    """Get the number of values in the window.

    :return: The window size
    """
    return len(self.values)
__repr__
__repr__() -> str

Get a string representation of the window.

Returns:

Type Description
str

String representation showing values and indices

Source code in pysatl_tsp/core/scrubber/abstract.py
def __repr__(self) -> str:
    """Get a string representation of the window.

    :return: String representation showing values and indices
    """
    return f"ScrubberWindow(values: {self.values}, indices: {self.indices})"
append
append(value: T, index: int | None = None) -> None

Add a new value to the end of the window.

Parameters:

Name Type Description Default
value T

The data value to append

required
index int | None

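The index/position of the value in the original data stream, defaults to None (auto-assigned as len(self), evaluated after the value has been appended, so the first value added to an empty window gets index 1)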

None
Source code in pysatl_tsp/core/scrubber/abstract.py
def append(self, value: T, index: int | None = None) -> None:
    """Add a new value to the end of the window.

    :param value: The data value to append
    :param index: The index/position of the value in the original data stream,
                 defaults to None (auto-assigned as len(self))
    """
    self.values.append(value)
    if index is None:
        index = len(self)
    self.indices.append(index)
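Because the value is appended before `len(self)` is read, the automatic index equals the window length after the append. The sketch below replays that logic on plain deques to show which indices an initially empty window receives. `append_like_window` is a hypothetical stand-in, not the library method.

```python
from collections import deque


def append_like_window(values: deque, indices: deque, value, index=None) -> None:
    """Hypothetical stand-in that follows the same steps as the listing."""
    values.append(value)
    if index is None:
        # len(values) already counts the value that was just appended.
        index = len(values)
    indices.append(index)


values, indices = deque(), deque()
for reading in (10.5, 11.2, 9.8):
    append_like_window(values, indices, reading)
append_like_window(values, indices, 12.1, index=100)

print(list(indices))  # [1, 2, 3, 100]
```

Callers that need zero-based stream positions can pass `index` explicitly, as `SlidingScrubber.__iter__` does with its `enumerate` counter.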
clear
clear() -> None

Remove all values from the window.

Source code in pysatl_tsp/core/scrubber/abstract.py
def clear(self) -> None:
    """Remove all values from the window."""
    self.values.clear()
    self.indices.clear()
copy
copy() -> ScrubberWindow[T]

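Create a copy of the window. The values and indices deques are duplicated with deque.copy(), which is shallow: the stored elements themselves are shared with the original.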

Returns:

Type Description
ScrubberWindow[T]

A new ScrubberWindow with copies of the values and indices

Source code in pysatl_tsp/core/scrubber/abstract.py
def copy(self) -> ScrubberWindow[T]:
    """Create a shallow copy of the window.

    :return: A new ScrubberWindow with copies of the values and indices
    """
    return ScrubberWindow(self.values.copy(), self.indices.copy())
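`deque.copy()` produces a shallow copy: the new window gets its own deques, but the elements inside them are shared. A minimal sketch with plain deques (no pysatl_tsp import, so the variable names are illustrative only):

```python
from collections import deque

original = deque([[1, 2], [3, 4]])   # elements are mutable lists
duplicate = original.copy()

# The containers are independent: appending to one does not affect the other.
duplicate.append([5, 6])
print(len(original), len(duplicate))  # 2 3

# The elements are shared: mutating one is visible through both deques.
duplicate[0].append(99)
print(original[0])  # [1, 2, 99]
```

For windows of numbers or other immutable values the distinction does not matter. For mutable elements, `copy.deepcopy` from the standard library is the usual way to obtain fully independent data.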
popleft
popleft() -> None

Remove the oldest (leftmost) value from the window.

Source code in pysatl_tsp/core/scrubber/abstract.py
def popleft(self) -> None:
    """Remove the oldest (leftmost) value from the window."""
    self.values.popleft()
    self.indices.popleft()
SlidingScrubber

Bases: Scrubber[T]

A flexible scrubber that creates sliding windows of time series data based on custom conditions.

This scrubber allows defining custom conditions for when to emit a window and how far to slide the window after each emission. It accumulates data points in a buffer and yields the current window whenever the take condition evaluates to True.

Parameters:

Name Type Description Default
take_condition Callable[[ScrubberWindow[T]], bool]

Function that determines when to emit the current window

required
shift int

Number of points to shift the window after each emission

required
source Handler[Any, T] | None

The handler providing input data, defaults to None

None

Example:

```python
# Create a data source
data_source = SimpleDataProvider([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

# Emit windows when they contain exactly 3 elements, shift by 2
condition = lambda window: len(window) == 3
scrubber = SlidingScrubber(take_condition=condition, shift=2, source=data_source)

# Process the windows
for window in scrubber:
    print(f"Window values: {list(window.values)}")

# Output:
# Window values: [1, 2, 3]
# Window values: [3, 4, 5]
# Window values: [5, 6, 7]
# Window values: [7, 8, 9]
# (the trailing [9, 10] never reaches 3 elements, so it is not emitted)

# Create a scrubber that emits windows based on their sum
sum_condition = lambda window: sum(window.values) >= 10
sum_scrubber = SlidingScrubber(take_condition=sum_condition, shift=1, source=data_source)

for window in sum_scrubber:
    print(f"Window with sum >= 10: {list(window.values)}, sum: {sum(window.values)}")
```
Source code in pysatl_tsp/core/scrubber/linear_scrubber.py
class SlidingScrubber(Scrubber[T]):
    """A flexible scrubber that creates sliding windows of time series data based on custom conditions.

    This scrubber allows defining custom conditions for when to emit a window and how far to
    slide the window after each emission. It accumulates data points in a buffer and yields
    the current window whenever the take condition evaluates to True.

    :param take_condition: Function that determines when to emit the current window
    :param shift: Number of points to shift the window after each emission
    :param source: The handler providing input data, defaults to None

    Example:
        ```python
        # Create a data source
        data_source = SimpleDataProvider([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

        # Emit windows when they contain exactly 3 elements, shift by 2
        condition = lambda window: len(window) == 3
        scrubber = SlidingScrubber(take_condition=condition, shift=2, source=data_source)

        # Process the windows
        for window in scrubber:
            print(f"Window values: {list(window.values)}")

        # Output:
        # Window values: [1, 2, 3]
        # Window values: [3, 4, 5]
        # Window values: [5, 6, 7]
        # Window values: [7, 8, 9]
        # (the trailing [9, 10] never reaches 3 elements, so it is not emitted)

        # Create a scrubber that emits windows based on their sum
        sum_condition = lambda window: sum(window.values) >= 10
        sum_scrubber = SlidingScrubber(take_condition=sum_condition, shift=1, source=data_source)

        for window in sum_scrubber:
            print(f"Window with sum >= 10: {list(window.values)}, sum: {sum(window.values)}")
        ```
    """

    def __init__(
        self, take_condition: Callable[[ScrubberWindow[T]], bool], shift: int, source: Handler[Any, T] | None = None
    ) -> None:
        """Initialize a sliding scrubber with custom condition and shift.

        :param take_condition: Function that determines when to emit the current window
        :param shift: Number of points to shift the window after each emission
        :param source: The handler providing input data, defaults to None
        """
        super().__init__(source)
        self._shift = shift
        self._buffer: ScrubberWindow[T] = ScrubberWindow()
        self._take_condition = take_condition

    def __iter__(self) -> Iterator[ScrubberWindow[T]]:
        """Create an iterator that yields windows based on the take condition.

        This method accumulates data points in a buffer and yields the current window
        whenever the take condition evaluates to True. After yielding, it shifts
        the window by the specified number of points.

        :return: Iterator yielding ScrubberWindow instances
        :raises ValueError: If no source has been set
        """
        if self.source is None:
            raise ValueError("Source is not set")

        for i, item in enumerate(self.source):
            self._buffer.append(item, i)
            if self._take_condition(self._buffer):
                yield self._buffer[:]

                for _ in range(self._shift):
                    if self._buffer:
                        self._buffer.popleft()
__init__
__init__(
    take_condition: Callable[[ScrubberWindow[T]], bool],
    shift: int,
    source: Handler[Any, T] | None = None,
) -> None

Initialize a sliding scrubber with custom condition and shift.

Parameters:

Name Type Description Default
take_condition Callable[[ScrubberWindow[T]], bool]

Function that determines when to emit the current window

required
shift int

Number of points to shift the window after each emission

required
source Handler[Any, T] | None

The handler providing input data, defaults to None

None
Source code in pysatl_tsp/core/scrubber/linear_scrubber.py
def __init__(
    self, take_condition: Callable[[ScrubberWindow[T]], bool], shift: int, source: Handler[Any, T] | None = None
) -> None:
    """Initialize a sliding scrubber with custom condition and shift.

    :param take_condition: Function that determines when to emit the current window
    :param shift: Number of points to shift the window after each emission
    :param source: The handler providing input data, defaults to None
    """
    super().__init__(source)
    self._shift = shift
    self._buffer: ScrubberWindow[T] = ScrubberWindow()
    self._take_condition = take_condition
__iter__
__iter__() -> Iterator[ScrubberWindow[T]]

Create an iterator that yields windows based on the take condition.

This method accumulates data points in a buffer and yields the current window whenever the take condition evaluates to True. After yielding, it shifts the window by the specified number of points.

Returns:

Type Description
Iterator[ScrubberWindow[T]]

Iterator yielding ScrubberWindow instances

Raises:

Type Description
ValueError

If no source has been set

Source code in pysatl_tsp/core/scrubber/linear_scrubber.py
def __iter__(self) -> Iterator[ScrubberWindow[T]]:
    """Create an iterator that yields windows based on the take condition.

    This method accumulates data points in a buffer and yields the current window
    whenever the take condition evaluates to True. After yielding, it shifts
    the window by the specified number of points.

    :return: Iterator yielding ScrubberWindow instances
    :raises ValueError: If no source has been set
    """
    if self.source is None:
        raise ValueError("Source is not set")

    for i, item in enumerate(self.source):
        self._buffer.append(item, i)
        if self._take_condition(self._buffer):
            yield self._buffer[:]

            for _ in range(self._shift):
                if self._buffer:
                    self._buffer.popleft()
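The emit-then-shift loop can be traced without the library. The sketch below is a hypothetical stand-in (`sliding_windows` is not part of pysatl_tsp) that uses a plain deque as the buffer and follows the same steps: append, test the condition, emit a snapshot, then drop up to `shift` of the oldest points.

```python
from collections import deque
from typing import Callable, Iterable, Iterator


def sliding_windows(
    source: Iterable[int],
    take_condition: Callable[[deque], bool],
    shift: int,
) -> Iterator[list[int]]:
    """Hypothetical stand-in for the loop, using a plain deque as the buffer."""
    buffer: deque = deque()
    for item in source:
        buffer.append(item)
        if take_condition(buffer):
            yield list(buffer)  # emit a snapshot of the current window
            # Slide forward by dropping up to `shift` of the oldest points.
            for _ in range(shift):
                if buffer:
                    buffer.popleft()


windows = list(sliding_windows(range(1, 11), lambda w: len(w) == 3, shift=2))
print(windows)  # [[1, 2, 3], [3, 4, 5], [5, 6, 7], [7, 8, 9]]
```

The condition is only checked when a new point arrives, so whatever remains in the buffer when the source is exhausted (here `[9, 10]`) is not emitted.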
abstract
Scrubber

Bases: Handler[T, ScrubberWindow[T]]

Abstract base class for handlers that produce window views of time series data.

Scrubbers consume individual data points and produce windows (sections) of the data stream. They are essential components for algorithms that need to analyze multiple data points together, such as moving averages, pattern detection, or feature extraction.

Concrete implementations of this class define specific windowing strategies such as fixed-size sliding windows, tumbling windows, or context-based windows.

Parameters:

Name Type Description Default
source Handler[Any, T] | None

The handler providing input data, defaults to None

None

Example:

```python
# Example with a fixed-size sliding window scrubber (implementation not shown)

# Create a data source
data_source = SimpleDataProvider([10, 20, 30, 40, 50, 60, 70, 80])

# Create a sliding window scrubber with window size 3
window_scrubber = SlidingWindowScrubber(window_size=3, source=data_source)

# Process windows
for window in window_scrubber:
    # Each window is a ScrubberWindow instance
    print(f"Window values: {list(window.values)}")
    print(f"Window indices: {list(window.indices)}")

    # Calculate window statistics
    avg = sum(window.values) / len(window)
    print(f"Window average: {avg}")

# Output:
# Window values: [10, 20, 30]
# Window indices: [0, 1, 2]
# Window average: 20.0
#
# Window values: [20, 30, 40]
# Window indices: [1, 2, 3]
# Window average: 30.0
#
# ... and so on
```
Source code in pysatl_tsp/core/scrubber/abstract.py
class Scrubber(Handler[T, ScrubberWindow[T]]):
    """Abstract base class for handlers that produce window views of time series data.

    Scrubbers consume individual data points and produce windows (sections) of the
    data stream. They are essential components for algorithms that need to analyze
    multiple data points together, such as moving averages, pattern detection,
    or feature extraction.

    Concrete implementations of this class define specific windowing strategies such
    as fixed-size sliding windows, tumbling windows, or context-based windows.

    :param source: The handler providing input data, defaults to None

    Example:
        ```python
        # Example with a fixed-size sliding window scrubber (implementation not shown)

        # Create a data source
        data_source = SimpleDataProvider([10, 20, 30, 40, 50, 60, 70, 80])

        # Create a sliding window scrubber with window size 3
        window_scrubber = SlidingWindowScrubber(window_size=3, source=data_source)

        # Process windows
        for window in window_scrubber:
            # Each window is a ScrubberWindow instance
            print(f"Window values: {list(window.values)}")
            print(f"Window indices: {list(window.indices)}")

            # Calculate window statistics
            avg = sum(window.values) / len(window)
            print(f"Window average: {avg}")

        # Output:
        # Window values: [10, 20, 30]
        # Window indices: [0, 1, 2]
        # Window average: 20.0
        #
        # Window values: [20, 30, 40]
        # Window indices: [1, 2, 3]
        # Window average: 30.0
        #
        # ... and so on
        ```
    """

    def __init__(self, source: Handler[Any, T] | None = None) -> None:
        """Initialize a scrubber.

        :param source: The handler providing input data, defaults to None
        """
        super().__init__(source)

    @abstractmethod
    def __iter__(self) -> Iterator[ScrubberWindow[T]]:
        """Create an iterator that yields window views of the input data.

        Concrete implementations define specific windowing strategies.

        :return: Iterator yielding ScrubberWindow instances
        """
        pass
__init__
__init__(source: Handler[Any, T] | None = None) -> None

Initialize a scrubber.

Parameters:

Name Type Description Default
source Handler[Any, T] | None

The handler providing input data, defaults to None

None
Source code in pysatl_tsp/core/scrubber/abstract.py
def __init__(self, source: Handler[Any, T] | None = None) -> None:
    """Initialize a scrubber.

    :param source: The handler providing input data, defaults to None
    """
    super().__init__(source)
__iter__ abstractmethod
__iter__() -> Iterator[ScrubberWindow[T]]

Create an iterator that yields window views of the input data.

Concrete implementations define specific windowing strategies.

Returns:

Type Description
Iterator[ScrubberWindow[T]]

Iterator yielding ScrubberWindow instances

Source code in pysatl_tsp/core/scrubber/abstract.py
@abstractmethod
def __iter__(self) -> Iterator[ScrubberWindow[T]]:
    """Create an iterator that yields window views of the input data.

    Concrete implementations define specific windowing strategies.

    :return: Iterator yielding ScrubberWindow instances
    """
    pass
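To show what "defining a windowing strategy" amounts to, here is a minimal, self-contained sketch of the pattern. It does not import pysatl_tsp: `WindowSource` is a hypothetical stand-in for `Scrubber`, plain deques stand in for `ScrubberWindow`, and `TumblingWindows` is an illustrative strategy, not a class from the library.

```python
from abc import ABC, abstractmethod
from collections import deque
from typing import Iterable, Iterator


class WindowSource(ABC):
    """Hypothetical stand-in for Scrubber: wraps a source and yields windows."""

    def __init__(self, source: Iterable[float]) -> None:
        self.source = source

    @abstractmethod
    def __iter__(self) -> Iterator[deque]:
        """Yield windows built from the source."""


class TumblingWindows(WindowSource):
    """Hypothetical strategy: non-overlapping windows of a fixed size."""

    def __init__(self, size: int, source: Iterable[float]) -> None:
        super().__init__(source)
        self.size = size

    def __iter__(self) -> Iterator[deque]:
        buffer: deque = deque()
        for item in self.source:
            buffer.append(item)
            if len(buffer) == self.size:
                yield deque(buffer)  # emit a copy so clearing the buffer does not affect it
                buffer.clear()


tumbling = [list(w) for w in TumblingWindows(3, [10, 20, 30, 40, 50, 60, 70, 80])]
print(tumbling)  # [[10, 20, 30], [40, 50, 60]]
```

The base class fixes only the contract (iterate over a source, yield windows). Each subclass decides when a window is complete and what to keep afterwards.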
ScrubberWindow

Bases: Generic[T]

A sliding window container for time series data processing.

ScrubberWindow provides a specialized container for holding a window of time series data values along with their corresponding indices. It's optimized for efficient append and remove operations at the ends of the window, making it suitable for sliding window algorithms in time series processing.

This class manages two parallel deques: one for the actual data values and another for their corresponding indices or positions in the original data stream.

Parameters:

Name Type Description Default
values deque[T] | None

Deque containing the data values, defaults to None (empty deque)

None
indices deque[int] | None

Deque containing the indices corresponding to values, defaults to None (if not provided, sequential indices starting from 0 are used)

None

Raises:

Type Description
ValueError

If the lengths of values and indices don't match

Example:

```python
# Create an empty window
window = ScrubberWindow()

# Add values with automatic indices
window.append(10.5)
window.append(11.2)
window.append(9.8)

# Add value with explicit index
window.append(12.1, index=100)

# Get value by position in window
first_value = window[0]  # 10.5

# Get a slice of the window
sub_window = window[1:3]  # Contains 11.2 and 9.8

# Iterate through values
for value in window:
    print(value)

# Get original position of a value
third_value_index = window.indices[2]  # 3 (auto-assigned as len(self) after the append)
fourth_value_index = window.indices[3]  # 100
```

Source code in pysatl_tsp/core/scrubber/abstract.py
class ScrubberWindow(Generic[T]):
    """A sliding window container for time series data processing.

    ScrubberWindow provides a specialized container for holding a window of time series
    data values along with their corresponding indices. It's optimized for efficient
    append and remove operations at the ends of the window, making it suitable for
    sliding window algorithms in time series processing.

    This class manages two parallel deques: one for the actual data values and another
    for their corresponding indices or positions in the original data stream.

    :param values: Deque containing the data values, defaults to None (empty deque)
    :param indices: Deque containing the indices corresponding to values, defaults to None
                   (if not provided, sequential indices starting from 0 are used)
    :raises ValueError: If the lengths of values and indices don't match

    Example:
        ```python
        # Create an empty window
        window = ScrubberWindow()

        # Add values with automatic indices
        window.append(10.5)
        window.append(11.2)
        window.append(9.8)

        # Add value with explicit index
        window.append(12.1, index=100)

        # Get value by position in window
        first_value = window[0]  # 10.5

        # Get a slice of the window
        sub_window = window[1:3]  # Contains 11.2 and 9.8

        # Iterate through values
        for value in window:
            print(value)

        # Get original position of a value
        third_value_index = window.indices[2]  # 3 (auto-assigned as len(self) after the append)
        fourth_value_index = window.indices[3]  # 100
        ```
    """

    def __init__(self, values: deque[T] | None = None, indices: deque[int] | None = None) -> None:
        """Initialize a scrubber window.

        :param values: Deque containing the data values, defaults to None (empty deque)
        :param indices: Deque containing the indices corresponding to values, defaults to None
                       (if not provided, sequential indices starting from 0 are used)
        :raises ValueError: If the lengths of values and indices don't match
        """
        if values is None:
            values = deque()
        if indices is None:
            indices = deque(range(len(values)))
        if len(values) != len(indices):
            raise ValueError("Values and indices of ScrubberWindow must be same length")
        self.values = values
        self.indices = indices

    def append(self, value: T, index: int | None = None) -> None:
        """Add a new value to the end of the window.

        :param value: The data value to append
        :param index: The index/position of the value in the original data stream,
                     defaults to None (auto-assigned as len(self))
        """
        self.values.append(value)
        if index is None:
            index = len(self)
        self.indices.append(index)

    def popleft(self) -> None:
        """Remove the oldest (leftmost) value from the window."""
        self.values.popleft()
        self.indices.popleft()

    def clear(self) -> None:
        """Remove all values from the window."""
        self.values.clear()
        self.indices.clear()

    def copy(self) -> ScrubberWindow[T]:
        """Create a shallow copy of the window.

        :return: A new ScrubberWindow with copies of the values and indices
        """
        return ScrubberWindow(self.values.copy(), self.indices.copy())

    @overload
    def __getitem__(self, key: int) -> T: ...

    @overload
    def __getitem__(self, key: slice) -> ScrubberWindow[T]: ...

    def __getitem__(self, key: int | slice) -> T | ScrubberWindow[T]:
        """Get a value or sub-window by index or slice.

        :param key: Integer index or slice to retrieve
        :return: Single value (if key is int) or sub-window (if key is slice)
        :raises TypeError: If key is not an int or slice
        """
        match key:
            case int() as idx:
                return self.values[idx]

            case slice() as s:
                start, stop = s.start, s.stop
                if start and start < 0:
                    start = len(self) + start
                if stop and stop < 0:
                    stop = len(self) + stop
                return ScrubberWindow(
                    values=deque(islice(self.values, start, stop)),
                    indices=deque(islice(self.indices, start, stop)),
                )

            case _:
                raise TypeError(f"Unsupported key type: {type(key).__name__}")

    def __len__(self) -> int:
        """Get the number of values in the window.

        :return: The window size
        """
        return len(self.values)

    def __eq__(self, other: object) -> bool:
        """Check if this window equals another window.

        :param other: Another object to compare with
        :return: True if other is a ScrubberWindow with equal values and indices
        """
        if not isinstance(other, ScrubberWindow):
            return NotImplemented
        return self.values == other.values and self.indices == other.indices

    def __hash__(self) -> int:
        """Get window's hash code.

        :return: The hash value of the object
        """
        return hash((self.values, self.indices))

    def __repr__(self) -> str:
        """Get a string representation of the window.

        :return: String representation showing values and indices
        """
        return f"ScrubberWindow(values: {self.values}, indices: {self.indices})"

    def __iter__(self) -> Iterator[T]:
        """Create an iterator over the values in the window.

        :return: Iterator yielding window values
        """
        return iter(self.values)
__eq__
__eq__(other: object) -> bool

Check if this window equals another window.

Parameters:

Name Type Description Default
other object

Another object to compare with

required

Returns:

Type Description
bool

True if other is a ScrubberWindow with equal values and indices

Source code in pysatl_tsp/core/scrubber/abstract.py
def __eq__(self, other: object) -> bool:
    """Check if this window equals another window.

    :param other: Another object to compare with
    :return: True if other is a ScrubberWindow with equal values and indices
    """
    if not isinstance(other, ScrubberWindow):
        return NotImplemented
    return self.values == other.values and self.indices == other.indices
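Returning `NotImplemented` (rather than `False`) lets Python try the reflected comparison on the other operand before falling back to identity. The sketch below uses a hypothetical `Window` class with the same `__eq__` logic to show the three outcomes. It is an illustration, not the library class.

```python
from collections import deque


class Window:
    """Hypothetical stand-in with the same __eq__ logic as the listing."""

    def __init__(self, values: deque, indices: deque) -> None:
        self.values = values
        self.indices = indices

    def __eq__(self, other: object) -> bool:
        if not isinstance(other, Window):
            return NotImplemented  # let Python try the other operand
        return self.values == other.values and self.indices == other.indices


a = Window(deque([1, 2]), deque([0, 1]))
b = Window(deque([1, 2]), deque([0, 1]))
c = Window(deque([1, 2]), deque([5, 6]))

print(a == b)       # True: same values and same indices
print(a == c)       # False: same values, different indices
print(a == [1, 2])  # False: neither side can compare, so Python falls back to identity
```

Two windows holding the same values taken from different stream positions therefore compare as unequal.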
__getitem__
__getitem__(key: int) -> T
__getitem__(key: slice) -> ScrubberWindow[T]
__getitem__(key: int | slice) -> T | ScrubberWindow[T]

Get a value or sub-window by index or slice.

Parameters:

Name Type Description Default
key int | slice

Integer index or slice to retrieve

required

Returns:

Type Description
T | ScrubberWindow[T]

Single value (if key is int) or sub-window (if key is slice)

Raises:

Type Description
TypeError

If key is not an int or slice

Source code in pysatl_tsp/core/scrubber/abstract.py
def __getitem__(self, key: int | slice) -> T | ScrubberWindow[T]:
    """Get a value or sub-window by index or slice.

    :param key: Integer index or slice to retrieve
    :return: Single value (if key is int) or sub-window (if key is slice)
    :raises TypeError: If key is not an int or slice
    """
    match key:
        case int() as idx:
            return self.values[idx]

        case slice() as s:
            start, stop = s.start, s.stop
            if start and start < 0:
                start = len(self) + start
            if stop and stop < 0:
                stop = len(self) + stop
            return ScrubberWindow(
                values=deque(islice(self.values, start, stop)),
                indices=deque(islice(self.indices, start, stop)),
            )

        case _:
            raise TypeError(f"Unsupported key type: {type(key).__name__}")
__hash__
__hash__() -> int

Get the window's hash code.

Returns:

Type Description
int

The hash value of the object

Source code in pysatl_tsp/core/scrubber/abstract.py
def __hash__(self) -> int:
    """Get window's hash code.

    :return: The hash value of the object
    """
    return hash((self.values, self.indices))
__init__
__init__(
    values: deque[T] | None = None,
    indices: deque[int] | None = None,
) -> None

Initialize a scrubber window.

Parameters:

Name Type Description Default
values deque[T] | None

Deque containing the data values, defaults to None (empty deque)

None
indices deque[int] | None

Deque containing the indices corresponding to values, defaults to None (if not provided, sequential indices starting from 0 are used)

None

Raises:

Type Description
ValueError

If the lengths of values and indices don't match

Source code in pysatl_tsp/core/scrubber/abstract.py
def __init__(self, values: deque[T] | None = None, indices: deque[int] | None = None) -> None:
    """Initialize a scrubber window.

    :param values: Deque containing the data values, defaults to None (empty deque)
    :param indices: Deque containing the indices corresponding to values, defaults to None
                   (if not provided, sequential indices starting from 0 are used)
    :raises ValueError: If the lengths of values and indices don't match
    """
    if values is None:
        values = deque()
    if indices is None:
        indices = deque(range(len(values)))
    if len(values) != len(indices):
        raise ValueError("Values and indices of ScrubberWindow must be same length")
    self.values = values
    self.indices = indices
__iter__
__iter__() -> Iterator[T]

Create an iterator over the values in the window.

Returns:

Type Description
Iterator[T]

Iterator yielding window values

Source code in pysatl_tsp/core/scrubber/abstract.py
def __iter__(self) -> Iterator[T]:
    """Create an iterator over the values in the window.

    :return: Iterator yielding window values
    """
    return iter(self.values)
__len__
__len__() -> int

Get the number of values in the window.

Returns:

Type Description
int

The window size

Source code in pysatl_tsp/core/scrubber/abstract.py
def __len__(self) -> int:
    """Get the number of values in the window.

    :return: The window size
    """
    return len(self.values)
__repr__
__repr__() -> str

Get a string representation of the window.

Returns:

Type Description
str

String representation showing values and indices

Source code in pysatl_tsp/core/scrubber/abstract.py
def __repr__(self) -> str:
    """Get a string representation of the window.

    :return: String representation showing values and indices
    """
    return f"ScrubberWindow(values: {self.values}, indices: {self.indices})"
append
append(value: T, index: int | None = None) -> None

Add a new value to the end of the window.

Parameters:

Name Type Description Default
value T

The data value to append

required
index int | None

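The index/position of the value in the original data stream, defaults to None (auto-assigned as len(self), evaluated after the value has been appended, so the first value added to an empty window gets index 1)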

None
Source code in pysatl_tsp/core/scrubber/abstract.py
def append(self, value: T, index: int | None = None) -> None:
    """Add a new value to the end of the window.

    :param value: The data value to append
    :param index: The index/position of the value in the original data stream,
                 defaults to None (auto-assigned as len(self))
    """
    self.values.append(value)
    if index is None:
        index = len(self)
    self.indices.append(index)
clear
clear() -> None

Remove all values from the window.

Source code in pysatl_tsp/core/scrubber/abstract.py
def clear(self) -> None:
    """Remove all values from the window."""
    self.values.clear()
    self.indices.clear()
copy
copy() -> ScrubberWindow[T]

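Create a copy of the window. The values and indices containers are duplicated through their own copy() methods, so the new window can be modified independently; the stored elements themselves are not cloned.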

Returns:

Type Description
ScrubberWindow[T]

A new ScrubberWindow with copies of the values and indices

Source code in pysatl_tsp/core/scrubber/abstract.py
def copy(self) -> ScrubberWindow[T]:
    """Create a deep copy of the window.

    :return: A new ScrubberWindow with copies of the values and indices
    """
    return ScrubberWindow(self.values.copy(), self.indices.copy())
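A short illustration of what `values.copy()` does, assuming the containers are `collections.deque` objects (the source uses `popleft()` and the scrubbers construct windows from `deque()`, but the constructor is not shown here). `deque.copy()` produces a shallow copy: the container is new, the elements are shared.

```python
from collections import deque

values = deque([[1, 2], [3, 4]])    # elements are mutable lists
duplicate = values.copy()           # same call the copy() method makes on each container

duplicate.append([5, 6])            # container-level change: original is unaffected
duplicate[0].append(99)             # element-level change: visible through both deques

print(len(values), len(duplicate))  # 2 3
print(values[0])                    # [1, 2, 99]
```

For windows of plain numbers this distinction does not matter, since numbers are immutable. It only becomes relevant when the window holds mutable objects.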
popleft
popleft() -> None

Remove the oldest (leftmost) value from the window.

Source code in pysatl_tsp/core/scrubber/abstract.py
def popleft(self) -> None:
    """Remove the oldest (leftmost) value from the window."""
    self.values.popleft()
    self.indices.popleft()
linear_scrubber
LinearScrubber

Bases: SlidingScrubber[T]

A scrubber that creates fixed-size sliding windows with configurable overlap.

This is a specialized sliding scrubber that emits windows of a fixed size and allows controlling the overlap between consecutive windows through a shift factor.

Parameters:

Name Type Description Default
window_length int

Number of points in each window, defaults to 100

100
shift_factor float

Fraction of window to shift after each emission, defaults to 1/3 (e.g., 0.5 means 50% overlap between consecutive windows)

1.0 / 3.0
source Handler[Any, T] | None

The handler providing input data, defaults to None

None
Source code in pysatl_tsp/core/scrubber/linear_scrubber.py
class LinearScrubber(SlidingScrubber[T]):
    """A scrubber that creates fixed-size sliding windows with configurable overlap.

    This is a specialized sliding scrubber that emits windows of a fixed size and
    allows controlling the overlap between consecutive windows through a shift factor.

    :param window_length: Number of points in each window, defaults to 100
    :param shift_factor: Fraction of window to shift after each emission, defaults to 1/3
                        (e.g., 0.5 means 50% overlap between consecutive windows)
    :param source: The handler providing input data, defaults to None

    Example:
        ```python
        # Create a data source with a sequence of numbers
        data_source = SimpleDataProvider(range(10))

        # Create a linear scrubber with window size 4 and 50% overlap
        scrubber = LinearScrubber(window_length=4, shift_factor=0.5, source=data_source)

        # Process the windows
        for window in scrubber:
            print(f"Window values: {list(window.values)}")

        # Output:
        # Window values: [0, 1, 2, 3]
        # Window values: [2, 3, 4, 5]
        # Window values: [4, 5, 6, 7]
        # Window values: [6, 7, 8, 9]
        ```
    """

    def __init__(
        self, window_length: int = 100, shift_factor: float = 1.0 / 3.0, source: Handler[Any, T] | None = None
    ) -> None:
        """Initialize a linear scrubber with fixed window size and overlap.

        :param window_length: Number of points in each window, defaults to 100
        :param shift_factor: Fraction of window to shift after each emission, defaults to 1/3
        :param source: The handler providing input data, defaults to None
        """
        shift = max(1, int(shift_factor * window_length))
        super().__init__(take_condition=lambda buffer: len(buffer) >= window_length, shift=shift, source=source)
__init__
__init__(
    window_length: int = 100,
    shift_factor: float = 1.0 / 3.0,
    source: Handler[Any, T] | None = None,
) -> None

Initialize a linear scrubber with fixed window size and overlap.

Parameters:

Name Type Description Default
window_length int

Number of points in each window, defaults to 100

100
shift_factor float

Fraction of window to shift after each emission, defaults to 1/3

1.0 / 3.0
source Handler[Any, T] | None

The handler providing input data, defaults to None

None
Source code in pysatl_tsp/core/scrubber/linear_scrubber.py
def __init__(
    self, window_length: int = 100, shift_factor: float = 1.0 / 3.0, source: Handler[Any, T] | None = None
) -> None:
    """Initialize a linear scrubber with fixed window size and overlap.

    :param window_length: Number of points in each window, defaults to 100
    :param shift_factor: Fraction of window to shift after each emission, defaults to 1/3
    :param source: The handler providing input data, defaults to None
    """
    shift = max(1, int(shift_factor * window_length))
    super().__init__(take_condition=lambda buffer: len(buffer) >= window_length, shift=shift, source=source)
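The relationship between `shift_factor`, the resulting integer shift, and the overlap can be checked in isolation. The helper name `effective_shift` below is hypothetical; its body repeats the expression from `__init__` above.

```python
def effective_shift(window_length: int, shift_factor: float) -> int:
    """Hypothetical helper repeating the expression used in LinearScrubber.__init__."""
    return max(1, int(shift_factor * window_length))


for length, factor in [(100, 1.0 / 3.0), (4, 0.5), (10, 0.01), (4, 1.0)]:
    shift = effective_shift(length, factor)
    overlap = max(0, length - shift)  # points shared by two consecutive windows
    print(f"window_length={length}, shift_factor={factor:.2f} -> shift={shift}, overlap={overlap}")
```

Two details follow from the expression: the product is truncated toward zero by `int()`, and the shift never drops below 1, so a very small `shift_factor` still advances the window by one point per emission.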
SlidingScrubber

Bases: Scrubber[T]

A flexible scrubber that creates sliding windows of time series data based on custom conditions.

This scrubber allows defining custom conditions for when to emit a window and how far to slide the window after each emission. It accumulates data points in a buffer and yields the current window whenever the take condition evaluates to True.

Parameters:

Name Type Description Default
take_condition Callable[[ScrubberWindow[T]], bool]

Function that determines when to emit the current window

required
shift int

Number of points to shift the window after each emission

required
source Handler[Any, T] | None

The handler providing input data, defaults to None

None
Source code in pysatl_tsp/core/scrubber/linear_scrubber.py
class SlidingScrubber(Scrubber[T]):
    """A flexible scrubber that creates sliding windows of time series data based on custom conditions.

    This scrubber allows defining custom conditions for when to emit a window and how far to
    slide the window after each emission. It accumulates data points in a buffer and yields
    the current window whenever the take condition evaluates to True.

    :param take_condition: Function that determines when to emit the current window
    :param shift: Number of points to shift the window after each emission
    :param source: The handler providing input data, defaults to None

    Example:
        ```python
        # Create a data source
        data_source = SimpleDataProvider([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

        # Emit windows when they contain exactly 3 elements, shift by 2
        condition = lambda window: len(window) == 3
        scrubber = SlidingScrubber(take_condition=condition, shift=2, source=data_source)

        # Process the windows
        for window in scrubber:
            print(f"Window values: {list(window.values)}")

        # Output:
        # Window values: [1, 2, 3]
        # Window values: [3, 4, 5]
        # Window values: [5, 6, 7]
        # Window values: [7, 8, 9]
        # (the trailing [9, 10] never reaches 3 elements, so it is not emitted)

        # Create a scrubber that emits windows based on their sum
        sum_condition = lambda window: sum(window.values) >= 10
        sum_scrubber = SlidingScrubber(take_condition=sum_condition, shift=1, source=data_source)

        for window in sum_scrubber:
            print(f"Window with sum >= 10: {list(window.values)}, sum: {sum(window.values)}")
        ```
    """

    def __init__(
        self, take_condition: Callable[[ScrubberWindow[T]], bool], shift: int, source: Handler[Any, T] | None = None
    ) -> None:
        """Initialize a sliding scrubber with custom condition and shift.

        :param take_condition: Function that determines when to emit the current window
        :param shift: Number of points to shift the window after each emission
        :param source: The handler providing input data, defaults to None
        """
        super().__init__(source)
        self._shift = shift
        self._buffer: ScrubberWindow[T] = ScrubberWindow()
        self._take_condition = take_condition

    def __iter__(self) -> Iterator[ScrubberWindow[T]]:
        """Create an iterator that yields windows based on the take condition.

        This method accumulates data points in a buffer and yields the current window
        whenever the take condition evaluates to True. After yielding, it shifts
        the window by the specified number of points.

        :return: Iterator yielding ScrubberWindow instances
        :raises ValueError: If no source has been set
        """
        if self.source is None:
            raise ValueError("Source is not set")

        for i, item in enumerate(self.source):
            self._buffer.append(item, i)
            if self._take_condition(self._buffer):
                yield self._buffer[:]

                for _ in range(self._shift):
                    if self._buffer:
                        self._buffer.popleft()
__init__
__init__(
    take_condition: Callable[[ScrubberWindow[T]], bool],
    shift: int,
    source: Handler[Any, T] | None = None,
) -> None

Initialize a sliding scrubber with custom condition and shift.

Parameters:

Name Type Description Default
take_condition Callable[[ScrubberWindow[T]], bool]

Function that determines when to emit the current window

required
shift int

Number of points to shift the window after each emission

required
source Handler[Any, T] | None

The handler providing input data, defaults to None

None
Source code in pysatl_tsp/core/scrubber/linear_scrubber.py
def __init__(
    self, take_condition: Callable[[ScrubberWindow[T]], bool], shift: int, source: Handler[Any, T] | None = None
) -> None:
    """Initialize a sliding scrubber with custom condition and shift.

    :param take_condition: Function that determines when to emit the current window
    :param shift: Number of points to shift the window after each emission
    :param source: The handler providing input data, defaults to None
    """
    super().__init__(source)
    self._shift = shift
    self._buffer: ScrubberWindow[T] = ScrubberWindow()
    self._take_condition = take_condition
__iter__
__iter__() -> Iterator[ScrubberWindow[T]]

Create an iterator that yields windows based on the take condition.

This method accumulates data points in a buffer and yields the current window whenever the take condition evaluates to True. After yielding, it shifts the window by the specified number of points.

Returns:

Type Description
Iterator[ScrubberWindow[T]]

Iterator yielding ScrubberWindow instances

Raises:

Type Description
ValueError

If no source has been set

Source code in pysatl_tsp/core/scrubber/linear_scrubber.py
def __iter__(self) -> Iterator[ScrubberWindow[T]]:
    """Create an iterator that yields windows based on the take condition.

    This method accumulates data points in a buffer and yields the current window
    whenever the take condition evaluates to True. After yielding, it shifts
    the window by the specified number of points.

    :return: Iterator yielding ScrubberWindow instances
    :raises ValueError: If no source has been set
    """
    if self.source is None:
        raise ValueError("Source is not set")

    for i, item in enumerate(self.source):
        self._buffer.append(item, i)
        if self._take_condition(self._buffer):
            yield self._buffer[:]

            for _ in range(self._shift):
                if self._buffer:
                    self._buffer.popleft()
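The loop above can be traced with a stand-alone generator over plain Python iterables. `sliding_windows` is a hypothetical function written for this illustration, not part of the package. It uses a local buffer, whereas the class keeps `_buffer` on the instance.

```python
from collections import deque
from typing import Callable, Iterable, Iterator


def sliding_windows(
    source: Iterable[int],
    take_condition: Callable[[deque], bool],
    shift: int,
) -> Iterator[list]:
    """Hypothetical stand-alone version of the loop in SlidingScrubber.__iter__."""
    buffer: deque = deque()
    for item in source:
        buffer.append(item)
        if take_condition(buffer):
            yield list(buffer)  # emit a snapshot of the current window
            for _ in range(shift):
                if buffer:
                    buffer.popleft()  # drop the oldest points


windows = list(sliding_windows(range(1, 11), lambda w: len(w) == 3, shift=2))
print(windows)  # [[1, 2, 3], [3, 4, 5], [5, 6, 7], [7, 8, 9]]
```

With ten input values, the last two points (9 and 10) stay in the buffer: the condition is only evaluated when a new point arrives, and nothing flushes the remainder once the source is exhausted.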
segmentation_scrubber
OfflineSegmentationScrubber

Bases: Scrubber[T]

A scrubber that segments time series data based on changepoints in batch mode.

This scrubber processes the entire input data in a batch (offline) mode and segments it according to a provided segmentation rule. The rule identifies changepoints in the data, which are then used to create non-overlapping segments.

This approach is suitable for scenarios where the entire dataset is available upfront and the segmentation logic requires global context or multiple passes over the data.

Parameters:

Name Type Description Default
segmentation_rule Callable[[ScrubberWindow[T]], list[int]]

Function that analyzes the complete series and returns a list of changepoint indices

required
source Handler[Any, T] | None

The handler providing input data, defaults to None

None
Source code in pysatl_tsp/core/scrubber/segmentation_scrubber.py
class OfflineSegmentationScrubber(Scrubber[T]):
    """A scrubber that segments time series data based on changepoints in batch mode.

    This scrubber processes the entire input data in a batch (offline) mode and
    segments it according to a provided segmentation rule. The rule identifies
    changepoints in the data, which are then used to create non-overlapping segments.

    This approach is suitable for scenarios where the entire dataset is available upfront
    and the segmentation logic requires global context or multiple passes over the data.

    :param segmentation_rule: Function that analyzes the complete series and returns a list of changepoint indices
    :param source: The handler providing input data, defaults to None

    Example:
        ```python
        # Create a data source with synthetic pattern
        data = [1, 1, 2, 2, 5, 5, 5, 1, 1, 1, 6, 6, 6, 6]
        data_source = SimpleDataProvider(data)


        # Define a simple threshold-based segmentation rule (flags jumps larger than 2)
        def find_changepoints(window: ScrubberWindow[int]) -> list[int]:
            changepoints = []
            # Simple detection of value changes
            for i in range(1, len(window)):
                if abs(window[i] - window[i - 1]) > 2:  # Threshold for change
                    changepoints.append(i)
            return changepoints


        # Create the segmentation scrubber
        segmenter = OfflineSegmentationScrubber(segmentation_rule=find_changepoints, source=data_source)

        # Process the segments
        for segment in segmenter:
            print(f"Segment values: {list(segment.values)}")

        # Output:
        # Segment values: [1, 1, 2, 2]
        # Segment values: [5, 5, 5]
        # Segment values: [1, 1, 1]
        # Segment values: [6, 6, 6, 6]
        ```
    """

    def __init__(
        self, segmentation_rule: Callable[[ScrubberWindow[T]], list[int]], source: Handler[Any, T] | None = None
    ):
        """Initialize an offline segmentation scrubber.

        :param segmentation_rule: Function that analyzes the complete series and returns a list of changepoint indices
        :param source: The handler providing input data, defaults to None
        """
        super().__init__(source)
        self.segmentation_rule = segmentation_rule

    def __iter__(self) -> Iterator[ScrubberWindow[T]]:
        """Create an iterator that yields segments based on detected changepoints.

        This method collects all data from the source, applies the segmentation rule
        to identify changepoints, and then yields segments between the detected changepoints.

        :return: Iterator yielding ScrubberWindow instances for each segment
        :raises ValueError: If no source has been set
        """
        if self.source is None:
            raise ValueError("Source is not set")

        full_series_list = list(iter(self.source))
        full_series_deque = deque(full_series_list)
        series_window = ScrubberWindow(full_series_deque)
        change_points = self.segmentation_rule(series_window)
        segments = [0, *change_points, len(full_series_deque)]
        for start, end in zip(segments[:-1], segments[1:]):
            yield series_window[start:end]
__init__
__init__(
    segmentation_rule: Callable[
        [ScrubberWindow[T]], list[int]
    ],
    source: Handler[Any, T] | None = None,
)

Initialize an offline segmentation scrubber.

Parameters:

Name Type Description Default
segmentation_rule Callable[[ScrubberWindow[T]], list[int]]

Function that analyzes the complete series and returns a list of changepoint indices

required
source Handler[Any, T] | None

The handler providing input data, defaults to None

None
Source code in pysatl_tsp/core/scrubber/segmentation_scrubber.py
def __init__(
    self, segmentation_rule: Callable[[ScrubberWindow[T]], list[int]], source: Handler[Any, T] | None = None
):
    """Initialize an offline segmentation scrubber.

    :param segmentation_rule: Function that analyzes the complete series and returns a list of changepoint indices
    :param source: The handler providing input data, defaults to None
    """
    super().__init__(source)
    self.segmentation_rule = segmentation_rule
__iter__
__iter__() -> Iterator[ScrubberWindow[T]]

Create an iterator that yields segments based on detected changepoints.

This method collects all data from the source, applies the segmentation rule to identify changepoints, and then yields segments between the detected changepoints.

Returns:

Type Description
Iterator[ScrubberWindow[T]]

Iterator yielding ScrubberWindow instances for each segment

Raises:

Type Description
ValueError

If no source has been set

Source code in pysatl_tsp/core/scrubber/segmentation_scrubber.py
def __iter__(self) -> Iterator[ScrubberWindow[T]]:
    """Create an iterator that yields segments based on detected changepoints.

    This method collects all data from the source, applies the segmentation rule
    to identify changepoints, and then yields segments between the detected changepoints.

    :return: Iterator yielding ScrubberWindow instances for each segment
    :raises ValueError: If no source has been set
    """
    if self.source is None:
        raise ValueError("Source is not set")

    full_series_list = list(iter(self.source))
    full_series_deque = deque(full_series_list)
    series_window = ScrubberWindow(full_series_deque)
    change_points = self.segmentation_rule(series_window)
    segments = [0, *change_points, len(full_series_deque)]
    for start, end in zip(segments[:-1], segments[1:]):
        yield series_window[start:end]
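The boundary handling in the last three lines can be reproduced on a plain list. `split_at` is a hypothetical helper used only for this illustration; it assumes the changepoints are sorted indices into the series, which is what the rule in the class docstring returns.

```python
from typing import Sequence


def split_at(series: Sequence[int], change_points: Sequence[int]) -> list:
    """Hypothetical helper: cut `series` into consecutive segments at `change_points`."""
    boundaries = [0, *change_points, len(series)]
    # Pair each boundary with its successor: (0, c1), (c1, c2), ..., (ck, n)
    return [list(series[start:end]) for start, end in zip(boundaries[:-1], boundaries[1:])]


data = [1, 1, 2, 2, 5, 5, 5, 1, 1, 1, 6, 6, 6, 6]
segments = split_at(data, [4, 7, 10])
print(segments)            # [[1, 1, 2, 2], [5, 5, 5], [1, 1, 1], [6, 6, 6, 6]]
print(split_at(data, []))  # no changepoints: a single segment with the whole series
```

Each changepoint index becomes the first element of the next segment. In this sketch a changepoint equal to 0 or to `len(series)` yields an empty segment; whether `ScrubberWindow` slicing behaves the same way is not shown in this section.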
OnlineSegmentationScrubber

Bases: Scrubber[T]

A scrubber that segments time series data in real-time based on a condition.

This scrubber processes data points sequentially (online mode) and segments the time series whenever a specified condition is met or a maximum segment size is reached. It's designed for streaming data where segments need to be identified in real-time without waiting for the complete dataset.

Parameters:

Name Type Description Default
segmentation_rule Callable[[ScrubberWindow[T]], bool]

Function that evaluates the current window and returns True when a segment should end

required
max_segment_size int

Maximum number of points in a segment before forcing a split, defaults to 2**64

2 ** 64
source Handler[Any, T] | None

The handler providing input data, defaults to None

None
Source code in pysatl_tsp/core/scrubber/segmentation_scrubber.py
class OnlineSegmentationScrubber(Scrubber[T]):
    """A scrubber that segments time series data in real-time based on a condition.

    This scrubber processes data points sequentially (online mode) and segments
    the time series whenever a specified condition is met or a maximum segment size
    is reached. It's designed for streaming data where segments need to be identified
    in real-time without waiting for the complete dataset.

    :param segmentation_rule: Function that evaluates the current window and returns True when a segment should end
    :param max_segment_size: Maximum number of points in a segment before forcing a split, defaults to 2^64
    :param source: The handler providing input data, defaults to None

    Example:
        ```python
        # Create a data source with streaming values
        data = [1, 1, 2, 3, 8, 9, 8, 2, 2, 3, 10, 10, 9, 9]
        data_source = SimpleDataProvider(data)


        # Define a threshold-based segmentation rule
        def detect_jump(window: ScrubberWindow[int]) -> bool:
            if len(window) < 2:
                return False

            # Detect a large jump in values
            last_value = window[-1]
            prev_value = window[-2]
            return abs(last_value - prev_value) > 3


        # Create the online segmentation scrubber
        segmenter = OnlineSegmentationScrubber(
            segmentation_rule=detect_jump,
            max_segment_size=5,  # Force segmentation after 5 points if no jump detected
            source=data_source,
        )

        # Process the segments as they're detected
        for segment in segmenter:
            print(f"Segment values: {list(segment.values)}")

        # Output:
        # Segment values: [1, 1, 2, 3, 8]  # Split due to jump from 3 to 8 and max size
        # Segment values: [9, 8, 2]        # Split due to jump from 8 to 2
        # Segment values: [2, 3, 10]       # Split due to jump from 3 to 10
        # Segment values: [10, 9, 9]       # Remaining data
        ```
    """

    def __init__(
        self,
        segmentation_rule: Callable[[ScrubberWindow[T]], bool],
        max_segment_size: int = 2**64,
        source: Handler[Any, T] | None = None,
    ):
        """Initialize an online segmentation scrubber.

        :param segmentation_rule: Function that evaluates the current window and returns True when a segment should end
        :param max_segment_size: Maximum number of points in a segment before forcing a split, defaults to 2^64
        :param source: The handler providing input data, defaults to None
        """
        super().__init__(source)
        self.segmentation_rule = segmentation_rule
        self.max_segment_size = max_segment_size

    def __iter__(self) -> Iterator[ScrubberWindow[T]]:
        """Create an iterator that yields segments as they're detected in real-time.

        This method processes data points one by one, accumulating them in a buffer
        and checking after each addition whether the segmentation condition is met
        or the maximum segment size is reached.

        :return: Iterator yielding ScrubberWindow instances for each detected segment
        :raises ValueError: If no source has been set
        """
        if self.source is None:
            raise ValueError("Source is not set")
        current_window: ScrubberWindow[T] = ScrubberWindow(deque())
        for index, item in enumerate(self.source):
            current_window.append(item, index)

            if self.segmentation_rule(current_window) or len(current_window) >= self.max_segment_size:
                yield current_window.copy()
                current_window.clear()

        if current_window:
            yield current_window.copy()
__init__
__init__(
    segmentation_rule: Callable[[ScrubberWindow[T]], bool],
    max_segment_size: int = 2**64,
    source: Handler[Any, T] | None = None,
)

Initialize an online segmentation scrubber.

Parameters:

Name Type Description Default
segmentation_rule Callable[[ScrubberWindow[T]], bool]

Function that evaluates the current window and returns True when a segment should end

required
max_segment_size int

Maximum number of points in a segment before forcing a split, defaults to 2**64

2 ** 64
source Handler[Any, T] | None

The handler providing input data, defaults to None

None
Source code in pysatl_tsp/core/scrubber/segmentation_scrubber.py
def __init__(
    self,
    segmentation_rule: Callable[[ScrubberWindow[T]], bool],
    max_segment_size: int = 2**64,
    source: Handler[Any, T] | None = None,
):
    """Initialize an online segmentation scrubber.

    :param segmentation_rule: Function that evaluates the current window and returns True when a segment should end
    :param max_segment_size: Maximum number of points in a segment before forcing a split, defaults to 2^64
    :param source: The handler providing input data, defaults to None
    """
    super().__init__(source)
    self.segmentation_rule = segmentation_rule
    self.max_segment_size = max_segment_size
__iter__
__iter__() -> Iterator[ScrubberWindow[T]]

Create an iterator that yields segments as they're detected in real-time.

This method processes data points one by one, accumulating them in a buffer and checking after each addition whether the segmentation condition is met or the maximum segment size is reached.

Returns:

Type Description
Iterator[ScrubberWindow[T]]

Iterator yielding ScrubberWindow instances for each detected segment

Raises:

Type Description
ValueError

If no source has been set

Source code in pysatl_tsp/core/scrubber/segmentation_scrubber.py
def __iter__(self) -> Iterator[ScrubberWindow[T]]:
    """Create an iterator that yields segments as they're detected in real-time.

    This method processes data points one by one, accumulating them in a buffer
    and checking after each addition whether the segmentation condition is met
    or the maximum segment size is reached.

    :return: Iterator yielding ScrubberWindow instances for each detected segment
    :raises ValueError: If no source has been set
    """
    if self.source is None:
        raise ValueError("Source is not set")
    current_window: ScrubberWindow[T] = ScrubberWindow(deque())
    for index, item in enumerate(self.source):
        current_window.append(item, index)

        if self.segmentation_rule(current_window) or len(current_window) >= self.max_segment_size:
            yield current_window.copy()
            current_window.clear()

    if current_window:
        yield current_window.copy()
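The same control flow can be exercised without the package. `online_segments` below is a hypothetical stand-alone generator that follows the loop above, with a plain list in place of `ScrubberWindow`.

```python
from typing import Callable, Iterable, Iterator


def online_segments(
    source: Iterable[int],
    rule: Callable[[list], bool],
    max_segment_size: int = 2**64,
) -> Iterator[list]:
    """Hypothetical stand-alone version of the loop in OnlineSegmentationScrubber.__iter__."""
    current: list = []
    for item in source:
        current.append(item)
        if rule(current) or len(current) >= max_segment_size:
            yield current.copy()  # the triggering point closes the segment
            current.clear()
    if current:
        yield current.copy()      # flush whatever remains at end of stream


def detect_jump(window: list) -> bool:
    return len(window) >= 2 and abs(window[-1] - window[-2]) > 3


data = [1, 1, 2, 3, 8, 9, 8, 2, 2, 3, 10, 10, 9, 9]
print(list(online_segments(data, detect_jump, max_segment_size=5)))
# [[1, 1, 2, 3, 8], [9, 8, 2], [2, 3, 10], [10, 9, 9]]
print(list(online_segments(range(7), lambda w: False, max_segment_size=3)))
# [[0, 1, 2], [3, 4, 5], [6]]
```

Because the buffer is cleared after every emission, the rule never compares points across a segment boundary: the point that triggers a split is the last element of the segment it closes, and the next segment starts empty.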

implementations

This module provides implementations of time series processing techniques.

KalmanFilterHandler

Bases: OnlineFilterHandler[float, float]

A handler that applies the Kalman filter to time series data in real-time.

This handler integrates the complete Kalman filter functionality for processing noisy time series data. It estimates the underlying state of a system based on a sequence of noisy measurements.

Parameters:

Name Type Description Default
F ndarray[Any, dtype[float64]]

State transition matrix

required
H ndarray[Any, dtype[float64]]

Measurement matrix

required
B Union[float, ndarray[Any, dtype[float64]]] | None

Control input matrix, defaults to 0

None
Q ndarray[Any, dtype[float64]] | None

Process noise covariance matrix, defaults to identity matrix

None
R ndarray[Any, dtype[float64]] | None

Measurement noise covariance matrix, defaults to identity matrix

None
P ndarray[Any, dtype[float64]] | None

Initial state covariance matrix, defaults to identity matrix

None
x0 ndarray[Any, dtype[float64]] | None

Initial state vector, defaults to zero vector

None
source Handler[Any, float] | None

The handler providing input data, defaults to None

None
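For intuition about what the matrices control, here is a one-dimensional predict/update cycle in plain Python. It is a textbook scalar Kalman filter, not the handler's implementation: `scalar_kalman` and its lowercase parameters are hypothetical names standing in for F, H, Q, R, P and x0 above, and the control term B is omitted.

```python
from typing import Iterable, Iterator


def scalar_kalman(
    measurements: Iterable[float],
    f: float = 1.0,   # state transition (F)
    h: float = 1.0,   # measurement model (H)
    q: float = 0.0,   # process noise variance (Q)
    r: float = 1.0,   # measurement noise variance (R)
    p: float = 1.0,   # initial state variance (P)
    x0: float = 0.0,  # initial state estimate
) -> Iterator[float]:
    x = x0
    for z in measurements:
        # Predict: propagate the estimate and its uncertainty
        x = f * x
        p = f * p * f + q
        # Update: blend prediction and measurement using the Kalman gain
        k = p * h / (h * p * h + r)
        x = x + k * (z - h * x)
        p = (1.0 - k * h) * p
        yield x


estimates = list(scalar_kalman([1.0, 1.0, 1.0]))
print([round(e, 4) for e in estimates])  # [0.5, 0.6667, 0.75]
```

With these defaults the gain shrinks as measurements accumulate, so each new reading moves the estimate less. Raising `q` keeps the gain higher and the filter more responsive; raising `r` makes it trust the measurements less.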
Source code in pysatl_tsp/implementations/processor/kalman_filter_handler.py
class KalmanFilterHandler(OnlineFilterHandler[float, float]):
    """A handler that applies the Kalman filter to time series data in real-time.

    This handler integrates the complete Kalman filter functionality for processing
    noisy time series data. It estimates the underlying state of a system based on
    a sequence of noisy measurements.

    :param F: State transition matrix
    :param H: Measurement matrix
    :param B: Control input matrix, defaults to 0
    :param Q: Process noise covariance matrix, defaults to identity matrix
    :param R: Measurement noise covariance matrix, defaults to identity matrix
    :param P: Initial state covariance matrix, defaults to identity matrix
    :param x0: Initial state vector, defaults to zero vector
    :param source: The handler providing input data, defaults to None

    Example:
    ```
        import numpy as np
        from pysatl_tsp.core.data_providers import SimpleDataProvider
        from pysatl_tsp.implementations.processor.kalman_filter_handler import KalmanFilterHandler

        np.random.seed(42)
        true_signal = np.sin(np.linspace(0, 4 * np.pi, 1000))
        noisy_signal = true_signal + np.random.normal(0, 0.1, 1000)

        data_source = SimpleDataProvider(noisy_signal.tolist())

        dt: float = 1.0/60
        F = np.array([[1, dt, 0],[0, 1, dt], [0, 0, 1]])
        H = np.array([1, 0, 0]).reshape(1, 3)
        Q = np.array([[0.05, 0.05, 0.0], [0.05, 0.05, 0.0], [0.0, 0.0, 0.0]])
        R = np.array([0.5]).reshape(1, 1)

        filter_handler: KalmanFilterHandler = KalmanFilterHandler(F=F, H=H, Q=Q, R=R, source=data_source)

        filtered_values = list(filter_handler)

        import matplotlib.pyplot as plt
        plt.figure(figsize=(10, 6))
        plt.plot(range(len(noisy_signal)), noisy_signal, label='Measurements')
        plt.plot(range(len(filtered_values)), filtered_values, label='Kalman Filter Prediction')
        plt.legend()
        plt.show()
    ```
    """

    def __init__(
        self,
        F: np.ndarray[Any, np.dtype[np.float64]],
        H: np.ndarray[Any, np.dtype[np.float64]],
        B: Union[float, np.ndarray[Any, np.dtype[np.float64]]] | None = None,
        Q: np.ndarray[Any, np.dtype[np.float64]] | None = None,
        R: np.ndarray[Any, np.dtype[np.float64]] | None = None,
        P: np.ndarray[Any, np.dtype[np.float64]] | None = None,
        x0: np.ndarray[Any, np.dtype[np.float64]] | None = None,
        source: Handler[Any, float] | None = None,
    ) -> None:
        """Initialize the Kalman filter handler with all necessary matrices.

        :param F: State transition matrix
        :param H: Measurement matrix
        :param B: Control input matrix, defaults to None
        :param Q: Process noise covariance matrix, defaults to None
        :param R: Measurement noise covariance matrix, defaults to None
        :param P: Initial state covariance matrix, defaults to None
        :param x0: Initial state vector, defaults to None
        :param source: The handler providing input data, defaults to None
        """
        if F is None or H is None:
            raise ValueError("Set proper system dynamics.")

        self.n: int = F.shape[1]  # state dimension
        self.m: int = H.shape[0]  # measurement dimension (H is m x n)

        self.F: np.ndarray[Any, np.dtype[np.float64]] = F
        self.H: np.ndarray[Any, np.dtype[np.float64]] = H
        self.B: Union[float, np.ndarray[Any, np.dtype[np.float64]]] = 0 if B is None else B

        self.Q: np.ndarray[Any, np.dtype[np.float64]] = np.eye(self.n) if Q is None else Q
        # R must be m x m so that it can be added to H P H^T in update()
        self.R: np.ndarray[Any, np.dtype[np.float64]] = np.eye(self.m) if R is None else R
        self.P: np.ndarray[Any, np.dtype[np.float64]] = np.eye(self.n) if P is None else P

        self.x: np.ndarray[Any, np.dtype[np.float64]] = np.zeros((self.n, 1)) if x0 is None else x0

        super().__init__(filter_func=self._apply_kalman_filter, filter_config=None, source=source)

    def predict(
        self, u: Union[float, np.ndarray[Any, np.dtype[np.float64]]] = 0
    ) -> np.ndarray[Any, np.dtype[np.float64]]:
        """Predict the next state based on the model.

        :param u: Control input, defaults to 0
        :return: Predicted state vector
        """
        self.x = np.dot(self.F, self.x) + np.dot(self.B, u)

        self.P = np.dot(np.dot(self.F, self.P), self.F.T) + self.Q

        return self.x

    def update(self, z: float) -> None:
        """Update the state estimate based on the measurement.

        :param z: Measurement
        """
        y: np.ndarray[Any, np.dtype[np.float64]] = z - np.dot(self.H, self.x)

        S: np.ndarray[Any, np.dtype[np.float64]] = self.R + np.dot(self.H, np.dot(self.P, self.H.T))

        K: np.ndarray[Any, np.dtype[np.float64]] = np.dot(np.dot(self.P, self.H.T), np.linalg.inv(S))

        self.x = self.x + np.dot(K, y)

        I_matrix: np.ndarray[Any, np.dtype[np.float64]] = np.eye(self.n)
        self.P = np.dot(np.dot(I_matrix - np.dot(K, self.H), self.P), (I_matrix - np.dot(K, self.H)).T) + np.dot(
            np.dot(K, self.R), K.T
        )

    def _apply_kalman_filter(self, window: ScrubberWindow[float], _: Any) -> float:
        """Run one predict/update cycle of the Kalman filter on the latest point in the window.

        :param window: Window of historical data
        :param _: Unused configuration parameter
        :return: One-step-ahead predicted measurement, computed before the latest point is incorporated
        """
        if not window:
            return 0.0

        measurement: float = window[-1]

        prediction_array = np.dot(self.H, self.predict())
        prediction: float = float(prediction_array.item())

        self.update(measurement)

        return prediction
__init__
__init__(
    F: ndarray[Any, dtype[float64]],
    H: ndarray[Any, dtype[float64]],
    B: Union[float, ndarray[Any, dtype[float64]]]
    | None = None,
    Q: ndarray[Any, dtype[float64]] | None = None,
    R: ndarray[Any, dtype[float64]] | None = None,
    P: ndarray[Any, dtype[float64]] | None = None,
    x0: ndarray[Any, dtype[float64]] | None = None,
    source: Handler[Any, float] | None = None,
) -> None

Initialize the Kalman filter handler with all necessary matrices.

Parameters:

Name Type Description Default
F ndarray[Any, dtype[float64]]

State transition matrix

required
H ndarray[Any, dtype[float64]]

Measurement matrix

required
B Union[float, ndarray[Any, dtype[float64]]] | None

Control input matrix, defaults to None

None
Q ndarray[Any, dtype[float64]] | None

Process noise covariance matrix, defaults to None

None
R ndarray[Any, dtype[float64]] | None

Measurement noise covariance matrix, defaults to None

None
P ndarray[Any, dtype[float64]] | None

Initial state covariance matrix, defaults to None

None
x0 ndarray[Any, dtype[float64]] | None

Initial state vector, defaults to None

None
source Handler[Any, float] | None

The handler providing input data, defaults to None

None
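The constructor does not validate matrix shapes, so a mismatch only surfaces later inside `predict` or `update`. The sketch below shows one way to check shapes up front. The helper name `check_kalman_shapes` is hypothetical and not part of pysatl_tsp; it assumes the usual convention that `F` is n x n, `H` is m x n, `Q` and `P` are n x n, `R` is m x m, and `x0` is n x 1.

```python
import numpy as np


def check_kalman_shapes(F, H, Q=None, R=None, P=None, x0=None):
    """Return (n, m) if all supplied matrices have mutually consistent shapes."""
    n = F.shape[0]
    if F.shape != (n, n):
        raise ValueError(f"F must be square, got {F.shape}")
    if H.ndim != 2 or H.shape[1] != n:
        raise ValueError(f"H must have shape (m, {n}), got {H.shape}")
    m = H.shape[0]

    # Optional matrices are only checked when the caller supplies them.
    expected = {"Q": (Q, (n, n)), "R": (R, (m, m)), "P": (P, (n, n)), "x0": (x0, (n, 1))}
    for name, (array, shape) in expected.items():
        if array is not None and array.shape != shape:
            raise ValueError(f"{name} must have shape {shape}, got {array.shape}")
    return n, m


dt = 1.0 / 60
F = np.array([[1, dt, 0], [0, 1, dt], [0, 0, 1]])
H = np.array([1, 0, 0]).reshape(1, 3)
R = np.array([0.5]).reshape(1, 1)
dims = check_kalman_shapes(F, H, R=R)
```

With the matrices from the class example, `dims` is `(3, 1)`: three state components and a single scalar measurement.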
Source code in pysatl_tsp/implementations/processor/kalman_filter_handler.py
def __init__(
    self,
    F: np.ndarray[Any, np.dtype[np.float64]],
    H: np.ndarray[Any, np.dtype[np.float64]],
    B: Union[float, np.ndarray[Any, np.dtype[np.float64]]] | None = None,
    Q: np.ndarray[Any, np.dtype[np.float64]] | None = None,
    R: np.ndarray[Any, np.dtype[np.float64]] | None = None,
    P: np.ndarray[Any, np.dtype[np.float64]] | None = None,
    x0: np.ndarray[Any, np.dtype[np.float64]] | None = None,
    source: Handler[Any, float] | None = None,
) -> None:
    """Initialize the Kalman filter handler with all necessary matrices.

    :param F: State transition matrix
    :param H: Measurement matrix
    :param B: Control input matrix, defaults to None
    :param Q: Process noise covariance matrix, defaults to None
    :param R: Measurement noise covariance matrix, defaults to None
    :param P: Initial state covariance matrix, defaults to None
    :param x0: Initial state vector, defaults to None
    :param source: The handler providing input data, defaults to None
    """
    if F is None or H is None:
        raise ValueError("Set proper system dynamics.")

    self.n: int = F.shape[1]  # state dimension
    self.m: int = H.shape[0]  # measurement dimension (H is m x n)

    self.F: np.ndarray[Any, np.dtype[np.float64]] = F
    self.H: np.ndarray[Any, np.dtype[np.float64]] = H
    self.B: Union[float, np.ndarray[Any, np.dtype[np.float64]]] = 0 if B is None else B

    self.Q: np.ndarray[Any, np.dtype[np.float64]] = np.eye(self.n) if Q is None else Q
    # R must be m x m so that it can be added to H P H^T in update()
    self.R: np.ndarray[Any, np.dtype[np.float64]] = np.eye(self.m) if R is None else R
    self.P: np.ndarray[Any, np.dtype[np.float64]] = np.eye(self.n) if P is None else P

    self.x: np.ndarray[Any, np.dtype[np.float64]] = np.zeros((self.n, 1)) if x0 is None else x0

    super().__init__(filter_func=self._apply_kalman_filter, filter_config=None, source=source)
predict
predict(
    u: Union[float, ndarray[Any, dtype[float64]]] = 0,
) -> np.ndarray[Any, np.dtype[np.float64]]

Predict the next state based on the model.

Parameters:

Name Type Description Default
u Union[float, ndarray[Any, dtype[float64]]]

Control input, defaults to 0

0

Returns:

Type Description
ndarray[Any, dtype[float64]]

Predicted state vector
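The prediction step propagates the state and its covariance through the model: x' = F x + B u and P' = F P F^T + Q. A minimal standalone sketch of that step follows. The function name `kalman_predict` and the constant-velocity model are illustrative assumptions, not part of the library.

```python
import numpy as np


def kalman_predict(x, P, F, Q, B=0.0, u=0.0):
    """Time update: propagate the state and covariance one step ahead."""
    x_pred = F @ x + np.dot(B, u)  # x' = F x + B u
    P_pred = F @ P @ F.T + Q       # P' = F P F^T + Q
    return x_pred, P_pred


# Constant-velocity model: position 0, velocity 2, unit time step.
dt = 1.0
F = np.array([[1.0, dt], [0.0, 1.0]])
x = np.array([[0.0], [2.0]])
P = np.eye(2)
Q = 0.1 * np.eye(2)

x_pred, P_pred = kalman_predict(x, P, F, Q)
```

Unlike the method documented here, this sketch returns new arrays instead of mutating the filter's state, which makes the effect of a single step easier to inspect.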

Source code in pysatl_tsp/implementations/processor/kalman_filter_handler.py
def predict(
    self, u: Union[float, np.ndarray[Any, np.dtype[np.float64]]] = 0
) -> np.ndarray[Any, np.dtype[np.float64]]:
    """Predict the next state based on the model.

    :param u: Control input, defaults to 0
    :return: Predicted state vector
    """
    self.x = np.dot(self.F, self.x) + np.dot(self.B, u)

    self.P = np.dot(np.dot(self.F, self.P), self.F.T) + self.Q

    return self.x
update
update(z: float) -> None

Update the state estimate based on the measurement.

Parameters:

Name Type Description Default
z float

Measurement

required
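The update step computes the innovation, the innovation covariance, and the Kalman gain, then corrects the state and refreshes the covariance with the Joseph form, which keeps the covariance symmetric. The sketch below mirrors those steps in a standalone function. The name `kalman_update` and the toy matrices are assumptions for illustration.

```python
import numpy as np


def kalman_update(x, P, z, H, R):
    """Measurement update using the Joseph-form covariance equation."""
    y = z - H @ x                            # innovation
    S = H @ P @ H.T + R                      # innovation covariance (m x m)
    K = P @ H.T @ np.linalg.inv(S)           # Kalman gain (n x m)
    x_new = x + K @ y
    I_KH = np.eye(P.shape[0]) - K @ H
    P_new = I_KH @ P @ I_KH.T + K @ R @ K.T  # Joseph form
    return x_new, P_new


# Two-component state, only the first component is measured.
x = np.zeros((2, 1))
P = np.eye(2)
H = np.array([[1.0, 0.0]])
R = np.array([[1.0]])

x_new, P_new = kalman_update(x, P, 4.0, H, R)
```

With equal prior and measurement variance the gain on the measured component is 0.5, so the estimate moves halfway toward the measurement and the variance of that component halves.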
Source code in pysatl_tsp/implementations/processor/kalman_filter_handler.py
def update(self, z: float) -> None:
    """Update the state estimate based on the measurement.

    :param z: Measurement
    """
    y: np.ndarray[Any, np.dtype[np.float64]] = z - np.dot(self.H, self.x)

    S: np.ndarray[Any, np.dtype[np.float64]] = self.R + np.dot(self.H, np.dot(self.P, self.H.T))

    K: np.ndarray[Any, np.dtype[np.float64]] = np.dot(np.dot(self.P, self.H.T), np.linalg.inv(S))

    self.x = self.x + np.dot(K, y)

    I_matrix: np.ndarray[Any, np.dtype[np.float64]] = np.eye(self.n)
    self.P = np.dot(np.dot(I_matrix - np.dot(K, self.H), self.P), (I_matrix - np.dot(K, self.H)).T) + np.dot(
        np.dot(K, self.R), K.T
    )

TimeSeriesCrossValidator

Bases: Handler[T, tuple[ScrubberWindow[T], ScrubberWindow[T]]]

A handler that implements expanding window cross-validation for time series data.

This handler produces a sequence of train-validation splits suitable for time series validation, where each split preserves the temporal order of data. It implements an expanding window approach, where the training set grows over time while the validation set has a fixed size and slides forward.

The handler ensures that:

1. The training set always has at least min_train_size points
2. The validation set always has exactly val_size points
3. The validation set always follows the training set temporally
4. Each new split adds val_size points to the training set

This approach respects the temporal nature of time series data and prevents data leakage from future to past.
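The expanding-window scheme can be illustrated on plain index lists, independent of the handler machinery. The function name `expanding_window_splits` is hypothetical; it is a sketch of the splitting rule described above, not the library's implementation.

```python
def expanding_window_splits(n_points, min_train_size, val_size):
    """Return (train_indices, val_indices) pairs for an expanding window."""
    splits = []
    train_end = min_train_size
    # Stop as soon as a full validation block no longer fits.
    while train_end + val_size <= n_points:
        train = list(range(0, train_end))
        val = list(range(train_end, train_end + val_size))
        splits.append((train, val))
        train_end += val_size  # the validation block joins the next training set
    return splits


splits = expanding_window_splits(n_points=100, min_train_size=50, val_size=10)
first_train, first_val = splits[0]
last_train, last_val = splits[-1]
```

For 100 points this yields five splits: the first trains on indices 0..49 and validates on 50..59, the last trains on 0..89 and validates on 90..99. Under this rule, trailing points that do not fill a complete validation block are never used.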

Parameters:

Name Type Description Default
min_train_size int

Minimum number of points in the initial training set

required
val_size int

Number of points in each validation set

required
source Optional[Handler[Any, T]]

The handler providing input data, defaults to None

None

Example:

    import numpy as np
    import matplotlib.pyplot as plt

    from pysatl_tsp.core.data_providers import SimpleDataProvider
    from pysatl_tsp.implementations.processor.time_series_cross_validator import TimeSeriesCrossValidator

    # Generate a synthetic time series
    np.random.seed(42)
    ts = np.cumsum(np.random.normal(0, 1, 100))  # Random walk
    data_source = SimpleDataProvider(ts)

    # Create a cross-validator with min_train_size=50 and val_size=10
    cv = TimeSeriesCrossValidator(min_train_size=50, val_size=10, source=data_source)

    # Visualize the different train-validation splits
    plt.figure(figsize=(14, 8))
    x = np.arange(len(ts))
    plt.plot(x, ts, "k-", alpha=0.3, label="Full time series")

    for i, (train, val) in enumerate(cv):
        train_indices = list(train.indices)
        val_indices = list(val.indices)

        # Plot each split
        plt.plot(train_indices, [ts[i] for i in train_indices], "b-", linewidth=2, alpha=0.7 - i * 0.1)
        plt.plot(val_indices, [ts[i] for i in val_indices], "r-", linewidth=2, alpha=0.7 - i * 0.1)

        # Add markers at the split point
        split_idx = train_indices[-1]
        plt.axvline(x=split_idx, color="g", linestyle="--", alpha=0.5)

        # Print information about this split
        print(f"Split {i + 1}:")
        print(f"  Train: {len(train)} points (indices {train_indices[0]}..{train_indices[-1]})")
        print(f"  Validation: {len(val)} points (indices {val_indices[0]}..{val_indices[-1]})")

    plt.title("Time Series Cross-Validation: Expanding Window Approach")
    plt.xlabel("Time")
    plt.ylabel("Value")

    # Add custom legend
    from matplotlib.lines import Line2D

    custom_lines = [
        Line2D([0], [0], color="k", alpha=0.3),
        Line2D([0], [0], color="b", linewidth=2),
        Line2D([0], [0], color="r", linewidth=2),
        Line2D([0], [0], color="g", linestyle="--"),
    ]
    plt.legend(
        custom_lines, ["Full time series", "Training sets", "Validation sets", "Split points"], loc="upper left"
    )

    plt.grid(True, alpha=0.3)
    plt.show()

    # Example model evaluation with each split
    from sklearn.linear_model import LinearRegression

    for i, (train, val) in enumerate(cv):
        # Prepare data
        train_indices = list(train.indices)
        train_X = np.array(train_indices).reshape(-1, 1)
        train_y = np.array(list(train.values))

        val_indices = list(val.indices)
        val_X = np.array(val_indices).reshape(-1, 1)
        val_y = np.array(list(val.values))

        # Train a simple model
        model = LinearRegression()
        model.fit(train_X, train_y)

        # Evaluate on validation set
        val_pred = model.predict(val_X)
        mse = np.mean((val_pred - val_y) ** 2)

        print(f"Split {i + 1} - Validation MSE: {mse:.4f}")
Source code in pysatl_tsp/implementations/processor/time_series_cross_validator.py
class TimeSeriesCrossValidator(Handler[T, tuple[ScrubberWindow[T], ScrubberWindow[T]]]):
    """A handler that implements expanding window cross-validation for time series data.

    This handler produces a sequence of train-validation splits suitable for time series
    validation, where each split preserves the temporal order of data. It implements an
    expanding window approach, where the training set grows over time while the validation
    set has a fixed size and slides forward.

    The handler ensures that:
    1. The training set always has at least `min_train_size` points
    2. The validation set always has exactly `val_size` points
    3. The validation set always follows the training set temporally
    4. Each new split adds `val_size` points to the training set

    This approach respects the temporal nature of time series data and prevents
    data leakage from future to past.

    :param min_train_size: Minimum number of points in the initial training set
    :param val_size: Number of points in each validation set
    :param source: The handler providing input data, defaults to None

    Example:
        ```python
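        import numpy as np
        import matplotlib.pyplot as plt

        from pysatl_tsp.core.data_providers import SimpleDataProvider
        from pysatl_tsp.implementations.processor.time_series_cross_validator import TimeSeriesCrossValidator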

        # Generate a synthetic time series
        np.random.seed(42)
        ts = np.cumsum(np.random.normal(0, 1, 100))  # Random walk
        data_source = SimpleDataProvider(ts)

        # Create a cross-validator with min_train_size=50 and val_size=10
        cv = TimeSeriesCrossValidator(min_train_size=50, val_size=10, source=data_source)

        # Visualize the different train-validation splits
        plt.figure(figsize=(14, 8))
        x = np.arange(len(ts))
        plt.plot(x, ts, "k-", alpha=0.3, label="Full time series")

        for i, (train, val) in enumerate(cv):
            train_indices = list(train.indices)
            val_indices = list(val.indices)

            # Plot each split
            plt.plot(train_indices, [ts[i] for i in train_indices], "b-", linewidth=2, alpha=0.7 - i * 0.1)
            plt.plot(val_indices, [ts[i] for i in val_indices], "r-", linewidth=2, alpha=0.7 - i * 0.1)

            # Add markers at the split point
            split_idx = train_indices[-1]
            plt.axvline(x=split_idx, color="g", linestyle="--", alpha=0.5)

            # Print information about this split
            print(f"Split {i + 1}:")
            print(f"  Train: {len(train)} points (indices {train_indices[0]}..{train_indices[-1]})")
            print(f"  Validation: {len(val)} points (indices {val_indices[0]}..{val_indices[-1]})")

        plt.title("Time Series Cross-Validation: Expanding Window Approach")
        plt.xlabel("Time")
        plt.ylabel("Value")

        # Add custom legend
        from matplotlib.lines import Line2D

        custom_lines = [
            Line2D([0], [0], color="k", alpha=0.3),
            Line2D([0], [0], color="b", linewidth=2),
            Line2D([0], [0], color="r", linewidth=2),
            Line2D([0], [0], color="g", linestyle="--"),
        ]
        plt.legend(
            custom_lines, ["Full time series", "Training sets", "Validation sets", "Split points"], loc="upper left"
        )

        plt.grid(True, alpha=0.3)
        plt.show()

        # Example model evaluation with each split
        from sklearn.linear_model import LinearRegression

        for i, (train, val) in enumerate(cv):
            # Prepare data
            train_indices = list(train.indices)
            train_X = np.array(train_indices).reshape(-1, 1)
            train_y = np.array(list(train.values))

            val_indices = list(val.indices)
            val_X = np.array(val_indices).reshape(-1, 1)
            val_y = np.array(list(val.values))

            # Train a simple model
            model = LinearRegression()
            model.fit(train_X, train_y)

            # Evaluate on validation set
            val_pred = model.predict(val_X)
            mse = np.mean((val_pred - val_y) ** 2)

            print(f"Split {i + 1} - Validation MSE: {mse:.4f}")
        ```
    """

    def __init__(self, min_train_size: int, val_size: int, source: Optional[Handler[Any, T]] = None):
        """Initialize a time series cross-validator.

        :param min_train_size: Minimum number of points in the initial training set
        :param val_size: Number of points in each validation set
        :param source: The handler providing input data, defaults to None
        """
        super().__init__(source)
        self.min_train_size = min_train_size
        self.val_size = val_size

    def __iter__(self) -> Iterator[tuple[ScrubberWindow[T], ScrubberWindow[T]]]:
        """Create an iterator that yields train-validation splits for time series cross-validation.

        This method creates splits where:
        1. The first split has exactly min_train_size points for training
        2. Each subsequent split adds val_size points to the training set
        3. Each validation set has exactly val_size points and follows the training set

        :return: Iterator yielding tuples of (training_window, validation_window)
        :raises ValueError: If no source has been set
        """
        if self.source is None:
            raise ValueError("Source is not set")

        scrubber = SlidingScrubber(
            lambda buffer: len(buffer) > self.min_train_size
            and (len(buffer) - self.min_train_size) % self.val_size == 0,
            shift=0,
            source=self.source,
        )
        handler: MappingHandler[ScrubberWindow[T], tuple[ScrubberWindow[T], ScrubberWindow[T]]] = MappingHandler(
            map_func=lambda window: (window[: -self.val_size], window[-self.val_size :])
        )

        yield from (scrubber | handler)
__init__
__init__(
    min_train_size: int,
    val_size: int,
    source: Optional[Handler[Any, T]] = None,
)

Initialize a time series cross-validator.

Parameters:

Name Type Description Default
min_train_size int

Minimum number of points in the initial training set

required
val_size int

Number of points in each validation set

required
source Optional[Handler[Any, T]]

The handler providing input data, defaults to None

None
Source code in pysatl_tsp/implementations/processor/time_series_cross_validator.py
def __init__(self, min_train_size: int, val_size: int, source: Optional[Handler[Any, T]] = None):
    """Initialize a time series cross-validator.

    :param min_train_size: Minimum number of points in the initial training set
    :param val_size: Number of points in each validation set
    :param source: The handler providing input data, defaults to None
    """
    super().__init__(source)
    self.min_train_size = min_train_size
    self.val_size = val_size
__iter__
__iter__() -> Iterator[
    tuple[ScrubberWindow[T], ScrubberWindow[T]]
]

Create an iterator that yields train-validation splits for time series cross-validation.

This method creates splits where:

1. The first split has exactly min_train_size points for training
2. Each subsequent split adds val_size points to the training set
3. Each validation set has exactly val_size points and follows the training set

Returns:

Type Description
Iterator[tuple[ScrubberWindow[T], ScrubberWindow[T]]]

Iterator yielding tuples of (training_window, validation_window)

Raises:

Type Description
ValueError

If no source has been set
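The source for this method emits a split whenever the growing buffer holds more than min_train_size points and the excess is a multiple of val_size, then slices the buffer into its last val_size points and everything before them. The sketch below reproduces that trigger condition and slicing on plain lists. The name `split_trigger_lengths` is hypothetical, and a Python list stands in for the scrubber buffer.

```python
def split_trigger_lengths(n_points, min_train_size, val_size):
    """Buffer lengths at which the trigger condition fires."""
    return [
        length
        for length in range(1, n_points + 1)
        if length > min_train_size and (length - min_train_size) % val_size == 0
    ]


val_size = 10
lengths = split_trigger_lengths(n_points=100, min_train_size=50, val_size=val_size)

# Slicing applied to the buffer at the first trigger (60 points collected).
buffer = list(range(lengths[0]))
train, val = buffer[:-val_size], buffer[-val_size:]
```

Here `lengths` is `[60, 70, 80, 90, 100]`, and the first split trains on points 0..49 and validates on points 50..59.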

Source code in pysatl_tsp/implementations/processor/time_series_cross_validator.py
def __iter__(self) -> Iterator[tuple[ScrubberWindow[T], ScrubberWindow[T]]]:
    """Create an iterator that yields train-validation splits for time series cross-validation.

    This method creates splits where:
    1. The first split has exactly min_train_size points for training
    2. Each subsequent split adds val_size points to the training set
    3. Each validation set has exactly val_size points and follows the training set

    :return: Iterator yielding tuples of (training_window, validation_window)
    :raises ValueError: If no source has been set
    """
    if self.source is None:
        raise ValueError("Source is not set")

    scrubber = SlidingScrubber(
        lambda buffer: len(buffer) > self.min_train_size
        and (len(buffer) - self.min_train_size) % self.val_size == 0,
        shift=0,
        source=self.source,
    )
    handler: MappingHandler[ScrubberWindow[T], tuple[ScrubberWindow[T], ScrubberWindow[T]]] = MappingHandler(
        map_func=lambda window: (window[: -self.val_size], window[-self.val_size :])
    )

    yield from (scrubber | handler)

processor

Module for time series processing implementations.

KalmanFilterHandler

Bases: OnlineFilterHandler[float, float]

A handler that applies the Kalman filter to time series data in real-time.

This handler integrates the complete Kalman filter functionality for processing noisy time series data. It estimates the underlying state of a system based on a sequence of noisy measurements.
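How the predict and update steps combine for each incoming sample can be sketched outside the handler framework. The function name `run_filter` and the scalar random-walk model are assumptions for illustration. The order of operations follows the `_apply_kalman_filter` source listed for this class, where the value emitted for a sample is the prediction made before that sample is incorporated.

```python
import numpy as np


def run_filter(measurements, F, H, Q, R):
    """Emit the one-step-ahead prediction for each measurement, then update."""
    n = F.shape[0]
    x = np.zeros((n, 1))  # zero initial state
    P = np.eye(n)         # identity initial covariance
    outputs = []
    for z in measurements:
        # Predict
        x = F @ x
        P = F @ P @ F.T + Q
        outputs.append(float((H @ x).item()))
        # Update with the new measurement (Joseph form for the covariance)
        S = H @ P @ H.T + R
        K = P @ H.T @ np.linalg.inv(S)
        x = x + K @ (z - H @ x)
        I_KH = np.eye(n) - K @ H
        P = I_KH @ P @ I_KH.T + K @ R @ K.T
    return outputs


F = np.array([[1.0]])
H = np.array([[1.0]])
Q = np.array([[0.01]])
R = np.array([[0.5]])

outputs = run_filter([5.0] * 50, F, H, Q, R)
```

Because each output is produced before its measurement is used, the first value is the prediction from the zero initial state, and the sequence then climbs toward the constant measurement of 5.0.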

Parameters:

Name Type Description Default
F ndarray[Any, dtype[float64]]

State transition matrix

required
H ndarray[Any, dtype[float64]]

Measurement matrix

required
B Union[float, ndarray[Any, dtype[float64]]] | None

Control input matrix, defaults to 0

None
Q ndarray[Any, dtype[float64]] | None

Process noise covariance matrix, defaults to identity matrix

None
R ndarray[Any, dtype[float64]] | None

Measurement noise covariance matrix, defaults to identity matrix

None
P ndarray[Any, dtype[float64]] | None

Initial state covariance matrix, defaults to identity matrix

None
x0 ndarray[Any, dtype[float64]] | None

Initial state vector, defaults to zero vector

None
source Handler[Any, float] | None

The handler providing input data, defaults to None

None

Example:

    import numpy as np
    from pysatl_tsp.core.data_providers import SimpleDataProvider
    from pysatl_tsp.implementations.processor.kalman_filter_handler import KalmanFilterHandler

    np.random.seed(42)
    true_signal = np.sin(np.linspace(0, 4 * np.pi, 1000))
    noisy_signal = true_signal + np.random.normal(0, 0.1, 1000)

    data_source = SimpleDataProvider(noisy_signal.tolist())

    dt: float = 1.0/60
    F = np.array([[1, dt, 0],[0, 1, dt], [0, 0, 1]])
    H = np.array([1, 0, 0]).reshape(1, 3)
    Q = np.array([[0.05, 0.05, 0.0], [0.05, 0.05, 0.0], [0.0, 0.0, 0.0]])
    R = np.array([0.5]).reshape(1, 1)

    filter_handler: KalmanFilterHandler = KalmanFilterHandler(F=F, H=H, Q=Q, R=R, source=data_source)

    filtered_values = list(filter_handler)

    import matplotlib.pyplot as plt
    plt.figure(figsize=(10, 6))
    plt.plot(range(len(noisy_signal)), noisy_signal, label='Measurements')
    plt.plot(range(len(filtered_values)), filtered_values, label='Kalman Filter Prediction')
    plt.legend()
    plt.show()
Source code in pysatl_tsp/implementations/processor/kalman_filter_handler.py
class KalmanFilterHandler(OnlineFilterHandler[float, float]):
    """A handler that applies the Kalman filter to time series data in real-time.

    This handler integrates the complete Kalman filter functionality for processing
    noisy time series data. It estimates the underlying state of a system based on
    a sequence of noisy measurements.

    :param F: State transition matrix
    :param H: Measurement matrix
    :param B: Control input matrix, defaults to 0
    :param Q: Process noise covariance matrix, defaults to identity matrix
    :param R: Measurement noise covariance matrix, defaults to identity matrix
    :param P: Initial state covariance matrix, defaults to identity matrix
    :param x0: Initial state vector, defaults to zero vector
    :param source: The handler providing input data, defaults to None

    Example:
    ```
        import numpy as np
        from pysatl_tsp.core.data_providers import SimpleDataProvider
        from pysatl_tsp.implementations.processor.kalman_filter_handler import KalmanFilterHandler

        np.random.seed(42)
        true_signal = np.sin(np.linspace(0, 4 * np.pi, 1000))
        noisy_signal = true_signal + np.random.normal(0, 0.1, 1000)

        data_source = SimpleDataProvider(noisy_signal.tolist())

        dt: float = 1.0/60
        F = np.array([[1, dt, 0],[0, 1, dt], [0, 0, 1]])
        H = np.array([1, 0, 0]).reshape(1, 3)
        Q = np.array([[0.05, 0.05, 0.0], [0.05, 0.05, 0.0], [0.0, 0.0, 0.0]])
        R = np.array([0.5]).reshape(1, 1)

        filter_handler: KalmanFilterHandler = KalmanFilterHandler(F=F, H=H, Q=Q, R=R, source=data_source)

        filtered_values = list(filter_handler)

        import matplotlib.pyplot as plt
        plt.figure(figsize=(10, 6))
        plt.plot(range(len(noisy_signal)), noisy_signal, label='Measurements')
        plt.plot(range(len(filtered_values)), filtered_values, label='Kalman Filter Prediction')
        plt.legend()
        plt.show()
    ```
    """

    def __init__(
        self,
        F: np.ndarray[Any, np.dtype[np.float64]],
        H: np.ndarray[Any, np.dtype[np.float64]],
        B: Union[float, np.ndarray[Any, np.dtype[np.float64]]] | None = None,
        Q: np.ndarray[Any, np.dtype[np.float64]] | None = None,
        R: np.ndarray[Any, np.dtype[np.float64]] | None = None,
        P: np.ndarray[Any, np.dtype[np.float64]] | None = None,
        x0: np.ndarray[Any, np.dtype[np.float64]] | None = None,
        source: Handler[Any, float] | None = None,
    ) -> None:
        """Initialize the Kalman filter handler with all necessary matrices.

        :param F: State transition matrix
        :param H: Measurement matrix
        :param B: Control input matrix, defaults to None
        :param Q: Process noise covariance matrix, defaults to None
        :param R: Measurement noise covariance matrix, defaults to None
        :param P: Initial state covariance matrix, defaults to None
        :param x0: Initial state vector, defaults to None
        :param source: The handler providing input data, defaults to None
        """
        if F is None or H is None:
            raise ValueError("Set proper system dynamics.")

        self.n: int = F.shape[1]  # state dimension
        self.m: int = H.shape[0]  # measurement dimension (H is m x n)

        self.F: np.ndarray[Any, np.dtype[np.float64]] = F
        self.H: np.ndarray[Any, np.dtype[np.float64]] = H
        self.B: Union[float, np.ndarray[Any, np.dtype[np.float64]]] = 0 if B is None else B

        self.Q: np.ndarray[Any, np.dtype[np.float64]] = np.eye(self.n) if Q is None else Q
        # R must be m x m so that it can be added to H P H^T in update()
        self.R: np.ndarray[Any, np.dtype[np.float64]] = np.eye(self.m) if R is None else R
        self.P: np.ndarray[Any, np.dtype[np.float64]] = np.eye(self.n) if P is None else P

        self.x: np.ndarray[Any, np.dtype[np.float64]] = np.zeros((self.n, 1)) if x0 is None else x0

        super().__init__(filter_func=self._apply_kalman_filter, filter_config=None, source=source)

    def predict(
        self, u: Union[float, np.ndarray[Any, np.dtype[np.float64]]] = 0
    ) -> np.ndarray[Any, np.dtype[np.float64]]:
        """Predict the next state based on the model.

        :param u: Control input, defaults to 0
        :return: Predicted state vector
        """
        self.x = np.dot(self.F, self.x) + np.dot(self.B, u)

        self.P = np.dot(np.dot(self.F, self.P), self.F.T) + self.Q

        return self.x

    def update(self, z: float) -> None:
        """Update the state estimate based on the measurement.

        :param z: Measurement
        """
        y: np.ndarray[Any, np.dtype[np.float64]] = z - np.dot(self.H, self.x)

        S: np.ndarray[Any, np.dtype[np.float64]] = self.R + np.dot(self.H, np.dot(self.P, self.H.T))

        K: np.ndarray[Any, np.dtype[np.float64]] = np.dot(np.dot(self.P, self.H.T), np.linalg.inv(S))

        self.x = self.x + np.dot(K, y)

        I_matrix: np.ndarray[Any, np.dtype[np.float64]] = np.eye(self.n)
        self.P = np.dot(np.dot(I_matrix - np.dot(K, self.H), self.P), (I_matrix - np.dot(K, self.H)).T) + np.dot(
            np.dot(K, self.R), K.T
        )

    def _apply_kalman_filter(self, window: ScrubberWindow[float], _: Any) -> float:
        """Run one predict/update cycle of the Kalman filter on the latest point in the window.

        :param window: Window of historical data
        :param _: Unused configuration parameter
        :return: One-step-ahead predicted measurement, computed before the latest point is incorporated
        """
        if not window:
            return 0.0

        measurement: float = window[-1]

        prediction_array = np.dot(self.H, self.predict())
        prediction: float = float(prediction_array.item())

        self.update(measurement)

        return prediction
__init__
__init__(
    F: ndarray[Any, dtype[float64]],
    H: ndarray[Any, dtype[float64]],
    B: Union[float, ndarray[Any, dtype[float64]]]
    | None = None,
    Q: ndarray[Any, dtype[float64]] | None = None,
    R: ndarray[Any, dtype[float64]] | None = None,
    P: ndarray[Any, dtype[float64]] | None = None,
    x0: ndarray[Any, dtype[float64]] | None = None,
    source: Handler[Any, float] | None = None,
) -> None

Initialize the Kalman filter handler with all necessary matrices.

Parameters:

Name Type Description Default
F ndarray[Any, dtype[float64]]

State transition matrix

required
H ndarray[Any, dtype[float64]]

Measurement matrix

required
B Union[float, ndarray[Any, dtype[float64]]] | None

Control input matrix, defaults to None

None
Q ndarray[Any, dtype[float64]] | None

Process noise covariance matrix, defaults to None

None
R ndarray[Any, dtype[float64]] | None

Measurement noise covariance matrix, defaults to None

None
P ndarray[Any, dtype[float64]] | None

Initial state covariance matrix, defaults to None

None
x0 ndarray[Any, dtype[float64]] | None

Initial state vector, defaults to None

None
source Handler[Any, float] | None

The handler providing input data, defaults to None

None
Source code in pysatl_tsp/implementations/processor/kalman_filter_handler.py
def __init__(
    self,
    F: np.ndarray[Any, np.dtype[np.float64]],
    H: np.ndarray[Any, np.dtype[np.float64]],
    B: Union[float, np.ndarray[Any, np.dtype[np.float64]]] | None = None,
    Q: np.ndarray[Any, np.dtype[np.float64]] | None = None,
    R: np.ndarray[Any, np.dtype[np.float64]] | None = None,
    P: np.ndarray[Any, np.dtype[np.float64]] | None = None,
    x0: np.ndarray[Any, np.dtype[np.float64]] | None = None,
    source: Handler[Any, float] | None = None,
) -> None:
    """Initialize the Kalman filter handler with all necessary matrices.

    :param F: State transition matrix
    :param H: Measurement matrix
    :param B: Control input matrix, defaults to None
    :param Q: Process noise covariance matrix, defaults to None
    :param R: Measurement noise covariance matrix, defaults to None
    :param P: Initial state covariance matrix, defaults to None
    :param x0: Initial state vector, defaults to None
    :param source: The handler providing input data, defaults to None
    """
    if F is None or H is None:
        raise ValueError("Set proper system dynamics.")

    self.n: int = F.shape[1]
    self.m: int = H.shape[1]

    self.F: np.ndarray[Any, np.dtype[np.float64]] = F
    self.H: np.ndarray[Any, np.dtype[np.float64]] = H
    self.B: Union[float, np.ndarray[Any, np.dtype[np.float64]]] = 0 if B is None else B

    self.Q: np.ndarray[Any, np.dtype[np.float64]] = np.eye(self.n) if Q is None else Q
    self.R: np.ndarray[Any, np.dtype[np.float64]] = np.eye(self.n) if R is None else R
    self.P: np.ndarray[Any, np.dtype[np.float64]] = np.eye(self.n) if P is None else P

    self.x: np.ndarray[Any, np.dtype[np.float64]] = np.zeros((self.n, 1)) if x0 is None else x0

    super().__init__(filter_func=self._apply_kalman_filter, filter_config=None, source=source)
predict
predict(
    u: Union[float, ndarray[Any, dtype[float64]]] = 0,
) -> np.ndarray[Any, np.dtype[np.float64]]

Predict the next state based on the model.

Parameters:

Name Type Description Default
u Union[float, ndarray[Any, dtype[float64]]]

Control input, defaults to 0

0

Returns:

Type Description
ndarray[Any, dtype[float64]]

Predicted state vector

Source code in pysatl_tsp/implementations/processor/kalman_filter_handler.py
def predict(
    self, u: Union[float, np.ndarray[Any, np.dtype[np.float64]]] = 0
) -> np.ndarray[Any, np.dtype[np.float64]]:
    """Predict the next state based on the model.

    :param u: Control input, defaults to 0
    :return: Predicted state vector
    """
    self.x = np.dot(self.F, self.x) + np.dot(self.B, u)

    self.P = np.dot(np.dot(self.F, self.P), self.F.T) + self.Q

    return self.x
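In matrix notation, the two assignments in `predict` correspond to the standard Kalman time-update equations. The hat and subscript notation below is added here for readability and is not part of the source; $F$, $B$, $Q$, $P$ and $u$ stand for the attributes and argument used in the code above.

```latex
\begin{aligned}
\hat{x}_{k \mid k-1} &= F\,\hat{x}_{k-1 \mid k-1} + B\,u_k \\
P_{k \mid k-1} &= F\,P_{k-1 \mid k-1}\,F^{\top} + Q
\end{aligned}
```

With the default `B = 0` and `u = 0`, the control term vanishes and the state is propagated by $F$ alone.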
update
update(z: float) -> None

Update the state estimate based on the measurement.

Parameters:

Name Type Description Default
z float

Measurement

required
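Written out, the measurement update computes the innovation, its covariance and the Kalman gain, and then corrects the state and covariance. The covariance line is the Joseph form, which is what the final statement of `update` evaluates. The subscripts are presentation choices made here, not names from the source.

```latex
\begin{aligned}
y_k &= z_k - H\,\hat{x}_{k \mid k-1} \\
S_k &= R + H\,P_{k \mid k-1}\,H^{\top} \\
K_k &= P_{k \mid k-1}\,H^{\top} S_k^{-1} \\
\hat{x}_{k \mid k} &= \hat{x}_{k \mid k-1} + K_k\,y_k \\
P_{k \mid k} &= (I - K_k H)\,P_{k \mid k-1}\,(I - K_k H)^{\top} + K_k\,R\,K_k^{\top}
\end{aligned}
```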
Source code in pysatl_tsp/implementations/processor/kalman_filter_handler.py
def update(self, z: float) -> None:
    """Update the state estimate based on the measurement.

    :param z: Measurement
    """
    y: np.ndarray[Any, np.dtype[np.float64]] = z - np.dot(self.H, self.x)

    S: np.ndarray[Any, np.dtype[np.float64]] = self.R + np.dot(self.H, np.dot(self.P, self.H.T))

    K: np.ndarray[Any, np.dtype[np.float64]] = np.dot(np.dot(self.P, self.H.T), np.linalg.inv(S))

    self.x = self.x + np.dot(K, y)

    I_matrix: np.ndarray[Any, np.dtype[np.float64]] = np.eye(self.n)
    self.P = np.dot(np.dot(I_matrix - np.dot(K, self.H), self.P), (I_matrix - np.dot(K, self.H)).T) + np.dot(
        np.dot(K, self.R), K.T
    )
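To make the predict/update cycle concrete, here is a minimal scalar sketch in plain Python. It is not the handler itself: it assumes a one-dimensional state, no control input, and illustrative defaults of 1.0 for every model parameter. The function name `kalman_step` is hypothetical. Like `_apply_kalman_filter`, it reports the predicted measurement computed before the new observation is incorporated.

```python
def kalman_step(x, p, z, f=1.0, h=1.0, q=1.0, r=1.0):
    """One predict/update cycle of a scalar (1-D) Kalman filter.

    Returns (prediction, x_new, p_new), where prediction is h * x_prior,
    i.e. the predicted measurement before z is taken into account.
    """
    # Predict: propagate the state and its variance through the model.
    x_prior = f * x
    p_prior = f * p * f + q

    # Update: innovation, innovation variance, Kalman gain.
    y = z - h * x_prior
    s = r + h * p_prior * h
    k = p_prior * h / s

    x_new = x_prior + k * y
    # Joseph form of the variance update.
    p_new = (1 - k * h) * p_prior * (1 - k * h) + k * r * k
    return h * x_prior, x_new, p_new


# Feed a constant measurement and collect the one-step-ahead predictions.
x, p = 0.0, 1.0
predictions = []
for z in [1.0, 1.0, 1.0]:
    prediction, x, p = kalman_step(x, p, z)
    predictions.append(prediction)

print(predictions)
```

The first prediction is the initial state, and later predictions move toward the measured value as the gain pulls the estimate forward.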
TimeSeriesCrossValidator

Bases: Handler[T, tuple[ScrubberWindow[T], ScrubberWindow[T]]]

A handler that implements expanding window cross-validation for time series data.

This handler produces a sequence of train-validation splits suitable for time series validation, where each split preserves the temporal order of data. It implements an expanding window approach, where the training set grows over time while the validation set has a fixed size and slides forward.

The handler ensures that:

1. The training set always has at least min_train_size points
2. The validation set always has exactly val_size points
3. The validation set always follows the training set temporally
4. Each new split adds val_size points to the training set

This approach respects the temporal nature of time series data and prevents data leakage from future to past.
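The expanding-window scheme described above can be sketched with index ranges alone. This is an illustration in plain Python, not the handler's implementation; `expanding_window_splits` and its `n_points` argument are hypothetical names introduced for the example.

```python
def expanding_window_splits(n_points, min_train_size, val_size):
    """Yield (train_range, val_range) pairs for an expanding-window scheme."""
    train_end = min_train_size
    # Stop as soon as a full validation block no longer fits.
    while train_end + val_size <= n_points:
        yield range(0, train_end), range(train_end, train_end + val_size)
        # The validation block just used becomes part of the next training set.
        train_end += val_size


splits = list(expanding_window_splits(n_points=100, min_train_size=50, val_size=10))
for number, (train, val) in enumerate(splits, start=1):
    print(f"Split {number}: train {len(train)} points, validation {val[0]}..{val[-1]}")
```

Every validation range starts exactly where its training range ends, so no future point leaks into training.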

Parameters:

Name Type Description Default
min_train_size int

Minimum number of points in the initial training set

required
val_size int

Number of points in each validation set

required
source Optional[Handler[Any, T]]

The handler providing input data, defaults to None

None

Example:

import numpy as np
import matplotlib.pyplot as plt

# Generate a synthetic time series
np.random.seed(42)
ts = np.cumsum(np.random.normal(0, 1, 100))  # Random walk
data_source = SimpleDataProvider(ts)

# Create a cross-validator with min_train_size=50 and val_size=10
cv = TimeSeriesCrossValidator(min_train_size=50, val_size=10, source=data_source)

# Visualize the different train-validation splits
plt.figure(figsize=(14, 8))
x = np.arange(len(ts))
plt.plot(x, ts, "k-", alpha=0.3, label="Full time series")

for i, (train, val) in enumerate(cv):
    train_indices = list(train.indices)
    val_indices = list(val.indices)

    # Plot each split
    plt.plot(train_indices, [ts[i] for i in train_indices], "b-", linewidth=2, alpha=0.7 - i * 0.1)
    plt.plot(val_indices, [ts[i] for i in val_indices], "r-", linewidth=2, alpha=0.7 - i * 0.1)

    # Add markers at the split point
    split_idx = train_indices[-1]
    plt.axvline(x=split_idx, color="g", linestyle="--", alpha=0.5)

    # Print information about this split
    print(f"Split {i + 1}:")
    print(f"  Train: {len(train)} points (indices {train_indices[0]}..{train_indices[-1]})")
    print(f"  Validation: {len(val)} points (indices {val_indices[0]}..{val_indices[-1]})")

plt.title("Time Series Cross-Validation: Expanding Window Approach")
plt.xlabel("Time")
plt.ylabel("Value")

# Add custom legend
from matplotlib.lines import Line2D

custom_lines = [
    Line2D([0], [0], color="k", alpha=0.3),
    Line2D([0], [0], color="b", linewidth=2),
    Line2D([0], [0], color="r", linewidth=2),
    Line2D([0], [0], color="g", linestyle="--"),
]
plt.legend(
    custom_lines, ["Full time series", "Training sets", "Validation sets", "Split points"], loc="upper left"
)

plt.grid(True, alpha=0.3)
plt.show()

# Example model evaluation with each split
from sklearn.linear_model import LinearRegression

for i, (train, val) in enumerate(cv):
    # Prepare data
    train_indices = list(train.indices)
    train_X = np.array(train_indices).reshape(-1, 1)
    train_y = np.array(list(train.values))

    val_indices = list(val.indices)
    val_X = np.array(val_indices).reshape(-1, 1)
    val_y = np.array(list(val.values))

    # Train a simple model
    model = LinearRegression()
    model.fit(train_X, train_y)

    # Evaluate on validation set
    val_pred = model.predict(val_X)
    mse = np.mean((val_pred - val_y) ** 2)

    print(f"Split {i + 1} - Validation MSE: {mse:.4f}")

Source code in pysatl_tsp/implementations/processor/time_series_cross_validator.py
class TimeSeriesCrossValidator(Handler[T, tuple[ScrubberWindow[T], ScrubberWindow[T]]]):
    """A handler that implements expanding window cross-validation for time series data.

    This handler produces a sequence of train-validation splits suitable for time series
    validation, where each split preserves the temporal order of data. It implements an
    expanding window approach, where the training set grows over time while the validation
    set has a fixed size and slides forward.

    The handler ensures that:
    1. The training set always has at least `min_train_size` points
    2. The validation set always has exactly `val_size` points
    3. The validation set always follows the training set temporally
    4. Each new split adds `val_size` points to the training set

    This approach respects the temporal nature of time series data and prevents
    data leakage from future to past.

    :param min_train_size: Minimum number of points in the initial training set
    :param val_size: Number of points in each validation set
    :param source: The handler providing input data, defaults to None

    Example:
        ```python
        import numpy as np
        import matplotlib.pyplot as plt

        # Generate a synthetic time series
        np.random.seed(42)
        ts = np.cumsum(np.random.normal(0, 1, 100))  # Random walk
        data_source = SimpleDataProvider(ts)

        # Create a cross-validator with min_train_size=50 and val_size=10
        cv = TimeSeriesCrossValidator(min_train_size=50, val_size=10, source=data_source)

        # Visualize the different train-validation splits
        plt.figure(figsize=(14, 8))
        x = np.arange(len(ts))
        plt.plot(x, ts, "k-", alpha=0.3, label="Full time series")

        for i, (train, val) in enumerate(cv):
            train_indices = list(train.indices)
            val_indices = list(val.indices)

            # Plot each split
            plt.plot(train_indices, [ts[i] for i in train_indices], "b-", linewidth=2, alpha=0.7 - i * 0.1)
            plt.plot(val_indices, [ts[i] for i in val_indices], "r-", linewidth=2, alpha=0.7 - i * 0.1)

            # Add markers at the split point
            split_idx = train_indices[-1]
            plt.axvline(x=split_idx, color="g", linestyle="--", alpha=0.5)

            # Print information about this split
            print(f"Split {i + 1}:")
            print(f"  Train: {len(train)} points (indices {train_indices[0]}..{train_indices[-1]})")
            print(f"  Validation: {len(val)} points (indices {val_indices[0]}..{val_indices[-1]})")

        plt.title("Time Series Cross-Validation: Expanding Window Approach")
        plt.xlabel("Time")
        plt.ylabel("Value")

        # Add custom legend
        from matplotlib.lines import Line2D

        custom_lines = [
            Line2D([0], [0], color="k", alpha=0.3),
            Line2D([0], [0], color="b", linewidth=2),
            Line2D([0], [0], color="r", linewidth=2),
            Line2D([0], [0], color="g", linestyle="--"),
        ]
        plt.legend(
            custom_lines, ["Full time series", "Training sets", "Validation sets", "Split points"], loc="upper left"
        )

        plt.grid(True, alpha=0.3)
        plt.show()

        # Example model evaluation with each split
        from sklearn.linear_model import LinearRegression

        for i, (train, val) in enumerate(cv):
            # Prepare data
            train_indices = list(train.indices)
            train_X = np.array(train_indices).reshape(-1, 1)
            train_y = np.array(list(train.values))

            val_indices = list(val.indices)
            val_X = np.array(val_indices).reshape(-1, 1)
            val_y = np.array(list(val.values))

            # Train a simple model
            model = LinearRegression()
            model.fit(train_X, train_y)

            # Evaluate on validation set
            val_pred = model.predict(val_X)
            mse = np.mean((val_pred - val_y) ** 2)

            print(f"Split {i + 1} - Validation MSE: {mse:.4f}")
        ```
    """

    def __init__(self, min_train_size: int, val_size: int, source: Optional[Handler[Any, T]] = None):
        """Initialize a time series cross-validator.

        :param min_train_size: Minimum number of points in the initial training set
        :param val_size: Number of points in each validation set
        :param source: The handler providing input data, defaults to None
        """
        super().__init__(source)
        self.min_train_size = min_train_size
        self.val_size = val_size

    def __iter__(self) -> Iterator[tuple[ScrubberWindow[T], ScrubberWindow[T]]]:
        """Create an iterator that yields train-validation splits for time series cross-validation.

        This method creates splits where:
        1. The first split has exactly min_train_size points for training
        2. Each subsequent split adds val_size points to the training set
        3. Each validation set has exactly val_size points and follows the training set

        :return: Iterator yielding tuples of (training_window, validation_window)
        :raises ValueError: If no source has been set
        """
        if self.source is None:
            raise ValueError("Source is not set")

        scrubber = SlidingScrubber(
            lambda buffer: len(buffer) > self.min_train_size
            and (len(buffer) - self.min_train_size) % self.val_size == 0,
            shift=0,
            source=self.source,
        )
        handler: MappingHandler[ScrubberWindow[T], tuple[ScrubberWindow[T], ScrubberWindow[T]]] = MappingHandler(
            map_func=lambda window: (window[: -self.val_size], window[-self.val_size :])
        )

        yield from (scrubber | handler)
__init__
__init__(
    min_train_size: int,
    val_size: int,
    source: Optional[Handler[Any, T]] = None,
)

Initialize a time series cross-validator.

Parameters:

Name Type Description Default
min_train_size int

Minimum number of points in the initial training set

required
val_size int

Number of points in each validation set

required
source Optional[Handler[Any, T]]

The handler providing input data, defaults to None

None
Source code in pysatl_tsp/implementations/processor/time_series_cross_validator.py
def __init__(self, min_train_size: int, val_size: int, source: Optional[Handler[Any, T]] = None):
    """Initialize a time series cross-validator.

    :param min_train_size: Minimum number of points in the initial training set
    :param val_size: Number of points in each validation set
    :param source: The handler providing input data, defaults to None
    """
    super().__init__(source)
    self.min_train_size = min_train_size
    self.val_size = val_size
__iter__
__iter__() -> Iterator[
    tuple[ScrubberWindow[T], ScrubberWindow[T]]
]

Create an iterator that yields train-validation splits for time series cross-validation.

This method creates splits where:

1. The first split has exactly min_train_size points for training
2. Each subsequent split adds val_size points to the training set
3. Each validation set has exactly val_size points and follows the training set

Returns:

Type Description
Iterator[tuple[ScrubberWindow[T], ScrubberWindow[T]]]

Iterator yielding tuples of (training_window, validation_window)

Raises:

Type Description
ValueError

If no source has been set

Source code in pysatl_tsp/implementations/processor/time_series_cross_validator.py
def __iter__(self) -> Iterator[tuple[ScrubberWindow[T], ScrubberWindow[T]]]:
    """Create an iterator that yields train-validation splits for time series cross-validation.

    This method creates splits where:
    1. The first split has exactly min_train_size points for training
    2. Each subsequent split adds val_size points to the training set
    3. Each validation set has exactly val_size points and follows the training set

    :return: Iterator yielding tuples of (training_window, validation_window)
    :raises ValueError: If no source has been set
    """
    if self.source is None:
        raise ValueError("Source is not set")

    scrubber = SlidingScrubber(
        lambda buffer: len(buffer) > self.min_train_size
        and (len(buffer) - self.min_train_size) % self.val_size == 0,
        shift=0,
        source=self.source,
    )
    handler: MappingHandler[ScrubberWindow[T], tuple[ScrubberWindow[T], ScrubberWindow[T]]] = MappingHandler(
        map_func=lambda window: (window[: -self.val_size], window[-self.val_size :])
    )

    yield from (scrubber | handler)
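The two lambdas in `__iter__` can be tried in isolation on a plain list. The sketch below assumes that, with `shift=0`, the scrubber's buffer keeps every point seen so far; the internals of `SlidingScrubber` are not shown in this section, so treat that as an assumption. `should_emit` and `split` are hypothetical stand-ins for the predicate and the mapping function.

```python
def should_emit(buffer_len, min_train_size, val_size):
    """Mirror of the scrubber predicate: fire once per completed validation block."""
    return buffer_len > min_train_size and (buffer_len - min_train_size) % val_size == 0


def split(window, val_size):
    """Mirror of the mapping function: the last val_size points form the validation set."""
    return window[:-val_size], window[-val_size:]


min_train_size, val_size = 6, 3
buffer = []
emitted = []
for point in range(12):
    buffer.append(point)  # assumption: the buffer only grows (shift=0)
    if should_emit(len(buffer), min_train_size, val_size):
        emitted.append(split(list(buffer), val_size))

for train, val in emitted:
    print(train, val)
```

The predicate first fires when the buffer holds `min_train_size + val_size` points, which is why the first training set has exactly `min_train_size` points.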
dema_handler
DEMAHandler

Bases: Handler[float | None, float | None]

A handler that calculates the Double Exponential Moving Average (DEMA).

The Double Exponential Moving Average (DEMA) is designed to reduce the lag associated with traditional moving averages. It puts more weight on recent data by using the formula: DEMA = 2 * EMA - EMA of EMA.

This implementation automatically configures a pipeline of EMA handlers to calculate both the primary EMA and the EMA of that EMA to produce DEMA values.
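As a rough illustration of the formula, the following plain-Python sketch computes DEMA from two chained EMAs. It is a simplification, not the handler's pipeline: each EMA is seeded with its first input rather than with an SMA, so it produces no leading `None` values and its warm-up differs from `DEMAHandler`. The function names `ema` and `dema` are hypothetical.

```python
def ema(values, length):
    """Recursive EMA seeded with the first value (assumption: no SMA warm-up)."""
    alpha = 2 / (length + 1)
    out = []
    current = None
    for v in values:
        current = v if current is None else (1 - alpha) * current + alpha * v
        out.append(current)
    return out


def dema(values, length):
    """DEMA = 2 * EMA - EMA of EMA, computed element by element."""
    first = ema(values, length)
    second = ema(first, length)
    return [2 * e - ee for e, ee in zip(first, second)]


data = [1.0, 2.0, 3.0, 4.0, 5.0]
print(ema(data, length=3))   # [1.0, 1.5, 2.25, 3.125, 4.0625]
print(dema(data, length=3))  # [1.0, 1.75, 2.75, 3.8125, 4.875]
```

On this rising input the DEMA values stay closer to the latest data point than the plain EMA does, which is the lag reduction the formula is meant to provide.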

Parameters:

Name Type Description Default
length int

The period for EMA calculations, defaults to 10

10
source Handler[Any, float | None] | None

Input data source, defaults to None

None

Example:

# Create a data source with numeric values
data_source = SimpleDataProvider([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0])

# Create a DEMA handler with length of 3
dema_handler = DEMAHandler(length=3)
dema_handler.set_source(data_source)

# Process the data
for value in dema_handler:
    print(value)

# The first few values will be None as the EMA needs to be established
# Then the DEMA values will be calculated using 2 * EMA - EMA of EMA

Source code in pysatl_tsp/implementations/processor/dema_handler.py
class DEMAHandler(Handler[float | None, float | None]):
    """A handler that calculates the Double Exponential Moving Average (DEMA).

    The Double Exponential Moving Average (DEMA) is designed to reduce the lag
    associated with traditional moving averages. It puts more weight on recent data
    by using the formula: DEMA = 2 * EMA - EMA of EMA.

    This implementation automatically configures a pipeline of EMA handlers to
    calculate both the primary EMA and the EMA of that EMA to produce DEMA values.

    :param length: The period for EMA calculations, defaults to 10
    :param source: Input data source, defaults to None

    Example:
        ```python
        # Create a data source with numeric values
        data_source = SimpleDataProvider([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0])

        # Create a DEMA handler with length of 3
        dema_handler = DEMAHandler(length=3)
        dema_handler.set_source(data_source)

        # Process the data
        for value in dema_handler:
            print(value)

        # The first few values will be None as the EMA needs to be established
        # Then the DEMA values will be calculated using 2 * EMA - EMA of EMA
        ```
    """

    def __init__(self, length: int = 10, source: Handler[Any, float | None] | None = None):
        """Initialize a DEMA handler.

        :param length: The period for EMA calculations, defaults to 10
        :param source: Input data source, defaults to None
        """
        self.length = length
        super().__init__(source)

    @staticmethod
    def _combine(ema: float | None, ema_of_ema: float | None) -> float | None:
        """Combine EMA and EMA-of-EMA to produce DEMA.

        Applies the formula: DEMA = 2 * EMA - EMA of EMA.
        If either input is None, returns None.

        :param ema: The primary EMA value
        :param ema_of_ema: The EMA of the EMA value
        :return: The calculated DEMA value or None if inputs are None
        """
        if ema is None or ema_of_ema is None:
            return None
        return 2 * ema - ema_of_ema

    def __iter__(self) -> Iterator[float | None]:
        """Create an iterator that yields DEMA values.

        This method constructs a pipeline that:
        1. Takes values from the source
        2. Calculates the primary EMA
        3. Calculates the EMA of the EMA
        4. Combines them using the DEMA formula

        :return: Iterator yielding DEMA values
        :raises ValueError: If no source has been set
        """
        if not self.source:
            raise ValueError("Source is not set")

        yield from (
            self.source | EMAHandler(length=self.length) | TeeHandler(EMAHandler(length=self.length), self._combine)
        )
__init__
__init__(
    length: int = 10,
    source: Handler[Any, float | None] | None = None,
)

Initialize a DEMA handler.

Parameters:

Name Type Description Default
length int

The period for EMA calculations, defaults to 10

10
source Handler[Any, float | None] | None

Input data source, defaults to None

None
Source code in pysatl_tsp/implementations/processor/dema_handler.py
def __init__(self, length: int = 10, source: Handler[Any, float | None] | None = None):
    """Initialize a DEMA handler.

    :param length: The period for EMA calculations, defaults to 10
    :param source: Input data source, defaults to None
    """
    self.length = length
    super().__init__(source)
__iter__
__iter__() -> Iterator[float | None]

Create an iterator that yields DEMA values.

This method constructs a pipeline that:

1. Takes values from the source
2. Calculates the primary EMA
3. Calculates the EMA of the EMA
4. Combines them using the DEMA formula

Returns:

Type Description
Iterator[float | None]

Iterator yielding DEMA values

Raises:

Type Description
ValueError

If no source has been set

Source code in pysatl_tsp/implementations/processor/dema_handler.py
def __iter__(self) -> Iterator[float | None]:
    """Create an iterator that yields DEMA values.

    This method constructs a pipeline that:
    1. Takes values from the source
    2. Calculates the primary EMA
    3. Calculates the EMA of the EMA
    4. Combines them using the DEMA formula

    :return: Iterator yielding DEMA values
    :raises ValueError: If no source has been set
    """
    if not self.source:
        raise ValueError("Source is not set")

    yield from (
        self.source | EMAHandler(length=self.length) | TeeHandler(EMAHandler(length=self.length), self._combine)
    )
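The pipeline above relies on a "tee and combine" step: each primary EMA value is paired with the output of a second handler fed by the same stream. `TeeHandler` itself is not documented in this section, so the generator below is only a guess at that pattern, written with `itertools.tee`. The names `tee_combine` and `lagged` are hypothetical, and `lagged` is a toy stand-in for the second EMA.

```python
from itertools import tee


def tee_combine(stream, branch, combine):
    """Feed `stream` both directly and through `branch`, then merge pairwise."""
    direct, side = tee(stream)
    for original, processed in zip(direct, branch(side)):
        yield combine(original, processed)


def combine_dema(ema, ema_of_ema):
    """Same rule as DEMAHandler._combine: None in, None out."""
    if ema is None or ema_of_ema is None:
        return None
    return 2 * ema - ema_of_ema


def lagged(values):
    """Toy branch: emit None first, then the previous value."""
    previous = None
    for v in values:
        yield previous
        previous = v


result = list(tee_combine(iter([None, 2.0, 3.0, 5.0]), lagged, combine_dema))
print(result)  # [None, None, 4.0, 7.0]
```

The `None` check in the combiner means the output stays `None` until both branches have produced a value.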
ema_handler
EMAHandler

Bases: InductiveHandler[float | None, float | None]

Exponential Moving Average (EMA) handler.

Calculates EMA values for a sequence of input values, matching the behavior of the pandas_ta.EMA implementation.
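A minimal sketch of the default behavior (`sma=True`, `adjust=False`) in plain Python: the first `length - 1` outputs are `None`, the next one is the simple average of the first `length` inputs, and later outputs follow the recursive EMA formula with `alpha = 2 / (length + 1)`. It assumes the input contains no `None` values. `ema_sma_seeded` is a hypothetical name, not part of the package.

```python
def ema_sma_seeded(values, length):
    """EMA seeded with the SMA of the first `length` values."""
    alpha = 2 / (length + 1)
    out = []
    current = None
    for position, v in enumerate(values, start=1):
        if position < length:
            out.append(None)  # not enough data yet
            continue
        if position == length:
            current = sum(values[:length]) / length  # SMA seed
        else:
            current = (1 - alpha) * current + alpha * v
        out.append(current)
    return out


result = ema_sma_seeded([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0], length=5)
print(result)
```

For this input the seed is 3.0 (the mean of 1 through 5), and each later value blends the previous EMA with the new input using weight `alpha = 1/3`.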

Parameters:

Name Type Description Default
length int

The period for EMA calculation, defaults to 10

10
adjust bool

Whether to use adjusted weights in calculation, defaults to False

False
sma bool

Whether to use SMA for initial value, defaults to True

True
alpha float | None

Custom smoothing factor, defaults to 2/(length+1) if None

None
source Handler[Any, float | None] | None

Input data source, defaults to None

None

Example:

# Create a data source with numeric values
data_source = SimpleDataProvider([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0])

# Create an EMA handler with length of 5
ema_handler = EMAHandler(length=5)
ema_handler.set_source(data_source)

# Process the data
for value in ema_handler:
    print(value)

# The first 4 values will be None since we're using SMA initialization
# The 5th value will be the SMA of the first 5 values
# Subsequent values will be EMA values based on the formula

Source code in pysatl_tsp/implementations/processor/ema_handler.py
class EMAHandler(InductiveHandler[float | None, float | None]):
    """Exponential Moving Average (EMA) handler.

    Calculates EMA values for a sequence of input values, matching the functionality
    of pandas_ta.EMA implementation.

    :param length: The period for EMA calculation, defaults to 10
    :param adjust: Whether to use adjusted weights in calculation, defaults to False
    :param sma: Whether to use SMA for initial value, defaults to True
    :param alpha: Custom smoothing factor, defaults to 2/(length+1) if None
    :param source: Input data source, defaults to None

    Example:
        ```python
        # Create a data source with numeric values
        data_source = SimpleDataProvider([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0])

        # Create an EMA handler with length of 5
        ema_handler = EMAHandler(length=5)
        ema_handler.set_source(data_source)

        # Process the data
        for value in ema_handler:
            print(value)

        # The first 4 values will be None since we're using SMA initialization
        # The 5th value will be the SMA of the first 5 values
        # Subsequent values will be EMA values based on the formula
        ```
    """

    def __init__(
        self,
        length: int = 10,
        adjust: bool = False,
        sma: bool = True,
        alpha: float | None = None,
        source: Handler[Any, float | None] | None = None,
    ):
        """Initialize EMA handler with specified parameters.

        :param length: The period for EMA calculation, defaults to 10
        :param adjust: Whether to use adjusted weights in calculation, defaults to False
        :param sma: Whether to use SMA for initial value, defaults to True
        :param alpha: Custom smoothing factor, defaults to 2/(length+1) if None
        :param source: Input data source, defaults to None
        """
        super().__init__(source)
        self.length = length
        self.adjust = adjust
        self.sma = sma
        if alpha is None:
            self.alpha = 2 / (self.length + 1)
        else:
            self.alpha = alpha

    def _initialize_state(self) -> dict[str, Any]:
        """Initialize state for EMA calculation.

        Creates the initial state dictionary with a window to collect values,
        variables to track the EMA calculation, and a position counter.

        :return: Dictionary containing initial state variables
        """
        return {
            "window": ScrubberWindow(),
            "ema_numerator": None if self.sma else 0,
            "ema_denominator": None if self.sma else 0,
            "position": 0,
        }

    def _update_state(self, state: dict[str, Any], value: float | None) -> dict[str, Any]:
        """Update state with a new value.

        Implements the same logic as the original pandas_ta.EMA function:
        - If sma=True, initialize EMA with the SMA of first 'length' values
        - If sma=False, initialize EMA with the first value
        - Then apply the standard EMA formula for subsequent values

        When adjust=True, uses an adjusted weighting method that gives
        more weight to recent observations.

        :param state: Current state dictionary
        :param value: New value to incorporate into the EMA calculation
        :return: Updated state dictionary
        """
        state["position"] += 1
        if value is not None:
            state["window"].append(value)

        if self.sma and state["position"] != self.length:
            return state

        if self.sma:
            if len(state["window"]):
                sma_value = sum(state["window"].values) / len(state["window"])
                state["window"].clear()
                state["ema_numerator"] = sma_value
                state["ema_denominator"] = 1
            self.sma = False
        elif self.adjust and value is not None:
            state["ema_numerator"] = (1 - self.alpha) * state["ema_numerator"] + value
            state["ema_denominator"] = (1 - self.alpha) * state["ema_denominator"] + 1
        elif not self.adjust and value is not None:
            if state["ema_denominator"]:
                state["ema_numerator"] = (1 - self.alpha) * state["ema_numerator"] + self.alpha * value
            else:
                state["ema_numerator"] = value
            state["ema_denominator"] = 1

        return state

    def _compute_result(self, state: dict[str, Any]) -> float | None:
        """Return the current EMA value or None if not yet initialized.

        Calculates the final EMA value by dividing the numerator by the denominator
        if the denominator exists (indicating that EMA is initialized).

        :param state: Current state of the handler
        :return: Current EMA value or None if not yet calculated
        """
        if state["ema_denominator"]:
            res: float = state["ema_numerator"] / state["ema_denominator"]
            return res
        return None
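The `adjust=True` branch keeps a running numerator and denominator instead of a single EMA value. A small sketch of that recursion, assuming `sma=False` so both accumulators start at zero and assuming no `None` inputs (`ema_adjusted` is a hypothetical name):

```python
def ema_adjusted(values, alpha):
    """Adjusted EMA: a weighted mean with weights (1 - alpha) ** age."""
    numerator, denominator = 0.0, 0.0
    out = []
    for v in values:
        # Decay the accumulated sums, then add the new value with weight 1.
        numerator = (1 - alpha) * numerator + v
        denominator = (1 - alpha) * denominator + 1
        out.append(numerator / denominator)
    return out


result = ema_adjusted([1.0, 2.0, 3.0], alpha=0.5)
print(result)
```

After three inputs the weights are 0.25, 0.5 and 1 for the oldest to newest value, so the last output equals (0.25 * 1 + 0.5 * 2 + 1 * 3) / 1.75.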
__init__
__init__(
    length: int = 10,
    adjust: bool = False,
    sma: bool = True,
    alpha: float | None = None,
    source: Handler[Any, float | None] | None = None,
)

Initialize EMA handler with specified parameters.

Parameters:

Name Type Description Default
length int

The period for EMA calculation, defaults to 10

10
adjust bool

Whether to use adjusted weights in calculation, defaults to False

False
sma bool

Whether to use SMA for initial value, defaults to True

True
alpha float | None

Custom smoothing factor, defaults to 2/(length+1) if None

None
source Handler[Any, float | None] | None

Input data source, defaults to None

None
Source code in pysatl_tsp/implementations/processor/ema_handler.py
def __init__(
    self,
    length: int = 10,
    adjust: bool = False,
    sma: bool = True,
    alpha: float | None = None,
    source: Handler[Any, float | None] | None = None,
):
    """Initialize EMA handler with specified parameters.

    :param length: The period for EMA calculation, defaults to 10
    :param adjust: Whether to use adjusted weights in calculation, defaults to False
    :param sma: Whether to use SMA for initial value, defaults to True
    :param alpha: Custom smoothing factor, defaults to 2/(length+1) if None
    :param source: Input data source, defaults to None
    """
    super().__init__(source)
    self.length = length
    self.adjust = adjust
    self.sma = sma
    if alpha is None:
        self.alpha = 2 / (self.length + 1)
    else:
        self.alpha = alpha
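The default smoothing factor and the unadjusted recurrence used in the update step can be illustrated with a small standalone sketch. This is not the library's implementation: the function name `ema_sketch` is hypothetical, the series is seeded with the first value rather than an SMA, and the `adjust` and `sma` options are ignored.

```python
def ema_sketch(values, length=10, alpha=None):
    """Unadjusted EMA seeded with the first value (hypothetical helper)."""
    if alpha is None:
        alpha = 2 / (length + 1)  # same default as in __init__ above
    result = []
    ema = None
    for value in values:
        if ema is None:
            ema = value  # seed with the first observation (assumption)
        else:
            # Same recurrence as the unadjusted update step
            ema = (1 - alpha) * ema + alpha * value
        result.append(ema)
    return result


print(ema_sketch([1.0, 2.0, 3.0], length=3))  # [1.0, 1.5, 2.25]
```

With `length=3` the default `alpha` is 0.5, so each output is the midpoint between the previous EMA and the new value.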
fwma_handler
FWMAHandler

Bases: WeightedMovingAverageHandler

Fibonacci Weighted Moving Average (FWMA) handler.

Calculates a moving average using Fibonacci sequence numbers as weights. The Fibonacci sequence (1, 1, 2, 3, 5, 8, 13, ...) provides a natural weighting scheme where each number is the sum of the two preceding ones.

By default, higher weights are assigned to more recent values (when asc=False), making this moving average more responsive to recent changes in the data.

Inherits general behavior from WeightedMovingAverageHandler, only changing the weight calculation method.

Example:

# Create a data source with numeric values
data_source = SimpleDataProvider([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])

# Create a FWMA handler with length of 5
fwma_handler = FWMAHandler(length=5)
fwma_handler.set_source(data_source)

# Process the data
for value in fwma_handler:
    print(value)

# First 4 values will be None (not enough data points)
# Subsequent values will be weighted averages using Fibonacci weights
# For length=5, the Fibonacci numbers [1, 1, 2, 3, 5] are normalized by their sum (12),
# giving weights of about [0.083, 0.083, 0.167, 0.250, 0.417],
# or [0.417, 0.250, 0.167, 0.083, 0.083] when asc=False (default)

Source code in pysatl_tsp/implementations/processor/fwma_handler.py
class FWMAHandler(WeightedMovingAverageHandler):
    """Fibonacci Weighted Moving Average (FWMA) handler.

    Calculates a moving average using Fibonacci sequence numbers as weights.
    The Fibonacci sequence (1, 1, 2, 3, 5, 8, 13, ...) provides a natural
    weighting scheme where each number is the sum of the two preceding ones.

    By default, higher weights are assigned to more recent values (when asc=False),
    making this moving average more responsive to recent changes in the data.

    Inherits general behavior from WeightedMovingAverageHandler, only changing
    the weight calculation method.

    Example:
        ```python
        # Create a data source with numeric values
        data_source = SimpleDataProvider([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])

        # Create a FWMA handler with length of 5
        fwma_handler = FWMAHandler(length=5)
        fwma_handler.set_source(data_source)

        # Process the data
        for value in fwma_handler:
            print(value)

        # First 4 values will be None (not enough data points)
        # Subsequent values will be weighted averages using Fibonacci weights
        # For length=5, the Fibonacci numbers [1, 1, 2, 3, 5] are normalized by their sum (12),
        # giving weights of about [0.083, 0.083, 0.167, 0.250, 0.417],
        # or [0.417, 0.250, 0.167, 0.083, 0.083] when asc=False (default)
        ```
    """

    def _calculate_weights(self, length: int, asc: bool) -> list[float]:
        """Calculate Fibonacci weights for FWMA.

        Generates weights based on the Fibonacci sequence and normalizes them
        to sum to 1.0. The sequence order can be reversed based on the asc parameter.

        :param length: The number of weights to generate
        :param asc: Whether weights should be in ascending order
        :return: A list of normalized weights summing to 1.0
        """
        sequence = self._fibonacci_sequence(length)

        if not asc:
            sequence = sequence[::-1]

        # Normalize the weights to sum to 1.0
        total = sum(sequence)
        return [x / total for x in sequence]

    def _fibonacci_sequence(self, n: int) -> list[float]:
        """Generate the Fibonacci sequence of specified length.

        Creates a list containing the first n numbers in the Fibonacci sequence.
        The sequence starts with 1, 1 and each subsequent number is the sum of
        the two preceding ones.

        :param n: The length of the sequence to generate
        :return: A list containing the Fibonacci sequence
        """
        if n <= 0:
            return []

        if n == 1:
            return [1.0]

        sequence = [1.0, 1.0]
        for i in range(2, n):
            sequence.append(sequence[i - 1] + sequence[i - 2])

        return sequence
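As a worked check of the normalization step, the sketch below reproduces the weight calculation with exact fractions. The function name `fibonacci_weights` is hypothetical, and `Fraction` is used only to make the arithmetic easy to read; the handler itself works with floats.

```python
from fractions import Fraction


def fibonacci_weights(length, asc=True):
    """Normalized Fibonacci weights as exact fractions (hypothetical helper)."""
    if length <= 0:
        return []
    seq = [1, 1][:length]
    while len(seq) < length:
        seq.append(seq[-1] + seq[-2])
    if not asc:
        seq = seq[::-1]
    total = sum(seq)
    return [Fraction(x, total) for x in seq]


print([str(w) for w in fibonacci_weights(5)])
# ['1/12', '1/12', '1/6', '1/4', '5/12']
```

The weights always sum to 1, and passing `asc=False` simply reverses their order.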
hma_handler
HMAHandler

Bases: Handler[float | None, float | None]

Hull Moving Average (HMA) handler.

The Hull Moving Average is designed to reduce lag while maintaining smoothness. It uses weighted moving averages (WMA) in a multi-step process to create a more responsive indicator that better follows price action.

The HMA is calculated using the following formula: HMA = WMA(2*WMA(n/2) - WMA(n), sqrt(n))

Parameters:

Name Type Description Default
length int

The period for HMA calculation

required

Example:

# Create a data source with numeric values
data_source = SimpleDataProvider([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0])

# Create a Hull Moving Average handler with length of 4
hma_handler = HMAHandler(length=4)
hma_handler.set_source(data_source)

# Process the data
for value in hma_handler:
    print(value)

# The first few values may be None as the HMA needs historical data
# Then the HMA values will follow, being more responsive than traditional
# moving averages while maintaining smoothness.

Source code in pysatl_tsp/implementations/processor/hma_handler.py
class HMAHandler(Handler[float | None, float | None]):
    """Hull Moving Average (HMA) handler.

    The Hull Moving Average is designed to reduce lag while maintaining smoothness.
    It uses weighted moving averages (WMA) in a multi-step process to create a more
    responsive indicator that better follows price action.

    The HMA is calculated using the following formula:
    HMA = WMA(2*WMA(n/2) - WMA(n), sqrt(n))

    :param length: The period for HMA calculation

    Example:
        ```python
        # Create a data source with numeric values
        data_source = SimpleDataProvider([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0])

        # Create a Hull Moving Average handler with length of 4
        hma_handler = HMAHandler(length=4)
        hma_handler.set_source(data_source)

        # Process the data
        for value in hma_handler:
            print(value)

        # The first few values may be None as the HMA needs historical data
        # Then the HMA values will follow, being more responsive than traditional
        # moving averages while maintaining smoothness.
        ```
    """

    def __init__(self, length: int):
        """Initialize a Hull Moving Average handler.

        :param length: The period for HMA calculation
        """
        super().__init__()
        self.length = length

    def __iter__(self) -> Iterator[float | None]:
        """Create an iterator that yields HMA values.

        This method constructs a pipeline that:
        1. Takes values from the source
        2. Calculates two WMAs with different periods (length//2 and length)
        3. Combines them using the formula: 2*WMA(length//2) - WMA(length)
        4. Applies another WMA with period=sqrt(length) to the result

        :return: Iterator yielding HMA values
        :raises ValueError: If no source has been set
        """
        if self.source is None:
            raise ValueError("Source is not set")

        def combine_func(lst: list[float | None]) -> float | None:
            if lst[0] is None or lst[1] is None:
                return None
            return 2 * lst[0] - lst[1]

        yield from (
            SimpleDataProvider(self.source)
            | CombineHandler(combine_func, WMAHandler(length=self.length // 2), WMAHandler(length=self.length))
            | WMAHandler(length=int(math.sqrt(self.length)))
        )
__init__
__init__(length: int)

Initialize a Hull Moving Average handler.

Parameters:

Name Type Description Default
length int

The period for HMA calculation

required
Source code in pysatl_tsp/implementations/processor/hma_handler.py
def __init__(self, length: int):
    """Initialize a Hull Moving Average handler.

    :param length: The period for HMA calculation
    """
    super().__init__()
    self.length = length
__iter__
__iter__() -> Iterator[float | None]

Create an iterator that yields HMA values.

This method constructs a pipeline that:

1. Takes values from the source
2. Calculates two WMAs with different periods (length//2 and length)
3. Combines them using the formula: 2*WMA(length//2) - WMA(length)
4. Applies another WMA with period=sqrt(length) to the result

Returns:

Type Description
Iterator[float | None]

Iterator yielding HMA values

Raises:

Type Description
ValueError

If no source has been set

Source code in pysatl_tsp/implementations/processor/hma_handler.py
def __iter__(self) -> Iterator[float | None]:
    """Create an iterator that yields HMA values.

    This method constructs a pipeline that:
    1. Takes values from the source
    2. Calculates two WMAs with different periods (length//2 and length)
    3. Combines them using the formula: 2*WMA(length//2) - WMA(length)
    4. Applies another WMA with period=sqrt(length) to the result

    :return: Iterator yielding HMA values
    :raises ValueError: If no source has been set
    """
    if self.source is None:
        raise ValueError("Source is not set")

    def combine_func(lst: list[float | None]) -> float | None:
        if lst[0] is None or lst[1] is None:
            return None
        return 2 * lst[0] - lst[1]

    yield from (
        SimpleDataProvider(self.source)
        | CombineHandler(combine_func, WMAHandler(length=self.length // 2), WMAHandler(length=self.length))
        | WMAHandler(length=int(math.sqrt(self.length)))
    )
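The pipeline above can be mirrored on plain lists to see how the three WMA stages fit together. This is a minimal sketch under stated assumptions: `wma` and `hma_sketch` are hypothetical helpers, `wma` uses linear weights 1..n with the largest weight on the most recent value (the behavior of `WMAHandler` is not shown here and may differ), and `length` is assumed to be at least 2.

```python
import math


def wma(values, length):
    """Linearly weighted moving average; None until the window is full or if it holds None."""
    weights = range(1, length + 1)  # most recent value gets the largest weight
    total = sum(weights)
    out = []
    for i in range(len(values)):
        window = values[max(0, i - length + 1): i + 1]
        if len(window) < length or any(v is None for v in window):
            out.append(None)
        else:
            out.append(sum(w * v for w, v in zip(weights, window)) / total)
    return out


def hma_sketch(values, length):
    """HMA = WMA(2*WMA(length//2) - WMA(length), int(sqrt(length)))."""
    if length < 2:
        raise ValueError("length must be at least 2 in this sketch")
    half = wma(values, length // 2)
    full = wma(values, length)
    diff = [None if h is None or f is None else 2 * h - f for h, f in zip(half, full)]
    return wma(diff, int(math.sqrt(length)))


print(hma_sketch([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], length=4))
```

For this linear ramp the first four outputs are `None` and the last two are approximately 5.0 and 6.0, which shows the reduced lag: on a straight line the sketch tracks the input itself.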
kalman_filter_handler
KalmanFilterHandler

Bases: OnlineFilterHandler[float, float]

A handler that applies the Kalman filter to time series data in real-time.

This handler integrates the complete Kalman filter functionality for processing noisy time series data. It estimates the underlying state of a system based on a sequence of noisy measurements.

Parameters:

Name Type Description Default
F ndarray[Any, dtype[float64]]

State transition matrix

required
H ndarray[Any, dtype[float64]]

Measurement matrix

required
B Union[float, ndarray[Any, dtype[float64]]] | None

Control input matrix, defaults to 0

None
Q ndarray[Any, dtype[float64]] | None

Process noise covariance matrix, defaults to identity matrix

None
R ndarray[Any, dtype[float64]] | None

Measurement noise covariance matrix, defaults to identity matrix

None
P ndarray[Any, dtype[float64]] | None

Initial state covariance matrix, defaults to identity matrix

None
x0 ndarray[Any, dtype[float64]] | None

Initial state vector, defaults to zero vector

None
source Handler[Any, float] | None

The handler providing input data, defaults to None

None

Example:

import numpy as np
from pysatl_tsp.core.data_providers import SimpleDataProvider
from pysatl_tsp.implementations.processor.kalman_filter_handler import KalmanFilterHandler

np.random.seed(42)
true_signal = np.sin(np.linspace(0, 4 * np.pi, 1000))
noisy_signal = true_signal + np.random.normal(0, 0.1, 1000)

data_source = SimpleDataProvider(noisy_signal.tolist())

dt: float = 1.0/60
F = np.array([[1, dt, 0],[0, 1, dt], [0, 0, 1]])
H = np.array([1, 0, 0]).reshape(1, 3)
Q = np.array([[0.05, 0.05, 0.0], [0.05, 0.05, 0.0], [0.0, 0.0, 0.0]])
R = np.array([0.5]).reshape(1, 1)

filter_handler: KalmanFilterHandler = KalmanFilterHandler(F=F, H=H, Q=Q, R=R, source=data_source)

filtered_values = list(filter_handler)

import matplotlib.pyplot as plt
plt.figure(figsize=(10, 6))
plt.plot(range(len(noisy_signal)), noisy_signal, label='Measurements')
plt.plot(range(len(filtered_values)), filtered_values, label='Kalman Filter Prediction')
plt.legend()
plt.show()

Source code in pysatl_tsp/implementations/processor/kalman_filter_handler.py
class KalmanFilterHandler(OnlineFilterHandler[float, float]):
    """A handler that applies the Kalman filter to time series data in real-time.

    This handler integrates the complete Kalman filter functionality for processing
    noisy time series data. It estimates the underlying state of a system based on
    a sequence of noisy measurements.

    :param F: State transition matrix
    :param H: Measurement matrix
    :param B: Control input matrix, defaults to 0
    :param Q: Process noise covariance matrix, defaults to identity matrix
    :param R: Measurement noise covariance matrix, defaults to identity matrix
    :param P: Initial state covariance matrix, defaults to identity matrix
    :param x0: Initial state vector, defaults to zero vector
    :param source: The handler providing input data, defaults to None

    Example:
        ```python
        import numpy as np
        from pysatl_tsp.core.data_providers import SimpleDataProvider
        from pysatl_tsp.implementations.processor.kalman_filter_handler import KalmanFilterHandler

        np.random.seed(42)
        true_signal = np.sin(np.linspace(0, 4 * np.pi, 1000))
        noisy_signal = true_signal + np.random.normal(0, 0.1, 1000)

        data_source = SimpleDataProvider(noisy_signal.tolist())

        dt: float = 1.0/60
        F = np.array([[1, dt, 0],[0, 1, dt], [0, 0, 1]])
        H = np.array([1, 0, 0]).reshape(1, 3)
        Q = np.array([[0.05, 0.05, 0.0], [0.05, 0.05, 0.0], [0.0, 0.0, 0.0]])
        R = np.array([0.5]).reshape(1, 1)

        filter_handler: KalmanFilterHandler = KalmanFilterHandler(F=F, H=H, Q=Q, R=R, source=data_source)

        filtered_values = list(filter_handler)

        import matplotlib.pyplot as plt
        plt.figure(figsize=(10, 6))
        plt.plot(range(len(noisy_signal)), noisy_signal, label='Measurements')
        plt.plot(range(len(filtered_values)), filtered_values, label='Kalman Filter Prediction')
        plt.legend()
        plt.show()
        ```
    """

    def __init__(
        self,
        F: np.ndarray[Any, np.dtype[np.float64]],
        H: np.ndarray[Any, np.dtype[np.float64]],
        B: Union[float, np.ndarray[Any, np.dtype[np.float64]]] | None = None,
        Q: np.ndarray[Any, np.dtype[np.float64]] | None = None,
        R: np.ndarray[Any, np.dtype[np.float64]] | None = None,
        P: np.ndarray[Any, np.dtype[np.float64]] | None = None,
        x0: np.ndarray[Any, np.dtype[np.float64]] | None = None,
        source: Handler[Any, float] | None = None,
    ) -> None:
        """Initialize the Kalman filter handler with all necessary matrices.

        :param F: State transition matrix
        :param H: Measurement matrix
        :param B: Control input matrix, defaults to None
        :param Q: Process noise covariance matrix, defaults to None
        :param R: Measurement noise covariance matrix, defaults to None
        :param P: Initial state covariance matrix, defaults to None
        :param x0: Initial state vector, defaults to None
        :param source: The handler providing input data, defaults to None
        """
        if F is None or H is None:
            raise ValueError("Set proper system dynamics.")

        self.n: int = F.shape[1]  # state dimension
        self.m: int = H.shape[0]  # measurement dimension (number of rows of H)

        self.F: np.ndarray[Any, np.dtype[np.float64]] = F
        self.H: np.ndarray[Any, np.dtype[np.float64]] = H
        self.B: Union[float, np.ndarray[Any, np.dtype[np.float64]]] = 0 if B is None else B

        self.Q: np.ndarray[Any, np.dtype[np.float64]] = np.eye(self.n) if Q is None else Q
        # R is the measurement noise covariance, so its default must be m x m, not n x n
        self.R: np.ndarray[Any, np.dtype[np.float64]] = np.eye(self.m) if R is None else R
        self.P: np.ndarray[Any, np.dtype[np.float64]] = np.eye(self.n) if P is None else P

        self.x: np.ndarray[Any, np.dtype[np.float64]] = np.zeros((self.n, 1)) if x0 is None else x0

        super().__init__(filter_func=self._apply_kalman_filter, filter_config=None, source=source)

    def predict(
        self, u: Union[float, np.ndarray[Any, np.dtype[np.float64]]] = 0
    ) -> np.ndarray[Any, np.dtype[np.float64]]:
        """Predict the next state based on the model.

        :param u: Control input, defaults to 0
        :return: Predicted state vector
        """
        self.x = np.dot(self.F, self.x) + np.dot(self.B, u)

        self.P = np.dot(np.dot(self.F, self.P), self.F.T) + self.Q

        return self.x

    def update(self, z: float) -> None:
        """Update the state estimate based on the measurement.

        :param z: Measurement
        """
        y: np.ndarray[Any, np.dtype[np.float64]] = z - np.dot(self.H, self.x)

        S: np.ndarray[Any, np.dtype[np.float64]] = self.R + np.dot(self.H, np.dot(self.P, self.H.T))

        K: np.ndarray[Any, np.dtype[np.float64]] = np.dot(np.dot(self.P, self.H.T), np.linalg.inv(S))

        self.x = self.x + np.dot(K, y)

        I_matrix: np.ndarray[Any, np.dtype[np.float64]] = np.eye(self.n)
        self.P = np.dot(np.dot(I_matrix - np.dot(K, self.H), self.P), (I_matrix - np.dot(K, self.H)).T) + np.dot(
            np.dot(K, self.R), K.T
        )

    def _apply_kalman_filter(self, window: ScrubberWindow[float], _: Any) -> float:
        """Apply the Kalman filter to the latest point in the window.

        :param window: Window of historical data
        :param _: Unused configuration parameter
        :return: Filtered value
        """
        if not window:
            return 0.0

        measurement: float = window[-1]

        prediction_array = np.dot(self.H, self.predict())
        prediction: float = float(prediction_array.item())

        self.update(measurement)

        return prediction
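To make the predict-then-update cycle concrete, here is a scalar sketch of the same flow with F = 1, H = 1 and no control input. It is an illustration under those assumptions rather than the handler's code: the function name `scalar_kalman_predictions` and the default noise values are hypothetical. Like `_apply_kalman_filter`, it records the a priori prediction for each sample before folding the measurement in.

```python
def scalar_kalman_predictions(measurements, q=0.001, r=0.1, x0=0.0, p0=1.0):
    """1-D Kalman filter (F = 1, H = 1) returning the a priori prediction per sample."""
    x, p = x0, p0
    predictions = []
    for z in measurements:
        # Predict: the state carries over, the uncertainty grows by the process noise
        p = p + q
        predictions.append(x)

        # Update: blend the prediction with the measurement
        y = z - x            # innovation
        s = p + r            # innovation variance
        k = p / s            # Kalman gain
        x = x + k * y
        p = (1 - k) ** 2 * p + k ** 2 * r  # Joseph form, scalar case
    return predictions


predictions = scalar_kalman_predictions([5.0] * 50)
print(predictions[0], round(predictions[-1], 2))
```

The first prediction equals the initial state `x0`, and for a constant signal the later predictions converge toward the measured value.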
__init__
__init__(
    F: ndarray[Any, dtype[float64]],
    H: ndarray[Any, dtype[float64]],
    B: Union[float, ndarray[Any, dtype[float64]]]
    | None = None,
    Q: ndarray[Any, dtype[float64]] | None = None,
    R: ndarray[Any, dtype[float64]] | None = None,
    P: ndarray[Any, dtype[float64]] | None = None,
    x0: ndarray[Any, dtype[float64]] | None = None,
    source: Handler[Any, float] | None = None,
) -> None

Initialize the Kalman filter handler with all necessary matrices.

Parameters:

Name Type Description Default
F ndarray[Any, dtype[float64]]

State transition matrix

required
H ndarray[Any, dtype[float64]]

Measurement matrix

required
B Union[float, ndarray[Any, dtype[float64]]] | None

Control input matrix, defaults to None

None
Q ndarray[Any, dtype[float64]] | None

Process noise covariance matrix, defaults to None

None
R ndarray[Any, dtype[float64]] | None

Measurement noise covariance matrix, defaults to None

None
P ndarray[Any, dtype[float64]] | None

Initial state covariance matrix, defaults to None

None
x0 ndarray[Any, dtype[float64]] | None

Initial state vector, defaults to None

None
source Handler[Any, float] | None

The handler providing input data, defaults to None

None
Source code in pysatl_tsp/implementations/processor/kalman_filter_handler.py
def __init__(
    self,
    F: np.ndarray[Any, np.dtype[np.float64]],
    H: np.ndarray[Any, np.dtype[np.float64]],
    B: Union[float, np.ndarray[Any, np.dtype[np.float64]]] | None = None,
    Q: np.ndarray[Any, np.dtype[np.float64]] | None = None,
    R: np.ndarray[Any, np.dtype[np.float64]] | None = None,
    P: np.ndarray[Any, np.dtype[np.float64]] | None = None,
    x0: np.ndarray[Any, np.dtype[np.float64]] | None = None,
    source: Handler[Any, float] | None = None,
) -> None:
    """Initialize the Kalman filter handler with all necessary matrices.

    :param F: State transition matrix
    :param H: Measurement matrix
    :param B: Control input matrix, defaults to None
    :param Q: Process noise covariance matrix, defaults to None
    :param R: Measurement noise covariance matrix, defaults to None
    :param P: Initial state covariance matrix, defaults to None
    :param x0: Initial state vector, defaults to None
    :param source: The handler providing input data, defaults to None
    """
    if F is None or H is None:
        raise ValueError("Set proper system dynamics.")

    self.n: int = F.shape[1]  # state dimension
    self.m: int = H.shape[0]  # measurement dimension (number of rows of H)

    self.F: np.ndarray[Any, np.dtype[np.float64]] = F
    self.H: np.ndarray[Any, np.dtype[np.float64]] = H
    self.B: Union[float, np.ndarray[Any, np.dtype[np.float64]]] = 0 if B is None else B

    self.Q: np.ndarray[Any, np.dtype[np.float64]] = np.eye(self.n) if Q is None else Q
    # R is the measurement noise covariance, so its default must be m x m, not n x n
    self.R: np.ndarray[Any, np.dtype[np.float64]] = np.eye(self.m) if R is None else R
    self.P: np.ndarray[Any, np.dtype[np.float64]] = np.eye(self.n) if P is None else P

    self.x: np.ndarray[Any, np.dtype[np.float64]] = np.zeros((self.n, 1)) if x0 is None else x0

    super().__init__(filter_func=self._apply_kalman_filter, filter_config=None, source=source)
predict
predict(
    u: Union[float, ndarray[Any, dtype[float64]]] = 0,
) -> np.ndarray[Any, np.dtype[np.float64]]

Predict the next state based on the model.

Parameters:

Name Type Description Default
u Union[float, ndarray[Any, dtype[float64]]]

Control input, defaults to 0

0

Returns:

Type Description
ndarray[Any, dtype[float64]]

Predicted state vector

Source code in pysatl_tsp/implementations/processor/kalman_filter_handler.py
def predict(
    self, u: Union[float, np.ndarray[Any, np.dtype[np.float64]]] = 0
) -> np.ndarray[Any, np.dtype[np.float64]]:
    """Predict the next state based on the model.

    :param u: Control input, defaults to 0
    :return: Predicted state vector
    """
    self.x = np.dot(self.F, self.x) + np.dot(self.B, u)

    self.P = np.dot(np.dot(self.F, self.P), self.F.T) + self.Q

    return self.x
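Written out, the two assignments in `predict` correspond to the standard prediction equations. The subscript notation (k given k-1) is introduced here for illustration and does not appear in the source; F, B, Q, P and u are the attributes and argument used in the code.

```latex
\begin{aligned}
\hat{x}_{k \mid k-1} &= F\,\hat{x}_{k-1 \mid k-1} + B\,u_k \\
P_{k \mid k-1} &= F\,P_{k-1 \mid k-1}\,F^{\top} + Q
\end{aligned}
```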
update
update(z: float) -> None

Update the state estimate based on the measurement.

Parameters:

Name Type Description Default
z float

Measurement

required
Source code in pysatl_tsp/implementations/processor/kalman_filter_handler.py
def update(self, z: float) -> None:
    """Update the state estimate based on the measurement.

    :param z: Measurement
    """
    y: np.ndarray[Any, np.dtype[np.float64]] = z - np.dot(self.H, self.x)

    S: np.ndarray[Any, np.dtype[np.float64]] = self.R + np.dot(self.H, np.dot(self.P, self.H.T))

    K: np.ndarray[Any, np.dtype[np.float64]] = np.dot(np.dot(self.P, self.H.T), np.linalg.inv(S))

    self.x = self.x + np.dot(K, y)

    I_matrix: np.ndarray[Any, np.dtype[np.float64]] = np.eye(self.n)
    self.P = np.dot(np.dot(I_matrix - np.dot(K, self.H), self.P), (I_matrix - np.dot(K, self.H)).T) + np.dot(
        np.dot(K, self.R), K.T
    )
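The statements in `update` map one-to-one onto the measurement-update equations below. The subscript notation is an assumption added for readability; the covariance line is the Joseph form, which is what the final assignment to `self.P` computes.

```latex
\begin{aligned}
y_k &= z_k - H\,\hat{x}_{k \mid k-1} \\
S_k &= R + H\,P_{k \mid k-1}\,H^{\top} \\
K_k &= P_{k \mid k-1}\,H^{\top} S_k^{-1} \\
\hat{x}_{k \mid k} &= \hat{x}_{k \mid k-1} + K_k\,y_k \\
P_{k \mid k} &= (I - K_k H)\,P_{k \mid k-1}\,(I - K_k H)^{\top} + K_k\,R\,K_k^{\top}
\end{aligned}
```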
midpoint_handler
MidpointHandler

Bases: MovingWindowHandler[float | None, float | None]

Midpoint price handler.

Calculates the average of highest and lowest values over the period. This handler is useful for identifying the central price level within a range, providing a simple measure of the balance between high and low extremes.

Inherits parameters from MovingWindowHandler:

Parameters:

Name Type Description Default
length

The period for the calculation, defaults to 10

required
source

Input data source, defaults to None

required

Example:

# Create a data source with numeric values
data_source = SimpleDataProvider([1.0, 2.0, 3.0, 5.0, 4.0, 3.0, 2.0, 1.0])

# Create a midpoint handler with length of 4
midpoint_handler = MidpointHandler(length=4)
midpoint_handler.set_source(data_source)

# Process the data
for value in midpoint_handler:
    print(value)

# Output:
# None
# None
# None
# 3.0  (midpoint of [1.0, 2.0, 3.0, 5.0] = (1.0 + 5.0) / 2 = 3.0)
# 3.5  (midpoint of [2.0, 3.0, 5.0, 4.0] = (2.0 + 5.0) / 2 = 3.5)
# 4.0  (midpoint of [3.0, 5.0, 4.0, 3.0] = (3.0 + 5.0) / 2 = 4.0)
# 3.5  (midpoint of [5.0, 4.0, 3.0, 2.0] = (2.0 + 5.0) / 2 = 3.5)
# 2.5  (midpoint of [4.0, 3.0, 2.0, 1.0] = (1.0 + 4.0) / 2 = 2.5)

Source code in pysatl_tsp/implementations/processor/midpoint_handler.py
class MidpointHandler(MovingWindowHandler[float | None, float | None]):
    """Midpoint price handler.

    Calculates the average of highest and lowest values over the period.
    This handler is useful for identifying the central price level within a range,
    providing a simple measure of the balance between high and low extremes.

    Inherits parameters from MovingWindowHandler:
    :param length: The period for the calculation, defaults to 10
    :param source: Input data source, defaults to None

    Example:
        ```python
        # Create a data source with numeric values
        data_source = SimpleDataProvider([1.0, 2.0, 3.0, 5.0, 4.0, 3.0, 2.0, 1.0])

        # Create a midpoint handler with length of 4
        midpoint_handler = MidpointHandler(length=4)
        midpoint_handler.set_source(data_source)

        # Process the data
        for value in midpoint_handler:
            print(value)

        # Output:
        # None
        # None
        # None
        # 3.0  (midpoint of [1.0, 2.0, 3.0, 5.0] = (1.0 + 5.0) / 2 = 3.0)
        # 3.5  (midpoint of [2.0, 3.0, 5.0, 4.0] = (2.0 + 5.0) / 2 = 3.5)
        # 4.0  (midpoint of [3.0, 5.0, 4.0, 3.0] = (3.0 + 5.0) / 2 = 4.0)
        # 3.5  (midpoint of [5.0, 4.0, 3.0, 2.0] = (2.0 + 5.0) / 2 = 3.5)
        # 2.5  (midpoint of [4.0, 3.0, 2.0, 1.0] = (1.0 + 4.0) / 2 = 2.5)
        ```
    """

    def _compute_result(self, state: dict[str, Any]) -> float | None:
        """Calculate midpoint as (highest + lowest) / 2.

        Computes the average of the highest and lowest values in the current
        window. Returns None if there aren't enough values to fill the window
        or if any value in the window is None.

        :param state: Current state dictionary containing window values
        :return: Midpoint value or None if conditions aren't met
        """
        values = state["values"]

        if len(values) < self.length:
            return None

        highest = float("-inf")
        lowest = float("inf")
        for v in values:
            if v is None:
                return None
            highest = max(highest, v)
            lowest = min(lowest, v)

        return (highest + lowest) / 2
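The same rule can be checked on a plain list without the handler machinery. In this sketch `rolling_midpoint` is a hypothetical helper that keeps the last `length` values in a `deque` and applies the (highest + lowest) / 2 rule, returning `None` while the window is not yet full or contains `None`.

```python
from collections import deque


def rolling_midpoint(values, length):
    """Rolling (max + min) / 2 over the last `length` values (hypothetical helper)."""
    window = deque(maxlen=length)
    out = []
    for v in values:
        window.append(v)
        if len(window) < length or any(x is None for x in window):
            out.append(None)
        else:
            out.append((max(window) + min(window)) / 2)
    return out


print(rolling_midpoint([1.0, 2.0, 3.0, 5.0, 4.0, 3.0, 2.0, 1.0], length=4))
# [None, None, None, 3.0, 3.5, 4.0, 3.5, 2.5]
```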
midprice_handler
MidpriceHandler

Bases: MovingWindowHandler[tuple[float | None, float | None], float | None]

Midprice handler that processes high/low price tuples.

Calculates the average of the highest high and lowest low over a period. This handler is particularly useful for financial time series data where both high and low prices are available, providing a measure of the center of the price range over the specified period.

Unlike simple averages, the midprice considers only extreme values, making it useful for range-bound markets and support/resistance identification.

Parameters:

Name Type Description Default
length

The period for the calculation, defaults to 10

required
source

Input data source providing (high, low) tuples, defaults to None

required

Example:

# Create a data source with (high, low) price tuples
data = [
    (10.0, 8.0),  # (high, low)
    (11.0, 9.0),
    (12.0, 8.5),
    (10.5, 7.5),
    (11.5, 9.5),
    (13.0, 10.0),
]
data_source = SimpleDataProvider(data)

# Create a midprice handler with length of 3
midprice_handler = MidpriceHandler(length=3)
midprice_handler.set_source(data_source)

# Process the data
for value in midprice_handler:
    print(value)

# Output:
# None
# None
# 10.0  # (highest high 12.0 + lowest low 8.0) / 2 from first 3 tuples
# 9.75  # (highest high 12.0 + lowest low 7.5) / 2
# 9.75  # (highest high 12.0 + lowest low 7.5) / 2
# 10.25 # (highest high 13.0 + lowest low 7.5) / 2

Source code in pysatl_tsp/implementations/processor/midprice_handler.py
class MidpriceHandler(MovingWindowHandler[tuple[float | None, float | None], float | None]):
    """Midprice handler that processes high/low price tuples.

    Calculates the average of the highest high and lowest low over a period.
    This handler is particularly useful for financial time series data where
    both high and low prices are available, providing a measure of the center
    of the price range over the specified period.

    Unlike simple averages, the midprice considers only extreme values, making
    it useful for range-bound markets and support/resistance identification.

    :param length: The period for the calculation, defaults to 10
    :param source: Input data source providing (high, low) tuples, defaults to None

    Example:
        ```python
        # Create a data source with (high, low) price tuples
        data = [
            (10.0, 8.0),  # (high, low)
            (11.0, 9.0),
            (12.0, 8.5),
            (10.5, 7.5),
            (11.5, 9.5),
            (13.0, 10.0),
        ]
        data_source = SimpleDataProvider(data)

        # Create a midprice handler with length of 3
        midprice_handler = MidpriceHandler(length=3)
        midprice_handler.set_source(data_source)

        # Process the data
        for value in midprice_handler:
            print(value)

        # Output:
        # None
        # None
        # 10.0  # (highest high 12.0 + lowest low 8.0) / 2 from first 3 tuples
        # 9.75  # (highest high 12.0 + lowest low 7.5) / 2
        # 9.75  # (highest high 12.0 + lowest low 7.5) / 2
        # 10.25 # (highest high 13.0 + lowest low 7.5) / 2
        ```
    """

    def _compute_result(self, state: dict[str, Any]) -> float | None:
        """Calculate midprice as (highest high + lowest low) / 2.

        Examines all high/low pairs in the current window, finds the highest high
        and lowest low values, and returns their average. This produces a measure
        of the central price level between the highest and lowest extremes.

        :param state: Current state dictionary containing window of (high, low) tuples
        :return: Midprice value or None if the window isn't filled or contains None values
        """
        values = state["values"]

        if len(values) < self.length:
            return None

        highest_high: float = float("-inf")
        lowest_low: float = float("inf")

        for high, low in values:
            if high is None or low is None:
                return None
            highest_high = max(highest_high, high)
            lowest_low = min(lowest_low, low)

        return (highest_high + lowest_low) / 2
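A standalone version of the same calculation makes the expected outputs easy to verify by hand. `rolling_midprice` is a hypothetical helper, not part of the library; it applies the (highest high + lowest low) / 2 rule to the last `length` tuples.

```python
from collections import deque


def rolling_midprice(bars, length):
    """Rolling (highest high + lowest low) / 2 over (high, low) tuples (hypothetical helper)."""
    window = deque(maxlen=length)
    out = []
    for bar in bars:
        window.append(bar)
        if len(window) < length or any(h is None or l is None for h, l in window):
            out.append(None)
        else:
            highest_high = max(h for h, _ in window)
            lowest_low = min(l for _, l in window)
            out.append((highest_high + lowest_low) / 2)
    return out


bars = [(10.0, 8.0), (11.0, 9.0), (12.0, 8.5), (10.5, 7.5), (11.5, 9.5), (13.0, 10.0)]
print(rolling_midprice(bars, length=3))
# [None, None, 10.0, 9.75, 9.75, 10.25]
```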
ohlc4_handler
Ohlc4Handler

Bases: MappingHandler[tuple[float | None, float | None, float | None, float | None], float | None]

A handler that calculates the average of OHLC (Open, High, Low, Close) price data.

This handler computes the OHLC4 (also known as the average price), which is the simple arithmetic mean of the Open, High, Low, and Close prices for each time period. The calculation helps smooth price data and can be used as input for other indicators.
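
As a rough illustration of the per-bar calculation, the sketch below mirrors the logic of `_map_func` in plain Python. The function name `ohlc4` and the sample bars are illustrative assumptions, not library API.

```python
from typing import Optional, Tuple

Ohlc = Tuple[Optional[float], Optional[float], Optional[float], Optional[float]]


def ohlc4(bar: Ohlc) -> Optional[float]:
    """Return the mean of (open, high, low, close), or None if any field is missing."""
    if any(x is None for x in bar):
        return None
    return sum(bar) / 4


bars = [
    (100.0, 105.0, 98.0, 103.0),  # complete bar
    (103.0, None, 101.0, 104.0),  # missing high: result is None
]
ohlc4_values = [ohlc4(bar) for bar in bars]
print(ohlc4_values)
```

A single missing field makes the whole bar's result `None`; the function never averages the remaining three values.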

Parameters:

Name Type Description Default
source Handler[Any, tuple[float | None, float | None, float | None, float | None]] | None

The handler providing OHLC tuples, defaults to None

None

Example:

```python
# Create a data source with OHLC price tuples
ohlc_data = [
    (100.0, 105.0, 98.0, 103.0),  # (open, high, low, close)
    (103.0, 107.0, 101.0, 104.0),
    (104.0, 109.0, 102.0, 108.0),
]
data_source = SimpleDataProvider(ohlc_data)

# Create an OHLC4 handler
ohlc4_handler = Ohlc4Handler(source=data_source)

# Process the data
for value in ohlc4_handler:
    print(value)

# Output:
# 101.5  # (100.0 + 105.0 + 98.0 + 103.0) / 4
# 103.75 # (103.0 + 107.0 + 101.0 + 104.0) / 4
# 105.75 # (104.0 + 109.0 + 102.0 + 108.0) / 4
```
Source code in pysatl_tsp/implementations/processor/ohlc4_handler.py
class Ohlc4Handler(MappingHandler[tuple[float | None, float | None, float | None, float | None], float | None]):
    """A handler that calculates the average of OHLC (Open, High, Low, Close) price data.

    This handler computes the OHLC4 (also known as the average price), which is the simple
    arithmetic mean of the Open, High, Low, and Close prices for each time period.
    The calculation helps smooth price data and can be used as input for other indicators.

    :param source: The handler providing OHLC tuples, defaults to None

    Example:
        ```python
        # Create a data source with OHLC price tuples
        ohlc_data = [
            (100.0, 105.0, 98.0, 103.0),  # (open, high, low, close)
            (103.0, 107.0, 101.0, 104.0),
            (104.0, 109.0, 102.0, 108.0),
        ]
        data_source = SimpleDataProvider(ohlc_data)

        # Create an OHLC4 handler
        ohlc4_handler = Ohlc4Handler(source=data_source)

        # Process the data
        for value in ohlc4_handler:
            print(value)

        # Output:
        # 101.5  # (100.0 + 105.0 + 98.0 + 103.0) / 4
        # 103.75 # (103.0 + 107.0 + 101.0 + 104.0) / 4
        # 105.75 # (104.0 + 109.0 + 102.0 + 108.0) / 4
        ```
    """

    @staticmethod
    def _map_func(t: tuple[float | None, float | None, float | None, float | None]) -> float | None:
        """Calculate the average of OHLC values.

        Takes a tuple of Open, High, Low, Close values and returns their arithmetic mean.
        If any value in the tuple is None, returns None.

        :param t: Tuple of (open, high, low, close) values
        :return: Average of OHLC values or None if any value is None
        """
        res = 0.0
        for x in t:
            if x is None:
                return None
            res += x
        return res / 4

    def __init__(
        self, source: Handler[Any, tuple[float | None, float | None, float | None, float | None]] | None = None
    ):
        """Initialize an OHLC4 handler.

        :param source: The handler providing OHLC tuples, defaults to None
        """
        super().__init__(self._map_func, source)
__init__
__init__(
    source: Handler[
        Any,
        tuple[
            float | None,
            float | None,
            float | None,
            float | None,
        ],
    ]
    | None = None,
)

Initialize an OHLC4 handler.

Parameters:

Name Type Description Default
source Handler[Any, tuple[float | None, float | None, float | None, float | None]] | None

The handler providing OHLC tuples, defaults to None

None
Source code in pysatl_tsp/implementations/processor/ohlc4_handler.py
def __init__(
    self, source: Handler[Any, tuple[float | None, float | None, float | None, float | None]] | None = None
):
    """Initialize an OHLC4 handler.

    :param source: The handler providing OHLC tuples, defaults to None
    """
    super().__init__(self._map_func, source)
pwma_handler
PWMAHandler

Bases: WeightedMovingAverageHandler

Pascal Weighted Moving Average (PWMA) handler.

Calculates a weighted moving average using coefficients from a row of Pascal's triangle as weights. The row is symmetric, so values in the middle of the window receive more weight than values at either end.

This implementation matches the functionality of pandas_ta.pwma.
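
The weighting scheme can be sketched in a few lines. This is a minimal standalone illustration of the same binomial-row construction used by `_calculate_weights`; the helper names `pascal_weights` and `pwma_last` are hypothetical.

```python
import math


def pascal_weights(length: int) -> list[float]:
    """Normalized row (length - 1) of Pascal's triangle."""
    row = [math.comb(length - 1, i) for i in range(length)]
    total = sum(row)  # equals 2 ** (length - 1)
    return [c / total for c in row]


def pwma_last(window: list[float]) -> float:
    """Weighted average of one full window using Pascal weights."""
    weights = pascal_weights(len(window))
    return sum(w * x for w, x in zip(weights, window))


weights = pascal_weights(4)
value = pwma_last([1.0, 2.0, 3.0, 4.0])
print(weights, value)
```

Because each row of Pascal's triangle is symmetric, reversing the weights leaves them unchanged.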

Inherits parameters from WeightedMovingAverageHandler:

Parameters:

Name Type Description Default
length

The period for the calculation, defaults to 10

required
asc

Whether weights should be applied in ascending order, defaults to False

required
source

Input data source, defaults to None

required

Example:

```python
# Create a data source with numeric values
data_source = SimpleDataProvider([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])

# Create a PWMA handler with length of 4
pwma_handler = PWMAHandler(length=4)
pwma_handler.set_source(data_source)

# Process the data
for value in pwma_handler:
    print(value)

# First 3 values will be None (not enough data points)
# For length=4, the normalized Pascal weights are [1/8, 3/8, 3/8, 1/8]; the row is symmetric,
# so reversing it for asc=False gives the same weights
# The calculation gives more weight to the central values
```
Source code in pysatl_tsp/implementations/processor/pwma_handler.py
class PWMAHandler(WeightedMovingAverageHandler):
    """Pascal Weighted Moving Average (PWMA) handler.

    Calculates a weighted moving average using coefficients from a row of Pascal's
    triangle as weights. The row is symmetric, so values in the middle of the window
    receive more weight than values at either end.

    This implementation matches the functionality of pandas_ta.pwma.

    Inherits parameters from WeightedMovingAverageHandler:
    :param length: The period for the calculation, defaults to 10
    :param asc: Whether weights should be applied in ascending order, defaults to False
    :param source: Input data source, defaults to None

    Example:
        ```python
        # Create a data source with numeric values
        data_source = SimpleDataProvider([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])

        # Create a PWMA handler with length of 4
        pwma_handler = PWMAHandler(length=4)
        pwma_handler.set_source(data_source)

        # Process the data
        for value in pwma_handler:
            print(value)

        # First 3 values will be None (not enough data points)
        # For length=4, the normalized Pascal weights are [1/8, 3/8, 3/8, 1/8]; the row is symmetric,
        # so reversing it for asc=False gives the same weights
        # The calculation gives more weight to the central values
        ```
    """

    def _calculate_weights(self, length: int, asc: bool) -> list[float]:
        """Calculate Pascal's triangle weights for PWMA.

        Generates weights based on a row of Pascal's triangle and normalizes them
        to sum to 1.0. The row is determined by (length-1) to match the pandas_ta
        implementation. The weights order can be reversed based on the asc parameter.

        :param length: The number of weights to generate
        :param asc: Whether weights should be in ascending order
        :return: A list of normalized weights summing to 1.0
        """
        # Generate the row of Pascal's triangle
        # Using (length-1) to match pandas_ta implementation
        triangle = [self._combination(length - 1, i) for i in range(length)]

        # Normalize the weights to sum to 1.0
        total = sum(triangle)
        weights = [w / total for w in triangle]

        if not asc:
            weights = weights[::-1]

        return weights

    def _combination(self, n: int, r: int) -> float:
        """Calculate the binomial coefficient (n choose r).

        Computes the binomial coefficient, also known as "n choose r", which
        represents the number of ways to choose r items from a set of n items
        without regard to order.

        Delegates to math.comb, which requires Python 3.8 or later; no factorial-based
        fallback is applied here.

        :param n: The total number of items
        :param r: The number of items to choose
        :return: The binomial coefficient
        """
        return math.comb(n, r)

    def _factorial(self, n: int) -> int:
        """Calculate the factorial of n.

        Computes n! recursively. This helper is meant for a factorial-based binomial
        coefficient calculation; _combination does not call it and uses math.comb directly.

        :param n: The number to calculate factorial for
        :return: The factorial of n
        """
        if n <= 1:
            return 1
        return n * self._factorial(n - 1)
rma_handler
RMAHandler

Bases: InductiveHandler[float | None, float | None]

Wilder's Moving Average (RMA) Handler.

Wilder's Moving Average is an Exponential Moving Average (EMA) with a modified smoothing factor, alpha = 1 / length, instead of the usual 2 / (length + 1). It was introduced by J. Welles Wilder and is also known as the Smoothed Moving Average.

RMA gives greater weight to recent data and less weight to older data, but does so more gradually than a standard EMA, resulting in a smoother line. It's commonly used in technical analysis for calculating indicators like the Relative Strength Index (RSI).
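
The recurrence can be sketched as a plain generator. This follows the update rule shown in the source listing (a decayed sum of values divided by a decayed count), but it is a standalone illustration; the function name `rma` is an assumption and the generator is not the library's `InductiveHandler` machinery.

```python
from typing import Iterable, Iterator, Optional


def rma(values: Iterable[Optional[float]], length: int = 10) -> Iterator[Optional[float]]:
    """Wilder-style smoothing with alpha = 1 / length, skipping None inputs."""
    alpha = 1 / length
    numerator = 0.0    # decayed sum of observed values
    denominator = 0.0  # decayed count of observed values
    seen = 0           # number of non-None inputs so far
    for value in values:
        present = value is not None
        seen += int(present)
        denominator = (1 - alpha) * denominator + int(present)
        numerator = (1 - alpha) * numerator + (value if present else 0.0)
        # Emit None until `length` real observations have arrived.
        if not denominator or seen < length:
            yield None
        else:
            yield numerator / denominator


rma_values = list(rma([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], length=3))
print(rma_values)
```

Dividing by the decayed count rather than by a fixed constant keeps the result a true weighted average, so a constant input series yields that constant.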

Parameters:

Name Type Description Default
length int

The number of periods for the moving average calculation, defaults to 10

10
source Handler[Any, float | None] | None

Source handler providing the input data, defaults to None

None

Example:

```python
# Create a data source with numeric values
data_source = SimpleDataProvider([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0])

# Create an RMA handler with length of 5
rma_handler = RMAHandler(length=5)
rma_handler.set_source(data_source)

# Process the data
for value in rma_handler:
    print(value)

# First values will be None until we have 'length' values
# Then RMA values will be calculated with alpha = 1/5
```
Source code in pysatl_tsp/implementations/processor/rma_handler.py
class RMAHandler(InductiveHandler[float | None, float | None]):
    """Wilder's Moving Average (RMA) Handler.

    Wilder's Moving Average is an Exponential Moving Average (EMA) with a modified
    smoothing factor, alpha = 1 / length, instead of the usual 2 / (length + 1).
    It was introduced by J. Welles Wilder and is also known as the Smoothed Moving Average.

    RMA gives greater weight to recent data and less weight to older data, but
    does so more gradually than a standard EMA, resulting in a smoother line.
    It's commonly used in technical analysis for calculating indicators like
    the Relative Strength Index (RSI).

    :param length: The number of periods for the moving average calculation, defaults to 10
    :param source: Source handler providing the input data, defaults to None

    Example:
        ```python
        # Create a data source with numeric values
        data_source = SimpleDataProvider([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0])

        # Create an RMA handler with length of 5
        rma_handler = RMAHandler(length=5)
        rma_handler.set_source(data_source)

        # Process the data
        for value in rma_handler:
            print(value)

        # First values will be None until we have 'length' values
        # Then RMA values will be calculated with alpha = 1/5
        ```
    """

    def __init__(self, length: int = 10, source: Handler[Any, float | None] | None = None):
        """Initialize a Wilder's Moving Average handler.

        :param length: The number of periods for the moving average calculation, defaults to 10
        :param source: Source handler providing the input data, defaults to None
        """
        super().__init__(source)
        self.length = length
        self.alpha = 1 / length

    def _initialize_state(self) -> dict[str, float | int]:
        """Initialize state for RMA calculation.

        Creates an initial state dictionary with counters and accumulators set to zero.

        :return: Dictionary containing initial state variables
        """
        return {"enumerator": 0, "denominator": 0, "not_none_count": 0}

    def _update_state(self, state: dict[str, float | int], value: float | None) -> dict[str, float | int]:
        """Update state with a new value.

        Updates the running totals using the RMA formula with alpha = 1/length.
        Handles None values by treating them as zeros in the calculation but
        tracking how many non-None values have been seen.

        :param state: Current state dictionary
        :param value: New value to incorporate into the RMA calculation
        :return: Updated state dictionary
        """
        state["not_none_count"] += int(value is not None)
        state["denominator"] = (1 - self.alpha) * state["denominator"] + int(value is not None)
        if value is None:
            value = 0

        state["enumerator"] = (1 - self.alpha) * state["enumerator"] + value
        return state

    def _compute_result(self, state: dict[str, float | int]) -> float | None:
        """Calculate the RMA value based on current state.

        Returns the current RMA value if there's a non-zero denominator and we have
        seen at least 'length' non-None values. Otherwise returns None.

        :param state: Current state dictionary
        :return: Current RMA value or None if conditions aren't met
        """
        if not state["denominator"] or state["not_none_count"] < self.length:
            return None
        return state["enumerator"] / state["denominator"]
__init__
__init__(
    length: int = 10,
    source: Handler[Any, float | None] | None = None,
)

Initialize a Wilder's Moving Average handler.

Parameters:

Name Type Description Default
length int

The number of periods for the moving average calculation, defaults to 10

10
source Handler[Any, float | None] | None

Source handler providing the input data, defaults to None

None
Source code in pysatl_tsp/implementations/processor/rma_handler.py
def __init__(self, length: int = 10, source: Handler[Any, float | None] | None = None):
    """Initialize a Wilder's Moving Average handler.

    :param length: The number of periods for the moving average calculation, defaults to 10
    :param source: Source handler providing the input data, defaults to None
    """
    super().__init__(source)
    self.length = length
    self.alpha = 1 / length
sma_handler
SMAHandler

Bases: MovingWindowHandler[float | None, float | None]

Simple Moving Average (SMA) Handler.

The Simple Moving Average is the classic moving average that is the equally weighted average over n periods. It's one of the most common technical analysis tools that smooths price data by calculating the arithmetic mean of a given set of values over the specified period.

This handler properly handles None values by ignoring them in the calculation, and allows specifying a minimum number of valid observations required before producing a result.
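
The interaction between `length`, `min_periods`, and `None` values can be sketched as follows. This is a standalone approximation that assumes the window keeps the most recent `length` items, `None` included; the function name `sma` is hypothetical and the code does not use `MovingWindowHandler`.

```python
from collections import deque
from typing import Iterable, Iterator, Optional


def sma(
    values: Iterable[Optional[float]], length: int = 10, min_periods: Optional[int] = None
) -> Iterator[Optional[float]]:
    """Mean of the non-None values in the last `length` items."""
    required = length if min_periods is None else min_periods
    window: deque = deque(maxlen=length)  # assumed window behaviour
    for value in values:
        window.append(value)
        valid = [v for v in window if v is not None]
        # None values are ignored, not counted as zeros.
        if valid and len(valid) >= required:
            yield sum(valid) / len(valid)
        else:
            yield None


sma_values = list(sma([1.0, 2.0, 3.0, None, 5.0, 6.0, 7.0, 8.0], length=4, min_periods=3))
print(sma_values)
```

Under this assumption the fourth output stays at 2.0, because the window `[1.0, 2.0, 3.0, None]` still has only three valid values.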

Parameters:

Name Type Description Default
length int

The period for the SMA calculation, defaults to 10

10
min_periods int | None

Minimum number of observations required to have a value, defaults to length

None
source Handler[Any, float | None] | None

Input data source, defaults to None

None

Example:

```python
# Create a data source with numeric values and some None values
data_source = SimpleDataProvider([1.0, 2.0, 3.0, None, 5.0, 6.0, 7.0, 8.0])

# Create an SMA handler with length of 4 and minimum periods of 3
sma_handler = SMAHandler(length=4, min_periods=3)
sma_handler.set_source(data_source)

# Process the data
for value in sma_handler:
    print(value)

# Output:
# None
# None
# 2.0   # (1.0 + 2.0 + 3.0) / 3 (only 3 values, but min_periods=3)
# 2.0   # (1.0 + 2.0 + 3.0) / 3 (window is [1.0, 2.0, 3.0, None]; None ignored)
# 3.33  # (2.0 + 3.0 + 5.0) / 3 (window moves, still ignoring None)
# 4.67  # (3.0 + 5.0 + 6.0) / 3
# 6.0   # (5.0 + 6.0 + 7.0) / 3
# 6.5   # (5.0 + 6.0 + 7.0 + 8.0) / 4 (full window with all valid values)
```
Source code in pysatl_tsp/implementations/processor/sma_handler.py
class SMAHandler(MovingWindowHandler[float | None, float | None]):
    """Simple Moving Average (SMA) Handler.

    The Simple Moving Average is the classic moving average that is the equally
    weighted average over n periods. It's one of the most common technical analysis
    tools that smooths price data by calculating the arithmetic mean of a given set
    of values over the specified period.

    This handler properly handles None values by ignoring them in the calculation,
    and allows specifying a minimum number of valid observations required before
    producing a result.

    :param length: The period for the SMA calculation, defaults to 10
    :param min_periods: Minimum number of observations required to have a value, defaults to length
    :param source: Input data source, defaults to None

    Example:
        ```python
        # Create a data source with numeric values and some None values
        data_source = SimpleDataProvider([1.0, 2.0, 3.0, None, 5.0, 6.0, 7.0, 8.0])

        # Create an SMA handler with length of 4 and minimum periods of 3
        sma_handler = SMAHandler(length=4, min_periods=3)
        sma_handler.set_source(data_source)

        # Process the data
        for value in sma_handler:
            print(value)

        # Output:
        # None
        # None
        # 2.0   # (1.0 + 2.0 + 3.0) / 3 (only 3 values, but min_periods=3)
        # 2.0   # (1.0 + 2.0 + 3.0) / 3 (window is [1.0, 2.0, 3.0, None]; None ignored)
        # 3.33  # (2.0 + 3.0 + 5.0) / 3 (window moves, still ignoring None)
        # 4.67  # (3.0 + 5.0 + 6.0) / 3
        # 6.0   # (5.0 + 6.0 + 7.0) / 3
        # 6.5   # (5.0 + 6.0 + 7.0 + 8.0) / 4 (full window with all valid values)
        ```
    """

    def __init__(
        self, length: int = 10, min_periods: int | None = None, source: Handler[Any, float | None] | None = None
    ):
        """Initialize SMA handler with specified parameters.

        :param length: The period for the SMA calculation, defaults to 10
        :param min_periods: Minimum number of non-None observations required to calculate a result,
                           defaults to length if None
        :param source: Input data source, defaults to None
        """
        super().__init__(length=length, source=source)
        self.min_periods = min_periods if min_periods is not None else length

    def _compute_result(self, state: dict[str, Any]) -> float | None:
        """Calculate simple moving average based on current values in the window.

        This method:
        1. Filters out None values from the current window
        2. Checks if the number of valid values meets or exceeds min_periods
        3. Calculates the arithmetic mean of valid values

        The SMA is calculated as the sum of valid values divided by the count of valid values,
        which means None values are completely ignored rather than treated as zeros.

        :param state: Current state containing the values in the moving window
        :return: Simple moving average of valid values or None if there aren't enough valid values
        """
        values = state["values"]

        # Filter out None values
        valid_values: list[float] = [v for v in values if v is not None]

        # Check if we have enough valid values
        if len(valid_values) < self.min_periods:
            return None

        # Calculate simple average
        return sum(valid_values) / len(valid_values)
__init__
__init__(
    length: int = 10,
    min_periods: int | None = None,
    source: Handler[Any, float | None] | None = None,
)

Initialize SMA handler with specified parameters.

Parameters:

Name Type Description Default
length int

The period for the SMA calculation, defaults to 10

10
min_periods int | None

Minimum number of non-None observations required to calculate a result, defaults to length if None

None
source Handler[Any, float | None] | None

Input data source, defaults to None

None
Source code in pysatl_tsp/implementations/processor/sma_handler.py
def __init__(
    self, length: int = 10, min_periods: int | None = None, source: Handler[Any, float | None] | None = None
):
    """Initialize SMA handler with specified parameters.

    :param length: The period for the SMA calculation, defaults to 10
    :param min_periods: Minimum number of non-None observations required to calculate a result,
                       defaults to length if None
    :param source: Input data source, defaults to None
    """
    super().__init__(length=length, source=source)
    self.min_periods = min_periods if min_periods is not None else length
t3_handler
T3Handler

Bases: Handler[float | None, float | None]

Tim Tillson's T3 Moving Average handler with lazy evaluation.

This handler implements the T3 adaptive moving average developed by Tim Tillson, which is designed to reduce lag and improve smoothness. The T3 is calculated as: T3 = c1 * e6 + c2 * e5 + c3 * e4 + c4 * e3, where e1-e6 are sequentially computed EMAs.

All calculations are performed lazily in a streaming fashion, computing values only when requested by the iterator.
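
The four coefficients depend only on the volume factor `a`. The sketch below reproduces the formulas from the source listing in a hypothetical helper, `t3_coefficients`, and checks a property that follows from them algebraically: the coefficients sum to 1.

```python
def t3_coefficients(a: float = 0.7) -> tuple[float, float, float, float]:
    """Return (c1, c2, c3, c4) for T3 = c1*e6 + c2*e5 + c3*e4 + c4*e3."""
    c1 = -(a**3)
    c2 = 3 * a**2 + 3 * a**3
    c3 = -6 * a**2 - 3 * a - 3 * a**3
    c4 = a**3 + 3 * a**2 + 3 * a + 1
    return c1, c2, c3, c4


c1, c2, c3, c4 = t3_coefficients(0.7)
coefficient_sum = c1 + c2 + c3 + c4
print(c1, c2, c3, c4, coefficient_sum)
```

Because the coefficients sum to 1, T3 equals the shared value whenever e3 through e6 all agree, as they do once a constant input series has settled.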

Parameters:

Name Type Description Default
length int

Period for each EMA calculation, defaults to 10

10
a float

Volume factor (0 < a < 1), controls smoothness vs. responsiveness; values outside this range are replaced by 0.7, defaults to 0.7

0.7
source Handler[Any, float | None] | None

Input data source, defaults to None

None

Example:

```python
# Create a data source with numeric values
data_source = SimpleDataProvider([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0])

# Create a T3 handler with length of 5 and volume factor of 0.7
t3_handler = T3Handler(length=5, a=0.7)
t3_handler.set_source(data_source)

# Process the data
for value in t3_handler:
    print(value)

# Initial values will be None while the six cascaded EMA
# calculations warm up
```
Source code in pysatl_tsp/implementations/processor/t3_handler.py
class T3Handler(Handler[float | None, float | None]):
    """Tim Tillson's T3 Moving Average handler with lazy evaluation.

    This handler implements the T3 adaptive moving average developed by Tim Tillson,
    which is designed to reduce lag and improve smoothness. The T3 is calculated as:
    T3 = c1 * e6 + c2 * e5 + c3 * e4 + c4 * e3, where e1-e6 are sequentially computed EMAs.

    All calculations are performed lazily in a streaming fashion, computing values only
    when requested by the iterator.

    :param length: Period for each EMA calculation, defaults to 10
    :param a: Volume factor (0 < a < 1), controls smoothness vs. responsiveness; values outside
        this range are replaced by 0.7, defaults to 0.7
    :param source: Input data source, defaults to None

    Example:
        ```python
        # Create a data source with numeric values
        data_source = SimpleDataProvider([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0])

        # Create a T3 handler with length of 5 and volume factor of 0.7
        t3_handler = T3Handler(length=5, a=0.7)
        t3_handler.set_source(data_source)

        # Process the data
        for value in t3_handler:
            print(value)

        # Initial values will be None while the six cascaded EMA
        # calculations warm up
        ```
    """

    def __init__(self, length: int = 10, a: float = 0.7, source: Handler[Any, float | None] | None = None):
        """Initialize a T3 moving average handler.

        :param length: Period for each EMA calculation, defaults to 10
        :param a: Volume factor (0 < a < 1), controls smoothness vs. responsiveness; values outside
            this range are replaced by 0.7, defaults to 0.7
        :param source: Input data source, defaults to None
        """
        super().__init__(source=source)
        self.length = length
        self.a = a if 0 < a < 1 else 0.7

    def __iter__(self) -> Iterator[float | None]:
        """Create an iterator that yields T3 moving average values.

        This method constructs a pipeline of six cascaded EMA calculations, where
        each EMA takes the output of the previous one as input. The final T3 value
        is a weighted sum of the last four EMAs in the sequence.

        :return: Iterator yielding T3 values
        :raises ValueError: If no source has been set
        """
        if self.source is None:
            raise ValueError("T3Handler requires a data source")

        # Calculate coefficients
        c1 = -self.a * self.a**2
        c2 = 3 * self.a**2 + 3 * self.a**3
        c3 = -6 * self.a**2 - 3 * self.a - 3 * self.a**3
        c4 = self.a**3 + 3 * self.a**2 + 3 * self.a + 1

        # Calculate e1 (first EMA)
        e1_provider = SimpleDataProvider(self.source)
        e1_pipeline = e1_provider | EMAHandler(length=self.length)

        # Calculate e2 (second EMA based on e1)
        e2_provider = SimpleDataProvider(iter(e1_pipeline))
        e2_pipeline = e2_provider | EMAHandler(length=self.length)

        # Calculate e3 (third EMA based on e2)
        e3_provider = SimpleDataProvider(iter(e2_pipeline))
        e3_pipeline = e3_provider | EMAHandler(length=self.length)
        e3_iter = iter(e3_pipeline)

        # Create copies of e3 for use in T3 formula and for calculating e4
        e3_for_t3, e3_for_e4 = tee(e3_iter)

        # Calculate e4 (fourth EMA based on e3)
        e4_provider = SimpleDataProvider(e3_for_e4)
        e4_pipeline = e4_provider | EMAHandler(length=self.length)
        e4_iter = iter(e4_pipeline)

        # Create copies of e4 for use in T3 formula and for calculating e5
        e4_for_t3, e4_for_e5 = tee(e4_iter)

        # Calculate e5 (fifth EMA based on e4)
        e5_provider = SimpleDataProvider(e4_for_e5)
        e5_pipeline = e5_provider | EMAHandler(length=self.length)
        e5_iter = iter(e5_pipeline)

        # Create copies of e5 for use in T3 formula and for calculating e6
        e5_for_t3, e5_for_e6 = tee(e5_iter)

        # Calculate e6 (sixth EMA based on e5)
        e6_provider = SimpleDataProvider(e5_for_e6)
        e6_pipeline = e6_provider | EMAHandler(length=self.length)
        e6_iter = iter(e6_pipeline)

        # Combine e3, e4, e5, e6 to calculate T3
        for e3, e4, e5, e6 in zip_longest(e3_for_t3, e4_for_t3, e5_for_t3, e6_iter):
            if e3 is None or e4 is None or e5 is None or e6 is None:
                yield None
            else:
                t3_value = c1 * e6 + c2 * e5 + c3 * e4 + c4 * e3
                yield t3_value
__init__
__init__(
    length: int = 10,
    a: float = 0.7,
    source: Handler[Any, float | None] | None = None,
)

Initialize a T3 moving average handler.

Parameters:

Name Type Description Default
length int

Period for each EMA calculation, defaults to 10

10
a float

Volume factor (0 < a < 1), controls smoothness vs. responsiveness; values outside this range are replaced by 0.7, defaults to 0.7

0.7
source Handler[Any, float | None] | None

Input data source, defaults to None

None
Source code in pysatl_tsp/implementations/processor/t3_handler.py
def __init__(self, length: int = 10, a: float = 0.7, source: Handler[Any, float | None] | None = None):
    """Initialize a T3 moving average handler.

    :param length: Period for each EMA calculation, defaults to 10
    :param a: Volume factor (0 < a < 1), controls smoothness vs. responsiveness; values outside
        this range are replaced by 0.7, defaults to 0.7
    :param source: Input data source, defaults to None
    """
    super().__init__(source=source)
    self.length = length
    self.a = a if 0 < a < 1 else 0.7
__iter__
__iter__() -> Iterator[float | None]

Create an iterator that yields T3 moving average values.

This method constructs a pipeline of six cascaded EMA calculations, where each EMA takes the output of the previous one as input. The final T3 value is a weighted sum of the last four EMAs in the sequence.
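
The method relies on `itertools.tee` so that one lazy stream can feed both the next stage and the final combination. The sketch below shows that pattern in isolation; `running_mean` is a hypothetical stand-in for an EMA stage, and `base_minus_smoothed` is an illustrative name, not library code.

```python
from itertools import tee
from typing import Iterable, Iterator


def running_mean(values: Iterable[float]) -> Iterator[float]:
    """Stand-in smoothing stage: cumulative mean of the inputs seen so far."""
    total = 0.0
    for count, value in enumerate(values, start=1):
        total += value
        yield total / count


def base_minus_smoothed(source: Iterable[float]) -> Iterator[float]:
    """Combine a stream with a smoothed copy of itself, one item at a time."""
    # tee gives two independent iterators over the same underlying stream.
    base_for_output, base_for_stage = tee(iter(source))
    smoothed = running_mean(base_for_stage)
    for base, smooth in zip(base_for_output, smoothed):
        yield base - smooth


diffs = list(base_minus_smoothed([1.0, 2.0, 3.0]))
print(diffs)
```

Since both branches advance in lockstep inside `zip`, `tee` only has to buffer about one item at a time, which keeps the pipeline streaming.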

Returns:

Type Description
Iterator[float | None]

Iterator yielding T3 values

Raises:

Type Description
ValueError

If no source has been set

Source code in pysatl_tsp/implementations/processor/t3_handler.py
def __iter__(self) -> Iterator[float | None]:
    """Create an iterator that yields T3 moving average values.

    This method constructs a pipeline of six cascaded EMA calculations, where
    each EMA takes the output of the previous one as input. The final T3 value
    is a weighted sum of the last four EMAs in the sequence.

    :return: Iterator yielding T3 values
    :raises ValueError: If no source has been set
    """
    if self.source is None:
        raise ValueError("T3Handler requires a data source")

    # Calculate coefficients
    c1 = -self.a * self.a**2
    c2 = 3 * self.a**2 + 3 * self.a**3
    c3 = -6 * self.a**2 - 3 * self.a - 3 * self.a**3
    c4 = self.a**3 + 3 * self.a**2 + 3 * self.a + 1

    # Calculate e1 (first EMA)
    e1_provider = SimpleDataProvider(self.source)
    e1_pipeline = e1_provider | EMAHandler(length=self.length)

    # Calculate e2 (second EMA based on e1)
    e2_provider = SimpleDataProvider(iter(e1_pipeline))
    e2_pipeline = e2_provider | EMAHandler(length=self.length)

    # Calculate e3 (third EMA based on e2)
    e3_provider = SimpleDataProvider(iter(e2_pipeline))
    e3_pipeline = e3_provider | EMAHandler(length=self.length)
    e3_iter = iter(e3_pipeline)

    # Create copies of e3 for use in T3 formula and for calculating e4
    e3_for_t3, e3_for_e4 = tee(e3_iter)

    # Calculate e4 (fourth EMA based on e3)
    e4_provider = SimpleDataProvider(e3_for_e4)
    e4_pipeline = e4_provider | EMAHandler(length=self.length)
    e4_iter = iter(e4_pipeline)

    # Create copies of e4 for use in T3 formula and for calculating e5
    e4_for_t3, e4_for_e5 = tee(e4_iter)

    # Calculate e5 (fifth EMA based on e4)
    e5_provider = SimpleDataProvider(e4_for_e5)
    e5_pipeline = e5_provider | EMAHandler(length=self.length)
    e5_iter = iter(e5_pipeline)

    # Create copies of e5 for use in T3 formula and for calculating e6
    e5_for_t3, e5_for_e6 = tee(e5_iter)

    # Calculate e6 (sixth EMA based on e5)
    e6_provider = SimpleDataProvider(e5_for_e6)
    e6_pipeline = e6_provider | EMAHandler(length=self.length)
    e6_iter = iter(e6_pipeline)

    # Combine e3, e4, e5, e6 to calculate T3
    for e3, e4, e5, e6 in zip_longest(e3_for_t3, e4_for_t3, e5_for_t3, e6_iter):
        if e3 is None or e4 is None or e5 is None or e6 is None:
            yield None
        else:
            t3_value = c1 * e6 + c2 * e5 + c3 * e4 + c4 * e3
            yield t3_value
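The coefficients `c1` to `c4` in the listing above are polynomials in the smoothing factor `a`, and they sum to 1 for any `a`, so a constant input is reproduced once every EMA stage has settled. The following is a minimal sketch that checks this numerically; `t3_coefficients` is a hypothetical helper written for illustration (it is not part of pysatl_tsp), and the value 0.7 is an arbitrary choice.

```python
def t3_coefficients(a: float) -> tuple[float, float, float, float]:
    """Return (c1, c2, c3, c4) using the same expressions as the listing above."""
    c1 = -(a**3)
    c2 = 3 * a**2 + 3 * a**3
    c3 = -6 * a**2 - 3 * a - 3 * a**3
    c4 = a**3 + 3 * a**2 + 3 * a + 1
    return c1, c2, c3, c4


# 0.7 is an arbitrary illustrative value for the smoothing factor.
coeffs = t3_coefficients(0.7)
total = sum(coeffs)
print(coeffs, total)
```

The cubic, quadratic, and linear terms cancel across the four coefficients, which leaves only the constant term 1.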
tema_handler
TEMAHandler

Bases: Handler[float | None, float | None]

Triple Exponential Moving Average (TEMA) handler with lazy evaluation.

This handler implements the TEMA indicator developed by Patrick Mulloy, which reduces lag by applying the formula: TEMA = 3 * (EMA1 - EMA2) + EMA3. Here EMA1 is the EMA of the original data, EMA2 is the EMA of EMA1, and EMA3 is the EMA of EMA2.

All calculations are performed lazily in a streaming fashion, computing values only when requested by the iterator.
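The formula can be illustrated without the library. The sketch below is a plain-Python approximation, not the handler's implementation: the helper names `ema` and `tema` are hypothetical, and the EMA is assumed to use `alpha = 2 / (length + 1)` and to be seeded with the first input value, whereas `EMAHandler` may seed differently and emit `None` during warm-up.

```python
from collections.abc import Iterable, Iterator


def ema(values: Iterable[float], length: int) -> Iterator[float]:
    """Lazily yield an EMA; seeding with the first value is an assumption."""
    alpha = 2.0 / (length + 1)
    prev = None
    for x in values:
        prev = x if prev is None else alpha * x + (1 - alpha) * prev
        yield prev


def tema(values: list[float], length: int) -> list[float]:
    """Apply TEMA = 3 * (EMA1 - EMA2) + EMA3 to a finite series."""
    ema1 = list(ema(values, length))
    ema2 = list(ema(ema1, length))   # EMA of EMA1
    ema3 = list(ema(ema2, length))   # EMA of EMA2
    return [3 * (e1 - e2) + e3 for e1, e2, e3 in zip(ema1, ema2, ema3)]


constant = tema([5.0] * 8, length=3)
ramp_input = [float(i) for i in range(1, 41)]
ramp = tema(ramp_input, length=3)
plain = list(ema(ramp_input, length=3))
print(constant[-1], ramp[-1], plain[-1])
```

On a steadily rising input the plain EMA trails the latest value, while the TEMA combination ends up much closer to it. That is the lag reduction the formula is designed for.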

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `length` | `int` | Period for each EMA calculation | required |

Source code in pysatl_tsp/implementations/processor/tema_handler.py
class TEMAHandler(Handler[float | None, float | None]):
    """Triple Exponential Moving Average (TEMA) handler with lazy evaluation.

    This handler implements the TEMA indicator developed by Patrick Mulloy, which
    reduces lag by applying the formula: TEMA = 3 * (EMA1 - EMA2) + EMA3.
    Here EMA1 is the EMA of the original data, EMA2 is the EMA of EMA1, and
    EMA3 is the EMA of EMA2.

    All calculations are performed lazily in a streaming fashion, computing values only
    when requested by the iterator.

    :param length: Period for each EMA calculation

    Example:
        ```python
        # Create a data source with numeric values
        data_source = SimpleDataProvider([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0])

        # Create a TEMA handler with length of 5
        tema_handler = TEMAHandler(length=5)
        tema_handler.set_source(data_source)

        # Process the data
        for value in tema_handler:
            print(value)

        # Initial values will be None as TEMA requires three levels of EMA
        # After initialization, TEMA values will follow the price action more closely
        # than a regular EMA while maintaining smoothness
        ```
    """

    def __init__(self, length: int):
        """Initialize a TEMA handler.

        :param length: Period for each EMA calculation
        """
        super().__init__()
        self.length = length

    def __iter__(self) -> Iterator[float | None]:
        """Create an iterator that yields TEMA values.

        This method constructs a pipeline of three cascaded EMA calculations and
        applies the TEMA formula: 3 * (EMA1 - EMA2) + EMA3.

        :return: Iterator yielding TEMA values
        :raises ValueError: If no source has been set
        """
        if self.source is None:
            raise ValueError("TEMAHandler requires a data source")

        # Calculate EMA1
        ema1_provider = SimpleDataProvider(self.source)
        ema1_pipeline = ema1_provider | EMAHandler(length=self.length)

        # Create two copies of the EMA1 iterator
        ema1_iter1, ema1_iter2 = tee(iter(ema1_pipeline))

        # Calculate EMA2 based on the first copy of EMA1
        ema2_provider = SimpleDataProvider(ema1_iter1)
        ema2_pipeline = ema2_provider | EMAHandler(length=self.length)

        # Create two copies of the EMA2 iterator
        ema2_iter1, ema2_iter2 = tee(iter(ema2_pipeline))

        # Calculate EMA3 based on the first copy of EMA2
        ema3_provider = SimpleDataProvider(ema2_iter1)
        ema3_pipeline = ema3_provider | EMAHandler(length=self.length)
        ema3_iter = iter(ema3_pipeline)

        # Combine all three iterators and apply the TEMA formula
        for ema1, ema2, ema3 in zip_longest(ema1_iter2, ema2_iter2, ema3_iter):
            if ema1 is None or ema2 is None or ema3 is None:
                yield None
            else:
                tema_value = 3 * (ema1 - ema2) + ema3
                yield tema_value
__init__
__init__(length: int)

Initialize a TEMA handler.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `length` | `int` | Period for each EMA calculation | required |

Source code in pysatl_tsp/implementations/processor/tema_handler.py
def __init__(self, length: int):
    """Initialize a TEMA handler.

    :param length: Period for each EMA calculation
    """
    super().__init__()
    self.length = length
__iter__
__iter__() -> Iterator[float | None]

Create an iterator that yields TEMA values.

This method constructs a pipeline of three cascaded EMA calculations and applies the TEMA formula: 3 * (EMA1 - EMA2) + EMA3.
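A cascade like this needs each intermediate stream twice: once as input to the next stage and once in the final formula. One common way to do that lazily is `itertools.tee`. The sketch below shows the pattern on a simplified stage; `running_sum` is a hypothetical stand-in for an EMA stage and is not part of pysatl_tsp.

```python
from itertools import tee


def running_sum(values):
    """Stand-in for one smoothing stage: yields cumulative sums lazily."""
    total = 0.0
    for v in values:
        total += v
        yield total


source = iter([1.0, 2.0, 3.0, 4.0])

# Split stage 1 so it can feed stage 2 and also appear in the final combination.
stage1_for_next, stage1_for_output = tee(running_sum(source))
stage2 = running_sum(stage1_for_next)

# Both consumers advance in lockstep, so tee only buffers a single item at a time.
pairs = list(zip(stage1_for_output, stage2))
print(pairs)
```

`tee` keeps every item that one copy has consumed and the other has not, so memory stays bounded only when the copies are read at roughly the same pace, as they are in a `zip`-style loop.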

Returns:

| Type | Description |
| --- | --- |
| `Iterator[float \| None]` | Iterator yielding TEMA values |

Raises:

| Type | Description |
| --- | --- |
| `ValueError` | If no source has been set |

Source code in pysatl_tsp/implementations/processor/tema_handler.py
def __iter__(self) -> Iterator[float | None]:
    """Create an iterator that yields TEMA values.

    This method constructs a pipeline of three cascaded EMA calculations and
    applies the TEMA formula: 3 * (EMA1 - EMA2) + EMA3.

    :return: Iterator yielding TEMA values
    :raises ValueError: If no source has been set
    """
    if self.source is None:
        raise ValueError("TEMAHandler requires a data source")

    # Calculate EMA1
    ema1_provider = SimpleDataProvider(self.source)
    ema1_pipeline = ema1_provider | EMAHandler(length=self.length)

    # Create two copies of the EMA1 iterator
    ema1_iter1, ema1_iter2 = tee(iter(ema1_pipeline))

    # Calculate EMA2 based on the first copy of EMA1
    ema2_provider = SimpleDataProvider(ema1_iter1)
    ema2_pipeline = ema2_provider | EMAHandler(length=self.length)

    # Create two copies of the EMA2 iterator
    ema2_iter1, ema2_iter2 = tee(iter(ema2_pipeline))

    # Calculate EMA3 based on the first copy of EMA2
    ema3_provider = SimpleDataProvider(ema2_iter1)
    ema3_pipeline = ema3_provider | EMAHandler(length=self.length)
    ema3_iter = iter(ema3_pipeline)

    # Combine all three iterators and apply the TEMA formula
    for ema1, ema2, ema3 in zip_longest(ema1_iter2, ema2_iter2, ema3_iter):
        if ema1 is None or ema2 is None or ema3 is None:
            yield None
        else:
            tema_value = 3 * (ema1 - ema2) + ema3
            yield tema_value
time_series_cross_validator
TimeSeriesCrossValidator

Bases: Handler[T, tuple[ScrubberWindow[T], ScrubberWindow[T]]]

A handler that implements expanding window cross-validation for time series data.

This handler produces a sequence of train-validation splits suitable for time series validation, where each split preserves the temporal order of data. It implements an expanding window approach, where the training set grows over time while the validation set has a fixed size and slides forward.

The handler ensures that:

1. The training set always has at least `min_train_size` points
2. The validation set always has exactly `val_size` points
3. The validation set always follows the training set temporally
4. Each new split adds `val_size` points to the training set

This approach respects the temporal nature of time series data and prevents data leakage from future to past.
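The split layout described above can be sketched with index arithmetic alone. This is an illustration of the expanding-window scheme, not the handler's code; `expanding_window_splits` is a hypothetical helper, and it works on a known series length, whereas the handler consumes a stream.

```python
def expanding_window_splits(n_points: int, min_train_size: int, val_size: int):
    """Return (train_indices, val_indices) ranges for an expanding window."""
    splits = []
    train_end = min_train_size
    # Stop as soon as a full validation block no longer fits.
    while train_end + val_size <= n_points:
        train = range(0, train_end)
        val = range(train_end, train_end + val_size)
        splits.append((train, val))
        train_end += val_size  # the previous validation block joins the training set
    return splits


splits = expanding_window_splits(n_points=100, min_train_size=50, val_size=10)
for train, val in splits:
    print(f"train 0..{train[-1]}  val {val[0]}..{val[-1]}")
```

Every validation range starts exactly where its training range ends, so no future observation appears in any training set.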

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `min_train_size` | `int` | Minimum number of points in the initial training set | required |
| `val_size` | `int` | Number of points in each validation set | required |
| `source` | `Optional[Handler[Any, T]]` | The handler providing input data, defaults to None | `None` |

Source code in pysatl_tsp/implementations/processor/time_series_cross_validator.py
class TimeSeriesCrossValidator(Handler[T, tuple[ScrubberWindow[T], ScrubberWindow[T]]]):
    """A handler that implements expanding window cross-validation for time series data.

    This handler produces a sequence of train-validation splits suitable for time series
    validation, where each split preserves the temporal order of data. It implements an
    expanding window approach, where the training set grows over time while the validation
    set has a fixed size and slides forward.

    The handler ensures that:
    1. The training set always has at least `min_train_size` points
    2. The validation set always has exactly `val_size` points
    3. The validation set always follows the training set temporally
    4. Each new split adds `val_size` points to the training set

    This approach respects the temporal nature of time series data and prevents
    data leakage from future to past.

    :param min_train_size: Minimum number of points in the initial training set
    :param val_size: Number of points in each validation set
    :param source: The handler providing input data, defaults to None

    Example:
        ```python
        import numpy as np
        import matplotlib.pyplot as plt

        # Generate a synthetic time series
        np.random.seed(42)
        ts = np.cumsum(np.random.normal(0, 1, 100))  # Random walk
        data_source = SimpleDataProvider(ts)

        # Create a cross-validator with min_train_size=50 and val_size=10
        cv = TimeSeriesCrossValidator(min_train_size=50, val_size=10, source=data_source)

        # Visualize the different train-validation splits
        plt.figure(figsize=(14, 8))
        x = np.arange(len(ts))
        plt.plot(x, ts, "k-", alpha=0.3, label="Full time series")

        for i, (train, val) in enumerate(cv):
            train_indices = list(train.indices)
            val_indices = list(val.indices)

            # Plot each split
            plt.plot(train_indices, [ts[idx] for idx in train_indices], "b-", linewidth=2, alpha=0.7 - i * 0.1)
            plt.plot(val_indices, [ts[idx] for idx in val_indices], "r-", linewidth=2, alpha=0.7 - i * 0.1)

            # Add markers at the split point
            split_idx = train_indices[-1]
            plt.axvline(x=split_idx, color="g", linestyle="--", alpha=0.5)

            # Print information about this split
            print(f"Split {i + 1}:")
            print(f"  Train: {len(train)} points (indices {train_indices[0]}..{train_indices[-1]})")
            print(f"  Validation: {len(val)} points (indices {val_indices[0]}..{val_indices[-1]})")

        plt.title("Time Series Cross-Validation: Expanding Window Approach")
        plt.xlabel("Time")
        plt.ylabel("Value")

        # Add custom legend
        from matplotlib.lines import Line2D

        custom_lines = [
            Line2D([0], [0], color="k", alpha=0.3),
            Line2D([0], [0], color="b", linewidth=2),
            Line2D([0], [0], color="r", linewidth=2),
            Line2D([0], [0], color="g", linestyle="--"),
        ]
        plt.legend(
            custom_lines, ["Full time series", "Training sets", "Validation sets", "Split points"], loc="upper left"
        )

        plt.grid(True, alpha=0.3)
        plt.show()

        # Example model evaluation with each split
        from sklearn.linear_model import LinearRegression

        for i, (train, val) in enumerate(cv):
            # Prepare data
            train_indices = list(train.indices)
            train_X = np.array(train_indices).reshape(-1, 1)
            train_y = np.array(list(train.values))

            val_indices = list(val.indices)
            val_X = np.array(val_indices).reshape(-1, 1)
            val_y = np.array(list(val.values))

            # Train a simple model
            model = LinearRegression()
            model.fit(train_X, train_y)

            # Evaluate on validation set
            val_pred = model.predict(val_X)
            mse = np.mean((val_pred - val_y) ** 2)

            print(f"Split {i + 1} - Validation MSE: {mse:.4f}")
        ```
    """

    def __init__(self, min_train_size: int, val_size: int, source: Optional[Handler[Any, T]] = None):
        """Initialize a time series cross-validator.

        :param min_train_size: Minimum number of points in the initial training set
        :param val_size: Number of points in each validation set
        :param source: The handler providing input data, defaults to None
        """
        super().__init__(source)
        self.min_train_size = min_train_size
        self.val_size = val_size

    def __iter__(self) -> Iterator[tuple[ScrubberWindow[T], ScrubberWindow[T]]]:
        """Create an iterator that yields train-validation splits for time series cross-validation.

        This method creates splits where:
        1. The first split has exactly min_train_size points for training
        2. Each subsequent split adds val_size points to the training set
        3. Each validation set has exactly val_size points and follows the training set

        :return: Iterator yielding tuples of (training_window, validation_window)
        :raises ValueError: If no source has been set
        """
        if self.source is None:
            raise ValueError("Source is not set")

        scrubber = SlidingScrubber(
            lambda buffer: len(buffer) > self.min_train_size
            and (len(buffer) - self.min_train_size) % self.val_size == 0,
            shift=0,
            source=self.source,
        )
        handler: MappingHandler[ScrubberWindow[T], tuple[ScrubberWindow[T], ScrubberWindow[T]]] = MappingHandler(
            map_func=lambda window: (window[: -self.val_size], window[-self.val_size :])
        )

        yield from (scrubber | handler)
__init__
__init__(
    min_train_size: int,
    val_size: int,
    source: Optional[Handler[Any, T]] = None,
)

Initialize a time series cross-validator.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `min_train_size` | `int` | Minimum number of points in the initial training set | required |
| `val_size` | `int` | Number of points in each validation set | required |
| `source` | `Optional[Handler[Any, T]]` | The handler providing input data, defaults to None | `None` |

Source code in pysatl_tsp/implementations/processor/time_series_cross_validator.py
def __init__(self, min_train_size: int, val_size: int, source: Optional[Handler[Any, T]] = None):
    """Initialize a time series cross-validator.

    :param min_train_size: Minimum number of points in the initial training set
    :param val_size: Number of points in each validation set
    :param source: The handler providing input data, defaults to None
    """
    super().__init__(source)
    self.min_train_size = min_train_size
    self.val_size = val_size
__iter__
__iter__() -> Iterator[
    tuple[ScrubberWindow[T], ScrubberWindow[T]]
]

Create an iterator that yields train-validation splits for time series cross-validation.

This method creates splits where:

1. The first split has exactly `min_train_size` points for training
2. Each subsequent split adds `val_size` points to the training set
3. Each validation set has exactly `val_size` points and follows the training set
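The same rule can be expressed as a streaming predicate: emit a split whenever the number of buffered points exceeds `min_train_size` by a whole multiple of `val_size`, then cut the last `val_size` points off as the validation set. The sketch below uses a plain list in place of the library's scrubber window; `split_stream` is a hypothetical name used only for illustration.

```python
def split_stream(values, min_train_size: int, val_size: int):
    """Yield (train, validation) lists as enough points accumulate."""
    buffer = []
    for v in values:
        buffer.append(v)
        extra = len(buffer) - min_train_size
        # Trigger only when a whole number of validation blocks has arrived.
        if extra > 0 and extra % val_size == 0:
            yield buffer[:-val_size], buffer[-val_size:]


result = list(split_stream(range(10), min_train_size=4, val_size=3))
for train, val in result:
    print(train, val)
```

Because the buffer is never shifted, each training list is the previous training list plus the previous validation block.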

Returns:

| Type | Description |
| --- | --- |
| `Iterator[tuple[ScrubberWindow[T], ScrubberWindow[T]]]` | Iterator yielding tuples of (training_window, validation_window) |

Raises:

| Type | Description |
| --- | --- |
| `ValueError` | If no source has been set |

Source code in pysatl_tsp/implementations/processor/time_series_cross_validator.py
def __iter__(self) -> Iterator[tuple[ScrubberWindow[T], ScrubberWindow[T]]]:
    """Create an iterator that yields train-validation splits for time series cross-validation.

    This method creates splits where:
    1. The first split has exactly min_train_size points for training
    2. Each subsequent split adds val_size points to the training set
    3. Each validation set has exactly val_size points and follows the training set

    :return: Iterator yielding tuples of (training_window, validation_window)
    :raises ValueError: If no source has been set
    """
    if self.source is None:
        raise ValueError("Source is not set")

    scrubber = SlidingScrubber(
        lambda buffer: len(buffer) > self.min_train_size
        and (len(buffer) - self.min_train_size) % self.val_size == 0,
        shift=0,
        source=self.source,
    )
    handler: MappingHandler[ScrubberWindow[T], tuple[ScrubberWindow[T], ScrubberWindow[T]]] = MappingHandler(
        map_func=lambda window: (window[: -self.val_size], window[-self.val_size :])
    )

    yield from (scrubber | handler)
trima_handler
TRIMAHandler

Bases: Handler[float | None, float | None]

Triangular Moving Average (TRIMA) Handler.

A weighted moving average in which the weights form a triangle, with the greatest weight in the middle of the period. It is implemented as a pipeline of two SMA calculations, each using half_length = round(0.5 * (length + 1)), which is roughly half of the requested length.

TRIMA gives more weight to the middle portion of the price series and less weight to the oldest and newest data. It is slower to respond to price changes but better at filtering out market noise than a simple moving average.
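To see why two chained SMAs produce triangular weights, the sketch below compares a double SMA with an explicit triangular weighting for `length=5`. It is a plain-list illustration under stated assumptions: `sma`, `trima_double_sma`, and `trima_weighted` are hypothetical helpers, and warm-up positions are dropped here rather than emitted as `None`.

```python
def sma(values: list[float], length: int) -> list[float]:
    """Simple moving average over full windows only (no warm-up placeholders)."""
    return [
        sum(values[i - length + 1 : i + 1]) / length
        for i in range(length - 1, len(values))
    ]


def trima_double_sma(values: list[float], length: int) -> list[float]:
    half = round(0.5 * (length + 1))
    return sma(sma(values, half), half)


def trima_weighted(values: list[float], length: int) -> list[float]:
    half = round(0.5 * (length + 1))
    # Triangular weights, e.g. [1, 2, 3, 2, 1] for half = 3.
    weights = list(range(1, half + 1)) + list(range(half - 1, 0, -1))
    total, span = sum(weights), len(weights)
    return [
        sum(w * v for w, v in zip(weights, values[i : i + span])) / total
        for i in range(len(values) - span + 1)
    ]


data = [1.0, 4.0, 2.0, 8.0, 5.0, 7.0, 3.0]
double = trima_double_sma(data, length=5)
weighted = trima_weighted(data, length=5)
print(double)
print(weighted)
```

Both computations give the same values: averaging `half` consecutive SMA outputs counts the middle inputs more often than the outer ones, which is exactly the triangular weight profile.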

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `length` | `int` | The period for the TRIMA calculation, defaults to 10 | `10` |
| `source` | `Handler[Any, float \| None] \| None` | Input data source, defaults to None | `None` |

Source code in pysatl_tsp/implementations/processor/trima_handler.py
class TRIMAHandler(Handler[float | None, float | None]):
    """Triangular Moving Average (TRIMA) Handler.

    A weighted moving average in which the weights form a triangle, with the
    greatest weight in the middle of the period. Implemented as a pipeline of two
    SMA calculations, each with roughly half of the requested length.

    TRIMA gives more weight to the middle portion of the price series and less weight
    to the oldest and newest data. It is slower to respond to price changes but better
    at filtering out market noise than a simple moving average.

    :param length: The period for the TRIMA calculation, defaults to 10
    :param source: Input data source, defaults to None

    Example:
        ```python
        # Create a data source with numeric values
        data_source = SimpleDataProvider([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0])

        # Create a TRIMA handler with length of 5
        trima_handler = TRIMAHandler(length=5)
        trima_handler.set_source(data_source)

        # Process the data
        for value in trima_handler:
            print(value)

        # For a TRIMA with length=5, half_length=3:
        # First SMA(3) requires 3 values
        # Second SMA(3) of the first SMA values requires another 3 values
        # So the first few values will be None, then TRIMA values follow
        # The TRIMA gives more weight to the middle values in the calculation
        ```
    """

    def __init__(self, length: int = 10, source: Handler[Any, float | None] | None = None):
        """Initialize TRIMA handler with specified parameters.

        :param length: The period for the TRIMA calculation, defaults to 10
        :param source: Input data source, defaults to None
        """
        super().__init__(source=source)
        self.length = length if length and length > 0 else 10
        self.half_length = round(0.5 * (self.length + 1))

    def __iter__(self) -> Iterator[float | None]:
        """Create an iterator that yields TRIMA values.

        This method implements the TRIMA calculation by creating a pipeline of two
        consecutive SMA calculations, each with half_length. This approach is
        mathematically equivalent to a weighted moving average with triangular weights.

        :return: Iterator yielding TRIMA values
        :raises ValueError: If no source has been set
        """
        if self.source is None:
            raise ValueError("Source is not set")

        yield from self.source | SMAHandler(length=self.half_length) | SMAHandler(length=self.half_length)
__init__
__init__(
    length: int = 10,
    source: Handler[Any, float | None] | None = None,
)

Initialize TRIMA handler with specified parameters.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `length` | `int` | The period for the TRIMA calculation, defaults to 10 | `10` |
| `source` | `Handler[Any, float \| None] \| None` | Input data source, defaults to None | `None` |

Source code in pysatl_tsp/implementations/processor/trima_handler.py
def __init__(self, length: int = 10, source: Handler[Any, float | None] | None = None):
    """Initialize TRIMA handler with specified parameters.

    :param length: The period for the TRIMA calculation, defaults to 10
    :param source: Input data source, defaults to None
    """
    super().__init__(source=source)
    self.length = length if length and length > 0 else 10
    self.half_length = round(0.5 * (self.length + 1))
__iter__
__iter__() -> Iterator[float | None]

Create an iterator that yields TRIMA values.

This method implements the TRIMA calculation by creating a pipeline of two consecutive SMA calculations, each with half_length. This approach is mathematically equivalent to a weighted moving average with triangular weights.
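The value of `half_length` determines how many input points the two chained SMAs actually span, namely `2 * half_length - 1`. The sketch below tabulates this for a few lengths using the same expression as the constructor; the function name `half_length` and the chosen lengths are illustrative only. Note that Python's built-in `round` rounds halves to the nearest even integer, which matters for even `length` values.

```python
def half_length(length: int) -> int:
    """Same expression as TRIMAHandler.__init__ uses for half_length."""
    return round(0.5 * (length + 1))


# Map each requested length to (half_length, effective span of the double SMA).
table = {
    length: (half_length(length), 2 * half_length(length) - 1)
    for length in (4, 5, 10)
}
for length, (half, span) in table.items():
    print(f"length={length}: half_length={half}, effective span={span}")
```

For odd lengths the effective span equals the requested length. For even lengths it can be one point shorter or longer, depending on how the half is rounded.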

Returns:

| Type | Description |
| --- | --- |
| `Iterator[float \| None]` | Iterator yielding TRIMA values |

Raises:

| Type | Description |
| --- | --- |
| `ValueError` | If no source has been set |

Source code in pysatl_tsp/implementations/processor/trima_handler.py
def __iter__(self) -> Iterator[float | None]:
    """Create an iterator that yields TRIMA values.

    This method implements the TRIMA calculation by creating a pipeline of two
    consecutive SMA calculations, each with half_length. This approach is
    mathematically equivalent to a weighted moving average with triangular weights.

    :return: Iterator yielding TRIMA values
    :raises ValueError: If no source has been set
    """
    if self.source is None:
        raise ValueError("Source is not set")

    yield from self.source | SMAHandler(length=self.half_length) | SMAHandler(length=self.half_length)
wma_handler
WMAHandler

Bases: WeightedMovingAverageHandler

Weighted Moving Average (WMA) handler.

Calculates a moving average where each data point is multiplied by a weight before being included in the average. The weights increase or decrease linearly, giving more importance to recent data points by default.

This implementation matches the functionality of pandas_ta.wma, using a linear weighting scheme where the weight of each value is proportional to its position in the window.
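The linear weighting scheme can be sketched in a few lines. `linear_weights` below mirrors the normalisation described for this handler but is a hypothetical standalone function. How the base class pairs the weight list with the values in the window is not shown in this section, so the oldest-first pairing used here is an assumption chosen to reproduce the arithmetic in the class example.

```python
def linear_weights(length: int, asc: bool) -> list[float]:
    """Weights 1..length, optionally reversed, normalised to sum to 1."""
    raw = list(range(1, length + 1))
    if not asc:
        raw = raw[::-1]
    total = sum(raw)
    return [w / total for w in raw]


ascending = linear_weights(4, asc=True)
descending = linear_weights(4, asc=False)

# Assumed pairing: window listed oldest first, largest weight on the newest value.
window = [1.0, 2.0, 3.0, 4.0]
weighted = sum(w * v for w, v in zip(ascending, window))
print(ascending, descending, weighted)
```

With the largest weight on the newest value, the result is pulled toward the most recent observations, so it sits above the plain mean of the window when the data is rising.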

Inherits parameters from WeightedMovingAverageHandler:

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `length` | | The period for the calculation, defaults to 10 | `10` |
| `asc` | | Whether weights should be applied in ascending order, defaults to False. When False, most recent values get higher weights | `False` |
| `source` | | Input data source, defaults to None | `None` |

Source code in pysatl_tsp/implementations/processor/wma_handler.py
class WMAHandler(WeightedMovingAverageHandler):
    """Weighted Moving Average (WMA) handler.

    Calculates a moving average where each data point is multiplied by a weight
    before being included in the average. The weights increase or decrease linearly,
    giving more importance to recent data points by default.

    This implementation matches the functionality of pandas_ta.wma, using a linear
    weighting scheme where the weight of each value is proportional to its position
    in the window.

    Inherits parameters from WeightedMovingAverageHandler:
    :param length: The period for the calculation, defaults to 10
    :param asc: Whether weights should be applied in ascending order, defaults to False
                When False, most recent values get higher weights
    :param source: Input data source, defaults to None

    Example:
        ```python
        # Create a data source with numeric values
        data_source = SimpleDataProvider([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])

        # Create a WMA handler with length of 4
        wma_handler = WMAHandler(length=4)
        wma_handler.set_source(data_source)

        # Process the data
        for value in wma_handler:
            print(value)

        # First 3 values will be None (not enough data points)
        # For length=4, weights would be [0.1, 0.2, 0.3, 0.4] (with asc=False)
        # So the 4th value would be: (1.0*0.1 + 2.0*0.2 + 3.0*0.3 + 4.0*0.4) = 3.0
        # Similarly, the 5th value: (2.0*0.1 + 3.0*0.2 + 4.0*0.3 + 5.0*0.4) = 4.0
        ```
    """

    def _calculate_weights(self, length: int, asc: bool) -> list[float]:
        """Calculate linear weights for WMA.

        Generates a sequence of linearly increasing weights that sum to 1.0.
        The weights increase from 1 to length, and are then normalized by dividing
        by their sum. The order can be reversed based on the asc parameter.

        :param length: The number of weights to generate
        :param asc: Whether weights should be in ascending order
                   When False (default), higher weights are assigned to more recent values
        :return: A list of normalized weights summing to 1.0
        """
        weights: list[float] = list(np.arange(1, length + 1))
        if not asc:
            weights = weights[::-1]

        # Calculate total weight (sum of weights)
        total_weight = np.sum(weights)

        # Return normalized weights
        return [float(w) / total_weight for w in weights]
zlma_handler
ZLMAHandler

Bases: Handler[float | None, float | None]

Zero Lag Moving Average (ZLMA) handler with lazy evaluation.

Implements the formula ZLMA = MA(2 * close - close.shift(lag)), where lag = int(0.5 * (length - 1)). All calculations are performed lazily in a streaming fashion, computing values only when requested by the iterator.

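The ZLMA reduces lag by averaging a de-lagged series, 2 * close - close.shift(lag), instead of the raw input: each value is pushed away from the value observed lag steps earlier, which compensates for the lag inherent in traditional moving averages.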
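The de-lagging step that precedes the moving average can be sketched on a plain list. `delagged` is a hypothetical helper, not the handler's code, and emitting `None` for the first `lag` positions is an assumption based on the class example's note that initial values are `None`.

```python
from __future__ import annotations


def delagged(values: list[float], length: int) -> list[float | None]:
    """Compute 2 * x[t] - x[t - lag] with lag = int(0.5 * (length - 1))."""
    lag = int(0.5 * (length - 1))
    out: list[float | None] = []
    for i, x in enumerate(values):
        # No value exists lag steps back for the first lag positions.
        out.append(None if i < lag else 2 * x - values[i - lag])
    return out


series = delagged([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], length=5)
print(series)
```

On a rising input the de-lagged values run ahead of the raw data by the amount the series moved over the last `lag` steps. Feeding them into the moving average (an EMA by default) then offsets the delay that the average itself introduces.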

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `length` | `int` | Period for the moving average calculation | `10` |
| `ma_handler` | `Handler[Any, float \| None] \| None` | Moving average handler to apply. Default is EMA with the specified length. | `None` |
| `source` | `Handler[Any, float \| None] \| None` | Input data source, defaults to None | `None` |

Source code in pysatl_tsp/implementations/processor/zlma_handler.py
class ZLMAHandler(Handler[float | None, float | None]):
    """Zero Lag Moving Average (ZLMA) handler with lazy evaluation.

    Implements the formula ZLMA = MA(2 * close - close.shift(lag)), where lag = int(0.5 * (length - 1)).
    All calculations are performed lazily in a streaming fashion, computing values only
    when requested by the iterator.

    The ZLMA reduces lag by averaging the de-lagged series 2 * close - close.shift(lag),
    which compensates for the lag inherent in traditional moving averages.

    :param length: Period for the moving average calculation
    :param ma_handler: Moving average handler to apply. Default is EMA with the specified length.
    :param source: Input data source, defaults to None

    Example:
        ```python
        # Create a data source with numeric values
        data_source = SimpleDataProvider([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0])

        # Create a ZLMA handler with length of 5
        zlma_handler = ZLMAHandler(length=5)
        zlma_handler.set_source(data_source)

        # Process the data
        for value in zlma_handler:
            print(value)

        # Initial values will be None due to lag calculation requirements
        # Then ZLMA values will be calculated using the formula
        ```
    """

    def __init__(
        self,
        length: int = 10,
        ma_handler: Handler[Any, float | None] | None = None,
        source: Handler[Any, float | None] | None = None,
    ):
        """Initialize a Zero Lag Moving Average handler.

        :param length: Period for the moving average calculation, defaults to 10
        :param ma_handler: Moving average handler to apply, defaults to EMAHandler with the specified length
        :param source: Input data source, defaults to None
        """
        super().__init__(source=source)
        self.length = length
        self.ma_handler = ma_handler if ma_handler is not None else EMAHandler(length=length)

    def __iter__(self) -> Iterator[float | None]:
        """Create an iterator that yields ZLMA values.

        This method implements the ZLMA calculation pipeline according to the formula:
        ZLMA = MA(2 * close - close.shift(lag)), where lag = int(0.5 * (length - 1)).

        :return: Iterator yielding ZLMA values
        :raises ValueError: If no source has been set
        """
        if self.source is None:
            raise ValueError("ZLMAHandler requires a data source")

        # Calculate lag
        lag = int(0.5 * (self.length - 1))

        yield from self.source | LagHandler(lag=lag) | self.ma_handler
__init__
__init__(
    length: int = 10,
    ma_handler: Handler[Any, float | None] | None = None,
    source: Handler[Any, float | None] | None = None,
)

Initialize a Zero Lag Moving Average handler.

Parameters:

Name Type Description Default
length int

Period for the moving average calculation, defaults to 10

10
ma_handler Handler[Any, float | None] | None

Moving average handler to apply, defaults to EMAHandler with the specified length

None
source Handler[Any, float | None] | None

Input data source, defaults to None

None
Source code in pysatl_tsp/implementations/processor/zlma_handler.py
def __init__(
    self,
    length: int = 10,
    ma_handler: Handler[Any, float | None] | None = None,
    source: Handler[Any, float | None] | None = None,
):
    """Initialize a Zero Lag Moving Average handler.

    :param length: Period for the moving average calculation, defaults to 10
    :param ma_handler: Moving average handler to apply, defaults to EMAHandler with the specified length
    :param source: Input data source, defaults to None
    """
    super().__init__(source=source)
    self.length = length
    self.ma_handler = ma_handler if ma_handler is not None else EMAHandler(length=length)
__iter__
__iter__() -> Iterator[float | None]

Create an iterator that yields ZLMA values.

This method implements the ZLMA calculation pipeline according to the formula: ZLMA = MA(2 * close - close.shift(lag)), where lag = int(0.5 * (length - 1)).

Returns:

Type Description
Iterator[float | None]

Iterator yielding ZLMA values

Raises:

Type Description
ValueError

If no source has been set
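The lazy, streaming evaluation described above can be sketched with plain Python generators. This is not the library's implementation: `delag` and `ema` are hypothetical stand-ins for the lag and moving-average stages, and the EMA here is assumed to be seeded with the first non-None value and to use alpha = 2 / (length + 1), which may differ from how EMAHandler initializes.

```python
from collections import deque
from itertools import islice
from typing import Iterable, Iterator, Optional


def delag(values: Iterable[Optional[float]], lag: int) -> Iterator[Optional[float]]:
    """Lazily yield 2 * x[t] - x[t - lag]; None until `lag` earlier values exist."""
    window: deque = deque(maxlen=lag + 1)  # holds x[t - lag] .. x[t]
    for x in values:
        window.append(x)
        oldest = window[0]
        if len(window) <= lag or x is None or oldest is None:
            yield None
        else:
            yield 2 * x - oldest


def ema(values: Iterable[Optional[float]], length: int) -> Iterator[Optional[float]]:
    """Lazily yield an exponential moving average, passing None through."""
    alpha = 2 / (length + 1)
    state: Optional[float] = None
    for x in values:
        if x is None:
            yield None
            continue
        # Assumption: seed with the first available value.
        state = x if state is None else alpha * x + (1 - alpha) * state
        yield state


consumed: list[float] = []


def source() -> Iterator[float]:
    """Record every value that is actually pulled from the source."""
    for x in [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]:
        consumed.append(x)
        yield x


pipeline = ema(delag(source(), lag=2), length=5)
first_three = list(islice(pipeline, 3))
# first_three == [None, None, 5.0] and consumed == [1.0, 2.0, 3.0]
```

Only three inputs are read to produce three outputs, which is the behavior the handler chain relies on: each stage pulls one value from its upstream stage per value requested downstream.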

Source code in pysatl_tsp/implementations/processor/zlma_handler.py
def __iter__(self) -> Iterator[float | None]:
    """Create an iterator that yields ZLMA values.

    This method implements the ZLMA calculation pipeline according to the formula:
    ZLMA = MA(2 * close - close.shift(lag)), where lag = int(0.5 * (length - 1)).

    :return: Iterator yielding ZLMA values
    :raises ValueError: If no source has been set
    """
    if self.source is None:
        raise ValueError("ZLMAHandler requires a data source")

    # Calculate lag
    lag = int(0.5 * (self.length - 1))

    yield from self.source | LagHandler(lag=lag) | self.ma_handler