refactor: (codeflash)⚡️ Speed up method JSONCleaner._remove_control_characters by 1,491% (#5322)
* ⚡️ Speed up method `JSONCleaner._remove_control_characters` by 1,491%
To optimize the function `_remove_control_characters`, we can use `str.translate` with a translation table to remove control characters. For deleting a fixed set of single characters, this is generally faster than a regular-expression substitution.
Here is the optimized version of the program.
By building the translation table once in the `__init__` method, we avoid recreating it every time `_remove_control_characters` is called. Using `str.translate` with this prebuilt table significantly improves performance compared to a regular expression substitution.
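As a rough illustration of the claim above, the following standalone sketch checks that the two approaches remove the same characters. The module-level names (`CONTROL_CHARS`, `TRANSLATION_TABLE`, `CONTROL_RE`, and the two helper functions) are hypothetical and exist only for this example; they are not part of `JSONCleaner`.

```python
import re

# Same character set the regex targets: U+0000-U+001F plus U+007F (DEL).
CONTROL_CHARS = "".join(chr(i) for i in range(32)) + chr(127)

# Built once; maps each control character's code point to None (delete).
TRANSLATION_TABLE = str.maketrans("", "", CONTROL_CHARS)

# The pattern used by the original implementation, compiled once here.
CONTROL_RE = re.compile(r"[\x00-\x1F\x7F]")


def remove_with_regex(s: str) -> str:
    """Hypothetical helper: the regex-based approach."""
    return CONTROL_RE.sub("", s)


def remove_with_translate(s: str) -> str:
    """Hypothetical helper: the translate-based approach."""
    return s.translate(TRANSLATION_TABLE)


sample = '{"key":\t"va\x00lue",\n "n": 1\x7f}'

# Both approaches should produce the same cleaned string.
assert remove_with_regex(sample) == remove_with_translate(sample)
print(remove_with_translate(sample))  # {"key":"value", "n": 1}
```

Note that tab and newline fall inside the `\x00-\x1F` range, so both versions strip them as well. The actual speedup depends on input size and Python version; `timeit` on representative inputs is one way to check it locally.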
* add super()
---------
Co-authored-by: codeflash-ai[bot] <148906541+codeflash-ai[bot]@users.noreply.github.com>
parent 90ba7f3ae7
commit 214829ab45
1 changed file with 6 additions and 2 deletions
@@ -1,5 +1,4 @@
 import json
-import re
 import unicodedata
 
 from langflow.custom import Component
@@ -83,7 +82,7 @@ class JSONCleaner(Component):
 
     def _remove_control_characters(self, s: str) -> str:
         """Remove control characters from the string."""
-        return re.sub(r"[\x00-\x1F\x7F]", "", s)
+        return s.translate(self.translation_table)
 
     def _normalize_unicode(self, s: str) -> str:
         """Normalize Unicode characters in the string."""
@@ -97,3 +96,8 @@ class JSONCleaner(Component):
             msg = f"Invalid JSON string: {e}"
             raise ValueError(msg) from e
         return s
+
+    def __init__(self):
+        # Create a translation table that maps control characters to None
+        super().__init__()
+        self.translation_table = str.maketrans("", "", "".join(chr(i) for i in range(32)) + chr(127))
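To illustrate the pattern this commit applies, including the `add super()` follow-up, here is a minimal sketch. `BaseComponent` and `Cleaner` are hypothetical stand-ins written for this example; the real `Component` class in langflow has its own initializer and is not reproduced here.

```python
class BaseComponent:
    """Hypothetical stand-in for a base class; not langflow's Component."""

    def __init__(self) -> None:
        self.initialized = True


class Cleaner(BaseComponent):
    """Hypothetical subclass mirroring the shape of the change."""

    def __init__(self) -> None:
        # Without this call, BaseComponent.__init__ never runs and
        # self.initialized would not be set.
        super().__init__()
        # Build the table once per instance instead of once per call.
        self.translation_table = str.maketrans(
            "", "", "".join(chr(i) for i in range(32)) + chr(127)
        )

    def _remove_control_characters(self, s: str) -> str:
        """Delete U+0000-U+001F and U+007F from the string."""
        return s.translate(self.translation_table)


cleaner = Cleaner()
print(cleaner.initialized)  # True
print(cleaner._remove_control_characters("a\x00b\x1fc\x7f"))  # abc
```

Since the table never changes, a class-level or module-level constant would be an alternative design that avoids overriding `__init__` at all; the commit keeps the table on the instance.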