Commit cfae0ee

Updated valx to version 0.2.4.
1 parent 56a26bc commit cfae0ee

7 files changed: 23 additions and 5 deletions.

.gitignore

Lines changed: 2 additions & 1 deletion
@@ -2,4 +2,5 @@ __pycache__
 *.pyc
 dist/
 build/
-venv/
+env/
+.venv/

README.md

Lines changed: 6 additions & 0 deletions
@@ -11,6 +11,12 @@ An open-source Python library for data cleaning tasks. It includes functions for
 > Please downgrade to `numpy` version `1.26.4`. Our ValX **DecisionTreeClassifier** AI model, relies on lower versions of `numpy`, because it was trained on these versions.
 > For more information see: https://techoverflow.net/2024/07/23/how-to-fix-numpy-dtype-size-changed-may-indicate-binary-incompatibility-expected-96-from-c-header-got-88-from-pyobject/
 
+## Changes in 0.2.4
+
+Fixed a major incompatibility issue with `scikit-learn` due to version changes in `scikit-learn v1.3.0` which causes compatibility issues with versions later than `1.2.2`. ValX can now be used with `scikit-learn` versions earlier and later than `1.3.0`!
+
+We've also removed `scikit-learn==1.2.2` as a dependency, as most versions of `scikit-learn` will now work.
+
 ## Changes in 0.2.3
 
 We have introduced a new optional `info_type` parameter into our `detect_sensitive_information`, and `remove_sensitive_information` functions, to allow you to have fine-grained control over what sensitive information you want to detect or remove.

setup.py

Lines changed: 2 additions & 2 deletions
@@ -15,10 +15,10 @@
     packages=find_packages(),
     package_data={'valx': ['models/*']},
     install_requires=[
-        'scikit-learn==1.2.2' # for the AI to function properly
+        'scikit-learn'
     ],
     classifiers=[
-        'Development Status :: 3 - Alpha',
+        'Development Status :: 5 - Production/Stable',
         'Intended Audience :: Developers',
         'License :: OSI Approved :: MIT License',
         'Programming Language :: Python :: 3.6',

test.py

Lines changed: 1 addition & 1 deletion
@@ -14,7 +14,7 @@ def main():
     # Print out all detected words
     print([d['Word'] for d in detected_profanity])
 
-    removed = remove_profanity(sample_text, "text_cleaned.txt", language="All")
+    remove_profanity(sample_text, "text_cleaned.txt", language="All")
 
     # New version
     print(detect_hate_speech("You're so stupid."))

test_new.py

Whitespace-only changes.

valx/__init__.py

Lines changed: 12 additions & 1 deletion
@@ -3,6 +3,7 @@
 
 # New version, that includes AI detection
 import pickle
+from sklearn import __version__ as sklearnversion
 from sklearn.tree import DecisionTreeClassifier # Import the DecisionTreeClassifier class
 from sklearn.feature_extraction.text import CountVectorizer # Import the CountVectorizer class
 
@@ -2902,8 +2903,18 @@ def remove_sensitive_information(text_data, output_file=None, info_type=["email"
 
 # New version
 MODEL_DIR = os.path.join(os.path.dirname(__file__), 'models')
-DECISION_TREE_MODEL_PATH = os.path.join(MODEL_DIR, 'decision_tree_model.sav')
 COUNT_VECTORIZER_PATH = os.path.join(MODEL_DIR, 'count_vectorizer.sav')
+DEFAULT_MODEL_FILENAME = "decision_tree_model.sav"
+UPGRADED_MODEL_FILENAME = "classifier_upgraded.pkl"
+
+# Check scikit-learn version
+SKLEARN_VERSION = tuple(map(int, sklearnversion.split(".")))
+
+# Set the model path based on the version
+if SKLEARN_VERSION >= (1, 3, 0):
+    DECISION_TREE_MODEL_PATH = os.path.join(MODEL_DIR, UPGRADED_MODEL_FILENAME)
+else:
+    DECISION_TREE_MODEL_PATH = os.path.join(MODEL_DIR, DEFAULT_MODEL_FILENAME)
 
 # Load the saved models
 model = pickle.load(open(DECISION_TREE_MODEL_PATH, 'rb'))
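Review note on the `valx/__init__.py` change: the version check `tuple(map(int, sklearnversion.split(".")))` passes every dot-separated component to `int`, which raises `ValueError` on a non-numeric component such as the `dev0` in a development build string like `1.4.dev0`. The sketch below shows one possible, more tolerant way to pick the model file. It is not part of this commit: `parse_release`, `select_model_filename`, and `load_pickle` are hypothetical helper names, and only the two filename constants and the `(1, 3, 0)` threshold are taken from the diff. It also opens the pickle through a context manager so the file handle is closed after loading.

```python
import pickle
import re

# Filenames copied from the diff above.
DEFAULT_MODEL_FILENAME = "decision_tree_model.sav"
UPGRADED_MODEL_FILENAME = "classifier_upgraded.pkl"


def parse_release(version, width=3):
    """Hypothetical helper: leading numeric components of a version string.

    Parsing stops at the first component that does not start with a digit
    (for example "dev0"), and the result is zero-padded to `width` entries.
    """
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
        if len(parts) == width:
            break
    parts.extend([0] * (width - len(parts)))
    return tuple(parts)


def select_model_filename(version):
    """Hypothetical helper: choose the model file for a scikit-learn version."""
    if parse_release(version) >= (1, 3, 0):
        return UPGRADED_MODEL_FILENAME
    return DEFAULT_MODEL_FILENAME


def load_pickle(path):
    """Load a pickled object, closing the file even if unpickling fails."""
    with open(path, "rb") as handle:
        return pickle.load(handle)
```

Under these assumptions, `select_model_filename(sklearnversion)` could stand in for the `if`/`else` block in the diff. A pre-release such as `1.3.0rc1` is treated as `1.3.0` here, which is an approximation rather than full PEP 440 ordering.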
380 KB
Binary file not shown.
