From 3eeb99dc38f5c24c96dd0d52282847824087ae04 Mon Sep 17 00:00:00 2001
From: eslopfer
Date: Thu, 26 Jan 2023 23:42:38 +0000
Subject: [PATCH] docs(string_tokenizer_count): add description for exercise

---
 .../devops/string_tokenizer_count/README.md   | 42 +++++++++++++++++++
 1 file changed, 42 insertions(+)
 create mode 100644 subjects/devops/string_tokenizer_count/README.md

diff --git a/subjects/devops/string_tokenizer_count/README.md b/subjects/devops/string_tokenizer_count/README.md
new file mode 100644
index 00000000..2fdff25c
--- /dev/null
+++ b/subjects/devops/string_tokenizer_count/README.md
@@ -0,0 +1,42 @@
+## string_tokenizer_count
+
+### Instructions
+
+Create a file `string_tokenizer_count.py` that contains a function `tokenizer_counter`, which takes a string as a parameter and returns a dictionary mapping each word to its count in the string.
+
+- The function should remove any punctuation from the string and convert it to lowercase before counting the words.
+
+- The function should return a dictionary of words and their counts, sorted alphabetically by word.
+
+### Usage
+
+Here is an example of how to use the function:
+
+```python
+from string_tokenizer_count import tokenizer_counter
+string = "This is a test sentence, with various words and 123 numbers!"
+result = tokenizer_counter(string)
+print(result)
+```
+
+And its output:
+
+```console
+{'123': 1, 'a': 1, 'and': 1, 'is': 1, 'numbers': 1, 'sentence': 1, 'test': 1, 'this': 1, 'various': 1, 'with': 1, 'words': 1}
+```
+
+### Hints
+
+- The `re` module can be used to remove non-alphanumeric characters.
+
+- The `collections` module can be used to count the words.
+
+- The `operator` module can be used to sort the dictionary alphabetically by word.
+
+### References
+
+- [`re` module](https://docs.python.org/3/library/re.html)
+
+- [`collections` module](https://docs.python.org/3/library/collections.html)
+
+- [`operator` module](https://docs.python.org/3/library/operator.html)
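
The three hints in the README can be combined roughly as follows. This is a minimal sketch and not the reference solution. It makes two assumptions the README does not state: words are separated by whitespace, and only ASCII letters and digits count as alphanumeric, so punctuation inside a word is dropped without splitting the word.

```python
import re
from collections import Counter
from operator import itemgetter


def tokenizer_counter(string):
    # Lowercase first, then delete every character that is not an ASCII
    # letter, a digit, or whitespace (assumption: ASCII-only input).
    cleaned = re.sub(r"[^a-z0-9\s]", "", string.lower())

    # Split on whitespace and count how often each word occurs.
    counts = Counter(cleaned.split())

    # Sort the (word, count) pairs by word. A dict keeps insertion order
    # (Python 3.7+), so the returned dictionary is ordered alphabetically.
    return dict(sorted(counts.items(), key=itemgetter(0)))
```

Sorting is by plain string comparison, so a token made of digits such as `123` is placed before any word that starts with a letter.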