Back
Close
  • 12

Learning Opportunities

This puzzle can be solved using the following concepts. Practice using these concepts and improve your skills.

Statement

 Goal

Like encryption, data compression is one of the most important technologies in IT. Your goal is to write a very simple text compressor. The basics of that compressor are to replace repeating parts of the text with a reference to the index of the first appearance of the word. The reference is /, followed by the index number.

Compressing rules:
1. Punctuation marks are not part of a word
2. The reference has always to be shorter than the original, otherwise they are not counted in the index
3. Parts of words can be replaced
4. The part replacement can happen in front of the position of the whole word


Example: "Fischers Fritz fischt frische Fische. Frische Fische fischt Fischers Fritz."
Indexes: 0 1 2 3 4 5 4 2 0 1


5 words indexed, the repetitions are not indexed because they get replaced with the reference to the first word.
Using rule 1-3 creates the sentence: "Fischers Fritz fischt frische Fische. Frische /4 /2 /0 /1.". Adding rule 4 and 5 will also replace "Fisch" in "Fischers". The correct result is: "/4rs Fritz fischt frische Fische. Frische /4 /2 /0 /1."

Comparing both sentences shows the effectiveness of the compression:
"Fischers Fritz fischt frische Fische. Frische Fische fischt Fischers Fritz."
"/4rs Fritz fischt frische Fische. Frische /4 /2 /0 /1."

Important : note that words are indexed on the fly, indexes should not be precomputed.
With the following sentence "Indolore dolor dolore" :
"Indolore" has index 0, and no match.
"Dolor" has index 1, and 2 matches. The sentence becomes "In/1e dolor /1e".
Then "/1e" has index 2, and 1 match. The final sentence becomes "In/2 dolor /1e".
Therefore "Dolore" is never indexed as a full word, and "/1e" is a valid word to index.

Assumptions you should be aware of:

- There are three types of components in the input line: words, punctuations and separators.
- A separators is one SPACE character. Never multiple consecutive spaces. Never a tab.
- Adjacent words are always separated by exactly one separator.
- Separators are for separating two words, so that at the head and tail of the line there is no separator.
- Punctuations may or may not appear after a word.
- Should a punctuation appears, there is no space ahead the punctuation. There is always one SPACE (a separator) after the punctuation before the next word.
- "Punctuation characters" ahead of a word or within a word, should it appear, are not regarded as punctuation at all. They become part of the word suitable for indexing.
- Multiple "punctuation characters" can appear consecutively. By their locations and the above rules, they sometimes are real punctuations (not for indexing) and sometimes they are part of a word good for indexing.
- Input should not contain '/' as no escaping rule is defined.


Cover: "So tiny.................Tiny Toadlet." by pete. #hwcp is licensed with CC BY 2.0. To view a copy of this license, visit https://creativecommons.org/licenses/by/2.0/
Input
Line 1: A string with the text to compress
Output
Line 1: A string with the compressed text
Constraints
2 < Length of text < 2057
Example
Input
This is a text to test the text compression
Output
This is a text to test the /1 compression

A higher resolution is required to access the IDE

codingame x discord
Join the CodinGame community on Discord to chat about puzzle contributions, challenges, streams, blog articles - all that good stuff!
JOIN US ON DISCORD
Online Participants